
Thursday, November 15, 2012

Why Sears Is Going All-In On Hadoop

Why Sears Is Going All-In On Hadoop is an interesting, if ‘rose coloured’, view of Hadoop from Phil Shelley, CTO at Sears.  Note that he also leads a Sears subsidiary called MetaScale, which offers Big Data architecture, consulting & services to companies outside the retail space.

A few choice quotes:
  • Moving up the stack, Sears is consolidating its databases to MySQL, InfoBright, and Teradata; EMC Greenplum, Microsoft SQL Server, and Oracle (including four Exadata boxes) are on their way out, Shelley says.
  • "The Holy Grail in data warehousing has always been to have all your data in one place so you can do big models on large data sets, but that hasn't been feasible either economically or in terms of technical capabilities," Shelley says, noting that Sears previously kept data anywhere from 90 days to two years. "With Hadoop we can keep everything, which is crucial because we don't want to archive or delete meaningful data."
  • "ETL is an antiquated technique, and for large companies it's inefficient and wasteful because you create multiple copies of data," he says. "Everybody used ETL because they couldn't put everything in one place, but that has changed with Hadoop, and now we copy data, as a matter of principle, only when we absolutely have to copy."
  • Shelley sees Hadoop as part of a larger IT ecosystem, too, and says systems such as Teradata will continue to have an important, focused role at Sears. But he's on the far end of the spectrum in terms of how much of the legacy environment Hadoop might replace. Countering Shelley's sometimes sweeping predictions of legacy system replacement, Mike Olson, CEO of Cloudera, says: "It's unlikely that a brand-new entrant to the market [like Hadoop] is going to displace tools for established workloads."
  • MetaScale also offers data architecture, modeling, and management services and consulting. The big idea behind Hadoop is to bring in as much data as possible while keeping data structures simple. "People want to overcomplicate things by representing data and dividing things up into separate files," says Scott LaCosse, director of data management at Sears and MetaScale. "The object is not to save space, it's to eliminate joins, denormalize the data, and put it all in one big file where you can analyze it."  It's an approach that's counterintuitive for a SQL veteran, so a big part of MetaScale's work is to help customers change their thinking: You apply schema as you pull data out to use it, rather than take the relational database approach of imposing a schema on data before it's loaded onto the platform. Hadoop holds data in its raw form, giving users the flexibility to combine and examine the data in many ways over time.
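
The "apply schema as you pull data out" idea in that last quote can be illustrated with a small sketch. This is plain Python rather than Hadoop itself, and the record layout, field names and sample values are all made up for illustration; nothing here comes from Sears or MetaScale. The point is only that the raw lines are stored untouched and each reader decides how to interpret them.

```python
import csv

# Hypothetical raw records, kept exactly as they arrived.
# No schema is imposed when the data is stored.
RAW_LINES = [
    "2012-11-01,store-17,sku-881,2,19.98",
    "2012-11-01,store-17,sku-102,1,5.49",
    "2012-11-02,store-03,sku-881,1,9.99",
]

# The schema lives with the reader, not with the storage:
# an ordered list of (field name, converter) pairs. Assumed layout.
SALES_SCHEMA = [
    ("date", str),
    ("store", str),
    ("sku", str),
    ("quantity", int),
    ("amount", float),
]


def read_with_schema(lines, schema):
    """Parse raw delimited lines into dicts, applying the schema at read time."""
    for row in csv.reader(lines):
        yield {name: convert(value) for (name, convert), value in zip(schema, row)}


def revenue_by_store(lines):
    """One view of the raw data: total sales amount per store."""
    totals = {}
    for record in read_with_schema(lines, SALES_SCHEMA):
        totals[record["store"]] = totals.get(record["store"], 0.0) + record["amount"]
    return totals


def units_by_sku(lines):
    """A different view of the same raw data: units sold per product."""
    totals = {}
    for record in read_with_schema(lines, SALES_SCHEMA):
        totals[record["sku"]] = totals.get(record["sku"], 0) + record["quantity"]
    return totals


if __name__ == "__main__":
    print(revenue_by_store(RAW_LINES))
    print(units_by_sku(RAW_LINES))
```

Both functions read the same denormalized lines with no joins, and a later question could use a different schema over the same stored data without reloading anything.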


Big Data – The Reality beyond the Hype


I was having a coffee with a friend last week and the conversation turned to the latest trends in technology – as it often does.  His view was that ‘Big Data’ was just another in a long line of over-hyped technologies, aimed more at selling the shiniest new product than solving some real-world problem.

I think that the Big Data term is really a shorthand way of describing the escalating amount of data being generated by the actions of people and their devices as they interact with each other and the world at large.  Every time we use a web-site, smartphone or other electronic service data is created and collected – to understand our behaviour, predict what we’d like to buy or where we’ll go, or perhaps show a relevant advertisement. 

An even larger amount of data is beginning to be created by the ‘internet of things’ – a term used to describe the invisible devices and sensors all around us in our vehicles and transport systems, communications and power grids, which collect and report on the health of these environments.  For example, engines in the latest commercial aircraft capture a large volume of performance data and report any abnormal operation in real-time via satellite links.  Current car models can already report back if they are involved in a crash or require roadside assistance; collecting engine and performance data can’t be too far in the future.

As the cost of collecting and storing this data continues to drop, it doesn’t take too much imagination to see the value in being able to analyse more fine-grained data on power consumption, real-time traffic, when and where we buy products, or whatever we can imagine being sensed and measured.  Having this newly available data can lead to discovery of previously unknown patterns of behaviour or relationships – telling us about a new artist, restaurant or author, a nearby bargain or a group who share our passion.

So, even if you think Big Data is just an over-hyped buzzword, a tremendous and ever-growing variety and volume of data is being created by our use of web-sites, devices and sensors. I don’t think this trend is likely to slow down in the foreseeable future, as ever more of our interactions move into the digital realm.

The era of Big Data is with us, no matter what we call it.

Wednesday, July 25, 2012

Reborn 50mm


Reborn, originally uploaded by justinknol.

from OM-10 to OM-D

Monday, July 11, 2011

Detroit


Detroit, originally uploaded by justinknol.

Semidelinkification, Shirky-style

Nicholas Carr's Blog: Semidelinkification, Shirky-style:
But I did manage to read a sizable chunk of it before clicking the Instapaper 'Read Later' button (a terrific way to avoid reading long stuff without having to feel guilty about it). It was a solid piece, as you'd expect from Shirky, if marred a bit by an unappealing new-media elitism (apparently the great unwashed never made it past the sports pages). But what interests me at the moment is not the content of Shirky's post but its form, particularly the form of its linkage.

Saturday, September 04, 2010

Luminous


Luminous, originally uploaded by justinknol.

"Shoot the Museum" tonight at the National Museum of Australia.

Tuesday, July 27, 2010

Lincoln


Lincoln, originally uploaded by justinknol.

Wednesday, July 07, 2010

Frosty kitchen window this morning


Kitchen Window ii, originally uploaded by justinknol.

Saturday, May 29, 2010

Court in the Rain


Court in the Rain, originally uploaded by justinknol.

Tuesday, May 18, 2010

Granite Island Causeway


Granite Island Causeway, originally uploaded by justinknol.

Sunday, May 16, 2010

What Makes A Volcano Explode?

What Makes A Volcano Explode?:



This morning at my front door...

I love Canberra's seasons...

Friday, May 14, 2010

Remarkable Rocks, Kangaroo Island


Remarkable Rocks, KI, originally uploaded by justinknol.

Saturday, May 01, 2010

Little Sahara, Kangaroo Island


Little Sahara, Kangaroo Island, originally uploaded by justinknol.

Wednesday, April 28, 2010

Remarkable Rocks, Kangaroo Island


Remarkable Rocks, KI, originally uploaded by justinknol.

Monday, April 26, 2010

Jetty, Kangaroo Island


Jetty, KI, originally uploaded by justinknol.

Sunday, April 25, 2010

McLaren Vale Sunset


McLaren Vale Sunset, originally uploaded by justinknol.

Seppelt Mausoleum


Seppelt Mausoleum, originally uploaded by justinknol.

Chateau Yaldara


Chateau Yaldara, originally uploaded by justinknol.

Kangaroo Island Jetty


Jetty, KI, originally uploaded by justinknol.

Looking up at Oldham House, Kapunda

Cape Willoughby Lighthouse and Sky, Kangaroo Island

Cape Willoughby Lighthouse, Kangaroo Island

Grosvenor Hotel, Adelaide


Grosvenor Hotel, Adelaide, originally uploaded by justinknol.

I've been away in South Australia for two weeks with Amanda while the kids were overseas.

More photos over the next few days as I sort through them...

Sunday, April 11, 2010

Hipstamatic Oven Gloves


Oven Gloves, originally uploaded by justinknol.

Saturday, February 13, 2010

Saturday, February 06, 2010

Microsoft confirms FAST Search on Linux and UNIX is DEAD

FAST ESP on non-Windows platforms was doomed from the moment Microsoft acquired FAST. The blog entry Microsoft Enterprise Search Blog : Innovation on Linux and UNIX confirms it:
With our 2010 products scheduled for release in a few months, we’ve just started to plan for our next wave of products. As a part of that planning process, we have decided that in order to deliver more innovation per release in the future, the 2010 products will be the last to include a search core that runs on Linux and UNIX.
FAST ESP 5.3 gets five years of 'mainstream' support, followed by five more years of 'extended' support.

Sunday, January 31, 2010

UK Government ICT Strategy 2010-2020

The need to transform public services and to fully exploit ICT to achieve this is accelerating. To meet increasing demand within this complex technology arena, the UK public sector has built an ICT infrastructure that in many instances duplicates solutions across different areas of Government. The ICT strategy will ensure that the infrastructure will go through a process of standardisation and simplification based on the premise of a common infrastructure designed to enable local delivery suited to local needs. Delivery will increasingly be through partnerships between the public, private and third sectors and the strategy enables greater interoperability to underpin this model. The strategy applies to all of the UK Public Sector, whether Central Government, Local Government, Wider Public Sector or Devolved Administrations. It provides a common approach to ICT that maintains local accountability and control over implementation to meet unique delivery and business requirements.
There are fourteen strands to the strategy:
  1. The Public Sector Network
  2. The Government Cloud (G-Cloud)
  3. Data Centres
  4. Government Applications Store (G-AS)
  5. Shared Services
  6. Desktop Services
  7. Architecture and Standards
  8. Open Source, Open Standards, Reuse
  9. Greening Government ICT
  10. Information Security & Assurance
  11. Professionalising IT enabled change
  12. Reliable Project Delivery
  13. Supply Management
  14. International Alignment

Tuesday, January 26, 2010

RAAF Roulettes at the Australia Day Concert


Roulettes, originally uploaded by justinknol.

Thursday, January 07, 2010

on Flickr - Reconciliation


Reconciliation One, originally uploaded by justinknol.