wwHadoop: Analytics without Borders

wwhadoop.png

This week at EMC World 2012, the EMC technical community is launching a community program called World Wide Hadoop. I am really excited to be part of this collaboration across the EMC technical community, which has been looking to extend the “borders” of our Big Data portfolio. Building on the success of our Greenplum Hadoop distribution, we are offering the open source community the ability to federate their HBase analytics across a distributed set of Hadoop clusters.

wwhadoopqr.png

Over the past month, the EMC Distinguished Engineer community has been collaborating with our St. Petersburg [RUS] Center of Excellence to demonstrate the ability to distribute analytic jobs (moving the code rather than the data) across multiple [potentially] geographically dispersed clusters, manage those jobs, and join the results.
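To make the "move the code, not the data" idea concrete, here is a minimal Python sketch of the scatter-and-merge pattern. It is only an illustration under stated assumptions, not the wwHadoop implementation: the `CLUSTERS` dictionary, the site names, and the helper functions are all hypothetical, and local lists stand in for data that would really live on remote Hadoop clusters.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for three remote clusters. In a real
# deployment each entry would be a job-submission endpoint, not local data.
CLUSTERS = {
    "site_a": ["error", "ok", "ok"],
    "site_b": ["ok", "error", "error", "ok"],
    "site_c": ["ok"],
}

def count_events(records):
    """The analytic job: runs where the data lives and returns a small summary."""
    return Counter(records)

def run_on_cluster(name, job):
    """Stand-in for shipping `job` to cluster `name` and executing it there."""
    return job(CLUSTERS[name])

def federated_run(job, cluster_names):
    """Send the job to every cluster in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda name: run_on_cluster(name, job), cluster_names))
    total = Counter()
    for partial in partials:
        total.update(partial)  # only the small summaries cross the network
    return total

result = federated_run(count_events, list(CLUSTERS))
print(dict(result))
```

The point of the design is that only the job and its compact per-site summaries travel between locations; the raw records never leave the cluster that holds them.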

The big problem we are addressing is described by Reed’s law: the combinatorial value of networked resources, which in our case are information sets.

Reed.png

“[E]ven Metcalfe’s law understates the value created by a group-forming network [GFN] as it grows. Let’s say you have a GFN with n members. If you add up all the potential two-person groups, three-person groups, and so on that those members could form, the number of possible groups equals 2^n. So the value of a GFN increases exponentially, in proportion to 2^n. I call that Reed’s Law. And its implications are profound.”
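As a quick worked illustration of the quote, the Python sketch below compares the number of pairwise links (the Metcalfe view) with the number of possible groups (the Reed view) for a few network sizes. The function names are my own, and I count only groups of two or more members, which gives exactly 2^n - n - 1; that still grows in proportion to 2^n, as the quote says.

```python
from math import comb

def metcalfe_pairs(n):
    """Number of distinct two-member links among n members: n choose 2."""
    return comb(n, 2)

def reed_groups(n):
    """Number of subsets with at least two members: 2^n minus the
    n single-member subsets and the one empty subset."""
    return 2**n - n - 1

for n in (5, 10, 20):
    print(f"n={n}: pairs={metcalfe_pairs(n)}, groups={reed_groups(n)}")
```

For information sets rather than people, the same arithmetic suggests why combining data across many sites is so attractive: twenty sets admit 190 pairings but over a million possible combinations.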

A few of our Big Data Challenges

201205181256.jpg

  • Valuable information is produced across geographically diverse locations
  • The data has become too big to move [thus we need to process in place]
  • Scientists and analysts have begun to move partial sets rather than full corpora to try to save time
  • But this partial data can, and often does, create inadvertent variance or occlusions in correlation and value

EMC is demonstrating a working distributed cluster model for analytics across multiple clusters.

201205181314.jpg

We want to work on this with the open community. We believe there is tremendous value in enabling the community to both derive and add value with EMC in this space, and we invite all to join us at http://www.wwhadoop.com.
