Hadoop Talk - SkillsMatter 2009

After an embarrassing tale of misunderstanding, wrong locations and blind luck, I recently ended up at the "Introduction to data processing with Hadoop and Pig" talk over at SkillsMatter, and it was excellent.

For those who don’t know Hadoop, it’s an open source Java framework for data-intensive distributed applications. It lets applications run across thousands of nodes and work with petabytes of data, and it was inspired by Google’s MapReduce and Google File System (GFS) papers. I was aware of the basics, but even in an hour I learned enough to know where to look for more details. Pig, on the other hand, is (to me) like SQL but for Hadoop. It’s a lot easier to use than writing your own Java apps, and its scripts are simpler (and actually possible) for non-developers to read than the reams of classes required for custom jobs.
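For anyone who hasn’t met the MapReduce model that Hadoop is built around, here’s a minimal sketch of the idea as a word count in plain Python. To be clear, this is not Hadoop’s Java API and it wasn’t shown in the talk; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own, and everything runs in memory on one machine rather than across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    On a real cluster the framework does this between the map and
    reduce steps; here it is just a dictionary of lists.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop and pig", "pig on hadoop"]))
```

The appeal of the model is that the map and reduce functions only ever see one record or one key at a time, so the framework is free to spread them over as many machines as it has. As I understood it, a Pig script describes this sort of grouping and counting at a higher level and leaves the map and reduce plumbing to Pig.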

The speaker was excellent and the presentation was well timed, fluid, concise and paced just the way I like it. Other than the question session, the evening was very enjoyable. You can find the Hadoop slides online.