Introduction to Hadoop

Data: Hadoop
Location: C123
Average rating: 4.27 (11 ratings)

This presentation will give you the big picture of how Hadoop works. We will cover the key pieces of the Hadoop ecosystem:

HDFS, the distributed, fault-tolerant filesystem.

Map/Reduce, the method of batch processing distributed data.

In this intro we will cover the key processes (the NameNode, the TaskTracker, and the JobTracker) and the phases of a job: map, sort and shuffle, and reduce.
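While the session covers how these phases run across a real cluster, the flow of a Map/Reduce job can be sketched in a few lines of plain Python. This is a hypothetical word-count illustration, not Hadoop's actual API: each phase is a local function standing in for work that Hadoop distributes across machines.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map: emit a (key, value) pair for every word seen."""
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle_sort(pairs):
    """Sort and shuffle: bring together all values that share a key."""
    for key, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield (key, [value for _, value in group])

def reduce_phase(grouped):
    """Reduce: fold each key's values into a single result."""
    for key, values in grouped:
        yield (key, sum(values))

lines = ["hadoop stores data", "hadoop processes data"]
counts = dict(reduce_phase(shuffle_sort(map_phase(lines))))
print(counts)  # {'data': 2, 'hadoop': 2, 'processes': 1, 'stores': 1}
```

The point of the sketch is the shape of the pipeline: mappers never see each other's output, the sort-and-shuffle step is what guarantees a reducer receives every value for its key, and reducers only aggregate.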

A diverse ecosystem of tools is commonly used around Hadoop; there will be time only for a brief mention of Flume, Oozie, Hive, Pig, and HBase and their features.


Tom Hanlon

Functional Media

Tom Hanlon is a senior instructor at Functional Media, where he delivers courses on the wonders of the Hadoop ecosystem. Before beginning his relationship with Hadoop and large distributed data, he had a happy and lengthy relationship with MySQL with a focus on web operations. He has been a trainer for MySQL, Sun, and Percona.