Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK

Building an Apache Hadoop data application

Tom White (Cloudera), Joey Echeverria (Rocana), Ryan Blue (Cloudera)
13:30–17:00 Tuesday, 5/05/2015
Hadoop Platform
Location: King's Suite - Balmoral
Average rating: 3.50 (12 ratings)

Prerequisite Knowledge

Knowledge of Hadoop components, working knowledge of Java

Materials or downloads needed in advance

THIS TUTORIAL HAS REQUIREMENTS AND INSTRUCTIONS LISTED BELOW

In the second (afternoon) half of the Architecture Day tutorial, attendees will build a data application from the ground up. The application will ingest streaming user data (like web clicks) and, using tools and APIs in the Kite SDK, transform and store the data in Hadoop in a form that is readily consumable with Hadoop tools like Impala and Spark.

As a part of the tutorial we will demonstrate how Kite codifies the best practices from the Hadoop Architecture Day morning session.

TUTORIAL REQUIREMENTS AND INSTRUCTIONS FOR ATTENDEES

Please read through the following before you arrive onsite to make sure you are prepared for the Building an Apache Hadoop Data Application tutorial.

VM images can be downloaded from http://bits.cloudera.com/6ce9c414/

  • Username: strata2015
  • Password: strata2015

Both VirtualBox and VMware images are available for students; the downloads are a little less than 4GB.

Instructions for students:

  • Go to http://bits.cloudera.com/6ce9c414/ and log in with the username and password above
  • Download a VM image for your player of choice (VirtualBox recommended)
  • Unpack the VM and import it in your player

Please do this before the tutorial to avoid a stampede of downloads over conference wireless.

Tom White

Cloudera

Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He is the author of Hadoop: The Definitive Guide for O’Reilly. Previously he worked as an independent consultant specializing in Hadoop, and before that was co-founder and lead developer at Kizoom, a UK mobile applications startup. Tom has a Bachelor’s degree in Mathematics from the University of Cambridge, and a Master’s degree in History and Philosophy of Science from the Universities of Leeds, UK, and Florence, Italy.

Joey Echeverria

Rocana

Joey Echeverria is the director of engineering at Rocana, where he builds applications on the Apache Hadoop platform for scaling IT operations. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a software engineer at Cloudera, where he contributed to several ASF projects including Apache Flume, Apache Sqoop, Apache Hadoop, and Apache HBase. Joey is also a coauthor of Hadoop Security, published by O’Reilly.

Ryan Blue

Cloudera

Ryan Blue is a software engineer at Cloudera, currently working on the Kite SDK team.

Comments

Tom White
5/05/2015 15:13 BST

Slides are available on SlideShare: http://www.slideshare.net/tomwhite/strata-london-building-an-apache-hadoop-data-application

Joey Echeverria
1/05/2015 0:40 BST

The prerequisites didn’t get updated here. Please follow the directions from this page:

http://strataconf.com/big-data-conference-ca-2015/public/schedule/detail/38206

Markus Perl
30/04/2015 14:32 BST

Hi,
Is the tutorial based on CDH 5.3 or on CDH 5.4? The first e-mail, “PLEASE READ (Tutorial Instructions): Building an Apache Hadoop Data Application”, recommends installing a CDH 5.3 VM, but the second e-mail, “PLEASE READ (Tutorial Instructions): Architectural considerations for Hadoop applications”, links to CDH 5.4.

Thanks,
Markus