Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

Building an Apache Hadoop Data Application

Tom White (Cloudera), Joey Echeverria (Rocana), Ryan Blue (Cloudera)
1:30pm–5:00pm Wednesday, 02/18/2015
Hadoop Platform
Location: 210 D/H
Average rating: ***..
(3.67, 3 ratings)
Slides:   1-PDF    2-PDF 

Materials or downloads needed in advance

*THIS TUTORIAL HAS REQUIREMENTS AND INSTRUCTIONS LISTED BELOW*

In the second (afternoon) half of the Architecture Day tutorial, attendees will build a data application from the ground up. The application will ingest streaming user data (such as web clicks) and, using tools and APIs in the Kite SDK, transform and store the data in Hadoop in a form that is readily consumable with Hadoop tools like Impala and Spark.

As part of the tutorial, we will demonstrate how Kite codifies the best practices from the Hadoop Architecture Day morning session.

*TUTORIAL REQUIREMENTS AND INSTRUCTIONS FOR ATTENDEES*

Please read through the following before you arrive onsite to make sure you are prepared for the Building an Apache Hadoop Data Application tutorial.

Images can be downloaded from http://bits.cloudera.com/6ce9c414/

  • *Username:* strata2015
  • *Password:* strata2015

VirtualBox and VMware downloads are available for students; they are a little less than 4GB.

Instructions for students:

  • Go to http://bits.cloudera.com/6ce9c414/ and sign in with the username and password above
  • Download a VM image for your player of choice (VirtualBox recommended)
  • Unpack the VM and import it into your player

Please do this *before* the tutorial to avoid a stampede of downloads over conference wireless.

    Tom White

    Cloudera

    Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He is the author of “Hadoop: The Definitive Guide” for O’Reilly. Previously he worked as an independent consultant specializing in Hadoop, and before that was co-founder and Lead Developer at Kizoom, a UK mobile application startup. Tom has a Bachelor’s degree in Mathematics from the University of Cambridge, and a Master’s degree in History and Philosophy of Science from the Universities of Leeds, UK, and Florence, Italy.

    Joey Echeverria

    Rocana

    Joey Echeverria is the director of engineering at Rocana, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a software engineer at Cloudera, where he contributed to several ASF projects including Apache Flume, Apache Sqoop, Apache Hadoop, and Apache HBase. Joey is also a coauthor of Hadoop Security, published by O’Reilly.

    Ryan Blue

    Cloudera

    Ryan Blue is a software engineer at Cloudera, currently working on the Kite SDK team.

    Comments

    peter chun
    02/19/2015 2:21am PST

    Hi, where can I get the slides?

    Joey Echeverria
    02/18/2015 6:38am PST

    Yes, we will be posting the slides including links to the labs.

    Rajesh Haran
    02/18/2015 5:04am PST

    Can you kindly post the slides and the materials?

    Ryan Blue
    02/17/2015 8:57am PST

    John, after the download, you’ll want to unpack the zip file and load the VM in VMware. I’m not sure what the exact steps are for VMware, but there should be an option to import an appliance. Once it is imported, make sure you can boot the VM, and then you’re ready to go.

    John Walker
    02/17/2015 8:26am PST

    Sorry, a little clarification required. I’m on a Mac and already have a VMware PC that I’ve downloaded the zip file into. Are there next steps?
    thx – see you at the conference