Strata + Hadoop World Tutorials

All confirmed Tutorials for Strata + Hadoop World are listed below. Please note: to attend, your registration must include Tutorials on Monday.

Sutton Center - Sutton South
Tom White (Cloudera), Eric Sammer (Rocana), Joey Echeverria (Rocana)
Average rating: ***..
(3.71, 14 ratings)
In this tutorial we'll use the Cloudera Development Kit (CDK) to build a Java web app that logs application events to Hadoop, and then run ad hoc and scheduled queries against the collected data.
Murray Hill Suite
Julie Rodriguez (Eagle Investment Systems)
Average rating: *....
(1.00, 5 ratings)
Learn how to find beauty in data. The beauty of a visual is that it can communicate so much. As we become more sophisticated with the amount of data we can harness, it will become more important for us to be equally good at visually communicating that data. This workshop will guide attendees through the process of learning a method that will aid in selecting the right visualization.
Regent Parlor
Matt Harrison (MetaSnake)
Average rating: ***..
(3.71, 7 ratings)
This tutorial will jumpstart your Python experience. Learn the basics: enough Python to be dangerous. Then use two of the most popular packages for analysis, Matplotlib for plotting and Pandas for data wrangling. This will be a hands-on tutorial, so bring a laptop with Python 2.7 installed, and the gumption to hit the ground running and see what everyone is raving about.
Nassau Suite
Antonio piccolbo@gmail.com (Per data LLC), Joseph Rickert (Revolution Analytics)
Average rating: ***..
(3.40, 5 ratings)
This tutorial is aimed at R users who want to use Hadoop to work on big data and Hadoop users who want to do sophisticated analytics. We will introduce R, Hadoop, and the RHadoop project. We will then cover three R packages for Hadoop and the MapReduce model. We will present numerous examples of incremental complexity, including the combination of rmr and RevoScaleR to solve modeling problems.
Grand Ballroom West
Israel Ekpo (Walt Disney Parks and Resorts)
Average rating: *....
(1.19, 47 ratings)
This is a 3-hour tutorial on how to use Apache Flume to aggregate massive quantities of structured or unstructured data from sources such as log data, click streams, social media data, graph data and network traffic into centralized data stores such as HDFS, ElasticSearch, Neo4j and MongoDB so that they can be processed, digested and visualized in realtime using D3.js and HTML5 WebSockets.
Rhinelander South
Tathagata Das (Databricks), Haoyuan Li (Alluxio), Ion Stoica (UC Berkeley), Reynold Xin (Databricks), Sameer Agarwal (UC Berkeley)
Average rating: ****.
(4.80, 10 ratings)
An introduction to the open-source Berkeley Data Analytics Stack (BDAS). Spark is a high-speed cluster computing engine that supports rich analytics (e.g. machine learning) and lower-latency processing (e.g. streaming). Tachyon provides in-memory storage, letting Spark and Hadoop jobs share data efficiently. Shark and GraphX provide high-speed Hive SQL queries and graph processing on top of Spark.
Beekman Parlor - Sutton North
Alistair Croll (Solve For Interesting)
Average rating: **...
(2.50, 4 ratings)
For business strategists, marketers, product managers, and entrepreneurs, Data-Driven Business looks at how to use data to make better business decisions faster. Packed with case studies, panels, and eye-opening presentations, this fast-paced day focuses on how to solve today's thorniest business problems with Big Data. It's the missing MBA for a data-driven, always-on business world.
Gramercy Suite
Average rating: ***..
(3.14, 7 ratings)
Strata's regular data science track has great talks with real-world experience from leading-edge speakers. But we didn't stop there: we added the Hardcore Data Science day to give you a chance to go even deeper. The Hardcore day will add new techniques and technologies to your data science toolbox, shared by leading data science practitioners from startups, industry, consulting, and academia.
Regent Parlor
Patricia Gorla (The Last Pickle)
Average rating: ***..
(3.00, 12 ratings)
Before you analyze your big data, you need a way to store and access it. Here we examine the benefits of using a highly available, eventually consistent storage system, and what impact this has on real-time analytics. This session will prepare you to set up a working multi-node Cassandra and Hadoop cluster.
Murray Hill Suite
Giovanni Seni (Intuit)
Average rating: ***..
(3.44, 9 ratings)
This tutorial, based on a published book by the speaker, offers a hands-on intro to ensemble models, which combine multiple models into a single predictive system that’s often more accurate than the best of its components. Participants will use data sets and snippets of R code to experiment with the methods to gain a practical understanding of this breakthrough technology.
Sutton Center - Sutton South
Sean Murphy (PingThings), Benjamin Bengfort (District Data Labs and University of Maryland)
Average rating: ****.
(4.56, 18 ratings)
Much of the world’s data (and your own) is text. The key to unlocking its value is a series of Natural Language Processing transformations that turn raw strings into a machine-usable form. We will use Hadoop alongside Python’s NLTK to perform these steps and discuss why each is necessary in your application.
Rhinelander South
Matthew Russell (Digital Reasoning)
Average rating: *****
(5.00, 8 ratings)
A code-intensive workshop that breaks down the nuts and bolts of using IPython Notebook to uncover insights from social web APIs such as Twitter, Facebook, LinkedIn, and Google+. Attendees with a basic programming background will walk away with a working knowledge of how to access and mine valuable information from the social web.
Nassau Suite
Leah Hanson (Google)
Average rating: ****.
(4.00, 1 rating)
Julia is a high-performance, open source language with great tools for numerical and statistical work. If you know R, MATLAB, or NumPy, you will feel at home in Julia. Unlike these systems, however, Julia takes advantage of modern compiler technology, combining an intuitive programming model with the speed of a low-level language. This workshop will take you from installed to productive in Julia.
Grand Ballroom West
John Akred (Silicon Valley Data Science), Richard Williamson (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
Average rating: ***..
(3.71, 17 ratings)
What are the essential components of a data platform? This tutorial will explain how the various parts of the Hadoop and big data ecosystem fit together in production to create a data platform supporting batch, interactive and realtime analytical workloads.

Sponsorship Opportunities

For exhibition and sponsorship opportunities, contact Susan Stewart at sstewart@oreilly.com

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, email mediapartners@oreilly.com

Press & Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

Contact Us

View a complete list of Strata + Hadoop World 2013 contacts