Presented By O'Reilly and Cloudera
December 5-6, 2016: Training
December 6–8, 2016: Tutorials & Conference

Schedule: Hadoop use cases sessions

9:00am–12:30pm Tuesday, December 6, 2016
Location: 310/311 Level: Intermediate
Mark Grover (Lyft), Ted Malaska (Capital One), Jonathan Seidman (Cloudera)
Average rating: 4.75 (4 ratings)
Mark Grover, Ted Malaska, and Jonathan Seidman explain how to architect a modern, real-time big data platform leveraging recent advancements in open source software, and discuss how to use components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL with Hadoop to enable new forms of data processing and analytics.
12:05pm–12:45pm Wednesday, December 7, 2016
Location: 334/335 Level: Non-technical
Imron Zuhri (Mediatrac)
Average rating: 3.33 (3 ratings)
Mediatrac is a big data technology platform focused on data connectivity, object profiling, and knowledge discovery that allows businesses and startups to build advanced analytic solutions on top of it. Imron Zuhri shares several data connectivity use cases and explains how to leverage distributed computing to tackle massive entity recognition and resolution problems.
4:15pm–4:55pm Wednesday, December 7, 2016
Location: 321/322 Level: Intermediate
Ted Malaska (Capital One)
Average rating: 5.00 (2 ratings)
If your design focuses only on the processing layer to get speed and power, you may be leaving a significant amount of optimization untapped. Ted Malaska describes a set of storage design patterns and schemas implemented on HBase, Kudu, Kafka, Solr, HDFS, and S3 that, by carefully tailoring how data is stored, can reduce processing and access times by two to three orders of magnitude.
5:05pm–5:45pm Wednesday, December 7, 2016
Location: 308/309 Level: Non-technical
Takayuki Nishikawa (Panasonic Corporation), Ei Yamaguhi (NTT DATA)
Takayuki Nishikawa and Ei Yamaguhi explain how Panasonic developed an integrated data analytics platform to analyze the growing volume of home appliance logs from its IoT products. With Hadoop and Hive, the platform achieves scalability for millions of households and a 10x improvement in processing time, while Spark MLlib yields more reliable knowledge about users' lifestyles.
2:35pm–3:15pm Thursday, December 8, 2016
Location: 328/329 Level: Beginner
Tags: ecommerce
Qiaoliang Xiang (ShopBack)
ShopBack, a company that gives customers cash back on successful transactions across various lifestyle categories, crawls 25 million products from multiple ecommerce websites to provide a smooth customer experience. Qiaoliang Xiang walks you through how to crawl and update products, how to scale the process using big data tools, and how to design a modularized system.
5:05pm–5:45pm Thursday, December 8, 2016
Location: 308/309 Level: Intermediate
Rebecca Tien Yu Lin (is-land Systems Inc.), Mon-Fong Mike Jiang (is-land Systems Inc.)
Average rating: 3.67 (3 ratings)
Rebecca Tien Yu Lin and Mon-Fong Mike Jiang offer an overview of a Hadoop-based big data solution that helps the semiconductor industry increase yield by monitoring the huge volume of tool logs and the data generated by the FDC system.