Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Schedule: Platform sessions

9:00am–5:00pm Tuesday, September 26, 2017
Location: 1E 09
Rose Winterton (Pitney Bowes), Audrey Spencer-Alvarado (Portland Trail Blazers), Amie Elcan (CenturyLink), Sean Power (Repable), Parisa Foster (Play The Future), Nick Selby (CJX, Inc. | Midlothian Police Department), Salema Rice (Allegis Group), Aneesh Karve (Quilt), Derek Ruths (CAI), Kristina Bergman (Integris Software), Natalia Adler (UNICEF HQ), Brandon O'Brien (Expedia, Inc)
In a series of 12 half-hour talks aimed at a business audience, you’ll hear data-themed case studies from household brands and global companies, explaining the challenges they wanted to tackle, the approaches they took, and the benefits (and drawbacks) of their solutions. If you want practical insights about applied data, look no further.
9:00am–5:00pm Tuesday, September 26, 2017
Location: 1A 06/07
Ben Lorica (O'Reilly Media), Assaf Araki (Intel), Jacob Schreiber (University of Washington), Alex Ratner (Stanford University), Madeleine Udell (Cornell University), Yunsong Guo (Pinterest), Katherine Heller (Duke University), Alan Nichol (Rasa), Gerard de Melo (Rutgers University), Tamara Broderick (MIT), Inbal Tadeski (Anodot), Daniel Kang (Stanford University), Bichen Wu (UC Berkeley), Shaked Shammah (Hebrew University)
A full day of hardcore data science, exploring emerging topics and new areas of study made possible by vast troves of raw data and cutting-edge architectures for analyzing and exploring information. Along the way, leading data science practitioners teach new techniques and technologies to add to your data science toolbox.
11:20am–12:00pm Wednesday, September 27, 2017
Data engineering, Data Engineering & Architecture
Location: 1A 23/24 Level: Intermediate
Zhenxiao Luo (Uber), Wei Yan (Uber)
Average rating: 4.43 (7 ratings)
Uber's geospatial data is increasing exponentially as the company grows. As a result, its big data systems must also grow in scalability, reliability, and performance to support business decisions, user recommendations, and experiments for geospatial data. Zhenxiao Luo and Wei Yan explain how Uber runs geospatial analysis efficiently in its big data systems, including Hadoop, Hive, and Presto.
1:15pm–1:55pm Wednesday, September 27, 2017
Travis Bakeman (T-Mobile)
Average rating: 2.00 (1 rating)
Travis Bakeman shares how T-Mobile ported its large-scale network performance management platform, T-PIM, from a legacy database to a big data platform with Impala as the main reporting interface. Travis covers the migration journey, including the challenges the team faced, how the team evaluated new technologies, the lessons learned along the way, and the efficiencies gained as a result.
1:15pm–1:55pm Wednesday, September 27, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 21/22 Level: Advanced
Average rating: 4.57 (7 ratings)
John Hitchingham shares insights into the design and operation of FINRA's data lake in the AWS cloud, where FINRA extracts, transforms, and loads over 75 billion transactions per day. Users can query across petabytes of data in seconds on AWS S3 using Presto and Spark, all while maintaining security and data lineage.
2:05pm–2:45pm Wednesday, September 27, 2017
Enterprise adoption, Strata Business Summit
Location: 1A 18 Level: Intermediate
Nandu Jayakumar (Visa), Justin Erickson (Cloudera)
Average rating: 1.00 (1 rating)
At Visa, optimizing the enterprise data warehouse and consolidating data marts by migrating these analytic workloads to Hadoop has played a key role in the adoption of the platform and in how data has transformed Visa as an organization. Nandu Jayakumar and Justin Erickson share Visa’s journey along with some best practices for organizations migrating workloads to Hadoop.
2:55pm–3:35pm Wednesday, September 27, 2017
Data Engineering & Architecture, Enterprise adoption
Location: 1A 23/24 Level: Beginner
Simon Chan (Salesforce)
Average rating: 5.00 (1 rating)
Salesforce recently released Einstein, which brings AI into its core platform to power every business. The secret behind Einstein is an underlying platform that accelerates AI development at scale for both internal and external data scientists. Simon Chan shares his experience building this unified platform for a multitenant, multibusiness cloud enterprise.
4:35pm–5:15pm Wednesday, September 27, 2017
Stephen Devine (Big Fish Games), Kalah Brown (Big Fish Games)
Companies are increasingly interested in processing and analyzing live-streaming data. The Hadoop ecosystem includes platforms and software library frameworks to support this work, but these components require correct architecture, performance tuning, and customization. Stephen Devine and Kalah Brown explain how they used Spark, Flume, and Kafka to build a live-streaming data pipeline.
4:35pm–5:15pm Wednesday, September 27, 2017
Data engineering, Data Engineering & Architecture
Location: 1A 23/24 Level: Advanced
Barbara Eckman (Comcast)
Average rating: 3.00 (2 ratings)
Barbara Eckman offers an overview of Comcast’s streaming data platform, which is composed of a variety of ingest, transformation, and storage services. The platform uses Apache Avro schemas to support end-to-end data governance, Apache Atlas for data discovery and lineage, and custom asynchronous messaging libraries to notify Atlas of new data and schema entities and lineage links as they are created.
4:35pm–5:15pm Wednesday, September 27, 2017
Artificial Intelligence, Machine Learning & Data Science
Location: 1A 12/14 Level: Intermediate
Nadeem Gulzar (Danske Bank Group), Sune Askjær (Think Big Analytics, a Teradata Company)
Average rating: 5.00 (3 ratings)
Fraud in banking is an arms race, and criminals are now using machine learning to improve their attack effectiveness. Sune Askjær and Nadeem Gulzar explore how Danske Bank uses deep learning for better fraud detection, covering model effectiveness, TensorFlow versus boosted decision trees, operational considerations in training and deploying models, and lessons learned along the way.
5:25pm–6:05pm Wednesday, September 27, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 15/16/17 Level: Intermediate
Josh Baer (Spotify), Alison Gilles (Spotify)
Average rating: 4.00 (1 rating)
In early 2016, Spotify decided that it didn’t want to be in the data center business. The future was the cloud. Josh Baer and Alison Gilles explain what it took to move Spotify to the cloud, covering Spotify's technology choices, challenges faced, and the lessons Spotify learned along the way.
1:15pm–1:55pm Thursday, September 28, 2017
Data engineering, Strata Business Summit
Location: 1E 10/11 Level: Intermediate
Kurt Brown (Netflix)
Average rating: 4.40 (5 ratings)
Kurt Brown explains how to get the most out of your data infrastructure with 20 principles and practices used at Netflix. Kurt covers each in detail and explores how they relate to the technologies used at Netflix, including S3, Spark, Presto, Druid, R, Python, and Jupyter.
1:15pm–1:55pm Thursday, September 28, 2017
Big data and the Cloud, Data Engineering & Architecture
Location: 1A 21/22 Level: Intermediate
Andrew Otto (Wikimedia Foundation), Fangjin Yang (Imply)
The Wikimedia Foundation (WMF) is a nonprofit charitable organization. As the parent company of Wikipedia, one of the most visited websites in the world, WMF faces many unique challenges around its ecosystem of editors, readers, and content. Andrew Otto and Fangjin Yang explain how the WMF does analytics and offer an overview of the technology it uses to do so.
2:05pm–2:45pm Thursday, September 28, 2017
Javier Esplugas (DHL Supply Chain), Kevin Parent (Conduce)
DHL has created an IoT initiative for its supply chain warehouse operations. Javier Esplugas and Kevin Parent explain how DHL has gained unprecedented insight, from the most comprehensive global view across all locations down to a unique data feed from a single sensor, to see, understand, and act on everything that occurs in its warehouses with immersive operational data visualization.
2:05pm–2:45pm Thursday, September 28, 2017
Bargava Subramanian (Independent), Harjinder Mistry (Red Hat)
Average rating: 3.00 (1 rating)
Bargava Subramanian and Harjinder Mistry explain how machine learning and deep learning techniques are helping Red Hat build smart developer tools that make software developers more efficient.
2:55pm–3:35pm Thursday, September 28, 2017
Data engineering, Data Engineering & Architecture
Location: 1A 23/24 Level: Intermediate
Felix GV (LinkedIn), Yan Yan (LinkedIn)
Average rating: 2.00 (1 rating)
Companies with batch and stream processing pipelines need to serve the insights they glean back to their users, an often-overlooked problem that can be hard to solve reliably and at scale. Felix GV and Yan Yan offer an overview of Venice, a new data store capable of ingesting data from Hadoop and Kafka, merging it together, replicating it globally, and serving it online at low latency.