Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Schedule: Temporal data and time-series analytics sessions

9:00am–5:00pm Tuesday, 09/11/2018
Location: 1A 08
Alistair Croll (Solve For Interesting), Robert Passarella (Alpha Features), Amro Alkhatib (National Health Insurance Company-Daman), Mridul Mishra (Fidelity Investments), Patrick Angeles (Cloudera), James Psota (Panjiva), Andreas Kohlmaier (Munich Re), Paul Lashmet (Arcadia Data), Nick Curcuru (Mastercard), Robin Way (Corios), Theresa Johnson (Airbnb), Jane Tran (Unqork), Swatee Singh (American Express)
From analyzing risk and detecting fraud to predicting payments and improving customer experience, take a deep dive into the ways data technologies are transforming the financial industry.
1:30pm–5:00pm Tuesday, 09/11/2018
Location: 1A 12/14 Level: Intermediate
Bruno Gonçalves (Data For Science, Inc)
Average rating: 3.14 (7 ratings)
Time series are everywhere around us. Understanding them requires taking into account the sequence of values seen in previous steps and even long-term temporal correlations. Join Bruno Gonçalves to learn how to use recurrent neural networks to model and forecast time series and discover the advantages and disadvantages of recurrent neural networks with respect to more traditional approaches.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Mikio Braun (Zalando SE)
Average rating: 4.86 (7 ratings)
Time series data has many applications in industry, from analyzing server metrics to monitoring IoT signals and outlier detection. Mikio Braun offers an overview of time series analysis with a focus on modern machine learning approaches and practical considerations, including recommendations for what works and what doesn’t, and industry use cases.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1A 12/14 Level: Intermediate
Arun Kejariwal (Independent), Francois Orsini (MZ)
Average rating: 4.00 (1 rating)
The rate of growth of data volume and velocity has been accelerating along with increases in the variety of data sources. This poses a significant challenge to extracting actionable insights in a timely fashion. Arun Kejariwal and Francois Orsini explain how marrying correlation analysis with anomaly detection can help and share techniques to guide effective decision making.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Ankit Jain (Uber)
Average rating: 3.00 (3 ratings)
Personalization is a common theme in social networks and ecommerce businesses. Personalization at Uber involves an understanding of how each driver and rider is expected to behave on the platform. Ankit Jain explains how Uber employs deep learning using LSTMs and its huge database to understand and predict the behavior of each and every user on the platform.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 12/14 Level: Intermediate
Roger Barga (Amazon Web Services), Sudipto Guha (Amazon Web Services), Kapil Chhabra (Amazon Web Services)
Average rating: 5.00 (3 ratings)
Roger Barga, Sudipto Guha, and Kapil Chhabra explain how unsupervised learning with the robust random cut forest (RRCF) algorithm enables insights into streaming data and share new applications to impute missing values, forecast future values, detect hotspots, and perform classification tasks. They also demonstrate how to implement unsupervised learning over massive data streams.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Alex Heye (Cray), Ding Ding (Intel)
Precipitation nowcasting is used to predict the future rainfall intensity over a relatively short timeframe. The forecasting resolution and time accuracy required are much higher than for other traditional forecasting tasks. Alexander Heye and Ding Ding explain how to build a precipitation nowcasting system with recurrent neural networks using BigDL on Apache Spark.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1A 08 Level: Intermediate
Cris Lowery (Baringa), Marc Warner (ASI)
Average rating: 4.00 (1 rating)
In EU households, heating and hot water alone account for 80% of energy usage. Cristobal Lowery and Marc Warner explain how future home energy management systems could improve their energy efficiency by predicting resident needs through utilities data, with a particular focus on the key data features, the need for data compression, and the data quality challenges.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1E 07/08 Level: Beginner
Thomas Weise (Lyft), Mark Grover (Lyft)
Average rating: 2.50 (2 ratings)
Thomas Weise and Mark Grover explain how Lyft uses its streaming platform to detect and respond to anomalous events, using data science tools for machine learning and a process that allows for fast and predictable deployment.
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1E 07/08 Level: Intermediate
Heitor Murilo Gomes (Télécom ParisTech), Albert Bifet (Télécom ParisTech)
The StreamDM library provides the largest collection of data stream mining algorithms for Spark. Heitor Murilo Gomes and Albert Bifet explain how to use StreamDM and Structured Streaming to develop, apply, and evaluate learning models specifically for nonstationary streams (i.e., those with concept drift).
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1A 06/07 Level: Beginner
Jared Lander (Lander Analytics)
Average rating: 5.00 (3 ratings)
Temporal data is being produced in ever-greater quantity, but fortunately our time series capabilities are keeping pace. Jared Lander explores techniques for modeling time series, from traditional methods such as ARMA to more modern tools such as Prophet and machine learning models like XGBoost and neural nets. Along the way, Jared shares theory and code for training these models.
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1A 03/04/05 Level: Intermediate
Revant Nayar (FMI Technologies LLC)
Average rating: 1.50 (2 ratings)
Machine learning has so far underperformed in time series prediction because of slowness and overfitting, and classical methods are ineffective at capturing nonlinearity. Revant Nayar shares an alternative approach that is faster and more transparent and does not overfit. It can also pick up regime changes in the time series and systematically capture all the nonlinearity of a given dataset.