Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Temporal data and time-series sessions

13:30–17:00 Tuesday, 30 April 2019
Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Ivan Kelly (Streamlio)
Average rating: ***..
(3.00, 10 ratings)
Many industry segments have been grappling with fast data (high-volume, high-velocity data). Arun Kejariwal and Karthik Ramasamy walk you through the state-of-the-art systems for each stage of an end-to-end data processing pipeline for real-time data (messaging, compute, and storage), along with algorithms to extract insights (e.g., heavy hitters and quantiles) from data streams.

13:30–17:00 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 2/3
Francesca Lazzeri (Microsoft), Aashish Bhateja (Microsoft)
Average rating: ****.
(4.25, 4 ratings)
Time series modeling and forecasting is fundamentally important to various practical domains; in the past few decades, machine learning model-based forecasting has become very popular in both private and public decision-making processes. Francesca Lazzeri walks you through using Azure Machine Learning to build and deploy your time series forecasting models.

11:15–11:55 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Sami Niemi (Barclays)
Average rating: ****.
(4.62, 16 ratings)
Predicting transaction fraud in debit and credit card payments in real time is an important challenge, which state-of-the-art supervised machine learning models can help to solve. Sami Niemi offers an overview of the solutions Barclays has been developing and testing and details how well models perform in a variety of situations, such as card-present and card-not-present debit and credit card transactions.

12:05–12:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Arun Kejariwal (Independent), Ira Cohen (Anodot)
Average rating: ****.
(4.00, 5 ratings)
Sequence-to-sequence modeling (seq2seq) is now being used for applications based on time series data. Arun Kejariwal and Ira Cohen offer an overview of seq2seq and explore its early use cases. They then walk you through leveraging seq2seq modeling for these use cases, particularly with regard to real-time anomaly detection and forecasting.

14:05–14:45 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Jian Chang (Alibaba Group), Sanjian Chen (Alibaba Group)
Average rating: ***..
(3.33, 3 ratings)
Jian Chang and Sanjian Chen share the architecture design and many detailed technology innovations of Alibaba TSDB, a state-of-the-art database for IoT data management, and discuss lessons learned from years of development and continuous improvement.

14:05–14:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Alun Biffin (Van Lanschot Kempen), David Dogon (Van Lanschot Kempen)
Average rating: ****.
(4.45, 11 ratings)
Alun Biffin and David Dogon explain how machine learning revolutionized the stock-picking process for portfolio managers at Kempen Capital Management by filtering the vast small-cap investment universe down to a handful of optimal stocks.

16:35–17:15 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Guoqiong Song (Intel)
Average rating: ***..
(3.40, 5 ratings)
Collecting and processing massive time series data (e.g., logs and sensor readings) and detecting anomalies in real time is critical for many emerging smart systems, such as those in industry, manufacturing, AIOps, and the IoT. Guoqiong Song explains how to detect anomalies in time series data using Analytics Zoo and BigDL at scale on a standard Spark cluster.

16:35–17:15 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Shivnath Babu (Unravel Data Systems | Duke University), Alkis Simitsis (Micro Focus)
Average rating: *****
(5.00, 1 rating)
Cost and resource provisioning are critical components of the big data stack. Shivnath Babu and Alkis Simitsis detail how to build a Magic 8 Ball for the big data stack: a decomposable time series model for optimal cost and resource allocation that offers enterprises a glimpse into their future needs and enables effective and cost-efficient project and operational planning.

14:05–14:45 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Christian Hidber (bSquare)
Average rating: ****.
(4.86, 7 ratings)
Reinforcement learning (RL) autonomously learns complex processes such as walking, beating the world champion in Go, or flying a helicopter. No big datasets with the “right” answers are needed: the algorithms learn by experimenting. Christian Hidber shows how and why RL works and demonstrates how to apply it to an industrial hydraulics application with 7,000 clients in 42 countries.

14:55–15:35 Thursday, 2 May 2019
Data Engineering and Architecture, Expo Hall
Location: Expo Hall 2 (Capital Hall N24)
Michael Freedman (TimescaleDB | Princeton University)
Average rating: ****.
(4.75, 4 ratings)
Time series databases must ingest high volumes of structured data, answer complex queries performantly over recent and historical time intervals, and perform specialized time-centric analysis and data management. Michael Freedman explains how to address these operational challenges by reengineering Postgres to serve as a general data platform, including for high-volume time series workloads.

14:55–15:35 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Christopher Hooi (Land Transport Authority of Singapore)
Average rating: *****
(5.00, 3 ratings)
Christopher Hooi offers an overview of the Fusion Analytics for Public Transport Event Response (FASTER) system, a real-time advanced analytics solution for early warning of potential train incidents. FASTER uses engineering and commuter-centric IoT data sources to activate contingency plans at the earliest possible time and reduce the impact on commuters.