Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Schedule: Hardcore data science sessions

9:00–17:00 Wednesday, 1/06/2016
Location: Capital Suite 4
Average rating: ***..
(3.91, 11 ratings)
Hardcore Data Science offers a chance to dive deeper into data science and add new techniques and technologies to your toolbox, with sessions on data management, machine learning, natural language processing, crowdsourcing, and algorithm design, all presented by leading data science practitioners.
9:05–9:30 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Intermediate
Mounia Lalmas (Yahoo)
Average rating: ****.
(4.00, 5 ratings)
Mounia Lalmas offers an overview of work aimed at understanding users' preclick experience of ads and building a learning framework to identify ads with low preclick quality.
9:30–10:00 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Intermediate
Tags: real-time, iot
Ira Cohen (Anodot)
Average rating: ****.
(4.80, 5 ratings)
Time series and event data form the basis for real-time insights about the performance of businesses such as ecommerce, the IoT, and web services, but gaining these insights involves designing a learning system that scales to millions or even billions of data streams. Ira Cohen outlines one such system, which performs real-time machine learning and analytics on streams at massive scale.
10:00–10:30 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Intermediate
Danny Bickson (1972)
Average rating: ****.
(4.33, 3 ratings)
A Netflix competition triggered a major academic research effort in recommender systems. However, there is still a big gap between academic research and industry. Danny Bickson covers the current state of recommender systems in industry and explains why, while recommenders built on users' historical purchase data are well understood, recommenders based on images and text are only starting to gain traction.
11:00–11:30 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Tags: ai
Francesca Odone (University of Genova)
Average rating: ***..
(3.50, 2 ratings)
Francesca Odone explores analyzing visual data (images and videos) with the purpose of extracting meaningful information to solve different scene-understanding tasks. Francesca addresses the problem of learning adaptive data representations and covers different application scenarios, including human-robot interaction, activity recognition, and object categorization.
12:00–12:30 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Piotr Mirowski (Google DeepMind)
Average rating: ****.
(4.75, 4 ratings)
Piotr Mirowski looks under the hood of recurrent neural networks and explains how they can be applied to speech recognition, machine translation, sentence completion, and image captioning.
13:30–14:00 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Tags: real-time, iot
Alessandra Staglianò
Average rating: ****.
(4.50, 4 ratings)
Anomaly detection is a hot topic in data science and applies to many fields. It faces the challenges common to all big data projects but must also deal with higher uncertainty and more difficult measurements, all while operating in real time. Alessandra Staglianò explains how those challenges translate to the real world and how to overcome them with the latest data science tools.
14:00–14:30 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Olivier Grisel (Inria & scikit-learn)
Average rating: ****.
(4.00, 3 ratings)
Deep learning leverages compositions of parametrized differentiable modules, commonly referred to as neural networks, to build versatile and powerful predictive models from richly annotated data. Olivier Grisel offers an overview of recent trends and advances in deep learning research in computer vision, natural language understanding, and agent control via reinforcement learning.
14:30–15:00 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Tags: ecommerce
Mikio Braun (Zalando SE)
Average rating: *****
(5.00, 6 ratings)
Mikio Braun explains why, in practice, hardcore data science is not just about learning methods but also about bringing these methods to production. This does not mean simply reimplementing methods in production systems. Rather, you must successfully deal with issues like data updates, cultural differences between data scientists and developers, and monitoring and testing in practice.
15:30–16:00 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Matthew Smith (Microsoft Research)
Average rating: ***..
(3.50, 2 ratings)
Matthew Smith demonstrates how to achieve unexpectedly high predictive accuracy, give domain experts and customers new insights into how a system functions, and build computationally efficient prediction algorithms, in applications such as predicting crops, global carbon emissions, diseases, ecosystems, species distributions, weather, roads, and riots.
16:00–16:30 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Intermediate
Tags: text
Roxana Danger (reed.co.uk)
Average rating: ***..
(3.60, 5 ratings)
One of the main challenges organizations face is the semantic categorization of textual data. Roxana Danger offers an overview of ROOT, the reed online occupational taxonomy, which was constructed to improve the quality of services at reed.co.uk, and discusses the semisupervised methodology behind it for generating and maintaining taxonomies from large collections of textual data.
16:30–17:00 Wednesday, 1/06/2016
Location: Capital Suite 4 Level: Advanced
Michal Galas (University College London)
Average rating: ***..
(3.50, 2 ratings)
Experimental computational simulation environments are increasingly being developed by major financial institutions to model their analytic algorithms. Michal Galas introduces the key concepts underlying these environments, which rely on big data analytics to enable large-scale testing, optimization, and monitoring of algorithms running in virtual or real mode.
14:55–15:35 Friday, 3/06/2016
Location: Capital Suite 17 Level: Intermediate
Johannes Bauer (Cambridge Analytica)
Average rating: **...
(2.00, 1 rating)
Efficient, accurate, and robust ETL (extract, transform, load) pipelines are essential components for building successful data products. Johannes Bauer discusses the fundamental requirements for ETL pipelines, highlighting major guiding principles and challenges, and outlines selected elements of ETL pipeline implementations that use advanced features of Scala.