Data Science, Machine Learning & AI: Big data conference & machine learning training

Wednesday 1 May: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
9:00 \| Location: Auditorium Strata Data Conference Keynotes
10:45 Morning break

Thursday 2 May: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
9:00 \| Location: Auditorium Strata Data Conference Keynotes
10:45 Morning break

9:00 - 17:00 Monday, 29 April & Tuesday, 30 April

Hands-on data science with Python

Location: Capital Suite 1

Secondary topics: Data preparation, data governance, and data lineage

Robert Schroll (The Data Incubator)

Average rating:

(4.75, 4 ratings)

Robert Schroll walks you through all the steps of developing a machine learning pipeline from prototyping to production. You'll explore data cleaning, feature engineering, model building and evaluation, and deployment and then extend these models into two applications from real-world datasets. All work will be done in Python. Read more.

9:00 - 17:00 Monday, 29 April & Tuesday, 30 April

Expand your data science and machine learning skills with Python, R, SQL, Spark, and TensorFlow

Location: Capital Suite 7

Secondary topics: Deep Learning

Ian Cook (Cloudera)

Average rating:

(4.33, 3 ratings)

Advancing your career in data science requires learning new languages and frameworks—but learners face an overwhelming array of choices, each with different syntaxes, conventions, and terminology. Ian Cook simplifies the learning process by elucidating the abstractions common to these systems. Through hands-on exercises, you'll overcome obstacles to getting started using new tools. Read more.

9:00 - 17:00 Monday, 29 April & Tuesday, 30 April

Large-scale ML with MLflow, deep learning, and Apache Spark

Location: Capital Suite 17

Secondary topics: Deep Learning, Model lifecycle management

Amir Issaei (Databricks)

Average rating:

(5.00, 1 rating)

Join Amir Issaei to explore neural network fundamentals and learn how to build distributed Keras/TensorFlow models on top of Spark DataFrames. You'll use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models and MLflow to track experiments and manage the machine learning lifecycle. This course is taught entirely in Python. Read more.

9:00 - 17:00 Monday, 29 April & Tuesday, 30 April

Machine learning from scratch in TensorFlow

Location: Capital Suite 9

Secondary topics: Deep Learning

Ana Hocevar (The Data Incubator)

Average rating:

(4.38, 8 ratings)

The TensorFlow library provides for the use of computational graphs, with automatic parallelization across resources. This architecture is ideal for implementing neural networks. Ana Hocevar offers an intro to TensorFlow's capabilities in Python, taking you from building machine learning algorithms piece by piece to using the Keras API provided by TensorFlow with several hands-on applications. Read more.

9:00–12:30 Tuesday, 30 April 2019

Continuous intelligence: Moving machine learning into production reliably

Location: Capital Suite 14

Secondary topics: Model lifecycle management

Danilo Sato (ThoughtWorks), Christoph Windheuser (ThoughtWorks)

Average rating:

(4.31, 13 ratings)

Danilo Sato and Christoph Windheuser walk you through applying continuous delivery (CD), pioneered by ThoughtWorks, to data science and machine learning. Join in to learn how to make changes to your models while safely integrating and deploying them into production, using testing and automation techniques to release reliably at any time and with a high frequency. Read more.

9:00–12:30 Tuesday, 30 April 2019

Cross-cloud model training and serving with Kubeflow

Location: Capital Suite 15

Secondary topics: AI and Data technologies in the cloud, Model lifecycle management

Holden Karau (Independent), Trevor Grant (IBM), Francesca Lazzeri (Microsoft)

Average rating:

(4.43, 7 ratings)

Holden Karau, Francesca Lazzeri, and Trevor Grant offer an overview of Kubeflow and walk you through using it to train and serve models across different cloud environments (and on-premises). You'll use a script to do the initial setup work, so you can jump (almost) straight into training a model on one cloud and then look at how to set up serving in another cluster/cloud. Read more.

9:00–12:30 Tuesday, 30 April 2019

Serverless machine learning with TensorFlow: Part I

Location: Capital Suite 2/3

Secondary topics: AI and Data technologies in the cloud, Deep Learning

Melinda King (ROI Training)

Average rating:

(3.00, 8 ratings)

Melinda King offers an introduction to designing and building machine learning models on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, you’ll learn machine learning (ML) and TensorFlow concepts, and develop skills in developing, evaluating, and productionizing ML models. Read more.

9:00–12:30 Tuesday, 30 April 2019

Using AWS serverless technologies to analyze large datasets

Location: Capital Suite 4

Secondary topics: AI and Data technologies in the cloud, Data preparation, data governance, and data lineage, Health and Medicine

Krishnan Saidapet (REAN Cloud, A Hitachi Vantara company)

Average rating:

(3.43, 7 ratings)

Krishnan Saidapet offers an overview of the latest big data and machine learning serverless technologies from Amazon Web Services (AWS) and leads a deep dive into using them to process and analyze two different datasets: the publicly available Bureau of Labor Statistics dataset and the Chest X-Ray Image Data dataset. Read more.

13:30–17:00 Tuesday, 30 April 2019

Serverless machine learning with TensorFlow: Part II

Location: Capital Suite 11

Secondary topics: AI and Data technologies in the cloud, Deep Learning

Melinda King (ROI Training)

Average rating:

(3.12, 8 ratings)

Melinda King offers an introduction to designing and building machine learning models on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, you’ll learn machine learning (ML) and TensorFlow concepts and develop skills in developing, evaluating, and productionizing ML models. Read more.

13:30–17:00 Tuesday, 30 April 2019

Natural language understanding at scale with Spark NLP

Location: Capital Suite 14

Secondary topics: Deep Learning, Text and Language processing and analysis

Alexander Thomas (John Snow Labs), Claudiu Branzan (Accenture)

Average rating:

(4.00, 4 ratings)

Alex Thomas and Claudiu Branzan lead a hands-on introduction to scalable NLP using the highly performant, highly scalable open source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working code base that you can change and improve. Read more.

13:30–17:00 Tuesday, 30 April 2019

Time series forecasting with Azure Machine Learning

Location: Capital Suite 2/3

Secondary topics: AI and Data technologies in the cloud, Deep Learning, Financial Services, Temporal data and time-series

Francesca Lazzeri (Microsoft), Aashish Bhateja (Microsoft)

Average rating:

(4.25, 4 ratings)

Time series modeling and forecasting is fundamentally important to various practical domains; in the past few decades, machine learning model-based forecasting has become very popular in both private and public decision-making processes. Francesca Lazzeri walks you through using Azure Machine Learning to build and deploy your time series forecasting models. Read more.

9:25–9:45 Wednesday, 1 May 2019

Making data science useful

Location: Auditorium

Cassie Kozyrkov (Google)

Average rating:

(4.05, 39 ratings)

Despite the rise of data engineering and data science functions in today's corporations, leaders report difficulty in extracting value from data. Many organizations aren’t aware that they have a blindspot with respect to their lack of data effectiveness, and hiring experts doesn’t seem to help. Join Cassie Kozyrkov to talk about how you can change that. Read more.

11:15–11:55 Wednesday, 1 May 2019

Recommending and searching at Spotify

Location: Expo Hall (Capital Hall N24)

Secondary topics: Media, Marketing, Advertising, Retail and e-commerce

Mounia Lalmas (Spotify)

Average rating:

(4.16, 19 ratings)

Spotify's mission is "to match fans and artists in a personal and relevant way." Mounia Lalmas shares some of the (research) work the company is doing to achieve this, from using machine learning to metric validation, illustrated through examples within the context of home and search. Read more.

11:15–11:55 Wednesday, 1 May 2019

Spark NLP in action: How Indeed applies NLP to standardize résumé content at scale

Location: Capital Suite 14

Secondary topics: Deep Learning, Media, Marketing, Advertising, Text and Language processing and analysis

Alexander Thomas (John Snow Labs), Alexis Yelton (Indeed)

Average rating:

(4.67, 3 ratings)

Alexander Thomas and Alexis Yelton demonstrate how to use Spark NLP and Apache Spark to standardize semistructured text, illustrated by Indeed's standardization process for résumé content. Read more.

11:15–11:55 Wednesday, 1 May 2019

Building a secure and transparent ML pipeline using open source technologies

Location: Capital Suite 15/16

Secondary topics: Ethics, Security and Privacy

Nick Pentreath (IBM)

Average rating:

(4.75, 4 ratings)

The application of AI algorithms in domains such as criminal justice, credit scoring, and hiring holds unlimited promise. At the same time, it raises legitimate concerns about algorithmic fairness. There's a growing demand for fairness, accountability, and transparency from machine learning (ML) systems. Nick Pentreath explains how to build just such a pipeline leveraging open source tools. Read more.

11:15–11:55 Wednesday, 1 May 2019

Predicting real-time transaction fraud using supervised learning

Location: Capital Suite 17

Secondary topics: Deep Learning, Financial Services, Temporal data and time-series

Sami Niemi (Barclays)

Average rating:

(4.62, 16 ratings)

Predicting transaction fraud of debit and credit card payments in real time is an important challenge, which state-of-art supervised machine learning models can help to solve. Sami Niemi offers an overview of the solutions Barclays has been developing and testing and details how well models perform in variety of situations like card present and card not present debit and credit card transactions. Read more.

12:05–12:45 Wednesday, 1 May 2019

Agile NLP workflows with spaCy and Prodigy

Location: Expo Hall (Capital Hall N24)

Secondary topics: AI and machine learning in the enterprise, Text and Language processing and analysis

Matthew Honnibal (Explosion AI)

Average rating:

(4.00, 4 ratings)

Matthew Honnibal shares "one weird trick" that can give your NLP project a better chance of success: avoid a waterfall methodology where data definition, corpus construction, modeling, and deployment are performed as separate phases of work. Read more.

12:05–12:45 Wednesday, 1 May 2019

Dealing with data scarcity in natural language processing

Location: Capital Suite 14

Secondary topics: Text and Language processing and analysis

Yves Peirsman (NLP Town)

Average rating:

(4.57, 7 ratings)

In this age of big data, NLP professionals are all too often faced with a lack of data: written language is abundant, but labeled text is much harder to come by. Yves Peirsman outlines the most effective ways of addressing this challenge, from the semiautomatic construction of labeled training data to transfer learning approaches that reduce the need for labeled training examples. Read more.

12:05–12:45 Wednesday, 1 May 2019

Visually communicating statistical and machine learning methods

Location: Capital Suite 15/16

Secondary topics: Visualization, Design, and UX

Michael Freeman (University of Washington)

Average rating:

(4.18, 11 ratings)

Statistical and machine learning techniques are only useful when they're understood by decision makers. While implementing these techniques is easier than ever, communicating about their assumptions and mechanics is not. Michael Freeman details a design process for crafting visual explanations of analytical techniques and communicating them to stakeholders. Read more.

12:05–12:45 Wednesday, 1 May 2019

Sequence-to-sequence modeling for time series

Location: Capital Suite 17

Secondary topics: Deep Learning, Temporal data and time-series

Arun Kejariwal (Independent), Ira Cohen (Anodot)

Average rating:

(4.00, 5 ratings)

Sequence-to-sequence modeling (seq2seq) is now being used for applications based on time series data. Arun Kejariwal and Ira Cohen offer an overview seq2seq and explore its early use cases. They then walk you through leveraging seq2seq modeling for these use cases, particularly with regard to real-time anomaly detection and forecasting. Read more.

14:05–14:45 Wednesday, 1 May 2019

The evolution of data science skill sets: An analysis using exponential family embeddings

Location: Capital Suite 14

Secondary topics: Media, Marketing, Advertising, Text and Language processing and analysis

Maryam Jahanshahi (TapRecruit)

Average rating:

(4.00, 3 ratings)

Maryam Jahanshahi explores exponential family embeddings: methods that extend the idea behind word embeddings to other data types. You'll learn how TapRecruit used dynamic embeddings to understand how data science skill sets have transformed over the last three years, using its large corpus of job descriptions, and more generally, how these models can enrich analysis of specialized datasets. Read more.

14:05–14:45 Wednesday, 1 May 2019

Using machine learning for stock picking

Location: Capital Suite 15/16

Secondary topics: Financial Services, Temporal data and time-series

Alun Biffin (Van Lanschot Kempen), David Dogon (Van Lanschot Kempen)

Average rating:

(4.45, 11 ratings)

Alun Biffin and David Dogon explain how machine learning revolutionized the stock-picking process for portfolio managers at Kempen Capital Management by filtering the vast small-cap investment universe down to a handful of optimal stocks. Read more.

14:05–14:45 Wednesday, 1 May 2019

The unreasonable effectiveness of transfer learning on NLP

Location: Capital Suite 17

Secondary topics: Deep Learning, Text and Language processing and analysis

David Low (Pand.ai)

Average rating:

(3.57, 7 ratings)

Transfer learning has been proven to be a tremendous success in computer vision—a result of the ImageNet competition. In the past few months, there have been several breakthroughs in natural language processing with transfer learning, namely ELMo, OpenAI Transformer, and ULMFit. David Low demonstrates how to use transfer learning on an NLP application with SOTA accuracy. Read more.

14:05–14:45 Wednesday, 1 May 2019

Fair, privacy-preserving, and secure ML

Location: Expo Hall (Capital Hall N24)

Secondary topics: Security and Privacy

Mikio Braun (Zalando)

Average rating:

(5.00, 3 ratings)

Mikio Braun explores techniques and concepts around fairness, privacy, and security when it comes to machine learning models. Read more.

14:55–15:35 Wednesday, 1 May 2019

TensorFlow for everyone

Location: Expo Hall (Capital Hall N24)

Secondary topics: Deep Learning

Wolff Dobson (Google, Inc.)

Average rating:

(3.83, 6 ratings)

Wolff Dobson covers the latest in TensorFlow. Whether you're a beginner or are migrating from 1.x to 2.0, you'll learn the best ways to set up your model, feed your data to it, and distribute it for fast training. You'll also discover how TensorFlow has been recently upgraded to be more intuitive. Read more.

14:55–15:35 Wednesday, 1 May 2019

Solving data cleaning and unification using human-guided machine learning

Location: Capital Suite 14

Secondary topics: Data preparation, data governance, and data lineage

Ihab Ilyas (University of Waterloo)

Average rating:

(4.71, 7 ratings)

Last year, Ihab Ilyas covered two primary challenges in applying machine learning to data curation: entity consolidation and using probabilistic inference to suggest data repair for identified errors and anomalies. This year, he explores these limitations in greater detail and explains why data unification projects quickly require human-guided machine learning and a probabilistic model. Read more.

14:55–15:35 Wednesday, 1 May 2019

Explainable machine learning in fintech

Location: Capital Suite 15/16

Secondary topics: Ethics, Financial Services, Health and Medicine

Eitan Anzenberg (Bill.com)

Average rating:

(4.50, 4 ratings)

Machine learning applications balance interpretability and performance. Linear models provide formulas to directly compare the influence of the input variables, while nonlinear algorithms produce more accurate models. Eitan Anzenberg explores a solution that utilizes what-if scenarios to calculate the marginal influence of features per prediction and compare with standardized methods such as LIME. Read more.

14:55–15:35 Wednesday, 1 May 2019

Synthetic video generation: Why seeing should not always be believing

Location: Capital Suite 17

Secondary topics: Deep Learning, Media, Marketing, Advertising, Security and Privacy

Alexander Adam (Faculty)

Average rating:

(4.00, 1 rating)

The advent of "fake news" has led us to doubt the truth of online media, and advances in machine learning give us an even greater reason to question what we are seeing. Despite the many beneficial applications of this technology, it's also potentially very dangerous. Alex Adam explains how synthetic videos are created and how they can be detected. Read more.

16:35–17:15 Wednesday, 1 May 2019

A Magic 8 Ball for optimal cost and resource allocation for the big data stack

Location: Capital Suite 15/16

Secondary topics: Automation in data science and big data, Temporal data and time-series

Shivnath Babu (Unravel Data Systems | Duke University), Alkis Simitsis (Micro Focus)

Average rating:

(5.00, 1 rating)

Cost and resource provisioning are critical components of the big data stack. Shivnath Babu and Alkis Simitsis detail how to build a Magic 8 Ball for the big data stack—a decomposable time series model for optimal cost and resource allocation that offers enterprises a glimpse into their future needs and enables effective and cost-efficient project and operational planning. Read more.

16:35–17:15 Wednesday, 1 May 2019

LSTM-based time series anomaly detection using Analytics Zoo for Spark and BigDL

Location: Capital Suite 17

Secondary topics: Deep Learning, Temporal data and time-series

Guoqiong Song (Intel)

Average rating:

(3.40, 5 ratings)

Collecting and processing massive time series data (e.g., logs, sensor readings, etc.) and detecting the anomalies in real time is critical for many emerging smart systems, such as industrial, manufacturing, AIOps, and the IoT. Guoqiong Song explains how to detect anomalies in time series data using Analytics Zoo and BigDL at scale on a standard Spark cluster. Read more.

16:35–17:15 Wednesday, 1 May 2019

Opening the black box: Explainable AI (XAI)

Location: Expo Hall (Capital Hall N24)

Secondary topics: Ethics, Security and Privacy

Maren Eckhoff (QuantumBlack)

Average rating:

(4.50, 4 ratings)

The success of machine learning algorithms in a wide range of domains has led to a desire to leverage their power in ever more areas. Maren Eckhoff discusses modern explainability techniques that increase the transparency of black box algorithms, drive adoption, and help manage ethical, legal, and business risks. Many of these methods can be applied to any model without limiting performance. Read more.

17:25–18:05 Wednesday, 1 May 2019

Federated learning: Machine learning with privacy on the edge

Location: Capital Suite 14

Secondary topics: Security and Privacy

Chris Wallace (Cloudera)

Average rating:

(5.00, 4 ratings)

Imagine building a model whose training data is collected on edge devices such as cell phones or sensors. Each device collects data unlike any other, and the data cannot leave the device because of privacy concerns or unreliable network access. This challenging situation is known as federated learning. Chris Wallace discusses the algorithmic solutions and the product opportunities. Read more.

17:25–18:05 Wednesday, 1 May 2019

Reading China: Predicting policy change with machine learning

Location: Capital Suite 15/16

Secondary topics: Text and Language processing and analysis

Weifeng Zhong (Mercatus Center at George Mason University)

Average rating:

(4.75, 4 ratings)

Weifeng Zhong shares a machine learning algorithm built to “read” the People’s Daily (the official newspaper of the Communist Party of China) and predict changes in China’s policy priorities. The output of this algorithm, named the Policy Change Index (PCI) of China, turns out to be a leading indicator of the actual policy changes in China since 1951. Read more.

11:15–11:55 Thursday, 2 May 2019

Fraud detection at a financial institution using unsupervised learning and text mining

Location: Capital Suite 14

Secondary topics: AI and machine learning in the enterprise, Financial Services, Security and Privacy, Text and Language processing and analysis

David Dogon (Van Lanschot Kempen)

Average rating:

(4.75, 8 ratings)

David Dogon dives into a best practice use case for detecting fraud at a financial institution and details a dynamic and robust monitoring system that successfully detects unwanted client behavior. Join in to learn how machine learning models can provide a solution in cases where traditional systems fall short. Read more.

11:15–11:55 Thursday, 2 May 2019

Learning "learning to rank"

Location: Capital Suite 15/16

Secondary topics: Media, Marketing, Advertising, Retail and e-commerce

Sophie Watson (Red Hat)

Average rating:

(4.10, 10 ratings)

Identifying relevant documents quickly and efficiently enhances both user experience and business revenue every day. Sophie Watson demonstrates how to implement learning-to-rank algorithms and provides you with the information you need to implement your own successful ranking system. Read more.

11:15–11:55 Thursday, 2 May 2019

Deep learning for speech synthesis: The good news, the bad news, and the fake news

Location: Capital Suite 17

Secondary topics: Deep Learning, Graph technologies and analytics, Security and Privacy

Scott Stevenson (Faculty)

Average rating:

(5.00, 4 ratings)

Modern deep learning systems allow us to build speech synthesis systems with the naturalness of a human speaker. While there are myriad benevolent applications, this also ushers in a new era of fake news. Scott Stevenson explores the danger of such systems and details how deep learning can also be used to build countermeasures to protect against political disinformation. Read more.

11:15–11:55 Thursday, 2 May 2019

How to keep ethical with machine learning

Location: Expo Hall (Capital Hall N24)

Secondary topics: Ethics

Jerry Overton (DXC)

Average rating:

(5.00, 3 ratings)

Machine learning (ML) algorithms are good at learning new behaviors but bad at identifying when those behaviors are harmful or don’t make sense. Bias, ethics, and fairness are big risk factors in ML. However, we creators have a lot of experience dealing with intelligent beings—one another. Jerry Overton uses this common sense to build a checklist for protecting against ethical violations with ML. Read more.

12:05–12:45 Thursday, 2 May 2019

Deep learning for recommender systems

Location: Expo Hall (Capital Hall N24)

Secondary topics: Deep Learning, Media, Marketing, Advertising, Retail and e-commerce

Oliver Gindele (Datatonic)

Average rating:

(4.50, 6 ratings)

The success of deep learning has reached the realm of structured data in the past few years, where neural networks have been shown to improve the effectiveness and predictability of recommendation engines. Oliver Gindele offers a brief overview of such deep recommender systems and explains how they can be implemented in TensorFlow. Read more.

12:05–12:45 Thursday, 2 May 2019

NLP Architect by Intel's AI Lab

Location: Capital Suite 14

Secondary topics: Deep Learning, Text and Language processing and analysis

Moshe Wasserblat (Intel)

Average rating:

(4.67, 3 ratings)

Moshe Wasserblat offers an overview of NLP Architect, an open source DL NLP library that provides SOTA NLP models, making it easy for researchers to implement NLP algorithms and for data scientists to build NLP-based solutions for extracting insight from textual data to improve business operations. Read more.

12:05–12:45 Thursday, 2 May 2019

How to mitigate mobile fraud risk by data analytics

Location: Capital Suite 15/16

SEONMIN KIM (LINE)

Average rating:

(4.00, 6 ratings)

Seonmin Kim offers an introduction to activities that mitigate the risk of mobile payments through various data analytical skills, drawn from actual case studies of mobile frauds, along with tree-based machine learning, graph analytics, and statistical approaches. Read more.

12:05–12:45 Thursday, 2 May 2019

Inclusive design: Deep learning on audio in Azure, identifying sounds in real time

Location: Capital Suite 17

Swetha Machanavajhala (Microsoft), Xiaoyong Zhu (Microsoft)

In this auditory world, the human brain processes and reacts effortlessly to a variety of sounds. While many of us take this for granted, there are over 360 million in this world who are deaf or hard of hearing. Swetha Machanavajhala and Xiaoyong Zhu explain how to make the auditory world inclusive and meet the great demand in other sectors by applying deep learning on audio in Azure. Read more.

14:05–14:45 Thursday, 2 May 2019

AI for good at scale in real time: Challenges in machine learning and deep learning

Location: Expo Hall (Capital Hall N24)

Secondary topics: Data Integration and Data Pipelines, Deep Learning

Alex Jaimes (Dataminr)

Average rating:

(3.00, 2 ratings)

When emergency events occur, social signals and sensor data are generated. Alex Jaimes explains how to apply machine learning and deep learning to process large amounts of heterogeneous data from various sources in real time, with a particular focus on how such information can be used for emergencies and in critical events for first responders and for other social good use cases. Read more.

14:05–14:45 Thursday, 2 May 2019

8 prerequisites of a graph query language

Location: Capital Suite 14

Secondary topics: Graph technologies and analytics

Mingxi Wu (TigerGraph)

Average rating:

(2.75, 4 ratings)

Graph query language is the key to unleash the value from connected data. Mingxi Wu outlines the eight prerequisites of a practical graph query language, drawn from six years' experience dealing with real-world graph analytical use cases. Along the way, Mingxi compares GSQL, Gremlin, Cypher, and SPARQL, pointing out their respective pros and cons. Read more.

14:05–14:45 Thursday, 2 May 2019

Reinforcement learning: A gentle introduction and an industrial application

Location: Capital Suite 15/16

Secondary topics: IoT and its applications, Temporal data and time-series

Christian Hidber (bSquare)

Average rating:

(4.86, 7 ratings)

Reinforcement learning (RL) learns complex processes autonomously like walking, beating the world champion in Go, or flying a helicopter. No big datasets with the “right” answers are needed: the algorithms learn by experimenting. Christian Hidber shows how and why RL works and demonstrates how to apply it to an industrial hydraulics application with 7,000 clients in 42 countries. Read more.

14:05–14:45 Thursday, 2 May 2019

Deep learning for fonts

Location: Capital Suite 17

Secondary topics: Deep Learning

Raghotham Sripadraj (Ericsson), Nischal Harohalli Padmanabha (Omnius)

Average rating:

(5.00, 1 rating)

Deep learning has enabled massive breakthroughs in offbeat tracks and has enabled better understanding of how an artist paints, how an artist composes music, and so on. Nischal Harohalli Padmanabha and Raghotham Sripadraj discuss their project Deep Learning for Humans and their plans to build a font classifier. Read more.

14:55–15:35 Thursday, 2 May 2019

Learning with limited labeled data

Location: Capital Suite 14

Shioulin Sam (Cloudera Fast Forward Labs)

Average rating:

(4.45, 11 ratings)

Supervised machine learning requires large labeled datasets—a prohibitive limitation in many real-world applications. What if machines could learn with fewer labeled examples? Shioulin Sam shares an algorithmic solution that relies on collaboration between humans and machines to label smartly and discusses product possibilities. Read more.

14:55–15:35 Thursday, 2 May 2019

Early incident detection using fusion analytics of commuter-centric data sources

Location: Capital Suite 15/16

Secondary topics: IoT and its applications, Temporal data and time-series, Transportation and Logistics

Christopher Hooi (Land Transport Authority of Singapore)

Average rating:

(5.00, 3 ratings)

Christopher Hooi offers an overview of the Fusion Analytics for Public Transport Event Response (FASTER) system, a real-time advanced analytics solution for early warning of potential train incidents. FASTER uses engineering and commuter-centric IoT data sources to activate contingency plans at the earliest possible time and reduce impact to commuters. Read more.

14:55–15:35 Thursday, 2 May 2019

A deep learning approach to automatic call routing

Location: Capital Suite 17

Secondary topics: AI and machine learning in the enterprise, Deep Learning

Tal Doron (GigaSpaces)

Average rating:

(3.50, 2 ratings)

Technological advancements are transforming customer experience, and businesses are beginning to benefit from deep learning innovations to automate call center routing to the most proper agent. Tal Doron explains how to run deep learning models with Intel BigDL and Spark frameworks colocated on an in-memory computing platform to enhance the customer experience without the need for GPUs Read more.

16:35–17:15 Thursday, 2 May 2019

Evaluating cybersecurity defenses with a data science approach

Location: Capital Suite 14

Secondary topics: AI and machine learning in the enterprise, Financial Services, Security and Privacy

Brennan Lodge (Goldman Sachs), Jay Kesavan (Bowery Analytics LLC)

Average rating:

(3.00, 3 ratings)

Cybersecurity analysts are under siege to keep pace with the ever-changing threat landscape. The analysts are overworked as they are bombarded with and burned out by the sheer number of alerts that they must carefully investigate. Brennan Lodge and Jay Kesavan explain how to use a data science model for alert evaluations to empower your cybersecurity analysts. Read more.

16:35–17:15 Thursday, 2 May 2019

Improving infrastructure efficiency with unsupervised algorithms

Location: Capital Suite 15/16

Secondary topics: IoT and its applications, Transportation and Logistics

Alexandre Hubert (Dataiku)

Average rating:

(5.00, 1 rating)

GRDF helps bring natural gas to nearly 11 million customers every day. Alexandre Hubert explains how, in partnership with GRDF, Dataiku worked to optimize the manual process of qualifying addresses to visit and ultimately save GRDF time and money. This solution was the culmination of a yearlong adventure in the land of maintenance experts, legacy IT systems, and Agile development. Read more.

Data Science, Machine Learning & AI

If you're in data, you need to understand machine learning & AI

Featured Speakers

Sponsorship Opportunities

Partner Opportunities

Contact Us