Sep 23–26, 2019

Data Science, Machine Learning, & AI

Machine learning lets you discover hidden insights in your data. It's a simple idea with phenomenal impact and sophisticated use cases, such as recommenders, text mining, real-time analytics, large-scale anomaly detection, and business forecasting.

At Strata, you’ll get a deeper and broader understanding of machine and deep learning—take a look at the sessions below.


Monday-Tuesday, September 23-24: 2-Day Training (Platinum & Training passes)
Tuesday, September 24: Tutorials (Gold & Silver passes)
Wednesday, September 25: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
9:00 | Location: Auditorium
Strata Data Conference Keynotes
10:50
Morning break
Thursday, September 26: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
9:00 | Location: Auditorium
Strata Data Conference Keynotes
10:50
Morning break
9:00am - 5:00pm Monday, September 23 & Tuesday, September 24
Location: 1A 03
Bargava Subramanian (Binaize Labs), Amit Kapoor (narrativeVIZ)
Recommendation systems play a significant role: for users, they open up a new world of options; for companies, they drive engagement and satisfaction. Amit Kapoor and Bargava Subramanian walk you through the different paradigms of recommendation systems and introduce you to deep learning-based approaches. You'll gain the practical hands-on knowledge to build, select, deploy, and maintain a recommendation system.
9:00am - 5:00pm Monday, September 23 & Tuesday, September 24
Location: 1A 15/16
Michael Cullan (The Data Incubator)
Michael Cullan walks you through developing a machine learning pipeline from prototyping to production. You'll learn about data cleaning, feature engineering, model building and evaluation, and deployment and then extend these models into two applications from real-world datasets. All work will be done in Python.
9:00am - 5:00pm Monday, September 23 & Tuesday, September 24
Location: 1A 18
Ian Cook (Cloudera)
Advancing your career in data science requires learning new languages and frameworks, but you face an overwhelming array of choices, each with different syntaxes, conventions, and terminology. Ian Cook simplifies the learning process by outlining the abstractions common to these systems. You'll work through hands-on exercises to overcome obstacles to getting started with new tools.
9:00am - 5:00pm Monday, September 23 & Tuesday, September 24
Location: 1E 07
Dylan Bargteil (The Data Incubator)
The TensorFlow library provides for the use of computational graphs with automatic parallelization across resources. This architecture is ideal for implementing neural networks. Dylan Bargteil explores TensorFlow's capabilities in Python, demonstrating how to build machine learning algorithms piece by piece and how to use TensorFlow's Keras API with several hands-on applications.
9:00am - 12:30pm Tuesday, September 24, 2019
Location: 1A 12
Sourav Dey (Manifold), Jakov Kucan (Manifold)
Sourav Dey and Jakov Kucan walk you through the six steps of the Lean AI process and explain how it helps your ML engineers work as an integrated part of your development and production teams. You'll get a hands-on example using real-world data, so you can get up and running with Docker and Orbyter and see firsthand how streamlined they can make your workflow.
9:00am - 12:30pm Tuesday, September 24, 2019
Location: 1A 21
Jules Damji (Databricks)
ML development brings many new complexities beyond the software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information. Jules Damji walks you through MLflow, an open source project that simplifies the entire ML lifecycle, to solve this problem.
9:00am - 12:30pm Tuesday, September 24, 2019
Location: 1A 23
Alice Zhao (Metis)
Data scientists are known for crunching numbers, but you need to decide what to do when you run into text data. Alice Zhao walks you through the steps to turn text data into a format that a machine can understand, explores some of the most popular text analytics techniques, and showcases several natural language processing (NLP) libraries in Python, including NLTK, TextBlob, spaCy, and gensim.
9:00am - 12:30pm Tuesday, September 24, 2019
Location: 1E 12/13
Secondary topics: Deep Learning
Bruno Goncalves (Data For Science, Inc)
You'll go hands-on to learn the theoretical foundations and principal ideas underlying deep learning and neural networks. Bruno Goncalves provides implementations whose code structure closely resembles the way Keras is structured, so that by the end of the course, you'll be prepared to dive deeper into the deep learning applications of your choice.
1:30pm - 5:00pm Tuesday, September 24, 2019
Location: 1A 12
Garrett Hoffman (StockTwits)
Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include Word2Vec, recurrent neural networks (RNNs) and variants (long short-term memory [LSTM] and gated recurrent unit [GRU]), and convolutional neural networks.
1:30pm - 5:00pm Tuesday, September 24, 2019
Location: 1A 21
Karthik Sonti (Amazon Web Services), Emily Webber (Amazon Web Services), Varun Rao Bhamidimarri (Amazon Web Services)
Karthik Sonti, Emily Webber, and Varun Rao Bhamidimarri introduce you to the Amazon SageMaker machine learning platform and provide a high-level discussion of recommender systems. You'll dig into different machine learning approaches for recommender systems, including common methods such as matrix factorization as well as newer embedding approaches.
1:30pm - 5:00pm Tuesday, September 24, 2019
Location: 1A 23
David Talby (Pacific AI), Alex Thomas (John Snow Labs), Saif Addin Ellafi (John Snow Labs), Claudiu Branzan (Accenture)
David Talby, Alex Thomas, Saif Addin Ellafi, and Claudiu Branzan walk you through state-of-the-art natural language processing (NLP) using the highly performant, highly scalable open source Spark NLP library. You'll spend about half your time coding as you work through four sections, each with an end-to-end working codebase that you can change and improve.
1:30pm - 5:00pm Tuesday, September 24, 2019
Location: 1E 08
Sophie Watson (Red Hat), William Benton (Red Hat)
Go hands-on with Sophie Watson and William Benton to examine data structures that let you answer interesting queries about massive datasets in fixed amounts of space and constant time. This seems like magic, but they'll explain the key trick that makes it possible and show you how to use these structures for real-world machine learning and data engineering applications.
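The abstract does not name the structures it covers. As one well-known illustration of the fixed-space, constant-time idea, here is a minimal Bloom filter sketch in Python. The class name, parameters, and sample IDs are hypothetical, and the tutorial may use different structures entirely. It answers "have I seen this item?" using a bit array whose size never grows, at the cost of occasional false positives.

```python
import hashlib


class BloomFilter:
    """Fixed-size membership sketch: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive num_hashes positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True only if every position for this item has been set.
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for user_id in ["alice", "bob", "carol"]:  # hypothetical IDs
    seen.add(user_id)

print("alice" in seen)  # True: added items are always reported as present
print(len(seen.bits))   # storage stays at num_bits however many items are added
```

Each lookup touches only num_hashes positions, so query time does not depend on how many items have been added.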
11:20am - 12:00pm Wednesday, September 25, 2019
Location: 3B - Expo Hall
Secondary topics: Ethics
Harsha Nori (Microsoft), Sameul Jenkins (Microsoft), Rich Caruana (Microsoft)
Understanding decisions made by machine learning systems is critical for sensitive uses, ensuring fairness, and debugging production models. Interpretability presents options for trying to understand model decisions. Harsha Nori, Sameul Jenkins, and Rich Caruana explore the tools Microsoft is releasing to help you train powerful, interpretable models and interpret existing black box systems.
11:20am - 12:00pm Wednesday, September 25, 2019
Location: 1A 06/07
Ying Yau (Walmart Labs)
Time series forecasting techniques can be applied in a wide range of scientific disciplines, business scenarios, and policy settings. Jeffrey Yau details the application of deep learning techniques to time series forecasting and compares them to time series statistical models when forecasting time series with trends, multiple seasonality, regime switch, and exogenous series.
11:20am - 12:00pm Wednesday, September 25, 2019
Location: 1A 08/10
Secondary topics: Culture and Organization
Ann Spencer (Domino), Paco Nathan (Derwen), Amy Heineike (Primer), Pete Warden (TensorFlow)
If, as a data scientist, you've wondered why it takes so long to deploy your model into production or, as an engineer, thought data scientists have no idea what they want, you're not alone. Join a lively discussion panel with industry veterans Ann Spencer, Paco Nathan, Amy Heineike, and Pete Warden to find best practices or insights on increasing collaboration when developing and deploying models.
11:20am - 12:00pm Wednesday, September 25, 2019
Location: 1A 12
Ted Dunning (MapR)
Feature engineering is generally the section that gets left out of machine learning books, but it's also the most critical part in practice. Ted Dunning explores techniques, a few well known, but some rarely spoken of outside the institutional knowledge of top teams, including how to handle categorical inputs, natural language, transactions, and more in the context of machine learning.
1:15pm - 1:55pm Wednesday, September 25, 2019
Location: 3B - Expo Hall
Saif Addin Ellafi (John Snow Labs), Scott Hoch (BlackBox Engineering)
Recruiting patients for clinical trials is a major challenge in drug development. Saif Addin Ellafi and Scott Hoch explain how Deep 6 uses Spark NLP to scale its training and inference pipelines to millions of patients while achieving state-of-the-art accuracy. They dive into the technical challenges, the architecture of the full solution, and the lessons the company learned.
1:15pm - 1:55pm Wednesday, September 25, 2019
Location: 1A 06/07
Every NLP-based document-processing solution depends on converting scanned documents and images to machine-readable text using an OCR solution, which is limited by the quality of the scanned images. Nagendra Shishodia, Chaithanya Manda, and Solmaz Torabi explore how generative adversarial networks (GANs) can bring significant efficiencies to any document-processing solution by enhancing the resolution of scanned images and denoising them.
1:15pm - 1:55pm Wednesday, September 25, 2019
Location: 1A 08/10
James Tang (Walmart Labs), Yiyi Zeng (Walmart Labs), Linhong Kang (Walmart Labs)
James Tang, Yiyi Zeng, and Linhong Kang outline how Walmart provides a secure and seamless shopping experience through machine learning and large-scale data analysis on a centralized platform.
1:15pm - 1:55pm Wednesday, September 25, 2019
Location: 1A 12
Secondary topics: Deep Learning
Shioulin Sam (Cloudera Fast Forward Labs)
Supervised machine learning requires large labeled datasets, a prohibitive limitation in many real-world applications. But this could be avoided if machines could learn with a few labeled examples. Shioulin Sam explores and demonstrates an algorithmic solution that relies on collaboration between human and machine to label smartly, and she outlines product possibilities.
2:05pm - 2:45pm Wednesday, September 25, 2019
Location: 3B - Expo Hall
Panos Alexopoulos (Textkernel)
In an era where discussions among data scientists are monopolized by the latest trends in machine learning, the role of semantics in data science is often underplayed. Panos Alexopoulos presents real-world cases where making fine, seemingly pedantic, distinctions in the meaning of data science tasks and the related data has significantly improved their effectiveness and value.
2:05pm - 2:45pm Wednesday, September 25, 2019
Location: 1A 06/07
Keshav Peswani (Expedia Group), Ashish Aggarwal (Expedia Group)
Observability is the key in modern architecture to quickly detect and repair problems in microservices. Modern observability platforms have evolved beyond simple application logs and include distributed tracing systems like Zipkin and Haystack. Keshav Peswani and Ashish Aggarwal explore how combining them with real-time, intelligent alerting mechanisms helps in the automated detection of problems.
2:05pm - 2:45pm Wednesday, September 25, 2019
Location: 1A 08/10
Nan Zhu (Uber), Felix Cheung (Uber)
XGBoost has been widely deployed in companies across the industry. Nan Zhu and Felix Cheung dive into the internals of distributed training in XGBoost and demonstrate how XGBoost solves business problems at Uber at a scale of thousands of workers and tens of terabytes of training data.
2:05pm - 2:45pm Wednesday, September 25, 2019
Location: 1A 12
Mikio Braun (Zalando)
With ML becoming more mainstream, the side effects of machine learning and AI on our lives are becoming more visible. You have to take extra measures to make machine learning models fair and unbiased, and awareness of the need to preserve privacy in ML models is growing rapidly. Mikio Braun explores techniques and concepts around fairness, privacy, and security when it comes to machine learning models.
2:55pm - 3:35pm Wednesday, September 25, 2019
Location: 3B - Expo Hall
Gerard de Melo (Rutgers University)
Gerard de Melo takes a deep dive into the kinds of sentiment and emotion consumers associate with a text. With new data-driven approaches, organizations can better pay attention to what's being said about them in different markets. And you can consider fonts and palettes best suited to convey specific emotions, so organizations can make informed choices when presenting information to consumers.
2:55pm - 3:35pm Wednesday, September 25, 2019
Location: 1A 06/07
Tony Xing (Microsoft), Bixiong Xu (Microsoft), Congrui Huang (Microsoft), Qiyang Li (Microsoft)
Anomaly detection may sound old-fashioned, yet it's super important in many industry applications. Tony Xing, Bixiong Xu, Congrui Huang, and Qiyang Li detail a novel anomaly-detection algorithm based on spectral residual (SR) and a convolutional neural network (CNN) and explain how this method was applied in the monitoring system supporting Microsoft AIOps and business incident prevention.
2:55pm - 3:35pm Wednesday, September 25, 2019
Location: 1A 08/10
Fei Wang (CarGurus)
Fei Wang takes a deep dive into a case study for the CarGurus TV Attribution Model. You'll understand how you can leverage the creation of a causal inference model to calculate cost per acquisition (CPA) of TV spend and measure effectiveness when compared to CPA of digital performance marketing spend.
2:55pm - 3:35pm Wednesday, September 25, 2019
Location: 1A 12
Secondary topics: Financial Services
Jari Koister (FICO)
Machine learning and constraint-based optimization are both used to solve critical business problems. They come from distinct research communities and have traditionally been treated separately. But Jari Koister examines how they're similar, how they're different, and how they can be used to solve complex problems with amazing results.
4:35pm - 5:15pm Wednesday, September 25, 2019
Location: 3B - Expo Hall
John Berryman (Eventbrite)
Eventbrite is exploring a new machine learning approach that allows it to harvest data from customer search logs and automatically tag events based upon their content. John Berryman dives into the results and how they have allowed the company to provide users with a better inventory-browsing experience.
4:35pm - 5:15pm Wednesday, September 25, 2019
Location: 1A 06/07
Siddha Ganju (NVIDIA), Meher Kasam (Square)
Over the last few years, convolutional neural networks (CNNs) have risen in popularity, especially in the area of computer vision. Siddha Ganju and Meher Kasam take you through how you can get deep neural nets to run efficiently on mobile devices.
4:35pm - 5:15pm Wednesday, September 25, 2019
Location: 1A 08/10
Robert Pesch (inovex), Robin Senge (inovex)
Data-driven software is revolutionizing the world and enabling intelligent services we interact with daily. Robert Pesch and Robin Senge outline the development process, statistical modeling, data-driven decision making, and components needed for productionizing a fully automated and highly scalable demand forecasting system for an online grocery shop for a billion-dollar retail group in Europe.
4:35pm - 5:15pm Wednesday, September 25, 2019
Location: 1A 12
Criteo’s infrastructure provides the capacity and connectivity to host Criteo’s platform and applications. The evolution of this infrastructure is driven by the ability to forecast Criteo’s traffic demand. Hamlet Jesse Medina Ruiz explains how Criteo uses Bayesian dynamic time series models to accurately forecast its traffic load and optimize hardware resources across data centers.
5:25pm - 6:05pm Wednesday, September 25, 2019
Location: 3B - Expo Hall
Sireesha Muppala (Amazon Web Services), Shelbee Eigenbrode (Amazon Web Services), Emily Webber (Amazon Web Services)
Mansplaining. Know it? Hate it? Want to make it go away? Sireesha Muppala, Shelbee Eigenbrode, and Emily Webber tackle the problem of men talking over or down to women and its impact on career progression for women. They also demonstrate an Alexa skill that uses deep learning techniques on incoming audio feeds, examine ownership of the problem for women and men, and suggest helpful strategies.
5:25pm - 6:05pm Wednesday, September 25, 2019
Location: 1A 06/07
The common perception of deep learning is that it results in a fully self-contained model. However, in most cases, these models have data preprocessing requirements similar to those of more "traditional" machine learning. Despite this, there are few standard solutions for deploying end-to-end deep learning. Nick Pentreath explores how the ONNX format and ecosystem address this challenge.
5:25pm - 6:05pm Wednesday, September 25, 2019
Location: 1A 08/10
Secondary topics: Media and Advertising
Aaron Owen (Major League Baseball), Matthew Horton (Major League Baseball), Josh Hamilton (Major League Baseball)
Using SAS, Python, and AWS SageMaker, Major League Baseball's (MLB's) data science team outlines how it predicts ticket purchasers’ likelihood to purchase again, evaluates prospective season schedules, estimates customer lifetime value, optimizes promotion schedules, quantifies the strength of fan avidity, and monitors the health of monthly subscriptions to its game-streaming service.
5:25pm - 6:05pm Wednesday, September 25, 2019
Location: 1A 12
Secondary topics: Retail and e-commerce
Subhasish Misra (Walmart)
Causal questions are ubiquitous, and randomized tests are considered the gold standard. However, such tests are not always feasible, and then you have only observational data from which to draw causal insights. Techniques such as matching offer an opportunity to solve this. Subhasish Misra explores this approach and shares practical tips for inferring causal effects.
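To illustrate what matching means here, the following is a minimal Python sketch of one-nearest-neighbor matching on a single covariate. It is not the speaker's method; the function name and the customer data are hypothetical. Each treated unit is paired with the control unit whose covariate is closest, and the outcome differences are averaged.

```python
def matched_effect(treated, control):
    """Estimate the average effect on the treated by 1-nearest-neighbor matching.

    Each unit is a (covariate, outcome) pair. Matching is done with replacement.
    """
    diffs = []
    for x_t, y_t in treated:
        # Closest control unit on the covariate.
        x_c, y_c = min(control, key=lambda unit: abs(unit[0] - x_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)


# Hypothetical data: (prior spend, later spend). Treated customers got a promotion.
treated = [(10.0, 15.0), (20.0, 26.0), (30.0, 34.0)]
control = [(9.0, 11.0), (21.0, 22.0), (29.0, 31.0), (50.0, 52.0)]

effect = matched_effect(treated, control)

# Naive comparison of group means, ignoring the covariate.
naive = (sum(y for _, y in treated) / len(treated)
         - sum(y for _, y in control) / len(control))

print(round(effect, 2), naive)  # 3.67 -4.0
```

In this toy data the naive difference in means is negative because one high-spending control customer has no comparable treated customer, while the matched estimate compares like with like.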
5:25pm - 6:05pm Wednesday, September 25, 2019
Location: 1A 01/02
Secondary topics: Transportation and Logistics
Brandy Freitas (Pitney Bowes)
In this session, Brandy Freitas from Pitney Bowes will cover the interplay between graph analytics and machine learning, improved feature engineering with graph-native algorithms, and harnessing the power of graph structure for machine learning through node embedding.
11:20am - 12:00pm Thursday, September 26, 2019
Location: 3B - Expo Hall
Brian Keng (Rubikloud Technologies Inc)
Automating decisions requires a system to consider more than just a data-driven prediction. Real-world decisions require additional constraints and fuzzy objectives to ensure that they are robust and consistent with business goals. This talk will describe how to leverage modern machine learning methods and traditional mathematical optimization techniques for decision automation.
11:20am - 12:00pm Thursday, September 26, 2019
Location: 1A 06/07
Shital Shah (Microsoft Research)
How do we visualize what exactly deep learning is doing? Taming the massive models, data, and training times requires new ways of thinking about them. In this talk, we will introduce new tools and methods to understand AI better. Explaining the decisions made by AI not only helps us accelerate its development but also makes it safer and more trustworthy.
11:20am - 12:00pm Thursday, September 26, 2019
Location: 1A 08/10
Anjali Samani (CircleUp)
The application of smoothing and imputation strategies is common practice in predictive modelling and time series analysis. With a technique-agnostic approach, this session will provide qualitative and quantitative frameworks that address questions related to smoothing and imputation of missing values to improve data density.
11:20am - 12:00pm Thursday, September 26, 2019
Location: 1A 12
Secondary topics: Ethics
Alejandro Saucedo (The Institute for Ethical AI & Machine Learning)
Undesired bias in machine learning has become a worrying topic due to numerous high-profile incidents. In this talk, we demystify machine learning bias through a hands-on example. We'll be tasked with automating the loan approval process for a company, and we'll introduce key tools and techniques from the latest research that allow us to assess and mitigate undesired bias in our machine learning models.
11:20am - 12:00pm Thursday, September 26, 2019
Location: 1E 14
John Allen (Deutsche Bank)
As an early adopter of data science, machine learning, and AI, Deutsche Bank's analytics function is trailblazing new ways to drive revenues, lower costs, and reduce risk across all areas of the group. John Allen shares how his team combines commercial offerings with open source technologies to revolutionize legacy processes and transform the way the bank uses technology to drive innovation.
1:15pm - 1:55pm Thursday, September 26, 2019
Location: 3B - Expo Hall
Victor Dibia (Cloudera Fast Forward Labs)
Recent advances in machine learning frameworks for the browser, such as TensorFlow.js, provide an opportunity to craft truly novel experiences within front-end applications. This talk explores the state of the art for machine learning in the browser using TensorFlow.js and covers its use in the design of Handtrack.js, a library for prototyping real-time hand detection in the browser.
1:15pm - 1:55pm Thursday, September 26, 2019
Location: 1A 06/07
Sameer Agarwal (Facebook Inc.)
Apache Spark is the largest compute engine at Facebook by CPU. This talk will cover the story of how we optimized, tuned, and scaled Apache Spark at Facebook to run on clusters of tens of thousands of machines, process hundreds of petabytes of data, and serve thousands of data scientists, engineers, and product analysts every day.
1:15pm - 1:55pm Thursday, September 26, 2019
Location: 1A 08/10
Alfred Whitehead (Klick), Clare Jeon (KLICK INC)
What will tomorrow’s temperature be? My blood glucose levels tonight before bed? Time series forecasts depend on sensors or measurements made out in the real, messy world. Those sensors flake out, get turned off, disconnect, and otherwise conspire to cause missing data in our signals. We will show a number of methods for handling data gaps and give advice on which to consider and when.
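The abstract does not list the gap-handling methods the session covers. As an illustration only, here is a short Python sketch of two common ones, forward fill and linear interpolation, with gaps marked as None. The function names and the sample readings are hypothetical.

```python
def forward_fill(values):
    """Replace each None with the most recent observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def linear_interpolate(values):
    """Fill interior runs of None on a straight line between their neighbors."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Hypothetical temperature readings with two gaps.
readings = [20.0, None, None, 23.0, 24.0, None, 26.0]

print(forward_fill(readings))        # [20.0, 20.0, 20.0, 23.0, 24.0, 24.0, 26.0]
print(linear_interpolate(readings))  # [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0]
```

Forward fill suits signals that hold their last value, while interpolation suits smoothly varying signals. Neither fills a gap at the very start of a series in this sketch.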
1:15pm - 1:55pm Thursday, September 26, 2019
Location: 1A 12
Sandra Carrico (Glynt.ai)
This talk motivates mixed formal learning, explains it and outlines one machine learning example that previously used large numbers of examples and now learns with either zero or a handful of training examples. It maps apparently idiosyncratic techniques to Mixed Formal Learning, a general AI architecture that you can use in your projects.
2:05pm - 2:45pm Thursday, September 26, 2019
Location: 3B - Expo Hall
Heitor Murilo Gomes (Télécom ParisTech), Albert Bifet (Télécom ParisTech)
In this talk, we show how to develop a machine learning pipeline for streaming data using the StreamDM framework (https://github.com/huawei-noah/streamDM). We also introduce how to use StreamDM for supervised and unsupervised learning tasks, show examples of online preprocessing methods, and explain how to extend the framework by adding new learning algorithms or preprocessing methods.
2:05pm - 2:45pm Thursday, September 26, 2019
Location: 1A 06/07
Secondary topics: Deep Learning, Streaming and IoT
Ryan Foltz (Exabeam)
Unmanaged and foreign devices in corporate networks pose a security risk. The first step toward reducing risk from these devices is the ability to identify them. To support a comprehensive device management program, we propose a deep learning model that performs anomaly detection using only device names, flagging devices that do not follow device naming structures.
2:05pm - 2:45pm Thursday, September 26, 2019
Location: 1A 08/10
Anais Dotis (InfluxData)
Did you know that classical algorithms can outperform machine learning methods in time series forecasting? I’ll show you how I used the Holt-Winters forecasting algorithm to predict water levels in a creek.
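For readers unfamiliar with Holt-Winters, the following is a minimal Python sketch of the additive variant (level, trend, and seasonal components, each updated by exponential smoothing). It is not the speaker's code. The initialization scheme, the parameter values, and the water-level numbers are assumptions chosen for illustration, and the series must span at least two full seasons.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    # Simple initialization from the first two seasons (an assumption; other schemes exist).
    level = sum(series[:season_len]) / season_len
    second = sum(series[season_len:2 * season_len]) / season_len
    trend = (second - level) / season_len
    seasonals = [series[i] - level for i in range(season_len)]

    for t, y in enumerate(series):
        s = seasonals[t % season_len]
        prev_level = level
        # Smooth the deseasonalized observation against the previous projection.
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % season_len] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [level + (h + 1) * trend + seasonals[(n + h) % season_len]
            for h in range(horizon)]


# Hypothetical water levels with a repeating four-step cycle and no trend.
water_levels = [10, 20, 30, 20] * 3
forecast = holt_winters_additive(water_levels, season_len=4,
                                 alpha=0.5, beta=0.5, gamma=0.5, horizon=4)
print(forecast)  # [10.0, 20.0, 30.0, 20.0]
```

On a perfectly periodic series the forecast simply repeats the cycle. Real data would need tuned smoothing parameters, which libraries typically fit automatically.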
2:05pm - 2:45pm Thursday, September 26, 2019
Location: 1A 12
Andrew Leamon (Comcast), Wadkar Sameer (Comcast NBCUniversal)
An overview of the data management and privacy challenges around automating ML model (re)deployments and stream-based inference at scale.
3:45pm - 4:25pm Thursday, September 26, 2019
Location: 1A 06/07
Secondary topics: Deep Learning
Sajan Govindan (Intel), Luca Canali (CERN)
We will show CERN’s research on applying deep learning in high energy physics experiments as an alternative to customized rule-based methods, with an example of topology classification to improve real-time event selection at the Large Hadron Collider experiments. CERN implemented deep learning pipelines on Apache Spark using BigDL and Analytics Zoo open source software on Intel Xeon-based clusters.
3:45pm - 4:25pm Thursday, September 26, 2019
Location: 1A 12
Secondary topics: Financial Services
David Mack (Octavian)
Graphs are a powerful way to represent knowledge. Organizations (in fields such as bio-sciences and finance) are starting to amass large knowledge graphs, but lack the machine-learning tools to extract the insights they need from them. In this presentation, I’ll give an overview of what insights are possible and survey the most popular approaches.
3:45pm - 4:25pm Thursday, September 26, 2019
Location: 1A 08/10
Chad Scherrer (Metis)
This talk will explore the basic ideas in Soss, a new probabilistic programming library for Julia. Soss allows a high-level representation of the kinds of models often written in PyMC3 or Stan, and offers a way to programmatically specify and apply model transformations like approximations or reparameterizations.
4:35pm - 5:15pm Thursday, September 26, 2019
Location: 1A 06/07
Secondary topics: Deep Learning
Naoto Umemori (NTT DATA Corporation), Masaru Dobashi (NTT Data Corp.)
Giant hogweed is a highly toxic plant. Our project aims to automate the detection of giant hogweed using drones and machine learning-based image recognition and detection. We show you how we designed the architecture, how we took advantage of both big data and machine and deep learning technologies (e.g., Hadoop, Spark, and TensorFlow), and the lessons we learned.
4:35pm - 5:15pm Thursday, September 26, 2019
Location: 1A 08/10
Jeroen Janssens (Data Science Workshops B.V.)
In this talk, we present Stochastic Outlier Selection (SOS), an unsupervised algorithm for detecting anomalies in large, high-dimensional data. SOS has been implemented in Python, R, and most recently, Spark. First, we illustrate the idea and intuition behind SOS. Subsequently, we demonstrate our implementation of SOS on top of Spark. Finally, we apply SOS to a real-world use case.

Contact us

confreg@oreilly.com

For conference registration information and customer service

partners@oreilly.com

For more information on community discounts and trade opportunities with O’Reilly conferences

strataconf@oreilly.com

For information on exhibiting or sponsoring a conference
