Tutorials
On Tuesday, March 6, choose from all-day and half-day tutorials. These expert-led presentations give you a chance to dive deep into the subject matter. Please note: to attend, you must register for a Gold or Silver pass, which does not include access to training courses.
9:00am–12:30pm Tuesday, March 6, 2018
Burcu Baran, Wei Di, Michael Li, and Chi-Yi Kuan walk you through the big data analytics and data science lifecycle and share their experience and lessons learned leveraging advanced analytics and machine learning techniques such as predictive modeling to drive and grow business at LinkedIn.
9:00am–12:30pm Tuesday, March 6, 2018
Aishwarya Venkataraman, Jason Wang, Mala Ramakrishnan, Stefan Salandy, and Vinithra Varadharajan lead a deep dive into running data analytics workloads as a managed service in the public cloud and highlight cloud infrastructure best practices.
9:00am–12:30pm Tuesday, March 6, 2018
R and Python top the list of languages used in data science and machine learning, and data scientists and engineers fluent in one of these languages are increasingly marketable. Come learn how to build and operationalize machine learning models using distributed functions and do scalable, end-to-end data science in R and Python on single machines, Spark clusters, and cloud-based infrastructure.
9:00am–12:30pm Tuesday, March 6, 2018
Secondary topics: Graphs and Time-series
Across diverse segments in industry, there has been a shift in focus from big data to fast data. Karthik Ramasamy, Sanjeev Kulkarni, Arun Kejariwal, and Sijie Guo walk you through state-of-the-art streaming architectures, streaming frameworks, and streaming algorithms, covering the typical challenges in modern real-time big data platforms and offering insights on how to address them.
9:00am–12:30pm Tuesday, March 6, 2018
Tim Berglund leads a basic architectural introduction to Kafka and walks you through using Kafka Streams and KSQL to process streaming data.
9:00am–12:30pm Tuesday, March 6, 2018
Secondary topics: Graphs and Time-series
Since its arrival in early 2017, PyTorch has won over many deep learning researchers and developers due to its dynamic computation framework. Mo Patel and Neejole Patel walk you through using PyTorch to build a content recommendation model.
9:00am–12:30pm Tuesday, March 6, 2018
Want to learn how to use Amazon's big data web services to launch your first big data application in the cloud? Jorge Lopez walks you through building a big data application using a combination of open source technologies and AWS managed services.
9:00am–12:30pm Tuesday, March 6, 2018
Martin Görner walks you through training and deploying a machine learning system using the popular open source library TensorFlow. Martin takes you from a conceptual overview all the way to building complex classifiers and explains how you can apply deep learning to complex problems in science and industry.
9:00am–5:00pm Tuesday, March 6, 2018
Join Joseph Kambourakis for an introduction to Apache Spark 2.0 core concepts with a focus on Spark's machine learning library, using text mining on real-world data as the primary end-to-end use case.
9:00am–12:30pm Tuesday, March 6, 2018
New regulations are driving compliance, governance, and security challenges for big data, and infosec and security groups must ensure a consistently secured and governed environment across multiple workloads that span a variety of deployments. Mark Donsky, Andre Araujo, Syed Rafice, and Mubashir Kazia walk you through securing a Hadoop cluster, with special attention to GDPR.
9:00am–5:00pm Tuesday, March 6, 2018
David Boyle (Audience Strategies),
Violeta Hennessey (Warner Bros.),
April Chen (Civis Analytics),
Sridhar Alla (BlueWhale),
Noah Gift (UC Davis),
Blake Irvine (Netflix),
Kevin Lyons (Nielsen Marketing Cloud),
Jennifer Webb (SuprFanz),
Rizwan Patel (Caesars Entertainment),
Anthony Accardo (Disney),
Amanda Gerdes (Blizzard Entertainment),
Aneesh Karve (Quilt),
Pete Skomoroch (Workday)
Hear from innovators in ad tech, measurement, automation, and audience engagement about where the media industry is today—and where it's likely to go next.
9:00am–5:00pm Tuesday, March 6, 2018
Madhav Madaboosi (BP),
Meenakshisundaram Thandavarayan (Infosys),
Matt Conners (Microsoft),
Katie Malone (Civis Analytics),
Mike Prorock (mesur.io),
Thomas Miller (Northwestern University),
Ann Nguyen (Whole Whale),
Jennie Shin (Kaiser Permanente),
Valentin Bercovici (PencilDATA),
Wayde Fleener (General Mills),
Joe Dumoulin (Next IT),
Jules Malin (GoPro),
Taylor Martin (O'Reilly Media),
Divya Ramachandran (Captricity)
Hear practical insights from household brands and global companies: the challenges they tackled, approaches they took, and the benefits—and drawbacks—of their solutions.
1:30pm–5:00pm Tuesday, March 6, 2018
Controlled experiments such as A/B tests have revolutionized the way software is being developed, allowing real users to objectively evaluate new ideas. Ronny Kohavi, Alex Deng, Somit Gupta, and Paul Raff lead an introduction to A/B testing and share lessons learned from one of the largest A/B testing platforms on the planet, running at Microsoft, which executes over 10K experiments a year.
1:30pm–5:00pm Tuesday, March 6, 2018
Secondary topics: Graphs and Time-series
If you have data that has a time factor to it, then you need to think in terms of time series datasets. Ted Malaska explores time series in all of its forms, from tumbling windows to sessionization in batch or in streaming. You'll gain exposure to the tools and background you need to be successful in the world of time-oriented data.
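For readers new to the two terms named above, here is a minimal sketch in plain Python, not taken from the tutorial itself. A tumbling window assigns each event to a fixed-width, non-overlapping time bucket; sessionization groups events that arrive close together and starts a new group after an idle gap. The function names, the sample events, and the `width` and `gap` values are all illustrative assumptions.

```python
from collections import defaultdict


def tumbling_windows(events, width):
    """Group (timestamp, value) pairs into fixed, non-overlapping windows.

    Each window is keyed by its start time; `width` is an assumed window size.
    """
    windows = defaultdict(list)
    for ts, value in events:
        start = (ts // width) * width  # floor the timestamp to its window start
        windows[start].append(value)
    return dict(sorted(windows.items()))


def sessionize(timestamps, gap):
    """Split timestamps into sessions.

    A new session starts whenever the idle time since the previous
    event exceeds `gap` (an assumed inactivity threshold).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)  # close enough: extend the current session
        else:
            sessions.append([ts])  # idle too long: open a new session
    return sessions


# Hypothetical sample data: (timestamp in seconds, event label).
events = [(1, "a"), (4, "b"), (12, "c"), (13, "d"), (31, "e")]

by_window = tumbling_windows(events, width=10)
sessions = sessionize([ts for ts, _ in events], gap=5)
```

The same events produce different groupings under each scheme: windows are aligned to the clock, while sessions are aligned to activity. This batch version assumes all events are available up front; a streaming version would also need to handle late and out-of-order events.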
1:30pm–5:00pm Tuesday, March 6, 2018
Natural language processing is a key component in many data science systems. David Talby, Claudiu Branzan, and Alex Thomas lead a hands-on tutorial on scalable NLP, using spaCy for building annotation pipelines, Spark NLP for building distributed natural language machine-learned pipelines, and Spark ML and TensorFlow for using deep learning to build and apply word embeddings.
1:30pm–5:00pm Tuesday, March 6, 2018
Abhishek Kumar and Vijay Srinivas Agneeswaran offer an introduction to deep learning-based recommendation and learning-to-rank systems using TensorFlow. You'll learn how to use deep learning to build a recommender system based on intent prediction, drawing on a real-world implementation for an ecommerce client.
1:30pm–5:00pm Tuesday, March 6, 2018
Join Dean Wampler and Boris Lublinsky to learn how to build two microservice streaming applications based on Kafka using Akka Streams and Kafka Streams for data processing. You'll explore the strengths and weaknesses of each tool for particular design needs and contrast them with Spark Streaming and Flink, so you'll know when to choose them instead.
1:30pm–5:00pm Tuesday, March 6, 2018
Python lets you solve data science problems by stitching together packages from its ecosystem, but it can be difficult to choose packages that work well together. James Bednar and Philipp Rudiger walk you through a concise, fast, easily customizable, and fully reproducible recipe for interactive visualization of millions or billions of datapoints—all in just 30 lines of Python code.
1:30pm–5:00pm Tuesday, March 6, 2018
Apache Impala (incubating) is an exceptional, best-of-breed massively parallel processing SQL query engine that is a fundamental component of the big data software stack. Juan Yu demystifies the cost model the Impala planner uses and how Impala optimizes queries, and explains how to identify performance bottlenecks through the query plan and profile and how to drive Impala to its full potential.
1:30pm–5:00pm Tuesday, March 6, 2018
The honeymoon era of data science is ending, and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders deliver measurable impact on an increasing share of an enterprise's KPIs. Nick Elprin details how leading organizations have taken a holistic approach to people, process, and technology to build a sustainable competitive advantage.
1:30pm–5:00pm Tuesday, March 6, 2018
TensorFlow and Keras are popular libraries for machine learning because of their support for deep learning and GPU deployment. Join Ron Bodkin and Brian Foo to learn how to execute these libraries in production with vision and recommendation models and how to export, package, deploy, optimize, serve, monitor, and test models using Docker and TensorFlow Serving in Kubernetes.