Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY
 

Schedule

      1A 06/07
      9:00am Hardcore Data Science Ben Lorica (O'Reilly Media), Assaf Araki (Intel), Jacob Schreiber (University of Washington), Alex Ratner (Stanford University), Madeleine Udell (Cornell University), Yunsong Guo (Pinterest), Katherine Heller (Duke University), Alan Nichol (Rasa), Gerard de Melo (Rutgers University), Tamara Broderick (MIT), Inbal Tadeski (Anodot), Daniel Kang (Stanford University), Bichen Wu (UC Berkeley), Shaked Shammah (Hebrew University)
      1A 08/10
      1A 12/14
      9:00am Unraveling data with Spark using deep learning and other algorithms from machine learning Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera)
      1:30pm Securely building deep learning models for digital health data Josh Patterson (Skymind), Vartika Singh (Cloudera), Dave Kale (Skymind), Tom Hanlon (Skymind)
      1A 18
      9:00am Deep learning for recommender systems Mo Patel (Teradata), Junxia Li (Think Big Analytics)
      1:30pm A practitioner’s guide to Hadoop security for the hybrid cloud Mark Donsky (Cloudera), Manish Ahluwalia (Nerdwallet), Andre Araujo (Cloudera), Syed Rafice (Cloudera)
      1A 21/22
      9:00am Getting started with TensorFlow Yufeng Guo (Google), Amy Unruh (Google)
      1:30pm A deep dive into deep learning with Keras Julia Lintern (Metis)
      1A 23/24
      9:00am Building big data applications on Azure Pranav Rastogi (Microsoft)
      1:30pm Natural language understanding at scale with spaCy, Spark ML, and TensorFlow David Talby (Pacific AI), Claudiu Branzan (G2 Web Services), Alex Thomas (Indeed)
      1E 07/08
      9:00am Findata Day Bradford Cross (DCVC), Robert Passarella (Protégé Partners), Jason Morton (Ascendant), Leigh Drogen (Estimize), Jessica Stauth (Quantopian), Abraham Thomas (Quandl), Alistair Croll (Solve For Interesting), Vincent-Charles Hodder (Local Logic), Priya Koul (American Express), Tanvi Singh (Credit Suisse), José Ribau (CIBC), Michael Beal (Data Capital Management), Jike Chong (Tsinghua University | Acorns)
      1E 09
      9:00am Data Case Studies Rose Winterton (Pitney Bowes), Audrey Spencer-Alvarado (Portland Trail Blazers), Amie Elcan (CenturyLink), Sean Power (Repable), Parisa Foster (Play The Future), Nick Selby (CJX, Inc. | Midlothian Police Department), Salema Rice (Allegis Group), Aneesh Karve (Quilt), Derek Ruths (CAI), Kristina Bergman (Integris Software), Natalia Adler (UNICEF HQ), Brandon O'Brien (Expedia, Inc)
      1E 10
      9:00am A deep dive into running data engineering workloads in AWS Jennifer Wu (Cloudera), Fahd Siddiqui (Cloudera), Paul George (Cloudera), Eugene Fratkin (Cloudera)
      1:30pm Machine learning in R Jared Lander (Lander Analytics)
      1E 11
      9:00am Data 101 Dan Roesch (Roesch & Associates LLC), Edd Wilder-James (Silicon Valley Data Science), Mikio Braun (Zalando SE), Javier Esplugas (DHL Supply Chain), Kevin Parent (Conduce), Jim Scott (MapR Technologies), Melanie Warrick (Google), Sarah Manning (Etsy)
      1:30pm Managing data science in the enterprise John Akred (Silicon Valley Data Science), Heather Nelson (Silicon Valley Data Science)
      1E 12/13
      9:00am Architecting a data platform John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
      1:30pm Architecting a next-generation data platform Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Mark Grover (Lyft)
      1E 14
      9:00am Building real-time data pipelines with Apache Kafka Ian Wrigley (StreamSets)
      1:30pm Modern real-time streaming architectures Karthik Ramasamy (Streamlio), Sanjeev Kulkarni (Streamlio), Avrilia Floratau (Microsoft), Ashvin Agrawal (Microsoft), Arun Kejariwal (MZ), Sijie Guo (Streamlio)
      1E 15/16
      9:00am Scaling Python data analysis Matthew Rocklin (Anaconda), Ben Zaitlen (Anaconda)
      1:30pm Building your first big data application on AWS Ryan Nienhuis (Amazon Web Services (AWS)), Radhika Ravirala (Amazon Web Services (AWS)), Allan MacInnis (Amazon Web Services), Ben Snively (Amazon Web Services (AWS))
      10:30am Morning break | Room: Break
      3:00pm Afternoon break | Room: Break
      12:30pm Lunch | Room: 3A
      8:15am Speed Networking | Room: Crystal Palace
      6:30pm Ignite | Room: 3D11
      9:00am-5:00pm (8h)
      Hardcore Data Science
      Ben Lorica (O'Reilly Media), Assaf Araki (Intel), Jacob Schreiber (University of Washington), Alex Ratner (Stanford University), Madeleine Udell (Cornell University), Yunsong Guo (Pinterest), Katherine Heller (Duke University), Alan Nichol (Rasa), Gerard de Melo (Rutgers University), Tamara Broderick (MIT), Inbal Tadeski (Anodot), Daniel Kang (Stanford University), Bichen Wu (UC Berkeley), Shaked Shammah (Hebrew University)
      A full day of hardcore data science, exploring emerging topics and new areas of study made possible by vast troves of raw data and cutting-edge architectures for analyzing and exploring information. Along the way, leading data science practitioners teach new techniques and technologies to add to your data science toolbox.
      9:00am-5:00pm (8h) Spark & beyond Text
      Spark camp: Apache Spark 2.0 for analytics and text mining with Spark ML
      Brooke Wenig (Databricks)
      Brooke Wenig introduces you to Apache Spark 2.0 core concepts with a focus on Spark's machine learning library, using text mining on real-world data as the primary end-to-end use case.
      9:00am-12:30pm (3h 30m) Machine Learning & Data Science, Spark & beyond Deep learning
      Unraveling data with Spark using deep learning and other algorithms from machine learning
      Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera)
      Vartika Singh and Jeffrey Shmain walk you through various approaches using the machine learning algorithms available in Spark ML to understand and decipher meaningful patterns in real-world data. Vartika and Jeff also demonstrate how to use open source deep learning frameworks with Spark to run classification problems on image and text datasets.
      1:30pm-5:00pm (3h 30m) Artificial Intelligence Deep learning, Healthcare
      Securely building deep learning models for digital health data
      Josh Patterson (Skymind), Vartika Singh (Cloudera), Dave Kale (Skymind), Tom Hanlon (Skymind)
      Josh Patterson, Vartika Singh, David Kale, and Tom Hanlon walk you through interactively developing and training deep neural networks to analyze digital health data using the Cloudera Workbench and Deeplearning4j (DL4J). You'll learn how to use the Workbench to rapidly explore real-world clinical data, build data-preparation pipelines, and launch training of neural networks.
      9:00am-12:30pm (3h 30m) Artificial Intelligence, Machine Learning & Data Science Deep learning, ecommerce
      Deep learning for recommender systems
      Mo Patel (Teradata), Junxia Li (Think Big Analytics)
      Junxia Li and Mo Patel demonstrate how to apply deep learning to improve consumer recommendations by training neural nets to learn categories of interest for recommendations using embeddings. You'll also learn how to achieve wide and deep learning with WALS matrix factorization—now used in production for the Google Play store.
      1:30pm-5:00pm (3h 30m) Data Engineering & Architecture, Security Cloud
      A practitioner’s guide to Hadoop security for the hybrid cloud
      Mark Donsky (Cloudera), Manish Ahluwalia (Nerdwallet), Andre Araujo (Cloudera), Syed Rafice (Cloudera)
      Mark Donsky, André Araujo, Syed Rafice, and Manish Ahluwalia walk you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.
      9:00am-12:30pm (3h 30m) Data science & advanced analytics, Machine Learning & Data Science
      Getting started with TensorFlow
      Yufeng Guo (Google), Amy Unruh (Google)
      Yufeng Guo and Amy Unruh walk you through training and deploying a machine learning system using TensorFlow, a popular open source library. Yufeng and Amy take you from a conceptual overview all the way to building complex classifiers and explain how you can apply deep learning to complex problems in science and industry.
      1:30pm-5:00pm (3h 30m) Machine Learning & Data Science Deep learning
      A deep dive into deep learning with Keras
      Julia Lintern (Metis)
      Julia Lintern offers a deep dive into deep learning with Keras, beginning with basic neural nets before exploring convolutional neural nets and recurrent neural nets. Along the way, Julia explains both the design theory behind and the Keras implementations of today's most widely used deep learning algorithms.
      9:00am-12:30pm (3h 30m) Big data and the Cloud, Data Engineering & Architecture Cloud
      Building big data applications on Azure
      Pranav Rastogi (Microsoft)
      As big data solutions are rapidly moving to the cloud, it's becoming increasingly important to know how to use Apache Hadoop, Spark, R Server, and other open source technologies in the cloud. Pranav Rastogi walks you through building big data applications on Azure HDInsight and other Azure services.
      1:30pm-5:00pm (3h 30m) Data science & advanced analytics, Machine Learning & Data Science Deep learning, Pydata, Text
      Natural language understanding at scale with spaCy, Spark ML, and TensorFlow
      David Talby (Pacific AI), Claudiu Branzan (G2 Web Services), Alex Thomas (Indeed)
      Natural language processing is a key component in many data science systems that must understand or reason about text. David Talby, Claudiu Branzan, and Alex Thomas lead a hands-on tutorial on scalable NLP using spaCy for building annotation pipelines, TensorFlow for training custom machine-learned annotators, and Spark ML and TensorFlow for using deep learning to build and apply word embeddings.
      9:00am-5:00pm (8h)
      Findata Day
      Bradford Cross (DCVC), Robert Passarella (Protégé Partners), Jason Morton (Ascendant), Leigh Drogen (Estimize), Jessica Stauth (Quantopian), Abraham Thomas (Quandl), Alistair Croll (Solve For Interesting), Vincent-Charles Hodder (Local Logic), Priya Koul (American Express), Tanvi Singh (Credit Suisse), José Ribau (CIBC), Michael Beal (Data Capital Management), Jike Chong (Tsinghua University | Acorns)
      Finance is information. From analyzing risk and detecting fraud to predicting payments and improving customer experience, data technologies are transforming the financial industry. And we're diving deep into this change with a new day of data-meets-finance talks, tailored for Strata Data Conference events in the world's financial hubs.
      9:00am-5:00pm (8h)
      Data Case Studies
      Rose Winterton (Pitney Bowes), Audrey Spencer-Alvarado (Portland Trail Blazers), Amie Elcan (CenturyLink), Sean Power (Repable), Parisa Foster (Play The Future), Nick Selby (CJX, Inc. | Midlothian Police Department), Salema Rice (Allegis Group), Aneesh Karve (Quilt), Derek Ruths (CAI), Kristina Bergman (Integris Software), Natalia Adler (UNICEF HQ), Brandon O'Brien (Expedia, Inc)
      In a series of 12 half-hour talks aimed at a business audience, you’ll hear data-themed case studies from household brands and global companies, explaining the challenges they wanted to tackle, the approaches they took, and the benefits—and drawbacks—of their solutions. If you want practical insights about applied data, look no further.
      9:00am-12:30pm (3h 30m) Big data and the Cloud, Data Engineering & Architecture Architecture, Cloud
      A deep dive into running data engineering workloads in AWS
      Jennifer Wu (Cloudera), Fahd Siddiqui (Cloudera), Paul George (Cloudera), Eugene Fratkin (Cloudera)
      Jennifer Wu, Paul George, Fahd Siddiqui, and Eugene Fratkin lead a deep dive into running data engineering workloads in a managed service capacity in the public cloud. Along the way, they share AWS infrastructure best practices and explain how data engineering workloads interoperate with data analytic workloads.
      1:30pm-5:00pm (3h 30m) Data science & advanced analytics, Machine Learning & Data Science R
      Machine learning in R
      Jared Lander (Lander Analytics)
      Modern statistics has become almost synonymous with machine learning—a collection of techniques that utilize today's incredible computing power. Jared Lander walks you through the available methods for implementing machine learning algorithms in R and explores underlying theories such as the elastic net, boosted trees, and cross-validation.
      9:00am-12:30pm (3h 30m)
      Data 101
      Dan Roesch (Roesch & Associates LLC), Edd Wilder-James (Silicon Valley Data Science), Mikio Braun (Zalando SE), Javier Esplugas (DHL Supply Chain), Kevin Parent (Conduce), Jim Scott (MapR Technologies), Melanie Warrick (Google), Sarah Manning (Etsy)
      Data 101 introduces you to core principles of data architecture, teaches you how to build and manage successful data teams, and inspires you to do more with your data through real-world applications. Setting the foundation for deeper dives on the following days of Strata Data Conference, Data 101 reinforces data fundamentals and helps you focus on how data can solve your business problems.
      1:30pm-5:00pm (3h 30m) Data-driven business management, Strata Business Summit
      Managing data science in the enterprise
      John Akred (Silicon Valley Data Science), Heather Nelson (Silicon Valley Data Science)
      John Akred and Heather Nelson share methods and observations from three years of effectively deploying data science in enterprise organizations. You'll learn how to build, run, and get the most value from data science teams and how to work with and plan for the needs of the business.
      9:00am-12:30pm (3h 30m) Data Engineering & Architecture, Spark & beyond Architecture
      Architecting a data platform
      John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
      What are the essential components of a data platform? John Akred and Stephen O'Sullivan explain how the various parts of the Hadoop, Spark, and big data ecosystems fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.
      1:30pm-5:00pm (3h 30m) Data Engineering & Architecture, Hadoop platform & applications Architecture
      Architecting a next-generation data platform
      Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Mark Grover (Lyft)
      Using Customer 360 and the IoT as examples, Jonathan Seidman, Mark Grover, and Gwen Shapira explain how to architect a modern, real-time big data platform leveraging recent advancements in the open source software world, using components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL with Hadoop to enable new forms of data processing and analytics.
      9:00am-12:30pm (3h 30m) Data Engineering & Architecture, Stream processing and analytics Streaming
      Building real-time data pipelines with Apache Kafka
      Ian Wrigley (StreamSets)
      Ian Wrigley demonstrates how Kafka Connect and Kafka Streams can be used together to build real-world, real-time streaming data pipelines. Using Kafka Connect, you'll ingest data from a relational database into Kafka topics as the data is being generated and then process and enrich the data in real time using Kafka Streams before writing it out for further analysis.
      1:30pm-5:00pm (3h 30m) Data Engineering & Architecture, Stream processing and analytics Architecture, Streaming
      Modern real-time streaming architectures
      Karthik Ramasamy (Streamlio), Sanjeev Kulkarni (Streamlio), Avrilia Floratau (Microsoft), Ashvin Agrawal (Microsoft), Arun Kejariwal (MZ), Sijie Guo (Streamlio)
      Karthik Ramasamy, Sanjeev Kulkarni, Avrilia Floratau, Ashvin Agrawal, Arun Kejariwal, and Sijie Guo walk you through state-of-the-art streaming systems, algorithms, and deployment architectures, covering the typical challenges in modern real-time big data platforms and offering insights on how to address them.
      9:00am-12:30pm (3h 30m) Data science & advanced analytics
      Scaling Python data analysis
      Matthew Rocklin (Anaconda), Ben Zaitlen (Anaconda)
      The Python data science stack, which includes NumPy, pandas, and scikit-learn, is efficient and intuitive but is largely limited to in-memory data and a single core. Matthew Rocklin and Ben Zaitlen demonstrate how to parallelize and scale your Python workloads to multicore machines and multimachine clusters.
      1:30pm-5:00pm (3h 30m) Big data and the Cloud, Data Engineering & Architecture Architecture, Cloud
      Building your first big data application on AWS
      Ryan Nienhuis (Amazon Web Services (AWS)), Radhika Ravirala (Amazon Web Services (AWS)), Allan MacInnis (Amazon Web Services), Ben Snively (Amazon Web Services (AWS))
      Want to learn how to use Amazon's big data web services to launch your first big data application on the cloud? Ryan Nienhuis, Radhika Ravirala, Allan MacInnis, and Ben Snively walk you through building a big data application using a combination of open source technologies and AWS managed services.
      10:30am-11:00am (30m)
      Break: Morning break
      3:00pm-3:30pm (30m)
      Break: Afternoon break
      12:30pm-1:30pm (1h)
      Break: Lunch
      5:00pm-6:30pm (1h 30m)
      Opening Reception (Sponsored by Datameer and Lenovo)
      Grab a drink and mingle with fellow Strata Data Conference attendees while you check out all of the exhibitors in the Expo Hall.
      8:15am-8:45am (30m)
      Speed Networking
      Gather before Tutorials on Tuesday morning for a speed networking event. Enjoy casual conversation while meeting fellow attendees.
      6:30pm-8:00pm (1h 30m)
      Ignite
      Ignite is happening at Strata on Tuesday, September 26. Join us for a fun, high-energy evening of five-minute talks—all aspiring to live up to the Ignite motto: Enlighten us, but make it quick.