Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY
 

Schedule
      1A 06/07
      9:00am Hardcore Data Science Ben Lorica (O'Reilly Media), Assaf Araki (Intel), Jacob Schreiber (University of Washington), Alex Ratner (Stanford University), Madeleine Udell (Cornell University), Yunsong Guo (Pinterest), Katherine Heller (Duke University), Amitai Armon (Intel), Yahav Shadmi (Intel), Gerard de Melo (Rutgers University), Tamara Broderick (MIT), Inbal Tadeski (Anodot), Daniel Kang (Stanford University), Bichen Wu (UC Berkeley), Shaked Shammah (Hebrew University)
      1A 08/10
      1A 12/14
      1:30pm Securely Building Deep Learning Models for Digital Health Data Josh Patterson (Skymind), Vartika Singh (Cloudera), David Kale (University of Southern California), Tom Hanlon (Skymind)
      1A 18
      9:00am Deep Learning for Recommender Systems Ron Bodkin (Teradata), Mo Patel (Teradata)
      1:30pm Machine Learning in R Jared Lander (Lander Analytics)
      1A 21/22
      9:00am Getting started with TensorFlow Yufeng Guo (Google), Amy Unruh (Google)
      1:30pm Take a Deep-Learning Dive via Keras Julia Lintern (Metis)
      1A 23/24
      9:00am Scaling Python Data Analysis Matthew Rocklin (Continuum Analytics), Ben Zaitlen (Continuum Analytics)
      1:30pm Natural language understanding at scale with spaCy, Spark ML & TensorFlow David Talby (Atigeo), Claudiu Branzan (G2 Web Services), Alex Thomas (Indeed)
      1E 07/08
      9:00am Findata Day Bradford Cross (DCVC), Stuart Lacey (Trunomi), Jason Morton (Corvil), Leigh Drogen (Estimize), Jessica Stauth (Quantopian), Abraham Thomas (Quandl), Alistair Croll (Solve For Interesting), Robert Passarella (Protege Partners), Vincent-Charles Hodder (www.locallogic.co), Sastry Durvasula (American Express), Priya Koul (American Express), Tanvi Singh (Credit Suisse), José Ribau (CIBC)
      1E 09
      9:00am Data Case Studies Rose Winterton (Pitney Bowes), Audrey Spencer-Alvarado (Portland Trail Blazers), Amie Elcan (CenturyLink), Sean Power (Repable), Parisa Foster (Play The Future), Nick Selby (CJX, Inc. | Midlothian Police Department), Salema Rice (Allegis Group)
      1E 10/11
      9:00am A Deep Dive into Running Data Engineering Workloads in AWS Jennifer Wu (Cloudera), Andrei Savu (Cloudera), Vinithra Varadharajan (Cloudera), Eugene Fratkin (Cloudera)
      1:30pm A practitioner’s guide to Hadoop security for the hybrid cloud Mark Donsky (Cloudera), Manish Ahluwalia (Cloudera), Andre Araujo (Cloudera), Syed Rafice (Cloudera)
      1E 12/13
      9:00am Architecting A Data Platform John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
      1:30pm Architecting a next generation data platform Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Ted Malaska (Blizzard Entertainment), Mark Grover (Cloudera)
      1E 14
      1:30pm Modern Real Time Streaming Architectures Karthik Ramasamy (Streamlio), Sanjeev Kulkarni (Streamlio), Avrilia Floratau (Microsoft), Ashvin Agrawal (Microsoft), Arun Kejariwal (Machine Zone), Sijie Guo (Streamlio)
      1E 15/16
      9:00am Building big data applications on Azure Pranav Rastogi (Microsoft)
      1:30pm Building your first big data application on AWS Ryan Nienhuis (Amazon Web Services (AWS)), Radhika Ravirala (Amazon Web Services (AWS)), Dario Rivera (Amazon Web Services (AWS))
      1E 06
      9:00am Data 101 Shannon Cutt (O'Reilly Media), Edd Wilder-James (Silicon Valley Data Science), Jim Scott (MapR Technologies), Julie Rodriguez (BNY - Eagle), Melanie Warrick (Google)
      1:30pm Managing data science in the enterprise John Akred (Silicon Valley Data Science)
      10:30am Morning break | Room: Break
      3:00pm Afternoon break | Room: Break
      12:30pm Lunch | Room: Lunch
      5:00pm Opening Reception | Room: Expo Hall
      8:15am Speed Networking | Room: TBD
      9:00am-5:00pm (8h)
      Hardcore Data Science
      Ben Lorica (O'Reilly Media), Assaf Araki (Intel), Jacob Schreiber (University of Washington), Alex Ratner (Stanford University), Madeleine Udell (Cornell University), Yunsong Guo (Pinterest), Katherine Heller (Duke University), Amitai Armon (Intel), Yahav Shadmi (Intel), Gerard de Melo (Rutgers University), Tamara Broderick (MIT), Inbal Tadeski (Anodot), Daniel Kang (Stanford University), Bichen Wu (UC Berkeley), Shaked Shammah (Hebrew University)
      A full day of hardcore data science, exploring emerging topics and new areas of study made possible by vast troves of raw data and cutting-edge architectures for analyzing and exploring information. Along the way, leading data science practitioners teach new techniques and technologies to add to your data science toolbox.
      9:00am-5:00pm (8h) Spark & beyond Text
      Spark camp: Apache Spark 2.0 for analytics and text mining with Spark ML
      This one-day hands-on class introduces you to Apache Spark 2.0 core concepts with a focus on Spark's machine learning library, using text mining on real-world data as the primary end-to-end use case.
      9:00am-12:30pm (3h 30m) Machine Learning, Spark & beyond Deep learning
      Unravelling data at scale with Spark using deep learning and other algorithms from machine learning.
      Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera)
      We walk you through the machine-learning algorithms available in Spark ML for finding and deciphering meaningful patterns in real-world data. Along with discussing the common problems encountered as data and model sizes scale, we will also use a few open source deep learning frameworks to run classification problems on image and text data sets with Spark.
      1:30pm-5:00pm (3h 30m) Artificial Intelligence Deep learning, Healthcare
      Securely Building Deep Learning Models for Digital Health Data
      Josh Patterson (Skymind), Vartika Singh (Cloudera), David Kale (University of Southern California), Tom Hanlon (Skymind)
      In this hands-on tutorial, we will teach attendees how to interactively develop and train deep neural networks to analyze digital health data using the Cloudera Workbench and DeepLearning4J (DL4J). Attendees will learn how to use the Workbench to rapidly explore real world clinical data, build data preparation pipelines, and launch training of neural networks.
      9:00am-12:30pm (3h 30m) Artificial Intelligence, Machine Learning Deep learning, ecommerce
      Deep Learning for Recommender Systems
      Ron Bodkin (Teradata), Mo Patel (Teradata)
      Learn to apply Deep Learning to improve consumer recommendations. We train neural nets to learn categories of interest for recommendations (e.g., for cold start) using embeddings. Learn how to extend this with WALS Matrix Factorization to achieve Wide & Deep Learning, which is now used in production for the Google Play store. Learn with TensorFlow on our cloud GPU (or bring your own GPU laptop).
      1:30pm-5:00pm (3h 30m) Data science & advanced analytics, Machine Learning R
      Machine Learning in R
      Jared Lander (Lander Analytics)
      Modern statistics has become almost synonymous with machine learning, a collection of techniques that utilize today's incredible computing power. This course focuses on the available methods for implementing machine learning algorithms in R and examines some of the underlying theory, covering the Elastic Net, Boosted Trees and cross-validation.
      9:00am-12:30pm (3h 30m) Data science & advanced analytics, Machine Learning
      Getting started with TensorFlow
      Yufeng Guo (Google), Amy Unruh (Google)
      We will walk you through training and deploying a machine-learning system using TensorFlow, a popular open source library. Starting from conceptual overviews, we will build all the way up to complex classifiers. You’ll gain insight into deep learning and how it can apply to complex problems in science and industry.
      1:30pm-5:00pm (3h 30m) Machine Learning Deep learning
      Take a Deep-Learning Dive via Keras
      Julia Lintern (Metis)
      Beginning with basic neural nets and then winding our way through to convolutional neural nets and recurrent neural nets, I will explain both the design theory and the Keras implementation of today's most widely used deep-learning algorithms. As a class, we will work through these deep-learning architectures as well as the corresponding Keras code.
      9:00am-12:30pm (3h 30m) Data science & advanced analytics
      Scaling Python Data Analysis
      Matthew Rocklin (Continuum Analytics), Ben Zaitlen (Continuum Analytics)
      The Python Data science stack (NumPy, Pandas, Scikit-Learn) is efficient and intuitive but only for in-memory data and a single core. This tutorial teaches you to parallelize and scale your Python workloads to multi-core machines and multi-machine clusters. We use a variety of tools. This comparative approach encourages us to think broadly about parallel tools and programming paradigms.
      1:30pm-5:00pm (3h 30m) Data science & advanced analytics, Machine Learning Deep learning, Pydata, Text
      Natural language understanding at scale with spaCy, Spark ML & TensorFlow
      David Talby (Atigeo), Claudiu Branzan (G2 Web Services), Alex Thomas (Indeed)
      Natural language processing is a key component in many data science systems that must understand or reason about text. This is a hands-on tutorial for scalable NLP using spaCy for building annotation pipelines, TensorFlow for training custom machine learned annotators, and Spark ML & TensorFlow for using deep learning to build & apply word embeddings.
      9:00am-5:00pm (8h)
      Findata Day
      Bradford Cross (DCVC), Stuart Lacey (Trunomi), Jason Morton (Corvil), Leigh Drogen (Estimize), Jessica Stauth (Quantopian), Abraham Thomas (Quandl), Alistair Croll (Solve For Interesting), Robert Passarella (Protege Partners), Vincent-Charles Hodder (www.locallogic.co), Sastry Durvasula (American Express), Priya Koul (American Express), Tanvi Singh (Credit Suisse), José Ribau (CIBC)
      Finance is information. From analyzing risk and detecting fraud to predicting payments and improving customer experience, data technologies are transforming the financial industry. And we're diving deep into this change with a new day of data-meets-finance talks, tailored for Strata Data Conference events in the world's financial hubs.
      9:00am-5:00pm (8h)
      Data Case Studies
      Rose Winterton (Pitney Bowes), Audrey Spencer-Alvarado (Portland Trail Blazers), Amie Elcan (CenturyLink), Sean Power (Repable), Parisa Foster (Play The Future), Nick Selby (CJX, Inc. | Midlothian Police Department), Salema Rice (Allegis Group)
      In a series of 12 half-hour talks aimed at a business audience, you’ll hear data-themed case studies from household brands and global companies, explaining the challenges they wanted to tackle, the approaches they took, and the benefits—and drawbacks—of their solutions. If you want practical insights about applied data, look no further.
      9:00am-12:30pm (3h 30m) Big data and the Cloud Architecture, Cloud
      A Deep Dive into Running Data Engineering Workloads in AWS
      Jennifer Wu (Cloudera), Andrei Savu (Cloudera), Vinithra Varadharajan (Cloudera), Eugene Fratkin (Cloudera)
      Data engineering workloads are foundational workloads run prior to most analytic and operational database use cases. This hands-on tutorial will provide a deep dive into running data engineering workloads in a managed service capacity in the public cloud; highlight AWS infrastructure best practices; and discuss how data engineering workloads interoperate with data analytic workloads.
      1:30pm-5:00pm (3h 30m) Security Cloud
      A practitioner’s guide to Hadoop security for the hybrid cloud
      Mark Donsky (Cloudera), Manish Ahluwalia (Cloudera), Andre Araujo (Cloudera), Syed Rafice (Cloudera)
      You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.
      9:00am-12:30pm (3h 30m) Spark & beyond Architecture
      Architecting A Data Platform
      John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
      What are the essential components of a data platform? This tutorial will explain how the various parts of the Hadoop, Spark and big data ecosystems fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.
      1:30pm-5:00pm (3h 30m) Hadoop platform and applications Architecture
      Architecting a next generation data platform
      Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Ted Malaska (Blizzard Entertainment), Mark Grover (Cloudera)
      Using the Internet of Things and Customer 360 as an example, we’ll explain how to architect a modern, real-time big data platform leveraging recent advancements in open-source software. We’ll show how components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL along with Apache Hadoop can enable new forms of data processing and analytics.
      9:00am-12:30pm (3h 30m) Stream processing and analytics Streaming
      Building Real-Time Data Pipelines with Apache Kafka
      Ian Wrigley (Confluent)
      This hands-on workshop is designed for people interested in using Apache Kafka to build real-time streaming data pipelines. By the end of the tutorial, attendees will have seen how Kafka Connect and the Kafka Streams API can be used to ingest and process data in real time, as it is being generated. We assume no prior knowledge of Kafka. The tutorial includes hands-on exercises.
      1:30pm-5:00pm (3h 30m) Stream processing and analytics Architecture, Streaming
      Modern Real Time Streaming Architectures
      Karthik Ramasamy (Streamlio), Sanjeev Kulkarni (Streamlio), Avrilia Floratau (Microsoft), Ashvin Agrawal (Microsoft), Arun Kejariwal (Machine Zone), Sijie Guo (Streamlio)
      Across diverse segments in industry, there has been a shift in focus from Big Data to Fast Data. This, in part, stems from the deluge of high velocity data streams and, more importantly, the need for instant data-driven insights. In this tutorial, we walk the audience through the state-of-the-art streaming systems, algorithms and deployment architectures.
      9:00am-12:30pm (3h 30m) Big data and the Cloud Cloud
      Building big data applications on Azure
      Pranav Rastogi (Microsoft)
      As big data solutions are rapidly moving to the cloud, it's becoming increasingly important to know how to use Apache Hadoop, Spark, R Server, and other open source technologies in the cloud. Pranav Rastogi walks you through building big data applications on Azure HDInsight and other Azure services.
      1:30pm-5:00pm (3h 30m) Big data and the Cloud Architecture, Cloud
      Building your first big data application on AWS
      Ryan Nienhuis (Amazon Web Services (AWS)), Radhika Ravirala (Amazon Web Services (AWS)), Dario Rivera (Amazon Web Services (AWS))
      Want to get ramped up on how to use Amazon's big data web services and launch your first big data application on the cloud? Join us for this hands-on workshop as we build a big data application. We will use a combination of open source technologies such as Apache Spark and Zeppelin, as well as AWS managed services such as Amazon EMR, Amazon Kinesis, and more. Get best practices & design patterns
      9:00am-12:30pm (3h 30m)
      Data 101
      Shannon Cutt (O'Reilly Media), Edd Wilder-James (Silicon Valley Data Science), Jim Scott (MapR Technologies), Julie Rodriguez (BNY - Eagle), Melanie Warrick (Google)
      Data 101 introduces you to core principles of data architecture, teaches you how to build and manage successful data teams, and inspires you to do more with your data through real-world applications. Setting the foundation for deeper dives on the following days of Strata + Hadoop World, Data 101 reinforces data fundamentals and helps you focus on how data can solve your business problems.
      1:30pm-5:00pm (3h 30m) Data-driven business management, Strata Business Summit
      Managing data science in the enterprise
      John Akred (Silicon Valley Data Science)
      In this tutorial, we will share our methods and observations from three years of effectively deploying data science in enterprise organizations. Attendees will learn how to build, run, and get the most value from data science teams, and how to work with and plan for the needs of the business.
      10:30am-11:00am (30m)
      Break: Morning break
      3:00pm-3:30pm (30m)
      Break: Afternoon break
      12:30pm-1:30pm (1h)
      Break: Lunch
      5:00pm-6:30pm (1h 30m)
      Opening Reception
      Grab a drink and mingle with fellow Strata Data Conference attendees while you check out all of the exhibitors in the Expo Hall.
      8:15am-8:45am (30m) Event
      Speed Networking
      Gather before Tutorials on Tuesday morning for a speed networking event. Enjoy casual conversation while meeting fellow attendees.