Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Unraveling data with Spark using deep learning and other algorithms from machine learning

Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera)
9:00am–12:30pm Tuesday, September 26, 2017
Machine Learning & Data Science, Spark & beyond
Location: 1A 12/14
Level: Intermediate
Secondary topics: Deep learning
Average rating: 2.50 (6 ratings)

Who is this presentation for?

  • Data scientists and analysts, programmers, and software engineers

Prerequisite knowledge

  • Experience with machine learning, Scala, Java, and Python

Materials or downloads needed in advance

  • A laptop with Java, Scala, and Spark installed and configured
  • A GitHub account

What you'll learn

  • Learn approaches for applying machine learning and deep learning algorithms that leverage Spark and other open source libraries

Description

Data analysis has come a long way in dealing with both the size and the complexity of data. Vartika Singh and Jeffrey Shmain walk you through various approaches to unraveling the underlying patterns in data using Spark, machine learning, and related open source libraries. Along the way, Vartika and Jeff discuss common issues encountered as data and model sizes grow and demonstrate how to solve analytical problems using the deep learning frameworks Caffe and TensorFlow on a Spark cluster.

Topics include:

  • Clustering
  • Classification
  • Deep learning

Vartika Singh

Cloudera

Vartika Singh is a solutions architect at Cloudera with over 10 years of experience applying machine learning techniques to big data problems.

Jeffrey Shmain

Cloudera

Jeff Shmain is a principal solutions architect at Cloudera. He has 16+ years of financial industry experience with a strong understanding of security trading, risk, and regulations. Over the last few years, Jeff has worked on various use-case implementations at 8 of the world’s 10 largest investment banks.

Comments

Mohammed Ayub | DATA SCIENTIST
09/26/2017 5:28am EDT

Thanks, Jeffrey!

Jeffrey Shmain | SOLUTIONS ARCHITECT
09/25/2017 7:41pm EDT

This tutorial will mostly be done through Cloudera Data Science Workbench, so minimal setup is required to run the examples. However, all of the code and examples are on GitHub and could potentially be run in third-party tools.

Mohammed Ayub | DATA SCIENTIST
09/25/2017 7:30pm EDT

I have the sparkmagic kernel installed for Jupyter Notebook from here: https://github.com/jupyter-incubator/sparkmagic
Will this work for the tutorial?

jagadishwar mekala | SR BIGDATA SOLUTIONS ENGINEER
08/10/2017 6:10am EDT

How much hands-on programming is there in this training?