Sep 9–12, 2019

In-Person Training
Large-Scale ML with MLflow, Deep Learning and Apache Spark

Amir Issaei (Databricks)
Monday, Sep 9 & Tuesday, Sep 10, 9:00am–5:00pm
Location: Market

Participants should plan to attend both days of this 2-day training course. To attend training courses, you must register for a Platinum or Training pass; training courses do not include access to tutorials on Tuesday.

The course covers the fundamentals of neural networks and how to build distributed Keras/TensorFlow models on top of Spark DataFrames. Throughout the class, you will use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models. You will also use MLflow to track experiments and manage the machine learning lifecycle. NOTE: This course is taught entirely in Python.

What you'll learn, and how you can apply it

After taking this class, students will be able to:
  • Build a neural network with Keras
  • Explain the difference between various activation functions and optimizers
  • Track experiments with MLflow
  • Apply models at scale with Deep Learning Pipelines
  • Build distributed TensorFlow models with Horovod

    This training is for you because...

    This course is aimed at the practicing data scientist who is eager to get started with deep learning, as well as software engineers and technical managers interested in a thorough, hands-on overview of deep learning and its integration with Apache Spark.


    Prerequisites:

    • Python (NumPy and pandas)
    • Working knowledge of Spark DataFrames
    • Data science experience

      Hardware and/or installation requirements:

      • A computer or laptop
      • Chrome or Firefox web browser - preferably Chrome
      • Internet access with unfettered connections to the following domains:

      Intro to Neural Networks with Keras I

      • Neural network architecture

      • Batch sizes and epochs

      • Evaluation metrics

      • Keras API
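
As a rough illustration of how batch size and epochs relate (this is not taken from the course materials), the sketch below counts gradient updates for a training run. The helper names `steps_per_epoch` and `total_updates` are hypothetical, and the arithmetic assumes the final, smaller batch of each epoch is kept rather than dropped.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of mini-batches needed to see every sample once.

    Assumption: the last partial batch is kept, so we round up.
    """
    return math.ceil(num_samples / batch_size)


def total_updates(num_samples: int, batch_size: int, epochs: int) -> int:
    """Total number of gradient updates over a full training run."""
    return steps_per_epoch(num_samples, batch_size) * epochs


if __name__ == "__main__":
    # 1,000 samples in batches of 32 for 10 epochs
    print(steps_per_epoch(1000, 32))
    print(total_updates(1000, 32, epochs=10))
```

Halving the batch size roughly doubles the number of updates per epoch, which is one reason the two settings are usually tuned together.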

      Intro to Neural Networks with Keras II

      • Activation functions

      • Data normalization

      • Optimizers

      • Custom metrics

      • Validation dataset

      • Callbacks/checkpointing


      MLflow

      • Experiment tracking

      • Record which model and hyperparameters performed best

      Convolutional Neural Networks

      • Working with image data

      • Convolutions

      • Max pooling vs. avg. pooling

      • ImageNet architectures

      • Deep Learning Pipelines: apply pre-trained models in parallel
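
To make the max pooling vs. average pooling topic concrete, here is a minimal pure-Python sketch (not from the course materials). It assumes non-overlapping square windows with stride equal to the window size; the `pool2d` and `mean` helpers and the sample `image` are illustrative only, and a real model would use a framework's pooling layers instead.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pool2d(image: Matrix, size: int, reduce: Callable[[List[float]], float]) -> Matrix:
    """Apply non-overlapping size x size pooling (stride == size)."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Flatten the current window into a list of values
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        pooled.append(out_row)
    return pooled


def mean(values: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical 4x4 single-channel "image"
image = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 2, 9, 1],
    [4, 6, 3, 3],
]

max_pooled = pool2d(image, 2, max)   # keeps the strongest activation per window
avg_pooled = pool2d(image, 2, mean)  # smooths each window to its average

if __name__ == "__main__":
    print(max_pooled)
    print(avg_pooled)
```

Both variants shrink the 4x4 input to 2x2; they differ only in the reduction applied to each window.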


      Horovod

      • Distributed Keras/TensorFlow model training

      • All-reduce technique

      • Combine Spark pre-processing with distributed neural network training
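
The idea behind all-reduce can be sketched in a few lines of plain Python (an illustration only, not the course's code and not Horovod's API). Each worker starts with its own gradient vector, and after the operation every worker holds the same element-wise average. The `allreduce_mean` function and the sample gradients are hypothetical, and the sketch models only the result, not the communication pattern a real distributed implementation uses to get there.

```python
from typing import List


def allreduce_mean(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate an averaging all-reduce across workers.

    Input: one gradient vector per worker (all the same length).
    Output: one vector per worker, each an identical copy of the
    element-wise mean of all inputs.
    """
    num_workers = len(worker_grads)
    # zip(*...) groups the i-th element from every worker together
    averaged = [sum(vals) / num_workers for vals in zip(*worker_grads)]
    # Every worker receives its own copy of the reduced result
    return [list(averaged) for _ in range(num_workers)]


# Hypothetical gradients computed by three workers on different data shards
grads = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]

synced = allreduce_mean(grads)

if __name__ == "__main__":
    print(synced)
```

Because every worker ends up with the same averaged gradient, each can apply an identical update and the model replicas stay in sync without a central parameter server.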

      About your instructor

      Amir Issaei is a data science consultant at Databricks, where he educates customers on how to leverage the company’s Unified Analytics Platform in machine learning (ML) projects. He also helps customers implement ML solutions and use advanced analytics to solve business problems. Previously, he worked in the Operations Research Department at American Airlines, where he supported the Customer Planning, Airport, and Customer Analytics Groups. He holds an MS in mathematics from the University of Waterloo and a BE in physics from the University of British Columbia.

      Conference registration

      Get the Platinum pass or the Training pass to add this course to your package.
