Sep 9–12, 2019

Large-scale ML with MLflow, deep learning, and Apache Spark

Amir Issaei (Databricks)
Monday, Sep 9 & Tuesday, Sep 10, 9:00am - 5:00pm
Location: Market

Participants should plan to attend both days of this 2-day training course. To attend training courses, you must register for a Platinum or Training pass; the training course does not include access to tutorials on Tuesday.

Amir Issaei details the fundamentals of neural networks and how to build distributed Keras and TensorFlow models on top of Spark DataFrames. You'll use Keras, TensorFlow, deep learning pipelines, and Horovod to build and tune models. You'll also use MLflow to track experiments and manage the machine learning lifecycle. This course is taught entirely in Python.

What you'll learn, and how you can apply it

  • Learn to build a neural network with Keras and distributed TensorFlow models with Horovod
  • Understand the differences between various activation functions and optimizers
  • Learn to track experiments with MLflow and apply models at scale with deep learning pipelines

    This training is for you because...

    • You're a practicing data scientist who is eager to get started with deep learning.
    • You're a software engineer or technical manager interested in a thorough, hands-on overview of deep learning and its integration with Apache Spark.

    Prerequisites:

    • Experience with Python (NumPy and pandas) and data science
    • A working knowledge of Spark DataFrames

      Hardware and/or installation requirements:

      • A WiFi-enabled laptop with the Chrome (preferred) or Firefox web browser installed
      • Access to Databricks.com, Keras.io, and Spark.apache.org

      Outline

      Intro to neural networks with Keras I

      • Neural network architecture
      • Batch sizes and epochs
      • Evaluation metrics
      • Keras API
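
To illustrate how batch size and epochs interact, here is a minimal sketch in plain Python. It is not taken from the course material; the function name `updates_per_run` and the sample numbers are hypothetical. It assumes the last, smaller batch of each epoch is still used for one update.

```python
import math


def updates_per_run(num_samples: int, batch_size: int, epochs: int) -> int:
    """Count gradient updates for a full training run (hypothetical helper)."""
    # One update per batch; a final partial batch still counts as a step.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    # Each epoch is one full pass over the data.
    return steps_per_epoch * epochs


# Illustrative numbers only: 1,000 samples, batches of 32, 5 epochs.
total = updates_per_run(num_samples=1000, batch_size=32, epochs=5)
print(total)
```

A smaller batch size gives more updates per epoch, and each update is computed from fewer samples.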

      Intro to neural networks with Keras II

      • Activation functions
      • Data normalization
      • Optimizers
      • Custom metrics
      • Validation dataset
      • Callbacks and checkpointing
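
As a rough illustration of the activation functions topic, the sketch below defines three common activations in plain Python and evaluates them on a few inputs. This is an assumption about which functions are representative, not a list of what the course covers, and it deliberately avoids the Keras API.

```python
import math


def relu(x: float) -> float:
    # Passes positive inputs through unchanged and clips negatives to zero.
    return max(0.0, x)


def sigmoid(x: float) -> float:
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x: float) -> float:
    # Squashes any real input into the open interval (-1, 1), centered at 0.
    return math.tanh(x)


samples = [-2.0, 0.0, 2.0]
for name, fn in [("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(name, [round(fn(x), 3) for x in samples])
```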

      MLflow

      • Experiment tracking
      • Record which model and hyperparameters performed best

      Convolutional neural networks

      • Working with image data
      • Convolutions
      • Max pooling versus average pooling
      • ImageNet architectures
      • Deep learning pipelines: Apply pretrained models in parallel
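
The contrast between max pooling and average pooling can be sketched on a tiny grid. The example below is a plain-Python illustration with a hypothetical helper `pool2x2` and made-up pixel values; it assumes non-overlapping 2x2 windows (stride 2) and is not how Keras implements pooling.

```python
def average(window):
    # Arithmetic mean of the values in one pooling window.
    return sum(window) / len(window)


def pool2x2(image, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 window of a 2D list."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = [
                image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1],
            ]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


# Made-up 4x4 "image" for illustration.
image = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 9, 3],
    [4, 2, 5, 7],
]

max_pooled = pool2x2(image, max)       # keeps the strongest value per window
avg_pooled = pool2x2(image, average)   # smooths each window to its mean
print(max_pooled)
print(avg_pooled)
```

Both reduce the 4x4 input to 2x2; max pooling keeps the largest value in each window, while average pooling blends all four values.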

      Horovod

      • Distributed Keras and TensorFlow model training
      • All-reduce technique
      • Combine Spark preprocessing with distributed neural network training
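
The idea behind all-reduce can be shown with a small single-process simulation: every worker contributes its local gradients, and every worker ends up with the same averaged result. This is only a conceptual sketch. The function `all_reduce_mean` and the gradient values are hypothetical, it does not use the Horovod API, and it does not model the communication pattern a real implementation would use between machines.

```python
def all_reduce_mean(worker_grads):
    """Simulate an averaging all-reduce over per-worker gradient lists."""
    num_workers = len(worker_grads)
    # Reduce: sum each gradient position across all workers.
    summed = [sum(values) for values in zip(*worker_grads)]
    averaged = [s / num_workers for s in summed]
    # Broadcast: every worker receives its own copy of the same result.
    return [list(averaged) for _ in range(num_workers)]


# Made-up local gradients from three simulated workers.
grads = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]

result = all_reduce_mean(grads)
print(result)
```

Because every worker holds identical averaged gradients afterward, each one can apply the same update and the model copies stay in sync.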

      About your instructor

      Amir Issaei is a data science consultant at Databricks, where he educates customers on how to leverage the company’s Unified Analytics Platform in machine learning (ML) projects. He also helps customers implement ML solutions and use advanced analytics to solve business problems. Previously, he worked in the Operations Research Department at American Airlines, where he supported the Customer Planning, Airport, and Customer Analytics Groups. He holds an MS in mathematics from the University of Waterloo and a BE in physics from the University of British Columbia.

      Conference registration

      Get the Platinum pass or the Training pass to add this course to your package. Best Price ends June 21.
