Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Large-scale ML with MLflow, deep learning, and Apache Spark (Day 2)

Amir Issaei (Databricks)
Location: Capital Suite 17

Who is this presentation for?

  • You're a practicing data scientist who's eager to get started with deep learning.
  • You're a software engineer or technical manager interested in a thorough, hands-on overview of deep learning and its integration with Apache Spark.

Level

Advanced

Prerequisite knowledge

  • A working knowledge of Python (NumPy and pandas) and Spark DataFrames
  • Familiarity with data science

What you'll learn

  • Learn how to build a neural network with Keras
  • Understand the difference between various activation functions and optimizers
  • Discover how to track experiments with MLflow
  • Learn how to apply models at scale with Deep Learning Pipelines
  • Understand how to build distributed TensorFlow models with Horovod

    Description

    Outline

    Intro to neural networks with Keras I

    • Neural network architecture

    • Batch sizes and epochs

    • Evaluation metrics

    • Keras API
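
To make the relationship between batch sizes and epochs concrete, here is a minimal sketch in plain Python. It is not taken from the session materials; the function names and the sample numbers are hypothetical, and it assumes the last partial batch in each epoch is kept rather than dropped.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of weight updates in one pass over the data.

    Assumes the final, smaller batch is still used (hypothetical choice).
    """
    return math.ceil(num_samples / batch_size)


def total_updates(num_samples, batch_size, epochs):
    """Total weight updates across all epochs."""
    return steps_per_epoch(num_samples, batch_size) * epochs


# Illustrative numbers only: 1,000 training rows, batches of 32, 10 epochs.
steps = steps_per_epoch(1000, 32)
updates = total_updates(1000, 32, 10)
print(steps, updates)
```

A smaller batch size means more updates per epoch, each computed from fewer examples.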

    Intro to neural networks with Keras II

    • Activation functions

    • Data normalization

    • Optimizers

    • Custom metrics

    • Validation dataset

    • Callbacks/checkpointing
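
As a rough illustration of how common activation functions differ, the sketch below implements three of them in plain Python. The choice of ReLU, sigmoid, and tanh is an assumption made for illustration; the listing does not say which functions the session covers.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes any real input into the open interval (-1, 1)."""
    return math.tanh(x)


# Hypothetical pre-activation values for a single neuron.
inputs = [-2.0, 0.0, 3.0]
relu_out = [relu(x) for x in inputs]
sigmoid_out = [sigmoid(x) for x in inputs]
tanh_out = [tanh(x) for x in inputs]
print(relu_out, sigmoid_out, tanh_out)
```

ReLU is unbounded above, while sigmoid and tanh saturate for inputs of large magnitude.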

    MLflow

    • Experiment tracking

    • Record which model and hyperparameters performed best

    Convolutional neural networks

    • Working with image data

    • Convolutions

    • Max pooling versus average pooling

    • ImageNet architectures

    • Deep Learning Pipelines: Apply pretrained models in parallel
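
The difference between max pooling and average pooling can be sketched in a few lines of plain Python. This is an illustrative toy, not the Keras implementation: `pool2d`, `mean`, and the 4x4 `image` are hypothetical names and data, and the sketch assumes non-overlapping windows (stride equal to the window size).

```python
def pool2d(image, size, reduce_fn):
    """Apply non-overlapping size x size pooling to a 2D list of numbers."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect every value inside the current window.
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical 4x4 single-channel image.
image = [
    [1, 3, 2, 0],
    [5, 7, 4, 4],
    [0, 2, 9, 1],
    [2, 0, 3, 3],
]

max_pooled = pool2d(image, 2, max)   # keeps the strongest value per window
avg_pooled = pool2d(image, 2, mean)  # smooths each window to its mean
print(max_pooled)
print(avg_pooled)
```

Max pooling keeps only the strongest response in each window, whereas average pooling blends all of them.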

    Horovod

    • Distributed Keras/TensorFlow model training

    • Allreduce technique

    • Combine Spark preprocessing with distributed neural network training
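
The outcome of an allreduce step can be simulated in plain Python: every worker contributes its local gradients and every worker ends up with the same averaged result. This is a single-process sketch of the result only. It does not model how a real implementation passes data between workers, and `allreduce_average` and the sample gradients are hypothetical.

```python
def allreduce_average(worker_grads):
    """Return, for each worker, the element-wise mean of all workers' gradients."""
    num_workers = len(worker_grads)
    # Reduce: sum each gradient component across workers.
    summed = [sum(components) for components in zip(*worker_grads)]
    averaged = [s / num_workers for s in summed]
    # Broadcast: every worker receives its own copy of the same result.
    return [list(averaged) for _ in range(num_workers)]


# Hypothetical gradients computed by three workers on different data shards.
local_grads = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
synced = allreduce_average(local_grads)
print(synced)
```

Because every worker applies the same averaged gradient, the model replicas stay identical after each update.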

    Amir Issaei

    Databricks

    Amir Issaei is a data science consultant at Databricks, where he educates customers on how to leverage the company’s Unified Analytics Platform in machine learning (ML) projects. He also helps customers implement ML solutions and use advanced analytics to solve business problems. Previously, he worked in the Operations Research Department at American Airlines, where he supported the Customer Planning, Airport, and Customer Analytics Groups. He holds an MS in mathematics from the University of Waterloo and a BE in physics from the University of British Columbia.