Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Building ML and AI pipelines with Spark and TensorFlow

Chris Fregly (Amazon Web Services)
2:40pm–3:20pm Thursday, March 8, 2018
Secondary topics: Expo Hall

Who is this presentation for?

  • Data scientists and data engineers

Prerequisite knowledge

  • A working knowledge of Spark
  • A basic understanding of TensorFlow

What you'll learn

  • How to create an end-to-end pipeline using Spark and TensorFlow


Chris Fregly demonstrates how to extend existing Spark-based data pipelines to include TensorFlow model training and deployment. He also offers an overview of TensorFlow’s TFRecord format, including libraries for converting to and from other popular file formats, such as Parquet, CSV, JSON, and Avro, stored in HDFS and S3. All demos are 100% open source and downloadable as Docker images from
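To make the TFRecord conversion step more concrete, here is a minimal sketch of the TFRecord on-disk framing (a little-endian length, a masked CRC32C of the length, the payload bytes, and a masked CRC32C of the payload) applied to a few CSV rows. It is not the speaker's code and does not use the conversion libraries from the talk, which this page does not name. The helper names (`write_record`, `read_records`, `csv_to_tfrecord`) are hypothetical, and the JSON payload is an assumption chosen to keep the example free of dependencies; real pipelines typically store serialized `tf.train.Example` protos and write them with TensorFlow's own `tf.io.TFRecordWriter`.

```python
import csv
import io
import json
import struct


def _make_crc32c_table():
    # CRC32C (Castagnoli) lookup table, reflected polynomial 0x82F63B78.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc32c_table()


def crc32c(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    # TFRecord stores a "masked" CRC: rotate right by 15 bits, add a constant.
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def write_record(stream, payload: bytes) -> None:
    # Frame layout: uint64 length, uint32 crc(length), payload, uint32 crc(payload).
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def read_records(stream):
    # Yield each payload, checking both checksums along the way.
    while True:
        header = stream.read(8)
        if not header:
            return
        (length_crc,) = struct.unpack("<I", stream.read(4))
        if length_crc != masked_crc(header):
            raise ValueError("corrupt length header")
        (length,) = struct.unpack("<Q", header)
        payload = stream.read(length)
        (payload_crc,) = struct.unpack("<I", stream.read(4))
        if payload_crc != masked_crc(payload):
            raise ValueError("corrupt payload")
        yield payload


def csv_to_tfrecord(csv_text: str, stream) -> int:
    # Assumption: each CSV row becomes one JSON-encoded record.
    count = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        write_record(stream, json.dumps(row, sort_keys=True).encode("utf-8"))
        count += 1
    return count


if __name__ == "__main__":
    buf = io.BytesIO()
    written = csv_to_tfrecord("id,label\n1,cat\n2,dog\n", buf)
    buf.seek(0)
    rows = [json.loads(p) for p in read_records(buf)]
    print(written, rows)
```

Each record adds 16 bytes of framing around its payload, and the per-record checksums let a reader detect truncated or corrupted files while streaming, which matters when the files sit in a remote store such as HDFS or S3.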

Chris Fregly

Amazon Web Services

Chris Fregly is a senior developer advocate focused on AI and machine learning at Amazon Web Services (AWS). Chris shares knowledge with fellow developers and data scientists through his Advanced Kubeflow AI Meetup and regularly speaks at AI and ML conferences across the globe. Previously, Chris was a founder at PipelineAI, where he worked with many startups and enterprises to deploy machine learning pipelines using open source and AWS products, including Kubeflow, Amazon EKS, and Amazon SageMaker.

Comments on this page are now closed.


Zhen Fan
03/11/2018 11:55am PDT

Hi Chris,
I’m very interested in your presentation. Could you share your slides? My email address is, I think that would be very helpful to my team. I’m looking forward to having a deeper discussion with you. Thanks.