Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Serverless machine learning with TensorFlow

Carl Osipov (Google)
9:00–17:00 Tuesday, 22 May 2018
Big data and data science in the cloud
Location: Capital Suite 11
Level: Intermediate

Who is this presentation for?

  • Developers interested in machine learning

Prerequisite knowledge

  • A basic understanding of Python
  • Familiarity with machine learning (useful but not required)

Materials or downloads needed in advance

  • A laptop with a modern browser installed (Please note that there will not be power strips in the tutorial room, but there will be charging stations in the hallway and throughout the venue, so please plan accordingly.)
  • Before the course, check in to the session with your email address by visiting https://goo.gl/forms/AChcS0hYvfsihwaE2 for instructions on how to access downloadable content for this session.

What you'll learn

  • How to build and deploy simple and complex models with TensorFlow

Description

Carl Osipov walks you through building a complete machine learning pipeline from ingest, exploration, training, and evaluation to deployment and prediction. This workshop will be conducted on the Google Cloud Platform (GCP) and will use GCP’s infrastructure to run TensorFlow.

Outline:

  • Data pipelines and data processing: How to explore and split large datasets correctly (using SQL and pandas on BigQuery and Cloud Datalab)
  • Model building: How to develop a wide-and-deep machine learning model in TensorFlow on a small sample locally (using Apache Beam for preprocessing, so that the same preprocessing can also be applied in streaming mode, and Cloud Dataflow and Cloud ML Engine for preprocessing and training the model)
  • Model inference and deployment: How to deploy the trained model as a REST microservice and invoke predictions from a web application
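
The outline does not say how the datasets are split, so the following is only a minimal sketch of one common approach, not necessarily the one taught in the session: hash a key column and assign each record to a bucket, so that the same record always lands in the same split no matter how often the data is re-read. The function name `assign_split`, the `date` key, and the 80/10/10 proportions are all hypothetical choices made for illustration.

```python
import hashlib


def assign_split(key: str, train_pct: int = 80, valid_pct: int = 10) -> str:
    """Deterministically map a record key to 'train', 'valid', or 'test'.

    Hypothetical helper: hashing the key (rather than drawing random
    numbers) keeps the split repeatable across runs and machines.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    if bucket < train_pct:
        return "train"
    if bucket < train_pct + valid_pct:
        return "valid"
    return "test"


# Hypothetical records keyed by date; all rows sharing a date fall into
# the same split, which avoids leaking correlated rows across splits.
records = [{"date": f"2018-05-{day:02d}", "value": day} for day in range(1, 32)]

splits = {"train": [], "valid": [], "test": []}
for record in records:
    splits[assign_split(record["date"])].append(record)
```

On a small sample the resulting proportions will only approximate 80/10/10; they converge as the number of distinct keys grows.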

Carl Osipov

Google

Carl Osipov is a program manager focused on helping Google’s customers and business partners get trained and certified to run machine learning and data analytics workloads on Google Cloud. Carl has more than 16 years of experience in the IT industry and has held leadership roles for programs and projects in the areas of big data, cloud computing, service-oriented architecture, machine learning, and computational natural language processing at some of the world’s leading technology companies across the United States and Europe. Carl has written over 20 articles in professional, trade, and academic journals and holds six patents from the USPTO. He has received three corporate awards from IBM for his innovative work. You can find out more about Carl on his blog.

Comments on this page are now closed.

Comments

Kirill Osipov
20/05/2018 23:24 BST

@Wolfgang Thank you for your question about the relationship between Spark and the TensorFlow Dataset API. In my opinion, Spark and the TensorFlow Dataset API are complementary technologies: Spark is a distributed computing platform, while the Dataset API is, well, an API. Since both are used for data processing, both need similar abstractions. However, users should understand that some systems may use both Spark and the Dataset API.

Kirill Osipov
20/05/2018 23:10 BST

@Wolfgang @Melanie Thank you for your questions about the downloadable material! Those who have checked in using the form should have received an email earlier today with links to downloadable content. If you haven’t received the email, note that I will make sure that everyone who attends the session in person gets checked in.

Wolfgang Giersche | CONSULTANT
19/05/2018 7:39 BST

Anybody out there to fulfil the promise of downloadable material? I get nothing but a remark that my email address has been posted.

Melanie Rezac | CUSTOMER SERVICE
17/05/2018 22:19 BST

How soon after ‘Checking in’ with the email address at https://goo.gl/forms/AChcS0hYvfsihwaE2 will notifications go out with download material?

Wolfgang Giersche | CONSULTANT
6/04/2018 8:42 BST

Distributed processing and back pressure in the new Dataset API: Can Datasets make Spark and the like obsolete?