Put open source to work

July 16–17, 2018: Training & Tutorials
July 18–19, 2018: Conference

Portland, OR

Powering TensorFlow with big data using Apache Beam, Flink, and Spark

Holden Karau (Independent)

2:35pm–3:15pm Wednesday, July 18, 2018

Artificial intelligence, TensorFlow
Location: Portland 251

Level: Intermediate

Average rating:

(2.20, 5 ratings)

Who is this presentation for?

Software engineers

Prerequisite knowledge

Familiarity with TensorFlow, Apache Spark, Flink, and Beam (useful but not required)

What you'll learn

Understand how to work with TensorFlow and big data systems

Description

TensorFlow is all kinds of fancy, from helping startups raise their Series A in Silicon Valley to detecting if something is a cat. However, when things start to get “real,” you may find yourself no longer just dealing with mnist.csv but instead needing to do large-scale data prep as well as training.

Holden Karau details how to use TensorFlow in conjunction with Apache Spark, Flink, and Beam to create a full machine learning pipeline, including the annoying feature engineering and data prep components that we like to pretend don’t exist. Holden also explains why these feature prep stages need to be integrated into the serving layer. She concludes by examining changing industry trends, like Apache Arrow, and how they impact cross-language development for things like deep learning. Even if you’re not trying to raise a round of funding in Silicon Valley, this talk will give you tools to tackle interesting machine learning problems at scale.

Holden Karau

Independent

Holden Karau is a transgender Canadian software engineer working in the Bay Area. Previously, she worked at IBM, Alpine, Databricks, Google (twice), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She’s a committer on the Apache Spark, SystemML, and Mahout projects. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters, and dancing.

Comments

Christopher Bess | LEAD DEVELOPER

07/26/2018 4:59am PDT

From what I could gather, the information was relevant and generally helpful. I was grateful that Holden used the command line and showed more detail. But the swearing (F-bombs, the S-word) was unnecessary and unprofessional.
