Sep 23–26, 2019

Your cloud, your ML, but more and more scale? How SurveyMonkey did it

Jing Huang (SurveyMonkey), Jessica Mong (SurveyMonkey)
11:20am–12:00pm Thursday, September 26, 2019
Location: 1A 15/16
Average rating: *****
(5.00, 2 ratings)

Who is this presentation for?

  • Architects, engineers, and data scientists who are responsible for building and maintaining machine learning ecosystems within their organizations

Level

Intermediate

Description

As a leading global software company, SurveyMonkey created the online survey category and transformed the way people give feedback. The people-powered data collected over the past two decades (50+ billion questions answered on the platform, 2.4 million survey respondents per day, etc.) is a gold mine for ML. In early 2018, SurveyMonkey started a journey to expand its ML capabilities and empower the rest of the company to leverage ML. More than a year into that journey, it has seen a 10x increase in its model serving capability, all while building its ML platform on a hybrid cloud infrastructure and expanding to multiple data centers.

Jing Huang and Jessica Mong present SurveyMonkey’s case study and explore what the company had to do to reach these landmarks. To serve its fast-growing number of models in production, it extended its online ML serving layer to scale up serving capability and added support for shadow testing, A/B testing, and different data science (DS) libraries. SurveyMonkey built continuous model retraining pipelines to maintain model effectiveness and developed a central feature store as the first step toward democratizing ML. It has solved complex engineering challenges along the way and seen exciting results in return. While the work is far from done, the company’s journey offers many lessons worth sharing.

Prerequisite knowledge

  • A basic understanding of ML building blocks and the development pipeline

What you'll learn

  • Learn from SurveyMonkey’s journey how to expand ML capability with existing infrastructure: the critical preparations before tackling scalability issues in ML, the key components of the ML system, the key considerations in choosing third-party tools versus building in-house solutions when you operate on a hybrid cloud, and the key building blocks for making existing infrastructure ML-friendly

Jing Huang

SurveyMonkey

Jing Huang is a director of engineering, machine learning, at SurveyMonkey, where she drives the vision and execution of democratizing machine learning. She leads the effort to build the next-generation machine learning platform and oversees all machine learning operation projects. Previously, she was an entrepreneur who devoted her time to building mobile-first solutions and data products for nontech industries, and she worked at Cisco, where her contributions ranged from security and cloud management to big data infrastructure.

Jessica Mong

SurveyMonkey

Jessica Egoyibo Mong is an engineering manager on the machine learning engineering (MLE) team at SurveyMonkey. She leads efforts to rearchitect the online serving ML system. Previously, she was a full stack engineer on the billing and payments team, where she built and maintained software to enable SurveyMonkey’s global financial growth and operation; she also oversaw the technical talks program, jointly managed the engineering internship program, and co-led the Women in Engineering Group. She’s a 2014 White House Initiative on HBCUs All-Star and a Hackbright (summer 2013) and CODE2040 (summer 2014) alum. She’s served on the leadership team of the Silicon Valley local chapter of the Anita Borg Institute and is a member of /dev/color. Jessica earned a BS in computer engineering from Claflin University in South Carolina. She’s a singer and up-and-coming drummer, and she sings and drums at her church in Livermore, California. In her spare time, she enjoys eating, CrossFit, reading, learning new technologies, and sleeping.
