Sep 23–26, 2019

A large-scale deep learning offline platform: Bing's approach

Kai Liu (BING) (Microsoft), Jack Zhang (Microsoft), Jing Zhao (Microsoft)
2:55pm–3:35pm Wednesday, September 25, 2019
Location: 1A 21

Who is this presentation for?

  • CTOs, directors of data platforms, and data science managers

Level

Beginner

Description

Bing runs a large-scale deep learning offline platform on tens of thousands of servers, with its services partitioned around the different steps of the deep learning project lifecycle. The deep learning training service is a set of frameworks and tools for model training tasks, optimized for resource scheduling, collaboration, and quick iteration. Deep learning offline processing is a set of tools for running a model against massive amounts of data to prepare datasets, optimized for throughput and streaming. The deep learning vector service is a set of tools for hosting and running vectors for online and offline use, optimized for fast computation. And the deep learning inference service is a set of tools for hosting trained models offline for model validation and collaboration, optimized for interactivity and cost efficiency.

Kai Liu, Jack Zhang, and Jing Zhao provide a comprehensive overview of a deep learning offline system that can boost the productivity of the data scientist community in your organization.

Prerequisite knowledge

  • A general knowledge of the key steps in deep learning projects

What you'll learn

  • Get an overview of a deep learning offline system that can boost the productivity of the data scientist community in your organization

Kai Liu (BING)

Microsoft

Kai Liu is a senior program manager in the AI and Research Group at Microsoft. He has seven years of experience in data-driven engineering, big data platforms, and AI infrastructure for the Office product families. He led his team to create a service health portal for SharePoint Online, inject a distributed log collection and storage system for Exchange Online, publish curated datasets and key business metrics, and enable subhour experimentation in Office 365. He’s working on AI and deep learning infrastructure for large-scale enterprise data under compliance obligations.

Jack Zhang

Microsoft

Jack Zhang is a principal group engineering manager in the AI and Research Group at Microsoft. He has 13 years of experience in data-driven engineering, cloud-based index updating, offline data processing platforms, and AI infrastructure for Bing. He’s working on AI and deep learning infrastructure for large-scale distributed training and offline data processing.

Jing Zhao

Microsoft

Jing Zhao is a senior program manager in the AI and Research Group at Microsoft. She has five years of experience spanning Windows build and testing infrastructure for automated large-scale data and failure analysis; Bing Search and the AI index serving platform for deep learning-based ranking and caption infrastructure; and Bing AI and deep learning training infrastructure. She focuses mainly on deep learning training optimization and on infrastructure for large-scale distributed data and training.
