Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Managing globally distributed data for deep learning using TensorFlow on YARN (sponsored by WANdisco)

Jagane Sundar (WANdisco)
2:40pm-3:20pm Wednesday, March 27, 2019
Sponsored
Location: 2022
Average rating: 4.50 (2 ratings)

What you'll learn

  • Explore a system for replicating data across geographically distributed data centers

Description

The benefits of large datasets for deep learning are well known. But what if the source of this data is globally distributed?

Jagane Sundar shares a system for replicating data across geographically distributed data centers, discusses the benefits of consistently replicating the data that TensorFlow uses for training, and explores the advantages of a Paxos-based distributed coordination algorithm for replication. Jagane then details the unique capability that results: maintaining consistent, writable copies of the data in multiple data centers.

This session is sponsored by WANdisco.

Jagane Sundar

WANdisco

Jagane Sundar is the CTO at WANdisco. Jagane has extensive big data, cloud, virtualization, and networking experience. He joined WANdisco through its acquisition of AltoStor, a Hadoop-as-a-service platform company. Previously, Jagane was founder and CEO of AltoScale, a Hadoop- and HBase-as-a-platform company acquired by VertiCloud. His experience with Hadoop began as director of Hadoop performance and operability at Yahoo. Jagane’s accomplishments include creating Livebackup, an open source project for KVM VM backup, developing a user mode TCP stack for Precision I/O, developing the NFS and PPP clients and parts of the TCP stack for JavaOS for Sun Microsystems, and creating and selling a 32-bit VxD-based TCP stack for Windows 3.1 to NCD Corporation for inclusion in PC-Xware. Jagane is currently a member of the technical advisory board of VertiCloud. He holds a BE in electronics and communications engineering from Anna University.