Sep 23–26, 2019

Enabling big data and AI workloads on the object store at DBS Bank

Vitaliy Baklikov (DBS Bank), Dipti Borkar (Alluxio)
3:45pm–4:25pm Thursday, September 26, 2019
Location: 1A 23/24

Who is this presentation for?

  • Data engineers, data architects, and storage architects

Level

Intermediate

Description

The big data stack has evolved over the past few years with an explosion of data frameworks, starting with MapReduce and expanding to Apache Spark and Presto. The approach to storing and managing data has evolved as well, moving from a primary reliance on the Hadoop Distributed File System (HDFS) to newer, cheaper, and easier-to-use technologies like object stores. But the design of most object stores inhibits running real-time big data and AI workloads directly on them.

Vitaliy Baklikov and Dipti Borkar explore a different architecture for analytic workloads, particularly those deployed in cloud environments. Alluxio, an open source virtual distributed file system, provides a unified data access layer for hybrid and multicloud deployments. It lets distributed compute engines like Spark and Presto, as well as machine learning frameworks like TensorFlow, transparently access different persistent storage systems (including HDFS, S3, Azure, etc.) while using an in-memory cache to accelerate data access.
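
The idea of a unified access layer with a cache in front of several persistent stores can be sketched in a few lines of Python. This is only an illustration of the read-through caching pattern, not Alluxio's API or implementation: the names `DictStore`, `UnifiedCache`, `mount`, and `read` are hypothetical, the stores are in-process dictionaries standing in for systems such as S3 or HDFS, and the cache capacity is counted in entries rather than bytes.

```python
from collections import OrderedDict
from typing import Dict


class DictStore:
    """Toy stand-in for a persistent store (hypothetical; not a real client)."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self.objects = objects
        self.reads = 0  # counts how often the backing store is actually hit

    def read(self, key: str) -> bytes:
        self.reads += 1
        return self.objects[key]


class UnifiedCache:
    """Single namespace over several mounted stores, with a read-through LRU cache."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # maximum number of cached entries
        self.mounts: Dict[str, DictStore] = {}
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()

    def mount(self, prefix: str, store: DictStore) -> None:
        # Expose a store under a path prefix such as "/s3/".
        self.mounts[prefix] = store

    def read(self, path: str) -> bytes:
        # Cache hit: serve from memory and mark the entry as recently used.
        if path in self.cache:
            self.cache.move_to_end(path)
            return self.cache[path]
        # Cache miss: route to the store with the longest matching prefix.
        matches = [p for p in self.mounts if path.startswith(p)]
        if not matches:
            raise FileNotFoundError(path)
        prefix = max(matches, key=len)
        data = self.mounts[prefix].read(path[len(prefix):])
        # Keep the data in memory, evicting the least recently used entry.
        self.cache[path] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)
        return data


if __name__ == "__main__":
    s3 = DictStore({"events.csv": b"a,b\n1,2\n"})
    hdfs = DictStore({"model.bin": b"\x00\x01"})
    fs = UnifiedCache(capacity=2)
    fs.mount("/s3/", s3)
    fs.mount("/hdfs/", hdfs)
    fs.read("/s3/events.csv")   # first read goes to the backing store
    fs.read("/s3/events.csv")   # second read is served from the cache
    fs.read("/hdfs/model.bin")  # same interface, different backing store
    print(s3.reads, hdfs.reads)
```

In this sketch the caller uses one `read` call regardless of where the data lives, and repeated reads of the same path do not touch the backing store again until the entry is evicted.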

Vitaliy and Dipti dive into how DBS Bank built a modern big data analytics stack that uses an object store as persistent storage even for data-intensive workloads, and how it uses Alluxio to orchestrate data locality and data access for Spark workloads. They also explain how accessing data through Alluxio addresses many of the challenges that cloud deployments bring when compute and storage are separated.

Prerequisite knowledge

  • A working knowledge of the data ecosystem

What you'll learn

  • Discover that object stores provide an easier and cheaper storage alternative to Hadoop, but their limitations prevent them from being used for real-time big data workloads
  • Learn how Alluxio can enable new workloads on object stores

Vitaliy Baklikov

DBS Bank

Vitaliy Baklikov is senior vice president at DBS Bank, where he leads a team of architects who drive the evolution of the platform and tackle use cases ranging from batch and stream big data processing to sophisticated machine learning workloads. He has over 15 years of experience in advanced analytics and distributed architectures. He's building a next-generation enterprise data platform for the bank that sits across private and public clouds. Previously, he held various roles at startups and financial institutions across the US, UK, and Russia.


Dipti Borkar

Alluxio

Dipti Borkar is the vice president of product and marketing at Alluxio, with over 15 years of experience in relational and nonrelational data and database technology. Previously, Dipti was vice president of product marketing at Kinetica and Couchbase, where she held several leadership positions, including head of global technical sales and head of product management. She also managed development teams at IBM DB2, where she started her career as a database software engineer. Dipti holds an MS in computer science from the University of California San Diego and an MBA from the Haas School of Business at the University of California, Berkeley.
