Enabling big data and AI workloads on the object store at DBS Bank
Who is this presentation for?
Data engineers, data architects, storage architects
The big data stack has evolved considerably over the past few years, with an explosion of data frameworks starting with MapReduce and expanding to Apache Spark and Presto. The approach to managing and storing data has evolved as well, moving from primarily HDFS to newer, cheaper, and easier-to-use technologies like object stores. But the design of most object stores makes it hard for real-time big data and AI workloads to run directly on them.
In this session, we introduce a different architecture for analytic workloads, particularly those deployed in cloud environments. Alluxio, an open-source virtual distributed file system, provides a unified data access layer for hybrid and multi-cloud deployments. Alluxio enables distributed compute engines like Spark and Presto, or machine learning frameworks like TensorFlow, to transparently access different persistent storage systems (including HDFS, S3, and Azure) while actively leveraging an in-memory cache to accelerate data access.
In this presentation, Vitaliy Baklikov from DBS Bank and Dipti Borkar from Alluxio share how DBS Bank has built a modern big data analytics stack that leverages an object store as persistent storage, even for data-intensive workloads, and how it uses Alluxio to orchestrate data locality and data access for Spark workloads. In addition, using Alluxio to access data solves many of the challenges that cloud deployments bring with separated compute and storage.
Prerequisite knowledge
Basic knowledge of the data ecosystem
What you'll learn
Development Bank of Singapore
Vitaliy Baklikov is a data architect at Development Bank of Singapore.
Dipti Borkar is the VP of Product & Marketing at Alluxio, with over 15 years of experience in data and database technology across relational and non-relational systems. Prior to Alluxio, Dipti was VP of Product Marketing at Kinetica and Couchbase. At Couchbase, she held several leadership positions, including Head of Global Technical Sales and Head of Product Management. Earlier in her career, Dipti managed development teams at IBM DB2, where she started as a database software engineer. Dipti holds an M.S. in Computer Science from UC San Diego and an MBA from the Haas School of Business at UC Berkeley.