Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Arun Kumar
Assistant Professor, University of California, San Diego


Arun Kumar is an assistant professor in the Department of Computer Science and Engineering at the University of California, San Diego. He’s a member of the Database Lab and CNS and an affiliate member of the AI Group. His primary research interests are in data management and systems for machine learning- and artificial intelligence-based data analytics. Systems and ideas based on his research have been released as part of the MADlib open source library, shipped as part of products from EMC, Oracle, Cloudera, and IBM, and used internally by Facebook, LogicBlox, Microsoft, and other companies. He’s a recipient of the ACM SIGMOD 2014 Best Paper Award, the 2016 Graduate Student Research Award for the best dissertation research in UW-Madison CS, and a 2016 Google Faculty Research Award.


1:50pm-2:30pm Thursday, March 28, 2019
Arun Kumar (University of California, San Diego)
Average rating: 4.00 (2 ratings)
Arun Kumar details recent techniques to accelerate ML over data that is the output of joins of multiple tables. Using ideas from query optimization and learning theory, Arun demonstrates how to avoid joins before ML to reduce runtimes and memory and storage footprints. Along the way, he explores open source software prototypes and sample ML code in both R and Python.