Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY
DB Tsai

Senior Research Engineer, Netflix

DB Tsai is a senior research engineer working on personalized recommendation algorithms at Netflix. He’s also an Apache Spark committer and a member of the Apache Spark Project Management Committee (PMC). DB has implemented several algorithms in Apache Spark, including linear regression and binary/multinomial logistic regression with elastic net (L1/L2) regularization using L-BFGS/OWL-QN optimizers. Previously, he was a lead machine learning engineer at Alpine Data Labs, where he led a team that developed large-scale distributed learning algorithms and contributed them back to the open source Apache Spark project. DB was a PhD candidate in applied physics at Stanford University and holds a master’s degree in electrical engineering from Stanford University.


5:25pm–6:05pm Wednesday, September 27, 2017
Machine Learning & Data Science, Spark & beyond
Location: 1A 08/10
Level: Advanced
Secondary topics: Media
Seth Hendrickson (Cloudera), DB Tsai (Netflix)
Average rating: *****
(5.00, 1 rating)
Recent developments in Spark MLlib let users express a wider class of ML models and reduce model training times by plugging in custom parameter optimization algorithms. Seth Hendrickson and DB Tsai explain when and how to use this new API and walk you through creating your own Spark ML optimizer. Along the way, they also share performance benefits and real-world use cases.