Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA
Pawel Szostek

Pawel Szostek
Senior Software Engineer, Criteo

Pawel Szostek is a senior software engineer on Criteo’s analytics data storage team, where he works on various projects, including implementing an improved HyperLogLog algorithm. Previously, he was a researcher at CERN in Geneva.


11:50am12:30pm Thursday, March 8, 2018
Szehon Ho (Criteo), Pawel Szostek (Criteo)
Average rating: ****.
(4.50, 2 ratings)
Hive is the main data transformation tool at Criteo, and hundreds of analysts and thousands of automated jobs run Hive queries every day. Szehon Ho and Pawel Szostek discuss the evolution of Criteo's Hive platform from an error-prone add-on installed on some spare machines to a best-in-class installation capable of self-healing and automatically scaling to handle its growing load. Read more.