Real-world Applications: Disrupting with Data

Location: Strata OLC May

Most massive-scale analytics systems rely on Hadoop to process large amounts of unstructured log data. At Zynga, we instead load loosely structured data, in essentially real time, into a fully ANSI SQL relational warehouse: Vertica, an MPP compressed column store. We load more than 50 billion rows a day, all available within 5 minutes of being logged, and because the warehouse is true ANSI SQL we can use traditional ETL and reporting tools. Around it we have built a series of other platform tools, such as an experimentation platform and a data services platform that gets data back into the games for run-time access. This talk will focus on the infrastructure developed for these systems.

Daniel McCaffrey