Coordinating the Many Tools of Big Data in Hadoop

Hadoop in Practice, Great America Ballroom K

The big data revolution is about more than terabytes or petabytes of data. It is also about applying new paradigms, languages, and tools to these data sets. This variety is a great strength of big data, but also a liability: the tools have different data models, different utilities for reading and writing data, and different frameworks for including user code. How can users in the same organization who use different tools share data? How can user-defined functions written for one tool be used by other tools? This talk will cover work being done in the Apache HCatalog, Apache Pig, and Apache Hive projects to address these issues.

Alan Gates


Alan is a co-founder of Hortonworks. He is an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan also designed HCatalog and guided its adoption as an Apache Incubator project. He is the author of Programming Pig, published by O’Reilly.

