Hadoop is a powerful data analysis platform, especially for working with unstructured or semi-structured data. Hadoop MapReduce analyses are often enhanced by including regularly structured data, such as that found in a MySQL database. Conversely, Hadoop-based analysis often produces regularly structured output, which must then be integrated with existing datasets or other online systems.
This talk introduces Sqoop, the open source SQL-to-Hadoop tool. Sqoop helps users perform efficient imports of data from MySQL and other databases to Hadoop’s distributed file system, where it can be processed in concert with other data sources. Sqoop also allows users to export Hadoop-generated results back to MySQL and other systems for use with other data pipelines.
After this session, users will understand how MySQL and Hadoop fit together, and how to use Sqoop to move data between these systems. The talk will provide suggestions for best practices when integrating Sqoop and Hadoop in your data processing pipelines. We’ll also cover some deeper technical details of Sqoop’s architecture, and how it uses MySQL-specific tools to achieve high throughput.
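As a rough illustration of the import and export steps described above, the following Python sketch assembles Sqoop command lines without running them. The flag names (--connect, --table, --target-dir, --export-dir, --username, --num-mappers, --direct) follow the Sqoop 1.x command-line documentation and may differ in other releases, so check them against your installed version. The host, database, table, and HDFS paths are hypothetical placeholders, and the helper functions are illustrative only; they are not part of Sqoop. The --direct flag requests the MySQL-specific fast path, which relies on tools such as mysqldump.

```python
import shlex


def sqoop_import_args(jdbc_url, table, target_dir,
                      username=None, num_mappers=None, direct=False):
    """Build an argv list for `sqoop import`. Nothing is executed here."""
    args = ["sqoop", "import",
            "--connect", jdbc_url,
            "--table", table,
            "--target-dir", target_dir]
    if username:
        args += ["--username", username]
    if num_mappers is not None:
        # Number of parallel map tasks used for the transfer.
        args += ["--num-mappers", str(num_mappers)]
    if direct:
        # Ask Sqoop to use the database-specific fast path where available.
        args.append("--direct")
    return args


def sqoop_export_args(jdbc_url, table, export_dir, username=None):
    """Build an argv list for `sqoop export` from an HDFS directory."""
    args = ["sqoop", "export",
            "--connect", jdbc_url,
            "--table", table,
            "--export-dir", export_dir]
    if username:
        args += ["--username", username]
    return args


if __name__ == "__main__":
    # Hypothetical connection string, table names, and paths.
    url = "jdbc:mysql://db.example.com/shop"
    import_cmd = sqoop_import_args(url, "orders", "/data/orders",
                                   username="analyst", num_mappers=4,
                                   direct=True)
    export_cmd = sqoop_export_args(url, "order_summaries",
                                   "/results/order_summaries",
                                   username="analyst")
    # Print shell-quoted command lines for inspection.
    print(" ".join(shlex.quote(a) for a in import_cmd))
    print(" ".join(shlex.quote(a) for a in export_cmd))
```

Building the command as a list of arguments, rather than one string, keeps quoting explicit and lets the same list be passed to a process launcher once the flags have been verified.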
Aaron Kimball is a software engineer at Cloudera, Inc., the Commercial Hadoop company. Aaron is the principal developer of Sqoop, the SQL-to-Hadoop database import/export tool. Aaron has been working with Hadoop since early 2007, and contributes actively to its development. Through Cloudera, he additionally provides training to developers and system administrators working with Hadoop. Aaron holds a B.S. in Computer Science from Cornell University, and an M.S. in Computer Science and Engineering from the University of Washington.