Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

From Oracle to Hadoop: Unlocking Hadoop for your RDBMS with Apache Sqoop and other tools

Guy Harrison (Dell Software)
4:00pm–4:40pm Wednesday, 12/02/2015
Hadoop Platform
Location: 334-335 Level: Intermediate
Slides: 1-ZIP

Prerequisite Knowledge

Basic familiarity with Hadoop architecture


Apache Hadoop is great for storing large amounts of unstructured data, but when analyzing this data, users often need to reference data held in existing RDBMS-based systems. We’ll look at how to transfer large volumes of data from Oracle into Hadoop efficiently and at scale. We’ll also look at some strategies for keeping this data up to date while placing minimal load on existing systems.

In addition, we will look at strategies for Hadoop-to-RDBMS data flows, such as moving aggregated data from Hadoop to an RDBMS, and consider how Hadoop may be used alongside an RDBMS as a long-term archive or as a long-term transaction or audit log. We will discuss the new features of Apache Sqoop 2.0 and the merging of the Dell/Quest connector for Oracle into the Sqoop core, which gives Sqoop much improved scalability and manageability.

Guy Harrison

Dell Software

Guy Harrison is an executive director of research and development at Dell Software. Guy is the author of Oracle Performance Survival Guide, MySQL Stored Procedure Programming, and Oracle SQL High Performance Tuning, as well as other books, articles, and presentations on database technology. He also writes a monthly column for Database Trends and Applications.