From Oracle to Hadoop: Unlocking Hadoop for Your RDBMS with Apache Sqoop and Other Tools
Apache Hadoop is great for storing large amounts of unstructured data, but when analyzing this data, users need to reference data from existing RDBMS-based systems. We’ll look at how to transfer large volumes of data from Oracle into Hadoop efficiently and scalably. We’ll also look at strategies for keeping this data up to date while placing minimal load on existing systems. In addition, we will look at strategies for Hadoop-to-RDBMS data flows, such as moving aggregated data from Hadoop to an RDBMS, and consider how Hadoop may be used alongside an RDBMS as a long-term archive or as a long-term transaction or audit log. We will discuss the new features of Apache Sqoop 2.0 and the merging of the Dell/Quest connector for Oracle into the Sqoop core, which provides Sqoop with much-improved scalability and manageability.
Guy Harrison is Executive Director of Research and Development at Dell Software. Guy is the author of Oracle Performance Survival Guide (Prentice Hall, 2009) and MySQL Stored Procedure Programming (O’Reilly, with Steven Feuerstein), as well as other books, articles and presentations on database technology. He also writes a monthly column for Database Trends and Applications (www.dbta.com). He is contributing to the upcoming Oracle Exadata Expert Handbook (Pearson, 2014).
Guy can be found on the internet at http://www.guyharrison.net, by email at Guy.Harrison@software.dell.com, and on Twitter as @guyharrison.
David Robson is a principal technologist at Dell Software. He is the lead developer of the Dell Oracle connector for Hadoop, which is currently in the process of being donated to the Apache Sqoop project, and was the originator of Dell’s Toad for Hadoop project. David has a background in Oracle administration and development as well as Java and Hadoop development. David lives in Melbourne, Australia. He can be reached at firstname.lastname@example.org.
Kathleen Ting (@kate_ting) is currently a technical account manager at Cloudera where she helps strategic customers deploy and use the Hadoop ecosystem in production. She’s a frequent conference speaker, has contributed to several projects in the open source community, and is a committer and PMC member on Sqoop. Kathleen is also a co-author of O’Reilly’s Apache Sqoop Cookbook.