For the past 25 years, applications have been built on an RDBMS with a predefined schema that forces data to conform to that schema on write. Many people still think they must use an RDBMS for applications, even when the records in their datasets have no relation to one another. Additionally, those databases are optimized for transactional use, and data must be exported for analytics purposes. NoSQL technologies have turned that model on its side to deliver groundbreaking performance improvements.
I will walk through a music database with over 100 tables in the schema and show how to convert that model for use with a NoSQL database. I will show how to handle creating, updating, and deleting records, using column families for different types of data (and why).
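As a rough illustration of the kind of conversion described here, the sketch below folds rows from two relational tables into a single wide row per artist, with one column family for descriptive fields and another for child records. It is a minimal sketch using plain Python dictionaries rather than any real database client; the table names, field names, the `artist:<id>` row key format, and the `info` and `albums` column families are all assumptions made for the example, not the schema used in the talk.

```python
# Hypothetical relational rows, as they might come out of two joined tables.
artists = [{"artist_id": 1, "name": "Example Artist", "country": "US"}]
albums = [
    {"album_id": 10, "artist_id": 1, "title": "First Album", "year": 2001},
    {"album_id": 11, "artist_id": 1, "title": "Second Album", "year": 2004},
]


def to_wide_rows(artists, albums):
    """Fold child rows into their parent row, one column family per kind of data."""
    rows = {}
    for artist in artists:
        key = f"artist:{artist['artist_id']}"  # assumed row key format
        rows[key] = {
            "info": {"name": artist["name"], "country": artist["country"]},
            "albums": {},
        }
    for album in albums:
        key = f"artist:{album['artist_id']}"
        # The column qualifier carries the child id; the value carries its fields.
        rows[key]["albums"][str(album["album_id"])] = {
            "title": album["title"],
            "year": album["year"],
        }
    return rows


def put(rows, key, family, qualifier, value):
    """Create or update a single cell; other cells in the row are untouched."""
    rows.setdefault(key, {}).setdefault(family, {})[qualifier] = value


def delete(rows, key, family, qualifier):
    """Delete a single cell if it exists."""
    rows.get(key, {}).get(family, {}).pop(qualifier, None)


table = to_wide_rows(artists, albums)
put(table, "artist:1", "info", "country", "CA")   # update one field
delete(table, "artist:1", "albums", "10")         # remove one child record
```

In this layout an update or delete touches one cell under one row key instead of several joined tables, and keeping rarely read data in its own column family is one possible reason to separate families by type of data.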
I will then show how to run analytics on the exact same data, without moving or transforming it, by using Apache Drill’s ANSI SQL capabilities on the NoSQL database.
Jim Scott is the head of developer relations, data science, at NVIDIA. He’s passionate about building combined big data and blockchain solutions. Over his career, Jim has held positions running operations, engineering, architecture, and QA teams in the financial services, regulatory, digital advertising, IoT, manufacturing, healthcare, chemicals, and geographical management systems industries. Jim has built systems that handle more than 50 billion transactions per day, and his work with high-throughput computing at Dow was a precursor to more standardized big data concepts like Hadoop. Jim is also the cofounder of the Chicago Hadoop Users Group (CHUG).
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.