MapD Core is an open source analytical SQL engine designed from the ground up to harness the parallelism inherent in GPUs, enabling queries on billions of rows of data in milliseconds. MapD Core also supports the GPU DataFrame (GDF) from GoAi (the GPU Open Analytics Initiative). The GDF is based on Apache Arrow and is designed for passing data between processes while keeping it all in GPU memory. To give data scientists a seamless experience, MapD created a Jupyter Notebook kernel extension that can be installed from a MapD-managed Conda channel.
Randy Zwitch offers an overview of the MapD kernel extension for the Jupyter Notebook and explains how to use it in a typical machine learning workflow. You’ll learn how to deploy a Jupyter Notebook with the MapD kernel extension, see how the MapD kernel connects to a MapD server backend, and discover how its magic function (%%sql) executes commands on the MapD Core SQL engine. These SQL queries return their results into a GPU-resident DataFrame through the PyGDF library. The machine learning framework then reads that DataFrame directly from GPU memory to train and test models and to make predictions.
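The query-to-GPU-DataFrame step described above can be sketched in Python. This is a minimal illustration and not the kernel extension itself: it assumes the pymapd client library and its `select_ipc_gpu` method in place of the `%%sql` magic, and it assumes a local MapD server with the default `mapd` user and database. The helper names `build_query`, `connect_to_mapd`, and `fetch_gpu_frame` are hypothetical.

```python
"""Sketch: pull a MapD query result into a GPU DataFrame for model training."""


def build_query(table, columns, limit=None):
    """Build a simple SELECT statement (table and column names are hypothetical)."""
    query = "SELECT {} FROM {}".format(", ".join(columns), table)
    if limit is not None:
        query += " LIMIT {}".format(int(limit))
    return query


def connect_to_mapd(host="localhost", user="mapd",
                    password="HyperInteractive", dbname="mapd"):
    """Open a pymapd connection to a MapD server.

    Assumption: pymapd is installed and a server is reachable with these
    default credentials. The import is lazy so the sketch loads without pymapd.
    """
    from pymapd import connect
    return connect(user=user, password=password, host=host, dbname=dbname)


def fetch_gpu_frame(con, table, columns, limit=None):
    """Run the query and return the result as a GPU-resident DataFrame.

    `con` is expected to expose pymapd's `select_ipc_gpu`, which hands back
    the result set in GPU memory instead of copying it to the host.
    """
    return con.select_ipc_gpu(build_query(table, columns, limit))
```

With a running server and a GPU available, a call such as `fetch_gpu_frame(connect_to_mapd(), "flights", ["distance", "arrdelay"], limit=1000)` would return a GPU DataFrame that a GPU-aware modeling library could consume without a round trip through host memory. The `flights` table and its columns are placeholders.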
Randy Zwitch is a Senior Developer Advocate at MapD, helping customers and community users alike use MapD to its fullest potential. With broad industry experience in energy, digital analytics, banking, telecommunications, and media, Randy brings a wealth of knowledge across verticals as well as in-depth knowledge of open source tools for analytics.
©2018, O'Reilly Media, Inc.