Ready to dip your toe into data science? Va Barbosa explains why you should start with notebooks and PixieDust, a new open source library that helps data scientists and developers work more efficiently in Jupyter Notebooks with Apache Spark.
PixieDust speeds data manipulation and display with features like auto-visualization of Spark DataFrames, real-time Spark job progress monitoring directly from the notebook, seamless integration with cloud services, and automated local installation of Python and Scala kernels running with Spark. And if you prefer working with a Scala notebook, no problem: PixieDust can also run on a Scala kernel. Imagine being able to use your favorite Python chart engines from a Scala notebook.
Join Va to learn how to use PixieDust in your own projects to visualize and explore data without writing code. Va also shares a demo combining Twitter, Watson Tone Analyzer, Spark Streaming, and some fun real-time visualizations, all running within a notebook.
This session is sponsored by IBM.
Va Barbosa is a developer advocate at the Center for Open-Source Data & AI Technologies at IBM, where he helps developers discover and use data and machine learning technologies. His work is fueled by a passion for helping others and guided by his enthusiasm for open source technology. Always looking to embrace new challenges and fulfill his appetite for learning, he immerses himself in a wide range of technologies and activities. When not focusing on the developer experience, he enjoys dabbling in photography. If you can’t find him in front of a computer, try looking behind a camera.