Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.
The official Jupyter Conference
August 22-23, 2017: Training
August 23-25, 2017: Tutorials & Conference
New York, NY

Data science apps: Beyond notebooks

Natalino Busa (Teradata)
4:10pm–4:50pm Friday, August 25, 2017
Usage and application
Location: Sutton Center/Sutton South
Level: Beginner

Who is this presentation for?

  • Data analysts, software engineers, and data scientists who are interested in customizing the Jupyter experience for specific tasks

Prerequisite knowledge

  • Familiarity with the Jupyter Notebook

What you'll learn

  • Learn how to build full-fledged applications with Jupyter
  • Explore the features of a Jupyter gateway to build RESTful data-driven APIs

Description

Jupyter notebooks are transforming the way we look at computing, coding, and science. But is this the only “data scientist experience” that this technology can provide? Natalino Busa explains how you can create interactive web applications for data exploration and analysis that in the background are still powered by the well-understood and well-documented Jupyter Notebook.

Natalino shares an architecture composed of three parts: a server-only Jupyter gateway, a Python Jupyter kernel, and an Angular/Bootstrap web application. The gateway lets data scientists expose notebook code as RESTful API endpoints, so the web app can run notebooks programmatically simply by calling a REST API. In the background, the Python Jupyter kernel runs the notebook's data science and machine learning code fragments and returns the results as JSON. By chaining these components, you can create beautiful, rich apps that go beyond the limits of the “notebook experience”: the coding is hidden, and the UI can be tuned toward data exploration and toward more intuitive, guided “data science tours.”
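As a rough illustration of the gateway idea, here is a minimal sketch of what one endpoint-backing notebook cell could look like. It assumes the gateway is the Jupyter Kernel Gateway running in its notebook-http mode (the abstract does not name the exact tool), where a cell annotated with a comment such as `# GET /summary/:column` becomes an endpoint, the request arrives as a JSON string in a global called `REQUEST`, and whatever the cell prints becomes the response body. The dataset, the column name, and the `summarize` helper are hypothetical and exist only for this example.

```python
import json
import statistics

# Hypothetical in-memory dataset standing in for real notebook data.
DATA = {"latency_ms": [12.0, 15.0, 11.0, 240.0, 13.0]}


def summarize(request_json: str) -> str:
    """Build a JSON summary for the column named in the request path."""
    request = json.loads(request_json)
    column = request.get("path", {}).get("column")
    if column not in DATA:
        return json.dumps({"error": f"unknown column: {column}"})
    values = DATA[column]
    return json.dumps({
        "column": column,
        "count": len(values),
        "mean": statistics.mean(values),
        "max": max(values),
    })


# Inside the gateway (assumed notebook-http mode), the cell would read:
#
#   # GET /summary/:column
#   print(summarize(REQUEST))
#
# Outside the gateway there is no REQUEST global, so we simulate one here.
if __name__ == "__main__":
    fake_request = json.dumps(
        {"path": {"column": "latency_ms"}, "args": {}, "body": ""}
    )
    print(summarize(fake_request))
```

Under the same assumption, the notebook would be served with something like `jupyter kernelgateway --KernelGatewayApp.api='kernel_gateway.notebook_http' --KernelGatewayApp.seed_uri='<notebook>.ipynb'`, and the web app would then fetch `/summary/latency_ms` like any other JSON API.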

Natalino then explores two examples of this new breed of notebook-powered web apps: O’Reilly’s Oriole Online Tutorials and Autoscience, a project of his own design. Oriole Online Tutorials combine embedded runnable code, video, and text into a rich training experience in which the video is synchronized with the text. The embedded code actually runs in the cloud, and the results are pushed back to the browser. The Autoscience project is an example of a meta-notebook: its UI is more intuitive for people who are not data scientists and offers a selection of precanned dataset analyses, such as anomaly detection, dimensionality reduction, classification, and clustering. In the background, it uses a custom open source Python library running on a Python Jupyter kernel.
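To make the "precanned analyses" idea concrete, the sketch below shows one way a kernel-side library could expose a menu of named analyses for a UI to pick from. It is not the Autoscience code, which the abstract does not show: the registry, the function names, and the simple z-score rule used for anomaly detection are all assumptions chosen for illustration.

```python
import statistics
from typing import Callable, Dict, List


def zscore_anomalies(values: List[float], threshold: float = 1.5) -> List[int]:
    """Return indices of values whose z-score magnitude exceeds `threshold`.

    Uses the population standard deviation; a constant series has no anomalies.
    """
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Hypothetical registry: the UI would list these keys as menu entries.
# Other analyses (clustering, classification, ...) would be registered the same way.
ANALYSES: Dict[str, Callable[[List[float]], List[int]]] = {
    "anomaly_detection": zscore_anomalies,
}


def run_analysis(name: str, values: List[float]) -> List[int]:
    """Look up a precanned analysis by name and run it on the given values."""
    if name not in ANALYSES:
        raise KeyError(f"no such analysis: {name}")
    return ANALYSES[name](values)


if __name__ == "__main__":
    print(run_analysis("anomaly_detection", [12.0, 15.0, 11.0, 240.0, 13.0]))
```

Keeping the analyses behind a name-to-function table means the web UI only needs to send an analysis name and a dataset reference; the user never has to see or edit the underlying code.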

Natalino Busa

Teradata

Natalino Busa is the head of data science at Teradata, where he leads the definition, design, and implementation of big, fast data solutions for data-driven applications, such as predictive analytics, personalized marketing, and security event monitoring. Previously, Natalino served as enterprise data architect at ING and as senior researcher at Philips Research Laboratories on the topics of system-on-a-chip architectures, distributed computing, and parallelizing compilers. Natalino is an all-around technology manager, product developer, and innovator with a 15+ year track record in research, development, and management of distributed architectures and scalable services and applications.
