Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.
The official Jupyter Conference
August 22-23, 2017: Training
August 23-25, 2017: Tutorials & Conference
New York, NY

Faster prototyping in Jupyter

Moderated by: Andrey Petrin

Who is this presentation for?

Analysts and Data Scientists that want to

Prerequisite knowledge

Basic Jupyter, JupyterHub, and Python. My talk is mainly conceptual: I will cover key code snippets, but the goal is to share the approach without focusing on any particular implementation.

What you'll learn

  • Jupyter allows very quick prototyping without backend or frontend knowledge.
  • You can share a Jupyter notebook's environment using Everware.
  • You can create website-like interfaces in Python with very little coding.

Description

Big Data analytics is already outdated at Yandex. We need insights and action items from our logs and databases. In this new environment, speed of prototyping comes first. I’m going to give an overview of how we use Python and Jupyter to create prototypes that amaze and inspire real product creation.

Role of quick prototyping
In a leading IT company, almost any idea could be implemented. But how do you decide which of those brilliant plans to undertake? And what if you are a self-sufficient analyst who faces a completely new challenge every week? Today we are creating an automated self-compiling PowerPoint presentation for our CTO, tomorrow we are creating a URL topic classifier for the whole Internet, and the day after we need to visualize some of our conclusions based on terabytes of logs.

A good way to solve this is to outsource, or to ask a separate team to build a good-looking interface. But the best way is to do it this week, this day. And the Swiss army knife we use (which isn’t good, just good enough) is Python and Jupyter.

Case studies
Here are a number of cases that are solved with ease using Jupyter.

  • JupyterHub allows multiple users to run their scripts on a prepared machine with all necessary libraries preinstalled. The Everware project takes this concept and enriches it with Docker environment sharing.
  • Inline HTML forms make your notebook interactive and allow smooth data input at production scale.
  • Progress bars make data analysis predictable.
  • Plotly visualisations with direct connectors from pandas DataFrames allow us to embed complicated graphs into our task tracker, letting other people explore the data.
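
To make the progress-bar case concrete, here is a minimal sketch using only the Python standard library. It is not the code from the talk: the `progress` helper and the `demo` function are hypothetical names introduced for illustration, and in a real notebook you would more likely reach for an existing library such as tqdm. The idea is simply to wrap any loop over your data so that the cell reports how far along it is.

```python
import io
import sys


def progress(items, total=None, width=30, out=None):
    """Yield each element of `items` while drawing a text progress bar.

    Hypothetical helper for illustration; not part of Jupyter or any library.
    """
    out = sys.stdout if out is None else out
    if total is None:
        total = len(items)  # needs a sized iterable unless `total` is given
    for done, item in enumerate(items, start=1):
        yield item  # the caller processes the item before the bar is updated
        filled = width * done // total if total else width
        percent = 100 * done // total if total else 100
        bar = "#" * filled + "-" * (width - filled)
        # "\r" moves back to the start of the line so the bar redraws in place.
        out.write(f"\r[{bar}] {percent:3d}% ({done}/{total})")
        out.flush()
    out.write("\n")


def demo():
    """Run a tiny stand-in workload and return (results, rendered bar text)."""
    buffer = io.StringIO()
    results = [n * n for n in progress(range(5), total=5, width=10, out=buffer)]
    return results, buffer.getvalue()
```

In a notebook cell the usage would look like `for row in progress(rows): handle(row)`, where `rows` and `handle` stand in for your own data and processing step. Writing to an explicit `out` stream keeps the helper easy to check outside a notebook as well.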