Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.
The official Jupyter Conference
August 22-23, 2017: Training
August 23-25, 2017: Tutorials & Conference
New York, NY

Sponsored by:
Bloomberg

Jupyter Poster Session

5:00pm–7:00pm Wednesday, August 23, 2017
Location: Trianon Ballroom

Posters will be presented Wednesday evening in a friendly setting where attendees can mingle. This session is an opportunity for you to discuss your Jupyter work one-on-one with other attendees and presenters.

Moderated by: Roy Hyunjin Han
Jupyter Notebook is already great, but did you know that you can use it to prototype computational web applications? In this whirlwind tour, we will introduce you to several favorite open source plugins that we have been using for the past few years (many of which we have developed) that let us rapidly deploy tools for processing tables, images, spatial data, satellite images, sounds and video.
Moderated by: Ashwin Trikuta Srinath, Linh Ngo, & Jeff Denton
This talk will be about how to build a JupyterHub setup with a rich set of features for interactive HPC, and solutions to practical problems encountered in integrating JupyterHub with other components of HPC systems. We will present several examples of how researchers at our institute are using JupyterHub, and demonstrate the different parts of our setup that enable their applications.
Moderated by: Feyzi Bagirov & Tatiana Yarmola
Poor data quality frequently invalidates data analysis, especially analysis performed in Excel, the most commonplace business intelligence tool, on data that has undergone transformations, imputations, and manual manipulation. In this talk we will use Pandas to walk through an example of Excel data analysis and illustrate several common pitfalls that render such analysis invalid.
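As a rough illustration of the kind of pitfall the talk refers to (not an example from the talk itself), consider a hypothetical workbook sales.xlsx whose numeric column contains stray text; a short Pandas sketch can surface the problem before any aggregate is trusted.

import pandas as pd

# Hypothetical workbook; a column with a stray "N/A" string is read as object dtype.
df = pd.read_excel("sales.xlsx")
print(df["revenue"].dtype)

# Coerce explicitly and count what was lost before trusting any summary statistic.
revenue = pd.to_numeric(df["revenue"], errors="coerce")
print(revenue.isna().sum(), "cells could not be parsed as numbers")
print(revenue.sum())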
Moderated by: Luciano Resende & Jakob Odersky
Data scientists are becoming a necessity for every company in today's data-centric world, and with them comes the requirement to provide a flexible and interactive analytics platform. This session describes our experience and best practices in putting together an analytics platform based on Jupyter Notebooks, Apache Toree, and Apache Spark.
Moderated by: Paco Nathan
Paco Nathan shares lessons learned about using notebooks in media and explores computable content that combines Jupyter notebooks, video timelines, Docker containers, and HTML/JS for "last mile" presentation, covering system architectures, how to coach authors to be effective with the medium, whether live coding can augment formative assessment, and the typical barriers encountered in practice.
Moderated by: Faras Sadek and Demba Ba
At Harvard, we deployed JupyterHub on Amazon AWS for two classes in the School of Engineering. The Signal Processing class used a Docker-based JupyterHub, where each user was provisioned with a notebook running in its own Docker container. For the Decision Theory class, we redesigned the deployment to use a dedicated EC2 instance per user's notebook, providing better scalability, reliability, and cost efficiency.
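A minimal sketch of the Docker-per-user pattern described above, assuming the dockerspawner package is installed (this is an illustration, not the actual Harvard configuration):

# jupyterhub_config.py -- minimal Docker-per-user sketch
c = get_config()  # provided by JupyterHub when it loads this file

c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "jupyter/scipy-notebook"   # image used for each user's notebook server
c.DockerSpawner.network_name = "jupyterhub"        # shared Docker network for hub and user containers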
Moderated by: Diogo Munaro Vieira & Felipe Ferreira
At Globo.com all of our data scientists use Jupyter Notebooks for analysis. Because they work on a shared data science platform, these analyses require additional security. We will show how JupyterHub was adapted to authenticate against the company's OAuth2 solution and how a user action tracking system was built on Jupyter notebook hooks.
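A minimal sketch of the notebook-hook side of such a tracking system (the OAuth2 integration and the actual Globo.com logging backend are not shown; the print call stands in for whatever audit sink is used):

# jupyter_notebook_config.py -- sketch of auditing saves via a pre-save hook
import datetime, getpass

def log_save(model, path, contents_manager, **kwargs):
    # Called by the notebook server just before a file is written to disk.
    print(datetime.datetime.utcnow().isoformat(), getpass.getuser(), "saved", path)

c = get_config()
c.FileContentsManager.pre_save_hook = log_save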
Moderated by: Elijah Philpotts
3Blades has developed an innovative artificial intelligence agent to enhance productivity for data scientists when using Jupyter Notebooks for Exploratory Data Analysis (EDA).
Moderated by: Joy Chakraborty
How to run a Kerberos-secured, multi-user Jupyter notebook service (JupyterHub) integrated with a Spark/YARN cluster, and how to use Docker to set up such a complex integrated platform quickly and with fewer difficulties.
Moderated by: Dave Goodsmith, Meredith Lee, Rene Baston, and Edgar Fuller
A demonstration station will feature donated cloud computing resources from DataScience.com, Amazon Web Services, GoogleCloud, Satori, and other partners in live executable Jupyter-based notebooks.
Moderated by: Steven Anton
Sometimes data scientists need to work directly with highly sensitive data, such as personally identifiable information or health records. Jupyter notebooks provide a great platform for exploration, but don't meet strict security standards. We will walk through a solution that our data science team uses to harden security by seamlessly encrypting notebooks at rest.
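One way to sketch the encrypt-at-rest idea, assuming the cryptography package (this is an illustration, not the presenter's actual solution; a matching decrypt-on-load hook would also be needed):

# Sketch: encrypt each notebook file on disk right after it is saved.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, load the key from a secrets manager; never hard-code it
fernet = Fernet(key)

def encrypt_notebook(model, os_path, contents_manager, **kwargs):
    # post_save_hook: rewrite the .ipynb on disk as ciphertext
    if os_path.endswith(".ipynb"):
        with open(os_path, "rb") as f:
            ciphertext = fernet.encrypt(f.read())
        with open(os_path, "wb") as f:
            f.write(ciphertext)

c = get_config()
c.FileContentsManager.post_save_hook = encrypt_notebook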
Moderated by: Andrey Petrin
Big Data analytics is already outdated at Yandex: we need insights and action items from our logs and databases. In this new environment, speed of prototyping comes first. I will give an overview of how we use Python and Jupyter to create prototypes that amaze and inspire real product creation.
Moderated by: en zyme & Zelda Kohn
Real estate transactions are geographically sparse and rare, and often involve both listing and selling agents. Many factors determine price, yet most models rely on physical parameters. Using Jupyter/Python geographic and data tools, we will discover "farms" and their pricing characteristics. Farms (found via clustering) can affect either the listing or the sale price, both of which are negotiated.
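A toy sketch of the clustering idea, assuming scikit-learn and a hypothetical listings.csv with coordinates and sale prices (not the presenters' actual pipeline):

import pandas as pd
from sklearn.cluster import DBSCAN

listings = pd.read_csv("listings.csv")                      # hypothetical input
coords = listings[["latitude", "longitude"]].to_numpy()

# DBSCAN suits sparse, irregular geographic data: no fixed number of clusters up front.
listings["farm"] = DBSCAN(eps=0.01, min_samples=5).fit_predict(coords)

# Compare negotiated prices within each discovered farm.
print(listings.groupby("farm")["sale_price"].median())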
Moderated by: Douglas Liming
Ready to take a deeper look at how the Jupyter platform is having a widespread impact on analytics? Learn how a large health organization was able to fit SAS into their open ecosystem. Thanks to the Jupyter platform, you no longer have to choose between analytics languages like Python, R, or SAS; a single, unified open analytics platform supported by Jupyter empowers you to have it all.
Moderated by: Chris Rawles
The availability of data combined with new analytical tools has fundamentally transformed the sports industry. In this talk I show how to use the Jupyter Notebook with powerful analytical tools such as Apache Spark and visualization libraries like Matplotlib and Seaborn to support sports data science.
Moderated by: Patrick Huck & Shreyas Cholia
The open Materials Project (MP, https://materialsproject.org), which supports the design of novel materials, now allows users to contribute and share new theoretical and experimental materials data via the MPContribs tool. MPContribs uses Jupyter and JupyterHub at every layer and is an important step in MP’s effort to deliver a next-generation collaborative platform for Materials (Data) Science.
Moderated by: Harold Mitchell
Today's healthcare and research professionals have a wealth of precious historical data in need of predictive outcomes. Wouldn't it be nice to carry around a web-based notebook with built-in algorithms to perform predictions? Even better, those built-in algorithms would be built and maintained by you.
Moderated by: Jacob Frias Koehler
Here, we present an undergraduate mathematics curriculum that leverages the Jupyter Notebook and JupyterHub to deliver course content and serve as the computational platform for students. These materials are motivated by introductory classes typically labeled Quantitative Reasoning, Precalculus, and Calculus I.
Moderated by: Laxmikanth Malladi
Spinning up Jupyter on AWS is easy, with many references available for deploying on EC2 and EMR. This session provides additional configurations and patterns for enterprises to govern, track, and audit usage on AWS.
Moderated by: Jeffrey Denton
It is a match made in the cloud. By marrying JupyterHub and CloudyCluster, users gain access to scalable Jupyter without the headache and overhead of operations. Learn how CloudyCluster can scale JupyterHub to support thousands of users and thousands of computers, all from your smartphone, tablet, or desktop device.
Moderated by: David Visontai
The advent of many interdisciplinary research areas and the cooperation of different scientific fields demand computational systems that allow for efficient collaboration. Kooplex, our highly integrated system incorporating the advantages of Jupyter notebooks, public dashboards, version control and data sharing serves as a basis for different projects in fields ranging from Medicine to Physics.
Moderated by: Bill Walrond
In this presentation, Kevin Rasmussen, solution architect at Caserta Concepts, discusses why notebooks aren’t just for data scientists anymore. Drawing on a current project with one of the most respected newspapers in the country, he will go into detail about how to put data engineering into production with notebooks.
Moderated by: Jonathan Whitmore
Project Jupyter contains tools that are perfect for many data science tasks, including rapid iteration for data munging, visualizing, and creating a beautiful presentation of results. The same tools that give power to individual data scientists can prove challenging to integrate in a team setting. This talk will emphasize overall best practices for data science team productivity.
Moderated by: David P. Sanders (Department of Physics, Faculty of Sciences, National University of Mexico)
An overview of using Julia with the Jupyter notebook, showing how the flexibility of the language is reflected in the notebook environment.
Moderated by: Trevor Lyon, Matt McKay, and Spencer Lyon
Introduction to the QuantEcon Open Notebook Archive, a community-driven home for sharing and discovering Jupyter notebooks.
Moderated by: Matt Henderson and Shreyas Cholia
Scientists increasingly rely on large-scale computation and data analysis, with applications ranging from designing better batteries to understanding our universe. In this talk we’ll describe how scientists could greatly benefit from a platform using the core Jupyter architecture of notebooks and kernels with large-scale HPC and data analysis systems to enable interactive supercomputing.
Moderated by: Majid Khorrami & Laura Kahn
What if decision makers could use data science techniques to predict how much economic aid they would receive each year? Our proposal will show how we did just that and used data for social good.
Moderated by: Marius van Niekerk
Spylon-kernel is a pure Python Jupyter metakernel that gives Python and Scala users an easy kernel to use with Apache Spark.
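Getting started is roughly as follows, a sketch based on the project's documented usage (commands and magics may differ by version):

# Terminal (one-time setup):
#   pip install spylon-kernel
#   python -m spylon_kernel install
#
# First notebook cell, to configure Spark before the session starts:
#   %%init_spark
#   launcher.num_executors = 4
#   launcher.executor_cores = 2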
Moderated by: Joshua Cook
This teaching session will take participants through using Docker's suite of tools, the NumPy/SciPy ecosystem, and the Jupyter project as a feature-rich programming interface to build powerful systems for performing rich analysis and transformation on data sets of any size.
The DOE Systems Biology Knowledgebase (KBase) is an open source project that enables biological scientists to create, execute, collaborate on and share reproducible analysis workflows. KBase's Narrative Interface, built on the Jupyter Notebook, is the front end to a scalable object store, an execution engine, a distributed compute cluster, and a library of analysis tools packaged as Docker images.
Moderated by: Timothy Dobbins
SQLCell is a magic function that executes raw, parallel, parameterized SQL queries with the ability to accept Python variables as parameters, switch between engines with a button click, run outside of a transaction block, and produce an intuitive D3.js query-plan graph that highlights slow points in a query, all while concurrently running Python code. And much more.
Moderated by: Jason Kuruzovich
FreeCodeCamp.com is an online learning platform for coding that has figured out how to use distributed content creation to power a learning community. This talk will discuss FreeCodeCamp and detail my current efforts to start a similar model for analytics with AnalyticsDojo.com, including content, technical, and community-related opportunities and challenges.
Researchers, data scientists, and professionals spend their days doing cutting-edge work. But when it comes time to write up and disseminate that work, they’re often still using models and tools that haven’t changed much in decades, if not centuries.