Make Data Work
Oct 15–17, 2014 • New York, NY

Multi-language Data Science with IPython, IJulia, IR, and Friends

Brian Granger (Cal Poly San Luis Obispo), Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory)
4:15pm–4:55pm Thursday, 10/16/2014
Data Science
Location: 1D

The IPython Notebook is an open-source, web-based interactive computing environment. At its core, the Notebook is an environment for writing and running code in an interactive and exploratory manner. On top of this foundation, it adds a document-based workflow: Notebook documents contain live code, descriptive text, mathematical equations, images, videos, and arbitrary HTML. These documents provide a complete and reproducible record of a computation and can be shared with others, version controlled, and converted to a wide range of static formats (HTML, PDF, slides, etc.).

While IPython began as a project focused on the Python programming language, over the past two years it has grown to include support for other languages relevant to data science, including Julia and R. In 2014, the project has been making rapid progress towards the goal of being truly language neutral. This is being done in recognition that exploratory and interactive data science and scientific computing are activities that inherently involve multiple programming languages. Regardless of what language they are working in, users need tools that enhance reproducibility and interactivity across a wide range of usage contexts, from individual exploration and production runs to teaching and presentation.

After giving an overview of the project, we will describe the work we have been doing to reach this goal of language neutrality in IPython.

First, the different components of IPython’s architecture are evolving towards language independence:

  • The Notebook document JSON specification.
  • The Notebook web application and user interface.
  • The message protocol specification (JSON over ZeroMQ and WebSockets) used to communicate between the web application and the language-specific kernels.
  • The configuration and discovery of kernels.
  • The http://nbviewer.ipython.org website for sharing notebooks.
  • The nbconvert library for converting notebooks to different static formats including HTML, PDF and Markdown.
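
As a rough illustration of why JSON-based specifications make these components language independent, here is a minimal sketch using only the Python standard library. The field names are modeled on the nbformat 4 document layout and the header/parent_header/metadata/content message layout; they are assumptions for illustration and are not taken from this abstract. The helper names (make_notebook, make_execute_request) are hypothetical, and the sketch omits the ZeroMQ/WebSocket transport and message signing entirely.

```python
import json
import uuid
from datetime import datetime, timezone


def make_notebook(language, sources):
    """Build a notebook-like dict (layout modeled on nbformat 4; an assumption)."""
    return {
        "nbformat": 4,
        "nbformat_minor": 0,
        # The kernelspec entry records which language the notebook targets.
        "metadata": {"kernelspec": {"name": language,
                                    "language": language,
                                    "display_name": language}},
        "cells": [
            {"cell_type": "code", "source": src, "metadata": {},
             "outputs": [], "execution_count": None}
            for src in sources
        ],
    }


def make_execute_request(code, session):
    """Build an execute_request-style message as a plain dict (illustrative)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session,
            "username": "demo",  # placeholder user name
            "msg_type": "execute_request",
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},
        "metadata": {},
        "content": {"code": code, "silent": False},
    }


# A notebook holding one cell of Julia code, and a request to run that cell.
notebook = make_notebook("julia", ["println(1 + 1)"])
message = make_execute_request(notebook["cells"][0]["source"],
                               session=uuid.uuid4().hex)

# Both structures serialize to plain JSON text, so a kernel written in any
# language can parse the request without knowing anything about Python.
wire_text = json.dumps(message)
decoded = json.loads(wire_text)
print(decoded["header"]["msg_type"], "->", decoded["content"]["code"])
```

Nothing in either structure depends on the language of the code being run: the cell source and the request's code field are opaque strings, which is the property that lets the same document format and protocol serve Python, Julia, and R kernels.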

Second, the Notebook’s user interface and branding are being updated to treat all programming languages equally. The notebook web application will automatically detect installed kernels. Users will be able to select the programming language for each individual notebook from a dropdown menu. Language-specific UI logic such as syntax highlighting, logos, and help menus will automatically be updated as the programming language of a notebook is changed.
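
One way such kernel detection could work is sketched below, assuming each installed kernel is described by a kernel.json file in its own subdirectory, with argv, display_name, and language keys. That layout, the function names (discover_kernels, language_menu), and the kernel commands in the demo are illustrative assumptions, not details stated in this abstract.

```python
import json
import tempfile
from pathlib import Path


def discover_kernels(kernels_dir):
    """Return {kernel_name: spec} for each subdirectory holding a kernel.json."""
    found = {}
    for spec_file in sorted(Path(kernels_dir).glob("*/kernel.json")):
        found[spec_file.parent.name] = json.loads(spec_file.read_text())
    return found


def language_menu(kernels):
    """Build dropdown entries as (kernel_name, display_name) pairs, sorted by name."""
    return [(name, spec.get("display_name", name))
            for name, spec in sorted(kernels.items())]


# Demo against a throwaway directory with two made-up kernel specs.
# The argv values are placeholders, not real launch commands.
with tempfile.TemporaryDirectory() as tmp:
    demo_specs = [
        ("python3", "Python 3", ["demo-python-kernel", "{connection_file}"]),
        ("julia", "Julia", ["demo-julia-kernel", "{connection_file}"]),
    ]
    for name, display, argv in demo_specs:
        spec_dir = Path(tmp) / name
        spec_dir.mkdir()
        (spec_dir / "kernel.json").write_text(
            json.dumps({"argv": argv, "display_name": display, "language": name})
        )
    kernels = discover_kernels(tmp)
    menu = language_menu(kernels)

print(menu)  # [('julia', 'Julia'), ('python3', 'Python 3')]
```

Because the web application only reads these small JSON descriptions, adding a new language to the dropdown would require installing a kernel and its description, not changing the Notebook itself.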

Third, we will discuss how the project documentation and broader community are evolving amidst these changes.

By the end of the talk, attendees should have a strong sense of what the IPython Notebook is and how it can be used with different programming languages in data science workflows.

Brian Granger

Cal Poly San Luis Obispo

Brian Granger is an Associate Professor of Physics at Cal Poly State University in San Luis Obispo, CA. He has a background in theoretical atomic, molecular, and optical physics, with a Ph.D. from the University of Colorado. His current research interests include quantum computing, parallel and distributed computing, and interactive computing environments for scientific and technical computing. He is a core developer of the IPython project and an active contributor to a number of other open source projects focused on scientific computing in Python. He is @ellisonbg on Twitter and GitHub.

Fernando Perez

UC Berkeley and Lawrence Berkeley National Laboratory

Fernando Pérez is a research scientist at UC Berkeley, working at the
intersection of brain imaging and open tools for scientific computing. He
created IPython while a PhD student in Physics at the University of Colorado in
Boulder. Today, with all the hard work done by a talented team, he continues
to lead IPython’s development as the interface between the humans at the
keyboard and the bits in the machine.

He is a founding member of NumFOCUS, a PSF member, and received the 2012 Award for the Advancement of Free Software for IPython and contributions to
scientific Python.

Comments

Audra M. Carter
10/21/2014 1:20pm EDT

Hi Lewis,

The slides were from another presentation of Fernando’s. Apologies for any confusion.

Lewis King
10/21/2014 1:01pm EDT

The link to the slides refers to a different presentation.

Andres Davila
10/16/2014 12:36pm EDT

Will the slides used during the talk be shared?