July 20–24, 2015
Portland, OR

Data conference sessions

Let’s tackle data’s continued, growing influence over the entire business world and how you can make it work for you. Harness the power of math to manipulate, secure, and create data.

1:30pm–5:00pm Monday, 07/20/2015
Nicole White (Neo4j)
Flask, a popular Python web framework, has many tutorials available online which use an SQL database to store information about the website’s users and their activities. In this tutorial, we will replace SQL with Neo4j, an open source graph database, in order to build a simple microblog application with social features that are otherwise too complex to model and express in SQL.
9:00am–12:30pm Tuesday, 07/21/2015
Portland 255
Tom Marrs (LivingSocial)
Most modern web APIs prefer JSON because of its interoperability. All modern languages have excellent JSON support, but large-scale environments often require more than simple serialization/de-serialization. This tutorial shows how to leverage JSON Schema, Search, and Transform along with simple tooling to enhance a developer’s workflow to build elegant, powerful, and efficient applications.
1:30pm–5:00pm Tuesday, 07/21/2015
Scott Murray (University of San Francisco)
Get started with d3.js, the most powerful JavaScript tool for creating data visualizations on the web. We'll start from scratch, and build an interactive scatterplot by the end of the session.
10:40am–11:20am Wednesday, 07/22/2015
Portland 252
Jonas Rosland (VMware), Kate Greenough (EMC)
There are tons of metrics out there that can be measured: Facebook likes, Twitter followers, website hits, database queries, number of VMs, cheapest lunch in the neighborhood, and many more. What if you could collect those metrics and choose the ones you'd like to present in a nice dashboard? And perhaps add easy analytics to it? Learn how to use Dashing together with platforms like Keen.io.
11:30am–12:10pm Wednesday, 07/22/2015
Aurelia Moser (Mozilla Science)
The historical versioning of maps defines some of the most fascinating social, political, and environmental flux of precedent. Everything from the eruption of post-World Cup tweets, to the migration patterns of mammals, can be mapped with OSS. This talk will cover time travel as it can be viewed in visualizations: the ways we partner time-series data with interactive maps @CartoDB.
1:40pm–2:20pm Wednesday, 07/22/2015
Linda Powell (Consumer Financial Protection Bureau)
Everyone at OSCON knows that good data coupled with modern open source technology can revolutionize business. But does senior management know? This presentation is about how to convince very powerful people with limited tech literacy that investing in good data and good data technology helps promote the organization’s mission.
2:30pm–3:10pm Wednesday, 07/22/2015
Portland 252
Kenny Bastani (Digital Insight)
Fast and scalable analysis of big data has become a critical competitive advantage for companies. There are open source tools like Apache Hadoop and Apache Spark that are providing opportunities for companies to solve these big data problems in a scalable way. Platforms like these have become the foundation of the big data analysis movement.
4:10pm–4:50pm Wednesday, 07/22/2015
Portland 252
Tags: Java
Grant Ingersoll (Lucidworks)
Search engine technology is rapidly evolving from keyword-based lookups to a highly sophisticated ranking engine capable of incorporating many different features across complex data types. With the latest changes in Solr and Lucene, it is now possible to ask more interesting questions of multi-structured content than ever before, making them indispensable tools in the data science toolbox.
5:00pm–5:40pm Wednesday, 07/22/2015
Mike Biglan (Twenty Ideas), Elijah Hamovitz (Analytic Spot)
CQL3 has a relational-database-centric abstraction that hides many key details of the underlying storage. Though CQL can be an efficient and convenient tool to use when querying, knowing how CQL actually maps to Cassandra's storage structure is key to being able to create scalable and flexible data models.
10:40am–11:20am Thursday, 07/23/2015
Portland 256
Paco Nathan (O'Reilly Media)
Herein, an open source developer community considers itself _algorithmically_. This project shows how to surface data insights from the developer email forums for just about any Apache open source project. It leverages machine learning and advanced analytics in Apache Spark, but also makes use of Docker containers for standalone NLP services.
1:40pm–2:20pm Thursday, 07/23/2015
Portland 256
Charles Smith (Netflix)
We are collecting increasing amounts of data to analyze, so we can understand how to better serve our customers. But how do you know that the data collected is useful or even being used? Using Netflix’s experience building data platforms, we will talk about how gaining insight into the use of your data can improve your own platform.
2:30pm–3:10pm Thursday, 07/23/2015
Portland 252
Roman Shaposhnik (Pivotal Inc.)
Graph relationships are everywhere. In fact, more often than not, analyzing relationships between points in your datasets lets you extract more business value from your data. This presentation will provide an introduction to two of the most used Hadoop ecosystem projects in the area of scalable graph processing: Apache Giraph and Spark GraphX.
4:10pm–4:50pm Thursday, 07/23/2015
Jonathan Ellis (DataStax, Inc)
This session will cover the new features in Cassandra 3.0, including JSON support, user-defined types, and global indexes. These allow engineers to deliver even better performance and productivity in their application development.
10:00am–10:40am Friday, 07/24/2015
Portland 252
Joe Witt (Onyara Inc.)
Dataflow is an often underestimated challenge in realizing the value of big data. Messaging-based approaches are fast and well understood, and solid open source options exist. However, this talk makes the case that transport-oriented messaging is not the right abstraction for large distributed enterprise data flow, and describes how Apache NiFi is designed to solve these problems.
11:10am–11:50am Friday, 07/24/2015
Portland 252
Robert Aboukhalil (Invitae)
In 2008, Nate Silver wowed the public by correctly predicting the outcome of the U.S. elections in 49 out of 50 states. As it turns out, you don't have to be a statistician to perform such analyses. In this talk, I introduce the Bash scripting language and how it can be used to perform advanced number crunching.