Making Open Work
May 8–9, 2017: Training & Tutorials
May 10–11, 2017: Conference
Austin, TX

Schedule: Data, Big and Small sessions

Data is literally everywhere you look, and our devices and computers are working with bigger and more diverse sets of data than ever before. How do you manage this deluge? How do you tackle big data’s continued and growing influence over the entire business world? How can you make it work for you? How do you show others what you’ve collected in a way that is digestible?

1:30pm–5:00pm Monday, May 8, 2017
Location: Ballroom E
Level: Intermediate
Barbara Fusinska (Microsoft)
Average rating: ***..
(3.56, 9 ratings)
Machine learning is growing increasingly popular. R is an open source platform that offers numerous libraries and implementations of machine-learning algorithms. Barbara Fusinska demonstrates how to use R to prepare data, create a predictive model, and display the results.

1:30pm–5:00pm Monday, May 8, 2017
Location: Meeting Room 10 A/B
Level: Beginner
Jeremy Wilken (VMware)
Average rating: ****.
(4.00, 3 ratings)
Understanding data as it streams is vital today. Using Angular and D3, Jeremy Wilken demonstrates how to build out an example visualization application that consumes a live stream and shows meaningful metrics that could help businesses make critical, real-time decisions.

1:30pm–5:00pm Tuesday, May 9, 2017
Location: Meeting Room 9
Level: Intermediate
William Lyon (Neo Technology)
Average rating: ****.
(4.75, 4 ratings)
William Lyon explains how to use a graph database to generate real-time recommendations using real-world data. William introduces graph data modeling and querying concepts using Neo4j and Cypher, the query language for graphs, to import and query data, before demonstrating how to apply graph algorithms and NLP using Python data science tools to enhance your recommendations.

11:00am–11:40am Wednesday, May 10, 2017
Location: Meeting Room 18 C/D
Level: Beginner
Vida Williams (Axis Partners, Inc)
Average rating: *****
(5.00, 1 rating)
Vida Williams offers an overview of a project that transmuted qualitative indicators of risk and success in foster care to quantitative indicators using real-life child welfare datasets and shares the lessons about capturing, assembling, and sharing datasets learned along the way.

11:00am–5:45pm Wednesday, May 10, 2017
Location: Meeting Room 16
Amy Unruh (Google), Yufeng Guo (Google), Ben Hall (Katacoda | Ocelot Uproar), Martin Wicke (Google), Vijay Vasudevan (Google), Aaron Schumacher (Deep Learning Analytics)
TensorFlow Day at OSCON has been put together by our partners at Google, who are at the center of the very popular, game-changing open source project TensorFlow.

11:50am–12:30pm Wednesday, May 10, 2017
Location: Meeting Room 18 C/D
Level: Beginner
Average rating: **...
(2.00, 5 ratings)
Anastasia Sagalovitch explains how she used New York City's open taxi dataset with Python to determine areas of frequent pick-ups and drop-offs within a time frame. She superimposed those hotspots on a map of the subway system to identify which ones fall within or outside a particular radius of established subway stops, and used this data as the basis for a proposed bus route.

1:45pm–2:25pm Wednesday, May 10, 2017
Location: Meeting Room 18 C/D
Level: Intermediate
Mita Mahadevan (Intuit)
Average rating: ***..
(3.64, 11 ratings)
Many leading tech companies (Uber, Netflix, etc.) are building scalable, in-house product-testing data platforms from the ground up to enable experimentation and engender a data-driven mentality. Mita Mahadevan explores how these companies are developing in-house A/B testing frameworks using open source tools and shares dos and don’ts for those in the midst of their journey to become data driven.

2:35pm–3:15pm Wednesday, May 10, 2017
Location: Meeting Room 18 C/D
Level: Beginner
Average rating: ****.
(4.50, 2 ratings)
Taras Matyashovsky explains how to use Apache Spark MLlib to build a supervised learning NLP pipeline to distinguish pop music from heavy metal, and have fun in the process.

4:15pm–4:55pm Wednesday, May 10, 2017
Location: Meeting Room 18 C/D
Level: Intermediate
Alena Hall (Microsoft Research), Natallia Dzenisenka (Independent Contractor)
Average rating: ****.
(4.67, 3 ratings)
Alena Hall and Natallia Dzenisenka explore the set of algorithms behind distributed systems, including snapshot algorithms, traversal algorithms, election algorithms, and reliable broadcast, giving you a clear understanding of how those systems work.

5:05pm–5:45pm Wednesday, May 10, 2017
Location: Meeting Room 18 C/D
Dani Traphagen (GridGain)
Average rating: ***..
(3.50, 6 ratings)
Dani Traphagen explores the key paradigm shifts currently impacting those Fortune 500 companies that view disk as a bottleneck. Dani explains how to optimize toward the cache, leveraging it for low-latency, highly available microservices architectures with the hot-and-fresh-out-of-the-kitchen open source project Apache Ignite.

11:00am–11:40am Thursday, May 11, 2017
Location: Ballroom F
Level: Beginner
Jonathon Morgan (New Knowledge)
Average rating: ****.
(4.75, 16 ratings)
Jonathon Morgan explores computer vision, deep learning, and natural language processing techniques for uncovering communities of white nationalists and neo-Nazis on social media and identifying which ones are on the path to radicalization.

11:50am–12:30pm Thursday, May 11, 2017
Location: Ballroom F
Level: Intermediate
Heather Nelson (Silicon Valley Data Science), Gary Dusbabek (Silicon Valley Data Science)
Average rating: ****.
(4.00, 5 ratings)
Configuring a data platform and data science environment can be a tedious, error-prone process. Heather Nelson and Gary Dusbabek explain how to create a cloud-agnostic environment combining cloud platforms such as AWS or Azure with Terraform and Ansible that spins up quickly and is easy to configure as required.

1:45pm–2:25pm Thursday, May 11, 2017
Location: Ballroom F
Level: Beginner
Edward Finkler (Graph Story)
Average rating: ****.
(4.24, 17 ratings)
Most of us have worked with relational databases like MySQL or PostgreSQL, but they aren't the best option for many use cases. Graph databases have a simpler, more powerful model for handling complex, related data. Edward Finkler uses Neo4j to explore the advantages of graph databases, showing how graphs work and how they give you the power to do things that are difficult or impossible in SQL.

4:15pm–4:55pm Thursday, May 11, 2017
Location: Ballroom F
Level: Intermediate
Sean Mackrory (Cloudera)
Average rating: *****
(5.00, 1 rating)
Sean Mackrory offers an overview of and best practices for filesystems in public cloud infrastructures as they relate to traditional filesystems. Many of the examples will relate to Hadoop, namely moving from HDFS to S3.

5:05pm–5:45pm Thursday, May 11, 2017
Location: Ballroom F
Level: Intermediate
Barbara Fusinska (Microsoft)
Average rating: ****.
(4.00, 3 ratings)
Data science and machine learning are growing increasingly popular. R is an open source platform that offers numerous libraries and implementations of machine-learning algorithms. Barbara Fusinska explains how to use R as a tool for data analysis, performing machine-learning computations, and displaying the results of predictions.