Strata 2011 Schedule

Below are the confirmed and scheduled talks at Strata 2011 (schedule subject to change).

Customize Your Own Schedule

Create your own Strata schedule using the personal scheduler function. Mark the tutorials, sessions, keynotes, and events you want to attend by clicking on the calendar icon next to each listing. Then click on "personal schedule" below to generate your own customized schedule.

Mission City M
10:40am Open Data: Designing Data-centric Web APIs Pablo Castro (Microsoft)
11:30am Realtime Analytics at Twitter Kevin Weil (Twitter, Inc.)
2:30pm What's Mine is Yours: the Ethics of Big Data Ownership Dylan Field (Brown University), Lucian Lita (Intuit), Jud Valeski (Gnip), Tim O'Reilly (O'Reilly Media, Inc.)
4:10pm Data Marketplaces Julie Steele (Silicon Valley Data Science), Ian White (Urban Mapping, Inc), Peter Marney (Thomson Reuters), Moe Khosravy (Microsoft), Dennis Yang (Infochimps)
5:00pm Predicting the Future: Anticipating the World with Data Drew Conway (Alluvium), Christopher Ahlberg (Recorded Future), Robert McGrew (Palantir Technologies), Rion Snow (Twitter)
Mission City B5
10:40am Data Journalism: Applied Interfaces Marshall Kirkpatrick (ReadWriteWeb), Simon Rogers (Guardian), Jer Thorp (New York University)
11:30am Real World Applications Panel: Education and Government Alexander Howard (O'Reilly Media), Peter Clark (Code 42 Software, Inc), Steve Midgley (US Department of Education)
4:10pm Data as Art J.J. Toothman (NASA Ames Research Center)
Mission City B1
10:40am Where's the Money in Big Data? Tim Guleri (Sierra Ventures), Roger Ehrenberg (IA Ventures), Paul Kedrosky (Kauffman Foundation), Ping Li (Accel Partners)
11:30am Online Sentiment, Machine Learning, and Prediction Ben Gimpert (Altos Research), Margaret Francis (Exact Target/ CoTweet), Ryan Strynatka (Radian6), Josh Merchant (Lymbix)
1:40pm What Kind of Data do Government Agencies Have? Virginia Carlson (Metro Chicago Information Center (MCIC))
4:10pm Research Evaluation in the Age of Global Digital Scholarship Brian Wilson (Thomson Reuters)
Mission City B4
10:40am Sponsored by Splunk
Supersized Data? Get Real-time insights. Stephen Sorkin (Splunk), Narayan Bharadwaj (ClearStory Data)
11:30am Sponsored by Rackspace Hosting
Data in the Cloud with OpenStack Eric Day (craigslist)
1:40pm Sponsored by Impetus
Deriving Intelligence from Large Data - Using Hadoop and Applying Analytics Vineet Tyagi (Impetus Technologies)
2:30pm Sponsored by DataStax (formerly Riptano)
Painless Scaling: Rewiring Your Existing Stacks for Scalability Ben Werther (DataStax (formerly Riptano))
4:10pm TBC
5:00pm TBC
12:10pm Lunch: Sponsored by Thomson Reuters
Room: Mezzanine
Thursday Lunchtime BoF Sessions
6:00pm OpenStack Meetup
Room: Mezzanine
OpenStack Meetup & Open Bar, Sponsored by Rackspace Hosting
10:10am Morning Break
Room: Ballroom ABCD
3:10pm Afternoon Break: Sponsored by EMC
Room: Ballroom ABCD
8:45am Keynotes
Room: Mission City Ballroom
Opening Welcome - Day 2 Alistair Croll (Solve For Interesting), Edd Wilder-James (Google)
8:50am Plenary
Room: Mission City Ballroom
Free Our Data: How We Made Sense of Huge Datasets Simon Rogers (Guardian)
9:00am Plenary
Room: Mission City Ballroom
Posthumans, Big Data and New Interfaces Alistair Croll (Solve For Interesting), Toby Segaran (Google), Amber Case (Esri), Bradford Cross (Flightcaster)
9:15am Plenary
Room: Mission City Ballroom
Why Legacy Databases Can't Survive the Data Deluge - It's About Dollars and Sense Ed Boyajian (EnterpriseDB)
9:25am Plenary
Room: Mission City Ballroom
The Heat Death of the Data Warehouse Barry Devlin (9sight Consulting)
9:35am Plenary
Room: Mission City Ballroom
Innovating Data Teams DJ Patil (White House Office of Science and Technology Policy)
9:45am Plenary
Room: Mission City Ballroom
Your Data Rules the World Scott Yara (Greenplum, a division of EMC)
9:55am Plenary
Room: Mission City Ballroom
Can Big Data Fix Healthcare? Carol McCall (Tenzing Health)
10:40am-11:20am (40m) Practitioner
Open Data: Designing Data-centric Web APIs
Pablo Castro (Microsoft)
Sharing data on the Web comes with a tough trade-off between minimalism and enabling creative new scenarios. This session will explore Web APIs that focus on exposing data and let clients decide how to use it. We'll share our experiences while designing the Open Data Protocol (odata.org), what we found to be great and terrible ideas and what we hear from folks running OData Web APIs.
11:30am-12:10pm (40m) Practitioner
Realtime Analytics at Twitter
Kevin Weil (Twitter, Inc.)
Most analytics systems rely on large offline computations, which means results come in hours or days behind. Twitter is all about realtime, but with over 160 million users producing over 90 million tweets per day, we need realtime analytics that scale horizontally. This talk discusses the development of that infrastructure, as well as the products we are beginning to build on top of it.
1:40pm-2:20pm (40m) Practitioner
Present Tense: The Challenges and Trade-offs in Building a Web-scale Real-time Analytics System
Benjamin Black (Boundary)
The rise of sensor network data and the expectation for low latency query responses combine to obsolete available databases and storage platforms. We have built a platform for web-scale OLAP and in this talk I will cover how we made our infrastructure capable of real-time update and query performance over hundreds of terabytes of multidimensional data.
2:30pm-3:10pm (40m) The Data Business
What's Mine is Yours: the Ethics of Big Data Ownership
Dylan Field (Brown University), Lucian Lita (Intuit), Jud Valeski (Gnip), Tim O'Reilly (O'Reilly Media, Inc.)
To many people, Big Data means Open Data: social graphs, voting records, weather patterns, and more. But who owns data? Most of our laws were written for atoms, not bits; they're woefully out of date in an information age. When you share data, does it become more or less valuable? If someone adds to your data, is it still yours? This panel will tackle the gray area of data ownership.
4:10pm-4:50pm (40m) The Data Business
Data Marketplaces
Julie Steele (Silicon Valley Data Science), Ian White (Urban Mapping, Inc), Peter Marney (Thomson Reuters), Moe Khosravy (Microsoft), Dennis Yang (Infochimps)
Does information really want to be free? While the Internet is full of open data, there's plenty of data that companies are willing to pay handsomely for -- particularly if it's timely and well aggregated. As a result, data marketplaces are a burgeoning business. This panel will look at the market for data, and where it's headed.
5:00pm-5:40pm (40m) Disruption & Opportunity
Predicting the Future: Anticipating the World with Data
Drew Conway (Alluvium), Christopher Ahlberg (Recorded Future), Robert McGrew (Palantir Technologies), Rion Snow (Twitter)
Data doesn't just show us the past; it can help predict the future. Several new firms harvest massive amounts of open data, trying to anticipate everything from the right ad placement to the next terrorist attack. In this session, we bring together the founders of these firms to discuss the technology, and the ethics, of looking into the future.
10:40am-11:20am (40m) Interfaces
Data Journalism: Applied Interfaces
Marshall Kirkpatrick (ReadWriteWeb), Simon Rogers (Guardian), Jer Thorp (New York University)
After Kennedy, you couldn't win an election without TV. After Obama, it was social media. But tomorrow's citizen gets their information from visualizations. In this panel, three acclaimed designers show how they apply visualization to big data, making complex, controversial topics easy to understand and explore.
11:30am-12:10pm (40m) Real World
Real World Applications Panel: Education and Government
Alexander Howard (O'Reilly Media), Peter Clark (Code 42 Software, Inc), Steve Midgley (US Department of Education)
Open access to information promises to connect citizens to their representatives, improving government transparency and helping educators transform the classroom. In this real-world panel, practitioners in government and the public sector will give us a glimpse into how data and new interfaces are transforming how we teach and govern.
1:40pm-2:20pm (40m) Interfaces
AnySurface: Bringing Agent-based Simulation and Data Visualization to All Surfaces
Stephen Guerin (Santa Fe Complex)
Live demonstration of ambient computing using projector-camera pairs to scan the room and place interactive simulations into the space. All surfaces are rendered interactive. We will demonstrate a 3D sandtable for firefighter training and STEM education where the 3D sand becomes an interactive surface.
2:30pm-3:10pm (40m) Business, Data, Interfaces
Beyond visualization: Productivity, Complexity and Information Overload
Creve Maples, Ph.D. (Event Horizon)
We will discuss the impact of the information explosion and the effectiveness of current technological directions, and explore the success that new perception-based human-computer interfaces provide in analyzing and understanding complex data. Real examples will be used to illustrate that effective man-machine environments are essential in productively dealing with multi-dimensional information.
4:10pm-4:50pm (40m) Interfaces
Data as Art
J.J. Toothman (NASA Ames Research Center)
Artistic visualizations and infographics tell the stories of rich data in unique, compelling ways. They synthesize datasets so that they can be interpreted, absorbed, and experienced beyond the spreadsheet, pie chart, and bar graph.
5:00pm-5:40pm (40m) Interfaces
Creating a Universal Software Experience Across Devices
Sunita Shenoy (Intel)
Ram Peddibhotla, a Director from Intel’s Software and Services Group, will discuss how the future of mobile involves ubiquity across multiple hardware platforms. Specifically, Ram will discuss how open source software will shape the next generation of computing devices, improving compatibility.
10:40am-11:20am (40m) Disruption & Opportunity
Where's the Money in Big Data?
Tim Guleri (Sierra Ventures), Roger Ehrenberg (IA Ventures), Paul Kedrosky (Kauffman Foundation), Ping Li (Accel Partners)
The ability to collect, crunch, act upon, and share huge amounts of data disrupts nearly every industry, tearing down barriers to entry and creating entirely new businesses. This panel of investors will discuss where they see the opportunities in the Big Data industry, and how they think about the value of new ventures in the space.
11:30am-12:10pm (40m) Disruption & Opportunity
Online Sentiment, Machine Learning, and Prediction
Ben Gimpert (Altos Research), Margaret Francis (Exact Target/ CoTweet), Ryan Strynatka (Radian6), Josh Merchant (Lymbix)
Today's web analyst has moved far beyond funnels and visitors. Automated systems decide who gets what content, and language parsing tries to distill sentiment from millions of online interactions. This panel will look at where web analytics is headed, and how new algorithms and approaches are yielding fresh insights into online commerce.
1:40pm-2:20pm (40m) Real World
What Kind of Data do Government Agencies Have?
Virginia Carlson (Metro Chicago Information Center (MCIC))
Data integration and viz technology have given rise to an appetite for government data–the Gov 2.0 movement. Do government agencies have good data? Sort of: I believe that an understanding of data limitations has gotten short shrift in the drive to develop the next app. I'll discuss why a knowledge of the complexities of government data is crucial to building quality decision-making tools.
2:30pm-3:10pm (40m) Real World
Dirty Politics, Dirty Data: Taming the Federal Election Commission’s Database
Jon Bruner (O'Reilly Media)
In a first, Forbes presented all federal campaign contributions by America’s wealthiest people in our September 2010 online edition of the Forbes 400. We combined human effort and homegrown database code to sort through 6 million political donations and find the 20,000 that came from America’s richest people.
4:10pm-4:50pm (40m) Real World
Research Evaluation in the Age of Global Digital Scholarship
Brian Wilson (Thomson Reuters)
New technologies are driving a new era of global collaboration among scientists and researchers. Digital scholarship, the ability to create, collect, publish and collaborate in new digital mediums, is driving the exponential growth of data related to scholarly research. This talk will highlight evolving strategies used to appraise and predict success of institutions and researchers.
5:00pm-5:40pm (40m) Practitioner
Meaningful Insights From Raw Metrics: Virtual Worlds and Other Business Applications
Nicholas Yee (PARC), Nic Ducheneaut (PARC)
Virtual worlds are a goldmine of untapped insights, even for predicting physical behaviors. Not only will we share PARC findings and methods developed to extract key data from online games, but more importantly, we'll discuss how social scientists converted and processed raw behavioral metrics into meaningful psychological variables that can be applied to a broad spectrum of business applications.
10:40am-11:20am (40m)
Supersized Data? Get Real-time insights.
Stephen Sorkin (Splunk), Narayan Bharadwaj (ClearStory Data)
From customer behaviors & usage statistics to security postures & operational analytics, Splunk's ability to make sense of all types of machine data, structured or unstructured, and mash it up with other business data provides complete real-time visibility & operational intelligence. This tutorial demos a new approach for analyzing your organization's petabytes of data to derive real-time insights.
11:30am-12:10pm (40m)
Data in the Cloud with OpenStack
Eric Day (craigslist)
The OpenStack project was launched last summer by Rackspace, NASA, and a number of other cloud technology leaders in an effort to build a fully-open cloud computing platform. It is a collection of scalable, standards-based projects currently consisting of OpenStack Compute and OpenStack Object Storage. This session will introduce the projects and describe how they can help manage your data.
1:40pm-2:20pm (40m)
Deriving Intelligence from Large Data - Using Hadoop and Applying Analytics
Vineet Tyagi (Impetus Technologies)
Organizations today possess massive data - in tera- and petabytes - that needs to be effectively collected, stored and processed. Hadoop is a cost effective option that helps manage this big data. To derive real returns from these big data systems, one needs to extract useful insights and business intelligence.
2:30pm-3:10pm (40m)
Painless Scaling: Rewiring Your Existing Stacks for Scalability
Ben Werther (DataStax (formerly Riptano))
If you are a leading enterprise or web company, then two things are almost certainly true. Data is the lifeblood of your business. And you face an ever-increasing need to scale your applications and data services.
4:10pm-4:50pm (40m)
Session
To be confirmed
5:00pm-5:40pm (40m)
Session
To be confirmed
12:10pm-1:40pm (1h 30m)
Thursday Lunchtime BoF Sessions
Birds of a Feather (BoF) sessions provide face to face exposure to those interested in the same projects and concepts. BoFs can be organized for individual projects or broader topics (best practices, open data, standards). BoF topics are entirely up to you. Thursday's Lunchtime BoF sessions will happen on the hotel side of the Hyatt Regency, Mezzanine Level.
6:00pm-9:00pm (3h)
OpenStack Meetup & Open Bar, Sponsored by Rackspace Hosting
Join OpenStack contributors, users, and backers immediately after Strata ends - to celebrate the second release of the fastest-growing open source cloud platform, code-named Bexar. There will be a community Meetup with speakers from 6:00 - 7:00 pm, followed by an open bar from 7:00 - 9:00 pm.
10:10am-10:40am (30m)
Break: Morning Break
3:10pm-4:10pm (1h)
Break: Afternoon Break: Sponsored by EMC
8:45am-8:50am (5m)
Opening Welcome - Day 2
Alistair Croll (Solve For Interesting), Edd Wilder-James (Google)
Alistair Croll and Edd Dumbill welcome you back to Strata.
8:50am-9:00am (10m)
Free Our Data: How We Made Sense of Huge Datasets
Simon Rogers (Guardian)
90,000 items on Afghanistan, 291,000 on Iraq - and another 251,000 cables. Managing the Wikileaks release is just one of the huge data journalism projects the Guardian's data team has embarked on. This talk will look at how journalists can make sense of data and get stories out of it, and at our role in supplying open data to the world.
9:00am-9:15am (15m)
Posthumans, Big Data and New Interfaces
Alistair Croll (Solve For Interesting), Toby Segaran (Google), Amber Case (Esri), Bradford Cross (Flightcaster)
The convergence of big, open data, ubicomp, and new interfaces will change the way humans work, play, learn, and love. It's a slow transformation that happens one tweet, one blog, and one game at a time -- but it's also an inexorable road towards the singularity. In this panel discussion, we'll look beyond the bytes and algorithms to think about humanity awash in a sea of information.
9:15am-9:25am (10m)
Why Legacy Databases Can't Survive the Data Deluge - It's About Dollars and Sense
Ed Boyajian (EnterpriseDB)
Companies must choose to spend their money and time on the right software initiatives. With exploding volumes of critical data, getting new insight and mastery over business operations demands new investments in BI at multiple levels. Ed will show a proven path to avoiding exorbitant database software fees and shifting that spend to areas like BI, where you can realize a stronger ROI.
9:25am-9:35am (10m)
The Heat Death of the Data Warehouse
Barry Devlin (9sight Consulting)
For more than 20 years now, data warehousing has put manners on unruly enterprise data. Yet, physics tells us that disorder inexorably increases unless we endlessly fight it. As information volumes and types explode into chaos, is it time to declare the warehouse dead? Or we could move from classical to quantum physics and create a new information architecture. It’s time to make some new choices…
9:35am-9:45am (10m)
Innovating Data Teams
DJ Patil (White House Office of Science and Technology Policy)
Details coming soon.
9:45am-9:55am (10m)
Your Data Rules the World
Scott Yara (Greenplum, a division of EMC)
A defining characteristic of modern life is the incredible proliferation of digital information. The Economist estimates that the amount of information created each year is growing at a 60% compounded rate. According to the Harvard Business Review, we humans generated more data last year than in all of previous human history.
9:55am-10:10am (15m)
Can Big Data Fix Healthcare?
Carol McCall (Tenzing Health)
In 2001, the Institute of Medicine declared that “between the care we have and the care we could have lies not just a gap, but a chasm,” yet nothing’s really changed. Healthcare remains one of the most richly endowed yet poorly equipped knowledge industries anywhere. Using real world examples, we’ll see how BIG DATA may be just what the doctor ordered, but only if we pick the right problems.

Sponsors

  • Thomson Reuters
  • EMC Data Computing Division
  • EnterpriseDB
  • Microsoft
  • Gnip
  • Rackspace Hosting
  • IBM
  • Windows Azure MarketPlace DataMarket
  • Amazon Mechanical Turk
  • Amazon Web Services
  • Aster Data
  • Cloudera
  • Clustrix
  • DataStax, Inc. (formerly Riptano, Inc.)
  • Digital Reasoning Systems
  • Heritage Provider Network
  • Impetus
  • Jaspersoft
  • Karmasphere
  • LinkedIn
  • MarkLogic
  • Pentaho
  • Pervasive
  • Revolution Analytics
  • Splunk
  • Urban Mapping
  • Wolfram|Alpha
  • Esri
  • ParAccel
  • Tableau Software

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Susan Young at syoung@oreilly.com
