Strata 2013 Schedule

Below are the confirmed and scheduled talks at Strata Conference in Santa Clara 2013 (schedule subject to change).

Customize Your Own Schedule

Create your own Strata schedule using the personal scheduler. Mark the tutorials, sessions, keynotes, and events you want to attend by selecting the calendar icon next to each listing, then go to your personal schedule to generate a customized schedule.

See the list of all events happening onsite, including events on Monday, February 25: Women's Community Meetup, Big Data Camp, and Ignite.

Ballroom H
10:40am Tools to Turn Emergencies into Knowledge: Turning 911 into 411 Eron Kelly (Microsoft Corporation), Paul Henderson (Ascribe)
11:30am Ready for Primetime? What Enterprise-ready Really Means Charles Zedlewski (Cloudera)
2:20pm Using Hadoop to Expand Data Warehousing Mike Peterson (Neustar)
4:00pm Five Real World Hadoop Success Stories with HP Sanjai Marimadaiah (Hewlett Packard), Luis Maldonado (HP Vertica)
Ballroom AB
10:40am Real-time Stream Processing and Visualization Using Kafka, Storm, and d3.js Justin Langseth (Zoomdata, Inc.), Byron Ellis (Spongecell)
11:30am Feedback Control for Programmers and Other Strangers Philipp Janert (Principal Value, LLC)
2:20pm Third Generation Tools for Realizing Machine Learning Algorithms Dr. Vijay Srinivas Agneeswaran (SapientNitro)
Great America Ballroom J
10:40am F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business Stephan Ellner (Google), Jeff Shute (Google)
11:30am High-Volume Data Collection and Real Time Analytics Using Redis Constantine Aaron Cois (Carnegie Mellon University, Software Engineering Institute), Tim Palko (Carnegie Mellon University, Software Engineering Institute)
1:30pm The Rise of the Scientific Databases John A. De Goes (Precog)
2:20pm Impala: A Modern SQL Engine for Hadoop Justin Erickson (Cloudera)
4:00pm Druid: Interactive Queries Meet Real-time Data Eric Tschetter (Metamarkets), Danny Yuan (Netflix Platform Engineering Team)
Great America Ballroom K
11:30am Big Data Tag-Team: Hadoop and the Data Warehouse Shaun Connolly (Hortonworks), Tasso Argyros (Teradata Aster)
1:30pm The Workflow Abstraction Paco Nathan (O'Reilly Media)
2:20pm The BigData Top100 List Milind Bhandarkar (Greenplum, A Division of EMC), Chaitan Baru (SDSC/UC San Diego)
4:00pm Building Recommendation Platforms with Hadoop Jayant Shekhar (Sparkflows Inc.)
Ballroom CD
11:30am Four Pillars of Effective Visualizations Noah Iliinsky (Amazon Web Services)
4:00pm Great Debate: Design Matters More Than Math Alexander Gray (Skytree, Inc.), Monica Rogati (Data Natives), Julie Steele (Silicon Valley Data Science), Douglas van der Molen (ClearStory Data)
Ballroom E
10:40am Crowdfunded Open Doctor Data Fred Trotter (FredTrotter.com)
11:30am Monkeys & Math: How MailChimp Catches Bad Guys John Foreman (MailChimp)
1:30pm A Model Strategy for Data Journalism in a Country Without Open Data Sandra Crucianelli (International Center for Journalists), Angélica Peralta Ramos (La Nacion Newspaper)
2:20pm The Web As The Greatest Dataset Of All Time Lisa Green (Common Crawl), Greg Lindahl (blekko), Kevin Burton (Spinn3r)
4:00pm Design, Transparency, and Big Data in Civil Litigation Dean Malmgren (Datascope Analytics), Mike Stringer (Datascope Analytics)
Ballroom F
10:40am Medical Data: Going from Hospitals to Home Carson Darling (Rest Devices)
11:30am Sociometric Badges: Using Wearable Sensors to Change Management Ben Waber (Sociometric Solutions)
1:30pm Deriving an Interest Graph for Social Data Anna Smith (bitly)
2:20pm LinkedIn Endorsements: Reputation, Virality, and Social Tagging Sam Shah (SkipFlag), Peter Skomoroch (SkipFlag)
8:45am Plenary
Room: Mission City Ballroom
Thursday Keynote Welcome Alistair Croll (Solve For Interesting), Edd Wilder-James (Silicon Valley Data Science)
8:55am Plenary
Room: Mission City Ballroom
Big Data. in a Really Big World! Cecilia Bouras (Western Union)
9:05am Plenary
Room: Mission City Ballroom
Xbox Data is XXL Dave Campbell (Microsoft)
9:15am Plenary
Room: Mission City Ballroom
Broad Data: What Happens When the Web of Data Becomes Real? James Hendler (RPI)
9:25am Plenary
Room: Mission City Ballroom
Delivering Intelligence Wherever Data Lives Girish Juneja (Intel)
9:30am Plenary
Room: Mission City Ballroom
Grafting Hadoop and SAP HANA Together Joydeep Das (SAP)
9:35am Plenary
Room: Mission City Ballroom
Human Fault-tolerance Nathan Marz (Twitter)
9:45am Plenary
Room: Mission City Ballroom
Algorithmic Illusions: Hidden Biases of Big Data Kate Crawford (Microsoft Research)
4:50pm Plenary
Room: Mission City Ballroom
Big Data vs The Beltway: The Regulatory Risks to Data-Driven Businesses Kenneth Cukier (The Economist)
5:10pm Plenary
Room: Mission City Ballroom
The Victory Lab Sasha Issenberg (The Victory Lab)
Ballroom G
10:40am When Energy Met Intelligence: Utilities Using Hadoop for Analytics at Scale Greg Khairallah (Intel), Bert Haskell (Pecan Street Projects)
11:30am Data Set Management System for Hadoop Michael Lang Sr. (Revelytix)
2:20pm Dotting the I’s with Hadoop on Eseries David Henry (Pentaho), Benjamin Lloyd (NetApp)
4:00pm Big Data on the Open Cloud Natasha Gajic (Rackspace)
10:10am Morning Break - Sponsored by Intel
Room: Expo Hall AB
3:00pm Afternoon Break
Room: Expo Hall AB
12:10pm Lunch - Sponsored by Microsoft
Room: Expo Hall C
Thursday Lunchtime BoF Tables
8:00am Coffee Break - Sponsored by VMware
Room: Mission City Ballroom Foyer
10:40am-11:20am (40m) Sponsored Sessions
Tools to Turn Emergencies into Knowledge: Turning 911 into 411
Eron Kelly (Microsoft Corporation), Paul Henderson (Ascribe)
Microsoft partner, Ascribe, is using Microsoft’s Big Data solutions to turn emergencies into actionable data
11:30am-12:10pm (40m) Sponsored Sessions
Ready for Primetime? What Enterprise-ready Really Means
Charles Zedlewski (Cloudera)
Cloudera, the standard for Apache Hadoop in the enterprise, empowers data-driven enterprises to Ask Bigger Questions™ and get bigger answers from all their data at the speed of thought. Cloudera Enterprise, the platform for Big Data, enables organizations to easily derive business value from structured and unstructured data to achieve a significant competitive advantage.
1:30pm-2:10pm (40m) Sponsored Sessions
Data Science vs. Analytics -- Approaches to Problem Solving
Nick Kolegraff (Rackspace)
Data Science has created quite the movement in the data world, yet confusion between data science and analytics still remains across the enterprise. Rather than approach the subject talking about semantic differences between the two, we will discuss the topics as they relate to solving problems, how businesses are approaching them and what you can start doing with data science.
2:20pm-3:00pm (40m) Sponsored Sessions
Using Hadoop to Expand Data Warehousing
Mike Peterson (Neustar)
Learn how Neustar has expanded its data warehouse capacity, improved agility for data analysis, reduced costs, and enabled new data products. Discuss challenges and opportunities in capturing hundreds of terabytes of compact binary network data, ad hoc analysis, integration with a scale-out relational database, more agile data development, and building new products integrating multiple big data sets.
4:00pm-4:40pm (40m) Sponsored Sessions
Five Real World Hadoop Success Stories with HP
Sanjai Marimadaiah (Hewlett Packard), Luis Maldonado (HP Vertica)
Learn how HP has established itself as the premier Big Data vendor with a solid portfolio of turnkey solutions that can be deployed faster than ever, while keeping acquisition and operational costs down. Learn more at hp.com/go/information.
10:40am-11:20am (40m) Data Science
Real-time Stream Processing and Visualization Using Kafka, Storm, and d3.js
Justin Langseth (Zoomdata, Inc.), Byron Ellis (Spongecell)
Learn how LivePerson and Zoomdata perform stream processing and visualization on mobile devices of structured site traffic and unstructured chat data in real-time for business decision making. Technologies include Kafka, Storm, and d3.js for visualization on mobile devices. Byron Ellis, Data Scientist for LivePerson, will join Justin Langseth of Zoomdata to discuss and demonstrate the solution.
11:30am-12:10pm (40m) Data Science
Feedback Control for Programmers and Other Strangers
Philipp Janert (Principal Value, LLC)
Most stable systems rely on feedback - from central heating to industrial plants and biological organisms. This introductory talk will explain what feedback is, why it is relevant to enterprise software development, and how to apply it to some typical problems arising in business and technical situations.
1:30pm-2:10pm (40m) Data Science
Real-World Machine Learning on Big Data: Which Method(s) Should You Use?
Alexander Gray (Skytree, Inc.)
Given a machine learning (ML) problem, which method(s) should you use, and how does big data affect your choices? I will discuss some principles derived from decades of theory and practice, illustrated through real-world ML success stories in medicine, marketing, financial services, and astronomy.
2:20pm-3:00pm (40m) Data Science
Third Generation Tools for Realizing Machine Learning Algorithms
Dr. Vijay Srinivas Agneeswaran (SapientNitro)
The key takeaway from this session will be an understanding of the third generation of tools for realizing machine learning algorithms - examples of these tools include Twister, HaLoop, and GraphLab. Attendees will also understand why second generation tools such as Mahout have not implemented some of the machine learning algorithms for big data. The session will also have real-life use cases.
4:00pm-4:40pm (40m) Data Science
Introducing Julia - a New Open Source Mathematical Programming Language
Michael Bean (Forio Simulations)
Julia is a new mathematical programming language that is scalable, high-performance, and open source. Julia is fast, approaching and often matching the performance of C/C++, easy to learn, and designed for distributed computation. This session will demonstrate some of the special capabilities of Julia and give you the tools you need to get started using this exciting technical computing language.
10:40am-11:20am (40m) Beyond Hadoop
F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business
Stephan Ellner (Google), Jeff Shute (Google)
Many of the services that are critical to Google’s ad business have historically been backed by MySQL. We have recently migrated several of these services to F1, a new RDBMS developed at Google. F1 implements rich relational database features, including a strictly enforced schema, a powerful parallel SQL query engine, general transactions, change tracking and notification, and indexing.
11:30am-12:10pm (40m) Beyond Hadoop
High-Volume Data Collection and Real Time Analytics Using Redis
Constantine Aaron Cois (Carnegie Mellon University, Software Engineering Institute), Tim Palko (Carnegie Mellon University, Software Engineering Institute)
In this talk, we describe using Redis, an open source, in-memory key-value store, to capture large volumes of data from numerous remote sources while also allowing real-time monitoring and analytics. With this approach, we were able to capture a high volume of continuous data from numerous remote environmental sensors while consistently querying our database for real time monitoring and analytics.
1:30pm-2:10pm (40m) Beyond Hadoop
The Rise of the Scientific Databases
John A. De Goes (Precog)
This talk discusses the market needs that are giving birth to the "scientific database", what these systems have to offer that is currently lacking in either the data management or statistical worlds, and how scientific databases will co-exist and co-evolve with Hadoop and other leading big data platforms.
2:20pm-3:00pm (40m) Beyond Hadoop
Impala: A Modern SQL Engine for Hadoop
Justin Erickson (Cloudera)
The Cloudera Impala project is for the first time making scalable parallel database technology, which is the underpinning of Google's Dremel as well as that of commercial analytic DBMSs, available to the Hadoop community.
4:00pm-4:40pm (40m) Beyond Hadoop
Druid: Interactive Queries Meet Real-time Data
Eric Tschetter (Metamarkets), Danny Yuan (Netflix Platform Engineering Team)
This talk will discuss how Druid allows users to have interactive queries on real-time data at scale; we feature a case study with Netflix leveraging Druid to obtain at-the-moment insight as it ingests over two terabytes per hour.
10:40am-11:20am (40m) Hadoop in Practice
How Hadoop in the Cloud Affects Developer-Friendly Decision Making
Philip (Flip) Kromer (CSC)
Join Flip Kromer, co-founder and CTO of Infochimps, as he walks you through a series of decision trees, making you rethink your use of Hadoop in the cloud and opening up possibilities for new patterns of work that are uniquely developer-friendly. Patterns of work like tuning your cluster to the job, and why the first priority of any analytics cluster should be downtime.
11:30am-12:10pm (40m) Hadoop in Practice
Big Data Tag-Team: Hadoop and the Data Warehouse
Shaun Connolly (Hortonworks), Tasso Argyros (Teradata Aster)
Apache Hadoop is an innovative emerging technology causing CIOs to rethink their data architecture - making this an exciting time to be a “big data” technologist. This tag-team presentation brings leaders in both Apache Hadoop and data warehousing on the stage, to answer these questions by sharing their vision for the future of big data management and analytics.
1:30pm-2:10pm (40m) Hadoop in Practice
The Workflow Abstraction
Paco Nathan (O'Reilly Media)
This talk examines the notion of a "workflow" as a general abstraction for common use cases encountered in Data Science, particularly for building Enterprise apps. Patterns of workflows provide recipes for integrating different frameworks, plus the means for optimizing large-scale apps. We review this approach in the context of a sample app based on the Cascading open source project.
2:20pm-3:00pm (40m) Hadoop in Practice
The BigData Top100 List
Milind Bhandarkar (Greenplum, A Division of EMC), Chaitan Baru (SDSC/UC San Diego)
We will describe the BigData Top100 List initiative, a new, open, community-based effort for benchmarking big data systems.
4:00pm-4:40pm (40m) Hadoop in Practice
Building Recommendation Platforms with Hadoop
Jayant Shekhar (Sparkflows Inc.)
This talk dives into the extreme details of building recommendation platforms. It covers the end-to-end architecture and design of such a system, along with the various ML algorithms to be used and their details. It also covers solutions to commonly seen recommendation patterns and detailed use cases along with their solutions.
10:40am-11:20am (40m) Design
Maps Not Lists: Network Graphs for Data Exploration
Amy Heineike (Quid)
The majority of data we consume today are presented in lists, one-dimensional orderings that limit the user's ability to understand context or perform strategic analyses. For unstructured data, we need to re-imagine what types of visualisations enable exploration in the way that geographic maps can.
11:30am-12:10pm (40m) Design
Four Pillars of Effective Visualizations
Noah Iliinsky (Amazon Web Services)
This talk discusses the broad design considerations necessary for effective visualizations. Attendees will learn what's required for a visualization to be successful, gain insight for critically evaluating visualizations they encounter, and come away with new ways to think about the visualization design process.
1:30pm-2:10pm (40m) Design
Language Technologies for a Connected World: Processing and Visualizing Unstructured Text in 5000 Languages
Robert Munro (Idibon)
The majority of the world's data is now unstructured, non-English text. How can we extract useful information from it? Many of our assumptions about English do not carry over to other languages. This talk will give a high-level overview of how languages vary, what current language technologies can (and cannot) achieve, and how we can process and visualize this information at scale.
2:20pm-3:00pm (40m) Design
Analyzing Terapixels and Megamiles with Google's Global-scale Geospatial Cloud Computing Platform
Louis Perrochon (Google)
Crunch 40 years' worth of daily global satellite data at the push of a button, perform spatial analyses on GBs of your own GIS data, and securely share the results privately or publish to 1B Google Earth users. This talk will focus on how what was once the realm of a few is now easily and intuitively accessible from the comfort of your Chrome browser.
4:00pm-4:40pm (40m) Design
Great Debate: Design Matters More Than Math
Alexander Gray (Skytree, Inc.), Monica Rogati (Data Natives), Julie Steele (Silicon Valley Data Science), Douglas van der Molen (ClearStory Data)
The Great Debate series returns to Strata. In this Oxford-style debate, two teams take opposing positions. We poll the audience, and the teams try to sway opinions. It'll be a fast-paced, sometimes irreverent look at some of the core challenges of putting data to work.
10:40am-11:20am (40m) Law, Ethics, and Open Data
Crowdfunded Open Doctor Data
Fred Trotter (FredTrotter.com)
At Strata RX, we announced the release of DocGraph, the largest open named social graph data set that we know of. This data set included links between doctors who commonly team together in the Medicare dataset. Since then, we have added tremendous depth to the data by crowdfunding the acquisition of doctor credentialing data. Come learn how healthcare works under the covers.
11:30am-12:10pm (40m) Law, Ethics, and Open Data
Monkeys & Math: How MailChimp Catches Bad Guys
John Foreman (MailChimp)
Hear from MailChimp’s Chief Scientist John Foreman as he dishes on dirty data and demonstrates the latest in MailChimp’s anti-abuse artificial intelligence. MailChimp sends 3 billion emails a month for its millions of users, and it can't afford to let a drop of spam go out. Learn how the company is using cutting-edge NoSQL solutions and predictive models to leave the bad guys out in the cold.
1:30pm-2:10pm (40m) Law, Ethics, and Open Data
A Model Strategy for Data Journalism in a Country Without Open Data
Sandra Crucianelli (International Center for Journalists), Angélica Peralta Ramos (La Nacion Newspaper)
This talk introduces the idea that access to Big Data in many countries, especially Argentina, is still a work in progress and somewhat politicized. Despite that, media like La Nacion Newspaper are working with developers and experts in Data Viz to address the lack of transparency and accountability.
2:20pm-3:00pm (40m) Law, Ethics, and Open Data
The Web As The Greatest Dataset Of All Time
Lisa Green (Common Crawl), Greg Lindahl (blekko), Kevin Burton (Spinn3r)
Big data tools made it possible to gain extremely valuable insight from large scale analysis of web data, but until recently few people had access to the data. Now tools like Grep the Web and increased raw access to web data grant anyone the power to do such analysis. This presentation addresses practical applications of web data analysis that you can incorporate into your research or products.
4:00pm-4:40pm (40m) Law, Ethics, and Open Data
Design, Transparency, and Big Data in Civil Litigation
Dean Malmgren (Datascope Analytics), Mike Stringer (Datascope Analytics)
Electronic discovery has transformed the way cases are litigated. Gone are the days of manual review, where litigators spent days poring over emails, messages, and documents. Today's e-discovery technologies mine through vast troves of information, looking for the needle in the proverbial haystack that will blow a case wide open.
10:40am-11:20am (40m) Connected World
Medical Data: Going from Hospitals to Home
Carson Darling (Rest Devices)
This talk will discuss Rest Devices' proprietary low-cost sensor technology, its use of and vision for big biometric data, and the need for design integration in all facets of product development, be it software or hardware.
11:30am-12:10pm (40m) Connected World
Sociometric Badges: Using Wearable Sensors to Change Management
Ben Waber (Sociometric Solutions)
I will discuss how a wearable sensing platform, the Sociometric Badge, allows us to measure and analyze human behavior in the real world, particularly in the workplace. We’ll discuss how we use the badges to recognize concepts such as persuasiveness and social support and how we have used the badges in real companies to drive organizational change and put hard numbers behind management methods.
1:30pm-2:10pm (40m) Connected World
Deriving an Interest Graph for Social Data
Anna Smith (bitly)
While audience analysis is an old topic, it is being reimagined in terms of personas described by topic distributions rather than the usual demographic terms. This provides deeper insight into online communities and into how the internet is consumed.
2:20pm-3:00pm (40m) Connected World
LinkedIn Endorsements: Reputation, Virality, and Social Tagging
Sam Shah (SkipFlag), Peter Skomoroch (SkipFlag)
Learn how LinkedIn endorsements used data mining techniques to develop a viral social tagging and reputation system.
4:00pm-4:40pm (40m) Data Science, Internet of Things, Location
Big Data from Small Devices: Using Smartphones to Understand Human Behavior
Nadav Aharony (Google)
Today's smartphones have evolved into incredibly rich sensing and computing devices that can be used to infer complex and interesting things about us, our environment, and our communities. This talk will give an overview of user-centric, continuous mobile sensing, and our work, originating at the MIT Media Lab, to develop open tools to democratize this capability.
8:45am-8:55am (10m)
Thursday Keynote Welcome
Alistair Croll (Solve For Interesting), Edd Wilder-James (Silicon Valley Data Science)
Program Chairs, Edd Dumbill and Alistair Croll, welcome you to the second day of keynotes.
8:55am-9:05am (10m)
Big Data. in a Really Big World!
Cecilia Bouras (Western Union)
In this keynote, we will explore some of the challenges of big data operating in a truly global context.
9:05am-9:15am (10m) Sponsored Sessions
Xbox Data is XXL
Dave Campbell (Microsoft)
Microsoft keynote, featuring Dave Campbell, Vice President of Product Development for the SQL Server product suite.
9:15am-9:25am (10m)
Broad Data: What Happens When the Web of Data Becomes Real?
James Hendler (RPI)
In this talk, we present the broad data challenge and discuss potential starting points for solutions. We illustrate these approaches using data from a "meta-catalog" of over 1,000,000 open datasets that have been collected from about two hundred governments from around the world.
9:25am-9:30am (5m) Sponsored Sessions
Delivering Intelligence Wherever Data Lives
Girish Juneja (Intel)
How software can transform human lives by bringing intelligence to wherever big data lives.
9:30am-9:35am (5m) Sponsored Sessions
Grafting Hadoop and SAP HANA Together
Joydeep Das (SAP)
Hadoop and SAP HANA are taking the world by storm. SAP HANA is the fastest growing commercial database in the market, being adopted by the world’s top enterprises for real-time analytics and applications.
9:35am-9:45am (10m)
Human Fault-tolerance
Nathan Marz (Twitter)
Designing for human fault-tolerance leads to important conclusions on the fundamental ways data systems should be architected.
9:45am-10:05am (20m)
Algorithmic Illusions: Hidden Biases of Big Data
Kate Crawford (Microsoft Research)
Big data gives us a powerful new way to see patterns in information - but what can't we see? When does big data not tell us the whole story? This talk opens up the question of the biases we bring to big data, and how we might work beyond them.
4:50pm-5:10pm (20m)
Big Data vs The Beltway: The Regulatory Risks to Data-Driven Businesses
Kenneth Cukier (The Economist)
As big data makes inroads into all aspects of society, how governments regard the technology will be critical for its success. If the past is a guide, the state will embrace big data for its own uses (both good and ill). It will recognize that its authority is threatened and lash out
5:10pm-5:30pm (20m)
The Victory Lab
Sasha Issenberg (The Victory Lab)
The Victory Lab presents a secret history of modern American politics, pulling back the curtain on the tactics and strategies used by some of the era's most important figures, including Barack Obama and Mitt Romney, with iconoclastic insights into human decision-making, marketing and how analytics can put any business on the road to victory.
10:40am-11:20am (40m) Sponsored Sessions
When Energy Met Intelligence: Utilities Using Hadoop for Analytics at Scale
Greg Khairallah (Intel), Bert Haskell (Pecan Street Projects)
Real-world examples of utility companies around the world using Hadoop to optimize their services and changing Hadoop in the process.
11:30am-12:10pm (40m) Sponsored Sessions
Data Set Management System for Hadoop
Michael Lang Sr. (Revelytix)
Managing data in Hadoop gets complex quickly - Loom is the data set management system for Hadoop that makes it easy. Loom provides tools to track the lineage and provenance of all registered HDFS data, and Activescan so that all of the critical information about data sets is collected dynamically.
1:30pm-2:10pm (40m) Sponsored Sessions
The Business Analyst and the Evolution of Interactive Analytics on Big Data
Priyank Patel (Teradata Aster)
MapReduce, Hadoop, and other “NoSQL” big data approaches opened opportunities for data scientists in every industry to develop new data-intensive applications. But what about the more traditional SQL users or analysts? How can they unlock insights through standard business intelligence (BI) tools or ANSI SQL access?
2:20pm-3:00pm (40m) Sponsored Sessions
Dotting the I’s with Hadoop on Eseries
David Henry (Pentaho), Benjamin Lloyd (NetApp)
Attend this session to hear how NetApp was able to solve their big data problem. Since the design and implementation of the solution, NetApp has a number of takeaways and best practices required to convert theory into practice, allowing completion of an enterprise-level implementation of such a solution.
4:00pm-4:40pm (40m) Sponsored Sessions
Big Data on the Open Cloud
Natasha Gajic (Rackspace)
Come learn about ACG, Analytical Compute Grid, a solution Rackspace built leveraging OpenStack, Big Data and NoSQL to help end users manage complex information and data.
10:10am-10:40am (30m)
Break: Morning Break - Sponsored by Intel
3:00pm-4:00pm (1h)
Break: Afternoon Break
12:10pm-1:30pm (1h 20m)
Thursday Lunchtime BoF Tables
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Wed 2/27 and Thu 2/28. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area.
8:00am-8:45am (45m)
Break: Coffee Break - Sponsored by VMware

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, contact Kathy Yu at mediapartners@oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com
