Schedule: Speakers

Keaton Adams
Keaton Adams (McAfee)

Keaton Adams is a Principal Enterprise Data Engineer with McAfee, an Intel Company. Keaton architects, deploys and maintains large-scale data management solutions for the McAfee Software-as-a-Service Email and Web Protection products. He has served for over twenty years in the IT industry in various data management roles and continues to evaluate technologies such as NoSQL solutions for Big Data Analysis.

Keaton earned his MBA from Colorado Christian University and is currently pursuing a Master of Science in Information Systems with a specialization in Business Intelligence at the University of Colorado Denver.

Jim Adler
Jim Adler (Metanautix), @jim_adler

Jim is a business executive, entrepreneur, and thought leader on big data, privacy, security, and voting systems. Currently, Jim is VP of Products & Chief Privacy Officer at Metanautix. He also currently serves on the Department of Homeland Security (DHS) Data Privacy and Integrity Advisory Committee (DPIAC), providing advice at the request of the Secretary of Homeland Security and the DHS Chief Privacy Officer.

Most recently, Jim was Vice President, Data Systems at inome and the first Chief Privacy Officer at Intelius. Jim led the big data team that powers the company’s products as well as serving as its chief consumer advocate. Prior to inome and Intelius, Jim served as president and chief technology officer at, an Internet...

Alasdair Allan
Alasdair Allan (Babilim Light Industries), @aallan

Alasdair Allan is a director at Babilim Light Industries and a scientist, author, hacker, maker, and journalist. An expert on the internet of things and sensor systems, he’s famous for hacking hotel radios, deploying mesh networked sensors through the Moscone Center during Google I/O, and for being behind one of the first big mobile privacy scandals when, back in 2011, he revealed that Apple’s iPhone was tracking user location constantly. He’s written eight books and writes regularly for, Hackaday, and other outlets. A former astronomer, he also built a peer-to-peer autonomous telescope network that detected what was, at the time, the most distant object ever discovered.

Robbie Allen
Robbie Allen (InfiniaML), @RobbieAllen

Robbie Allen is the Founder and CEO of Automated Insights, Inc., a VC-backed startup based in Durham, NC. Robbie was formerly a Distinguished Engineer, IT at Cisco. He has master’s degrees in Civil and Environmental Engineering and System Design Management from MIT. He’s the author or co-author of ten books covering a variety of technical topics.

Xavier Amatriain

Xavier Amatriain currently manages a team of engineering and research stars creating next-generation personalized experiences at Netflix. He works at the crossroads of data mining, machine learning, software engineering, innovation, and agile methods.

Previously, he was a Senior Researcher at Telefonica, where his research focused on Recommender Systems and neighboring areas such as Data Mining, User Modeling, Social Networks, and e-Commerce.

He has authored more than 50 papers in books, journals and international conferences. He has also been an invited speaker at several conferences and universities.

Jesper Andersen
Jesper Andersen (Bloom Studios), @jandersen

Jesper develops experimental online services designed to introduce emotional contexts into online relationships, creating more authentic experiences. He is the co-founder of Bloom Studios, developing novel data interface applications for web and tablet platforms. He is also an accomplished data scientist, working on problems including home valuations for Trulia, credit card fraud for Visa, and social network analysis for Visible Path. Jesper speaks frequently at international technology and design conferences and has appeared in print and broadcast media for projects like Avoidr, Freerisk, and his Foursquare privacy hack. He holds a B.Sc. in Physics from Haverford College and an M.B.A. in Econometrics from University of Chicago.

Tasso Argyros
Tasso Argyros (ActionIQ)

Tasso Argyros is Co-President, Teradata Aster, leading the Aster Center of Innovation. Tasso has a background in data management, data mining and large-scale distributed systems. Before founding Aster Data, he was in the Ph.D. program at Stanford University.

Tasso was recognized as one of Bloomberg BusinessWeek’s Best Young Tech Entrepreneurs for 2009. He holds a Master’s Degree in Computer Science from Stanford University and a Diploma in Computer Engineering from Technical University of Athens.

Eric Baldeschwieler
Eric Baldeschwieler (Independent)

Prior to co-founding Hortonworks, Eric served as VP Hadoop Software Engineering for Yahoo!, where he led the evolution of Apache Hadoop from a 20 node prototype to a 42,000 node service that is behind every click at Yahoo!. Eric also served as a technology leader for Inktomi’s web service engine, which Yahoo! acquired in 2003. Prior to Inktomi, Eric developed software for video games, video post production systems and 3D modeling systems. Eric has a Master’s degree in Computer Science from the University of California, Berkeley and a Bachelor’s degree in Mathematics and Computer Science from Carnegie Mellon University.

Solon Barocas
Solon Barocas (New York University)

Solon Barocas is a doctoral student in the Department of Media, Culture, and Communication and Student Fellow at the Information Law Institute at New York University. His research examines the ethics and implications of data mining, particularly for purposes of strategic communication. Solon has worked with the Berkman Center for Internet and Society, the Center for Global Communication Studies, the Stanhope Centre for Communication Policy and Research, and the Russell Sage Foundation. He obtained his MSc in International Relations from the London School of Economics and graduated from Brown University with a BA in Art-Semiotics and International Relations.

Kirkland Barrett
Kirkland Barrett (Microsoft)

Kirkland has 7 years of Business Intelligence experience at Microsoft and has delivered numerous successful solutions for multiple lines of business. He has managed deployment of specific line of business solutions and has been a key player in Microsoft’s efforts to create its Enterprise Data Warehouse. This breadth of experience helps Kirkland bring insights about the tension between the need for centrally controlled data and the agility and needs of both the lines of business and the business itself. Kirkland has applied this experience in helping to drive innovative operating models embracing self-service BI alongside traditional IT-delivered BI to provide increased agility for the business and far greater efficiencies for IT. He now helps drive Microsoft IT’s new BI strategy and partners...

Bitsy Bentley
Bitsy Bentley (Participatory Budgeting Project), @bitsybot

As the Director of Data Visualization at GfK Custom Research (a global market research firm), Bitsy designs data visualization applications to tell compelling stories about research data. In addition to developing new methodologies and templates for current GfK design processes, she also educates colleagues on current and emergent visualization tools, techniques and best practices.

Prior to joining GfK she worked as a freelance consultant, designing and hand coding interactive data displays for technology companies as well as traditional market research firms.

Bitsy has six years of experience in the market research industry, and holds a B.F.A. in Industrial Design from the University of Wisconsin Stout.

Christopher Berry

Christopher Berry is Vice President, Research at Syncapse, a leading social media technology company. In this role, Chris develops physical and social technologies that help marketers demonstrate the ROI of social media marketing, insight generation, and optimization. His team’s most recent contributions include research papers on the Value of a Facebook Fan and Advanced Sentiment Analysis. Previously, Chris held management positions at Critical Mass, and a variety of research positions at not-for-profit and academic institutions.

He’s an organizer of the Toronto Data Science Group, Web Analytics Wednesday Toronto, and an active bridge builder to INFORMS and the Toronto Data Mining Forum. He is co-chair of the Web Analytics Association’s Research Committee and leads the Peer Review Journals Project.

Matt Biddulph
Matt Biddulph (Product Club), @mattb

Matt Biddulph is an independent creative technologist. He was co-founder of Dopplr, the social network for smarter travel acquired by Nokia in 2009. He started out in 1994 building search engines on CD-ROM, and now specialises in digital media, social software and putting data on the web.

Ron Bodkin

Ron founded Think Big Analytics to help customers leverage new data processing technologies like Hadoop and NoSQL databases and R for statistical analysis. He works with customers to identify opportunities and rapidly develop solutions that integrate data and extract information.

Previously, Ron was the VP of Engineering for Quantcast. Each day Quantcast ingests 10 billion events and processes two petabytes of data using Hadoop. The Quantcast MapReduce stack handles production data processing, ad hoc analysis, data mining and machine learning. Prior to that, Ron was a founder of enterprise consulting companies C-bridge and New Aspects.

Billy Bosworth

Billy is responsible for the day-to-day operations of DataStax. He has 20 years of experience in the database industry in roles ranging from DBA to senior executive. Prior to DataStax, Billy spent 6 years at Quest Software, a provider of systems management software, where his most recent role was VP and GM of the database business unit. Under his leadership, the industry-leading Quest database business grew from supporting traditional relational databases to a portfolio that now includes tools for cloud, NoSQL, columnar, and Hadoop databases, as well as business intelligence offerings. Prior to Quest, Billy led product teams for Embarcadero Technologies’ database productivity solutions. Billy holds a bachelor of science in computer science from the University of Louisville.

Mike Bowles
Mike Bowles (Biomatica)

Dr Mike Bowles’s career is one of the most extraordinary in Silicon Valley. Mike’s career started out in research, as an assistant professor at MIT. He went on to found and run two companies, both of which went on to huge IPOs: First was Com21, an early pioneer in developing cable modem networks, which Mike led to a successful NASDAQ IPO at a $300m valuation. He then went on to create IBeam Broadcasting, a video distribution network, which after just 2.5 years he led to a $3b IPO. More recently he has been active as co-founder and instructor for the series of data mining courses run at Hacker Dojo. These courses are nearly always sold out, and have received great feedback...

Chris Broadfoot

Chris is a Developer Programs Engineer on the Geo Developer Relations team. He is currently focused on the Maps JavaScript API, helping developers build compelling applications. Prior to Google, he worked as a developer at Atlassian, CSC and Coca-Cola Amatil.

Paul Brown (Paradigm4 Inc.)

Paul Brown is the Chief Plumber for Paradigm4 and SciDB, an open source array database management system designed to scale in support of very large analytic workloads. Prior to Paradigm4, Paul spent a decade working for IBM Research at the Almaden Research Center in San Jose, CA, where he focused on advanced database systems research. Before IBM, Paul worked for 15 years at a number of database companies, all distinguished by the fact that their names started with the letter ‘I’: Ingres, Illustra, and Informix. Paul is the author of several books about database technology, and a dozen research papers over the last fifteen years covering data analysis and DBMS implementation.

Jon Bruner
Jon Bruner (O'Reilly Media), @jonbruner

Jon Bruner is a data journalist who approaches questions that interest him by writing and coding. Jon is cochair of the O’Reilly Solid conference, focused on the intersection between software and the physical world, and he oversees O’Reilly’s publications on hardware, the Internet of Things, manufacturing, and electronics. Before coming to O’Reilly, Jon was the data editor at Forbes magazine. He lives in San Francisco, where he can occasionally be found at the console of a pipe organ.

Michael Brunton-Spall
Michael Brunton-Spall (Guardian News and Media), @bruntonspall

Michael Brunton-Spall is the Developer Advocate for the Guardian. He has worked at the Guardian for three years now, helping to build and scale the website. He has spent a lot of time helping to set up and run the platform team that manages internal, behind-the-scenes performance and scalability issues.

As a Developer Advocate, Michael speaks at conferences, organises conferences, supports users of the APIs and does training.

Sean Byrnes
Sean Byrnes (Flurry, Inc.)

Sean co-founded Flurry, Inc. in 2005 to take advantage of the rapidly expanding mobile application ecosystem and growing potential of mobile phones. Today, Flurry is the leading provider of services to developers of mobile applications, working with over 60,000 developers of applications for iOS, Android, Blackberry and other platforms. Flurry’s flagship product, Flurry Analytics, provides in-depth intelligence to developers about their mobile applications and tracks over 1.2 billion application sessions per day. Prior to Flurry, Sean worked in the eServices group of Verizon Communications, where he built commercial services from emerging technologies such as WiFi and VoIP. Sean holds a Bachelor of Arts in Engineering from Dartmouth College and a Master of Engineering from Cornell University in Computer Science.

Dave Campbell
Dave Campbell (Microsoft)

David Campbell is a Microsoft Technical Fellow whose present role is Vice President of Product Development for the SQL Server product suite.

David graduated with a Master’s Degree in Mechanical Engineering (Robotics) from Clarkson University in 1984 and began working on robotic workcells for Sanders Associates – later a division of Lockheed Corporation. In 1990 he joined Digital Equipment Corporation, where he worked on their Codasyl database product, DEC DBMS, as well as their relational database product, Rdb.

Upon joining Microsoft in 1994, David was a developer and architect on the SQL Server Storage Engine team that was principally responsible for rewriting the core engine of SQL Server for SQL Server Version 7.0.

At Microsoft, he has held numerous...

Virginia Carlson
Virginia Carlson (Urban Rubrics), @VL_Carlson

A data and information expert, Virginia has more than 25 years of experience leading fast-paced, creative environments where data are used to make decisions, tell stories and illuminate trends. Before taking the helm at MCIC in January 2009, Virginia was a professor of Urban Planning at the University of Wisconsin–Milwaukee. She’s also been Deputy Director for Data Policy at the Brookings Institution and was the founding Research Director at World Business Chicago. She holds a Doctorate in Political Science from Northwestern University. As a Board member of the Association of Public Data Users, Virginia believes that even the smallest non-profit organizations should have access to the best data available.

Lora Cecere
Lora Cecere (Supply Chain Insights), @lcecere

Lora Cecere is the Supply Chain Shaman. A shaman interprets and connects the evolving world to a group of followers. Lora does this for supply chain. As a founder and CEO of Supply Chain Insights and the author of enterprise software blog, Lora travels the world to chart the course of supply chain practices and disruptive technologies. Her blog focuses on the use of enterprise applications to drive supply chain excellence. She is also busy writing a book, Bricks Matter, that will be published in the fall of 2012.

Lora is known as a supply chain visionary who understands software. She brings seven years of industry analyst expertise coupled with two decades of manufacturing, marketing, and software expertise. Publications such as The...

Jacomo Corbo
Jacomo Corbo (QuantumBlack), @jacomocorbo

Jacomo Corbo is the chief scientist for QuantumBlack, a visual analytics firm that helps clients meet the analysis challenges of big data to make better decisions. Corbo is also the Canada Research Chair in Information and Performance Management at the University of Ottawa, and a Wharton Clayright Scholar at the University of Pennsylvania’s Wharton School of Business. His research has been funded by grants from the National Research Council, the Alfred P. Sloan Foundation, the Wharton Mack Center for Technological Innovation, the Wharton Customer Analytics Initiative, as well as by companies such as GE Finance and IBM.

Between January 2006 and June 2008, Corbo served as race strategist and subsequently as chief race strategist for the Renault F1 Team Ltd.

Corbo holds a Ph.D....

Marilyn Craig
Marilyn Craig (Logitech)

Marilyn Craig is Senior Director of Worldwide Sales & Marketing Planning and Analysis at Logitech. She has a wealth of in-depth, real world experience in retail channels and consumer insights, particularly in the intersection of SCM with sales operations, for some of the most well-known consumer goods and electronic companies in the world, including Logitech, Hewlett-Packard, and Intuit. She is also a member of the Downstream Data Share Group for Consumer Goods Technology.
At Logitech, Marilyn is building a worldwide team to own and manage the advanced analytics, long-term planning, and enabling processes and tools for both the sales and marketing organizations which will feed demand data into SCM processes and operations. At Intuit, Marilyn’s go-to-market and retail channel strategies for Intuit’s...

Terence Craig
Terence Craig (PatternBuilders)

Terence Craig is CEO and CTO of PatternBuilders, a big data analytics company that produces advanced applications for financial services, retail and other data-intensive industries.

Terence has an extensive background in building, implementing, and selling analytically-driven enterprise applications across such diverse domains as enterprise resource planning (ERP), retail sales channel optimization, professional services automation (PSA), and semiconductor process control and analytics in both public and private companies. He has been part of the ERP/SCM industry as it has evolved, from the VAX and HP 3000 to its current heyday of client-server, GUIs, and relational databases, and is looking forward to exploring what the next generation of solutions, powered by the Internet of Things and big...

Alistair Croll
Alistair Croll (Solve For Interesting), @acroll

Alistair Croll is an entrepreneur with a background in web performance, analytics, cloud computing, and business strategy. In 2001, he cofounded Coradiant (acquired by BMC in 2011) and has since helped launch Rednod, CloudOps, Bitcurrent, Year One Labs, and several other early-stage companies. He works with startups on business acceleration and advises a number of larger companies on innovation and technology. A sought-after public speaker on data-driven innovation and the impact of technology on society, Alistair has founded and run a variety of conferences, including Cloud Connect, Bitnorth, and the International Startup Festival, and is the chair of O’Reilly’s Strata Data Conference. He has written several books on technology and business, including the best-selling Lean Analytics. Alistair tries to mitigate his chronic ADD by writing...

Doug Cutting
Doug Cutting (Cloudera), @cutting

Doug Cutting is the chief architect at Cloudera and the founder of numerous successful open source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera from Yahoo, where he was a key member of the team that built and deployed a production Hadoop storage-and-analysis cluster for mission-critical business analytics. Doug holds a bachelor’s degree from Stanford University and sits on the board of the Apache Software Foundation.

Chris Deptula
Chris Deptula (OpenBI)

Chris is a senior consultant with OpenBI responsible for helping corporate customers integrate Pentaho solutions into their big data systems. He has 5 years of experience in business intelligence, data warehousing, and big data solutions. Prior to OpenBI, Chris was a consultant with FICO implementing marketing intelligence and fraud solutions. He has a degree in Computer and Information Technology from Purdue University and is an avid skier.

Anna Divoli

Anna Divoli holds a Master’s degree in Biosystems and Informatics from the University of Liverpool and a PhD in Biomedical Text Mining from the University of Manchester. For her doctoral research, Anna studied sentence extraction for semi-automatic annotation of biological databases. After her PhD, she carried out postdoctoral research, first in user search interfaces in the School of Information at the University of California at Berkeley and then in knowledge acquisition on cancer metastasis from expert opinions in the Department of Medicine at the University of Chicago. Her research focuses on developing methodologies for acquiring knowledge from textual data and studying the effect of human factors in that process. Anna joined Pingar in 2011 as Senior Software Researcher.

James Dixon
James Dixon (Pentaho)

As "Lord of the 1s and 0s" (CTO) at Pentaho, James Dixon is responsible for Pentaho’s architecture and technology roadmap. James has over 20 years of professional experience in software architecture, development and systems consulting. Prior to Pentaho, James held key technical roles at AppSource Corporation (acquired by Arbor Software which later merged into Hyperion Solutions) and Keyola (acquired by Lawson Software). Earlier in his career, James was a technology consultant working with large and small firms to deliver the benefits of innovative technology in real-world environments.

Leigh Dodds
Leigh Dodds (Kasabi), @ldodds

Open source and open data enthusiast. Product lead. Semantic Web developer.

Michael Driscoll
Michael Driscoll (Metamarkets)

Michael Driscoll has a decade of experience developing large-scale databases and predictive algorithms for digital media, financial, and life sciences firms. He is the CEO and co-founder at Metamarkets, and Chairman of Dataspora LLC, a big data & analytics consultancy he founded in 2007. Previously, he founded the online retailer,, and worked as a software engineer for the Human Genome Project. Michael holds a Ph.D. in Bioinformatics from Boston University and an A.B. from Harvard College.

Michael tweets at medriscoll and blogs at Data Utopian.

Gary Dusbabek
Gary Dusbabek (Rackspace)

An Apache Cassandra committer, Gary Dusbabek is a life-long programmer specializing in distributed systems. His past experience includes working with large-scale text and image indexes in the newspaper industry and high-volume advertisement booking software. He currently works on the Cloud Monitoring team at Rackspace.

Michael Edgcumbe
Michael Edgcumbe (Columbia University), @noisederived

Michael is an Eagle Scout in the Boy Scouts of America. He graduated as a Joseph Wharton Scholar from the Wharton School at the University of Pennsylvania. He has pursued overlapping careers in home remodeling, art direction, small business consulting, systems administration, and data analysis. He also was a Mac Genius. Before pursuing his master’s degree, he worked as an analyst for Meg Shope-Koppel, the Director of Research at the Philadelphia Workforce Investment Board. Michael graduated from New York University’s Interactive Telecommunications Program in the Spring of 2011. He currently serves as the Information Architect for the Institute on Medicine as a Profession at Columbia University.

Jonathan Ellis
Jonathan Ellis (DataStax), @spyced

Jonathan is CTO and co-founder at DataStax. Prior to DataStax, Jonathan worked extensively with Apache Cassandra while employed at Rackspace. Prior to Rackspace, Jonathan built a multi-petabyte, scalable storage system based on Reed-Solomon encoding for backup provider Mozy. In addition to his work with DataStax, Jonathan is project chair of Apache Cassandra.

Tim Estes
Tim Estes (Digital Reasoning)

I am the Chairman and CEO of Digital Reasoning Systems. Digital Reasoning builds data analytic solutions based on a distinctive, patented mathematical approach to understanding natural language and leverages bleeding edge cloud technologies to make these analytics work effectively on vast amounts of data. The value of Digital Reasoning is not only the ability to leverage an organization’s existing knowledge base, but also to reveal critical hidden information and relationships that may not have been apparent during manual or other automated analytic efforts. Our products are presently used in connection with large and deployed defense intelligence systems in support of intelligence analysts in the Federal Government.

Sage Weil

Sage Weil designed Ceph in 2004 as part of his PhD research in Storage Systems at the University of California, Santa Cruz. Since graduating, he has continued to refine the system with the goal of providing a stable next generation distributed file system for Linux. Prior to his graduate work, he cofounded New Dream Network, the company behind, a Los Angeles-based web hosting company.

Fernanda Foertter
Fernanda Foertter (Genus plc)

Fernanda Foertter is a self-described “computer geek interested in Big Data analyses using High Performance Computing.” She has a background in Particle Physics simulations, Molecular Dynamics and Quantum Chemistry Simulations, and more recently Bioinformatics at Genus PLC. Other interests include application development, parallelization techniques and Big Data curation. At Genus she’s responsible for keeping the clusters running and helping scientists make full use of available technology or bringing in new technology to reach their goals. She holds a BS in Physics from Florida International University and an MS in Materials Science Engineering from the University of Florida.

Steve Francia
Steve Francia (10gen)

Steve Francia leads the public side of the engineering organization at 10gen, including integration, evangelism, support and consulting. Steve brings to this role his experience as VP of engineering at OpenSky, where he built the world’s first ecommerce site powered by MongoDB and one of the first PHP sites backed by MongoDB. Steve has been an engineer, entrepreneur and executive since 1995, when he built one of the first ecommerce sites while working for American Telecom. His previous roles include CIO/COO at Portero, VP of Development at Takkle and Founder & CTO of Supernerd. Steve loves open source. He has contributed to dozens of open source projects including MongoDB, Doctrine, Symfony2 and Zoop and has started a few of his own....

Max Gadney
Max Gadney (After The Flood), @maxgadney

Max Gadney founded After the Flood to help companies communicate data better. Current clients include the BBC, Edelman and Manchester City Football Club. A passion for information design has been a consistent theme throughout his life and career. At the BBC, Max was the Head of Design and Audience Insight at BBC News Online from 2000-2007. The team won 11 Webbys and the Society of News Design President’s award for election night data visualisation. After that he joined the BBC TV Digital Commissioning team. His most recent commission there was BBC Dimensions, part of the NYC MOMA ‘Talk To Me’ show in 2011. After a brief stint in market research,...

Ben Gimpert
Ben Gimpert (Altos Research)

Ben was a professional software developer for ten years, and a BBS scenester in the mid-nineties. He is also one of those annoying former quants. Ben’s past clients include investment banks like JPMorgan Chase and Credit Suisse, the hedge fund Natura Capital, and EdF Trading, an energy trading house. He built a taxonomy browser for Encyclopaedia Britannica in 2004, and previously worked for ThoughtWorks as a convert to agile software engineering.

Ben teaches and speaks on machine learning, software engineering, financial analysis, and the culture of quants. While living in London, Ben was an early contributor to the grassroots cartography project OpenStreetMap. He continues to manage a portfolio of financial assets via a quantitative trading strategy built upon sentiment and predictive analytics. He has...

Fabien Girardin
Fabien Girardin (BBVA Data & Analytics)

Fabien Girardin (PhD) is the co-founder of Lift Lab, a research agency that helps companies and institutions understand, foresee and prepare for changes triggered by technological and social evolutions. He is particularly active in the domains of user experience, data science and urban informatics. His research mixes qualitative observations with quantitative data analysis to gain insights from the integration and appropriation of technologies in urban environments. Subsequently, he exploits the gained knowledge with engineering techniques to prototype and evaluate concepts and solutions for mobile network operators, urban and location-based services providers, city planners and decision makers.

Hjalmar Gislason

Hjalmar is a serial entrepreneur, founder of three startups in the gaming, mobile and web sectors since 1996. Prior to launching DataMarket, Hjalmar worked on new media and business development for companies in the Skipti Group (owners of Iceland Telecom) after their acquisition of his search startup – Spurl. DataMarket is based largely on his vision of the need for a global exchange for structured data.

Josh Gold (e22 Alloy)

Josh is currently CEO of e22 Alloy, a start-up operating a cloud-based Workforce Telemetry software service.

Josh has over 15 years of entrepreneurial experience in the Internet software industry. He started his career in 1995 by founding an early interactive design and advertising agency. The company created the first Web sites for Sony Computer Entertainment, Nissan Motors, DreamWorks, and Energizer and later went on to provide interactive marketing services to numerous Fortune 500 companies and high profile Internet start-ups.

Josh later served as a General Manager in Residence at eCompanies, a Santa Monica-based business incubator and venture capital firm, where he was tasked with evaluating and launching several start-up businesses and acted as CEO for 3 early-stage companies...

Ben Goldacre
Ben Goldacre (Bad Science), @bengoldacre

Ben is a best-selling author, broadcaster, medical doctor and academic who specialises in unpicking dodgy scientific claims from drug companies, newspapers, government reports, PR people and quacks. Unpicking bad science is the best way to explain good science.

Bad Science (4th Estate) has sold over 400,000 copies, is published in 18 countries, and reached #1 in the UK paperback non-fiction charts. His book exposing bad behaviour in the pharmaceutical industry will be published in 2012 by 4th Estate.

Ben has written the weekly Bad Science Column in the Guardian since 2003. It’s archived on this site along with blogposts, columns for the British Medical Journal, and other writing.

There are lots of clips of Ben on telly here, and a talk at...

Jonathan Gosier
Jonathan Gosier (AuDigent), @jongos

Jon Gosier is a serial tech entrepreneur and venture capitalist working at the intersection of data science and design. Based in Philadelphia, Jon is also the cofounder of Predictive Pop (aka PredPop), a data company changing the way the music industry monitors and monetizes music. Prior to PredPop, in his career as a data scientist, Jon spearheaded big data projects for various multinational organizations where tech platforms were used to serve millions of people in developing countries. During that time, his many innovations were deployed by Google, the US Department of State, the US Army, the United Nations, the Red Cross, FEMA, the government of Canada, and the Kenyan disaster-response organization Ushahidi. Jon is also a successful venture capitalist. After developing a successful model for...

Alexander Gray
Alexander Gray (Skytree, Inc.), @skytreeHQ

Dr. Gray obtained degrees in Applied Mathematics and Computer Science from Berkeley and a PhD in Computer Science from Carnegie Mellon, and is an Associate Professor at Georgia Tech. His lab works to scale up all of the major practical methods of machine learning (ML) to massive datasets. He began working on this problem at NASA in 1993 (long before the current fashionable talk of “big data”). His large-scale algorithms helped enable the Top Scientific Breakthrough of 2003, and have won a number of research awards. He is a member of the National Academy of Sciences Committee on the Analysis of Massive Data and frequently gives invited tutorial lectures on massive-scale ML at top research conferences and agencies.

Josh Green
Josh Green (Panjiva)

Josh Green conceived of the Panjiva solution in 2005 after seeing first-hand just how difficult it is to find good overseas suppliers. Josh is a veteran of The Boston Consulting Group and has master’s degrees from Harvard’s JFK School of Government and Harvard Business School, where he graduated as a Baker Scholar.

Stefan Groschupf

Stefan Groschupf is the co-founder and CEO of Datameer and is well known for his entrepreneurial accomplishments in data management and large-scale distributed computing.

Before Datameer, Stefan was the co-founder and CEO of Scale Unlimited, a leading provider of educational and consulting services for Hadoop and related technologies with proven success in companies such as HP, Sun, Apple, Deutsche Telekom and Nokia. Earlier, Stefan was CEO of 101Tec, a supplier of Hadoop and Nutch-based search and text classification software to industry-leading companies such as DHL and EMI Music. Stefan has also served as CTO at Thinglink, a developer of social objects used in social interaction design, as well as at Sproose, a social search engine company.

William Gunn
William Gunn (Mendeley Research Networks), @mrgunn

Stem cell biologist by training, work for Mendeley to disrupt scholarly communication and change how research is done, live in San Diego with wife and dog.

Mark Hahnel
Mark Hahnel (figshare), @figshare

Mark is the founder of FigShare, an open data tool that allows researchers to publish all of their data in a citable, searchable and sharable manner. He’s fresh out of academia, having just completed his PhD in stem cell biology at Imperial College London, having previously studied genetics in both Newcastle and Leeds. He is passionate about open science and the potential it has to revolutionise the research community. You can follow him at @figshare.

Martin Hall
Martin Hall (Karmasphere)

Martin Hall is co-founder, Chairman & Executive Vice President of Corporate Development at Karmasphere. He brings a strong entrepreneurial track record and a history of pioneering new Internet technologies and markets. Prior to founding Karmasphere, Martin was a founder of Aventail, a leading computer security company acquired by SonicWall. Prior to that, he was the founding CEO of Stardust, an Internet technology services company sold to Penton Media. Martin has chaired and participated in a number of industry groups including WinSock, Quality of Service, Internet Multicast and Wireless Multimedia Forums. He holds a Masters of Computer Science from Staffordshire University in Stafford, England.

Nick Halstead
Nick Halstead (DataSift), @nik

Nick Halstead is the Founder of DataSift Inc., the real-time social media data-filtering platform. During the past five years, Nick has been a foremost technical visionary on the power of social data to revolutionize information delivery. Nick founded TweetMeme, the leading platform delivering social news, which quickly built an audience of millions in 30 countries. TweetMeme also invented the highly successful Retweet button, which serves more than 30 billion clicks per month and drives high volumes of traffic for Twitter. Nick is a regular speaker at events such as TechCrunch Disrupt, Le Web, Future of Web Apps, The Next Web and Strata and has spoken at SXSW and FOWA.

Felix Hamilton (e22 Alloy)

Mr. Hamilton is currently the chief architect of e22 Alloy, a start-up operating a cloud based Workforce Telemetry software service.

He has been involved in academic, commercial, and industrial R&D and various entrepreneurial activities since 1988. His work experience includes a wide range of hardware and software research and development efforts. Some of his previous software development experience includes the development of neural simulations based on mathematical models derived from collected physiological data, distributed real time software systems, distributed databases, large scale network based software systems, telephony, device drivers, simulations utilizing intelligent agents, web applications, and ERP implementations. Interesting problems, he says, are never too hard to find.

Usman Haque
Usman Haque (, @uah

Usman Haque is the founder of, a real-time data infrastructure for the Internet of Things used by tens of thousands of people around the world (acquired by LogMeIn Inc in 2011). Trained as an architect, he has created responsive environments, interactive installations, digital interface devices and dozens of mass-participation initiatives. His skills include the design and engineering of both physical spaces and the software and systems that bring them to life. He received the 2008 Design of the Year Award (interactive) from the Design Museum, UK, a 2009 World Technology Award (art), a Wellcome Trust Sciart Award, a grant from the Daniel Langlois Foundation for Art, Science and Technology, the Swiss Creation Prize, Belluard Bollwerk International, the Japan Media Arts Festival Excellence prize and... Read More.

Marti Hearst
Marti Hearst (UC Berkeley)

Dr. Marti Hearst is a professor in the School of Information at UC Berkeley, with an affiliate appointment in the Computer Science Division. Her primary research interests are user interfaces for search engines, information visualization, natural language processing, and empirical analysis of social media. She has recently completed the first book on Search User Interfaces.
Prof. Hearst received BA, MS, and PhD degrees in Computer Science from the University of California at Berkeley, and she was a Member of the Research Staff at Xerox PARC from 1994 to 1997.
Prof. Hearst has served on the Advisory Council of NSF’s CISE Directorate and was co-chair of the Web Board for CACM. She is a member... Read More.

Amy Heineike

Amy Heineike is the vice president of product engineering at Primer, where she leads teams to build machines that read and write text leveraging natural language processing (NLP), natural language generation (NLG), and a host of other algorithms to augment human analysts. Previously, she built out technology for visualizing large document sets as network maps at Quid. A Cambridge mathematician who previously worked in London modeling cities, Amy is fascinated by complex human systems and the algorithms and data that help us understand them.

J. C. Herz
J. C. Herz (Ion Channel), @jcherz

JC Herz is cofounder and COO at Ion Channel, a data and microservices platform that automates situational awareness and enables risk management of the software supply chain. She has 15 years of analytics experience in healthcare and national security. JC was a White House special consultant to the Pentagon’s CIO office and coauthored the DoD’s open technology development roadmap. A published author, she has been contributing to Wired magazine since 1993.

Alexander Howard
Alexander Howard (O'Reilly Media), @digiphile

Alexander B. Howard is the Government 2.0 Correspondent for O’Reilly Media, where he reports on technology, open government and online civics. Before joining O’Reilly, Howard was the associate editor of at TechTarget. His work there focused on how regulations affect IT operations, including issues of data protection, privacy, security and enterprise IT strategy. Before moving the focus of his coverage to cybersecurity, online privacy and compliance, Howard was the associate editor of, an online IT encyclopedia. In that role, he researched and wrote about nearly every aspect of enterprise IT, including the impact of social software on business and the media. In his spare time, he practiced writing about himself in the third person, with mixed results. Howard’s work experience also includes working... Read More.

Jeremy Howard
Jeremy Howard ( | USF | and, @jeremyphoward

Jeremy Howard is President and Chief Scientist at Kaggle. Previously, he founded FastMail (sold to Opera Software) and Optimal Decisions (sold to ChoicePoint – now called LexisNexis Risk Solutions). Prior to that he worked in management consulting, at McKinsey & Company and A.T. Kearney. Jeremy’s passion is applying algorithms to data. At FastMail he used algorithms to automate nearly every part of the business – as a result the company only needed a total of 3 full time staff, and got over a million signups. Optimal Decisions was a business entirely built to commercialise a new algorithm he designed for the optimal pricing of insurance. Jeremy competes regularly in data mining competitions, which he uses to test himself and stay on the leading edge of... Read More.

Michael Hugos
Michael Hugos (Center for Systems Innovation [c4si]), @MichaelHugos

Michael Hugos is an author, speaker and principal at Center for Systems Innovation [c4si]. He specializes in elegant solutions to complex problems in supply chains, business intelligence, and new business ventures. Previously he spent six years as chief information officer (CIO) of a national distribution organization where he developed a suite of supply chain and business intelligence systems that transformed the company’s operations and revenue model. He won the CIO 100 Award, InformationWeek 500 award and the Premier 100 Award for this work. He earned his MBA from Northwestern University’s Kellogg School of Management. His newest book explores the intersection between massively multi-player online games (MMOs) and real life business operations. It’s titled Serious Games: The Future of Work; it... Read More.

Claire Hunsaker
Claire Hunsaker (Samasource), @chunsaker

Claire works at Samasource, a San Francisco-based social enterprise that connects people living in poverty with internet-based work through a proprietary platform. At Samasource, her hats have included leading product, strategy and field expansion, but these days she helps clients connect with Samasource data and content solutions as the head of Sales and Marketing. Her prior gigs have included LiveOps, social enterprise in rural Vietnam, and management consulting with Katzenbach Partners, where she led client teams at large technology companies and helped several non-profits with large-scale operational growth.

Claire holds a BA from Columbia, an MA from the University of London, and an MBA from Stanford.

In her spare time, she knits, plays with Drupal, and sets small fires in her kitchen.

Noah Iliinsky
Noah Iliinsky (Amazon Web Services), @noahi

Noah Iliinsky is the co-author of Designing Data Visualizations and technical editor of, and a contributor to, Beautiful Visualization, published by O’Reilly Media.

He has spent the last several years thinking about effective approaches to creating diagrams and other types of information visualizations. He also works in interface and interaction design, all from a functional and user-centered perspective. Before becoming a designer he was a programmer for several years.

He has a master’s in Technical Communication from the University of Washington, and a bachelor’s in Physics from Reed College.

Francis Irving
Francis Irving (ScraperWiki Ltd.), @frabcus

Francis Irving, CEO of ScraperWiki, is a computer programmer living in Liverpool, UK.

He was founding developer at mySociety, which over the last 8 years has made the world’s most innovative democracy websites. In 2004, TheyWorkForYou was the first website to scrape a Parliament and make a better interface for citizens, inspiring the Sunlight Foundation.

Other sites Francis helped make at mySociety include: FixMyStreet, the first national interface for reporting graffiti, potholes etc.; WhatDoTheyKnow, the first interface for making Freedom of Information requests in public.

In his earlier career, Francis founded developer tool TortoiseCVS, which with its successors is used by tens of millions of people. He has a first class degree in Maths from Oxford... Read More.

Ryan Ismert (Sportvision, Inc)

Ryan Ismert is Sportvision’s General Manager for Augmented Reality. Prior to assuming his latest role, he spent eight years helping to lead the Sportvision engineering team as Director of Engineering. He has an extensive background in computer graphics and computer vision, and graduated from Cornell University with an MS in Architectural Science. Ryan is a frequent speaker at Silicon Valley augmented reality events.

Pervinder Johar
Pervinder Johar (CCC Information Services)

Pervinder Johar is Executive Vice President and Chief Technology Officer, Products & Technology, at CCC Information Services. Pervinder joined CCC in 2011 and is responsible for CCC’s global product and technology strategy. His organization includes: Marketing & Product Strategy, Program Management & QQ, Architecture, Customer Technology Solutions and Product & Technology – Investment & Operations.
Previously, Pervinder was Chief Supply Chain Architect at Hewlett-Packard, responsible for transforming HP’s supply chain into one global operation, and CTO and Executive Vice President of Global Research and Development at Manhattan Associates. His experiences as both a solution provider and architect inform a unique perspective on the SCM challenges and issues that companies are facing in the age of Big Data.

Avinash Kaushik
Avinash Kaushik (Market Motive), @avinash

Avinash Kaushik is the co-Founder of Market Motive Inc and the Digital Marketing Evangelist for Google. His prior professional experience includes key roles at Intuit, DirecTV, Silicon Graphics in the US & DHL in Saudi Arabia.

Through his blog, Occam’s Razor, and his best selling books, Web Analytics: An Hour A Day and Web Analytics 2.0, Avinash has become recognized as an authoritative voice on how marketers, executive teams and industry leaders can leverage data to fundamentally reinvent their digital existence.

Avinash puts a common sense framework around the often frenetic world of web analytics and combines that with the philosophy that investing in talented analysts is the key to long-term success. He passionately advocates customer centricity and leveraging bleeding edge competitive intelligence... Read More.

Siraj Khaliq
Siraj Khaliq (The Climate Corporation)

Siraj founded The Climate Corporation (formerly WeatherBill) in 2006, having previously worked at Google in multiple technical lead roles, from the company’s distributed computing infrastructure to the high-profile Google Book Search project and other offline content search initiatives. Siraj obtained an M.S. degree in Computer Science from Stanford University, and a B.A. (Hons.) in Computer Science from the University of Cambridge, England. While at Stanford, he was also a lead software architect for the popular Folding@Home distributed computing project.

Asad Khan
Asad Khan (Microsoft)

Asad Khan is a Senior Lead Program Manager in the Business Platform Division working on the Hadoop and Entity Framework projects. He is responsible for helping build a JavaScript library that enables better data-centric web applications and, more recently, for exploring the space around Big Data and JavaScript. He has spent the last few years working on the next generation data access technologies from Microsoft. Asad holds a master’s degree from Stanford University.

Ed Kohlwey
Ed Kohlwey (Booz Allen Hamilton), @ekohlwey

Edmund Kohlwey is a developer and data scientist at Booz Allen Hamilton. For the last three years, he has helped government clients adopt and develop their big data capabilities across many different problem domains.

Philip (Flip) Kromer

I’m building tools to organize, connect and comprehend massive information streams.

Ken Krugler
Ken Krugler (Scale Unlimited), @kkrugler

Veteran developer and entrepreneur, 25+ years experience. Founder and President of TransPac Software, a 20 year leader in internationalization, mobile devices, and search consulting. Founder and CTO of Krugle, a vertical search engine and enterprise appliance for code and technical information. Co-founder of Bixo web mining project. Committer for the Apache Tika project. Author and speaker on vertical search and web mining.

Coco Krumme
Coco Krumme (Haven | UC Berkeley)

Coco Krumme heads the data team at Haven and is an adjunct faculty member in the UC Berkeley master’s in data science program.

Peter Kuhn (Scripps Physics Oncology)

Dr. Kuhn is a scientist and entrepreneur with a career long commitment in personalized healthcare and individualized cancer patient care.

Dr. Kuhn is a Director of the Scripps Physics Oncology Center where he is developing the concepts for lifelong diagnostic companions for cancer patients. Leveraging the fluid phase of solid tumors the Scripps Physics Oncology Center is advancing daily the forefront of both improving healthcare effectiveness by providing drug guidance and increasing our understanding of cancer as a disease in each individual patient.

Dr. Kuhn is a physicist who trained at the Julius Maximilians Universität Würzburg, Germany, before receiving his Masters in Physics at the University of Albany, Albany, NY in 1993 and his Ph.D. in 1995. He then moved to Stanford University where he... Read More.

Robert Lancaster
Robert Lancaster (Orbitz Worldwide), @rob1lancaster

Rob Lancaster has been in software development for the last 13 years, developing solutions for the travel industry. He is currently a Solutions Architect for Orbitz with a focus on applying predictive analysis to improve the performance of Orbitz hotel systems. He is the organizer of Chicago’s Machine Learning meetup group and an organizer for Chicago’s Big Data user group.

Kin Lane
Kin Lane (API Evangelist), @kinlane

Kin Lane is a unique blend of IT, data, programming, product development, business development, and online and social media marketing. He spends his days helping application developers understand what is possible with mobile and web application development using APIs and focuses on studying the best practices when it comes to the business of APIs.

Gary Lang (MarkLogic), @garylang

Gary Lang is the senior vice president of engineering for MarkLogic. Lang is a proven leader with more than two decades of experience delivering large, complex products and systems, architectural design and direction setting for high-revenue software projects. Lang is responsible for all of MarkLogic product development.

Lang comes to MarkLogic from Microsoft, where he was a leader in the development of the next version of Visual Studio. Prior to Microsoft, Gary was vice president of platforms and global engineering at Autodesk, where he led an organization of 1,200 employees worldwide providing platform and product engineering for Autodesk’s core products as well as new software and services for emerging businesses. His organization was responsible for developing code for almost all of Autodesk’s desktop and SaaS products,... Read More.

Robert Lefkowitz

Robert (a/k/a r0ml) Lefkowitz is a computer professional and amateur philosopher. He has worked primarily in large IT organizations where he facilitates information flows. His interests include semasiology and medieval history. He also juggles clubs.

Luke Lonergan
Luke Lonergan (Greenplum, a division of EMC), @lonerganluke

A co-founder of Greenplum, Luke served as CTO of the organization and continues in this role for the Greenplum Division. Prior to Greenplum, Luke founded Didera, a database clustering company, in 2000 and served as CEO and Chairman. Luke’s background includes 16 years of management experience in computing technology ranging from innovations in supercomputing to advances in medical imaging systems. Most recently, he directed data center integration at High Performance Technologies Inc (HPTi), scaling the business to $30M, and setting industry firsts in parallel computing subsequently adopted by IBM and Compaq. Previously he held management positions at Northrop Grumman Corporation. He holds an M.S. in Aeronautics and Astronautics from Stanford University and a B.E. in Mathematics from Vanderbilt University.

Piyush Lumba
Piyush Lumba (Microsoft), @piyushlumba

Piyush Lumba runs product management for Azure Data Services, a set of higher level data services such as the Azure Marketplace aimed at helping customers connect with and contribute to the world’s data. Prior to this he ran product planning for SQL Azure and Azure AppFabric Services, core elements of Microsoft’s Cloud Computing offering. Piyush has been at Microsoft for eight years, and previously worked in business development, business management, product management and strategy roles for the Forefront security product line. Prior to Microsoft, Piyush worked in the Telecom/Data communications industry in systems architecture and software development roles across a range of companies including Lucent Technologies, AT&T Bell Laboratories and venture-backed startups. Piyush has a B.S. Computer Science from the Indian Institute of Technology,... Read More.

Jock Mackinlay
Jock Mackinlay (Tableau Software)

Jock Mackinlay is Tableau Software’s Senior Director of Visual Analysis. At Stanford University he pioneered the automatic design of graphical presentations of relational information. He joined Xerox PARC in 1986, where he collaborated with the User Interface Research Group to develop many novel applications of computer graphics for information access, coining the term “Information Visualization.” Many of the fruits of this research can be seen in his book, “Readings in Information Visualization: Using Vision to Think.” Jock has a Ph.D. in computer science from Stanford University.

Mark Madsen
Mark Madsen (Teradata), @markmadsen

Mark Madsen is a Fellow at Teradata, where he’s responsible for understanding, forecasting, and defining analytics ecosystems and architectures. Previously, he was CEO of Third Nature, where he advised companies on data strategy and technology planning, and vendors on product management. Mark has designed analysis, machine learning, data collection, and data management infrastructure for companies worldwide.

Kuntal Malia
Kuntal Malia (ModCloth)

Kuntal Malia is the Analytics Lead at ModCloth and is focused on using data and quantitative modeling to drive customer-centric decision making. In the last five years, she has worked with business teams to address key business questions by identifying meaningful trends, developing predictive models, as well as conducting experiments and tests that led to actionable insights.

Mano Marks
Mano Marks (Google, Inc.)

Mano joined Google’s Geo API team in 2006. He helps people all over the world develop and deploy their content in KML and Google Maps, working with large companies, small startups, and international aid organizations. Before coming to Google, Mano had an eclectic career that involved getting a Masters in History, a Masters in Information Management and Systems, and working as a data manager in social service and public benefit organizations for over a decade.

Ana Martinez
Ana Martinez (CityGrid Media)

Ana Martinez is the Senior Product Director for CityGrid Media where she is responsible for the development of the Ad, Place, and Content APIs and the Ad Center and Developer Center applications. Previously, Ms. Martinez served as the Director of Engineering – Technology Director for SpotRunner where she was responsible for the creation and tactical execution for the company’s online platform for media buyers. During her tenure with SpotRunner, she directed the design and development of their platform’s intellectual property. Prior to joining SpotRunner, Ms. Martinez was Engineer Manager – Product Manager for Strix Systems, where she defined and led development for all the company’s device and network management software. Ms. Martinez also held engineering positions at Candle Corp, Infocorp, Microsoft subsidiary in Uruguay, South America... Read More.

Nathan Marz

Nathan Marz is the lead engineer on Twitter’s Publisher Analytics team. He was previously the lead engineer at BackType before it was acquired by Twitter in July of 2011.

Nathan is the author of numerous open-source projects relied upon by companies all around the world. These include Cascalog, ElephantDB, and Storm.

He has spoken about his work at conferences such as the Hadoop Summit, Strange Loop, Gluecon, Clojure/conj, and POSSCON. He writes a blog at

Betsy Masiello
Betsy Masiello (Google)

Betsy Masiello is a Policy Manager on Google’s public policy team. As part of her work at Google she is one of the leads for Google’s privacy efforts and for analyzing Google’s and the Internet’s impact on the economy. Prior to joining Google she was a consultant at McKinsey & Company, where she served global telecommunications companies on new business strategies around emerging technology. Masiello holds a BA in Computer Science from Wellesley College, a MSc in Economics from Oxford where she was a Rhodes Scholar, and an SM from MIT’s Technology & Policy Program.

Mike Maxey
Mike Maxey (Greenplum)

Mike has the responsibility of outbound marketing for the Greenplum product portfolio, a comprehensive platform that is driving the future of Big Data Analytics. Previous to Greenplum, Mike was the Senior Director of Product Management for ParaScale, a parallel distributed file system company now owned by Hitachi Data Systems. Prior to ParaScale, Mike held product management roles at EMC Rainfinity and McDATA.

Nathan McCall
Nathan McCall (Apache Cassandra), @zznate

Nate McCall has over 10 years of server side systems and software development experience. He currently heads the DataStax Enterprise Platform development team for DataStax and is also the lead developer and release manager of the open source Hector client for Apache Cassandra.
Nathan also has conducted Apache Cassandra training sessions for DataStax customers and has presented at conferences such as Oracle’s JavaOne.
Nathan is also co-founder of the Cassandra-Austin meetup group for users of Apache Cassandra based in Austin, TX.

Q McCallum
Q McCallum (@qethanm)

Q Ethan McCallum is a consultant, writer, and technology enthusiast, though perhaps not in that order. Most recently put the finishing touches on Parallel R (O’Reilly).

Richard McDougall

Richard McDougall is the Application Infrastructure CTO and Principal Engineer in the Office of the CTO at VMware. He is responsible for driving advanced development and strategy for VMware’s application platform architecture – including the performance and integration of applications, runtimes, middleware, and application encapsulation technologies.

Richard is known as an expert in the areas of performance measurement and optimization, and in application deployment architectures.

Before the CTO office, as the Chief Performance architect Richard drove the performance strategy and initiatives to enable virtualization of high-end mission critical applications on VMware products.

Prior to joining VMware, Richard was a Distinguished Engineer at Sun Microsystems. During his 14 years at Sun, he was responsible for driving high performance and scalability initiatives for Solaris... Read More.

Alyona Medelyan

Alyona Medelyan holds a Master’s degree from the University of Freiburg and a PhD from the University of Waikato, both of which focused on Natural Language Processing. During her PhD Medelyan developed an open-source tool, Maui (Multi-purpose automatic topic indexing), that performs as well as professional librarians in identifying a document’s main topics. Maui is now used by companies and organizations around the world. Alyona has always been passionate about practical applications of her research, which led to internships at Google New York and Exorbyte Germany. She joined Pingar two years ago and now leads the research and development of API-based products that include semantic and faceted search, query analysis, text summarization, keyword extraction, entity and entity relations extraction.

Abhishek Mehta
Abhishek Mehta (Tresata), @ab_hi_

Abhishek is an expert in the areas of big data and consumer payments.

He is the co-founder of Tresata, a big data startup that helps companies identify their core data assets, manage, maintain and enhance the intrinsic value in them and build data factories and products to monetize that value.

Abhishek has over a decade of experience in various strategic and operational leadership roles in banking, technology and consulting. Abhishek is also a Member of the Faculty at one of the premier Retail Banking Management Programs in the US.

A featured speaker on these topics, Abhishek is a die-hard supporter of all things open source and is recognized in the industry as a visionary on how to create value by building, transforming (or disrupting) business eco-systems.

Sanjay Mehta
Sanjay Mehta (Splunk)

Sanjay Mehta, Senior Director of Product Marketing at Splunk, is responsible for developing and executing a market-driven product strategy for Splunk’s core product. In addition, Sanjay spearheads the Company’s focus on Big Data, helping customers understand how they can use their big machine data to gain operational intelligence through unprecedented insights in the areas of application management, IT operations, and web analytics. Sanjay’s role at Splunk leverages his 19 years of experience building, marketing and advising on enterprise software and information management solutions for the retail, communications and media industries. Prior to joining Splunk, Sanjay held key positions at Oracle, Portal Software and Sybase.

Richard Merkin
Richard Merkin (Heritage Provider Network)

Richard Merkin has more than 30 years of experience in the health care field. He has specific expertise in the development and administration of integrated physician systems. As the founder of Heritage Provider Network established in 1996, Dr. Merkin develops clinically focused networks to bring efficient and quality driven systems to the communities in which it operates by working with physicians and physician organizations, hospitals and integrated delivery systems, health plans, public and community-based health care entities, and other health care professionals.

Dr. Merkin is a visionary and a sought-after healthcare expert who encourages innovation and challenge. Responding to our country’s 2 trillion dollar health care crisis, Dr. Merkin created, developed and sponsored the 3 million dollar Heritage Health Prize for predictive modeling to save... Read More.

Tony Middleton
Tony Middleton (HPCC Systems from LexisNexis Risk Solutions)

Tony Middleton, Ph.D.
Sr. Architect, Data Scientist
LexisNexis Risk Solutions/HPCC Systems

Dr. Middleton has worked with the HPCC Systems technology platform and the ECL programming language for more than 11 years with all types of structured and unstructured data. He specializes in data research and developing new and innovative approaches to processing and using data. He has previously worked with other Big Data companies including Standard & Poor’s Compustat.

Eric Mika
Eric Mika (The Department of Objects)

Eric Mika makes art for machines.

David Miller
David Miller (LexisNexis)

Dave is a Senior Architect for LexisNexis Risk Solutions. He has 20 years of experience specializing in big data, primarily the full text databases of LexisNexis, and currently works on company analytics. Prior to LexisNexis, Dave built transaction processing systems for NCR. David has several patents granted and several more pending related to full text big data.

Chris Moody
Chris Moody (Gnip), @gnip

Chris Moody currently serves as the President and COO of Gnip, the leading provider of social media data for enterprise applications. In this role, Moody is responsible for the day-to-day execution of Gnip’s operations with direct responsibility for sales, marketing, finance, and business development.

Prior to joining Gnip, Moody served as Founder and President of Aquent On Demand, a leading provider of technology solutions for creative and marketing organizations. Prior to his responsibilities with Aquent On Demand, Moody served as Aquent’s Chief Operating Officer with responsibility for the day-to-day management of more than 700 employees across 70 offices in 17 countries. Before joining Aquent, Moody served in senior management and technology consulting roles with IBM, Oracle, and EDS where he led engagements... Read More.

JP Morgenthal

JP Morgenthal has over twenty-five years of information technology experience spread across a wide array of technology and business requirements and a demonstrated ability to design complete systems inclusive of business justifications and risk/reward analyses. Mr. Morgenthal communicates effectively with C-level, non-technical and engineering-level individuals in both written and spoken form and is a respected authority on Cloud Computing, Enterprise Architecture, and SOA/BPM. Mr. Morgenthal has authored three books; the most recent is “EII: A Pragmatic Approach”. JP is recognized as a Top 50 blogger on Cloud Computing and Virtualization by SYS-CON.

John Mulholland
John Mulholland (Fannie Mae)

John Mulholland is Fannie Mae’s Vice President of Enterprise Data Architecture (EDA) & Center of Excellence, reporting to the Senior Vice President and Chief Technology Architect. Mulholland leads the EDA Technology effort and all components of the company’s Data Center of Excellence, including technical data modeling, data quality, data architecture, data distribution, data access, data metrics, data integration, and test data. He also oversees data strategy, ensuring that Fannie Mae’s long-term vision for enterprise data aligns with the company’s goals, and leads efforts to define data relationships across systems and specify principles, blueprints, and standards for information interactions among systems.

Prior to joining Fannie Mae, Mulholland was Director and Global Head of Reference Data for RBC Capital Market (Royal Bank of Canada)... Read More.

Larry Murdock
Larry Murdock (TDWI)

Larry Murdock is Enterprise Architect at Sephora USA. In previous positions he has been Director of Data Services at Leapfrog, as well as involved with a variety of web startups. He holds Masters Degrees in Computer Science and Economics.

Arun Murthy
Arun Murthy (Cloudera)

Arun is the architect/lead of the next generation MapReduce project in Apache Hadoop. Arun is VP, Apache Hadoop, at the Apache Software Foundation i.e. the Chair of the Apache Hadoop PMC. He jointly holds the current world sorting record using Apache Hadoop. Prior to co-founding Hortonworks, Arun was responsible for all MapReduce code and configuration deployed across the 42,000+ servers at Yahoo!. In essence, he was responsible for running Apache Hadoop’s MapReduce as a service for Yahoo!. Follow Arun on Twitter: @acmurthy.

Jack Norris
Jack Norris (MapR Technologies), @Norrisjack

Jack Norris is the senior vice president of data and applications at MapR Technologies, where he works with leading customers and partners worldwide to drive the understanding and adoption of new applications enabled by data and analytics. With over 25 years of enterprise software experience, he has demonstrated success from identifying new markets to defining new products to launching companies. Jack’s background includes senior executive positions with establishing analytic, virtualization, and storage companies. Jack was an early employee of MapR Technologies and held senior executive roles with EMC, Brio Technology, and Bain and Company.

Mike Olson
Mike Olson (Cloudera), @mikeolson

Mike Olson cofounded Cloudera in 2008 and served as its CEO until 2013, when he took on his current role of chief strategy officer. As CSO, Mike is responsible for Cloudera’s product strategy, open source leadership, engineering alignment, and direct engagement with customers. Previously, Mike was CEO of Sleepycat Software, makers of Berkeley DB, the open source embedded database engine, and he spent two years at Oracle Corporation as vice president for embedded technologies after Oracle’s acquisition of Sleepycat. Prior to joining Sleepycat, Mike held technical and business positions at database vendors Britton Lee, Illustra Information Technologies, and Informix Software. Mike holds a bachelor’s and a master’s degree in computer science from the University of California, Berkeley.

Kumar Palaniappan

Kumar Palaniappan is an Enterprise Architect at NetApp where he leads efforts at adopting Hadoop technologies for strategic applications. Previously Kumar was an architect at Cisco Systems, responsible for large scale, mission critical architectures.

DJ Patil
DJ Patil (White House Office of Science and Technology Policy), @dpatil

DJ Patil is the chief data scientist and deputy chief technology officer for data policy at the White House Office of Science and Technology Policy, where he advises on policies and practices to maintain US leadership in technology and innovation, fosters partnerships to maximize the nation’s return on its investment in data, and helps to attract and retain the best minds in data science to serve the public. Since joining OSTP, DJ has collaborated with colleagues across government, including the chief information officer and the US Digital Service as part of the Obama administration’s commitment to open data and data science. He leads data science efforts related to the Precision Medicine Initiative, which focuses on utilizing advances in data and health care to provide... Read More.

Ross Perez
Ross Perez (Tableau Software), @tableau

Data geek and viz extraordinaire

Jacob is the cofounder & CTO of Weotta and the author of Python Text Processing with NLTK 2.0 Cookbook. He blogs at Streamhacker and has created both the NLTK Demos & APIs and NLTK-Trainer.

Claudia Perlich

Chief Scientist for M6D. PhD in Information Systems from NYU. Spent the last 5 years as a Sr. Researcher at IBM Research. Winner of the 2007, 2008, and 2009 KDD Cups.

Cheryl Phillips
Cheryl Phillips (The Seattle Times)

Cheryl Phillips is the Data Enterprise Editor for The Seattle Times and a former board president with Investigative Reporters and Editors, a national journalism training organization, where she served on the board for a decade. Phillips coordinates data-related enterprise journalism across the Seattle Times newsroom. She has edited a number of award-winning stories that made compelling use of data visualizations. One of the most recent was an investigation into the myth-busting reasons behind the foreclosure crisis, “Rescue From Foreclosure? Frustration, Anger Grow.” The joint project with The Seattle Times and ProPublica received The Gannett Award for Innovation in Watchdog Journalism and a first place award in the National Association of Real Estate Editors’ 61st annual journalism contest. She also was the sole journalist in the... Read More.

James Phillips
James Phillips (Couchbase, Inc.)

James Phillips, Couchbase Co-founder
James Phillips is a software geek, entrepreneur and investor. He taught himself to code on the Apple II and TRS-80 microcomputer platforms and founded his first software company, later acquired by Symantec, in 1985. He continues today as a serial software entrepreneur focused on distributed systems, cloud computing infrastructure and database software technologies. Most recently James co-founded Couchbase, provider of open source NoSQL database solutions; prior to that he was founder of Akimbi Systems, which was acquired by VMware in 2006. James is a frequent speaker at industry events including O’Reilly Web 2.0, HadoopWorld, Under the Radar, VMworld, OracleWorld, SD Forum, InfoWorld Executive Forum, CloudSlam, OSBC and others.


Antonio Piccolboni is a data scientist with both industrial and academic experience. His recent work includes the design and implementation of a big data analysis package in R, social network analysis for a top 20 global web site and web analytics for a major web ratings company. He is currently an independent consultant with clients including Dataspora and Revolution Analytics. He blogs about big data and analytics. His papers have received more than 800 citations and his Erdős number is 3.

Mark Pollack (SpringSource/VMware)

Dr. Mark Pollack has worked on Big Data solutions in High Energy Physics at Brookhaven National Laboratory and then moved to the financial services industry as a technical lead or architect for front office trading systems.

Always interested in best practices and improving the software development process, Mark has been a core Spring (Java) developer since 2003 and founded its Microsoft counterpart, Spring.NET, in 2004.

Mark now leads the Spring Data project that aims to simplify application development with new data technologies around Big Data and NoSQL databases.

Joris Poort
Joris Poort (Startup)

Joris has a broad engineering background, which he leveraged at Boeing to develop product development software for the 787 airplane. During his time at Boeing Joris’ work resulted in hundreds of millions of dollars in cost savings through improved designs and drastically reducing product development time. More recently after a stint at McKinsey, Joris has decided to utilize his technical and business experience in a new startup venture. The startup is developing a cloud enterprise software platform for engineering product development software.


  • The Boeing Company, 787 Program, Engineer
  • McKinsey & Company, High-tech Engagements, Associate
  • Startup, CEO / Founder


  • B.S. Mechanical Engineering, Applied Mathematics magna cum laude at University of Michigan
  • M.S. Aeronautics & Astronautics magna cum laude at University... Read More.

Jake Porway
Jake Porway (DataKind), @jakeporway

Jake Porway is the Data Scientist in the New York Times R&D Lab and is launching an initiative called Data Without Borders to unite non-profits and data scientists in the service of humanity. He spends his days analyzing large data flows to help redefine media and his nights finding uses of data for the greater good.

Jags Ramnarayan (VMware)

As the Chief Architect for GemFire products at VMware, Jags is responsible for the technology direction for its high performance distributed data grid. Jags has been a part of several Java standards: EJB and J2EE while at GemStone Systems, and XML-based JSRs like JAXM while at BEA. He also represented BEA in the W3C SOAP protocol specification. Jags has presented at several conferences in the past on data management, clustering and grid computing. He has over 20 years of experience, a bachelor’s degree in computer science and a master’s degree in management of science and technology.

Jan Reichelt
Jan Reichelt (Mendeley Ltd.)

Jan Reichelt is the co-founder and president of Mendeley, the world’s largest research collaboration platform. Mendeley helps people to organize and collaborate on research projects, making scientific research more accessible and transparent.

Katrin Ribant
Katrin Ribant (Havas Digital)

As the Global Director for Artemis, Katrin is responsible for Havas Digital’s proprietary marketing analysis platform. In this capacity, Katrin serves as an expert resource to Havas Digital on all matters related to digital media optimisation, campaign performance analysis and advanced attribution methodologies.

Prior to her current position, Katrin held various positions across several countries at Havas Media, including research, digital media planning and data analytics. Most recently, she served as the Director of Product Development for Artemis, developing market-leading analytics methodologies to assess the performance and efficiency of Havas Digital’s marketing efforts.

Katrin began her career at Futur Immediat, a web agency start-up, where she developed a deep understanding of web build and web analytics technologies. She gained further experience in digital marketing at... Read More.

Joseph Rickert
Joseph Rickert (Revolution Analytics)

I am a marketing manager at Revolution Analytics with a passion for analyzing data. I have worked at a number of successful Silicon Valley start-ups including Sytek, Alantec, Parallan Computer and Scotts-Valley Instruments. I have graduate degrees in both the Humanities and Statistics. I taught statistics briefly at SJSU and I blog at

Henry Robinson
Henry Robinson (Cloudera), @HenryR

Henry Robinson is a software engineer at Cloudera, where he works on their management and monitoring infrastructure for Apache Hadoop. He has a background in distributed systems, and is also a PMC member for the Apache ZooKeeper distributed coordination platform.

Monica Rogati
Monica Rogati (Data Natives), @mrogati

As one of the founding members of the LinkedIn data science team, Monica turns data into products, actionable insights and (news) stories.

Monica obtained her PhD in Computer Science from Carnegie Mellon, where she focused on text mining and applied machine learning. At LinkedIn, she pioneered data driven products with multi-million dollar business impact and is currently building mathematical models that power LinkedIn’s personalized recommendations. When she doesn’t name projects after Harry Potter, Monica finds stories in the LinkedIn data about the most overused buzzwords, trending job titles, entrepreneur DNA, promotion cycles for Millennials and first names that tend to succeed. Her stories appeared in thousands of media outlets – from the Wall Street Journal & The Economist to NPR & CNN... Read More.

Simon Rogers
Simon Rogers (Guardian)

Simon Rogers is editor of the Guardian’s Datablog and Datastore, an online data resource which publishes hundreds of raw datasets and encourages its users to visualise and analyse them. He is the author of Facts are sacred: the power of data available now on Kindle. Simon is also a news editor on the Guardian, working with the graphics team to visualise and interpret huge datasets. He was closely involved in the Guardian’s exercise to crowdsource 450,000 MP expenses records and the organisation’s coverage of the Afghanistan Wikileaks war logs. Previously he was the launch editor of the Guardian’s online news service and has edited the paper’s science section. He has edited two Guardian books: How Slow Can You Waterski and The Hutton Inquiry... Read More.

Dave Rubin (Oracle)

Dave Rubin recently joined Oracle from Cox Enterprises where he ran the Infrastructure Engineering organization responsible for developing big data systems in the Online Display Advertising vertical. Prior to this he ran the engineering teams at Rapt Inc, delivering Price Optimization and Inventory Forecasting solutions to online media companies. Dave started his career at Sybase where he worked on various parts of the database kernel including access methods, query optimization, resource management, and transaction management. He holds four U.S. patents in the areas of query optimization and advanced transaction models.

Jason is a Sr. Architect at Think Big Analytics. He has many years of experience writing Java application software, most recently for Hadoop-based applications.

Michael Rys
Michael Rys (Microsoft Corp.)

Michael Rys earned his PhD at the Swiss Federal Institute of Technology in Database Systems and did post-doctoral research into semistructured data and information integration at the Stanford University Database group. Ever since he left academia, he has worked at Microsoft as a program manager specializing in Database Systems, SQL and Beyond Relational Data and Services scenarios that include unstructured and semi-structured data management, Search, Spatial, XML and now NoSQL paradigms. He also serves as Microsoft representative to the W3C XQuery working group and the ANSI SQL standards committee. He has presented at many database conferences including VLDB, SIGMOD, SQLPASS, NoSQLNow and now the Strata conference. You can find several of his webcasts and presentations online and he... Read More.

Diego Saenz
Diego Saenz (Accenture), @diego_s

Managing Director with Accenture’s Digital Practice, my primary focus is on digital transformation and Artificial Intelligence. I lead global teams that deliver innovative digital solutions for some of the world’s biggest companies. I have worked in a variety of industries including high tech, consumer products, travel, automotive and retail.

Before joining Accenture, I built and sold two successful internet businesses. I received the Inc. 500 award for leading one of America’s fastest-growing private companies.

Marcel Salathé (Penn State University), @salathegroup

Marcel Salathé is a Branco Weiss: Society in Science Fellow and Assistant Professor of Biology in the Center for Infectious Disease Dynamics at Penn State University. He’s generally interested in how human dynamics affect disease dynamics (and vice versa).

Eddie Satterly
Eddie Satterly (Splunk)

Eddie has played a key role in big data adoption at his former employers and was a data scientist before it was cool. He plays a key role with Splunk in adoption of the product as well as in partnerships with the big data community.

Schwark Satyavolu (Truaxis)

Schwark is a co-founder and one of the original visionaries behind Truaxis (formerly BillShrink). As a result of his own family’s frustration with information overload and confusion when trying to switch cell phone plans, Schwark, a serial entrepreneur, started thinking about how to use web technology to solve this problem.

Schwark was previously an entrepreneur-in-residence with Bessemer Venture Partners (BVP), where he explored opportunities to leverage the Internet and innovative information aggregation techniques to solve real-world problems. Prior to his tenure with BVP, Schwark was with Yodlee for eight years, where he developed the technology, co-founded the company, and served in a variety of functions, including chief technology officer. Prior to this, Schwark worked at Microsoft, where he was responsible for designing and... Read More.

Theo Schlossnagle

Theo Schlossnagle is a Founder and Principal at OmniTI where he designs and implements scalable solutions for highly trafficked sites and other clients in need of sound, scalable architectural engineering. He is the architect of the highly scalable Ecelerity mail transport agent. Theo is a participant in various open source communities including OpenSolaris, Linux, Apache, PostgreSQL, perl, and many others. He is a published author in the area of scalability and distributed systems as well as a veteran speaker in the open source conference circuit.

Theo founded several successful startups as engineering focused organizations including: OmniTI, Circonus, Message Systems and Fontdeck.

Bill Schmarzo
Bill Schmarzo (EMC Consulting), @schmarzo

Bill Schmarzo
Global Competency Lead and CTO, Enterprise Information Management Practice
EMC Consulting

Bill Schmarzo has more than two decades of experience in data warehousing, BI and analytic applications. Bill authored the Business Benefits Analysis methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications... Read More.

Steve Schoettler

Steve Schoettler is Founder and CEO of Junyo, a learning analytics company creating tools to help teachers and students understand and improve academic success. As co-founder of Zynga, Steve helped introduce social gaming, virtual currencies, and real-time analytics on a massive scale. Prior to Zynga, Steve worked on innovative and scalable technologies in mobile, entertainment, distributed computing, and security. Steve holds a B.S. in Electrical Engineering and Computer Science from UC Berkeley.

Toby Segaran

Toby Segaran is the author of the O’Reilly titles, “Programming
Collective Intelligence” and “Programming the Semantic Web” and a
contributing editor of “Beautiful Data”. He frequently speaks on the
subjects of machine learning, collective intelligence and freedom of
data at conferences worldwide.

Toby previously worked as a Senior Data Scientist at Metaweb before it
was acquired by Google in 2010. He now works on large-scale data
reconciliation problems at Google. Prior to Metaweb he founded
Incellico, a biotechnology software company which was acquired in

Toby holds a B.Sc in Computer Science from MIT and is deemed a “Person
of Exceptional Ability” by the USCIS. He loves applying data-analysis
algorithms to everything... Read More.

Deepak Senapati

Josh Wills is the director of data science at Cloudera. Wills is one of the main contributors to Cloudera’s most recent open source project, Crunch, a Java library that aims to make writing, testing, and running MapReduce pipelines easy, efficient, and even fun.

Prior to joining Cloudera, Wills was a software engineer at Google. Josh holds a M.S.E. in operations research from the University of Texas and a BS in mathematics from Duke University.

Sam Shah (LinkedIn)

Sam Shah is a Principal Engineer in the Search, Network, and Analytics Team at LinkedIn, working on applied data products. He is the principal person behind “People You May Know,” LinkedIn’s people recommendation service, and LinkedIn’s collaborative filtering system. Sam holds a Ph.D. in Computer Science from the University of

Carter Shanklin has been a Product Manager at VMware for more than 4 years and focuses on VMware’s GemFire and SQLFire, both of which are horizontally scalable distributed databases focused on high-performance OLTP applications. Before VMware, Carter spent nearly 10 years as a software developer, and keeps sharp writing code in his spare time. Carter holds a M.S. in mathematics from George Mason University.

Vipul Sharma
Vipul Sharma (Eventbrite), @vipulsharma

Vipul Sharma leads the data discovery group at Eventbrite, where he and his team are working on problems like data platforms, recommendation systems, search, social graph mining, etc.

Before that, Vipul worked on big data problems on the web for quite a while. He spent many years fighting spam using machine learning and big data.

Tomer Shiran

Tomer Shiran is cofounder and CEO of Dremio, the data lake engine company. Previously, Tomer was the vice president of product at MapR, where he was responsible for product strategy, road map, and new feature development and helped grow the company from 5 employees to over 300 employees and 700 enterprise customers; and he held numerous product management and engineering positions at Microsoft and IBM Research. He’s the author of eight US patents. Tomer holds an MS in electrical and computer engineering from Carnegie Mellon University and a BS in computer science from the Technion, the Israel Institute of Technology.

Swaminathan Sivasubramanian (Amazon Web Services)

Swami Sivasubramanian works as a GM and Architect in Amazon Web Services where he builds large scale cloud computing platforms and also manages different groups within AWS Database services. Swami has built several large scale systems in the past. Some of the well known ones include Amazon Dynamo, Amazon CloudFront and Amazon RDS (including the replication engine for RDS that does synchronous replication with automated failover). He also wrote a major part of Amazon’s distributed lock service, which is used as a foundational building block for various Amazon service infrastructure.

Swami obtained his Ph.D. from Vrije Universiteit, Amsterdam from the Computer Systems Group headed by Andrew S. Tanenbaum and Maarten van Steen. Swami has authored more than 40 refereed journals and... Read More.

Pete Skomoroch

Pete Skomoroch is a Principal Data Scientist at LinkedIn focused on reputation systems, personalization, and creating data driven products like LinkedIn Skills. Before joining LinkedIn, he was the Director of Advanced Analytics at Juice Analytics and a Sr. Research Engineer at AOL Search. Prior to AOL, he implemented pattern detection algorithms for streaming sensor data at MIT Lincoln Laboratory and constructed predictive models for large retail datasets. Pete has a B.S. in Mathematics and Physics from Brandeis University and blogs at

Marc Smith
Marc Smith (Social Media Research Foundation), @marc_smith

Marc Smith is a sociologist specializing in the social organization of online communities and computer mediated interaction. Smith leads the Connected Action consulting group and lives and works in Silicon Valley, California. Smith co-founded the Social Media Research Foundation, a non-profit devoted to open tools, data, and scholarship related to social media research.

Smith is the co-editor with Peter Kollock of Communities in Cyberspace (Routledge), a collection of essays exploring the ways identity, interaction and social order develop in online groups. Along with Derek Hansen and Ben Shneiderman, he is the co-author and editor of Analyzing Social Media Networks with NodeXL: Insights from a connected world, from Morgan-Kaufmann, which is a guide to mapping connections created through computer-mediated interactions.

Smith’s research focuses on computer-mediated... Read More.

Sarah Sproehnle
Sarah Sproehnle (Cloudera, Inc.)

Sarah Sproehnle is the Director of Educational Services for Cloudera
where she helps customers learn to use Apache Hadoop for big data
processing. Cloudera provides commercial support, training and
services for the Apache Hadoop platform.

Jeff Sternberg (S&P Capital IQ)

Jeff Sternberg founded and currently leads the Data Science Team at S&P Capital IQ. The team builds data products for the S&P Capital IQ platform, a leading provider of data and analytics for global financial professionals. Recent team projects include a company profile recommendation engine, fraud detection, and client value analytics.

Previously, Jeff led the engineering team that builds S&P Capital IQ’s data collection software and systems. These tools enable collection, standardization, and aggregation of key content sets, including fundamental financials, earnings estimates, company profile information, news and events, and more. Jeff’s career at S&P Capital IQ started in 2002 as a core member of the technology team, where he led database projects such as a mergers & acquisition valuation calculation engine, enhanced profile search,... Read More.

Alexander Stojanovic

General Manager Microsoft Corporation:

Cloudscale Predictive Analytics

(SQL Azure Analytics)

Mathematical Programming and Stochastic Optimization

(Microsoft Solver Foundation)

Semantic Extraction, Indexing and Machine Learning

(SQL Server Semantic Platform)

Jason Sundram
Jason Sundram (Facebook), @jsundram

I’m a senior data scientist at eBay/PayPal. The work I do looks closely at data generated by mobile users. I’ve worked on the WHERE PlaceGraph, a tool that reveals the connections between places based on searches and checkins.

In general, I do data visualization with big data, using Python and R for analysis, and Processing and javascript/canvas for display/interaction.

I’m also an accomplished violinist. My interest in music led me to work on creating The Echo Nest’s Music Analyzer, which listens to music the way people do, and extracts summary data that can be used to find out how danceable a song is. I co-created a site that syncs music to various visualizations of the Echo Nest’s analysis data. It’s hypnotic and... Read More.

Marcia Tal
Marcia Tal (Tal Solutions, LLC), @MarciaBTal

Marcia Tal is a highly respected executive and is widely recognized for creating and building Citigroup’s Decision Management function. She worked with the leaders of global businesses and introduced advanced analytical tools and a strong governance process into business decisions — to deliver organic growth, identify new revenue streams, and to enhance profitability while managing risk and maintaining appropriate controls. As a leader, Marcia translates her strategic vision into business initiatives. She transforms business models and produces innovative sources of revenue. Marcia can create and execute a global strategy within local markets. She has a passion for leadership and talent development within an ecosystem of partnership and community.
Marcia created a scalable, international organization embedded in more than 30 countries. Its charter was to... Read More.

Richard Taylor
Richard Taylor (HPCC Systems from LexisNexis Risk Solutions)

Richard Taylor
Chief Trainer of HPCC Systems from LexisNexis Risk Solutions

Richard Taylor has worked with the HPCC technology platform and the ECL programming language since its inception over 11 years ago. He developed all the ECL programming courses and has taught ECL from its beginning. He is the author of the ECL programming documentation: Language Reference, Programmer’s Guide, and Standard Library Reference. He spent the 10 years prior to that programming, documenting, teaching, and supporting another business data language developed by the same team who later created the HPCC platform.

About HPCC Systems™
HPCC Systems™ from LexisNexis® Risk Solutions offers a proven, data-intensive supercomputing platform designed for the enterprise to process... Read More.

Kaitlin Thaney
Kaitlin Thaney (Mozilla Science Lab), @kaythaney

Kaitlin is the director of the Mozilla Science Lab, a new open science initiative at Mozilla to help researchers use the power of the web to change science’s future. She’s previously worked at Digital Science, a technology company out of Macmillan Publishers, as well as Creative Commons, where she managed their science program. She also advises the UK government on digital technology and data-intensive science and business, and is on the board of DataKind UK. You can follow her at @kaythaney.

Jim Tommaney
Jim Tommaney (InfiniDB), @InfiniDB

Jim is the chief product architect for InfiniDB, and CTO at Calpont. InfiniDB’s map-reduce distribution of work enables linear scalability combined with SQL ease of use. A simple create table automatically implements horizontal and vertical partitioning of the data that can be analyzed with fully parallel and distributed execution of inner/outer hash joins, sub-query, correlated sub-query, multi-table hash joins, filters and expressions, group by, and user defined functions.

Daniel Tunkelang

Daniel Tunkelang oversees the data science team at LinkedIn, which analyzes terabytes of data to produce products and insights that serve LinkedIn’s members. Prior to LinkedIn, Daniel led a local search quality team at Google. Daniel was a founding employee and Chief Scientist of Endeca, a leader in enterprise search and business intelligence that pioneered the use of guided navigation in search applications. He has authored eight patents, written a textbook on faceted search, created the annual workshop on human-computer interaction and information retrieval (HCIR), and participated in the premier research conferences on information retrieval, knowledge management, databases, and data mining (SIGIR, CIKM, SIGMOD, SIAM Data Mining). Daniel holds a PhD in Computer Science from CMU, as well as... Read More.

Vineet Tyagi
Vineet Tyagi (Impetus Technologies)

Vineet heads Innovation labs, the R&D & Consulting Division of Impetus Technologies. He is responsible for working on new technology, product development, managing innovation and creating IPs. At Impetus, Vineet has been involved in various client-facing assignments and has been instrumental in steering the company’s technology and R&D road map. He has conceptualized and led various new technology initiatives, including those on Big Data, Hadoop & Cloud computing. He spearheads many open source contributions that have received global recognition.

Vineet is a sought-after speaker on Big Data Technologies and Agile Product Development.

Rohit Valia (Platform Computing), @platform_hiperf

Rohit Valia is the Director of Enterprise Marketing at Platform Computing, an IBM Company. He is an experienced technology and marketing executive, with over 15 years of experience in enterprise datacenter technologies and hands-on software development, product management and marketing experience in security, Java EE middleware, virtualization and cloud computing. Before joining Platform Computing, he was the Director for Sun Microsystems’ cloud services business unit and, most recently, the head of Oracle University marketing. He has been a speaker at numerous JavaOne and other technical conferences and has published papers in IEEE and other journals. He is also the author of two US Patents for Java and internet

Hal Varian
Hal Varian (Google), @Google

Hal R. Varian is the Chief Economist at Google. He started in May 2002 as a consultant and has been involved in many aspects of the company, including auction design, econometric analysis, finance, corporate strategy and public policy.

He also holds academic appointments at the University of California, Berkeley in three departments: business, economics, and information management.

He received his SB degree from MIT in 1969 and his MA in mathematics and Ph.D. in economics from UC Berkeley in 1973. He has also taught at MIT, Stanford, Oxford, Michigan and other universities around the world.

Dr. Varian is a fellow of the Guggenheim Foundation, the Econometric Society, and the American Academy of Arts and Sciences. He was Co-Editor of the American Economic Review... Read More.

Flavio Villanustre
Flavio Villanustre (LexisNexis Risk Solutions and HPCC Systems)

Flavio Villanustre is the Vice President of Infrastructure and Products. In this position, Flavio is responsible for Information and Physical Security, overall infrastructure strategy and new product development for LexisNexis Risk Solutions and HPCC Systems. Prior to 2001, Flavio served in a variety of roles at different companies, including Infrastructure, Information Security and Information Technology. In addition, Villanustre has been involved with the open source community for over 15 years through multiple initiatives. These include founding the first Linux User Group in Buenos Aires (BALUG) in 1994, releasing several pieces of software under different open source licenses, and evangelizing open source to different audiences through conferences, training and education. Before working in technology, Flavio was a neurosurgeon.

Dean Wampler

Dean Wampler is Principal Consultant at Think Big Analytics, specialists in Big Data, Machine Learning, and the Hadoop ecosystem. He speaks frequently at conferences on various topics, such as the effective use of different programming languages and modularity paradigms: functional, object-oriented, and aspect-oriented programming.

Dean is the author of Functional Programming for Java Developers (O’Reilly, 2011) and the co-author of Programming Scala with Alex Payne (O’Reilly, 2009).

Pete Warden
Pete Warden (TensorFlow), @petewarden

Pete Warden is the technical lead on the TensorFlow mobile embedded team at Google doing deep learning. Previously, he was CTO of Jetpac, which was acquired by Google, and worked on GPU optimizations for image processing at Apple. He’s written several books on data processing for O’Reilly and blogs at

Ian White
Ian White (Urban Mapping, Inc)

Ian is the CEO of Urban Mapping, a leading provider of hosted mapping services.

Prior to founding Urban Mapping in 2003, White worked as a business consultant at and held various roles in business development and marketing. He also served as Adjunct Professor of Design and Management at Parsons School of Design in New York.

White received a BA from McGill University in Montreal, an MBA from Babson College and completed postgraduate studies in France.

John Wilbanks
John Wilbanks (Kauffman Foundation for Entrepreneurship)

John Wilbanks works on open content, open data, and open innovation systems. He is a Senior Fellow at the Kauffman Foundation and a Research Fellow at Lybba. He’s worked at Harvard Law School, MIT’s Computer Science and Artificial Intelligence Laboratory, the World Wide Web Consortium, the US House of Representatives, and Creative Commons, as well as starting a bioinformatics company. He sits on the Board of Directors for Sage Bionetworks, iCommons, and 1DegreeBio, and the Advisory Board for Boundless Learning. John holds a degree in philosophy from Tulane University and also studied modern letters at the University of Paris (La Sorbonne).

Edd Wilder-James

Edd Wilder-James is a strategist at Google, where he is helping build a strong and vital open source community around TensorFlow. A technology analyst, writer, and entrepreneur based in California, Edd previously helped transform businesses with data as vice president of strategy for Silicon Valley Data Science. Formerly Edd Dumbill, Edd was the founding program chair for the O’Reilly Strata Data Conference and chaired the Open Source Software Conference for six years. He was also the founding editor of the peer-reviewed journal Big Data. A startup veteran, Edd was the founder and creator of the Expectnation conference management system and a cofounder of the Pharmalicensing online intellectual property exchange. An advocate and contributor to open source software, Edd... Read More.

Max Yankelevich
Max Yankelevich (Crowd Computing Systems, Inc.), @crowdcontrol1

Max Yankelevich is a serial entrepreneur, having started and exited several successful companies. Max is the Founder/CEO of CrowdControl Software – a company that combines Artificial Intelligence and Crowdsourcing to build high-value datasets.

Yankelevich is an expert on applying crowdsourcing and artificial intelligence to building, processing, and managing data. Yankelevich has changed the perception of crowdsourcing for many decision makers and his expertise has led the industry to latch on to the next version – accessible, scalable and accountable crowdsourcing.

Yankelevich is a graduate of Massachusetts Institute of Technology’s computer science program; he has more than 15 years of experience in large scale cloud computing and start-up technology companies. He has provided architectural guidance to Fortune 500 companies, such as JP Morgan Chase, Credit... Read More.

Jen Zeralli (S&P Capital IQ)

Jen Zeralli has over a decade of experience at major financial firms and software companies. She has spent the last several years at S&P Capital IQ. In her current role, she is the Senior Analyst and a founding member of the S&P Capital IQ Data Science Team. She holds a degree in Finance and International Business from New York University’s Stern School of Business. In her spare time, Jen is an avid amateur genealogist (her favorite big data sets). She is also an active volunteer with the Good Dog Foundation, providing animal-assisted therapy with her dogs Kitty and Twiggy.

Kate Zimmerman

12 years of experience in applied statistics, experimental and survey design, and quantitative research.

5 years of industry analytics experience, with a focus on revealing data insights using data visualization, predictive models, statistical methods, and behavioral analysis.

Passionate about applying my knowledge of behavioral economics, social psychology, and human judgment & decision making to guide analysis and inform business decisions.


  • EMC
  • Microsoft
  • HPCC Systems™ from LexisNexis® Risk Solutions
  • MarkLogic
  • Shared Learning Collaborative
  • Cloudera
  • Digital Reasoning Systems
  • Pentaho
  • Rackspace Hosting
  • Teradata Aster
  • VMware
  • IBM
  • NetApp
  • Oracle
  • 1010data
  • 10gen
  • Acxiom
  • Amazon Web Services
  • Calpont
  • Cisco
  • Couchbase
  • Cray
  • Datameer
  • DataSift
  • DataStax
  • Esri
  • Facebook
  • Feedzai
  • Hadapt
  • Hortonworks
  • Impetus
  • Jaspersoft
  • Karmasphere
  • Lucid Imagination
  • MapR Technologies
  • Pervasive
  • Platform Computing
  • Revolution Analytics
  • Scaleout Software
  • Skytree, Inc.
  • Splunk
  • Tableau Software
  • Talend

For information on exhibition and sponsorship opportunities at the conference, contact Susan Stewart at

For information on trade opportunities with O'Reilly conferences contact Kathy Yu at mediapartners

For media-related inquiries, contact Maureen Jennings at
