Big Data Revolution: Benefit from MapReduce Without the Risk

Ted Dunning (MapR)
Sponsored Sessions
Location: Murray Hill Suite B

Map-reduce and Hadoop provide new scaling opportunities for analyzing data. As a result, organizations are beginning to analyze and derive business value from large amounts of data that, in many cases, were previously simply discarded. In some cases, such as on-line advertising, the ability to analyze these previously impenetrable volumes of data has disrupted entire industries.

Such green field opportunities are rare, however, and few companies can afford to build an entirely new analytics pipeline. Integrating big data analytics systems like Apache Hadoop into existing analytics systems can be very difficult because the two take fundamentally different approaches to the basic problems of how data should be accessed and analyzed.

These differences are exactly what makes these new technologies hugely effective, but they are also what makes integration between conventional and new approaches so difficult.

This talk will provide detailed descriptions of how to use new technologies to:

  • Get data into and out of the Hadoop cluster as quickly as possible
  • Allow real-time components to easily access cluster data
  • Use well-known and understood standard tools to access cluster data
  • Make Hadoop easier to use and operate
  • Capitalize on existing code in map-reduce settings
  • Integrate map-reduce systems into existing analytic systems

These descriptions will be taken from real-life customer situations. Each will describe the problems faced and the solutions that resolved them.

This session is sponsored by MapR Technologies

Ted Dunning

MapR

Ted Dunning is chief application architect at MapR. He’s also a board member for the Apache Software Foundation; a PMC member and committer of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects; and a mentor for various incubator projects. Ted has years of experience with machine learning and other big data solutions across a range of sectors. He’s contributed to clustering, classification, and matrix decomposition algorithms in Mahout and to the new Mahout Math library, and he designed the t-digest algorithm used in several open source projects and by a variety of companies. Previously, Ted was chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud-detection systems for ID Analytics (LifeLock). Ted has coauthored a number of books on big data topics, including several published by O’Reilly related to machine learning, and has 24 issued patents to date plus a dozen pending. He holds a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.
