Sketching Techniques for Real-time Big Data

Beyond Hadoop Great America Ballroom J

In many modern web and big data applications, data arrives as a stream and must be processed on the fly. The data is usually too large to fit in main memory, so computations have to be done incrementally as new pieces of data arrive. To support this, sketches of the data are designed: compact summaries that take little memory yet allow fast queries and updates. Such sketches are useful both in applications run on a single machine and in applications run on distributed systems such as Twitter Storm. We will present the techniques used to design these sketches and illustrate them with a number of examples, including frequent item-sets (used, for example, in retail product recommendations), clustering, and heavy hitters (used, for example, in fraud and intrusion detection), to clarify the techniques and how to apply them.
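
As a rough illustration of the heavy hitters problem mentioned above, here is a minimal sketch of the classic Misra-Gries summary. It is not taken from the talk; the function name misra_gries and the parameter k are chosen here purely for illustration. The summary keeps at most k - 1 counters no matter how long the stream is, and any item that occurs more than n/k times in a stream of n items is guaranteed to remain among the counters.

```python
def misra_gries(stream, k):
    """Summarize a stream with at most k - 1 counters (illustrative sketch).

    Any item occurring more than n/k times in a stream of length n
    survives in the returned dict. Each stored count underestimates
    the true count by at most n/k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: update its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # Free slot available: start tracking the new item.
            counters[item] = 1
        else:
            # No free slot: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical event stream: "a" is the only item above the n/k threshold.
events = ["a"] * 6 + ["b"] * 3 + ["c", "d", "e"]
candidates = misra_gries(events, k=3)
```

The stored counts are lower bounds rather than exact frequencies, so in practice the surviving candidates are often verified with a second pass or an exact counter when precise numbers are needed.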

Bahman Bahmani

Stanford University

Bahman did his PhD at Stanford University, supported by the William R. Hewlett Stanford Graduate Fellowship. His work focuses on algorithms for big data applications, a topic on which he has published in some of the best conferences and journals, including PVLDB, SIGMOD, WWW, and KDD. He was the last PhD student of the legendary late Rajeev Motwani, and has also been advised and co-advised by Ashish Goel and Prabhakar Raghavan (formerly Yahoo VP of Strategy, currently Google VP of Engineering). His industry experience during his PhD studies spans several internships and collaborations with researchers and practitioners at Twitter, Microsoft Research, Yahoo Research, AOL, and Google. He is a recipient of the Yahoo Key Scientific Challenges Award for his contributions to the area of search technologies.

Comments

Bahman Bahmani
02/28/2013 3:44pm PST

You can find the slides on the talk’s webpage: http://strataconf.com/strata2013/public/schedule/detail/27311

Mario Brenes
02/27/2013 2:01pm PST

Where can I get your presentation materials?
