Is Bigger Really Better? Predictive Analytics with Fine-grained Behavior Data

Foster Provost ( NYU | Stern )
Grand Ballroom
Average rating: ***..
(3.55, 11 ratings)

Predictive analytics is one of the most mature areas of data science and an area where “big data” is often associated with competitive advantage. However, concrete results supporting the advantage conferred by big data are few and far between. In this talk, I dig into the value of big data for predictive analytics. Depending on the sort of data available (and relevant), larger data assets may or may not yield increased value. I first demonstrate this in a case study of targeting offers for a large bank, showing that for tried-and-true predictive marketing formulations, moderate-sized data are sufficient. However, when employing fine-grained data on consumer behavior, improvements in predictive performance continue as the data size increases to massive scale. Moreover, using both sorts of modeling is better than using either alone. I then demonstrate that increasing returns to scale are not specific to this case study, but appear to apply to many applications of predictive modeling that employ fine-grained data on behavior. This has important implications for the competitive advantage firms can obtain from their data assets: for predictive tasks based on fine-grained behavior data, firms with larger data assets have the opportunity to build substantially better predictive models than their less data-rich competitors.

Foster Provost

NYU | Stern

Foster Provost is coauthor of the O’Reilly best-selling book Data Science for Business. He has designed data science solutions for businesses for over two decades and has cofounded several successful companies focusing on data science for advertising (including Dstillery and Integral Ad Science). In his current job as Professor and NEC Faculty Fellow at the NYU Stern School of Business, Foster teaches in the MS in Data Science, MS in Business Analytics, MBA, and PhD programs. His data science research has won many awards and is broadly cited. He served as Program Chair for the ACM SIGKDD Conference and for many years as Editor-in-Chief for the journal Machine Learning.


Contact Us

View a complete list of Strata + Hadoop World 2013 contacts