Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

Using Big Data to Identify the World's Top Experts

Nima Sarshar (inPowered)
1:30pm–1:50pm Friday, 02/20/2015
Business & Industry
Location: LL20 BC
Average rating: 4.50 (2 ratings)

In this talk, we report on our implementation of a big data system that is able to automatically identify and rank experts in a large number of categories by ingesting and analyzing millions of pieces of content published across the Web every day.

We adopt a principled approach to defining who an expert is. An expert is someone who (a) writes consistently about a small set of tightly related topics (if you are an expert in everything, you are an expert in nothing), (b) has a loyal following that engages with their content consistently and finds it useful, and (c) actually expresses opinions on the topics they write about rather than merely breaking the news.
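As an illustration of criterion (a) only, topical focus can be measured by how concentrated an author's articles are across topics. The abstract does not say how inPowered computes this; the sketch below is an assumption that uses normalized Shannon entropy, and the names `topical_focus`, `article_topics`, and `num_topics` are hypothetical.

```python
import math
from collections import Counter


def topical_focus(article_topics, num_topics):
    """Return a focus score in [0, 1] for one author.

    article_topics: one topic label per published article (hypothetical input).
    num_topics: size of the whole topic vocabulary (assumed known, >= 2).

    1.0 means every article is on a single topic; 0.0 means the articles
    are spread evenly over the entire vocabulary.
    """
    if num_topics < 2:
        raise ValueError("num_topics must be at least 2")
    counts = Counter(article_topics)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one article")
    # Shannon entropy of the author's topic distribution.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Normalize by the largest entropy possible over the vocabulary.
    return 1.0 - entropy / math.log(num_topics)


specialist = ["databases"] * 8 + ["distributed systems"] * 2
generalist = ["databases", "cooking", "politics", "sports", "travel"] * 2
print(topical_focus(specialist, num_topics=10_000))
print(topical_focus(generalist, num_topics=10_000))
```

Normalizing by the full vocabulary rather than by the author's own topic count means an author split evenly between two related topics still scores as highly focused, which matches the "small set of tightly related topics" wording.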

Formalizing the above criteria and implementing them at scale is a daunting big data task. First, we needed to form a reasonably comprehensive picture of the body of work published by authors who often write for many different outlets, at times under different aliases. Second, we had to create a dynamic topical model that learns the relationships among tens of thousands of topics by analyzing millions of documents. Third, we had to come up with a formula that yields a stable, consistent ranking that is robust to fluctuations in publishing patterns and engagement data, yet adaptable enough to let new experts emerge and their voices be heard.
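The tension in the third requirement, stability versus adaptability, can be illustrated with exponential smoothing of daily scores. This is a minimal sketch under stated assumptions, not the formula used in the talk (which the abstract does not disclose); `smooth_scores`, `rank`, and the `alpha` parameter are hypothetical names introduced here.

```python
def smooth_scores(previous, observed, alpha=0.2):
    """Blend today's raw scores into the running scores.

    previous: author -> running score so far (hypothetical structure).
    observed: author -> raw score computed from today's data.
    alpha: weight on today's data; smaller values give a more stable ranking,
           larger values let new authors rise faster.

    Authors absent from `observed` decay toward 0; new authors start from 0.
    """
    authors = set(previous) | set(observed)
    return {
        a: (1 - alpha) * previous.get(a, 0.0) + alpha * observed.get(a, 0.0)
        for a in authors
    }


def rank(scores):
    """Return authors ordered by descending score, ties broken by name."""
    return sorted(scores, key=lambda a: (-scores[a], a))


running = {"alice": 10.0, "bob": 6.0}
today = {"alice": 2.0, "bob": 6.0, "carol": 20.0}  # alice has one bad day

running = smooth_scores(running, today)
print(rank(running))  # one noisy day does not reorder the established authors
```

With a small `alpha`, a single unusual day barely moves an established author, while a newcomer who performs well day after day still climbs the ranking over time.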

- Experts vs. Influencers: defining who an expert is
- Unifying identities of authors across sites
- A dynamic topical model that scales
- Projection of topics onto authors
- Opinion vs. Sentiment vs. Statement of Facts
- Putting it all together
- A note on architecture

Nima Sarshar


As inPowered’s CTO, Nima leads the development of the core technologies at the heart of inPowered and oversees all aspects of engineering and product development. Once a tenured professor with 50+ peer-reviewed publications, Nima left academia to found Haileo Inc., a startup specializing in visual search. He was a Principal Data Scientist at Intuit before joining inPowered. Nima holds an M.Sc. from UCLA and a Ph.D. from McMaster University, both in Electrical Engineering.