Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Zillow: Transforming real estate through big data and data science

Jasjeet Thind (Zillow)
4:35pm–5:15pm Wednesday, 09/28/2016
Hadoop use cases
Location: 3D 08 Level: Intermediate
Average rating: 3.75 (8 ratings)

Prerequisite knowledge

  • A basic understanding of building big data platforms and predictive analytics, such as decision trees, collaborative filtering, and text mining
What you'll learn

  • Learn best practices for scaling platforms for distributed data processing in Spark
  • Explore key machine-learning algorithms for real estate
Description

    Zillow pioneered access to unprecedented information about the housing market. Long gone are the days when you needed an agent to get comparables and prior sale and listing data. Zillow is now the nation’s number-one real estate website and mobile app, and as its data has grown, data science has opened up more use cases. Jasjeet Thind explores Zillow’s big data platform, discusses some of its core machine-learning algorithms, and outlines best practices for scaling streaming data ingestion and data processing in Spark.

    Topics include:

    • How Zillow predicts the owners of 100+ million homes and distinguishes among buyers, sellers, homeowners, and renters
    • How Zillow makes the Zestimate more accurate via text mining
    • How Zillow implemented its own collaborative filtering algorithm to provide personalized real estate recommendations
    • The best time to sell your home

    Jasjeet Thind


    Jasjeet Thind is the vice president of data science and engineering at Zillow. His group focuses on machine-learned prediction models and big data systems that power use cases such as Zestimates, personalization, housing indices, search, content recommendations, and user segmentation. Prior to Zillow, Jasjeet served as director of engineering at Yahoo, where he architected a machine-learned real-time big data platform that leverages social signals to infer user interests and predict content. The system powers personalized content on Yahoo, Yahoo Sports, and Yahoo News. Jasjeet holds a BS and a master’s degree in computer science from Cornell University.
