Modeling teams at Twitter face a variety of uniquely hard yet fundamentally related machine learning problems. For example, tasks as different as ad serving, abuse detection, and user timeline construction all rely on powerful representations of user and content entities. In addition, because of Twitter’s real-time nature, entity data distributions are constantly in flux, so these representations must be updated frequently. By generating high-quality, up-to-date representations (embeddings) and sharing them broadly across teams, Twitter reduces duplication of effort and multiplies modeling productivity across teams.
Abhishek Tayal offers insight into how Twitter’s ML platform team, Cortex, is making entity embeddings a first-class citizen within Twitter’s ML platform by commoditizing the tools and pipelines that create high-quality, custom, regularly retrained, benchmarked, and centrally hosted embeddings. Abhishek also highlights how teams at Twitter use entity embeddings in their ML stacks as input features to prediction models and use the available tools to easily learn their own embeddings.
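To make the idea of embeddings as input features concrete, here is a minimal sketch in Python. It is not Twitter's or Cortex's implementation: the tables `USER_EMBEDDINGS` and `CONTENT_EMBEDDINGS`, the entity IDs, the three-dimensional vectors, and the helpers `lookup` and `build_features` are all hypothetical, chosen only to show how shared, precomputed embeddings might be looked up and combined into a feature vector for a downstream prediction model.

```python
from typing import Dict, List

# Hypothetical, hand-written embedding tables for illustration only.
# In practice these vectors would be learned, retrained regularly,
# and served from a central store.
DIM = 3
USER_EMBEDDINGS: Dict[str, List[float]] = {
    "user_1": [0.5, -0.25, 1.0],
    "user_2": [0.0, 0.75, -0.5],
}
CONTENT_EMBEDDINGS: Dict[str, List[float]] = {
    "tweet_9": [1.0, 0.5, 0.25],
}


def lookup(table: Dict[str, List[float]], entity_id: str) -> List[float]:
    """Return the entity's embedding, or a zero vector if it is unknown."""
    return table.get(entity_id, [0.0] * DIM)


def build_features(user_id: str, content_id: str) -> List[float]:
    """Concatenate user and content embeddings plus their dot product."""
    user_vec = lookup(USER_EMBEDDINGS, user_id)
    content_vec = lookup(CONTENT_EMBEDDINGS, content_id)
    # The dot product is one simple interaction feature between the entities.
    affinity = sum(u * c for u, c in zip(user_vec, content_vec))
    return user_vec + content_vec + [affinity]


features = build_features("user_1", "tweet_9")
print(len(features))  # 2 * DIM + 1 features for the downstream model
```

The resulting list could be passed to any prediction model as ordinary numeric features. Falling back to a zero vector for unknown entities is one simple assumption made here; a real system might instead use a learned default embedding.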
Abhishek Tayal is a senior software engineer with Cortex, the machine learning platform team at Twitter, where he leads the entity embeddings team. Abhishek started his journey with Twitter as part of the ads prediction team for its direct response ad products. Previously, Abhishek worked at TellApart, an ad tech startup (acquired by Twitter), and on the Prime Video recommendations team at Amazon, where he led the development of the first-generation ML-based recommendation system for videos. He holds a master’s degree from the University of Southern California in Los Angeles.
©2018, O'Reilly Media, Inc.