Building a multi-tenant data processing and model inferencing platform with Kafka Streams
Who is this presentation for?
Data engineers, data scientists
In this talk, I will give an overview of our architecture for data processing and model triggering, which is designed for scalability and reliability. Since ours is a multi-tenant platform, each client's models (such as bid models, fraud detection, and omnichannel reorder) may be interested in certain types of events, such as searches, add-to-cart actions, and transactions. Whenever such an event is processed, we trigger the models interested in that event type. I will discuss in detail how an event lands in our system from Kafka, how it is processed and saved internally, and how the interested models are triggered. Models use the internal persistent state (on RocksDB) for feature extraction and also store their own outputs in the platform, where they can be used across teams as features.
The talk will focus on the following parts of the architecture:
- Ensuring fairness among the models.
- Providing isolation and reusing features/inferences across models at the same time.
- Dynamically updating global data (such as the product catalog) needed to run models on each node.
- Customizing models to trigger either on each event or as a batch at regular time intervals.
- Implementing data archival/TTL policies and other features developed to save money.
- Advantages and limitations of the platform.
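To make the event-to-model flow described above more concrete, here is a minimal sketch in plain Python. It is not the platform's implementation and does not use the Kafka Streams API: the names `Model`, `Platform`, `interval`, and the event fields `tenant`, `type`, and `ts` are all hypothetical, an in-memory dictionary stands in for the RocksDB-backed state, and batches are closed by event timestamps rather than by a wall-clock timer.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Model:
    """A tenant's model and the event types it subscribes to (hypothetical)."""
    name: str
    tenant: str
    event_types: Set[str]
    score: Callable[[List[dict], dict], object]  # (batched events, tenant state) -> output
    interval: float = 0.0                        # 0 means "trigger on every event"
    pending: List[dict] = field(default_factory=list)
    window_start: Optional[float] = None


class Platform:
    def __init__(self) -> None:
        # In-memory stand-in for the persistent state store, keyed by tenant
        # so that one tenant's data is never visible to another tenant's models.
        self.state: Dict[str, dict] = defaultdict(dict)
        self.models: List[Model] = []

    def register(self, model: Model) -> None:
        self.models.append(model)

    def process(self, event: dict) -> List[str]:
        """Save the event, then trigger every model interested in it."""
        tenant_state = self.state[event["tenant"]]
        tenant_state.setdefault("events", []).append(event)

        triggered = []
        for m in self.models:
            if m.tenant != event["tenant"] or event["type"] not in m.event_types:
                continue
            m.pending.append(event)
            if m.window_start is None:
                m.window_start = event["ts"]
            # Run the model once its batching interval has elapsed.
            if event["ts"] - m.window_start >= m.interval:
                output = m.score(m.pending, tenant_state)
                # Store the output so other models of this tenant can reuse it as a feature.
                tenant_state[f"output:{m.name}"] = output
                m.pending = []
                m.window_start = None
                triggered.append(m.name)
        return triggered
```

In this sketch, a model registered with `interval=0` runs on every matching event, while one registered with `interval=60` accumulates events and runs once an event arrives at least 60 time units after the first pending one. A production system would more likely close batches on a scheduled timer, so that a quiet stream does not delay a batch indefinitely.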
Prerequisite knowledge
A basic knowledge of Kafka and Kafka Streams. An understanding of how distributed systems work would be a plus.
What you'll learn
Navinder Pal Singh Brar
Navinder is a data engineer at Walmart Labs, where he has been working on Kafka and Kafka Streams for over a year. He likes working on distributed systems and lives in Bangalore, India. He has prior experience building web applications and one of the biggest GDS platforms used in the travel industry. He has a bachelor's degree in computer science.
View a complete list of Strata Data Conference contacts