Mar 15–18, 2020

Feature engineering pipelines five ways with Kafka, Redis, Spark, Dask, AirFlow, and more

Nick Pinckernell (Comcast)
11:50am–12:30pm Wednesday, March 18, 2020
Location: 210 B

Who is this presentation for?

  • Machine learning engineers, data scientists, research engineers, and ops engineers




Feature engineering at large scale is becoming easier with Spark, pub/sub technologies, and a variety of data stores. Nick Pinckernell details five ways to perform feature engineering end to end with popular technologies that won’t lock you into a specific vendor. Between the raw data stream and the model, data needs to be persisted, aggregated, and often windowed to satisfy the requirements your researchers place on it. With code examples and architectures, you’ll learn how to consume, persist, window, aggregate, normalize, and transform data so that your model can be called at scale with confidence. Building pipelines this way also leaves room for monitoring, auditing, and introspection.
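To give a rough sense of the window, aggregate, and normalize steps named above, here is a minimal sketch in plain Python. It is not taken from the session; the function names, the `(timestamp, key, value)` event shape, and the 60-second window are all assumptions made for illustration. In practice, a stream processor such as Spark or a Kafka consumer would feed the events.

```python
from collections import defaultdict
from statistics import mean, pstdev


def tumbling_window_features(events, window_seconds=60):
    """Aggregate (timestamp, key, value) events into fixed, non-overlapping windows.

    Hypothetical helper: the event shape and feature names are illustrative only.
    """
    buckets = defaultdict(list)
    for ts, key, value in events:
        # Align each event to the start of the window it falls in.
        window_start = int(ts // window_seconds) * window_seconds
        buckets[(key, window_start)].append(value)
    return {
        bucket: {"count": len(vals), "mean": mean(vals), "max": max(vals)}
        for bucket, vals in buckets.items()
    }


def z_normalize(values):
    """Scale values to zero mean and unit variance (population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Illustrative events: (seconds since start, device id, reading).
events = [
    (0, "device-a", 1.0),
    (30, "device-a", 3.0),
    (65, "device-a", 5.0),
    (10, "device-b", 2.0),
]
features = tumbling_window_features(events)
normalized = z_normalize([1.0, 2.0, 3.0])
```

Keeping the window boundary a pure function of the timestamp means the same event always lands in the same bucket, which helps when a pipeline needs to be replayed for auditing.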

Prerequisite knowledge

  • Familiarity with Python, pub/sub (such as Apache Kafka), Apache Spark, Kubeflow, Kubernetes, and model serving

What you'll learn

  • Learn about feature engineering at scale for ML models with a variety of platforms and technologies
  • Discover code examples and best practices to get started in your own organization

Nick Pinckernell


Nick Pinckernell is a senior research engineer for the applied AI research team at Comcast, where he works on ML platforms for model serving and feature pipelining. He has focused on software development, big data, distributed computing, and research in telecommunications for many years. He’s pursuing his MS in computer science at the University of Illinois at Urbana-Champaign, and when free, he enjoys IoT.
