Problems taking AI to production and how to fix them
Who is this presentation for?
- Data scientists, data engineers, DevOps, and data ops
Working with a variety of clients across many industries—chemical sciences, healthcare, and oil and gas—Jim Scott has documented a number of major impediments to operationalizing these systems and keeping them running in a production environment, including problems with data formats and optimization, and versioning problems that arise over time as the data, models, and parameters grow too complex. He explores tooling management issues around notebook applications like Jupyter, as well as workflow management tools used to track and manage an execution pipeline.
New problems arise when log output volumes grow quickly and significant data movement begins. Source data moves to the GPU, log data moves back to storage, and then the log data moves to the machines handling the distributed compute that performs post-model analytics to evaluate performance characteristics. Networks don't provide infinite bandwidth, and most enterprises do not run extremely high-speed networks.
Moving forward, the problems expand when preparing models for production deployment and adapting them for real-time use, not just training and testing. You'll learn about model deployment and scoring with canary and decoy models leveraging the rendezvous architecture.
Prerequisite knowledge
- A basic understanding of the daily activities that go into creating models, including data sources (useful but not required)
What you'll learn
- Discover specific problems and reference points within the industry
- Learn how to rectify issues with getting to production
Jim Scott is the head of developer relations, data science, at NVIDIA. He’s passionate about building combined big data and blockchain solutions. Over his career, Jim has held positions running operations, engineering, architecture, and QA teams in the financial services, regulatory, digital advertising, IoT, manufacturing, healthcare, chemicals, and geographical management systems industries. Jim has built systems that handle more than 50 billion transactions per day, and his work with high-throughput computing at Dow was a precursor to more standardized big data concepts like Hadoop. Jim is also the cofounder of the Chicago Hadoop Users Group (CHUG).