Yonghua Lin leads a deep dive into AI Vision, a deep learning development platform from IBM for image and video analysis, exploring its system design, performance optimization, and large-scale capability for training and inference.
AI Vision is an end-to-end deep learning development platform for computer vision that covers data labeling, data preprocessing, model training, and inference. It lets application developers train a new image classifier or object detector with little effort. The resulting customized model is accurate enough to meet application needs and can be trained quickly (less than 30 minutes for classification) on small datasets (fewer than 100 pictures per category). The infrastructure is optimized for cost efficiency, allowing a service provider to offer a large-scale deployment. The inference system (the recognition API service) provides real-time processing at large scale (more than 10,000 instances) with high availability, supporting SLA requirements of more than 99.9%.
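To put the SLA figure in concrete terms, the following is a minimal sketch of the downtime budget it implies. It assumes the 99.9% figure refers to service availability measured over a fixed window; the function name and the 30-day window are illustrative choices, not details from the talk.

```python
def allowed_downtime_minutes(availability: float, window_days: int = 30) -> float:
    """Return the downtime budget, in minutes, for a given availability target.

    `availability` is a fraction such as 0.999 (assumed to mean uptime / total time).
    `window_days` is the measurement window; 30 days is an illustrative default.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - availability) * total_minutes


if __name__ == "__main__":
    # Roughly 43 minutes of allowed downtime in a 30-day window at 99.9%.
    print(round(allowed_downtime_minutes(0.999), 1))
```

Under these assumptions, an availability target above 99.9% leaves well under an hour of total outage per month across the whole inference service.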
AI Vision includes both a deep learning model training stage and an inference API deployment stage, and the two stages have different system requirements. IBM uses the same Kubernetes framework to build the container cloud and manage resources for both training and inference but applies a different scheduling strategy to each. For deep learning, AI Vision provides one-stop service for customized image classification and Faster R-CNN-based object detection. For data preprocessing, IBM designed a framework that lets developers easily inject a new preprocessing plugin for their images. For training, IBM uses Caffe as the default deep learning framework, but users can switch to other frameworks. Users define their training tasks on the web and, even with relatively small datasets, obtain customized models and inference APIs soon after training. To achieve high accuracy with short training times and small datasets, IBM developed an enhanced fine-tuning technology. To help users understand and improve deep learning performance, IBM designed DL Insight, a monitoring and advisory tool that covers both neural network training and system performance.
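The idea of a preprocessing framework with injectable plugins can be sketched as follows. This is not AI Vision's actual interface, which the abstract does not describe; the registry, the `register` decorator, `run_pipeline`, and the two sample plugins are all hypothetical names, and images are modelled as plain nested lists of grayscale values to keep the example self-contained.

```python
from typing import Callable, Dict, List

# Stand-in image type: rows of grayscale pixel values in the range 0-255.
Image = List[List[int]]
Plugin = Callable[[Image], Image]

# Hypothetical plugin registry, keyed by plugin name.
_PLUGINS: Dict[str, Plugin] = {}


def register(name: str) -> Callable[[Plugin], Plugin]:
    """Decorator that adds a preprocessing function to the registry."""
    def decorator(fn: Plugin) -> Plugin:
        if name in _PLUGINS:
            raise ValueError(f"plugin {name!r} is already registered")
        _PLUGINS[name] = fn
        return fn
    return decorator


@register("invert")
def invert(img: Image) -> Image:
    """Invert pixel intensities."""
    return [[255 - p for p in row] for row in img]


@register("hflip")
def hflip(img: Image) -> Image:
    """Mirror the image horizontally."""
    return [row[::-1] for row in img]


def run_pipeline(img: Image, steps: List[str]) -> Image:
    """Apply the named plugins in order; unknown names raise KeyError."""
    for step in steps:
        img = _PLUGINS[step](img)
    return img


if __name__ == "__main__":
    sample = [[0, 255], [10, 20]]
    print(run_pipeline(sample, ["invert", "hflip"]))
```

In a design like this, a developer adds a new preprocessing step by writing one decorated function, and a training task selects steps by name without any change to the pipeline code.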
Yonghua Lin is the founder and leader of IBM’s SuperVessel innovation cloud, a senior member of the technical staff, and senior manager of cognitive systems and cloud in IBM Research. Yonghua has worked on system architecture, the cloud, and cognitive platform research for more than 15 years. She was the initiator of mobile infrastructure in the cloud (now network function virtualization) and led the IBM team that built up the first optimized cloud for 4G mobile infrastructure. Yonghua has spoken widely at industry events, including ITU and Mobile World Congress, holds more than 40 patents granted worldwide, and has authored papers for top conferences and journals.
©2017, O'Reilly Media, Inc.