State-of-the-art algorithms for applications like face recognition, object identification, and tracking use deep learning-based models for inference. Edge-based systems like security cameras and self-driving cars need deep learning to go beyond the minimum viable product. However, the deciding factors for such systems are power, performance, and cost: these devices have limited bandwidth, tolerate very little latency, and are subject to strict privacy constraints. The situation is exacerbated by the fact that deep learning algorithms require computation on the order of teraops for a single inference at test time, which translates to a few seconds per inference for some of the more complex networks. Such latencies are impractical for edge devices, which typically need real-time response. In short, deep learning is so compute intensive that edge devices cannot afford to run inference.
Deep learning is necessary to bring intelligence and autonomy to the edge. Siddha Ganju offers an overview of Deep Vision’s solution, which optimizes both the hardware and the software. She discusses the Deep Vision embedded processor, which is optimized for deep learning and computer vision and offers 50x higher performance per watt than existing embedded GPUs without sacrificing programmability. Siddha demonstrates how Deep Vision’s solutions offer better performance and higher accuracy, shares a secret to achieving higher accuracy with a smaller network, and explains how to optimize for information density.
Siddha Ganju is a self-driving architect at NVIDIA. She was featured on the Forbes 30 Under 30 list, guides teams at NASA as an AI domain expert, and is a featured jury member in several international tech competitions. Previously, she developed deep learning models for resource-constrained edge devices at Deep Vision. She earned her degree from Carnegie Mellon University. Her work ranges from visual question answering to generative adversarial networks to gathering insights from CERN’s petabyte-scale data, and it has been published at top-tier conferences including CVPR and NeurIPS.