Presented By O’Reilly and Intel Nervana
Put AI to work
September 17-18, 2017: Training
September 18-20, 2017: Tutorials & Conference
San Francisco, CA

Embedded deep learning: Deep learning for embedded systems

Siddha Ganju (NVIDIA)
1:45pm–2:25pm Tuesday, September 19, 2017
Implementing AI
Location: Yosemite A
Secondary topics:  Deep learning, IoT (including smart cities, manufacturing, smart homes/buildings)
Average rating: 3.33 (3 ratings)

Prerequisite Knowledge

  • A basic understanding of deep learning

What you'll learn

  • Explore Deep Vision’s solution for energy-efficient deep learning, which sits at the intersection of machine learning and computer architecture
  • Understand how to develop new architectures that can potentially revolutionize deep learning and how to deploy deep learning at scale


State-of-the-art algorithms for applications like face recognition, object identification, and tracking rely on deep learning models for inference. Edge-based systems such as security cameras and self-driving cars need deep learning to go beyond the minimum viable product. However, the deciding factors for these systems are power, performance, and cost: the devices have limited bandwidth, demand real-time response, and face stringent privacy constraints. The situation is further complicated by the fact that deep learning algorithms require on the order of teraops of computation for a single inference at test time, which translates to a few seconds per inference for some of the more complex networks. Such high latencies are impractical for edge devices that need real-time response, and the sheer compute intensity of deep learning means many edge devices cannot afford inference at all.
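The teraops-to-seconds claim above can be checked with a back-of-envelope calculation. This sketch uses illustrative numbers (not measurements from the talk): a model needing roughly 2 teraops per inference on an edge chip sustaining roughly 0.5 teraops/s lands in the "few seconds per inference" range the paragraph describes.

```python
# Back-of-envelope inference-latency estimate. The workload and
# throughput figures below are assumed placeholders for illustration.

def inference_latency_s(ops_per_inference: float, sustained_ops_per_s: float) -> float:
    """Ideal latency: compute bound only, ignoring memory-bandwidth
    and scheduling overheads, so real latency is usually worse."""
    return ops_per_inference / sustained_ops_per_s

# ~2 teraops per inference on a ~0.5 teraops/s edge accelerator
latency = inference_latency_s(2e12, 0.5e12)
print(f"{latency:.1f} s per inference")  # -> 4.0 s per inference
```

Even this optimistic compute-only estimate is orders of magnitude away from the tens-of-milliseconds budget a security camera or self-driving car typically has.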

Deep learning is necessary to bring intelligence and autonomy to the edge. Siddha Ganju offers an overview of Deep Vision’s solution, which optimizes both the hardware and the software, and discusses the Deep Vision embedded processor, which is optimized for deep learning and computer vision and delivers 50x higher performance per watt than existing embedded GPUs without sacrificing programmability. Siddha demonstrates how Deep Vision’s solutions offer better performance and higher accuracy, shares the secret to achieving higher accuracy with a smaller network, and explains how to optimize for information density.
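To make the performance-per-watt metric concrete, here is a minimal sketch of how such a comparison is computed. The 50x figure is from the talk; the absolute throughput and power numbers below are made-up placeholders chosen only so the ratio comes out to 50x.

```python
# Hypothetical performance-per-watt comparison. Only the 50x ratio
# is claimed in the talk; the inputs are invented for illustration.

def perf_per_watt(inferences_per_s: float, watts: float) -> float:
    """Throughput normalized by power draw (inferences/s/W)."""
    return inferences_per_s / watts

embedded_gpu = perf_per_watt(30.0, 15.0)   # e.g. 30 inf/s at 15 W ->   2 inf/s/W
accelerator  = perf_per_watt(200.0, 2.0)   # e.g. 200 inf/s at 2 W -> 100 inf/s/W
print(f"speedup per watt: {accelerator / embedded_gpu:.0f}x")  # -> 50x
```

Normalizing by watts rather than comparing raw throughput is what matters for battery-powered and thermally constrained edge devices.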


Siddha Ganju


Siddha Ganju is a self-driving architect at NVIDIA. She was featured on the Forbes 30 Under 30 list, guides teams at NASA as an AI domain expert, and serves as a featured jury member in several international tech competitions. Previously, she developed deep learning models for resource-constrained edge devices at Deep Vision. She earned her degree from Carnegie Mellon University, and her work, which ranges from visual question answering to generative adversarial networks to gathering insights from CERN’s petabyte-scale data, has been published at top-tier conferences, including CVPR and NeurIPS.