Scaling AI at Cerebras

Long training times are the single biggest factor slowing down innovation in deep learning. Today’s common approach of scaling large workloads out over many small processors is inefficient and requires extensive model tuning. With increasing model and dataset sizes, new ideas are needed to reduce training times.
Urs Köster explores trends in the computer vision and natural language processing domains, along with techniques for scaling on the Cerebras Wafer Scale Engine, the largest chip in the world. Cerebras's unique, purpose-built processor lets you leverage sparsity to build larger models and makes model-parallel training an efficient alternative to data-parallel training.
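As background for the session, the contrast between the two parallelism strategies and the payoff from sparsity can be shown for a single linear layer. The following is a minimal NumPy sketch, not Cerebras's implementation; all shapes and variable names are made up for illustration.

    # Minimal sketch (illustrative only): data-parallel vs. model-parallel
    # execution of one linear layer, plus the arithmetic saved by sparsity.
    import numpy as np

    rng = np.random.default_rng(0)
    batch, d_in, d_hidden = 8, 4, 6          # toy sizes, chosen arbitrarily
    x = rng.standard_normal((batch, d_in))
    W1 = rng.standard_normal((d_in, d_hidden))

    # Data parallelism: replicate the weights, split the batch.
    # Each "device" holds a full copy of W1 and computes on half the batch;
    # in real training, gradients would then be averaged across devices.
    x_dev0, x_dev1 = np.split(x, 2, axis=0)
    y_data_parallel = np.vstack([x_dev0 @ W1, x_dev1 @ W1])

    # Model parallelism: split the weights, share the batch.
    # Each "device" holds half of W1's output columns and sees every sample;
    # partial activations are concatenated instead of gradients being reduced.
    W1_dev0, W1_dev1 = np.split(W1, 2, axis=1)
    y_model_parallel = np.hstack([x @ W1_dev0, x @ W1_dev1])

    # Same math, different split of work and communication.
    assert np.allclose(y_data_parallel, y_model_parallel)

    # Sparsity: zeroed weights are multiply-accumulates that hardware able
    # to skip zeros never has to perform.
    mask = rng.random(W1.shape) > 0.9        # keep roughly 10% of the weights
    dense_macs = batch * d_in * d_hidden
    sparse_macs = batch * int(mask.sum())
    print(f"dense MACs: {dense_macs}, sparse MACs: {sparse_macs}")

Both splits compute the same result; they differ in what must be communicated (averaged gradients versus partial activations), which is the crux of the abstract's claim that model parallelism can be the more efficient choice on a single very large processor.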
What you'll learn
- Discover new ideas for reducing training times

Urs Köster
Cerebras Systems
Urs Köster is the head of machine learning at Cerebras Systems, where he develops novel deep learning algorithms to enable the next generation of AI. He has 15 years of experience in neural networks and computational neuroscience and has contributed to machine learning frameworks, developed low-precision numerical formats, and led data science engagements. Previously, he was head of algorithms R&D at Intel Nervana and a researcher at UC Berkeley.