Reinforcement learning (RL) is emerging as a promising approach to intelligently interact with continuously changing physical or virtual environments. Advances in RL research have already shown remarkable results, such as Google’s AlphaGo beating the Go world champion, and are finding their way into self-driving cars, unmanned aerial vehicles, and surgical robotics. Not surprisingly, many see RL growing rapidly into a potentially dominant area in ML over the next decade. However, the applications of RL pose a new set of requirements, the combination of which creates a challenge for existing distributed execution frameworks: computation with millisecond latency at high throughput, adaptive construction of arbitrary task graphs, and execution of heterogeneous kernels over diverse sets of resources.
Ion Stoica, Robert Nishihara, and Philipp Moritz lead a deep dive into Ray, a new distributed execution framework for reinforcement learning applications developed by machine learning and systems researchers at UC Berkeley’s RISELab. They walk you through Ray’s API and system architecture and share application examples, including several state-of-the-art RL algorithms.
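To make the "adaptive construction of arbitrary task graphs" requirement above concrete, here is a minimal sketch in which each finished task decides whether more tasks get launched, so the shape of the graph is only known at run time. It deliberately uses Python's standard concurrent.futures module rather than Ray's API, and the names rollout and explore, the scoring formula, and the spawn threshold are all hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def rollout(depth, seed):
    """Hypothetical stand-in for a short simulation; returns a deterministic score 0-9."""
    return (seed * 3 + depth * 4 + 5) % 10


def explore(max_depth=3, workers=4):
    """Grow a task graph on the fly: finished tasks decide whether to spawn children."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each in-flight future to the (depth, seed) it was launched with.
        pending = {pool.submit(rollout, 0, s): (0, s) for s in range(2)}
        while pending:
            # React to whichever task finishes first instead of waiting for all of them.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                depth, seed = pending.pop(fut)
                score = fut.result()
                results.append((depth, seed, score))
                # Only promising results (assumed threshold: score >= 5) spawn follow-ups.
                if score >= 5 and depth + 1 < max_depth:
                    for child in (2 * seed + 1, 2 * seed + 2):
                        pending[pool.submit(rollout, depth + 1, child)] = (depth + 1, child)
    return sorted(results)


if __name__ == "__main__":
    for depth, seed, score in explore():
        print(depth, seed, score)
```

A thread pool on one machine does not address the latency, throughput, or heterogeneous-resource requirements listed in the abstract; the sketch only shows the control-flow pattern that a framework for RL workloads would need to support across a cluster.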
Ion Stoica is a professor in the electrical engineering and computer sciences (EECS) department at the University of California, Berkeley, where he does research on cloud computing and networked computer systems. Previously, he worked on dynamic packet state, the Chord DHT, Internet Indirection Infrastructure (i3), declarative networks, and large-scale systems, including Apache Spark, Apache Mesos, and Alluxio. He’s a cofounder of Databricks (a startup to commercialize Apache Spark) and Conviva (a startup to commercialize technologies for large-scale video distribution). Ion is an ACM Fellow and has received numerous awards, including inclusion in the SIGOPS Hall of Fame (2015), the SIGCOMM Test of Time Award (2011), and the ACM Doctoral Dissertation Award (2001).
Robert Nishihara is a fourth-year PhD student in the RISELab at the University of California, Berkeley, where he works with Michael Jordan on machine learning, optimization, and artificial intelligence.
Philipp Moritz is a PhD candidate in the electrical engineering and computer sciences (EECS) department at the University of California, Berkeley, with broad interests in artificial intelligence, machine learning, and distributed systems. He’s a member of the Statistical AI Lab and the RISELab.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.