Data analytics has been one of the fastest-growing fields in recent years. It is used in many industries to verify models and theories and to make predictions and decisions. Among data analysis algorithms, deep neural networks have been shown to be effective at learning very complicated relationships from huge datasets. They significantly outperform previous state-of-the-art approaches in many fields, such as visual object recognition, speech recognition and synthesis, machine translation, and natural language processing. Mocha.jl is a library that encapsulates the complicated computations of a deep neural network behind a flexible, high-level interface, allowing people to configure and train neural network models easily in Julia.
Julia is a high-level dynamic language designed for scientific and technical computing. By working in Julia, we get powerful data manipulation primitives for free, making it easy to pre-process and convert raw data, as well as to visualize and interpret the prediction results. Mocha.jl implements multiple computation backends. The pure Julia backend is portable, allowing us to try out network prototypes in any environment that runs Julia. By changing a single configuration setting, we can switch the backend and run the same neural network on a server node with GPU devices, giving speedups of 20 to 30 times depending on the model (its size and other parameters).
In this session, we will walk through a simple example that demonstrates the APIs of Mocha.jl and shows how to use Mocha.jl primitives to compose sophisticated neural network models. We will also introduce the design and architecture of the Mocha.jl implementation. Although basic concepts will be explained briefly, it will be helpful if attendees already have an idea of what machine learning and (deep) neural networks are.
Chiyuan Zhang received his BS and Master’s degrees in computer science from Zhejiang University, China, in 2009 and 2012, respectively. He is currently a PhD candidate in the Computer Science and Artificial Intelligence Laboratory at MIT. His research interests include machine learning and computational neuroscience, as well as applications to processing / analysis of speech, vision, and other kinds of real-world signals.