Spark is a fast and general engine for large-scale data processing. Julia is a fast and general engine for large-scale compute. Viral Shah and Stefan Karpinski explain how combining Julia’s compute and Spark’s data processing capabilities makes amazing things possible.
Viral and Stefan offer an overview of Julia, which solves the two-language problem (prototyping in a high-level language, then rewriting in a low-level one for speed) in multiple computational domains. Julia is simultaneously fast and easy to use and is quickly becoming the language of choice for data scientists, statisticians, quants, actuaries, chemists, physicists, biologists, psychologists, and other applied mathematicians worldwide. With Spark.jl, a Julia package that makes it possible to run Julia in a Spark cluster and exchange data with Spark, these users now have a first-class way to access Spark data from Julia and leverage Julia’s computational capabilities. Viral and Stefan also share a real-world example from one of the world’s largest insurers and the lessons learned along the way.
Viral Shah is the cofounder and CEO of Julia Computing and a cocreator of the Julia language, as well as other open source software. Previously, he drove the rearchitecting of the Indian government’s social security systems as part of the national ID project, Aadhaar. Viral is the coauthor of Rebooting India.
Stefan Karpinski is one of the cocreators and core developers of the Julia language. He is an applied mathematician and data scientist by trade, having worked at Akamai, Citrix Online, and Etsy, and is currently focused on advancing Julia’s design, implementation, documentation, and community.