Please clone or download the following GitHub repo:
Apache Spark is written in Scala. Although Spark provides a Java API, many data engineers are adopting Scala because it is the “native” language for Spark and because Spark code written in Scala is much more concise than comparable Java code. Most data scientists, however, continue to use Python and R.
If you want to learn Scala for Spark, this is the tutorial for you. Dean Wampler offers an overview of the core features of Scala you need to use Spark effectively, using hands-on exercises with the Spark APIs. You’ll learn the most important Scala syntax, idioms, and APIs for Spark development.
Dean Wampler is the vice president of fast data engineering at Lightbend, where he leads the Lightbend Fast Data Platform project, a distribution of scalable, distributed stream processing tools including Spark, Flink, Kafka, and Akka, with machine learning and management tools. Dean is the author of Programming Scala and Functional Programming for Java Developers and the coauthor of Programming Hive, all from O’Reilly. He is a contributor to several open source projects. A frequent Strata speaker, he’s also the co-organizer of several conferences around the world and several user groups in Chicago.