Machine learning may be overhyped nowadays, but there is still a strong belief that this area is exclusively for data scientists with a deep mathematical background who leverage the Python (scikit-learn, Theano, TensorFlow, etc.) or R ecosystems and use specific tools like RStudio, MATLAB, or Octave. There is some truth to this statement, but Java engineers can also get the best of the machine-learning world from an applied perspective by using their native language and familiar frameworks like Apache Spark. Taras Matyashovsky explains how to use Apache Spark MLlib to build a supervised learning NLP pipeline that distinguishes pop music from heavy metal, and how to have fun in the process. Along the way, Taras offers an overview of the simplest machine-learning tasks and algorithms, like regression, classification, and clustering.
Taras Matyashovsky is a software engineer at Lohika, as well as a frequent speaker, the founder of the Morning@Lohika tech talks, and a program committee member of the JEEConf and XP Days Ukraine conferences. Primarily focused on the development of complex distributed systems and R&D activities, Taras is currently interested in microservices architecture, big data trends, and applied machine learning.
©2017, O'Reilly Media, Inc.