Jayant Shekhar, Amandeep Khurana, Krishna Sankar, and Vartika Singh guide participants through techniques for building machine-learning apps using Spark MLlib and Spark ML and demonstrate the principles of graph processing with Spark GraphX. Jayant, Amandeep, Krishna, and Vartika begin with the use cases for machine learning with Apache Spark. You’ll explore the various algorithms available in Spark MLlib and Spark ML, including those for basic statistics, classification and regression, collaborative filtering, clustering, dimensionality reduction, and frequent pattern mining. Along the way, you’ll solve problems with these algorithms and cover streaming k-means clustering. You’ll also learn use cases for graph processing and get an overview of programming with Spark GraphX, followed by hands-on coding examples of graph-processing problems using GraphX.
Jayant Shekhar is the founder of Sparkflows Inc., which enables machine learning on large datasets using Spark ML and intelligent workflows. Jayant focuses on Spark, streaming, and machine learning and is a contributor to Spark. Previously, Jayant was a principal solutions architect at Cloudera working with companies both large and small in various verticals on big data use cases, architecture, algorithms, and deployments. Prior to Cloudera, Jayant worked at Yahoo, where he was instrumental in building out the large-scale content/listings platform using Hadoop and big data technologies. Jayant also worked at eBay, building out a new shopping platform, K2, using Nutch and Hadoop among others, as well as KLA-Tencor, building software for reticle inspection stations and defect analysis systems. Jayant holds a bachelor’s degree in computer science from IIT Kharagpur and a master’s degree in computer engineering from San Jose State University.
Amandeep Khurana is a solutions architect at Cloudera, where he’s involved in the entire lifecycle of Hadoop adoption for customers, from use-case discovery to taking systems to production. Amandeep is also a coauthor of HBase in Action, a book geared toward building applications using HBase. Prior to Cloudera, Amandeep was at Amazon Web Services, where he was part of the Elastic MapReduce team and built the first version of EMR’s HBase offering.
Krishna Sankar is a Distinguished Engineer for Artificial Intelligence and Machine Learning at U.S. Bank, focusing on augmented intelligence, digital humans, and areas like AI explainability. His earlier roles include senior data scientist at Volvo Cars, chief data scientist at blackarrow.tv, data scientist at Tata America Intl, director of data science at a bioinformatics startup, and Distinguished Engineer at Cisco. He has spoken at various conferences, including ML tutorials at Strata SJC and London 2016, Spark Summit [goo.gl/ab30lD], Strata-Sparkcamp, OSCON, PyCon, and PyData. He writes about Nash equilibrium, Isaac Asimov and robot rules [goo.gl/5yyRv6], and has guest lectured at the Naval Postgraduate School. His occasional blog posts can be found at https://medium.com/@ksankar
They include NeurIPS2018 — Conference Summary [https://goo.gl/VgeyDT], Deep Thinking by Garry Kasparov: The Education Of A Machine [https://goo.gl/9qv671], and Ask not if AlphaZero can beat humans in Go — Ask if AlphaZero can teach humans to be a Go champion [https://goo.gl/vPzN9B]. His other passions are semantic Go engines, flying drones (he is working toward an FAA UAS drone pilot license), and Lego robotics; you will find him at the Detroit FLL World Competition as a Robot Design judge.
Vartika Singh is a solutions architect at Cloudera with over 12 years of experience applying machine learning techniques to big data problems.
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.