Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK

Spark Camp

Paco Nathan (Databricks), Alex Sicoe (Elsevier)
9:00–17:00 Tuesday, 5/05/2015
Tools & Technology
Location: King's Suite - Sandringham
Average rating: 4.36 (14 ratings)

Prerequisite Knowledge

Some coding experience in Python, SQL, Java, or Scala is required, plus some familiarity with Big Data issues and concepts.

Materials or downloads needed in advance

What’s required for a laptop to use in the tutorial?

- laptop with wifi and a browser, and reasonably current hardware (2+ GB RAM)
- Mac OS X, Windows, and Linux all work fine
- make sure that you do not have corporate security controls that would prevent use of the network

All of the materials will use cloud-based notebooks, and temporary free accounts for Databricks Cloud will be provided to all participants in the tutorial, to run Apache Spark within Amazon AWS.


Spark Camp, organized by the creators of the Apache Spark project at Databricks, will be a day-long, hands-on introduction to the Spark platform including Spark Core, the Spark Shell, Spark Streaming, Spark SQL, MLlib, and more. We will start with an overview of use cases and demonstrate writing simple Spark applications. We will cover each of the main components of the Spark stack via a series of technical talks targeted at developers who are new to Spark. Intermixed with the talks will be periods of hands-on lab work. Attendees will download and use Spark on their own laptops, and learn how to deploy Spark apps in distributed big data environments including common Hadoop distributions and Mesos.
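To give a taste of the style of application covered in the labs, Spark programs chain functional transformations over a distributed dataset, as in the classic word count. The sketch below is not actual course material: it mimics Spark's flatMap → map → reduceByKey pattern using plain Python built-ins, with no cluster required, purely to illustrate the shape of the code attendees will write in PySpark.

```python
from collections import defaultdict

def word_count(lines):
    """Plain-Python sketch of Spark's word count pattern:
    flatMap (split lines into words) -> map (pair each word with 1)
    -> reduceByKey (sum the counts for each word)."""
    words = (w for line in lines for w in line.split())   # flatMap
    pairs = ((w, 1) for w in words)                       # map
    counts = defaultdict(int)
    for word, n in pairs:                                 # reduceByKey
        counts[word] += n
    return dict(counts)

lines = ["spark camp london", "spark sql and spark streaming"]
print(word_count(lines))  # 'spark' appears 3 times, every other word once
```

In PySpark the same chain would run over an RDD (e.g. `sc.textFile(...).flatMap(...).map(...).reduceByKey(...)`), with the work distributed across a cluster rather than a single generator pipeline.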



Paco Nathan

O’Reilly author (Just Enough Math and Enterprise Data Workflows with Cascading) and a “player/coach” who’s led innovative data teams building large-scale apps. Director of community evangelism for Apache Spark with Databricks, advisor to Amplify Partners. Expert in machine learning, cluster computing, and enterprise use cases for big data.


Alex Sicoe


Alex Sicoe is a software engineer at Big Data Partnership, working with clients on projects involving scalable storage and compute systems such as Apache Spark, Apache Cassandra, Apache Storm, and Apache Hadoop. He has extensive experience building data pipelines with these systems, as well as giving training courses on them. Alex is the first-ever certified Databricks trainer. Previously, he worked at CERN, building a large-scale monitoring system for the ATLAS experiment on top of Apache Cassandra.

Comments on this page are now closed.


Paco Nathan
1/05/2015 17:42 BST

Hi Simon,

Yes, that’s correct. All the materials will be accessible through a browser, with no need for local admin privileges on your laptop.

See you there!

Simon Webb
1/05/2015 12:37 BST

Hi – I’ll be attending Spark Camp on Tuesday and have a question regarding equipment. Can I read into the statement “All of the materials will be using cloud-based notebooks” that I’ll be able to complete the tutorial without local administrator privileges on my laptop? Sorry if that seems a basic question, struggling with corporate IT!

Paco Nathan
9/04/2015 18:17 BST

Definitely not Java :)

In my experience, the choice between Scala and Python depends mostly on the nature of the organizations you intend to work in…

Scala has numerous advantages, and using Spark does not require deep Scala; writing extensions to Spark likely will, however. Choosing Scala is commonly more about engineering distributed systems infrastructure.

Python is rapidly becoming the preferred language for organizations involved in Data Science work. With Spark, Python is inherently slower than Scala in terms of what the CPUs are doing, but in practice often faster in terms of what teams of people can do to surface insights from data at scale.

Deepak Vadithala
9/04/2015 12:03 BST

Thank you Paco. That will be helpful. I’ll register today and look forward to attending Spark Camp.

On a separate note: if I’m embracing Spark as a beginner, what do I need to pick up – Scala, Python, or Java? I mean, does any of them have a big advantage? I want to pick the right language since I’m just beginning anyway.

Paco Nathan
9/04/2015 3:22 BST

Hi Deepak,

Definitely. The code examples in Python or Scala can mostly be cut&paste if needed. Of course, those with more coding experience should feel free to explore in more detail. We also provide many examples in SQL, which sounds like it would be closer to your background. Hope to see you at Spark Camp!


Deepak Vadithala
8/04/2015 18:13 BST

Hi – I’m very keen to attend Spark Camp but I don’t have Python, Java, or Scala skills. I come from a DB & C# background. Do you still think I can get something from this course? Thanks in advance.


Paco Nathan
31/03/2015 20:04 BST

Hi Elina,

Having some coding experience in Python or Scala will help, and SQL as well. We do not need any advanced features in either language, and frankly there will be many examples that allow for cut&paste. Use of cloud-based notebooks is particularly good for that. The point is more about how to conceptualize typical problems and use Spark to solve them.

We will probably have R support generally available in Spark and the notebooks by the time of Strata EU. I cannot promise it, but that’s going into Spark now and will be a game-changer.

Elina Jeskanen
31/03/2015 12:05 BST

What are the prerequisites to be able to follow Spark Camp? Should one know Python, or is R sufficient? The purpose would be to learn how to use R on Spark for statistical modeling.