Apache Spark: Ask Us Anything

Paco Nathan (derwen.ai), Aaron Davidson (Databricks), Sameer Farooqui (Databricks), Hossein Falaki (Databricks Inc.), Alex Sicoe (Elsevier), Olivier Girardot (Lateral Thoughts)
Hadoop & Beyond
Location: 127-128
Average rating: 4.00 (2 ratings)

Join the Spark Team for an informal question and answer session.

Paco Nathan

derwen.ai

Paco Nathan is known as a “player/coach” with core expertise in data science, natural language processing, machine learning, and cloud computing. He has 35+ years of experience in the tech industry, at companies ranging from Bell Labs to early-stage startups. His recent roles include director of the Learning Group at O’Reilly Media and director of community evangelism at Databricks and Apache Spark. Paco is the cochair of JupyterCon and an advisor for Amplify Partners, Deep Learning Analytics, and Recognai. He was named one of the top 30 people in big data and analytics in 2015 by Innovation Enterprise.

Aaron Davidson

Databricks

Aaron Davidson is an Apache Spark committer and software engineer at Databricks. He has implemented Spark standalone cluster fault tolerance and shuffle file consolidation, and has helped in the design, implementation, and testing of Spark’s external sorting and driver fault tolerance.

Sameer Farooqui

Databricks

Sameer Farooqui is a client services engineer at Databricks, where he works with customers on Apache Spark deployments. Sameer works with the Hadoop ecosystem, Cassandra, Couchbase, and the general NoSQL domain. Prior to Databricks, he worked globally as a freelance big data consultant and trainer and taught big data courses. Before that, Sameer was a systems architect at Hortonworks, an emerging data platforms consultant at Accenture R&D, and an enterprise consultant for Symantec/Veritas (specializing in VCS, VVR, and SF-HA).

Hossein Falaki

Databricks Inc.

Hossein Falaki is a software engineer at Databricks working on the next big thing. Prior to that, he was a data scientist on Apple’s personal assistant, Siri. He graduated with a Ph.D. in computer science from UCLA, where he was a member of the Center for Embedded Networked Sensing (CENS).

Alex Sicoe

Elsevier

Alex Sicoe recently joined Elsevier as a software developer within the company’s big data analytics platform team. Previously he worked as an engineer with Big Data Partnership, working with clients on projects involving Apache Spark, Apache Cassandra, Apache Storm, and Apache Hadoop. He has extensive experience building data pipelines with these systems and giving training courses on them. He also worked at CERN, building a large-scale monitoring system for the ATLAS experiment on top of Apache Cassandra.

Olivier Girardot

Lateral Thoughts

Olivier Girardot is a software engineer and co-founder of Lateral Thoughts, working with clients on machine learning, big data, and DevOps solutions to help them tackle problems that require both expertise and experience, in order to become more efficient both as a company and as a team.

Comments on this page are now closed.

Comments

Paco Nathan
8-03-2015 6:58 CET

Hi Avusherla,

It's best to direct these kinds of questions to the <user@spark.apache.org> mailing list, where many people can join the discussion: http://spark.apache.org/community.html

Avusherla Bharath
8-03-2015 4:51 CET

I have a question regarding Spark. A few days ago I tried Spark on my system, and it is working fine. Now I want to install a Spark cluster on a multinode Hadoop cluster. Do I need to install Spark on each slave node where a Hadoop slave is present? How do I install it on a multinode Hadoop cluster?