Streaming data analysis in real time is becoming the fastest and most efficient way to obtain useful knowledge from what is happening now, allowing organizations to react quickly when problems appear or to detect new trends helping to improve their performance. In this talk, we present SAMOA, an open-source platform for mining big data streams. SAMOA is a platform for online mining in a cluster/cloud environment. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as S4 and Storm. SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering.
Gianmarco De Francisci Morales is a Research Scientist at Yahoo Labs Barcelona.
He received his Ph.D. in Computer Science and Engineering from the IMT Institute for Advanced Studies of Lucca in 2012. His research focuses on large scale data mining and big data, with a particular emphasis on Web mining and Data Intensive Scalable Computing systems. He is an active member of the open source community of the Apache Software Foundation working on the Hadoop ecosystem (Giraph, S4), and a committer for the Apache Pig project. He is a co-organizer of the workshop series on Social News on the Web (SNOW) co-located with the WWW conference. He is one of the lead developers of SAMOA, an open-source platform for mining big data streams.
For exhibition and sponsorship opportunities, email firstname.lastname@example.org
For information on trade opportunities with O'Reilly conferences, email email@example.com
For media-related inquiries, contact Maureen Jennings at firstname.lastname@example.org
View a complete list of Strata + Hadoop World contacts
©2015, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.