Several data mining applications depend on timely analysis of large, highly dynamic graphs. Existing graph mining architectures, however, are designed for offline computation, which makes them unsuitable for dynamic graphs. At the same time, devising custom algorithms for dynamic graphs is a notoriously difficult task, and this is a fundamental obstacle to deploying online graph mining applications.
Real-Time (RT) Giraph aims to bridge the gap between offline and online graph mining with an architecture that offers the programming simplicity of batch processing while supporting efficient continuous processing of dynamic graphs. Central to RT-Giraph is a novel online graph mining engine that builds on the principles of Implicit Incremental Computation (IIC). The IIC engine computes graph algorithms using the Pregel model, and automatically updates the output through incremental computation as the graph changes.
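To make the idea concrete, here is a minimal Python sketch of a Pregel-style computation that is updated incrementally when the graph changes. It is not RT-Giraph's IIC engine or the Apache Giraph API. The function names (build_graph, pregel_min_label, add_edge) are hypothetical, and the algorithm, connected components by minimum-label propagation, was picked only because it is easy to follow. The sketch assumes an undirected graph and handles edge insertions only. Edge deletions can invalidate labels and need more machinery than is shown here.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def pregel_min_label(adj, labels, active):
    """Run synchronous supersteps until no vertex is active.

    In each superstep every active vertex sends its label to its
    neighbours; a vertex that receives a smaller label adopts it and
    becomes active for the next superstep. Returns the superstep count.
    """
    supersteps = 0
    while active:
        inbox = defaultdict(list)
        for v in active:
            for u in adj[v]:
                inbox[u].append(labels[v])
        active = set()
        for u, msgs in inbox.items():
            best = min(msgs)
            if best < labels[u]:
                labels[u] = best
                active.add(u)
        supersteps += 1
    return supersteps


def add_edge(adj, labels, a, b):
    """Insert an edge and update labels incrementally.

    Only the two endpoints are re-activated; the update then spreads
    just as far as labels actually change, instead of restarting the
    whole computation from scratch.
    """
    adj[a].add(b)
    adj[b].add(a)
    labels.setdefault(a, a)
    labels.setdefault(b, b)
    return pregel_min_label(adj, labels, {a, b})


# Initial (batch) run: every vertex starts active with its own id as label.
adj = build_graph([(1, 2), (2, 3), (4, 5)])
labels = {v: v for v in adj}
pregel_min_label(adj, labels, set(adj))
before = dict(labels)

# Incremental run: a new edge joins the two components.
add_edge(adj, labels, 3, 4)

# Reference result: recompute everything from scratch on the new graph.
scratch = {v: v for v in adj}
pregel_min_label(adj, scratch, set(adj))
```

The user-facing logic in pregel_min_label is the same for the batch run and the incremental run. Only the initial set of active vertices differs, which is roughly the programming simplicity the abstract describes. How the actual engine decides what to recompute is not specified here.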
In this presentation, we discuss the challenges in supporting incremental computation for graph algorithms and present the design and implementation of RT-Giraph, an Apache Giraph extension. RT-Giraph is an open-source project, part of the Grafos.ml effort to create tools for large-scale Machine Learning and Graph Analytics.
Georgos Siganos is a Senior Scientist at Qatar Computing Research Institute, working on next-generation Graph Mining Architectures and Big Data Systems. He is also the lead of the Grafos.ml open-source project. Previously, he was a Research Scientist at Telefonica Research, focusing on Big Data and Peer-to-Peer Systems. He has authored more than 30 papers in journals and conferences. He received his Ph.D. from the University of California, Riverside.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.