Gaffer is a scalable open source graph database that runs on either Accumulo or HBase. A Parquet implementation is in progress, and Gaffer can be extended to other technologies as well. Its data ingest and query services integrate easily with Hadoop and Spark. Gaffer is designed to ingest and store data that is streamed in at very high rates or bulk-loaded in large batches, while still providing fast, flexible query access.
Gaffer allows rich properties, such as data sketches, to be stored on entities and edges in the graph. Its built-in aggregation framework lets users specify the logic that tells Gaffer how to evolve and aggregate these properties as new data is added. For example, each edge could have a count property that is maintained by a “sum” function. When a new instance of an existing edge is added, the count is updated without the client having to query for the existing edge first.
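The aggregation idea can be illustrated with a small sketch. This is not Gaffer's API: `ToyGraph`, `add_edge`, and the `last_seen` property are hypothetical names made up for illustration, and the in-memory dictionary stands in for a real store such as Accumulo. It only shows the principle that each property has a registered merge function, so a client can keep adding the same edge without reading it back first.

```python
import operator
from typing import Callable, Dict, Tuple

EdgeKey = Tuple[str, str]              # (source, destination)
Properties = Dict[str, int]
Aggregator = Callable[[int, int], int]


class ToyGraph:
    """Hypothetical in-memory stand-in for a store that aggregates on ingest."""

    def __init__(self, aggregators: Dict[str, Aggregator]) -> None:
        # One merge function per property name, e.g. "count" -> sum.
        self.aggregators = aggregators
        self.edges: Dict[EdgeKey, Properties] = {}

    def add_edge(self, source: str, destination: str, properties: Properties) -> None:
        key = (source, destination)
        stored = self.edges.get(key)
        if stored is None:
            # First instance of this edge: store a copy of its properties.
            self.edges[key] = dict(properties)
            return
        # Later instances: merge each property with its registered function.
        # The merge happens inside the store; the caller never reads the edge.
        for name, value in properties.items():
            if name in stored:
                stored[name] = self.aggregators[name](stored[name], value)
            else:
                stored[name] = value


# "count" is maintained by a sum function, "last_seen" by max.
graph = ToyGraph({"count": operator.add, "last_seen": max})
graph.add_edge("A", "B", {"count": 1, "last_seen": 5})
graph.add_edge("A", "B", {"count": 1, "last_seen": 9})
graph.add_edge("A", "B", {"count": 1, "last_seen": 7})
print(graph.edges[("A", "B")])
```

After the three calls, the single stored edge from A to B holds a count of 3 and a `last_seen` of 9. A sketch-valued property would follow the same pattern, with the sketch's own merge operation registered in place of `operator.add`.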
This session explores Gaffer’s history, architecture, data model, features, and functionality and outlines some future goals for the project.
©2017, O’Reilly UK Ltd • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.