Enterprises that pursue data-driven operations and decisions are approaching the conclusion that graph analysis capabilities will yield critical competitive advantages. However, for this impact to be fully realized, the results of any graph analysis must be available, in real time, to operational applications, data scientists, and developers across the enterprise.
Monsanto previously attempted graph analysis using both RDBMS-based and offline batch processing techniques. In the process, Monsanto found that some approaches couldn’t drill deeply enough to yield the necessary insights; others were limited in their expressiveness and therefore in their general usefulness outside the data science lab; and still others couldn’t provide answers quickly enough to be useful to the business. Monsanto finally selected a graph database, used alongside a broader tech stack that includes Apache Kafka, Spark, and Oracle. This stack allows Monsanto not just to derive but also to operationalize insights that have allowed it to shorten R&D cycles, better understand the dynamics of its business, and carry out certain types of science in silico.
Tim Williamson and Emil Eifrem draw on Monsanto’s real-world experience to explain how organizations can use graph databases to operationalize insights from big data. Tim and Emil discuss Monsanto’s big data stack, using examples from Monsanto’s substantial experience with graphs, and describe the service-oriented graph architecture that has already handled over one billion requests and is available to over 150 developers, data scientists, and applications throughout Monsanto.
Emil Eifrem is CEO of Neo Technology and cofounder of Neo4j, the world’s leading graph database. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. Before founding Neo, Emil was the CTO of Windh AB, where he headed the development of highly complex information architectures for enterprise content management systems.
Tim Williamson is a data scientist at Monsanto, where he leads a full stack data engineering team focused on creating distributed analysis capabilities around complex scientific datasets in genomics, genetics, and agronomic performance.