Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Taking graph applications to production

Denise Gosnell (DataStax)
2:40pm-3:20pm Wednesday, March 27, 2019
Average rating: 4.73 (11 ratings)

Who is this presentation for?

  • Data architects, data scientists, data engineers, graph practitioners, software architects, solution architects, and solution engineers

Level

Advanced

Prerequisite knowledge

  • Familiarity with graph data modeling
  • Experience using graph data in your application

What you'll learn

  • Understand the computational overhead introduced by a graph's branching factor and how to mitigate it with selectivity
  • Learn how to monitor for supernodes and resolve them through data modeling
  • Explore three common problems that typically lead to unsuccessful implementations of graph technology

Description

Over the past decade, Denise Gosnell has helped build some of the largest production applications of graph databases around the world. From those experiences, she’s collected a set of common areas in which teams frequently misstep when getting started with graph technology. It also happens that those themes parallel the experience of playing one of her favorite games, SimCity 2000. Denise walks you through a few of these topics.

Know the rules.
Adding graph data to your application brings a new paradigm of data modeling: relationship-first design instead of entity-first design. The transition to relationship-first design principles comes with a new set of rules to consider for understanding your application’s performance, just like learning the rules of building a successful metropolis in SimCity. In this section, you’ll dive into the computational overhead that the branching factor and selectivity of your graph traversals add to your system.
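
As a rough illustration of why branching factor and selectivity matter, the sketch below estimates how many edges a traversal touches when every vertex has the same number of neighbors and a filter keeps only a fraction of them at each hop. The function name `traversal_work`, the uniform branching factor, and the per-hop selectivity are simplifying assumptions made for this example; they are not taken from the session.

```python
def traversal_work(branching_factor: float, depth: int, selectivity: float = 1.0) -> float:
    """Estimate the number of edges touched by a multi-hop traversal.

    Assumes (hypothetically) that every vertex has `branching_factor`
    outgoing edges and that a filter keeps a `selectivity` fraction of
    the neighbors reached at each hop.
    """
    frontier = 1.0   # start from a single vertex
    touched = 0.0    # running total of edges examined
    for _ in range(depth):
        expanded = frontier * branching_factor  # edges examined at this hop
        touched += expanded
        frontier = expanded * selectivity       # only matching neighbors continue
    return touched


if __name__ == "__main__":
    # Unfiltered three-hop traversal versus one that keeps 10% of neighbors per hop.
    print(traversal_work(10, 3))
    print(traversal_work(10, 3, selectivity=0.1))
```

Under these assumptions the unfiltered work grows geometrically with depth, while a selective filter applied early keeps the frontier, and therefore the work at later hops, small.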

Things can quickly become catastrophic.
Relationship-first data modeling can create a sleeping time bomb in your graph data: namely, supernodes. Just like in SimCity, high volumes of progress without proper planning will eventually introduce a catastrophe. To plan for this, you will need to track, mitigate, and eliminate the potential for supernodes within your applications. In this section, Denise introduces supernodes and presents tangible plans for avoiding the disasters that they can create.
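
One simple way to track supernodes is to count how many edges touch each vertex and flag any vertex whose degree crosses a limit. The sketch below does this over an in-memory edge list; the function name `find_supernodes`, the edge-list input, and the threshold value are assumptions made for illustration and do not describe the monitoring approach presented in the session.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def find_supernodes(
    edges: Iterable[Tuple[Hashable, Hashable]], threshold: int
) -> Dict[Hashable, int]:
    """Return vertices whose degree is at least `threshold`.

    `edges` is an iterable of (source, destination) pairs. The threshold
    is a hypothetical, application-specific limit.
    """
    degree: Counter = Counter()
    for src, dst in edges:
        degree[src] += 1  # outgoing edge counts toward the source
        degree[dst] += 1  # incoming edge counts toward the destination
    return {vertex: d for vertex, d in degree.items() if d >= threshold}


if __name__ == "__main__":
    # One hub connected to five users, plus a single user-to-user edge.
    edges = [("hub", f"user{i}") for i in range(5)] + [("user0", "user1")]
    print(find_supernodes(edges, threshold=5))
```

In a production graph the degree counts would more likely come from the database or a periodic batch job than from an in-memory list, but the same threshold check applies.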

You’re going to make mistakes.
Just like the learning process for understanding the tools and rules for building a successful city, you’ll inevitably make some mistakes when starting down the path of integrating graph technology into your stack. These common mistakes often start out as red herrings that are misinterpreted as graph problems. In this section, you’ll explore three use cases that are frequently misinterpreted as graph problems and learn techniques for avoiding these traps.

Denise Gosnell

DataStax

Denise Koessler Gosnell is the chief data officer at DataStax, where she applies her experiences as a machine learning and graph data practitioner to make more informed decisions with data. Her career centers on her passion for examining, applying, and advocating the applications of graph data. She has patented, built, published, and spoken on dozens of topics related to graph theory, graph algorithms, graph databases, and applications of graph data across all industry verticals. Previously, Denise created and led the global graph practice at DataStax, a team that builds some of the largest distributed graph applications in the world. Before that, she worked in the healthcare industry, where she contributed to software solutions for permissioned blockchains, machine learning applications of graph analytics, and data science. Denise earned her PhD in computer science from the University of Tennessee as an NSF fellow. Her research coined the concept of “social fingerprinting” by applying graph algorithms to predict user identity from social media interactions.