As part of the US House Intelligence Committee investigation into how Russia may have influenced the 2016 US election, Twitter released the screen names of nearly 3,000 Twitter accounts tied to Russia’s Internet Research Agency. These accounts were immediately suspended, removing the data from Twitter.com and Twitter’s developer API. Ryan Boyd explains how he and his team reconstructed a subset of the Twitter network of Russian troll accounts and applied graph analytics to the data using the Neo4j graph database to uncover how these accounts were spreading fake news.
Ryan covers how they collected and munged the data distributed by NBC, taking advantage of the flexibility of the property graph, and demonstrates how NLP and graph algorithms such as PageRank and community detection can be applied to social media data to make sense of it. He shows how Cypher, the query language for graphs, is used to work with graph data and how visualization is used in combination with these algorithms to interpret the results of the analysis and to help share the story of the data.
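To give a flavor of the kind of graph algorithm mentioned above, here is a minimal sketch of PageRank by power iteration in plain Python. It is not the Neo4j implementation used in the session; the account names and retweet edges are made up for illustration, and the damping factor and iteration count are conventional defaults assumed here.

```python
# Toy retweet graph: an edge (a, b) means account a retweeted account b.
# All account names are hypothetical, not taken from the real dataset.
RETWEETS = [
    ("user_a", "troll_1"),
    ("user_b", "troll_1"),
    ("user_c", "troll_1"),
    ("troll_2", "troll_1"),
    ("user_a", "troll_2"),
    ("troll_1", "troll_2"),
]


def pagerank(edges, damping=0.85, iterations=50):
    """Return a dict mapping each node to its PageRank score."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each node passes its rank in equal shares along its outgoing edges.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


scores = pagerank(RETWEETS)
for account, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{account}: {score:.3f}")
```

In this toy graph the most retweeted account ends up with the highest score, which is the intuition behind using PageRank to find influential accounts in a retweet network.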
San Francisco-based software engineer, authNZ geek, data geek, and graph geek Ryan Boyd is director of developer relations for Neo4j, an open source graph database that powers connected data analysis in data journalism, cancer research, and some of the world’s top companies. Previously, he was head of developer relations for Google Cloud Platform and worked on more than 20 different APIs and developer products during his eight years at Google. Ryan is the author of Getting Started with OAuth 2.0, published by O’Reilly. Now that he has a young daughter, he no longer skydives but still enjoys the adventures of sailing and cycling.
©2018, O'Reilly Media, Inc.