Easier than Excel: Social Network Analysis of DocGraph with Gephi

Janos Hajagos (Stony Brook School of Medicine), Fred Trotter (FredTrotter.com)
Data Liquidity Salon G
Workshop. Please note: to attend, your registration must include Workshops on Wednesday.

The DocGraph dataset was released at Strata RX 2012. It is the result of an FOI request to CMS by healthcare data activist Fred Trotter (co-presenter). The dataset is minimal: each row consists of just three numbers, two healthcare provider identifiers and a weighting factor. By combining these three numbers with other publicly available information sources, novel conclusions can be drawn about the delivery of healthcare to Medicare members.
As an example of this approach see: http://tripleweeds.tumblr.com/post/42989348374/visualizing-the-docgraph-for-wyoming-medicare-providers
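
The three-number row structure and the idea of combining it with helper data can be sketched in Python. This is a minimal illustration, not the presenters' code: the column order, the sample identifiers and weights, and the `PROVIDER_STATE` lookup (standing in for NPI-associated helper data) are all assumptions made for the example.

```python
import csv
import io

# Hypothetical sample rows: two provider identifiers and a weighting factor.
SAMPLE_ROWS = """1000000001,1000000002,25
1000000002,1000000003,11
1000000001,1000000003,40
"""

# Hypothetical helper data: provider identifier -> practice state.
PROVIDER_STATE = {
    "1000000001": "WY",
    "1000000002": "WY",
    "1000000003": "NY",
}


def read_edges(text):
    """Parse rows of (provider, provider, weight) into a list of tuples."""
    edges = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        first, second, weight = row
        edges.append((first.strip(), second.strip(), int(weight)))
    return edges


def edges_within_state(edges, provider_state, state):
    """Keep only relationships where both providers are in the given state."""
    return [
        (first, second, weight)
        for first, second, weight in edges
        if provider_state.get(first) == state
        and provider_state.get(second) == state
    ]


edges = read_edges(SAMPLE_ROWS)
wyoming_edges = edges_within_state(edges, PROVIDER_STATE, "WY")
print(wyoming_edges)
```

Filtering on a helper attribute such as state is one plausible way to cut the full dataset down to a subset small enough to explore interactively.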

The DocGraph dataset consists of 49,685,810 relationships between 940,492 different Medicare providers. The complete dataset is too large to analyze with traditional tools, but useful subsets of it can be analyzed with Gephi. Gephi is an open source tool for visually exploring and analyzing graphs. This tutorial will teach participants how to use Gephi for social network analysis on the DocGraph dataset.

Outline of the tutorial:

Part 1: DocGraph and the network data model (30% of the time)

The DocGraph dataset
The raw data
Helper data (NPI associated data)
The graph / network data model
Nodes versus edges
How graph models are integral to social networking
Other Healthcare graph data sets
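
As a rough illustration of the node and edge distinction covered in Part 1, the sketch below derives the node set from a list of weighted edges and sums the weights touching each node. The provider labels, the weights, and the function names are hypothetical, and Gephi computes metrics of this kind itself; the code only shows the underlying data model.

```python
from collections import defaultdict

# Hypothetical edges: (provider, provider, weight).
SAMPLE_EDGES = [
    ("A", "B", 25),
    ("B", "C", 11),
    ("A", "C", 40),
]


def nodes_of(edges):
    """Nodes are the distinct providers that appear at either end of an edge."""
    return sorted({node for first, second, _ in edges for node in (first, second)})


def weighted_degree(edges):
    """Sum the weights of all edges touching each node."""
    totals = defaultdict(int)
    for first, second, weight in edges:
        totals[first] += weight
        totals[second] += weight
    return dict(totals)


nodes = nodes_of(SAMPLE_EDGES)
degrees = weighted_degree(SAMPLE_EDGES)
print(nodes)
print(degrees)
```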

Part 2: Using Gephi to perform analysis (70% of the time)

Basic usage of Gephi
Saving and reading the GraphML format
Laying out edges and nodes of a graph
Navigating and exploring the graph
Generating graph metrics on the network
Filtering a subset of the graph
Producing the final output of the graph
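
One way to prepare a subset for the GraphML step in Part 2 is to write it out with Python's standard library. The sketch below is an assumption-laden example rather than the workshop's own method: the sample edges are invented, the graph is treated as directed, and the edge attribute is named `weight` on the assumption that the importing tool will read it as the edge weight.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Hypothetical subset of relationships: (source, target, weight).
SAMPLE_EDGES = [
    ("1000000001", "1000000002", 25),
    ("1000000002", "1000000003", 11),
]


def tag(name):
    """Qualify an element name with the GraphML namespace."""
    return f"{{{GRAPHML_NS}}}{name}"


def edges_to_graphml(edges):
    """Serialize weighted edges as a GraphML document string."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(tag("graphml"))
    # Declare an edge attribute called "weight" (assumed attribute name).
    ET.SubElement(root, tag("key"), {
        "id": "weight",
        "for": "edge",
        "attr.name": "weight",
        "attr.type": "double",
    })
    # Treating the relationships as directed is an assumption.
    graph = ET.SubElement(root, tag("graph"), {"id": "G", "edgedefault": "directed"})
    for node_id in sorted({n for s, t, _ in edges for n in (s, t)}):
        ET.SubElement(graph, tag("node"), {"id": node_id})
    for source, target, weight in edges:
        edge = ET.SubElement(graph, tag("edge"), {"source": source, "target": target})
        data = ET.SubElement(edge, tag("data"), {"key": "weight"})
        data.text = str(float(weight))
    return ET.tostring(root, encoding="unicode")


graphml_text = edges_to_graphml(SAMPLE_EDGES)
print(graphml_text)
```

Saving `graphml_text` to a file with a `.graphml` extension should give a file that can be opened from Gephi's file menu.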

Janos Hajagos

Stony Brook School of Medicine

Dr. Janos G. Hajagos is the lead data analyst for a unique partnership between SUNY and the New York State Department of Health. He has a Ph.D. in Ecology and Evolutionary Biology and has published widely on topics ranging from risk analysis to applications of the semantic web in healthcare. He is a participant in the CTSAConnect project.

Fred Trotter

FredTrotter.com

Fred Trotter is the leading consultant and advocate for Free/Libre and Open Source (FOSS) Health Software. In recognition of his role within the Open Source Health Informatics community, Trotter was the only Open Source representative invited by the NCVHS to testify on the definition of ‘meaningful use’.

Trotter has contributed code to FreeMed and OpenEMR, is the current project manager of MirrorMed, and is the original author of FreeB, the world's first GPL medical billing engine. In 2004 Fred Trotter received the LinuxMedNews achievement award for work on FreeB. Fred Trotter manages the Open Source EHR review project with the American Medical Informatics Association (AMIA) Open Source Working Group (OSWG). Fred is also a member of WorldVistA.

Fred Trotter is a recognized expert in Free and Open Source medical software and security systems. He has spoken on those subjects at the SCALE DOHCS conference, the Retail Healthcare conference, LinuxWorld, and DefCon. He has been quoted in multiple articles on Health Information Technology in several print and online journals, including WIRED, ZDNet, Government Health IT, Modern Healthcare Online, Linux Journal, Free Software Magazine, and LinuxMedNews.

Trotter has a B.S. in Computer Science, a B.A. in Psychology, and a B.A. in Philosophy from Trinity University. Trotter minored in Business Administration, Cognitive Science, and Management Information Systems. Before working directly on health software, Trotter passed the CISSP certification and consulted for VeriSign on HIPAA security for major hospitals and health institutions. Trotter was originally trained in information security at the Air Force Information Warfare Center.


View a complete list of Strata Rx 2013 contacts