It’s time to start visualizing methods, not just data. The days of trusting obscure “black-box” data processes are numbered. Dismissing the data manipulation and modeling process as too technical or unnecessary prevents decision makers from effectively leveraging their data. Communicating about data science provides a richer and more participatory understanding of data, driving better-informed decision-making. In this tutorial, we’ll begin by detailing what exactly we’re doing to data. These actions include:
After describing the landscape of data modeling and manipulation, we’ll introduce a process for isolating which aspects of data processing are key to communicate. This requires a nuanced understanding of both your user and your data. Drawing on user-centered design, we’ll introduce participants to methods for homing in on the most relevant aspects of their data processing. By articulating the goals and limitations of data users, we can identify the assumptions in our data cleaning and modeling that are crucial to communicate.
Once participants have the tools to determine what to communicate, we’ll explore examples of sharing complex ideas visually. This link provides an excellent example of how a dense method (Markov chains) can be expressed through simple visual representations. This type of visual representation is a prerequisite to engagement with data.
While there has been an explosion of interest in data visualization, it’s time to start demanding methods visualization. Until we fully understand what we’re doing to data, the impact of visualizing it will be limited.
Michael Freeman is a senior lecturer at the Information School at the University of Washington, where he teaches courses on data science, data visualization, and web development. With a background in public health, Michael works alongside research teams to design and build interactive data visualizations to explore and communicate complex relationships in large datasets. Previously, he was a data visualization specialist and research fellow at the Institute for Health Metrics and Evaluation, where he performed quantitative global health research and built a variety of interactive visualization systems to help researchers and the public explore global health trends. Michael is interested in applications of data visualization to social change. He holds a master’s degree in public health from the University of Washington. You can find samples from his projects on his website.
©2015, O'Reilly Media, Inc.