
3-Step Data Visualization Process

Visualizing your data involves three main steps. Steps 1 and 2 can be done in either order and often happen at the same time.

Step 1: Formulate the question you want your visualization to answer or the story you want your visualization to tell about your data.

Step 2: Gather, sort, and understand your data.

Step 3: Apply your chosen visual representation. (If you are unfamiliar with your chosen tool, take time to learn and understand the software first.)

Gathering data

You may already have a dataset ready, or you may be in the process of collecting your own data.

If you need to find data to use for visualization, Pepperdine Libraries provides access to a variety of datasets and source data, which you can find by searching the library’s databases and online repositories. You can also create your own dataset using sources such as surveys, experiments, web scraping, or digitized documents.

If you have any questions about where to source your data, contact the library liaison for your subject area.

Understanding and sorting data

The following questions will help you approach, understand, and transform your data to be ready for visualization:

Examine your data

Do you have all of the data you need?
Does your data include all of the variables in which you are interested?
Are there any obvious errors in your data?
Is there data that is missing?
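
As an illustration of the questions above, here is a minimal Python sketch that checks a small table for absent variables and empty cells. The sample data, the column names, and the examine helper are all hypothetical and exist only for this example; the same checks can be done by eye in a spreadsheet.

```python
import csv
import io

# Hypothetical sample data: Ben has no age and Cara has no city.
SAMPLE_CSV = """name,age,city
Ana,34,Malibu
Ben,,Los Angeles
Cara,29,
"""

def examine(rows, required_columns):
    """Return the required columns that are absent and a count of empty cells per column."""
    columns = list(rows[0].keys()) if rows else []
    absent = [c for c in required_columns if c not in columns]
    missing_counts = {c: 0 for c in columns}
    for row in rows:
        for c in columns:
            # Treat empty or whitespace-only cells as missing
            if (row[c] or "").strip() == "":
                missing_counts[c] += 1
    return absent, missing_counts

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
absent, missing_counts = examine(rows, ["name", "age", "income"])
print("Variables not in the data:", absent)
print("Empty cells per column:", missing_counts)
```

A check like this only finds gaps. Spotting obvious errors (for example, an impossible age) still requires knowing what values are plausible for each variable.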

Understand your data type

What type(s) of data do you have?
- Qualitative or quantitative
- Nominal, ordinal, interval, ratio, categorical, or spatial
What is the range of values contained in your data?
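
One rough way to answer these questions for a single column is sketched below in Python. The describe_column helper and the sample values are hypothetical. The sketch assumes that a column is quantitative if every non-empty value parses as a number, which is a simplification: numeric codes such as ZIP codes would be misclassified, so treat the result as a starting point.

```python
def describe_column(values):
    """Classify a column of text values and summarize its range or categories."""
    # Ignore empty cells so gaps do not affect the classification
    present = [v.strip() for v in values if v.strip() != ""]
    if not present:
        return {"type": "empty"}
    try:
        numbers = [float(v) for v in present]
    except ValueError:
        # At least one value is not numeric: list the distinct categories
        return {"type": "qualitative", "categories": sorted(set(present))}
    return {"type": "quantitative", "min": min(numbers), "max": max(numbers)}

age_summary = describe_column(["34", "", "29", "41"])
rating_summary = describe_column(["low", "high", "medium", "low"])
print(age_summary)
print(rating_summary)
```

Note that the sketch cannot tell nominal from ordinal data. Deciding that "low", "medium", and "high" have a meaningful order is a judgment you make from knowing the data.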

Transform your data

Quality transformations

Do you need to clean up your data?
Do you need to correct errors in your data?
Do you need to fill in gaps in your data?
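
The following Python sketch shows one possible approach to correcting errors and filling gaps. It assumes a hypothetical list of ages in which None marks a gap and any value outside 0 to 120 is an entry error. Filling gaps with the median is only one option; depending on your question, leaving gaps empty or dropping those records may be more appropriate.

```python
from statistics import median

# Hypothetical ages: None marks a gap, and 340 is assumed to be a typo.
ages = [34, None, 29, 340, 41]

def clean_ages(values, low=0, high=120):
    """Treat out-of-range values as missing, then fill every gap with the median."""
    checked = [v if v is not None and low <= v <= high else None for v in values]
    known = [v for v in checked if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in checked]

cleaned = clean_ages(ages)
print(cleaned)
```

Whatever approach you choose, it is good practice to record which values were changed so that the visualization does not present filled-in values as observed ones.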

Analysis transformations

Do you need to parse (split up) any of your variables? (e.g., extract the year from a date)
Do you need to merge any of your variables? (e.g., combine a forename and surname into one name value)
Do you need to convert qualitative data or free-text into coded values or keywords?
Do you need to create calculations to use in the analysis? (e.g., percentage proportions)
Do you need to remove redundant data for which you have no planned use?
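
Several of the transformations above can be sketched in a few lines of Python. The records and field names below are hypothetical. The example parses the year from a date, merges a forename and surname into one name, calculates each record's percentage share of total visits, and leaves out a notes field that has no planned use.

```python
from datetime import date

# Hypothetical records; the "notes" field has no planned use.
records = [
    {"forename": "Ana", "surname": "Lopez", "joined": "2019-03-14", "visits": 30, "notes": "n/a"},
    {"forename": "Ben", "surname": "Okoro", "joined": "2021-11-02", "visits": 50, "notes": "n/a"},
    {"forename": "Cara", "surname": "Singh", "joined": "2021-06-30", "visits": 20, "notes": "n/a"},
]

total_visits = sum(r["visits"] for r in records)

transformed = []
for r in records:
    transformed.append({
        # Merge: combine forename and surname into one name value
        "name": f'{r["forename"]} {r["surname"]}',
        # Parse: extract the year from an ISO-formatted date
        "year": date.fromisoformat(r["joined"]).year,
        # Calculate: this record's share of all visits, as a percentage
        "visit_share_pct": round(100 * r["visits"] / total_visits, 1),
        # Remove: "notes" is simply not copied into the new record
    })

for row in transformed:
    print(row)
```

Building a new list instead of editing the records in place keeps the original data intact, which makes it easier to revisit a transformation later.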

Cleaning data

Data cleaning is the process of identifying and correcting (or removing) errors, inaccuracies, and inconsistencies so that your data is accurate, complete, and consistent. Basic actions like renaming column headers or splitting columns can easily be done in programs like Excel or Google Sheets.

However, if you need to do more significant transformations or cleanup of your data, you may want to consider the following tools:

OpenRefine - a free and open-source desktop application for working with messy data (available to download from the OpenRefine website)
DataWrangler - developed by the Stanford Visualization Group for wrangling messy data (no longer actively supported by the research group)
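
For small, well-understood cleanup tasks, inconsistencies of the kind described in this section can also be handled with a short script. The Python sketch below uses hypothetical city labels and a hypothetical normalize helper to collapse stray whitespace and mixed capitalization into one consistent spelling. It is an illustration only and is not how the tools listed above work internally.

```python
# Hypothetical labels with inconsistent spacing and capitalization
raw_cities = ["Malibu", " malibu", "MALIBU ", "Los  Angeles", "los angeles", "Malibu"]

def normalize(label):
    """Collapse repeated whitespace and apply title case."""
    return " ".join(label.split()).title()

normalized = [normalize(c) for c in raw_cities]
unique_cities = sorted(set(normalized))
print(unique_cities)
```

Title case is a simplifying assumption here. It mishandles some names (for example, those containing apostrophes), so review the normalized labels before relying on them.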

Resources

Need help with the design of your visualization? Johns Hopkins Sheridan Libraries has an excellent guide on designing effective visualizations.

Need help deciding what type of visualization to use? Yale Library has a great guide on different types of visualizations and use cases for each.

Books at Pepperdine Libraries

Effective Data Visualization by Stephanie Evergreen
ISBN: 9781544350882

Publication Date: 2019-05-14

This comprehensive how-to guide functions as a set of blueprints, supported by both research and the author's extensive experience with clients in industries all over the world, for conveying data in an impactful way.

Fundamentals of Data Visualization by Claus Wilke
ISBN: 9781492031086

Publication Date: 2019-05-14

This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures.

Hands-On Data Visualization by Jack Dougherty; Ilya Ilyankou
ISBN: 9781492086000

Publication Date: 2021-05-18

This introductory book teaches you how to design interactive charts and customized maps, beginning with simple drag-and-drop tools such as Google Sheets, Datawrapper, and Tableau Public. You'll also gradually learn how to edit open-source code templates like Chart.js, Highcharts, and Leaflet on GitHub. No coding experience is required.

Data Visualization by Jeffrey D. Camm; James J. Cochran; Michael J. Fry; Jeffrey W. Ohlmann
ISBN: 9780357631348

Publication Date: 2021-05-18

This book contains material on effective design, choice of chart type, effective use of color, how to explore data visually, and how to explain concepts and results visually in a compelling way with data.