Data Visualization

This guide introduces data visualization and highlights resources available at Pepperdine Libraries.

3-Step Data Visualization Process

Visualizing your data involves three main steps. Steps 1 and 2 can be done in either order and often overlap.

Step 1: Formulate the question you want your visualization to answer or the story you want your visualization to tell about your data.

Step 2: Gather, sort, and understand your data.

Step 3: Apply your chosen visual representation. If you are unfamiliar with your chosen tool, take time to learn and understand the software first.

Gathering data

You may already have a dataset ready, or you may be in the process of collecting your own data.

If you need to find data to visualize, Pepperdine Libraries provides access to a variety of datasets through its databases and online repositories, which you can search for datasets or source data. You can also create your own dataset using sources such as surveys, experiments, web scraping, or digitized documents.

If you have any questions about where to source your data, contact the library liaison for your subject area.

Understanding and sorting data

The following questions will help you approach, understand, and transform your data to be ready for visualization:

Examine your data

  • Do you have all of the data you need?
  • Does your data include all of the variables in which you are interested?
  • Are there any obvious errors in your data?
  • Is there data that is missing?
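
The checks above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the sample rows, the column names, and the 0 to 100 score range are hypothetical assumptions, and the function names are made up for this example.

```python
# Hypothetical sample rows; in practice these might come from csv.DictReader.
rows = [
    {"name": "Ana", "year": "2021", "score": "88"},
    {"name": "Ben", "year": "", "score": "92"},
    {"name": "Cy", "year": "2023", "score": "-5"},
]
required = ["name", "year", "score", "major"]  # variables you want to analyze


def missing_columns(rows, required):
    """Return required variables that appear in no row at all."""
    present = set()
    for row in rows:
        present.update(row.keys())
    return [col for col in required if col not in present]


def count_blanks(rows):
    """Count empty or whitespace-only values per column."""
    counts = {}
    for row in rows:
        for col, value in row.items():
            if value is None or str(value).strip() == "":
                counts[col] = counts.get(col, 0) + 1
    return counts


def out_of_range(rows, col, low, high):
    """Return rows whose numeric value in `col` falls outside [low, high]."""
    flagged = []
    for row in rows:
        try:
            value = float(row[col])
        except (KeyError, ValueError, TypeError):
            continue  # skip blanks and non-numeric values in this check
        if not low <= value <= high:
            flagged.append(row)
    return flagged


print(missing_columns(rows, required))      # variables you still need to collect
print(count_blanks(rows))                   # where data is missing
print(out_of_range(rows, "score", 0, 100))  # likely data-entry errors
```

A range check like this only catches obvious errors; values that are plausible but wrong still need a manual review.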

Understand your data type

  • What type(s) of data do you have?
    • Qualitative or quantitative
    • Nominal, ordinal, interval, ratio, categorical, or spatial
  • What is the range of values contained in your data?
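
As a rough illustration of these questions, the sketch below guesses whether a column is quantitative or qualitative and reports its range or its distinct categories. It uses only the Python standard library; the function names and sample columns are hypothetical, and the "every value parses as a number" rule is a simplifying assumption.

```python
def infer_kind(values):
    """Guess whether a column is quantitative or qualitative.

    A column counts as quantitative only if every non-blank value parses
    as a number; anything else is treated as qualitative.
    """
    non_blank = [v for v in values if str(v).strip() != ""]
    if not non_blank:
        return "unknown"
    try:
        for v in non_blank:
            float(v)
    except (ValueError, TypeError):
        return "qualitative"
    return "quantitative"


def describe(values):
    """Summarize a column: its kind plus either a range or its categories."""
    kind = infer_kind(values)
    non_blank = [v for v in values if str(v).strip() != ""]
    if kind == "quantitative":
        numbers = [float(v) for v in non_blank]
        return {"kind": kind, "min": min(numbers), "max": max(numbers)}
    return {"kind": kind, "categories": sorted(set(non_blank))}


# Hypothetical columns for illustration.
scores = ["88", "92", "", "75"]
majors = ["Biology", "History", "Biology"]

print(describe(scores))
print(describe(majors))
```

Treat the result as a starting point: numeric codes such as ZIP codes or ID numbers would be labeled quantitative here even though they are really nominal.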

Transform your data

Quality transformations

  • Do you need to clean up your data?
  • Do you need to correct errors in your data?
  • Do you need to fill in gaps in your data?
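
One possible way to approach these quality transformations in Python is sketched below. The corrections table and function names are hypothetical, and filling gaps with the median is just one assumption among several reasonable choices (you might instead leave gaps empty or drop incomplete rows).

```python
from statistics import median

# Hypothetical lookup of known misspellings; build your own from your data.
CORRECTIONS = {"biolgy": "biology", "histroy": "history"}


def clean_label(text):
    """Trim stray whitespace, normalize case, and apply known corrections."""
    label = " ".join(text.split()).lower()
    return CORRECTIONS.get(label, label)


def fill_gaps(values):
    """Replace missing numbers (None) with the median of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to base a fill value on
    fill = median(known)
    return [fill if v is None else v for v in values]


print(clean_label("  Biolgy "))
print(fill_gaps([10, None, 30, 20]))
```

Whatever approach you choose, record which values were corrected or filled so that the visualization does not present estimated values as observed ones.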

Analysis transformations

  • Do you need to parse (split up) any of your variables? (e.g., extract the year from a date)
  • Do you need to merge any of your variables? (e.g., combine a forename and surname into one name value)
  • Do you need to convert qualitative data or free-text into coded values or keywords?
  • Do you need to create calculations to use in the analysis? (e.g., percentage proportions)
  • Do you need to remove redundant data for which you have no planned use?
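
The transformations in this list can each be sketched as a small Python function. This is a minimal illustration using the standard library; the function names, the ISO date format, and the sample codebook are assumptions made for the example.

```python
from datetime import date


def extract_year(iso_date):
    """Parse: pull the year out of an ISO date string such as '2021-03-15'."""
    return date.fromisoformat(iso_date).year


def full_name(forename, surname):
    """Merge: combine forename and surname into a single name value."""
    return f"{forename.strip()} {surname.strip()}"


def code_response(text, codebook):
    """Convert free text: return the codes whose keyword appears in the text."""
    lowered = text.lower()
    return sorted({code for keyword, code in codebook.items() if keyword in lowered})


def percentages(counts):
    """Calculate: convert raw counts into percentage proportions."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: round(100 * n / total, 1) for key, n in counts.items()}


def keep_columns(row, wanted):
    """Remove redundant data: keep only the columns you plan to use."""
    return {col: row[col] for col in wanted if col in row}


# Hypothetical codebook mapping keywords to coded values.
codebook = {"parking": "FACILITIES", "expensive": "COST"}

print(extract_year("2021-03-15"))
print(full_name(" Ana ", "Lopez"))
print(code_response("The parking was expensive", codebook))
print(percentages({"yes": 1, "no": 3}))
print(keep_columns({"name": "Ana", "notes": "n/a"}, ["name"]))
```

Simple keyword matching like `code_response` is a blunt instrument; for real survey responses you would likely review and refine the coded values by hand.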

Cleaning data

Data cleaning is the process of identifying and removing (or correcting) errors, inaccuracies, and inconsistencies in your data. It helps ensure that your data is accurate, complete, and consistent. Basic actions like renaming column headers or splitting columns can easily be done in programs like Excel or Google Sheets.

However, if your data needs more significant transformation or cleanup, you may want to consider the following tools:

  • OpenRefine - a free and open-source desktop application for working with messy data (available to download from its website)
  • DataWrangler - developed by the Stanford Visualization Group for wrangling messy data (not actively supported by the research group)
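
To give a sense of what finding inconsistencies can look like, the sketch below groups spelling variants of the same value by reducing each one to a normalized key. It is loosely inspired by the fingerprint-style "key collision" idea used by data cleaning tools, but it is a simplified assumption-laden illustration, not the implementation of any tool listed above. The function names and sample values are hypothetical.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key: lowercase, no punctuation, sorted unique words."""
    lowered = value.strip().lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(no_punct.split())))


def cluster(values):
    """Group values sharing a fingerprint; return only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical values with inconsistent spelling and word order.
names = [
    "Pepperdine University",
    "pepperdine university.",
    "University, Pepperdine",
    "Yale Library",
]

for group in cluster(names):
    print(group)
```

Each printed group is a candidate for merging into one consistent value; review the groups before merging, since different entities can occasionally share a key.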

Resources

Need help with the design of your visualization? Johns Hopkins Sheridan Libraries has an excellent guide on designing effective visualizations.

Need help deciding what type of visualization to use? Yale Library has a great guide on different types of visualizations and use cases for each.

Books at Pepperdine Libraries