Data sets are most valuable when people can understand them. When done right, data visualization is a great way to display large amounts of information simply and intuitively. To make sure your visualizations are effective, though, you need to follow a few key standards and avoid a few all-too-common mistakes.
Data Visualization Do’s
Keep the visualizations simple
Many people assume that complex graphs, charts, and infographics make the most useful visualization tools, but the opposite is usually true. Good visualization should simplify messages and make the main data points as easy to understand as possible. Simple, effective visualizations often follow these guidelines:
- They minimize colors and other attention-grabbing elements that aren’t directly related to the data of interest.
- They show the full scale of the graph, then zoom to show the data of interest, if necessary.
- They use traditional line graphs, bar charts, and pie charts; these are simple and popular for a reason!
- They aim to grab attention and make their point in under five seconds.
- They include clear labels and titles to explain important chart elements.
Pay attention to how color is used
Using color effectively and intentionally will help you get your point across.
Consider using shades of the same color for comparisons, limiting the number of colors to minimize distraction, and using colors related to the topic being discussed, if applicable. Be mindful of colors that have strong connotations, like red and green, as well as colors with specific cultural associations that may be misleading or confusing for your audience.
Below is an example of what not to do when using colors in a visualization. The color selected for 46-49% is essentially indistinguishable from the color selected for 96-100%. Be sure to use colors that can easily and quickly be differentiated from each other.
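One rough way to catch this kind of problem before publishing is to compare every pair of colors in the palette numerically. The sketch below is only an illustration: the hex values, the third bucket, the `min_distance` threshold, and the helper names are all assumptions rather than details of the original chart, and plain RGB distance is a crude stand-in for how people actually perceive color.

```python
import math
from itertools import combinations


def hex_to_rgb(color):
    """Convert '#RRGGBB' into an (r, g, b) tuple of 0-255 integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(first, second):
    """Straight-line distance between two hex colors in RGB space."""
    return math.dist(hex_to_rgb(first), hex_to_rgb(second))


def confusable_pairs(palette, min_distance=60.0):
    """Return label pairs whose colors are closer than min_distance.

    min_distance is an assumed cutoff, not an established standard.
    """
    return [
        (label_a, label_b)
        for (label_a, color_a), (label_b, color_b) in combinations(palette.items(), 2)
        if rgb_distance(color_a, color_b) < min_distance
    ]


# Hypothetical palette: two dark blues that are nearly identical.
palette = {
    "46-49%": "#1f4e79",
    "50-95% (hypothetical bucket)": "#9ecae1",
    "96-100%": "#1f4f7d",
}

print(confusable_pairs(palette))  # [('46-49%', '96-100%')]
```

A check like this does not replace showing the chart to a real person, but it can flag near-duplicate colors early.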
Consider the intention of the visualization
What is the goal of your visualization?
There are two broad kinds of visualizations: presented and distributed.
Presented visualizations focus on conclusions and specific, key highlights, and they are as simplified as possible to accurately present the information.
Distributed visualizations provide both context and conclusions. This kind of visualization focuses on the story behind the data set and therefore may be more complex.
Check that your visualization is understandable
Just as you’d never publish important written content without external proofreading, you should always run graphs and charts by an outside party to make sure your visualizations tell the right story. Double-check these elements before deciding that a visualization is complete:
- Make sure labels are accurate, graphs are to scale, and the data displayed adds up to the total (if you’re using a pie chart).
- Ask someone who is unfamiliar with the data what they understand from the visualization. Is it consistent with what you’re trying to communicate? This includes asking them if the colors used are appropriate and make sense.
The graph above is clearly labeled, uses minimal, appropriate colors, has a well-defined key, and doesn’t try to give too much information at once.
Government budgets are notoriously hard to understand, but the White House created a straightforward tree map to visually break down its 2016 budget. The visualization uses color and size effectively to help readers quickly understand how much funding each issue or program gets.
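The accuracy check from the list in this section (that the data in a pie chart adds up to the total) is easy to automate. The following is a minimal sketch under a few assumptions: the function name, the slice labels, and the half-point `tolerance` that forgives rounding are all hypothetical choices.

```python
def pie_shares_problem(shares, tolerance=0.5):
    """Return a description of the problem, or None if the shares look valid.

    shares maps a slice label to its percentage. tolerance is an assumed
    allowance for rounding, e.g. three slices of 33.3%.
    """
    if any(value < 0 for value in shares.values()):
        return "at least one share is negative"
    total = sum(shares.values())
    if abs(total - 100) > tolerance:
        return f"shares sum to {total:g}%, not 100%"
    return None


# Hypothetical labels; the second set of numbers cannot form a pie chart.
print(pie_shares_problem({"A": 50, "B": 30, "C": 20}))  # None
print(pie_shares_problem({"A": 63, "B": 70, "C": 60}))  # shares sum to 193%, not 100%
```

A check like this only covers the arithmetic. Whether the labels and colors make sense still needs a human reviewer.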
Data Visualization Don’ts
Knowing what to avoid matters just as much. A bad or sloppy visualization can misrepresent data, which is discrediting at best and dishonest at worst. The tips below can help you keep your visualizations honest and effective.
Don’t misrepresent data
This one might seem like a no-brainer, but it’s important to mention. Whether it happens intentionally or unintentionally, misrepresenting data has consequences. For example, any of the following errors can undermine the validity of your data set or even your reputation:
- Starting an axis at a point that exaggerates differences within the data
- Using uneven intervals between numbers
- Using inaccurate or inconsistent scales on size comparisons
- Using colors that are inappropriate for the data set being described
This list is not exhaustive. It’s the responsibility of the data experts putting together a visualization to think about how the visualization could be misleading or misrepresentative.
The way the information above is displayed is misleading because the amount of visual space used is not proportional to the corresponding percentage, and the numbers don’t add up to 100%. The use of color is also distracting, as some colors are more eye-catching than others.
The sizes of the slices in this pie chart contradict the percentages associated with them. A smaller slice is given to those who would vote ‘Yes’ to Brexit even though their percentage is higher than that of those who would vote ‘No’.
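Two of the errors listed in this section, uneven intervals and an axis start that exaggerates differences, can be checked with a little arithmetic. The sketch below assumes a bar chart in which each bar is drawn from the axis start up to its value. The function names and the idea of an "exaggeration factor" are illustrative assumptions, not a standard metric.

```python
import math


def has_even_intervals(ticks):
    """True if consecutive axis ticks are all the same distance apart."""
    gaps = [later - earlier for earlier, later in zip(ticks, ticks[1:])]
    return all(math.isclose(gap, gaps[0]) for gap in gaps)


def exaggeration_factor(smaller, larger, axis_start):
    """How much the drawn bars inflate the true ratio larger / smaller.

    1.0 means bar heights are proportional to the values, which happens
    when the axis starts at zero. Larger results mean the chart makes the
    difference look bigger than it is.
    """
    if not (0 < smaller <= larger and axis_start < smaller):
        raise ValueError("expected 0 < smaller <= larger and axis_start < smaller")
    drawn_ratio = (larger - axis_start) / (smaller - axis_start)
    true_ratio = larger / smaller
    return drawn_ratio / true_ratio


print(has_even_intervals([0, 10, 20, 30]))  # True
print(has_even_intervals([0, 10, 20, 50]))  # False

# Hypothetical values: 60 is 20% larger than 50, but with the axis
# starting at 40 the taller bar is drawn twice as high as the shorter one.
print(exaggeration_factor(50, 60, axis_start=0))   # 1.0
print(exaggeration_factor(50, 60, axis_start=40))
```

A truncated axis is not always wrong, but a factor well above 1.0 is a signal to look again at how the chart will be read.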
Don’t try to present too much information
Squishing too much information into your visualization is confusing and just plain ugly. Here are a couple of signs that your visualization has too much information:
- There are more than six colors in your visual.
- The chart is crowded, and it is difficult, if not impossible, to differentiate between the data points within the first couple of seconds.
- You need multiple text boxes to explain the data points.
The pie chart above is far too crowded for a viewer to tell what it represents.
This visualization is beautiful but chaotic: the connections between the longlisters for the Booker Prize and what makes a prize-winning novel are unclear.
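The countable warning signs from the list in this section can be turned into a quick self-check. This is only a sketch: the function name and inputs are hypothetical, the six-color limit comes from the list, and crowding itself still has to be judged by eye.

```python
def clutter_warnings(series_colors, text_boxes, max_colors=6):
    """Return a list of warnings for a chart described by its colors and notes.

    series_colors: the color used for each data series, e.g. hex strings.
    text_boxes: how many explanatory text boxes the chart needs.
    """
    warnings = []
    distinct_colors = {color.lower() for color in series_colors}
    if len(distinct_colors) > max_colors:
        warnings.append(
            f"{len(distinct_colors)} colors used; consider {max_colors} or fewer"
        )
    if text_boxes > 1:
        warnings.append(f"{text_boxes} text boxes needed to explain the data")
    return warnings


# Hypothetical chart with seven series colors and three explanatory notes.
busy_chart_colors = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a",
                     "#66a61e", "#e6ab02", "#a6761d"]
for warning in clutter_warnings(busy_chart_colors, text_boxes=3):
    print(warning)
```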
Don’t put bad data into a visualization to try to make it look better
Bad data is bad data, and no choice of colors or chart type can give it more meaning.
63% + 70% + 60% = 193%. The slices of a pie chart must always add up to 100%.
This is a great example of “Phantom Data”, which is pretty meaningless to visualize.
Don’t assume that all visualization methods are created equal
Different charts and graphs work best for different types of information. Consider what your data is describing, and analyze which type of visualization will provide the clearest and most accurate picture of your data set.
The goal of the visualization above was to show the number of features available in various versions of Microsoft Word. However, a pie chart shows this in terms of proportion. This is not an effective use of a pie chart. A bar or column chart, for example, would have been a better way to compare the number of features per version.
In the chart above, the scale is not large enough for values as far apart as 995 and 62.5. Also, avoid comparing different measures (e.g., percentage vs. ratio vs. rate) in the same chart.
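To see why a bar chart suits a comparison of counts, such as the number of features per software version, it helps to look at how bars are built: each bar starts at zero and its length is proportional to the value, so counts can be compared directly. The plain-text sketch below uses made-up version names and feature counts purely for illustration. None of the numbers come from the original chart.

```python
def text_bar_chart(counts, width=40):
    """Render counts as horizontal bars whose lengths are proportional to the values."""
    peak = max(counts.values())
    if peak <= 0:
        raise ValueError("at least one count must be positive")
    label_width = max(len(label) for label in counts)
    lines = []
    for label, value in counts.items():
        # Every bar starts at zero, so its length scales with the value.
        bar = "#" * round(width * value / peak)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical feature counts for three made-up versions.
feature_counts = {"Version A": 120, "Version B": 180, "Version C": 240}
print(text_bar_chart(feature_counts))
```

Here the bar for Version C is twice as long as the bar for Version A, matching the 240-to-120 ratio in the data. A pie chart of the same numbers would instead show each version as a share of the combined total, which is not the comparison the reader needs.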
Data visualization is a valuable tool that can be used in a variety of ways to make large data sets accessible and approachable. Just make sure to follow this checklist of do’s and don’ts while making your visualization to ensure that it is intuitive, accurate, and efficient.