Making Tables of Numbers Comprehensible
I’m always amazed (read: dismayed) when I see the results of an analysis presented with a key set of the results delivered as a raw table of numbers. It is impossible to instantly comprehend a data table that has more than 3 or 4 rows and 3 or 4 columns. And, “instant comprehension” should be the goal of any presentation of information — it’s the hook that gets your audience’s brain wrapped around the material and ready to ponder it more deeply. Below are two different ways to use conditional formatting to convey information rather than data.
An industry analyst report was released recently that summarized the scores from the analyst’s evaluation of multiple platforms across a number of dimensions. You may recognize the table below (although I’ve doctored the results enough that I’m not giving away any of the analyst’s intellectual property):
This is a barely-dressed-up Excel spreadsheet. It takes some real staring at the table, including scanning and re-scanning the numbers, to realize that “Jupiter” is rated as the strongest offering. “Neptune” — the fourth vendor listed — appears to be the second strongest. To be fair, a high-level summary of these results is presented in a separate chart, but that chart is really high level.
Some things that, when I tried to wrap my own head around the table, seemed extraneous:
- 2 decimal places — these scores were the roll-ups of dozens and dozens of individual scores that were, inherently, somewhat subjective. Two decimal places implies a precision that simply does not exist.
- The actual component scores — in an evaluation like this, the reader really primarily cares about relative strength across each row rather than the absolute scores.
- The weightings — the weighting matters…but it’s just a factor in the formulas that get to the overall scores — it’s not actually part of the results
With that in mind, just to get a better understanding of the data myself, I grabbed the data and reformatted it using a heatmap. Each row is graduated separately based on the high/low values in that row:
I eliminated the actual display of the numbers for the components for each group, and I shifted the weighting off to the right (and lightened it and made it a smaller font). When I looked at this, I realized that the order of the vendors was simply alphabetical. While that is a logical order, and it may seem like a good way to “let the data speak for itself,” alphabetization is entirely arbitrary in this context. Why not arrange the platforms from overall strongest to weakest?
As it turns out, “Jupiter” happened to be first alphabetically and had the highest overall score. But, with this arrangement, we can quickly see where the different platforms stand out. For instance, “Mercury” is rated relatively lower on all dimensions except for “Employees,” where they were scored high relative to the other platforms. “Saturn” has only 3 areas where they are not the absolute weakest of the entire group.
Now, is the heatmap approach above the only way to present it? Do I think you have to omit the numbers from all of the cells? Of course not! But, it hardly seems arguable that the heatmap is much easier to digest than the raw data table.
A different type of data table is shown below. This one is a case where multiple metrics are shown across a single dimension (in this case, traffic source):
This is a reasonably formatted table of numbers, but reading and interpreting a bunch of numbers at a glance isn’t something we do well. Interestingly, the first “read” of this data actually comes from the number of digits in each metric rather than the numbers themselves. For instance, the higher conversion rate for email jumps out because it’s a longer number, rather than because it is numerically larger. As a matter of fact, the first three metrics actually each have a bit of a bar chart nature to them just because they have varying numbers of digits.
Understanding that that length is a much faster/easier visual input than numerical digits, we can use conditional formatting to capitalize on that fact:
Or, if you have the room to allow a slightly wider chart, add a column for each metric so that the value and the bars don’t overlap:
In either case, it then becomes much easier to grasp which metrics for which rows are anomalous. For instance, the conversion rate for email — noted earlier — but also the visits from paid search.
Always More Than “Just The Numbers”
Hopefully, the examples here, more than prescribing exactly how to plague-ishly avoid raw tables of numbers, show how much more readily comprehensible tables of numbers can be with some quick visualizations of the data. What do you think? Do you have techniques you use to make data tables more readily digestible?