Recovery.gov Needs Some Few and Some Tufte
I caught an NPR story about recovery.gov last week, and it sounded really promising. Depending on where you fall on the political spectrum, the various rounds of stimulus and bailout funding that have come through over the past six months fall somewhere between “throwing money away,” “ready, fire, aim,” and “point in what seems like it might be a good direction, pull the trigger, and shoot.” No one can stand up and say, with 100% certainty, that we’re not going to look back on this approach in a decade or two and say, “Um…oops?”
It’s hard to imagine anyone taking issue with the proclaimed intent of recovery.gov, though — make the process as transparent as possible, including how much money is going where, when it’s going, and what ultimately comes of it. It was a day or two before I found myself at a computer with time to check out the site…and I was disappointed. In the NPR interview, the interviewer commented how the site was slick and clean. Reality is “not so much.”
Now, I did once take a run at downloading the federal budget to try to scratch a curiosity itch regarding, at a macro level, where the federal government allocates its funds. On the one hand, I was pleased that I was able to find a .csv file with a sea of data that I could easily download and open with Excel. On the other hand, the budget is incredibly complex, and it takes someone with a deeper understanding of our government to really translate that sea of data into the answers I was looking for. Really, though, that wasn’t a surprise:
The data is ALWAYS more complex than you would like…when you’re trying to answer a specific question.
To the credit of recovery.gov, they clearly intended to show some high-level charts that would answer some of the more common questions citizens are asking. Unfortunately, it looks like they turned over the exercise to a web designer who had no experience in data visualization.
Examples from the featured area on the home page:
The overall dark/inverse style itself I won’t knock too much (although it bothers me). And, the fact that the gridlines are kept to a minimum is definitely a good thing. My main beef is admittedly a bit ticky-tack. There was an earlier version with a $30 B gridline, which has since been removed. That gridline clearly showed the “30.5 B point” sitting below the midway point between 20 B and 40 B. Clearly, someone would have to really be scrutinizing the graph to identify this hiccup, but someone will.
When presenting data to an audience, the data as it stands alone needs to be rock solid. If it contradicts itself, even in a minor way, it risks having its overall credibility questioned.
So, moving on to some more egregious examples:
We get a triple-whammy with this one:
- Pie charts are inherently difficult for the human brain to interpret accurately
- Pie charts are even worse when they are “tilted” to give a 3D effect — the wedges on the right and left get “shrunk” while wedges on the top or bottom get “stretched”
- Exploding a pie chart and then providing a pie chart of just the wedge…just ain’t good
Two questions this visualization might have been trying to answer:
- How much of the stimulus plan is devoted to tax benefits?
- How much of the stimulus plan is going to the “Making Work Pay” tax credit?
Without doing any math, can you estimate either one of these? For the first question, you’re estimating the size of the small wedge on the left pie chart. It looks like it’s ~ 1/4 of the pie, doesn’t it? In reality, it’s 37%! For the second question, you have to combine your first estimate with an estimate of the lavender wedge in the right pie chart…and that’s way more work than it’s worth. If you do the math, you’ll get that the lavender wedge works out to ~7% of the entire left pie. A simple table or a bar graph would be more effective.
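For anyone who wants to see “the math” spelled out, here is a minimal Python sketch. It uses only the two figures quoted above (37% and ~7%); the function names and the text-bar layout are my own hypothetical choices, and the “share of the right pie” is simply implied by dividing one figure by the other rather than read off the site.

```python
# Sketch only: the 37% and ~7% figures come from the post;
# everything else (names, bar width) is an illustrative assumption.

def overall_share(parent_share, child_share_of_parent):
    """Share of the whole pie taken by a wedge of an exploded wedge."""
    return parent_share * child_share_of_parent

def text_bar(label, share, width=50):
    """Render one share as a plain-text bar, labeled with its percentage."""
    filled = round(share * width)
    return f"{label:<28} {'#' * filled} {share:.0%}"

TAX_BENEFITS = 0.37      # share of the whole stimulus plan
MAKING_WORK_PAY = 0.07   # approximate share of the whole stimulus plan

# What the lavender wedge must be as a share of the right-hand pie
implied_share_of_right_pie = MAKING_WORK_PAY / TAX_BENEFITS

if __name__ == "__main__":
    print(text_bar("Tax benefits", TAX_BENEFITS))
    print(text_bar('"Making Work Pay" credit', MAKING_WORK_PAY))
    print(f"Implied share of the right pie: {implied_share_of_right_pie:.0%}")
```

Two bars on a common scale answer both questions at a glance, with no mental multiplication of one pie by another.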
And, finally, the estimated distribution of Highway Infrastructure Funds:
Well, that’s just silly. There is NO value in making these bars come flying out of the graph. Really.
Now, to the site’s credit, it takes all of 3 clicks to get from the home page to downloading .csv files with department-specific data and weekly updates (which includes human-entered context as to major activities during the prior week). That’s good (assuming it’s not unduly cumbersome to maintain)! And, I’m sure the site will continue to evolve. But, I’d love to see them bring in some data visualization expertise. The model for the visualization should be pretty simple:
- Identify the questions that citizens are asking about the stimulus money
- Present the data in the way that answers those questions most effectively
- Link to the underlying data — the aggregate and the detail — directly from each visualization
As it turns out, Edward Tufte has already been engaged (thanks to Peter Couvares for that tip via Twitter), and is doing some pro bono work. But, it’s not clear that he’s focussing on the high-level stuff. I would love to see Stephen Few get involved as well — pro bono or not! Or, hell, I’d offer my services…but might as well get the Top Dog for something like this.
Starting today, the site is hosting a weeklong online dialogue to engage the public, potential recipients, solution providers, and state, local and tribal partners about how to make Recovery.gov better. I’ve submitted a couple of ideas already!