Every Analyst Should Follow fivethirtyeight.com
I’ll admit it: I’m a Nate Silver fanboy. That fandom is rooted in my political junky-ism and dates back to the first iteration of fivethirtyeight.com in 2008. Silver then joined the New York Times, so fivethirtyeight.com migrated to be part of that media behemoth, and, more recently, he left the Times for ESPN (another media behemoth). This bouncing around has been driven by Silver’s passion for places where data is abundant and underutilized. He started with online poker, moved to baseball analytics, and then took a sharp turn to political polling (the original fivethirtyeight.com). From there he went even more deeply into politics (the Times iteration of fivethirtyeight.com), then broadly into data across many subjects (his book), and then stayed fairly broad, but with a return to some heavier sports coverage (the ESPN iteration of fivethirtyeight.com).
Silver talks a lot about what data can and cannot do and how it gets misused, and he often dives into details of statistical analysis that I really can’t quite follow. But there are two other things that he (and his team) do really, really well that I haven’t seen him talk about much:
- Picking the questions that are worth answering
- Effectively visualizing that data
These are both key to his success, but they’re also key to any analyst’s ability to deliver value within their organizations.
Picking Questions Worth Answering
Silver originally picked questions that simply intrigued him (winning at online poker, better analyzing baseball players, predicting election outcomes), and those wound up getting him to questions that had mass appeal. Now, as a media site, the questions his team picks, I assume, have a heavy component of “will this drive traffic?” The questions have a pretty diverse range:
- In the wake of the Sandy Hook shootings, what happened with media coverage and public opinion about gun control? [Article]
- Will the recent moves to add calorie counts to fast food menus actually change consumer consumption behavior? [Article]
- Would lifting the ban that prevents gay men from donating blood meaningfully move the needle on blood donations? [Article]
These questions are often driven by current events and, clearly, would be of interest to a sufficiently large number of potential readers.
“But I’m not trying to drive impressions with my analyses! I just want to drive my business forward!” you exclaim! “How does this relate to me?!”
I’ll claim that it does, but I’ll admit it’s a somewhat meta argument. The dream for most analysts is to produce work that gets widely shared internally because it reveals something surprising and actionable. It’s sooooo easy to lose sight, day in and day out, of the need to tackle the questions most likely to lead to dissemination and action. fivethirtyeight.com (any media site, really) has to focus on content that will be “popular” (in some definition of the word). As analysts, shouldn’t we constantly be going beyond reacting to the questions that fall in our laps and seeking out meaningful questions to answer?
For me, every time I read an article on fivethirtyeight.com and think, “Aw, man! That author is so lucky to have gotten to dig into that!” I try to remind myself that I do have some control over what I dig into with most of my clients, and I should constantly be seeking questions that would have broad and actionable appeal (and pushing them to identify those questions themselves).
Effectively Visualizing that Data
This second aspect of the content on fivethirtyeight.com is more tangible and directly applicable. It’s not that every article nails it, but most of the articles include visualized data, and most of those visualizations are very well thought through — neither picking a “standard” visualization, nor getting fancy for fanciness’s sake.
I’m a casual college football fan, at best, but it’s been interesting to watch Silver struggle with predicting who would be in the first “final four” with the change to the championship system that went into place this year. One of his approaches was to run simulations based on what clues he could find about how the selection committee would act, combined with predictions for the results of as-yet-unplayed games. This resulted in a chart like the one below.
The one below didn’t actually get the final four “right,” in that TCU dropped out and Ohio State was in, but this was almost impossible to predict accurately (between the wildcard of the selection committee’s process and the fact that Ohio State surprised everyone by blowing out Wisconsin in the Big Ten championship game, which was played several days after he ran this simulation). Still, the visualization works on two levels: 1) at a glance, it’s clear which teams his analysis shows as being in contention for a final four spot, and 2) the use of the heatmap and dividing lines provides a second level of detail as to the skew and variability that the model predicted for each team.
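For the curious, here is a minimal sketch in Python of the general idea behind that kind of simulation. To be clear, this is not Silver’s model. The team names, ranking scores, win probabilities, and the “committee noise” term are all hypothetical values I made up for illustration. Each simulated season flips a weighted coin for every team’s one remaining game, nudges its score up or down, adds some random noise to stand in for an unpredictable committee, and records where each team lands. Repeating that thousands of times gives the share of simulations in which each team finishes at each rank, which is the kind of table a rank-by-team heatmap could be drawn from.

```python
import random
from collections import Counter

# Hypothetical inputs: names, scores, and win probabilities are invented
# for illustration and do not describe any real team or model.
TEAMS = {
    "Team A": {"score": 95.0, "win_prob": 0.80},
    "Team B": {"score": 93.0, "win_prob": 0.70},
    "Team C": {"score": 91.0, "win_prob": 0.65},
    "Team D": {"score": 90.0, "win_prob": 0.75},
    "Team E": {"score": 89.0, "win_prob": 0.60},
    "Team F": {"score": 87.0, "win_prob": 0.85},
}
WIN_BONUS = 3.0        # assumed score bump for winning the remaining game
LOSS_PENALTY = 6.0     # assumed score drop for losing it
COMMITTEE_NOISE = 2.0  # assumed std. dev. of committee unpredictability


def simulate_once(rng):
    """Play out one hypothetical season; return team names, best first."""
    final = {}
    for name, team in TEAMS.items():
        won = rng.random() < team["win_prob"]
        score = team["score"] + (WIN_BONUS if won else -LOSS_PENALTY)
        score += rng.gauss(0, COMMITTEE_NOISE)
        final[name] = score
    return sorted(final, key=final.get, reverse=True)


def rank_distribution(n_sims=10000, seed=538):
    """Share of simulations in which each team finishes at each rank."""
    rng = random.Random(seed)
    counts = {name: Counter() for name in TEAMS}
    for _ in range(n_sims):
        for rank, name in enumerate(simulate_once(rng), start=1):
            counts[name][rank] += 1
    return {
        name: {rank: c / n_sims for rank, c in sorted(cnt.items())}
        for name, cnt in counts.items()
    }


def top_four_odds(dist):
    """Probability that each team finishes in the top four."""
    return {
        name: sum(p for rank, p in ranks.items() if rank <= 4)
        for name, ranks in dist.items()
    }


if __name__ == "__main__":
    for name, odds in top_four_odds(rank_distribution()).items():
        print(f"{name}: {odds:.1%} chance of a top-four finish")
```

The one-line summary per team is the “at a glance” layer, and the full rank-by-rank distribution is the second layer of detail.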
Are you not a “sportsball” (<– Michele Kiss hat tip) fan? Let’s look at an example from politics!
When Jeb Bush took an official pre-pre-pre-pre-“I’m running for U.S. President” step, Silver asked the question: “Is Jeb Bush Too Liberal To Win the Republican Nomination in 2016?” To tackle this, he pulled third-party data from three different sources that all used different techniques to quantify where various political figures fall on the liberal-conservative spectrum. The result? Another exceedingly well-presented visualization!
Again, the visualization works on two levels: 1) at a glance, it shows that Bush appears to skew to the left side of the conservative spectrum, but not extremely so, and 2) the second layer of detail shows where current (potential) and past Republican candidates fall relative to each other, how consistently the two or three different measurement systems aligned when making that assessment (see Rand Paul!), and even how the times have changed as to the “average” for the party (for Congress):
The great visualizations aren’t limited to sports and politics, nor are they limited to Silver’s posts. One final example is, in one sense, “just” a simple histogram, but it’s a histogram that has had some real care put into it by Mona Chalabi. She tried to answer the question: “How Common Is It For A Man To Be Shorter Than His Partner?” She was limited to secondary data (which was quite limiting!), and she noted at the outset that, for a range of reasons, the results weren’t all that surprising. But, in the histogram below, look at how much care was put into adding clear labels (“Woman taller” and “Man taller”), using color to emphasize the “answer to the original question,” and even adding a simple vertical line to represent “equal height.”
I absolutely love the level of care that fivethirtyeight.com puts into its visualizations. The site clearly has a well-defined style guide when it comes to the palette, fonts, and font size. But, as with any good style guide, those constraints enable a high level of creativity in determining the truly best way to visualize the information.
fivethirtyeight.com is my newest most favorite site. As I opened with, much of the underlying content is actually on topics I care about, but I’m going to justify my on-going consumption of that content by claiming that it is also a source of inspiration and motivation for improving my work as an analyst!