Baseball Stats and BI Musings Part I: Good Metrics?
There are two basic reasons to score a baseball game:
- Capture enough information on a single page (two pages, actually) that would allow you to entirely recreate the game, play by play, after the fact
- Capture information required to compile game/season statistics for individual players — things like batting average on offense, fielding effectiveness on defense, and ERA for pitchers (also technically a defensive thing)
This means you need to capture a lot of information. Every pitch typically gets recorded in some fashion, and each time a batter finishes at the plate (through a hit, a walk, being hit by a pitch, etc.), additional information has to be recorded. The more detailed the information, the more fun statistics you can pull from the data, and it’s generally good to capture a bit more data than you expect to use. For instance, with the system I’m using now, I capture the sequence of pitches for every batter: ball, then strike, then strike, then ball, then hit. That detail, in theory, would allow me to report how a batter fares when he is “behind in the count” (more strikes than balls) vs. “ahead in the count” (more balls than strikes). I’m not going there at all at this point.
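For the curious, here is a minimal sketch of how a logged pitch sequence could be turned into an ahead/behind label. The list-of-strings encoding and both function names are made up for illustration (they are not how my scoresheet stores anything), and it assumes every pitch before the last one is logged as simply "ball" or "strike".

```python
def count_before_last_pitch(pitches):
    """Return (balls, strikes) as the count stood before the final pitch."""
    balls = strikes = 0
    for pitch in pitches[:-1]:  # the last entry is the pitch that ended the at bat
        if pitch == "ball":
            balls += 1
        elif pitch == "strike":
            strikes += 1
    return balls, strikes


def count_state(balls, strikes):
    """Classify the count from the batter's point of view."""
    if strikes > balls:
        return "behind"
    if balls > strikes:
        return "ahead"
    return "even"


# The sequence from the paragraph above: ball, strike, strike, ball, hit.
sequence = ["ball", "strike", "strike", "ball", "hit"]
balls, strikes = count_before_last_pitch(sequence)
print(f"{balls}-{strikes}: {count_state(balls, strikes)}")  # prints "2-2: even"
```

Grouping at-bat results by that label would be the next step toward an "ahead vs. behind" report.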
At my son’s age, we really just want to make sure we get the final score right. But, the statistics are awfully alluring, so I’ve been logging the information in a spreadsheet so I can do some crunching and see what it tells me. We’re only four games in, and I’m no baseball sophisticate, so I started with the two most popular stats in baseball: earned run average (ERA) and batting average. I regularly mount my “a metric that isn’t tied to a clear objective is not a good metric” soapbox, and it turns out ERA is a pretty great metric. A pitcher’s objective is pretty clear: allow as few runs to score as possible. But, you can’t simply look at the total runs scored on a pitcher for two reasons:
- A great pitcher who has an infield that regularly flubs plays is going to have more runs scored on him than a similar pitcher who has Derek Jeter and Alex Rodriguez shagging grounders
- The more innings a pitcher pitches, the more runs he’s going to have scored on him
The “earned run” part of the ERA addresses the first issue by trying to isolate how many runs would have been scored if the other 8 players on the field played perfectly. The “average” part of ERA addresses the second issue by normalizing the metric to a 9-inning average (or a 6-inning average in my son’s case, as their games are only 6 innings long).
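The normalization described above is just earned runs scaled to a full game. A rough sketch, assuming you track outs recorded rather than innings (so partial innings are easy to handle); the function name and the `game_innings` parameter are my own labels:

```python
def era(earned_runs, outs_recorded, game_innings=9):
    """Earned runs allowed per regulation-length game.

    outs_recorded is used instead of innings so that partial innings
    (one or two outs) need no special notation.
    """
    if outs_recorded == 0:
        return None  # no outs recorded yet, so the average is undefined
    innings_pitched = outs_recorded / 3
    return earned_runs * game_innings / innings_pitched


# Made-up line: 4 earned runs over 24 outs (8 innings).
print(era(4, 24))                  # on the 9-inning scale: 4.5
print(era(4, 24, game_innings=6))  # on the 6-inning scale: 3.0
```

The same pitching line looks better on the 6-inning scale, which is worth remembering before comparing a youth-league ERA to the major-league benchmarks.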
What about setting a target? The Gospel According to Gilligan clearly states “Thou shalt not consider a metric worthy if it does not have a preset target.” In the majors, an ERA below 3.00 is considered to be pretty darn good. It’s a “benchmark” of sorts. Or, the other way to look at the metric is to say the target is a 0.00, which is unattainable, but a worthy stretch goal.
So, what about batting average? This seems pretty simple. The batting average is the percent of a player’s at bats where he gets a hit. It’s actually represented as a three-place decimal rather than a percentage (a .347 batting average means the player gets a hit on 34.7% of his at bats). The stat has been around as long as ERA and has long been considered the single best measure of a player’s offensive output. There are a couple of problems with the metric, though. First off, what is a batter’s primary objective? Ultimately, it’s to score runs…but there are too many other factors at play to use that as a metric. And, as it turns out, it’s not to get hits as much as it is to get on base, and hits are only one way of doing that. When you peel back the batting average calculation a bit, you find that a walk is not considered an official at bat, so it doesn’t go into the numerator or the denominator of the equation. The reasoning is that the batter got on base because the pitcher screwed up. That’s giving the pitcher a bit too much credit, as a batter who has “plate discipline” is a batter who doesn’t swing at balls: he gets more walks, and when he swings, he’s more likely to be swinging at a hittable ball. (Sacrifices also don’t count as an at bat, but I’m okay with that, as the batter’s objective in that case is to move the baserunner(s) up, so he’s not really trying to get on base himself. A fielder’s choice where the hitter winds up on base doesn’t count as a hit, which makes sense. And, if a batter puts a ball in play and then reaches base on an error, that’s still not considered a hit, because that was more a defensive goof than an offensive success, so it goes into the denominator as an at bat but not into the numerator as a hit. Oh…MAN…can I digress on this subject…!)
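Those rules about what counts where are easier to see as a tally. This is only a sketch: the outcome labels are hypothetical names for what a scoresheet might record, and hit by pitch is included as a non-at-bat alongside walks and sacrifices.

```python
HITS = {"single", "double", "triple", "home_run"}
# Plate appearances that are left out of the calculation entirely.
NOT_AT_BATS = {"walk", "hit_by_pitch", "sacrifice"}


def batting_average(outcomes):
    """Hits divided by official at bats, or None with no at bats."""
    at_bats = [o for o in outcomes if o not in NOT_AT_BATS]
    if not at_bats:
        return None
    hits = sum(1 for o in at_bats if o in HITS)
    return hits / len(at_bats)


def as_stat(value):
    """Format 0.347 the way a box score does: '.347'."""
    return f"{value:.3f}".lstrip("0")


# Made-up plate appearances for one batter.
outcomes = ["single", "walk", "strikeout", "error",
            "double", "sacrifice", "groundout", "fielders_choice"]
print(as_stat(batting_average(outcomes)))  # 2 hits in 6 at bats: prints ".333"
```

Note that the error and the fielder’s choice both land in the denominator without helping the numerator, while the walk and the sacrifice vanish completely.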
Whether it’s true or not, or whether it’s a gross oversimplification, Billy Beane, the general manager of the Oakland A’s, gets credited with the epiphany that getting on base matters more than getting hits. The story of how Billy used data to go against baseball’s conventional wisdom to make the Oakland A’s a consistent contender despite their minuscule payroll (by MLB standards) is the basic premise of Moneyball: The Art of Winning an Unfair Game. One of the metrics that Billy and his number-crunching assistant started focusing on was on-base percentage (OBP), which includes walks in the numerator and denominator of the calculation. OBP gets a lot closer to a batter’s objective than batting average does. So Beane started picking up college players who walked a lot but didn’t have a great batting average. And it worked.
Theo Epstein, the general manager of the Boston Red Sox, followed in Beane’s footsteps (he actually worked for Beane for all of 12 hours during Beane’s one-day stint as GM of the Red Sox). And the Red Sox finally won another World Series.
So, as I’ve started tallying the stats for my son’s team, I’ve calculated both batting average and OBP, and, lo and behold, we’ve got a couple of kids who are in the lowest third of the team based on batting average…but move up considerably when it comes to OBP. None of this is to be shared with the kids. At this point, they’re having a good time, they’re trying hard, and they’re learning to support each other, so introducing a hierarchy of “who’s better” is wildly counter-productive.
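To show how a player can trade places between the two lists, here is a small side-by-side sketch. The two stat lines are invented (they are not kids on the team), and the function names are my own. The OBP version follows the usual definition, which also counts hit-by-pitches and puts sacrifice flies in the denominator; both default to zero here so it reduces to the "walks in the numerator and denominator" description.

```python
def batting_average(hits, at_bats):
    """Hits per official at bat, or None with no at bats."""
    return hits / at_bats if at_bats else None


def on_base_percentage(hits, walks, at_bats, hit_by_pitch=0, sac_flies=0):
    """Times on base per opportunity, or None with no opportunities."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / opportunities if opportunities else None


# Invented stat lines: one patient hitter, one free swinger.
players = {
    "Player A": {"hits": 3, "walks": 6, "at_bats": 12},
    "Player B": {"hits": 4, "walks": 0, "at_bats": 14},
}

for name, line in players.items():
    ba = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: BA {ba:.3f}, OBP {obp:.3f}")
# Player A: BA 0.250, OBP 0.500
# Player B: BA 0.286, OBP 0.286
```

Player A trails on batting average but is on base far more often, which is exactly the kind of flip described above.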
In the end, I’ve violated my core tenet: I’m looking at metrics that are not actionable at all! But I’m having fun, and it’s got me thinking about data in some new ways. This post was about metrics. I’ll explore data quality in the next post. Stay tuned!