The Best Little Book on Data
How’s that for a book title? Would it pique your interest? Would you download it and read it? Do you have friends or co-workers who would be interested in it?
Why am I asking?
Because it doesn’t exist. Yet. Call it a working title for a project I’ve been kicking around in my head for a couple of years. In a lot of ways, this blog has been and continues to be a way for me to jot down and try out ideas to include in the book. This is my first stab at trying to capture a real structure, though.
The Best Little Book on Data
In my mind, the book will be a quick, easy read — as entertaining as a greased pig loose at a black-tie political fundraiser — but will really hammer home some key concepts around how to use data effectively. If I’m lucky, I’ll talk a cartoonist into some pen-and-ink, one-panel chucklers to sprinkle throughout it. I’ll come up with some sort of theme that will tie the chapter titles together — “myths” would be good…except that means every title is basically a negative of the subject; “Commandments” could work…but I’m too inherently politically correct to really be comfortable with biblical overtones; an “…In which our hero…” style (the “hero” being the reader, I guess?). Obviously, I need to work that out.
First cut at the structure:
- Introduction — who this book is for; in a nutshell, it’s targeted at anyone in business who knows they have a lot of data, who knows they need to be using that data…but who wants some practical tips and concepts as to how to actually go about doing just that.
- Chapter 1: Start with the Data…If You Want to Guarantee Failure — it’s tempting to think that, to use data effectively, the first thing you should do is go out and query/pull the data that you’re interested in. That’s a great way to get lost in spreadsheets and emerge hours (or days!) later with some charts that are, at best, interesting but not actionable, and, at worst, not even interesting.
- Chapter 2: Metrics vs. Analysis — providing some real clarity regarding the fundamentally different ways to “use data.” Metrics are for performance measurement and monitoring — they are all about the “what” and are tied to objectives and targets. Analysis is all about the “why” — it’s exploratory and needs to be hypothesis driven. Operational data is a third way, but not really covered in the book, so probably described here just to complete the framework.
- Chapter 3: Objective Clarity — a deeper dive into setting up metrics/performance measurement, and how to start with being clear as to the objectives for what’s being measured, going from there to identifying metrics (direct measures combined with proxy measures), establishing targets for the metrics (and why, “I can’t set one until I’ve tracked it for a while” is a total copout), and validating the framework
- Chapter 4: When “The Metric Went Up” Doesn’t Mean a Gosh Darn Thing — another chapter on metrics/performance measuremen. A discussion of the temptation to over-interpret time-based performance metrics. If a key metric is higher this month than last month…it doesn’t necessarily mean things are improving. This includes a high-level discussion of “signal vs. noise,” an illustration of how easy it is to get lulled into believing something is “good” or “bad” when it’s really “inconclusive,” and some techniques for avoiding this pitfall (such as using simple, rudimentary control limits to frame trend data).
- Chapter 5: Remember the Scientific Method? — a deeper dive on analysis and how it needs to be hypothesis-driven…but with the twist that you should validate that the results will be actionable just by assessing the hypothesis before actually pulling data and conducting the analysis
- Chapter 6: Data Visualization Matters — largely, a summary/highlights of the stellar work that Stephen Few has done (and, since he built on Tufte’s work, I’m sure there would be some level of homage to him as well). This will include a discussion of how graphic designers tend to not be wired to think about data and analysis, while highly data-oriented people tend to fall short when it comes to visual talent. Yet…to really deliver useful information, these have to come together. And, of course, illustrative before/after examples.
- Chapter 7: Microsoft Excel…and Why BI Vendors Hate It — the BI industry has tried to equate MS Excel with “spreadmarts” and, by extension, deride any company that is relying heavily on Excel for reporting and/or analysis as being wildly early on the maturity curve when it comes to using data. This chapter will blow some holes in that…while also providing guidance on when/where/how BI tools are needed (I don’t know where data warehousing will fit in — this chapter, a new chapter, or not at all). This chapter would also reference some freely downloadable spreadsheets with examples, macros, and instructions for customizing an Excel implementation to do some of the data visualization work that Excel can do…but doesn’t default to. Hmmm… JT? Miriam? I’m seeing myself snooping for some help from the experts on these!
- Chapter 8: Your Data is Dirty. Get Over It. — CRM data, ERP data, web analytics data, it doesn’t matter what kind of data. It’s always dirtier than the people who haven’t really drilled down into it assume. It’s really easy to get hung up on this when you start digging into it…and that’s a good way to waste a lot of effort. Which isn’t to say that some understanding of data gaps and shortcomings isn’t important.
- Chapter 9: Web Analytics — I’m not sure exactly where this fits, but it feels like it would be a mistake to not provide at least a basic overview of web analytics, pitfalls (which really go to not applying the core concepts already covered, but web analytics tools make it easy to forget them), and maybe even providing some thoughts on social media measurement.
- Chapter 10: A Collection of Data Cliches and Myths — This may actually be more of an appendix, but it’s worth sharing the cliches that are wrong and myths that are worth filing away, I think: “the myth of the step function” (unrealistic expectations), “the myth that people are cows” (might put this in the web analytics section), “if you can’t measure it, don’t do it” (and why that’s just plain silliness)
- Chapter 11: Bringing It All Together — I assume there will be such a chapter, but I’m going to have to rely on nailing the theme and the overall structure before I know how it will shake out.
What do you think? What’s missing? Which of these remind you of anecdotes in your own experience (haven’t you always dreamed of being included in the Acknowledgments section of a book? Even if it’s a free eBook?)? What topic(s) are you most interested in? Back to the questions I opened this post with — would you be interested in reading this book, and do you have friends or co-workers who would be interested? Or, am I just imagining that this would fill a gap that many businesses are struggling with?