Excel Tips

Excel Dynamic Named Ranges Redux — Multiple Series in One Chart

In one of the more consistently popular posts I’ve written, I went into detail about how to set up charts that would update based on a value selected from a couple of dropdown menus – specifically geared towards a dropdown menu that allows the selection of a date such that the chart(s) would update to reflect the data up to that date.

One of the commenters asked how to include multiple data series in a single chart using that same technique. I did a very quick example via email, but I mentally committed to documenting the specifics on the blog, so here we go (file download at the end of this post).

Add Some Data

I could, of course, just use the data I was already working with, but none of that fictitious data made sense as a stacked bar chart. So, the first step is to add a couple of data series that might reasonably belong in a stacked column chart – an easy one is to break out the web traffic into “New Visitors” versus “Returning Visitors.”

Following the same technique as described in the original post, I name the top cells NewVisitors_Current (Column E) and ReturningVisitors_Current (Column F) and copy the formula from the Web Traffic column into those two columns (it’s the same formula in all cells in row 1, and they can be copied without modification due to the use of “COLUMN()” in the formula).

Then, create NewVisitors_Range and ReturningVisitors_Range named ranges by going to Formulas » Name Manager, copying the formula for WebTraffic_Range, and then creating the two new named ranges using the same formula, except swapping out “WebTraffic” in the formula with “NewVisitors” and “ReturningVisitors.”

Note: This may seem like a complicated setup. It’s actually pretty quick and simple, and can even be achieved using a macro if there are a slew of metrics that need to be set up. One tip, though, is to establish a consistent naming convention for the different aspects of each metric.

So, enough with the seup. How do we put multiple series into a stacked bar chart?

Copy One of the Line Charts

The easiest way to get our base chart is to simply hold down <Ctrl>-<Shift> and click and drag one of the existing charts straight down on the worksheet. I’m a fan of copying charts rather than making new charts from scratch for two reasons: 1) It’s easier to keep them aligned and exactly the same size, and 2) It’s easier to keep the formatting the same (the formatting in this example is horrid, but that was for the sake of simplicity in the initial tutorial).

So, now we have two charts (I copied the date and “current total” cells as well, but we’re pretty much done there now – in this case, the current total uses the “Web Traffic” value, and it’s the sum of the New Visitors and Returning Visitors):

Change the Chart Type

Select the chart and then go to Chart Tools » Design » Change Chart Type and select the Stacked Column chart type:

You will now have a chart that looks like this:

But, this is still only one data series, and it’s the overall web traffic – not the breakout of new visitors and returning visitors. So…

Update the Data Series

Click on the columns in the chart, and a formula will appear in the formula bar (it’s not you…it’s a small image; image width constraints I apply to this blog, but you get the idea):

There are other ways to update the data, but this is the fastest when it’s a viable option. Simply change the first “Web Traffic,” which is the name of the data series, to “New Visitors.” Then, change “WebTraffic” later in the formula to “NewVisitors”. What you’re really doing with this second change is changing the data source from “WebTraffic_Range” to “NewVisitors_Range”.

The chart will update and will look like this:

Now, since we’re going to have two series on this chart, let’s go ahead and click on the column title and change it to “Web Traffic” manually (when you changed “Web Traffic” to “New Visitors” in the formula bar, you were changing the series name — Excel just noticed that you had only one series and no legend, so it decided to make that the chart title, too; you’ll still want the series name to be “New Visitors,” though; the reason should become apparent shortly…like…after the next sentence!). And, while we’re at it, let’s add a legend and make the chart a bit taller to make room for it!

Add the Second Series

Now, here’s where the fun happens. Right-click in the chart and select Select Data. Then, select New Visitors and click the Edit button. You’re not actually going to edit that data series, but it’s the fastest way to get the second series set up. In the Edit dialog box, select the entire contents of the Series values field and select <Ctrl>-<C> to copy the formula:

Click Cancel.

Click Add.

For the Series name enter “Returning Visitors” and then paste the formula (<Ctrl>-<V>) you just copied into the Series values field. Then, scroll to the end of that formula and replace “New” with “Returning”:

Click OK and then click OK again on the next screen.

Voila!

Still, as before, you can change the Report Period and the Report Range dropdowns to alter what data gets displayed on the chart.

You can download the spreadsheet with the full example if you want to fiddle around with it without starting from scratch.

Happy charting!

 

 

Excel Tips

Excel Dynamic Named Ranges = Never Manually Updating Your Charts

[This post was written in 2010. I’ve made a new version of the post that takes advantage of Excel tables, which simplified the process a bit (it’s still kinda’ complicated). That post is available here.]

[This post is about dynamic named ranges in Excel 2007. I’m seeing a lot of referral traffic to this post searching for Excel 2010. If you’re simply looking for where you define or modify named ranges in Excel 2010 (as one commenter indicated in response to an earlier version of this update), it’s on the Formulas tab in the Defined Names area — Name Manager. If you are looking for other Excel 2010-specific information that this post doesn’t cover, please leave a quick comment as to what the change/issue is that led you to the search. Thanks.]

I’ve had a pretty good run of theoretical posts about the nature of marketing measurement of late, so it seemed like I was due for a more down-in-the-weeds-Excel-efficiency-tactics write-up.  This blog isn’t really focussed on all of the myriad ways that Excel can be contorted to represent data effectively, but I’m a big believer in using tools as effectively as possible to remove as much rote report generation as possible. There are lots of blogs devoted entirely to Excel tips and tricks. My favorite on that front is Jon Peltier’s (if you get intrigued by this post, hop over and peruse a slew of other ways to have charts dynamically update).

This post describes (and includes a downloadable file of the example) a technique that we use extensively to make short work of updating recurring reports. Here are the criteria I was working against:

  • User-selectable report date
  • User-selectable range of data to include in the chart
  • Single date/range selection to update multiple charts at once
  • No need to touch the chart itself
  • Reporting of the most recent value (think sparklines, where you want to show the last x data values in a small chart, and then report the last value explicitly as a number)
  • No use of third-party plug-ins — one of these days, I’ll get around to playing with the various Excel add-ons like those offered by Tableau Software and XLCubed (or even the Peltier Tech add-ins, which are targeted but made by one of the top 3 most authoritative Excel resources on the ‘net), but that adds just the slightest of barriers and, again, isn’t needed for this exercise
  • No macros used — I don’t have anything against macros, but they introduce privacy concerns, version compatibility, odd little warnings, and, in this case, aren’t needed

The example shown here is pretty basic, but the approach scales really well.

Sound like fun?

Setting Up the Basics

One key here is to separate the presentation layer from the data layer. I like to just have the first worksheet as the presentation layer — let’s name it Dashboard — and the second worksheets as the data layer — let’s call that Data. (Note: I abhor many, many things about Excel’s default settings, but, to keep the example as familiar as possible, I’m going to leave those alone. This basic approach is one of the core components in the dashboards I work on every day, and it can be applied to a much more robust visualization of data than is represented here. See An Excel Dashboard Widget for a look at my thoughts on dashboard visualization.)

Data Tab Setup — Part 1

This is a slightly iterative process that starts with the setup of the Data tab. On that worksheet, we’ll use the first column to list our dates — these could be days, weeks, months, whatever (they can be changed at any time and the whole approach still works). For the purposes of this example, we’ll go with months. Let’s leave the first row alone — this is where we will populate the “current value,” which we’ll get to later. I like to use a simple shading schema to clearly denote which cells will get updated with data and which ones never really need to be touched. And, in this example, let’s say we’ve got three different metrics that we’re updating: Revenue, Orders, and Web Visits. This approach can be scaled to include dozens of metrics, but three should illustrate the point. That leaves us with a Data tab that looks like this:

While we’re on this tab, we should go ahead and defined some named cells and some named ranges. We’ll name the cell in the first row of each metric column as the current value for that metric (the cells don’t have to be named cells, but it makes for easier, safer updating of the dashboard as the complexity grows). Name each cell by clicking on the cell, then clicking in the cell address at the top left and typing in the cell name. It’s important to have consistent naming conventions, so we’ll go with <metric>_Current for this (it works out to have the metric identified first, with the qualifier/type after — just trust me!). The screen capture below shows this being done for the cell where the current value for Orders will go, but this needs to be done for Revenue and Web Traffic as well (I just remove the space for Web Traffic — WebTraffic_Current).

And, we’re definitely going to want to have the whole range of data on the tab available to us. Let’s call this MainData and define it by going to Formulas » Name Manager and clicking on New (this is Excel 2007 — it’s somewhere else easier to find in Excel 2003). Define a new range with a Workbook scope that encompasses all the columns and all of the rows of data (starting at row 3):

There are lots of ways to dynamically define MainData. You can just drag a big area if you want, but this is a slightly more elegant approach. I’m not going to go into the nuts and bolts of why this formula works, but you can look up the OFFSET and COUNTA functions and figure it out if you’re so inclined:

=OFFSET(Data!$A$3,0,0,COUNTA(Data!$A:$A)-2,COUNTA(Data!$2:$2))

We’ll also want a named range that just includes the list of months — create that the same way as MainData, but call it DateSelector and use a slightly different formula:

=OFFSET(Data!$A$3,0,0,COUNTA(Data!$A:$A)-2,1)

And, of course, we’ll actually need data — this would come later, but I’ve gone ahead and dropped some fictitious stuff in there:

That’s it for the Data tab for now…but we’ll be back!

Dashboard Tab Setup — Part 1

Now we jump over to the Dashboard worksheet and set up a couple of dropdowns — one is the report period selector, and the other is the report range (how many months to include in the chart) selector. Start by setting up some labels with dropdowns (I normally put these off to the side and outside the print range…but that doesn’t sit nice with the screen resolution I like to work with on this blog):

Then, set up the dropdowns using Excel data validation:

First, the report period. Click in cell C1, select Data » Data Validation, choose List, and then reference the named range of months we set up earlier, DateSelector:

When you click OK, you will have a dropdown in cell C1 that contains all of the available months. This is a critical cell — it’s what we’ll use to select the date we want to key off of for reporting, and it’s what we’ll use to look up the data. So, we need to make it a named cell — ReportPeriod:

Now, let’s do a similar operation for the report range — this tells the spreadsheet how many months to include in each chart. Click in cell C3, select Data » Data Validation, choose List, and then enter the different values you want as options (I’ve used 3, 6, 9, and 12 here, but any list of integers will work):

And, let’s name that cell ReportRange:

Does this seem like a lot of work? It can be a bit of a hassle on the initial setup, but it will pay huge dividends as the report gets updated each day, week, or month. Trust me!

Before we leave this tab, go ahead and select a value in each dropdown — this will make it easier to check the formulas in the next step.

Data Tab Setup — Part 2

Now is where the fun begins. We’re going to go back over to the Data worksheet and start setting up some additional named ranges. We’ve got MainData, which is the full range of data. We want to look at the currently selected Report Period (a named range called ReportPeriod) and find the value for each metric that is in the same row as that report period. That will give us the “Current” value for each metric. All you need to do is put the exact same formula in each of the three “Current” cells:

=VLOOKUP(ReportPeriod,MainData,COLUMN())

In this example, these are the values for each of the three arguments:

  • ReportPeriod — Jul-09, the value we selected on the Dashboard tab
  • MainData — this is the full set of data, including the list of months in column A
  • COLUMN() — this is 2, the column that the current metric is listed in (this function resolves to “3” for Orders and to “4” for Web Traffic)

So, the formula simply takes the currently selected month, finds the row with that value in the data array, and then moves over to the column that matches the current column of the formula:

Slick, huh? And, because the ReportPeriod data validation dropdown on the Dashboard worksheet is referencing the first column of the data on the Data tab, the VLOOKUP will always be able to find a matching value. (Read that last sentence again if it didn’t sink in — it’s a nifty little way of ensuring the robustness of the report)

This little bit of cleverness is really just a setup for the next step, which is setting up the data ranges that we’re going to chart. Conceptually, it’s very similar to what we did to find the current metric value, but we want to select the range of data that ends with that value and goes backwards by the number of months specified by ReportRange. So, in the values we selected above, Jul-09 and “6,” we basically want to be able to chart the following range of data:

We’ll do this by defining a named range called Revenue_Range (note how this has a similar naming convention to Revenue_Current, the name we gave the cell with the single value — this comes in handy for keeping track of things when setting up the dashboard). We can’t use VLOOKUP, because that function doesn’t really work with arrays and ranges of data. Instead, we’ll use a combination of the MATCH function (which is sort of like VLOOKUP on steroids) and the INDEX function (which is a handy way to grab a range of cells). Pull your hat down and fasten your seatbelt, as this one gets a little scary. Ultimately, the formula looks like this:

=INDEX(MainData,MATCH(ReportPeriod,DateSelector)-ReportRange+1, COLUMN(Revenue_Current)):INDEX(MainData, MATCH(ReportPeriod,DateSelector), COLUMN(Revenue_Current))

It’s really not that bad when you break it down. I promise!

Working from the outside in, you’ve got a couple of INDEX() functions. Think of those as being INDEX(First Cell) and INDEX(Last Cell).

The range is defined, in pseudocode, as simply:

=INDEX(First Cell):INDEX(Last Cell)

The Last Cell calculation is slightly simpler to understand. As a matter of fact, this is really just trying to identify the cell location (not the value in the cell) of the current value for revenue — very similar to what we did with the VLOOKUP function earlier. The INDEX function has three arguments: INDEX(array,row_num,column_num). Here’s how those are getting populated:

  • array — this is simply set to MainData, the full range of data
  • row_num — this is the row number within the array that we want to use; we’ll come back to that in just a minute
  • column_num — we use a similar trick that we used on the Revenue_Current function, in that we use the COLUMN() formula; but, since we set up this range simply as a named range (as opposed to being a value in a cell), we can’t leave the value of the function blank; so, we populate the function with the argument of Revenue_Current — we want to grab the column that is the same column as where the current revenue value is populated in the top row.

Now, back to how we determine the row_num value. We do this using the MATCH function, which we need to use on a 1-dimensional array rather than a 2-dimensional array (MainData is a 2-dimensional array). All we want this function to return is the number of the row in the MainData array for the currently selected report period, which, as it turns out, is the same row as the currently selected report period in the DataSelector range. The formula is pretty simple:

MATCH(ReportPeriod,DateSelector)

The formula looks in the DateSelector range for the ReportPeriod value and finds it…in the seventh row of the array. So, row_num is set to 7.

INDEX(First Cell) is almost identical to INDEX(Last Cell), except the row_num value needs to be set to 2 instead of 7 — that will make the full range match the ReportRange value of 6. So, row_num is calculated as:

MATCH(ReportPeriod,DateSelector)-ReportRange+1

(The “+1” is needed because we want the total number of cells included in the range to be ReportRange inclusive.)

Now, that’s not all that scary, is it? We just need to drop the full formula into a named range called Revenue_Range by selecting Formulas » Name Manager » New, naming the range Revenue_Range, and inserting the formula:

=INDEX(MainData,MATCH(ReportPeriod,DateSelector)-ReportRange+1, COLUMN(Revenue_Current)):INDEX(MainData, MATCH(ReportPeriod,DateSelector), COLUMN(Revenue_Current))

The whole formula is there, even if you can’t see it!

Repeat this last step to create two more named ranges with slightly different formulas (the differences are in bold):

  • Orders_Range: =INDEX(MainData,MATCH(ReportPeriod,DateSelector)-ReportRange+1,COLUMN(Orders_Current)):INDEX(MainData,MATCH(ReportPeriod,DateSelector),COLUMN(Orders_Current))
  • WebTraffic_Range: =INDEX(MainData,MATCH(ReportPeriod,DateSelector)-ReportRange+1,COLUMN(WebTraffic_Current)):INDEX(MainData,MATCH(ReportPeriod,DateSelector),COLUMN(WebTraffic_Current))

Tip: After creating one of these named ranges, while still in the Name Manager, you can select the range and click into the formula box, and the current range of cells defined by the formula will show up with a blinking dotted line around them.

You’re getting sooooooo close, so hang in there! In order for the chart labels to show up correctly, we need to make one more named range. We’ll call it Date_Range and define it with the following formula (this is just like the earlier _Range formulas, but we know we want to pull the dates from the first column, so, rather than using the COLUMN() formula, we simply use a constant, “1”:

=INDEX(MainData,MATCH(ReportPeriod,DateSelector)-ReportRange+1,1):INDEX(MainData, MATCH(ReportPeriod,DateSelector),1)

If you want, you can fiddle around with the different settings on the Dashboard tab and watch how both the “Current” values and (if you get into Name Manager) the _Range areas change.

OR…you can move on to the final step, where it all comes together!

Dashboard Tab Setup — Part 2 (the final step)

It’s back over to the Dashboard worksheet to wrap things up.

Insert a 2-D Line chart and resize it to be less than totally obnoxious. It will just be a blank box initially:

Right-click on the chart and select Select Data. Click to Add a new series and enter “Revenue” (without the quotes — Excel will add those for you) as the series name and the following formula for the series values:

=DynamicCharts_Example.xlsx!Revenue_Range

(Change the name of the workbook if that’s not what your workbook is named)

Click to edit the axis labels and enter a similar formula:

=DynamicCharts_Example.xlsx!Date_Range

You will now have an absolutely horrid looking chart (thank you, Excel!):

Tighten it up with some level of formatting (if you just can’t stand to wait, you can go ahead and start flipping the dropdowns to different settings), drop “=ReportPeriod” into cell E6 and “=Revenue_Current” into cell E7, and you will wind up with something that looks like this:

Okay, so that still looks pretty horrid…but this isn’t a post about data visualization, and I’m trying to make the example as illustrative as possible. In practice, we use this technique to populate a slew of sparklines (no x-axis labels) and a couple of bar charts, as well as some additional calculated values for each metric.

To add charts for orders and web traffic is a little easier than creating the initial chart. Just copy the Revenue chart a couple of times (if you hold down <Ctrl>-<Shift> and then click and drag the chart it will make a copy and keep that copy aligned with the original chart).

Then, simply click on the data line in the chart and look up at the formula box. You will see a formula that looks something like this:

=SERIES(“Revenue“,DynamicCharts_Example.xlsx!Date_Range, DynamicCharts_Example.xlsx!Revenue_Range,1)

Change the bolded text, “Revenue,” to be “Orders” and the chart will update.

Repeat for a Web Traffic chart, and you’ll wind up with something like this:

And…for the magic…

<drum rollllllllllll>

Change the dropdowns and watch the charts update!

So, is it worth it? Not if you’re going to produce one report a couple of times and move on. But, if you’re in a situation where you have a lot of recurring, standardized reports (not as mindless report monkeys — these should be well-structured, well-validated, actionable performance measurement tools), then the payoff will hit pretty quickly. Updating the report is simply a matter of updating the data on the Data tab (some of which could even be done automatically, depending on the data source and the API availability), then the Report Period dropdown on the Dashboard tab can be changed to the new report period, and the charts get automatically updated! You can then spend your time analyzing and interpreting the results. Often, this means going back and digging for more data to supplement the report…but I’m teetering on the verge of much larger topic, so I’ll stop…

As an added bonus, you can hide the Data tab and distribute the spreadsheet itself, enabling your end users to flip back and forth between different date ranges — a poor man’s BI tool, if ever there was one (in practice, there will seldom be any real insight gleaned from this limited number of adjustable dropdowns, and that’s not the reason to set them up in the first place).

I was curious as to what it would take to create this example from scratch and document it as I went. As it’s turned out, this is a lonnnnnnngggg post. But, if you’ve skimmed it, get the gist, and want to start fiddling around with the example used here, feel free to download it!

Happy dynamic charting!

Presentation, Reporting

Calculating Trend Indicators

Put this down as one of my more tactical posts, brought on by a fit of lingering annoyance with the use (and by “use” I mean “grotesque misuse”) of trend indicators on reports and dashboards. The trouble is that trends are a trickier business than they seem at first blush, and, at the same time, there are a number of quick and easy ways to calculate them…that are all problematic.

With the well-warranted increasing use of sparklines, which are inherently trend-y representations of data, I like to be able to put a meaningful trend indicator that complements the sparkline. Throughout this post, I will illustrate trendlines, but I’m really focussed on trend indicators, which are a symbol that indicates whether the trend in the data is upward, downward, or flat. Although there are a few minor tweaks I’d love to make once Excel 2010 is released and allows the customization of icon sets, I’m reasonably happy with their 5-arrow set of trend indicators:

Trend Icons

They’re clean and clear, and they work in both color and in black and white. And, with conditional formatting, they can be automatically updated as new data gets added to a dashboard or report. While I won’t show these indicators again in this post, the trendlines I do show are the behind-the-scenes constructs that would manifest themselves as the appropriate indicator next to a sparkline or numerically reported measure.

I’ll use a simple 12-period data set throughout this post to illustrate some thoughts (not as a sparkline, but the principles all still apply):

Sample Data

Trends are slippery beasts for several reasons:

  • Noise, noise, noise — all data is noisy, which means it’s easy to over-read into the data and spot a trend that is not really there
  • The aircraft carrier vs. the speedboat conundrum — the more data points you use, the more stable your trend, but the longer it takes to collect enough data to identify a trend, or, worse, to determine if you’ve truly impacted the trend going forward

Let’s start this exploration by walking through some of the common ways that “trend” judgments get made and point out why they’re troubling. I will then show an alternative that, while only marginally more complex to implement, works better when it comes to specifying trend-age.

Trending Approaches of which I’m Leery

Trending Based on the Change Over the Previous Period

The most common way I see trends reported is on a “change since the previous period” basis.

Prior Period

In this example, the trend would be an “up” because the data went up from the prior period to the current period. The problem with this is that, if you look at the longer pattern of data, you see that the data is pretty noisy, and it’s entirely possible that this “trend” is entirely a case of noise masking the true signal.

Trending Over an Extended Period

Another way to trend your data, which Excel makes very simple, is to add a trendline using Excel’s built-in trending capabilities (converting this trendline to an indicator would require some use of a couple of Excel functions that I’ll go into a bit in my recommended approach later in the post).

Trendline Example

With this method, the trend would be indicated as “slightly up.” While this may be a valid representation of the overall trend…it seldom seems quite right to use it. The trend gets impacted heavily by any sort of big spikes (or dips) in the data. These keep the same upward or downward trend for a very long period of time. I had a blog post during March Madness one year that wound up driving a big spike in traffic to my site. While it was legitimate for that spike to show an upward trend when I looked at my traffic that week or month, that spike has now wreaked havoc on the macro trend indicator that Google Analytics has shown ever since — for several months that spike kept my overall trend up, and, then, once that spike passed the fulcrum of the tool’s trend calculation, it caused the reporting of a downward trend for severals subsequent months. Through the whole period, I had to mentally discount what the trend indicator showed.

Year-Over-Year Trending

Because seasonality wreaks havoc with trendlines, it’s not uncommon to see trend indicators based on year-over-year results — if the current reporting period is a higher number than the same period a year ago, then the trend is up. For trending purposes, this combines the worst of the two prior examples — it takes a very small number of data points (subjecting the assessment to noise) and it uses ancient history data in the equation.

This isn’t to say that comparisons to the same period in the prior year (or even the same period in the prior quarter, since many companies see an intra-quarter pattern) are bad. But, the question those comparisons answer differs from a trend: a trend should be an indication of “where we are heading of late such that, if we continue on the current course, we can estimate whether we will  be doing better or worse next week/next month,” while a year-over-year comparison is more a measure of “did we move positively from where we were last year at this time?”

Trending Approaches I Feel Better About

I’ve spent an embarrassing amount of time thinking about trending over the past four or five years, but I’ve finally settled on an approach that meets all of these criteria:

  • It balances the number of data points available for the trend with the sluggishness/timeliness of the results
  • It’s reasonably intuitive to explain
  • It passes the “sniff test” — while a trend indicator may initially be a little surprising, on closer inspection, the user will realize it’s legit

The last bullet point is really a combination/result of the first two.

My Failed Exploration: Single Point Moving Range (mR)

Because of criteria above, I’ve discarded what I thought was my most promising approach — using the single point moving range (mR). A light bulb went off last spring when I took an intermediate stats class, and, although the professor glossed over the moving range formulas, I thought it was going to be the answer that would allow me to solve my trendline quandary — it would look at the “change over previous period” and determine if that change was sufficiently large to warrant reporting a measurable trend. After noodling with it quite a bit… I don’t think that it works for the purposes of trend indicators. For chuckles, a moving range chart for the example in this post looks like the following:

Moving Range

If you want to read more about moving ranges, the best explanation I found was on the Quality Magazine web site. I’ll just stop there, though. We’ve already lost on the “reasonably intuitive” front, and I haven’t even calculated the control limits yet!

And Another Failed Exploration: the Moving Average

There’s also the “moving average” approach, which smooths things out quite a bit:

Moving Average

I always feel like the moving average is some sort of narcotic applied to the data — it makes things fuzzy by having a single data point factored into multiple points represented on the chart. But, I’ll grudgingly admit that it does have its merits in some cases.

My Approach to Trending (At Last!!!)

There are two key elements to my trending approach, and neither is particularly earth-shattering:

  1. Break the data into smaller components than the reporting cycle
  2. Trend only over recent data, rather than over the entire reported timeframe

Going back to the original example here, let’s say that I update a dashboard once a month, and that the dashboard primarily looks at data for the prior 3 months. In that case, the 12 data points each represent (roughly) one week. IF I simply reported the data on a monthly basis, then the chart would look like this:

Trending Example

That shows a clear upward trend, regardless of whether I look at the last month or the last two months of data. It would be hard not to put an upward trend indicator on this plot. But, we’re relying on all of three data points, and we’re going back three full reporting periods to draw that conclusion. Both of these are a bit concerning. Invariably, we’d want to go back farther in time to get more data points to see if this trend was real…and then we’re falling into the aircraft carrier dilemma.

Instead, though, I can keep the granularity of the reporting at a week, but only trend over the last four periods:

Trendline Proposed Approach

I don’t actually plot the trendline shown in the chart above. Rather, I calculate the formula for the line using the SLOPE and INTERCEPT  Excel functions. I then calculate the value of the 4-weeks-ago endpoint of the line and the most-recent-week endpoint of the line and look at the percentage change from one to the other. I actually set some named cells in my workbook to specify how many periods I report over (so I can vary from 4 to 6 or something else universally) as well as what the different thresholds are for a strong up, weak up, no change, weak down, or strong down trend.

In the example in this post, the change is a 16% drop, which usually would garner a “strong down” trend — very different from all the upward trends in the early examples! And, even somewhat counter-intuitive, as the most recent change was actually an “up.” If the entire range has been trending upward, as shown by the 3-point plot as well as by a close inspection of the raw basic data (think of it as a sparkline), then you already have that information available as the longer term trend, but, of late, the trend seems to be somewhat downward.

A Note of Caution

This post has gone through what works for me as a general rule. As I read back over it, I realize I’m setting myself up for a case of, “Yeah, you CAN make the data say whatever you want.”

I’m less concerned about prescribing a universally-effective approach to trend calculation as I am about putting out a cautionary tone on the various “obvious” ways to calculate a trend. The sniff test is important — does the trend work for your specific situation when you actually apply it? Or, have you adopted a simplistic, formulaic approach that can actually provide a very clear misrepresentation of the data?

And…a Nod to Efficiency and Automation

The prospect of introducing SLOPE and INTERCEPT functions may seem a little intimidating from a maintenance and updating perspective, but it really doesn’t need to be. By using built-in Excel functionality, these can be set up once and then dynamically updated as new data comes in. I like to build spreadsheets with a data selector so that the dashboard is a poor man’s BI tool that allows exploring how the data has changed over time. The key is to use some of Excel’s most powerful, yet under-adopted, features:

  • Conditional formatting — especially in Excel 2007 where conditional formatting can make use of customized icon sets
  • Named cells and named ranges — these are handy for establishing constants used throughout the workbook (thresholds, for instance) that you may want to adjust
  • Data validation — using a cell as your “date range selector” that references a named range of the column that lists the dates for which you record the data
  • VLOOKUP — because you used data validation, you can then use VLOOKUP to find the current data based on what is selected by the user
  • Dynamic charts — these actually aren’t a “feature” of Excel so much as the clever combination of several different features; Jon Peltier has an excellent write-up of how to do this

If set up properly, a little investment up front can make for an easily updated report delivery tool…with meaningful trend indicators!