Webtrends Table Limits — Simply Explained
A co-worker ran into a classic Webtrends speed bump a couple of weeks ago. A new area of a client’s web site had rolled out a few days earlier…and Webtrends wasn’t showing any data for it on the Pages report. More perplexingly, the Content Groups report did show traffic to the new content group that had been created along with the launch. What was going on? I happened to walk by, and, although I haven’t done heavy Webtrends work in a few years, the miracle of cranial synapses meant that the issue jumped out pretty quickly (I can’t figure out how to say that without sounding egotistical; oh, well, it is what it is).
Heavy Webtrends users will recognize this as a classic symptom of “table limits reached.” There’s quite a bit written on the subject online…if you know where to look. The best post I found was “You need to read this post about Table Limits” by Rocky of the Webtrends Outsiders. The last sentence (well, sentence fragment, really) in the post is, “End of rant.” In other words, the post starts AND finishes strong, and the content in between is damn good, too.
What I found, though, was that it took a couple of conversations and a couple of whiteboard rounds to explain what was going on under the hood in a way that made sense to my colleague. That’s not a knock against him. Rather, it’s one of those things that makes perfect sense…once it makes sense. It’s like reading an analog clock or riding a bicycle (or, presumably, riding a RipStik…I wouldn’t know!). So, I decided I’d take a crack at laying out a simplistic example in semi-graphical form as a supplement to the post above.
The Webtrends Report-Table Paradigm
First, it’s important to understand that every report in Webtrends has two tables associated with it:
- Report Table — this is the table of data that gets displayed when you view a report
- Analysis Table — the analysis table is identical in structure to the report table, but it has more rows, and it’s where the data really gets stored as it comes into the system
Webtrends aggregates data, meaning that it doesn’t store raw visitor-level, click-by-click data and then try to mine through a massive data volume any time someone runs a simple report. Rather, it simply increments counters in the analysis tables. That makes sense from a performance perspective, but can easily lead to a “hit the limits” issue.
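To make the “increment counters” idea concrete, here is a minimal sketch in Python. It is not Webtrends code; the names `raw_hits` and `counts_by_page` are made up for illustration. It only contrasts keeping every hit with keeping one counter per page.

```python
from collections import Counter

# Hypothetical stream of page hits (illustrative data only).
raw_hits = ["Page A", "Page B", "Page A", "Page C", "Page A", "Page B"]

# Aggregated storage: one row per distinct page, incremented as hits arrive.
counts_by_page = Counter()
for page in raw_hits:
    counts_by_page[page] += 1

# The raw log grows with every hit; the counter table grows only
# when a page is seen for the first time.
print(len(raw_hits), "raw hits")
print(len(counts_by_page), "rows in the aggregated table")
print(dict(counts_by_page))
```

Note that this counter grows a new row for every new page it sees, with no upper bound.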
Key: neither of these tables simply expands (adds rows) as needed. Both have their maximum row count configured in the admin console. Those limits can be adjusted…but that comes at a storage and a processing load price.
(Now, actually, there are multiple analysis tables for any single report — copies of the underlying table structure populated with data for a specific day, week, or month…but it’s beyond the scope of this post to go into detail there. Just tuck it away as another wrinkle to learn.)
In the rest of this post, I’m going to walk through an overly simplistic scenario of a series of visits to a fictitious site with unrealistically low table limits to illustrate what happens.
The Scenario
Let’s say we have a web site with a series of pages that we’ll call Page A, Page B, Page C,…Page Z. And, let’s say we have our Report Table limit for the Pages report set to “4” (in practice, it’s probably more like 5,000) and our Analysis Table limit set to “8” (in practice, it would be more like 20,000). That gives us a couple of empty tables that look something like this:
Now, we’re going to walk through a series of visits to the site and look at what gets put into the tables.
Visit 1
The first visitor to our site visits three pages in the following order: Page A –> Page B –> Page C –> <Exit>.
The analysis table gets its first three rows loaded up in the order that the pages were visited, and each page gets a Visits value of 1. If we looked at the Pages report at that point, the Report Table would pull those top 3 values, and everything would look fine:
Visit 2
The next visitor comes to the site and visits 5 pages in the following order: Page B –> Page C –> Page D –> Page E –> Page F –> <Exit>
We’ve now had more unique pages visited than can be displayed in the report (because the Report Table limit is set to 4). But, that’s okay. After two visits to the site, our Analysis Table holds six pages (A through F), so it still has two rows to spare, and the Report Table can pull the top 4 pages from the Analysis Table and do a quick sort to display correctly, using the All Others row to lump in everything that didn’t make the top 4:
If you searched or queried for “Page F” at this point, you wouldn’t see it. It’s there in the Analysis Table, but you’re searching/querying off of the Report Table. That doesn’t mean Page F is lost, though. It just means it has less traffic than (or is tied with) the last item that fit in the Report Table.
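Here is a rough sketch of how a Report Table could be derived from the Analysis Table after Visit 2. The function name `build_report` is hypothetical, and the tie-breaking rule (pages with equal visits keep the order in which they were first seen) is my assumption; I don’t know how Webtrends actually breaks ties.

```python
def build_report(analysis_counts, report_limit):
    """Return (top_rows, all_others) for a report with at most report_limit rows."""
    # sorted() is stable, so tied pages keep their first-seen order (an assumption).
    ranked = sorted(analysis_counts.items(), key=lambda row: row[1], reverse=True)
    top_rows = ranked[:report_limit]
    # Everything that did not make the cut is lumped into one All Others total.
    all_others = sum(count for _, count in ranked[report_limit:])
    return top_rows, all_others


# Analysis Table contents after Visit 1 (A, B, C) and Visit 2 (B, C, D, E, F).
after_visit_2 = {
    "Page A": 1, "Page B": 2, "Page C": 2,
    "Page D": 1, "Page E": 1, "Page F": 1,
}

top_rows, all_others = build_report(after_visit_2, report_limit=4)
for page, visits in top_rows:
    print(page, visits)
print("All Others", all_others)

# Page F is counted in the Analysis Table but is not a named row in the report.
print("Page F in report:", "Page F" in dict(top_rows))
```

With these assumptions, Page E and Page F end up in All Others, even though both still have their own rows in the Analysis Table.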
Visit 3
Sequence of pages: Page F –> Page G –> Page H –> Page B –> <Exit>
Following the same steps above and incrementing the values in our Analysis Table, and again looking at a report for the entire period, we see (bolded numbers in the Analysis Table are the ones that got created or incremented with this visit):
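Look! Page F is now showing up in the Report Table! Can you see why? Its second visit moved it up the rankings, and because the Analysis Table has a higher row limit than the Report Table, it had been counting Page F all along. The Report Table can simply re-sort and pick the top-visited pages.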
Visit 4
Sequence of pages: Page F –> Page I –> Page J –> Page B –> <Exit>
Here’s where we really start to lose page-level granularity. Our Analysis Table is full, so there are no rows available to store Page I and Page J. Instead, this visit adds 2 visits to the All Others row in the Analysis Table (it’s a single visit, but this is the Pages report, and each of those two pages received a visit). Our tables now look like this:
Until the Analysis Table gets reset, no page other than Pages A through H can appear by name in a report. Every other page gets lumped into All Others.
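The four visits can be replayed with a small Python sketch of a capped Analysis Table. This is a toy model of the behavior described in this post, not Webtrends internals; `replay` and `pages` are hypothetical helpers, and I am assuming a page counts at most once per visit.

```python
def pages(letters):
    """Expand 'ABC' into ['Page A', 'Page B', 'Page C']."""
    return [f"Page {letter}" for letter in letters]


def replay(visits, analysis_limit):
    """Replay visits into a capped table; return (rows, overflow)."""
    rows = {}      # page -> visits, in first-seen order
    overflow = 0   # the Analysis Table's own All Others counter
    for visit in visits:
        # dict.fromkeys() drops repeat views within a visit and keeps order.
        for page in dict.fromkeys(visit):
            if page in rows:
                rows[page] += 1          # existing row: just increment
            elif len(rows) < analysis_limit:
                rows[page] = 1           # room left: add a new row
            else:
                overflow += 1            # table full: lump into All Others
    return rows, overflow


visits = [pages("ABC"), pages("BCDEF"), pages("FGHB"), pages("FIJB")]
rows, overflow = replay(visits, analysis_limit=8)

for page, count in rows.items():
    print(page, count)
print("All Others", overflow)
```

Pages I and J never get rows of their own; their two visits are only visible as the All Others count.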
Even if Page I Becomes the Most Popular Page on My Site?
It’s time for a direct quote from the Webtrends Outsider post referenced at the beginning of this post:
Ugly example #1: Your end users contact you wanting to know about traffic to their expensive new microsite. You know you’ve been collecting the data correctly because you triple-checked the tagging before and after launch. So you open the Pages report and WebTrends tells you those pages don’t exist. Those expensive pages got no traffic at all, apparently. Knowing how the CEO’s been obsessed with the new microsite, you call in sick indefinitely.
It doesn’t matter if Page I becomes the only page on your site. Until the tables reset, you won’t see the page in your Pages report — it will continue to be lumped into All Others.
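A quick sketch shows why popularity doesn’t help. The helper `count_pages` is hypothetical, the traffic numbers are invented, and I am assuming that a “reset” simply means counting starts again from an empty table.

```python
def count_pages(page_stream, analysis_limit):
    """Count pages into a table capped at analysis_limit rows."""
    rows, all_others = {}, 0
    for page in page_stream:
        if page in rows:
            rows[page] += 1
        elif len(rows) < analysis_limit:
            rows[page] = 1
        else:
            all_others += 1  # no free row: the page name is not recorded
    return rows, all_others


# Before the reset: Pages A..H fill all 8 rows, then Page I gets 1,000 views.
first_eight = [f"Page {letter}" for letter in "ABCDEFGH"]
before_rows, before_others = count_pages(first_eight + ["Page I"] * 1000, 8)
print("Page I has a row before reset:", "Page I" in before_rows)
print("All Others before reset:", before_others)

# After the reset (assumed: an empty table), Page I gets a row right away.
after_rows, after_others = count_pages(["Page I"] * 5 + ["Page B"], 8)
print("After reset:", after_rows, "All Others:", after_others)
```

In this toy model, rows are handed out on a first-come, first-served basis, so traffic volume never earns a page a row once the table is full.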
And That Is Why…
If you started out on Google Analytics and then switched over to Webtrends you might have noticed something odd about the URLs being captured (I learned it going in the opposite direction): in Google Analytics, the full URLs for each page, including any query string parameters (campaign tracking parameters excluded) are reported by default. In Webtrends, query string parameters are dropped by default. In the case of Google Analytics, you can configure a profile to drop specific parameters, while, in Webtrends, you can configure the tool to include specific parameters.
Why does Webtrends exclude all parameters by default? Table limits are one of the reasons. If, for instance, your site search functionality passes the terms searched for and other details to the search engine using query parameters, the Analysis Table for the Pages report would fill up very quickly…with long-tail searches that each received only one or a small handful of requests.
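To illustrate the effect, here is a sketch of “drop every parameter unless it is explicitly kept.” This is plain Python, not Webtrends configuration syntax; `page_key`, `keep_params`, and the example.com URLs are all made up for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_key(url, keep_params=()):
    """Return the URL with every query parameter dropped except keep_params."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in keep_params
    ]
    # Rebuild the URL without the fragment and with only the kept parameters.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


urls = [
    "https://www.example.com/search?q=red+shoes&page=1",
    "https://www.example.com/search?q=blue+hats&page=2",
    "https://www.example.com/product?id=42&ref=home",
]

# Default behavior in this sketch: all parameters dropped.
default_keys = {page_key(url) for url in urls}
# Opt in to a single parameter that really identifies a distinct page.
with_id = {page_key(url, keep_params={"id"}) for url in urls}

print(sorted(default_keys))
print(sorted(with_id))
```

Every distinct search collapses into a single `/search` row, while the product page keeps the one parameter that was opted in.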
What to Do?
The most important thing to do is to keep an eye on your table sizes and see which ones are getting close to hitting their limits. If they’re getting close, then consider adjusting your configuration to reduce the “fluff” values going in. If fluff isn’t the problem (the rows are filling up with values you actually care about), then you need to bump up your table limits. That may increase the time it takes for Webtrends to process your profiles, but it will keep you from unpleasant conversations with the business users you support!
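If you can get the current row counts and limits out of the admin console by some means, the check itself is simple. The sketch below assumes you have already collected them into a dictionary by hand; the numbers, the 80% threshold, and the name `tables_near_limit` are all hypothetical, and nothing here calls a Webtrends API.

```python
def tables_near_limit(usage, threshold_pct=80):
    """usage maps report name -> (rows_used, row_limit); return the fullest first."""
    flagged = []
    for report, (rows_used, row_limit) in usage.items():
        fill_pct = 100 * rows_used // row_limit  # whole-number percentage
        if fill_pct >= threshold_pct:
            flagged.append((report, fill_pct))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Illustrative numbers only, typed in by hand.
usage = {
    "Pages": (19400, 20000),
    "Content Groups": (350, 20000),
    "Hypothetical Report X": (4100, 5000),
}

flagged = tables_near_limit(usage)
for report, fill_pct in flagged:
    print(f"{report}: {fill_pct}% of row limit used")
```

Anything this flags is a candidate for either trimming what goes into the table or raising its limit.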