How Google Analytics In-Page Analytics / Overlay Works
I’m starting to think that page overlays are the new page-level clickstream — they’re what well-meaning-but-inexperienced business users see in their minds’ eyes as a quick and clear path to deep insights when, generally, they are not. I’ve had a couple of clients over the last year ask for overlays (in one case, “provided weekly for all major pages of the microsite”), and the overlays were never an effective mechanism for helping them drive their businesses forward. (One request was for overlays from Sitecatalyst; the other was for overlays from Google Analytics.)
I seldom use overlays for reporting or analysis. The reason isn’t that they lack real usefulness in certain situations; it’s that those situations are extremely rare in my day-to-day work. As the “page” paradigm (in its basic-HTML simplistic glory) goes the way of daytime soap operas, and as brands’ digital presences increasingly are intertwined combinations of their sites and social media platforms, the scenarios where an overlay provides a view of the page that is both reasonably complete and actually useful are few and far between.
That’s a bit broader of a topic than I was aiming to cover with this post, though.
I recently needed to explain to a client why it wasn’t simply a matter of “fixing” the Google Analytics implementation on his site to get the overlays to work properly. I did some digging for documentation that explained the underlying mechanics of GA’s in-page overlays (similar to what Ben Gaines wrote about Sitecatalyst ClickMap a couple of years ago when he was still at Omniture), and…I couldn’t find what I was looking for. This post is trying to be that documentation for the next person who is in the same situation. If you have deeper knowledge of the underlying mechanics of Google Analytics than I have, and I’ve misrepresented something here, please leave a comment to let me know!
Google Analytics <> Sitecatalyst <> ClickTale
There are different ways to capture/present clickmap and heatmap overlays. In order of increasing robustness/usefulness (I’m leaving out a number of vendors because I simply don’t have current knowledge of their specifics):
- Google Analytics, at its core, uses some basic reverse-engineering of page view data to generate its in-page analytics (overlays). It looks nice in their video…but the video uses a very basic site, which doesn’t reflect the reality of most sites for medium-sized and large companies
- Adobe Sitecatalyst gets a bit more sophisticated with its approach, which automatically closes some of the gaps in the GA approach while also allowing for working around a chunk of the challenges that are inherent with overlays; see Ben’s post that I referenced earlier if you want to really get into the details there!
- ClickTale is a solution that was developed from the ground up to provide workable overlays and heatmaps. As such, it takes an even more robust approach, capturing both mouse movements and clicks. The “downside” (in quotes because this is a limitation in theory, not in practice) is that ClickTale does not track all sessions. It samples sessions, which still yields plenty of highly usable data, but business users inevitably get heartburn when they find out that they’re not capturing everything.
Make sense? The point is that there are different ways to skin the overlays cat. This post just covers Google Analytics.
How Google Analytics Figures Out Overlays
For each user session, Google Analytics gets a “hit” for each page viewed during the session, and it records a timestamp for each page view, so it knows the sequence in which pages were viewed in the session. Consider a simple, 3-page site, where the main page (Page_A) has links to the other two pages (Page_B and Page_C).
Now, let’s have three visitors come to the site (Visitor 1111, Visitor 2222, and Visitor 3333). All three enter the site on Page_A, but then:
- Visitor 1111 clicks on the link to Page_B and then exits the site
- Visitor 2222 clicks on the link to Page_C and then exits the site
- Visitor 3333 clicks on the link to Page_B and then exits the site
Google Analytics would have captured a series of page views that looked something like this:
Visitor ID | Timestamp | Page Viewed |
Visitor 1111 | 09:03:16 | Page_A |
Visitor 1111 | 09:03:24 | Page_B |
Visitor 2222 | 09:04:12 | Page_A |
Visitor 2222 | 09:04:53 | Page_C |
Visitor 3333 | 09:10:22 | Page_A |
Visitor 3333 | 09:10:54 | Page_B |
With a little sorting and counting and cross-referencing, Google Analytics can figure out that:
- There were 3 visits to Page_A
- The “next page” that two of those visitors went to from Page_A was Page_B
- The “next page” that one of those visitors went to from Page_A was Page_C
That’s how Google Analytics generates the Next Page Path area of the Navigation Summary report for a page (and the same basic technique is how the Previous Page Path is generated).
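To make the sorting and counting concrete, here is a minimal sketch in Python that reproduces the tally from the three-visitor example. It is only an illustration of the idea; the `hits` list and the `next_page_counts` function are names I made up, and I am not claiming this is how Google actually implements it.

```python
from collections import Counter, defaultdict

# Hypothetical hit log mirroring the table above: (visitor_id, timestamp, page).
hits = [
    ("1111", "09:03:16", "Page_A"),
    ("1111", "09:03:24", "Page_B"),
    ("2222", "09:04:12", "Page_A"),
    ("2222", "09:04:53", "Page_C"),
    ("3333", "09:10:22", "Page_A"),
    ("3333", "09:10:54", "Page_B"),
]


def next_page_counts(hits):
    """Count, for each page, which page was viewed next in the same session."""
    # Assumption: one session per visitor, as in the example.
    sessions = defaultdict(list)
    for visitor, timestamp, page in hits:
        sessions[visitor].append((timestamp, page))

    counts = defaultdict(Counter)
    for views in sessions.values():
        views.sort()  # zero-padded HH:MM:SS strings sort chronologically
        pages = [page for _, page in views]
        # Pair each page view with the one that followed it.
        for current, following in zip(pages, pages[1:]):
            counts[current][following] += 1
    return counts


next_pages = next_page_counts(hits)
print(dict(next_pages["Page_A"]))  # {'Page_B': 2, 'Page_C': 1}
```

Walking the pairs in the other direction (each page and the one before it) would give the Previous Page Path counts.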
Make sense? Good. So, how does this become in-page analytics? In-page analytics, really, is just a visualization of the Next Page Path data. To do that:
- Google Analytics pulls up the current version of the page at the URL being analyzed with in-page analytics
- It compiles a list of all of the “next pages” that were visited (with the number of “next page” page views for each one)
- It scans the page for the URLs of those “next pages” and then labels each link that references one of those pages with the number of pageviews (and the % of total “next page” page views that the value represents)
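The three steps above can be sketched in a few lines of Python. This is a rough illustration under my own assumptions: the markup for Page_A is invented, the `LinkCollector` class and `overlay_labels` function are hypothetical names, and matching links by exact `href` string is a simplification rather than a description of GA's real matching logic.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def overlay_labels(page_html, next_page_views):
    """Return (href, pageviews, % of all next-page views) for each matching link."""
    collector = LinkCollector()
    collector.feed(page_html)
    total = sum(next_page_views.values())
    labels = []
    for href in collector.hrefs:
        # Only links whose target shows up as a "next page" get a label.
        if href in next_page_views:
            views = next_page_views[href]
            labels.append((href, views, round(100 * views / total, 1)))
    return labels


# Hypothetical markup for Page_A, plus the counts from the three-visitor example.
page_a_html = '<p><a href="Page_B">Page B</a> <a href="Page_C">Page C</a></p>'
labels = overlay_labels(page_a_html, {"Page_B": 2, "Page_C": 1})
for label in labels:
    print(label)
# ('Page_B', 2, 66.7)
# ('Page_C', 1, 33.3)
```

Note that the labels are keyed on the link's target, not on the link itself, which is the root of several of the problems described in the next section.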
Pretty simple, and pretty solid…except when various common situations occur, which we’ll get to next.
Oh, the Many Ways that In-Page Analytics Breaks Down
In-page analytics is problematic when any of the following situations occur on a page:
- A link has a target URL that is not part of the current site (e.g., a link to the brand’s Facebook page or YouTube channel): Google Analytics doesn’t capture the “next page” viewed, so it can’t deduce how many times the link was clicked (Note: a best practice, obviously, is to have event tracking or social tracking implemented in these situations, so Google Analytics can report on how many times the link was clicked…but this doesn’t work its way back into in-page analytics overlays)
- A link points to a PDF or file download: this is similar to the previous scenario, in that the “next page” doesn’t execute the Google Analytics page tag. Even if a virtual page view is captured on the click, that is, technically, different from the actual target URL in the <a href="…"> that points to the file, so Google Analytics doesn’t make the connection needed to render this on the overlay. In other words, the virtual page view will show up on the Navigation Summary in the Next Page Path list, but it won’t show up on the overlay.
- Multiple links on the page point to the identical next page: because GA uses the URL of the “next pages,” it doesn’t inherently capture which link pointing to the specific next page is the one that was clicked. The standard workaround for this is to force the URLs to be unique by tacking on a junk parameter to the end of the second URL (e.g., have one link point to “Page_B.htm” and the second link point to “Page_B.htm?link=2”). This will make the target URLs unique in GA’s view…but will also make base reporting for Page_B a bit trickier, as there will be two different rows in the Pages report for the same page (if your <title> tags are well-formed, you can work around this by using the Page Titles dimension in the Pages report)
- Links are embedded in “hidden” content, such as JavaScript menu dropdowns: this is simply a limitation of the overlay paradigm, in that it is often impossible to make all of the links on a page visible at once. With in-page analytics, as you mouse over areas that make the links appear, the in-page analytics data will appear as well, but you still have to move all around the page to reveal every link and see all of the “next page” data
- Links are embedded in Flash: in-page analytics simply can’t effectively add clicks to links that are embedded in Flash objects
- Links appear to reference the same page: some implementations of DHTML that trigger overlays or other interactive in-page content wind up including something like “<a href="#"…”, which looks to Google like a link back to the current page. This confuses GA mightily!
- The link is removed from the page: say you run a promo for a week and then take the hyperlinked image off of the page. When you pull up in-page analytics for that week, GA will know that there were a lot of “next page” views to the target for that promo…but it only has the current page for use in generating an overlay, so it won’t know where to overlay the page views for that promo
- The links on the page aren’t spaced far enough apart: this is a practical reality, in that I have never seen an overlay where there aren’t some overlay details that obscure the details for other links that are located in close proximity. Obviously, you’re not going to design your site to be overlay-friendly…so you just have to accept this limitation.
The kicker is that these are not obscure, corner-case scenarios. They’re common occurrences, and they lead to most overlays presenting an incomplete picture of activity that occurs on the page.
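If you want a quick feel for how many of these gaps a given page has, a rough check along the following lines can help. It is a sketch under loud assumptions: `overlay_gaps` is a hypothetical helper of my own, the link list and page view counts are made up, and it compares URLs as plain strings, which is far cruder than whatever GA does internally. It only covers four of the cases above (off-site links, “#” links, duplicate targets, and “next pages” with no link on the current page).

```python
from collections import Counter
from urllib.parse import urlparse


def overlay_gaps(page_hrefs, next_page_views, site_host):
    """Sort a page's links and next-page data into some of the problem cases above."""
    gaps = {"offsite": [], "self_link": [], "duplicate": [], "no_link_on_page": []}
    for href, count in Counter(page_hrefs).items():
        host = urlparse(href).netloc
        if host and host != site_host:
            gaps["offsite"].append(href)    # the next page never runs the site's tag
        elif href.startswith("#"):
            gaps["self_link"].append(href)  # looks like a link back to this page
        elif count > 1:
            gaps["duplicate"].append(href)  # clicks can't be split between the links
    on_page = set(page_hrefs)
    for page in next_page_views:
        if page not in on_page:
            # A removed link, or a virtual page view that matches no href.
            gaps["no_link_on_page"].append(page)
    return gaps


# Made-up links currently on a page, and made-up "next page" counts for it.
hrefs = ["/page_b", "/page_b", "#", "https://www.facebook.com/brand", "/page_c"]
views = {"/page_b": 40, "/page_c": 10, "/promo": 25}
gaps = overlay_gaps(hrefs, views, "www.example.com")
for problem, items in gaps.items():
    print(problem, items)
```

In this made-up case, only the link to /page_c would get a clean, unambiguous overlay label; everything else falls into one of the buckets.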
A Handful of Additional Thoughts
In-Page Analytics is seldom useful. To the best of my knowledge, this is neither an area in which Google is investing to make improvements, nor an area in which seasoned web analysts are really clamoring for updates.
However, overlays have their place, I think. But, they need to be done right, which is something on which ClickTale is focused (Michele Hinojosa wrote a good overview of the platform last year if you want to read another analyst’s perspective).
Related to overlays, although not strictly overlay-ish, is a feature of Satellite by Search Discovery, whereby you can very easily enable tracking of all clicks on unlinked content (how many times have you been on a site where you think clicking on a product image will take you to the product’s page…and it doesn’t take you anywhere at all!). I think this is ClickTale-ish functionality, but that may be something of a stretch. It’s a nifty concept, though.
So, that’s it on GA’s In-Page Analytics. Understand what it does and how it does it, and you will be able to identify the (extremely rare) situations when it will be useful.