How Google and Adobe Identify Your Web Visitors
A few weeks ago I wrote about cookies and how they are used in web analytics. I also wrote about the browser feature called local storage, and why it’s unlikely to replace cookies as the primary way analytics tools identify visitors. Those two concepts set the stage for something that is likely to be far more interesting to the average analyst: how tools like Google Analytics and Adobe Analytics uniquely identify website visitors. So let’s take a look at each, starting with Google.
- __utma identifies a visitor and a visit. It has a 2-year expiration that will be updated on every request to GA.
- __utmb determines new sessions and visits. It has a 30-minute expiration (the same as the standard amount of time before a visit “times out” in GA) that will be updated on every request to GA.
- __utmz stores all GA traffic source information (i.e. how the visitor found your site). If you look closely at its value, you’ll be able to spot campaign query parameters or search engine referring domains, or at the very least the identifier of a “direct” visit. It has an expiration of 6 months that is updated on every request to GA.
- __utmv stores GA’s custom variable data (visitor-level only). It has an expiration of 2 years that is updated on every request to GA.
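To make the cookie list above a little more concrete, here is a minimal Python sketch that pulls apart a __utma and a __utmz value. It assumes the commonly documented Classic GA layouts: six dot-separated fields for __utma (domain hash, visitor ID, first/previous/current visit timestamps, session count), and four dot-separated fields followed by pipe-delimited key=value pairs for __utmz. The sample values and the function names are made up for illustration; check them against the cookies in your own browser before relying on this.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class UtmaCookie:
    domain_hash: str
    visitor_id: str
    first_visit: datetime
    previous_visit: datetime
    current_visit: datetime
    session_count: int


def _to_utc(seconds: str) -> datetime:
    """Convert a Unix timestamp (in seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def parse_utma(value: str) -> UtmaCookie:
    """Split a __utma value into its six assumed dot-separated fields."""
    parts = value.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields in __utma, got {len(parts)}")
    domain_hash, visitor_id, first, previous, current, count = parts
    return UtmaCookie(
        domain_hash=domain_hash,
        visitor_id=visitor_id,
        first_visit=_to_utc(first),
        previous_visit=_to_utc(previous),
        current_visit=_to_utc(current),
        session_count=int(count),
    )


def parse_utmz(value: str) -> dict:
    """Split a __utmz value into its numeric prefix and traffic source fields."""
    parts = value.split(".", 4)  # the fifth piece holds the campaign string
    if len(parts) != 5:
        raise ValueError("expected 4 dot-separated fields before the campaign data")
    domain_hash, timestamp, session_number, campaign_number, campaign = parts
    source = dict(pair.split("=", 1) for pair in campaign.split("|") if "=" in pair)
    return {
        "domain_hash": domain_hash,
        "set_at": _to_utc(timestamp),
        "session_number": int(session_number),
        "campaign_number": int(campaign_number),
        "source": source,
    }


# Made-up sample values, for illustration only.
utma = parse_utma("123456789.987654321.1388534400.1391212800.1393632000.3")
utmz = parse_utmz(
    "123456789.1393632000.3.2.utmcsr=google|utmccn=(organic)|utmcmd=organic"
)
print(utma.visitor_id, utma.session_count)
print(utmz["source"])
```

The point of the sketch is how much of the visitor’s history lives in the browser: visit timestamps, session counts and the traffic source are all readable straight from the cookie values.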
That was a mouthful – you might want to read through it again to make sure you didn’t miss anything! There are even a few cookies I didn’t list: GA sets them, but they don’t contribute to visitor identification. If that looks like a lot of data sitting in cookies, you’re exactly right – and it helps explain why Classic GA offers a much smaller set of reports than some of the other tools on the market. While I’m sure GA does a lot of work on the back-end, all those cookies storing traffic source and custom variable data place far more of the burden of keeping a visitor’s “profile” up-to-date on the browser than other analytics tools I’ve used do. Understanding how Classic GA used cookies is important to understanding just what an advancement Google’s Universal Analytics product really is.
One final note about GA’s cookies, and this applies to both Classic and Universal: GA provides code that passes cookie values from one domain to another. This code appends GA’s cookie values to the query string of the next page, for cases where your site spans multiple domains, allowing you to preserve your visitor identification across sites. I won’t get into the details of that code here, but it’s useful to know the feature exists.
Many of the new features introduced with Universal Analytics – including additional custom dimensions (formerly variables) and metrics, enhanced e-commerce tracking, attribution, etc. – are either dependent upon or made much easier by that simpler approach to cookies. And the ability to identify your own visitors with your own unique identifier – part of the new “Measurement Protocol” introduced with Universal Analytics – would have fallen somewhere between downright impossible and horribly painful with Classic GA.
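As a rough illustration of what identifying visitors with your own identifier can look like, here is a small Python sketch that builds (but does not send) a Measurement Protocol pageview payload. It assumes the publicly documented Universal Analytics parameters v, tid, cid, t, dp and uid, and the /collect endpoint; the function name, the tracking ID placeholder and the sample identifiers are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed Universal Analytics collection endpoint (not called in this sketch).
COLLECT_ENDPOINT = "https://www.google-analytics.com/collect"


def build_pageview_hit(
    tracking_id: str,
    client_id: str,
    page_path: str,
    user_id: Optional[str] = None,
) -> str:
    """Return a URL-encoded Measurement Protocol payload for one pageview."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. UA-XXXXX-Y
        "cid": client_id,    # the single visitor identifier
        "t": "pageview",     # hit type
        "dp": page_path,     # document path
    }
    if user_id is not None:
        params["uid"] = user_id  # optional identifier from your own systems
    return urlencode(params)


# Hypothetical values for illustration only.
payload = build_pageview_hit(
    tracking_id="UA-XXXXX-Y",
    client_id="35009a79-1a05-49d7-b876-2b884d0f825b",
    page_path="/home",
    user_id="crm-42",
)
print(COLLECT_ENDPOINT)
print(payload)
```

Notice that the whole visitor identity fits in one or two parameters. Everything else about the visitor is left for the server to work out, which is very different from the cookie-heavy Classic approach.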
This one change to visitor identification put GA on a much more level playing field with its competitors – one of whom we’re about to cover next.
Over the 8 years or so that I’ve been implementing Adobe Analytics (and its Omniture SiteCatalyst predecessor), Adobe’s best-practices approach to visitor identification has changed many times. We’ll look at 4 different iterations – but note that with each one, Adobe has always used a single ID to identify visitors, and then maintained visitor and visit information on its servers (like GA now does with Universal Analytics).
Third-party cookie (s_vi)
This cookie, called s_vi, has an expiration of 2 years, and is made up of 2 hexadecimal values, surrounded by [CS] and [CE]. On Adobe’s servers, these 2 values are converted to a more common base-10 value. But using hexadecimal keeps the values in the cookie smaller.
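Here is a minimal Python sketch of that hexadecimal-to-decimal conversion. It assumes the two hex values sit between [CS] and [CE], separated by a hyphen, optionally preceded by a short prefix ending in a pipe (such as "v1|"). Those layout details, the function name and the sample value are my assumptions for illustration, not Adobe’s actual server-side logic.

```python
import re
from typing import Tuple

# Assumed layout: [CS]<optional prefix>|<hex>-<hex>[CE]
S_VI_PATTERN = re.compile(
    r"\[CS\](?:.*?\|)?([0-9A-Fa-f]+)-([0-9A-Fa-f]+)\[CE\]"
)


def parse_s_vi(value: str) -> Tuple[int, int]:
    """Extract the two hexadecimal values from s_vi and return them in base 10."""
    match = S_VI_PATTERN.search(value)
    if match is None:
        raise ValueError("value does not look like an s_vi cookie")
    high_hex, low_hex = match.groups()
    return int(high_hex, 16), int(low_hex, 16)


# Made-up sample value, for illustration only.
high, low = parse_s_vi("[CS]v1|2A1B3C4D05012F0A-4000010A60001A2D[CE]")
print(high, low)

# The hex form is shorter than the decimal form it represents.
print(len("2A1B3C4D05012F0A"), len(str(high)))
```

The last line shows why hexadecimal is attractive here: a 16-character hex string expands to a longer decimal number, so storing hex keeps the cookie smaller.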
First-party cookie (s_vi)
You may remember from an earlier post that third-party cookies have a less-than-glowing reputation, and almost all the reasons for this are valid. Because third-party cookies are much more likely to be blocked, Adobe started offering customers the ability to create a first-party cookie instead several years ago. The cookie is still set on Adobe’s servers – but using this approach, you actually allow Adobe to manage a subdomain of your site (usually metrics.companyname.com) for you. All Adobe requests are sent to this subdomain, which looks like part of your site – but it actually still belongs to Adobe. It’s a little sneaky, but it gets the job done, and allows your Adobe tracking cookie to be first-party.
First-party cookie (s_fid)
In most cases, using the standard cookie (either first- or third-party) works just fine. But what if you’re using a third-party cookie and you find that a lot of your visitors have browser settings that reject it? Or what if you’re using a first-party cookie, but you have multiple websites on completely different domains? Do you have to set up subdomains for first-party cookies for every single one of them? What a hassle!
Adobe Marketing Cloud ID
The current iteration of Adobe’s visitor identification is a brand-new ID that allows for a single ID across Adobe’s entire suite of products (called the “Marketing Cloud”). That means if you use Adobe Analytics and Adobe Target, they can now both identify your visitors the exact same way. It must sound crazy that Adobe has owned both tools for over 6 years and that functionality is only now built right into the product – but it’s true!