Minimize Robot Traffic
Robots are cool. I like robots when they build cars, try to plug oil spills, and clean carpets. The only robots I don’t like are the ones that hit websites repeatedly and throw off my precious web analytics data! Do you have a problem with these types of robots? Would you even know how to check? I find that many web analytics customers don’t know how to see this, so in this post I will share what I do to monitor robots, and I hope that others out there will share the ways they deal with them.
Why Should I Care About Robots?
This is often the first question I get. Who cares? Here are my reasons for caring about minimizing robots hitting your site:
- If you use Visits or Unique Visitors as part of any of your website KPIs (e.g., Revenue/Unique Visitor), you should care because robots inflate your denominator and drag your conversion rates down
- If you are tasked with reducing Bounce Rates on your site, you should care because robot hits will often be counted as bounces
- Omniture (and other web analytics vendors) often bill you by website traffic (server calls), so you may be paying $$$ for junk data
- Oftentimes, web analytics KPIs have razor-thin differences month over month, and a lot of garbage data can mean the difference between making a good website business decision and a bad one
Do I Have a Problem?
The first step is to identify if you have a problem with robots. Unfortunately, SiteCatalyst does not currently have an “out-of-the-box” way to alert you if you have a problem (@VaBeachKevin has added this to the Idea Exchange so please vote!), but in the meantime, here is my step-by-step approach to determining this:
- Create a recurring DataWarehouse report that sends you Page Views and Visitors for each IP address hitting your site (if you store the Omniture Visitor ID in an sProp, I would use that in place of IP address). This can be daily, weekly, or monthly, depending on how much traffic your website receives. I sometimes add the Country/City as well (you’ll see why later).
- When you receive this report, it should look something like this:
- Once you have the data, create a calculated column that divides Page Views by Visitors and sort by that column (if you have a lot of data from different days/weeks, you can create a pivot table). The result should look like the report below, where you will start to see which IP addresses are viewing a lot of pages on your site per visitor. Keep in mind that this doesn’t mean they are all bad; it is common for small companies or individuals to share IP addresses. The goal of this step is just to identify the IP addresses that might be issues. In the example below, you can see that the top two IP addresses look a bit different from the rest. While it may make you feel good that these unique visitors liked your website so much they viewed thousands of pages each, you might be fooling yourself!
- Once you have this list, I like to do some research on the top IP address offenders. You can do this via a basic Whois IP lookup, or you can invest in a reverse IP lookup service.
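The ratio step above can also be scripted instead of done in a spreadsheet. Here is a minimal sketch, assuming your DataWarehouse export is a CSV with `IP Address`, `Page Views`, and `Visitors` columns (the column names and sample IPs below are hypothetical; adjust to your actual export):

```python
import csv
import io

# Hypothetical DataWarehouse export (replace with your real CSV file)
sample_export = """IP Address,Page Views,Visitors
66.249.66.1,54321,3
10.0.0.5,12,4
192.0.2.44,98765,2
203.0.113.9,30,10
"""

def pages_per_visitor(csv_text, top_n=10):
    """Compute Page Views / Visitors for each IP and sort descending."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        visitors = int(row["Visitors"])
        ratio = int(row["Page Views"]) / visitors if visitors else 0.0
        rows.append((row["IP Address"], ratio))
    # The highest ratios are the IPs worth researching further
    return sorted(rows, key=lambda r: r[1], reverse=True)[:top_n]

for ip, ratio in pages_per_visitor(sample_export):
    print(f"{ip}\t{ratio:,.1f}")
```

The two IPs with thousands of pages per visitor float to the top, which is exactly the short list you would then run through a Whois lookup.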
What Do I Do If I Find Robots?
If after reviewing the top offending IP addresses you find that you do, in fact, have a robot hitting your site, you have a few options:
- Work with your IT group to exclude these IP addresses from hitting your website. This is your best option since it will be the most reliable and reduce your web analytics server call cost.
- Work with Omniture’s Engineering Services team to create a DB Vista Rule that will move these website hits to a new report suite so they will not pollute your data. The best part of this option is that you don’t have to engage with your IT team, and you can add/remove IP addresses anytime you want via FTP. Unfortunately, you will still be hit with server call charges (not to mention the cost of the DB Vista Rule!), but if you also pass data to Omniture Discover, you might save money there by not passing bad data along.
- Work with Omniture’s Engineering Services team to build a custom solution for dealing with robots…
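To illustrate the DB Vista routing idea from the second option, here is a small sketch of the concept: keep a maintained block list of offending IPs and divert their hits to a separate report suite so the main suite stays clean. This is only an illustration of the concept (the IPs and suite names are made up); the real rule runs server-side at Omniture.

```python
# Sketch of the DB Vista routing concept: hits from known robot IPs are
# diverted to a separate "robots" report suite so the main suite stays clean.
ROBOT_IPS = {"192.0.2.44", "66.249.66.1"}  # maintained list, e.g. updated via FTP

def route_hit(hit, robot_ips=ROBOT_IPS):
    """Return the report suite a hit should land in based on its IP."""
    return "robots_suite" if hit["ip"] in robot_ips else "main_suite"

hits = [
    {"ip": "203.0.113.9", "page": "/home"},
    {"ip": "192.0.2.44", "page": "/products"},
]
for hit in hits:
    print(hit["ip"], "->", route_hit(hit))
```

The nice property this mirrors is that nothing is thrown away: the robot hits still exist in their own suite if you ever want to inspect them, but they no longer drag down the main suite’s KPIs.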
While I don’t want to imply that your co-workers are robots, I wanted to mention employee traffic in this post as well since it is tangentially related. I find that many Omniture customers don’t exclude their own employees from their web analytics reports. This can be a huge mistake if you have a lot of employees or have employees who actively use the website. For example, at my employer (Salesforce.com), we use our website to log into our internal systems, which are all run on Salesforce.com! This means that we have thousands of employees hitting our website every day to log in to our “cloud” applications, and that traffic should not count towards our marketing/website goals. Therefore, we manually exclude all employee traffic from our reports by IP address to minimize its impact on our KPIs. While we don’t consider this to be robot traffic, we address it in the same manner by passing employee traffic to its own report suite. One cool by-product of placing employee traffic in its own report suite is that you can see how often your own employees use your website, so you can show management that the dollars they give you serve multiple audiences!
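In practice, excluding employees by IP address usually means checking against whole office or VPN CIDR ranges rather than individual addresses. A minimal sketch using Python’s standard `ipaddress` module, with hypothetical corporate ranges:

```python
from ipaddress import ip_address, ip_network

# Hypothetical corporate ranges -- replace with your real office/VPN CIDRs
EMPLOYEE_NETWORKS = [
    ip_network("198.51.100.0/24"),
    ip_network("203.0.113.0/25"),
]

def is_employee(ip):
    """True if the visitor IP falls inside any corporate range."""
    addr = ip_address(ip)
    return any(addr in net for net in EMPLOYEE_NETWORKS)

print(is_employee("198.51.100.42"))  # inside the /24 -> True
print(is_employee("192.0.2.1"))      # outside all ranges -> False
```

A range-based check like this is much easier to maintain than a flat list of addresses, especially as employees come and go behind shared office gateways.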
As I stated in the beginning of this post, this is just one way to investigate and deal with robots. If you have other techniques, please share them here! Thanks!