# Creating Weighted Metrics Using the Percentile Function

**Adam Greco**on

**October 26, 2015**

When using Adobe Analytics, one of the things that has been a bit annoying historically is that when you sort by a calculated metric, you often see really high percentages for rows that have very little data. For example, if you create a click-through rate metric or a bounce rate metric and sort by it, you may see items of 100% float to the top, but when looking at the raw instances, the volume is so low that it is insignificant. Here is an example of a case where you may be capturing onsite searches by term (search criteria) and clicks on search results for the same term (as outlined in this post):

In this case, it is interesting to see that certain terms have a highly disproportionate number of clicks per search, but if each is searched only once, it isn’t that statistically relevant. To get around this, Adobe Analytics customers have had to export all data to Microsoft Excel, re-create the calculated metrics, delete all rows with fewer than *x* items (searches in this case) and then sorted by the click-through rate. What a pain! That is a lot of extra steps!

But now, thanks to the new Derived Metrics feature of Adobe Analytics, this is no longer required. It is now possible to use more complex functions and formulas to narrow down your data such that you can sort by a calculated metric and only see the cases where you have a higher volume of instances. In this post, I will demonstrate exactly how this is done.

## Using the Percentile Function

The key to sorting on a calculated metric is the use of the new PERCENTILE function in Adobe Analytics. This function allows you to choose a percentile within a list of values and use that in a formula. To illustrate this, I will continue the onsite search example from above. While the click-through rate formula used above is accurate, we want to create a report that only shows the click-through rate when search criteria have at least *x* number of searches. However, since the number of unique search criteria will vary greatly, we cannot simply pick a number (like 50 or more), because we don’t know how many searches will be performed for the chosen date range. For example, one of your users may choose one day, in which greater than 5o searches is unlikely, but another user may choose a full year, which will make greater than 50 a huge number of items with a very long tail. To deal with all scenarios, you can use the PERCENTILE function, which will look at all of the rows for the selected date range and allow you to calculate the *x*th percentile of that list. Hence, the threshold is relative to the date range chosen with respect to the number of instances that have to take place for it to show up in the calculated metric. Since this can be a bit confusing, let’s look at an example:

To start, you can build a new calculated metric that shows you what the PERCENTILE formula will return at a specific percentile. To do this, open the Calculated Metrics builder and make the following formula:

Since there may be a LOT of unique values for search criteria, I am starting off with a very high percentile (99.5%) to see how many searches it takes to be in the 99.5% percentile. This is done by selecting the core metric (Searches in this case) and then making the “k” value 99.5 (Note: You can also figure out the correct “k” value by using the PERCENTILE function in Microsoft Excel with the same data if you find that easier). Once you are done, save this formula and add it to your search criteria report so you see this:

This formula will have the same value for every row, but this is ok since we are only using it temporarily to figure out if 99.5% is the right value. In this case, what we see is that at a 99.5% percentile, anything with over 18 searches will show us the search click-through rate and anything below 18 searches will not. Now it is up to you to make your judgement call. Is 18 too high in this case? Too low? If you want to raise, it, simply raise the “k” value in the formula to 99.8 or something similar.

While doing this, keep in mind, that changing your date range (say choosing just one day), will change the results as well. The above report is for 30 days of data, but look what happens when we change this to just one day of data:

As you can see, the threshold changed from 18 to 6, but the number of overall searches also went down, so the 99.5% percentile seems to be doing its job!

Once you have determine what your ideal percentile “k” value is, it is now time to use this formula in the overall click-through rate formula. To do this, you need to create an IF statement and use a GREATER THAN function as well. The goal is to tell Adobe Analytics that you want it to show you the search click-through rate only in cases where the number of searches is greater than the 99.5% percentile. In other cases, you want to set the value to “0” so that when you sort in descending order, you don’t have crazy percentages showing up at the top, even though there are low values. Here is what the formula will look like:

While this may look a bit intimidating at first, if you look at its individual components, all it is really doing is calculating a click-through rate only when the number of searches is above our chosen threshold. Now you can add this metric to your report and see the results:

As you can see, this new calculated metric is no different from the existing one in cases where the number of searches is greater than the 99.5% threshold. But look what happens when we sort by this new Weighted Click-Through Rate metric:

Unlike the report shown at the beginning of this post, we don’t see super high percentages for items with low numbers of searches. All of the results are above the threshold, which makes this report much more actionable. If you want, you can verify this by paging down to the point where our new weighted metric is 0% (in this case when searches are under 18):

Here you can see that searches are less than 18 and that the previous click-through rate metric is still calculating, but our new metric has hard-coded these values to “0” for sorting purposes.

## Final Thoughts

As you can see, using the PERCENTILE function can be a real time-saver. While this example is related to onsite search, the same concept can be applied to any calculated metric you have in your Adobe Analytics implementation. In fact, Adobe has created a similar metric for Weighted Bounce Rate that is publicly available for all customers to use for marketing campaigns. So any time you want to sort by a calculated metric and not see rows with low numbers of data, consider using this technique.

Tim WilsonOctober 26th, 2015I love using calculated metrics this way. The challenge can be figuring out what the “right” cutoff is — it’s not something that can be dynamically adjusted (can’t be adjusted without updating the metric definition). I sometimes use Report Builder to allow the user to specify/adjust either an absolute count cutoff or a percentile cutoff. The downside of that approach is that you wind up pulling in a LOT of data (and sometimes can’t pull in *all* the data if it’s something like Search Criteria and exceeds 50,000 unique values). But, the upside is that it can be played around with a little on the fly and get the data updated immediately.

thirstydcDecember 11th, 2015The frustrating aspect of using weighted statistics, especially for something such as bounce rate or conversion rate, is that setting the value_if_false to 0 can muddle the results. For example, am I seeing values set to 0.00% because there actually is no conversion, or because the number of conversions fell outside the percentile?