Twitter Influence — Still Searching for the Perfect Answer
[Updated on 2/17/2011 — added the last section with additional information about Twitalyzer’s Community measurement and a little additional nod to TweetReach.]
A pretty intriguing post from Michael Healy came across the Twitterverse yesterday: #Measuring in 2010 — Analyzing the #measure Data of the Twitterati. What Michael did was take all of the Tweets that used the #measure hashtag in 2010 and run them through an “influence” formula he developed. The tweets data was courtesy of Kevin Hillstrom, who had set up a Twapper Keeper archive of #measure tweets. I’ve set up a couple of Twapper Keeper archives (hopefully, the #emetrics one will continue to function through the upcoming eMetrics conference, as I smell another juicy data set for us to play around with) in my day and have been a bit frustrated with the quality of the exports — they required quite a bit of cleanup, especially of the timestamps — and, I’ve been a little skeptical of the completeness of the data. But, maybe that’s just because I’ve been working with TweetReach of late, and it’s just so darn clean and robust that the free services really do start to pale in comparison.
Michael’s stated goal was pretty simple:
I wanted to know who were the most influential members of the #measure Twitterverse.
This was an exercise he did as prep work for the Web Analytics Association Spring Awards Gala, which, if you’re going to be in the area, you should plan to attend, as it should be a really good time.
I read the post and had three immediate thoughts:
- “Influence” is one of the Mid-Major Holy Grails of social media management
- “#measure” could be replaced by any brand or topic, and the ability to achieve Michael’s stated goal would come in damn handy in all sorts of situations
- Michael is one of those Brains with a capital “B,” so it’s worth taking a close look at what he produces when he claims he’s “just having fun.”
The final result was a big ol’ diagram (click on the image to jump over to Michael’s original post and a link to the full-sized image):
Michael’s formula focussed on both the volume of tweets and the “original content” in the tweets (using an Entropy calculation and a slight dampening of the score based on the volume of retweets).
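To make the general idea concrete, here is a rough sketch of what a "volume times entropy, dampened by retweets" score could look like. This is not Michael's actual formula, which his post describes in detail. The function names, the word-level entropy, the "RT " prefix check, and the `retweet_penalty` value are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the token frequency distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def influence_score(tweets, retweet_penalty=0.5):
    """Hypothetical score: tweet volume x content entropy, dampened by retweet share.

    tweets: list of tweet texts from a single user.
    retweet_penalty: assumed dampening weight (0 = ignore retweets).
    """
    if not tweets:
        return 0.0
    # Treat lowercase whitespace-separated words as the "content" of the tweets.
    tokens = [word.lower() for text in tweets for word in text.split()]
    # Assumption: a retweet is any tweet whose text starts with "RT ".
    retweet_share = sum(1 for text in tweets if text.startswith("RT ")) / len(tweets)
    dampening = 1.0 - retweet_penalty * retweet_share
    return len(tweets) * shannon_entropy(tokens) * dampening


original = ["new post on attribution models", "funnel analysis tips for #measure"]
mostly_rts = ["RT new post on attribution models", "RT funnel analysis tips for #measure"]
print(influence_score(original) > influence_score(mostly_rts))
```

Under these assumptions, a user who tweets varied, original content scores higher than one who mostly retweets similar text, which is the general behavior the description above suggests.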
Like Michael, I was surprised to see Szymon Szymanski (@ulyssez) as the dominant circle in the diagram. I’d certainly seen his tweets in the #measure stream, but I would have been more likely to guess @aknecht or @KISSMetrics (which is the good-sized circle off to the right of the diagram, as it so happens) would have had the dominant slot based on volume/variety.
So…Influence, You Say?
Seeing as I’ve been spending a lot of time with Twitalyzer of late, the next thought that popped into my mind was, “I wonder how Michael’s analysis of the top influencers would line up with Twitalyzer’s?”
It doesn’t take much to push that thought further and immediately hit a wrinkle:
- Michael’s definition of influence was oriented towards tweet volume and tweet content
- Twitalyzer’s definition of influence is based on an assessment of how likely a user is to be referenced or retweeted
Now, logically, if you have a high tweet volume, and your tweets contain a lot of original content (and, presumably, it’s not navel-gazing content, as that would rarely warrant the inclusion of the #measure hashtag), you’re more likely to be referenced or retweeted. Okay, there’s a logical link there, so maybe that’s not a huge wrinkle.
“Oh, bother,” said Pooh almost immediately, “I think we have a second wrinkle.” That being:
- Michael’s analysis was based solely on tweets that included the #measure hashtag and the users who tweeted those tweets
- Twitalyzer’s definition is more “user-based,” and takes into account the user’s direct and indirect network
And, a third wrinkle, just to round out a nice list:
- Michael’s analysis was based on all #measure tweets from 2010
- Twitalyzer operates more on a last day, last 7 days, last 30 days mode (with historical data going back much further…but it’s all based on when the user got plugged in as a daily-updated account)
For this third wrinkle, it seems reasonable to assume that the most influential #measure tweeters in 2010 are likely still fairly influential as of the last month.
In the end…does it matter? There’s only one way to know! Let’s take a look!
A Semi-Random Comparison
I’m not going to go through every bubble in Michael’s diagram. But, I am going to hit the “big” bubbles, look up their Twitalyzer Influence scores for the last 30 days, and then do the same for a smattering of small bubbles. For the “small bubble” users, I’ve only included users where it looks like Twitalyzer has been doing daily tracking for the last 30 days. And, these bubbles were also a random selection of users that I readily recognized (which, I realize, very likely introduced some sample bias).
Let’s see what we see:
|Username|Bubble Size|Twitalyzer Influence|
|---|---|---|
|@cjpberry|Pretty Big|Not available*|
* These accounts had not yet been Twitalyzed. As such, while I Twitalyzed them, a reliable 30-day average was not available, so I have not included their reported scores here.
So, what does this tell us? Well…seems like we don’t have a perfect correlation (we never do, do we?). From my own use of Twitter, the Twitalyzer scores square pretty well with what I would expect, although, yowza!, I wouldn’t have expected @kissmetrics to be running away from the pack like that!
I don’t think either one of these is “right” in any absolute way. The two approaches were developed for different purposes. Michael’s exercise was, I think, a couple of idle thoughts taken to a logical conclusion. Twitalyzer’s score is one metric inside a measurement platform that offers a whole suite of metrics and that has been evolving and maturing for a couple of years.
Does Any of This Really Matter?
I’m drawn to these sorts of exercises because I think they do matter. As web analysts, we got to a pretty consistent definition of a “page view” and a “visit” (“different tools calculate differently” be damned — the basic definition is the same), left things a little loose on “unique visitors,” and never really reached closure on “engagement” (philosophical debates as to whether it even matters notwithstanding).
As social media continues to gain traction with consumers and as social media platforms continue to evolve and mature, we absolutely need to be thinking about measurement within those platforms and we need to keep scrambling to keep up. And, hopefully, maybe we’ll be able to influence the evolution of those platforms so that they’re at least somewhat measurement-friendly. As long as we’ve got analysts pushing the tools and experimenting with new approaches (another example: @jojoba’s oxygenating alter ego posted her Social Media Masters Twitter Analytics presentation over the weekend; lots o’ tools out there!), we’ll get there!
So, yeah, it matters. The fact that it’s pretty interesting to watch (and maybe even help) some really, really sharp minds in our space try to crack some pretty hard nuts is just added gravy.
Update: Twitalyzer Community Scores
One of the few benefits of being based in the Eastern timezone with a lot of the heavy analytics work occurring on the west coast is that I got to get up this morning with an inbox and comments on a post that went up shortly before I retired for the evening!
Eric Peterson sent me some of his thoughts and pointed me to the Community area under Tweets and Tags in Twitalyzer:
So…now I need to go do some more Twitalyzer exploration and thinking — from the list above, I need to think through the relationship between Participation, Influence, and Attention, methinks. One of the real draws, for me, of Twitalyzer, is that it enables picking a set of appropriate metrics that, together, measure the effectiveness of any particular Twitter engagement approach. The kicker is nailing down which of those metrics are the right fit in any given situation.
Jenn Deering Davis of TweetReach also sent me some TweetReach data on #measure that covers 2011 to date. TweetReach focusses on reach and exposure of tweets (the difference being that reach is “unique people exposed” and exposure is more “raw impressions”). From that perspective:
TweetReach’s approach has a more direct tie to traditional advertising measurement when it comes to tracking “impressions.” But, it also has a lot of other features that can help sniff out influential people on a particular topic — a major differentiator is that its trackers can use boolean logic, which they showcased in the work they did around the Super Bowl ads. It doesn’t really show this off when we’re looking at a community that is defined as tightly as the #measure hashtag.
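The reach-versus-exposure distinction described above is easy to see in a few lines of code. This is only a sketch of the two definitions, not how TweetReach actually computes anything. The handles, follower IDs, and the `reach_and_exposure` function are made up for the example.

```python
def reach_and_exposure(tweet_authors, followers):
    """Return (reach, exposure) for a set of tweets.

    tweet_authors: list with one author handle per tweet.
    followers: dict mapping an author handle to a set of follower IDs.
    Exposure counts every impression; reach counts each person only once.
    """
    exposure = 0
    reached = set()
    for author in tweet_authors:
        audience = followers.get(author, set())
        exposure += len(audience)  # raw impressions, duplicates included
        reached |= audience        # unique people exposed
    return len(reached), exposure


# Hypothetical data: two accounts that share one follower (ID 3).
followers = {"@a": {1, 2, 3}, "@b": {3, 4}}
tweet_authors = ["@a", "@a", "@b"]
reach, exposure = reach_and_exposure(tweet_authors, followers)
print(reach, exposure)  # 4 8
```

Here three tweets generate eight impressions, but only four distinct people ever see them, which is why the two numbers can diverge so sharply in a tight-knit community.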
Again, I say… so many tools…!