Akismet – the Danger of Collective Intelligence (and why I don’t use it)

Akismet

Akismet is a very smart and effective system for controlling comment spam on blogs, and I know thousands of bloggers swear by it, listing it as their number one plugin for their blogging platform.

Increasingly I have started having problems posting comments on popular blogs such as Problogger. Initially it was when I included a link to a relevant post, which is within Darren’s comment policy. Recently I have had comments that didn’t include a link enter the moderation queue on new posts. Maybe I am going to have to start commenting without entering an optional URL (I know that affects Spam Karma).
It is well known among those who visit Darren’s blog on a regular basis that Automattic use Problogger as a yardstick – the site attracts a great deal of spam.

Collective Intelligence

No one outside Automattic knows the inner workings of Akismet, but it is known that it collects data from all the blogs using the service, and uses an algorithm to determine if a comment is spam. Such a system can be very effective. If someone spams one blog, they in some way get a black flag on all blogs, although how much weighting is transferred from one blog to another is unknown.
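
To make the idea concrete, here is a toy sketch of how a shared filter might pool reports from many blogs. It is purely an assumption for illustration: the class, the method names, the per-blog trust weights and the threshold are all invented, and none of it is taken from Akismet, whose algorithm isn't published.

```python
from collections import defaultdict


class ToyCollectiveFilter:
    """Toy model of a shared spam filter. Hypothetical; NOT Akismet's algorithm."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold
        # How much weight each blog's reports carry (assumed to start at 1.0).
        self.blog_trust = defaultdict(lambda: 1.0)
        # Running score per commenter; higher means more spam-like.
        self.sender_score = defaultdict(float)

    def report_spam(self, blog, sender):
        """A blog owner flags a comment from `sender` as spam."""
        self.sender_score[sender] += self.blog_trust[blog]

    def report_ham(self, blog, sender):
        """A blog owner recovers a comment from `sender` out of the spam bin."""
        self.sender_score[sender] -= self.blog_trust[blog]

    def devalue(self, blog, factor=0.5):
        """Reduce the weight of a blog whose reports have proved unreliable."""
        self.blog_trust[blog] *= factor

    def is_spam(self, sender):
        """The verdict is shared: it applies to the sender on every blog."""
        return self.sender_score[sender] >= self.threshold


shared = ToyCollectiveFilter()
shared.report_spam("blog-a", "andy")   # one flag on one blog...
flagged_everywhere = shared.is_spam("andy")
shared.report_ham("blog-b", "andy")    # ...cancelled by one recovery elsewhere
after_recovery = shared.is_spam("andy")
```

In this toy version a single flag from a fully trusted blog is enough to block the commenter everywhere, and a single recovery undoes it. The real weighting is exactly the part nobody outside Automattic knows.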

Rogue Data

I know Matt Mullenweg wouldn’t flag criticism on his blog as spam, but that might not be the case for every blogger. The same could be true for trackbacks – instead of just deleting a trackback that was in some way critical, or from a competitor joining in the conversation, it would be easy to flag it as spam.

The Danger of Collective Intelligence

Collective Intelligence isn’t just used for blog comment spam; it is also used for email spam. Every day I pull emails out of my spam folder in Gmail. Some of it is highly important, such as data sent to me from this domain, contact form results, Spam Karma results etc. Often those reports contain “naughty” words – one day Gmail will learn that I really need this information.
I also pull lots of email from marketers out of the spam bin. If I have signed up for the mailing list, it shouldn’t be in the spam bin, even if I might not read it every time. If you don’t take this action, the email filters have no data to work with, or might take your leaving the email in the spam bin as confirmation that they made a correct decision.

My Own Commenting

Whilst I am sure there are more active blog commenters than me, I am fairly active, and always try to add value. I am a repeat commenter on many blogs, and 95% appear without any problems, even on sites running Akismet.

Here are some things I have noticed:

  • Links in the Body – If you include a link in the body of a comment, you have a high risk of being flagged for the moderation queue. Even when the owner of the site has asked for a link to be posted, I avoid it.
  • Optional Links – If you use the optional link to your site, which is actively encouraged, to point not to your root domain but to highly relevant deep content, the comment has a higher chance of being flagged as spam.
  • Long comments – If you make a longer comment, adding true value to the blog where you are posting it, you are more likely to be flagged as spam. I am not sure if that is because you have a higher chance of snagging a particular word filter, but I am less inclined to write long comments. I write lots of comments on blogs discussing monetization. I avoid using words such as money.
  • Constructive Criticism – I avoid linking / pinging blogs that are averse to criticism, or don’t show trackbacks. If a trackback doesn’t show up, it suggests to me that there is a chance my trackback is remaining in the moderation queue, or even worse is being flagged as spam.

A blog which isn’t comment / trackback friendly becomes a “bad neighbourhood” for me. In much the same way as linking to a grey boxed website might damage your ranking in search engines, commenting on such a blog or linking to it with a trackback might damage your ability to comment on other sites.

Require Signups to Comment?

No matter how much you think this helps with spam, it deters constructive comments. I have declined commenting on 2 blogs today simply because they required me to sign up to their blog to place a comment.
I have just deleted the broken trackback from one of them after I wrote a long comment, only to discover after I hit submit that I needed to sign up to comment. I didn’t however flag it as spam as many would.

Liability and Reputation Management

Reputation management is often discussed – I frequently get comments from the owners of various products and services I review, whether the post was positive or negative. I always try to be constructive in my opinions, and even when what I say is critical in some way, generally the time I have spent looking at something in depth is appreciated.

Due to collective intelligence, the actions of a rogue webmaster who flags critical comments and trackbacks as spam could prevent legitimate commenters voicing their opinion on hundreds of other blogs. Who would ultimately be responsible, Akismet or the webmaster who flagged comments as spam that were just voicing a different opinion?
Who knows, maybe that is why I am having increasing difficulties posting comments on Problogger, and other high traffic blogs using Akismet. I am fairly certain that Darren has never flagged one of my comments as spam, and I have been leaving comments there for some time. Surely Akismet should have learned by now?

Spammers are actively working to improve their Akismet reputation by posting comments containing absolutely no links. If a comment gets approved or slips through, many comment systems would give a significant bonus for any future comments, because the author would no longer be a first time commenter.
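
As a rough illustration of why that trick pays off, here is a minimal sketch of a "first time commenter" rule. It is a hypothetical rule written for this post, not the code of any real comment system; the function name, the `approved_authors` set and the `max_links` limit are all assumptions.

```python
def needs_moderation(author, link_count, approved_authors, max_links=2):
    """Hypothetical rule: hold a comment for review if the author has never
    had a comment approved, or if it contains max_links or more links."""
    if author not in approved_authors:
        return True  # first time commenters are always held
    return link_count >= max_links


approved = set()

# The spammer's first comment has no links, but is held as a first comment.
first_held = needs_moderation("spammer@example.com", 0, approved)

# It looks harmless, so the blog owner approves it.
approved.add("spammer@example.com")

# The next comment carries a link and now goes straight through.
second_held = needs_moderation("spammer@example.com", 1, approved)
```

Under a rule like this, one innocent-looking approved comment is all it takes to skip the queue from then on.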

GeoTargeting Blacklists

I am not sure if GeoTargeting is used in some way for blog spam blacklists. I really hope it isn’t. I am based in Poland, and maybe that is having an increasing effect on my ability to post comments. Blogging is a global community.

Tips For Akismet Users

Many people who use Akismet as a way to control comment spam believe that it saves them a lot of time. In many cases that is true, but there are certainly an increasing number of posts I have read regarding false positives.

This post isn’t intended to deter people from using Akismet, but to make them aware of the effects of collective intelligence and how they can help to improve it.

  • Use delete – Don’t flag critical comments and trackbacks as spam – you could always leave them, or answer them in a constructive manner – the latter shows real class
  • Monitor your spam frequently – don’t just assume everything Akismet catches is spam – not taking action could be harming your customers’ ability to give feedback
  • Actively recover comments even if they are critical – you can always delete the comments afterwards – I don’t know if comments detected as spam and not recovered affect the collective intelligence, but having briefly looked at the Akismet WP plugin code, white flag signals are being sent.
  • Upgrade Akismet – newer versions of Akismet plugins use different functions to check for spam, and sometimes remove functions that were determined to cause too many false positives.

The Automattic guys are constantly working on Akismet to improve it. They rely in part on the data you provide them, both flagging comments as spam, and recovering comments from the sin bin.

Maybe one day I will happily use Akismet, but currently I am uncomfortable with the number of times my own comments end up in someone’s moderation queue, only to sometimes appear a few hours or days later when a good webmaster does some housekeeping. It interrupts the flow of conversation.


Comments

  1. says

    About a week ago I noticed that almost no comments I made were getting through on many blogs, including my own. A quick email had it sorted out, though when I asked how I triggered the blacklisting, I got no answer. I think it could be down to regularly changing the name field: I post comments under various names (Tim Nash, Venture Skills, Vskills Team) depending on the blog and subject, normally always with the same URL but sometimes a non-personal email address if I’m unsure of the blog. These variants, I suspect, are what put me in the sin bin rather than URL-based spam. Still, it’s rather amusing to be blocked from commenting on your own blogs (well, once it’s sorted it is).

  2. says

    Bloggers who report things as spam that aren’t, which is not uncommon, are devalued in the Akismet system. That’s how it’s been from day 1, and numerous attempts to poison Akismet have failed. The advantage of so many people using Akismet is it becomes very difficult for any one person to influence the system.

    However what you don’t mention, and what I think is a far more likely scenario, is that a spam filter gives people plausible deniability for deleting a comment from their site. It’s awkward to say “I didn’t like your comment” so people just blame it on the third party. This used to really confuse me, as people who had been “blocked” would contact Akismet and we deeply investigate every reported FP, but the logs were showing that they weren’t ever marked as spam.

  3. says

    That doesn’t explain why I would notice an increase in comments being held for moderation on sites where I make comments on a fairly regular basis, even though those comments are pulled out of moderation.

    Why would I notice specific things such as providing a deeper link through to more relevant content, or making a longer comment affect my ability to escape the moderation queue?

    There are advantages to collective intelligence, and I did query how much weighting is transferred.

    Since the beginning of December I have tracked approximately 450 comments I have made in various places. I would estimate 100 to 150 of those are on WordPress installations running Akismet.

    I would estimate that of those comments less than 5 have not appeared after a period of time, yet I still have comments that hit the moderation queue on blogs where I have previously commented without problems, and it seems extremely erratic.
    I am referring to fresh posts, no links in the body of the comment, and comments of a reasonable length that as far as I am aware didn’t contain words that might trip up filters.

    With that amount of comment history I would expect the collective intelligence to now be working in my favor, allowing me to include a link in a comment even on sites such as Darren’s, or to be able to write a 500+ word comment without it going in the moderation queue.

    Please forgive my lack of faith in filters that rely on global databases, but having spent a lot of time trying to train Gmail into allowing the email I want to get to me, false positives are a major sticking point.

    I expect problems on a first comment, or maybe a comment to an older post, but not to new posts on blogs where my comments are always accepted.

  4. says

    Moderation has nothing to do with Akismet, that’s other settings inside of WP. For example on your Options > Discussion page you have things like “Moderate comments with more than X links.”

  5. says

    So are you stating that number of links in the body, length of comment, whether you supply an optional URL and how deep it goes are not factors within the Akismet algorithms?

    From what I remember, on comments on Problogger they just don’t appear if they have hit a hurdle.

    With Spam Karma, comments that almost pass end up in the normal moderation queue, and I thought that was the same with Akismet

  6. says

    Akismet is binary, it either says something is spam or it isn’t. The only time it may send something to the moderation queue is if the service is down.

  7. says

    Matt,
    Suggesting that Akismet isn’t to blame for the false positives and that the comments were just deleted seems pretty suspicious to me considering that just recently I was being blocked from Scoble’s blog – where he doesn’t delete comments that criticise him, nor were my comments critical – and you investigated the problem and resolved it. Further, a colleague of mine who uses Akismet confirmed that it marked all of my comments on his blog as spam. See http://www.symphonious.net/2007/02/02/scoble-your-blog-is-eating-comments/

    Also, Akismet may be separate to moderation, but if Akismet marks something as spam and later the site owner recovers the comment, it looks as if the comment has been held in moderation, which is what Andy has most likely been seeing.

    Finally, Andy, the “Post” button for comments does nothing in NetNewsWire (uses the Safari rendering engine).

  8. says

    Adrian,

    Thanks for stopping by – Robert’s is another blog that most of the time my comments appear without problem, yet sometimes they don’t.

    I don’t chase Robert over it, because he has enough to deal with. Robert definitely doesn’t delete critical comments, and I have racked up a couple.

    In fact one fairly serious comment didn’t get through Akismet, even without a link, and I just made sure I included the warning in a later followup comment. Who knows what potential harm that missed comment did to the thread of discussion?

    Maybe coComment integration is causing a problem, I will try switching it off – note you may have to visit my comments policy sometime soon to place comments here – I am just working out the code to make sure people view it at least once.

  9. says

    I would never suggest that Akismet never makes mistakes. In fact, more than anyone else in the world I know exactly how many mistakes it makes every day. What’s different about it isn’t that it doesn’t mess up (even five nines of accuracy would be a few dozen mistakes a day, with the amount we process) but that it adapts and learns quickly.

    If you think a comment has been caught, just email the person and ask them to submit it as a false positive if it was caught. It’s not personal, and it probably has very little to do with what you do (number of links, etc) and far more to do with what spammers are doing. (Remember only 6% of the comments we process are legit.)

  10. says

    My main problem is that I get far too many spam comments to peruse the log and rescue the FPs. 400+ a day. I’d imagine Scoble gets thousands a day.

    My first step toward a solution to that problem is this plugin, which notifies users when their comments are moderated or put into the spam bin. When something goes into the spam bin, I immediately shoot them a note that encourages them to contact me and ask me to rescue their comment.

    The next thing I’m considering is a plugin that goes through all comments marked as spam and deletes the really obvious spam. Akismet is an imperial “thumbs up” or “thumbs down” decision, so there’s no way to separate the SPAM from the maybe-spam. If I could eliminate 80% of the spam log on sight, I might be able to skim the Akismet spam log for false positives.

    Even a simple blacklist (perhaps making the built-in WP blacklist delete caught comments instead of putting them in the spam bin) would likely help.

  11. says

    I will have to give that a little test run. Maybe an option is to provide a get out of jail card that will either send an email with confirmation click, or move the comment to the normal moderation queue.

    Even when people fill in my captcha they often still end up in the moderation queue, but it is very rare that I have to pull comments from the black list in Spam Karma. The worst is when someone pre-prepares a comment, pastes it in, it gets caught because they have not been on the page long enough, and then they write another comment saying “I left a comment but it was caught by your filter” – I have had some weird database bugs when people have managed to get their personal domain banned.

  12. says

    Before trying Akismet, I’d implemented a little spam comment word blocker, which was an update of a little code I found in the WordPress.org forums. It did on occasion make posting comments difficult, and so I was thrilled to find how effective Akismet is.

    However, after a while I tired of scanning the Akismet queue, so I partially re-enabled it. Since I don’t want to be snagged for links, here’s the unlinked URL:

    http://developedtraffic.com/2007/02/11/wordpress-akismet-a-little-code-blocker/

    There are links to the full post about the word blocker in the post.

  13. says

    Just edited it to make the link live.

    The content of comments here doesn’t affect what gets through unless you go crazy. I think I set the penalty low for links, and only if there are more than 4 links in the content.

    There are actually plugins available that use Akismet as a variable in their overall calculations, rather than as a sole judge.

    That little hack might be useful to block some of the spam but I like being able to write a little bit of code in my comments

  14. says

    I agree about the signup policy. I don’t bother with blogs that require a signup or block comments altogether (unless it’s very important).

  15. says

    Andy, I know what you mean.

    I’d noticed that much comment spam contains both normal links and BB code links, I just thought blocking the BB code for URLs would stop an awful lot of it from being posted in the first place. But that “word blocker” doesn’t block normal HTML links.

    I hadn’t considered wanting to write BB code in comments, since those don’t work on my blogs.

  16. says

    Andy. I just wrote a post that I’ll publish in the next day or two about a comment I left at a corporate blog that was shut down because I had a differing opinion of what their company holds. Sorry it’s not posted yet to see. But before the weekend’s end.

  17. says

    I have been very afraid of Akismet providing false positives, especially since I was running it with Bad Behaviour. It might have been a bit excessive, but I was getting a lot of referral spam.

    I found the Comment Guard WP plugin by accident, and have it running on my blog right now. I am very positive with my experience so far, and have had great support from Angsuman with some technical problems. Read about it here:

    http://blog.taragana.com/index.php/archive/wordpress-comment-spam-protection-comment-guard-plugin-beta-release/

    It is now in beta, but I think it is quite easy to get into the program.

  18. says

    Wow, for a blog post about commenting I had a really hard time finding out how to comment. There were two screens’ worth between the post and the comments, and there is another screen’s worth AFTER where the comment box should be. It made it very hard to see the sections where it says you have to read the comment policy to enable the form.

    Just a thought: do a “fake” comment form with read-only input where the real form should be that clearly indicates the comment policy info – that will make it jump out.

    on to my comment!

    One thing I’d *really* like Akismet to start doing is be “less binary”. I’d like to more easily ignore the stuff that is guaranteed spam and focus on reviewing the stuff that may or may not be spam.

  19. says

    Apologies that you found it a little difficult to find – the cookie based system was only hacked in yesterday and certainly needs some tweaking.
    I didn’t intend to implement it as quickly, but some manual spam managed to get emailed to some comment subscribers. It will become slightly more streamlined, especially for posts that already have quite a few comments.

  20. says

    Well, what is the alternative to Akismet, Andy? My blog receives more than 20 spam comments per day. I was tired of deleting them until Akismet was installed. What is the alternative?

    • says

      Engtech

      I am actually quite sick of Akismet, because I haven’t got time to monitor every blog I link to to see if a trackback was pulled out of the sin bin.

      It happened recently on Scobleizer again, and he didn’t know I was part of the “conversation”

      Hmm… is Akismet sending email addresses to Akismet.com?

Trackbacks

  1. [...] now, after openly criticizing Akismet. Defensio is a similar tool, so it does fall prey to the same “wisdom of crowds” weaknesses. Defensio’s usability far outstrips SK2 ((gouge my eyes out ugly)) and Akismet ((requires plugins [...]