Exactly Why Nofollow at Wikipedia is bad

Once upon a time, a scientist wrote a brilliant blog post giving the answer to life, the universe and everything. A few people linked to him, and anyone searching for the answer to the ultimate question could find it easily with a simple Google search.
The scientist appeared right at the top of the results.

A Wikipedia editor, a student, read the article and, because it was free to use under its CC license, added an entry to Wikipedia.

He didn’t use the article word for word, but his entry contained many of the same keywords.

Of course the student linked through to the original source, but the link was nofollowed.

Lots of websites were free to use the content from Wikipedia, so the article appeared in all the national newspapers, major online publications and so on.

Most of those didn’t link back to the poor scientist’s website, but to Wikipedia.

All these other sites already had huge authority in the search engines, so whatever they published also enjoyed a high position in search results.

Now, a few months later, whenever someone is looking for the answer to life, the universe and everything, the first 50 results are the Wikipedia entry plus 49 versions of the Wikipedia entry with a slightly different slant, all pointing back to Wikipedia.

The original work by the scientist is no longer visible.

This might be looked upon as a Darwinian process, but what happens if there is actually a mistake in the entry created by the student?

Oh, and of course, the scientist isn’t yet regarded as “notable”.
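
For anyone unfamiliar with the markup involved, the change under discussion is tiny: Wikipedia simply adds a rel attribute to every external link it publishes. A rough sketch, with a made-up URL purely for illustration:

    <!-- an ordinary citation link: search engines can follow it and use it
         when working out which page is the original source -->
    <a href="http://example.com/ultimate-answer">The Answer to Life, the Universe and Everything</a>

    <!-- the same link as Wikipedia now emits it: rel="nofollow" asks search
         engines not to count the link in their ranking calculations -->
    <a href="http://example.com/ultimate-answer" rel="nofollow">The Answer to Life, the Universe and Everything</a>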

Comments

  1. says

    Somehow I can’t imagine the scenario occurring, for a few reasons:

    As far as I know, each Wikipedia article about an event or personality contains a link to the official website. Having the link to the author’s (or scientist’s) website may help ensure that the specific Wikipedia page is kept up to date with the author’s official website. Many pages are structured using heavy quotes from the official website, which is seen as a definitive resource.

    I’m sure any dedicated journalist would have been taught to ‘always go to the source’ when writing their news article. It’s sorta like college essays: you might interpret the facts wrongly, but resources exist for verification.

    Wikipedia is also an open-community project: I’ve seen several pages go through many revamps because of neutrality issues. Many authors edit and argue about a specific page. This helps to somewhat reduce the likelihood of a complete blunder by one individual.

    I’m just curious about the actual process of being widely accepted as a legitimate Wikipedia editor. Are there any specific requirements? Or does it result from in-group validation?

  2. says

    Well, just look at how many bloggers referred to Wikipedia as being a black hole in relation to link equity and didn’t attribute that to me.

    Black hole in relation to Wikipedia

    Followable links are used by search engines as part of their duplicate content algorithms.

    Google specifically state that you should syndicate your content carefully, with a link back to the original source. It has to be followable.

  3. says

    I can’t help but think that this is a huge step towards completely ruining the spirit of the project. It seems to go against what social media is all about: two-way communication. On the net that road isn’t always travelled in words but in links and trackbacks and relevancy. Hoarding indeed.

  4. says

    It’s a tragedy. But I don’t know if the scientist really cares. He still gets credit on Wikipedia and he probably isn’t trying to make money from his website anyways. He can console himself with his Nobel Prize and all the other accolades and opportunities that come from his discovery.

    That is unless he’s a scientist-blogger.

  5. says

    What intrigues me is why you think this is a problem with Wikipedia’s policy and not with Google’s search technology.

    If, as you say, the scientist’s article is no longer visible in the search results, then that could demonstrate as much a problem with PageRank as a search algorithm (i.e. it is no longer such a useful search). Alternatively, it could be that the condensed version of the scientist’s arguments presented in the Wikipedia article actually is more useful to the average searcher than the original; in that case it is difficult to see where the problem lies. It’s not as if PageRank is the web’s equivalent of the Academy Awards; it’s just a search engine …

    Ultimately both the scientist and Wikipedia are contending with the vagaries of PageRank. But I completely fail to see where the ‘tragedy’ that John Wesley mentions lies. PageRank exists to find relevant results for searchers, not to reward bloggers…

  6. says

    First of all, Wikipedia have implemented a specific tag for search engines which is intended to indicate that the link is unverified, irrelevant and untrusted, whereas in fact the link in the majority of cases is to the original research documents that are being cited. That presents both search engine and ethical problems.

    From Google’s Webmaster Central

    Syndicate carefully: If you syndicate your content on other sites, make sure they include a link back to the original article on each syndicated article. Even with that, note that we’ll always show the (unblocked) version we think is most appropriate for users in each given search, which may or may not be the version you’d prefer.

    I do a lot of article syndication and study closely what happens when you syndicate the same or similar piece of content to 500 or more sites.

    If your content is posted well before you syndicate it, there is not a major problem: you will most likely still end up in the top 4 or 5 positions in the SERPs for a specific title search. If it is a more vague search on keywords, until you have enough site authority you won’t be anywhere in sight.
    You might still receive a trickle of traffic from it, and overall more views than if you had just published the article on your own site, unless you have thousands of subscribers.

    In this particular example, the scientist starts off in first position in the SERPs.
    When Wikipedia pick up the content, he might be relegated to 4th or 5th place fairly quickly.

    When multiple sites take the content from Wikipedia in some way for syndication, the scientist would drop to 50th place.

    Wikipedia effectively becomes the authority on the subject.

    The job the search engines do is to work out which parts of each article on Wikipedia are relevant to which citation link. You couldn’t expect a human to decide which paragraph in a Wikipedia entry belonged to which cited site.

    Without the links, you are expecting the search engines to work out a “Which came first, the chicken or the egg” type question comparing millions of documents.

    Links that search engines can follow are important.

  7. says

    Right, so really your argument is that Wikipedia are misusing the nofollow attribute, in the sense of using it for links that are trusted (although the question is, of course, how reliably they know that). The issue about which article comes up first in search engines seems to be slightly orthogonal.

    The fundamental question there is ‘does this make the search result more or less relevant to the user?’, not ‘does this search result credit the right person for an idea?’

  8. says

    They are using an attribute designed for search engines to prevent the search engines from using the link in their calculations for duplicate content and origin.

    They are not doing it effectively, because Yahoo ignores it, and who knows, maybe Google will as well in the future.

    It is quite possible what Google will do is just use a historical snapshot of links, including all the spam it contained.

    I think the person an idea originates from is a relevant result. It might be less relevant than the Wikipedia entry of compiled works, but they certainly deserve a position higher than 50th in the SERPs for their original idea.

    I don’t believe forcing a change in algorithms is such a good idea. The way things are, a link to Wikipedia means something, just like a link out to related sources.

    It seems Wikipedia are hoping that links from them will become devalued. The best way to do that immediately is to reduce the value and relevance of a link to Wikipedia; then they have less juice to pass on.

    In many ways the search engines should treat Wikipedia as a bunch of rehashed duplicate content, linked to automatically simply because it can be done without much care.
    I am sure they devalue a lot of the links to Technorati, which also uses nofollow for outgoing links, in the same way.

    Wikipedia almost gets linked to now by default, simply because they are already at the top of the search results on every search.

  9. Jared Haer says

    The purpose of this attribute is to stop comment spammers (loans, gambling, dating services) on blogs and forums, not to stop scientists writing papers on their own sites. It seems to me the good of this outweighs the bad, easily. Or I could post links to my site all over your blog to help my PageRank…

  10. iceek says

    Sorry, I am probably being very blunt. But if I understand correctly, Wikipedia is the blog of a student?

Trackbacks

  1. Is Wikipedia a black hole?…

    Over the weekend, Wikipedian-in-Chief Jimmy Wales decreed that all links on the site would be tagged as “No Follow.” That means, in essence, that the links become invisible to search engines like Google’s. The engines won’t take the links into acco…

  2. […] Wikipedia implemented a little change to their website yesterday, in an attempt to make their popular social encyclopedia a less attractive target for spammers. It may have been a tiny change but it seems it may have a significant impact on many people in the web community. Some people think it is a good move while others argue that it will turn Wikipedia into the greedy kid who no-one invited to the party. […]