Robots-Nocontent | How To Spot Abuse By Sneaky Webmasters

The moment I heard about the new Robots-Nocontent class that can be added to content, alarm bells started ringing in my ears. Whilst this can be used in a legitimate manner, it is also possible for webmasters to heavily abuse it.

The intent is that you can tell Yahoo which content not to index, such as adverts, navigation, and legal boilerplate that have nothing to do with the content you want to rank for.
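
As a rough sketch of the intended use (my own markup, not taken from Yahoo's examples), you simply add the class to the blocks you want ignored. Since HTML elements can carry multiple classes, it can sit alongside the ones you already have:

<!-- Main article: indexed as normal -->
<div class="post">
	<h1>My Actual Content</h1>
	<p>This is the text I want to rank for.</p>
</div>

<!-- Sitewide navigation and legal links: visible to users, ignored by Yahoo for ranking -->
<div class="sidebar robots-nocontent">
	<ul>
		<li><a href="/about/">About</a></li>
		<li><a href="/terms/">Terms &amp; Conditions</a></li>
	</ul>
</div>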

Yahoo has already stated that this has no effect on link equity. If you don't want links to count, you still have to use rel="nofollow".

From the Yahoo Blog:

This new tag does not change any treatment of inlinks from the page. Links within the section marked with ‘robots-nocontent’ will be treated just like links in the rest of the page.

They will continue to be actually crawled to find the target page, but they will not carry link attribution if they have the ‘rel=nofollow’ tag.
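
To illustrate how the two mechanisms combine (my markup, not Yahoo's): robots-nocontent hides the text from the index, while nofollow is still what withholds the link credit.

<div class="robots-nocontent">
	<!-- This text is excluded from Yahoo's index... -->
	<p>Cheap widgets from our sponsor:</p>
	<!-- ...but the link still passes weight unless it is nofollowed -->
	<a href="http://example.com/widgets" rel="nofollow">Widgets</a>
</div>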

I can think of lots of nefarious ways this can be used:

  • class=”this-is-what-the-user-really-sees”
  • class=”this-is-my-ripped-off-content”
  • class=”these-are-my-p0rn-ads”
  • class=”I-am-not-a-thin-affiliate-really”

Just as casual internet users are almost totally unaware of the rel="nofollow" attribute for links, it will take a long time before there is any awareness of Robots-Nocontent.

[Image: the Robots-Nocontent example as highlighted in the browser]

Here is what some of the HTML I used above looks like:

<ul class="robots-nocontent">
	<li>class="this-is-what-the-user-really-sees"</li>
	<li>class="this-is-my-ripped-off-content"</li>
	<li>class="these-are-my-p0rn-ads"</li>
	<li>class="I-am-not-a-thin-affiliate-really"</li>
</ul>

You see, normal web visitors get to see everything, and while search engines might still cache the whole document, the marked sections won't affect rankings, good or bad.

There are certain internet industries that will no doubt love this new feature. Sure, there are other ways to achieve the same effect, such as using graphics, but I haven't explored every possibility.

How To Display The Robots-Nocontent Hidden CSS Class

  1. Install the ChromEdit Plus Firefox extension
  2. Restart Firefox
  3. Go to Tools->ChromEdit Plus->ChromEdit. Select the userContent.css tab and paste in the following code as plain text:
/* Highlight nofollow links with a pink background */
a[rel~="nofollow"] {
  border: thin dashed firebrick !important;
  background-color: rgb(255, 200, 200) !important;
}

/* Highlight Robots-Nocontent sections with a green background */
.robots-nocontent {
  border: thin dashed firebrick !important;
  background-color: rgb(200, 255, 200) !important;
}

The first part of this CSS is the original from Matt Cutts to highlight nofollow links with a pink background.
The second part highlights content with the Robots-Nocontent class applied, in green.
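
If you want to verify the rules are working, a quick throwaway test page (my own, nothing to do with the extension) should show a pink link and a green block after the restart:

<html>
<head><title>userContent.css test</title></head>
<body>
	<!-- should pick up the pink background from the first rule -->
	<a href="http://example.com/" rel="nofollow">A nofollow link</a>

	<!-- should pick up the green background from the second rule -->
	<div class="robots-nocontent">A robots-nocontent block</div>
</body>
</html>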

It is highly likely that this functionality will also be added to well-known Firefox plugins such as Search Status and SEO for Firefox.

Danny Sullivan as usual has a good writeup on Robots-Nocontent.
Threadwatch has a slightly different perspective.

Also check out:- Andy Beal, WebProNews, FinalTag, John Andrews, Mashable & Techmeme


Comments

  1. says

    From what I’ve read on other sources as well, I feel that this is definitely one of the areas that quite a number of people are ‘happy’ to exploit. Still, it provides more freedom for folks like us when we sometimes drift to a little non-related content.

  2. says

    How long before affiliate marketers with affiliate content are 'advised' to use this tag on their content, I wonder.

  3. says

    The robots-nocontent class name is crap by design. Although spammers and scrapers will use it, the major issue is that it’s creating a shitload of work for honest Webmasters of legacy sites. Boycott it!

    • says

      That partially depends on “how” legacy they are.

      If sites used some PHP or server-side includes, they might well not have too many problems.

      If they didn't, they might still be able to use some search and replace with a regex-based tool. There are some fairly smart search-and-replace tools available, and anyone maintaining a large site probably has to use them already.
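
      A rough sketch of what I mean, assuming the legacy pages share a consistent wrapper such as <div class="sidebar"> (adjust the pattern to your own markup, and diff against a backup before overwriting anything):

      <?php
      // One-off batch edit: tag every sidebar div across a
      // directory of legacy HTML files with robots-nocontent.
      foreach (glob('/var/www/legacy/*.html') as $file) {
          $html = file_get_contents($file);
          $html = preg_replace(
              '/<div class="sidebar">/',
              '<div class="sidebar robots-nocontent">',
              $html
          );
          file_put_contents($file, $html);
      }
      ?>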

      I am not anti-nofollow; I find it easier to use than messing with JavaScript. I am not against this tag either, in the sense that I like having control, and it is better than having to make graphic elements for stuff like shipping information, which SEOs already do but which is not ideal for accessibility.

      There is also the question of footprints – this could effectively smudge them at least a little.

      Whether it will be abused at a later date by the search engines reinventing the purpose is hard to tell at this point. Maybe if the SEs backed down a little on nofollow for paid links, or gave clearer signals, people would be happier about this implementation.

      I don't buy the "designers" crying about a break in standards and the purity of CSS.
      Allowing a configurable class, as currently discussed on Threadwatch, might be useful, or the ability to define page elements.

  4. says

    I don’t listen to whining designers arguing “SEO classnames conflict with CSS” because that’s not true. What is true is that such crawler/indexer directives are meta data which should not alter or influence the markup on element level. Referencing existing classes and DOM-IDs in robots.txt to assign crawler/indexer directives would be elegant and way more flexible. And it would save bandwidth and maintenance costs.
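
    Something along these lines – a purely hypothetical robots.txt extension, supported by no engine, just the shape of what is being proposed:

    # Hypothetical syntax only - no search engine supports this.
    # Assign indexer directives to existing markup by class or ID:
    User-agent: Slurp
    Nocontent-class: sidebar
    Nocontent-class: legal-footer
    Nocontent-id: ad-banner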

  5. SEO Tips, Techniques & Tutorials says

    Hmm… I think this feature/tag will be abused by all the spammers who know how to abuse it (I don't want to give any ideas here), but not so much by 'most' people. It would be nice for Yahoo to implement a 'spam alert' system where using too many of those tags triggers the alarm.

    Anyway, I think it's much better to have a tag similar to Google's Section Targeting tag for AdSense rather than this one: fewer places to put the tags.
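
    For comparison, Google's AdSense section targeting works with paired HTML comments rather than a class:

    <!-- google_ad_section_start -->
    <p>Content AdSense should weight when choosing ads.</p>
    <!-- google_ad_section_end -->

    <!-- google_ad_section_start(weight=ignore) -->
    <p>Content AdSense should ignore.</p>
    <!-- google_ad_section_end -->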

  6. SEO Tips, Techniques & Tutorials says

    BTW Andy, I'd really like to know how you manage to incorporate the Stumble Upon, Digg-It, Technorati Fave & Bump-It buttons in your posts. Do you have to add the code (with the plugins activated) every time you post, or is it in the template files?

    • says

      Just a floated DIV before the content:

      <?php the_post(); ?>

      				<div id="post-<?php the_ID(); ?>" class="<?php blogtxt_post_class(); ?>">
      					<h2 class="entry-title"><?php the_title(); ?></h2>
      					<div class="entry-content">

      <!-- Floated box of social buttons, placed before the post content -->
      <div class="mybotbox">

      <!-- StumbleUpon submit button -->
      <a href="http://www.stumbleupon.com/submit?url=<?php the_permalink(); ?>&title=<?php the_title(); ?>" rel="nofollow"><img src="http://www.stumbleupon.com/images/stumble7.gif" width="100" alt="StumbleUpon" /></a>
      <!-- Digg button via the digg_this plugin, if active -->
      <div style="float:left; margin:0; padding-left:22px;"><?php if(function_exists('digg_this')) { digg_this('', '', '', 'tech_news'); } ?></div>
      <!-- Technorati favourites button -->
      <p style="margin:0;padding-left:10px;"><a rel="nofollow" href="http://technorati.com/faves?sub=addfavbtn&amp;add=http://andybeard.eu"><img src="http://static.technorati.com/pix/fave/btn-fave2.png" alt="Add to Technorati Favorites" /></a></p>
      <p style="margin:0;padding-left:14px;">
      </p>
      <!-- Bump-It button via the bump_this_widget plugin, if active -->
      <p style="margin:0;padding-left:10px;">
      <?php if (function_exists('bump_this_widget')) bump_this_widget(); ?>
      </p>
      </div>
      <?php the_content('<span class="more-link">'.__('Continue Reading &raquo;', 'blogtxt').'</span>'); ?>
      <div style="clear:right"> </div>
      <?php link_pages('<div class="page-link">'.__('Pages: ', 'blogtxt'), "</div>\n", 'number'); ?>
      					</div>

      <!-- <?php trackback_rdf(); ?> -->

      				</div>

      The CSS is fairly simple too

      .mybotbox {
      float: right;
      margin: 10px 5px 10px 10px;
      width: 100px;
      height: 250px;
      min-height: 250px;
      background: #ffffff;
      /* negative padding is invalid CSS and ignored by browsers */
      padding: 0 0 0 0.5em;
      }
      

      I still have a few bugs in the CSS on this theme, but they will get fixed gradually. I make no claims to being a designer.

  7. says

    WOW… thanks for the code, but Andy, your page loading time is killing… my… patience =P. Maybe try to lose a little of the JavaScript – just a suggestion. (I think the translation feature is not a lot of use, but you could post a poll to see whether people use it or not.) Really, seriously, it takes quite a long time to load the page ;).

    Also, there's an upgraded/more secure plugin for 'Digg This' called 'Digg That', if you are interested in security:

    http://www.harrymaugans.com/digg-that/

    • The SEO Blogger says

      Hey, I've just read the post, and apparently you have visited the site =P. Anyway, I don't think you're using the 'Digg That' plugin though – or are you?

      • says

        I am using a beta version of DiggIt

        http://tuggo.org/projects/diggit/

        Unfortunately the version on the front page is broken in IE, and the forums where 1.1.4 is available are currently down.

        It is the only plugin I know of that uses the real (new) Digg buttons with perfect detection, and allows the buttons to operate on a front page. It took the guy a little hacking to do it.

        The translation plugin creates cached pages, and Google have specifically stated that translated pages, even ones made using their tools, are not a problem for supplemental results.

        They do eat up a lot of server allocation and bandwidth when the bots come visiting.

        This blog currently has over 7000 pages cached and only 2 in supplemental.
        I must admit that number is a bit of a roller coaster; I have seen it as low as 3000 recently. It will also increase as the bots pick up all my translated pages – theoretically they should have close to 20K pages cached.

        I don't need to ask people whether they are useful: half of my visitors are from North America, so any poll would be worthless, and I look at my stats instead.

        Just today I know that French, German, Japanese, Russian, Korean, Spanish, Portuguese and both forms of Chinese pages were viewed, and I do see improved search traffic, though so far it is hard to quantify.
        I live in Europe; I know the search habits of Europeans, even those who speak fluent English.

        On page loading, there are a few things I can and will change, though the Digg button seems to have as much effect as most of the header apart from the subscriptions.
        The subscription buttons are due for another redesign anyway.

  8. says

    I can implement this on our site in about 3 files – the header, footer and navigation. I think I will give it a shot, because I have definitely seen pages on our site getting ranked based on things that are in the navigation column.
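
    Something like this in the navigation include would do it – a sketch assuming a typical WordPress-style sidebar.php; adapt it to whatever those three files actually contain:

    <?php // sidebar.php - wrap the whole navigation column ?>
    <div class="robots-nocontent">
    	<ul id="sidebar">
    		<?php wp_list_pages('title_li='); ?>
    		<?php wp_list_categories('title_li='); ?>
    	</ul>
    </div>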

Trackbacks

  1. Daily SearchCast, May 3, 2007: Yahoo Offers New Robots-Nocontent Tag; Belgians And Google Make Peace; What’s That Hidden On The VW Site & More!…

    Yahoo has a new way to block off parts of your web pages from being indexed. Going to try robots-nocontent? Belgian papers make a sort-of peace in Google and become searchable once more. Google is (rumored) to be buying everything. Google features VW …