WordPress.com Bugged XML Sitemaps

WordPress.com has added XML sitemaps so I thought I would take a glance at their implementation.

My immediate thought was to take a look at Lorelle’s sitemap.xml:

  • Home page: updated daily, highest priority
  • Every other page updated on a weekly basis?

That seems like a good way to tell the spiders to index your site less often than they currently do.

With Lorelle you would certainly want spiders checking the home page hourly as she is sometimes the source of breaking news.

Then I looked at the sitemap with a little more detail, and in particular the entry for her most recent post, the Cyclical Nature of Blog Stats – a post worthy of a link anyway so this is a 2-in-1.

This entry was written by Lorelle VanFossen and posted on June 16, 2008 at 4:57 am

Ah, but I know Lorelle sometimes writes posts in batches and schedules them for publishing. Let's look at the XML:

<loc>http://lorelle.wordpress.com/2008/06/16/the-cyclical-nature-of-blog-stats/</loc>
<changefreq>weekly</changefreq>
<priority>0.6</priority>
<lastmod>2008-06-11T18:59:24+00:00</lastmod>

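The lastmod date is 5 days before the post was published, so it presumably records when the scheduled post was written rather than when it went live.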

Just for good measure, let's look at the home page:

<loc>http://lorelle.wordpress.com/</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
<lastmod>2008-06-12T02:05:56+00:00</lastmod>

Wrong again – today is the 17th and Lorelle published a post on 16th June, which updated the home page, but that update is not reflected in the sitemap.
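Both problems can be checked mechanically. The following is only a rough sketch: it wraps the two entries quoted above in `<urlset>` and `<url>` elements (my addition, so that they parse), takes the publish date from the permalink, and uses a hypothetical helper name, `stale_entries`. A real sitemap declares an XML namespace, so the tag names would need to be qualified there.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date, datetime
from urllib.parse import urlparse

# The two entries quoted in the post. The <urlset>/<url> wrappers are an
# assumption added here so the fragment is well-formed XML.
SITEMAP = """<urlset>
  <url>
    <loc>http://lorelle.wordpress.com/2008/06/16/the-cyclical-nature-of-blog-stats/</loc>
    <changefreq>weekly</changefreq>
    <priority>0.6</priority>
    <lastmod>2008-06-11T18:59:24+00:00</lastmod>
  </url>
  <url>
    <loc>http://lorelle.wordpress.com/</loc>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
    <lastmod>2008-06-12T02:05:56+00:00</lastmod>
  </url>
</urlset>"""

# Permalinks of the form /YYYY/MM/DD/slug/ carry the publish date.
PERMALINK_DATE = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/")


def stale_entries(xml_text):
    """Return (loc, reason) pairs for entries whose lastmod looks too old."""
    post_dates, home_pages, flagged = [], [], []
    for url in ET.fromstring(xml_text).iter("url"):
        loc = url.findtext("loc").strip()
        # Compare calendar dates only; this ignores time-zone edge cases.
        modified = datetime.fromisoformat(url.findtext("lastmod").strip()).date()
        path = urlparse(loc).path
        match = PERMALINK_DATE.match(path)
        if match:
            published = date(*map(int, match.groups()))
            post_dates.append(published)
            if modified < published:
                flagged.append(
                    (loc, f"lastmod {modified} precedes publish date {published}")
                )
        elif path in ("", "/"):
            home_pages.append((loc, modified))

    # The home page changes whenever a post is published, so its lastmod
    # should be at least as recent as the newest post.
    newest = max(post_dates, default=None)
    for loc, modified in home_pages:
        if newest is not None and modified < newest:
            flagged.append(
                (loc, f"lastmod {modified} is older than newest post {newest}")
            )
    return flagged


for loc, reason in stale_entries(SITEMAP):
    print(loc, "->", reason)
```

Run against the two entries above, both the post and the home page are flagged.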

Sometimes you might be better off with no sitemap at all…

5/10 for finally fulfilling a user request
1/10 for implementation (so far)

23 Comments

  1. Jacky Supit (2 comments.)
    Posted June 17, 2008 at 11:29 am | Permalink

    Nice find.
    I guess we can give them a little more time to do their best, and they might even use your feedback to make it better :)

  2. Jacky Supit (2 comments.)
    Posted June 17, 2008 at 11:31 am | Permalink

    Where is my submitted comment?
    Gone? Caught by Akismet or something?

    • Andy Beard (1685 comments.)
      Posted June 17, 2008 at 11:33 am | Permalink

      I use Spam Karma – need to leave a few comments before they avoid the moderation queue

      • Clement (10 comments.)
        Posted September 10, 2008 at 4:31 pm | Permalink

        Andy, can you share with me the URL for Spam Karma? Do you use it along with Akismet? In the meantime, I use only Akismet on my blog.

  3. Jeet (1 comments.)
    Posted June 17, 2008 at 12:47 pm | Permalink

    Good find, buddy. I am sure that now they have started generating sitemaps they will do it right in the next release. Till then I would use the plug-ins :)

  4. Al (2 comments.)
    Posted June 17, 2008 at 12:56 pm | Permalink

    An interesting and possibly simple idea would be to derive the frequency for the home page based on the average number of posts per day over the last 30 day window.

    Couple benefits I see:
    * People like Lorelle that post quite often, it’ll be low
    * People that post sporadically, it’ll be higher
    * Not all frequencies across wordpress.com will be the same. This may or may not have any impact but I suspect that Google will love you more (read:trust in some form) if the sitemap is accurate and reflects what Googlebot is seeing in terms of changesets going through a site.

    A similar line of thinking could be used for post pages. Default the frequency to a derivative of number of comments per post over a rolling 30 day window. After the statistics for your site show that you only receive comments for x days, it gets increased to weekly or higher depending.
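A minimal sketch of the home page suggestion in this comment, assuming made-up thresholds. The function name `home_changefreq` and the cut-off values are hypothetical and are not anything WordPress.com is known to use; only the returned strings are real `changefreq` values from the sitemap protocol.

```python
from datetime import date, timedelta


def home_changefreq(post_dates, today, window_days=30):
    """Pick a changefreq for the home page from recent posting activity."""
    cutoff = today - timedelta(days=window_days)
    # Count posts published inside the rolling window.
    recent = [d for d in post_dates if cutoff < d <= today]
    per_day = len(recent) / window_days

    # Hypothetical thresholds: tune these to taste.
    if per_day >= 1:        # about one post a day or more
        return "hourly"
    if per_day >= 1 / 7:    # about one post a week or more
        return "daily"
    if recent:              # at least one post in the window
        return "weekly"
    return "monthly"        # nothing published in the window


today = date(2008, 6, 17)
every_day = [today - timedelta(days=i) for i in range(30)]
print(home_changefreq(every_day, today))      # frequent poster
print(home_changefreq(every_day[:2], today))  # sporadic poster
```

The same shape would work for post pages by counting comments per post instead of posts per blog.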

  5. Hunter Jackson (9 comments.)
    Posted June 17, 2008 at 4:20 pm | Permalink

    Andy,
    Very interesting. I saw this announcement this morning. I am upset that the spiders are only crawling pages every 7 days…wish we could edit it on wordpress.com where my blog is currently…at this time…hosted.

    hopefully they will be doing this soon!

  6. zach (1 comments.)
    Posted June 17, 2008 at 6:25 pm | Permalink

    Wow, great article. Looking back at your past articles has taught me a lot. I subscribed to your feed. Maybe you could take a look at my site, and maybe even subscribe if you would like to.

    Thanks,
    Zach

  7. Lorelle (1 comments.)
    Posted June 17, 2008 at 10:16 pm | Permalink

    This is great information, Andy, but for those who don’t really understand what sitemaps are: WordPress and WordPress.com use pings, sent through ping-o-matic when a post is published, to alert search engines and others that you’ve published a new post. This is the traditional “invitation” for them to send out their search bots.

    The sitemaps are a constantly current table of contents for your blog, updated every time you publish a new post. It acts like a road map, telling the search bots which recognize XML sitemaps which pages to index. On the first run through, it indexes everything. On the next visits, it can check via the dates to find out what is new or modified and index only the new information, allowing the bots to move faster through the sites and not waste so much time with duplicating effort and information.

    As for those who fear having these activated, they are a standard on most sites today, invisible to users and administrators. You control whether or not you want your WordPress or WordPress.com indexed through the Options panel.

    Sitemaps were recognized by Google, Yahoo, and MSN last time I was paying attention to these things. Not all search engines or site indexing bots recognize them, so while a sitemap improves indexing, keywords, links, and other traditional techniques still hold sway over SEO. This is just a tool that speeds up the work of the search engine bots.

  8. Chris Lang (3 comments.)
    Posted June 18, 2008 at 9:20 pm | Permalink

    Andy, what plugin do you suggest using for sitemaps? Someone told me your RSS feed is consumable as a site map (it is xml) but it just errors in Webmaster tools. I don’t have time to play around so I thought you would send us to the right one.

    • Andy Beard (1685 comments.)
      Posted June 18, 2008 at 9:40 pm | Permalink

      I generally don’t use one, though I am thinking about using one again.

      Generally everyone I know uses this one if anything
      http://wordpress.org/extend/plugins/google-sitemap-generator/

      • JunkieYard Dot Com (1 comments.)
        Posted July 11, 2008 at 6:12 pm | Permalink

        Yep, that’s the one I’m using for my sites. If you want a sitemap with your WP, that’s the one. It has so many options that you can configure, and it will ping all the search engines every time you publish a new post. They will come crawling to your sites. :D

        • Clement (10 comments.)
          Posted September 9, 2008 at 6:53 pm | Permalink

          Spot on! I use this sitemap generator on my self hosted WP blog and ever since I started using it I have noticed a huge improvement in how fast my pages get indexed in different search engines. The amount of traffic to my blog has also increased greatly.

  9. REBlogGirl (9 comments.)
    Posted June 19, 2008 at 1:44 pm | Permalink

    Very nice. Just another reason why I don’t use the big WP. We’ve seen this a lot on WP sites but never really thought about the consequences. We’ve played around a lot with our sitemap and are now using a script to generate it and base priority and change frequency on the average weekly traffic a post gets with special weight given to certain types of posts like listings and MLS RSS feed. This means our sitemaps are constantly in flux which seems to work very well. Google crawls the sites regularly and we saw massive SERP changes on the listing and RSS feed pages once this was implemented. I think that might be a valuable plugin for WP- a script that can help identify which posts should be given priority and generate new priority and change frequency data.

  10. Mark (1 comments.)
    Posted June 24, 2008 at 1:37 am | Permalink

    That’s the problem with sitemaps. If they aren’t arranged properly, they can seriously hurt your Google traffic.

  11. Posted June 29, 2008 at 11:45 pm | Permalink

    I use the Google XML Sitemap Generator for WP (I believe the same as Andy’s link) at http://www.arnebrachhold.de/projects/wordpress-plugins/google-xml-sitemaps-generator/

    It is very user-friendly and has tons of options to be defined by user, such as frequencies and crawl priorities for all content, posts and pages, and more.

  12. Cornel (1 comments.)
    Posted July 8, 2008 at 2:27 pm | Permalink

    Not just your traffic… they hurt your ranking as well, especially if they are incorrectly formatted. I have seen it in a couple of blogs I was running on WordPress MU. After the corrections to the sitemap generator I was using, one of the two blogs gained about 3 positions with no extra postings, the other just one.

  13. One Year Millionaire (1 comments.)
    Posted July 19, 2008 at 8:19 pm | Permalink

    So if you have a sitemap that isn’t properly set up it would hurt you more than no sitemap at all?

  14. Ash (1 comments.)
    Posted August 12, 2008 at 6:07 pm | Permalink

    I don’t believe you’re ever better off without an XML sitemap. If you’re having trouble, fix it. Don’t disable it.

    I’ve heard of some people having trouble with the Google XML Sitemaps plugin on WordPress for scheduled posts, but I’ve never had a problem on my blogs.

  15. james (1 comments.)
    Posted October 22, 2008 at 12:21 pm | Permalink

    Subject Line : Beat Long Poll Lines with Absentee Ballots from StateDemocracy.org

    Many state and local election officials are encouraging voters to use Absentee Ballots to avoid the long lines and delays expected at the polls on November 4th due to the record-breaking surge in newly registered voters.

    Voters in most states still have time to obtain an Absentee Ballot by simply downloading an official application form available through http://www.StateDemocracy.org, a completely FREE public service from the nonprofit StateDemocracy Foundation.

    Read More: http://us-2008-election.blogspot.com/2008/10/beat-long-poll-lines-with-absentee.html

  16. Detectives (1 comments.)
    Posted March 24, 2009 at 8:18 pm | Permalink

    I think keeping no sitemap is in fact much better than keeping a faulty one. I believe it affects the crawling of the site by the search engine spiders. This is another reason why we try to avoid using WordPress.

  17. Jonas Software (1 comments.)
    Posted March 26, 2009 at 11:28 pm | Permalink

    Wordpress.com is buggy as hell. Run your own installation!

  18. mtb (2 comments.)
    Posted March 27, 2009 at 4:45 am | Permalink

    This post and its comments are a little dated, but my findings with sitemaps have been nothing but positive. I occasionally check my dynamic sitemap and it is generating the correct links. I also occasionally resubmit to the search engines if a lot of links were generated. My opinion is they do us good.

4 Trackbacks

  1. [...] Comments About The Author Andy Beard – Niche Marketing – Blog search engine performance, WordPress and general niche and affiliate marketing tips. [...]

  2. By How Important XML Sitemap is ? « SEO Alert on June 20, 2008 at 3:19 pm

    [...] Related articles WordPress.com Bugged XML Sitemaps [...]

  3. [...] WordPress XML Sitemap [...]