BUG: Google Now Counting Open Graph Tags For Canonicalization?

 

First of all, I want to offer my condolences to Bobbi Kristina Brown on the death of her mother.

When I heard the news I immediately went to Google to find out more information.

I saw this search result

At the time that link went directly to http://www.whitneyhouston.com/us/home

(it now redirects to http://www.whitneyhouston.com/us/remembering which may end up with similar issues)

I checked the robots.txt to see if there was an issue:

#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

User-agent: *
Crawl-delay: 10
# Directories
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /profiles/
Disallow: /scripts/
Disallow: /themes/
# Files
Disallow: /CHANGELOG.txt
Disallow: /cron.php
Disallow: /INSTALL.mysql.txt
Disallow: /INSTALL.pgsql.txt
Disallow: /install.php
Disallow: /INSTALL.txt
Disallow: /LICENSE.txt
Disallow: /MAINTAINERS.txt
Disallow: /update.php
Disallow: /UPGRADE.txt
Disallow: /xmlrpc.php
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /logout/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=logout/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
# Friendlist Links
Disallow: /friendlist/add/
Disallow: /us/friendlist/add/
Disallow: /ar/friendlist/add/
Disallow: /au/friendlist/add/
Disallow: /at/friendlist/add/
Disallow: /be/friendlist/add/
Disallow: /br/friendlist/add/
Disallow: /ca/friendlist/add/
Disallow: /co/friendlist/add/
Disallow: /fi/friendlist/add/
Disallow: /fr/friendlist/add/
Disallow: /de/friendlist/add/
Disallow: /gr/friendlist/add/
Disallow: /hk/friendlist/add/
Disallow: /ie/friendlist/add/
Disallow: /it/friendlist/add/
Disallow: /jp/friendlist/add/
Disallow: /my/friendlist/add/
Disallow: /nl/friendlist/add/
Disallow: /nz/friendlist/add/
Disallow: /ph/friendlist/add/
Disallow: /pl/friendlist/add/
Disallow: /pt/friendlist/add/
Disallow: /ru/friendlist/add/
Disallow: /sg/friendlist/add/
Disallow: /es/friendlist/add/
Disallow: /se/friendlist/add/
Disallow: /ch-de/friendlist/add/
Disallow: /tw/friendlist/add/
Disallow: /tr/friendlist/add/
Disallow: /uk/friendlist/add/
Disallow: /th/friendlist/add/

No issue that I can see there, unless there was something wrong with the redirect.
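This check can also be scripted. Here is a minimal sketch using Python's standard `urllib.robotparser`, fed an abridged copy of the rules above. The `ROBOTS_TXT` string and the `is_crawlable` helper are names I made up for illustration, not anything from the site.

```python
from urllib.robotparser import RobotFileParser

# Abridged copy of the robots.txt quoted above; the real file has more rules.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /includes/
Disallow: /admin/
Disallow: /search/
Disallow: /us/friendlist/add/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.whitneyhouston.com/us/home",
        "http://www.whitneyhouston.com/admin/",
    ):
        print(url, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

None of the Disallow rules is a prefix of /us/home, which is why the landing page passes while /admin/ does not.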

The redirect was a clean 301, and nothing I could see in the headers would cause anything to be blocked.

The head of the page was also quite clean:
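To look at a redirect hop by hop rather than letting a browser follow it silently, something like the following works. It is only a sketch: `fetch_head` and `trace_redirects` are hypothetical helpers built on the standard `http.client` module, and the fetcher is passed in as a parameter so the loop can be tried against canned responses without touching the network.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_head(url):
    """Send one HEAD request; return (status, Location header or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_head, max_hops=10):
    """Follow redirects by hand, returning one (status, url) pair per hop."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((status, url))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # the Location header may be relative
    return hops


if __name__ == "__main__":
    for status, hop in trace_redirects("http://www.whitneyhouston.com/"):
        print(status, hop)
```

A chain that ends in a single 301 followed by a 200 is what I mean by clean. Several hops, a 302, or a chain that hits `max_hops` would be worth a closer look.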

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<script type="text/javascript" src="http://adm.fwmrm.net/p/sonymusic_live/AdManager.js"></script>
<meta name="loginMethod" content="anonymous"/>
<meta name="siteSection" content="home"/>
<meta property="og:site_name" content="The Official Whitney Houston Site"/>
<meta property="og:title" content="Whitney Houston"/>
<meta property="og:type" content=""/>
<meta property="og:url" content="http://www.whitneyhouston.com"/>
<meta property="og:description" content="Check out Whitney Houston at http://www.whitneyhouston.com"/>
<link rel="shortcut icon" href="http://www.whitneyhouston.com/sites/whouston/files/favicon_2.ico" type="image/x-icon" />
<meta name="description" content="Official Whitney Houston website featuring Whitney Houston news, music, videos, album info, tour dates and more. " />

<meta name="keywords" content="Whitney Houston" />

  <title>Whitney Houston | The Official Whitney Houston Site</title>
  <link type="text/css" rel="stylesheet" media="all" href="/sites/all/modules/contrib/views/css/views.css?8" />
<link type="text/css" rel="stylesheet" media="all" href="http://www.whitneyhouston.com/sites/whouston/files/css/css_c1430b97100c9d627ca5594b4b0016ee.css" />
  <meta http-equiv="X-UA-Compatible" content="IE=8" />

It is not conclusive, but when I tried visiting the site as Googlebot, nothing unusual happened. The site could still be doing additional validation and treating me differently from the real crawler, but that is rare.

The only thing I can think of that might be an issue is this:
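For anyone who wants to repeat the test, one way to request a page while identifying as Googlebot is to send its published user agent string. This is a rough sketch with the standard `urllib.request` module. The helper names are mine, and it only imitates the user agent, so a site that verifies the crawler's IP address would still see through it.

```python
import urllib.request

# User agent string Google publishes for its classic desktop crawler.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)


def googlebot_request(url):
    """Build a request that identifies itself as Googlebot."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_as_googlebot(url, timeout=10):
    """Return (status, final URL after redirects, body bytes)."""
    request = googlebot_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.geturl(), response.read()


if __name__ == "__main__":
    status, final_url, body = fetch_as_googlebot(
        "http://www.whitneyhouston.com/us/home"
    )
    print(status, final_url, len(body))
```

Comparing the status, final URL and body against the same request made with a normal browser user agent shows whether the site serves the crawler something different.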

<meta property="og:url" content="http://www.whitneyhouston.com"/>

The Facebook linter/debugger actually throws an error if you have an og:url that redirects to a page that redirects back. Facebook shouldn’t do that, because any brand may switch landing pages and URLs but still want to retain votes on a single canonical URL. I hit this issue when the uQast sales funnel moved pages, and we have potentially lost hundreds of likes.

However, when I had these issues on that sales funnel, it didn’t affect Google, maybe because we had a canonical set to the actual landing page, which Google took as the preference.

I have never seen Google treat an Open Graph tag as rel canonical, but it is the only potential issue I can see.
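To make the comparison concrete, here is a small sketch that pulls both the og:url and any rel canonical out of a page head and checks them against the URL the page was actually served from. It uses Python's standard `html.parser`. The `HeadLinkParser` class and `canonical_hints` function are hypothetical names, and `HEAD` is an abridged copy of the markup quoted earlier.

```python
from html.parser import HTMLParser

# Abridged copy of the page head quoted earlier in this post.
HEAD = """
<meta property="og:title" content="Whitney Houston"/>
<meta property="og:url" content="http://www.whitneyhouston.com"/>
<link rel="shortcut icon" href="http://www.whitneyhouston.com/sites/whouston/files/favicon_2.ico" type="image/x-icon" />
<title>Whitney Houston | The Official Whitney Houston Site</title>
"""


class HeadLinkParser(HTMLParser):
    """Collect the og:url meta tag and the rel=canonical link, if present."""

    def __init__(self):
        super().__init__()
        self.og_url = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:url":
            self.og_url = attrs.get("content")
        elif tag == "link":
            rel_values = (attrs.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonical = attrs.get("href")


def canonical_hints(html, page_url):
    """Report the canonical-style hints in html relative to page_url."""
    parser = HeadLinkParser()
    parser.feed(html)
    og_matches = (
        parser.og_url is not None
        and parser.og_url.rstrip("/") == page_url.rstrip("/")
    )
    return {
        "og_url": parser.og_url,
        "canonical": parser.canonical,
        "og_url_matches_page": og_matches,
    }


if __name__ == "__main__":
    print(canonical_hints(HEAD, "http://www.whitneyhouston.com/us/home"))
```

For this head the page has no rel canonical at all, and its og:url points at the bare domain rather than /us/home. That is the mismatch I am describing. Whether Google acts on it is exactly the open question.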

There may be something funky happening with geolocation, but Google doesn’t seem to be picking that up as it should either.

You also won’t find Whitney’s other pages on Google easily, partially caused by this indexation error on the primary domain.

Whitney Houston on Myspace
Whitney Houston on Facebook

I think this might be a rare bug in canonicalization – some of the localization and redirects happening are not exactly ideal, but shouldn’t be preventing indexation in this way.

p.s. I am deliberately not trying to grab search traffic for this sad event – I just want fans to be able to find a place to pay their respects. My first attempt (on Google+) to get word out to Googlers has so far not had a response.

 
