BUG: Google Now Counting Open Graph Tags For Canonicalization?

First of all I want to offer my condolences to Bobbi Kristina Brown on the death of her mother.

When I heard the news I immediately went to Google to find out more information.

I saw this search result

At the time that link went directly to http://www.whitneyhouston.com/us/home

(it now redirects to http://www.whitneyhouston.com/us/remembering which may end up with similar issues)

I checked the robots.txt to see if there was an issue

#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

User-agent: *
Crawl-delay: 10
# Directories
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /profiles/
Disallow: /scripts/
Disallow: /themes/
# Files
Disallow: /CHANGELOG.txt
Disallow: /cron.php
Disallow: /INSTALL.mysql.txt
Disallow: /INSTALL.pgsql.txt
Disallow: /install.php
Disallow: /INSTALL.txt
Disallow: /LICENSE.txt
Disallow: /MAINTAINERS.txt
Disallow: /update.php
Disallow: /UPGRADE.txt
Disallow: /xmlrpc.php
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /logout/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=logout/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
# Friendlist Links
Disallow: /friendlist/add/
Disallow: /us/friendlist/add/
Disallow: /ar/friendlist/add/
Disallow: /au/friendlist/add/
Disallow: /at/friendlist/add/
Disallow: /be/friendlist/add/
Disallow: /br/friendlist/add/
Disallow: /ca/friendlist/add/
Disallow: /co/friendlist/add/
Disallow: /fi/friendlist/add/
Disallow: /fr/friendlist/add/
Disallow: /de/friendlist/add/
Disallow: /gr/friendlist/add/
Disallow: /hk/friendlist/add/
Disallow: /ie/friendlist/add/
Disallow: /it/friendlist/add/
Disallow: /jp/friendlist/add/
Disallow: /my/friendlist/add/
Disallow: /nl/friendlist/add/
Disallow: /nz/friendlist/add/
Disallow: /ph/friendlist/add/
Disallow: /pl/friendlist/add/
Disallow: /pt/friendlist/add/
Disallow: /ru/friendlist/add/
Disallow: /sg/friendlist/add/
Disallow: /es/friendlist/add/
Disallow: /se/friendlist/add/
Disallow: /ch-de/friendlist/add/
Disallow: /tw/friendlist/add/
Disallow: /tr/friendlist/add/
Disallow: /uk/friendlist/add/
Disallow: /th/friendlist/add/

No issue that I can see there, unless there was something wrong with the redirect
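As a rough sketch, the same check can be scripted with Python's standard `urllib.robotparser`. The excerpt below is only a handful of the rules quoted above, and the helper name `is_allowed` is my own invention for illustration. Note that this parser only does simple prefix matching, which is all this particular robots.txt needs.

```python
from urllib.robotparser import RobotFileParser

# A shortened excerpt of the robots.txt quoted above (not the full file).
ROBOTS_EXCERPT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /search/
Disallow: /us/friendlist/add/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from a string, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for candidate in (
        "http://www.whitneyhouston.com/us/home",
        "http://www.whitneyhouston.com/admin/settings",
    ):
        print(candidate, is_allowed(ROBOTS_EXCERPT, candidate))
```

Under these rules the home page is not blocked, which matches what I saw reading the file by eye.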

The redirect was a clean 301, nothing in the headers that I could see was causing something to be blocked.

The head of the page was also quite clean

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<script type="text/javascript" src="http://adm.fwmrm.net/p/sonymusic_live/AdManager.js"></script>
<meta name="loginMethod" content="anonymous"/>
<meta name="siteSection" content="home"/>
<meta property="og:site_name" content="The Official Whitney Houston Site"/>
<meta property="og:title" content="Whitney Houston"/>
<meta property="og:type" content=""/>
<meta property="og:url" content="http://www.whitneyhouston.com"/>
<meta property="og:description" content="Check out Whitney Houston at http://www.whitneyhouston.com"/>
<link rel="shortcut icon" href="http://www.whitneyhouston.com/sites/whouston/files/favicon_2.ico" type="image/x-icon" />
<meta name="description" content="Official Whitney Houston website featuring Whitney Houston news, music, videos, album info, tour dates and more. " />

<meta name="keywords" content="Whitney Houston" />

  <title>Whitney Houston | The Official Whitney Houston Site</title>
  <link type="text/css" rel="stylesheet" media="all" href="/sites/all/modules/contrib/views/css/views.css?8" />
<link type="text/css" rel="stylesheet" media="all" href="http://www.whitneyhouston.com/sites/whouston/files/css/css_c1430b97100c9d627ca5594b4b0016ee.css" />
  <meta http-equiv="X-UA-Compatible" content="IE=8" />

It is not conclusive, but when I tried visiting the site as Googlebot nothing funny happened. I know potentially they could still be doing additional validation and treating me differently, but that is rare.

The only thing which I can think of which might be an issue was this.

<meta property="og:url" content="http://www.whitneyhouston.com"/>

The Facebook linter/debugger actually throws an error if you have an og:url that redirects to a page that redirects back. They shouldn’t do that, because any brand may switch landing pages and URLs but still want to retain votes on a single canonical URL. I hit this issue with the uQast sales funnel moving pages, and potentially we have lost hundreds of likes.

However when I have had these issues on that sales funnel it didn’t affect Google. Maybe because we had a canonical set to the actual landing page that Google took as the preference.

I have never seen Google treat an open graph tag as rel canonical, but it is the only potential issue I can see.
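If you want to compare the two hints on your own pages, here is a minimal sketch using only Python's standard `html.parser`. It pulls out the og:url and any rel canonical from a chunk of HTML. The class and function names are hypothetical, and the sample markup is trimmed from the head quoted above.

```python
from html.parser import HTMLParser

# Trimmed from the page head quoted above; note there is no rel="canonical".
SAMPLE_HEAD = """
<meta property="og:title" content="Whitney Houston"/>
<meta property="og:type" content=""/>
<meta property="og:url" content="http://www.whitneyhouston.com"/>
<meta name="keywords" content="Whitney Houston" />
"""


class HeadHintParser(HTMLParser):
    """Collects the og:url meta tag and the rel=canonical link, if present."""

    def __init__(self):
        super().__init__()
        self.og_url = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("property") == "og:url":
            self.og_url = attr.get("content")
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")


def url_hints(html: str) -> dict:
    """Return both URL hints so they can be compared side by side."""
    parser = HeadHintParser()
    parser.feed(html)
    return {"og:url": parser.og_url, "canonical": parser.canonical}


if __name__ == "__main__":
    print(url_hints(SAMPLE_HEAD))
```

On this page the only URL hint is the og:url pointing at the bare domain, with no canonical to override it.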

There may be something funky happening with geolocation, but Google doesn’t seem to be picking that up as it should either.

You also won’t find Whitney’s other pages on Google easily, partially caused by this indexation error on the primary domain.

Whitney Houston on Myspace
Whitney Houston on Facebook

I think this might be a rare bug in canonicalization – some of the localization and redirects happening are not exactly ideal, but shouldn’t be preventing indexation in this way.

p.s. I am deliberately not trying to grab search traffic for this sad event – I just want fans to be able to find a place to pay their respects. My first attempt (on Google+) to get word out to Googlers has so far not had a response.

Posted in SEO Blog

SEO Sanitation From A Cleaner of Other People’s $#!+

I often liken what I occasionally do to being a cleaner of other people’s $#!+ and in some ways that is what I do in my role as Product Manager at uQast (we are just starting our chartered launch), tracking down bugs & designing features that solve the problems of our customers.
It is much better being the “Head Cleaner” – then you don’t necessarily have to get out a mop & bucket yourself… let alone a plunger.

Here is my list of SEO disasters, ordered from the easiest to the hardest to fix.

  1. Amateurs

    If you have a website that just needs some loving this is by far the easiest to work with. It is almost a clean slate that just needs a bit of polish.

  2. Web devs who think they are SEOs or that SEO is Bullshit

    There may be a need for a new platform, or platform improvements and the incumbent web-dev doesn’t agree. Often a lack of promotion. I have friends who offer web-dev as part of their SEO business and my honest advice to them is to focus on what they are best at.

  3. Startups full of Ex-Googlers

    Twitter and Facebook come to mind, though an earlier example was Vark.com, an answers service that just couldn’t be crawled naturally. The ex-Googlers are now back at Google.

  4. Lazy SEOs who just do directory submits, maybe buy obvious links etc

    Typical SEO cleanup job: remove the crap, build good stuff to compensate. Site quality issues (let’s group those under Panda) are of similar complexity.

  5. Aggressive Pro SEOs who didn’t warn the customer of inherent risk

    I am not saying aggressive SEO doesn’t work, but you have to be prepared to move house and rebuild rather than just clean up.

The difference between each tier is significant but maybe not logarithmic, and the difficulty is often of a different nature, e.g. with #3 the difficulty would be in trying to convince ex-Googlers they are wrong.

My friend Dave from the SEO Dojo has a great post today on diagnosing and removing Google penalties.

I’ve never considered myself a SEO Pro… when I have helped people (& often it is other SEOs) it has primarily been to satisfy my inner geek.

Oh and don’t forget to check out the new subscription offering in the uQast Chartered Launch

Posted in SEO Blog

ACTA? Dead Man Walking

Apparently 22 member states wasted a huge amount of resources signing the ACTA treaty.
If the current version had in some way been publicized previously they might have had a vague chance of it not causing a multinational revolt across Europe.

The internet population already know how to mobilize… they are SOPA trained. It is easy to slip something past novices who are timid.

The internet is now full of battle veterans confident in their ability to use the skills they have learned and the power of their voice to take on governments.

The guy from the European Parliament assigned to investigate ACTA quit in disgust.

Any MEP who votes for the act will be widely publicised, harming their chance of re-election and no doubt subjecting them to extreme financial scrutiny.

It is likely I will get a few comments accusing me of being a supporter of piracy.

Certainly the act in some ways will affect me as a content publisher on this blog.
It affects me as the product manager of a multimedia publishing platform, uQast.com, where our aim is to help rights owners who create content make money from their own content.
I worked in the software and computer games industry for 15 years before devoting myself to online endeavours.

So if anything I should be for ACTA, not against it.

Ultimately I haven’t read it… and that is the primary issue today… secrecy. The act has been signed which may affect the rights of my generation, and that of our children.

Save your bullets and wait until you can see the whites of their eyes. The complete document will leak even if not intentionally published, and then it will be time for serious comment... and action.

More on Techmeme of course

Posted in news

Dear @twittercomms – A Basic Search Query For Your Engineers

I am not sure why Twitter engineers are struggling with this, and potentially misleading tech journalists (and even some well respected SEOs) with the SEO issues hampering the indexation of Twitter.com

This is all pretty basic technical SEO

Here is a very simple cleaned up Google search query that will bring up some interesting results

https://www.google.com/search?q=site:twitter.com+inurl:andybeard&filter=0

It doesn’t exclude a subdomain such as www.twitter.com or fr.twitter.com or api.twitter.com

Twitter splits their domain authority between lots of different subdomains which have no business being indexed.

It also doesn’t distinguish between http and https. Typically those should also be canonicalized (think of Highlander: “There can be only one!”)

The &filter=0 tells Google not to ignore some of the URLs it might otherwise drop due to no content, low PageRank etc. It is especially useful for picking up URLs which are blocked by robots.txt

Like these

The first red arrow is what Google sees when it decides not to follow the funky javascript redirects that Twitter does. It is possible Google sees that a lot – it is like having a door slammed in your face.

The second is the bouncers at the door to the nightclub… you get to see all the cool stuff entering into Twitter’s archives, but it won’t let Google through as they aren’t willing to tip the bouncers enough money, or haven’t got the right friends.

That barrier prevents Google crawling deeper into your content, so whilst if they were very observant they may have seen a piece of content once, it may eventually drop out of the index unless other sites in the Twitter ecosystem maintain links directly to that content that Google can follow.

So for instance, if sites like Topsy & Tweetmeme are crawled by Google and link directly to a tweet, it is possible for Google to find that content… but that is far from perfect.

What controls Google in this way is Twitter’s robots.txt file https://twitter.com/robots.txt

The line of that file in the Google section that is causing a lot of the issues is this one
Disallow: /*?

Effectively, Google is not allowed to look at any URL on the whole of Twitter that contains a “?” (in other words, any URL with query parameters).
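To show why that one line is so broad, here is a small sketch that translates a wildcard rule into a regular expression. It assumes Google's documented extensions to robots.txt, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The function names are my own, and this is an illustration rather than how Googlebot is actually implemented.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each '*' match any characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern: str, path_and_query: str) -> bool:
    """True if a Disallow rule with this pattern matches the path plus query string."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


if __name__ == "__main__":
    for path in ("/andybeard", "/andybeard?page=2"):
        print(path, is_blocked("/*?", path))
```

The plain profile path is not matched by the rule, but the moment a URL carries a query string it is.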

Here are 2 more queries for you

https://www.google.com/search?q=site:andybeard.tweetglide.com/blog&filter=0&start=991

https://www.google.com/search?q=site%3Atwitter.com%2FAndyBeard&filter=0&start=991

Those searches start at result 991… Google will only list up to 1000 items for a search, so that kind of query shows the end of the search results. On huge sites you might have to refine things down to a subset of pages where possible, but in this case my Twitter account only has 5100 tweets and Twitter should easily be able to get all of those indexed. I am restricting the search to a folder, /andybeard
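If you want to run the same kind of query against your own account, a minimal sketch for building these URLs with Python's standard `urllib.parse` is below. The function name and its arguments are hypothetical. The `q`, `filter` and `start` parameters are the ones used in the queries above.

```python
from urllib.parse import urlencode


def site_query_url(site, inurl=None, unfiltered=True, start=None):
    """Build a Google site: query URL like the ones used in this post."""
    query = f"site:{site}"
    if inurl:
        query += f" inurl:{inurl}"
    params = {"q": query}
    if unfiltered:
        params["filter"] = "0"  # ask Google not to drop near-duplicate results
    if start is not None:
        params["start"] = str(start)  # jump towards the end of the listing
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(site_query_url("twitter.com/AndyBeard", start=991))
    print(site_query_url("twitter.com/AndyBeard", unfiltered=False, start=991))
```

Passing `unfiltered=False` simply leaves the `filter` parameter off, so you can compare the filtered and unfiltered counts for the same folder.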

I haven’t linked directly to either my twitter account or my archived copy on Tweetglide for some time so here they both get a link with a screenshot of current indexation over 2 years… not so many tweets over that time due to a long hiatus.

http://andybeard.tweetglide.com/blog

http://twitter.com/andybeard

341 vs 321

The winner here seems to be Tweetglide, and it seems fairly close until you examine all the URLs for Twitter that Google crawls needlessly as duplicate content on different subdomains, and remove all the junk pages (and some good stuff) that are blocked by robots.txt

Such as this

You also have to understand that in the last 2 years due to my hiatus I have only created 280 tweets as archived by Tweetglide (there may be a few early ones missing), and the additional tweets in that deep search result are the archived Tweets of the people I have conversations with.

If we removed that filter parameter things are drastically different

https://www.google.com/search?q=site:andybeard.tweetglide.com/blog&start=991

https://www.google.com/search?q=site%3Atwitter.com%2FAndyBeard&start=991

341 vs 253

That is when Google filter out lots of the duplicate junk from Twitter, and none from Tweetglide.
That filter removal isn’t perfect.. there are still some duplicates – if Twitter retains half the indexation of Tweetglide I would be amazed.

Is there a crawl limit?

It vastly depends on juice.. my Twitter profile at one time had enough juice that if Google had been allowed to crawl, they probably would have picked up 25-50% of the 5100 tweets, but Twitter doesn’t allow you to paginate that far into its archive (even if it wasn’t blocked), and even the API is still limited and can only pick up around 3000 historical tweets.

My good online friend Vlad Zablotskyy has more tweets than me archived on Tweetglide.

https://www.google.com/search?q=site:vladzablotskyy.tweetglide.com/blog&filter=0&start=990

He actually has fewer pages indexed than me, possibly because of juice, so let’s give him some, and see if we can crack the 500 barrier.

This has been a small introduction to Twitter’s SEO woes to demonstrate hopefully to laymen that all is not well with Twitter, and any claims that they can be indexed normally are false. Any competent SEO could have found all of these issues with the site, and fixing them would reduce the load on Twitter’s servers caused by Google, and maybe allow Google to index more content.
I have avoided additional complications with rel=”nofollow”, potential cloaking issues with their implementation of #!hashbang URLs, and funny javascript redirects, and haven’t touched on some additional nuances with the way they feed juice to list pages.

Disclosure: When Tweetglide launched I offered some SEO tips (pro bono) over a few days to the owner and one of his engineers, and had 100% “buy in” to follow my recommendations for internal linking structure on the site. I would possibly change the structure of the pagination links at the bottom but otherwise I think the site from an SEO perspective is doing great… damn… no difference between with and without filter=0 actually amazes me.

Posted in SEO Blog, web 2.0

Really Evil – The “Don’t Be Evil” Bookmarklet

It is not as benign as you might think.

[Video: The Evil of The Don’t Be Evil Tool]

Here is a link to the original 720p mp4

Some of the points raised in the video

  1. We need to avoid a cartel of top sites with limited access, especially with Google and Facebook currently buying ads from each other in the free market.
  2. Identity is important and both Twitter & Facebook fail to provide a method of bi-directional ownership – the “Don’t Be Evil” bookmarklet fails by linking Mark Zuckerberg’s Google+ to a Quora topic page about him… not his actual profile (if he has one) – it is possible for Google to do fairly well with some cases, e.g. if I have authorship with my blog, and I have bi-directional links between my blog and Twitter, that might count
  3. Google+ profile owners should get a choice of which links might be used
  4. If you want to give Google hints, you really should fill out your profile better

A few of the links from the video

The evil don’t be evil bookmarklet
John Battelle’s post
Danny’s post

Experiment for yourself, but also understand that Google’s results are personalized and regional. E.g. I fully expect to have competition in the UK SERPs for the first twitter listing for Andy Beard, as there are at least 2 Andy Beards I know of there, and at least 2 others in the US.
Identity is a big part of Web 3.0 and both Facebook and Twitter currently fail in the ability to accurately verify a chain for most people.

The intent of the bookmarklet is to focus on the “user”

I want users to be directed to the profiles and services I vouch for
I want users to go to links that have been verified with a robust identity as being mine

Full coverage on Techmeme

Posted in Google, news, SEO Blog

Tech Blogging Triple Rainbow

I just had to share this achievement by Mike Masnick as he might not see it himself.

The Techmeme page updates extremely frequently, pulling in stories algorithmically with some additional human curation, but I have never seen one writer with a lead story and 2 additional reference posts without any additional “noise”, unless you count the larger child branch below. He has one post in that child branch as well.

I am not going to link to the posts… that would most likely spoil the “triple rainbow”.

Even more remarkable:- none of the articles were “fluff”, and Mike wrote 14 posts on 12th January.

Kudos

p.s. I am sure it has happened before, either for Mike or other highly productive writers, especially with the way publishing platforms can publish content at a specified future time, so it could even be engineered. But someone has to actually see it, and note it as something special, as it would still take considerable effort.

Just like a double rainbow

Posted in blogging tips

Updated: Facebook & Twitter – Lucky To Be In Google At All

Facebook & Twitter have some of the worst landing pages on the web.

At least if you look at it from the perspective of a search engine, which should assume that the visitor isn’t a member of the site it is referencing in the results.

It should also be understood that both Facebook & Twitter are bursting at the seams with former Google engineers & execs – they can’t claim they were unaware of what Google is looking for from content owners on the web, webmaster guidelines etc.

Twitter

You can’t look at the Google cache and see exactly what Google sees, because they do some sneaky redirects which are very akin to cloaking.

I have written about this before.

Video Exclusive: Has Google Given Twitter a Cloaking Penalty?

This is what Google sees based upon the preview

The little piece of text at the top of the page is what amounts to your profile… you can’t count the background image if any because it can’t be read by Googlebot unless it works really hard using OCR, and certainly can’t be read by people with disabilities.
The links within the content of the page are mostly nofollow, and the links in the sidebar get blocked by robots.txt.
The link at the bottom of the page to access more content… which may be of interest to search is also blocked by robots.txt.

I am not the only one who has spent considerable time trying to get Twitter fixed. A great example is this post by Vanessa on Search Engine Land.
How Twitter’s Technical Infrastructure Issues Are Impacting Google Search Results

Facebook

Facebook is worse

There is nothing there of any real value… it isn’t the timeline a logged in user might see.

First Click Free

If you want to have some kind of membership wall for users, then Google have special arrangements where you are required to show content for the first click.

Cloaking

Google over the years have published lots of content about what they think of cloaking.

I can still think of a few cases where some kind of cloaking would be justified. As an example on uQast we serve RTMP video with flash and use javascript “cloaking” to provide mp4 for iPhone. We could even serve that video to Googlebot’s mobile crawler without breaking Google guidelines as “cloaking” to serve content to specific browsers is allowed. But we can’t serve Googlebot which crawls for the main search index something it understands, as the Google guidelines require you treat Google as a normal desktop user browsing from California in the USA.
So Googlebot is served flash based RTMP within the webmaster guidelines rather than something it might like to see which we would be quite happy to give it.

That doesn’t prevent Google sometimes (though rarely) indexing the mobile video by figuring out the javascript, but it would be so much easier to give them something they understand.

Google Isn’t Playing Fair

One area that Google isn’t necessarily playing fair is that I don’t seem to be able to view Google+ profile pages in their own cache, and they don’t give a preview of the page that Googlebot sees.

This is my Google+ Profile

You can normally search in Google for cache:https://plus.google.com/102279602913916787678/posts or any url to get a cached version of what the crawler sees.
It is possible for every site to tell Google and other search engines not to store a cached page, so Google are well within their rights not to do so… but it prevents comparisons.

Compare
cache:andybeard.eu – brings up a cached result
cache:https://plus.google.com/102279602913916787678/posts – does not bring up a cached result, just a 404 error

FTC Complaint over Search Plus Your World

The blogosphere loves a good witch hunt, but I can’t see that Google is treating Twitter or Facebook unfairly. Eric Schmidt was quite right about some of the nofollows, but there are bigger technical restrictions in place on crawling.

I actually quite like a Google profile as a default profile and identity on the web, but Google need to live up to the promise of Salmon and make it a viable endpoint for all activity, or as an alternative use it for identity, and allow me to define my own default profile.. which if I choose might be Twitter or Facebook.
I can also understand why you wouldn’t undertake the complex engineering to make such flexibility possible for your first iteration, especially with partners who are unwilling to do something similar themselves.

Just ask Twitter how many content partners they now support on the new Twitter for embeds. (I wrote them a letter a year ago and never received a response)

This post ignores what a logged in and fully javascript supporting human might experience, but in many ways Google’s profiles whilst now having a social element for years have generously linked out to any other online destination of your choosing, and provided the necessary markup to claim them as being part of your personal social graph.

Update – Google Profiles Now Cached

Michael VanDeMar left a comment showing a way to get the cached page to show by including the https protocol at the beginning of the url to query.

However when I posted I had tried lots of different variations all resulting in a 404 error.

This unmodified link was previously bringing up a 404 error
cache:https://plus.google.com/102279602913916787678/posts

It now returns what appears to be a blank page – as Michael points out if you switch off the CSS in your browser you can see the complete cached landing page.

Andy Beard Google+ Profile
Click to view full size without CSS

This appears to be a recent change, though they still need to fix the canonical – the canonical changes as you navigate between tabs and between the first 2 urls on this list there is effectively a redirect loop with /posts claiming / is the canonical, but humans are redirected to /posts

https://plus.google.com/102279602913916787678/

https://plus.google.com/102279602913916787678/posts

https://plus.google.com/102279602913916787678/about

All the different URLs show all of the same content, so should set whichever canonical a human is redirected to which currently is /posts
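The loop is easier to see when you model it. This is just a sketch: the two dictionaries record only what I observed above (humans are redirected from / to /posts, and /posts declares / as its canonical), and the function name `follow_hints` is hypothetical.

```python
ROOT = "https://plus.google.com/102279602913916787678/"
POSTS = ROOT + "posts"

# Only the hints described in this post; other tabs are left out on purpose.
REDIRECTS = {ROOT: POSTS}    # where a human visitor ends up
CANONICALS = {POSTS: ROOT}   # what the page claims as its canonical


def follow_hints(start, redirects, canonicals, max_hops=10):
    """Follow redirect and canonical hints; return (path taken, loop detected)."""
    seen = [start]
    url = start
    for _ in range(max_hops):
        nxt = redirects.get(url) or canonicals.get(url)
        if nxt is None or nxt == url:
            return seen, False          # settled on a single URL
        if nxt in seen:
            return seen + [nxt], True   # came back to a URL we already visited
        seen.append(nxt)
        url = nxt
    return seen, False


if __name__ == "__main__":
    path, loop = follow_hints(ROOT, REDIRECTS, CANONICALS)
    print(" -> ".join(path), "| loop:", loop)
```

Pointing the canonical of every tab at /posts would make the chain settle on one URL instead of circling.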

Not Total Fix

It seems some other pages are still giving 404 errors – maybe due to all the funky redirects going in circles with the canonical on occasion (this query is with HTTPS)

If you have difficulty understanding the concept of canonical, it is just like Highlander… “There should be only one” page with the same content in Google’s index, especially on the same domain.

Posted in Google, SEO Blog

Google+ Now Gets A Full Banana – Canonical Support

Back in June when Google +1 was first introduced (the voting on links, not the full Google+ experience) I wrote about an issue I had with the new service.

Google +1 & The Problem With Canonicalization Of Votes

I’ve ripped the relevant section out of the original post so we can now look at the current state of play.

You will see that the value of both Google +1 buttons is now the same, as Google now treat them as the same page, even though one of them is a redirect.

This seems to work for lots of types of redirects, parameters for tracking etc, which is what made it important to support in the first place.
So now you have every reason to add specific tracking parameters to the URLs that get added to Google+ so that you can track the way they are shared and the traffic that generates.
You could even use a URL shortener like bit.ly to make your URL with parameters a little shorter.
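As a small sketch of what adding tracking parameters might look like, the helper below appends them to a URL with Python's standard `urllib.parse`. The `utm_source` and `utm_medium` names are the usual Google Analytics campaign parameters, but the values and the function name `add_tracking` are just assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_tracking(url, **params):
    """Append tracking parameters to a URL, keeping any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(add_tracking(
        "http://welcome.uqast.com/intro/",
        utm_source="googleplus",   # hypothetical campaign values
        utm_medium="social",
    ))
```

The tagged URL is the one you would share, and with the canonical support described above the +1 votes should still be counted against the clean page.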

What I haven’t yet tested are how subsequent redirects of the canonical page get handled, and what if any safeguards have been implemented to reduce abuse. I also haven’t tested how many redirects will be followed.

This was still an issue back at the beginning of November when I mentioned it on the G+ Developers group. No… I didn’t submit a bug ticket as suggested. I don’t consider myself enough of a dev to create a ticket with sufficient clarity on someone else’s product. I have to assume G+ product managers and evangelists monitor feedback for features to some extent.

This may have been announced as a fix sometime, but a search for Google+ canonicalization brings up such a messy SERP I gave up digging.

I would love to know who to talk to about getting uQast videos supported in Google products… everything currently strips out the iframes and I am assuming the logic is being shared between platforms (Google Reader, Google+, Google Currents etc.)

Google +1 Only Gets Half A Banana

As Google +1 has only just been launched, the uQast landing page hasn’t received 100s or 1000s of bookmarks but it is a good example of the current problem with Google’s implementation of the +1 button.

This is the same URL we were using in the above example, my uQast affiliate link, but it could be any tracking link, or just using Google Analytics tracking parameters.

http://welcome.uqast.com/page13312


This is how that button should be encoded

<script type="text/javascript" src="http://apis.google.com/js/plusone.js"></script>
<g:plusone size="tall" href="http://welcome.uqast.com/page13312"></g:plusone>

With this result… I gave it a plus one to test this earlier as I am considering adding Google +1 to our landing pages, not just for our launch signup, but also throughout uQast and within our embeddable players.

If Google had implemented +1 correctly, then the count for a URL that points directly to the page would be the same.

<script type="text/javascript" src="http://apis.google.com/js/plusone.js"></script>
<g:plusone size="tall" href="http://welcome.uqast.com/intro/"></g:plusone>

At time of writing the affiliate link shows 1 and the “clean” link shows 0 – I am sure that will change over time

p.s. I know Google is having a bit of a bad day about their new Search + Your World introduction. I personally love it

Posted in Google

Open Video To Google – Please Reinstate Chrome

Dear Google

Your recent decision to invoke a manual penalty on the download page for Google Chrome will have lasting ramifications for the whole of online marketing, whether display advertising, affiliate marketing, and other performance marketing such as CPA models, making many such business models unworkable.

Policing every piece of content produced by marketing partners (affiliates etc) on the off chance that they inadvertently linked directly to the traffic or buzz beneficiary without using a nofollow or otherwise blocking the direct link is commercially untenable.

In the following video I have outlined what has led to this unreasonable decision being made, and elaborated a little on some of the commercial implications not just for competitors in the online advertising space, but even for Google services such as Google Publisher Network, Google Affiliate Network & Doubleclick.

[Video: Open Video To Google - Please Reinstate Chrome]

Here is a 720p 1280×720 mp4 of the above video (looking forward to supporting this in a player real soon now)

Sincerely

Andy Beard

Here is a specific example

This is an Amazon widget

Wow it is promoting a really cool Google phone!

Here is another iframe creative

Here is a text only link

T-Mobile G2 with Google Android Phone (T-Mobile)

So far I haven’t broken Google’s new interpretation of the webmaster guidelines

I love Amazon

Oops… sorry Amazon that wasn’t an affiliate link, but an editorial link… Google will now feel that they have to remove the Amazon home page from the search engine results and Amazon won’t sell 20M Kindle Fires this year… Just 19.8M – or maybe 21M if they replace the Amazon home page with the Kindle Fire product page. (yes I realise they are very similar)

Whilst purists might argue that this wasn’t a video CPA advert but an affiliate link, a huge number of the sites that Google filtered this year as poor quality thin affiliates were using Amazon and other affiliate networks for monetization. The purpose quite often for the content was to drive traffic to the ads in small quantities.
At scale the revenue from 1000s of websites earning just a few dollars a month above the hosting and domain costs add up.

Another comparison is Google’s own Adsense program and the vast numbers of poor quality sites that have arisen because of it. The good often (in search visibility) outweigh the junk MFA (made for adsense) sites, but it really is a chicken & egg situation. The webmasters target specific topics and even optimize content not just for SEO, but to pull up the highest paying and possibly even specific advertising creatives for products, maybe even video content, and they get paid for clicks on that content.
If I write a blog about Android phones and included an Adsense advert at the bottom of each post, allowed video and display ads, the situation wouldn’t be vastly different to some junk content followed by a video embed of a Google commercial I was being paid for on a CPA basis.

People in the past made complete websites dedicated to the promotion of Google Pack, Google’s Adsense program, etc., and even offered incentives such as training in online marketing, or included the Adsense registration links as part of the course material… of course without disclosure, as that wasn’t allowed.

Google… Please Reinstate Chrome

Here are the links referenced in the video

Matt’s post on Google+
Aaron’s post Sponsored By Google that started this huge mess.
Danny’s post on all the thin content
Unruly Media is clearly CPA (grats on $25M funding guys)
Wikipedia on CPA (will Wikipedia be the only independent content site soon?)
Wikipedia on Online Marketing advertising models
Andrew Girdwood proving Google has used this form of CPA before
Danny with Google’s statement throwing their agency and Unruly under a bus
Google’s statement and effect (from Danny) – Statement saying this was a violation of their guidelines, possibly from someone who hasn’t read them recently.
My post on Google pack and word of mouth marketing
Google’s policy statement for their Google Pack CPA campaign – no mention of not giving editorial links etc.
The CPA video embed (the iframe contents) – I am not going to drag an individual blogger, who may have given a quite nice editorial link to Google Chrome, over the coals
The Unruly Media terms of service which have now been enhanced – the nofollow statement is a new bullet point – it shouldn’t be needed as payment is not for the content of the blog post, or links, but based on CPA actions with the video.
Webmaster help forums on Affiliate links Google repeatedly avoids answering questions regarding the use of nofollow with affiliate links and other forms of display advertising.
How to report paid links and selling links that pass pagerank
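
For anyone who wants to see how a page currently treats its outbound links, here is a minimal sketch in Python that sorts the links in a chunk of HTML into those carrying rel="nofollow" and those that don't. The class name `RelAuditor`, the function `audit_links` and the example.com URLs are made up for illustration; it only reads the markup and says nothing about how Google itself evaluates a link.

```python
from html.parser import HTMLParser


class RelAuditor(HTMLParser):
    """Sort <a href> links by whether their rel attribute includes nofollow."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list, compared case-insensitively
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html):
    """Return (followed, nofollowed) lists of hrefs found in the HTML."""
    auditor = RelAuditor()
    auditor.feed(html)
    return auditor.followed, auditor.nofollowed


# Hypothetical markup: one editorial link, one affiliate link with nofollow
sample = (
    '<a href="http://example.com/editorial">I love this shop</a>'
    '<a rel="external NoFollow" href="http://example.com/?tag=aff">Buy now</a>'
)
followed, nofollowed = audit_links(sample)
print("followed:", followed)
print("nofollowed:", nofollowed)
```

Run against a saved copy of a blog post, this gives a quick list of which affiliate or advertising links would need attention under a strict reading of the guidelines.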

Disclosure: I work for an online video & affiliate marketing startup called uQast, but I am posting this on my personal blog, and the words and opinions expressed here are my own, posted of my own volition, and not those of my employer (does that remind anyone of Matt’s disclaimer?) – I have been involved in affiliate marketing for 7 years and the issues discussed here have been a topic of this blog since I started publishing it in 2005.

Small update: just added a download link for the MP4 version in HD 720p

Posted in Google, SEO Blog, Video SEO & Marketing

Google Buzz SEO – Evidence Buzz Now Used For Indexing

Since 22nd February 2010 I have been testing whether links posted in Google Buzz pass PageRank, or at the very least help with indexation of content.

I didn’t write a blog post about it, but I did write something exclusive for my readers on Google Buzz (those few of you out there) as a test case, posting a link from that page to one of the most poorly indexed sites known to the internet, which happens to be owned by Google.

Paydirt

Here is the link to the SEO test page I kept on this domain, linked from my top navigation.

This is the post on Google Buzz that has a link to a single page on Vark.com where I answered someone’s question.
If you examined the code on Buzz at the time it was posted, it was all funky JavaScript. Now the link to Vark is clearly visible within the HTML.
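
A rough way to repeat that check on any saved page source is sketched below in Python. It looks for plain `<a href>` links to a given host in the raw HTML, which is what a crawler that doesn't run scripts would see. The names `LinkCollector` and `links_to_host` and the sample Vark URL are hypothetical, and this is only an approximation of what a search engine actually parses.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in raw HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_to_host(html, host):
    """Return hrefs whose hostname is `host` or one of its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    matches = []
    for href in collector.hrefs:
        hostname = urlparse(href).hostname or ""
        if hostname == host or hostname.endswith("." + host):
            matches.append(href)
    return matches


# Hypothetical page sources: a plain HTML link vs. a link built by script
static_page = '<p><a href="http://vark.com/t/abc123">my answer</a></p>'
scripted_page = '<script>render("http://vark.com/t/abc123")</script>'

print(links_to_host(static_page, "vark.com"))
print(links_to_host(scripted_page, "vark.com"))
```

The first page exposes the link in its markup, while the second only mentions the URL inside a script, so nothing is collected from it.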

Here is the page indexed in Google

google-buzz-seo

Profiles Updated

Google updated their profiles around 3rd March 2011 so this page may have been indexed a while back and I missed a Google Alert. It does however add some credence to the notion that someone giving you a +1 may at the very minimum give you some kind of indexation benefit. Whether that will pass anchor text or other signals remains to be seen from further tests.

Any claims from before March 3rd 2011 of a direct ranking benefit or even indexation benefit from Google Buzz should be questioned.

Posted in Google, SEO Blog, web 2.0