Removing Duplicate Penalties
By Ray "Catfish" Comstock
Expert Author
Article Date: 2009-03-04
Upon further review of the new rel=canonical tag, which lets search engines know which URL you want indexed for any given page of your site, I can't find any reason why you wouldn't want to make it a standard part of every Web page on your site. This tag proactively handles a number of potential duplicate content issues.
*Special note about "duplicate content penalties": Some people feel that it is very important to be clear that a duplicate content "penalty" is not really a penalty. Google is not punishing your site for having a Web page that has the same content as another Web page. What is happening is that, if your page is not the original version of the content, it is highly likely that your page will be put into the supplemental index, which makes it less likely to rank well for competitive terms. So in effect, the end result is the same as incurring a penalty, even though there has been no "penalty." For some reason this game of semantics is very important to some members of the SEO community, so I wanted to make sure I was clear with everyone that I know the phrase "duplicate content penalty" is not entirely accurate, but it is how most people view the situation.
Having said that, here are some of the issues that the rel=canonical tag can help with:
- Capitalization: The search engines treat each unique URL string as a separate entity. So any variation in a URL, no matter how slight, creates a new URL in the eyes of the search engine. This includes differences in capitalization, although Google specifically has been much better at figuring this out on its own since the "Big Daddy" update. So for example, when an engine indexes these URLs:
http://www.businessol.com/seo-blog/ and http://www.businessol.com/Seo-Blog/ it sees two different URLs for the same content.
This happens often because Webmasters do not follow a common standard when they link to third-party websites. Some Webmasters and bloggers capitalize letters in URLs because of coding practices or habit.
In the old days, Google would not have been able to figure out by itself that these two URLs are the same page. One of the two URLs (most likely the one with less PageRank) would therefore have been put into the supplemental results, and the links that pointed to it would essentially have been lost: they would point to a page in the supplemental results that isn't going to rank for much, and they would not help the other URL that is listed in Google. Nowadays, Google is pretty good, although not perfect, at figuring this stuff out. The other engines are not very good at this kind of thing. So by including the rel=canonical tag on every page of the site, you make it easy for all the engines to consolidate URLs that have capitalization problems.
- Dynamic URL Strings: Whether it's tracking codes like www.domain.com?tracking-code or a CMS that generates multiple URLs for the same page, the issues with these pages are the same as the issues for capitalization. But now, instead of an elaborate 301 redirect strategy or costly adjustments to your backend system, this simple tag solves the problem.
- Other Canonical Issues: Some other issues that Google already handles fairly well but that can still cause problems include www versus non-www versions of the Web site (domain.com versus www.domain.com), session IDs, and linking by IP address. This tag, if correctly implemented, should fix all of these problems.
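To make the cases in the list above concrete, here is a minimal Python sketch of how a site might compute the single URL it wants indexed. It is only an illustration under stated assumptions: the function name canonical_url, the preferred_host parameter, and the TRACKING_PARAMS set are all hypothetical, and lowercasing the path is only safe if your server really does serve the same page regardless of case. It does not attempt to handle links made by IP address.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters that never change page content.
TRACKING_PARAMS = {"tracking-code", "sessionid"}


def canonical_url(url: str, preferred_host: str) -> str:
    """Collapse common URL variations into one preferred form."""
    parts = urlsplit(url)

    # www versus non-www: always answer with the preferred host.
    host = parts.netloc.lower()
    bare_host = preferred_host.removeprefix("www.")
    if host in (bare_host, "www." + bare_host):
        host = preferred_host

    # Capitalization: assumes the site treats paths as case-insensitive.
    path = parts.path.lower() or "/"

    # Dynamic URL strings: drop tracking codes and session IDs,
    # keep parameters that actually select content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]

    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


print(canonical_url("http://www.businessol.com/Seo-Blog/", "www.businessol.com"))
print(canonical_url("http://domain.com?tracking-code", "www.domain.com"))
```

With a helper like this, the capitalized and lowercase versions of the blog URL both resolve to the same string, which is the value you would then place in the tag.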
Given the number of potential issues that this tag can correct, it should be added to every page of your site. If your site is dynamic, this should be a pretty easy addition.
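On a dynamic site, adding the tag usually comes down to printing one extra line inside the head section of your page template. The Python sketch below shows one way that might look; the function name canonical_link_tag is hypothetical, and how the preferred URL is chosen for each page is left to your own system.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the link element that belongs inside the page's <head>."""
    # Escape the URL so characters such as & are valid inside an attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'


print(canonical_link_tag("http://www.businessol.com/seo-blog/"))
```

Every variation of a page (capitalized, tracked, or non-www) should print the same preferred URL in this tag, otherwise the engines have nothing to consolidate around.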
About the Author: Ray "Catfish" Comstock has spent over 9 years in the search engine optimization field and has successfully ranked numerous companies in the top-10 of all major search engines for ultra competitive markets such as travel, real estate, computer software, retail, broadband internet and various manufacturing sectors.