A canonical URL allows you to tell search engines that certain similar URLs are actually one and the same. Sometimes you have products or content that is accessible under multiple URLs, or even on multiple websites. Using a canonical URL (an HTML link tag with attribute rel=canonical) these can exist without harming your rankings.
In February 2009 Google, Bing and Yahoo! introduced the canonical link element. Matt Cutts’ post is probably the easiest reading if you want to learn about its history. While the idea is simple, the specifics of how to use it turn out to be complex.
The rel=canonical element, often called the “canonical link”, is an HTML element that helps webmasters prevent duplicate content issues. It does this by specifying the “canonical”, or “preferred”, version of a web page. Using it well improves a site’s SEO.
The idea is simple: if you have several similar versions of the same content, you pick one “canonical” version and point the search engines at that. This solves the duplicate content problem where search engines don’t know which version of the content to show. This article takes you through the use cases and the anti-use cases.
Choosing a proper canonical URL for every set of similar URLs improves the SEO of your site. Because the search engine knows which version is canonical, it can count all the links towards all the different versions, as links to that single version. Basically, setting a canonical is similar to doing a 301 redirect, but without actually redirecting.
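To make the link-counting idea concrete, here is a toy model in Python. It is only a sketch: the URLs, the link counts and the consolidate_links helper are made up for illustration, and real search engines weigh many more signals than a simple sum.

```python
from collections import Counter


def consolidate_links(inbound_links, canonical_of):
    """Sum inbound link counts per canonical URL.

    inbound_links: dict mapping URL -> number of inbound links (made-up data).
    canonical_of: dict mapping a duplicate URL -> its declared canonical URL.
    URLs without an entry are treated as their own canonical.
    """
    totals = Counter()
    for url, count in inbound_links.items():
        totals[canonical_of.get(url, url)] += count
    return dict(totals)


# Hypothetical URLs and counts, for illustration only.
links = {
    "https://example.com/shop/widget/": 12,
    "https://example.com/sale/widget/": 7,
    "https://example.com/about/": 3,
}
canonicals = {
    "https://example.com/sale/widget/": "https://example.com/shop/widget/",
}

print(consolidate_links(links, canonicals))
```

In this toy data set, the 7 links to the duplicate URL are added to the 12 links of the canonical URL, while the unrelated page keeps its own count.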
When you have several choices for a product’s URL, canonicalization is the process of picking one. In many cases, it’ll be obvious: one URL will be better than others. In some cases, it might not be as obvious, but then it’s still rather easy: pick one! Not canonicalizing your URLs is always worse than canonicalizing your URLs.
Let’s assume you have two versions of the same page. Exactly, 100% the same content. They differ in that they’re in separate sections of your site and because of that the background color and the active menu item differ. That’s it. Both versions have been linked from other sites, so the content itself is clearly valuable. Which version should a search engine show? Nobody knows.
For example’s sake, these are their URLs:
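This is what rel=canonical was invented for. Especially in a lot of e-commerce systems, this (unfortunately) happens fairly often. A product has several different URLs depending on how you got there. You would apply rel=canonical as follows: pick one of the two pages as the canonical version, then add a rel=canonical link element, pointing at that canonical version, to the <head> section of the other page.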
That’s it. Nothing more, nothing less.
What this does is “merge” the two pages into one from a search engine’s perspective. It’s basically a “soft redirect”, without redirecting the user. Links to both URLs now count for the single canonical version of the URL.
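As a minimal sketch of what the non-canonical page would carry in its <head>, the Python snippet below renders the link element. The canonical_link_tag helper and the example.com URL are hypothetical and only serve as an illustration.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Return a <link rel="canonical"> element for an absolute URL.

    The URL is HTML-escaped so characters such as & cannot break the attribute.
    """
    return '<link rel="canonical" href="{}" />'.format(
        escape(canonical_url, quote=True)
    )


# Hypothetical canonical URL, for illustration only.
print(canonical_link_tag("https://example.com/shop/widget/"))
# prints: <link rel="canonical" href="https://example.com/shop/widget/" />
```

Using an absolute URL, including the protocol and domain, avoids any ambiguity about which page is meant.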
If you use Yoast SEO, you can change the canonical of several page types using the plugin. You only need to do this if you want to change the canonical to something different than the current page’s URL. Yoast SEO already renders the correct canonical URL for almost any page type in a WordPress install.
For posts, pages and custom post types, you can edit the canonical in the advanced tab of the Yoast SEO metabox.
For categories, tags and other taxonomy terms, you can change them in the Yoast SEO metabox too, in the same spot. If you have other advanced use cases, you can always use the wpseo_canonical filter to change the Yoast SEO output.
If you have the choice of doing a 301 redirect or setting a canonical, what should you do? The answer is simple: if there are no technical reasons not to do a redirect, you should always do a redirect. If you cannot redirect because that would break the user experience or be otherwise problematic: set a canonical URL.
In the example above, we make the non-canonical page link to the canonical version. But should a page set a rel=canonical for itself? This is a highly debated topic amongst SEOs. At Yoast we have a strong preference for having a canonical link element on every page and Google has confirmed that’s best. The reason is that most CMSes will allow URL parameters without changing the content. So all of these URLs would show the same content:
The issue: if you don’t have a self-referencing canonical on the page that points to the cleanest version of the URL, you risk being hit by this stuff. Even if you don’t do it yourself, someone else could do this to you and cause a duplicate content issue. So adding a self-referencing canonical to URLs across your site is a good “defensive” SEO move. Luckily for you, our Yoast SEO plugin does this for you.
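If you are not on WordPress, the idea behind a self-referencing canonical can be sketched in a few lines of Python. This is a simplified illustration, not how Yoast SEO does it: the self_canonical helper, the keep_params argument and the URLs are assumptions, and the sketch presumes you know which query parameters (if any) really change the content.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def self_canonical(url, keep_params=()):
    """Return the cleanest version of a URL to use as its own canonical.

    Drops the fragment and every query parameter that is not listed in
    keep_params (parameters that genuinely change the page content).
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical URLs, for illustration only.
print(self_canonical("https://example.com/post/?utm_source=news&isnt=it"))
print(self_canonical("https://example.com/list/?page=2&ref=x#top", keep_params=("page",)))
```

The first call strips everything after the path. The second keeps only the page parameter, on the assumption that it selects different content.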
You might have the same piece of content on several domains. For instance, SearchEngineJournal regularly republishes articles from Yoast.com (with explicit permission). Look at every one of those articles and you’ll see a rel=canonical link pointing right back at our original article. This means all the links pointing at their version of the article count towards the ranking of our canonical version. They get to use our content to please their audience, and we get a clear benefit from it too. Everybody wins.
There is a multitude of cases out there showing that a wrong rel=canonical implementation can lead to huge issues. I know of several sites that had the canonical on their homepage point to an article, and completely lost their home page from the search results. There are more things you shouldn’t do with rel=canonical. Let me list the most important ones:
Facebook and Twitter honor rel=canonical too. This might lead to weird situations. If you share a URL on Facebook that has a canonical pointing elsewhere, Facebook will share the details from the canonical URL. In fact, if you add a like button on a page that has a canonical pointing elsewhere, it will show the like count for the canonical URL, not for the current URL. Twitter works in the same way.
Google also supports a canonical link HTTP header. The header looks like this:
Link: <URL>; rel="canonical"
Canonical link HTTP headers can be very useful when canonicalizing files like PDFs, so it’s good to know that the option exists.
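As a rough sketch, this is how such a header could be built in Python before handing it to whatever web framework or server configuration you use. The canonical_link_header helper and the PDF URL are hypothetical. The canonical URL goes between angle brackets, followed by the rel parameter.

```python
def canonical_link_header(canonical_url):
    """Return a (name, value) pair for a canonical Link HTTP header.

    Rejects characters that would break the header syntax instead of
    trying to repair them.
    """
    if any(ch in canonical_url for ch in "<>\r\n "):
        raise ValueError("URL contains characters not allowed in a Link header")
    return ("Link", '<{}>; rel="canonical"'.format(canonical_url))


# Hypothetical URL of the canonical version of a PDF, for illustration only.
name, value = canonical_link_header("https://example.com/downloads/white-paper.pdf")
print("{}: {}".format(name, value))
```

You would send this header with the response for every duplicate copy of the file, pointing at the one URL you want indexed.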
While I won’t recommend this, you can definitely use rel=canonical very aggressively. Google honors it to an almost ridiculous extent, where you can canonicalize a very different piece of content to another piece of content. If Google catches you doing this, it will stop trusting your site’s canonicals and thus cause you more harm…
In our ultimate guide on hreflang, we talk about canonical. It’s very important that when you use hreflang, each language’s canonical points to itself. Make sure that you understand how to use canonical well when you’re implementing hreflang as otherwise you might kill your entire hreflang implementation.
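A quick way to check this rule is to compare each language version’s URL with the canonical it declares. The Python sketch below assumes you have already collected that data, for instance from a crawl. The non_self_canonicals helper and the URLs are made up for illustration.

```python
def non_self_canonicals(pages):
    """Return the page URLs whose declared canonical is not the page itself.

    pages: dict mapping each language version's URL -> the canonical URL
    found on that page.
    """
    return sorted(url for url, canonical in pages.items() if canonical != url)


# Hypothetical crawl result: the German page wrongly canonicalizes to English.
pages = {
    "https://example.com/en/pricing/": "https://example.com/en/pricing/",
    "https://example.com/de/pricing/": "https://example.com/en/pricing/",
}

print(non_self_canonicals(pages))
```

Any URL in the result is a language version whose canonical points somewhere else, which is exactly the situation to avoid when you use hreflang.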
Rel=canonical is a powerful tool in an SEO’s toolbox, but like any power tool, you should use it wisely as it’s easy to cut yourself. For larger sites, the process of canonicalization can be very important and lead to major SEO improvements.