![What is canonicalization? - SpiderSavvy](https://spidersavvy.com/wp-content/uploads/2011/08/canonicalization1.jpg)
Canonicalization: What is it?
Canonicalization may sound scary, but it doesn’t have to be if your website is set up correctly. The good news is that it’s a relatively easy fix once you understand what you’re fixing. :)
The main idea of this post is that there should be only one URL for each page on your site. You may not realize it, but even though both URLs below take you to the home page of SavvySite.com, search engines may treat them as two completely different pages if the canonical formatting is not set up correctly.
https://savvysite.com/
https://www.savvysite.com/
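To make the idea of one URL per page concrete, here is a minimal Python sketch that maps both host variants to a single form. It assumes the www version is the one you picked as canonical; `CANONICAL_HOST`, `HOST_ALIASES`, and `canonical_url` are illustrative names, not part of any tool mentioned in this post.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site owner chose the www host as the canonical version.
CANONICAL_HOST = "www.savvysite.com"
HOST_ALIASES = {"savvysite.com", "www.savvysite.com"}


def canonical_url(url: str) -> str:
    """Map any known host variant of a URL to the single canonical host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in HOST_ALIASES:
        return url  # Not our site, so leave it untouched.
    path = parts.path or "/"
    # Rebuild the URL with one scheme and one host; drop any fragment.
    return urlunsplit(("https", CANONICAL_HOST, path, parts.query, ""))


if __name__ == "__main__":
    for variant in ("https://savvysite.com/", "https://www.savvysite.com/"):
        print(variant, "->", canonical_url(variant))
```

Both variants come out as the same string, which is the behavior you want search engines to see from your server.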
Why is Improper Canonicalization Bad for Your Website?
It may not seem like a big deal, but not having your domain set up correctly can dilute your site’s equity and lead search engines to believe that your site has duplicate content—from your own site!
Dilution of Link Authority
First, you lose link authority. If visitor one lands on ‘www.savvysite.com’ and links to that page, visitor two lands on ‘https://savvysite.com’ and links to that URL, and visitor three lands on ‘https://www.savvysite.com/index.html’ and links to that page, Googlebot sees three links to three different URLs and applies one ‘vote’ to each one.
These links could have sent three authoritative signals to Googlebot for your site’s home page. Instead, they’re split into three weaker individual votes for three pages. It’s link love mayhem.
If SavvySite.com were set up so that its home page ‘lived’ at one unique URL, ‘https://savvysite.com’, all three visitors would have linked to that page, and Googlebot would instead apply all three votes to a single page.
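The vote-splitting arithmetic can be sketched in a few lines of Python. This is only an illustration of the counting, not a model of how Googlebot actually scores links; `home_page_key` is a hypothetical helper that collapses the host and ‘index.html’ variants into one key.

```python
from collections import Counter
from urllib.parse import urlsplit

# The three inbound links from the example above (scheme added for parsing).
inbound_links = [
    "https://www.savvysite.com",
    "https://savvysite.com",
    "https://www.savvysite.com/index.html",
]


def home_page_key(url: str) -> str:
    """Collapse www/non-www and index.html variants into one key."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    return host + (path or "/")


# Without canonicalization: one vote each for three separate URLs.
split_votes = Counter(inbound_links)
# With canonicalization: all three votes land on the same page.
combined_votes = Counter(home_page_key(url) for url in inbound_links)

if __name__ == "__main__":
    print("split:", dict(split_votes))
    print("combined:", dict(combined_votes))
```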
Don’t Make it Hard for Search Engines to Crawl Your Website.
Search engines frequently crawl your site, looking for new content. However, they allocate limited resources to each crawl. No one outside the search engines knows precisely how, but it’s safe to say Googlebot won’t just wander around your site until it has found every page. At some point, it gives up and leaves. Every request it spends fetching another URL for a page it has already seen is time it could have spent crawling other unique pages instead.
So, fewer unique pages of your site end up in the search index, and you have fewer chances to rank.
Duplicate Content
Having one page available via multiple URLs can also mislead search engines by making them believe you are stealing someone else’s content when it is your own!
Let’s say that you are a real estate agent and just posted an article on ways a seller can improve their chances of selling their home. If your website’s canonical formatting is not set up correctly, the search engines will see the URLs below as two different pages with the same content!
https://savvysite.com/blog/10-ways-to-help-sell-your-house
https://www.savvysite.com/blog/10-ways-to-help-sell-your-house
You worked hard for your keyword-rich, relevant content. The last thing you need is for a search engine to think it is duplicate content and penalize you.
Fix That Canonicalization!
You can avoid the heartbreak of lousy canonicalization, or at least minimize it, by doing a few simple things:
- Use 301 Redirection: Ensure your home page is only found at one URL.
- Link Consistently to Your Home Page: Use a single URL for your home page. Don’t mix instances of ‘https://savvysite.com/index.html’ with ‘https://savvysite.com’. If you aren’t doing this properly right now, a quick change may noticeably improve your SEO.
- Don’t Use Tracking IDs in Internal Site Navigation: Many sites add stuff like ‘?source=blog’ in their navigation to track user movement. Instead, learn to use your web analytics referrer and navigation path reports. If you must use tracking IDs, change your software to use a hash mark (a ‘#’ sign) instead of a question mark. Search engines generally ignore everything after the hash, so you’ll avoid confusion.
- Don’t Use Tracking IDs in Organic Links from Other Sites: If you get a link on another site and want it to help with your SEO, don’t put a tracking ID in that link.
- Be Careful with Pagination: Many sites have pagination for search results, product lists, or articles. Make sure each page has a single URL. For example, if page 1 of the article is ‘https://savvysite.com/article.html’, ensure that the number ‘1’ in the pagination takes you there, too, instead of to ‘https://savvysite.com/article.html?page=1’.
- Set Up Preventative Redirects: Make sure that ‘https://savvysite.com’ 301 redirects to ‘https://www.savvysite.com’ (or the reverse, as long as you pick one version and use it everywhere).
- Exclude ‘Email a Friend’ Pages: Most content management systems with ‘email a friend’ options direct users to a unique page with the same form and content. Each instance of that page has a unique URL like ‘ID=123’. Use robots.txt and the meta robots tag to exclude these from search engine crawls.
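Several of the rules above boil down to one decision: given a requested URL, is there a cleaner URL the server should 301 to? The Python sketch below shows one way to express that decision. It assumes the www host is the canonical one and that ‘source’ is the only tracking parameter; `redirect_target`, `CANONICAL_HOST`, and `TRACKING_PARAMS` are hypothetical names for illustration, and a real site would wire the result into its own web server or framework.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

CANONICAL_HOST = "www.savvysite.com"  # assumption: www is the chosen version
TRACKING_PARAMS = {"source"}          # hypothetical tracking parameter names


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301 to, or None if the URL is already canonical."""
    parts = urlsplit(url)

    # Rule: one home page URL, so strip a trailing index.html.
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]

    # Rules: drop tracking IDs, and treat ?page=1 as the unpaginated URL.
    query_pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and (key, value) != ("page", "1")
    ]

    # Rule: a single scheme and host for every page.
    canonical = urlunsplit(
        ("https", CANONICAL_HOST, path, urlencode(query_pairs), "")
    )
    return None if canonical == url else canonical


if __name__ == "__main__":
    for requested in (
        "https://savvysite.com/",
        "https://www.savvysite.com/index.html",
        "https://www.savvysite.com/article.html?page=1",
        "https://www.savvysite.com/",
    ):
        print(requested, "->", redirect_target(requested))
```

When the function returns a URL, the server would respond with status 301 and that URL in the Location header; when it returns None, the page is served normally.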
What about rel=canonical?
The canonical tag is a neat little gadget that’s supposed to let you tell search engines the correct URL for any page. Adding
<link rel="canonical" href="https://www.savvysite.com">
to the <head> of any page lets visiting search bots index just that version and direct all link authority to that URL. It sounds ideal.
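If you want to check which canonical URL a page actually declares, a short script can read it out of the HTML. The sketch below uses Python’s standard-library `html.parser`; `CanonicalLinkParser` and `find_canonical` are illustrative names, and fetching the page is left out.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="https://www.savvysite.com">'
        "</head><body>Home</body></html>"
    )
    print(find_canonical(page))
```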
However, it’s not foolproof. First, search engines treat the tag as a hint rather than a directive. Google, Yahoo!, and Microsoft jointly announced support for it in 2009, but none of them promises to honor it on every page. Second, you can’t rely solely on tags of this nature, as search engines may change how they handle them later. Google’s done it. So don’t stake your SEO strategy on it. Third, why not do it right the first time? In addition to SEO benefits, a canonically clean site should:
- Run faster
- Present fewer maintenance headaches
- Place less load on server and bandwidth resources
Addressing these canonicalization issues can improve your site’s SEO, user experience, and overall performance.