Full Guide to the Canonical Tag

Published 2024-11-25

Massive thanks to the wonderful Stevy Liakopoulou for this guide to canonical tags, which covers what they are, how they work, common mistakes, best practices, and how to audit your website’s canonical tags.

Have you ever looked in Google Search Console and noticed issues like “Alternate page with proper canonical tag” or “Duplicate, Google chose different canonical than user”? Or have you ever worked on an e-commerce website that sells products with multiple variants (e.g. dimensions, colours) and noticed duplication issues?

The solution to the duplication that Google Search Console is flagging is a technical SEO element called the ‘canonical’ tag.

What is a Canonical Tag?

A canonical tag (<link rel="canonical" href="URL">) is an HTML element used to specify the "preferred" version of a web page. 

This tag helps search engines understand which URL you want to be treated as the main page when multiple pages have similar or duplicate content, for example filtered pages, session ID pages, referral pages, print-version pages, tag pages, and many more.

Here’s an example of a canonical tag inside the HTML code for sitebulb.com:

[Image: HTML structure of a canonical tag]

How Do Canonical Tags Work?

Basic Structure

The canonical tag should be placed within the <head> section of an HTML document. This is where search engines expect to find meta information and link elements that are important for SEO and page indexing. Here's the basic structure:

<link rel="canonical" href="https://www.example.com/preferred-page" />

When the canonical tag is placed outside of the <head> section, search engines may not recognize or process it, so the page can end up being treated as if it had no canonical tag at all.
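One way to check the placement rule above is to parse a page's HTML and record where each canonical tag sits. The following is a minimal sketch using only Python's standard library. The names `CanonicalLocator` and `locate_canonicals` are made up for illustration. It only tracks explicit <head> and </head> tags, so it is stricter than a real browser, which may close the head implicitly.

```python
from html.parser import HTMLParser


class CanonicalLocator(HTMLParser):
    """Collect canonical hrefs, split by whether they sit inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_head_hrefs = []
        self.outside_head_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            # rel can hold several space-separated tokens
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels:
                bucket = self.in_head_hrefs if self.in_head else self.outside_head_hrefs
                bucket.append(attr.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def locate_canonicals(html):
    """Return (hrefs found inside <head>, hrefs found outside <head>)."""
    parser = CanonicalLocator()
    parser.feed(html)
    return parser.in_head_hrefs, parser.outside_head_hrefs
```

A page that returns anything in the second list has a canonical tag that search engines may not pick up.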

Example Scenario

Imagine you run an e-commerce website selling shoes. You might have multiple URLs for the same product, such as:

  • https://www.example.com/shoes/red-sneakers
  • https://www.example.com/shoes?color=red&type=sneakers 
  • https://www.example.com/products/red-sneakers

To avoid duplicate content issues, you would use a canonical tag to indicate the preferred URL:

<link rel="canonical" href="https://www.example.com/shoes/red-sneakers" />
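As a rough illustration of the scenario above, a site template could look up the preferred URL for whichever variant was requested and print the same tag on all of them. This is only a sketch: the `PREFERRED` mapping and the `canonical_tag_for` helper are hypothetical, and a real site would usually derive the mapping from its product data instead of hard-coding it.

```python
from html import escape

# Hypothetical mapping from each known URL variant to the preferred URL.
PREFERRED = {
    "https://www.example.com/shoes/red-sneakers": "https://www.example.com/shoes/red-sneakers",
    "https://www.example.com/shoes?color=red&type=sneakers": "https://www.example.com/shoes/red-sneakers",
    "https://www.example.com/products/red-sneakers": "https://www.example.com/shoes/red-sneakers",
}


def canonical_tag_for(requested_url):
    """Build the canonical tag to print in the <head> of the requested URL."""
    # Unknown URLs fall back to a self-referencing canonical.
    preferred = PREFERRED.get(requested_url, requested_url)
    return f'<link rel="canonical" href="{escape(preferred, quote=True)}" />'
```

All three variants then carry an identical tag pointing at /shoes/red-sneakers.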

But there are lots more situations in which to use the canonical tag, which we’ll investigate in the rest of this guide.

Common Canonical Tag Mistakes

Scenario 1: Homepage is Accessible via Multiple URL Versions

  • http://example.com
  • https://example.com
  • http://www.example.com
  • https://www.example.com

HTTP vs HTTPS and WWW vs non-WWW duplication issues are very common. They mean that every single page of your site, not just the homepage, is accessible via multiple URLs. Search engines may struggle to determine the most authoritative version of your homepage, leading to inconsistent indexing and ranking.

So Google sees the pages like this… Pretty confusing, huh?

[Image: common mistakes of canonical tags. Image source: WooRank]

What to do

Decide on the proper URL structure for your website (e.g. will the URL contain www or not?), set up 301 redirect rules from every other version to it, and add a self-referencing canonical tag. Ensure the canonical tag on the homepage points to the preferred URL version.
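The same decision can be written down as a small normalisation rule, which is handy both for generating redirect targets and for checking canonical tags. The sketch below assumes the preferred version is https://www.example.com. The constants and the `preferred_url` function are illustrative, and the function deliberately drops ports and fragments.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site has chosen HTTPS with the www host as its preferred version.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
HOST_VARIANTS = {"example.com", "www.example.com"}


def preferred_url(url):
    """Map any HTTP/HTTPS, www/non-www variant to the preferred URL."""
    parts = urlsplit(url)
    if (parts.hostname or "") not in HOST_VARIANTS:
        return url  # not one of our hosts, leave untouched
    path = parts.path or "/"
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))
```

Every one of the four homepage variants listed in this scenario maps to the single URL https://www.example.com/, which is the value both the 301 redirects and the canonical tag should use.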

Scenario 2: Canonical Tag Outside of Head Section

Typically, the canonical tag is placed within the <head> section of a webpage's HTML code. However, there are cases where this tag appears outside of the head section, which can lead to unexpected behaviour in terms of SEO and indexing, such as the canonical tag being completely ignored by Googlebot.

This placement might occur due to errors in website coding or through intentional modifications made during website development. From an SEO standpoint, search engines prioritise the canonical tag's presence within the head section to interpret the preferred URL version for indexing and ranking purposes. 

Scenario 3: Canonical Tag Points to HTTP URL

In cases where the canonical tag points to an HTTP URL instead of HTTPS, alarm bells start ringing for website security issues. Modern best practices use HTTPS to encrypt data transmitted between users and websites, ensuring confidentiality and integrity. When the canonical tag references an HTTP URL, it indicates a potential oversight or outdated implementation regarding website security protocols.

Google and search engines in general prioritise HTTPS URLs over HTTP for indexing and ranking due to their security features. Failure to update the canonical tag to HTTPS may affect a website's visibility and trustworthiness in SERPs. It's very important for webmasters to ensure consistent implementation of HTTPS across all canonical tags to align with current SEO guidelines and user expectations for secure browsing experiences.

Scenario 4: Rendered Canonical is Different from the Canonical in the HTML Source

Mismatches between the canonical tag in the rendered page and the one in the original HTML source can arise from how the page is rendered, typically when JavaScript rewrites the tag. The rendered canonical tag represents the version processed by search engine crawlers, influencing how web pages are indexed and ranked in search results.

[Image: Googlebot might index a page before rendering is complete. Image source: Google Search Central]

The problem

Differences between the rendered canonical tag and the HTML code may come from dynamic content generation, JavaScript execution, or server-side rendering complexities. Search engines try to index content accurately based on the rendered version visible to users, rather than solely relying on static HTML code. 

Websites must ensure consistency between the rendered canonical tag and its HTML representation to uphold SEO best practices and facilitate optimal indexing of web pages.
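A consistency check along these lines can be scripted once you have both versions of a page saved as strings. The sketch below takes the raw and rendered HTML as given (how you obtain the rendered HTML, for example from a headless browser or a crawler export, is outside its scope). The function names and result labels are invented for illustration.

```python
from html.parser import HTMLParser


class _CanonicalExtractor(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.hrefs.append(attr.get("href"))


def extract_canonicals(html):
    parser = _CanonicalExtractor()
    parser.feed(html)
    return parser.hrefs


def compare_canonicals(raw_html, rendered_html):
    """Label how the canonical in the raw source relates to the rendered one."""
    raw = extract_canonicals(raw_html)
    rendered = extract_canonicals(rendered_html)
    if not raw and not rendered:
        return "missing"
    if not raw:
        return "only in rendered HTML"
    if not rendered:
        return "removed by rendering"
    return "consistent" if raw == rendered else "mismatch"
```

Anything other than "consistent" is worth a closer look, since it means the tag depends on JavaScript in some way.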

You can learn more here about why the canonical tag in the rendered HTML can differ from the one in the source HTML, and how to fix it.

Scenario 5: Canonical Tag Points to Wrong URL

There are multiple types of “wrong URLs” that a canonical tag can point to, so let’s dive into this a bit more:

Case 1: Canonical Tag Points to 404 Page

When a canonical tag points to a URL that returns a 404, it tells search engines that the primary version of the content is a non-existent page. This creates confusion and can negatively impact the indexing and ranking of your content.

Imagine a library with many books. Each book has a label saying, "This is the main copy." Each book's label points to its own spot on the shelf. If you find a duplicate book, the label shows you where the main copy is. Now, imagine some labels point to an empty spot on the shelf (404 error). When someone finds a duplicate book, they go to the spot but find nothing there.

The problem:

Search engines may struggle to index the correct page since they are directed to a non-existent page, potentially leading to the original page not being indexed.

The SEO value of the original page may be lost, as search engines are unable to transfer link equity to a 404 page.

What to do:
  • Update Canonical Tags: Change the canonical tags to point to appropriate, existing URLs. Ensure the canonical URL is a valid, live page.

For Example: If Page A has a canonical tag pointing to Page B, and Page B returns a 404 error, update Page A’s canonical tag to point to a relevant, existing page (e.g., Page C).

  • Fix or Redirect 404 Pages: If the page being pointed to by the canonical tag (Page B) is supposed to exist, restore the content or implement a proper 301 redirect to a relevant existing page.

Develop a clear strategy for setting canonical tags, ensuring they always point to valid, live URLs. 

Case 2: Canonical Tag Points to 5XX Page

When a canonical tag points to a URL that returns a 5XX error, it means the tag is directing search engines to a page that has server issues and is not accessible. 

The problem:

First of all, search engines will try to access the URL specified in the canonical tag. If the URL returns a 5XX error, they may not be able to crawl and index the page. This, sooner or later, will cause indexing issues.

And second, when the canonical URL is inaccessible, search engines are likely to fail to identify the primary version of the content correctly, which can lead to duplicate content issues and dilution of link equity.

What to do:
  • Fix Server Issues: Investigate and resolve the server issues causing the 5XX errors. Ensure that the canonical URLs are accessible and return a 200 (OK) status.

    • Common issues: Check for server misconfigurations, temporary outages, or resource limitations that might be causing the errors.

  • Update Canonical Tags (if needed): If the server issues cannot be resolved quickly, update the canonical tags to point to appropriate, accessible URLs in the interim.

For Example: If Page A has a canonical tag pointing to Page B, and Page B returns a 5XX error, update Page A’s canonical tag to point to another relevant, live page (e.g., Page C) until Page B is fixed.
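Cases 1 and 2 both come down to the HTTP status code returned by the canonical target, so they can be flagged from crawl data with a simple rule. The sketch below assumes you already have the status code for each canonical URL (for example from a crawler export). The function names and issue labels are hypothetical.

```python
def classify_canonical_target(status_code):
    """Turn the canonical target's HTTP status into a coarse label."""
    if status_code == 200:
        return "ok"
    if 300 <= status_code < 400:
        return "redirect"
    if 400 <= status_code < 500:
        return "client_error"  # includes 404
    if 500 <= status_code < 600:
        return "server_error"
    return "unexpected"


def flag_broken_canonicals(pages):
    """pages maps page URL -> (canonical URL, status code of the canonical URL).

    Returns (page, canonical, issue) for every canonical that is not a 200.
    """
    issues = []
    for page, (canonical, status) in pages.items():
        label = classify_canonical_target(status)
        if label != "ok":
            issues.append((page, canonical, label))
    return issues
```

For instance, a Page A whose canonical (Page B) returns 404 would come back as ("/page-a", "/page-b", "client_error").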

Case 3: Canonical Tag Points to a Noindex URL

When a canonical tag on a page points to a URL that has a noindex directive, it means that the canonical URL - the one search engines should consider the primary version - is marked to not be indexed by search engines. 

This is contradictory and creates confusion for search engines about how to handle the pages.

The problem:

Indexing Issues: Search engines are instructed to consider the canonical URL as the primary one but are also told not to index it. (Whaaaaat?) This contradiction can lead to search engines ignoring both the original page and the canonical page.

Loss of SEO Value: If search engines follow the noindex tag on the canonical URL, the original page might not pass its SEO value, causing a loss in search rankings and visibility.

Wasted Crawl Budget: Search engines may waste their crawl budget trying to understand the conflicting directives, potentially ignoring other important pages on your site.

What to do:
  • Spot Affected Pages: Use an SEO auditing tool like Sitebulb to identify pages with canonical tags pointing to noindex URLs.
  • Update Canonical Tags: Change the canonical tags to point to an appropriate indexable URL. Ensure the canonical URL is meant to be indexed and represents the primary version of the content.

For Example: If Page A has a canonical tag pointing to Page B, and Page B is set to noindex, either update Page B to be indexable or change Page A’s canonical tag to point to another relevant, indexable page.

  • Monitor everything: Continuously monitor your site to catch and correct any new instances where canonical tags might point to noindex URLs.
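To spot this conflict by hand, you can check whether the canonical target carries a noindex directive, either in a robots meta tag or in an X-Robots-Tag response header. The following is a simplified sketch with invented names. It treats "none" as noindex and does not handle user-agent-prefixed header values such as "googlebot: noindex".

```python
from html.parser import HTMLParser


class _RobotsMeta(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def is_noindex(html, x_robots_tag=""):
    """True if the page's meta tags or X-Robots-Tag header value say noindex."""
    parser = _RobotsMeta()
    parser.feed(html)
    header = {d.strip().lower() for d in x_robots_tag.split(",")}
    found = parser.directives | header
    # "none" is shorthand for "noindex, nofollow"
    return "noindex" in found or "none" in found
```

Run it against the HTML of the canonical target (Page B), not the page that declares the canonical (Page A).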

[Image: a meme about 15 canonical tag issues reported on a Sitebulb crawl]

Case 4: Canonical Tag Points to Another Canonicalized URL

If Page A has a canonical tag pointing to Page B, and Page B has a canonical tag pointing to Page C, the result is a canonical chain, which can send conflicting signals to search engines.

The problem:

Search engines might get confused by the multiple layers of canonical tags, making it harder to understand the original version of the content.

The additional hops between canonical tags can waste crawl budget and lead to inefficient indexing, possibly resulting in some pages not being indexed at all.

What to do:
  • You can use auditing tools to identify these pages.
  • Ensure each page's canonical tag points directly to the final, intended primary URL, avoiding intermediate steps.

For Example: If Page A has a canonical tag pointing to Page B, and Page B has a canonical tag pointing to Page C, change Page A’s canonical tag to point directly to Page C.
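If you have a list of each page's declared canonical, the final target of a chain can be worked out by following the pointers until a page points to itself. Here is a minimal sketch under that assumption. The `canonicals` dictionary and the `resolve_final_canonical` function are illustrative names, and the hop limit of 10 is arbitrary.

```python
def resolve_final_canonical(url, canonicals, max_hops=10):
    """Follow canonical pointers from url and return (final URL, hops taken).

    canonicals maps page URL -> declared canonical URL. Pages missing from
    the mapping are treated as self-canonical.
    """
    seen = {url}
    current = url
    hops = 0
    while True:
        target = canonicals.get(current, current)
        if target == current:
            return current, hops
        if target in seen or hops >= max_hops:
            raise ValueError(f"canonical loop or overly long chain starting at {url}")
        seen.add(target)
        current = target
        hops += 1
```

Any page that resolves with more than one hop is part of a chain, and its canonical tag should be changed to the returned final URL.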

Case 5: Canonical Tag Points to a Disallowed URL

Ok, here’s the thing: When a canonical tag on a page points to a URL that is disallowed in the robots.txt file, it means search engines are instructed to consider the disallowed URL as the primary version of the content. However, search engines are also instructed not to crawl the disallowed URL, creating a conflict.

The problem:

Search engines may not be able to crawl the disallowed URL, leading to difficulties in determining the primary version of the content.

As a result, if the canonical URL is disallowed, search engines might not index any version of the page, reducing visibility in SERPs and, in turn, causing drops in rankings and traffic.

What to do:
  • Change the canonical tags to point to appropriate, crawlable URLs. Ensure the canonical URL is not disallowed in the robots.txt file.

For Example: If Page A has a canonical tag pointing to Page B, and Page B is disallowed in robots.txt, update Page A’s canonical tag to point to a relevant, allowed URL.

  • Update robots.txt file (if needed): If the canonical URL is correct and should be indexed, update the robots.txt file to allow crawling of that URL.

For Example: If Page B is the intended primary URL but disallowed, change the robots.txt file to allow crawling of Page B.
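Whether a canonical URL is blocked can be tested against the text of your robots.txt file. The sketch below uses Python's standard `urllib.robotparser` module. The `canonical_is_crawlable` wrapper is a made-up helper, and the standard-library parser uses simple prefix matching, so its verdict may differ from Googlebot's on rules that use wildcards.

```python
from urllib.robotparser import RobotFileParser


def canonical_is_crawlable(robots_txt, canonical_url, user_agent="Googlebot"):
    """Check a canonical URL against the contents of a robots.txt file."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, canonical_url)
```

For example, with a robots.txt containing "Disallow: /private/" for all user agents, a canonical pointing at https://www.example.com/private/page-b would be reported as not crawlable.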

Case 6: Canonical Tag Points to a Redirecting URL

When a canonical tag on a page points to a URL that performs a redirect (301 or 302), it creates an extra step for search engines to follow. This can cause confusion as search engines try to determine the primary version of the content.

The problem:

Search engines must follow the redirect to determine the final destination, which can waste crawl budget and delay indexing of the correct canonical page. All of this works against effective SEO growth.

What to do:
  • Identify Redirect Chains: Use tools to identify pages with canonical tags pointing to URLs that redirect.

  • Direct Canonicalization: Update canonical tags to point directly to the final, non-redirecting URL.

For Example: If Page A has a canonical tag pointing to Page B, and Page B redirects to Page C, update Page A’s canonical tag to point directly to Page C.

  • Simplify Redirects: Minimise and simplify redirect chains on your site.

Scenario 6: Canonical Only Found in Rendered DOM

When you see the message "Canonical only found in rendered DOM," it means that the canonical tag is not present in the initial HTML source code. Instead, it is only found after JavaScript has executed and modified the DOM (Document Object Model) of the page.

The problem:

There are several reasons why having the canonical tag only in rendered DOM is a problem.

  1. Crawl Budget: Search engines have limited resources to crawl and index pages. If the canonical tag is only available in the rendered DOM, search engines need to execute JavaScript to see it. This can be resource-intensive and may lead to the canonical tag being missed if JavaScript isn't executed.
  2. Rendering Delays: There can be delays or issues in rendering JavaScript, meaning the canonical tag might not be processed in time, leading to potential indexing of duplicate content.
  3. Inconsistent Indexing: If the canonical tag is only found in the rendered DOM, there's a risk that different search engines or even different crawls by the same search engine may not consistently recognize the canonical URL, leading to duplicate content issues.
  4. JavaScript Errors: If there are errors in the JavaScript that adds the canonical tag, it might not be applied correctly or at all.

All the above can lead to:

  1. Crawling Issues: Googlebot may not always execute JavaScript. If the canonical tag is only available after JavaScript runs, search engines might miss it entirely.
  2. Duplicate Content: Without a proper canonical tag, search engines might index multiple versions of the same page, considering them as duplicate content. This can harm your page’s ranking and authority.
  3. Rankings: The lack of a visible canonical tag can lead to confusion for search engines about which page to prioritise, potentially affecting your site's SEO performance and visibility.

What to do:

  1. Server-Side Rendering (SSR): Implement server-side rendering so that the canonical tag is included in the initial HTML sent to the browser. This ensures that search engine bots see the canonical tag without needing to execute JavaScript.
  2. Static Rendering: Generate static HTML versions of your pages that include the canonical tag, which can be served to both users and bots.
  3. JavaScript Enhancement: If SSR or static rendering isn't feasible, make sure your JavaScript executes quickly and effectively. However, this is less reliable than having the canonical tag in the initial HTML.
  4. Testing and Monitoring: Use tools like Google Search Console and Sitebulb’s JavaScript crawler to check if canonical tags are being detected correctly by search engines.
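To make the first two options more concrete, here is a bare-bones sketch of the idea behind server-side or static rendering: the server fills in the canonical tag before the HTML leaves the server, so it is present in the initial response. The template, the /app.js path and the `render_page` function are all hypothetical, and a real site would use its framework's own templating.

```python
from html import escape
from string import Template

# Hypothetical page shell: the canonical tag is part of the initial HTML,
# while the page body is still filled in later by JavaScript.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
  <title>$title</title>
  <link rel="canonical" href="$canonical" />
</head>
<body>
  <div id="app"></div>
  <script src="/app.js"></script>
</body>
</html>
""")


def render_page(title, canonical_url):
    """Return the initial HTML with the canonical tag already in <head>."""
    return PAGE_TEMPLATE.substitute(
        title=escape(title),
        canonical=escape(canonical_url, quote=True),
    )
```

Because the tag is in the raw HTML, a crawler that never executes /app.js still sees it.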

Scenario 7: Canonical Loops

A canonical loop happens when two or more pages point to each other with canonical tags, creating a circular reference. For example, Page A has a canonical tag pointing to Page B, and Page B has a canonical tag pointing back to Page A. This loop can also extend to multiple pages, creating a more complex circular reference.

The problem:

  1. Confusion for Search Engines: Search engines rely on canonical tags to spot the primary version of a page. Canonical loops create confusion, making it difficult for search engines to identify which page should be considered the authoritative version.
  2. Indexing Issues: Pages involved in canonical loops might not be indexed properly, leading to reduced visibility in search engine results.
  3. SEO Performance: The confusion caused by canonical loops can dilute the SEO value of your pages, affecting rankings and traffic.

You can use SEO auditing tools or manual checks to identify pages with canonical loops. 

What to do:

Make sure that each page has a clear and correct canonical tag pointing to the intended primary page. Avoid pointing canonical tags back and forth between pages.

  • For Page A and Page B: If Page A should be the primary page, set the canonical tag of Page B to point to Page A, and ensure Page A's canonical tag points to itself.
  • Break Larger Loops: If there are multiple pages involved, carefully break the loop by setting appropriate canonical tags for each page to point to the intended primary page.

Develop and maintain a consistent strategy for setting canonical tags across your site, and audit them regularly.
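For a manual check, loops can be found by walking each page's canonical pointer and watching for a URL that comes up twice. The sketch below assumes a dictionary of page URL to declared canonical URL, such as one built from a crawl export. The function name is illustrative.

```python
def find_canonical_loops(canonicals):
    """Return each canonical loop once, as a list of the URLs involved.

    canonicals maps page URL -> declared canonical URL.
    """
    loops = []
    reported = set()
    for start in canonicals:
        path = []
        current = start
        # Walk until we hit a self-canonical page, an unknown page, or a repeat.
        while (current in canonicals
               and canonicals[current] != current
               and current not in path):
            path.append(current)
            current = canonicals[current]
        if current in path:
            loop = path[path.index(current):]
            key = frozenset(loop)
            if key not in reported:
                reported.add(key)
                loops.append(loop)
    return loops
```

Each returned list is one loop. To break it, pick the intended primary page, make it self-canonical, and point the others at it.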

Canonical Tag Best Practices

Here are some simple best practices for using canonical tags, perfect for teaching beginners in SEO:

1. Always Use One Canonical Tag on Every Page

  • A canonical tag tells search engines which version of a page is the "main" or "preferred" one. This is helpful if there are similar or duplicate versions of a page (for example, with different URL parameters).
  • Even if there are no duplicates, adding a self-referential canonical tag (where the page points to itself) is a good habit. It helps prevent search engines from picking a different URL version by mistake.

2. Canonicalize Your Homepage

  • Sometimes, other websites might link to your homepage with extra parameters (like tracking codes). To avoid confusion, add a canonical tag to your homepage that points to its correct version (e.g., https://yourwebsite.com).
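One way to derive the clean homepage URL for the canonical tag is to strip known tracking parameters from the requested URL. This is a sketch only: the list of tracking parameters is an assumption and should be adapted to whatever parameters your site actually receives.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; extend to match your own analytics setup.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"gclid", "fbclid"}


def strip_tracking(url):
    """Remove tracking parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

The result can then be used as the href of the homepage's canonical tag, whatever parameters the visitor arrived with.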

3. Avoid Mixing Canonicals and Redirects

  • Don’t mix canonical tags with 301 redirects. If you use a canonical tag to point from one page to another, avoid redirecting the canonical page back to the first page. This confuses search engines.

4. Stick to One Direction for Canonicalization

  • Canonicalization should only ever run from the duplicate version to the primary one. Don’t have two pages pointing back to each other as the canonical version. Always keep it one-way and clear.

5. Include Only Canonical URLs in Your Sitemap

  • In your sitemap, only include the URLs that are marked as canonical. This helps Google and other search engines know which pages to index.
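As a sketch of this rule, a sitemap generator can simply skip every page whose canonical tag points somewhere else. The example assumes a dictionary of page URL to declared canonical URL. The `build_sitemap` function is hypothetical and produces only the minimal <urlset>, <url> and <loc> elements.

```python
import xml.etree.ElementTree as ET


def build_sitemap(canonicals):
    """Build sitemap XML containing only self-canonical pages.

    canonicals maps page URL -> declared canonical URL.
    """
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for page, canonical in canonicals.items():
        if page == canonical:  # skip pages that are canonicalized elsewhere
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")
```

Canonicalized variants such as filtered or parameterised URLs never make it into the output.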

6. Use Absolute URLs

  • Always use the full URL in your canonical tag, including the https:// and the domain name. For example, use https://yourwebsite.com/page, not /page. This ensures there’s no confusion about which version of the URL is the correct one.
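The reason relative values are risky is that they resolve against whichever URL the page happened to be loaded from. The short sketch below uses the standard `urljoin` function to show this. The helper names are made up for illustration.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href):
    """True if the href includes both a scheme and a domain name."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def resolved_canonical(page_url, href):
    """Resolve a canonical href the way a relative URL would be interpreted."""
    return urljoin(page_url, href)
```

With a relative value of /page, the canonical resolves to http://yourwebsite.com/page when the page is loaded over HTTP and to https://yourwebsite.com/page when it is loaded over HTTPS, whereas an absolute value stays the same in both cases.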

7. Use Canonicals to Handle Duplicate Content

  • If you have pages with very similar or duplicate content (for example, product pages with different colours), use a canonical tag to tell search engines which version you want to rank. This helps prevent your pages from competing with each other in search results.

8. Don’t Chain Canonicals

  • Don’t set up a "chain" of canonical tags. For example, if Page A points to Page B as canonical, don’t make Page B point to Page C. This makes it hard for search engines to figure out the correct version of the page.

9. Use Cross-Domain Canonicals

  • When content is syndicated, use canonical tags to point to the original source.

10. Use Canonicals to Avoid International Duplicates 

  • Combine canonical tags with hreflang attributes to manage multilingual and regional versions of a page.
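A common way to combine the two is for each language version to carry a self-referencing canonical tag plus hreflang annotations listing every version, including itself. The sketch below generates those tags under that assumption. The `head_links` function and the example URLs are hypothetical.

```python
from html import escape


def head_links(current_lang, versions):
    """Build canonical and hreflang tags for one language version of a page.

    versions maps hreflang code -> absolute URL of that version.
    """
    own_url = escape(versions[current_lang], quote=True)
    lines = [f'<link rel="canonical" href="{own_url}" />']
    for lang, url in versions.items():
        lines.append(
            f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
            f'href="{escape(url, quote=True)}" />'
        )
    return "\n".join(lines)
```

Called for the "el" version of a page that also exists in "en", it returns a canonical pointing at the Greek URL followed by one alternate line per language.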

11. Ensure Canonicals Aren’t Impacted by JavaScript Rendering 

  • The canonical tag should be directly placed in the raw HTML to ensure it's available before any JavaScript rendering.
  • Verify Consistency After Rendering: After JavaScript execution, verify that the canonical tag remains consistent and unchanged in the rendered HTML.
  • Test Canonical Tag in Dynamic Content: For pages relying heavily on JavaScript, check that the canonical tag appears correctly in both raw and fully rendered HTML.
  • Server-Side Rendering (SSR) Consideration: If using SSR, ensure the canonical tag is included in the HTML served to the client to improve SEO.

How to Audit Canonical Tags Using Sitebulb

Here is a step-by-step guide to auditing canonical tags using Sitebulb.

Step 1: Install and Set Up Sitebulb

  1. Download and Install Sitebulb: If you haven’t already, download Sitebulb and install it on your computer.
  2. Create a New Project:
    • Open Sitebulb and click on 'New Project'.
    • Enter the website URL you want to audit in the ‘Website URL’ field.
    • Customize crawl settings as needed (for large sites, adjust crawl speed and limits).

Step 2: Configure Crawl Settings for Canonical Tags

  1. Advanced Settings (Optional): If you want to focus specifically on canonical tags, you can adjust the settings:
    • Under Crawl Settings, ensure that Meta Tags and Canonical Links options are checked.
    • If you want to detect duplicate content, also enable Duplicate Content Detection.
  2. Start the Crawl: Once everything is set up, hit the ‘Start Audit’ button. Sitebulb will begin crawling the website and collecting data.

Sitebulb will automatically check every internal URL for a wide range of potential canonical issues, and if any issues are found, these will be presented via the 'Hints' tab.

[Image: Sitebulb's interface when canonical tag issues are reported]

There you have it, a complete guide to canonical tags for SEO. You should now know what to do, what not to do, and how to check whether your canonical tags meet best practices.

Sitebulb is a proud partner of Women in Tech SEO! This author is part of the WTS community. Discover all our Women in Tech SEO articles.

Stevy Liakopoulou

Stevy Liakopoulou is a highly driven and data oriented SEO expert, currently working at Search Magic as an SEO Specialist. She takes pride in providing technical audits and can translate business KPIs into successful SEO campaigns. Her goal is to teach SEO internationally and implement technical audits on at least 300 websites.
