High: This Hint is very important, and definitely warrants attention.
Issue: This Hint represents an error or problem that needs to be fixed.

Disallowed URL has incoming hreflang

This means that the URL in question is defined as a hreflang alternate, yet is disallowed in robots.txt.

Why is this important?

Hreflang tags are interpreted by search engines as indexing instructions. An English page that has hreflang pointing at its French alternate is instructing search engines to index both the English version and the French version, and to consider each as equivalent in their respective languages.

If this English page were disallowed in robots.txt, search engines would be unable to crawl its content. Whilst this does not technically stop them from indexing the URL, it does stop them from seeing the alternate URLs listed on the page in the first place, and therefore from validating the return tags (reciprocal hreflang), which are mandatory for hreflang to work properly.

What does the Hint check?

This Hint will trigger for any URL which is defined as a hreflang alternate, and is disallowed in robots.txt.

Note: This Hint is very similar to another Hint: Has outgoing hreflang annotations to disallowed URLs. The difference is that this Hint analyses the target page of a hreflang annotation (i.e. incoming hreflang), whereas the other Hint analyses the page that carries the hreflang annotation (i.e. outgoing hreflang).

Examples that trigger this Hint:

Consider the URL: https://example.com/pages/us/page-a/

The Hint would trigger for this URL if it was disallowed in robots.txt:

[Image: robots.txt example]

and if the URL is listed as a hreflang alternate, either on another page, or on the page itself via a self-referencing hreflang:

<link rel="alternate" href="https://example.com/pages/us/page-a/" hreflang="en-us" />
<link rel="alternate" href="https://example.com/pages/fr/page-a/" hreflang="fr-fr" />
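The two conditions above can be sketched as a small check. This is only an illustration using Python's standard `urllib.robotparser`, not how Sitebulb itself evaluates the Hint. The robots.txt rule (`Disallow: /pages/us/`) is an assumption made for this example, and the helper name `disallowed_alternates` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content, chosen for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pages/us/
"""

# hreflang alternates taken from the <link> elements in the example above.
HREFLANG_ALTERNATES = {
    "en-us": "https://example.com/pages/us/page-a/",
    "fr-fr": "https://example.com/pages/fr/page-a/",
}


def disallowed_alternates(robots_txt, alternates, user_agent="*"):
    """Return the hreflang alternates that robots.txt blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        lang: url
        for lang, url in alternates.items()
        if not parser.can_fetch(user_agent, url)
    }


flagged = disallowed_alternates(ROBOTS_TXT, HREFLANG_ALTERNATES)
print(flagged)
```

Under the assumed rule, only the `en-us` URL is reported, since it is both a hreflang alternate and disallowed; the `fr-fr` URL is crawlable and so would not trigger the Hint.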

How do you resolve this issue?

The problem with this sort of conflicting instruction is that it is not immediately obvious which instruction is at fault: either the hreflang is wrong or the robots.txt is wrong.

Either way, the current setup will stop hreflang from working properly, so it will need to be fixed.

This starts with figuring out which is the correct hreflang URL, which may require manual inspection.

Once you have established this, it should be straightforward to work out the next step:

  • If the hreflang URL is correct, then the robots.txt is incorrect, and should be changed to remove or amend the offending rule.
  • If the hreflang URL is not correct, then the outgoing hreflang annotations should be corrected so they point to the correct URL.
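If the robots.txt turns out to be at fault, it can help to locate the rule that blocks the URL. The following is a rough sketch under simplifying assumptions: it uses plain prefix matching only, and ignores wildcards, Allow rules and user-agent grouping, so it is not a full robots.txt implementation. The robots.txt content and the helper name `matching_disallow_rules` are hypothetical.

```python
from urllib.parse import urlparse


def matching_disallow_rules(robots_txt, url):
    """Return (line_number, rule_path) for each Disallow prefix matching url.

    Simplified: plain prefix matching only; wildcards, Allow rules and
    user-agent groups are not taken into account.
    """
    path = urlparse(url).path or "/"
    matches = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        value = value.strip()
        if field.strip().lower() == "disallow" and value and path.startswith(value):
            matches.append((number, value))
    return matches


# Assumed robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /pages/us/
"""

rules = matching_disallow_rules(ROBOTS_TXT, "https://example.com/pages/us/page-a/")
print(rules)
```

With the assumed file, the sketch points at the `Disallow: /pages/us/` line as the rule to remove or amend.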
