High: This Hint is very important, and definitely warrants attention.

Issue: This Hint represents an error or problem that needs to be fixed.

Forbidden (403) URL in XML Sitemaps

This means that the URL in question returns an HTTP status of 403 (Forbidden), yet is included in an XML Sitemap.

Why is this important?

Your XML Sitemap should only contain URLs you wish for search engines to index. URLs in your sitemaps should be clean - i.e. sitemaps should only include URLs that are HTTP status 200 (OK), indexable, canonical and unique.

If search engines find 'dirt' in sitemaps, such as 403 pages, they may stop trusting the sitemaps for crawling and indexing signals.

Eric Enge once interviewed Duane Forrester while he was at Bing:

[Image: Duane Forrester quote]

What does the Hint check?

This Hint will trigger for any internal URL that returns an HTTP status of 403 and is included in one of the submitted XML Sitemaps.

Examples that trigger this Hint:

Consider the URL: https://example.com/page-a, which is included in a submitted XML Sitemap.

The Hint would trigger for this URL if it had a 403 (Forbidden) header response:

HTTP/... 403 Forbidden

...
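The check described above can be approximated outside of Sitebulb. The following is a minimal Python sketch, not Sitebulb's own implementation: it reads the `<loc>` entries from a sitemap and reports those whose status is 403. The function names, the `EXAMPLE` sitemap and the `get_status` callback (any function that maps a URL to an HTTP status code) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap used for illustration only.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/page-a</loc></url>
  <url><loc>https://example.com/page-b</loc></url>
</urlset>"""


def sitemap_urls(sitemap_xml):
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def forbidden_urls(sitemap_xml, get_status):
    """Return the sitemap URLs for which get_status(url) is 403.

    get_status is supplied by the caller, so the same function works with
    a live HTTP client or with statuses exported from a crawl.
    """
    return [url for url in sitemap_urls(sitemap_xml) if get_status(url) == 403]
```

For example, passing a dictionary lookup built from crawl data as `get_status` would return only `https://example.com/page-a` if that were the one URL recorded as 403.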

How do you resolve this issue?

To resolve this issue, simply remove any URLs that return 403 from all XML Sitemaps.
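If the sitemap is a static file rather than one generated by a CMS or plugin, the removal can be scripted. Below is a rough Python sketch that assumes a standard sitemaps.org `<urlset>` document; the `remove_urls` function and the `SITEMAP` sample are hypothetical names used for illustration. Where the sitemap is generated automatically, the fix belongs in the generator instead.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical sitemap used for illustration only.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/page-a</loc></url>
  <url><loc>https://example.com/page-b</loc></url>
</urlset>"""


def remove_urls(sitemap_xml, urls_to_remove):
    """Return the sitemap XML with the given URLs' <url> entries removed."""
    # Keep the sitemap namespace as the default (unprefixed) one on output.
    ET.register_namespace("", NS_URI)
    root = ET.fromstring(sitemap_xml)
    # Iterate over a copy of the list because entries are removed in the loop.
    for url_el in list(root.findall(f"{{{NS_URI}}}url")):
        loc = url_el.find(f"{{{NS_URI}}}loc")
        if loc is not None and loc.text and loc.text.strip() in urls_to_remove:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


cleaned = remove_urls(SITEMAP, {"https://example.com/page-a"})
```

Here `cleaned` holds the sitemap without the `page-a` entry, ready to be written back to disk and resubmitted.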

One other thing to note is that the 403 (Forbidden) response might not be the same response given to a search engine crawler. Sometimes, servers employ over-zealous firewalls or 'DDoS protection services' that detect crawling activity and treat it as a DDoS attack, which can lead to 403 responses. So it is worth checking the source of the 403 before taking action.
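One rough way to check the source of a 403 is to request the same URL with different User-Agent headers and compare the status codes. The Python sketch below assumes that approach; the function names and the `USER_AGENTS` table are illustrative. Treat the result as a hint only: many firewalls identify genuine search engine crawlers by IP address rather than by User-Agent, so a request that merely claims to be Googlebot may still be blocked when the real crawler is not, or the reverse.

```python
import urllib.error
import urllib.request

# Illustrative User-Agent strings; adjust to the clients you want to compare.
USER_AGENTS = {
    "browser": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "googlebot": (
        "Mozilla/5.0 (compatible; Googlebot/2.1; "
        "+http://www.google.com/bot.html)"
    ),
}


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for a GET request sent with this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


def compare_user_agents(url, user_agents, fetch=fetch_status):
    """Map each User-Agent label to the status code returned for the URL."""
    return {name: fetch(url, ua) for name, ua in user_agents.items()}


def responses_differ(results):
    """True if the User-Agents did not all receive the same status code."""
    return len(set(results.values())) > 1
```

Calling `compare_user_agents("https://example.com/page-a", USER_AGENTS)` performs live requests. If the statuses differ, the 403 is more likely coming from bot protection than from the page itself, and removing the URL from the sitemap may not be the right fix.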

