Indexability
Indexability relates to the technical configuration of URLs so that they are either Indexable or Not Indexable.
Search engines generally treat any URL that returns a successful response (i.e. HTTP status 200) as something to index by default, and they will, in the main, index everything they can find. However, there are certain signals and directives you can give to search engines that instruct them NOT to index certain URLs.
Setting URLs so that they are Not Indexable is a relatively common task, and straightforward to do in most modern CMSs. You might want to set a URL to noindex, for instance, if it is useful to website users, but is not a page that would represent a useful search result (e.g. a 'print' version of a page).
However, indexing signals often get misconfigured, which can result in important URLs not getting indexed. This matters because a page that is not indexed has no chance of generating any organic search traffic.
Robots Directives & Canonicals
There are 2 main ways you can signal to search engines that a page should not be indexed - robots directives and canonicals. Accordingly, Sitebulb's Indexability Hints are split in two to reflect this.
Sitebulb's Robots Hints deal with the robots.txt file, meta robots tags and the X-Robots-Tag, and how robots directives may impact the way in which URLs are indexed by search engines.
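As a rough illustration of where page-level robots directives live, the sketch below collects them from the meta robots tag and the X-Robots-Tag header of a single response. The names `MetaRobotsParser`, `split_directives` and `robots_directives` are hypothetical helpers written for this example, not part of Sitebulb, and user-agent-scoped header values (e.g. `googlebot: noindex`) are deliberately not handled.

```python
from html.parser import HTMLParser


def split_directives(value):
    """Split a value such as 'noindex, nofollow' into ['noindex', 'nofollow']."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


class MetaRobotsParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag in the HTML."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives.extend(split_directives(attr_map.get("content", "")))


def robots_directives(html, headers):
    """Return the directives found in the HTML and in the X-Robots-Tag header."""
    parser = MetaRobotsParser()
    parser.feed(html)
    # Header names are case-insensitive, so normalise the keys before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return {
        "html": parser.directives,
        "header": split_directives(lowered.get("x-robots-tag", "")),
    }
```

A URL would typically be treated as Not Indexable if `noindex` appears in either list; keeping the two sources separate makes it possible to compare them later.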
Click through to read more about robots directives, or check out the Robots Hints below.
There are 3 Hints that relate to potential issues caused by the directives themselves, and how they are used in conjunction with internal linking practices.
- Has noindex and nofollow directives
- Internal Disallowed URLs
- URL only has nofollow incoming internal links
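To make the idea of an internal disallowed URL concrete, here is a minimal sketch that uses Python's standard `urllib.robotparser` to find which URLs from a list are blocked by a robots.txt file. The function name `disallowed_urls` and the example rules are assumptions made for illustration, and the standard-library parser may not match every search engine's interpretation of robots.txt.

```python
from urllib.robotparser import RobotFileParser


def disallowed_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() accepts the file content as a list of lines, so no fetch is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

For example, with the rules `User-agent: *` and `Disallow: /private/`, a link to `https://example.com/private/page` would be reported, while `https://example.com/` would not.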
There are 3 Hints that relate to rendering issues caused by disallowed resource files:
Multiple robots directives
There are 6 Hints that relate to issues caused by robots directives being specified multiple times:
- Mismatched nofollow directives in HTML and header
- Mismatched noindex directives in HTML and header
- Multiple nofollow directives
- Multiple noindex directives
- Nofollow in HTML and HTTP header
- Noindex in HTML and HTTP header
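The distinction between these cases can be sketched as a small classifier over the directives found in the HTML and in the HTTP header. This is one plausible reading of the Hint names, not Sitebulb's actual logic; the function name `classify_directive` and the returned labels are hypothetical.

```python
def classify_directive(directive, html_directives, header_directives):
    """Describe how a directive (e.g. 'noindex') is specified across sources.

    html_directives: directives from all meta robots tags, in document order.
    header_directives: directives from the X-Robots-Tag HTTP header.
    """
    html_count = html_directives.count(directive)
    in_header = directive in header_directives

    if html_count and in_header:
        return "in HTML and HTTP header"
    if html_count > 1:
        return "multiple in HTML"
    if (html_count or in_header) and html_directives and header_directives:
        # Both sources give robots directives, but only one includes this one.
        return "mismatched"
    if html_count or in_header:
        return "single"
    return "absent"
```

Specifying a directive once, in one place, is the "single" case; every other non-absent result points to duplication or conflict worth reviewing.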
Sitebulb's Canonical Hints deal with how canonicals impact the way in which URLs are indexed by search engines, and help you unpick canonical issues.
Issues with the canonicalized URL
There are 11 Hints that relate to the canonical URL itself:
- Canonical loop
- Canonical points to a different internal URL
- Canonical points to a disallowed URL
- Canonical points to a noindex nofollow URL
- Canonical points to a noindex URL
- Canonical points to a redirecting URL
- Canonical points to a URL that is Error (5XX)
- Canonical points to a URL that is Not Found 404
- Canonical points to another canonicalized URL
- Canonical points to external URL
- Canonical URL has no incoming internal links
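Several of the Hints above, such as canonical loops and canonicals that point to another canonicalized URL, come down to following canonicals from one URL to the next. The sketch below assumes crawl data is available as a plain dictionary mapping each URL to its canonical; `follow_canonicals` and `canonical_of` are hypothetical names used only for this example.

```python
def follow_canonicals(start_url, canonical_of):
    """Follow canonicals from start_url and return (path, is_loop).

    canonical_of maps a URL to the URL its canonical points to. A URL that is
    missing from the mapping, or that points to itself, ends the chain.
    """
    path = [start_url]
    seen = {start_url}
    current = start_url
    while True:
        target = canonical_of.get(current)
        if target is None or target == current:
            return path, False
        if target in seen:
            # The chain has returned to a URL already visited: a loop.
            return path + [target], True
        path.append(target)
        seen.add(target)
        current = target
```

A path longer than two entries without a loop indicates a chain (a canonical pointing to another canonicalized URL), while `is_loop` being true indicates a canonical loop.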
Conflicting protocol issues
There are 2 Hints that relate to mismatched HTTP/HTTPS canonicals:
There are 8 Hints that relate to the implementation of canonicals:
- Canonical is a relative URL
- Canonical is malformed or empty
- Canonical only found in rendered DOM
- Canonical outside of head
- Canonical tag in HTML and HTTP header
- Mismatched canonical tag in HTML and HTTP header
- Multiple canonical tags
- Multiple, mismatched canonical tags
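As an illustration of the first two implementation checks, the sketch below classifies a canonical `href` as absolute, relative, empty or malformed, and resolves a relative value against the page URL using the standard `urllib.parse` module. The function name `check_canonical_href` and its labels are assumptions for this example, and the notion of "malformed" used here (a non-HTTP scheme or an unparseable value) is a simplification.

```python
from urllib.parse import urljoin, urlsplit


def check_canonical_href(page_url, href):
    """Return (label, resolved_url) for a canonical href found on page_url."""
    href = (href or "").strip()
    if not href:
        return "empty", None
    try:
        parts = urlsplit(href)
    except ValueError:
        # urlsplit raises ValueError for some invalid inputs, e.g. bad IPv6 hosts.
        return "malformed", None
    if parts.scheme in ("http", "https") and parts.netloc:
        return "absolute", href
    if parts.scheme:
        # A scheme is present but it is not a usable http(s) URL.
        return "malformed", None
    # No scheme: treat as relative and resolve against the page URL.
    return "relative", urljoin(page_url, href)
```

Resolving the relative value shows which URL a search engine would most likely infer, which is useful when deciding whether the relative canonical is harmless or points somewhere unintended.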
There are 3 Hints that relate to pagination and pagination canonicals: