Websites that have content targeted to multiple countries or languages can use special markup to communicate to search engines who this content is for. Most of the time, this will refer to hreflang, but some search engines still use the HTML lang attribute. Either way, these attributes are used to communicate the language and region that a particular URL is targeted to.
This article will explain how hreflang and HTML lang can help search engines understand language and regional targeting, and how issues with the configuration can cause the instructions to be ignored. Throughout the article you will find links to all the relevant Hints that Sitebulb uses.
Why use hreflang?
Since Google introduced hreflang in 2011, it has become the standard way to show search engines what the relationship is between web pages in different languages. It is designed solely for the consumption of search engines, but has far-reaching implications for website users. If you have an English page that has been translated into Spanish, using hreflang tells Google that it should show the Spanish version in its search results when users are searching in Spanish-speaking regions. This ensures you serve the right localised content to the right audiences.
The trouble with hreflang is that it is notoriously easy to mess up.
For any given URL, hreflang annotations can be added using 3 different methods:
- Within the <head> section of the HTML
- In the HTTP header
- In XML sitemaps
Generally, it is not advised to use more than one method for the same URL, as it increases the risk that the annotations will drift out of sync with each other in future.
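To make the three methods concrete, here is a minimal Python sketch that renders the same set of alternates in each form. The `ALTERNATES` mapping and the function names are hypothetical and exist only for illustration; the output shows the general shape of each annotation rather than a complete page, response or sitemap.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical alternates for one page: hreflang value -> URL
ALTERNATES = {
    "en-GB": "https://example.com/page-a/en/",
    "de": "https://example.com/page-a/de/",
    "fr": "https://example.com/page-a/fr/",
}


def html_link_elements(alternates):
    """Method 1: link elements to place inside the <head> of the HTML."""
    return [
        f'<link rel="alternate" hreflang={quoteattr(lang)} href={quoteattr(url)} />'
        for lang, url in alternates.items()
    ]


def http_link_header(alternates):
    """Method 2: a single HTTP 'Link' response header listing every alternate."""
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{lang}"'
        for lang, url in alternates.items()
    ]
    return "Link: " + ", ".join(parts)


def sitemap_url_entry(page_url, alternates):
    """Method 3: one <url> entry for an XML sitemap.

    The enclosing <urlset> must also declare the xhtml namespace:
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    """
    lines = ["<url>", f"  <loc>{escape(page_url)}</loc>"]
    for lang, url in alternates.items():
        lines.append(
            f'  <xhtml:link rel="alternate" hreflang={quoteattr(lang)} href={quoteattr(url)} />'
        )
    lines.append("</url>")
    return "\n".join(lines)


if __name__ == "__main__":
    print("\n".join(html_link_elements(ALTERNATES)))
    print(http_link_header(ALTERNATES))
    print(sitemap_url_entry("https://example.com/page-a/en/", ALTERNATES))
```

Generating all three forms from one mapping is only a way to compare them side by side; on a real site you would normally pick a single method.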
For hreflang to be properly understood by search engines, it must conform to a number of specific rules:
- Regions must use the ISO 3166-1 alpha-2 standard
- Languages must use the ISO 639-1 standard
- The language may be declared with no region, but a region cannot be declared with no language
- When declaring both, language must come before region in the annotation (e.g. en-GB)
- Hreflang must be reciprocal (if you link to a page as your Spanish translation, the Spanish page must also link back - these are often called 'return tags')
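- There must be no conflict between hreflang and rel=canonical or noindex (a URL referenced in hreflang should not be canonicalised to a different URL, and should not be noindexed)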
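The formatting rules above can be sketched as a small check on a single hreflang value. This is an illustration under stated assumptions, not how Sitebulb or any search engine validates hreflang: the function name is hypothetical, and the two code sets are tiny samples standing in for the full ISO 639-1 and ISO 3166-1 alpha-2 lists. It only recognises the language and language-region forms described above, plus x-default, and would reject anything else.

```python
import re

# Tiny illustrative samples only (assumption): a real check would load the
# complete ISO 639-1 and ISO 3166-1 alpha-2 code lists.
SAMPLE_LANGUAGES = {"en", "es", "de", "fr"}
SAMPLE_REGIONS = {"GB", "US", "ES", "MX", "DE", "FR"}

# Two-letter language, optionally followed by a hyphen and a two-letter region.
HREFLANG_SHAPE = re.compile(r"([a-z]{2})(?:-([a-z]{2}))?", re.IGNORECASE)


def check_hreflang_value(value):
    """Return a list of problems found with one hreflang value (empty if none)."""
    if value.lower() == "x-default":
        return []
    match = HREFLANG_SHAPE.fullmatch(value)
    if match is None:
        return [f"{value!r} is not in language or language-region form"]
    language = match.group(1).lower()
    region = match.group(2)
    problems = []
    if language not in SAMPLE_LANGUAGES:
        problems.append(f"{language!r} is not a known language code")
    if region is not None and region.upper() not in SAMPLE_REGIONS:
        problems.append(f"{region!r} is not a known region code")
    return problems


if __name__ == "__main__":
    for candidate in ["en-GB", "es", "GB", "GB-en", "en_GB", "en-UK"]:
        print(candidate, check_hreflang_value(candidate))
```

A value such as "GB" fails because a region on its own is read as a language code, and "en-UK" fails because the ISO 3166-1 alpha-2 code for the United Kingdom is GB.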
There are also some optional annotations which may be used:
- URLs may declare a 'self-referential' hreflang (e.g. on the English/Great Britain page, include a self-referencing en-GB annotation)
- URLs may declare an x-default hreflang annotation, which would be considered as the fallback for users whose language or region does not match any of the other annotations
HTML lang is used to define the language, and optionally the associated region, within the HTML of the page. HTML lang is used by search engines such as Bing to identify language/region data (Bing does not currently support hreflang), and it is set using the lang attribute on the html element, the lang attribute on the title element, or the 'content-language' meta tag.
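As a rough illustration of where these three declarations live in a page, the sketch below pulls them out of an HTML string with Python's standard `html.parser` module. The `LangExtractor` class, the `extract_lang_declarations` function and the sample markup are hypothetical, and the sketch assumes the 'content-language' meta tag is written in its `http-equiv` form.

```python
from html.parser import HTMLParser


class LangExtractor(HTMLParser):
    """Collects the language declarations found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.title_lang = None
        self.content_language = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        attributes = dict(attrs)
        if tag == "html" and self.html_lang is None:
            self.html_lang = attributes.get("lang")
        elif tag == "title" and self.title_lang is None:
            self.title_lang = attributes.get("lang")
        elif tag == "meta":
            http_equiv = (attributes.get("http-equiv") or "").lower()
            if http_equiv == "content-language":
                self.content_language = attributes.get("content")


def extract_lang_declarations(html):
    """Return the three HTML lang declarations (None where absent)."""
    parser = LangExtractor()
    parser.feed(html)
    parser.close()
    return {
        "html_lang": parser.html_lang,
        "title_lang": parser.title_lang,
        "content_language": parser.content_language,
    }


# Hypothetical sample page, for illustration only.
SAMPLE_HTML = """<!DOCTYPE html>
<html lang="en-GB">
<head>
  <meta http-equiv="content-language" content="en-gb">
  <title lang="en">Example page</title>
</head>
<body></body>
</html>"""

if __name__ == "__main__":
    print(extract_lang_declarations(SAMPLE_HTML))
```

Reading all three in one pass makes it easy to spot pages where the declarations disagree with each other.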
Incoming & Outgoing Hreflang
The International Hints can be broken down into 3 sections: Outgoing hreflang, Incoming hreflang and HTML lang.
Outgoing hreflang refers to hreflang annotations declared on the URL in question that point to its alternate versions. So if I had an English URL, https://example.com/page-a/en/, the outgoing hreflang would be considered to be link elements found on that page that point at the German version (/de/), the French version (/fr/), and so on.
Incoming hreflang refers to hreflang annotations that 'point' to the URL in question. So if I had an English URL, https://example.com/page-a/en/, the incoming hreflang would be considered to be link elements that were found on the German version (/de/) or the French version (/fr/), that point back to the English version (/en/).
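The relationship between the two can be sketched in a few lines of Python: incoming hreflang is simply the outgoing hreflang of every other page, inverted. The `OUTGOING` mapping below is a hypothetical crawl result built from the example URLs above, the function names are illustrative, and leaving self-referential annotations out of the incoming list is an assumption made for this sketch.

```python
# Hypothetical crawl result: page URL -> {hreflang value: target URL}
# for the annotations found on that page.
OUTGOING = {
    "https://example.com/page-a/en/": {
        "de": "https://example.com/page-a/de/",
        "fr": "https://example.com/page-a/fr/",
    },
    "https://example.com/page-a/de/": {
        "en": "https://example.com/page-a/en/",
    },
    "https://example.com/page-a/fr/": {},  # this page declares no hreflang
}


def incoming_hreflang(outgoing):
    """Invert the outgoing map: target URL -> list of (source URL, hreflang value)."""
    incoming = {url: [] for url in outgoing}
    for source, annotations in outgoing.items():
        for lang, target in annotations.items():
            if target == source:
                continue  # assumption: self-references are not counted as incoming
            incoming.setdefault(target, []).append((source, lang))
    return incoming


def missing_return_tags(outgoing):
    """List (source, target) pairs where the target does not link back to the source."""
    missing = []
    for source, annotations in outgoing.items():
        for target in annotations.values():
            if target == source:
                continue
            if source not in outgoing.get(target, {}).values():
                missing.append((source, target))
    return missing


if __name__ == "__main__":
    print(incoming_hreflang(OUTGOING))
    print(missing_return_tags(OUTGOING))
```

In this sample data the English page points at the French page, but the French page has no annotation pointing back, so that pair is reported as a missing return tag.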
HTML lang refers to all methods of defining the language/region within the page itself: the lang attribute on the html element, the lang attribute on the title element, or the 'content-language' meta tag.