How to audit Hreflang & HTML lang in Sitebulb

Hreflang annotation is essential to the organic performance of internationalized websites. However, analyzing hreflang can get complicated, as validation requires checking every URL variation, sometimes across several subfolders or subdomains.

Sitebulb lets you analyze your international setup by comprehensively checking the hreflang and HTML lang attribute setup across every international variation, and providing you with organized insights and hints for optimization.

Setting up the International Audit

To start auditing your internationalized website, enable the ‘International’ report in your ‘Audit Data’ settings, then run your crawl.

* Note: Sitebulb must be configured to crawl the whole website for hreflang validation to work as intended. The International report will return less valuable data if elements of the crawl are restricted.

For example, if you restrict the crawl to the /de/ subfolder, the crawler is unable to check translated content in other subfolders like /en/ or /es/. Sitebulb will see the outgoing hreflang tags but will not be able to gather data about incoming reciprocal hreflang tags, so it won’t be able to validate the setup.
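To see why a restricted crawl breaks validation, here is a minimal sketch of a reciprocity check. It is not Sitebulb’s implementation; the `annotations` data, the example.com URLs, and the `find_unconfirmed_links` helper are all hypothetical. The /es/ page is deliberately missing from the crawl data, as it would be if the crawl were restricted.

```python
# Hypothetical crawl data: page URL -> {hreflang value: alternate URL}.
# /es/ was not crawled, so its outgoing annotations are unknown.
annotations = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    },
    "https://example.com/de/": {
        "de": "https://example.com/de/",
        "en": "https://example.com/en/",
        "es": "https://example.com/es/",
    },
}


def find_unconfirmed_links(annotations):
    """Return (source, target) pairs whose return link could not be confirmed."""
    unconfirmed = []
    for source, alternates in annotations.items():
        for target in alternates.values():
            if target == source:
                continue  # a self-reference needs no return link
            # A return link is confirmed only if the target was crawled
            # AND one of its own annotations points back at the source.
            if source not in annotations.get(target, {}).values():
                unconfirmed.append((source, target))
    return unconfirmed


unconfirmed = find_unconfirmed_links(annotations)
for source, target in unconfirmed:
    print(f"{source} -> {target}: no return link found")
```

In this sketch the /de/ page’s annotation pointing at /es/ cannot be confirmed, simply because /es/ was never crawled.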

Enabling the International report in Sitebulb

If your hreflang markup is set up in sitemaps, you should also enable ‘XML Sitemaps’ as a Crawler Source during setup.

Enabling XML Sitemap as a crawl source

Once the audit is complete, you’ll find the international report in the left-hand menu in Sitebulb.

Finding the international report

Sitebulb checks and reports on both hreflang annotations and HTML lang attributes, following every single alternate URL, irrespective of whether the annotations are implemented in the HTTP header, in the HTML, or within an XML sitemap.
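If you are not sure what the three implementation methods look like in practice, the sketch below reads hreflang annotations from each of them using only the Python standard library. It is an illustration rather than a description of how Sitebulb works: the function names, sample markup, and example.com URLs are assumptions, and the Link header parsing is deliberately naive.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class HreflangLinkParser(HTMLParser):
    """Collects <link rel="alternate" hreflang="..." href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "alternate" in rel and attrs.get("hreflang"):
            self.alternates[attrs["hreflang"]] = attrs.get("href")


def from_html(html):
    parser = HreflangLinkParser()
    parser.feed(html)
    return parser.alternates


def from_link_header(header_value):
    # Naive: splits on commas, so it would break on URLs containing commas.
    alternates = {}
    for part in header_value.split(","):
        url = re.search(r"<([^>]+)>", part)
        lang = re.search(r'hreflang="?([^";\s]+)"?', part)
        if url and lang and re.search(r'rel="?alternate"?', part):
            alternates[lang.group(1)] = url.group(1)
    return alternates


NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def from_sitemap(xml_text):
    result = {}
    for url in ET.fromstring(xml_text).findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        result[loc] = {
            link.get("hreflang"): link.get("href")
            for link in url.findall("xhtml:link", NS)
            if link.get("rel") == "alternate"
        }
    return result


html_sample = '<head><link rel="alternate" hreflang="en-gb" href="https://example.com/uk/"></head>'
header_sample = (
    '<https://example.com/de/>; rel="alternate"; hreflang="de", '
    '<https://example.com/en/>; rel="alternate"; hreflang="en"'
)
sitemap_sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/</loc>
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/"/>
  </url>
</urlset>"""

html_alternates = from_html(html_sample)
header_alternates = from_link_header(header_sample)
sitemap_alternates = from_sitemap(sitemap_sample)
```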

Once all of this information is collected, Sitebulb can return useful hints that check the validity of hreflang annotations and HTML lang attributes against the recognized standard.
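As a rough idea of what a validity check involves, the sketch below tests whether an hreflang value has the expected shape: an ISO 639-1 language code, optionally followed by an ISO 15924 script and an ISO 3166-1 Alpha 2 region, or the special value x-default. This is only a shape check under stated assumptions, not Sitebulb’s logic; it does not look codes up in the full ISO lists, and `KNOWN_BAD_REGIONS` is a tiny illustrative table of our own.

```python
import re

# language, optional 4-letter script, optional 2-letter region
HREFLANG_SHAPE = re.compile(r"^[a-z]{2}(-[a-z]{4})?(-[a-z]{2})?$")

# Illustrative only: "uk" is a frequent mistake for the United Kingdom,
# whose ISO 3166-1 Alpha 2 code is "gb".
KNOWN_BAD_REGIONS = {"uk": "gb"}


def check_hreflang(value):
    """Return a list of problems; an empty list means the value looks well formed."""
    tag = value.strip().lower()
    if tag == "x-default":
        return []
    if not HREFLANG_SHAPE.match(tag):
        return [f"'{value}' does not look like language[-script][-region]"]
    parts = tag.split("-")
    region = parts[-1] if len(parts) > 1 and len(parts[-1]) == 2 else None
    if region in KNOWN_BAD_REGIONS:
        return [f"region '{region}' should probably be '{KNOWN_BAD_REGIONS[region]}'"]
    return []


for value in ["en", "en-GB", "zh-Hant-TW", "x-default", "en_GB", "en-UK", "english"]:
    problems = check_hreflang(value)
    print(value, "->", problems or "ok")
```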

Reading the International Report

Within the report, you’ll see an array of charts and graphs. Navigate through these to find key data points, including information about both hreflang tags and HTML language tags.

At the top of this report, you’ll find links to two tables containing every hreflang tag and every HTML lang tag found on the site respectively. These tables provide an overview of the validity of each tag, the number of URLs targeted, and the most prevalent issues, such as hreflang annotations pointing to redirected or broken URLs.

International report - hreflang geotargeting summary

Within the report overview, you can navigate to the relevant URL Lists:

International Report - viewing URL lists

  • URLs with Hreflang - All HTML URLs with at least one incoming or outgoing hreflang annotation.
  • URLs Missing Hreflang - All HTML URLs with no incoming or outgoing hreflang annotations.
  • Unique Hreflang - Unique hreflang variations, including invalid hreflang.
  • External Hreflang - External hreflang URLs found through an outgoing hreflang.

When viewing, you’ll be able to customize each of these URL lists to find the exact data you need, by adding or removing data columns and applying advanced filters.

International Report charts

Within the International report, Sitebulb breaks down the key attributes and annotations data in handy (and dazzlingly colorful) graphs.

URLs

The URLs chart provides a visual comparison between the hreflang and HTML lang coverage across the site. URLs with missing annotations will also be represented here (in red) if they exist on your website.

Hosts (or Hostnames)

Since Sitebulb follows and crawls every alternate URL found in the annotation, your international report coverage will also include referencing URLs from separate domains and subdomains. This gives you the complete picture of the setup and validity of your hreflang annotations.

hreflang hosts

Hreflang Annotations & HTML Lang Attributes

These charts show the distribution of hreflang and HTML lang attributes respectively across all URLs, including URLs with no annotation.

Hreflang Annotations & HTML Lang Attributes charts

You can use the ‘View Data Table’ button for a simplified table view of the data, or hover over the graph to see a breakdown and click through to the specific data segment.

View data table and URLs list

Hreflang Annotation Languages & HTML Lang Languages

These two graphs show the distribution of languages as defined by the hreflang annotations and HTML lang attributes respectively. Since different geographical areas may use the same language code, the distribution in these graphs will look different from the ones above.

But the two graphs should in most cases look similar to each other, as the language declared in a page’s hreflang annotation is expected to match the language declared in its HTML lang attribute. If they don’t, Sitebulb will flag this with a dedicated hint: ‘Mismatched hreflang and HTML lang declarations’.
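To make the idea of a mismatch concrete, here is a small sketch that compares the language part of a page’s self-referencing hreflang value with the language part of its HTML lang attribute. The `pages` records and field names are hypothetical, and the comparison rule (primary language subtag only) is an assumption rather than Sitebulb’s documented behavior.

```python
def primary_language(tag):
    """Return the lowercase primary language subtag, e.g. 'en' for 'en-GB'."""
    return tag.strip().lower().split("-")[0]


def find_mismatches(pages):
    mismatched = []
    for page in pages:
        hreflang, html_lang = page["self_hreflang"], page["html_lang"]
        # x-default declares no language, and a missing HTML lang
        # is a separate issue, so both are skipped here.
        if hreflang.lower() == "x-default" or not html_lang:
            continue
        if primary_language(hreflang) != primary_language(html_lang):
            mismatched.append(page["url"])
    return mismatched


# Hypothetical crawl records.
pages = [
    {"url": "https://example.com/en/", "self_hreflang": "en-GB", "html_lang": "en"},
    {"url": "https://example.com/de/", "self_hreflang": "de", "html_lang": "en"},
    {"url": "https://example.com/", "self_hreflang": "x-default", "html_lang": "en"},
]

mismatches = find_mismatches(pages)
print(mismatches)
```

Here only the /de/ page is reported, because its hreflang says German while its HTML lang says English.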

Hreflang Annotation Languages & HTML Lang Languages charts

Hreflang Annotation Regions & HTML Lang Regions

Like the graphs above, the two region graphs illustrate the region distribution defined by the hreflang annotations and HTML lang attributes.
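One way to think about the language and region graphs is as two different groupings of the same annotation values. The sketch below shows this with a hypothetical list of hreflang values; the `split_tag` helper is our own and simply treats a trailing two-letter subtag as the region.

```python
from collections import Counter

# Hypothetical hreflang values found across a site.
hreflang_values = ["en-gb", "en-us", "en-au", "de-de", "de-at", "es"]


def split_tag(value):
    """Split a value like 'en-gb' into ('en', 'gb'); region is None if absent."""
    parts = value.lower().split("-")
    language = parts[0]
    region = parts[-1] if len(parts) > 1 and len(parts[-1]) == 2 else None
    return language, region


# Several annotations can share one language, so the language
# distribution is coarser than the distribution of full values.
languages = Counter(split_tag(v)[0] for v in hreflang_values)
regions = Counter(split_tag(v)[1] or "(none)" for v in hreflang_values)

print(languages)
print(regions)
```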

Hreflang Annotation Regions & HTML Lang Regions charts

How to find issues with your hreflang annotation

The data in the International report overview will give you a good starting point for analyzing your international annotations.

Sitebulb also provides 25 different International hints, which flag common issues that may be undermining your hreflang and HTML lang declarations. You’ll find the International hints at the top of your report overview. From here, you can navigate to the relevant list of URLs for each hint to understand which pages are affected.

International report hints

The International Report Hints include checks for:

  • The validity of incoming and outgoing annotation
  • Broken or redirected hreflang links
  • Incoming and outgoing annotation to noindexed and canonicalized URLs
  • Conflicting hreflang annotation
  • Invalid HTML lang attributes
  • Mismatched hreflang and HTML lang declarations
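Several of these checks come down to inspecting the URL an annotation points to. As a hedged illustration, the sketch below classifies an hreflang target using hypothetical crawl records; the `CrawledUrl` type, its fields, and the problem labels are assumptions and do not correspond to Sitebulb’s hint names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawledUrl:
    status: int                      # HTTP status code returned by the URL
    noindex: bool = False            # True if the page carries a noindex directive
    canonical: Optional[str] = None  # None means no canonical was declared


def target_problems(url, crawl):
    """Return a list of reasons why `url` is a poor hreflang target."""
    page = crawl.get(url)
    if page is None:
        return ["not crawled"]
    problems = []
    if 300 <= page.status < 400:
        problems.append("redirected")
    elif page.status >= 400:
        problems.append("broken")
    if page.noindex:
        problems.append("noindex")
    if page.canonical and page.canonical != url:
        problems.append("canonicalized")
    return problems


# Hypothetical crawl results keyed by URL.
crawl = {
    "https://example.com/en/": CrawledUrl(status=200, canonical="https://example.com/en/"),
    "https://example.com/de/": CrawledUrl(status=301),
    "https://example.com/es/": CrawledUrl(status=404),
    "https://example.com/fr/": CrawledUrl(status=200, noindex=True, canonical="https://example.com/en/"),
}

for url in crawl:
    print(url, "->", target_problems(url, crawl) or "ok")
```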

You can find a full list of the International hints, along with detailed explanations for each of these, in our Hints Explanations section.

Exporting and analyzing Hreflang data

All of the data gathered by Sitebulb’s International report is packaged in one handy set of spreadsheets, which you can download from the top of your report.

The export includes useful data and visualizations that are not available within the UI itself.

For example, the hreflang cluster matrix makes it easy to understand and compare the international coverage for each URL.
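If you want a feel for what a matrix like this contains, the sketch below builds a simple one from hypothetical annotation data: one row per URL, one column per hreflang value, and an empty cell wherever a URL has no alternate for that value. The exact layout of Sitebulb’s export is not reproduced here; the data and the `build_matrix` helper are assumptions.

```python
import csv
import io

# Hypothetical crawl data: page URL -> {hreflang value: alternate URL}.
annotations = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    },
    "https://example.com/de/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
        "es": "https://example.com/es/",
    },
}


def build_matrix(annotations):
    """Return a header row followed by one row per URL."""
    columns = sorted({lang for alts in annotations.values() for lang in alts})
    header = ["URL"] + columns
    rows = [
        [url] + [alts.get(lang, "") for lang in columns]
        for url, alts in annotations.items()
    ]
    return [header] + rows


matrix = build_matrix(annotations)

# Write the matrix as CSV so it can be opened in a spreadsheet.
buffer = io.StringIO()
csv.writer(buffer).writerows(matrix)
print(buffer.getvalue())
```

The empty cell in the /en/ row makes the gap easy to spot: that page has no ‘es’ alternate, while the /de/ page does.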

Hreflang cluster matrix in the International report export

The ‘Invalid Incoming Annotations’ and ‘Invalid Outgoing Annotations’ tabs contain every URL with invalid annotations alongside the respective reciprocal URLs containing those tags.

Invalid Outgoing Annotations in international report export

Analyzing individual URLs

Once you run an international audit, you will also be able to see International data in the ‘URL Details’ view. This will allow you to analyze individual URLs, understand their reciprocal relationships with other pages, and uncover implementation issues.

International URL details