Audit settings

When you set up a Sitebulb audit, you can customise in fine detail which data Sitebulb collects when auditing a website.

The audit setup contains a lot of options, which break down into three 'zones'.

On this page we will run through the 'Audit Settings' options, which are marked in red as 'Extra Data Options' in the image below.

Sitebulb Data Options

Audit Settings

When you first open the audit setup screen, 'Audit Data' is selected on the left-hand side, and its options are shown on the right-hand side. There are lots of Audit Data options, so they have a dedicated documentation page. All the other options are covered below.

Audit Settings

By default, most options in the left-hand menu have a greyed-out tick alongside them, which means they have not been enabled. Once you enable an option, it shows a green tick instead.

If an option requires attention, it will have a red alert marker - for example, if you have selected Google Analytics but not chosen an account:

No GA account selected

When you select any one of the menu options on the left, the right-hand panel changes to show further configuration options.

Google Analytics (optional)

You can connect to a Google Analytics account to access visit, engagement and conversion data for each URL.

In terms of setup, the most important thing to check is the Property and View selections. Sitebulb will attempt to auto-select the right View based on the Google Analytics tracking ID found on the page, but sometimes you may wish to select a different View.

In the configuration section, you can also adjust the date range used for Google Analytics data and select to crawl any other URLs found in Google Analytics (that were not found by the crawler).

Select GA Account

Google Search Console (optional)

You can connect to a Google Search Console account to access Search Analytics, Keywords and Sitemap data.

In terms of setup, the most important thing to check is the Property selection. Sitebulb will attempt to auto-select the right Property by matching up the start URL with the properties in the account, but sometimes you may wish to select a different Property.

In the configuration section, you can also adjust the date range used for Google Search Console data and select to crawl any other URLs found in Google Search Console (that were not found by the crawler).

Google Search Console

The final option for Google Search Console is 'Analyse Google Search Console Keywords'. Ticking this activates the Keywords report, which means Sitebulb will extract keyword data from the Search Console API, including clicks, impressions and CTR. The box at the bottom allows you to enter brand keywords, so that Sitebulb can group the data into brand and non-brand keywords.
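To illustrate what grouping by brand and non-brand keywords means in practice, here is a minimal Python sketch. It is not how Sitebulb implements the report: the row format, the `group_by_brand` function and the simple substring match are all assumptions made for the example.

```python
from collections import defaultdict

def group_by_brand(rows, brand_terms):
    """Aggregate keyword rows into 'brand' and 'non-brand' buckets."""
    # Assumption: a query counts as 'brand' if it contains any brand term.
    terms = [t.lower() for t in brand_terms]
    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0})
    for row in rows:
        query = row["query"].lower()
        bucket = "brand" if any(t in query for t in terms) else "non-brand"
        totals[bucket]["clicks"] += row["clicks"]
        totals[bucket]["impressions"] += row["impressions"]
    # CTR is clicks divided by impressions; guard against zero impressions.
    for stats in totals.values():
        imps = stats["impressions"]
        stats["ctr"] = stats["clicks"] / imps if imps else 0.0
    return dict(totals)

# Hypothetical keyword rows, shaped like query-level Search Console data.
rows = [
    {"query": "sitebulb pricing", "clicks": 30, "impressions": 100},
    {"query": "website crawler", "clicks": 5, "impressions": 200},
    {"query": "Sitebulb download", "clicks": 20, "impressions": 100},
]
summary = group_by_brand(rows, ["sitebulb"])
```

Here the two queries containing "sitebulb" land in the brand bucket, and the CTR for each bucket is recalculated from the summed clicks and impressions rather than averaged across rows.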

Keyword Analysis

Crawl Sources

This is the only audit setting in Sitebulb that is not optional. Sitebulb needs at least one crawl source, otherwise it cannot crawl!

The default setting is for Sitebulb to crawl the website, so this option is always ticked to begin with. However, Sitebulb can also be configured to crawl XML Sitemap URLs and/or a provided URL List.

Crawl Sources

Crawl Website

This is a pretty straightforward option - Sitebulb will perform a website crawl, following links on every page to discover new URLs, until every page on the website is crawled.
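As a rough picture of what "following links to discover new URLs" involves, the sketch below runs a breadth-first crawl over a tiny in-memory site. The `crawl` function, the `get_links` callback and the `site` dictionary are hypothetical stand-ins for real page fetching and link extraction, not Sitebulb's crawler.

```python
from collections import deque

def crawl(start_url, get_links):
    """Breadth-first crawl: follow links until no new URLs are discovered."""
    seen = {start_url}           # every URL discovered so far
    queue = deque([start_url])   # URLs discovered but not yet visited
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# Hypothetical in-memory site standing in for real HTTP fetches.
site = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/about"],
    "/blog/post-1": ["/"],
}
crawled = crawl("/", lambda url: site.get(url, []))
```

The `seen` set is what stops the crawl from looping forever on pages that link back to each other. It also shows why a page with no inbound links is never reached by this kind of crawl.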

XML Sitemaps

Sitebulb will crawl URLs found in XML Sitemaps that were not already found in the main crawl. It will also analyse the sitemap URLs and compare them against the URLs found by the crawler.
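Conceptually, comparing sitemap URLs with crawled URLs is a set comparison. The following sketch shows the idea under simple assumptions: the `compare_sources` function and the example URLs are invented for illustration, and URLs are compared as exact strings.

```python
def compare_sources(crawled_urls, sitemap_urls):
    """Split URLs by which crawl source they were found in."""
    crawled, sitemap = set(crawled_urls), set(sitemap_urls)
    return {
        "in_both": sorted(crawled & sitemap),
        "only_in_sitemap": sorted(sitemap - crawled),  # not reached by links
        "only_in_crawl": sorted(crawled - sitemap),    # missing from sitemap
    }

# Hypothetical example data.
report = compare_sources(
    ["https://example.com/", "https://example.com/about"],
    ["https://example.com/", "https://example.com/orphan"],
)
```

URLs that only appear in the sitemap are the ones an XML Sitemap crawl source adds on top of the website crawl.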

Any XML Sitemaps referenced in robots.txt will be pre-filled when you select this option. You can also add in multiple sitemap URLs or sitemap files using the various upload options.
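If you want to check for yourself which sitemaps a robots.txt file references, Python's standard library can read the `Sitemap:` lines. This is only an illustrative sketch using a made-up robots.txt; it says nothing about how Sitebulb reads the file internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for example.com.
robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() (Python 3.8+) returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []
```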

Add Sitemap URLs

URL List

Sitebulb can also 'crawl' based on a list. It isn't strictly crawling, as links from the pages will not be followed, but the data will be collected and analysed for all URLs contained in the list. Typically, URL Lists are used when you DON'T also crawl the website, in order to audit a specific area or section of the site.

One thing to note is that Sitebulb will only crawl URLs that match the subdomain of the start URL provided (so you can't just upload a massive list of URLs from lots of different sites).
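If you want to pre-filter a list before uploading it, a check along these lines will do it. This is a sketch that assumes "matching the subdomain" means an exact hostname match with the start URL; the `filter_to_start_host` function and the URLs are hypothetical.

```python
from urllib.parse import urlsplit

def filter_to_start_host(start_url, url_list):
    """Keep only URLs whose hostname equals the start URL's hostname."""
    start_host = urlsplit(start_url).hostname  # .hostname is lower-cased
    return [u for u in url_list if urlsplit(u).hostname == start_host]

kept = filter_to_start_host(
    "https://www.example.com/",
    [
        "https://www.example.com/products",
        "https://blog.example.com/post",    # different subdomain
        "https://other-site.com/",          # different site
        "https://WWW.example.com/contact",  # same host, different case
    ],
)
```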

To add a URL List, simply upload it from your local computer.

URL List Crawl Source

Content Extraction (optional)

Sitebulb will collect specific content elements from the HTML, based on custom rules that you define using CSS paths. Typically, the reports you can get from these options would be considered tangential to SEO auditing.
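To give a feel for what a rule such as `span.price` selects, here is a deliberately limited sketch using only Python's standard library. It handles a single `tag.class` pattern rather than full CSS paths, and the `ClassTextExtractor` class and sample HTML are assumptions for illustration, not Sitebulb's extraction engine.

```python
from html.parser import HTMLParser

class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element matching tag.class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag, self.css_class = tag, css_class
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nested elements of the same tag so the match closes correctly.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag:
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1
                self.results.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data

# Hypothetical product page fragment.
html = '<div><span class="price sale">$19.99</span><span class="name">Widget</span></div>'
extractor = ClassTextExtractor("span", "price")
extractor.feed(html)
prices = [text.strip() for text in extractor.results]
```

Note that this only sees what is in the HTML source it is given, so content injected later by JavaScript would not be found.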

We have a complete guide on Content Extraction.

Content Extraction

An important thing to note regarding extraction is that it may influence the crawler you need to select. For example, if 'product price' is added to the page via JavaScript, then you would need to use the Chrome Crawler in order for Sitebulb to pick it up.

Content Search (optional)

Sitebulb will check each URL for words or phrases that you specify via rules, and count the instances it finds for each rule. Optional advanced configuration allows you to combine multiple words or phrases.
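The basic idea of counting instances per rule can be sketched as follows. The rule names, the `count_matches` function and the case-insensitive matching are assumptions for the example; Sitebulb's own matching options may differ.

```python
import re

def count_matches(text, rules):
    """Return {rule_name: count} using case-insensitive phrase matching."""
    counts = {}
    for name, phrase in rules.items():
        # re.escape treats the phrase as literal text, not as a regex.
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        counts[name] = len(pattern.findall(text))
    return counts

# Hypothetical page text and rules.
page_text = "Out of stock. This item is out of stock until May. Free delivery available."
rules = {
    "out_of_stock": "out of stock",
    "free_delivery": "free delivery",
    "sale": "sale",
}
counts = count_matches(page_text, rules)
```

A rule with a count of zero is often as useful as a positive count, for example when checking that a required phrase is missing from certain pages.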

We also have a complete guide on Content Search.

Content Search

Sitebulb's Recommendation: Don't tick everything

Some users are tempted to tick every box they can, figuring they can just ignore any data they don't want or need. In general, this is not a great idea.

Every checkbox you tick will require Sitebulb to do more processing, which means the audit will take more time and will use more computer resources. On some computers, ticking every single box will mean that it is very difficult to continue doing other tasks.

In particular, Performance and Accessibility are CPU intensive, so only select them if you actually care about the data.