When you set up a Sitebulb audit, you can customise, in fine detail, the data that Sitebulb collects when auditing websites.
There are lots of options in the audit setup, which break down into these three 'zones'.
On this page we will run through the 'Audit Data' options, which are marked in red as 'Main Data Options' in the image below.
By default, certain options are ticked, so if you just go ahead and start your audit without adjusting anything, Sitebulb will crawl your website and collect data regarding SEO, Page Resources and Security.
Sitebulb will collect core on-site SEO data, such as internal links and indexability signals. If you were a Sitebulb user prior to version 5, these settings were effectively 'always on' and you could not switch them off.
In the Advanced Settings, you can click to include or exclude individual data options, which you may wish to do in order to save time and CPU resources:
Sitebulb carries out its performance analysis directly with headless Chrome, which means the Chrome Crawler is required. If you have the HTML Crawler selected, you will see the message below. You can switch to the Chrome Crawler in the Crawler Settings.
With Performance & Mobile Friendly enabled, Sitebulb will run performance and mobile-friendly analysis for every URL, highlighting opportunities and diagnostic issues. Sitebulb will also collect Web Vitals metrics for a sample of URLs.
Enabling this option will automatically open up the Advanced Settings, which allow you to change the sampling for Web Vitals (default selection 10%) and toggle the Code Coverage and Technology options.
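To make the sampling setting concrete, here is a minimal sketch of what a 10% sample of crawled URLs looks like. It is an illustration only: the `sample_urls` function, the fixed seed and the example URLs are assumptions, and this page does not describe how Sitebulb actually chooses which URLs to sample.

```python
import random

def sample_urls(urls, percent=10, seed=0):
    """Pick roughly `percent`% of the URLs (hypothetical helper, not Sitebulb's logic)."""
    if not urls:
        return []
    # Always keep at least one URL so small sites still get a measurement.
    count = max(1, round(len(urls) * percent / 100))
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(urls, count)

# Made-up crawl of 200 URLs; a 10% sample selects 20 of them.
crawled = [f"https://example.com/page-{i}" for i in range(200)]
sampled = sample_urls(crawled, percent=10)
print(len(sampled))
```

Raising the percentage gives Web Vitals data for more URLs, at the cost of a longer and more CPU-intensive audit.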
Sitebulb will collect structured data and validate it against both Schema.org guidelines and Google's guidelines for their Search result features.
We also have comprehensive guides on auditing Structured Data.
Sitebulb will perform server analysis on protocols and certificates, in addition to checking every URL for on-page security issues and vulnerabilities.
Sitebulb will crawl URLs specified in hreflang annotations (even if they are on different domains), and check the validity of hreflang and HTML lang attributes.
Sitebulb will crawl any AMP URLs found, and check that they are valid and reciprocal.
You can use the Advanced Settings to toggle crawling pure AMP URLs (on sites that only use AMP pages).
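As an illustration of what 'reciprocal' means for AMP, the sketch below checks that every page declaring an AMP version (via `rel="amphtml"`) is pointed back to by that AMP page's `rel="canonical"`. The data structure, function name and example URLs are hypothetical, and this is not a description of Sitebulb's internal checks.

```python
def amp_pairing_issues(pages):
    """Find non-reciprocal AMP pairs.

    `pages` maps each crawled URL to the link targets found on it, e.g.
    {"amphtml": <AMP URL or None>, "canonical": <canonical URL or None>}.
    """
    issues = []
    for url, links in pages.items():
        amp_url = links.get("amphtml")
        if not amp_url:
            continue  # this page does not declare an AMP version
        amp_page = pages.get(amp_url)
        if amp_page is None:
            issues.append((url, amp_url, "AMP URL was not crawled"))
        elif amp_page.get("canonical") != url:
            issues.append((url, amp_url, "AMP page does not canonicalise back"))
    return issues

# Made-up crawl data: /a is paired correctly, /b is not.
pages = {
    "https://example.com/a": {"amphtml": "https://example.com/a/amp",
                              "canonical": "https://example.com/a"},
    "https://example.com/a/amp": {"canonical": "https://example.com/a"},
    "https://example.com/b": {"amphtml": "https://example.com/b/amp",
                              "canonical": "https://example.com/b"},
    "https://example.com/b/amp": {"canonical": "https://example.com/other"},
}
issues = amp_pairing_issues(pages)
print(issues)
```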
Sitebulb carries out its accessibility analysis directly with headless Chrome, which means the Chrome Crawler is required. If you have the HTML Crawler selected, you will see the message below. You can switch to the Chrome Crawler in the Crawler Settings.
With Accessibility enabled, Sitebulb will run over 50 automated accessibility checks, across every page on the website. It will highlight accessibility violations and identify opportunities to make your web pages more inclusive and user-friendly.
In the section above we covered the 'Audit Data' options, which take up the right hand side of the screen when you first open the audit setup screen.
Now we will cover the small menu on the left hand side, which was marked in red as 'Extra Data Options' on the big image at the top.
By default, these typically have a greyed out tick alongside them, which means they have not been enabled. If you go through and enable a particular option, it will have a green tick alongside.
If an option requires attention, it will have a red alert marker - for example, if you have selected Google Analytics but not chosen an account:
When you select any one of the menu options on the left, the right hand panel will change, to show further configuration options.
You can connect to a Google Analytics account to access visit, engagement and conversion data for each URL.
In terms of setup, the most important thing to check is the Property and View selections. Sitebulb will attempt to auto-select the right View based on the Google Analytics tracking ID found on the page, but sometimes you may wish to select a different View.
In the configuration section, you can also adjust the date range used for Google Analytics data and select to crawl any other URLs found in Google Analytics (that were not found by the crawler).
You can connect to a Google Search Console account to access Search Analytics, Keywords and Sitemap data. Additionally, you can select to crawl any other URLs found in Search Analytics (that were not found by the crawler).
In terms of setup, the most important thing to check is the Property selection. Sitebulb will attempt to auto-select the right Property by matching up the start URL with the properties in the account, but sometimes you may wish to select a different Property.
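To show why the auto-selected Property is worth double-checking, here is one plausible way a start URL could be matched against the properties in an account. This is a sketch under stated assumptions, not Sitebulb's actual matching logic: the `pick_property` function and the example properties are hypothetical. It prefers the longest matching URL-prefix property and falls back to a Domain property (written with the `sc-domain:` prefix).

```python
from urllib.parse import urlparse

def pick_property(start_url, properties):
    """Return the most specific property covering start_url, or None (illustrative only)."""
    # URL-prefix properties: the longest matching prefix is the most specific.
    prefix_matches = [p for p in properties
                      if not p.startswith("sc-domain:") and start_url.startswith(p)]
    if prefix_matches:
        return max(prefix_matches, key=len)
    # Domain properties cover the domain itself and all of its subdomains.
    host = urlparse(start_url).hostname or ""
    for p in properties:
        if p.startswith("sc-domain:"):
            domain = p[len("sc-domain:"):]
            if host == domain or host.endswith("." + domain):
                return p
    return None

# Made-up account with three properties.
properties = [
    "https://example.com/",
    "https://example.com/blog/",
    "sc-domain:example.org",
]
print(pick_property("https://example.com/blog/post", properties))
print(pick_property("https://shop.example.org/", properties))
```

When several properties cover the same start URL, as with the first two above, any automatic choice is a guess about which data you want, which is why you may wish to select a different Property by hand.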
In the configuration section, you can also adjust the date range used for Google Search Console data and select to crawl any other URLs found in Google Search Console (that were not found by the crawler).
The final option for Google Search Console is 'Analyse Google Search Console Keywords', and by ticking this you activate the Keywords report. This means that Sitebulb will extract keyword data from the Search Console API, including clicks, impressions and CTR. The bottom box allows you to enter brand keywords, which then allows Sitebulb to group the data by brand or non-brand keywords.
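The brand grouping can be pictured with a short sketch. It assumes a simple rule, that a query counts as 'brand' if it contains any of the brand keywords you entered, and it uses made-up rows. The function name and row layout are hypothetical, and Sitebulb's own matching rules may differ.

```python
def group_by_brand(rows, brand_terms):
    """Total clicks and impressions for brand and non-brand queries, then derive CTR."""
    totals = {
        "brand": {"clicks": 0, "impressions": 0},
        "non-brand": {"clicks": 0, "impressions": 0},
    }
    terms = [t.lower() for t in brand_terms]
    for row in rows:
        query = row["query"].lower()
        # Assumed rule: case-insensitive substring match on any brand term.
        bucket = "brand" if any(t in query for t in terms) else "non-brand"
        totals[bucket]["clicks"] += row["clicks"]
        totals[bucket]["impressions"] += row["impressions"]
    for stats in totals.values():
        # CTR is clicks divided by impressions; guard against empty buckets.
        stats["ctr"] = stats["clicks"] / stats["impressions"] if stats["impressions"] else 0.0
    return totals

# Made-up keyword rows in the shape of a Search Analytics export.
rows = [
    {"query": "sitebulb pricing", "clicks": 40, "impressions": 200},
    {"query": "website crawler", "clicks": 10, "impressions": 1000},
    {"query": "Sitebulb vs others", "clicks": 10, "impressions": 50},
]
totals = group_by_brand(rows, ["sitebulb"])
print(totals)
```

Note that CTR is recomputed from the summed clicks and impressions of each group, rather than averaged across individual keywords.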
The 'Extraction' options, shown below, are available with either the HTML Crawler or the Chrome Crawler. Typically, the reports you can get from these options would be considered tangential to SEO auditing.
Some users are tempted to tick every box they can, figuring they can just ignore any data they don't want or need. This is not a great idea, in general.
Every checkbox you tick will require Sitebulb to do more processing, which means the audit will take more time and will use more computer resources. On some computers, ticking every single box will mean that it is very difficult to continue doing other tasks.
In particular, Performance and Accessibility are CPU intensive, so only select them if you actually care about the data.