Released on 25th November 2021
- Some users were having issues exporting to Google Sheets (in particular 'All Hints'), so we rebuilt the upload system to work more reliably.
- Fixed a bug on the Crawler Settings when setting a URL Limit over 1 million URLs.
Released on 18th November 2021
#1 Added an optional 'audit completed' email alert
We get it, it's 2021 now. SEOs are busy people, attention taken by machine learning, and NLP, and AI, and y'know... Twitter. It's no wonder folks are forgetting to check if their favourite SEO tool has finished crawling yet.
Well now you don't have to! When setting up an audit, head to Crawler Settings from the left hand menu, and you'll see a new option to 'Send Email When Finished.' Enter your email address in this box (or multiple addresses, comma separated) and Sitebulb will send you a friendly email whenever your audit is finished.
Pairs perfectly with a nice scheduled audit.
#2 Added more ways to view image alt text
Sitebulb team: 'Hey SEOs! We've released a revolutionary new way to compare response HTML to rendered HTML. Nice huh?'
SEOs: 'Is that it?'
Sitebulb team: 'Erm, actually, no! We also added Web Vitals metrics that are collected while you crawl!'
Sitebulb team: 'Don't act like you're not impressed.'
SEOs: 'This stuff is great and all, but does it really move the needle?'
Sitebulb team: 'What is it you want to see?'
SEOs: 'Image alt text'.
Sitebulb team: 'Oh.'
Sitebulb has already been collecting image alt text, to be fair, but not doing a great job of making it accessible. So we've made it a lot easier to find, chucking it bang slap in the middle of the On Page report.
We also now provide various different ways to interrogate the data too.
I hope you're happy in the end.
#3 Re-jigged 'Readability' setting and turned it off by default
With v5 we switched to a new audit setup workflow, and it's fair to say that we mangled the 'Advanced Settings' options for SEO. We'd made 'Readability' an option you could switch on/off, yet if you did switch it off, Sitebulb would not collect on-page data like h1s. This caused levels of bewilderment and head-scratching not seen since Man Utd appointed Ole Gunnar Solksjaer as manager of an actual real life football team.
So we have adjusted this to ensure that all on-page data is collected by default, and the readability on/off switch only relates to readability and sentiment scoring.
We have also made this switched off by default. Pourquoi? Because analysing this data is quite expensive (computationally), which puts greater strain on your computer (and your carbon footprint), and is only really necessary when doing content audits.
If you do want this data, simply check the box to switch it on.
#4 Added a search field to GA property list
Sometimes, we here at Sitebulb underestimate how wildly successful our customers are (which of course should be directly attributed to Sitebulb), and we had not anticipated that they would have Google Analytics accounts with literally thousands of website properties.
And that they were too busy and important to scroll their mouse.
Following our latest update, these disgustingly rich and successful users can simply type into the search box at the top of the dropdown, and have the property served up to them on a silver platter, exactly how they like it.
#5 Improved the 'support window' in the tool
Every four or five years, we find that a user has actually had an issue with Sitebulb and wishes to report a bug. I know this sounds far-fetched, but please stay with me.
Since we are committed to 100% bug free software, we encourage any and all such feedback, however infrequent. So we have a little 'Contact Support' option in the 'Help & Support' dropdown menu:
This opens a modal which allows you to send us a nice message, or report a bug (unlikely). We have redesigned the modal window that appears, offering self-help suggestions from our documentation:
We have also adjusted how this modal loads in, so it no longer takes you away from the thing you were doing. Rejoice!
#6 Updated all the structured data validation
If you've not been paying attention, earlier this year we launched a free service that alerts you to changes in Google Rich Result requirements and Schema.org updates. I signed Gareth up for the alerts system, so now he just goes and updates Sitebulb based on the alerts that hit his inbox. There's pretty much nothing to dev work, IMO.
#7 Click through to URL Lists of sitemap URLs
Within the XML Sitemaps report, you can now click through to URLs from a specific sitemap, via these linked values in the sitemap list. This was from one of those Twitter feature requests that made us go, 'oh yeah, that makes sense.'
#8 Added filtering options to content search and extraction
Content search and content extraction 'URL' results now allow you to do the same kind of advanced filtering as you can do on normal URL Lists, which to be honest we probably should have just added from the beginning.
#9 Updated Chrome to version 93
Per our commitment to evergreen Chromium, Sitebulb is now using version 93, for the best crawling and rendering experience available on earth.
- We've been getting a bunch of users inadvertently ending up in a state where all new Sitebulb audits get queued. Even if there was nothing to queue behind. As a British company we are proud of our heritage and longstanding reputation as magnificent queuers, but even we thought this was taking it a little too far.
- A rather annoying hreflang issue that meant Sitebulb would not list URLs on different ccTLDs, falsely claiming that your site was responsible for all sorts of heinous crimes.
- Sitebulb was getting stuck failing to crawl some sites due to 'handshake failed' errors. I don't really know what anyone bloody expected, we've not been allowed to shake hands for literally years now, this was bound to happen eventually.
- Content extraction would get screwed up whenever you tried to use attributes. Firstly, it would not save the attribute you were using (less than ideal), and secondly it would not display correctly in the tool (also marginally below par).
- A website was getting errors that read 'object reference not set to an instance of an object.' This is an error that .NET spits back when trying to access something that has not been properly defined, and as 'chief tester', I have a hate/hate relationship with this error. In fact I would probably say it is my sworn enemy.
- The 'All Hints' export, famed for it's quite delightful integration with Google Sheets, was spitting back some 'pre-v5' performance Hints when you did the CSV export. As an Excel lover I do cherish other users sticking so steadfastly with CSVs, but even I must admit that the Sheets version is waaay better (BTW we did actually fix the CSV bug).
Released on 29th July 2021
We have an unwarranted reputation as 'Mac haters'. Truly, I do not know where such propaganda stems from, as we have nothing but love for Macs, and I challenge any individual to present evidence of a contradictory position.
Anyway, despite our predilection for all things Mac, in our last update we managed to introduce a bug which meant that some websites would not crawl on Macs, when they would crawl on Windows. This update includes a fix for these crawling issues, so if you are an unfortunate Mac user, please make sure to update!
Added a veritable panoply of structured data updates;
- Added an error to priceRange when it is greater than 100 characters.
- Removed @id from LocalBusiness
- Added SeekToAction to VideoObject
- Added directApply added to JobPosting
- Added funder added to Dataset
- Updated to Schema.org version 13
- Added gtin to Product
- Added BackOrder to Product availability
When working on these updates, Gareth sent a message to our group WhatsApp, asking me to add a Jira ticket whenever I add a new structured data alert to our change history page.
My response was typically professional and supportive:
- Aforementioned Mac crawling issue was fixed, and to not mention it here would be to deny my completist tendencies, which I am not wont to do.
- The 'All Hints' report was not working in conjunction with our new 'pick and mix' audit options, and would only show if 'Search Engine Optimization' was ticked. It will now work whatever selection you choose. Personally I like a mix of the fizzy ones and white mice. Don't judge me.
- When connecting Google Search Console, Sitebulb was having issues recognising domain properties.
- Sitebulb was incorrectly flagging pages with HTML lang as 'not having HTML lang'. When challenged on why this was happening, Gareth's response was 'I had an IF statement the wrong way around.' Which, if you ask me, is kinda worrying...
- Adding the column 'Natural Height' to a URL List would cause an SQL error on some lists. I am not really sure why we don't just include 'Natural Height' on every URL List by default, since height related data is central to almost all SEO strategies (e.g. 'what is the natural height of Barack Obama').
- On sites with AMP equivalents, ticking AMP would make Sitebulb think that HTML pages are actually AMP pages. This would have the unfortunate consequence that Sitebulb would therefore not collect things like link data, structured data, performance data and whatnot. It would have been a more disastrous bug if we had not all concluded that AMP is basically now dead.
- When setting up a sample audit, if you went to the crawler settings, the 'depth limit' controls had mysteriously disappeared.
Released on 12th July 2021 (hotfix)
- Fixed the Hint: 'Images with missing alt text'... so apparently this disappeared in version 5, and we somehow managed to not notice it was gone. I asked Gareth what happened and he started waffling on about nulls, and 0s, 1s, and how you can't null a null and all this shit. TL;DR the Hint had stopped triggering when it should have, and now it is fixed so that it does fire again.
- Fixed an issue in the audit setup where the language setting would not let you add new languages.
- Fixed an issue with structured data which was inaccurately claiming some 'recommended' properties were actually 'required' for the Local Business Search Feature.
Released on 1st July 2021
Wouldn't it be wonderful if, just once, England could beat Germany in the knock-out stages of a major tournament?
Oh, wait hang on. I said the wrong one.
Wouldn't it be wonderful if, just once, Team Sitebulb could manage a major release without the inevitable and embarrassing 'bug fix update' mere days later?
Reaudit from side menu actually now works
Folks using the burger menu on the left hand side of the Projects list in order to 'Re-audit Website' were being treated to a green 'Start Audit' button that did literally nothing. They would press the button and Sitebulb would just sit there like a fucking lemon.
Keywords report no longer completely missing
Considering that one of the few things I actually do around here is 'chief tester', this one has to go down as one of my worst fails this week (top 5 at least I reckon).
The Keywords report was literally not there at all. We were collecting the data, but not even showing it as a menu item on the left.
'Treat subdomains as internal' was not treating subdomains as internal
This feature had literally one job! How do you fuck that up?
Exports from Link Explorer had displaced column headings
In the URL Explorers and Link Explorers we added a new 'Row' column in the UI, to make it easier to scan the data. However we were also pushing this through to the spreadsheet exports, and for some reason we'd forgotten to set the column header for Link Explorer exports. Which meant that the column headings did not line up as they should, and no longer made sense.
Considering that spreadsheets are literally a numbered list of rows and columns, we figured we could probably do without the row data on exports entirely...
Improved message for JS/CSS/images blocked by robots.txt
This is technically an improvement rather than a bug fix, but I'm so embarrassed by the farce that precedes this one that it does not deserve to be considered an 'Update'.
Now, if you crawl a site that has resource files blocked by robots.txt, Sitebulb will tell you what they are in a little notification box:
(Note that the bottom line links to a list of disallowed URLs, if and only if you tick 'Save disallowed URLs' during your audit setup).
Released on 28th June 2021
Every time we do a major version update, I forget how looong it takes. We've been working on this one for pretty much all of 2021, and it represents the biggest change to how Sitebulb actually crawls websites since we first added headless Chrome back in 2018.
Here's a quick bullet list of the changes, with jump-links to make you life easier, should you need them:
- Brand new Performance report
- Lighthouse audit across every page on your site
- Performance budgets
- New audit setup process
- Open audits in new windows
- Single Page Analysis given more of the limelight
- Re-audit without pre-audit
Brand new Performance report
Google's longstanding promise to factor Core Web Vitals data into their page experience signal has sent the industry into a tailspin over recent months, as deadline day approached and every scribe across the land penned an 'everything you need to know' guide. The inevitable adjournment was, in my opinion, further evidence that we will not see widespread change when it does indeed come to be.
With my soliloquy complete, I may now put forward our attempt at make Web Vitals data accessible at scale:
Our journey to this point was tempestuous, with many twists and turns along the way. As we went, we came to certain realisations, and determined some decisions, defining our philosophy:
- CrUX 'field' data is useless for auditing - the data is SO sparse on most websites that trying to collect it at scale is a joke. You can't make decisions based on a table full of 0s, so we do not include field data, and suggest you leave that to Google Search Console.
- Sampling 'lab' data is the way forward - there is no universe where you need to be collecting lab data for every page on your site, sampling is a much more efficient way to collect Web Vitals data.
- Data can be collected directly from Chrome - there is no need to connect to the PageSpeed Insights API or use Lighthouse (both of which require you to hit the page again), we can just collect this data directly from Chrome first time around.
This is a lot to unpack, and to some extent requires a mindset shift from the typical task-based approach to performance auditing, which typically starts with 'bulk grab CWV data for a massive list of URLs.'
As such, we have ended up with a system that allows you to do the following:
- Collect Web Vitals metrics with the Chrome Crawler as you crawl: LCP, CLS, TBT, TTI, FCP and TTFB.
- Crawl the entire site, yet set a sample % for collecting Web Vitals (recommended 10% sample).
- Run the Lighthouse ruleset (Opportunities and Diagnostics) across every single web page crawled (not just the sample).
- Apply Performance Budget calculations based on the resource files found.
I expect these boring words do not do much to inspire, so I'll show you some images instead.
The first thing you see in the new performance report is the big Performance Score at the top, followed by the green/orange/red bar underneath showing the individual URL performance scores.
The big score at the top relates to the overall average score of the website, for all URLs tested. The green/orange/red bar shows how the URLs split out individually.
These scores are based on the following thresholds:
- Good (green) - 90 to 100
- Needs improvement (orange) - 50 to 89
- Poor (red) - 0 to 49
These scores are based ONLY on the URLs that were sampled, and ONLY using Web Vitals metrics.
The scoring methodology is based on the Lighthouse Scoring Calculator, which takes Web Vitals data and translates this into a weighted average:
As you can see, the furthest right column shows the weighting, so the FCP contributes 10% to the score, whereas LCP contributes 25%, etc...
Sitebulb does the same thing - for every URL sampled, it calculates the Web Vitals scores, then applies the weighting like above to end up at a performance score for the URL.
This is just like when you see a single URL score in Lighthouse or PageSpeed Insights:
To see individual URL scores in Sitebulb's performance report, you can click through from the chart:
To end up at a list of URLs, where you can see the performance scores in the third column.
You can also click the orange Performance Data button to isolate a single URL and see all the Web Vitals metrics, along with the score:
And finally just to re-iterate, we have just explored how to dig in and look at the performance score for an individual URL, but the big performance score at the top is the average of all the URLs that were sampled:
Web Vitals lab data
On the audit overview, directly underneath the performance scores, you will see the Web Vitals charts:
It's super important to understand that this is lab data, not field data.
I know I said it at the beginning, but I want to say it again because you probably forgot. Don't beat yourself up about it - you've had a lot on recently.
We all forget marginally important information once in a while, it's fine. In fact, just the other day I forgot Gareth's surname (later on Geoff pointed out that I should just think about Reservoir Dogs if I need a reminder).
Anyways, you can dig into the Web Vitals data further, simply by clicking on the relevant pie chart segment:
This will bring you through to a URL List that contains all the URLs within the segment selected:
At this point, analysing the URLs to identify patterns in the URL structure can allow you to isolate specific HTML page templates that all share a common issue (e.g. 'Poor CLS on product pages').
Lighthouse audit across every page on your site
Of course, we have Hints as well, in the usual place:
These Hints are processed for every single URL crawled, not just the ones sampled for Web Vitals.
The Hints split into two different categories:
- Opportunities, which present optimization opportunities that can help you speed up your site and improve performance, such as reducing server response time.
- Diagnostics, which flag specific problems on your site that need to be addressed, such as pages with enormous payloads.
These reflect the same issues that Lighthouse use, and we have deliberately used the same wording so you can match the issues up when digging in further with Lighthouse.
What this effectively means is that when you run a performance audit with Sitebulb, you can run the Lighthouse ruleset against every single one of the URLs on your website.
Importantly, whereas Lighthouse will only ever give you this data for one URL at a time, Sitebulb will collate the issues so you can identify groups of URLs that have the same problems - and then dig into these groups further to find the worse offenders.
As an example, consider the Hint below, which picks out pages that have images which could use further optimization:
If I click 'View URLs', I can see the URLs which contain images that need further optimization:
The data columns highlighted above help you understand the scale of the issue on each URLs, so you can determine if you want to explore further. For instance, you can sort by the column 'Images not Efficiently Encoded' in order to highlight the worst offenders.
The other blue button, 'View Resources' shows the image files themselves which need further optimization:
This time, the highlighted columns tell you data regarding the specific image URLs themselves. This allows you to sort either by the images which will benefit most from optimization (the column 'Encoding Image Saving' shows you this) or by the images which are most prevalent across the site (the column 'No. Referencing HTML URLs' shows you this).
We held a Core Web Vitals webinar last week, with expert guests Arnout Hellemans and Billie Geena. Try as we might, we could not get Arnout to shut up about the value of utilising performance budgets.
If you've watched that and Arnout failed to convince you, I have also written some performance budgets documentation where I explain what they are and how to use them. So if you find the following a rather brief description, it is not about me being a lazy prick, and more about me trying not to write 5000 words of drivel once again.
The performance budget data is, somewhat surprisingly, located under the Performance Budget tab.
Scrolling down you will see a performance budgets chart:
And then underneath that, the same data in table format:
The first column shows the type of page resource, and the second column shows the maximum 'allowed' size of all the corresponding resources on each URL. So for example looking at the second row, there is an allowed budget of 400KB for scripts. Of all the URLs on the website, 541 were under this 400KB budget (i.e. 'passed') and 21 where over the 400KB budget (i.e. 'failed').
Now, you may be wondering at what point you are expected to give a shit about any of this, and I'll point you to the 5th row down: Images.
It doesn't matter how much important 'CWV' work you do, it just takes one intern one moment to upload one full size 5MB photo and the page is totally fucked.
I don't care if you ignore every word Arnout ever says, or print my article out to use as kindling, just make sure you check the images section of the performance budgets, which will identify any URLs with images that have a large total weight.
Let's click this '32' value to see the pages on the Sitebulb site with high image weight...
Look at the fucking state of those release notes pages!
I'll be honest, there's loads more shit I could tell you about the performance stuff, but you've put up with way too much serious shit already, so I'll just point you at the docs instead if you wish to read further:
- How to audit performance & Web Vitals
- Performance budgets (what they are and how to use them)
- URL sampling (how and why we do it)
New audit setup process
I know what you're thinking, "You fucking charlatan, this isn't a new feature it's just a different way to do the same bastard thing!"
And while I struggle to dispute this assertion, there are a couple of neat features the new setup allows.
The first is that you can now set the device type in the project setup - desktop or mobile - with mobile being the default (mobile first innit):
The second is that the audit options are now completely decoupled, which means you can pick and choose exactly which audit options you want, without being obliged to fill up your hard drive with needless data.
So you can do this sort of thing:
i.e. only collect structured data, without the SEO data which was always on before (on page, links, indexability signals, duplicate content etc...).
This means you can do really specific audits, a lot more quickly and efficiently (think about content extraction on a competitor site just to steal their price data - but you don't give a shit about their SEO or security data... switch on content extraction and turn everything else off).
Open audits in new windows
If you've ever spent your weekends comparing Sitebulb audits (don't pretend like you haven't), you'll have found this task pretty frustrating... before now.
Ctrl+click (Cmd+click on Mac) to open up audits in a new window overlay:
But don't stop with only one! You can keep opening and opening more audits, until you can't see the wood for the audits.
Single Page Analysis given more of the limelight
We've provided a 'single page analysis' tool in Sitebulb for literally years, and almost nobody knew about it. Frankly, we were getting pissed off how often we'd point it out and folks would think we added a new feature. Literally years!!
So we just moved it out of the 'Tools' dropdown menu at the top (emphasising arrows mine).
If you haven't yet tried the tool, maybe this it's about time you should.
We have also now made it available in URL Lists via the (right click) context menu. If you hit 'Single Page Analysis' from here it will open up a new window and run a 'live' Single Page Analysis on the URL in question.
Here's what that looks like:
Note that this is NOT data collected during the audit. This is Sitebulb rendering the page, parsing the HTML and compiling the data live. So if you need to verify an issue has been resolved, this is a great, quick way to do it.
Re-audit without pre-audit
Although some may prefer the flashy features above, this one is my favourite. If you want to re-audit a website using exactly the same configuration as your last audit, you can easily do this now using the dropdown option 'Start with current configuration.'
This skips the pre-audit, and will just start the new audit up instantly. Sooooo much more convenient.
487 Bugs Fixed
We have fixed so many shitty little bugs in this version I actually don't even want to count them, never mind write about them. Especially when you lot will not even give one single fuck about any of them. Also the Belgium - Portugal game is just hotting up...
...so I won't bother. But I will email you individually if you reported a bug we fixed (at some point later today or tomorrow).
Access the archives of Sitebulb's Release Notes, to explore the development of this precocious young upstart: