Release Notes | Sitebulb

August 20, 2025 - v9.0.0

URL Crawl Limits Now Based on HTML URLs

Desktop Update

Up to now, Sitebulb's 'crawl limit' has been calculated on all URLs crawled - which includes external links and page resource URLs, in addition to internal HTML URLs.

The limit I'm referring to is found in the 'Crawler Settings':

Maximum Pages To Audit

But most folks base their understanding of 'how big is my website?' only on internal HTML URLs (i.e. unique pages), so our implementation of these limits was confusing. As an example, you might have an ecommerce store with about 10,000 pages, made up of around 9,000 product pages and 1,000 other pages (categories, subcategories, blog, etc.). But if you run a crawl, Sitebulb could easily find another 1,000 external URLs, and maybe 50,000 page resource URLs (images, CSS, JavaScript, etc.).

So if you set a crawl limit of 10,000, you might actually find that the crawl stopped when only a couple of thousand internal HTML pages had been crawled, because the rest of the limit had been used up by external links and page resources.
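To put rough numbers on that example, here is a back-of-the-envelope sketch in Python. It assumes the three URL types get crawled in proportion to how many of each exist, which is a simplification for illustration and not a description of how Sitebulb actually schedules URLs. The function name is made up for this example.

```python
def html_pages_under_old_limit(limit, internal_html, external, resources):
    """Estimate internal HTML pages crawled when every URL type counts toward the limit.

    Assumption (illustration only): URLs are crawled in proportion to how
    many of each type exist on the site.
    """
    total = internal_html + external + resources
    if total <= limit:
        return internal_html  # the limit is never reached
    return round(limit * internal_html / total)


# The ecommerce example: 10,000 internal HTML URLs, 1,000 external URLs,
# 50,000 page resource URLs, and a crawl limit of 10,000.
estimate = html_pages_under_old_limit(10_000, 10_000, 1_000, 50_000)
print(estimate)  # 1639
```

Under that assumption, only around 16% of the 10,000-URL limit ends up being spent on the pages you actually care about.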

While no one actually complains to us about these sorts of things, we know how annoying they must be in day-to-day usage. So we completely changed our philosophy on how we calculate the limit, and it's now based entirely on the number of HTML pages crawled.
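The difference between the two ways of counting can be sketched in a few lines of Python. This is a toy model, not Sitebulb's crawler: the `CrawledUrl` class, the `kind` labels, and the `crawl_with_limit` function are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CrawledUrl:
    url: str
    kind: str  # hypothetical labels: "internal_html", "external" or "resource"


def crawl_with_limit(urls, limit, count_html_only=True):
    """Return the URLs crawled before the limit is reached.

    count_html_only=False models the old behaviour (every URL counts).
    count_html_only=True models the new behaviour (only internal HTML counts).
    """
    crawled = []
    counted = 0
    for item in urls:
        if counted >= limit:
            break
        crawled.append(item)
        if not count_html_only or item.kind == "internal_html":
            counted += 1
    return crawled


queue = [
    CrawledUrl("/", "internal_html"),
    CrawledUrl("/style.css", "resource"),
    CrawledUrl("/logo.png", "resource"),
    CrawledUrl("/products", "internal_html"),
    CrawledUrl("https://example.org/", "external"),
    CrawledUrl("/blog", "internal_html"),
]

old = crawl_with_limit(queue, limit=3, count_html_only=False)
new = crawl_with_limit(queue, limit=3, count_html_only=True)

print(sum(u.kind == "internal_html" for u in old))  # 1
print(sum(u.kind == "internal_html" for u in new))  # 3
```

With the same limit of 3, the old way of counting stops after a single HTML page because the two resources eat up the rest of the allowance, while the new way of counting reaches all three HTML pages.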

We also updated the Crawl Progress UI to add more clarity around how many HTML pages have been crawled or are due to be crawled:

New Crawl Details Panel

In practice, this means that all of Sitebulb's plans have become more generous in terms of 'how many pages you can crawl', and it should be easier to set appropriate crawl limits as there is less guesswork involved.
