Crawler speed

You may wish to control how fast Sitebulb is able to crawl, and you can do this via the Advanced Settings.

To get to Advanced Settings, scroll to the bottom of the main Audit setup page and click the grey Advanced Settings button.

The Speed section is under Crawler -> Speed, and if you are using the HTML Crawler it will look like this:

[Screenshot: Crawler Speed settings for the HTML Crawler]

These options let you tune the crawler so that it crawls more quickly or more slowly:

  • Number of Threads - this controls how much CPU usage is allocated to Sitebulb. In general, the more threads you use, the faster it will go. However, this is capped by the number of logical processors (cores) in your machine.
  • Limit URL Speed - a toggle you can use to switch on a "max URLs/second" speed cap. If switched on, Sitebulb will not crawl faster than the specified URLs/second rate.
  • Max HTML URLs per Second - you can only set this value if the Limit URL Speed option above is switched on. If so, this value will provide the limit. For instance, if this is set to 5, Sitebulb will not download more than 5 HTML URLs per second.
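
To get a feel for what these settings mean in practice, the arithmetic can be sketched in a few lines of Python. This is only an illustration and not Sitebulb's own code: the function names `effective_threads` and `min_crawl_seconds` are hypothetical, and the sketch assumes the thread count is simply clamped to the number of logical processors and that the URLs/second cap is the only limiting factor.

```python
import os
from typing import Optional


def effective_threads(requested: int, logical_processors: Optional[int] = None) -> int:
    """Clamp a requested thread count to the logical processors available.

    Assumption: the cap works as a simple upper bound. If no processor
    count is passed in, fall back to what the operating system reports.
    """
    available = logical_processors or os.cpu_count() or 1
    return max(1, min(requested, available))


def min_crawl_seconds(html_url_count: int, max_urls_per_second: float) -> float:
    """Shortest possible crawl time when a URLs/second cap is switched on.

    Assumption: the cap is the only bottleneck, so the real crawl can
    take longer than this but never less.
    """
    if max_urls_per_second <= 0:
        raise ValueError("max_urls_per_second must be positive")
    return html_url_count / max_urls_per_second


# Asking for 16 threads on a machine with 8 logical processors.
print(effective_threads(16, logical_processors=8))

# A 10,000 URL site crawled with a cap of 5 HTML URLs per second.
print(min_crawl_seconds(10_000, 5) / 60, "minutes at minimum")
```

Under these assumptions, a cap of 5 HTML URLs per second means a 10,000 URL site cannot finish in less than 2,000 seconds (a little over half an hour), however many threads are in use.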

If you are using the Chrome Crawler, the Speed section is located in the same place, but looks slightly different:

[Screenshot: Crawler Speed settings for the Chrome Crawler]

There are no thread options or URL speed limiting options (as neither is applicable when crawling with Chrome). Instead, there are these options:

  • Render Timeout - this determines how long Sitebulb will pause to wait for content to render, before parsing the HTML. The lower the value you use, the faster Sitebulb will crawl.
  • Instances of Chrome - this determines how many logical processors will be used for rendering with headless Chrome, and is dependent upon the number of logical processors available on your machine (just like threads, above). The higher the value you use, the faster Sitebulb will crawl, within the limitations of your machine.
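
The way these two settings interact can be approximated with a rough back-of-the-envelope model. The sketch below is an assumption made for illustration, not a description of how Sitebulb schedules its work: it supposes each Chrome instance renders one page at a time and waits the full render timeout on every page, and it ignores download and parsing time. The function name `chrome_pages_per_second` is hypothetical.

```python
def chrome_pages_per_second(instances: int, render_timeout_seconds: float) -> float:
    """Rough upper bound on rendered pages per second.

    Assumption: every instance handles one page at a time and always
    waits for the full render timeout before the HTML is parsed.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    if render_timeout_seconds <= 0:
        raise ValueError("render_timeout_seconds must be positive")
    return instances / render_timeout_seconds


# Compare a few combinations of instances and render timeout.
for instances, timeout in [(2, 4.0), (4, 4.0), (4, 2.0)]:
    rate = chrome_pages_per_second(instances, timeout)
    print(f"{instances} instances, {timeout}s timeout: about {rate} pages/second")
```

In this simplified model, doubling the instances or halving the render timeout each doubles the ceiling on crawl speed. Real crawls will be slower, since pages also need to be downloaded and parsed.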

If you wish to learn more about crawling fast, we suggest you read our documentation, 'How to Crawl Really Fast'. Alternatively, to examine the benefits of a more measured approach, check out our article, 'How to Crawl Responsibly'.