On some websites, the HTML is structured in a way that makes it hard for Sitebulb's algorithm to determine the content area, which it uses to calculate word counts and detect duplicate content.
To help Sitebulb calculate this data correctly, you can explicitly identify the content area in Advanced Settings.
To get to Advanced Settings, scroll to the bottom of the main Audit setup page and click the grey Advanced Settings button.
The Content section is under Crawler -> Content:
This page presents two options:
- Add a class of "sb-content" to the containing DOM element, for instance the closest parent DIV tag. This option is suitable if you have control of the site and are comfortable editing page templates.
- Select a parent DOM element that contains all the content you want to analyse using a custom CSS Selector. This option is suitable for instances where you don't have control of the site, or can't edit the page templates.
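To make the two options more concrete, here is a minimal Python sketch of what marking a content area means in practice. It is not Sitebulb's algorithm; it only illustrates the idea that words outside the marked container (navigation, footer) are ignored. The sample page, the `article` class and the `ContentWordCounter` helper are hypothetical, invented for this illustration.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ContentWordCounter(HTMLParser):
    """Counts words inside any element that carries the marker class."""

    def __init__(self, marker_class="sb-content"):
        super().__init__()
        self.marker_class = marker_class
        self.depth = 0   # nesting depth inside the marked element (0 = outside)
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.marker_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.words += len(data.split())


# Hypothetical page: only the DIV marked with "sb-content" is real content.
PAGE = """
<html><body>
  <nav>Home Products Blog Contact</nav>
  <div class="article sb-content">
    <h1>Choosing a crawler</h1>
    <p>Pick the tool that fits your site.</p>
  </div>
  <footer>Copyright Example Ltd</footer>
</body></html>
"""

counter = ContentWordCounter()
counter.feed(PAGE)
print(counter.words)  # prints 10: the nav and footer words are not counted
```

If you could not edit this template to add the class, the second option would target the same element with a CSS selector instead, for example `div.article` for this hypothetical markup.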
Before running an audit, you are unlikely to know that Sitebulb will struggle to identify the content area. The tell-tale sign is that Sitebulb can't find any words:
If you see this sort of thing in your audits, experiment with these methods of defining the content area for Sitebulb.