URLs with duplicate content
This means that the URL in question has identical text content in its HTML to at least one other indexable URL.
Why is this important?
This sort of duplication is a relatively serious issue, because it means that URLs with identical content are accessible to search engine crawlers.
If this results in large-scale duplicate content issues on the site, you could trip quality algorithms like Google's Panda, which can depress organic search traffic to the site as a whole.
What does the Hint check?
This Hint will trigger for any internal, indexable URL which has identical text content in the HTML to another indexable URL. This means that the pages may have the exact same text but show different images, and this Hint would still be triggered.
Note: since the duplicate content check is only for indexable URLs, URLs which are canonicalized are not included in the analysis - as the canonical tag 'handles' the duplicate issue.
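To illustrate what 'identical text content' means in practice, here is a minimal Python sketch. It is not Sitebulb's implementation (which is not described here); it simply assumes that text is extracted from the HTML, whitespace is normalised, and the result is hashed. The names TextExtractor, text_fingerprint and find_duplicate_groups, and the example.com URLs, are hypothetical.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def text_fingerprint(html):
    """Hash the whitespace-normalised text of a page (assumed approach)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.parts).split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_duplicate_groups(pages):
    """Given {url: html}, return lists of URLs that share identical text."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[text_fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages: /a and /b share the same text but use different images.
pages = {
    "https://example.com/a": "<h1>Pocket torch</h1><img src='a.jpg'><p>Bright and small.</p>",
    "https://example.com/b": "<h1>Pocket torch</h1><img src='b.jpg'><p>Bright and small.</p>",
    "https://example.com/c": "<h1>Head torch</h1><p>Hands-free light.</p>",
}
duplicate_groups = find_duplicate_groups(pages)
```

Because image tags contribute no text, /a and /b end up in the same group even though they show different images, which mirrors the behaviour described above.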
How do you resolve this issue?
As with all duplicate content issues, the seriousness of the issue largely depends on the scale - in general, if only a few pages are affected, it is probably not affecting the site to any meaningful degree. If there are thousands of duplicates, however, the scale might be large enough to trigger a quality algorithm like Panda.
A common source of URLs with duplicate content is landing pages that target very slight variations on the same keyword - with the same text content on the page, but perhaps a different slate of product images. In general, this sort of content is not good for SEO, and search engines will often simply filter it out of search results since the content is too similar. If you do have a content issue like this that you need to resolve, the best way is to create a single unique page with enough rich content and crossover keywords to allow it to rank for a number of related keywords.
Another common situation is when the same page content is deliberately available through multiple paths on the website. An example, on an outdoors ecommerce website, might be a pocket torch with a product page accessible via several categories (e.g. Torches, Camping, Travel). The solution for this sort of issue is to select one version of the page to be the canonical (e.g. the Torches one) and add canonical tags to the Camping and Travel versions of the product page, each pointing at the Torches version. This is exactly the type of issue that canonical tags were designed to solve.
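As a rough sketch of the pocket torch example, the Python below builds the canonical tag that would go in the head of the Camping and Travel versions. The example.com URLs and the names canonical_link_tag and tags_for_duplicates are hypothetical; how the tag actually gets into your templates depends on your platform.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Return a canonical link element pointing at the chosen URL."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


def tags_for_duplicates(canonical_url, duplicate_urls):
    """Map each duplicate URL to the canonical tag it should carry."""
    tag = canonical_link_tag(canonical_url)
    return {url: tag for url in duplicate_urls if url != canonical_url}


# Hypothetical URLs for the same product page reached via three categories.
canonical = "https://example.com/torches/pocket-torch"
duplicates = [
    "https://example.com/camping/pocket-torch",
    "https://example.com/travel/pocket-torch",
]
tags = tags_for_duplicates(canonical, duplicates)
for url, tag in tags.items():
    print(url, "->", tag)
```

Once the Camping and Travel versions carry this tag they are canonicalized, so they are no longer treated as indexable and drop out of the duplicate content check.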
How do you get more data from Sitebulb?
Within Sitebulb you can either dig into a specific URL and look at the duplicate content that way, or you can export all the duplicate content and sort it in Excel.
To find details of specific URLs with duplicate content, click on the blue URL Details button from the URL List.
The URL Details tab will slide across, and you then need to navigate to Duplicate Content -> Content, and you'll see all the duplicate URLs underneath.
To export ALL the duplicate content URL data, click on the green Export Hint Data button in the top right-hand corner.
This will give you a nicely formatted Excel sheet showing you all the URLs with duplicate content, allowing you to easily sort and pick through the data.