You can check if URLs on your website are indexed in Google by taking advantage of the Search Console URL Inspection API, which you can connect to Sitebulb website audit software via the Google Search Console integration.
This can allow you to see high level data about whether a URL is indexed or not, along with 'reasons' why URLs are not indexed, which allows you to explore further.
For clarity, the data returned is data that Google Search Console stores about a given URL - the API will return indexing information that is currently available in the URL Inspection tool. What Sitebulb does is allow you to collect this data in bulk.
To connect Sitebulb to the URL Inspection API, add Google Search Console to your audit settings when you set up a Project, and tick the box under the Configuration options to Fetch URL data from Search Console Inspection API.
It is important at this point to make sure you select the correct property. Sitebulb will help you do this by pre-selecting the property by matching with the start URL, but you may have multiple properties for the same account (e.g. domain properties and URL-level properties).
The main caveat of this feature is that Google limit the number of queries to 2,000 URLs per day - per Search Console website property (i.e. calls querying the same site).
This means that if you have a website with more than 2,000 URLs, Sitebulb will not be able to collect indexing information about all the URLs at once. In this case, Sitebulb will query the top 2,000 HTML URLs, ordered by URL Rank.
So, by default, Sitebulb will always select the most important pages to check for indexing, based on internal link popularity.
This quota is per website, not per tool. If you collect API data via multiple different tools, they are all grabbing from the same quota pool. If you have used up the 2,000/day allowance with other tools, you will not then be able to collect more data through Sitebulb on the same day.
Similarly, there is also a query limit on the API of 600 URLs/minute. Sitebulb is set to safely query under this limit, but if you hit the API with two different tools at once, you could accidentally exceed it any hit error messages.
See the section 'When URL data is not returned' for more information about exceeding API limits.
To access the data collected by Sitebulb, navigate to the URL Inspection report via the left hand navigation.
The overview shows numerous charts and tables, and if you click the URLs tab you can see all the data in table format:
As with all URL Lists in Sitebulb, the data can be augmented and fine-tuned by adding additional columns, sorting, or applying advanced filters.
The data returned by the API can be quite rich and nuanced, so to really understand what you are looking at requires some familiarity with the Index Coverage report and the URL Inspection tool in Google Search Console.
However, Sitebulb gives you easy, intuitive access to big-ticket data items like 'URLs are not indexed in Google', with a straightforward workflow for digging deeper into the data.
With any of the charts, click on the segment area to view a filtered URL List of the data:
This will bring you to URL data like this:
Also, on any chart, you can click the 'View Data Table' toggle, which shows the chart data in a table format:
Then the chart data will show in a table, and clicking on any of these values will also bring you to the corresponding URL List:
On the URL Lists themselves you can analyse issues in bulk, by scrolling right to inspect the most meaningful columns:
Alternatively, to dive into a specific URL and see what the inspect tool is saying in Google Search Console, just hit the orange button to Open URL Inspection:
This will then open up Google Search Console in your browser, with the Inspect URL tool already open and the selected URL pre-loaded:
There is a number of pie charts and bar charts on the URL Inspection report, so we will run through what each one is showing:
This chart splits out every URL into different buckets, based on whether Google could find and index the page. Each option includes a short, descriptive reason for the status of the URL, explaining why the URL is or isn't on Google.
This pie chart provides a summary evaluation about whether or not URLs are eligible to appear in Google Search results.
It is important to note that 'URL is on Google' doesn't necessarily mean that the page is appearing in Search results, simply that it is indexed.
This chart indicates whether or not URLs explicitly disallow indexing (e.g. a noindex tag). If indexing is not allowed, the reason is shown in the legend - these pages won't appear in Google Search results.
Note that if a page is blocked by robots.txt, then 'Indexing allowed' will always be 'Yes' because Google can't see and respect any noindex directives.
This chart splits out the URLs submitted to the Inspection API based on their sitemap status. Either they were not found on sitemaps in Google Search Console (in which case they show as ‘Not submitted’), they were submitted and indexed, or they were submitted and not indexed.
This chart shows the distribution between URLs crawled with Google’s Mobile Crawler vs their Desktop Crawler.
The results in this chart are only for URLs that are indexed.
This chart indicates whether or not URLs were allowed to be crawled by Google, as determined by the site's robots.txt rules. Note that this value is not the same as allowing indexing, which is given by the 'Indexing allowed' value.
The results in this chart are only for URLs that are indexed.
This chart indicates whether or not Google agree with the user-declared canonical URL. If they do agree, this will show as 'Match,' and if they do not agree, this will show as 'Mismatch.' If no canonical is present and Google have selected one, this will show as 'Google Selected.'
This chart shows the distribution of URLs based on their last crawl date by Google. Days showing as '0' means that the URL has been crawled within the last day. The date ranges allow you to dig deeper and explore URLs that have been crawled recently – or not recently at all.
This chart indicates whether or not URLs are eligible for Rich Results, or if the URLs trigger errors or warnings.
The results in this chart are only for URLs that contain structured data which could lead to Rich Results.
This chart indicates whether or not URLs are deemed mobile friendly by Google, or if the URLs trigger errors or warnings.
The results in this chart are only for URLs that are indexed.
Sometimes you will find that URL data is not returned, and this could be for a number of reasons:
If you go over the daily quota (see above) you will need to wait for 24 hours before trying again. Please also bear in mind that the 2,000 URL limit is per property, per day, which could mean that you've gone over the limit due to tools other than Sitebulb.
As soon as Sitebulb goes over the daily quota, it will stop sending API requests.
There is a query limit on the API of 600 URLs/minute. Sitebulb is set to safely query under this limit, but if you hit the API with two different tools at once, you could accidentally exceed it any hit errors messages.
This means that Sitebulb has requested indexing data for a URL that is 'not part of the property selected.'
For example, https://example.com for the URL-prefix property https://www.example.com. If you wish to check URLs from multiple subdomains, please select a domain-level property.
This means that Google's API itself has fallen over. If this happens, come back and try again later.