A Practical Introduction to Structured Data for SEO
Updated 02 February 2021
There’s a lot of talk around the theory behind structured data, and even how it can impact SEO. But what’s often missing are the practical aspects of how to approach structured data.
This article gives a hands-on introduction, guiding you from identifying opportunities, to implementing on your website, and (hopefully) getting pulled into Google’s rich results.
What this guide covers:
- Why is structured data important for SEO?
- What is structured data?
- Identify structured data opportunities
- Manual structured data implementation
- Site-wide structured data solutions
I’ve written this piece primarily for SEOs that already have a basic understanding of what structured data is, know they want to give it a try, but aren’t quite sure of how to get started. Does that sound like you? If so, read on!
Why is structured data important for SEO?
If you’re an SEO, then the main reason you’ll be interested in structured data will likely be to get search features or rich results. In other words - to help stand out in the SERPs.
Below you can see a rich result we triggered using FAQ markup within an article on the Sitebulb website. It pretty much doubles our shelf-space on the page.
In sectors like the recruitment industry, if you’re not appearing in rich results, it’s likely you’re going to have some serious problems competing.
Jobs SERPs like the one below are clearly dominated by rich results.
In areas like ecommerce, there’s real opportunity to stand out on the page through structured data resulting in rich snippets:
What is structured data?
From a practical perspective, structured data is a way to give meaning to your web page content, using a format that computers can understand.
Before we can create structured data, we need content.
Now, we (as humans) can look at the piece of content below and have a pretty good idea what it is we’re looking at without much help.
But we need to remember that Google is essentially just a big computer, and doesn’t have a brain as sophisticated as ours to interpret what this collection of text and images actually means in this context. (Although to be fair, it is getting pretty good at it.)
In Google’s own words, structured data can:
“provide explicit clues about the meaning of a page”
The way we provide those clues is quite simple. We take our content, and apply predefined labels to tell Google what each part is.
In this example, we’re telling Google that the content as a whole is an event. Then breaking down each of the elements into the parts that we know Google is interested in.
To quote another definition from Google:
“Structured data provides a way to standardize information about a page and classify the page content.”
What that looks like in practice is a block of code that takes all those pieces of content we just looked at and applies an explicit label to each part, written in the JSON-LD format using Schema.org vocabulary.
This is what structured data is.
Schema.org provides the naming conventions and rules. JSON-LD is the script and code format. There are other formats and vocabularies available (Microdata is one you’ve probably heard of), but Schema.org and JSON-LD are Google’s preferred choice.
So for the sake of SEO, these are the ones I’ll focus on in this article.
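To make that concrete, here is a minimal sketch of an Event marked up with Schema.org vocabulary and printed as JSON-LD. The event details are hypothetical placeholders I've invented for illustration, and the set of properties shown is an assumption rather than a complete list of what Google looks for.

```python
import json

# Hypothetical event details, invented purely for illustration.
event = {
    "@context": "https://schema.org",  # the vocabulary the labels come from
    "@type": "Event",                  # what the content is as a whole
    "name": "Example SEO Meetup",
    "startDate": "2021-06-10T18:30",
    "location": {
        "@type": "Place",
        "name": "Example Venue",
        "address": "1 Example Street, London",
    },
    "description": "An evening of talks about technical SEO.",
}

# Serialise the labelled content as JSON-LD text.
json_ld = json.dumps(event, indent=2)
print(json_ld)
```

Each key is a predefined label from Schema.org, and each value is a piece of the page content it describes.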
This has been a very quick recap on what we mean by structured data. If you need some more help to get your head around the basics, there are plenty of good resources out there.
Some of the ‘official’ resources from Google and Schema.org can provide a good place to get started:
- Read Google’s ‘Understand how structured data works’ guide.
- Browse Schema.org.
- Work through the codelabs tutorial.
Identify structured data opportunities
Let’s move on to the practical side of things. We’ve just covered how we need content in order to create structured data. But what types of structured data should we be using?
We’re now going to consider how you can identify content opportunities and types to use with them.
Content first opportunities
Firstly, we’re going to look at finding opportunities within your existing content. I’ll take you through a process for a really quick content review.
The only things you need for this are a spreadsheet, Google’s Developer Documentation, and your own judgement.
The screenshot below shows a basic Google Sheets template I put together for this task. You can find the template along with a sample audit here: SEO Structured Data Opportunities Audit Template.
The first part of the process uses Google’s Search Gallery:
This covers each of the structured data search features that they support (there are many Schema types which aren’t yet supported by Google).
Look at the description and preview for each search feature type, and think about whether you’ve got any content on your site that might fit that feature.
If you do, mark that row on your spreadsheet as a ‘potential’.
Click on the 'Get started' button for that search feature type.
This will take you to the specific developer guidelines for that search feature. The example I’ve used is for the FAQ search feature.
Then it’s just a case of reading through the guidelines to see if the content you have in mind meets them.
If it does, tick the ‘meets guidelines’ column in your spreadsheet, and you’ve got an opportunity to mark-up.
And that’s it! As I said, this is a really basic approach, but there’s no point in overcomplicating things at this stage.
Below is an example of a completed opportunities audit:
The rows that now have two green columns are your opportunities shortlist. You then need to take a closer look and make a judgement on which ones to prioritise.
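If your audit outgrows a spreadsheet, the same shortlist logic can be sketched in a few lines. This assumes each row has been exported with two true/false columns; the column names and the sample rows are hypothetical.

```python
# Hypothetical audit rows; in practice these would come from the spreadsheet.
audit = [
    {"feature": "FAQ", "potential": True, "meets_guidelines": True},
    {"feature": "Event", "potential": True, "meets_guidelines": False},
    {"feature": "Recipe", "potential": False, "meets_guidelines": False},
    {"feature": "Product", "potential": True, "meets_guidelines": True},
]


def shortlist(rows):
    """Return the search features where both audit columns are ticked."""
    return [
        row["feature"]
        for row in rows
        if row["potential"] and row["meets_guidelines"]
    ]


opportunities = shortlist(audit)
print(opportunities)
```

Prioritising within that shortlist is still a judgement call, which is why the code stops at filtering.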
This process does require that you know your content fairly well, so it’s worthwhile doing as a team if you can, rather than relying on one person. If you’ve got a big site, you can break it down and repeat the review process for each part of the site individually to make the task more manageable.
Result first opportunities
By ‘result’, I’m referring to search results. This involves manually reviewing your priority search terms.
In its very simplest form this is just a case of doing a search for one of your target terms, and having a look at the results.
In this example, we can see that TripAdvisor, in position 2, is triggering the FAQ search feature:
Now if I was an SEO for TheCrazyTourist.com, which is ranking no.1, I’d be thinking about what content we could add to our page and mark up with FAQ schema, in order to trigger some of those results ourselves.
This is what I mean by a result-led opportunity. You can see that a search term is already triggering a search feature, so you look at how you can optimise your own content (or even create new content), to trigger it instead.
Again, this is a very basic approach, but effective.
The main downside to this is that it can be pretty labour intensive, especially if a lot of your target terms aren’t triggering any search features. However, you can scale this up by using tools to do the bulk of the work for you.
This can be a much quicker way to get an initial list of the search terms you should be looking at.
I’m sure there are a lot of other tools out there that do this. So if you’re already signed up to a different one, it’s worth checking to see if it provides similar reports.
Manual structured data implementation
So we’ve identified opportunities for the content we want to mark up. Now I’ll move on to how we actually get it implemented.
Firstly, we need to create the markup code.
If we take a step back for a moment, from all of the tools, apps, and plugins - and go back to basics - it’s worth reminding ourselves that in its simplest form we’re just dealing with lines of text.
There’s no reason whatsoever why you can’t take a text editor and Google’s structured data guidelines, and write out all of the lines of code you need.
As a learning exercise, this can be a really good way to help stuff sink in, especially the code structure and syntax. But in reality, it’s not something most of us want to be doing regularly.
Schema Markup Generators
The next level up is to use a schema markup generator to produce your code.
Given the recent focus on structured data for SEO, the number of Schema markup generators has been growing rapidly, with new ones popping up all the time. It’s worth doing a bit of research to see which one (or ones) will work best for you. A lot tend to be limited to producing certain types, so this will likely guide which one you decide to use.
The concept of these generators is pretty straightforward:
- You fill in the relevant fields with your content,
- ...the generator adds the relevant markup,
- ...and outputs the code for you to add to your page.
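Those three steps can be sketched as a tiny generator of your own. This is not how any particular tool works; the function name and its parameters are hypothetical, and it only handles a handful of Event properties.

```python
import json


def generate_event_markup(name, start_date, venue_name, address):
    """Build a <script> block of Event markup from a few content fields."""
    data = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,
        "location": {"@type": "Place", "name": venue_name, "address": address},
    }
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical content values, standing in for the fields you would fill in.
snippet = generate_event_markup(
    name="Example SEO Meetup",
    start_date="2021-06-10T18:30",
    venue_name="Example Venue",
    address="1 Example Street, London",
)
print(snippet)
```

The output is the JSON-LD wrapped in a script tag, ready to paste into a page or a validator.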
The screenshot below shows our earlier event example, using the Merkle Schema Markup Generator:
For a more in-depth guide on how to use Schema generators, and the different ones available, have a look at this post on ‘Schema Markup Generators for Structured Data’ from Elise Dopson.
Validate Schema Markup
Once you’ve got your markup code, you need to make sure it’s valid. As with generators, there are a lot of different validators out there.
For the single-page, manual implementation we’re looking at here, the main one you should be looking at is Google’s own: the Rich Results Test tool.
The Rich Results Test tool is a long way from perfect - there’s a whole other discussion to be had around that. If you’re looking to validate Schema types that aren’t supported by it, have a look at Patrick’s Structured Data Testing Tool Alternatives post. But for what we’re doing here, the Rich Results Test tool should be your first port of call.
When you’re in the tool, you need to use the ‘code’ tab as I’ve highlighted in the screenshot below. That allows you to test your code before putting it live on the webpage.
When you run the test, you should see something like this:
That big green tick is what we’re looking for. This means that your code is eligible to trigger rich results.
‘Eligible’ is the key word here though. It just means that technically your code meets Google’s validation criteria. Whether you actually appear in rich results is a whole different matter.
Errors vs Warnings
If you looked closely at the last screenshot, you might have spotted the ‘warnings’ text in yellow.
To help explain a bit about Schema validation errors and warnings, I’ve purposely broken my code and run the same test again.
Our nice green tick has gone. And we have a message telling us that our markup is no longer eligible:
Within the Events section, we have the same two warnings, plus a new error. If you click on these, they open up giving some more details.
The error relates to a missing ‘name’ field - this is what I removed to break the code.
‘name’ is a required property, and if you have a required property missing, your markup isn’t valid.
The two warnings relate to missing ‘performer’ and ‘organizer’ fields. These trigger a warning rather than an error because they’re only recommended properties, not required ones.
For the example we’ve been looking at, ‘performer’ isn’t really relevant - the event doesn’t have a performer. But ‘organizer’ might be something we could add.
The important thing to remember is that errors are things you must fix, whereas warnings are opportunities to add additional information.
Errors = Essential fixes
Warnings = Optimisation opportunities
Generally speaking you should try to add as many recommended properties as you can.
Treat them as a way to optimise your structured data further.
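The difference between the two can be sketched as a simple check. This is only an illustration of the principle, not how the Rich Results Test works internally, and the property lists are assumptions based on the example above, so check Google’s Event guidelines for the current ones.

```python
# Assumed property lists for illustration only.
REQUIRED = ("name", "startDate", "location")
RECOMMENDED = ("performer", "organizer")


def check_event(markup):
    """Split missing properties into errors (required) and warnings (recommended)."""
    errors = [prop for prop in REQUIRED if not markup.get(prop)]
    warnings = [prop for prop in RECOMMENDED if not markup.get(prop)]
    return errors, warnings


# Hypothetical markup with 'name' removed, mirroring the broken example.
broken = {
    "@type": "Event",
    "startDate": "2021-06-10T18:30",
    "location": "Example Venue",
}

errors, warnings = check_event(broken)
print("Errors (must fix):", errors)
print("Warnings (optional extras):", warnings)
```

A non-empty errors list means the markup isn’t valid, whereas warnings only point at extra detail you could add.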
Adding markup to the page
So we’ve got our markup, we’ve checked it’s valid, now we need to get it on our page.
Google have said that as long as your JSON-LD script is within the head or body sections of your page, they can read it. So they’re pretty flexible on exactly where it goes.
Going back to basics again - you could just edit your page code directly, and copy/paste the structured data in there.
The next level up from that would be to use a custom CMS field. The example below is a field we’ve got on the Sitebulb website in Umbraco.
Adding this field meant we weren’t messing around with the page code directly. But we also don’t need to hassle the devs every time we want to change something. It’s a quick and easy way to get markup live on the page, and it’s flexible enough that we can play around with the code and test things.
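Whichever route you take, the end result is the same: a script block sitting in the page source. Here is a minimal sketch of that step, assuming the markup is dropped in just before the closing head tag. The function name and the sample page are hypothetical, and a real CMS template would do this in its own templating language.

```python
def add_markup_to_head(html, json_ld):
    """Insert a JSON-LD script block just before the closing </head> tag."""
    if "</head>" not in html:
        raise ValueError("No closing </head> tag found")
    script = f'<script type="application/ld+json">{json_ld}</script>\n'
    # Replace only the first occurrence so the rest of the page is untouched.
    return html.replace("</head>", script + "</head>", 1)


# Hypothetical page and markup, standing in for your own.
page = "<html>\n<head>\n<title>Example event</title>\n</head>\n<body></body>\n</html>"
markup = '{"@context": "https://schema.org", "@type": "Event", "name": "Example SEO Meetup"}'

updated = add_markup_to_head(page, markup)
print(updated)
```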
Google Tag Manager
Another good way of getting structured data on your site with minimal input required from developers is to use Google Tag Manager.
As long as you’re able to get a container installed on the site (it should be on every page), you can then inject structured data into any page, without touching the site itself.
A while ago Google were advising against using Tag Manager to implement structured data, but they’ve changed their stance and now even include it within their structured data documentation.
Test the Live Page
Once you’ve got the structured data on your page, it’s time for some more testing.
Rich Results Test Tool
Back on Google’s Rich Results Test Tool, you can now use the URL tab to test your live page, in the same way you did with your code snippet.
This should give the same results as when you tested the code snippet, but there’s always scope for things to go wrong when implementing, so it’s worth testing both the snippet and live page.
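As a quick sanity check alongside the tool, you can also confirm that the markup on the page is the markup you tested. The sketch below uses Python’s standard library HTML parser on a hypothetical page source held in a string. Fetching the live URL is left out, and the class name is my own.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD script block in a page."""

    def __init__(self):
        super().__init__()
        self.in_json_ld = False
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_json_ld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_json_ld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_json_ld:
            self.blocks.append(json.loads("".join(self.buffer)))
            self.in_json_ld = False


# Hypothetical snippet you validated earlier, and the page source it went into.
tested_snippet = {"@context": "https://schema.org", "@type": "Event", "name": "Example SEO Meetup"}
live_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Event", "name": "Example SEO Meetup"}
</script>
</head><body></body></html>"""

parser = JsonLdExtractor()
parser.feed(live_html)
parser.close()

matches = tested_snippet in parser.blocks
print("Live page matches tested snippet:", matches)
```

Comparing parsed objects rather than raw text means differences in whitespace or key order won’t cause a false alarm.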
Google Search Console
Not strictly testing, but you should also make sure that your site is submitted to Google Search Console.
If Google picks up any issues with your structured data, it’ll send out a notification via GSC. Fairly recently, they’ve started pulling more and more structured data elements into reports too. It’s pretty likely we’ll see more structured data appearing in there as time goes on.
The screenshot below shows an example from our own site. We broke our structured data markup to see what GSC would pick up. We received an error report just a few hours afterwards. So it would seem to be a fairly reliable way to spot at least some issues.
Site-wide structured data solutions
What I’ve covered so far has been a fairly manual approach. But if you’re taking structured data seriously, you need to be considering more automated solutions for site-wide implementation.
But why do we need to automate? What’s wrong with the approach I’ve covered so far?
- Isn’t scalable - If you’ve got a website of 10k pages, it would take a whole team to manually implement structured data across all of them. Even implementing manually across every new blog post will get laborious pretty quickly.
- Isn’t efficient - It’s unlikely that the ROI for manually implementing on every page is viable. Maybe for a few key pages - but beyond that, your time is more valuable.
- Is fragile - When you’re writing code and manually implementing it each time, it’s very easy to break it. Human error is your enemy!
- Is prone to inconsistency - In our event example, if I implemented it, then decided to change the ticket price (or anything else for that matter), I’d need to remember to update the price in the structured data too. This can quickly become a nightmare to manage!
So let me address the elephant in the room - why the hell didn’t I just start with automated solutions in the first place? Well, there is value in the manual implementation approach I’ve covered, primarily in three key areas:
- Learning - Try it, break it, learn how it works. I’ve always said that the best way to learn about SEO is to just set up a website and give it a go. The same applies here.
- Testing - Doing small scale manual implementation gives us flexibility to test stuff. The structured data environment is changing all the time and very little is black and white. So we can’t just expect to follow a set of guidelines and achieve what we want.
- Proof of concept - Manual implementation is ideal for proof of concept implementations. It’s pretty common to get push-back on rolling out sitewide changes for SEO, so implementing on a handful of pages first can help build a business case.
An approach to site-wide implementation
Let's have a look at some of the different options for an automated approach.
The subject of automating structured data across your site goes beyond the scope of this article. I will, however, introduce you to some of the important principles you’ll need to consider.
To some degree or another, most websites are built on a set of page templates.
If we look at the product page below, you can see that it has a standardised structure which will likely be the same for every product.
So if we know that this template will always contain a product, and that each of the product attributes will be consistent within the template, we can map these to our structured data fields to automatically populate the structured data markup.
Obviously this sort of implementation requires some careful planning and development resource.
But if you consider that a single page template implementation might populate structured data across thousands of pages, as an investment it’s usually worthwhile.
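The mapping idea can be sketched like this. The product records and their field names are hypothetical stand-ins for whatever your CMS or database actually exposes, and the Product properties shown are just a small, assumed subset.

```python
import json


def product_markup(product):
    """Map one product record from the template onto Product markup."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["title"],
        "image": product["image_url"],
        "description": product["summary"],
        "offers": {
            "@type": "Offer",
            "price": f'{product["price"]:.2f}',
            "priceCurrency": product["currency"],
        },
    }


# Hypothetical records; the same mapping runs for every page using the template.
catalogue = [
    {"title": "Example Mug", "image_url": "https://example.com/mug.jpg",
     "summary": "A ceramic mug.", "price": 19.99, "currency": "GBP"},
    {"title": "Example Coaster", "image_url": "https://example.com/coaster.jpg",
     "summary": "A cork coaster.", "price": 5, "currency": "GBP"},
]

for item in catalogue:
    print(json.dumps(product_markup(item), indent=2))
```

Because the price in the markup is read from the same record as the price on the page, the two can’t drift apart in the way they can with manual implementation.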
Plugins & 3rd party tools
If you want to avoid some of the heavy lifting, there are plenty of tools out there that will help roll out structured data across your site.
If you’re using WordPress, then the Yoast SEO or Rank Math plugins are worth looking at. They’ve both been doing a lot of work recently to automate structured data implementation. There are other tools like Schema App, InLinks and WordLift which are less platform specific.
With any of these tools, you are putting a degree of trust in the tools to correctly identify your content to mark up. Whereas with the template option we looked at, you have complete control. So there are pros and cons to both approaches.
If you’re looking to investigate this approach further, Schema App have put together a pretty comprehensive list of tools here.
Testing at scale
As we’ve already covered - testing is essential. Google’s Rich Results Test tool is fine for single pages and spot checks. But for sitewide testing you really need to be looking at a crawler-type solution.
Using a tool like Sitebulb lets you test structured data across your whole site at once. You can aggregate issues and diagnose problems at a template level. And you can also interrogate the data to look for opportunities.
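To illustrate what aggregating at a template level means, here is a rough sketch that counts missing required properties per template across a set of crawled pages. It is not how Sitebulb or any other crawler works internally. The crawl data, the template labels and the required property lists are all assumptions made for the example.

```python
from collections import Counter

# Assumed required properties per type, for illustration only.
REQUIRED = {
    "Product": ("name",),
    "Event": ("name", "startDate", "location"),
}


def missing_required(markup):
    """List the required properties that are absent or empty in the markup."""
    required = REQUIRED.get(markup.get("@type"), ())
    return [prop for prop in required if not markup.get(prop)]


# Hypothetical crawl output: one entry per page, tagged with its template.
crawl = [
    {"url": "/products/a", "template": "product", "markup": {"@type": "Product", "name": "A"}},
    {"url": "/products/b", "template": "product", "markup": {"@type": "Product"}},
    {"url": "/products/c", "template": "product", "markup": {"@type": "Product"}},
    {"url": "/events/x", "template": "event",
     "markup": {"@type": "Event", "name": "X", "startDate": "2021-06-10"}},
]

# Count each (template, missing property) pair across the whole crawl.
issues = Counter()
for page in crawl:
    for prop in missing_required(page["markup"]):
        issues[(page["template"], prop)] += 1

for (template, prop), count in issues.most_common():
    print(f"{template}: {count} page(s) missing '{prop}'")
```

Grouping by template shows whether a problem is a one-off or something baked into every page of that type, which tells you whether to fix a page or fix the template.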
It’s also worth remembering that testing isn’t just a one-time thing. Google is constantly rolling out new features and changing their guidelines, so you need to be checking this stuff periodically in the same way you would any other element of SEO.
To help stay up to date with changes to both Google's and Schema.org guidelines, you can register for our free structured data change alerts. These will let you know whenever guidelines are changed, so you can make sure that your structured data always meets their criteria, plus spot any new opportunities as soon as they're live.
We’ve now reached the end, so let’s recap the steps to getting started:
- Learn the basic principles - If you haven’t already, I suggest you spend some time to properly understand the principles behind structured data.
- Identify content opportunities - The first practical thing to do is identify opportunities in your existing content which could be marked up with structured data.
- Identify SERP opportunities - Next, look at your target search terms to see which ones are triggering rich results.
- Manual implementation on select pages - Use small-scale manual implementation for testing and proof of concept.
- Full structured data strategy - All of the above should feed into creating and implementing a full structured data strategy which can be rolled out across your site, and include ongoing testing.
For validation and testing, make sure you give Sitebulb a try. If you’re not already using the tool you can get a 14 day free licence here, which gives you access to the full functionality including the structured data auditing feature, and standalone structured data testing tool.