
Webinar: SEO for LLMs - Adapting to the AI-first Web

Published 06 June 2024

May's webinar was all about SEO for LLMs. Whether you call it GEO, AIO, AEO, or some other insane acronym, it's every SEO's responsibility.

The legendary Aleyda Solis, International SEO Consultant and Founder of Orainti, is on the Sitebulb panel, along with Ray Grieselhuber, Founder of DemandSphere and AI/ML expert. Patrick Hathaway dives in deep with these guests to help attendees learn:

  • How we can direct LLM crawlers
  • Tactics for helping them find & understand content better
  • Tools and techniques for tracking them
  • ...and a whole lot more!


Transcript

Patrick Hathaway:
Okay. So Ray, you can answer all the questions. So today's webinar will span three central themes. We want to talk about content strategy for LLMs, technical accessibility of content for LLMs, and how we go about tracking visibility in LLMs. And I just want to highlight the second one of those for a minute, because, actually, there's a range of things you can do with Sitebulb to help you check how accessible your content is to LLMs. And so my colleague, Miruna, is running a Sitebulb masterclass training webinar a week today. A link is in the chat for that one as well. So that's at the same start time as this one, 4:00 P.M. BST. So get yourself registered on that, and learn how to do all this stuff in Sitebulb.

Cool. So we can stop sharing, I think, this slide. Jojo, whenever you're ready on that one. And we've just got a couple of final bits of housekeeping that we can crack on. So we are recording the webinar today, everyone always asks. So we'll be sending the recording out tomorrow, so don't worry if you're not able to make it for the whole thing, and we will have some time at the end of the webinar for your questions. Now, please put them in the Q&A tab next to the chat, don't put them in the actual chat itself, which everybody always does. There's a separate tab on the left-hand side for Q&A underneath the chat bit. If you haven't got any of your own questions, you can upvote other people's questions in there if you can't think of any yourself.

All right. So housekeeping's done. Let's kick off the webinar proper, and let me introduce our guests today. First up, we've got Aleyda Solis. Fortunately, she needs no introduction, an absolute powerhouse, and living legend in the SEO world. If you're watching this webinar and don't know who Aleyda is already, then I advise you to come out from that rock you've been hiding under. I-

Aleyda Solis:
You can ask ChatGPT who's Aleyda.

Patrick Hathaway:
Yeah, you can ask ChatGPT who's Aleyda. And I actually attended the excellent Hive Manchester Conference last week, and watched Aleyda's presentation on the state of AI Overviews, and the impact of LLMs. Unsurprisingly, she's all over this stuff, so it is really great to have you with us, Aleyda.

Aleyda Solis:
Thank you very much for having me. I really appreciate it.

Patrick Hathaway:
Next I'd like to introduce Ray Grieselhuber. Hey, was that better?

Ray Grieselhuber:
That was good.

Patrick Hathaway:
That was good. Yes.

Ray Grieselhuber:
You nailed it. Yeah.

Patrick Hathaway:
Ray is the founder and CEO of DemandSphere, a SERP analytics platform built for e-commerce and enterprise SEO teams. He's been working at the intersection of search and technology for over a decade, and has become a leading voice on how LLMs and AI tools are reshaping the SEO landscape. Thanks for joining us, Ray.

Ray Grieselhuber:
Thank you for having me.

Patrick Hathaway:
So welcome to everybody. Thank you all so much for joining us. And let's kick off with the questions. So the first section, again, will be focused around how we kind of decide what content LLMs should access. So the first question I've got is, how should SEOs think differently when creating content for LLMs versus traditional search engines? Aleyda, do you want to start?

Aleyda Solis:
I think that, at the end of the day, that depends a lot on your business model. What is content for you? And how do you create content already for your website? For many websites, let's say, content itself is the core area of their business model, and that is how they make money, through ads, through affiliates, and that is their core service, let's say, the information that they share through research, analysis and so on. So for those types of businesses, this is a very tricky time, and potentially, one of those pivotal changes, because we have had, so far, a very fine balance with Google. The way that the usual search paradigm, or the current search paradigm, was working is, like, "Okay, they are the middleman." Pretty much. And you will allow them to crawl and access as much information or content, just for them to surface this information and refer potential clients, users, traffic.

When the content itself is not the core value, again, this content performs pretty much like a salesperson, showcasing, explaining your offering, your actual product or service that will only be accessed if they pay online, or offline, or through other types of methods, where the conversion actually happens. But when the content itself is the value, and now with LLMs the referral is not as straightforward as it has been so far with traditional search results, that is when the tricky part comes, and when I can definitely see how many of these publishers or creators can very well question if they should allow LLMs to access their information for training, and if so, up to what point? Because, of course, they don't want to block entirely, and not [inaudible 00:06:02] showcase in any way out there. But what is the fine balance?

And in that regard, what I would balance here is, on one hand, potentially, thinking about for which queries there's actual value for them to be found as a source of information, or as an expert of a certain vertical, or area and so on. And, on the other, up to which point you don't want LLMs to actually get access to core areas, or core insights or research, for which you will also actually request something extra, or you need to provide something extra.

And this is where I can definitely see the business model changing quite a bit, and many publications are already making additional moves, trying to make agreements with bigger players for access to information on one hand. Then on the other hand, there is also the less sophisticated and potentially more basic configuration used with Google so far, paywalling information. The problem here is that we as internet and web users have already been used to, let's say, free access to information so far. And we have pretty much overlooked that raw information and content production, especially if it's expert-driven, has value and traffic.

Sorry that I'm extending too far potentially, but if your business is not the content itself, it's very straightforward, you want to surface your landing pages, your solution pages, your PDPs, your PLPs as much as possible, because you want to endorse and you want to refer those users, and eventually, the conversion will happen on your website, yes or yes, at least as the web is built so far, let's say in a few years, but as it is right now. However, if you're a publisher, if your core service or product is the content itself, there should be a further strategy here. Not exactly the same, but very similar to, or comparable to the one that you define at some point in the past with paywalls. What should I give that is actually beneficial for me? Versus wanting a change, and up to which point I should request a registration, for which there should be something in exchange provided, I think.

Patrick Hathaway:
Like you mentioned in there, obviously publishers, there is potentially a way forward with paywalls. I think affiliates, like the recipe bloggers, travel... All that lot, the way I see it is, they could all just disappear to TikTok, or somewhere where they can build a platform that's not going to be... What's the point in living here?

Aleyda Solis:
Honestly, you did a really good observation. A couple of years ago, or even more, three years ago, it was before LLMs and all of this revolution that started with ChatGPT, I went to an event, an affiliate event, very focused on affiliates, and I was one of the very few SEO speakers at that event. And I was very curious, because I hadn't... I don't work with publishers, usually I work with e-commerce, marketplaces, SaaS. So I was very curious about, "Okay, what is the latest trend? What are you doing with E-A-T?" And all of that. The helpful content update at that time was already a thing, et cetera. And they told me, "Look, at this point, the more advanced and mature e-commerce, we have stopped relying on organic search as the main traffic channel, and nowadays, we have private [inaudible 00:10:14] influencers, to TikTok, to Instagram, to YouTubers and so on." So yeah, makes complete sense to do that further.

Patrick Hathaway:
It totally does. So Ray, then let's think outside of the businesses that Aleyda just covered in that first answer. Let's think about some of the other types of businesses, e-commerce, maybe SaaS, all those sorts of businesses. What types of content do we think are more valuable to LLMs? Is this informational, commercial, or brand-focused?

Ray Grieselhuber:
From a pure LLM perspective, what we've seen, especially in the way that we've been starting to gather data on this, is that the informational content is definitely the most easily represented. Probably the transactional commercial content would be far less represented in the types of responses that come back. Because essentially what's happening is, you have this new layer of mediation between the user who's doing the searching and researching, and their eventual destination, and the LLMs are basically acting as this process to synthesise a lot of different research stages that we would've gone through previously. And so informational content in that regard, especially if it's very easy to include in these types of results, is going to be definitely overrepresented.

Patrick Hathaway:
So-

Aleyda Solis:
I'm wondering if so far we have seen actually much more informational type of queries, because of the current interface that we have had.

Ray Grieselhuber:
Yes. Totally. Definitely.

Aleyda Solis:
But in the future... It took my attention how ChatGPT has actually really prioritised shopping integration. Even if they say that, right now, they pretty much perform as a bridge that doesn't even re-rank the products that they get from [inaudible 00:12:10] and so on. They are definitely pushing further, and providing ways to... For a better chance to integrate with them. And I can definitely see this being too good to not try to at least get a little bit of a share of that market.

And coincidentally, just yesterday in the Google IO announcement, one of the big announcements that they did was that they will start testing a new shopping integration in AI Mode, that seems to, from the demo that they did, better personalise the interface of the output. So it's something that connects better with a commercial transactional intent, rather than an informational one. When I saw the current format of AI Overviews, for example, I was like, "They will need to rework this to connect better with a shopping intent, that is much more aligned with the product path, and the clickable knowledge panel." And so on. So yeah, let's take a look at how it goes. I expect that there is a much higher shopping intent in the future once they start better presenting, and better showcasing, and providing a better experience for this other type of search too. So it's a little bit of chicken and egg in this type of situation too.

Ray Grieselhuber:
If they're able to really capture the financial transaction smoothly. Right now, I think of it as being dominated by Amazon, and maybe a few of the top e-commerce providers. But if Google is able to insert themselves into that credit card transaction flow effectively, I think that's going to definitely have a big impact, and lead to exactly what we're talking about.

Patrick Hathaway:
Yeah. For sure. Cool. So is there a framework that people can use to prioritise which content we actually want LLMs to access, and which we want to keep out?

Ray Grieselhuber:
Are you talking about a technical level, or just more of a content-

Patrick Hathaway:
I'm thinking more strategically. How do you figure out, "Right, we do want them to ingest this stuff, but actually all this marginal value stuff over here"? How do you figure out where that lies?

Ray Grieselhuber:
I mean, it's a very big problem right now for some companies, because some companies are having this internal debate about how much they want their content to even be ingested. To me, the cat's out of the bag, it's not going back in, and the LLMs, they're going to ingest what they want to. I know we're going to talk about probably some technical sides of this as well, but at the end of the day, they're going to get what they want to get, and you can try to paywall it all you want, but that's just going to end up hurting your own visibility in the long-term. And so fortunately or unfortunately, depending on your business model, and how you look at things, I think it's safer to assume that either your content or a competitor's content is going to be ingested, and so it's going to come down to whether or not you view it as strategic to be included in those results, and prominent in those results. I think any question of, do you want to be in there or not want to be in there, is in the rear-view mirror.

Aleyda Solis:
I completely agree, Ray. And then there is an additional question here. I think that we're just at the beginning of this whole journey. So realistically, it's true that Google has started already the conversation with the biggest publication, the bigger players, and for agreements regarding licencing of access, information for training, et cetera. And then we ask ourselves, what about the indie players? What about those highly specialised, personalised, and the future of all of this, the incentives for creators in order to continue creating? And so on.

So again, I believe that we're only at the beginning. I remember there were some... When all of this started happening, and Bing... launched Copilot, they started saying that... They were actually having conversations to see that... Via the programmes that everybody could apply to in order to get something back from being shown in LLM answers, or a share of the ads, or something like that. So I think that this needs to evolve. And there should be programmes, or there should be some sort of methods that should be created in order to split the revenues, and incentivize, and give something back to the creators and publications, in case that there would be less direct traffic.

Realistically, for the queries, and the prompts, and the questions are more transactional and commercially-oriented. The conversion itself, even if it doesn't happen in the very long-term future in the actual website, and it's provided as an extra option in the LLM interface directly, because the integrations has come that far, fine. Because the actual goal of the business is to sell anyway, that product or service, independently where the transaction happens in any case. But the problem of creators and publications is the actual value is to showcase the information in the first place.

So it makes complete sense if it comes to that. I guess then in the following months, I do expect that the more and more they start referring more traffic and become much more widely used, because we need to understand that, so far, in 99% of the cases, the websites that I see, is 1% of referral traffic, is 1, 2%, is increasing a lot in the latest months, but I can understand why for most it might not be necessarily number one concern right now, but it might be in the next few months, because it's increasing a lot. But I do expect that LLMs start taking more initiative, more initiatives like that, because it's only fair, otherwise they will...

I mean, imagine if everybody starts blocking their content to LLMs, good luck generating good answers in the first place. So there's that. Then another thing I do expect is that... And I actually posted yesterday after the AI Mode official announcement in the U.S. Perfect, you are releasing AI Mode to all users in the U.S., I do expect now a report in the Google Search Console, thank you very much, and showcasing what is my activity there, how my website is showcased there, and the same-

Patrick Hathaway:
Fair share.

Aleyda Solis:
And here is the thing, we're asking this a lot to Google, but let's... Well, realistically, and we asked Google because they are the mature player in the market, and the one that has the current highest market share in the market in search, but they were not the ones that changed the status quo, it was ChatGPT. And thank you very much OpenAI, I would like also to have some sort of search console, or interface, or LLM console to understand how you are accessing my website, to give me further control, pretty much the Google Search Console, but apply to ChatGPT too. So every single player should be doing this, again, because we're very early in the market, they still [inaudible 00:20:07]. I do expect that they provide this sort of additional visibility and control of what's going on in their platforms to every website out there.

Patrick Hathaway:
Yeah. All right. I agree with you, Ray, the cat is out of the bag, it's not going back in, but I do want to challenge this idea that there's this opportunity with AI, and there's nothing but good to come of it, we should all want to be in there with everything at all times. We spend so much of our time, from Google's perspective, going, "Well, actually, let's noindex these pages. These pages are duplicate, let's [inaudible 00:20:47] them." We put these technical checks and measures in place on the basis of the value of the content, whether it deserves to be indexed or not. Now, are we saying basically that when it comes to LLMs, "Go wild, crawl whatever you want, ingest whatever you want, it's all going to be valuable at the end of the day"? Or should we actually be going, "Well, there is still a similar thought process to, this content deserves to be ingested, and available to LLMs"? Or do we just go, "Crack on. Do what you want"?

Ray Grieselhuber:
And actually, you're probably one of the top experts on this question on the Google side of things, because a lot of the incentive when we work with the customers basically 2 or 3 years ago, they hit this point where crawl budgets were getting reduced so dramatically that they were forced to kind of curate how their site was being indexed more. Before that point, I wasn't really seeing any idea that customers cared about reducing the amount of content that was being indexed by Google one way or another. It is basically just like, "What's going to be the most relevant? We don't really know. The funnel's all over the place, customer journey's all over the place, so we're going to just try to be visible for anything."

And so I think the situation today, if you look at how often the LLMs are crawling, we know that ChatGPT bot is very aggressive right now, so they're going to just crawl as much as they can. I think the question is, is there some strategic incentive to say, "I want you to ingest or surface this content versus this over here. I don't really care about this over here"? My personal feeling is that we're not really there yet, because there isn't really any sort of semantic relevance at a site level that OpenAI and the others would care about in terms of saying, "This is the most semantically relevant content." We do see Google doing that now, of course, because they're trying to incentivize site owners to create semantically-oriented clusters that are somewhat similar to each other, and on topic. My impression of how the LLM bots are functioning right now is that it's still pretty early in the game for them to do it. I think they're just doing everything they can to build as large an index of the web as they can.

Patrick Hathaway:
Yeah. I think I'm going to invoke the second use of "it depends" today, because I think there are certainly some businesses that would strategically not want certain aspects of their site to be ingested by LLMs. Let me give you an example of a customer site that deals with legal advice. They are legally required to keep archival records of old legislation that was in place 10 years ago, but it's not up-to-date current legislation. If LLMs are going to get into that stuff, they could start spouting off all sorts of nonsense.

So there definitely will be some instances like that, where from a business perspective, they are not going to want to enable LLMs to crawl absolutely everything, and just absolutely everything. So I think that's a clear area, because I want to move on to the next section on technical accessibility. But I also want to come up and talk about... So there was a post that I saw a couple of days ago actually from somebody who works at Profound, and they had said that OpenAI was hitting their site 12 times more than Google, and that they'd seen it on their customer sites as well. And then in the graph that he shared, PerplexityBot was second, and then that was even double Googlebot.

So this sort of bandwidth toll that these LLMs could exact on sites like this does not feel sustainable, and it's not eco-friendly in the slightest, but it's also going to start costing companies money, already costing companies money. So I do want to address that aspect as well. I don't think it's all just sunshine and roses. Is there an argument that some... You mentioned crawl budget before, the AI crawl budget we should be trying to manage in some way?

Aleyda Solis:
And you know what? Realistically, we complain about Google and the Googlebot and so on, but realistically, now we understand how sophisticated they have become-

Patrick Hathaway:
Yes.

Ray Grieselhuber:
Definitely. Totally.

Aleyda Solis:
... so far when crawling the web. And we can see Cloudflare also sharing data publicly about the different bots out there. So I believe that they should actually start caching better, rendering JavaScript, because we're talking about... When we talk about the real differences, one of the main differences is that we had assumed so far that, yes, clients render JavaScript as long as it follows certain best practices, it can get rendered quickly, and is not necessarily used at scale, and for very core content, et cetera. It could work realistically well, but now with most of these new LLMs coming, and... It was Vercel that did this research, along with the... I think it was the [inaudible 00:26:21].

Patrick Hathaway:
MERJ, MERJ.

Aleyda Solis:
MERJ. Sorry.

Patrick Hathaway:
Yeah.

Aleyda Solis:
Is that they did it, and it showed very clearly that ChatGPT, OpenAI, didn't do it, for example. And it's the most prominent LLM out there. It's important that we take a look at the CDN or log file data to understand how hard we are getting hit out there on one hand. And then on the other hand, a little bit regarding the access: if there are certain areas that we didn't care about, and we allowed Googlebot to crawl it all because it didn't matter, we'll need to rethink that approach, unfortunately, and pay closer attention to what the other bots do.
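As a rough illustration of the log file check described here, the Python sketch below counts requests per crawler by looking for known tokens in the user-agent field. It assumes a combined-format access log where the user agent is the last quoted field. The sample lines and their simplified user-agent strings are made up for illustration, and the token list is only a starting point.

```python
import re
from collections import Counter

# User-agent substrings to look for. These are the crawlers named in the
# discussion; extend the list to match what shows up in your own logs.
BOT_TOKENS = ["GPTBot", "PerplexityBot", "Googlebot"]

# Assumption: combined log format, where the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_bot_hits(log_lines):
    """Return a Counter of requests per bot token found in the user agent."""
    hits = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token in BOT_TOKENS:
            if token.lower() in user_agent:
                hits[token] += 1
                break
    return hits


# Made-up sample lines with simplified user agents, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [01/Jan/2025:10:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Jan/2025:10:00:02 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '203.0.113.7 - - [01/Jan/2025:10:00:03 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.8 - - [01/Jan/2025:10:00:04 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"',
]

if __name__ == "__main__":
    for bot, count in count_bot_hits(SAMPLE_LOG).most_common():
        print(f"{bot}: {count}")
```

User agents can be spoofed, so counts like these are a first look rather than proof of who is crawling.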

Patrick Hathaway:
Yeah. Okay. So let's suppose then either the business case I mentioned before, or the bandwidth concern was presented to a website owner, what technical methods exist to restrict LLM crawlers from accessing content?

Ray Grieselhuber:
I'd love to hear your thoughts on this too, Patrick, because I know you're seeing a lot, but I'm not seeing a lot of evidence right now, at least that the LLM bots are respecting things like robots.txt as a format. Again, I think their incentives are in completely the different direction right now to just get as much as possible. So really, you're talking about server level, reverse proxy level, things that you can do to technically, physically... You block them from seeing certain things. And again, that becomes... Going back to the strategic question of, what you want them to see versus not?
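To make the server-level idea concrete, here is a minimal Python sketch of a WSGI middleware that refuses selected crawlers on selected paths. It is one possible approach, not a recommendation from the panel: the blocked tokens, the `/archive/` prefix and the `demo_app` stand-in are all hypothetical, and matching on user agent alone can be bypassed by a bot that lies about who it is.

```python
# Hypothetical configuration: which user-agent tokens to refuse, and where.
BLOCKED_AGENTS = ("GPTBot", "PerplexityBot")
PROTECTED_PREFIXES = ("/archive/",)


def block_llm_bots(app):
    """Wrap a WSGI app and return 403 for blocked bots on protected paths."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        path = environ.get("PATH_INFO", "/")
        is_blocked_bot = any(t.lower() in user_agent for t in BLOCKED_AGENTS)
        is_protected = path.startswith(PROTECTED_PREFIXES)
        if is_blocked_bot and is_protected:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in for the real site: always answers 200 OK."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK"]


# The wrapped app is what the WSGI server would actually serve.
app = block_llm_bots(demo_app)
```

The same rule could equally live in a CDN firewall or reverse proxy. The point is only that the decision is enforced by the server rather than left to the crawler.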

Patrick Hathaway:
Yeah. I mean, there's certainly ways, like lots of websites these days on things like Cloudflare, you can go in on the firewall rules basically, and restrict certain bots if you don't want them in there. What about the Google-Extended? So slightly different obviously, because it's not the LLM crawler, so I don't know if we're concerned that much about Google and how they're doing things, because we feel like we're on a clearer plane with them about what they're trying to do. Aleyda, have you seen any of your clients using Google-Extended at all [inaudible 00:29:08]?
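For reference, Google-Extended is a robots.txt token rather than a separate crawler, so it is set in the same file as any other user-agent rule. The Python sketch below uses the standard library's `urllib.robotparser` to check what a given robots.txt would allow. The rules and the example.com URLs are hypothetical, and the check only shows what a compliant bot should do, not what any bot actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended entirely, and keep
# GPTBot away from an archive section only.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /archive/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Google-Extended", "https://example.com/page"),
        ("GPTBot", "https://example.com/archive/old-act"),
        ("GPTBot", "https://example.com/blog/post"),
        ("Googlebot", "https://example.com/page"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```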

Aleyda Solis:
Let's say with additional rules or something specifically for them? No, I haven't. I had a client who, on the other hand, wanted more granularly to not be showcased in AI Overviews. And yes, they were playing with the... Not with nosnippet, but with max-snippet, understanding a little bit the implications of it. And we played with that, and we did a few tests with a few resources, a few informational content that they didn't want to be showcased, and it worked effectively. But not blocking the whole thing, and not through the token. Indeed for learning purposes, not in that regard.

Patrick Hathaway:
So nosnippet, the thing that prevents content from being used by Google as a direct input for AI Overviews and AI Mode. And so we have insight in Sitebulb... You can detect that through the indexability report, and I don't know if I've ever seen one in the wild, that people are actually using this stuff. Some of these controls exist, it's all very new, but-

Aleyda Solis:
I don't think that anybody will want to use nosnippet, because that pretty much causes all snippets to be gone, and I wouldn't recommend anybody to do that. But in the case of max-snippet, there was this case of, okay, they also were this type of player that didn't care, were not necessarily that happy to be showcased in featured snippets, because this is what it will end up preventing, if you play with it, how many characters, how much do you want Google to take.

And I'm playing with that, saying, "Okay, the worst consequence is that we are not shown with a featured snippet either, in order to prevent our content, or this type of content, from being shown in AI Overviews. Let's go with it." It actually worked, by the way. But the thing is, and this is my point, they were not shown in the AI Overviews for these queries, they were still shown in the first or second position in the traditional search results, but since Google was showcasing an AI Overview at the top, there were other players showcased there. So what's the point?

And this is the type of conversation that I believe is... Well, it's healthy to discuss with your clients at the end of the day, even if they have some type of policies, whatever, great, but they need to understand the implications that even if they decide to not move forward, like Ray mentioned before the cat is out of the bag. If you don't do it, your competitors will anyway, unless you're a monopoly, and you have control, and you're the leader in your market. Sorry, but it will be a little bit difficult. Yes, speaking of monopolies, right?

Ray Grieselhuber:
Yeah. Exactly.

Aleyda Solis:
But yes, indeed.
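The nosnippet and max-snippet controls discussed above are set in a page's robots meta tag, so they can be audited from the HTML. The Python sketch below is a minimal, standard-library check that reports which of the two a page declares. The sample markup is hypothetical, and the sketch only reads `<meta name="robots">`, not bot-specific meta tags or the `X-Robots-Tag` header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # Directives are comma-separated; some carry a value after a colon.
        for item in (attr.get("content") or "").split(","):
            key, _, value = item.strip().lower().partition(":")
            if key:
                self.directives[key] = value.strip() or None


def snippet_controls(html):
    """Report the nosnippet flag and max-snippet limit declared in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    raw_limit = parser.directives.get("max-snippet")
    has_number = bool(raw_limit) and raw_limit.lstrip("-").isdigit()
    return {
        "nosnippet": "nosnippet" in parser.directives,
        "max_snippet": int(raw_limit) if has_number else None,
    }


# Hypothetical page limiting snippets to 50 characters.
SAMPLE_PAGE = '<head><meta name="robots" content="max-snippet:50, noindex"></head>'

if __name__ == "__main__":
    print(snippet_controls(SAMPLE_PAGE))
```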

Patrick Hathaway:
So do you guys have any thoughts on llms.txt? And I feel like the narrative on this changes almost daily. There was a study, I think, last week from Chris Green, who'd crawled the whole Majestic Million to see who implemented it, and he found 100 sites.

Aleyda Solis:
Very few. [inaudible 00:32:26].

Ray Grieselhuber:
Yeah.

Patrick Hathaway:
John Mueller came out and compared it to the keywords meta tag or something, right?

Aleyda Solis:
Yeah.

Ray Grieselhuber:
Yeah.

Patrick Hathaway:
However, yesterday I think I saw something where Anthropic, I think, was supporting it now. So it feels like it's changing, it feels like... So the only argument that I can find for it to make sense is that, what we mentioned before, what you said, Aleyda, that the major LLMs aren't rendering JavaScript. So if you have a JavaScript-heavy site, and the development cycle for actually moving to server-side rendering would be too heavy, it's a way to actually present the data to LLMs without requiring them to crawl JavaScript. I want to talk about llms.txt, but also, I want to know if you've seen any shift in how companies are approaching rendering, so I have a two-fold question there. Ray, what are your thoughts on llms.txt before we get onto the next [inaudible 00:33:30]?

Ray Grieselhuber:
Yeah, I mean, I don't really see anybody using it either. As a developer myself, I think it's a cool idea. I love the idea of being able to not have to parse out a bunch of HTML to get the content of the page. I think we all do, but I just don't really see any evidence that it's being used right now. I think with Google, the advantage they had over the years enforcing these types of new standards was that they were essentially the main gatekeeper for traffic on the internet, and so they basically said, "If you want to be visible in that gateway, then you're going to do this thing. And if you're not, then you won't."

And with the LLM bots, aside from what Google has, we're just not there yet, to Aleyda's point. The traffic isn't there yet, and so they don't really have the ability to say, "This is how it's going to be." And if they're not interested in doing it, then what's the point at this point? The question around JavaScript rendering is really interesting. You kind of stopped halfway there, but I'd love to hear what your question was on that too.
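For anyone who has not seen one, an llms.txt file is a plain markdown document. The Python sketch below builds one following the layout in the public llms.txt proposal as we understand it: a title, a blockquote summary, then sections of links with short notes. The site name, URLs and notes are hypothetical, and nothing here implies that any particular LLM reads the file.

```python
def build_llms_txt(title, summary, sections):
    """Build llms.txt content: H1 title, blockquote summary, H2 link lists.

    sections maps a heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
EXAMPLE = build_llms_txt(
    title="Example Store",
    summary="An online shop selling running shoes.",
    sections={
        "Policies": [
            ("Returns policy", "https://example.com/returns.md", "How returns work"),
        ],
        "Guides": [
            ("Sizing guide", "https://example.com/sizing.md", "Choosing a shoe size"),
        ],
    },
)

if __name__ == "__main__":
    print(EXAMPLE)
```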

Patrick Hathaway:
Yeah. So my question, really, regarding JavaScript rendering is, if we have seen... Since this research has been made public, and people are starting to realise that it's not just potentially a problem for Google, that it's actually a major problem for LLMs, whether there has been any kind of a shift in how companies are actually approaching rendering?

Ray Grieselhuber:
I mean, it's just more expensive. It depends on, from the LLM side and for any crawler, how much money do you want to spend on rendering JavaScript. I think the whole React, Next.js trend over the last few years did bite a couple of... Not a couple, a lot of companies where they, for some reason, thought client-side rendering on a very heavy faceted search type site would be a good idea. Obviously that's not a great idea, and so people are re-architecting on that basis, and I think the more server-side rendering that's happening as a result of that, I think it'll make it easier for the LLMs to consume that from that regard as well.
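One quick way to see the client-side rendering problem is to check whether key content exists in the raw HTML response, before any JavaScript runs, since that is all a non-rendering crawler receives. The Python sketch below does that check on HTML strings using only the standard library. The two sample pages and the phrase are made up, and a real audit would fetch the unrendered response for each URL first.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text from raw HTML, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_without_js(raw_html, key_phrases):
    """Return the phrases that do not appear in the unrendered HTML text."""
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in key_phrases if phrase.lower() not in text]


# Hypothetical pages: one fills an empty div from JavaScript, one ships the
# content in the HTML itself.
CLIENT_RENDERED = (
    '<html><body><div id="root"></div>'
    '<script>render("Blue running shoes")</script></body></html>'
)
SERVER_RENDERED = "<html><body><h1>Blue running shoes</h1></body></html>"

if __name__ == "__main__":
    print(missing_without_js(CLIENT_RENDERED, ["Blue running shoes"]))
    print(missing_without_js(SERVER_RENDERED, ["Blue running shoes"]))
```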

Aleyda Solis:
I have seen, indeed, a couple of major, let's say marketplaces/e-commerce websites realise in the last few years that the decision of going down that path, because of how easy it is with these new frameworks, ended up backfiring, and they spent a lot on pre-rendering add-ons, let's say.

Ray Grieselhuber:
Totally. Yeah.

Aleyda Solis:
And now with LLMs, that is still... It's now not because they are not going to be found out there at all, and their business is going to go downhill, it's not the case because of the market share that they have right now, but it should be an additional consideration, for sure, to, let's say bulletproof a little bit more the visibility in the future when these are better used in any case, and it's an additional consideration, 100%. I have just seen this shift thankfully too, about, "Okay, next couple of years' plans is to go the other way around, and do the right thing." What they should have done in the first place, let's say.

Patrick Hathaway:
Yeah.

Ray Grieselhuber:
Yep.

Patrick Hathaway:
Well, fingers crossed, that's what happens, and that would be good. For that to be the reason why llms.txt gets adopted would be just the wrong way around. Okay, so let's move on then to... I want to talk about tracking. And so you already mentioned, Ray, that there's data you're tracking in DemandSphere, so please do tell us about that. I want to know what current methods or tools we've got for tracking, where and how content shows up in LLM responses.

Ray Grieselhuber:
Sure. Yeah. Probably everyone here on the call is aware of basically the two modes of how LLMs are going to be returning answers to people. You have something that's coming out of the foundational model, where they've done all this pre-training on everything, and if the answer is in that model, and you happen to be performing well within that model in the way that those training processes took place, then your visibility can look good. And then there is the live retrieval approach. And so the question from the live retrieval side of things is, how is that happening? What sources are the LLMs using when they're trying to retrieve content for people?

And so going with the example of ChatGPT, because it's at least in this kind of new mode, it's the elephant in the room right now, very early on, there was a lot of talk about how they basically have this partnership with Bing to get a lot of their results via the Bing API, and as far as I'm aware, they still do have that partnership. And so the question became, pretty quickly, "Well, okay, is it just a matter of appearing in Bing? And is it going to be a one-to-one correlation between visibility in Bing and visibility in ChatGPT?" And that becomes a question of data, what data are you looking at? What's your dataset? Who's your audience? All of those different things. You can probably just add another counter to the "it depends" for today.

Patrick Hathaway:
Yeah. Got it. Thanks.

Ray Grieselhuber:
Got it. Okay, here we go. I see a live counter going here too. But what we ended up seeing, and the approach we took was, the most valuable type of stuff to monitor is that you can do the model monitoring, that's fine. Basically, there's going to be a lot of similarity, subtracting out any personalization. There's going to be a pretty strong correlation between what you see in the API versus what you see in the web UI. When it comes to the citations that appear, there is not that. You can't hit an API, and get a one-to-one view of what the citations are going to look like in the web UI. And so what that means is, you have to scrape that data.

So if you really care what types of links and citations they're using, taking ChatGPT as an example, you have to be able to get that data from the UI directly, and technically, they make it even harder than it is to scrape Google. But once you do that, then you can start analysing, is that data coming from Bing? Are they using Bing for their live retrieval index or are they using something else?

Patrick Hathaway:
Okay.

Ray Grieselhuber:
And in the data that we've seen so far, it's about actually... They're actually scraping Google themselves quite a bit, a lot more than people thought. So again, this is getting into this funnel of visibility of, "Okay, where do you focus your efforts?" And again, that's going to really come down to what your strategy and your audience looks like.
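The comparison described here can be sketched roughly as follows, assuming you already have the cited URLs captured from the chat UI and the top results from each search engine for the same query. The function names and every URL below are hypothetical sample data, not DemandSphere's method or results.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to host + path so trivial variations still match."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def overlap_share(citations, serp_urls):
    """Fraction of cited URLs that also appear in a given set of search results."""
    cited = {normalise(u) for u in citations}
    if not cited:
        return 0.0
    serp = {normalise(u) for u in serp_urls}
    return len(cited & serp) / len(cited)


# Hypothetical data for one prompt and its equivalent search query.
citations = [
    "https://www.example.com/shoes/",
    "https://reviews.example.org/best-trail-shoes",
    "https://shop.example.net/trail-runner-2",
    "https://blog.example.com/guide",
]
bing_results = ["https://example.com/shoes", "https://other.example.org/x"]
google_results = [
    "https://example.com/shoes",
    "https://reviews.example.org/best-trail-shoes",
    "https://shop.example.net/trail-runner-2",
]

print("Bing overlap:", overlap_share(citations, bing_results))
print("Google overlap:", overlap_share(citations, google_results))
```

A higher overlap with one engine across many prompts would be a hint, not proof, about which index the live retrieval leans on.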

Aleyda Solis:
It's pretty exciting, because I believe that we're in this era that I wasn't at, unfortunately, when SEO started to become a thing. When I started, we had already Overture as a keyword research tool, and the Yahoo one, whatever. But I feel that we're in this era in which everything is starting. So I'm thankful to have access to tools like Rankscale. I think that the founder is here in the chat, peak.ai, DEJAN also, Dan Petrovic created this, actually, free tool.

And all of these tools in general, what they give us is, "Okay, for these prompts that you give us, we will tell you what is the share of visibility, and how visible you are for this query versus all the players that you have also specified as to be your competitor." And if the sentiment is positive, it is neutral, negative, where are these answers coming from, and what are the citations. And based on that, you can identify, "Okay, I am not being showcased here. I will take a look at why my competitor is... I can do something to close that gap, if it actually makes sense for me to go around and try to be cited, be covered by that same website, or very similar websites in nature, and authority that the model will tend to take into account for references."
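As a rough illustration of the "share of visibility" idea, the sketch below counts, for a set of tracked prompts, the fraction of responses that mention each brand. It is not how any of the named tools work internally; the function name, brand names, and responses are hypothetical, and the matching is a naive substring check.

```python
from collections import Counter


def share_of_visibility(responses, brands):
    """Share of tracked responses in which each brand is mentioned at least once."""
    counts = Counter()
    for text in responses:
        lowered = text.lower()
        for brand in brands:
            if brand.lower() in lowered:
                counts[brand] += 1
    total = len(responses)
    return {b: (counts[b] / total if total else 0.0) for b in brands}


# Hypothetical responses collected for three tracked prompts.
responses = [
    "For trail running, Acme and Borealis are popular picks.",
    "Acme's Trail Runner 2 is often recommended.",
    "Many runners like Cirrus for road shoes.",
]
brands = ["Acme", "Borealis", "Cirrus"]

for brand, share in share_of_visibility(responses, brands).items():
    print(f"{brand}: {share:.0%}")
```

Sentiment and citation sources, which the tools also report, would need separate handling and are left out of this sketch.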

But in general, I believe that the biggest difference here is the lack of data that we have in terms of the typical keyword research, the search volume, and popularity that we have had so far, because that... Even if we complain a lot about 100% accuracy and so on, realistically, we have so many data sources now in SEO regarding keyword data and queries data. And unfortunately, we don't have that type of data yet here. The only player that I have seen so far, correct me if I'm wrong, maybe there are more, but so far that I have seen is Similarweb, providing top prompts per... They have a new AI chatbot report, in which they provide not only the traffic of chatbots and the split per chatbot, but also the top pages. And if you click on the pages, the top prompts of the pages. So that gives you a little bit more of additional input, let's say, to understand what type of prompts you should be monitoring, or validating, or identifying opportunities for.

I think that will only increase and improve in the next few months, hopefully. Hopefully the tools themselves or the platforms, let's say themselves start providing more of these insights. I actually asked for it in the Search Central Live event in Madrid two months ago to John Mueller, and he mentioned the concerns regarding privacy, and how the prompts are so long-tail, unlike keywords, which were more [inaudible 00:43:21] and so on. But even like that, I think that there should be a balance there of the information that they can provide. So there's definitely that. I think that we're only starting, and I'm excited to see that more. But yes, I believe that the biggest challenge here is the lack of that sense of prompt popularity or the query-

Ray Grieselhuber:
Totally. Yep.

Aleyda Solis:
... [inaudible 00:43:42] particularly important in this LLM, because realistically... I mean, we have talked about the difference in optimization, but the major difference that I can see is the behaviour, the search behaviour that is much more conversational, that is much more long tail, multi-turn type of interaction with the interface. So I expect that also once that it becomes much more personalised for each type of intent, or search behaviour, I expect that this will only evolve with time. It will highly impact the different type of queries that we will see in each type of platform, for each type of intent. And it will be great to have that source of information to optimise further accordingly. The same content that we have for SEO, or different content, if we see that there's much more of an impact or importance of certain type of topics or queries coming from LLMs, for example.

Patrick Hathaway:
Yeah. Awesome. Right. So I'll tell you what, I had prepared 15 questions, and I think we've done about half of them.

Ray Grieselhuber:
If that.

Aleyda Solis:
Sorry.

Patrick Hathaway:
The thing is, it's such a new topic, it's such a wide-ranging topic. We have a tonne of questions from the guests as well. Does that count? I think that counts as well. We've got loads of questions from the audience, so I think it's really great to try and bring as many of them on as well. So I'll move on to them. If we happen to get through all of them, I can get back to my other questions as well, but I expect there'll be some overlap as well on some of these as well. So let's go through, and if anyone just wants to jump in, if there's something in particular. So I'm going to go with the first one, which is, do you have any ideas on how to track visibility and referrals from AI Mode?

So we did just talk about LLMs in general. AI Mode is so new, and we don't have it in the UK, but... Ray, do you do anything like this at the moment, or not?

Ray Grieselhuber:
AI Mode is still pretty early. Primarily, you have to be logged in as well, and to Aleyda's point earlier, there are no Search Console enhancements or anything. The closest thing that I've seen that I think we'll probably end up seeing, and I'd love to hear what Aleyda says on this as well, is... Like Dana, for example, she's got a really good series on tracking traffic from AIOs. And so I think we'll probably end up seeing things like that, where there are different triggers that you can find and set things up in GA4, and potentially other analytics platforms where you'll be able to see some of that. But we're going to be flying blind on a lot of it, I think for a while.

Aleyda Solis:
Yeah. So far, since it was in the lab on test, and yes, you had to be signed in, logged in, et cetera, there was no way to accurately do it. But now that it has been publicly launched, let's say, I have no doubts that through the Google Analytics, and the same filters or similar filters, that you could apply for AI Overviews. And then also, I'm pretty sure that a lot of these platforms that track clickstream data, now that is an additional tab too-

Ray Grieselhuber:
Right. Yeah.

Aleyda Solis:
... it's very likely they should be able to provide this additional information now. They couldn't before, because it hadn't been launched so far.
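The kind of analytics filter mentioned here can be sketched as a referrer classification step. This is an illustration only: the hostname list is an assumption that would need maintaining, the function name is hypothetical, and it makes no attempt to separate AI Mode or AI Overviews clicks from other Google traffic, which is the harder problem the panel describes.

```python
from urllib.parse import urlsplit

# Assumed referrer hostnames for common AI chat interfaces; extend as needed.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
    "claude.ai": "Claude",
}


def classify_referrer(referrer):
    """Map a referrer URL to an AI chatbot label, or 'Other' if unrecognised."""
    host = (urlsplit(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host, "Other")


# Hypothetical referrers exported from an analytics platform.
for ref in ["https://chatgpt.com/", "https://www.perplexity.ai/search?q=x", "https://www.google.com/"]:
    print(ref, "->", classify_referrer(ref))
```

The same mapping could be expressed as a regular expression in a custom channel group inside an analytics tool rather than in code.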

Ray Grieselhuber:
Yeah.

Patrick Hathaway:
Okay. So I guess a little bit, but we'll have to see. Fingers crossed. Okay. So this one's interesting.

So given that LLMs access information from both their core training data, like Common Crawl, and through live web searches with RAG, what are the top actionable strategies an SEO team should prioritise to most effectively influence visibility?

Right. Could have a whole webinar just on this.

Ray Grieselhuber:
Just on that one topic. Yeah.

Aleyda Solis:
Real brand, real brand, a real brand.

Ray Grieselhuber:
Yeah. Exactly.

Aleyda Solis:
Real expertise. Authority.

Ray Grieselhuber:
Absolutely.

Patrick Hathaway:
I mean, build a brand-

Aleyda Solis:
Be showcased in Wikipedia. So this is very broad. I actually have a presentation about this that I can link into...

Patrick Hathaway:
Yeah.

Aleyda Solis:
If that's okay. Sorry for this promo. I mean, the presentation is free, of course. It's about how to improve your brand visibility and recognition. And at that time, it's true that I mentioned it very quickly for how this also plays a role for LLMs, but it was more focused on traditional search results still at that point. But it is highly aligned with that. It caught my attention, Ray, I don't know if you saw this. A few months ago, there was this big conversation that happened about all the importance of structured data for LLMs. Patrick, did you see it?

Patrick Hathaway:
We did a webinar. We did a webinar.

Aleyda Solis:
It was true, it was true. It was you too. So it was funny how eventually, at some point... Again, it was John Mueller, I believe, at the Google sessions on live in New York, the first one that they did of the series, confirming that, yes, it is important they use it. I mean, honestly, the more signals you can provide, I was going to say to Google, but LLMs in general about your entities, what matters to you, who are you in your website or elsewhere, consistently. It's like in SEO, the alignment of signals, not one thing or the other, that will allow them to recognise you, what you stand for, and where you should be showcased. And the more in that regard, especially for training purposes, the better, I guess. Because that part of the process that we have had so far for traditional search results, that will end up impacting through RAG too, et cetera, grounding, et cetera. Indeed, I think that most of us already know the importance too.

Patrick Hathaway:
Yeah, no, for sure. Okay, so let's leave that one there, because, again, we could spend forever on that one. So what else have we got? Okay, try this one.

So how do you track mentions or links topics... Oh, all right. Is this similar to what Dixon's new tool does? Which is called Waikay.

Aleyda Solis:
Waikay. Yeah, Waikay.

Patrick Hathaway:
What AI knows about you. That's what it is designed for. It's very specific. It's trying to do this stuff. And I'll find a link for it. Anything else [inaudible 00:50:35] answer that question?

Ray Grieselhuber:
Well, when you say links, I think this is talking about citations that we were mentioning before, is that-

Aleyda Solis:
Yeah.

Ray Grieselhuber:
Yeah. Okay. And those are really the two main things, because with any response, you're going to have, obviously, the response text, and then you're going to have the citation links peppered throughout those. And so brand mentions are going to be in the response text, and then potentially also on... If you care about getting... How your brand is mentioned on those citation links beyond that point. And we are starting to see a lot more people very interested in that, because they want to see what's influencing the models back at the way that they're selecting this data. And so it becomes a really big tracking side of things. You have to get a little bit strategic, I think both as a... For use as a platform provider, but also as a brand, how far deep you want to go, because you can just keep going down that tree. But I think there's going to be a lot of tools. This is definitely something we're working on. All the citation tracking is a really hot topic for people right now.
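The two things being tracked, brand mentions in the response text and the citation links attached to it, can be separated with a small sketch like the one below. The function names, response text, and URLs are hypothetical, and capturing the response and its citations in the first place is assumed to have happened elsewhere.

```python
import re
from urllib.parse import urlsplit


def brand_mentions(response_text, brand):
    """Count whole-word, case-insensitive mentions of a brand in the response text."""
    pattern = r"\b" + re.escape(brand) + r"\b"
    return len(re.findall(pattern, response_text, flags=re.IGNORECASE))


def cited_domains(citation_urls):
    """Return the distinct domains cited, in first-seen order."""
    seen = []
    for url in citation_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host and host not in seen:
            seen.append(host)
    return seen


# Hypothetical captured response and its citation links.
response = "Acme makes durable trail shoes. Reviewers often rank Acme above Borealis."
citations = [
    "https://www.example.org/best-trail-shoes",
    "https://example.org/acme-review",
    "https://acme.example.com/trail-runner-2",
]

print("Acme mentions:", brand_mentions(response, "Acme"))
print("Cited domains:", cited_domains(citations))
```

Going "further down the tree", as described, would mean fetching each cited page and running the same mention check on its content, which is where the cost and scope decisions come in.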

Aleyda Solis:
Yeah, [inaudible 00:51:38] do this very well, very enterprise-oriented, also all the tools, they're newcomers, like Waikay is, Rankscale, peak.ai too, they do this. So there's a variety of tools that you can definitely check out, and test, and use the one that better fits your preferences, project and so on.

Patrick Hathaway:
Awesome. Okay. So to the next one. We're going to have another, it depends here. Do you care about tracking your product's visibility as e-commerce [inaudible 00:52:11]?

Ray Grieselhuber:
Yeah, I think you should. I mean, this is where people are going to be shopping. We started getting pretty granular on a lot of the prompt research that we were doing for some large shoe brands, and it was pretty interesting to see, when you look at the competitive view, the level of detail that these answer engines are surfacing for individual products. And again, you have to think about whether or not you care about the foundational model and the mentions within that, versus the citations. But I definitely think you should care about it. And then it's going to be a question of, how are you going to track that transaction and performance, and all that. All that's going to be stuff that has to get worked out over the next couple of years.

Patrick Hathaway:
Cool. Okay, what else we got? Simon. All right.

So considering how much AI is changing at the moment as it matures, should we just stop trying to second-guess what's going to work for organic search, and focus on long-term brand building for now?

Aleyda Solis:
Not really. I believe that there's very big overlap. There was this big conversation the other day, we should stop even calling ourselves SEOs or something else. For me, we're findability experts, and optimizers, and people who choose to, because their audience is there to optimise primarily for TikTok, or for YouTube, or for other platforms search features. And now that most of our users are expected to start using these new platforms to search, we need to apply many of the principles that we're already applying to SEO. It's not only brand building, it's about also crawlability, indexability, it's about understanding and improving the semantic relevance of the content to connect with the user search behaviour that might or might not be different. Realistically, what I see that the biggest change that there will be might be this, the impact on the search behaviour, and the type of questions or queries [inaudible 00:54:27] searching one, or the other... Depending on the preferences, and the evolution of the interfaces too. But I believe that there's a very big overlap, 95% of overlap pretty much at this point on the principles of optimization.

Patrick Hathaway:
I would certainly agree with that. You still need the strong technical foundation at the beginning. All of the sort of digital PR type work, traditional link building, mentions that you get as a result of that is all going to help train the models, the co-citations being mentioned, the topic modelling, all of it, just makes sense that it's all going to work together. So I don't think we should just stop worrying about it. All right, so we've got four minutes left, so I'm just going to try and answer these questions in order, because we've got loads.

Are there any ideas for content writers to approach content moving forward, especially in Your Money or Your Life industries? I mean, this is kind of where we started, I suppose. Is it just brand-focused now and assisting versus answering?

Ray Grieselhuber:
I don't really see that as a dichotomy. You have to kind of do it all. There's not really like, "Okay, we're just going to do brand." Kind of like the previous question, like, "Okay, we're just going to do brand only, and ignore everything else." It doesn't really make sense. Managing a brand on a digital level, you have to be very granular, and you have to cover a lot of bases, and you have to know what your footprint looks like, you have to know what your competitor's footprint looks like, and make sure that you're getting good feedback from all the different channels that you care about.

So it's becoming a lot more complicated right now, but I do see a lot of this tendency to, to Jojo's point, throw the baby out with the bathwater, and just be like, "Okay, we're just going to do this one thing, because it sounds like the top priority right now." But I think it's really important to educate executive teams on the fact that we're in this transition stage from where things used to be, to where it's going to be, and we're not... I don't think anyone can claim to 100% understand what it's going to look like five years from now. But it's going to be a lot of changes, and you have to be able to manage a lot of different things, and avoid this sort of dialectical approach as much as possible.

Aleyda Solis:
Yeah, 100% agree. I mean, we will need to write for what our audiences are searching in the platforms where we really want to be found, because we see that there's a higher engagement, or, "These are the platforms that our potential customers are actually using to look for information, or services, or products like the ones that we can offer." At this point, this is... Since how new LLMs are, we can see that these are pretty much informational queries out there. It's still very simplistic about assisting on research around topics and so on. So I expect that this evolves a lot with new features, new integrations, new interfaces and so on.

So yeah, I can only expect this to evolve, and at the end of the day, it's about creating information, and fulfilling the needs of our users through the whole customer journey. So even if at some point in the future LLMs, for some reason, are only used for certain [inaudible 00:57:48] cases, or use cases, or particular scenarios, that doesn't mean that you won't write about everything else, because at the end of the day, your actual outcomes should be to sell, and very likely, you will need that information to exist on your website for other reasons, also to rank in traditional search results on the platforms too. So yes, there's definitely that. So I will say to see them with a much more holistic approach.

Patrick Hathaway:
Yeah. And I think we're at that inflexion point, like I mentioned at the beginning, you're going to need to be open and agile in terms of the business model you're using, how do we actually monetize, it's going to change. It's absolutely going to change.

Right, it's 4:59, so we have time for one question if I'm really quick. Well, if you're really quick. Right. I don't know if there's a good answer to this one. Google's AI Mode is just released in the U.S., when do you predict this might be rolled out to the rest of the world? Any ideas?

Ray Grieselhuber:
6 to 9 months. We do a lot of work in Japan, and in Europe, outside of the U.S., and it generally seems to be everything's 6 to 9 months behind.

Aleyda Solis:
Yeah. Before the end of the year, very likely. If all evolves as they expect, it might be some... If it's not, then maybe a little bit longer. But if it all goes as they expect, and predict, and can control, the evolution of it without harming their clicks on ads, and impressions on ads, pretty much, before the end of the year, very likely, I expect.

Patrick Hathaway:
Awesome. So we do have a clear answer on that. Definitely 6 or 9 months, or it depends.

Ray Grieselhuber:
Hopefully not.

Patrick Hathaway:
Right. So that's all we've got time for today, folks. Thank you ever so much for watching, and for your fantastic questions, and huge thanks, of course, go to Ray and Aleyda for so generously giving their time and expertise.

Ray Grieselhuber:
Thank you very much.

Patrick Hathaway:
We'll be emailing out the recording tomorrow to everyone who registered, so if you missed the start, don't worry, you can catch up. And a reminder that this time next week, you can catch that Sitebulb masterclass on checking LLM accessibility and indexability signals with Sitebulb. I think Jojo can put the link in the chat again for that one. And finally, you're the first to be invited to June's webinar, which is all about brand SEO and its importance, as generative search becomes the norm. So everything we talked about today. So the link for that is also hopefully in the chat. So hopefully see you there, and thanks again for watching. [inaudible 01:00:12].

Aleyda Solis:
Thank you very much for everything.

Ray Grieselhuber:
Thank you very much.

 

Jojo Furnival
Jojo is Marketing Manager at Sitebulb. She has 15 years' experience in content and SEO, with 10 of those agency-side. Jojo works closely with the SEO community, collaborating on webinars, articles, and training content that helps to upskill SEOs. When Jojo isn’t wrestling with content, you can find her trudging through fields with her King Charles Cavalier.
