Webinar Recording: Google Data Leak, DOJ Trial & Algorithm
Published 2024-08-22
In this webinar, Sitebulb’s Patrick Hathaway was joined by Marie Haynes, SEO Consultant and Google algorithm guru!
She blew our minds with her knowledge of:
- RankBrain
- Helpful content
- The 3 ranking systems discussed in the DOJ trial
- Life of a Click
- Information Satisfaction Signals
- Gemini, Google Assistant and AI answers
- Marie's predictions for the future of search, and so much more...
Watch the Google webinar recording
Here's the webinar recording to watch at your leisure.
Don't forget to subscribe to our YouTube channel to get more videos like this!
Further reading
- Marie's first blog post in her series on the data leak
- Details about the DOJ antitrust trial
- Blog post on Navboost by Marie
- Cyrus Shepard's article about Google patents involving clicks
Webinar transcript
Patrick: Hi everyone. Thank you for joining our webinar today. My name is Patrick and I'm the co-founder and CEO of Sitebulb. Today we welcome the wonderful Dr. Marie Haynes, and we'll be talking about the various revelations that have come out recently by the leaked API docs from Google and the Department of Justice versus Google Legal Case. Marie is well known in the industry for publicly sharing the results of her never ending investigations and providing commentary and takeaways regarding Google's algorithm updates. And in particular regarding the data leak Marie has an ongoing blog post series exploring the attributes on her website, and she recently published a massive Twitter thread of her takeaways from the DOJ legal case. So I can guarantee you she knows way more than I do about this stuff, so I'm excited and fascinated to pick her brains today. I hope you guys are too.
So for those of you who aren't familiar with us already, Sitebulb is a website crawler and auditing tool. To put it simply, if you do website audits, you should probably be using Sitebulb. And now we have both a desktop crawler and a cloud crawler, so we can handle websites and businesses of pretty much any size. So if you aren't already using it, head across to sitebulb.com/download and check out our free trial.
So before I get into the questions, I've got a little bit of housekeeping. We will have some time at the end of the webinar for your questions, so please put them in the Q&A tab next to the chat box. Don't put them in the chat box itself. There's a separate Q&A tab to put them in instead, and you can also upvote other people's questions in there if you can't think of any yourself. Word of warning, both the leaked API docs and the DOJ case docs are absolutely massive. So you could well come up with a question that Marie doesn't know the answer to yet. So yeah, bear that in mind.
And as always, we have our marketing manager, Jojo, with us behind the scenes. She'll be hanging out in the chat and helping us with the questions, so please go ahead and say hi to her. And I've got one final bit of promo. We have a totally free JavaScript SEO training course that starts in just under two weeks, which you can sign up for on sitebulb.com/javascript-seo-training. Jojo will put the link in the chat, so feel free to go ahead and sign up for that now while I vamp. So this training course is in conjunction with Women in Tech SEO, and it will be run by two brilliant WTS members, Tory Gray and Sam Torres from Gray Dot Company. The main course will comprise three lessons by live webinar, covering in the first week identifying and understanding JavaScript content. Week two will be how to audit JavaScript for SEO and week three will be prioritising and communicating JavaScript SEO issues.
And then there's also a bonus hands-on audit clinic with feedback on real sites that's an exclusive for Women in Tech SEO members. So having seen the course materials ahead of time, we are super excited about the course so please go ahead and sign up. Yeah. Find the link in the chat. Okay. So thank you very much for joining us today, Marie. I feel like everything we're going to discuss today is just one part of the puzzle you've been starting to unravel and there's plenty more to come. So before we actually get into the questions, could you just give us a brief intro and share with everyone all the many different ways they can follow along with your journey of discovery?
Marie Haynes: For sure. Well, first of all, thanks for having me. This is great. I haven't done any podcasts or interviews for a while. So when the API files and really the DOJ versus Google case came out, it's so fascinating that we have to talk about it. I think our industry, a lot of what we do is fixed on things that we've learned over the years that maybe aren't as effective anymore. So yeah. So I just love learning about Google. I guess if you want to reach me ... You said throw out some links. Probably the best way to contact me, or to get to know my stuff, is my newsletter. So if you go to mariehaynes.com/newsletter, then you can see some information there. And we also have a community called the Search Bar. It's community.mariehaynes.com, where lots of people are talking about what Google's doing, especially in terms of AI, because none of us are experts in how AI is changing search, which is wild because Google has been AI first for many years now. They just don't talk about how they use it in search very much.
Patrick: So you've also got a book on AI and a course.
Marie Haynes: Yeah. I know I do too much, right? You can go mariehaynes.com/book. I'll tell you this story of this book. It's called SEO in the Gemini Era: The Story of How AI Changed Google Search. Two years ago when Google launched this new helpful content system, they told us that it was a machine learning system, and I thought, well, my job is understanding Google algorithms. And at the time, I really didn't know very much about machine learning. And I think as SEOs, it's important for us to understand not necessarily exactly that we can write the code to create a machine learning algorithm, but really how Google uses it. And so I set out to try to just understand more. I took Google's course on machine learning and Python. They have a number of courses that I'd really recommend. And then ChatGPT came out at the same time, the end of 2022. And it was amazing to be able to just think out theories and get feedback from this language model. And so for the last ... I didn't realise it would take me two years to write my theories on how AI changed search, but it did. It took two years. And now I'm back into consulting again. I've taken on a few clients to experiment with some ideas of optimising for vector search and understanding RankBrain and what we need to talk about today, Navboost, I think should radically change how a lot of people do SEO. So I don't claim to be an expert. I'm somebody who really loves learning these things and really have dug into these API files, but not completely. Like you said, they're a massive document. So I'm happy to talk about what we can learn that's not just interesting, but that actually changes how we do SEO.
Patrick: Yeah. For sure. I think that's really the key part is, yeah, there's all this stuff out there. What does it really mean and I suppose, how do we take advantage from what we've learned? So we'll get onto all that stuff as well. So we'll start in the beginning though. So I expect anyone watching is aware that there were some private Google Docs that were leaked back in May. And so we're all on the same page, what exactly is it that was leaked and perhaps more importantly, what is it not?
Marie Haynes: Sure. And I hesitate to use the word leaked because apparently these documents were available on GitHub for a couple of years. They were found though, and they were discovered, and that's fantastic that that happened. What we first noticed when we started digging into these files is that there are a list of attributes, and some of the attributes are really interesting. There's talk of page rank in there, and then all of this Navboost stuff. There's a number of attributes that are in this file. The documentation is meant for people who are using Google's cloud services and calling their APIs. It's like Google said, "Here you go. Here's our search algorithm." And it isn't the search algorithm. The part that we can pay attention to are again, the attributes. So we can say, all right, well, this is something that Google can measure or has measured in the past, but we don't know what they're using today.
I personally feel ... I don't know this for certain. But I personally feel that the algorithms changed dramatically in the last ... Well, really since the March core update, Elizabeth Tucker from Google ... There's an article on Search Engine Land where she says that the March core update introduced new architecture and they brought in new signals to the core algorithms. And I think that a lot of these things that we got the API files and we're like, "Oh, wow. Now we can find out exactly how search works." But what we know from them is here are some things that Google can measure. And then it's a guessing game to say, well, how do they measure them and how do they use them? But then the same time last year, we also had the DOJ versus Google trial, which honestly, if you're in SEO, I guess I would look at the API files. But if you have not read Pandu Nayak's testimony from the ... And I can't remember which day it is. I can get you the link if you can't find it. But that testimony is mind-blowing. Talks about RankBrain, RankEmbed-Bert, DeepRank, and then a tonne about Navboost as well.
Patrick: Awesome. I think we'll come on to the DOJ stuff after. But in terms of the API leak, non leak itself, the way you're describing it feels a little bit like how we can infer things from patents. So we could see that Google has patented something, and it means that theoretically they might be using this, they might not be using this, but we can at least go, like you say, we know that they can measure these things. And putting those two things together might be a way that we can go, all right, well ... I think in one of your posts, you mentioned some of the stuff that Cyrus Shepard shared a few years ago on Moz. I remember reading some of that stuff back at the time and going, this just makes so much sense that these would be factors that they would care about, right?
Marie Haynes: Yeah. Yeah. That post by Cyrus is ... I remember thinking it as well because it was talking ... I don't know how much detail you want to go into with that, but talking about clicks, should we talk about that?
Patrick: Yeah. Sure.
Marie Haynes: Let's do that. So this post, he looked at three patents that Google had. And again, every time we talk about patents, we have to put this disclaimer out that says, well, we don't know that Google's using it, but it really made sense at the time. But when we didn't understand Navboost, it was hard to put that into our ideology of what SEO is and how it works. So he talked about three different patents. One of them was being the first click. So if you are a website that people tend to choose, then that potentially is something that could contribute. And again, Navboost. I feel like I'm like saying Navboost too many times. Could contribute to something good basically that maybe your website is helpful. The next one is if you were the longest click. And so we had all these discussions about dwell time and Google's not looking at your Google Analytics and saying, "Oh, well, your dwell time is longer than your competitors, therefore you're better." But it is something that they could consider.
And what they mean by dwell time in this patent is how long it takes you to return to the search results. So somebody did a search, clicked on your site, spent some time on it, went back to the search results, then that's a sign that perhaps you were somewhat helpful. And then the last patent was talking about being the last click, which could be a sign that your website has satisfied the user search. But again, it's clearly not a hundred percent because there's lots of times where I click on something and go, okay, that's it. I'm leaving the internet, or I'm going on to another topic or whatever.
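To make the three click signals described above concrete (first click, longest click, last click), here is a minimal Python sketch. It is only an illustration of the idea, not how Google or any patent implements it: the `Click` record, its fields and the `click_signals` function are hypothetical names, and dwell time is assumed to mean the time between clicking a result and returning to the search results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Click:
    """One click in a single search session (hypothetical record)."""
    url: str
    clicked_at: float              # seconds after the query was issued
    returned_at: Optional[float]   # when the searcher came back to the results, None if never


def click_signals(clicks):
    """Return the first, longest and last clicked URL for one session."""
    if not clicks:
        return {"first": None, "longest": None, "last": None}
    ordered = sorted(clicks, key=lambda c: c.clicked_at)
    # Assumption: dwell time is only known when the searcher returned to the results.
    timed = [c for c in ordered if c.returned_at is not None]
    longest = max(timed, key=lambda c: c.returned_at - c.clicked_at) if timed else None
    return {
        "first": ordered[0].url,
        "longest": longest.url if longest else None,
        "last": ordered[-1].url,
    }


session = [
    Click("a.example", clicked_at=5, returned_at=20),
    Click("b.example", clicked_at=25, returned_at=200),
    Click("c.example", clicked_at=210, returned_at=None),
]
signals = click_signals(session)
```

In this made-up session, `a.example` is the first click, `b.example` has the longest measured dwell time, and `c.example` is the last click. As noted above, none of these is a certain sign of satisfaction on its own.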
Let's jump to Navboost. What Navboost does is store every single query. This was mind-blowing to me. Every time you do a search for every search that you've ever done, unless you have turned off privacy settings, which most people don't do, every single query that's done is stored by Google along with the actions that the searcher took. So which sites they clicked on, again, whether they stayed there for a long time, whether it was the last site that they clicked on. And over the years ... Because Navboost is not a new thing. It's been around for a number of years. Google has been learning how to use those signals with the goal of trying to predict which sites are likely to be the ones that satisfy people. A lot of the trial was talking about just how much they need user data, because again, nobody likes the idea ... Now, since reading it, every time I do a Google search, I think, what have I just taught Google? And it's brutal. If you're in SEO. Some of the niches that we might be doing searches for are not necessarily what I would personally search for. But every single search that we do is teaching Google something about what it is that we found to be relevant and satisfying.
So the trial was talking about maybe needing less user data for a lot of the things that they do in search. And I really think that Google has been learning how to use this information so that they can predict even without seeing it. Then when Elizabeth Tucker said that they brought in new signals to search, it makes sense to me that some of these things that are attributes that were used in the previous Google algorithms now are signals that can be weighed and used in machine learning systems to try to predict what a searcher's likely to find helpful.
Patrick: Yeah. It certainly seems as though there's a lot of ... I think that one of the phrases used was the magic sauce or something like the secret formula was actually all this user data. And if you think about as well, in the framing of how this data got shared with us in the DOJ case, it's come about because Google is so unbelievably dominant compared to everybody else. So one of their huge competitive advantages, they have so much more of this data than everybody else does, which means that, again, it's going to be so hard for anybody to compete with this in the first place. So yeah. It totally makes sense that you would try and not only use that data to inform the algorithms that are working right now, but also how can you use it for machine learning moving forward, right?
Marie Haynes: Yeah. The main point of the trial was about Google's monopoly. The deals that they have with Apple and with the mobile phone carriers, that even if you give people the option to switch to another search engine, most people don't. They just stick with what's there. And so nobody can compete. And I think that there hasn't been a big stink made of Google using user data. I bet you it'll come. I think that'll be the next wave of people who hate Google. We'll be talking about just how non-private it is. But as SEOs, we need to recognise that it's been ... There's an email that came out in the trial as well from years ago that was saying that the teams at Google who were working on search ... The teams who were not working on Navboost were jealous of the Navboost crew because Navboost was, I think they called it getting all the wins. It was one of the best things that helped them produce search results that their metrics said were more satisfying to searchers.
Patrick: Particularly with Navboost, from what we know, it helps take the potential answers for a query down enormously, right?
Marie Haynes: Yeah. Think of how amazing it is that when you do a search, there's trillions of webpages that potentially could answer your search. And so we are familiar with the basics of information retrieval and how Google initially was using keywords to match keywords on pages. You probably remember the days where people would ... You could spam Google just by putting meta keywords in or putting white keywords on a white background. Those days are gone. And obviously Google has evolved since then. Oh, I lost my train of thought there. I was all excited about spamming. Bring me back the-
Patrick: The good old days.
Marie Haynes: Yeah.
Patrick: Reducing the data ...
Marie Haynes: Right. Okay. So they narrow the amount of possible relevant pages down to a smaller number. In the trial I think they talk about tens of thousands. And then they insinuate that Navboost is one of the things, not the only thing, but one of the things that brings the results down to about 200 or 300 results. And so we're talking a small handful of results. And then what's most interesting really is then when they talk about RankBrain, they say that RankBrain takes the top 20 to 30 results and re-ranks them, which is fascinating. And so RankBrain is their AI brain for ranking. We don't know exactly how it works. But that's what I think we should be paying attention to. Navboost is one thing. Navboost tells Google what people are likely to find helpful, and then they can re-rank all of those results according to their understanding of the intent of the query, their understanding of your location, a number of different things like that.
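The narrowing described above (tens of thousands of candidates, then a few hundred, then a re-rank of the top 20 to 30) has the shape of a staged ranking pipeline. The Python sketch below only shows that shape. The function name, the three scoring callbacks and the default cut-offs are all assumptions for illustration, and nothing here reflects how Navboost or RankBrain actually score pages.

```python
def staged_ranking(candidates, retrieval_score, click_score, rerank_score,
                   retrieve_n=10_000, shortlist_n=300, rerank_n=30):
    """Narrow a candidate set in stages, re-ranking only the head of the list.

    All three scoring functions are hypothetical stand-ins supplied by the caller.
    """
    # Stage 1: cheap retrieval keeps the best-matching candidates.
    retrieved = sorted(candidates, key=retrieval_score, reverse=True)[:retrieve_n]
    # Stage 2: a click-informed score cuts the list to a few hundred.
    shortlist = sorted(retrieved, key=click_score, reverse=True)[:shortlist_n]
    # Stage 3: a more expensive model re-orders only the top results.
    head = sorted(shortlist[:rerank_n], key=rerank_score, reverse=True)
    return head + shortlist[rerank_n:]


# Toy usage with integers standing in for documents.
results = staged_ranking(
    list(range(50)),
    retrieval_score=lambda d: d,
    click_score=lambda d: -d,
    rerank_score=lambda d: d,
    retrieve_n=20, shortlist_n=10, rerank_n=3,
)
```

The point of the design is cost: the most expensive scoring only ever touches the handful of results at the top, while cheaper signals do the bulk of the narrowing.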
Patrick: So then if we just look at specifically the Navboost section before we move on to some of the other bits, we now know that they are using it. We now know that user signals are a factor. So what can we do in our search strategies to take advantage of this knowledge?
Marie Haynes: That's a really good question because the obvious thing that people jump to is manipulating click-through rate, which Google's had years of fighting against that. Making it so that that is less likely to be effective. I don't know if you remember the experiment that Rand Fishkin did. I'll just explain it in case anybody in the call doesn't know. I was in one of the conferences. It was a MozCon where Rand said, "Everybody pull out your phone, get off of the wifi and search for best steakhouse in Seattle." It was something like that. And he said, "My friend's site is this one here. So click on that and navigate around the site. Look at the menu, do things that you normally would do." And sure enough, within a small timeframe, the rankings for that site jumped up for best steakhouse. And so after that, all I saw across everywhere were these ads for like, "Hey, we can manipulate your click-through rate."
So why did that work? Well, let me tell you another story. Years ago, my team and I won the Wix SEO competition. And what we needed to do was to rank for one specific term. It was Wix SEO, on one specific day, and whoever outranked this other team would win. And what we did was ... And I think it was based on these experiments that Rand did. I didn't know about Navboost at the time. But I put out a thing in my newsletter saying, "Search for Wix SEO, click on our website." And then we hid pictures of John Mueller all around the site and said, "If you can find all 12 pictures, then we'll give you a free subscription to my paid newsletter." And so that generated signals that looked like people were engaging.
Patrick: And sticking around. Yeah. Right.
Marie Haynes: Yeah. Exactly. Not just clicking. It's very hard to randomise actual human behaviour so that worked. We did that one day, and then the next day our rankings had improved, but it only worked for a day. So it turns out that there's a version of Navboost that's called instant glue. And instant glue helps Google understand fresh queries. One of the testimonies in the trial is from a man named Douglas Ford, and he talks about how if somebody does the search, nice pictures, N-I-C-E pictures, they're probably looking for pictures for their PowerPoint presentation or some type of image like that. But let's say there's a terrorist event in Nice, France, N-I-C-E again and people are searching for N-I-C-E pictures. Well, they're not looking for PowerPoint. They're obviously looking for pictures of the event or the thing that happened. And Google can't wait for weeks for their system to catch up. They need to be presenting relevant results.
So instant glue helps Google understand what people are finding relevant for fresh changing queries. So that's one thing. If you run a news site, how do you use that information? I suppose being aware of what the trends are and really focusing on the things that are likely to get someone to click. So one of the things I think I personally have not paid enough attention to is our meta descriptions. Title tags we've always known are important, but what the user sees in the search results, if you can entice them to click on your site, then that's something that can influence this instant glue system.
Now, in terms of other queries that are not freshly changing, how do we use this information about Navboost? I don't want people to come away from this thinking that ... Because Navboost, I don't think the system is black and white, saying, if you're the first click, the longest click, the last click, therefore you're better. It's using that information to create a model of the type of content that people tend to find good.
So yes ... I'm going to contradict myself. Yes, we should be aiming to be the first click. We should be aiming to keep people on our site not with tricks like disabling the back button or something like that. Things that actively engage people. And we should aim to be the satisfying result for a search. And all of this comes down to the document. Well, if you've been following me at all that I studied the quality rater guidelines for years. And people as SEOs would often say, "Well, there's no way you can turn that into an algorithm. It's just wishful thinking on Google's part." And again, their documentation on creating helpful content. That's the piece that has about 20 questions that you should ask yourself. There was a time where I think we treated that like a checklist and that Google would analyse all these things. But really those are the things that people tend to find helpful. And the very first one on that list is the information ... And insightful.
And so for years, SEO has been like, let's do keyword research to see what everybody else is writing about and then figure out how we can make Google think that ours is better, when really it does come down to what the Google spokespeople say is create content that people find helpful. Which sounds a little trite, and it's difficult for SEOs because it's hard for us to produce original people first content if we are not the business that we're working for. It's a tough task.
Patrick: Marie, we're having a few connection issues. You're dropping out a little bit. I've managed, I think to follow along with what you're saying. If anyone is having issues, just let us know in the chat because I don't think we've really lost any context. I'm just letting you know that it feels like it's dropping a little bit, at least for me. Yeah. Okay. Jojo's just agreed. Marie's dropping out a bit. If that happens again, I'll just interrupt and maybe I'll have to ask you to repeat anything. But yeah. I think that's really got to be the way to think about things. It almost validates the helpful content stuff.
Obviously Google either couldn't or didn't want to come out and say to us, you need to produce content that satisfies these things because we're looking at your click data and we're taking all these user signals, and that's going to affect how you rank, but essentially that's going on underneath it all. That's got to be at least part of the takeaway is that this stuff that they're asking us to do, they're asking us to do it for a reason and here's the reason.
Marie Haynes: I think one of the best things you can do as an SEO is to look at the SERPs. Really put yourself in the shoes of the searcher. And a lot of the time, it's hard for us because you look at your own baby and your own baby's always beautiful. But if Google is ranking other sites above you, it's probably because people are preferring those. I'll tell you an example. Every time I analyse a site that's been impacted by an algorithm update, what I'll do is we'll look at keywords that had declined, and then we'll look at the search results and see what did Google start elevating above my client? And then if you put yourself in the shoes of the searcher, well, in one case, the site that I was looking at was ranking ... They lost rankings for a term, garage refrigerator. And I'm like, I don't know what a garage refrigerator is. And they were outranked by this spammy looking site. But the spammy looking site had a picture at the top. It's just like a small refrigerator that you put in your garage. I knew that because the spammy site answered my question. It met my intent.
Whereas an SEO would look at this and go, "Oh my gosh. This site's littered with ads. It doesn't have even a good title tag." It didn't look like something that an SEO would produce. And this is why a lot of the sites that have been impacted by especially the helpful content system are sites that previously did well because of our knowledge of SEO. And as Google gets better at figuring out what's actually satisfying people, what's actually likely to satisfy people, that's what really matters. So focusing on our users is by far the best thing that we can do.
Patrick: Yeah. I love that as well. The idea that if a result has taken over our site, it could be ... Not definitely is, but it could be as a result of the page being more satisfying and that shows up in Google's results because the user interaction, the Navboost has boosted it more rather than we've dropped or anything like that. It could simply be that they're doing something which is more satisfying, or at least it's more satisfying today. Because also user's behaviour is changing as we go so those user signals will continue to update. All right. Okay. So we've not spoken that much about the DOJ case yet, and time is marching on, and you said that you think it's more interesting or there's more to learn maybe from that than the data leak. So I guess could you expand on that a bit?
Marie Haynes: Sure. We briefly touched on what I think is the most important part. There's a part where they talk about the core machine learning systems that are used in search, which Google doesn't talk about a lot. For years, the narrative was that Google doesn't use machine learning. I think Matt Cutts years ago made a comment. Something to the sense of, well, if they did that, nobody would be able to explain rankings. And what they talk about in the DOJ trial is these three systems. RankBrain is the one that they mentioned first, and then they talk about RankEmbed-Bert, which is fascinating. If I start talking about it, I'll be going on forever. But the whole idea of vector search is different from what we're used to and RankEmbed-Bert is a vector space. And Bert itself is a language model.
When Bert first came out, I didn't know what a language model was, but we all know now like ChatGPT, the GPT is a language model. And so Bert is really, really important to search in understanding what's important in a query and then using math to match it with what is likely to be relevant. And then the other system that they talk about is DeepRank, which is fascinating as well. There's a video that Google put out along with the launch of the knowledge graph, which was 2012. Did you know that the knowledge graph launched within weeks of the Penguin algorithm?
Patrick: Whoa. Really?
Marie Haynes: I know, right? And now it all makes sense because I have always said ... That's how I got started in SEO. I was doing Google penalties just for fun, and then when Penguin came out, nobody knew what was going on. And so I was like, "Okay. Well, let's try to learn about it." And I always said, every time there was a Penguin update, it felt that Google was putting less emphasis on links in their algorithm and more on some other understanding of quality. Well, this video that they put out for the knowledge graph, there's a line in it where the woman says ... And I'm going to misquote her. But she says something like when we search, it's our searches that teach Google. It's similar to those Nice pictures or nice pictures. When we search for something, it's what we click on that tells Google what's relevant.
So let's say I'm searching for best tool for doing site audits. There's things for SEO that you can do to ... You're sure your title tag is going to have those words in it and things like that. But what really tells Google what is relevant is what people found relevant. So if I end up clicking on Sitebulb's site and possibly even if I download your product, there's some debate as to whether Google uses Chrome signals in their algorithms. We don't know this for sure. But the Google privacy policy says that they can see what actions we take on websites and also what purchases we make. So if I'm looking for the best website crawler, do you think that Google knows that lots of people use Sitebulb? They know that, right? It becomes less of an SEO game and more of what do people actually consider?
So throughout the years, Google's been learning to take all of the signals that are in the world. I think for many cases, especially real world businesses that have a brick and mortar presence, actually having people come to your store and purchase from you is, I don't want to say a ranking factor, but is something that's important. And so SEO becomes less of a game of how can we manipulate Google and more of ... Not that we were all trying to manipulate Google. But more about just marketing a business as something that people actually want and enjoy.
Patrick: Yeah. We can test it now. So Jojo's just put it in the chat. Let's get Rand at the next MozCon to tell everybody to go and download Sitebulb. Get your phones out. Yeah. So that's really interesting that they could be using it. So I guess that was not clarified in either or you've not found it yet in either the DOJ case or the API docs, like the idea that they might actually be looking beyond ... Obviously it makes sense that they would be tracking stuff within their ecosystem. But actually as soon as it moves to our ecosystem and our websites that they could be tracking that as well.
Marie Haynes: Yeah. Why did they create Chrome? Before Chrome we had the toolbar. Everybody used the page rank toolbar. Well, why would Google give us the page rank toolbar? Did they want us all to know that our site's page rank was three or four or whatever? Every time they give us something, it's for a reason. I personally believe that they get all sorts of information from Chrome. But what we can find evidence for what's in the trial and what we see the attributes for in the API docs is that it's mostly what happens in the Google app that is important. So Google can see. And there's a slide presentation from the trial, it's called Life of a Click. If you can find that, that's something that everybody should look at. And it talks about not just clicks, but also what people hover over, which I think is really important. Now, when they're trying to test AI overviews, they can see where people have ... They're hovering over the links in the carousel or they're scrolling through the links depending on whether Google shows them that way.
And so the actions that are taken within the Google app itself, whether it's desktop or on your phone, those are the things that help ... They use all of those signals to make the experience better. And this is actually a good question that always comes up. How does Google know? When they put out something and they say, "Oh, we've reduced unhelpful content by 40%," or in their earnings calls, they'll say that people are finding the AI overviews to be useful. That's not just marketing talk. They have actual statistics for that. So in the DOJ trial, they talk a lot about IS, which is information satisfaction signals. And what happens is they give Google ... Sorry. They give the quality raters two versions to look at of Google search. One is they call it frozen Google, which is search as it exists now. And the other is retrained Google, which is the machine learning algorithms and all of the stuff that they've put together to try to improve the helpfulness of the results or the quality of the results. They put that algorithm in front of the raters as well. And then the raters assess, are these results actually better than these results?
So the ratings actually provide Google with a score of information satisfaction. And as long as that score is going up, then it means that the machine learning systems are doing overall a good job at improving. They're not perfect. And that's why we see examples that people will post of like, "Oh, Google got this wrong," because it's a prediction. It's not always going to be perfect. And then the goal is to continually improve more and more. If they're not improving then ... And this is in the trial as well. The results from the raters actually fine tune the machine learning systems, which blew my mind as well.
So the raters ... I know Cyrus Shepard again shared because he was a quality rater for a while that the raters provide Google with this grid of every site that they look at and they just write down what made it. He had a Twitter thread where he looked at a site like a quality rater and he said, "Oh, I thought your site was great. It had nice pictures, it loaded fast, it got my answer quickly." And so that would be something he'd put in his little box, his grid as a quality rater. And then Google can use that information to fine tune the machine learning systems that are trying to predict which results to show people which is fascinating.
Patrick: Yeah. If you think about where the landscape is going with AI overviews and SERPs that satisfy the query without requiring a click, do you think that means that all this stuff is preparing for a world where they actually, they don't have this valuable click data? Because someone clicking on a thing is a really clear evidence to some degree of satisfaction. When you mix it with all the other Navboost type metrics, how long they stayed and whether it was the last click, whether they returned and all those sorts of things. When you have this world where an AI overview is the answer and no one's clicking on anything, how are they going to have that Navboost type factor any longer?
Marie Haynes: Well, and that's just the thing. That's why I think that Navboost is not used in exactly the same way that we just learned it is used. I personally think that the entirety of search is going to change in ways that we won't be happy with as SEOs. There's a video of Sergey Brin talking to a group of ... It was a hackathon, and everybody focused on some guy in the crowd who was wearing this shirt that made him look like he had boobs or something. And that's what people focused on for the media. Meanwhile, what Sergey said in this video was really, really important. Somebody asked him about, well, what about Google's business model? Because there's no way they can switch to AI answers and not rely on websites, because that's how Google makes money. We have this agreement almost ... Well, not almost. We have this agreement with Google that we as website owners create content and Google shows that content so we can make money from it, whether it's from ads or what. Up until recently, and still now, Google needs ads. What Sergey was saying though is that there will probably be a new business model. And he talked about how ChatGPT has a premium subscription and people pay for it. And at the time Gemini Advanced was just coming out. Do you pay for Gemini?
Patrick: No.
Marie Haynes: Now I think every SEO should be paying for and using Gemini regularly, even if you're not finding it useful. Although I am, it's getting more and more useful now. Actually Gemini in AI Studio, which is the latest version, is by far better than any of the other tools. Other than Grok. Grok-2 on Twitter just came out. I'm rambling here. But what Sergey said was that ... So people are going to pay for Gemini. He said if the product is good enough, the business model will figure itself out. Now, I believe soon ... Well even just this week on Pixel phones, Gemini is now starting to replace Google Assistant. And I think the days are gone ... I shouldn't say the days are gone. I think it's going to happen over the next few years that we won't use our fingers to search. We will use our Google Assistant and get AI answers. As much as SEOs hate them, they're continually improving and Google's data shows that people are engaging with them.
So I don't know if I've really answered the question. And it's hard because I think we can't predict what's coming. It's like, in the times when electricity was starting, trying to talk about how to write code. We can't really see what's coming. But I think that the changes that are happening with AI ... And Sundar Pichai said over the next decade that AI in Google Assistant will radically change search. So the question is how do we prepare for that? I think the way we prepare for it is to just continually use the language models and get used to them, but it won't just be a fringe thing that is a chatbot that people turn to for the odd thing. It will replace search. I'm positive of that.
Patrick: To your point about how does Google make money? There's so much of the DOJ trial which is like, this is the ridiculous amount of money that Google makes, and it's almost all from ads. So they can figure the business model out later for AI, but it's not going to happen overnight because they're obviously going to have to keep that ecosystem going until they've got something to replace it with. It doesn't make sense otherwise, right?
Marie Haynes: I think what's going to happen is right now they haven't made it easy for businesses to integrate AI and there's good reason for that. I think it has to continue to improve before a business is going to feel comfortable with that. I think there's actually going to be a tipping point where Google will make it very easy, once it gets good enough, for businesses to use the power of their AI models. And I think we're getting there. And when businesses start making money from AI and start having huge advantages from AI, then we're going to see a shift. And when that happens, the people who know how to use AI are going to do really well. But all of those businesses that use Google's AI via their cloud services pay Google for cloud. And I don't know the actual numbers, but in the earnings call, their revenue from cloud is creeping up, it's higher and higher. Not nearly at the point of what they get from ad revenue from websites. But I think that the goal is that we're going to see less and less reliance on ads. There's something that Sergey said as well in that video. He said something like, "For 25 years we've had this great deal with search." And so I do think that things are going to change.
Patrick: Super interesting. It feels like AI to some degree is inevitable so it's just going to be how and when, I suppose.
Marie Haynes: It's been their rule too from the beginning. There's a clip from Larry Page in the year 2000, and he talks about the thing that holds Google back is computational power and that one day, the goal is that Google could answer any question. You can give it any question and it'll give a perfect answer. And then he says, "And that would be artificial intelligence." So people often will say, "Well, Google's behind OpenAI perhaps." And I think they're moving slowly, but this has been the goal all along, to answer everybody's needs with AI. We can talk about ... Google's got glasses again. I know Google Glass got a bad rap because it was too early. But I demoed at I/O, not the glasses, but Project Astra. And so once that's in glasses where it can see everything in front of you and you can just have a conversation about whatever, that's going to radically change things. I don't know if you want to get into AI in our brains, but that's another topic as well.
Patrick: Yeah. I think that's for another webinar.
Marie Haynes: Yeah, maybe.
Patrick: It feels a bit AI. A bit AI. A bit Sci-Fi even. That's what I meant to say. Okay. So we are at the point of the webinar where we bring in the questions from the audience. So this is your last chance to add any questions yourself or upvote the ones that are already there. From what I can see there are wide-ranging questions. So if we just go through the most upvoted ones first. Jojo, please put one on screen for us. So is there a place to watch a recording of the trial?
Marie Haynes: Not that I know of. No.
Patrick: No. So you mentioned before that there's a doc that's worth going through, right?
Marie Haynes: I know it's hosted on ... I think it's The Capital Forum. I can put the link in the chat here.
Patrick: Jojo, I think should have that I believe. If you look at what I shared with you earlier Jojo, I think you should have that one.
Marie Haynes: Yeah. You've got it there.
Patrick: There we go.
Marie Haynes: That's the Pandu Nayak. And then there is a page that has every day of testimony. There's 40-some days of testimony. I think day 41 in the morning, Douglas Oard, that's another one that talks a lot about Instant Glue and user data as well. That would be good to look at. But no, there's no way that I have found to watch the trial. That would be fascinating.
Patrick: You'd be there for weeks. Just sat there.
Marie Haynes: I would.
Patrick: Okay. We've got Barry. So what do you think about LLMs reaching their limits and the potential for model collapse due to the LLM simply running out of content to learn? It's eating itself almost, right? Any thoughts on this?
Marie Haynes: So I think I'm probably not super qualified to answer this, but I'll give my thoughts on it. We know that the current language models are trained on the web and on other things like the BooksCorpus. Did you know that BERT was trained on Wikipedia and the BooksCorpus? And the BooksCorpus is romance novels and fantasy fiction mostly. It's because they're good at describing things. So there's a lot of text. And then I guess the thought is that if Google gives us less incentive to put text on the web, then what are they going to train on?
Well, Gemini is built to be multimodal from the ground up. And I think that that's the thing that is going to train AI. And it's almost too hard to grasp that. If language models can understand video, then it's almost endless, especially again, if we have AI in our glasses. Right now, let's talk about Elon Musk's AI. Teslas give so much information coming in from the real world in video form every day. We'll look back at the times where, if I wanted to know what happened in the Super Bowl, I read an article about it, whereas today you could take a video of the Super Bowl and I can ask any question of it. Gemini in AI Studio is very good. It still has hallucinations from time to time, but it keeps improving. Very good at taking a video. What I'll often do is I'll go through a client's site and I'll just ramble things like, oh, I don't like this, I'd like to change that, whatever. And then I give it to Gemini and it takes the stuff ... Not just what I've said, but it can see what I'm doing on the video. And oh, are we losing my ...
Patrick: No.
Marie Haynes: No, we're good. Okay. And it can get information from that. So real world information is not going to die. And then in terms of text, we still need journalism. I think that a lot of what has evolved into being called journalism today is not journalism. It's what can we write that will get clicks from search engines and how can we paraphrase what everybody else is writing and put it in a way that search engines like. But when somebody is producing original and insightful news, that's not going to stop. We're still going to have news organisations that do investigative journalism and AI will use that as well. So I think as long as there's information in the world, I don't think that there will be a problem with having stuff for AI to train on.
Patrick: Just imagine them training with the multimodal using romcom movies now with the books. I think Barry's now gone ahead and answered his own question in the chat because he seems convinced that they're going to need literally trillions and trillions of tokens. So there you go. If anyone wants to read it, check the model collapse answer that Barry's given for himself. So Barry, you've been on your own webinar, right? Can we have another question please? Yeah. So not really the topic of the day, but very pertinent to the case. What are your thoughts on that?
Marie Haynes: Will Google be broken up? That's probably beyond my area of expertise. I did see Eric Schmidt, former CEO of Google in a video say recently that Microsoft was supposed to be broken up and it wasn't. And then this lawsuit probably ... I'm no legal expert. But probably will drag on forever. Google has said that they're going to appeal it, and I think it's possible that by the time we get some type of resolution from this, the whole way that we get our information will have changed. So I don't know. I really don't know whether it would be broken up or not.
Patrick: Yep. Fair. Okay, we've got a question from Simon. So thanks to Navboost, is rank tracking useless now?
Marie Haynes: That's a really good question. I think there's a few questions around rank tracking and one is AI overviews. I know there are tools that are doing it, but I don't know ... You did need to be logged in. And then just last week Google started showing AI overviews for non-logged-in people. But the results that you get for AI answers differ every time you do a query. So that's going to be difficult to track.
In terms of Navboost, I still look at keyword rankings. I was doing it just before this call. I still think you can look at keyword rankings and aim for certain keywords. That's a reasonable goal to improve your keywords. And I think that rank tracking will be important for quite some time. But I think that Google's understanding of queries will allow for longer and longer queries. Say you want to rank for "best site auditing tool." Like, okay, maybe that's a query that lots of people do, but eventually it'll be somebody having a conversation with ChatGPT or with Gemini saying, "I'm trying to figure out why these pages are not getting indexed," or blah, blah, blah, blah. And then the LLM says, "Well, maybe you should get a tool," and then you ask it to recommend some. How do you track that query? You can see referrals from LLMs, but you're not going to see what was the conversation that led to that referral.
Patrick: Right. Yeah. And all of the context that would go with it, especially if they can get all the back and forth. How would you ever just reduce that down to something which you could use in a strategy other than just sitting there. But even then you wouldn't even get it.
Marie Haynes: Yeah. It'd be interesting to see SearchGPT, ChatGPT's search product. I haven't used it, but if you search for something in ChatGPT, it's quite good, as is Grok, Twitter's new AI. Grok-2 just came out. The full version came out yesterday. And if you ask Grok-2 right now, "Tell me about the August core update," it writes a better answer than I could. And then if you ask about a recent event ... If something happens in your city and people are talking about it on Twitter, then people are going to go to Grok for ... Not too many people pay for Grok. But eventually most of the world's answers are going to come from language models. And you can't rank track that.
Patrick: Yeah. Do you see GPT or Grok or whatever else as the major threat to Google if there is any threat to Google?
Marie Haynes: I keep switching on this. If you asked me a year ago, I would say that Google has been working towards this goal of being an AI assistant and answer machine for many, many years. And then there's also what came out in the DOJ trial is that it's such an integral part of every phone that we use. So how does anybody crack that? We've seen people complaining a lot lately that search is not useful. And I think that as people turn to language models we're going to see significant competition against Google, whether it's OpenAI, X or Perplexity. I think Perplexity is getting a lot of attention, but I don't think they'll have enough to pull people from Google search. But if these language models are dramatically better than what we can get from Google, I know myself ... It would be interesting, I should look at my search history. I think my searches have dropped probably 80%. If I want to know an answer to something now I'm going to go to a language model first.
Now, there was a time where that would sound like a ludicrous thing that it's going to hallucinate, and what if it gives me something wrong? But I'm telling you, if anybody wants to search something on Grok, send me a message and I'll search it and give you a screenshot because it's mind-blowing.
And then when you think about it, everything that Elon Musk does overtakes industries, and he's got so much data from Teslas, from space, and then Neuralink as well. And so I think that if you had to make me pick one, I do think that there's a possibility that X will overtake Google as a main source. I know that sounds ludicrous right now, but I think it's possible.
Patrick: Maybe that's why he bought it. Not just to incite riots.
Marie Haynes: It is a hundred percent. Why do you pay $44 billion for a social media platform? It's because it's the town square. And then Google made a deal with Reddit, which is the other town square. And so when you see all of these Reddit pages that are ranking really high, it's because Google needs them to train their AI. Which sounds silly like, oh, I don't want Reddit trolls being my answer. But Reddit tells Google what the questions are. There's a paper called FreshPrompt, another good one to look into, where they talk about how you can improve the response from a language model by taking ... It's RAG, retrieval-augmented generation. And what they found was that the questions and answers on Reddit and Quora and other large forums greatly improved the responses of the language model because they could understand what the questions are now.
Patrick: Nice. Wow. There's just so much. I think there is one more question and we have one minute. So let's just give it a go. So when do you think the adoption rate would pick up for Google Gemini, especially with the hard push they're doing right now in the face of Apple Intelligence and GPT?
Marie Haynes: That's a really good question. I think that Google has purposely not wanted Gemini to be widely adopted just yet because they want it to improve. We know that when Bard first came out, it was bad. I was so disappointed when Bard came out because I wanted it to be really good and it just hallucinated on everything. And I think that especially with ... Google had new technology in their machine learning architecture in February of this year where they switched from using just a transformer network to something called a mixture of experts model. And that gave them way more capabilities that the other models don't have. And so it's learning and it's getting better and better.
The Apple question is a good one because I know that we've seen in the news that Apple potentially has a partnership with OpenAI, and the AI assistant, as well as Siri, may possibly be run by ChatGPT. But I also saw somewhere that there will be the option to choose between ChatGPT and Gemini.
I think I'm just speculating on all of this. When will Gemini be mainstream? Well, just this week I got it on my phone. I got Gemini Live and some people are getting it in Google Assistant, and I wrote a piece probably a year ago or so about how I think this will be the main way that people search. And so I think given that so many people already use Google for searches, it won't be much of a leap. They're already starting to get us into the ecosystem with AI overviews. And as people start finding this more and more useful ... My husband does a tonne of gardening and he's not a tech person at all. The things he has created with the help of ... He calls his friends Chad, Brad, and I can't remember what he calls Meta. Meta is a good one too. He's accomplished more than you can imagine in this garden with the help of language models. And so I do think it will become mainstream. I'm really bad at predicting dates. I would've said it would've happened by now. Google did say that they're changing search over the next decade, so who knows?
Patrick: Amazing. Well, that is about all we've got time for I'm afraid. I feel like we could just sit here and talk for hours. There'll be more and more amazing things coming out of your brain. So thank you so much to everybody watching and for the really great questions. And of course, thank you, Marie, for so generously giving your time and expertise and of course spending all the time, the weeks and weeks and weeks digging through all the docs and stuff in the first place. So really appreciate you coming on. Thank you very much.