Something I've mentioned here and there in passing on the forum deserves a full thread of its own, if only to draw more attention to it so there are more brains thinking about it and providing insight: indexing, and how it relates to Google's 2022 Core Updates, Product Review Updates, Helpful Content Updates, and Spam Updates.
The Foundation of the Discussion
What will illustrate this more than anything are some images, courtesy of Ahrefs' new overview, which has indexed page counts (pages in the top 100 results) baked in alongside the organic traffic. I don't have any specific examples in mind, but it's so common that I'm going to pull up random sites and I bet I'll find plenty:
Literally 3 minutes later I found about 7 of these. I just typed crap into Google like "dog website" and grabbed one, then "home & garden website" and grabbed one, etc. Only one of the 8 total I searched didn't have this issue. That's how common it is.
Now, you might be thinking "well, Ahrefs is probably scraping this data using site: searches and that's not accurate." I have an answer for you regarding that later, which takes the whole conversation up a notch.
Though really, enough people even here on this forum have been noticing it and complaining about it all year. It's not just that they're dropping posts out of the index; even getting consistent indexation is becoming a problem for a lot of people. I haven't had that problem myself, as you can see in my graph here of what's become my main project lately, where I've consistently published 6 posts, all on the same day, every week for about a year now (after a pause around 150):
Okay, there's the foundation of what I'm wanting to point out (but I have no over-arching point I'm making, just presenting a pattern).
An Explanation for What I'm Pointing Out
This all really started in a serious way around the March Product Reviews Update (which I think was the first, though I haven't confirmed that) and especially the May 2022 Core Update. That's when something Google's doing really changed. It's when people first started tanking by anywhere from 40% up to 99%. You can see that in the first big image with 4 graphs if you look at the top left graph. Around May, they went to ZERO search visibility.
By the July 2022 Product Reviews Update, it popped back, which has been my mantra on here. "Something's up, just wait, it'll pop back."
But notice that in the same time period when the traffic tanked to zero, their indexation also dropped by probably 37% or so. Posts that were in the index were no longer in the index. I don't know if they continued to publish more content and the dropped content never came back, or if that slow increase is all of the content returning to the index.
Look at the graph directly below that one, where the timelines match. You can see the May Update cost that site 35% of its traffic and they lost some indexation. It looks like they kept publishing and gained all their dropped posts back plus the new ones. But then it drops again...
Look at the top right graph, where they pushed hard and then not only dropped a ton of posts from the index but also started losing more and more from the September Core Update, September Product Reviews Update, and October Spam Update.
And take note on all 4 graphs, along the updates I'm concerned with, of how severe the drops have been for everyone (even the top right, which looks less severe because the y-axis is compressed due to that big surge they had in traffic).
This is all pretty unprecedented. Indexation fluctuating big time, traffic taking WILD swings. On my graph, it looks like around a 25% hit, but in reality it's about a 40% hit to organic traffic.
Is Google Having Problems?...
What finally prompted me to create this thread was a post I ran into this morning called Was Google's Recent Update About Indexing? They think this just started in October because that's when they first noticed it, but it's been going on for longer. What they did do was uncover something very interesting.
First let me show you what they said about their own indexation:
Between October 10th (9 days before the October 2022 Spam Update) and October 18th (1 day before the update) their indexation went from 98.2% to 75.4%. A total of 63 of their posts dropped out of the index. Everything on the site looks high quality if you click around, though they did say they felt the posts that were dropped out of the index were "thin" and chose to 404 them.
They then decided to start looking around and noticed that developers.google.com went from 86% indexation to 70%, having dropped about 7,000 posts from the index.
Then they went to JohnMu.com and found that it went from 97% indexed to 66% indexed. This is the website of John Mueller, the current iteration of Matt Cutts: Google's industry-facing public relations / propaganda guy. They then comment that they checked which posts dropped out of the index and whether they were "helpful content" or not, which they were. Click around that site and see what John Mueller thinks is worth publishing for some insight into his advice, given that he knows what Google wants from people.
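To put those three reports on the same footing, here's a rough sketch of the arithmetic. The percentages and dropped-post counts are the ones quoted above; the function name and the "estimated total pages" figure are my own additions, and the estimate assumes the total number of published pages didn't change during the drop, which nobody has confirmed.

```python
def indexation_change(before_pct, after_pct, pages_dropped=None):
    """Return (point_drop, relative_drop_pct, estimated_total_pages).

    estimated_total_pages is a back-of-the-envelope figure: it assumes the
    whole drop comes from de-indexed pages and the page count stayed constant.
    """
    point_drop = before_pct - after_pct              # percentage points lost
    relative_drop = point_drop / before_pct * 100    # share of indexed pages lost
    est_total = None
    if pages_dropped is not None and point_drop > 0:
        est_total = round(pages_dropped / (point_drop / 100))
    return point_drop, relative_drop, est_total


# Figures as reported in the post discussed above.
reports = {
    "the blog from the linked post": (98.2, 75.4, 63),
    "developers.google.com": (86, 70, 7000),
    "JohnMu.com": (97, 66, None),
}

for site, args in reports.items():
    points, relative, total = indexation_change(*args)
    line = f"{site}: -{points:.1f} points, {relative:.1f}% of indexed pages lost"
    if total is not None:
        line += f", roughly {total} pages in total"
    print(line)
```

The point of separating percentage points from relative loss is that a site starting at 86% indexed and a site starting at 98% indexed aren't directly comparable on raw points alone.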
Or is This Part of the New Paradigm?
I stated on the forum before all this even started that, with the AI Content wave coming, Google was going to have to start making decisions about what they're going to index or not. The internet is already growing exponentially without AI Content people spewing out extra millions of posts per day. Bing already doesn't index everything it finds... why should Google, especially now that so much is trash?
Google is constantly fighting spam and trying to find ways to kill it like:
- the old Farmer Update (keyword permutations)
- Penguin (link spammers)
- Panda (technical SEO more than anything)
- Spam Updates (cloaking, doorway pages, malware)
- Thin Content penalties (no value added affiliate content)
- Helpful Content (low quality content, thin content, AI content)
- They've increased the time delays and randomizations in everything
I think this is a part of what's going on. Their thresholds on what is worth indexing are being tweaked and things are going very haywire, especially because it's not just about blocking crap from being indexed but also about dropping existing crap out of the index.
Another Possible Explanation
@tyealia shared an interesting piece of text he saved from an AI Engineer regarding his thoughts on the matter (my summary follows the quote):
"From my experience working with AI developers running on large amounts of data and complex multi-variant models, my thought is this has very little to do with your content.
When Google indexes sites they have a dynamic scoring system that continuously takes into account user response data along with the categorizations Google has already done on each piece of content on your site. Every time they update their algorithms and sub-algorithms they need to re-run all the pages on all the sites that fall within the category of sites they were trying to improve the search results for.
For example if they add another factor to one of their algo models - like how many scrolls and clicks somebody does, or how many internal links a page has, or whether the page uses specific code pattern - then all the pages on all the sites this applies to need to be re-run through the new algorithm. The reason is you can’t compare outputs from 2 different ranking models. So they basically wipe the old post-process data used to rank your pages previously and rerun those pages over time with the new algorithm. If you had good content scores before, that gets wiped and they rebuild it from new user experience data generated by the new algorithm. It takes time and ideally you get the same or better ranking afterward.
The pattern you are describing where irrelevant / bad content sites and large high-authority sites (eg Home Depot) are outranking you now seems to be an artifact of the historical ranking data wipe. When Google wipes and has to reconstruct a portion of the ranking data, what’s left is the data that hasn’t changed. In this case it’s probably the historical backlink ranking data that was left which is now inordinately more important in the rankings because the relevance ranking data got wiped and hasn’t been rebuilt yet. So the guys with tons of backlinks are winning temporarily.
Google also takes time to split test. So they will apply the new algo to one population of sites, and keep the old algo for the other group of sites. Then compare results to see which version of the algo “wins”. Your site might be in the “new algo” rather than the control group. They’re probably using AI to design and run these tests on the fly, too.
Google has been doing a massive development push on relevance and NLP in the last 3-4 years. Relevance-based algorithms are dramatically different than old data-value based algos. Now when they do an algo update they’re not just shifting the value they place on one or more known discrete ranking factors. They are transforming the entire ranking model in novel ways they don’t even understand completely.
in a nutshell:
1) it’s probably not your fault
2) Google is probably not singling out a particular site or post model (unless they are explicit about it)
3) you probably lost rankings because a chunk of your historical data got wiped and needs to be rebuilt by Google over time
4) You (and other good sites) should recover after a few months once Google has a chance to rebuild your user response data
5) You cannot prevent these things - you can only mitigate the damage by “doing all the things” on each site, and diversifying across sites, revenue models and niches."
Basically, the explanation is that Google is moving towards more machine learning in their algorithms, which may produce better results but poses three problems:
- It's a (not really) black box algorithm they'll begin to have less control over in terms of variable weighting.
- It requires dynamic processing (less offline data crunching) that happens out in the live index.
- To transition to these new algorithms, they need to start fresh (wipe data) so the old data doesn't interfere with the new.
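To make the "data wipe" idea from the quote concrete, here's a toy sketch of how rankings could flip when user response data is temporarily missing and only backlink data is left. Everything in it is hypothetical: the two-factor score, the weights, the site names, and the numbers are made up for illustration and say nothing about how Google actually scores pages.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    backlink_score: float  # hypothetical 0..1 value for historical link data
    user_score: float      # hypothetical 0..1 value for user response data


def rank(sites, user_data_available=True, w_links=0.4, w_user=0.6):
    """Order site names by a made-up weighted score, best first.

    When user_data_available is False, the user response component is
    treated as wiped (zero), so only the backlink component remains.
    """
    def score(site):
        user = site.user_score if user_data_available else 0.0
        return w_links * site.backlink_score + w_user * user

    return [s.name for s in sorted(sites, key=score, reverse=True)]


sites = [
    Site("niche-blog", backlink_score=0.3, user_score=0.9),
    Site("big-retailer", backlink_score=0.9, user_score=0.4),
]

print("before wipe:", rank(sites))
print("after wipe: ", rank(sites, user_data_available=False))
```

In this toy setup the niche blog wins while its user data counts, and the big retailer wins once that data is zeroed out, which is the "guys with tons of backlinks are winning temporarily" pattern the quote describes.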
What I've noticed is that people who got hit at any point during this timeline of updates all seem to have been rewound in the algorithm to a specific point in time. With the September Core Update, the others I'm watching that got hit and I all seem to have been pushed back to how things were in January / February. And if that's happening to all of us, it means they're either straight up using data from that time period, or some amount of the algorithmic data collected since that point has to be ignored for the time being.
If that's true, it does make sense to see some of the indexation between those time periods go away too. But the problem is there are always edge cases and exceptions, like my site, which hasn't seen a single post drop from the index and has a 100% indexation rate.
What Do SEOs Do Right Now?
At risk of being long-winded as usual, I'm going to close this out with my opinion, and invite everyone to share theirs. The more varying and dissenting and agreeable they are, the better. Everything is invited.
From where I was sitting, watching this happen to people since May 2022, I was telling everyone to hang on. Google's screwing up or doing something different and it'll all pop back to where you were. I know some people on here had to wait until August to September (4 to 5 months) before they started popping back in as predicted. But they're popping in around 70% of where they were and then daily are clawing back a little more (hopefully ending up where they started).
The advice included not having any knee-jerk reactions or assuming you did anything wrong. Don't start 301-ing to a new domain, deleting content, or anything drastic. Just be patient, and in the meantime keep doing what you would have been doing so when everything gets settled you won't have lost your time-delayed growth momentum.
That was my advice at the time, and then I got my turn to be hit in the September Core Update. Now I'm having to take my own advice and I'm hopeful for a timeline of somewhere in late January to early February for popping back to where I was. I'm actually about to double the amount I'm publishing and make a heavy monetary investment into content. That's how confident I am that this shakes out.
As I said somewhere else, it's a game of attrition, meaning it's a question of whether or not you can fund your way through it and out the other side. If not, then I'd recommend taking the time to do a kitchen sink style audit manually in the meantime to set yourself up for as much success as possible when it does shake out, so you have a sudden boon of cash flow at that time.
But I'm one person with only two eyes and one brain. I'd love to hear what others are thinking, experiencing, the doom and gloom they felt when it happened to them, what they think now that they're popping out of this, etc. Not only regarding "what we should do" but what the hell is going on in general, why the indexation issues are happening, why organic traffic would tank as much as 75% for some (no matter the site size, DR score, etc., nobody is safe) and anything else that comes to mind.