Massive influx of links

Here's a question for you link guys.

One of my sites is getting a mountain of incoming spammy links from one domain. It is a scraper site that links to images in the image folder of my WordPress site.

In the last 7 days Ahrefs detects 42,000 links incoming.

I have never really bothered with disavow files or the like but I am wondering if the sheer volume of links could be detrimental.

The fact that they are linking to images in my image folder in WP also makes me feel like something isn't right with this.

What do you guys think and what would you do?
 
Are they coming from urls with '-k.html' at the end of them? If so, these are classic content scraper farms. If you rank for anything remotely competitive, you'll pick up dogshit like this on your website's virtual shoe. If it's just one domain it's easy enough to disavow, but I'd just ignore them.
 
Just the link or the image itself with the link?

I would disavow the entire domain
 
Are they hot-linking your images or just straight linking to them as in click to see the image?
 
Are they hot-linking your images or just straight linking to them as in click to see the image?
They are showing the image on their site - so copy and paste - and then below it saying : Source and the source links to - mysite.com/wp-content/images/theimage.jpg
 
I had this exact thing happen to a site several years back.

The site was ranking front page for water heaters and several other related big product keywords.

It tanked, never came back and I lost out on a shit load of income.

Hopefully Google's gotten smarter, or disavow actually does something.
Good luck.

Should note the site linking to me was a blatant porn site and was used to take down several other people in other niches as well. They also did the images folder thing for all of them.
 
It tanked

Sounds like disavowing is the only way. If it was hotlinking then I would suggest an htaccess redirect, but this looks like some black/gray hat negative SEO technique.

Ironically, I've seen this before, back before canonical was introduced. I would disavow the whole domain and each URL. And then block the IP address of the domain and block its referring traffic in htaccess too.
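
For the plain hotlinking case mentioned above, an htaccess redirect could look roughly like the sketch below. It assumes Apache with mod_rewrite enabled. mysite.com stands in for your own domain, and /hotlink-placeholder.png is a made-up file you would have to create yourself. It only affects pages that embed your image URLs directly. It does nothing about copied images or about the links themselves.

```apache
RewriteEngine On
# Let through requests with no Referer (direct visits, many crawlers and privacy tools).
RewriteCond %{HTTP_REFERER} !^$
# Let through requests coming from your own pages (mysite.com is a placeholder).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com(/|$) [NC]
# Do not redirect the placeholder itself, or the rule would loop.
RewriteCond %{REQUEST_URI} !/hotlink-placeholder\.png$ [NC]
# Any other request for an image gets the placeholder instead.
RewriteRule \.(jpe?g|png|gif)$ /hotlink-placeholder.png [R=302,L,NC]
```

If there are other sites you want to allow to embed your images, each one would need its own negated RewriteCond line like the mysite.com one.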
 
They are showing the image on their site - so copy and paste - and then below it saying : Source and the source links to - mysite.com/wp-content/images/theimage.jpg
These sites are usually copying paragraphs from various articles to create a "new" article while inserting related images from various sites. They tend to know what's what based on the images file name. They're auto-generated database sites, basically.

It tanked, never came back and I lost out on a shit load of income.
I actually went halfsies on a site with @stackcash that tanked and he didn't have the time to deal with it. If I recovered it, we'd split the earnings. I did a lot of clean-up on-site and other things, but the smoking gun was that some ding-dong had included this site in their gigantic automated Blogspot network where they linked directly to pictures. The only spam links to the site were to the pictures, and it tanked. After disavowing the entirety of Blogspot (one line in the disavow file) it came right back. This was probably 6 years ago, for perspective.

Google not only didn't have its shit together enough to not penalize based on automated PBNs, but it still to this day allows all of this Google Images spam, and to make matters worse, it allows endless amounts of automated PBN spam on Blogspot. Happily indexes it all and lets it survive.

I don't think it's a sophisticated operation. I'm not entirely sure what the purpose of it all is, because in about half of the cases they do use no-follow. I feel like it's got something to do with selling "Crawling & Indexing As A Service".
 
Just checked and it looks like a ton of shitty sites linking this way delivering a total of 320,000 incoming links in the last 6 months!! I never check on links so this caught me off guard.

I see there are around 2.5k domains sending these links that link directly to .jpg files on my site.

I have always had the vibe (total gut feel, not data based) that messing with a disavow file somehow puts you on Google's radar in a negative way. Almost like we tried to pump our site with shit links, it didn't work, and now we want forgiveness...

Is that me being super paranoid?
 
Is that me being super paranoid?
Yes. Disavow that shit. URL level works faster than Domain level but you have to get them re-crawled and you have to get all of them.

Ahrefs picks up about 28% of links on the dozen or so disavow recovery projects I've done.

Disavowing some of the crap but not all isn't how you win this game. Use Link Research Tools instead. Expensive but necessary.

If after you've disavowed and submitted to crawlers and indexers (and you got them all, not ignored me and tried to do it with just Ahrefs) and it's still not recovered, spam the living shit out of those links.

Nothing brings the spiders to the yard like links.

I refuse to neg SEO people but if they did it to me first, open fucking season ya bastards.
 
Got it thanks.

I know exactly who is doing it too, as only one of the main competitors is looking clean af. Pretty obvious lol.

Do you happen to do cleanups and counter measures by any chance?
 
I do but it's ridiculously expensive because it has to be for me to focus on someone else's product instead of mine. If a 5 figure price tag doesn't deter you, reach out.

If it were me and I knew exactly who was doing it, I'd 301 my site at theirs until they reached out and asked me to stop, then discuss wtf they thought they were doing in the first place and how they're going to make you whole.

Neg SEO is a dirty dirty game I refuse to play but once someone decides to damage my property, gloves come off.
 
you have to get them re-crawled and you have to get all of them.
Do you mean you have to get the original spam url that is spamming your site re-crawled? If so, how would you do that?

Or do you mean you have to get the target page/image on your site re-crawled?

And after you've disavowed all the spam links, do you have to request re-indexing of your pages by Google or should you just wait for the old pages to be recrawled?

I'm dealing with something similar (unrelenting spam links) and trying to fight back so finding tons of value in the discussion above.
 
Disavow, wait 48-72 hours, start indexing the original spam URLs.

If you wait for Google to give a fuck about your site it's like 6 months typically.

I run the spam through every indexing and crawling service available, multiple runs.

If the site still hasn't popped back in 7-10 days, then I start spamming the spam.

Fresh links to them do wonders.
 
Disavow, wait 48-72 hours, start indexing the original spam URLs.

Thank you for responding.

Just to make sure I understand, are you saying 48-72 hours after disavowing the domains that are spamming my site I should reindex the pages on my website that the spam links were targeting?

I'm new to fighting off negative seo... so just wondering how would that resolve the inbound spam links? Doesn't it all depend on Google updating the disavow file to ignore links from those domains?
 
Pretty sure he's saying crawl/index the spam URLs (the ones you want disavowed) rather than the URLs on your site that are getting linked to. It sounds like Google needs to crawl the spam page and follow the spam link to your site to recognize all the elements and have the disavow fully kick in.
 
Thanks for clarifying. That's what I thought was being suggested, which is why I'm confused...

I know how to request the indexing of pages on my website... but how would you get Google to index a page on someone else's website?

I'm searching for the answer but not finding anything online.

I'll obviously keep searching but any suggestions on how to do that would be appreciated.
 
You have to get them to recrawl the page. That’s all there is to it.

It used to be as easy as pinging the URL on one of the many ping aggregator list sites Google was known to crawl. When that stopped working, people started autogenerating giant websites (like those image scrapers) to insert links into so they'd get crawled. Tricks that have come and gone include spinning up Blogspot sites with pages that are simply lists of links, creating “video RSS feeds”, using 301 redirects, and more. None of it is working anymore, which is to say: don’t pay for any crawling services. I tested a bunch last year and they’re all toast but still offering their services. It’s a waste of money.

That’s why @Grind’s final suggestion is spam links. Google crawls through links. That’s their core job. You can spam spammers but I’d try not to spam good sites directly, which is where redirects come in. The last (or perhaps only) resort is to hit them with links. Fiverr is the place to go for cheap dofollow comments and GSA spam and all that crap. Pass a big list to a few of those guys and a bunch of {click here|this page|on this site} type of anchors and let her rip.
 
How does this apply to domain level disavows?

Do you run the individual disavowed URLs through the indexers, or just submit at the domain level?
 
I could have sworn I recalled a conversation like this at BuSo; it was our thread here.

Some more context:


Also some hotlinking tips:

https://webmasters.stackexchange.co...image-protection-hurt-search-engines-indexing

I think you can simply block ALL referring traffic from a specific domain; I'll need to have A.I. look at this in the morning.

Edit:

Here is the code, assuming BuSo is the bad guy in the example:

Code:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http(s)?://(www\.)?buildersociety\.com [NC]
RewriteRule \.(jpg|jpeg|png|gif|bmp|zip|rar|mp3|mp4|avi)$ - [F]
 
Wow. Looking at my link report is pretty wild.

For some pages on my site, I used to rank number one. Looking closely, some of these pages have a couple of good backlinks that likely propelled them to number one.

Then someone got pissed that I came out of nowhere to steal their long-held ranking and sent a bunch of bad links at the pages.

A bunch of .best, .online, .pics, and .de links on certain pages, all going to dead or very low-quality sites.

Seems like someone tried to get me too. Another thing to fix...

Quick note: as I dug deeper, I discovered that the image situation you described is also very much happening to my website, from sites that Chrome won't even let you visit.

Cloudflare has a "scrape shield" feature that allows you to disable image hotlinking with a simple slider switch.

Just turned it on and also manually disavowed hundreds of spam links from my GSC.
 
Looking at my backlink audit, I have a ton of garbage from a metric shit-ton of web.app subdomains. Apparently, this is Google's Firebase web app service. I can't imagine there is anything good coming from any subdomains on there. I also looked at my good links and didn't see anything from *.web.app.

ChatGPT tells me that I can block all image links from any subdomain with:

Code:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(?:.*\.)?example\.com$ [NC]
RewriteRule \.(jpg|jpeg|png|gif|bmp|zip|rar|mp3|mp4|avi)$ - [F]

But I am curious - why not just block ALL traffic from these garbage sites instead of just the image files?

Also, any concerns about the length of the htaccess file? These could get big pretty fast. ChatGPT tells me this. Do the smart humans here agree?

In summary, while a large .htaccess file may present some minor concerns related to performance and maintenance, it is generally an acceptable approach for blocking SEO spam and other unwanted traffic. Just be mindful of organization, documentation, and periodic review to keep it manageable.
 
But I am curious - why not just block ALL traffic from these garbage sites instead of just the image files?

Also, any concerns about the length of the htaccess file? These could get big pretty fast. Chat GPT tells me this. Do the smart humans here agree?

Be careful: the code I posted blocks images from loading on the bad guy's website. Your code doesn't do that. Notice the HTTP_REFERER in my code; that's what blocks the loading. Yours matches on HTTP_HOST, which is the hostname being requested (your own site) rather than the site the request came from, so I'm not sure that's the same thing. Test it out like I did in this thread: Prevent Hotlinking of Images (Apache and NGINX)
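
As a rough illustration of the difference, a Referer-based version aimed at the web.app subdomains mentioned above might look like the sketch below. It assumes Apache with mod_rewrite enabled and has not been tested against this particular site. Keep in mind that the Referer header can be missing or faked, so this is a filter and not a guarantee.

```apache
RewriteEngine On
# Match the Referer header (the page the request came from), not HTTP_HOST
# (the hostname of your own site that is being requested).
# ([^/]+\.)? allows any subdomain, e.g. something.web.app, as well as web.app itself.
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?web\.app(/|:|$) [NC]
# Return 403 Forbidden for the listed file types when the condition matches.
RewriteRule \.(jpg|jpeg|png|gif|bmp|zip|rar|mp3|mp4|avi)$ - [F,NC]
```

The trailing (/|:|$) keeps the pattern from also matching unrelated hosts that merely start with the same characters, such as web.app.somethingelse.com.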

By blocking traffic you are assuming there is traffic coming from these sites to yours. Do your analytics show that there is? Otherwise I don't know of a reason you'd want to block traffic, especially if it's real people.

Also, any concerns about the length of the htaccess file?
No, I wouldn't worry. Mine is pretty big on my sites, 17 KB, and I'm fine.

Some of the stuff I have can be blocked at the Apache level to reduce it, but I haven't noticed a performance problem.
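
If the file does start to grow, one way to keep it tidy is to chain several referring domains in front of a single rule instead of repeating the whole block per domain. This is only a sketch, assuming Apache with mod_rewrite. scraper-one.example and scraper-two.example are placeholder names, not real offenders.

```apache
RewriteEngine On
# One RewriteCond per referring domain (subdomains included), chained with [OR].
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?web\.app(/|:|$) [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?scraper-one\.example(/|:|$) [NC,OR]
# The last condition in the chain must NOT carry the OR flag.
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?scraper-two\.example(/|:|$) [NC]
# A single rule then covers every domain listed above.
RewriteRule \.(jpe?g|png|gif)$ - [F,NC]
```

The same directives can also go in the main server or virtual host config instead of .htaccess, which is what blocking at the Apache level amounts to.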
 
Thanks for translating my gibberish into easily understandable context. I'm sure Bernard is grateful too.
 