How Google fought search spam using AI in 2020

By Tilly Kenyon
Google has said that Artificial Intelligence (AI) offers ‘unprecedented potential to revolutionise’ spam fighting...

Last year Google built their own spam-fighting AI, which can catch both known and new spam trends.

Hacked spam remained widespread in 2020 because the number of vulnerable websites stayed large, although Google has said they improved their detection capability by more than 50% and removed most of the hacked spam from search results. They have also reduced the number of sites with auto-generated and scraped content by more than 80% compared to a couple of years ago.

What is search engine spam? 

Search engine spam refers to techniques that try to manipulate the position a website has in search results, often for pages that contain little or no relevant content.

How Google prevents spam from reaching you 

A lot happens before Google delivers a set of search results. Every day they discover, crawl, and index billions of web pages, and they discover 40 billion spammy pages along the way.

This diagram shows how Google defends against spam.

Firstly, they have systems that can detect spam when they crawl pages or other content. Crawling is when their automatic systems visit content and consider it for inclusion in the index they use to provide search results. 

These systems also work for the content they discover through sitemaps and Search Console. For example, Search Console has a 'request indexing' feature so creators can let Google know about new pages that should be added quickly. Google has previously observed spammers hacking into vulnerable sites, posing as the owners of those sites, verifying themselves in Search Console, and using the tool to ask Google to crawl and index the many spammy pages they had created. Using AI, Google was able to pinpoint suspicious verifications and prevent spam URLs from getting into the index this way.

Next, they have systems that analyse the content that is included in the index. When you issue a search, they double-check whether the content that matches might be spam. If so, that content won’t appear in the top search results.
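The two checkpoints described above, one at crawl time and one at search time, can be pictured with a small sketch. This is purely illustrative and is not Google's implementation: the `ToyIndex` class, the `toy_spam_score` function (a crude word-repetition measure standing in for a trained classifier), the thresholds, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Page:
    url: str
    text: str


def toy_spam_score(page: Page) -> float:
    """Hypothetical stand-in for a spam classifier.

    Returns the share of words that are repeats of another word on the page,
    so 0.0 means every word is distinct and values near 1.0 mean heavy
    repetition. Empty pages are scored 1.0 (no useful content).
    """
    words = page.text.lower().split()
    if not words:
        return 1.0
    return 1.0 - len(set(words)) / len(words)


class ToyIndex:
    """Two-stage filter: one check when crawling, one when serving results."""

    def __init__(
        self,
        score: Callable[[Page], float],
        crawl_threshold: float = 0.8,  # assumed value, chosen for the demo
        serve_threshold: float = 0.5,  # assumed value, chosen for the demo
    ) -> None:
        self.score = score
        self.crawl_threshold = crawl_threshold
        self.serve_threshold = serve_threshold
        self.pages: Dict[str, Page] = {}

    def crawl(self, page: Page) -> bool:
        """Stage 1: refuse to index pages that look clearly spammy."""
        if self.score(page) >= self.crawl_threshold:
            return False
        self.pages[page.url] = page
        return True

    def search(self, query: str) -> List[str]:
        """Stage 2: re-check matching pages before returning them."""
        q = query.lower()
        matches = [p for p in self.pages.values() if q in p.text.lower()]
        return [p.url for p in matches if self.score(p) < self.serve_threshold]


index = ToyIndex(toy_spam_score)
index.crawl(Page("https://example.com/good",
                 "fresh bread recipes with sourdough starter and rye flour"))
index.crawl(Page("https://example.com/stuffed",
                 "bread bread bread bread bread bread bread bread bread bread"))
index.crawl(Page("https://example.com/borderline",
                 "bread bread bread cheap bread buy bread now"))
print(index.search("bread"))
```

In this toy run the heavily repeated page is rejected at crawl time, the borderline page is indexed but held back at search time, and only the first page is returned. A real system would rely on far richer signals than word repetition.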

The result is that very little spam makes it into the top results anyone sees for a search, thanks to automated systems that are aided by AI. Google has estimated that these automated systems help keep more than 99% of visits from Search completely spam-free. As for the remaining fraction, their teams take manual action to further improve the automated systems.

(Image: Google)
