The operation of a search engine can be summarized in two steps: crawling and indexing.

Crawling

Search engines use web crawlers, also called bots, to go through the pages of websites. Much like a user browsing the Web, they move from one link to the next and collect data about the pages they visit, which is what the search engine later serves to its users.

The crawling process starts with a list of web addresses gathered from previous crawls and from the sitemaps that websites submit through webmaster tools such as Google Search Console.

Once they access these websites, the bots look for links to other pages and add them to the list of pages to visit, as in the sketch below. Bots pay particular attention to new sites and to changes on existing websites.
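As a rough illustration of that link-following behaviour, here is a minimal sketch using only Python's standard library; it visits pages breadth-first and queues the links it finds. The seed address and page limit are placeholders, not a real crawler configuration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=10):
    """Visit pages breadth-first, following links until max_pages is reached."""
    frontier = list(seed_urls)   # URLs waiting to be visited
    visited = set()              # URLs already crawled
    while frontier and len(visited) < max_pages:
        url = frontier.pop(0)
        if url in visited:
            continue
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except OSError:
            continue             # unreachable page: skip it
        visited.add(url)
        parser = LinkExtractor()
        parser.feed(html)
        # Turn relative links into absolute ones and queue the newly found pages.
        frontier.extend(urljoin(url, link) for link in parser.links)
    return visited


# Example run with a placeholder seed address.
print(crawl(["https://example.com/"]))
```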

The bots themselves decide which pages to visit, how often, and how long to spend crawling each website, so fast loading times and regularly updated content are essential.

It is also common for a website to need to restrict the crawling of certain pages or specific content so that they do not appear in search results. For this, the "robots.txt" file can be used to tell search engine bots not to crawl individual pages or sections.
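Python's standard library includes a parser for this file, which makes the rule easy to demonstrate; the site, page, and robots.txt contents below are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt of the hypothetical site might contain, for example:
#   User-agent: *
#   Disallow: /private/
robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()  # download and parse the file

# A polite bot checks the rules before fetching each page.
page = "https://example.com/private/report.html"
if robots.can_fetch("*", page):
    print("allowed to crawl", page)
else:
    print("robots.txt disallows crawling", page)
```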

Indexing

Once a bot has crawled a website and collected the necessary information, the pages are indexed and arranged according to their content, their authority, and their relevance.

This way, when we submit a query to the search engine, it is much easier for it to show us the most relevant results.
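Conceptually, such an index maps each term to the pages that contain it, so answering a query becomes a quick lookup rather than a scan of every page. The toy sketch below uses invented page content to illustrate the idea.

```python
from collections import defaultdict

# Toy collection of crawled pages (invented content for illustration).
pages = {
    "page1.html": "fresh coffee beans roasted daily",
    "page2.html": "coffee brewing guide for beginners",
    "page3.html": "tea and herbal infusions",
}

# Build an inverted index: each word points to the pages that contain it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# Answering a query is now a lookup, not a scan of every page.
print(index["coffee"])   # {'page1.html', 'page2.html'} (in some order)
```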

In the beginning, search engines ranked pages by the number of times a word was repeated. When a search was made, they looked up those terms in their index to find which pages contained them in their text, ranking highest the page that repeated them the most times.
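That early word-count criterion is easy to sketch; the page texts below are invented for illustration.

```python
# Early-style ranking: pages that repeat the query term more often score higher.
pages = {
    "page1.html": "coffee coffee coffee beans and more coffee",
    "page2.html": "a short note that mentions coffee once",
    "page3.html": "nothing about the topic at all",
}

def rank_by_term_count(query, pages):
    """Return pages ordered by how many times they repeat the query term."""
    scores = {url: text.lower().split().count(query) for url, text in pages.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(rank_by_term_count("coffee", pages))
# [('page1.html', 4), ('page2.html', 1), ('page3.html', 0)]
```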

Today they are far more sophisticated and base their criteria on hundreds of different factors: whether a page contains images, videos, animations, or microformats, its publication date, and many other aspects. Above all, they now give priority to the quality of the content.


Once the pages are crawled and indexed, it is time for the algorithms to act:

Algorithms are the computer processes that decide which pages appear higher or lower in the search results. When a search is performed, the algorithms check the indexes and, weighing hundreds of ranking factors, determine which pages are the most relevant. All of this happens in a matter of milliseconds.
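The real factors and their weights are not public, but the general idea of combining several signals into a single score can be sketched as follows; the factor names, weights, and values here are entirely invented.

```python
# Invented signals and weights: real engines use hundreds of undisclosed factors.
WEIGHTS = {"term_match": 3.0, "authority": 2.0, "freshness": 1.0}

candidate_pages = [
    {"url": "page1.html", "term_match": 0.9, "authority": 0.4, "freshness": 0.8},
    {"url": "page2.html", "term_match": 0.6, "authority": 0.9, "freshness": 0.3},
]

def score(page):
    """Combine the individual signals into a single relevance score."""
    return sum(WEIGHTS[factor] * page[factor] for factor in WEIGHTS)

# Order the results from most to least relevant.
for page in sorted(candidate_pages, key=score, reverse=True):
    print(page["url"], round(score(page), 2))
```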