With the rise of the Internet, search engines have become a way of life for millions of computer users worldwide. Search engines help users find information and websites with relevant answers to their questions by using keywords and metadata to match users with their interests. Although nearly everyone uses search engines daily, few people know how they work. In this article, we will explore the types of search engines and how they operate.
What Is a Search Engine?
A search engine is an index of websites that uses search algorithms and predetermined rules to match consumers with their interests. When you use a search engine, all you have to do is type in a keyword or a series of keywords and the search engine will display a number of websites that contain those keywords in the correct context. Many search engines not only display websites but also images, videos, news, and documents depending on what you are trying to find. While search engines often display millions of results for your keywords, you can usually find what you are looking for by browsing through the first few results.
While each search engine (e.g. Google, Bing, etc.) has a different approach to data gathering, they all perform three core functions:
- Searching or “crawling” the Internet by following links;
- Keeping an index of what they find and where they found it; and
- Allowing users to look for words or combinations of words found in that index.
Search Engine Spiders
The key to a search engine’s success lies in special programs called “spiders” (automated software robots, also commonly referred to as “crawlers”) that build lists of the words found on a web page or website. They are usually sent out from a central computer and aimed first at popular websites and servers that experience heavy use. The spiders crawl over each web page, cataloging its words and links and following those links to other websites, and then create an index, or listing, of “key search words” that online users can use to find the pages they are looking for.
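The crawl described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PAGES` dictionary is a made-up, in-memory stand-in for the web (a real spider would fetch pages over HTTP), and the names `PageParser` and `crawl` are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real spider would download these pages.
PAGES = {
    "http://a.example/": '<p>apple pie recipes</p><a href="http://b.example/">more</a>',
    "http://b.example/": '<p>apple orchard photos</p><a href="http://a.example/">back</a>',
}

class PageParser(HTMLParser):
    """Collects the words and outgoing links found on one page."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every <a href="..."> link.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Record every word of visible text, lowercased.
        self.words.extend(data.lower().split())

def crawl(start_url, pages):
    """Breadth-first crawl: catalog the words on each page, then follow its links."""
    catalog = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in catalog or url not in pages:
            continue  # already visited, or not reachable in this toy web
        parser = PageParser()
        parser.feed(pages[url])
        parser.close()
        catalog[url] = parser.words
        queue.extend(parser.links)
    return catalog

catalog = crawl("http://a.example/", PAGES)
```

Starting from a single page, the spider reaches the second page only because the first one links to it, which is why well-linked pages tend to be found sooner.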
Over the years, search engines have upgraded and improved their systems to give users more convenience and faster response times. One such improvement is the use of “meta-tags”: keywords placed in a web page’s markup that make the page easier for a spider to pinpoint and index.
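As a rough illustration of how a spider might read meta-tags, the sketch below pulls the keyword list out of a page’s `<meta name="keywords">` tag using Python’s standard `html.parser` module. The sample HTML and the class name `MetaKeywordParser` are invented for this example, and real search engines may treat such tags very differently.

```python
from html.parser import HTMLParser

class MetaKeywordParser(HTMLParser):
    """Extracts the comma-separated keywords from a <meta name="keywords"> tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            # Split on commas, trim whitespace, and normalize to lowercase.
            self.keywords = [k.strip().lower() for k in content.split(",") if k.strip()]

# Made-up sample page for illustration.
html = '<html><head><meta name="keywords" content="Apples, apple pie, Baking"></head></html>'
parser = MetaKeywordParser()
parser.feed(html)
parser.close()
keywords = parser.keywords
```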
Making Sense of the Data – Indexing the Words
Once the spiders have collected the data, the search engine processes and stores it in a way that makes it easy for people to access.
It would be easy to store each word and its URL in a single, searchable database, but a user looking at such bare-bones information would have no way of knowing which page is more important or which best fits the search in mind.
For example, someone searching for “apples” may want apples for making apple pie or a picture of apples. If the database simply listed “apples” alongside URLs, the user would have to search tediously through the results for the specific context of the word.
For this reason, most search engines index and collect more than mere words and URL addresses: they may index words that are in specific locations on the page (title and sub-title areas, for example), meta-tags, frequency of words on the page, etc.
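A minimal sketch of such an index follows. It scores each page for each word by counting occurrences, and it counts a word in the title more heavily than one in the body. The weights, the sample pages, and the function name `build_index` are assumptions made for this example, not values any particular search engine is known to use.

```python
from collections import defaultdict

# Assumed weights: a word in the title counts three times as much as one in the body.
TITLE_WEIGHT = 3
BODY_WEIGHT = 1

def build_index(pages):
    """Map each word to {url: score}, where the score reflects frequency and location."""
    index = defaultdict(dict)
    for url, page in pages.items():
        for field, weight in (("title", TITLE_WEIGHT), ("body", BODY_WEIGHT)):
            for word in page[field].lower().split():
                index[word][url] = index[word].get(url, 0) + weight
    return index

# Made-up pages for illustration.
pages = {
    "http://pie.example/": {"title": "Apple pie", "body": "peel each apple then bake the pie"},
    "http://photo.example/": {"title": "Orchard photos", "body": "a photo of an apple"},
}
index = build_index(pages)

# Pages containing "apple", best score first.
ranked = sorted(index["apple"], key=index["apple"].get, reverse=True)
```

Here the pie page outranks the photo page for “apple” because the word appears in its title as well as its body.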
Google and the other search engines are in a constant race to improve their search capabilities. One approach is concept-based searching, which uses statistical analysis of pages containing the words one is searching for in order to identify other pages of interest. Another approach is natural language queries, where one types a question the same way one would ask it of another person: “What is a search engine?” for example.
The best-known search engine built around natural language queries is Ask.com, originally AskJeeves.com, which “zooms in” on key words (e.g. What, search engine) and uses them as the starting point for checking through its index of words.
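One simple way to “zoom in” on the key words of a question is to strip punctuation and discard very common filler words. The sketch below does this with a tiny, assumed stop-word list; it is not a description of how Ask Jeeves actually worked, and the name `key_words` is hypothetical.

```python
import string

# Assumed, deliberately tiny list of filler words to ignore.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to"}

def key_words(question):
    """Return the words of a question that are worth looking up in the index."""
    cleaned = question.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in cleaned.split() if word not in STOP_WORDS]

terms = key_words("What is a search engine?")
```

The remaining terms can then be looked up in the index in the usual way.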
A search algorithm is a set of predetermined rules that control how a search engine works. For example, a search algorithm may contain instructions on how to display results based not only on the keywords entered but also on the order in which they were entered. Many search engines also let users customize their results. For instance, Google allows users to put keywords in quotation marks in order to find webpages containing that exact phrase or sentence. Rules like these help search engines display the most relevant information.
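The quotation-mark rule can be illustrated with a small matcher that treats quoted text as an exact phrase, in which word order matters, and everything else as loose keywords. This is a sketch under simplifying assumptions: text is split on whitespace only, and the function names are hypothetical.

```python
import re

def parse_query(query):
    """Split a query into quoted phrases (as word lists) and loose keywords."""
    phrases = [p.lower().split() for p in re.findall(r'"([^"]+)"', query)]
    phrases = [p for p in phrases if p]  # ignore quotes that contain only spaces
    keywords = re.sub(r'"[^"]+"', " ", query).lower().split()
    return phrases, keywords

def contains_phrase(words, phrase):
    """True if the words of `phrase` appear consecutively and in order in `words`."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

def matches(text, query):
    """True if the text contains every quoted phrase and every loose keyword."""
    words = text.lower().split()
    phrases, keywords = parse_query(query)
    return (all(contains_phrase(words, p) for p in phrases)
            and all(k in words for k in keywords))

hit = matches("A classic apple pie recipe", '"apple pie" recipe')
miss = matches("Pie made with one apple", '"apple pie"')
```

The second text contains both words but not the phrase, so it does not match the quoted query.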
Almost every search engine has a section at the top of its search results, usually labelled “sponsored listings”, “sponsored links”, “premium results”, or something similar. This section does not contain more relevant information; instead it contains webpages that someone has paid to have placed there. Companies and organizations often do this so that people see their website every time a specific or general keyword is entered. Not every search has a sponsored listing, as not every keyword has someone willing to pay for it. For example, if you type “search engine” into Google, there will be no sponsored listings. However, if you type in a more business-related keyword such as “data integration”, you will see three sponsored listings at the top of the page in a yellow box labelled “Sponsored Links”. These listings are not in any specific order, and if you refresh the page, they may change or appear in a different order.
Hybrid Search Engines
Hybrid search engines combine human-controlled directories with the powerful search algorithms that control crawler systems. They often display results from both types of index, giving users more appropriate and direct responses. With a hybrid system, you can perform a standard keyword search and then browse through a directory of comments and resources, alongside an automated search of popular websites that contain the information you are seeking.
A directory could be considered another type of search engine, one that is controlled by actual people rather than an automatic system. Directories contain sources of information that other users have seen and added to the list so that others may find the information without having to stumble upon it on their own. Directories often contain descriptions of the website, image, video, or document in question, and some even allow users to comment on the media, so that other users can see not only what the information is about but also whether other people found it relevant and helpful. While directories are more personal and contain information written by real people, they lack the capabilities that allow search engines to process large numbers of webpages.
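To make the hybrid idea concrete, the sketch below merges results from a small human-edited directory with results from a crawler-built index. Listing the curated entries first is an assumption made for this example, as are the sample data and the name `hybrid_search`; actual hybrid engines may blend the two sources differently.

```python
def hybrid_search(keyword, directory, crawler_index):
    """Return curated directory matches first, then crawler matches, without duplicates."""
    keyword = keyword.lower()
    # Directory entries match when the keyword appears in the human-written description.
    results = [entry["url"] for entry in directory
               if keyword in entry["description"].lower().split()]
    # Crawler matches come from the automatically built word index.
    for url in crawler_index.get(keyword, []):
        if url not in results:
            results.append(url)
    return results

# Made-up data for illustration.
directory = [
    {"url": "http://pie.example/", "description": "Tested apple pie recipes"},
    {"url": "http://cars.example/", "description": "Reviews of used cars"},
]
crawler_index = {
    "apple": ["http://photo.example/", "http://pie.example/"],
}

results = hybrid_search("apple", directory, crawler_index)
```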
What Are the Top Internet Search Engines?
While there are thousands of search engines and directories on the web, only about five are widely used by English-speaking Internet surfers: Google, Bing, Yahoo, AOL and Ask.com.
If you are wondering why each of these search engines is so popular and unique, here is some information about each of them.
Google Search is owned by Google, Inc. It is widely considered to be the most popular search engine on the web, and since its inception in 1998 it has revolutionized the Internet search industry. You can visit Google Search at the following link: http://www.google.com. You will notice that, compared to other search engines, the homepage or landing page is uncluttered and quite empty except for the logo, search bar and a few links to other functions. This uncluttered style originally differentiated Google from other search engines (web portals) such as Yahoo!, which in some cases overloaded viewers with links and advertisements. It should be noted that there are no advertisements on the homepage of Google.
Searching for information on Google is quick, easy and extremely effective. Google has invested enormous sums of money (several billion dollars) in its search engine, making it arguably the best search engine on the web. Google Search handles hundreds of millions of queries per day and indexes not only HTML web pages (the standard web page format) but also other file formats such as PDF and Word documents, images and videos. Google also allows you to search images, video and retail price data.
Google uses an algorithm called PageRank, which is what makes this search engine unique (although many other search engines have since copied this search strategy). PageRank is quite complicated: it takes into account the data found on a page, the pages that are associated with it through links, and other important variables. The end result, however, is quite on the mark: when a person searches for a term using Google, in the vast majority of cases the information returned is relevant.
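The link-based part of this idea can be sketched with the simplified, publicly described form of PageRank: each page repeatedly shares its score among the pages it links to. This is not Google’s production algorithm, which also weighs many other signals. The damping factor of 0.85 and the tiny link graph are assumptions for the example, and every linked page is assumed to appear as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores along links.

    `links` maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Made-up link graph: three pages link to "c", and nothing links to "d".
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links ends up with the highest score, while the page nobody links to keeps only the baseline share.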
Another important aspect of Google Search is that the results returned to a user are organized to distinguish advertisements, or paid results, from unpaid, non-commercial results. One reason Google became so popular was that other search engines did not distinguish paid results from unpaid ones. This meant the user was not receiving the most relevant results for a query, but rather advertisements that had paid for a high ranking.
Originally known as MSN (Microsoft Network) Search and then Live Search, Bing grew out of a service started to counter AOL’s ISP network and search engine. Bing is the second most popular search engine, but it ranks far behind Google in unique visitors.
You can read about how Bing and Google compare in our Bing vs. Google article.
Yahoo, developed in 1994, was one of the first search engines on the web. Yahoo!, however, is much more than just a search engine; it considers itself a web portal. A web portal is usually defined as a site that offers many services, all emanating from one source. Not only does Yahoo! offer a very competent search tool, it also offers email, video, news, instant messaging, etc. You can visit Yahoo! at the following site: http://www.yahoo.com
However, since 2009 Yahoo Search results have been provided by Bing.
AOL originally stood for America Online. AOL is considered to be an online service provider, meaning that it offers not only content but also ISP services. Similar to MSN, AOL was considered the first ISP with national appeal, and it built an Internet community of over 20 million households.
It should be noted that AOL was never popular for its search. In fact, in many cases it outsourced its search engine to other large companies, and even today it relies on Google for its search results. While AOL is a popular search engine, it is far less so than Bing, and especially Google. The vast majority of people who use AOL search are those who pay for its ISP or broadband service. AOL offers users original content and other key services such as email and instant messaging. AOL is considered to be a portal with a search tool found at the top of the page. AOL’s search engine can be found at http://www.aol.com.
Ask.com was originally called Ask Jeeves. Jeeves was a character based on an English butler. While other search engines asked you to input keywords in order to conduct a search, Ask Jeeves asked you to write a question in ordinary form (e.g. Who was the first president?). While this may seem like a practical way to search, many people who searched for more complex information found the results off base and irrelevant. As Google and other search engines developed highly competent algorithms, AskJeeves.com failed to develop as quickly and lost much of its user base. The Jeeves character was dropped in February 2006, and the site is now known simply as Ask.com.
Ask.com still allows users to enter a question as a query, but it focuses mainly on keyword queries. While Ask.com does not have as many web pages indexed as its larger rivals, it is still a competent search tool. You can visit Ask.com at the following link: http://www.ask.com