SEOMining





Search Directories
Lesson 1

How Search Engines and Directories work

Up to this point in the course, you have not done any searching (unless you have tried a search with some of the sites you visited in the last module). Now that you have been introduced to different search services and to some challenges of searching, you can begin to practice some searches.
The searching exercises in this module let you compare different categories of information retrieval services and different services in the same category. This module will also discuss in more detail the concepts and functions of directories and search engines, including their advantages and disadvantages compared to other information retrieval service categories, how they can complement each other, and why one may be more appropriate than the other in a particular search.
An understanding of how each type of search service functions will help you to create more effective search strategies.
After completing this module, you will be able to:
  1. Describe how directories are created and organized, along with their advantages and limitations
  2. Describe how a search engine creates and maintains its database of sites
  3. Ask a search engine to find information with a search query
  4. Explain how a search engine's database affects your results

Search Engine Functions

Search engines fundamentally do three things:
  1. ingest content,
  2. return content matching incoming queries, and
  3. sort the returned content based upon some measure of how well it matches the query.
Relevance is the term for this notion of "how well the content matches the query". Most of the time the matched content consists of documents, and what is returned and ranked is those documents together with metadata describing them.
In most search engines, the default relevance sort is based on a score indicating how well each keyword in the query matches the same keyword in each document. The best matches receive the highest relevance scores and are returned at the top of the search results. The relevance calculation is highly configurable, however, and can be adjusted on a per-query basis to enable very sophisticated ranking behavior.
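The three functions above (ingest, match, sort) and a keyword-based relevance score can be illustrated with a minimal sketch. It assumes a simple TF-IDF style score, which is one common choice and not necessarily what any particular engine uses by default. The corpus, the document ids, and the function names `ingest` and `score` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus; ids and texts are invented for illustration.
DOCS = {
    "d1": "search engines rank documents by relevance",
    "d2": "directories organize sites into categories",
    "d3": "a search query returns matching documents from the search index",
}


def tokenize(text):
    """Split text into lowercase keywords (no stemming or stop words)."""
    return text.lower().split()


def ingest(docs):
    """Ingest content: count how often each term occurs in each document."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}


def score(query, term_counts):
    """Match and sort: return (doc_id, score) pairs, best match first."""
    n_docs = len(term_counts)
    scores = {}
    for term in tokenize(query):
        # Document frequency: how many documents contain this term.
        df = sum(1 for counts in term_counts.values() if term in counts)
        if df == 0:
            continue  # the term matches nothing
        # Assumed weighting: rarer terms count for more.
        idf = math.log(n_docs / df) + 1.0
        for doc_id, counts in term_counts.items():
            if term in counts:
                scores[doc_id] = scores.get(doc_id, 0.0) + counts[term] * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = ingest(DOCS)
results = score("search documents", index)
for doc_id, value in results:
    print(doc_id, round(value, 3))
```

Here `d3` ranks above `d1` because it contains "search" twice, and `d2` is not returned at all because it matches neither keyword.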
In this module, we will provide an overview of how relevance is calculated, how the relevance function can be easily controlled and adjusted through function queries, and how to implement popular domain-specific and user-specific relevance ranking features. We’ll start by looking at how ranking actually works.
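As a rough illustration of adjusting relevance per query, the sketch below multiplies a precomputed keyword score by an optional boost function supplied with the query. This is plain Python and not the API of any specific search engine. The field names `keyword_score` and `clicks`, the helper `popularity_boost`, and the logarithmic formula are assumptions made for the example.

```python
import math

# Hypothetical documents: a keyword score computed earlier plus a
# popularity field that a domain-specific ranking might want to use.
DOCS = [
    {"id": "a", "keyword_score": 2.0, "clicks": 0},
    {"id": "b", "keyword_score": 1.5, "clicks": 99},
    {"id": "c", "keyword_score": 0.5, "clicks": 9},
]


def rank(docs, boost=None):
    """Sort by keyword score, optionally scaled by a per-query boost function."""
    def final_score(doc):
        factor = boost(doc) if boost else 1.0
        return doc["keyword_score"] * factor
    return [doc["id"] for doc in sorted(docs, key=final_score, reverse=True)]


def popularity_boost(doc):
    """Assumed boost: grows slowly (logarithmically) with the click count."""
    return 1.0 + math.log10(1 + doc["clicks"])


default_order = rank(DOCS)                          # keyword score only
boosted_order = rank(DOCS, boost=popularity_boost)  # same data, adjusted ranking
print(default_order, boosted_order)
```

Because the boost is passed in per call, two queries over the same documents can rank them differently without re-ingesting anything.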


Search Engine Optimization (SEO) is the practice of optimizing web pages or whole sites to make them search engine friendly, with the aim of earning higher positions in search results. This tutorial explains simple SEO techniques to improve the visibility of your web pages for different search engines, especially Google, Yahoo, and Bing.

How does a Search Engine Work?

Search engines perform several activities in order to deliver search results.
  1. Crawling: The process of fetching all the web pages linked to a website. This task is performed by software called a crawler or spider (Googlebot, in Google's case).
  2. Indexing: The process of creating an index for all the fetched web pages and storing it in a giant database from which pages can later be retrieved. Essentially, indexing means identifying the words and expressions that best describe a page and assigning the page to particular keywords.
  3. Processing: When a search request arrives, the search engine processes it; that is, it compares the search string in the request with the indexed pages in the database.
  4. Calculating Relevancy: More than one page is likely to contain the search string, so the search engine calculates the relevancy of each matching page in its index to the search string.
  5. Retrieving Results: The last step is retrieving the best-matched results and displaying them in the browser.
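The five activities above can be sketched end to end on a tiny in-memory "web". This is a simplified illustration under stated assumptions, not how a production engine is built: the `WEB` dictionary stands in for real HTTP fetching, the URLs are invented, and relevancy is assumed to be a plain count of matching words.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "site/home": ("welcome to our search tutorial", ["site/seo", "site/crawl"]),
    "site/seo": ("seo improves search ranking", ["site/home"]),
    "site/crawl": ("a crawler fetches linked pages", ["site/seo", "site/missing"]),
    "site/orphan": ("no page links here", []),
}


def crawl(start):
    """Crawling: follow links breadth-first from a start URL."""
    seen, queue, pages = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        if url not in WEB:
            continue  # broken link: nothing to fetch
        text, links = WEB[url]
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Indexing: map each word to the pages containing it, with counts."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query, limit=10):
    """Processing, relevancy, retrieval: match words, score, return the best."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count  # assumed relevancy: number of word matches
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked[:limit]]


pages = crawl("site/home")
index = build_index(pages)
hits = search(index, "search ranking")
print(hits)
```

Note that `site/orphan` never enters the index because no crawled page links to it, which is one reason a page can be missing from a search engine's results.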
Search engines such as Google often update their search algorithms several times per month. When you see changes in your rankings, an algorithm update is one likely cause, although changes to your own pages or to competing pages can also shift results.