search engine

SEARCH ENGINES

Presented by: Swati Singh

MCA (5th sem)

Flow of presentation

• Pie chart of different search engines
• History
• Introduction
• Working of search engines
• Web crawlers
• Architecture of a search engine
• Best & Quick Search
• Advanced search techniques of Google
• Search engine market

Pie chart of different search engines

Search engine market share in the US

HISTORY

The very first tool used for searching on the Internet was Archie. The name stands for "archive" without the "v".

Archie was created in 1990 by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal.

Around 2000, Google's search engine rose to prominence.

By 2000, Yahoo! was providing search services based on Inktomi's search engine.

Yahoo! switched to Google's search engine until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.

INTRODUCTION

A web search engine is designed to search for information on the World Wide Web and FTP servers.

The search results are generally presented in a list and are often called hits.

The results may consist of web pages, images and other types of files.

Some search engines also mine data available in databases or open directories.

Unlike web directories, which are maintained by human editors, search engines operate algorithmically or are a mixture of algorithmic and human input.

Today there are many search engines, e.g. Google, Yahoo!, AOL and MSN.

Today Google is at the top among search engines, a position this presentation attributes to its creativity.


How a web search engine works

High-level architecture of a standard Web crawler

http://en.wikipedia.org/wiki/File:WebCrawlerArchitecture.svg

Data Storage

• Data about web pages are stored in an index database for use in later queries.

• The purpose of an index is to allow information to be found as quickly as possible.

Query & Indexing

• When a user enters a query into a search engine, the engine examines its index.
• It then provides a listing of the best-matching web pages according to its criteria.
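The index lookup described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine is built: the page names and text are made up, and build_index and lookup are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical sample pages (URL -> text), invented for this example.
PAGES = {
    "a.example/home": "search engines index web pages",
    "b.example/faq": "web crawlers download pages",
    "c.example/blog": "an index makes search fast",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

index = build_index(PAGES)
print(lookup(index, "web pages"))
```

The query never touches the page text itself; it only reads the word-to-page mapping built beforehand, which is why the lookup is fast.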

Drawback

• The index is built from the information stored with the data and the method by which the information is indexed.

• Unfortunately, there are currently no known public search engines that allow documents to be searched by date.

Most search engines support the use of the Boolean operators AND, OR and NOT.

Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords.
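A proximity check can be sketched as follows. This is a simplified illustration that measures distance in whole words; the function name within_distance and the sample sentence are assumptions made for the example, and real engines may define distance differently.

```python
def within_distance(text, first, second, max_gap):
    """Return True if `first` and `second` occur at most `max_gap` words apart."""
    words = text.lower().split()
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap for i in first_pos for j in second_pos)

# Made-up sample document.
doc = "the web crawler downloads each page so the indexer can process it"
print(within_distance(doc, "crawler", "page", 3))
print(within_distance(doc, "crawler", "indexer", 3))
```

Here "crawler" and "page" are three words apart, so the first check succeeds, while "crawler" and "indexer" are six words apart and fail a limit of three.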

Natural language queries allow the user to type a question in the same form one would ask it to a human. An example of such a site is ask.com.

The usefulness of a search engine depends on the relevance of the result set it gives back.

Most search engines employ methods to rank the results to provide the "best" results first.
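One very simple way to rank is to count how often the query words occur on each page. The sketch below assumes this term-count scoring purely for illustration; actual search engines combine many more signals, and the page names and the score and rank helpers are hypothetical.

```python
def score(text, query):
    """Count how many times the query words occur in the text."""
    words = text.lower().split()
    return sum(words.count(q) for q in query.lower().split())

def rank(pages, query):
    """Return URLs with a non-zero score, highest score first."""
    scored = [(score(text, query), url) for url, text in pages.items()]
    hits = [(s, url) for s, url in scored if s > 0]
    # Sort by descending score, then by URL so ties have a stable order.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in hits]

# Made-up sample pages.
PAGES = {
    "a.example": "miele vacuum cleaners and miele dishwashers",
    "b.example": "discount vacuum cleaners",
    "c.example": "coffee recipes",
}
print(rank(PAGES, "miele vacuum"))
```

Pages that never mention a query word are dropped, and the page mentioning the words most often comes first.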

WEB CRAWLER

A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner or in an orderly fashion.

This process is called Web crawling or spidering.

Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine that will index the downloaded pages to provide fast searches.

Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code.

Spiders take a Web page's content and create key search words that enable online users to find pages they're looking for.

How a web crawler works is shown in the figure.

Web Crawlers are the heart of search engines.

They continuously crawl the web and find new web pages that have been added to it.

They begin with a popular site, indexing the words on its pages and following every link found within the site.

When you query a search engine to find information, it is actually searching through the database it has created, not the Web itself.
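The crawl loop described above (start at a popular site, index its words, follow its links) can be sketched as a breadth-first traversal. To keep the example runnable without network access, the Web is replaced by a small in-memory dictionary; the site names, the WEB table and the crawl function are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page -> (text, outgoing links).
WEB = {
    "popular.example": ("news and links", ["a.example", "b.example"]),
    "a.example": ("about crawlers", ["b.example", "c.example"]),
    "b.example": ("about indexes", ["popular.example"]),
    "c.example": ("new page", []),
}

def crawl(seed, web):
    """Visit pages breadth-first from `seed`, indexing words and following links."""
    index = {}          # word -> set of pages containing it
    seen = {seed}       # pages already queued, so none is fetched twice
    queue = deque([seed])
    order = []          # pages in the order they were visited
    while queue:
        url = queue.popleft()
        text, links = web[url]
        order.append(url)
        for word in text.split():
            index.setdefault(word, set()).add(url)
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return order, index

order, index = crawl("popular.example", WEB)
print(order)
print(sorted(index["about"]))
```

The returned index is the database that later queries search; the pages themselves are not visited again at query time.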

Web Crawler

SEARCH ENGINE WORKS

OPERATIONS
• Web Crawling
• Indexing
• Searching

WORKS
• Storing Information
• Retrieval from HTML pages itself

WEB CRAWLER
• An automated Web browser
• Contents of each page are analyzed to determine how it should be indexed

Architecture of a Search Engine

[Figure: architecture diagram with the components The Web, Web spider, Indexer, Indexes, Search and User, shown with a sample Google results page for the query "miele" (results 1 - 10 of about 7,310,000, 0.12 seconds) that lists regular results and sponsored links.]

Best & Quick Search

Use words called Boolean operators to link key words and phrases. These Boolean operators are AND, OR and NOT.

• If you want information on the Renaissance in Europe, but you keep getting sites on the Harlem Renaissance, type: Renaissance NOT Harlem
• If you want AIDS statistics in France, type: AIDS AND France AND statistics
• Is your hit list too small, or do you get no hits at all? Try using the word OR between related words or synonyms, for example: AIDS OR HIV AND France AND statistics
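The three example queries map naturally onto set operations over an index. The sketch below uses a made-up keyword table (INDEX) and a hypothetical helper (pages). It also assumes the last query is meant as (AIDS OR HIV) AND France AND statistics; how operators are grouped differs between search engines.

```python
# Hypothetical mini index: keyword -> set of page names, invented for this example.
INDEX = {
    "renaissance": {"europe-art", "harlem-poets", "florence"},
    "harlem": {"harlem-poets"},
    "aids": {"who-report", "fr-health"},
    "hiv": {"fr-health", "fr-clinic"},
    "france": {"fr-health", "fr-clinic", "florence-trip"},
    "statistics": {"fr-health", "fr-clinic", "who-report"},
}

def pages(word):
    """Return the set of pages indexed under `word` (empty if unknown)."""
    return INDEX.get(word.lower(), set())

# Renaissance NOT Harlem -> set difference
not_harlem = pages("Renaissance") - pages("Harlem")

# AIDS AND France AND statistics -> set intersection
aids_france = pages("AIDS") & pages("France") & pages("statistics")

# (AIDS OR HIV) AND France AND statistics -> union, then intersection
wider = (pages("AIDS") | pages("HIV")) & pages("France") & pages("statistics")

print(sorted(not_harlem))
print(sorted(aids_france))
print(sorted(wider))
```

NOT narrows the hit list, AND narrows it further with each extra term, and OR widens it, which matches the advice above for hit lists that are too small.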

Advanced search techniques of Google

Google began as an academic search engine.

It built its initial system to use multiple spiders, usually 3 at a time.

At its peak performance, its system could crawl over 100 pages per second, generating around 600 kilobytes of data each second.

Each spider could keep 300 connections to web pages open at a time.

Google had its own DNS, which translates a server's name (as given in a URL) into an address, in order to keep delays to a minimum.

When the Google spider looked at an HTML page, it noted the words within the page and where they occurred: in the title, subtitles, meta tags, etc.
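Noting where words occur on a page can be sketched with Python's standard html.parser module. This is not Google's actual spider; the WordLocator class, the sample page and the choice of h1 to h3 as "subtitles" are assumptions made for illustration.

```python
from html.parser import HTMLParser

class WordLocator(HTMLParser):
    """Record words found in the title, headings and meta tags of a page."""

    def __init__(self):
        super().__init__()
        self.found = {"title": [], "heading": [], "meta": []}
        self._current = None  # which section the parser is currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag in ("h1", "h2", "h3"):
            self._current = "heading"
        elif tag == "meta":
            attrs = dict(attrs)
            if attrs.get("name") in ("keywords", "description"):
                content = (attrs.get("content") or "").lower()
                self.found["meta"] += content.replace(",", " ").split()

    def handle_endtag(self, tag):
        if tag in ("title", "h1", "h2", "h3"):
            self._current = None

    def handle_data(self, data):
        # Body text outside title/headings is ignored in this sketch.
        if self._current:
            self.found[self._current] += data.lower().split()

# Made-up sample page.
PAGE = """<html><head><title>Search Engines</title>
<meta name="keywords" content="crawler, index">
</head><body><h1>How Crawlers Work</h1><p>Body text.</p></body></html>"""

locator = WordLocator()
locator.feed(PAGE)
print(locator.found)
```

An indexer could then give words from the title or headings more weight than words found only in the body text.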

Search engine market

Most Web search engines are commercial ventures supported by advertising revenue and, as a result, some allow advertisers to pay to have their listings ranked higher in search results.

Some search engines which do not accept money for their search results make money by running search-related ads alongside the regular results.

The search engines make money every time someone clicks on one of these ads.

Thank You
