search engine
SEARCH ENGINES
Presented by: Swati Singh
MCA (5th sem)
Flow of presentation
• Pie chart of different search engines
• History
• Introduction
• Working of search engines
• Web crawlers
• Architecture of a search engine
• Best & Quick Search
• Advanced search techniques of Google
• Search engine market
Pie chart of different search engines
Search engine market share in the US
HISTORY
The very first tool used for searching on the Internet was Archie. The name stands for "archive" without the "v".
Archie was created in 1990 by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal.
Around 2000, Google's search engine rose to prominence.
By 2000, Yahoo! was providing search services based on Inktomi's search engine.
Yahoo! then used Google's search engine until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.
INTRODUCTION
A web search engine is designed to search for information on the World Wide Web and FTP servers.
The search results are generally presented as a list and are often called hits.
The information may consist of web pages, images and other types of files.
Some search engines also mine data available in databases or open directories.
Unlike web directories, which are maintained by human editors, search engines operate algorithmically or are a mixture of algorithmic and human input.
Today there are many search engines, e.g. Google, Yahoo!, AOL and MSN.
Today Google is at the top among search engines. The main reason is its creativity.
How a web search engine works
High-level architecture of a standard Web crawler
Data Storage
• Data about web pages are stored in an index database for use in later queries.
• The purpose of an index is to allow information to be found as quickly as possible.
Query & Indexing
• When a user enters a query into a search engine, the engine examines its index.
• It then provides a listing of the best-matching web pages according to its criteria.
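The storage and lookup steps above can be sketched as a small inverted index. This is only a minimal illustration, not how any particular engine stores its data; the function names `build_index` and `lookup` and the sample pages are assumptions made for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def lookup(index, word):
    """Return the pages containing `word`, sorted so the output is stable."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample pages, for illustration only.
pages = {
    "a.example/home": "search engines index the web",
    "b.example/faq": "a crawler visits the web",
}
index = build_index(pages)
```

Here `lookup(index, "web")` returns both pages without rereading their text, which is the point of building the index ahead of the query.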
Drawback
• The index is built from the information stored with the data and the method by which the information is indexed.
• Unfortunately, public search engines have historically offered little support for searching documents by date.
Most search engines support the use of the Boolean operators AND, OR and NOT.
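One way to picture the three operators is as set operations on the pages that contain each keyword. The sketch below assumes that model; the page numbers and the helper names `op_and`, `op_or` and `op_not` are made up for illustration.

```python
# Hypothetical posting sets: the page numbers on which each keyword occurs.
postings = {
    "renaissance": {1, 2, 3},
    "harlem": {3},
    "europe": {1, 4},
}


def op_and(a, b):
    """Pages that contain both keywords."""
    return a & b


def op_or(a, b):
    """Pages that contain either keyword."""
    return a | b


def op_not(a, b):
    """Pages that contain the first keyword but not the second."""
    return a - b


both = op_and(postings["renaissance"], postings["europe"])
either = op_or(postings["renaissance"], postings["europe"])
without = op_not(postings["renaissance"], postings["harlem"])
```

AND narrows the result, OR widens it, and NOT removes unwanted pages.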
Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords.
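Proximity search can be sketched by recording the position of every word and comparing positions. This is a simplified illustration that measures distance in whole words; the function names `positions` and `within` and the sample sentence are assumptions.

```python
def positions(text):
    """Map each lowercase word to the list of positions where it occurs."""
    index = {}
    for pos, word in enumerate(text.lower().split()):
        index.setdefault(word, []).append(pos)
    return index


def within(text, first, second, max_distance):
    """True if `first` and `second` occur at most `max_distance` words apart."""
    index = positions(text)
    return any(
        abs(p - q) <= max_distance
        for p in index.get(first.lower(), [])
        for q in index.get(second.lower(), [])
    )


# Hypothetical sample document.
doc = "web crawlers feed the index that the search engine queries"
near = within(doc, "index", "search", 3)
far = within(doc, "web", "engine", 3)
```

In the sample sentence "index" and "search" are three words apart, so they match a distance of 3, while "web" and "engine" do not.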
Natural language queries allow the user to type a question in the same form one would ask it to a human. Ask.com is an example of such a site.
The usefulness of a search engine depends on the relevance of the result set it gives back.
Most search engines employ methods to rank the results to provide the "best" results first.
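The slides do not say which ranking method is used. As one common textbook choice, the sketch below scores each document with a simple smoothed TF-IDF weight and sorts the best match first. The function name `rank`, the smoothing formula and the sample documents are all assumptions, and real engines combine many more signals.

```python
import math
from collections import Counter


def rank(query, docs):
    """Return document names ordered by a simple TF-IDF score, best first."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0 or not words:
                continue
            tf = counts[term] / len(words)
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed
            score += tf * idf
        scores[name] = score
    # Highest score first; ties are broken by name for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical sample documents.
docs = {
    "d1": "miele vacuum cleaners and miele dishwashers",
    "d2": "vacuum cleaners on sale",
    "d3": "coffee systems",
}
ordering = rank("miele vacuum", docs)
```

Documents that match none of the query terms score zero and sort last; a real engine would normally leave them out of the hit list.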
WEB CRAWLER
A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner or in an orderly fashion.
This process is called Web crawling or spidering.
Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine that will index the downloaded pages to provide fast searches.
Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code.
Spiders take a Web page's content and create key search words that enable online users to find pages they're looking for.
How a web crawler works is shown in the figure.
Web Crawlers are the heart of search engines.
They continuously crawl the web to find new web pages that have been added to it.
They begin with a popular site, indexing the words on its pages and following every link found within the site.
When you query a search engine to find information, it is actually searching through the database it has created, not the Web itself.
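The crawl described above (start at a popular site, follow every link, never revisit a page) can be sketched as a breadth-first traversal. To keep the example runnable offline, a hypothetical in-memory link graph stands in for real HTTP fetches; the function name `crawl`, its `max_pages` limit and the site names are assumptions.

```python
from collections import deque


def crawl(start, fetch_links, max_pages=100):
    """Breadth-first crawl from `start`; returns pages in the order visited."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)  # a real crawler would hand the page to the indexer here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical link graph standing in for pages fetched over the network.
LINKS = {
    "popular.example": ["a.example", "b.example"],
    "a.example": ["b.example", "c.example"],
    "b.example": ["popular.example"],
    "c.example": [],
}
order = crawl("popular.example", lambda url: LINKS.get(url, []))
```

The `seen` set is what stops the crawler from looping when pages link back to each other, as `b.example` does here.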
Web Crawler
SEARCH ENGINE WORKS
OPERATIONS
• Web Crawling
• Indexing
• Searching
WORKS
• Storing Information
• Retrieval from the HTML pages themselves
WEB CRAWLER
• An automated Web browser
• Contents of each page are analyzed to determine how it should be indexed
Architecture of a Search Engine
[Figure components: The Web, Web spider, Indexer, Indexes, Search, User. The figure also shows a sample Google results page for the query "miele" (results 1 - 10 of about 7,310,000, with sponsored links).]
Best & Quick Search
Use words called Boolean operators to link key words and phrases. These Boolean operators are: AND, OR, and NOT.
If you want sites on the Renaissance in Europe, but you keep getting sites on the Harlem Renaissance, type: Renaissance NOT Harlem
If you want AIDS statistics in France, type: AIDS AND France AND statistics
Is your hit list too small, or do you get no hits at all? Try using the word OR between related words or synonyms, for example: AIDS OR HIV AND France AND Statistics
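The example queries above can be evaluated against a small set of postings to see how each operator changes the hit list. The sketch below assumes the query is read strictly left to right, so the last example is treated as ((AIDS OR HIV) AND France) AND Statistics; real engines may apply different precedence rules. The function name `evaluate` and the page numbers are hypothetical.

```python
def evaluate(query, postings):
    """Evaluate 'term OP term OP term ...' strictly from left to right."""
    tokens = query.split()
    result = set(postings.get(tokens[0].lower(), set()))
    # Tokens alternate: term, operator, term, operator, term, ...
    for op, term in zip(tokens[1::2], tokens[2::2]):
        pages = postings.get(term.lower(), set())
        if op == "AND":
            result &= pages
        elif op == "OR":
            result |= pages
        elif op == "NOT":
            result -= pages
        else:
            raise ValueError(f"unknown operator: {op}")
    return result


# Hypothetical page numbers on which each keyword occurs.
postings = {
    "aids": {1, 2, 3},
    "hiv": {4, 5},
    "france": {2, 3, 4},
    "statistics": {3, 4},
    "renaissance": {6, 7},
    "harlem": {7},
}
narrow = evaluate("AIDS AND France AND statistics", postings)
wide = evaluate("AIDS OR HIV AND France AND Statistics", postings)
```

With this sample data the AND-only query finds one page, and adding OR HIV widens the hit list to two.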
Advanced search techniques of Google
Google began as an academic search engine.
It built its initial system to use multiple spiders, usually 3 at a time.
At its peak performance, its system could crawl over 100 pages per second, generating around 600 kilobytes of data each second.
Each spider could keep 300 connections to web pages open at a time.
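A little arithmetic on the figures quoted above gives a feel for the scale. This is only a back-of-the-envelope sketch: it treats the quoted peak values as exact and assumes the peak rate could be held for a full day, which the slides do not claim.

```python
PAGES_PER_SECOND = 100        # peak crawl rate quoted above
KILOBYTES_PER_SECOND = 600    # data generated per second, quoted above
SPIDERS = 3                   # usual number of spiders, quoted above
CONNECTIONS_PER_SPIDER = 300  # open connections per spider, quoted above

# Average amount of data generated per crawled page.
kb_per_page = KILOBYTES_PER_SECOND / PAGES_PER_SECOND

# Connections open at once when all spiders are busy.
open_connections = SPIDERS * CONNECTIONS_PER_SPIDER

# Pages per day, ASSUMING the peak rate were sustained for 24 hours.
pages_per_day = PAGES_PER_SECOND * 60 * 60 * 24
```

That works out to about 6 kilobytes per page, 900 open connections, and roughly 8.6 million pages per day at the peak rate.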
Google had its own DNS server, which translates a server's name into an address, in order to keep delays to a minimum.
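The slides do not describe how Google's DNS was built. To illustrate why resolving names locally cuts delay, here is a minimal sketch of a caching resolver that only contacts the underlying resolver the first time a name is seen. The class name `CachingResolver`, the `misses` counter and the fake address table are assumptions; `socket.gethostbyname` is the standard-library call used by default.

```python
import socket


class CachingResolver:
    """Remember name-to-address answers so repeat lookups skip the network."""

    def __init__(self, resolve=socket.gethostbyname):
        self._resolve = resolve
        self._cache = {}
        self.misses = 0  # how many lookups had to go to the real resolver

    def lookup(self, hostname):
        if hostname not in self._cache:
            self.misses += 1
            self._cache[hostname] = self._resolve(hostname)
        return self._cache[hostname]


# A fake resolver with a hypothetical address keeps the example offline.
fake_dns = {"www.example.com": "192.0.2.10"}
resolver = CachingResolver(resolve=fake_dns.__getitem__)
first = resolver.lookup("www.example.com")
second = resolver.lookup("www.example.com")
```

The second lookup is answered from memory, so a crawler fetching many pages from the same server pays the lookup cost only once.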
When the Google spider looked at an HTML page, it looked for words within the page occurring in the title, subtitles, meta tags, etc.
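Picking out the words that occur in the title, subtitles and meta tags can be sketched with Python's standard `html.parser` module. This is an illustration, not Google's actual spider: the class name `FieldWords`, the choice of `h1` to `h6` as "subtitles", and the use of the `keywords` and `description` meta tags are assumptions.

```python
from html.parser import HTMLParser


class FieldWords(HTMLParser):
    """Collect words found in the title, headings, and meta tags of a page."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.words = {"title": [], "heading": [], "meta": []}
        self._field = None  # which field the parser is currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._field = "title"
        elif tag in self.HEADINGS:
            self._field = "heading"
        elif tag == "meta":
            attrs = dict(attrs)
            if attrs.get("name") in ("keywords", "description"):
                content = attrs.get("content") or ""
                self.words["meta"] += content.lower().replace(",", " ").split()

    def handle_endtag(self, tag):
        if tag == "title" or tag in self.HEADINGS:
            self._field = None

    def handle_data(self, data):
        if self._field:
            self.words[self._field] += data.lower().split()


# Hypothetical sample page.
page = """<html><head><title>Search Engines</title>
<meta name="keywords" content="crawler, index">
</head><body><h1>Web Crawlers</h1><p>Body text is ignored here.</p></body></html>"""

parser = FieldWords()
parser.feed(page)
```

After parsing, `parser.words` holds the title, heading and meta words separately, so an indexer could give them more weight than ordinary body text.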
Search engine market
Most Web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the practice of allowing advertisers to pay money to have their listings ranked higher in search results.
Some search engines, which do not accept money for their search engine results, make money by running search-related ads alongside the regular search engine results.
The search engines make money every time someone clicks on one of these ads.
Thank You