search engine dependency conference

Post on 03-Sep-2014

DESCRIPTION

Conference slides about search engine dependency and its influence on data quality

TRANSCRIPT

SEARCH ENGINE DEPENDENCY AND ITS INFLUENCE ON DATA QUALITY

By Ronan CHARDONNEAU

Index

I - Introduction to the world of search engines
II - Risks of search engine dependency
III - How to solve the equation?
IV - Future of Google and information research
V - Conclusion

I-Introduction II-Risks III-Solutions IV-Future V-Conclusion

The World of Search engines

Market configuration

Top 10 search websites in the world for August 2007

Target: users aged 15 and over, at home and at work. Source: comScore qSearch 2.0

Leaders per country

Source: map made using data from « Alexa, the Web Information Company (2008) ».

A win or lose market

Approximate share of content available on the Internet, by language

Source: Internet World Stats

What has already been proved

• Studies show that the Internet is the main information provider (at least in Europe and America);
• When surfing the Internet, search engines are the most used websites;
• People trust search engine results;
• When searching on the Internet, people mainly use a single search engine;

Brief summary

• Google is the market leader; its followers are far behind;
• 8 leading search engines and probably eight continents on the Internet;
• A market defined by the adoption of standards (<50%) to search;
• Content is mainly in English; Chinese is important; there is quality content in Japanese, German and Korean;
• Internet users cannot live without search engines and are loyal to a specific one;

Risks of search engine dependency and its influence on data quality

Definition: the behaviour of not questioning the results returned by a single search engine.

It normally starts when you hear sentences such as:

- "Why should I bother using other search engines when I find everything I want with Google?"

- "Am I really taking any risk when I use Google?"

- In every country in the world, Google ranks in the top 100 websites or higher;

- Google has been recognized as the most powerful brand;

• Who is Google? Well... It is our friend;
• We can carry it everywhere; it is relevant and convenient (quick display, associated services);
• But:
– You have to know how to deal with it;
– You have to know its limits;
– You have to know its potential;

Consequences

• If you don't know how to deal with it:
- You will never use its true capabilities;
- You will probably take the first piece of information that is displayed;
• If you don't know its limits:
- When you cannot find the information, you may think that it does not exist;
- You may even think that the technology does not exist elsewhere;
• If you don't know its potential:
- You will not improve at performing searches;

Advertisement
• The search engine business model is based on advertising (99% of Google revenues come from it);
• However, studies show that some categories of adults (the non-Internet generations) do not distinguish between commercial and non-commercial links;
• Some search engines are more commercial than others;
• The better you know a search engine (Google), the better you can practise Search Engine Optimization;

Google is not an isolated case
• Baidu dependency in China and Yandex dependency in Russia;
• Seznam dependency in the Czech Republic;
• Naver dependency in South Korea;
• Yahoo dependency in Japan and many other Asian countries;

Brief summary

• Search engine dependency is comfortable and therefore understandable;
• But for many reasons it favours mass-consumption information (the blog phenomenon, advertising…), which is not the best kind;
• In our countries it is Google dependency, but keep in mind that Europe and the Americas are not the centre of the world;

How to solve the equation?

First point

• If an answer exists... we should look for it;
• At the moment there is no miracle solution for lazy search;
• But there are ways to get closer to the answer;

Three pillars
• Learn how to use the technology
• Breaking the habits
• Technological awareness

Concrete case: Google

Learn how to use the technology:
• Make advanced searches:
– Simple Boolean operators (« », link:, define:, ?, *, ~, …);
– Complex request: ?intitle:index.of? "" -filetype:html -filetype:asp -wiki -ringtone -filetype:htm -posts -lyrics -filetype:shtml -filetype:php -filetype:doc -filetype:pdf -filetype:txt mpeg wma avi wmv
– Google Advanced Search;
• Use other Google services such as Google Alerts;
• Use Google sub search engines such as Google Scholar;

Breaking the habits:
- Get used to practising what you have learnt, and force yourself to do so;
- Results come, and you get used to it;
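A complex request like the one on this slide is easier to learn when it is assembled piece by piece. The Python sketch below is only an illustration: the function name `build_query` and the phrase "some artist" are hypothetical, and it only builds a text string using the operators quoted on the slide (intitle:, -filetype:, -word). Whether a given search engine still honours each operator is an assumption to check against that engine's own help pages.

```python
def build_query(terms, exact_phrase=None, exclude_words=(),
                exclude_filetypes=(), intitle=None):
    """Assemble a search query string from operator-style parts.

    Hypothetical helper: it only concatenates text and does not
    contact any search engine.
    """
    parts = []
    if intitle:
        parts.append(f"intitle:{intitle}")       # restrict to page titles
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')        # exact-phrase match
    parts.extend(f"-filetype:{ext}" for ext in exclude_filetypes)
    parts.extend(f"-{word}" for word in exclude_words)
    parts.extend(terms)                          # plain keywords go last
    return " ".join(parts)


query = build_query(
    terms=["mpeg", "wma", "avi", "wmv"],
    exact_phrase="some artist",                  # placeholder, not from the slide
    exclude_words=["wiki", "ringtone", "posts", "lyrics"],
    exclude_filetypes=["html", "asp", "htm", "shtml", "php", "doc", "pdf", "txt"],
    intitle="index.of",
)
print(query)
```

Writing the exclusions as lists makes it easy to add or remove one operator at a time and see how the results change, which is the habit the slide recommends.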

Concrete case: Google

Technological awareness:

By performing better at search you will discover new technologies that you will have to learn.

For example: Google Alerts tells you that a new search engine is coming up, and then you try it;

Technological awareness: Google

Google Ads

Google Advanced Search

Do you know iGoogle?

When Google promotes its own technology, chances are good that it is worthwhile

Technological awareness: How to select the best

• The search engine market is a world of buzz:
• Where every search engine wants to beat Google;
• But are they really providing a technical revolution?

• Real time information: the Twitter example

When Google starts to take an interest in someone else's technology, it is probably a good one

Start to look at what Google does not have

• Finding similar websites: Who is like it?

Unfortunately, it only works for popular websites

Start to look at what Google does not have

Another way of searching for information: social bookmarking

Advantages: you find unindexed websites;

Disadvantages: low-quality websites, advertising?

Start to look at what Google does not have

Graphical display

Start to look at what Google does not have

Look for specialized search engines
- People: 123 People, CV gadget, Pipl…
- Jobs: Indeed, JobiJoba…
- Tutorials: Tutosearch, …
- Torrent: Toorgle, …
- Scientific information: Scirus, …
- Information in a specific language: Yandex for Russian, Baidu for Chinese…
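One way to turn this list into a habit is to keep it as a small lookup table and consult it before typing a query into the usual engine. The Python sketch below is hypothetical: the category keys, the function name `suggest_engines` and the fallback to a general-purpose engine are assumptions made for illustration, while the engine names come from the slide (some of them may no longer be online).

```python
# Engines named on the slide, grouped by the kind of information sought.
SPECIALIZED_ENGINES = {
    "people": ["123 People", "CV gadget", "Pipl"],
    "jobs": ["Indeed", "JobiJoba"],
    "tutorials": ["Tutosearch"],
    "torrent": ["Toorgle"],
    "scientific": ["Scirus"],
    "russian": ["Yandex"],
    "chinese": ["Baidu"],
}


def suggest_engines(need, default=("Google",)):
    """Return specialized engines for a need, or a general fallback.

    The fallback value is an assumption of this sketch, not a
    recommendation made in the slides.
    """
    key = need.strip().lower()
    return SPECIALIZED_ENGINES.get(key, list(default))


print(suggest_engines("Jobs"))
print(suggest_engines("recipes"))
```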

Start to look where Google is not the best

• Triangle method: Locating three independent sources that point to the same answer;

• Recent events in Tibet showed how important it is to look at different sources of information, including sources from outside your own country;

How to improve data quality on the Internet?

Source 1: Washington Post

Source 2: Le Parisien

Source 3: AntiCnn.com
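The triangle method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `triangulate`, the example URLs and the answers are hypothetical, and "independent" is approximated crudely by "published on a different host name". Real independence (different owners, different original reporting) still has to be judged by the reader.

```python
from urllib.parse import urlparse


def triangulate(findings, required=3):
    """Return the answers confirmed by at least `required` distinct hosts.

    `findings` is a list of (source_url, answer) pairs. Host names are
    used as a rough stand-in for source independence.
    """
    hosts_by_answer = {}
    for url, answer in findings:
        host = urlparse(url).netloc.lower()
        key = answer.strip().lower()             # normalize the answer text
        hosts_by_answer.setdefault(key, set()).add(host)
    return {a for a, hosts in hosts_by_answer.items() if len(hosts) >= required}


# Hypothetical findings collected while researching one question.
findings = [
    ("https://a.example.org/x", "Yes"),
    ("https://b.example.net/y", "yes "),
    ("https://c.example.com/z", "YES"),
    ("https://a.example.org/other", "no"),
]
print(triangulate(findings))
```

Two pages from the same host count once, so repeating a claim on one website does not make it look better confirmed.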

Brief summary

• Learn how to use, change your habits, be aware;
• Be curious;
• Think about another way to look for information;
• Three independent sources of information;

Future of Google and information research

Semantic search
• You get a feed instead of entering your request;
• Everyone is talking about semantic search;
• But it is not mature yet, a world of buzz again (there are not a lot of suggestions);
• Poor results whether developed from scratch (poor index) or developed by huge companies (few suggestions);

Some issues to fix

• How can pictures be indexed well? Are solutions such as Google Labeler the best?
• How to index videos?
• How to index sounds?

A Google which will have to change

• Too much information on the Internet;
• A Google which is splitting up and providing more and more sub search engines;
• The development of high-bandwidth connections, which means graphical interfaces;
• A technological awareness which is difficult to pass on;

But a Google more and more present in our life

• Forecasts point in that direction;
• Development of an OS for cell phones, a Web browser, Web software applications (Google slides, Google « excel »…)

The only question is how they will do it.

Google in 1998

Google 11 years later

Brief summary

• Google will be with us in the future and we have to get used to it;
• Information research will be more and more assisted, but you will still fall behind if you do not perform advanced searches;
• In the near future some issues will still be there (indexing of pictures…)

Conclusion

What you have to keep in mind

• If you are dependent, you should at least be dependent in an informed way;
• Apply the triangle method;
• Reconsider the information process each time (think differently);

Recommendations

Master thesis about search engine dependency:
- http://www.pandia.com/index.html

List of search engines:
- http://www.pandia.com/powersearch/index.html
- http://www.philb.com/whichengine.htm

To know more about search engines: Pandia search:
- www.pandiasearch.com

Documentaries:
- Google: Behind the screen by IJsbrand van Veelen http://www.youtube.com/watch?v=TBNDYggyesc&hl=fr
- The Great Firewall of China http://www.youtube.com/watch?v=IWsXhNJFj78&hl=fr

Thank you for your attention

http://moteurs-de-recherches-alternatifs.blogspot.com
