Welcome to Searching the Net
What do you want from Searching the Net?
Navigate the Information Superhighway
From This…
To This…
She got the information she wanted thanks to…
‘Searching the Net’
WWW
World Wide Web created by Tim Berners-Lee in 1989 to overcome the problem of sharing files between machines with no common hardware or software
He produced the first web browser in 1990, using HTTP, HTML and URLs. In 1993 CERN said that the WWW would be free to everyone
HTTP: HyperText Transfer Protocol
http://www...
HTML: HyperText Markup Language
<!-- side menu list of web resources pages -->
<div id = "sidemenu">
<table width= "18%" border="0" cellspacing="0" cellpadding="0" align="left">
<tr align = "left" class = "heading"><td valign="middle" ><p>Web Resources</p></td></tr>
<tr><td></td></tr>
<tr align="left"><td valign="top"><p class = "resources"> <a href= "information.htm" title="quick reference and general resources">Information</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="heritage.htm">Heritage</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="books.htm">Reading</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="learning.htm">Learning</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="health.htm" title="Health and lifestyle resources">Lifestyle</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="Online.htm" title="Reference sources provided by SLC">Online Reference</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="community.htm" title="Community language resources">Community Languages </a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="money.htm" title="Money and consumer advice resources">Money advice</a></p></td></tr>
<tr><td></td></tr>
<tr align = "left" class="heading"><td valign="middle" ><p>Local Resources</p></td></tr>
<tr><td></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="http://www.southlanarkshire.gov.uk/portal/page/portal/EXTERNAL_WEBSITE_DEVELOPMENT/SLC_ONLINE_HOME/EDUCATION_LIBRARIES/LIBRARIES?CONTENT_ID=418">South Lanarkshire Libraries</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"> <a href="http://www.library.southlanarkshire.gov.uk/00_002_login.aspx?ReturnUrl=/01_YourAccount/01_001_AccountDetails.aspx" title="Renew items you have on loan. You will need your PIN number to do this.">Renew your library books</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"> <a href="http://www.southlanarkshire.gov.uk/portal/page/portal/SLC_PUBLICDOCUMENTS/EDUCATION_DOCUMENTS/EDU_1259_Clubs-and-societies-in-South-Lanarkshire%20(updat.pdf" title="This link will open in a new window" target="_blank">Clubs and Societies in South Lanarkshire</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="bookaward/details08.htm">South Lanarkshire Book Awards</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="about.htm">About activeIT</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="locallinks.htm">Local links</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="openinghours.htm">Library opening hours</a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="http://www.pageflakes.com/activeit/21822620" title="Local news, travel and weather information">ActiveIT pageflakes </a></p></td></tr>
<tr align="left"><td valign="top"><p class = "resources"><a href="readinggroups.htm">Reading groups</a></p></td></tr>
</table>
URL: Uniform Resource Locator
http://www.bbc.co.uk
World Internet Users (total: 1,463,632,361)
[Pie chart of users by region; segment labels as extracted: 41%, 3%, 17%, 9%, 1%, 3%, 26%]
Africa (44,361,940)
Asia (578,538,257)
Europe (384,633,765)
Middle East (41,939,200)
North America (248,241,969)
Latin America/Caribbean (139,009,209)
Oceania/Australia (20,204,331)
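The regional shares can be worked out from the figures listed above. This is a minimal sketch: the counts and the total are copied from the slide, the function name `regional_shares` is made up for illustration, and each share is taken against the stated total (the listed regions may not sum exactly to it).

```python
# Regional user counts as listed on the slide.
users = {
    "Africa": 44_361_940,
    "Asia": 578_538_257,
    "Europe": 384_633_765,
    "Middle East": 41_939_200,
    "North America": 248_241_969,
    "Latin America/Caribbean": 139_009_209,
    "Oceania/Australia": 20_204_331,
}
stated_total = 1_463_632_361  # total quoted on the slide


def regional_shares(counts, total):
    """Return each region's share of `total` as a percentage."""
    return {region: 100 * n / total for region, n in counts.items()}


shares = regional_shares(users, stated_total)
# Print the regions from largest to smallest share.
for region, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region:<26}{pct:5.1f}%")
```

Shares computed this way may differ by a point or so from the labels on the original chart, depending on rounding and on which total the chart used.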
Searching the net
Types of tools
Subject directories - material submitted by webmasters or evaluators and categorised by humans
Search engines - material submitted by webmasters – but also found by computers – and arranged by computers
Deep (hidden or invisible) web - database material gathered and categorised by humans
Subject directories
2 types: Commercial and Academic
http://dir.yahoo.com/
Features
Large subject directory – 2 million
Broad subject coverage
Hierarchical subject organisation
Drawbacks
Evaluation poor
No subject balance
Selection somewhat arbitrary
Tends to index only the home page
Subject classification not always useful
http://bubl.ac.uk
Features
Guarantees to return at least 5 results for any subject
Compiled by Library and Information professionals
Highest quality sites only
High quality, reliable annotations
Search engines
Used to be 2500, each with own database of web sites
Anticipated that features would improve with competition
Now basically 4 with little competition
Used to hold full text – now limited amount of each page (Google 101Kb)
Two interfaces – basic and advanced
Don’t be impressed by the number of hits
Search engines
3 parts
1. Spider
2. Index
3. Search software
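The three parts can be sketched in a few lines. This is a toy illustration only, not how any real engine is built: the `PAGES` dictionary stands in for the web (a real spider would fetch pages over HTTP), and all names and addresses are hypothetical.

```python
from collections import defaultdict

# Hypothetical pages standing in for the web: URL -> (text, outgoing links).
PAGES = {
    "a.example/home": ("Library opening hours and reading groups", ["a.example/books"]),
    "a.example/books": ("Reading lists and book awards", []),
}


def spider(start, pages):
    """Part 1: follow links from `start` and collect the text of each page reached."""
    seen, queue, fetched = set(), [start], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in pages:
            continue
        seen.add(url)
        text, links = pages[url]
        fetched[url] = text
        queue.extend(links)
    return fetched


def build_index(fetched):
    """Part 2: map each lowercased word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Part 3: return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(spider("a.example/home", PAGES))
print(sorted(search(index, "reading")))
```

Note that the search step only ever sees what the spider reached and the index stored, which is why a page the spider never visits cannot appear in the results.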
Claims over 8 billion pages, but may be counting pages not fully indexed
General web database with often useful ranking by popularity
Far from comprehensive, but often finds ‘the best’ pages
Subject directory included
Used a search engine called Teoma
Changed name back to AskJeeves in 2009 in UK
Allows user to ask in plain English
Provides good search alternatives
Meta search engines
Separate retrieval
Collated retrieval
Meta search engines
Good for obscure topics
Useful for getting an overview of a subject
Single click searches multiple sources
Dozens of different search tools grouped together
Type search terms just once
Good reminder of alternative tools and different types of resources
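The difference between separate and collated retrieval can be shown with a small sketch. Separate retrieval shows each engine's list on its own; collated retrieval merges them into one list. The scoring rule below (1/rank, summed across engines) is an assumption chosen for illustration, not the method of any particular meta search engine, and the engine names and addresses are hypothetical.

```python
def collate(results_by_engine):
    """Merge ranked result lists from several engines into one de-duplicated list.

    Each URL scores 1/rank from every engine that returned it, so pages
    found by several engines, or ranked highly by one, rise to the top.
    """
    scores = {}
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Separate retrieval: one ranked list per engine.
results = {
    "engine_a": ["x.example", "y.example", "z.example"],
    "engine_b": ["y.example", "w.example"],
}

# Collated retrieval: a single merged list.
merged = collate(results)
print(merged)
```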
Advanced searching
Most search engines provide facility for users to refine search
Allows searchers to state language requirements, time frames, file format, and domain (e.g. .edu, .mil, .gov)
United Nations
United Kingdom
Adobe Acrobat (PDF)
Chemical weapons
In the last year
Advanced searching
Waste recycling
Advanced searching
Without using the SLC website
In the South Lanarkshire Council area
Gout
Advanced searching
Fizzy drinks
British Medical Journal
Boolean operators: And, Or, Not
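One way to picture the three operators is as operations on sets of matching pages. A minimal sketch, with made-up page numbers for two search terms:

```python
# Hypothetical sets of page numbers that mention each term.
waste = {1, 2, 3, 4}
recycling = {3, 4, 5}

both = waste & recycling      # waste AND recycling: pages with both terms
either = waste | recycling    # waste OR recycling: pages with at least one term
without = waste - recycling   # waste NOT recycling: pages with the first term only

print(both, either, without)
```

AND narrows a search, OR widens it, and NOT removes unwanted results.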
General Search Strategies
Most search engines use Boolean Logic for search queries
Each engine has its own default operator, AND or OR – normally AND
Field searching can reduce the number of pages returned
slavery
intitle:slavery
inurl:slavery intitle:slavery
inurl:slavery
Refine search to a specific website
spacewalks site:nasa.gov
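The field-searching examples above can be assembled programmatically. A minimal sketch: `build_query` is a hypothetical helper, and it only builds the query text using the `inurl:`, `intitle:` and `site:` prefixes shown on the slide; whether a given engine honours them is up to that engine.

```python
from urllib.parse import quote_plus


def build_query(terms="", inurl=None, intitle=None, site=None):
    """Join plain terms and optional field restrictions into one query string."""
    parts = []
    if terms:
        parts.append(terms)
    if inurl:
        parts.append(f"inurl:{inurl}")
    if intitle:
        parts.append(f"intitle:{intitle}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


query = build_query("spacewalks", site="nasa.gov")
print(query)              # spacewalks site:nasa.gov
print(quote_plus(query))  # the same query, encoded for use in a web address
```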
Write your task in big red letters on a yellow "post-it" and stick it in the middle of your screen
+ pizza
+ pepperoni
+ ham
- olives
- garlic
+"pan pizza" -olives pepperoni
Must include phrase: +"pan pizza"
Must not include: -olives
Should preferably include: pepperoni
| means OR
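The plus and minus notation can be read mechanically. This is a minimal sketch, assuming plain quotes and a plain hyphen in the query; `parse_query` and `matches` are hypothetical helpers, not part of any search engine.

```python
import shlex


def parse_query(query):
    """Split a +/- query into must-include, must-not-include and preferred terms."""
    must, must_not, should = [], [], []
    # shlex keeps a quoted phrase such as "pan pizza" together as one token.
    for token in shlex.split(query):
        if token.startswith("+"):
            must.append(token[1:])
        elif token.startswith("-"):
            must_not.append(token[1:])
        else:
            should.append(token)
    return {"must": must, "must_not": must_not, "should": should}


def matches(text, parsed):
    """True if the text has every required term and none of the excluded ones."""
    text = text.lower()
    return (all(t.lower() in text for t in parsed["must"])
            and not any(t.lower() in text for t in parsed["must_not"]))


parsed = parse_query('+"pan pizza" -olives pepperoni')
print(parsed)
print(matches("Pan pizza with ham", parsed))
print(matches("Pan pizza with olives", parsed))
```

Preferred terms such as pepperoni do not decide whether a page matches; an engine would normally use them to rank matching pages higher.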
Deep or Invisible Web
Normally database contents do not show in search engines
Invisible web 500 times greater than visible web
Some sites specialise in providing links to the deep web
Some deep web links
http://completeplanet.com
http://www.freepint.com/gary/direct.htm
http://www.search.com/subjects
Finding…
People
• To use electoral rolls there is www.192.com
• Friends Reunited http://www.friendsreunited.co.uk
• Genealogy www.scotlandspeople.gov.uk
Businesses, Locations and Routes
• Weather at www.weather.com. Try hamilton, uk
• Business at www.yell.com
• House prices http://www.ourproperty.co.uk
• Train timetables at http://www.nationalrail.co.uk/
• Traffic http://www.trafficscotland.org/
• Maps at www.multimap.com
The Past Web
http://www.archive.org/web/web.php
Combing
Searching for those who have already searched
There will be some ‘specialist’ who has spent years of his life cataloguing all possible variants of the G4M ‘Betty’ bombers
Combing
Searching for those who have already searched
Usenet – www.groups.google.co.uk - groups – 25 million+ users
Mailbase – www.jiscmail.ac.uk
Forums, messageboards, and blogs – www.acciesworld.com – searchingthenet.wordpress.com – over 100 million users
Wikis – www.wikipedia.org
What are people searching for?
http://www.metaspy.com/info.metac.spy/metaspy
Meat
Aggressive driving school
Natalie Imbruglia toless
Web 2.0 – Library 2.0 A Generational Thing?
Library 2.0 – use of those tools which would allow the user to participate in and help shape the service
Web 2.0 - a generation of web tools that allow social interaction and information sharing online
(Tends to be used by ‘Digital Natives’ rather than ‘Digital Immigrants’)
If adopted fully, Library 2.0 could be as revolutionary as the change to Open Access in the 20th century
Web 2.0
• Bebo, FaceBook, MySpace, Flickr
• Wikis
• Instant Messaging (IM)
• Weblog
• RSS
• Social Bookmarking: Del.icio.us, furl etc
• Web-based work sharing
Library 2.0
• Library blogging
• Interactive OPAC
• IM for overdues
• Weblog with themes for stock promotion
• Stock promotion on Active IT website
• Storytelling sessions on website
Web 2.0
Bebo etc: widely used by teenagers and twenty-somethings (‘Digital natives’) for social networking and keeping in touch. (Modern equivalent of sending a postcard!)
IM (Instant messaging): MSN Messenger etc
RSS: have news downloaded on regular basis to your PC via a news reader
Social Bookmarking: instead of searching for material, see what other people have looked for. Del.icio.us. Users bookmark websites/URLs that they think are of interest to others like themselves
Web-based work sharing: Google and others provide office tools which can be used and stored or shared online, e.g. Google Docs and Spreadsheets, Wetpaint
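To show what a news reader does with RSS, here is a minimal sketch that pulls the headlines out of a feed. The feed text, titles and addresses are invented for illustration; a real reader would download the feed from a website at regular intervals.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed with two news items.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Library news</title>
    <item><title>Author visit</title><link>http://example.org/visit</link></item>
    <item><title>New reading group</title><link>http://example.org/group</link></item>
  </channel>
</rss>"""


def read_headlines(feed_xml):
    """Return a (title, link) pair for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


headlines = read_headlines(SAMPLE_FEED)
for title, link in headlines:
    print(f"{title}: {link}")
```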
Practicalities: What can we do? What can we not do?
OPAC - lets users share ideas via book reviews and reading lists - can be accessed through website - allows online registration/reservations
IM - lets IT & Systems send overdue notifications - could let them send reminders (possible revenue loss)
Weblog - can be used to promote events, author visits
e.g. RM - restrictions in place prevent access to sites like Facebook. Can access Google Tools like Documents and Spreadsheets
The Challenges
Keep pace with explosive growth
Understand the ‘boundary’ needs of user communities – search engines for each professional group?
Provide sufficient ‘intelligence’ to infer what users are really asking for even when their queries don’t specify it
Ensure sufficient coverage to provide one-stop searching
Evaluating Web Pages
Aye, right!
“If I find the same information from three different web sites, it must be true”
“90% of what’s worth reading can be found on the Internet”
“The World Wide Web must be a good place to find what I am looking for, because it’s bigger than all the biggest libraries in the world combined”
“Search engines list the best sites first”
Facts
Only careful evaluation can tell you if any resource is reliable
Only a small, but ever increasing, fraction of important reading material can be found via the Internet – much of what’s in print may never be digitised or online
The size of the web comes mostly from commercial sites – over 60%
Search engines are organised by computers and the results are ranked by computer programs oblivious to information quality. Ranking by popularity – who links to a site (as in Google) – often brings to the top many unreliable and satirical sites. Commercial links pay to be placed at the top and this is not always obvious
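Why popularity is not the same as quality can be seen in a toy ranking. This sketch simply counts how many other pages link to each page; it is a deliberate simplification, not Google's actual algorithm, and all the site names are hypothetical.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order pages by how many distinct other pages link to them."""
    inbound = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # a page linking to itself does not count
                inbound[target] += 1
    ordered = sorted(inbound.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ordered]


# Which pages each page links to.
links = {
    "blog.example": ["satire.example", "library.example"],
    "forum.example": ["satire.example"],
    "news.example": ["satire.example", "library.example"],
    "library.example": [],
}

ranking = rank_by_inbound_links(links)
print(ranking)
```

The satirical site comes first because more pages link to it. Nothing in the count looks at whether its content is reliable.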
Look at the URL
What is the domain?
.com .edu .net .gov .mil .org
.uk .de .fr .jp
Does the domain match the type of information?
http://www.scottish.parliament.net
http://www.glasgow.university.geocities.com
Look at the URL
Who published the page?
Between the http:// and the first /
e.g. http://bbc.co.uk/scotland/sport/default.stm
Is the content appropriate for this publisher?
NY Times article from aol.com
Is there a ~ or a % preceding a name?
e.g. http://harvard.edu/~pjones/report.html
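These URL checks can be done by eye, but a short sketch shows exactly which part of the address each one looks at. `inspect_url` is a hypothetical helper; it reports the publisher (the part between http:// and the first /), the last part of the domain, and whether the path contains a ~ or %, which often marks a personal page.

```python
from urllib.parse import urlsplit


def inspect_url(url):
    """Pull out the parts of a web address that help judge who published it."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "publisher": host,
        "top_level_domain": host.split(".")[-1] if host else "",
        "personal_page": "~" in parts.path or "%" in parts.path,
    }


for url in (
    "http://bbc.co.uk/scotland/sport/default.stm",
    "http://harvard.edu/~pjones/report.html",
    "http://www.glasgow.university.geocities.com",
):
    print(url, inspect_url(url))
```

A personal page on a university server is hosted by the institution but is not necessarily endorsed by it, and an address ending in .com that claims to be a university deserves a second look.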
Who wrote the page?
Look for an email or contact address/phone number
About the author
If author is unclear try to shorten the URL
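Shortening a URL means removing one part of the path at a time, from the right, and visiting each shorter address to see who is behind the page. A minimal sketch, with `shortened_urls` as a hypothetical helper:

```python
from urllib.parse import urlsplit, urlunsplit


def shortened_urls(url):
    """Return progressively shorter versions of `url`, one path segment at a time."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    shorter = []
    while segments:
        segments.pop()  # drop the last part of the path
        path = "/" + "/".join(segments) + ("/" if segments else "")
        shorter.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return shorter


steps = shortened_urls("http://harvard.edu/~pjones/report.html")
for step in steps:
    print(step)
```

Each shorter address may lead to a home page, an "about" page or a directory listing that identifies the author or publisher.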
Ask yourself…
• What is the page’s purpose? Why was it created?
– To inform?
– To give facts or data or schedules?
– To persuade or explain?
– To sell or entice?
– To share or disclose something personal?
• Are you looking at an authentic source?
– A well known newspaper, journal, organization, institution?
• If viewing a document, is it unmodified?
– Parts can be selectively omitted
– It’s easy to falsify a document and mimic the original format – look for .pdf files which are difficult to alter
Certificate of Astronauting
Presented to
Buzz Cameron & Iain Lightyear
On completion of their Space Crash Course at
NASA Academy, Coatbridge Branch, June 2008
To Fairhill… and beyond
Look for bias
• Who sponsors the page?
– Could the sponsor be a stakeholder in the page’s content?
– Would the EGG Society be a good source of cholesterol info?
• What is NOT being said?
– Try to think of alternative points of view
• Look for your own biases
– Are you being completely fair?
– Is the site good for some things and not others?
Is this biased? or this?
More questions
• Is the web page dated?
– Is the information current?
– Is the page being maintained?
• Is information authentic?
– Are sources cited and are they reliable?
• Could the page be ironic? Satire or parody?
Look at www.gatt.org
Warning
Step back, add it all up
Does it add up to integrity and reliability?
Do you need more information?
If you are not sure then voice your reservations
Look for other, complementary sources
Does everything you’ve found feel right?
View a web page as you would a TV commercial
The library is a good place to start!
What have we learned from Searching the net?
Google is not the ‘be all and end all’
Deeper, more focused searching can produce better results
Think about using databases
Use other people’s expertise
And that’s the end!
Thanks
More subject directories
http://www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html
http://bubl.ac.uk/link
http://infomine.ucr.edu
http://www.vlib.org
http://www.academicinfo.net
http://www.about.com
Some sites to evaluate
Try this tutorial: http://library.albany.edu/usered/webeval/
Here’s a site celebrating the life of Martin Luther King: http://martinlutherking.org/ or is it? Who maintains the site?
Wikipedia? Not quite: http://uncyclopedia.org/
Some sites to evaluate
http://www.infotoday.com/searcher/sep00/piper.htm
http://www.fulkerson.org/ancestors/buyanancestor.html
http://www.ovaprima.org/history.html
http://www.quackwatch.org/index.html
http://www.google.com/technology/pigeonrank.html
http://www.satirewire.com/news/jan02/australia.shtml
http://zapatopi.net/treeoctopus/
http://www.whirledbank.org/
Social or vertical searching
www.eurekster.com
www.rollyo.com
http://peerspective.mpi-sws.mpg.de
www.chacha.com
http://search.wikia.com/wiki/Search_Wikia
www.powerset.com
www.q-phrase.com
Finding books and texts
How to find books and texts
Books.google.com
Scholar.google.com
Microsoft Academic
www.librarything.com
http://wikibooks.org/
http://www.fullbooks.com/
http://portico.bl.uk/
http://www.readprint.com/
http://www.bookyards.com
http://manybooks.net/
http://www.ipl.org/