Information Technology in Business and Society
DESCRIPTION
Information Technology in Business and Society. Session 9 – Search and Advertising. Sean J. Taylor. Administrivia: Assignment 2 online, due Saturday 2/25 at 1am; Assignment 2 resources; Assignment 3 preview; guest speaker on Tuesday 2/28: Chrys Wu discussing IT and Journalism.
TRANSCRIPT
![Page 1: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/1.jpg)
INFORMATION TECHNOLOGY IN BUSINESS AND SOCIETY
SESSION 9 – SEARCH AND ADVERTISING
SEAN J. TAYLOR
![Page 2: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/2.jpg)
ADMINISTRIVIA
• Assignment 2 online, due Saturday 2/25 at 1am
• Assignment 2 resources
• Assignment 3 preview
• Guest speaker on Tuesday 2/28: Chrys Wu discussing IT and Journalism
• Substitute on Thursday 3/1: Professor Dylan Walker
![Page 3: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/3.jpg)
LEARNING OBJECTIVES
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search
![Page 4: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/4.jpg)
SEARCH ENGINES AND WEB DIRECTORIES
Resources on the Web that help you find sites with the information and/or services you want.
• Directory search engine - organizes listings of Web sites into hierarchical lists.
• Search engine - uses software agent technologies (“spiders” or “bots”) to search the Web for keywords and place them into indexes.
![Page 5: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/5.jpg)
WEB DIRECTORIES EXAMPLE
Advantages? Disadvantages?
![Page 6: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/6.jpg)
SEARCH ENGINE EXAMPLES
Advantages? Disadvantages?
![Page 7: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/7.jpg)
SEARCH ENGINES DRIVE ECOMMERCE!
![Page 8: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/8.jpg)
WHERE IS CONSUMERS' ATTENTION?
![Page 9: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/9.jpg)
![Page 10: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/10.jpg)
EYETRACKING STUDY OF GOOGLE RESULTS
![Page 11: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/11.jpg)
HOW SEARCH ENGINES WORK
– Search engines discover new pages by following links
– They keep track of the words that appear in each page; when you enter a query, the search engine returns a ranked list of matching pages
– Text content is important! But it is not enough! (Why?)
How do search engines rank pages? (Why does this matter?)
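The "keep track of words that appear in pages" step can be sketched as an inverted index. This is a minimal illustration, not how any particular search engine is implemented; the page names and texts are made up, and the result is an unranked list of matching pages (ranking is the subject of the following slides).

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word of the query, sorted by name."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical crawled pages (names and contents are illustrative only).
pages = {
    "a.html": "cheap hotels in London",
    "b.html": "London travel guide",
    "c.html": "cheap flights",
}
index = build_index(pages)
print(search(index, "cheap london"))  # pages containing both words
```

Matching on text alone says nothing about which of several matching pages should come first, which is why link-based ranking is needed.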
![Page 12: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/12.jpg)
PAGERANK IS REALLY A “RANDOM SURFER” MODEL
Random Surfer Model:
Let’s count the surfers that pass through each point:
$$T = 1 + (1-\alpha)W + (1-\alpha)^2 W^2 + \dots = \frac{1}{1 - (1-\alpha)W}$$
Here $\alpha$ denotes the probability that the surfer is “picked up” at each step.
What about getting stuck in loops? $\alpha$ takes care of that.
Transfer Matrix: the probability that a surfer follows a link from webpage i to webpage j is $(1-\alpha)W_{ij}$ = [prob. you were not “picked up”] * [prob. of following link i->j]
The matrix entry $W_{ij}$ is nonzero only if page i links to page j.
![Page 13: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/13.jpg)
MEASURING IMPORTANCE OF LINKING
PageRank Algorithm
Idea: important pages are pointed to by other important pages
Method:
• Each link from one page to another is counted as a “vote” for the destination page
• The number of incoming links is important!
• But it is not enough!
• Each “vote” is different! PageRank gives more weight to votes that come from pages that themselves have a large number of votes (and so on, and so on)
Compare, for example, the circled page in cases A and B
[Figure: link diagrams for cases A and B]
![Page 14: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/14.jpg)
COMPUTING PAGERANK
Each book page has a “People who bought this also bought…” list, which acts as its set of outgoing links:
• BOOK A: book B, book C, book D
• BOOK B: book A, book C
• BOOK C: book A
• BOOK D: book C
(ignoring damping factor for illustration)
![Page 15: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/15.jpg)
![Page 16: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/16.jpg)
PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
Initial PageRank: A = .250, B = .250, C = .250, D = .250
(ignoring damping factor for illustration)
![Page 17: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/17.jpg)
PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
Current PageRank: A = .250, B = .250, C = .250, D = .250
Each page splits its PageRank equally among its outgoing links:
• A sends .250/3 to each of B, C, D
• B sends .250/2 to each of A, C
• C sends .250 to A
• D sends .250 to C
(ignoring damping factor for illustration)
![Page 18: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/18.jpg)
PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
Votes sent along each link:
• A sends .250/3 to each of B, C, D
• B sends .250/2 to each of A, C
• C sends .250 to A
• D sends .250 to C
Updated PageRank: A = .375, B = .083, C = .458, D = .083
(ignoring damping factor for illustration)
![Page 19: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/19.jpg)
PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
Current PageRank: A = .375, B = .083, C = .458, D = .083
Votes sent along each link:
• A sends .375/3 to each of B, C, D
• B sends .083/2 to each of A, C
• C sends .458 to A
• D sends .083 to C
(ignoring damping factor for illustration)
![Page 20: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/20.jpg)
PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
Votes sent along each link:
• A sends .375/3 to each of B, C, D
• B sends .083/2 to each of A, C
• C sends .458 to A
• D sends .083 to C
Updated PageRank: A = .500, B = .125, C = .250, D = .125
(ignoring damping factor for illustration)
![Page 21: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/21.jpg)
PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
PageRank: A = .400, B = .133, C = .333, D = .133
Votes sent along each link:
• A sends .400/3 to each of B, C, D
• B sends .133/2 to each of A, C
• C sends .333 to A
• D sends .133 to C
(ignoring damping factor for illustration)
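The step-by-step computation on these slides can be replayed in a few lines. This is a sketch of the simplified procedure shown here (no damping factor, every page splits its score equally among its outgoing links), not a production PageRank; the function name and the choice of 100 iterations are assumptions made for illustration.

```python
def pagerank_no_damping(links, iterations):
    """Repeatedly pass each page's score, split equally, along its outgoing links.

    Returns the list of score dictionaries, one per step (step 0 = uniform start).
    """
    pages = sorted(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    history = [rank]
    for _ in range(iterations):
        new_rank = {p: 0.0 for p in pages}
        for page, targets in links.items():
            share = rank[page] / len(targets)  # equal split among out-links
            for target in targets:
                new_rank[target] += share
        rank = new_rank
        history.append(rank)
    return history


# The "also bought" links from the book example.
links = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A"],
    "D": ["C"],
}
history = pagerank_no_damping(links, 100)
for step in (0, 1, 2, 100):
    print(step, {p: round(v, 3) for p, v in history[step].items()})
```

Steps 1 and 2 give the .375/.083/.458/.083 and .500/.125/.250/.125 values from the slides, and the scores settle at A = .400, B = .133, C = .333, D = .133.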
![Page 22: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/22.jpg)
GAMING PAGERANK AND TRUST
TrustRank Algorithm
Initial votes come only from trusted pages
Compare, for example, the circled page in cases A and B
[Figure: in one case the circled page is linked from trusted pages; in the other, its links come from untrusted sources]
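A rough sketch of the "votes come only from trusted pages" idea: start all the score on a hand-picked trusted set and let it flow along links, returning a fraction to the trusted set at every step. The function name, the damping value of 0.85, the iteration count, and the tiny example graph are all assumptions for illustration; this is not a description of any search engine's actual trust computation.

```python
def trust_scores(links, trusted, damping=0.85, iterations=50):
    """Propagate trust from a seed set of trusted pages along outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Only trusted pages hold any initial (and any "teleported") score.
    seed = {p: (1.0 / len(trusted) if p in trusted else 0.0) for p in pages}
    score = dict(seed)
    for _ in range(iterations):
        new_score = {p: (1 - damping) * seed[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue  # pages without out-links simply pass nothing on
            share = damping * score[page] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return score


# Hypothetical graph: "good" is linked from trusted pages, "bad" only from spam.
links = {
    "trusted1": ["good"],
    "trusted2": ["good"],
    "spam1": ["bad"],
    "spam2": ["bad"],
    "good": [],
    "bad": [],
}
scores = trust_scores(links, trusted={"trusted1", "trusted2"})
print(scores["good"], scores["bad"])
```

Both target pages have the same number of incoming links, but only the one linked from trusted pages ends up with a nonzero score.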
![Page 23: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/23.jpg)
SIMULATING CHANGES IN PAGERANK
Links: A -> B, C, D; B -> A, C; C -> A; D -> C
Baseline PageRank: A = .400, B = .133, C = .333, D = .133

| Change | PR of A | PR of C |
|---|---|---|
| C cuts link to A | 0.18 | 0.50 |
| C links to B | 0.38 | 0.33 |
| C links to D | 0.24 | 0.40 |
| C links to B & D | 0.22 | 0.38 |
![Page 24: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/24.jpg)
IMPORTANCE OF ANCHOR TEXT
<a href=http://www.sims…>INFOSYS 141</a>
<a href=http://www.sims…>A terrific course on search engines</a>
The anchor text summarizes what the website is about.
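To make this concrete, here is a small sketch of how a crawler might collect anchor text for each link target, using Python's standard `html.parser`. The class name and the `example.com` URL are placeholders, not part of the slides.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a href=...> elements."""

    def __init__(self):
        super().__init__()
        self.anchors = []   # list of (href, anchor text)
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text pieces seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


# Placeholder page: two links to the same (hypothetical) target.
html = (
    '<p>See <a href="http://example.com/course">A terrific course on search engines</a> '
    'and <a href="http://example.com/course">INFOSYS 141</a>.</p>'
)
collector = AnchorTextCollector()
collector.feed(html)
for href, text in collector.anchors:
    print(href, "<-", text)
```

A search engine can then treat the collected phrases as a description of the target page, even if those words never appear on the target page itself.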
![Page 25: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/25.jpg)
OTHER RANKING FACTORS
Location, Location, Location... and Frequency
• Query words in the title, or in the first few sentences
• The more frequent the query words, the better
Click-through measurement
• How often users click on your URL when they see it
• How long they stay (measured using toolbars!)
![Page 26: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/26.jpg)
OUTLINE
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search
![Page 27: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/27.jpg)
ACHIEVING HIGHER RESULTS RANKINGS
• Position your keywords (title, headings, early on page)
• Make text visible (no tiny fonts, no white-on-white)
• Frames can kill
• Have relevant content
• Do not change topics
• Just say no to search engine spamming
• Submit your key pages
• Verify your listing often
![Page 28: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/28.jpg)
MANIPULATING RANKINGS
Motives
• Commercial, political, religious, lobbies
• Promotion funded by advertising budget
Operators
• Contractors (Search Engine Optimizers) for lobbies, companies
• Web masters
• Hosting services
What are the techniques used by rankings manipulators?
![Page 29: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/29.jpg)
MANIPULATION TECHNOLOGIES
Cloaking
• Serve fake content to search engine robot
• DNS cloaking: Switch IP address. Impersonate
Doorway pages
• Pages optimized for a single keyword that re-direct to the real target page
Keyword Spam
• Misleading meta-keywords, excessive repetition of a term, fake “anchor text”
• Hidden text with colors, CSS tricks, etc.
Link spamming
• Mutual admiration societies, hidden links, awards
• Domain flooding: numerous domains that point or re-direct to a target page
Robots
• Fake click stream
• Fake query stream
[Figure: Cloaking. “Is this a search engine spider?” If yes (Y), serve FakeDoc; if no (N), serve SPAM]
Meta-Keywords = “… London hotels, hotel, holiday inn, hilton, discount, booking, reservation, sex, mp3, britney spears, viagra, …”
Risky to use any of these, as search engines are getting better at detecting and punishing them
![Page 30: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/30.jpg)
OUTLINE
1. Learn how search engines rank pages
2. Learn how to design effectively for high rankings
3. Learn how online advertising works, especially search ads and keyword auctions
4. The future of search
![Page 31: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/31.jpg)
PAID RANKING
Promoting without Manipulation: Paid placement
Keyword bidding for targeted ads
• Pay-per-click
• Higher bids result in higher ranks for the ad
• A higher percentage of clicks on the ad increases the rank as well (why?)
Google's AdWords is the biggest player
• Google’s 2007 revenue was more than $16 billion, 2008 ~ $22 billion, mostly from such ads
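One plausible answer to the "why?": under pay-per-click the search engine is paid only when an ad is clicked, so the expected revenue from showing an ad is roughly bid × click-through rate. The sketch below ranks ads by that product. This scoring rule, the field names, and the numbers are assumptions for illustration; real ad auctions use additional signals and pricing rules.

```python
def rank_ads(ads):
    """Order ads by bid * click-through rate (expected revenue per impression)."""
    return sorted(ads, key=lambda ad: ad["bid"] * ad["ctr"], reverse=True)


# Hypothetical ads competing for one keyword.
ads = [
    {"name": "ad_high_bid", "bid": 2.00, "ctr": 0.01},  # expected $0.02 per view
    {"name": "ad_popular", "bid": 1.00, "ctr": 0.05},   # expected $0.05 per view
    {"name": "ad_low", "bid": 0.50, "ctr": 0.02},       # expected $0.01 per view
]
ranked = [ad["name"] for ad in rank_ads(ads)]
print(ranked)
```

Here the ad with the lower bid but the higher click-through rate outranks the highest bidder, because it earns more per impression.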
![Page 32: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/32.jpg)
EXAMPLE
[Figure: search results page with two AdWords placement areas marked alongside the most relevant sites]
![Page 33: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/33.jpg)
![Page 34: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/34.jpg)
FUND YOUR WEBSITE: ADSENSE
Google also delivers ads to other websites
Sign up for Google AdSense, and Google delivers ads to your website (a common source of income for “professional” bloggers)
How ads are delivered:
• If website best for targeted keywords
• If users of website click on results
Strategies for successful ads:
• Place the ads on top
• Blend with the rest of the website
• Ads at the bottom are ignored consistently
![Page 35: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/35.jpg)
EXAMPLE: WASHINGTON POST WEBSITE
![Page 36: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/36.jpg)
Analysis of Washington Post Website
![Page 37: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/37.jpg)
TARGETING BANNER ADS
Request for Ad from Ad Server
• IP Address
• Country, Domain, Company
• Browser, Operating System
• Surfing Behavior from cookies
• Demographic Data?
Targeted Ad is Delivered to User
• Context: Movie reviews
• User Profile: NYU user, New York
![Page 38: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/38.jpg)
CLOSED LOOP MARKETING
• User visits publisher sites
• Ads delivered by DART for Advertisers
• User clicks and visits advertiser’s site
• Boomerang captures user action data
• Data analysis, databank
• Boomerang compiles and reports response for future targeting
Source: Doubleclick, Inc.
![Page 39: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/39.jpg)
FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search
![Page 40: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/40.jpg)
INFORMATION EXTRACTION
Information extraction applications extract structured relations from unstructured text
Example input: “May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, which is in the front line of the world's response to the deadly Ebola epidemic in Zaire, is finding itself hard pressed to cope with the crisis…”
Information Extraction System (e.g., NYU’s Proteus)
Output: Disease Outbreaks in The New York Times

| Date | Disease Name | Location |
|---|---|---|
| Jan. 1995 | Malaria | Ethiopia |
| July 1995 | Mad Cow Disease | U.K. |
| Feb. 1995 | Pneumonia | U.S. |
| May 1995 | Ebola | Zaire |
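As a toy illustration of turning the news sentence into a table row, the sketch below uses a small disease list and two regular expressions. It is a deliberately naive stand-in; the patterns, the disease list, and the function name are assumptions, and real extraction systems such as the one named on the slide are far more sophisticated.

```python
import re

# Hypothetical list of disease names the toy extractor knows about.
KNOWN_DISEASES = ["Ebola", "Malaria", "Pneumonia", "Mad Cow Disease"]

# "<disease> epidemic|outbreak in <Location>"
OUTBREAK = re.compile(
    r"(?P<disease>" + "|".join(map(re.escape, KNOWN_DISEASES)) + r")"
    r"\s+(?:epidemic|outbreak)\s+in\s+(?P<location>[A-Z][\w.]*)"
)
# Dateline at the start of the article, e.g. "May 19 1995"
DATELINE = re.compile(r"^(?P<month>[A-Z][a-z]+)\s+\d{1,2}\s+(?P<year>\d{4})")


def extract_outbreak(text):
    """Return a {date, disease, location} record, or None if nothing matches."""
    match = OUTBREAK.search(text)
    if match is None:
        return None
    dateline = DATELINE.match(text)
    date = f"{dateline.group('month')} {dateline.group('year')}" if dateline else None
    return {
        "date": date,
        "disease": match.group("disease"),
        "location": match.group("location"),
    }


text = (
    "May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, "
    "which is in the front line of the world's response to the deadly Ebola "
    "epidemic in Zaire , is finding itself hard pressed to cope with the crisis"
)
print(extract_outbreak(text))
```

On the example sentence this yields the May 1995 / Ebola / Zaire row of the table; any phrasing outside the two patterns would be missed, which is exactly the difficulty real systems have to address.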
![Page 41: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/41.jpg)
RETURN STRUCTURED ANSWERS, NOT WEBPAGES
![Page 42: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/42.jpg)
FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search
![Page 43: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/43.jpg)
Y! ANSWERS
Launched in second half of 2005
Incentive system based on points and voting for best answers
Questions grouped by category
Some statistics:
• over 60 million users
• over 120 million answers
• available in 18 countries and in 6 languages
![Page 44: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/44.jpg)
![Page 45: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/45.jpg)
Y! ANSWERS
![Page 46: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/46.jpg)
Y! ANSWERS
![Page 47: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/47.jpg)
LONG-TERM PROSPECTS
Questions follow a power-law:
• A large number of questions will be asked by many people (20% of questions account for 80% of requests)
• We only need one answer for each question
• Quickly acquire high-quality answers for 80% of queries
• …people will, in time, take care of the “long tail” of the remaining questions
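The 20%/80% claim can be checked on a synthetic power-law. The sketch below assumes the k-th most popular question is asked in proportion to 1/k (a Zipf-like distribution) and that there are 10,000 distinct questions; both the exponent and the question count are assumptions, not figures from Y! Answers.

```python
def top_share(num_questions, top_fraction, exponent=1.0):
    """Share of all requests that go to the most-asked `top_fraction` of questions.

    Assumes the k-th most popular question is asked in proportion to 1 / k**exponent.
    """
    weights = [1.0 / (k ** exponent) for k in range(1, num_questions + 1)]
    top_n = int(num_questions * top_fraction)
    return sum(weights[:top_n]) / sum(weights)


share = top_share(10_000, 0.20)
print(f"Top 20% of questions cover {share:.0%} of requests")
```

Under these assumed parameters the top 20% of questions cover a little over 80% of requests, in line with the rule of thumb on the slide; a different exponent or question count would shift the figure.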
![Page 48: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/48.jpg)
FUTURE OF SEARCH
1. Information Extraction: Search on Structured Data
2. Social Search
3. Privacy Preserving Search
![Page 49: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/49.jpg)
PRIVACY PRESERVING SEARCH
![Page 50: Information technology in business and society](https://reader036.vdocument.in/reader036/viewer/2022070500/5681685a550346895dde8f88/html5/thumbnails/50.jpg)
NEXT CLASS: SOCIAL NETWORKS
• Work on Assignment 2