Stephen Rhind-Tutt, President, Alexander Street Press, L.L.C
Developments: New forms of Digital Content
Presented to the Council on Library and Information Resources 2005 Sponsor’s Symposium
April 18th, 2005
Overview
• Google – positive or negative?
• A potential response for publishers (and libraries)
• Specific examples
• Indexing
• Linking
• Comprehensiveness
“Google's mission is to organize the world's information and make it universally accessible and useful.”
(Google Corporate Information Web page: February, 2005)
How to view Google?
“those who succumb to the commercial
influence are building a monster who, like
Frankenstein, will slay his creator…”
A siren calling us to wreck ourselves
on the rocks of mediocrity
A superman to rescue scholarship and put it firmly in the public domain
The cautionary tale of radio…
1912 – Local, active, independent, “free,” chaotic, amateur
1920 – Some commercial, some non-profit, passive
1940 – Network domination (RCA, NBC), 98% commercial
• From 1921 to 1936, 202 licenses were issued to non-profit stations.
• By 1937, only 38 were still operating.
(Selling Radio: The Commercialization of American Broadcasting, 1920-1934, Susan Smulyan, Smithsonian Institution Press, 1994.)
The challenge
By 2010, the web will contain
• 90% of published works prior to 1923
• Majority of works published to 2010
• > 20 billion pages of e-mail, phone logs, databases, blogs, and Web sites (currently 8 billion)
• > 1 billion photographs
• > 20 million facsimile pages of manuscripts
• > 10 million audio files
• > 1 million video files
Objections to Google Print
1. Operational
• Will they be able to digitize all this material?
• What quality will it have?
• When will it be available?
2. Technical / legal
• With so many hits, won’t content get lost?
• OCR will never get beyond 98% or so… (i.e. several errors per page)
• Will they allow others to index, copy, and crawl the material?
3. Philosophical
• Won’t commercial priorities conflict with scholarly ones?
• As a de facto monopoly they’ll become too powerful.
• What happens to quality when search supplants content?
• Secrecy and commercialism will hurt scholarship.
4. Experiential
• What about NlightN, Knowledge Network, Questia, NetLibrary…?
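The OCR objection depends on what the 98% figure measures. A rough back-of-envelope sketch follows; the page sizes (about 350 words or 2,000 characters per page) and the function name are assumptions for illustration, not figures from the talk.

```python
def expected_errors_per_page(accuracy: float, units_per_page: int) -> float:
    """Expected number of misrecognized units (words or characters) on one page."""
    return (1.0 - accuracy) * units_per_page


# Assumed page sizes, for illustration only.
WORDS_PER_PAGE = 350
CHARS_PER_PAGE = 2000

if __name__ == "__main__":
    per_word = expected_errors_per_page(0.98, WORDS_PER_PAGE)
    per_char = expected_errors_per_page(0.98, CHARS_PER_PAGE)
    print(f"98% word accuracy: about {per_word:.0f} errors per page")
    print(f"98% character accuracy: about {per_char:.0f} errors per page")
```

Under these assumptions, 98% accuracy per word gives roughly seven errors per page, which matches "several"; the same rate per character would give about forty.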
But…
Rather than trying to work out why it won’t work… let’s assume that it (or something like it) will work out and see what the world looks like…
If it works, Google will deliver 30 times the content delivered by EEBO, ECCO, Evans, Shaw-Shoemaker and similar initiatives at no charge to end-users…
Another perspective
• Netscape
• Microsoft
• Yahoo
• Amazon
• Many others…
Have helped and will help
• It will all be available in digital form
• It will not cost too much
• Many more people will use it
• It will be enriched through better display, better integration, better links, better context, etc.
Good for publishers Good for librarians
Good for “society”
Where we’re headed
Evolution of publisher tasks
Fading Growing
Typesetting Printing
Print Monograph Print Directory
Public Domain Reprints
Simple, One-database Search
Rare and unpublished material
Linking
Licensing
Free materials
Semantic indexing
Process integration
Unified Search software
Workflow tools
Warehousing
Community Building
Asset Management
Commissioning?
Editorial?
Quality?
Selection?
With literally billions of pages…
What tools will we need?
• Beyond paper
• Higher quality
• More comprehensive
• Discipline specific
• High functionality links
• Community-centric
• Semantically organized
Where we’re headed
After Data, Information, Knowledge, and Wisdom, Gene Bellinger, Durval Castro, Anthony Mills. http://www.systems-thinking.org/
Who, What, When, Where?
Therefore
Why?
What tools will we need?
• Discipline specific
• Selection and quality
• Interactive
• Community centric
• More comprehensive (in copyright, rare, unpublished)
• High-functionality links
• Semantically organized
“Semantic” indexing
• Tim Berners-Lee and James Hendler in Scientific American, May 2001
• “…an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
• Represents a quantum shift in the functionality of the web
The strain on keyword search
Question: “Martin Luther King”
• Google: 8.3m hits
• Google Scholar: 7.8k hits
First hit: Development of a research strategy for assessing the ecological risk of endocrine disruptors. GT Ankley, RD Johnson, G Toth, LC Folmar, NE … US Environmental Protection Agency, Research Division 26 W Martin Luther King Dr Cincinnati, OH 45288.
• Yahoo: 7.4m hits
• Alta Vista: 7.3m hits
“Semantic” indexing
Collection
Series
Book or Volume
Chapter
Page
Word
Where? When? What? Who?
Traditional indexing >
“Semantic” indexing >
Increases in utility
Access: “Do you have the book titled…”
Keyword search: all mentions of “Star Wars”
Fielded search: all mentions of “Star Wars” in texts about Reagan published in 1985
Semantic search: all mentions of “Star Wars” by Reagan in speeches he delivered in 1985
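The step from keyword to fielded to semantic search can be sketched in a few lines. This is a minimal illustration, not the Alexander Street implementation: the `Document` class, its field names, and the three toy records are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Document:
    title: str
    text: str
    subject: str                          # fielded metadata: who the text is about
    year_published: int                   # fielded metadata
    speaker: Optional[str] = None         # semantic: who utters the words
    genre: str = "article"                # semantic: what kind of text this is
    year_delivered: Optional[int] = None  # semantic: when it was spoken


# Invented toy records with placeholder text, for illustration only.
DOCS = [
    Document("Film review", "placeholder text mentioning Star Wars", "Cinema", 1977),
    Document("Op-ed on defense", "placeholder text mentioning Star Wars", "Reagan", 1985,
             speaker="Columnist"),
    Document("Address", "placeholder text mentioning Star Wars", "Reagan", 1985,
             speaker="Reagan", genre="speech", year_delivered=1985),
]


def keyword_search(docs: List[Document], phrase: str) -> List[str]:
    """Match on the words alone."""
    return [d.title for d in docs if phrase in d.text]


def fielded_search(docs: List[Document], phrase: str, subject: str, year: int) -> List[str]:
    """Match on the words plus bibliographic fields."""
    return [d.title for d in docs
            if phrase in d.text and d.subject == subject and d.year_published == year]


def semantic_search(docs: List[Document], phrase: str, speaker: str,
                    genre: str, year: int) -> List[str]:
    """Match on the words plus who said them, in what kind of text, and when."""
    return [d.title for d in docs
            if phrase in d.text and d.speaker == speaker
            and d.genre == genre and d.year_delivered == year]


if __name__ == "__main__":
    print(keyword_search(DOCS, "Star Wars"))
    print(fielded_search(DOCS, "Star Wars", "Reagan", 1985))
    print(semantic_search(DOCS, "Star Wars", "Reagan", "speech", 1985))
```

Each level narrows the result set on these toy records from three documents to two to one. The narrowing comes from the extra indexing recorded for each document; the search code itself stays equally simple at every level.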
North American Theatre Online
• 40,000 pages of reference works for American and Canadian Theatre
• Detailed information and links on
– 25,000 plays and screenplays
– 20,000+ authors
– 15,000+ productions
– 2,500+ production companies
– 2,800+ theatres
– Over 5,000 resources (playbills, posters, ephemera)
– 15,000 characters within plays
• Integrates all ASP databases and material freely available on the Web
The “real” world
Play
Author
Production Stills
Playbills
Production Venue Director
Lighting Set Designers
Theater
Performance Location
Production Company
Producer
Texts
Criticism
Cast List
Performers
Posters
Ephemera
Scenes
Acts
Characters
Dramatis Personae
The virtual world
Author: Birth date, Death date, Birth place, Death place, Nationality, Occupation, Awards (38 fields)
Theater: District, Location, Capacity, Style, etc… (18 fields)
Company: Name, Productions, Performers, etc… (14 fields)
Production: Director, Theater, Cast, # of Perfs., Lighting, Costumes, etc… (47 fields)
Characters: Plays, Age, Author, Performer, etc… (30 fields)
Scenes: Where, When, Setting, Subject, etc… (41 fields)
Resources: Play, Director, Theater, Production Co., Character, Scene, etc… (45 fields)
Texts: Keyword, Author, Date written, Date published, Production (67 fields)
“All scenes performed in South Africa discussing AIDS from 1980 to 1990”
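A query like this one works only because scenes, productions, and theaters are indexed as separate, linked entities. The sketch below shows the idea with an in-memory SQLite database. The table and column names, the placeholder records, and the reading of "performed in South Africa" as the theater's country and "1980 to 1990" as the production year are all assumptions for illustration, not the actual North American Theatre Online schema.

```python
import sqlite3

QUERY = """
SELECT DISTINCT s.play, s.label
FROM scene AS s
JOIN production AS p ON p.play = s.play
JOIN theater AS t ON t.id = p.theater_id
WHERE s.subject = ?
  AND t.country = ?
  AND p.year BETWEEN ? AND ?
ORDER BY s.play, s.label
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE theater (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        CREATE TABLE production (id INTEGER PRIMARY KEY, play TEXT,
                                 theater_id INTEGER REFERENCES theater(id),
                                 year INTEGER);
        CREATE TABLE scene (id INTEGER PRIMARY KEY, play TEXT,
                            label TEXT, subject TEXT);
    """)
    conn.executemany("INSERT INTO theater VALUES (?, ?, ?)", [
        (1, "Theater X", "South Africa"),
        (2, "Theater Y", "United States"),
    ])
    conn.executemany("INSERT INTO production VALUES (?, ?, ?, ?)", [
        (1, "Play A", 1, 1986),   # in range, in South Africa
        (2, "Play A", 2, 1990),   # in range, wrong country
        (3, "Play B", 2, 1988),   # in range, wrong country
        (4, "Play C", 1, 1995),   # right country, out of range
    ])
    conn.executemany("INSERT INTO scene VALUES (?, ?, ?, ?)", [
        (1, "Play A", "Act 1, Scene 2", "AIDS"),
        (2, "Play A", "Act 2, Scene 1", "Family"),
        (3, "Play B", "Act 1, Scene 1", "AIDS"),
        (4, "Play C", "Act 3, Scene 1", "AIDS"),
    ])
    return conn


def find_scenes(conn: sqlite3.Connection, subject: str, country: str,
                start: int, end: int) -> list:
    """Return (play, scene label) pairs matching subject, country, and year range."""
    return conn.execute(QUERY, (subject, country, start, end)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_scenes(conn, "AIDS", "South Africa", 1980, 1990))
```

On the placeholder data only one scene satisfies all three conditions. A keyword search for "AIDS" and "South Africa" could not separate it from the others, because the country and the date belong to the production and the theater and do not appear in the scene text.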
Alexander Street Databases
Materials free on the Web
OPAC
Context
Higher value linkages
Loosely Held Tightly Held
Free Websites
Loosely integrated
Tightly integrated
Refuse to License
License widely
License widely and be a Licensor
• Higher value links
• Semantic indexing and keyword searching of more than 3,000 oral history collections
• Represents the personal histories of some 300,000 people
• Value:
– Context
– Selection
– Search power
– Licensed material
– Integration
Higher value linkages
Context and selection
Search Power
Organized Results
Copyright and comprehensiveness
Public Domain
Films
Plays
Stills rights
Residuals
Story rights
Music rights
Foreign rights
Song rights
Screenplay rights
Print rights
Electronic rights
Soundtrack rights
Performance rights
Books
Music
No rights reserved
Mechanicals
Geographic restrictions
Composer
Label
Library and publisher opportunity
1. Understand: View conceptual relationships (mapping terms to concepts)
2. Explore: Move from one concept to another with ease (browse tables of contents)
3. Discover: Answer questions you’ve never been able to (never before published content + Semantic indexing)
4. Learn: Test hypotheses and see if they’re correct
Libraries, publishers and technologists
New form of publishing
Publishing
Technology
Librarianship
For more information: www.alexanderstreet.com