Virtual International Authority File
ALA, June 2006
Richard Bennett, OCLC
Christel Hengel, DDB
Thomas B. Hickey, OCLC
Edward T. O’Neill, OCLC
Barbara B. Tillett, LC
VIAF
DDB/LC/OCLC
Virtual International Authority File
• Link authority records from national bibliographic agencies
• Build on their authority work
• Expand the concept of universal bibliographic control
  • Allow national or regional variations in authorized form to co-exist
  • Support needs for variations in preferred language, script, and spelling
Joint VIAF Project
Semantic Web Building Blocks
End-user
A&I controlled vocabularies
(Library) authority files
Other controlled vocabularies
“Ontologies”
Project Goal
Demonstrate feasibility of linking personal names across:
• Personennamendatei (PND)
• Library of Congress Name Authority File (LCNAF)
What is the VIAF?
• System
  • Links between files
  • Web browser access
  • Multilingual and multi-script
• Maintenance
  • National agencies control their records
  • Records harvested from national systems
• Scalable
  • Any number of national authority files
Matching Variations
In the LCNAF and PND authority files:
• Same name, same person
• Same name, different people
• Different names, same person
• Missing person in one file
Two Different People – One Name
Adams, Mike
• PND: a golfer
• LCNAF: author of a Beatles collector's guide
Same name, different people
One Person – Two Names
• LCNAF: Morel, Pierre
• PND: Morellus, Petrus
Same person, different names
Enhancing the Authorities
Bibliographic Record
Derived Authority
Authority Record
Enhanced Authority
Mining the Bibliographic Record
LDR 00826ccm 2200289 a 4500
1 ocm10025532
5 20031229650847.0
8 840627s1982 nyuuua n eng
10 $a 84758340
40 $a DLC $c DLC
19 $a 17706440
20 $c $2.95
28 22 $a 48418 $b G. Schirmer
45 2 $b d198006 $b d198007
48 $b va01 $b ve01 $a ka01
50 00 $a M1529.3 $b .T
100 1 $a Thomson, Virgil, $d 1896-
245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].
260 $a New York : $b G. Schirmer, $c c1982.
300 $a 1 score (11 p.) ; $c 31 cm.
500 $a For soprano, baritone, and piano.
650 0 $a Vocal duets with piano.
600 10 $a Larson, Jack $x Musical settings.
700 1 $a Larson, Jack.
Authors
LC Control Number
LC Classification
Title
Material Type
Publisher
Place of Publication
Language
Date of Publication
Usage
Derived Authority Record
00525nz 2200229n 4500
1 xlc 1
3 OCoLC
5 20040721111415.0
8 040721nneanz||abbn n and d
40 $a OCoLC $b eng $c OCoLC $f viaf
100 1 $a Larson, Jack.
903 $a 84758340
910 14 $a the cat $b duet for soprano and baritone
921 $a g schirmer
922 $a nyu
930 $a jack larson
940 $a eng
942 $a 234
943 $a 198x
944 $a cm
950 1 $a thomson, virgil $d 1896
All text is normalized
Subjects are grouped into broad subject areas
Material type is coded
Publication date is by decade
Coauthor
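The derived record above carries normalized text and a decade-level publication date. The following is a minimal sketch of how such values might be produced. The function names and the exact normalization rules are assumptions for illustration; the slides do not specify the project's actual rules, and the derived record shown keeps the comma in "thomson, virgil", which this sketch would drop.

```python
import re
import unicodedata


def normalize_text(value: str) -> str:
    """Hypothetical normalizer: strip diacritics, lowercase,
    replace punctuation with spaces, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^\w\s]", " ", without_marks.lower())
    return " ".join(cleaned.split())


def publication_decade(year: int) -> str:
    """Reduce a publication year to a decade code such as '198x'."""
    return f"{year // 10}x"


print(normalize_text("G. Schirmer,"))  # g schirmer
print(publication_decade(1982))        # 198x
```

Reducing the date to a decade trades precision for tolerance: two records for the same person are more likely to share a decade than an exact year.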
Enhanced Authority Record
Strong Matching Attributes
• A work (title) in common
• Common control numbers (ISBN, ISSN, or LCCN)
• Exact birth and death year
• Joint authors
• Name as subject
Weaker Attributes
• Only one of birth/death date(s) (allows some variation)
• Subject area of works (two levels)
• Format (books, films, musical scores, etc.)
• Language
• Publisher
• Partial title match
• Date of publication
• Country
• Role (author, illustrator, composer, etc.)
LCNAF | PND | Match
100 Menzel, Adolph, $d 1815-1905 | 400 Menzel, Adolph $d 1815-1905 | Exact name match with dates
901 345816110 $9 1 | 901 345816110 $9 1 | Standard number
910 friedrich def grosse und sien fridenswerk $9 1 | 910 friedrich def grosse und sien fridenswerk $9 4 | Exact title match
921 bruckman $9 1 | 921 bruckman $9 8 | Publisher
940 ger $9 27 | 940 ger $9 267 | Language
941 ill $9 1 | 941 ill $9 18 | Role
942 243 $9 7 | 942 243 $9 1 | Subject
943 184x $9 1 | 943 184x $9 2 | Decade
950 achenbach, sigrid $d 1944 $9 1 | 950 achenbach, sigrid $9 2 | Joint author
951 hambuger kunstahalle $9 2 | 951 hambuger kunstahalle $9 1 | Corporate name
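The split into strong and weaker matching attributes can be sketched as a simple comparison of two enhanced records. This is an illustration only: the attribute names, the rule that one shared strong attribute is enough, and the `min_weak` threshold are all assumptions, not the project's actual algorithm. The sample values are taken from the Menzel comparison above.

```python
# Hypothetical attribute groupings, loosely following the two slides.
STRONG = ("titles", "control_numbers", "life_dates", "joint_authors")
WEAK = ("publishers", "languages", "roles", "subjects", "decades")


def shared_attributes(a, b, names):
    """Return the attribute names for which both records share a value."""
    return [n for n in names if a.get(n, set()) & b.get(n, set())]


def compare(a, b, min_weak=3):
    """Classify a candidate pair; the thresholds are assumed, not sourced."""
    strong = shared_attributes(a, b, STRONG)
    weak = shared_attributes(a, b, WEAK)
    if strong:
        verdict = "match"
    elif len(weak) >= min_weak:
        verdict = "possible"
    else:
        verdict = "no match"
    return verdict, strong, weak


lcnaf = {
    "control_numbers": {"345816110"},
    "titles": {"friedrich def grosse und sien fridenswerk"},
    "joint_authors": {"achenbach, sigrid"},
    "publishers": {"bruckman"},
    "languages": {"ger"},
    "decades": {"184x"},
}
pnd = dict(lcnaf)  # on the slide both columns carry the same values

print(compare(lcnaf, pnd)[0])  # match
```

Keeping the strong and weak lists separate makes it easy to report why a pair matched, which helps when reviewing the complex matches by hand.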
LC Names
• Established Names: 4,187,973
• Names from Bib Records: 3,440,706
• Uncontrolled Names: 883,882
• Orphaned Names: 1,631,149
• Active Established Names: 2,556,824
DDB Names
• Established Names: 2,659,276
• Names from Bib Records: 2,319,829
• Uncontrolled (Undifferentiated) Names: 306,211
• Orphaned Names: 645,658
• Active Established Names: 2,013,618
Results
• Matches: 558,618
• Complex Matches: 70,797
• Unique Matches: 487,821
VIAF File
• DDB Names: 2,659,276
• LC Names: 4,187,973
• Common: 558,618 (70% of potential)
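The counts on the LC Names, DDB Names, and Results slides are internally consistent, which the short check below confirms. The reading behind it is an inference from the numbers, not something the slides state: orphaned names are treated as established names with no bibliographic use, and uncontrolled names as bibliographic names with no established form. The base for "70% of potential" is not given in the slides, so it is not recomputed here.

```python
lc = {
    "established": 4_187_973,
    "from_bib": 3_440_706,
    "uncontrolled": 883_882,
    "orphaned": 1_631_149,
    "active": 2_556_824,
}
ddb = {
    "established": 2_659_276,
    "from_bib": 2_319_829,
    "uncontrolled": 306_211,
    "orphaned": 645_658,
    "active": 2_013_618,
}


def counts_are_consistent(counts):
    """Active names should equal both differences (assumed interpretation)."""
    return (
        counts["established"] - counts["orphaned"] == counts["active"]
        and counts["from_bib"] - counts["uncontrolled"] == counts["active"]
    )


print(counts_are_consistent(lc))    # True
print(counts_are_consistent(ddb))   # True
print(70_797 + 487_821 == 558_618)  # True: complex + unique = matches
```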
Next Steps
• Move to incremental updates
• Start harvesting national files
• Bring up Web interface (to full files)
• Make OAI accessible
• Bring in new participants
• Handle non-Roman matching
• Move to other types of authorities
  • Corporate names
  • Geographic names
  • …
Stage 3: Build OAI Server
LCNAF
DDB/PND
OAI Server(s)
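Harvesting from an OAI server is normally done with OAI-PMH requests such as `ListRecords`. The sketch below only builds the request URL. The base URL is a placeholder and the `marc21` metadata prefix is an assumption, since the slides do not say which endpoints or formats the project's servers would expose.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="marc21",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords URL.

    In OAI-PMH a resumptionToken is an exclusive argument, so when one is
    given it replaces metadataPrefix and from.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # selective harvest for updates
    return f"{base_url}?{urlencode(params)}"


# https://example.org/oai is a hypothetical endpoint.
print(list_records_url("https://example.org/oai", from_date="2006-06-01"))
```

The `from` argument is what makes incremental updates possible: a harvester can ask only for records changed since its last run.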
Stage 4: Ongoing maintenance
Stage 5: Build End User Interface with Unicode displays
User’s cookie specifies Hangul is preferred.
Display 700 form, building on local system’s authority structure.
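One way to picture the script preference is a lookup that picks, from the linked forms of a name, the first one written in the user's preferred script. This is a rough sketch under stated assumptions: the function names are hypothetical, the script is guessed from the Unicode character name, and the sample name is a placeholder rather than data from the project.

```python
import unicodedata


def script_of(text):
    """Guess the script from the first alphabetic character's Unicode name,
    for example 'HANGUL' or 'LATIN'. A heuristic, not a full script test."""
    for ch in text:
        if ch.isalpha():
            return unicodedata.name(ch, "").split(" ")[0]
    return ""


def choose_display_form(forms, preferred_script):
    """Return the first form in the preferred script, else the first form
    (assumed here to be the local authorized heading)."""
    for form in forms:
        if script_of(form) == preferred_script:
            return form
    return forms[0]


forms = ["Hong, Kil-tong", "홍길동"]  # placeholder name in two scripts
print(choose_display_form(forms, "HANGUL"))    # 홍길동
print(choose_display_form(forms, "CYRILLIC"))  # Hong, Kil-tong
```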
Thank you
T. Hickey
http://errol.oclc.org/laf/n82-54463.html
ALA June 2006