ISNI Assignment
Janifer Gatenby, EMEA Program Manager Metadata, OCLC. ISNI Annual General Assembly, Frankfurt, October 2014.
The world’s libraries. Connected.
ISNI Assignment
ISNI Annual General Assembly, Frankfurt 2014
October 2014
OCLC
Janifer Gatenby
EMEA Program Manager Metadata, OCLC
Provisional: Unassigned    9,953,505
Provisional: Possible        701,157
Assigned                   8 million
Assigned ISNIs October 2014
2+ independent sources                        3,956,454
3+ VIAF sources                                 494,002
Unique name                                   3,233,924
Single source (JISC names, BOEK, Ringgold)      342,234
Total                                         8,026,614
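As a quick arithmetic check, the four category counts on this slide can be summed and compared with the reported total. The sketch below only restates the slide's figures; the variable names are illustrative.

```python
# Category counts copied from the "Assigned ISNIs October 2014" slide.
assigned_by_category = {
    "2+ independent sources": 3_956_454,
    "3+ VIAF sources": 494_002,
    "Unique name": 3_233_924,
    "Single source (JISC names, BOEK, Ringgold)": 342_234,
}
reported_total = 8_026_614

# The categories should add up to the total shown on the slide.
computed_total = sum(assigned_by_category.values())
assert computed_total == reported_total

# Share of each category in the assigned total, as a percentage.
shares = {name: 100 * count / computed_total
          for name, count in assigned_by_category.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```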
ISNI Assignment: Batch loading
Independent matching sources
3 VIAF sources
ISNI Matching
Name
Title
Partial title
Rare title word
Date
Publisher
Personal affiliation
Organisation affiliation
ISBN, ISWC, ISAN, DOI +
Other name identifier e.g. IPI, VIAF, IPD
Instrument
Linked entities
Dewey classification
Scores are collected from each judge (ice skating style)
Lowered for common surnames and common titles
Score > .85 = match
Score >.6 but <.85 = possible match
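The two thresholds above can be read as a three-way classification of a combined score. The sketch below is a minimal illustration, not the ISNI matching code: how the judges' scores are combined and how much they are lowered for common names is not stated on the slide, so the function takes an already combined score. The function name is hypothetical, and treating a score of exactly .85 as a possible match (and exactly .6 as no match) is an assumption, since the slide leaves those boundaries open.

```python
MATCH_THRESHOLD = 0.85     # "Score > .85 = match"
POSSIBLE_THRESHOLD = 0.6   # "Score >.6 but <.85 = possible match"

def classify_score(score: float) -> str:
    """Map a combined matching score to the outcome named on the slide."""
    if score > MATCH_THRESHOLD:
        return "match"
    if score > POSSIBLE_THRESHOLD:
        # Assumption: exactly 0.85 falls here, the more cautious reading.
        return "possible match"
    # Assumption: exactly 0.6 and anything lower counts as no match.
    return "no match"

for example in (0.92, 0.85, 0.70, 0.60, 0.30):
    print(example, classify_score(example))
```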
ISNI Assignment: Batch loading
Unique name
Single source
Central database - Trust
+ % confidence
- % confidence
Provisional: Unassigned 9+ million
Provisional: Possible ≈ 638,000
Assigned ≈ 8 million
Assignment is curated
Authoritative
Unique
Trustful
Persistent
Assignment only if confident
Publicly accessible www.isni.org
• Matching algorithms
• Data sampling
• Anomaly checks
• Quality assurance processes
• End User input notes
Confidence
The two main problems for maintaining persistence are
• duplicates needing to be merged
• undifferentiated identities needing to be split
ISNI errs on the side of making duplicates rather than mixed identities
Thus the batch load process (usually) makes a provisional record
• where there is no match (for fear of making a duplicate assignment)
• where there is a low confidence match (for fear of making a mixed identity or a duplicate assignment)
• where a matching record already has another local ID for the same source, regardless of the strength of the match (for fear of making a mixed identity)
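The three provisional cases listed above can be sketched as a small decision function. This is a hedged illustration only: the function and parameter names are hypothetical, the .85 and .6 cut-offs are borrowed from the ISNI Matching slide on the assumption that "low confidence" means the possible-match band, and the routes by which unmatched records are still assigned (unique name, single source) are left out.

```python
from typing import Optional

MATCH_THRESHOLD = 0.85     # above this the matching slide calls it a match
POSSIBLE_THRESHOLD = 0.6   # above this (up to 0.85) it is a possible match

def batch_load_outcome(best_score: Optional[float],
                       same_source_has_other_local_id: bool = False) -> str:
    """Return the outcome for one batch-loaded record.

    best_score is the score of the best candidate record, or None when
    no candidate was found at all.
    """
    if best_score is None or best_score <= POSSIBLE_THRESHOLD:
        # No match: kept provisional for fear of a duplicate assignment.
        return "provisional: no match"
    if same_source_has_other_local_id:
        # Applies regardless of match strength: fear of a mixed identity.
        return "provisional: same source already has another local ID"
    if best_score <= MATCH_THRESHOLD:
        # Low confidence: fear of a mixed identity or a duplicate.
        return "provisional: low confidence match"
    return "matched to existing record"

print(batch_load_outcome(None))
print(batch_load_outcome(0.95, same_source_has_other_local_id=True))
print(batch_load_outcome(0.70))
print(batch_load_outcome(0.95))
```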
Procedures for maximizing assignment
• Refinement of matching algorithms
• E.g. introduced rare title word;
• Now ignoring date of birth 1900
• Re-import program
• Rematch with new rules
• Rematch after new data added
• ISNI Quality Team: Data sampling
• assessing impact of single source
• Recommendations for program changes
• New criteria
• Assessing uncommon surname assignment
• Rules for online rich assignment
Online: Guarantee assignment – Personal Name
ISNIs will be automatically assigned where there are no possible matches in these cases:
There are matches with a database record with a different source
A personal name is unique and includes a surname and forename
The request includes an “isNot” statement
The metadata supplied is considered rich as per these cases:
• Full date of birth and death supplied
• Year of birth + 1 title or instrument + 1 related name (co-author or affiliated institution)
• 1 title or instrument + 1 external URL link of type encyclopaedia or home page (not social network page) + 1 related name (co-author or affiliated institution)
The request is resolving a possible match by including a PPN
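The three "rich metadata" cases for personal names can be expressed as a simple predicate. The sketch below is an illustration under stated assumptions, not the ISNI implementation: the `PersonRequest` fields, the date formats ("YYYY" or "YYYY-MM-DD") and the URL type labels are all hypothetical stand-ins for whatever the real request schema carries.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical request shape; the real ISNI request schema is not reproduced here.
@dataclass
class PersonRequest:
    birth_date: Optional[str] = None   # assumed "YYYY" or "YYYY-MM-DD"
    death_date: Optional[str] = None   # assumed "YYYY" or "YYYY-MM-DD"
    titles: List[str] = field(default_factory=list)
    instruments: List[str] = field(default_factory=list)
    related_names: List[str] = field(default_factory=list)  # co-authors or affiliated institutions
    url_types: List[str] = field(default_factory=list)      # assumed labels such as "home page"

FULL_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
YEAR_PREFIX = re.compile(r"\d{4}")
ACCEPTED_URL_TYPES = {"encyclopaedia", "home page"}  # social network pages do not count

def is_full_date(value: Optional[str]) -> bool:
    return value is not None and FULL_DATE.fullmatch(value) is not None

def has_year(value: Optional[str]) -> bool:
    return value is not None and YEAR_PREFIX.match(value) is not None

def is_rich_person(req: PersonRequest) -> bool:
    """True if the request meets any of the three rich-metadata cases."""
    has_work = bool(req.titles or req.instruments)
    has_related_name = bool(req.related_names)
    has_accepted_url = any(t in ACCEPTED_URL_TYPES for t in req.url_types)
    case_full_dates = is_full_date(req.birth_date) and is_full_date(req.death_date)
    case_birth_year = has_year(req.birth_date) and has_work and has_related_name
    case_external_url = has_work and has_accepted_url and has_related_name
    return case_full_dates or case_birth_year or case_external_url
```

A request with only a year of birth fails the test, while the same request with one title and one co-author passes.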
Online: Guarantee assignment – Organisation Name
ISNIs will be automatically assigned where there are no possible matches in these cases:
There are matches with a database record with a different source
An organisation name is unique and does not consist only of abbreviations
The metadata supplied is considered rich as per these cases:
• Includes LOCODE &
• Organisation type &
• Organisation URL
The request is resolving a possible match by including a PPN
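For organisations the rich-metadata test is a conjunction: LOCODE, organisation type and organisation URL must all be present. A minimal sketch, assuming the three values arrive as optional strings (the function and parameter names are hypothetical, and no validation of the values themselves is attempted):

```python
from typing import Optional

def is_rich_organisation(locode: Optional[str],
                         organisation_type: Optional[str],
                         organisation_url: Optional[str]) -> bool:
    """True only when all three elements are supplied and non-blank."""
    values = (locode, organisation_type, organisation_url)
    return all(value is not None and value.strip() != "" for value in values)
```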
Maximizing assignment
Enter a request record online (Web page or via API)
Batch loaded records – passive method
• Quality Team manual fixes
• OCLC periodic re-match runs
• Matches from later batch loading & online activity
Batch loaded records – active method
• Resolve possible matches found by the system
• Search the database for candidate records for merging
• Enrich a record with URLs to external sources such as author’s web pages, Wikipedia, IMDB, MusicBrainz, Discogs, etc.
Source   May 2012   % assigned   Oct 2014   % assigned
ALCS       41,523       63.86%     49,157       76.66%
PROL        2,205       35.24%      4,143       66.18%
PROQ       65,122       12.89%    243,481       48.19%
AUVLU           0           0%      1,716       48.28%
ICLA            0           0%      2,208       97.61%
Finding possible matches
Command                  What it finds
Cn: proq & bs: [01]*     All your records with a possible match
Cn: proq & bs: 1*        Exact duplicates
Cn: proq & bs: 09*       Probably your duplicates
Cn: proq & bs: 08*       Most likely are matches
Cn: proq & bs: 07*       Possible matches
Cn: proq & bs: 06*       Possible matches, lower match confidence
DECISIONS
• Records should merge
• One of the records should split (note to QT)
• Different identities
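The search commands above all follow one pattern: a source code after `Cn:` and a `bs:` prefix followed by `*`. The helper below is a hypothetical convenience for generating those strings for another source code; it only reproduces the command text shown on the slide and does not talk to the ISNI database.

```python
# bs prefixes and their meanings, copied from the slide.
BANDS = {
    "[01]": "All your records with a possible match",
    "1": "Exact duplicates",
    "09": "Probably your duplicates",
    "08": "Most likely are matches",
    "07": "Possible matches",
    "06": "Possible matches, lower match confidence",
}

def build_query(source_code: str, band: str) -> str:
    """Build a search command in the form shown on the slide."""
    if band not in BANDS:
        raise ValueError(f"unknown bs prefix: {band!r}")
    return f"Cn: {source_code} & bs: {band}*"

# List every command for one source, with what it finds.
for band, meaning in BANDS.items():
    print(f"{build_query('proq', band):<24} {meaning}")
```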
Resolving Possible Matches
Compare Screen
Adding a new record – Michel Calame
Adding a new record
Adding a new record
Adding a new record for an Organisation
New Organisation form
Adding your source to an existing record
Adding your source to an existing record
Correcting and enriching
These are all the same person. The second has an incorrect DOB = 1900
Enriching
You can add a source note or general note to any database record; your code does not need to be present
Reporting errors
The general note will trigger an email to the ISNI Quality Team for attention
Atom Pub API (Machine to machine)
• Requests and replacements (you can replace your existing data citing your local identifier)
• Request
• Atom Pub Header
• Content = Request in the ISNI XML Request schema
• Documentation
• ISNI Atom Pub API guidlines.doc
• ISNI request.xsd (XML schema)
• ISNI request schema.doc (describes the schema)
• ISNI response.xsd (XML schema)
• ISNI response schema.doc (describes the schema)
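The slide describes a request as an Atom Pub header plus content in the ISNI XML Request schema. The sketch below shows one generic way to wrap an XML payload in an Atom entry using Python's standard library. It is an assumption-laden illustration: the `<Request>` payload is a placeholder rather than a valid ISNI request, the `application/xml` content type is assumed, and a complete Atom entry also carries elements such as an id and an updated timestamp, which are omitted here. The ISNI Atom Pub API guidelines listed above remain the authority on the actual envelope.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # standard Atom namespace

def wrap_in_atom_entry(request_xml: str, title: str) -> str:
    """Wrap an XML request payload in a minimal Atom entry."""
    # Use an explicit prefix so the unqualified payload keeps its own namespace.
    ET.register_namespace("atom", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content",
                            {"type": "application/xml"})  # assumed content type
    content.append(ET.fromstring(request_xml))
    return ET.tostring(entry, encoding="unicode")

# Placeholder payload; a real one must follow ISNI request.xsd.
placeholder_request = "<Request><placeholder/></Request>"
print(wrap_in_atom_entry(placeholder_request, "ISNI request"))
```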
Documentation: Data Submission
Documents relating to data submission
• ISNI tab delimited format
• ISNI tab delimited format organisations
• ISNI data element values
• ISNI XML request schema
• ISNI XML request schema document
• ISNI Atom Pub interactive request requirements
• ISNI Data contributors usage guidelines
• ISNI database source profiles RAG information
ISNI bulk load submission
Documents relating to data submission output
• ISNI XML response schema
• ISNI XML response schema document
• ISNI XML notification schema
• bulk load assigned ISNIs.xsd
• bulk load ISNI not assigned.xsd
• bulk load too many matches.xsd
• ISNI Data contributors reports and notifications guidelines
ISNI Charges
Enquiry: no charge
Resolving possible match: no charge
Resolving non match: no charge
Correcting information or adding information to an existing record: no charge
Adding a source to a record (status is assigned, provisional or suspect) or adding a new record: 100 p.a. free, ISNI request rate*
What is requested from ISNI Data Contributors?
Ingest ISNIs
Act on notifications (new assignments, changed assignments, errors and queries)
Assist in reviewing possible matches (Exact matches then possible matches)
Add a note to any record found with an error
Keep data up to date (become a RAG or use the services of an existing one)
Supply URI