Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
Phil Barker, March 2006. © Heriot-Watt University. You may reproduce all or any part of this presentation but please retain acknowledgement of authorship & copyright. Also I would appreciate it if you let me know how you are using it and send me any feedback <[email protected]>
OAI Background
• Has origins in ePrints (arXiv, CogPrints), dating back to 1999
  – actively seeking wider applicability
  – Nothing to do with OAIS
• Aims to “facilitate the efficient dissemination of content”
  – free access to the archives (at least: metadata)
  – consistent interfaces for archives and service providers
  – low barrier protocol / effortless implementation (e.g., because based on HTTP, XML, DC)
• Now on version 2.0 (June 2002)
OAI-PMH: what’s it all about
• Service providers harvest metadata from data providers.
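[Diagram: the harvester at the Service Provider sends Requests (HTTP) to the Data provider using OAI-PMH; the Data provider, which holds Metadata (+ resources), returns Metadata (XML), on which the Service Provider builds a Metadata “service”. Adapted from http://www.oaforum.org/tutorial/english/page3.htm]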
What can be requested (verb)
• Description of the archive (Identify)
• A list of metadata formats supported by the data provider (ListMetadataFormats)
• A list of sets provided (ListSets)
• A list of resource identifiers (ListIdentifiers)
• Many records (ListRecords)
• An individual record (GetRecord)
Example Requests
• http://archive.example.org/oaipmh?verb=Identify
• http://archive.example.org/oaipmh?verb=ListRecords&metadataPrefix=oai_dc
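As a rough illustration of how such requests are put together, the sketch below builds the two example URLs from a verb and a dictionary of arguments. The helper name `build_request` is hypothetical (it is not part of any OAI-PMH library), and the base URL is the placeholder used in the examples above, not a real repository.

```python
from urllib.parse import urlencode

# Placeholder base URL copied from the example requests; not a real repository.
BASE_URL = "http://archive.example.org/oaipmh"


def build_request(base_url, verb, arguments=None):
    """Return an OAI-PMH request URL for a verb plus optional arguments.

    `arguments` is a plain dict so that names such as "from", which is a
    reserved word in Python, can be passed without trouble.
    """
    query = {"verb": verb}
    query.update(arguments or {})
    # urlencode escapes each value and joins the pairs with "&".
    return base_url + "?" + urlencode(query)


identify_url = build_request(BASE_URL, "Identify")
list_records_url = build_request(
    BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc"}
)
print(identify_url)
print(list_records_url)
```

Fetching either URL with any HTTP client should return an XML document, which is what keeps the barrier to implementation low.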
Metadata Formats
• Metadata may be returned in any XML format
• Dublin Core is mandatory
  – OAI-PMH specifies the XML schema to use
  – No single DC element is mandatory
• Other element sets / bindings are optional
  – Qualified DC (e.g. RDN, NSDL)
  – MODS (LoC)
  – LOM (RDN-LTSN)
  – ODRL (JORUM (I think))
  – ...
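Because no single DC element is mandatory, code that reads oai_dc metadata has to treat every element as optional and repeatable. A minimal sketch of that, using only the Python standard library: the sample record and the function name `dc_fields` are made up for illustration, while the two namespace URIs are the ones used by the oai_dc schema.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record, for illustration only.
SAMPLE = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>An example thesis</dc:title>"
    "<dc:creator>Example, A.</dc:creator>"
    "<dc:creator>Sample, B.</dc:creator>"
    "</oai_dc:dc>"
)


def dc_fields(xml_text):
    """Collect Dublin Core elements into a dict of lists.

    Every element may be absent or repeated, so each value is a list and
    missing elements simply do not appear as keys.
    """
    fields = {}
    for element in ET.fromstring(xml_text):
        # ElementTree writes qualified tags as "{namespace}localname".
        namespace, _, name = element.tag[1:].partition("}")
        if namespace == DC_NS:
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


fields = dc_fields(SAMPLE)
print(fields.get("title", []))
print(fields.get("creator", []))
print(fields.get("date", []))  # absent in the sample, so an empty list
```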
Sets
• A grouping of items made to allow selective harvesting
  – E.g. all theses
  – E.g. the Engineering section
  – E.g. all resources from a given source
• Optional
List Records
• Harvester can ask for a specific metadata format for
  – All available items
  – All items in a set
  – All records modified in a given date range
  – (A single item: GetRecord)
• Data provider can return
  – All relevant records
  – Some relevant records + resumption token
  – An error code (no such set / metadata format)
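The three possible outcomes above (everything, a partial list plus a resumption token, or an error) suggest a simple harvesting loop. The sketch below is one way it might look, not a reference implementation: `harvest_identifiers`, `HarvestError`, `page` and `fake_fetch` are hypothetical names, and the two canned responses stand in for a real data provider so the example runs without a network. It relies on two details of OAI-PMH 2.0: a resumed request carries only the verb and the resumptionToken, and an empty resumptionToken marks the end of the list.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


class HarvestError(Exception):
    """Raised when the data provider answers with an <error> element."""


def harvest_identifiers(fetch, metadata_prefix="oai_dc", set_spec=None):
    """Yield record identifiers from ListRecords, following resumption tokens.

    `fetch` is any callable that takes a dict of request arguments and
    returns the response XML as text (e.g. a thin wrapper around HTTP GET).
    """
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        arguments["set"] = set_spec  # selective harvesting of one set
    while True:
        root = ET.fromstring(fetch(arguments))
        error = root.find(OAI + "error")
        if error is not None:
            raise HarvestError(error.get("code"))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(OAI + "ListRecords/" + OAI + "resumptionToken")
        if not (token or "").strip():
            return  # no token, or an empty one: the list is complete
        # A resumed request carries only the verb and the token.
        arguments = {"verb": "ListRecords", "resumptionToken": token.strip()}


def page(identifiers, token):
    """Build a canned, heavily simplified ListRecords response."""
    records = "".join(
        f"<record><header><identifier>{i}</identifier></header></record>"
        for i in identifiers
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"{records}<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


PAGES = {
    None: page(["oai:example:1", "oai:example:2"], "page2"),
    "page2": page(["oai:example:3"], ""),
}


def fake_fetch(arguments):
    # Stand-in for a real HTTP request to a data provider.
    return PAGES[arguments.get("resumptionToken")]


harvested = list(harvest_identifiers(fake_fetch))
print(harvested)
```

Passing the fetch function in as a parameter keeps the paging logic separate from the HTTP client, so the same loop can be exercised against canned responses or a live repository.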
Static Repositories
• Even lighter-weight specification for data providers with small and relatively static collections
  – E.g. the output from a conference
• Essentially an XML file available at a URL
• Accessed through a “static repository gateway” intermediary
Issues: complexity
• Providing data is easy
• Harvesting data is easy
However
• Doing so may lead to complex workflow / policy issues
  – What do you do with the harvested metadata?
  – Do you modify the metadata you harvest?
  – If so, do you feed this back to the provider?
  – What if the provider changes a modified record?
  – Does a service provider disseminate via OAI?
Issues: uptake
• Lots of implementers, who have produced lots of useful support
However
• Relatively little commercial uptake
• Relatively little support for harvesting rich metadata
• Relatively little support/consensus on sets
Issues: Harvesting resource (e.g. Full text)
• Nothing in OAI-PMH requires that full-text should be available for harvesting.
  – Resource may be physical or access controlled
• Nothing in OAI-PMH requires that the information needed to harvest the resource should be available.
• However in many cases OAI-PMH will provide the information required to harvest the resource.
Further resources
• Open Archives Initiative
  – http://www.openarchives.org/
  – Spec, best practice guide and useful resources, mailing lists
• OAI for beginners
  – http://www.oaforum.org/tutorial/
  – Online tutorial
• OAI Repository Explorer
  – http://www.purl.org/NET/oai_explorer
  – Web interface for issuing OAI-PMH requests