jcdl2013 mklein
Post on 08-May-2015
JCDL 2013 July 24th Indianapolis, IN 1
Martin Klein, @mart1nkle1n
martinklein0815@gmail.com
Herbert Van de Sompel, @hvdsomp
hvdsomp@gmail.com
http://www.openarchives.org/rs/
Extending Sitemaps for ResourceSync
ResourceSync Core Team
ResourceSync Technical Group
JISC
Richard Jones
Graham Klyne
Stuart Lewis
OCLC
Jeff Young
LOCKSS
David Rosenthal
RedHat
Christian Sadilek
Ex Libris Inc.
Shlomo Sanders
Library of Congress
Kevin Ford
Synchronize
• Web resources
  o things with a URI that can be dereferenced
• many/few
• big/small
• fast/slow
What
• Keep “in sync”
• Destination (client) follows changes at a Source (server) over time
• Keep copies on different systems the same
Two ResourceSync Capabilities
Resource List
Lists resources subject to synchronization
Change List
Lists changes to resources subject to synchronization
• Allow Destinations to obtain current resources
  o Requires URI
• Allow Destination to verify accuracy of sync’ed content
  o Requires lastmod and fixity information
• Allow Source to include references to additional content
  o Requires inclusion of links
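The difference between the two capabilities can be sketched from the Destination's side: a Resource List gives the current inventory, while a Change List is replayed against a local copy. The following is a minimal sketch, not the ResourceSync reference implementation. The function name `apply_change_list`, the tuple layout, and the change types `created` and `deleted` are assumptions for illustration; only `updated` appears in these slides.

```python
def apply_change_list(local, changes):
    """Replay (uri, change, lastmod) tuples, in order, against a local
    {uri: lastmod} view of the Source.

    The change type names are assumed for this sketch; only "updated"
    is shown in the slides.
    """
    for uri, change, lastmod in changes:
        if change in ("created", "updated"):
            # Record the newest known modification time for the resource.
            local[uri] = lastmod
        elif change == "deleted":
            # Drop the resource; ignore it if we never had a copy.
            local.pop(uri, None)
        else:
            raise ValueError(f"unknown change type: {change}")
    return local


# Hypothetical starting state, as if built from a Resource List.
state = {"http://example.com/res1": "2013-07-24T09:00:00Z"}
apply_change_list(state, [
    ("http://example.com/res1", "updated", "2013-07-24T10:00:00Z"),
    ("http://example.com/res2", "created", "2013-07-24T10:30:00Z"),
])
print(state)
```

A real Destination would also fetch the changed resources and check their fixity; this sketch only tracks the bookkeeping.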
Entrance…. Sitemaps
• Resource List is an inventory – so is a Sitemap
• Low barrier of adoption
• Ack’ed by Google, Yahoo!, Bing
Sitemap Format
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
  </url>
  <url>
    …
  </url>
</urlset>
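As an illustration of how little is needed to consume this format, here is a minimal sketch that reads the `<loc>` and `<lastmod>` of each entry with Python's standard `xml.etree.ElementTree`. The function name `read_sitemap` and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Sample document mirroring the slide; example.com is a placeholder.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
  </url>
</urlset>"""


def read_sitemap(xml_text):
    """Return a list of (loc, lastmod) tuples; lastmod is None if absent."""
    root = ET.fromstring(xml_text)
    ns = {"sm": SITEMAP_NS}
    entries = []
    for url in root.findall("sm:url", ns):
        loc = url.findtext("sm:loc", default=None, namespaces=ns)
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=ns)
        entries.append((loc, lastmod))
    return entries


print(read_sitemap(SAMPLE))
```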
ResourceSync Sitemap Extensions
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- root level: document info, lastmod, links -->
  <url>
    <!-- resource level: fixity, change type, and other resource info, links -->
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
  </url>
  <url>
    …
  </url>
</urlset>
Testing ResourceSync Sitemap Extensions
Series of informal experiments
1. Enhance Sitemaps with attributes and elements
2. Submit Sitemaps to Google Webmaster Tools
3. Evaluate immediate feedback
4. Check Google index
Concerns:
1. Rejection of ResourceSync documents due to
a. Added elements and attributes on root level
b. Added elements and attributes on resource level
2. Unwanted indexing of URIs from links vs. <loc>
Sitemap Extensions Test #1
Inclusion of elements and attributes at root level to convey:
• Type of capability
• Last modification date
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:meta capability="resourcelist" modified="2013-07-24T11:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
  </url>
</urlset>
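A Source could generate a test document like this one programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` and mirrors the experimental `rs:meta` markup of this test. The function name `build_resource_list` and its parameters are assumptions for illustration, and the element names follow this slide rather than any final specification.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"

# Serialize Sitemap elements without a prefix and extensions as rs:*.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("rs", RS_NS)


def build_resource_list(resources, modified):
    """Build a Sitemap with a root-level rs:meta element.

    resources: iterable of (loc, lastmod) string pairs.
    modified:  last modification time of the document itself.
    """
    root = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # Root-level extension, as in Test #1.
    ET.SubElement(
        root,
        f"{{{RS_NS}}}meta",
        {"capability": "resourcelist", "modified": modified},
    )
    for loc, lastmod in resources:
        url = ET.SubElement(root, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


print(build_resource_list(
    [("http://example.com/res1", "2013-07-24T09:00:00Z")],
    "2013-07-24T11:00:00Z",
))
```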
Sitemap Extensions Test #2
Inclusion of elements and attributes at resource level to convey:
• Change type
• Metadata
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod rs:change="updated">2013-07-24T09:00:00Z</lastmod>
    <rs:fixity type="md5">a2f29dklfgj9823lksdf90sfkd</rs:fixity>
    <rs:mimetype>text/html</rs:mimetype>
  </url>
</urlset>
Sitemap Extensions Test #3
Inclusion of links at root level to:
• Navigate through the framework
• Point at misc documents
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:link rel="resourcesync" href="http://example.com/capabilitylist.xml"/>
  <rs:link rel="describedby" href="http://example.com/info-about-source.xml"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
  </url>
</urlset>
Sitemap Extensions Test #4
Inclusion of links at resource level to:
• Point to related resources and documents
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
    <rs:link rel="duplicate" href="http://mirror.example.com/res1"/>
    <rs:link rel="http://www.openarchives.org/rs/terms/patch"
             href="http://example.com/res1-json-patch"
             type="application/json-patch"/>
  </url>
</urlset>
Results - Sitemap Extensions Test #4
As expected:
1. Child elements tolerated
2. Google indexes URI within <loc>
Unintended consequences:
3. Google indexes URIs within <rs:link>
2 & 3 together are not desired, e.g.:
• When a mirror location is provided, the URI in <rs:link> should be indexed and the URI in <loc> should not
• The URI in <rs:link> points at partial content
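To make the distinction concrete, the sketch below separates the URI in `<loc>` from the URIs carried in `<rs:link>` for the Test #4 document, which is the separation a consumer would need to make in order to treat them differently. It uses Python's standard `xml.etree.ElementTree`; the function name `uris_by_origin` and the returned structure are assumptions for illustration and say nothing about how Google actually processes Sitemaps.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# The Test #4 document from the slides.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
    <rs:link rel="duplicate" href="http://mirror.example.com/res1"/>
    <rs:link rel="http://www.openarchives.org/rs/terms/patch"
             href="http://example.com/res1-json-patch"
             type="application/json-patch"/>
  </url>
</urlset>"""


def uris_by_origin(xml_text):
    """For each <url>, return its <loc> URI and its (rel, href) link pairs."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        links = [(ln.get("rel"), ln.get("href"))
                 for ln in url.findall("rs:link", NS)]
        entries.append({"loc": loc, "links": links})
    return entries


for entry in uris_by_origin(SAMPLE):
    print("loc: ", entry["loc"])
    for rel, href in entry["links"]:
        print("link:", rel, href)
```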
Summary
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln rel="resourcesync" href="http://example.com/capabilitylist.xml"/>
  <rs:md capability="changelist" modified="2013-07-24T11:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
    <rs:md change="updated" type="text/html" hash="md5:a2f94c567f9b370c43fb1188f1f46330"/>
    <rs:ln rel="duplicate" href="http://mirror.example.com/res1"/>
  </url>
</urlset>
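The `hash` attribute is what lets a Destination verify the accuracy of synchronized content. The following is a minimal sketch of that check, assuming the `md5:<hex digest>` form shown in the summary document. The function names `resource_metadata` and `verify_fixity` are assumptions for illustration, and only MD5 is handled.

```python
import hashlib
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Resource-level part of the summary document from the slides.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-07-24T09:00:00Z</lastmod>
    <rs:md change="updated" type="text/html"
           hash="md5:a2f94c567f9b370c43fb1188f1f46330"/>
  </url>
</urlset>"""


def resource_metadata(xml_text):
    """Return (loc, change, hash) per <url>; change and hash may be None."""
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        change = md.get("change") if md is not None else None
        digest = md.get("hash") if md is not None else None
        rows.append((url.findtext("sm:loc", namespaces=NS), change, digest))
    return rows


def verify_fixity(content, hash_attr):
    """Check bytes against an 'md5:<hex>' value (only MD5 in this sketch)."""
    algorithm, _, expected = hash_attr.partition(":")
    if algorithm != "md5":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return hashlib.md5(content).hexdigest() == expected.lower()


print(resource_metadata(SAMPLE))

# Demonstration with locally computed content, since the bytes behind
# the digest on the slide are not available.
body = b"<html>example</html>"
advertised = "md5:" + hashlib.md5(body).hexdigest()
print(verify_fixity(body, advertised))         # matches
print(verify_fixity(body + b"x", advertised))  # content changed
```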
http://www.openarchives.org/rs/
Thank you!