discovery event peter burnhill (aggregation as tactic)
1
‘aggregation as a tactic’
to support discovery
Peter Burnhill
EDINA national data centre
University of Edinburgh
RDTF Discovery – “a UK metadata ecology”
2
‘aggregation as a tactic’
to support discovery for ease [and continuity] of access
3
Recognise mixed parentage: a collision of language
a ‘re-mix’ of the document tradition & the computation tradition
With emergence of Digital Library & in Information Science …
“considerable simplification, … helpful to think … of two traditions, or mentalities, even cultures, co-exist in area of Information Science”
1. “Approaches based on a concern with documents, with signifying records: archives, bibliography, documentation, librarianship, records management, and the like …” [Content Provider speak]
2. “approaches based on uses of formal techniques, whether mechanical (such as punch cards and data-processing equipment) or mathematical (as in algorithmic procedures).” [Developer speak]
Michael Buckland, Presidential Address, American Society for Information Science,
JASIS’s 50th (1998) http://people.ischool.berkeley.edu/~buckland/asis62.html
“so, please excuse my French”
The term aggregation is used a lot in programming …
– “objects … assembled or configured together to create a more complex object” UML, IBM
– “aggregating resources based on … properties. … they are owl:sameAs and their other properties can be intermixed.”
http://www.w3.org/wiki/RdfSmushing
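The 'smushing' sense of aggregation quoted above can be sketched in a few lines. This is a minimal illustration only, assuming triples are held as plain (subject, predicate, object) tuples; the `ex:` identifiers and the `smush` function name are hypothetical, and objects that are themselves resources are left unrewritten for brevity.

```python
from collections import defaultdict

OWL_SAME_AS = "owl:sameAs"


def smush(triples):
    """Group resources linked by owl:sameAs and pool their other properties."""
    parent = {}

    def find(node):
        # Follow parent links to the representative of this node's group.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # shorten the path as we go
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller identifier as representative.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    # First pass: group identifiers that are declared equivalent.
    for subject, predicate, obj in triples:
        if predicate == OWL_SAME_AS:
            union(subject, obj)

    # Second pass: attach every other property to the group's representative.
    merged = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate != OWL_SAME_AS:
            merged[find(subject)].add((predicate, obj))
    return dict(merged)


# Hypothetical data: two sources describe the same thing under different names.
example = [
    ("ex:catalogue/1", "dc:title", "Example Journal"),
    ("ex:registry/a", "ex:archivedBy", "ex:ArchiveA"),
    ("ex:catalogue/1", OWL_SAME_AS, "ex:registry/a"),
]
print(smush(example))
```

After smushing, the title from one source and the archiving statement from the other sit on a single resource, which is the 'intermixing' the quotation describes.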
Here aggregation is taken to mean …
• an assembly of data sources
– more than a collection of objects (related or otherwise)
• for machine-as-user, as much as for humans
– via an API sans RDF / Linked Data or with RDF / Linked (Open) Data stack
=> de-coupling the metadata layer from the presentation layer?
4
My Brief & Perspective …
• high level overview
– making strategic points about aggregation as tactic
“The Supply Side, Opportunities to Expand Access and Visibility”
– “Art of the Possible: Aggregation Services”
maybe we all slip into being ‘middle folk’: those who deal in middleware
sometimes having the role of creator and supplier of some service; sometimes being the user of what others supply.
I am asked to speak as from a University-based organisation (EDINA) that develops and delivers JISC-sponsored online services
– Service providers add value to data and content
eg Digimap Collections (Single source: OS mapping; SeaZone; British Geological Survey)
NewsfilmOnline (various; digitised with JISC £)
UK Access Federation (institutions; authentication)
+ Carmichael Watson; Tobar an Dualchais/Kist o Riches; Statistical Accounts
– but also as someone interested in assisting this UK policy framework
5
My Brief & Perspective … as aggregator
developing and delivering services for research, learning & teaching
• Aggregators add value to data and content from multiple sources
JISCMediahub - links to collections & hosted content (1m resources)
eg CultureGrid; First World War Poetry; hundreds more collections
GoGeo! - metadata registry for spatially-referenced data
SUNCAT - serials union catalogue: metadata; online & on-shelf
PEPRS - e-journal preservation registry; archiving agencies
OpenURL Router - commercial resolvers & institutions; activity
OA Repository Junction - enabling deposit into OA repositories
opendepot.org - using OpenDOAR/ROAR & RoMEO
aggregating – adding value – assisting discoverability – looking to gain leverage from other aggregations
+ repositories for user generated content
eg ShareGeo; DataShare; opendepot.org; JISCMediahub
6
Some RDTF-related projects @ EDINA
1. GOgeo Linked Data (GOLD)
2. SUNCAT: Exploring Open [bibliographic] Metadata
3. Sharing OpenURL Activity Data
* monthly usage data: recommender service – data now available!
4. Aggregations of Metadata for Images and Time Based Media, a Scoping study (completed, report available)
5. CHALICE
* publish UK historic placename gazetteer as Linked Data
6. Linked Data Focus
* 3 case studies on other EDINA services illustrating how other collections can benefit from the same techniques.
7. VSM Portal – discovery/access to visual & sound materials
8. etc etc <re-thinking lessons from past projects and activities>
7
back to the brief: ‘Aggregation as a tactic’ …
(a phrase coined to break a log-jam at a meeting)
• to contrast the enthusiasm for the centralisation implied in aggregation …
… with what is clearly the main Internet game:
exploiting that things are now ‘remote, digital and published’
• ‘aggregation’ is not a goal, not an end in itself
– aggregation is an intervention to be used for some strategic purpose
… and that strategic purpose is twofold:
* ‘improvement’
* ‘discoverability’
8
and the purpose of ‘discoverability’ …
… is to help ensure researchers, students and their teachers have ease of access to online scholarly resources
P.Burnhill, Edinburgh 2009; 2001
[Diagram labels: “ease”; “continuity”; access to content & services; licence to use; usability; authorisation; open; restricted; Creative Commons licensing; reliability; well-seamed; interoperability; functionality; anytime/place convenience; Machine as User; back content preservation]
making it easier for humans: hiding complexity in underlying data
making it easier for machines: exposing complexity in underlying data
… to help ensure researchers, students and their teachers have ease [and continuity] of access to online scholarly resources
P.Burnhill, Edinburgh 2009; 2001
[Diagram labels: “ease”; “continuity”; access to content & services; licence to use; usability; open; authorisation; Creative Commons licensing; stewardship; orphan content]
Our shared & central task …
make it easier for machines: expose complexity and relationships in underlying data
‘Aggregation as a tactic’
“key concept in RDTF Vision is aggregation, directly or represented through metadata”
• Propose we regard aggregation as an intervention to exploit the telematic opportunity now that things are ‘remote, digital & published’
– In my mind the phrase derives from an IASSIST conference in 1990
* the international data librarians’ and archivists’ meeting, with the title ‘Numbers, pictures, words and sounds: priorities for the 1990s’. We were exploring what it meant, with the Internet, if we regarded all as ‘remote and published’.
• the Web in the mid-1990s simplified and so improved so much
– But unfortunately, even now, much which is online and on the Web is badly or inadequately published …
– We have to improve, re-interpreting what it means to be ‘well-published’
• The RDTF Vision and the ‘Discovery metadata ecology’ are intended to enable that improvement and enhance ‘discoverability’
• recognising that the public/audience/readership for any given work, service or aggregation is now machine as well as human.
11
Time served in the Refactory …
• In earlier ‘web time’ we had the MODELS ‘user-verbs’:
Discover -> Locate -> Request -> Access (Deliver)
Dempsey, Russell & Murray (1999) http://www.ukoln.ac.uk/dlis/models/publications/utopia/
where Access was the end game for us ‘middle folk’
even if the beginning and part of a deeper process for researchers, students …
• Now there is call for more than bilateral negotiated interoperability, where Access is the beginning for developers and for other services
• RDF/Linked Data enables information to be shared in a more Web-friendly way, exposing database structure and content
• RDF/Linked Data enables structure and content of those data sources to be explicit
exposing the complexity and relationship in the underlying data
Hanging the insides on the outside …
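As an illustration of 'hanging the insides on the outside', the sketch below renders one row of a relational table as N-Triples-style statements, so that column names become explicit predicates. It is a minimal sketch under stated assumptions: the `http://example.org/` base, the `serials` table and the `row_to_ntriples` helper are hypothetical, key values are assumed to be URL-safe, and all values are emitted as plain string literals.

```python
def row_to_ntriples(table, key_column, row, base="http://example.org/"):
    """Render one relational row as N-Triples lines, one per non-key column."""
    # The row's key becomes part of the subject identifier (assumed URL-safe).
    subject = f"<{base}{table}/{row[key_column]}>"
    lines = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # the key is already in the subject; skip empty cells
        # Each column name is exposed as a predicate in the table's namespace.
        predicate = f"<{base}{table}#{column}>"
        # Escape the characters that would break a quoted literal.
        literal = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


# Hypothetical row from a hypothetical 'serials' table.
for line in row_to_ntriples(
    "serials", "id", {"id": 7, "title": "Example Serial", "holder": "Example Library"}
):
    print(line)
```

The point of the sketch is only that structure which normally stays hidden behind a user interface (table, key, column) becomes visible and addressable in the published data.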
12
The art treasures are on show inside, but …
13
Where does Virtue lie?
14
Having focused on making it easy for humans …
• hiding the complexity and relationship in the underlying data
– paying attention to the user interface: HCI & GUI; Usability and Accessibility
… to ensure ‘discoverability’, make it much easier for machines
• exposing the complexity and relationship in the underlying data
– having in mind the machine-as-user: API as well as HCI
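The contrast between hiding complexity for humans and exposing it for machines can be sketched with one metadata layer and two views over it. This is a hedged illustration, not a description of any EDINA service: the source names, records and function names are all hypothetical, and the records are assumed to be already mapped to shared field names.

```python
import json

# Hypothetical records from two sources, already mapped to shared field names.
SOURCES = {
    "source_a": [{"id": "a1", "title": "Harbour survey", "type": "map"}],
    "source_b": [
        {"id": "b7", "title": "Harbour newsreel", "type": "film", "related": ["a1"]}
    ],
}


def aggregate(sources):
    """Metadata layer: assemble records and note which source each came from."""
    records = []
    for name, source in sources.items():
        for record in source:
            records.append({**record, "source": name})
    return records


def human_view(records):
    """Presentation for people: a short listing, underlying complexity hidden."""
    return "\n".join(f"- {r['title']} ({r['type']})" for r in records)


def machine_view(records):
    """API for machine-as-user: every field and relationship exposed as JSON."""
    return json.dumps(records, sort_keys=True)


records = aggregate(SOURCES)
print(human_view(records))
print(machine_view(records))
```

Because both views read from the same `aggregate` output, the metadata layer is de-coupled from the presentation layer: a new interface, human or machine, can be added without touching the aggregation itself.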
• Regard aggregation as intervention, with strategic purpose
1. to engage in value added improvement
– to the data and their relations
2. to enhance the discoverability of that which is ‘aggregated’
– to be a focus of attention
• If it is with RDF, then that’s good; don’t fuss if it’s not
– If you have built an RDBMS, then take delight in RDFa schema
Are we building something … better than surfing the web
… how do we use the ecology metaphor?
… what is the ecology metaphor [what’s the meta-for]?
Thank you
Questions please …
update on services, projects, publications & presentations:
http://edina.ac.uk