Formalizing the Design of Digital Libraries Based on UML
Delos NoE, Preservation Cluster:
Workshop: Persistency in Digital Libraries
13. February 2006, Oxford Internet Institute
0
Talking about …
• Theoretical stage: Transforming conceptual models into a UML representation (class diagram)
• „Pragmatic“ model by Endres and Fellner
• Formally defined model „5S Framework for Digital Libraries“ by Fox, Goncalves et al.
1
The Endres/Fellner Model (EF-Model)
Goals
• Modelling an architecture of a digital library on a very high level (Conceptual model)
• Modelling just those elements of a DL which are absolutely fundamental and do not change
2
Starting point: Use cases
The EF-Model is based on an essential model, which first of all considers the fundamental scenarios of the system (business processes, use cases).
3
• How can the digital library system fulfill the requirements of the essential model?
• To answer this, we need to know: which elements and concepts does the digital library have to deal with in order to handle the use cases?
4
• The fundamental unit of a digital library is data.
• All system data has to be saved.
DigitalLibraryData
saveData()
5
• According to the essential model, there are eight kinds of data within a digital library.
• Each of these kinds of data is a specialisation of the global concept of data.
• These kinds of data can therefore be modelled as super-class/sub-class relationships, i.e. as generalisations.
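The generalisation described above can be sketched in code. The following Python outline is only an illustration under assumptions: the class names follow the ones used on these slides, while the in-memory `_store` list, the `KINDS` list and the snake_case method name `save_data` (standing in for `saveData()`) are made up for the example.

```python
class DigitalLibraryData:
    """Super-class: the global concept of data in the EF-Model."""

    # Hypothetical in-memory store standing in for real persistence.
    _store = []

    def save_data(self):
        """Save this piece of data (corresponds to saveData())."""
        DigitalLibraryData._store.append(self)
        return self


# The eight kinds of data, each a specialisation of DigitalLibraryData.
class User(DigitalLibraryData): pass
class EFSupplier(DigitalLibraryData): pass
class EFDocument(DigitalLibraryData): pass
class EFRetrieval(DigitalLibraryData): pass
class EFService(DigitalLibraryData): pass
class EFOrder(DigitalLibraryData): pass
class EFDelivery(DigitalLibraryData): pass
class EFAccounting(DigitalLibraryData): pass


KINDS = [User, EFSupplier, EFDocument, EFRetrieval,
         EFService, EFOrder, EFDelivery, EFAccounting]
```

Because every kind of data inherits `save_data()`, the requirement that all system data has to be saved is stated once, in the super-class.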
7
1. Users
• Data about people who are users of the digital library are one fundamental kind of data within a digital library system.
This data represents the user. Therefore, the class to be modelled is termed „User“.
• Basic attributes are the address and profile of the user. In addition, users can be identified through an identification number. Operations allow this data to be created or modified.
• Users are specified through sub-classes.
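As a rough sketch of how such a class might look in code, the following Python fragment models the three attributes and the create/modify operations. The attribute types, the counter used for identification numbers, and the sub-classes `Student` and `Researcher` are assumptions for illustration; the slides do not prescribe them.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical source of identification numbers.
_next_id = count(1)


@dataclass
class User:
    address: str
    profile: str
    user_id: int

    @classmethod
    def create(cls, address, profile):
        """Create a user and assign a fresh identification number."""
        return cls(address=address, profile=profile, user_id=next(_next_id))

    def modify(self, address=None, profile=None):
        """Change only the attributes that are passed in."""
        if address is not None:
            self.address = address
        if profile is not None:
            self.profile = profile


# Users are specified through sub-classes; these two are invented examples.
class Student(User): pass
class Researcher(User): pass
```

A call such as `Student.create("Oxford", "history")` yields a `Student` that is also a `User`, which mirrors the generalisation in the class diagram.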
9
2. Supplier
• Suppliers are the second group of entities that interact with the system. They can be individuals as well as corporations. Supplier data is encapsulated within the class „Supplier“.
• According to E/F, basic attributes are address and (sales) conditions. They are considered to be common to all suppliers.
10
Class „EFSupplier“
„EFSupplier“ can be specialised through subclasses. Which particular specialisations are chosen is up to the designer and
depends on the requirements of the DL.
11
3. Documents
• Documents are the core products of a digital library.
• All data about digital documents that can be delivered (i.e. requested by a user) is subsumed within a class „EFDocument“.
• „EFDocument“ serves as a super-class for a number of sub-classes. Again, which sub-classes are derived depends on the needs of the individual digital library.
13
4. Finding Aids
• Finding aids cover all of the descriptive metadata of a digital library. E/F focus especially on metadata that can be retrieved via, e.g., OPACs or search engines.
We therefore call this class „EFRetrieval“. The tools for retrieval are modelled as sub-classes as well.
• According to E/F, basic attributes are designation, type and (network) address; basic operations are inserting new finding aids and modifying existing ones.
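The two basic operations on finding aids can be illustrated with a small registry. This is a minimal sketch, assuming that the designation identifies a finding aid uniquely; the `FindingAids` container and the example address are hypothetical and not part of the EF-Model.

```python
from dataclasses import dataclass


@dataclass
class EFRetrieval:
    designation: str
    type: str
    address: str  # (network) address, e.g. a URL


class FindingAids:
    """Hypothetical registry holding the finding aids of one library."""

    def __init__(self):
        self._aids = {}

    def insert(self, aid):
        """Insert a new finding aid; designations must be unique."""
        if aid.designation in self._aids:
            raise ValueError(f"finding aid already exists: {aid.designation}")
        self._aids[aid.designation] = aid

    def modify(self, designation, **changes):
        """Modify attributes of an existing finding aid."""
        aid = self._aids[designation]
        for name, value in changes.items():
            if not hasattr(aid, name):
                raise AttributeError(name)
            setattr(aid, name, value)
        return aid
```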
15
5. Services
• Services comprise everything the digital library offers apart from the delivery of documents.
• E/F do not give more detailed statements on services.
EFService
16
6. Orders
• The E/F model also comprises business data of the kind found in almost every commercial company.
• Within the EF-Model, one important task of a digital library is its ability to cope with orders of users for documents or services.
• The class „EFOrder“ represents this task.
18
7. Deliveries
• Suppliers provide users with the services or documents they have ordered.
• The data concerning deliveries is therefore encapsulated within the class „EFDelivery“.
20
8. Accountings
• All deliveries are accounted for. The related data is encapsulated in the „EFAccounting“ class. The individual units of the accounting (items) are modelled as a class that is associated with „EFAccounting“.
• Order, Delivery and Accounting are business related data.
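The chain from order to delivery to accounting can be sketched as follows. The class names `EFOrder`, `EFDelivery` and `EFAccounting` come from the slides; the item class name `AccountingItem` and all attributes (`user`, `requested`, `supplier`, `amount`) are assumptions chosen for the example, since the slides do not list them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EFOrder:
    user: str        # who ordered
    requested: str   # the document or service asked for


@dataclass
class EFDelivery:
    order: EFOrder   # the order this delivery fulfils
    supplier: str


@dataclass
class AccountingItem:
    delivery: EFDelivery
    amount: float    # hypothetical charge for this delivery


@dataclass
class EFAccounting:
    # Association: one accounting holds any number of items.
    items: List[AccountingItem] = field(default_factory=list)

    def account(self, delivery, amount):
        """Record one delivery as an accounting item."""
        self.items.append(AccountingItem(delivery, amount))

    def total(self):
        return sum(item.amount for item in self.items)
```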
22
EF-Model: Summary
• The EF-Model is a high-level architecture. It provides a conceptual model of a digital library system.
• The EF-Model is also a taxonomy of data.
• It focuses on some aspects of digital libraries; not all aspects are considered equally. To a certain extent, the system is understood as an economic one.
• The model belongs to the analysis stage of system design.
25
5S Model of a Digital Library
1. What is „5S“?
• „5S“ stands for: Streams, Structures, Spaces, Scenarios and Societies
• These five dimensions are considered to be crucial for every digital library
• As the main components they constitute a framework for a digital library. All of the elements in the 5S framework are formally described.
26
• Streams are defined as sequences of elements of an arbitrary type, e.g. bitstreams or streams of characters.
• Structures reflect the organisation of information. This can be on quite different levels, e.g. the structure of streams, the structure of a hypertext, relationships among actors, or system connections.
27
• Spaces present the content of digital libraries in a usable and retrievable way. This could be the interface to a bibliographic database or a browser for accessing objects.
• Scenarios detail the behaviour of digital library services and explain the functionality of structures and spaces. An example is the act of searching for objects.
• Societies focus on the actors involved in the functionality of a digital library, e.g. users, suppliers, service staff.
28
Formal Definition of a DL
A digital library is a 4-tuple DL = (R, Cat, Serv, Soc), where
• R is a repository,
• Cat = {DMC1, DMC2, ..., DMCK} is a set of metadata catalogues for all collections {C1, C2, ..., CK} in R,
• Serv is a set of services,
• Soc is a society.
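To make the 4-tuple more tangible, here is a minimal Python sketch. The representation is an assumption made for illustration only: collections and catalogues are keyed by a collection name, and the helper `has_catalogue_for_every_collection` is a hypothetical check of the condition that Cat contains a catalogue for each collection in R.

```python
from typing import NamedTuple


class DigitalLibrary(NamedTuple):
    repository: dict     # R: collection name -> set of handles
    catalogues: dict     # Cat: collection name -> metadata catalogue
    services: frozenset  # Serv
    society: tuple       # Soc = (communities, relationships)


def has_catalogue_for_every_collection(dl):
    """True if Cat holds exactly one catalogue per collection in R."""
    return set(dl.catalogues) == set(dl.repository)
```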
29
Formal Definition of Repository
A repository is formally defined as a family of collections R = {Ci} (i = 1 to f), together with the operations get(), store() and del(). These are the fundamental functions a repository uses to manipulate collections.
A collection is a set of digital objects: C = {do1, do2, ..., dok}.
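The repository and its three operations might be sketched like this. The parameter lists are assumptions, because the slide only names the operations; `delete` is used instead of `del`, which is a reserved word in Python.

```python
class Repository:
    """R: a family of collections, each a set of digital objects."""

    def __init__(self):
        # collection name -> {handle: digital object}
        self.collections = {}

    def store(self, collection, handle, obj):
        """Store a digital object in the named collection."""
        self.collections.setdefault(collection, {})[handle] = obj

    def get(self, handle):
        """Return the object with this handle, or None if it is unknown."""
        for objects in self.collections.values():
            if handle in objects:
                return objects[handle]
        return None

    def delete(self, collection, handle):
        """Remove an object from a collection (the slide's del())."""
        self.collections[collection].pop(handle, None)
```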
31
Formal Definition of a Digital Object
A digital object is defined as a tuple
do = (h, SM, ST, StructuredStreams), where
• h is an element of H,
• SM = {sm1, sm2, ..., smn} is a set of streams,
• ST = {st1, st2, ..., stm} is a set of structural metadata specifications,
• StructuredStreams = {stsm1, stsm2, ..., stsmp} is a set of functions defined from the streams in SM and the structural metadata specifications in ST.
32
Formal Definition of a Digital Object
To resolve the concepts of this definition:
• H is a set of universally unique handles (labels).
• A stream is a sequence of elements of an arbitrary type.
• A structural metadata specification is a structure.
• A structure is a tuple (G, L, F), where G is a directed graph, L is a set of label values and F is a labelling function.
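A small sketch may help to see how these pieces fit together. It rests on assumptions: the graph G is stored as node and edge sets, F as a dictionary, and each structured-stream function as a mapping from a structure node to a (stream index, start, end) segment. The example handle, text and node names are invented.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    nodes: set       # vertices of the directed graph G
    edges: set       # directed edges (source, target) of G
    labels: set      # L: the label values
    labelling: dict  # F: node or edge -> label in L

    def is_valid(self):
        """Edges must connect known nodes; F must map into L."""
        edges_ok = all(s in self.nodes and t in self.nodes
                       for s, t in self.edges)
        labels_ok = all(v in self.labels for v in self.labelling.values())
        return edges_ok and labels_ok


@dataclass
class DigitalObject:
    handle: str               # h, an element of H
    streams: list             # SM
    structures: list          # ST
    structured_streams: list  # dicts: node -> (stream index, start, end)


def segment(obj, mapping_index, node):
    """Return the part of a stream that a structure node refers to."""
    stream_index, start, end = obj.structured_streams[mapping_index][node]
    return obj.streams[stream_index][start:end]


text = "Title. Body text."
outline = Structure(
    nodes={"doc", "title", "body"},
    edges={("doc", "title"), ("doc", "body")},
    labels={"document", "heading", "paragraph"},
    labelling={"doc": "document", "title": "heading", "body": "paragraph"},
)
do = DigitalObject("hdl:example/1", [text], [outline],
                   [{"title": (0, 0, 6), "body": (0, 7, 17)}])
```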
34
Formal Definition of Catalogue
Cat = {DMC1, DMC2, ..., DMCK}, where each metadata catalogue DMC belongs to a collection C with k handles in H. DMC is a set of pairs {(h, {dm1, ..., dmkh})}, where
• h is an element of H,
• each dmi is a descriptive metadata specification.
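A metadata catalogue of this kind can be sketched as a mapping from handles to lists of descriptive metadata specifications. The function name `make_catalogue` and the use of plain dictionaries for the specifications are assumptions for this example.

```python
def make_catalogue(collection_handles, descriptions):
    """Build the catalogue of one collection.

    Returns a dict: handle -> list of descriptive metadata specifications.
    Handles without a description get an empty list; descriptions for
    handles outside the collection are rejected.
    """
    unknown = set(descriptions) - set(collection_handles)
    if unknown:
        raise KeyError(f"handles not in collection: {sorted(unknown)}")
    return {h: list(descriptions.get(h, [])) for h in collection_handles}


catalogue = make_catalogue(
    ["h1", "h2"],
    {"h1": [{"title": "Digitale Bibliotheken", "year": 2000}]},
)
```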
36
Formal Definition of Service
The formal definition of a 5S DL defines Serv = {Serv1, ..., ServK} as a set of services, containing at least services for browsing, indexing and searching.
Furthermore, a service is defined as a set of scenarios.
A scenario, in turn, is, according to the formal definition, a sequence of related transition events.
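The nesting of services, scenarios and events can be written down directly. In this sketch, which is an assumed representation rather than part of the 5S definition, an event is a string, a scenario is a tuple of events, and a service is a set of scenarios; the event names are invented.

```python
# The three services every 5S digital library has to provide.
REQUIRED_SERVICES = {"browsing", "indexing", "searching"}

# Serv: service name -> set of scenarios; scenario = sequence of events.
services = {
    "searching": {("submit query", "match index", "return results")},
    "browsing": {("open collection", "follow link", "display object")},
    "indexing": {("read object", "extract terms", "update index")},
}


def missing_services(serv):
    """Return the required services that serv does not provide."""
    return REQUIRED_SERVICES - set(serv)
```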
38
Formal Definition of Society
A society is a tuple Soc = (CM, RS), where
• CM = {cm1, cm2, ..., cmn} is a set of conceptual communities, each community referring to a set of individuals of the same class or type,
• RS = {rs1, rs2, ..., rsm} is a set of relationships, each relationship being a tuple rsj = (ej, ij), where ej is a Cartesian product specifying the communities involved in the relationship and ij is an activity that describes the interactions or communications among individuals.
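The Cartesian product in a relationship can be illustrated with `itertools.product`. The community names, the individuals and the activity text are invented for this sketch, and storing a relationship as a pair of (community names, activity) is an assumption.

```python
from itertools import product

# CM: community name -> individuals of the same type.
communities = {
    "users": {"alice", "bob"},
    "suppliers": {"publisher"},
}

# RS: each relationship names the communities involved and an activity.
relationships = [
    (("users", "suppliers"), "orders a document from"),
]


def participants(cm, relationship):
    """Enumerate the Cartesian product of the communities involved."""
    involved, _activity = relationship
    return set(product(*(sorted(cm[name]) for name in involved)))
```

For the single relationship above, `participants` yields every (user, supplier) pair that can take part in the activity.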
40
What about the Spaces?
A space is by definition “a measurable space, measure space, probability space, vector space, topological space or a metric space.” Digital libraries can use the space concepts for many representations, e.g. the visualisation of documents, indexing, or communication between user and system.
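As one possible illustration of a space used for indexing, the sketch below places documents in a vector space of term counts and compares them by cosine similarity. This is the familiar vector space model from information retrieval, chosen here as an assumption; the slides do not prescribe a particular space or similarity measure.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a document as a vector of term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (0.0 if one is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```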
42
References
• Endres, A.; Fellner, D.W.: Digitale Bibliotheken. Heidelberg: d-punkt, 2000.
• Goncalves, M.A.; Fox, E.A.; Watson, L.T.; Kipp, N.: Streams, Structures, Spaces, Scenarios, Societies (5S): A formal model for digital libraries. Technical report 03-04, Virginia Tech, 2004. Link: http://portal.acm.org/citation.cfm?id=984321.984325