[ieee third latin american web congress (la-web'2005) - buenos aires, argentina (31-02 oct....



Strategies for filling out LOM metadata fields in a Web-based CSCL tool

Luciano T. E. Pansanato (1,2) and Renata P. M. Fortes (2)

(1) Centro Federal de Educação Tecnológica do Paraná – CEFET-PR, Cornélio Procópio, PR, Brazil

(2) Instituto de Ciências Matemáticas e de Computação – ICMC-USP, Universidade de São Paulo, São Carlos, SP, Brazil


Abstract

In this paper, we describe our experience of implementing an application profile based on Learning Object Metadata (LOM) and extended with elements that adjust it to our application domain. The educational resources considered here are collaborative Web pages based on Wiki. Furthermore, we discuss strategies for automatic and semi-automatic generation of metadata.

1. Introduction

Many metadata schema standards have been proposed to facilitate interoperability and resource discovery (e.g., [1, 2]); most of the high-level metadata defined by these schemas will have to be manually produced by metadata record creators. Moreover, metadata elements are optional in some schemas; consequently, metadata for a given record may be insufficient, and repositories are not likely to populate all the fields [3]. We find similar problems in using CoTeia [4], a CSCL (Computer Supported Collaborative Learning) tool that is mostly used to complement face-to-face lectures with collaborative learning activities. CoTeia is an asynchronous collaborative tool for creating Web pages, conceptually based on wikis and analogous to CoWeb (http://coweb.cc.gatech.edu/csl/9). CoTeia has a small set of metadata – title, author, and keywords – and only the title has an automatic generation strategy. The title is filled out during page creation using the text inserted between the start and end tags of the link to the page being created. Recently we carried out a quantitative analysis of two CoTeia repositories (http://coteia.icmc.usp.br/coteia/ and http://coweb.icmc.usp.br/coweb/), used from 2001 to 2005 at ICMC-USP (Institute of Mathematical Sciences and Computing – University of São Paulo), with the aim of investigating the completion of metadata fields. We observed that 88.41% of the metadata fields from 3,131 pages related to the educational domain were not filled out. However, the title was filled out in 99.74% of the pages due to the simple strategy adopted for this metadata field. This result motivates the development and use of new strategies for automatic and semi-automatic metadata generation, which save time and effort for users who do not want to create the metadata manually. Moreover, by means of a qualitative analysis of the pages we were able to identify the possibility of extending the current metadata support of CoTeia. This extension is used to assist the metadata generation process, and it will be used in the future to offer more familiar options to the user in discovering educational resources.

In this paper, we describe our experience with the implementation of an application profile based on Learning Object Metadata (LOM) [1]. LOM was chosen because of its focus on the educational domain, its adoption by international e-learning standards (e.g., IMS, ADL, ARIADNE), and its mapping to the Dublin Core standard [2]. The educational resources considered are collaborative pages, or wiki pages, because they are used to support educational activities. The pages in CoTeia are put to a variety of uses, for example, collaborative content creation, review activities, work library creation, and information distribution.

The challenge addressed here is to provide strategies for automatic and semi-automatic generation of metadata fields. This is an important issue for research on learning objects and their use in education and training [5].

2. Strategies for automatic metadata generation

The new metadata support of CoTeia keeps one associated metadata record for each page. This metadata record is a subset of elements from the LOM standard, i.e., the metadata schema used in CoTeia is an application profile [5] based on only one of the existing schemas. We considered four main categories of strategies for metadata generation: resource, context, template, and rules based metadata generation.

The first strategy (resource) is based on content analysis of the resource itself. Our metadata generator uses typical techniques such as language classification, automatic summarization, and keyword extraction. For example, the value of the 1.3:General.Language element is generated by language classification based on a thesaurus. Currently, the metadata generator can identify pages in Portuguese and English.
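As an illustration of resource based generation, the following minimal sketch guesses a value for 1.3:General.Language by counting marker words. It is not the CoTeia classifier: the word lists, the function name classify_language, and the min_hits threshold are assumptions made for this example, standing in for the thesaurus mentioned above.

```python
import re

# Hypothetical marker-word lists (assumption); the real generator relies on a thesaurus.
MARKERS = {
    "pt": {"de", "que", "não", "para", "uma", "os", "com", "em", "é", "do", "da"},
    "en": {"the", "of", "and", "to", "is", "in", "that", "for", "with", "this"},
}


def classify_language(text, min_hits=3):
    """Return 'pt', 'en', or None when the evidence is weak or tied."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in MARKERS.items()
    }
    best = max(scores, key=scores.get)
    worst = min(scores, key=scores.get)
    if scores[best] < min_hits or scores[best] == scores[worst]:
        return None  # leave the field empty rather than guess
    return best
```

Returning None instead of a default keeps a weak guess from being written to the record, so another strategy (or the author) can supply the value.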

The second strategy (context) is based on the context in which the resource is used. Our metadata generator obtains metadata information from three context types: file system, structure, and user management. For example: the value of the 4.2:Technical.Size element, which is the page size in bytes, is automatically obtained from the file system; relationships of the 7:Relation category (e.g., “is part of”, “has part”) are identified from the hierarchical structure of the pages; and the value of 2.3.2:Life Cycle.Contribute.Entity is obtained from context information maintained in the session register.
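The three context types can be sketched as follows. This is only an illustration under stated assumptions: it supposes that pages are stored as files, that the page hierarchy mirrors the directory hierarchy, and that the session is a dictionary with a "user" key. None of these details, nor the function name context_metadata, comes from CoTeia.

```python
import os


def context_metadata(page_path, root_dir, session):
    """Collect context-derived values for one page (hypothetical helper)."""
    # File system context: page size in bytes.
    record = {"4.2:Technical.Size": os.path.getsize(page_path)}

    # Structure context: a page below the root is part of its parent.
    parent = os.path.dirname(os.path.abspath(page_path))
    root = os.path.abspath(root_dir)
    if parent != root:
        record["7:Relation"] = [("is part of", os.path.relpath(parent, root))]

    # User management context: the contributor comes from the session.
    if session.get("user"):
        record["2.3.2:Life Cycle.Contribute.Entity"] = session["user"]
    return record
```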

The third strategy (template) is based on templates (or profiles) that are associated with the resource. Template based metadata generation is considered a semi-automatic strategy because it requires user intervention to some degree. Our metadata generator obtains metadata information from three template types: resource group, system, and user. For example, in CoTeia the pages are grouped by entries. There is a template associated with each entry, and the pages of the same entry (a resource group) may receive the metadata values stored in the template. The system and user templates are useful for fixed values, for example, in the case of the 3.3:Meta-Metadata.Metadata Schema element and the 6:Rights category.
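One possible way to layer the three template types is sketched below. The precedence order (system, then resource group, then user, with values already present in the record winning) is an assumption made for this example, as are the function name apply_templates and the sample values; the paper does not state how CoTeia orders the templates.

```python
def apply_templates(record, system, group, user):
    """Fill empty fields of a record from templates (assumed precedence)."""

    def filled(mapping):
        # Ignore empty placeholders so they never mask a real value.
        return {k: v for k, v in mapping.items() if v not in (None, "")}

    merged = {}
    for template in (system, group, user):  # later templates override earlier
        merged.update(filled(template))
    merged.update(filled(record))  # existing record values are kept
    return merged


# Illustrative values only.
system_template = {"3.3:Meta-Metadata.Metadata Schema": "LOMv1.0"}
group_template = {"6.1:Rights.Cost": "no", "1.5:General.Keyword": "hypermedia"}
user_template = {"1.5:General.Keyword": "wiki"}
page_record = {"1.2:General.Title": "Week 1", "6.1:Rights.Cost": ""}

result = apply_templates(page_record, system_template, group_template, user_template)
```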

The fourth strategy (rules) is based on rules that enable us to exploit relationships among the metadata of the resource. We have used the rules based strategy to overcome some LOM problems. For example, the 5.2:Educational.Learning Resource Type element has some vocabulary problems [6]. We use a specific vocabulary instead of the one presented in LOM and automatically map it to the 5.2:Educational.Learning Resource Type element. This allows an implementation to preserve a minimum of semantic interoperability with other LOM implementations through the terms used, while also applying a vocabulary suited to our particular purposes.
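A mapping rule of this kind can be sketched as a lookup table. The local terms on the left and the field name "local:Resource Type" are hypothetical and chosen only for illustration; the terms on the right are taken from the LOM vocabulary for 5.2.

```python
# Hypothetical local terms (left) mapped to LOM 5.2 vocabulary terms (right).
LOCAL_TO_LOM = {
    "exercise list": "exercise",
    "slides": "slide",
    "lecture notes": "narrative text",
    "test": "exam",
}

LOM_FIELD = "5.2:Educational.Learning Resource Type"
LOCAL_FIELD = "local:Resource Type"  # hypothetical field name


def apply_mapping_rule(record):
    """Derive the LOM term from the local term without overwriting it."""
    local_term = record.get(LOCAL_FIELD, "").strip().lower()
    lom_term = LOCAL_TO_LOM.get(local_term)
    updated = dict(record)
    if lom_term and not record.get(LOM_FIELD):
        updated[LOM_FIELD] = lom_term
    return updated
```

Keeping both fields in the record preserves the richer local term for local search while the mapped term stays readable by other LOM implementations.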

Another rule type we have exploited is heuristic rules. This rule type can be applied when heuristics are known about the metadata elements. For example, the value of the 3.4:Meta-Metadata.Language element is a fixed value, usually “pt” (Portuguese, in CoTeia), and can be obtained from the system template. This element is related to the language of the metadata record. A heuristic to generate this element is to classify the language used in 1.4:General.Description (or other free text elements).
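The heuristic for 3.4:Meta-Metadata.Language might look like the sketch below. The classifier is passed in as a function, since any language classifier would do; the function name metadata_language, the choice of 1.2:General.Title as the second free text element, and the fallback order are assumptions for this example.

```python
def metadata_language(record, classify, system_default="pt"):
    """Guess the language of the metadata record from its free text fields.

    `classify` is any callable that maps text to a language code or None.
    """
    for field in ("1.4:General.Description", "1.2:General.Title"):
        text = record.get(field, "")
        if text:
            guess = classify(text)
            if guess:
                return guess
    # No usable free text: fall back to the fixed system template value.
    return system_default
```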

Some metadata elements of LOM were discarded from the application profile of CoTeia because they are not applicable to pages, or because they are not recommended in specific best practice guidelines (e.g., [6]) and other work reported in the literature (e.g., [5, 7]). In short, 43 metadata elements were selected from LOM for the application profile of CoTeia. Of these elements, only five are not generated by our strategies:

1. 5.8:Educational.Difficulty;

2. 6.1:Rights.Cost;

3. 6.2:Rights.Copyright and Other Restrictions;

4. 6.3:Rights.Description;

5. 8.3:Annotation.Description.

These metadata elements are quite subjective, either because they are subject to differing points of view, or because they are specifically intended to represent a subjective evaluation. They should be filled out manually in CoTeia. However, the template based strategy can be used to generate these elements in our application domain.

3. Adding local metadata elements to the application profile

The application profile of CoTeia was extended with five metadata elements to support our specific application domain. With appropriate support in this context, metadata can better assist CoTeia users in discovering educational resources. Moreover, these metadata are easily generated using the resource based strategy before they are used in the rules based strategy.

The five new elements are: Brazilian Context, Topic, Discipline, Didactic Material Type, and Learning Activity. The conceptual definitions of these elements are based on Brazilian educational legislation (http://portal.mec.gov.br), the Brazilian Dictionary of Education [8], and the Brazilian Thesaurus of Education (http://www.inep.gov.br/pesquisa/thesaurus/). Furthermore, they are introduced by using the 9:Classification category (an extension mechanism) to preserve full compatibility with the larger LOM context [5].

The Brazilian Context element is a result of our analysis of legislation and common practice in describing the complex educational structure of Brazil. It is defined as an aggregate element with three sub-elements: Level, Formation, and Modality. The value of the Level element is one of two education levels: basic or superior. The Formation element comprises the different grades (or levels) for each level (basic or superior) in the Level element. The Modality of education is related to the horizontal or transversal structure of education. The horizontal structure makes it possible for education of the same level to be supplied in different modes. The transversal structure makes it possible for different modes to permeate all the vertical structures (levels in the Formation element). For example, special education, the modality of education for people with disabilities, traverses all the levels in the Formation element.
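The aggregate element described above can be modeled as a small validated structure. In this sketch the Level values come from the text, while the Formation terms per level and the class name BrazilianContext are assumptions chosen for illustration, since the paper does not list the Formation vocabulary.

```python
from dataclasses import dataclass
from typing import Optional

# Formation terms per Level: illustrative assumption, not the CoTeia vocabulary.
FORMATIONS = {
    "basic": {"early childhood", "elementary", "secondary"},
    "superior": {"undergraduate", "graduate"},
}


@dataclass(frozen=True)
class BrazilianContext:
    level: str                      # "basic" or "superior"
    formation: str                  # must belong to the chosen level
    modality: Optional[str] = None  # transversal, e.g. "special education"

    def __post_init__(self):
        if self.level not in FORMATIONS:
            raise ValueError(f"unknown level: {self.level!r}")
        if self.formation not in FORMATIONS[self.level]:
            raise ValueError(
                f"formation {self.formation!r} does not belong to level {self.level!r}"
            )
```

Because Modality is transversal, it is not validated against the level: the same modality may accompany any Formation value.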

The Topic element is used to indicate computing science topics related to the page. The vocabulary for this element is a branch of the knowledge areas as defined by IBICT (Instituto Brasileiro de Informação em Ciência e Tecnologia – Brazilian Institute of Information in Science and Technology) (http://dici.ibict.br/view/subjects/). For example, [“Knowledge Areas”, “Exact Sciences”, “Computing Science”, “Software Engineering”] defines the path in that branch to the term “Software Engineering”. Other branches of IBICT can be considered for other application domains.

The vocabulary for the Discipline element is composed of code and name lists obtained from administrative systems of ICMC-USP. The discipline codes are used by the metadata generator (resource based strategy) to assist the automatic identification of the discipline associated with the page. This strategy is based on the practice of identifying the discipline in the content of a page. We use a simple parser to resolve it. The Discipline element is used to generate the values of the Brazilian Context and 5.6:Educational.Context elements.
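A simple parser of this kind could be sketched as follows. The code format (three letters followed by four digits), the catalog entries, and the function names are all hypothetical; the real codes and names come from the ICMC-USP administrative systems and are not reproduced here.

```python
import re

# Hypothetical catalog: code -> (name, formation). Not real ICMC-USP data.
CATALOG = {
    "ABC0101": ("Software Engineering", "undergraduate"),
    "ABC5201": ("Hypermedia", "graduate"),
}

# Assumed code shape: three letters, optional hyphen, four digits.
CODE_PATTERN = re.compile(r"\b[A-Z]{3}-?\d{4}\b")


def find_discipline(page_text):
    """Return the first catalog code mentioned in the page, or None."""
    for match in CODE_PATTERN.finditer(page_text.upper()):
        code = match.group().replace("-", "")
        if code in CATALOG:
            return code
    return None


def derive_context(code):
    """Generate dependent elements from a known discipline code."""
    name, formation = CATALOG[code]
    return {
        "Discipline": f"{code} {name}",
        "Brazilian Context": {"Level": "superior", "Formation": formation},
        "5.6:Educational.Context": "higher education",
    }
```

Checking each match against the catalog keeps strings that merely look like codes from producing a Discipline value.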

To establish an initial list for the vocabulary of the Didactic Material Type element, a preliminary survey of the didactic material types used (inserted or proposed) in the pages was carried out. The Brazilian Thesaurus of Education was used to validate this vocabulary and to help differentiate the terms hierarchically. This thesaurus is important for term identification because it provides facets (i.e., division characteristics) that assist in distinguishing and generalizing/specializing the terms selected in the preliminary survey. This element refines the 5.2:Educational.Learning Resource Type element, whose values are generated using the rules based strategy. The vocabulary for the Learning Activity element was defined in a similar way to that of the Didactic Material Type element.

4. Implementation

We have developed and implemented other strategies that work in conjunction with metadata generation in CoTeia, a CSCL tool. The idea is to improve the metadata record as the page develops and users add new content. The metadata generation cycle is performed every time the page is modified. Users can also collaborate on the metadata just as they do on the pages.

Although CoTeia provides strategies to generate metadata, the user has the final decision. CoTeia supports this by implementing a feedback mechanism: when the author edits a metadata field, CoTeia no longer tries to generate that field until the author indicates otherwise.
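The feedback mechanism can be pictured as a set of fields that the generator must leave alone. The class and method names below are hypothetical and only sketch the behavior described above; they are not taken from the CoTeia code.

```python
class MetadataRecord:
    """Metadata values plus the feedback about who filled each field."""

    def __init__(self):
        self.values = {}
        self.author_fields = set()  # fields the generator must not touch

    def edit_by_author(self, field, value):
        self.values[field] = value
        self.author_fields.add(field)

    def release(self, field):
        """The author hands the field back to automatic generation."""
        self.author_fields.discard(field)

    def generate(self, field, value):
        """Store a generated value unless the author owns the field."""
        if field in self.author_fields:
            return False
        self.values[field] = value
        return True
```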

Figure 1 illustrates the use of multiple strategies for automatic and semi-automatic generation of metadata when creating and editing pages. The underlying process is as follows:

1. Initially the page and its associated metadata record are loaded to the interface.

2. The template-based strategy is applied to the metadata record. The user-generated metadata is maintained to populate the metadata record.

3. The metadata fields and page content can be edited. Feedback information is collected. Feedback information corresponds to which metadata fields were filled out by the author and which by the application.

4. The resource and context based strategies for creating application-generated metadata take place. The application-generated metadata is maintained to populate the metadata record.

5. Mapping rules (and other rules) are applied. Other rules include ad-hoc heuristic rules using metadata and content to infer other metadata. The inference-generated metadata is maintained to populate the metadata record.

6. Conflicts are resolved and the new metadata record is created. A conflict occurs when different values are generated for the same metadata element. There are several strategies to resolve the conflicts; depending on the element, one strategy may work better than another [9].


Figure 1. Process of creating and editing pages and metadata in CoTeia.
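Step 6 of the process can be sketched with one simple conflict resolution strategy: rank the sources and let the higher ranked source win. The ranking used here (author, template, application, inference) is an assumption for this example; as noted in step 6, other strategies exist and may suit particular elements better [9]. The function name regenerate and the tuple layout are also hypothetical.

```python
# Assumed priority, highest first.
PRIORITY = ("author", "template", "application", "inference")
RANK = {source: position for position, source in enumerate(PRIORITY)}


def regenerate(stored, candidates):
    """Merge newly generated candidates into the stored record.

    stored:     field -> (value, source)
    candidates: iterable of (field, value, source) from steps 2 to 5
    """
    record = dict(stored)
    for field, value, source in candidates:
        current = record.get(field)
        # A candidate wins if the field is empty or its source ranks at
        # least as high as the source of the current value.
        if current is None or RANK[source] <= RANK[current[1]]:
            record[field] = (value, source)
    return record
```

With this ordering, a value typed by the author survives every later cycle, while an application-generated value is refreshed each time the page changes.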

5. Related work

Several extensions to the LOM standard have been proposed in the literature, for instance, to model course materials (e.g., bibliography, evaluation rules, course program) [10], computer-based training [11], and assessment [12]. These approaches are generally intended to be as broad as possible, in order to maximize their modeling power or applicability. Our approach, in contrast, is intended to refine the educational elements of LOM for a specific application domain.

The HLSI (Higher Level Skills for Industry) project [7] makes use of several strategies to facilitate metadata creation, such as user selectable vocabulary, templates, and automation of technical (low-level) metadata. We combine these and other strategies rather than using them in isolation, as templates are used, for example, in the HLSI project. Moreover, the HLSI authoring software asks for mandatory elements, whereas our approach is not intrusive (all elements are optional).

The ARIADNE Knowledge Pool System (KPS) uses three combined techniques [13]: profile (template), content analysis, and similarity with other objects. In [9], the authors also exploit context analysis and present a framework for automatic metadata generation. However, in their approach they do not make use of strategies based on rules. We use a simple strategy for metadata generation, which is based on mapping rules.

6. Conclusions

Our effort has been spent in recognizing, refining, and supporting the strategies we have identified. The strategies presented in this paper can be useful to improve the quality of metadata, mainly in CSCW applications. We intend to develop more strategies concerning inference-generated metadata, for example, exploiting semantic rules and ontology.

In conclusion, educational authoring tools should provide facilities that combine author-created metadata with strategies for automatic and semi-automatic metadata generation, which can simplify metadata creation.

Acknowledgments. The authors would like to thank FAPESP and FINEP for funding this research.

References

[1] IEEE LTSC, IEEE 1484.12.1-2002 Draft Standard for Learning Object Metadata, 2002.

[2] Dublin Core Metadata Initiative, Using Dublin Core, 2003, http://www.dublincore.org/documents/usageguide.

[3] McClelland, M., “Metadata standards for educational resources”, Computer, 36(11), 107-109, 2003.

[4] Arruda Jr, C. R. E., Izeki, C. A., Pimentel, M. G. C., “CoTeia: Uma ferramenta colaborativa de edição baseada na Web”, Proc. of the 8th Brazilian Symposium on Multimedia and Hypermedia Systems, 371-374, Fortaleza, 2002.

[5] Duval, E., Hodgins, W., “A LOM Research Agenda”, Alternate Paper Tracks Proc. of the 12th WWW Conf., 2003.

[6] Friesen, N., Fisher, S., Roberts, A., CanCore Guidelines for the Implementation of Learning Object Metadata (IEEE 1484.12.1-2002) Version 2.0, 2004, http://www.cancore.org.

[7] Ryan, B., Walmsley, S., “Implementing metadata collection: a projects problems and solutions”, IEEE Learning Technology, 5(1), 2003.

[8] Duarte, S. G., Dicionário Brasileiro de Educação, Nobel, 1986.

[9] Cardinaels, K., Meire, M., Duval, E., “Automating Metadata Generation: the Simple Indexing Interface”, Proc. of the 14th WWW Conf., 548-556, 2005.

[10] Simões, D., Luís, R., Horta, N., “Enhancing the SCORM Metadata Model”, Proc. of the 13th WWW Conf., 238-239, 2004.

[11] Farrell, R., Dooley, S. S., Thomas, J. C., Rubin, W., Levy, S., “Implementing and Extending Learning Object Metadata For Learning-directed Assembly of Computer-based Training”, IEEE Learning Technology, 5(1), 2003.

[12] Chang, W-C., Hsu, H-H., Smith, T. K., Wang, C-C., “Enhancing SCORM metadata for assessment authoring in e-Learning”, Journal of Computer Assisted Learning 20, 305-316, 2004.

[13] Olivié, H., Cardinaels, K., Duval, E., “Issues in Automatic Learning Object Indexation”, Proc. of the ED-MEDIA Conf. 2002(1), 239-240, 2002.

Proceedings of the Third Latin American Web Congress (LA-WEB’05) 0-7695-2471-0/05 $20.00 © 2005 IEEE