taxonomy and metadata
DESCRIPTION
Primer on taxonomy and metadata as seen from an enterprise content management consultant's view
TRANSCRIPT
Taxonomy and Metadata
11/24/09
David Champeau - ECM Consultant
Taxonomy and Metadata
◦ Definitions
◦ Examples
◦ Uses
Introduction
A taxonomy is
◦ A classification scheme
◦ Semantic
◦ A knowledge map
Taxonomies provide the lenses by which we perceive and talk about the world we live in
[Classification] is almost the methodical equivalent of electricity - we use it every day, yet often consider it to be rather mysterious.
Taxonomy
A taxonomy is a form of classification scheme
◦ Designed to group related things together (related, not similar)
    ◦ Oranges and apples are in the fruit section
Can be informal and ad hoc
◦ Organize music CDs by genre
Can be highly formal and standardized
◦ Dewey Decimal System
Taxonomy
Taxonomies are semantic
◦ Taxonomies in knowledge management are different from formal published classification schemes
    ◦ Formal schemes rely heavily on codes
    ◦ Knowledge management taxonomies provide a fixed vocabulary
    ◦ This vocabulary needs to be meaningful and transparent to ordinary users
◦ When content is labeled “Project Kickoff” everybody should know what kind of documents they can expect to find in that category
Taxonomy
Taxonomies are semantic
◦ They express the relationship between terms
    ◦ In the folder structure PROJECT DOCUMENTS\PROJECT KICKOFF we immediately recognize that we will find other types of project documents adjacent to the PROJECT KICKOFF folder and we expect that they will be linked to the sequence of stages in the project
        PROJECT DOCUMENTS\PROJECT KICKOFF
        PROJECT DOCUMENTS\PROJECT REQUIREMENTS
        PROJECT DOCUMENTS\PROJECT ARCHITECTURE
◦ If you take all the labels in a taxonomy and put them in alphabetical order, you have a controlled vocabulary – a dictionary
Taxonomy
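The last point on this slide (labels in alphabetical order give a controlled vocabulary) can be sketched in a few lines. This is only an illustration: the paths are the three from the slide, and the function names `controlled_vocabulary` and `siblings` are made up for this example.

```python
# Folder paths copied from the slide's example.
paths = [
    r"PROJECT DOCUMENTS\PROJECT KICKOFF",
    r"PROJECT DOCUMENTS\PROJECT REQUIREMENTS",
    r"PROJECT DOCUMENTS\PROJECT ARCHITECTURE",
]

def controlled_vocabulary(folder_paths):
    """Collect every label used in the paths and return them alphabetically."""
    labels = set()
    for path in folder_paths:
        labels.update(part.strip() for part in path.split("\\") if part.strip())
    return sorted(labels)

def siblings(folder_paths, label):
    """Return the labels that share a parent folder with `label`."""
    split = [p.rsplit("\\", 1) for p in folder_paths]
    parents = {parts[0] for parts in split if parts[-1] == label}
    return sorted(
        parts[-1] for parts in split
        if parts[0] in parents and parts[-1] != label
    )

vocabulary = controlled_vocabulary(paths)
print(vocabulary)
print(siblings(paths, "PROJECT KICKOFF"))
```

The second function mirrors the "adjacent folders" idea: the structure itself tells you which other terms sit next to PROJECT KICKOFF.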
A taxonomy is a knowledge map: “coup d’œil” – “cast of the eye”
◦ A good taxonomy should enable the user to immediately grasp the overall structure of the knowledge domain
◦ The user should be able to accurately anticipate what resources he or she might find where
◦ The taxonomy should be comprehensible, predictable and easy to navigate
Taxonomy
A taxonomy also acts as an artificial memory device
◦ Concepts are located in taxonomy structures and locked in place by association with their neighbors through their classification relationships
◦ This affords considerable mnemonic power
Taxonomy
Various representations of taxonomies
◦ Lists
◦ Trees
◦ Hierarchies
◦ Polyhierarchies
◦ Matrices
◦ Facets
◦ System maps
Taxonomy
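To make three of these representations concrete, here is a rough sketch of how a tree, a polyhierarchy and a set of facets might be held as plain data. All terms and names below are hypothetical and chosen only for illustration. The difference to notice is that a tree gives each term one parent, a polyhierarchy allows several, and facets are independent dimensions.

```python
# Tree: each term has exactly one parent (None marks the root).
tree_parent = {
    "Fruit": None,
    "Apples": "Fruit",
    "Oranges": "Fruit",
}

# Polyhierarchy: a term may sit under more than one parent.
poly_parents = {
    "Fruit": [],
    "Juice ingredients": [],
    "Oranges": ["Fruit", "Juice ingredients"],
}

# Facets: independent dimensions; an item takes a value from each one.
facets = {
    "document_type": ["Contract", "Report"],
    "project_stage": ["Kickoff", "Requirements", "Architecture"],
}

def path_to_root(term, parent_of):
    """Walk up a tree from `term` and return the path, root first."""
    path = []
    while term is not None:
        path.append(term)
        term = parent_of[term]
    return list(reversed(path))

def facet_combinations(facet_values):
    """Count how many distinct classifications the facets allow."""
    total = 1
    for values in facet_values.values():
        total *= len(values)
    return total

print(path_to_root("Oranges", tree_parent))
print(facet_combinations(facets))
```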
Taxonomy work
◦ Taxonomies are products of work
◦ Developing a taxonomy is a project
◦ Knowledge management taxonomies need to reflect the working worlds of the organizations they are created for
◦ Because those working worlds continue to change, so must our taxonomies
    ◦ Taxonomy work is therefore continuous
Taxonomy
Taxonomy and Knowledge Management evolution
◦ Paper filing systems
◦ Shared drive folder structures
◦ Content management systems
Initially taxonomies were quite simple, drop down lists of keywords
◦ Initially an aid to findability
As technology developed, metadata played a wider role in the control and management of content
Taxonomy
Definition
◦ “Data about data” – Oxford English Dictionary
◦ “A collection of structured information about a document or a piece of content”
◦ For a document or (work) item of information this means data about the item such as Author, Title, Issue Date and other information.
◦ Metadata is usually defined in terms of units called “elements”, “fields”, “attributes” or “properties”
    ◦ Some elements may have “sub-elements”
    ◦ Date may have “date created”, “date approved”, “date published”
◦ Metadata may be made mandatory or optional
Metadata
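One possible way to picture elements, sub-elements and the mandatory/optional flag is a small schema object. The sketch below is an assumption, not a description of any particular ECM product: the `Element` class, the dotted naming and the choice of which elements are mandatory are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Element:
    """One metadata element (also called a field, attribute or property)."""
    name: str
    mandatory: bool = False
    sub_elements: List["Element"] = field(default_factory=list)

# Hypothetical schema; the mandatory flags are assumptions.
schema = [
    Element("author", mandatory=True),
    Element("title", mandatory=True),
    Element("date", sub_elements=[
        Element("date created", mandatory=True),
        Element("date approved"),
        Element("date published"),
    ]),
]

def mandatory_names(elements, prefix=""):
    """Return dotted names of all mandatory elements, including sub-elements."""
    names = []
    for element in elements:
        full_name = prefix + element.name
        if element.mandatory:
            names.append(full_name)
        names.extend(mandatory_names(element.sub_elements, full_name + "."))
    return names

print(mandatory_names(schema))
```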
Purposes of metadata
◦ To identify content
    ◦ Capture fields and distinguish each document from all others
◦ Manage content
    ◦ Version numbers, archive date, security and access permissions
◦ Retrieval of content
    ◦ Taxonomy topics, subject keywords, document type
◦ Connect content to other content
    ◦ Behavioural metadata captured in transaction (e.g. Amazon.com)
◦ Business processes
    ◦ Authored by whom? Reviewed by whom and when? Approved by whom and when?
◦ Support Records Management
    ◦ Retention periods, disposition cycles
Metadata
Standards and Guidelines
◦ Dublin Core Metadata Initiative (ISO 15836)
◦ Records Management
    ◦ ISO 15489 and 23081
◦ US DoD 5015.2-STD
    ◦ Design Criteria Standard for Electronic Records Management Software Applications
Metadata
Dublin Core Example
Field Name | Element Name | Definition | Data type or Source | Comment
Title | Title | A name given to the resource | Text string | Compulsory – pick up from system
Author | Creator | Who created the content | Text string | Compulsory – pick up from system
Subject area 1 | Subject | The topic of the content | Values come from taxonomy facet | Compulsory – select from drop down list
Subject area 2 | Subject | The topic of the content | Values come from taxonomy facet | Compulsory – select from drop down list
Subject area 3 | Subject | The topic of the content | Values come from taxonomy facet | Compulsory – select from drop down list
Place | Coverage | Extent or scope of the content | Values come from geographic names | Optional – select from drop down list
Date | Date | Date the content was published | Date: use format yyyy-mm-dd | Compulsory – pick up from system
Document type | Type | The type of document | Values come from taxonomy facet | Compulsory – pick from drop down list
Format | Format | File format of content | Text string conforming to internet media types | Compulsory – pick up from system
Identifier | Identifier | URL or other document identifier | Text string | Compulsory – pick up from system
Publisher | Publisher | Entity responsible for making the content available | Values come from list of dept names | Optional – select from drop down list
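As a rough illustration of how the Dublin Core table could be applied, the sketch below stores one record keyed by the element names and checks the compulsory elements and the date format. The compulsory/optional split is read from the table's Comment column. The record values and the `validate` function are hypothetical.

```python
from datetime import datetime

# Compulsory and optional elements, as marked in the table's Comment column.
COMPULSORY = {"title", "creator", "subject", "date", "type", "format", "identifier"}
OPTIONAL = {"coverage", "publisher"}

# Hypothetical record; every value is invented for illustration.
record = {
    "title": "Project kickoff minutes",
    "creator": "J. Smith",
    "subject": ["Project management"],
    "date": "2009-11-24",
    "type": "Minutes",
    "format": "application/pdf",
    "identifier": "DOC-0001",
}

def validate(rec):
    """Return a list of problems; an empty list means these checks passed."""
    problems = [
        f"missing compulsory element: {name}"
        for name in sorted(COMPULSORY) if not rec.get(name)
    ]
    unknown = set(rec) - COMPULSORY - OPTIONAL
    problems += [f"unknown element: {name}" for name in sorted(unknown)]
    if rec.get("date"):
        try:
            # Loose check: strptime also accepts unpadded months and days.
            datetime.strptime(rec["date"], "%Y-%m-%d")
        except ValueError:
            problems.append("date is not in yyyy-mm-dd format")
    return problems

print(validate(record))
print(validate({"title": "Untitled", "date": "24/11/2009"}))
```

In a real system most of these values would be picked up from the system or a drop down list, as the table notes, rather than typed by hand.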
Enforcing metadata use
◦ Items with no metadata?
◦ Minimum metadata needed at birth
◦ Metadata additions later
◦ Keep entries consistent
    ◦ Controlled vocabularies
    ◦ Pick values from lists
    ◦ Select from given options
A simple approach
◦ The system will hold metadata about items in two main categories
    ◦ Essential (mandatory), to identify and manage the item
    ◦ Optional, to provide more information about the item
More on metadata
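The "keep entries consistent" point comes down to rejecting values that are not on a pick list. A minimal sketch, assuming hypothetical field names and list values (in practice the lists would come from the taxonomy):

```python
# Hypothetical pick lists; real values would come from the taxonomy.
PICK_LISTS = {
    "document_type": {"Contract", "Minutes", "Report"},
    "department": {"Finance", "Legal", "Operations"},
}

def check_entry(field_name, value):
    """Accept a value only if it appears in the pick list for that field."""
    allowed = PICK_LISTS[field_name]
    if value not in allowed:
        raise ValueError(
            f"{value!r} is not an allowed {field_name}; choose from {sorted(allowed)}"
        )
    return value

print(check_entry("document_type", "Report"))
```

Note that the check is case-sensitive, so "report" would be rejected. That is the intent here: free-typed variants are what make entries inconsistent.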
Clearly metadata has to come from somewhere – and be accurate and useful
◦ Making some entries mandatory can help
Too many mandatory elements may be seen as a tedious chore
Too few mandatory elements may result in little metadata being entered
Too many optional metadata entries may also result in little metadata being entered
Users need to appreciate the VALUE of filling in the entries, voluntarily
Mandatory or Optional?
Metadata sources
◦ Document
◦ Template
◦ System
◦ User
Multi-media sources
Auto-classification and auto-indexing
◦ Keyword indexing
◦ OCR/ICR
◦ Classification software
Metadata Sources
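Keyword indexing is the simplest of the auto-classification options listed. The sketch below shows the idea under stated assumptions: the categories and keyword lists are invented, and real classification software is considerably more sophisticated than counting word matches.

```python
import re

# Hypothetical keyword lists per taxonomy category.
CATEGORY_KEYWORDS = {
    "Project Kickoff": {"kickoff", "charter", "stakeholders"},
    "Project Requirements": {"requirement", "requirements", "specification"},
}

def suggest_categories(text):
    """Return categories whose keywords occur in the text, best match first."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {
        category: len(words & keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [category for category, score in ranked if score > 0]

print(suggest_categories("Kickoff meeting: stakeholders reviewed the draft requirements"))
```

The suggestions could be offered to the user as pre-selected values, so that the user only confirms or corrects them.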
Metadata content, however important, is notoriously difficult to acquire from users
Before implementing ECM, users just put documents into an electronic folder of their choosing
Now you are asking them to make a series of decisions about choosing categories, identifying access restrictions and so on
Metadata implemented
Try to assign metadata without user involvement
◦ E.g. templates, defaults
Users must see value
◦ Does it make their job easier?
Metadata implemented
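Assigning metadata without user involvement can be pictured as layering sources: template defaults first, then values the system already knows, then whatever the user chooses to add. This is a sketch only. The field names, the default values and the `build_metadata` function are assumptions made for the example.

```python
import getpass
from datetime import date

# Hypothetical defaults attached to one document template.
TEMPLATE_DEFAULTS = {
    "document_type": "Minutes",
    "retention_years": 7,
}

def build_metadata(user_entries, today=None, username=None):
    """Layer template defaults, then system values, then user entries."""
    system_values = {
        "author": username or getpass.getuser(),
        "date_created": (today or date.today()).isoformat(),
    }
    # Later sources override earlier ones, so the user has the final say.
    return {**TEMPLATE_DEFAULTS, **system_values, **user_entries}

metadata = build_metadata(
    {"subject": "Budget"},
    today=date(2009, 11, 24),
    username="example.user",
)
print(metadata)
```

Here the user supplied one value and the other four were filled in automatically, which is the kind of ratio that makes metadata entry feel less like a chore.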
David Champeau ECM Consultant [email protected]
Hope that it was helpful