Page 1

For Conference Purposes Only

Enabling An Information Driven Enterprise: Terminology Management at EPA

Michael Pendleton
Metadata Open Forum
New York City
July 10, 2007

Page 2

Overview

• EPA’s need for terminology management
• Current terminology development efforts
• Elements of a successful terminology program
• Environmental Terminology System and Services (ETSS)
• Semantic Vision

Page 3

Why EPA Needs to Manage Terms

REASON # 1: So that we know what we mean

• Business terms
• Legal terms
• Administrative terms
• Acronyms

Gary Larson – The Far Side

Page 4

EPA’s Quality System

• Quality System focuses on data
• Need for shared understanding
• Quality Glossary Project
  • Retooling the Quality Glossary
  • Establishing a repeatable glossary governance framework and methodology

Page 5

Why EPA Needs to Manage Terms

REASON # 2: So we can find stuff

• Indexing
• Cataloging
• Keyword management

“Commentary.” Government Computer News – August 14, 2006

Page 6

Web Taxonomy

• EPA’s Web content
• Information Architecture Strategy
• Web Taxonomy
  • Metadata specifications + controlled vocabulary
  • Faceted approach

Page 7

EPA Taxonomy Facets

Facets | Definitions
Information Types | Typology that indicates what type of information this is.
Audiences | Audience segments for whom the content is targeted.
Geography | Places which the content covers or is related to.
Functions | EPA business functions or services that are covered by or related to the content.
Industries | Industry sectors that are covered by or related to the content.
Organizations | EPA or external organizational units that are covered by or related to the content.
Laws & Regulations | Specific environmental laws, regulations and treaties that are covered by or related to the content.
Substances | Chemicals and substances covered by or related to the content.

Page 8

EPA Web Taxonomy: Asset & Use Facets

Audiences: Consumers; Contractors & Grantees; EPA Employees; Government; Health Care Providers; International; Researchers & Scientists; Teachers & Kids; Technical & Regulated Community

Geography: Country & Region; United States; State; Region; Regulated Facilities; Superfund Sites; Watersheds & Wetlands

Information Types: Basic Facts & Information; Community Information; Concerned Citizens Resources; Curriculum Resources; Emergency Preparedness & Response Information; Environmental Laws & Regulations; News & News Releases; Program Resources; Resources for Non Profit Organizations; Technical Information; Test Methods & Models

Page 9

EPA Web Taxonomy: Subject Facets

Functions: Services for Citizens; Community & Social Services; Disaster Mgmt; Economic Dev; Education; Energy; Environmental Mgmt; General Science & Innovation; Homeland Security; Intl Affairs & Commerce; Law Enforcement; Natural Resources; Mode of Delivery; Support Delivery of Services; Mgmt of Resources; Admin Mgmt

Substances: Agricultural Chemical; Air Pollutant; Allergen; Biological Contaminant; Carcinogen; Chemical; Explosive; Extremely Hazardous Substance; Liquid Waste; Microorganism; Multimedia Pollutant; Mutagen; Ozone; Pesticide; Radiation; Radioactive Waste; Soil Contaminant; Solid Waste - Nonhazardous; Teratogen; Toxic Substance; Water Pollutant

Industries: Agriculture; Automobile Repair; Banking; Chemical; Construction; Dry Cleaning; Electronics & Computer; Energy; Environmental; Extractive; Fishing; Food Processing; Forest; Garment & Textile Care; Leather Tanning & Finishing; Metal Finishing; Metal Processing; Pesticides; Petroleum; Pharmaceutical; Printing; Pulp & Paper; Real Estate; Transportation

Organizations: EPA; Federal Government; Interagency Programs; Local Government; Military; Multi-State Workgroups; Non-Government Organization; Partner/Network; Publication & Information Source; State Government; Tribal Government

Topics: Economics & Policy; Emergencies & Cleanup; Environmental Media; Human Health; Industrial; Research, Prevention & Control

Page 10

EPA Taxonomy: Topics Sub-Facets

Health: Advisory; Children’s Health; Exposure; Food Safety; Health Assessment; Health Effect; Health Risk; Occupational Health; Pesticide Effects; Senior's Health; Sun Protection; Toxicity

Cooperation & Assistance: Communities; Economics & Financing; Global Climate Change; International Cooperation; Risk Assessment; Technical Assistance; Technical Cooperation; Voluntary Partnerships

Emergencies & Cleanup:
  Cleanup: Brownfields; Cleanup Technology; Corrective Actions; Storage Tanks; Superfund
  Emergencies: Accidents; Contingency Plans; Counter-Terrorism; Disasters; Emergency Preparedness; Oil Spills; Poisoning; Radiation Emergencies; Storage Tank Spills

Environmental Media: Air; Ecosystems; Waste; Water

Industrial: Industrial Ecology; Industrial processes; Large Buildings; Orphaned Sources; Pesticide Topics; Radiation & Radioactivity; Small Business; Storage Tanks

Research, Prevention & Control: Pollution Prevention; Physical Aspects; Research; Treatment & Control

Page 11

Example Webpage: Mercury Research Strategy

Facet | Value
Information Types | Technical Information; Planning Documents
Organization | Office of Research & Development
Functions | Pollution Prevention & Control; Research & Development
Substances | Mercury
Health Topics | Advisory
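
A faceted record like the Mercury Research Strategy example can be pictured as a mapping from facet name to a set of values. The sketch below is only an illustration of that idea: the `TaggedPage` class and the `matches` helper are hypothetical names, not part of any EPA system, and the facet values are copied from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TaggedPage:
    """A web page plus its facet tags (hypothetical structure)."""
    title: str
    facets: Dict[str, Set[str]] = field(default_factory=dict)


def matches(page: TaggedPage, query: Dict[str, str]) -> bool:
    """Return True if the page carries every facet value named in the query."""
    return all(
        value in page.facets.get(facet, set())
        for facet, value in query.items()
    )


# Facet values taken from the example slide.
mercury = TaggedPage(
    title="Mercury Research Strategy",
    facets={
        "Information Types": {"Technical Information", "Planning Documents"},
        "Organization": {"Office of Research & Development"},
        "Functions": {"Pollution Prevention & Control", "Research & Development"},
        "Substances": {"Mercury"},
        "Health Topics": {"Advisory"},
    },
)

# A query combines facets: each one narrows the result independently.
print(matches(mercury, {"Substances": "Mercury", "Health Topics": "Advisory"}))
print(matches(mercury, {"Substances": "Ozone"}))
```

Because each facet is independent, a page can be found by substance, by function, or by any combination of the two.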

Page 12

Why EPA Needs to Manage Terms

REASON # 3: Others are counting on us

• Emergency response
• Federal Government
  • (CENDI) Interagency workgroup
• International efforts
  • EcoInformatics Initiative
  • Ecoterm

Page 13

Where We’ve Been

EPA’s Terminology Reference System (www.epa.gov/trs)

• Searchable repository
• Over 250 distinct vocabularies; over 11,000 terms
  • Environmental regulations and laws
  • EPA Program glossaries and term lists
  • GEneral Multilingual Environmental Thesaurus (GEMET)
• Significant limitations
  • Limited search capability
  • Lacks web services
  • Lacks editing functionality
  • Doesn’t support multilingual capability
  • Insufficient for concept management

Page 14

Elements of a Successful Terminology Management Program

• Content – terminology important to EPA and our partners
• Data Model – to hold various types of terminologies
• Tools – create, store, maintain, compare, and distribute terminologies
• Governance – to support development and maintenance of terminologies
• Services – training, administration, web services

Page 15

Page 16

ETSS Status

Current
• Terminology editorial system
• Providing editor training and resource page
• Migrated TRS content to ETSS
• Added Web Taxonomy to ETSS

Coming Soon
• Public interface
• Integrate with other systems
• Establish governance and workflow
• Strategy for concept-based system

Page 17

Login for EPA and Partners

Page 18

Semantic Vision

Controlled concepts interact with data

Diagram labels:
• ETSS – Vocabulary Management
• EDR: Data Element Metadata
• Web Content Catalog
• READ: System Inventory
• SCRR: Reusable Components
• ECMS: Doc. Mgmt. & Records

Page 19

Getting There

• Establish umbrella concept system
• Establish relationships between terms across vocabularies
• Add and improve content
• Develop comparison tools
• Enable stewardship program
• Automated transactions

Page 20

For More Information

Environmental Terminology System and Services

Michael Pendleton - Office of Environmental Information, Data Standards Branch, [email protected]; (202) 566-1658

Linda Spencer - Office of Environmental Information, Data Standards Branch, [email protected]; (202) 566-1651

Quality Glossary

Katherine Breidenstine - Office of Environmental Information, Quality Staff, [email protected]; (202) 564-1511

Web Taxonomy

Susan Fagan - Office of Environmental Information, Information Access Division, [email protected]; 202-566-2021

Page 21

Key ETSS Customers

Human Customers
• EPA vocabulary developers like the Web Taxonomy Project
• Policy makers defining terms in regulations
• System developers selecting XML tags and defining data elements
• Program managers and researchers seeking terms and glossaries perhaps via the portal
• Non-EPA vocabulary developers interested in environmental terms
• People trying to use terms and definitions consistently
• Stakeholders, partners and the public

System Customers
• Search engines – to expand searches or provide the basis for taxonomies or folders
• Enterprise content management – source of value domains and controlled vocabularies
• Other systems that use pick lists

Page 22

Extra Slides

Page 23

ETSS High-Level Data Model

Diagram labels:
• Vocabulary (Relationship Definitions, Rules, Versions, Contact Information for Stewards & Owners)
• Terms
  • Standard Attributes (Definitions, Source, Language)
  • EPA Custom Attributes (Notes fields, etc.)
• Relationship Links (Narrower Than, Broader Than, Equivalent, and EPA-Custom Relationships to be Defined)
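
One way to read the high-level data model is as three record types: a vocabulary, its terms, and typed links between terms. The sketch below is a minimal illustration under that reading. Every class, field, and method name is hypothetical rather than the ETSS schema, the steward address is a placeholder, and the Superfund/Cleanup link is an illustrative pairing based on the Topics sub-facets.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Relation(Enum):
    """Relationship types named on the slide; custom EPA types are omitted."""
    NARROWER_THAN = "Narrower Than"
    BROADER_THAN = "Broader Than"
    EQUIVALENT = "Equivalent"


@dataclass
class Term:
    name: str
    definition: str = ""                                   # standard attribute
    source: str = ""                                       # standard attribute
    language: str = "en"                                   # standard attribute
    custom: Dict[str, str] = field(default_factory=dict)   # e.g. notes fields


@dataclass
class RelationshipLink:
    subject: str
    relation: Relation
    target: str


@dataclass
class Vocabulary:
    name: str
    version: str
    steward_contact: str
    terms: Dict[str, Term] = field(default_factory=dict)
    links: List[RelationshipLink] = field(default_factory=list)

    def add_term(self, term: Term) -> None:
        self.terms[term.name] = term

    def link(self, subject: str, relation: Relation, target: str) -> None:
        # Both ends of a link must already exist in this vocabulary.
        for name in (subject, target):
            if name not in self.terms:
                raise KeyError(f"unknown term: {name}")
        self.links.append(RelationshipLink(subject, relation, target))

    def related(self, subject: str, relation: Relation) -> List[str]:
        return [
            link.target for link in self.links
            if link.subject == subject and link.relation is relation
        ]


# Illustrative data only.
topics = Vocabulary("Topics", version="draft", steward_contact="steward@example.org")
topics.add_term(Term("Cleanup"))
topics.add_term(Term("Superfund", custom={"notes": "illustrative entry"}))
topics.link("Superfund", Relation.NARROWER_THAN, "Cleanup")
print(topics.related("Superfund", Relation.NARROWER_THAN))
```

Keeping links as separate records, rather than as attributes of a term, is what would let custom relationship types be added later without changing the term structure.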

Page 24

Knowledge Organization Continuum

Page 25

Enterprise Content Management System (ECMS)

Terminology Management Needs
• keyword list management such as document type and topic (e.g. air, water, waste)
• manage content, and web service content to Documentum
• Repository for ECMS metadata

Page 26

Concept Management and the Semantic Web

The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.

It’s about:
• Managing concepts
• More explicit meaning
• Structure and standards
• Tools and infrastructure

Page 27

What is Concept Management?

• Organizing terms around core concepts in a business, domain or enterprise
• Goals:*
  • Articulate clear and concise meanings of business domain concepts
  • Achieve a shared understanding of the concepts among relevant stakeholders, and
  • Guard the stability of a concept’s meaning during system development
• Major activities:*
  • Scoping the environment of discourse
  • Concept specification, integration and enforcement

*Bleeker, et al. “The Role of Concept Management in System Development – A Practical and Theoretical Perspective” 2003. http://www.cs.ru.nl/Research/reports/full/NIII-R0330.pdf

Page 28

Page 29

ETSS Relationship to the System of Registries

Diagram labels:
• EPA System of Registries
• ETSS
  • Discover Terminology
  • Develop Terminology
• Launches to collaboration tools
• Environmental Data Registry (EDR)
• Registry of EPA Applications and Databases (READ)
• Facility Registry System (FRS)
• Substance Registry System (SRS)
• Service Component Registry and Repository (SCRR)
• Launches to Synaptica

Page 30

Taxonomy Topics Sub-Facets

Topics Sub-Facets | Definitions
Cooperation & Assistance | Topics related to environmental cooperation and assistance referred to or associated with content.
Emergencies & Cleanup | Topics related to environmental emergencies and cleanup referred to or associated with content.
Environmental Media | Topics related to environmental media (air, land, water) referred to or associated with content.
Health | Topics related to health conditions or concerns referred to or associated with content.
Industrial | Topics related to industrial environmental issues and policies referred to or associated with content.
Research, Prevention & Control | Topics related to environmental research and pollution prevention and control referred to or associated with content.

Page 31

Indexing rules: How to use EPA Taxonomy to tag content

Rule | Description
Use specific terms | Apply the most specific terms when tagging content. Specific terms can always be generalized, but generic terms cannot be specialized.
Use multiple terms | Use as many terms as necessary to describe what the content is about and why it is important.
Use appropriate terms | Only fill in the facets and values that make sense. Not all facets apply to all content.
Consider how content will be used | Anticipate how the content will be searched for in the future, and how to make it easy to find. Remember that search engines can only operate on explicit information.
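
The "use specific terms" rule can be shown with a small broader-term lookup. This is a sketch only: the `BROADER` table and both helper functions are hypothetical, and the parent-child pairs assume that the Topics sub-facet slide lists Oil Spills under Emergencies and Superfund under Cleanup.

```python
from typing import Dict, List, Set

# Each term maps to its broader term (assumed reading of the Topics sub-facets).
BROADER: Dict[str, str] = {
    "Oil Spills": "Emergencies",
    "Emergencies": "Emergencies & Cleanup",
    "Superfund": "Cleanup",
    "Cleanup": "Emergencies & Cleanup",
}


def generalize(term: str) -> List[str]:
    """Return the term followed by each successively broader term."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def expand_tags(tags: Set[str]) -> Set[str]:
    """Add every broader term implied by the tags an indexer applied."""
    expanded: Set[str] = set()
    for tag in tags:
        expanded.update(generalize(tag))
    return expanded


# Tagging with the specific term makes the page findable at every level.
specific = expand_tags({"Oil Spills"})
# Tagging with the generic term cannot recover the specific one.
generic = expand_tags({"Emergencies & Cleanup"})

print(sorted(specific))
print(sorted(generic))
```

The expansion only works upward, which is the point of the rule: a page tagged "Oil Spills" can be found under "Emergencies", while a page tagged only "Emergencies & Cleanup" gives a search engine nothing explicit to narrow on.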

Page 32

Environmental Terminology System and Services (ETSS)

• Search & Discovery
• Terminology Management
• Human and Automated Services
• Collaborative Stewardship