Terminology and Standards
Dan Gillman
US Bureau of Labor Statistics

Terminology
  Principle
    – To communicate, we need to agree on terms
  Concept
    – unit of thought
  Term
    – linguistic expression (similar to a word) linked to a concept
  Special Language
    – set of terms describing a subject field

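The relationship between concept, term, and special language can be sketched as a small data model. This is a minimal illustration, not part of the original slides: the class names `Concept`, `Term`, and `SpecialLanguage` are hypothetical, and the sample definition is the short US CPS wording that appears later in the deck, not an official definition.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Concept:
    """A unit of thought, represented here only by its definition text."""
    definition: str


@dataclass(frozen=True)
class Term:
    """A linguistic expression linked to one concept."""
    label: str
    concept: Concept


@dataclass
class SpecialLanguage:
    """The set of terms describing one subject field."""
    subject_field: str
    terms: Dict[str, Term] = field(default_factory=dict)

    def add(self, label: str, definition: str) -> Term:
        term = Term(label, Concept(definition))
        # Store under a lower-cased key so lookups ignore case.
        self.terms[label.lower()] = term
        return term

    def meaning(self, label: str) -> str:
        # Raises KeyError if the term is not part of this special language.
        return self.terms[label.lower()].concept.definition


# Illustrative entry only (hypothetical data).
cps = SpecialLanguage("US Current Population Survey")
cps.add("Unemployed", "not employed but still in the labor force")
print(cps.meaning("unemployed"))
```

Keeping `Concept` separate from `Term` makes it possible for several terms to share one concept, which is what the plain English equivalents later in the deck rely on.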
Terminology
  Examples of special languages
    Probability and statistics
    Database theory
    Statistical metadata
    Statistical activity within each statistical institute (SI)
      – E.g., US Current Population Survey
        • Labor force
        • Unemployed
    Union of special languages within SI

Projects
  UNECE Metadata Glossary
    Glossary (a.k.a. Vocabulary)
      – Alphabetical listing of terms and their definitions
  BLS Taxonomy and Lexicon
    Taxonomy (artefact, not the science)
      – Scheme for organizing terms within some subject field, typically a hierarchy
    Lexicon
      – Vocabulary, or dictionary, of terms

UNECE Metadata Glossary
  Create glossary of terms
    In order of importance
      – UNECE statistical metadata standards
        • GSIM, GSBPM, GAMSO, CSPA, etc.
      – Other statistical metadata standards
        • DDI, SDMX, etc.
      – Other standards and specifications
        • Maybe ISO/IEC 11179, Dublin Core, etc.
  Disseminate in user-friendly format

UNECE Metadata Glossary
  Build special language for
    Statistical institutes
      – Designing metadata systems
      – Building interfaces to metadata systems
      – Message frameworks for sharing metadata
  Establish authoritative source
    Terms
    Definitions
    For international use

BLS Taxonomy and Lexicon
  Project to
    Record terms describing BLS data
      – For all disseminated time series
      – Separate terms into facets
        • Measures (estimates on populations)
        • Characteristics (classifications used to subset measures)
    Produce
      – Taxonomy – hierarchy of terms
      – Lexicon – list of terms

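The two products described above, a hierarchy of terms and a flat list of terms, can be sketched with a nested dictionary. This is only an illustration under stated assumptions: the term names and their placement are hypothetical placeholders, not the actual BLS taxonomy, and `lexicon_from` and `facet_of` are made-up helper names.

```python
# Hypothetical taxonomy: facet -> broader term -> narrower terms.
# Placeholder content for illustration; not the real BLS taxonomy.
taxonomy = {
    "Measures": {
        "Employment": ["Labor force", "Unemployed"],
        "Prices": ["Consumer Price Index"],
    },
    "Characteristics": {
        "Field of work": ["Industry", "Occupation"],
    },
}


def lexicon_from(taxonomy: dict) -> list:
    """Flatten the hierarchy into a sorted, de-duplicated list of terms."""
    terms = set()
    for facet in taxonomy.values():
        for broader, narrower in facet.items():
            terms.add(broader)
            terms.update(narrower)
    return sorted(terms, key=str.lower)


def facet_of(term: str, taxonomy: dict):
    """Return the facet a term belongs to, or None if it is not in the taxonomy."""
    for facet_name, facet in taxonomy.items():
        for broader, narrower in facet.items():
            if term == broader or term in narrower:
                return facet_name
    return None


lexicon = lexicon_from(taxonomy)
print(lexicon)
print(facet_of("Occupation", taxonomy))
```

Deriving the lexicon from the taxonomy keeps the two artefacts consistent; a real project might maintain them separately, for example if the lexicon also holds synonyms that have no place in the hierarchy.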
BLS Taxonomy and Lexicon
  Goals
    For each term, find related documents and data
      – organize data – use taxonomy
      – tag documents – use lexicon
    Use taxonomy to drive and guide
      – Web site reorganization
    Provide plain English equivalent words
      – Help unsophisticated users find resources
      – Alleviate common confusions

BLS Taxonomy and Lexicon
  Plain English examples
    Inflation – CPI
    Field of work – industry or occupation
    Wages, earnings, income, compensation
  Plain English names for categories
  Authoritative source for BLS language

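A plain English lookup of the kind suggested by these examples could be as simple as a dictionary from everyday words to technical terms. The two pairs below are taken from the slide; the `PLAIN_ENGLISH` table and the `technical_terms` function are hypothetical names used only for this sketch.

```python
# Plain English word -> technical terms it may correspond to.
# The pairs come from the slide's examples; the structure is an assumption.
PLAIN_ENGLISH = {
    "inflation": ["CPI"],
    "field of work": ["industry", "occupation"],
}


def technical_terms(query: str) -> list:
    """Return the technical terms recorded for a plain English word, or []."""
    return PLAIN_ENGLISH.get(query.strip().lower(), [])


print(technical_terms("Inflation"))
print(technical_terms("Field of work"))
print(technical_terms("salary"))
```

Returning a list rather than a single term reflects the "industry or occupation" example, where one everyday phrase points to more than one technical term and the user still has to choose.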
Usage of Terms
  Metadata models
    Names of classes, attributes, relationships
    E.g., Universe, Category, Specialization
  Metadata content
    Content stored in attributes in a model
    E.g., establishment, retail grocery store, etc.
  Terminology systems
    Authoritative sources for terms / meaning

Standards
  Why standards?
    Consistency
      – Eliminate inconsequential (gratuitous) differences
        • Spelling and phrasing differences
    Semantic interoperability
      – Shared meaning w/o need for negotiation
    Data harmonization
      – Ability to combine data from different sources

Standards
  Many levels
    Program, Agency, National, Regional, International
  Weaker condition
    Authoritative sources
      – Term and meaning for some subject field(s)
        • E.g., unemployed in US CPS
        • Plain English -> not employed
        • US CPS -> not employed but still in Labor Force
      – Not necessarily standard

Standards
  Consistency and Interoperability
    Handled by authoritative sources
    Use URIs to point to terminological entries
    Spelling and phrasing differences eliminated
    Access to meaning ensured
  But,
    Differences across subject fields remain

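The idea of pointing to terminological entries by URI can be sketched as a registry keyed by URI. Everything named here is an assumption made for illustration: the `example.org` URIs are placeholders rather than identifiers published by any agency, and `resolve` and `meanings` are hypothetical helpers. The two definitions reuse the "unemployed" example from the preceding slide.

```python
# Hypothetical registry of terminological entries, keyed by URI.
# The example.org URIs are placeholders, not real published identifiers.
ENTRIES = {
    "https://example.org/terms/plain-english/unemployed": {
        "term": "unemployed",
        "subject_field": "Plain English",
        "definition": "not employed",
    },
    "https://example.org/terms/us-cps/unemployed": {
        "term": "unemployed",
        "subject_field": "US CPS",
        "definition": "not employed but still in the labor force",
    },
}


def resolve(uri: str) -> dict:
    """Return the terminological entry for a URI; raises KeyError if unknown."""
    return ENTRIES[uri]


def meanings(term: str) -> dict:
    """Map each subject field to its definition of the given term."""
    return {
        entry["subject_field"]: entry["definition"]
        for entry in ENTRIES.values()
        if entry["term"] == term
    }


print(resolve("https://example.org/terms/us-cps/unemployed")["definition"])
print(meanings("unemployed"))
```

Each URI gives unambiguous access to one meaning, which removes spelling and phrasing differences, yet `meanings("unemployed")` still returns two different definitions: the difference across subject fields remains.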
Standards
  Data Harmonization
    Authoritative sources not sufficient
      – Subject fields may differ
      – Gratuitous differences may exist too
    Need new standards and agreements
      – Bilateral agreements not scalable
    Multiple standards on same subject a problem
      – E.g., Geographical standards (US MSA vs. CSA)
      – BLS has 6 definitions of Boston
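
One way to read "bilateral agreements not scalable" is as a counting argument: if every pair of organizations negotiates its own agreement, the number of agreements grows quadratically, whereas aligning each organization to one shared standard needs only one mapping per organization. The sketch below illustrates that interpretation; the function names are hypothetical and the party counts are arbitrary.

```python
def bilateral_agreements(n: int) -> int:
    """Number of distinct pairs among n parties: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def mappings_to_shared_standard(n: int) -> int:
    """One mapping per party when everyone aligns to a single standard."""
    return n


# Arbitrary party counts, chosen only to show the growth rates.
for n in (5, 20, 100):
    print(n, bilateral_agreements(n), mappings_to_shared_standard(n))
```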