[aiim17] data categorization you can live with - monica crocker
TRANSCRIPT
DATA CATEGORIZATION YOU CAN LIVE WITH
Presented by Monica Crocker, CRM, PMP
Speaker Introduction and Session Overview
• Speaker
• Session Overview
  – Define Data Categorization
  – Describe a Data Categorization Development approach
  – Discuss some “gotchas”
• What does Data Categorization mean to you?
Why Do You Care?
• Because others care: Information Security, Privacy, BCP, Users, Compliance, Auditors, etc.
• Information is only useful if findable
• Categories can be applied at different levels: enterprise, division, application, table, field
• Full text searching doesn’t work
• Consistency is critical to system and process integrity
Data Categorization and Taxonomy
• Taxonomy = A system of classifying things based on their relationships; often from more general to more specific
• Data Categorization = A way to categorize content (classification)
• Together = A way to get to information
• Note: Data = Information
• Information = Structured and Unstructured
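The "more general to more specific" idea can be sketched as a small tree. This is only an illustration: the `Category` class, the `path_to` helper, and the category names are hypothetical and do not come from the presentation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a taxonomy; children are more specific than their parent."""
    name: str
    children: list[Category] = field(default_factory=list)

    def add(self, name: str) -> Category:
        """Create a more specific category under this one and return it."""
        child = Category(name)
        self.children.append(child)
        return child


def path_to(root: Category, target: str, trail: tuple[str, ...] = ()) -> list[str] | None:
    """Return the general-to-specific path from root to target, or None if absent."""
    trail = trail + (root.name,)
    if root.name == target:
        return list(trail)
    for child in root.children:
        found = path_to(child, target, trail)
        if found is not None:
            return found
    return None


# Hypothetical example categories (assumed for illustration only).
root = Category("Enterprise")
finance = root.add("Finance")
finance.add("Accounts Payable")
finance.add("Payroll")
root.add("Human Resources")

print(path_to(root, "Payroll"))
```

The path returned for an item is what makes the combination "a way to get to information": each step narrows the place a user has to look.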
Ways to Categorize Information
• By level of risk: critical, confidential, public
• Subject – requires a robust cross-reference of synonyms/preferred terms/related terms and doesn’t support information security and retention requirements
• Organizational – doesn’t hold up to re-orgs or support cross-functional content and processes
• Functional – supports information security and retention requirements and is intuitive
• A hybrid of the last two, with subject incorporated into metadata, may be the best
Other Related Terminology
• Metadata – Data about data, or in this case, information
• Index values – the actual metadata values associated with a particular item
• Key words – user defined values in a keyword field or search terms used in a full text search
• Text mining – using a statistical inventory of the full contents of a library of items to define common terms
• Auto-classification – using analysis of the text and metadata associated with content to automatically assign index values to it, particularly classification values
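The last two terms can be illustrated with a toy example. This is a minimal sketch under stated assumptions: `term_inventory` stands in for text mining by counting words across a small library, and `auto_classify` stands in for auto-classification using simple keyword rules. Real tools typically use more sophisticated analysis, and the stop-word list, rules, and sample documents here are all hypothetical.

```python
from __future__ import annotations

import re
from collections import Counter

# Assumed, deliberately tiny stop-word list for the illustration.
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "to", "is", "in", "on"}


def term_inventory(documents: list[str]) -> Counter:
    """Text mining stand-in: count how often each term appears across the library."""
    counts: Counter = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Hypothetical rules mapping a term found in the text to a classification value.
RULES = {"invoice": "Accounts Payable", "payroll": "Payroll"}


def auto_classify(text: str, rules: dict[str, str], default: str = "Unclassified") -> str:
    """Auto-classification stand-in: assign the category of the first matching rule."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for term, category in rules.items():
        if term in words:
            return category
    return default


docs = [
    "Invoice for office supplies",
    "Payroll summary for March",
    "Invoice for the payroll vendor",
]
inventory = term_inventory(docs)
print(inventory.most_common(3))
print(auto_classify(docs[1], RULES))
```

In this sketch the term inventory is what would suggest candidate rules, and the rules are what assign the classification value as an index value on the item.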
Good Metadata/Classification Criteria
• Should support all the governance needs for it
• Should describe the content itself, within its context
• Should allow users to search with the information they already have
• Should be enough metadata to identify each item uniquely AND NO MORE
• Should be immutable (not change over the life of the item)
• Should be intuitive/understood by a novice
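The "enough metadata to identify each item uniquely AND NO MORE" criterion can be checked mechanically against a sample of items. The sketch below is one possible way to do that; the field names and sample records are hypothetical, and a field flagged as redundant for uniqueness may still be needed for governance reasons such as security or retention.

```python
from __future__ import annotations

from collections import Counter


def identifies_uniquely(items: list[dict], fields: list[str]) -> bool:
    """True if no two items share the same combination of values for `fields`."""
    keys = Counter(tuple(item[f] for f in fields) for item in items)
    return all(count == 1 for count in keys.values())


def redundant_fields(items: list[dict], fields: list[str]) -> list[str]:
    """Fields that could be dropped while the remaining ones still identify every item."""
    return [
        f for f in fields
        if len(fields) > 1
        and identifies_uniquely(items, [g for g in fields if g != f])
    ]


# Hypothetical sample records (assumed for illustration only).
items = [
    {"function": "Payroll", "doc_type": "Timesheet", "period": "2017-01", "employee_id": "E1"},
    {"function": "Payroll", "doc_type": "Timesheet", "period": "2017-01", "employee_id": "E2"},
    {"function": "Payroll", "doc_type": "Timesheet", "period": "2017-02", "employee_id": "E1"},
]
fields = ["function", "doc_type", "period", "employee_id"]

print(identifies_uniquely(items, fields))
print(redundant_fields(items, fields))
```

A small sample can make a field look redundant when it is not, so a check like this is better treated as a prompt for discussion with the SMEs than as a decision.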
Things to Avoid
• Folders
• Team member names (except in the Author field)
• Status fields
• Due dates (work triggers)
• Technology specific terms (“PDF” files)
• Duplicating data from another system
• Too many labels
Model Development Process
• Define scope…how will the information be used and by whom? What policies and regulations apply?
• Gather Subject Matter Experts/Stakeholders
• Start with your Business Classification Scheme or Functional Hierarchy
• Determine any system limitations
• Research existing (prebuilt) models
Model Development Process: Enterprise-wide
Collect all existing organization models including: Records Retention Schedule, Organizational Charts, Security classifications, Information inventories, Budget, File plans, Business process documentation
Build Model
• Determine how it will be documented
• Build a thesaurus
• Figure out how to handle items that don’t fit – create new category or categorize as “other”
• Decide if you will supplement it with folksonomy (keywords, comments) or tagging
• Define ongoing responsibilities
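The thesaurus and the "other" fallback can be sketched together. This is a minimal illustration, assuming a thesaurus that maps each preferred term to its synonyms; the terms, the `build_lookup` and `categorize_all` helpers, and the sample input are hypothetical.

```python
from __future__ import annotations

from collections import Counter

# Hypothetical thesaurus: preferred term -> synonyms users might type instead.
THESAURUS = {
    "Invoice": {"bill", "invoice", "statement of charges"},
    "Contract": {"agreement", "contract"},
}


def build_lookup(thesaurus: dict[str, set[str]]) -> dict[str, str]:
    """Flatten the thesaurus into a case-insensitive synonym -> preferred term map."""
    lookup: dict[str, str] = {}
    for preferred, synonyms in thesaurus.items():
        for term in synonyms | {preferred}:
            lookup[term.strip().lower()] = preferred
    return lookup


def preferred_term(term: str, lookup: dict[str, str], fallback: str = "Other") -> str:
    """Return the preferred term, or the fallback category when nothing fits."""
    return lookup.get(term.strip().lower(), fallback)


def categorize_all(terms: list[str], lookup: dict[str, str]) -> tuple[dict[str, str], Counter]:
    """Categorize every term and tally the ones that fell into the fallback."""
    assigned: dict[str, str] = {}
    unmatched: Counter = Counter()
    for term in terms:
        category = preferred_term(term, lookup)
        assigned[term] = category
        if category == "Other":
            unmatched[term.strip().lower()] += 1
    return assigned, unmatched


LOOKUP = build_lookup(THESAURUS)
assigned, unmatched = categorize_all(["Bill", "agreement", "memo", "Memo"], LOOKUP)
print(assigned)
print(unmatched.most_common())
```

Tallying what lands in "Other" gives whoever owns the model a concrete basis for deciding when a new category is warranted.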
Implement Model
• Test
  – Can users find what they need to do their jobs?
  – Does everything fit?
  – Can you apply policy/regulations?
• Modify
• Implement
• Update and Review – REGULARLY
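The first test question, whether users can find what they need, can be turned into repeatable checks. The sketch below assumes each test case pairs the metadata a user would plausibly search with and the item they expect to get back; the items, field names, and cases are hypothetical.

```python
from __future__ import annotations

# Hypothetical categorized sample items (assumed for illustration only).
ITEMS = [
    {"id": "doc-1", "function": "Payroll", "doc_type": "Timesheet", "year": 2017},
    {"id": "doc-2", "function": "Payroll", "doc_type": "Policy", "year": 2016},
    {"id": "doc-3", "function": "Procurement", "doc_type": "Contract", "year": 2017},
]


def search(items: list[dict], **criteria) -> list[str]:
    """Return the ids of items whose metadata matches every search criterion."""
    return [
        item["id"] for item in items
        if all(item.get(key) == value for key, value in criteria.items())
    ]


def run_findability_tests(items: list[dict], cases: list[tuple[dict, str]]) -> list[tuple[dict, str]]:
    """Return the cases where the expected item was NOT found by the given criteria."""
    failures = []
    for criteria, expected_id in cases:
        if expected_id not in search(items, **criteria):
            failures.append((criteria, expected_id))
    return failures


# Each case: (what the user knows and searches with, the item they expect).
CASES = [
    ({"function": "Payroll", "doc_type": "Timesheet"}, "doc-1"),
    ({"function": "Procurement", "year": 2017}, "doc-3"),
    # A user who assumes contracts are filed under "Legal" will not find doc-3.
    ({"function": "Legal", "doc_type": "Contract"}, "doc-3"),
]

failures = run_findability_tests(ITEMS, CASES)
print(failures)
```

A failure here does not say whether the model or the user's expectation is wrong; it flags a mismatch to resolve in the Modify step, and rerunning the same cases supports the regular review.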
Handout Exercises
1. Put the items on your list into groups and name the groups. Then come up with a name for everything together.
2. Pick one of your groups and define fields someone might use to find items in that group (maximum of 10).
3. Review your results. Is there anything you want to change?
Conclusion
• It’s just another big standardization project with potential for significant impact on business productivity
• It requires varied SMEs
• Questions?
• [email protected]
• @rec_rocker