document repositories-and-metadata

Copyright © 2010 Earley & Associates Inc. All Rights Reserved. Document Repositories & Metadata. Richard Beatch, Earley & Associates

Upload: earley-amp-associatesinc

Post on 22-Jan-2015


TRANSCRIPT

  • 1. Document Repositories & Metadata Richard Beatch Earley & Associates

2. About Earley & Associates, Inc.

    • Focus: Information Architecture (IA) Services
    • Founded: 1994
    • Personnel: Twenty core team consultants, plus a network of other top industry experts
      • ECM and KM experts
      • taxonomy specialists
      • search experts
      • information architects
      • usability professionals
      • technology consultants
      • business process experts
    • Headquarters: Boston, MA
    • Consulting Philosophy:
      • Organizing Principles based on business context and goals
      • Four Pillars - People, Content, Process, and Technology

3. Core Capabilities

  • Enterprise Search, Portal Design, Collaboration
  • Web Content Management
  • Workflow Management
  • Security & Privacy Management
  • Rights Management
  • Records Management
  • Website Navigation, Search & SEO
  • Digital Asset Management
  • Taxonomy, Metadata, & Usability

4. Core Capabilities

  • Document/Content Management:
    • Strategy and requirements planning
    • Taxonomy, Metadata, Object modeling
    • Audit and analysis
    • Migration
    • Tagging and indexing
    • Lifecycle and workflow planning
    • Technology selection, RFP development
    • Governance
  • Taxonomy & Metadata:
    • Taxonomy strategy
    • Taxonomy development (for e-commerce, faceted search, ECM, DAM, enterprise taxonomy, thesauri)
    • Taxonomy evaluation and testing
    • Taxonomy implementation
    • Taxonomy governance and training
    • Taxonomy tool selection
    • Metadata standards development
    • Metadata schema design
    • Metadata governance
  • Digital Asset Management:
    • DAM strategy
    • DAM taxonomy
    • DAM technology evaluation
    • Asset lifecycle management
    • Marketing resource management (MRM)
  • Information Architecture/Usability:
    • Usability studies (site, navigation, taxonomy)
    • Wireframes and IA design
  • Search:
    • Search audit and user testing
    • Search strategy and ROI analysis
    • Taxonomy for faceted search and search optimization
    • Search deployment
    • Search and business intelligence
    • Search tuning and SEO
    • Search technology evaluation/tool selection

5. About Me

  • Richard Beatch
    • Senior Consultant at Earley & Associates, Inc.
    • Ph.D. in Ontology
    • Specialized in Taxonomy, Search, Metadata, and content architecture.
    • Extensive industry experience leading the implementation and design of taxonomies and search solutions for a range of companies including Apple, McAfee, Allstate, Dell, and AT&T.
    • Blog: http://sethearley.wordpress.com/

6. The Challenge

  • Suppose you have roughly 1 million scanned documents entering your document management system each week
  • Suppose you want users to be able to find them in the future so as to conduct your business
  • Suppose it is 2001

7. The result

  • H:\DocStore\California\Claims\Auto\PoliceRep\Photos\DR65876KL
  • H:\DocStore\California\Claims\Auto\PoliceRep\Photos\DR64876KL
  • H:\DocStore\California\Claims\Auto\PoliceRep\Photos\DR64879DL
  • H:\DocStore\California\Claims\Auto\PoliceRep\Photos\DW72876KL
  • Multiplied by (roughly) 250K each week
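
To make the problem concrete, here is a minimal Python sketch of what a user (or a migration script) has to do to recover metadata from such a path. It assumes the original paths were backslash-separated, as in `H:\DocStore\California\Claims\Auto\PoliceRep\Photos\DR65876KL`; the field names in `FIELDS` and the function `metadata_from_path` are hypothetical labels chosen for illustration, since the slides do not name the folder levels.

```python
from pathlib import PureWindowsPath

# Hypothetical names for each folder level below the DocStore root.
# The slides do not name these levels; they are assumptions for illustration.
FIELDS = ["state", "department", "line_of_business", "source", "content_type"]

def metadata_from_path(path: str) -> dict:
    """Recover the metadata that is implied only by folder position."""
    parts = PureWindowsPath(path).parts       # ('H:\\', 'DocStore', 'California', ...)
    folders, doc_id = parts[2:-1], parts[-1]  # skip the drive and the DocStore root
    record = dict(zip(FIELDS, folders))       # pair each folder with its assumed field
    record["document_id"] = doc_id
    return record

record = metadata_from_path(
    r"H:\DocStore\California\Claims\Auto\PoliceRep\Photos\DR65876KL"
)
print(record)
```

Everything the system knows about the document lives in the folder order, so any change to the folder layout silently changes the meaning of every path.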

8. Why should I care about access anyway?

  • Reuse of content
  • Access in order to do business, e.g., process an insurance claim
  • Access for regulatory needs
  • In short, to either generate revenue or save money

9. How do we access this information now?

  • Ad hoc mechanisms
    • File shares
    • Snail mail/sending CDs
    • Email

10. Why ad hoc approaches still fail:

  • Intricacies of:
    • file formats
    • digital rights
    • time to transfer content

    • Wow, what a cool photo. Can I reuse it?
    • Yeah, sure. Let me get a copy and send it over.

11. Document Management to the Rescue

  • Frustration with ad hoc sharing bubbles up to the surface
  • Business recognizes the need and the potential cost savings over time

We need a document management system

12. But we all know the answer:

  • A database and metadata!

13. But how do you expect me to find content?

  • Folder and file labels shown: Print, Websites, Social Media, Ted's Print Projects 2009, Home_.html, Facebook new ideas

14. Taxonomy & Metadata For Findability

  • Type: Magazine Advertisement
  • Channel: Print
  • Target Demographic: Parents
  • Country: US
  • Language: Spanish
  • Concept: Rebellion
  • Brand: Settletra

  • Do your kids:
    • Have discipline problems?
    • Have trouble paying attention in school?
    • Have trouble getting along with others?
  • Maybe it's time to find out how Settletra can help
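
As a rough illustration of why tagging the advertisement this way helps findability, the Python sketch below stores the slide's metadata values as a record and retrieves it by any combination of fields. The second asset, the lower-case field names, and the `find` helper are hypothetical and exist only to show the contrast with folder browsing.

```python
# Metadata for the advertisement, using the field values shown on the slide.
assets = [
    {
        "type": "Magazine Advertisement",
        "channel": "Print",
        "target_demographic": "Parents",
        "country": "US",
        "language": "Spanish",
        "concept": "Rebellion",
        "brand": "Settletra",
    },
    # Hypothetical second asset, added only so the filter has something to exclude.
    {
        "type": "Banner Advertisement",
        "channel": "Web",
        "target_demographic": "Parents",
        "country": "US",
        "language": "English",
        "concept": "Rebellion",
        "brand": "Settletra",
    },
]

def find(items, **criteria):
    """Return the items whose metadata matches every given field/value pair."""
    return [
        item for item in items
        if all(item.get(field) == value for field, value in criteria.items())
    ]

matches = find(assets, channel="Print", language="Spanish")
print(matches)
```

The same record can be reached through brand, concept, or demographic, with no dependence on where the file happens to be stored.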

15. Metadata: a refresher

  • Structured data that describes the attributes of an information package (Taylor, 1994)
  • Helps manage & share information
  • Helps find information
  • Metadata can be applied at any level (diagram labels: Document, Component, Data, Library, 2009)

16. I am metadata

17. Types of metadata

  • Descriptive: What is it? What is it about? What is it called?
  • Administrative: When was it created? Who owns it? What's its status?
  • Structural: What parts does it have?
  • Taxonomy can apply in any category

18. Types of metadata

  • Descriptive: Subject, Title, Document type, Description
  • Administrative: Date created, File type, Review date, Publication Status
  • Structural: Is_Part_Of, Requires, Parent_Object
  • Taxonomy can apply in any category

19. Taxonomy as metadata

  • Taxonomy is applied to content as metadata
    • Describes
      • Is-ness
      • About-ness
  • Diagram labels:
    • Taxonomy: Item Types (Press: Press Releases, Logos, Press Kits); Brands (ELAVIL, IRESSA)
    • Metadata: Document type: Document; Item Type: Press Release ("is a"); Brand: IRESSA ("is about"); Document name: IRESSA Recommended...; Date created: May-15-2009

20. Uses for Metadata

  • Identification
  • Discovery
  • Structural
  • Rights
  • Product

21. Identification

  • Globally unique identifiers
  • Single or federated registries (directories)
  • Choice of what to identify
    • Abstract piece of IP
    • Manifestation of work (US version, German version, etc.)
    • Individual copy
  • General or content type-specific
  • Examples:
    • Book publishing: ISBN, ISTC
    • Journal publishing: ISSN
    • Video content: ISAN
    • Music: ISWC, ISRC, ISMN, GRid
    • Broadcast industry: UMID
    • All content types: DOI, Handle
    • Internet resources: URL, URN, URI
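
Many of these identifier schemes build in a check digit so that typos can be caught before a lookup is attempted. As one example, the Python sketch below validates an ISBN-13 using its standard checksum (digits weighted alternately by 1 and 3, with the total divisible by 10). The function name `isbn13_is_valid` and the sample number are illustrative choices, not something taken from the slides.

```python
def isbn13_is_valid(isbn: str) -> bool:
    """Check the ISBN-13 checksum; hyphens and spaces are ignored."""
    if any(not (ch.isdigit() or ch in "- ") for ch in isbn):
        return False
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    if len(digits) != 13:
        return False
    # Weights alternate 1, 3, 1, 3, ... starting with the first digit.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_is_valid("978-0-306-40615-7"))  # sample number with a correct check digit
print(isbn13_is_valid("978-0-306-40615-8"))  # same number with the last digit altered
```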

22. Discovery

  • Enable searching, querying, categorization
  • Basic identifying information
  • Descriptive metadata
  • Examples
    • Identifying information from Dublin Core schema: Title Creator Publisher Format
    • Descriptive information from Dublin Core schema: Subject Description
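
A minimal sketch of what such a record might look like when serialized: the Python below writes the six Dublin Core elements named above into XML using the Dublin Core element namespace. The title, creator, and publisher values come from this deck; the `format`, `subject`, and `description` values, the `metadata` wrapper element, and the helper name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

record = {
    # Identifying information
    "title": "Document Repositories & Metadata",
    "creator": "Richard Beatch",
    "publisher": "Earley & Associates",
    "format": "application/pdf",            # assumed value
    # Descriptive information
    "subject": "Metadata",                  # assumed value
    "description": "Slides on metadata for document repositories",  # assumed value
}

def to_dublin_core_xml(fields: dict) -> str:
    """Serialize a flat dict of Dublin Core elements to an XML string."""
    root = ET.Element("metadata")           # wrapper element name is arbitrary
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = to_dublin_core_xml(record)
print(xml_text)
```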

23. Discovery Standards

  • Basic bibliographic: Dublin Core
  • Books: ONIX
  • Magazine articles (print & online): PRISM
  • Journal articles (online): CrossRef
  • News stories: NewsML
  • Educational content: LOM
  • Images: TIFF, DIG35
  • Music: MUZE, AMG

24. Structural

  • Describe logical structure of content
    • Ideally without defining output appearance
  • Allow content to be fed to predefined templates for production & distribution
  • Replacements for old markup languages (TROFF, SCRIPT, etc.)
  • Examples
    • From NITF tagset: [sic]

25. Structural Standards

  • Web pages: XHTML (HTML that can be validated through an XML parser)
  • News stories: NITF
  • E-books: IDPF OPS/OPF
  • Technical documentation (book form): DocBook
  • Technical documentation (modular): DITA
  • Multimedia: SMIL/MMS
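
The practical difference for XHTML can be sketched with Python's standard XML parser. Note that this checks only well-formedness, which is the precondition for validation, not conformance to the XHTML schema itself; the sample markup and the helper name `is_well_formed` are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Claim photo</title></head>"
    "<body><p>Hello<br/></p></body></html>"
)
# Typical tag-soup HTML: <br> and <p> are never closed.
sloppy_html = "<html><body><p>Hello<br></body></html>"

print(is_well_formed(xhtml))
print(is_well_formed(sloppy_html))
```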

26. Rights

  • Establish rights that can be conveyed to user
  • Define rights that you own or can grant
  • Examples
    • From ODRL 1.1 Permission Elements: display print play execute sell lend give lease modify excerpt
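
One way to picture how these permission elements are used: a rights record grants a subset of them, and a system checks a requested action against that subset. The Python sketch below uses the permission names listed above as a plain vocabulary; it does not reproduce ODRL's actual XML syntax, and the `grant` and `is_permitted` helpers are hypothetical.

```python
# Permission names as listed on the slide (ODRL 1.1 permission elements).
ODRL_PERMISSIONS = {
    "display", "print", "play", "execute", "sell",
    "lend", "give", "lease", "modify", "excerpt",
}

def grant(*permissions: str) -> frozenset:
    """Build a rights grant, rejecting names outside the vocabulary."""
    unknown = set(permissions) - ODRL_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    return frozenset(permissions)

def is_permitted(granted: frozenset, action: str) -> bool:
    """Check whether a requested action is covered by the grant."""
    return action in granted

photo_rights = grant("display", "print", "excerpt")
print(is_permitted(photo_rights, "print"))
print(is_permitted(photo_rights, "modify"))
```

Restricting grants to a controlled vocabulary is what lets different systems interpret the same rights record consistently.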

27. Rights Standards

  • DRM-based distribution: ODRL, MPEG REL/XrML
  • Website indexing/search: ACAP
  • Image licensing: PLUS
  • Downstream reuse rights: Creative Commons

28. Product

  • Describe characteristics of product
    • Physical or appearance
    • Marketing
  • Allow separation of content from product
  • Examples
    • From ONIX:
  • Product metadata standard: ONIX (books)

29. The Holy Grail

  • Taxonomy & Metadata
  • Governance & Content Strategy
  • submission
  • retrieval

30. Why stop there?

31. Perhaps we can do better

  • This is ALL just metadata
  • Different users can focus on what is valuable to them:
    • Price
    • Optical zoom
    • Megapixels
  • The good news: this used to cost a fortune. Not anymore.
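
A small Python sketch of the idea, assuming a handful of made-up camera records: each facet (price band, optical zoom, megapixels) is just a metadata field, so counting values per facet and refining by a selected value need nothing beyond the metadata itself. The product data, the price bands, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical product records; every attribute is simply metadata.
cameras = [
    {"model": "A", "price": 199, "optical_zoom": "3x", "megapixels": 10},
    {"model": "B", "price": 349, "optical_zoom": "10x", "megapixels": 12},
    {"model": "C", "price": 329, "optical_zoom": "10x", "megapixels": 10},
]

def price_band(price: int) -> str:
    """Bucket a raw price into a facet value (bands are arbitrary)."""
    return "under $300" if price < 300 else "$300 and up"

def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(item[facet] for item in items)

def refine(items, **selected):
    """Keep only the items matching every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selected.items())
    ]

zoom_counts = facet_counts(cameras, "optical_zoom")
band_counts = Counter(price_band(c["price"]) for c in cameras)
narrowed = refine(cameras, optical_zoom="10x", megapixels=10)
print(zoom_counts, band_counts, [c["model"] for c in narrowed])
```

One user can narrow by price while another narrows by zoom, and both are working from the same records.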

32. Conclusion

  • Managing large and changing document repositories is challenging.
  • File stores and databases alone cannot provide for genuine findability.
  • Semantically rich metadata can provide for findability through search.
  • Shifts in the costs of faceted navigation make eCommerce-style searching a real option within the enterprise.

33. Communities & Events

  • Communities of Practice
    • Taxonomy: www.finance.groups.yahoo.com/group/TaxoCoP
    • SharePoint IA: www.tech.groups.yahoo.com/group/SharePointIACoP
    • Search: www.tech.groups.yahoo.com/group/SearchCoP
  • Upcoming Webinars
    • Taxonomy Community of Practice series
      • www.earley.com/webinars
    • Technology Showcase series
      • www.earley.com/webinars/technology-showcase
    • Jumpstarts
      • www.earley.com/webinars/jumpstarts

34. Communities & Events

  • SharePoint IA Group: http://tech.groups.yahoo.com/group/SharePointIACoP/
  • Taxonomy Group: http://finance.groups.yahoo.com/group/TaxoCoP
  • Search Group: http://tech.groups.yahoo.com/group/SearchCoP
  • Upcoming Taxonomy Community of Practice Webinars
      • May 5, 2010 Cross-Channel Brand Management
      • June 2, 2010 Mega Menus
      • July 7, 2010 Taxonomy for SharePoint 2010
  • Upcoming Vendor Showcase Webinars
      • March 30, 2010 SharePoint Search
      • May 11, 2010 Optimizing Search with FAST
  • Visit www.earley.com/webinars for upcoming schedules and archives.

Communities of Practice

35. For Additional Reading

  • Conquering Chaos via Smart Content Management http://www.earley.com/knowledge/articles/conquering-chaos-via-smart-content-management
  • Tips for Keyword Research http://www.earley.com/knowledge/articles/tips-for-keyword-research
  • Measuring the Success of a Taxonomy Project http://www.earley.com/knowledge/whitepapers/measuring-the-success-of-a-taxonomy-project
  • Retrospective Indexing: Strategies for Cataloging Legacy Content http://www.earley.com/knowledge/whitepapers/retrospective-indexing-strategies-cataloging-legacy-content
  • Designing for Faceted Search http://www.earley.com/knowledge/articles/designing-faceted-search
  • Search & Taxonomy - Leveraging Metadata to Return Content in Context http://www.earley.com/knowledge/articles/search-and-taxonomy-leveraging-metadata-to-return-content-in-context

36. Questions