Data Archive Considerations for Customer Communication Management
DESCRIPTION
In today’s multi-channel customer communications environment, what must be archived and the amount of data to be archived are both growing exponentially. This presentation discusses why the issue is exploding and addresses technical aspects of developing and implementing a CCM archival solution.
TRANSCRIPT
Slide 1 Copyright © 2011 Stephen D. Poe
Where Will All the Data Go?
Stephen D. Poe, EDP, CSM, CSPO
Nautilus Solutions
9 June 2011
Slide 2
Where Will All the Data Go? Our Agenda
• The Problem
• Solutions
• Technical Questions
• Planning Issues
Slide 3
The Problem
Slide 4
How Big is the Problem?
• Overall
  – In 2003, UC Berkeley estimated 5 exabytes of new data stored on digital drives
    • 1 petabyte = 1,000 terabytes
    • 1 exabyte = 1,000 petabytes
  – In 2008, IDC estimated 281 exabytes of digital information was created and replicated globally
    • That’s 45GB for each person on earth
• Specific examples
  – Internet traffic in March 2010 was estimated at 21 exabytes
  – Email storage now commonly 25GB per user
  – Individual statement (AFP) used to average perhaps 10-15KB per statement
    • Now several MB per statement
      – Color, more graphics
    • What happens when your online statement includes personalized audio and video?
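The per-person figure above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch using the slide's decimal units (1 exabyte = 1,000 petabytes). The world population of about 6.7 billion in 2008 is an assumption that does not come from the slide; with it the result lands at roughly 42GB, the same order as the 45GB quoted.

```python
# Decimal storage units, as defined on the slide.
GB = 10**9
TB = 1_000 * GB
PB = 1_000 * TB
EB = 1_000 * PB

# ASSUMPTION: world population of about 6.7 billion in 2008 (not from the slide).
WORLD_POPULATION_2008 = 6_700_000_000


def per_person_gb(total_bytes: int, population: int) -> float:
    """Average amount of data per person, expressed in GB."""
    return total_bytes / population / GB


idc_2008_bytes = 281 * EB  # IDC estimate quoted on the slide
print(f"{per_person_gb(idc_2008_bytes, WORLD_POPULATION_2008):.1f} GB per person")
```

The exact answer depends on the population figure chosen, so the slide's 45GB should be read as an order-of-magnitude statement.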
Slide 5
How Big is the Problem?
• The number of files is growing even faster
  – Average file size is shrinking
    • No longer just large print files
    • Emails, IM log, single tweet, QR request
  – Example: storing 1 TB
    • 1,000 1GB production files
    • 1,000,000,000 1KB email files
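The 1 TB example can be worked through as follows. This is a sketch only: the file counts come from the slide, while the per-file bookkeeping cost (`PER_FILE_OVERHEAD_BYTES`, covering an index entry and file-system metadata) is a hypothetical value chosen to show why the number of files matters as much as the total volume.

```python
KB = 10**3
GB = 10**9
TB = 10**12

# ASSUMPTION: hypothetical bookkeeping cost per stored file (index entry,
# file-system metadata). The real value depends entirely on the archive system.
PER_FILE_OVERHEAD_BYTES = 256


def files_needed(total_bytes: int, file_bytes: int) -> int:
    """Number of equally sized files needed to hold total_bytes."""
    return total_bytes // file_bytes


def overhead_fraction(total_bytes: int, file_bytes: int) -> float:
    """Bookkeeping overhead as a fraction of the payload stored."""
    count = files_needed(total_bytes, file_bytes)
    return count * PER_FILE_OVERHEAD_BYTES / total_bytes


for label, size in (("1GB production files", GB), ("1KB email files", KB)):
    count = files_needed(TB, size)
    print(f"{count:,} {label}, overhead {overhead_fraction(TB, size):.4%}")
```

Under this assumed overhead, the same terabyte costs almost nothing to track as large print files and a sizeable fraction of the payload as small email files.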
Slide 6
Multi-Channel World
• You archive your customer correspondence
  – Bills, statements, notices
    • Only 24% of US bank account holders have gone paperless
    • 37% say they will never go paperless (Forrester)
• How about new multi-channel messages?
  – Email, instant messages, mobile, video, voice, Tweets, and blog posts
    • Instant messages, Twitter posts and blog posts are not archived in 80% of the organizations using them.
• All may be discoverable
  – They may need to be stored
Slide 7
Solutions
Slide 8
Framing the Archive Issue
• Our archives must meet:
  – All legal and regulatory requirements
    • to hold all required electronic documents
    • for the mandated length(s) of time
  – in a cost effective manner
  – with a defensible plan to manage them
• Ensuring that, when required, we can reproduce the ‘original’
  – enough to satisfy a judge
Slide 9
Archival System Components
• Storage Format(s)
  – Multiple, and growing
• Archival system
  – Hardware
  – Software
  – Network
• Retrieval/display software
  – Network
• Process and procedures
Slide 10
Archive Drivers
Source: AIIM ECM state of the Industry study 2010
Slide 11
Archive Projects
No plans, 5%
In next 12 months, 16%
Departmental, 24%
Across departments, 15%
Implementing enterprise, 28%
Completed enterprise, 12%
Source: AIIM ECM state of the Industry study 2010
Slide 12
Technical Questions
Slide 13
What Do You Have Now?
• ECM, WCM, MCM, repositories of record, archives
  – How many silos in-house already?
  – Who owns which data?
• Where should we keep it all?
  – Single repository for all data, all formats?
  – Separate repositories specialized for each?
Slide 14
Example - Storing Emails
Slide 15
Storage & Admin & Overhead, Oh My!
• Storage may be cheap
  – Management and ‘-ilities aren’t
• Metrics to think about
  – $/terabyte continues to fall
    • Perhaps $2000-$3000/TB for near-petabyte systems
  – Petabytes/IT Storage Administrator
    • Burdened labor overhead of perhaps $100K per admin
  – And overhead
    • Rent, electricity, cooling, security
  – What ‘Green Footprint’?
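The metrics above can be combined into a rough first-year cost estimate. This is a sketch under stated assumptions, not a costing method from the presentation: the $/TB range and the $100K admin figure come from the slide, while the petabytes-per-admin ratio, the facility overhead, and the 500 TB capacity are hypothetical placeholders to be replaced with your own numbers.

```python
from dataclasses import dataclass


@dataclass
class ArchiveCostModel:
    cost_per_tb: float        # hardware $/TB; the slide suggests perhaps $2,000-$3,000
    admin_cost: float         # burdened labor per admin per year; the slide suggests $100K
    pb_per_admin: float       # ASSUMPTION: petabytes one storage admin can manage
    facility_overhead: float  # ASSUMPTION: yearly rent, electricity, cooling, security

    def hardware(self, capacity_tb: float) -> float:
        """One-time storage purchase cost."""
        return capacity_tb * self.cost_per_tb

    def labor_per_year(self, capacity_tb: float) -> float:
        """Yearly admin cost, allowing fractional admins."""
        admins = (capacity_tb / 1_000) / self.pb_per_admin
        return admins * self.admin_cost

    def first_year_total(self, capacity_tb: float) -> float:
        """Hardware plus one year of labor and facility overhead."""
        return (self.hardware(capacity_tb)
                + self.labor_per_year(capacity_tb)
                + self.facility_overhead)


# Hypothetical 500 TB archive.
model = ArchiveCostModel(cost_per_tb=2_500, admin_cost=100_000,
                         pb_per_admin=0.25, facility_overhead=50_000)
capacity_tb = 500
print(f"Hardware:   ${model.hardware(capacity_tb):,.0f}")
print(f"Labor/year: ${model.labor_per_year(capacity_tb):,.0f}")
print(f"First year: ${model.first_year_total(capacity_tb):,.0f}")
```

Hardware is paid once while labor and overhead recur every year, so over a long retention period the recurring terms tend to dominate.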
Slide 16
The Cloud
• Remember ASPs?
  – Review pros and cons
• In-house vs. outsourced
  – Where outsourced?
• Regulatory environment
  – Will this data ever cross a trans-national boundary?
• Recent Amazon.com outage
  – 4 days down
  – 98.9% annual up time
  – What are the SLAs?
  – What are the penalties?
  – But could you do better in-house?
• Corporate level of risk
  – To allow corporate data to be held off-site
  – But is it any safer in-house?
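The 98.9% figure follows directly from four days of downtime in a 365-day year. The sketch below shows that calculation and its inverse, the downtime a given SLA percentage permits. The 99.9% and 99.99% targets in the loop are illustrative values, not SLAs quoted in the presentation.

```python
HOURS_PER_YEAR = 365 * 24


def annual_uptime_pct(downtime_hours: float) -> float:
    """Uptime percentage for a year with the given hours of downtime."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


def allowed_downtime_hours(sla_pct: float) -> float:
    """Hours of downtime per year permitted by an SLA percentage."""
    return HOURS_PER_YEAR * (1 - sla_pct / 100)


# Four days down, as on the slide.
print(f"{annual_uptime_pct(4 * 24):.1f}% uptime")

# Illustrative SLA targets (assumptions, not from the slide).
for sla in (99.9, 99.99):
    print(f"{sla}% SLA allows {allowed_downtime_hours(sla):.2f} hours down per year")
```

Comparing the permitted downtime against your own in-house record is one way to approach the slide's question of whether you could do better yourself.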
Slide 17
Legal
• Compliance with rules and regulations
  – Especially with evolving regulations
  – Joint legal/IT taskforce to keep up with changes?
• International considerations
  – EU privacy rules considerably tighter
  – Conflicts
    • Limited or no sharing of data across borders
    • US discovery laws vs. EU privacy directives
Slide 18
Preserving Your Data
• How long do you need to archive
  – Legal and regulatory requirements
    • 7 years – 100 years
• Average lifespan of a format & reader software
  – Perhaps 2-3 major OS upgrades
• Look at PDF/A for possible format
  – ISO standard for very long term archive & retrieval
  – Good for some (but not all) documents
Slide 19
Finding Your Data
• Key indices
  – Good enough in the past
    • For legacy applications on older data
• Structured taxonomies
  – If you develop the taxonomies before designing the archive
• The New Search
  – Full text search is a goal
  – What does that mean against several petabytes of data?
• Metadata
  – Exceptionally valuable
  – Usually exceptionally expensive, especially to retrofit
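The difference between key indices and full text search can be illustrated with a toy example. This is a minimal sketch, not how any particular archive product works: the three documents, their field names, and the helper functions are all hypothetical. A key index maps one chosen field to documents, while a full-text (inverted) index maps every word to documents, which is why it grows with the content itself.

```python
from collections import defaultdict

# Hypothetical archived items; the field names are illustrative only.
documents = {
    "stmt-001": {"account": "12345", "text": "March statement balance due"},
    "mail-002": {"account": "12345", "text": "Question about my March balance"},
    "mail-003": {"account": "67890", "text": "Please close my account"},
}


def build_key_index(docs, key):
    """Key index: one entry per distinct value of a single field."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        index[doc[key]].add(doc_id)
    return index


def build_fulltext_index(docs):
    """Inverted index: one entry per distinct word in the text."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in doc["text"].lower().split():
            index[word].add(doc_id)
    return index


def search_all_words(index, query):
    """Return the documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


key_index = build_key_index(documents, "account")
fulltext_index = build_fulltext_index(documents)
print(sorted(key_index["12345"]))
print(sorted(search_all_words(fulltext_index, "march balance")))
```

The key index here has two entries while the full-text index has one per distinct word, a gap that widens sharply at petabyte scale.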
Slide 20
Planning Issues
Slide 21
The ‘-ilities
• The ‘-ilities
  – Usability, reliability, maintainability, scalability, availability, extensibility, security, portability
• Difference between a system and a success
  – Requires long term commitment to people, process, and standards
  – Set standards, define metrics, monitor and fix issues as they arise
Slide 22
Archive Planning
• Detailed knowledge of what is to be archived
  – Current & future production processes
  – Legacy data and documents
  – Current multi-channel and social media
  – Future data and documents?
• Detailed knowledge of how it will be used
  – By whom
  – On what platform(s)?
  – For what purposes
Slide 23
Archive Planning
• Archive system design
  – Implementation
  – Maintenance & upgrades
  – Be flexible – things will change
• Corporate processes and procedures
  – Satisfy the ‘-ilities
  – Continue to meet the business goals
  – Plan for regular review and transitions
Slide 24
Archive Planning – A Checklist
• Develop the business plan
  – Business goals, business case, costs, funding, project management
• Technology review
  – Time estimates, requirement gathering, analyze, plan, get consensus
• Develop policies and process
  – Define processes, people, standards, tools, technologies, metrics
• Develop Project Plan
  – A PM is a good idea
• Gap Analysis and build underlying foundation
  – Environment, platforms, skill sets, enterprise architecture
• Develop plan details
  – Implement, test, modify
• Maintenance
Slide 25
For More Information
Stephen D. Poe, EDP
Nautilus Solutions
+1.214.532.0443