© Copyright 11/11/08 and prior years by Data Blueprint - all rights reserved! - datablueprint.com
The Evolution of Data Management
(a 25+ year study)
Peter Aiken
• DoD Computer Scientist– Reverse Engineering Program Manager/Office of the Chief Information Officer (1992-1997)
• Visiting Scientist
– Software Engineering Institute/Carnegie Mellon University (2001-2002)
• DAMA International Advisor/Board Member (http://dama.org)
– 2001 DAMA International Individual Achievement Award (with Dr. E. F. "Ted" Codd)
– 2005 DAMA Community Award
• Founding Advisor/International Association for Information and Data Quality (http://iaidq.org)
• Founding Advisor/Meta-data Professionals Organization (http://metadataprofessional.org)
• Founding Director Data Blueprint 1999
• Full time in information technology since 1981
• IT engineering research and project background
• University teaching experience since 1979
• Seven books and dozens of articles
• Research Areas – reengineering, data reverse engineering, software requirements engineering, information engineering, human-computer interaction, systems integration/systems engineering, strategic planning, and DSS/BI
• Director
– George Mason University/Hypermedia Laboratory (1989-1993)
• Published Papers– Communications of the ACM, IBM Systems Journal, InformationWEEK, Information & Management, Information
Resources Management Journal, Hypermedia, Information Systems Management, Journal of Computer Information Systems and IEEE Computer & Software
Dogs New Clothes
http://peteraiken.net
Contact Information:
Peter Aiken, Ph.D.
Department of Information Systems
School of Business
Virginia Commonwealth University
1015 Floyd Avenue - Room 4170
Richmond, Virginia 23284-4000
Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com
office: +1.804.883.759
cell: +1.804.382.5957
e-mail: [email protected]
http://peteraiken.net
September 21, 2004
Hmm …
Confusion
Correct Name: Yusuf Islam
TSA No Fly Listing: Youssouf Islam
15,000 want off the US terror watch list
• 15,000 people appealed to be removed from list
• 2,000 per month requesting removal
• TSA promised 30 day review process
• Actual time is 44 days
• American Civil Liberties Union estimates 1 million people on US government watch lists
US Terror Watch List Facts
• Fall 2008 comments:
– Fewer than 2,500 people on US "no-fly" list
– 10% of those are US citizens
– 16,000 people on "selectee" list (additional screening)
• Transfer responsibility of comparing names on lists from dozens of airlines to TSA
IT Project Failure Rates
Recent IT project failure rate statistics can be summarized as follows:
– Carr 1994
• 16% of IT Projects completed on time, within budget, with full functionality
– OASIG Study (1995)
• 7 out of 10 IT projects "fail" in some respect
– The Chaos Report (1995)
• 75% blew their schedules by 30% or more
• 31% of projects will be canceled before they ever get completed
• 53% of projects will cost over 189% of their original estimates
• 16% of projects are completed on-time and on-budget
– KPMG Canada Survey (1997)
• 61% of IT projects were deemed to have failed
– Conference Board Survey (2001)
• Only 1 in 3 large IT project customers were very "satisfied"
– Robbins-Gioia Survey (2001)
• 51% of respondents viewed their large IT implementation project as unsuccessful
– McDonald's Innovate (2002)
• Automate fast food network from fry temperature to # of burgers sold - $180M USD write-off
– Ford Everest (2004)
• Replacing internal purchasing systems - $200 million over budget
– FBI (2005)
• Blew $170M USD on suspected terrorist database - "start over from scratch"
http://www.it-cortex.com/stat_failure_rate.htm (accessed 9/14/02)
New York Times 1/22/05 pA31
Data Integration/Exchange Challenges
• Customer typically has had different meanings to different parts of the organization:
– Accounting -> organization that buys products or services
– Service -> client
– Sales -> prospect
• Assigning the same mission to the DoD ‘lines of business’ to “Secure the building” elicits very different results from each ‘line of business’:
– Army: Posts guards at all entrances and ensures no unauthorized access
– Navy: Turns out all the lights, locks up, and leaves
– Marines: Sends in a company to clear the building room-by-room; forms perimeter defense around the building
– Air Force: Signs three year lease with option to buy
[Second example courtesy of Burt Parker]
FBI & Canadian Social Security Gender Codes
1. Male
2. Female
3. Formerly male now female
4. Formerly female now male
5. Uncertain
6. Won't tell
7. Doesn't know
8. Male soon to be female
9. Female soon to be male
Hypothesized extensions contributed by a Chicago DAMA Member
10. Both soon to be female
11. Both soon to be male
12. Psychologically female, biologically male
13. Psychologically male, biologically female
If column 1 in source = "m"
• then set value of target data to "male"
• else set value of target data to "female"
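The rule above sends every source value other than "m" to "female", so an uppercase "M", a blank, or any of the other gender codes listed on this slide is silently miscoded. A minimal Python sketch of the difference, assuming a hypothetical two-entry code map and a hypothetical "unmapped" fallback (neither appears in the slides):

```python
def naive_transform(source_value: str) -> str:
    """The slide's rule: anything that is not exactly "m" becomes "female"."""
    return "male" if source_value == "m" else "female"


# Hypothetical explicit mapping; these codes are illustrative assumptions.
EXPLICIT_MAP = {"m": "male", "f": "female"}


def explicit_transform(source_value: str) -> str:
    """Map only known codes and flag everything else for review."""
    return EXPLICIT_MAP.get(source_value.strip().lower(), "unmapped")


if __name__ == "__main__":
    # Compare both rules on clean, dirty, and unexpected source values.
    for value in ["m", "f", "M", "", "5"]:
        print(repr(value), naive_transform(value), explicit_transform(value))
```

Flagging unknown values instead of defaulting them is one possible design choice; the point is only that the default branch of the original rule hides data quality problems.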
Platform: UniSys
OS: OS
1998 Age: 21
Data Structure: DMS (Network)
Physical Records: 4,950,000
Logical Records: 250,000
Relationships: 62
Entities: 57
Attributes: 1478
Predicting Engineering Problem Characteristics
New System
Legacy System #1: Payroll
Legacy System #2: Personnel
Platform: Amdahl
OS: MVS
1998 Age: 15
Data Structure: VSAM/virtual database tables
Physical Records: 780,000
Logical Records: 60,000
Relationships: 64
Entities: 4/350
Attributes: 683
Platform: WinTel
OS: Win'95
1998 Age: new
Data Structure: Client/Server RDBMS
Characteristics   Logical   Physical
Records:          250,000   600,000
Relationships:    1,034     1,020
Entities:         1,600     2,706
Attributes:       15,000    7,073
"Extreme" Data Engineering
• 2 person months = 40 person days
• 2,000 attributes mapped onto 15,000
• 2,000/40 person days = 50 attributes per person day
or 50 attributes/8 hours = 6.25 attributes/hour
and
• 15,000/40 person days = 375 attributes per person day
or 375 attributes/8 hours = 46.875 attributes/hour
• Locate, identify, understand, map, transform, document, QA at a rate of
• 52 attributes every 60 minutes or .86 attributes/minute!
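The arithmetic on this slide can be reproduced in a few lines. The sketch below assumes an 8-hour day and 40 person days, as the slide states; the function and variable names are illustrative only. Summing the two unrounded hourly rates gives 53.125 attributes per hour (about 0.89 per minute), slightly above the rounded 52 per hour and .86 per minute quoted on the slide.

```python
HOURS_PER_DAY = 8          # working hours per person day, per the slide
PERSON_DAYS = 40           # 2 person months, per the slide
SOURCE_ATTRIBUTES = 2_000  # legacy attributes to be mapped
TARGET_ATTRIBUTES = 15_000 # attributes in the new system


def rate_per_hour(attributes: int, person_days: int) -> float:
    """Attributes that must be handled per working hour."""
    return attributes / person_days / HOURS_PER_DAY


source_rate = rate_per_hour(SOURCE_ATTRIBUTES, PERSON_DAYS)
target_rate = rate_per_hour(TARGET_ATTRIBUTES, PERSON_DAYS)
combined_rate = source_rate + target_rate

print(f"source: {source_rate} attributes/hour")
print(f"target: {target_rate} attributes/hour")
print(f"combined: {combined_rate} attributes/hour, "
      f"{combined_rate / 60:.2f} attributes/minute")
```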
Why Data Projects Fail by Joseph R. Hudicka
• Assessed 1200 migration projects!
– Surveyed only experienced migration specialists who have done at least four migration projects
• The median project costs over 10 times the amount planned!
• Biggest Challenges: Bad Data; Missing Data; Duplicate Data
• The survey did not consider projects that were cancelled largely due to data migration difficulties
• "… problems are encountered rather than discovered"
[Chart: Median Project Expense vs. Median Project Cost; axis $0 to $500,000]
Joseph R. Hudicka "Why ETL and Data Migration Projects Fail" Oracle Developers Technical Users Group Journal June 2005 pp. 29-31
Link business objectives to technical capabilities
"Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities"
Aiken, P, Allen, M. D., Parker, B., Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment" IEEE Computer (research feature April 2007)
Data Management
A Model Specifying Relationships Among Important Terms
[Built on definition by Dan Appleton 1983]
[Diagram labels: Fact, Meaning, Data, Request, Information, Use, Intelligence]
1. Each FACT combines with one or more MEANINGS.
2. Each specific FACT and MEANING combination is referred to as a DATUM.
3. An INFORMATION is one or more DATA that are returned in response to a specific REQUEST.
4. INFORMATION REUSE is enabled when one FACT is combined with more than one MEANING.
5. INTELLIGENCE is INFORMATION associated with its USES.
Wisdom & knowledge are often used synonymously
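The five definitions in this model can be sketched as simple data structures. This is only one possible reading, written for Python 3.9 or later; the class names follow the slide's terms, while the field names and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Datum:
    """One FACT combined with one MEANING (definition 2)."""
    fact: str
    meaning: str


@dataclass
class Information:
    """One or more DATA returned in response to a REQUEST (definition 3)."""
    request: str
    data: list[Datum]


@dataclass
class Intelligence:
    """INFORMATION associated with its USES (definition 5)."""
    information: Information
    uses: list[str] = field(default_factory=list)


# Information reuse (definition 4): one fact, more than one meaning.
fact = "23284"
as_postal_code = Datum(fact, "postal code of the customer")
as_region_key = Datum(fact, "sales region identifier")

answer = Information("Where does this customer live?", [as_postal_code])
insight = Intelligence(answer, uses=["delivery route planning"])
print(insight)
```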
Do you know the game Twister?
• Canada
• Chile
• Colombia
• Egypt
• Estonia
• Finland
• France
• Germany
• Great Britain
• Ireland
• Italy
• Japan
• Qatar
• Scotland
• Switzerland
• Thailand
• Turkey
• UAE
• US
Typical System Evolution
Payroll Application (3rd GL)
Payroll Data (database)
R&D Applications (researcher supported, no documentation)
R&D Data (raw)
Mfg. Applications (contractor supported)
Mfg. Data (home grown database)
Finance Application (3rd GL, batch system, no source)
Finance Data (indexed)
Marketing Application (4th GL, query facilities, no reporting, very large)
Marketing Data (external database)
Personnel App. (20 years old, un-normalized data)
Personnel Data (database)
Niccolo Machiavelli (1469-1527)
He who doesn’t lay his foundations beforehand may by great abilities do so afterward, although with great trouble to the architect and danger to the building.
Machiavelli, Niccolo. The Prince. 19 Mar. 2004 http://pd.sparknotes.com/philosophy/prince
Information Architectures
• … are plans, guiding the transformation of strategic organizational information needs into specific information systems development projects Source: Internet
• "Information architecture is a foundation discipline describing the theory, principles, guidelines, standards, conventions, and factors for managing information as a resource. It produces drawings, charts, plans, documents, designs, blueprints, and templates, helping everyone make efficient, effective, productive and innovative use of all types of information."
– Source: Information First by Roger & Elaine Evernden, 2003 ISBN 0 7506 5858 4 p. 1.
• Information architecture (IA) is the art of expressing a model or concept of information used in activities that require explicit details of complex systems. (wikipedia.org)
• All organizations have information architectures
• Some are better understood and documented (and therefore more useful to the organization) than others.
Building from the Top
Sample Conversation (Developing Constraints)
• I'd like to build a building.
• What kind of building - do you want to sleep in it? Eat in it? Work in it?
• I'd like to sleep in it.
• Oh, you want to build a house?
• Yes, I'd like a house.
• How large a house do you have in mind?
• Well, my lot size is 100 feet by 300 feet.
• Then you want a house about 50 feet by 100 feet.
• Yes, that's about right.
• How many bedrooms do you need?
• Well, I have two children, so I'd like three bedrooms ...
GAO Has Identified the Problem
Concrete Block & Engineering Continuity
Look Familiar?
Why?
Finance Example
• Business Rule:
– A customer may have one and only one account
• Bank Manager:
– The customer is always right ...
– And this one needs multiple accounts!
#    Account ID
1    peter
2    peter1
3    peter2
4    peter3
5    peter4
6    peter5
7    peter6
8    peter7
9    peter8
10   peter9
11   peter10
Sorted IDs
peter
peter1
peter10
peter2
peter3
peter4
peter5
peter6
peter7
peter8
peter9
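The sorted list above puts peter10 ahead of peter2 because the IDs are compared as text, character by character. A minimal Python sketch of that effect, together with one possible workaround; the `natural_key` helper is hypothetical and assumes every ID is a non-digit prefix followed by an optional number:

```python
import re

# The eleven account IDs from the slide, in creation order.
account_ids = ["peter"] + [f"peter{n}" for n in range(1, 11)]

# Plain string sort: "peter10" lands between "peter1" and "peter2".
lexicographic = sorted(account_ids)


def natural_key(account_id: str) -> tuple[str, int]:
    """Split an ID into (text prefix, numeric suffix) for sorting.

    Assumes the ID is a non-digit prefix followed by optional digits;
    IDs in any other shape would need a different rule.
    """
    prefix, digits = re.fullmatch(r"(\D*)(\d*)", account_id).groups()
    return prefix, int(digits) if digits else 0


# Sorting on the split key restores the order the accounts were created in.
natural = sorted(account_ids, key=natural_key)

print(lexicographic)
print(natural)
```

The workaround only repairs the sort order; the underlying issue on the slide, a business rule circumvented by encoding extra accounts into the ID, remains in the data.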
Architecture Jargon
Avoiding Unnecessary Work Using Business Rule Metadata
[Diagram entities: Person, Job Class, Employee, Position]
BR1) Zero, one, or more EMPLOYEES can be associated with one PERSON
BR2) Zero, one, or more EMPLOYEES can be associated with one JOB CLASS
BR3) Zero, one, or more EMPLOYEES can be associated with one POSITION
BR4) One or more POSITIONS can be associated with one JOB CLASS
Job Sharing
'Mond-Licht' or 'Mondschein'
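The four business rules can be read as cardinalities between the entities. The sketch below is one hedged interpretation in Python: each EMPLOYEE record points at exactly one PERSON, JOB CLASS, and POSITION, and each POSITION points at one JOB CLASS. The field names, helper functions, and sample records are assumptions for illustration, and the sketch does not enforce the "one or more" minimum in BR4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class JobClass:
    title: str


@dataclass(frozen=True)
class Position:
    code: str
    job_class: JobClass  # BR4: each position belongs to one job class


@dataclass(frozen=True)
class Employee:
    person: Person       # BR1: a person may have zero, one, or more employee records
    job_class: JobClass  # BR2: a job class may have zero, one, or more employees
    position: Position   # BR3: a position may have zero, one, or more employees


def employees_of(person: Person, employees: list[Employee]) -> list[Employee]:
    """All employee records held by one person."""
    return [e for e in employees if e.person == person]


def employees_in(position: Position, employees: list[Employee]) -> list[Employee]:
    """All employee records filling one position."""
    return [e for e in employees if e.position == position]


analyst = JobClass("Analyst")
day_desk = Position("P-100", analyst)
night_desk = Position("P-200", analyst)
alice, bob = Person("Alice"), Person("Bob")

employees = [
    Employee(alice, analyst, day_desk),
    Employee(bob, analyst, day_desk),      # two employees share one position
    Employee(alice, analyst, night_desk),  # one person holds a second job
]

print(len(employees_in(day_desk, employees)), "employees share position P-100")
print(len(employees_of(alice, employees)), "employee records for Alice")
```

Under this reading, job sharing and a second job for the same person are both already permitted by the rules, so neither would need a structural change to the model.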
Student System Data Model
Proposed Data Model
Organizations Surveyed
• Results from more than 400 organizations
• 32% government
• Appropriate public company representation
• Enough data to demonstrate European organization DM practices are generally more mature
• Local Government: 4%
• State Government Agencies: 17%
• Federal Government: 11%
• Public Companies: 58%
• International Organizations: 10%
Largely Ineffective DM Investments
• Approximately 10 percent of organizations achieve parity (and potential positive returns) on their DM investments.
• Only 30% of DM investments achieve tangible returns at all.
• Seventy percent of organizations have very small or no tangible return on their DM investments.
[Chart: Investment <= Return: 10%; Investment > Return: 20%; Return ≈ 0: 70%]
Misunderstanding Data Management
Expanding Scope
• Approximately 1950-1975: Database Administration (database design, operation, monitoring, troubleshooting, etc.)
• 1975-1995: Data Administration (data as a strategic resource; requirements analysis/modeling)
• 1995-2005: Enterprise-wide (data coordination, integration, stewardship)
• 2005-: Mandatory (data quality, security, privacy, compliance); optional extensions (mashups, …)
DM Origins – Which arrives first – DM or DBMS?
• A Key Indicator
• 70% reacting instead of anticipating
• Best practices are obvious
[Chart: which arrives first (DM 1st, DBMS 1st, Simultaneously), 1981 vs. 2007]
DM Group Longevity/Maturity
• Potential mis-labeling – "information or data management," "information or data services," "enterprise data architecture," and "data administration"
• Unexplained dip and rise in popularity
• Shift operations to a much broader range of activities
[Chart: DM group longevity by year, 1978-2006; series: Measured, Estimated]
Non-Relational Database Processing
• 68% using hierarchical (typically IMS or Adabas)
• 20% reporting operational network DBMS
• "the rumors of the demise of non-relational processing are greatly exaggerated" (from Mark Twain)
• Virtually no textbook education
[Chart: Percentage of Processing Mission-Critical, in bands from 0-10% to 91-100%]
DM Responsibilities
• Biggest perceived increase
– "Application Design" "Long Range Planning" "Liaison To End Users"
• Biggest perceived Decrease
– "environment control" "backup/recovery" "installing new releases"
Backup/Recovery
Installs New Release
Performance Tuning
Environmental Controls
Database Design
Security and Privacy
Liaison to Programmers
Liaison to Systems Analysts
Database Auditing
Data Communications Administration
Input to Data Dictionary
Manages Data Dictionary
Education
Long Range Planning
Liaison to End Users
Application Design
[Chart: share of respondents for each responsibility listed above, Future vs. Current; axis 0 to 0.8]
Total FTEs in Data Management
[Chart: total FTEs in data management by band (1, 2-4, 5-8, 9-24, 25-50, 51-100, 100-200, 201+); axis 0 to 30.0]
DM Organization Footprint
[Chart: 2007 DM Footprint size; x-axis: DM Group Size (0, 1, 2-4, 5-8, 9-24, 25-50, 100-200); y-axis: Percentage, 0% to 26.25%]
1981 DM Group Size
• Small: 5
• Average: 6-7
• Large: 5-8
• Quite large: 20s-30s
• Very large: 50-75
Data Management Manager’s Last Position
• 1981
– systems analyst and programming
– project manager or leader
• 2007
– database administrator
– data administrator
– programmer
– project manager
– application manager
– systems architect
– data security administrator
– systems analyst, development manager
– business line manager
Data Dictionary Usage
[Chart: data dictionary usage on a 0 to 0.70 axis; 1981: 0.60, 2007: 0.70]
Aligning Strategy & Execution
[Chart: responses for Strongly Disagree, Somewhat Disagree, Neither Agree nor Disagree, Somewhat Agree, Strongly Agree; axis 0 to 30.0]
DM Involvement
[Chart: Participation Percentage (axis 0 to 50.0) by initiative: Data Warehousing, XML, Data Quality, Customer Relationship Management, Master Data Management, Customer Data Integration, Enterprise Resource Planning, Enterprise Application Integration; series: Initiative Leader, Initiative Involvement, Not Involved]
Formal or Structured Approach to IQ?
• No: 52%
• They say they are but they aren't: 24%
• Yes: 24%
% of DM organizations labeled "successful"
• In 25 years:
[Chart: 1981 vs. 2007; categories: Successful, Partial Success, Don't know/too soon to tell, Unsuccessful, Does not exist; axis 0 to 0.45]
Best Practices
Cruiser Collector
Capability Maturity Model Levels
• Initial (1): Our DM practices are ad hoc and dependent upon "heroes"
• Repeatable (2): We have DM experience and have the ability to implement disciplined processes
• Defined (3): We have experience that we have standardized so that all in the organization can follow it
• Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance
• Optimizing (5): We have a process for improving our DM capabilities
One concept for process improvement; others include:
• Norton Stage Theory
• TQM
• TQdM
• TDQM
• ISO 9000
and focus on understanding current processes and determining where improvements can be made.
1996 Council of American Building Officials (CABO) and the 2000 International Code Council recommendations call for unit runs to be not less than 10 inches and unit rises not more than 7¾ inches.
Capability Im-Maturity Model
Level Description
0 Negligent/Indifference: Failure to allow successful development process to proceed. All problems are perceived to be technical problems. Managerial and quality assurance activities are deemed to be overhead and superfluous to the development process. Reliance on silver pellets.
-1 Obstructive/Counter Productive: Counterproductive processes are imposed. Processes are rigidly defined and adherence to the form is stressed. Ritualistic ceremonies abound. Collective management precludes assigning responsibility. Status quo über alles.
-2 Contemptuous/Arrogance: Disregard for good software engineering institutionalized. Complete schism between software development activities and software process improvement activities. Complete lack of a training program.
-3 Undermining/Sabotage: Total neglect of own charter, conscious discrediting of peer organizations' software process improvement efforts. Rewarding failure and poor performance.
http://stsc.hill.af.mil/crosstalk/1996/xt96d11h.asp
Key Finding: Process Frameworks are not Created Equal
With the exception of CMM and ITIL, use of process-efficiency frameworks does not predict higher on-budget project delivery…
[Chart: Percentage of Projects on Budget, By Process Framework Adoption]
…while the same pattern generally holds true for on-time performance
[Chart: Percentage of Projects on Time, By Process Framework Adoption]
Source: Applications Executive Council, Applications Budget, Spend, and Performance Benchmarks: 2005 Member Survey Results, Washington D.C.: Corporate Executive Board 2006, p. 23.
Organizational DM Functions and their Inter-relationships
[Diagram functions: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations, Data Asset Use]
[Diagram flow labels: Organizational Strategies, Goals, Guidance, Direction, Feedback, Implementation, Integrated Models, Standard Data, Application Models & Designs, Business Data, Business Value]
• Data Program Coordination: Defining, coordinating, resourcing, implementing, and monitoring organizational data program strategies, policies, plans, etc. as a coherent set of activities.
• Organizational Data Integration: Identifying, modeling, coordinating, organizing, distributing, and architecting data shared across business areas or organizational boundaries.
• Data Stewardship: Ensuring that specific individuals are assigned the responsibility for the maintenance of specific data as organizational assets, and that those individuals are provided the requisite knowledge, skills, and abilities to accomplish these goals in conjunction with other data stewards in the organization.
• Data Support Operations: Initiation, operation, tuning, maintenance, backup/recovery, archiving and disposal of data assets in support of organizational activities.
• Data Development: Specifying and designing appropriately architected data assets that are engineered to be capable of supporting organizational needs.
Leverage data in organizational activities
Data management processes and infrastructure
Combining multiple assets to produce extra value
Organizational-entity subject area data integration
Provide reliable access to data
Achieve sharing of data within a business area
How is it done?
• Follows form of a semi-structured interview
• Approximately one hour is required to complete each interview
• Examines organizational data management practices in five areas
• Branched series of questions explores capabilities, execution, and ongoing efforts.
• Total time to results typically ranges from 1 week to 1 month
Council Hill Road Sign
Photo from William J. Manon Jr. .pbase.com/g3/91/555491/ 2/66430431.telWKGJG.jpg
Assessment Benefits
• Quantitative Benefits
– Objective determination of baseline BI/Analytic capabilities
– Gap analysis indicates specific actions required to achieve the "next" level
– Available comparisons with similar organizations
– Provides facts useful when prioritizing subsequent investments
• Qualitative Benefits
– Highlights strengths, weaknesses, capabilities, and limitations existing BI/
• Collaboration with CMU's Software Engineering Institute (SEI)
• Results from more than 400 organizations
– Public Companies
– State Government Agencies
– Federal Government
– International Organizations
• Defined industry standard
Data Management Practices Measurement (DMPA)
Data Program Coordination
Organizational Data Integration
Data Stewardship
Data Development
Data Support Operations
Maturity levels: Initial (I), Repeatable (II), Defined (III), Managed (IV), Optimizing (V)
Focus: Guidance and Facilitation
Focus: Implementation and Access
Sample Perception vs. Fact Chart
[Chart: Average vs. Verified scores on a 0-5 scale; categories: Development Guidance, Data Administration, Support Systems, Asset Recovery Capability, Development Training; plotted values as extracted: 2.0, 1.0, 2.0, 2.2, 1.0, 1.2, 3.0, 2.4, 1.0, 2.3]
Comparative Assessment Results
Data Program Coordination
Organizational Data Integration
Data Stewardship
Data Development
Data Support Operations
[Chart: scores on a 0-5 scale for the five functions above; series: Nokia, Industry Competition, All Respondents; callouts: Challenge, Challenge, Challenge, Client]
High Marks for IFC’s Program
Data Mgmt Audit 2006
Leadership & Guidance
Asset Creation
Metadata Management
Quality Assurance
Change Management
Data Quality
[Chart: scores on a 0-5 scale for the six areas above; series: Overall Benchmarks, Industry Benchmarks, TRE, IFC, ISG]
"These IFC scores represent the highest aggregate scores in the area of data stewardship recorded in our database of hundreds of assessments that has been recognized as a representative scientific sample."
The challenge ahead
[Chart: average scores on a 0.00-5.00 scale for items 1 through 28]
The chart represents the average scores presented on the previous slide - interesting that none have apparently reached level-3
After more than a decade …
Question How many software practices (surveyed) are above level 1 on the CMM?
Answer By far most organizations (95%) surveyed are producing software using informal processes
Question How many organizations have demonstrated at least some proficiency according to the DM3? (i.e., scored above level 1)
Answer One in ten organizations has scored above level 1
Archeology-based Transformations Solve a Puzzle
• Primary sources of guidance:
– The edge-pieces are easy to identify
– Distinct physical piece features exist, such as colors, patterns, pictures, etc.
• Steps for solving:
– Physically segregate all identified edge pieces (not always present in existing environment.)
– Create puzzle framework - connecting edge pieces using the puzzle picture
– Within frame, physically group remaining pieces by distinct physical features
– Solve a smaller section of the puzzle containing just a portion of the picture that is focused on similar physical features, such as a ball or a puppy in the picture. This is an effective approach because:
• The focus is on a common domain: one distinct aspect of the entire picture
• The analysis covers a smaller number of puzzle pieces, so the effort is proportionately smaller than attempting to solve the overall puzzle at once.
– As the components are assembled, combine them to solve the complete puzzle.
How was this bridge constructed?
Flood
New River Bridge
Bridge Engineering
Wally Eastwood Playing Piano
http://peteraiken.net
[Diagram labels: System Component, Component Element, Logical Data Attribute, System Component Type, Logical Data Entity, Evidence Type, Evidence, Model Decomposition, Information, User Type, Location, Process]
Tomorrow's Data Management
[Diagram labels: Quality; Challenge #1, Challenge #2, Challenge #3, Challenge #4]
Revised Data Management Goals
Increased business perception of DM value resulting from better business systems including repositories, warehouses, ERP implementations
[Diagram labels: Data Assets, Business Rules, Business Processes, XML-based Portals, Data Analysis Technologies, XML-based Repositories, XML, Business Intelligence, Intelligence]
Contact Information:
Peter Aiken, Ph.D.
Department of Information Systems, School of Business
Virginia Commonwealth University
1015 Floyd Avenue - Room 4170
Richmond, Virginia 23284-4000
Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com
office: +1.804.883.759
cell: +1.804.382.5957
e-mail: [email protected]
http://peteraiken.net
Service Orient or Be Doomed!
• Service Orient or Be Doomed! How Service Orientation Will Change Your Business (Hardcover) by Jason Bloomberg & Ronald Schmelzer
– I'm not quite sure what "doom" awaits by not service orienting, other than remaining mired in archaic, calcified and siloed processes — which a lot of
Services
Integration Possibilities
• User Interface
• Business Process
• Application
• Data
AV Component
• Well defined components
• Self-contained
• No interdependencies
Analogy derived from D. Barry "Web Services" Intelligent Enterprise 10/10/03 pp. 26-47 - wiring diagram from sunflowerbroadband.com
Contractor Implemented Wiring
Concise Notes on Software Engineering
– Published in 1979
– 93 pages including appendices & references
– Out of print
– $1.99 at half.com
• Principles of Information Hiding (pp. 32-33)
– Conceal complex data structures whenever possible
– Allow only selected service modules to know about the concealed data structures
– Bind together modules that know about concealed data structures
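The three principles can be illustrated with a small sketch. It is not taken from the book: the CustomerDirectory class, its methods, and the greeting function are hypothetical names chosen for this example, and the concealed structure is simply a dictionary.

```python
class CustomerDirectory:
    """Service module: the only code that knows how customers are stored."""

    def __init__(self):
        # Concealed data structure. The leading underscore marks it as
        # private by Python convention; callers should never touch it.
        self._by_id = {}

    # The methods that know about the concealed structure are bound
    # together in this one class.
    def add(self, customer_id, name):
        self._by_id[customer_id] = {"name": name}

    def name_of(self, customer_id):
        return self._by_id[customer_id]["name"]


def greeting(directory, customer_id):
    # Client code depends only on the service interface, so the storage
    # layout can change without this function changing.
    return f"Hello, {directory.name_of(customer_id)}"
```

If the dictionary were later replaced by a database lookup, only CustomerDirectory would need to change.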
The basketball and golfball slide
How Does SOA Fit In Existing Architectures?
Bank
Evolving applications from stove pipe to web-service-based architectures
16 million lines of legacy code
2.1 million lines of legacy code
[Figure: Organizational Portal mockup (Sunday, April 27, 2008 - All systems operational!) with panels for Organizational News, Organizational IT, Organizational Essentials, Search, Stocks, and Reporting by region and state]
Legacy Systems Transformed Into Web-services Accessed Through a Portal
[Figure: Organizational Portal mockup (Monday, November 03, 2008 - All systems operational!) with News, IT, Essentials, Search, Stocks, and Reporting panels, served by web services exposed from five legacy applications]
• LegacyApplication 1: WebService 1.1, WebService 1.2, WebService 1.3
• LegacyApplication 2: WebService 2.1, WebService 2.2
• LegacyApplication 3: WebService 3.1, WebService 3.2
• LegacyApplication 4: WebService 4.1, WebService 4.2
• LegacyApplication 5: WebService 5.1, WebService 5.2, WebService 5.3
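The idea of exposing one legacy application as several narrow web services behind a portal can be sketched as follows. Everything here is an assumption made for illustration: the legacy entry point, its pipe-delimited replies, and the function names are hypothetical, and the network transport is omitted so that only the wrapping pattern is shown.

```python
def legacy_application_1(command, payload):
    """Hypothetical stand-in for a legacy entry point with a string protocol."""
    records = {"42": "ACTIVE"}
    if command == "STATUS":
        return "STATUS|" + records.get(payload, "UNKNOWN")
    if command == "ECHO":
        return "ECHO|" + payload.upper()
    raise ValueError(f"unsupported command: {command}")


def web_service_1_1(account_id):
    """Hypothetical 'WebService 1.1': account status as a plain dictionary."""
    _, status = legacy_application_1("STATUS", account_id).split("|", 1)
    return {"account_id": account_id, "status": status}


def web_service_1_2(text):
    """Hypothetical 'WebService 1.2': a second narrow view of the same legacy code."""
    _, echoed = legacy_application_1("ECHO", text).split("|", 1)
    return {"echo": echoed}


def portal_status_panel(account_ids):
    # The portal talks only to services and never to the legacy protocol.
    return [web_service_1_1(account_id) for account_id in account_ids]
```

The legacy code is left untouched; each service translates one legacy capability into a structure the portal can consume.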
Solution Framework
[Figure: solution framework diagram]
• SORs: SOR 1 through SOR 8
• Repository
• IndicatorExtractionService (could be segmented by day of week, month, system, etc.)
• UpdateAddresses
• LatencyCheckService
• Channels: Ch 1 through Ch 8
• External Address Validation Processing
• CustomerContact
LogicalExtension
LogicalExtension
[Figure: InformationWeek cover and contents page, Issue 1,198, Aug. 11, 2008 (informationweek.com)]
Cover story: "Simpler Than SOA". Stymied by the complexity of SOAs, some IT departments are taking the Web-oriented architecture route.
Related article: "Smart Web App Development". Web-oriented architectures are easier to implement and offer a similar flexibility to SOA.
WOA
http://hinchcliffe.org/archive/2008/02/27/16617.aspx
SOA & Data & ???
SOA Requirements
[Chart: scores on a 0 to 5 scale for Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, and Data Support Operations]
• I'm a little surprised. With such extensive experience in predictive analysis, you should've known we wouldn't hire you.
Predictive Analysis
What is Analytics?
• Analytics:
– Something that is analytic
• Analytic:
– Of or relating to analysis; especially: separating or breaking up a whole or a compound into its component parts or constituent elements
Car Maxx in Doha, Qatar
BI/Analytic Capabilities
• Business Intelligence (BI)
– refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business information and sometimes to the information itself.
– The purpose of business intelligence--a term that dates at least to 1958--is to support better business decision making.
• Analytics
– The simplest definition of Analytics is "the science of analysis."
– A simple and practical definition, however, would be how an entity (i.e., business) arrives at an optimal
BI/Analytic Capabilities
Business Intelligence
Strategy formulation Strategy implementation
• Wine quality = 12.145 + 0.00117 winter rainfall + 0.0614 growing season temperature - 0.00386 harvest rainfall (Orley Ashenfelter)
• Outperforms experts
– specifically Robert Parker (http://www.erobertparker.com/)
– Most everyone else
• Clinical Versus Statistical Prediction (Paul Meehl)
– Experts were more accurate in 8 of 136 studies
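The wine-quality equation above is a plain linear model, so it can be evaluated directly. The sketch below reads the first coefficient as 0.00117; the function and parameter names are hypothetical, and the slide does not state the units, so inputs must use whatever units the original regression used.

```python
def wine_quality(winter_rainfall, growing_season_temperature, harvest_rainfall):
    """Evaluate the linear wine-quality equation quoted on the slide."""
    return (
        12.145
        + 0.00117 * winter_rainfall             # more winter rain raises the score
        + 0.0614 * growing_season_temperature   # warmer growing seasons raise it
        - 0.00386 * harvest_rainfall            # rain at harvest lowers it
    )


if __name__ == "__main__":
    # Illustrative inputs only; these are not measurements from any vintage.
    print(wine_quality(600, 17, 100))
```

The signs of the coefficients carry the message: winter rain and warmth help, harvest rain hurts.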
BI/Analytic Capabilities
I didn’t have the data
BI Challenges
• Technical Challenges
– Poor quality data
– Poor understanding of architectural constructs
– Poor quality data management practices
– New technical expertise is required
• Non-Technical Challenges
– Architecture is underappreciated
– BI perceived as a "technology" project
– Inability to link technical capabilities to business objectives
– Putting BI initiatives in context
• Performance and scalability: 24%
• Immature technology: 28%
• Lack of tools for doing real-time processing: 35%
• Education and understanding of real-time BI by IT staff: 36%
• Poor quality data: 43%
• Lack of infrastructure for handling real-time processing: 46%
• Education and understanding of real-time BI by business users: 46%
• Non-integrated data sources: 47%
• Business case, high cost or budget issues: 60%
Obstacles to Real-Time BI-Lessons from Deployment
TDWI The Real Time Enterprise Report, 2003
Cost of Poor Data Quality $600 Billion Annually!
Thanks to Bret Champlin
Who is Joan Smith?
http://www.sas.com
Defining Customer Challenges
• Purchased an A4 on June 15, 2007
• Had not done business with the dealership prior
• "makes them seem sleazy when I get a letter in the mail before I've even made the first payment on the car advertising lower payments than I got"
How to solve this data quality problem using just tools?
Retail price for the unit was $40
A congratulations letter from another bank
Problems
• Bank did not know it made an error
• Tools alone could not have prevented this error
• Lost confidence in the ability of the bank to manage customer funds
From my retirement plan
Rolling Stone Magazine
Quantitative Benefits
Please Help with A Research Project!
Data Management Practices Assessment