- satya - blogs.dbspeak.comblogs.dbspeak.com/.../07/database_design_satya.pdf · adabas, adabas,...
Satya © 2004
"Normalization is a logical concept, performance is determined at the physical
level. Therefore, it is impossible to denormalize for performance."
Fabian Pascal, co-founder & editor of Database Debunkings (dbdebunk.com)
Architect’s buzZ word
“Denormalization, if necessary, should be done at the level of stored files, not at the
level of base relvars”
Chris J. Date, most respected database expert in the computer industry; author of Database Systems.
“denormalization, is not ‘good for performance’, it is good for the performance
of specific applications”
Architect’s buzZ word
• Presenting concepts, not syntax.
• Presenting “How” & “What”, not “Why”, in the RDBMS.
Background
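[Figure: timeline of information technology against data volume in gigabytes. Milestones: cave paintings and bone tools (40,000 BCE), writing (3500 BCE), paper (105 C.E.), printing (1450), electricity and telephone (1870), transistor (1947), computing (1950), Internet (DARPA, late 1960s), the web (1993). Volume reaches 1½ billion gigabytes in 1999.]
Source: UC Berkeley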
Ready for the data eXplosion?
[Figure: the same timeline extended past 1999, with data volume in gigabytes: 3B in 2000, 6B in 2001, 12B in 2002, 24B in 2003.]
Source: UC Berkeley
The coming content - “Big Bang”
• Terabytes of data
– Common corporate expression
– Petabytes (10^15 bytes) & Exabytes (10^18 bytes) are fast approaching
• 2-3 Exabytes = total volume of all information generated worldwide annually
– Need structure to handle large data efficiently.
Source: 2001 - IBM Informix Conference, Las Vegas.
Future data size
• An Organized Store of Information
–Flat Files
–Hierarchical Databases
–Network Databases
–Relational Databases
–Object Relational Databases
–Object Databases
Database
Adabas, FileMaker
IBM’s Information Management System (IMS) – used in Apollo Moon Landing.
GE’s Integrated Data Store (IDS)
Oracle, Db2, Sybase, MS SQL, Postgres
Oracle 9
Cloudscape
Database – the solution
Project Identification and Selection
Project Initiation and Planning
Analysis
Logical Design
Physical Design
Implementation
Maintenance
Enterprise modeling
Conceptual data modeling
Database development activities during the systems development life cycle (SDLC)
Database Application Lifecycle
DB Design
SYSTEMS DEFINITION
DATABASE PLANNING
REQUIREMENTS ANALYSIS
DBMS SELECTION
PROTOTYPING
IMPLEMENTATION
DATA LOADING / MIGRATION
TESTING
MAINTENANCE
CONCEPTUAL DESIGN
LOGICAL DESIGN
PHYSICAL DESIGN
APPLICATION DESIGN
Conceptual Design (DBMS independent)
• Data analysis and requirements: Determine end user views, outputs, and transaction processing requirements.
• Entity relationship modeling and normalization: Define entities, attributes and relationships. Draw ER diagrams. Normalize tables.
• Data model verification: Identify main processes, insert, update and delete rules. Validate reports, queries, views, integrity, sharing and security.
• Distributed database design: Define location of tables, access requirements and fragmentation strategy.
DBMS software selection
Logical design (DBMS dependent): Translate the conceptual model into definitions for tables, views and so on.
Physical design (hardware dependent): Define storage structures and access paths for optimum performance.
Database design flow
• Entity-Relationship (ER) data modeling
– A graphical technique for understanding and organizing the data independently of the eventual database implementation
• Normalization
– An algorithmic process for evaluating the quality of a database design, most applicable to relational database designs
• Types of Models
– Models (of databases or anything else) can be built at different levels of abstraction
– For databases (following the text):
  • Conceptual (logical) → ER models (represent semantics)
  • Internal: for the chosen DBMS
  • External: the way the user sees the data
  • Physical: for the actual physical storage
Design Approach
• The concepts upon which ER models are built are:
– Entities (or, more correctly, entity types)
  – also called “relvars”, “base relvars” or “relations”
  – at the physical implementation level, called tables
– Relationships (between entities)
– Attributes (of entities and relationships)
Entity-Relationship Basics
ER Modeling concepts
• An entity is “A person, place, event, or thing for which we intend to collect data”
• Normally a database will contain data about groups of similar entities (e.g. students, subjects, licenses, aircraft or whatever)
• These groups of similar entities are referred to as entity types, but often this is shortened to just “entity” or “entities”
Entities & Entity types
• Entity types are conventionally named in the singular
• Attributes are represented on ER diagrams as ellipses attached to the relevant entity type symbol
• There are other notations as well (e.g. a list of attributes next to the entity type symbol) but they are conceptually equivalent
[Diagram: entity type student with attributes studentNumber, Name, DOB, Address, Gender]
Entity types & Attributes
• A relationship is an association between entity types
• Relationships are represented by diamond shaped symbols on ER diagrams
• A descriptive name is placed inside the relationship symbol
[Diagram: entity types student and subject connected by the relationship “enrols in”]
Relationships
• Entity type names are usually nouns
• Relationship names are usually, though not always, verbs (or verb phrases)
• Most relationships are binary (i.e. connect 2 entity types), like “enrolls in”
• Other types of relationships are possible
Relationship & entity
Degree of a Relationship
• The degree of a relationship is the number of entity types that it connects
– One: Unary
– Two: Binary
– Three: Ternary
• Relationships of degree higher than three are rare
[Diagram: Unary relationship: employee supervises employee. Binary relationship: student enrolls subject. Ternary relationship: sale connects vendor, purchaser and item, drawn with “=” next to three binary sale relationships among vendor, purchaser and item.]
Relationship Connectivity (Cardinality)
• Relationships can have different connectivities
– one-to-one (1:1)
– one-to-many (1:N)
– many-to-many (M:N)
• Indicated on the ER diagram by placing an appropriate symbol on each “leg” of the relationship
[Diagram: employee supervises employee (1 supervisor, M), student enrolls subject, lecturer teaches subject, with 1, M and N symbols on the legs]
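One way the connectivities above might carry over to tables is sketched below with Python's built-in sqlite3 module. It assumes, for illustration only, that lecturer-teaches-subject is 1:N and student-enrolls-subject is M:N; every table and column name is hypothetical and not taken from the slides.

```python
import sqlite3

# Hypothetical schema; names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE lecturer (lecturer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- 1:N: each subject row carries the key of its single lecturer.
CREATE TABLE subject (
    subject_code TEXT PRIMARY KEY,
    lecturer_id  INTEGER NOT NULL REFERENCES lecturer(lecturer_id)
);
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- M:N: one row per (student, subject) pair in a separate table.
CREATE TABLE enrolment (
    student_number INTEGER NOT NULL REFERENCES student(student_number),
    subject_code   TEXT    NOT NULL REFERENCES subject(subject_code),
    PRIMARY KEY (student_number, subject_code)
);
""")

conn.execute("INSERT INTO lecturer VALUES (1, 'Lee')")
conn.executemany("INSERT INTO subject VALUES (?, ?)",
                 [("DB101", 1), ("OS201", 1)])
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(10, "Ann"), (11, "Bob")])
conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                 [(10, "DB101"), (10, "OS201"), (11, "DB101")])

# One lecturer teaches many subjects (1:N).
subjects_per_lecturer = conn.execute(
    "SELECT COUNT(*) FROM subject WHERE lecturer_id = 1").fetchone()[0]
# One subject has many students, and one student has many subjects (M:N).
students_in_db101 = conn.execute(
    "SELECT COUNT(*) FROM enrolment WHERE subject_code = 'DB101'").fetchone()[0]
print(subjects_per_lecturer, students_in_db101)
```

The composite primary key on the enrolment table keeps the same student from being enrolled twice in the same subject.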
[Diagram: three pairs of entity types E and F, each connected by a relationship R]
One-to-one relationship: min-card(E,R)=0, max-card(E,R)=1, min-card(F,R)=0, max-card(F,R)=1
Many-to-one relationship: min-card(E,R)=0, max-card(E,R)=N, min-card(F,R)=1, max-card(F,R)=1
Many-to-many relationship: min-card(E,R)=0, max-card(E,R)=N, min-card(F,R)=0, max-card(F,R)=N
Relationship Participation
• Entity types connected by a relationship can have two kinds of “participation” in it
– Partial (or optional)
– Total (or mandatory)
• “Total” means that every entity instance must be connected (through the relationship) to an instance of the other participating entity type(s)
• “Partial” means not total
[Diagram: staff and department connected by the relationship “Head of”, 1:1]
Key Attribute(s)
• There will normally be one, or perhaps several, attributes that will be unique for every entity instance
• Example:
– Every student will have a unique student number
• Such an attribute (or combination) is called a key
• If the key for an entity set consists of two or more attributes in combination, it is called a concatenated key
• Key attribute(s) are underlined on the ER diagram
[Diagram: entity type person with attributes Number, Name, DOB, Address, Gender, Qualification, Age]
Derived, Multi-valued attributes
• Sometimes it is useful to have, on the ER diagram, attributes that can be derived from other attributes
• Example:
– An attribute Age can be derived from an attribute DOB and the current date
• Derived attributes can be indicated on the ER diagram by using a dashed ellipse and connecting line to the relevant entity type
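The Age example can be sketched as a small function that computes the value on demand instead of storing it. This is only an illustration: the function name age_on and the sample dates are assumptions, not part of the slides.

```python
from datetime import date

def age_on(dob: date, today: date) -> int:
    """Derive Age (in whole years) from DOB and a given current date."""
    years = today.year - dob.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years

print(age_on(date(1980, 7, 15), date(2004, 7, 14)))  # 23
print(age_on(date(1980, 7, 15), date(2004, 7, 15)))  # 24
```

Because the result depends on the current date, a stored copy of Age would go stale, which is one reason to treat it as derived.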
Relationships attributes
• A relationship is an association between entity sets
• Relationships can also have attributes
• An attribute of a relationship is drawn attached to the relationship diamond
• Usually only M:N relationships have attributes
[Diagram: employee supervises employee (M:N), with the attribute Task attached to the relationship]
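Strong & Weak entities/entity types
• Sometimes the instances of one entity type depend, for their unique identification, on their relationship to the instances of another entity type
• Such a dependent entity type is called weak; an entity type that can be identified by its own attributes is called strong
[Diagram: building (Name) “consists of” room (Number)]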
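A minimal sqlite3 sketch of the building/room case follows, assuming a room number is unique only within its building. The table and column names are hypothetical.

```python
import sqlite3

# Hypothetical schema: room is identified only together with its building.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE building (name TEXT PRIMARY KEY);
CREATE TABLE room (
    building_name TEXT    NOT NULL REFERENCES building(name),
    number        INTEGER NOT NULL,
    -- The owner's key is part of the weak entity's key.
    PRIMARY KEY (building_name, number)
);
""")
conn.executemany("INSERT INTO building VALUES (?)", [("Main",), ("Annex",)])
# The same room number can appear in two different buildings.
conn.executemany("INSERT INTO room VALUES (?, ?)",
                 [("Main", 101), ("Annex", 101)])
room_count = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
print(room_count)  # 2
```

With foreign keys enabled, a room that refers to a non-existent building is rejected, which mirrors the dependency of the weak entity on its owner.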
Supertypes & Subtypes
• Sometimes notionally different entity types are really specializations of a more general entity type
• Example:
– Trucks, cars, motorcycles, buses, taxis are all motor vehicles
– Some attributes are common to all, others are specific to one group
• This kind of situation can be dealt with using a generalization hierarchy (or supertype/subtype hierarchy)
• The attribute(s) that are common belong to the supertype
• The attributes that are specific are attached to the relevant subtype
Supertypes & Subtypes
[Diagram: supertype motor vehicle with subtypes truck, car and bus, joined through a circle marked “d”. Attributes shown: Registration, truck attributes, car attributes, bus attributes, Seats]
Supertypes & Subtypes
[Diagram: supertype employee with subtypes Safety officer, engineer and pilot, joined through a circle marked “o”. Attributes shown: TFN, DOB, Gender, Address, Safety attributes, engineer attributes, pilot attributes]
Three schema architecture for Database development
External level (individual user views): External Schema
Conceptual level (community user view): Conceptual Schema - Neutral View -
Internal level (storage view): Internal Schema

External (COBOL)
    01 EMPC.
       02 EMPNO PIC X(6).
       02 DEPTNO PIC X(4).

External (XML)
    <xsd:element name="Emp">
      <xsd:element name="Eno" type="Number" />
      <xsd:element name="Dno" type="Number" />
    </xsd:element>

Conceptual
    EMPLOYEE
      EMPLOYEE_NUMBER CHARACTER(6)
      DEPARTMENT_NUMBER CHARACTER(4)
      SALARY NUMERIC(5)

Internal View
    STORED_EMP BYTES=20
      PREFIX TYPE=BYTE(6), OFFSET=0
      EMP# TYPE=BYTE(6), OFFSET=6, INDEX=EMPX
      DEPT# TYPE=BYTE(4), OFFSET=12
      PAY TYPE=FULLWORD, OFFSET=16
Levels of normalization
5NF relvars
4NF relvars
BCNF relvars
3NF relvars
2NF relvars
1NF relvars (normalized entities)
Normalization - Keys
Superkey: A superkey is a set of one or more attributes that, taken collectively, allows us to identify uniquely an entity.
Candidate key: Any subset of a superkey that is also a superkey and is not reducible to another superkey is called a candidate key.
Primary key: A primary key is selected arbitrarily from the set of candidate keys to be used in an index for that table.
Source: Database Modeling & Design – Toby J. Teorey
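The key definitions above can be explored with a short attribute-closure routine. The sketch below is not from the cited text; the Student attributes and the two functional dependencies are assumed purely for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """A superkey determines every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)

def candidate_keys(all_attrs, fds):
    """Superkeys with no proper subset that is also a superkey."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is reducible
            if is_superkey(combo, all_attrs, fds):
                keys.append(combo)
    return keys

# Hypothetical relation: Student(studentNumber, email, name).
attrs = {"studentNumber", "email", "name"}
fds = [(("studentNumber",), ("email", "name")),
       (("email",), ("studentNumber",))]

print(is_superkey(("studentNumber", "name"), attrs, fds))  # True
print(candidate_keys(attrs, fds))  # [('email',), ('studentNumber',)]
```

Here (studentNumber, name) is a superkey but not a candidate key, because dropping name still leaves a superkey. Either candidate key could be chosen as the primary key.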
Normalization – 1nf
Source: Administration Guide: Planning – DB2; Database Systems – C.J. Date
First normal form (1NF):
Defn: A relvar is in 1NF if and only if, in every legal value of that relvar, every tuple contains exactly one value for each attribute.
Explanation: At each row and column position in the table, there exists one value, never a set of values.
Essence: The value at every row and column position should be atomic.
Violation of 1NF: Employee (EID#, Name, SkillSet, Address1, Address2)
• SkillSet stores comma-separated values (C, VisualBasic, Oracle).
• How many more addresses can be stored in this fashion?
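A minimal sketch of repairing the SkillSet violation is shown below: the comma-separated list is split into one row per skill. The sample rows and the EmployeeSkill name are assumptions made for the example.

```python
# Hypothetical un-normalized rows: SkillSet holds a comma-separated list.
employees = [
    {"EID": 1, "Name": "Asha", "SkillSet": "C, VisualBasic, Oracle"},
    {"EID": 2, "Name": "Ravi", "SkillSet": "Oracle"},
]

def to_1nf(rows):
    """Return Employee rows plus one EmployeeSkill row per single skill."""
    employee, employee_skill = [], []
    for row in rows:
        employee.append({"EID": row["EID"], "Name": row["Name"]})
        for skill in row["SkillSet"].split(","):
            employee_skill.append({"EID": row["EID"], "Skill": skill.strip()})
    return employee, employee_skill

employee, employee_skill = to_1nf(employees)
for row in employee_skill:
    print(row["EID"], row["Skill"])
```

The Address1/Address2 columns could be treated the same way, with one row per address, so the number of addresses is no longer fixed by the table layout.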
Normalization – 2nf
Source: Administration Guide: Planning – DB2; Database Systems – C.J. Date
Second normal form (2NF): (Assuming one candidate key, which we assume is the primary key)
Defn: A relvar is in 2NF if and only if it is in 1NF and every nonkey attribute is irreducibly dependent on the primary key.
Explanation: Each column that is not part of the key is dependent upon the whole key, not just part of it.
Essence: All non-keys must depend on the whole key value.
Violation of 2NF: WarehouseParts (PART#, WAREHOUSE#, Qty, WHAddr)
• WAREHOUSE# → WHAddr (WHAddr depends on only part of the key)
• {PART#, WAREHOUSE#} → Qty
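A sketch of the usual repair follows. It assumes the key is (PART#, WAREHOUSE#), that Qty depends on the whole key, and that WHAddr depends on WAREHOUSE# alone. The sample rows and the names warehouse and part_stock are hypothetical.

```python
# Hypothetical rows of WarehouseParts(PART#, WAREHOUSE#, Qty, WHAddr).
warehouse_parts = [
    ("P1", "W1", 10, "12 Dock Rd"),
    ("P2", "W1", 5, "12 Dock Rd"),
    ("P1", "W2", 7, "9 Mill St"),
]

def decompose_2nf(rows):
    """Split into Warehouse(WAREHOUSE#, WHAddr) and PartStock(PART#, WAREHOUSE#, Qty)."""
    warehouse, part_stock = {}, {}
    for part, wh, qty, addr in rows:
        # The address is stored once per warehouse; a conflicting copy
        # would be exactly the update anomaly that 2NF removes.
        if warehouse.setdefault(wh, addr) != addr:
            raise ValueError(f"conflicting addresses for warehouse {wh}")
        part_stock[(part, wh)] = qty
    return warehouse, part_stock

warehouse, part_stock = decompose_2nf(warehouse_parts)

# Joining the two projections on WAREHOUSE# rebuilds the original rows.
rejoined = sorted((p, w, q, warehouse[w]) for (p, w), q in part_stock.items())
print(rejoined == sorted(warehouse_parts))  # True
```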
Normalization – 3nf
Source: Administration Guide: Planning – DB2; Database Systems – C.J. Date
Third normal form (3NF): (Assuming one candidate key, which we assume is the primary key)
Defn: A relvar is in 3NF if and only if it is in 2NF and every nonkey attribute is nontransitively dependent on the primary key.
Note: “No transitive dependencies” implies no mutual dependencies.
Explanation: Each column that is not part of the key depends on the key directly, not through another non-key column.
Essence: All non-keys must depend “only” on the key value and on no other non-key.
Violation of 3NF: Emp_Dept (EID#, FirstName, LastName, WorkDept, DeptName)
• WorkDept → DeptName, so DeptName depends on EID# only transitively.
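The transitive dependency in Emp_Dept can be probed on sample data. The sketch below assumes WorkDept determines DeptName; the rows and the helper name fd_holds are illustrative only. A data sample can refute a functional dependency but cannot prove one, since dependencies come from business rules.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows of Emp_Dept.
emp_dept = [
    {"EID": 1, "FirstName": "Ann", "LastName": "Rao", "WorkDept": "D1", "DeptName": "Sales"},
    {"EID": 2, "FirstName": "Bob", "LastName": "Lim", "WorkDept": "D1", "DeptName": "Sales"},
    {"EID": 3, "FirstName": "Cy", "LastName": "Das", "WorkDept": "D2", "DeptName": "Audit"},
]

print(fd_holds(emp_dept, ["EID"], ["WorkDept"]))       # True
print(fd_holds(emp_dept, ["WorkDept"], ["DeptName"]))  # True
print(fd_holds(emp_dept, ["DeptName"], ["EID"]))       # False

# 3NF decomposition: keep WorkDept in the employee table as a foreign key
# and move DeptName into its own department table.
emp = [{k: r[k] for k in ("EID", "FirstName", "LastName", "WorkDept")} for r in emp_dept]
dept = {r["WorkDept"]: r["DeptName"] for r in emp_dept}
print(dept)  # {'D1': 'Sales', 'D2': 'Audit'}
```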
Normalization - bcnf
Boyce/Codd normal form (BCNF): (Assuming composite candidate key as primary key)
Defn: A relvar is in BCNF if and only if every non-trivial, left-irreducible FD has a candidate key as its determinant.
Explanation: Every determinant is a candidate key; no column depends on only part of a composite key or on a non-key column.
Essence: Everything must depend “only” on a whole candidate key, never on a single part of a “composite” key.
Violation of BCNF: HotelRoom (HNo#, Room#, RoomType)
RoomType → Room# & RoomType → HNo#
Source: Database Systems – C.J. Date
Normalization – 4nf
Source: Administration Guide: Planning – DB2; Database Systems – C.J. Date
Fourth normal form (4NF):
Defn: Relvar R is in 4NF if and only if, whenever there exist subsets A and B of the attributes of R such that the nontrivial MVD A →→ B is satisfied, then all attributes of R are also functionally dependent on A.
Explanation: No row contains two or more independent multi-valued facts about an entity.
Essence: Two separate facts cannot be in the same entity.
Violation of 4NF: Emp_Skill (EID#, SkillName#, Language#)
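The sketch below assumes that an employee's skills and languages are independent multi-valued facts (EID# →→ SkillName# and EID# →→ Language#). Under that assumption the table must hold every skill/language combination per employee, and it splits into two projections without loss. The sample data is hypothetical.

```python
# Hypothetical rows of Emp_Skill(EID#, SkillName#, Language#).
emp_skill = {
    (1, "Oracle", "English"),
    (1, "Oracle", "Hindi"),
    (1, "C", "English"),
    (1, "C", "Hindi"),
    (2, "Oracle", "English"),
}

# Project onto the two independent facts.
emp_skills = {(eid, skill) for eid, skill, _ in emp_skill}
emp_langs = {(eid, lang) for eid, _, lang in emp_skill}

# Natural join of the projections on EID#.
rejoined = {(eid, skill, lang)
            for eid, skill in emp_skills
            for eid2, lang in emp_langs
            if eid == eid2}

print(len(emp_skills), len(emp_langs))  # 3 3
print(rejoined == emp_skill)            # True
```

Five rows that repeat each fact are replaced by three plus three rows that state each fact once. If the join did not reproduce the original table, the two facts would not be independent and this split would not apply.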
Normalization – 5nf (pjnf)
Source: Database Systems – C.J. Date
Fifth normal form (5NF):
Defn: Relvar R is in 5NF (also called projection-join normal form) if and only if every nontrivial join dependency that holds for R is implied by the candidate keys of R.
Explanation: If a table can be decomposed further without loss of information, then it should be decomposed. R{A,B,C} satisfies the JD * {AB, AC} if and only if the MVDs A →→ B and A →→ C hold in R:
A →→ B | C ⇔ * {AB, AC}
Essence: Two separate facts cannot be in the same entity.
Violation of 5NF:
Normalization – Others
Domain Key normal form (DK/NF):
Defn: A relvar R is said to be in DKNF if and only if every constraint on R is a logical consequence of the domain constraints and key constraints that apply to R.
Explanation: --
Principle of Orthogonal design (A Digression):
E.g.: SA holds suppliers in Paris; SB holds suppliers not in Paris or with status 30. A row can be present in both SA and SB, giving rise to an update anomaly.
SX(S#, Sname, Status), SY(S#, Sname, City)
This can be best used in Distributed database design.
Source: Database Systems – C.J. Date
Denormalization - Types
1. Collapsing Tables
   a. Two entities in an m:n relationship: to avoid frequent joins, this can be applied.
   b. Two entities in a 1:1 relationship: to avoid updates to two separate entities that are in 1:1.
2. Reference data in a 1:m relationship (Add Redundant Columns)
   When large composite keys / derived keys are used, they can be added to the child entity in a 1:m relationship as a foreign key, again to avoid certain join operations.
3. Entities with the most detailed data
   When MVDs / temporal design is in place, we could store summarized data about the MVD attribute / temporal dimension (e.g. months).
4. Derived attributes
   When an attribute is derived by a function of another, but it is better to store the derived attribute (e.g. SearchName: y = f(x), so store x and y in R).
5. Splitting Tables (Horizontal / Vertical Splitting)
Source: Denormalization effects on Performance of RDBMS, G. Lawrence Sanders, Seungkyoon Shin, State University of New York, Buffalo
Denormalization – Criteria
Criteria
• General application performance requirements indicated by business needs.
• On-line response time requirements for application queries, updates and processes.
• Minimum number of data access paths.
• Minimum amount of storage.
Source: Database Modeling & Design – Toby J. Teorey
Denormalization – Alternatives
Alternatives
• Application performance criteria.
• Future application development and maintenance considerations.
• Volatility of application requirements.
• Relations between transactions and relations of entities involved.
• Transaction type (update/query, OLTP/OLAP).
• Transaction frequency.
• Access paths needed by each transaction.
• Number of rows accessed by each transaction.
• Number of pages/blocks accessed by each transaction.
• Cardinality of each relation.
Source: Database Modeling & Design – Toby J. Teorey
Diagramming Notations - IDEF1X Notation
Attribute And Primary Key Syntax
[Diagram: entity box labelled Entity-name/Entity-number. The upper section lists the Primary-Key Attributes (Attribute-Name [Attribute-Name]); the lower section lists the remaining attributes ([Attribute-Name] [Attribute-Name] [Attribute-Name])]
Relationship Cardinality
(no symbol)   zero, one or more
P             one or more
Z             zero or one
n             exactly n
n-m           from n to m
(n)           reference to note (n) where cardinality is specified
Source: IDEF1X Formalization, 1993, Robert G. Brown (The Database Design Group)
Diagramming Notations - IDEF1X Notation
Identifying Relationship
[Diagram: *Parent Entity Entity-A (Key-Attribute-A) connected by Relationship Name to **Child Entity Entity-B (Key-Attribute-A (FK), Key-Attribute-B)]
* The Parent Entity in an Identifying Relationship may be an Identifier-Independent Entity (as shown) or an Identifier-Dependent Entity depending upon other relationships.
** The Child Entity in an Identifying Relationship is always an Identifier-Dependent Entity.
Mandatory Non-Identifying Relationship
[Diagram: *Parent Entity Entity-A (Key-Attribute-A) connected by Relationship Name to **Child Entity Entity-B (Key-Attribute-B, Key-Attribute-A (FK))]
* The Parent Entity in a Mandatory Non-Identifying Relationship may be an Identifier-Independent Entity (as shown) or an Identifier-Dependent Entity depending upon other relationships.
** The Child Entity in a Mandatory Non-Identifying Relationship will be an Identifier-Independent Entity unless the entity is also a Child Entity in some Identifying Relationship.
Source: IDEF1X Formalization, 1993, Robert G. Brown (The Database Design Group)
Diagramming Notations - IDEF1X Notation
Optional Non-Identifying Relationship
[Diagram: *Parent Entity Entity-A (Key-Attribute-A) connected by Relationship Name to **Child Entity Entity-B (Key-Attribute-B, Key-Attribute-A (FK))]
* The Parent Entity in an Optional Non-Identifying Relationship may be an Identifier-Independent Entity (as shown) or an Identifier-Dependent Entity depending upon other relationships.
** The Child Entity in an Optional Non-Identifying Relationship will be an Identifier-Independent Entity unless the entity is also a Child Entity in some Identifying Relationship.
Source: IDEF1X Formalization, 1993, Robert G. Brown (The Database Design Group)
Diagramming Notations - IDEF1X Notation
Domain Hierarchy
Base Domain: Frequency
Typed Domains:
– Audio Frequency: Ultra-Sonic, Sonic, Sub-Sonic
– Radio Frequency: Ultra High Frequency (UHF), Very High Frequency (VHF), High Frequency (HF)
Source: IDEF1X Formalization, 1993, Robert G. Brown (The Database Design Group)
Diagramming Notations - IDEF1X Notation
Team Organization
[Diagram: team roles: Project Manager, Modeler, Source, Expert, Acceptance Review Committee]
Source: IDEF1X Formalization, 1993, Robert G. Brown (The Database Design Group)
Diagramming Notations - IDEF1X Diagram
[Diagram: entities and attributes as extracted from the figure]
idef1xObject: id, name (AK1), description (O)
domain: domain_id.id (FK), domainRule (O)
entity: entity_id.id (FK)
view: view_id.id (FK), level, purpose, scope, author_conventions
aliasEntity: entity_id (FK), realEntity_id.entity_id (FK)
aliasDomain: domain_id (FK), realDomain_id.domain_id (FK)
baseDomain: domain_id (FK), dataType (O)
typedDomain: domain_id (FK), superType_id.domain_id (FK)
viewEntity: view_id (FK), entity_id (FK), is_dependent (O)
cluster: view_id (FK) (AK1), clusterNo, generic_id.entity_id (FK) (AK1), discEntity_id.entity_id (O) (FK), is_compete, disc_id.attribute_id (O) (FK) (AK1)
connectionRelationship: parent_id.entity_id (FK) (AK1), connectionNo, child_id.entity_id (FK) (AK1), view_id (FK) (AK1), name1 (O) (AK1), name2 (O) (AK1), childLow, childHigh (O), parentLow (O), parentHigh (O), is-mandatory (O), is-specific, is-identifying
category: view_id (FK), category_id.entity_id (FK), clusterNo (FK), generic_id (FK)
connectionForeignKeyAttribute: parent_id (FK), view_id (FK), role_id.attribute_id (FK), child_id.entity_id (FK), connectionNo (FK)
alternateKeyAttribute: entity_id (FK), view_id (FK), attribute_id (FK), alternateKeyNo (FK)
primaryKeyAttribute: attribute_id (FK), view_id (FK), entity_id (FK)
viewEntityAttribute: attribute_id.domain_id (FK), view_id (FK), entity_id (FK), is_nonull (O), is_owned (O), is_migrated (O)
alternateKey: alternateKeyNo, view_id (FK), entity_id (FK)
Relationship labels: / supertype; aka / real; appears in; contains; is parent in / parent; is child in / child; is generic in / generic; is used as; is discriminator for; cardinality P
IDEF1X Diagram
Source: IDEF1X Formalization, 1993, Robert G. Brown (The Database Design Group)
Diagramming Guidelines
• Identify layout conventions
• Analyze information requirements for attributes
• Model attributes
• Identify multi-valued attributes
• Validate attributes
• Identify common and derived data
• Understand the use of domains
• Identify the components of a data warehouse
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Lay Out the ER Diagram
• Neat and tidy
• Unambiguous text
• Memorable patterns
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Layout Guidelines
Dead Crows Fly East!
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Attributes
Badge Number - Identifies an employee
Name - Qualifies an employee
Payroll category (weekly or salaried) - Classifies an employee
Date of birth - Quantifies an employee
Employment status (active, leave, terminated) - Expresses the status of an employee
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Finding Attributes
Is this attribute really needed ?
Beware of obsolete requirements from previous systems
Beware of derived data
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Attribute Diagramming Conventions
EMPLOYEE
badge num
first name
last name
payroll num
date of birth
employment status
• Inside the entity's soft box
• Singular
• Lowercase
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Meaningful Components
Break down aggregate attributes
[Diagram: PERSON (name) becomes PERSON (last name, first name); ITEM (code) becomes ITEM (type, vendor, num)]
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Verify for Single Value
Can an attribute have more than one value for an instance of the entity?
[Diagram: RENTAL (transaction date, total amount paid, item)]
Yes, more than one item may be rented at a time. An entity is missing.
[Diagram: RENTAL (transaction date, total amount paid) and RENTAL ITEM (item num)]
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Attributes Which have Attributes
Does information need to be stored about any of the attributes?
[Diagram: TITLE (product code, title, description, review details)]
Yes, review details. An entity is missing.
[Diagram: TITLE (product code, title, description) and REVIEW (author, comment, date recorded)]
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Finding Common or Derived Data
• Count
• Total
• Maximum, Minimum, Average
• Calculation
Derived attributes are redundant and can lead to inconsistent values
[Example: the values 12, 08, 30 and 22 with their derived total, 72]
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Attribute Optionality
Mandatory Attributes
• A value must be stored for each entity instance
• Tagged with *
Optional Attributes
• A value may be stored for each entity instance
• Tagged with o
EMPLOYEE
* badge num
* first name
* last name
o title
o weight
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Attribute Details and Volumes
Attribute - * Engine Size
Format
  Type: Number
  Maximum length: 4
  Average length: 4
  Decimal place: 1
  Unit of measure: cc
  Allowable values: 900, 1000, 1500, 1800, 2000
Volume
  Initial: 100%
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Using a Domain
AUDIO domain: MON = Mono, STE = Stereo, SUR = Surround
[Diagram labels: Movie, Game, Audio, Sound]
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Data Warehousing
Reference data
Meta data
Load management
Warehouse management
Query management
Fact data
Summary data
Source: Adding Detail to the Diagram - Annette Scott, Oracle
Three approaches can be followed.
Sentence Analysis:
• Ask the business user to tell ‘their story’.
• The resultant sentences serve as basic constituents of the tasks and processes performed in the IS to be supported.
• Extract data requirements from those sentences.
Document Analysis:
• Analyze documents, including transactions and reports.
• Interview results, observation results, policies and procedures, outputs of existing systems (reports & screens), inputs to existing systems (forms & screens), database/file specifications of existing systems.
Event Analysis:
• Identify and describe what happens (the events), who is involved (actors and business resources), and what responses are required. (Follow the Zachman Interrogatives)
Design Techniques
Source: 1) Logical Data Modeling - Salvatore T. March, 2) IDEF1x.doc
Sentence Analysis
• Salespeople service Customers.
• Customers place Orders through a Salesperson.
• Freight is determined when an Order is Shipped.
• Salespeople are paid commission based on their commission rate and Invoiced sales.
• Each Salesperson has a number, name, and address.
• Each Customer has a number and a bill-to-address.
– Identify subjects, verb phrases and objects.
– Specific instances must be generalized.
– If subject and object are both entities, then the verb phrase represents a relationship.
– If subject is an entity but object is a fact about that entity, then the object is an attribute and the verb phrase explains the meaning of the attribute.
Source: Logical Data Modeling - Salvatore T. March
Document Analysis
INVOICE
Sample Company, Inc.        Number: 157289    Date: 10/02/90
111 Any Street
Anytown, USA
Bill To:
Customer Number: 0361       Salesperson: 4531 – Joe Smith
Local Grocery Store         Customer PO: 3291
132 Local Street            Terms: Net 30
Localtown, USA              FOB Point: Anytown

Line  Product  Product      Unit of  Qty    Qty   Qty      Unit
No.   Number   Description  Sale     Order  Ship  Backord  Price  Discount  Extension
1     2157     Cheerios     Carton   40     40    0        50.00  5 %       1900.00
2     2283     Oat Rings    Each     300    200   100      2.00   0 %       400.00
3     0579     Corn Flakes  Carton   30     30    0        40.00  10 %      1080.00

Order Gross   4380.00
Tax at 6 %     262.80
Freight         50.00
              --------
Order Net     4692.80

– Each heading can be an entity, attribute or a derived attribute.
– Relationships need to be defined from a careful analysis only.
Source: Logical Data Modeling - Salvatore T. March
Document Analysis – Data Flow Diagram (DFD)
Source: Logical Data Modeling - Salvatore T. March
Examples of DFDs, Level 0, Level 1, Level 2 etc…
Event Analysis
– Define an entity for each event. Identify the associated actor.
– Hence, Place Order, Ship Order, Invoice Order, and Pay Invoice are all entities.
Source: Logical Data Modeling - Salvatore T. March
Design Evaluation
– Each entity must be uniquely identified.
– Attributes are associated with entities (not relationships), and each entity must have one and only one value for each of its attributes (otherwise an additional entity must be created).
– Relationships associate a pair of entities or associate an entity with itself (only binary relationships are allowed, but relationships can be recursive).
– Many-to-many relationships are not allowed.
– Subtypes are identified when the minimum degree of a relationship descriptor is zero or when an attribute does not apply to all instances of an entity.
Source: Logical Data Modeling - Salvatore T. March
Further Reading
– CASE*Method: Entity Relationship Modeling by Richard Barker is an excellent introduction to ER modeling.
– Relational Database Design by Fleming and von Halle goes step by step into the nuts and bolts, all the way to the physical side.
– Practical Issues in Database Management by Fabian Pascal will introduce many of the perennial tough problems in data modeling, and will help assure the new data modeler that there's more to data modeling than what is supported by current commercial implementations of SQL and relational database management products.
Dependability Estimation
Mean time to failure (MTTF):
Mean time to repair (MTTR):
Availability: MTBF = MTTF + MTTR; Ai = MTTFi / MTBFi
Reliability:
Mean Transaction time:
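The availability formula above can be worked through numerically. The figures below are made up for illustration, and the function name availability is an assumption.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability A = MTTF / MTBF, where MTBF = MTTF + MTTR."""
    mtbf_hours = mttf_hours + mttr_hours
    return mttf_hours / mtbf_hours

# Hypothetical component: fails on average every 999 hours of operation
# and takes 1 hour to repair.
print(availability(999.0, 1.0))  # 0.999
```

At 0.999 availability the component is down about one hour in every thousand. Halving MTTR raises availability without changing MTTF.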
Relational:
Network:
Hierarchical:
Object-Oriented:
Spatial-Geographic:
Multimedia:
Temporal:
Text:
Active:
Real Time: