courses.mbl.edu
TRANSCRIPT
Principles of Database Design
NLM/MBL Medical Informatics
Session Outline
◆ Why learn this?
◆ Database Principles and Paradigms
◆ Principles of Relational Database Design
◆ System design and building methods
◆ Exercise: Transforming flat files to tables
Why Learn about Database Design?
◆ Vendors will sell you on user interfaces, but the power and flexibility are in the data model
◆ Evaluating and comparing products
◆ Communicating with vendors and IT support staff
◆ Building your own databases
What is a Database?
◆ An organized collection of information
– Computer-based representation
– Systematic, automated retrieval
– Systematic, automated symbol manipulation
Historical Evolution of Databases
◆ Dedicated files created & maintained by application software (sequential, random access)
◆ Database Management Systems (DBMSs)
Hierarchical Databases
[Diagram: example hierarchy with nodes Pt=Smith, Lab Results, 5/30/96, Serum Na+]
Advantages: efficient storage and I/O, rapid access via predetermined data hierarchies
Disadvantages: difficult to view/retrieve data from other perspectives, hard to modify underlying structure
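The trade-off above can be illustrated with a minimal Python sketch. It assumes a hypothetical nesting order (patient, then record type, then date, then test); the numeric results and the second patient are invented placeholders, not data from the slide.

```python
# Hypothetical hierarchy: patient -> record type -> date -> test.
# Node names follow the slide; the numeric results are made-up placeholders.
hierarchy = {
    "Smith": {
        "Lab Results": {
            "5/30/96": {"Serum Na+": 140},
        },
    },
    "Jones": {  # invented second patient, for illustration only
        "Lab Results": {
            "5/30/96": {"Serum Na+": 138},
        },
    },
}

# Following the predetermined path is a direct lookup.
smith_na = hierarchy["Smith"]["Lab Results"]["5/30/96"]["Serum Na+"]


def patients_with_test(tree, test_name):
    """Answer a question from another perspective: who had this test?

    Nothing in the hierarchy is indexed by test, so every branch is walked.
    """
    found = set()
    for patient, record_types in tree.items():
        for dates in record_types.values():
            for tests in dates.values():
                if test_name in tests:
                    found.add(patient)
    return sorted(found)
```

The first lookup follows the built-in path; the second question forces a walk over the whole structure, which is the "other perspectives" cost named in the disadvantages.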
Information Network Databases
Advantages: Can model complex many-to-many relationships as well as hierarchies and simple lists
Disadvantages: difficult to predict & control effects of transitive relationships; recursion; I/O intensive, potential to become incomprehensible
“Database as Hypertext”
Relational Databases
Advantages: Understandable, permits variety of logical aggregation or “views” of data elements, structure easily modifiable, new elements generally do not “break” existing programs
Disadvantages: I/O intensive, 1 logical record may = many physical records, relational integrity is a constant concern & must be under software control
“Rows & Columns with inter-table references”
Lab_test
Pt-UI | Testname | Date
12345 | Serum_Na | 5/30/96
42353 | CBC | 5/30/96
47756 | ESR | 5/30/96
12348 | HBsAg | 5/30/96
34523 | Amylase | 5/30/96

Patient
Pt-UI | Lname | Fname
12345 | Smith | Elmer
12346 | Jones | Barbara
12347 | Clark | Arthur
12348 | Jones | Casey
12349 | Sample | Steve
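As a rough sketch of the "inter-table references" idea, the Patient and Lab_test rows above can be loaded into an in-memory SQLite database with Python's standard sqlite3 module. The column names Pt_UI and Test_Date are adaptations chosen for this example (the slide writes Pt-UI and Date), and the choice of SQLite is an assumption, not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient  (Pt_UI INTEGER PRIMARY KEY, Lname TEXT, Fname TEXT);
    CREATE TABLE Lab_test (Pt_UI INTEGER, Testname TEXT, Test_Date TEXT);
""")
conn.executemany("INSERT INTO Patient VALUES (?, ?, ?)", [
    (12345, "Smith", "Elmer"), (12346, "Jones", "Barbara"),
    (12347, "Clark", "Arthur"), (12348, "Jones", "Casey"),
    (12349, "Sample", "Steve"),
])
conn.executemany("INSERT INTO Lab_test VALUES (?, ?, ?)", [
    (12345, "Serum_Na", "5/30/96"), (42353, "CBC", "5/30/96"),
    (47756, "ESR", "5/30/96"), (12348, "HBsAg", "5/30/96"),
    (34523, "Amylase", "5/30/96"),
])

# One logical record (patient name plus test) is assembled from two tables.
named_tests = conn.execute("""
    SELECT p.Lname, p.Fname, t.Testname, t.Test_Date
    FROM Lab_test AS t
    JOIN Patient AS p ON p.Pt_UI = t.Pt_UI
    ORDER BY p.Pt_UI
""").fetchall()

# Lab tests whose Pt_UI matches no Patient row: the integrity concern
# that the slide says must be kept under software control.
orphans = [row[0] for row in conn.execute("""
    SELECT t.Pt_UI
    FROM Lab_test AS t
    LEFT JOIN Patient AS p ON p.Pt_UI = t.Pt_UI
    WHERE p.Pt_UI IS NULL
    ORDER BY t.Pt_UI
""")]
```

With the slide's sample rows, only two of the five lab tests join to a patient; the other three identifiers have no matching Patient row, which is the kind of gap a relational integrity check is meant to catch.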
Object-Oriented Databases
◆ Multiple data types including text, graphics, sound, signals, etc.
◆ Encapsulation of data & programs
◆ Interprocess messaging: e.g., “Print Yourself”
Advantages: applications programs consist of high level commands & functions which do not need to know the underlying data organization; modularity, reusability and portability between systems
Disadvantages: early in commercialization; CPU intensive; few standards for query & object sharing
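A small Python sketch of encapsulation and the "Print Yourself" message follows. The class names TextNote and Signal and the method name print_yourself are hypothetical, chosen only to illustrate the idea; they do not come from any particular object database product.

```python
class TextNote:
    """Holds text; callers never touch the stored representation directly."""

    def __init__(self, text):
        self._text = text

    def print_yourself(self):
        return f"NOTE: {self._text}"


class Signal:
    """Holds numeric samples, a different data type with the same message."""

    def __init__(self, samples):
        self._samples = list(samples)

    def print_yourself(self):
        return f"SIGNAL: {len(self._samples)} samples, peak {max(self._samples)}"


def render_all(objects):
    # The application sends one high-level message and does not need to
    # know how each object organizes its data.
    return [obj.print_yourself() for obj in objects]


report = render_all([TextNote("Serum Na+ ordered"), Signal([1, 4, 2])])
```

The calling code stays the same when a new data type is added, as long as the new type answers the same message.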
Fundamental Assertions about Systems Design
◆ The Data Model is the most critical aspect of system design and function
◆ Data Models should reflect real world objects and their relationships to ensure durability
◆ A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up
Object-oriented Systems Design: Basic Concepts
◆ The World contains Things e.g., Collies, Terriers, Bloodhounds
◆ We develop abstractions of things called “objects” e.g., dog
◆ We group objects by criteria and represent the abstract object as an empty table
Dog Name | Breed | Favorite Food | Birthdate
Basic Concepts, cont’d
◆ Empty tables can be filled in to represent the real world things from which the object was abstracted
Dog Name | Breed | Favorite Food | Birthdate
Boris | St. Bernard | Canned | Jan 81
Fifi | Poodle | Dry | May 92
Fido | Pomeranian | Canned | Apr 87
Basic Concepts, cont’d
◆ There are Relationships between objects which are attributes of those objects
Dog Name | License | Owner Name | Lic. Date
Owner Name | Address | Phone
Relationship: “OWNS”
Dog Owner OWNS Dogs
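One way to read "relationships are attributes" is that the Owner Name column in the dog table carries the OWNS relationship. The sketch below expresses that as a foreign key in SQLite via Python's sqlite3 module. The table names Owner and Dog, the underscore column names, and all sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE Owner (
        Owner_Name TEXT PRIMARY KEY,
        Address    TEXT,
        Phone      TEXT
    );
    CREATE TABLE Dog (
        Dog_Name   TEXT,
        License    TEXT PRIMARY KEY,
        Owner_Name TEXT NOT NULL REFERENCES Owner (Owner_Name),
        Lic_Date   TEXT
    );
""")

# Invented sample rows.
conn.execute("INSERT INTO Owner VALUES (?, ?, ?)",
             ("Pat Lee", "1 Main St", "555-0100"))
conn.executemany("INSERT INTO Dog VALUES (?, ?, ?, ?)", [
    ("Boris", "L-001", "Pat Lee", "Jan 96"),
    ("Fifi", "L-002", "Pat Lee", "Feb 96"),
])
conn.commit()

# "Dog Owner OWNS Dogs": follow the Owner_Name attribute.
owned = [row[0] for row in conn.execute(
    "SELECT Dog_Name FROM Dog WHERE Owner_Name = ? ORDER BY Dog_Name",
    ("Pat Lee",),
)]

# A dog that points at a non-existent owner is refused.
try:
    conn.execute("INSERT INTO Dog VALUES ('Fido', 'L-003', 'Nobody', 'Mar 96')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```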
Objects
◆ All of the real-world things in the set (the “instances”) have the same characteristics
◆ All instances conform to the same rules
So that...
LICENSE
License | Exp. Date | Manufacturer | Model
123 ABC | Jan. 97 | Ford | Taurus
691XKY | Mar.98 | Honda | Prelude
12-A-962 | Apr.98 | ? | Poodle
...you don’t get holes in the table
...you don’t get strange values
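The "no holes, no strange values" rules can be approximated with database constraints. The sketch below uses SQLite through Python's sqlite3 module: NOT NULL guards against holes, and a foreign key into a lookup table guards against strange values. The Car_Model lookup table, the underscore column names, and the fourth candidate row are hypothetical additions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical specification table listing the allowed models.
    CREATE TABLE Car_Model (
        Manufacturer TEXT NOT NULL,
        Model        TEXT NOT NULL,
        PRIMARY KEY (Manufacturer, Model)
    );
    CREATE TABLE License (
        License      TEXT PRIMARY KEY,
        Exp_Date     TEXT NOT NULL,
        Manufacturer TEXT NOT NULL,
        Model        TEXT NOT NULL,
        FOREIGN KEY (Manufacturer, Model)
            REFERENCES Car_Model (Manufacturer, Model)
    );
    INSERT INTO Car_Model VALUES ('Ford', 'Taurus'), ('Honda', 'Prelude');
""")


def accepts(row):
    """Return True if the row satisfies every constraint, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO License VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


candidate_rows = [
    ("123 ABC", "Jan. 97", "Ford", "Taurus"),
    ("691XKY", "Mar.98", "Honda", "Prelude"),
    ("12-A-962", "Apr.98", None, "Poodle"),    # hole: manufacturer missing
    ("12-A-963", "Apr.98", "Ford", "Poodle"),  # invented row: strange model
]
results = [accepts(row) for row in candidate_rows]
```

The two car rows are accepted; the dog licence row is refused either because of the missing manufacturer or because "Poodle" is not a known car model.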
Types of Objects (i.e., types of tables)
◆ Tangible Things e.g., book
◆ Roles e.g., doctor, patient, supervisor
◆ Incidents (= events, occurrences) e.g., ordering of a lab test
◆ Interactions (bind two or more other objects via a transaction) e.g., Purchase relates Buyer to Seller
◆ Specifications (definition tables of tangible things)
Table Notation
Empty Table form:
Patient_Admissions
Pt_ID | Date_Adm | Time_Adm | Unit | Room
Graphical Form:
Patient_Admissions
* Pt_ID
- Date_Adm
- Time_Adm
- Unit
- Room
Textual Form:
Patient_Admissions (Pt_ID, Date_Adm, Time_Adm, Unit, Room)
Formalisms for Tables
◆ Rule 1: One instance of an object has exactly one value for each attribute (i.e., only one data element at each row-column intersection; no repeating groups, no true “holes” in table)
◆ Rule 2: Attributes must contain no internal structure
Not OK:
Name | Age-Sex
Smith | 38-F
Jones | 22-M
Clark | 18-M
If Rules 1 and 2 are obeyed, the data model is in “First Normal Form”
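A minimal Python sketch of repairing the "Not OK" table: the combined Age-Sex attribute is split into two attributes with no internal structure. The helper name split_age_sex is hypothetical, and the sketch assumes every value has the form number-letter as in the slide.

```python
raw_rows = [("Smith", "38-F"), ("Jones", "22-M"), ("Clark", "18-M")]


def split_age_sex(value):
    """Split a combined 'age-sex' value such as '38-F' into (38, 'F')."""
    age_text, sex = value.split("-")
    return int(age_text), sex


# Each row now has exactly one simple value per attribute: Name, Age, Sex.
normalized = [(name, *split_age_sex(age_sex)) for name, age_sex in raw_rows]

# With separate attributes, a query no longer has to parse strings.
females_over_20 = [name for name, age, sex in normalized
                   if sex == "F" and age > 20]
```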
Formalisms for Tables, cont’d
◆ Rule 3: Every attribute should represent a characteristic of the entire object, not a characteristic of a limited part of the object
Not OK:
Hospital Committee Membership
* Person Name
* Committee Name
- Date committee term expires
- Date first joined hospital staff (attribute of hospital staff appointment, not committee)
OK:
Hospital Committee Membership
* Person Name
* Committee Name
- Date committee term expires
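Rule 3 can be sketched as a decomposition: the staff-joining date depends only on the person, so it moves to its own table keyed by person. This kind of partial dependency on a composite key is what is usually called a second normal form violation, although the slides do not use that term. All names and dates below are invented sample data.

```python
wide_rows = [
    # (person, committee, term_expires, joined_staff): invented sample data
    ("Smith", "Pharmacy", "Jun 97", "Jan 90"),
    ("Smith", "Library", "Dec 96", "Jan 90"),
    ("Jones", "Pharmacy", "Jun 98", "Mar 94"),
]

# Membership keeps only attributes of the whole (person, committee) pair.
membership = [(person, committee, expires)
              for person, committee, expires, _ in wide_rows]

# The joining date describes the person alone, so it is stored once per person.
staff_appointment = {}
for person, _, _, joined in wide_rows:
    previous = staff_appointment.setdefault(person, joined)
    if previous != joined:
        # The wide table allowed contradictory copies; surface them here.
        raise ValueError(f"conflicting staff dates for {person}")
```

In the wide table Smith's joining date is stored twice and could drift out of sync; after the split it is stored once.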
Relationships
◆ A relationship is the abstraction of a set of associations that hold systematically between different kinds of real world things
– Patient OCCUPIES bed
– Library CONTAINS books
– Specimen IS ASSAYED by Lab Method
◆ Most relationships may be stated in the inverse also:
– Library LENDS book
– Book IS LENT BY Library
Relationship Types
One-to-One: State has Governor; Governor governs State
One-to-Many: Dog Owner owns Dog; Dog is owned by Dog Owner
Many-to-Many: Author writes Book; Book is written by Author
Modeling Many-to-Many Relationships
DRUG MANUFACTURER
* manufacturer name
- other attributes
DRUG
* generic name
- other attributes
LICENSE
* manufacturer name
* generic name
- date licensed
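The three-table pattern above can be sketched in SQLite through Python's sqlite3 module. The LICENSE table holds one row per (manufacturer, drug) pair, so a manufacturer can license many drugs and a drug can be licensed to many manufacturers. The underscore identifiers, the helper names, and all sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Drug_Manufacturer (Manufacturer_Name TEXT PRIMARY KEY);
    CREATE TABLE Drug (Generic_Name TEXT PRIMARY KEY);
    CREATE TABLE License (
        Manufacturer_Name TEXT NOT NULL
            REFERENCES Drug_Manufacturer (Manufacturer_Name),
        Generic_Name TEXT NOT NULL REFERENCES Drug (Generic_Name),
        Date_Licensed TEXT,
        PRIMARY KEY (Manufacturer_Name, Generic_Name)
    );
""")

# Invented sample data.
conn.executemany("INSERT INTO Drug_Manufacturer VALUES (?)",
                 [("Acme Pharma",), ("Beta Labs",)])
conn.executemany("INSERT INTO Drug VALUES (?)",
                 [("amoxicillin",), ("ibuprofen",)])
conn.executemany("INSERT INTO License VALUES (?, ?, ?)", [
    ("Acme Pharma", "amoxicillin", "1995-01-10"),
    ("Acme Pharma", "ibuprofen", "1995-06-01"),
    ("Beta Labs", "ibuprofen", "1996-02-15"),
])
conn.commit()


def makers_of(generic_name):
    """Manufacturers licensed for one drug (one direction of the relationship)."""
    return [row[0] for row in conn.execute(
        "SELECT Manufacturer_Name FROM License "
        "WHERE Generic_Name = ? ORDER BY Manufacturer_Name", (generic_name,))]


def drugs_from(manufacturer_name):
    """Drugs licensed to one manufacturer (the inverse direction)."""
    return [row[0] for row in conn.execute(
        "SELECT Generic_Name FROM License "
        "WHERE Manufacturer_Name = ? ORDER BY Generic_Name", (manufacturer_name,))]
```

Both directions of the relationship are answered from the same LICENSE rows, and the relationship's own attribute (date licensed) has a natural home there.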
Overall System Design Process
◆ Build the Entity-Relationship diagram for all defined objects (tables), [including an Object Specification Document]
◆ [Create a State Transition Model which describes changes to objects based on events or transactions]
◆ [Create a Data Flow diagram which models the information elements which cause State Transitions]
[Recommended for multi-programmer projects]
Exercise: Devise a Relational Model for MEDLINE citations
Sample MEDLINE citation
UI - 90134185
AU - Greenes RA ; Shortliffe EH
TI - Medical Informatics. An Emerging academic discipline and institutional priority
MH - Hospital Information Systems; Career Choice; Medical Informatics/EDUCATION/*TRENDS
PT - JOURNAL ARTICLE; REVIEW; TUTORIAL
EM - 9005
AB - Information management constitutes a major activity of the health care profession. Currently a number of forces are focusing attention on this function...
AD - Department of Radiology, Brigham and Women’s Hosp., Boston, MA 02115
SO - JAMA 1990 Feb 23; 263(8):1114-20
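As one possible starting point for the exercise (not a model answer), the flat record can be parsed into tagged fields in Python, and the fields that hold lists can be turned into one row per value. The sketch assumes one "TAG - value" pair per line and that AU, MH and PT are the repeating fields separated by semicolons; the function names are hypothetical, and only part of the sample record is used.

```python
record_text = """\
UI - 90134185
AU - Greenes RA ; Shortliffe EH
TI - Medical Informatics. An Emerging academic discipline and institutional priority
MH - Hospital Information Systems; Career Choice; Medical Informatics/EDUCATION/*TRENDS
PT - JOURNAL ARTICLE; REVIEW; TUTORIAL
SO - JAMA 1990 Feb 23; 263(8):1114-20
"""

# Assumption: these tags hold ';'-separated lists (repeating groups).
REPEATING = {"AU", "MH", "PT"}


def parse_citation(text):
    """Turn 'TAG - value' lines into a dict keyed by tag."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, value = line.partition(" - ")
        fields[tag.strip()] = value.strip()
    return fields


def to_rows(fields):
    """Split one flat record into a parent row and child rows keyed by UI."""
    ui = fields["UI"]
    citation_row = (ui, fields["TI"], fields["SO"])
    child_rows = {
        tag: [(ui, item.strip()) for item in fields[tag].split(";")]
        for tag in sorted(REPEATING)
    }
    return citation_row, child_rows


citation_row, child_rows = to_rows(parse_citation(record_text))
```

Each list in child_rows corresponds to a candidate table (for example, one row per citation and author), which removes the repeating groups that Rule 1 forbids.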
The “Bottom Line” in Database Design
◆ The Data Model is the most critical aspect of system design and function
◆ Data Models should reflect real world objects and their relationships to ensure durability
◆ A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up
Questions?