Preparing to Automate Data Management
Chapter 1
Chapter Introduction
• Discovery phase includes:
  – Gathering all existing data
  – Researching missing and incomplete data
  – Talking with users about data output needs
• Subsequent steps in process include:
  – Putting data into groups called tables
  – Identifying unique values for each record in those tables
  – Designing database to produce desired output
Succeeding in Business with Microsoft Access 2010 2
Database Design Process: The Discovery Phase
Level 1 Objectives: Examining Existing and Missing Sources of Data
• Discover and evaluate sources of existing data
• Research sources of missing data
• Assign data to tables and use field types and sizes to define data
Discovering and Evaluating Sources of Existing Data
• Identify information that organization needs to manage and organize
• Might begin to see patterns that indicate how to organize data
• Database management system (DBMS)
  – Includes:
    • Oracle
    • Microsoft Access
    • MySQL
Discovering and Evaluating Sources of Existing Data (cont’d)
• Data duplication
  – Undesirable
    • Additional space required in database to store extra records
    • Leads to inconsistent and inaccurate data
• Data redundancy
  – Same data repeated for different records
Researching Sources of Missing Data
• Part of discovery phase
• Must ask right questions of right people to get right answers
Assimilating the Available Information and Planning the Database
• First step in database design
  – Determine best way to organize data into logical groups of fields
• Field
  – Single characteristic of entity
  – Also called column
• Record
  – Values in each field in table
  – Also called row
Assimilating the Available Information and Planning the Database (continued)
• Table
  – Collection of fields that describe one entity
  – Also called entity or relation
• Database
  – Collection of one or more tables
• Relational database
  – Contains tables related through fields that contain identical data
Evaluating Field Values and Assigning Appropriate Data Types
• Data type
  – Determines how to store data in field
• DBMSs use different names for some data types
• How do you determine which data type to assign each field?
  – Depends on what function you want to derive from data
  – Each data type has different properties
Common Data Types and Their Descriptions
The Text and Memo Data Types
• Text data type
  – Letters and numbers
  – Not used in calculations or formulas
  – Stores maximum of 255 characters
  – Default for all fields created in Access database
• Memo data type
  – Stores long passages of text
  – Displays only 65,000 characters
The Number Data Type
• Stores both positive and negative numbers
• Contains up to 15 digits
• Use for values used in calculations
The Currency Data Type
• Includes two decimal places and displays values with dollar sign
• Use for monetary values
The Date/Time Data Type
• Displays values in format mm/dd/yyyy
  – Can also include time in different formats
• Used in calculations if necessary
The AutoNumber Data Type
• Number automatically generated by Access
• Produces unique values for each record
• Useful to distinguish two records that share identical information
The Yes/No Data Type
• Assigned to fields requiring
  – Yes/no
  – True/false
  – On/off
• Takes up one character of storage space
• Makes data entry easy
  – Check box
The OLE Object Data Type
• Used to identify files created in another program
  – Then linked or embedded in database
• Abbreviation for object linking and embedding
The Hyperlink Data Type
• Assigned to fields that contain hyperlinks to
  – Web pages
  – E-mail addresses
  – Files that open in
    • Web browser
    • E-mail client
    • Another application
The Attachment Data Type
• Lets you store one or more files for each record in the database
  – Pictures
  – Documents
  – Charts
  – Spreadsheets
The Calculated Type
• New for Access 2010
• Uses data from fields in the same table to perform calculations
• When selected, opens Expression Builder so you can create the calculation or expression
The Lookup Data Type
• Creates fields to look up data in
  – Another table
  – Or list of values created for field
• Makes data entry easy
• Ensures that valid data is entered into field
Selecting the Correct Data Type
• Helps store correct data in correct format while using least amount of space
• Eases data entry and interactivity with data
• Choosing certain data types results in user-friendly interactive features
  – Drop-down menus
  – Check boxes
  – Hyperlinks
• Correctly manipulate data
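
One way to see why the choice matters: a ZIP code looks numeric but is never used in calculations, so it fits a text field better than a number field. The short Python sketch below is only an illustration; the sample values and variable names are made up and are not from the slides.

```python
# Hypothetical sample value: a ZIP code with a leading zero.
zip_as_text = "02134"

# Stored as a number, the leading zero is lost.
zip_as_number = int(zip_as_text)

# A quantity, by contrast, is used in arithmetic and belongs in a numeric field.
quantity = 3
unit_price = 4.50
line_total = quantity * unit_price

print(zip_as_text, zip_as_number, line_total)
```

The same reasoning applies to phone numbers and similar identifiers that happen to be made of digits.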
Assigning the Correct Field Size for Text Fields
• Important to consider field size when assigning data types
  – Minimize space reserved for each record by assigning smallest data type that will store data
• Be conservative when assigning field sizes
  – But not too conservative
Assigning the Correct Field Size for Number Fields
Dividing the Existing and Missing Data into Tables
• Tables
  – Single most important component of database
  – Most databases contain:
    • Multiple tables
    • Hundreds or even thousands of records
• Primary key
  – One field that creates unique value in each record
  – Used to identify each record in table
  – May be a combination of fields
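
A rough way to test whether a field, or a combination of fields, could serve as a primary key is to check that its values are unique and never missing. The sketch below assumes records are held as Python dictionaries; the function name `is_unique_key` and the sample data are hypothetical.

```python
def is_unique_key(records, fields):
    """Return True if the field(s) hold a distinct, non-null value in every record."""
    seen = set()
    for record in records:
        key = tuple(record.get(f) for f in fields)
        if None in key or key in seen:
            return False
        seen.add(key)
    return True

# Hypothetical sample records.
customers = [
    {"CustID": 1, "LastName": "Smith", "City": "Denver"},
    {"CustID": 2, "LastName": "Smith", "City": "Boulder"},
    {"CustID": 3, "LastName": "Jones", "City": "Denver"},
]

print(is_unique_key(customers, ["CustID"]))            # single-field key
print(is_unique_key(customers, ["LastName"]))          # duplicate last names
print(is_unique_key(customers, ["LastName", "City"]))  # combination of fields
```

A combination that is unique in a small sample may still collide later, which is one reason a dedicated ID field is often preferred.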
Database Design Process: Planning the Tables
Naming Conventions
• Database tables must
  – Have unique names
  – Follow established naming conventions
• General rules for naming objects
  – Object names cannot exceed 64 characters
  – Object names cannot include period, exclamation point, accent grave, or brackets
  – Object names should not include spaces
  – Most developers capitalize first letter of each word when table name includes two words
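
The general rules above can be expressed as a small checker. This is a sketch, not part of Access: the function name `check_object_name` is hypothetical, and the example name `tblCustomer` assumes the `tbl` table tag commonly associated with the Leszynski/Reddick conventions.

```python
# Characters the slide lists as not allowed in object names.
FORBIDDEN = set(".!`[]")

def check_object_name(name):
    """Return a list of problems with a proposed object name (empty if none)."""
    problems = []
    if len(name) > 64:
        problems.append("longer than 64 characters")
    if FORBIDDEN & set(name):
        problems.append("contains a period, exclamation point, accent grave, or bracket")
    if " " in name:
        problems.append("contains a space")
    return problems

print(check_object_name("tblCustomer"))
print(check_object_name("Order Details"))
```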
Leszynski/Reddick Naming Conventions for Database Objects
Level 1 Summary
• Discovery phase
• Identify existing and missing data
• Determine tables
  – Determine data types
• Follow naming conventions
Level 2 Objectives: Understanding and Creating Table Relationships
• Understand relational database objects and concepts
• Create table relationships
• Understand referential integrity
Understanding Relational Database Objects
• Users can view data in tables by:
  – Opening table
  – Creating other objects
• Four main objects in database
  – Tables
  – Queries
  – Forms
  – Reports
Tables
• Data in relational database stored in one or more tables
• View data in table – Open it and scroll through records
• Most of the time, three other main database objects used to display data
Queries
• Query
  – Question asked about data stored in database
• Query results
  – Look similar to table
  – Fields displayed in columns
  – Records displayed in rows
Queries (continued)
• Select query
  – Most commonly used query
  – Data selected from table on which query is based
• Action query
  – Performs action on table
  – Selects specific records in table and updates them
• Crosstab query
  – Performs calculations on values in field and displays results in datasheet
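
The difference between a select query and an action query can be sketched with Python's built-in `sqlite3` module standing in for Access. The table `tblDrug` and its contents are made up for illustration, and Access SQL differs from SQLite in some details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblDrug (DrugID INTEGER PRIMARY KEY, Name TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO tblDrug VALUES (?, ?, ?)",
    [(1, "Aspirin", 4.0), (2, "Ibuprofen", 6.0), (3, "Insulin", 30.0)],
)

# Select query: choose the records that answer a question about the data.
cheap = conn.execute(
    "SELECT Name, Price FROM tblDrug WHERE Price < 10 ORDER BY Name"
).fetchall()

# Action query: select specific records and update them.
conn.execute("UPDATE tblDrug SET Price = Price + 1 WHERE Price < 10")
updated = conn.execute("SELECT Price FROM tblDrug ORDER BY DrugID").fetchall()

print(cheap)
print(updated)
```

The select query leaves the table unchanged, while the action query modifies the stored records.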
Forms
• Used to view, add, delete, and update records in database
• Based on table or query
• Interface more attractive than table datasheet
• Customize form’s appearance with instructions and command buttons
• Switchboard or Navigation form
  – Form displayed when database opened
  – Provides controlled method for users to open objects in database
Form Based on a Table
Reports
• Formatted presentation of data from table or query
• Created as printout or to be viewed on screen
• Data displayed by report usually based on query
• Dynamic
  – Reflects latest data from object
• Cannot be used to modify data
Accounts Receivable Report
Other Database Objects
• Macro
  – Set of instructions
  – Automates certain database tasks
  – Usually automates simple tasks
• Module
  – Contains instructions to automate database task
  – Written in Visual Basic for Applications (VBA)
  – Performs more sophisticated actions than macro
Understanding Relational Database Concepts
• Relational database
  – Contains multiple tables to store related information
• Common field
  – Field that appears in two or more tables and contains identical data to relate tables
  – Primary key in first table
  – Foreign key in second table
Creating Table Relationships
• Goal in good database design
  – Create separate tables for each entity
  – Ensure each table has primary key
  – Use common field to relate tables
• Relate two (or more) tables
  – Query them as though they are one big table
• Join
  – Specifies relationship between tables and properties of relationship
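
Querying two related tables "as though they are one big table" can be sketched as follows, again using `sqlite3` as a stand-in for Access. The table and field names (`tblCustomer`, `tblRx`, `CustID`) are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblRx (RxID INTEGER PRIMARY KEY, CustID INTEGER, Drug TEXT);
    INSERT INTO tblCustomer VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO tblRx VALUES (10, 1, 'Aspirin'), (11, 1, 'Insulin'), (12, 2, 'Ibuprofen');
""")

# CustID is the common field: primary key in tblCustomer, foreign key in tblRx.
rows = conn.execute("""
    SELECT c.LastName, r.Drug
    FROM tblCustomer AS c
    JOIN tblRx AS r ON r.CustID = c.CustID
    ORDER BY r.RxID
""").fetchall()

print(rows)
```

The customer's name is stored once in `tblCustomer` yet appears beside each matching prescription in the joined result.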
One-to-Many Relationships
• Abbreviated as 1:M
• One record in first table matches zero, one, or many records in related table
• Primary table
  – One side
• Related table
  – Many side
One-to-Many Relationship Between Customers and Prescriptions
One-to-One Relationships
• Abbreviated as 1:1
• Exists when each record in one table matches exactly one record in related table
One-to-One Relationship Between Physical and Billing Addresses
Many-to-Many Relationships
• Abbreviated as M:N
• Each record in first table matches many records in second table
• Each record in second table matches many records in first table
• Junction table
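
A junction table resolves a many-to-many relationship into two one-to-many relationships. The sketch below uses `sqlite3` with hypothetical employee and class tables; all table names, field names, and data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblEmployee (EmpID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE tblClass (ClassID INTEGER PRIMARY KEY, Title TEXT);
    -- Junction table: one row per (employee, class) pairing.
    CREATE TABLE tblEmployeeTraining (
        EmpID INTEGER REFERENCES tblEmployee(EmpID),
        ClassID INTEGER REFERENCES tblClass(ClassID),
        PRIMARY KEY (EmpID, ClassID)
    );
    INSERT INTO tblEmployee VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO tblClass VALUES (100, 'CPR'), (200, 'First Aid');
    INSERT INTO tblEmployeeTraining VALUES (1, 100), (1, 200), (2, 100);
""")

# One employee matches many classes...
classes_for_ana = [r[0] for r in conn.execute(
    "SELECT c.Title FROM tblEmployeeTraining t JOIN tblClass c ON c.ClassID = t.ClassID "
    "WHERE t.EmpID = 1 ORDER BY c.ClassID")]

# ...and one class matches many employees.
employees_in_cpr = [r[0] for r in conn.execute(
    "SELECT e.Name FROM tblEmployeeTraining t JOIN tblEmployee e ON e.EmpID = t.EmpID "
    "WHERE t.ClassID = 100 ORDER BY e.EmpID")]

print(classes_for_ana, employees_in_cpr)
```

The junction table's primary key here is the combination of the two foreign keys, so the same pairing cannot be entered twice.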
Many-to-Many Relationship Between Employees and Classes
Understanding Referential Integrity
• Null value
  – Field does not contain any value
• Entity integrity
  – Guarantee that there are no duplicate records in table
  – Each record unique
  – No primary key field contains null values
• Referential integrity
  – Applies when foreign key in one table matches primary key in second table
  – Values in foreign key must match values in primary key
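
Referential integrity enforcement can be sketched with `sqlite3`, which checks foreign keys only after `PRAGMA foreign_keys = ON` is issued. Access enforces the rule through its relationship settings instead; the tables and values here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT NOT NULL);
    CREATE TABLE tblRx (
        RxID INTEGER PRIMARY KEY,
        CustID INTEGER NOT NULL REFERENCES tblCustomer(CustID)
    );
    INSERT INTO tblCustomer VALUES (1, 'Smith');
    INSERT INTO tblRx VALUES (10, 1);
""")

# Customer 99 does not exist, so this foreign key value has no matching primary key.
try:
    conn.execute("INSERT INTO tblRx VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```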
Understanding Referential Integrity (continued)
• When database does not enforce referential integrity
  – Problems occur that lead to inaccurate and inconsistent data
• Orphaned records
  – No longer a match between primary key in primary table and foreign keys in related table
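
Orphaned records can be located by looking for foreign key values that have no matching primary key. In this sketch no foreign key constraint is declared, so nothing prevents the parent record from being deleted; the tables and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No foreign key constraint is declared, so the delete below is not blocked.
conn.executescript("""
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblRx (RxID INTEGER PRIMARY KEY, CustID INTEGER, Drug TEXT);
    INSERT INTO tblCustomer VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO tblRx VALUES (10, 1, 'Aspirin'), (11, 2, 'Insulin');
    DELETE FROM tblCustomer WHERE CustID = 2;  -- parent record removed
""")

# A LEFT JOIN keeps every prescription; rows with no matching customer are orphans.
orphans = conn.execute("""
    SELECT r.RxID, r.CustID
    FROM tblRx AS r
    LEFT JOIN tblCustomer AS c ON c.CustID = r.CustID
    WHERE c.CustID IS NULL
""").fetchall()

print(orphans)
```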
Referential Integrity Errors
Overriding Referential Integrity
• Might want to override referential integrity
  – Intentionally change primary key
  – Delete parent record
• Cascade updates
  – Change primary key value so that DBMS automatically updates appropriate foreign key values in related table
• Cascade deletes
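
Cascading behavior can be sketched in `sqlite3` with `ON UPDATE CASCADE` and `ON DELETE CASCADE`, which play a role comparable to the cascade options in Access relationships. Table names and data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply the cascades
conn.executescript("""
    CREATE TABLE tblCustomer (CustID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE tblRx (
        RxID INTEGER PRIMARY KEY,
        CustID INTEGER REFERENCES tblCustomer(CustID)
            ON UPDATE CASCADE ON DELETE CASCADE
    );
    INSERT INTO tblCustomer VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO tblRx VALUES (10, 1), (11, 2);
""")

# Cascade update: changing the primary key also changes the matching foreign keys.
conn.execute("UPDATE tblCustomer SET CustID = 5 WHERE CustID = 1")
after_update = conn.execute("SELECT RxID, CustID FROM tblRx ORDER BY RxID").fetchall()

# Cascade delete: deleting the parent record also deletes its related records.
conn.execute("DELETE FROM tblCustomer WHERE CustID = 2")
after_delete = conn.execute("SELECT RxID, CustID FROM tblRx ORDER BY RxID").fetchall()

print(after_update, after_delete)
```

Cascade deletes remove related data without further prompting, so they are usually enabled with care.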
Level 2 Summary
• Main database objects:
  – Table
  – Query
  – Form
  – Report
• Relationship types:
  – One-to-many
  – One-to-one
  – Many-to-many
Level 3 Objectives: Identifying and Eliminating Database Anomalies by Normalizing Data
• Learn the techniques for normalizing data
• Evaluate fields that are used as keys
• Test the database design
Normalizing the Tables in the Database
• Normalization
  – Design process
  – Goals:
    • Reduces space required to store data by eliminating duplicate data in database
    • Reduces inconsistent data in database by storing data only once
    • Reduces chance of deletion, update, and insertion anomalies
Normalizing the Tables in the Database (continued)
• Deletion anomaly
  – User deletes data from database
  – Unintentionally deletes only occurrence of data in database
• Update anomaly
  – Due to redundant data in database
  – User fails to update some records or updates records erroneously
• Insertion anomaly
  – User cannot add data to database unless preceded by entry of other data
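
Update and deletion anomalies are easy to reproduce in a single unnormalized table. The sketch below uses a plain Python list of dictionaries; the doctors, phone numbers, and field names are invented for illustration.

```python
# Hypothetical unnormalized table: the doctor's phone is repeated on every row.
rx_rows = [
    {"RxID": 10, "Doctor": "Lee", "DoctorPhone": "555-0100"},
    {"RxID": 11, "Doctor": "Lee", "DoctorPhone": "555-0100"},
    {"RxID": 12, "Doctor": "Kim", "DoctorPhone": "555-0199"},
]

# Update anomaly: only one of Dr. Lee's rows is changed, so the data now disagrees.
rx_rows[0]["DoctorPhone"] = "555-0123"
lee_phones = {r["DoctorPhone"] for r in rx_rows if r["Doctor"] == "Lee"}

# Deletion anomaly: removing the only prescription written by Dr. Kim also
# removes the only record of Dr. Kim's phone number.
rx_rows = [r for r in rx_rows if r["RxID"] != 12]
kim_known = any(r["Doctor"] == "Kim" for r in rx_rows)

print(lee_phones, kim_known)
```

An insertion anomaly shows up in the same table: a new doctor cannot be recorded until at least one prescription row exists for that doctor.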
Normalizing the Tables in the Database (continued)
• Functional dependency
  – Column in table considered functionally dependent on another column
    • If each value in second column is associated with exactly one value in first column
• Partial dependency
  – Field dependent on only part of primary key
• Composite primary key
  – Primary key uses two or more fields to create unique records in table
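
A functional dependency can be tested against sample data by checking that each value of the determining column is paired with exactly one value of the dependent column. The function name and the rows below are hypothetical, and passing the test on sample data does not prove the dependency holds in general.

```python
def is_functionally_dependent(rows, dependent, determinant):
    """True if every `determinant` value is paired with exactly one `dependent` value."""
    seen = {}
    for row in rows:
        key = row[determinant]
        # setdefault stores the first value seen for this key and returns it afterward.
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True

# Hypothetical table with composite primary key (EmpID, ClassID).
rows = [
    {"EmpID": 1, "Name": "Ana", "ClassID": 100},
    {"EmpID": 1, "Name": "Ana", "ClassID": 200},
    {"EmpID": 2, "Name": "Ben", "ClassID": 100},
]

print(is_functionally_dependent(rows, "Name", "EmpID"))
print(is_functionally_dependent(rows, "ClassID", "EmpID"))
```

Here `Name` depends on `EmpID` alone, which is only part of the composite key, so it is also an example of a partial dependency.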
Normalizing the Tables in the Database (continued)
• Determinant
  – Field or collection of fields whose value determines value in another field
  – Inverse of dependency
• Natural key
  – Primary key that details obvious and innate trait of record
• Artificial key
  – Field whose sole purpose is to create primary key
  – Usually visible to users
Normalizing the Tables in the Database (continued)
• Surrogate key
  – Computer-generated primary key
  – Usually invisible to users
First Normal Form
• Repeating group
  – Field contains more than one value
• First normal form
  – 1NF
  – Does not contain any repeating groups
Filling in the missing columns will not remove the redundancy.
The solution is to split the “wide” table into two.
Remember to retain the link between the two tables.
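
The split can be sketched in Python. The "wide" table below, with its `Class1` to `Class3` repeating group, is a made-up example and not the table shown on the original slides.

```python
# Hypothetical "wide" table: Class1..Class3 form a repeating group.
wide = [
    {"EmpID": 1, "Name": "Ana", "Class1": "CPR", "Class2": "First Aid", "Class3": None},
    {"EmpID": 2, "Name": "Ben", "Class1": "CPR", "Class2": None, "Class3": None},
]

# Table 1: one row per employee.
employees = [{"EmpID": r["EmpID"], "Name": r["Name"]} for r in wide]

# Table 2: one row per class taken; EmpID is kept as the link back to employees.
employee_classes = [
    {"EmpID": r["EmpID"], "Class": r[col]}
    for r in wide
    for col in ("Class1", "Class2", "Class3")
    if r[col] is not None
]

print(employees)
print(employee_classes)
```

The second table holds one value per field, and an employee can now take any number of classes without adding columns.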
Second Normal Form
• 2NF
• Table must be in 1NF
• Must not contain any partial dependencies on composite primary key
• Tables in 1NF that contain primary key with only one field
  – Automatically in 2NF
Still has redundancy: with a composite primary key (PK), some fields depend on only part of the PK.
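
Removing partial dependencies amounts to moving each field into a table keyed by the part of the primary key it actually depends on. The sketch below assumes a hypothetical training table and a small helper, `project`, that is not from the slides.

```python
# Hypothetical 1NF table with composite primary key (EmpID, ClassID).
# Name depends only on EmpID and Title only on ClassID: partial dependencies.
training = [
    {"EmpID": 1, "ClassID": 100, "Name": "Ana", "Title": "CPR", "DateTaken": "2010-03-01"},
    {"EmpID": 1, "ClassID": 200, "Name": "Ana", "Title": "First Aid", "DateTaken": "2010-04-15"},
    {"EmpID": 2, "ClassID": 100, "Name": "Ben", "Title": "CPR", "DateTaken": "2010-03-01"},
]

def project(rows, fields):
    """Keep only `fields` and drop duplicate rows, preserving first-seen order."""
    unique = dict.fromkeys(tuple(row[f] for f in fields) for row in rows)
    return [dict(zip(fields, values)) for values in unique]

employees = project(training, ["EmpID", "Name"])                    # keyed by EmpID
classes = project(training, ["ClassID", "Title"])                   # keyed by ClassID
attendance = project(training, ["EmpID", "ClassID", "DateTaken"])   # keyed by both

print(len(employees), len(classes), len(attendance))
```

Only `DateTaken` depends on the whole composite key, so it is the only nonkey field left in the attendance table.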
Third Normal Form
• 3NF
• The only determinants must be candidate keys
• Candidate key
  – Field or collection of fields that could function as primary key but was not chosen to do so
• Transitive dependency
  – Occurs between two nonkey fields both dependent on third field
• Tables in 3NF should not have transitive dependencies
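
A transitive dependency is removed by moving the dependent nonkey field into its own table, keyed by its determinant. The employee and job data below are invented for illustration.

```python
# Hypothetical 2NF table keyed by EmpID. JobTitle depends on JobID, a nonkey
# field, so JobID is a determinant that is not a candidate key of this table.
employees = [
    {"EmpID": 1, "Name": "Ana", "JobID": 7, "JobTitle": "Pharmacist"},
    {"EmpID": 2, "Name": "Ben", "JobID": 7, "JobTitle": "Pharmacist"},
    {"EmpID": 3, "Name": "Cy", "JobID": 9, "JobTitle": "Cashier"},
]

# Move the dependent field into its own table, keyed by its determinant.
jobs = {row["JobID"]: row["JobTitle"] for row in employees}

# Keep JobID in the employee table as the foreign key to the new table.
employees_3nf = [
    {"EmpID": r["EmpID"], "Name": r["Name"], "JobID": r["JobID"]} for r in employees
]

print(jobs)
print(employees_3nf[0])
```

Each job title is now stored once, so renaming a job requires a change in only one place.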
Level 3 Summary
• Normal forms
  – First (1NF)
  – Second (2NF)
  – Third (3NF)
Chapter Summary
• Discovery:
  – Identify existing and missing data
  – Organize data into tables
  – Determine data types for each field
• Table relationships
  – Established through common fields
  – Types
    • 1:M
    • 1:1
    • M:N
Chapter Summary (continued)
• Normalization
  – Reduces duplication and inconsistency
  – Forms:
    • 1NF
    • 2NF
    • 3NF