database structure basics national center for immunization & respiratory diseases influenza...

30
Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training Workshop on Influenza Data Management Phnom Penh, Cambodia July 27 – August 2, 2013

Upload: donna-green

Post on 24-Dec-2015

216 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Database StructureBasics

National Center for Immunization & Respiratory Diseases

Influenza Division

Pam KennedyAnalyst, McKing Consulting

Regional Training Workshop on Influenza Data Management

Phnom Penh, Cambodia

July 27 – August 2, 2013

Page 2: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Course Objectives

• Designing a System Function Structure Data Types Data Relationships

• Data Entry Data Standards Forms

• Controls• Control Validation

Data Security

Section 1 Section 2

Questions

Page 3: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a SystemFunction

• Function of a database is to facilitate the analysis of large amounts of data quickly and efficiently

• Effective databases result from upfront planning before the first data table is built

Who has data /who is collecting data

What data is being collected/needs to be collected

Why is the data needed

When is data collected

Where is data collected

Page 4: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Structure

How many data tables are needed?

How will data be entered?

How will data be stored?

What level of data/database security is needed?

Who needs access to the files?

Will data be used in calculations?

Will regular reports be needed?

Is there a need to export data?

Database structure

Page 5: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Structure

• Database structure

Excel is a ‘flat’ file system

A "flat file" is a plain text or mixed text which usually contains one record per line

There are no structural relationships between the records.

ACCESS is a ‘relational’ system

A relational database matches data by using common characteristics found within the data set. The resulting groups of data are organized and are much easier for many people to understand.

http://en.wikipedia.org/wiki/Flat_file_databasehttp://en.wikipedia.org/wiki/Relational_database

Page 6: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Structure

• Excel and ACCESS both store data in a ‘column/row’ format

• Excel A ‘worksheet’ is the data storage tool

Excel workbook is a collection of worksheets Excel worksheets are ‘flat files’

• ACCESS A ‘table’ is the data storage tool (equivalent to an Excel ‘worksheet’)

Tables relate to other tables forming the basis for data analysis

Tables group like elements together Patient information – name, address, phone, Date of Birth, Site information - site name, site contact info, type of site

Page 7: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Structure

• Worksheets/Tables store data within the database Excel Worksheet

ACCESS Table

Record

Fields Field Value

Page 8: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Data Types

• There are different elements of data: Names, Dates, Cost information, Notes

• Data type is defined during database construction• Excel and ACCESS both have default data types

Excel – ‘general’ ACCESS – ‘text’

• Data type determines how data can be used in calculations Important during analysis or data summary

• Commonly used data types: Excel – general, date, currency ACCESS – text, number, date

Page 9: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Data Types

• Shared data types – Excel and ACCESS Text - for use storing names, short titles, country Number - for use storing numerical data Date/Time - for use storing dates and can capture time Currency - for use storing costs, price, etc.

• ACCESS Specific Data Types Memo - text field without character limit – for use storing notes

In Excel – there is no character limit in any one cell Yes/No - creates check box field in a table (yes = box checked)

AutoNumber – field that assigns an incremental number to each record

Page 10: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a SystemData Types

• Excel

Page 11: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Data Types

• ACCESS

Page 12: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Data Relationships

• ACCESS Specific

• What is a relational database ?

System that links data by using common characteristics

System that stores data in logical manner making analysis easier

Provide more efficient means of dealing with duplicated data

• An ACCESS database is composed of the following elements: Tables – Store data Queries – Aid in sorting, summing, calculating data Forms – Aid data input

Page 13: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Data Relationships

• Data relationships are critical

• Relationships should be set up as the database is built

• Questions to consider: How do the fields in different tables relate?

Is there a common factor in all the tables?

• This is the unique field that will apply across all data

Consider data to be collected and how it will be used

Consider data types – relationships link data of same type

Page 14: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a System Data Relationships

• Key types of relationships One to One – A unique record in Table 1 matches an exact

record in Table 2 One to Many – A unique record in Table 1 matches to several

records in Table 2

SARI Data Table 1

Lab Data Table 2

Patient ID Date of specimen collection

Date of Visit Type of sample collected

Age Sex Results Patient ID

Page 15: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Example of Data Relationship

Page 16: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Example of Data Relationship

Page 17: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data Entry

• Standards• Forms

Form functions

Controls• Drop down

• Check boxes

• Data Validation

Queries

Data Security

Page 18: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryStandards

• Why standardize? Create common understanding Ease comparison of data year to year

Increase efficiency for analyzing Know where data is Know what people are asking for

Everyone enters data in same way

Ease comparison of data from group to group Example

Dateshttp://en.wikipedia.org/wiki/Football

Page 19: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryStandards

• Dates

• Football

http://en.wikipedia.org/wiki/Football

12/2/2013 2/12/2013

Page 20: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryStandards

• Standards Definition of data types during design phase ensures

standardization of data Numbers stored as numbers Currency stored in specified currency format Dates stored uniformly - ‘Yes’ stored as ‘Yes’ not ‘Y’

‘Controls’ can be used to assist in maintaining data standards Limit data entry choices to defined terms Excel and ACCESS have similar ‘controls’ available

Page 21: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryForms

• Building a ‘data entry form’ Identify data fields Determine structure Determine data field relationship (ACCESS) Determine data type for each data field

Dates = date type Cost = currency

Determine common standard responses “Yes” will by stored as ‘Yes’ Choices will be ‘Yes’, ‘No’, ‘NA’ Date format = DDMMYY or DDMMYYYY

• Most of these decisions should be made during database structure design

Page 22: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryForms

• Excel

Page 23: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryForms

• ACCESS

Page 24: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryForms

• Queries Allow you to view, change, and analyze data in different

ways Several types :

Select Parameter Crosstab Action

• A select query is the most common type of query Retrieve data from one or more tables Group records and calculate sums, counts, averages, and

other types of totals

Stefano TEMPIAMicrosoft Access Tutorial – Lecture 2

Page 25: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryForms

• Select query

Stefano TEMPIAMicrosoft Access Tutorial – Lecture 2

Page 26: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Data EntryData Security

• Why security Ensure data is maintained according to set standards Ensure access to data is controlled Ensure accuracy of data

• Security levels Excel

Workbook Worksheet

ACCESS Database Record

Page 27: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a SystemDatabase Basics

• Questions???

Page 28: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a SystemDatabase Basics

• Final Thoughts The database should meet your current needs and

any future needs you can anticipate Build a structure that works for you and provides what

you need Be prepared to change and adapt as your data needs

change

Keep it SIMPLE!

Page 29: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

For more information please contact Centers for Disease Control and Prevention1600 Clifton Road NE, Atlanta, GA 30333Telephone, 1-800-CDC-INFO (232-4636)/TTY: 1-888-232-6348E-mail: [email protected] Web: www.cdc.gov

The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

THANK YOU

National Center for Immunization & Respiratory Diseases

Influenza Division

Page 30: Database Structure Basics National Center for Immunization & Respiratory Diseases Influenza Division Pam Kennedy Analyst, McKing Consulting Regional Training

Designing a SystemDatabase Basics

• EXERCISESetting up tables using national surveillance forms

Setting up data entry controls

Entering data using controls