comp4 unit6a lecture slides
TRANSCRIPT
Introduction to Information and Computer Science
Databases and SQL
Lecture a
This material (Comp4_Unit6a) was developed by Oregon Health and Science University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number IU24OC000015.
Databases and SQL
Learning Objectives
• Define and describe the purpose of databases (Lecture a)
• Define a relational database (Lecture a)
• Describe data modeling and normalization (Lecture b)
• Describe the structured query language (SQL) (Lecture c)
• Define the basic data operations for relational databases and how to implement them in SQL (Lecture c)
• Design a simple relational database and create corresponding SQL commands (Lecture c)
• Examine the structure of a healthcare database component (Lecture d)
Health IT Workforce Curriculum Version 3.0/Spring 2012
Data Representation
• It’s all 1s and 0s
• 01000001 can mean
  – 65 as a binary number
  – ‘A’ as an alphanumeric character (ASCII)
  – Many other options, including CPU instructions and multimedia data
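The point of this slide can be checked directly. The short Python sketch below takes the bit pattern 01000001 and reads it as a number, as an ASCII character, and as a raw byte. The variable names are illustrative only and are not part of the lecture.

```python
# The same eight bits, read three different ways.
bits = "01000001"

as_number = int(bits, 2)        # interpret the bits as a binary integer
as_character = chr(as_number)   # interpret that value as an ASCII code
as_byte = bytes([as_number])    # the raw byte as it would be stored in a file

print(as_number)     # 65
print(as_character)  # A
print(as_byte)       # b'A'
```

Nothing in the bits themselves says which reading is correct. The program that reads them decides.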
Data Storage
• A large component of computer systems is the management of data
• Storing and retrieving data are important functions
  – Efficiency
  – Speed
Data Storage Options
• Text/data files
• Spreadsheets
• Databases
Files
• A collection of information stored electronically in a single location
• Can store text or data
• Files have different formats
Advantages/Disadvantages of Files
Advantages
• Easy to create and store
• Easy to share
• Used by many applications
  – Input or output data from scientific computations
Disadvantages
• Limited security
• Multiple user access isn't supported
• Redundant and inconsistent data
Contact Information Example
File with contact information:
Bill Robeson, 1312 Main, Portland, OR, Community Hospital, Inc.
Walter Schmidt, 14 12th St., Oakland, CA, Oakland Providers LLC
Mary Stahl, 14 12th St., Oakland, CA, Oakland Providers LLC
Albert Brookings, 1312 Main, Portland, OR, Community Hospital Incorporated
Catherine David, 14 12th Street, Oakland, CA, Oakland Providers LLC
Quick!
• Do Bill and Albert work for the same company?
• Is there an issue with Catherine and Walter?
• Can a computer application tell?
• Give me a contact list sorted by last name
• Imagine with 10,000 contacts!
Quick! Answers
• Bill and Albert work for the same company – but it’s represented differently
• Catherine and Walter have the same address – again represented differently
• It’s hard for a computer application to tell
• You CAN sort by hand – but it’s a challenge
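To see why an application struggles here, the sketch below loads the five lines from the contact file into Python and asks the same questions with plain string comparison. The `parse` helper and the field names are assumptions made for illustration. The lecture does not prescribe a file format beyond the comma-separated lines shown.

```python
# Contact lines copied from the example file.
lines = [
    "Bill Robeson, 1312 Main, Portland, OR, Community Hospital, Inc.",
    "Walter Schmidt, 14 12th St., Oakland, CA, Oakland Providers LLC",
    "Mary Stahl, 14 12th St., Oakland, CA, Oakland Providers LLC",
    "Albert Brookings, 1312 Main, Portland, OR, Community Hospital Incorporated",
    "Catherine David, 14 12th Street, Oakland, CA, Oakland Providers LLC",
]

def parse(line):
    # Hypothetical helper: split into five fields, keeping any extra
    # commas (such as ", Inc.") inside the company name.
    name, street, city, state, company = line.split(", ", 4)
    return {"name": name, "street": street, "city": city,
            "state": state, "company": company}

contacts = {c["name"]: c for c in map(parse, lines)}

# Exact string comparison cannot see that these mean the same thing.
same_company = (contacts["Bill Robeson"]["company"]
                == contacts["Albert Brookings"]["company"])
same_street = (contacts["Catherine David"]["street"]
               == contacts["Walter Schmidt"]["street"])
print(same_company)  # False
print(same_street)   # False

# Sorting by last name works, but only because every name here
# happens to be "First Last" with nothing else attached.
by_last_name = sorted(contacts, key=lambda name: name.split()[-1])
print(by_last_name)
```

Both comparisons come back false even though a person reading the file would answer yes. The sort also depends on an assumption about how names are written that the file does not enforce.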
Another Problem
• What do you do if “Community Hospital” becomes “Community General”?
  – Find every instance of “Community Hospital” or variation thereof
  – Change EVERY entry
Another Solution: Spreadsheets
• Spreadsheet applications store, manipulate and present data
• Provide more functionality than plain text files
  – Calculations
  – Sorting
  – Filtering
  – Data analysis
Spreadsheet Example
OpenOffice Calc spreadsheet example. (PD-US, 2011).
Advantages/Disadvantages of Spreadsheets
Advantages
• Widely available
• Powerful calculations
• Basic sorting and filtering
Disadvantages
• Limited security
• Multiple user access isn't supported
• Redundant and inconsistent data
Databases
• Definition:
  – Structured data collection accessed electronically
• Files are simple databases
• Relational databases maintain relationships between data
Relational Database
• Introduced by Dr. Edgar Codd of IBM Research Laboratory in 1970
  – “Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation).”
• Definition:
  – An organized collection of data accessible by electronic means where the information type and information relationships are maintained
Relational Database Contents
• A relational database contains tables
• Tables contain multiple rows of data
• Rows contain data of specified type(s) in a column order
• Data and type are independent
• Row order does not matter, but column order does.
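As a rough illustration of these points, the sketch below builds one table using Python's built-in `sqlite3` module and an in-memory database. The table name `contact` and its columns are assumptions chosen to match the earlier contact example. SQLite is also more lenient about column types than many relational database systems, so treat the declared types as descriptive here.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table declares its columns, their types, and their order.
conn.execute("""
    CREATE TABLE contact (
        name   TEXT,
        street TEXT,
        city   TEXT,
        state  TEXT
    )
""")

# Each row supplies values in the declared column order.
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?, ?)",
    [
        ("Bill Robeson", "1312 Main", "Portland", "OR"),
        ("Walter Schmidt", "14 12th St.", "Oakland", "CA"),
        ("Mary Stahl", "14 12th St.", "Oakland", "CA"),
    ],
)

# Stored row order carries no meaning, so a query asks for the order it wants.
rows = conn.execute(
    "SELECT name, state FROM contact ORDER BY name"
).fetchall()
print(rows)
```

The insert relies on column order, since the values are matched to columns by position. The query does not rely on row order and requests a sort explicitly.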
Advantages/Disadvantages of Relational Databases
Advantages
• Secure
• Multiple user access
• Relationships prevent redundancy and inconsistency
• Optimized operations
• Complex queries
Disadvantages
• Expertise required
• Limited data calculations
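The claim that relationships prevent redundancy can be sketched with the earlier renaming problem. In the hypothetical schema below, again using Python's `sqlite3` module, each company name is stored once in a `company` table and contacts refer to it by `id`. The table names, column names, and id values are assumptions for illustration, not a design taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The company name lives in exactly one row; contacts point to it by id.
conn.executescript("""
    CREATE TABLE company (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE contact (
        name       TEXT NOT NULL,
        company_id INTEGER NOT NULL REFERENCES company(id)
    );
    INSERT INTO company (id, name) VALUES (1, 'Community Hospital');
    INSERT INTO company (id, name) VALUES (2, 'Oakland Providers LLC');
    INSERT INTO contact VALUES ('Bill Robeson', 1);
    INSERT INTO contact VALUES ('Albert Brookings', 1);
    INSERT INTO contact VALUES ('Walter Schmidt', 2);
""")

# Renaming the company is a single change to a single row.
conn.execute("UPDATE company SET name = 'Community General' WHERE id = 1")

# Every contact linked to that company sees the new name.
rows = conn.execute("""
    SELECT contact.name, company.name
    FROM contact
    JOIN company ON contact.company_id = company.id
    ORDER BY contact.name
""").fetchall()
for contact_name, company_name in rows:
    print(contact_name, "-", company_name)
```

Because Bill and Albert both refer to the same company row, they cannot end up with two spellings of the same employer, and there is no need to hunt for every variation.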
Databases and SQL
Summary – Lecture a
• Data can be stored in files, spreadsheets or databases
• Files and spreadsheets
  – Widely available
  – Good for computations
• Databases
  – Secure
  – Optimized for speed
  – Multiple user access
  – Store relationships
Databases and SQL
References – Lecture a
References
• American National Standards Institute. (2007). Information Systems - Coded Character Sets - 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII) (No. ANSI INCITS 4-1986 (R2007)).
• Codd, E. F. (1970). A relational model of data for large shared data banks. Communications of the ACM, 13(6), 377-387.
Images
• Slide 13: OpenOffice Calc spreadsheet example. (PD-US, 2011).