The Relational Data Model and SQL

10:30 AM to 12 noon, Monday, July 18th, 2005

CSIG05

Chaitan Baru


OUTLINE

• Foundations of the relational data model
  – Data models and database systems
  – Relations, attributes, keys
• Introduction to SQL
• “Hands-on” with SQL
  – Vishwanath Nandigam


Historical note: IBM’s role in database systems

• IBM’s early database systems were based on the hierarchical data model
  – IMS (Information Management System)
  – IMS serves more than 95 percent of Fortune 1000 companies
  – Manages 15 petabytes of production data
  – Supports more than two hundred million users per day

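[Figure: example of a hierarchical database with three levels, Student → Course → Instructor. Student records (S1, S2) sit at the top, course records (C1 to C4) beneath the students, and instructor records (I1 to I3) beneath the courses; course C2 and instructors I1 and I2 appear more than once.]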


IBM and Relational Database Systems

• The relational model was introduced to separate application logic from the data representation
• DB2: IBM’s second database product!
• Foundations of the relational model were invented at IBM by E. F. Codd
  – A Relational Model of Data for Large Shared Data Banks, E. F. Codd, June 1970
• First prototype, System R, was developed at IBM’s San Jose Research Laboratory (the predecessor of IBM Almaden) in the mid-70’s
  – Introduced SQL
  – Provided as SQL/DS on IBM mainframe systems
  – Oracle was based on early System R work


Why the Relational Model?

• Deals with the application logic / data separation in business data processing, unlike the earlier “network” and “hierarchical” data models
• Plus, an algebra for manipulating relations: the key innovation

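[Figure: Student, Course and Instructor linked by two relationships, “takes” between Student and Course (labeled N, M) and “teaches” between Course and Instructor (labeled N, 1). Corresponding relations: Student(S#, student info), Course(C#, course info), takes(S#, C#).]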


What is a Database System?

• Database (system) =
  – Database Instance (e.g. set of tables of rows)
  – Database Management System (DBMS)
• Origins in the commercial world:
  – to organize, query, and manipulate data more effectively, efficiently, and independently
• Scientific databases
  – often special features:
    • spatial, temporal, spatiotemporal, GIS, units, uncertainty, raw & derived data, …


Why not just use files as “databases”?

• Works for some applications…
• But:
  – scanning & ‘grep’ing large files can be very inefficient
  – no language support for selecting desired data, joining them, etc.
    • cannot express the kinds of questions/queries you’d like to ask
    • ‘grep’ is no substitute for a query language
  – redundant and/or inconsistent storage of data
  – no transaction management and concurrency control among multiple users
  – no security
  – no recovery
  – no data independence (application logic is not separated from the data)
  – no data modeling support
  – …


Features of a Database System

• A data model (relational, object-oriented, XML) prescribes how data can be organized:
  – as relations (tables) of tuples (rows)
  – as classes of (linked) objects
  – as XML trees
• A (database) schema (stored in the “data dictionary”) defines the structure of a specific database instance:
  – Relational schema
  – OO schema
  – XML Schema (or XML DTD)
• A query language
  – Allows ad hoc, declarative (non-procedural) queries on schema


Features of a Database System

• Data is treated uniformly and separately from the application
• Efficient data access
• Queries and views are expressed over the schema
• Integrity constraints (checking and enforcement)
• Transactions combine sets of operations into logical units (all-or-nothing)
• Synchronization of concurrent user transactions
• Recovery (after system crash)
  – not to be confused with backup
  – instead: guarantee consistency by “roll-back” of partially executed transactions (how? Hint: logging)
• …


DB features: e.g., Concurrency Control

• Concurrent execution of simultaneous requests
  – long before web servers were around...
  – transaction management guarantees consistency despite concurrent/interleaved execution
• Transaction (= sequence of read/write operations)
  – Atomicity: a transaction is executed completely or not at all
  – Consistency: a transaction creates a new consistent DB state, i.e., one in which all integrity constraints are maintained
  – Isolation: to the user, a transaction seems to run in isolation
  – Durability: the effect of a successful (“committed”) transaction remains even after system failure
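Atomicity and consistency can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and a made-up `account` table (neither appears in the slides); the `CHECK` clause stands in for an integrity constraint, and the `transfer` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK clause plays the role of an integrity constraint.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok_small = transfer(conn, "a", "b", 30)   # both updates are committed
ok_large = transfer(conn, "a", "b", 500)  # the debit would violate the CHECK
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok_small, ok_large, balances)
```

In the second call the credit to `b` has already been applied when the debit fails, yet the rollback undoes it, so the balances stay at 70 and 30: the transaction is executed completely or not at all.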


Levels of Abstraction: Architecture Overview

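[Figure: three levels of abstraction. At the top, users work with views (View 1, View 2, …, View n; export schemas). Below the views is the logical (“conceptual”) level (tables), and at the bottom the physical level (index structures, DB instances). Logical data independence separates the views from the logical level; physical data independence separates the logical level from the physical level. A side note marks conceptual design (ER model (Entity-Relationship), OO models (classes, …)) as part of DB design that is often lost in the process.]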


Database Design: Entity-Relationship (ER) Model

• Entities: Employee, Department
• Relationships: works-for
• Attributes: Name, Salary, Manager, Name, since
• ER Model:
  – initial, high-level DB design (conceptual model)
  – easy to map to a relational schema (database tables)
  – comes with more constraints (cardinalities, aggregation) and extensions: EER (is-a => class hierarchies)
  – related: UML (Unified Modeling Language) class diagrams


The Relational Model

• Relation/Table Name:
  – employee, dept
• Attributes = Column Names:
  – Emp, Salary, DeptNo, Name, Mgr
• Relational Schema:
  – employee(Emp:string, Salary:integer, DeptNo:integer), ...
• Tuple = Row of the table:
  – (“tom”, 60000, 1)
• Relation = Set of tuples:
  – {(...), (...), ...}

Employee

    Emp    Salary  DNo
    tom    60k     1
    tim    57k     1
    sally  45k     3
    carol  30k     1
    carol  35k     2
    …

Department

    DNo  Name   Mgr
    1    Toys   carol
    2    Comp.  carol
    3    Shoes  sam

FK: foreign key, pointing to another key (here Employee.DNo points to Department.DNo)


Creating a Relational Database in SQL

    CREATE TABLE department (
        deptNo  INTEGER,
        name    VARCHAR(20),
        manager CHAR(11),
        PRIMARY KEY (deptNo)
    );

    CREATE TABLE employee (
        ssn    CHAR(11),
        name   VARCHAR(30),
        deptNo INTEGER,
        PRIMARY KEY (ssn),
        FOREIGN KEY (deptNo) REFERENCES department (deptNo)
    );

    -- The two tables reference each other, so the second foreign key
    -- is added once both tables exist.
    ALTER TABLE department
        ADD FOREIGN KEY (manager) REFERENCES employee (ssn);
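To see the foreign keys at work, here is a minimal sketch using Python's built-in sqlite3 module (an assumption; the slides do not name a DBMS). SQLite accepts a reference to a table that is created later, so both foreign keys are declared inline, and it only enforces them after `PRAGMA foreign_keys = ON`. The SSN values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
CREATE TABLE department (
    deptNo  INTEGER PRIMARY KEY,
    name    VARCHAR(20),
    manager CHAR(11) REFERENCES employee (ssn)
);
CREATE TABLE employee (
    ssn    CHAR(11) PRIMARY KEY,
    name   VARCHAR(30),
    deptNo INTEGER REFERENCES department (deptNo)
);
""")

# The circular reference is filled in stepwise: department without a manager,
# then the employee, then the manager column.
conn.execute("INSERT INTO department VALUES (1, 'Toys', NULL)")
conn.execute("INSERT INTO employee VALUES ('111-11-1111', 'carol', 1)")
conn.execute("UPDATE department SET manager = '111-11-1111' WHERE deptNo = 1")

# Department 99 does not exist, so this row breaks the foreign key.
try:
    conn.execute("INSERT INTO employee VALUES ('222-22-2222', 'tom', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(fk_rejected, employee_count)
```

The rejected insert leaves the table unchanged, which is the "checking and enforcement" of integrity constraints listed among the database features.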


What is a Query?

• Intuitively:
  – An “executable question” in terms of a database schema
  – Evaluating a query Q against a database instance D yields a set of answer objects:
    • Relational tuples or XML elements
• Example:
  – Who are the employees in the ‘Toys’ dept.?
  – Who is (are) the manager(s) of ‘Tom’?
  – Show all pairs (Employee, Mgr)
• Technically:
  – A mapping from an input schema (the given table schemas) to a result schema (the new columns you are interested in) defined in some query language
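The three example questions can be phrased as SQL and run against the employee and department tables shown earlier. This is a sketch assuming Python's built-in sqlite3 module; the `ask` helper is hypothetical, and salaries are written out in full (60k as 60000).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (Emp TEXT, Salary INTEGER, DeptNo INTEGER);
CREATE TABLE dept (DeptNo INTEGER PRIMARY KEY, Name TEXT, Mgr TEXT);
INSERT INTO employee VALUES
    ('tom', 60000, 1), ('tim', 57000, 1), ('sally', 45000, 3),
    ('carol', 30000, 1), ('carol', 35000, 2);
INSERT INTO dept VALUES
    (1, 'Toys', 'carol'), (2, 'Comp.', 'carol'), (3, 'Shoes', 'sam');
""")


def ask(sql, *params):
    """Evaluate query `sql` against the database instance; return all answers."""
    return conn.execute(sql, params).fetchall()


# Who are the employees in the 'Toys' dept.?
toys_staff = ask(
    "SELECT e.Emp FROM employee e JOIN dept d ON e.DeptNo = d.DeptNo "
    "WHERE d.Name = ? ORDER BY e.Emp",
    "Toys",
)

# Who is (are) the manager(s) of 'tom'?
managers_of_tom = ask(
    "SELECT DISTINCT d.Mgr FROM employee e JOIN dept d ON e.DeptNo = d.DeptNo "
    "WHERE e.Emp = ?",
    "tom",
)

# Show all pairs (Employee, Mgr).
pairs = ask(
    "SELECT DISTINCT e.Emp, d.Mgr FROM employee e "
    "JOIN dept d ON e.DeptNo = d.DeptNo ORDER BY e.Emp, d.Mgr"
)

print(toys_staff, managers_of_tom, pairs)
```

Each query maps the input schema (employee, dept) to a result schema: one column `Emp`, one column `Mgr`, and the pair `(Emp, Mgr)` respectively.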


Why (Declarative) Query Languages?

• Things we talk and think about in PLs and QLs
  – Assembly languages:
    • registers, memory locations, jumps, ...
  – C and the like:
    • if-then-else, for, while, memory (de-)allocation, pointers, ...
  – Object-oriented languages:
    • C++: C plus objects, methods, classes, ...
    • Java: objects, methods, classes, references, ...
    • Smalltalk: objects, objects, objects, ...
    • OQL: object-query language


Why (Declarative) Query Languages?

• Things we talk and think about in PLs and QLs
  – Functional languages (Haskell, ML):
    • (higher-order) functions, fold(l|r), recursion, patterns, ...
  => Relational languages (SQL, Datalog)
    • relations (tables), tuples (rows); conceptual level: ER
    • relational operations: selection, projection, Cartesian product, union, intersection, difference, ..., join (|X|)
  => Semistructured/XML (Tree) & Graph Query Languages
    • trees, graphs, nodes, edges, children nodes, siblings, …
    • XPath, XQuery, …
• Also: Focus on what, and not how!


Example: Querying a Relational Database

Input tables:

Employee

    Emp   Salary  DeptNo
    anne  62k     2
    john  60k     1

Department

    DeptNo  Mgr
    1       anne
    2       anne

SQL query (or view def.); we don’t say how to evaluate this expression:

    SELECT e.Emp, d.Mgr
    FROM Employee e, Department d
    WHERE e.DeptNo = d.DeptNo

Answer (or view), the result of the join:

    Emp   Mgr
    john  anne
    anne  anne
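The same query can be run once as an ad hoc query and once as a stored view. The sketch below assumes Python's built-in sqlite3 module; the view name `EmpMgr` and the extra employee `raj` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Emp TEXT, Salary INTEGER, DeptNo INTEGER);
CREATE TABLE Department (DeptNo INTEGER, Mgr TEXT);
INSERT INTO Employee VALUES ('anne', 62000, 2), ('john', 60000, 1);
INSERT INTO Department VALUES (1, 'anne'), (2, 'anne');

-- The slide's query stored as a view (the name EmpMgr is hypothetical).
CREATE VIEW EmpMgr AS
    SELECT e.Emp, d.Mgr
    FROM Employee e, Department d
    WHERE e.DeptNo = d.DeptNo;
""")

# Ad hoc query: we state what we want; the DBMS decides how to compute the join.
query_rows = sorted(conn.execute(
    "SELECT e.Emp, d.Mgr FROM Employee e, Department d "
    "WHERE e.DeptNo = d.DeptNo"
))

# The view gives the same answer under a name that can be queried like a table.
view_rows = sorted(conn.execute("SELECT Emp, Mgr FROM EmpMgr"))

# A view is evaluated against the current instance, so new rows show up in it.
conn.execute("INSERT INTO Employee VALUES ('raj', 50000, 1)")
view_after_insert = sorted(conn.execute("SELECT Emp, Mgr FROM EmpMgr"))

print(query_rows, view_rows, view_after_insert)
```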


Example Query: SQL vs DATALOG

• “List all employees and their managers”

• In SQL:

    SELECT e.name, d.manager
    FROM Employee e, Department d
    WHERE e.deptNo = d.deptNo

• In DATALOG:

    q(E, M) :- employee(E, S, D), department(D, N, M).

• Both forms express a “join” operation: the WHERE clause in SQL, the shared variable D in Datalog
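One way to read the Datalog rule is as a set comprehension over positional facts. The sketch below is plain Python, not a Datalog engine; it uses the employee(Emp, Salary, DeptNo) and department(DeptNo, Name, Mgr) data from the relational model slide, with salaries written out in full.

```python
# Facts, written positionally as in the rule:
#   employee(E, S, D)      department(D, N, M)
employee = {
    ("tom", 60000, 1), ("tim", 57000, 1), ("sally", 45000, 3),
    ("carol", 30000, 1), ("carol", 35000, 2),
}
department = {(1, "Toys", "carol"), (2, "Comp.", "carol"), (3, "Shoes", "sam")}

# q(E, M) :- employee(E, S, D), department(D, N, M).
# The variable D occurs in both body atoms, so the two facts must agree on it;
# that agreement is the join condition.
q = {
    (e, m)
    for (e, _s, d) in employee
    for (d2, _n, m) in department
    if d == d2
}

print(sorted(q))
```

Because `q` is a set, the pair `("carol", "carol")` appears once even though carol matches in two departments. The SQL version returns it twice unless `DISTINCT` is added.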


Important Relational Operations

• select(R, Condition)
  – filter rows of a table with respect to a condition
• project(R, Attr)
  – remove unwanted columns; keep rest
• join(R1, A1, R2, A2, Condition)
  – find “matches” in a “related” table
  – e.g. match R1.foreign key = R2.primary key
• cartesian product(R1, R2)
• union (“OR”), intersection (“AND”)
• set-difference (“NOT IN”)
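Each of these operations has a direct SQL counterpart. The following sketch assumes Python's built-in sqlite3 module and the employee and department data from the relational model slide; the `column` helper and the salary threshold are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (Emp TEXT, Salary INTEGER, DeptNo INTEGER);
CREATE TABLE dept (DeptNo INTEGER PRIMARY KEY, Name TEXT, Mgr TEXT);
INSERT INTO employee VALUES
    ('tom', 60000, 1), ('tim', 57000, 1), ('sally', 45000, 3),
    ('carol', 30000, 1), ('carol', 35000, 2);
INSERT INTO dept VALUES
    (1, 'Toys', 'carol'), (2, 'Comp.', 'carol'), (3, 'Shoes', 'sam');
""")


def column(sql):
    """Run `sql` and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# select: filter rows with a condition (WHERE)
selected = column("SELECT Emp FROM employee WHERE Salary > 50000")

# project: keep only the wanted columns (DISTINCT restores set semantics)
projected = column("SELECT DISTINCT Emp FROM employee")

# cartesian product: every employee row paired with every dept row
product_size = conn.execute("SELECT COUNT(*) FROM employee, dept").fetchone()[0]

# join: product restricted to rows where foreign key = primary key
joined = sorted(conn.execute(
    "SELECT e.Emp, d.Name FROM employee e JOIN dept d ON e.DeptNo = d.DeptNo"
))

# union, intersection and set-difference on employee names vs. manager names
either = column("SELECT Emp FROM employee UNION SELECT Mgr FROM dept")
both = column("SELECT Emp FROM employee INTERSECT SELECT Mgr FROM dept")
mgr_only = column("SELECT Mgr FROM dept EXCEPT SELECT Emp FROM employee")

print(selected, projected, product_size, joined, either, both, mgr_only)
```

The join returns 5 of the 15 rows of the Cartesian product, namely those where the two `DeptNo` values agree.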


Queries, Views, Integrity Constraints

• … can all be seen as “special queries”

• Query q(…) :- … ad-hoc queries

• View v(…) :- … exported views

• Integrity Constraints
  – ic (…) :- …. MgrSal < EmpSal …
  – say what shouldn’t happen
  – if it does: alert the user (or refuse an update, …)
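An integrity constraint of this kind can be written as a query that returns the violations: an empty answer means the constraint holds. The sketch below assumes Python's built-in sqlite3 module and the anne/john tables from the earlier querying example; the `violations` and `update_salary` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Emp TEXT PRIMARY KEY, Salary INTEGER, DeptNo INTEGER);
CREATE TABLE Department (DeptNo INTEGER PRIMARY KEY, Mgr TEXT);
INSERT INTO Employee VALUES ('anne', 62000, 2), ('john', 60000, 1);
INSERT INTO Department VALUES (1, 'anne'), (2, 'anne');
""")

# ic(Emp, Mgr) :- ... MgrSal < EmpSal ...
# The query lists what should not happen: an employee earning more than
# the manager of his or her department.
IC_QUERY = """
SELECT e.Emp, d.Mgr
FROM Employee e
JOIN Department d ON e.DeptNo = d.DeptNo
JOIN Employee m   ON m.Emp = d.Mgr
WHERE m.Salary < e.Salary
"""


def violations():
    return conn.execute(IC_QUERY).fetchall()


def update_salary(emp, salary):
    """Apply the update, but refuse it (roll back) if the constraint breaks."""
    conn.execute("UPDATE Employee SET Salary = ? WHERE Emp = ?", (salary, emp))
    if violations():
        conn.rollback()
        return False
    conn.commit()
    return True


initially = violations()                 # no violations in the initial data
accepted = update_salary("john", 61000)  # still below anne's 62000
refused = update_salary("john", 70000)   # would exceed the manager's salary
john_salary = conn.execute(
    "SELECT Salary FROM Employee WHERE Emp = 'john'"
).fetchone()[0]

print(initially, accepted, refused, john_salary)
```

Checking after the update and rolling back is only one possible policy; alerting the user instead would mean returning the violating rows rather than undoing the change.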


Query Evaluation vs Reasoning

• Query evaluation
  – Given a database instance D and a query Q, run Q(D)
  – What databases do all the time
• Reasoning (aka “Semantic Query Optimization”)
  – Given a query Q and a constraint C, “optimize” Q&C (e.g., given C, Q might be unsatisfiable)

– Given Q1 and Q2 decide whether Q1 Q2

– Given Q1,Q2, C decide whether Q1 Q2 | C

– Note: we are NOT given a database instance D here; just the schema and the query/IC expressions


Summary: QLs for Relational Databases

Natural join: where two tables share an attribute name, add the condition that the values of that attribute must match
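A short sketch of that rule, assuming Python's built-in sqlite3 module (SQLite supports `NATURAL JOIN`) and the anne/john tables from the querying example: the natural join and the join with an explicit condition on the shared column `DeptNo` give the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Emp TEXT, Salary INTEGER, DeptNo INTEGER);
CREATE TABLE Department (DeptNo INTEGER, Mgr TEXT);
INSERT INTO Employee VALUES ('anne', 62000, 2), ('john', 60000, 1);
INSERT INTO Department VALUES (1, 'anne'), (2, 'anne');
""")

# Natural join: DeptNo is the only shared attribute name, so it must match.
natural = sorted(conn.execute(
    "SELECT Emp, Mgr FROM Employee NATURAL JOIN Department"
))

# The same join with the matching condition spelled out.
explicit = sorted(conn.execute(
    "SELECT e.Emp, d.Mgr FROM Employee e "
    "JOIN Department d ON e.DeptNo = d.DeptNo"
))

# In the natural join's result the shared column appears only once.
cursor = conn.execute("SELECT * FROM Employee NATURAL JOIN Department")
result_columns = [col[0] for col in cursor.description]

print(natural, explicit, result_columns)
```

The implicit condition covers every shared name. With the earlier CREATE TABLE schema, where both tables have a `name` column, a natural join would also require employee and department names to be equal, which is rarely what is wanted.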


Relational Algebra

[Five slides titled “Relational Algebra”: their content was graphical and is not preserved in this transcript.]