1 the relational data model and sql 10:30am—12noon monday, july 18 th, 2005 csig05 chaitan baru
Post on 29-Dec-2015
215 Views
Preview:
TRANSCRIPT
1
The Relational Data Model and SQL
10:30AM—12noonMonday, July 18th, 2005
CSIG05
Chaitan Baru
2
OUTLINE
• Foundations of the relational data model– Data models and database systems– Relations, attributes, keys
• Introduction to SQL
• “Hands-on” with SQL– Vishwanath Nandigam
3
Historical note: IBM’s role in database systems
• IBM’s early database systems were based on the hierarchical data model – IMS (Information Management System). – IMS serves more than 95 percent of Fortune 1000 companies
– Manages 15 petabytes of production data
– Supports more than two hundred million users per day
Student
Course
Instructor
S1
C1 C2
I1 I2
S2
C2 C3
I2 I3
C4
I1
4
IBM and Relational Database Systems
• Relational model was introduced to provide separation of application logic from the data representations
• DB2 IBM’s second database product!• Foundations of relational model were invented at IBM by
E.F.Codd, IBM– A Relational Model of Data for Large Shared Data Banks, E. F.
Codd, June 1970• First prototype, System R, was developed at IBM Almaden
in mid-70’s – Introduced SQL– Provided as SQL/DS on IBM mainframe systems– Oracle was based on early System R work
5
Why the Relational Model?• Deals with the application logic / data separation in
business data processing, unlike the earlier “network” and “hierarchical” data model
• Plus, an algebra for manipulating relations the key innovation
Student Course Instructortakes teachesN M N 1
S# student info C# course info
S# C#
6
What is a Database System?• Database (system) =
– Database Instance (e.g. set of tables of rows)
– Database Management System (DBMS)
• Origins in the commercial world: – to organize, query, and manipulate data more effectively,
efficiently, and independently
• Scientific databases– often special features:
• spatial, temporal, spatiotemporal, GIS, units, uncertainty, raw & derived data, …
7
Why not just use files as “databases”?
• Works for some applications…• But:
– scanning & ‘grep’ing large files can be very inefficient– no language support for selecting desired data, joining them, etc.
• cannot express the kinds of questions/queries you’d like to ask• ‘grep’ is no substitute for a query language
– redundant and/or inconsistent storage of data– no transaction management and concurrency control among multiple users– no security– no recovery– no data independence (application data)– no data modeling support– …
8
Features of a Database System
• A data model (relational, object-oriented, XML) prescribes how data can be organized:– as relations (tables) of tuples (rows)– as classes of (linked) objects– as XML trees
• A (database) schema (stored in the “data dictionary”) defines the structure of a specific database instance:– Relational schema– OO schema– XML Schema (or XML DTD)
• A query language– Allows ad hoc, declarative (non-procedural) queries on schema
9
Features of a Database System• Data is treated uniformly and separately from the application• Efficient data access • Queries and views are expressed over the schema• Integrity constraints (checking and enforcement)• Transactions combine sets of operations into logical units
(all-or-nothing)• Synchronization of concurrent user transactions• Recovery (after system crash)
– not to be confused w/ backup– instead: guarantee consistency by “roll-back” of partially executed
transactions (how? Hint: logging)
• …
10
DB features:E.g. Concurrency Control
• Concurrent execution of simultaneous requests – long before web servers where around... – transaction management guarantees consistency despite
concurrent/interleaved execution
• Transaction (= sequence of read/write operations)– Atomicity: a transaction is executed completely or not at all– Consistency: a transaction creates a new consistent DB state, i.e.,
in which all integrity constraints are maintained – Isolation: to the user, a transaction seems to run in isolation– Durability: the effect of a successful (“committed”) transaction
remains even after system failure
11
Levels of Abstraction: Architecture Overview
DBinstances
Physical level
Logical (“conceptual”) level
View 1 View 2 View nConceptual…
Level
Index structures
Tables
Export schemas
ER-Model(Entity-Relationship) OO Models (Classes…)
part of DB design conceptual design
… often lost in the process…
logical dataindependence
physical data independence
User
12
Database Design: Entity-Relationship (ER) Model
• Entities:• Relationships:• Attributes:• ER Model:
– initial, high-level DB design (conceptual model)– easy to map to a relational schema (database tables)– comes with more constraints (cardinalities, aggregation) and
extensions: EER (is-a => class hierarchies)– related: UML (Unified Modeling Language) class diagrams
Employee Departmentworks-for
Name Salary ManagerName
since
13
The Relational Model • Relation/Table Name:
– employee, dept
• Attributes = Column Names:– Emp, Salary, DeptNo, Name, Mgr
• Relational Schema:– employee(Emp:string,
Salary:integer, DeptNo:integer), ...
• Tuple = Row of the table:– (“tom”, “60000”, “1”)
• Relation = Set of tuples:– {(...), (...), ...}
Emp Salary DNotom 60k 1tim 57k 1sally 45k 3carol 30k 1carol 35k 2….
Employee
DNo Name Mgr 1 Toys carol2 Comp. carol3 Shoes sam
Department
FK: foreign key, pointing to another key
14
Creating a Relational Database in SQLCREATE TABLE employee (
ssn CHAR(11),name VARCHAR(30),deptNo INTEGER,PRIMARY KEY (ssn),
FOREIGN KEY (deptNo) REFERENCES department )
CREATE TABLE department (
deptNo INTEGER,
name VARCHAR(20),
manager CHAR(11),
PRIMARY KEY (deptNo),
FOREIGN KEY (manager) REFERENCES employee(ssn) )
15
What is a Query?• Intuitively:
– An “executable question” in terms of a database schema
– Evaluating a query Q against a database instance D yields a set of answer objects:
• Relational tuples or XML elements
• Example:– Who are the employees in the ‘Toys’ dept.?
– Who is (are) the manager(s) of ‘Tom’?
– Show all pairs (Employee, Mgr)
• Technically: – A mapping from an input schema (the given table schemas) to a
result schema (the new columns you are interested in) defined in some query language
16
Why (Declarative) Query Languages?
• Things we talk and think about in PLs and QLs – Assembly languages:
• registers, memory locations, jumps, ...
– C and the likes: • if-then-else, for, while, memory (de-)allocation, pointers, ...
– Object-oriented languages:• C++: C plus objects, methods, classes, ...
• Java: objects, methods, classes, references, ...
• Smalltalk: objects, objects, objects, ...
• OQL: object-query language
17
Why (Declarative) Query Languages?
• Things we talk and think about in PLs and QLs – Functional languages (Haskell, ML):
• (higher-order) functions, fold(l|r), recursion, patterns, ...
=> Relational languages (SQL, Datalog)• relations (tables), tuples (rows); conceptual level: ER• relational operations: , , , , ..., ,,,,,..., , , |X|
=> Semistructured/XML (Tree) & Graph Query Languages
• trees, graphs, nodes, edges, children nodes, siblings, … • XPath, XQuery, …
• Also: Focus on what, and not how!
18
Example: Querying a Relational Database
Emp SalaryDeptNoanne 62k 2john 60k 1
Employee
DeptNoMgr
1 anne2 anne
Department
SELECT e.Emp, d.MgrFROM Employee e, Department dWHERE e.DeptNo = d.DeptNo
Emp Mgrjohn anneanne anne
result
join
input tables
SQL query (or view def.)
answer (or view)
we don’t say how to evaluate this expression
19
Example Query: SQL vs DATALOG
• “List all employees and their managers”
• In SQL:SELECT e.name, d.manager
FROM Employee e, Department d
WHERE e.deptNo = d.deptNo
• In DATALOG:q(E, M) :- employee(E, S, D), department(D, N, M).
a “join” operation
20
Important Relational Operations• select(R, Condition)
– filter rows of a table wrt. a condition
• project(R, Attr) – remove unwanted columns; keep rest
• join(R1, A2, R2, A2, Condition) – find “matches” in a “related” table
– e.g. match R1.foreign key = R2.primary key
• cartesian product(R1, R2)• union (“OR”), intersection (“AND”)• set-difference (“NOT IN”)
21
Queries, Views, Integrity Constraints
• … can all be seen as “special queries”
• Query q(…) :- … ad-hoc queries
• View v(…) :- … exported views;
• Integrity Constraints– ic (…) :- …. MgrSal < EmpSal … – say what shouldn’t happen– if it does: alert the user (or refuse an update, …)
22
Query Evaluation vs Reasoning• Query evaluation
– Given a database instance D and a query Q, run Q(D)
– What databases do all the time
• Reasoning (aka “Semantic Query Optimization”)– Given a query Q and a constraint C, “optimize” Q&C (e.g., given
C, Q might be unsatisfiable)
– Given Q1 and Q2 decide whether Q1 Q2
– Given Q1,Q2, C decide whether Q1 Q2 | C
– Note: we are NOT given a database instance D here; just the schema and the query/IC expressions
23
Summary QLs for Relational Databases
Natural Join: same attribute name add condition that values must match
24
Relational Algebra
25
Relational Algebra
26
Relational Algebra
27
Relational Algebra
28
Relational Algebra
top related