lecture 2


Upload: tess98

Post on 17-Jun-2015


Page 1: lecture 2

1

Lecture Outline

• File system vs. db system

• Overview of database systems

• Preliminary on SQL

Page 2: lecture 2

2

What was it like before a DBMS was available?

Data was stored in files:

customers.txt transactions.txt manufacture.txt

Then C++ programs were written to implement specific tasks

Page 3: lecture 2

3

Doing it without a DBMS...

• Record the fact that "Bill" bought "a Ford Explorer" (let us further assume Bill is a new customer, and this is the first time he has bought a car from the agency).

Write a C++ program to do the following:

Read 'customers.txt'
Read 'transactions.txt'
Find and update, or add, Bill's record in 'customers.txt'
Update 'transactions.txt'
Write 'customers.txt'
Write 'transactions.txt'
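The read/update/write steps on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: Python is used instead of C++ for brevity, the one-record-per-line layout and the `customer|item` transaction format are invented here, and the helper names (`record_purchase`, `read_lines`, `write_lines`) are hypothetical.

```python
import os
import tempfile


def read_lines(path):
    """Return the file's lines, or an empty list if the file does not exist yet."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def write_lines(path, lines):
    """Overwrite the file with the given lines."""
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in lines)


def record_purchase(data_dir, customer, item):
    """Hand-coded version of the slide's steps, using plain text files."""
    customers_path = os.path.join(data_dir, "customers.txt")
    transactions_path = os.path.join(data_dir, "transactions.txt")

    # Read both files completely.
    customers = read_lines(customers_path)
    transactions = read_lines(transactions_path)

    # Find the customer, or add a new record.
    if customer not in customers:
        customers.append(customer)

    # Update the transaction list (assumed format: "customer|item").
    transactions.append(f"{customer}|{item}")

    # Write both files back.
    write_lines(customers_path, customers)
    write_lines(transactions_path, transactions)


data_dir = tempfile.mkdtemp()
record_purchase(data_dir, "Bill", "Ford Explorer")
customers_after = read_lines(os.path.join(data_dir, "customers.txt"))
transactions_after = read_lines(os.path.join(data_dir, "transactions.txt"))
```

Even this small task needs its own file format, its own search logic, and its own write-back code.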

Page 4: lecture 2

4

Problems without a DBMS...

• Assume you use the C++ language.

• If the data structure changes, you need to change the C++ code.

• If you want to record other information, you need to define a new data structure and write another set of C++ programs to manipulate the modified data structure. You also need to recompile the source code.

Page 5: lecture 2

5

Problems without a DBMS...

• You have to take everything into consideration in order to make your program robust and bulletproof.

Read 'customers.txt'
Read 'transactions.txt'
Find and update, or add, the record "Bill"
Find and update, or add, the record "Bill bought a Ford Explorer"
Write 'customers.txt'
Write 'transactions.txt'

CRASH !
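To see why a crash is a problem for the file-based approach, here is a small simulation. It is a sketch only: the crash is faked with an exception, the point at which it strikes (between the two writes) is an assumption made for illustration, and `record_purchase_unsafely` and `SimulatedCrash` are hypothetical names.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for a power failure or a killed process."""


def record_purchase_unsafely(data_dir, customer, item, crash_midway):
    customers_path = os.path.join(data_dir, "customers.txt")
    transactions_path = os.path.join(data_dir, "transactions.txt")

    # Step 1: the customer file is written first.
    with open(customers_path, "a", encoding="utf-8") as f:
        f.write(customer + "\n")

    if crash_midway:
        raise SimulatedCrash("crashed between the two writes")

    # Step 2: never reached when the crash happens.
    with open(transactions_path, "a", encoding="utf-8") as f:
        f.write(f"{customer}|{item}\n")


data_dir = tempfile.mkdtemp()
try:
    record_purchase_unsafely(data_dir, "Bill", "Ford Explorer", crash_midway=True)
except SimulatedCrash:
    pass

# The two files now disagree: Bill is a customer, but his purchase is lost.
customer_saved = os.path.exists(os.path.join(data_dir, "customers.txt"))
transaction_saved = os.path.exists(os.path.join(data_dir, "transactions.txt"))
```

Detecting and repairing this kind of half-finished update is work the programmer would have to do by hand.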

Page 6: lecture 2

6

Problems without a DBMS

• Think about how many kinds of data manipulation are needed:
– Sort
– Summation
– Count
– Search
– Modification
– Etc. (the list could be very long)

• Each of them could be needed for each table, or even for questions involving multiple tables.

• Each of them needs to be hard-coded in C++.
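As a rough illustration of "hard-coded", the sketch below writes one function per kind of manipulation for a single table. Python stands in for C++ here, and the record layout, the sample rows, and the function names are all invented for the example.

```python
# Hypothetical transaction records: (customer, item, price).
transactions = [
    ("Bill", "Ford Explorer", 30000),
    ("Ann", "Honda Civic", 20000),
    ("Bill", "Ford Focus", 18000),
]


def count_for(customer):
    """Count: how many purchases did this customer make?"""
    return sum(1 for c, _, _ in transactions if c == customer)


def total_for(customer):
    """Summation: how much did this customer spend?"""
    return sum(price for c, _, price in transactions if c == customer)


def sorted_by_price():
    """Sort: cheapest purchase first."""
    return sorted(transactions, key=lambda t: t[2])


def search_item(keyword):
    """Search: purchases whose item name contains the keyword."""
    return [t for t in transactions if keyword in t[1]]
```

Every new table, and every new question, would need another function like these.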

Page 7: lecture 2

7

Many other problems without a DBMS...

• Crash management
• Concurrency control
• Transaction management
• Internal low-level coding
• Exception handling
• User control
• Disk space management
• Backup
• The list goes on …

Page 8: lecture 2

8

History of DBMS and RDBMS

• Read the introduction chapter of the textbook and of any other RDBMS books.

Page 9: lecture 2

9

Database and DBMS

• What is a database?
– A set of organized files, or a collection of files, that store the data.
– A database is a collection of data that is saved and organized in files so that the data can easily be accessed, managed, and updated.
– Data organized in some predefined way, usually saved in files.

• What is a database management system (stripped down)?
– Database Management System = DBMS
– A big piece of software (written by system software programmers in C/C++) that accesses and updates those files for you, designed to make data storage and manipulation tasks easier.

• By storing data in a DBMS, rather than as a collection of operating system files, we can use the DBMS's features to manage the data in a robust and efficient manner.

Page 10: lecture 2

10

Advantages of a DBMS

• Data independence

• Efficient data access

• Powerful query language

• Data integrity and security

• Data administration

• Concurrent access and crash recovery

• Reduced application development time

Page 11: lecture 2

11

Data independence

• Application programs should be as independent as possible from details of data representation and storage. The DBMS can provide an abstract view of the data to insulate application code from such details.
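A small sketch of what this insulation can look like in practice. It uses Python's built-in sqlite3 module rather than the course's DBMS, and the `Customers` table, its columns, and the sample rows are hypothetical. The application function names only the column it needs, so it keeps working after the stored structure gains a new column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (name TEXT PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO Customers VALUES ('Bill', 'Springfield')")


def customer_names(connection):
    """Application code: asks for names only, not for a storage layout."""
    rows = connection.execute("SELECT name FROM Customers ORDER BY name")
    return [row[0] for row in rows]


names_before = customer_names(conn)

# The stored structure changes: a new column is added.
conn.execute("ALTER TABLE Customers ADD COLUMN phone TEXT")
conn.execute("INSERT INTO Customers VALUES ('Ann', 'Shelbyville', '555-0100')")

# The same, unmodified application function still works.
names_after = customer_names(conn)
```

Compare this with the file-based approach, where a new field would mean changing and recompiling every program that parses the file.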

Page 12: lecture 2

12

Where are DBMSs used?

• DBMSs are pervasive, used almost everywhere in your daily life.

• You are actually living in a human world that is monitored and tracked by all kinds of DBMSs in the digital world.

• In other words, you and your behavior are a bunch of data in this digital era.

Page 13: lecture 2

13

Classification of DBMSs

data model: relational / object-oriented / hierarchical / network / object-relational

users: single-user / multi-user

location: distributed / centralized

cooperation: homogeneous / heterogeneous

OLTP: on-line transaction processing

• Used to run the day-to-day operations of a business

• Event-oriented: take an order, make a reservation, pay for goods, withdraw cash, ...

Page 14: lecture 2

14

RDBMS

• We are interested in relational database systems.

Relational DBMS = RDBMS

• In a relational database, data files are structured as relations (tables).

Page 15: lecture 2

15

Database Systems

• The giant commercial database vendors:
– Oracle
– IBM (with DB2)
– Microsoft (SQL Server)
– Sybase

• Some free database systems (Unix):
– Postgres
– MySQL
– Predator

• In CSCI242 we use an Oracle database server.
– You can also choose MySQL, but with less support!
– If you are thinking about Access, try it later on your own.

Page 16: lecture 2

16

An Example of a Traditional Enterprise-Level Database Application

Suppose we are building a system to store the information about:

• customers

• products

• manufacturers

• who buys what, who produces what

Page 17: lecture 2

17

More examples of RDBMS

– Backend for traditional "database" applications
• Registrar system at SUCO

– Backend for large websites
• www.microsoft.com, msdn, search engines

– Backend for online shopping
• eBay

Page 18: lecture 2

18

Who interacts with an (R)DBMS?

– RDBMS providers implement them.
– Advanced administrators install, maintain, and tune them.
– A good developer should develop value-added software using them.
– End users use software developed by developers.

Page 19: lecture 2

19

Enter a DBMS

"Two-tier system" or "client-server"

[Figure: data files, a database server (centralized by a third party), and applications, linked by a connection (ODBC, JDBC, dbprovider). Labels: "You will work here"; developer, administrators, end users.]

Page 20: lecture 2

20

The language of RDBMS

SQL (Structured Query Language)
– Data Definition Language - DDL
– Data Manipulation Language - DML
• query language
– Data Control Language - DCL
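One statement from each sublanguage, as a hedged sketch. It runs on Python's built-in sqlite3 module, not on the course's DBMS, and the `Products` table and `some_user` account are hypothetical. SQLite has no user accounts, so the DCL statement is shown as a string only and is not executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE Products (name TEXT PRIMARY KEY, price INTEGER)")

# DML: change the contents ...
conn.execute("INSERT INTO Products VALUES ('Ford Explorer', 30000)")
conn.execute("UPDATE Products SET price = 29000 WHERE name = 'Ford Explorer'")

# ... and query them (the query-language part of DML).
price = conn.execute(
    "SELECT price FROM Products WHERE name = 'Ford Explorer'"
).fetchone()[0]

# DCL: controls who may do what. Not executed here because SQLite does not
# support it; on a multi-user server it would look roughly like this.
dcl_example = "GRANT SELECT ON Products TO some_user"
```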

Page 21: lecture 2

21

SQL

• SQL is a declarative language: you specify what you want (the goal) and let the language processor figure out what steps to take to accomplish it.

• In general, SQL statements tell the relational database engine the desired end state, but do not have to give the engine step-by-step instructions.
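The contrast can be made concrete with a small sketch that computes the same answer twice. The sample rows are invented, and Python's built-in sqlite3 module stands in for a full RDBMS.

```python
import sqlite3

rows = [("Charles", "undergrad"), ("Dan", "grad"), ("Eve", "grad")]

# Imperative: spell out how to walk the data and what to collect.
grads_imperative = []
for name, category in rows:
    if category == "grad":
        grads_imperative.append(name)
grads_imperative.sort()

# Declarative: state the desired result; the engine decides the steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, category TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?)", rows)
grads_declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM Students WHERE category = 'grad' ORDER BY name"
    )
]
```

Both lists contain the same names; only the first version says anything about loops or ordering steps.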

Page 22: lecture 2

22

Transparent to SQL programmers

• For example, in handling a SELECT statement, the database engine program may flush buffers, dirty write caches, read sectors from disk, follow linked lists, etc., none of which the programmer has to know. But with most SQL engines, the programmer can add proprietary hints to adjust the way the query is processed. By the way, deletes, updates and inserts are also generically referred to as queries.

Page 23: lecture 2

23

How the Programmer Sees the DBMS

• Start with DDL to create tables:

CREATE TABLE Students (
  Name CHAR(30) NOT NULL,
  SSN CHAR(9) PRIMARY KEY,
  Category CHAR(20)
);

• Continue with DML to populate tables:

INSERT INTO Students
VALUES ('Charles', '123456789', 'undergraduate');
. . . .
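To try the two statements from this slide without an Oracle installation, one option is Python's built-in sqlite3 module, as sketched below. SQLite accepts the same CREATE TABLE and INSERT text, although it treats the CHAR(n) lengths loosely. The second INSERT (for "Dan") is an added, hypothetical row used only to show the PRIMARY KEY constraint at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL from the slide.
conn.execute("""
    CREATE TABLE Students (
        Name CHAR(30) NOT NULL,
        SSN CHAR(9) PRIMARY KEY,
        Category CHAR(20)
    )
""")

# DML from the slide.
conn.execute("INSERT INTO Students VALUES ('Charles', '123456789', 'undergraduate')")

# The PRIMARY KEY constraint rejects a second student with the same SSN.
try:
    conn.execute("INSERT INTO Students VALUES ('Dan', '123456789', 'graduate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

students = conn.execute("SELECT Name, SSN, Category FROM Students").fetchall()
```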

Page 24: lecture 2

24

How the Programmer Sees the RDBMS

• Tables:

Students:
SSN          Name     Category
123-45-6789  Charles  undergrad
234-56-7890  Dan      grad
…            …

Takes:
SSN          CID
123-45-6789  CSE444
123-45-6789  CSE444
234-56-7890  CSE142
…

Courses:
CID     Name               Quarter
CSE444  Databases          fall
CSE541  Operating systems  winter

• Still implemented as files, but behind the scenes can be quite complex

"data independence" = separate logical view from physical implementation

Page 25: lecture 2

25

Queries

• Find all courses that "Mary" takes

SELECT C.name
FROM Students S, Takes T, Courses C
WHERE S.name = 'Mary' AND S.ssn = T.ssn AND T.cid = C.cid

• What happens behind the scenes?
– The query processor figures out how to answer the query efficiently.
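The query can be run end to end on a toy database, as in the sketch below. Assumptions: Python's built-in sqlite3 module stands in for the course's DBMS, the rows for "Mary" are invented (the slides' sample tables do not list her), and an ORDER BY is added only so the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT, quarter TEXT);
    CREATE TABLE Takes    (ssn TEXT, cid TEXT);

    INSERT INTO Students VALUES ('123-45-6789', 'Charles', 'undergrad');
    INSERT INTO Students VALUES ('345-67-8901', 'Mary', 'grad');   -- hypothetical row
    INSERT INTO Courses  VALUES ('CSE444', 'Databases', 'fall');
    INSERT INTO Courses  VALUES ('CSE541', 'Operating systems', 'winter');
    INSERT INTO Takes    VALUES ('123-45-6789', 'CSE444');
    INSERT INTO Takes    VALUES ('345-67-8901', 'CSE444');         -- hypothetical row
    INSERT INTO Takes    VALUES ('345-67-8901', 'CSE541');         -- hypothetical row
""")

# The slide's three-table join, with the student name passed as a parameter.
query = """
    SELECT C.name
    FROM Students S, Takes T, Courses C
    WHERE S.name = ? AND S.ssn = T.ssn AND T.cid = C.cid
    ORDER BY C.name
"""
marys_courses = [row[0] for row in conn.execute(query, ("Mary",))]
```

Nothing in the query says which table to read first or how to match rows; that choice is left to the query processor.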

Page 26: lecture 2

26

Queries, behind the scenes

Declarative SQL query:

SELECT C.name
FROM Students S, Takes T, Courses C
WHERE S.name = 'Mary' AND S.ssn = T.ssn AND T.cid = C.cid

Imperative query execution plan:

[Figure: operator tree over Students, Takes, and Courses, with join conditions sid=sid and cid=cid, a selection name="Mary", and a projection on sname.]

The optimizer chooses the best execution plan for a query
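Most engines can show the plan they picked. As a hedged illustration, SQLite (through Python's built-in sqlite3 module) offers `EXPLAIN QUERY PLAN`; Oracle and other systems have their own, different commands. The exact wording of the plan depends on the SQLite version, so the sketch prints it rather than assuming a particular text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Takes    (ssn TEXT, cid TEXT);
""")

# Ask the engine how it would execute the query, without running it.
plan_rows = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT C.name
    FROM Students S, Takes T, Courses C
    WHERE S.name = 'Mary' AND S.ssn = T.ssn AND T.cid = C.cid
""").fetchall()

# The last column of each row is a human-readable description of one plan step.
plan_steps = [row[-1] for row in plan_rows]
for step in plan_steps:
    print(step)
```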

Page 27: lecture 2

27

Transactions

• Enroll "Mary Johnson" in "CSCI242":

BEGIN TRANSACTION;

INSERT INTO Takes
SELECT Students.SSN, Courses.CID
FROM Students, Courses
WHERE Students.name = 'Mary Johnson' AND Courses.name = 'CSCI242'

-- More updates here....

IF everything-went-OK THEN COMMIT;
ELSE ROLLBACK

If the system crashes, the transaction is still either committed or aborted
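The commit-or-rollback pattern can be tried out with Python's built-in sqlite3 module, as sketched below. Assumptions: the table layouts, the course ID 'C242', and the `enroll` helper with its `fail_midway` switch are invented for the example, and a raised exception stands in for "something went wrong".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Takes    (ssn TEXT NOT NULL, cid TEXT NOT NULL);
    INSERT INTO Students VALUES ('345-67-8901', 'Mary Johnson');
    INSERT INTO Courses  VALUES ('C242', 'CSCI242');
""")


def enroll(connection, student_name, course_name, fail_midway=False):
    """Run the slide's INSERT inside a transaction; commit or roll back."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with connection:
            connection.execute(
                """INSERT INTO Takes
                   SELECT Students.ssn, Courses.cid
                   FROM Students, Courses
                   WHERE Students.name = ? AND Courses.name = ?""",
                (student_name, course_name),
            )
            if fail_midway:
                raise RuntimeError("simulated failure before COMMIT")
        return True
    except RuntimeError:
        return False


def enrollment_count(connection):
    return connection.execute("SELECT COUNT(*) FROM Takes").fetchone()[0]


failed = enroll(conn, "Mary Johnson", "CSCI242", fail_midway=True)
count_after_rollback = enrollment_count(conn)

succeeded = enroll(conn, "Mary Johnson", "CSCI242")
count_after_commit = enrollment_count(conn)
```

After the failed attempt the Takes table is unchanged; only the successful attempt leaves a row behind.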

Page 28: lecture 2

28

Transactions

• A transaction = sequence of statements that either all succeed, or all fail

• Transactions have the ACID properties:

A = atomicity

C = consistency

I = isolation

D = durability

Page 29: lecture 2

29

Databases, front-end applications, and Web applications

• Accessing databases through web interfaces
– Java programming interface (JDBC)
– Embedding into HTML pages (JSP, ASP)
– Access through the HTTP protocol (Web Services)
– ASP

Page 30: lecture 2

30

PL/SQL

Additional proprietary extensions, such as Oracle's PL/SQL and Microsoft's T-SQL, add procedural constructs such as explicit looping, flow control, local variables, and so forth to SQL. The idea is that some database maintenance, data cleaning, summary generation, and even business logic can execute inside the database for speed and efficiency. In addition, these extensions allow business logic (business rules) to be implemented at the server end.

Page 31: lecture 2

31

SQL*Plus

• SQL*Plus is a command line SQL and PL/SQL language interface and reporting tool that ships with the Oracle Database Client and Server.

• It can be used interactively or driven from scripts. SQL*Plus is frequently used by DBAs and Developers to interact with the Oracle database.

• If you are familiar with other databases, sqlplus is equivalent to "sql" in Ingres, "isql" in Sybase and SQL Server, "db2" in IBM DB2, "psql" in PostgreSQL, and "mysql" in MySQL.

Page 32: lecture 2

32

Oracle SQL*Loader

• Oracle SQL*Loader is a batch utility for loading data into a target Oracle table.

• The idea is rather simple.

• We won’t use it.

Page 33: lecture 2

33

PL/SQL Developer

• It is an Integrated Development Environment (IDE) for developing stored program units in an Oracle Database. Using PL/SQL Developer you can conveniently create the server-part of your client/server applications.

Page 34: lecture 2

34

Forms Developer

• Forms Developer is Oracle's productive Rapid Application Development (RAD) environment for building highly scalable, enterprise-class Internet database applications. It uses powerful declarative features so that business developers can instantly create fully functional applications from database definitions.

Page 35: lecture 2

35

SQR

• A fourth generation language for the creation of reports from databases. SQR is interpreted to dynamically generate SQL queries and format the results.

• Originally a Sybase product, it was then sold to MITI, who subsequently changed their name to SQRIBE.

• SQR Server supports native database access for all major DBMSs and the use of platform independent Java code.

Page 36: lecture 2

36

What is behind the scenes of an RDBMS?

• Functional components
– SQL language parser and processor
– Query engine
– Query optimizer
– Storage management
– Transaction management (concurrency, recovery)
– …