
Post on 05-Jan-2016


CS3431 – Database Systems I

Introduction

Instructor: Mohamed Eltabakh (meltabakh@cs.wpi.edu)

Today’s Lecture

Overview of Database Management Systems

Course Logistics

What is a Database System?

Software platform for managing large amounts of data

Managing means: storing, querying, indexing, and structuring the data

Different names refer to the same thing:
  Database systems
  Database management systems
  DBMS

What is a Database System? (Cont’d)

What’s inside a DBMS:
  Collection of interrelated data (e.g., for a given application)
  Set of programs to secure and access the data
  An environment that is both convenient and efficient to use

Usually the data is too large to fit in computer memory at once, so it is stored on disk

Usually many users want to access this data, and to do so fast

Databases touch all aspects of our lives. We use them without knowing it!


Database Applications

E-commerce: books, equipment etc. at Amazon

Banks -- your valuable $$ and ATM transactions

Airlines – manage flights to get you places

Universities – manage student enrollment

GIS (Maps) – find restaurants closest to WPI

Bio-informatics (genome data)

Have you ever used a database application?

Data is everywhere. To manage it efficiently, we need a DBMS

Why use a DBMS, and not files?

Several drawbacks of using file systems:

Data redundancy and inconsistency
  Multiple file formats, duplication of information in different files
  Multiple record formats within the same file
  No order enforced between fields

Difficulty in accessing data
  Need to write a new program to carry out each new task

Integrity problems
  E.g., account balance >= 0; a student cannot take the same course twice; etc.

Why use a DBMS, and not files? (Cont’d)

Concurrent access by multiple users
  Many users need to access/update the data at the same time

Security problems
  Hard to provide user access to some, but not all, data

Recovery from crashes
  The system may crash while the data is being updated

Maintenance problems
  Hard to search for or update a field
  Hard to add new fields


DBMS Provides Solutions

Data consistency even with multiple users

Efficient access to the data

Data integrity embedded in the DBMS

Recovery from crashes, security
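To make "data integrity embedded in the DBMS" concrete, here is a minimal sketch of the two integrity rules mentioned earlier (balance >= 0, no student taking the same course twice) declared once in the schema instead of in every program. It assumes Python's standard-library sqlite3 module as a stand-in DBMS (the course itself uses Oracle); the table and column names (Account, Enrollment, balance, studentID, courseID) and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical tables: the rules are declared once, in the schema.
conn.execute(
    "CREATE TABLE Account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE Enrollment (studentID INTEGER, courseID TEXT, "
    "PRIMARY KEY (studentID, courseID))"
)

def try_sql(statement, params=()):
    """Run one statement; return True if accepted, False if the DBMS rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(statement, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_deposit = try_sql("INSERT INTO Account VALUES (1, 100)")
bad_withdrawal = try_sql("UPDATE Account SET balance = balance - 500 WHERE id = 1")
first_enroll = try_sql("INSERT INTO Enrollment VALUES (?, ?)", (42, "CS3431"))
second_enroll = try_sql("INSERT INTO Enrollment VALUES (?, ?)", (42, "CS3431"))

balance = conn.execute("SELECT balance FROM Account WHERE id = 1").fetchone()[0]
print(ok_deposit, bad_withdrawal, first_enroll, second_enroll, balance)
```

The overdraft and the duplicate enrollment are rejected by the database itself, so no application can bypass the rules by forgetting to check them.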

Basic Terminology

Data Model
  Tools used for describing the data

Data Schema
  Describes structures for a particular application, using the given model

Database
  Collection of actual data that conforms to the given schema

Database Management System (DBMS)
  Software platform that allows us to create, store, use, and maintain a database

SQL & Data Manipulation Language (DML)
  Language to manipulate, e.g., update or query, the data
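The terms above can be lined up in a few lines of code. This is only a sketch, assuming Python's standard-library sqlite3 module as the DBMS; the Student table follows the query used later in the lecture, and the rows are made up.

```python
import sqlite3

# DBMS: here, the SQLite engine reached through Python's sqlite3 module.
conn = sqlite3.connect(":memory:")

# Data schema: the structure, described with the relational model.
conn.execute("CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT, address TEXT)")

# Database: the actual data that conforms to the schema (rows are made up).
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Alice", "320FL"), (2, "Bob", "105AK")],
)

# DML: update and query the data.
conn.execute("UPDATE Student SET address = ? WHERE ID = ?", ("320FL", 2))
rows = conn.execute(
    "SELECT ID, Name FROM Student WHERE address = ? ORDER BY ID", ("320FL",)
).fetchall()
print(rows)
```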

Data Model

A collection of tools for describing:
  Data objects
  Data relationships
  Data semantics
  Data constraints

Several data models:
  Relational model
  Entity-Relationship (ER) data model
  Object-based data models (object-oriented)
  Semi-structured data model (XML)
  Other, older models: network model, hierarchical model

We will learn the first two models: the relational model and the ER model

Example: ER Model

Graphical model for describing entities, attributes, and relationships

Data Schema

Captures the relationships between objects (“entities”) in an application

Schemas can be represented graphically or textually

Query Language (SQL)

Language for accessing and manipulating the data organized by the appropriate data model

SQL: Structured Query Language

SELECT ID, Name
FROM Student
WHERE address = '320FL';

Query Language

Two classes of languages:
  Procedural – user specifies what data is required and how to get those data
  Declarative (non-procedural) – user specifies what data is required without specifying how to get those data

DBMSs use SQL, which is declarative

SELECT ID, Name
FROM Student
WHERE address = '320FL';
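The difference between the two classes can be sketched side by side. The snippet assumes Python with its standard-library sqlite3 module; the function name procedural_lookup and the sample rows are hypothetical.

```python
import sqlite3

# Made-up rows: (ID, Name, address)
students = [(1, "Alice", "320FL"), (2, "Bob", "105AK"), (3, "Carol", "320FL")]

# Procedural: we say what we want AND how to get it (scan every row, test, collect).
def procedural_lookup(rows, wanted_address):
    result = []
    for student_id, name, address in rows:
        if address == wanted_address:
            result.append((student_id, name))
    return result

# Declarative: we only state what we want; the DBMS decides how to find it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER, Name TEXT, address TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", students)
declarative = conn.execute(
    "SELECT ID, Name FROM Student WHERE address = '320FL' ORDER BY ID"
).fetchall()

procedural = procedural_lookup(students, "320FL")
print(procedural == declarative)
```

Both approaches return the same rows. The declarative version leaves the DBMS free to change its strategy, for example by using an index, without the query being rewritten.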

A Big Picture of What You Will Learn

You Will Learn

Data Model
  Relational Model
  Entity-Relationship (ER) Model

Data Schema
  How to put pieces together to build a schema describing the application

Database
  Build an actual database and manipulate data

Database Management System (DBMS)
  We will use Oracle

Query Language
  SQL

Relational Data Model: Overview

The most widely used model today

It is a tabular representation of the data

Main concepts:
  Relation (table): basically a table with rows and columns
  Every relation has a schema, which describes its columns, also called fields or attributes

Example Database: Relational

Tabular View of Data in Airline System

Flight
flightNo  start  destination  miles
101       BOS    LAX          3000
102       PVD    LAX          2900

Passenger
pName  freqFlyerID  DoB   milesEarned
Mike   3433         1980  12000
Mary   5872         1981  11000

Travel
flightNo  freqFlyerID  date
101       3433         Jan 4
102       5872         Jan 5

A tabular view of data is called the “Relational Model”
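The three airline tables can be built and combined in a few lines. This is a sketch assuming Python's standard-library sqlite3 module rather than the Oracle system used in the course; the column types and key declarations are assumptions, since the slide only shows names and sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Flight (
    flightNo INTEGER PRIMARY KEY, start TEXT, destination TEXT, miles INTEGER);
CREATE TABLE Passenger (
    pName TEXT, freqFlyerID INTEGER PRIMARY KEY, DoB INTEGER, milesEarned INTEGER);
CREATE TABLE Travel (
    flightNo INTEGER REFERENCES Flight(flightNo),
    freqFlyerID INTEGER REFERENCES Passenger(freqFlyerID),
    date TEXT);

INSERT INTO Flight VALUES (101, 'BOS', 'LAX', 3000), (102, 'PVD', 'LAX', 2900);
INSERT INTO Passenger VALUES ('Mike', 3433, 1980, 12000), ('Mary', 5872, 1981, 11000);
INSERT INTO Travel VALUES (101, 3433, 'Jan 4'), (102, 5872, 'Jan 5');
""")

# Travel links the other two tables: who flew which route, and when?
trips = conn.execute("""
    SELECT p.pName, f.start, f.destination, t.date
    FROM Travel AS t
    JOIN Passenger AS p ON p.freqFlyerID = t.freqFlyerID
    JOIN Flight AS f ON f.flightNo = t.flightNo
    ORDER BY t.flightNo
""").fetchall()
print(trips)
```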

Entity-Relationship Model: Overview

Models the application as a collection of entities and relationships

Represented using an Entity-Relationship Diagram (ERD)

SQL: Overview

SQL: Non-procedural language to access the data inside a database

External programs, e.g., in C or Java, typically access the database using:
  Language extensions to allow embedded SQL
  ODBC: Open Database Connectivity
  JDBC: Java Database Connectivity
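ODBC and JDBC follow the same pattern: open a connection, send SQL text, fetch rows back into host-language variables. The sketch below shows that pattern with Python's DB-API, used here as an analogue and not as ODBC or JDBC itself. It assumes the standard-library sqlite3 module; the helper students_at and the rows are hypothetical.

```python
import sqlite3

def students_at(conn, address):
    """Return (ID, Name) pairs for one address, using a parameterized query."""
    cursor = conn.cursor()
    # The "?" placeholder keeps the SQL text separate from the host-language value.
    cursor.execute(
        "SELECT ID, Name FROM Student WHERE address = ? ORDER BY ID", (address,)
    )
    return cursor.fetchall()

# With a server-based DBMS this step would name a host, a user, and a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER, Name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Alice", "320FL"), (2, "Bob", "105AK")],
)
result = students_at(conn, "320FL")
conn.close()
print(result)
```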

Logical vs. Physical

How is this information stored?

Levels of Abstraction

• View Level – describes how users see the data

• Logical Level – describes the logical structures used
  • Relational Model
  • ERD model

• Physical Level – describes files and indexes
  Usually hidden from users

Levels of Abstraction: Airline Application Example

Logical (Conceptual) Level
  Flight, Passenger, Travel tables

Physical Level
  Flight table stored as a file sorted on the flight number
  Index on flightNo attribute for Flight relation

View Level (External Schema)
  NoOfPassengers (flightNo, date, numPassengers)
  Hide employees’ salaries

These levels of abstraction lead to “Data Independence”
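One plausible way to define the NoOfPassengers view on top of the Travel table is sketched below. The view definition is an assumption (the slide only gives its name and columns), the second passenger on flight 101 is made-up data added so the count is not trivially 1, and Python's standard-library sqlite3 module stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Logical level: the Travel table (one made-up extra row for flight 101).
CREATE TABLE Travel (flightNo INTEGER, freqFlyerID INTEGER, date TEXT);
INSERT INTO Travel VALUES (101, 3433, 'Jan 4'), (101, 5872, 'Jan 4'), (102, 5872, 'Jan 5');

-- View level: users see passenger counts, not individual travellers.
CREATE VIEW NoOfPassengers AS
SELECT flightNo, date, COUNT(*) AS numPassengers
FROM Travel
GROUP BY flightNo, date;
""")

counts = conn.execute(
    "SELECT flightNo, date, numPassengers FROM NoOfPassengers ORDER BY flightNo"
).fetchall()
print(counts)
```

Users of the view never see freqFlyerID, which is the same idea as hiding salaries from users who should not see them.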

Data Independence

A DBMS has the three levels of abstraction

Ability to modify one level without affecting the other levels

Physical data independence:
  Physical schema, such as indexes, can change, but the logical schema need not change
  Protection from changes in the physical structure of data

Logical data independence:
  Logical schema can change, but views need not change
  Protection from changes in the logical structure of data
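Logical data independence can be sketched with a view that shields an application query from a change in the tables underneath. Everything beyond the Flight table is hypothetical: the view FlightMiles, the split into Route and Distance, and the use of Python's standard-library sqlite3 module. In this sketch the view's definition is remapped by the database designer, while what users see through the view, and the query written against it, stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Flight (flightNo INTEGER, start TEXT, destination TEXT, miles INTEGER);
INSERT INTO Flight VALUES (101, 'BOS', 'LAX', 3000), (102, 'PVD', 'LAX', 2900);
CREATE VIEW FlightMiles AS SELECT flightNo, miles FROM Flight;
""")

# The application's query; it is never edited below.
APP_QUERY = "SELECT flightNo, miles FROM FlightMiles ORDER BY flightNo"
before = conn.execute(APP_QUERY).fetchall()

# Logical schema change (hypothetical): split Flight into two tables.
conn.executescript("""
DROP VIEW FlightMiles;
CREATE TABLE Route (flightNo INTEGER, start TEXT, destination TEXT);
CREATE TABLE Distance (flightNo INTEGER, miles INTEGER);
INSERT INTO Route SELECT flightNo, start, destination FROM Flight;
INSERT INTO Distance SELECT flightNo, miles FROM Flight;
DROP TABLE Flight;
CREATE VIEW FlightMiles AS
SELECT r.flightNo, d.miles
FROM Route AS r JOIN Distance AS d ON d.flightNo = r.flightNo;
""")

after = conn.execute(APP_QUERY).fetchall()
print(before == after)
```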


Other Advanced Topics

Efficient access

Query optimization

Concurrency control

Recovery control

Big Data Analytics

>> We will not have time to study these subjects during the course

>> It is important to know that they exist and what each one means

Efficient Access

Indexing
  Indexes give direct access to the “necessary” portion of the data, as opposed to sequential access in files
  E.g., directly find a given customer without scanning all customers
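The effect of an index can be observed by asking the DBMS how it plans to run a query before and after the index exists. This sketch assumes Python's standard-library sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the Customer table, the index name, and the data are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (custID INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [(i, f"customer{i}") for i in range(10_000)],  # made-up customers
)

QUERY = "SELECT custID FROM Customer WHERE name = 'customer4242'"

def plan(sql):
    """Return SQLite's description of how it would execute the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the readable detail

plan_without_index = plan(QUERY)   # expected to describe a scan of the whole table
answer_without_index = conn.execute(QUERY).fetchall()

conn.execute("CREATE INDEX idx_customer_name ON Customer (name)")

plan_with_index = plan(QUERY)      # expected to describe a search using the index
answer_with_index = conn.execute(QUERY).fetchall()

print(plan_without_index)
print(plan_with_index)
```

The query text and its answer are identical in both runs; only the access path changes.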

Query Optimization

Costing: estimate expected execution times

Query optimization:
  Generates many alternatives to answer a query
  Estimates the cost of each alternative
  Automatically determines and prepares optimal (or near-optimal) access plans for getting the data

Optimizer = “The Bread and Butter of a DBMS!”

SELECT ID, Name
FROM Student
WHERE address = '320FL';
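The "generate alternatives, cost each one, pick the cheapest" loop can be illustrated with a toy model. This is not how any real optimizer computes costs: the cost formulas, the page size, and the index fanout are all simplifying assumptions chosen for illustration, and every function name is hypothetical.

```python
import math

def full_scan_cost(num_rows, rows_per_page):
    """Toy cost: read every page of the table once."""
    return math.ceil(num_rows / rows_per_page)

def tree_height(num_rows, fanout):
    """Levels needed for a tree index with the given fanout to cover num_rows."""
    height, capacity = 1, fanout
    while capacity < num_rows:
        capacity *= fanout
        height += 1
    return height

def index_lookup_cost(num_rows, matching_rows, fanout):
    """Toy cost: walk the index root-to-leaf, then fetch one page per matching row."""
    return tree_height(num_rows, fanout) + matching_rows

def choose_plan(num_rows, matching_rows, rows_per_page=100, fanout=100):
    """Enumerate the alternatives, cost each one, and return the cheapest."""
    alternatives = {
        "full scan": full_scan_cost(num_rows, rows_per_page),
        "index lookup": index_lookup_cost(num_rows, matching_rows, fanout),
    }
    best = min(alternatives, key=alternatives.get)
    return best, alternatives

# Few students live at the address: the index wins.
selective_choice, selective_costs = choose_plan(num_rows=1_000_000, matching_rows=5)
# Half the table matches: reading everything sequentially is cheaper.
broad_choice, broad_costs = choose_plan(num_rows=1_000_000, matching_rows=500_000)
print(selective_choice, selective_costs)
print(broad_choice, broad_costs)
```

The same query gets different plans depending on how many rows are expected to match, which is why the optimizer needs cost estimates and not just a fixed rule.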

Concurrency Control

DBMS ensures data is consistent under concurrent access
  E.g., multiple airline staff trying to reserve a seat for different customers

Concepts:
  Transactions – grouping multiple instructions (reads/writes) into one atomic unit
  Locks – locking of resources (e.g., tables)
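The "one atomic unit" idea can be sketched with a seat reservation that needs two writes. The snippet assumes Python's standard-library sqlite3 module; the Seats and Reservation tables and the reserve function are hypothetical. It uses a single connection, so it shows atomicity only, not the locking that coordinates several concurrent users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Seats (flightNo INTEGER PRIMARY KEY, "
    "seatsLeft INTEGER CHECK (seatsLeft >= 0))"
)
conn.execute("CREATE TABLE Reservation (flightNo INTEGER, pName TEXT)")
conn.execute("INSERT INTO Seats VALUES (101, 1)")  # one seat left (made-up data)
conn.commit()

def reserve(flight_no, passenger):
    """Record the reservation and take one seat as a single atomic unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "INSERT INTO Reservation VALUES (?, ?)", (flight_no, passenger)
            )
            conn.execute(
                "UPDATE Seats SET seatsLeft = seatsLeft - 1 WHERE flightNo = ?",
                (flight_no,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # no seat left: the INSERT above is undone as well

mike_ok = reserve(101, "Mike")
mary_ok = reserve(101, "Mary")
reservations = conn.execute("SELECT pName FROM Reservation").fetchall()
seats_left = conn.execute(
    "SELECT seatsLeft FROM Seats WHERE flightNo = 101"
).fetchone()[0]
print(mike_ok, mary_ok, reservations, seats_left)
```

The second attempt fails on its UPDATE, and the rollback also removes the Reservation row it had already inserted, so the two tables never disagree.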

Recovery Control

If the system crashes in the middle of a transaction, recovery must be provided:
  Cannot afford to lose data or leave it inconsistent

Concepts:
  Logging of transactions’ actions
  Ability to redo or undo transactions
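Logging, redo, and undo can be illustrated with a toy log kept in plain Python. This is a deliberately simplified sketch and not the recovery algorithm of any particular DBMS: the log record format, the function recover, and the account data are all assumptions made for illustration.

```python
def recover(disk, log):
    """Rebuild a consistent state after a crash from a toy log.

    disk: dict of key -> value as found on disk after the crash.
    log:  list of records, either ("write", txn, key, old_value, new_value)
          or ("commit", txn), in the order they were logged.
    """
    committed = {record[1] for record in log if record[0] == "commit"}
    state = dict(disk)

    # Redo: replay the writes of committed transactions, oldest first.
    for record in log:
        if record[0] == "write" and record[1] in committed:
            _, _, key, _, new_value = record
            state[key] = new_value

    # Undo: roll back the writes of uncommitted transactions, newest first.
    for record in reversed(log):
        if record[0] == "write" and record[1] not in committed:
            _, _, key, old_value, _ = record
            state[key] = old_value

    return state

# T1 moves 50 from A to B and commits; T2 withdraws 30 from A but never commits.
log = [
    ("write", "T1", "A", 100, 50),
    ("write", "T1", "B", 0, 50),
    ("commit", "T1"),
    ("write", "T2", "A", 50, 20),
]
# At the crash, T2's write had reached disk but T1's write to B had not.
disk_after_crash = {"A": 20, "B": 0}

recovered = recover(disk_after_crash, log)
print(recovered)
```

After recovery, the committed transfer is fully present and the unfinished withdrawal has left no trace.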

Big Data Analytics

Large-Scale Data Management

Big Data Analytics

Data Science and Analytics

• How to manage very large amounts of data and extract value and knowledge from them

Data Explosion

Who uses databases?

End users

DB application programmers

Database Administrators
  Database design
  Security, authorization
  Data availability, crash recovery
  Database tuning (for performance)

Summary: Why study DBMS?

Need to process large amounts of data efficiently
  Video, WWW, computer games, geographic information systems (GIS), genome data, digital libraries, etc.
  Make use of all functionalities provided by DBMSs

DB administrators and programmers hold rewarding jobs

DB research is one of the most exciting areas in Computer Science!