sql overview - texas a&m...

Post on 30-Mar-2020

0 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

SQL OVERVIEWCSCE 315, SPRING 2019

PROJECT 1, PART 3

ROBERT LIGHTFOOT

Slides adapted from those used byJeffrey Ullman, via Jennifer Welch

SQL

• Structured Query Language

• Database language used to manage and query

relational databases

• A well-known, commonly used standard

• Regularly updated

• Many extensions, variations

• Platform-specific versions, etc.

GENERATIONS OF PROGRAMMING LANGUAGES

• 1st generation• Machine code

• 2nd generation• Human-readable but directly related to processor

• Assembly language, C (sort of)

• 3rd generation• Abstraction from processor, easier for humans

• Fortran, C/C++, Java, etc.

• 4th generation• Programming Language for specific task

• e.g. SQL, Matlab

• 5th generation• Give constraints (goal), and result follows logically

• e.g. Prolog

SQL ELEMENTS

• Data Definition Language (DDL)• Supports creation of database schema

• Data Manipulation Language (DML)• Supports entering/removing data

•Querying Language• Supports query operations (don’t change data itself)

•Others:• Transaction control, Data control

OUR DISCUSSION OF SQL

• Will highlight some of the structures and features of SQL

• Give you an idea of the basics of how it works

• Reflects how relational databases work

• Not meant to make you SQL programmers

• You will need to know parts of this for the first project

DATABASE SCHEMA

• The set of relations (tables) in the database.

• Create, delete, change tables

CREATE

• Define a relation

CREATE TABLE <name> (

<element list>

);

element = <name> <type>

ELEMENT TYPES

• INT, INTEGER• Integers

• FLOAT, REAL• Floating-Point numbers

• CHAR(n)• Fixed-length string of n characters

• VARCHAR(n)• Variable-length string of up to n characters

• DATE• yyyy-mm-dd

• TIME• hh:mm:ss

EXAMPLE

CREATE TABLE HouseRep (

Name VARCHAR(80),

Party CHAR(10),

Birthdate DATE,

YearsInCongress INT,

Salary REAL

);

DECLARING KEYS

• Keys declared within CREATE statement

• Key attributes functionally determine all other attributes

in the relation

• List under PRIMARY KEY

• Elements of primary key can not be NULL

EXAMPLE

CREATE TABLE HouseRep (

Name VARCHAR(80),

Party CHAR(10),

Birthdate DATE,

YearsInCongress INT,

Salary REAL,

PRIMARY KEY (Name)

);

EXAMPLE

CREATE TABLE HouseRep (

Name VARCHAR(80),

Party CHAR(10),

Birthdate DATE,

YearsInCongress INT,

Salary REAL,

PRIMARY KEY (Name, Birthdate)

);

OTHER ELEMENT MODIFIERS

• UNIQUE• Placed after type• Only one tuple in that relation for each value (except

NULL)• Can imply key if no primary key given• Can be NULL

• NOT NULL• Cannot take value NULL

• DEFAULT• Default value specified

EXAMPLE

CREATE TABLE HouseRep (

Name VARCHAR(80) UNIQUE,

Party CHAR(10),

Birthdate DATE NOT NULL,

YearsInCongress INT DEFAULT 0,

Salary REAL DEFAULT 120000.00

);

OTHER TABLE MODIFICATIONS

• DROP <name>• Deletes that table

• ALTER TABLE <name> ADD <attribute>• Adds a new column to table

• ALTER TABLE <name> DROP <attribute>• Removes the column from the table

VIEWS

• Views are a sort of “virtual table”, usually created as the

result of a query

• Format:

CREATE VIEW <name> AS <query>

MODIFYING THE DATABASE

• Data Manipulation Language

• Given a schema, must “populate” the database with

actual data

• Insert, Delete, Modify

INSERTION

• INSERT command:

INSERT INTO <Relation>

VALUES (<value list>);

• Can specify only certain attributes in Relation

Relation(<attribute list>)

• Instead of values, can have subquery

INSERTION EXAMPLE

• Senator (Name, Party, State, Years)

INSERT INTO Senator

VALUES (Jill Smith, Republican, NY, 5);

INSERT INTO Senator(Name, State)

VALUES (Jill Smith, NY);

DELETION

• Delete from relation according to condition

DELETE FROM <Relation>

WHERE <condition>;

• Example: delete Texas Senators:

DELETE FROM Senator

WHERE State = ‘TX’;

MODIFICATION

• Update subset according to condition

UPDATE <Relation>

SET <list of attribute assignments>

WHERE <condition>;

• Example: Joe Lieberman becomes Independent

UPDATE Senator

SET Party = ‘Independent’

WHERE Name = ‘Joseph Lieberman’;

QUERIES

• The heart of SQL

• Queries can form portion of other commands

• e.g. INSERT results of a query into a table

• Form:

• SELECT attributes

• FROM relation(s)

• WHERE condition

EXAMPLE

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Senator:

Query:SELECT Name

FROM Senator

WHERE Party = ‘Republican’;

Result: Name

Jill Smith

Jim Brown

STATEMENT PROCESSING• Begin with the relation(s) in the FROM clause

• Can be the result of another query!

• Apply selection condition in WHERE clause

• Can potentially be very complex, and include subqueries

• Get the attributes given in (more generally, apply a projection to) the SELECT clause

• Process: iterate through all tuples in FROM, checking vs. WHERE, and for those that match, apply the SELECT-

SELECT CLAUSE - *• Can use a * for SELECT to indicate all attributes given in

the relation listed in FROM.

• Senator:

• Query:SELECT *

FROM Senator

WHERE Party = ‘Republican’;

• Result:

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jill Smith Republican NY 5

Jim Brown Republican PA 15

SELECT CLAUSE - AS• Can use AS to rename attributes in result

• Senator:

• Query:SELECT Name AS Person, Party AS Affiliation, State

FROM Senator

WHERE Party = ‘Republican’;

• Result:

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Person Affiliation State

Jill Smith Republican NY

Jim Brown Republican PA

SELECT CLAUSE - EXPRESSION• Can include expressions in SELECT Clause

• Senator:

• Query:SELECT Name, Years * 365 AS DaysInOffice

FROM Senator

WHERE Party = ‘Republican’;

• Result:

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name DaysInOffice

Jill Smith 1825

Jim Brown 5475

AGGREGATIONS• SUM, AVG, COUNT, MIN, MAX

• COUNT(*) counts number of tuples

• The main purpose is to Aggregate a number of records to a single row

• Applied to column(s) in SELECT clause

• Example 1: Count the number of reps in US

SELECT COUNT(*)FROM USReps

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Count(*)

4

AGGREGATIONS• SUM, AVG, COUNT, MIN, MAX

• COUNT(*) counts number of tuples

• The main purpose is to Aggregate a number of records to a single row

• Applied to column(s) in SELECT clause

• Example 2: Select Party and also Count the number of reps

SELECT Party, COUNT(*)FROM USReps

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

GROUPING AGGREGATIONS• Adding GROUP BY <attribute> at the end will apply

aggregation only to group

• Example 3: Count the number of U.S. Representatives by Party

(also mention which party)

SELECT Party, COUNT(*)FROM USReps GROUP BY Party

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Party Count

Republican 2

Democrat 2

HAVING• Can restrict GROUP using HAVING

• HAVING can refer to the FROM clause and its attributes

• Example 4: Count representatives by party, but discard parties that have any representative with 3 years or less experience

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Party Count

Republican 2SELECT Party, COUNT(*) FROM USReps

GROUP BY Party HAVING MIN(Years) > 3

WHERE CLAUSE – COMPLEX EXPRESSIONS

• Can include NOT, AND, OR operators

• Senator:

• Query:

SELECT *

FROM Senator

WHERE Party = ‘Republican’ OR Years > 3;

• Result:

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jill Smith Republican NY 5

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

WHERE CLAUSE – OTHER EFFECTS

• Order of operations, including parentheses

• LIKE: String comparisons with wildcards

• % means any string

• _ means any

character

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

SELECT * FROM Senator WHERE Name LIKE ‘J%’

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Jim Brown Republican PA 15

WHERE CLAUSE – NULL VALUES

• Tuples may contain NULL values

• Undefined/Unknown

• Inapplicable

• All conditions evaluate to either TRUE, FALSE, or

UNKNOWN

• Comparisons to NULL are UNKNOWN

• Tuples selected only if TRUE

3-VALUED LOGIC

• Can think of values as• TRUE = 1

• FALSE = 0

• UNKNOWN = ½

•Operations would be• OR = MAX

• AND = MIN

• NOT = 1-x

• Example: (T AND ((NOT U OR F) AND NOT (U OR T)))

3-VALUED LOGIC

• Can think of values as• TRUE = 1

• FALSE = 0

• UNKNOWN = ½

•Operations would be• OR = MAX

• AND = MIN

• NOT = 1-x

• Example: (T AND ((NOT U OR F) AND NOT (U OR T)))

MAX(1- ½,0) = MAX(½,0) = ½ = U

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: (T AND (U AND NOT (U OR T)))

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: (T AND (U AND NOT (U OR T)))

MAX(½, 1) = 1 = T

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: (T AND (U AND NOT T)

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: (T AND (U AND NOT T) )

MIN(½, 1-1) = MIN(½,0) = 0 = F

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: (T AND F)

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: (T AND F)

MIN(0,1) = 0 = F

3-VALUED LOGIC

• Can think of values as• TRUE = 1• FALSE = 0• UNKNOWN = ½

• Operations would be• OR = MAX• AND = MIN• NOT = 1-x

• Example: F

(T AND ((NOT U OR F) AND NOT (U OR T)))

UNEXPECTED RESULTS FOR NULLS

• WHERE (Years > 2) OR (Years < 3)

• This should “cover” all cases

• If Years is NULL

• Years > 2 is UNKNOWN

• Years < 3 is UNKNOWN

• So the OR is UNKNOWN

• And thus the tuple is NOT selected!

WHERE CLAUSE –IN OPERATOR

• <tuple> IN <relation>

• TRUE iff the tuple is a member of the

relation

SELECT *

FROM ElectedOfficial

WHERE Name IN USRep

ElectedOfficial

Name Party

Lloyd Doggett Democrat

John Cornyn Republican

John Adams Federalist

Bill Flores Republican

Result

Name Party

Lloyd Doggett Democrat

Bill Flores Republican

USRep

Name

Lloyd Doggett

Bill Flores

WHERE CLAUSE –EXISTS OPERATOR• EXISTS (<relation>)

• TRUE iff the relation is not empty

relation

SELECT *

FROM ElectedOfficial

WHERE EXISTS(USRep)

ElectedOfficial

Name Party

Lloyd Doggett Democrat

John Cornyn Republican

John Adams Federalist

Bill Flores Republican

USRep

Name

Bill Flores

Lloyd Doggett

Result

Name Party

Lloyd Doggett Democrat

John Cornyn Republican

John Adams Federalist

Bill Flores Republican

EXISTS (AND OTHER) OPERATORS

• Usually applied to the results of a subquery

• Example: is any Senator a Whig?

EXISTS(

SELECT *

FROM Senator

WHERE Party = ‘Whig’

)

WHERE CLAUSE –ANY AND ALL OPERATORS

• x = ANY(<relation>)

• TRUE iff x is equal to at least one tuple in the relation

• x = ALL(<relation>)

• TRUE iff x is equal to all tuples in the relation

• The = can also be >, >=, <, <=, <>

• The relation should have only one attribute

EXAMPLE: ANY

SELECT *

FROM ElectedOfficial

WHERE Party = ANY (CurrentParties)

ElectedOfficial

Name Party

Lloyd Doggett Democrat

Bill Flores Republican

John Adams Federalist

John Cornyn Republican

CurrentParties

Name

Democrat

Republican

Result

Name Party

Lloyd Doggett Democrat

Bill Flores Republican

John Cornyn Republican

EXAMPLE: ALLSenator

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jim Brown Republican PA 15

SELECT *

FROM ElectedOfficial

WHERE Years > ALL (YearsPresidentsInSenate)

YearsPresidentsInSenate

Years Served

6

0

12

6

0

EXAMPLE: ALLSenator

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jim Brown Republican PA 15

SELECT *

FROM ElectedOfficial

WHERE Years > ALL (YearsPresidentsInSenate)

YearsPresidentsInSenate

Years Served

6

0

12

6

0

EXAMPLE: ALLSenator

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jim Brown Republican PA 15

SELECT *

FROM ElectedOfficial

WHERE Party = ANY (CurrentParties)

YearsPresidentsInSenate

Years Served

6

0

12

6

0

EXAMPLE: ALLSenator

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jim Brown Republican PA 15

SELECT *

FROM ElectedOfficial

WHERE Years > ALL (YearsPresidentsInSenate)

YearsPresidentsInSenate

Years Served

6

0

12

6

0

EXAMPLE: ALLSenator

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jim Brown Republican PA 15

SELECT *

FROM ElectedOfficial

WHERE Years > ALL (YearsPresidentsInSenate)

YearsPresidentsInSenate

Years Served

6

0

12

6

0

EXAMPLE: ALLSenator

Name Party State Years

Jill Smith Republican NY 5

Joe Adams Democrat NJ 0

Sue Jones Democrat CT 9

Jim Brown Republican PA 15

Name Party State Years

Jim Brown Republican PA 15

SELECT *

FROM ElectedOfficial

WHERE Years > ALL (YearsPresidentsInSenate)

YearsPresidentsInSenate

Years Served

6

0

12

6

0

UNION, INTERSECT, DIFFERENCE

• Can combine subqueries with Boolean operations

• e.g. (subquery) UNION (subquery)

• Default: duplicates are removed by these operations unless ALL is included

• (subquery) INTERSECT ALL (subquery)

• Likewise, can remove duplicates in normal SELECT by including DISTINCT

• SELECT DISTINCT Years …

“BAG” VS. “SET” SEMANTICS

• Items are in a “bag”

• Duplicates OK

• Items are in a “set”

• Duplicates removed

JOINS

• Combining relations into one new relation

• Many ways, variations

• <relation> CROSS JOIN <relation>

• Takes every possible combination

CROSS JOIN EXAMPLE

VanTypes

Make Model

Dodge Caravan

Honda Odyssey

Result

Make Model Seats Paint

Dodge Caravan Cloth Standard

Dodge Caravan Leather Standard

Dodge Caravan Leather Premium

Honda Odyssey Cloth Standard

Honda Odyssey Leather Standard

Honda Odyssey Leather Premium

SeatsAndPaint

Seats Paint

Cloth Standard

Leather Standard

Leather Premium

INNER JOINS• Inner Joins are based on the Cross Join

• Join is usually limited by some comparison using ON (Theta Join)

Creates table with one (Senator, Representative) tuple for every pair from the same state.

(Note: both State attributes still appear)

SELECT * FROM Senator

INNER JOIN Representative

ON Senator.State = Representative.State

NATURAL JOINS

• Automatically looks for matching columns

• The “ON” keyword does not apply here

•Only one column for each match, and only select

tuples that match in those columns

SELECT * FROM Senator

NATURAL JOIN Representative

NATURAL JOIN EXAMPLEStudents

Name School

Joe Smith Rice

Jill Smith LSU

Sam Jones Texas A&M

Sue Jones Rice

SchoolLocations

School City

Texas A&M College Station

Rice Houston

LSU Baton Rouge

Result

Name School City

Joe Smith Rice Houston

Jill Smith LSU Baton Rouge

Sam Jones Texas A&M College Station

Sue Jones Rice Houston

SELECT * FROM Students

NATURAL JOIN SchoolLocations

OUTER JOIN

• Includes tuples from both relations, even if no match in the

other

• Those attributes are set to NULL

• LEFT, RIGHT, FULL

• Keep all records from left table, or from right table, or from both

PROJECT 1 CHECKPOINT 3

• Helpful SQL Query Tools (go this site): (old, I’ll look for a newone)

http://beekeeperdata.com/posts/2016/02/01/5-free-query-tools-for-sql.html

• Installs mini SQL engine and gives a platform to test queries

• Useful for verifying correctness of your implementation

• Implementing the Database::Query function

• Some Important Queries

• Implementation Issues

SOME EXAMPLE QUERIES

SELECT * FROM ElectedOfficial WHERE Years > 5

SELECT * FROM ElectedOfficial WHERE Years > 5

AND Years < 7

SELECT * FROM ElectedOfficial WHERE Years > 5

AND Years < 7 OR (Years > Salary / 100 * 7

AND Years < Networth/1000)

SELECT * FROM ElectedOfficial WHERE Years > 5

AND Years < 7 OR NOT (Years > Salary / 100 *

7 - 35) ORDER BY Salary

MORE QUERIES

SELECT * FROM Invoice WHERE

Invoice.total > (SELECT MAX (UnitPrice)

FROM Track)

SELECT * FROM Invoice WHERE

Invoice.total > (SELECT MAX (UnitPrice)

FROM Track) AND Invoice.total < 10

OTHER QUERIES• ANY, ALL, EXISTS, IN

• Required for 4-person teams

• 3-person teams can get extra

SELECT * FROM Invoice WHERE Invoice.total >

ANY (SELECT UnitPrice FROM Track)

SELECT * FROM Invoice WHERE EXISTS

(SELECT * FROM Track WHERE Invoice.total >

UnitPrice)

Context, not independent!!!

IMPLEMENTATION THOUGHT

General Format:

The condition can be:

• A logical expression within the table

• A number of logical expression separated by AND/OR/NOT

• A number of queries with AND/OR/NOT between them and also with

IN/EXISTS/ANY

SELECT <columns>

FROM <tablename>

WHERE <condition>

QUERY IMPLEMENTATION – THINKING RECURSIVE

• Process the rest of the query as a math expression

• Remember post-fix calculators!!!

• But when comes to another query inside one, call the same

query function recursive

• As argument pass the context

• The recurse-d function is simpler

• It has to process a simpler query

• At some point, you will find a very simple query, and then start

returning the result

QUERY FUNCTION: PSEUDOCODE

Table Query (vector<string> columns, string from-table,

string condition)

{

1. Parse condition string and identify parts,

2. Find their connecting conditions (AND/OR/NOT)

3. Evaluate each part

a. If the part is an expression, evaluate that

b. If the part looks like a query, (i.e., starting

with a SELECT within a parenthesis), call the query

function again to evalue it, pass along the context:

from-table

4. Combine all parts and apply the AND/OR/NOT

condition between them

5. Each row of the from-table is tested this way and

either kept or thrown away

6. Return the kept rows and only the selected colums

as table

}

top related