sql. 1.sql is a high-level language, in which the programmer is able to avoid specifying a lot of...

25
SQL

Post on 20-Dec-2015

225 views

Category:

Documents


0 download

TRANSCRIPT

SQL

SQL

1. SQL is a high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in languages like C++.

2. What makes SQL viable is that its queries are “optimized” quite well, yielding efficient query executions.

Select-From-Where StatementsPrincipal form of a query is:

SELECT desired attributes

FROM one or more tables

WHERE condition about tuples of the tables

1. Begin with the relation in the FROM clause.

2. Apply the selection indicated by the WHERE clause.

3. Apply the extended projection indicated by the SELECT clause.

Running Example

Our SQL queries will be based on the following database schema.

Movie(title, year, length, inColor, studioName, producerC)StarsIn(movieTitle, movieYear, starName)MovieStar(name, address, gender, birthdate)MovieExec(name, address, cert#, netWorth)Studio(name, address, cert#, netWorth)

Example: Find all movies produced by Disney in 1990.

SELECT *FROM MovieWHERE studioName = 'Disney' AND year = 1990;

(Extended) Projection in SQL

SELECT title, lengthFROM MovieWHERE studioName = 'Disney' AND year = 1990;

SELECT title AS name, length AS durationFROM MovieWHERE studioName = 'Disney' AND year = 1990;

SELECT title AS name, length*0.016667 AS lenghtInHoursFROM MovieWHERE studioName = 'Disney' AND year = 1990;

SELECT title AS name, length/60 AS length, 'hrs.' AS inHoursFROM MovieWHERE studioName = 'Disney' AND year = 1990;

Selection in SQL• The selection of the relational algebra is expressed through

the WHERE clause of SQL. • We may build expressions by using the operators:

=, <>, <, >, <=, >=

• String constants are surrounded by single quotes. – studioName = 'Disney'

• Numeric constants are for e.g.: -12.34, 1.23E45• Boolean operators are: AND, OR, NOT.

ExampleWhich movies are made after 1970 and aren't in color?

SELECT title FROM MovieWHERE (year > 1970) AND NOT (inColor='C');

Selection in SQL (Cont.)

• Which Disney movies are after 1970 or have length greater than 90 mins?

SELECT title FROM MovieWHERE (year > 1970 OR length < 90) AND

studioName='Disney';

• Parenthesis are needed because the precedence of OR is less than that of AND.

Comparison of strings• Strings can be compared (lexicographically) with the same

operators: =<><><=>=

• For instance ‘fodder’<‘foo’ ‘bar’ < ‘bargain’

Patterns• WHERE clauses can have conditions in which a string is compared with a

pattern, to see if it matches.

• General form:

<Attribute> LIKE <pattern> or <Attribute> NOT LIKE <pattern>

• <pattern> is a quoted string which may contain

% = meaning “any string”

_ = meaning “any character.”

Example. Suppose we remember a movie “Star something”.

SELECT title

FROM Movie

WHERE title LIKE 'Star %';

Apostrophes. Two consecutive apostrophes represent one apostrophe and not the end of the string. E.g.

SELECT title

FROM Movie

WHERE title LIKE '%''s%';

Patterns (Continued) • What if the pattern we wish to use in a LIKE expression involves the

characters % or _? – We should “escape” their special meaning proceeding them by

some escape character. – In UNIX and C we use backslash \ as the escape character.– SQL allows us to use any character we like.

• s LIKE 'x%%x%%' ESCAPE 'x'– x will be the escape character.– A string that is matched by this pattern is for example: %aaaa%

Example

SELECT title

FROM Movie

WHERE title LIKE 'x%%x%%' ESCAPE 'x';

Ordering the Input• We may ask the tuples produced by a query to be presented in sorted

order.

• ORDER BY <list of attributes>

Example. • Find the Disney movies of 1990. List them by length, shortest first, and

among movies of equal length, sort alphabetically– Movie(title, year, length, inColor, studioName, producerC#)

SELECT *FROM MovieWHERE studioName = 'Disney'ORDER BY length, title;

Remarks• Ordering is ascending, unless you specify the DESC keyword after an

attribute.• Ties are broken by the second attribute on the ORDER BY list, etc.

NULL Values

• Tuples in relations can have NULL as a value for one or more components.

• Meaning depends on context. Two common cases:– Missing value: e.g., we know the length has some

value, but we don’t know what it is.– Inapplicable: e.g., the value of attribute spouse for

an unmarried person.

Comparing NULL’s to Values

• The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN.

• When any value is compared with NULL, the truth value is UNKNOWN.

• But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).

Three-Valued Logic

• To understand how AND, OR, and NOT work in 3-valued logic, think of – TRUE = 1, FALSE = 0, and UNKNOWN = ½.– AND = MIN – OR = MAX – NOT(x) = 1-x

Example:

TRUE AND (FALSE OR NOT(UNKNOWN)) =

MIN(1, MAX(0, (1 - ½ ))) =

MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.

Surprising Example

SELECT *

FROM Movie

WHERE length <=120 OR length > 120;

• Suppose that we have some NULL values in the length.

• What’s the result?

Products and Joins in SQL• SQL has a simple way to couple relations in one query: list each

relation in the FROM clause.– All the relations in the FROM clause are coupled through Cartesian

product

– Then we can put conditions in the WHERE clause in order to get the various kinds of join.

Example. • We want to know the name of the producer of Star Wars. • To answer we need the information from both:

– Movie(title, year, length, inColor, studioName, producerC)

– MovieExec(name, address, cert, netWorth)

SELECT name

FROM Movie, MovieExec

WHERE title = 'Star Wars' AND producerC = cert;

Disambiguating Attributes• When we involve two or more relations in a query, we can have

attributes with the same name among these relations. – Solution: Disambiguate by putting the name of the relation followed by

a dot and then the name of the attribute.

Example. • Find pairs (star, movie executive) living in the same address.

– MovieStar(name, address, gender, birthdate)

– MovieExec(name, address, cert, netWorth)

SELECT MovieStar.name, MovieExec.name

FROM MovieStar, MovieExec

WHERE MovieStar.address = MovieExec.address;

Aliases• Sometimes we need to ask a query that combines a relation with

itself. – We may list a relation R as many times we want in the from

clause but we need a way to refer to each occurrence of R.– SQL allows us to define, for each occurrence in the FROM

clause, an alias (which is called “tuple variable”).

Example. • We like to know about two stars who share an address.

– MovieStar(name, address, gender, birthdate)

SELECT Star1.name, Star2.name

FROM MovieStar AS Star1, MovieStar AS Star2

WHERE Star1.address = Star2.address

AND Star1.name < Star2.name;• Remark: Don’t put ‘AS’ in Oracle.

Why not use <> instead?

Joins• Natural join is obtained by:

R NATURAL JOIN S;Example

SELECT *FROM MovieStar NATURAL JOIN MovieExec;

• Theta join is obtained by:

R JOIN S ON <condition>

Example

SELECT *

FROM MovieStar JOIN MovieExec

ON moviestar.name = movieexec.name;

In ORACLE it is used in FROM

OuterjoinsSELECT *

FROM moviestar NATURAL FULL OUTER JOIN movieexec;

SELECT *

FROM moviestar NATURAL LEFT OUTER JOIN movieexec;

SELECT *

FROM moviestar NATURAL RIGHT OUTER JOIN movieexec;

One of LEFT, RIGHT, or FULL before OUTER (but not missing). LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both.

RemarkIf we had used <> then we would have produced pairs of married stars

twice, like:

Star1.name Star2.name

Alec Baldwin Kim Basinger

Kim Basinger Alec Baldwin

Connection with Relational Algebra1. Start with the relations in the FROM clause and take

their Cartesian Product.

2. Having created the product, apply a selection to it by converting the WHERE clause to a selection condition.

3. Finally with the list of attributes in the SELECT clause do a projection.

An Unintuitive Consequence of SQL semantics• Suppose R, S, T are relations each having attribute A alone. • We wish to compute R(ST), which is (RS) (RT). • We might expect the following SQL query to do the job.

SELECT R.A

FROM R, S, T

WHERE R.A = S.A OR R.A = T.A

• However, consider the situation in which T is empty. – Since R.A = T.A can never be satisfied, we might expect the query to

produce exactly RT.

• But using the Relational Algebra interpretation the result is empty.– Cartesian product R x S x T is .

Union, Intersection, and Difference of Queries• If two SQL queries produce relations with the same set of attributes then

we can combine the queries using the set operations: UNION, INTERSECT and EXCEPT.

Example. • Find the names and addresses of all female movie stars who are also

movie executives with a net worth over $1,000,000.– MovieStar(name, address, gender, birthdate)

– MovieExec(name, address, cert, netWorth)

(SELECT name, address

FROM MovieStar

WHERE gender = 'F')

INTERSECT

(SELECT name, address

FROM MovieExec

WHERE netWorth > 1000000);

Union, Intersection, and Difference of Queries (Continued)

Example. • Give the names and addresses of movie stars who are not also

movie executives. (SELECT name, address

FROM MovieStar)

EXCEPT

(SELECT name, address

FROM MovieExec);

Example. • We want all the titles and years of movies that appeared in either

the Movie or StarsIn relation.

(SELECT title, year FROM Movie)

UNION

(SELECT title, year FROM StarsIn);

In ORACLE use MINUS.