sql sql is a very-high-level language, in which the programmer is able to avoid specifying a lot of...

24
SQL • SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data- manipulation details that would be necessary in languages like C++. • What makes SQL viable is that its queries are “optimized” quite well, yielding efficient query executions.

Post on 21-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

SQL

• SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in languages like C++.

• What makes SQL viable is that its queries are “optimized” quite well, yielding efficient query executions.

Page 2: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Select-From-Where Statements

• The principal form of a query is:

SELECT desired attributes

FROM one or more tables

WHERE condition about tuples of the tables

Page 3: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Our Running Example

• Our SQL queries will be based on the following database schema.

Movie(title, year, length, inColor, studioName, producerC)StarsIn(movieTitle, movieYear, starName)MovieStar(name, address, gender, birthdate)MovieExec(name, address, cert#, netWorth)Studio(name, address, cert#, netWorth)

Find all movies produced by Disney Studios in 1990.

SELECT *FROM MovieWHERE studioName = 'Disney' AND year = 1990;

Page 4: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Meaning of Single-Relation Query

• Begin with the relation in the FROM clause.

• Apply the selection indicated by the WHERE clause.

• Apply the extended projection indicated by the SELECT clause.

Page 5: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

(Extended) Projection in SQLSELECT title, lengthFROM MovieWHERE studioName = 'Disney' AND year = 1990;

SELECT title AS name, length AS durationFROM MovieWHERE studioName = 'Disney' AND year = 1990;

SELECT title AS name, length*0.016667 AS lenghtInHoursFROM MovieWHERE studioName = 'Disney' AND year = 1990;

SELECT title AS name, length/60 AS length, 'hrs.' AS inHoursFROM MovieWHERE studioName = 'Disney' AND year = 1990;

Page 6: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Selection in SQL• The selection of the relational algebra is available through the

WHERE clause of SQL. • We may build expressions by using the operators:

=<><><=>=

• The string constants are surrounded by single quotes. – studioName = 'Disney'

• Numeric constants are for e.g.: -12.34, 1.23E45• Boolean operators are: AND, OR, NOT.

SELECT title FROM MovieWHERE (year > 1970) AND NOT (inColor='C');

Page 7: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Selection in SQL (Cont.)• Which Disney movies are after 1970 or have length greater than

90 mins?

SELECT title FROM MovieWHERE (year > 1970 OR length < 90) AND studioName='Disney';

• The parenthesis are needed because the precedence of OR is less than that of AND.

Page 8: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Comparision of strings• Strings can as well be compared (lexicographically) with the

same operators: =<><><=>=

• For instance ‘fodder’<‘foo’ ‘bar’ < ‘bargain’

Page 9: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Patterns• WHERE clauses can have conditions in which a string is compared with a pattern, to

see if it matches.

• General form:

<Attribute> LIKE <pattern>

<Attribute> NOT LIKE <pattern>

• Pattern is a quoted string with

% = “any string”

_ = “any character.”

• Examples. Suppose we remember a movie “Star something”.

SELECT title

FROM Movie

WHERE title LIKE 'Star %';

SELECT title

FROM Movie

WHERE title LIKE '%''s%';

Two consecutive apostrophes in a string represent itself and not the end of the string.

Page 10: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Comparison of Strings (Continued)

• What if the pattern we wish to use in a LIKE expression involves the characters % or _?

• We should “escape” their special meaning proceeding them by some escape character.

• In UNIX and C we use backslash \ as the escape character.

• SQL allows us to use any character we like.

• s LIKE 'x%%x%%' ESCAPE 'x'– x will be the escape character.

– A string that is matched by this pattern is for example: %aaaa%

SELECT title

FROM Movie

WHERE title LIKE 'x%%x%%' ESCAPE 'x';

Page 11: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Ordering the Input• We may ask the tuples produced by a query to be presented in sorted

order.

• ORDER BY <list of attributes>

• Example. Find the Disney movies of 1990.– Movie(title, year, length, inColor, studioName, producerC#)– To get the movies listed by length, shortest first, and among movies of equal

length, sort alphabetically:

SELECT *FROM MovieWHERE studioName = 'Disney'ORDER BY length, title;

• Ordering is ascending, unless you specify the DESC keyword to an attribute.

• Ties are broken by the second attribute on the ORDER BY list, etc.

Page 12: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

NULL Values

• Tuples in SQL relations can have NULL as a value for one or more components.

• Meaning depends on context. Two common cases:– Missing value : e.g., we know the length has

some value, but we don’t know what it is.– Inapplicable : e.g., the value of attribute spouse

for an unmarried person.

Page 13: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Comparing NULL’s to Values

• The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN.

• When any value is compared with NULL, the truth value is UNKNOWN.

• But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).

Page 14: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Three-Valued Logic• To understand how AND, OR, and NOT work in 3-valued

logic, think of – TRUE = 1, FALSE = 0, and UNKNOWN = ½.

• AND = MIN • OR = MAX • NOT(x) = 1-x

• Example:

TRUE AND (FALSE OR NOT(UNKNOWN)) =

MIN(1, MAX(0, (1 - ½ ))) =

MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.

Page 15: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Surprising ExampleSELECT *

FROM Movie

WHERE length <=120 OR length > 120;

• Suppose that we have some NULL values in the length.

• What’s the result?

Page 16: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Products and Joins in SQL• SQL has a simple way to couple relations in one query: list each relation

in the FROM clause.– All the relations in the FROM clause are coupled through Cartesian product

– Then we can put conditions in the WHERE clause in order to get the various kinds of join.

• Example. We want to know the name of the producer of Star Wars.

• To answer we need the information from both of the relations:– Movie(title, year, length, inColor, studioName, producerC)

– MovieExec(name, address, cert, netWorth)

SELECT name

FROM Movie, MovieExec

WHERE title = 'Star Wars' AND producerC = cert;

Page 17: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Disambiguating Attributes• When we involve two or more relations in a query, we can have

attributes with the same name among these relations.

• We solve the problem of disambiguating between them by putting the name of the relation followed by a dot and then the name of the attribute.

• Example. Suppose we wish to find pairs (star, movie executive) living in the same address.– MovieStar(name, address, gender, birthdate)

– MovieExec(name, address, cert, netWorth)

SELECT MovieStar.name, MovieExec.name

FROM MovieStar, MovieExec

WHERE MovieStar.address = MovieExec.address;

Page 18: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Tuple Variables• Sometimes we need to ask a query that involves two or more tuples

from the same relation. We may list a relation R as many times we want in the from clause but we need a way to refer to each occurrence of R.

• SQL allows us to define, for each occurrence in the FROM clause, an alias which we call “tuple variable.”

• Example. We like to know about two stars who share an address.

SELECT Star1.name, Star2.name

FROM MovieStar AS Star1, MovieStar AS Star2

WHERE Star1.address = Star2.address

AND Star1.name < Star2.name;

AS is not supported in Oracle.

Page 19: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

In Oracle

SELECT Star1.name, Star2.name

FROM MovieStar Star1, MovieStar Star2

WHERE Star1.address = Star2.address

AND Star1.name <> Star2.name;

Page 20: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Tuple Variables (Continued)

• Why we have the condition – Star1.name < Star2.name ?

• Without this condition we would produce also pairs of identical star names.

• This conditions forces us to produce each pair of stars with a common address only.

• Why we used the operator < and not <>?

• If we had used <> the we would have produced pairs of married stars twice, like:

Star1.name Star2.name

Alec Baldwin Kim Basinger

Kim Basinger Alec Baldwin

Page 21: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Conversion to Relational Algebra• Another approach to interpret SQL queries is to relate them

to relational algebra.

• Start with the relations in the FROM clause and take their Cartesian Product.

• Having created the product, we apply a selection operator to it by converting the WHERE clause to a selection condition.

• Finally from the list of attributes in the SELECT clause we do a projection.

Page 22: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

An Unintuitive Consequence of SQL semantics

• Suppose R, S, T are unary relations each having attribute A alone.

• We wish to compute R(ST).

• We might expect the following SQL query to do the job.

SELECT R.A

FROM R, S, T

WHERE R.A = S.A OR R.A = T.A

• However, consider the situation in which T is empty. Since R.A = T.A can never be satisfied, we might expect the query to produce exactly RT.

• But using the interpretation the result is empty.– If we use the conversion to RA, the Cartesian product R x S x T is .

Page 23: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Union, Intersection, and Difference of Queries

• If two SQL queries produce relations with the same set of attributes then we can combine the queries using the set operations: UNION, INTERSECT and EXCEPT.

• Example. Suppose we want the names and addresses of all female movie stars who are also movie executives with a net worth over $1,000,000.– MovieStar(name, address, gender, birthdate)

– MovieExec(name, address, cert, netWorth)

(SELECT name, address

FROM MovieStar

WHERE gender = 'F')

INTERSECT

(SELECT name, address

FROM MovieExec

WHERE netWorth > 1000000);

Page 24: SQL SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in

Union, Intersection, and Difference of Queries (Continued)

• Example. Give the names and addresses of movie stars who are not also movie executives. (SELECT name, address

FROM MovieStar)

EXCEPT

(SELECT name, address

FROM MovieExec);

• In ORACLE the EXCEPT is MINUS.

• Example. We want all the titles and years of movies that appeared in either the Movie or StarsIn relation.

(SELECT title, year FROM Movie)

UNION

(SELECT title, year FROM StarsIn);