sql, data storage technologies, and web-data integration week 4

Post on 31-Dec-2015


SQL, Data Storage Technologies, and Web-Data Integration

Week 4

Today’s Agenda

• Review

• Intro to SQL Continued

– SELECT, GROUP BY, HAVING, DELETE, UPDATE

• “Advanced” SQL

– Joins, Functions, Locking tables, Transactions

Week 3 Review

• Physical database design

– Column options: NOT NULL, DEFAULT, AUTO_INCREMENT, PRIMARY KEY

• Client/Server Architecture

• Connecting to SQL

– command line mysql client

• Introduction to SQL

– SHOW, USE, CREATE, INSERT

Selecting Data

• Syntax

– SELECT column_name,… FROM TableName

– Use a “*” in place of the column_name list to retrieve all columns

• Examples:

– Show me all the data stored about our donors

– mysql> SELECT * FROM Donor;

– What are all the names of all our donors?

– mysql> SELECT name FROM Donor;

Being More Specific

• Syntax

– SELECT column_name,… FROM TableName WHERE statement

• Example

– Show me all the donors with the name “Jake Johnson”

– mysql> SELECT * FROM Donor WHERE name = 'Jake Johnson';

Conditional Operators

SQL Symbol Definition

= Matches if the values are equal

!= Matches if the values are not equal

> Matches if the left value is greater than the right value

< Matches if the left value is less than the right value

>= Matches if the left value is greater than or equal to the right value

<= Matches if the left value is less than or equal to the right value

IN (value,…) Matches if value is among the values listed

BETWEEN value1 AND value2 Matches if the value is between value1 and value2, or equal to one of them.

LIKE pattern Matches if the value matches the pattern, which may contain wildcard characters.

The LIKE comparison

• Uses wildcard characters to match column data

– '_' represents any one character

– mysql> SELECT name FROM Donor WHERE name LIKE '_ob';

– Matches “Bob”, “Rob”, “Job”, etc.

– '%' represents any number of characters (including none)

– mysql> SELECT name FROM Fruit WHERE name LIKE '%apple';

– Matches “Pineapple” and “Apple”

More LIKE

• Find all donors whose name starts with a “J”:

• mysql> SELECT * FROM Donor WHERE name LIKE 'J%';

• Use “AND” or “OR” to add multiple restrictions in your WHERE clause

• Find all donors whose name starts with a “J” and who have a 206 area code

• mysql> SELECT * FROM Donor WHERE name LIKE 'J%' AND phone_number LIKE '206%';
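The LIKE queries above can be tried without a MySQL server. This is only a sketch: it assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the Donor rows and phone numbers are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the course's MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Donor (donorid INTEGER PRIMARY KEY, name TEXT, phone_number TEXT)"
)
conn.executemany(
    "INSERT INTO Donor (name, phone_number) VALUES (?, ?)",
    [
        ("Jake Johnson", "206-555-0101"),  # made-up sample rows
        ("Jill Smith", "425-555-0102"),
        ("Bob Jones", "206-555-0103"),
    ],
)

# '%' matches any run of characters, so 'J%' means "starts with J".
starts_with_j = [row[0] for row in conn.execute(
    "SELECT name FROM Donor WHERE name LIKE 'J%' ORDER BY name")]

# AND requires both conditions: name starts with "J" and phone starts with "206".
j_and_206 = [row[0] for row in conn.execute(
    "SELECT name FROM Donor WHERE name LIKE 'J%' AND phone_number LIKE '206%'")]

# '_' matches exactly one character, so '_ob%' matches "Bob Jones".
one_char = [row[0] for row in conn.execute(
    "SELECT name FROM Donor WHERE name LIKE '_ob%'")]

print(starts_with_j, j_and_206, one_char)
```

Swapping AND for OR in the second query would return every donor who satisfies either condition.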

Other Constraints

• GROUP BY

– Fun with aggregates

• HAVING

Ordering Your Data

• Syntax

– SELECT column_name,… FROM TableName ORDER BY column_name, …

• Example

– List all donors in alphabetical order

– mysql> SELECT * FROM Donor ORDER BY name;

More Ordering

• Feel free to combine with other constraints, such as WHERE

– mysql> SELECT * FROM Donor WHERE name LIKE 'J%' ORDER BY name;

• You can order by more than one column

– SELECT * FROM Donor ORDER BY lastname, firstname;

• Swap the order with DESC or ASC

– mysql> SELECT * FROM Donor WHERE name LIKE 'J%' ORDER BY name DESC;

Grouping your data

• The GROUP BY clause groups data together so that aggregate functions can be performed on it.

• Very common for reports and statistics

• More interesting with large sets of data

Piping SQL commands to MySQL

• Sometimes we have a big file of SQL commands that we want to run.

• Quit your mysql client application

– mysql> quit;

• Retrieve, and then upload to your dante account:

– https://courses.washington.edu/wtcampus/spring/examples/sql/donation.sql

• Look at the big file of SQL commands

– $ less donation.sql

• Use Unix input redirection (“<”) to send the file of commands to MySQL

– $ /usr/local/mysql-5.0.67-linux-i686/bin/mysql -u uwnetid -p uwnetid < donation.sql

• The “<” operator takes all the lines of text from donation.sql, and sends them to MySQL

Back to Grouping

• What does the Donation table look like?

• mysql> DESCRIBE Donation;

Column Name      Type            Options
DONATIONID       Int unsigned    Primary key auto_increment
date             Datetime        Not null
amount           Decimal(5,2)    Not null
processorName    Varchar(255)
DONORID          Int unsigned

Group By

• Syntax

– SELECT column_name,… FROM TableName GROUP BY column_name, …

• Example

– What is the total amount donated by each donor?

– mysql> SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid;

GROUP BY with other constraints

• GROUP BY must come after any WHERE clause

• GROUP BY must come before any LIMIT or ORDER BY clause

The Groupies

Aggregate Function Definition

AVG(column) Returns the average of the column values.

COUNT(column) Returns the number of rows in which the column is not NULL.

MAX(column) Returns the maximum value of the column.

MIN(column) Returns the minimum value of the column.

STD(column) Returns the standard deviation of the column values.

SUM(column) Returns the sum of the column values.

GROUP BY

• SELECT column_name (or aggregate function), … FROM TableName WHERE clause GROUP BY column_name, …

• You can GROUP BY multiple columns

• Example:

– How many of our donors have the same name?

– SELECT fname, lname, COUNT(*) FROM Donor GROUP BY fname, lname;

HAVING clause

• Syntax

– SELECT column_name,… FROM TableName HAVING statement

• “statement” is the same set of conditionals that the WHERE clause has

– So what is the difference between HAVING and WHERE?

HAVING vs. WHERE

• The WHERE clause happens as MySQL is looking through its table

• The HAVING clause happens on the rows returned by the WHERE clause

• mysql> SELECT * FROM Donor HAVING name = 'Jake Johnson';

– Twice as slow! First scan the Donor table for all the rows, then scan all the rows again for names matching ‘Jake Johnson’.

HAVING

• Let's say we're interested in sending a letter to our top donors – those who donated more than $150.

• Use the GROUP BY clause and the SUM aggregate function to get a list of the total amounts.

• Adding the HAVING clause we can further restrict the results.

• mysql> SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid HAVING SUM(amount) > 150;
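To see what HAVING adds on top of GROUP BY, the two queries can be run side by side. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL, and the donation amounts are invented sample data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Donation ("
    "donationid INTEGER PRIMARY KEY, amount REAL NOT NULL, donorid INTEGER)"
)
conn.executemany(
    "INSERT INTO Donation (amount, donorid) VALUES (?, ?)",
    [(50.0, 1), (175.0, 2), (25.0, 2), (100.0, 3), (40.0, 3)],  # made-up rows
)

# GROUP BY alone: one row per donor with the total donated.
totals = conn.execute(
    "SELECT donorid, SUM(amount) FROM Donation "
    "GROUP BY donorid ORDER BY donorid"
).fetchall()

# HAVING filters the grouped rows, keeping only totals above 150.
top_donors = conn.execute(
    "SELECT donorid, SUM(amount) FROM Donation "
    "GROUP BY donorid HAVING SUM(amount) > 150 ORDER BY donorid"
).fetchall()

print(totals)
print(top_donors)
```

A WHERE clause could not express this filter, because the per-donor totals do not exist until the rows have been grouped.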

Deleting rows

• Syntax

– DELETE FROM TableName [where_clause]

• Example

– mysql> DELETE FROM Donation;

– Without a WHERE clause, this deletes every row in the table!

Deleting is too Easy!

• Rows are hard to create, easy to destroy!

• Always use a WHERE clause!

• Example:

– mysql> DELETE FROM Donor WHERE name = 'Jake Johnson';

• Best to write a SELECT first
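One way to follow the "SELECT first" advice is to reuse the exact same WHERE clause for the preview and for the DELETE. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Donor (donorid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO Donor (name) VALUES (?)",
    [("Jake Johnson",), ("Jill Smith",), ("Bob Jones",)],  # made-up rows
)

# Write the WHERE clause once so the preview and the DELETE cannot drift apart.
where = "name = ?"
params = ("Jake Johnson",)

# Step 1: preview exactly which rows would be removed.
preview = conn.execute(
    f"SELECT donorid, name FROM Donor WHERE {where}", params
).fetchall()

# Step 2: only after checking the preview, run the DELETE with the same clause.
deleted = conn.execute(f"DELETE FROM Donor WHERE {where}", params).rowcount

remaining = conn.execute("SELECT COUNT(*) FROM Donor").fetchone()[0]
print(preview, deleted, remaining)
```

If the preview returns more rows than expected, the WHERE clause can be fixed before any data is lost.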

Updating data

• Syntax

– UPDATE TableName SET column_name = value [where_clause]

• Let’s learn our lesson from delete, and always use the WHERE clause

• Example:

– mysql> UPDATE Donor SET address = '123 Home Lane', phone_number = '555-1212' WHERE donorid = 1;

Practice: Using the Aggregates

• How many donations has each donor made?

• What is the maximum donation amount made by each donor?

• What are the donorIDs for the top ten donors?

• Of those who donated in 2003, what are the donorIDs of the ten worst donors in 2003?

• What is the total amount of donations we’ve received in 2004?

• What are the donorIDs for the ten best and ten worst donors?

“Advanced” SQL

• Joining tables

– Inner vs. Outer

• Built-in functions

• Table locking

• Transactions

Joining

• Our queries on the Donation table only return the DonorID.

• Typically, we want to know the Donor’s name, not their ID.

• We could do two selects and collate the data

– SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid;

– SELECT donorid, name FROM Donor;

– match these up in our code

• Or, we could simply do this with one query

Joining

• So far we’ve seen SELECTs on a single table

– How is this any better than using a Berkeley DB or text file on a local computer?

• Joins allow us to select information from more than one table and model the relationships in the conceptual model

– We don’t want to know the donorIDs, we want to know the donor names!

Simple Joining

• SELECT * FROM Table1, Table2;

– Listing multiple tables after “FROM” joins those tables together

– This effectively creates a new schema, with new tuples.

• Donor(donorid, name, address)

• Donation(donationid, amount, donorid)

• DonorDonation(donorid, name, address, donationid, amount, donorid)

Cartesian (Cross) Product

• New “virtual” schema:

– DonorDonation(donorid, name, address, donationid, amount, donorid)

• New tuples:

– (1, “Bob”, “123 St.”, 8, 50.00, 1)

– (2, “Sue”, “456 Pl.”, 8, 50.00, 1)

– (3, “Joe”, “789 Rd.”, 8, 50.00, 1)

• Every row in table A is joined with every row in table B (A x B).

• mysql> SELECT * FROM Donor, Donation;

• 500 Donors x 3000 Donations = 1.5 Million rows!!

Enforcing the relationship

– (1, “Bob”, “123 St.”, 8, 50.00, 1)

– (2, “Sue”, “456 Pl.”, 8, 50.00, 1)

– (3, “Joe”, “789 Rd.”, 8, 50.00, 1)

• Knowing we have a Donation of (8, 50.00, 1), we are only interested in the row where the Donor was (1, “Bob”, “123 St.”)

• Solution: Use a WHERE clause just like we did before

Enforcing the Relationship

• Our new, new “virtual” schema:

– DonorDonation(Donor.donorid, Donor.name, Donor.address, Donation.donationid, Donation.amount, Donation.donorid)

• Select just the tuples that have matching donorIDs:

– mysql> SELECT * FROM Donor, Donation WHERE Donor.donorid = Donation.donorid;

Enforcing the relation

• Tuples now only have data where the donorIDs match

– (1, “Bob”, “123 St.”, 8, 50.00, 1)

– (2, “Sue”, “456 Pl.”, 19, 175.00, 2)

– (2, “Sue”, “456 Pl.”, 33, 25.00, 2)

• Our Donor - Donation relationship is now successfully modeled

– One Donor (e.g., “Sue”) has one or many Donations (e.g., 175.00, 25.00)
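The difference between the raw cross product and the join restricted by the WHERE clause is easy to check on the three-donor, three-donation data from these slides. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Donor (donorid INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute(
    "CREATE TABLE Donation ("
    "donationid INTEGER PRIMARY KEY, amount REAL NOT NULL, donorid INTEGER)"
)
conn.executemany(
    "INSERT INTO Donor VALUES (?, ?, ?)",
    [(1, "Bob", "123 St."), (2, "Sue", "456 Pl."), (3, "Joe", "789 Rd.")],
)
conn.executemany(
    "INSERT INTO Donation VALUES (?, ?, ?)",
    [(8, 50.00, 1), (19, 175.00, 2), (33, 25.00, 2)],
)

# No WHERE clause: every Donor row is paired with every Donation row (3 x 3).
cross_rows = conn.execute("SELECT * FROM Donor, Donation").fetchall()

# The equality test keeps only the pairs whose donorIDs match.
joined = conn.execute(
    "SELECT Donor.name, Donation.donationid, Donation.amount "
    "FROM Donor, Donation "
    "WHERE Donor.donorid = Donation.donorid "
    "ORDER BY Donation.donationid"
).fetchall()

print(len(cross_rows))
print(joined)
```

Joe never appears in the joined result because he has no matching Donation row.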

Refining your join

• Just like a single table SELECT statement, you can refine your multiple table SELECT statement

– AND, OR, GROUP BY, HAVING, ORDER BY, LIMIT

• Example: What are the names of the top five donors that have donated more than $150, and how much have they donated?

SELECT Donor.name, SUM(Donation.amount)

FROM Donor, Donation

WHERE Donor.donorid = Donation.donorid

GROUP BY Donor.donorid

HAVING SUM(Donation.amount) > 150

ORDER BY SUM(Donation.amount) DESC

LIMIT 5;

More than two tables

• SELECT * FROM Donor, Donation, Processor WHERE Donor.donorid = Donation.donorid AND Donation.processorid = Processor.processorid;

• Order of the tables is not important

The Equality Test

• Typically you need an equality test for each extra table you add to the FROM clause.

• The equality checks are almost always between the primary key and the foreign keys of tables. (That’s why the foreign keys are there!)

Outer Joins

• The joins we’ve looked at only return a Donor who has made a Donation

• What if we want to know which Donors have not made any Donations?

• The solution is to use an Outer Join (MySQL supports this through the Left Join command)

Outer Joins

• An Outer Join will take all the rows from the Left table (or the Right, depending on the SQL/RDBMS), without requiring a match on the other table.

Outer Joins

• SELECT columns FROM Table1 LEFT JOIN Table2 ON equality_test [WHERE|GROUP BY|etc.]

• Example:

– mysql> SELECT Donor.name, Donation.donationid FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid;

Outer Joins

• An outer join will Null fill any columns from Table2 where the ON statement doesn’t match.

– (1, “Bob”, “123 St.”, 8, 50.00, 1)

– (2, “Sue”, “456 Pl.”, 19, 175.00, 2)

– (3, “Joe”, “789 Rd.”, NULL, NULL, NULL)

• If a tuple from Table1 can be joined with any tuple from Table2, it will not be Null filled.

Outer Joins

• What if we want to know which Donors have not made any Donations?

• mysql> SELECT Donor.name, Donation.amount FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid WHERE Donation.amount IS NULL;
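The LEFT JOIN query above can be checked against the sample tuples from these slides. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; SQLite accepts the same LEFT JOIN ... ON syntax.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Donor (donorid INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute(
    "CREATE TABLE Donation ("
    "donationid INTEGER PRIMARY KEY, amount REAL NOT NULL, donorid INTEGER)"
)
conn.executemany(
    "INSERT INTO Donor VALUES (?, ?, ?)",
    [(1, "Bob", "123 St."), (2, "Sue", "456 Pl."), (3, "Joe", "789 Rd.")],
)
conn.executemany(
    "INSERT INTO Donation VALUES (?, ?, ?)",
    [(8, 50.00, 1), (19, 175.00, 2)],
)

# Every Donor row is kept; Joe has no Donation, so his Donation columns are NULL.
all_rows = conn.execute(
    "SELECT Donor.name, Donation.amount "
    "FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid "
    "ORDER BY Donor.donorid"
).fetchall()

# Keeping only the NULL-filled rows lists the donors with no donations.
no_donations = [row[0] for row in conn.execute(
    "SELECT Donor.name "
    "FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid "
    "WHERE Donation.amount IS NULL"
)]

print(all_rows)
print(no_donations)
```

Python shows SQL NULL as None, which is why Joe's amount prints as None in the first result.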

A Lot of Typing

• SELECT Donor.name, SUM(Donation.amount), Processor.name FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid LEFT JOIN Processor ON Donation.processorid = Processor.processorid GROUP BY Donor.donorid

Less Typing with Aliases

• You can give your Tables nicknames:

• SELECT Dr.name, SUM(Dn.amount), P.name FROM Donor AS Dr LEFT JOIN Donation AS Dn ON Dr.donorid = Dn.donorid LEFT JOIN Processor AS P ON Dn.processorid = P.processorid GROUP BY Dr.donorid

Renaming Output

• You can give your selected columns nicknames too

• SELECT Donor.name, SUM(Donation.amount) AS total, Processor.name FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid LEFT JOIN Processor ON Donation.processorid = Processor.processorid GROUP BY Donor.donorid ORDER BY total

• You can’t always use aggregate functions in your ORDER BY, and you can’t always use them in your HAVING clause

Joining to Other Databases

• Sometimes you may want several applications to share one database

• Example: You have a “Users” database that is shared between two applications, each of which has its own database.

• SELECT C.name, CN.nickname FROM Users.Customer AS C, CustomerNickname AS CN WHERE C.customerid = CN.customerid;

SQL Functions

• MySQL provides a lot of functions to munge the results of a query

• Example, returning a date

– SELECT date FROM Donation WHERE donationid=1;

– 2004-10-13 15:52:08

– Not very “pretty” for a user to see

SQL Functions

• Use the DATE_FORMAT() function instead!

– SELECT DATE_FORMAT(date, '%m/%d/%y') FROM Donation WHERE donationid=1;

– 10/13/04

– Prettier!

– SELECT DATE_FORMAT(date, '%M %D, %Y') FROM Donation WHERE donationid=1;

– October 13th, 2004

Some Common Functions

Function Result

ABS returns the absolute value of the column

CONCAT returns the string formed by joining together all of the function arguments

DATE_ADD returns the date formed by adding a given amount of time to the date

DATE_SUB returns the date formed by subtracting a given amount of time from the date

DATE_FORMAT returns a date formatted as you specify in the format string. This is one of the most useful functions for printing dates in the format you desire

FORMAT returns a neatly formatted number with commas and the specified number of decimal places

ISNULL returns 1 if the value is NULL, zero otherwise

LENGTH returns the length of a string in bytes (CHAR_LENGTH returns the number of characters)

NOW returns the current date and time.

Example using functions

• Return a nicely formatted list of donation dates made in the last 10 days

• mysql> SELECT DATE_FORMAT(date, '%m/%d/%Y') FROM Donation WHERE date > DATE_SUB(NOW(), INTERVAL 10 DAY);

Practice: Joining Tables

• Who Processed the most Donations?

• Which Donors have made no Donations?

• Which Division received the most money?

• Which Donor gave the most to Healthcare?

mysql> SOURCE non-profit.sql

Transactions and Table Locking

• Usually, a database is being used concurrently by many different users

– Example: Multiple processors will be entering donation information

• Very important to maintain data integrity

• Transactions or Table Locking can help provide that data integrity

Table Locking

• Example: Say one of our data integrity checks is to make sure no two Donors have the same name and address

• In our application, we would probably do something like:

– Check for any "Donor" with the same name and address.

– If there are no matches, insert our new "Donor."

Table Locking

• A two step process is okay for one user, very bad for more than one user

• What if this happens:

User 1                                           User 2
Check for any donor with the name "John Doe."    Check for any donor with the name "John Doe."
No matches                                       No matches
insert new record                                insert new record
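The timeline above can be replayed deterministically to show why check-then-insert is unsafe without a lock. This is only a simulation: it assumes Python's built-in sqlite3 module as a stand-in for MySQL, uses one connection to play both users, and the helper names exists and insert are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Donor (donorid INTEGER PRIMARY KEY, name TEXT, address TEXT)")


def exists(name, address):
    """Step 1 of the two step process: is there already a matching Donor?"""
    count = conn.execute(
        "SELECT COUNT(*) FROM Donor WHERE name = ? AND address = ?",
        (name, address),
    ).fetchone()[0]
    return count > 0


def insert(name, address):
    """Step 2 of the two step process: add the new Donor."""
    conn.execute("INSERT INTO Donor (name, address) VALUES (?, ?)", (name, address))


john = ("John Doe", "123 St.")  # made-up donor

# Both users run their check before either one has inserted.
user1_sees_match = exists(*john)
user2_sees_match = exists(*john)

# Each user acts on a check result that is now stale.
if not user1_sees_match:
    insert(*john)
if not user2_sees_match:
    insert(*john)

john_rows = conn.execute(
    "SELECT COUNT(*) FROM Donor WHERE name = ? AND address = ?", john
).fetchone()[0]
print(john_rows)
```

Both checks report no match, so both inserts go ahead and the integrity rule is broken.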

Table Locking

• One solution is to use table locking

• User 1 would lock the table so that only s/he can use the table

• They can then check and insert to the table while User 2 waits for the table to become available

Table Locking

User 1                                           User 2
Lock table
Check for any donor with the name "John Doe."    Wait for table to unlock
No matches                                       Wait for table to unlock
insert new record                                Wait for table to unlock
Unlock table                                     Wait for table to unlock
                                                 Lock table
                                                 Check for any donor with the name "John Doe."
                                                 Match exists – do NOT insert
                                                 Unlock table

Table Locking

• Why not lock everything?

• Slower

– Processing overhead to lock tables

– Users forced to wait for tables to be unlocked

• Potentially Dangerous

– What if you forget to unlock your table?

– What if your application crashes?

More on Table Locking

• Syntax:

– LOCK TABLE TableName (WRITE|READ), …

– UNLOCK TABLES

• Use WRITE for when you need to insert into the table, and READ for when you just need to query the table

• You must lock any table you plan to use between your LOCK TABLE and UNLOCK TABLES commands.

– mysql> LOCK TABLE Donor WRITE, Donation READ;

Transactions

• Transactions work by isolating a set of commands such that no other command can alter the data currently being worked on.

– Treats a set of commands as one command

– Works like table locking in this sense

– Also slow like table locking in this sense

Transactions

• Transactions aren’t automatically available on all of your database tables.

– You have to be using the InnoDB or BDB table types

– Default table type is MyISAM

• Let’s create one!

– mysql> CREATE TABLE Account (ACCOUNTID INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, balance DOUBLE) ENGINE = InnoDB;

Transactions

• Start a transaction with BEGIN

• Transaction isn’t completed until a COMMIT

• Any commands in between are treated as one command

• If something goes wrong, you can ROLLBACK

– reverts the database to the state it was in when you began
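The BEGIN / COMMIT / ROLLBACK cycle can be tried on a small Account table. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for a MySQL InnoDB table, the balances are made up, and the balances helper is hypothetical.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN, COMMIT and ROLLBACK ourselves.
# SQLite stands in for a MySQL InnoDB table here (assumption).
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Account (accountid INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO Account (balance) VALUES (?)", [(100.0,), (0.0,)])


def balances():
    """Return the balances in accountid order."""
    return [row[0] for row in conn.execute(
        "SELECT balance FROM Account ORDER BY accountid")]


# A transfer that is abandoned halfway: the money has left account 1 only.
conn.execute("BEGIN")
conn.execute("UPDATE Account SET balance = balance - 30 WHERE accountid = 1")
conn.execute("ROLLBACK")  # something went wrong, so undo everything since BEGIN
after_rollback = balances()

# The same transfer, this time completed and made permanent.
conn.execute("BEGIN")
conn.execute("UPDATE Account SET balance = balance - 30 WHERE accountid = 1")
conn.execute("UPDATE Account SET balance = balance + 30 WHERE accountid = 2")
conn.execute("COMMIT")
after_commit = balances()

print(after_rollback, after_commit)
```

After the ROLLBACK the balances are unchanged; only the committed transfer moves the 30 from one account to the other.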

Transactions

• Transactions are fairly complex

– Not covered in-depth for “Intro to SQL” course

• Transactions are fairly powerful as well

– Some external resources

– MySQL: MySQL Manual, "1.8.5.3 Transactions and Atomic Operations" http://dev.mysql.com/doc/mysql/en/ANSI_diff_Transactions.html

– Sams Publishing, "MySql Transactions Overview" http://www.samspublishing.com/articles/article.asp?p=29312
