Review
Jun 5th, 2002
HW#5.2
Table   Tuples   Tuples/page   Pages
R       10000    10            1000
S       2000     10            200
Join: R ⋈ S on R.a = S.b (52 buffers)
Page-Oriented Nested Loop Join
For each page in the outer relation R, we scan the entire inner relation S.
– Cost: M + M * N (M = pages in R, N = pages in S)
__________________
[Figure: R and S on disk feed an input buffer for R and an input buffer for S; matches go to an output buffer and are written to the join result.]
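As a worked check of the cost formula above, here is a minimal Python sketch that plugs in the page counts from the HW#5.2 table. The function name is hypothetical, and trying S as the outer relation is an extra comparison, not something the slide asks for.

```python
def page_nested_loop_cost(outer_pages: int, inner_pages: int) -> int:
    """I/O cost of a page-oriented nested loop join: M + M * N."""
    # Read each outer page once, and scan the whole inner
    # relation once per outer page.
    return outer_pages + outer_pages * inner_pages


M, N = 1000, 200  # pages of R and S from the HW#5.2 table

cost_r_outer = page_nested_loop_cost(M, N)  # R as the outer relation
cost_s_outer = page_nested_loop_cost(N, M)  # S as the outer relation

print(cost_r_outer)  # 201000
print(cost_s_outer)  # 200200
```

Under this formula the smaller relation is the slightly cheaper choice of outer relation.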
Nested Join vs. Index Join
Simple Nested Join
– Cost: M + ( PR*M ) * N (PR = tuples per page of R)
_______________________
Page-oriented Simple Nested Join
– Cost: M + M * N
_______________________
Index Join (Unclustered)
– Cost: M + ( PR*M ) * (1.2 + 1)
_______________________
When is Nested Join better than Index Join?
_______________________
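The three formulas above can be compared numerically. The sketch below assumes the HW#5.2 numbers (M = 1000, N = 200, PR = 10) and takes the slide's per-probe cost of 1.2 + 1 I/Os at face value; it is commonly read as about 1.2 I/Os to reach the index entry plus 1 to fetch the tuple, but that reading is an assumption here. The function names are hypothetical, and `Fraction` is used only to keep 1.2 exact.

```python
from fractions import Fraction


def simple_nested_loop_cost(m, n, pr):
    """Tuple-at-a-time nested loop: M + (PR * M) * N."""
    return m + (pr * m) * n


def page_nested_loop_cost(m, n):
    """Page-oriented nested loop: M + M * N."""
    return m + m * n


def index_join_cost(m, pr, probe_cost):
    """Index nested loop: M + (PR * M) * probe_cost."""
    return m + (pr * m) * probe_cost


M, N, PR = 1000, 200, 10
PROBE = Fraction("1.2") + 1  # per-probe cost taken from the slide

simple_cost = simple_nested_loop_cost(M, N, PR)
page_cost = page_nested_loop_cost(M, N)
index_cost = index_join_cost(M, PR, PROBE)

# Page-oriented nested loop wins only if M*N < PR*M*PROBE, i.e. N < PR*PROBE.
break_even_inner_pages = PR * PROBE

print(simple_cost, page_cost, index_cost)  # 2001000 201000 23000
print(break_even_inner_pages)              # 22
```

With these formulas the page-oriented nested loop beats the index join only when the inner relation has fewer than 22 pages, which is not the case for S here.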
Blocked Nested Loop Join
For each matching tuple r in R-block, s in S-page, add <r, s> to result. Then read next R-block, scan S, etc.
– Cost: M + ( M / (B-2) ) * N
____________________
[Figure: R and S on disk; an input buffer for R holding a block of k < B-1 pages, an input buffer for S, and an output buffer that is written to the join result.]
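A small sketch of the blocked nested loop cost, assuming the HW#5.2 setting (M = 1000, N = 200, B = 52) and rounding the number of R-blocks up to a whole number. The function name is hypothetical.

```python
def block_nested_loop_cost(outer_pages: int, inner_pages: int, buffers: int) -> int:
    """I/O cost M + ceil(M / (B-2)) * N of a blocked nested loop join."""
    block_size = buffers - 2  # one buffer is kept for S input, one for output
    num_blocks = (outer_pages + block_size - 1) // block_size  # integer ceiling
    return outer_pages + num_blocks * inner_pages


M, N, B = 1000, 200, 52

cost_r_outer = block_nested_loop_cost(M, N, B)  # 1000 + 20 * 200
cost_s_outer = block_nested_loop_cost(N, M, B)  # 200 + 4 * 1000

print(cost_r_outer)  # 5000
print(cost_s_outer)  # 4200
```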
Hash-Join
Partition both relations using hash fn h1: R tuples in partition i will only match S tuples in partition i.
__________
Read in a partition of R, hash it using h2 (<> h1). Scan matching partition of S, search for matches.
________
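[Figure: probing phase. Partitions of R and S on disk; within B main memory buffers, a hash table for partition Ri (k < B-1 pages) built with hash fn h2, an input buffer for Si, and an output buffer written to the join result.]
[Figure: partitioning phase. The original relation is read from disk through one input buffer, hashed with h1 into B-1 output buffers, and written back to disk as B-1 partitions.]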
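The two phases can be illustrated with an in-memory sketch. It is not a disk-based implementation: lists of dicts stand in for relations, `hash(v) % num_partitions` is an assumed choice for h1, and Python's built-in dict hashing stands in for h2. All names and sample rows are hypothetical.

```python
from collections import defaultdict


def hash_join(r_rows, s_rows, r_key, s_key, num_partitions=4):
    """Partitioned hash join over lists of dicts (in-memory sketch)."""

    def h1(value):
        # Partitioning hash function (assumed choice).
        return hash(value) % num_partitions

    # Phase 1: partition both relations with h1.
    r_parts, s_parts = defaultdict(list), defaultdict(list)
    for r in r_rows:
        r_parts[h1(r[r_key])].append(r)
    for s in s_rows:
        s_parts[h1(s[s_key])].append(s)

    # Phase 2: build a hash table on R_i, then probe it with S_i.
    result = []
    for i, r_part in r_parts.items():
        table = defaultdict(list)  # dict hashing plays the role of h2
        for r in r_part:
            table[r[r_key]].append(r)
        for s in s_parts.get(i, []):
            for r in table.get(s[s_key], []):
                result.append((r, s))
    return result


R = [{"a": 1, "x": "r1"}, {"a": 2, "x": "r2"}, {"a": 2, "x": "r3"}]
S = [{"b": 2, "y": "s1"}, {"b": 3, "y": "s2"}]

pairs = hash_join(R, S, "a", "b")
matched = sorted((r["x"], s["y"]) for r, s in pairs)
print(matched)  # [('r2', 's1'), ('r3', 's1')]
```

Only tuples that land in the same h1 partition are ever compared, which is the point of the partitioning phase.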
Blocked Join vs. Hash-Join
Blocked Join
– Cost: M + ( M / (B-2) ) * N
___________________
Hash Join
– Cost: 3 * ( M + N )
___________________
When is Blocked Join better than Hash Join?
___________________
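One way to explore the question above is to evaluate both formulas for a range of buffer counts. The sketch assumes the HW#5.2 page counts with R as the outer relation, rounds the number of R-blocks up, and uses hypothetical function names.

```python
def block_nested_loop_cost(m: int, n: int, b: int) -> int:
    """Blocked nested loop cost: M + ceil(M / (B-2)) * N."""
    block_size = b - 2
    return m + ((m + block_size - 1) // block_size) * n


def hash_join_cost(m: int, n: int) -> int:
    """Hash join cost from the slide: 3 * (M + N)."""
    return 3 * (m + n)


M, N = 1000, 200

hash_cost = hash_join_cost(M, N)
blocked_cost_52 = block_nested_loop_cost(M, N, 52)

# Smallest number of buffers at which the blocked join is strictly cheaper.
min_buffers = next(
    b for b in range(3, M + 3) if block_nested_loop_cost(M, N, b) < hash_cost
)

print(hash_cost)        # 3600
print(blocked_cost_52)  # 5000
print(min_buffers)      # 86
```

The blocked join approaches M + N as the whole outer relation fits in B-2 buffers, so under these formulas it overtakes the hash join once memory is large relative to R.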
Sort-Join
Sort both relations using multi-way sort:
________
Read in each intermediate result of R and S, search for matches.
________
[Figure: sorting phase. The original relation is read from disk through B main memory buffers and written back as M/(B-1) sorted partitions of (B-1) pages each.]
[Figure: merge-join phase. The M/(B-1) partitions of R and the N/(B-1) partitions of S are read through B main memory buffers and merged into the join result on disk.]
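The matching step of a sort-based join can be sketched in memory as follows. Python's built-in `sorted` stands in for the external multi-way sort, and lists of dicts stand in for relations; the names and sample rows are hypothetical.

```python
def merge_join(r_rows, s_rows, r_key, s_key):
    """Sort both inputs on the join key, then merge them (in-memory sketch)."""
    r_sorted = sorted(r_rows, key=lambda r: r[r_key])
    s_sorted = sorted(s_rows, key=lambda s: s[s_key])

    result, i, j = [], 0, 0
    while i < len(r_sorted) and j < len(s_sorted):
        r_val, s_val = r_sorted[i][r_key], s_sorted[j][s_key]
        if r_val < s_val:
            i += 1
        elif r_val > s_val:
            j += 1
        else:
            # Find the end of the group of S rows with this key value.
            j_end = j
            while j_end < len(s_sorted) and s_sorted[j_end][s_key] == r_val:
                j_end += 1
            # Pair every R row with this key value with the whole S group.
            while i < len(r_sorted) and r_sorted[i][r_key] == r_val:
                for s in s_sorted[j:j_end]:
                    result.append((r_sorted[i], s))
                i += 1
            j = j_end
    return result


R = [{"a": 2, "x": "r1"}, {"a": 1, "x": "r2"}, {"a": 2, "x": "r3"}]
S = [{"b": 3, "y": "s1"}, {"b": 2, "y": "s2"}, {"b": 2, "y": "s3"}]

merged = [(r["x"], s["y"]) for r, s in merge_join(R, S, "a", "b")]
print(merged)  # [('r1', 's2'), ('r1', 's3'), ('r3', 's2'), ('r3', 's3')]
```

The output comes out ordered by the join key, which matches the slide's later remark that the result of a sort-merge join is sorted.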
Sort-Merge Join vs. Hash-Join
Sort-Merge needs more buffer space
– Sort-Merge Join
• Cost: 3 * ( M + N )
• Buffer Size: ________________
– Hash Join
• Cost: 3 * ( M + N )
• Buffer Size: ________________
Sort-Merge join is less sensitive to data skew
Result of Sort-Merge join is sorted
HW#5.4 SQL Transformation
SELECT DISTINCT F.FirstName, F.LastName
FROM GradStudents AS G, Faculty AS F, Advise AS A
WHERE G.LoginID = A.Student AND F.LoginID = A.Advisor
AND G.Office = '224';
GradStudents: 157 tuples (20 distinct values for Office, uniform distribution)
Faculty: 53 tuples
Advise: 87 tuples
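The statistics above allow a rough size estimate. The sketch assumes the uniform-distribution figure given for Office and compares the unrestricted cross product with the case where the selection on G.Office is applied first; treating this as the intended transformation is an assumption, and the variable names are hypothetical.

```python
grad_students = 157
faculty = 53
advise = 87
distinct_offices = 20

# Uniform distribution: each Office value matches 1/20 of GradStudents.
expected_in_office = grad_students / distinct_offices

# Intermediate sizes if the three tables are combined before any filtering,
# versus applying the selection G.Office = '224' first.
cross_product_size = grad_students * faculty * advise
after_selection_size = expected_in_office * faculty * advise

print(round(expected_in_office, 2))    # 7.85
print(cross_product_size)              # 723927
print(round(after_selection_size, 2))  # 36196.35
```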
HW#5.4 SQL Transformation
SELECT EntryYear, COUNT(*)
FROM GradStudents
WHERE FirstName = 'David'
GROUP BY EntryYear
HAVING EntryYear >= 1995
ORDER BY EntryYear DESC

SELECT EntryYear, COUNT(*)
FROM GradStudents
WHERE FirstName = 'David'
GROUP BY EntryYear
HAVING COUNT(*) >= 10
ORDER BY EntryYear DESC
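One possible transformation for the first query is to move the HAVING condition into WHERE, since it only mentions the grouping column; the second query's condition uses COUNT(*) and cannot be moved this way. The sketch below checks the rewrite with Python's `sqlite3` module on made-up sample rows (the rows and the LoginID values are hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GradStudents (LoginID TEXT, FirstName TEXT, EntryYear INTEGER)"
)
sample_rows = [  # hypothetical data for illustration only
    ("d1", "David", 1993),
    ("d2", "David", 1996),
    ("d3", "David", 1996),
    ("d4", "David", 1999),
    ("m1", "Maria", 1996),
]
conn.executemany("INSERT INTO GradStudents VALUES (?, ?, ?)", sample_rows)

having_form = """
    SELECT EntryYear, COUNT(*) FROM GradStudents
    WHERE FirstName = 'David'
    GROUP BY EntryYear
    HAVING EntryYear >= 1995
    ORDER BY EntryYear DESC
"""
where_form = """
    SELECT EntryYear, COUNT(*) FROM GradStudents
    WHERE FirstName = 'David' AND EntryYear >= 1995
    GROUP BY EntryYear
    ORDER BY EntryYear DESC
"""

result_having = conn.execute(having_form).fetchall()
result_where = conn.execute(where_form).fetchall()
print(result_having)  # [(1999, 1), (1996, 2)]
print(result_where)   # [(1999, 1), (1996, 2)]
```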
HW#5.4 SQL Transformation
SELECT FirstName
FROM Faculty
WHERE FirstName IN (
    SELECT FirstName
    FROM GradStudents)

SELECT FirstName
FROM Faculty
WHERE FirstName NOT IN (
    SELECT FirstName
    FROM GradStudents)
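A common way to unnest these two queries is to rewrite IN as a correlated EXISTS and NOT IN as NOT EXISTS. The sketch below checks that rewrite with `sqlite3` on hypothetical sample rows; it assumes FirstName is never NULL, because NOT IN and NOT EXISTS behave differently when the subquery returns NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Faculty (LoginID TEXT, FirstName TEXT);
    CREATE TABLE GradStudents (LoginID TEXT, FirstName TEXT);
    INSERT INTO Faculty VALUES ('f1', 'Alon'), ('f2', 'David'), ('f3', 'David');
    INSERT INTO GradStudents VALUES ('g1', 'David'), ('g2', 'Maria');
    """
)


def names(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


in_form = names(
    "SELECT FirstName FROM Faculty "
    "WHERE FirstName IN (SELECT FirstName FROM GradStudents)"
)
exists_form = names(
    "SELECT F.FirstName FROM Faculty AS F WHERE EXISTS "
    "(SELECT 1 FROM GradStudents AS G WHERE G.FirstName = F.FirstName)"
)
not_in_form = names(
    "SELECT FirstName FROM Faculty "
    "WHERE FirstName NOT IN (SELECT FirstName FROM GradStudents)"
)
not_exists_form = names(
    "SELECT F.FirstName FROM Faculty AS F WHERE NOT EXISTS "
    "(SELECT 1 FROM GradStudents AS G WHERE G.FirstName = F.FirstName)"
)

print(in_form, exists_form)          # ['David', 'David'] ['David', 'David']
print(not_in_form, not_exists_form)  # ['Alon'] ['Alon']
```

Note that a plain join would return one row per matching pair, so it only matches the IN form when duplicates are handled separately.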
HW#5.4 SQL Transformation
SELECT LoginID
FROM UndergradStudents
WHERE EntryYear >= ANY (
    SELECT EntryYear
    FROM GradStudents)

SELECT LoginID
FROM UndergradStudents
WHERE EntryYear >= ALL (
    SELECT EntryYear
    FROM GradStudents)
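The quantified comparisons can be read as aggregate comparisons: `>= ANY` behaves like `>= MIN(...)` and `>= ALL` like `>= MAX(...)`, provided the subquery is non-empty and contains no NULLs. The plain-Python sketch below illustrates this on hypothetical data, with `any` and `all` standing in for the SQL quantifiers.

```python
# Hypothetical sample data: LoginID -> EntryYear for undergraduates,
# and the EntryYear values returned by the GradStudents subquery.
undergrad_entry_year = {"u1": 1994, "u2": 1997, "u3": 2001}
grad_entry_years = [1995, 1998, 2000]

any_result = sorted(
    login for login, year in undergrad_entry_year.items()
    if any(year >= g for g in grad_entry_years)
)
all_result = sorted(
    login for login, year in undergrad_entry_year.items()
    if all(year >= g for g in grad_entry_years)
)

# Rewrites using a single aggregate over the subquery result.
any_via_min = sorted(
    login for login, year in undergrad_entry_year.items()
    if year >= min(grad_entry_years)
)
all_via_max = sorted(
    login for login, year in undergrad_entry_year.items()
    if year >= max(grad_entry_years)
)

print(any_result, any_via_min)  # ['u2', 'u3'] ['u2', 'u3']
print(all_result, all_via_max)  # ['u3'] ['u3']
```

With an empty subquery, ANY is false for every row and ALL is true for every row, so the aggregate rewrite needs a separate check in that case.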
HW#4.1 XML & XQuery
<!ELEMENT products (product*)>
<!ELEMENT product (name, price, description, store*)>
<!ELEMENT store (name, phone, markup)>
Which products are sold at least in one store?
What are the product-store pairs whose markup is no lower than 15%?
Which stores sell some products with a price higher than 50?
Which products (except “gizmo”) are sold in some store that also sells the product “gizmo”?
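To make the intent of these questions concrete, the sketch below evaluates three of them in Python with `xml.etree.ElementTree` rather than XQuery. The sample document is hypothetical and merely follows the DTD above; the helper name `store_names` is also made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document that follows the DTD above.
xml_text = """
<products>
  <product><name>gizmo</name><price>60</price><description>d1</description>
    <store><name>Wiz</name><phone>555-0100</phone><markup>20</markup></store>
  </product>
  <product><name>widget</name><price>30</price><description>d2</description>
    <store><name>Wiz</name><phone>555-0100</phone><markup>10</markup></store>
    <store><name>Acme</name><phone>555-0101</phone><markup>25</markup></store>
  </product>
  <product><name>gadget</name><price>80</price><description>d3</description>
  </product>
</products>
"""

root = ET.fromstring(xml_text.strip())
products = root.findall("product")


def store_names(product):
    """Names of the stores listed under one product element."""
    return {s.findtext("name") for s in product.findall("store")}


# Products sold in at least one store.
sold_somewhere = [p.findtext("name") for p in products if store_names(p)]

# Stores that sell some product with a price higher than 50.
stores_over_50 = sorted(
    set().union(
        *(store_names(p) for p in products if float(p.findtext("price")) > 50)
    )
)

# Products other than "gizmo" sold in some store that also sells "gizmo".
gizmo_stores = set().union(
    *(store_names(p) for p in products if p.findtext("name") == "gizmo")
)
sold_with_gizmo = [
    p.findtext("name")
    for p in products
    if p.findtext("name") != "gizmo" and store_names(p) & gizmo_stores
]

print(sold_somewhere)   # ['gizmo', 'widget']
print(stores_over_50)   # ['Wiz']
print(sold_with_gizmo)  # ['widget']
```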
HW#4.1 XML & XQuery
<!ELEMENT products (product*)>
<!ELEMENT product (name, price, description, store*)>
<!ELEMENT store (name, phone, markup)>
Returns the names and prices of all products that are sold in all stores with a markup of 25%.
Returns the names and prices of all products that are sold at least at one store with a markup of 25%
HW#4.1 XML & XQuery
<!ELEMENT products (product*)>
<!ELEMENT product (name, price, description, store*)>
<!ELEMENT store (name, phone, markup)>
<products>{
  for $p in doc("database.xml")//products/row
  return
    <product pid="{$p/pid/text()}">
      <name>{$p/name/text()}</name>
      <price>{$p/price/text()}</price>
      <description>{$p/description/text()}</description>
      {
        for $x in doc("database.xml")//sells/row[pid = $p/pid]
        for $s in doc("database.xml")//stores/row[sid = $x/sid]
        return
          <store sid="{$s/sid/text()}">
            <name>{$s/name/text()}</name>
            <phone>{$s/phone/text()}</phone>
            <markup>{$x/markup/text()}</markup>
          </store>
      }
    </product>
}</products>
Midterm 1
Company (DeptID, Name, Budget, CEOEmployID, CEOContratID, Since)
Work-in (EmployID, Lot, DeptID, Name, Budget, CEOEmployID, Since)
HW#5.1 B+ Tree
B+ Tree – Insert 70
B+ Tree – Insert 155
B+ Tree – Insert 165
B+ Tree – Delete 10
B+ Tree – Delete 8