Page 1: Review Jun 5th, 2002. HW#5.2 TableTupleTuple/pagePage R10000101000 S200010200 R R.a = S.b S (52buffers)

Review

Jun 5th, 2002


HW#5.2

Table   Tuples   Tuples/page   Pages
R       10000    10            1000
S       2000     10            200

R ⋈ S on R.a = S.b (52 buffers)


Page-Oriented Nested Loop Join

For each page in the outer relation R, we scan the entire inner relation S.
– Cost: M + M * N (M = pages in R, N = pages in S)

__________________

[Figure: R and S on disk; one input buffer for R, one input buffer for S, and an output buffer that produces the join result.]
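As a quick check of the formula, the sketch below plugs in the HW#5.2 page counts (1000 pages for R, 200 for S). The function name is illustrative only, and M and N are assumed to be the page counts of the outer and inner relation.

```python
def page_nested_loop_cost(m_pages: int, n_pages: int) -> int:
    """I/O cost of a page-oriented nested loop join: M + M * N."""
    return m_pages + m_pages * n_pages


# HW#5.2 sizes: R has 1000 pages, S has 200 pages.
cost_r_outer = page_nested_loop_cost(1000, 200)
cost_s_outer = page_nested_loop_cost(200, 1000)

print(cost_r_outer)  # 201000
print(cost_s_outer)  # 200200
```

Putting the smaller relation on the outside saves only the difference between reading R once and reading S once, because the M * N term is the same either way.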


Nested Join vs. Index Join

Simple Nested Join
– Cost: M + ( PR * M ) * N (PR = tuples per page of R)
_______________________

Page-oriented Simple Nested Join
– Cost: M + M * N
_______________________

Index Join (Unclustered)
– Cost: M + ( PR * M ) * (1.2 + 1)
_______________________

When is Nested Join better than Index Join?
_______________________
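The three formulas can be compared numerically. The sketch below uses the HW#5.2 sizes and assumes, as the slide's formula suggests, 1.2 I/Os per index probe plus 1 I/O to fetch the matching tuple. All function names are illustrative.

```python
def simple_nested_cost(m_pages, tuples_per_page, n_pages):
    """Tuple-at-a-time nested loop: M + (PR * M) * N."""
    return m_pages + tuples_per_page * m_pages * n_pages


def page_nested_cost(m_pages, n_pages):
    """Page-oriented nested loop: M + M * N."""
    return m_pages + m_pages * n_pages


def index_join_cost(m_pages, tuples_per_page, probe_ios=1.2, fetch_ios=1.0):
    """Index nested loop: M + (PR * M) * (probe + fetch)."""
    return m_pages + tuples_per_page * m_pages * (probe_ios + fetch_ios)


def page_nested_beats_index(n_pages, tuples_per_page, ios_per_probe=2.2):
    """M * N < PR * M * 2.2 simplifies to N < 2.2 * PR."""
    return n_pages < tuples_per_page * ios_per_probe


M, N, PR = 1000, 200, 10
print(simple_nested_cost(M, PR, N))    # 2001000
print(page_nested_cost(M, N))          # 201000
print(round(index_join_cost(M, PR)))   # 23000
print(page_nested_beats_index(N, PR))  # False
```

Under these assumptions the page-oriented nested join only wins when the inner relation is very small, fewer than about 2.2 * PR pages (22 pages here).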


Blocked Nested Loop Join

For each matching tuple r in R-block, s in S-page, add <r, s> to result. Then read next R-block, scan S, etc.
– Cost: M + ( M / (B-2) ) * N

____________________

[Figure: R and S on disk; an input buffer for R (k < B-1 pages), one input buffer for S, and an output buffer that produces the join result.]
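A small sketch of the blocked cost for HW#5.2 with B = 52 buffers. It assumes the number of R-blocks is rounded up when M is not a multiple of B-2. The function name is illustrative.

```python
import math


def block_nested_loop_cost(m_pages: int, n_pages: int, buffers: int) -> int:
    """M + ceil(M / (B-2)) * N: one scan of the inner relation per outer block."""
    outer_blocks = math.ceil(m_pages / (buffers - 2))
    return m_pages + outer_blocks * n_pages


print(block_nested_loop_cost(1000, 200, 52))  # R outer: 5000
print(block_nested_loop_cost(200, 1000, 52))  # S outer: 4200
```

With 50 pages per block, R is read in 20 blocks and S in 4, so using the smaller relation as the outer one is cheaper here.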


Hash-Join

Partition both relations using hash fn h1: R tuples in partition i will only match S tuples in partition i.
__________

Read in a partition of R, hash it using h2 (<> h1). Scan matching partition of S, search for matches.
________

[Figure, partitioning phase: the original relation is read from disk through one input buffer; hash function h1 routes each tuple to one of B-1 output buffers, which are written to disk as partitions 1, 2, ..., B-1.]

[Figure, probing phase: partitions of R and S on disk; within B main memory buffers, a hash table for partition Ri (k < B-1 pages) built with hash fn h2, an input buffer for Si, and an output buffer that produces the join result.]
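The two phases can be sketched in memory as follows. This is a simplified illustration, not a disk-based implementation: Python's built-in hash modulo the partition count stands in for h1, a dict stands in for the h2 hash table, and the function names and sample rows are made up.

```python
from collections import defaultdict


def partition(rows, key, num_partitions):
    """Phase 1: route each row to a partition using h1 = hash(value) mod k."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key]) % num_partitions].append(row)
    return parts


def hash_join(r_rows, s_rows, r_key, s_key, num_partitions=4):
    """Phase 2: for each partition i, build a table on Ri and probe it with Si."""
    r_parts = partition(r_rows, r_key, num_partitions)
    s_parts = partition(s_rows, s_key, num_partitions)
    result = []
    for i, r_part in r_parts.items():
        table = defaultdict(list)  # plays the role of the h2 hash table
        for r in r_part:
            table[r[r_key]].append(r)
        for s in s_parts.get(i, []):
            for r in table.get(s[s_key], []):
                result.append((r, s))
    return result


R = [{"a": 1, "r": "r1"}, {"a": 2, "r": "r2"}, {"a": 2, "r": "r3"}]
S = [{"b": 2, "s": "s1"}, {"b": 3, "s": "s2"}]
pairs = sorted((r["r"], s["s"]) for r, s in hash_join(R, S, "a", "b"))
print(pairs)  # [('r2', 's1'), ('r3', 's1')]
```

Because both relations are partitioned with the same h1, a tuple of S only ever has to be compared against the R tuples in the partition with the same number.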


Blocked Join vs. Hash-Join

Blocked Join
– Cost: M + ( M / (B-2) ) * N
___________________

Hash Join
– Cost: 3 * ( M + N )
___________________

When is Blocked Join better than Hash Join?
___________________
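One way to explore the question is to evaluate both formulas for a few buffer counts. The sketch below uses the HW#5.2 page counts with S (200 pages) as the outer relation. The buffer counts other than 52 are hypothetical, and the function names are illustrative.

```python
import math


def block_nested_loop_cost(m_pages, n_pages, buffers):
    """M + ceil(M / (B-2)) * N."""
    return m_pages + math.ceil(m_pages / (buffers - 2)) * n_pages


def hash_join_cost(m_pages, n_pages):
    """3 * (M + N): read and write both relations once, then read them again."""
    return 3 * (m_pages + n_pages)


OUTER, INNER = 200, 1000  # S as outer, R as inner
for buffers in (52, 102, 202):
    blocked = block_nested_loop_cost(OUTER, INNER, buffers)
    hashed = hash_join_cost(OUTER, INNER)
    print(buffers, blocked, hashed, blocked < hashed)
# 52 4200 3600 False
# 102 2200 3600 True
# 202 1200 3600 True
```

Under these formulas the blocked join wins once the outer relation needs only one or two blocks, that is, when it nearly fits in the B-2 available buffers.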


Sort-Join

Sorting both relations using Multi-way sort:
________

Read in each intermediate result of R and S, search for matches.
________

[Figure, sorting phase ("Multisorting"): the original relation is read from disk through B main memory buffers and written back as M/(B-1) sorted partitions of (B-1) pages each.]

[Figure, merge phase ("MergeJoin"): the M/(B-1) sorted partitions of R and the N/(B-1) sorted partitions of S are read through B main memory buffers and merged into the join result.]
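The matching step can be sketched as a merge over two sorted inputs. This in-memory version sorts with Python's built-in sort instead of an external multi-way sort, so it only illustrates the merge logic. The function name and sample rows are made up.

```python
def sort_merge_join(r_rows, s_rows, r_key, s_key):
    """Sort both inputs on the join key, then merge them in one pass."""
    r_sorted = sorted(r_rows, key=lambda row: row[r_key])
    s_sorted = sorted(s_rows, key=lambda row: row[s_key])
    result = []
    i = j = 0
    while i < len(r_sorted) and j < len(s_sorted):
        r_val, s_val = r_sorted[i][r_key], s_sorted[j][s_key]
        if r_val < s_val:
            i += 1
        elif r_val > s_val:
            j += 1
        else:
            # Find the end of the group of S rows with this key value.
            j_end = j
            while j_end < len(s_sorted) and s_sorted[j_end][s_key] == r_val:
                j_end += 1
            # Pair every R row in the group with every S row in the group.
            while i < len(r_sorted) and r_sorted[i][r_key] == r_val:
                for k in range(j, j_end):
                    result.append((r_sorted[i], s_sorted[k]))
                i += 1
            j = j_end
    return result


R = [{"a": 3}, {"a": 1}, {"a": 2}, {"a": 2}]
S = [{"b": 2}, {"b": 2}, {"b": 4}, {"b": 1}]
joined = sort_merge_join(R, S, "a", "b")
print([(r["a"], s["b"]) for r, s in joined])
# [(1, 1), (2, 2), (2, 2), (2, 2), (2, 2)]
```

The output comes out ordered by the join key, which is the property the comparison with hash join relies on.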


Sort-Merge Join vs. Hash-Join

Sort-Merge needs more buffer space
– Sort-Merge Join
  • Cost: 3 * ( M + N )
  • Buffer Size: ________________
– Hash Join
  • Cost: 3 * ( M + N )
  • Buffer Size: ________________
Sort-Merge join is less sensitive to data skew
Result of Sort-Merge join is sorted
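The buffer requirement is left blank on the slide. As an assumption only, the sketch below uses the common textbook rule of thumb that the 3 * (M + N) cost needs roughly B > sqrt(larger relation) for sort-merge and B > sqrt(smaller relation) for hash join. The function names and the B = 20 case are hypothetical.

```python
def enough_for_sort_merge(m_pages, n_pages, buffers):
    """Assumed rule of thumb: B^2 must exceed the LARGER relation's page count."""
    return buffers * buffers > max(m_pages, n_pages)


def enough_for_hash_join(m_pages, n_pages, buffers):
    """Assumed rule of thumb: B^2 must exceed the SMALLER relation's page count."""
    return buffers * buffers > min(m_pages, n_pages)


M, N = 1000, 200  # HW#5.2 page counts
for buffers in (52, 20):
    print(buffers,
          enough_for_sort_merge(M, N, buffers),
          enough_for_hash_join(M, N, buffers))
# 52 True True
# 20 False True
```

Under this assumption, 52 buffers are enough for both algorithms, while 20 buffers would be enough only for the hash join.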


HW#5.4 SQL Transformation

SELECT DISTINCT F.FirstName, F.LastName
FROM GradStudents AS G, Faculty AS F, Advise AS A
WHERE G.LoginID = A.Student AND F.LoginID = A.Advisor
  AND G.Office = '224';

GradStudents: 157 tuples (20 distinct values for Office, uniform distribution)
Faculty: 53 tuples
Advise: 87 tuples
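The statistics above allow a rough size estimate for the selection G.Office = '224'. The sketch assumes the usual uniform-distribution estimate (tuples divided by distinct values). Variable names are illustrative.

```python
grad_students = 157
distinct_offices = 20

# Uniform distribution: each Office value is expected to match
# the same share of the GradStudents tuples.
expected_in_office_224 = grad_students / distinct_offices
print(expected_in_office_224)  # 7.85
```

With about 8 qualifying students, applying the selection before the joins keeps the intermediate results far smaller than joining all 157 GradStudents tuples first.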


HW#5.4 SQL Transformation

SELECT EntryYear, COUNT(*)
FROM GradStudents
WHERE FirstName = 'David'
GROUP BY EntryYear
HAVING EntryYear >= 1995
ORDER BY EntryYear DESC

SELECT EntryYear, COUNT(*)
FROM GradStudents
WHERE FirstName = 'David'
GROUP BY EntryYear
HAVING COUNT(*) >= 10
ORDER BY EntryYear DESC
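One transformation the first query invites is moving the HAVING condition into WHERE, since it tests a grouping column rather than an aggregate. The sketch below checks this on a tiny SQLite table. The sample rows are invented, and the rewritten query is an illustration, not necessarily the homework's intended answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GradStudents (LoginID TEXT, FirstName TEXT, EntryYear INTEGER)"
)
conn.executemany(
    "INSERT INTO GradStudents VALUES (?, ?, ?)",
    [("a", "David", 1994), ("b", "David", 1996), ("c", "David", 1996),
     ("d", "Mary", 1997), ("e", "David", 1999)],
)

having_query = """
SELECT EntryYear, COUNT(*) FROM GradStudents
WHERE FirstName = 'David'
GROUP BY EntryYear
HAVING EntryYear >= 1995
ORDER BY EntryYear DESC
"""

# Same condition applied before grouping instead of after it.
where_query = """
SELECT EntryYear, COUNT(*) FROM GradStudents
WHERE FirstName = 'David' AND EntryYear >= 1995
GROUP BY EntryYear
ORDER BY EntryYear DESC
"""

having_rows = conn.execute(having_query).fetchall()
where_rows = conn.execute(where_query).fetchall()
print(having_rows)                # [(1999, 1), (1996, 2)]
print(having_rows == where_rows)  # True
```

The second query on the slide cannot be rewritten this way, because COUNT(*) >= 10 depends on the aggregate and is only known after grouping.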


HW#5.4 SQL Transformation

SELECT FirstName
FROM Faculty
WHERE FirstName IN (
    SELECT FirstName
    FROM GradStudents)

SELECT FirstName
FROM Faculty
WHERE FirstName NOT IN (
    SELECT FirstName
    FROM GradStudents)
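A common way to transform these subqueries is to express them with EXISTS and NOT EXISTS. The sketch below compares both forms on invented SQLite data. It assumes FirstName is never NULL, since NOT IN and NOT EXISTS differ when the subquery returns NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Faculty (LoginID TEXT, FirstName TEXT)")
conn.execute("CREATE TABLE GradStudents (LoginID TEXT, FirstName TEXT)")
conn.executemany("INSERT INTO Faculty VALUES (?, ?)",
                 [("f1", "David"), ("f2", "Susan"), ("f3", "David"), ("f4", "Alon")])
conn.executemany("INSERT INTO GradStudents VALUES (?, ?)",
                 [("g1", "David"), ("g2", "David"), ("g3", "Mary")])


def names(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


in_rows = names(
    "SELECT FirstName FROM Faculty "
    "WHERE FirstName IN (SELECT FirstName FROM GradStudents)")
exists_rows = names(
    "SELECT F.FirstName FROM Faculty F WHERE EXISTS ("
    "SELECT 1 FROM GradStudents G WHERE G.FirstName = F.FirstName)")
not_in_rows = names(
    "SELECT FirstName FROM Faculty "
    "WHERE FirstName NOT IN (SELECT FirstName FROM GradStudents)")
not_exists_rows = names(
    "SELECT F.FirstName FROM Faculty F WHERE NOT EXISTS ("
    "SELECT 1 FROM GradStudents G WHERE G.FirstName = F.FirstName)")

print(in_rows)      # ['David', 'David']
print(not_in_rows)  # ['Alon', 'Susan']
```

A plain join of Faculty with GradStudents would return four 'David' rows here (two faculty rows times two student rows), which is why the rewrite uses a semijoin-style EXISTS rather than a join.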


HW#5.4 SQL Transformation

SELECT LoginID
FROM UndergradStudents
WHERE EntryYear >= ANY (
    SELECT EntryYear
    FROM GradStudents)

SELECT LoginID
FROM UndergradStudents
WHERE EntryYear >= ALL (
    SELECT EntryYear
    FROM GradStudents)
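For a >= comparison, ANY can be rewritten with MIN and ALL with MAX. SQLite does not support ANY/ALL, so the sketch below runs only the rewritten forms on invented data. The equivalence assumes GradStudents is non-empty and EntryYear has no NULLs (with an empty subquery, >= ALL is true for every row, while the MAX form returns nothing).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UndergradStudents (LoginID TEXT, EntryYear INTEGER)")
conn.execute("CREATE TABLE GradStudents (LoginID TEXT, EntryYear INTEGER)")
conn.executemany("INSERT INTO UndergradStudents VALUES (?, ?)",
                 [("u1", 1995), ("u2", 1998), ("u3", 2001)])
conn.executemany("INSERT INTO GradStudents VALUES (?, ?)",
                 [("g1", 1996), ("g2", 2000)])

# EntryYear >= ANY (...) means: at least as large as the smallest value.
any_rows = [row[0] for row in conn.execute(
    "SELECT LoginID FROM UndergradStudents "
    "WHERE EntryYear >= (SELECT MIN(EntryYear) FROM GradStudents) "
    "ORDER BY LoginID")]

# EntryYear >= ALL (...) means: at least as large as the largest value.
all_rows = [row[0] for row in conn.execute(
    "SELECT LoginID FROM UndergradStudents "
    "WHERE EntryYear >= (SELECT MAX(EntryYear) FROM GradStudents) "
    "ORDER BY LoginID")]

print(any_rows)  # ['u2', 'u3']
print(all_rows)  # ['u3']
```

The scalar subquery is uncorrelated, so it can be evaluated once instead of once per UndergradStudents row.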


HW#4.1 XML & XQuery

<!ELEMENT products (product*)>
<!ELEMENT product (name, price, description, store*)>
<!ELEMENT store (name, phone, markup)>

Which products are sold at least in one store?

What are the product-store pairs whose markup is no lower than 15%?

Which stores sell some products with a price higher than 50?

Which products (except “gizmo”) are sold in some store that also sells the product “gizmo”?
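The homework asks for XQuery, but the logic of each question can be checked in Python against a small document that follows the DTD. The sketch below uses the standard xml.etree.ElementTree module. The sample data is invented, and markup values are assumed to be stored as plain percentages (20 meaning 20%).

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<products>
  <product><name>gizmo</name><price>60</price><description>d1</description>
    <store><name>Acme</name><phone>555-0100</phone><markup>20</markup></store>
  </product>
  <product><name>widget</name><price>30</price><description>d2</description>
    <store><name>Acme</name><phone>555-0100</phone><markup>10</markup></store>
    <store><name>Bolt</name><phone>555-0101</phone><markup>12</markup></store>
  </product>
  <product><name>gadget</name><price>80</price><description>d3</description></product>
</products>
""")
products = doc.findall("product")

# Products sold in at least one store.
sold_somewhere = [p.findtext("name") for p in products
                  if p.find("store") is not None]

# Product-store pairs whose markup is no lower than 15%.
high_markup = [(p.findtext("name"), s.findtext("name"))
               for p in products for s in p.findall("store")
               if float(s.findtext("markup")) >= 15]

# Stores that sell some product with a price higher than 50.
pricey_stores = sorted({s.findtext("name") for p in products
                        if float(p.findtext("price")) > 50
                        for s in p.findall("store")})

# Products other than gizmo sold in a store that also sells gizmo.
gizmo_stores = {s.findtext("name") for p in products
                if p.findtext("name") == "gizmo" for s in p.findall("store")}
with_gizmo = [p.findtext("name") for p in products
              if p.findtext("name") != "gizmo"
              and any(s.findtext("name") in gizmo_stores
                      for s in p.findall("store"))]

print(sold_somewhere)  # ['gizmo', 'widget']
print(high_markup)     # [('gizmo', 'Acme')]
print(pricey_stores)   # ['Acme']
print(with_gizmo)      # ['widget']
```

Stores are matched by name here because the DTD gives store elements no identifier; that is an assumption about how the same store is recognized across products.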


HW#4.1 XML & XQuery

<!ELEMENT products (product*)>
<!ELEMENT product (name, price, description, store*)>
<!ELEMENT store (name, phone, markup)>

Return the names and prices of all products that are sold in all stores with a markup of 25%.

Return the names and prices of all products that are sold in at least one store with a markup of 25%.


HW#4.1 XML & XQuery

<!ELEMENT products (product*)>
<!ELEMENT product (name, price, description, store*)>
<!ELEMENT store (name, phone, markup)>

<products>{
  FOR $p IN documents("database.xml")//products/row
  RETURN
    <product pid = "{$p/pid/text()}">
      <name>{$p/name/text()}</name>
      <price>{$p/price/text()}</price>
      <description>{$p/description/text()}</description>
      {FOR $x IN documents("database.xml")//sells/row[pid = $p/pid]
       FOR $s IN documents("database.xml")//stores/row[sid = $x/sid]
       RETURN
         <store sid = "{$s/sid/text()}">
           <name>{$s/name/text()}</name>
           <phone>{$s/phone/text()}</phone>
           <markup>{$x/markup/text()}</markup>
         </store>
      }</product>
}</products>


Midterm 1


Midterm 1

Company (DeptID, Name, Budget, CEOEmployID, CEOContratID, Since)

Work-in (EmployID, Lot, DeptID, Name, Budget, CEOEmployID, Since)


HW#5.1 B+ Tree


B+ Tree – Insert 70


B+ Tree – Insert 155


B+ Tree – Insert 165


B+ Tree – Delete 10


B+ Tree – Delete 8