Intro to Database Systems
15-445/15-645
Fall 2019
Andy Pavlo, Computer Science, Carnegie Mellon University
04 Database Storage Part II
CMU 15-445/645 (Fall 2019)
ADMINISTRIVIA
Homework #1 is due September 11th @ 11:59pm
Project #1 will be released on September 11th
UPCOMING DATABASE EVENTS
SalesForce Talk
→ Friday Sep 13th @ 12:00pm
→ CIC 4th Floor
Impira Talk
→ Monday Sep 16th @ 4:30pm
→ GHC 8102
Vertica Talk
→ Monday Sep 23rd @ 4:30pm
→ GHC 8102
DISK-ORIENTED ARCHITECTURE
The DBMS assumes that the primary storage location of the database is on non-volatile disk.
The DBMS's components manage the movement of data between non-volatile and volatile storage.
SLOTTED PAGES
The most common layout scheme is called slotted pages.
The slot array maps "slots" to the tuples' starting position offsets.
The header keeps track of:
→ The # of used slots
→ The offset of the starting location of the last slot used.
[Figure: a slotted page with a Header, a Slot Array, and Fixed/Var-length Tuple Data (Tuple #1 to Tuple #4)]
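The header and slot array described above can be sketched in a few lines of Python. This is an illustration only: the 4 KB page size, the 2-byte field widths, and the names `SlottedPage`, `insert`, and `read` are assumptions, not taken from the lecture or from any particular DBMS.

```python
import struct

PAGE_SIZE = 4096                  # assumed page size
HEADER = struct.Struct("<HH")     # (# of used slots, start offset of the last tuple written)
SLOT = struct.Struct("<HH")       # (tuple start offset, tuple length)


class SlottedPage:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        # Empty page: zero slots; tuple data will grow backward from the page end.
        HEADER.pack_into(self.data, 0, 0, PAGE_SIZE)

    def insert(self, tuple_bytes):
        num_slots, free_end = HEADER.unpack_from(self.data, 0)
        slot_pos = HEADER.size + num_slots * SLOT.size
        start = free_end - len(tuple_bytes)
        # The slot array grows forward and the tuple data grows backward;
        # the page is full when the two would overlap.
        if start < slot_pos + SLOT.size:
            raise ValueError("page full")
        self.data[start:free_end] = tuple_bytes
        SLOT.pack_into(self.data, slot_pos, start, len(tuple_bytes))
        HEADER.pack_into(self.data, 0, num_slots + 1, start)
        return num_slots          # the slot number identifies the tuple within the page

    def read(self, slot_no):
        num_slots, _ = HEADER.unpack_from(self.data, 0)
        if not 0 <= slot_no < num_slots:
            raise IndexError("no such slot")
        start, length = SLOT.unpack_from(self.data, HEADER.size + slot_no * SLOT.size)
        return bytes(self.data[start:start + length])
```

Because readers go through the slot array, a tuple could be moved inside the page by rewriting only its slot entry; this sketch does not implement deletes or that kind of reorganization.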
LOG-STRUCTURED FILE ORGANIZATION
Instead of storing tuples in pages, the DBMS only stores log records.
The system appends log records to the file that describe how the database was modified:
→ Inserts store the entire tuple.
→ Deletes mark the tuple as deleted.
→ Updates contain the delta of just the attributes that were modified.
[Figure: a page of log records; new entries are appended at the end]
INSERT id=1,val=a
INSERT id=2,val=b
DELETE id=4
INSERT id=3,val=c
UPDATE val=X (id=3)
UPDATE val=Y (id=4)
…
To read a record, the DBMS scans the log backwards and "recreates" the tuple to find what it needs.
[Figure: the same page of log records, with a "Reads" arrow along the log]
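A minimal Python sketch of the backward scan, assuming a well-formed log held as an in-memory list of `(operation, tuple id, attributes)` records. The record shape, the sample data, and the name `read_tuple` are hypothetical and chosen only for illustration.

```python
def read_tuple(log, tuple_id):
    """Rebuild the current version of a tuple by scanning the log newest-first."""
    attrs = {}
    for op, tid, delta in reversed(log):
        if tid != tuple_id:
            continue
        if op == "DELETE":
            return None                     # the tuple no longer exists
        # Values already taken from newer records must not be overwritten
        # by the older record we are looking at now.
        for name, value in delta.items():
            attrs.setdefault(name, value)
        if op == "INSERT":
            return attrs                    # INSERT stores the whole tuple, so stop here
    return None                             # no INSERT for this id was found


# Hypothetical sample log (oldest record first).
LOG = [
    ("INSERT", 1, {"name": "ann", "val": "a"}),
    ("INSERT", 2, {"name": "bob", "val": "b"}),
    ("UPDATE", 1, {"val": "X"}),
    ("DELETE", 2, None),
]
```

Here `read_tuple(LOG, 1)` merges the UPDATE delta with the older INSERT, while `read_tuple(LOG, 2)` stops at the DELETE. The cost of a read grows with the distance back to the tuple's INSERT record.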
Build indexes to allow it to jump to locations in the log.
[Figure: an index with entries id=1, id=2, id=3, id=4 pointing into the page of log records]
Periodically compact the log.
[Figure: the page after compaction]
id=1,val=a
id=2,val=b
id=3,val=X
id=4,val=Y
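Indexing and compaction can be sketched in the same spirit. The code below assumes a well-formed in-memory log of `(operation, tuple id, attributes)` records; the record shape, the sample data, and the names `build_index` and `compact` are hypothetical.

```python
def build_index(log):
    """Map each tuple id to the position of its newest log record."""
    # Later positions overwrite earlier ones, so the newest record wins.
    return {tid: pos for pos, (_, tid, _) in enumerate(log)}


def compact(log):
    """Replay the log oldest-first and keep one full record per live tuple."""
    live = {}
    for op, tid, delta in log:
        if op == "INSERT":
            live[tid] = dict(delta)
        elif op == "UPDATE" and tid in live:
            live[tid].update(delta)
        elif op == "DELETE":
            live.pop(tid, None)
    # Emit a single INSERT per surviving tuple, ordered by id.
    return [("INSERT", tid, attrs) for tid, attrs in sorted(live.items())]


# Hypothetical sample log (oldest record first).
LOG = [
    ("INSERT", 1, {"name": "ann", "val": "a"}),
    ("INSERT", 2, {"name": "bob", "val": "b"}),
    ("UPDATE", 1, {"val": "X"}),
    ("DELETE", 2, None),
    ("INSERT", 3, {"name": "cat", "val": "c"}),
]
INDEX = build_index(LOG)
COMPACTED = compact(LOG)
```

After compaction every live tuple is described by exactly one record, so a read no longer has to merge deltas.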
TODAY'S AGENDA
Data Representation
System Catalogs
Storage Models
TUPLE STORAGE
A tuple is essentially a sequence of bytes.
It's the job of the DBMS to interpret those bytes into attribute types and values.
The DBMS's catalogs contain the schema information about tables that the system uses to figure out the tuple's layout.
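As a rough illustration of "bytes plus schema", the Python sketch below packs a tuple into raw bytes and decodes it again using a layout description. The `SCHEMA` dictionary stands in for a catalog entry; its column names and the little-endian, unpadded layout are assumptions made for this example.

```python
import struct

# Hypothetical catalog entry: column names plus the byte layout of one tuple.
SCHEMA = {
    "columns": ("id", "age", "balance"),
    "format": "<iHd",   # 32-bit int, 16-bit unsigned int, 64-bit float, no padding
}


def encode_tuple(schema, values):
    """Turn attribute values into the tuple's byte sequence."""
    return struct.pack(schema["format"], *values)


def decode_tuple(schema, raw):
    """Interpret a byte sequence as attribute values, guided by the schema."""
    return dict(zip(schema["columns"], struct.unpack(schema["format"], raw)))


RAW = encode_tuple(SCHEMA, (7, 30, 12.5))
```

Without the schema, `RAW` is just 14 opaque bytes; `decode_tuple(SCHEMA, RAW)` recovers the named attributes.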
DATA REPRESENTATION
INTEGER/BIGINT/SMALLINT/TINYINT
→ C/C++ Representation
FLOAT/REAL vs. NUMERIC/DECIMAL
→ IEEE-754 Standard / Fixed-point Decimals
VARCHAR/VARBINARY/TEXT/BLOB
→ Header with length, followed by data bytes.
TIME/DATE/TIMESTAMP
→ 32/64-bit integer of (micro)seconds since Unix epoch
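Two of these encodings are easy to sketch in Python. The 4-byte length header, the UTF-8 encoding, and the function names are assumptions for illustration; real systems differ in header width and encoding.

```python
import struct
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_varchar(text):
    data = text.encode("utf-8")
    return struct.pack("<I", len(data)) + data      # length header, then the data bytes


def decode_varchar(raw):
    (length,) = struct.unpack_from("<I", raw, 0)
    return raw[4:4 + length].decode("utf-8")


def encode_timestamp(moment):
    # 64-bit count of microseconds since the Unix epoch.
    # `moment` must be a timezone-aware datetime.
    delta = moment - EPOCH
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return struct.pack("<q", micros)
```

The length header is what lets the DBMS find where a variable-length value ends without scanning for a terminator.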
VARIABLE PRECISION NUMBERS
Inexact, variable-precision numeric type that uses the "native" C/C++ types.
→ Examples: FLOAT, REAL/DOUBLE
Store directly as specified by IEEE-754.
Typically faster than arbitrary precision numbers but can have rounding errors…
Rounding Example:

    #include <stdio.h>

    int main(int argc, char* argv[]) {
        float x = 0.1;
        float y = 0.2;
        printf("x+y = %f\n", x+y);
        printf("0.3 = %f\n", 0.3);
    }

Output:

    x+y = 0.300000
    0.3 = 0.300000
    #include <stdio.h>

    int main(int argc, char* argv[]) {
        float x = 0.1;
        float y = 0.2;
        printf("x+y = %.20f\n", x+y);
        printf("0.3 = %.20f\n", 0.3);
    }

Output:

    x+y = 0.30000001192092895508
    0.3 = 0.29999999999999998890
FIXED PRECISION NUMBERS
Numeric data types with arbitrary precision and scale. Used when rounding errors are unacceptable.
→ Example: NUMERIC, DECIMAL
Typically stored in an exact, variable-length binary representation with additional meta-data.
→ Like a VARCHAR but not stored as a string
Demo: Postgres, SQL Server, Oracle
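The same contrast can be reproduced outside a DBMS. The sketch below uses Python's standard `decimal` module as a stand-in for a NUMERIC/DECIMAL type; it is an analogy, not how any of the demoed systems implement the type.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact IEEE-754 representation.
binary_sum = 0.1 + 0.2

# Decimal arithmetic on base-10 digits: the sum is exact.
exact_sum = Decimal("0.1") + Decimal("0.2")

print(binary_sum == 0.3)             # False
print(exact_sum == Decimal("0.3"))   # True
```

The exact type pays for this with slower arithmetic, which is the trade-off the slides describe.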
POSTGRES: NUMERIC
    typedef unsigned char NumericDigit;
    typedef struct {
        int ndigits;           /* # of Digits */
        int weight;            /* Weight of 1st Digit */
        int scale;             /* Scale Factor */
        int sign;              /* Positive/Negative/NaN */
        NumericDigit *digits;  /* Digit Storage */
    } numeric;
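One way to read the struct is as a list of digits plus the power of the first digit. The Python sketch below assumes base-10 digits and ignores the scale and NaN handling; it is a simplified model of the idea, not Postgres's actual code, and the name `numeric_value` is hypothetical.

```python
from decimal import Decimal


def numeric_value(digits, weight, negative=False):
    """digits[0] is worth 10**weight; each following digit is one power of ten lower."""
    value = Decimal(0)
    for i, digit in enumerate(digits):
        value += Decimal(digit) * Decimal(10) ** (weight - i)
    return -value if negative else value
```

For example, the digits 1, 2, 3, 4, 5 with a weight of 2 represent 123.45.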
LARGE VALUES
Most DBMSs don't allow a tuple to exceed the size of a single page.
To store values that are larger than a page, the DBMS uses separate overflow storage pages.
→ Postgres: TOAST (>2KB)
→ MySQL: Overflow (>½ size of page)
→ SQL Server: Overflow (>size of page)
[Figure: a Tuple (Header, a, b, c, d, e) and an Overflow Page holding VARCHAR DATA]
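A minimal Python sketch of the overflow idea: small values stay in the tuple, large ones are split across overflow pages and the tuple keeps only a reference. The 4 KB page size, the 2 KB inline limit, and the names `Storage`, `store_value`, and `load_value` are assumptions for illustration.

```python
PAGE_SIZE = 4096       # assumed page size
INLINE_LIMIT = 2048    # assumed threshold above which a value moves out of the tuple


class Storage:
    def __init__(self):
        self.overflow_pages = []    # each entry holds the bytes of one overflow page

    def store_value(self, value):
        """Return what the tuple stores: the value itself or a reference to it."""
        if len(value) <= INLINE_LIMIT:
            return ("inline", value)
        first_page = len(self.overflow_pages)
        for start in range(0, len(value), PAGE_SIZE):
            self.overflow_pages.append(value[start:start + PAGE_SIZE])
        return ("overflow", first_page, len(value))

    def load_value(self, ref):
        if ref[0] == "inline":
            return ref[1]
        _, first_page, length = ref
        page_count = -(-length // PAGE_SIZE)    # ceiling division
        return b"".join(self.overflow_pages[first_page:first_page + page_count])
```

Reading a large value costs extra page accesses, which is why systems only move values out of the tuple past a size threshold.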
EXTERNAL VALUE STORAGE
Some systems allow you to store a really large value in an external file. Treated as a BLOB type.
→ Oracle: BFILE data type
→ Microsoft: FILESTREAM data type
The DBMS cannot manipulate the contents of an external file.
→ No durability protections.
→ No transaction protections.
[Figure: a Tuple (Header, a, b, c, d, e) and an External File holding Data]
SYSTEM CATALOGS
A DBMS stores meta-data about databases in its internal catalogs.
→ Tables, columns, indexes, views
→ Users, permissions
→ Internal statistics
Almost every DBMS stores a database's catalog in itself.
→ Wrap object abstraction around tuples.
→ Specialized code for "bootstrapping" catalog tables.
You can query the DBMS’s internal INFORMATION_SCHEMA catalog to get info about the database.
→ ANSI standard set of read-only views that provide info about all of the tables, views, columns, and procedures in a database
DBMSs also have non-standard shortcuts to retrieve this information.
ACCESSING TABLE SCHEMA
List all the tables in the current database:
SQL-92:
    SELECT *
      FROM INFORMATION_SCHEMA.TABLES
     WHERE table_catalog = '<db name>';
Postgres: \d;
MySQL: SHOW TABLES;
SQLite: .tables;
List all the columns in the student table:
SQL-92:
    SELECT *
      FROM INFORMATION_SCHEMA.COLUMNS
     WHERE table_name = 'student'
Postgres: \d student;
MySQL: DESCRIBE student;
SQLite: .schema student;
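The same lookups can be done programmatically. The sketch below uses Python's built-in `sqlite3` module; SQLite does not provide INFORMATION_SCHEMA, so it reads SQLite's own catalog table and a PRAGMA instead. The `student` columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, gpa REAL)")

# SQLite stores its catalog in an ordinary table named sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(tables)
print(columns)
```

This is an instance of the catalog being stored "in itself": the list of tables is read with a regular SELECT.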
OBSERVATION
The relational model does not specify that we have to store all of a tuple's attributes together in a single page.
This may not actually be the best layout for some workloads…
WIKIPEDIA EXAMPLE
    CREATE TABLE revisions (
        revID INT PRIMARY KEY,
        userID INT REFERENCES useracct (userID),
        pageID INT REFERENCES pages (pageID),
        content TEXT,
        updated DATETIME
    );
    CREATE TABLE pages (
        pageID INT PRIMARY KEY,
        title VARCHAR UNIQUE,
        latest INT REFERENCES revisions (revID)
    );
    CREATE TABLE useracct (
        userID INT PRIMARY KEY,
        userName VARCHAR UNIQUE,
        ⋮
    );
OLTP
On-line Transaction Processing:
→ Simple queries that read/update a small amount of data that is related to a single entity in the database.
This is usually the kind of application that people build first.

    UPDATE useracct
       SET lastLogin = NOW(),
           hostname = ?
     WHERE userID = ?

    INSERT INTO revisions VALUES (?,?…,?)

    SELECT P.*, R.*
      FROM pages AS P
     INNER JOIN revisions AS R
        ON P.latest = R.revID
     WHERE P.pageID = ?
OLAP
On-line Analytical Processing:
→ Complex queries that read large portions of the database spanning multiple entities.
You execute these workloads on the data you have collected from your OLTP application(s).

    SELECT COUNT(U.lastLogin),
           EXTRACT(month FROM U.lastLogin) AS month
      FROM useracct AS U
     WHERE U.hostname LIKE '%.gov'
     GROUP BY EXTRACT(month FROM U.lastLogin)
WORKLOAD CHARACTERIZATION
[Figure: OLTP, HTAP, and OLAP placed on two axes: Operation Complexity (Simple to Complex) and Workload Focus (Writes to Reads). [SOURCE]]
DATA STORAGE MODELS
The DBMS can store tuples in different ways that are better for either OLTP or OLAP workloads.
We have been assuming the n-ary storage model (aka "row storage") so far this semester.
N-ARY STORAGE MODEL (NSM)
The DBMS stores all attributes for a single tuple contiguously in a page.
Ideal for OLTP workloads, where queries tend to operate only on an individual entity, and for insert-heavy workloads.
[Figure: Tuple #1 to Tuple #4, each stored as a Header followed by userID, userName, userPass, hostname, lastLogin, packed one after another in an NSM disk page]
    SELECT * FROM useracct
     WHERE userName = ?
       AND userPass = ?

    INSERT INTO useracct VALUES (?,?,…?)

[Figure: an Index (Lecture 7) and an NSM Disk Page of tuples; the inserted tuple (Header, userID, userName, userPass, hostname, lastLogin) is added to the page]
    SELECT COUNT(U.lastLogin),
           EXTRACT(month FROM U.lastLogin) AS month
      FROM useracct AS U
     WHERE U.hostname LIKE '%.gov'
     GROUP BY EXTRACT(month FROM U.lastLogin)
[Figure: the NSM Disk Page read by this query, with the attributes the query does not use marked "Useless Data"]
N-ARY STORAGE MODEL
Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple.
Disadvantages
→ Not good for scanning large portions of the table and/or a subset of the attributes.
DECOMPOSITION STORAGE MODEL (DSM)
The DBMS stores the values of a single attribute for all tuples contiguously in a page.
→ Also known as a "column store".
Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table’s attributes.
[Figure: DSM Disk Pages, one per attribute (userID, userName, userPass, hostname, lastLogin); the hostname page holds only hostname values]
    SELECT COUNT(U.lastLogin),
           EXTRACT(month FROM U.lastLogin) AS month
      FROM useracct AS U
     WHERE U.hostname LIKE '%.gov'
     GROUP BY EXTRACT(month FROM U.lastLogin)
[Figure: the DSM Disk Page of hostname values scanned by this query]
TUPLE IDENTIFICATION
Choice #1: Fixed-length Offsets
→ Each value is the same length for an attribute.
Choice #2: Embedded Tuple Ids
→ Each value is stored with its tuple id in a column.
[Figure: Offsets: columns A, B, C, D share one set of positions 0, 1, 2, 3. Embedded Ids: each of columns A, B, C, D carries its own ids 0, 1, 2, 3]
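Both choices can be sketched in Python for a column of 4-byte integers. The byte layouts and the names `read_by_offset` and `read_by_embedded_id` are assumptions for illustration.

```python
import struct

VALUE = struct.Struct("<i")       # Choice #1: every value is a 4-byte integer
ID_VALUE = struct.Struct("<ii")   # Choice #2: (tuple id, value) pairs


def read_by_offset(column, tuple_no):
    # Fixed-length values: the value of tuple i starts at byte i * width.
    (value,) = VALUE.unpack_from(column, tuple_no * VALUE.size)
    return value


def read_by_embedded_id(column, tuple_id):
    # Embedded ids: scan the pairs until the id matches.
    for offset in range(0, len(column), ID_VALUE.size):
        tid, value = ID_VALUE.unpack_from(column, offset)
        if tid == tuple_id:
            return value
    return None


values = [10, 20, 30, 40]
offset_column = b"".join(VALUE.pack(v) for v in values)
embedded_column = b"".join(ID_VALUE.pack(i, v) for i, v in enumerate(values))
```

In this sketch the embedded-id column is twice as large and needs a scan to find a tuple, while the offset column supports direct arithmetic but requires every value to have the same width.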
DECOMPOSITION STORAGE MODEL (DSM)
Advantages
→ Reduces the amount of wasted I/O because the DBMS only reads the data that it needs.
→ Better query processing and data compression (more on this later).
Disadvantages
→ Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching.
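The trade-off can be made concrete with a toy comparison in Python. The sample rows are invented, and counting "values touched" is only a stand-in for I/O; real systems read pages, not individual values.

```python
COLUMNS = ("userID", "userName", "userPass", "hostname", "lastLogin")
ROWS = [                                  # made-up sample data
    (1, "ann", "pw1", "cs.cmu.edu", "2019-09-01"),
    (2, "bob", "pw2", "nasa.gov", "2019-09-03"),
    (3, "cat", "pw3", "irs.gov", "2019-08-21"),
]

# NSM: one record per tuple.
nsm = list(ROWS)
# DSM: one list per attribute; position i in every list belongs to tuple i.
dsm = {name: [row[i] for row in ROWS] for i, name in enumerate(COLUMNS)}


def count_gov_nsm(rows):
    """Count '.gov' hostnames; every attribute of every tuple is read."""
    touched = count = 0
    for row in rows:
        touched += len(row)
        if row[3].endswith(".gov"):
            count += 1
    return count, touched


def count_gov_dsm(columns):
    """Same count, but only the hostname column is read."""
    hostnames = columns["hostname"]
    return sum(h.endswith(".gov") for h in hostnames), len(hostnames)


def stitch_tuple(columns, i):
    """Point lookup in DSM: one access per column to rebuild tuple i."""
    return tuple(columns[name][i] for name in COLUMNS)
```

The scan touches 15 values in the row layout and 3 in the column layout, while rebuilding a single tuple from the column layout needs one access per attribute.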
DSM SYSTEM HISTORY
1970s: Cantor DBMS
1980s: DSM Proposal
1990s: SybaseIQ (in-memory only)
2000s: Vertica, VectorWise, MonetDB
2010s: Everyone
CONCLUSION
The storage manager is not entirely independent from the rest of the DBMS.
It is important to choose the right storage model for the target workload:
→ OLTP = Row Store
→ OLAP = Column Store
DATABASE STORAGE
Problem #1: How the DBMS represents the database in files on disk.
Problem #2: How the DBMS manages its memory and moves data back-and-forth from disk.