01_intro (1)nnm
TRANSCRIPT
-
8/13/2019 01_intro (1)nnm
1/23
CPS 216: Advanced Database Systems
Shivnath Babu
-
Outline for Today
What this class is about: Data management
What we will cover in this class
Logistics
What does a Database System mean to you? (Hint: What are they used for? Give examples)
-
Example: At a Company
Employee
ID  Name  DeptID  Salary
10  Nemo  12      120K
20  Dory  156     79K
40  Gill  89      76K
52  Ray   34      85K
Department
ID   Name
12   IT
34   Accounts
89   HR
156  Marketing
Query 1: Is there an employee named Nemo?
Query 2: What is Nemo's salary?
Query 3: How many departments are there in the company?
Query 4: What is the name of Nemo's department?
Query 5: How many employees are there in the Accounts department?
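As a rough illustration, the five queries can be written declaratively in SQL. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS; the column types and the `answers` dictionary are assumptions made for this example and are not part of the slides.

```python
import sqlite3

# In-memory SQLite database standing in for "the DBMS" (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, DeptID INTEGER, Salary TEXT);
CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO Employee VALUES
  (10, 'Nemo', 12, '120K'), (20, 'Dory', 156, '79K'),
  (40, 'Gill', 89, '76K'),  (52, 'Ray', 34, '85K');
INSERT INTO Department VALUES
  (12, 'IT'), (34, 'Accounts'), (89, 'HR'), (156, 'Marketing');
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

answers = {
    # Query 1: is there an employee named Nemo? (1 = yes, 0 = no)
    "q1": scalar("SELECT EXISTS(SELECT 1 FROM Employee WHERE Name = 'Nemo')"),
    # Query 2: Nemo's salary
    "q2": scalar("SELECT Salary FROM Employee WHERE Name = 'Nemo'"),
    # Query 3: number of departments
    "q3": scalar("SELECT COUNT(*) FROM Department"),
    # Query 4: name of Nemo's department (needs a join)
    "q4": scalar("""SELECT d.Name FROM Employee e
                    JOIN Department d ON e.DeptID = d.ID
                    WHERE e.Name = 'Nemo'"""),
    # Query 5: number of employees in the Accounts department
    "q5": scalar("""SELECT COUNT(*) FROM Employee e
                    JOIN Department d ON e.DeptID = d.ID
                    WHERE d.Name = 'Accounts'"""),
}
print(answers)
```

Each query states what is wanted, not how to find it; choosing how to scan or join the tables is left to the system.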
-
DataBase Management System (DBMS)
High-level Query Q
DBMS
Data
Answer
Translates Q into the best execution plan for current conditions, runs plan
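One way to see "best execution plan for current conditions" in practice is to ask a real system for its plan before and after the conditions change. The sketch below uses SQLite through Python's sqlite3 module as an assumed stand-in; the table contents and the index name `idx_employee_name` are made up for illustration, and the exact plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER, Name TEXT, Salary TEXT)")
conn.execute("INSERT INTO Employee VALUES (10, 'Nemo', '120K')")

query = "SELECT Salary FROM Employee WHERE Name = 'Nemo'"

def plan(sql):
    """Return SQLite's description of the plan it would use for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each row.
    return " | ".join(row[-1] for row in rows)

plan_before = plan(query)   # no index available: a full table scan
conn.execute("CREATE INDEX idx_employee_name ON Employee(Name)")
plan_after = plan(query)    # conditions changed: an index lookup on Name

answer = conn.execute(query).fetchone()[0]
print(plan_before)
print(plan_after)
print(answer)
```

The query text is identical in both calls; only the plan chosen by the system changes.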
-
Example: Store that Sells Cars
Cars
Make    Model   OwnerID
Honda   Accord  12
Toyota  Camry   34
Mini    Cooper  89
Honda   Accord  156
Owners
ID   Name  Age
12   Nemo  22
34   Ray   42
89   Gill  36
156  Dory  21
Filter (Make = Honda and Model = Accord)
Join (Cars.OwnerID = Owners.ID)
Make   Model   OwnerID  ID   Name  Age
Honda  Accord  12       12   Nemo  22
Honda  Accord  156      156  Dory  21
Owners of Honda Accords who are
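The filter-then-join plan on this slide can be sketched in a few lines of Python. This is only an illustration of one possible plan over in-memory lists, not how a DBMS implements it; the variable names are assumptions.

```python
cars = [
    {"Make": "Honda",  "Model": "Accord", "OwnerID": 12},
    {"Make": "Toyota", "Model": "Camry",  "OwnerID": 34},
    {"Make": "Mini",   "Model": "Cooper", "OwnerID": 89},
    {"Make": "Honda",  "Model": "Accord", "OwnerID": 156},
]
owners = [
    {"ID": 12,  "Name": "Nemo", "Age": 22},
    {"ID": 34,  "Name": "Ray",  "Age": 42},
    {"ID": 89,  "Name": "Gill", "Age": 36},
    {"ID": 156, "Name": "Dory", "Age": 21},
]

# Filter (Make = Honda and Model = Accord): shrink Cars before joining.
accords = [c for c in cars if c["Make"] == "Honda" and c["Model"] == "Accord"]

# Join (Cars.OwnerID = Owners.ID): look owners up by ID in a dictionary.
owners_by_id = {o["ID"]: o for o in owners}
result = [
    {**car, **owners_by_id[car["OwnerID"]]}
    for car in accords
    if car["OwnerID"] in owners_by_id
]

for row in result:
    print(row)
```

Filtering first means the join only touches two car rows instead of four; joining first and filtering afterwards would give the same answer with more work.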
-
DataBase Management System (DBMS)
High-level Query Q
DBMS
Data
Answer
Translates Q into the best execution plan for current conditions, runs plan
Keeps data safe and correct despite failures, concurrent updates, online processing, etc.
-
Homer withdraws $100:
read balance; $400
if balance > amount then
balance = balance - amount; $300
write balance; $300
Marge withdraws $50:
read balance; $300
if balance > amount then
balance = balance - amount; $250
write balance; $250
Final balance = $250
-
Homer withdraws $100:
read balance; $400
if balance > amount then
balance = balance - amount; $300
write balance; $300
Marge withdraws $50:
read balance; $400
if balance > amount then
balance = balance - amount; $350
write balance; $350
Both read $400; Homer's write lands last, so Marge's withdrawal is lost
Final balance = $300
-
Homer withdraws $100:
read balance; $400
if balance > amount then
balance = balance - amount; $300
write balance; $300
Marge withdraws $50:
read balance; $400
if balance > amount then
balance = balance - amount; $350
write balance; $350
Both read $400; Marge's write lands last, so Homer's withdrawal is lost
Final balance = $350
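The three outcomes ($250, $300, $350) can be reproduced by replaying the read and write steps in different orders. The sketch below is a deterministic simulation, not real concurrency; the `run` function and the schedule format are assumptions introduced for this illustration.

```python
AMOUNTS = {"Homer": 100, "Marge": 50}

def run(schedule, start=400):
    """Replay (person, step) pairs against a shared balance and return the final balance."""
    balance = start
    seen = {}  # each person's private copy of the balance, taken at read time
    for who, step in schedule:
        if step == "read":
            seen[who] = balance
        elif step == "write":
            # The check and the subtraction use the (possibly stale) private copy.
            if seen[who] > AMOUNTS[who]:
                balance = seen[who] - AMOUNTS[who]
    return balance

serial = [("Homer", "read"), ("Homer", "write"),
          ("Marge", "read"), ("Marge", "write")]
homer_writes_last = [("Homer", "read"), ("Marge", "read"),
                     ("Marge", "write"), ("Homer", "write")]
marge_writes_last = [("Homer", "read"), ("Marge", "read"),
                     ("Homer", "write"), ("Marge", "write")]

print(run(serial))             # 250
print(run(homer_writes_last))  # 300
print(run(marge_writes_last))  # 350
```

Only the serial schedule subtracts both withdrawals; in the other two, whichever write lands last overwrites the other one.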
-
Concurrency control in DBMS
Similar to concurrent programming problems
But data is not all in main memory
Appears similar to file system concurrent access?
Approach taken by MySQL initially; now MySQL offers better alternatives
But want to control at much finer granularity
Or else one withdrawal would lock up all accounts!
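To make "finer granularity" concrete, here is a minimal sketch with one lock per account instead of one lock for everything. It uses Python threads and in-memory balances, which is a simplification: a real DBMS locks data that may live on disk. The `Bank` class and the account names are hypothetical.

```python
import threading

class Bank:
    def __init__(self, balances):
        self.balances = dict(balances)
        # One lock per account: a withdrawal blocks only users of that account.
        self.locks = {acct: threading.Lock() for acct in balances}

    def withdraw(self, acct, amount):
        # Read, check, and write happen as one unit while the account lock is held.
        with self.locks[acct]:
            if self.balances[acct] > amount:
                self.balances[acct] -= amount
                return True
            return False

bank = Bank({"joint": 400, "other": 1000})
threads = [
    threading.Thread(target=bank.withdraw, args=("joint", amount))
    for amount in (100, 50)  # Homer's and Marge's withdrawals
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(bank.balances)
```

Whichever thread gets the lock first, both withdrawals are applied and the joint balance ends at $250, while the other account is never locked.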
-
Recovery in DBMS
Example: balance transfer
decrement the balance of account X by $100;
increment the balance of account Y by $100;
Scenario 1: Power goes out after the first instruction
Scenario 2: DBMS buffers and updates data in memory (for efficiency); before they are written back to disk, power goes out
Log updates; undo/redo during recovery
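The idea of logging updates and then undoing or redoing them can be sketched as follows. This is a toy model under stated assumptions: dictionaries stand in for disk and memory, the log is assumed to survive the crash, and the starting balances ($500 and $200), the record layout, and the function names are invented for the example.

```python
def transfer(db, log, txn="T1", amount=100, crash_after_first=False):
    """Move amount from X to Y, logging (old, new) values before each update."""
    log.append(("start", txn))
    log.append(("update", txn, "X", db["X"], db["X"] - amount))
    db["X"] -= amount
    if crash_after_first:
        return  # simulated power failure: no commit record is written
    log.append(("update", txn, "Y", db["Y"], db["Y"] + amount))
    db["Y"] += amount
    log.append(("commit", txn))

def recover(db, log):
    """Redo updates of committed transactions, undo updates of uncommitted ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:  # redo in log order
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, _old, new = rec
            db[key] = new
    for rec in reversed(log):  # undo in reverse log order
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, old, _new = rec
            db[key] = old
    return db

# Scenario 1: power goes out after the first instruction.
disk1, log1 = {"X": 500, "Y": 200}, []
transfer(disk1, log1, crash_after_first=True)
after_crash1 = dict(disk1)   # X already decremented, Y untouched
recover(disk1, log1)         # undo restores X

# Scenario 2: updates happened only in memory; disk is stale when power goes out.
disk2, log2 = {"X": 500, "Y": 200}, []
memory = dict(disk2)
transfer(memory, log2)       # committed in the log, but never written to disk2
recover(disk2, log2)         # redo brings disk2 up to date

print(after_crash1, disk1, disk2)
```

In Scenario 1 the missing commit record tells recovery to undo the half-finished transfer; in Scenario 2 the commit record tells it to redo the updates that never reached disk.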
-
Summary of modern DBMS features
Persistent storage of data
Logical data model; declarative queries and updates → physical data independence
Multi-user concurrent access
Safety from system failures
Performance, performance, performance
Massive amounts of data (terabytes to petabytes)
High throughput (thousands to millions of transactions per minute)
High availability (99.999% uptime)
-
Modern DBMS Architecture
Applications
↓ SQL
Parser (DBMS)
↓ Logical query plan
Query Optimizer (DBMS)
↓ Physical query plan
Query Executor (DBMS)
↓ Access method API calls
Storage Manager (DBMS)
↓ File system API calls
OS
↓ Storage system API calls
Disk(s)
-
Course Outline
40% of the class is about core DBMS concepts
Query execution, query optimization, transactions, recovery, etc.
Textbook material
60% of the class is on what is happening today in data management
New developments on textbook material
Data streams
Web search: Google, Yahoo!
Data integration (structured data + unstructured data)
Data mining
Unsolved challenges
-
Using a Traditional DBMS
User/Application
Loader
Query Result
Table R
Table S
Result
Query
-
New Approach for Data Streams
User/Application
Register Continuous Query (Standing Query)
Stream Query Processor
Input streams
Result
-
Example Continuous (Standing) Queries
Web
Amazon's best sellers over last hour
Network Intrusion Detection
Track HTTP packets with destination address matching a prefix in given table and content matching *\.ida
Finance
Monitor NASDAQ stocks between $20 and $200 that have moved down more than 2% in the last 20 minutes
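The finance example can be sketched as a standing query that is evaluated on every arriving tick rather than once over stored data. The sketch below is one possible reading under assumptions: "moved down" is taken to mean a drop from the highest price seen inside the 20-minute window, timestamps are in seconds, and the `DropMonitor` class and the ticker symbols are hypothetical.

```python
from collections import deque

class DropMonitor:
    """Standing query: flag stocks in [lo, hi] that fell more than pct percent within the window."""

    def __init__(self, window_s=20 * 60, pct=2.0, lo=20.0, hi=200.0):
        self.window_s = window_s
        self.pct = pct
        self.lo = lo
        self.hi = hi
        self.history = {}  # symbol -> deque of (timestamp, price) inside the window

    def on_tick(self, symbol, ts, price):
        """Process one stream element; return True if the query condition holds now."""
        ticks = self.history.setdefault(symbol, deque())
        ticks.append((ts, price))
        # Evict ticks that have slid out of the window.
        while ticks and ticks[0][0] < ts - self.window_s:
            ticks.popleft()
        if not (self.lo <= price <= self.hi):
            return False
        peak = max(p for _, p in ticks)
        return (peak - price) / peak * 100 > self.pct

monitor = DropMonitor()
alerts = [
    monitor.on_tick("ACME", 0, 100.0),     # first tick, no drop yet
    monitor.on_tick("ACME", 600, 97.0),    # 3% below the peak, 10 minutes later
    monitor.on_tick("ACME", 2000, 97.0),   # earlier ticks have expired from the window
    monitor.on_tick("PENNY", 0, 10.0),     # outside the $20 to $200 price range
    monitor.on_tick("PENNY", 60, 9.0),     # large drop, but still outside the range
]
print(alerts)
```

The processor keeps only the ticks inside the window, so its state stays bounded even though the input stream never ends.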
-
New Challenges in DBMSs
High-level Query Q
DBMS
Answer
Data
TeraBytes PetaBytes
Empire B.
Bob Dylan
USA
Columbia
10.90
-
Course Logistics
Reference: Database Systems: The Complete Book, by H. Garcia-Molina, J. D. Ullman, and J. Widom
Web site: http://www.cs.duke.edu/courses/fall07/cps216
Grading:
Project 30%
Homework Assignments 20%
Midterm 20%
Final 30%
-
Summary: Data Management is Important
Core aspect of most sciences and engineering today
Core need in industry
Cool mix of theory and systems
Chances are you will find something interesting even if your primary interest is elsewhere