01_intro (1)nnm

Upload: tushar-chauhan

Post on 04-Jun-2018


  • 8/13/2019 01_intro (1)nnm

    1/23

    CPS 216: Advanced Database Systems

    Shivnath Babu


    Outline for Today

    What this class is about: Data management

    What we will cover in this class

    Logistics

    What does a Database System mean to you? (Hint: What are they used for? Give examples)


    Example: At a Company

    Employee

    ID  Name  DeptID  Salary
    10  Nemo  12      120K
    20  Dory  156     79K
    40  Gill  89      76K
    52  Ray   34      85K

    Department

    ID   Name
    12   IT
    34   Accounts
    89   HR
    156  Marketing

    Query 1: Is there an employee named Nemo?

    Query 2: What is Nemo's salary?

    Query 3: How many departments are there in the company?

    Query 4: What is the name of Nemo's department?

    Query 5: How many employees are there in the Accounts department?
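    One way to make the five queries concrete is to load the two tables into an in-memory SQLite database and phrase each query in SQL. This is only a sketch: the column types, the helper `one`, and the choice to store salaries as text such as '120K' (as written on the slide) are assumptions, not part of the slides.

```python
import sqlite3

# In-memory database holding the Employee and Department tables from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (
        ID INTEGER PRIMARY KEY,
        Name TEXT,
        DeptID INTEGER REFERENCES Department(ID),
        Salary TEXT  -- kept as text ('120K') exactly as shown on the slide
    );
    INSERT INTO Department VALUES
        (12, 'IT'), (34, 'Accounts'), (89, 'HR'), (156, 'Marketing');
    INSERT INTO Employee VALUES
        (10, 'Nemo', 12, '120K'), (20, 'Dory', 156, '79K'),
        (40, 'Gill', 89, '76K'), (52, 'Ray', 34, '85K');
""")

def one(sql, *params):
    """Hypothetical helper: run a query and return the single value it yields."""
    return conn.execute(sql, params).fetchone()[0]

# Query 1: is there an employee named Nemo? (1 means yes, 0 means no)
q1 = one("SELECT EXISTS(SELECT 1 FROM Employee WHERE Name = ?)", "Nemo")
# Query 2: Nemo's salary
q2 = one("SELECT Salary FROM Employee WHERE Name = ?", "Nemo")
# Query 3: number of departments
q3 = one("SELECT COUNT(*) FROM Department")
# Query 4: name of Nemo's department (needs a join across both tables)
q4 = one(
    "SELECT d.Name FROM Employee e JOIN Department d ON e.DeptID = d.ID "
    "WHERE e.Name = ?",
    "Nemo",
)
# Query 5: number of employees in the Accounts department
q5 = one(
    "SELECT COUNT(*) FROM Employee e JOIN Department d ON e.DeptID = d.ID "
    "WHERE d.Name = ?",
    "Accounts",
)

print(bool(q1), q2, q3, q4, q5)
```

    Queries 1 to 3 touch a single table, while Queries 4 and 5 have to combine Employee and Department through DeptID.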


    DataBase Management System (DBMS)

    High-level Query Q

    DBMS

    Data

    Answer

    Translates Q into best execution plan for current conditions, runs plan


    Example: Store that Sells Cars

    Cars

    Make    Model   OwnerID
    Honda   Accord  12
    Toyota  Camry   34
    Mini    Cooper  89
    Honda   Accord  156

    Owners

    ID   Name  Age
    12   Nemo  22
    34   Ray   42
    89   Gill  36
    156  Dory  21

    Filter (Make = Honda and Model = Accord)

    Join (Cars.OwnerID = Owners.ID)

    Make   Model   OwnerID  ID   Name  Age
    Honda  Accord  12       12   Nemo  22
    Honda  Accord  156      156  Dory  21

    Owners of Honda Accords who are
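    The filter and join steps on this slide can be sketched as a single SQL query run through Python's sqlite3 module. The schema types are assumptions, and the sketch applies only the Make/Model filter and the join that the slide spells out; the further condition on the owners is cut off in the slide text, so it is not applied here.

```python
import sqlite3

# Load the Cars and Owners tables from the slide into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Owners (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    CREATE TABLE Cars (Make TEXT, Model TEXT, OwnerID INTEGER REFERENCES Owners(ID));
    INSERT INTO Owners VALUES
        (12, 'Nemo', 22), (34, 'Ray', 42), (89, 'Gill', 36), (156, 'Dory', 21);
    INSERT INTO Cars VALUES
        ('Honda', 'Accord', 12), ('Toyota', 'Camry', 34),
        ('Mini', 'Cooper', 89), ('Honda', 'Accord', 156);
""")

# WHERE expresses the Filter step, JOIN ... ON expresses the Join step.
rows = conn.execute("""
    SELECT c.Make, c.Model, c.OwnerID, o.ID, o.Name, o.Age
    FROM Cars AS c
    JOIN Owners AS o ON c.OwnerID = o.ID
    WHERE c.Make = 'Honda' AND c.Model = 'Accord'
    ORDER BY o.ID
""").fetchall()

for row in rows:
    print(row)
```

    The query says nothing about whether to filter first or join first; picking that order is the DBMS's job.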


    DataBase Management System (DBMS)

    High-level Query Q

    DBMS

    Data

    Answer

    Translates Q into best execution plan for current conditions, runs plan

    Keeps data safe and correct despite failures, concurrent updates, online processing, etc.


    Homer withdraws $100:

        read balance; $400
        if balance > amount then
            balance = balance - amount; $300
        write balance; $300

    Marge withdraws $50:

        read balance; $300
        if balance > amount then
            balance = balance - amount; $250
        write balance; $250

    Final balance = $250


    Homer withdraws $100:

        read balance; $400
        if balance > amount then
            balance = balance - amount; $300
        write balance; $300

    Marge withdraws $50:

        read balance; $400
        if balance > amount then
            balance = balance - amount; $350
        write balance; $350

    Both read $400 before either writes; Homer's write lands last.

    Final balance = $300


    Homer withdraws $100:

        read balance; $400
        if balance > amount then
            balance = balance - amount; $300
        write balance; $300

    Marge withdraws $50:

        read balance; $400
        if balance > amount then
            balance = balance - amount; $350
        write balance; $350

    Both read $400 before either writes; Marge's write lands last.

    Final balance = $350
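    The three outcomes on the withdrawal slides ($250, $300, $350) can be reproduced with a small deterministic simulation. This is a sketch, not real concurrency: the function `run` and the schedule lists are hypothetical names, and each schedule simply fixes the order in which the read and write steps of the two withdrawals hit the shared balance.

```python
def run(schedule, balance=400):
    """Apply withdrawal steps to one shared balance in the given order.

    schedule is a list of (person, step) pairs, where step is "read" or "write".
    """
    amounts = {"Homer": 100, "Marge": 50}
    local = {}  # each person's private copy of the balance, taken at read time
    for person, step in schedule:
        if step == "read":
            local[person] = balance
        elif step == "write":
            # Same check as on the slide: only withdraw if balance > amount.
            if local[person] > amounts[person]:
                balance = local[person] - amounts[person]
    return balance

# Homer finishes completely before Marge starts.
serial = [("Homer", "read"), ("Homer", "write"),
          ("Marge", "read"), ("Marge", "write")]
# Both read $400 first; Homer's write lands last and overwrites Marge's.
homer_last = [("Homer", "read"), ("Marge", "read"),
              ("Marge", "write"), ("Homer", "write")]
# Both read $400 first; Marge's write lands last and overwrites Homer's.
marge_last = [("Homer", "read"), ("Marge", "read"),
              ("Homer", "write"), ("Marge", "write")]

for name, schedule in [("serial", serial), ("homer_last", homer_last),
                       ("marge_last", marge_last)]:
    print(name, run(schedule))
```

    Only the serial schedule accounts for both withdrawals; in the other two, one update is silently lost.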


    Concurrency control in DBMS

    Similar to concurrent programming problems

    But data is not all in main memory

    Appears similar to file system concurrent access?

    Approach taken by MySQL initially; now MySQL offers better alternatives

    But want to control at much finer granularity

    Or else one withdrawal would lock up all accounts!
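    The granularity point can be illustrated with plain Python threads: one lock per account instead of one lock for everything. This is a toy, in-memory sketch and not how a DBMS implements locking; the class `Bank` and the account names are hypothetical.

```python
import threading

class Bank:
    def __init__(self, balances):
        self.balances = dict(balances)
        # One lock per account, so a withdrawal from one account
        # does not block withdrawals from any other account.
        self.locks = {acct: threading.Lock() for acct in balances}

    def withdraw(self, acct, amount):
        # Holding the account's lock makes read-check-write one atomic step.
        with self.locks[acct]:
            if self.balances[acct] > amount:
                self.balances[acct] -= amount
                return True
            return False

bank = Bank({"simpsons": 400, "flanders": 900})
threads = [
    threading.Thread(target=bank.withdraw, args=("simpsons", 100)),
    threading.Thread(target=bank.withdraw, args=("simpsons", 50)),
    threading.Thread(target=bank.withdraw, args=("flanders", 200)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(bank.balances)
```

    With a single global lock the result would be the same, but the withdrawal from the second account would have to wait for the first account's withdrawals for no reason.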


    Recovery in DBMS

    Example: balance transfer

        decrement the balance of account X by $100;
        increment the balance of account Y by $100;

    Scenario 1: Power goes out after the first instruction

    Scenario 2: DBMS buffers and updates data in memory (for efficiency); before they are written back to disk, power goes out

    Log updates; undo/redo during recovery
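    A minimal sketch of the "log updates, undo during recovery" idea for Scenario 1, assuming an in-memory dict stands in for the database and a Python list stands in for a durable log. The starting balances, the transaction name "T1", and the functions `transfer` and `recover` are hypothetical; redo for Scenario 2 is not shown.

```python
def transfer(db, log, src, dst, amount, crash_after_first=False):
    """Move `amount` from src to dst, logging each old value before changing it."""
    log.append(("start", "T1"))
    log.append(("update", "T1", src, db[src]))  # record old value first
    db[src] -= amount
    if crash_after_first:
        return  # simulated power failure: no commit record is written
    log.append(("update", "T1", dst, db[dst]))
    db[dst] += amount
    log.append(("commit", "T1"))

def recover(db, log):
    """Undo, newest first, every update of a transaction that never committed."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, old_value = rec
            db[key] = old_value

db = {"X": 500, "Y": 200}  # hypothetical starting balances
log = []

transfer(db, log, "X", "Y", 100, crash_after_first=True)
after_crash = dict(db)   # X was decremented, Y never incremented

recover(db, log)
after_recovery = dict(db)  # the half-finished transfer is rolled back

print(after_crash, after_recovery)
```

    After recovery the $100 is back in account X, so the transfer either happens completely or not at all.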


    DataBase Management System (DBMS)

    High-level Query Q

    DBMS

    Data

    Answer

    Translates Q into best execution plan for current conditions, runs plan

    Keeps data safe and correct despite failures, concurrent updates, online processing, etc.


    Summary of modern DBMS features

    Persistent storage of data

    Logical data model; declarative queries and updates -> physical data independence

    Multi-user concurrent access

    Safety from system failures

    Performance, performance, performance

    Massive amounts of data (terabytes ~ petabytes)

    High throughput (thousands ~ millions of transactions per minute)

    High availability (99.999% uptime)
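    Physical data independence can be seen in a small experiment with SQLite: the same declarative query returns the same answer before and after an index is added, while the execution plan the system reports is free to change. This is a sketch that reuses the Employee table from the company example; the index name `idx_employee_name` and the helper `answer_and_plan` are assumptions, and the exact plan text depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT,
                           DeptID INTEGER, Salary TEXT);
    INSERT INTO Employee VALUES
        (10, 'Nemo', 12, '120K'), (20, 'Dory', 156, '79K'),
        (40, 'Gill', 89, '76K'), (52, 'Ray', 34, '85K');
""")

# The query states what is wanted, not how to find it.
query = "SELECT Salary FROM Employee WHERE Name = ?"

def answer_and_plan():
    """Return the query's answer and SQLite's description of how it would run it."""
    answer = conn.execute(query, ("Nemo",)).fetchall()
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
    plan = [row[-1] for row in
            conn.execute("EXPLAIN QUERY PLAN " + query, ("Nemo",))]
    return answer, plan

answer_before, plan_before = answer_and_plan()

# Change the physical layout only: add an index on Name.
conn.execute("CREATE INDEX idx_employee_name ON Employee(Name)")

answer_after, plan_after = answer_and_plan()

print(answer_before, plan_before)
print(answer_after, plan_after)
```

    The application's query text never changes; only the plan chosen underneath it does.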


    Modern DBMS Architecture

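    Applications
      -> (SQL) -> Parser
      -> (Logical query plan) -> Query Optimizer
      -> (Physical query plan) -> Query Executor
      -> (Access method API calls) -> Storage Manager
      -> (File system API calls) -> OS
      -> (Storage system API calls) -> Disk(s)

    DBMS = Parser, Query Optimizer, Query Executor, Storage Manager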


    Course Outline

    40% of the class is about core DBMS concepts

    Query execution, query optimization, transactions, recovery, etc.

    Textbook material

    60% of the class is on what is happening today in data management

    New developments on textbook material

    Data streams

    Web search: Google, Yahoo!

    Data integration (structured data + unstructured data)

    Data mining

    Unsolved challenges


    Using a Traditional DBMS

    User/Application

    Loader

    Query Result

    Table R

    Table S

    Result

    Query


    New Approach for Data Streams

    User/Application

    Register Continuous Query (Standing Query)

    Stream Query Processor

    Input streams

    Result


    Example Continuous (Standing) Queries

    Web

    Amazon's best sellers over last hour

    Network Intrusion Detection

    Track HTTP packets with destination address matching a prefix in given table and content matching *\.ida

    Finance

    Monitor NASDAQ stocks between $20 and $200 that have moved down more than 2% in the last 20 minutes
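    The finance query can be sketched as a standing query over a stream of price ticks, using a sliding window per stock. Several things here are assumptions rather than part of the slides: the tick format `(time_in_seconds, symbol, price)`, the function name `monitor`, the sample symbols and prices, and the reading of "moved down more than 2%" as "current price is more than 2% below the oldest price still inside the 20-minute window".

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 20 * 60

def monitor(ticks, low=20.0, high=200.0, drop=0.02, window=WINDOW_SECONDS):
    """Yield (time, symbol, price) each time a stock matches the standing query.

    ticks must arrive in non-decreasing time order.
    """
    history = defaultdict(deque)  # symbol -> (time, price) pairs inside the window
    for t, symbol, price in ticks:
        h = history[symbol]
        h.append((t, price))
        # Expire ticks that have fallen out of the window.
        while h[0][0] < t - window:
            h.popleft()
        start_price = h[0][1]
        if low <= price <= high and price < start_price * (1 - drop):
            yield (t, symbol, price)

# Hypothetical input stream.
ticks = [
    (0, "AAA", 100.0), (0, "BBB", 10.0),
    (600, "AAA", 99.0), (600, "BBB", 9.0),   # BBB drops 10% but is below $20
    (900, "AAA", 97.5),                      # AAA is down 2.5% within 15 minutes
    (2400, "AAA", 97.0),                     # older AAA ticks have expired
]

alerts = list(monitor(ticks))
print(alerts)
```

    Unlike a traditional query, the function never finishes on its own: it keeps producing results for as long as input keeps arriving, and it only stores the ticks that are still inside the window.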


    New Challenges in DBMSs

    High-level Query Q

    DBMS

    Answer

    Data

    TeraBytes PetaBytes

    Empire B.

    Bob Dylan

    USA

    Columbia

    10.90


    Course Logistics

    Reference: Database Systems: The Complete Book, by H. Garcia-Molina, J. D. Ullman, and J. Widom

    Web site: http://www.cs.duke.edu/courses/fall07/cps216

    Grading:

    Project 30%

    Homework Assignments 20%

    Midterm 20%

    Final 30%


    Summary: Data Management is Important

    Core aspect of most sciences and engineering today

    Core need in industry

    Cool mix of theory and systems

    Chances are you will find something interesting even if your primary interest is elsewhere