Databases 2: Distributed and Deductive Databases

Upload: tuyet-hoa

Post on 04-Apr-2018


  • 7/30/2019 Co So Du Lieu 2 Phan Tan Va Suy Dien 6639

    1/117

    CONTENTS

    Preface
    PART 1. DISTRIBUTED DATABASES
    CHAPTER 1. OVERVIEW OF DISTRIBUTED DATABASES
      1.1. Distributed database systems
        1.1.1. Definition of a distributed database
        1.1.2. Main characteristics of distributed databases
        1.1.3. Purposes of using distributed databases
        1.1.4. Basic architecture of a distributed database
        1.1.5. Distributed database management systems
      1.2. Architecture of distributed database management systems
        1.2.1. Client/server systems
        1.2.2. Peer-to-peer distributed systems
    CHAPTER 2. DATA DISTRIBUTION METHODS
      2.1. Distributed database design
        2.1.1. Design strategies
      2.2. Design issues
        2.2.1. Reasons for fragmentation
        2.2.2. Types of fragmentation
        2.2.3. Horizontal fragmentation
      2.3. Vertical fragmentation
      2.5. Hybrid fragmentation
      2.6. Allocation
        2.6.1. The allocation problem
        2.6.2. Information requirements
        2.6.3. Allocation model
    CHAPTER 3. QUERY PROCESSING
      3.1. The query processing problem
      3.2. Query decomposition
      3.3. Localization of distributed data
      3.4. Optimization of distributed queries
        3.4.1. Search space
        3.4.2. Search strategy
        3.4.3. Distributed cost model
        3.4.4. Join ordering in fragment queries
    CHAPTER 4. TRANSACTION MANAGEMENT
      4.1. Concepts
      4.2. Basic locking model
      4.4. Timestamp-based concurrency control algorithms
    PART 2. DEDUCTIVE DATABASES
      2.1. General introduction
      2.2. Deductive databases
        2.2.1. The deductive database model
        2.2.2. Model theory for relational databases
        2.2.3. A view of deductive databases
        2.2.4. Transactions on deductive databases
      2.3. Logic-based databases
        2.3.4. Query structure
        2.3.5. Comparing DATALOG with relational algebra
        2.3.6. Expert database systems
      2.4. Some other issues


    Preface

    The first database systems were built on the hierarchical and network models. They appeared in the 1960s and are regarded as the first generation of database management systems (DBMSs).

    Next came the second generation, the relational DBMSs, built on the relational data model proposed by E.F. Codd in 1970.

    DBMSs aim to organize, access, and update large volumes of data conveniently, safely, and efficiently.

    The first two generations of DBMSs met the needs of agencies, enterprises, and business organizations for collecting and organizing data.

    However, the rapid development of communication technology and the strong expansion of the Internet, together with globalization in every field, especially commerce, gave rise to many new applications that must manage objects with complex structure (text, audio, images) and dynamic objects (programs, simulations). In the 1990s a third generation of DBMSs appeared: the object-oriented systems, which can support multimedia applications.

    In response to the need of information technology students for study materials and textbooks, particularly on distributed databases, deductive databases, and object-oriented databases, we have written this textbook for the course Databases 2.

    The purpose of the textbook Databases 2 is to present the basic concepts and algorithms of databases, including: data models and the corresponding database systems, database languages, storage organization and search, query processing and optimization, transaction management and concurrency control, and database design.

    In compiling it, we relied on the syllabus of the course as currently taught at universities in the country, while also trying to reflect some recent advances in database technology.

    The textbook Databases 2 is divided into two parts:

    Part 1: Distributed databases

    Part 2: Deductive databases

    Each chapter ends with a summary, review questions, and exercises to help students master the main content of the chapter and check their own level in solving the exercises.

    Despite our best efforts, the textbook certainly still has shortcomings. We look forward to receiving readers' comments so that the next edition can be more complete.

    Thai Nguyen, October 2009

    The authors


    PART 1

    DISTRIBUTED DATABASES

    CHAPTER 1. OVERVIEW OF DISTRIBUTED DATABASES

    As companies and enterprises become ever more widely distributed, the data involved in a problem is very large and cannot be centralized. First- and second-generation databases cannot solve problems in this new environment, which is decentralized, distributed, and parallel, with heterogeneous data and systems. The third generation of DBMSs, which emerged in the 1980s and includes distributed databases, responds to these new needs.

    1.1. Distributed database systems

    1.1.1. Definition of a distributed database

    A distributed database is a collection of multiple logically interrelated databases distributed over a computer network.

    - Distribution: The data of a distributed database does not reside in one place but is spread over many sites of a computer network. This distinguishes a distributed database from a single centralized database.

    - Logical correlation: The data of a distributed database has properties that tie it together. This distinguishes a distributed database from a collection of local databases or files that merely reside at different locations in a computer network.

    [Figure 1.1. Distributed database system environment: Sites 1 to 5 connected by a data communication network]

    In a distributed database system consisting of many sites, each site can run transactions that access data at many other sites.

    Example 1.1: Consider a bank with three branches at different locations. At each branch, a computer controls a number of teller terminals. Each computer, together with its local database at that branch, forms one site of the distributed database. The computers are connected to one another by a communication network.

    1.1.2. Main characteristics of distributed databases

    (1) Resource sharing

    Resource sharing in a distributed system takes place over the communication network. To share resources effectively, each resource must be managed by a program with a communication interface, so that resources can be accessed and updated reliably and consistently. Resource management here covers provisioning plans, naming resource classes, allowing resources to be accessed from one place to another, mapping resource names to communication addresses, and so on.

    (2) Openness

    The openness of a computer system is the ease with which its hardware (adding peripherals, memory, communication interfaces, ...) and software (operating system models, communication protocols, shared resource services, ...) can be extended.

    An open distributed system is one that can be built from many kinds of hardware and software from different vendors, provided that these components follow a common standard.

    The openness of a distributed system is judged by the degree to which new resource-sharing services can be added without breaking or duplicating existing services. Openness is achieved by specifying and clearly documenting the main interfaces of a system and making them available to software developers.

    The openness of a distributed system rests on providing a communication mechanism between processes and on publishing the interfaces used to access shared resources.

    (3) Concurrency

    A distributed system runs on a communication network with many computers, each of which may have one or more CPUs. If N processes exist at the same time, we say they execute concurrently. Processes are executed by time sharing (one CPU) or in parallel (several CPUs).

    Concurrent work in a distributed system arises in two situations:

    - Many users simultaneously issue commands or interact with application programs.

    - Many server processes run simultaneously, each responding to requests from different client processes.

    (4) Scalability

    A distributed system can operate well and efficiently at many different scales. The smallest distributed system may need only two workstations and a file server. Larger systems reach thousands of computers.

    Scalability is characterized by the system software and application software not having to change when the system is extended. Current distributed systems achieve this only to some degree. Scaling is not just a matter of extending the hardware and the network; it cuts across every aspect of distributed system design.

    (5) Fault tolerance

    The design of fault tolerance in computer systems is based on two approaches:

    - Using redundancy to ensure continuous and efficient operation.

    - Using recovery programs when a failure occurs.

    To build a system that tolerates failures in the first way, two computers are connected to run the same program, with one of them in standby mode (idle or waiting). This solution is expensive because the system hardware must be duplicated. A cheaper solution is for the individual servers that provide important applications to be able to replace one another when a failure occurs. When there is no failure the servers operate normally; when a failure occurs on one server, the client applications redirect themselves to the remaining servers.

    In the second approach, recovery software is designed so that the current state of the data (the state before the failure) can be restored once an error is detected.

    Distributed systems provide high availability to cope with hardware faults.

    (6) Transparency

    The transparency of a distributed system is understood as hiding the separate components of the system from users and application programmers.

    Location transparency: Users do not need to know the physical location of the data. They can access the database wherever it is located. Operations that retrieve or update data at a remote site are carried out automatically by the system at the site that issues the request; users need not know how the database is distributed over the network.

    Transparency of use: Moving part or all of the database because of organizational or administrative changes does not affect user operations.

    Fragmentation transparency: If the data is partitioned because of increased load, users are not affected.

    Replication transparency: If the data is replicated to reduce communication cost or to improve reliability, users need not know about it.

    (7) Reliability and consistency

    The system requires high reliability: the confidentiality of the data must be protected and the failure recovery functions must be guaranteed. Consistency is also very important: there must be no contradictions in the data content. Even when data attributes differ, operations must remain consistent.

    1.1.3. Purposes of using distributed databases

    Organizational and economic requirements: In practice many organizations are decentralized, their data keeps growing, and it serves many users at different locations. A distributed database therefore fits the natural structure of such organizations. This is one of the important factors driving the development of distributed databases.

    Interconnection of existing local databases: A distributed database is the natural solution when databases already exist and a global application needs to be built. In this case the distributed database is created bottom-up on the basis of the existing databases. This process requires restructuring the local databases to some degree. Even so, these changes are much smaller than building a completely new centralized database.

    Reduced total search cost: Distributing the data lets local working groups control all of their own data. At the same time, users can still access remote data when necessary. At each local site, the hardware can be chosen to suit the local data processing workload.

    Incremental growth: Organizations can grow by adding new units that are autonomous yet related to the other organizational units. A distributed database supports such flexible expansion with minimal impact on the existing units.

    Fast query response: Most data queries issued by users at any local site are satisfied immediately by data held at that site.

    Improved reliability and availability: If one component of the system fails, the system can still keep operating.

    Fast recovery: Data access does not depend on a single machine or a single network link. If any failure occurs, the system can automatically re-route through other links.

    1.1.4. Basic architecture of a distributed database

    This is not an explicit architecture for every distributed database, but it captures the organization of any distributed database.

    - Global schema: defines all the data stored in the distributed database. In the relational model, the global schema consists of the definitions of the set of global relations.

    - Fragmentation schema: Each global relation can be divided into several non-overlapping parts called fragments. There are several ways to perform this division. The (one-to-many) mapping between the global schema and the fragments is defined in the fragmentation schema.

    - Allocation schema: Fragments are logical parts of a global relation that are physically located at one or more sites of the network. The allocation schema defines which fragment is located at which sites. Note that the type of mapping defined in the allocation schema determines whether the distributed database is redundant or not.

    - Local mapping schema: maps the physical images to the objects stored at a site (all the fragments of one global relation at the same site form a physical image).
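    The schema levels listed above can be sketched as plain mappings. This is only an illustrative model: the relation ACCOUNT, its fragments, and the site names are assumptions made up for the example, not part of the textbook.

```python
# Hypothetical names throughout: ACCOUNT, ACCOUNT_1..3 and site1..3 are assumptions.
global_schema = {"ACCOUNT": ["acc_no", "branch", "balance"]}

# Fragmentation schema: one global relation maps to many non-overlapping fragments.
fragmentation_schema = {"ACCOUNT": ["ACCOUNT_1", "ACCOUNT_2", "ACCOUNT_3"]}

# Allocation schema: each fragment is placed at one or more sites.
allocation_schema = {
    "ACCOUNT_1": ["site1"],
    "ACCOUNT_2": ["site2", "site3"],  # stored at two sites, so it is replicated
    "ACCOUNT_3": ["site3"],
}


def is_redundant(allocation):
    """The database is redundant if some fragment is stored at more than one site."""
    return any(len(sites) > 1 for sites in allocation.values())


def physical_image(relation, site):
    """All fragments of one global relation that are stored at the given site."""
    return [
        fragment
        for fragment in fragmentation_schema[relation]
        if site in allocation_schema[fragment]
    ]


print(is_redundant(allocation_schema))
print(physical_image("ACCOUNT", "site3"))
```

    Whether the allocation mapping is one-to-one or one-to-many is exactly what decides redundancy in this model.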


    1.1.5. Distributed database management systems

    A distributed database management system (distributed DBMS) is defined as a software system that manages distributed databases (creating them and controlling access to them) and makes the distribution transparent to users.

    Transparency here means separating the high-level semantics of a system from low-level implementation issues. Data distribution is hidden from users, so they access the distributed database as if it were a centralized database. Changes in administration do not affect users.

    A distributed DBMS consists of the following software components (programs):

    - Programs that manage the distributed data

    - Programs that manage data communication

    - Programs that manage the local databases

    - Programs that manage the data dictionary

    To form a distributed database system (DDBS), the files must not only be logically related; they must also be structured and accessed through a common interface.

    [Figure 1.2. Basic architecture of a distributed database: global schema, fragmentation schema, allocation schema, local mapping schemas 1 and 2, the DBMS of site 1 and of site 2, the local databases at site 1 and site 2, and other sites]

    The environment of a distributed database system is one in which the data is distributed over a number of sites.

    1.2. Architecture of distributed database management systems

    1.2.1. Client/server systems

    Client/server DBMSs appeared in the early 1990s and have had a great influence on DBMS technology and on the way computing is done. The general idea is very simple: identify the functions that need to be provided and divide them into two classes, server functions and client functions. This gives a two-level architecture that makes it easier to manage the complexity of modern DBMSs and of data distribution.

    The server does most of the data management work. That is, all query processing and optimization, transaction management, and storage management are done at the server. The client, besides the application and the user interface, has a DBMS client module that is responsible for managing the data sent to the client; sometimes the management of transaction locks is also assigned to it. The architecture described in the figure below is very common in relational systems, where client and server communicate at the level of SQL statements. In other words, the client passes SQL queries to the server without trying to understand or optimize them. The server does most of the work and returns the result relation to the client.

    There are several kinds of client/server architecture. The simplest has one server accessed by many clients; we call this multiple clients, single server. A more complex architecture has several servers in the system (called multiple clients, multiple servers). In this case there are two management strategies: either each client manages its own connections to the servers, or each client knows only its "home" server and communicates with the other servers through it when needed. The first approach simplifies the server programs but places an extra burden, along with many other responsibilities, on the client machines. This leads to what are called self-service client systems. The second approach concentrates the data management functions at the servers, so the transparency of data access is provided through the server interface.

    From the viewpoint of data logic, client/server DBMSs provide the same view of the data as the peer-to-peer systems discussed in the next section. That is, they present the user with a single logical database, while at the physical level the data may be distributed. The main distinction between client/server and peer-to-peer systems is therefore not the level of transparency offered to users and applications, but the architectural model used to achieve that level of transparency.

    1.2.2. Peer-to-peer distributed systems

    The client/server model distinguishes the client (which requests services) from the server (which serves the requests). In the peer-to-peer processing model, by contrast, the participating systems have equal roles. Each can both request services from another system and act as a service provider. Ideally, the peer-to-peer computing model supports cooperative processing between applications that may run on different hardware or operating systems. The purpose of the peer-to-peer processing environment is to support networked databases, so that a DBMS user can access multiple heterogeneous databases.


    CHAPTER 2. DATA DISTRIBUTION METHODS

    2.1. Distributed database design

    2.1.1. Design strategies

    Top-down design process

    [Figure 2.1. Top-down design process: requirements analysis, system requirements (objectives), conceptual design and view design with user input, global conceptual schema, access information, external schema definitions, distribution design, local conceptual schemas, physical design, physical schema, monitoring and maintenance, feedback]

    Requirements analysis: defines the system environment and collects the data needs and processing needs of everyone who uses the database.

    View design: defines the interfaces for end users.

    Conceptual design: examines the enterprise as a whole in order to determine the entity types and the relationships among them.

    Distribution design: divides the relations into smaller relations called fragments and allocates them to sites.

    Physical design: maps the local conceptual schemas onto the physical storage devices available at the corresponding sites.

    Bottom-up design process

    Top-down design suits databases that are designed from scratch. In practice, however, a number of databases often already exist, and the design task is to integrate them into one database. The bottom-up approach suits this situation. Its starting point is the local conceptual schemas. The process consists of integrating the local schemas into a global conceptual schema.

    2.2. Design issues

    2.2.1. Reasons for fragmentation

    The views of applications are usually only a subset of a relation. The unit of access is therefore not the whole relation but subsets of it, so it is natural to treat subsets of relations as the unit of distribution.

    Decomposing a relation into fragments, each treated as a unit, allows several transactions to execute concurrently. Fragmenting relations also allows a single query to be executed in parallel by dividing it into a set of subqueries that operate on fragments. Fragmentation thus increases the level of concurrency and, with it, the throughput of the system.
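    A minimal sketch of the idea that one query can be split into subqueries over fragments. The fragments, the column names, and the selection condition are assumptions chosen for illustration; a thread pool stands in for separate sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed sample data: two horizontal fragments of a hypothetical relation.
fragments = [
    [{"id": 1, "budget": 150000}, {"id": 2, "budget": 135000}],
    [{"id": 3, "budget": 250000}, {"id": 4, "budget": 310000}],
]


def subquery(fragment):
    """The same selection, applied to one fragment only."""
    return [row for row in fragment if row["budget"] > 140000]


def parallel_query(frags):
    """Run one subquery per fragment concurrently and concatenate the results."""
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(subquery, frags))  # map preserves fragment order
    return [row for part in parts for row in part]


result = parallel_query(fragments)
print([row["id"] for row in result])
```

    The combined result is the same as running the selection on the unfragmented relation; only the unit of work changes.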

    2.2.2. Types of fragmentation

    Correctness rules of fragmentation

    We follow three rules during fragmentation. Together they ensure that the database undergoes no semantic change when it is fragmented.

    a) Completeness.

    If a relation instance R is decomposed into fragments R1, R2, ..., Rn, then each data item that can be found in R can also be found in one or more of the fragments Ri. This property is the same as the lossless decomposition property of normalization. It is important in fragmentation because it ensures that the data in R is mapped into the fragments without loss. Note that in horizontal fragmentation a data item is a tuple, whereas in vertical fragmentation it is an attribute.

    b) Reconstruction.

    If a relation instance R is decomposed into fragments R1, R2, ..., Rn, it must be possible to define a relational operator $\nabla$ such that

    $R = \nabla R_i, \quad \forall R_i \in F_R$

    The operator $\nabla$ differs for each kind of fragmentation; what matters is that it can be identified. The ability to reconstruct a relation from its fragments ensures that constraints defined on the data in the form of dependencies are preserved.

    c) Disjointness.

    If a relation R is horizontally decomposed into fragments R1, R2, ..., Rn and data item di is in fragment Rj, then it is not in any other fragment Rk (k ≠ j). This criterion ensures that the horizontal fragments are disjoint. If R is vertically decomposed, the primary key attributes must be repeated in every fragment. In vertical fragmentation, disjointness is therefore defined only on the attributes that are not part of the primary key.
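    For horizontal fragments the three rules can be checked mechanically. The sketch below assumes that tuples are hashable Python tuples, that fragments are sets, and that the reconstruction operator is set union; the sample relation is made up for the example.

```python
def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in at least one fragment."""
    return all(any(t in f for f in fragments) for t in relation)


def is_reconstructible(relation, fragments):
    """Reconstruction: for horizontal fragments the operator is union."""
    return set().union(*fragments) == set(relation)


def is_disjoint(fragments):
    """Disjointness: no tuple appears in two different fragments."""
    seen = set()
    for f in fragments:
        if seen & set(f):
            return False
        seen |= set(f)
    return True


# Assumed sample relation of (project id, location) tuples.
R = {("P1", "Montreal"), ("P2", "New York"), ("P3", "New York"), ("P4", "Paris")}

good = [{t for t in R if t[1] == loc} for loc in ("Montreal", "New York", "Paris")]
bad = [{("P1", "Montreal")}, {("P1", "Montreal"), ("P2", "New York")}]

print(is_complete(R, good), is_reconstructible(R, good), is_disjoint(good))
print(is_complete(R, bad), is_reconstructible(R, bad), is_disjoint(bad))
```

    The second fragmentation fails all three checks: two tuples are lost and one tuple is stored twice.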

    Information requirements

    One thing to note in distribution design is that a great many factors influence an optimal design. The logical organization of the database, the location of the applications, the way the applications access the database, and the characteristics of the computer system at each site all affect the distribution decisions. This makes the distribution problem very complicated to formulate.

    The information needed for distribution design can be divided into four categories:

    - Database information

    - Application information

    - Network information

    - Computer system information

    The last two categories are entirely quantitative in nature and are used in allocation models rather than in fragmentation algorithms.

    2.2.3. Horizontal fragmentation

    In this section we discuss the concepts related to horizontal fragmentation (horizontal distribution). Horizontal fragmentation partitions a relation r along its tuples, so each fragment is a subset of the tuples t of r. There are two basic horizontal fragmentation strategies:

    - Primary horizontal fragmentation of a relation is performed using predicates defined on that relation.

    - Derived horizontal fragmentation partitions a relation using predicates defined on another relation.

    The set of predicates therefore plays an important role in horizontal fragmentation. This section examines the algorithms that perform both kinds of horizontal fragmentation. First we state the information needed to carry it out.

    Information requirements of horizontal fragmentation

    a) Database information

    Database information concerns the global schema: the source relations and their sub-relations. In this context we need to know how the relations are connected to one another, whether by joins or by other operations. For derived fragmentation, where the predicates are defined on another relation, we usually use the entity-relationship model, because in this model relationships are drawn as directed links (arcs) between relations that are related to each other through a join.


    Example 1:

    Figure 2.2. Representing the relationships among relations by links.

    The figure above shows one way of representing the links between relations. Note that the direction of a link indicates a one-to-many relationship. For example, for each title there are many employees holding that title, so we draw a link from relation CT (pay) to relation NV (employee). Likewise, the many-to-many relationship between NV and DA (project) is represented by two links to relation PC (assignment).

    The relation at the tail of a link (the end without the arrowhead) is called the owner of the link, and the relation at the head of the link (the arrowhead end) is called the member.

    Example 2:

    Given link L1 of Figure 2.2, the owner and member functions have the following values:

    owner(L1) = CT

    member(L1) = NV

    The quantitative information required about the database is the cardinality of each relation R, that is, the number of tuples in R, denoted card(R).
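    The owner and member functions and card(R) can be modeled in a few lines. Only link L1 is spelled out in the text, so it is the only link in this sketch; the rows of CT are assumed sample data with translated titles.

```python
# Links as (owner, member) pairs. Only L1 is stated explicitly in the text.
links = {"L1": ("CT", "NV")}


def owner(link):
    """Relation at the tail of the link (the 'one' side)."""
    return links[link][0]


def member(link):
    """Relation at the head of the link (the 'many' side)."""
    return links[link][1]


def card(relation):
    """Cardinality: the number of tuples in the relation."""
    return len(relation)


# Assumed sample rows for CT as (title, salary) tuples.
CT = [
    ("Electrical engineer", 40000),
    ("Systems analyst", 34000),
    ("Mechanical engineer", 27000),
    ("Programmer", 24000),
]

print(owner("L1"), member("L1"), card(CT))
```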

    b) Application information

    For distribution, besides the quantitative information card(R), we also need basic qualitative information, which consists of the predicates used in queries. The amount of this information depends on the specific problem.

    [Content of Figure 2.2: relations CT (Title, Salary), NV (MNV, TenNV, Title), DA (MDA, TenDA, Budget, Location), and PC (MNV, MDA, Duty, Duration); link L1 from CT to NV, links L2 and L3 into PC]

    If it is not possible to analyze all the applications to determine these predicates, then at least the most important applications should be studied.

    We therefore determine the simple predicates. Given a relation R(A1, A2, ..., An), where Ai is an attribute defined over the domain D(Ai), or Di, a simple predicate P defined on R has the form:

    $P: A_i \; \theta \; Value$

    where $\theta \in \{=, <, \neq, \leq, >, \geq\}$ and Value is chosen from the domain of Ai ($Value \in D_i$).

    Thus, given the schema R and the domains Di, we can determine the set Pr of all simple predicates on R:

    $Pr = \{P : A_i \; \theta \; Value\}$

    In practice, however, we only need proper subsets of Pr.

    Example 3: For the relation DA (project),

    P1: TenDA = "Control equipment"

    P2: Budget ≤ 200000

    are simple predicates.

    We use Pri to denote the set of all simple predicates defined on relation Ri. The elements of Pri are denoted pij.

    Simple predicates are easy to handle, but queries often contain more complex predicates that are combinations of simple ones. One combination deserves particular attention: the minterm predicate, which is the conjunction of simple predicates. Since a Boolean expression can always be transformed into conjunctive normal form, using minterm predicates in a design algorithm loses no generality.

    Given a set Pri = {pi1, pi2, ..., pim} of simple predicates on relation Ri, the set of minterm predicates Mi = {mi1, mi2, ..., miz} is defined as:

    $M_i = \{ m_{ij} \mid m_{ij} = \bigwedge_{p_{ik} \in Pr_i} p^*_{ik} \}, \quad 1 \leq k \leq m, \; 1 \leq j \leq z$

    where $p^*_{ik} = p_{ik}$ or $p^*_{ik} = \neg p_{ik}$. Each simple predicate can therefore appear in a minterm predicate either in its natural form or in its negated form.


    Example 4:

    Consider relation CT:

    Title                  Salary
    Electrical engineer    40000
    Systems analyst        34000
    Mechanical engineer    27000
    Programmer             24000

    Below are some simple predicates that can be defined on CT.

    p1: Title = "Electrical engineer"

    p2: Title = "Systems analyst"

    p3: Title = "Mechanical engineer"

    p4: Title = "Programmer"

    p5: Salary ≤ 30000

    p6: Salary > 30000

    Below are some of the minterm predicates that can be defined from these simple predicates:

    m1: Title = "Electrical engineer" ∧ Salary ≤ 30000

    m2: Title = "Electrical engineer" ∧ Salary > 30000

    m3: ¬(Title = "Electrical engineer") ∧ Salary ≤ 30000

    m4: ¬(Title = "Electrical engineer") ∧ Salary > 30000

    m5: Title = "Programmer" ∧ Salary ≤ 30000

    m6: Title = "Programmer" ∧ Salary > 30000
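    The definition of minterm predicates can be turned into a small generator: for m simple predicates it produces the 2^m conjunctions in which each predicate appears either as is or negated. The sketch uses only two of the simple predicates above; the dictionary keys, the row layout, and the function name minterms are assumptions for illustration.

```python
from itertools import product

# Two simple predicates on CT, keyed by a readable label (labels are assumptions).
simple = {
    "Title = 'Electrical engineer'": lambda r: r["title"] == "Electrical engineer",
    "Salary <= 30000": lambda r: r["salary"] <= 30000,
}


def minterms(preds):
    """Yield (label, test) for every natural/negated combination of the predicates."""
    names = list(preds)
    for signs in product([True, False], repeat=len(names)):
        label = " AND ".join(
            name if sign else f"NOT({name})" for name, sign in zip(names, signs)
        )

        def test(row, signs=signs):
            # The row must satisfy each predicate exactly when its sign is natural.
            return all(preds[n](row) == s for n, s in zip(names, signs))

        yield label, test


for label, _ in minterms(simple):
    print(label)
```

    With these two predicates the four generated conjunctions correspond to m1 through m4, since NOT(Salary <= 30000) is the same condition as Salary > 30000.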

    Notes:

    + Negation is not always possible. For example, consider the two simple predicates Lower_bound ≤ A and A ≤ Upper_bound, that is, attribute A has a domain lying between a lower bound and an upper bound. Their complements,

    ¬(Lower_bound ≤ A) and ¬(A ≤ Upper_bound),

    cannot be defined: the values of A in these negations fall outside the domain of A.

    Alternatively, the two simple predicates can be written as Lower_bound ≤ A ≤ Upper_bound, whose complement ¬(Lower_bound ≤ A ≤ Upper_bound) cannot be defined. For this reason, when studying these issues we consider only simple equality predicates.

    => Not all minterm predicates can be defined.

    + Some of them may be meaningless given the semantics of relation CT (pay).

    Note also that m3 can be rewritten as:

    m3: Title ≠ "Electrical engineer" ∧ Salary ≤ 30000

    In terms of quantitative information about the applications, we need two sets of data.

    - Minterm selectivity: the number of tuples of the relation that would be accessed by a query specified by a given minterm predicate. For example, the selectivity of m1 in Example 4 is zero because no tuple in CT satisfies this predicate. The selectivity of m2 is 1. We denote the selectivity of a minterm mi by sel(mi).

    - Access frequency: the frequency with which applications access the data. If Q = {q1, q2, ..., qq} is the set of queries, acc(qi) denotes the access frequency of qi in a given period.

    Note that each minterm can be regarded as a query. We denote the access frequency of a minterm by acc(mi).
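    Minterm selectivity can be computed directly by counting tuples. The sketch below uses the four CT rows of Example 4 with translated title values; the dictionary layout and the helper name sel are assumptions for illustration.

```python
# CT rows from Example 4 (titles translated; column names are assumptions).
CT = [
    {"title": "Electrical engineer", "salary": 40000},
    {"title": "Systems analyst", "salary": 34000},
    {"title": "Mechanical engineer", "salary": 27000},
    {"title": "Programmer", "salary": 24000},
]

EE = "Electrical engineer"
minterm_preds = {
    "m1": lambda r: r["title"] == EE and r["salary"] <= 30000,
    "m2": lambda r: r["title"] == EE and r["salary"] > 30000,
    "m3": lambda r: r["title"] != EE and r["salary"] <= 30000,
    "m4": lambda r: r["title"] != EE and r["salary"] > 30000,
    "m5": lambda r: r["title"] == "Programmer" and r["salary"] <= 30000,
    "m6": lambda r: r["title"] == "Programmer" and r["salary"] > 30000,
}


def sel(minterm, relation):
    """Number of tuples of the relation that satisfy the minterm predicate."""
    return sum(1 for row in relation if minterm(row))


for name, pred in minterm_preds.items():
    print(name, sel(pred, CT))
```

    On this data sel(m1) is 0 and sel(m2) is 1, matching the values stated in the text.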

    Primary horizontal fragmentation

    Primary horizontal fragmentation is defined by a selection operation on the owner relations of a database schema. Given a relation R, its horizontal fragments are the Ri:

    $R_i = \sigma_{F_i}(R), \quad 1 \leq i \leq z$

    where Fi is the selection formula used to obtain fragment Ri. Note that if Fi is in conjunctive normal form, it is a minterm predicate (mi).


    Example 5: Consider relation DA

    MDA   TenDA                   Budget   Location
    P1    Instrumentation         150000   Montreal
    P2    Database development    135000   New York
    P3    CAD/CAM                 250000   New York
    P4    Maintenance             310000   Paris

    We can define horizontal fragments based on the project location. The resulting fragments are:

    $DA_1 = \sigma_{Location = \text{"Montreal"}}(DA)$

    $DA_2 = \sigma_{Location = \text{"New York"}}(DA)$

    $DA_3 = \sigma_{Location = \text{"Paris"}}(DA)$

    DA1

    MDA   TenDA                   Budget   Location
    P1    Instrumentation         150000   Montreal

    DA2

    MDA   TenDA                   Budget   Location
    P2    Database development    135000   New York
    P3    CAD/CAM                 250000   New York

    DA3

    MDA   TenDA                   Budget   Location
    P4    Maintenance             310000   Paris

    We can now define a horizontal fragment more precisely:

    A horizontal fragment Ri of relation R consists of all the tuples of R that satisfy the minterm predicate mi.
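    The fragmentation of Example 5 is a selection per location. A minimal sketch, assuming the rows of DA are stored as dictionaries with translated values; the helper name select is an assumption.

```python
# DA rows from Example 5 (values translated; the dictionary keys are assumptions).
DA = [
    {"MDA": "P1", "TenDA": "Instrumentation", "budget": 150000, "location": "Montreal"},
    {"MDA": "P2", "TenDA": "Database development", "budget": 135000, "location": "New York"},
    {"MDA": "P3", "TenDA": "CAD/CAM", "budget": 250000, "location": "New York"},
    {"MDA": "P4", "TenDA": "Maintenance", "budget": 310000, "location": "Paris"},
]


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


locations = ["Montreal", "New York", "Paris"]
# One fragment per location; loc=loc binds the current value inside each lambda.
fragments = {
    loc: select(DA, lambda row, loc=loc: row["location"] == loc) for loc in locations
}

for loc, rows in fragments.items():
    print(loc, [row["MDA"] for row in rows])
```

    Every tuple of DA lands in exactly one fragment, so the union of the three fragments gives back DA.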

    An important aspect of simple predicates is their completeness and minimality.

    - A set of simple predicates Pr is said to be complete if and only if every application has the same probability of accessing any tuple that belongs to any minterm fragment defined according to Pr.


    Example 6: Consider the fragmentation of relation DA given in Example 5. If the set Pr = {Location = "Montreal", Location = "New York", Location = "Paris", Budget ≤ 200000}, then Pr is not complete, because some tuples of DA are not accessed by the predicate Budget ≤ 200000. To make the set complete, we need to add the predicate Budget > 200000 to Pr. Then Pr = {Location = "Montreal", Location = "New York", Location = "Paris", Budget ≤ 200000, Budget > 200000} is complete, because every tuple is accessed by exactly two predicates p of Pr. Of course, if we remove any predicate from Pr, the remaining set is no longer complete.
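    The counting argument of Example 6 (every tuple is covered by exactly two predicates of the complete set) can be checked on the DA data. This only reproduces that argument; it is not a general test of completeness, which depends on how applications access the data. The row layout and the helper name coverage are assumptions.

```python
# DA rows from Example 5, reduced to the attributes the predicates use.
DA = [
    {"MDA": "P1", "budget": 150000, "location": "Montreal"},
    {"MDA": "P2", "budget": 135000, "location": "New York"},
    {"MDA": "P3", "budget": 250000, "location": "New York"},
    {"MDA": "P4", "budget": 310000, "location": "Paris"},
]

incomplete = [
    lambda r: r["location"] == "Montreal",
    lambda r: r["location"] == "New York",
    lambda r: r["location"] == "Paris",
    lambda r: r["budget"] <= 200000,
]
complete = incomplete + [lambda r: r["budget"] > 200000]


def coverage(relation, preds):
    """For each tuple, count how many predicates of the set it satisfies."""
    return {row["MDA"]: sum(1 for p in preds if p(row)) for row in relation}


print(coverage(DA, incomplete))
print(coverage(DA, complete))
```

    With the first set, P3 and P4 are reached only through their location predicate; after adding Budget > 200000 every tuple is covered twice.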

    L do cn phi m bo tnh y l v cc mnh thu c theo tp v t y

    s nht qun v mt logic do tt c chng u tho v t hi s cp. Chng cng

    ng nht v y v mt thng k theo cch m ng dng truy xut chng.

    V th chng ta s dng mt tp hp gm cc v t y lm c s ca phnmnh ngang nguyn thy.

    - c tnh th hai ca tp cc v t l tnh cc tiu. y l mt c tnh cm

    tnh. V t n gin phi c lin i (relevant) trong vic xc nh mt mnh. Mt v

    t khng tham gia vo mt phn mnh no th c th coi v t l tha. Nu tt c

    cc v t ca Pr u c lin i th Pr l cc tiu.

Example 7: The set Pr defined in Example 6 is complete and minimal. However, if we add the predicate TênDA = "Instrumentation" to Pr, the resulting set is no longer minimal, because the new predicate is not relevant with respect to Pr: it does not split any of the fragments already produced.

The notion of completeness is tied to the goals of the problem: the predicates must be complete with respect to the application requirements before those requirements can be met. The notion of minimality concerns optimizing storage and optimizing the operations over the set of queries. Given a set of predicates Pr, we can therefore obtain minimality by discarding the redundant predicates; the resulting set Pr' is minimal and, naturally, is still complete with respect to Pr.

The COM_MIN algorithm finds a complete and minimal set of predicates Pr' from Pr. We first adopt the following convention:

Rule 1: The basic rule of completeness and minimality, which states that a relation or fragment is partitioned into at least two parts that are accessed differently by at least one application.


Algorithm 1.1 COM_MIN

Input: R: relation; Pr: set of simple predicates;
Output: Pr': complete and minimal set of predicates;
Declare F: set of minterm fragments;

Begin
    Pr' := ∅; F := ∅;
    For each predicate p ∈ Pr, if p partitions R according to Rule 1 then
    Begin
        Pr' := Pr' ∪ p;
        Pr := Pr − p;
        F := F ∪ f;    {f is the minterm fragment according to p}
    End;    {the predicates that fragment R have been moved into Pr'}
    Repeat
        For each p ∈ Pr, if p partitions a fragment fk of Pr' according to Rule 1 then
        Begin
            Pr' := Pr' ∪ p;
            Pr := Pr − p;
            F := F ∪ f;
        End;
    Until Pr' is complete    {no p remains that partitions a fragment fk of Pr'}
    For each p ∈ Pr', if there is a p' such that p ⇔ p' then
    Begin
        Pr' := Pr' − p;
        F := F − f;
    End;
End. {COM_MIN}


Begin
    Pr' := COM_MIN(R, Pr);
    Determine the set M of minterm predicates;
    Determine the set I of implications among the pi ∈ Pr';
    For each mi ∈ M do
    Begin
        If mi is contradictory according to I then
            M := M − mi
    End;
End. {PHORIZONTAL}

Example 8: Consider the relation DA. Suppose there are two applications. The first is issued at three sites and finds the names and budgets of projects given their location. In SQL notation the query is:

SELECT TênDA, Ngân sách
FROM DA
WHERE Địa điểm = value

For this application, the simple predicates that can be used are:

p1: Địa điểm = "Montreal"
p2: Địa điểm = "New York"
p3: Địa điểm = "Paris"

The second application concerns project management: projects with a budget of at most 200,000 dollars are managed at one site, while those with a larger budget are managed at a second site. The simple predicates to be used for fragmenting according to the second application are therefore:

p4: Ngân sách ≤ 200000
p5: Ngân sách > 200000

Checking with the COM_MIN algorithm, the set Pr' = {p1, p2, p3, p4, p5} is clearly complete and minimal.

Based on Pr', we can define the following six minterm predicates that form M:


DA3
MãDA  TênDA                 Ngân sách  Địa điểm
P2    Database development  135000     New York

DA4
MãDA  TênDA    Ngân sách  Địa điểm
P3    CAD/CAM  250000     New York

DA6
MãDA  TênDA        Ngân sách  Địa điểm
P4    Maintenance  310000     Paris
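The fragments of Example 8 can be reproduced with a short sketch. It assumes that the implications among the predicates (a project has exactly one location, and the two budget predicates are complementary) leave six candidate minterms, one per (location, budget range) pair; the tuple layout and variable names are illustrative only.

```python
from itertools import product

# Relation DA as (MaDA, TenDA, NganSach, DiaDiem) tuples; names are illustrative.
DA = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database development", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
]

# Simple predicates p1..p3 (location) and p4, p5 (budget).
location_preds = {
    loc: (lambda t, loc=loc: t[3] == loc)
    for loc in ["Montreal", "New York", "Paris"]
}
budget_preds = {
    "<=200000": lambda t: t[2] <= 200000,
    ">200000": lambda t: t[2] > 200000,
}

# Each minterm is the conjunction of one location predicate and one budget predicate.
minterm_fragments = {}
for (loc, p_loc), (rng, p_bud) in product(location_preds.items(), budget_preds.items()):
    minterm_fragments[(loc, rng)] = [t[0] for t in DA if p_loc(t) and p_bud(t)]

# Empty fragments need not be stored.
non_empty = {m: f for m, f in minterm_fragments.items() if f}
for minterm, ids in non_empty.items():
    print(minterm, ids)
```

With the minterms numbered in this enumeration order, the second and fifth fragments are empty for the current data, which is why only DA1, DA3, DA4 and DA6 hold tuples.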

Derived horizontal fragmentation

A derived horizontal fragmentation is defined on the member relation of a link according to a selection operation specified on the owner relation of that link.

Thus, given a link L with owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as:

$R_i = R \ltimes S_i, \quad 1 \le i \le w$

where w is the number of fragments defined on R, and $S_i = \sigma_{F_i}(S)$, with $F_i$ the formula that defines the primary horizontal fragment $S_i$.


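Example 9: Consider the link L1 whose owner is the relation CT and whose member is the relation NV.

[Figure: link L1 from the owner relation CT(Chức vụ, Lương) to the member relation NV(MãNV, TênNV, Chức vụ)]

Here Chức vụ is the job title, Lương the salary, MãNV the employee ID, and TênNV the employee name.

NV
MãNV  TênNV      Chức vụ
E1    J. Doe     Electrical engineer
E2    M. Smith   Systems analyst
E3    A. Lee     Mechanical engineer
E4    J. Miller  Programmer
E5    B. Casey   Systems analyst
E6    L. Chu     Electrical engineer
E7    R. David   Mechanical engineer
E8    J. Jones   Systems analyst

We can then divide the employees into two groups according to salary: those earning at most 30,000 dollars and those earning more than 30,000 dollars. The two fragments NV1 and NV2 are defined as follows:

$NV_1 = NV \ltimes CT_1$

$NV_2 = NV \ltimes CT_2$

where

$CT_1 = \sigma_{\text{Lương} \le 30000}(CT)$

$CT_2 = \sigma_{\text{Lương} > 30000}(CT)$

CT1
Chức vụ              Lương
Mechanical engineer  27000
Programmer           24000

CT2
Chức vụ              Lương
Electrical engineer  40000
Systems analyst      34000

The derived horizontal fragmentation of the relation NV gives the following result:

NV1
MãNV  TênNV      Chức vụ
E3    A. Lee     Mechanical engineer
E4    J. Miller  Programmer
E7    R. David   Mechanical engineer

NV2
MãNV  TênNV     Chức vụ
E1    J. Doe    Electrical engineer
E2    M. Smith  Systems analyst
E5    B. Casey  Systems analyst
E6    L. Chu    Electrical engineer
E8    J. Jones  Systems analyst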
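The derivation of NV1 and NV2 can be sketched as a semijoin on the job title. The tuple layout, the English job titles, and the helper name `semijoin` are assumptions made for illustration.

```python
# Member relation NV as (MaNV, TenNV, ChucVu) and owner relation CT as (ChucVu, Luong).
NV = [
    ("E1", "J. Doe", "Electrical engineer"),
    ("E2", "M. Smith", "Systems analyst"),
    ("E3", "A. Lee", "Mechanical engineer"),
    ("E4", "J. Miller", "Programmer"),
    ("E5", "B. Casey", "Systems analyst"),
    ("E6", "L. Chu", "Electrical engineer"),
    ("E7", "R. David", "Mechanical engineer"),
    ("E8", "J. Jones", "Systems analyst"),
]
CT = [
    ("Electrical engineer", 40000),
    ("Systems analyst", 34000),
    ("Mechanical engineer", 27000),
    ("Programmer", 24000),
]


def semijoin(member, owner_fragment):
    """Keep the member tuples whose job title appears in the owner fragment."""
    titles = {title for title, _salary in owner_fragment}
    return [t for t in member if t[2] in titles]


# Primary fragments of the owner relation.
CT1 = [t for t in CT if t[1] <= 30000]
CT2 = [t for t in CT if t[1] > 30000]

# Derived fragments of the member relation.
NV1 = semijoin(NV, CT1)
NV2 = semijoin(NV, CT2)

print([t[0] for t in NV1])
print([t[0] for t in NV2])
```

The member relation is never tested against the salary directly; its fragments follow entirely from how the owner relation was fragmented.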

Notes:

+ To carry out a derived horizontal fragmentation we need three inputs:
1. The set of partitions of the owner relation (for example, CT1 and CT2).
2. The member relation.
3. The set of semijoin predicates between the owner and the member (for example, CT.Chức vụ = NV.Chức vụ).

+ A complication to keep in mind: in a database schema there are often several links into a relation R, so there may be more than one way to fragment R. The choice of fragmentation should be based on two criteria:
1. The fragmentation with better join characteristics.
2. The fragmentation used in more applications.

Applying these criteria, however, is not straightforward.

Example 10: We continue the distribution design of the database begun in Example 9, in which the relation NV was fragmented according to CT. Now consider the assignment relation PC. Suppose there are the following two applications:

1. Application 1: Find the names of the engineers who work at a given place. This application runs at all three sites and accesses the engineers of local projects with higher probability than those of projects at other locations.

2. Application 2: At each administrative site where employee records are managed, users want to access the projects these employees are working on and to know how long they will work on them.

Checking correctness

We now need to check the correctness of horizontal fragmentation.

a. Completeness


+ Primary horizontal fragmentation: provided the selection predicates are complete, the resulting fragmentation is also guaranteed to be complete. Because the fragmentation algorithm is based on a complete and minimal set of predicates Pr', completeness is guaranteed as long as no mistakes are made in defining Pr'.

+ Derived horizontal fragmentation: this case is slightly different. The main difficulty is that the predicate defining the fragmentation involves two relations. We first define the completeness rule formally.

Let R be the member relation of a link whose owner is the relation S, and let A be the join attribute between R and S. Then for each tuple t of R there must be a tuple t' of S such that

t.A = t'.A

This rule is known as an integrity constraint, or referential integrity. It ensures that every tuple in the fragments of the member relation also appears in the owner relation.

b. Reconstruction

A global relation is reconstructed from its fragments by the union operator, in both primary and derived horizontal fragmentation. Thus, for a relation R with fragmentation $F_R = \{R_1, R_2, \ldots, R_m\}$, we have

$R = \bigcup R_i, \quad \forall R_i \in F_R$

c. Disjointness

For primary fragmentation, disjointness is guaranteed as long as the minterm predicates that define the fragmentation are mutually exclusive. For derived fragmentation, disjointness can be guaranteed if the join graph is simple.
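The three correctness conditions can be checked mechanically on a concrete fragmentation. The sketch below is an illustration under simple assumptions: relations are lists of hashable tuples, and the function names are hypothetical.

```python
def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in some fragment."""
    return all(any(t in f for f in fragments) for t in relation)


def is_reconstructible(relation, fragments):
    """Reconstruction: the union of the fragments equals the relation."""
    union = set()
    for f in fragments:
        union |= set(f)
    return union == set(relation)


def is_disjoint(fragments):
    """Disjointness: no tuple belongs to two different fragments."""
    seen = set()
    for f in fragments:
        current = set(f)
        if seen & current:
            return False
        seen |= current
    return True


# DA as (MaDA, DiaDiem) pairs, fragmented by location as in Example 5.
DA = [("P1", "Montreal"), ("P2", "New York"), ("P3", "New York"), ("P4", "Paris")]
by_location = [
    [t for t in DA if t[1] == loc] for loc in ("Montreal", "New York", "Paris")
]

# A faulty fragmentation: Paris is missing and P2 is stored twice.
faulty = [[DA[0], DA[1]], [DA[1], DA[2]]]

print(is_complete(DA, by_location), is_reconstructible(DA, by_location), is_disjoint(by_location))
print(is_complete(DA, faulty), is_reconstructible(DA, faulty), is_disjoint(faulty))
```

The faulty fragmentation fails all three checks, which shows that the conditions are independent of how the fragments were produced.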

2.3. Vertical fragmentation

A vertical fragmentation of a relation R produces fragments R1, R2, ..., Rr, each of which contains a subset of the attributes of R together with the key of R. The purpose of vertical fragmentation is to partition a relation into a set of smaller relations so that many applications need to run on only one fragment. An optimal fragmentation is one that produces a fragmentation scheme minimizing the execution time of the applications that run on these fragments.

Vertical fragmentation is inherently more complicated than horizontal fragmentation, because the total number of possible vertical partitions is very large.

Finding optimal solutions to the vertical partitioning problem is therefore very difficult, and we have to resort to heuristics. We present two types of heuristic approach to the vertical fragmentation of global relations.


- Grouping: start by assigning each attribute to one fragment and, at each step, join some of the fragments until some criterion is satisfied. This technique was first proposed for centralized databases and was later used for distributed databases.

- Splitting: start with a relation and decide on a beneficial fragmentation based on the access behavior of the applications on the attributes.

Because vertical partitioning places in one fragment the attributes that are usually accessed together, we need a measure that defines the notion of "togetherness" more precisely. This measure is the affinity of attributes, which indicates how closely related the attributes are.

The main data requirement related to applications is their access frequencies. Let Q = {q1, q2, ..., qq} be the set of user queries (applications) that will run on the relation R(A1, A2, ..., An). Then, for each query qi and each attribute Aj, we associate an attribute usage value, denoted use(qi, Aj), defined as follows:

$$\mathrm{use}(q_i, A_j) = \begin{cases} 1 & \text{if attribute } A_j \text{ is referenced by query } q_i \\ 0 & \text{otherwise} \end{cases}$$

The use(qi, •) vectors for each application are easy to define if the designer knows the applications that will run on the database.

Example 11:

Consider the relation DA, and assume that the following applications run on it. In each case we also give the SQL specification.

q1: Find the budget of a project, given its project ID

SELECT Ngân sách
FROM DA
WHERE MãDA = value

q2: Find the names and budgets of all projects

SELECT TênDA, Ngân sách
FROM DA

q3: Find the names of the projects carried out in a given city

SELECT TênDA
FROM DA
WHERE Địa điểm = value

q4: Find the total project budget for each city

SELECT SUM(Ngân sách)
FROM DA
WHERE Địa điểm = value

From these four applications we can define the attribute usage values. For notational convenience, let A1 = MãDA, A2 = TênDA, A3 = Ngân sách, A4 = Địa điểm. The usage values are given as a matrix in which entry (i, j) denotes use(qi, Aj).

Attribute usage matrix for the four applications of Example 11:

      A1  A2  A3  A4
q1     1   0   1   0
q2     0   1   1   0
q3     0   1   0   1
q4     0   0   1   1

Attribute affinity

Attribute usage values are not sufficient as a basis for splitting and fragmentation, because they do not represent the weight of the application frequencies. The attribute affinity measure aff(Ai, Aj), which expresses the bond between two attributes of a relation according to how they are accessed by applications, is the quantity needed for the fragmentation problem.

We now build the formula that measures the affinity of two attributes Ai and Aj.

Let k be the number of fragments of R, that is, R = R1 ∪ ... ∪ Rk.

Let Q = {q1, q2, ..., qm} be the set of queries (the applications that run on R). Let Q(A, B) be the set of applications q of Q for which use(q, A) · use(q, B) = 1. In other words:

Q(A, B) = {q ∈ Q : use(q, A) = use(q, B) = 1}

For example, from the matrix above we see that Q(A1, A1) = {q1}, Q(A2, A2) = {q2, q3}, Q(A3, A3) = {q1, q2, q4}, Q(A4, A4) = {q3, q4}, Q(A1, A2) = ∅, Q(A1, A3) = {q1}, Q(A2, A3) = {q2}, ...
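The sets Q(A, B) can be derived mechanically from the usage matrix. A minimal sketch, in which the dictionary layout and the function name `Q` are illustrative choices:

```python
# Attribute usage matrix of Example 11: use[q][A] is 1 if query q references attribute A.
use = {
    "q1": {"A1": 1, "A2": 0, "A3": 1, "A4": 0},
    "q2": {"A1": 0, "A2": 1, "A3": 1, "A4": 0},
    "q3": {"A1": 0, "A2": 1, "A3": 0, "A4": 1},
    "q4": {"A1": 0, "A2": 0, "A3": 1, "A4": 1},
}


def Q(a, b):
    """Set of queries that reference both attribute a and attribute b."""
    return {q for q, row in use.items() if row[a] == 1 and row[b] == 1}


print(sorted(Q("A3", "A3")))
print(sorted(Q("A1", "A2")))
```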


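The affinity between two attributes $A_i$ and $A_j$ is defined as:

$$\mathrm{aff}(A_i, A_j) = \sum_{q_k \in Q(A_i, A_j)} \ \sum_{\forall R_l} \mathrm{ref}_l(q_k)\,\mathrm{acc}_l(q_k)$$

or, equivalently:

$$\mathrm{aff}(A_i, A_j) = \sum_{k \,\mid\, \mathrm{use}(q_k, A_i) = 1 \,\wedge\, \mathrm{use}(q_k, A_j) = 1} \ \sum_{\forall R_l} \mathrm{ref}_l(q_k)\,\mathrm{acc}_l(q_k)$$

where $\mathrm{ref}_l(q_k)$ is the number of accesses to the attributes $(A_i, A_j)$ for each execution of application $q_k$ at site $R_l$, and $\mathrm{acc}_l(q_k)$ is the access frequency of application $q_k$ to the attributes $A_i$, $A_j$ at site $l$. Note that only the applications q that use both $A_i$ and $A_j$ appear in the formula for $\mathrm{aff}(A_i, A_j)$.

The result of this computation is a symmetric n × n matrix, each element of which is one of the measures defined above. We call it the attribute affinity matrix (AA).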
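As a sketch of how this definition is applied, the following computes the attribute affinity matrix from the usage matrix of Example 11 and the access frequencies used in Example 12, under that example's assumption that every ref value equals 1. The list layout and the function name `affinity_matrix` are illustrative.

```python
# use[k][j] = 1 if query q(k+1) references attribute A(j+1) (Example 11).
use = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

# acc[k][l] = access frequency of query q(k+1) at site l+1 (Example 12).
acc = [
    [15, 20, 10],
    [5, 0, 0],
    [25, 25, 25],
    [3, 0, 0],
]


def affinity_matrix(use, acc):
    """aff(Ai, Aj) = sum over queries using both Ai and Aj of their total frequency.

    Assumes ref_l(q_k) = 1 for every query and site.
    """
    n_attrs = len(use[0])
    total = [sum(per_site) for per_site in acc]
    return [
        [
            sum(total[k] for k in range(len(use)) if use[k][i] and use[k][j])
            for j in range(n_attrs)
        ]
        for i in range(n_attrs)
    ]


AA = affinity_matrix(use, acc)
for row in AA:
    print(row)
```

The diagonal entry aff(Ai, Ai) is simply the total frequency of all queries that touch Ai.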

Example 12: We continue with Example 11. For simplicity, assume that $\mathrm{ref}_l(q_k) = 1$ for all $q_k$ and $R_l$. Let the application frequencies be:

acc1(q1) = 15    acc2(q1) = 20    acc3(q1) = 10
acc1(q2) = 5     acc2(q2) = 0     acc3(q2) = 0
acc1(q3) = 25    acc2(q3) = 25    acc3(q3) = 25
acc1(q4) = 3     acc2(q4) = 0     acc3(q4) = 0

The affinity between the attributes A1 and A3 is:

$$\mathrm{aff}(A_1, A_3) = \sum_{k=1}^{1} \sum_{t=1}^{3} \mathrm{acc}_t(q_k) = \mathrm{acc}_1(q_1) + \mathrm{acc}_2(q_1) + \mathrm{acc}_3(q_1) = 45$$

Computing the remaining pairs in the same way gives the following affinity matrix:

      A1  A2  A3  A4
A1    45   0  45   0
A2     0  80   5  75
A3    45   5  53   3
A4     0  75   3  78

The bond energy algorithm (BEA)


At this point we could already split R into fragments of attribute groups based on the affinity between attributes: for example, the affinity of A1 and A3 is 45, that of A2 and A4 is 75, while that of A1 and A2 is 0 and that of A3 and A4 is 3. However, a linear method that uses this matrix directly has attracted little interest in practice. We therefore consider a method based on the bond energy algorithm (BEA) of Hoffer and Severance (1975) and Navathe et al. (1984). The algorithm has the following properties:

1. It is designed specifically to determine groups of similar items, as opposed to a linear ordering of the items.

2. The resulting groupings are not affected by the order in which the items are presented to the algorithm.

3. The computation time of the algorithm is acceptable: $O(n^2)$, where n is the number of attributes.

4. Secondary interrelationships between the clustered attribute groups can be identified.

The BEA takes as input an attribute affinity matrix (AA), permutes its rows and columns, and produces a clustered affinity matrix (CA). The permutation is done so as to maximize the global affinity measure AM, where:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{aff}(A_i, A_j)\,[\mathrm{aff}(A_i, A_{j-1}) + \mathrm{aff}(A_i, A_{j+1}) + \mathrm{aff}(A_{i-1}, A_j) + \mathrm{aff}(A_{i+1}, A_j)]$$

with $\mathrm{aff}(A_0, A_j) = \mathrm{aff}(A_i, A_0) = \mathrm{aff}(A_{n+1}, A_j) = \mathrm{aff}(A_i, A_{n+1}) = 0$ for all i, j.

The last set of conditions covers the cases where an attribute is placed in CA to the left of the leftmost attribute or to the right of the rightmost attribute during column permutations, and above the topmost row or below the bottom row during row permutations. In these cases we take 0 as the affinity between the attribute under consideration and its left or right (top or bottom) neighbors, which are not yet present in CA.

The maximization function considers only the nearest neighbors, so it groups large values with large values and small values with small values. Because the attribute affinity matrix AA is symmetric, the function above reduces to:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{aff}(A_i, A_j)\,[\mathrm{aff}(A_i, A_{j-1}) + \mathrm{aff}(A_i, A_{j+1})]$$
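To make the reduced measure concrete, the sketch below evaluates AM for two column orderings of the affinity matrix of Example 12. The function name `global_affinity` and the zero-based attribute indices are assumptions for illustration.

```python
# Attribute affinity matrix AA of Example 12 (rows and columns A1..A4).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]


def global_affinity(aa, order):
    """AM for a column ordering, using the reduced (symmetric) formula.

    `order` lists zero-based attribute indices from left to right.
    Neighbors outside the matrix contribute an affinity of 0.
    """

    def aff(i, pos):
        # Affinity between attribute i and the attribute at column position pos.
        if pos < 0 or pos >= len(order):
            return 0
        return aa[i][order[pos]]

    return sum(
        aff(i, pos) * (aff(i, pos - 1) + aff(i, pos + 1))
        for i in range(len(aa))
        for pos in range(len(order))
    )


original = global_affinity(AA, [0, 1, 2, 3])   # A1 A2 A3 A4
clustered = global_affinity(AA, [0, 2, 1, 3])  # A1 A3 A2 A4
print(original, clustered)
```

Placing A3 next to A1 and A4 next to A2 raises the measure considerably, which is the effect the permutation step is meant to achieve.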

The clustered affinity matrix (CA) is generated in three steps:

Step 1: Initialization

Place and fix one of the columns of AA in CA. For example, columns 1 and 2 are chosen in this algorithm.

Step 2: Iteration

Take each of the remaining n − i columns in turn (where i is the number of columns already placed in CA) and try to place it in each of the i + 1 available positions in the CA matrix. Choose the placement that gives the largest global affinity AM. Continue until no columns remain to be placed.

Step 3: Row ordering

Once the column ordering has been determined, the rows are also rearranged so that their relative positions match the relative positions of the columns.

Algorithm BEA

Input: AA - attribute affinity matrix;
Output: CA - clustered affinity matrix after rearranging the rows and columns;

Begin
    {Initialization: recall that AA is an n × n matrix}
    CA(•, 1) ← AA(•, 1)
    CA(•, 2) ← AA(•, 2)
    index := 3

    while index


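To understand the algorithm we need to know the function cont(*, *, *). Recall that the global affinity measure AM was defined as:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{aff}(A_i, A_j)\,[\mathrm{aff}(A_i, A_{j-1}) + \mathrm{aff}(A_i, A_{j+1})]$$

which can be rewritten as:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} [\mathrm{aff}(A_i, A_j)\,\mathrm{aff}(A_i, A_{j-1}) + \mathrm{aff}(A_i, A_j)\,\mathrm{aff}(A_i, A_{j+1})]$$

$$= \sum_{j=1}^{n} \Big[ \sum_{i=1}^{n} \mathrm{aff}(A_i, A_j)\,\mathrm{aff}(A_i, A_{j-1}) + \sum_{i=1}^{n} \mathrm{aff}(A_i, A_j)\,\mathrm{aff}(A_i, A_{j+1}) \Big]$$

We define the bond between two attributes $A_x$ and $A_y$ as:

$$\mathrm{bond}(A_x, A_y) = \sum_{z=1}^{n} \mathrm{aff}(A_z, A_x)\,\mathrm{aff}(A_z, A_y)$$

AM can then be written as:

$$AM = \sum_{j=1}^{n} [\mathrm{bond}(A_j, A_{j-1}) + \mathrm{bond}(A_j, A_{j+1})]$$

Now consider the following n attributes:

A1 A2 ... Ai-1 | Ai Aj | Aj+1 ... An

where A1, A2, ..., Ai-1 form the group with measure AM' and Aj+1, ..., An form the group with measure AM''.

The global affinity measure for these attributes can be written as:

$$AM_{old} = AM' + AM'' + \mathrm{bond}(A_{i-1}, A_i) + \mathrm{bond}(A_i, A_j) + \mathrm{bond}(A_j, A_i) + \mathrm{bond}(A_{j+1}, A_j)$$

$$= \sum_{l=1}^{i} [\mathrm{bond}(A_l, A_{l-1}) + \mathrm{bond}(A_l, A_{l+1})] + \sum_{l=i+2}^{n} [\mathrm{bond}(A_l, A_{l-1}) + \mathrm{bond}(A_l, A_{l+1})] + 2\,\mathrm{bond}(A_i, A_j)$$

Now consider placing a new attribute $A_k$ between the attributes $A_i$ and $A_j$ in the clustered affinity matrix. The new global affinity measure can be written similarly as:

$$AM_{new} = AM' + AM'' + \mathrm{bond}(A_i, A_k) + \mathrm{bond}(A_k, A_i) + \mathrm{bond}(A_k, A_j) + \mathrm{bond}(A_j, A_k) = AM' + AM'' + 2\,\mathrm{bond}(A_i, A_k) + 2\,\mathrm{bond}(A_k, A_j)$$

The bonds with $A_{i-1}$ and $A_{j+1}$ are the same before and after the insertion, so they cancel in the difference. The net contribution to the global affinity measure of placing attribute $A_k$ between $A_i$ and $A_j$ is therefore:

$$\mathrm{cont}(A_i, A_k, A_j) = AM_{new} - AM_{old} = 2\,\mathrm{bond}(A_i, A_k) + 2\,\mathrm{bond}(A_k, A_j) - 2\,\mathrm{bond}(A_i, A_j)$$

At the boundaries, $\mathrm{bond}(A_0, A_k) = 0$. Likewise, if attribute $A_k$ is placed to the right of the rightmost attribute, no attribute has yet been placed in column k+1 of the CA matrix, so $\mathrm{bond}(A_k, A_{k+1}) = 0$.

Example 13: Consider the matrix given in Example 12 and compute the contribution of moving attribute A4 between attributes A1 and A2, given by the formula: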


(a)
      A1  A2
A1    45   0
A2     0  80
A3    45   5
A4     0  75

(b)
      A1  A3  A2
A1    45  45   0
A2     0   5  80
A3    45  53   5
A4     0   3  75

(c)
      A1  A3  A2  A4
A1    45  45   0   0
A2     0   5  80  75
A3    45  53   5   3
A4     0   3  75  78

(d)
      A1  A3  A2  A4
A1    45  45   0   0
A3    45  53   5   3
A2     0   5  80  75
A4     0   3  75  78
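The column ordering shown in these matrices can be reproduced with a compact sketch of the bond energy procedure. It fixes the first two columns, then inserts each remaining column at the position with the largest net contribution. The function names and the use of `None` for a missing neighbor are illustrative choices, and ties are broken in favor of the leftmost position, which is an assumption.

```python
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]


def bond(aa, x, y):
    """bond(Ax, Ay) = sum over z of aff(Az, Ax) * aff(Az, Ay); 0 at the borders."""
    if x is None or y is None:
        return 0
    return sum(aa[z][x] * aa[z][y] for z in range(len(aa)))


def cont(aa, left, mid, right):
    """Net contribution of placing `mid` between `left` and `right`."""
    return 2 * bond(aa, left, mid) + 2 * bond(aa, mid, right) - 2 * bond(aa, left, right)


def bea_order(aa):
    """Return the clustered column order as zero-based attribute indices."""
    order = [0, 1]  # initialization: columns 1 and 2 are placed first
    for k in range(2, len(aa)):
        best_pos, best_gain = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else None
            right = order[pos] if pos < len(order) else None
            gain = cont(aa, left, k, right)
            if best_gain is None or gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, k)
    return order


order = bea_order(AA)
# Rows are permuted in the same way as the columns.
CA = [[AA[i][j] for j in order] for i in order]
print([f"A{i + 1}" for i in order])
for row in CA:
    print(row)
```

For this matrix the sketch places A3 between A1 and A2 and then A4 at the right end, giving the order A1, A3, A2, A4 of matrix (d).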

In the figure above we can see the creation of two clusters: one in the upper left corner, containing the smaller affinity values, and the other in the lower right corner, containing the larger affinity values. This clustering indicates how the attributes of DA should be split. In general, however, the border between the splits is not completely clear-cut. When the CA matrix is large, more clusters are usually formed and more partitionings are possible. The problem therefore needs to be approached more systematically.

The partitioning algorithm

The purpose of splitting is to find sets of attributes that are accessed together, or for the most part, by distinct sets of applications. Consider the clustered attribute matrix:

[Figure: clustered affinity matrix CA with rows and columns A1, A2, ..., An; a split point between Ai and Ai+1 separates the upper left block TA from the lower right block BA]

Output: F: set of fragments;

Begin
    {determine the z value for the first column}
    {the subscripts in the cost equations indicate the split point}
    compute CTQ(n-1)
    compute CBQ(n-1)
    compute COQ(n-1)
    best ← CTQ(n-1) * CBQ(n-1) − (COQ(n-1))^2
    do {determine the best partitioning}
    begin
        for i from n−2 to 1 by −1 do
        begin
            compute CTQ(i)
            compute CBQ(i)
            compute COQ(i)
            z ← CTQ(i) * CBQ(i) − (COQ(i))^2
            if z > best then
            begin
                best ← z
                record the split point within the shift
            end-if
        end-for
        call SHIFT(CA)
    end-begin
    until no more SHIFT is possible
    Reconstruct the matrix according to the shift position
    R1 ← Π_TA(R) ∪ K    {K is the set of primary key attributes of R}
    R2 ← Π_BA(R) ∪ K
    F ← {R1, R2}
End. {PARTITION}

Applied to the CA matrix obtained for the relation DA, the result is the fragmentation F_DA = {DA1, DA2}, where DA1 = {A1, A3} and DA2 = {A1, A2, A4}. Hence

DA1 = {MãDA, Ngân sách}

DA2 = {MãDA, TênDA, Địa điểm}

(here MãDA is the key attribute of DA)
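The choice of split point can be sketched as follows. The sketch assumes the usual reading of the cost terms: CTQ is the total access frequency of the queries that use only attributes of the top block TA, CBQ that of the queries that use only the bottom block BA, and COQ that of the queries that use both. It evaluates z = CTQ * CBQ − COQ² for each split point of the order A1, A3, A2, A4 and omits the SHIFT step. The names `use`, `acc`, and `split_quality` are illustrative.

```python
# Attributes referenced by each query (Example 11) and total access
# frequencies summed over the three sites (Example 12).
use = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}
acc = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

# Column order of the clustered affinity matrix.
order = ["A1", "A3", "A2", "A4"]


def split_quality(order, split):
    """z = CTQ * CBQ - COQ^2 for a split after the first `split` columns."""
    top, bottom = set(order[:split]), set(order[split:])
    ctq = sum(acc[q] for q, attrs in use.items() if attrs <= top)
    cbq = sum(acc[q] for q, attrs in use.items() if attrs <= bottom)
    coq = sum(acc[q] for q, attrs in use.items() if attrs & top and attrs & bottom)
    return ctq * cbq - coq ** 2


scores = {split: split_quality(order, split) for split in range(1, len(order))}
best_split = max(scores, key=scores.get)
print(scores)
print(order[:best_split], order[best_split:])
```

Under these assumptions the best split separates {A1, A3} from {A2, A4}; adding the key A1 to the second group gives the two fragments stated above.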

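Checking correctness:

Completeness: guaranteed by the PARTITION algorithm, because each attribute of the global relation is assigned to one of the fragments.

Reconstruction: for a relation R with vertical fragmentation $F_R = \{R_1, R_2, \ldots, R_r\}$ and key attributes K,

$R = \bowtie_K R_i, \quad \forall R_i \in F_R$

Thus, provided each $R_i$ is complete, the join operation reconstructs R correctly. An important point is that each fragment $R_i$ must contain the key attributes of R.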
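Reconstruction by a join on the key can be illustrated with the two vertical fragments of DA. The ASCII attribute names and the helpers `project` and `join_on_key` are assumptions made for this sketch.

```python
DA = [
    {"MaDA": "P1", "TenDA": "Instrumentation", "NganSach": 150000, "DiaDiem": "Montreal"},
    {"MaDA": "P2", "TenDA": "Database development", "NganSach": 135000, "DiaDiem": "New York"},
    {"MaDA": "P3", "TenDA": "CAD/CAM", "NganSach": 250000, "DiaDiem": "New York"},
    {"MaDA": "P4", "TenDA": "Maintenance", "NganSach": 310000, "DiaDiem": "Paris"},
]


def project(relation, attrs):
    """Relational projection onto the given attributes."""
    return [{a: t[a] for a in attrs} for t in relation]


def join_on_key(r1, r2, key):
    """Join two fragments on their shared key attribute."""
    index = {t[key]: t for t in r2}
    return [{**t, **index[t[key]]} for t in r1 if t[key] in index]


# Both fragments keep the key MaDA.
DA1 = project(DA, ["MaDA", "NganSach"])
DA2 = project(DA, ["MaDA", "TenDA", "DiaDiem"])

rebuilt = join_on_key(DA1, DA2, "MaDA")
print(rebuilt == DA)
```

If the key were dropped from either fragment, there would be no attribute on which to match the tuples of the two fragments, which is why every vertical fragment must carry the key.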

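2.5. Hybrid fragmentation

In most cases a simple horizontal or vertical fragmentation of a database schema is not sufficient to satisfy the requirements of the applications. In that case a vertical fragmentation may be applied after a horizontal one, or vice versa, producing a tree-structured partitioning. Because the two strategies are applied one after the other, this alternative is called hybrid fragmentation.

[Figure: hybrid fragmentation tree. R is split horizontally (H) into R1 and R2; R1 is split vertically (V) into R11 and R12, and R2 is split vertically (V) into R21, R22 and R23]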

2.6. Allocation

2.6.1 The allocation problem

Assume that there is a set of fragments F = {F1, F2, ..., Fn} and a network consisting of the sites S = {S1, S2, ..., Sm}, on which a set of applications Q = {q1, q2, ..., qq} is running.

The allocation problem is to find an optimal distribution of F over S.

Optimality can be defined with respect to two measures:

- Minimal cost: the cost function consists of the cost of storing each fragment Fi at site Sj, the cost of querying Fi at site Sj, the cost of updating Fi at all the sites where it is stored, and the cost of data communication. The allocation problem therefore attempts to find an allocation scheme that minimizes a combined cost function.

- Performance: the allocation strategy is designed to maintain a performance level, namely to minimize the response time and to maximize the system throughput at each site.

In general, the allocation problem is complex: it is NP-complete. Research has therefore been devoted to finding good heuristics that provide near-optimal solutions.

2.6.2 Information requirements

At the allocation stage we need quantitative information about the database, about the applications that run on it, about the network structure, and about the processing capacity and storage limits of each site on the network.

Database information

The selectivity of a fragment Fj with respect to a query qi is the number of tuples of Fj that need to be accessed in order to process qi. This value is denoted $sel_i(F_j)$.

The size of a fragment Fj is given by

size(Fj) = card(Fj) * length(Fj)

where length(Fj) is the length (in bytes) of a tuple of fragment Fj.

Application information

Two important measures are the number of read accesses that a query qi makes to a fragment Fj during each execution (denoted RRij), and the corresponding number of update accesses (URij). These may, for example, count the number of block accesses required by the query.

We define two matrices UM and RM, with elements uij and rij respectively, specified as follows:


$$u_{ij} = \begin{cases} 1 & \text{if query } q_i \text{ updates fragment } F_j \\ 0 & \text{otherwise} \end{cases}$$

$$r_{ij} = \begin{cases} 1 & \text{if query } q_i \text{ retrieves from fragment } F_j \\ 0 & \text{otherwise} \end{cases}$$

A vector O of values o(i) is also defined, where o(i) specifies the originating site of query qi.

Site information

For each site we need to know its storage and processing capacity. These values can be computed by means of suitable functions or by simple estimates.

+ The unit cost of storing data at site Sk is denoted USCk.

+ The cost measure LPCk is the cost of processing one unit of work at site Sk. The unit of work must be the same as that of the RR and UR measures.

Network information

We assume a simple network in which gij denotes the communication cost per frame between the two sites Si and Sj. To be able to compute the number of messages, we use fsize as the size (in bytes) of one frame.

2.6.3. Allocation model

The allocation model aims to minimize the total cost of processing and storage while trying to meet the response time requirements. Our model has the following form:

min(Total Cost)

subject to the response time constraint, the storage constraint, and the processing constraint.

The decision variable xij is defined as

$$x_{ij} = \begin{cases} 1 & \text{if fragment } F_i \text{ is stored at site } S_j \\ 0 & \text{otherwise} \end{cases}$$


Total cost

The total cost function has two components: query processing and storage. It can therefore be expressed as:

$$TOC = \sum_{\forall q_i \in Q} QPC_i + \sum_{\forall S_k \in S} \ \sum_{\forall F_j \in F} STC_{jk}$$

where $QPC_i$ is the query processing cost of application $q_i$, and $STC_{jk}$ is the cost of storing fragment $F_j$ at site $S_k$.

Consider the storage cost first. It is given by

$$STC_{jk} = USC_k * size(F_j) * x_{jk}$$

The query processing cost is harder to specify. Most models of the file allocation problem (FAP) separate it into two components: the read-only processing cost and the update processing cost. Here we take a different approach in the model of the database allocation problem (DAP) and specify it as the cost of processing the query, consisting of the processing cost PC and the transmission cost TC. The query processing cost QPC for application qi is therefore

$$QPC_i = PC_i + TC_i$$

The processing component PC consists of three cost factors: the access cost AC, the integrity enforcement cost IE, and the concurrency control cost CC:

$$PC_i = AC_i + IE_i + CC_i$$

The detailed specification of each cost factor depends on the algorithms used to accomplish these tasks. For illustration, however, we specify AC in detail:

$$AC_i = \sum_{\forall S_k \in S} \ \sum_{\forall F_j \in F} (u_{ij} * UR_{ij} + r_{ij} * RR_{ij}) * x_{jk} * LPC_k$$

The first two terms in this formula compute the number of accesses of query qi to fragment Fj. Note that (URij + RRij) is the total number of read and update accesses. We assume that the local costs of processing them are identical. The summation gives the total number of accesses for all the fragments referenced by qi. Multiplying by LPCk gives the cost of this access at site Sk. We again use xjk to select only the cost values for the sites where the fragments are stored.
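A small numeric sketch may help in reading the storage and access cost formulas. All input values below are hypothetical (two queries, two fragments, two sites), chosen only to exercise the formulas; the function names are likewise illustrative.

```python
# Hypothetical inputs: indices i = query, j = fragment, k = site.
size = [1000, 500]        # size(Fj) in bytes
USC = [1, 2]              # unit storage cost at site Sk
LPC = [1, 2]              # cost of one unit of processing at site Sk
x = [[1, 0], [1, 1]]      # x[j][k] = 1 if fragment Fj is stored at site Sk
u = [[1, 0], [0, 1]]      # u[i][j] = 1 if query qi updates fragment Fj
r = [[1, 1], [0, 1]]      # r[i][j] = 1 if query qi reads fragment Fj
UR = [[2, 0], [0, 1]]     # update accesses of qi to Fj per execution
RR = [[10, 5], [0, 4]]    # read accesses of qi to Fj per execution

SITES = range(len(LPC))
FRAGMENTS = range(len(size))


def storage_cost(j, k):
    """STC_jk = USC_k * size(F_j) * x_jk."""
    return USC[k] * size[j] * x[j][k]


def access_cost(i):
    """AC_i = sum over sites and fragments of (u*UR + r*RR) * x * LPC."""
    return sum(
        (u[i][j] * UR[i][j] + r[i][j] * RR[i][j]) * x[j][k] * LPC[k]
        for k in SITES
        for j in FRAGMENTS
    )


total_storage = sum(storage_cost(j, k) for k in SITES for j in FRAGMENTS)
print(total_storage, access_cost(0), access_cost(1))
```

Note that, as the formula is written, the access cost is charged at every site that stores a copy of a referenced fragment.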

An important point should be made here. The access cost function assumes that processing a query involves decomposing it into a set of subqueries, each operating on a fragment stored at a site, followed by the transmission of the results back to the site where the query originated.

The integrity enforcement cost factor can be specified much like the processing component, except that the unit local processing cost must change to reflect the true cost of enforcing integrity.

The transmission cost function can be formulated along the lines of the access cost function. However, the data transmission overheads for update and for read-only requests are quite different. In update queries it is necessary to inform all the sites where replicas exist, whereas in read-only queries it is sufficient to access only one of the copies. In addition, at the end of an update request no data is transmitted back to the originating site other than a confirmation message, whereas a read-only query may result in a significant amount of data transmission.

The update component of the transmission function is:

$$TCU_i = \sum_{\forall S_k \in S} \ \sum_{\forall F_j \in F} u_{ij} * x_{jk} * g_{o(i),k} + \sum_{\forall S_k \in S} \ \sum_{\forall F_j \in F} u_{ij} * x_{jk} * g_{k,o(i)}$$

The first term is for sending the update message from the originating site o(i) of qi to all the replicas that must be updated. The second term is for the confirmation.

The read-only cost component can be specified as:

$$TCR_i = \sum_{\forall F_j \in F} \ \min_{S_k \in S} \Big( r_{ij} * x_{jk} * g_{o(i),k} + r_{ij} * x_{jk} * \frac{sel_i(F_j) * length(F_j)}{fsize} * g_{k,o(i)} \Big)$$

The first term in TCR represents the cost of transmitting the read-only request to the sites that hold copies of the fragments to be accessed. The second term accounts for transmitting the results from these sites to the originating site. The equation states that, among all the sites holding copies of the same fragment, only the site that yields the minimum total transmission cost is selected to execute the operation.

The transmission cost function for query qi can now be specified as:

$$TC_i = TCU_i + TCR_i$$
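The transmission cost terms can be sketched in the same style. All values are hypothetical. The sketch reads the read-only term as a minimum over the sites that actually hold a copy of the fragment, which is an interpretation of the formula rather than something the text states explicitly, and it assumes every fragment that is read is stored at one site or more.

```python
# Hypothetical inputs: indices i = query, j = fragment, k = site.
g = [[0, 4], [4, 0]]       # g[a][b]: cost per frame from site a to site b
o = [1, 0]                 # o[i]: originating site of query qi
x = [[1, 0], [1, 1]]       # x[j][k] = 1 if fragment Fj is stored at site Sk
u = [[1, 0], [0, 1]]       # u[i][j] = 1 if query qi updates fragment Fj
r = [[1, 1], [0, 1]]       # r[i][j] = 1 if query qi reads fragment Fj
sel = [[20, 10], [0, 8]]   # sel_i(Fj): tuples of Fj accessed by qi
length = [50, 100]         # length(Fj): bytes per tuple
fsize = 100                # frame size in bytes

SITES = range(2)
FRAGMENTS = range(2)


def tcu(i):
    """Update component: update messages to every replica plus confirmations."""
    send = sum(u[i][j] * x[j][k] * g[o[i]][k] for k in SITES for j in FRAGMENTS)
    confirm = sum(u[i][j] * x[j][k] * g[k][o[i]] for k in SITES for j in FRAGMENTS)
    return send + confirm


def tcr(i):
    """Read-only component: for each fragment read, use the cheapest replica."""
    total = 0
    for j in FRAGMENTS:
        if r[i][j]:
            frames = sel[i][j] * length[j] / fsize
            total += min(
                g[o[i]][k] + frames * g[k][o[i]] for k in SITES if x[j][k]
            )
    return total


def tc(i):
    """TC_i = TCU_i + TCR_i."""
    return tcu(i) + tcr(i)


print(tcu(0), tcr(0), tc(0), tc(1))
```

In this example query q1 originates at the second site, reads F2 from its local copy at no transmission cost, and must fetch F1 from the first site.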

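Constraints

The response time constraint should be specified as:

execution time of qi ≤ maximum response time of qi, ∀qi ∈ Q

It is preferable to specify the cost measure of the function in terms of time, because this simplifies the specification of the execution time constraint.

The storage constraint is:

$$\sum_{\forall F_j \in F} STC_{jk} \le \text{storage capacity at site } S_k, \quad \forall S_k \in S$$

while the processing constraint is:

$$\sum_{\forall q_i \in Q} \text{processing load of } q_i \text{ at site } S_k \le \text{processing capacity of } S_k, \quad \forall S_k \in S$$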


CHAPTER 3. QUERY PROCESSING

The context chosen here is relational calculus and relational algebra. As we have seen, distributed relations are implemented by means of fragments. Database design plays a very important role in query processing, because fragments are defined with the aim of increasing locality of reference and, sometimes, of enabling parallel execution of the most important queries. The role of a distributed query processor is to map a high-level query on a distributed database into a sequence of relational algebra operations on the fragments. Several important functions characterize this mapping. First, the query must be decomposed into a sequence of relational operations called an algebraic query. Second, the data accessed must be localized, so that the operations on relations are translated into operations on local data (the fragments). Finally, the algebraic query on fragments must be extended with communication operations and optimized so that the cost function is minimized. The cost function refers to computing resources such as disk I/O, CPU, and the communication network.

3.1. The query processing problem

Two basic optimization approaches are used in query processors: algebraic transformation and cost estimation.

The algebraic transformation approach simplifies queries by means of algebraic transformations that lower the cost of answering the query, independently of the actual data and of its physical structure.

The main task of a relational query processor is to transform a high-level query into an equivalent lower-level query expressed in relational algebra. The low-level query actually implements the execution strategy. The transformation must achieve both correctness and efficiency. A transformation is correct if the low-level query has the same semantics as the original query, that is, if both produce the same result. A query may have many equivalent transformations into relational algebra. Because the equivalent execution strategies can use very different amounts of computer resources, the main difficulty is to select the strategy that minimizes resource consumption.
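To illustrate how two equivalent strategies can differ in resource use, the sketch below evaluates one query ("names of employees who manage a project", as in the example that follows) in two ways on small hypothetical relations. The data, the tuple layout, and the function names are assumptions; the size of the largest intermediate result stands in for cost.

```python
# Hypothetical data: NV as (MaNV, TenNV), PC as (MaNV, MaDA, NhiemVu).
NV = [("E1", "J. Doe"), ("E2", "M. Smith"), ("E3", "A. Lee")]
PC = [
    ("E1", "P1", "Manager"),
    ("E2", "P1", "Analyst"),
    ("E2", "P2", "Analyst"),
    ("E3", "P3", "Consultant"),
    ("E3", "P4", "Engineer"),
]


def strategy_product(nv, pc):
    """Cartesian product first, then selection, then projection."""
    pairs = [(e, a) for e in nv for a in pc]
    selected = [(e, a) for e, a in pairs if e[0] == a[0] and a[2] == "Manager"]
    return sorted({e[1] for e, _ in selected}), len(pairs)


def strategy_join(nv, pc):
    """Selection on PC first, then join with NV, then projection."""
    managers = [a for a in pc if a[2] == "Manager"]
    joined = [(e, a) for a in managers for e in nv if e[0] == a[0]]
    return sorted({e[1] for e, _ in joined}), len(managers)


names_a, intermediate_a = strategy_product(NV, PC)
names_b, intermediate_b = strategy_join(NV, PC)
print(names_a, intermediate_a)
print(names_b, intermediate_b)
```

Both strategies return the same names, but the first one materializes every pair of tuples before filtering, while the second only carries the selected assignments into the join.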

Example 3.1:

Consider the following subset of the database schema:

NV(MãNV, TênNV, Chức vụ)

PC(MãNV, MãDA, Nhiệm vụ, Thời gian)

where Nhiệm vụ is the responsibility and Thời gian the duration of an assignment, and the following simple query:

"Find the names of the employees who are currently managing a project"

The query expressed in relational calculus, using SQL syntax, is:

SELECT TênNV
FROM NV, PC
WHERE NV.MãNV = PC.MãNV
AND Nhiệm vụ = "Manager"

Two equivalent relational algebra expressions that are correct transformations of this query are:

$\Pi_{\text{TênNV}}(\sigma_{\text{Nhiệm vụ} = \text{"Manager"} \,\wedge\, NV.\text{MãNV} = PC.\text{MãNV}}(NV \times PC))$

and

ΠTênNV(NV ⋈ …


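ΠTênNV(NV ⋈ […] "E3"(NV)

$PC_1 = \sigma_{\text{MãNV} \le \text{"E3"}}(PC)$

$PC_2 = \sigma_{\text{MãNV} > \text{"E3"}}(PC)$

The fragments PC1, PC2, NV1 and NV2 are stored at sites 1, 2, 3 and 4 respectively, and the result is expected at site 5.

An arrow from site i to site j labeled R indicates that the relation R is transferred from site i to site j. Strategy A exploits the fact that the relations NV and PC are fragmented in the same way in order to perform the selection and join operations in parallel. Strategy B centralizes all the data at the result site before processing the query.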

[Figure 4.1a: Strategy A. Result = NV1 ∪ NV2 …]

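where $p_{ij}$ is a simple predicate. In contrast, a qualification in disjunctive normal form is as follows:

$(p_{11} \wedge p_{12} \wedge \ldots \wedge p_{1n}) \vee \ldots \vee (p_{m1} \wedge p_{m2} \wedge \ldots \wedge p_{mn})$

The transformation of quantifier-free predicates is straightforward using the equivalence rules for the logical operations (∧, ∨, ¬):

1. $p_1 \wedge p_2 \Leftrightarrow p_2 \wedge p_1$
2. $p_1 \vee p_2 \Leftrightarrow p_2 \vee p_1$
3. $p_1 \wedge (p_2 \wedge p_3) \Leftrightarrow (p_1 \wedge p_2) \wedge p_3$
4. $p_1 \vee (p_2 \vee p_3) \Leftrightarrow (p_1 \vee p_2) \vee p_3$
5. $p_1 \wedge (p_2 \vee p_3) \Leftrightarrow (p_1 \wedge p_2) \vee (p_1 \wedge p_3)$
6. $p_1 \vee (p_2 \wedge p_3) \Leftrightarrow (p_1 \vee p_2) \wedge (p_1 \vee p_3)$
7. $\neg(p_1 \wedge p_2) \Leftrightarrow \neg p_1 \vee \neg p_2$
8. $\neg(p_1 \vee p_2) \Leftrightarrow \neg p_1 \wedge \neg p_2$
9. $\neg(\neg p) \Leftrightarrow p$

In disjunctive normal form, the query can be processed as independent conjunctive subqueries linked by unions (corresponding to the disjunctions).

Remark: the disjunctive normal form is rarely used, because it leads to replicated join and selection predicates. The conjunctive normal form is the one commonly used in practice.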

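Example 3.3:

Find the names of the employees who have been working on project P1 for 12 or 24 months.

The query expressed in SQL is:

SELECT TênNV
FROM NV, PC
WHERE NV.MãNV = PC.MãNV
AND PC.MãDA = "P1"
AND (Thời gian = 12 OR Thời gian = 24)

The qualification in conjunctive normal form is:

$NV.\text{MãNV} = PC.\text{MãNV} \,\wedge\, PC.\text{MãDA} = \text{"P1"} \,\wedge\, (\text{Thời gian} = 12 \vee \text{Thời gian} = 24)$

while the qualification in disjunctive normal form is:

$(NV.\text{MãNV} = PC.\text{MãNV} \,\wedge\, PC.\text{MãDA} = \text{"P1"} \,\wedge\, \text{Thời gian} = 12) \,\vee\, (NV.\text{MãNV} = PC.\text{MãNV} \,\wedge\, PC.\text{MãDA} = \text{"P1"} \,\wedge\, \text{Thời gian} = 24)$

In the latter form, processing the two conjunctions independently can involve redundant work if the common subexpressions are not eliminated.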
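That the two normal forms of this qualification are equivalent can be checked with a truth table. In the sketch below each simple predicate is replaced by a Boolean variable: `j` for the join predicate, `p` for the project predicate, and `d12` and `d24` for the two duration predicates. These names are illustrative.

```python
from itertools import product


def conjunctive(j, p, d12, d24):
    """Conjunctive normal form: j AND p AND (d12 OR d24)."""
    return j and p and (d12 or d24)


def disjunctive(j, p, d12, d24):
    """Disjunctive normal form: (j AND p AND d12) OR (j AND p AND d24)."""
    return (j and p and d12) or (j and p and d24)


def unparenthesized(j, p, d12, d24):
    """Same predicates without parentheses: AND binds tighter than OR."""
    return j and p and d12 or d24


assignments = list(product([False, True], repeat=4))
forms_agree = all(conjunctive(*v) == disjunctive(*v) for v in assignments)
parentheses_matter = any(conjunctive(*v) != unparenthesized(*v) for v in assignments)
print(forms_agree, parentheses_matter)
```

The second check shows why the disjunction must be grouped explicitly: without parentheses, a tuple with a duration of 24 would qualify even if the join and project predicates failed.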

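Analysis

Query analysis makes it possible to reject normalized queries for which further processing is either impossible or unnecessary. The main reasons for rejection are that the query is type incorrect or semantically incorrect.

- A query is type incorrect if any of its attribute or relation names is not defined in the global schema, or if it applies operations to attributes of the wrong type. For example, in the following query MaDA is not an attribute of NV, and the name TenNV is compared with a number:

SELECT MaDA
FROM NV
WHERE TenNV > 200

- A query is semantically incorrect if some of its components do not contribute in any way to the generation of the result.

If the query does not contain disjunction or negation, we can use a query graph. Such queries involve selection, join, and projection.

- Representation by a query graph:

+ One node represents the result relation.

+ The other nodes represent the operand relations.

+ An edge between two nodes that are not result nodes represents a join.

+ An edge whose destination node is the result represents a projection.

+ A node that is not the result may be labeled with a selection predicate or a self-join predicate.

- Join graph: an important subgraph of the query graph, in which only the joins are considered.

Example 3.4:

PC(MaNV, MaDA, NV, Tgian)

NV(MaNV, TênNV, CV)

DA(MaDA, TênDA, Kphí, Đđiểm)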


    where TnNV=Mai

    Query rewriting

    This step is divided into two substeps:

    (1) Transform the query from relational calculus into relational algebra.
    (2) Restructure the algebraic query to improve performance.

    For clarity, we represent the relational algebra query graphically as an operator tree. An operator tree is a tree in which each leaf node is a relation stored in the database and each non-leaf node is an intermediate relation produced by a relational algebra operator. The sequence of operations runs from the leaves to the root, which represents the answer to the query.

    A tuple relational calculus query can be transformed into an operator tree easily, as follows. First, the leaves are created; in SQL they are immediately available in the FROM clause. Second, the root node is created as a projection onto the result attributes, which are found in the SELECT clause of the SQL query. Third, the qualification (the WHERE clause in SQL) is translated into the appropriate sequence of relational operations (selection, join, union, and so on) going from the leaves to the root. This sequence can be given directly by the order in which the predicates and operators appear.
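    The three steps can be sketched in code. This is a minimal illustration, not the textbook's algorithm: the `Node` class, `build_tree` and `show` are hypothetical helpers, the caller supplies the join and selection predicates already separated, and the ASCII attribute names (TenNV, TenDA, Thoigian) are assumed spellings of the attributes of the relations DA, PC and NV.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                      # "relation", "join", "select" or "project"
    arg: str                     # relation name, predicate or attribute list
    children: list = field(default_factory=list)

def build_tree(from_relations, join_predicates, selection_predicates, select_attrs):
    # Step 1: one leaf per relation of the FROM clause.
    leaves = [Node("relation", name) for name in from_relations]
    # Step 3: translate the WHERE clause, in order of appearance.
    tree = leaves[0]
    for leaf, predicate in zip(leaves[1:], join_predicates):
        tree = Node("join", predicate, [tree, leaf])
    for predicate in selection_predicates:
        tree = Node("select", predicate, [tree])
    # Step 2: the root is a projection onto the SELECT attributes.
    return Node("project", ", ".join(select_attrs), [tree])

def show(node, depth=0):
    # Return the tree as indented lines, root first.
    lines = ["  " * depth + f"{node.op}: {node.arg}"]
    for child in node.children:
        lines += show(child, depth + 1)
    return lines

tree = build_tree(
    ["DA", "PC", "NV"],
    ["PC.MDA = DA.MDA", "PC.MNV = NV.MNV"],
    ["TenNV <> 'J.Doe'", "TenDA = 'CAD/CAM'", "Thoigian = 12 OR Thoigian = 24"],
    ["TenNV"],
)
print("\n".join(show(tree)))
```

    The tree produced this way is correct but not efficient: all selections sit above the joins, which is what the restructuring substep is meant to fix.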

    Example 3.7:

    Query: find the names of the employees other than J.Doe who worked on the CAD/CAM project for either one or two years.

    The SQL expression is:

    SELECT TenNV
    FROM DA, PC, NV
    WHERE PC.MNV = NV.MNV
    AND PC.MDA = DA.MDA
    AND TenNV <> 'J.Doe'
    AND DA.TenDA = 'CAD/CAM'
    AND (Thoigian = 12 OR Thoigian = 24)

    This query can be mapped to the tree shown in the figure below.


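    [Figure: operator tree for Example 3.7. From root to leaves: projection Π_{TenNV}; selections σ_{Thoigian = 12 ∨ Thoigian = 24}, σ_{TenDA = 'CAD/CAM'} and σ_{TenNV ≠ 'J.Doe'}; joins ⋈_{MDA} and ⋈_{MNV}; leaves DA, PC and NV. The levels are labeled Projection, Selection and Join.]

    By applying transformation rules, many different trees can be shown to be equivalent to the one produced by the method described above. The six most useful equivalence rules, which concern the basic relational algebra operators, are listed below.

    R, S and T are relations, where R is defined over the attributes A = {A1, A2, ..., An} and S is defined over the attributes B = {B1, B2, ..., Bn}.

    1. Commutativity of binary operators

    R × S ⇔ S × R

    R ⋈ S ⇔ S ⋈ R

    This rule also applies to union, but not to set difference or semijoin.

    2. Associativity of binary operators

    (R × S) × T ⇔ R × (S × T)

    (R ⋈ S) ⋈ T ⇔ R ⋈ (S ⋈ T)

    3. Idempotence of unary operators

    If R is defined over the attribute set A, and A′ ⊆ A, A″ ⊆ A and A′ ⊆ A″, then

    Π_{A′}(Π_{A″}(R)) ⇔ Π_{A′}(R)

    σ_{p1(A1)}(σ_{p2(A2)}(R)) ⇔ σ_{p1(A1) ∧ p2(A2)}(R)

    where pi is a predicate applied to attribute Ai.

    4. Commuting selection with projection

    Π_{A1,...,An}(σ_{p(Ap)}(R)) ⇔ Π_{A1,...,An}(σ_{p(Ap)}(Π_{A1,...,An,Ap}(R)))

    Note that if Ap is already a member of {A1, A2, ..., An}, the last projection on {A1, A2, ..., An} on the right-hand side of the equivalence has no effect.

    5. Commuting selection with binary operators

    σ_{p(Ai)}(R × S) ⇔ (σ_{p(Ai)}(R)) × S

    σ_{p(Ai)}(R ⋈_{p(Aj,Bk)} S) ⇔ (σ_{p(Ai)}(R)) ⋈_{p(Aj,Bk)} S

    σ_{p(Ai)}(R ∪ T) ⇔ σ_{p(Ai)}(R) ∪ σ_{p(Ai)}(T)

    6. Commuting projection with binary operators

    If C = A′ ∪ B′, where A′ ⊆ A and B′ ⊆ B, and A and B are the attribute sets of relations R and S respectively, we have

    Π_C(R × S) ⇔ Π_{A′}(R) × Π_{B′}(S)

    Π_C(R ⋈_{p(Ai,Bj)} S) ⇔ Π_{A′}(R) ⋈_{p(Ai,Bj)} Π_{B′}(S)

    Π_C(R ∪ S) ⇔ Π_C(R) ∪ Π_C(S)

    In the union case R and S are union compatible, so the same attribute set C is projected on both operands.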

    These rules can be used to restructure the tree systematically so that bad trees are eliminated. A simple restructuring algorithm uses a single heuristic: apply the unary operations (selection and projection) as early as possible in order to reduce the size of the intermediate relations.
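    The effect of this heuristic can be seen on a toy instance. The sketch below assumes relations stored as lists of dictionaries and two hypothetical helpers, `join` and `select`; the sample tuples are invented for illustration. It applies a selection once after the join and once before it (commuting selection with a binary operator) and compares the results.

```python
def join(r, s, attr):
    # Equijoin of two lists of dict tuples on a common attribute.
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]

def select(r, pred):
    # Selection: keep the tuples that satisfy the predicate.
    return [t for t in r if pred(t)]

NV = [
    {"MaNV": "E1", "TenNV": "J.Doe"},
    {"MaNV": "E2", "TenNV": "Mai"},
    {"MaNV": "E3", "TenNV": "Lan"},
]
PC = [
    {"MaNV": "E1", "MaDA": "P1"},
    {"MaNV": "E2", "MaDA": "P1"},
    {"MaNV": "E2", "MaDA": "P2"},
    {"MaNV": "E3", "MaDA": "P3"},
]

def is_mai(t):
    return t["TenNV"] == "Mai"

late = select(join(NV, PC, "MaNV"), is_mai)    # selection applied after the join
early = join(select(NV, is_mai), PC, "MaNV")   # selection pushed below the join

print(late == early)
print(len(join(NV, PC, "MaNV")), len(select(NV, is_mai)))
```

    Both plans return the same tuples, but the early selection feeds one tuple into the join instead of producing a four-tuple intermediate relation first.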

    Restructuring the tree in the figure above yields the tree in the following figure. The result is good in the sense that repeated access to the same relation is avoided and the most selective operations are performed first.


    3.3. Localization of distributed data

    The data localization layer is responsible for translating an algebraic query on global relations into an algebraic query on physical fragments. Localization uses the information stored in the fragmentation schema.

    This layer determines which fragments are involved in the query and transforms the distributed query into a query on fragments. Generating the fragment query is done in two steps. First, the distributed query is mapped into a fragment query by replacing each distributed relation with its reconstruction program. Second, the fragment query is simplified and restructured to produce a good query. Simplification and restructuring can be carried out according to the

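    [Figure: the operator tree of Example 3.7 after restructuring. The selections σ_{TenDA = 'CAD/CAM'} on DA, σ_{Thoigian = 12 ∨ Thoigian = 24} on PC and σ_{TenNV ≠ 'J.Doe'} on NV are applied directly above the leaves, followed by the projections Π_{MDA}, Π_{MDA, MNV} and Π_{MNV, TenNV}; the joins on MDA and MNV, the intermediate projection Π_{MDA, TenNV} and the final projection Π_{TenNV} complete the tree.]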

    Where MaNV=E5

    Reduction for joins

    - Joins on horizontally fragmented relations can be simplified when the joined relations are fragmented according to the join attribute.

    - The simplification consists of distributing joins over unions and then eliminating the useless joins.

    (r1 ∪ r2) ⋈ s = (r1 ⋈ s) ∪ (r2 ⋈ s)

    NV3 = σ_{MaNV > 'E6'}(NV)

    PC (MaNV, MaDA, NV, Tg)

    PC1 = σ_{MaNV ≤ 'E3'}(PC)

    PC2 = σ_{MaNV > 'E3'}(PC)

    Query:

    Select *
    From NV, PC
    Where NV.MaNV = PC.MaNV
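    Which fragment joins survive the reduction can be decided from the fragment predicates alone. The sketch below is a hedged illustration: it assumes employee numbers are compared as integers, that NV is split at E3 and E6 into NV1, NV2 and NV3 (the NV1 and NV2 boundaries are an assumption, since the source only shows the last NV predicate), and that PC is split at E3. The helper names are hypothetical.

```python
import math

# Each fragment holds the tuples with low < employee number <= high.
NV_FRAGMENTS = {"NV1": (-math.inf, 3), "NV2": (3, 6), "NV3": (6, math.inf)}
PC_FRAGMENTS = {"PC1": (-math.inf, 3), "PC2": (3, math.inf)}

def overlaps(a, b):
    # Two half-open ranges (low, high] can share a value only if they overlap.
    return max(a[0], b[0]) < min(a[1], b[1])

def reduced_joins(left, right):
    # Distribute the join over the unions, then drop pairs whose
    # predicates contradict each other (the join would be empty).
    return [
        (l_name, r_name)
        for l_name, l_range in left.items()
        for r_name, r_range in right.items()
        if overlaps(l_range, r_range)
    ]

print(reduced_joins(NV_FRAGMENTS, PC_FRAGMENTS))
```

    Under these assumptions only three of the six fragment joins remain, and they can be executed in parallel.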

    [Figure: reduction for horizontal fragmentation with a selection. In the generic query the selection σ_{MaNV = 'E5'} is applied to NV1 ∪ NV2 ∪ NV3; in the reduced query it is applied to NV2 only.]

    Reduction for vertical fragmentation

    Vertical fragmentation distributes a relation based on projection attributes. The localization program for a vertically fragmented relation consists of the join of the fragments on their common attribute.

    Example 3.10:

    NV (MaNV, TenNV, CV)

    NV1 = Π_{MaNV, TenNV}(NV)

    NV2 = Π_{MaNV, CV}(NV)

    Localization program: NV = NV1 ⋈_{MaNV} NV2
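    As a small check of this localization program, the sketch below builds the two vertical fragments of an invented NV instance (the sample tuples are assumptions) and joins them back on MaNV. It also shows why a query that needs only TenNV can skip NV2 entirely.

```python
# NV(MaNV, TenNV, CV) as a set of tuples; the data is illustrative only.
NV = {
    ("E1", "J.Doe", "Engineer"),
    ("E2", "Mai", "Analyst"),
    ("E3", "Lan", "Programmer"),
}

# Vertical fragments: both keep the key MaNV.
NV1 = {(ma_nv, ten_nv) for ma_nv, ten_nv, _ in NV}
NV2 = {(ma_nv, cv) for ma_nv, _, cv in NV}

# Localization program: join the fragments on the common attribute MaNV.
rebuilt = {
    (ma_nv, ten_nv, cv)
    for ma_nv, ten_nv in NV1
    for ma_nv2, cv in NV2
    if ma_nv == ma_nv2
}

# A projection on TenNV is answered from NV1 alone; the join with NV2 is useless.
names = {ten_nv for _, ten_nv in NV1}

print(rebuilt == NV)
print(sorted(names))
```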


    Reduction for hybrid fragmentation

    Goal: to support efficiently the queries that involve projection, selection and join.

    A query on hybrid fragments can be reduced by combining the rules used for primary horizontal fragmentation, vertical fragmentation and derived horizontal fragmentation.

    Rules:

    1/ Remove the empty relations generated by contradicting selections on horizontal fragments.

    2/ Remove the useless relations generated by projections on vertical fragments.

    3/ Distribute joins over unions in order to isolate and remove the useless joins.

    Example 3.13:

    NV (MaNV, TenNV, CV)

    NV1 = σ_{MaNV ≤ 'E4'}(Π_{MaNV, TenNV}(NV))

    NV2 = σ_{MaNV > 'E4'}(Π_{MaNV, TenNV}(NV))

    NV3 = Π_{MaNV, CV}(NV)

    Localization program: NV = (NV1 ∪ NV2) ⋈_{MaNV} NV3


    3.4. Distributed query optimization

    In this section we introduce the optimization process in general, regardless of whether the environment is distributed or centralized. The query to be optimized is assumed to be expressed in relational algebra over database relations (which may be fragments), after it has been rewritten from its relational calculus expression.

    Query optimization refers to the process of producing a query execution plan (QEP) that represents the execution strategy for the query. The chosen plan must minimize a cost function. The query optimizer, the software module that performs the optimization, is usually seen as consisting of three components: a search space, a cost model and a search strategy (see Figure 1.4.4). The search space is the set of execution plans that represent the query. These plans are equivalent in the sense that they produce the same result, but they differ in the order in which the operations are executed and in the way these operations are implemented, and therefore in performance. The search space is obtained by applying transformation rules, such as the relational algebra rules described in the section on query rewriting. The cost model predicts the cost of a given execution plan. To be accurate, the cost model must have the necessary knowledge about the distributed execution environment. The search strategy explores the search space and selects the best plan according to the cost model. It defines which plans are examined and in which order. The details of the environment (centralized or distributed) are captured by the search space and the cost model.

    3.4.1. Search space

    Query execution plans are usually abstracted by operator trees, which define the order in which the operations are executed. They are enriched with additional information, such as the best algorithm chosen for each operation. For a given query, the search space can be defined as the set of equivalent operator trees that can be obtained by applying transformation rules. To highlight the characteristics of the query optimizer, we usually concentrate on join trees, which are operator trees whose operators are joins or Cartesian products. The reason is that permutations of the join order have the most important effect on the performance of relational queries.


    Example 3.14:

    Consider the following query:

    SELECT ENAME
    FROM EMP, ASG, PROJ
    WHERE EMP.ENO = ASG.ENO
    AND ASG.PNO = PROJ.PNO

    The following figure shows three equivalent join trees for this query, obtained by using the associativity of binary operators. Each of these trees can be assigned a cost based on the cost of each operator. Join tree (c), which starts with a Cartesian product, may have a much higher cost than the other two trees.

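    [Figure: the query optimization process. The input query goes through search space generation, which uses the transformation rules and produces equivalent QEPs; the search strategy, using the cost model, then selects the best QEP.]

    [Figure: three equivalent join trees. (a) (EMP ⋈_{ENO} ASG) ⋈_{PNO} PROJ; (b) (ASG ⋈_{PNO} PROJ) ⋈_{ENO} EMP; (c) (PROJ × EMP) ⋈_{ENO, PNO} ASG.]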

    For a complex query (one that involves many relations and many operators), the number of equivalent operator trees can be very high. For example, the number of alternative join trees that can be produced by applying the commutativity and associativity rules is O(N!) for N relations. Investigating a large search space may make the optimization time prohibitive, sometimes even more expensive than the actual execution time. Query optimizers therefore usually restrict the size of the search space they consider. The first restriction is to use heuristics. The most common heuristic is to perform selections and projections when accessing the base relations. Another common heuristic is to avoid Cartesian products that are not required by the query itself. For example, in the figure above, operator tree (c) would not be part of the search space considered by the optimizer.
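    The second heuristic can be illustrated by enumerating join orders. The sketch below is an assumption-laden illustration rather than an optimizer: it considers only left-deep orders of the three relations of Example 3.14, takes the join graph from the WHERE clause, and rejects any order that would join a relation sharing no join predicate with the relations already joined. The function name is hypothetical.

```python
from itertools import permutations

# Join graph of Example 3.14: EMP-ASG on ENO, ASG-PROJ on PNO.
JOIN_EDGES = {frozenset({"EMP", "ASG"}), frozenset({"ASG", "PROJ"})}

def needs_cartesian_product(order):
    # Walk a left-deep join order; a relation with no join predicate to any
    # relation already joined forces a Cartesian product.
    joined = {order[0]}
    for rel in order[1:]:
        if not any(frozenset({rel, other}) in JOIN_EDGES for other in joined):
            return True
        joined.add(rel)
    return False

all_orders = list(permutations(["EMP", "ASG", "PROJ"]))
valid_orders = [o for o in all_orders if not needs_cartesian_product(o)]

print(len(all_orders), len(valid_orders))
for order in valid_orders:
    print(order)
```

    Even for three relations the heuristic prunes a third of the orders; the N! growth of `all_orders` is what makes such pruning necessary for larger queries.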

    [Figure: the two major shapes of join trees over R1, R2, R3 and R4: (a) linear join tree; (b) bushy join tree.]

    Another important restriction concerns the shape of the join tree. Two kinds of join trees are usually distinguished: linear join trees and bushy join trees (see Figure 9.3). A linear tree is a tree in which at least one operand of each operator node is a base relation. A bushy tree is more general and may have operators with no base relation as operand (that is, both operands are intermediate relations). By considering only linear trees, the size of the search space is reduced to O(2^N). In a distributed environment, however, bushy trees are useful for exhibiting parallelism.

    3.4.2. Search strategy

    The search strategy most commonly used by query optimizers is dynamic programming, which is deterministic. Deterministic strategies proceed by building plans, starting from the base relations and joining one more relation at each step until complete plans are obtained, as in Figure 9.4. Dynamic programming builds all possible plans breadth-first before it chooses the best plan. To reduce the optimization cost, partial plans that are unlikely to lead to the optimal plan are pruned as soon as possible. In contrast, another deterministic strategy, the greedy algorithm, builds only one plan, depth-first.

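    [Figure: a deterministic search strategy building a join plan over R1, R2, R3 and R4 in three steps (Step 1, Step 2, Step 3), adding one relation at each step.]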

    Dynamic programming is almost exhaustive and is guaranteed to find the best plan. It incurs an acceptable optimization cost (in terms of time and space) when the number of relations in the query is small. However, this approach becomes too expensive when the number of relations is greater than 5 or 6. For this reason, recent interest has focused on randomized strategies, which reduce the optimization complexity but do not guarantee that the best plan is found. Unlike deterministic strategies, randomized strategies allow the optimizer to trade optimization time for execution time.
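    The dynamic programming idea can be sketched as follows. This is a simplified illustration, not the algorithm of any particular system: it builds left-deep plans only, the cost of a plan is assumed to be the sum of the cardinalities of its intermediate results, and the cardinalities and join selectivities in `CARD` and `SEL` are invented. For every subset of relations only the cheapest partial plan is kept, which is the pruning step.

```python
from fractions import Fraction
from itertools import combinations

# Assumed statistics (illustrative values only).
CARD = {"EMP": 400, "ASG": 1000, "PROJ": 100}
SEL = {
    frozenset({"EMP", "ASG"}): Fraction(1, 400),
    frozenset({"ASG", "PROJ"}): Fraction(1, 200),
}

def result_card(relations):
    # Product of cardinalities times the selectivity of every join predicate;
    # a pair without a predicate behaves like a Cartesian product.
    card = Fraction(1)
    for rel in relations:
        card *= CARD[rel]
    for a, b in combinations(relations, 2):
        card *= SEL.get(frozenset({a, b}), 1)
    return card

def dp_best_plan(relations):
    # best[subset] = (cost, join order) of the cheapest plan for that subset.
    best = {frozenset({r}): (Fraction(0), (r,)) for r in relations}
    for size in range(2, len(relations) + 1):
        for subset in combinations(relations, size):
            key = frozenset(subset)
            candidates = []
            for last in subset:
                prev_cost, prev_order = best[key - {last}]
                cost = prev_cost + result_card(subset)
                candidates.append((cost, prev_order + (last,)))
            best[key] = min(candidates)  # prune all but the cheapest plan
    return best[frozenset(relations)]

cost, order = dp_best_plan(["EMP", "ASG", "PROJ"])
print(cost, order)
```

    The table `best` has one entry per subset of relations, so its size grows as 2^N; this is why the approach is practical only for a handful of relations.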

    Randomized strategies concentrate on searching for the optimal solution around some particular points. They do not guarantee that the best solution is obtained, but they avoid the high cost of optimization in terms of memory and time consumption. First, one or more start plans are built by a greedy strategy. Then the algorithm tries to improve the start plan by visiting its neighbors. A neighbor is obtained by applying a random transformation to a plan. A typical transformation consists of exchanging two randomly chosen operand relations of the plan. It has been shown experimentally that randomized strategies perform better than deterministic strategies as soon as the query involves a fairly large number of relations.
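    A neighbor-based search of this kind can be sketched as below. The sketch is a hedged illustration of iterative improvement, not a specific published algorithm: the statistics, the cost function (sum of intermediate cardinalities of a left-deep plan) and the fixed random seed are all assumptions made for the example.

```python
import random
from fractions import Fraction
from itertools import combinations

# Assumed statistics (illustrative values only).
CARD = {"EMP": 400, "ASG": 1000, "PROJ": 100}
SEL = {
    frozenset({"EMP", "ASG"}): Fraction(1, 400),
    frozenset({"ASG", "PROJ"}): Fraction(1, 200),
}

def result_card(relations):
    card = Fraction(1)
    for rel in relations:
        card *= CARD[rel]
    for a, b in combinations(relations, 2):
        card *= SEL.get(frozenset({a, b}), 1)
    return card

def plan_cost(order):
    # Cost of a left-deep plan: sum of its intermediate result cardinalities.
    return sum(result_card(order[:k]) for k in range(2, len(order) + 1))

def iterative_improvement(start, steps, seed):
    rng = random.Random(seed)
    best = list(start)
    for _ in range(steps):
        # Neighbor: exchange two randomly chosen operand relations.
        i, j = rng.sample(range(len(best)), 2)
        neighbor = best[:]
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
        if plan_cost(neighbor) < plan_cost(best):
            best = neighbor
    return best

start = ["EMP", "PROJ", "ASG"]   # a poor start plan (Cartesian product first)
improved = iterative_improvement(start, steps=50, seed=0)
print(plan_cost(start), plan_cost(improved), improved)
```

    The search only ever moves to a cheaper neighbor, so the final plan is never worse than the start plan, but it may be a local rather than a global minimum.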

    3.4.3. Distributed cost model

    An optimizer's cost model includes cost functions to predict the cost of operators, statistics and base data, and formulas to estimate the sizes of intermediate results.

    Cost functions

    The cost of a distributed execution strategy can be expressed with respect to either the total time or the response time. The total time is the sum of all time components (also referred to as costs), while the response time is the time elapsed from the initiation to the completion of the query. A general formula for determining the total cost is as follows:

    Total_time = T_CPU * #insts + T_I/O * #I/Os + T_MSG * #msgs + T_TR * #bytes

    The first two components measure the local processing time, where T_CPU is the time of one CPU instruction and T_I/O is the time of one disk I/O. The communication time is given by the last two components. T_MSG is the fixed time needed to initiate and receive a message, while T_TR is the time it takes to transmit one data unit from one site to another. The data unit is given here in bytes (#bytes is the total size of
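    The total-time formula translates directly into code. The sketch below is only a restatement of the formula; the function name, the parameter names and the unit-cost values in the example call are assumptions chosen for illustration.

```python
def total_time(t_cpu, n_insts, t_io, n_ios, t_msg, n_msgs, t_tr, n_bytes):
    # Local processing time (CPU + disk I/O) plus communication time
    # (fixed per-message cost + per-byte transmission cost).
    local = t_cpu * n_insts + t_io * n_ios
    communication = t_msg * n_msgs + t_tr * n_bytes
    return local + communication

# Illustrative unit costs, all expressed in the same time unit.
cost = total_time(
    t_cpu=1, n_insts=1000,
    t_io=10, n_ios=50,
    t_msg=100, n_msgs=4,
    t_tr=2, n_bytes=2000,
)
print(cost)
```

    With these made-up unit costs the communication terms (4400) dominate the local terms (1500), which is the typical situation the formula is meant to expose in a wide area network.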

    [Figure: a randomized transformation of a join tree over R1, R2 and R3, obtained by exchanging the operand relations R2 and R3.]

    1. For each attribute Ai, its length (in number of bytes) is denoted by length(Ai), and for each attribute Ai of each fragment Rj, the number of distinct values of Ai, which is the cardinality of the projection of fragment Rj on Ai, is denoted by card(Π_{Ai}(Rj)).

    2. For the domain of each attribute Ai that is defined on a set of values that can be ordered (for example integers or reals), the maximum and minimum values are denoted by max(Ai) and min(Ai).

    3. For the domain of each attribute Ai, the cardinality of the domain is denoted by card(dom[Ai]). This value gives the number of unique values in dom[Ai].

    4. The number of tuples in each fragment Rj is denoted by card(Rj).

    Sometimes the statistics also include the join selectivity factor for some pairs of relations, that is, the proportion of tuples that participate in the join. The join selectivity factor of relations R and S, denoted SF_J, is a real value between 0 and 1:

    SF_J = card(R ⋈ S) / (card(R) * card(S))

    For example, a join selectivity factor of 0.5 corresponds to a very large joined relation, while a factor of 0.001 corresponds to a fairly small one. We say that the join has poor selectivity in the former case and good selectivity in the latter.

    These statistics are useful for predicting the size of intermediate relations:

    size(R) = card(R) * length(R)

    where length(R) is the length (in bytes) of a tuple of R, computed from the lengths of its attributes. Estimating card(R), the number of tuples in R, requires the formulas given in the following section.

    Cardinalities of intermediate results

    Statistics are useful for evaluating the cardinalities of intermediate results. Two simplifying assumptions are commonly made about the database. The distribution of attribute values in a relation is assumed to be uniform, and all attributes are assumed to be independent, meaning that the value of one attribute does not affect the value of any other attribute. These two assumptions are often wrong in practice, but they make the problem tractable. In what follows we give the formulas for estimating the cardinalities of the results of the basic relational algebra operations (selection, projection, Cartesian product, join, semijoin, union and difference). The operand relations are denoted by R and S. The selectivity factor of an operation is denoted SF_OP, where OP denotes the operation.


    Selection. The cardinality of a selection is:

    card(σ_F(R)) = SF_S(F) * card(R)

    where SF_S(F) depends on the selection formula and can be computed as follows, with p(Ai) and p(Aj) denoting predicates over the attributes Ai and Aj:

    SF_S(A = value) = 1 / card(Π_A(R))

    SF_S(A > value) = (max(A) − value) / (max(A) − min(A))

    SF_S(A < value) = (value − min(A)) / (max(A) − min(A))

    SF_S(p(Ai) ∧ p(Aj)) = SF_S(p(Ai)) * SF_S(p(Aj))

    SF_S(p(Ai) ∨ p(Aj)) = SF_S(p(Ai)) + SF_S(p(Aj)) − (SF_S(p(Ai)) * SF_S(p(Aj)))

    SF_S(A ∈ {values}) = SF_S(A = value) * card({values})
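    The selectivity formulas can be written as small functions. The sketch below uses exact fractions to avoid rounding noise; the function names and the statistics in the example (an attribute ranging from 0 to 100 with 50 distinct values in a relation of 1000 tuples) are assumptions made for illustration.

```python
from fractions import Fraction

def sf_eq(distinct_values):
    # SF_S(A = value) = 1 / card(Π_A(R))
    return Fraction(1, distinct_values)

def sf_gt(value, min_a, max_a):
    # SF_S(A > value) = (max(A) - value) / (max(A) - min(A))
    return Fraction(max_a - value, max_a - min_a)

def sf_lt(value, min_a, max_a):
    # SF_S(A < value) = (value - min(A)) / (max(A) - min(A))
    return Fraction(value - min_a, max_a - min_a)

def sf_and(sf_i, sf_j):
    # Conjunction of independent predicates.
    return sf_i * sf_j

def sf_or(sf_i, sf_j):
    # Disjunction: add both factors, subtract the overlap.
    return sf_i + sf_j - sf_i * sf_j

def sf_in(distinct_values, n_values):
    # SF_S(A ∈ {values}) = SF_S(A = value) * card({values})
    return sf_eq(distinct_values) * n_values

def card_select(sf, card_r):
    return sf * card_r

CARD_R = 1000
eq = sf_eq(50)
gt = sf_gt(75, 0, 100)
print(card_select(eq, CARD_R), card_select(gt, CARD_R))
print(card_select(sf_or(eq, gt), CARD_R), card_select(sf_and(eq, gt), CARD_R))
```

    These estimates are only as good as the uniformity and independence assumptions stated above.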

    Projection. A projection may or may not eliminate duplicate tuples. Here we consider projection with duplicate elimination. An arbitrary projection is difficult to estimate precisely because the correlations between the projected attributes are usually unknown. There are, however, two particularly useful cases where the estimate is trivial. If the projection of relation R is based on a single attribute A, the cardinality is simply the number of tuples obtained when the projection is performed. If one of the projected attributes is a key of R, then

    card(Π_A(R)) = card(R)

    Cartesian product. The cardinality of the Cartesian product of relations R and S is

    card(R × S) = card(R) * card(S)

    Join. There is no general way to compute the cardinality of a join without additional information. The upper bound of the join cardinality is the cardinality of the Cartesian product. Some systems, such as distributed INGRES, use this upper bound, which is quite pessimistic. R* uses the quotient of this upper bound by a constant, reflecting the fact that the join result is smaller than the Cartesian product. There is, however, a case that occurs frequently and for which the estimate is simple. If R is equijoined with S on attribute A of R and attribute B of S, where A is a key of relation R and B is a foreign key of relation S, the cardinality of the result can be approximated as:

    card(R ⋈_{A=B} S) = card(S)

    because each tuple of S matches at most one tuple of R. The symmetric estimate obviously holds if B is a key of S and A is a foreign key of R. This estimate is an upper bound, since it assumes that every tuple of S participates in the join. For other important joins, the join selectivity factor SF_J needs to be maintained as part of the statistical information. In that case the cardinality of the result is:

    card(R ⋈ S) = SF_J * card(R) * card(S)

    Semijoin. The selectivity factor of the semijoin of R by S gives the fraction of the tuples of R that join with tuples of S. An approximation for the semijoin selectivity factor is:

    SF_SJ(R ⋉_A S) = card(Π_A(S)) / card(dom[A])

    This formula depends only on attribute A of S. It is therefore often called the selectivity factor of attribute A of S, denoted SF_SJ(S.A), and it is the selectivity factor of S.A on any attribute that can be joined with it. The cardinality of the semijoin is thus given by:

    card(R ⋉_A S) = SF_SJ(S.A) * card(R)

    This approximation can be verified on a very frequent case, namely when R.A is a foreign key of S (S.A is the primary key). In this case the semijoin selectivity factor is 1, because card(Π_A(S)) = card(dom[A]), which yields a semijoin cardinality of card(R).
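    The join and semijoin estimates can be sketched in the same style. The function names and the sample statistics below are assumptions made for the example; the formulas are the ones given above.

```python
from fractions import Fraction

def card_join(sf_j, card_r, card_s):
    # General case: card(R ⋈ S) = SF_J * card(R) * card(S)
    return sf_j * card_r * card_s

def card_join_key_foreign_key(card_s):
    # A is a key of R and B is a foreign key of S: upper bound card(S).
    return card_s

def sf_semijoin(distinct_a_in_s, card_dom_a):
    # SF_SJ(S.A) = card(Π_A(S)) / card(dom[A])
    return Fraction(distinct_a_in_s, card_dom_a)

def card_semijoin(sf_sj, card_r):
    # card(R ⋉_A S) = SF_SJ(S.A) * card(R)
    return sf_sj * card_r

# Illustrative statistics.
CARD_R, CARD_S = 1000, 100

join_estimate = card_join(Fraction(1, 1000), CARD_R, CARD_S)
semijoin_estimate = card_semijoin(sf_semijoin(50, 200), CARD_R)
# Foreign-key case: every domain value of A appears in S, so the factor is 1.
fk_semijoin_estimate = card_semijoin(sf_semijoin(200, 200), CARD_R)

print(join_estimate, semijoin_estimate, fk_semijoin_estimate)
```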


    Union. It is difficult to estimate the cardinality of the union of R and S because the duplicate tuples are removed by the union. We give only the simple formulas for the upper and lower bounds, which are respectively:

    card(R) + card(S)

    max{card(R), card(S)}

    Note that these formulas assume that R and S do not contain duplicate tuples.

    Difference. As for union, we give only the upper and lower bounds. The upper bound of card(R − S) is card(R), and the lower bound is 0.

    3.4.4. Join ordering in fragment queries

    As we have seen, ordering joins is an important aspect of centralized query optimization. Join ordering in a distributed context is even more important, because joins between fragments increase the communication time. There are two basic approaches to ordering joins in fragment queries. One tries to optimize the ordering of joins directly, while the other replaces joins by combinations of semijoins in order to minimize communication costs.

    Join ordering.

    Some algorithms optimize the ordering of joins directly, without using semijoins. The algorithms of distributed INGRES and System R* are representative of this group. The purpose of this section is to present the complex issues of join ordering and to prepare the ground for the next section, which uses semijoins to optimize join queries.

    A number of assumptions are needed to concentrate on the main issues. Since the query has been localized and is expressed on fragments, we do not need to distinguish between fragments of the same relation and fragments stored at a particular site. To concentrate on join ordering, we ignore local processing time, assuming that the reducing operations (selection, projection) are executed locally either before or during the join (remember that doing the selection first is not always efficient). We therefore consider only join queries whose operand relations are stored at different sites. We assume that relation transfers are done in a set-at-a-time mode rather than a tuple-at-a-time mode. Finally, we ignore the transfer time needed to produce the data at the result site.

    Let us first concentrate on the simpler problem of operand transfer in a single join. The query is R ⋈ S, where R and S are relations stored at different sites. The obvious choice of the relation to transfer is to send the smaller relation to the site of the larger one, which gives rise to the two possibilities shown in Figure 7. To make this choice we need to estimate the sizes of R and S.

    We now consider the case where there are more than two relations to join. As in the case of a single join, the objective of the join-ordering algorithm is to transmit small relations. The difficulty stems from the fact that joins may reduce or increase the size of the intermediate results. Estimating the size of join results is therefore mandatory, but also difficult. One solution is to estimate the communication costs of all the alternative strategies and to choose the best one. However, the number of strategies grows rapidly with the number of relations. This approach, used in System R*, makes optimization costly, although the overhead is amortized quickly if the query is executed frequently.

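    [Figure 7: transfer of operands in a binary join. R is sent to the site of S if size(R) < size(S); S is sent to the site of R if size(R) > size(S).]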

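    Example 3.16:

    Consider the following query expressed in relational algebra:

    PROJ ⋈_{PNO} ASG ⋈_{ENO} EMP

    whose join graph is shown in Figure 8. Note that we have made assumptions about the locations of the three relations. This query can be executed in at least five different ways. We describe these strategies by the following programs, where (R → site j) stands for "relation R is transferred to site j".

    [Figure 8: join graph of the distributed query. EMP is stored at site 1, ASG at site 2 and PROJ at site 3; the edge between EMP and ASG is labeled ENO and the edge between ASG and PROJ is labeled PNO.]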

    1. EMP → site 2. Site 2 computes EMP′ = EMP ⋈ ASG. EMP′ → site 3. Site 3 computes EMP′ ⋈ PROJ.

    2. ASG → site 1. Site 1 computes EMP′ = EMP ⋈ ASG. EMP′ → site 3. Site 3 computes EMP′ ⋈ PROJ.

    3. ASG → site 3. Site 3 computes ASG′ = ASG ⋈ PROJ. ASG′ → site 1. Site 1 computes ASG′ ⋈ EMP.

    4. PROJ → site 2. Site 2 computes PROJ′ = PROJ ⋈ ASG. PROJ′ → site 1. Site 1 computes PROJ′ ⋈ EMP.

    5. EMP → site 2. PROJ → site 2. Site 2 computes EMP ⋈ PROJ ⋈ ASG.

    To select one of these programs, the following sizes must be known or predicted: size(EMP), size(ASG), size(PROJ), size(EMP ⋈ ASG) and size(ASG ⋈ PROJ). Furthermore, if the response time is considered, the optimization must take into account the fact that the data transfers can be done in parallel in strategy 5. An alternative to enumerating all the solutions is to use heuristics that consider only the sizes of the operand relations, by assuming, for example, that the cardinality of the resulting join is the product of the operand cardinalities. In this case, the relations are ordered by increasing size and the order of execution is given by this ordering and the join graph. For instance, the order (EMP, ASG, PROJ) could use strategy 1, while the order (PROJ, ASG, EMP) could use strategy 4.
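    Choosing among the five programs amounts to comparing the amount of data each one ships between sites. The sketch below assumes that the cost of a strategy is the sum of the sizes of the relations it transfers (local processing and the fixed message cost are ignored, as in the assumptions above); the sizes in `SIZE` are invented for illustration, and `EMP_ASG` and `ASG_PROJ` stand for the intermediate results EMP ⋈ ASG and ASG ⋈ PROJ.

```python
# Assumed sizes (illustrative values only).
SIZE = {"EMP": 400, "ASG": 1000, "PROJ": 100, "EMP_ASG": 1000, "ASG_PROJ": 500}

# Relations shipped between sites by each of the five strategies.
STRATEGIES = {
    1: ["EMP", "EMP_ASG"],     # EMP → site 2, then EMP′ → site 3
    2: ["ASG", "EMP_ASG"],     # ASG → site 1, then EMP′ → site 3
    3: ["ASG", "ASG_PROJ"],    # ASG → site 3, then ASG′ → site 1
    4: ["PROJ", "ASG_PROJ"],   # PROJ → site 2, then PROJ′ → site 1
    5: ["EMP", "PROJ"],        # EMP → site 2 and PROJ → site 2
}

def transfer_cost(strategy):
    # Total amount of data shipped by the strategy.
    return sum(SIZE[name] for name in STRATEGIES[strategy])

costs = {number: transfer_cost(number) for number in STRATEGIES}
best = min(costs, key=costs.get)
print(costs, best)
```

    With these sizes strategy 5 ships the least data; with a different size(ASG) or a more selective join the ranking changes, which is why the size estimates matter.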

    Semijoin-based algorithms

    In this section we show how semijoins can be used to reduce the total time of join queries. We make the same assumptions as in the previous section. The main shortcoming of the join approach described there is that entire operand relations must be transferred between sites. For a relation, the semijoin acts as a size reducer, just as a selection does.

    The join of two relations R and S over attribute A, stored at sites 1 and 2 respectively, can be computed by replacing one or both operand relations by a semijoin with the other relation, using rules such as the following:

    R ⋈_A S ⇔ (R ⋉_A S) ⋈_A S, assuming that size(R) < size(S)

    3. R′ → site 2

    4. Site 2 computes R′ ⋈_A S
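    The semijoin program can be simulated on toy data. Steps 1 and 2 are missing from the source text, so the sketch below assumes them: site 2 first ships the projection of S on A to site 1, and site 1 computes R′ = R ⋉_A S. The relations, the helper names and the cost measure (number of attribute values shipped) are likewise assumptions made for illustration.

```python
# R(A, X) is stored at site 1, S(A, Y) at site 2 (illustrative data).
R = [(1, "r1"), (2, "r2"), (3, "r3"), (4, "r4"), (5, "r5"), (6, "r6")]
S = [(2, "s1"), (4, "s2"), (4, "s3")]

def direct_join(r, s):
    # R ⋈_A S computed in one place.
    return [(a, x, y) for a, x in r for b, y in s if a == b]

def semijoin_program(r, s):
    # Step 1 (assumed): the projection of S on A is shipped to site 1.
    s_keys = {b for b, _ in s}
    shipped = len(s_keys)                      # one value per distinct key
    # Step 2 (assumed): site 1 computes R′ = R ⋉_A S.
    r_reduced = [t for t in r if t[0] in s_keys]
    # Step 3: R′ is shipped to site 2 (two values per tuple).
    shipped += 2 * len(r_reduced)
    # Step 4: site 2 computes R′ ⋈_A S.
    return direct_join(r_reduced, s), shipped

result, shipped_values = semijoin_program(R, S)
print(result == direct_join(R, S))
print(shipped_values, 2 * len(R))
```

    Here the semijoin program ships 6 attribute values instead of the 12 needed to send all of R, because only two tuples of R have a match in S. When most tuples of R match, the extra transfer of the projection is wasted and the plain join is cheaper.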

    For simplicity, let us ignore the constant T_MSG in the communication time, assuming that the term T_TR * size(R) is