7/30/2019 Co So Du Lieu 2 Phan Tan Va Suy Dien 6639
CONTENTS
Preface
PART 1. DISTRIBUTED DATABASES
CHAPTER 1. OVERVIEW OF DISTRIBUTED DATABASES
1.1. Distributed database systems
1.1.1. Definition of a distributed database
1.1.2. Main characteristics of distributed databases
1.1.3. Purposes of using distributed databases
1.1.4. Basic architecture of a distributed database
1.1.5. Distributed database management systems
1.2. Architecture of distributed database management systems
1.2.1. Client/server systems
1.2.2. Peer-to-peer distributed systems
CHAPTER 2. DATA DISTRIBUTION METHODS
2.1. Distributed database design
2.1.1. Design strategies
2.2. Design issues
2.2.1. Reasons for fragmentation
2.2.2. Types of fragmentation
2.2.3. Horizontal fragmentation
2.3. Vertical fragmentation
2.5. Hybrid fragmentation
2.6. Allocation
2.6.1. The allocation problem
2.6.2. Information requirements
2.6.3. Allocation model
CHAPTER 3. QUERY PROCESSING
3.1. The query processing problem
3.2. Query decomposition
3.3. Localization of distributed data
3.4. Optimization of distributed queries
3.4.1. Search space
3.4.2. Search strategy
3.4.3. Distributed cost model
3.4.4. Join ordering in fragment queries
CHAPTER 4. TRANSACTION MANAGEMENT
4.1. Concepts
4.2. Basic locking model
4.4. Timestamp-based concurrency control algorithms
PART 2. DEDUCTIVE DATABASES
2.1. General introduction
2.2. Deductive databases
2.2.1. Deductive database model
2.2.2. Model theory for relational databases
2.2.3. A view of deductive databases
2.2.4. Transactions on deductive databases
2.3. Logic-based databases
2.3.4. Structure of queries
2.3.5. Comparing DATALOG with relational algebra
2.3.6. Expert database systems
2.4. Some other issues
Preface
The first database systems, built on the hierarchical and network models, appeared in the 1960s and are regarded as the first generation of database management systems (DBMSs).
Next came the second generation, the relational DBMSs, built on the relational data model proposed by E.F. Codd in 1970.
DBMSs aim to organize data and to access and update large volumes of data conveniently, safely, and efficiently.
The first two generations of DBMSs met the needs of agencies, enterprises, and business organizations for collecting and organizing data.
However, with the rapid development of communication technology and the strong expansion of the Internet, together with the trend toward globalization in every field, especially commerce, many new applications have emerged that must manage objects with complex structure (text, audio, images) and dynamic objects (programs, simulations). In the 1990s a third generation of DBMSs appeared, the object-oriented systems, capable of supporting multimedia applications.
Given the need of information technology students for reference materials and textbooks, particularly on distributed databases, deductive databases, and object-oriented databases, we have produced this textbook for the course Database 2.
The textbook Database 2 aims to present the basic concepts and algorithms of databases, including: data models and the corresponding database systems, database languages, storage organization and retrieval, query processing and optimization, transaction management and concurrency control, and database design.
In compiling it, we relied on the syllabus of the course as currently taught at universities in the country, while also trying to reflect some recent achievements of database technology.
The textbook Database 2 is divided into two parts:
Part 1: Distributed databases
Part 2: Deductive databases
Each chapter ends with a summary, review questions, and exercises, to help students master the main content of the chapter and test their own ability to solve the exercises.
Despite our best efforts, the textbook surely still has shortcomings. We look forward to readers' comments so that the next edition can be more complete.
Thai Nguyen, October 2009
The authors
PART 1
DISTRIBUTED DATABASES
CHAPTER 1. OVERVIEW OF DISTRIBUTED DATABASES
As companies and enterprises become ever more widely distributed, the data involved is very large and cannot be centralized. First- and second-generation databases cannot solve problems in the new environment, which is not centralized but distributed and parallel, with heterogeneous data and systems. The third generation of DBMSs, which emerged in the 1980s and includes distributed databases, meets these new needs.
1.1. Distributed database systems
1.1.1. Definition of a distributed database
A distributed database is a collection of multiple logically interrelated databases distributed over a computer network.
- Distribution: the data of a distributed database does not reside in one place but is spread over many sites of the computer network. This distinguishes a distributed database from a single centralized database.
- Logical correlation: the data of a distributed database has properties that tie it together. This distinguishes a distributed database from a set of local databases or files that merely reside at different locations in a computer network.
Figure 1.1. Distributed database system environment (Sites 1 to 5 connected by a data communication network)
In a distributed database system consisting of many sites, each site can run transactions that access data at many other sites.
Example 1.1: Consider a bank with three branches at different locations. At each branch, a computer controls a number of teller terminals. Each computer, together with its local statistical database at the branch, constitutes one site of the distributed database. The computers are connected by a communication network.
1.1.2. Main characteristics of distributed databases
(1) Resource sharing
Resource sharing in a distributed system takes place over the communication network. To share resources effectively, each resource must be managed by a program with a communication interface, so that resources can be accessed and updated reliably and consistently. Resource management here covers contingency planning, naming classes of resources, allowing resources to be accessed from one place to another, mapping resource names to communication addresses, and so on.
(2) Openness
Openness of a computer system means that its hardware (adding peripherals, memory, communication interfaces, ...) and software (operating system models, communication protocols, resource-sharing services, ...) are easy to extend.
An open distributed system can be built from many kinds of hardware and software from different vendors, provided that these components follow a common standard.
The openness of a distributed system is judged by the degree to which new resource-sharing services can be added without disrupting or duplicating existing services. Openness is achieved by specifying and clearly documenting the main interfaces of a system and making them available to software developers.
Openness of a distributed system rests on providing an interprocess communication mechanism and publishing the interfaces used to access shared resources.
(3) Concurrency
A distributed system runs on a communication network with many computers, each of which may have one or more CPUs. If N processes exist at the same time, we say they execute concurrently. The processes are executed either by time sharing (one CPU) or in parallel (several CPUs).
Concurrency in a distributed system arises in two situations:
- Many users simultaneously issue commands or interact with application programs.
- Many server processes run simultaneously, each responding to requests from different client processes.
(4) Scalability
A distributed system can operate well and efficiently at many different scales. The smallest distributed system needs only two workstations and one file server. Larger systems may comprise thousands of computers.
Scalability is characterized by the system software and application software not having to change when the system is expanded. Current distributed systems achieve this only to some degree. Scalability is not just a matter of extending the hardware or the network; it spans every aspect of distributed system design.
(5) Fault tolerance
Fault tolerance in computer systems is designed around two approaches:
- Redundancy, to ensure continuous and efficient operation.
- Recovery programs, to be used when a failure occurs.
With the first approach, two computers are connected and run the same program, one of them in standby mode (idle or waiting). This solution is expensive because the system hardware must be duplicated. A cheaper solution is to assign critical applications to individual servers that can substitute for one another when a failure occurs. When there is no failure the servers operate normally; when one server fails, client applications redirect themselves to the remaining servers.
With the second approach, recovery software is designed so that the current state of the data (the state before the failure) can be restored once the error is detected.
Distributed systems provide high availability in the face of hardware faults.
(6) Transparency
Transparency of a distributed system means hiding the separate components of the system from users and application programmers.
Location transparency: users do not need to know the physical location of data. A user may access a database wherever it resides. Operations that retrieve or update data at a remote site are carried out automatically by the system at the site that issues the request; users need not be aware that the database is distributed over the network.
Usage transparency: moving part or all of the database because of organizational or administrative changes does not affect user operations.
Partitioning transparency: if data is partitioned because of increased load, users are not affected.
Replication transparency: if data is replicated to reduce communication cost with the database or to improve reliability, users do not need to know about it.
(7) Reliability and consistency
The system requires high reliability: the confidentiality of data must be protected, and the functions for recovering from failures must be guaranteed. Consistency is equally important: the contents of the data must not contradict one another. Even when data attributes differ, operations on them must remain consistent.
1.1.3. Purposes of using distributed databases
Organizational and economic requirements: in practice many organizations are decentralized, their data keeps growing, and it serves many geographically dispersed users, so a distributed database fits the natural structure of such organizations. This is one of the main factors driving the development of distributed databases.
Interconnecting existing local databases: a distributed database is the natural solution when local databases already exist and a global application must be built. In this case the distributed database is created bottom-up on top of the existing databases. This process requires restructuring the local databases to some degree; even so, the changes are far smaller than building an entirely new centralized database.
Reducing total search cost: distributing the data lets local work groups control all of their own data, while users can still access remote data when necessary. At each local site the hardware can be chosen to suit the local data processing workload.
Incremental growth: organizations can grow by adding new units that are autonomous yet related to the other organizational units. A distributed database supports such flexible growth with minimal impact on the existing units.
Fast query response: most user queries issued at a local site can be answered with data stored at that site.
Improved reliability and availability: if a component of the system fails, the system can still keep operating.
Fast recovery: data access does not depend on a single machine or a single network link. If a fault occurs, the system can automatically reroute through other links.
1.1.4. Basic architecture of a distributed database
This is not an explicit architecture shared by all distributed databases, but it does reflect how any distributed database is organized.
- Global schema: defines all the data stored in the distributed database. In the relational model, the global schema consists of the definitions of the set of global relations.
- Fragmentation schema: each global relation can be split into several non-overlapping parts called fragments. There are many ways to perform this split. The (one-to-many) mapping between the global schema and the fragments is defined in the fragmentation schema.
- Allocation schema: fragments are logical parts of a global relation that are physically located at one or more sites of the network. The allocation schema defines which fragment resides at which sites. Note that the type of mapping defined in the allocation schema determines whether the distributed database is redundant.
- Local mapping schema: maps physical images to the objects stored at a site (all fragments of a global relation at the same site form one physical image).
1.1.5. Distributed database management systems
A distributed database management system (distributed DBMS) is defined as a software system that manages distributed databases (creating them and controlling access to them) and makes the distribution transparent to users.
Transparency refers to the separation of the high-level semantics of a system from low-level implementation issues. Data distribution is hidden from users, so that they access the distributed database as if it were a centralized one. Changes in administration do not affect users.
A distributed DBMS consists of the following set of software components (programs):
- Programs that manage distributed data.
- Programs that manage data communication.
- Programs that manage the local databases.
- Programs that manage the data dictionary.
To form a distributed database system (DDBS), the files must not only be logically related; they must also be structured and accessed through a common interface.
Figure 1.2. Basic architecture of a distributed database (global schema; fragmentation schema; allocation schema; local mapping schemas 1 and 2; DBMS at sites 1 and 2; local databases at sites 1 and 2; other sites)
A distributed database system environment is one in which data is distributed over a number of sites.
1.2. Architecture of distributed database management systems
1.2.1. Client/server systems
Client/server DBMSs appeared in the early 1990s and have had a major influence on DBMS technology and on the way computing is done. The general idea is very simple: distinguish the functions that must be provided and divide them into two classes, server functions and client functions. This yields a two-level architecture, which makes it easier to manage both the complexity of modern DBMSs and the complexity of data distribution.
The server does most of the data management work. That is, all query processing and optimization, transaction management, and storage management are performed at the server. The client, besides the application and the user interface, has a DBMS client module responsible for managing the data cached at the client and, sometimes, for managing transaction locks as well. This architecture, illustrated in the accompanying figure, is very common in relational systems, where client and server communicate at the level of SQL statements. In other words, the client passes SQL queries to the server without trying to understand or optimize them. The server does most of the work and returns the result relation to the client.
There are several kinds of client/server architecture. The simplest has one server accessed by many clients; we call this multiple client-single server. A more sophisticated client/server architecture has several servers in the system (called multiple client-multiple server). In this case there are two management strategies: either each client manages its own connections to the servers, or each client knows only its home server, which communicates with the other servers when needed. The first approach simplifies the server programs but loads the client machines with additional responsibilities, leading to what are called self-service client systems. The second approach concentrates the data management functionality at the servers, so that transparency of data access is provided through the server interface.
From the standpoint of the logical view of data, a client/server DBMS presents the same picture of the data as the peer-to-peer systems discussed in the next section. That is, both show the user a single logical database, while at the physical level the data may be distributed. The main distinction between client/server and peer-to-peer systems therefore lies not in the level of transparency offered to users and applications, but in the architectural model used to achieve that transparency.
1.2.2. Peer-to-peer distributed systems
The client/server model distinguishes the client (which requests services) from the server (which serves the requests). In the peer-to-peer processing model, by contrast, the participating systems have equal roles: each can both request services from another system and act as a service provider. Ideally, peer-to-peer computing enables cooperative processing among applications that may run on different hardware or operating systems. The purpose of a peer-to-peer processing environment is to support networked databases, so that DBMS users can access multiple heterogeneous databases.
CHAPTER 2. DATA DISTRIBUTION METHODS
2.1. Distributed database design
2.1.1. Design strategies
Top-down design process
Figure 2.1. Top-down design process (requirements analysis; system requirements (objectives); conceptual design and view design, each with user input; global conceptual schema, access information, external schema definitions; distribution design; local conceptual schemas; physical design; physical schema; monitoring and maintenance, with feedback)
Requirements analysis: defines the system environment and gathers the data needs and processing needs of everyone who will use the database.
View design: defines the interfaces for end users.
Conceptual design: examines the enterprise as a whole to identify the entity types and the relationships among them.
Distribution design: splits relations into smaller relations called fragments and allocates them to sites.
Physical design: maps the local conceptual schemas to the physical storage devices available at the corresponding sites.
Bottom-up design process
Top-down design suits databases that are designed from scratch. In practice, however, we often find that a number of databases already exist and the design task is to integrate them into one database. The bottom-up approach suits this situation. Its starting point is the set of local conceptual schemas, and the process consists of integrating the local schemas into a global conceptual schema.
2.2. Design issues
2.2.1. Reasons for fragmentation
Application views are usually only subsets of a relation. The unit of access is therefore not the whole relation but subsets of it, and so taking subsets of relations as the unit of distribution is the only appropriate choice.
Decomposing a relation into fragments, each handled as a unit, allows several transactions to execute concurrently. Fragmenting relations also allows a single query to be executed in parallel, by dividing it into a set of subqueries that operate on fragments. Fragmentation thus increases the level of concurrency and hence the throughput of the system.
2.2.2. Types of fragmentation
Correctness rules of fragmentation
We follow three rules during fragmentation, which together ensure that the database undergoes no semantic change when it is fragmented.
a) Completeness.
If a relation instance R is decomposed into fragments R1, R2, ..., Rn, then every data item found in R can also be found in one or more of the fragments Ri. This property, identical to the lossless decomposition property of normalization, is also important in fragmentation because it ensures that the data in relation R is mapped into fragments without loss. Note that in horizontal fragmentation a "data item" is a tuple, whereas in vertical fragmentation it is an attribute.
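b) Reconstruction.
If a relation instance R is decomposed into fragments R1, R2, ..., Rn, it must be possible to define a relational operator $\nabla$ such that
$R = \nabla R_i, \quad \forall R_i \in F_R$
where $F_R = \{R_1, R_2, \ldots, R_n\}$ is the set of fragments. The operator $\nabla$ differs for each type of fragmentation; what matters is that it can be identified. The ability to reconstruct a relation from its fragments ensures that constraints defined on the data in the form of dependencies are preserved.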
c) Disjointness.
If relation R is horizontally decomposed into fragments R1, R2, ..., Rn and data item di is in fragment Rj, then it is not in any other fragment Rk (k ≠ j). This criterion ensures that the horizontal fragments are disjoint. If relation R is vertically decomposed, its primary key attributes must be repeated in every fragment. In vertical fragmentation, therefore, disjointness is defined only on the non-primary-key attributes of a relation.
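The three rules can be checked mechanically for a horizontal fragmentation. The Python sketch below is an illustration only: the function name `check_horizontal_fragmentation` and the sample relation are assumptions made for this example, and the reconstruction operator is taken to be set union, which is the case that applies to horizontal fragments.

```python
def check_horizontal_fragmentation(relation, fragments):
    """Check the three correctness rules for horizontal fragments.

    `relation` is a set of tuples; `fragments` is a list of sets of tuples.
    Reconstruction is assumed to be plain set union (horizontal case only).
    """
    union = set().union(*fragments)
    return {
        # completeness: every tuple of R appears in at least one fragment
        "complete": relation <= union,
        # reconstruction: the union of the fragments gives back exactly R
        "reconstructs": union == relation,
        # disjointness: no tuple is counted twice across fragments
        "disjoint": sum(len(f) for f in fragments) == len(union),
    }


# Hypothetical sample relation: (project id, location)
R = {("P1", "Montreal"), ("P2", "New York"), ("P3", "New York"), ("P4", "Paris")}

# A fragmentation by location, and a deliberately faulty one
good = [{t for t in R if t[1] == city} for city in ("Montreal", "New York", "Paris")]
bad = [{("P1", "Montreal")}, {("P1", "Montreal"), ("P2", "New York")}]

result_good = check_horizontal_fragmentation(R, good)
result_bad = check_horizontal_fragmentation(R, bad)
print(result_good)
print(result_bad)
```

The faulty fragmentation loses P3 and P4 and stores P1 twice, so it violates all three rules. For vertical fragments the check would differ: reconstruction uses a join on the key, and disjointness ignores key attributes.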
Information requirements
One thing to keep in mind in distribution design is that a great many factors influence an optimal design. The logical organization of the database, the location of applications, the access characteristics of applications to the database, and the characteristics of the computer system at each site all affect distribution decisions. This makes the distribution problem very complicated to formulate.
The information needed for distribution design falls into four categories:
- Database information
- Application information
- Network information
- Computer system information
The last two categories are purely quantitative and are used in allocation models rather than in fragmentation algorithms.
2.2.3. Horizontal fragmentation
This section covers the concepts related to horizontal fragmentation (horizontal partitioning). There are two basic horizontal fragmentation strategies:
- Primary horizontal fragmentation of a relation is performed using predicates defined on that relation.
- Derived horizontal fragmentation partitions a relation using predicates defined on another relation.
Two kinds of horizontal fragmentation
Horizontal fragmentation partitions a relation r along its tuples, so each fragment is a subset of the tuples t of r. In both kinds, primary and derived, the set of predicates plays the central role. This section examines the algorithms that carry out each kind. We first state the information needed to perform horizontal fragmentation.
Information requirements of horizontal fragmentation
a) Database information
Database information concerns the global schema: the base relations and their sub-relations. In this context we need to know how relations are connected to one another, by joins or by other operations. For derived fragmentation, where predicates are defined on another relation, we usually use the entity-relationship model, in which relationships are drawn as directed links (arcs) between relations that are related to each other through a join.
Example 1:
Figure 2.2. Representing relationships among relations using links.
The figure shows one way of drawing the links among relations. Note that the direction of a link indicates a one-to-many relationship. For example, for each title there are many employees holding that title, so we draw a link from relation CT (pay) to NV (employee). Likewise, the many-to-many relationship between NV and DA (project) is represented by two links to relation PC (assignment).
The relation at the tail of a link (the end without the arrowhead) is called the owner of the link, and the relation at the head (the arrowhead end) is called the member.
Example 2:
For link L1 of Figure 2.2, the owner and member functions have the following values:
owner(L1) = CT
member(L1) = NV
The quantitative information needed about the database is the cardinality of each relation R, that is, the number of tuples in R, denoted card(R).
b) Application information
For distribution, besides the quantitative information card(R) we also need basic qualitative information, namely the predicates used in the queries. The amount of such information depends on the specific problem.
Relations and links shown in Figure 2.2:
CT(Title, Salary)
NV(MNV, TenNV, Title)
DA(MDA, TenDA, Budget, Location)
PC(MNV, MDA, Duty, Duration)
L1: CT to NV; L2: NV to PC; L3: DA to PC
If it is not possible to analyze every application to determine these predicates, one should at least study the most important ones.
We now define simple predicates. Given a relation R(A1, A2, ..., An), where Ai is an attribute defined over a domain D(Ai), or Di, a simple predicate P defined on R has the form
$P: A_i \; \theta \; Value$
where $\theta \in \{=, <, \neq, \leq, >, \geq\}$ and Value is chosen from the domain of Ai ($Value \in D_i$).
Thus, given the schema R and the domains Di, we can determine the set Pr of all simple predicates on R:
$Pr = \{P : A_i \; \theta \; Value\}$
In practice, however, only proper subsets of Pr are needed.
Example 3: For the relation DA (project), the following are simple predicates:
P1: TenDA = "Control Equipment"
P2: Budget ≤ 200000
We use Pri to denote the set of all simple predicates defined on relation Ri. The elements of Pri are denoted pij.
Simple predicates are usually easy to handle, but queries often contain more complex predicates that are combinations of simple ones. One combination deserves special attention: the minterm predicate, which is a conjunction of simple predicates. Because a Boolean expression can always be transformed into conjunctive normal form, using minterm predicates in a design algorithm does not lose generality.
Given a set Pri = {pi1, pi2, ..., pim} of simple predicates on relation Ri, the set of minterm predicates Mi = {mi1, mi2, ..., miz} is defined as
$M_i = \{\, m_{ij} \mid m_{ij} = \bigwedge_{p_{ik} \in Pr_i} p^*_{ik} \,\}, \quad 1 \leq k \leq m, \; 1 \leq j \leq z$
where $p^*_{ik} = p_{ik}$ or $p^*_{ik} = \neg p_{ik}$. Each simple predicate can therefore appear in a minterm predicate either in its natural form or in its negated form.
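The definition implies that m simple predicates give rise to at most 2^m minterm predicates, one for each choice of natural or negated form. The following Python sketch enumerates them; the helper names `minterms` and `render` and the two sample predicates are assumptions made for illustration, and predicates are handled purely as text.

```python
from itertools import product


def minterms(simple_predicates):
    """Enumerate all minterms over the given simple predicates.

    Each minterm is a tuple of (predicate_text, negated) pairs, one pair
    per simple predicate, so m predicates yield 2**m minterms.
    """
    names = list(simple_predicates)
    return [tuple(zip(names, signs))
            for signs in product([False, True], repeat=len(names))]


def render(minterm):
    """Format a minterm as a readable conjunction."""
    return " AND ".join(("NOT " if negated else "") + text
                        for text, negated in minterm)


# Hypothetical sample predicates
preds = ["Title = 'Electrical Engineer'", "Salary <= 30000"]
all_minterms = minterms(preds)
for m in all_minterms:
    print(render(m))
```

Not every minterm produced this way is meaningful: some are contradictory or select no tuples, which is why a later step removes them.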
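Example 4:
Consider relation CT:
Title                  Salary
Electrical Engineer    40000
Systems Analyst        34000
Mechanical Engineer    27000
Programmer             24000
The following are some simple predicates that can be defined on CT.
p1: Title = "Electrical Engineer"
p2: Title = "Systems Analyst"
p3: Title = "Mechanical Engineer"
p4: Title = "Programmer"
p5: Salary ≤ 30000
p6: Salary > 30000
The following are some of the minterm predicates that can be defined from these simple predicates:
m1: Title = "Electrical Engineer" ∧ Salary ≤ 30000
m2: Title = "Electrical Engineer" ∧ Salary > 30000
m3: ¬(Title = "Electrical Engineer") ∧ Salary ≤ 30000
m4: ¬(Title = "Electrical Engineer") ∧ Salary > 30000
m5: Title = "Programmer" ∧ Salary ≤ 30000
m6: Title = "Programmer" ∧ Salary > 30000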
Notes:
+ Negation is not always possible. For example, consider the two simple predicates Lower_bound ≤ A and A ≤ Upper_bound, that is, attribute A has a domain lying between the lower and upper bounds. Their complements are
¬(Lower_bound ≤ A);
¬(A ≤ Upper_bound),
which cannot be determined: the values of A in these negations fall outside the domain of A.
Alternatively, the two simple predicates can be written as
Lower_bound ≤ A ≤ Upper_bound
whose complement ¬(Lower_bound ≤ A ≤ Upper_bound) cannot be defined. When studying these issues we therefore consider only simple equality predicates.
=> Not all minterm predicates can be defined.
+ Some of them may be meaningless given the semantics of relation CT.
Note also that m3 can be rewritten as
m3: Title ≠ "Electrical Engineer" ∧ Salary ≤ 30000
Regarding quantitative information about applications, we need two sets of data.
- Minterm selectivity: the number of tuples of the relation that would be accessed by a query specified according to a given minterm predicate. For example, the selectivity of m1 in Example 4 is zero, because no tuple in CT satisfies this predicate. The selectivity of m2 is 1. We denote the selectivity of a minterm mi by sel(mi).
- Access frequency: the frequency with which applications access data. If Q = {q1, q2, ..., qq} is a set of queries, acc(qi) denotes the access frequency of qi in a given period.
Note that each minterm is itself a query. We denote the access frequency of a minterm by acc(mi).
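Minterm selectivity can be computed directly by counting the tuples that satisfy each minterm. The sketch below does this for the relation CT and the minterms m1 to m6 of Example 4. The function name `sel` follows the notation above; the encoding of predicates as Python lambdas is an assumption made for illustration.

```python
# Relation CT from Example 4: (Title, Salary)
CT = [
    ("Electrical Engineer", 40000),
    ("Systems Analyst", 34000),
    ("Mechanical Engineer", 27000),
    ("Programmer", 24000),
]

# Minterm predicates m1..m6 from Example 4, written as Python functions
MINTERMS = {
    "m1": lambda title, salary: title == "Electrical Engineer" and salary <= 30000,
    "m2": lambda title, salary: title == "Electrical Engineer" and salary > 30000,
    "m3": lambda title, salary: title != "Electrical Engineer" and salary <= 30000,
    "m4": lambda title, salary: title != "Electrical Engineer" and salary > 30000,
    "m5": lambda title, salary: title == "Programmer" and salary <= 30000,
    "m6": lambda title, salary: title == "Programmer" and salary > 30000,
}


def sel(minterm, relation):
    """Number of tuples of `relation` that satisfy `minterm`."""
    return sum(1 for row in relation if minterm(*row))


selectivity = {name: sel(m, CT) for name, m in MINTERMS.items()}
print(selectivity)
```

The values for m1 and m2 agree with the text: no tuple satisfies m1, and exactly one (the Electrical Engineer earning 40000) satisfies m2.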
Primary horizontal fragmentation
Primary horizontal fragmentation is defined by a selection operation on the owner relations of a database schema. Given a relation R, its horizontal fragments are the Ri:
$R_i = \sigma_{F_i}(R), \quad 1 \leq i \leq z$
where Fi is the selection formula used to obtain fragment Ri. Note that if Fi is in conjunctive normal form, it is a minterm predicate (mi).
Example 5: Consider relation DA:
MDA   TenDA                  Budget   Location
P1    Instrumentation        150000   Montreal
P2    Database Development   135000   New York
P3    CAD/CAM                250000   New York
P4    Maintenance            310000   Paris
We can define horizontal fragments based on project location. The resulting fragments are:
$DA_1 = \sigma_{Location = \text{"Montreal"}}(DA)$
$DA_2 = \sigma_{Location = \text{"New York"}}(DA)$
$DA_3 = \sigma_{Location = \text{"Paris"}}(DA)$
DA1
MDA   TenDA                  Budget   Location
P1    Instrumentation        150000   Montreal
DA2
MDA   TenDA                  Budget   Location
P2    Database Development   135000   New York
P3    CAD/CAM                250000   New York
DA3
MDA   TenDA                  Budget   Location
P4    Maintenance            310000   Paris
We can now define a horizontal fragment more rigorously and precisely:
A horizontal fragment Ri of relation R consists of all the tuples of R that satisfy a minterm predicate mi.
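The fragmentation of Example 5 amounts to one selection per distinct value of Location. A minimal Python sketch follows, with relations represented as lists of dictionaries; the helper names `select` and `fragment_by` are assumptions made for this illustration.

```python
# Relation DA from Example 5
DA = [
    {"MDA": "P1", "TenDA": "Instrumentation", "Budget": 150000, "Location": "Montreal"},
    {"MDA": "P2", "TenDA": "Database Development", "Budget": 135000, "Location": "New York"},
    {"MDA": "P3", "TenDA": "CAD/CAM", "Budget": 250000, "Location": "New York"},
    {"MDA": "P4", "TenDA": "Maintenance", "Budget": 310000, "Location": "Paris"},
]


def select(relation, predicate):
    """Relational selection: keep the rows that satisfy `predicate`."""
    return [row for row in relation if predicate(row)]


def fragment_by(relation, attribute):
    """Primary horizontal fragmentation using one equality predicate
    (attribute = value) per distinct value of `attribute`."""
    values = sorted({row[attribute] for row in relation})
    return {v: select(relation, lambda row, v=v: row[attribute] == v)
            for v in values}


fragments = fragment_by(DA, "Location")
for location, rows in fragments.items():
    print(location, [row["MDA"] for row in rows])
```

Because every tuple has exactly one Location value, the fragments obtained this way are complete and disjoint, and their union reconstructs DA.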
Two important properties of a set of simple predicates are completeness and minimality.
- A set of simple predicates Pr is said to be complete if and only if every application accesses, with equal probability, any tuple belonging to any minterm fragment defined according to Pr.
Example 6: Consider the fragmentation of relation DA given in Example 5. If the set of predicates is Pr = {Location = "Montreal", Location = "New York", Location = "Paris", Budget ≤ 200000}, then Pr is not complete, because some tuples of DA are not accessed by the predicate Budget ≤ 200000. To make the set complete we must add the predicate Budget > 200000 to Pr. Then Pr = {Location = "Montreal", Location = "New York", Location = "Paris", Budget ≤ 200000, Budget > 200000} is complete, because every tuple is accessed by exactly two predicates p of Pr. Of course, if we remove any predicate from Pr, the remaining set is not complete.
Completeness is required because fragments obtained from a complete set of predicates are logically uniform, since they all satisfy a minterm predicate. They are also statistically homogeneous and complete in the way applications access them.
We therefore use a complete set of predicates as the basis of primary horizontal fragmentation.
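The counting argument of Example 6 (every tuple is accessed by exactly two predicates) can be verified with a few lines of Python. This sketch checks only that count, not the probabilistic definition of completeness, which would require access statistics; the function name `access_counts` is an assumption made for illustration.

```python
# Relation DA reduced to the attributes used by the predicates:
# (MDA, Budget, Location)
DA = [
    ("P1", 150000, "Montreal"),
    ("P2", 135000, "New York"),
    ("P3", 250000, "New York"),
    ("P4", 310000, "Paris"),
]

# The complete predicate set Pr of Example 6
PR = {
    "Location = Montreal": lambda budget, location: location == "Montreal",
    "Location = New York": lambda budget, location: location == "New York",
    "Location = Paris": lambda budget, location: location == "Paris",
    "Budget <= 200000": lambda budget, location: budget <= 200000,
    "Budget > 200000": lambda budget, location: budget > 200000,
}


def access_counts(predicates):
    """For each tuple of DA, count the predicates that select it."""
    return {mda: sum(1 for p in predicates.values() if p(budget, location))
            for mda, budget, location in DA}


counts_complete = access_counts(PR)

# Drop "Budget > 200000" to obtain the incomplete set from the example
reduced = {name: p for name, p in PR.items() if name != "Budget > 200000"}
counts_reduced = access_counts(reduced)

print(counts_complete)
print(counts_reduced)
```

With the reduced set, projects P3 and P4 are selected by only one predicate, which is the gap that adding Budget > 200000 closes.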
- The second property of a set of predicates is minimality. This is an intuitive property. A simple predicate should be relevant in determining a fragmentation: a predicate that does not contribute to any fragmentation can be regarded as redundant. If all the predicates of Pr are relevant, Pr is minimal.
Example 7: The set Pr defined in Example 6 is complete and minimal. However, if we add the predicate TenDA = "Instrumentation" to Pr, the resulting set is no longer minimal, because the new predicate is not relevant with respect to Pr: it does not split any of the fragments already created.
Completeness is tied to the goals of the problem: the set of predicates must be complete with respect to the requirements of the problem before we can address the issues the problem poses. Minimality concerns optimizing memory and the operations on the set of queries. So, given a set of predicates Pr, we can test minimality by discarding redundant predicates to obtain a minimal set Pr', and Pr' is of course also complete with respect to Pr.
Algorithm COM_MIN: finds a complete and minimal set of predicates Pr' from Pr. We adopt the following convention:
Rule 1: the basic rule of completeness and minimality. It states that a relation or fragment is partitioned into at least two parts that are accessed differently by at least one application.
Algorithm 1.1 COM_MIN
Input: R: relation; Pr: set of simple predicates;
Output: Pr': complete and minimal set of predicates;
Declare F: set of minterm fragments;
Begin
  Pr' := ∅; F := ∅;
  For each predicate p ∈ Pr, if p partitions R according to Rule 1 then
  Begin
    Pr' := Pr' ∪ {p};
    Pr := Pr − {p};
    F := F ∪ {f}; {f is the minterm fragment according to p}
  End; {the predicates that fragment R have been moved into Pr'}
  Repeat
    For each p ∈ Pr, if p partitions some fragment fk of Pr' according to Rule 1 then
    Begin
      Pr' := Pr' ∪ {p};
      Pr := Pr − {p};
      F := F ∪ {f};
    End;
  Until Pr' is complete; {no remaining p fragments any fk of Pr'}
  For each p ∈ Pr', if there is a p' ∈ Pr' such that p ⇔ p' then
  Begin
    Pr' := Pr' − {p};
    F := F − {f};
  End;
End. {COM_MIN}
Begin
  Pr' := COM_MIN(R, Pr);
  Determine the set M of minterm predicates;
  Determine the set I of implications among the pi ∈ Pr';
  For each mi ∈ M do
  Begin
    If mi is contradictory according to I then
      M := M − {mi}
  End;
End. {PHORIZONTAL}
Example 8: Consider relation DA. Assume there are two applications. The first is issued at three sites and finds the names and budgets of projects given their location. In SQL notation, the query is:
SELECT TenDA, Budget
FROM DA
WHERE Location = Value
For this application, the simple predicates that would be used are:
P1: Location = "Montreal"
P2: Location = "New York"
P3: Location = "Paris"
The second application concerns project management: projects with a budget of up to 200,000 dollars are managed at one site, while those with larger budgets are managed at a second site. The simple predicates to be used for fragmentation according to the second application are therefore:
P4: Budget ≤ 200000
P5: Budget > 200000
Checking with algorithm COM_MIN shows that the set Pr' = {p1, p2, p3, p4, p5} is clearly complete and minimal.
Based on Pr', we can define the following six minterm predicates that form M:
DA3
MDA   TenDA                  Budget   Location
P2    Database Development   135000   New York
DA4
MDA   TenDA                  Budget   Location
P3    CAD/CAM                250000   New York
DA6
MDA   TenDA                  Budget   Location
P4    Maintenance            310000   Paris
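The elimination step of PHORIZONTAL for Example 8 can be sketched in Python as follows. Instead of an explicit implication set I, this sketch treats a minterm as contradictory when no (Location, Budget) combination from a small representative domain satisfies it. That substitution, the three-value Location domain, and the names `satisfies` and `satisfiable_minterms` are assumptions made for illustration, not the algorithm as stated in the text.

```python
from itertools import product

DA = [
    {"MDA": "P1", "Budget": 150000, "Location": "Montreal"},
    {"MDA": "P2", "Budget": 135000, "Location": "New York"},
    {"MDA": "P3", "Budget": 250000, "Location": "New York"},
    {"MDA": "P4", "Budget": 310000, "Location": "Paris"},
]

# Simple predicates p1..p5 of Example 8
PREDICATES = {
    "p1": lambda loc, budget: loc == "Montreal",
    "p2": lambda loc, budget: loc == "New York",
    "p3": lambda loc, budget: loc == "Paris",
    "p4": lambda loc, budget: budget <= 200000,
    "p5": lambda loc, budget: budget > 200000,
}

# Representative domain (assumption): three locations, one budget value
# on each side of the 200000 threshold.
DOMAIN = [(loc, b) for loc in ("Montreal", "New York", "Paris")
          for b in (200000, 200001)]


def satisfies(signs, loc, budget):
    """True if (loc, budget) gives each predicate the truth value in `signs`."""
    return all(PREDICATES[name](loc, budget) == positive
               for name, positive in signs.items())


def satisfiable_minterms():
    """Enumerate all 2**5 minterms and keep the non-contradictory ones."""
    names = list(PREDICATES)
    kept = []
    for values in product([True, False], repeat=len(names)):
        signs = dict(zip(names, values))
        if any(satisfies(signs, loc, b) for loc, b in DOMAIN):
            kept.append(signs)
    return kept


M = satisfiable_minterms()
fragments = [[row["MDA"] for row in DA
              if satisfies(m, row["Location"], row["Budget"])] for m in M]
print(len(M), fragments)
```

Of the 32 candidate minterms only six survive, one per combination of location and budget range. Two of the six resulting fragments are empty for the current contents of DA, which is consistent with only DA3, DA4, and DA6 being shown above for the New York and Paris projects.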
Derived horizontal fragmentation
Derived horizontal fragmentation is defined on a member relation of a link according to a selection operation specified on the owner relation of that link.
Thus, given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as
$R_i = R \ltimes S_i, \quad 1 \leq i \leq w$
where w is the number of fragments defined on R, and $S_i = \sigma_{F_i}(S)$, with Fi the formula that defines the primary horizontal fragment Si.
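Example 9: Consider the link L1 between the owner relation CT and the member relation NV, where NV is:

NV
MaNV   TenNV       Chucvu
E1     J. Doe      Electrical engineer
E2     M. Smith    Systems analyst
E3     A. Lee      Mechanical engineer
E4     J. Miller   Programmer
E5     B. Casey    Systems analyst
E6     L. Chu      Electrical engineer
E7     R. David    Mechanical engineer
E8     J. Jones    Systems analyst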
[Figure: link L1 from owner CT(Chucvu, Luong) to member NV(MaNV, TenNV, Chucvu)]

Given this link, we can group the engineers into two groups according to salary: those earning at most 30,000 dollars and those earning more than 30,000 dollars. The two fragments NV1 and NV2 are defined as follows:
$NV_1 = NV \ltimes CT_1$
$NV_2 = NV \ltimes CT_2$
where
$CT_1 = \sigma_{Luong \le 30000}(CT)$
$CT_2 = \sigma_{Luong > 30000}(CT)$

CT1
Chucvu                Luong
Mechanical engineer   27000
Programmer            24000

CT2
Chucvu                Luong
Electrical engineer   40000
Systems analyst       34000

The result of the derived horizontal fragmentation of relation NV is as follows:

NV1
MaNV   TenNV       Chucvu
E3     A. Lee      Mechanical engineer
E4     J. Miller   Programmer
E7     R. David    Mechanical engineer

NV2
MaNV   TenNV       Chucvu
E1     J. Doe      Electrical engineer
E2     M. Smith    Systems analyst
E5     B. Casey    Systems analyst
E6     L. Chu      Electrical engineer
E8     J. Jones    Systems analyst
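The derived fragmentation of Example 9 can be reproduced with a few lines of Python. This is a minimal sketch that stores relations as lists of tuples and implements the semijoin on the title attribute by hand; the helper name `semijoin` is an assumption made for illustration, and the title of E2 is taken to be "Systems analyst" so that it matches a title present in CT.

```python
# Member relation NV(MaNV, TenNV, Chucvu) and owner relation CT(Chucvu, Luong).
NV = [
    ("E1", "J. Doe", "Electrical engineer"),
    ("E2", "M. Smith", "Systems analyst"),
    ("E3", "A. Lee", "Mechanical engineer"),
    ("E4", "J. Miller", "Programmer"),
    ("E5", "B. Casey", "Systems analyst"),
    ("E6", "L. Chu", "Electrical engineer"),
    ("E7", "R. David", "Mechanical engineer"),
    ("E8", "J. Jones", "Systems analyst"),
]
CT = [
    ("Electrical engineer", 40000),
    ("Systems analyst", 34000),
    ("Mechanical engineer", 27000),
    ("Programmer", 24000),
]


def semijoin(member, owner_fragment):
    """Keep the member tuples whose title appears in the owner fragment."""
    titles = {title for title, _salary in owner_fragment}
    return [row for row in member if row[2] in titles]


# Primary horizontal fragments of the owner relation.
CT1 = [row for row in CT if row[1] <= 30000]
CT2 = [row for row in CT if row[1] > 30000]

# Derived horizontal fragments of the member relation.
NV1 = semijoin(NV, CT1)
NV2 = semijoin(NV, CT2)

print([row[0] for row in NV1])
print([row[0] for row in NV2])
```

The fragments of NV never mention salary: they are determined entirely by how the owner relation CT was fragmented.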
Notes:
+ To carry out a derived horizontal fragmentation, three inputs are needed:
1. The set of partitions of the owner relation (for example CT1, CT2).
2. The member relation.
3. The set of semijoin predicates between the owner and the member (for example CT.Chucvu = NV.Chucvu).
+ A complication to keep in mind: in a database schema there are often several links into one relation R, so there may be more than one way to fragment R. The choice among them should be based on two criteria:
1. The fragmentation with better join characteristics.
2. The fragmentation used in more applications.
Applying these criteria is, however, not straightforward.
Example 10: We continue the distribution design for the database started in Example 9. Relation NV has been fragmented according to CT. Now consider ASG. Assume the following two applications:
1. Application 1: find the names of engineers who work at a given place. This application runs at all three sites and accesses the engineers who work on local projects more often than those working on projects at other locations.
2. Application 2: at each administrative site, where employee records are managed, users want to access the projects that these employees work on and learn how long they will work on those projects.
Checking correctness
We now check the correctness of horizontal fragmentation.
a. Completeness
+ Primary horizontal fragmentation: provided the selection predicates are complete, the resulting fragmentation is guaranteed to be complete as well. Because the fragmentation algorithm is based on a set Pr' of complete and minimal predicates, completeness is guaranteed as long as no mistakes are made in defining Pr'.
+ Derived horizontal fragmentation: this case is slightly different. The main difficulty is that the predicate defining the fragmentation involves two relations. We first define the completeness rule formally.
Let R be the member relation of a link whose owner is relation S, and let A be the join attribute between R and S. Then for each tuple t of R there must be a tuple t' of S such that
t.A = t'.A
This rule is known as referential integrity. It ensures that every tuple in the fragments of the member relation also appears in the owner relation.
b. Reconstruction
Reconstruction of a global relation from its fragments is performed by the union operator in both primary and derived horizontal fragmentation. Thus, for a relation R with fragmentation FR = {R1, R2, ..., Rm}, we have
$R = \bigcup R_i, \quad \forall R_i \in F_R$
c. Disjointness
For primary fragmentation, disjointness is guaranteed as long as the minterm predicates that determine the fragmentation are mutually exclusive. For derived fragmentation, disjointness can be guaranteed if the join graph is simple.
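The three correctness criteria for a horizontal fragmentation can be checked mechanically on small examples. The sketch below assumes relations are lists of hashable tuples without duplicates; the function name `check_horizontal` and the sample data are hypothetical and serve only to illustrate the definitions above.

```python
def check_horizontal(relation, fragments):
    """Return (complete, reconstructible, disjoint) for a horizontal fragmentation."""
    union = [row for fragment in fragments for row in fragment]
    # Completeness: every tuple of the relation appears in some fragment.
    complete = all(row in union for row in relation)
    # Reconstruction: the union of the fragments is exactly the relation.
    reconstructible = set(union) == set(relation)
    # Disjointness: no tuple appears in more than one fragment.
    disjoint = len(union) == len(set(union))
    return complete, reconstructible, disjoint


relation = [
    ("E1", "Electrical engineer"),
    ("E3", "Mechanical engineer"),
    ("E4", "Programmer"),
    ("E5", "Systems analyst"),
]

# A fragmentation by title group: low-salary titles versus high-salary titles.
good = [
    [r for r in relation if r[1] in ("Mechanical engineer", "Programmer")],
    [r for r in relation if r[1] in ("Electrical engineer", "Systems analyst")],
]
# A deliberately broken fragmentation: one tuple duplicated, one tuple missing.
bad = [relation[:2], relation[1:3]]

print(check_horizontal(relation, good))
print(check_horizontal(relation, bad))
```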
2.3. Vertical fragmentation
A vertical fragmentation of a relation R produces fragments R1, R2, ..., Rr, each containing a subset of the attributes of R together with the key of R. The purpose of vertical fragmentation is to partition a relation into a set of smaller relations so that many applications need to run on only one fragment. An optimal fragmentation is one that produces a fragmentation scheme minimizing the execution time of the applications that run on these fragments.
Vertical fragmentation is inherently more complicated than horizontal fragmentation, because the total number of possible vertical partitions is very large.
Obtaining optimal solutions to the vertical partitioning problem is therefore very difficult, and heuristic methods must be used. There are two types of heuristics for the vertical fragmentation of global relations:
- Grouping: start by assigning each attribute to its own fragment and, at each step, join some of the fragments until some criterion is satisfied. This technique was first proposed for centralized databases and later used for distributed databases.
- Splitting: start with the whole relation and decide on beneficial partitionings based on the access behavior of the applications to the attributes.
Because vertical partitioning places in one fragment those attributes that are usually accessed together, a measure is needed that defines more precisely the notion of "togetherness". This measure is called the affinity of attributes; it indicates how closely related the attributes are.
The main data requirement related to applications is their access frequencies. Let Q = {q1, q2, ..., qq} be the set of user queries (applications) that run on relation R(A1, A2, ..., An). Then, for each query qi and each attribute Aj, we define an attribute usage value, denoted use(qi, Aj), as follows:
use(qi, Aj) = 1 if attribute Aj is referenced by query qi
use(qi, Aj) = 0 otherwise
The use(qi, •) vectors for each application are easy to define if the designer knows the applications that will run on the database.
Example 11:
Consider relation DA and assume that the following applications run on it. In each case the SQL specification is also given.
q1: Find the budget of a project, given its identification number
    SELECT NganSach
    FROM DA
    WHERE MaDA = value
q2: Find the names and budgets of all projects
    SELECT TenDA, NganSach
    FROM DA
q3: Find the names of the projects located at a given city
    SELECT TenDA
    FROM DA
    WHERE DiaDiem = value
q4: Find the total project budget for each city
    SELECT SUM(NganSach)
    FROM DA
    WHERE DiaDiem = value
According to these four applications, we can define the attribute usage values. For notational convenience, let A1 = MaDA, A2 = TenDA, A3 = NganSach, A4 = DiaDiem. The usage values are given as a matrix in which entry (i, j) denotes use(qi, Aj):

      A1   A2   A3   A4
q1     1    0    1    0
q2     0    1    1    0
q3     0    1    0    1
q4     0    0    1    1

Attribute affinity
The attribute usage values alone are not a sufficient basis for splitting and fragmentation, because they do not represent the weight of the application frequencies. The affinity measure aff(Ai, Aj), which expresses the bond between two attributes of a relation according to how they are accessed by applications, is the quantity needed for fragmentation.
We now build the formula for the affinity of two attributes Ai and Aj.
Let k be the number of fragments of R, that is, R is fragmented into R1, ..., Rk.
Let Q = {q1, q2, ..., qm} be the set of queries (the applications that run on relation R). Let Q(A, B) be the set of applications q in Q for which use(q, A) · use(q, B) = 1. In other words:
Q(A, B) = {q ∈ Q : use(q, A) = use(q, B) = 1}
For example, from the matrix above we see that Q(A1, A1) = {q1}, Q(A2, A2) = {q2, q3}, Q(A3, A3) = {q1, q2, q4}, Q(A4, A4) = {q3, q4}, Q(A1, A2) = ∅, Q(A1, A3) = {q1}, Q(A2, A3) = {q2}, and so on.
The affinity measure between two attributes Ai and Aj is defined as:
$aff(A_i, A_j) = \sum_{q_k \in Q(A_i, A_j)} \sum_{\forall R_l} ref_l(q_k) \, acc_l(q_k)$
or, equivalently:
$aff(A_i, A_j) = \sum_{k \,:\, use(q_k, A_i) = 1 \wedge use(q_k, A_j) = 1} \sum_{\forall R_l} ref_l(q_k) \, acc_l(q_k)$
where refl(qk) is the number of accesses to attributes (Ai, Aj) for each execution of application qk at site Rl, and accl(qk) is the access frequency of application qk to attributes Ai, Aj at site l. Note that only the applications q that use both Ai and Aj appear in the formula for aff(Ai, Aj).
The result of this computation is a symmetric n x n matrix, each element of which is one of the measures defined above. We call it the attribute affinity matrix (AA).
Example 12: We continue with Example 11. For simplicity, assume that refl(qk) = 1 for all qk and Rl. If the application frequencies are:
acc1(q1) = 15    acc2(q1) = 20    acc3(q1) = 10
acc1(q2) = 5     acc2(q2) = 0     acc3(q2) = 0
acc1(q3) = 25    acc2(q3) = 25    acc3(q3) = 25
acc1(q4) = 3     acc2(q4) = 0     acc3(q4) = 0
then the affinity measure between attributes A1 and A3 is:
$aff(A_1, A_3) = \sum_{k=1}^{1} \sum_{t=1}^{3} acc_t(q_k) = acc_1(q_1) + acc_2(q_1) + acc_3(q_1) = 45$
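The same computation can be carried out for every pair of attributes. The sketch below is a minimal Python illustration under the assumption of Example 12 that ref = 1 everywhere; the dictionaries `use` and `acc` encode the usage matrix and the per-site frequencies given above, and their layout is a choice made here for illustration.

```python
# Attributes referenced by each query (the usage matrix of Example 11).
use = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}
# Access frequencies of each query at sites 1, 2, 3 (Example 12).
acc = {
    "q1": [15, 20, 10],
    "q2": [5, 0, 0],
    "q3": [25, 25, 25],
    "q4": [3, 0, 0],
}
attrs = ["A1", "A2", "A3", "A4"]


def aff(a, b):
    """Sum the frequencies, over all sites, of the queries that use both a and b."""
    return sum(sum(acc[q]) for q, used in use.items() if a in used and b in used)


AA = [[aff(a, b) for b in attrs] for a in attrs]

for name, row in zip(attrs, AA):
    print(name, row)
```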
Computing the remaining pairs in the same way gives the following affinity matrix:

      A1   A2   A3   A4
A1    45    0   45    0
A2     0   80    5   75
A3    45    5   53    3
A4     0   75    3   78

Bond energy algorithm (BEA)
At this point R could be split into fragments made of attribute groups based on the affinity between attributes; for example, the affinity of A1 and A3 is 45, that of A2 and A4 is 75, while that of A1 and A2 is 0 and that of A3 and A4 is 3. However, such a linear method that uses the matrix directly is seldom used. Instead, we consider the bond energy algorithm (BEA) of Hoffer and Severance (1975) and Navathe et al. (1984), which has the following properties:
1. It is designed specifically to determine groups of similar items, as opposed to a linear ordering of the items.
2. The final groupings are insensitive to the order in which the items are presented to the algorithm.
3. The computation time of the algorithm is reasonable: O(n^2), where n is the number of attributes.
4. Secondary interrelationships between clustered attribute groups are identifiable.
BEA takes the attribute affinity matrix (AA) as input, permutes its rows and columns, and generates a clustered affinity matrix (CA). The permutation is done so as to maximize the global affinity measure AM, where:
$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j) \, [aff(A_i, A_{j-1}) + aff(A_i, A_{j+1}) + aff(A_{i-1}, A_j) + aff(A_{i+1}, A_j)]$
with $aff(A_0, A_j) = aff(A_i, A_0) = aff(A_{n+1}, A_j) = aff(A_i, A_{n+1}) = 0$ for all i, j.
The last set of conditions covers the cases where an attribute is placed in CA to the left of the leftmost attribute or to the right of the rightmost attribute during column permutations, and above the topmost row or below the last row during row permutations. In these cases we take 0 as the aff value between the attribute being considered and its left or right (top or bottom) neighbors, which do not exist in CA.
The maximization function considers only the nearest neighbors, so it groups large values with large values and small values with small values. Because the attribute affinity matrix AA is symmetric, the objective function above reduces to:
$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j) \, [aff(A_i, A_{j-1}) + aff(A_i, A_{j+1})]$
The clustered affinity matrix (CA) is generated in three steps:
Step 1: Initialization
Place and fix one of the columns of AA in CA. In this algorithm, columns 1 and 2 are chosen.
Step 2: Iteration
Pick each of the remaining n-i columns in turn (where i is the number of columns already placed in CA) and try to place it in each of the i+1 possible positions in CA. Choose the placement that makes the greatest contribution to the global affinity measure AM. Continue until no columns remain to be placed.
Step 3: Row ordering
Once the column ordering is determined, the rows are reordered so that their relative positions match the relative positions of the columns.
Algorithm BEA
Input: AA - attribute affinity matrix;
Output: CA - clustered affinity matrix, obtained after rearranging the rows and columns;
Begin
    {Initialization: recall that AA is an n x n matrix}
    CA(•, 1) ← AA(•, 1)
    CA(•, 2) ← AA(•, 2)
    index := 3
while index
To understand the algorithm we need to know what cont(*,*,*) is. Recall that the global affinity measure AM was defined as:
$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j) \, [aff(A_i, A_{j-1}) + aff(A_i, A_{j+1})]$
which can be rewritten as:
$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} [aff(A_i, A_j) \, aff(A_i, A_{j-1}) + aff(A_i, A_j) \, aff(A_i, A_{j+1})]$
$\quad = \sum_{j=1}^{n} [\sum_{i=1}^{n} aff(A_i, A_j) \, aff(A_i, A_{j-1}) + \sum_{i=1}^{n} aff(A_i, A_j) \, aff(A_i, A_{j+1})]$
We define the bond between two attributes Ax and Ay as:
$bond(A_x, A_y) = \sum_{z=1}^{n} aff(A_z, A_x) \, aff(A_z, A_y)$
Then AM can be written as:
$AM = \sum_{j=1}^{n} [bond(A_j, A_{j-1}) + bond(A_j, A_{j+1})]$
Now consider the following n attributes:
A1 A2 ... Ai-1 Ai Aj Aj+1 ... An
Let AM' denote the contribution of the bonds among A1, ..., Ai and AM'' the contribution of the bonds among Aj, ..., An. The global affinity measure for these attributes can then be written as:
$AM_{old} = AM' + AM'' + bond(A_i, A_j) + bond(A_j, A_i) = AM' + AM'' + 2 \, bond(A_i, A_j)$
Now consider placing a new attribute Ak between attributes Ai and Aj in the clustered affinity matrix. The new global affinity measure can be written similarly as:
$AM_{new} = AM' + AM'' + bond(A_i, A_k) + bond(A_k, A_i) + bond(A_k, A_j) + bond(A_j, A_k) = AM' + AM'' + 2 \, bond(A_i, A_k) + 2 \, bond(A_k, A_j)$
Thus the net contribution to the global affinity measure of placing attribute Ak between Ai and Aj is:
$cont(A_i, A_k, A_j) = AM_{new} - AM_{old} = 2 \, bond(A_i, A_k) + 2 \, bond(A_k, A_j) - 2 \, bond(A_i, A_j)$
When Ak is placed to the left of the leftmost attribute, bond(A0, Ak) = 0. When Ak is placed to the right of the rightmost attribute, no attribute has yet been placed in column k+1 of CA, so bond(Ak, Ak+1) = 0.
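The bond and cont measures, and the column-placement loop they drive, can be sketched in Python. This is a simplified illustration rather than a transcription of the algorithm listing: it assumes the AA matrix of Example 12, fixes columns 1 and 2 first, breaks ties in favor of the leftmost position, and uses `None` as a stand-in for the non-existent boundary attributes A0 and An+1. The function names are choices made here.

```python
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]


def bond(aff, x, y):
    """bond(Ax, Ay) = sum over z of aff(Az, Ax) * aff(Az, Ay)."""
    return sum(aff[z][x] * aff[z][y] for z in range(len(aff)))


def cont(aff, left, mid, right):
    """Net contribution of placing column `mid` between `left` and `right`."""
    def b(x, y):
        # A missing neighbor (boundary of CA) contributes a bond of 0.
        return 0 if x is None or y is None else bond(aff, x, y)
    return 2 * b(left, mid) + 2 * b(mid, right) - 2 * b(left, right)


def bea(aff):
    """Return the column order chosen by BEA and the clustered matrix CA."""
    n = len(aff)
    order = [0, 1]  # initialization: fix the first two columns
    for k in range(2, n):
        best_pos, best_val = None, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else None
            right = order[pos] if pos < len(order) else None
            val = cont(aff, left, k, right)
            if best_val is None or val > best_val:
                best_pos, best_val = pos, val
        order.insert(best_pos, k)
    # Reorder the rows in the same way as the columns.
    ca = [[aff[i][j] for j in order] for i in order]
    return order, ca


order, CA = bea(AA)
print([f"A{i + 1}" for i in order])
print(cont(AA, 0, 3, 1))  # contribution of placing A4 between A1 and A2
```

With this input the sketch orders the columns as A1, A3, A2, A4, which keeps the attributes with strong mutual affinity next to each other.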
Example 13: Consider the matrix given in Example 12 and compute the contribution of moving attribute A4 between attributes A1 and A2, which is given by the formula:
$cont(A_1, A_4, A_2) = 2 \, bond(A_1, A_4) + 2 \, bond(A_4, A_2) - 2 \, bond(A_1, A_2)$
With bond(A1, A4) = 135, bond(A4, A2) = 11865 and bond(A1, A2) = 225, this gives cont(A1, A4, A2) = 270 + 23730 - 450 = 23550.
(a)
      A1   A2
A1    45    0
A2     0   80
A3    45    5
A4     0   75

(b)
      A1   A3   A2
A1    45   45    0
A2     0    5   80
A3    45   53    5
A4     0    3   75

(c)
      A1   A3   A2   A4
A1    45   45    0    0
A2     0    5   80   75
A3    45   53    5    3
A4     0    3   75   78

(d)
      A1   A3   A2   A4
A1    45   45    0    0
A3    45   53    5    3
A2     0    5   80   75
A4     0    3   75   78
In the figure above we see the formation of two clusters: one in the upper left corner, containing the smaller affinity values, and the other in the lower right corner, containing the larger affinity values. This clustering indicates how the attributes of relation DA should be split. In general, however, the boundary of the split is not so clear-cut. When the CA matrix is large, more clusters are usually formed and there are more candidate partitionings. The problem therefore needs to be approached more systematically.
Partitioning algorithm
The purpose of splitting is to find sets of attributes that are accessed solely, or for the most part, by distinct sets of applications. Consider the clustered attribute matrix:
[Figure: clustered affinity matrix CA with rows and columns A1, A2, ..., Ai, Ai+1, ..., An; a split point after Ai separates the upper left block TA from the lower right block BA]
Output: F: set of fragments;
Begin
    {determine the value of z for the first column}
    {the subscripts in the cost equations indicate the split point}
    compute CTQ(n-1)
    compute CBQ(n-1)
    compute COQ(n-1)
    best ← CTQ(n-1) * CBQ(n-1) - (COQ(n-1))^2
    do   {determine the best partitioning}
    begin
        for i from n-2 to 1 by -1 do
        begin
            compute CTQ(i)
            compute CBQ(i)
            compute COQ(i)
            z ← CTQ(i) * CBQ(i) - (COQ(i))^2
            if z > best then
            begin
                best ← z
                record the split point within the shift
            end-if
        end-for
        call SHIFT(CA)
    end-begin
    until no more SHIFT is possible
    reconstruct the matrix according to the shift position
    R1 ← Π_TA(R) ∪ K   {K is the set of primary key attributes of R}
    R2 ← Π_BA(R) ∪ K
    F ← {R1, R2}
End. {PARTITION}
Applying this algorithm to the CA matrix obtained for relation DA gives the fragment definitions F_DA = {DA1, DA2}, where DA1 = {A1, A3} and DA2 = {A1, A2, A4}. Thus
DA1 = {MaDA, NganSach}
DA2 = {MaDA, TenDA, DiaDiem}
(Here MaDA is the key attribute of DA.)
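The choice of the split point can be illustrated with a short Python sketch. It is a simplification: the SHIFT step of PARTITION is omitted, so only the splits along the given column order are examined. It also assumes the usual reading of the cost terms, which is not spelled out in the surviving text: CTQ is the total frequency of the queries that use only attributes of the top group TA, CBQ that of the queries that use only attributes of the bottom group BA, and COQ that of the queries that use both groups.

```python
# Attributes referenced by each query and total access frequency over all sites.
use = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}
total_acc = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

# Column order of the clustered affinity matrix CA.
ca_order = ["A1", "A3", "A2", "A4"]


def split_cost(i):
    """z = CTQ * CBQ - COQ^2 for a split after the i-th column of CA."""
    ta, ba = set(ca_order[:i]), set(ca_order[i:])
    ctq = sum(total_acc[q] for q, used in use.items() if used <= ta)
    cbq = sum(total_acc[q] for q, used in use.items() if used <= ba)
    coq = sum(
        total_acc[q] for q, used in use.items()
        if not (used <= ta or used <= ba)
    )
    return ctq * cbq - coq ** 2


# Examine every split point and keep the one with the largest z.
best_i = max(range(1, len(ca_order)), key=split_cost)
TA, BA = ca_order[:best_i], ca_order[best_i:]

print(TA, BA, split_cost(best_i))
```

The key attribute A1 is then added to the second group as well, which gives the two fragments {A1, A3} and {A1, A2, A4}.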
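Checking correctness:
Completeness: guaranteed by the PARTITION algorithm, because each attribute of the global relation is assigned to one of the fragments.
Reconstruction: for a relation R with vertical fragmentation FR = {R1, R2, ..., Rr} and key attributes K,
$R = \bowtie_K R_i, \quad \forall R_i \in F_R$
Therefore, as long as each Ri is complete, the join operation properly reconstructs R. An important point is that each fragment Ri must contain the key attributes of R.
2.5. Hybrid fragmentation
In most cases a simple horizontal or vertical fragmentation of a database schema is not sufficient to satisfy the requirements of the applications. In that case a vertical fragmentation may be applied after a horizontal one, or vice versa, producing a tree-structured partitioning. Because the two strategies are applied one after the other, this alternative is called hybrid fragmentation.
[Figure: hybrid fragmentation tree. R is split horizontally (H) into R1 and R2; R1 is split vertically (V) into R11 and R12; R2 is split vertically (V) into R21, R22 and R23.]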
2.6. Allocation
2.6.1 The allocation problem
Assume there is a set of fragments F = {F1, F2, ..., Fn} and a network consisting of sites S = {S1, S2, ..., Sm}, on which a set of applications Q = {q1, q2, ..., qq} is running.
The allocation problem is to find an optimal distribution of F to S.
Optimality can be defined with respect to two measures:
- Minimal cost: the cost function consists of the cost of storing each fragment Fi at site Sj, the cost of querying Fi at site Sj, the cost of updating Fi at all sites where it is stored, and the cost of data communication. The allocation problem then attempts to find an allocation scheme that minimizes a combined cost function.
- Performance: the allocation strategy is designed to maintain a performance metric. Two well-known ones are to minimize the response time and to maximize the system throughput at each site.
In general, the allocation problem is complex: it is NP-complete. Research has therefore been devoted to finding good heuristics that yield near-optimal solutions.
2.6.2 Information requirements
At the allocation stage we need quantitative information about the database, the applications that run on it, the network structure, and the processing capability and storage limits of each site on the network.
Database information
The selectivity of a fragment Fj with respect to query qi is the number of tuples of Fj that need to be accessed in order to process qi. This value is denoted seli(Fj).
The size of a fragment Fj is given by
size(Fj) = card(Fj) * length(Fj)
where length(Fj) is the length (in bytes) of a tuple of fragment Fj.
Application information
Two important measures are the number of read accesses that query qi makes to fragment Fj during one execution (denoted RRij) and its counterpart for update accesses (URij). These may, for example, count the number of block accesses required by the query.
We define two matrices UM and RM, with elements uij and rij respectively, specified as follows:
uij = 1 if query qi updates fragment Fj
uij = 0 otherwise
rij = 1 if query qi retrieves from fragment Fj
rij = 0 otherwise
A vector O of values o(i) is also defined, where o(i) specifies the originating site of query qi.
Site information
For each site we need to know its storage and processing capacity. These values can be computed by means of appropriate functions or by simple estimates.
+ The unit cost of storing data at site Sk is denoted USCk.
+ LPCk is the cost of processing one unit of work at site Sk. The work unit should be identical to that of the RR and UR measures.
Network information
We assume a simple network in which gij denotes the communication cost per frame between sites Si and Sj. To enable the calculation of the number of messages, we use fsize as the size (in bytes) of one frame.
2.6.3. Allocation model
The allocation model aims to minimize the total cost of processing and storage while trying to meet the response-time requirements. The model has the following form:
min(Total Cost)
subject to the response-time constraint, the storage constraint and the processing constraint.
The decision variable xij is defined as
xij = 1 if fragment Fi is stored at site Sj
xij = 0 otherwise
Total cost
The total cost function has two components: query processing and storage. It can therefore be expressed as
$TOC = \sum_{q_i \in Q} QPC_i + \sum_{S_k \in S} \sum_{F_j \in F} STC_{jk}$
where QPCi is the query processing cost of application qi, and STCjk is the cost of storing fragment Fj at site Sk.
Consider the storage cost first. It is given by
$STC_{jk} = USC_k * size(F_j) * x_{jk}$
The query processing cost is more difficult to specify. Most models of the file allocation problem (FAP) separate it into two components: the retrieval-only processing cost and the update processing cost. Here we choose a different approach in the model for the database allocation problem (DAP) and specify it as consisting of the processing cost PC and the transmission cost TC. Thus the query processing cost QPC for application qi is
$QPC_i = PC_i + TC_i$
The processing component PC consists of three cost factors: the access cost AC, the integrity enforcement cost IE, and the concurrency control cost CC:
$PC_i = AC_i + IE_i + CC_i$
The detailed specification of each cost factor depends on the algorithms used to accomplish these tasks. To illustrate, we specify AC in detail:
$AC_i = \sum_{S_k \in S} \sum_{F_j \in F} (u_{ij} * UR_{ij} + r_{ij} * RR_{ij}) * x_{jk} * LPC_k$
The first two terms in the formula calculate the number of accesses of query qi to fragment Fj. Note that (URij + RRij) gives the total number of read and update accesses. We assume that the local costs of processing them are identical. The summation gives the total number of accesses for all the fragments referenced by qi. Multiplying by LPCk gives the cost of this access at site Sk. We again use xjk to select only the cost values of the sites where the fragments are stored.
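The storage and access cost formulas above translate directly into code. The following Python sketch evaluates them for one candidate allocation; all numbers (two fragments, two sites, a single query) are hypothetical values chosen only to make the arithmetic easy to follow, and the function names are introduced here for illustration.

```python
def storage_cost(usc, size, x):
    """Sum of STC_jk = USC_k * size(F_j) * x_jk over all fragments j and sites k."""
    return sum(
        usc[k] * size[j] * x[j][k]
        for j in range(len(size))
        for k in range(len(usc))
    )


def access_cost(i, u, r, ur, rr, x, lpc):
    """AC_i = sum over sites k and fragments j of (u_ij*UR_ij + r_ij*RR_ij) * x_jk * LPC_k."""
    return sum(
        (u[i][j] * ur[i][j] + r[i][j] * rr[i][j]) * x[j][k] * lpc[k]
        for k in range(len(lpc))
        for j in range(len(x))
    )


# Hypothetical input: fragments F1, F2; sites S1, S2; one query q1.
usc = [2, 3]            # unit storage cost at each site
lpc = [1, 2]            # unit processing cost at each site
size = [100, 50]        # size of each fragment
x = [[1, 0], [1, 1]]    # x[j][k] = 1 if fragment j is stored at site k
u = [[0, 1]]            # q1 updates F2 only
r = [[1, 1]]            # q1 reads F1 and F2
ur = [[0, 4]]           # update accesses of q1 per fragment
rr = [[10, 6]]          # read accesses of q1 per fragment

print(storage_cost(usc, size, x))
print(access_cost(0, u, r, ur, rr, x, lpc))
```

An allocation algorithm would evaluate such functions for many candidate matrices x and keep the cheapest one that satisfies the constraints.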
An important point should be made here. The access cost function assumes that processing a query involves decomposing it into a set of subqueries, each of which works on a fragment stored at a site, followed by transmitting the results back to the site where the query originated.
The integrity enforcement cost factor can be specified much like the processing component, except that the unit local processing cost would change to reflect the true cost of integrity enforcement.
The transmission cost function can be formulated along the lines of the access cost function. However, the data transmission overheads for update requests and for retrieval requests are quite different. In update queries it is necessary to inform all the sites where replicas exist, whereas in retrieval queries it is sufficient to access only one of the copies. In addition, at the end of an update request no data is transmitted back to the originating site other than a confirmation message, whereas retrieval-only queries may result in significant data transmission.
The update component of the transmission function is:
$TCU_i = \sum_{S_k \in S} \sum_{F_j \in F} u_{ij} * x_{jk} * g_{o(i),k} + \sum_{S_k \in S} \sum_{F_j \in F} u_{ij} * x_{jk} * g_{k,o(i)}$
The first term is for sending the update message from the originating site o(i) of qi to all the fragment replicas that need to be updated. The second term is for the confirmation.
The retrieval cost component can be specified as:
$TCR_i = \sum_{F_j \in F} \min_{S_k \in S} \left( r_{ij} * x_{jk} * g_{o(i),k} + r_{ij} * x_{jk} * \frac{sel_i(F_j) * length(F_j)}{fsize} * g_{k,o(i)} \right)$
The first term in TCR represents the cost of transmitting the retrieval request to the sites that hold copies of the fragments to be accessed. The second term accounts for the transmission of the results from these sites to the originating site. The equation states that, among all the sites holding copies of the same fragment, only the site that yields the minimum total transmission cost is selected to execute the operation.
The transmission cost function for query qi can now be specified as:
$TC_i = TCU_i + TCR_i$
Constraints
The response-time constraint should be specified as
execution time of qi ≤ maximum response time of qi, ∀qi ∈ Q
It is preferable to specify the cost measure of the objective function in terms of time, because this simplifies the specification of the execution-time constraint.
The storage constraint is:
$\sum_{F_j \in F} STC_{jk} \le$ storage capacity at site Sk, ∀Sk ∈ S
whereas the processing constraint is:
$\sum_{q_i \in Q}$ processing load of qi at site Sk ≤ processing capacity of Sk, ∀Sk ∈ S
CHAPTER 3. QUERY PROCESSING
The context chosen here is relational calculus and relational algebra. As we have seen, distributed relations are implemented by fragments. Distributed database design is of major importance for query processing, since the definition of the fragments is intended to increase reference locality, and sometimes parallel execution, for the most important queries. The role of a distributed query processor is to map a high-level query on a distributed database into a sequence of relational algebra operations on the fragments. Several important functions characterize this mapping. First, the query must be decomposed into a sequence of relational operations called an algebraic query. Second, the data accessed must be localized so that the operations on relations are translated into operations on local data (fragments). Finally, the algebraic query on fragments must be extended with communication operations and optimized so that a cost function is minimized. This cost function refers to computing resources such as disk I/O, CPU, and the communication network.
3.1. The query processing problem
Two basic optimization methods are used in query processors: algebraic transformation and cost estimation strategies.
Algebraic transformation simplifies queries through algebraic transformations that lower the cost of answering the query, independently of the actual data and of the physical structure of the data.
The main function of a relational query processor is to transform a high-level query into an equivalent lower-level query expressed in relational algebra. The low-level query actually implements the execution strategy for the query. The transformation must achieve both correctness and efficiency. A transformation is correct if the low-level query has the same semantics as the original query, that is, if both produce the same result. A query may have many equivalent transformations into relational algebra. Because each equivalent execution strategy can consume computer resources very differently, the main difficulty is to select the strategy that minimizes resource consumption.
Example 3.1:
Consider the following subset of the database schema:
NV(MaNV, TenNV, Chucvu)
PC(MaNV, MaDA, NhiemVu, Thoigian)
and the following simple query:
"Find the names of the employees who are currently managing a project"
The query expressed in relational calculus using SQL syntax is:
    SELECT TenNV
    FROM NV, PC
    WHERE NV.MaNV = PC.MaNV
    AND NhiemVu = "Quản lý"
Two equivalent relational algebra expressions that are correct transformations of the query above are:
$\Pi_{TenNV}(\sigma_{NhiemVu = \text{"Quản lý"} \wedge NV.MaNV = PC.MaNV}(NV \times PC))$
and
$\Pi_{TenNV}(NV \bowtie$ …
$NV_2 = \sigma_{MaNV > \text{"E3"}}(NV)$
$PC_1 = \sigma_{MaNV \le \text{"E3"}}(PC)$
$PC_2 = \sigma_{MaNV > \text{"E3"}}(PC)$
Fragments PC1, PC2, NV1 and NV2 are stored at sites 1, 2, 3 and 4, respectively, and the result is expected at site 5.
An arrow from site i to site j labeled R indicates that relation R is transferred from site i to site j. Strategy A exploits the fact that relations NV and PC are fragmented in the same way, so that the selection and join operations can be performed in parallel. Strategy B centralizes all the data at the result site before processing the query.
[Figure 4.1 a) Strategy A: result = NV1' ∪ NV2']
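where pij is a simple predicate. By contrast, a qualification in disjunctive normal form is as follows:
(p11 ∧ p12 ∧ ... ∧ p1n) ∨ ... ∨ (pm1 ∧ pm2 ∧ ... ∧ pmn)
The transformation of quantifier-free predicates is straightforward using the well-known equivalence rules for the logical operators (∧, ∨, ¬):
1. p1 ∧ p2 ⇔ p2 ∧ p1
2. p1 ∨ p2 ⇔ p2 ∨ p1
3. p1 ∧ (p2 ∧ p3) ⇔ (p1 ∧ p2) ∧ p3
4. p1 ∨ (p2 ∨ p3) ⇔ (p1 ∨ p2) ∨ p3
5. p1 ∧ (p2 ∨ p3) ⇔ (p1 ∧ p2) ∨ (p1 ∧ p3)
6. p1 ∨ (p2 ∧ p3) ⇔ (p1 ∨ p2) ∧ (p1 ∨ p3)
7. ¬(p1 ∧ p2) ⇔ ¬p1 ∨ ¬p2
8. ¬(p1 ∨ p2) ⇔ ¬p1 ∧ ¬p2
9. ¬(¬p) ⇔ p
In disjunctive normal form, the query can be processed as independent conjunctive subqueries linked by unions (corresponding to the disjunctions).
Remark: the disjunctive normal form is rarely used, because it leads to replicated join and selection predicates. The conjunctive normal form is more common in practice.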
Example 3.3:
Find the names of the employees who have been working on project P1 for 12 or 24 months.
The query expressed in SQL is:
    SELECT TenNV
    FROM NV, PC
    WHERE NV.MaNV = PC.MaNV
    AND PC.MaDA = "P1"
    AND (Thoigian = 12 OR Thoigian = 24)
The qualification in conjunctive normal form is:
NV.MaNV = PC.MaNV ∧ PC.MaDA = "P1" ∧ (Thoigian = 12 ∨ Thoigian = 24)
while the qualification in disjunctive normal form is:
(NV.MaNV = PC.MaNV ∧ PC.MaDA = "P1" ∧ Thoigian = 12) ∨
(NV.MaNV = PC.MaNV ∧ PC.MaDA = "P1" ∧ Thoigian = 24)
In the latter form, treating the two conjunctions independently may lead to redundant work if the common subexpressions are not eliminated.
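The passage from the conjunctive to the disjunctive form relies on rule 5 (distributing ∧ over ∨). A minimal Python sketch of this distribution is given below. It assumes a hypothetical expression encoding, introduced only for this illustration: an atomic predicate is a string, and a compound expression is a pair ("and" | "or", list of subexpressions). Negation is not handled.

```python
from itertools import product


def to_dnf(expr):
    """Return the disjunctive normal form as a list of conjunctions (lists of atoms)."""
    if isinstance(expr, str):
        return [[expr]]
    op, args = expr
    parts = [to_dnf(arg) for arg in args]
    if op == "or":
        # A disjunction simply collects the conjunctions of its operands.
        return [conj for part in parts for conj in part]
    if op == "and":
        # Rule 5: pick one conjunction from each operand and merge them.
        return [sum(combo, []) for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op}")


# Qualification of Example 3.3 in conjunctive normal form.
qualification = (
    "and",
    [
        "NV.MaNV = PC.MaNV",
        'PC.MaDA = "P1"',
        ("or", ["Thoigian = 12", "Thoigian = 24"]),
    ],
)

for conjunction in to_dnf(qualification):
    print(" AND ".join(conjunction))
```

The output contains the join predicate and the project predicate twice, once per conjunction, which is exactly the replication that makes the disjunctive form less attractive.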
Analysis
Query analysis makes it possible to reject normalized queries for which further processing is either impossible or unnecessary. The main reasons for rejection are that the query is type incorrect or semantically incorrect.
- A query is type incorrect if any of its attribute or relation names is not defined in the global schema, or if operations are applied to attributes of the wrong type. For example, the following query is type incorrect because MaDA is not an attribute of NV and because TenNV, a character string, is compared with a number:
    SELECT MaDA
    FROM NV
    WHERE TenNV > 200
- A query is semantically incorrect if some of its components do not contribute in any way to the generation of the result.
If a query does not contain disjunction and negation, a query graph can be used to detect this. This covers queries that involve selection, join and projection.
- Representation as a query graph:
+ One node represents the result relation.
+ The other nodes represent operand relations.
+ An edge between two nodes that are not the result represents a join.
+ An edge whose destination node is the result represents a projection.
+ A node that is not the result may be labeled with a selection predicate or a self-join predicate (a join of the relation with itself).
- Join graph: an important subgraph of the query graph, in which only the joins are considered.
Example 3.4:
PC(MaNV, MaDA, NV, Tgian)
NV(MaNV, TenNV, CV)
DA(MaDA, TenDA, Kph, Diem)
where TnNV=Mai
Query rewriting
This step is divided into two substeps:
(1) Transformation of the query from relational calculus into relational algebra.
(2) Restructuring of the relational algebra query to improve performance.
For clarity, the relational algebra query is represented graphically by an operator tree. An operator tree is a tree in which each leaf node is a relation stored in the database and each non-leaf node is an intermediate relation produced by a relational algebra operator. The sequence of operations is directed from the leaves to the root, which represents the answer to the query.
The transformation of a relational calculus query into an operator tree can easily be done as follows. First, a leaf is created for each relation; in SQL, the leaves are immediately available in the FROM clause. Second, the root node is created as a projection involving the result attributes; these are found in the SELECT clause of the SQL query. Third, the qualification (the WHERE clause in SQL) is translated into the appropriate sequence of relational operations (selection, join, union, and so on) going from the leaves to the root. The sequence can be given directly by the order of appearance of the predicates and operators.
Example 3.7:
Query: find the names of the employees other than J. Doe who worked on the CAD/CAM project for either one or two years.
The SQL expression is:
    SELECT TenNV
    FROM DA, PC, NV
    WHERE PC.MaNV = NV.MaNV
    AND PC.MaDA = DA.MaDA
    AND TenNV <> "J. Doe"
    AND DA.TenDA = "CAD/CAM"
    AND (Thoigian = 12 OR Thoigian = 24)
This query can be mapped to the operator tree shown in the figure below.
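[Figure: operator tree for Example 3.7. Leaves: PC, NV, DA. From the leaves to the root: join on MaNV, join on MaDA, selection TenNV ≠ "J. Doe", selection TenDA = "CAD/CAM", selection Thoigian = 12 ∨ Thoigian = 24, and projection on TenNV at the root.]
By applying transformation rules, many different trees may be found that are equivalent to the one produced by the method described above. The six most useful equivalence rules, which concern the basic relational algebra operators, are given below.
R, S and T are relations, where R is defined over the attributes A = {A1, A2, ..., An} and S is defined over the attributes B = {B1, B2, ..., Bn}.
1. Commutativity of binary operators
$R \times S \Leftrightarrow S \times R$
$R \bowtie S \Leftrightarrow S \bowtie R$
This rule also applies to union, but not to set difference or semijoin.
2. Associativity of binary operators
$(R \times S) \times T \Leftrightarrow R \times (S \times T)$
$(R \bowtie S) \bowtie T \Leftrightarrow R \bowtie (S \bowtie T)$
3. Idempotence of unary operators
If R is defined over the attribute set A, and A' ⊆ A, A'' ⊆ A and A' ⊆ A'', then
$\Pi_{A'}(\Pi_{A''}(R)) \Leftrightarrow \Pi_{A'}(R)$
$\sigma_{p_1(A_1)}(\sigma_{p_2(A_2)}(R)) \Leftrightarrow \sigma_{p_1(A_1) \wedge p_2(A_2)}(R)$
where pi is a predicate applied to attribute Ai.
4. Commuting selection with projection
$\Pi_{A_1, ..., A_n}(\sigma_{p(A_p)}(R)) \Leftrightarrow \Pi_{A_1, ..., A_n}(\sigma_{p(A_p)}(\Pi_{A_1, ..., A_n, A_p}(R)))$
Note that if Ap is already a member of {A1, A2, ..., An}, the last projection on {A1, A2, ..., An} on the right-hand side of the equivalence has no effect.
5. Commuting selection with binary operators
$\sigma_{p(A_i)}(R \times S) \Leftrightarrow (\sigma_{p(A_i)}(R)) \times S$
$\sigma_{p(A_i)}(R \bowtie_{p(A_j, B_k)} S) \Leftrightarrow (\sigma_{p(A_i)}(R)) \bowtie_{p(A_j, B_k)} S$
$\sigma_{p(A_i)}(R \cup T) \Leftrightarrow \sigma_{p(A_i)}(R) \cup \sigma_{p(A_i)}(T)$
6. Commuting projection with binary operators
If C = A' ∪ B', where A' ⊆ A, B' ⊆ B, and A and B are the sets of attributes of relations R and S respectively, we have
$\Pi_C(R \times S) \Leftrightarrow \Pi_{A'}(R) \times \Pi_{B'}(S)$
$\Pi_C(R \bowtie_{p(A_i, B_j)} S) \Leftrightarrow \Pi_{A'}(R) \bowtie_{p(A_i, B_j)} \Pi_{B'}(S)$, provided that the join attributes Ai and Bj belong to A' and B'
$\Pi_C(R \cup S) \Leftrightarrow \Pi_C(R) \cup \Pi_C(S)$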
The rules above can be used to restructure the tree in a systematic way so that "bad" operator trees are eliminated. A simple restructuring algorithm uses a heuristic that applies the unary operations (selection and projection) as early as possible, in order to reduce the size of the intermediate relations.
Restructuring the tree in the figure above yields the tree in the following figure. The result is good in the sense that repeated access to the same relation is avoided and that the most selective operations are done first.
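The benefit of applying selections early can be seen on a toy example of rule 5. The sketch below is only an illustration: relations are lists of dictionaries, `select` and `join` are small helper functions written for this purpose, and the tuples in NV and PC are hypothetical sample data.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def join(r, s, attr):
    """Equijoin of r and s on a common attribute."""
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]


NV = [
    {"MaNV": "E1", "TenNV": "J. Doe"},
    {"MaNV": "E2", "TenNV": "M. Smith"},
    {"MaNV": "E3", "TenNV": "A. Lee"},
]
PC = [
    {"MaNV": "E1", "MaDA": "P1", "Thoigian": 12},
    {"MaNV": "E2", "MaDA": "P1", "Thoigian": 24},
    {"MaNV": "E2", "MaDA": "P2", "Thoigian": 6},
    {"MaNV": "E3", "MaDA": "P3", "Thoigian": 10},
]


def not_doe(t):
    return t["TenNV"] != "J. Doe"


# Selection applied after the join: the join sees all of NV.
late = select(join(NV, PC, "MaNV"), not_doe)
# Selection pushed below the join (rule 5): the join sees a smaller operand.
early = join(select(NV, not_doe), PC, "MaNV")

print(late == early)
print(len(join(NV, PC, "MaNV")), len(select(NV, not_doe)))
```

Both plans return the same tuples, but the second one joins a reduced NV, so the intermediate relation is smaller.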
[Figure: rewritten operator tree for Example 3.7. The selections TenDA = "CAD/CAM", Thoigian = 12 ∨ Thoigian = 24 and TenNV ≠ "J. Doe" are applied directly to DA, PC and NV; intermediate projections on (MaDA), (MaDA, MaNV), (MaNV, TenNV) and (MaDA, TenNV) follow; the relations are joined on MaNV and MaDA, with a final projection on TenNV.]
3.3. Localization of distributed data
The data localization layer is responsible for translating an algebraic query on global relations into an algebraic query on physical fragments. Localization uses the information stored in the fragmentation schema.
This layer determines which fragments are involved in the query and transforms the distributed query into a fragment query. Generating a fragment query is done in two steps. First, the distributed query is mapped into a fragment query by substituting each distributed relation with its reconstruction program. Second, the fragment query is simplified and restructured to produce another "good" query. Simplification and restructuring may be done according to the
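    WHERE MaNV = "E5"
[Figure: reduction for horizontal fragmentation with selection. (a) generic query: selection MaNV = "E5" over NV1 ∪ NV2 ∪ NV3; (b) reduced query: selection MaNV = "E5" over NV2]
Reduction for joins
- Joins on horizontally fragmented relations can be simplified when the joined relations are fragmented according to the join attribute.
- The simplification consists of distributing joins over unions and then eliminating the useless joins.
(r1 ∪ r2) ⋈ …
… $> \text{"E6"}}(NV)$
PC(MaNV, MaDA, NV, Tg) is fragmented as:
$PC_1 = \sigma_{MaNV \le \text{"E3"}}(PC)$
$PC_2 = \sigma_{MaNV > \text{"E3"}}(PC)$
Query:
    SELECT *
    FROM NV, PC
    WHERE NV.MaNV = PC.MaNV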
Reduction for vertical fragmentation
Vertical fragmentation distributes a relation based on projection attributes. The localization program for a vertically fragmented relation consists of the join of the fragments on their common attribute.
Example 3.10:
NV(MaNV, TenNV, CV)
$NV_1 = \Pi_{MaNV, TenNV}(NV)$
$NV_2 = \Pi_{MaNV, CV}(NV)$
Localization program: $NV = NV_1 \bowtie_{MaNV} NV_2$
Reduction for hybrid fragmentation
Goal: to efficiently support queries involving projection, selection and join.
Queries on hybrid fragments can be reduced by combining the rules used for primary horizontal fragmentation, vertical fragmentation and derived horizontal fragmentation.
Rules:
1/ Remove the empty relations generated by contradicting selections on horizontal fragments.
2/ Remove the useless relations generated by projections on vertical fragments.
3/ Distribute joins over unions in order to isolate and remove useless joins.
Example 3.13: NV(MaNV, TenNV, CV) is fragmented as:
$NV_1 = \sigma_{MaNV \le \text{"E4"}}(\Pi_{MaNV, TenNV}(NV))$
$NV_2 = \sigma_{MaNV > \text{"E4"}}(\Pi_{MaNV, TenNV}(NV))$
$NV_3 = \Pi_{MaNV, CV}(NV)$
Localization program: $NV = (NV_1 \cup NV_2) \bowtie_{MaNV} NV_3$
3.4. Optimization of distributed queries
This section introduces query optimization in general, regardless of whether the environment is distributed or centralized. The input query is assumed to be expressed in relational algebra on database relations (which may be fragments) after the query has been rewritten from a relational calculus expression.
Query optimization refers to the process of producing a query execution plan (QEP) that represents an execution strategy for the query. The selected plan should minimize a cost function. A query optimizer, the software module that performs query optimization, is usually seen as consisting of three components: a search space, a cost model, and a search strategy (see Figure 1.4.4).
The search space is the set of alternative execution plans that represent the input query. These plans are equivalent, in the sense that they yield the same result, but they differ in the execution order of the operations and in the way these operations are implemented, and therefore in performance. The search space is obtained by applying transformation rules, such as those for relational algebra described in the section on query rewriting.
The cost model predicts the cost of a given execution plan. To be accurate, the cost model must have the necessary knowledge about the distributed execution environment.
The search strategy explores the search space and selects the best plan according to the cost model. It defines which plans are examined and in which order. The details of the environment (centralized or distributed) are captured by the search space and the cost model.
3.4.1. Search space
Query execution plans are typically abstracted by means of operator trees, which define the order in which the operations are executed. They are enriched with additional information, such as the best algorithm chosen for each operation. For a given query, the search space can thus be defined as the set of equivalent operator trees that can be produced using the transformation rules. To characterize query optimizers, it is useful to concentrate on join trees, which are operator trees whose operators are join or Cartesian product. The reason is that permutations of the join order have the most important effect on the performance of relational queries.
[Figure: query optimization process. The input query goes to search space generation, which applies the transformation rules to produce equivalent QEPs; the search strategy then uses the cost model to select the best QEP.]
Example 3.14:
Consider the following query:
SELECT ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
The following figure illustrates three equivalent join trees for this query, obtained by exploiting the associativity of binary operators. Each of these trees can be assigned a cost based on the cost of each of its operators. Join tree (c), which starts with a Cartesian product, may have a much higher cost than the other two trees.
[Figure 1.4.4: the input query passes through search space generation (using transformation rules) to produce equivalent QEPs; the search strategy (using the cost model) then selects the best QEP.]
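[Figure: three equivalent join trees for the query of Example 3.14. (a) $(EMP \bowtie_{ENO} ASG) \bowtie_{PNO} PROJ$; (b) $EMP \bowtie_{ENO} (ASG \bowtie_{PNO} PROJ)$; (c) $(PROJ \times EMP) \bowtie_{ENO,PNO} ASG$.]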
For a complex query (one involving many relations and many operators), the number of equivalent operator trees can be very large. For example, the number of join trees that can be obtained by applying commutativity and associativity is O(N!) for N relations. Evaluating a large search space can make optimization take too long, sometimes longer than the actual execution of the query. Query optimizers therefore usually restrict the size of the search space they consider. The first restriction is to use heuristics. The most common heuristic is to perform selections and projections when accessing base relations. Another common heuristic is to avoid Cartesian products that are not required by the query itself. For example, in the figure above, operator tree (c) would not be part of the search space considered by the optimizer.
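The second heuristic can be illustrated with a minimal Python sketch. It enumerates all linear join orders for the three relations of Example 3.14 and discards those that would need a Cartesian product. The names `JOIN_EDGES`, `needs_cartesian_product` and `linear_orders_without_products` are illustrative and not part of any system discussed here.

```python
from itertools import permutations

# Join graph of Example 3.14: an edge means the query contains a join
# predicate between the two relations (EMP.ENO = ASG.ENO, ASG.PNO = PROJ.PNO).
JOIN_EDGES = {frozenset({"EMP", "ASG"}), frozenset({"ASG", "PROJ"})}


def needs_cartesian_product(order):
    """Return True if some step of a left-to-right join order has no join
    predicate linking the new relation to the relations already joined."""
    joined = {order[0]}
    for relation in order[1:]:
        if not any(frozenset({relation, r}) in JOIN_EDGES for r in joined):
            return True
        joined.add(relation)
    return False


def linear_orders_without_products(relations):
    """Enumerate all N! linear join orders, keeping those without products."""
    return [order for order in permutations(relations)
            if not needs_cartesian_product(order)]


orders = linear_orders_without_products(["EMP", "ASG", "PROJ"])
for order in orders:
    print(" JOIN ".join(order))
```

Of the six possible orders, the two that begin by combining EMP and PROJ are dropped, which corresponds to excluding tree (c).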
[Figure 9.3: (a) linear join tree, e.g. $((R_1 \bowtie R_2) \bowtie R_3) \bowtie R_4$; (b) bushy join tree, e.g. $(R_1 \bowtie R_2) \bowtie (R_3 \bowtie R_4)$.]
Another important restriction concerns the shape of the join tree. Two kinds of join trees are usually distinguished: linear trees and bushy trees (see Figure 9.3). A linear tree is a tree in which every operator node has at least one operand that is a base relation. A bushy tree is more general and may contain operators that have no base relation as an operand (that is, both operands are intermediate relations). If only linear trees are considered, the size of the search space is reduced to $O(2^N)$. In a distributed environment, however, bushy trees are very useful for parallel execution.
3.4.2. Search strategy
The search strategy most widely used by query optimizers is dynamic programming, which is deterministic. Deterministic strategies proceed by building plans, starting from the base relations and joining one more relation at each step until complete plans are obtained, as in Figure 9.4. Dynamic programming builds all possible plans breadth-first before it chooses the best one. To reduce the optimization cost, partial plans that are unlikely to lead to an optimal plan are pruned as soon as possible. By contrast, another deterministic strategy, the greedy algorithm, builds only one plan, depth-first.
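As a rough sketch of the dynamic programming idea, the following Python code builds linear plans for subsets of increasing size and keeps only the cheapest partial plan for each subset. The statistics in `CARD` and `SF_J` are hypothetical values chosen for illustration, and the cost measure (the sum of the estimated cardinalities of intermediate results) is an assumption made here, not the cost model of a particular optimizer.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical statistics, for illustration only.
CARD = {"EMP": 400, "ASG": 1000, "PROJ": 100}
# Hypothetical join selectivity factors for pairs that have a join predicate.
SF_J = {
    frozenset({"EMP", "ASG"}): Fraction(1, 400),
    frozenset({"ASG", "PROJ"}): Fraction(1, 100),
}


def result_card(relations):
    """Estimated cardinality of joining a set of relations: the product of
    their cardinalities times the selectivity of every applicable predicate."""
    card = Fraction(1)
    for r in relations:
        card *= CARD[r]
    for pair, sf in SF_J.items():
        if pair <= relations:
            card *= sf
    return card


def dp_linear_join_order(relations):
    """Build plans for subsets of size 1, 2, ..., N (breadth-first).
    Only the cheapest partial plan per subset survives; the rest are pruned."""
    best = {frozenset({r}): (Fraction(0), (r,)) for r in relations}
    for size in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, size)):
            candidates = []
            for last in subset:
                cost, order = best[subset - {last}]
                candidates.append((cost + result_card(subset), order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(relations)]


cost, order = dp_linear_join_order(["EMP", "ASG", "PROJ"])
print(order, cost)
```

With these numbers, any plan that starts with the product of EMP and PROJ carries a large intermediate result and is discarded in favor of a plan that joins through ASG first.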
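[Figure 9.4: plans built by a deterministic strategy in Step 1, Step 2 and Step 3, starting from base relations and adding one relation per step.]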
Dynamic programming is almost exhaustive and guarantees that the best of all plans is found. It incurs an acceptable optimization cost (in time and space) when the number of relations in the query is small. However, this approach becomes too expensive when the number of relations is greater than 5 or 6. For this reason, recent interest has focused on randomized strategies, which reduce the complexity of optimization but do not guarantee that the best plan is found. Unlike deterministic strategies, randomized strategies allow the optimizer to trade optimization time for execution time.
Randomized strategies concentrate on searching for the optimal solution around some particular points. They do not guarantee that the best solution is obtained, but they avoid the high cost of optimization in terms of memory and time consumption. First, one or more start plans are built by a greedy strategy. The algorithm then tries to improve the start plan by visiting its neighbors. A neighbor is obtained by applying a random transformation to a plan. A typical transformation consists of exchanging two randomly chosen operand relations of the plan, as shown in the corresponding figure. It has been shown experimentally that randomized strategies perform better than deterministic strategies when the query involves a fairly large number of relations.
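The neighbor search described above can be sketched as a simple improvement loop in Python. This is only an illustration under stated assumptions: the statistics in `CARD`, the single selectivity `SF_J`, the cost measure and the fixed start plan are all hypothetical, and the loop is a generic iterative improvement, not the algorithm of a specific system.

```python
import random
from fractions import Fraction

CARD = {"R1": 1000, "R2": 50, "R3": 2000, "R4": 10}  # hypothetical statistics
SF_J = Fraction(1, 100)  # one hypothetical selectivity used for every join


def plan_cost(order):
    """Cost of a linear plan: sum of estimated intermediate cardinalities."""
    cost, card = 0, CARD[order[0]]
    for relation in order[1:]:
        card = card * CARD[relation] * SF_J
        cost += card
    return cost


def random_neighbor(order, rng):
    """Neighbor plan: exchange two randomly chosen operand relations."""
    i, j = rng.sample(range(len(order)), 2)
    neighbor = list(order)
    neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
    return tuple(neighbor)


def iterative_improvement(start, steps=200, seed=0):
    """Move to a random neighbor whenever it is cheaper than the current plan."""
    rng = random.Random(seed)
    best, best_cost = tuple(start), plan_cost(start)
    for _ in range(steps):
        candidate = random_neighbor(best, rng)
        cost = plan_cost(candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


start = ("R3", "R1", "R2", "R4")  # stands in for a greedily built start plan
plan, cost = iterative_improvement(start)
print(plan, cost, "initial cost:", plan_cost(start))
```

The number of steps bounds the optimization time, which is how such a strategy trades optimization time against the quality of the resulting plan.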
3.4.3. Distributed cost model
An optimizer's cost model includes cost functions that predict the cost of operators, statistics and base data, and formulas for estimating the sizes of intermediate results.
Cost functions
The cost of a distributed execution strategy can be expressed with respect to either the total time or the response time. The total time is the sum of all time components (also called cost), while the response time is the time elapsed from the initiation of the query to its completion. A general formula for determining the total time is the following:
$$\mathit{Total\_time} = T_{CPU} \cdot \#insts + T_{I/O} \cdot \#I/Os + T_{MSG} \cdot \#msgs + T_{TR} \cdot \#bytes$$
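As a minimal sketch, the formula can be evaluated directly. The function name `total_time` and its parameter names are illustrative; the four coefficients stand for T_CPU, T_I/O, T_MSG and T_TR, and the numbers in the usage line are made-up values, not measurements.

```python
def total_time(n_insts, n_ios, n_msgs, n_bytes, t_cpu, t_io, t_msg, t_tr):
    """Total time = local processing time + communication time."""
    local_processing = t_cpu * n_insts + t_io * n_ios
    communication = t_msg * n_msgs + t_tr * n_bytes
    return local_processing + communication


# Made-up unit costs: 1 per instruction, 10 per disk I/O,
# 100 per message, 2 per byte transferred.
print(total_time(n_insts=10, n_ios=2, n_msgs=1, n_bytes=100,
                 t_cpu=1, t_io=10, t_msg=100, t_tr=2))
```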
The first two components measure the local processing time, where T_CPU is the time of one CPU instruction and T_I/O is the time of one disk I/O operation. The communication time is expressed by the last two components. T_MSG is the fixed time needed to initiate and receive a message, while T_TR is the time needed to transmit one unit of data from one site to another. The unit of data here is the byte (#bytes is the total size of …
[Figure: a randomized transformation that exchanges two operand relations, e.g. $(R_1 \bowtie R_2) \bowtie R_3$ becomes $(R_1 \bowtie R_3) \bowtie R_2$.]
1. For each attribute A_i, its length (in bytes), denoted length(A_i), and for each attribute A_i of each fragment R_j, the number of distinct values of A_i, that is, the cardinality of the projection of fragment R_j on A_i, denoted $card(\Pi_{A_i}(R_j))$.
2. For the domain of each attribute A_i that is defined on an ordered set of values (e.g., integers or reals), the maximum and minimum values, denoted max(A_i) and min(A_i).
3. For the domain of each attribute A_i, the cardinality of the domain, denoted card(dom[A_i]). This value gives the number of unique values in dom[A_i].
4. The number of tuples in each fragment R_j, denoted card(R_j).
Sometimes the statistics also include the join selectivity factor for some pairs of relations, that is, the proportion of tuples participating in the join. The join selectivity factor of relations R and S, denoted SF_J, is a real value between 0 and 1:
$$SF_J = \frac{card(R \bowtie S)}{card(R) \cdot card(S)}$$
For example, a join selectivity factor of 0.5 corresponds to an extremely large joined relation, while a factor of 0.001 corresponds to a fairly small one. We say that the join has poor selectivity in the former case and good selectivity in the latter.
These statistics are useful for predicting the size of intermediate relations. The size of an intermediate relation R is
$$size(R) = card(R) \cdot length(R)$$
where length(R) is the length (in bytes) of a tuple of R, computed from the lengths of its attributes. Estimating card(R), the number of tuples in R, requires the formulas given in the following section.
Cardinalities of intermediate results
Database statistics are useful for evaluating the cardinalities of intermediate results. Two simplifying assumptions are commonly made about the database. The distribution of attribute values in a relation is assumed to be uniform, and all attributes are assumed to be independent, meaning that the value of one attribute does not affect the value of any other attribute. These two assumptions often do not hold in practice, but they make the problem tractable. In what follows, we present the formulas for estimating the cardinalities of the results of the basic algebra operations (selection, projection, Cartesian product, join, semijoin, union, and difference). The operand relations are denoted R and S. The selectivity factor of an operation is denoted SF_OP, where OP denotes the operation.
Selection. The cardinality of a selection is:
$$card(\sigma_F(R)) = SF_S(F) \cdot card(R)$$
where $SF_S(F)$ depends on the selection formula and can be computed as follows, with $p(A_i)$ and $p(A_j)$ denoting predicates over attributes $A_i$ and $A_j$:
$$SF_S(A = value) = \frac{1}{card(\Pi_A(R))}$$
$$SF_S(A > value) = \frac{max(A) - value}{max(A) - min(A)}$$
$$SF_S(A < value) = \frac{value - min(A)}{max(A) - min(A)}$$
$$SF_S(p(A_i) \wedge p(A_j)) = SF_S(p(A_i)) \cdot SF_S(p(A_j))$$
$$SF_S(p(A_i) \vee p(A_j)) = SF_S(p(A_i)) + SF_S(p(A_j)) - SF_S(p(A_i)) \cdot SF_S(p(A_j))$$
$$SF_S(A \in \{values\}) = SF_S(A = value) \cdot card(\{values\})$$
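The selectivity formulas translate directly into small functions. The sketch below is illustrative only: the function names are hypothetical, and the statistics in the usage lines (a relation of 1000 tuples and an attribute ranging from 0 to 100) are made up.

```python
def sf_equals(distinct_values):
    """SF_S(A = value) = 1 / card(projection of R on A)."""
    return 1 / distinct_values


def sf_greater(value, min_a, max_a):
    """SF_S(A > value) = (max(A) - value) / (max(A) - min(A))."""
    return (max_a - value) / (max_a - min_a)


def sf_less(value, min_a, max_a):
    """SF_S(A < value) = (value - min(A)) / (max(A) - min(A))."""
    return (value - min_a) / (max_a - min_a)


def sf_and(sf_i, sf_j):
    """Conjunction of two predicates on independent attributes."""
    return sf_i * sf_j


def sf_or(sf_i, sf_j):
    """Disjunction: sum of the selectivities minus their product."""
    return sf_i + sf_j - sf_i * sf_j


def sf_in(distinct_values, n_values):
    """SF_S(A in {values}) = SF_S(A = value) * card({values})."""
    return sf_equals(distinct_values) * n_values


def card_selection(sf, card_r):
    """card(selection) = SF_S(F) * card(R)."""
    return sf * card_r


# Hypothetical statistics: card(R) = 1000, min(A) = 0, max(A) = 100.
print(card_selection(sf_greater(75, 0, 100), 1000))
```

These estimates rely on the uniformity and independence assumptions stated earlier, so they can be far from the true cardinalities on skewed data.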
Projection. A projection may or may not eliminate duplicate tuples. Here we consider projection with duplicate elimination. An arbitrary projection is difficult to estimate precisely because the correlations between the projected attributes are usually unknown. There are, however, two particularly useful cases in which the estimation is trivial. If the projection of relation R is based on a single attribute A, the cardinality is simply the number of tuples obtained when the projection is performed. If one of the projected attributes is a key of R, then
$$card(\Pi_A(R)) = card(R)$$
Cartesian product. The cardinality of the Cartesian product of relations R and S is
$$card(R \times S) = card(R) \cdot card(S)$$
Join. There is no general way to estimate the cardinality of a join without additional information. The upper bound of the join cardinality is the cardinality of the Cartesian product. Some systems, such as Distributed INGRES, use this upper bound, which is a rather pessimistic estimate. R* uses the quotient of this upper bound by a constant, reflecting the fact that the join result is smaller than the Cartesian product. There is, however, a case that occurs quite frequently and in which the estimation is simple. If R is equijoined with S over attribute A of R and attribute B of S, where A is a key of relation R and B is a foreign key of relation S, the cardinality of the result can be approximated as
$$card(R \bowtie_{A=B} S) = card(S)$$
because each tuple of S matches at most one tuple of R. The symmetric case, in which B is a key of S and A is a foreign key of R, gives card(R) by the same argument. This estimate is nevertheless an upper bound, since it assumes that every tuple of S participates in the join. For other important joins, it is worthwhile to maintain the join selectivity factor SF_J as part of the statistics. In that case the cardinality of the result is
$$card(R \bowtie S) = SF_J \cdot card(R) \cdot card(S)$$
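A short Python sketch of the three join estimates follows. The function names and the cardinalities used in the example (400 tuples on the key side, 1000 on the foreign-key side) are hypothetical.

```python
from fractions import Fraction


def card_join_upper_bound(card_r, card_s):
    """Pessimistic estimate: the cardinality of the Cartesian product."""
    return card_r * card_s


def card_join_key_foreign_key(card_foreign_key_side):
    """Key / foreign-key equijoin: each tuple on the foreign-key side
    matches at most one tuple on the key side."""
    return card_foreign_key_side


def card_join(sf_j, card_r, card_s):
    """General case with a maintained join selectivity factor SF_J."""
    return sf_j * card_r * card_s


# Hypothetical example: R has 400 tuples (A is its key) and S has 1000
# tuples (B is a foreign key referencing A).
card_r, card_s = 400, 1000
print(card_join_upper_bound(card_r, card_s))
print(card_join_key_foreign_key(card_s))
print(card_join(Fraction(1, 400), card_r, card_s))
```

In this example the key / foreign-key estimate and the general formula agree because the chosen selectivity factor equals 1 / card(R).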
Semijoin. The selectivity factor of the semijoin of R by S gives the fraction of tuples of R that join with tuples of S. An approximation of the semijoin selectivity factor is:
$$SF_{SJ}(R \ltimes_A S) = \frac{card(\Pi_A(S))}{card(dom[A])}$$
This formula depends only on attribute A of S. It is therefore often called the selectivity factor of attribute A of S, denoted $SF_{SJ}(S.A)$, and it is the selectivity factor of S.A on any attribute that can be joined with it. The cardinality of the semijoin is thus given by
$$card(R \ltimes_A S) = SF_{SJ}(S.A) \cdot card(R)$$
This approximation can be verified on a very common case, namely when R.A is a foreign key referencing S (S.A is a primary key). In this case the semijoin selectivity factor is 1, because $card(\Pi_A(S)) = card(dom[A])$, which shows that the cardinality of the semijoin is card(R).
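The semijoin estimate can be sketched in the same way. The function and parameter names are illustrative, and the figures in the example are made up.

```python
def sf_semijoin(distinct_a_in_s, card_dom_a):
    """SF_SJ(S.A) = card(projection of S on A) / card(dom[A])."""
    return distinct_a_in_s / card_dom_a


def card_semijoin(card_r, distinct_a_in_s, card_dom_a):
    """card(R semijoin S) = SF_SJ(S.A) * card(R)."""
    return sf_semijoin(distinct_a_in_s, card_dom_a) * card_r


# Hypothetical case: S contains 50 of the 200 possible values of A.
print(card_semijoin(card_r=1000, distinct_a_in_s=50, card_dom_a=200))

# Foreign-key case: S.A covers the whole domain, so the factor is 1
# and the estimate equals card(R).
print(card_semijoin(card_r=1000, distinct_a_in_s=200, card_dom_a=200))
```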
Union. It is very difficult to estimate the cardinality of the union of R and S because duplicate tuples are eliminated by the union. We give only the simple formulas for the upper and lower bounds, which are, respectively:
$$card(R) + card(S)$$
$$max\{card(R), card(S)\}$$
Note that these formulas assume that R and S do not contain duplicate tuples.
Difference. As for the union, we give only the upper and lower bounds. The upper bound of card(R - S) is card(R), and the lower bound is 0.
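The bounds for union and difference can be checked on small duplicate-free relations, represented here as Python sets. The function names and the sample data are illustrative assumptions.

```python
def card_union_bounds(card_r, card_s):
    """(lower, upper) bounds for card(R union S), assuming no duplicates."""
    return max(card_r, card_s), card_r + card_s


def card_difference_bounds(card_r):
    """(lower, upper) bounds for card(R - S)."""
    return 0, card_r


# Two small duplicate-free relations over a single attribute.
R = {1, 2, 3}
S = {3, 4}

low, high = card_union_bounds(len(R), len(S))
print(low, "<=", len(R | S), "<=", high)

low, high = card_difference_bounds(len(R))
print(low, "<=", len(R - S), "<=", high)
```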
3.4.4. Join ordering in fragment queries
As we have seen, ordering joins is an important part of centralized query optimization. Join ordering in a distributed context is even more important, because joins between fragments can increase the communication time. There are two basic approaches to ordering joins in fragment queries. One optimizes the join ordering directly, while the other replaces joins by combinations of semijoins in order to minimize communication cost.
Join ordering.
Some algorithms optimize the ordering of joins directly, without using semijoins. The algorithms of Distributed INGRES and System R* are representative of this class. The purpose of this section is to present the difficulties of join ordering and to motivate the next section, which uses semijoins to optimize join queries.
A number of assumptions are needed in order to concentrate on the main issues. Since the query has been localized and is expressed on fragments, we do not need to distinguish between fragments of the same relation and fragments stored at a particular site. To concentrate on join ordering, we ignore local processing time, assuming that reducers (selection, projection) are executed locally either before or during the join (recall that performing selections first is not always efficient). We therefore consider only join queries whose operand relations are stored at different sites. We assume that relations are transferred a set at a time rather than a tuple at a time. Finally, we ignore the transfer time needed to produce the data at the result site.
Let us first concentrate on a simpler problem, that of operand transfer in a single join. The query is $R \bowtie S$, where R and S are relations stored at different sites. The obvious choice of the relation to transfer is to send the smaller relation to the site of the larger one, which gives the two possibilities shown in Figure 7. To make this choice, we need to estimate the sizes of R and S.
We now consider the case in which more than two relations are joined. As in the case of a single join, the objective of the join ordering algorithm is to transmit small relations. The difficulty stems from the fact that joins may reduce or increase the size of intermediate relations. Estimating the size of join results is therefore mandatory, but also very difficult. One solution is to estimate the communication cost of all alternative strategies and choose the best one. However, the number of strategies grows rapidly with the number of relations. This approach, used in System R*, has a high optimization cost, although that cost is amortized quickly if the query is executed frequently.
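[Figure 7: transfer of operands in a single join. R is sent to the site of S if size(R) < size(S); S is sent to the site of R if size(R) > size(S).]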
Example 3.16:
Consider the following query expressed in relational algebra:
$$PROJ \bowtie_{PNO} ASG \bowtie_{ENO} EMP$$
whose join graph is shown in Figure 8. Note that we have made certain assumptions about the locations of the three relations. This query can be executed in at least five different ways. We describe these strategies by the following programs, where (R → site j) denotes that relation R is transferred to site j.
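[Figure 8: join graph of the query. EMP (site 1) is connected to ASG (site 2) by ENO, and ASG (site 2) is connected to PROJ (site 3) by PNO.]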
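1. EMP → site 2. Site 2 computes $EMP' = EMP \bowtie ASG$. EMP' → site 3. Site 3 computes $EMP' \bowtie PROJ$.
2. ASG → site 1. Site 1 computes $EMP' = EMP \bowtie ASG$. EMP' → site 3. Site 3 computes $EMP' \bowtie PROJ$.
3. ASG → site 3. Site 3 computes $ASG' = ASG \bowtie PROJ$. ASG' → site 1. Site 1 computes $ASG' \bowtie EMP$.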
4. PROJ → site 2. Site 2 computes $PROJ' = PROJ \bowtie ASG$. PROJ' → site 1. Site 1 computes $PROJ' \bowtie EMP$.
5. EMP → site 2. PROJ → site 2. Site 2 computes $EMP \bowtie PROJ \bowtie ASG$.
To choose one of these programs, we must know or be able to predict the following sizes: size(EMP), size(ASG), size(PROJ), $size(EMP \bowtie ASG)$ and $size(ASG \bowtie PROJ)$. Furthermore, if the response time is being considered, the optimization must take into account the fact that the data transfers can be done in parallel in strategy 5. An alternative to enumerating all the solutions is to use heuristics that consider only the sizes of the operand relations, by assuming, for example, that the cardinality of the resulting join is the product of the operand cardinalities. In this case, the relations are ordered by size, and the order of execution is given by this ordering and by the join graph. For example, the order (EMP, ASG, PROJ) could use strategy 1, while the order (PROJ, ASG, EMP) could use strategy 4.
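Choosing among the five programs by total transfer volume can be sketched as follows. The sizes in `sizes` are hypothetical byte counts, the name `strategy_costs` is illustrative, and the cost of a strategy is taken to be simply the number of bytes shipped between sites, in line with the assumptions stated above (local processing time and the transfer of the final result are ignored).

```python
def strategy_costs(size):
    """Bytes shipped between sites by each of the five strategies."""
    return {
        1: size["EMP"] + size["EMP_ASG"],    # EMP -> site 2, EMP' -> site 3
        2: size["ASG"] + size["EMP_ASG"],    # ASG -> site 1, EMP' -> site 3
        3: size["ASG"] + size["ASG_PROJ"],   # ASG -> site 3, ASG' -> site 1
        4: size["PROJ"] + size["ASG_PROJ"],  # PROJ -> site 2, PROJ' -> site 1
        5: size["EMP"] + size["PROJ"],       # EMP -> site 2, PROJ -> site 2
    }


# Hypothetical sizes in bytes; the join sizes would normally be estimated.
sizes = {"EMP": 40000, "ASG": 60000, "PROJ": 5000,
         "EMP_ASG": 80000, "ASG_PROJ": 70000}

costs = strategy_costs(sizes)
best = min(costs, key=costs.get)
print(costs, "cheapest strategy:", best)
```

With these particular numbers strategy 5 ships the least data. If response time rather than total time were the objective, its two transfers could also overlap, since they are independent.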
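Semijoin-based algorithms
In this section we show how semijoins can be used to reduce the total time of join queries. We make the same assumptions as in the preceding section on join ordering. The main shortcoming of the join approach described in the preceding section is that entire operand relations must be transferred between sites. For a relation, the semijoin acts as a size reducer, much as a selection does.
The join of two relations R and S over attribute A, stored at sites 1 and 2 respectively, can be computed by replacing one or both operand relations by a semijoin with the other relation, using rules such as the following:
$$R \bowtie_A S \Leftrightarrow (R \ltimes_A S) \bowtie_A S$$
Consider evaluating the join as $(R \ltimes_A S) \bowtie_A S$, assuming that size(R) < size(S). The following program uses the semijoin:
1. $\Pi_A(S)$ → site 1
2. Site 1 computes $R' = R \ltimes_A S$
3. $R'$ → site 2
4. Site 2 computes $R' \bowtie_A S$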
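The standard semijoin evaluation of a join can be simulated on in-memory relations. In the sketch below, relations are lists of dictionaries, and the helper names (`project`, `semijoin`, `join`, `semijoin_program`) as well as the sample tuples are hypothetical. The first two steps in the comments (shipping the projection of S on A to site 1 and reducing R there) are the usual way of computing a semijoin and are stated here as an assumption.

```python
def project(relation, attr):
    """Distinct values of one attribute (projection with duplicate removal)."""
    return {t[attr] for t in relation}


def semijoin(r, attr, values):
    """Tuples of r whose join attribute appears in the given set of values."""
    return [t for t in r if t[attr] in values]


def join(r, s, attr):
    """Equijoin of r and s on a common attribute."""
    return [{**tr, **ts} for tr in r for ts in s if tr[attr] == ts[attr]]


def semijoin_program(r_at_site1, s_at_site2, attr):
    a_values = project(s_at_site2, attr)              # 1. projection of S on A -> site 1
    r_reduced = semijoin(r_at_site1, attr, a_values)  # 2. site 1 computes R'
    result = join(r_reduced, s_at_site2, attr)        # 3. R' -> site 2; 4. site 2 joins
    shipped = {"values_of_A": len(a_values), "tuples_of_R_reduced": len(r_reduced)}
    return result, shipped


R = [{"A": 1, "x": "a"}, {"A": 2, "x": "b"}, {"A": 3, "x": "c"}]
S = [{"A": 2, "y": "p"}, {"A": 2, "y": "q"}, {"A": 4, "y": "r"}]

result, shipped = semijoin_program(R, S, "A")
print(result)
print(shipped)
```

Only one tuple of R is shipped to site 2 instead of three, at the price of first shipping two values of A in the other direction; whether that pays off depends on the relative sizes involved.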
For simplicity, let us ignore the constant T_MSG in the communication time, assuming that the term T_TR * size(R) is