Retrieval Multimedia Data from Disks
Presented by Yuni Xia
Fundamental characteristics:
• Real-time storage and retrieval
• Large data transfer rate and storage space requirements
Why choose magnetic disks?
• Storage capacity
• Speed
• Moderate cost / Random access / Writing
[Figure: side view of a disk drive showing the read/write head, the platters, and the spindle; top view of a platter showing tracks and sectors]
Suppose we wish to read sector si on track ti while the read head is currently over sector sj on track tj. Then:
Readtime = seek(ti, tj) + rotation(si, sj) + data / dtr
seek(ti, tj) = abs(ti - tj) * itd / rv
rotation(si, sj) = (abs(si - sj) / snum) / ss
Symbol   Meaning
tnum     total # of tracks
snum     total # of sectors
itd      intertrack distance
ss       spin speed
rv       radial velocity
dtr      data transfer rate
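The read-time model can be tried out numerically. The following is a minimal sketch, assuming that the seek distance is the track difference multiplied by the intertrack distance itd from the symbol table, that ss is given in rotations per second, and that data and dtr use the same unit of data. The function names and the sample numbers are illustrative and do not come from the slides.

```python
def seek_time(ti, tj, itd, rv):
    """Time for the head to travel radially from track tj to track ti."""
    return abs(ti - tj) * itd / rv


def rotation_time(si, sj, snum, ss):
    """Fraction of a rotation between sectors sj and si, divided by spin speed."""
    return (abs(si - sj) / snum) / ss


def read_time(ti, si, tj, sj, data, itd, rv, snum, ss, dtr):
    """Readtime = seek + rotation + transfer time (data / dtr)."""
    return (seek_time(ti, tj, itd, rv)
            + rotation_time(si, sj, snum, ss)
            + data / dtr)


# Illustrative values only: 50 tracks apart, 8 sectors apart, 4096 units of data.
example = read_time(ti=100, si=10, tj=50, sj=2, data=4096,
                    itd=0.01, rv=100.0, snum=64, ss=120.0, dtr=4096000.0)
print(f"read time: {example:.6f} s")
```

Seek and rotation dominate the total here, which is why the placement of blocks on the disk matters for real-time retrieval.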
Raid arrays and Placement methods
• Spreading data across several hard disks gives:
  faster performance, greater storage capacity, higher data security
• Six standard levels: 0-5 (plus cross-type variations, such as 0/1 and 3/5)
• Implemented in software or in hardware
RAID 0: Striped Disk Array without Fault Tolerance
RAID Level 0 requires a minimum of 2 drives to implement
[Figure: blocks striped across four disks]
Disk 1: A E I M
Disk 2: B F J N
Disk 3: C G K O
Disk 4: D H L etc.
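The striped layout in the RAID 0 figure corresponds to assigning consecutive blocks to disks in round-robin order. A minimal sketch, assuming four disks and single-letter block names as in the figure; the function name stripe_layout is hypothetical.

```python
def stripe_layout(blocks, num_disks):
    """Assign consecutive blocks to disks in round-robin order (RAID 0 striping)."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks


layout = stripe_layout("ABCDEFGHIJKLMNO", num_disks=4)
for number, contents in enumerate(layout, start=1):
    print(f"Disk {number}: {' '.join(contents)}")
```

Because neighbouring blocks land on different disks, a sequential read can draw on all disks at once, but the loss of any one disk loses part of every file.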
RAID 1: Mirroring and Duplexing
RAID Level 1 requires a minimum of 2 drives to implement
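[Figure: four mirrored disk pairs; each disk holds the same blocks as its mirror]
A B C D  =  A B C D
E F G H  =  E F G H
I J K L  =  I J K L
M N O P  =  M N O P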
RAID 5: Independent Data Disks with Distributed Parity Blocks
RAID Level 5 requires a minimum of 3 drives to implement
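[Figure: five disks, with the parity block of each stripe placed on a different disk]
Stripe   Disk A     Disk B     Disk C     Disk D     Disk E
0        A0         B0         C0         D0         0 parity
1        A1         B1         C1         1 parity   E1
2        A2         B2         2 parity   D2         E2
3        A3         3 parity   C3         D3         E3
4        4 parity   B4         C4         D4         E4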
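The RAID 5 figure places the parity block of stripe k on a different disk for each k. The slides do not say how parity is computed; the sketch below assumes the usual bytewise XOR, and the rotation rule in parity_disk is read off the figure (stripe 0 on the last disk, stripe 4 on the first). All names are hypothetical.

```python
from functools import reduce


def parity_disk(stripe, num_disks):
    """Index of the disk holding the parity block of a stripe (0 = first disk)."""
    return (num_disks - 1 - stripe) % num_disks


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks of one stripe (illustrative contents) and their parity.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00", b"\x0f\x0f"]
parity = xor_blocks(data)

# If one data block is lost, XOR of the survivors and the parity restores it.
survivors = data[:2] + data[3:]
recovered = xor_blocks(survivors + [parity])
print(recovered == data[2])
```

Spreading the parity blocks over all disks avoids making a single parity disk the bottleneck for every write.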
[Figure: a router connected to Server 1 ... Server n; Server 1 manages disks d1, d2, d3, ..., dm and Server n manages disks d1, d2, d3, ..., dn]
A model of heterogeneous disk servers
What needs to be modeled?
• The intrinsic characteristics of each disk server
• The intrinsic characteristics/capabilities of each client
• The relationship between the disk servers and clients
• The distribution of data across the disk servers
Disk Server Characteristics
1. dtr(i):
Total disk bandwidth of disk server Si
2. buf(i):
Total buffer space associated with server Si
3. switchtime(i, t):
Time required for server Si to switch between clients at time t
4. cyctime(i, t):
Duration of one cycle of read operations executed by server Si at time t
Client Characteristics
1. cons(i, t):
The consumption rate of client Ci at time t
2. data(i, t): the set of (m, b) pairs (block b of movie m) that client Ci reads at time t
Play: data(i, t) = {(m, b), (m, b+1), …}
FF: data(i, t) = {(m, b), (m, b+ffs), (m, b+2*ffs), …}
RW: data(i, t) = {(m, b), (m, b-rws), (m, b-2*rws), …}
Pause: data(i, t) = {(m, b)}
Client Characteristics
Compact form: data(i, t) = (m, b, len, step)
The client reads len blocks of movie m: b, (b+step), (b+2*step), …, (b+(len-1)*step)
1. Play: step =1
2. FF: step = ffs
3. RW: step = -rws
4. Pause: step = 0
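The compact (m, b, len, step) form can be expanded into the concrete block numbers for each playback mode. A minimal sketch; the function name and the sample values for ffs and rws are illustrative assumptions.

```python
def blocks_requested(b, length, step):
    """Expand (b, len, step) into the block numbers b, b+step, ..., b+(len-1)*step."""
    return [b + k * step for k in range(length)]


ffs, rws = 3, 2  # illustrative fast-forward and rewind steps

print(blocks_requested(10, 4, 1))      # play
print(blocks_requested(10, 4, ffs))    # fast forward
print(blocks_requested(10, 4, -rws))   # rewind
print(blocks_requested(10, 4, 0))      # pause: the same block repeated
```

With step = 0 the list repeats block b; viewed as a set this is the single pair (m, b), matching the pause case.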
Client-Server Characteristics
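1. timealloc(i, j, t):
• In any given cycle of disk server Si, each client Cj has a time slice, timealloc(i, j, t)
• cyctime(i, t) >= sum over j of timealloc(i, j, t) + (n(i, t) * switchtime(i, t)), where n(i, t) is the number of clients served by Si at time t
2. active(t):
The set of all clients that are active at time t
3. d_active(i, t):
The set of active clients served by server Si at time t, so that active(t) = Union over i of d_active(i, t)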
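The cycle-time condition says that the time slices of all clients plus the switching overhead must fit into one cycle of the server. A minimal sketch, assuming one switch per served client per cycle; the function name and the numbers are illustrative.

```python
def cycle_is_feasible(cyctime, timeallocs, switchtime):
    """Check cyctime >= sum(timealloc) + n * switchtime for one server."""
    n = len(timeallocs)  # number of clients served in this cycle
    return cyctime >= sum(timeallocs) + n * switchtime


# Three clients with time slices of 0.3, 0.3 and 0.2 in a cycle of length 1.0.
print(cycle_is_feasible(1.0, [0.3, 0.3, 0.2], switchtime=0.05))
print(cycle_is_feasible(1.0, [0.3, 0.3, 0.2], switchtime=0.10))
```

The second call fails because the switching overhead alone takes 0.3 of the cycle, leaving too little time for the slices.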
Client-Server Characteristics
4. Ut(i):
The set of servers that are handling the requests of client Ci
Ut(i) = { s | Ci ∈ d_active(s, t) }
5. bufreq(j, i, t):
The amount of buffer space required at server Si so that the data client Cj needs to read does not get overwritten
buf(i) >= sum over j of bufreq(j, i, t)
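The sets active(t) and Ut(i) can both be derived from the per-server sets d_active(s, t). A minimal sketch, assuming d_active is stored as a dictionary from server id to a set of client ids; this representation and all names are assumptions made for illustration.

```python
def active_clients(d_active):
    """active(t): the union of d_active(s, t) over all servers s."""
    return set().union(*d_active.values())


def servers_of(client, d_active):
    """Ut(client): the servers whose d_active set contains the client."""
    return {server for server, clients in d_active.items() if client in clients}


def buffer_ok(buf, bufreqs):
    """Check buf(i) >= sum of bufreq(j, i, t) over the clients of server i."""
    return buf >= sum(bufreqs)


d_active = {1: {"c1", "c2"}, 2: {"c2"}, 3: set()}
print(sorted(active_clients(d_active)))
print(sorted(servers_of("c2", d_active)))
print(buffer_ok(64, [18, 32]))
```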
Distribution of Data
M(mi, b): placement mapping
• M(mi, b) is the set of all servers that contain block b of movie mi
• Example: M(“Sound of Music”, 20) = {2, 4, 5}
Placement constraint
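If data(C, t) = (m, b, len, step), then for every i with 0 <= i < len there must exist a server j ∈ Ut(C) such that j ∈ M(m, b + (i * step))
Suppose: data(C, t) = (M, 5, 5, 3) and Ut(C) = {1, 3, 4}
Then each of the blocks {5, 8, 11, 14, 17} of M must be stored on at least one of the servers {S1, S3, S4}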
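The placement constraint can be checked mechanically: every requested block must be held by at least one server that is currently serving the client. A minimal sketch, assuming the placement mapping is a dictionary from (movie, block) to a set of server ids; the sample placement is made up for illustration and only the request (M, 5, 5, 3) and the server set {1, 3, 4} come from the example above.

```python
def placement_ok(placement, movie, b, length, step, servers):
    """True if each block b + i*step (0 <= i < length) is on some server in `servers`."""
    return all(
        placement.get((movie, b + i * step), set()) & servers
        for i in range(length)
    )


# Hypothetical placement of blocks 5, 8, 11, 14, 17 of movie "M".
placement = {
    ("M", 5): {1, 2},
    ("M", 8): {3},
    ("M", 11): {4, 5},
    ("M", 14): {1},
    ("M", 17): {2, 3},
}

print(placement_ok(placement, "M", 5, 5, 3, servers={1, 3, 4}))
print(placement_ok(placement, "M", 5, 5, 3, servers={1, 4}))
```

The second call fails because block 8 is stored only on server 3, which is not among the servers assigned to the client.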
Definition: State of an MOD System S(t)
1. active(t)
2. cyctime(i, t)
3. cons(i, t)
4. timealloc(i, j, t)
5. data(i, t)
6. Ut
Disk availability constraint
1. Consumption rate constraint:
sum(cons(j, t)) + switchtime(i, t) * dtr(i) / cyctime(i, t) <= dtr(i)
2. Buffer requirement constraint:
sum(bufreq(j, i, t)) <= buf(i)
where
timealloc(i, j, t) = cyctime(i, t) * cons(j, t) / dtr(i)
bufreq(j, i, t) = (dtr(i) - cons(j, t)) * timealloc(i, j, t)
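The formulas for timealloc and bufreq can be evaluated per client and then summed to test the buffer requirement constraint. A minimal sketch for a single server at a single time instant; the function names and the numbers are illustrative assumptions, with rates in data units per second and times in seconds.

```python
def timealloc(cyctime, cons, dtr):
    """Time slice of a client: cyctime * cons / dtr."""
    return cyctime * cons / dtr


def bufreq(dtr, cons, alloc):
    """Buffer needed for a client: (dtr - cons) * timealloc."""
    return (dtr - cons) * alloc


def total_bufreq(dtr, cyctime, cons_rates):
    """Sum of bufreq over all clients of one server."""
    return sum(bufreq(dtr, c, timealloc(cyctime, c, dtr)) for c in cons_rates)


dtr, cyctime = 100.0, 2.0
cons_rates = [10.0, 20.0]   # two clients
needed = total_bufreq(dtr, cyctime, cons_rates)
print(f"buffer needed: {needed:.1f}")
print("fits in buf = 64:", needed <= 64.0)
```

The buffer term grows with the gap between the disk rate and the consumption rate, since data arrives in the buffer faster than the client drains it during the time slice.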
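[Figure: a router in front of three servers. Server 1 holds blocks 1-150, Server 2 holds blocks 151-250, Server 3 holds blocks 200-300. Request (mi, 140, 2, 5) reads blocks (140, 145); request (mi, 199, 2, 1) reads blocks (199, 200). The figure also shows the block pairs (150, 155) and (201, 202).]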
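One way to read the figure is to ask, for each pair of blocks, which single servers hold both blocks. The sketch below uses the block ranges from the figure (Server 1: 1-150, Server 2: 151-250, Server 3: 200-300); the interpretation that an empty result means the request has to be split across servers is an assumption, and the names are hypothetical.

```python
# Block ranges copied from the figure; servers 2 and 3 overlap on blocks 200-250.
SERVER_BLOCKS = {1: range(1, 151), 2: range(151, 251), 3: range(200, 301)}


def servers_holding(block, server_blocks=SERVER_BLOCKS):
    """Servers whose block range contains the given block."""
    return {s for s, blocks in server_blocks.items() if block in blocks}


def common_servers(blocks, server_blocks=SERVER_BLOCKS):
    """Servers that hold every block in the list (empty set: no single server)."""
    candidates = set(server_blocks)
    for block in blocks:
        candidates &= servers_holding(block, server_blocks)
    return candidates


for pair in [(140, 145), (199, 200), (150, 155), (201, 202)]:
    print(pair, sorted(common_servers(pair)))
```

Under this reading, (150, 155) is the case where no single server holds both blocks, while (201, 202) can be served by either Server 2 or Server 3.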
Trans   Transaction type                       Priority
tr1     exiting client                         5
tr2     continuing client - normal             4
tr3     continuing client - needs switching    3
tr4     continuing client - needs splitting    3
tr5     new client                             2
tr6     new client - needs splitting           1
An event-based algorithm: QuickSOL
• FindSOL
• OptimizeSOL
FindSOL Phase:
1. Split EV(t) into 6 sets:
new(t), exit(t), cont(t), pause(t), ff(t), rew(t)
2. (Handle exiting clients)
For each client Ci in exit(t) do
1) free the resources
2) delete Ci from the state table
3. (Handle continuing clients)
For each client Ci in cont(t) or ff(t) or rew(t) do
If the servers currently assigned to Ci satisfy .. then
   modify the state table
else
   1) reset Ci's priority to 3
   2) move it into new(t)
   3) update the resource table
4. (Handle new clients)
For each client Ci in new(t) do
1) Identify the servers that have the data required by Ci
2) Determine which servers have enough bandwidth ..
• If no such server is available, split the event into 2 sub-events:
  data(C, t) = (m, s, l/2, 2*step) and data(C, t) = (m, s+step, l/2, 2*step)
• Keep splitting till for both sub-events ...
• Update the state table
3) Do the same as 2) in terms of buffer requirement
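The splitting step in FindSOL divides a request (m, s, l, step) into two interleaved sub-requests with doubled step. A minimal sketch; the slides write l/2 for both halves, and rounding up for the first half and down for the second when l is odd is an assumption made here so that no block is lost. The function names are hypothetical.

```python
def blocks_of(event):
    """Block numbers read by an event (m, s, length, step)."""
    _, s, length, step = event
    return [s + k * step for k in range(length)]


def split_event(event):
    """Split (m, s, l, step) into (m, s, l/2, 2*step) and (m, s+step, l/2, 2*step)."""
    m, s, length, step = event
    first_len = (length + 1) // 2   # assumption: first half rounds up for odd l
    second_len = length // 2
    return (m, s, first_len, 2 * step), (m, s + step, second_len, 2 * step)


event = ("m", 5, 4, 3)              # reads blocks 5, 8, 11, 14
first, second = split_event(event)
print(first, blocks_of(first))
print(second, blocks_of(second))
```

Together the two sub-events cover exactly the blocks of the original event, so each half can be assigned to a different server.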
OptimizeSOL phase:
1. Switching
2. Splitting
Balancing the load, maximizing the # of clients ...