Sorting and searching in the presence of memory faults (without redundancy)
Irene Finocchi, Giuseppe F. Italiano
DISP, University of Rome “Tor Vergata”
{finocchi,italiano}@disp.uniroma2.it
The problem
• Large, inexpensive and error-prone memories
• Classical algorithms may not be correct in the presence of (even very few) memory faults
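An example: merging two ordered lists
A = 1 2 3 4 5 6 7 8 9 10
B = 11 12 13 14 15 16 17 18 19 20
[Figure: a single fault turns one key of A into 80. In the output of a standard merge, two blocks of Θ(n) correct keys each end up in the wrong relative order, giving Θ(n^2) inversions.]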
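The effect of a single fault on a classical merge can be reproduced with a short sketch. It assumes, purely for illustration, that the fault hits the first key of A before the merge starts; the slide does not say which key is corrupted, and the helper names `merge` and `count_inversions` are hypothetical.

```python
def merge(a, b):
    """Classical two-way merge; correct only if both inputs are sorted."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def count_inversions(seq):
    """Number of pairs i < j with seq[i] > seq[j] (quadratic scan, fine for a demo)."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])


A = list(range(1, 11))    # 1..10
B = list(range(11, 21))   # 11..20
A[0] = 80                 # assumed fault: the key 1 is overwritten with 80

out = merge(A, B)
# Ignore the corrupted key and look only at the correct ones.
correct_keys = [x for x in out if x != 80]
print(out)
print(count_inversions(correct_keys))
```

Because the merge trusts the corrupted head of A, it emits all of B before any other key of A, so every correct key of A that remains is inverted with every key of B.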
Faulty-memory model
• Memory fault = the correct value stored in a memory location gets altered (destructive faults)
• Fault appearance:
– At any time
– At any memory location
– Simultaneously
• Faulty Random Access Machine:
– O(1) words of reliable memory
– Corrupted values indistinguishable from correct ones
• Fault-tolerant algorithms = able to get a correct output (at least) on the set of uncorrupted values
Related work
The liar model: comparison questions answered by a possibly lying adversary [Ulam 77, Renyi 76]
Lies correspond to transient failures: algorithms can exploit query replication strategies
• At most k lies:
– Ω(n log n + k n) [Lakshmanan et al., IEEE TOC 91]
– O(n log n) for k = O(log n / log log n) [Ravikumar, COCOON 02]
• Linearly bounded model:
– Exponential lower bound [Borgstrom & Kosaraju, STOC 93]
• Probabilistic model:
– Θ(n log (n/q)), correct with probability (1-q) [Feige et al., SICOMP 94]
Why not data replication?
Data replication can be quite inefficient in certain highly dynamic scenarios, especially if objects to be replicated are large and complex
What can we do without data replication?
E.g., with respect to sorting:
Q1. Can we sort the correct values in the presence of, e.g., polynomially many memory faults?
Q2. How many faults can we tolerate in the worst case if we wish to maintain optimal time and space?
A fault tolerant algorithm
Q1. Can we sort (at least) the correct values in O(n log n) time and optimal space in the presence of, e.g., polynomially many memory faults?
We show an algorithm resilient to up to O((n log n)^{1/3}) memory faults
• Based on mergesort
• Main difficulty: merging step
A hierarchy of disorder
[Figure: a hierarchy of sequences (k-unordered, faithfully ordered, ordered) and of algorithms (k-weakly fault tolerant, strongly fault tolerant)]
faithfully ordered = ordered except for the corrupted keys
Example: 1 2 3 4 5 6 7 8 9 10 80
k-unordered = ordered except for k (correct or corrupted) keys
Example (3-unordered): 1 2 3 4 9 5 7 8 6 10 80
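The two definitions can be made concrete with a small checker. This is a sketch for analysis only: in the faulty-memory model an algorithm cannot tell corrupted keys from correct ones, so the set of corrupted positions passed to `is_faithfully_ordered` is an assumption available to an outside observer, not to the algorithm. Both function names are hypothetical. `min_unordered_k` returns the smallest k for which a sequence is k-unordered, computed as the length minus the longest non-decreasing subsequence.

```python
from bisect import bisect_right


def is_faithfully_ordered(seq, corrupted):
    """True if the keys at positions NOT in `corrupted` are in non-decreasing order."""
    correct = [x for i, x in enumerate(seq) if i not in corrupted]
    return all(x <= y for x, y in zip(correct, correct[1:]))


def min_unordered_k(seq):
    """Smallest k such that removing k keys leaves `seq` non-decreasing."""
    tails = []  # tails[p] = smallest last key of a non-decreasing subsequence of length p + 1
    for x in seq:
        pos = bisect_right(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(seq) - len(tails)


# A sequence is k-unordered for every k >= min_unordered_k(seq).
print(is_faithfully_ordered([1, 2, 80, 4, 5], corrupted={2}))
print(min_unordered_k([1, 2, 3, 4, 9, 5, 7, 8, 6, 10]))
```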
Solving a relaxation
Idea of our merging algorithm: solve the k-weakly FT merging problem (for k not too large) and use it to reconstruct a faithful order
The merging algorithm: a big picture
• k-weaklyFT-merge: A, B → C. Very fast. C is k-unordered, but k is not so large
• purify: C → S, D. Very fast. S is faithfully ordered and long; D is disordered and short
• naïf-mergesort: D → E. Slow, but D is short... E is faithfully ordered and short
• stronglyFT-merge: S, E → F. Slow in general, fast on unbalanced sequences. F is faithfully ordered
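The merging algorithm: a big picture (with bounds)
[Figure: the same pipeline annotated with bounds. C is O(·)-unordered; |D| = O(·); the individual steps cost O(n), O(n + ·) or O(·), where · stands for fault-dependent terms that are not legible in this transcript. S is faithfully ordered and long; E is faithfully ordered and short; F is faithfully ordered. Total running time: O(n + ·). The resulting merge is strongly fault tolerant.]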
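The role of the purify step (split a nearly sorted sequence into a long ordered part S and a short discarded part D) can be illustrated with a simple stack-based routine. This is not the authors' procedure, whose details are not given in the slides: it is a fault-free sketch of one standard way to do such a split, and the name `purify_sketch` is hypothetical. Whenever the next key is smaller than the last kept key, both are discarded; each discarded pair is an inversion, so if removing k keys would make the input sorted, at most 2k keys end up in D.

```python
def purify_sketch(c):
    """Split `c` into (s, d): s is non-decreasing, d holds the discarded keys.

    Fault-free illustration only: it assumes keys do not change while it runs.
    """
    s, d = [], []
    for x in c:
        if s and x < s[-1]:
            # (s[-1], x) is an inversion: drop both keys.
            d.append(s.pop())
            d.append(x)
        else:
            s.append(x)
    return s, d


s, d = purify_sketch([1, 2, 3, 4, 9, 5, 7, 8, 6, 10])
print(s)  # kept keys, in order
print(d)  # discarded keys, to be sorted separately and merged back
```

The discarded part may contain correct keys, which is why the pipeline still has to sort D and merge the result back into S.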
Summing up
By plugging the merging algorithm into mergesort, we can sort in time O(n log n + ) and thus:
We obtain an O(n log n) strongly fault tolerant sorting algorithm that is resilient to up to O((n log n)^{1/3}) memory faults and uses O(n) space
A (polynomial) lower bound
Q2. How many faults can we tolerate in the worst case while keeping space and running time optimal?
No more than O((n log n)^{1/2})
To prove this, we first prove a lower bound on fault tolerant merging (δ denotes the maximum number of memory faults):
If δ ≤ n^{2/(3-2ε)}, for some ε ∈ [0,1/2], then Ω(n + δ^{2-ε}) comparisons are necessary for merging two faithfully ordered n-length lists
We use an adversary-based argument
Adversary-based argument: big picture
Paul (the merging algorithm) asks comparison questions of the form “x<y?”
Carole (the adversary) must answer consistently
[Figure label: Delphi’s Oracle]
Carole’s power: she doesn’t need to choose the sequences in advance and can play with memory faults
Carole’s limits: if challenged at any time, she must exhibit two input sequences and at most δ memory faults that prove that her answers were consistent with the memory image
If Paul asks fewer than δ^{2-ε}/2 questions, then he cannot determine the correct faithful order uniquely
Carole’s strategy
A and B: n-length faithfully ordered sequences, each split into δ consecutive blocks of n/δ keys:
A = A_1 A_2 … A_{δ-1} A_δ
B = B_1 B_2 … B_{δ-1} B_δ
Carole answers as if the sorted sequence were:
A_1 B_1 A_2 B_2 … B_{δ-1} A_δ B_δ
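A concrete pair of inputs with this block-interleaved order can be generated as follows. The sketch assumes that the number of blocks equals the fault bound (called `delta` here) and that `delta` divides n; it fixes the sequences up front, so it only illustrates the target order and not Carole's adaptive behaviour. The function name `carole_sequences` is hypothetical.

```python
def carole_sequences(n, delta):
    """Return sorted lists A and B of length n whose merged order is
    A_1 B_1 A_2 B_2 ... A_delta B_delta, each block holding n // delta keys."""
    if delta <= 0 or n % delta != 0:
        raise ValueError("delta must be positive and divide n")
    m = n // delta  # block length
    a, b = [], []
    for t in range(2 * delta):
        block = list(range(t * m, (t + 1) * m))
        # Even-numbered blocks go to A, odd-numbered blocks go to B.
        (a if t % 2 == 0 else b).extend(block)
    return a, b


A_seq, B_seq = carole_sequences(6, 3)
print(A_seq)
print(B_seq)
```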
Sparse sets
A set S of consecutive subsequences is sparse if the number of comparisons between elements in S is at most δ/2
If δ ≤ n^{2/(3-2ε)}, for some ε ∈ [0,1/2], and Paul asks fewer than δ^{2-ε}/2 questions, then there exists a sparse set S containing two elements a ∈ A_i ∩ S and b ∈ B_j ∩ S that have not been directly compared by Paul
We prove that both the order a < b and the order b < a can be consistent with Carole’s answers
[Figure: the sequence A_1 B_1 A_2 B_2 A_3 B_3 … B_{δ-1} A_δ B_δ with a sparse set S highlighted]
Paul’s dilemma: a < b or b < a ?
a and b are not directly compared: but why can’t Paul deduce their order by transitivity?
“x<y?” is a possibly “useful” comparison if both x and y ∈ S
[Figure: a, x, y and b linked by a chain of ≤ relations]
For each such question asked by Paul, Carole asserts that she answered after corrupting x (if x ≠ a,b) and y (if y ≠ a,b)
All possibly useful elements are corrupted!
How many memory faults are introduced by Carole?
– 1 or 2 faults, only if x and y are both in S
– S is sparse (at most δ/2 comparisons)
– at most 2(δ/2) = δ faults
Implications
If δ ≤ n^{2/(3-2ε)}, for some ε ∈ [0,1/2], then Paul must ask at least δ^{2-ε}/2 questions to merge A and B
Ω(n + δ^{2-ε}) comparisons for merging
Ω(n log n + δ^{2-ε}) comparisons for sorting
What if δ > n^{2/(3-2ε)}?
What if δ is large?
If δ ≤ n^{2/(3-2ε)}, Ω(n log n + δ^{2-ε}) comparisons for sorting
• ε = 0: δ between (n log n)^{1/2} and n^{2/3}: Ω(n log n + δ^2)
• ε = 1/6: δ between (n log n)^{6/11} and n^{3/4}: Ω(n log n + δ^{11/6})
• ε = 1/2: δ between (n log n)^{2/3} and n: Ω(n log n + δ^{3/2})
An O(n log n) strongly fault tolerant sorting algorithm cannot be resilient to ω((n log n)^{1/2}) memory faults
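The exponents on this slide follow from the merging bound by plain arithmetic. The sketch below assumes the condition reads δ ≤ n^{2/(3-2ε)} and the sorting bound reads Ω(n log n + δ^{2-ε}), with δ the number of faults; for a given ε it returns the exponent c at which δ = (n log n)^c makes the δ term match n log n, the largest admissible exponent of n, and the exponent of δ in the bound. The function name `thresholds` is hypothetical.

```python
from fractions import Fraction


def thresholds(eps):
    """Exponents implied by the lower bound for a given eps in [0, 1/2].

    Returns (crit, max_exp, bound_exp) where
      bound_exp = 2 - eps              exponent of delta in the lower bound,
      max_exp   = 2 / (3 - 2 * eps)    the bound applies for delta <= n**max_exp,
      crit      = 1 / bound_exp        delta = (n log n)**crit gives delta**bound_exp = n log n.
    """
    eps = Fraction(eps)
    if not 0 <= eps <= Fraction(1, 2):
        raise ValueError("eps must lie in [0, 1/2]")
    bound_exp = 2 - eps
    max_exp = 2 / (3 - 2 * eps)
    crit = 1 / bound_exp
    return crit, max_exp, bound_exp


for eps in (Fraction(0), Fraction(1, 6), Fraction(1, 2)):
    print(eps, thresholds(eps))
```

Any δ above (n log n)^{1/(2-ε)} pushes the bound past n log n, and the smallest such threshold, (n log n)^{1/2}, is reached at ε = 0.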
Open questions
• Can randomization help (e.g., to tolerate more than (n log n)^{1/2} memory faults)?
• Closing the gap:
– our algorithm is resilient to O((n log n)^{1/3}) memory faults
– no optimal algorithm can be resilient to ω((n log n)^{1/2}) memory faults
• External fault-tolerant sorting