![Page 1: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/1.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Computing exactly with unsaferesources: fault tolerant exact linear
algebra and cloud computing
Majid KHONJI and Clément PERNET,joint work with
Jean-Louis ROCH and Thomas STALINSKY
INRIA-MOAIS, Grenoble Univ. France
SAGE Days 16,June 23, 2009
![Page 2: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/2.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 3: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/3.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 4: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/4.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
High performance mathematics computing
Numerous application areasNumber theory: I computing tables of elliptic curves,
modular forms,I testing of conjectures,
Crypto: I Algebraic attacksI Search for big primes
Graph theory: testing conjectures (graph isomorph.,...)Your research pb here! ....
![Page 5: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/5.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallel computations(Re-)birth of parallel computing
Personal computers ⇒multi/many cores
I End of the race for CPU clock-frequencyI Multi-core: 2, 4, 6, ...I Many-cores: near future, already in GPU’sI multi-GPU
HPC ⇒Global computing
I serversI grids, clusters,I volunteer computing,
Peer to PeerI cloud computing
INTERNET
user
⇒Shift towards massively parallel computations
![Page 6: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/6.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallel computations(Re-)birth of parallel computing
Personal computers ⇒multi/many cores
I End of the race for CPU clock-frequencyI Multi-core: 2, 4, 6, ...I Many-cores: near future, already in GPU’sI multi-GPU
HPC ⇒Global computing
I serversI grids, clusters,I volunteer computing,
Peer to PeerI cloud computing
INTERNET
user
⇒Shift towards massively parallel computations
![Page 7: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/7.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Volunteer and Peer to peer computing
Well known successful projects:I Mersenne Prime searchI SETI@HomeI Folding@Home/BOINC :⇒1 Petaflops (670 000 PS3s) in 2007
I ...⇒What level of trust?
![Page 8: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/8.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Cloud Computing[Above the clouds, a Berkeley view on Cloud Computing, Feb 09]Difficult agreement on the definitionBased on
I Software As A ServiceI Computing centers providing ressources in a pay as
you go fashion
The interesting thing about Cloud Computing isthat we’ve redefined Cloud Computing to includeeverything that we already do... I don’tunderstand what we would do differently in thelight of Cloud Computing other than change thewording of some of our ads.
Larry Ellison
It’s stupidity, It’s worse than stupidity: it’s amarketing hype campaign.
R. Stallman
![Page 9: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/9.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Cloud Computing[Above the clouds, a Berkeley view on Cloud Computing, Feb 09]Difficult agreement on the definitionBased on
I Software As A ServiceI Computing centers providing ressources in a pay as
you go fashion
The interesting thing about Cloud Computing isthat we’ve redefined Cloud Computing to includeeverything that we already do... I don’tunderstand what we would do differently in thelight of Cloud Computing other than change thewording of some of our ads.
Larry Ellison
It’s stupidity, It’s worse than stupidity: it’s amarketing hype campaign.
R. Stallman
![Page 10: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/10.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Cloud Computing
Interest:I illusion of infinite computing ressourcesI no up-front commitment by cloud usersI processor by the hour, storage by the day
⇒for massively parallel computations:New cost associativity :
100’s of computers for 1h ≡ 1 computer for 100h
⇒What level of trust
![Page 11: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/11.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Cloud Computing
Interest:I illusion of infinite computing ressourcesI no up-front commitment by cloud usersI processor by the hour, storage by the day⇒for massively parallel computations:
New cost associativity :
100’s of computers for 1h ≡ 1 computer for 100h
⇒What level of trust
![Page 12: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/12.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Fault tolerance
Need to deal with faults of different types
I Fail stop (crash failures)I Network congestionI Malicious attack
⇒Model of the Byzantine failure (not consistently failing)
![Page 13: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/13.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Global computing
I Grid and P2P: Transparent allocation of the resources toauthenticated usersscheduling supports ressource connections / disconnections
INTERNET
f2f3
user
s1s1
Parallel Application
f5f4
f3f2
s2
f1
f5
s2
f1 f4
I Yet a task can be forged⇐⇒ f ( input ) 6= f̂ ( input )
I forgery can affect many tasks [e.g. patched client in SETI@home]
![Page 14: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/14.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Global computing and task forgery
I Grid and P2P: Transparent allocation of the resources toauthenticated usersscheduling supports ressource connections / disconnections
INTERNET
f2f3
user
s1s1
Parallel Application
f5f4
f3f2
s2
f1
f5
s2
hacker
f1 f4
modified result
I Yet a task can be forged⇐⇒ f ( input ) 6= f̂ ( input )
I forgery can affect many tasks [e.g. patched client in SETI@home]
![Page 15: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/15.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
State-of-the-Art in Result CertificationI Mainly target programs P composed of independent tasks
I General approach: Replication-basedI Voting [e.g. BOINC, SETI@home]I Spot-checking [Germain-Playez03, based on Wald
test]I checkpoint and restart⇒30’ to 1h on large clusters!
I Blacklisting, Credibility-based fault-tolerance[Sarmenta03]
I Partial execution on reliable resources[Gao-Malewicz04]
I Specific approach: check post-condition on the resultsI Eg: Sorting O(n log n) – Simple-Checker O(n)
[Blum97]I The most efficient approach when possible!
Yet, no guarantee of result correction without hypothesison the attack
![Page 16: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/16.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
State-of-the-Art in Result CertificationI Mainly target programs P composed of independent tasks
I General approach: Replication-basedI Voting [e.g. BOINC, SETI@home]I Spot-checking [Germain-Playez03, based on Wald
test]I checkpoint and restart⇒30’ to 1h on large clusters!
I Blacklisting, Credibility-based fault-tolerance[Sarmenta03]
I Partial execution on reliable resources[Gao-Malewicz04]
I Specific approach: check post-condition on the resultsI Eg: Sorting O(n log n) – Simple-Checker O(n)
[Blum97]I The most efficient approach when possible!
Yet, no guarantee of result correction without hypothesison the attack
![Page 17: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/17.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Approach: Massive versus localized attacks
I In practice:I DAG of tasks, highly parallelI for most executions, only few or none forged tasks!↪→ full replication useless and too expensive
I Yet, no blind trust:I large scale falsifications are possible ↪→ can be
detected by trustable verification of randomlyselected tasks:Extended Monte-Carlo Certification EMCT
[Krings&al 06]I few falsifications are possible ↪→ can be efficiently
corrected by Algorithm-based fault tolerance (ABFT)[Beckman 93, Plank&al 97, Saha 2006]
![Page 18: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/18.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Approach: Massive versus localized attacks
I In practice:I DAG of tasks, highly parallelI for most executions, only few or none forged tasks!↪→ full replication useless and too expensive
I Yet, no blind trust:I large scale falsifications are possible ↪→ can be
detected by trustable verification of randomlyselected tasks:Extended Monte-Carlo Certification EMCT
[Krings&al 06]I few falsifications are possible ↪→ can be efficiently
corrected by Algorithm-based fault tolerance (ABFT)[Beckman 93, Plank&al 97, Saha 2006]
![Page 19: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/19.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Algorithmic Based Fault Tolerance
Idea: incorporate the redundancy in the algorithm⇒use some specific properties of the
problem
Example: Matrix vector product [Dongarra 2006]
u uA
A
I precompute the product B = A×[I R
]I compute x = uB in parallelI decode/correct x
![Page 20: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/20.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Algorithmic Based Fault Tolerance
Idea: incorporate the redundancy in the algorithm⇒use some specific properties of the
problem
Example: Matrix vector product [Dongarra 2006]
u uA
A
I precompute the product B = A×[I R
]I compute x = uB in parallelI decode/correct x
![Page 21: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/21.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Algorithmic Based Fault Tolerance
Idea: incorporate the redundancy in the algorithm⇒use some specific properties of the
problem
Example: Matrix vector product [Dongarra 2006]
u uA
A Id R
x
I precompute the product B = A×[I R
]I compute x = uB in parallelI decode/correct x
![Page 22: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/22.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Algorithmic Based Fault Tolerance
Idea: incorporate the redundancy in the algorithm⇒use some specific properties of the
problem
Example: Matrix vector product [Dongarra 2006]
u uA
A Id R
x
I precompute the product B = A×[I R
]I compute x = uB in parallelI decode/correct x
![Page 23: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/23.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallel Exact linear algebra
Mathematics is the art of reducing any problemto linear algbra
W. Stein
⇒the core to be optimized for a lot of computations
I Matrix multiplication over GF(p)I Eliminations: Gauss, Gram-Schmidt (LLL), ...I Krylov iteration
![Page 24: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/24.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallel Exact linear algebra
Mathematics is the art of reducing any problemto linear algbra
W. Stein
⇒the core to be optimized for a lot of computationsI Matrix multiplication over GF(p)I Eliminations: Gauss, Gram-Schmidt (LLL), ...I Krylov iterationI Chinese Remainder Algorithm
![Page 25: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/25.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallel Exact linear algebra
Mathematics is the art of reducing any problemto linear algbra
W. Stein
⇒the core to be optimized for a lot of computationsI Matrix multiplication over GF(p)I Eliminations: Gauss, Gram-Schmidt (LLL), ...I Krylov iterationI Chinese Remainder Algorithm
![Page 26: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/26.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 27: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/27.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
Chinese Remainder Theorem
x1 x2 . . . xkx ∈ Z
where p1 × · · · × pk > x and xi = x mod pi ∀i
Definition(n, k)-code:C = {(x1, . . . , xn) ∈ Zp1 × · · · × Zpn s.t. there exists x ,0 ≤x < p1 . . . pk and xi = x mod pi ∀i}.
![Page 28: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/28.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
Chinese Remainder Theorem
x1 x2 . . . xkx ∈ Z xk+1 xn. . .
where p1 × · · · × pn > x and xi = x mod pi ∀i
Definition(n, k)-code:C = {(x1, . . . , xn) ∈ Zp1 × · · · × Zpn s.t. there exists x ,0 ≤x < p1 . . . pk and xi = x mod pi ∀i}.
![Page 29: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/29.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
Chinese Remainder Theorem
x1 x2 . . . xkx ∈ Z xk+1 xn. . .
where p1 × · · · × pn > x and xi = x mod pi ∀i
Definition(n, k)-code:C = {(x1, . . . , xn) ∈ Zp1 × · · · × Zpn s.t. there exists x ,0 ≤x < p1 . . . pk and xi = x mod pi ∀i}.
![Page 30: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/30.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
PropertyX ∈ C iff X < Mk .
p1 p2 . . . pk pk+1 pn. . .
Mk = p1 × · · · × pk
Mn = p1 × · · · × pn
![Page 31: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/31.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
Transmission channel ≡ Computation
Correction
Computation
Encoding
Decoding
Solution
Input
x = (x1, . . . , xn)
x′ = (r1, . . . , rn)
A = (A1, . . .An)
x ′ < Mr
x < Mk
A
![Page 32: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/32.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
Noisy Transmission channel ≡ Unsafe Computation
Correction
Computation
Encoding
Decoding
Solution
Input
x = (x1, . . . , xn)
x′ = (r1, . . . , rn)
A = (A1, . . .An)
x ′ < Mr
x < Mk
A
![Page 33: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/33.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Principle
Noisy Transmission channel ≡ Unsafe Computation
Correction
Computation
Encoding
Decoding
Solution
Input
x = (x1, . . . , xn)
x ′ = (r1, . . . , rn)
A = (A1, . . .An)
x ′ < Mr
x < Mk
A
![Page 34: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/34.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Property of the code
Redundancy : r = n − k
p1 p2 . . . pk pk+1 pn. . .
Mk = p1 × · · · × pk
Mn = p1 × · · · × pn
Detects up to r errors:
For X ′ = X + E with X ∈ C, ‖E‖ ≤ r ,
X ′ > Mk .
I Distance: r + 1I Corrects up to
⌊ r2
⌋errors:
![Page 35: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/35.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Property of the code
Redundancy : r = n − k
p1 p2 . . . pk pk+1 pn. . .
Mk = p1 × · · · × pk
Mn = p1 × · · · × pn
Detects up to r errors:
For X ′ = X + E with X ∈ C, ‖E‖ ≤ r ,
X ′ > Mk .
I Distance: r + 1I Corrects up to
⌊ r2
⌋errors: the closest code word
![Page 36: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/36.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Property of the code
Redundancy : r = n − k
p1 p2 . . . pk pk+1 pn. . .
Mk = p1 × · · · × pk
Mn = p1 × · · · × pn
Detects up to r errors:
For X ′ = X + E with X ∈ C, ‖E‖ ≤ r ,
X ′ > Mk .
I Distance: r + 1I Corrects up to
⌊ r2
⌋errors: the closest code word
![Page 37: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/37.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Correction[Mandelbaum 78]
CRTI π = p1 × · · · × pn,I πi = π/pi
I yi = π−1i mod pi .
x =∑
xiπiyi
I Error e of weight t ≤⌊ r
2
⌋I C: the product of the pi of the errors
x = x ′ − e= x ′ −
∑supp(e)
xiπiyi
= x ′ − Mn
CB′
for some 0 ≤ B′ ≤ C
![Page 38: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/38.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Correction[Mandelbaum 78]
CRTI π = p1 × · · · × pn,I πi = π/pi
I yi = π−1i mod pi .
x =∑
xiπiyi
I Error e of weight t ≤⌊ r
2
⌋I C: the product of the pi of the errors
x = x ′ − e= x ′ −
∑supp(e)
xiπiyi
= x ′ − Mn
CB′
for some 0 ≤ B′ ≤ C
![Page 39: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/39.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Correction
TheoremIf e ≤ (n − k) log pmin−log 2
log pmax+log pmin∣∣∣∣ x ′
Mn− B′
C
∣∣∣∣ ≤ 1C2
⇒use continued fractions to approximate x ′
Mn⇒recover B′/C ⇒e
![Page 40: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/40.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Continued fractions
Continued fractions: Every rational can be written aspq = a0 + 1
a1+1
a2+...
= [a0, . . . ,an]
Convergents: a truncation piqi
= [a0, . . . ,ai ] for i ≤ n
Property∣∣∣pq − piqi
∣∣∣ ≤ 1q2
iand is decreasing.
⇒I compute iteratively the convergents of x ′
Mnuntil∣∣∣ x ′
Mn− pi
qi
∣∣∣ ≤ 1C2 is satisfied.
I If Mnpiqi∈ Z and x = x ′ − e ≤ Mk ⇒Corrected
![Page 41: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/41.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Continued fractions
Continued fractions: Every rational can be written aspq = a0 + 1
a1+1
a2+...
= [a0, . . . ,an]
Convergents: a truncation piqi
= [a0, . . . ,ai ] for i ≤ n
Property∣∣∣pq − piqi
∣∣∣ ≤ 1q2
iand is decreasing.
⇒I compute iteratively the convergents of x ′
Mnuntil∣∣∣ x ′
Mn− pi
qi
∣∣∣ ≤ 1C2 is satisfied.
I If Mnpiqi∈ Z and x = x ′ − e ≤ Mk ⇒Corrected
![Page 42: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/42.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Continued fractions
Continued fractions: Every rational can be written aspq = a0 + 1
a1+1
a2+...
= [a0, . . . ,an]
Convergents: a truncation piqi
= [a0, . . . ,ai ] for i ≤ n
Property∣∣∣pq − piqi
∣∣∣ ≤ 1q2
iand is decreasing.
⇒I compute iteratively the convergents of x ′
Mnuntil∣∣∣ x ′
Mn− pi
qi
∣∣∣ ≤ 1C2 is satisfied.
I If Mnpiqi∈ Z and x = x ′ − e ≤ Mk ⇒Corrected
![Page 43: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/43.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 44: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/44.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Computation resources
3 types of resources:I Trusted
I 100% secureI Untrusted
I no security measures (public machines)I Semi-Trusted
I accessible by untrusted machines but with securitymeasures
I firewall, chroot, input checking...etc
![Page 45: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/45.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Basic scheme
![Page 46: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/46.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Basic scheme
![Page 47: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/47.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Basic schemeI we don’t know the number of tasks!I assuming the error rate is not fixedI Trade-off: iterations vs #forks
increasing iteratively the numberof tasks:
I quadratic growth: 2i
I ρi with 1 < ρ < 2I ρ-amortized: ρf (i) with f
sub-linearI f (i) = iα with 0 < α < 1I f (i) = i
log i , ....
![Page 48: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/48.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Basic schemeI we don’t know the number of tasks!I assuming the error rate is not fixedI Trade-off: iterations vs #forks
increasing iteratively the numberof tasks:
I quadratic growth: 2i
I ρi with 1 < ρ < 2I ρ-amortized: ρf (i) with f
sub-linearI f (i) = iα with 0 < α < 1I f (i) = i
log i , ....
![Page 49: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/49.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Online Post certification scheme
![Page 50: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/50.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Online Post certification scheme
![Page 51: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/51.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Online Post certification scheme
![Page 52: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/52.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Certification process
Read the decoder result R′
pick up a random prime picompute r = f over Zpi
reduce r ′ = R′ mod piif r ! = r ′ then
Return ERROR.else
Return SUCCESS with a given probabilityend if
![Page 53: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/53.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
![Page 54: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/54.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 55: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/55.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Removing the ordering of the moduli
worst case scenario :
every errors having a bigger contribution at the end,
Mandelbaum algorithm requires to sort the moduli pi :
I to ensures accurately the t = (n − k)/2 correctionrate
I to compute the optimal bound Cmax =∏n
i=n−k+1 pifor the search of convergents.
Relaxation:I no sort, only find pmin and pmax,I more suited for a future on-the-fly modelI only an approximation of the distance (off by 1 in the
worst cases)I less accurate Cmax,
![Page 56: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/56.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Removing the ordering of the moduli
worst case scenario :
every errors having a bigger contribution at the end,
Mandelbaum algorithm requires to sort the moduli pi :
I to ensures accurately the t = (n − k)/2 correctionrate
I to compute the optimal bound Cmax =∏n
i=n−k+1 pifor the search of convergents.
Relaxation:I no sort, only find pmin and pmax,I more suited for a future on-the-fly modelI only an approximation of the distance (off by 1 in the
worst cases)I less accurate Cmax,
![Page 57: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/57.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Dealing with negative numbers
x ∈ C iff 0 ≤ x < Mk
correct errorpositive negative
Mk
Mn/2
0 Mn
⇒Mandelbaum algorithm only works with positivenumbers.
(1 mod 5,1 mod 7,10 mod 11) could beI −34 with 0 errorsI 1 with 1 error
Work-around solution:I reconstruct both
x = CRT (x1, . . . , xn)
y = CRT (−x1, . . . ,−xn)
I discard the bad one with the certifier
![Page 58: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/58.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Dealing with negative numbers
x ∈ C iff 0 ≤ x < Mk
correct errorpositive negative
Mk
Mn/2
0 Mn
⇒Mandelbaum algorithm only works with positivenumbers.(1 mod 5,1 mod 7,10 mod 11) could be
I −34 with 0 errorsI 1 with 1 error
Work-around solution:I reconstruct both
x = CRT (x1, . . . , xn)
y = CRT (−x1, . . . ,−xn)
I discard the bad one with the certifier
![Page 59: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/59.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Dealing with negative numbers
x ∈ C iff 0 ≤ x < Mk
correct errorpositive negative
Mk
Mn/2
0 Mn
⇒Mandelbaum algorithm only works with positivenumbers.(1 mod 5,1 mod 7,10 mod 11) could be
I −34 with 0 errorsI 1 with 1 error
Work-around solution:I reconstruct both
x = CRT (x1, . . . , xn)
y = CRT (−x1, . . . ,−xn)
I discard the bad one with the certifier
![Page 60: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/60.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Chinese Remainder Algorithm requires a bound.For Determinant ⇒Hadamard:
|det(A)| ≤ nn(max |ai,j |)n/2 ≤∏
i
pi
Could be really pessimistic withI sparse/structured matrices,I matrices with special properties,...⇒discover the bound during the computation
I for each new modular result, reconstruct the resultI if it stabilizes k times ⇒correct result with
probability 1− 1/pk
![Page 61: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/61.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Chinese Remainder Algorithm requires a bound.For Determinant ⇒Hadamard:
|det(A)| ≤ nn(max |ai,j |)n/2 ≤∏
i
pi
Could be really pessimistic withI sparse/structured matrices,I matrices with special properties,...
⇒discover the bound during the computation
I for each new modular result, reconstruct the resultI if it stabilizes k times ⇒correct result with
probability 1− 1/pk
![Page 62: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/62.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Chinese Remainder Algorithm requires a bound.For Determinant ⇒Hadamard:
|det(A)| ≤ nn(max |ai,j |)n/2 ≤∏
i
pi
Could be really pessimistic withI sparse/structured matrices,I matrices with special properties,...⇒discover the bound during the computation
I for each new modular result, reconstruct the resultI if it stabilizes k times ⇒correct result with
probability 1− 1/pk
![Page 63: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/63.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Problem: how to detect errors without a bound?
?
Multiple solution candidates:
I Solution x1 with 1 error correctedI Solution x2 with 3 error correctedI Solution x3,with 2 errors corrected
The certifier discriminates the candidates
![Page 64: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/64.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Problem: how to detect errors without a bound?
?
Multiple solution candidates:
I Solution x1 with 1 error corrected
I Solution x2 with 3 error correctedI Solution x3,with 2 errors corrected
The certifier discriminates the candidates
![Page 65: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/65.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Problem: how to detect errors without a bound?
?
Multiple solution candidates:
I Solution x1 with 1 error correctedI Solution x2 with 3 error corrected
I Solution x3,with 2 errors corrected
The certifier discriminates the candidates
![Page 66: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/66.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Problem: how to detect errors without a bound?
?
Multiple solution candidates:
I Solution x1 with 1 error correctedI Solution x2 with 3 error correctedI Solution x3,with 2 errors corrected
The certifier discriminates the candidates
![Page 67: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/67.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Early termination
Problem: how to detect errors without a bound?
?
Multiple solution candidates:
I Solution x1 with 1 error correctedI Solution x2 with 3 error correctedI Solution x3,with 2 errors corrected
The certifier discriminates the candidates
![Page 68: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/68.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Computation scheme
![Page 69: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/69.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Multiple decoding
sign: 2 candidatesearly term.: multiple candidates, for each possible bound
Decoder returns a l of candidateswhile |l | > 1 do
pick a random prime piCertifier compute fi = f mod pisieve un-compatible elements in lif |l | = 1 then
return l.front()end if
end while
![Page 70: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/70.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 71: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/71.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallelism : Kaapi
Kernel for Adaptive Asynchronous Parallel Interface
Virtualization of the parallel platformI SMP, distributed architecturesI virtualizes the processors⇒programmer only focus on a high level description
of the parallelismI Scheduling:
I static optimization from the data flow graphI dynamic using work stealing at fine grain granularity
![Page 72: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/72.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallelism : KaapiExample of code
struct Fibonacci { void operator()( int n, a1::Shared_w<int& result )
! { ! ! if (n < 2) result.=rite( n );
! ! else {
! ! ! a1::Shared<int> subresult1; ! ! ! a1::Shared<int> subresult2;
! ! ! a1::Fork<Fibonacci>()(n-1, subresult1); ! ! ! a1::Fork<Fibonacci>()(n-2, subresult2);
! ! ! a1::Fork<Sum>()(result, subresult1, subresult2);
! ! } ! }
};
struct Sum {
void operator()(!a1::Shared_w<int& result, ! ! ! ! a1::Shared_w<int> sr1,
! ! ! ! a1::Shared_w<int> sr2 ) ! { result.=rite( sr1.read() + sr2.read() ); }
}
31
![Page 73: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/73.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Parallelism : KaapiExample of code
struct Fibonacci { void operator()( int n, a1::Shared_w<int> result )
! { ! ! if (n < 2) result.write( n );
! ! else {
! ! ! a1::Shared<int> subresult1; ! ! ! a1::Shared<int> subresult2;
! ! ! a1::Fork<Fibonacci>()(n-1, subresult1); ! ! ! a1::Fork<Fibonacci>()(n-2, subresult2);
! ! ! a1::Fork<Sum>()(result, subresult1, subresult2);
! ! } ! }
};
struct Sum {
void operator()(!a1::Shared_w<int> result, ! ! ! ! a1::Shared_w<int> sr1,
! ! ! ! a1::Shared_w<int> sr2 ) ! { result.write( sr1.read() + sr2.read() ); }
}
32
![Page 74: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/74.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Computations
Linear Algebra tasks: LinBox
I Fast MatMul (BLAS, delayed modular reductions,Strassen,...)
I Efficient, in-place, block Gaussian elimination
Decoder: Sage
I ease of use for experimenting different prototypesI good efficiencyI but BIG executable ⇒hard to distribute
![Page 75: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/75.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Computations
Linear Algebra tasks: LinBox
I Fast MatMul (BLAS, delayed modular reductions,Strassen,...)
I Efficient, in-place, block Gaussian elimination
Decoder: Sage
I ease of use for experimenting different prototypesI good efficiencyI but BIG executable ⇒hard to distribute
![Page 76: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/76.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
The Grid 5000 architecture
I Country-wide grid of 9 sites (Nancy, Lyon, Grenoble,Toulouse, Lille, Bordeaux, Orsay, Rennes, Sophia)
I Each site has 2-3 clusters of 60-600 computersI heterogeneous architecture⇒3202 processors, 5714 cores
![Page 77: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/77.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Live demo
![Page 78: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/78.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
OutlineInroduction
Distributed and Cloud computingFault toleranceParallel Exact linear algebra
Chinese Remainder and Error correcting codePrincipleCorrection
Overview of the computation scheme
Further improvementsOrdering, signEarly terminationUpdated computation scheme
Application to computing DeterminantsSoftwaresLive demo
Perspectives
![Page 79: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/79.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
PerspectivesA fully on-the-fly computation model:
I Need a fast CRT working on the flyI Refine the amortized technique
![Page 80: Computing exactly with unsafe resources: fault tolerant ...lig-membres.imag.fr/pernet/Publications/SD16_mkhonji_cpernet.pdf · Perspectives Cloud Computing [Above the clouds, a Berkeley](https://reader035.vdocument.in/reader035/viewer/2022071214/6043813330b83f3806647832/html5/thumbnails/80.jpg)
fault tolerant linearalgebra
Majid Khonji andClément Pernet
InroductionDistributed and Cloudcomputing
Fault tolerance
Parallel Exact linear algebra
ChineseRemainder andError correctingcodePrinciple
Correction
Overview of thecomputationscheme
FurtherimprovementsOrdering, sign
Early termination
Updated computationscheme
Application tocomputingDeterminantsSoftwares
Live demo
Perspectives
Sage ?
Revive the Dsage efforts?
distributed computations: Kaapi optional package?
I resource obliviousI task parallelism, work-stealing, ABFTI perspectives for GPU/multi-CPU support
P2P, volunteer computing
Spare your cycles and contribute solving BSDconjecture!