armadillo: template metaprogramming for linear...
TRANSCRIPT
-
1
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
Armadillo: Template Metaprogramming forLinear Algebra
Leo Antunes and Markus Kruber
Seminar on Automation, Compilers, and Code-GenerationHPAC
June 2014
Leo Antunes and Markus Kruber Armadillo
http://hpac.rwth-aachen.de
-
2
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
What is Armadillo?Code example
What is Armadillo?
. C++ Library for Linear Algebra
. Aims for ease of use
. Still wants performance
. Abstract interface to BLAS and LAPACK
. Pure template library (only headers)
Leo Antunes and Markus Kruber Armadillo
-
2
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
What is Armadillo?Code example
What is Armadillo?
. C++ Library for Linear Algebra
. Aims for ease of use
. Still wants performance
. Abstract interface to BLAS and LAPACK
. Pure template library (only headers)
Leo Antunes and Markus Kruber Armadillo
-
2
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
What is Armadillo?Code example
What is Armadillo?
. C++ Library for Linear Algebra
. Aims for ease of use
. Still wants performance
. Abstract interface to BLAS and LAPACK
. Pure template library (only headers)
Leo Antunes and Markus Kruber Armadillo
-
2
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
What is Armadillo?Code example
What is Armadillo?
. C++ Library for Linear Algebra
. Aims for ease of use
. Still wants performance
. Abstract interface to BLAS and LAPACK
. Pure template library (only headers)
Leo Antunes and Markus Kruber Armadillo
-
3
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
What is Armadillo?Code example
Code example
Calculating A−1BvvT :
mat A =
-
4
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Refresher: C++ Templates
. Allow type-agnostic development (e.g.: class Array)”generics” in Java, C#; ”parametric polymorphism” in Haskell, Scala
. Realized at compile-time → code generation
. Generate different class for each type used in code(thus: ”template”)
. Therefore: no dynamic typing
. Allows for further optimizations by compiler
Leo Antunes and Markus Kruber Armadillo
-
4
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Refresher: C++ Templates
. Allow type-agnostic development (e.g.: class Array)”generics” in Java, C#; ”parametric polymorphism” in Haskell, Scala
. Realized at compile-time → code generation
. Generate different class for each type used in code(thus: ”template”)
. Therefore: no dynamic typing
. Allows for further optimizations by compiler
Leo Antunes and Markus Kruber Armadillo
-
4
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Refresher: C++ Templates
. Allow type-agnostic development (e.g.: class Array)”generics” in Java, C#; ”parametric polymorphism” in Haskell, Scala
. Realized at compile-time → code generation
. Generate different class for each type used in code(thus: ”template”)
. Therefore: no dynamic typing
. Allows for further optimizations by compiler
Leo Antunes and Markus Kruber Armadillo
-
4
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Refresher: C++ Templates
. Allow type-agnostic development (e.g.: class Array)”generics” in Java, C#; ”parametric polymorphism” in Haskell, Scala
. Realized at compile-time → code generation
. Generate different class for each type used in code(thus: ”template”)
. Therefore: no dynamic typing
. Allows for further optimizations by compiler
Leo Antunes and Markus Kruber Armadillo
-
4
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Refresher: C++ Templates
. Allow type-agnostic development (e.g.: class Array)”generics” in Java, C#; ”parametric polymorphism” in Haskell, Scala
. Realized at compile-time → code generation
. Generate different class for each type used in code(thus: ”template”)
. Therefore: no dynamic typing
. Allows for further optimizations by compiler
Leo Antunes and Markus Kruber Armadillo
-
5
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
C++ Template example
Type-dependent:
class Adder{
int base;
public:
Adder(int x):base(x){}
int doit(int y){
return base+y;
}
};
Type-agnostic:
template
class Adder{
TYPE base;
public:
Adder(TYPE x):base(x){}
TYPE doit(TYPE y){
return base+y;
}
};
Leo Antunes and Markus Kruber Armadillo
-
6
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Template Metaprogramming
. Use C++ templates as a language in itself
. Type specification as pattern matching → Branching(independently of C++’s control structures)
. Template used inside template → Recursion
. Branching and Recursion → Turing complete
. Allows offloading of computation to compile-time
Leo Antunes and Markus Kruber Armadillo
-
6
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Template Metaprogramming
. Use C++ templates as a language in itself
. Type specification as pattern matching → Branching(independently of C++’s control structures)
. Template used inside template → Recursion
. Branching and Recursion → Turing complete
. Allows offloading of computation to compile-time
Leo Antunes and Markus Kruber Armadillo
-
6
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Template Metaprogramming
. Use C++ templates as a language in itself
. Type specification as pattern matching → Branching(independently of C++’s control structures)
. Template used inside template → Recursion
. Branching and Recursion → Turing complete
. Allows offloading of computation to compile-time
Leo Antunes and Markus Kruber Armadillo
-
6
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Template Metaprogramming
. Use C++ templates as a language in itself
. Type specification as pattern matching → Branching(independently of C++’s control structures)
. Template used inside template → Recursion
. Branching and Recursion → Turing complete
. Allows offloading of computation to compile-time
Leo Antunes and Markus Kruber Armadillo
-
6
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Template Metaprogramming
. Use C++ templates as a language in itself
. Type specification as pattern matching → Branching(independently of C++’s control structures)
. Template used inside template → Recursion
. Branching and Recursion → Turing complete
. Allows offloading of computation to compile-time
Leo Antunes and Markus Kruber Armadillo
-
7
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Factorial: normal C++
int factorial(int n) {
if(n < 1){
return 1;
}
else {
return n*factorial(n-1);
}
}
Leo Antunes and Markus Kruber Armadillo
-
8
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
C++ TemplatesOverviewExamples
Factorial: Template metaprogramming
template
struct factorial {
const static int result =
n * factorial :: result;
};
template
struct factorial {
const static int result = 1 ;
};
. Types used for pattern matching.
. Structs used as functions.
. Most ”specific” template chosen first. (cf. Haskell)
Leo Antunes and Markus Kruber Armadillo
-
9
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
TMP in Armadillo
. Classes for common LA objects: matrix, vector, etc
. Literal expression in code converted to template tree, e.g.:A*inv(B) → Glue
. Infers information on arguments via pattern matching
. User can provide information about objects.e.g.: diagmat(A) or trimatu(A)
. Actual calculation delayed until runtime request for results
Leo Antunes and Markus Kruber Armadillo
-
9
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
TMP in Armadillo
. Classes for common LA objects: matrix, vector, etc
. Literal expression in code converted to template tree, e.g.:A*inv(B) → Glue
. Infers information on arguments via pattern matching
. User can provide information about objects.e.g.: diagmat(A) or trimatu(A)
. Actual calculation delayed until runtime request for results
Leo Antunes and Markus Kruber Armadillo
-
9
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
TMP in Armadillo
. Classes for common LA objects: matrix, vector, etc
. Literal expression in code converted to template tree, e.g.:A*inv(B) → Glue
. Infers information on arguments via pattern matching
. User can provide information about objects.e.g.: diagmat(A) or trimatu(A)
. Actual calculation delayed until runtime request for results
Leo Antunes and Markus Kruber Armadillo
-
9
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
TMP in Armadillo
. Classes for common LA objects: matrix, vector, etc
. Literal expression in code converted to template tree, e.g.:A*inv(B) → Glue
. Infers information on arguments via pattern matching
. User can provide information about objects.e.g.: diagmat(A) or trimatu(A)
. Actual calculation delayed until runtime request for results
Leo Antunes and Markus Kruber Armadillo
-
9
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
TMP in Armadillo
. Classes for common LA objects: matrix, vector, etc
. Literal expression in code converted to template tree, e.g.:A*inv(B) → Glue
. Infers information on arguments via pattern matching
. User can provide information about objects.e.g.: diagmat(A) or trimatu(A)
. Actual calculation delayed until runtime request for results
Leo Antunes and Markus Kruber Armadillo
-
10
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Simple example: inv(inv(A))
. Rationale: should not compute anything; just return A
. Inner call to inv() returns a Op object.
. Outter call matches the definition of inv() withOp argument. Returns A.
. Advantages:
. Obvious: no actual inversion.
. Not obvious: multiple-dispatch without the runtime overhead.
. Allows compiler to get rid of unneeded function calls.
Leo Antunes and Markus Kruber Armadillo
-
10
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Simple example: inv(inv(A))
. Rationale: should not compute anything; just return A
. Inner call to inv() returns a Op object.
. Outter call matches the definition of inv() withOp argument. Returns A.
. Advantages:
. Obvious: no actual inversion.
. Not obvious: multiple-dispatch without the runtime overhead.
. Allows compiler to get rid of unneeded function calls.
Leo Antunes and Markus Kruber Armadillo
-
11
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Matrix Chain Multiplication
. Matrix multiplication is associative.(ABC )D = A(BCD) = . . .
. Order affects the number of operations.
. A ∈ R60×5, B ∈ R5×30, C ∈ R30×10
. (AB)C = (60 · 5 · 30) + (60 · 30 · 10) = 27.000
. A(BC ) = (5 · 30 · 10) + (60 · 5 · 10) = 4.500
. E.g. Matlab: naive left to right ordering
. Complexity: O(n3) with dynamic programmingSmarter approach: O(n · log(n))
Leo Antunes and Markus Kruber Armadillo
-
11
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Matrix Chain Multiplication
. Matrix multiplication is associative.(ABC )D = A(BCD) = . . .
. Order affects the number of operations.
. A ∈ R60×5, B ∈ R5×30, C ∈ R30×10
. (AB)C = (60 · 5 · 30) + (60 · 30 · 10) = 27.000
. A(BC ) = (5 · 30 · 10) + (60 · 5 · 10) = 4.500
. E.g. Matlab: naive left to right ordering
. Complexity: O(n3) with dynamic programmingSmarter approach: O(n · log(n))
Leo Antunes and Markus Kruber Armadillo
-
11
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Matrix Chain Multiplication
. Matrix multiplication is associative.(ABC )D = A(BCD) = . . .
. Order affects the number of operations.
. A ∈ R60×5, B ∈ R5×30, C ∈ R30×10
. (AB)C = (60 · 5 · 30) + (60 · 30 · 10) = 27.000
. A(BC ) = (5 · 30 · 10) + (60 · 5 · 10) = 4.500
. E.g. Matlab: naive left to right ordering
. Complexity: O(n3) with dynamic programmingSmarter approach: O(n · log(n))
Leo Antunes and Markus Kruber Armadillo
-
12
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D
. Instantiated as:Glue
. Determines ”depth” of expression: N = 3(amount of operators with same precedence)
. Pattern matches specialized function for N = 3 which:
. Decide (ABC )D vs. A(BCD)
. Heuristic: cost(ABC ) := columns(A) · rows(C )
. Recursively calls reordering function with N = 2
. Thus: linear (but sub-optimal) reordering of operations
. Remember: all this at compile-time!
Leo Antunes and Markus Kruber Armadillo
-
12
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D
. Instantiated as:Glue
. Determines ”depth” of expression: N = 3(amount of operators with same precedence)
. Pattern matches specialized function for N = 3 which:
. Decide (ABC )D vs. A(BCD)
. Heuristic: cost(ABC ) := columns(A) · rows(C )
. Recursively calls reordering function with N = 2
. Thus: linear (but sub-optimal) reordering of operations
. Remember: all this at compile-time!
Leo Antunes and Markus Kruber Armadillo
-
12
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D
. Instantiated as:Glue
. Determines ”depth” of expression: N = 3(amount of operators with same precedence)
. Pattern matches specialized function for N = 3 which:
. Decide (ABC )D vs. A(BCD)
. Heuristic: cost(ABC ) := columns(A) · rows(C )
. Recursively calls reordering function with N = 2
. Thus: linear (but sub-optimal) reordering of operations
. Remember: all this at compile-time!
Leo Antunes and Markus Kruber Armadillo
-
13
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D (cont.)
. ((AB)C )D 3
⇒ O(n3)
. (A(BC ))D 3
⇒ O(n3)
. A((BC )D) 3
⇒ O(n3)
. A(B(CD)) 3
⇒ O(n3)
. (AB)(CD) 7
⇒ O(n2)
n Armadillo All
3 2 24 4 55 8 146 16 427 32 132
n
n
A
1
n
B
n1
C
n
n
D
Leo Antunes and Markus Kruber Armadillo
-
13
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D (cont.)
. ((AB)C )D 3
⇒ O(n3)
. (A(BC ))D 3
⇒ O(n3)
. A((BC )D) 3
⇒ O(n3)
. A(B(CD)) 3
⇒ O(n3)
. (AB)(CD) 7
⇒ O(n2)
n Armadillo All
3 2 24 4 55 8 146 16 427 32 132
n
n
A
1
n
B
n1
C
n
n
D
Leo Antunes and Markus Kruber Armadillo
-
13
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D (cont.)
. ((AB)C )D 3 ⇒ O(n3)
. (A(BC ))D 3 ⇒ O(n3)
. A((BC )D) 3 ⇒ O(n3)
. A(B(CD)) 3 ⇒ O(n3)
. (AB)(CD) 7 ⇒ O(n2)
n Armadillo All
3 2 24 4 55 8 146 16 427 32 132
n
n
A
1
n
B
n1
C
n
n
D
Leo Antunes and Markus Kruber Armadillo
-
13
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Complex example: A*B*C*D (cont.)
. ((AB)C )D 3 ⇒ O(n3)
. (A(BC ))D 3 ⇒ O(n3)
. A((BC )D) 3 ⇒ O(n3)
. A(B(CD)) 3 ⇒ O(n3)
. (AB)(CD) 7 ⇒ O(n2)
n Armadillo All
3 2 24 4 55 8 146 16 427 32 132
n
n
A
1
n
B
n1
C
n
n
D
Leo Antunes and Markus Kruber Armadillo
-
14
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Other improvements
. A = A1 + A2 + . . . + AnNo temporary matrix besides output
. y = A−1x → y = solve(A, x)Faster and more accurate
. inv(diag(A))Elementwise inverse
. as scalar(r ∗ X ∗ q) with r ∈ R1×n,X ∈ Rn×n, q ∈ Rn×1Result of computation is 1× 1 matrix
. Lots more!
Leo Antunes and Markus Kruber Armadillo
-
14
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Other improvements
. A = A1 + A2 + . . . + AnNo temporary matrix besides output
. y = A−1x → y = solve(A, x)Faster and more accurate
. inv(diag(A))Elementwise inverse
. as scalar(r ∗ X ∗ q) with r ∈ R1×n,X ∈ Rn×n, q ∈ Rn×1Result of computation is 1× 1 matrix
. Lots more!
Leo Antunes and Markus Kruber Armadillo
-
14
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Other improvements
. A = A1 + A2 + . . . + AnNo temporary matrix besides output
. y = A−1x → y = solve(A, x)Faster and more accurate
. inv(diag(A))Elementwise inverse
. as scalar(r ∗ X ∗ q) with r ∈ R1×n,X ∈ Rn×n, q ∈ Rn×1Result of computation is 1× 1 matrix
. Lots more!
Leo Antunes and Markus Kruber Armadillo
-
14
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Other improvements
. A = A1 + A2 + . . . + AnNo temporary matrix besides output
. y = A−1x → y = solve(A, x)Faster and more accurate
. inv(diag(A))Elementwise inverse
. as scalar(r ∗ X ∗ q) with r ∈ R1×n,X ∈ Rn×n, q ∈ Rn×1Result of computation is 1× 1 matrix
. Lots more!
Leo Antunes and Markus Kruber Armadillo
-
14
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
OverviewExamples
Other improvements
. A = A1 + A2 + . . . + AnNo temporary matrix besides output
. y = A−1x → y = solve(A, x)Faster and more accurate
. inv(diag(A))Elementwise inverse
. as scalar(r ∗ X ∗ q) with r ∈ R1×n,X ∈ Rn×n, q ∈ Rn×1Result of computation is 1× 1 matrix
. Lots more!
Leo Antunes and Markus Kruber Armadillo
-
15
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
Conclusion
3 Easy to use for non-LA-experts
3 Some optimization comes built-in
7 Cannot optimize all cases
7 Debugging for more complex code can be a problem (C++)
. Our recommendation: Use it! Don’t try to understand it.
Leo Antunes and Markus Kruber Armadillo
-
15
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
Conclusion
3 Easy to use for non-LA-experts
3 Some optimization comes built-in
7 Cannot optimize all cases
7 Debugging for more complex code can be a problem (C++)
. Our recommendation: Use it! Don’t try to understand it.
Leo Antunes and Markus Kruber Armadillo
-
15
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
Conclusion
3 Easy to use for non-LA-experts
3 Some optimization comes built-in
7 Cannot optimize all cases
7 Debugging for more complex code can be a problem (C++)
. Our recommendation: Use it! Don’t try to understand it.
Leo Antunes and Markus Kruber Armadillo
-
16
IntroductionTemplate Metaprogramming
TMP in ArmadilloConclusion
Thanks! Questions?
Sources:
i Conrad Sanderson, Armadillo: An Open Source C++ Linear AlgebraLibrary for Fast Prototyping and Computationally Intensive Experiments,Technical Report, NICTA, 2010.
i Armadillo source code.i C++ Programming, chapter Template Meta-Programming, Wikibooksi Matrix chain multiplication, Wikipedia
Leo Antunes and Markus Kruber Armadillo
http://arma.sourceforge.net/armadillo_nicta_2010.pdfhttp://arma.sourceforge.net/armadillo_nicta_2010.pdfhttp://arma.sourceforge.net/armadillo_nicta_2010.pdfhttp://sourceforge.net/projects/arma/files/https://en.wikibooks.org/wiki/C++_Programming/Templates/Template_Meta-Programminghttps://en.wikipedia.org/wiki/Matrix_chain_multiplication
IntroductionWhat is Armadillo?Code example
Template MetaprogrammingC++ TemplatesOverviewExamples
TMP in ArmadilloOverviewExamples
Conclusion