Post on 13-Jan-2016
Linking Genetic Profiles to Biological Outcome
Paul Fogel, Consultant, Paris
S. Stanley Young, National Institute of Statistical Sciences
NISS, NMF Workshop, February 23, 2007
Scotch whisky database
[Figure: Original matrix (rows 1 to 86 by flavor: Body, Sweetness, Smoky, Medicinal, Tobacco, Honey, Spicy, Winey, Nutty, Malty, Fruity, Floral) = Mixing levels (weights) X Prototypical flavor patterns (Comp 1 to Comp 4) + Residual]
How many flavor patterns?
[Figure: Scree plot (eigenvalues, 0 to 250, against rows 0 to 13)]
[Figure: Profile likelihood of the eigenvalues (Zhu and Ghodsi), against rows 0 to 13]
[Figure: Volume filled (determinant) scree plot, against rows 0 to 13]
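As a rough illustration of the profile likelihood idea credited to Zhu and Ghodsi, the sketch below splits a sorted list of eigenvalues at every candidate position, models the two parts as Gaussians with separate means and a pooled variance, and keeps the split with the highest log-likelihood. The function names and the example eigenvalues are assumptions made for illustration; they are not taken from the slides.

```python
import math

def profile_log_likelihood(values, q):
    """Log-likelihood of splitting the sorted values after position q.

    Assumption: two Gaussian groups with their own means and a common
    variance estimated by pooling (denominator p - 2).
    """
    d = sorted(values, reverse=True)
    p = len(d)
    first, second = d[:q], d[q:]
    mean1 = sum(first) / len(first)
    mean2 = sum(second) / len(second)
    ss = sum((x - mean1) ** 2 for x in first) + sum((x - mean2) ** 2 for x in second)
    sigma2 = max(ss / (p - 2), 1e-12)  # small floor avoids log(0)
    return -0.5 * p * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

def choose_rank(values):
    """Return the number of leading eigenvalues kept by the best split."""
    p = len(values)  # needs at least 3 values
    return max(range(1, p), key=lambda q: profile_log_likelihood(values, q))

# Hypothetical eigenvalues with a visible gap after the third one.
example = [10.0, 9.5, 9.0, 1.0, 0.8, 0.5, 0.3]
rank = choose_rank(example)
```

The chosen rank is only a suggestion for the number of components; it is usually read alongside the scree plot rather than instead of it.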
[Figure: factorization (Comp 1 to Comp 4, rows 1 to 86 by flavor) highlighting AnCnoc. Highlighted flavors: Floral, Sweetness, Fruity, Malty, Nutty]
[Figure: factorization (Comp 1 to Comp 4, rows 1 to 86 by flavor) highlighting Balmenach. Highlighted flavors: Winey, Body, Honey, Sweetness, Nutty, Malty]
[Figure: factorization (Comp 1 to Comp 4, rows 1 to 86 by flavor) highlighting GlenGarioch. Highlighted flavors: Spicy, Fruity, Sweetness, Body, Malty]
[Figure: factorization (Comp 1 to Comp 4, rows 1 to 86 by flavor) highlighting Lagavulin & Laphroaig. Highlighted flavors: Medicinal, Smoky, Body]
Statistical Issues
1. Massive testing: Hundreds of “omic” predictors and several questions per sample.
2. Family-wise error rate versus false discovery rate.
3. Missing data, outliers.
Don’t fool yourself.
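To make the contrast between family-wise and false discovery control concrete, here is a minimal sketch comparing a Bonferroni correction with the Benjamini-Hochberg step-up procedure. The p-values are made up for illustration, and the slides do not say which procedures the authors used; both are shown only as standard examples of each error rate.

```python
def bonferroni(pvalues, alpha=0.05):
    """Family-wise control: reject only p-values at or below alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def benjamini_hochberg(pvalues, alpha=0.05):
    """False discovery rate control via the step-up rule.

    Find the largest rank k with p_(k) <= alpha * k / m and reject the
    k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_k = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= alpha * rank / m:
            largest_k = rank
    reject = [False] * m
    for index in order[:largest_k]:
        reject[index] = True
    return reject

# Hypothetical p-values for ten predictors.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_bonferroni = sum(bonferroni(pvals))
n_bh = sum(benjamini_hochberg(pvals))
```

With hundreds of predictors the Bonferroni threshold shrinks quickly, which is the loss of power that motivates false discovery control.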
Matrix Factorization Methods
1. Principal component analysis.
2. Singular value decomposition.
3. Non-negative matrix factorization.
4. Independent component analysis.
5. Robust MF.
Area of active research.
Key Papers
1. Good (1969) Technometrics – SVD.
2. Liu et al. (2003) PNAS – rSVD.
3. Lee and Seung (1999) Nature – NMF.
4. Kim and Tidor (2003) Genome Research.
5. Brunet et al. (2004) PNAS – Microarray.
SVD eigenvectors come from a composite of mechanisms.
NMF commits one vector to each mechanism.
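A small numerical sketch of the point about SVD: when a non-negative matrix is built from two separate mechanisms, the first right singular vector is all positive and the second, being orthogonal to it, must mix positive and negative entries, so it blends the two mechanisms rather than isolating one. The matrices below are invented for illustration and assume NumPy is available.

```python
import numpy as np

# Two hypothetical mechanisms: one acts on variables 0-1, the other on 2-3.
H_true = np.array([[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])
# Hypothetical non-negative mixing weights for five samples.
W_true = np.array([[3.0, 1.0],
                   [2.0, 2.0],
                   [1.0, 3.0],
                   [4.0, 1.0],
                   [1.0, 4.0]])
A = W_true @ H_true

U, s, Vt = np.linalg.svd(A, full_matrices=False)
first, second = Vt[0], Vt[1]
if first.sum() < 0:  # singular vectors are defined up to sign
    first = -first

# The leading vector loads on all four variables at once.
first_is_positive = bool((first > 0).all())
# The second vector contrasts the two mechanisms instead of isolating one.
second_has_mixed_signs = bool(second.min() < 0 < second.max())
```

A non-negative factorization of the same matrix is free to recover the two rows of `H_true` up to scaling, which is the sense in which it commits one vector to each mechanism.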
NMF Algorithm
Green are the “spectra”. Red are the “weights”.
$A = WH + E$
[Figure: matrix A, with axes labeled Samples and Genes or Compounds]
Start with random elements in red and green.
Optimize so that $\sum_{i,j} (a_{ij} - (WH)_{ij})^2$ is minimized.
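The slide does not say which optimizer is used. One common choice is the multiplicative update rule of Lee and Seung for the squared error, sketched below under that assumption with NumPy. The function name `nmf`, the iteration count, and the small constant `eps` are illustrative choices, not details from the talk.

```python
import numpy as np

def nmf(A, k, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix A (n x m) as W (n x k) times H (k x m).

    Starts from random non-negative factors and applies multiplicative
    updates that reduce the sum of squared residuals.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    errors = [float(np.sum((A - W @ H) ** 2))]  # error of the random start
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H can never become negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.sum((A - W @ H) ** 2)))
    return W, H, errors

# Hypothetical data: an exactly rank-2 non-negative matrix.
data_rng = np.random.default_rng(1)
A_demo = data_rng.random((20, 2)) @ data_rng.random((2, 10))
W_demo, H_demo, history = nmf(A_demo, k=2)
```

Because the start is random, different seeds can give different factors; running several starts and keeping the lowest error is a common precaution.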
Inference
• Test each variable sequentially within an ordered set. Each set corresponds to a particular eigenvector, which has been ordered by decreasing values.
Increase in statistical power.
Genomic example.
Simulation.
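The slide does not spell out the stopping rule for sequential testing. A minimal sketch, assuming the usual fixed-sequence procedure: variables are visited in their pre-specified order, each is tested at the full level allotted to the set, and testing stops at the first variable that fails. The function name and the p-values are hypothetical.

```python
def fixed_sequence_test(pvalues, alpha):
    """Return the positions rejected when testing in the given order.

    Each hypothesis is tested at level alpha; the first p-value above
    alpha stops the procedure, so later hypotheses are never tested.
    """
    rejected = []
    for position, p in enumerate(pvalues):
        if p > alpha:
            break
        rejected.append(position)
    return rejected

# Hypothetical p-values, already ordered by decreasing loading.
ordered_pvalues = [0.001, 0.02, 0.2, 0.001]
hits = fixed_sequence_test(ordered_pvalues, alpha=0.05)
```

No per-variable correction is applied inside the set, which is where the gain in power comes from; the price is that a small p-value after the stopping point, like the last one here, is ignored.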
Microarray Example
• Group AML: patients with acute myeloid leukemia
• Group ALL: patients with acute lymphoblastic leukemia
– Subgroup ALL-T: T cell subtypes
– Subgroup ALL-B: B cell subtypes
Golub, T.R. et al. (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286, 531–537.
Clustering
NMF clusters samples correctly.
Brunet et al. (2004). PNAS, vol. 101, no. 12, 4164–4169.
Additional subgroup of ALL-B.
Cluster 3 ALL-B2 (169 genes)
Immune Response: 10 genes (p = 0.00019)
Cell Growth and Proliferation: 61 genes
Cluster 1 ALL-B1 (33 genes)
RNA Processing: 11 genes, P = 0.00260
Cell Cycle: 12 genes
Transcription: 16 genes
DNA Repair and Replication: 11 genes, P = 0.01519
MHC class II: 5 genes
MHC class I & II: 6 genes, P = 0.00018
Proteasome: 7 genes, P = 0.00054
Immune Response: 28 genes (p = 0.00047)
Sequential testing
Upregulation in ALL-B2 genes
Higher rate of transcription and replication processes
More:
Proliferative nature compared with ALL-B1
Proteasomal activity
Energy production.
Simulation
[Figure: response Y plotted by Group (N, N, T1, T1, T1, T2, T2)]
Genes 1-5: up-regulated by T1
Genes 6-10: up-regulated by T2
Genes 11-20: up-regulated by T1 and T2
Intragroup correlation structure
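A data set with this layout could be generated along the following lines. Only the gene blocks and the group labels come from the slide; the baseline, effect size, noise levels, total number of genes, and the way intragroup correlation is induced (a shared per-sample factor within each gene block) are all assumptions made for this sketch, which relies on NumPy.

```python
import numpy as np

def simulate(n_genes=50, seed=0, baseline=100.0, effect=50.0,
             noise_sd=5.0, shared_sd=5.0):
    """Return (Y, groups): a genes x samples matrix and the sample labels.

    All numeric defaults are hypothetical. n_genes must be at least 20.
    """
    rng = np.random.default_rng(seed)
    groups = np.array(["N", "N", "T1", "T1", "T1", "T2", "T2"])
    t1 = (groups == "T1").astype(float)
    t2 = (groups == "T2").astype(float)
    # Gene blocks from the slide (0-based row slices) and their treatments.
    blocks = [(slice(0, 5), t1), (slice(5, 10), t2), (slice(10, 20), t1 + t2)]
    Y = baseline + rng.normal(0.0, noise_sd, size=(n_genes, groups.size))
    for rows, indicator in blocks:
        Y[rows] += effect * indicator  # up-regulation in treated samples
        # One shared per-sample factor per block correlates its genes.
        Y[rows] += rng.normal(0.0, shared_sd, size=groups.size)
    return Y, groups

Y_sim, labels = simulate()
gene1_shift = Y_sim[0, labels == "T1"].mean() - Y_sim[0, labels == "N"].mean()
```

Genes beyond the first twenty carry no treatment effect and act as the null genes against which power and FDR would be measured.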
Simulation results
Increased power
Same level of FDR
For more details see paper
Summary
• The strategy is conceptually simple:
– Non-negative matrix factorization is used to create groups of genes that are moving together in the dataset.
– The error rate to be controlled is allocated over these groups.
– Within each group, genes are tested sequentially.
• The strategy should be effective if there are sets of genes moving together so that group formation reflects biological reality.
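The grouping and allocation steps might look like the sketch below. It assumes each gene is assigned to the component on which it loads most heavily, that genes are ordered by decreasing loading within a group, and that the error rate is split equally across non-empty groups. None of these details is stated on the slide, and the loading matrix and gene names are made up; NumPy is assumed.

```python
import numpy as np

def ordered_groups(H, gene_names):
    """Group genes by dominant component, sorted by decreasing loading.

    H is a components x genes matrix of non-negative loadings
    (a hypothetical layout for this sketch).
    """
    H = np.asarray(H, dtype=float)
    dominant = H.argmax(axis=0)
    groups = {}
    for comp in range(H.shape[0]):
        members = np.nonzero(dominant == comp)[0]
        order = members[np.argsort(-H[comp, members], kind="stable")]
        groups[comp] = [gene_names[i] for i in order]
    return groups

def allocate_alpha(groups, alpha=0.05):
    """Split alpha equally over the groups that contain at least one gene."""
    nonempty = [comp for comp, genes in groups.items() if genes]
    return {comp: alpha / len(nonempty) for comp in nonempty}

# Hypothetical loadings for four genes on two components.
H_demo = [[0.9, 0.1, 0.5, 0.0],
          [0.2, 0.8, 0.1, 0.3]]
names = ["g1", "g2", "g3", "g4"]
gene_groups = ordered_groups(H_demo, names)
levels = allocate_alpha(gene_groups)
```

Each group's ordered list and level can then be handed to whatever sequential test is applied within the group.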
Areas of research:
Robust algorithms
Multiblock NMF (e.g. relate active motifs with differentially expressed genes)
Speed
Contact Information
Independent consultant
Paul Fogel
paul.fogel@wanadoo.fr
+33 1 43 26 16 86
Stan Young
National Institute of Statistical Sciences
young@niss.org
919 685 9328
www.niss.org/irMF
Literature
Software