Linking Genetic Profiles to Biological Outcome. Paul Fogel, Consultant, Paris. S. Stanley Young, National Institute of Statistical Sciences. NISS, NMF Workshop, February 23, '07.


Page 1: Linking Genetic Profiles to Biological Outcome

Linking Genetic Profiles to Biological Outcome

Paul Fogel, Consultant, Paris

S. Stanley Young, National Institute of Statistical Sciences

NISS, NMF Workshop, February 23, '07

Page 2: Linking Genetic Profiles to Biological Outcome

Scotch whiskey database

[Figure: the original matrix (rows 1 to 86, twelve flavors) written as the product of prototypical flavor patterns and mixing levels, plus a residual.]

Original matrix = Prototypical flavor patterns (Comp 1 to Comp 4) × Mixing levels (weights) + Residual

Flavors: Body, Sweetness, Smoky, Medicinal, Tobacco, Honey, Spicy, Winey, Nutty, Malty, Fruity, Floral

Page 3: Linking Genetic Profiles to Biological Outcome

How many flavor patterns?

[Figure: three criteria plotted against Rows (0 to 13).]

Scree plot: eigenvalues (y-axis 0 to 250)

Profile likelihood of the eigenvalues (Zhu and Ghodsi) (y-axis -52 to -47)

Volume filled: scree plot of the determinant (y-axis 0.7 to 1.1)
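As an illustration of the profile-likelihood criterion named on this slide, here is a minimal sketch. It assumes the usual reading of Zhu and Ghodsi's method: sort the eigenvalues, split them into two groups at each candidate position, model both groups as Gaussian with separate means and a common variance, and keep the split with the highest log-likelihood. The function names and the eigenvalues are invented for the example and are not taken from the whiskey data.

```python
import math

def profile_log_likelihood(values, q):
    """Profile log-likelihood for splitting sorted values after position q.

    Both groups are modeled as Gaussian with their own mean and a common
    variance (an assumed reading of the Zhu and Ghodsi criterion).
    """
    p = len(values)
    left, right = values[:q], values[q:]
    mu_left = sum(left) / len(left)
    mu_right = sum(right) / len(right)
    # Pooled variance: within-group sums of squares over p - 2 degrees of freedom.
    ss = sum((v - mu_left) ** 2 for v in left) + sum((v - mu_right) ** 2 for v in right)
    var = ss / (p - 2)
    groups = [(left, mu_left), (right, mu_right)]
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        for group, mu in groups
        for v in group
    )

def choose_rank(eigenvalues):
    """Return the q in 1..p-1 that maximizes the profile log-likelihood."""
    values = sorted(eigenvalues, reverse=True)
    return max(range(1, len(values)), key=lambda q: profile_log_likelihood(values, q))

# Invented eigenvalues with an obvious elbow after the fourth value.
eigenvalues = [100, 95, 90, 85, 10, 9, 8, 7, 6, 5, 4, 3]
rank = choose_rank(eigenvalues)
print(rank)
```

With a common variance the log-likelihood is largest where the pooled within-group variance is smallest, so the criterion behaves like an automated reading of the elbow in a scree plot.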

Page 4: Linking Genetic Profiles to Biological Outcome

AnCnoc

[Figure: the same four-component flavor decomposition with AnCnoc highlighted. Flavors labeled on the slide: Floral, Sweetness, Fruity, Malty, Nutty.]

Page 5: Linking Genetic Profiles to Biological Outcome

Balmenach

[Figure: the same four-component flavor decomposition with Balmenach highlighted. Flavors labeled on the slide: Winey, Body, Honey, Sweetness, Nutty, Malty.]

Page 6: Linking Genetic Profiles to Biological Outcome

GlenGarioch

[Figure: the same four-component flavor decomposition with GlenGarioch highlighted. Flavors labeled on the slide: Spicy, Fruity, Sweetness, Body, Malty.]

Page 7: Linking Genetic Profiles to Biological Outcome

Lagavulin & Laphroaig

[Figure: the same four-component flavor decomposition with Lagavulin and Laphroaig highlighted. Flavors labeled on the slide: Medicinal, Smoky, Body.]

Page 8: Linking Genetic Profiles to Biological Outcome

Statistical Issues

1. Massive testing: Hundreds of “omic” predictors and several questions per sample.

2. Family-wise error rate versus false discovery rate.

3. Missing data, outliers.

Don’t fool yourself.
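To make the contrast in item 2 concrete, the sketch below compares a family-wise correction (Bonferroni) with a false-discovery-rate procedure (Benjamini-Hochberg). The slide does not name specific procedures, so these two are chosen here as standard representatives, and the p-values are invented.

```python
def bonferroni(pvalues, alpha=0.05):
    """Family-wise control: reject only p-values at or below alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def benjamini_hochberg(pvalues, alpha=0.05):
    """FDR control: reject the k smallest p-values, where k is the largest
    rank with p_(k) <= k * alpha / m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected

# Invented p-values for ten hypothetical predictors.
pvalues = [0.001, 0.008, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9, 0.04, 0.6]
n_bonferroni = sum(bonferroni(pvalues))
n_bh = sum(benjamini_hochberg(pvalues))
print(n_bonferroni, n_bh)
```

On these made-up values the false-discovery procedure rejects more hypotheses than the family-wise one, which is the trade-off the slide points to: more discoveries in exchange for tolerating a controlled fraction of false ones.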

Page 9: Linking Genetic Profiles to Biological Outcome

Matrix Factorization Methods

1. Principal component analysis.

2. Singular value decomposition.

3. Non-negative matrix factorization.

4. Independent component analysis.

5. Robust MF.

Area of active research.

Page 10: Linking Genetic Profiles to Biological Outcome

Key Papers

1. Good (1969) Technometrics – SVD.

2. Liu et al. (2003) PNAS – rSVD.

3. Lee and Seung (1999) Nature – NMF.

4. Kim and Tidor (2003) Genome Research.

5. Brunet et al. (2004) PNAS – Microarray.

SVD eigenvectors come from a composite of mechanisms.

NMF commits one vector to each mechanism.

Page 11: Linking Genetic Profiles to Biological Outcome

NMF Algorithm

Green are the “spectra”. Red are the “weights”.

$A = WH + E$

A: the data matrix, indexed by samples and by genes or compounds.

Start with random elements in red and green.

Optimize so that $\sum_{i,j} \left(a_{ij} - (wh)_{ij}\right)^2$ is minimized.
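The slide describes the optimization only in outline (random start, then minimize the squared residual). One common way to carry it out is the multiplicative update rule of Lee and Seung; the sketch below assumes that rule, uses plain Python lists so it has no dependencies, and runs on a small invented matrix. The function names, iteration count, and data are illustrative assumptions, not the authors' implementation.

```python
import random

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def squared_error(a, w, h):
    """Sum over i, j of (a_ij - (wh)_ij)^2."""
    wh = matmul(w, h)
    return sum((a[i][j] - wh[i][j]) ** 2 for i in range(len(a)) for j in range(len(a[0])))

def nmf(a, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix a (n x m) as w (n x k) times h (k x m)."""
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    # Start with random non-negative elements.
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # h <- h * (w^T a) / (w^T w h), element by element
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[r][j] * num[r][j] / (den[r][j] + eps) for j in range(m)] for r in range(k)]
        # w <- w * (a h^T) / (w h h^T), element by element
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][r] * num[i][r] / (den[i][r] + eps) for r in range(k)] for i in range(n)]
    return w, h

# Invented non-negative data built from two patterns (row 5 = row 1 + row 3).
a = [[1, 0, 2], [2, 0, 4], [0, 3, 1], [0, 6, 2], [1, 3, 3]]
w0, h0 = nmf(a, 2, n_iter=0)   # the random starting point
w, h = nmf(a, 2)               # after the updates
print(squared_error(a, w0, h0), squared_error(a, w, h))
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors stay non-negative without any explicit constraint handling.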

Page 12: Linking Genetic Profiles to Biological Outcome

Inference

• Test each variable sequentially within an ordered set. Each set corresponds to a particular eigenvector, which has been ordered by decreasing values.

Increase in statistical power.

Genomic example.

Simulation.
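A minimal sketch of sequential testing within one ordered set follows. The slide does not state the stopping rule, so this assumes the standard fixed-sequence procedure: test the hypotheses in the given order, each at the full level allotted to the set, and stop at the first one that is not rejected. The p-values and the level are invented, and a Bonferroni split is shown only for comparison.

```python
def sequential_rejections(ordered_pvalues, alpha):
    """Count rejections when testing in order at level alpha,
    stopping at the first p-value above alpha (assumed stopping rule)."""
    rejected = 0
    for p in ordered_pvalues:
        if p > alpha:
            break
        rejected += 1
    return rejected

def bonferroni_rejections(pvalues, alpha):
    """Count rejections when alpha is split evenly over all tests."""
    return sum(p <= alpha / len(pvalues) for p in pvalues)

# Invented p-values for one set, already ordered by decreasing eigenvector value.
ordered_pvalues = [0.001, 0.004, 0.02, 0.30, 0.01]
alpha_for_set = 0.025

n_sequential = sequential_rejections(ordered_pvalues, alpha_for_set)
n_bonferroni = bonferroni_rejections(ordered_pvalues, alpha_for_set)
print(n_sequential, n_bonferroni)
```

In this toy case the sequential procedure rejects more hypotheses than the even split because each test uses the full level. The price is that the last variable is never tested once the fourth one fails, so the gain depends on the ordering being informative.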

Page 13: Linking Genetic Profiles to Biological Outcome

Microarray Example

• Group AML: patients with acute myeloid leukemia
• Group ALL: patients with acute lymphoblastic leukemia
  – Subgroup ALL-T: T cell subtypes
  – Subgroup ALL-B: B cell subtypes

Golub, T.R. et al. (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286, 531–537.

Page 14: Linking Genetic Profiles to Biological Outcome

Clustering

NMF clusters samples correctly.

Brunet et al. (2004). PNAS, vol. 101, no. 12, 4164–4169.

Additional subgroup of ALL-B.


Page 17: Linking Genetic Profiles to Biological Outcome

Cluster 3, ALL-B2 (169 genes)

Immune Response: 10 genes (p = 0.00019)

Cell Growth and Proliferation: 61 genes

Cluster 1, ALL-B1 (33 genes)

RNA Processing: 11 genes (P = 0.00260)

Cell Cycle: 12 genes

Transcription: 16 genes

DNA Repair and Replication: 11 genes (P = 0.01519)

MHC class II: 5 genes

MHC class I & II: 6 genes (P = 0.00018)

Proteasome: 7 genes (P = 0.00054)

Immune Response: 28 genes (p = 0.00047)

Sequential testing

Upregulation in ALL-B2 genes: higher rate of transcription and replication processes.

More: proliferative nature compared with ALL-B1, proteasomal activity, energy production.

Page 18: Linking Genetic Profiles to Biological Outcome

Simulation

Page 19: Linking Genetic Profiles to Biological Outcome

Simulation

[Figure: three panels plotting Y (axis range roughly 50 to 250) against Group (N, T1, T2).]

Genes 1-5: up-regulated by T1

Genes 6-10: up-regulated by T2

Genes 11-20: up-regulated by T1 and T2

Intragroup correlation structure
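The design on this slide can be mimicked with a small generator. The sketch below keeps the three gene blocks and the three groups from the slide. Everything else is an assumption made for illustration: the sample size per group, the baseline of 100, the shift of 100 for up-regulated genes, the noise levels, and the use of a per-sample effect shared within each gene block to create the intragroup correlation.

```python
import random

# Gene blocks (1-based, inclusive) and the treatments that up-regulate them.
BLOCKS = {
    (1, 5): {"T1"},
    (6, 10): {"T2"},
    (11, 20): {"T1", "T2"},
}

def simulate(n_per_group=20, baseline=100.0, shift=100.0,
             noise_sd=10.0, shared_sd=10.0, seed=0):
    """Return a list of (group, expression_row) pairs; all settings are assumed."""
    rng = random.Random(seed)
    samples = []
    for group in ("N", "T1", "T2"):
        for _ in range(n_per_group):
            row = []
            for (first, last), raised_by in BLOCKS.items():
                # One effect shared by every gene in the block for this sample:
                # this is what makes genes in a block move together.
                shared = rng.gauss(0.0, shared_sd)
                mean = baseline + (shift if group in raised_by else 0.0)
                row.extend(mean + shared + rng.gauss(0.0, noise_sd)
                           for _ in range(first, last + 1))
            samples.append((group, row))
    return samples

def group_mean(samples, group, gene):
    """Mean of one gene (1-based number) within one group."""
    values = [row[gene - 1] for g, row in samples if g == group]
    return sum(values) / len(values)

data = simulate()
print(group_mean(data, "N", 1), group_mean(data, "T1", 1), group_mean(data, "T2", 1))
```

Gene 1 should come out clearly higher in T1 than in N or T2, and genes 11 to 20 higher in both treatment groups, which matches the pattern listed above.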

Page 20: Linking Genetic Profiles to Biological Outcome

Simulation results

Increased power

Same level of FDR

For more details see paper

Page 21: Linking Genetic Profiles to Biological Outcome

Summary

• The strategy is conceptually simple:
  – Non-negative matrix factorization is used to create groups of genes that are moving together in the dataset.
  – The error rate to be controlled is allocated over these groups.
  – Within each group, genes are tested sequentially.

• The strategy should be effective if there are sets of genes moving together so that group formation reflects biological reality.
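The first two steps of the strategy can be sketched as follows. The slides do not say how genes are assigned to groups or how the error rate is divided, so this example assumes the simplest choices: each gene joins the NMF component on which it loads most heavily, genes are ordered within a group by decreasing loading, and the overall level is split evenly over the non-empty groups. The loading matrix is invented.

```python
def group_genes(loadings):
    """loadings[r][j] is the loading of gene j on component r (assumed layout).

    Returns {component: [gene indices ordered by decreasing loading]}.
    """
    n_components, n_genes = len(loadings), len(loadings[0])
    groups = {r: [] for r in range(n_components)}
    for j in range(n_genes):
        # Assign the gene to its dominant component (assumed rule).
        best = max(range(n_components), key=lambda r: loadings[r][j])
        groups[best].append(j)
    for r, genes in groups.items():
        genes.sort(key=lambda j: loadings[r][j], reverse=True)
    return groups

def allocate_alpha(groups, alpha=0.05):
    """Split the overall level evenly over non-empty groups (assumed rule)."""
    nonempty = [r for r, genes in groups.items() if genes]
    return {r: alpha / len(nonempty) for r in nonempty}

# Invented loadings: two components, five genes.
loadings = [
    [0.9, 0.1, 0.5, 0.0, 0.7],
    [0.2, 0.8, 0.4, 0.6, 0.1],
]
groups = group_genes(loadings)
levels = allocate_alpha(groups)
print(groups, levels)
```

Each group then carries its own ordered gene list and its own share of the error rate, which is the input the within-group sequential tests need.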

Areas of research:

• Robust algorithms
• Multiblock NMF (e.g. relate active motifs with differentially expressed genes)
• Speed

Page 22: Linking Genetic Profiles to Biological Outcome

Contact Information

Independent consultant

Paul Fogel

[email protected]

+33 1 43 26 16 86

Stan Young

National Institute of Statistical Sciences

[email protected]

919 685 9328

www.niss.org/irMF

Literature

Software