Gene-Disease Associations Based on Networks
Presenter: 李金金
Contents
1. Background
2. Heterogeneous network
3. Methods
Background
Correctly identifying the association of genes with diseases has long been a goal in biology.
Identifying gene-disease associations helps improve medical care and deepens our understanding of gene functions and interactions.
Clinical diseases are characterized by distinct phenotypes. To identify disease genes, we therefore need the relationship between genes and phenotypes.
Background
[Figure: diagram relating genes, phenotypes, and diseases; labels: Association, Problems]
Constructing the heterogeneous network
Gene network based on HPRD
[Figure: gene network with nodes g1 to g7; its adjacency matrix is denoted $A_G$]
Constructing the heterogeneous network
Phenotype network using MimMiner
[Figure: phenotype network with nodes p1, p2, p4, p5; its adjacency matrix is denoted $A_P$]
Constructing the heterogeneous network
Gene-phenotype network based on OMIM
[Figure: bipartite network linking phenotypes p1, p2, p4, p5 to genes g1 to g7; its association matrix is denoted $B$]
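Constructing the heterogeneous network
The gene network $A_G$ ($n \times n$), the phenotype network $A_P$ ($m \times m$), and the gene-phenotype associations $B$ ($n \times m$) combine into one adjacency matrix:
$$A = \begin{bmatrix} A_G & B \\ B^T & A_P \end{bmatrix}$$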
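As an illustration of how the three pieces fit together, here is a minimal NumPy sketch that assembles the block adjacency matrix from a gene network, a phenotype network, and a gene-phenotype association matrix. The function name `build_heterogeneous_adjacency` and the toy matrices are assumptions made for this example; they are not taken from HPRD, MimMiner, or OMIM.

```python
import numpy as np


def build_heterogeneous_adjacency(A_G, A_P, B):
    """Stack A_G (n x n), A_P (m x m) and B (n x m) into [[A_G, B], [B^T, A_P]]."""
    n, m = B.shape
    if A_G.shape != (n, n) or A_P.shape != (m, m):
        raise ValueError("shapes must be A_G: n x n, A_P: m x m, B: n x m")
    return np.block([[A_G, B], [B.T, A_P]])


# Toy data (hypothetical): 3 genes, 2 phenotypes.
A_G = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
A_P = np.array([[0, 1],
                [1, 0]], dtype=float)
B = np.array([[1, 0],
              [0, 0],
              [0, 1]], dtype=float)

A = build_heterogeneous_adjacency(A_G, A_P, B)
print(A.shape)
```

Because $A_G$ and $A_P$ are symmetric here and the off-diagonal blocks are $B$ and $B^T$, the combined matrix is symmetric as well, so it can be treated as the adjacency matrix of one undirected network.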
Methods
Katz
RWRH
Prince
GeneWalker
CIPHER
CATAPULT
Methods
Katz has been successfully applied to link prediction in social networks.
Methods
CATAPULT is a supervised learning method. Its features are derived from hybrid walks through the heterogeneous network.
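Katz
[Figure: example network with six nodes g1 to g6]
Adjacency matrix of the example network:
$$A = \begin{bmatrix}
0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0
\end{bmatrix}$$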
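Katz
[Figure: paths of length 2 between nodes of the example network]
$$A^2 = \begin{bmatrix}
3 & 1 & 2 & 2 & 2 & 1 \\
1 & 2 & 1 & 1 & 1 & 1 \\
2 & 1 & 3 & 2 & 2 & 1 \\
2 & 1 & 2 & 5 & 2 & 1 \\
2 & 1 & 2 & 2 & 3 & 1 \\
1 & 1 & 1 & 1 & 1 & 2
\end{bmatrix}$$
$A^3$, $A^4$, $A^5$, ...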
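Entry $(i, j)$ of $A^l$ counts the walks of length $l$ between nodes $i$ and $j$, which is why the powers of $A$ appear here. The short NumPy sketch below recomputes the powers for the six-node example; the rows and columns follow the order of the matrix on the slide, and the variable names are chosen for this example only.

```python
import numpy as np

# Adjacency matrix of the six-node example network from the slide.
A = np.array([
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
])

# (A^2)[i, j] = number of walks of length 2 from node i to node j.
A2 = A @ A

# Higher powers count longer walks.
A3 = np.linalg.matrix_power(A, 3)

print(A2)
```

The diagonal of `A2` equals the node degrees, since every walk of length 2 from a node back to itself goes out along one edge and returns along the same edge.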
Katz
How to get the similarity matrix?
Katz measure:
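$$S_{ij} = \sum_{l=1}^{k} \beta_l \, (A^l)_{ij}, \qquad \beta_l \ge 0$$
With $\beta_l = \beta^l$:
$$S^{\mathrm{katz}} = \sum_{l \ge 1} \beta^l A^l = (I - \beta A)^{-1} - I, \qquad \beta < \frac{1}{\|A\|_2}$$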
Small values of k (k=3 or k=4) are known to yield competitive performance in the task of recommending similar nodes.
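The Katz measure can be sketched in a few lines of NumPy, both as the truncated sum over walks of length 1 to $k$ and as the closed form $(I - \beta A)^{-1} - I$, which holds when $\beta < 1 / \|A\|_2$. The function names, the toy path graph, and the value $\beta = 0.1$ are assumptions chosen for this example.

```python
import numpy as np


def katz_truncated(A, beta, k):
    """Truncated Katz similarity: sum over l = 1..k of beta^l * A^l."""
    S = np.zeros(A.shape, dtype=float)
    power = np.eye(A.shape[0])
    for l in range(1, k + 1):
        power = power @ A          # power now holds A^l
        S += beta ** l * power
    return S


def katz_closed_form(A, beta):
    """Infinite-sum Katz similarity (I - beta*A)^(-1) - I, valid for beta < 1/||A||_2."""
    spectral_norm = np.linalg.norm(A, 2)
    if beta >= 1.0 / spectral_norm:
        raise ValueError("beta must be smaller than 1 / ||A||_2 for the series to converge")
    identity = np.eye(A.shape[0])
    return np.linalg.inv(identity - beta * A) - identity


# Toy network (hypothetical): a path graph on three nodes.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
beta = 0.1

S_k3 = katz_truncated(A, beta, k=3)
S_inf = katz_closed_form(A, beta)
print(np.abs(S_k3 - S_inf).max())
```

The printed value is the largest entry-wise gap between the $k = 3$ truncation and the infinite sum on this toy graph.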
Katz on the heterogeneous network
Adjacency matrix of heterogeneous network:
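$$A = \begin{bmatrix} A_G & B \\ B^T & A_P \end{bmatrix}$$
$A_G$: the gene-gene network
$B$: the bipartite network between genes and phenotypes
$A_{PHs}$: the similarity matrix of human diseases
$A_{PS}$: the similarity matrix of phenotypes of other species
$$B = \begin{bmatrix} B_{Hs} & B_S \end{bmatrix}, \qquad A_P = \begin{bmatrix} A_{PHs} & 0 \\ 0 & A_{PS} \end{bmatrix}$$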
Katz on the heterogeneous network
Katz similarity measure specialized to A:
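$$\left(S^{\mathrm{Katz}}(A)\right)_{ij} = \sum_{l=1}^{k} \beta^l \, (A^l)_{ij}$$
For $k = 3$, the similarities between gene nodes and human disease nodes are given by the block $S^{\mathrm{Katz}}_{Hs}(A)$:
$$S^{\mathrm{Katz}}_{Hs}(A) = \beta B_{Hs} + \beta^2 \left( A_G B_{Hs} + B_{Hs} A_{PHs} \right) + \beta^3 \left( B B^T B_{Hs} + A_G^2 B_{Hs} + A_G B_{Hs} A_{PHs} + B_{Hs} A_{PHs}^2 \right)$$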
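The $k = 3$ block expression can be checked numerically: build the heterogeneous matrix from its blocks, compute $\beta A + \beta^2 A^2 + \beta^3 A^3$, and compare the gene-by-human-disease block with the expanded formula. The sketch below does this on random toy matrices; the sizes, the random seed, and $\beta = 0.05$ are assumptions for this example, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m_hs, m_s = 4, 3, 2   # genes, human diseases, other-species phenotypes (toy sizes)


def random_symmetric(size):
    """Random symmetric similarity matrix with a zero diagonal."""
    M = rng.random((size, size))
    M = (M + M.T) / 2
    np.fill_diagonal(M, 0.0)
    return M


A_G = random_symmetric(n)
A_PHs = random_symmetric(m_hs)
A_PS = random_symmetric(m_s)
B_Hs = rng.integers(0, 2, size=(n, m_hs)).astype(float)
B_S = rng.integers(0, 2, size=(n, m_s)).astype(float)

# Assemble B = [B_Hs B_S], A_P = diag(A_PHs, A_PS), A = [[A_G, B], [B^T, A_P]].
B = np.hstack([B_Hs, B_S])
A_P = np.block([[A_PHs, np.zeros((m_hs, m_s))],
                [np.zeros((m_s, m_hs)), A_PS]])
A = np.block([[A_G, B], [B.T, A_P]])

beta = 0.05

# Truncated Katz on the full matrix, then pick the gene x human-disease block.
S_full = sum(beta ** l * np.linalg.matrix_power(A, l) for l in range(1, 4))
S_block = S_full[:n, n:n + m_hs]

# Expanded block formula for k = 3.
S_formula = (
    beta * B_Hs
    + beta ** 2 * (A_G @ B_Hs + B_Hs @ A_PHs)
    + beta ** 3 * (B @ B.T @ B_Hs + A_G @ A_G @ B_Hs
                   + A_G @ B_Hs @ A_PHs + B_Hs @ A_PHs @ A_PHs)
)

print(np.allclose(S_block, S_formula))
```

Working with the block formula avoids forming powers of the full matrix, which matters when the gene and phenotype networks are large.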
CATAPULT
How to train a biased SVM?
$T$: the number of bootstraps
$A$: the set of positive examples
$U$: the set of unlabeled gene-phenotype pairs
$n_+$: the number of examples in $A$
Step 1: Draw a bootstrap sample $U_t \subseteq U$ of size $n_+$.
Step 2: Train a linear classifier $\theta$ using the examples in $A$ as positives and those in $U_t$ as negatives.
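Steps 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the slides do not give the features, the loss, or the penalty weights, so the sketch uses a hinge-loss linear classifier without an intercept, trained by plain subgradient descent, with hypothetical per-class costs `c_pos` and `c_neg` standing in for the "biased" penalties. The function names and the toy Gaussian features are also made up for this example, and the sketch stops after step 2.

```python
import numpy as np


def train_biased_linear_svm(X_pos, X_neg, c_pos=1.0, c_neg=0.1,
                            lam=0.01, epochs=200, lr=0.1):
    """Hinge-loss linear classifier with different costs for the two classes."""
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_neg))])
    cost = np.concatenate([np.full(len(X_pos), c_pos),
                           np.full(len(X_neg), c_neg)])
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        violated = y * (X @ theta) < 1          # examples inside the margin
        grad = lam * theta - (cost[violated] * y[violated]) @ X[violated] / len(X)
        theta -= lr * grad
    return theta


def bootstrap_classifiers(X_pos, X_unlabeled, T, rng):
    """Steps 1-2, repeated T times: sample U_t from U, then train on A vs. U_t."""
    n_plus = len(X_pos)
    models = []
    for _ in range(T):
        # Step 1: bootstrap sample (with replacement) of size n_+ from the unlabeled set.
        idx = rng.choice(len(X_unlabeled), size=n_plus, replace=True)
        # Step 2: positives A against the sampled unlabeled pairs as negatives.
        theta = train_biased_linear_svm(X_pos, X_unlabeled[idx])
        models.append((idx, theta))
    return models


# Toy features (hypothetical): positives near (2, 2), unlabeled pairs near (-2, -2).
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.3, size=(20, 2))
X_unl = rng.normal(loc=-2.0, scale=0.3, size=(200, 2))

models = bootstrap_classifiers(X_pos, X_unl, T=5, rng=rng)
print(len(models), models[0][1])
```

Each entry of `models` keeps the sampled indices alongside the trained weights, so a later step can tell which unlabeled pairs were left out of a given bootstrap round.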
CATAPULT
How to train a biased SVM?
Step 2: Training classifier
CATAPULT
How to train a biased SVM?
Step 3: For any $x \in U \setminus U_t$, update: