UM/UT Microarray Short Course
May 4, 2006
Functional Gene Clustering by Latent Semantic Indexing of MEDLINE Abstracts
Ramin Homayouni, Ph.D.
Department of Neurology
University of Tennessee Health Science Center
Center for Neurobiology of Brain Diseases

Gene Expression Profiling
Alizadeh et al. (2000) Nature 403:503.
Now What?

Some Web Resources
NCBI Sites
OMIM http://www.ncbi.nlm.nih.gov/Literature/index.html
LocusLink http://www.ncbi.nlm.nih.gov/LocusLink/
PubMed http://www.ncbi.nlm.nih.gov/entrez/
Others
HAPI http://array.ucsd.edu/hapi/
GenMAPP http://www.genmapp.org/
GO Tree Machine http://genereg.ornl.gov/gotm/
PubGene http://www.pubgene.org
Arrowsmith http://arrowsmith.psych.uic.edu/
Chilibot http://www.chilibot.net/
iHOP http://www.ihop-net.org/

Defining Functional Relationships between Genes
Direct Relationship
Gene relationships already known (e.g., A-B or B-C)
• Term co-occurrence
• Gene symbol: PubGene (Jenssen et al., Nature Genetics 2001 28:21)
• Gene names (synonyms and aliases) – biochemical
Indirect Relationship
Gene relationships unknown (e.g., A-C)
[Diagram: genes A, B and C]
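The direct/indirect distinction can be illustrated with a minimal sketch. It assumes each abstract has already been reduced to the set of gene symbols it mentions; the toy abstracts and the helper names `cooccurrence` and `indirect_pairs` are hypothetical and are not how PubGene is implemented. Two genes count as directly related when they co-occur in at least one abstract, and as indirectly related when they never co-occur but share a direct neighbor.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence(abstracts):
    """Count, for each unordered gene pair, the abstracts mentioning both."""
    counts = defaultdict(int)
    for genes in abstracts:
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return dict(counts)

def indirect_pairs(counts):
    """Pairs that never co-occur but share at least one direct neighbor."""
    neighbors = defaultdict(set)
    for a, b in counts:
        neighbors[a].add(b)
        neighbors[b].add(a)
    found = set()
    for shared in neighbors.values():
        for a, b in combinations(sorted(shared), 2):
            if (a, b) not in counts:
                found.add((a, b))
    return sorted(found)

# Hypothetical toy abstracts: A-B and B-C co-occur, A-C never does.
toy_abstracts = [{"A", "B"}, {"B", "C"}, {"A", "B"}]
direct = cooccurrence(toy_abstracts)
indirect = indirect_pairs(direct)
print(direct)
print(indirect)
```

Co-occurrence alone can only report the A-B and B-C links; the A-C link has to be inferred through B, which is the gap the vector space approach later in the talk is meant to address.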

Reelin Signaling Pathway
[Pathway diagram; labeled components: Reelin, ApoE, VLDLR, ApoER2, Dab1, fyn, APP, p35, Cdk5, pTau, amyloid plaques]

Gene Document Test Set
Miscellaneous: Trp53, Fos, Nras, Rasa1, Rab1, Src, Notch1, Dll1, Jag1, Robo1, Ptch, Smo
Reeler: Reln, Dab1, VLDLR, Lrp8
Alzheimer Disease: APP, Aplp2, Aplp1, Psen1, Psen2, Lrp1, Mapt, Apoe, A2m, Apbb1, Apba1, Cdk5, Cdk5r, Cdk5r2

PubGene Query: Dab1 (http://www.pubgene.org/)
Co-occurring genes (number of times):
Mouse: Reln (7), Cdk5r (6), Cdk5 (5), Gli2 (3), Src (3), Dab2 (2), Fyn (2), Sam68 (1), Cdkn1a (1), Tbr1 (1), Gli (1), Scr (1), Shh (1), cdf (1), Ash (1), Dlgh4 (1), p80 (1), Lck (1), Emx1 (1), Pcdh18 (1), Agrn (1), Arg2 (1)
Human: DAB2 (3), GAD1 (3), RELN (3), GSN (2), TNFSF5 (2), HLA-DQA1 (1), BAT2 (1), GAD2 (1)
PubMed Query: Dab1 AND Reln = 10
PubMed Query: Dab1 AND reelin = 57 !

iHOP Query: Dab1 (http://www.ihop-net.org/)

iHOP Query: Dab1; Sentence Structure (http://www.ihop-net.org/)

iHOP Query: Dab1; Network building (http://www.ihop-net.org/)

Vector Space Model: Latent Semantic Indexing
[Figure: a query and gene document G1 drawn as vectors in a space with word axes w1, w2, w3]
[Figure: word-by-gene-document matrix with rows W1, W2, W3, ..., Wx, columns G1, G2, ..., Gx, entries a_ij, alongside a Query vector]
$a_{ij} = l_{ij}\, g_i$
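As a rough illustration of the vector space idea on this slide, the sketch below treats each gene document as a column of weights $a_{ij}$ and ranks the columns by cosine similarity to a query vector. All weights, words and gene labels are made up for the example, and the function names are hypothetical. The sketch deliberately stops at plain cosine ranking: the rank reduction (truncated SVD) that turns the vector space model into latent semantic indexing is not shown.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_gene_documents(matrix, gene_ids, query):
    """Rank gene documents (matrix columns) by cosine similarity to the query."""
    scores = []
    for j, gene in enumerate(gene_ids):
        column = [row[j] for row in matrix]
        scores.append((gene, cosine(column, query)))
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical weights a_ij: rows are words, columns are gene documents G1..G3.
words = ["reelin", "kinase", "amyloid"]
gene_ids = ["G1", "G2", "G3"]
matrix = [
    [2.0, 0.0, 1.0],  # reelin
    [0.0, 3.0, 1.0],  # kinase
    [0.0, 0.0, 2.0],  # amyloid
]
query = [1.0, 0.0, 0.0]  # a one-word query: "reelin"

ranking = rank_gene_documents(matrix, gene_ids, query)
print(ranking)
```

In this toy case G1 ranks first, G3 second, and G2 (which shares no word with the query) last. With rank reduction applied, documents that share no literal term with the query can still receive a nonzero score.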

Semantic Gene Organizer© User Interface

Reelin Accession # Query

Reelin Keyword Query

50-Gene Document Collection
[Venn diagram: Development, Cancer, Alzheimer; region counts 15, 11, 5, 16, 3]

Hierarchical Tree
[Dendrogram; branch labels: Development, Cancer, Alzheimer, Development]
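One way a hierarchical tree can be built from pairwise gene-document scores is agglomerative clustering. The sketch below uses average linkage on a made-up similarity matrix; the linkage rule, the scores and the function name `average_linkage` are assumptions for illustration, since the slides do not say which clustering method produced the tree.

```python
def average_linkage(names, similarity):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    highest mean pairwise similarity. Returns the merges as a nested tuple tree."""
    # Each cluster is (tree, member_indices).
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def mean_similarity(members_a, members_b):
        total = sum(similarity[i][j] for i in members_a for j in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = mean_similarity(clusters[x][1], clusters[y][1])
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        merged = ((clusters[x][0], clusters[y][0]), clusters[x][1] + clusters[y][1])
        # Remove the higher index first so the lower index stays valid.
        del clusters[y]
        del clusters[x]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical pairwise similarity scores between four gene documents.
genes = ["Reln", "Dab1", "Fos", "Myc"]
similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
tree = average_linkage(genes, similarity)
print(tree)
```

With these toy scores the two high-similarity pairs are merged first and then joined at the root.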

Unrooted Tree (Graph)

Variation in Abstract Representation
Reduce Noise

Abstract References in LocusLink

Gene symbols and names that are not used in the literature
Increase Representation

Alternate Names and Aliases

Log-entropy Term Weighting
[Figure: word-by-gene-document matrix with rows W1, W2, W3, ..., Wx, columns G1, G2, ..., Gx, entries a_ij, alongside a Query vector]
$a_{ij} = l_{ij}\, g_i$
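The slide gives only the product form $a_{ij} = l_{ij} g_i$. The sketch below fills in one commonly used log-entropy definition as an assumption: local weight $l_{ij} = \log_2(1 + f_{ij})$ and global weight $g_i = 1 + \sum_j p_{ij} \log_2 p_{ij} / \log_2 n$ with $p_{ij} = f_{ij} / \sum_j f_{ij}$, where $f_{ij}$ is the raw count of word $i$ in gene document $j$ and $n$ is the number of gene documents. The counts and the function name are hypothetical, and the exact variant used in the talk may differ.

```python
import math

def log_entropy_weights(counts):
    """Turn raw term counts f_ij (rows = words, columns = gene documents)
    into weights a_ij = l_ij * g_i. Assumes more than one gene document."""
    n_docs = len(counts[0])
    weighted = []
    for row in counts:
        total = sum(row)
        # Global weight g_i = 1 + sum_j p_ij * log2(p_ij) / log2(n), p_ij = f_ij / total.
        entropy_sum = 0.0
        for f in row:
            if f > 0:
                p = f / total
                entropy_sum += p * math.log2(p)
        g = 1.0 + entropy_sum / math.log2(n_docs)
        # Local weight l_ij = log2(1 + f_ij).
        weighted.append([math.log2(1 + f) * g for f in row])
    return weighted

# Hypothetical counts: "reelin" is concentrated in one gene document,
# "protein" is spread evenly over all four.
counts = [
    [3, 0, 0, 0],  # reelin
    [1, 1, 1, 1],  # protein
]
weights = log_entropy_weights(counts)
print(weights)
```

A word concentrated in a single gene document keeps its full local weight (global weight 1), while a word spread evenly over every document gets global weight 0 and drops out, which is why this scheme favors descriptive terms.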

Top Terms in Gene Document
reelin (4.0323), reeler (3.7762), positioning (1.9135), lissencephaly (1.8491), schizophrenia (1.7113), apoer2 (1.5637), cr (1.5544), esophageal (1.5339), dab1 (1.5118), vldlr (1.4973), carcinoma (1.4881), wild-type (1.4862), cask (1.4288), psychiatric (1.4266), apoe (1.3739), positioned (1.3726)

Abstract retrieval by combining weighted terms in gene name, symbol or aliases

| Query | Description | # abstracts |
|---|---|---|
| symbol | Cdk5r2 | 0 |
| alias | p39 | 70 |
| name | cyclin-dependent kinase 5, regulatory subunit 2 | 0 |
| c1 | p39 AND cdk5 | 18 |
| c2 | p39 AND cyclin-dependent | 17 |
| c3 | p39 AND kinase | 24 |
| c4 | p39 AND cdk5 AND cyclin-dependent | 17 |
| c5 | p39 AND cdk5 AND cyclin-dependent AND kinase | 17 |

[Venn diagram of the alias, c3 and c1 result sets; region counts 53, 17, 1, 7]

Weighted PubMed Queries
Under-represented Genes: Cdk5r2, Lrp8, Atoh1, Cdk5r
Over-represented Genes: kit, egfr, fos, myc

Weighted Query Algorithm
Gene Symbol, Gene Name, Gene Aliases
Combination of highest weighted terms
Extract overlapping abstracts
RESULTS: 2- to 59-fold increase in the number of abstracts associated with genes compared to those referenced in LocusLink (LL)
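The steps on this slide can be sketched roughly as follows. This is not the authors' implementation: the inverted index, the term weights, the PMIDs and the function names are all hypothetical, and the rule of adding one term at a time in descending weight order is an assumption. The sketch builds Boolean AND queries from an anchor term (for example an alias) plus the highest-weighted terms, and keeps the abstracts that every term in the query retrieves.

```python
def abstracts_for_all(terms, index):
    """PMIDs of abstracts containing every term (a Boolean AND query)."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

def weighted_queries(anchor, term_weights, index, max_terms=3):
    """Build AND queries from the anchor term plus the highest-weighted terms,
    adding one term at a time, and report the abstracts each query retrieves."""
    ranked = sorted(term_weights, key=term_weights.get, reverse=True)[:max_terms]
    results = []
    for k in range(1, len(ranked) + 1):
        terms = [anchor] + ranked[:k]
        results.append((" AND ".join(terms), abstracts_for_all(terms, index)))
    return results

# Hypothetical inverted index (term -> PMIDs) and hypothetical term weights.
index = {
    "p39": {1, 2, 3, 4, 5},
    "cdk5": {1, 2, 3, 9},
    "cyclin-dependent": {1, 2, 8},
    "kinase": {1, 2, 3, 4, 7},
}
term_weights = {"cdk5": 2.5, "cyclin-dependent": 1.8, "kinase": 0.6}

queries = weighted_queries("p39", term_weights, index)
for query, pmids in queries:
    print(query, sorted(pmids))
```

Each added term can only shrink the result set, so the trade-off is between specificity (more terms) and coverage (fewer terms); where to stop is a design choice the sketch leaves to the caller through `max_terms`.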

Summary and Conclusions
Log-entropy weighting identifies descriptive or ‘useful’ aliases for genes.
Weighted PubMed Querying increases abstracts for under-represented genes and decreases abstracts for over-represented genes with high specificity.
This automated method improves gene abstract assignment 2 to 59 fold beyond those assigned by LocusLink indexers.

SGO overview
[Flow diagram comparing two branches ("Vs."). Source labels: PMID Citations in LocusLink; PubMed Abstracts with Search Term Refinement. Labels appearing in both branches: gene descriptor, GeneDoc, Word x Gene Doc Matrix, word weights, pairwise Score, clustering]

Acknowledgments
UT Memphis
Neurology
Lijing Xu, M.S.
Lai Wei, M.D.
Molecular Sciences
Yan Cui, Ph.D.
Mi Zhou, M.S.
UT Knoxville
Computer Science
Michael Berry, Ph.D.
Kevin Heinrich
Center for Neurobiology of Brain Diseases