Limsoon Wong, Laboratories for Information Technology, Singapore: From Informatics to Bioinformatics
Limsoon Wong
Laboratories for Information Technology, Singapore
From Informatics to Bioinformatics
What is Bioinformatics?
Themes of Bioinformatics
Bioinformatics = Data Mgmt + Knowledge Discovery
Data Mgmt = Integration + Transformation + Cleansing
Knowledge Discovery = Statistics + Algorithms + Databases
Benefits of Bioinformatics
To the patient: Better drug, better treatment
To the pharma: Save time, save cost, make more $
To the scientist: Better science
From Informatics to Bioinformatics
Integration Technology (Kleisli)
Cleansing & Warehousing (FIMM)
MHC-Peptide Binding (PREDICT)
Protein Interactions Extraction (PIES)
Gene Expression & Medical Record Datamining (PCL)
Gene Feature Recognition (Dragon)
Venom Informatics
Timeline: 1994, 1996, 1998, 2000, 2002
8 years of bioinformatics R&D in Singapore
ISS KRDL LIT
Data Integration
A DOE “impossible query”:
For each gene on a given cytogenetic band, find its non-human homologs.
| source | type | location | remarks |
| --- | --- | --- | --- |
| GDB | Sybase | Baltimore | Flat tables; SQL joins; Location info |
| Entrez | ASN.1 | Bethesda | Nested tables; Keywords; Homolog info |
Data Integration Results
sybase-add (#name: "GDB", ...);
create view L from locus_cyto_location using GDB;
create view E from object_genbank_eref using GDB;
select
  #accn: g.#genbank_ref, #nonhuman-homologs: H
from
  L as c, E as g,
  {select u
   from g.#genbank_ref.na-get-homolog-summary as u
   where not(u.#title string-islike "%Human%") andalso
         not(u.#title string-islike "%H.sapien%")} as H
where
  c.#chrom_num = "22" andalso
  g.#object_id = c.#locus_id andalso
  not (H = { });
• Using Kleisli:
  • Clear
  • Succinct
  • Efficient
  • Handles
    • heterogeneity
    • complexity
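The Kleisli query above joins two flat GDB tables and then filters a nested set of homolog summaries fetched per accession. A minimal Python sketch of the same shape follows. All rows, accession strings, and the `get_homolog_summary` helper are hypothetical stand-ins invented for illustration; they are not real GDB or Entrez data, and this is not how Kleisli evaluates the query.

```python
# Hypothetical stand-ins for the two GDB tables (flat rows; values are made up).
locus_cyto_location = [
    {"locus_id": 1, "chrom_num": "22"},
    {"locus_id": 2, "chrom_num": "7"},
    {"locus_id": 3, "chrom_num": "22"},
]
object_genbank_eref = [
    {"object_id": 1, "genbank_ref": "ACC0001"},
    {"object_id": 2, "genbank_ref": "ACC0002"},
    {"object_id": 3, "genbank_ref": "ACC0003"},
]

# Hypothetical stand-in for Entrez: nested homolog summaries keyed by accession.
HOMOLOG_SUMMARIES = {
    "ACC0001": [{"title": "Human example gene"}, {"title": "Mouse example gene"}],
    "ACC0002": [{"title": "Rat example gene"}],
    "ACC0003": [{"title": "H.sapiens example gene"}],
}


def get_homolog_summary(accn):
    """Assumed lookup playing the role of na-get-homolog-summary."""
    return HOMOLOG_SUMMARIES.get(accn, [])


def nonhuman_homologs(chrom_num):
    """For each gene on the given chromosome, collect its non-human homologs."""
    results = []
    for c in locus_cyto_location:
        if c["chrom_num"] != chrom_num:
            continue
        for g in object_genbank_eref:
            if g["object_id"] != c["locus_id"]:
                continue
            # Nested subquery: drop summaries whose title mentions human.
            hits = [
                u for u in get_homolog_summary(g["genbank_ref"])
                if "Human" not in u["title"] and "H.sapien" not in u["title"]
            ]
            if hits:  # mirrors `not (H = { })`
                results.append({"accn": g["genbank_ref"], "nonhuman_homologs": hits})
    return results


print(nonhuman_homologs("22"))
```

Each result keeps the homolog set nested inside its gene record rather than flattening it into one row per homolog, which is the part a plain SQL join over flat tables cannot express directly.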
Data Warehousing
Motivation
• efficiency
• availability
• “denial of service”
• data cleansing
Requirements
• efficient to query
• easy to update
• model data naturally
{(#uid: 6138971,
  #title: "Homo sapiens adrenergic ...",
  #accession: "NM_001619",
  #organism: "Homo sapiens",
  #taxon: 9606,
  #lineage: ["Eukaryota", "Metazoa", …],
  #seq: "CTCGGCCTCGGGCGCGGC...",
  #feature: {
    (#name: "source",
     #continuous: true,
     #position: [
       (#accn: "NM_001619",
        #start: 0, #end: 3602,
        #negative: false)],
     #anno: [
       (#anno_name: "organism",
        #descr: "Homo sapiens"), …] ), …})}
Data Warehousing Results
Relational DBMS is insufficient because it forces us to fragment data into 3NF.
Kleisli turns a flat relational DBMS into a nested relational DBMS. It can use a flat relational DBMS such as Sybase, Oracle, or MySQL as its updatable complex object store.
! Log in
oracle-cplobj-add (#name: "db", ...);
! Define table
create table GP (#uid: "NUMBER", #detail: "LONG") using db;
! Populate table with GenPept reports
select #uid: x.#uid, #detail: x into GP from aa-get-seqfeat-general "PTP" as x using db;
! Map GP to that table
create view GP from GP using db;
! Run a query to get title of 131470
select x.#detail.#title from GP as x where x.#uid = 131470;
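The idea of keeping a whole nested object in one column of a flat table can be sketched in Python. This is only an illustration under stated assumptions: SQLite stands in for Oracle, JSON serialization is an assumed encoding (the slides do not say how Kleisli encodes objects), the record is an abbreviated copy of the GenBank example shown earlier, and `make_store`, `put`, and `get_title` are hypothetical helper names.

```python
import json
import sqlite3

# Abbreviated copy of the nested record from the warehousing slide.
record = {
    "uid": 6138971,
    "title": "Homo sapiens adrenergic ...",
    "accession": "NM_001619",
    "organism": "Homo sapiens",
    "taxon": 9606,
    "feature": [
        {
            "name": "source",
            "continuous": True,
            "position": [
                {"accn": "NM_001619", "start": 0, "end": 3602, "negative": False}
            ],
            "anno": [{"anno_name": "organism", "descr": "Homo sapiens"}],
        }
    ],
}


def make_store():
    """Create a flat two-column table, analogous to GP (#uid, #detail)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE GP (uid INTEGER PRIMARY KEY, detail TEXT)")
    return conn


def put(conn, rec):
    """Store the whole nested object in one column instead of fragmenting it."""
    conn.execute(
        "INSERT OR REPLACE INTO GP (uid, detail) VALUES (?, ?)",
        (rec["uid"], json.dumps(rec)),
    )


def get_title(conn, uid):
    """Fetch one object by key and project a field out of the nested value."""
    row = conn.execute("SELECT detail FROM GP WHERE uid = ?", (uid,)).fetchone()
    return None if row is None else json.loads(row[0])["title"]


conn = make_store()
put(conn, record)
print(get_title(conn, 6138971))
```

The flat table only needs the key to be queryable; everything else travels as one value, so the features, positions, and annotations never have to be split into separate 3NF tables.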
Epitope Prediction
TRAP-559AA
MNHLGNVKYLVIVFLIFFDLFLVNGRDVQNNIVDEIKYSEEVCNDQVDLYLLMDCSGSIRRHNWVNHAVPLAMKLIQQLNLNDNAIHLYVNVFSNNAKEIIRLHSDASKNKEKALIIIRSLLSTNLPYGRTNLTDALLQVRKHLNDRINRENANQLVVILTDGIPDSIQDSLKESRKLSDRGVKIAVFGIGQGINVAFNRFLVGCHPSDGKCNLYADSAWENVKNVIGPFMKAVCVEVEKTASCGVWDEWSPCSVTCGKGTRSRKREILHEGCTSEIQEQCEEERCPPKWEPLDVPDEPEDDQPRPRGDNSSVQKPEENIIDNNPQEPSPNPEEGKDENPNGFDLDENPENPPNPDIPEQKPNIPEDSEKEVPSDVPKNPEDDREENFDIPKKPENKHDNQNNLPNDKSDRNIPYSPLPPKVLDNERKQSDPQSQDNNGNRHVPNSEDRETRPHGRNNENRSYNRKYNDTPKHPEREEHEKPDNNKKKGESDNKYKIAGGIAGGLALLACAGLAYKFVVPGAATPYAGEPAPFDETLGEEDKDLDEPEQFRLPEENEWN
Epitope Prediction Results
Prediction by our ANN model for HLA-A11:
29 predictions, 22 epitopes, 76% specificity
Prediction by BIMAS matrix for HLA-A*1101:
Rank by BIMAS (axis marks): 1, 66, 100
Number of experimental binders: 19 (52.8%), 5 (13.9%), 12 (33.3%)
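The percentages on this slide can be reproduced from the counts. The short Python check below assumes that the 76% figure is 22 epitopes out of 29 predictions and that the three binder percentages are shares of the 36 binders in total; the helper name `pct` is introduced here for illustration only.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


ann_predictions = 29
ann_epitopes = 22
bimas_binders = [19, 5, 12]  # counts per BIMAS rank bin, as listed on the slide
total_binders = sum(bimas_binders)

print(pct(ann_epitopes, ann_predictions))              # 75.9, shown as 76% on the slide
print([pct(b, total_binders) for b in bimas_binders])  # [52.8, 13.9, 33.3]
```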
Transcription Start Prediction
Transcription Start Prediction Results
Medical Record Analysis
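Looking for patterns that are:
• valid
• novel
• useful
• understandable
| age | sex | chol | ecg | heart | sick |
| --- | --- | --- | --- | --- | --- |
| 49 | M | 266 | Hyp | 171 | N |
| 64 | M | 211 | Norm | 144 | N |
| 58 | F | 283 | Hyp | 162 | N |
| 58 | M | 284 | Hyp | 160 | Y |
| 58 | M | 224 | Abn | 173 | Y |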
Gene Expression Analysis
• Classifying gene expression profiles
• find stable differentially expressed genes
• find significant gene groups
• derive coordinated gene expression
Medical Record & Gene Expression Analysis Results
PCL, a novel “emerging pattern” method
Beats C4.5, CBA, LB, NB, TAN in 21 out of 32 UCI benchmarks
Works well for gene expressions
Cancer Cell, March 2002, 1(2)
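An emerging pattern is, roughly, a combination of attribute values whose support rises sharply from one class to another. The Python sketch below illustrates only that underlying notion, using the five example patient rows from the Medical Record Analysis slide. It is not PCL itself: the slides do not describe PCL's scoring, and the helper names `support` and `growth_rate` are introduced here for illustration. Five rows are far too few to draw any conclusion from.

```python
import math

# The five example rows from the Medical Record Analysis slide.
RECORDS = [
    {"age": 49, "sex": "M", "chol": 266, "ecg": "Hyp", "heart": 171, "sick": "N"},
    {"age": 64, "sex": "M", "chol": 211, "ecg": "Norm", "heart": 144, "sick": "N"},
    {"age": 58, "sex": "F", "chol": 283, "ecg": "Hyp", "heart": 162, "sick": "N"},
    {"age": 58, "sex": "M", "chol": 284, "ecg": "Hyp", "heart": 160, "sick": "Y"},
    {"age": 58, "sex": "M", "chol": 224, "ecg": "Abn", "heart": 173, "sick": "Y"},
]


def support(pattern, rows):
    """Fraction of rows that match every attribute=value pair in the pattern."""
    if not rows:
        return 0.0
    hits = sum(all(r[k] == v for k, v in pattern.items()) for r in rows)
    return hits / len(rows)


def growth_rate(pattern, background, target):
    """Ratio of support in the target class to support in the background class."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        # Present in the target class but absent from the background class.
        return math.inf if s_tg > 0 else 0.0
    return s_tg / s_bg


sick = [r for r in RECORDS if r["sick"] == "Y"]
healthy = [r for r in RECORDS if r["sick"] == "N"]

print(growth_rate({"sex": "M", "age": 58}, healthy, sick))  # inf
print(growth_rate({"ecg": "Hyp"}, healthy, sick))           # 0.75
```

A pattern with a large growth rate is a candidate discriminator for the target class, while a rate near 1 (such as `ecg = Hyp` here) carries little class information.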
Protein Interaction Extraction
“What are the protein-protein interaction pathways from the latest reported discoveries?”
Protein Interaction Extraction Results
Rule-based system for processing free texts in scientific abstracts
Specialized in:
• extracting protein names
• extracting protein-protein interactions
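To give a feel for what a rule-based extractor does, here is a deliberately tiny Python sketch built on a single regular-expression rule. It is not the PIES system: the protein names (`ProtA`, `ProtB`, `ProtC`), the verb list, and the one "name verb name" rule are all assumptions made up for illustration, and a real system would need far richer name recognition and many more rules.

```python
import re

# Hypothetical dictionary of protein names and interaction verbs (illustration only).
PROTEIN_NAMES = ["ProtA", "ProtB", "ProtC"]
INTERACTION_VERBS = ["binds", "interacts with", "phosphorylates", "activates", "inhibits"]

name_alt = "|".join(re.escape(n) for n in PROTEIN_NAMES)
verb_alt = "|".join(re.escape(v) for v in INTERACTION_VERBS)

# One rule: <protein> <interaction verb> <protein>, with nothing in between.
PATTERN = re.compile(rf"\b({name_alt})\b\s+({verb_alt})\s+\b({name_alt})\b")


def extract_interactions(text):
    """Return (protein, verb, protein) triples matched by the single rule."""
    return [(m.group(1), m.group(2), m.group(3)) for m in PATTERN.finditer(text)]


abstract = (
    "We show that ProtA binds ProtB. "
    "In addition, ProtC interacts with ProtA in vitro."
)
print(extract_interactions(abstract))
```

Chaining the extracted pairs (ProtC to ProtA, ProtA to ProtB) is what would turn individual interactions into candidate pathways.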
Behind the Scene
Vladimir Bajic, Vladimir Brusic, Jinyan Li, See-Kiong Ng, Limsoon Wong, Louxin Zhang
Allen Chong, Judice Koh, SPT Krishnan, Huiqing Liu, Seng Hong Seah, Soon Heng Tan, Guanglan Zhang, Zhuo Zhang
and many more: students, folks from geneticXchange, Molecular Connections, and other collaborators...