supervised clustering based on dpcluso: prediction of...
Post on 27-Oct-2020
3 Views
Preview:
TRANSCRIPT
Research ArticleSupervised Clustering Based on DPClusO:Prediction of Plant-Disease Relations Using JamuFormulas of KNApSAcK Database
Sony Hartono Wijaya,1,2 Husnawati Husnawati,3 Farit Mochamad Afendi,4
Irmanida Batubara,5 Latifah K. Darusman,5 Md. Altaf-Ul-Amin,1 Tetsuo Sato,1
Naoaki Ono,1 Tadao Sugiura,1 and Shigehiko Kanaya1
1 Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan2Department of Computer Science, Bogor Agricultural University, Kampus IPB Dramaga, Jl. Meranti, Bogor 16680, Indonesia3 Department of Biochemistry, Bogor Agricultural University, Kampus IPB Dramaga, Jl. Meranti, Bogor 16680, Indonesia4Department of Statistics, Bogor Agricultural University, Kampus IPB Dramaga, Jl. Meranti, Bogor 16680, Indonesia5 Biopharmaca Research Center, Bogor Agricultural University, Kampus IPB Taman Kencana, Jl. Taman Kencana No. 3,Bogor 16151, Indonesia
Correspondence should be addressed to Shigehiko Kanaya; skanaya@gtc.naist.jp
Received 30 November 2013; Accepted 18 February 2014; Published 7 April 2014
Academic Editor: Samuel Kuria Kiboi
Copyright © 2014 Sony Hartono Wijaya et al. This is an open access article distributed under the Creative Commons AttributionLicense, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properlycited.
Indonesia has the largest medicinal plant species in the world and these plants are used as Jamu medicines. Jamu medicines arepopular traditional medicines from Indonesia and we need to systemize the formulation of Jamu and develop basic scientificprinciples of Jamu to meet the requirement of Indonesian Healthcare System. We propose a new approach to predict the relationbetween plant and disease using network analysis and supervised clustering. At the preliminary step, we assigned 3138 Jamuformulas to 116 diseases of International Classification of Diseases (ver. 10) which belong to 18 classes of disease from NationalCenter for Biotechnology Information. The correlation measures between Jamu pairs were determined based on their ingredientsimilarity. Networks are constructed and analyzed by selecting highly correlated Jamu pairs. Clusters were then generated by usingthe network clustering algorithm DPClusO. By using matching score of a cluster, the dominant disease and high frequency plantassociated to the cluster are determined. The plant to disease relations predicted by our method were evaluated in the context ofpreviously published results and were found to produce around 90% successful predictions.
1. Introduction
Big data biology, which is a discipline of data-intensivescience, has emerged because of the rapid increasing ofdata in omics fields such as genomics, transcriptomics,proteomics, and metabolomics as well as in several otherfields such as ethnomedicinal survey. The number of medic-inal plants is estimated to be 40,000 to 70,000 around theworld [1] and many countries utilize these plants as blendedherbal medicines, for example, China (traditional Chinesemedicine), Japan (Kampo medicine), India (Ayurveda, Sid-dha, and Unani), and Indonesia (Jamu). Nowadays, the use
of traditional medicines is rapidly increasing [2, 3]. Thesemedicines consist of ingredients made from plants, animals,minerals, or combination of them.The traditional medicineshave been used for generations for treatments of diseasesor maintaining health of people and the most popular formof traditional medicine is herbal medicine. Blended herbalmedicines as well as single herb medicines include a largenumber of constituent substances which exert effects onhuman physiology through a variety of biological pathways.The KNApSAcK Family database systems can be used tocomprehensively understand the medicinal usage of plantsbased upon traditional and modern knowledge [4, 5]. This
Hindawi Publishing CorporationBioMed Research InternationalVolume 2014, Article ID 831751, 15 pageshttp://dx.doi.org/10.1155/2014/831751
2 BioMed Research International
Table 1: List of diseases using International Classification of Dis-eases ver. 10 (class of disease IDs correspond to Table 2).
ID Disease Class ofdisease
1 Abdominal pain 32 Abdominal pain, diarrhea 33 Acne 164 Acne, skin problems (cosmetics) 165 Amenorrhoea, dysmenorrhea 66 Amenorrhoea, irregular menstruation 67 Anaemia 18 Appendicitis, urinary tract infection, tonsillitis 39 Arthralgia 1110 Arthralgia, arthritis 1111 Asthma 1512 Benign prostatic hyperplasia (Bph) 1013 Breast disorder 614 Bromhidrosis 1615 Bronchitis 1516 Cancer 217 Cancer pain 218 Cancer, inflammation 219 Colic abdomen, bloating (in infant) 320 Common cold 1521 Common cold, dyspepsia, insect bites 15, 3, 1622 Common cold, influenza 1523 Cough 1524 Degenerative disease 1425 Dermatitis, urticaria, erythema 1626 Diabetes 1427 Diabetic gangrene 1628 Diarrhea 329 Diarrhea, abdominal pain 330 Diseases of the eye 531 Disorders in pregnancy 632 Dysmenorrhea 633 Dysmenorrhea, irregular menstruation 634 Dysmenorrhea, menstrual syndrome 635 Dyspepsia 336 Dyspnoea 1537 Dyspnoea, cough, orthopnoea 1538 Fatigue 1139 Fatigue, anaemia, loss appetite 140 Fatigue, lack of sexual function 641 Fatigue, low back pain 1142 Fatigue, myalgia, arthralgia 1143 Fatigue, osteoarthritis 1144 Fertility problem 6, 1045 Fever 0
Table 1: Continued.
ID Disease Class ofdisease
46 Gastritis, gastric ulcer 347 Haemorrhoids 148 Headache 1349 Heart diseases 850 Heartburn 3, 851 Hepatitis, other diseases of liver 352 Hypercholesterolaemia 1453 Hypertension 854 Hypertension, diabetes 1455 Hypertension, hypercholesterolaemia 1456 Hyperuricemia 157 Immunodefficiency 958 Indigestion (K.30) 359 Indigestion, lose appetite 360 Infertility 6, 10
61 Irregular menstruation, menstruationsyndrome 6
62 Kidney diseases 1763 Lactation problems 664 Leukorrhoea (Vaginalis) 665 Leukorrhoea (Vaginalis), dysmenorrhoea 666 Lose appetite 367 Lose appetite, underweight 1468 Low back pain, myalgia, arthralgia 1169 Low back pain, myalgia, constipation 1170 Low back pain, urinary tract infection 1771 Lung diseases 1572 Malaise and Fatigue 1173 Malaise and Fatigue, Constipation 1174 Malaise and Fatigue, Fertility Problems 10, 1175 Malaise and Fatigue, Low Back Pain 1176 Malaise and Fatigue, Sexual Dysfunction 11, 6, 10
77 Malaise and Fatigue, Skin Problems(Cosmetics) 16
78 Malaria, anaemia 179 Meno-metrorrhagia 680 Menopausal syndrome 6
81 Menopause/menstrual syndrome, leukorrhoea(vaginalis) 6
82 Menstrual syndrome 683 Menstrual syndrome, fatigue 684 Migraine 1385 Mood disorder 1886 Myalgia, arthralgia 1187 Nausea/vomiting of pregnancy 688 Osteoarthritis 1189 Osteoarthritis, fatigue 11
BioMed Research International 3
Table 1: Continued.
ID Disease Class ofdisease
90 Overweight, obesity 1491 Paralysis 1392 Post partum syndrome 693 Prevent from overweight 1494 Respiratory infection due to smoking 1595 Respiratory tract infection 1596 Rheumatoid arthritis, gout 1197 Secondary amenorrhea 698 Secondary amenorrhea, irregular menstruation 699 Sexual dysfunction, fatigue 6, 10100 Skin diseases 16101 Skin problems (cosmetics) 16102 Sleeping and Mood Disorders 18103 Sleeping disorders 18104 Stomatitis 3105 Stomatitis, gingivitis, tonsilitis 3106 Stone in kidney (N20.0) 17
107 Stone in kidney (N20.0), urinary bladder stone(N21.0) 17
108 Tonsilitis 4109 Tonsilofaringitis 4110 Toothache 13111 Typhoid, dyspepsia 3112 Ulcer of anus and rectum 3113 Underweight, lose appetite 3114 Urinary tract infection (urethritis) 17115 Vaginal discharges 6116 Vaginal diseases 6
database has information about the selected herbal ingre-dients, that is, the formulas of Kampo and Jamu, omicsinformation of plants and humans, and physiological activ-ities in humans. Jamu is generally composed based on theexperience of the users for decades or even hundreds ofyears. However, versatile scientific analyses are needed tosupport their efficacy and their safety. Attaining this objectiveis in accordance with the 2010 policy of the Ministry ofHealth of Indonesian Government about scientification ofJamu. Thus, it is required to systemize the formulationsand develop basic scientific principles of Jamu to meet therequirement of Indonesian Healthcare System. Afendi et al.initiated and conducted scientific analysis of Jamu for findingthe correlation between plants, Jamu, and their efficacy usingstatistical methods [6–8]. They used Biplot, partial leastsquares (PLS), and bootstrapping methods to summarize thedata and also focused on prediction of Jamu formulations.These methods give a good understanding about relationshipbetween plants, Jamu, and their efficacy. Among 465 plantsused in 3138 Jamu, 190 plants were shown to be effectivefor at least one efficacy and these plants were considered
to be the main ingredients of Jamu. The other 275 plantsare considered to be supporting ingredients in Jamu becausetheir efficacy has not been established yet.
Network biology can be defined as the study of thenetwork representations of molecular interactions, both toanalyze such networks and to use them as a tool to makebiological predictions [9]. This study includes modelling,analysis, and visualizations, which holds important task inlife science today [10]. Network analysis has been increasinglyutilized in interpreting high throughput data on omics infor-mation, including transcriptional regulatory networks [11],coexpression networks [12], and protein-protein interactions[13]. We can easily describe relationship between entities inthe network and also concentrate on part of the networkconsisting of important nodes or edges.These advantages canbe adopted for analyzing medicinal usage of plants in Jamuand diseases. Network analysis provides information aboutgroups of Jamu that are closely related to each other in termsof ingredient similarity and thus allows precise investigationto relate plants to diseases. On the other hand, multivariatestatisticalmethods such as PLS can assign plants to efficacy byglobal linear modeling of the Jamu ingredients and efficacy.However, there is still lack of appropriate network basedmethods to learn how and why many plants are grouped incertain Jamu formula and the combination rule embeddingnumerous Jamu formulas.
It is needed to explore the relationship between Indone-sian herbal plants used in Jamu medicines and the diseaseswhich are treated using Jamu medicines. When effectivenessof a plant against a disease is firmly established, then furtheranalysis about that plant can be proceeded to molecular levelto pinpoint the drug targets. The present study developeda network based approach for prediction of plant-diseaserelations. We utilized the Jamu data from the KNApSAcKdatabase. A Jamu network was constructed based on thesimilarity of their ingredients and then Jamu clusters weregenerated using the network clustering algorithm DPClusO[14, 15]. Plant-disease relations were then predicted by deter-mining the dominant diseases and plants associated withselected Jamu clusters.
2. Methods
2.1. Concept of the Methodology. Jamu medicines consistof combination of medicinal plants and are used to treatversatile diseases. In this work we exploit the ingredientsimilarity between Jamu medicines to predict plant-diseaserelations. The concept of the proposed method is depictedin Figure 1. In step 1 a network is constructed where a nodeis a Jamu medicine and an edge represents high ingredientsimilarity between the corresponding Jamu pair. In Figure 1,the nodes of the same color indicate the Jamumedicines usedfor the same disease.The similarity is represented by Pearsoncorrelation coefficient [16, 17]; that is,
corr (𝑋, 𝑌) =∑𝑙
𝑖=1(𝑥𝑖 − 𝑥) (𝑦𝑖 − 𝑦)
√∑𝑙
𝑖=1(𝑥𝑖 − 𝑥)
2
∑𝑙
𝑖=1(𝑦𝑖 − 𝑦)
2, (1)
4 BioMed Research International
Table 2: Distribution of Jamu formulas according to 18 classes of disease (classes of diseases are determined by NCBI in ID1 to ID16 and bythe present study in ID17 and ID18 represented by asterisks in Ref. columns).
ID Class of disease (NCBI) Ref. Number of Jamu Percentage1 Blood and lymph diseases NCBI 201 6.412 Cancers NCBI 32 1.023 The digestive system NCBI 457 14.564 Ear, nose, and throat NCBI 2 0.065 Diseases of the eye NCBI 1 0.036 Female-specific diseases NCBI 382 12.177 Glands and hormones NCBI 0 —8 The heart and blood vessels NCBI 57 1.829 Diseases of the immune system NCBI 22 0.7010 Male-specific diseases NCBI 17 0.5411 Muscle and bone NCBI 649 20.6812 Neonatal diseases NCBI 0 —13 The nervous system NCBI 32 1.0214 Nutritional and metabolic diseases NCBI 576 18.3615 Respiratory diseases NCBI 313 9.9716 Skin and connective tissue NCBI 163 5.1917 The urinary system ∗ 90 2.8718 Mental and behavioral disorders ∗ 21 0.67
The number of Jamu classified into multiple disease classes 119 3.79The number of Jamu unclassified 4 0.13Total Jamu formulas 3138 100.00
where 𝑥𝑖 is the weight of plant-𝑖 in Jamu 𝑋, 𝑦𝑖 is the weightof plant-𝑖 in Jamu 𝑌, 𝑥 is mean of Jamu 𝑋, and 𝑦 is meanof Jamu 𝑌. The higher similarity between Jamu pairs thehigher the correlation value. In the present study, 𝑥𝑖 and𝑦𝑖 are assigned as 1 or 0 in cases the 𝑖th plant is, respec-tively, included or not included in the formula. Under suchcondition, Pearson correlation corresponds to fourfold pointcorrelation coefficient; that is,
corr (𝑋, 𝑌) = 𝑎𝑑 − 𝑏𝑐
√(𝑎 + 𝑏) (𝑎 + 𝑐) (𝑏 + 𝑑) (𝑐 + 𝑑), (2)
where 𝑎, 𝑏, 𝑐, and 𝑑 represent the numbers of plants includedin both 𝑋 and 𝑌, in only 𝑋, in only 𝑌, and in neither 𝑋 nor𝑌, respectively.
In step 2 the Jamu clusters are generated using net-work clustering algorithm DPClusO. DPClusO can generateclusters characterized by high density and identified byperiphery; that is, the Jamu medicines belonging to a clusterare highly cohesive and separated by a natural boundary. Suchclusters contain potential information about plant-diseaserelations.
In step 3 we assess disease-dominant clusters based onmatching score represented by the following equation:
matching score
=number of Jamu belonging to the same disease
total number of Jamu in the cluster.
(3)
Matching score of a cluster is the ratio of the highest numberof Jamu associated with a single disease to the total numberof Jamu in the cluster. We assign a disease to a cluster forwhich the matching score is greater than a threshold value.In step 4, we determine the frequency of plants associatedwith a cluster if and only if a disease is assigned to it in theprevious step. The highest frequency plant associated to acluster is considered to be related to the disease assigned tothat cluster. True positive rates (TPR) or sensitivity was usedto evaluate resulting plants. TPR is the proportion of the truepositive predictions out of all the true predictions, defined bythe following formula [18]:
TPR = TPTP + FN
, (4)
where true positive (TP) is the number of correctly classifiedand false negative (FN) is the number of incorrectly rejectedentities. We refer to the proposed method as supervisedclustering because after generation of the clusters we narrowdown the candidate clusters for further analysis based onsupervised learning and thus improve the accuracy of predic-tion of the proposed method.
3. Result and Discussion
3.1. Construction and Comparison of Jamu and RandomNetworks. We used the same number of Jamu formulas fromprevious research [6], 3138 Jamu formulas, and the set union
BioMed Research International 5
A BC D
DCBA
Step 1
Constructing ingredient correlation network
Step 2
Extracting highly connected Jamu
Step 3
Supervised analysis for voting utilization
Step 4
Listing ingredients
Input: Jamu formulas
Output: plant-disease relations
Figure 1: Concept of the methodology: network construction based on ingredient similarity between individual Jamu medicines, networkclustering, and classification of medicinal plants to dominant disease.
J02538
J01667
J01392
J02594J00533
J00527J00526
J01267
J01407
J02548
J02484
J01406
J02338
J00379
J02195
J01401
J01760
J02570
J00525
J01669
J01666
J01665
J01399
J01659
J01400
J01398
J01875
J02712
J01452 J00942
J02752
J02148
J01448
J02867
J00518
J00374
J02931J01864
J02100
J00168
J01623J02806
J02154
J01205
J01185
J00778J00420
J01898J01899
J01897J01322J00709J01187
J02194J01281
J01241
J00358
J01057
J00365
J03028J02122
J02688
J00842
J00164
J00166J00929
J00876
J01670
J01473
J00107
J00174J02397
J00707J01148
J00430
J02762
J03029
J01209
J01207
J00195
J01463
J01634
J00253
J00605J01269
J00142J00022J03040
J00824J00917
J01709J01352
J01459
J00023
J01321
J01208
J00114
J01310
J01263J01251
J00292
J00199J02329J02328
J02119
J01163
J00686
J00321
J01602J02268J02905
J02071J01430
J02012
J02718J01787
J01800
J02189
J02657
J01725
J00496
J02146J00658
J01833
J02940J01673
J00111
J02359J02522
J01726J02198J01455
J01451
J02723 J00261
J00262
J00960
J01881
J01450
J03030
J02932
J02021
J00201J00412
J00380
J00838
J01841
J02938
J02549
J02537J01171
J02020
J02881
J00318
J01210
J02979J01005
J01303
J00494
J01926J01629
J03125
J00989J02459J00363
J00495
J00241
J03131
J02011
J00766
J01636
J00402
J03089
J03004
J01090
J02103
J02886
J02390
J01359
J02879
J02848J02660
J02855J03097J02789
J02261
J01361
J01053J01501
J00068J02754J02263J02374
J01698
J03088J02969J01863
J02203
J00758
J01528J01498J01565
J01615
J02031
J02354J03095
J01432
J02469 J02039
J02138
J01548
J01434J01576
J03101J02658J01908J02925
J02342J02811J02497J02339
J01837J00157
J02045J02373
J02568
J02663J01614J00227J00102
J02259J02530
J02157
J02788J00067
J01827J00798J02124J02260
J02840
J02793J02792
J02022J01983
J01690J00986
J00821
J01431J02856
J03100J01502J02655
J02036
J02262
J02137
J01551J00332
J00088
J02666J02400J02884
J02831
J01704J01680
J00898
J02037J01855J01368
J02204J03115J02388J03086
J01685
J01270
J01122
J01649J02326J01831J03085
J00054J01700J01064J02317
J00307J00286
J02950J02075
J01306
J01006
J00135
J00417
J00384
J02061
J00271
J00698
J01029
J00419
J00627
J02398
J01964
J02005J02040
J00141
J01045
J00165
J01189J00354
J01013
J00353
J00395J00409
J01166
J00429
J01799J00752
J00747J03090
J00039
J02356
J02010
J01020
J01225
J01044
J02472
J00276
J02536
J02346
J00828
J00759
J02526
J02922
J03081
J00280
J01689
J02126
J02656J02893
J02812J02847
J01832J01909
J02123J02156
J01697J01681
J01861
J02029J02159J02033
J02152J02028J01550
J01353J01609J01377J02830J01862
J00734J00331
J02871J00145
J01393J00233
J01845J00149J01500J02135
J00158
J01569J01617
J02670
J02662J02136
J02032J02751J01648
J02038J02653
J02202J01616
J00985
J00071
J03082
J01579
J01825
J02803J01394J00891
J00160J00147
J00894J00958J02387
J01786
J00957
J00895
J02904
J01887
J01023J02843
J01170
J01695
J00312
J00311J01552
J00554 J01868
J02223J02790J02616
J01879J03114J02732J01679J01316
J00732J00191
J02737J03093
J02027
J02997
J02255J01542J00298
J02486
J01640
J02639
J02640
J01655
J02389
J02818
J02816
J01554
J02633
J01668
J01671
J02303
J00990
J01635
J01637
J01639
J02206
J02967
J01672
J02703
J01497 J01641
J01638
J02554
J02116
J02961
J02702
J02493
J03000
J00362
J02665J02827
J00656
J01068
J00105
J00983
J02406
J01625
J01490
J02528
J01829J02701
J02098
J01227J00978
J00060
J02634
J02934J01584
J00214J02696
J02637J02093
J02635
J01867
J02324
J02807
J02707
J03108
J01656 J02987J01657J02672
J01445
J01596
J02275J01537J02697
J01980
J02286J02768
J01060
J00737J00435
J03010J02242
J02971
J00984
J01418
J01810
J01462
J00398J00757
J00729J02281
J00193
J02295
J00754
J00626J02708
J02251
J03110
J02048
J00753J01707
J01475
J00750
J00515
J01486J01593
J00364 J01582 J02651 J00070 J02619 J01107
J00238 J01581 J02650 J00069 J02618 J02846J01435J02569J00339J00563J02911J01108
J02842J00018J00954J00316J00556J02890
J00809
J02277J01478
J00465
J02288
J02626
J02499
J00760
J02309
J01558
J02291
J02648J00530
J00308
J01423
J02046
J02076
J00139
J01351
J01627
J02243
J01755
J01866
J02998
J01950 J00858J01734J01098
J02439
J01873
J02233
J01338
J01097
J01246
J00098
J00702
J01761J00851
J00113J00679J02217
J00375
J00704
J03003
J01002
J01803
J00901
J02423
J03044
J00436
J02695
J00546J01426 J00080
J00028J00186
J02516
J01139 J02441 J01389
J02442
J02193
J02212J01041
J01290
J01421
J01157
J01840
J00573
J01870
J02711
J02822
J01960
J00315
J00127
J02722
J02111
J00245
J02815J03046
J00117J00291
J01764
J01513
J01938 J00250 J00258
J03017
J02360
J01939J00422
J00904J02933J01645
J02256
J00879J01177J02557
J00540
J02140J02234
J02543J02627
J02130
J01051
J01590J01591
J01519
J01138J03024J00192
J01972J01975 J02623
J01971J01509
J01483
J00473
J01425
J01650J01842
J01922
J00489
J01324J01654
J02058
J01244
J02629
J00322
J00862J01533J01532J00259
J00918
J02721
J00899J00247J00900
J01265
J02283J02290J01653
J02963
J02249
J01601
J02000
J02227J00885
J01701J02572
J01481J01117
J01420
J01485J02049J00021J02825
J00811J03096
J02172
J00666
J03105
J00788
J02118
J00773
J00202
J03063
J00628J00638
J00864
J02369
J00644
J00589J01716
J00782
J00664J00637
J00722
J01017J00643
J00416
J03027J01232J01199
J01224J01201J01075
J00956J03130
J01135
J00034
J00007J00781
J00780
J00793
J02781J00661
J00593
J02427J02425
J00802
J03034
J03020
J00716
J02383
J00951J00787
J01223J01197
J01198
J00008
J01283
J01235
J00641
J00036
J01356
J03055
J01985
J00467
J01412
J01917
J02013
J00440
J02254J01776
J01792 J00739J01771
J01706
J02447J00019
J00619
J00624J00587J02451
J01768
J00376
J01749
J01746
J00834J00503
J02284
J00408
J01894
J01878
J01691
J00172
J00506
J02405
J00076
J01422
J02403
J00320J01708
J00812
J00220J00219J01573
J03136J00046
J00432J00367
J02571J00221J01770
J02253J00283
J00063J00064J00319
J01769J01692J00194
J01774J00222J02617
J00448J02970
J01248
J00703
J01355
J02347
J01229J02580
J02966
J02964
J00138
J00180
J00044J00196
J02835J01632
J02047J00300
J01773
J01882
J01748
J01713
J01443
J01427J00519
J01693
J02404
J01647
J02197
J01772
J00058
J01048
J00539J00908
J00282
J00100
J00323
J02114
J02009
J01388
J00653
J02024
J00225
J00324J02121
J01687
J01184
J02738J02561
J02336
J02725J00471
J01402
J02735J01856
J01247
J02652
J00582
J01775
J01793
J01676
J00594J00065
J01710
J02026
J01610
J02467
J01404
J00014J00013
J02476
J01416J00481
J00392
J00003J00950
J02208
J01410
J00049
J00931J03038
J03026
J03039
J00459
J03051
J02785
J02365
J03052
J02368
J00863J02191
J03043J03037
J03057
J01951J01220
J01203
J00461
J01811 J01192
J01997
J03048
J00969J01642
J00083J00971
J00377J02517
J00352
J01390J00575
J01114J00167
J02192J00004
J00700
J00551J02985J00152
J02099
J01743J00151
J00434
J01240
J00907
J01923
J03116
J01664
J00998
J00236
J03117
J03137
J02756
J00578
J01530
J00335J00096
J00896
J02115
J02141J00097J00699
J02611
J01903J00301
J02107J03103
J02276
J00101
J03133
J00056
J01815J01791J02158
J02943
J00187
J01477J01988J00125
J02714J02942J02959
J02372J01987J00938J00334J00178
J00078
J02272J02632
J02375J02960
J01797J00218
J00189J00094
J01900
J03064
J00770J02509
J01682J00991
J00631
J02409J02228
J02132
J02069
J01955J01936
J00179
J00667
J00617
J00584J00678
J01914J01727
J00622J02542
J02686
J00443
J00579J00414
J01160
J02381
J02462
J00600J01913
J00275
J01262
J01259
J01257
J02413
J02216
J00771
J00630
J02664
J00639
J00621
J00884
J01805
J01928
J00484
J01102J01766
J01896
J00257
J00887
J01893
J01460J02186
J00346J00343 J00349 J00994
J00604J03113
J02797J02169
J00997
J00406 J00687
J00728
J00726
J00410
J00508
J00911
J03041
J00683
J00947J00801J00134
J00910
J00888
J00170
J00705
J02939J02167
J00889
J02539J01054J01164J00350
J00949
J03032
J02171J02431J01115
J00795
J03119J00243
J01403J00502 J00762
J03120
J02019
J00640J02923
J01777J02869J02896
J02872J02841
J00235
J01646
J02892
J02727
J01780
J00614J02885
J00633
J00592
J03087J01279
J00646
J01993
J02929
J01071
J00988
J03134J02996
J01912
J02105J01597 J01531
J00413
J01630
J02724
J00333
J00955
J00675
J00234
J00810
J00684
J01789
J02207 J02742
J03056
J02487
J00571
J01499J02974
J03033
J00351
J01295
J00464
J01696
J00348
J02507
J01883
J02579
J01991J00823
J00162
J02927
J02687
J02743J03075
J01077
J00873J01911
J01439J01101
J02715J01266
J03062
J00531
J03060J02370
J01021
J02558
J02161
J01744
J01300
J01907J00317 J01358
J00255J02502
J01411
J02593
J01515J00209
J01386
J02716
J01367J00815J00085
J02808 J00813
J02845J01824
J02849
J01438 J01611
J02851
J02868J01781
J00337
J00154 J00155J00156J02852
J00542 J02371
J02318
J02880
J01444J03061J01456J00746
J02921
J02587
J01992
J02747
J02844J02839
J00086
J02850J00797
J02915J02097
J02744
J02741J02728
J01074J02733J02740J02524
J01395 J02035J03099
J02833
J01354
J00892
J00713J01835J02801
J01126
J01035
J00120
J02859
J00230
J02802J00733
J01702
J01116
J02758
J00148
J03079
J00865
J02853J02791
J00075J00355
J00818J02854
J00488
J01360J02401J01271J00820J02034J00053
J00161
J00150
J00146
J02990 J02993 J02983J02085 J02991 J01571
J02320
J01952J02305
J02316
J02084
J01268
J02083
J00460
J02838
J02771
J01910
J00123
J01087J00635
J01180
J01234
J02609J00612J02908J02875J00652J02945 J02700 J01508
J02874J02861
J02876
J02899 J01539J01834
J02247J02636
J01865J02610
J00611J00711
J00153
J02125J02810
J02153
J00427
J02995J02399J03098J01703
J00306J00226
J02787J00121J01699J00073J01336
J00087J02786
J01613J01618J02659
J01580J01120J03084
J01564J00735J03094
J01826J02753J02402
J02809J02258J01063J02948
J01188
J01688J01836
J03083
J01583
J01683J01830
J01527
J00897
J00066 J02949J02968J00077J02529J01121
J02302J01330J00916
J02433J02313J02759J01405
J02453
J01877
J01587
J02745J01872J00548
J01082
J01982 J01759 J02719 J02023
J02675J01871J00566J01080 J01981 J01758 J02709 J01996
J02129 J02050 J02763 J02669 J02607 J02319 J01437 J01119
J00959 J01479 J02761 J00232 J01387 J02298 J01436 J01118
J02301 J00490 J01817
J02314 J01998 J01818
J01651
J02515J02315
J00497J03042
J00468
J01315 J00326J02782
J01838
J02466
J00980
J00512J01932
J01643
J01948 J02092J01686
J00415J01504
J00933J01325
J00297
J00769J01660
J01941J00523 J00547
J02350
J00441
J00932J02682
J01030J00450
J00854
J00868
J02310J02351
J02341J00455J02300
J01179
J00844
J00266J00906J00928
J03035
J01001
J00017
J01927
J00583
J00609
J00618
J00690
J00599
J02090J00970
J02428
J02471
J03006
J00452
J01788
J00269
J00251
J01088
J01925
J01055
J02481J00695
J00682J00800J01309J01340
J01934J02465
J02919
J01379J01378 J01212J02661
J01216
J01202
J02645
J02673
J00856
J02799J00791
J00038J02592
J00265
J02559
J01018
J02643
J01396
J02485J02590
J00927J00772
J00136 J02490
J01072
J02488
J00175J01621
J02003
J01278
J00761
J03008
J00670
J02444
J02454
J00926
J02582
J03107
J02498J00740
J02265
J00493J01874
J00855
J01273J01535
J00025J01413
J00616
J00242
J02264J00458
J00861J00011J00224
J00708J01173
J02327
J00860
J00329J00357
J02440
J00454J00693
J01802J00061J01807
J00745
J01547
J00608
J00623
J00659
J01312
J02222
J00591
J00790
J00356J00001J00106
J00665J00002J00723
J01440J00035
J00009
J00601
J02798
J00598
J00944J00585
J03078
J01843J01272
J03018
J01110
J02449J01904
J00342
J01151
J00373
J00383
J00210
J00839
J00768
J02503J00371J03121
J02457J00963J01066
J02184
J02527
J01036
J02936
J01334J01317
J02004
J00654J00143
J00650J01149J00606
J02477J02478
J01409J01280J02501
J01014
J00850J00421
J01277
J00084
J00169
J00655
J00727
J02531J02713
J00590J00647
J00648
J00674J00634
J00137
J00632
J01719
J02832
J00629
J00037
J03092
J01150
J02680J00964
J01732 J01722
J01369
J00249
J02166
J01374
J03122
J01162
J01376
J02333
J01007
J01344
J01136
J00880
J01886
J01415J01137
J02681J01154
J01902
J00764J01065
J02209J00774
J00676J02532
J03019
J03013
J00847
J00203
J00691
J01906
J02510
J01711
J00845
J01226
J02679
J00882
J00890J01471
J00369
J00041
J02508
J01174
J01366
J02180
J00401
J01720
J00680J00672
J01371
J02525
J01747J00347
J01034
J01989
J00212J02937
J00574J02935
J00200
J03022
J00299
J00701
J01085
J02519
J00799
J01715 J01264J00294
J00692J00689
J00290
J00796 J03123
J00783J00996
J01073J00671
J00786
J01343
J00206
J00099
J01249J00915
J02473
J00651J00789J00586J00649
J01929J01895
J00016J01890
J00491
J00010J00724J00725
J00765
J00636
J00875J00405
J00965
J01056J00657J00784
J02420J02448J03021
J02430J02429J01009J03054J03112
J01109
J02363J00968
J03007
J00231
J00721
J00720J00719J00718
J00717
J00715
J01011J02642J02644
J01039J01132
J01285J00794
J00827
J00662
J02581 J01957J00763
J00779J01765 J01282
J00109
J01015
J01319
J00607
J00714
J02182J01380
J01373
J00568
J01417
J02210
J02586
J00183
J00361
J01753
J02311 J02595
J01175
J02813
J00368
J01757
J01652
J01131
J01453
J02081
J00866J00391
J02641
J01954
J00126
J01578J01123
J01752
J00681
J00198
J00748 J00424
J01718
J01275
J00972
J00867J00849
J00433
J01736
J02181
J02426J02424
J02131
J01512J01511
J02982J02065
J01419
J02078
J02331
J02064
J02574J01804
J02824
J01942
J01144J00588
J01176
J00785
J02086J01978
J00572J02685
J00500J01130
J00181
J00310
J00108
J00411J01370
J02553
J01328J00207J01522
J00469J00092J00089
J00837
J00090
J02306J02386J02496J01510
J02693J01049
J00012
J01429
J01712
J02414J01545J00006
J01979
J03025
J01012J00981
J01544
J01038
J02952
J01466
J00026
J00032
J00033
J00537
J01442
J01414
J00020
J02953
J03127
J02357
J00359
J00869
J01345J02573
J02438
J00743
J02241
J01828J00278
J02692J02691
J01678
J02746
J00756
J02267J02236
J00953
J00270
J02975J01974J02235
J01467
J00777 J01031
J01181
J01337
J01323
J00902
J02109
J00112
J03106
J02730
J01383
J01357
J02237
J00260J01294
J02951
J02562J00015
J00330 J01182J02443 J02055 J01534J01487J00288
J02296J02282
J01231J01798
J02008J00223
J01790J00501J00244
J00062
J00045
J01307J01604
J00263
J01756
J00935 J00124
J01287
J02057J00936J02567
J01206
J01024 J02489J02468
J02079
J01857
J01449
J02054
J01222J01196
J03109J00366
J00264
J00470
J00240
J01133
J02755J00281
J02170J01606J01880
J02955J00228
J02492J02269J02147J02041J01605J01169
J01165J02190J01794
J02063J01796
J02252J00304
J02353J01536J02226
J01561J02739J02749
J02541
J00081
J01839
J00482J02345
J01541J00625J00504
J00059
J01984
J03053
J00208
J02584
J03012J00843
J01663
J00595
J01296J02978
J02018J00825
J00248J00660
J01008
J02458J01730
J01724
J01723
J00029J02677
J01731J00688
J02073
J02072
J02491
J01019
J02060
J02556
J01739
J02355
J02511
J00874
J02415
J00976
J00645
J01161 J00505
J02380
J00254
J01572
J01885
J02218J01167J02500
J02188 J00883J02588
J01408
J02144 J01526J02535J02544
J01291
J01129
J01062
J00602
J01042 J00390
J00903
J00767
J02297
J00597
J01010
J01318
J01717
J00673
J03009
J01995
J01363
J02671
J03091
J01125
J02514
J00372
J00313
J00776
J02649J02460
J01754
J00755J00277
J00144
J01384
J01103
J01214
J02796J00974
J00285
J02470J01889
J00999
J00171 J00560J01586
J00279
J00613 J01888J01905
J00710
J00344J00345
J00511
J00048
J02895J02873
J02091J01314 J00407
J01298
J02385
J00529
J01884
J00881
J01305
J00941
J02683
J00095J01238
J03058
J01626
J01924
J02434
J02614
J02981
J02920J02006J02878J01976J02066
J01969
J02804
J01674
J01341J01469J01592J02232
J02546
J01915
J02684J01567
J00510
J00246
J02784 J00272 J02240 J01916
J02294
J02280J01956
J02250
J01965
J01562
J02819
J02134
J02710
J01557
J02325
J02566
J02070
J02074
J01958
J01568
J00498J00205
J02014J02225
J01230
J00807
J00492
J02435
J00922
J02229
J02555J02972
J01930J02720
J01812
J01919
J02704 J00914
J00923
J00846 J02780
J02113
J00360
J02215
J02455
J02421
J02330
J00327
J00886
J02965
J01183
J00742
J01313
J02174
J01302
J01967J02214
J01228
J01973
J02483 J01574
J00596
J00642
J00677
J00663
J01218
J01194
J03031J02266
J02565J00822
J02231
J02062J01555
J02238
J01694
J01105
J00386
J02976
J02583 J01153J00513
J02523
J01585
J01480
J03138J02962
J02731J01556J02248
J01600J02080
J02279
J01276
J01288
J01156
J00610
J02095J02293
J01999
J02178
J02201
J01043
J02219
J00393
J02108
J00204
J01159
J02177J00731
J00877
J00912
J03045
J01566
J02820
J01145
J02947J00620
J00444
J00905J01458
J01260J00057
J02575
J00309J00536
J01447
J00544J02221J01236
J01740
J01808J00775
J00104 J02179
J01172J02563
J00445
J01382J00295
J01850
J01311
J01970J00463J00437
J01178
J00946
J03016
J03005
J02795
J03023
J00848J02461
J02482
J02480
J02479
J01472J00055
J00252J02419J00457 J00696J00173
J01219
J02520
J02452
J02120J00987
J01549J00215
J01920
J01195
J01191
J02418J01937
J02432
J00453J02495
J01943
J00190J02016
J00499
J01484
J02162
J00934J02729
J02068
J01619
J00517
J02051
J02910
J02001J01239
J02273J02292
J02862
J02863
J02112
J01286
J00184
J01622
J02474
J02312
J02608
J00878
J01033
J01559
J01465
J00449
J00079J02676
J02494
J00122
J02211
J02094
J02367
J02446
J02392 J02778 J00486 J02926
J02391 J02777 J00438 J02897 J02882 J02887 J01493 J00514
J00615J02096J01849J01329J00552J01491J02901J02909
J00697J02667J01852J01326
J02133
J01069
J02436
J02067
J01901
J00118
J01783
J01779
J00005
J00937
J00830
J00816J01714
J00952
J01047
J02547
J00031J00555
J01284
J01945
J00389
J03065
J03104J01940
J00446
J03047
J00399
J01190
J00982J00418
J00520
J03059
J00302J00303
J02411
J02224
J01140
J00558
J02213J02630
J01918
J02475
J02110J00314
J00451
J01255J00387
J00188
J03049
J00110
J02437
J00229J02817
J02891
J00378J01677J00712
J00394J00425
J02902J00694
J02602
J00603
J03050
J02052J01986J01474
J00859J02866J02865
J01809J01729J02285
J01966
J01158J00293
J00336
J00730
J00074
J03072
J01859
J00532
J02087
J00973
J01032
J02690
J01959
J00819
J02694
J01806
J02030
J01299
J00975
J00388
J00751
J01385J02748J02145J01517
J02589
J00474
J00289
J01977J02007J02464
J02323
J02689J02533J01538
J02954
J00979
J01543
J03128
J01745
J02377
J01784
J01046
J01892
J02139
J00925
J00920
J02307J00030
J01375
J00338
J00404
J01381
J02394
J02199
J01461
J02200
J01141
J01349
J01350J00296
J01293J02106
J01146
J00328
J03135J00273
J01424 J02834
J00024
J02638
J01607J02836 J02760
J02776
J02343
J02142J01631 J01454
J03080
J01529
J03124 J02958
J03102
J02944J01061
J01854
J02973J02941
J02615J00177 J02956
J02410 J02407
J02564
J02185
J02165
J02187
J02736J02408
J02674
J02271 J02289
J01931
J00216
J02002 J02894
J02912 J02930
J02857
J02101
J02913J02900
J01506
J01507
J02104
J02102
J00580
J03002J01661J00961
J00340J01040J01737
J01741J01814
J01990J00133
J02552J02378
J02605
J01081
J02603
J00685
J00857
J01813J00284J02599
J01301
J01705
J03071
J01016
J01470J01488J01575
J00924J03129J00466
J02598J01100J01364J03011
J02591J00840 J02513J02606
J00268
J02344
J00400J02364
J01728
J01844
J01935
J01441
J00027
J02604
J00550
J02628 J01078J01308J02512
J00516 J02017
J00040 J00893
J02088
J01143J02597
J01953J01099
J01823
J00792
J02585J02056
J02176
J00841
J00487
J02540
J02794
J01525
J00091
J00668J02518
J00103J01742
J00370J02814
J01320
J00962
J01738J00267
J02646
J00669
J00948J01733
J00462
J01037
J01751
J01124
J01763J02220
J01433J01750
J01468
J01052J02362
J02196J02361
J02183J00382
J01624 J01348
J01963
J01523
J01853 J02928
J01520
J02015J01142J02625
J02622
J02393J01524
J00093
J02335J00082
J00808J00140
J00826 J01514J00472
J00921J01968 J00945
J00403
J01949J03126J02164
J01482J00341J01521
J01147J01211
J00522
J01962J01860
J02578
J01560 J01819
J00913
J00480
J00478
J01221J01076
J02534
J00479
J01847
J02783
J01134J01594
J02395J01200
J01603
J02545J00853
J01155
J02860
J01243J02168J02053
J01858
J02163J01027J00456J01204
J02600
J00176J00274J02631J01050
J02717
J01346
J02577
J01489
J02245
J01274
J02668
J00197J02883J02376J01215
J02576
J02278J02601
J02734J01464
J02143
J03076
J02299 J02160
Figure 2: The network consisting of 0.7% Jamu pairs (correlation value above or equal to 0.596).
6 BioMed Research International
Table 3: Statistics of three datasets.
Parameters 0.7% 0.5% 0.3%
Networkstatistics
Total pairs 34,454 24,610 14,766Minimum correlation 0.596 0.665 0.718Number of Jamu formulas 2,779 2,496 2,085Average degree 24.8 19.7 14.2(Random network: ER) (24.8 ± 0.0) (19.7 ± 0.0) (14.2 ± 0.0)
(Random network: BA) (24.7 ± 0.1) (19.7 ± 0.1) (14.1 ± 0.1)
(Random network: CNN) (24.7 ± 0.4) (19.7 ± 0.4) (14.0 ± 0.4)
Clustering coefficient 0.521 0.520 0.540(Random network: ER) (0.009 ± 0.000) (0.008 ± 0.000) (0.007 ± 0.000)
(Random network: BA) (0.030 ± 0.001) (0.028 ± 0.001) (0.026 ± 0.001)
(Random network: CNN) (0.246 ± 0.008) (0.239 ± 0.008) (0.233 ± 0.010)
Number of connected components 69 119 254(Random networks: ER, BA, CNN) (1) (1) (1)Network diameter 15 17 20(Random network: ER) (4.0 ± 0.0) (4.0 ± 0.0) (5.0 ± 0.0)
(Random network: BA) (10.8 ± 0.8) (11.2 ± 1.5) (10.8 ± 0.9)
(Random network: CNN) (14.6 ± 1.9) (14.1 ± 1.4) (14.7 ± 1.3)
Network density 0.008 0.008 0.007(Random network: ER) (0.009 ± 0.000) (0.008 ± 0.000) (0.007 ± 0.000)
(Random network: BA) (0.009 ± 0.000) (0.008 ± 0.000) (0.007 ± 0.000)
(Random network: CNN) (0.009 ± 0.000) (0.008 ± 0.000) (0.007 ± 0.000)
DPClusO
Total number of clusters 1,746 1,411 938Number of clusters with more than 2 Jamu 1,296 873 453(%) (74.2) (61.9) (48.3)Number of Jamu formulas in the biggest cluster 118 104 89
of all formulas consists of 465 plants. We assigned 3138 Jamuformulas to 116 diseases of International Classification ofDiseases (ICD) version 10 from World Health Organization(WHO, Table 1) [19]. Those 116 diseases are mapped to18 classes of disease, which contains 16 classes of diseasefromNational Center for Biotechnology Information (NCBI)[20] and 2 additional classes. Table 2 shows distributionof 3138 Jamu into 18 classes of disease. According to thisclassification, most Jamu formulas are useful for relievingmuscle and bone, nutritional and metabolic diseases, andthe digestive system. Furthermore, there is no Jamu formulaclassified into glands and hormones and neonatal diseaseclasses. We excluded 4 Jamu formulas which are used to treatfever in the evaluation process because this symptom is verygeneral and almost appeared in all disease classes. Jamu-plant-disease relations can be represented using 2 matrices:first matrix is Jamu-plant relation with dimension 3138 ×465 and the second matrix is Jamu-disease relation withdimension 3138 × 18.
After completion of data acquisition process, we calcu-lated the similarity between Jamu pairs using correlationmeasure. The similarity measures between Jamu pairs weredetermined based on their ingredients. Corresponding to 𝐾(3138 in present case) Jamu formulas, there can be maximum(𝐾 × (𝐾 − 1)/2) = (3138 × (3137/2)) = 4,921,953 Jamu
pairs. We sorted the Jamu pairs based on correlation valueusing descending order and selected top-𝑛 (0.7%, 0.5%,and 0.3%) pairs of Jamu formula to create 3 sets of Jamupairs. The number of Jamu pairs for 0.7%, 0.5%, and 0.3%datasets is 34,454 pairs, 24,610 pairs, and 14,766 pairs andthe corresponding minimum correlation values are 0.596,0.665, and 0.718, respectively. The three datasets of Jamupairs can be regarded as three undirected networks (step 1 inFigure 1) consisting of 2779, 2496, and 2085 Jamu formulas,respectively (Table 3). Figure 2 shows visualization of 0.7%Jamunetworks usingCytoscape Spring Embedded layout.Weverified that the degree distributions of the Jamu networksare somehow close to those of scale-free networks, that is,roughly are of power law type. However, in the high-degreeregion the power law structure is broken (Figure 3). Nearlyaccurate relation of power laws between medicinal herbsand the number of formulas utilizing them was observed inJamu system but not in Kampo (Japanese crude drug system)[4]. The difference of formulas between Jamu and Kampocan be explained by herb selection by medicinal researchersbased on the optimization process of selection [4]. Thus,the broken structure of power law corresponding to Jamunetworks is associated with the fact that selection of Jamupairs based on ingredient correlation leads to nonrandomselection. We also constructed random networks according
BioMed Research International 7
●
●
●● ●
●
●●●
●●
●●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●●
●
1 2 5 10 20 50 100 200
12
35
813
2340
7122
1
0.5%
Freq
uenc
y
Freq
uenc
y
●
●
●●
●
●●
●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
1 2 5 10 20 50 100
12
47
1222
4381
0.3%(Deg.) (Deg.)
(Deg.)
Freq
uenc
y●
●●
●●●
●
●●
●●●●●●●
●
●●
●●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●●●●●
●
●
●●●●●
●
●
●
●
●●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●●
●
●
●●
●
1 2 5 10 20 50 100 200
12
35
813
2237
6311
423
40.7%
Figure 3: Degree distributions of three Jamu networks roughly follow power law.The 𝑥-axis corresponds to the log of degree of a node in theJamu network and the 𝑦-axis corresponds to the log of the number of Jamu.
to Erdos-Renyi (ER) model [21], Barabasi-Albert (BA) model[22], and Vazquez’s Connecting Nearest Neighbor (CNN)model [23] of the same size corresponding to each of the realJamu network. We used Cytoscape Network Analyzer plugin[24] and R software for analyzing the characteristics of boththe Jamu and the random networks.
We determined five statistical indexes, that is, averagedegree, clustering coefficient, number of connected compo-nent, network diameter, and network density of each Jamunetwork and also of each random network. The clusteringcoefficient 𝐶𝑛 of a node 𝑛 is defined as 𝐶𝑛 = 2𝑒𝑛/(𝑘𝑛(𝑘𝑛 − 1)),where 𝑘𝑛 is the number of neighbors of 𝑛 and 𝑒𝑛 is the numberof connected pairs between all neighbors of 𝑛. The networkdiameter is the largest distance between any two nodes. If
a network is disconnected, its diameter is themaximum of alldiameters of its connected components. A network’s densityis the ratio of the number of edges in the network over thetotal number of possible edges between all pairs of nodes(which is 𝑛(𝑛 − 1)/2, where 𝑛 is the number of vertices, foran undirected graph). The average number of neighbors andthe network density are the same for the real and randomnetworks of the same size as it is shown in Table 3. In caseof 0.7% and 0.5% real networks, the clustering coefficient isroughly the same and in case of 0.3% the clustering coefficientis somewhat larger. The number of connected componentsand the diameter of the Jamu networks gradually decreaseas the network grows bigger by addition of more nodes andedges.
8 BioMed Research International
Matching score
Num
ber o
f clu
sters
0.0 0.2 0.4 0.6 0.8 1.0
0
100
200
300
(a) 0.7%
Matching score
Num
ber o
f clu
sters
0.0 0.2 0.4 0.6 0.8 1.00
100
200
300
(b) 0.5%
Matching score
Num
ber o
f clu
sters
0.0 0.2 0.4 0.6 0.81.00
100
200
300
(c) 0.3%
Figure 4: Distribution of clusters based on matching score.
0.8Matching score threshold
Ratio
of n
umbe
r of c
luste
rs to
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.9 1.00.0
0.2
0.4
0.6
0.8
1.0
0.7%0.5%0.3%
tota
l clu
sters
(a)
50
0
100
150
Matching score threshold
Num
ber o
f pre
dict
ed p
lant
s
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
0.7%0.5%0.3%
(b)
Figure 5: (a) Success rate and (b) number of predicted plants with respect to matching score thresholds.
Very different values corresponding to clustering coef-ficient, connected component, and network diameter implythat the Jamu networks are quite different from all 3 typesof random networks.The differences between Jamu networksand ER random networks are the largest. Random networksconstructed based on other two models are also substantiallydifferent from Jamu networks. Based on the fact that therandom networks constructed based on all three types ofmodels are different from the Jamu networks, it can beconcluded that structure of Jamu networks is reasonablybiased and thus might contain certain information about
plant-disease relations. Specially, much higher value corre-sponding to clustering coefficient indicates that there areclusters in the networks worthy to be investigated. To extractclusters from the Jamu networks (step 2 in Figure 1) weapplied DPClusO network clustering algorithm [14] to gen-erate overlapping clusters based on density and peripherytracking.
3.2. Supervised Clustering Based on DPClusO. DPClusO is ageneral-purpose clustering algorithm and useful for findingoverlapping cohesive groups in an undirected simple graph
BioMed Research International 9
Table 4: List of plants assigned to each disease.
Number Plants name Hit-miss statusA. Disease: blood and lymph diseases
1 Tamarindus indica Hit ∗
2 Allium sativum Hit ∗
3 Tinospora tuberculata Hit ∗
4 Piper retrofractum Hit5 Syzygium aromaticum Hit ∗
6 Bupleurum falcatum Hit7 Graptophyllum pictum Hit8 Plantago major Hit9 Zingiber officinale Hit ∗
10 Cinnamomum burmannii Hit ∗
11 Soya max Miss ∗12 Kaempferia galanga Hit13 Curcuma longa Hit ∗
14 Piper nigrum Hit15 Zingiber aromaticum Hit ∗
16 Phyllanthus urinaria Hit ∗
17 Oryza sativa Hit18 Myristica fragrans Hit ∗
19 Alstonia scholaris Hit ∗
20 Syzygium polyanthum Miss21 Andrographis paniculata Hit ∗
22 Sida rhombifolia Miss23 Cyperus rotundus Hit24 Sonchus arvensis Miss25 Curcuma aeruginosa Hit ∗
26 Curcuma xanthorrhiza HitB. Disease: cancers
1 Catharanthus roseus HitC. Disease: the digestive system
1 Foeniculum vulgare Hit2 Glycyrrhiza uralensis Hit ∗
3 Imperata cylindrica Hit4 Zingiber purpureum Hit ∗
5 Physalis peruviana Hit6 Punica granatum Hit ∗
7 Echinacea purpurea Hit8 Zingiber officinale Hit ∗
9 Psidium guajava Hit10 Baeckea frutescens Hit ∗
11 Amomum compactum Hit12 Cinnamomum burmannii Hit ∗
13 Melaleuca leucadendra Hit14 Caesalpinia sappan Hit ∗
15 Parkia roxburghii Hit16 Rheum tanguticum Hit17 Kaempferia galanga Hit18 Coriandrum sativum Hit
Table 4: Continued.
Number Plants name Hit-miss status19 Curcuma longa Hit20 Zingiber aromaticum Hit21 Phyllanthus urinaria Hit22 Myristica fragrans Hit23 Hydrocotyle asiatica Hit ∗
24 Carica papaya Hit25 Mentha arvensis Hit26 Lepiniopsis ternatensis Hit27 Helicteres isora Hit28 Andrographis paniculata Hit29 Symplocos odoratissima Hit30 Schisandra chinensis Hit31 Blumea balsamifera Hit32 Silybum marianum Hit ∗
33 Cinnamomum sintoc Hit34 Elephantopus scaber Hit35 Curcuma aeruginosa Hit36 Kaempferia pandurata Hit37 Curcuma xanthorrhiza Hit38 Curcuma mangga Hit ∗
39 Curcuma zedoaria Hit40 Daucus carota Hit ∗
41 Matricaria chamomilla Hit ∗
42 Cymbopogon nardus Hit ∗
D. Disease: female-specific diseases1 Foeniculum vulgare Hit2 Imperata cylindrica Hit3 Tamarindus indica Hit4 Pluchea indica Hit ∗
5 Piper retrofractum Hit6 Punica granatum Hit7 Uncaria rhynchophylla Hit8 Zingiber officinale Hit9 Guazuma ulmifolia Hit ∗
10 Nigella sativa Hit11 Terminalia bellirica Hit12 Baeckea frutescens Hit13 Phaseolus radiatus Hit14 Amomum compactum Hit ∗
15 Sauropus androgynus Hit16 Usnea misaminensis Hit17 Cinnamomum burmannii Hit18 Melaleuca leucadendra Hit19 Parameria laevigata Hit20 Parkia roxburghii Hit21 Piper cubeba Hit22 Kaempferia galanga Hit
10 BioMed Research International
Table 4: Continued.
Number Plants name Hit-miss status23 Coriandrum sativum Hit24 Kaempferia angustifolia Hit25 Curcuma longa Hit26 Zingiber aromaticum Hit27 Languas galanga Hit28 Galla lusitania Hit29 Quercus lusitanica Hit30 Hydrocotyle asiatica Hit31 Areca catechu Hit32 Lepiniopsis ternatensis Hit33 Helicteres isora Hit ∗
34 Piper betle Hit35 Elephantopus scaber Hit ∗
36 Kaempferia pandurata Hit37 Curcuma xanthorrhiza Hit38 Sesbania grandiflora Hit
E. Disease: the heart and blood vessels1 Allium sativum Hit2 Curcuma longa Hit ∗
3 Morinda citrifolia Hit ∗
4 Homalomena occulta Hit ∗
5 Hydrocotyle asiatica Hit6 Alstonia scholaris Hit ∗
7 Syzygium polyanthum Miss ∗8 Andrographis paniculata Hit ∗
9 Apium graveolens Miss10 Imperata cylindrica Hit
F. Disease: male-specific diseases1 Cucurbita pepo Miss2 Serenoa repens Miss3 Baeckea frutescens Hit4 Phaseolus radiatus Hit5 Curcuma longa Hit6 Elephantopus scaber Hit
G. Disease: muscle and bone1 Foeniculum vulgare Hit2 Clausena anisum-olens Hit ∗
3 Zingiber purpureum Hit4 Allium sativum Hit5 Strychnos ligustrina Hit6 Tinospora tuberculata Hit ∗
7 Piper retrofractum Hit8 Syzygium aromaticum Hit9 Cola nitida Hit ∗
10 Ginkgo biloba Hit ∗
11 Panax ginseng Hit12 Equisetum debile Hit ∗
13 Zingiber officinale Hit
Table 4: Continued.
Number Plants name Hit-miss status14 Ganoderma lucidum Hit15 Nigella sativa Hit16 Terminalia bellirica Hit ∗
17 Baeckea frutescens Hit ∗
18 Amomum compactum Hit19 Cinnamomum burmannii Hit20 Melaleuca leucadendra Hit21 Parameria laevigata Hit ∗
22 Psophocarpus tetragonolobus Hit ∗
23 Parkia roxburghii Hit24 Piper cubeba Hit ∗
25 Kaempferia galanga Hit26 Coriandrum sativum Hit27 Cola acuminata Hit28 Coffea arabica Hit29 Orthosiphon stamineus Hit30 Curcuma longa Hit31 Piper nigrum Hit32 Alpinia galanga Hit33 Vitex trifolia Hit34 Zingiber amaricans Hit ∗
35 Zingiber zerumbet Hit36 Zingiber aromaticum Hit37 Languas galanga Hit38 Massoia aromatica Hit39 Morinda citrifolia Hit40 Carum copticum Hit ∗
41 Panax pseudoginseng Hit ∗
42 Oryza sativa Hit43 Myristica fragrans Hit44 Pandanus amaryllifolius Hit45 Eurycoma longifolia Hit46 Hydrocotyle asiatica Hit47 Areca catechu Hit ∗
48 Mentha arvensis Hit ∗
49 Lepiniopsis ternatensis Hit50 Pimpinella pruatjan Hit51 Andrographis paniculata Hit52 Blumea balsamifera Hit53 Cymbopogon nardus Hit54 Sida rhombifolia Hit55 Cinnamomum sintoc Hit56 Piper betle Hit ∗
57 Talinum paniculatum Hit58 Elephantopus scaber Hit59 Cyperus rotundus Hit60 Curcuma aeruginosa Hit61 Kaempferia pandurata Hit ∗
BioMed Research International 11
Table 4: Continued.
Number Plants name Hit-miss status62 Curcuma xanthorrhiza Hit63 Tribulus terrestris Hit64 Corydalis yanhusuo Hit65 Pausinystalia yohimbe HitH. Disease: nutritional andmetabolic diseases1 Foeniculum vulgare Hit2 Glycyrrhiza uralensis Hit3 Zingiber purpureum Hit4 Allium sativum Hit5 Tinospora tuberculata Hit6 Pandanus conoideus Hit7 Syzygium aromaticum Hit8 Punica granatum Hit9 Zingiber officinale Hit10 Guazuma ulmifolia Hit11 Nigella sativa Hit12 Amomum compactum Hit ∗13 Cinnamomum burmannii Hit14 Parameria laevigata Hit15 Caesalpinia sappan Hit16 Soya max Hit ∗17 Cocos nucifera Hit18 Rheum tanguticum Hit19 Piper cubeba Hit ∗20 Murraya paniculata Hit21 Kaempferia galanga Hit ∗22 Coffea arabica Hit ∗23 Orthosiphon stamineus Hit24 Curcuma longa Hit25 Piper nigrum Hit ∗26 Zingiber aromaticum Hit27 Aloe vera Hit28 Phaleria papuana Hit29 Galla lusitania Hit30 Quercus lusitanica Hit31 Morinda citrifolia Hit32 Myristica fragrans Hit ∗33 Momordica charantia Hit34 Areca catechu Hit35 Lepiniopsis ternatensis Hit36 Alstonia scholaris Hit37 Hibiscus sabdariffa Hit38 Laminaria japonica Hit39 Syzygium polyanthum Hit40 Andrographis paniculata Hit41 Sindora sumatrana Hit ∗42 Cassia angustifolia Hit43 Woodfordia floribunda Hit
Table 4: Continued.
Number Plants name Hit-miss status44 Piper betle Hit45 Spirulina Hit46 Stevia rebaudiana Hit47 Theae sinensis Hit48 Sonchus arvensis Hit49 Curcuma heyneana Hit50 Curcuma aeruginosa Hit51 Kaempferia pandurata Hit ∗
52 Curcuma xanthorrhiza Hit53 Curcuma zedoaria Hit ∗
54 Olea europaea HitI. Disease respiratory diseases
1 Foeniculum vulgare Hit2 Clausena anisum-olens Hit3 Glycyrrhiza uralensis Hit4 Zingiber purpureum Hit5 Piper retrofractum Hit ∗
6 Syzygium aromaticum Hit7 Gaultheria punctata Hit8 Panax ginseng Hit9 Equisetum debile Hit ∗
10 Zingiber officinale Hit11 Citrus aurantium Hit ∗
12 Nigella sativa Hit ∗
13 Amomum compactum Hit14 Cinnamomum burmannii Hit15 Melaleuca leucadendra Hit16 Parkia roxburghii Hit17 Cocos nucifera Hit18 Piper cubeba Hit19 Kaempferia galanga Hit20 Coriandrum sativum Hit21 Curcuma longa Hit22 Piper nigrum Hit23 Zingiber aromaticum Hit24 Languas galanga Hit25 Mentha piperita Hit26 Oryza sativa Hit ∗
27 Myristica fragrans Hit28 Pandanus amaryllifolius Hit ∗
29 Hydrocotyle asiatica Hit ∗
30 Mentha arvensis Hit31 Lepiniopsis ternatensis Hit32 Helicteres isora Hit33 Blumea balsamifera Hit34 Cymbopogon nardus Hit35 Piper betle Hit36 Curcuma xanthorrhiza Hit
12 BioMed Research International
Table 4: Continued.
Number Plants name Hit-miss status37 Salix alba Hit ∗
38 Matricaria chamomilla Miss ∗J. Disease: skin and connective tissue
1 Strychnos ligustrina Hit2 Merremia mammosa Hit ∗
3 Piper retrofractum Hit ∗
4 Santalum album Hit5 Zingiber officinale Hit ∗
6 Citrus aurantium Hit7 Citrus hystrix Hit8 Cassia siamea Hit9 Cocos nucifera Hit10 Trigonella foenum-graecum Hit11 Orthosiphon stamineus Hit12 Curcuma longa Hit13 Vetiveria zizanioides Hit14 Aloe vera Hit15 Rosa chinensis Hit16 Jasminum sambac Hit17 Phyllanthus urinaria Hit18 Mentha piperita Hit19 Oryza sativa Hit20 Myristica fragrans Hit ∗
21 Hydrocotyle asiatica Hit22 Lepiniopsis ternatensis Hit23 Alstonia scholaris Hit24 Andrographis paniculata Hit25 Cymbopogon nardus Hit26 Piper betle Hit27 Theae sinensis Hit28 Curcuma heyneana Hit29 Kaempferia pandurata Hit ∗
30 Curcuma xanthorrhiza Hit31 Melaleuca leucadendra Hit32 Matricaria chamomilla Miss ∗
K. Disease: the urinary system1 Foeniculum vulgare Hit ∗
2 Imperata cylindrica Hit ∗
3 Strychnos ligustrina Hit ∗
4 Plantago major Hit5 Zingiber officinale Hit ∗
6 Cinnamomum burmannii Hit ∗
7 Strobilanthes crispus Hit8 Kaempferia galanga Hit ∗
9 Orthosiphon stamineus Hit10 Phyllanthus urinaria Hit11 Blumea balsamifera Hit ∗
12 Sonchus arvensis Hit13 Curcuma xanthorrhiza Hit∗indicates that plant will not assigned if we use matching score >0.7.
63
24
1418
5 62 2 1
0
10
20
30
40
50
60
70
1 2 3 4 5 6 7 8 9
Num
ber o
f pla
nts
Number of diseases
2424
141418
55 662 2 1
Figure 6: Distribution of 135 plants assigned based on 0.7% datasetwith respect to the number of diseases they are assigned to.
for any type of application. It ensures coverage and performsrobustly in case of random addition, removal, and rearrange-ment of edges in protein-protein interaction (PPI) networks[14]. While applying DPClusO, the parameter values ofdensity and cluster property that we used in this experimentare 0.9 and 0.5, respectively [15]. Table 3 shows the summaryof clustering result by DPClusO. Because clusters consistingof two Jamu formulas are trivial clusters, for the next stepswe only use clusters each of which consists of 3 or moreJamu formulas. The number of total clusters increases alongwith the larger dataset, although the threshold correlationbetween Jamu pairs decreases. We evaluated the clusteringresult using matching score to determine dominant diseasefor every cluster (step 3 in Figure 1). Matching score of acluster is the ratio of the highest number of Jamu associatedwith the same disease to the total number of Jamu in thecluster. Thus matching score is a measure to indicate howstrongly a disease is associated to a cluster. Figure 4 showsthe distribution of the clusters with respect to matching scorefrom three datasets. All datasets have the highest frequencyof clusters at matching score >0.9 and overall most of theclusters have higher matching score, which means most ofthe DPClusO generated clusters can be confidently relatedto a dominant disease. Furthermore the number of clusterswith matching score >0.9 is remarkably larger compared tothe same in other ranges ofmatching score in case of the 0.3%dataset (Figure 4(c)). If we compare the ratio of frequency ofclusters at matching score >0.9 for every dataset, the 0.3%dataset has the highest ratio with 40.84% (of 453), comparedto 29.67% (of 873) and 21.91% (of 1296), in case of 0.5% and0.7% datasets, respectively. Thus, the most reliable speciesto disease relations can be predicted at matching score >0.9corresponding to the clusters generated from 0.3% dataset.
Figure 5(a) shows the success rate for all 3 datasets withrespect to threshold matching scores. Success rate is definedas the ratio of the number of clusters with matching scorelarger than the threshold to the total number of clusters.As expected it tends to produce lower success rate if wedecrease correlation value to create the datasets. Howevermore clusters are generated and more information can beextracted when we lower the threshold correlation value.Thesuccess rate increases rapidly as the matching score decreases
BioMed Research International 13
Table 5: Relation between disease classes in NCBI and efficacy classes reported by Afendi et al. [6].
Class of disease Ref. Efficacy classD1 Blood and lymph diseases NCBI E7 Pain/inflammation (PIN)D2 Cancers NCBI E7 Pain/inflammation (PIN)
D3The digestive system NCBI E4 Gastrointestinal disorders (GST)E7 Pain/inflammation (PIN)
D4 Ear, nose, and throat NCBI E7 Pain/inflammation (PIN)D5 Diseases of the eye NCBI E7 Pain/inflammation (PIN)D6 Female-specific diseases NCBI E5 Female reproductive organ problems (FML)D7 Glands and hormones NCBI E7 Pain/inflammation (PIN)D8The heart and blood vessels NCBI E7 Pain/inflammation (PIN)D9 Diseases of the immune system NCBI E7 Pain/inflammation (PIN)D10Male-specific diseases NCBI E6Musculoskeletal and connective tissue disorders (MSC)D11Muscle and bone NCBI E6Musculoskeletal and connective tissue disorders (MSC)D12 Neonatal diseases NCBI E7 Pain/inflammation (PIN)D13The nervous system NCBI E7 Pain/inflammation (PIN)
D14 Nutritional and metabolic diseases NCBI E2 Disorders of appetite (DOA)E4 Gastrointestinal disorders (GST)
D15 Respiratory diseases NCBI E8 Respiratory disease (RSP)E7 Pain/inflammation (PIN)
D16 Skin and connective tissue NCBI E9Wounds and skin infections (WND)D17The urinary system ∗ E1 Urinary related problems (URI)D18Mental and behavioural disorders ∗ E3 Disorders of mood and behavior (DMB)
from 0.9 to 0.6 and after that the slope of increase of successrate decreases. Therefore in this study we empirically decide0.6 as the threshold matching score to predict plant-diseaserelations.
3.3. Assignment of Plants to Disease. By using DPClusO re-sulting clusters, we assigned plants to classes of disease. Basedon a threshold matching score we assigned dominant diseaseto a cluster. Then we assign a plant to a cluster by way ofanalyzing the ingredients of the Jamu formulas belongingto that cluster and determining the highest frequency plant,that is, the plant that is used for maximum number Jamubelonging to that cluster (step 4 in Figure 1). Thus we assigna disease and a plant to each cluster having matching scoregreater than a threshold. Our hypothesis is that the diseaseand the plant assigned to the same cluster are related.
The total number of assigned plants depends onmatchingscore value. Figure 5(b) shows the number of predicted plantsthat can be assigned to diseases in the context of matchingscore. With higher matching score value, the number ofpredicted plants assigned to classes of disease is supposed toremain similar or decrease but the reliability of predictionincreases. In Figure 5(b) a sudden change in the numberof predicted plants is seen at matching score 0.6 which weconsider as empirical threshold in this work. Based on the0.7%dataset, the largest number of plants (135 plants, Table 4)was assigned to diseases. There are 63 plants assigned to onlyone class of disease, whereas the other 72 plants are assignedto at least two or more classes of disease (Figure 6).
3.4. Evaluation of the Supervised Clustering Based on DPClu-sO. Weused previously published results [6] as gold standardto evaluate our results. The previous study assigned plantsto 9 kinds of efficacy whereas we assigned the plants to 18disease classes (16 from NCBI and 2 additional classes). Forthe sake of evaluation we got done amapping of the 18 diseaseclasses to 9 efficacy classes by a professional doctor, whichis shown in Table 5. Table 6 shows the prediction result ofplant-disease relations for all 3 datasets, corresponding toclusters with matching score greater than 0.6. Table 6 alsoshows corresponding efficacy, the number of assigned plants,number of correctly predicted plants, and true positive rates(TPR), respectively.
We determined TPR corresponding to a disease/efficacyclass by calculating the ratio of the number of correctprediction to the number of all predictions. When a diseasecorresponds to more than one kind of efficacy, the highestTPR can be considered the TPR for the correspondingdisease. For all 3 datasets the TPR corresponding to eachdisease is roughly 90% or more. The 0.3% dataset consists ofJamu pairs with higher correlation values and based on thisdataset 117 plants are assigned to 14 disease classes. The 0.7%dataset contains more Jamu pairs and assigned plants to11 disease classes, one less disease class compared to 0.5%dataset. The two disease classes covered by 0.3% datasetbut not covered by 0.5% and 0.7% datasets are the nervoussystem (D13) and disease of the immune system (D9). Theonly disease class covered by 0.3% and 0.5% datasets butnot covered by 0.7% dataset is mental and behaviouraldisorders (D18). The larger dataset network tends to have
14 BioMed Research International
Table 6: The prediction result of plant-disease relations using matching score >0.6.
Class ofdisease
Correspondingefficacy
0.7% dataset 0.5% dataset 0.3% datasetNumber ofassignedplants
Correctprediction
Truepositiverate
Number ofassignedplants
Correctprediction
Truepositiverate
Number ofassignedplants
Correctprediction
Truepositiverate
D1 E7 26 22 0.85 24 20 0.83 24 20 0.83D2 E7 1 1 1.00 5 5 1.00 1 1 1.00
D3 E4 42 42 1.00 33 33 1.00 28 28 1.00E7 38 0.90 30 0.91 25 0.89
D4 E7 0 0 — 0 0 — 0 0 —D5 E7 0 0 — 0 0 — 0 0 —D6 E5 38 38 1.00 37 37 1.00 32 32 1.00D7 E7 0 0 — 0 0 — 0 0 —D8 E7 10 8 0.80 8 7 0.88 6 5 0.83D9 E7 0 0 — 0 0 — 1 1 1.00D10 E6 6 4 0.67 2 0 — 3 1 0.33D11 E6 65 65 1.00 71 71 1.00 60 60 1.00D12 E7 0 0 — 0 0 — 0 0 —D13 E7 0 0 — 0 0 — 5 5 1.00
D14 E2 54 44 0.81 45 36 0.80 35 26 0.74E4 54 1.00 45 1.00 35 1.00
D15 E7 38 37 0.97 34 34 1.00 33 33 1.00E8 31 0.82 30 0.88 29 0.88
D16 E9 32 31 0.97 32 32 1.00 27 27 1.00D17 E1 13 13 1.00 9 9 1.00 8 8 1.00D18 E3 0 0 — 5 5 1.00 4 4 1.00
Total assigned plants 135 129 117
lower coverage of disease classes. The number of Jamu pairs,that is, the number of edges in the network, affect the numberof DPClusO resulting clusters and number of Jamu formulasper cluster. As a consequence, for the larger dataset networks,the success rate becomes lower and the coverage of diseaseclasses is lower but prediction of more plant-disease relationscan be achieved.
4. Conclusions
This paper introduces a novel method called supervisedclustering for analyzing big biological data by integrat-ing network clustering and selection of clusters based onsupervised learning. In the present work we applied themethod for data mining of Jamu formulas accumulatedin KNApSAcK database. Jamu networks were constructedbased on correlation similarities between Jamu formulas andthen network clustering algorithm DPClusO was applied togenerate high density Jamu modules. For the analysis ofthe next steps potential clusters were selected by supervisedlearning. The successful clusters containing several Jamurelated to the same disease might be useful for finding mainingredient plant for that disease and the lower matchingscore value clusters will be associated with varying plants
which might be supporting ingredients. By applying theproposed method important plants from Jamu formulas forevery classes of disease were determined.The plant to diseaserelations predicted by proposed network based method wereevaluated in the context of previously published results andwere found to produce a TPR of 90%. For the larger datasetnetworks, success rate and the coverage of disease classesbecome lower but prediction of more plant-disease relationscan be achieved.
Conflict of Interests
The authors declare that there is no financial interest orconflict of interests regarding the publication of this paper.
Acknowledgments
Thisworkwas supported by theNational BioscienceDatabaseCenter in Japan and the Ministry of Education, Culture,Sports, Science, and Technology of Japan (Grant-in-Aidfor Scientific Research on Innovation Areas “BiosyntheticMachinery. Deciphering and Regulating the System for Cre-ating StructuralDiversity of BioactivityMetabolites (2007)”).
BioMed Research International 15
References
[1] R. Verporte, H. K. Kim, and Y. H. Choi, “Plants as source ofmedicines,” inMedicinal and Aromatic Plants, R. J. Boger, L. E.Craker, and D. Lange, Eds., chapter 19, pp. 261–273, 2006.
[2] A. Furnharm, “Why do people choose and use complemen-tary therapies?” in Complementary Medicine: An ObjectiveAppraisal, E. Ernst, Ed., pp. 71–88, Butterworth-Heinemann,Oxford, UK, 1996.
[3] E. Ernst, “Herbal medicines put into context,” British MedicalJournal, vol. 327, no. 7420, pp. 881–882, 2003.
[4] F. M. Afendi, T. Okada, M. Yamazaki et al., “KNApSAcK familydatabases: integrated metabolite—plant species databases formultifaceted plant research,” Plant and Cell Physiology, vol. 53,no. 2, p. e1, 2012.
[5] F.M.Afendi, N.Ono, Y.Nakamura et al., “Dataminingmethodsfor omics and knowledge of crude medicinal plants towardbig data biology,” Computational and Structural BiotechnologyJournal, vol. 4, no. 5, Article ID e201301010, 2013.
[6] F. M. Afendi, L. K. Darusman, A. Hirai et al., “System biologyapproach for elucidating the relationship between Indonesianherbal plants and the efficacy of Jamu,” in Proceedings of the10th IEEE International Conference on Data Mining Workshops(ICDMW ’10), pp. 661–668, Sydney, Australia, December 2010.
[7] F. M. Afendi, L. K. Darusman, A. H. Morita et al., “Efficacy ofJamu formulations by PLS modeling,” Current Computer-AidedDrug Design, vol. 9, pp. 46–59, 2013.
[8] F. M. Afendi, L. K. Darusman, M. Fukuyama, M. Altaf-Ul-Amin, and S. Kanaya, “A bootstrapping approach for investi-gating the consistency of assignment of plants to Jamu efficacyby PLS-DAmodel,”Malaysian Journal ofMathematical Sciences,vol. 6, no. 2, pp. 147–164, 2012.
[9] W. Winterbach, P. V. Mieghem, M. Reinders, H. Wang, and D.de Ridder, “Topology of molecular interaction networks,” BMCSystems Biology, vol. 7, article 90, 2013.
[10] C. Bachmaier, U. Brandes, and F. Schreiber, “Biological net-work,” in Handbook of Graph Drawing and Visualization, pp.621–651, CRC Press, 2013.
[11] X. Chen, M. Chen, and K. Ning, “BNArray: an R package forconstructing gene regulatory networks from microarray databy using Bayesian network,” Bioinformatics, vol. 22, no. 23, pp.2952–2954, 2006.
[12] P. Langfelder and S. Horvath, “WGCNA: an R package forweighted correlation network analysis,” BMC Bioinformatics,vol. 9, article 559, 2008.
[13] A. Martin, M. E. Ochagavia, L. C. Rabasa, J. Miranda, J.Fernandez-de-Cossio, and R. Bringas, “BisoGenet: a new toolfor gene network building, visualization and analysis,” BMCBioinformatics, vol. 11, article 91, 2010.
[14] M. Altaf-Ul-Amin, M. Wada, and S. Kanaya, “Partitioning aPPI network into overlapping modules constrained by high-density and periphery tracking,” ISRN Biomathematics, vol.2012, Article ID 726429, 11 pages, 2012.
[15] M. Altaf-Ul-Amin, H. Tsuji, K. Kurokawa, H. Asahi, Y. Shinbo,and S. Kanaya, “DPClus: a density-periphery based graphclustering software mainly focused on detection of proteincomplexes in interaction networks,” Journal of Computer AidedChemistry, vol. 7, pp. 150–156, 2006.
[16] S. K. Kachigan, Multivariate Statistical Analysis: A ConceptualIntroduction, Radius Press, New York, NY, USA, 1991.
[17] J. L. Rodgers and W. A. Nicewander, “Thirteen ways to look atthe correlations coefficient,”TheAmerican Statiscian, vol. 42, pp.59–66, 1995.
[18] M. Li, J.-E. Chen, J.-X. Wang, B. Hu, and G. Chen, “Modifyingthe DPClus algorithm for identifying protein complexes basedon new topological structures,” BMC Bioinformatics, vol. 9,article 398, 2008.
[19] World Health Organization, “International Classification ofDiseases (ICD) 10,” 2010, http://www.who.int/classifications/icd/en/.
[20] National Center for Biotechnology Information, Genes andDisease, NCBI, Bethesda, Md, USA, 1998.
[21] P. Erdos and A. Renyi, “On the evolution of random graph,”Publicationes Mathematicae Debrecen, vol. 6, pp. 290–297, 1959.
[22] A.-L. Barabasi and R. Albert, “Emergence of scaling in randomnetworks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
[23] A. Vazquez, “Growing network with local rules: preferentialattachment, clustering hierarchy, anddegree correlations,”Phys-ical Review E—Statistical, Nonlinear, and Soft Matter Physics,vol. 67, no. 5, Article ID 056104, 15 pages, 2003.
[24] Max Planck Institut Informatik, “NetworkAnalyzer,” 2013,http://med.bioinf.mpi-inf.mpg.de/netanalyzer/index.php.
top related