

LUMC: LOPAC screen analysis

Preliminary results

21.5.2012

This document briefly presents the results from the LOPAC compound screen performed at LUMC. It contains the following information:

• The set of all protein targets that were targeted by at least one compound - AllProteinTargets.csv (Section 1)

• List of compounds and the proteins that they target - Lopac2Target.csv and Lopac2Target-binary.csv (Section 1)

• Pruned list of protein targets. For both Mtb and Stm - LopacFiltered_joint.csv. For Mtb - LopacFiltered_Mtb.csv. For Stm - LopacFiltered_Stm.csv (Section 2)

• Analysis of the positive and negative hits

• Gene to gene interactions (provided in a separate file) - g2g.txt.

• Gene annotations with terms from Gene Ontology (provided in a separate file) - g2ont.txt.

• Results from predictive clustering trees.

• Results from feature ranking.

• Results from predictive clustering trees including the info about cell viability.

1 Protein targets

We describe each compound from the LOPAC library by the protein targets on which the compound has been shown to be active. Of the 1260 compounds in the library, 964 were found to be active on human protein targets; only those are included in the further analysis. Furthermore, we identified 711 (human) protein targets on which at least one LOPAC compound was found to be active. The list of protein targets is given in a separate file (AllProteinTargets.csv). The association of each LOPAC compound with its targets is given first in the file Lopac2Target.csv, where each row represents a compound: each compound is identified by its LOPAC catnum (the first field in the row), followed by its protein targets (i.e., the respective accession numbers). Furthermore, we provide a matrix with dimensions 965 × 712 in the file Lopac2Target-binary.csv. Here, each row represents a compound and each column a protein target. Note that the first row contains the protein accession numbers and the first column contains the LOPAC catnum identifiers, which is why the matrix has one more row and one more column than the 964 compounds and 711 targets.
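The relation between the two files can be illustrated with a minimal sketch. It assumes that Lopac2Target.csv is comma-separated with one compound per row (catnum first, then a variable number of accession numbers); the delimiter and the function name build_binary_matrix are assumptions made for illustration, and the two sample rows are a toy excerpt, not the real file.

```python
import csv
import io

def build_binary_matrix(rows):
    """Turn rows of [catnum, accession, accession, ...] into a 0/1 matrix."""
    rows = [r for r in rows if r]  # skip empty lines
    # One column per distinct accession number, in sorted order.
    targets = sorted({acc for r in rows for acc in r[1:] if acc})
    header = ["catnum"] + targets
    matrix = []
    for r in rows:
        active = set(r[1:])
        matrix.append([r[0]] + [1 if t in active else 0 for t in targets])
    return header, matrix

# Toy excerpt in the assumed layout (not the real file).
sample = "G6416,AAL06595\nC9911,NP_071445,ABM97548\n"
header, matrix = build_binary_matrix(csv.reader(io.StringIO(sample)))
print(header)
for row in matrix:
    print(row)
```

With the header row and the catnum column included, 2 compounds and 3 targets give a 3 × 4 table, in the same way that 964 compounds and 711 targets give 965 × 712.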


2 Pruning of the protein targets

The set of 711 target proteins is large, and it includes proteins that are targeted by a large variety of compounds. Typically, such proteins are not specific and not closely relevant for the Mtb and Stm measurements. The more interesting protein targets are those found active in studies involving hit compounds (compounds with a z-score larger than 2 or smaller than -2) and not active for the remaining compounds. To this end, we prune the set of proteins using the following rule: a protein target is removed from the set of protein targets if at least 90% of the compounds that target it are not identified as hits.
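The pruning rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used for the analysis: the function name prune_targets, the dictionary layout and the toy compound and target names are all hypothetical.

```python
def prune_targets(compound_targets, hits, max_non_hit_fraction=0.9):
    """Return the targets that survive the 90% rule, with their compounds."""
    by_target = {}
    for compound, targets in compound_targets.items():
        for target in targets:
            by_target.setdefault(target, []).append(compound)
    kept = {}
    for target, compounds in by_target.items():
        non_hits = sum(1 for c in compounds if c not in hits)
        # Remove the target if at least 90% of its compounds are non-hits.
        if non_hits / len(compounds) < max_non_hit_fraction:
            kept[target] = compounds
    return kept

# Hypothetical toy data: 20 compounds target "promiscuous", 2 target "specific".
compound_targets = {f"c{i}": ["promiscuous"] for i in range(20)}
compound_targets["c0"] = ["promiscuous", "specific"]
compound_targets["c1"] = ["promiscuous", "specific"]
hits = {"c0"}  # only c0 is a hit compound
kept = prune_targets(compound_targets, hits)
print(sorted(kept))
```

In this toy case 19 of the 20 compounds on "promiscuous" are non-hits (95%), so it is removed, while "specific" has one hit among its two compounds and is kept.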

Considering this, we provide three separate lists of compounds and the respective protein targets. The first list (given in the file LopacFiltered_joint.csv) gives the protein targets that are obtained when the hit compounds are defined using both the Mtb and Stm z-score values. The second list (given in the file LopacFiltered_Mtb.csv) gives the protein targets that are relevant for the hit compounds of the Mtb study. The third list (given in the file LopacFiltered_Stm.csv) gives the protein targets relevant for the hit compounds of the Stm study.

We further briefly show the gene networks obtained from the pruned protein targets using STRING. First, we give the network for both Mtb and Stm (the 60 proteins given in the file LopacFiltered_joint.csv). Then, we show the network for Mtb (the 58 proteins from the file LopacFiltered_Mtb.csv). Finally, we present the network for Stm (the 15 proteins from the file LopacFiltered_Stm.csv).

Figure 1: The protein network for both Mtb and Stm.


Figure 2: The protein network for Mtb.


Figure 3: The protein network for Stm.

3 Positive and negative hits

Here, we first list the positive and negative hits for Mtb and Stm for which there is no known protein target.

Mtb:

• Negative hit (z ≤ -2): B154, P233, H8125, H135, G002, N2034, M226, N3136, H8645, H8759, H0879;

• Positive hit (z ≥ 2): E2375.

Stm:

• Negative hit (z ≤ -2): -;

• Positive hit (z ≥ 2): H8759, E2375.

Next, we give the ‘hit’ matrix.

         | Mtb ≥ 2 | Mtb ≤ -2
Stm ≥ 2  | S5890, C9911, A1784, A255, A9809, E2375, C6645 | P4405, V1377, N3510, M6383, C9754, V8879, M1404, T9033, H8759
Stm ≤ -2 | - | H1512, G5793, H89
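A cross-tabulation of this kind can be derived from the two z-scores of each compound, as in the sketch below. The thresholds follow the hit definition used above (z ≥ 2 and z ≤ -2); the function names and the z-score values are hypothetical and serve only to illustrate the four cells.

```python
def hit_label(z, threshold=2.0):
    """Discretize a z-score: 1 = positive hit, -1 = negative hit, 0 = no hit."""
    if z >= threshold:
        return 1
    if z <= -threshold:
        return -1
    return 0

def hit_matrix(scores):
    """Group compounds that are hits in both screens by (Mtb label, Stm label)."""
    cells = {(m, s): [] for m in (1, -1) for s in (1, -1)}
    for compound, (z_mtb, z_stm) in scores.items():
        key = (hit_label(z_mtb), hit_label(z_stm))
        if key in cells:  # compounds that are not hits in both screens are skipped
            cells[key].append(compound)
    return cells

# Hypothetical z-scores (Mtb, Stm), for illustration only.
scores = {
    "cmpd_a": (3.1, 2.5),
    "cmpd_b": (-2.4, 4.0),
    "cmpd_c": (-3.0, -2.2),
    "cmpd_d": (0.3, 5.0),
}
cells = hit_matrix(scores)
print(cells)
```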

4 Interactions and annotations

We describe each of the targets with the Gene Ontology (GO) and KEGG terms with which it is annotated. Afterwards, for each compound, we take the union of the GO and KEGG terms over all protein targets with which the given compound is described. The file that contains the GO and KEGG annotations is g2ont.txt.

Next, we include the interactions of the annotated proteins. Namely, using the file g2g.txt, each compound is described with the union of the GO and KEGG terms of each of its protein targets and, additionally, the union of the GO and KEGG terms with which the interacting genes are annotated.
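The construction of the two compound descriptions can be sketched as follows. The data structures, the function name describe_compound and the toy gene and term identifiers are hypothetical. We also assume here that terms inherited from interacting genes are marked with an "IA" prefix, which is how we read identifiers such as IAGO0009597 in the trees below; the report does not state this explicitly.

```python
def describe_compound(targets, gene_terms, interactions, prefix="IA"):
    """Union of the targets' own terms plus prefixed terms of interacting genes."""
    own = set()
    for gene in targets:
        own |= gene_terms.get(gene, set())
    via_interaction = set()
    for gene in targets:
        for partner in interactions.get(gene, ()):
            # Terms reached through an interaction are kept apart by the prefix.
            via_interaction |= {prefix + t for t in gene_terms.get(partner, set())}
    return own | via_interaction

# Hypothetical annotations (as in g2ont.txt) and interactions (as in g2g.txt).
gene_terms = {
    "geneA": {"GO0000001"},
    "geneB": {"GO0000002", "KEGG00001"},
}
interactions = {"geneA": ["geneB"]}
description = describe_compound(["geneA"], gene_terms, interactions)
print(sorted(description))
```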


5 Results from PCTs

This section presents the results from the analysis of the data using predictive clustering trees (PCTs). We present the results in three subsections, each dealing with a specific representation of the protein targets and therefore of the compounds. In other words, the three scenarios differ in the descriptive attributes, while the target attributes are always the same: the z-scores for each compound. For each scenario, we present the results for the real values of the z-scores and for their discretized counterparts (positive and negative ‘hit’).
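PCTs are commonly induced top-down by selecting, at each node, the test on a descriptive attribute that most reduces the variance of the target. The sketch below illustrates this heuristic for a single numeric target and one binary attribute (for example, "compound targets protein X"). It is a simplified illustration with hypothetical values, not the implementation used to produce the trees in this report.

```python
def variance(values):
    """Population variance; 0.0 for an empty list."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def variance_reduction(z_scores, has_attribute):
    """Variance of all z-scores minus the size-weighted variance of the two branches."""
    yes = [z for z, a in zip(z_scores, has_attribute) if a]
    no = [z for z, a in zip(z_scores, has_attribute) if not a]
    n = len(z_scores)
    weighted = len(yes) / n * variance(yes) + len(no) / n * variance(no)
    return variance(z_scores) - weighted

# Hypothetical z-scores of four compounds and two candidate binary attributes.
z = [-4.0, -4.0, 0.0, 0.0]
informative = [True, True, False, False]    # separates low from high z-scores
uninformative = [True, False, True, False]  # mixes them
print(variance_reduction(z, informative), variance_reduction(z, uninformative))
```

The attribute with the larger reduction would be chosen as the test at the node; the same idea extends to several targets by summing the variances of the individual targets.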

5.1 Simple protein accession numbers

5.1.1 Mtb - numeric

AAL06595
+--yes: [-4.891333]: 3
|   compound: [G6416,S9692,N1786]
+--no: NP_071445
    +--yes: [4.0715]: 2
    |   compound: [C9911,O3125]
    +--no: NP_937802
        +--yes: [-3.3242]: 5
        |   compound: [R5010,N3510,P0453,T9033,T182]
        +--no: NP_068810
            +--yes: [-3.88475]: 4
            |   compound: [V1377,F9677,P4405,V8879]
            +--no: NP_004960
                +--yes: [-1.908375]: 8
                |   compound: [O2139,T4318,E3520,S8442,
                |              T3434,T4818,C5913,C3930]
                +--no: [-0.49159]: 942
                    compound: [...]

AAL06595: signal transducer and activator of transcription 6, interleukin-4 induced
NP_071445: nucleotide-binding oligomerization domain-containing protein 2
NP_937802: microphthalmia-associated transcription factor isoform 1
NP_068810: transcription factor p65 isoform 1
NP_004960: insulin-degrading enzyme isoform 1
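Because every "yes" branch of the tree in Section 5.1.1 ends in a leaf, the tree can be read as an ordered list of rules in which the first targeted protein determines the prediction. The sketch below encodes it that way. It assumes that the bracketed value in a leaf is the predicted (mean) z-score and that the number after the colon is the number of compounds in the leaf; the function and variable names are ours.

```python
# Tests of the Mtb numeric tree (Section 5.1.1), in the order they are applied.
TREE_MTB_NUMERIC = [
    ("AAL06595", -4.891333),
    ("NP_071445", 4.0715),
    ("NP_937802", -3.3242),
    ("NP_068810", -3.88475),
    ("NP_004960", -1.908375),
]
DEFAULT_PREDICTION = -0.49159         # final "no" leaf
LEAF_SIZES = [3, 2, 5, 4, 8, 942]     # compounds per leaf, top to bottom

def predict_mtb_z(compound_targets):
    """Return the leaf value for a compound given the set of proteins it targets."""
    for accession, leaf_value in TREE_MTB_NUMERIC:
        if accession in compound_targets:
            return leaf_value
    return DEFAULT_PREDICTION

print(predict_mtb_z({"NP_071445"}))
print(predict_mtb_z({"NP_068810", "AAL06595"}))  # the earlier test wins
print(sum(LEAF_SIZES))
```

The leaf sizes add up to the 964 compounds included in the analysis, which is consistent with reading the last number of each leaf as a compound count.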


5.1.2 Mtb - discrete

AAL06595
+--yes: [-1] [3.0]: 3
|   compound: [G6416,S9692,N1786]
+--no: ABM97548
    +--yes: NP_071445
    |   +--yes: [1] [2.0]: 2
    |   |   compound: [C9911,O3125]
    |   +--no: [-1] [2.0]: 2
    |       compound: [C9754,P4405]
    +--no: NP_002522
        +--yes: [-1] [2.0]: 2
        |   compound: [F100,P1793]
        +--no: AAH09524
            +--yes: [-1] [2.0]: 3
            |   compound: [T4318,B8433,C3930]
            +--no: [0] [864.0]: 952
                compound: [...]

AAL06595: signal transducer and activator of transcription 6, interleukin-4 induced
ABM97548: Kruppel-like factor 5
NP_071445: nucleotide-binding oligomerization domain-containing protein 2
NP_002522: neurotensin receptor type 1
AAH09524: PSMD14 protein

5.1.3 Stm - numeric

NP_068810
+--yes: [5.1102]: 5
|   compound: [R5010,V1377,F9677,P4405,V8879]
+--no: O60760
    +--yes: [6.7815]: 2
    |   compound: [M1404,I7378]
    +--no: ABM97548
        +--yes: [4.893667]: 3
        |   compound: [C9911,C9754,O3125]
        +--no: NP_937802
            +--yes: [3.5576]: 5
            |   compound: [N3510,P0453,T9033,N1786,T182]
            +--no: AAB23646
                +--yes: [4.114667]: 3
                |   compound: [C6645,T1132,M6383]
                +--no: P31645
                    +--yes: [-1.786]: 4
                    |   compound: [M3668,C6305,H1512,R118]
                    +--no: [0.278201]: 942
                        compound: [...]

NP_068810: transcription factor p65 isoform 1
O60760: Hematopoietic prostaglandin D synthase
ABM97548: Kruppel-like factor 5
NP_937802: microphthalmia-associated transcription factor isoform 1
AAB23646: amyloid precursor protein
P31645: Sodium-dependent serotonin transporter

5.1.4 Stm - discrete

ABM97548
+--yes: [1] [3.0]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: NP_937802
    +--yes: NP_004337
    |   +--yes: [0] [3.0]: 3
    |   |   compound: [R5010,P0453,N1786]
    |   +--no: [1] [3.0]: 3
    |       compound: [N3510,T9033,T182]
    +--no: [0] [929.0]: 954
        compound: [...]

ABM97548: Kruppel-like factor 5
NP_937802: microphthalmia-associated transcription factor isoform 1
NP_004337: caspase-3 preproprotein

5.2 GO and KEGG terms

5.2.1 Mtb - numeric

GO0002829
+--yes: [-4.891333]: 3
|   compound: [G6416,S9692,N1786]
+--no: GO0051607
    +--yes: GO0005769
    |   +--yes: [-5.2388]: 5
    |   |   compound: [S8442,V1377,N3510,P4405,V8879]
    |   +--no: [-0.864917]: 12
    |       compound: [P4509,M6760,R5010,M9511,A164,F9677,
    |                  T113,I5159,D5891,P154,T9025,T182]
    +--no: GO0002275
        +--yes: [-5.1775]: 2
        |   compound: [P8139,W1628]
        +--no: GO0001891
            +--yes: [4.0715]: 2
            |   compound: [C9911,O3125]
            +--no: GO0030433
                +--yes: [-4.193]: 2
                |   compound: [D6140,T9033]
                +--no: GO0003013
                    +--yes: [-0.713659]: 296
                    |   compound: [...]
                    +--no: GO0002479
                        +--yes: [-2.9045]: 2
                        |   compound: [T4318,B8433]
                        +--no: [-0.370877]: 632
                            compound: [...]

GO0002829: negative regulation of type 2 immune response
GO0051607: defense response to virus
GO0005769: early endosome
GO0002275: myeloid cell activation involved in immune response
GO0001891: phagocytic cup
GO0030433: ER-associated protein catabolic process
GO0003013: circulatory system process
GO0002479: antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent

5.2.2 Mtb - discrete

GO0002700
+--yes: GO0000185
|   +--yes: [1] [2.0]: 2
|   |   compound: [C9911,O3125]
|   +--no: [-1] [4.0]: 4
|       compound: [G6416,S9692,L8789,N1786]
+--no: GO0046626
    +--yes: GO0007600
    |   +--yes: [0] [7.0]: 7
    |   |   compound: [R5010,S1693,F9677,P0453,S3567,C0625,R7772]
    |   +--no: [-1] [7.0]: 8
    |       compound: [M6760,V1377,F9397,P4405,P8139,T5575,V8879,H89]
    +--no: GO0016492
        +--yes: [-1] [2.0]: 2
        |   compound: [F100,P1793]
        +--no: GO0034109
            +--yes: [-1] [2.0]: 2
            |   compound: [S8442,W1628]
            +--no: GO0061135
                +--yes: [-1] [4.0]: 9
                |   compound: [C6645,T4318,B8433,T1132,T9033,
                |              S8567,C3930,M6383,M5250]
                +--no: [0] [845.0]: 922
                    compound: [...]

GO0002700: regulation of production of molecular mediator of immune response
GO0000185: activation of MAPKKK activity
GO0046626: regulation of insulin receptor signaling pathway
GO0007600: sensory perception
GO0016492: G-protein coupled neurotensin receptor activity
GO0034109: homotypic cell-cell adhesion
GO0061135: endopeptidase regulator activity

5.2.3 Stm - numeric

GO0006117
+--yes: [5.1102]: 5
|   compound: [R5010,V1377,F9677,P4405,V8879]
+--no: GO0004667
    +--yes: [6.7815]: 2
    |   compound: [M1404,I7378]
    +--no: GO0030031
        +--yes: [4.893667]: 3
        |   compound: [C9911,C9754,O3125]
        +--no: GO0001871
            +--yes: [2.267154]: 13
            |   compound: [C6645,P4509,T4376,S8442,T1132,A164,N3510,
            |              T5575,D7910,T9025,M6383,T182,C7522]
            +--no: GO0017016
                +--yes: [-1.5846]: 5
                |   compound: [M3668,C6305,H1512,R118,R7772]
                +--no: [0.285709]: 928
                    compound: [...]

GO0006117: acetaldehyde metabolic process
GO0004667: prostaglandin-D synthase activity
GO0030031: cell projection assembly
GO0001871: pattern binding
GO0017016: Ras GTPase binding

5.2.4 Stm - discrete

GO0030031
+--yes: [1] [3.0]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: GO0032270
    +--yes: GO0031300
    |   +--yes: [1] [2.0]: 2
    |   |   compound: [T9033,V8879]
    |   +--no: GO0051607
    |       +--yes: GO0000724
    |       |   +--yes: [0] [3.0]: 3
    |       |   |   compound: [M6760,R5010,T9025]
    |       |   +--no: [1] [3.0]: 3
    |       |       compound: [V1377,N3510,T182]
    |       +--no: GO0000409
    |           +--yes: [1] [2.0]: 3
    |           |   compound: [A2385,A9809,E1383]
    |           +--no: GO0004867
    |               +--yes: [1] [2.0]: 3
    |               |   compound: [C6645,T1132,M6383]
    |               +--no: [0] [170.0]: 182
    |                   compound: [...]
    +--no: [0] [749.0]: 756
        compound: [...]

GO0030031: cell projection assembly
GO0032270: positive regulation of cellular protein metabolic process
GO0031300: intrinsic to organelle membrane
GO0051607: defense response to virus
GO0000724: double-strand break repair via homologous recombination
GO0000409: regulation of transcription by galactose
GO0004867: serine-type endopeptidase inhibitor activity

5.3 GO and KEGG terms plus interaction information

5.3.1 Mtb - numeric

IAGO0009597
+--yes: IAGO0017075
|   +--yes: [-4.8786]: 5
|   |   compound: [T4318,S8442,S9692,P8139,N1786]
|   +--no: [-1.103167]: 6
|       compound: [O2139,E3520,T3434,T4818,C5913,C3930]
+--no: IAGO0036046
    +--yes: [-3.387167]: 6
    |   compound: [G6416,R5010,V1377,F9677,P4405,V8879]
    +--no: GO0001891
        +--yes: [4.0715]: 2
        |   compound: [C9911,O3125]
        +--no: GO0004340
            +--yes: [-4.2865]: 2
            |   compound: [Q0125,N3510]
            +--no: GO0030433
                +--yes: [-4.193]: 2
                |   compound: [D6140,T9033]
                +--no: GO0003013
                    +--yes: [-0.706692]: 295
                    |   compound: [...]
                    +--no: [-0.38349]: 638
                        compound: [...]

IAGO0009597: detection of virus
IAGO0017075: syntaxin-1 binding
IAGO0036046: protein demalonylation
GO0001891: phagocytic cup
GO0004340: glucokinase activity
GO0030433: ER-associated protein catabolic process
GO0003013: circulatory system process

5.3.2 Mtb - discrete

GO0002700
+--yes: GO0000185
|   +--yes: [1] [2.0]: 2
|   |   compound: [C9911,O3125]
|   +--no: [-1] [4.0]: 4
|       compound: [G6416,S9692,L8789,N1786]
+--no: IAGO0019992
    +--yes: [-1] [3.0]: 3
    |   compound: [P8139,V8879,W1628]
    +--no: IAGO0050982
        +--yes: IAGO0005018
        |   +--yes: [-1] [3.0]: 3
        |   |   compound: [S8442,N3510,P4405]
        |   +--no: [0] [106.0]: 132
        |       compound: [...]
        +--no: GO0046631
            +--yes: [0] [10.0]: 16
            |   compound: [M105,F8927,M129,M3668,M6760,T5625,
            |              R115,T4568,F9397,T5318,A1784,
            |              D8065,T4693,M2692,F8175,F6800]
            +--no: IAGO0030229
                +--yes: [-1] [2.0]: 3
                |   compound: [L3791,E2031,T9033]
                +--no: [0] [740.0]: 793
                    compound: [...]

GO0002700: regulation of production of molecular mediator of immune response
GO0000185: activation of MAPKKK activity
IAGO0019992: diacylglycerol binding
IAGO0050982: detection of mechanical stimulus
IAGO0005018: platelet-derived growth factor alpha-receptor activity
GO0046631: alpha-beta T cell activation
IAGO0030229: very-low-density lipoprotein particle receptor activity

5.3.3 Stm - numeric

GO0006117
+--yes: GO0000320
|   +--yes: [8.395333]: 3
|   |   compound: [V1377,P4405,V8879]
|   +--no: [0.1825]: 2
|       compound: [R5010,F9677]
+--no: GO0004667
    +--yes: [6.7815]: 2
    |   compound: [M1404,I7378]
    +--no: GO0030031
        +--yes: [4.893667]: 3
        |   compound: [C9911,C9754,O3125]
        +--no: GO0001871
            +--yes: IAGO0004743
            |   +--yes: [8.6605]: 2
            |   |   compound: [N3510,M6383]
            |   +--no: KEGG04640
            |       +--yes: [2.1872]: 5
            |       |   compound: [C6645,T4376,T1132,
            |       |              D7910,T182]
            |       +--no: [0.202667]: 6
            |           compound: [P4509,S8442,A164,
            |                      T5575,T9025,C7522]
            +--no: IAGO0050917
                +--yes: [-3.0535]: 2
                |   compound: [H1512,A6134]
                +--no: IAGO0019087
                    +--yes: [0.691359]: 64
                    |   compound: [...]
                    +--no: [0.252681]: 867
                        compound: [...]

GO0006117: acetaldehyde metabolic process
GO0000320: re-entry into mitotic cell cycle
GO0004667: prostaglandin-D synthase activity
GO0030031: cell projection assembly
GO0001871: pattern binding
IAGO0004743: pyruvate kinase activity
KEGG04640: Hematopoietic cell lineage/Immune response
IAGO0050917: sensory perception of umami taste
IAGO0019087: transformation of host cell by virus

5.3.4 Stm - discrete

GO0030031
+--yes: [1] [3.0]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: IAGO0033132
    +--yes: GO0031347
    |   +--yes: [0] [3.0]: 3
    |   |   compound: [A2385,Q0125,F6889]
    |   +--no: [1] [3.0]: 3
    |       compound: [N3510,A9809,E1383]
    +--no: GO0004866
        +--yes: GO0001558
        |   +--yes: [0] [4.0]: 4
        |   |   compound: [R5010,T1132,S8567,M5250]
        |   +--no: [1] [3.0]: 3
        |       compound: [C6645,T9033,M6383]
        +--no: GO0006117
            +--yes: [1] [2.0]: 3
            |   compound: [V1377,F9677,V8879]
            +--no: GO0008156
                +--yes: [0] [95.0]: 105
                |   compound: [...]
                +--no: GO0033127
                    +--yes: [-1] [1.0]: 2
                    |   compound: [O0886,H89]
                    +--no: [0] [820.0]: 829
                        compound: [...]

GO0030031: cell projection assembly
IAGO0033132: regulation of histone phosphorylation
GO0031347: regulation of defense response
GO0004866: endopeptidase inhibitor activity
GO0001558: regulation of cell growth
GO0006117: acetaldehyde metabolic process
GO0008156: negative regulation of DNA replication
GO0033127: negative regulation of glucokinase activity

6 Results from Feature Ranking

Here, we present the results obtained with the ReliefF feature ranking algorithm (from the WEKA toolbox). Instead of giving the complete list, we present the gene networks for Mtb and Stm for both the numeric and the discrete values of the z-score.
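To give an intuition for this family of rankers, the sketch below implements the basic Relief idea for binary features and a discrete target: a feature gains weight when it differs between an instance and its nearest neighbour of another class, and loses weight when it differs from its nearest neighbour of the same class. This is a deliberately simplified illustration and not the WEKA implementation; ReliefF additionally averages over k neighbours and handles multi-class and numeric targets. The data are hypothetical.

```python
def relief_weights(X, y):
    """Basic Relief with one nearest same-class and one nearest other-class neighbour."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance; for 0/1 features this counts differing features.
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        same = [j for j in range(n) if j != i and y[j] == y[i]]
        other = [j for j in range(n) if y[j] != y[i]]
        if not same or not other:
            continue
        near_same = min(same, key=lambda j: dist(X[i], X[j]))
        near_other = min(other, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            gain = abs(X[i][f] - X[near_other][f]) - abs(X[i][f] - X[near_same][f])
            weights[f] += gain / n
    return weights

# Hypothetical data: feature 0 determines the class, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
weights = relief_weights(X, y)
print(weights)
```

Ranking the features by decreasing weight and keeping the top 40 would correspond to the selection shown in the figures below.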


Figure 4: The protein network for the top 40 genes for Mtb numeric.


Figure 5: The protein network for the top 40 genes for Mtb discrete.


Figure 6: The protein network for the top 40 genes for Stm numeric.


Figure 7: The protein network for the top 40 genes for Stm discrete.


7 Results from PCTs with cell viability info

This section presents the results from the analysis of the data using predictive clustering trees (PCTs). We present the results in three subsections, each dealing with a specific representation of the protein targets and therefore of the compounds. In other words, the three scenarios differ in the descriptive attributes, while the target attributes are always the same: the z-scores for each compound. For each scenario, we present the results for the real values of the z-scores and for their discretized counterparts (positive and negative ‘hit’).

The results from Section 5 use the z-score of the bacterial load as the only target variable. The results shown here use, besides the bacterial load, also the z-score for cell viability. The first value in the tree leaves is for the bacterial load and the second value is for cell viability. In this scenario, we focus on three outcomes of applying the compounds: kill bacteria and do nothing to the host, promote bacteria and kill the host, and promote bacteria and do nothing to the host. We are not interested in the outcomes where nothing happens to the bacteria or where both the bacteria and the host are killed. In the respective subsections, we give a short overview of each PCT in the context of these three outcomes.
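The three outcomes of interest can be expressed as a small lookup on the discretized pair (bacterial load, cell viability). The sketch assumes the same thresholds as for the hits (z ≥ 2 and z ≤ -2) and treats every other combination, including an increased cell viability, as not of interest; both choices are assumptions made for illustration. The example pairs are leaf values taken from the trees in this section.

```python
def label(z, threshold=2.0):
    """Discretize a z-score into 1 (increase), -1 (decrease) or 0 (no effect)."""
    if z >= threshold:
        return 1
    if z <= -threshold:
        return -1
    return 0

OUTCOMES = {
    (-1, 0): "kill bacteria, do nothing to host",
    (1, -1): "promote bacteria, kill host",
    (1, 0): "promote bacteria, do nothing to host",
}

def outcome(z_bacterial_load, z_cell_viability):
    """Map a (bacterial load, cell viability) z-score pair to an outcome of interest."""
    key = (label(z_bacterial_load), label(z_cell_viability))
    return OUTCOMES.get(key, "not of interest")

# Leaf values [bacterial load, cell viability] taken from the trees in this section.
print(outcome(-5.269, 0.0475))      # AAL06595 leaf, Mtb numeric
print(outcome(5.36075, -4.00925))   # ABM97548 leaf, Stm numeric
print(outcome(-3.4595, -3.751667))  # NP_937802 leaf, Mtb numeric
```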

7.1 Simple protein accession numbers

7.1.1 Mtb - numeric

ABM97548: Kruppel-like factor 5
NP_937802: microphthalmia-associated transcription factor isoform 1
NP_068810: transcription factor p65 isoform 1
NP_858045: nuclear receptor coactivator 3 isoform a
AAB23646: amyloid precursor protein
AAL06595: signal transducer and activator of transcription 6, interleukin-4 induced
CAC29064: serine/threonine kinase 33
NP_004960: insulin-degrading enzyme isoform 1
NP_000015: beta-2 adrenergic receptor
NP_000509: hemoglobin subunit beta

In this tree, the most interesting group of compounds consists of G6416 and S9692, which, when applied, kill the bacteria and do nothing to the host. This happens when the protein AAL06595 is targeted. The next interesting group of compounds consists of A0966, Q3251, and F6627, which also kill the bacteria and do nothing to the host. These compounds target the NP_000509 protein. A potentially interesting group of compounds is the one that targets NP_004960; it consists of T4318, S8442, C5913, and C3930. These compounds do kill the bacteria (z-score of -2.579); however, they have a relatively low z-score for cell viability (-1.4935).

ABM97548
+--yes: [-0.11825,-8.79125]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: NP_937802
    +--yes: [-3.4595,-3.751667]: 6
    |   compound: [R5010,N3510,P0453,T9033,N1786,T182]
    +--no: NP_068810
        +--yes: [-3.200667,-5.361667]: 3
        |   compound: [V1377,F9677,V8879]
        +--no: NP_858045
            +--yes: [0.829143,-3.503429]: 7
            |   compound: [A2385,D0670,O2139,B7651,C7632,A9809,E1383]
            +--no: AAB23646
                +--yes: [0.241,-4.899667]: 3
                |   compound: [C6645,T1132,M6383]
                +--no: AAL06595
                    +--yes: [-5.269,0.0475]: 2
                    |   compound: [G6416,S9692]
                    +--no: CAC29064
                        +--yes: [-1.172577,-1.492615]: 26
                        |   compound: [M3668,E8875,E3520,T4376,S1693,
                        |              M1404,K3888,A4638,C8221,T7040,A6671,
                        |              C0253,F100,M6191,T3434,P8139,T9652,
                        |              P1793,T7402,D7910,W1628,E3876,T4818,
                        |              M2011,T4182,C191]
                        +--no: NP_004960
                            +--yes: [-2.579,-1.4935]: 4
                            |   compound: [T4318,S8442,C5913,C3930]
                            +--no: NP_000015
                                +--yes: [-1.36225,-1.496417]: 12
                                |   compound: [E4375,P4015,E4642,I5627,
                                |              I8005,S5068,I3639,I6504,F1016,
                                |              P6126,C6019,P9547]
                                +--no: NP_000509
                                    +--yes: [-2.836,-0.576667]: 3
                                    |   compound: [A0966,Q3251,F6627]
                                    +--no: [-0.465922,-0.426892]: 894
                                        compound: [...]

7.1.2 Mtb - discrete

CAC29064
+--yes: [0,0] [26.0,22.0]: 37
|   compound: [...]
+--no: EAW58774
    +--yes: [0,-1] [3.0,4.0]: 7
    |   compound: [C9911,C6645,M2537,T5193,F6627,T4143,M6383]
    +--no: AAH65243
        +--yes: [0,-1] [6.0,5.0]: 10
        |   compound: [A2385,D0670,B7651,V1377,C7632,
        |              C9511,C3930,A9809,P9375,C7522]
        +--no: NP_000015
            +--yes: [0,0] [9.0,7.0]: 12
            |   compound: [E4375,P4015,E4642,I5627,
            |              I8005,S5068,I3639,I6504,
            |              F1016,P6126,C6019,P9547]
            +--no: NP_000509
                +--yes: [-1,0] [3.0,3.0]: 3
                |   compound: [A0966,Q3251,S9692]
                +--no: NP_663745
                    +--yes: [-1,-1] [2.0,2.0]: 4
                    |   compound: [M6760,M9511,T113,D5891]
                    +--no: AAH09524
                        +--yes: [-1,0] [2.0,2.0]: 2
                        |   compound: [T4318,B8433]
                        +--no: [0,0] [819.0,861.0]: 889
                            compound: [...]

CAC29064: serine/threonine kinase 33
EAW58774: Janus kinase 2 (a protein tyrosine kinase)
AAH65243: Microphthalmia-associated transcription factor
NP_000015: beta-2 adrenergic receptor
NP_000509: hemoglobin subunit beta
NP_663745: probable DNA dC->dU-editing enzyme APOBEC-3A
AAH09524: PSMD14 protein

In this tree, two groups of compounds kill the bacteria and do nothing to the host. The first group consists of the compounds A0966, Q3251 and S9692, and it targets the NP_000509 protein. The second group consists of T4318 and B8433, and it targets the AAH09524 protein.

7.1.3 Stm - numeric

ABM97548
+--yes: [5.36075,-4.00925]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: NP_068810
    +--yes: [4.69725,-2.66375]: 4
    |   compound: [R5010,V1377,F9677,V8879]
    +--no: O60760
        +--yes: [6.7815,-3.2525]: 2
        |   compound: [M1404,I7378]
        +--no: AAB23646
            +--yes: [4.114667,-3.326333]: 3
            |   compound: [C6645,T1132,M6383]
            +--no: NP_937802
                +--yes: [3.5576,-2.1022]: 5
                |   compound: [N3510,P0453,T9033,N1786,T182]
                +--no: NP_003734
                    +--yes: [1.41,-2.4055]: 4
                    |   compound: [A2385,F6889,A9809,E1383]
                    +--no: NP_006555
                        +--yes: [-0.139,-2.6015]: 2
                        |   compound: [T7665,C7522]
                        +--no: P31645
                            +--yes: [-1.786,0.683]: 4
                            |   compound: [M3668,C6305,H1512,R118]
                            +--no: [0.274256,0.244634]: 936
                                compound: [...]

ABM97548: Kruppel-like factor 5
NP_068810: transcription factor p65 isoform 1
O60760: Hematopoietic prostaglandin D synthase
AAB23646: amyloid precursor protein
NP_937802: microphthalmia-associated transcription factor isoform 1
NP_003734: nuclear receptor coactivator 1 isoform 1
NP_006555: C-X-C chemokine receptor type 6
P31645: Sodium-dependent serotonin transporter

This tree presents two interesting outcomes of the compound application in the Stm study. The most interesting outcome occurs when the protein P31645 is targeted by M3668, C6305, H1512, or R118: the bacterial load is reduced and no harm is done to the host. The second outcome occurs when any of the proteins ABM97548, NP_068810, O60760, AAB23646, NP_937802 and NP_003734 is targeted by the respective group of compounds (listed in the tree); according to the leaf values, the bacterial load then increases while the cell viability decreases.

7.1.4 Stm - discrete

ABM97548
+--yes: [1,-1] [3.0,3.0]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: NP_937802
    +--yes: NP_004337
    |   +--yes: [0,0] [3.0,3.0]: 3
    |   |   compound: [R5010,P0453,N1786]
    |   +--no: [1,-1] [3.0,3.0]: 3
    |       compound: [N3510,T9033,T182]
    +--no: NP_068810
        +--yes: [1,-1] [2.0,2.0]: 3
        |   compound: [V1377,F9677,V8879]
        +--no: AAB23646
            +--yes: [1,-1] [2.0,2.0]: 3
            |   compound: [C6645,T1132,M6383]
            +--no: NP_003734
                +--yes: [0,-1] [2.0,2.0]: 4
                |   compound: [A2385,F6889,A9809,E1383]
                +--no: NR_029493
                    +--yes: [0,-1] [1.0,1.0]: 2
                    |   compound: [P108,P6777]
                    +--no: O60760
                        +--yes: [0,-1] [1.0,1.0]: 2
                        |   compound: [M1404,I7378]
                        +--no: NP_056979
                            +--yes: [0,0] [48.0,46.0]: 52
                            |   compound: [...]
                            +--no: O75582
                                +--yes: [-1,0] [1.0,2.0]: 2
                                |   compound: [H89,R7772]
                                +--no: [0,0] [874.0,872.0]: 886
                                    compound: [...]

ABM97548: Kruppel-like factor 5
NP_937802: microphthalmia-associated transcription factor isoform 1
NP_004337: caspase-3 preproprotein
NP_068810: transcription factor p65 isoform 1
AAB23646: amyloid precursor protein
NP_003734: nuclear receptor coactivator 1 isoform 1
NR_029493: Homo sapiens microRNA 21 (MIR21)
O60760: Hematopoietic prostaglandin D synthase
NP_056979: geminin
O75582: Ribosomal protein S6 kinase alpha-5

In this tree, the most interesting outcome is produced when the protein O75582 is targeted by H89 or R7772. These compounds reduce the bacterial load and do no harm to the host. Another interesting outcome is the killing of the host together with an increase of the bacterial load. This outcome is produced when any of the proteins ABM97548, NP_068810 and AAB23646 is targeted by the respective group of compounds (given in the tree). Furthermore, this tree reveals a connection between the proteins NP_937802 and NP_004337. Namely, when both of the proteins are targeted, nothing happens to the bacteria or the host. However, if NP_937802 is targeted and NP_004337 is not (an effect obtained with the application of N3510, T9033 or T182), then the bacterial load increases while the cell viability decreases.

7.2 GO and KEGG terms

7.2.1 Mtb - numeric

GO0030031
+--yes: [-0.11825,-8.79125]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: GO0030318
    +--yes: [-1.283611,-2.935]: 18
    |   compound: [A2385,D0670,B7651,R5010,V1377,C7632,
    |              C9511,N3510,P0453,T9033,T9652,D7910,
    |              C3930,N1786,A9809,T182,P9375,C7522]
    +--no: GO0071944
        +--yes: [-5.0105,-5.163]: 2
        |   compound: [P8139,V8879]
        +--no: GO0004867
            +--yes: [0.241,-4.899667]: 3
            |   compound: [C6645,T1132,M6383]
            +--no: GO0002637
                +--yes: [-5.269,0.0475]: 2
                |   compound: [G6416,S9692]
                +--no: GO0034109
                    +--yes: [-4.677,-2.162]: 2
                    |   compound: [S8442,W1628]
                    +--no: GO0004667
                        +--yes: [-2.2085,-4.5795]: 2
                        |   compound: [M1404,I7378]
                        +--no: GO0003013
                            +--yes: GO0016202
                            |   +--yes: [-1.194182,-1.399045]: 22
                            |   |   compound: [...]
                            |   +--no: [-0.699586,-0.45434]: 268
                            |       compound: [...]
                            +--no: GO0070603
                                +--yes: [0.7535,-3.8975]: 2
                                |   compound: [E1383,R7772]
                                +--no: [-0.386454,-0.426102]: 631
                                    compound: [...]

GO0030031: cell projection assembly
GO0030318: melanocyte differentiation
GO0071944: cell periphery
GO0004867: serine-type endopeptidase inhibitor activity
GO0002637: regulation of immunoglobulin production
GO0034109: homotypic cell-cell adhesion
GO0004667: prostaglandin-D synthase activity
GO0003013: circulatory system process
GO0016202: regulation of striated muscle tissue development
GO0070603: SWI/SNF-type complex

This tree elucidates one interesting outcome, which occurs when the regulation of immunoglobulin production (GO0002637) is targeted by either of the two compounds G6416 or S9692: the bacterial load is reduced while the cell viability is essentially unchanged (leaf values [-5.269, 0.0475]).

7.2.2 Mtb - discrete

GO0030318
+--yes: [0,-1] [11.0,10.0]: 19
|   compound: [A2385,D0670,B7651,R5010,V1377,C7632,C9511,N3510,P0453,
|              P4405,T9033,T9652,D7910,C3930,N1786,A9809,T182,P9375,C7522]
+--no: GO0030031
    +--yes: [1,-1] [2.0,3.0]: 3
    |   compound: [C9911,C9754,O3125]
    +--no: GO0044325
        +--yes: [-1,-1] [2.0,3.0]: 3
        |   compound: [S8442,S1693,V8879]
        +--no: GO0042311
            +--yes: [0,0] [18.0,14.0]: 21
            |   compound: [...]
            +--no: GO0004867
                +--yes: [-1,-1] [1.0,2.0]: 2
                |   compound: [C6645,M6383]
                +--no: GO0002275
                    +--yes: [-1,-1] [2.0,2.0]: 3
                    |   compound: [P8139,I5159,W1628]
                    +--no: GO0002700
                        +--yes: [-1,0] [3.0,3.0]: 3
                        |   compound: [G6416,S9692,L8789]
                        +--no: GO0001820
                            +--yes: [-1,-1] [2.0,1.0]: 2
                            |   compound: [F100,P1793]
                            +--no: GO0044355
                                +--yes: [-1,-1] [2.0,2.0]: 4
                                |   compound: [M6760,M9511,T113,D5891]
                                +--no: GO0002479
                                    +--yes: [-1,0] [2.0,2.0]: 2
                                    |   compound: [T4318,B8433]
                                    +--no: [0,0] [824.0,865.0]: 894
                                        compound: [...]

GO0030318: melanocyte differentiation
GO0030031: cell projection assembly
GO0044325: ion channel binding
GO0042311: vasodilation
GO0004867: serine-type endopeptidase inhibitor activity
GO0002275: myeloid cell activation involved in immune response
GO0002700: regulation of production of molecular mediator of immune response
GO0001820: serotonin secretion
GO0044355: clearance of foreign intracellular DNA
GO0002479: antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent

This tree illustrates two interesting outcomes. The first outcome is a reduction of the bacterial load while nothing happens to the host. This outcome occurs when either the regulation of production of molecular mediator of immune response (GO0002700) is targeted by G6416, S9692, or L8789, or the antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent (GO0002479) is targeted by T4318 or B8433. The second outcome is an increased bacterial load together with the killing of the host. This happens when the cell projection assembly (GO0030031) is targeted by C9911, C9754 or O3125.

7.2.3 Stm - numeric

GO0030031
+--yes: [5.36075,-4.00925]: 4
|   compound: [C9911,C9754,P4405,O3125]
+--no: GO0006117
    +--yes: [4.69725,-2.66375]: 4
    |   compound: [R5010,V1377,F9677,V8879]
    +--no: GO0004667
        +--yes: [6.7815,-3.2525]: 2
        |   compound: [M1404,I7378]
        +--no: GO0004867
            +--yes: [4.114667,-3.326333]: 3
            |   compound: [C6645,T1132,M6383]
            +--no: GO0030318
                +--yes: [1.747187,-0.938688]: 16
                |   compound: [A2385,D0670,B7651,C7632,C9511,N3510,
                |              P0453,T9033,T9652,D7910,C3930,
                |              N1786,A9809,T182,P9375,C7522]
                +--no: GO0070603
                    +--yes: [0.8005,-2.857]: 2
                    |   compound: [E1383,R7772]
                    +--no: [0.264635,0.240033]: 925
                        compound: [...]

GO0030031: cell projection assembly
GO0006117: acetaldehyde metabolic process
GO0004667: prostaglandin-D synthase activity
GO0004867: serine-type endopeptidase inhibitor activity
GO0030318: melanocyte differentiation
GO0070603: SWI/SNF-type complex

This tree elucidates several functions whose targeting increases the bacterial load and reduces the cell viability. These functions include GO0030031, GO0006117, GO0004667 and GO0004867, and to a certain extent GO0030318. These functions are targeted by the groups of compounds given in the tree.

7.2.4 Stm - discrete

GO0001871
+--yes: GO0012505
|       +--yes: GO0001959
|       |       +--yes: [1,-1] [6.0,6.0]: 6
|       |       |       compound: [C9911,C6645,N3510,P4405,M6383,T182]
|       |       +--no:  [0,0] [3.0,2.0]: 3
|       |               compound: [T1132,D7910,C7522]
|       +--no:  [0,0] [8.0,8.0]: 8
|               compound: [P4509,T4376,R5010,S8442,A164,T5575,O3125,T9025]
+--no:  GO0000409
        +--yes: GO0000792
        |       +--yes: [1,-1] [2.0,2.0]: 2
        |       |       compound: [A9809,E1383]
        |       +--no:  [0,0] [2.0,2.0]: 2
        |               compound: [A2385,F6889]
        +--no:  [0,0] [912.0,910.0]: 935
                compound: [...]

GO0001871: pattern binding
GO0012505: endomembrane system
GO0001959: regulation of cytokine-mediated signaling pathway
GO0000409: regulation of transcription by galactose
GO0000792: heterochromatin

This tree elucidates connections between different gene functions that lead to an increase of the bacterial load and a decrease of cell viability. The first connection is between the three functions GO0001871, GO0012505 and GO0001959, which are targeted simultaneously by the following compounds: C9911, C6645, N3510, P4405, M6383 and T182. The second connection is between the two functions GO0000409 and GO0000792, which are targeted simultaneously by the following compounds: A9809 and E1383.
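The "targeted simultaneously" reading corresponds to following a chain of yes-branches in the tree. As a minimal sketch, the Stm discrete tree above can be encoded as nested tuples and traversed for a given set of targeted GO terms. The encoding, the name `predict` and the example term sets are illustrative assumptions, not the format used by the tree-learning software.

```python
# Nested encoding of the Stm discrete tree above. An internal node is
# (term, yes_subtree, no_subtree); a leaf is the predicted pair of classes
# copied from the tree (first value: bacterial load, second: cell viability,
# following the interpretation in the text).
TREE = (
    "GO0001871",
    ("GO0012505",
        ("GO0001959", (1, -1), (0, 0)),
        (0, 0)),
    ("GO0000409",
        ("GO0000792", (1, -1), (0, 0)),
        (0, 0)),
)


def predict(node, terms):
    """Follow yes/no branches until a leaf (a 2-tuple) is reached."""
    while len(node) == 3:  # internal nodes have three fields
        term, yes_branch, no_branch = node
        node = yes_branch if term in terms else no_branch
    return node


# All three terms on the first path must be present to reach the [1,-1] leaf.
print(predict(TREE, {"GO0001871", "GO0012505", "GO0001959"}))
print(predict(TREE, {"GO0001871", "GO0012505"}))
print(predict(TREE, {"GO0000409", "GO0000792"}))
```

The second call shows why the combination matters: a compound hitting GO0001871 and GO0012505 but not GO0001959 ends in a [0,0] leaf.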

7.3 GO and KEGG terms plus interaction information

7.3.1 Mtb - numeric

GO0030031
+--yes: [-0.11825,-8.79125]: 4
|       compound: [C9911,C9754,P4405,O3125]
+--no:  GO0030318
        +--yes: [-1.283611,-2.935]: 18
        |       compound: [A2385,D0670,B7651,R5010,V1377,C7632,C9511,N3510,P0453,T9033,
        |                  T9652,D7910,C3930,N1786,A9809,T182,P9375,C7522]
        +--no:  IAGO0032317
                +--yes: [-4.174,-3.62]: 5
                |       compound: [S8442,S1693,P8139,V8879,W1628]
                +--no:  GO0004867
                        +--yes: [0.241,-4.899667]: 3
                        |       compound: [C6645,T1132,M6383]
                        +--no:  GO0002637
                                +--yes: [-5.269,0.0475]: 2
                                |       compound: [G6416,S9692]
                                +--no:  GO0004667
                                        +--yes: [-2.2085,-4.5795]: 2
                                        |       compound: [M1404,I7378]
                                        +--no:  IAGO0032645
                                                +--yes: [-1.134649,-0.725175]: 57
                                                |       compound: [...]
                                                +--no:  GO0070603
                                                        +--yes: [0.7535,-3.8975]: 2
                                                        |       compound: [E1383,R7772]
                                                        +--no:  GO0006987
                                                                +--yes: [-1.9925,-1.147125]: 8
                                                                |       compound: [M6760,A9950,F100,P1793,
                                                                |                  C5134,F6300,S3567,L9908]
                                                                +--no:  GO0005344
                                                                        +--yes: [-3.624,-0.6495]: 2
                                                                        |       compound: [A0966,Q3251]
                                                                        +--no:  [-0.431718,-0.429251]: 853
                                                                                compound: [...]

GO0030031: cell projection assembly
GO0030318: melanocyte differentiation
IAGO0032317: regulation of Rap GTPase activity
GO0004867: serine-type endopeptidase inhibitor activity
GO0002637: regulation of immunoglobulin production
GO0004667: prostaglandin-D synthase activity
IAGO0032645: regulation of granulocyte macrophage colony-stimulating factor production
GO0070603: SWI/SNF-type complex
GO0006987: activation of signaling protein activity involved in unfolded protein response
GO0005344: oxygen transporter activity

This tree presents two scenarios where the bacterial load is decreased while the host is unharmed. The first scenario occurs when the regulation of immunoglobulin production (GO0002637) is targeted by G6416 or S9692. The second occurs when the oxygen transporter activity (GO0005344) is targeted by A0966 or Q3251.

7.3.2 Mtb - discrete

IAGO0004912
+--yes: GO0006874
|       +--yes: GO0038036
|       |       +--yes: [1,-1] [3.0,3.0]: 3
|       |       |       compound: [C9911,C6645,O3125]
|       |       +--no:  [-1,-1] [6.0,6.0]: 6
|       |               compound: [M2537,S8442,N3510,P4405,P8139,M6383]
|       +--no:  [0,0] [7.0,7.0]: 7
|               compound: [A4638,T5193,T3434,B7777,T4818,F6627,T4143]
+--no:  GO0030318
        +--yes: [0,0] [11.0,9.0]: 17
        |       compound: [...]
        +--no:  IAGO0032317
                +--yes: [-1,-1] [2.0,3.0]: 3
                |       compound: [S1693,V8879,W1628]
                +--no:  GO0042311
                        +--yes: [0,0] [18.0,14.0]: 21
                        |       compound: [...]
                        +--no:  IAGO0030174
                                +--yes: IAGO0006957
                                |       +--yes: [-1,-1] [2.0,3.0]: 3
                                |       |       compound: [M6760,P1793,R8875]
                                |       +--no:  [0,0] [80.0,84.0]: 96
                                |               compound: [...]
                                +--no:  GO0005344
                                        +--yes: [-1,0] [3.0,3.0]: 3
                                        |       compound: [A0966,Q3251,S9692]
                                        +--no:  GO0060306
                                                +--yes: [0,0] [8.0,12.0]: 13
                                                |       compound: [N7261,S7395,E7881,L3791,F6886,
                                                |                  I0782,O1008,N4779,T9025,
                                                |                  D029,H6036,V6383,R5523]
                                                +--no:  GO0000045
                                                        +--yes: [-1,0] [2.0,3.0]: 3
                                                        |       compound: [T2408,L4762,F100]
                                                        +--no:  GO0043491
                                                                +--yes: [-1,0] [2.0,3.0]: 3
                                                                |       compound: [L8789,H89,M5250]
                                                                +--no:  [0,0] [730.0,762.0]: 778
                                                                        compound: [...]

IAGO0004912: interleukin-3 receptor activity
GO0006874: cellular calcium ion homeostasis
GO0038036: sphingosine-1-phosphate receptor activity
GO0030318: melanocyte differentiation
IAGO0032317: regulation of Rap GTPase activity
GO0042311: vasodilation
IAGO0030174: regulation of DNA-dependent DNA replication initiation
IAGO0006957: cellular calcium ion homeostasis
GO0005344: oxygen transporter activity
GO0060306: regulation of membrane repolarization
GO0000045: autophagic vacuole assembly
GO0043491: protein kinase B signaling cascade

This tree presents two outcomes. The first outcome is a decrease of the bacterial load while nothing happens to the host. It occurs in any of three cases: the oxygen transporter activity (GO0005344) is targeted by A0966, Q3251 or S9692; the autophagic vacuole assembly (GO0000045) is targeted by T2408, L4762 or F100; or the protein kinase B signaling cascade (GO0043491) is targeted by L8789, H89 or M5250. The second outcome is an increase of the bacterial load and a decrease of cell viability. It occurs when cellular calcium ion homeostasis (GO0006874) and sphingosine-1-phosphate receptor activity (GO0038036) are targeted directly and the interleukin-3 receptor activity (GO0004912) is targeted through an interacting gene, by application of C9911, C6645 or O3125.

7.3.3 Stm - numeric

GO0030031
+--yes: [5.36075,-4.00925]: 4
|       compound: [C9911,C9754,P4405,O3125]
+--no:  GO0006117
        +--yes: [4.69725,-2.66375]: 4
        |       compound: [R5010,V1377,F9677,V8879]
        +--no:  GO0004667
                +--yes: [6.7815,-3.2525]: 2
                |       compound: [M1404,I7378]
                +--no:  GO0004867
                        +--yes: [4.114667,-3.326333]: 3
                        |       compound: [C6645,T1132,M6383]
                        +--no:  IAGO0033132
                                +--yes: [2.4455,-2.2295]: 6
                                |       compound: [A2385,Q0125,N3510,F6889,A9809,E1383]
                                +--no:  IAGO0022417
                                        +--yes: GO0016499
                                        |       +--yes: [1.887,-2.10775]: 4
                                        |       |       compound: [N170,N1786,T182,C7522]
                                        |       +--no:  GO0001994
                                        |               +--yes: [-2.4335,0.59475]: 4
                                        |               |       compound: [M3668,C6305,H1512,D9815]
                                        |               +--no:  [0.277579,-0.08329]: 145
                                        |                       compound: [...]
                                        +--no:  GO0008514
                                                +--yes: [2.6528,-0.3634]: 5
                                                |       compound: [R0875,C3662,S5890,L8533,C2932]
                                                +--no:  [0.26745,0.301182]: 779
                                                        compound: [...]

GO0030031: cell projection assembly
GO0006117: acetaldehyde metabolic process
GO0004667: prostaglandin-D synthase activity
GO0004867: serine-type endopeptidase inhibitor activity
IAGO0033132: negative regulation of glucokinase activity
IAGO0022417: protein maturation by protein folding
GO0016499: orexin receptor activity
GO0001994: norepinephrine-epinephrine vasoconstriction involved in regulation of systemic arterial blood pressure
GO0008514: organic anion transmembrane transporter activity

This tree presents three outcomes. The first outcome is a decrease of the bacterial load while nothing happens to the host. This outcome occurs when the norepinephrine-epinephrine vasoconstriction involved in regulation of systemic arterial blood pressure (GO0001994) is targeted by M3668, C6305, H1512 or D9815. The second outcome is an increase of the bacterial load while nothing happens to the host. This outcome occurs when the organic anion transmembrane transporter activity (GO0008514) is targeted by R0875, C3662, S5890, L8533 or C2932. The third outcome is an increase of the bacterial load and a decrease of cell viability. This happens when any of the remaining functions listed in the tree is targeted, directly or through interactions, by the respective group of compounds.

7.3.4 Stm - discrete

GO0001871
+--yes: IAGO0004996
|       +--yes: [1,-1] [6.0,7.0]: 8
|       |       compound: [C9911,C6645,N3510,P4405,O3125,M6383,T182,C7522]
|       +--no:  [0,0] [9.0,9.0]: 9
|               compound: [P4509,T4376,R5010,S8442,T1132,A164,T5575,D7910,T9025]
+--no:  GO0000409
        +--yes: GO0000792
        |       +--yes: [1,-1] [2.0,2.0]: 2
        |       |       compound: [A9809,E1383]
        |       +--no:  [0,0] [2.0,2.0]: 2
        |               compound: [A2385,F6889]
        +--no:  [0,0] [912.0,910.0]: 935
                compound: [...]

GO0001871: pattern binding
IAGO0004996: thyroid-stimulating hormone receptor activity
GO0000409: regulation of transcription by galactose
GO0000792: heterochromatin

This tree elucidates connections between different gene functions that lead to an increase of the bacterial load and a decrease of cell viability. The first connection is between the two functions GO0001871 (directly) and GO0004996 (interactions), which are targeted simultaneously by the following compounds: C9911, C6645, N3510, P4405, O3125, M6383, T182 and C7522. The second connection is between the two functions GO0000409 and GO0000792, which are targeted simultaneously by the following compounds: A9809 and E1383.
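The distinction between functions targeted directly and functions targeted through interactions can be illustrated with a small Python sketch. It assumes that an IA-prefixed feature marks a GO term annotated to a gene that interacts with one of the compound's direct targets; this construction is an assumption made here for illustration. The function name `build_features`, the gene names and the toy dictionaries are all hypothetical; in practice the two dictionaries would presumably be filled from gene annotation and gene-to-gene interaction data such as g2ont.txt and g2g.txt.

```python
def build_features(targets, annotations, interactions):
    """Collect direct GO terms and IA-prefixed terms of interacting genes.

    targets:      set of genes directly targeted by a compound
    annotations:  dict mapping gene -> set of GO terms
    interactions: dict mapping gene -> set of interacting genes
    """
    features = set()
    for gene in targets:
        # Terms annotated to the target itself count as directly targeted.
        features.update(annotations.get(gene, set()))
        # Terms of interaction partners get the assumed "IA" prefix.
        for partner in interactions.get(gene, set()):
            for term in annotations.get(partner, set()):
                features.add("IA" + term)
    return features


# Toy data; the gene names are invented for illustration only.
annotations = {"geneA": {"GO0001871"}, "geneB": {"GO0004996"}}
interactions = {"geneA": {"geneB"}}

features = build_features({"geneA"}, annotations, interactions)
print(sorted(features))
```

Under these assumptions, a compound targeting only the hypothetical geneA would carry both GO0001871 and IAGO0004996 and would therefore follow the first yes-path of the tree above.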
