functional annotation and network analysis of predicted mirna...

17
1 International Journal of ISSN No. 2394-7772 Biomathematics and Systems Biology Official Journal of Biomathematical Society of India Volume 3, No. 1, Year 2016 Functional annotation and network analysis of predicted miRNA targets in leguminous plant Glycine max Nela Pragathi Sneha Anjana Rajendiran Archana Pan* Received: 14 Jun, 2017 Revised: 10 Aug 2017 Accepted: 17 Aug 2017 Abstract. MicroRNAs (miRNAs) are small non-coding RNA molecules, usually of 18-24 nucleotides in length. They bind to the 3′ untranslated region of mRNAs causing translational repression or mRNA degradation and thereby regulate gene expression. Identification of miRNA targets (mRNAs) and inferring their functions are important in order to understand their role in different biological processes. In the present study, using in silico approach, we predicted and functionally characterized miRNA targets in leguminous plant Glycine max. A total of 1333 potential targets were predicted using target prediction tools. Functional enrichment analysis using bioinformatics resources revealed that 732, 710 and 643 targets were found to be involved in molecular functions, biological processes and cellular components, respectively. Pathway enrichment analysis using Kyoto Encyclopedia of Genes and Genomes (KEGG) indicated that the identified targets were found to be associated with various vital metabolic pathways, including biosynthesis of antibiotics, glycolysis, amino acid metabolism, and starch and sucrose metabolism. Furthermore, network analysis stipulates the target’s functional importance in biological networks of the plant. The outcome from the analysis would broaden the scope of understanding the role of G.max miRNAs in flower development, root nodulation and response to biotic and abiotic stresses and therefore, can act as a lead towards experimental understanding of the target’s role for the plant system. Keywords: Glycine max, miRNA targets, Functional annotation, Network analysis Corresponding author, email: a[email protected], Telephone:+91-413-2654-584 MicroRNAs (miRNAs) are small non-coding, endogenous RNA molecules of about 18-24nt long. Lin-4 and let-7 are the first two miRNAs to be discovered in Caenorhabditis elegans way back in 1993 by Rosalind Lee and Rhonda Feinbaum in Victor Ambros’s laboratory (Ambros, 2008). The discovery triggered a revolution in the field of non-coding RNA (ncRNA) research, which led to the discovery of ncRNAs in other species, such as flies, mammals and plants. It was found that the identified miRNAs play a humongous role in animals and plants as well by targeting mRNAs thereby acting as an important post-transcriptional regulator. They act by targeting specific mRNAs, mostly at their 3' untranslated region (UTR), whereas a very few bind in the 5' UTR of mRNA and some even bind within the coding regions of mRNA. 1. Introduction Centre for Bioinformatics, Pondicherry University, Pondicherry - 605014, India

Upload: others

Post on 23-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

1

International Journal of ISSN No. 2394-7772

Biomathematics and Systems Biology Official Journal of Biomathematical Society of India

Volume 3, No. 1, Year 2016

Functional annotation and network analysis of predicted

miRNA targets in leguminous plant Glycine max

Nela Pragathi Sneha Anjana Rajendiran Archana Pan*

Received: 14 Jun, 2017Revised: 10 Aug 2017

Accepted: 17 Aug 2017

Abstract. MicroRNAs (miRNAs) are small non-coding RNA molecules, usually of 18-24 nucleotides in

length. They bind to the 3′ untranslated region of mRNAs causing translational repression or mRNA

degradation and thereby regulate gene expression. Identification of miRNA targets (mRNAs) and inferring

their functions are important in order to understand their role in different biological processes. In the

present study, using in silico approach, we predicted and functionally characterized miRNA targets in

leguminous plant Glycine max. A total of 1333 potential targets were predicted using target prediction tools.

Functional enrichment analysis using bioinformatics resources revealed that 732, 710 and 643 targets were

found to be involved in molecular functions, biological processes and cellular components, respectively.

Pathway enrichment analysis using Kyoto Encyclopedia of Genes and Genomes (KEGG) indicated that the

identified targets were found to be associated with various vital metabolic pathways, including biosynthesis

of antibiotics, glycolysis, amino acid metabolism, and starch and sucrose metabolism. Furthermore, network

analysis stipulates the target’s functional importance in biological networks of the plant. The outcome from

the analysis would broaden the scope of understanding the role of G.max miRNAs in flower development,

root nodulation and response to biotic and abiotic stresses and therefore, can act as a lead towards

experimental understanding of the target’s role for the plant system.

Keywords: Glycine max, miRNA targets, Functional annotation, Network analysis

∗Corresponding author, email: [email protected], Telephone:+91-413-2654-584

MicroRNAs (miRNAs) are small non-coding, endogenous RNA molecules of about 18-24nt long. Lin-4 and let-7 are the first two miRNAs to be discovered in Caenorhabditis elegans way back in 1993 by Rosalind Lee and Rhonda Feinbaum in Victor Ambros’s laboratory (Ambros, 2008). The discovery triggered a revolution in the field of non-coding RNA (ncRNA) research, which led to the discovery of ncRNAs in other species, such as flies, mammals and plants. It was found that the identified miRNAs play a humongous role in animals and plants as well by targeting mRNAs thereby acting as an important post-transcriptional regulator. They act by targeting specific mRNAs, mostly at their 3' untranslated region (UTR), whereas a very few bind in the 5' UTR of mRNA and some even bind within the coding regions of mRNA.

1. Introduction

Centre for Bioinformatics, Pondicherry University,Pondicherry - 605014, India

Page 2: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

2

A large proportion of protein-coding genes appears to be regulated by miRNAs suggesting that miRNAs have a critical role in affecting

variety of biological functions. Many miRNAs have been identified in different plant species, including Arabidopsis thaliana (Rajagopalan

et al., 2006), rice (Zhu et al., 2008) , apple (Xia et al., 2012), maize (Zhang et al., 2009) and soybean (Song et al., 2011). Traditional

methods for detecting miRNAs include cloning, northern blotting, RT-PCR, and microarray techniques. In plants, miRNAs play a crucial

role in growth, response to biotic and abiotic stresses, basal defense and flowering etc. They also regulate various plant development

processes, such as leaf morphogenesis and polarity, floral differentiation, root initiation and development, vascular development and

transition of plant growth from vegetative growth to reproductive growth (Khraiwesh et al., 2012). Various abiotic stresses, such as cold

stress, drought stress, salt stress, have high impact on cultivation and yield in leguminous plants (Araújo et al., 2015). It is also found that

the nitrogen fixation in legume plants depends on several abiotic stresses, including soil salinity, low/high temperature, acidity, water

logging etc. (Serraj, 2003).

Glycine max, commonly known as soybean, is a leguminous plant of the family fabaceae (Michelfelder, 2009). Legumes are important crop

plants utilized as food and for soil improvement. They play a critical role in natural environment by accessing atmospheric N2 through

symbiosis with a group of soil bacteria, Rhizobia (Graham and Vance, 2003). They require minimal amount of N fertilizers and also

improve soil quality. Soybeans are rich source of protein and vegetable oil for human consumption globally. Its ability to adapt to a wide

range of soil, climatic conditions and ability to fix nitrogen makes it an agronomically important crop. A typical soybean contains 40%

protein and 20% oil (by weight). This composition ranks the crop highest in terms of protein content among all food crops and second in

terms of oil content after peanut (48%) among all food legumes. However, the growth, development and yield of soybean are greatly

affected by several abiotic stresses, such as flood, drought, salinity and heavy metal cadmium (Hossain et al., 2013). It has been reported

that miRNAs also play crucial roles in the development of this crop. For example, miR160 (Nizampatnam et al., 2015) controls auxin and

cytokinin sensitivities and is responsible for soybean nodule development.

It is found that miR393 stimulates the defense against Phytophthorasojae infection (Caudle et al., 2016). Moreover, soybean miRNAs are

reported to be involved in biotic and abiotic stresses (Kulcheski et al., 2011) and play a crucial role in the regulatory network of soybean

flower (Kulcheski et al., 2016). In order to understand the biological role of miRNAs in G. max, identification and analysis of the targets is

of utmost importance. Thus, in the present study an attempt was made to identify and annotate the G.max targets using in silico approach, as

the experimental methods are highly demanding in terms of resources and time. The study identified 1333 miRNA targets and they were

found to be associated with various vital metabolic pathways, including biosynthesis of antibiotics, glycolysis, amino acid metabolism, and

starch and sucrose metabloism. Furthermore, network analysis of target genes was performed and it was observed that they are involved in

the vital processes of plant, such as root growth, seed germination, flower development etc. These predicted targets are majorly involved in

regulating various stresses like water stress, cold stress, osmotic stress etc.

The outcome from the analysis would broaden the scope of understanding the role of G. max miRNAs in stress regulation and can be

experimentally verified. Further, the result can be considered for engineering G.max with stress resistance traits.

2. Materials and methods

2.1 Sequence database

A total of 638 mature miRNA sequences of G.max was downloaded from miRBase (release 21) (http://www.mirbase.org) (Kozomara and

Griffiths-Jones, 2014), a searchable database of published miRNA sequences and annotation. The downloaded mature miRNAs were used

Page 3: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

3

as a query sequence set for the prediction of G.max targets. The transcript library UniGene, DFCI Gene Index (GMGI) (version 16), was

downloaded for target identification.

2.2 Tools and Software

Two tools namely, psRNAtarget and TargetFinder were used for the prediction of miRNA targets. The mature miRNA sequences of G.max

were submitted as query to the target prediction server psRNATarget (Dai and Zhao, 2011), a small RNA target analysis server for plants.

The server uses dynamic programming with modified Smith & Waterman algorithm for aligning the sequences and it applies RNAup

algorithm (Lorenz et al., 2011) from Vienna package for analyzing target accessibility. TargetFinder (Fahlgren et al., 2007), a small RNA

target prediction tool, was also used for the prediction of targets, which predicts the small RNA binding sites on the target transcripts

computationally using position weighted scoring matrix. As miRNA usually target protein-coding sequences, the identified target mRNAs

were checked for their ability to code for a protein using Basic Local Alignment Search Tool (BLASTX), the homologous sequence finding

tool from NCBI. Annotation of the identified targets were done using BLAST2GO (Conesa et al., 2005) suite, a software for high-

throughput and automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. Pathway studio plant

(web) (Nikitin et al., 2003) tool was used for the construction of network, which allows us to understand the role of proteins in regulatory

networks.

2.3 Prediction of Glycine max targets

The downloaded 638 G.max mature miRNA sequences were used as query to identify their respective target transcript sequences.

2.3.1 psRNATarget

The downloaded miRNA sequences were submitted as query to the psRNATarget server with G.max UniGene library as target database

using the option ‘user submitted small RNAs/ preloaded transcripts’. The program was run with default values for the following parameters:

maximum expectation as 3.0, length for complementarity scoring (HSPsize) as 20, number of top target’s genes for each small RNA as 200,

target accessibility- allowed maximum energy to unpair the target site (UPE) as 25.0, flanking length around the target site for target

accessibility analysis as 17bp in upstream/13bp in downstream, and range of central mismatch leading to translational inhibition as 9-11nt.

2.3.2 TargetFinder

TargetFinder is the software that works on LINUX environment. Herein, the mature G.max miRNAs were used as query and the

downloaded G.max UniGene transcript sequences were used as database to be searched against. Targets were identified by aligning the

input small RNA sequences against all transcripts, followed by site scoring using a position-weighted scoring matrix. It implements a

‘FASTA’ program along with a penalty scoring scheme for mismatches, bulges, or gaps for aligning the sequences.

2.3.3 Elucidation of common protein-coding targets

The identified mRNA targets were manually curated and the targets identified by psRNATarget and TargetFinder were considered as

‘common targets’, as the combination of results will deliver high true positive coverage (Srivastava et al., 2014). These common targets

were then checked to be protein-coding sequence. This was accomplished by querying the predicted target mRNA sequences against the

non-redundant protein database using BLASTX with parameters (i) expect threshold: 10, (ii) word size: 6, matrix: BLOSUM62. Those

sequences which are capable of protein-coding alone were retained for further analysis and were termed as ‘common potential targets’.

2.4 Annotation and Network analysis

The ‘common potential targets’ were subjected to Gene Ontology (GO) annotation using the Blast2GO (Conesa et al., 2005) suite.

Blast2GO suite uses three principles to perform the annotation, which include Blast search, mapping and annotation. The tool provides GO

annotation information about the query sequences by mapping against databases, such as PIR, PSD, Uniprot, Swiss-prot, TrEMBL, RefSeq,

Page 4: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

4

Gen Pept, PDB, GO and NCBI. Furthermore, the KEGG pathway analysis was done to interpret the pathways in which the targets are

involved, followed by InterPro domain analysis.

For the annotated G. max targets, network analysis was done using pathway studio plant (Nikitin et al., 2003), a tool that allows to

understand the role of proteins in regulatory networks. It builds pathways and networks from relationships between biological molecules

and processes extracted from databases. It identifies the relationship among proteins, diseases, small molecules and cell processes. The

annotated common potential targets belonging to G.max were considered for network construction. These G.max targets were submitted to

the pathway studio plant and network with cellular processes, small molecules and various plant diseases were constructed. The proteins

with high connectivity were considered to be important targets of G.max in the network. These were taken to analyze their potentiality using

Cytoscape 3.4.0 (Christmas et al., 2005), a software for biological network visualization and analysis. The targets selected were individually

analyzed by observing the network parameters upon node (target) deletion. The potentiality of the targets was determined based on the

change in the critical network parameter values like clustering coefficient, network heterogeneity, network centralization and characteristic

path length.

3. Results and Discussion

3.1 Glycine max target prediction

A total of 4225 mRNA targets were predicted for 610 mature miRNA sequences of G. max against G.max UniGene library (DFCI Gene

Index (GMGI)) by psRNATarget server. The parameters (i) maximum allowed unpaired energy (UPE), (ii) flanking length around the target

site for analysis of target accessibility, (iii) complementarity scoring length (HSPsize) and (iv) central mismatch range leading to translation

inhibition were set to default values. Results from the psRNATarget revealed that the identified UniGene targets were either cleaved or

inhibited by the identified miRNAs. About 4284 targets were predicted by TargetFinder tool for the 610 mature miRNA sequences against

the G.max UniGene library (DFCI Gene Index (GMGI)).

The previous comparison study (Srivastava et al., 2014) reported that the combination of targets obtained from both the tools will give high

true positive coverage. Thereby, the miRNA target genes obtained from psRNATarget server and TargetFinder tool were manually curated

to identify common targets. Since both the tools used UniGene transcript as database, target accession ID was same for the predicted

targets. This enabled the easy extraction of the transcripts predicted as targets by both the tools and were termed as ‘common targets’. A

total of 1333 transcript sequences were predicted as common targets by both the tools. It was observed that a single G.max miRNA targets

multiple transcripts and also a single transcript (target) is associated with multiple G.max miRNAs. These results were in accordance with

the earlier reports that miRNAs exhibit multiplicity nature. i.e., a single miRNA can target multiple transcripts and also a single transcript

(target) is acted upon by multiple miRNAs. It is found that on an average 12 targets are associated to a single miRNA (Table 1). 22

miRNAs target only single transcript, whereas other miRNAs target multiple transcript. miR9752 was found to have the maximum (83)

number of targets out of all the miRNAs. Table 2 depicts that each target is associated with one or multiple miRNAs.

3.2Functional annotation of miRNA target genes

To understand the functional importance of the transcripts identified as targets by the miRNAs, functional annotation of the target genes

was carried out. Out of the 1333 ‘common targets’, 1052 targets were found to be protein-coding genes through BLASTX analysis which

were considered to be the ‘common potential targets’. As miRNA targets protein-coding genes only, these transcripts were taken forward

for functional annotation. Functional annotation of the targets was done using BLAST2GO suite, which performs GO annotation and

pathway analysis. As a part of annotation, Blast2GO performs BLAST analysis initially which indicated that top blast hits belong to G. max

species. The GO annotation revealed that the identified targets are involved in various molecular functions, cellular components and

biological processes. In the present study, it was observed that 732 targets are involved in several molecular functions, such as catalytic

activity, ion binding, transferase activity, 710 targets participate in the biological processes, including metabolic processes, signaling,

regulation of biological processes and 643 targets take part in the cellular components, such as cell periphery, membrane, nucleus ( Figures:

1, 2 &3). InterPro domain analysis revealed that 697 targets are associated with different InterPro IDs.

Page 5: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

5

Pathway enrichment analysis revealed that a total of 114 targets are involved in the metabolic networks. The pathways include antibiotic

biosynthesis, amino acid biosynthesis, nitrogen, starch and sucrose metabolism (Table 3). Most of the targets (24) were found to be

involved in biosynthesis of antibiotics. From the literature it was found that the genes responsible for the synthesis of antibiotics will be

useful in disease management of crop production. Antibiotics can contribute to microbial competitiveness and the suppression of plant root

pathogens. So, it is assumed that the identified targets involved in antibiotic biosynthesis might play a role in defense mechanism of the

crop by evading root pathogens colonization, as the crop depends mainly on the roots for a very crucial process called symbiosis to fix

atmospheric nitrogen.

3.3 Network analysis of Glycine max targets

In order to understand the interactions of target genes with other small molecules or diseases we considered all the targets of G.max and

constructed network using pathway studio plant (Nikitin et al., 2003). From the constructed network (Figure 4), it was observed that the

identified targets are involved in the important plant processes like seed germination, plant growth, root growth, flower development, cell

death, defense response and photosynthesis. Targets like peroxidase (UNIREF100 ID- O23961), Heat Shock Protein 70 (HSP70:

UNIREF100 ID- P26413), mitogen-activated protein kinase (MAPK: UNIREF100 ID- Q5K6Q4), Ribosomal protein S6 (RPS6:

UNIREF100 ID- Q6SPR3) were found to be highly connected in the network. For example, Figure 5 shows that peroxidase is responsible

for many regulations (grey-dotted lines) in the network and also in treatments of osmotic stress, hypoxia and drought. Similarly, Figure 6

indicates that HSP70 has many incoming and outgoing connections in the network denoting that it is responsible for expression of proteins

(blue arrowed lines), regulations (grey-dotted lines) and binding (purple lines) and is responsible for abiotic stress treatment, water stress

treatment, infection treatment etc.

These targets were individually analyzed using STRING (Szklarczyk et al., 2011) database. The process of node deletion in each network

was performed and considerable changes in the network parameters were observed (Table 4). These targets also had high centroid value

than the average centroid value of their respective networks which implies that they will be involved in regulating specific cell activity

(Scardoni et al., 2014).

4. Conclusions

In the present study, we predicted 1333 targets for the mature miRNA sequences, out of which 1052 targets are protein-coding genes. A

total of 710 target genes were found to be involved in various biological processes, 643 target genes as cellular components and 732 target

genes are involved in molecular functions like ATP binding, DNA binding, hydrolase activity, transferase activity etc. InterPro domain

analysis showed that 698 InterPro IDs are associated with target genes from databases, including- CATH-Gene3D, HAPMAP, PANTHER,

Pfam, PRINTS, PIRSF, ProDom, Prosite, SMART, Superfamily. KEGG analysis showed that targets are involved in 97 pathways which

include antibiotic biosynthesis, starch and sucrose metabolism. Network analysis revealed that four targets viz., peroxidase, HSP70, MAPK

and RPS6 are highly connected in the network and are involved in plant disease treatments and vital plant processes. It is noted from the

literature that MAPK is a mediator of several abiotic stresses, such as water, salt, osmotic and cold stress (Cristina et al., 2010), while RPS6

being a contributor of nodule development (Um et al., 2013). In addition, HSP70 (Yu et al., 2015) was found to be a major contributor of

stress related processes of cells and peroxidase helps the plant in coping up with the biotic stress conditions and pathogen attack (Passardi et

al., 2005). Thus, the suggested targets can be taken further for experimental verification of their roles as stress regulators in the plant which

will benefit the scientific community.

List of abbreviations

miRNA – microRNA; G.max- Glycine max; UPE- UnPaired Energy; KEGG - Kyoto Encyclopedia of Genes and Genomes; UTR -

Untranslated region; NCBI- National Center for Biotechnology Information; GO - Gene Ontology; HSP70 - Heat Shock Protein 70; MAPK

- Mitogen-activated protein kinase; RPS6 -Ribosomal protein S6.

Page 6: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

6

References

Ambros, V. (2008) The evolution of our thinking about microRNAs. Nat. Med. 14, 1036–1040. doi: 10.1038/nm1008-1036.

Araújo, S. S., Beebe, S., Crespi, M., Delbreil, B., González, E. M., Gruber, V., Lejeune-henaut, I., Link, W., Monteros, M. J., Rao, I.,

Vadez, V. and Patto, M. C. V. (2015) Abiotic stress responses in legumes: strategies used to cope with environmental challenges. CRC Crit.

Rev. Plant Sci. 2689, 37–41. doi: 10.1080/07352689.2014.898450.

Caudle, A. S., Yang, W. T., Mittendorf, E. A. and Kuerer, H. M. (2016) HHS Public Access. 150, 137–143. doi:

10.1001/jamasurg.2014.1086.

Christmas, R., Avila-Campillo, I., Bolouri, H., Schwikowski, B., Anderson, M., Kelley, R., Landys, N., Workman, C., Ideker, T., Cerami,

E., Sheridan, R., Bader, G.D. and Sander, C. (2005) Cytoscape: A software environment for integrated models of bimolecular interaction

networks. Am. Assoc. Cancer Res. Educ. Book 12-16. doi: 10.1101/gr.1239303.

Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M. and Robles, M. (2005) Blast2GO: A universal annotation and visualization

tool in functional genomics research. Application note. Bioinformatics 21, 3674–3676. doi: 10.1093/bioinformatics/bti610.

Cristina, M., Petersen, M. and Mundy, J. (2010) Mitogen-activated protein kinase signaling in plants. Annu. Rev. Plant Biol. 61, 621–649.

doi: 10.1146/annurev-arplant-042809-112252.

Dai, X. and Zhao, P. X. (2011) psRNATarget: A plant small RNA target analysis server. Nucleic Acids Res. 39 (SUPPL. 2), 1–5. doi:

10.1093/nar/gkr319.

Fahlgren, N., Howell, M. D., Kasschau, K. D., Chapman, E. J., Sullivan, C. M., Cumbie, J. S., Givan, S. A., Law, T. F., Grant, S. R., Dangl,

J. L. and Carrington, J. C. (2007) High-throughput sequencing of Arabidopsis microRNAs: Evidence for frequent birth and death of

MIRNA genes. PLoS One 2. doi: 10.1371/journal.pone.0000219.

Graham, P. H. and Vance, C. P. (2003) Legumes: Importance and Constraints to Greater Use. Plant Physiol. 131, 872–877. doi:

10.1104/pp.017004.872.

Hossain, Z., Khatoon, A. and Komatsu, S. (2013) Soybean proteomics for unraveling abiotic stress response mechanism. J. Proteome Res.

12, 4670–4684. doi: 10.1021/pr400604b.

Khraiwesh, B., Zhu, J.-K. and Zhu, J. (2012) Role of miRNAs and siRNAs in biotic and abiotic stress responses of plants. Biochim.

Biophys. Acta. 1819, 137–148. doi: 10.1016/j.bbagrm.2011.05.001.

Kozomara, A. and Griffiths-Jones, S. (2014) miRBase: Annotating high confidence microRNAs using deep sequencing data. Nucleic Acids

Res. 42 (D1), 68–73. doi: 10.1093/nar/gkt1181.

Kulcheski, F. R., de Oliveira, L. F., Molina, L. G., Almerão, M. P., Rodrigues, F. a, Marcolino, J., Barbosa, J. F., Stolf-Moreira, R.,

Nepomuceno, A. L., Marcelino-Guimarães, F. C., Abdelnoor, R. V, Nascimento, L. C., Carazzolle, M. F., Pereira, G. A. and Margis, R.

(2011) Identification of novel soybean microRNAs involved in abiotic and biotic stresses. BMC Genomics 12, 307. doi: 10.1186/1471-

2164-12-307.

Kulcheski, F. R., Molina, L. G., da Fonseca, G. C., de Morais, G. L., de Oliveira, L. F. V and Margis, R. (2016) Novel and conserved

microRNAs in soybean floral whorls. Gene 575, 213–223. doi: 10.1016/j.gene.2015.08.061.

Page 7: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

7

Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F. and Hofacker, I. L. (2011) ViennaRNA Package

2.0. Algorithms Mol. Biol. 6, 26. doi: 10.1186/1748-7188-6-26.

Michelfelder, A. J. (2009) Soy: A complete source of protein Am. Fam. Physician 79, 43–47.

Nikitin, A., Egorov, S., Daraselia, N. and Mazo, I. (2003) Pathway studio - The analysis and navigation of molecular networks.

Bioinformatics 19,2155–2157. doi: 10.1093/bioinformatics/btg290.

Nizampatnam, N. R., Schreier, S. J., Damodaran, S., Adhikari, S. and Subramanian, S. (2015) MicroRNA160 dictates stage-specific auxin

and cytokinin sensitivities and directs soybean nodule development. Plant J. 84, 140–153. doi: 10.1111/tpj.12965.

Passardi, F., Cosio, C., Penel, C. and Dunand, C. (2005) Peroxidases have more functions than a Swiss army knife. Plant Cell Rep. 24, 255–

265. doi: 10.1007/s00299-005-0972-6.

Rajagopalan, R., Vaucheret, H., Trejo, J. and Bartel, D. P. (2006) A diverse and evolutionarily fluid set of microRNAs in Arabidopsis

thaliana. Genes Dev. 20, +3407–3425. doi: 10.1101/gad.1476406.

Scardoni, G., Laudanna, C. and Tosadori, G. (2014) CentiScaPe : Network centralities for Cytoscape. 1–11.

Serraj, R. (2003) Effects of drought stress on legume symbiotic nitrogen fixation: Physiological mechanisms. Indian J. Exp. Bio. 41, 1136–

1141.

Song, Q.-X., Liu, Y.-F., Hu, X.-Y., Zhang, W.-K., Ma, B., Chen, S.-Y. and Zhang, J.S. (2011) Identification of miRNAs and their target

genes in developing soybean seeds by deep sequencing. BMC Plant Bio. 11, 5. doi: 10.1186/1471-2229-11-5.

Srivastava, P.K., Moturu, T.R., Pandey, P., Baldwin, I.T. and Pandey, S.P. (2014) A comparison of performance of plant miRNA target

prediction tools and the characterization of features for genome-wide target prediction. BMC Genomics 15, 348. doi: 10.1186/1471-2164-

15-348.

Szklarczyk, D., Franceschini, A., Kuhn, M., Simonovic, M., Roth, A., Minguez, P., Doerks, T., Stark, M., Muller, J., Bork, P., Jensen, L. J.

and Von Mering, C. (2011) The STRING database in 2011: Functional interaction networks of proteins, globally integrated and scored.

Nucleic Acids Res. 39 (SUPPL. 1), 561–568. doi: 10.1093/nar/gkq973.

Um, J. H., Kim, S., Kim, Y. K., Song, S. B., Lee, S. H., Verma, D. P. S. and Cheon, C. I. (2013) RNA interference-mediated repression of

S6 kinase 1 impairs root nodule development in soybean. Mol. Cells 35, 243–248. doi: 10.1007/s10059-013-2315-8.

Xia, R., Zhu, H., An, Y., Beers, E. P. and Liu, Z. (2012) Apple miRNAs and tasiRNAs with novel regulatory networks. Genome Bio. 13,

R47. doi: 10.1186/gb-2012-13-6-r47.

Yu, A., Li, P., Tang, T., Wang, J., Chen, Y. and Liu, L. (2015) Roles of Hsp70s in Stress Responses of Microorganisms, Plants, and

Animals. Biomed Res. Int. (Figure 1). doi: 10.1155/2015/510319.

Zhang, L., Chia, J. M., Kumari, S., Stein, J. C., Liu, Z., Narechania, A., Maher, C. A., Guill, K., McMullen, M. D. and Ware, D. (2009) A

genome-wide characterization of microRNA genes in maize. PLoS Genet. 5. doi: 10.1371/journal.pgen.1000716.

Zhu, Q. H., Spriggs, A., Matthew, L., Fan, L., Kennedy, G., Gubler, F. and Helliwell, C. (2008) A diverse set of microRNAs and

microRNA-like small RNAs in developing rice grains. Genome Res. 18, 1456–1465. doi: 10.1101/gr.075572.107.

Page 8: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

8

Table Legends

Table1: Glycine max miRNAs targeting multiple transcripts

Table 2: Targets associated to single and multiple miRNAs

Table 3: Pathway related information of the targets

Table 4 : Network parameters of target genes -pre and post node deletion

Figure Legends

Figure 1: Classification of predicted targets according to their biological process

Figure 2: Classification of predicted targets according to their molecular function

Figure 3: Classification of predicted targets according to cellular components

Figure 4: Network of Glycine max miRNA targets

Figure 5: Peroxidase interactions in the network

Figure 6 : HSP70 interactions in the network

Page 9: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

9

Table 1: Glycine max miRNAs targeting multiple transcripts

miRNA id No. of targets Mode of Inhibition

gma-mir9732 10 cleavage

gma-mir5038a 9 cleavage

gma-mir5672 18 cleavage

gma-mir393c-5p 14 cleavage

gma-mir1520f-3p 10 cleavage

gma-mir9739 1 cleavage

gma-mir319p 6 cleavage

gma-mir156v 32 cleavage

gma-mir1520l 12 cleavage

gma-mir169a 12 cleavage

gma-mir2606a 16 cleavage

gma-mir403b 6 cleavage/translation

gma-mir5371-5p 3 cleavage

gma-mir482b-5p 20 cleavage

gma-mir9750 2 cleavage

gma-mir171i-3p 10 cleavage

gma-mir1532 3 cleavage

Table 2: Targets associated to single and multiple miRNAs

Target miRNAs

TC423678 gma-miR1507c-5p

TC437576 gma-miR1508c

TC434615 gma-miR1513a-5p,gma-miR1513b

TC427051 gma-miR1514a-5p

TC420931 gma-miR1514b-5p

TC426727 gma-miR1516a-3p

Page 10: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

10

TC472015 gma-miR1516b

TC457335 gma-miR1534

TC462825 gma-miR1536

FK514297 gma-miR156b,gma-miR156p,gma-miR156t,gma-miR156f

TC425591 gma-miR156b,gma-miR156f

TC439563 gma-miR156f

DB962750 gma-miR156v,gma-miR156b,gma-miR156s,gma-miR156a,gma-miR156u,gma-miR156q,gma-

miR156h,gma-miR156y,gma-miR156g,gma-miR156x,gma-miR156w,gma-miR156f

TC430891 gma-miR156v,gma-miR156b,gma-miR156s,gma-miR156a,gma-miR156u,gma-miR156q,gma-

miR156h,gma-miR156y,gma-miR156g,gma-miR156x,gma-miR156w,gma-miR156f

TC466550 gma-miR156v,gma-miR156b,gma-miR156s,gma-miR156a,gma-miR156u,gma-miR156q,gma-

miR156h,gma-miR156y,gma-miR156g,gma-miR156x,gma-miR156w,gma-miR156f

TC437112 gma-miR171d

GE083705 gma-miR171k-5p,gma-miR171l

TC427597 gma-miR172i-3p,gma-miR172d,gma-miR172l,gma-miR172e,gma-miR172c

TC434089 gma-miR172i-3p,gma-miR172a,gma-miR172f,gma-miR172d,gma-miR172l,gma-miR172e,gma-

miR172h-3p,gma-miR172b-3p,gma-miR172c,gma-miR172k

TC457282 gma-miR172i-3p,gma-miR172a,gma-miR172f,gma-miR172d,gma-miR172l,gma-miR172e,gma-

miR172h-3p,gma-miR172b-3p,gma-miR172c,gma-miR172k

AW351184 gma-miR2108a,gma-miR2108b

EH222473 gma-miR2108a,gma-miR2108b

DB971500 gma-miR395l,gma-miR395i,gma-miR395k,gma-miR395h,gma-miR395j,gma-miR395m

TC489898 gma-miR396k-3p

TC444778 gma-miR397b-5p,gma-miR397a

TC479131 gma-miR4382

Page 11: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

11

Table 3: Pathways related information of the targets

Pathway Seqs in

Pathway

Enzyme Enzyme ID Sequence ID Pathway ID

Zeatin biosynthesis 3 dimethylallyltransferase ec:2.5.1.27 TC445284 map00908

Zeatin biosynthesis 3 dimethylallyltransferase ec:2.5.1.75 TC470457, TC462920, TC445284 map00908

Valine, leucine and isoleucine

degradation

4 C-acetyltransferase ec:2.3.1.9 TC430049 map00280

Valine, leucine and isoleucine

degradation

4 dehydrogenase ec:1.1.1.31 TC462912 map00280

Valine, leucine and isoleucine

degradation

4 dehydrogenase (NAD+) ec:1.2.1.3 TC457335, TC482610 map00280

Valine, leucine and isoleucine

biosynthesis

2 synthase ec:2.2.1.6 TC441223, BW669215 map00290

Tyrosine metabolism 11 oxidase ec:1.10.3.1 NP8367007, TC461966 map00350

Tyrosine metabolism 11 dehydrogenase ec:1.1.1.1 TC432095, TC444770, TC424618 map00350

Page 12: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

12

Table 4: Network parameters of target genes- pre and post node deletion

Target Name

Clustering

co-efficient

Network centralization characteristic path length Network heterogeneity

Before

node

deletion

After

node

deletion

Before

node

deletion

After

node

deletion

Before

node

deletion

After

node

deletion

Before

node

deletion

After

node

deletion

HSP70 0.225 0 0.521 0.637 1.471 2.103 0.331 1.338

MAPK 0.593 0.590 0.695 0.421 1.629 1.363 0.730 0.819

RPS6 0.990 0.989 0.011 0.012 1.010 1.011 0.020 0.021

GMIPER

(Peroxidase)

0.866 0.851 0.579 0.058 1.524 1.111 0.276 0.153

Page 13: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

13

Figure 1: Classification of predicted targets according to their Biological Process

Figure 2: Classification of predicted targets according to their molecular function

Page 14: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

14

Figure 3: Classification of predicted targets according to cellular components

Page 15: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

15

Figure 4: Network of Glycine max miRNA targets

Page 16: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

16

Figure 5: Peroxidase interactions in the network

Page 17: Functional annotation and network analysis of predicted miRNA …biomathsociety.in/volume3/paper2.pdf · 2018-08-26 · Glycine max, miRNA targets, Functional annotation, ... The

17

Figure 6: HSP70 interactions in the network