dietrich - migration and interaction tracking for...


Page 1: dietrich - Migration and interaction tracking for ...me17.bioinf.uni-leipzig.de/abstracts/poster/poster.pdfpathogens, live-cell imaging is a modern approach to monitor the dynamic

Migration and interaction tracking for quantitative analysis of phagocyte-pathogen confrontation assays

Susanne Brandes [1,2], Stefanie Dietrich* [1,2], Kerstin Hünniger [3], Oliver Kurzai [3,4], Marc Thilo Figge [1,2]

[1] Applied Systems Biology, Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute, Jena, Germany
[2] Friedrich Schiller University, Jena, Germany
[3] Septomics Research Center, Friedrich Schiller University and Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute, Jena, Germany
[4] Institute for Hygiene and Microbiology, University of Würzburg, Würzburg, Germany

[email protected]

Invasive fungal infections are emerging as a significant health risk for humans. The innate immune system is the first line of defence against invading micro-organisms and involves the recruitment of phagocytes, which engulf and kill pathogens, to the site of infection. Live-cell imaging is a modern approach to monitoring the dynamic process of phagocytosis in time and space, and thus to gaining a quantitative understanding of the interplay between phagocytes and fungal pathogens. However, it requires processing large amounts of video data, which is tedious to do manually.

Here, we present a novel framework, called AMIT (algorithm for migration and interaction tracking, [1]), that enables automated high-throughput analysis of multi-channel time-lapse microscopy videos of phagocyte–pathogen confrontation assays. The framework is based on our previously developed segmentation and tracking framework for non-rigid cells in brightfield microscopy [2]. We present an advancement of this framework that segments and tracks different cell types in different video channels and also tracks the interactions between them.

For the confrontation assays of polymorphonuclear neutrophils (PMNs) and Candida glabrata considered in this work, the main focus lies on the correct detection of phagocytic events. To achieve this, we introduced different PMN states and a state-transition model that represents the basic principles of phagocyte-pathogen interactions. The framework is validated by a direct comparison of the automatically detected phagocytic activity of PMNs with a manual analysis and by a qualitative comparison with previously published analyses [3, 4]. We demonstrate the potential of our algorithm by comprehensive quantitative and multivariate analyses of confrontation assays involving human PMNs and the fungus C. glabrata.

[1] Brandes, S., Dietrich, S., Hünniger, K., Kurzai, O., Figge, M.T. (2016). Migration and Interaction Tracking for Quantitative Analysis of Phagocyte-Pathogen Confrontation Assays. Medical Image Analysis, 36 (Nov.), 172-183.
[2] Brandes, S., Mokhtari, Z., Essig, F., Hünniger, K., Kurzai, O., Figge, M.T. (2015). Automated segmentation and tracking of non-rigid objects in time-lapse microscopy videos of polymorphonuclear neutrophils. Medical Image Analysis, 20 (1), 34-51.
[3] Duggan, S., Essig, F., Hünniger, K., Mokhtari, Z., Bauer, L., Lehnert, T., Brandes, S., Häder, A., Jacobsen, I.D., Martin, R., Figge, M.T., Kurzai, O. (2015). Neutrophil activation by Candida glabrata but not Candida albicans promotes fungal uptake by monocytes. Cellular Microbiology, 17 (May), 1259-1276.
[4] Essig, F., Hünniger, K., Dietrich, S., Figge, M.T., Kurzai, O. (2015). Human Neutrophils dump Candida glabrata after intracellular killing. Fungal Genetics and Biology, 84 (Nov.), 37-40.


Evolution of transcription activator-like effectors in Xanthomonas oryzae

Annett Erkes1, Maik Reschke2, Jens Boch2, Jan Grau1

1Institute of Computer Science, Martin Luther University Halle–Wittenberg
2Department of Plant Biotechnology, Leibniz Universität [email protected]

Plant-pathogenic Xanthomonas bacteria employ transcription activator-like effectors (TALEs) that bind to the promoter of plant genes and activate their transcription. Xanthomonas infections result in a substantial yield loss for many crop plants including rice. The binding domain of TALEs consists of tandem repeats of approximately 34 amino acids. These contain two hypervariable amino acids at positions 12 and 13, which are called repeat variable di-residue (RVD). Each RVD recognizes one nucleotide of its target DNA and the consecutive array of RVDs determines TALE target specificity.
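The RVD readout described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the repeat template in `make_toy_repeat` is a hypothetical 34-residue sequence, and `RVD_CODE` is a deliberately simplified one-to-one mapping (NN, for example, is also known to tolerate A).

```python
# Simplified RVD-to-nucleotide code (assumption: one preferred base per RVD).
RVD_CODE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G", "NS": "N"}


def make_toy_repeat(rvd: str) -> str:
    """Build a hypothetical 34-residue repeat with the RVD at positions 12-13."""
    return "LTPEQVVAIAS" + rvd + "GGKQALETVQRLLPVLCQAHG"


def extract_rvd(repeat: str) -> str:
    """Return the residues at positions 12 and 13 (1-based) of one repeat."""
    if len(repeat) < 13:
        raise ValueError("repeat too short to contain an RVD")
    return repeat[11:13]


def predict_target(repeats) -> str:
    """Translate the consecutive RVDs of a repeat array into a target box."""
    return "".join(RVD_CODE.get(extract_rvd(r), "N") for r in repeats)


repeats = [make_toy_repeat(rvd) for rvd in ("NI", "HD", "NG", "NN")]
print(predict_target(repeats))
```

Unknown RVDs fall back to `N` (any nucleotide) so that the sketch never guesses a specificity it does not know.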

Different Xanthomonas strains possess different repertoires of TALEs, which have likely evolved from a common ancestor TALE. Here, we study the evolution of TALEs from the level of RVDs determining target specificity down to the level of DNA sequence with focus on rice-pathogenic Xanthomonas oryzae pv. oryzae and Xanthomonas oryzae pv. oryzicola strains.

We discover that only one to three codon pairs coding for one RVD occur in known TALEs, even though the number of theoretically possible codon pairs is substantially larger. We base our analysis on aligned sequences within classes of significantly similar TALEs from published Xanthomonas genomes as generated by AnnoTALE [GRE+15], because TALEs within one class are likely evolutionarily related.
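To make "theoretically possible codon pairs" concrete, the following sketch counts them for a given RVD under the standard genetic code. The function names are illustrative assumptions, and the example says nothing about which of these pairs actually occur in TALEs.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid: str):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def codon_pairs(rvd: str):
    """All codon pairs that could encode a two-residue RVD such as 'HD'."""
    first, second = rvd
    return [(a, b) for a in codons_for(first) for b in codons_for(second)]


for rvd in ("HD", "NI", "NG", "NS"):
    print(rvd, len(codon_pairs(rvd)))
```

For HD this yields 2 x 2 = 4 possible pairs and for NS 2 x 6 = 12, which illustrates how small "one to three" observed pairs is by comparison.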

We compare the aligned RVDs of class members and find synonymous substitutions only for two RVDs, whereas the remaining substitutions lead to a modification of the RVD. Most frequently, only one nucleotide is substituted between the alternative RVDs among class members.

Although even the flanking sequences of RVDs are highly conserved, the nucleotides at some of the flanking positions show dependencies on the RVD type. We train a classifier for distinguishing RVD types by their flanking region alone and apply it to repeats which show a substitution in the alignment. In the majority of cases, this classifier assigns all such repeats of a common class to the same RVD type, although we observe substitutions in the RVD. Our findings indicate that one way in which TALE specificities evolve is by direct base substitutions in RVD codons.

In summary, we find strong indications that TALEs may evolve i) by base substitutions in codon pairs coding for RVDs, ii) by recombination of N-terminal or C-terminal regions of existing TALEs, or iii) by deletion of individual TALE repeats, and we propose putative mechanisms.

We finally study the effect of the presence/absence of TALEs and evolutionary modifications in TALEs on transcriptional activation of putative target genes in rice, and find that even single RVD swaps may lead to considerable differences in activation and that the effect of RVD swaps may depend on the specific target box.

References

[GRE+15] Jan Grau, Maik Reschke, Annett Erkes, Jana Streubel, Richard D. Morgan, Geoffrey G. Wilson, Ralf Koebnik, and Jens Boch. AnnoTALE: bioinformatics tools for identification, annotation, and nomenclature of TALEs from Xanthomonas genomic sequences. Scientific Reports, 6(21077), 2015.


The tip and hidden part of the iceberg: Proteinogenic and non-proteinogenic aliphatic amino acids

Maximilian Fichtner1, Stefan Schuster1, Kerstin Voigt2

1Dept. of Bioinformatics, University of Jena
2Jena Microbial Resource Collection (JMRC)

[email protected]

Background: Amino acids are the essential building blocks of proteins and, therefore, living organisms. While the focus often lies on the canonical or proteinogenic amino acids, there is also a large number of non-canonical amino acids to explore. Some of them are part of toxins or antibiotics in fungi, bacteria or animals (e.g. sponges). Some others operate at the translational level like an “undercover agent”.

Scope of review: Here we give an overview of natural aliphatic amino acids, up to a side chain length of five carbons, without rings and with an unmodified backbone, and take a closer look at each of them. Some of them are dehydro amino acids with double or even triple bonds. Moreover, we outline mathematical methods for enumerating the complete list of all potential aliphatic amino acids of a given chain length. This should be of interest for synthetic biology.
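As a rough illustration of what such an enumeration involves, the sketch below counts saturated, acyclic alkyl side chains by carbon number. It is an assumption-laden simplification and not the method outlined in the review: stereoisomers are ignored, double and triple bonds are excluded, and the function name is hypothetical. Each side chain is treated as a rooted tree in which the attachment carbon carries an unordered set of three substituents (hydrogen or smaller alkyl groups).

```python
from math import comb


def count_alkyl_groups(max_carbons: int):
    """Return a list a where a[n] is the number of alkyl groups with n carbons."""
    a = [1]  # a[0] = 1 stands for a bare hydrogen substituent
    for n in range(1, max_carbons + 1):
        rest = n - 1  # carbons distributed over the three substituents
        total = 0
        for i in range(rest // 3 + 1):
            for j in range(i, (rest - i) // 2 + 1):
                k = rest - i - j  # sizes satisfy i <= j <= k
                if i == j == k:
                    total += comb(a[i] + 2, 3)  # multiset of three from a[i]
                elif i == j:
                    total += comb(a[i] + 1, 2) * a[k]
                elif j == k:
                    total += a[i] * comb(a[j] + 1, 2)
                else:
                    total += a[i] * a[j] * a[k]
        a.append(total)
    return a


print(count_alkyl_groups(5))
```

Under these assumptions there are two three-carbon side chains (propyl and isopropyl), four with four carbons and eight with five carbons; admitting unsaturated bonds would enlarge the list further.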

Major conclusions: Most non-proteinogenic amino acids are found within fungi, with particularly many produced by Amanita species as defence chemicals. Several are incorporated into peptide antibiotics. Some of the amino acids occur due to broad substrate specificity of the branched-chain amino acid synthesis pathways. A large variety of amino acids were also found in the Murchison meteorite.

General significance: Non-proteinogenic amino acids are of interest for numerous medical applications: discovery of new antibiotics, support in designing synthetic antibiotics, improvement of protein and peptide pharmaceuticals by avoiding incorporation of non-canonical amino acids, study of toxic cyanobacteria and other applications.


Supervoxel-based image segmentation

Friebel A.1, Johann T.1,3, Drasdo D.2,3, Hoehme S.1,#

1.) Institute of Computer Science, University of Leipzig, Germany
2.) INRIA, Paris, France
3.) IZBI, University of Leipzig, Germany
# Corresponding author

During the last decade, advances in imaging technology led to an enormous increase in the amount of produced image data. A multitude of microscopy techniques generates a variety of image types such as two-dimensional whole slide scans, three-dimensional image stacks or time-resolved image sequences. To obtain segmentations, image processing pipelines are often tailored specifically to a certain image setup, e.g. in terms of dimensionality and staining. Hence, such pipelines are typically limited in their applicability to different image setups. Engineering accordingly adapted pipelines can be a very time-consuming process, and it requires the user to be trained and to understand the inner workings of the algorithms. This greatly hampers their usability in the wet lab. It is therefore of great importance to develop image processing methods that can process a variety of image types and segment diverse tissue structures with only few user interventions, within a unified, easy-to-use and intuitive image analysis tool.

We present a novel image segmentation methodology implemented in our image processing and analysis tool TiQuant that allows untrained users without image processing expertise to segment two- and three-dimensional images in an interactive, nearly parameter-free and standardized way. The tool produces an initial oversegmentation of an image into so-called supervoxels, which are analyzed for features such as local and neighborhood color histograms, texture and gradients. Users can provide labeled training data for an image dataset through a convenient drawing interface. The features of the labeled data are used to train SVM or random forest classifiers, which are then used to predict class-membership probabilities of the dataset's supervoxels. Segmentations are produced by thresholding those probabilities and can be post-processed with size-based object removal, smoothing and watershed filters to remove small artificial objects, clean up boundaries and separate clustered objects.

TiQuant provides a graphical user interface to guide this workflow as well as a batch mode to process a large number of images using previously trained and saved classifiers. The tool will be freely available upon publication. In our contribution to the Mittelerde Meeting 2017, we will introduce our supervoxel-based image processing module with an emphasis on the underlying methods as well as the strengths, weaknesses and potential of the approach.
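The thresholding and size-based object removal steps can be illustrated with a small, self-contained Python sketch. It is not TiQuant code: the function name, the dictionary-based data layout and the default parameters are assumptions made for illustration, and the classifier that would supply the probabilities is left out.

```python
from collections import deque


def segment(probabilities, adjacency, sizes, threshold=0.5, min_size=10):
    """Threshold supervoxel probabilities and drop small connected objects.

    probabilities: supervoxel id -> predicted foreground probability
    adjacency:     supervoxel id -> iterable of neighbouring supervoxel ids
    sizes:         supervoxel id -> number of voxels in that supervoxel
    """
    foreground = {sv for sv, p in probabilities.items() if p >= threshold}
    objects, seen = [], set()
    for start in sorted(foreground):
        if start in seen:
            continue
        # Breadth-first search collects one connected foreground object.
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            sv = queue.popleft()
            component.add(sv)
            for nb in adjacency.get(sv, ()):
                if nb in foreground and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        # Size-based removal: keep only objects with enough voxels.
        if sum(sizes[sv] for sv in component) >= min_size:
            objects.append(component)
    return objects


probabilities = {0: 0.9, 1: 0.8, 2: 0.2, 3: 0.7}
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
sizes = {0: 8, 1: 6, 2: 5, 3: 4}
print(segment(probabilities, adjacency, sizes))
```

In this toy input, supervoxels 0 and 1 form one object of 14 voxels and are kept, while the isolated supervoxel 3 falls below the size limit and is removed.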


Diversity and evolution of the developmental transcriptomes in flowering plants

Alexander Gabel1,2, Christoph Schuster3, Hajk-Georg Drost3, Ivo Grosse1,2, and Elliot M. Meyerowitz3,4

1Institute of Computer Science, Martin Luther University Halle-Wittenberg, Germany
2German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
3The Sainsbury Laboratory, University of Cambridge, United Kingdom
4Division of Biology and Biological Engineering and Howard Hughes Medical Institute, California Institute of Technology, USA

[email protected], [email protected],

[email protected], [email protected], [email protected]

For the past decades, Arabidopsis thaliana has been one of the most studied plant organisms, providing a well-annotated genome, and recent RNA-seq analyses have uncovered a broad range of protein-coding and non-coding transcripts [1, 2, 3, 4]. For related flowering plant species, however, our knowledge about the transcriptional diversity is limited. Most transcriptome analyses and annotations rely only on protein-coding genes, whereas the diversity and function of the non-coding transcriptome are poorly studied and poorly understood. Using strand-specific total RNA-seq, we investigate the developmental transcriptomes of nine different organs from seven related flowering plant species, and identify new splice variants, natural antisense transcripts (NATs), long intergenic RNAs (lincRNAs) and other non-coding transcripts. We develop comparative approaches for predicting those transcripts and for providing evidence for their in-vivo function. We apply those approaches with the goal of contributing to the identification of conserved transcripts and conserved expression patterns, leading to a deeper understanding of the evolution of expression patterns of protein-coding and non-coding transcripts in flowering plants.

References

[1] Song Li et al. Integrated detection of natural antisense transcripts using strand-specific RNA sequencing data. pages 1730–1739, 2013.

[2] Runxuan Zhang et al. Rapid report AtRTD - a comprehensive reference transcript dataset resource for accurate quantification of transcript-specific expression in Arabidopsis thaliana. New Phytologist, 208:96–101, 2015.

[3] Song Li et al. High resolution expression map of the Arabidopsis root reveals alternative splicing and lincRNA regulation. Developmental Cell, in press(4):508–522, 2016.

[4] Chia-Yi Cheng et al. Araport11: a complete reannotation of the Arabidopsis thaliana reference genome. The Plant Journal, 89:789–804, 2017.



Mittelerde Meeting, Leipzig, 2017

PoSeiDon: a web server for the detection of evolutionary recombination events and positive selection

Martin Hölzer1-2 and Manja Marz1-6

1Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Germany
2European Virus Bioinformatics Center (EVBC), Friedrich Schiller University Jena, Germany
3FLI Leibniz Institute for Age Research, Jena, Germany
4Michael Stifel Center Jena, Germany
5Aging Research Center (ARC), Jena, Germany
6German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Germany

Question

Viral infections are thought to be one of the major driving forces of selective pressure acting on living organisms. So-called 'arms races' between host and virus result in high selection pressure on the host to evolve a defense against the pathogens, while the virus itself establishes countermeasures to evade the host immune system. One approach to studying selection pressure involves the comparison of orthologous genes to detect sites (codons) that underwent positive (diversifying) selection. For instance, certain amino acid changes are favored if they increase the host's fitness against a viral infection. In that case, the ratio between the non-synonymous (dN) and synonymous (dS) substitution rates may reach values greater than 1, and we call such a site positively selected. Since recombination can have a profound impact on evolutionary processes and can adversely affect the accurate detection of positive selection, screening for breakpoints to define recombinant parts should be a default step in each comparative evolutionary study.
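The dN/dS criterion can be written down in a few lines. The sketch below is a simplified illustration with hypothetical function names and made-up per-site rates; it only applies the ratio rule stated above and does not reproduce the statistical tests that a pipeline such as PoSeiDon would run.

```python
def omega(dn: float, ds: float) -> float:
    """Ratio of non-synonymous to synonymous substitution rates (dN/dS)."""
    if ds == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn / ds


def positively_selected_sites(rates):
    """Return the sites whose dN/dS exceeds 1.

    rates: site index -> (dN, dS); sites with dS == 0 are skipped.
    """
    return [
        site
        for site, (dn, ds) in sorted(rates.items())
        if ds > 0 and omega(dn, ds) > 1
    ]


# Made-up example rates for three codon sites.
rates = {1: (0.1, 0.5), 2: (0.9, 0.3), 3: (0.4, 0.4)}
print(positively_selected_sites(rates))
```

Site 3 has a ratio of exactly 1 and is therefore not reported; only ratios strictly greater than 1 count as positive selection under this rule.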

Materials and Methods

We developed a pipeline called PoSeiDon that comprises an assembly of different scripts and tools (Fig. 1) that allow for the detection of recombination and positive selection in protein-coding sequences. The input is a multiple FASTA file of homologous coding sequences that is automatically converted into a codon-based alignment. Based on the alignment, PoSeiDon calculates putative recombination breakpoints, maximum-likelihood-based phylogenetic trees, dN/dS ratios at each site and their impact on positive selection.

Results

With this, PoSeiDon is an easy-to-use, web-based pipeline for the accurate detection of site-specific positive selection and recombination events in protein-coding sequences. PoSeiDon detects not only positively selected sites in an alignment of homologous sequences, but also possible recombination events that could otherwise adversely affect the detection of positive selection. The output of PoSeiDon is summarized in a user-friendly and clear manner, allowing researchers to study positive selection in various genes.

PoSeiDon is freely available at http://www.rna.uni-jena.de/poseidon.

Fig. 1: Workflow of the PoSeiDon pipeline and example of the HTML output.


Modeling the metabolic reprogramming of macrophage activation

Franziska Hörhold [1,2], Marcus Oswald [1,2], Amol Kolte [1,2], Rainer König [1,2]

Affiliations:
[1] Associated Group of Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute, Jena, Germany
[2] Center for Sepsis Control and Care, University Hospital, Jena, Germany

E-mail address: [email protected]

In response to their microenvironment, resting macrophages can polarize mainly into two distinct phenotypes: M1 macrophages (classical activation, e.g. by microbial products or IFN-γ) or M2 macrophages (alternative activation, e.g. by IL-4). M1 macrophages can produce pro-inflammatory mediators such as TNF-α, IL-1 and nitric oxide [3] and can mediate an inflammatory response leading to a cytokine storm that triggers sepsis. In contrast, M2 macrophages show a rather immunosuppressive behavior during sepsis [5]. Hence, a central aim in critical care is to reduce M1 and induce M2 activation. A metabolic switch between M1 and M2 macrophages has been described: aconitate decarboxylase, which synthesizes itaconate from cis-aconitate, is upregulated in M1 macrophages [1], and M1 macrophages modulate the citrate cycle such that nitric oxide is produced in the urea cycle [1]. Other studies showed that a transition from M1 to M2 macrophages can be induced by IL-4 [4,6].

To understand the switch in the regulation of metabolism, we investigate the regulation using our newly developed tool [7] and model metabolic fluxes to identify regulators that are responsible for the metabolic switch. Methodologically, we show an approach based on mixed-integer linear programming, which integrates 13C labeling data to predict metabolic fluxes. It contains a new method to remove futile cycles, which will be compared to a method from the constraint-based reconstruction and analysis (COBRA) framework [2]. Our study revealed promising results that help to elucidate the metabolic and transcriptional reprogramming of M1 and M2 macrophages after activation.

References:
[1] Jha, A.K., Huang, S. C.-C., Sergushichev, A., Lampropoulou, V., Ivanova Y., et al (2015). Network Integration of Parallel Metabolic and Transcriptional Data Reveals Metabolic Modules that Regulate Macrophage Polarization. Immunity, 42, 419-430. doi:10.1016/j.immuni.2015.02.005.
[2] Schellenberger, J., Lewis N. E., and Palsson, B. Ø (2011). Elimination of Thermodynamically Infeasible Loops in Steady-State Metabolic Models. Biophysical Journal, 100, 544-553. doi:10.1016/j.bpj.2010.12.3707.
[3] Stearns-Kurosawa, D.J., Osuchowski, M.F., Valentine, C., Kurosawa, S., Remick, D.G. (2011). Pathology in Sepsis. Annual review of pathology: mechanisms of disease, 6, 19-48. doi:10.1146/annurev-pathol-011110-130327.
[4] Spiller, K.L., Nassiri, S., Witherel, C.E., Anfang R.R., Ng, J., et al (2015). Sequential delivery of immunomodulatory cytokines to facilitate the M1-to-M2 transition of the macrophages and enhance vascularization of bone scaffolds. Biomaterials, 37, 194-207. doi:10.1016/j.biomaterials.2014.10.017.
[5] Wynn, T.A., Chawla, A, Pollard, J.W. (2013). Macrophage biology in development, homeostasis and disease. Nature, 496, 445-455. doi:10.1038/nature12034.
[6] Zheng, X.-F., Hong, Y.-X., Feng, G.-J., Zhang, G.-F., Rogers, H. et al (2013). Lipopolysaccharide-Induced M2 to M1 Macrophage Transformation for IL-12p70 Production Is Blocked by Candida albicans Mediated Up-regulation of EBI3 Expression. PLoS ONE, 8, e63967. doi:10.1371/journal.pone.0063967.
[7] Poos AM, Maicher A, Dieckmann AK, Oswald M, Eils R, Kupiec M, Luke B, König R (2016). Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast. Nucleic Acids Research, 44, e93.
[8] Bordbar, A., Mo, M.L., Nakayasu, E.S., Schrimpe-Rutledge, A.C., Palsson B.O., et al (2012). Model-driven multi-omic data analysis elucidates metabolic immunomodulators of macrophage activation. Molecular Systems Biology, 8:558.


Extended Sunflower Hidden Markov Model for the de-novo recognition of cis-regulatory modules

Ioana Lemnian1, Ralf Eggeling2, and Ivo Grosse1,3

Affiliations:
1 Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
2 Department of Computer Science, University of Helsinki, Finland
3 German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany

The transcription of genes is often regulated not only by transcription factors binding at single sites per promoter, but by the interplay of multiple copies of one or more transcription factors binding at multiple sites forming a cis-regulatory module. The computational recognition of cis-regulatory modules from ChIP-seq or other high-throughput data is crucial in elucidating the regulatory mechanisms of gene expression at the transcriptional level. The Sunflower Hidden Markov Model is a promising statistical model for their recognition, since it allows for multiple occurrences of multiple motifs. However, this model may not be well suited for de-novo motif discovery, since it neglects statistical dependencies among nucleotides within binding sites and flanking regions. Recent studies have shown that taking such dependencies into account improves the discovery of single cis-regulatory elements, so we speculate that the same effect also applies to the de-novo detection of entire cis-regulatory modules. Thus, we propose an extension of the Sunflower Hidden Markov Model that allows statistical dependencies within binding sites, their reverse complements, and flanking regions. We study the efficacy of this extended Sunflower Hidden Markov Model based on ChIP-seq data from the Human ENCODE Project and find that it often outperforms the traditional Sunflower Hidden Markov Model.


High-Throughput Quantification of Pavement Cell Shape Characteristics

Birgit Möller1, Yvonne Pöschl1,2, Romina Plötner3 and Katharina Bürstenbinder3

1 Institute of Computer Science, Martin Luther University Halle-Wittenberg
2 iDiv, German Integrative Research Center for Biodiversity, Halle-Jena-Leipzig
3 Department of Molecular Signal Processing, Leibniz Institute of Plant Biochemistry, Halle

Pavement cells (PCs) are the most frequently occurring cell type in the leaf epidermis and play important roles in controlling leaf growth. PCs differentiate from simple-shaped cells into highly complex jigsaw-puzzle-shaped cells with interlocking lobes. Understanding their development is of high interest for plant science research because of their importance for leaf growth and hence for plant fitness and crop yield. Studies of PC development, however, are still limited because robust methods are lacking that enable automatic segmentation of PCs from microscope images and quantification of shape parameters suitable to reflect their cellular complexity.

Here, we present our new ImageJ-based open-source tool for fully automatic analysis of images and PC shape quantification. The tool automatically detects cell boundaries of PCs from confocal input images and simultaneously extracts 22 shape features including global, contour-based, skeleton-based and PC-specific object descriptors. To quantify PC-specific shape characteristics such as lobe number and lobe size, we propose a new algorithm based on the analysis of shape curvature and curvature inflection points. Along with the ImageJ-based tool, we provide an R script for graphical visualization and statistical analysis of extracted features.
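The idea of counting curvature inflection points along a cell contour can be illustrated as follows. This Python sketch is a crude stand-in and not the authors' algorithm: it uses the sign of the turning direction at each vertex of a closed, counter-clockwise polygon instead of a smoothed curvature estimate, and all function names are hypothetical.

```python
def turning_signs(contour):
    """Sign of the turn at each vertex of a closed polygon (+1 left, -1 right)."""
    n = len(contour)
    signs = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = contour[i - 1], contour[i], contour[(i + 1) % n]
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross != 0:  # skip collinear vertices, which carry no turn
            signs.append(1 if cross > 0 else -1)
    return signs


def count_inflections(contour):
    """Number of sign changes of the turning direction around the contour."""
    signs = turning_signs(contour)
    return sum(1 for i in range(len(signs)) if signs[i] != signs[i - 1])


def count_convex_runs(contour):
    """Crude lobe proxy: each lobe is bounded by two inflection points."""
    return count_inflections(contour) // 2


# Four-pointed star, listed counter-clockwise: outer tips alternate with necks.
star = [(2, 0), (0.5, 0.5), (0, 2), (-0.5, 0.5),
        (-2, 0), (-0.5, -0.5), (0, -2), (0.5, -0.5)]
print(count_inflections(star), count_convex_runs(star))
```

A purely convex outline such as a square has no sign changes and therefore no lobes under this proxy, while the star yields eight inflection points and four convex runs. Real contours extracted from images would need smoothing first, since pixel noise creates spurious sign changes.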

We present quantitative experimental results to demonstrate the efficiency of the tool for analyzing PC shape characteristics during development and between different genotypes. Thus, with this tool we provide a platform for performing robust, efficient and reproducible quantitative analysis of PC shape characteristics to study PC development in large data sets.


Workflow for processing large-scale non-target screening mass spectrometry data for analytical and statistical purposes

Nontarget LC-MS data has been at the forefront of environmental research in recent years. Many programs and libraries for processing and interpreting such data are currently available and are great assets for substance identification. While automated nontarget workflows already exist, they usually focus on specific problems, such as trend detection in time series.

The main purpose of the workflow described here is to agglomerate these approaches and additionally to provide tools that simplify comprehensive quality control and post-processing. To date, no such agglomeration of pre-existing algorithms from different sources exists.

The workflow is envisioned to encompass all necessary steps, from reading mzML, mzXML or netCDF files to obtaining a dataset with all known and tentative information annotated. The annotations include, but are not limited to, alignment, retention time shift and componentization of the aligned peaks.

The workflow is also designed with accessibility in mind, both for programmers wishing to extend the package and for end users looking for reliable and interpretable results. Several visualization tools will be included in the graphical UI to simplify the search for signals of interest in the data and to allow verification of peak and alignment quality. The workflow will be provided as an R package.


Motif computational approach for the discovery of ceRNA in plants

Alexandre Rossi Paschoal 1,2, Irma Lozada-Chávez 1,3, Artur Trancoso Lopo de Queiroz 9, Douglas Silva Domingues 2,4, Peter F. Stadler 1,3,5-8

1 Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany

2 Graduation Program in Bioinformatics (PPGBIOINFO), Department of Computing, Federal University of Technology - Parana, Av. Alberto Carazzai, 1640, 86300-000 Cornélio Procópio, Brazil

3 Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany

4 Department of Botany, Instituto de Biociencias, Universidade Estadual Paulista, Rio Claro, Brazil

5 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, University Leipzig

6 Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103 Leipzig, Germany

7 Center for RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, Denmark

8 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM, 87501

9 Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (FIOCRUZ), Salvador, Bahia, Brazil

The competing endogenous RNA (ceRNA) hypothesis suggests that, under certain conditions, highly abundant MRE-bearing RNAs (also known as target mimics: TMs) have the potential to titrate away miRNAs from their other targets, and thus to compete for post-transcriptional control. The ceRNA hypothesis has gained increasing attention as a potential global regulatory mechanism of miRNAs, and as a powerful tool to predict the function of many non-coding RNAs. Most studies have focused on animals, although TM discoveries as well as significant computational and experimental advances have been made in plants over the last decade.

We provide here a novel bioinformatics approach for the discovery of potential TMs cross-targeting miRNA families in plants, based on the identification of consensus miRNA binding sites from known TMs across sequenced genomes, transcriptomes and known miRNAs. This computational approach is promising because, in contrast to animals, miRNA families in plants are quite large and comprise many loci encoding identical or very similar members, several of which are highly conserved. Our approach encompasses five steps: (i) data collection and filtering of target mimic binding sequences; (ii) clustering and consensus creation of TM motifs; (iii) localization of TM motifs across genomic features; (iv) TM motif match analysis against transcriptomic data; and (v) identification of miRNAs potentially harboring the consensus TM motifs.

After analyzing 40 experimentally validated TM binding sites from six plant species and 24 different miRNA families, we found three candidate TM motifs specific for Cicer arietinum: MIM166, MIM171 and MIM159/319. We show that these TM motifs have the potential to represent endogenous miRNA:target-mimic binding sites in several plant species. In agreement with Wu et al. [1], MIM166 and MIM171 might represent examples of target mimics regulating several members of the same miRNA family. On the other hand, the consensus TM motif MIM159/319 is supported by previous studies suggesting or demonstrating (see for instance Reichel and Millar [3]) the cross-targeting of two different miRNA families by target mimics. Interestingly, miR159 and miR319 are related and ancient miRNA families essential for plant development and fertility, with differential expression patterns and targets throughout the plant [2]. A further step will involve a large-scale extension of this pipeline to analyze 80 complete plant genomes and a larger, manually collected target mimic database.

Funding: ARP: CNPq - Grant no. 00454505/2014-0; ILC: CONACyT’s support no. 185993 and the Max Planck Institute. DSD: CNPq research fellowship.

References:

[1] Wu HJ, Wang ZM, Wang M, et al. (2013) Widespread long noncoding RNAs as endogenous target mimics for microRNAs in plants. Plant Physiol, 161:1875.

[2] Jin D, Wang Y, Zhao Y, et al. (2013) MicroRNAs and their cross-talks in plant development. J Genet Genomics, 40:161.

[3] Reichel M, Millar A. (2015) Specificity of plant microRNA target MIMICs: Cross-targeting of miR159 and miR319. J Plant Physiol, 180:45.
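Step (ii) of the pipeline in this abstract, consensus creation from clustered TM binding sites, can be illustrated with a minimal sketch. It assumes the sites of one cluster are already aligned and of equal length. The function name `consensus_motif`, the `min_fraction` threshold and the toy sequences are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter


def consensus_motif(sites, min_fraction=0.6):
    """Column-wise majority consensus of equal-length, pre-aligned sites.

    A position is reported as 'N' when no single base reaches
    `min_fraction` of the sequences (hypothetical threshold).
    """
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")

    consensus = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(sites) >= min_fraction else "N")
    return "".join(consensus)


# Toy sequences, made up for illustration only.
print(consensus_motif(["UGGAGAAG", "UGGAGAAG", "UGGCGAAG"]))  # UGGAGAAG
print(consensus_motif(["ACGU", "AGGU", "AUGU"]))              # ANGU
```

A real pipeline would first cluster and align the binding sites; this sketch covers only the final voting step.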


A meta-analysis of multiple honeybee transcriptomes using DiRank

Yvonne Poeschl 1,2, Vincent Doublet 1,3, Andreas Gogol-Döring 1,4, Robert J. Paxton 1,5 and Christina M. Grozinger 6

1 iDiv, German Integrative Research Center for Biodiversity, Halle-Jena-Leipzig, Germany

2 Institute of Computer Science, Martin Luther University Halle-Wittenberg, Germany

3 Centre for Ecology and Conservation, University of Exeter, Penryn, UK

4 Technische Hochschule Mittelhessen, Gießen, Germany

5 Institute for Biology, Martin Luther University Halle-Wittenberg, Germany

6 Center for Pollinator Research, Pennsylvania State University, USA

Pathogens and parasites are considered an important cause of insect pollinator decline, threatening the ecosystem service of pollination. Host-pathogen relations are underpinned by intimate molecular interactions. Disparate datasets have been generated on the transcriptome response of the honey bee (Apis mellifera) to pathogens. To explore the commonality in host responses to pathogen challenge, we coordinated two Trans-Bee workshops funded by the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig (www.idiv-biodiversity.de), in which we synthesized current microarray and next-generation sequencing data of 18 published and unpublished transcriptome datasets, generated after infection by the gut pathogens Nosema spp., viruses or the parasitic mite Varroa destructor.

To perform a meta-analysis of these multiple transcriptomes, we developed DiRank, a new rank-based bioinformatics approach that analyses genes based on their expression profile across datasets. Applying this directed Rank Product (DiRank) approach to the synthesis dataset of 18 transcriptomes, we identified a common suite of genes and conserved molecular pathways that respond to all investigated pathogens, a result that suggests a commonality in response mechanisms to diverse pathogens. We found that genes differentially expressed after infection exhibit a higher evolutionary rate than non-differentially expressed genes. Using DiRank and user-defined artificial expression profiles, we unveiled additional pathogen-specific responses of honey bees. Finally, we applied DiRank and generated a gene co-expression network to identify highly connected (hub) genes that may represent important mediators and regulators of anti-pathogen responses.
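The abstract does not spell out how DiRank works, so the following is only a sketch of the classical rank product statistic that its name refers to, not the authors' directed variant. Each gene is ranked within every dataset by its expression change, and the geometric mean of those ranks is taken across datasets. The function name `rank_product` and the toy values are assumptions made for illustration.

```python
import math


def rank_product(datasets):
    """Geometric mean rank per gene across datasets.

    `datasets` is a list of dicts mapping gene -> log fold change.
    Rank 1 is the most strongly up-regulated gene in a dataset, so a
    small rank product marks genes that are consistently up-regulated.
    Only genes present in every dataset are scored.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    genes = set(datasets[0])
    for data in datasets[1:]:
        genes &= set(data)

    log_rank_sum = {gene: 0.0 for gene in genes}
    for data in datasets:
        # Sort by descending fold change; gene name breaks ties deterministically.
        ordered = sorted(genes, key=lambda gene: (-data[gene], gene))
        for rank, gene in enumerate(ordered, start=1):
            log_rank_sum[gene] += math.log(rank)

    k = len(datasets)
    return {gene: math.exp(total / k) for gene, total in log_rank_sum.items()}


# Toy log fold changes, made up for illustration only.
toy = [
    {"geneA": 2.0, "geneB": 0.5, "geneC": -1.0},
    {"geneA": 1.5, "geneB": -0.2, "geneC": 0.3},
]
scores = rank_product(toy)
print(min(scores, key=scores.get))  # geneA
```

In practice the significance of a rank product is usually assessed by permutation, which is omitted here.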

The identified key host genes and pathways that respond to phylogenetically diverse pathogens represent an important source for future functional studies. Besides this, the statistical and bioinformatics approaches that were developed for this study are broadly applicable to synthesize information across transcriptomic datasets. These approaches will likely have utility in addressing a variety of biological questions.


zoo - A concept for a lightweight data container in viral bioinformatics

Abstract

One large obstacle to the rapid prototyping of ideas in viral bioinformatics is the collection of the necessary data. Virus data is curated to varying degrees in multiple, mostly centralized databases, such as NCBI GenBank. One usually has to aggregate data from multiple sources, each with its own formats, conventions and edge cases. With “messy” data, a substantial amount of time has to be spent upfront curating a custom data set for each particular use case, which is tedious. zoo - essentially a document database - aims to accelerate this drastically.

Document (i.e. NoSQL) databases offer distinct advantages in the organisation of biological data, such as easy access to the information in a collection’s document. A “collection” is the equivalent of a table in the relational data model, while a “document” is similar to a row in the table. However, a document schema offers much more flexibility.

In zoo, each collection represents a family in the ICTV nomenclature, and each document is an instance of this family, i.e. a strain. Note that zoo allows taxonomical and phylogenetical searches. Taxonomic search is in accordance with the current ICTV Taxonomy. Phylogenetic search is achieved efficiently using so-called MinHash signatures.

Besides efficiently making data searchable, zoo offers a Python library which connects the database as backend to downstream analyses such as general sequence manipulation (Biopython), machine learning and artificial intelligence (scikit-learn, Keras) and phylogenetic analyses (RAxML, BEAST).
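To make the idea of MinHash signatures concrete, here is a minimal sketch using only the Python standard library. It is not zoo's implementation: the function names, the k-mer size, the number of hash functions and the toy sequences are all assumptions chosen for illustration. The fraction of matching signature positions estimates the Jaccard similarity of the two k-mer sets.

```python
import hashlib


def kmers(sequence, k=4):
    """Set of all overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def minhash_signature(sequence, num_hashes=64, k=4):
    """For each seeded hash function, keep the smallest k-mer hash value."""
    shingles = kmers(sequence, k)
    if not shingles:
        raise ValueError("sequence is shorter than k")
    signature = []
    for seed in range(num_hashes):
        # The salt turns one hash algorithm into a family of hash functions.
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in shingles
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions at which the two signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Toy sequences, made up for illustration only.
sig_1 = minhash_signature("ACGTACGTTGCAACGT")
sig_2 = minhash_signature("ACGTACGTTGCATCGT")
print(estimate_jaccard(sig_1, sig_2))
```

Because signatures have a fixed length regardless of genome size, they can be stored alongside each document and compared quickly.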


ME17 submission 31

Regulation of alternative splicing by the RNA-binding protein atGRP7

Claus Weinholdt 1, Tino Köster 2, Christine Nolte 2, Katja Meyer 2, Dorothee Staiger 2, Ivo Grosse 1,3

1 Institute of Computer Science, Martin Luther University Halle-Wittenberg, Germany

2 Molecular Cell Physiology, Faculty for Biology, University of Bielefeld, Germany

3 German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany

RNA-binding proteins (RBPs) control the fate of mRNAs, including alternative splicing, 3' end formation, nuclear export, and decay, through dynamic interaction with cis-regulatory motifs, thereby forming messenger ribonucleoprotein complexes (mRNPs). In plants, the interplay between trans-acting RBPs and their cis-regulatory motifs is not well understood. One such example is atGRP7, an RBP that is an important regulator of the circadian clock and affects alternative splicing of a diverse set of target RNAs. To gain deeper insight into the function of atGRP7, we performed a combination of RIP-seq and RNA-seq experiments, developed a pipeline for the integrative analysis of these data, and applied that pipeline to the generated data.


A generalization of discrete Morse theory

Sylvia Yaptieu

Abstract

March 21, 2017

A discrete Morse function is one defined on a CW complex such that it locally increases in dimension except possibly in one direction. The extracted vector field has the properties that each cell has either one incoming or one outgoing arrow, but not both, and that there are no closed orbits. A boundary operator is constructed from this vector field configuration by considering the chain complex generated only by the critical cells, that is, the ones without arrows, and this yields the Betti numbers of the complex. One obtains the Morse inequalities, which are inequalities between the numbers of critical cells of fixed indices and the Betti numbers.
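For reference, the classical Morse inequalities mentioned in the paragraph above can be stated as follows. The notation is assumed here and does not come from the abstract: $m_k$ denotes the number of critical cells of dimension $k$ and $b_k$ the $k$-th Betti number of the complex $X$.

```latex
% Weak Morse inequalities
m_k \ge b_k \qquad \text{for all } k \ge 0 .

% Strong Morse inequalities
\sum_{i=0}^{k} (-1)^{k-i}\, m_i \;\ge\; \sum_{i=0}^{k} (-1)^{k-i}\, b_i
\qquad \text{for all } k \ge 0 .

% Equality for the Euler characteristic
\sum_{k \ge 0} (-1)^{k}\, m_k \;=\; \sum_{k \ge 0} (-1)^{k}\, b_k \;=\; \chi(X) .

% Equivalent polynomial form, with R(t) a polynomial
% whose coefficients are non-negative integers
\sum_{k \ge 0} m_k\, t^{k} \;-\; \sum_{k \ge 0} b_k\, t^{k} \;=\; (1+t)\, R(t) .
```

The polynomial form compares the generating function of the critical cell counts with the Poincaré polynomial $\sum_k b_k t^k$ of the complex.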

Our first approach to generalizing the above theory is our discrete analogue of Morse-Bott theory on polyhedral complexes. We define a discrete Morse-Bott function by requiring some conditions on specific collections of polytopes. Using the reduced collections (excluding the noncritical pairs), we obtain some discrete Morse-Bott inequalities, that is, the Poincaré polynomial of our polyhedral complex is expressed in terms of those of the reduced collections. The vector field originating from this discrete Morse-Bott function is such that, inside each collection a polytope can have as many incoming and/or outgoing arrows, but between the collections there are no closed orbits. We also do some Conley theory analysis by using the reduced collections (excluding the noncritical pairs) as our isolated invariant sets, and define their respective isolating neighborhoods and exit sets; these two constitute the index pairs. The Poincaré polynomial of our polyhedral complex is then expressed in terms of those of the index pairs.

Next, we take a finite CW complex having a vector field configuration in which a cell can have as many outgoing or incoming arrows as possible and there cannot be closed orbits, and define a boundary operator from which the Betti numbers of the CW complex can be extracted. This in particular tells us that this vector field originates from some function. We provide a definition for a boundary operator that uses all the given arrows, by means of a probabilistic and averaging technique. Our critical cells are: the cells with no incoming or outgoing arrow; the abnormally downward noncritical cells together with their facets from which their arrows come; and the abnormally upward noncritical cells as well as those cells to which their arrows point. From our boundary operator we get not only the topological Betti numbers of our CW complex, but also an analogue of the Morse inequalities for the CW complex under consideration.
