
  •  

    The Characterization of Larval Homology: Transcriptomic Insights into the Origin and Evolution of Animal Life Cycles

    William Ludden Hatleberg
    B.A. (Honors)

    A thesis submitted for the degree of Doctor of Philosophy at The University of Queensland in 2017

    School of Biological Sciences

  • ii

    Abstract

    The majority of animals have a complex biphasic life cycle characterized by distinctive larval and adult

    body plans. Since the 1800s, the origins of marine larvae have puzzled embryologists, resulting in a

    long and convoluted history of theoretical literature, which largely focuses on two central questions: (1)

    are extant larvae or adults more representative of the ancestral animal body plan, and (2) how many

    times did marine larvae evolve across metazoan phyla? Despite the ecological and developmental

    importance of this developmental mode, there is still no concrete evidence for when or how the biphasic

    life cycle evolved. However, recent studies predict that biphasy is an ancient synapomorphic trait of

    all metazoans, suggesting that ontogenies divided into distinct larval and adult phases represent the

    ancestral animal state. Therefore, larval body plans are likely to be homologous in most animal phyla.

    In the present age of genomic data, these speculative theories have yet to be proven or refuted by

    empirical evidence. Therefore, in the present study, I employ novel computational methodology to

    re-examine established conceptual frameworks for life cycle evolution to create a unified model for

    animal body plan evolution.

    Here, I use the basal marine sponge, Amphimedon queenslandica, as a foundational case study for

    life cycle evolution. Because the majority of bioinformatic tools were developed in and optimized for

    classic vertebrate model systems, I first evaluate the validity of computational methods for functional

    gene characterization in an early branching non-model system. I find that functional characterization

    methods such as gene ontology (GO) are overly specific and highly variable between annotation

    methods, and propose a new pipeline to effectively extract biological meaning from a non-model

    ontogeny. Using these tools, I characterize the pelagobenthic transition from free-swimming larvae to

    sessile adults in A. queenslandica. I find that larval and adult transcriptomes largely employ a shared

    transcriptional toolkit that is primarily composed of ancient pre-metazoan genes. However, I also find

    evidence for phase-specific regulatory modules characterized by the unequal distribution of gene age.

    Specifically, I show that the larval transcriptome is enriched in older, pre-metazoan genes, while the

    adult transcriptome is largely composed of novel, lineage-specific innovations.

    To place these findings from A. queenslandica in an evolutionary context, I acquired comparable

    datasets from five other biphasic metazoan lineages across the animal tree, including the coral, Acropora

  • iii

    digitifera, the mollusc, Haliotis asinina, the hemichordate, Balanoglossus misakiensis, the sea urchin,

    Strongylocentrotus purpuratus, and the ascidian, Herdmania momus. Through this comparative approach,

    I find that the majority of the genes that are significantly differentially expressed during metamorphosis

    are lineage-specific innovations. However, many of these co-expressed, taxonomically restricted genes

    appear to be regulated by conserved transcription factors in multiple species. Taken together, these

    findings provide the first genomically informed framework for the origins of animal biphasy. Here

    I discuss the implications of these results in light of historical hypotheses for larval evolution, and

    propose a novel conceptual framework for animal life cycle evolution.

  • iv

    Declaration by author

    This thesis is composed of my original work, and contains no material previously published or written

    by another person except where due reference has been made in the text. I have clearly stated the

    contribution by others to jointly-authored works that I have included in my thesis.

    I have clearly stated the contribution of others to my thesis as a whole, including statistical assistance,

    survey design, data analysis, significant technical procedures, professional editorial advice, and any

    other original research work used or reported in my thesis. The content of my thesis is the result of

    work I have carried out since the commencement of my research higher degree candidature and does

    not include a substantial part of work that has been submitted to qualify for the award of any other

    degree or diploma in any university or other tertiary institution. I have clearly stated which parts of

    my thesis, if any, have been submitted to qualify for another award.

    I acknowledge that an electronic copy of my thesis must be lodged with the University Library and,

    subject to the policy and procedures of The University of Queensland, the thesis be made available for

    research and study in accordance with the Copyright Act 1968 unless a period of embargo has been

    approved by the Dean of the Graduate School.

    I acknowledge that copyright of all material contained in my thesis resides with the copyright holder(s)

    of that material. Where appropriate I have obtained copyright permission from the copyright holder

    to reproduce material in this thesis.

  • v

    Publications during candidature

    Conference Abstracts

    Hatleberg, W. L., B. M. Degnan, S. M. Degnan. (2016) Genomic insight into the evolution of larval body plans. Society for Molecular Biology and Evolution (SMBE), Gold Coast, QLD, Australia

    de Mendoza, A., W. L. Hatleberg, U. Technau, K. Pang, B. M. Degnan, R. Lister. (2016) Early evolution and dynamics of DNA methylation in animals. Society for Molecular Biology and Evolution (SMBE), Gold Coast, QLD, Australia

    Hatleberg, W. L., S. M. Degnan, B. M. Degnan. (2016) One genome, two body plans: how do larval and adult gene expression profiles differ in the sponge Amphimedon queenslandica? Society for Integrative and Comparative Biology (SICB), Portland, OR, USA

    Papers

    Hall, M. R., K. M. Kocot, K. W. Baughman, S. L. Fernandez-Valverde, M. E. A. Gauthier, W. L. Hatleberg, A. Krishnan, C. McDougall, C. A. Motti, E. Shoguchi, T. Wang, X. Xiang, M. Zhao, U. Bose, C. Shinzato, K. Hisata, M. Fujie, M. Kanda, S. F. Cummins, N. Satoh, S. M. Degnan, and B. M. Degnan. (2017) The crown-of-thorns starfish genome as a tool for biocontrol of a coral reef pest. Nature. 544: 231-234.

    Publications included in this thesis

    None.

  • vi

    Contributions by others to the thesis

    Bernard M. Degnan and Sandie M. Degnan contributed to the conception and design of this research,

    advised on the analysis and interpretation of data, and provided critical feedback on the drafting of

    the thesis.

    Selene L. Fernandez-Valverde contributed to the data analysis of Chapter 2, performing the BLIND

    clustering analyses, producing the lists of differentially expressed genes, and creating the Blast2GO/

    InterPro annotation (around 10% of the total data analysis).

    Andrew Baird of James Cook University and Sabrina Kaul-Strehlow from the University of Vienna

    performed the necessary fieldwork to acquire the Acropora sp. and Balanoglossus samples used in

    Chapter 4. Carmel McDougall assisted with the fieldwork and collection of Haliotis asinina.

    Library preparation and transcriptome sequencing in Chapter 4 were conducted by Macrogen Inc.,

    South Korea.

    Statement of parts of the thesis submitted to qualify for the award of another degree

    None.

  • vii

    Acknowledgements

    I originally arrived in Australia thinking that I would only be here for a year and a half to complete a

    Masters degree. Now, looking back on the last three and a half years of my life, I realize how many

    people I need to thank for making my stay in this remarkable country so extraordinary.

    First and foremost, I would like to thank Bernie and Sandie Degnan for giving me the opportunity to

    work in this lab and encouraging me to switch from a Masters to a PhD. I am very grateful for both

    the independence you’ve granted me as well as the much-needed guidance to keep me focused and

    on track. Together you have taught me to become a more thorough scientist and rigorous thinker.

    Your continuing encouragement and support of my interest in historical and philosophical biology

    has allowed me to pursue the PhD project I had been dreaming of ever since I first learned about the

    bizarre and beautiful world of larval evolution.

    This project would also not have been possible without the support of the American Australian

    Association, which provided me with the funding for my first year in Australia, as well as the grants

    awarded to Bernie Degnan and Sandie Degnan by the Australian Research Council, which allowed me

    to continue my research for the remainder of my PhD. I would also like to acknowledge the University

    of Queensland for providing me with a UQ International Scholarship, as well as the School of Biological

    Sciences for the opportunity to travel to the Society for Integrative and Comparative Biology (SICB)

    meeting in January 2016.

    To complete this thesis, I have had to stand on the shoulders of some true giants. I could not have

    completed this project without the foundation and framework that was provided to me by so many

    people. I came to Australia hoping to learn bioinformatics, and I could never have taken on this

    computational project without the continuing help of countless people. I am particularly indebted to

    Selene Fernandez-Valverde, as well as my fellow PhD students, particularly Andrew Calcino, Federico

    Gaiti, and Laura Grice. I am also grateful to Felipe Aguilera for his help with Phylostratigraphy, Kevin

    Kocot for his help with OrthoMCL, and Maely Gauthier for her constant willingness to help with

    software installations (and for being so patient with me when I couldn’t parse files!).

  • viii

    Although my stint in the wet lab was brief, I am grateful to everyone who helped me with my bench

    work, especially Federico for all of his guidance with ChIP-Seq, Carmel McDougall for her exhaustive

    knowledge about every protocol, and Kerry Roper for keeping the lab running like a well-oiled machine.

    Additionally, I would like to thank Gemma Richards and Federico for providing me with comments

    on my drafts, and a special thanks to Laura for being my aesthetics consultant, proof-reader, and all

    around support team for my PhD.

    My two trips to Heron Island for fieldwork are among my fondest memories in Australia. Therefore, I

    would like to thank the Heron Island staff, particularly Liz and Dani for all of their help, and Maureen

    for reassuring my mom far away in the US that I was safe during the cyclone. Special thanks to Carmel

    for making my first trip collecting and spawning abalone and Herdmania go smoothly and enjoyably,

    as well as my field work partner for the second trip, Tahsha, for all of her help in the lab, being my

    snorkel buddy, and always partaking in midnight chocolate breaks (if only we’d found those Steggles!).

    I am also indebted to Sabrina Kaul-Strehlow from the University of Vienna for sharing Balanoglossus

    with me, and Andrew Baird of James Cook University for providing me with coral samples. Without

    these samples, Chapter 4 wouldn’t have been the same.

    More than anything, PhDs are a test of perseverance, and I could not have done it without the never-

    ending support of my family and friends. Above all else, I need to thank my family back home for

    whole-heartedly supporting me in my move across the world – I love you all the way to Australia and

    back. To my “Australian family”, Roger, Jenny, Jess, Ben, and Mia: I cannot thank you enough for

    adopting me into your amazing family and including me in your lives – you’ve shown me so much

    kindness; words cannot express how grateful I am. To my friends back home, particularly Lauren

    “Boots” Xenakis and Sara Powers: thank you for your unconditional reassurance that I am doing the

    right thing.

    To all of my friends in Australia: for the last 3.5 years, the Degnan office was largely my whole world.

    Coming alone to a country where I didn’t know a soul, I was lucky enough to learn that I didn’t need

    to look much farther than the desks around me. While labs are often transitory places, each member,

    both past and present, has helped make Brisbane an enjoyable place to spend the past few years of my

  • ix

    life: Ben, Fede, Katia, Eunice, Jabin, Shun, Andrew, Romy, Lisa, Aude, Xueyan, Gemma, Markus,

    Kerry, Maely, Kevin, Arun, Laura, Tahsha, Simone, and Bec. I will never forget all of the good times

    we had at Friday drinks, spontaneous coffees, and pizza parties in the park. Also, to all of my non-lab

    friends: Cait, Aowen, Steve, Eddie, Andrew, and Caitlin – thank you!

    I would also like to give a shout out to the people at the UQ swimming pool, in particular Jae and

    Sarah for encouraging me to join the swim squad and being a ray of sunshine at 5:00am. You’ve kept

    me balanced and taught me that I don’t need to be ‘sedentary by choice’.

    Finally, I would like to give special thanks to Laura, my fellow co-founder of ‘Milkshake Fridays’,

    for all of our culinary adventures, teaching me the joys of Kate Bush, and always being there when I

    needed you (Uni-Halo forever); Simone, for your continual encouragement, friendship, and reminders

    to eat my vegetables; Tahsha, for our amazing weekends at Caloundra and your unwavering support

    and positivity; and Bec, for teaching me to become a morning person and get to the pool on time,

    Wednesday movie club, and enabling my coffee addiction. The lists go on and on - I will truly miss

    you all!

  • x

    Keywords

    Evolution, development, larvae, biphasic, life cycle, transcriptome, genome, body plan

    Australian and New Zealand Standard Research Classifications (ANZSRC)

    060305, Evolution of Developmental Systems, 60%
    060102, Bioinformatics, 20%
    060408, Genomics, 20%

    Fields of Research (FoR) Classification

    0603, Evolutionary Biology, 60%
    0601, Biochemistry and Cell Biology, 20%
    0604, Genetics, 20%

  • xi

    Table of Contents

    Chapter 1 - Introduction: Philosophical and historical paradigms for life cycle evolution 23

    1.1 Diversity of animal life cycles 23

    1.1.1 ‘Primary’ ciliated larval types 23

    1.1.2 Similarities between primary larvae 25

    1.1.3 Planktotrophy vs. lecithotrophy, the loss of feeding larvae, and the evolution of direct

    development 27

    1.1.4 Reacquisition of indirect development and the evolution of ‘secondary’ larvae 29

    1.1.5 How did larvae evolve? 30

    1.2 Life cycle evolution: a historical dilemma 31

    1.2.1 Early theories: from parallelism to recapitulation 31

    1.2.2 Balfour and Garstang 32

    1.2.3 Recapitulation and the ‘Terminal Addition Hypothesis’ 33

    1.2.4 Intercalation hypothesis 34

    1.2.5 Ancient synapomorphy and the adaptive decoupling hypothesis 36

    1.3 Aims of this study 37

    1.3.1 The Amphimedon queenslandica model system: a foundational case study 38

    1.3.2 Aim 1: Gleaning biological meaning from transcriptomes of non-model organisms 39

    1.3.3 Aim 2: Analysis of the genomic orchestration of biphasy in A. queenslandica 40

    1.3.4 Aim 3: Is metamorphosis conserved across the metazoan phyla? 40

    Chapter 2 - Gleaning biological meaning from transcriptomes of non-model organisms:

    assessment of Cel-seq developmental expression profiles in the sponge Amphimedon queenslandica 41

    2.1 Abstract 41

    2.2 Introduction 42

    2.3 Results 43

    2.3.1 BLIND clustering largely places transcriptomes in expected temporal order based on

    morphology 44

    2.3.2 BLIND clustering reveals distinct ‘transcriptional’ blocks 45

    2.3.3 Different gene ontology (GO) annotation methods provide drastically different GO

  • xii

    annotations for the A. queenslandica genome 48

    2.3.4 Gene ontology (GO) enrichment analyses of DEG lists accentuate the methodological bias

    between annotations 49

    2.3.5 Protein domain-based (Pfam and InterPro) gene ontology (GO) annotations are more

    effective at recovering known G-protein coupled receptors (GPCRs) than BLAST-based annotations

    51

    2.3.6 BLAST-based gene ontology (GO) annotations are more effective at recovering putative

    transcription factors (TFs) than protein domain-based annotations 52

    2.3.7 Gene ontology (GO) enrichment recovers the temporal expression of known differentially

    expressed genes (DEGs) 54

    2.3.8 KEGG and GOSlim methodologies are in accordance with one another for key

    developmental pathways 56

    2.4 Discussion 59

    2.4.1 Certain drastic life cycle transitions, such as the emergence from maternal brood chamber,

    are not marked by large-scale transcriptional changes 59

    2.4.2 GOSlim provides a conservative method to computationally characterize non-model life

    cycles 61

    2.4.3 The efficiency of each GO annotation depends on the biology of the candidate gene 63

    2.4.4 Gene ontology (GO) enrichments largely corroborate the temporal expression patterns of

    candidate genes 65

    2.4.5 GOSlim enrichments and KEGG are often in accordance with observed biological function

    67

    2.4.6 Computational recommendations: To avoid methodical bias, multiple annotations must be

    used to infer gene function in silico in a non-model organism 68

    2.5 Conclusions 69

    2.6 Methods 70

    2.6.1 Identification of transcriptional blocks and differential expression analyses 70

    2.6.2 Genome annotation using gene ontology (GO), orthology, and GO enrichment 70

    2.6.3 KEGG annotation and pathway reconstruction 71

    Chapter 3 - The orchestration of biphasy in the sponge, Amphimedon queenslandica, and the

  • xiii

    evolution of complex life cycles 73

    3.1 Abstract 73

    3.2 Introduction 73

    3.3 Results 75

    3.3.1 Pelagic and benthic transcriptomes are transcriptionally distinct 75

    3.3.2 Amphimedon queenslandica larvae and adults are primarily using the same genes at similar

    expression levels, therefore, the majority of larval genes are not turned ‘OFF’ in adulthood 76

    3.3.3 Differentially expressed genes (DEGs) remaining highly expressed throughout the life cycle

    are enriched in pre-metazoan traits, suggesting that both body plans rely on core set of ancient genes

    77

    3.3.4 Distinct phase- and stage-specific coexpression clusters reinforce the transcriptional

    dissimilarity between pelagic and benthic phases 80

    3.3.5 Larval and juvenile transcriptomes are more similar in gene age than the reproductive adult

    81

    3.4 Discussion 83

    3.4.1 Pelagic and benthic transcriptomes share a core set of highly expressed genes of ancient

    origin, suggesting that biphasy is a synapomorphic trait of the Metazoa 83

    3.4.2 Despite this core set of genes, similarity in overall transcriptional profile correlate with body

    plan and ecology rather than with organismal size or complexity, supporting the adaptive decoupling

    hypothesis 85

    3.4.3 Gene age correlates with organismal size and complexity instead of ecological phase,

    consistent with theories for developmental constraint and the evolution of gene regulatory networks 86

    3.5 Conclusions 87

    3.6 Methods 88

    3.6.1 Differential Expression Analysis 88

    3.6.2 Quartile analysis 88

    3.6.3 Gene Annotation and Analysis 89

    3.6.4 Gene Age Analyses 90

    Chapter 4 - Comparative insight into the evolution of animal life cycles 91

    4.1 Abstract 91

  • xiv

    4.2 Introduction 92

    4.3 Results 94

    4.3.1 The majority of genes differentially expressed across metamorphosis do not have orthologs

    in the other sampled species 95

    4.3.2 Despite the lack of orthology between differentially expressed genes (DEGs) in each species,

    conserved transcription factors (TFs) are involved in metamorphosis across phyla 98

    4.3.3 Novelty is widespread throughout the metamorphic temporal coexpression network (TCN)

    in sponges and sea urchins 100

    4.3.4 Comparison of sponge and sea urchin temporal coexpression networks (TCNs) suggest that

    conserved transcription factors regulate phylum-specific batteries 103

    4.4 Discussion 105

    4.4.1 Metamorphosis is characterized by phylum-specific genes: evidence for convergence? 106

    4.4.2 Conserved transcription factors (TFs) are differentially expressed across metamorphosis in

    all six sampled taxa 109

    4.4.3 Temporal coexpression networks (TCNs) and the ‘mode’ of evolutionary change 110

    4.5 Conclusions 112

    4.6 Methods 113

    4.6.1 Sample collection 113

    4.6.2 Library preparation and sequencing 114

    4.6.3 Transcriptome assembly and gene prediction 114

    4.6.4 Differential expression analysis 115

    4.6.5 Orthology and gene age analyses 115

    4.6.6 Protein domain analyses 116

    4.6.7 Network analysis 116

    Chapter 5 - Discussion: Shifting paradigms in the evolution of animal life cycles – the

    incorporation of genomic data sets into historical conceptual frameworks 119

    5.1 Overview of findings: How does biphasy operate on a genomic level? 120

    5.2 Theoretical and evolutionary implications: When and how did biphasy evolve? 122

    5.2.1 Periodization and adaptive decoupling 122

    5.2.2 Homology vs. convergence: When did biphasy evolve? 125

  • xv

    5.2.3 Terminal addition vs. intercalation: How did biphasy evolve? 128

    5.3 Synthesis of findings: A modified hypothesis for the evolution of biphasy 129

    5.4 Conclusions and looking forward 132

    References 133

    Appendices 159

  • xvi

    List of Figures

    Figure 1.1 Diversity and phylogeny of metazoan life cycles 24

    Figure 1.2 Schematic flowchart illustrating the progression of historical theoretical hypotheses for the origin

    and evolution of animal life cycles 32

    Figure 1.3 Nielsen’s theory of terminal addition 34

    Figure 1.4 Implications of the intercalation hypothesis 35

    Figure 1.5 Possible scenarios for the evolution of biphasic life cycles 37

    Figure 1.6 Life cycle of the demosponge, Amphimedon queenslandica 39

    Figure 2.1 The A. queenslandica life cycle can be divided into discrete transcriptional ‘blocks,’ which allow

    for the characterization of ontogenetic changes in lieu of traditional morphological stages 46

    Figure 2.2 Gene ontology (GO) annotations of the A. queenslandica genome and lists of differentially

    expressed genes (DEGs) are variable between three methods 48

    Figure 2.3 Enrichment of GOSlim terms varies between annotation methods 50

    Figure 2.4 Certain gene ontology (GO) annotations are more successful at identifying G-protein coupled

    receptors than others in the A. queenslandica genome 51

    Figure 2.5 Gene Ontology (GO) does a poor job recovering transcription factors (TFs) identified by

    orthology to known human TFs in the A. queenslandica genome 53

    Figure 2.6 GO enrichments still support key changes in expression patterns for critical genes of interest 55

    Figure 2.7 GO enrichments for signal transduction are supported by KEGG analyses 57

    Figure 2.8 Comparison of GO enrichment and KEGG analyses for the apoptosis / cell death pathway 58

    Figure 2.9 Proposed workflows to bioinformatically characterize gene function across the ontogeny of a non-

    model system 60

    Figure 3.1 The life cycle of A. queenslandica 76

    Figure 3.2 Expression quartile analysis of differentially expressed genes (DEGs) 77

    Figure 3.3 Phylostratigraphy (PS) enrichment analysis for each quartile expression profile 78

    Figure 3.4 Evolutionary conservation of quartile expression profiles 79

    Figure 3.5 Differentially expressed genes (DEGs) divided into eight co-expressed ‘gene suites’ 81

    Figure 3.6 Phylostratigraphy (PS) analysis of the A. queenslandica transcriptome and individual gene suites

    82

    Figure 4.1 Gain/loss tree of HomologGroups (HGs) used in the inference of gene age 96

  • xvii

    Figure 4.2 Characterization of differentially expressed genes (DEGs) during the metamorphosis of six animal

    species 97

    Figure 4.3 Temporal coexpression network (TCN) for Amphimedon queenslandica metamorphosis (late

    embryogenesis through feeding juvenile stages), highlighting gene age distribution 101

    Figure 4.4 Temporal coexpression network (TCN) for Strongylocentrotus purpuratus metamorphosis

    (precompetent larva through young juvenile stages), highlighting gene age distribution 102

    Figure 4.5 Comparison of sponge (Amphimedon queenslandica) and sea urchin (Strongylocentrotus

    purpuratus) metamorphic temporal coexpression network (TCN) based on shared differentially expressed

    (DE) transcription factors (TFs) 105

    Figure 5.1 Walter Garstang’s theory of parallel ontogenies (1922): “The real phylogeny of Metazoa has never

    been a direct succession of adult forms, but a succession of ontogenies or life-cycles” 124

    Figure 5.2 Evolutionary hypothesis for the evolution of extant animal life cycles 130

  • xviii

    List of Tables

    Table 4.1 Shared TFs that are differentially expressed across metamorphosis in multiple taxa 99

  • xix

    List of Appendices

    Appendix 2.1 BLIND ordered samples indicating morphological stage and transcriptional block 159

    Appendix 2.2 Identification of differentially expressed gene lists 159

    Appendix 2.3 Differentially expressed lncRNAs 160

    Appendix 2.4 Differentially expressed genes across the A. queenslandica life cycle 160

    Appendix 2.5 List of enriched biological process (BP) GO terms for each annotation method (Blast2GO/

    InterPro, Trinotate-BLAST, and Trinotate-pfam), including subsequent overlap analyses 161

    Appendix 2.6 Total GO Enrichments for biological process (BP) are more variable between annotation

    methods than GOSlim, with little overlap between all three methodologies 161

    Appendix 2.7 List of enriched GOSlim terms for each annotation method (Blast2GO/InterPro, Trinotate-

    BLAST, and Trinotate-pfam), including subsequent overlap analyses 162

    Appendix 3.1 Additional quartile overlap 164

    Appendix 3.2 Gene ontology enrichment of each differentially expressed gene suite 164

    Appendix 3.3 Functional characterization of each gene suite 165

    Appendix 3.4 Additional phylostratigraphy 166

    Appendix 4.1 RNA-Seq Statistics for Acropora ‘dig-gem’, Haliotis asinina, Balanoglossus misakiensis, and

    Herdmania momus 167

    Appendix 4.2 Transcriptome assembly (Trinity), gene annotation (TransDecoder), and differential expression

    analysis (RSEM/edgeR) statistics 167

    Appendix 4.3 Fixed phylogenetic tree used in the parsimony analyses of gene age for all species 168

    Appendix 4.4 Dollo Parsimony (gain/loss) of HomologGroups 168

    Appendix 4.5 Pfam domain expansion for differentially expressed genes of six metazoan taxa 168

    Appendix 4.6 Species-specific, bilaterian and deuterostome Pfam domain enrichments 168

    Appendix 4.7 Species-specific, bilaterian and deuterostome Pfam domain enrichment examples 169

    Appendix 4.8 All HomologyGroups (HGs) containing transcription factor orthologs 169

    Appendix 4.9 HomologyGroups (HGs) containing differentially expressed transcription factors 169

    Appendix 4.10 HomologyGroups (HGs) containing differentially expressed transcription factor orthologs

    shared between the sea urchin and sponge (extended version) 169

    Appendix 4.11 FastOrtho configuration settings 170

    Appendix 4.12 Accession information for temporal coexpression networks 170

  • xx

    List of Abbreviations

    Abbreviation  Definition
    Ad            Acropora digitifera (gem-type)
    Dig-gem       Acropora digitifera (gem-type)
    Aq            Amphimedon queenslandica
    Aqu2.1        Amphimedon queenslandica gene model version 2.1
    ARC           Australian Research Council
    BLAST         Basic Local Alignment Search Tool
    blastp        Protein BLAST
    BLIND         Basic linear index determination of transcriptomes
    Bm            Balanoglossus misakiensis
    BP            Biological process
    bp            Base pair
    CC            Cellular component
    Cel-seq       Cell Expression by Linear amplification and Sequencing
    ChIP-Seq      Chromatin Immunoprecipitation Sequencing
    db            Database
    DEG           Differentially expressed gene
    DNase         Deoxyribonuclease
    DOWN          Significantly downregulated
    dpf           Days post fertilization
    dps           Days post settlement
    E-value       Expect value
    evo-devo      Evolutionary developmental biology
    FDR           False discovery rate
    GEO           Gene expression omnibus
    GO            Gene ontology
    GO-ID         Gene ontology identifier
    GPCR          G-protein coupled receptor
    GRN           Gene regulatory network
    gtf           Gene transfer format
    Ha            Haliotis asinina
    HG            Homology group
    HGT           Horizontal gene transfer
    Hm            Herdmania momus
    HMMER         Biosequence analysis using profile hidden Markov models
    hpf           Hours post fertilization
    hpi           Hours post induction

  • xxi

    KEGG          Kyoto Encyclopedia of Genes and Genomes
    LCA           Last common ancestor
    lncRNA        Long non-coding RNA
    MAPK          Mitogen-activated protein kinase
    MF            Molecular function
    mRNA          Messenger RNA
    NCBI          National Center for Biotechnology Information
    NO            Nitric oxide
    OG            Ortholog group
    P-adj         Adjusted P-value
    P-value       Calculated probability
    PCA           Principal component analysis
    PolyA         Polyadenylation
    PS            Phylostratigraphy or phylostrata
    Q             Quartile
    RNA           Ribonucleic acid
    RNA-seq       RNA sequencing
    Sp            Strongylocentrotus purpuratus
    SRA           Sequence read archive
    TCN           Temporal coexpression network
    TF            Transcription factor
    TGFβ          Transforming growth factor-β
    TRG           Taxonomically-restricted gene
    UP            Significantly upregulated
    UQ            University of Queensland
    vsd           variance stabilizing distribution
    Wnt           Wingless
    WSP           Wnt signaling pathway

  • 22

  • 23

    Chapter 1 - Introduction: Philosophical and historical paradigms for life cycle evolution

    1.1 Diversity of animal life cycles

    Animals display a wide variety of life cycles, typified by two predominant strategies: direct and indirect

    development (Figure 1.1A). Direct-developing species, including most of the classic genetic and

    developmental biology systems (mouse, zebrafish, C. elegans), have only one body plan throughout

    their life cycle. In contrast, indirect-developing species progress through two unique body plans – a

    non-reproductive larva and a reproductive adult.

    1.1.1 ‘Primary’ ciliated larval types

    Among indirect developers, the majority of species have a ciliated larval body plan, including poriferans

    (sponges), cnidarians, protostomes (bryozoans, molluscs, annelids), and deuterostomes (echinoderms

    and hemichordates). Sponges are widely considered to be one of the earliest branching animal lineages

    (Edgecombe et al. 2011) due to their relatively simple morphology and lack of neuronal and muscular

    cell types or a centralized gut. Despite this seemingly simple adult body plan, the phylum Porifera

    includes both indirect and direct developmental modes. This includes multiple larval types, such as

    amphiblastula and calciblastula in calcarean sponges, trichimella in hexactinellid sponges, and

    parenchymella, disphaerula, and coeloblastula in Demosponge taxa (reviewed in Ereskovsky 2010,

    Wörheide et al. 2012, Degnan et al. 2015). Based on recent poriferan phylogeny, the parenchymella

    larva is thought to be the ancestral larval type among sponges (Wörheide et al. 2012).

    Unlike the diversity of larval forms in poriferans, the majority of cnidarian taxa in all classes (Anthozoa,

    Scyphozoa, Cubozoa and Hydrozoa) have a similar bilayered and monociliated planula larva,

    characterized by an outer epidermis (ectoderm), an internal gastrodermis (endoderm), and aboral

    sensory cilia (Ruppert et al. 2004). This planula larva is considered to be ancestral to all cnidarian

    classes (Collins 2002). In contrast to sponges and cnidarians, which display phylum-specific larval types,

    protostome-lophotrochozoan lineages (including the annelid, mollusc, nemertean, and platyhelminth


    phyla) all share a common trochophore larva. This trochophore larva is characterized by a ciliary band

    called a prototroch, which is used for locomotion and feeding (Damen and Dictus 1994). Likewise,

    non-chordate deuterostomes (including hemichordates and echinoderms) frequently have

    dipleurula-type larvae, including the tornaria larvae of hemichordates and the dipleurula of echinoderms, which

    are believed to share common origins (Nezlin 2000).

    [Figure 1.1 image. Panel A: phylogeny of the major metazoan lineages (Porifera, Ctenophora, Cnidaria, Acoelomorpha, Mollusca, Annelida, Crustacea, Arthropoda, Echinodermata, Hemichordata, Urochordata, Cephalochordata, Chordata), with protostomes and deuterostomes bracketed and a legend for indirect developing, direct developing, primary larva, secondary larva, and loss of larval stage. Panel B: dichotomous key dividing indirect developing into primary larva (feeding, non-feeding) and secondary larva (feeding, non-feeding), with direct developing shown separately.]

    Figure 1.1 Diversity and phylogeny of metazoan life cycles. (A) Phylogenetic distribution of primary developmental modes of the major metazoan lineages. Indirect/pelagobenthic/biphasic life cycles (blue) are widespread throughout the animal kingdom. However, some phyla are direct developing (green). Gold lines illustrate lineages with a ‘primary’ ciliated larva and red lines indicate secondarily-derived larvae. (B) Dichotomous key of extant larval types illustrating a proposed hypothesis for life cycle evolution, where indirect development is believed to be ancestral. Due to this inferred ancestral state, the larvae of indirect developers are also known as ‘primary’ larvae (indicated in gold), which can be further subdivided into feeding (planktotrophic) and non-feeding (lecithotrophic) larval types. Alternatively, it is believed that direct development evolved from an indirect developing ancestor through the loss of the larval stage (indicated by the gray dashed line). In some lineages, these direct developers independently evolved another biphasic life cycle. These new larval types are called ‘secondary’ (indicated in red) in contrast to the inferred ancestral state of ciliated larvae. Like primary larvae, secondary larvae can be further subdivided into feeding and non-feeding larval types.


    Many studies hypothesize that these four larval types (sponge larva, cnidarian planula, trochophore

    larvae, and dipleurula-like larvae) represent four independent evolutionary origins of larval body plans

    (Nielsen 1998, Hadfield 2000). Furthermore, it is hypothesized that the biphasic life cycle, consisting of

    a pelagic ciliated larva and a benthic adult is the ancestral developmental mode in all phyla (Jägersten

    1972). Hence, ciliated larvae are often called ‘primary’ larvae, in reference to the supposed ancestry of

    these larval types (Jägersten 1972; Figure 1.1B; also see section 1.2). However, there are also multiple

    similarities between these four larval types, suggesting that primary larvae may also be homologous

    (Nielsen and Nørrevang 1985, Degnan and Degnan 2006, Nielsen 2013).

    1.1.2 Similarities between primary larvae

    Perhaps the earliest hypothesis of larval homology concerned protostome species (Hatschek 1878,

    Roule 1891), suggesting that the ubiquity of the trochophore larval body across lophotrochozoans is

    indicative of common ancestry among spiralians (more recently this has been supported by Nielsen

    and Nørrevang 1985). Additional studies have compared protostome and deuterostome lineages in

    search of evidence of common bilaterian origins for marine larvae (Nielsen 1994, 1998). However,

    these studies largely support the independent evolution of trochophore larvae in protostomes (consistent

    with Nielsen and Nørrevang 1985, Rouse 1999, 2000) and dipleurula-like larvae in deuterostomes

    (consistent with Nezlin 2000).

    More recently, molecular techniques have been used to investigate whether there may be more deeply

    conserved processes patterning disparate larval types (e.g. Jackson et al. 2005, Dunn et al. 2007, Rentzsch

    et al. 2008, Mazza et al. 2010, Santagata et al. 2012, Marlow et al. 2014). Specifically, studies have

    examined similarities in bilaterian larval gut development, showing that the expression patterns of genes

    such as Brachyury, otx, and goosecoid are conserved between protostome and deuterostome larvae

    (Arendt et al. 2001). In contrast to the morphological studies, this indicates that protostome and deuterostome

    larvae share deeply conserved developmental mechanisms for larval patterning, implying that larval

    homology predates the bilaterian last common ancestor.

    In this context, the apical organ is an ideal candidate to investigate larval homology between phyla,

    because it is found across all bilaterian taxa and is only present in the larvae of indirect developing


    species (Nielsen 2005, Dunn et al. 2007, Lacalli 2008). In ciliated protostome and deuterostome larvae,

    the apical organ can be defined by a thickened epithelium at the animal pole of the larvae, with a

    characteristic cluster of elongated cilia (apical tuft), and neuronal ganglion (Nielsen 2004, Byrne et al.

    2007, Dunn et al. 2007). There is clear evidence that the larval apical organ is responsible for sensing the

    surroundings, and it appears to be especially tied to larval settlement in most indirect developers,

    including cnidarian larvae (e.g. Rentzsch et al. 2008, Conzelmann et al. 2013). However, it is unclear if

    similarities of the apical neurons can be deemed homologous in basal clades, such as the planula larvae

    of cnidarians. Despite having a primitive nerve net as adults, cnidarian larvae have very similar apical

    morphology to trochophores, including an apical tuft (Chia and Koss 1979, Sinigaglia et al. 2015).

    Even without neurons, the anterior region of the sponge is capable of distinguishing changes in light

    intensity (Leys and Degnan 2001), and is suggested to be important to larval settlement (Degnan and

    Degnan 2010, Degnan et al. 2015, Nakanishi et al. 2015, Ueda et al. 2016). Specifically, flask cells in

    the anterior region of the A. queenslandica larvae have been shown to respond to environmental cues

    via calcium mediated signaling pathways (Nakanishi et al. 2015), which in turn interact with Nitric

    Oxide signaling pathways to initiate metamorphosis (Ueda et al. 2016). This suggests that the apical

    region (posterior region in sponges) in ciliated primary larvae across the Metazoa might be involved

    in environmental sensing. However, this may also be the result of functional convergence (Dunn et

    al. 2007).

    There have also been a handful of molecular studies investigating genes associated with the formation

    of the larval apical region (most recently, Marlow et al. 2014). These studies comparing expression

    patterns between representatives of different phyletic lineages indicate that some genes involved in

    apical organ formation and patterning predate the eumetazoan split (FGF - Rentzsch et al. 2008; COE

    - Jackson et al. 2005; Six3/6 - Santagata et al. 2012, Marlow et al. 2014), and some are specific to

    bilaterians (Homeobrain, Rax, Orthopedia - Mazza et al. 2010). However, based on gene regulatory

    network inferences, Dunn et al. (2007) conclude that there is no evidence for homology in apical organ

    patterning between protostomes and deuterostomes, suggesting that trochophore and dipleurula larvae

    result from independent evolutionary events. Given the findings presented in this

    section, it remains uncertain how many times ciliated larval body plans arose (reviewed in Hadfield

    2000), and under what mechanisms complex life cycles have evolved (see section 1.2).


    1.1.3 Planktotrophy vs. lecithotrophy, the loss of feeding larvae, and the evolution of direct

    development

    Primary, ciliated larval types can be further subdivided into feeding (planktotrophic) and non-feeding

    (lecithotrophic) larval types (Figure 1.1B). Planktotrophic larvae (which can be found in animal phyla

    such as Mollusca, Annelida, Echinodermata, and Hemichordata) are capable of feeding via a larval

    gut (Thorson 1950). In contrast, lecithotrophic larvae, such as all poriferan larvae and those of most cnidarians,

    some bryozoans, and some echinoderms, cannot feed until after metamorphosis, when the adult feeding

    structures have been established (Thorson 1950, Vance 1973). Non-feeding larvae generally have a

    shorter larval phase than planktotrophic species, the length of which is dictated by finite energetic

    resources from the maternally derived yolk (‘energetic burnout’: Hadfield et al. 2001; Vance 1973).

    Thus, planktotrophy versus lecithotrophy is often seen as an adaptive trade-off (Marshall and Keough

    2003): planktotrophic larvae are energetically cheap to produce, allowing for a greater number of

    offspring, yet feeding larvae also experience a higher rate of mortality prior to settlement (Thorson

    1950).

    This trade-off has multiple implications for the various ecological roles marine larvae can play in an

    animal life cycle (reviewed in Strathmann 1978b). For instance, as many marine invertebrates have

    sessile adult body plans, it is intuitively believed that increasing time in the water column provides a

    mechanism for gene flow and dispersal between populations (Hedgecock 1986); however, this hypothesis

    has recently come into question (Levin 2006, Sanford and Kelly 2011), as some studies suggest that

    larvae may not be dispersing as far as originally thought (Swearer et al. 2002, Palumbi 2004, Morgan

    et al. 2009). Thus, as planktotrophic larvae are capable of feeding, they are able to remain in the water

    column longer than lecithotrophic species, potentially resulting in greater dispersal and gene flow. In

    contrast, lecithotrophic species, which are unable to travel as far, have a higher rate of metamorphic

    success. In addition to a role in dispersal, larvae play a crucial role in site selection (Raimondi and

    Keough 1990). If the larva does not choose an appropriate location to settle, the organism cannot reach

    sexual maturity (Pechenik 2006). Therefore many larvae have evolved a tightly regulated mechanism

    of site selection based on a myriad of environmental cues both within and between species (reviewed in

    Steinberg and Nys 2002, Hadfield 2011). As with dispersal, there are noticeable differences in settlement

    strategies between planktotrophic and lecithotrophic larvae, as the former are capable of feeding until


    an appropriate site is found (Hadfield et al. 2001, Pechenik 2006). Alternatively, lecithotrophic larvae

    are limited by the amount of yolk, and thus in the absence of an appropriate settlement cue, must either

    spontaneously settle or die (Hadfield et al. 2001).

    While life history strategies were originally thought to be dichotomous (Thorson 1950), some larvae

    display ‘facultative planktotrophy’ (e.g. Kempf and Hadfield 1985, Emlet 1986), and are capable of

    taking advantage of both developmental modes (reviewed in Allen and Pernet 2007). However, there

    is increasing uncertainty whether these ‘non-traditional’ developmental modes represent evolutionary

    intermediates between planktotrophy and lecithotrophy or have been selected for due to some adaptive

    advantage (Collin 2012). Regardless, the prevalence of intermediate life history modes sheds light on

    the potential evolutionary theories for the diversity of extant life cycles. In the literature, it is assumed

    that planktotrophy is ancient among bilaterians and thus the loss of feeding larvae is a derived trait

    (Strathmann 1978a, 1978b, Collin 2004, Collin et al. 2007). Likewise, as the presence of a ciliated larva

    is considered to be ancestral (Jägersten 1972), it is believed that direct development is also a derived

    trait that evolved from an indirect developing species (Jägersten 1972, Strathmann 1978; reviewed

    in Reitzel et al. 2006). In some phyla, loss of a feeding larval phase is closely intertwined, and often

    synonymous, with the shift from indirect to direct development (e.g. Collin et al. 2007).

    Many phyla contain both direct and indirect developing taxa, including sponges, cnidarians, annelids,

    arthropods, echinoderms, and chordates. Based on echinoderm phylogeny, it is apparent that the ancestral

    echinoderm was indirect developing; therefore, direct development has evolved multiple times in the

    phylum through the loss of a planktotrophic larval stage (Strathmann 1978b, Wray 1996). Additionally,

    among the cnidarians, extant taxa display a highly diverse range of developmental modes—each with

    varying degrees of biphasy (Martin and Koss 2002). However, the majority of the four classes (Anthozoa,

    Scyphozoa, Cubozoa and Hydrozoa) produce similar non-feeding, ciliated planula larvae, suggesting

    that the ancestral cnidarian had a pelagobenthic life cycle (Collins 2002). Given this, the fact that metamorphosis

    from a planula larva to a primary polyp in the anthozoan Nematostella vectensis is a gradual process

    with little tissue reorganization (Rentzsch et al. 2008) indicates that the dramatic metamorphosis

    seen in other cnidarian taxa was partially lost in this species (Reitzel et al. 2006).


    1.1.4 Reacquisition of indirect development and the evolution of ‘secondary’ larvae

    In the past, it was thought that once the complex morphological and genetic machinery needed to

    create a planktotrophic larva was lost, it was unlikely to be regained (Strathmann 1978a, 1978b).

    However, reacquisition occurs in both brachiopods, which retain larval morphology in juveniles, and

    some gastropods, which retain planktotrophic feeding structures in lecithotrophic larvae (Strathmann

    1978b). Recent studies in gastropods indicate that the likelihood of feeding reacquisition is related to

    the amount of time that has lapsed since its loss, as the feeding morphology in recently lecithotrophic

    species is more likely to be retained (Collin 2004). Furthermore, as both feeding and swimming

    structures remain intact in these lecithotrophic larvae, it has been hypothesized that the reacquisition

    of planktotrophy in these species may result from heterochronic shifts in development pathways

    (Collin et al. 2007). Other explanations for the reacquisition of complex, developmental traits, such as

    planktotrophy and/or indirect development, include genetic pleiotropy and the inherent modularity

    of developmental pathways involved in morphogenesis (reviewed in Collin and Miglietta 2008).

    As the majority of developmental genes are used in multiple morphogenetic pathways, Collin and

    Miglietta (2008) argue that structures for larval feeding can be readily reacquired over relatively short

    evolutionary time because the genetic machinery required for their establishment remains intact in other

    genetic modules.

    In addition to the reacquisition of a planktotrophic larva from a lecithotrophic species, in some extreme

    cases, it is believed that certain larval types (such as the tadpole larva in ascidians and the cyprid larva

    in crustaceans) evolved independently from a completely direct developing ancestor (Jägersten 1972,

    Hadfield 2000). Therefore, these larvae (deemed ‘secondary’ in comparison to ‘primary’ ciliated larvae;

    Figure 1.1B) were convergently reacquired from a species that had previously (and permanently) lost its

    ancestral ciliated larva. Hence, these ‘secondary’ larvae bear little morphological resemblance to ciliated

    larvae and are often characterized by distinct genetic mechanisms underlying metamorphic transitions

    (Hadfield 2000, Heyland and Moroz 2006). For instance, unlike primary pelagobenthic life cycles,

    where metamorphosis often occurs rapidly in response to external settlement cues, secondarily derived

    metamorphic events (such as in crustaceans, caterpillar/butterfly and tadpole/frog) are characterized

    by slow, hormonally-regulated transitions (Hadfield 2000, Heyland and Moroz 2006).


    Ascidian tadpole larvae represent another example of a secondarily derived larval body plan. Unlike

    ciliated larvae, ascidian tadpoles are characterized by a muscular tail and notochord that are reabsorbed

    during metamorphosis into the benthic filter-feeding body plan (Katz 1983). Due to the presence of

    the notochord, and the basal position of cephalochordates among chordates, ascidians are believed

    to have evolved from a direct developing cephalochordate-like ancestor (Satoh 2009). Thus, biphasy

    and the filter-feeding adult body plan in ascidians are likely to be derived. Interestingly, despite

    being classified as ‘secondary’ larvae, ascidian larvae display a significant number of similarities with

    primary ciliated larvae (Hadfield et al. 2000). Like many ciliated larvae, ascidian tadpoles display a

    distinct period of larval competence (Degnan et al. 1997) followed by a rapid metamorphic event

    in response to exogenous settlement cues (Green et al. 2002). This is consistent with the hypothesis

    presented in Hadfield et al. (2001), which argues that competence is a convergent trait that evolved to

    minimize vulnerability during settlement/metamorphosis (Hadfield 2000). Additionally, certain signaling

    pathways, such as Nitric Oxide, appear to be employed in both primary and secondary metamorphic

    events, reinforcing convergence (Bishop and Brandhorst 2003, 2007, Comes et al. 2007, Bishop et al.

    2008, Ueda and Degnan 2013, 2014, Ueda et al. 2016). Alternatively, as hypothesized in the reacquisition

    of planktotrophic larvae (Collin and Miglietta 2008), these similarities may have also been preserved

    from ancient (and previously lost) life cycles through intact pleiotropic developmental pathways.

    1.1.5 How did larvae evolve?

    Classifications of marine larvae are inherently tied to hypotheses for life cycle evolution. Therefore,

    it is impossible to classify larvae without touching upon the theoretical origins of each larval type.

    However, the evolutionary history of animal life cycles remains largely unclear. Given the incredible

    diversity of marine larvae, studies have speculated that the biphasic life cycle, including those with

    primary or secondary larval types, is the result of multiple episodes of convergent evolution (Nielsen

    1998, Rouse 2000, Hadfield 2000, Dunn et al. 2007). However, the prevalence of biphasy among extant

    animal taxa suggests common ancestry of this complex developmental trait (Degnan and Degnan 2006,

    Mikhailov et al. 2009, Nielsen 2013). This dilemma lies at the very heart of animal evolution, drawing

    upon hundreds of years of theoretical embryology. Hence, the driving force behind this thesis is to

    expand upon these foundational speculative hypotheses, many of which were conceived before the

    dawn of modern genetics, using novel genomic and transcriptomic approaches. In the following section,


    I examine these hypotheses through the lens of historical embryology in order to highlight how the

    shifting paradigms for life cycle evolution have influenced the current framework for larval evolution.

    1.2 Life cycle evolution: a historical dilemma

    1.2.1 Early theories: from parallelism to recapitulation

    There is a long and complex history of speculative literature surrounding the origins and evolution

    of animal life cycles, beginning with Karl Ernst von Baer (Figure 1.2). von Baer’s laws are a set of

    observations, which arguably founded embryological theory and modern evolutionary developmental

    (‘evo-devo’) biology (reviewed in Abzhanov 2013). Simply stated, these principles claim that all

    animals develop from a simple or ‘general’ form to a more complex one, and these general characters

    are only found during early development. This conceptual model is often described as a developmental

    ‘funnel’, where morphological and genetic complexity increases in a linear fashion from the zygote

    to the adult body plan. From an early stage of embryological work, researchers noticed a similarity

    between the increasing complexity during embryogenesis and the increasing complexity in the

    fossil record, establishing an early link between evolution and development, deemed ‘parallelism’

    (reviewed in Abzhanov 2013), which would become the original framework for subsequent comparative

    embryological and evolutionary theories.

    Perhaps the best known of these are Haeckel’s biogenetic laws, which famously claim that ‘ontogeny

    recapitulates phylogeny’ (Haeckel 1868, Abzhanov 2013). Haeckel believed that each new phase is

    built upon ancestral forms. Therefore, the embryo provides a window into an animal’s ancestry, and

    development depicts the evolutionary progression to the adult body plan. Haeckel saw the similarity

    between developing metazoans at the gastrula stage, and asserted that the ancestor of all animals must

    have resembled this form at some point in evolution, a hypothetical ancestor he called the “gastraea”

    (Love et al. 2008). This drastic embellishment on early theories of parallelism perpetuated von Baer’s

    developmental funnel, and ultimately became deeply engrained in later concepts of evolutionary

    developmental thinking (see section 1.2.3).


    1.2.2 Balfour and Garstang

    While less well known than Haeckel (and significantly less cited), 19th century embryologists, Francis

    Balfour and Walter Garstang, made important advances in the understanding of embryology (reviewed

    by Brian Hall, 2000). Balfour’s treatises on embryology mark important differences from Haeckel

    (Balfour 1880, 1881). The first is that early development is not ‘immutable’ as Haeckel and von Baer

    thought, therefore “not all embryonic features reveal ancestral patterns” (Hall, 2000). In addition to

    phenotypic variation in the adult form, Balfour noticed that early development is often marked by

    significant embryonic variations between phyla. This idea highlights an important departure from the

    developmental ‘funnel’ proposed by von Baer and Haeckel (Abzhanov 2013), becoming a prototype

    for the ‘hourglass model’ of evolution, which was later made famous by the phylotypic stage (Duboule

    1994, Raff 1996). The primary implication of the hourglass model is that morphological and

    genetic divergence could occur in all stages of an ontogeny except the phylotypic stage (Raff 1992).

    [Figure 1.2 image. Flowchart nodes: von Baer, Haeckel, Balfour, Garstang, Jägersten & Nielsen, Sly et al., Degnan & Degnan, Nielsen, Mikhailov et al. Concepts: parallelism; ontogeny recapitulates phylogeny; early embryogenesis is divergent; developmental stages evolve independently; terminal addition; intercalation; phylotypic stage; adaptive decoupling; biphasy is synapomorphic. Groupings: developmental ‘funnel’ model and developmental ‘hourglass’ model, keyed to sections 1.2.1 to 1.2.5. Arrow legend: in accordance, rejects, inspired.]

    Figure 1.2 Schematic flowchart illustrating the progression of historical theoretical hypotheses for the origin and evolution of animal life cycles.


    In addition to Balfour, Walter Garstang continued to reform Haeckel’s theories, even boldly claiming

    that the “basis of this law is demonstrably unsound” (Garstang 1922). In particular, Garstang argued that,

    “the real Phylogeny of Metazoa has never been the direct succession of adult forms, but a succession

    of ontogenies or life-cycles...so that every phase of the life-cycle is modified in some way or other”,

    thereby stressing that ontogeny does not recapitulate phylogeny, but “creates” it (Garstang, 1922).

    The implication of this statement is that while larvae and adults evolve separately, it is the ontogeny

    as a whole that is passed from generation to generation (Garstang 1928). Like Balfour’s work, this concept

    paved the way for other alternative hypotheses for life cycle evolution that were unconstrained by the

    Haeckelian paradigm, including Nancy Moran’s adaptive decoupling hypothesis (Moran 1994; see

    section 1.2.5).

    1.2.3 Recapitulation and the ‘Terminal Addition Hypothesis’

    Despite Garstang’s harsh criticism of recapitulation hypotheses, in the latter part of the previous century,

    there was a resurgence of Haeckel’s biogenetic laws, pioneered by Jägersten and Nielsen (Jägersten

    1972, Nielsen and Nørrevang 1985, Nielsen 2005, 2008, 2009, 2013). These theories proposed that

    the similarities between larval forms are too great to ascribe solely to convergent evolution. Therefore

    proponents of this “terminal addition” theory hypothesized that the last common ancestor was larva-

    like, and the adult phase evolved later.

    Early theories of terminal addition (such as the “Trochaea theory”; Nielsen and Nørrevang 1985)

    greatly resembled the original theories for the homology of trochophore larvae (Hatschek 1878, Roule

    1891), and therefore only extend to a protostome-lophotrochozoan last common ancestor. However,

    Nielsen’s most recent hypothesis (Figure 1.3) continues to develop upon Haeckel’s initial theory of

    recapitulation, expanding the scope to include deeper relationships in metazoan evolution (Nielsen

    2008, 2009, 2013). Nielsen proposes the last common ancestor (LCA) of living metazoans was a

    pelagic colonial choanoflagellate, the so-called “choanoblastaea”, capable of feeding using peripheral

    choanocytes. He hypothesizes that the origins of early pelagobenthic life cycles involved the polarization

    of the choanoblastaea, which settled on the pole lacking choanocytes, leading to the internalization

    and subsequent loss of locomotory function (Nielsen 2008). Following the establishment of a biphasic

    life cycle in early metazoans, Nielsen speculates that the adult “sponge-like” phase was lost through


    neoteny, establishing a holopelagic ciliary particle feeder: the gastraea. From the gastraea, an organized

    nervous system evolved, resulting in a neurogastraea, a eumetazoan ancestor which Nielsen suspects

    to be similar to modern anthozoan larvae. Based on this hypothesis, all larval types are homologous.

    However, adult body plans (and therefore the biphasic life cycle) evolved independently in cnidarian,

    protostome, and deuterostome lineages, yet share the same genetic toolkit that was present in a last

    common pelagobenthic metazoan ancestor.

    1.2.4 Intercalation hypothesis

    In response to the increasingly accepted Gastraea hypothesis, an opposing “Intercalation” theory has

    also gained momentum, in which the last common ancestor (LCA) of metazoans was a direct developing

    benthic organism (Sly et al. 2003). In this scenario, larval forms evolved through the intercalation

    of new genes into the previously existing “adult-like” ontogeny. Sly et al. (2003) discussed how the

    adult-first scenario could have occurred (Figure 1.4A). Representing gene ontogenies

    as colored arrows, Sly et al. illustrate how a novel gene pathway could eventually pattern a new larval

    [Figure 1.3 image. Diagram labels: LCA, Gastraea, Neoteny, Neurogastraea, Trochaea; lineages: Sponges, Cnidarians, Protostomes, Deuterostomes; clades: Metazoa, Eumetazoa, Bilateria.]

    Figure 1.3 Nielsen’s theory of terminal addition. The most recent iteration of Haeckel’s recapitulation theory, where a holopelagic choanoblastaea-like ancestor developed a biphasic life cycle through the polarization and settlement of a choanoblastaea. Once biphasy was established, the adult phase of the last common ancestor was lost through the retention and subsequent sexual maturation of the larval stage. The resultant species (dubbed ‘gastraea’) is the last common ancestor of eumetazoans. The gastraea precedes the ‘neurogastraea’, which arose following the evolution of a nervous system. Upon the evolution of the neurogastraea, cnidarians and bilaterians split, and the trochaea is derived in the protostome lineage. This theory postulates at least 4 evolutionary events that led to biphasic life cycles, implying that adult forms are derived, while larval forms are homologous among all metazoan lineages. Synthesized and adapted from Nielsen 2008, 2009.


    body plan (thereby creating a new ontogeny), involving the addition of a facultative developmental

    sub-pathway. Through time, the new pathway gains constituency, and a rudimentary “larval” form

    is established. Eventually, the new facultative pathway is more advantageous than the original gene

    ontogeny, and a true larval form arises.

    Sly et al. (2003) claim that intercalation theory is more parsimonious than early terminal addition

    theories, as the adult-first hypothesis can explain gene homology between both adults and larval forms

    with fewer instances of convergent evolution (Figure 1.4B). However, Sly et al.’s analysis of larval

    evolution only explains the emergence of biphasy at the last common ancestor of bilaterians. More

    recently, Degnan and Degnan (2006) highlighted the importance of meiosis in the evolution of larval

    forms. They argue that once gametes are released from the adult,

    they effectively become a new ontogeny; therefore the fertilized embryo and the adult are exposed to

    independent selective pressures resulting in distinct niche specialization and increasing modularity

    of “larval” forms (Figure 1.4B). In this scenario, they predict that the LCA of metazoans had already

    developed a biphasic life cycle prior to metazoan cladogenesis. Thus, there is only one evolution of

    biphasy, and larvae are a plesiomorphic trait among metazoans.

    [Figure 1.4, panels A and B. Labels: adult-like ancestor; larva and adult of Phyla 1, 2 and 3.]

    Figure 1.4 Implications of the intercalation hypothesis. (A) Schematic representation of the intercalation of larval features into adult ontogenies. Blue lines represent ‘adult-like’ ontogeny, while red indicates larval ontogeny. Larval ontogeny arises in an ‘adult-like’, direct developing ancestor as a facultative pathway free of evolutionary constraint, while adult specific gene pathways remain intact. Subsequently, adult pathways are discarded, as facultative genes become an adaptive advantage, resulting in a truly biphasic species. (B) Intercalation hypothesis asserts a direct developing, ‘adult-like’ ancestor with complex genetic structure. Evolution of larvae occurs from co-option of previously acquired genes, resulting in homologous components of both larval and adult forms. Both figures redrawn from Sly et al. 2003.

    1.2.5 Ancient synapomorphy and the adaptive decoupling hypothesis

    Three of the most recent reviews on larval evolution (Degnan and Degnan 2006, Mikhailov et al.

    2009, Nielsen 2013), including proponents of both intercalation and terminal addition theories, agree

    that the last common ancestor of the Metazoa was likely to display some degree of biphasy; therefore,

    the pelagobenthic life cycle is hypothesized to be an ancient synapomorphic trait of extant animal

    lineages. These theories (particularly that of Degnan and Degnan 2006) are largely consistent with ideas

    proposed by Balfour and Garstang, specifically that different life cycle stages evolve independently

    of one another. This is in accordance with other hypotheses for the evolution of complex life cycles,

    such as Nancy Moran’s adaptive decoupling hypothesis (Moran 1994), which proposes a

    scenario whereby larval and adult body plans evolve in parallel to one another due to ecologically and

    developmentally phase-specific selective pressures. Thus, if life cycle stages are decoupled, genetic

    traits of different phases (such as in larvae and adults) will be less correlated than genetic traits of the

    same ecological niche (Ebenman 1992), resulting in a distinct genetic barrier between pelagic and

    benthic phases.

    Other studies suggest that larval traits are equally important to the reproductive success of the juvenile;

    therefore, it is unlikely that larval and adult phases could be entirely decoupled (Pechenik 2006,

    Crean et al. 2011). This implies that larval traits are likely to be constrained due to intense selection

    pressures associated with larval settlement (Crean et al. 2011). Interestingly, this hypothesis echoes

    Garstang’s early observation that larvae play a crucial role in life cycle evolution (Garstang 1922).

    In addition to these selective pressures based on larval ecology in biphasic species, the constraint of

    earlier developmental stages is consistent with hypotheses for the phylotypic stage (Duboule 1994).

    It is hypothesized that the similarities between animal body plans during this stage are the result of

    stabilizing selection due to the increased connectivity of genes expressed during that developmental

    period (Raff 1992, 1994). This hypothesis has been supported by genetic approaches (Galis and Metz

    2001), suggesting that novel genes are more prevalent during early and late development, while older

    genes are expressed during the phylotypic stage (Domazet-Lošo and Tautz 2010, Akhshabi et al. 2014).

    Given these findings, many contemporary theoretical frameworks (i.e. phylotypic stage, hourglass

    model of evolution, and adaptive decoupling hypothesis) for life cycle evolution are in accordance

    with one another. However, many critical questions remain unanswered by empirical evidence,

    particularly whether biphasic life cycles are homologous across crown metazoans.

    1.3 Aims of this study

    The more current hypotheses for larval evolution (Intercalation theory: Degnan and Degnan 2006;

    and Terminal addition: Nielsen 2010, 2013) assert a complex biphasic metazoan ancestor, suggesting

    that indirect development is a synapomorphic trait of all animal phyla. If this is true, then the close

    examination of larval homology should uncover deeply conserved evolutionary signatures. However,

    it is also possible that larval traits are shared only between certain lineages. Together, these possibilities give four

    primary evolutionary scenarios (Figure 1.5). The first scenario is the ancient metazoan synapomorphy

    of biphasic life cycles, as proposed by Degnan and Degnan 2006, Nielsen 2013, and Mikhailov et al.

    2009 (Figure 1.5A). The second scenario suggests that poriferan larvae evolved independently from the

    planula-like larva, which is the ancestral larval type of cnidarian-bilaterian lineages. This hypothesis

    would be consistent with examinations of apical morphology as suggested by Marlow et al. 2014 (Figure

    1.5B). The third scenario describes the independent evolution of biphasy in poriferan, cnidarian, and bilaterian

    lineages. This scenario is consistent with the planuloid–acoeloid hypothesis, which postulates that

    biphasic species in bilaterians evolved from a planula-like last common ancestor (reviewed in Baguñà

    and Riutort 2004). This hypothesis is also consistent with the deeply conserved gut patterning shown

    between protostome and deuterostome larvae (Arendt et al. 2001; Figure 1.5C). Lastly, it is possible

    that biphasy has evolved convergently (Hadfield 2000) in each of these animal lineages, consistent

    with early Trochaea theories (e.g. Nielsen and Nørrevang 1985; Figure 1.5D).

    [Figure 1.5, panels A-D. Each panel shows the clades Poriferans, Cnidarians, Protostomes, and Deuterostomes.]

    Figure 1.5 Possible scenarios for the evolution of biphasic life cycles. Different evolutionary scenarios can be inferred through homology. Colored blocks represent shared traits between clades and black circles represent a distinct evolutionary event leading to biphasy, following the assumption that homologous traits were also present in the LCA.

    In light of these theories, fundamental unanswered questions emerge: can modern genomics provide

    novel insight into the evolution of the pelagobenthic life cycle, and how can this new information be

    incorporated into the existing theoretical frameworks for life cycle evolution? More specifically, (1)

    under what mechanism can a single genome pattern two different body plans, and (2) what are the

    implications of these findings for the evolution of animal life cycles? In this thesis, I aim to answer these

    fundamental questions using the sponge, Amphimedon queenslandica, as a case study. Furthermore, I

    aim to use comparative techniques to identify commonalities between biphasy in A. queenslandica

    and that of other metazoans.

    1.3.1 The Amphimedon queenslandica model system: a foundational case study

    Sponges are widely considered to be the earliest branching metazoan phyletic lineage (Edgecombe

    et al. 2011), due to their simple adult body plan, which lacks neurons or a central gut. However,

    recent phylogenomic studies report that ctenophores may have branched off from the metazoan stem

    before sponges (Ryan et al. 2013, Moroz et al. 2014). This would suggest that the simplified adult

    body plan of the sponge represents a highly derived state, which resulted from the loss of many tissue

    types previously assumed to be conserved cnidarian-bilaterian synapomorphies (Ryan and Chiodin

    2015). However, despite this phylogenetic ambiguity, the majority of sponges possess a complex

    pelagobenthic life cycle (Wörheide et al. 2012), which is arguably the earliest branching example of

    a primary biphasic life cycle.

    A. queenslandica is a brooding haplosclerid demosponge with a biphasic life cycle consisting of a

    ciliated parenchymella larva and a sessile, filter-feeding adult (Degnan et al. 2015; Figure 1.6). Despite

    a lack of any neuronal signaling, A. queenslandica larvae are photosensitive (Leys and Degnan 2001),

    and display highly coordinated behaviors, including phototaxis and settlement selection (Jackson et al.

    2002, Degnan and Degnan 2010). Additionally, like most biphasic organisms, A. queenslandica larvae

    develop competence before settlement upon exposure to a specific settlement cue, such as crustose

    coralline algae rubble (Degnan and Degnan 2010).

    A. queenslandica provides important insight into the early evolution of animal ontogenies, particularly

    since hallmarks of early embryogenesis and the patterning of larval body plans are largely conserved

    with eumetazoan species (reviewed in Degnan et al. 2015). For instance, like eumetazoan species,

    A. queenslandica displays characteristic features of animal embryogenesis, including asymmetric cell

    division and cell fate determination by well conserved transcription factors and signaling pathways

    (reviewed in Degnan et al. 2015; Adamska et al. 2007). Furthermore, A. queenslandica offers a unique

    opportunity for genomic studies, as it is one of the few basal organisms with a fully sequenced and

    annotated genome (Srivastava et al. 2010) and a reasonably well-annotated developmental transcriptome

    (Fernandez-Valverde et al. 2015).

    1.3.2 Aim 1: Gleaning biological meaning from transcriptomes of non-model organisms

    Despite the abundant genomic resources available for A. queenslandica, as in many other early-branching,

    non-model organisms, only a small portion of the putative gene models have been functionally tested

    in the lab. Hence, genome-wide studies are largely restricted to bioinformatically derived functional

    annotations (such as gene ontology, KEGG, and Pfam) based on sequence similarity to

    better-characterized model systems such as Drosophila, C. elegans, and mice. Therefore, the first aim of my

    dissertation (Chapter 2) is to benchmark widely used computational tools in the A. queenslandica

    system against experimentally validated biological function. Based on these results, I develop a strategy

    for confidently mining existing A. queenslandica data sets to uncover how the biphasic life cycle is

    orchestrated in this species of sponge.

    [Figure 1.6 labels: embryogenesis (adult brood chamber), larval release, acquisition of competence, settlement, metamorphosis, maturation.]

    Figure 1.6 Life cycle of the demosponge, Amphimedon queenslandica. A. queenslandica displays a typical biphasic life cycle with a motile pelagic larva and a benthic adult form. The pelagic and benthic phases are separated by metamorphosis, which follows the settlement of a competent larva in response to a specific chemical cue.

    1.3.3 Aim 2: Analysis of the genomic orchestration of biphasy in A. queenslandica

    Little is currently known about how a single genome is partitioned to pattern morphologically and

    ecologically distinct larval and adult body plans, and whether each phase expresses genes that are

    unique or shared between body plans. Therefore, the second aim of my dissertation (Chapter 3) is to

    use the A. queenslandica developmental transcriptome to investigate how biphasy is orchestrated in an

    extant species of sponge. Furthermore, I use computational estimates of gene age to infer how biphasy

    in A. queenslandica may have evolved.

    1.3.4 Aim 3: Is metamorphosis conserved across the metazoan phyla?

    Given the hypothesis that the biphasic life cycle is an ancient synapomorphic trait of the Metazoa, my

    final chapter (Chapter 4) aims to use a cross-phyla comparative transcriptome approach to substantiate

    theoretical claims for the homology of marine life cycles. While larval traits can inform questions of

    larval homology, it is difficult to infer the mechanism of life cycle evolution (i.e. larval-first

    vs. adult-first hypotheses) from the larva alone. Therefore, I examined developmental

    transcriptomes across metamorphosis (precompetent larval, competent larval, and feeding juvenile

    stages) from six indirectly developing species from different phyla, including the primary biphasic

    species (the cnidarian Acropora sp., the gastropod Haliotis asinina, the hemichordate Balanoglossus

    misakiensis, and the echinoderm Strongylocentrotus purpuratus) and one secondary biphasic species

    (the urochordate Herdmania momus).

    Chapter 2 - Gleaning biological meaning from transcriptomes of non-model organisms: assessment of CEL-Seq developmental expression profiles in the sponge Amphimedon queenslandica

    2.1 Abstract

    Bioinformatic methods that characterize genome and transcriptome content, such as gene ontology

    (GO) and KEGG, are widely used to infer biological function. However, there have been few studies

    comparing the reliability of these computational tools in non-model organisms. Here, I use the marine

    sponge, Amphimedon queenslandica – a species with an annotated genome and relatively well-described

    development – to compare and test these annotation techniques against experimentally validated results

    from the literature. BLIND clustering and correlation analysis of 82 A. queenslandica developmental

    CEL-Seq transcriptomes divided embryogenesis, larval development, metamorphosis, juveniles and

    adult stages into six distinct transcriptional blocks. Through the pairwise comparisons of each sequential

    block, I identified 13,580 significantly differentially expressed genes (DEGs). I then analyzed the A.

    queenslandica genome and developmental DEGs using three GO annotation methods, ‘Blast2GO/

    InterPro’, ‘Trinotate – BLAST’, and ‘Trinotate – Pfam’, and found that each yields different, often

    non-overlapping sets of annotations. GO enrichment analyses accentuate annotation bias, resulting

    in more uniquely enriched ontologies than shared ones between these three different annotations.

    However, GOSlim enrichments showed a higher degree of consistency than enrichment analyses using

    the entire GO hierarchy. Furthermore, I illustrated that Pfam and InterPro methods are more effective

    at identifying particular candidate genes in the A. queenslandica genome, such as G-protein coupled

    receptors, than BLAST-based methods. However, other candidate genes, such as putative transcription

    factors, are not effectively recovered by any GO annotation methods. Nonetheless, GO enrichment

    of DEG lists still largely correlated with known biological function. GOSlim appeared to be the most

    appropriate means to assess biological function, and when used in conjunction with KEGG analyses,

    can infer broad-scale biological trends across the A. queenslandica life cycle. In light of highlighted

    caveats of GO analyses, I provide a combined workflow that utilizes BLIND clustering, GOSlim, and

    KEGG to provide an accurate picture of gene function across the ontogeny of a non-model species.

    2.2 Introduction

    Over the last decade, large-scale sequencing techniques have become cheaper and more accessible,

    allowing previously understudied species to be examined for the first time on a genome- and

    transcriptome-wide scale (e.g. Putnam et al. 2007, Srivastava et al. 2010b, Simakov et al. 2013). When

    combined with new computational approaches (Anavy et al. 2014, Stegle et al. 2015), large RNA-Seq

    data sets, including single-cell transcriptomes (Shapiro et al. 2013, Schwartzman and Tanay 2015,

    Stegle et al. 2015) such as those generated by CEL-Seq (Hashimshony et al. 2012), can be assessed in

    novel ways with little prior knowledge of the underlying biology (Levin et al. 2012, 2016). However,

    this rapid influx of sequencing data has also created multiple challenges, including the functional

    characterization of thousands of newly sequenced and previously unidentified genes.

    Bioinformatic tools, such as gene ontology (GO) (Blake et al. 2015), Pfam (Finn et al. 2015), and

    KEGG (Ogata et al. 1999, Kanehisa et al. 2016a), are often employed to computationally characterize

    and annotate new genomes and transcriptomes, inferring gene function through sequence similarity to

    known orthologs in other species. Given its ease of use and customizability, GO is the most commonly

    employed method of in silico gene annotation. GO has been widely applied to a variety of studies

    focusing on previously uncharacterized species (e.g. Glöckner et al. 2016, Ricci et al. 2016), including

    developmental RNA-Seq studies seeking to understand the role of genes expressed at a particular life

    cycle stage or transition (Reyes-Bermudez et al. 2009a, 2016, Fiedler et al. 2010, Heyland et al. 2011,

    Conaco et al. 2012, Du et al. 2012, Vaughn et al. 2012, Qiu et al. 2015).

    The GO database consists of a hierarchical network of increasingly specific functional ontologies,

    generated from direct experimental assays, mutant phenotypes, and computationally-derived sequence

    or structural similarity (reviewed in Rhee et al. 2008). GO annotations can be assigned to new sequences

    through various methodologies, such as Blast2GO (Conesa et al. 2005, Götz et al. 2008) or Trinotate

    (Haas et al. 2013). These in turn can be further subdivided into methods that rely on whole-gene

    sequence similarity (i.e. BLAST-based inferences of gene orthology), the identification of known

    functional protein domains (i.e. Pfam and InterPro), or a combination of both. However, because the

    GO database is largely derived from a handful of well-characterized model bilaterians, these in silico

    methods of gene annotation are potentially unreliable in non-model species (e.g. Reyes-Bermudez et

    al. 2009b, 2016, Fiedler et al. 2010, Heyland et al. 2011, Conaco et al. 2012b, Du et al. 2012, Vaughn

    et al. 2012, Ventura et al. 2013, Qiu et al. 2015). This may be particularly pronounced in more distantly

    related taxa, such as non-bilaterian animals (i.e. sponges, ctenophores, placozoans and cnidarians; e.g.

    Putnam et al. 2007, Srivastava et al. 2008, 2010b, Shinzato et al. 2011, Conaco et al. 2012b, Fortunato

    et al. 2012, Ryan et al. 2013, Moroz et al. 2014, Qiu et al. 2015) and non-metazoan eukaryotes (e.g.

    choanoflagellates, plants, and fungi - excluding model systems such as Arabidopsis and yeast; Galagan

    et al. 2003, Dean et al. 2005, King et al. 2008, Martinez et al. 2009, Collén et al. 2013, Suga et al.

    2013, Fairclough et al. 2013).

    To test the reliability of GO in a non-model species, I examine gene expression during the ontogeny of

    a marine sponge, Amphimedon queenslandica. As an early branching animal species with an annotated

    genome (Srivastava et al. 2010b), published developmental transcriptomes (Conaco et al. 2012,

    Anavy et al. 2014, Fernandez-Valverde et al. 2015; Levin et al. 2016), and well-described life cycle

    (reviewed in Degnan et al. 2015), A. queenslandica is a good non-bilaterian case study for in silico

    analysis of biological function across development. Here, I use BLIND ordering (Anavy et al. 2014)

    of an 82-sample replicated CEL-Seq data set (Levin et al. 2016) to divide the entire A. queenslandica

    ontogeny into six distinct transcriptional blocks that delineate different developmental stages. I test

    the validity of GO and KEGG methods by comparing the annotations of genes differentially expressed

    across key life cycle transitions with known cellular and developmental processes occurring during

    these periods. I find that each GO annotation method yields different functional enrichments. However,

    GOSlim can be used to conservatively represent A. queenslandica development in accordance with

    observed, and in some cases, experimentally tested biological function. Using this information I

    propose a streamlined workflow for future bioinformatic investigations in non-model organisms that

    minimizes spurious results.
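
The comparison of enrichments between annotation methods described above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, a common choice for GO enrichment, and does not reproduce the Blast2GO or Trinotate pipelines used in this chapter. The gene identifiers, term labels, and the two toy annotation tables are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of drawing at least k annotated genes when n genes
    are sampled without replacement from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enriched_terms(degs, annotations, alpha=0.05):
    """Return {term: p_value} for terms over-represented among the DEGs.

    annotations maps gene -> set of terms; the background is every
    annotated gene (an assumption of this sketch).
    """
    background = set(annotations)
    degs = set(degs) & background
    term_to_genes = {}
    for gene, terms in annotations.items():
        for term in terms:
            term_to_genes.setdefault(term, set()).add(gene)
    result = {}
    for term, genes in term_to_genes.items():
        k = len(genes & degs)
        if k == 0:
            continue  # term absent from the DEG list, nothing to test
        p = hypergeom_sf(k, len(background), len(genes), len(degs))
        if p <= alpha:
            result[term] = p
    return result


# Hypothetical toy data: ten genes, four of them differentially expressed.
genes = [f"g{i}" for i in range(10)]
degs = set(genes[:4])

# Two hypothetical annotation methods that agree on one term only.
method_a = {g: ({"shared_term", "a_only"} if g in degs else {"background_term"})
            for g in genes}
method_b = {g: ({"shared_term", "b_only"} if g in degs else {"background_term"})
            for g in genes}

enriched_a = set(enriched_terms(degs, method_a))
enriched_b = set(enriched_terms(degs, method_b))

shared = enriched_a & enriched_b   # enriched under both methods
only_a = enriched_a - enriched_b   # enriched under method A only
only_b = enriched_b - enriched_a   # enriched under method B only
```

In this toy case each method contributes one uniquely enriched term alongside the shared one, which is the kind of method-specific enrichment that the comparison in this chapter is designed to expose.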

    2.3 Results

    The life cycle of the sponge Amphimedon queenslandica (Figure 2.1A; reviewed in Degnan et al. 2015)

    begins in the maternal brood chamber, where the early cleaving embryos (cleavage stage) undergo

    development into a free-swimming larva. Embryogenesis is complete when the larval body plan is

    fully established, at which time the brood chamber-bound late-ring embryos are morphologically

    indistinguishable from larvae. Upon emergence from the maternal sponge, the ciliated larva swims in

    the water column until it is competent to respond to ecological and chemical sensory stimuli (Jackson

    et al. 2002, Degnan and Degnan 2010). Once competence is attained, the larva settles and initiates

    metamorphosis on the benthos. Metamorphosis from a pelagic larva to a benthic feeding juvenile

    (oscula stage) takes three to four days at 25 °C, before the established juvenile (oscula) grows and

    matures over an undetermined period of time into a sexually mature adult.

    2.3.1 BLIND clustering largely places transcriptomes in expected temporal order based on morphology

    Because the maternal brood chamber contains mixed embryos from all stages of embryogenesis,

    identification of particular developmental stages is limited to manual morphological characterization.

    However, a computational method called BLIND ordering (Anavy et al. 2014) has been

    developed to sort individual RNA-Seq data sets into the correct temporal order. This

    method has been used to compare and contrast developmental transcriptomes between species (Levin

    et al. 2016), including a comprehensive timecourse of A. queenslandica embryogenesis consisting of

    59 individual transcriptomes from early cleavage to late-larvae (Anavy et al. 2014, Levin et al. 2016).

    However, these previous studies lack representa