Benchmarking and comparing software for phylogenetic analysis from genome-scale data

COMP4560

Qiuyue Wang

U6342378

May 2019

Supervisors: Minh Bui and Yu Lin

Table of contents

Figure List
Table List
Equation List
Acknowledgement
Abstract
1. Introduction
    1.1 The theoretical basis of phylogeny
    1.2 Molecular data
        1.2.1 Deoxyribonucleic Acid
        1.2.2 Amino Acid
    1.3 Phylogenetic tree
    1.4 Phylogenetic tree reconstruction method
    1.5 Aim
2. Background
    2.1 Phylogenetic inference and tree-space search
        2.1.1 Nearest Neighbor Interchange
        2.1.2 Subtree-Pruning-and-Regrafting
        2.1.3 Time complexity
    2.2 Phylogenetic inference by maximum likelihood
        2.2.1 RAxML
        2.2.2 IQ-TREE
        2.2.3 RAxML-NG
        2.2.4 Related work
3. Method
    3.1 Benchmark framework
    3.2 Tree inference software
    3.3 Cluster configuration
    3.4 Data sets
    3.5 Evaluation method
4. Results & Discussion
    4.1 Maximum log-likelihood score
    4.2 Running time
    4.3 Parallel efficiency
    4.4 Memory usage
5. Conclusion
6. References
7. Appendix
    7.1 Final project description
    7.2 Project contract
    7.3 Artefacts
        7.3.1 List of all program code files
        7.3.2 Details of testing code
        7.3.3 Experimental environment
    7.4 README file
    7.5 Maximum log-likelihood scores for all tree inferences
    7.6 Runtime, memory usage and parallel efficiency for all tree inferences

Figure List

Figure 1: The same fragment of DNA alignment of different mammals, derived from a DNA dataset.
Figure 2: The same fragment of amino acid alignment of different land plants, derived from a
protein dataset.
Figure 3: An example phylogenetic tree inferred with IQ-TREE for a dataset and visualized with
FigTree (v1.4.3).
Figure 4: Simple visualization of rooted and unrooted trees.
Figure 5: Flowchart of constructing phylogenetic tree.
Figure 6: Pruning and grafting of heuristic search algorithm.
Figure 7: Simple schematic of NNI.
Figure 8: Simple schematic of SPR.
Figure 9: The overall pipeline of my benchmark framework.
Figure 10: Difference of maximum log-likelihood score to the score of best-known trees.
Figure 11: Wall-clock execution time in hours (16 threads).
Figure 12: The percentage of efficiency that runs in parallel using multiple threads.
Figure 13: Maximum memory usage in GB during the running.

Table List

Table 1: Software and related information.
Table 2: Hardware and software configuration of GDU server.
Table 3: Data sets with varying characteristics for evaluating.
Table 4: Command lines used for tree inference.
Table 5: The number of maximum log-likelihood tree inferences (out of 5) which yield the
best-known tree per dataset and inference software.
Table 6: The ratio of average RAxML-NG wall-clock running time relative to RAxML and IQ-TREE.
Table 7: All likelihood scores extracted from the output files and the difference with the
maximum likelihood score of the best tree.
Table 8: All running time, memory usage and parallel efficiency calculated from the output files.

Equation List

Equation 1: Formula for calculating the percentage of parallel efficiency
Equation 2: Formula for calculating theoretical maximum memory value

Acknowledgement

First of all, I would like to thank my supervisors, Minh Bui and Yu Lin. Thank you for your
careful guidance, which gave me a deeper understanding of phylogenetic analysis and of how to
evaluate software. Thank you for giving me the opportunity to practice. Thank you for your
patience every time I was confused. Thank you for teaching me how to take academic issues
seriously.

Secondly, I would like to thank Cameron Jack from the ANU bioinformatics consultancy. Thank you
for helping me access the GDU server so that I could use its large computing resources. Thanks to
Alexey M. Kozlov, the developer of RAxML-NG, for promptly answering my question on GitHub when
I was confused about how to operate the software.

I would also like to thank my colleague Yu Zhang. Thank you for your help and encouragement on
weekdays.

Thanks to all those who have helped me.

Abstract

With the rapid development of sequencing technologies in recent years, nucleotide and amino acid
sequences are being collected at an increasing pace. Biologists use aligned DNA or protein
sequences to infer evolutionary relationships between species (i.e. phylogenies). A number of
maximum-likelihood-based phylogenomic programs such as IQ-TREE, RAxML and PhyML have emerged,
raising the need to benchmark the available approaches and thus help users make an appropriate
choice. This project designed an automated pipeline to benchmark different phylogenomic software
on different datasets. The results were evaluated through a multi-faceted assessment covering
maximum likelihood score, running speed, parallel efficiency, and memory usage. Twenty datasets
with distinct characteristics (type, length, number of taxa, etc.) were collected to compare
IQ-TREE, RAxML-NG, and RAxML.

In terms of maximum log-likelihood score, RAxML-NG performed best overall. It found the
best-known tree for fifteen of the twenty data sets, followed by IQ-TREE, which found the best
tree for twelve. RAxML-NG and IQ-TREE performed well on both DNA and protein data sets. In
contrast, RAxML found the best tree for only six data sets, five of which were protein data sets.
For some taxon-rich data sets, only RAxML-NG found the best tree, and only in one or two of the
five runs; neither IQ-TREE nor RAxML found it. Across repeated runs, IQ-TREE gave the most stable
results: it found the best tree in all five tree inferences for eleven data sets (RAxML-NG: 9;
RAxML: 5). RAxML-NG had a considerable advantage in speed. It was the fastest program on
seventeen of the twenty data sets. On the remaining three, RAxML-NG was slower than IQ-TREE or
RAxML but found better ML trees. RAxML-NG and RAxML had the best parallel efficiency, and there
remains considerable room for IQ-TREE to optimize its parallel implementation, especially when
there are multiple partitions. The memory usage of the three programs was broadly in line with
expectations, but for some very large data sets all three used more memory than expected.

This assessment helps users understand the strengths, weaknesses and performance of each program
and select the one that best suits their needs. It also gives developers insight into how to
improve their software, e.g. by improving the parallel efficiency of partition models in IQ-TREE.

1. Introduction

1.1 The theoretical basis of phylogeny

In the mid-19th century, Charles Darwin proposed his theory of biological evolution (C. Darwin,
1859). He gave a systematic explanation of the origin and development of the biological world,
overturning the dominance of the idealistic metaphysics of special creationism in biology and
bringing a revolutionary change to evolutionary biology (Mayr, 2003). The theory of a last
universal common ancestor is the theoretical basis for constructing a phylogenetic tree (Woese et
al., 1990). It states that all forms of life on Earth have a common origin. Animals, plants,
fungi, protists and prokaryotes all share a common evolutionary history and are related to one
another, whether closely or distantly. At the molecular level, the genetic codes of all organisms
are highly consistent.

A significant problem in the field of phylogeny is to reconstruct the evolutionary history

of all species and to use phylogenetic trees to represent evolutionary relationships

between biological groups. It is an important part of evolutionary biology research (Nei

and Kumar, 2000). Establishing a reliable phylogenetic relationship is not only the basis

for taxonomic classification and naming, but also a prerequisite for elucidating the

origin and spread of the genus, exploring the evolution of traits, and revealing the

mechanism of species formation (Futuyma, 1998; Soltis, 2000).

In the past, technical limitations meant that biologists could rely only on the morphological
characteristics of living things to infer the genetic relationships between species. However,
these characteristics have limitations. Organisms with very different morphology can still be
related, such as whales and bats. Advances in molecular biology have made phylogenetic
reconstruction possible. Molecular data, especially DNA alignments, are abundant and comparable,
and they lend themselves to standardized data analysis (Nei, 1987). They have become an important
resource for evolutionary biology research. The theory and methods of constructing phylogenetic
trees based on mathematics and statistics have also developed rapidly, forming a new research
field named molecular phylogeny. It refers to the use of information from biological molecules to
infer the evolutionary history of organisms, or to reconstruct the phylogenetic relationships of
biological groups.

1.2 Molecular data

1.2.1 Deoxyribonucleic Acid

DNA is a long-chain polymer composed of nucleotides (Watson and Crick, 1953). The

nitrogenous bases of the nucleotides are adenine (A), guanine (G), cytosine (C), and

thymine (T). Almost all organisms store genetic information in the DNA. Since DNA

contains genetic information, through the comparison of DNA sequences, one can infer

the evolutionary relationship of organisms.

The primary structure of a DNA sequence is usually written as a string of letters (Nei, 1987).
Each letter represents a base, and the only possible letters are A, T, C, and G. Because
different DNA fragments evolve at very different rates, DNA sequences can be compared to study
the evolutionary relationships of organisms at almost all levels. The genetic information
contained in DNA sequences is important for elucidating the evolution of multigene families and
understanding adaptive evolution at the molecular level (Nei, 1987). Figure 1 shows how the same
DNA fragment differs between species.

Figure 1: The same fragment of DNA alignment of different mammals, derived from a

DNA dataset (Tarver et al., 2016).
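
As a small illustration of comparing aligned DNA sequences, the sketch below computes the
proportion of differing sites (often called the p-distance) between two rows of an alignment.
The function name, the toy sequences and the rule of skipping sites with a gap or an unknown
base are assumptions made for this example; they are not taken from the dataset in Figure 1.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Sites where either sequence has a gap ('-') or an unknown base ('N')
    are skipped; this is one common convention and an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment with made-up sequences (hypothetical taxa).
alignment = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTACGTTC",
    "taxon3": "ACGAAC-TTC",
}

for first, second in [("taxon1", "taxon2"), ("taxon1", "taxon3")]:
    distance = p_distance(alignment[first], alignment[second])
    print(first, second, round(distance, 3))
```

Pairwise distances of this kind are the input to distance-based tree building methods.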

1.2.2 Amino Acid

Amino acids are the basic units of proteins. Before rapid DNA sequencing methods were invented
(Maxam and Gilbert, 1977; Sanger et al., 1977), most molecular evolution research was based on
amino acid sequences. Amino acid sequences are more conserved than DNA sequences and therefore
provide more helpful information about the long-term evolution of genes and species (Nei and
Kumar, 2000). Figure 2 shows the same fragment of an amino acid alignment for several land
plants. Each amino acid is represented by a single letter. With this type of data, it is possible
to determine evolutionary differences between sequences and begin to construct phylogenetic
trees.

Figure 2: The same fragment of amino acid alignment of different land plants, derived

from a protein dataset (Wickett et al., 2013).

1.3 Phylogenetic tree

The process of biological evolution is not directly visible. What happened in the past can only
be understood through relevant clues, and scientists use these clues to establish hypotheses,
models, and even the history of life. In the study of systematic classification, the most
commonly used method is to visualize evolutionary relationships as a phylogenetic tree (e.g.
Figure 3).

Charles Darwin introduced the concept of evolutionary "trees" in his groundbreaking

work "The Origin of Species" (C. Darwin, 1859). The phylogenetic tree uses a tree-like

branch diagram to represent the relationship between various organisms. The history of

species evolution is inferred by studying biological sequences, mainly DNA sequences

and amino acid sequences.

Figure 3: An example phylogenetic tree inferred with IQ-TREE (1.6.10) for a dataset (Tarver et
al., 2016) and visualized with FigTree (v1.4.3).

The phylogenetic relationships of organisms are often represented by a rooted or unrooted tree.
A rooted tree contains a unique root node that acts as the most recent common ancestor of all
species in the tree. Removing the root from a rooted tree yields an unrooted tree. An unrooted
tree has no direction: evolution may have proceeded in either direction along each branch. The
internal nodes of the tree represent evolutionary events or common ancestors in the evolutionary
process. The external nodes, also called leaf nodes, represent different species.

Figure 4: Simple visualization of (a) rooted and (b) unrooted trees. The internal nodes
represent potential common ancestors and leaf nodes A-F represent different species such as
human, chimpanzee and gorilla.

Building a phylogenetic tree involves four steps: selecting sequences from different species,
aligning the sequences, inferring phylogenetic trees, and evaluating phylogenetic trees. The
process is shown in Figure 5. This report focuses on the last two steps. The main task in
inferring a phylogenetic tree is to find the optimal tree topology and estimate the branch
lengths. The related algorithms and common software are described in detail later.

Figure 5: Flowchart of constructing phylogenetic tree.

1.4 Phylogenetic tree reconstruction method

Molecular phylogenetic tree inference methods include distance-based methods, the maximum
likelihood method (Felsenstein, 1981), the maximum parsimony method (Fitch, 1971) and the
Bayesian method (Rannala and Yang, 1996; Mau and Newton 1997). The latter three are based on
discrete characters. A commonly used distance-based approach is the Neighbor-Joining method
(Saitou and Nei, 1987). It minimizes the total length of the phylogenetic tree by identifying the
closest (neighboring) pairs of taxonomic units. Distance-based methods usually find an
approximately minimal tree rather than the exact minimum. A disadvantage is that they are
sensitive to differences in mutation rate between species. The maximum parsimony method evaluates
the possible topologies and picks the one that requires the smallest number of substitutions as
the optimal phylogenetic tree. It assumes that an evolutionary relationship with fewer mutations
is more likely to be the true evolutionary relationship between species (Sober, 1988). The
maximum likelihood method analyzes a given set of sequences under a specific substitution model,
maximizes the likelihood of each topology considered, and selects the topology with the largest
likelihood score as the best phylogenetic tree. The Bayesian method, based on a model of
molecular evolution, uses Markov chain Monte Carlo to generate posterior probability estimates of
all parameters and finally selects the tree with the highest reliability. Unlike the maximum
likelihood method, the Bayesian algorithm specifies the structure of the tree and the evolution
model first and calculates the probability of sequence composition to infer the corresponding
phylogenetic tree.
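
To make the parsimony criterion concrete, the sketch below counts the minimum number of state
changes that one alignment column requires on a fixed rooted binary tree, using the set-based
procedure of Fitch (1971). The nested-tuple tree encoding, the function name and the toy column
are assumptions made for this example.

```python
def fitch_changes(tree, column):
    """Minimum number of state changes for one alignment column.

    tree   -- rooted binary tree as nested 2-tuples of leaf names
    column -- dict mapping each leaf name to its character at this site
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {column[node]}
        left, right = (state_set(child) for child in node)
        shared = left & right
        if shared:                         # children agree: no change needed
            return shared
        changes += 1                       # children disagree: one change
        return left | right

    state_set(tree)
    return changes


# Toy column for four hypothetical taxa.
column = {"A": "A", "B": "A", "C": "G", "D": "G"}
grouped = (("A", "B"), ("C", "D"))
mixed = (("A", "C"), ("B", "D"))
print(fitch_changes(grouped, column), fitch_changes(mixed, column))
```

For this column the first topology needs fewer changes than the second, so parsimony prefers
it. A full parsimony score sums this count over all columns of the alignment.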

There are many methods to infer a phylogenetic tree, and each has its own advantages and
disadvantages. Therefore, in practice, it is often necessary to combine different construction
methods, chosen according to the research question, to obtain the best analysis results. In
general, the maximum likelihood method is considered to be more efficient than the distance and
parsimony methods when the evolution model is chosen reasonably, and the results are in good
agreement with the facts of evolutionary history (Yang and Rannala, 2012; Whelan and Morrison,
2017). For closely related species sequences, the maximum parsimony method is usually used; for
distant species sequences, the Neighbor-Joining method or the maximum likelihood method is
generally used.

1.5 Aim

The purpose of this report is to design a framework to evaluate several commonly used
phylogenetic programs based on the maximum likelihood method, compare their performance, and
draw conclusions.

The specific objectives of this study are:

I. Develop and design an automated pipeline to invoke different software to infer

the phylogenetic tree for large data sets entered.

II. Collect and benchmark the performance of different software from various

aspects (stability, speed, parallel efficiency, etc.).

III. Analyze the results and provide suggestions to both users and developers.

To achieve these goals, I designed an automated benchmark pipeline. When the user supplies an
alignment file and a partition file, the pipeline automatically calls the different programs to
infer the phylogenetic tree multiple times. After the trees have been inferred, the pipeline
collects the maximum log-likelihood scores, run times, parallel efficiency and other relevant
information, and generates a report containing a CSV table of raw data and corresponding violin
plots for multifaceted assessment.
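
The control flow of such a pipeline can be sketched roughly as follows. This is a simplified
illustration, not the project's actual code: `benchmark`, `to_csv` and the callables in `tools`
are hypothetical names, and each callable stands in for launching one tree inference (for
example through a subprocess) and returning the log-likelihood it reports.

```python
import csv
import io
import time
from typing import Callable, Dict, List

FIELDS = ["tool", "seed", "log_likelihood", "wall_seconds"]


def benchmark(tools: Dict[str, Callable[[int], float]], repeats: int = 5) -> List[dict]:
    """Run every tool `repeats` times and record score and wall-clock time.

    Each value in `tools` is a hypothetical callable that runs one tree
    inference with the given random seed and returns its log-likelihood.
    """
    rows = []
    for name, run_inference in tools.items():
        for seed in range(1, repeats + 1):
            start = time.perf_counter()
            log_likelihood = run_inference(seed)
            elapsed = time.perf_counter() - start
            rows.append({
                "tool": name,
                "seed": seed,
                "log_likelihood": log_likelihood,
                "wall_seconds": elapsed,
            })
    return rows


def to_csv(rows: List[dict]) -> str:
    """Serialise the collected rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the inference step in as a callable keeps the loop independent of any particular
program's command line, so the same skeleton can be exercised with stand-in functions.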

Different software may exhibit distinct performance for data sets with varying

characteristics. The numerical results and plot results provided by the pipeline help

users select software that is more suitable for their chosen data set. These results may

also help developers find the advantages and disadvantages of the software itself and

improve it.

2. Background

2.1 Phylogenetic inference and tree-space search

Building a phylogenetic tree is a typical NP-complete problem when the number of species is
large (Foulds and Graham, 1982). This means that the problem is unlikely to be solved efficiently
and exactly, so in practice only approximate solutions can be sought. Heuristic search algorithms
have been developed to search tree space (Chor and Tuller 2005). An iterative "hill-climbing"
optimization technique is applied: the current tree is modified by a rearrangement algorithm, and
the rearranged tree replaces it if it is better according to the maximum likelihood criterion.
The heuristic search prunes subtrees and regrafts them at other locations of the best tree found
so far, producing trees with a topology similar to that of the current tree (Figure 6). This
process is repeated until the algorithm terminates. Heuristic search greatly reduces the number
of trees that have to be examined, which keeps the amount of computation manageable.

Figure 6: Pruning and grafting of heuristic search.
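
To give a sense of why exhaustive search is infeasible, the sketch below counts the distinct
unrooted binary topologies for n taxa using the standard formula (2n - 5)!! = 3 x 5 x ... x
(2n - 5), valid for n >= 3. The function name is chosen for this example.

```python
def count_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of distinct unrooted binary tree topologies for n_taxa leaves."""
    if n_taxa < 3:
        raise ValueError("the formula applies to three or more taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


for n in (4, 5, 10, 20, 50):
    print(n, count_unrooted_binary_trees(n))
```

The count grows faster than exponentially with the number of taxa, which is why practical
programs evaluate only a small, heuristically chosen part of tree space.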

Although the different phylogenetic programs are based on the same maximum likelihood method,
they differ in the choice and implementation of tree rearrangement algorithms. The heuristic
search algorithms currently in use are the Nearest-Neighbor-Interchange (NNI) algorithm and the
Subtree-Pruning-and-Regrafting (SPR) algorithm.

2.1.1 Nearest Neighbor Interchange

Nearest Neighbor Interchange (NNI) exchanges subtrees around an internal branch of the main
tree in an attempt to obtain a tree with a higher likelihood (Robinson 1971). More specifically,
the five branches around an internal edge (the edge itself and the four branches attached to it)
are first removed, which disconnects four subtrees. The four subtrees are then reconnected in a
different way. There are three possible ways to connect four subtrees, so in addition to the
original connection, the interchange creates two new trees. This process is repeated until no
better tree is generated. Exhaustively searching the nearest neighbors of every internal branch
is the slowest but most thorough way to perform the search. Figure 7 shows an example of NNI in
which subtrees B and C, or B and D, are exchanged.

Figure 7: Simple schematic of NNI. The two possible exchanges on an internal edge.
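
The two exchanges around one internal edge can be written down directly. In this sketch the
four subtrees are represented by labels and a tree is encoded as a pair of pairs, one pair for
each end of the internal edge; this encoding and the function name are assumptions made for
the example.

```python
def nni_neighbours(quartet):
    """Return the two NNI rearrangements around one internal edge.

    quartet -- ((a, b), (c, d)): subtrees a and b hang off one end of the
               internal edge, subtrees c and d off the other end.
    """
    (a, b), (c, d) = quartet
    swap_b_and_c = ((a, c), (b, d))   # exchange subtrees b and c
    swap_b_and_d = ((a, d), (c, b))   # exchange subtrees b and d
    return [swap_b_and_c, swap_b_and_d]


original = (("A", "B"), ("C", "D"))
for neighbour in nni_neighbours(original):
    print(neighbour)
```

Together with the original arrangement, these are the three ways of connecting four subtrees.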

2.1.2 Subtree-Pruning-and-Regrafting

Subtree-Pruning-and-Regrafting (SPR) is a much broader heuristic search method. It selects a
subtree, separates it from the main tree and reinserts it on another branch of the main tree.
Each move creates a new tree topology, whose likelihood is then calculated and evaluated for
potential improvement (Swofford et al., 1996). This process is repeated for subtrees within the
specified level until no significant improvement is made. In Figure 8, the red subtree is cut
and reinserted at another position. This method explores changes to known trees that are already
close to the minimum length. This reduces the amount of computation because it is more efficient
than checking a large number of alternative trees of unknown length.

Figure 8: Simple schematic of SPR. The red subtree is pruned and grafted

into other branches.

2.1.3 Time complexity

A single pass of the NNI algorithm examines O(N) topologies, where N is the number of leaves in
the original tree. In contrast, a single pass of the SPR algorithm examines O(N^2) new trees. SPR
therefore considers more trees than NNI and is more time-consuming, but it explores tree space
more broadly. NNI sometimes fails to find shorter trees.
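
The linear and quadratic growth can be illustrated with the standard neighbourhood sizes for
an unrooted binary tree with n leaves: 2(n - 3) trees are one NNI move away and
2(n - 3)(2n - 7) trees are one SPR move away. These closed forms are well-known results quoted
here for illustration; the function names are chosen for this sketch.

```python
def nni_neighbourhood_size(n_leaves: int) -> int:
    """Trees reachable by one NNI move from an unrooted binary tree."""
    if n_leaves < 4:
        raise ValueError("NNI needs at least one internal edge (n >= 4)")
    return 2 * (n_leaves - 3)


def spr_neighbourhood_size(n_leaves: int) -> int:
    """Trees reachable by one SPR move from an unrooted binary tree."""
    if n_leaves < 4:
        raise ValueError("SPR neighbourhood is defined here for n >= 4")
    return 2 * (n_leaves - 3) * (2 * n_leaves - 7)


for n in (4, 5, 100, 1000):
    print(n, nni_neighbourhood_size(n), spr_neighbourhood_size(n))
```

For 100 leaves a full SPR pass has to consider tens of thousands of candidate trees, against
fewer than two hundred for NNI, which matches the trade-off described above.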

2.2 Phylogenetic inference by maximum likelihood

All of the phylogenetic programs evaluated in this report are based on the maximum likelihood
method. Maximum likelihood is a statistical method that explicitly uses a probability model
(Felsenstein, 1981). Its goal is to find the phylogenetic tree under which the observed data are
most probable. It is a commonly used statistical method of phylogenetic tree reconstruction.

The maximum likelihood method was first applied to phylogenetic analysis in the analysis of gene
frequency data. The principle is to consider the likelihood of the residues at each site and to
sum the probabilities of all possible residue substitutions at that site to produce a likelihood
value for the site (Felsenstein, 1981). The maximum likelihood method computes a likelihood
value for each possible phylogenetic tree considered. The tree with the largest likelihood value
is taken as the most likely phylogenetic tree. To infer a phylogenetic tree from an alignment by
maximum likelihood, we first need to choose a model of sequence evolution. Common substitution
models for nucleotide sequences include the Jukes-Cantor model (Jukes and Cantor, 1969) and the
Kimura 2-parameter model (Kimura, 1980). For protein sequences, the Poisson correction is
commonly used. The maximum likelihood method rests on sound statistical theory. Its disadvantage
is that it requires a considerable amount of computation and can be time-consuming.
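
As a minimal worked case of a likelihood calculation, the sketch below evaluates the
Jukes-Cantor log-likelihood of two aligned sequences separated by a given distance (in expected
substitutions per site) and compares it with the closed-form maximum likelihood distance
d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. It covers only two
sequences, not a full tree; the function names and toy sequences are assumptions for this
example.

```python
import math


def jc69_log_likelihood(seq_a: str, seq_b: str, distance: float) -> float:
    """Log-likelihood of two gap-free aligned sequences under Jukes-Cantor.

    distance -- expected substitutions per site between the sequences (> 0)
    """
    decay = math.exp(-4.0 * distance / 3.0)
    p_same = 0.25 + 0.75 * decay      # probability a site keeps its base
    p_diff = 0.25 - 0.25 * decay      # probability of one specific other base
    total = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the base in the first sequence.
        total += math.log(0.25 * (p_same if a == b else p_diff))
    return total


def jc69_ml_distance(seq_a: str, seq_b: str) -> float:
    """Closed-form maximum likelihood distance under Jukes-Cantor."""
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for a finite estimate")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


first, second = "ACGTACGTAC", "ACGTACGTTC"   # made-up sequences
best = jc69_ml_distance(first, second)
for d in (0.01, 0.05, best, 0.2, 0.5):
    print(round(d, 4), round(jc69_log_likelihood(first, second, d), 4))
```

The log-likelihood is highest at the closed-form estimate. On a full tree the same site
likelihoods are combined over all branches, which is where the computational cost arises.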

Three popular, fast phylogenetic programs based on maximum likelihood were chosen for our
evaluation: RAxML, IQ-TREE and RAxML-NG. They all support partitioned analysis and both common
and custom models, but they differ in the choice and implementation of topological moves and in
the trade-off between speed and performance.

2.2.1 RAxML

RAxML (Randomized Axelerated Maximum Likelihood) is a program for inferring large phylogenetic
trees by maximum likelihood. It uses a fast tree search algorithm and returns trees with good
likelihood scores (Stamatakis et al., 2006, 2014). The latest version, 8.2.12, employs SPR-based
heuristic search and "lazy subtree rearrangements" to reduce the number of unpromising SPR
alternatives (Stamatakis et al. 2005). On the one hand, the lazy subtree rearrangement algorithm
limits the distance between the re-grafting position and the pruning position. On the other
hand, when a re-grafting results in a worse likelihood score, all branches further away from
that re-grafting position are no longer considered (Stamatakis et al., 2007). RAxML also adjusts
the rearrangement distance dynamically (Stamatakis et al. 2006): several rearrangement distances
are tried on the starting tree, and the minimum distance that produces the best likelihood
improvement is selected for inference. RAxML has also been parallelized with MPI to run multiple
bootstraps and inferences on multiple starting trees in parallel. RAxML performs very well in
terms of accuracy and speed (Stamatakis et al., 2006).

2.2.2 IQ-TREE

IQ-TREE is a widely used program for phylogenetic analysis of genome-scale data. IQ-TREE (latest version 1.6.10) employs a tree search strategy designed to avoid getting stuck in local optima. It combines existing phylogenetic and combinatorial optimization techniques into an efficient tree search strategy (Nguyen et al., 2015): extensive sampling of initial starting trees, an NNI-based hill-climbing search, and a stochastic perturbation of the current best trees to escape the local NNI optima in which pure hill-climbing methods get trapped. In detail, IQ-TREE generates multiple initial trees and maintains a set of candidate trees that is updated throughout the search. In each iteration, IQ-TREE randomly selects a candidate tree and modifies it using stochastic perturbations. An NNI-based hill-climbing search is then applied to the perturbed tree. If a tree with a higher likelihood score is generated, it replaces the worst of the current candidate trees; otherwise, the iteration counts as failed, and the search terminates once the number of failed iterations exceeds a limit. In addition, IQ-TREE uses some elements of evolution strategies to extend the search of tree space (Nguyen et al., 2015). IQ-TREE shows good performance and is a time- and search-efficient ML tree reconstruction program.
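The search loop described above can be illustrated on a toy problem. The following Python sketch is not IQ-TREE's implementation: integers stand in for trees, a step of +1 or -1 stands in for an NNI move, a random jump stands in for the stochastic perturbation, and all names (`hill_climb`, `stochastic_search`, `pool_size`, `max_failures`, `toy_score`) are hypothetical. It also assumes that an iteration counts as successful only when it improves on the best candidate found so far, and that a success resets the failure counter.

```python
import random


def hill_climb(x, score, neighbours):
    """Greedy local search: move to the best neighbour until none improves."""
    while True:
        best = max(neighbours(x), key=score)
        if score(best) <= score(x):
            return x
        x = best


def stochastic_search(starts, score, neighbours, perturb,
                      pool_size=5, max_failures=20, seed=0):
    rng = random.Random(seed)
    # Candidate set: the best locally optimal solutions reached from the starts.
    pool = sorted((hill_climb(s, score, neighbours) for s in starts),
                  key=score, reverse=True)[:pool_size]
    failures = 0
    while failures < max_failures:
        # Perturb a randomly chosen candidate, then climb again.
        start = perturb(rng.choice(pool), rng)
        candidate = hill_climb(start, score, neighbours)
        if score(candidate) > score(pool[0]):
            pool[-1] = candidate                 # replace the worst candidate
            pool.sort(key=score, reverse=True)
            failures = 0                         # assumption: success resets the counter
        else:
            failures += 1
    return pool[0]


def toy_score(x):
    # Toy objective: every multiple of 10 is a local optimum, 70 is the global one.
    return -abs(x - 70) + (15 if x % 10 == 0 else 0)


best = stochastic_search(
    starts=[0, 33, 95],
    score=toy_score,
    neighbours=lambda x: [x - 1, x + 1],
    perturb=lambda x, rng: x + rng.randint(-25, 25),
)
```

Pure hill-climbing from the three starting points stops at 0, 40 and 90. The perturbation step is what gives the search a chance to leave these local optima, although nothing guarantees that the global optimum is reached before the failure limit is hit.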

2.2.3 RAxML-NG

In 2018, the RAxML development team released RAxML-NG (Next Generation), which re-implements the established greedy tree search algorithm of RAxML from scratch (Kozlov et al., 2018). Compared to RAxML, RAxML-NG supports more evolutionary models and can optimize all model parameters. The user can also fix any parameter, including branch lengths, to a given value as needed. RAxML-NG fixes an issue where the subtree enumeration method used in RAxML occasionally skipped promising topologies. It also integrates several recently published methods to improve performance. To increase speed, RAxML-NG employs site repeats (Kobert et al., 2017), a technique that optimizes likelihood computation. To improve parallel efficiency, it integrates the load balancing algorithms and parallel I/O optimizations used in ExaML (Kozlov et al., 2015), a tool for phylogenomic analyses of large datasets on supercomputers. Overall, RAxML-NG offers greater accuracy, speed, scalability and usability.

2.2.4 Related work

Comprehensive benchmarks of the latest phylogenetic software are relatively scarce. Most evaluations are carried out by software developers to compare their own program with similar programs available at the time, and these studies quickly become outdated. For example, in 2006 the developers of RAxML compared RAxML with other phylogeny programs of the same period, including GARLI (Zwickl, 2006), MrBayes (Ronquist and Huelsenbeck, 2003), IQPNNI (Minh et al., 2005) and PHYML (Guindon and Gascuel, 2003). Their results showed that RAxML performed best. Some of these programs are no longer updated or have since fallen out of use.

A recent paper (Zhou et al., 2018) evaluated four fast maximum likelihood-based phylogenetic programs in terms of likelihood score, topology and computational speed: IQ-TREE (1.5.5), RAxML (8.2.11), FastTree (2.1.10) and PhyML (20160531). The results showed that IQ-TREE achieved the best tree inference accuracy, closely followed by RAxML. In contrast, PhyML generated trees with lower likelihood scores. In terms of running time, IQ-TREE was faster than RAxML for both protein and DNA data. On average, PhyML was 1.5 times faster than RAxML on protein data sets, but 3.1 times slower on DNA data sets. FastTree was the fastest of the four programs but produced the trees with the worst likelihood scores.

In another newer paper (Kozlov et al., 2018), developers of RAxML-NG compared IQ-

TREE (1.6.7), RAxML-NG (0.6.0), ExaML (3.0.19) and RAxML (8.2.10). Their results

showed that RAxML-NG got better likelihood scores in both DNA datasets and protein

datasets. Ranked second was IQ-TREE. RAxML and ExaML performed similarly in

terms of maximum likelihood scores. RAxML-NG was also the fastest software on

most datasets. In the remaining few data sets, RAxML-NG was slower than ExaML or

RAxML but got trees with better likelihood scores.

It can be seen that systematic evaluation and comparison of phylogeny software are

relatively lacking, raising the need for benchmarking of available programs and thus

helping users to make an appropriate choice.

3. Method

3.1 Benchmark framework

Figure 9: The overall pipeline of my benchmark framework. When the user inputs an alignment file and a partition file, the pipeline automatically invokes the different programs to infer phylogenetic trees multiple times. It then collects and calculates all results, including maximum likelihood scores, run time, parallel efficiency and memory usage, and generates a report containing tables of raw data and the corresponding violin plots.

As Figure 9 shows, the main strength of the framework is its automation. The user only needs to enter the directory where the alignment files and partition files are located, the number of times each program should be run, and the number of threads to use. The pipeline then automatically traverses all alignments in the directory and invokes each program to infer phylogenetic trees. It collects information from all output files and summarizes the results, including the maximum likelihood score of each tree, the time and memory needed to infer it, and the parallel efficiency. Finally, a table containing the raw data and the corresponding violin plots are returned to the user. Another advantage of the framework is that it evaluates parallel efficiency and memory usage, which were not covered in previous research.
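The traversal step can be sketched as follows. This is a minimal illustration rather than the code of pipeline.py: the function names `find_datasets` and `plan_runs` are hypothetical, and the sketch assumes that alignments end in `.phy` and that a partition file, when present, shares the alignment's base name with a `.part` extension. It is written for Python 3.

```python
import os


def find_datasets(root, alignment_ext=".phy", partition_ext=".part"):
    """Walk `root` and pair every alignment with its partition file (or None)."""
    datasets = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(alignment_ext):
                continue
            partition = name[:-len(alignment_ext)] + partition_ext
            datasets.append((
                os.path.join(dirpath, name),
                os.path.join(dirpath, partition) if partition in filenames else None,
            ))
    # Sort by alignment path so that the run order is reproducible.
    return sorted(datasets, key=lambda pair: pair[0])


def plan_runs(datasets, programs, repeats):
    """One run per (program, dataset, seed); seeds 1..repeats are an assumption."""
    return [(program, alignment, partition, seed)
            for alignment, partition in datasets
            for program in programs
            for seed in range(1, repeats + 1)]
```

With twenty datasets, three programs and five repeats, `plan_runs` would list 300 tree inferences, which matches the fifteen trees per dataset described in the evaluation method.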

3.2 Tree inference software

Considering the results of previous studies, IQ-TREE, RAxML and the recently developed RAxML-NG were selected for evaluation in this report. The version, release date and reference of each program are shown in Table 1. The three programs were compared in terms of maximum likelihood score, run time, memory usage and parallel efficiency.

Software    Version   Release date     References
IQ-TREE     1.6.10    March 2019       Nguyen et al., 2015; Chernomor et al., 2016
RAxML       8.2.12    May 2018         Stamatakis, 2014
RAxML-NG    0.6.0     September 2018   Kozlov et al., 2018

Table 1: Software and related information.

3.3 Cluster configuration

All evaluation code was run on the GDU server, a cluster provided by the Research School of Biology at the Australian National University. The relevant hardware and software information is shown in Table 2.

System: GDU Server

Hardware
  CPU model          Intel(R) Xeon(R) CPU E5-2630 [email protected] GHz
  CPU architecture   Haswell
  CPU cores          96
  Memory size        378GB

Software
  Operating system   CentOS Linux release 6.6
  Compiler           GCC version 8.2.0
  MPI                Open MPI 2.0.2

Table 2: Hardware and software configuration of GDU server.

3.4 Data sets

The three programs may behave differently on phylogenetic datasets with different characteristics. Ten protein and ten DNA datasets were selected to evaluate the three programs as comprehensively as possible. Seventeen of these datasets were collected by Zhou et al. (2018) for their evaluation of phylogenetic software. The other three were collected by the developers of RAxML-NG (Kozlov et al., 2018) to evaluate the performance of RAxML-NG. The datasets have varying numbers of taxa and genes as well as different alignment lengths, and cover a range of organisms such as animals, plants and fungi (Table 3). During data collection, every alignment file was first converted to PHYLIP format so that it could be read by most phylogenetic software. Twelve of the datasets come with partition files and were analysed under a partition model. The partitioning schemes and substitution models were the same as in the original studies of these datasets.

Dataset   Data type   Taxa    Length       Distinct patterns   Partitions   Reference
SongD1    DNA         37      1,338,678    746,408             1            (Song et al., 2012)
MisoD2b   DNA         144     413,459      371,434             50           (Misof et al., 2014)
WickD3a   DNA         103     436,077      422,676             14           (Wickett et al., 2014)
WickD3b   DNA         103     290,718      277,375             8            (Wickett et al., 2014)
XiD4      DNA         46      239,763      165,781             1            (Xi et al., 2014)
PrumD6    DNA         200     394,684      236,674             75           (Prum et al., 2015)
TarvD7    DNA         36      21,410,970   8,520,738           1            (Tarver et al., 2016)
PeteD8    DNA         174     3,011,099    2,248,590           4,116        (Peters et al., 2017)
ShiD9     DNA         815     20,364       13,311              29           (Shi and Rabosky, 2015)
StamD10   DNA         436     1,371        1,011               1            (Stamatakis et al., 2010)
NagyA1    AA          60      172,073      156,312             594          (Nagy et al., 2014)
MisoA2    AA          144     413,459      406,963             479          (Misof et al., 2014)
WickA3    AA          103     145,359      144,342             11           (Wickett et al., 2014)
ChenA4    AA          58      1,806,035    1,547,914           1            (Chen et al., 2015)
StruA5    AA          100     189,193      178,600             1            (Struck et al., 2015)
BoroA6    AA          36      384,981      376,803             831          (Borowiec et al., 2015)
WhelA7    AA          70      59,725       58,419              210          (Whelan et al., 2015)
YangA8    AA          95      504,850      476,259             1,122        (Yang et al., 2015)
ShenA9    AA          96      609,899      583,199             1            (Shen et al., 2016)
GitzA12   AA          1,897   18,328       18,303              1            (Gitzendanner et al., 2018)

Table 3: Data sets with varying characteristics for evaluating. The letter before the

number represents the data type, A is the amino acid, and D is the DNA.

3.5 Evaluation method

Repeating each analysis reduces the influence of random starting trees on the results. Each program ran five independent tree inferences on each dataset, using five distinct random number seeds, so that fifteen phylogenetic trees were obtained per dataset. The tree with the highest likelihood score among these fifteen trees is called the best-known tree for that dataset.

By comparing the maximum likelihood score of the tree obtained from different

software with the maximum likelihood score of the best-known tree for each dataset,

we can see the performance of tree inference for different software. The specific

maximum likelihood score, runtime, memory usage, and parallel efficiency were also

used as comparison parameters.
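The comparison against the best-known tree can be sketched in a few lines of Python. The function names and the tolerance of 0.01 log-likelihood units are assumptions made for illustration; the report does not state the tolerance that was actually used.

```python
def best_known_score(scores):
    """Highest log-likelihood over all runs of all programs for one dataset."""
    return max(lnl for runs in scores.values() for lnl in runs)


def count_best_known(scores, tolerance=0.01):
    """Per program, count the runs whose score is within `tolerance` of the best."""
    best = best_known_score(scores)
    return {program: sum(1 for lnl in runs if best - lnl <= tolerance)
            for program, runs in scores.items()}


# Illustrative log-likelihoods only, not measured values.
example = {
    "IQ-TREE": [-100.0, -100.0, -101.5],
    "RAxML": [-103.0, -102.0],
    "RAxML-NG": [-100.004, -100.0, -110.0],
}
hits = count_best_known(example)
```

Because log-likelihoods are floating-point values computed by different programs, comparing them with a small tolerance rather than for exact equality avoids counting numerically equivalent trees as different results.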

Python was used to write the benchmark pipeline, which runs the tree inferences, calculates and summarizes run time and parallel efficiency, and records memory usage. The results were then visualized with ggplot() in R. The core command lines are shown in Table 4.

Mode         Software   Command line
Inference    IQ-TREE    iqtree -s <ALIGNMENT> -q <PARTITIONS> -seed <RSEED> -nt 16 -pre <OUTDIR>
Inference    RAxML      raxmlHPC-PTHREADS-AVX -m <MODEL> -s <ALIGNMENT> -q <PARTITIONS> -p <RSEED> -n <RUNNAME> -w <OUTDIR> -T 16
Inference    RAxML-NG   raxml-ng --search --msa <ALIGNMENT> --model <PARTITIONS> --seed <RSEED> --prefix <OUTDIR> --threads 16 --site-repeats on
Evaluation   IQ-TREE    iqtree -s <ALIGNMENT> -q <PARTITIONS> -te <ML_TREE> -nt 16 -pre <OUTDIR>
Time                    /usr/bin/time

Table 4: Command lines used for tree inference.
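As an illustration of how these command lines might be assembled and timed from Python, consider the sketch below. It is not taken from pipeline.py: the function names are hypothetical, the code targets Python 3, and the `-v` option of GNU time is an assumption (Table 4 only lists `/usr/bin/time`). The RAxML-NG long options are written with two dashes, as the raxml-ng command line expects.

```python
import subprocess

THREADS = 16  # thread count used in Table 4


def iqtree_inference(alignment, partitions, seed, prefix, threads=THREADS):
    return ["iqtree", "-s", alignment, "-q", partitions,
            "-seed", str(seed), "-nt", str(threads), "-pre", prefix]


def raxml_inference(model, alignment, partitions, seed, run_name, outdir,
                    threads=THREADS):
    return ["raxmlHPC-PTHREADS-AVX", "-m", model, "-s", alignment,
            "-q", partitions, "-p", str(seed), "-n", run_name,
            "-w", outdir, "-T", str(threads)]


def raxml_ng_inference(alignment, partitions, seed, prefix, threads=THREADS):
    return ["raxml-ng", "--search", "--msa", alignment, "--model", partitions,
            "--seed", str(seed), "--prefix", prefix,
            "--threads", str(threads), "--site-repeats", "on"]


def timed(command):
    # Assumption: GNU time in verbose mode, which reports CPU time,
    # wall-clock time and maximum resident set size on stderr.
    return ["/usr/bin/time", "-v"] + command


def run_timed(command):
    """Run the command under /usr/bin/time; return (exit code, stderr text)."""
    proc = subprocess.run(timed(command), stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    # stderr holds the program's own messages followed by the time report.
    return proc.returncode, proc.stderr
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces. The sketch only covers the partitioned case shown in Table 4.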

4. Results & Discussion

4.1 Maximum log-likelihood score

The maximum likelihood score is one of the criteria for judging an inferred phylogenetic tree. When comparing the maximum likelihood score of each tree, RAxML-NG obtained the best results, followed closely by IQ-TREE. For each data set, we counted how many tree searches of each program found the best-known tree, within a numerical tolerance (Table 5). RAxML-NG found the best-known tree for fifteen of the twenty data sets, and IQ-TREE for twelve. RAxML found the best-known tree for only six data sets, five of which were protein data sets. This suggests that RAxML may reach better likelihood scores more easily on protein alignments. IQ-TREE and RAxML-NG performed well on both protein and DNA data sets. For some data sets with a relatively small number of taxa (ChenA4, YangA8, ShenA9 and XiD4), most of them unpartitioned, all three programs found the best-known tree. Conversely, for some large and complex data sets (PeteD8, ShiD9, GitzA12, etc.), only RAxML-NG found the best-known tree. This may be because the SPR moves used by RAxML-NG are better suited than the NNI moves used by IQ-TREE to data sets with a very large number of taxa. It is worth noting that no program found the best-known tree for all data sets. In other words, none of the tree rearrangement strategies outperformed the others in all cases.

ML tree searches which found the best-known tree

Dataset   IQ-TREE   RAxML   RAxML-NG
SongD1    5         0       5
MisoD2b   0         0       5
WickD3a   0         0       5
WickD3b   5         0       3
XiD4      5         4       5
PrumD6    5         0       0
TarvD7    5         0       0
PeteD8    0         0       1
ShiD9     0         0       1
StamD10   0         0       2
NagyA1    5         0       5
MisoA2    5         0       0
WickA3    0         0       5
ChenA4    5         5       5
StruA5    4         0       5
BoroA6    0         5       0
WhelA7    5         5       0
YangA8    5         5       5
ShenA9    5         5       3
GitzA12   0         0       1

Number of datasets for which the best-known tree was found

          12        6       15

Table 5: The number of maximum log-likelihood tree inferences (out of 5) which

yield the best-known tree per dataset and inference software.

Figure 10 shows the difference between the likelihood score of each tree obtained by each program and the score of the best-known tree. Judging from these distributions, IQ-TREE was the most stable: it tended to obtain similar results across different random seeds. This may be because IQ-TREE uses multiple initial trees and maintains and updates a set of candidate trees throughout the search, which helps it to avoid local optima to some extent. RAxML-NG and RAxML sometimes showed large differences between runs. With less stable software, users may need to repeat the analysis several times to obtain a satisfactory result. RAxML-NG and RAxML would therefore benefit from greater stability and less dependence on the random starting tree.

Figure 10: Difference of maximum log-likelihood score to the score of best-known trees.

4.2 Running Time

RAxML-NG was the fastest program on seventeen of the twenty data sets (Figure 11, Table 6). Table 6 reports the speedup of RAxML-NG, that is, the average running time of each of the other two programs divided by the average running time of RAxML-NG. On these seventeen data sets, the speedup of RAxML-NG ranged from 1.02 (relative to IQ-TREE on WhelA7) to 6.87 (relative to IQ-TREE on MisoD2b). IQ-TREE was the fastest on only one data set (NagyA1), where all five of its searches also found the best-known tree (Table 5). RAxML was the fastest on StamD10 and GitzA12, but on these two data sets only RAxML-NG found the best-known tree. RAxML-NG implements an efficient phylogenetic likelihood kernel as well as efficient parallelization and load balancing techniques, which may explain its good running times. Although IQ-TREE took the longest time on most data sets, it still found better trees than RAxML.

RAxML-NG speedup (x) compared to

Dataset   IQ-TREE   RAxML
SongD1    5.83      2.00
MisoD2b   6.87      1.90
WickD3a   2.01      1.14
WickD3b   2.17      1.10
XiD4      3.93      1.09
PrumD6    2.38      1.33
TarvD7    4.23      1.71
PeteD8    1.42      1.44
ShiD9     6.38      1.36
StamD10   21.75     0.25
NagyA1    0.72      2.37
MisoA2    1.54      1.69
WickA3    1.64      1.14
ChenA4    1.84      1.15
StruA5    1.21      2.08
BoroA6    1.68      1.64
WhelA7    1.02      2.16
YangA8    1.13      1.51
ShenA9    1.36      1.23
GitzA12   2.59      0.93

Table 6: The ratio of average RAxML-NG wall-clock running time relative to

RAxML and IQ-TREE.
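The direction of the ratio in Table 6 can be made explicit with a short Python sketch. It assumes, in line with the column header "speedup", that each value is the mean wall-clock time of the other program divided by the mean wall-clock time of RAxML-NG, so that values above 1 mean RAxML-NG was faster. The function names and the timings in the usage line are illustrative only.

```python
def mean(values):
    return sum(values) / float(len(values))


def speedup(reference_times, other_times):
    """Mean time of the other program divided by mean time of the reference."""
    return mean(other_times) / mean(reference_times)


# Illustrative wall-clock times in hours, not measured values.
ratio = speedup(reference_times=[1.0, 1.0], other_times=[2.0, 4.0])
```

Under this reading, a value such as 0.25 means that RAxML-NG took four times as long as the program it is compared with.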

Figure 11: Wall-clock execution time in hours (16 threads).

4.3 Parallel efficiency

Compared with existing studies, one novel aspect of this study is that it evaluates the parallel efficiency of the different programs. Since all evaluated programs support multi-threading, measuring the actual parallel efficiency can help developers identify problems and improve their software. Sixteen threads were used in this experiment. Parallel efficiency was calculated as follows, where K is the number of threads, C is the CPU time and W is the wall-clock time:

Parallel efficiency = C ÷ (K × W)
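The formula can be applied directly to the report printed by GNU time. The Python sketch below assumes that `/usr/bin/time` was run with `-v` and that CPU time is the sum of user and system time; both are assumptions, since the report only names `/usr/bin/time`. The function names and the sample report are illustrative.

```python
import re


def parse_gnu_time(report):
    """Extract CPU seconds, wall-clock seconds and peak memory from `time -v`."""
    def field(label):
        match = re.search(re.escape(label) + r":\s*(\S+)", report)
        if match is None:
            raise ValueError("field not found: " + label)
        return match.group(1)

    cpu = (float(field("User time (seconds)")) +
           float(field("System time (seconds)")))
    wall = 0.0
    # Elapsed time is printed as h:mm:ss or m:ss.
    for part in field("Elapsed (wall clock) time (h:mm:ss or m:ss)").split(":"):
        wall = wall * 60 + float(part)
    max_rss_kb = int(field("Maximum resident set size (kbytes)"))
    return cpu, wall, max_rss_kb


def parallel_efficiency(cpu_seconds, wall_seconds, threads):
    """C / (K * W), as defined in the text."""
    return cpu_seconds / (threads * wall_seconds)


sample = """\
\tUser time (seconds): 150.00
\tSystem time (seconds): 10.00
\tElapsed (wall clock) time (h:mm:ss or m:ss): 0:10.00
\tMaximum resident set size (kbytes): 2048
"""
cpu, wall, rss = parse_gnu_time(sample)
efficiency = parallel_efficiency(cpu, wall, threads=16)
```

An efficiency of 1.0 means that all K threads were busy for the whole run; lower values indicate idle threads or serial sections.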

In terms of parallel efficiency (Figure 12), both RAxML-NG and RAxML implement excellent parallelization techniques: with multiple threads, their parallel efficiency approached 99.8%. On closer inspection, the parallel efficiency of RAxML was slightly better than that of RAxML-NG. In contrast, the parallel efficiency of IQ-TREE only reached about 75% on average. The current version of IQ-TREE does not yet balance the load as well. Put simply, when the number of partitions in a data set is not a multiple of the number of threads, IQ-TREE cannot balance the load well, which lengthens the running time and lowers the parallel efficiency. In fact, both IQ-TREE and RAxML parallelize computations over alignment sites. However, RAxML can divide partitions into smaller chunks and then distribute the chunks evenly over the K threads, whereas IQ-TREE does not. When a data set contains both long and short partitions, the load is poorly balanced, because threads spend more time waiting for the computations on long partitions to finish. This benchmark therefore identifies a concrete problem whose solution would further improve IQ-TREE.
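The effect of splitting partitions can be illustrated with a toy model in Python. It is not the scheduler of either program: it assumes that the work of a thread is proportional to the number of sites assigned to it, it uses a simple greedy rule to hand out whole partitions, and all function names are hypothetical.

```python
def load_efficiency(loads):
    """Total work divided by (threads x work of the busiest thread)."""
    return sum(loads) / float(len(loads) * max(loads))


def assign_whole_partitions(partition_sites, threads):
    """Greedy: give each partition, largest first, to the least loaded thread."""
    loads = [0] * threads
    for sites in sorted(partition_sites, reverse=True):
        loads[loads.index(min(loads))] += sites
    return loads


def split_into_chunks(partition_sites, threads):
    """Cut partitions freely so that every thread gets an equal share of sites."""
    base, extra = divmod(sum(partition_sites), threads)
    return [base + (1 if i < extra else 0) for i in range(threads)]


partitions = [1000, 100, 100, 100]   # one long and three short partitions
whole = assign_whole_partitions(partitions, threads=4)
chunked = split_into_chunks(partitions, threads=4)
```

In this example the whole-partition assignment leaves three threads waiting for the one that holds the long partition, giving an efficiency of 1300 / (4 × 1000) = 0.325, while the chunked assignment gives every thread 325 sites and an efficiency of 1.0.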

Figure 12: Parallel efficiency in percent when running with multiple threads (16 threads).

4.4 Memory usage

Another novel aspect of this study is that we also evaluated memory usage during tree inference. Figure 13 shows the maximum memory usage of the different programs. Even on the same data set, the memory used by different programs differed. The red line represents the theoretical minimum memory requirement. Since the implementation of each program and other non-computational operations also take up memory, the actual usage is higher than the theoretical value. The theoretical value in bytes is calculated as follows:

Memory = N × M × K × R × sizeof(double)

N is the number of sequences and M is the number of distinct site patterns. K is the number of character states: 4 nucleotides for DNA and 20 amino acids for proteins. R is the number of rate categories; in this experiment G4 (a discrete Gamma model with four rate categories) was used, so R = 4. The size of a double is generally eight bytes. This is the memory required to store the likelihoods on the tree according to the pruning algorithm (Felsenstein, 1981). It should account for the bulk of the RAM used, but a particular program may allocate more.
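As a worked example, the theoretical value can be computed for a data set from Table 3 with a few lines of Python. The function name is hypothetical; the default of four rate categories and eight bytes per double follows the text.

```python
def theoretical_memory_bytes(sequences, patterns, states,
                             rate_categories=4, bytes_per_double=8):
    """N x M x K x R x sizeof(double), as defined in the text."""
    return sequences * patterns * states * rate_categories * bytes_per_double


# SongD1 (DNA): 37 taxa, 746,408 distinct patterns, 4 states.
song_d1 = theoretical_memory_bytes(37, 746408, states=4)

# GitzA12 (amino acids): 1,897 taxa, 18,303 distinct patterns, 20 states.
gitz_a12 = theoretical_memory_bytes(1897, 18303, states=20)
```

For SongD1 this gives 3,534,988,288 bytes, roughly 3.5 GB. Dividing the result by 1024 three times converts it to the GiB scale for comparison with measured peak memory.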

In most cases the actual memory usage was only slightly higher than the theoretical value, which is reasonable and in line with expectations. However, for some site-rich data sets (TarvD7, PeteD8, etc.), all programs used noticeably more memory than the theoretical value to infer the phylogenetic tree.

Figure 13: Maximum memory usage in GB during the running.

5. Conclusion

The aim of this study was to design an automated pipeline to benchmark different phylogenetic software. The pipeline calls multiple phylogenetic programs to perform phylogenetic inference on the input data sets and compares the results in four respects: maximum likelihood score, run time, parallel efficiency and memory usage. Compared with previous benchmarks (Zhou et al., 2018; Kozlov et al., 2018), ours additionally addresses parallel efficiency and memory usage. Three popular maximum likelihood-based phylogenetic programs were selected for comparison. To test the pipeline, twenty genome-scale data sets with varying characteristics were collected. In terms of likelihood score, RAxML-NG performed best, followed by IQ-TREE, although IQ-TREE was more stable. RAxML-NG was also the fastest program on most data sets. Both RAxML and RAxML-NG implement good parallelization techniques, whereas our comparison revealed considerable room for improvement in the parallel efficiency of IQ-TREE. Moreover, memory usage was in most cases close to the theoretical value, but for some data sets with many sites all programs used significantly more memory than the theoretical value to infer the phylogenetic tree.

The pipeline can help users choose the software that works best for them. For software

developers, the benchmarking of ML-based phylogenetic software also helps them

discover the advantages and disadvantages of the software and how to improve it.

The way in which algorithms and techniques are implemented has a great impact on the performance of the software. Each program has excellent algorithmic ideas and innovative implementations. It is because developers constantly integrate new and better methods that the field of phylogenetics keeps making significant progress.

In the future, we plan to add more software to the pipeline, such as MEGA and PhyML. We also intend to collect more reliable data sets for evaluation. We hope to provide a web service so that users can easily upload data sets through a web page and obtain a summary of the results of the different programs.

6. References

Borowiec, M. L. et al. (2015). Extracting phylogenetic signal and accounting for bias in whole-genome data sets supports the Ctenophora as sister to remaining Metazoa. BMC Genomics, 16(1), 987.

Darwin, C. (1859). The origin of species by means of natural selection. (Murray,

London, 1859).

Chernomor, O. et al. (2016). Terrace aware data structure for phylogenomic inference

from supermatrices. Systematic Biology, 65(6), 997–1008.

Chen, M.-Y. et al. (2015). Selecting question-specific genes to reduce incongruence in

phylogenomics: A case study of jawed vertebrate backbone phylogeny. Systematic

Biology, 64(6), 1104–1120.

Chor B, Tuller T. (2005). Maximum Likelihood of Evolutionary Trees Is Hard. In:

Miyano S,Mesirov J, Kasif S, Istrail S, Pevzner PA,Waterman M, editors. Research in

Computational Molecular Biology: 9th Annual International Conference, RECOMB

2005, Cambridge, MA, USA; 2005 May 14–18, Proceedings. Berlin, Heidelberg

(Germany): Springer. p. 296–310.

Felsenstein J. (1981). A likelihood approach to character weighting and what it tells us about parsimony and compatibility. Biological Journal of the Linnean Society, 16(3): 183-196.

Fitch, Walter M., Toward defining the course of evolution: minimum change for a

specific tree topology. Systematic Zoology, 20: 406-416 (1971).

Foulds LR, Graham RL. (1982). The steiner tree problem in phylogeny is NP-complete.

Advances in Applied Mathematics. 3: 4-49.

Futuyma DJ. (1998). Evolutionary biology. Sunderland, MA: Sinauer Associates.

Gitzendanner, M. A. et al. (2018). Plastid phylogenomic analysis of green plants: A

billion years of evolutionary history. American Journal of Botany, 105(3), 291–301.

Guindon S., Gascuel O. (2003) A simple, fast, and accurate algorithm to estimate large

phylogenies by maximum likelihood, Systematic Biology, vol. 52 (pg. 696-704)

Jukes TH, Cantor CR. (1969). Evolution of protein molecules. In: Mammalian Protein Metabolism. New York: Academic Press.

Kimura M. (1980). A simple method for estimating evolutionary rates of base

substitutions through comparative studies of nucleotide sequences. Journal of

Molecular Evolution, 16(2): 111-120.

Kobert K, et al. (2017). Efficient detection of repeating sites to accelerate phylogenetic likelihood calculations. Systematic Biology, 66(2), 205-217.

Kozlov AM, Aberer AJ, Stamatakis A. (2015). ExaML version 3: a tool for

phylogenomic analyses on supercomputers. Bioinformatics, 31(15), 2577-2579.

Kozlov AM. (2018). RAxML-NG: A fast, scalable, and user-friendly tool for maximum

likelihood phylogenetic inference. bioRxiv 447110

Mau B, Newton M. (1997). Phylogenetic inference for binary data on dendrograms

using Markov chain Monte Carlo. J.Comput.Graph.Stat, 6, 122-131.

Maxam AM, Gilbert W. (1977). A new method for sequencing DNA. Proc Natl Acad

Sci U S A, 74(2):560–564.

Mayr, E. (2003). The growth of biological thought. Cambridge, Mass.: The Belknap

Press of Harvard Univ. Press.

Nagy, L. G. et al. (2014). Latent homology and convergent regulatory evolution

underlies the repeated emergence of yeasts. Nature communications, 5, 4471.

Minh B.Q., et al. (2005). IQPNNI: parallel reconstruction of large maximum likelihood

phylogenies, Bioinformatics, vol. 21, 3794-3796

Misof, B. et al. (2014). Phylogenomics resolves the timing and pattern of insect

evolution. Science, 346(6210), 763–767.

Nei M. (1987). Molecular evolutionary genetics. New York: Columbia University Press.

Nei M, Kumar S. (2000). Molecular evolution and phylogenetics. Oxford: Oxford

University Press.

Nguyen, L.-T. et al. (2015). IQ-TREE: A fast and effective stochastic algorithm for

estimating maximum-likelihood phylogenies. Molecular Biology and Evolution, 32(1),

268–274.

Robinson DF. (1971). Comparison of labeled trees with valency three. J Comb Theory.

B 11(2): 105–119.

Ronquist F., Huelsenbeck J. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed models, Bioinformatics, vol. 19 (pg. 1572-1574)

Peters, R. S. et al. (2017). Evolutionary history of the hymenoptera. Current Biology,

27(7), 1013-1018

Prum, R. O. et al. (2015). A comprehensive phylogeny of birds (Aves) using targeted

nextgeneration DNA sequencing. Nature, 526(7574), 569–573.

Rannala B, Yang Z. (1996). Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. Journal of Molecular Evolution, 43: 304-311.

Saitou N, Nei M. (1986). The number of nucleotides required to determine the branching order of three species, with special reference to the human-chimpanzee-gorilla divergence. Journal of Molecular Evolution, 24(1-2): 189-204.

Sanger F, Air GM, Barrell BG, Brown NL, Coulson AR, Fiddes CA, Hutchison CA,

Slocombe PM, Smith M. (1977). Nucleotide sequence of bacteriophage phi X174

DNA. Nature. 265(5596):687–695.

Shen, X.-X. et al. (2016). Reconstructing the backbone of the saccharomycotina yeast

phylogeny using genome-scale data. G3: Genes, Genomes, Genetics, pages g3–116.

Shi, J. J. and Rabosky, D. L. (2015). Speciation dynamics during the global radiation

of extant bats. Evolution, 69(6), 1528–1545.

Sober E. (1988). Reconstructing the Past: Parsimony Evolution and Inference. London:

Cambridge MIT Press.

Soltis ED, Soltis PS. (2000). Contributions of plant molecular systematics to studies of

molecular evolution. Plant Molecular Biology 42: 45-75.

Stamatakis A. (2014). RAxML version 8: a tool for phylogenetic analysis

and post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313.

Stamatakis A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic

analyses with thousands of taxa and mixed models. Bioinformatics, 22(21): 2688–2690.

Stamatakis A, Blagojevic F, Nikolopoulos DS, Antonopoulos CD. (2007). Exploring new search algorithms and hardware for phylogenetics: RAxML meets the IBM Cell. J VLSI Signal Process Syst Signal Image Video Technol. 48(3): 271–286. http://dx.doi.org/10.1007/s11265-007-0067-4

Stamatakis, A. et al. (2010). Maximum likelihood analyses of 3,490 rbcl sequences:

Scalability of comprehensive inference versus group-specific taxon sampling.

Evolutionary Bioinformatics, 6, EBO.S4528.

Struck, T. H. et al. (2015). The evolution of annelids reveals two adaptive routes to the

interstitial realm. Current Biology, 25(15), 1993–1999.

Song, S. et al. (2012). Resolving conflict in eutherian mammal phylogeny using

phylogenomics and the multispecies coalescent model. Proceedings of the National

Academy of Sciences, 109(37), 14942–14947.

Swofford DL, Olsen GJ, Waddell PJ, Hillis DM. (1996). Phylogenetic inference.

In: Hillis DM, Moritz C, Mable BK, editors. Molecular systematics. Sunderland (MA):

Sinauer Associates. p. 407–514.

Tarver, J. E. et al. (2016). The interrelationships of placental mammals and the limits

of phylogenetic inference. Genome Biology and Evolution, 8(2), 330–344.

Watson, J. and Crick, F. (1953). Molecular Structure of Nucleic Acids: A Structure for

Deoxyribose Nucleic Acid.

Whelan S, Morrison DA. (2017). Inferring trees. Methods Mol Biol.1525:349–377.

Whelan, N. V. et al. (2015). Error, signal, and the placement of Ctenophora sister to all

other animals. Proceedings of the National Academy of Sciences, 112(18), 5773–5778.

Wickett, N. J. et al. (2014). Phylotranscriptomic analysis of the origin and early

diversification of land plants. Proceedings of the National Academy of Sciences,

111(45), E4859–E4868.

Woese, C., Kandler, O. and Wheelis, M. (1990). Towards a natural system of organisms:

proposal for the domains Archaea, Bacteria, and Eucarya.

Xi, Z. et al. (2014). Coalescent versus concatenation methods and the placement of

amborella as sister to water lilies. Systematic biology, 63(6), 919–932.

Yang Z, Rannala B. (2012). Molecular phylogenetics: principles and practice.

Nat Rev Genet. 13(5): 303–314.

Yang, Y. et al. (2015). Dissecting molecular evolution in the highly diverse plant clade

caryophyllales using transcriptome sequencing. Molecular Biology and Evolution,

32(8), 2001–2014.

Zwickl D. (2006) Genetic algorithm approaches for the phylogenetic analysis of large

biological sequence datasets under the maximum likelihood criterion, TX University of

Texas at Austin PhD thesis

Zhou, X. et al. (2018). Evaluating fast maximum likelihood-based phylogenetic

programs using empirical phylogenomic data sets. Molecular biology and evolution,

35(2), 486-503

7. Appendix

7.1 Final project description

IQ-TREE is a widely used software for phylogenetic analysis from genome-scale data.

It has accumulated >1,300 citations for three novel and fast methods for model selection,

tree search algorithm and bootstrap approximation. This project was to test, verify,

benchmark and compare IQ-TREE with other existing software in phylogenetics,

including RAxML and RAxML-NG. The phylogenetic analysis was applied to a large

collection of publicly available datasets of DNA and amino acid alignments with

comprehensive metadata.

By the completion of this project, the student had a good understanding of the literature

review of phylogenetic analysis and showed good skills in developing and

benchmarking large-scale software. The student also wrote a report to communicate the

knowledge gained under the project.

7.2 Project contract

7.3 Artefacts

7.3.1 List of all program code files

The folder "projectcode" contains the scripts "pipeline.py" and "plotforproject.R" that were used for the twenty datasets, two csv tables containing the results, and a subfolder called "plot" containing all plots for the twenty datasets.

The folder "testcode" contains the scripts "test.py" and "plotfortest.R" that you can run on your own computer to test the pipeline.

The folder "example" contains two simple datasets that you can use as inputs for the test script.

Since the twenty datasets are complex, we provide two simple alignments and a test version of the pipeline. The two scripts "pipeline.py" and "test.py" differ in some parameters. If you want to run the pipeline on your own computer, we recommend using test.py and the example alignments we provide. Before testing, please make sure that all software is installed on your computer and that the versions match those used in this report.

All code was implemented by Qiuyue Wang.

7.3.2 Details of testing code

All code has been tested for correctness. The test code is included in test.py.

7.3.3 Experimental environment

See Table 2 in chapter 3.3: Hardware and software configuration of GDU server.

See Table 3 in chapter 3.4: Data sets with varying characteristics for evaluating.

Compilers and versions: Python 2.7, R 3.5.3.

7.4 README file

This project is to design an automated pipeline to evaluate several phylogenetic

software from different aspects.

Author: Qiuyue Wang(u6342378) Please contact [email protected] if you have

any questions.

FOLDER DESCRIPTION

-----------------

The folder "datasets" contains following 20 datasets.

https://cloudstor.aarnet.edu.au/plus/s/hdnxvQaSyr225pC

The folder "outputs" contains all output files of these 20 datasets.

https://cloudstor.aarnet.edu.au/plus/s/7ClfKmP82mKP42K

The above two folders are so large that I provide links here to download them.

The folder "projectcode" contains the scripts "pipeline.py" and "plotforproject.R" that were used for the twenty datasets, two csv tables containing the results, and a subfolder called "plot" containing all plots for the twenty datasets.

The folder "testcode" contains scripts "test.py" and "plotfortest.R" that you can run on

your own computer to test the pipeline.

The folder "example" contains two simple datasets that you can use as inputs of the

test script.


Hint: Because the twenty datasets are large and complex, we provide two simple alignments and a test version of the pipeline. The two scripts "pipeline.py" and "test.py" differ in some parameters. If you want to run the pipeline on your own computer, we recommend using test.py with the example alignments provided. Before testing, please make sure that all software is installed on your computer and that the installed versions are the same as those listed under SOFTWARE VERSIONS.

TEST OPERATION EXAMPLE

-----------------

1. Type "python test.py" on the command line and press Enter.

2. Enter the path of the folder that contains all datasets and press Enter.

e.g. "example"

(On Linux, you also need to type the quotation marks. Please enter the folder's name rather than the alignment's name, since our code traverses all subfolders under this folder.)

3. Enter the number of threads you want to use and press Enter.

e.g. 2

(We recommend a small number because the example alignments are simple; otherwise some software, such as RAxML-NG, may report an error.)

4. Enter how many trees you would like to obtain from each software package.

e.g. 5

(We recommend 5: too small a number may leave the violin plots with only points, while too large a number takes more time.)

5. You will then get all output files and two csv files. "lhscore.csv" contains each tree's log-likelihood score. "result.csv" contains the running time, parallel efficiency and memory usage of each tree inference.

6. If the rpy2 package is installed in your Python, R can be called directly to draw the violin plots; otherwise, open RStudio or another R tool and run "plotfortest.R" once the two csv tables have been produced. You will then get four violin plots.

Hint: RAxML only supports an absolute path for the output directory. You need to change it before use.
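Step 2 relies on the script visiting every subfolder of the folder you enter. The sketch below shows one way such a traversal can be written; it is not the code in test.py. The function name find_alignments and the list of file extensions are assumptions made for illustration, and the sketch is written for Python 3 rather than the Python 2.7 used in the project.

```python
import os

# Hypothetical list of alignment file extensions; the real script may differ.
ALIGNMENT_EXTENSIONS = (".phy", ".fasta", ".nex")


def find_alignments(root_folder):
    """Return the sorted paths of all alignment files under root_folder."""
    found = []
    # os.walk visits root_folder itself and every subfolder beneath it.
    for current_dir, _subdirs, filenames in os.walk(root_folder):
        for name in filenames:
            if name.lower().endswith(ALIGNMENT_EXTENSIONS):
                found.append(os.path.join(current_dir, name))
    return sorted(found)


if __name__ == "__main__":
    # "example" is the folder name suggested in step 2.
    for path in find_alignments("example"):
        print(path)
```

Passing the folder rather than a single alignment lets one call collect every dataset at once, which is why the instructions ask for the folder's name.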

SOFTWARE VERSIONS

-----------------

RAxML(8.2.12)

IQ-TREE(1.6.10)

RAxML-NG(0.6.0)

COMPARISON

-----------------

Maximum log-likelihood score

Running time

Parallel efficiency

Memory usage

INPUT DATA (in project)

----------

For historical reasons, directory/file names are slightly different from the dataset

names used in the paper.

Please see the mapping below:

Paper      Directory/file name
-------    -------------------
SongD1     dna_rokasD1
MisoD2b    dna_rokasD2b
WickD3a    dna_rokasD3a
WickD3b    dna_rokasD3b
XiD4       dna_rokasD4
PrumD6     dna_rokasD6
TarvD7     dna_rokasD7
PeteD8     dna_hymeALL
ShiD9      dna_ShiD9
StamD10    dna_StamD10
NagyA1     aa_rokasA1
MisoA2     aa_rokasA2
WickA3     aa_rokasA3
ChenA4     aa_rokasA4
StruA5     aa_rokasA5
BoroA6     aa_rokasA6
WhelA7     aa_rokasA7
YangA8     aa_rokasA8
ShenA9     aa_rokasA9
GitzA12    aa_GitzA12

INPUT DATA (test)

----------

Dataset       Source
----------    ----------------------------------------------
dna_WoroD1    https://github.com/roblanf/BenchmarkAlignments
aa_NguyA1     https://github.com/roblanf/BenchmarkAlignments


7.5 Maximum log-likelihood scores for all tree inferences

dataset software seed ML score Best ML score

aa_GitzA12 iqtree p1 -3403341.11 -3403037.59

aa_GitzA12 iqtree p2 -3403135.13 -3403037.59

aa_GitzA12 iqtree p3 -3403246.50 -3403037.59

aa_GitzA12 iqtree p4 -3403179.69 -3403037.59

aa_GitzA12 iqtree p5 -3403186.61 -3403037.59

aa_GitzA12 raxng p1 -3403037.59 -3403037.59

aa_GitzA12 raxng p2 -3403045.17 -3403037.59

aa_GitzA12 raxng p3 -3403072.51 -3403037.59

aa_GitzA12 raxng p4 -3403076.12 -3403037.59

aa_GitzA12 raxng p5 -3403228.34 -3403037.59

aa_GitzA12 raxml p1 -3403216.13 -3403037.59

aa_GitzA12 raxml p2 -3403307.63 -3403037.59

aa_GitzA12 raxml p3 -3403587.25 -3403037.59

aa_GitzA12 raxml p4 -3403203.77 -3403037.59

aa_GitzA12 raxml p5 -3403448.12 -3403037.59

aa_rokasA1 iqtree p1 -5017861.13 -5017860.86





aa_rokasA1 raxng p1 -5017860.86 -5017860.86

aa_rokasA1 raxng p2 -5017860.86 -5017860.86

aa_rokasA1 raxng p3 -5017860.86 -5017860.86

aa_rokasA1 raxng p4 -5017860.86 -5017860.86

aa_rokasA1 raxng p5 -5017860.86 -5017860.86

aa_rokasA1 raxml p1 -5017872.46 -5017860.86

aa_rokasA1 raxml p2 -5017872.46 -5017860.86

aa_rokasA1 raxml p3 -5017872.46 -5017860.86

aa_rokasA1 raxml p4 -5017872.46 -5017860.86

aa_rokasA1 raxml p5 -5017872.46 -5017860.86






aa_rokasA2 raxng p1 -30793748.05 -30793593.82

aa_rokasA2 raxng p2 -30793748.03 -30793593.82

aa_rokasA2 raxng p3 -30793748.04 -30793593.82

aa_rokasA2 raxng p4 -30793748.04 -30793593.82


aa_rokasA2 raxng p5 -30793757.03 -30793593.82

aa_rokasA2 raxml p1 -30793646.44 -30793593.82

aa_rokasA2 raxml p2 -30793646.44 -30793593.82

aa_rokasA2 raxml p3 -30793646.44 -30793593.82

aa_rokasA2 raxml p4 -30793646.44 -30793593.82

aa_rokasA2 raxml p5 -30793646.44 -30793593.82






aa_rokasA3 raxng p1 -8481068.18 -8481068.17

aa_rokasA3 raxng p2 -8481068.17 -8481068.17

aa_rokasA3 raxng p3 -8481068.17 -8481068.17

aa_rokasA3 raxng p4 -8481068.17 -8481068.17

aa_rokasA3 raxng p5 -8481068.17 -8481068.17

aa_rokasA3 raxml p1 -8481070.89 -8481068.17

aa_rokasA3 raxml p2 -8481070.89 -8481068.17

aa_rokasA3 raxml p3 -8481070.89 -8481068.17

aa_rokasA3 raxml p4 -8481070.89 -8481068.17

aa_rokasA3 raxml p5 -8481070.89 -8481068.17






aa_rokasA4 raxng p1 -40909392.97 -40909392.88

aa_rokasA4 raxng p2 -40909392.98 -40909392.88

aa_rokasA4 raxng p3 -40909392.88 -40909392.88

aa_rokasA4 raxng p4 -40909392.88 -40909392.88

aa_rokasA4 raxng p5 -40909392.88 -40909392.88

aa_rokasA4 raxml p1 -40909393.03 -40909392.88

aa_rokasA4 raxml p2 -40909393.03 -40909392.88

aa_rokasA4 raxml p3 -40909393.03 -40909392.88

aa_rokasA4 raxml p4 -40909393.03 -40909392.88

aa_rokasA4 raxml p5 -40909393.03 -40909392.88






aa_rokasA5 raxng p1 -5028495.38 -5028495.38

aa_rokasA5 raxng p2 -5028495.39 -5028495.38


aa_rokasA5 raxng p3 -5028495.38 -5028495.38

aa_rokasA5 raxng p4 -5028495.40 -5028495.38

aa_rokasA5 raxng p5 -5028495.50 -5028495.38

aa_rokasA5 raxml p1 -5028663.61 -5028495.38

aa_rokasA5 raxml p2 -5028689.75 -5028495.38

aa_rokasA5 raxml p3 -5028655.72 -5028495.38

aa_rokasA5 raxml p4 -5028683.45 -5028495.38

aa_rokasA5 raxml p5 -5028665.45 -5028495.38






aa_rokasA6 raxng p1 -15164453.22 -15164441.89

aa_rokasA6 raxng p2 -15164453.22 -15164441.89

aa_rokasA6 raxng p3 -15164453.06 -15164441.89

aa_rokasA6 raxng p4 -15164453.06 -15164441.89

aa_rokasA6 raxng p5 -15164453.22 -15164441.89

aa_rokasA6 raxml p1 -15164442.04 -15164441.89

aa_rokasA6 raxml p2 -15164442.04 -15164441.89

aa_rokasA6 raxml p3 -15164442.04 -15164441.89

aa_rokasA6 raxml p4 -15164442.04 -15164441.89

aa_rokasA6 raxml p5 -15164441.89 -15164441.89






aa_rokasA7 raxng p1 -2894961.62 -2894956.74

aa_rokasA7 raxng p2 -2894961.62 -2894956.74

aa_rokasA7 raxng p3 -2894961.62 -2894956.74

aa_rokasA7 raxng p4 -2894961.62 -2894956.74

aa_rokasA7 raxng p5 -2894961.62 -2894956.74

aa_rokasA7 raxml p1 -2894956.74 -2894956.74

aa_rokasA7 raxml p2 -2894956.74 -2894956.74

aa_rokasA7 raxml p3 -2894956.74 -2894956.74

aa_rokasA7 raxml p4 -2894956.74 -2894956.74

aa_rokasA7 raxml p5 -2894956.74 -2894956.74







aa_rokasA8 raxng p1 -20012134.65 -20012134.63

aa_rokasA8 raxng p2 -20012134.65 -20012134.63

aa_rokasA8 raxng p3 -20012134.65 -20012134.63

aa_rokasA8 raxng p4 -20012134.65 -20012134.63

aa_rokasA8 raxng p5 -20012134.65 -20012134.63

aa_rokasA8 raxml p1 -20012134.63 -20012134.63

aa_rokasA8 raxml p2 -20012134.63 -20012134.63

aa_rokasA8 raxml p3 -20012134.63 -20012134.63

aa_rokasA8 raxml p4 -20012134.63 -20012134.63

aa_rokasA8 raxml p5 -20012134.63 -20012134.63






aa_rokasA9 raxng p1 -53494590.10 -53493548.14

aa_rokasA9 raxng p2 -53493548.28 -53493548.14

aa_rokasA9 raxng p3 -53493548.25 -53493548.14

aa_rokasA9 raxng p4 -53493548.31 -53493548.14

aa_rokasA9 raxng p5 -53494590.03 -53493548.14

aa_rokasA9 raxml p1 -53493548.14 -53493548.14

aa_rokasA9 raxml p2 -53493548.14 -53493548.14

aa_rokasA9 raxml p3 -53493548.14 -53493548.14

aa_rokasA9 raxml p4 -53493548.14 -53493548.14

aa_rokasA9 raxml p5 -53493548.14 -53493548.14

dna_hymeALL iqtree p1 -74022353.87 -74019633.13





dna_hymeALL raxng p1 -74019818.05 -74019633.13





dna_hymeALL raxml p1 -74022094.82 -74019633.13





dna_rokasD1 iqtree p1 -12715376.04 -12715375.68






dna_rokasD1 raxng p1 -12715375.70 -12715375.68





dna_rokasD1 raxml p1 -12715378.39 -12715375.68





dna_rokasD2b iqtree p1 -13252627.81 -13230654.38





dna_rokasD2b raxng p1 -13230654.51 -13230654.38





dna_rokasD2b raxml p1 -13230836.81 -13230654.38





dna_rokasD3a iqtree p1 -18545579.18 -18545572.54





dna_rokasD3a raxng p1 -18545572.54 -18545572.54





dna_rokasD3a raxml p1 -18545730.62 -18545572.54























dna_ShiD9 iqtree p1 -584879.25 -584549.80

dna_ShiD9 iqtree p2 -584920.55 -584549.80

dna_ShiD9 iqtree p3 -584868.20 -584549.80

dna_ShiD9 iqtree p4 -584861.39 -584549.80

dna_ShiD9 iqtree p5 -584885.19 -584549.80

dna_ShiD9 raxng p1 -584549.80 -584549.80

dna_ShiD9 raxng p2 -584580.02 -584549.80

dna_ShiD9 raxng p3 -584565.51 -584549.80

dna_ShiD9 raxng p4 -585070.55 -584549.80

dna_ShiD9 raxng p5 -584568.11 -584549.80

dna_ShiD9 raxml p1 -584920.52 -584549.80

dna_ShiD9 raxml p2 -584925.20 -584549.80

dna_ShiD9 raxml p3 -584934.56 -584549.80

dna_ShiD9 raxml p4 -584952.16 -584549.80

dna_ShiD9 raxml p5 -584928.13 -584549.80

dna_StamD10 iqtree p1 -30629.44 -30623.28





dna_StamD10 raxng p1 -30627.57 -30623.28

dna_StamD10 raxng p2 -30623.28 -30623.28

dna_StamD10 raxng p3 -30627.01 -30623.28

dna_StamD10 raxng p4 -30623.73 -30623.28

dna_StamD10 raxng p5 -30627.61 -30623.28

dna_StamD10 raxml p1 -30624.78 -30623.28

dna_StamD10 raxml p2 -30636.85 -30623.28


dna_StamD10 raxml p3 -30633.61 -30623.28

dna_StamD10 raxml p4 -30625.94 -30623.28

dna_StamD10 raxml p5 -30631.30 -30623.28

Table 7: Log-likelihood scores of all inferred trees, extracted from the output files, alongside the log-likelihood score of the best tree found for each dataset, against which each score can be compared. For ease of display, all values are rounded to two decimal places.
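The "Best ML score" column can be recomputed from the individual scores, and the gap between each tree and the best tree follows by subtraction. The sketch below illustrates this on a few rows copied from Table 7. It is not the project's own code: the function names best_scores and score_gaps and the in-memory row layout are assumptions made for illustration, and it is written for Python 3.

```python
# A few (dataset, software, seed, ML score) rows copied from Table 7.
ROWS = [
    ("aa_GitzA12", "iqtree", "p2", -3403135.13),
    ("aa_GitzA12", "raxng", "p1", -3403037.59),
    ("aa_GitzA12", "raxml", "p4", -3403203.77),
    ("dna_StamD10", "iqtree", "p1", -30629.44),
    ("dna_StamD10", "raxng", "p2", -30623.28),
    ("dna_StamD10", "raxml", "p1", -30624.78),
]


def best_scores(rows):
    """Map each dataset to its highest (least negative) log-likelihood."""
    best = {}
    for dataset, _software, _seed, score in rows:
        if dataset not in best or score > best[dataset]:
            best[dataset] = score
    return best


def score_gaps(rows):
    """Return (dataset, software, seed, gap) tuples with gap = best - score."""
    best = best_scores(rows)
    return [
        (dataset, software, seed, round(best[dataset] - score, 2))
        for dataset, software, seed, score in rows
    ]


if __name__ == "__main__":
    for row in score_gaps(ROWS):
        print(row)
```

A gap of zero marks the tree that defines the best score for its dataset; larger gaps mean a worse tree under the same model.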

7.6 Runtime, memory usage and parallel efficiency for all tree inferences

dataset software seed time(h) Memory(GB) efficiency(%)

aa_GitzA12 iqtree p1 88.39 89.22 75.06

aa_GitzA12 iqtree p2 106.50 89.22 76.19

aa_GitzA12 iqtree p3 59.00 89.22 71.50

aa_GitzA12 iqtree p4 79.03 89.20 72.50

aa_GitzA12 iqtree p5 53.33 89.21 73.81

aa_GitzA12 raxng p1 29.12 43.32 99.88

aa_GitzA12 raxng p2 35.14 43.41 99.88

aa_GitzA12 raxng p3 27.49 44.46 99.88

aa_GitzA12 raxng p4 29.48 44.26 99.88

aa_GitzA12 raxng p5 27.73 44.21 99.63

aa_GitzA12 raxml p1 26.50 82.99 99.88

aa_GitzA12 raxml p2 27.66 82.99 99.88

aa_GitzA12 raxml p3 28.76 82.99 99.88

aa_GitzA12 raxml p4 29.24 83.00 99.88

aa_GitzA12 raxml p5 25.93 82.99 99.88

aa_rokasA1 iqtree p1 2.53 16.96 75.25





aa_rokasA1 raxng p1 2.93 36.68 99.88

aa_rokasA1 raxng p2 3.97 37.52 98.25

aa_rokasA1 raxng p3 3.08 36.47 99.63

aa_rokasA1 raxng p4 4.03 37.58 99.69

aa_rokasA1 raxng p5 4.19 37.15 99.88

aa_rokasA1 raxml p1 9.97 30.42 99.88

aa_rokasA1 raxml p2 8.14 30.42 99.81

aa_rokasA1 raxml p3 9.52 30.42 99.88

aa_rokasA1 raxml p4 8.08 30.42 99.88

aa_rokasA1 raxml p5 7.47 30.42 99.88







aa_rokasA2 raxng p1 24.12 114.00 99.88

aa_rokasA2 raxng p2 30.84 113.82 99.88

aa_rokasA2 raxng p3 39.99 114.68 99.75

aa_rokasA2 raxng p4 28.16 110.74 99.88

aa_rokasA2 raxng p5 29.27 110.19 99.88

aa_rokasA2 raxml p1 58.60 148.18 99.88

aa_rokasA2 raxml p2 61.15 148.18 99.88

aa_rokasA2 raxml p3 53.91 148.18 99.88

aa_rokasA2 raxml p4 43.85 148.18 99.88

aa_rokasA2 raxml p5 39.30 148.18 99.88






aa_rokasA3 raxng p1 7.66 23.75 99.88

aa_rokasA3 raxng p2 8.88 23.91 99.50

aa_rokasA3 raxng p3 8.90 22.34 99.88

aa_rokasA3 raxng p4 7.50 24.59 99.88

aa_rokasA3 raxng p5 6.60 22.59 99.81

aa_rokasA3 raxml p1 10.78 35.55 99.88

aa_rokasA3 raxml p2 8.34 35.55 99.88

aa_rokasA3 raxml p3 8.03 35.40 99.88

aa_rokasA3 raxml p4 8.29 35.54 99.88

aa_rokasA3 raxml p5 9.82 35.55 99.88






aa_rokasA4 raxng p1 21.77 121.88 99.19

aa_rokasA4 raxng p2 21.45 129.32 99.19

aa_rokasA4 raxng p3 16.91 125.51 99.13

aa_rokasA4 raxng p4 24.63 135.95 99.38

aa_rokasA4 raxng p5 33.56 125.48 99.25

aa_rokasA4 raxml p1 31.53 212.06 99.88

aa_rokasA4 raxml p2 27.86 212.04 99.88

aa_rokasA4 raxml p3 23.05 212.06 99.88

aa_rokasA4 raxml p4 24.21 212.06 99.88


aa_rokasA4 raxml p5 29.47 212.05 99.88






aa_rokasA5 raxng p1 9.75 22.35 99.81

aa_rokasA5 raxng p2 10.83 21.57 99.88

aa_rokasA5 raxng p3 8.97 21.36 99.81

aa_rokasA5 raxng p4 11.09 23.70 99.81

aa_rokasA5 raxng p5 9.28 22.18 99.75

aa_rokasA5 raxml p1 14.42 42.49 99.88

aa_rokasA5 raxml p2 26.27 42.49 99.88

aa_rokasA5 raxml p3 24.39 42.46 99.88

aa_rokasA5 raxml p4 18.31 42.49 99.88

aa_rokasA5 raxml p5 20.21 42.46 99.88






aa_rokasA6 raxng p1 4.09 56.29 98.63

aa_rokasA6 raxng p2 4.01 56.37 99.81

aa_rokasA6 raxng p3 4.19 55.04 99.81

aa_rokasA6 raxng p4 4.42 57.52 99.81

aa_rokasA6 raxng p5 3.80 55.68 99.63

aa_rokasA6 raxml p1 7.01 42.42 99.81

aa_rokasA6 raxml p2 6.68 42.42 99.81

aa_rokasA6 raxml p3 7.07 42.43 99.88

aa_rokasA6 raxml p4 6.41 42.42 99.81

aa_rokasA6 raxml p5 6.50 42.42 99.81






aa_rokasA7 raxng p1 1.85 15.88 97.94

aa_rokasA7 raxng p2 1.32 16.23 99.69

aa_rokasA7 raxng p3 1.80 15.86 97.94

aa_rokasA7 raxng p4 1.41 16.47 99.69

aa_rokasA7 raxng p5 1.41 15.76 99.81

aa_rokasA7 raxml p1 2.85 12.78 99.88

aa_rokasA7 raxml p2 3.16 10.55 99.81


aa_rokasA7 raxml p3 4.31 10.55 99.81

aa_rokasA7 raxml p4 3.20 10.55 99.88

aa_rokasA7 raxml p5 3.28 10.55 99.88






aa_rokasA8 raxng p1 16.41 124.19 99.88

aa_rokasA8 raxng p2 20.46 123.17 99.38

aa_rokasA8 raxng p3 18.78 121.59 99.88

aa_rokasA8 raxng p4 22.99 123.89 99.88

aa_rokasA8 raxng p5 21.69 121.74 99.88

aa_rokasA8 raxml p1 25.66 124.89 99.81

aa_rokasA8 raxml p2 34.99 124.87 99.88

aa_rokasA8 raxml p3 36.09 124.89 99.88

aa_rokasA8 raxml p4 29.44 124.89 99.88

aa_rokasA8 raxml p5 25.00 124.88 99.88






aa_rokasA9 raxng p1 21.13 81.18 99.81

aa_rokasA9 raxng p2 23.01 84.10 98.25

aa_rokasA9 raxng p3 26.42 88.91 99.75

aa_rokasA9 raxng p4 25.69 85.11 98.13

aa_rokasA9 raxng p5 22.29 91.97 99.75

aa_rokasA9 raxml p1 24.66 130.73 99.88

aa_rokasA9 raxml p2 35.60 130.73 99.00

aa_rokasA9 raxml p3 35.26 130.73 99.88

aa_rokasA9 raxml p4 25.84 130.72 99.88

aa_rokasA9 raxml p5 24.62 130.73 99.88

dna_hymeALL iqtree p1 78.03 234.09 50.19





dna_hymeALL raxng p1 42.22 219.61 97.19






dna_hymeALL raxml p1 89.82 207.43 99.88





dna_rokasD1 iqtree p1 2.54 18.45 70.69





dna_rokasD1 raxng p1 0.45 10.29 99.25





dna_rokasD1 raxml p1 0.82 13.47 99.31





dna_rokasD2b iqtree p1 24.71 33.88 31.31





dna_rokasD2b raxng p1 4.00 14.56 99.88





dna_rokasD2b raxml p1 8.15 26.33 99.88





dna_rokasD3a iqtree p1 2.60 27.31 80.75





dna_rokasD3a raxng p1 3.96 14.71 99.88






dna_rokasD3a raxml p1 4.00 21.31 99.88


































































dna_ShiD9 iqtree p1 10.01 1.96 94.44

dna_ShiD9 iqtree p2 9.98 1.96 94.50

dna_ShiD9 iqtree p3 13.90 1.96 95.19



dna_ShiD9 raxng p1 1.92 3.68 99.25

dna_ShiD9 raxng p2 1.49 3.77 99.88

dna_ShiD9 raxng p3 1.55 3.67 98.38

dna_ShiD9 raxng p4 1.78 3.76 99.38

dna_ShiD9 raxng p5 1.02 3.71 99.88

dna_ShiD9 raxml p1 1.46 5.73 99.81

dna_ShiD9 raxml p2 2.39 5.90 99.81

dna_ShiD9 raxml p3 2.32 5.91 99.81

dna_ShiD9 raxml p4 2.26 5.91 99.81

dna_ShiD9 raxml p5 2.09 5.92 99.81

dna_StamD10 iqtree p1 2.19 0.34 61.19






dna_StamD10 raxng p1 0.54 0.66 93.63

dna_StamD10 raxng p2 0.06 0.67 99.81

dna_StamD10 raxng p3 0.06 0.66 99.81

dna_StamD10 raxng p4 0.10 0.67 97.81

dna_StamD10 raxng p5 0.07 0.66 99.88

dna_StamD10 raxml p1 0.05 0.32 99.75

dna_StamD10 raxml p2 0.03 0.32 99.75

dna_StamD10 raxml p3 0.04 0.32 99.81

dna_StamD10 raxml p4 0.04 0.32 99.75

dna_StamD10 raxml p5 0.04 0.32 99.81

Table 8: Running time, memory usage and parallel efficiency of every tree inference, calculated from the output files. For ease of display, all values are rounded to two decimal places.
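To compare the software packages on a single dataset, the per-run values in Table 8 can be aggregated. The sketch below does this for the aa_GitzA12 rows, reporting the mean running time and the peak memory per package. It is only an illustration: the function name summarise and the tuple layout are assumptions, the choice of mean and maximum is one possible summary rather than the analysis used in this project, and the code is written for Python 3.

```python
from statistics import mean

# (software, time in hours, memory in GB) for the aa_GitzA12 rows of Table 8.
GITZA12_RUNS = [
    ("iqtree", 88.39, 89.22), ("iqtree", 106.50, 89.22), ("iqtree", 59.00, 89.22),
    ("iqtree", 79.03, 89.20), ("iqtree", 53.33, 89.21),
    ("raxng", 29.12, 43.32), ("raxng", 35.14, 43.41), ("raxng", 27.49, 44.46),
    ("raxng", 29.48, 44.26), ("raxng", 27.73, 44.21),
    ("raxml", 26.50, 82.99), ("raxml", 27.66, 82.99), ("raxml", 28.76, 82.99),
    ("raxml", 29.24, 83.00), ("raxml", 25.93, 82.99),
]


def summarise(runs):
    """Return {software: (mean hours, peak memory in GB)} over the given runs."""
    grouped = {}
    for software, hours, memory_gb in runs:
        grouped.setdefault(software, []).append((hours, memory_gb))
    return {
        software: (mean(h for h, _ in values), max(m for _, m in values))
        for software, values in grouped.items()
    }


if __name__ == "__main__":
    for software, (hours, memory_gb) in sorted(summarise(GITZA12_RUNS).items()):
        print(software, round(hours, 2), memory_gb)
```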
