Provided by: mptp_0.2.5-1_amd64 

NAME
mptp — single-locus species delimitation
SYNOPSIS
Maximum-likelihood species delimitation:
mptp --ml (--single | --multi) --tree_file newickfile --output_file outputfile [options]
Species delimitation with support values:
mptp --mcmc positive integer (--single | --multi) (--mcmc_startnull | --mcmc_startrandom |
--mcmc_startml) --mcmc_log positive integer --tree_file newickfile --output_file outputfile
[options]
DESCRIPTION
Species is one of the fundamental units of comparison in virtually all subfields of biology, from
systematics to anatomy, development, ecology, evolution, genetics and molecular biology. The aim of mptp
is to offer an open source tool to infer species boundaries on a given phylogenetic tree based on the
Poisson Tree Process (PTP) and the Multiple Poisson Tree Process (mPTP) models.
mptp offers two methods for inferring species delimitation. First, a maximum-likelihood based method that
uses a dynamic programming approach to infer an ML estimate. Second, an MCMC approach for sampling the
space of possible delimitations, providing the user with support values on the tree clades. Both
approaches are available in two flavours: the PTP and the mPTP model. The PTP model is specified by using
the --single switch and the mPTP model by using the --multi switch.
Input
The input for mptp is a newick file that contains one phylogenetic tree whose branch lengths express the
expected number of substitutions per alignment site.
Options
mptp parses a large number of command-line options. For easier navigation, options are grouped below by
theme.
General options:
--help Display help text and exit.
--version
Output version information and exit.
--quiet Suppress all output to stdout except for warnings and fatal error messages.
--tree_file filename
Input newick file that contains a phylogenetic tree. Can be rooted or unrooted.
--output_file filename
Specifies the prefix used for generating output files. For maximum-likelihood species
delimitation two files will be created: filename.txt, which contains the actual
delimitation, and filename.svg, which contains an SVG figure of the computed delimitation.
For MCMC analyses, a file filename.txt is created that contains the newick tree with
support values.
--outgroup comma-separated list of taxa
All computations for species delimitation are carried out on rooted trees. This option is
used only (and is required) in case an unrooted tree was specified with the --tree_file
option. mptp roots the unrooted tree by splitting the branch leading to the most recent
common ancestor (MRCA) of the comma-separated list of taxa into two branches of equal
size and introducing a new node (the root of the new rooted tree) that connects these two
branches.
--outgroup_crop
Crops taxa specified with the --outgroup option from the tree.
--min_br real
Any branch lengths in the input tree smaller than or equal to real are excluded (ignored)
from the computations. In addition, for MCMC analyses, subtrees that exclusively consist
of branch lengths smaller than or equal to real are completely excluded from the proposals
(support values for those clades are set to 0). (default: 0.0001)
--precision positive integer
Specifies the precision of the decimal part of floating point numbers on output (default:
7)
--minbr_auto filename
Automatically detects the minimum branch length from the p-distances of the FASTA file
filename.
--tree_show
Show an ASCII version of the processed input tree (i.e. after it has been rooted at the
outgroup and, if requested, the outgroup has been cropped).
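The p-distance that --minbr_auto relies on is the proportion of aligned sites at which two sequences differ. The Python sketch below only illustrates that definition. The function name, the toy alignment and the choice to skip sites containing a gap or N are assumptions, and how mptp derives the minimum branch length from the distances is not shown here.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites with a gap or N in either sequence are skipped.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a not in "-N" and b not in "-N"]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)

# Hypothetical toy alignment; real input would be read from the FASTA file.
alignment = {"t1": "ACGTACGT", "t2": "ACGTACGA", "t3": "ACG-ACGT"}
for x, y in combinations(alignment, 2):
    print(x, y, p_distance(alignment[x], alignment[y]))
```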
Maximum-likelihood estimations:
Estimating the maximum-likelihood delimitation is triggered by the switch --ml combined with
either --single (the PTP model) or --multi (the mPTP model). Note that these two methods affect
how the option --output_file behaves and can be controlled using the --min_br switch. Both methods
require a rooted phylogenetic tree; however, an unrooted tree may be specified in conjunction with
the option --outgroup. In this case, mptp roots it at that outgroup (see General options,
--outgroup for more info). Note that both methods output an SVG depiction of the ML delimitation.
See Visualization for more information on adjusting and fine-tuning the SVG output.
Both methods ignore branch lengths smaller than or equal to the value specified using the
--min_br option. The PTP model then attempts to find a connected subgraph of the rooted tree that
(a) contains the root, and (b) maximizes the sum of likelihoods of fitting the edges of that
subgraph to one exponential distribution and the remaining edges to another exponential
distribution. By likelihood we mean the sum of the probability density function values, with the
rate of each distribution defined as the reciprocal of the average edge length in that distribution.
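The scoring of one candidate delimitation under the PTP model can be sketched as follows. This is a minimal illustration, not mptp's implementation: it assumes the score is a sum of log densities, and the function names and edge lengths are made up for the example.

```python
import math

def exp_loglik(edges):
    """Log-likelihood of edge lengths under an exponential distribution
    whose rate is the reciprocal of the average edge length."""
    rate = len(edges) / sum(edges)
    return sum(math.log(rate) - rate * x for x in edges)

def ptp_score(speciation_edges, coalescent_edges):
    """Sum of the two class log-likelihoods for one candidate delimitation."""
    return exp_loglik(speciation_edges) + exp_loglik(coalescent_edges)

# Hypothetical edge lengths: long between-species edges, short within-species edges.
speciation = [0.12, 0.09, 0.15]
coalescent = [0.002, 0.001, 0.003, 0.002]

two_classes = ptp_score(speciation, coalescent)
one_class = exp_loglik(speciation + coalescent)  # all edges in a single distribution
print(two_classes, one_class)
```

With these toy values, splitting the edges into two distributions scores higher than fitting all of them to one, which is the kind of contrast the search over subgraphs exploits.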
--ml --single
Triggers the algorithm for computing an ML estimate of the delimitation using the PTP
model.
--ml --multi
Triggers the algorithm for computing an ML estimate of the delimitation using the mPTP
model.
--pvalue real
Only used with the PTP model (specified with --single). Sets the p-value for performing a
likelihood ratio test. Note that there is no likelihood ratio test for the mPTP model, so
this test is not done with --multi. (default: 0.001)
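A likelihood ratio test compares the log-likelihood of a null model with that of the fitted delimitation. The sketch below is a generic version of such a test, not mptp's code. It assumes the null model places all edges in a single distribution and that the statistic follows a chi-square distribution with 1 degree of freedom; both are assumptions, as are the function names and the example log-likelihoods.

```python
import math

def lrt_pvalue(null_logl, alt_logl):
    """p-value of a likelihood ratio test, assuming 1 degree of freedom."""
    statistic = 2.0 * (alt_logl - null_logl)
    if statistic <= 0.0:
        return 1.0
    # Survival function of the chi-square distribution with 1 d.o.f.
    return math.erfc(math.sqrt(statistic / 2.0))

def delimitation_is_significant(null_logl, alt_logl, pvalue=0.001):
    """True if the test p-value falls below the --pvalue style threshold."""
    return lrt_pvalue(null_logl, alt_logl) < pvalue

# Hypothetical log-likelihoods for the null model and the ML delimitation.
print(lrt_pvalue(13.62, 24.22))
print(delimitation_is_significant(13.62, 24.22))
```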
MCMC method:
The MCMC method is triggered with the --mcmc switch combined with either --single (the PTP model)
or --multi (the mPTP model).
--mcmc positive integer --single
Triggers the algorithm for computing support values by taking the specified number of
MCMC samples (delimitations) using the PTP model.
--mcmc positive integer --multi
Triggers the algorithm for computing support values by taking the specified number of
MCMC samples (delimitations) using the mPTP model.
--mcmc_sample positive integer
Sample only every n-th MCMC step.
--mcmc_log
Log the scores (log-likelihood) for each MCMC sample in a file and create an SVG plot.
--mcmc_burnin positive integer
Ignore all MCMC samples generated before the specified step. (default: 1)
--mcmc_runs positive integer
Perform multiple MCMC runs. If more than 1 run is specified, mptp will generate one seed
for each run based on the seed provided with the --seed switch. Output files will be
generated for each run. (default: 1)
--mcmc_credible real
Specify the probability (0.0 to 1.0) for which to generate the credible interval i.e.,
the probability the true number of species will fall within the credible interval given
the observed data. (default: 0.95)
--mcmc_startnull
Start MCMC sampling from the null-model.
--mcmc_startrandom
Start MCMC sampling from a random delimitation.
--mcmc_startml
Start MCMC sampling from the ML delimitation.
--seed positive integer
Specifies the seed for the pseudo-random number generator. (default: randomly generated
based on system time)
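The effect of --mcmc_sample, --mcmc_burnin and --mcmc_credible on a chain can be sketched in Python as below. The step numbering (1-based, thinning on the absolute step number) and the equal-tailed interval are assumptions made for illustration; mptp's exact conventions may differ, and the function names are hypothetical.

```python
def retained_steps(total_steps, sample_every=1, burnin=1):
    """Steps kept after discarding steps before `burnin` and thinning."""
    return [step for step in range(1, total_steps + 1)
            if step >= burnin and step % sample_every == 0]

def credible_interval(species_counts, probability=0.95):
    """Central interval of the sampled species counts, obtained by
    trimming an equal number of samples from both tails."""
    ordered = sorted(species_counts)
    n = len(ordered)
    tail = int(n * (1.0 - probability) / 2.0)  # samples trimmed per side
    return ordered[tail], ordered[n - 1 - tail]

print(retained_steps(total_steps=20, sample_every=5, burnin=8))  # [10, 15, 20]

# Hypothetical number of species in each of 40 sampled delimitations.
counts = [3] * 1 + [4] * 10 + [5] * 20 + [6] * 8 + [9] * 1
print(credible_interval(counts, probability=0.95))  # (4, 6)
```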
SVG Output:
The ML method generates one SVG file that visualizes the processed input tree (i.e. after it has
been rooted at the outgroup and, if requested, the outgroup has been cropped) and marks the
subtrees corresponding to coalescent processes (the detected species groups) with red color, while
the speciation process is colored green.
The MCMC method generates one SVG file per run visualizing the processed tree, and indicates the
support value for each node, i.e., the fraction of MCMC samples (delimitations) in which the
particular node was part of the speciation process. A value of 1 means it was always in the
speciation process while a value of 0 means it was always in a coalescent process. The tree
branches are colored according to the support values of descendant nodes; a support value of 0
is colored red, 1 black, and values in between are gradients of the two colors. Only
support values above 0.5 are shown to avoid packed numbers in dense branching events. In addition,
if --mcmc_log is specified, an additional SVG image with a plot of the log-likelihood of each
sampled delimitation is created.
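The support values and branch colors described above can be sketched as follows. The linear interpolation between red and black, the two-decimal label and the function names are assumptions for illustration and not taken from mptp's SVG code.

```python
def support_value(in_speciation_flags):
    """Fraction of samples in which the node was in the speciation process."""
    return sum(in_speciation_flags) / len(in_speciation_flags)

def branch_color(support):
    """Red-to-black gradient: 0 maps to red, 1 to black (assumed linear)."""
    red = round(255 * (1.0 - support))
    return "#{:02x}0000".format(red)

def label(support, threshold=0.5):
    """Support label, or an empty string at or below the display threshold."""
    return "{:.2f}".format(support) if support > threshold else ""

# Hypothetical node: part of the speciation process in 3 of 4 samples.
s = support_value([True, True, False, True])
print(s, branch_color(s), label(s))
```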
--svg_width positive integer
Sets the total width (including margins) of the SVG in pixels. (default: 1920)
--svg_fontsize positive integer
Size of font in SVG image. (default: 12)
--svg_tipspacing positive integer
Vertical space in pixels between taxa in SVG tree. (default: 20)
--svg_legend_ratio real
Ratio (value between 0.0 and 1.0) of total tree length to be displayed as legend line.
(default: 0.1)
--svg_nolegend
Hide legend.
--svg_marginleft positive integer
Left margin in pixels. (default: 20)
--svg_marginright positive integer
Right margin in pixels. (default: 20)
--svg_margintop positive integer
Top margin in pixels. (default: 20)
--svg_marginbottom positive integer
Bottom margin in pixels. (default: 20)
--svg_inner_radius positive integer
Radius of inner nodes in pixels. (default: 0)
EXAMPLES
Compute the maximum likelihood estimate using the mPTP model by discarding all branches with length
smaller than or equal to 0.0001:
mptp --ml --multi --min_br 0.0001 --tree_file newick.txt --output_file out
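Before choosing a threshold, it can help to see which branches a given --min_br value would exclude. The Python sketch below does this with a simple regular expression over a Newick string. The function name, the regex-based parsing and the toy tree are assumptions for illustration; mptp's own parser is not reproduced here.

```python
import re

# Matches ":<number>" in a Newick string, including scientific notation.
BRANCH_RE = re.compile(r":([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)")

def split_by_min_br(newick, min_br=0.0001):
    """Return (kept, ignored) branch lengths; ignored ones are <= min_br."""
    lengths = [float(x) for x in BRANCH_RE.findall(newick)]
    kept = [b for b in lengths if b > min_br]
    ignored = [b for b in lengths if b <= min_br]
    return kept, ignored

# Hypothetical four-taxon tree.
tree = "((A:0.00005,B:0.0001):0.02,(C:0.003,D:1e-5):0.04);"
kept, ignored = split_by_min_br(tree)
print("kept:", kept)
print("ignored:", ignored)
```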
Run an MCMC analysis of 100 million steps with the mPTP model that logs every millionth step,
ignores the first 2 million steps and discards all branches with lengths smaller than or equal to 0.0001.
Use 777 as seed. The chain will start from the ML delimitation (default).
mptp --mcmc 100000000 --multi --min_br 0.0001 --tree_file newick.txt --output_file out --mcmc_log
1000000 --mcmc_burnin 2000000 --seed 777
Perform an MCMC analysis of 5 runs, each of 100 million steps with the mPTP model, log every millionth
step, ignore the first 2 million steps, and detect the minimum branch length by specifying the FASTA
file alignment.fa that contains the alignment. Use 777 as seed. Start each run from a random
delimitation.
mptp --mcmc 100000000 --multi --mcmc_runs 5 --mcmc_log 1000000 --minbr_auto alignment.fa
--tree_file newick.txt --output_file out --mcmc_burnin 2000000 --seed 777 --mcmc_startrandom
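Invocations like the ones in this section can also be assembled from a script. The following Python sketch builds the argument list for an ML run using only options documented on this page. The helper name is hypothetical, and the command is executed only if an mptp binary is found on PATH.

```python
import shutil
import subprocess

def mptp_ml_command(tree_file, output_prefix, model="multi", min_br=None):
    """Build the argument list for a maximum-likelihood delimitation run."""
    if model not in ("single", "multi"):
        raise ValueError("model must be 'single' or 'multi'")
    args = ["mptp", "--ml", "--" + model,
            "--tree_file", tree_file, "--output_file", output_prefix]
    if min_br is not None:
        args += ["--min_br", str(min_br)]
    return args

cmd = mptp_ml_command("newick.txt", "out", min_br=0.0001)
print(" ".join(cmd))

# Run only when the mptp binary is available; a failed run does not raise.
if shutil.which("mptp"):
    subprocess.run(cmd, check=False)
```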
AUTHORS
Implementation by Tomas Flouri, Sarah Lutteropp and Paschalia Kapli. Additional PTP and mPTP model
authors include Kassian Kobert, Jiajie Zhang, Pavlos Pavlidis, and Alexandros Stamatakis.
REPORTING BUGS
Submit suggestions and bug-reports at <https://github.com/Pas-Kapli/mptp/issues>, or e-mail Tomas Flouri
<Tomas.Flouri@h-its.org>.
AVAILABILITY
Source code and binaries are available at <https://github.com/Pas-Kapli/mptp>.
COPYRIGHT
Copyright (C) 2015-2017, Tomas Flouri, Sarah Lutteropp, Paschalia Kapli
All rights reserved.
Contact: Tomas Flouri <Tomas.Flouri@h-its.org>, Scientific Computing, Heidelberg Institute for
Theoretical Studies, 69118 Heidelberg, Germany
This software is licensed under the terms of the GNU Affero General Public License version 3.
GNU Affero General Public License version 3
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero
General Public License as published by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General
Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If
not, see <http://www.gnu.org/licenses/>.
VERSION HISTORY
New features and important modifications of mptp (short lived or minor bug releases may not be
mentioned):
v0.1.0 released June 27th, 2016
First public release.
v0.1.1 released July 15th, 2016
Bug fix (now LRT test is not printed in output file when using --multi)
v0.2.0 released September 27th, 2016
Fixed floating point exception error when constructing random trees, caused by dividing
by zero. Changed allocation from malloc to calloc, as it caused uninitialized variables
when converting unrooted trees to rooted when using the MCMC method. Fixed sample size for
the AIC with a correction for finite sample sizes.
v0.2.1 released October 18th, 2016
Updated ASV to consider only coalescent roots of ML delimitation. Removed assertion
stopping mptp when using random starting delimitations for the MCMC method.
v0.2.2 released January 31st, 2017
Fixed regular expressions to allow scientific notation for branch lengths when parsing
trees. Improved the accuracy of ASV score by also taking into account tips forming
coalescent roots. Fixed memory leaks that occur when parsing incorrectly formatted trees.
v0.2.3 released July 25th, 2017
Replaced hsearch() with custom hashtable. Fixed minor output error messages.
v0.2.4 released May 14th, 2018
If we do not manage to generate a random starting delimitation with the wanted number of
species (randomly chosen), we use the currently generated delimitation instead.
v0.2.5 released Sep 9th, 2023
Added likelihood ratio test for the multi method. Added implementation for the incomplete
gamma function, and removed dependency for GNU scientific library.
mptp 0.2.5 Sep 11, 2023 mptp(1)