Abstract
Bayesian phylogenetic inference is powerful but computationally intensive. Researchers may find themselves with two phylogenetic posteriors on overlapping data sets and may wish to approximate a combined result without having to re-run potentially expensive Markov chains on the combined data set. This raises the question: given overlapping subsets of a set of taxa (e.g., species or virus samples), and given posterior distributions on phylogenetic tree topologies for each of these taxon sets, how can we optimize a probability distribution on phylogenetic tree topologies for the entire taxon set? In this paper we develop a variational approach to this problem and demonstrate its effectiveness. Specifically, we develop an algorithm to find a suitable support of the variational tree topology distribution on the entire taxon set, as well as a gradient-descent algorithm to minimize the divergence from the restrictions of the variational distribution to each of the given per-subset probability distributions, in an effort to approximate the posterior distribution on the entire taxon set.
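To make the notion of "restricting" a topology distribution to a taxon subset concrete, the following Python sketch prunes a rooted topology down to a subset of taxa, pushes a distribution on full-taxon-set topologies forward to that subset, and compares the result with a per-subset distribution using a Kullback-Leibler divergence. It is only an illustration under stated assumptions: the nested-frozenset tree encoding, the function names (`restrict`, `restrict_distribution`, `kl_divergence`), the toy probabilities, and the direction of the divergence are all hypothetical choices, not the paper's implementation, and the support-finding and gradient-descent steps are not shown.

```python
import math
from collections import defaultdict

# Assumed encoding: a leaf is a taxon name (str); an internal node is a
# frozenset of its children, so child order does not matter.

def restrict(tree, taxa):
    """Prune leaves outside `taxa` and collapse nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in taxa else None
    kept = [r for r in (restrict(child, taxa) for child in tree) if r is not None]
    if not kept:
        return None          # no taxon of interest below this node
    if len(kept) == 1:
        return kept[0]       # suppress the unary node
    return frozenset(kept)

def restrict_distribution(q, taxa):
    """Sum the probability of all full topologies sharing a restriction."""
    restricted = defaultdict(float)
    for tree, prob in q.items():
        restricted[restrict(tree, taxa)] += prob
    return dict(restricted)

def kl_divergence(p, q):
    """KL(p || q) over topologies; infinite if q misses part of p's support."""
    total = 0.0
    for tree, p_t in p.items():
        if p_t == 0.0:
            continue
        q_t = q.get(tree, 0.0)
        if q_t == 0.0:
            return math.inf
        total += p_t * math.log(p_t / q_t)
    return total

# Toy variational distribution on topologies for taxa {A, B, C, D}.
t1 = frozenset({frozenset({"A", "B"}), frozenset({"C", "D"})})   # ((A,B),(C,D))
t2 = frozenset({frozenset({frozenset({"A", "B"}), "C"}), "D"})   # (((A,B),C),D)
t3 = frozenset({frozenset({frozenset({"A", "C"}), "B"}), "D"})   # (((A,C),B),D)
q_full = {t1: 0.5, t2: 0.3, t3: 0.2}

# Topologies on the subset {A, B, C}.
ab_c = frozenset({frozenset({"A", "B"}), "C"})                   # ((A,B),C)
ac_b = frozenset({frozenset({"A", "C"}), "B"})                   # ((A,C),B)

# t1 and t2 both restrict to ((A,B),C), so their probabilities add up.
q_abc = restrict_distribution(q_full, {"A", "B", "C"})

# Hypothetical per-subset posterior to compare against.
p_abc = {ab_c: 0.5, ac_b: 0.5}
divergence = kl_divergence(p_abc, q_abc)
```

In a setting like the one the abstract describes, a quantity of this kind would be computed for each taxon subset and combined into an objective over the parameters of the full-taxon-set distribution; how the paper parameterizes and optimizes that objective is not specified here.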
Funder
National Science Foundation
National Institutes of Health
Office of Research Infrastructure Programs, National Institutes of Health
Publisher
Springer Science and Business Media LLC