homologizer: Phylogenetic phasing of gene copies into polyploid subgenomes

W. Freyman, M.G. Johnson, and C.J. Rothfels, Methods in Ecology and Evolution (2023).

Abstract

Organisms such as allopolyploids and F1 hybrids contain multiple distinct subgenomes, each potentially with its own evolutionary history. These organisms present a challenge for multilocus phylogenetic inference and other analyses because it is not apparent which gene copies from different loci belong to the same subgenome and thus share an evolutionary history. Here we introduce homologizer, a flexible Bayesian approach that uses a phylogenetic framework to infer the phasing of gene copies across loci into their respective subgenomes. Using simulation tests, we demonstrate that homologizer is robust to a wide range of factors, such as incomplete lineage sorting and the phylogenetic informativeness of loci. Furthermore, we establish the utility of homologizer on real data by analysing a multilocus dataset consisting of nine diploids and 19 tetraploids from the fern family Cystopteridaceae. Finally, we describe how homologizer may be used beyond its core phasing functionality to identify non-homologous sequences, such as hidden paralogs or contaminants.
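
To make the phasing problem concrete, the following minimal Python sketch enumerates every way the gene copies at each locus could be assigned to subgenome slots for a single polyploid sample. It is an illustration of the size of the search space only, not homologizer's inference procedure. The function names (enumerate_phasings, count_phasings), the locus names, and the copy labels are hypothetical, and the sketch assumes each locus has exactly one copy per subgenome.

```python
from itertools import permutations, product
from math import factorial


def enumerate_phasings(copies_per_locus):
    """Yield every joint assignment of gene copies to subgenome slots.

    copies_per_locus maps a locus name to its list of copy labels.
    In each yielded phasing, position i of a locus's tuple holds the
    copy assigned to subgenome slot i.
    """
    loci = list(copies_per_locus)
    # All orderings of the copies at each locus, considered independently.
    per_locus_orders = [permutations(copies_per_locus[locus]) for locus in loci]
    # A joint phasing picks one ordering per locus.
    for combo in product(*per_locus_orders):
        yield dict(zip(loci, combo))


def count_phasings(n_subgenomes, n_loci):
    """Number of joint phasings: (k!)^n for k subgenomes and n loci."""
    return factorial(n_subgenomes) ** n_loci


# Hypothetical tetraploid sample: two subgenomes, three loci, two copies each.
sample = {
    "locus1": ["copy_a", "copy_b"],
    "locus2": ["copy_a", "copy_b"],
    "locus3": ["copy_a", "copy_b"],
}
phasings = list(enumerate_phasings(sample))
print(len(phasings))  # 8
```

Even in this toy case the number of candidate phasings grows exponentially with the number of loci, and it grows further with each additional polyploid sample. Note that the count treats subgenome slots as labelled; if slots are interchangeable, phasings that differ only by relabelling the slots are equivalent.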