Journal articles on the topic 'Phylogenomic trees'

Consult the top 50 journal articles for your research on the topic 'Phylogenomic trees.'

1

Lee, Michael D. "Applications and Considerations of GToTree: A User-Friendly Workflow for Phylogenomics." Evolutionary Bioinformatics 15 (January 2019): 117693431986224. http://dx.doi.org/10.1177/1176934319862245.

Abstract:
Phylogenomics is the practice of attempting to infer evolutionary relationships at a genome-level. This is becoming a standard step in the characterization of newly recovered genomes and to direct/constrain further research; yet the process from start to finish of building a de novo phylogenomic tree that is specific to the organisms of interest can still be computationally intractable for many biologists. GToTree is a recently published user-friendly workflow for phylogenomics intended to give more researchers the capability to generate phylogenomic trees to help guide their work. This commentary describes two common applications where GToTree can be helpful and then discusses some things to consider when using the program.
2

Ay, Hilal, Hayrettin Saygin, and Nevzat Sahin. "Phylogenomic revision of the family Streptosporangiaceae, reclassification of Desertactinospora gelatinilytica as Spongiactinospora gelatinilytica comb. nov. and a taxonomic home for the genus Sinosporangium in the family Streptosporangiaceae." International Journal of Systematic and Evolutionary Microbiology 70, no. 4 (April 1, 2020): 2569–79. http://dx.doi.org/10.1099/ijsem.0.004073.

Abstract:
In recent years, the results of genome-based phylogenetic analyses have contributed to microbial systematics by increasing the availability of sequenced microbial genomes. Therefore, phylogenomic analysis within large taxa in the phylum Actinobacteria has appeared as a useful tool to clarify the taxonomic positions of ambiguous groups. In this study, we provide a revision of the actinobacterial family Streptosporangiaceae using a large collection of genome data and phylogenomics approaches. The phylogenomic analyses included the publicly available genome data of the members of the family Streptosporangiaceae and the state-of-the-art tools are used to infer the taxonomic affiliation of these species within the family. By comparing genome-based and 16S rRNA gene-based trees, as well as pairwise genome comparisons, the recently described genera Spongiactinospora and Desertactinospora are combined in the genus Spongiactinospora. In conclusion, a comprehensive phylogenomic revision of the family Streptosporangiaceae is proposed.
3

Cheon, Seongmin, Jianzhi Zhang, and Chungoo Park. "Is Phylotranscriptomics as Reliable as Phylogenomics?" Molecular Biology and Evolution 37, no. 12 (July 13, 2020): 3672–83. http://dx.doi.org/10.1093/molbev/msaa181.

Abstract:
Phylogenomics, the study of phylogenetic relationships among taxa based on their genome sequences, has emerged as the preferred phylogenetic method because of the wealth of phylogenetic information contained in genome sequences. Genome sequencing, however, can be prohibitively expensive, especially for taxa with huge genomes and when many taxa need sequencing. Consequently, the less costly phylotranscriptomics has seen an increased use in recent years. Phylotranscriptomics reconstructs phylogenies using DNA sequences derived from transcriptomes, which are often orders of magnitude smaller than genomes. However, in the absence of corresponding genome sequences, comparative analyses of transcriptomes can be challenging and it is unclear whether phylotranscriptomics is as reliable as phylogenomics. Here, we respectively compare the phylogenomic and phylotranscriptomic trees of 22 mammals and 15 plants that have both sequenced nuclear genomes and publicly available RNA sequencing data from multiple tissues. We found that phylotranscriptomic analysis can be sensitive to orthologous gene identification. When a rigorous method for identifying orthologs is employed, phylogenomic and phylotranscriptomic trees are virtually identical to each other, regardless of the tissue of origin of the transcriptomes and whether the same tissue is used across species. These findings validate phylotranscriptomics, brighten its prospect, and illustrate the criticality of reliable ortholog detection in such practices.
4

Galtier, Nicolas, and Vincent Daubin. "Dealing with incongruence in phylogenomic analyses." Philosophical Transactions of the Royal Society B: Biological Sciences 363, no. 1512 (October 7, 2008): 4023–29. http://dx.doi.org/10.1098/rstb.2008.0144.

Abstract:
Incongruence between gene trees is the main challenge faced by phylogeneticists in the genomic era. Incongruence can occur for artefactual reasons, when we fail to recover the correct gene trees, or for biological reasons, when true gene trees are actually distinct from each other, and from the species tree. Horizontal gene transfers (HGTs) between genomes are an important process of bacterial evolution resulting in a substantial amount of phylogenetic conflicts between gene trees. We argue that the (bacterial) species tree is still a meaningful scientific concept even in the case of HGTs, and that reconstructing it is still a valid goal. We tentatively assess the amount of phylogenetic incongruence caused by HGTs in bacteria by comparing bacterial datasets to a metazoan dataset in which transfers are presumably very scarce or absent. We review existing phylogenomic methods and their ability to return to the user, both the vertical (speciation/extinction history) and horizontal (gene transfers) phylogenetic signals.
5

Shah, Toral, Fandey H. Mashimba, Haji O. Suleiman, Yahya S. Mbailwa, Julio V. Schneider, Georg Zizka, Vincent Savolainen, Isabel Larridon, and Iain Darbyshire. "Phylogenetics of Ochna (Ochnaceae) and a new infrageneric classification." Botanical Journal of the Linnean Society 198, no. 4 (December 3, 2021): 361–81. http://dx.doi.org/10.1093/botlinnean/boab071.

Abstract:
Advances in high-throughput DNA sequencing are allowing faster and more affordable generation of molecular phylogenetic trees for many organisms. However, resolving relationships at species level is still challenging, particularly for taxonomically difficult groups. Until recently, the classification of Ochna had been based only on morphological data. Here, we present the first comprehensive phylogenomic study for the genus using targeted sequencing with a custom probe kit. We sampled c. 85% of species to evaluate the current infrageneric classification. Our results show that the data generated using the custom probe kit are effective in resolving relationships in the genus, revealing three sections consistent with the current classification and a new section consisting of species from Madagascar and the Mascarene Islands. Our results provide the first insights into the evolutionary relationships of several widespread and morphologically diverse species and numerous poorly known and potentially new species to science. We demonstrate that for morphologically challenging groups such as Ochna, an integrated approach to classification is essential. Phylogenomic results are only informative when derived from accurately named samples. There is a symbiotic relationship between molecular phylogenomics and morphology-based taxonomy, with taxonomic expertise a requirement to accurately interpret the phylogenomic results.
6

Vinh, Lê Sỹ. "Phylogenetic and Phylogenomic Analyses for Large Datasets." Journal of Research and Development on Information and Communication Technology 2019, no. 2 (December 31, 2019): 84–92. http://dx.doi.org/10.32913/mic-ict-research.v2019.n2.898.

Abstract:
The phylogenetic tree is a main tool to study the evolutionary relationships among species. Computational methods for building phylogenetic trees from gene/protein sequences have been developed for decades and have come of age. Efficient approaches, including distance-based methods, maximum likelihood methods, or classical maximum parsimony methods, are now able to analyze datasets with thousands of sequences. The advanced sequencing technologies have resulted in a huge amount of data including whole genomes. A number of methods have been proposed to analyze whole-genome datasets; however, numerous challenges need to be addressed and solved to translate phylogenomic inferences into practice. In this paper, we will analyze widely used methods to construct large phylogenetic trees, and available methods to build phylogenomic trees from whole-genome datasets. We will also give recommendations for best practices when performing phylogenetic and phylogenomic analyses. The paper will enable researchers to comprehend the state-of-the-art methods and available software to efficiently study the evolutionary relationships among species from large datasets.
7

Xiao, Guohua, Guirong Tang, and Chengshu Wang. "Congruence Amidst Discordance between Sequence and Protein-Content Based Phylogenies of Fungi." Journal of Fungi 6, no. 3 (August 13, 2020): 134. http://dx.doi.org/10.3390/jof6030134.

Abstract:
Amid the genomic data explosion, phylogenomic analysis has resolved the tree of life of different organisms, including fungi. Genome-wide clustering has also been conducted based on gene content data that can lighten the issue of the unequal evolutionary rate of genes. In this study, using different fungal species as models, we performed phylogenomic and protein-content (PC)-based clustering analysis. The obtained sequence tree reflects the phylogenetic trajectory of examined fungal species. However, 15 PC-based trees constructed from the Pfam matrices of the whole genomes, four protein families, and ten subcellular locations largely failed to resolve the speciation relationship of cross-phylum fungal species. However, lifestyle and taxonomic associations were more or less evident between closely related fungal species from PC-based trees. Pairwise congruence tests indicated that a varied level of congruent or discordant relationships were observed between sequence- and PC-based trees, and among PC-based trees. It was intriguing to find that a few protein family and subcellular PC-based trees were more topologically similar to the phylogenomic tree than was the whole genome PC-based phylogeny. In particular, a most significant level of congruence was observed between sequence- and cell wall PC-based trees. Cophylogenetic analysis conducted in this study may benefit the prediction of the magnitude of evolutionary conservation, interactive associations, or networking between different family or subcellular proteins.
8

Jiang, Xiaodong, Scott V. Edwards, and Liang Liu. "The Multispecies Coalescent Model Outperforms Concatenation Across Diverse Phylogenomic Data Sets." Systematic Biology 69, no. 4 (February 3, 2020): 795–812. http://dx.doi.org/10.1093/sysbio/syaa008.

Abstract:
A statistical framework of model comparison and model validation is essential to resolving the debates over concatenation and coalescent models in phylogenomic data analysis. A set of statistical tests are here applied and developed to evaluate and compare the adequacy of substitution, concatenation, and multispecies coalescent (MSC) models across 47 phylogenomic data sets collected across tree of life. Tests for substitution models and the concatenation assumption of topologically congruent gene trees suggest that a poor fit of substitution models, rejected by 44% of loci, and concatenation models, rejected by 38% of loci, is widespread. Logistic regression shows that the proportions of GC content and informative sites are both negatively correlated with the fit of substitution models across loci. Moreover, a substantial violation of the concatenation assumption of congruent gene trees is consistently observed across six major groups (birds, mammals, fish, insects, reptiles, and others, including other invertebrates). In contrast, among those loci adequately described by a given substitution model, the proportion of loci rejecting the MSC model is 11%, significantly lower than those rejecting the substitution and concatenation models. Although conducted on reduced data sets due to computational constraints, Bayesian model validation and comparison both strongly favor the MSC over concatenation across all data sets; the concatenation assumption of congruent gene trees rarely holds for phylogenomic data sets with more than 10 loci. Thus, for large phylogenomic data sets, model comparisons are expected to consistently and more strongly favor the coalescent model over the concatenation model. We also found that loci rejecting the MSC have little effect on species tree estimation.
Our study reveals the value of model validation and comparison in phylogenomic data analysis, as well as the need for further improvements of multilocus models and computational tools for phylogenetic inference. [Bayes factor; Bayesian model validation; coalescent prior; congruent gene trees; independent prior; Metazoa; posterior predictive simulation.]
9

Lee, Michael D. "GToTree: a user-friendly workflow for phylogenomics." Bioinformatics 35, no. 20 (March 13, 2019): 4162–64. http://dx.doi.org/10.1093/bioinformatics/btz188.

Abstract:
Summary: Genome-level evolutionary inference (i.e. phylogenomics) is becoming an increasingly essential step in many biologists’ work. Accordingly, there are several tools available for the major steps in a phylogenomics workflow. But for the biologist whose main focus is not bioinformatics, much of the computational work required (such as accessing genomic data on large scales, integrating genomes from different file formats, performing required filtering, stitching different tools together etc.) can be prohibitive. Here I introduce GToTree, a command-line tool that can take any combination of fasta files, GenBank files and/or NCBI assembly accessions as input and outputs an alignment file, estimates of genome completeness and redundancy, and a phylogenomic tree based on a specified single-copy gene (SCG) set. Although GToTree can work with any custom hidden Markov Models (HMMs), also included are 13 newly generated SCG-set HMMs for different lineages and levels of resolution, built based on searches of ∼12 000 bacterial and archaeal high-quality genomes. GToTree aims to give more researchers the capability to make phylogenomic trees. Availability and implementation: GToTree is open-source and freely available for download from: github.com/AstrobioMike/GToTree. It is implemented primarily in bash with helper scripts written in python. Supplementary information: Supplementary data are available at Bioinformatics online.
10

Zhang, Chao, Celine Scornavacca, Erin K. Molloy, and Siavash Mirarab. "ASTRAL-Pro: Quartet-Based Species-Tree Inference despite Paralogy." Molecular Biology and Evolution 37, no. 11 (September 4, 2020): 3292–307. http://dx.doi.org/10.1093/molbev/msaa139.

Abstract:
Phylogenetic inference from genome-wide data (phylogenomics) has revolutionized the study of evolution because it enables accounting for discordance among evolutionary histories across the genome. To this end, summary methods have been developed to allow accurate and scalable inference of species trees from gene trees. However, most of these methods, including the widely used ASTRAL, can only handle single-copy gene trees and do not attempt to model gene duplication and gene loss. As a result, most phylogenomic studies have focused on single-copy genes and have discarded large parts of the data. Here, we first propose a measure of quartet similarity between single-copy and multicopy trees that accounts for orthology and paralogy. We then introduce a method called ASTRAL-Pro (ASTRAL for PaRalogs and Orthologs) to find the species tree that optimizes our quartet similarity measure using dynamic programing. By studying its performance on an extensive collection of simulated data sets and on real data sets, we show that ASTRAL-Pro is more accurate than alternative methods.
11

Liu, Liang, Shaoyuan Wu, and Lili Yu. "Coalescent methods for estimating species trees from phylogenomic data." Journal of Systematics and Evolution 53, no. 5 (June 19, 2015): 380–90. http://dx.doi.org/10.1111/jse.12160.

12

Mallo, Diego, Leonardo De Oliveira Martins, and David Posada. "SimPhy: Phylogenomic Simulation of Gene, Locus, and Species Trees." Systematic Biology 65, no. 2 (November 1, 2015): 334–44. http://dx.doi.org/10.1093/sysbio/syv082.

13

Scornavacca, C., V. Berry, and V. Ranwez. "Building species trees from larger parts of phylogenomic databases." Information and Computation 209, no. 3 (March 2011): 590–605. http://dx.doi.org/10.1016/j.ic.2010.11.022.

14

Jennings, W. Bryan. "On the independent gene trees assumption in phylogenomic studies." Molecular Ecology 26, no. 19 (September 14, 2017): 4862–71. http://dx.doi.org/10.1111/mec.14274.

15

Vinh, Le Sy. "Modeling Amino Acid Substitutions for Whole Genomes." Journal of Computer Science and Cybernetics 37, no. 4 (October 12, 2021): 351–63. http://dx.doi.org/10.15625/1813-9663/37/4/15937.

Abstract:
Modeling amino acid substitution process is a core task in bioinformatics. New advanced sequencing technologies have generated huge datasets including whole genomes from various species. Estimating amino acid substitution models from whole genome datasets provides us unprecedented opportunities to accurately investigate relationships among species. In this paper, we review state-of-the-art computational methods to estimate amino acid substitution models from large datasets. We also describe a comprehensive pipeline to practically estimate amino acid models from whole genome datasets. Finally, we apply amino acid substitution models to build phylogenomic trees from bird and plant genome datasets. We compare our newly reconstructed phylogenomic trees and published ones and discuss new findings.
16

Minh, Bui Quang, Matthew W. Hahn, and Robert Lanfear. "New Methods to Calculate Concordance Factors for Phylogenomic Datasets." Molecular Biology and Evolution 37, no. 9 (May 4, 2020): 2727–33. http://dx.doi.org/10.1093/molbev/msaa106.

Abstract:
We implement two measures for quantifying genealogical concordance in phylogenomic data sets: the gene concordance factor (gCF) and the novel site concordance factor (sCF). For every branch of a reference tree, gCF is defined as the percentage of “decisive” gene trees containing that branch. This measure is already in wide usage, but here we introduce a package that calculates it while accounting for variable taxon coverage among gene trees. sCF is a new measure defined as the percentage of decisive sites supporting a branch in the reference tree. gCF and sCF complement classical measures of branch support in phylogenetics by providing a full description of underlying disagreement among loci and sites. An easy to use implementation and tutorial is freely available in the IQ-TREE software package (http://www.iqtree.org/doc/Concordance-Factor, last accessed May 13, 2020).
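The gCF definition in the abstract above can be illustrated with a small Python sketch. This is not the IQ-TREE implementation. It assumes that trees are written as nested tuples of taxon names rather than parsed from Newick, that the branch of interest is described by the four taxon groups surrounding it (which together cover every taxon), and that a gene tree counts as decisive when it samples at least one taxon from each group. All function names are hypothetical.

```python
def leaves(tree):
    """Return the set of taxon names below a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf set of every internal node of the tree."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def gene_concordance_factor(groups, gene_trees):
    """Percentage of decisive gene trees that contain the branch (A,B)|(C,D).

    groups: four iterables of taxon names (A, B, C, D) around the branch;
            assumed to partition the full taxon set.
    Returns None when no gene tree is decisive.
    """
    group_sets = [frozenset(g) for g in groups]
    side_one = group_sets[0] | group_sets[1]
    decisive = 0
    concordant = 0
    for tree in gene_trees:
        taxa = leaves(tree)
        # Assumed decisiveness rule: every one of the four groups is sampled.
        if not all(taxa & g for g in group_sets):
            continue
        decisive += 1
        left = taxa & side_one   # sampled taxa on the A+B side
        right = taxa - left      # sampled taxa on the C+D side
        tree_clades = clades(tree)
        # Trees are compared as unrooted: either side may appear as a clade.
        if left in tree_clades or right in tree_clades:
            concordant += 1
    if decisive == 0:
        return None
    return 100.0 * concordant / decisive


branch = (["A"], ["B"], ["C"], ["D", "E"])
gene_trees = [
    (("A", "B"), ("C", ("D", "E"))),  # decisive, contains AB|CDE
    (("A", "C"), ("B", ("D", "E"))),  # decisive, conflicts with the branch
    (("A", "B"), ("C", "D")),         # missing E but still decisive
    (("A", "B"), "C"),                # samples no taxon from the fourth group
]
print(round(gene_concordance_factor(branch, gene_trees), 2))
```

In this toy input three gene trees are decisive and two of them contain the branch, so the fourth tree is simply ignored rather than counted as conflicting, which is how variable taxon coverage enters the calculation.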
17

Wang, Yaxuan, Zhen Cao, Huw A. Ogilvie, and Luay Nakhleh. "Phylogenomic assessment of the role of hybridization and introgression in trait evolution." PLOS Genetics 17, no. 8 (August 18, 2021): e1009701. http://dx.doi.org/10.1371/journal.pgen.1009701.

Abstract:
Trait evolution among a set of species—a central theme in evolutionary biology—has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees, incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait’s evolution (which we call xenoplasy). Very importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.
18

Kimball, Rebecca T., Carl H. Oliveros, Ning Wang, Noor D. White, F. Keith Barker, Daniel J. Field, Daniel T. Ksepka, et al. "A Phylogenomic Supertree of Birds." Diversity 11, no. 7 (July 10, 2019): 109. http://dx.doi.org/10.3390/d11070109.

Abstract:
It has long been appreciated that analyses of genomic data (e.g., whole genome sequencing or sequence capture) have the potential to reveal the tree of life, but it remains challenging to move from sequence data to a clear understanding of evolutionary history, in part due to the computational challenges of phylogenetic estimation using genome-scale data. Supertree methods solve that challenge because they facilitate a divide-and-conquer approach for large-scale phylogeny inference by integrating smaller subtrees in a computationally efficient manner. Here, we combined information from sequence capture and whole-genome phylogenies using supertree methods. However, the available phylogenomic trees had limited overlap so we used taxon-rich (but not phylogenomic) megaphylogenies to weave them together. This allowed us to construct a phylogenomic supertree, with support values, that included 707 bird species (~7% of avian species diversity). We estimated branch lengths using mitochondrial sequence data and we used these branch lengths to estimate divergence times. Our time-calibrated supertree supports radiation of all three major avian clades (Palaeognathae, Galloanseres, and Neoaves) near the Cretaceous-Paleogene (K-Pg) boundary. The approach we used will permit the continued addition of taxa to this supertree as new phylogenomic data are published, and it could be applied to other taxa as well.
19

Paule, Juraj, Roswitha Schmickl, Tomáš Fér, Sabine Matuszak-Renger, Heidemarie Halbritter, and Georg Zizka. "Phylogenomic insights into the Fascicularia-Ochagavia group (Bromelioideae, Bromeliaceae)." Botanical Journal of the Linnean Society 192, no. 4 (December 21, 2019): 642–55. http://dx.doi.org/10.1093/botlinnean/boz085.

Abstract:
Ochagavia (four species) and Fascicularia (one species) form a well-supported clade of the early-diverging Bromelioideae. The two genera are morphologically similar, but they can be easily discerned on the basis of generative characters. Besides the species distributed on the Chilean mainland, the group includes O. elegans, endemic to the Robinson Crusoe Island of the Juan Fernández Islands. In previous molecular phylogenetic studies, O. elegans formed a sister clade to the remainder of Fascicularia and Ochagavia. A phylogenomic approach, including nearly complete and, in five cases, full plastomes (c. 160 kbp) and the nuclear rDNA cistron (c. 6 kbp), and scanning electron microscope (SEM) images of pollen were used to analyse relationships in the Fascicularia-Ochagavia group. Plastome and nuclear trees were largely congruent and supported previous phylogenetic analyses of O. elegans being sister to the remainder of the group. A divergent phylogenetic position was suggested for O. carnea using different organellar trees. SEM analysis of pollen supported the division of Fascicularia and Ochagavia. Evolutionary and taxonomic implications of our results are discussed.
20

Chernomor, Olga, Bui Quang Minh, and Arndt von Haeseler. "Consequences of Common Topological Rearrangements for Partition Trees in Phylogenomic Inference." Journal of Computational Biology 22, no. 12 (December 2015): 1129–42. http://dx.doi.org/10.1089/cmb.2015.0146.

21

Page, Andrew J., Martin Hunt, Torsten Seemann, and Jacqueline A. Keane. "SaffronTree: Fast, reference-free pseudo-phylogenomic trees from reads or contigs." Journal of Open Source Software 2, no. 13 (May 3, 2017): 243. http://dx.doi.org/10.21105/joss.00243.

22

Steenwyk, Jacob L., Thomas J. Buida, Abigail L. Labella, Yuanning Li, Xing-Xing Shen, and Antonis Rokas. "PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data." Bioinformatics 37, no. 16 (February 9, 2021): 2325–31. http://dx.doi.org/10.1093/bioinformatics/btab096.

Abstract:
Motivation: Diverse disciplines in biology process and analyze multiple sequence alignments (MSAs) and phylogenetic trees to evaluate their information content, infer evolutionary events and processes and predict gene function. However, automated processing of MSAs and trees remains a challenge due to the lack of a unified toolkit. To fill this gap, we introduce PhyKIT, a toolkit for the UNIX shell environment with 30 functions that process MSAs and trees, including but not limited to estimation of mutation rate, evaluation of sequence composition biases, calculation of the degree of violation of a molecular clock and collapsing bipartitions (internal branches) with low support. Results: To demonstrate the utility of PhyKIT, we detail three use cases: (1) summarizing information content in MSAs and phylogenetic trees for diagnosing potential biases in sequence or tree data; (2) evaluating gene–gene covariation of evolutionary rates to identify functional relationships, including novel ones, among genes and (3) identify lack of resolution events or polytomies in phylogenetic trees, which are suggestive of rapid radiation events or lack of data. We anticipate PhyKIT will be useful for processing, examining and deriving biological meaning from increasingly large phylogenomic datasets. Availability and implementation: PhyKIT is freely available on GitHub (https://github.com/JLSteenwyk/PhyKIT), PyPi (https://pypi.org/project/phykit/) and the Anaconda Cloud (https://anaconda.org/JLSteenwyk/phykit) under the MIT license with extensive documentation and user tutorials (https://jlsteenwyk.com/PhyKIT). Supplementary information: Supplementary data are available at Bioinformatics online.
23

Saarela, Jeffery M., Sean V. Burke, William P. Wysocki, Matthew D. Barrett, Lynn G. Clark, Joseph M. Craine, Paul M. Peterson, Robert J. Soreng, Maria S. Vorontsova, and Melvin R. Duvall. "A 250 plastome phylogeny of the grass family (Poaceae): topological support under different data partitions." PeerJ 6 (February 2, 2018): e4299. http://dx.doi.org/10.7717/peerj.4299.

Abstract:
The systematics of grasses has advanced through applications of plastome phylogenomics, although studies have been largely limited to subfamilies or other subgroups of Poaceae. Here we present a plastome phylogenomic analysis of 250 complete plastomes (179 genera) sampled from 44 of the 52 tribes of Poaceae. Plastome sequences were determined from high throughput sequencing libraries and the assemblies represent over 28.7 Mbases of sequence data. Phylogenetic signal was characterized in 14 partitions, including (1) complete plastomes; (2) protein coding regions; (3) noncoding regions; and (4) three loci commonly used in single and multi-gene studies of grasses. Each of the four main partitions was further refined, alternatively including or excluding positively selected codons and also the gaps introduced by the alignment. All 76 protein coding plastome loci were found to be predominantly under purifying selection, but specific codons were found to be under positive selection in 65 loci. The loci that have been widely used in multi-gene phylogenetic studies had among the highest proportions of positively selected codons, suggesting caution in the interpretation of these earlier results. Plastome phylogenomic analyses confirmed the backbone topology for Poaceae with maximum bootstrap support (BP). Among the 14 analyses, 82 clades out of 309 resolved were maximally supported in all trees. Analyses of newly sequenced plastomes were in agreement with current classifications. Five of seven partitions in which alignment gaps were removed retrieved Panicoideae as sister to the remaining PACMAD subfamilies. Alternative topologies were recovered in trees from partitions that included alignment gaps. This suggests that ambiguities in aligning these uncertain regions might introduce a false signal. 
Resolution of these and other critical branch points in the phylogeny of Poaceae will help to better understand the selective forces that drove the radiation of the BOP and PACMAD clades comprising more than 99.9% of grass diversity.
24

McGowen, Michael R., Georgia Tsagkogeorga, Sandra Álvarez-Carretero, Mario dos Reis, Monika Struebig, Robert Deaville, Paul D. Jepson, et al. "Phylogenomic Resolution of the Cetacean Tree of Life Using Target Sequence Capture." Systematic Biology 69, no. 3 (October 21, 2019): 479–501. http://dx.doi.org/10.1093/sysbio/syz068.

Abstract:
The evolution of cetaceans, from their early transition to an aquatic lifestyle to their subsequent diversification, has been the subject of numerous studies. However, although the higher-level relationships among cetacean families have been largely settled, several aspects of the systematics within these groups remain unresolved. Problematic clades include the oceanic dolphins (37 spp.), which have experienced a recent rapid radiation, and the beaked whales (22 spp.), which have not been investigated in detail using nuclear loci. The combined application of high-throughput sequencing with techniques that target specific genomic sequences provide a powerful means of rapidly generating large volumes of orthologous sequence data for use in phylogenomic studies. To elucidate the phylogenetic relationships within the Cetacea, we combined sequence capture with Illumina sequencing to generate data for ~3200 protein-coding genes for 68 cetacean species and their close relatives including the pygmy hippopotamus. By combining data from >38,000 exons with existing sequences from 11 cetaceans and seven outgroup taxa, we produced the first comprehensive comparative genomic data set for cetaceans, spanning 6,527,596 aligned base pairs (bp) and 89 taxa. Phylogenetic trees reconstructed with maximum likelihood and Bayesian inference of concatenated loci, as well as with coalescence analyses of individual gene trees, produced mostly concordant and well-supported trees. Our results completely resolve the relationships among beaked whales as well as the contentious relationships among oceanic dolphins, especially the problematic subfamily Delphinidae. We carried out Bayesian estimation of species divergence times using MCMCTree and compared our complete data set to a subset of clocklike genes. Analyses using the complete data set consistently showed less variance in divergence times than the reduced data set.
In addition, integration of new fossils (e.g., Mystacodon selenensis) indicates that the diversification of Crown Cetacea began before the Late Eocene and the divergence of Crown Delphinidae as early as the Middle Miocene. [Cetaceans; phylogenomics; Delphinidae; Ziphiidae; dolphins; whales.]
25

Mai, Uyen, and Siavash Mirarab. "Completing gene trees without species trees in sub-quadratic time." Bioinformatics 38, no. 6 (January 3, 2022): 1532–41. http://dx.doi.org/10.1093/bioinformatics/btab875.

Abstract:
Motivation: As genome-wide reconstruction of phylogenetic trees becomes more widespread, limitations of available data are being appreciated more than ever before. One issue is that phylogenomic datasets are riddled with missing data, and gene trees, in particular, almost always lack representatives from some species otherwise available in the dataset. Since many downstream applications of gene trees require or can benefit from access to complete gene trees, it will be beneficial to algorithmically complete gene trees. Also, gene trees are often unrooted, and rooting them is useful for downstream applications. While completing and rooting a gene tree with respect to a given species tree has been studied, those problems are not studied in depth when we lack such a reference species tree. Results: We study completion of gene trees without a need for a reference species tree. We formulate an optimization problem to complete the gene trees while minimizing their quartet distance to the given set of gene trees. We extend a seminal algorithm by Brodal et al. to solve this problem in quasi-linear time. In simulated studies and on a large empirical dataset, we show that completion of gene trees using other gene trees is relatively accurate and, unlike the case where a species tree is available, is unbiased. Availability and implementation: Our method, tripVote, is available at https://github.com/uym2/tripVote. Supplementary information: Supplementary data are available at Bioinformatics online.
26

Timme, Ruth E., Hugh Rand, Martin Shumway, Eija K. Trees, Mustafa Simmons, Richa Agarwala, Steven Davis, et al. "Benchmark datasets for phylogenomic pipeline validation, applications for foodborne pathogen surveillance." PeerJ 5 (October 6, 2017): e3893. http://dx.doi.org/10.7717/peerj.3893.

Abstract:
Background: As next generation sequence technology has advanced, there have been parallel advances in genome-scale analysis programs for determining evolutionary relationships as proxies for epidemiological relationship in public health. Most new programs skip traditional steps of ortholog determination and multi-gene alignment, instead identifying variants across a set of genomes, then summarizing results in a matrix of single-nucleotide polymorphisms or alleles for standard phylogenetic analysis. However, public health authorities need to document the performance of these methods with appropriate and comprehensive datasets so they can be validated for specific purposes, e.g., outbreak surveillance. Here we propose a set of benchmark datasets to be used for comparison and validation of phylogenomic pipelines. Methods: We identified four well-documented foodborne pathogen events in which the epidemiology was concordant with routine phylogenomic analyses (reference-based SNP and wgMLST approaches). These are ideal benchmark datasets, as the trees, WGS data, and epidemiological data for each are all in agreement. We have placed these sequence data, sample metadata, and “known” phylogenetic trees in publicly-accessible databases and developed a standard descriptive spreadsheet format describing each dataset. To facilitate easy downloading of these benchmarks, we developed an automated script that uses the standard descriptive spreadsheet format. Results: Our “outbreak” benchmark datasets represent the four major foodborne bacterial pathogens (Listeria monocytogenes, Salmonella enterica, Escherichia coli, and Campylobacter jejuni) and one simulated dataset where the “known tree” can be accurately called the “true tree”. The downloading script and associated table files are available on GitHub: https://github.com/WGS-standards-and-analysis/datasets. Discussion: These five benchmark datasets will help standardize comparison of current and future phylogenomic pipelines, and facilitate important cross-institutional collaborations. Our work is part of a global effort to provide collaborative infrastructure for sequence data and analytic tools; we welcome additional benchmark datasets in our recommended format, and, if relevant, we will add these on our GitHub site. Together, these datasets, dataset format, and the underlying GitHub infrastructure present a recommended path for worldwide standardization of phylogenomic pipelines.
27

Zhou, Xiaofan, Sarah Lutteropp, Lucas Czech, Alexandros Stamatakis, Moritz Von Looz, and Antonis Rokas. "Quartet-Based Computations of Internode Certainty Provide Robust Measures of Phylogenetic Incongruence." Systematic Biology 69, no. 2 (August 29, 2019): 308–24. http://dx.doi.org/10.1093/sysbio/syz058.

Abstract:
Incongruence, or topological conflict, is prevalent in genome-scale data sets. Internode certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internal branch among a set of phylogenetic trees and complement regular branch support measures (e.g., bootstrap, posterior probability) that instead assess the statistical confidence of inference. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions (or splits) on a set of trees, IC score calculation typically requires adjusting the frequencies of bipartitions from these partial gene trees. However, when the proportion of missing taxa is high, the scores yielded by current approaches that adjust bipartition frequencies in partial gene trees differ substantially from each other and tend to be overestimates. To overcome these issues, we developed three new IC measures based on the frequencies of quartets, which naturally apply to both complete and partial trees. Comparison of our new quartet-based measures to previous bipartition-based measures on simulated data shows that: (1) on complete data sets, both quartet-based and bipartition-based measures yield very similar IC scores; (2) IC scores of quartet-based measures on a given data set with and without missing taxa are more similar than the scores of bipartition-based measures; and (3) quartet-based measures are more robust to the absence of phylogenetic signal and errors in phylogenetic inference than bipartition-based measures. Additionally, the analysis of an empirical mammalian phylogenomic data set using our quartet-based measures reveals the presence of substantial levels of incongruence for numerous internal branches. An efficient open-source implementation of these quartet-based measures is freely available in the program QuartetScores (https://github.com/lutteropp/QuartetScores).
28

Stephens, Timothy G., Debashish Bhattacharya, Mark A. Ragan, and Cheong Xin Chan. "PhySortR: a fast, flexible tool for sorting phylogenetic trees in R." PeerJ 4 (May 12, 2016): e2038. http://dx.doi.org/10.7717/peerj.2038.

Abstract:
A frequent bottleneck in interpreting phylogenomic output is the need to screen often thousands of trees for features of interest, particularly robust clades of specific taxa, as evidence of monophyletic relationship and/or reticulated evolution. Here we present PhySortR, a fast, flexible R package for classifying phylogenetic trees. Unlike existing utilities, PhySortR allows for identification of both exclusive and non-exclusive clades uniting the target taxa based on tip labels (i.e., leaves) on a tree, with customisable options to assess clades within the context of the whole tree. Using simulated and empirical datasets, we demonstrate the potential and scalability of PhySortR in analysis of thousands of phylogenetic trees without a priori assumption of tree-rooting, and in yielding readily interpretable trees that unambiguously satisfy the query. PhySortR is a command-line tool that is freely available and easily automatable.
29

Bordenstein, S. R., C. Paraskevopoulos, J. C. Dunning Hotopp, P. Sapountzis, N. Lo, C. Bandi, H. Tettelin, J. H. Werren, and K. Bourtzis. "Parasitism and Mutualism in Wolbachia: What the Phylogenomic Trees Can and Cannot Say." Molecular Biology and Evolution 26, no. 1 (October 6, 2008): 231–41. http://dx.doi.org/10.1093/molbev/msn243.

30

Allio, Rémi, Céline Scornavacca, Benoit Nabholz, Anne-Laure Clamens, Felix AH Sperling, and Fabien L. Condamine. "Whole Genome Shotgun Phylogenomics Resolves the Pattern and Timing of Swallowtail Butterfly Evolution." Systematic Biology 69, no. 1 (May 7, 2019): 38–60. http://dx.doi.org/10.1093/sysbio/syz030.

Abstract:
Evolutionary relationships have remained unresolved in many well-studied groups, even though advances in next-generation sequencing and analysis, using approaches such as transcriptomics, anchored hybrid enrichment, or ultraconserved elements, have brought systematics to the brink of whole genome phylogenomics. Recently, it has become possible to sequence the entire genomes of numerous nonbiological models in parallel at reasonable cost, particularly with shotgun sequencing. Here, we identify orthologous coding sequences from whole-genome shotgun sequences, which we then use to investigate the relevance and power of phylogenomic relationship inference and time-calibrated tree estimation. We study an iconic group of butterflies, swallowtails of the family Papilionidae, that has remained phylogenetically unresolved, with continued debate about the timing of their diversification. Low-coverage whole genomes were obtained using Illumina shotgun sequencing for all genera. Genome assembly coupled to BLAST-based orthology searches allowed extraction of 6621 orthologous protein-coding genes for 45 Papilionidae species and 16 outgroup species (with 32% missing data after cleaning phases). Supermatrix phylogenomic analyses were performed with both maximum-likelihood (IQ-TREE) and Bayesian mixture models (PhyloBayes) for amino acid sequences, which produced a fully resolved phylogeny providing new insights into controversial relationships. Species tree reconstruction from gene trees was performed with ASTRAL and SuperTriplets and recovered the same phylogeny. We estimated gene site concordant factors to complement traditional node-support measures, which strengthens the robustness of inferred phylogenies. Bayesian estimates of divergence times based on a reduced data set (760 orthologs and 12% missing data) indicate a mid-Cretaceous origin of Papilionoidea around 99.2 Ma (95% credibility interval: 68.6–142.7 Ma) and Papilionidae around 71.4 Ma (49.8–103.6 Ma), with subsequent diversification of modern lineages well after the Cretaceous-Paleogene event. These results show that shotgun sequencing of whole genomes, even when highly fragmented, represents a powerful approach to phylogenomics and molecular dating in a group that has previously been refractory to resolution.
31

Dorrell, Richard G., Adrien Villain, Benoît Perez-Lamarque, Guillemette Audren de Kerdrel, Giselle McCallum, Andrew K. Watson, Ouardia Ait-Mohamed, et al. "Phylogenomic fingerprinting of tempo and functions of horizontal gene transfer within ochrophytes." Proceedings of the National Academy of Sciences 118, no. 4 (January 8, 2021): e2009974118. http://dx.doi.org/10.1073/pnas.2009974118.

Abstract:
Horizontal gene transfer (HGT) is an important source of novelty in eukaryotic genomes. This is particularly true for the ochrophytes, a diverse and important group of algae. Previous studies have shown that ochrophytes possess a mosaic of genes derived from bacteria and eukaryotic algae, acquired through chloroplast endosymbiosis and from HGTs, although understanding of the time points and mechanisms underpinning these transfers has been restricted by the depth of taxonomic sampling possible. We harness an expanded set of ochrophyte sequence libraries, alongside automated and manual phylogenetic annotation, in silico modeling, and experimental techniques, to assess the frequency and functions of HGT across this lineage. Through manual annotation of thousands of single-gene trees, we identify continuous bacterial HGT as the predominant source of recently arrived genes in the model diatom Phaeodactylum tricornutum. Using a large-scale automated dataset, a multigene ochrophyte reference tree, and mathematical reconciliation of gene trees, we note a probable elevation of bacterial HGTs at foundational points in diatom evolution, following their divergence from other ochrophytes. Finally, we demonstrate that throughout ochrophyte evolutionary history, bacterial HGTs have been enriched in genes encoding secreted proteins. Our study provides insights into the sources and frequency of HGTs, and functional contributions that HGT has made to algal evolution.
33

Smith, Brian Tilston, William M. Mauck, Brett W. Benz, and Michael J. Andersen. "Uneven Missing Data Skew Phylogenomic Relationships within the Lories and Lorikeets." Genome Biology and Evolution 12, no. 7 (May 29, 2020): 1131–47. http://dx.doi.org/10.1093/gbe/evaa113.

Abstract:
The resolution of the Tree of Life has accelerated with advances in DNA sequencing technology. To achieve dense taxon sampling, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. However, DNA from historical material is generally degraded, which presents various challenges. In this study, we evaluated how the coverage at variant sites and missing data among historical and modern samples impact phylogenomic inference. We explored these patterns in the brush-tongued parrots (lories and lorikeets) of Australasia by sampling ultraconserved elements in 105 taxa. Trees estimated with low coverage characters had several clades where relationships appeared to be influenced by whether the sample came from historical or modern specimens, which were not observed when more stringent filtering was applied. To assess if the topologies were affected by missing data, we performed an outlier analysis of sites and loci, and a data reduction approach where we excluded sites based on data completeness. Depending on the outlier test, 0.15% of total sites or 38% of loci were driving the topological differences among trees, and at these sites, historical samples had 10.9× more missing data than modern ones. In contrast, 70% data completeness was necessary to avoid spurious relationships. Predictive modeling found that outlier analysis scores were correlated with parsimony informative sites in the clades whose topologies changed the most by filtering. After accounting for biased loci and understanding the stability of relationships, we inferred a more robust phylogenetic hypothesis for lories and lorikeets.
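The data reduction step mentioned in this abstract (excluding sites based on data completeness) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the set of characters treated as missing, and the dictionary-based alignment are assumptions made for illustration, and the 0.7 default merely mirrors the 70% figure quoted above.

```python
# Hypothetical sketch: keep only alignment columns whose fraction of
# non-missing characters meets a completeness threshold.
MISSING = set("-?Nn")  # assumption: characters counted as missing data


def filter_sites_by_completeness(alignment, min_completeness=0.7):
    """Return a copy of `alignment` (taxon -> aligned sequence) restricted
    to columns where at least `min_completeness` of taxa have data."""
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    n_taxa = len(seqs)
    keep = [
        i for i in range(length)
        if sum(s[i] not in MISSING for s in seqs) / n_taxa >= min_completeness
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


toy = {
    "modern1": "ACGT",
    "modern2": "ACGT",
    "historic1": "A-N?",
    "historic2": "AC-T",
}
filtered = filter_sites_by_completeness(toy, min_completeness=0.7)
```

In the toy alignment the third column is only 50% complete and is dropped, while the other three columns pass the threshold.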
34

Wainaina, James M., Elijah Ateka, Timothy Makori, Monica A. Kehoe, and Laura M. Boykin. "Phylogenomic relationship and evolutionary insights of sweet potato viruses from the western highlands of Kenya." PeerJ 6 (July 19, 2018): e5254. http://dx.doi.org/10.7717/peerj.5254.

Abstract:
Sweet potato is a major food security crop within sub-Saharan Africa, where 90% of Africa's production occurs. One of the major limitations of sweet potato production is viral infection. In this study, we used a combination of whole genome sequences from a field isolate obtained from Kenya and those available in GenBank. Sequences of four sweet potato viruses, Sweet potato feathery mottle virus (SPFMV), Sweet potato virus C (SPVC), Sweet potato chlorotic stunt virus (SPCSV), and Sweet potato chlorotic fleck virus (SPCFV), were obtained from the Kenyan sample. SPFMV sequences both from this study and from GenBank were found to be recombinant. Recombination breakpoints were found within the NIa-Pro, coat protein and P1 genes. The SPCSV, SPVC, and SPCFV viruses from this study were non-recombinant. Bayesian phylogenomic relationships across whole genome trees showed variation in the number of well-supported clades; within SPCSV (RNA1 and RNA2) and SPFMV two well-supported clades (I and II) were resolved. The SPCFV tree resolved three well-supported clades (I–III) while four well-supported clades were resolved in SPVC (I–IV). Similar clades were resolved within the coalescent species trees. However, there were disagreements between the clades resolved in the gene trees compared to those from the whole genome tree and coalescent species trees. However, the coat protein gene tree of SPCSV and SPCFV resolved similar clades to the genome and coalescent species tree while this was not the case in SPFMV and SPVC. In addition, we report variation in selective pressure within sites of individual genes across all four viruses; overall all viruses were under purifying selection. We report the first complete genomes of SPFMV, SPVC, SPCFV, and a partial SPCSV from Kenya as a mixed infection in one sample. Our findings provide a snapshot of the evolutionary relationships of sweet potato viruses (SPFMV, SPVC, SPCFV, and SPCSV) from Kenya and assess whether selection pressure has an effect on their evolution.
35

Nelson, Thomas C., Angela M. Stathos, Daniel D. Vanderpool, Findley R. Finseth, Yao-wu Yuan, and Lila Fishman. "Ancient and recent introgression shape the evolutionary history of pollinator adaptation and speciation in a model monkeyflower radiation (Mimulus section Erythranthe)." PLOS Genetics 17, no. 2 (February 22, 2021): e1009095. http://dx.doi.org/10.1371/journal.pgen.1009095.

Abstract:
Inferences about past processes of adaptation and speciation require a gene-scale and genome-wide understanding of the evolutionary history of diverging taxa. In this study, we use genome-wide capture of nuclear gene sequences, plus skimming of organellar sequences, to investigate the phylogenomics of monkeyflowers in Mimulus section Erythranthe (27 accessions from seven species). Taxa within Erythranthe, particularly the parapatric and putatively sister species M. lewisii (bee-pollinated) and M. cardinalis (hummingbird-pollinated), have been a model system for investigating the ecological genetics of speciation and adaptation for over five decades. Across >8000 nuclear loci, multiple methods resolve a predominant species tree in which M. cardinalis groups with other hummingbird-pollinated taxa (37% of gene trees), rather than being sister to M. lewisii (32% of gene trees). We independently corroborate a single evolution of hummingbird pollination syndrome in Erythranthe by demonstrating functional redundancy in genetic complementation tests of floral traits in hybrids; together, these analyses overturn a textbook case of pollination-syndrome convergence. Strong asymmetries in allele sharing (Patterson’s D-statistic and related tests) indicate that gene tree discordance reflects ancient and recent introgression rather than incomplete lineage sorting. Consistent with abundant introgression blurring the history of divergence, low-recombination and adaptation-associated regions support the new species tree, while high-recombination regions generate phylogenetic evidence for sister status for M. lewisii and M. cardinalis. Population-level sampling of core taxa also revealed two instances of chloroplast capture, with Sierran M. lewisii and Southern Californian M. parishii each carrying organelle genomes nested within respective sympatric M. cardinalis clades. A recent organellar transfer from M. cardinalis, an outcrosser where selfish cytonuclear dynamics are more likely, may account for the unexpected cytoplasmic male sterility effects of selfer M. parishii organelles in hybrids with M. lewisii. Overall, our phylogenomic results reveal extensive reticulation throughout the evolutionary history of a classic monkeyflower radiation, suggesting that natural selection (re-)assembles and maintains species-diagnostic traits and barriers in the face of gene flow. Our findings further underline the challenges, even in reproductively isolated species, in distinguishing re-use of adaptive alleles from true convergence and emphasize the value of a phylogenomic framework for reconstructing the evolutionary genetics of adaptation and speciation.
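For readers unfamiliar with the allele-sharing test named in this abstract, the following Python sketch computes Patterson's D in its simplest textbook form, D = (nABBA - nBABA) / (nABBA + nBABA), for a four-taxon arrangement (((P1, P2), P3), O). It is an illustration only, not the authors' analysis: it assumes one haploid sequence per taxon rather than population allele frequencies, and the function name and toy sequences are hypothetical.

```python
# Hypothetical sketch of Patterson's D from four aligned sequences.
VALID = set("ACGT")


def patterson_d(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns, taking the outgroup state as
    ancestral ('A'), and return (D, n_abba, n_baba)."""
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        if not {a, b, c, o} <= VALID:
            continue  # skip gaps and ambiguous bases
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return d, abba, baba


d, n_abba, n_baba = patterson_d("ATAAA", "TAGCA", "TTGCA", "AAAAA")
```

Under incomplete lineage sorting alone the two patterns are expected in equal numbers, so D near zero is the null expectation; a strong excess of one pattern is the kind of asymmetry the abstract interprets as introgression.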
36

Rabiee, Maryam, and Siavash Mirarab. "INSTRAL: Discordance-Aware Phylogenetic Placement Using Quartet Scores." Systematic Biology 69, no. 2 (August 12, 2019): 384–91. http://dx.doi.org/10.1093/sysbio/syz045.

Abstract:
Phylogenomic analyses have increasingly adopted species tree reconstruction using methods that account for gene tree discordance using pipelines that require both human effort and computational resources. As the number of available genomes continues to increase, a new problem is facing researchers. Once more species become available, they have to repeat the whole process from the beginning because updating species trees is currently not possible. However, the de novo inference can be prohibitively costly in human effort or machine time. In this article, we introduce INSTRAL, a method that extends ASTRAL to enable phylogenetic placement. INSTRAL is designed to place a new species on an existing species tree after sequences from the new species have already been added to gene trees; thus, INSTRAL is complementary to existing placement methods that update gene trees. [ASTRAL; ILS; phylogenetic placement; species tree reconstruction.]
37

Shen, Xing-Xing, Jacob L. Steenwyk, and Antonis Rokas. "Dissecting Incongruence between Concatenation- and Quartet-Based Approaches in Phylogenomic Data." Systematic Biology 70, no. 5 (February 22, 2021): 997–1014. http://dx.doi.org/10.1093/sysbio/syab011.

Abstract:
Topological conflict or incongruence is widespread in phylogenomic data. Concatenation- and coalescent-based approaches often result in incongruent topologies, but the causes of this conflict can be difficult to characterize. We examined incongruence stemming from conflict between the likelihood-based signal (quantified by the difference in gene-wise log-likelihood score or ΔGLS) and quartet-based topological signal (quantified by the difference in gene-wise quartet score or ΔGQS) for every gene in three phylogenomic studies in animals, fungi, and plants, which were chosen because their concatenation-based IQ-TREE (T1) and quartet-based ASTRAL (T2) phylogenies are known to produce eight conflicting internal branches (bipartitions). By comparing the types of phylogenetic signal for all genes in these three data matrices, we found that 30–36% of genes in each data matrix are inconsistent, that is, each of these genes has a higher log-likelihood score for T1 versus T2 (i.e., ΔGLS > 0) whereas its T1 topology has lower quartet score than its T2 topology (i.e., ΔGQS < 0) or vice versa. Comparison of inconsistent and consistent genes using a variety of metrics (e.g., evolutionary rate, gene tree topology, distribution of branch lengths, hidden paralogy, and gene tree discordance) showed that inconsistent genes are more likely to recover neither T1 nor T2 and have higher levels of gene tree discordance than consistent genes. Simulation analyses demonstrate that the removal of inconsistent genes from data sets with low levels of incomplete lineage sorting (ILS) and low and medium levels of gene tree estimation error (GTEE) reduced incongruence and increased accuracy. In contrast, removal of inconsistent genes from data sets with medium and high ILS levels and high GTEE levels eliminated or extensively reduced incongruence, but the resulting congruent species phylogenies were not always topologically identical to the true species trees. [Conflict; gene tree; phylogenetic signal; phylogenetics; phylogenomics; Tree of Life.]
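The consistency rule stated in this abstract (a gene is inconsistent when ΔGLS and ΔGQS have opposite signs) can be written as a short Python sketch. The function name, the per-gene input format, and the handling of zero differences as "undetermined" are assumptions made here for illustration; the abstract does not say how ties are treated, and this is not the authors' code.

```python
# Hypothetical sketch: label genes by whether their likelihood-based and
# quartet-based signals favor the same topology (T1 vs. T2).
def classify_gene(delta_gls, delta_gqs):
    """delta_gls: log-likelihood(T1) - log-likelihood(T2) for one gene.
    delta_gqs: quartet score(T1) - quartet score(T2) for the same gene."""
    if delta_gls == 0 or delta_gqs == 0:
        return "undetermined"  # assumption: ties are left unclassified
    same_sign = (delta_gls > 0) == (delta_gqs > 0)
    return "consistent" if same_sign else "inconsistent"


# Toy per-gene values (made up for illustration only).
genes = {
    "geneA": (2.3, 15),    # both signals favor T1
    "geneB": (1.1, -4),    # likelihood favors T1, quartets favor T2
    "geneC": (-0.7, -9),   # both signals favor T2
    "geneD": (-3.0, 6),    # likelihood favors T2, quartets favor T1
}
labels = {name: classify_gene(gls, gqs) for name, (gls, gqs) in genes.items()}
inconsistent_fraction = sum(v == "inconsistent" for v in labels.values()) / len(labels)
```

A filter in the spirit of the simulations described above would then drop the genes labeled "inconsistent" before re-running species tree inference.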
38

Boeckmann, B., M. Robinson-Rechavi, I. Xenarios, and C. Dessimoz. "Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees." Briefings in Bioinformatics 12, no. 5 (July 7, 2011): 423–35. http://dx.doi.org/10.1093/bib/bbr034.

39

Smith, Stephen A., Nathanael Walker-Hale, Joseph F. Walker, and Joseph W. Brown. "Phylogenetic Conflicts, Combinability, and Deep Phylogenomics in Plants." Systematic Biology 69, no. 3 (November 20, 2019): 579–92. http://dx.doi.org/10.1093/sysbio/syz078.

Abstract:
Studies have demonstrated that pervasive gene tree conflict underlies several important phylogenetic relationships where different species tree methods produce conflicting results. Here, we present a means of dissecting the phylogenetic signal for alternative resolutions within a data set in order to resolve recalcitrant relationships and, importantly, identify what the data set is unable to resolve. These procedures extend upon methods for isolating conflict and concordance involving specific candidate relationships and can be used to identify systematic error and disambiguate sources of conflict among species tree inference methods. We demonstrate these on a large phylogenomic plant data set. Our results support the placement of Amborella as sister to the remaining extant angiosperms, Gnetales as sister to pines, and the monophyly of extant gymnosperms. Several other contentious relationships, including the resolution of relationships within the bryophytes and the eudicots, remain uncertain given the low number of supporting gene trees. To address whether concatenation of filtered genes amplified phylogenetic signal for relationships, we implemented a combinatorial heuristic to test combinability of genes. We found that nested conflicts limited the ability of data filtering methods to fully ameliorate conflicting signal amongst gene trees. These analyses confirmed that the underlying conflicting signal does not support broad concatenation of genes. Our approach provides a means of dissecting a specific data set to address deep phylogenetic relationships while also identifying the inferential boundaries of the data set. [Angiosperms; coalescent; gene-tree conflict; genomics; phylogenetics; phylogenomics.]
40

Chuang, Chia-Rong, Chia-Lun Hsieh, Chi-Shan Chang, Chiu-Mei Wang, Danilo N. Tandang, Elliot M. Gardner, Lauren Audi, Nyree J. C. Zerega, and Kuo-Fang Chung. "Amis Pacilo and Yami Cipoho are not the same as the Pacific breadfruit starch crop—Target enrichment phylogenomics of a long-misidentified Artocarpus species sheds light on the northward Austronesian migration from the Philippines to Taiwan." PLOS ONE 17, no. 9 (September 30, 2022): e0272680. http://dx.doi.org/10.1371/journal.pone.0272680.

Abstract:
‘Breadfruit’ is a common tree species in Taiwan. In the indigenous Austronesian Amis culture of eastern Taiwan, ‘breadfruit’ is known as Pacilo, and its fruits are consumed as food. On Lanyu (Botel Tobago) where the indigenous Yami people live, ‘breadfruit’ is called Cipoho and used for constructing houses and plank-boats. Elsewhere in Taiwan, ‘breadfruit’ is also a common ornamental tree. As an essential component of traditional Yami culture, Cipoho has long been assumed to have been transported from the Batanes Island of the Philippines to Lanyu. As such, it represents a commensal species that potentially can be used to test the hypothesis of the northward Austronesian migration ‘into’ Taiwan. However, recent phylogenomic studies using target enrichment show that Taiwanese ‘breadfruit’ might not be the same as the Pacific breadfruit (Artocarpus altilis), which was domesticated in Oceania and widely cultivated throughout the tropics. To resolve persistent misidentification of this culturally and economically important tree species of Taiwan, we sampled 36 trees of Taiwanese Artocarpus and used the Moraceae probe set to enrich 529 nuclear genes. Along with 28 archived Artocarpus sequence datasets (representing a dozen taxa from all subgenera), phylogenomic analyses showed that all Taiwanese ‘breadfruit’ samples, together with a cultivated ornamental tree from Hawaii, form a fully supported clade within the A. treculianus complex, which is composed only of endemic Philippine species. Morphologically, the Taiwanese ‘breadfruit’ matches the characters of A. treculianus. Within the Taiwanese samples of A. treculianus, Amis samples form a fully supported clade derived from within the paraphyletic grade composed of Yami samples, suggesting a Lanyu origin. Results of our target enrichment phylogenomics are consistent with the scenario that Cipoho was transported northward from the Philippines to Lanyu by Yami ancestors, though the possibility that A. treculianus is native to Lanyu cannot be ruled out completely.
41

Steenwyk, Jacob L., Dayna C. Goltz, Thomas J. Buida, Yuanning Li, Xing-Xing Shen, and Antonis Rokas. "OrthoSNAP: A tree splitting and pruning algorithm for retrieving single-copy orthologs from gene family trees." PLOS Biology 20, no. 10 (October 13, 2022): e3001827. http://dx.doi.org/10.1371/journal.pbio.3001827.

Abstract:
Molecular evolution studies, such as phylogenomic studies and genome-wide surveys of selection, often rely on gene families of single-copy orthologs (SC-OGs). Large gene families with multiple homologs in 1 or more species—a phenomenon observed among several important families of genes such as transporters and transcription factors—are often ignored because identifying and retrieving SC-OGs nested within them is challenging. To address this issue and increase the number of markers used in molecular evolution studies, we developed OrthoSNAP, a software that uses a phylogenetic framework to simultaneously split gene families into SC-OGs and prune species-specific inparalogs. We term SC-OGs identified by OrthoSNAP as SNAP-OGs because they are identified using a splitting and pruning procedure analogous to snapping branches on a tree. From 415,129 orthologous groups of genes inferred across 7 eukaryotic phylogenomic datasets, we identified 9,821 SC-OGs; using OrthoSNAP on the remaining 405,308 orthologous groups of genes, we identified an additional 10,704 SNAP-OGs. Comparison of SNAP-OGs and SC-OGs revealed that their phylogenetic information content was similar, even in complex datasets that contain a whole-genome duplication, complex patterns of duplication and loss, transcriptome data where each gene typically has multiple transcripts, and contentious branches in the tree of life. OrthoSNAP is useful for increasing the number of markers used in molecular evolution data matrices, a critical step for robustly inferring and exploring the tree of life.
42

Miller, Justin B., Lauren M. McKinnon, Michael F. Whiting, and Perry G. Ridge. "CAM: an alignment-free method to recover phylogenies using codon aversion motifs." PeerJ 7 (June 4, 2019): e6984. http://dx.doi.org/10.7717/peerj.6984.

Abstract:
Background: Common phylogenomic approaches for recovering phylogenies are often time-consuming and require annotations for orthologous gene relationships that are not always available. In contrast, alignment-free phylogenomic approaches typically use structure and oligomer frequencies to calculate pairwise distances between species. We have developed an approach to quickly calculate distances between species based on codon aversion. Methods: Utilizing a novel alignment-free character state, we present CAM, an alignment-free approach to recover phylogenies by comparing differences in codon aversion motifs (i.e., the set of unused codons within each gene) across all genes within a species. Synonymous codon usage is non-random and differs between organisms, between genes, and even within a single gene, and many genes do not use all possible codons. We report a comprehensive analysis of codon aversion within 229,742,339 genes from 23,428 species across all kingdoms of life, and we provide an alignment-free framework for its use in a phylogenetic construct. For each species, we first construct a set of codon aversion motifs spanning all genes within that species. We define the pairwise distance between two species, A and B, as one minus the number of shared codon aversion motifs divided by the total codon aversion motifs of the species, A or B, containing the fewest motifs. This approach allows us to calculate pairwise distances even when substantial differences in the number of genes or a high rate of divergence between species exists. Finally, we use neighbor-joining to recover phylogenies. Results: Using the Open Tree of Life and NCBI Taxonomy Database as expected phylogenies, our approach compares well, recovering phylogenies that largely match expected trees and are comparable to trees recovered using maximum likelihood and other alignment-free approaches. Our technique is much faster than maximum likelihood and similar in accuracy to other alignment-free approaches. Therefore, we propose that codon aversion be considered a phylogenetically conserved character that may be used in future phylogenomic studies. Availability: CAM, documentation, and test files are freely available on GitHub at https://github.com/ridgelab/cam.
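The pairwise distance defined in this abstract (one minus the number of shared codon aversion motifs divided by the motif count of the species with the fewest motifs) can be illustrated with a minimal Python sketch. This is not the CAM implementation from the authors' repository: the function names `aversion_motif`, `species_motifs`, and `cam_distance` are hypothetical, and treating all 64 nucleotide triplets (stop codons included) as the codon universe is an assumption made only for illustration.

```python
from itertools import product

# Assumption: the codon universe is all 64 DNA triplets, stop codons included.
CODONS = frozenset("".join(c) for c in product("ACGT", repeat=3))


def aversion_motif(cds: str) -> frozenset:
    """Return the set of codons that one coding sequence never uses."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    used = {cds[i:i + 3] for i in range(0, usable, 3)}
    return frozenset(CODONS - used)


def species_motifs(genes) -> set:
    """Collect the distinct aversion motifs across all genes of a species."""
    return {aversion_motif(g) for g in genes}


def cam_distance(motifs_a: set, motifs_b: set) -> float:
    """1 - |shared motifs| / (motif count of the species with fewer motifs)."""
    smaller = min(len(motifs_a), len(motifs_b))
    if smaller == 0:
        raise ValueError("each species needs at least one motif")
    return 1.0 - len(motifs_a & motifs_b) / smaller
```

The resulting pairwise distances could then be passed to any neighbor-joining implementation, which is the tree-building step the abstract names.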
43

Gulyaev, Sergey, Xin-Jie Cai, Fei-Yi Guo, Satoshi Kikuchi, Wendy L. Applequist, Zhi-Xiang Zhang, Elvira Hörandl, and Li He. "The phylogeny of Salix revealed by whole genome re-sequencing suggests different sex-determination systems in major groups of the genus." Annals of Botany 129, no. 4 (March 23, 2022): 485–98. http://dx.doi.org/10.1093/aob/mcac012.

Abstract:
Background and Aims: The largest genus of Salicaceae sensu lato, Salix, has been shown to consist of two main clades: clade Salix, in which species have XY sex-determination systems (SDSs) on chromosome 7, and clade Vetrix including species with ZW SDSs on chromosome 15. Here, we test the utility of whole genome re-sequencing (WGR) for phylogenomic reconstructions of willows to infer changes between different SDSs. Methods: We used more than 1 TB of WGR data from 70 Salix taxa to ascertain single nucleotide polymorphisms on the autosomes, the sex-linked regions (SLRs) and the chloroplast genomes, for phylogenetic and species tree analyses. To avoid bias, we chose reference genomes from both groups, Salix dunnii from clade Salix and S. purpurea from clade Vetrix. Key Results: Two main largely congruent groups were recovered: the paraphyletic Salix grade and the Vetrix clade. The autosome dataset trees resolved four subclades (C1–C4) in Vetrix. C1 and C2 comprise species from the Hengduan Mountains and adjacent areas and from Eurasia, respectively. Section Longifoliae (C3) grouped within the Vetrix clade but fell into the Salix clade in trees based on the chloroplast dataset analysis. Salix triandra from Eurasia (C4) was revealed as sister to the remaining species of clade Vetrix. In Salix, the polyploid group C5 is paraphyletic to clade Vetrix and subclade C6 is consistent with Argus’s subgenus Protitea. Chloroplast datasets separated both Vetrix and Salix as monophyletic, and yielded C5 embedded within Salix. Using only diploid species, both the SLR and autosomal datasets yielded trees with Vetrix and Salix as well-supported clades. Conclusion: WGR data are useful for phylogenomic analyses of willows. The different SDSs may contribute to the isolation of the two major groups, but the reproductive barrier between them needs to be studied.
44

Suh, Alexander. "The phylogenomic forest of bird trees contains a hard polytomy at the root of Neoaves." Zoologica Scripta 45, S1 (September 27, 2016): 50–62. http://dx.doi.org/10.1111/zsc.12213.

45

Karin, Benjamin R., Tony Gamble, and Todd R. Jackman. "Optimizing Phylogenomics with Rapidly Evolving Long Exons: Comparison with Anchored Hybrid Enrichment and Ultraconserved Elements." Molecular Biology and Evolution 37, no. 3 (November 9, 2019): 904–22. http://dx.doi.org/10.1093/molbev/msz263.

Abstract:
Marker selection has emerged as an important component of phylogenomic study design due to rising concerns of the effects of gene tree estimation error, model misspecification, and data-type differences. Researchers must balance various trade-offs associated with locus length and evolutionary rate among other factors. The most commonly used reduced representation data sets for phylogenomics are ultraconserved elements (UCEs) and Anchored Hybrid Enrichment (AHE). Here, we introduce Rapidly Evolving Long Exon Capture (RELEC), a new set of loci that targets single exons that are both rapidly evolving (evolutionary rate faster than RAG1) and relatively long in length (>1,500 bp), while at the same time avoiding paralogy issues across amniotes. We compare the RELEC data set to UCEs and AHE in squamate reptiles by aligning and analyzing orthologous sequences from 17 squamate genomes, composed of 10 snakes and 7 lizards. The RELEC data set (179 loci) outperforms AHE and UCEs by maximizing per-locus genetic variation while maintaining presence and orthology across a range of evolutionary scales. RELEC markers show higher phylogenetic informativeness than UCE and AHE loci, and RELEC gene trees show greater similarity to the species tree than AHE or UCE gene trees. Furthermore, with fewer loci, RELEC remains computationally tractable for full Bayesian coalescent species tree analyses. We contrast RELEC to and discuss important aspects of comparable methods, and demonstrate how RELEC may be the most effective set of loci for resolving difficult nodes and rapid radiations. We provide several resources for capturing or extracting RELEC loci from other amniote groups.
46

Cerón-Romero, Mario A., Xyrus X. Maurer-Alcalá, Jean-David Grattepanche, Ying Yan, Miguel M. Fonseca, and L. A. Katz. "PhyloToL: A Taxon/Gene-Rich Phylogenomic Pipeline to Explore Genome Evolution of Diverse Eukaryotes." Molecular Biology and Evolution 36, no. 8 (May 7, 2019): 1831–42. http://dx.doi.org/10.1093/molbev/msz103.

Abstract:
Estimating multiple sequence alignments (MSAs) and inferring phylogenies are essential for many aspects of comparative biology. Yet, many bioinformatics tools for such analyses have focused on specific clades, with greatest attention paid to plants, animals, and fungi. The rapid increase in high-throughput sequencing (HTS) data from diverse lineages now provides opportunities to estimate evolutionary relationships and gene family evolution across the eukaryotic tree of life. At the same time, these types of data are known to be error-prone (e.g., substitutions, contamination). To address these opportunities and challenges, we have refined a phylogenomic pipeline, now named PhyloToL, to allow easy incorporation of data from HTS studies, to automate production of both MSAs and gene trees, and to identify and remove contaminants. PhyloToL is designed for phylogenomic analyses of diverse lineages across the tree of life (i.e., at scales of >100 My). We demonstrate the power of PhyloToL by assessing stop codon usage in Ciliophora, identifying contamination in a taxon- and gene-rich database and exploring the evolutionary history of chromosomes in the kinetoplastid parasite Trypanosoma brucei, the causative agent of African sleeping sickness. Benchmarking PhyloToL’s homology assessment against that of OrthoMCL and a published paper on superfamilies of bacterial and eukaryotic organellar outer membrane pore-forming proteins demonstrates the power of our approach for determining gene family membership and inferring gene trees. PhyloToL is highly flexible and allows users to easily explore HTS data, test hypotheses about phylogeny and gene family evolution and combine outputs with third-party tools (e.g., PhyloChromoMap, iGTP).
47

Heckenhauer, Jacqueline, Ovidiu Paun, Mark W. Chase, Peter S. Ashton, A. S. Kamariah, and Rosabelle Samuel. "Molecular phylogenomics of the tribe Shoreeae (Dipterocarpaceae) using whole plastid genomes." Annals of Botany 123, no. 5 (December 12, 2018): 857–65. http://dx.doi.org/10.1093/aob/mcy220.

Abstract:
Background and Aims: Phylogenetic relationships within tribe Shoreeae, containing the main elements of tropical forests in Southeast Asia, present a long-standing problem in the systematics of Dipterocarpaceae. Sequencing whole plastomes using next-generation sequencing- (NGS) based genome skimming is increasingly employed for investigating phylogenetic relationships of plants. Here, the usefulness of complete plastid genome sequences in resolving phylogenetic relationships within Shoreeae is evaluated. Methods: A pipeline to obtain alignments of whole plastid genome sequences across individuals with different amounts of available data is presented. In total, 48 individuals, representing 37 species and four genera of the ecologically and economically important tribe Shoreeae sensu Ashton, were investigated. Phylogenetic trees were reconstructed using maximum parsimony, maximum likelihood and Bayesian inference. Key Results: Here, the first fully sequenced plastid genomes for the tribe Shoreeae are presented. Their size, GC content and gene order are comparable with those of other members of Malvales. Phylogenomic analyses demonstrate that whole plastid genomes are useful for inferring phylogenetic relationships among genera and groups of Shorea (Shoreeae) but fail to provide well-supported phylogenetic relationships among some of the most closely related species. Discordance in placement of Parashorea was observed between phylogenetic trees obtained from plastome analyses and those obtained from nuclear single nucleotide polymorphism (SNP) data sets identified in restriction-site associated sequencing (RADseq). Conclusions: Phylogenomic analyses of the entire plastid genomes are useful for inferring phylogenetic relationships at lower taxonomic levels, but are not sufficient for detailed phylogenetic reconstructions of closely related species groups in Shoreeae. Discordance in placement of Parashorea was further investigated for evidence of ancient hybridization.
48

James, Timothy Y., Jason E. Stajich, Chris Todd Hittinger, and Antonis Rokas. "Toward a Fully Resolved Fungal Tree of Life." Annual Review of Microbiology 74, no. 1 (September 8, 2020): 291–313. http://dx.doi.org/10.1146/annurev-micro-022020-051835.

Abstract:
In this review, we discuss the current status and future challenges for fully elucidating the fungal tree of life. In the last 15 years, advances in genomic technologies have revolutionized fungal systematics, ushering the field into the phylogenomic era. This has made the unthinkable possible, namely access to the entire genetic record of all known extant taxa. We first review the current status of the fungal tree and highlight areas where additional effort will be required. We then review the analytical challenges imposed by the volume of data and discuss methods to recover the most accurate species tree given the sea of gene trees. Highly resolved and deeply sampled trees are being leveraged in novel ways to study fungal radiations, species delimitation, and metabolic evolution. Finally, we discuss the critical issue of incorporating the unnamed and uncultured dark matter taxa that represent the vast majority of fungal diversity.
49

Stull, Gregory W., Pamela S. Soltis, Douglas E. Soltis, Matthew A. Gitzendanner, and Stephen A. Smith. "Nuclear phylogenomic analyses of asterids conflict with plastome trees and support novel relationships among major lineages." American Journal of Botany 107, no. 5 (May 2020): 790–805. http://dx.doi.org/10.1002/ajb2.1468.

50

Yoshida, Ruriko, Lillian Paul, and Peter Nesbitt. "Stochastic Safety Radius on UPGMA." Algorithms 15, no. 12 (December 18, 2022): 483. http://dx.doi.org/10.3390/a15120483.

Abstract:
Unweighted Pair Group Method with Arithmetic Mean (UPGMA) is one of the most popular distance-based methods to reconstruct an equidistant phylogenetic tree from a distance matrix computed from an alignment of sequences. Since we use equidistant trees as gene trees for phylogenomic analyses under the multi-species coalescent model and since an input distance matrix computed from an alignment of each gene in a genome is estimated via the maximum likelihood estimators, it is important to conduct a robust analysis on UPGMA. Stochastic safety radius, introduced by Steel and Gascuel, provides a lower bound for the probability that a phylogenetic tree reconstruction method returns the true tree topology from a given distance matrix. In this article, we compute the stochastic safety radius of UPGMA for a phylogenetic tree with n leaves. Computational experiments show an improved gap between empirical probabilities estimated from random samples and the true tree topology from UPGMA, increasing confidence in phylogenic results.
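For readers unfamiliar with the method this abstract analyzes, the following is a minimal textbook sketch of UPGMA in Python, building an equidistant tree from a pairwise distance matrix. It does not implement the stochastic safety radius computation from the article. The function name `upgma`, the nested-tuple tree representation, and the input format (a dict keyed by label pairs) are all assumptions chosen for illustration.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster leaves by UPGMA.

    names: list of leaf labels.
    dist:  dict mapping (label_a, label_b) pairs to distances.
    Returns (root, height), where root is a nested tuple of labels and
    height maps every cluster to its distance above the leaves.
    """
    size = {n: 1 for n in names}            # active clusters and their leaf counts
    d = {frozenset(k): v for k, v in dist.items()}
    height = {n: 0.0 for n in names}
    while len(size) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        new = (a, b)
        height[new] = d[frozenset((a, b))] / 2.0
        # Distance to the merged cluster is the size-weighted average.
        for c in size:
            if c in (a, b):
                continue
            d[frozenset((new, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        size[new] = size[a] + size[b]
        del size[a], size[b]
    (root,) = size
    return root, height
```

A small usage example with an ultrametric matrix on four hypothetical leaves:

```python
pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
root, height = upgma(["A", "B", "C", "D"], pairs)
```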