Dissertations on the topic "Bioinformatic methods development"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Browse the top 30 dissertations for research on the topic "Bioinformatic methods development".
Rossini, Roberto. "Development and validation of bioinformatic methods for GRC assembly and annotation." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-414739.
Ruiz, Arenas Carlos 1990. "Methods and bioinformatic tools to study polymorphic inversions in complex diseases." Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/666582.
Chromosomal inversions are structural variants where a segment changes its orientation. Chromosomal inversions reduce homologous recombination, producing different haplotypes in standard and inverted chromosomes. As a result, they influence adaptation and selection and play a role in susceptibility to human diseases. Inversions can be studied using experimental and bioinformatic methods. SNP array data can be used to call inversion genotypes by using haplotype differences between inverted and standard chromosomes. However, these methods are not optimized for large cohorts (thousands of individuals from existing databases such as dbGaP or UK Biobank). Also, current methods can only genotype inversions with two haplotypes, and inversion calls are difficult to harmonize across cohorts. Finally, it is recognized that chromosomal inversions affect gene expression and DNA methylation. However, there are no accurate methods to globally assess the effect of inversions on local gene expression or DNA methylation. The main aim of this thesis is to develop new robust and scalable methods and bioinformatic tools to study the phenotypic and functional effects of chromosomal inversions by overcoming the existing limitations. To this end, I have developed a new method to genotype chromosomal inversions that scales to large cohorts, handles inversions with multiple haplotypes, and uses reference haplotypes, allowing the integrative analysis of multiple cohorts. Second, I have implemented a multivariate method based on redundancy analysis to study the effects of chromosomal inversions on local DNA methylation and gene expression. Then, I applied both methods to study the role of chromosomal inversions in two groups of complex diseases: neurodevelopmental disorders and cancer. Finally, I developed a new method to study how chromosomal inversions affect recombination patterns.
This method is extendable to any genomic region containing subpopulations with different recombination patterns, allowing these subpopulations to be associated with phenotypic traits.
Mastick, Kellen J. "Identification of candidate genes involved in fin/limb development and evolution using bioinformatic methods." Thesis, University of South Dakota, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=1566765.
Key to understanding the transition that vertebrates made from water to land is determining the developmental and genomic bases for the changes. New bioinformatic tools provide an opportunity to automate the discovery, broaden the number of, and provide an evidence-based ranking for potential candidate genes. I sought to explore this potential for the fin/limb transition, using the substantial genetic and phenotypic data available in model organism databases. Model organism data was used to hypothesize candidate genes for the fin/limb transition. In addition, 131 fin/limb candidate genes from the literature were extracted and used as a basis for comparison with candidates from the model organism databases. Additionally, seven genes specific to limb and 24 genes specific to fin were identified as future fin/limb transition candidates.
Zierep, Paul [Verfasser], and Stefan [Akademischer Betreuer] Günther. "Development of bioinformatic methods for the prediction and understanding of biosynthesis and activity of natural products." Freiburg : Universität, 2020. http://d-nb.info/1231711752/34.
Besnier, Francois. "Development of Variance Component Methods for Genetic Dissection of Complex Traits." Doctoral thesis, Uppsala universitet, Centrum för bioinformatik, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-101399.
Jauhiainen, Alexandra. "Evaluation and Development of Methods for Identification of Biochemical Networks." Thesis, Linköping University, The Department of Physics, Chemistry and Biology, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2811.
Systems biology is an area concerned with understanding biology on a systems level, where the structure and dynamics of the system are in focus. Knowledge about the structure and dynamics of biological systems is fundamental information about cells and the interactions within cells, and it also plays an increasingly important role in medical applications.
System identification deals with the problem of constructing a model of a system from data, and an extensive theory exists, particularly for the identification of linear systems.
This is a master's thesis in systems biology treating the identification of biochemical systems. Methods based on both local parameter perturbation data and time series data have been tested and evaluated in silico.
The advantage of local parameter perturbation data methods proved to be that they demand less complex data, but the drawbacks are the reduced information content of this data and sensitivity to noise. Methods employing time series data are generally more robust to noise but the lack of available data limits the use of these methods.
The work has been conducted at the Fraunhofer-Chalmers Research Centre for Industrial Mathematics in Göteborg, and at the division of Computational Biology at the Department of Physics and Measurement Technology, Biology, and Chemistry at Linköping University during the autumn of 2004.
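The abstract above mentions constructing a model of a linear system from time series data. As a rough illustration only, and not the method used in the thesis, the following sketch estimates the single parameter of an assumed first-order model x[t+1] = a * x[t] by least squares. The model form and the function names `simulate` and `estimate_a` are hypothetical choices made for this example.

```python
def simulate(a, x0, steps):
    """Generate a noise-free time series from the assumed model x[t+1] = a * x[t]."""
    series = [x0]
    for _ in range(steps):
        series.append(a * series[-1])
    return series


def estimate_a(series):
    """Least-squares estimate of a: minimizes sum((x[t+1] - a * x[t])**2)."""
    numerator = sum(x * x_next for x, x_next in zip(series, series[1:]))
    denominator = sum(x * x for x in series[:-1])
    return numerator / denominator


if __name__ == "__main__":
    data = simulate(a=0.8, x0=1.0, steps=10)
    print(estimate_a(data))
```

With noisy measurements the same estimator still applies, but the estimate would scatter around the true value, which is consistent with the abstract's remark about sensitivity to noise.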
Hedberg, Lilia. "Identification of obesity-associated SNPs in the human genome : Method development and implementation for SOLiD sequencing data analysis." Thesis, Linköpings universitet, Institutionen för klinisk och experimentell medicin, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57932.
Li, Miaoxin, and 李淼新. "Development of a bioinformatics and statistical framework to integrate biological resources for genome-wide genetic mapping and its applications." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B43572030.
Patel, Hitesh [Verfasser], and Irmgard [Akademischer Betreuer] Merfort. "Use and development of chem-bioinformatics tools and methods for drug discovery and target identification." Freiburg : Universität, 2015. http://d-nb.info/1115495917/34.
Pennington, Steven. "Pulsed induction, a method to identify genetic regulators of determination events." Thesis, Oklahoma State University, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3727701.
Abstract: Determination is the process in which a stem cell commits to differentiation. The process of how a cell goes through determination is not well understood. Determination is important for proper regulation of cell turnover in tissue and for maintaining the adult stem cell population. Deregulation of determination or differentiation can lead to diseases such as several forms of cancer. In this study I will be using microarrays to identify candidate genes involved in determination by pulse induction of mouse erythroleukemia (MEL) cells with DMSO and looking at gene expression changes as the cells go through the early stages of erythropoiesis. The pulsed induction method I have developed to identify candidate genes is to induce cells for a short time (30 min, 2 hours, etc.) and then allow them to grow for the duration of their differentiation time (8 days). For reference, cells were also harvested at the time when the inducer is removed from the media. Results show high numbers of differentially expressed genes, including erythropoiesis-specific genes such as GATA1, globin genes and many novel candidate genes that have also been indicated as playing a role in the dynamic early signaling of erythropoiesis. In addition, several genes showed a pendulum effect when allowed to recover, making these interesting candidate genes for maintaining self-renewal of the adult stem cell population.
Castro-Mondragon, Jaime. "Development of bioinformatics methods for the analysis of large collections of transcription factor binding motifs : positional motif enrichment and motif clustering." Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0171.
Повний текст джерелаTranscription Factors (TFs) are DNA-binding proteins that control gene expression. TF binding motifs (TFBMs, simply called “motifs”) are usually represented as Position Specific Scoring Matrices (PSSMs), which can be visualized as sequence logos. The advent of high-throughput methods has allowed the detection of thousands of motifs which are usually stored in databases. In this work I developed two novel methods and implemented software tools to handle large collection of motifs in order to extract interpretable information from high-throughput data: (i) matrix-clustering regroups motifs by similarity and offers a dynamic interface; (2) position-scan detects TFBMs with positional preferences relative to a given reference location (e.g. ChIP-seq peaks, transcription start sites). The methods I developed have been evaluated based on control cases, and applied to extract meaningful information from different datasets from Drosophila melanogaster and Homo sapiens. The results show that these methods enable to analyse motifs in high-throughput datasets, and can be integrated in motif analysis workflows
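To make the PSSM representation mentioned in this abstract concrete, here is a minimal sketch that builds a log-odds matrix from a handful of aligned sites and scans a sequence for the best-scoring window. It is not the matrix-clustering or position-scan software described in the thesis. The toy sites, the uniform background of 0.25, the pseudocount, and the function names are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (one dict per position) from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pssm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores of one window."""
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pssm)
    hits = [
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


# Hypothetical aligned binding sites, invented for this example.
TOY_SITES = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]

if __name__ == "__main__":
    matrix = build_pssm(TOY_SITES)
    print(best_hit(matrix, "GGCGTATAATGCC"))
```

Recording the offset of each hit relative to a reference point, such as a peak summit, is the kind of information a positional enrichment analysis would then aggregate across many sequences.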
Rohde, Christian [Verfasser]. "Development of experimental and bioinformatics methods for high resolution DNA methylation analysis of gene promoters on human chromosome 21 / Christian Rohde." Bremen : IRC-Library, Information Resource Center der Jacobs University Bremen, 2009. http://d-nb.info/1034996371/34.
Повний текст джерелаAyllón-Benítez, Aarón. "Development of new computational methods for a synthetic gene set annotation." Thesis, Bordeaux, 2019. http://www.theses.fr/2019BORD0305.
The revolution in new sequencing technologies, by strongly improving the production of omics data, is leading to a new understanding of the relations between genotype and phenotype. To interpret and analyze data grouped according to a phenotype of interest, methods based on statistical enrichment became a standard in biology. However, these methods synthesize the biological information by a priori selecting the over-represented terms and focus on the most studied genes, which may represent a limited coverage of annotated genes within a gene set. During this thesis, we explored different methods for annotating gene sets. In this frame, we developed three studies allowing the annotation of gene sets and thus improving the understanding of their biological context. First, visualization approaches were applied to represent annotation results provided by enrichment analysis for a gene set or a repertoire of gene sets. In this work, a visualization prototype called MOTVIS (MOdular Term VISualization) has been developed to provide an interactive representation of a repertoire of gene sets combining two visual metaphors: a treemap view that provides an overview and also displays detailed information about gene sets, and an indented tree view that can be used to focus on the annotation terms of interest. MOTVIS has the advantage of overcoming the limitations of each visual metaphor when used individually. This illustrates the interest of using different visual metaphors to facilitate the comprehension of biological results by representing complex data. Secondly, to address the issues of enrichment analysis, a new method for analyzing the impact of using different semantic similarity measures on gene set annotation was proposed.
To evaluate the impact of each measure, two relevant criteria were considered for characterizing a "good" synthetic gene set annotation: (i) the number of annotation terms has to be drastically reduced while maintaining a sufficient level of detail, and (ii) the number of genes described by the selected terms should be as large as possible. Thus, nine semantic similarity measures were analyzed to identify the best possible compromise between both criteria while maintaining a sufficient level of detail. Using GO to annotate the gene sets, we observed better results with node-based measures that use the terms’ characteristics than with edge-based measures that use the relations between terms. The annotation of the gene sets achieved with the node-based measures did not exhibit major differences regardless of the characteristics of the terms used. Then, we developed GSAn (Gene Set Annotation), a novel gene set annotation web server that uses semantic similarity measures to synthesize a priori GO annotation terms. GSAn contains the interactive visualization MOTVIS, dedicated to visualizing the representative terms of gene set annotations. Compared to enrichment analysis tools, GSAn has shown excellent results in terms of maximizing the gene coverage while minimizing the number of terms. Lastly, the third study consisted of enriching the annotation results provided by GSAn. Since the knowledge described in GO may not be sufficient for interpreting gene sets, other biological information, such as pathways and diseases, may be useful to provide a wider biological context. Thus, two additional knowledge resources, Reactome and Disease Ontology (DO), were integrated within GSAn. In practice, GO terms were mapped to terms of Reactome and DO, before and after applying the GSAn method. The integration of these resources improved the results in terms of gene coverage without significantly affecting the number of involved terms.
Two strategies were applied to find mappings (generated or extracted from the web) between each new resource and GO. We have shown that a mapping process applied before computing the GSAn method yielded a larger number of inter-relations between the two knowledge resources.
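The abstract contrasts node-based semantic similarity measures with edge-based ones without naming the nine measures it compared. As a hedged illustration of the distinction only, the sketch below computes Resnik similarity (a well-known node-based measure built on information content) and a simple shortest-path distance (an edge-based measure) on a tiny invented ontology. The terms, gene labels, and function names are hypothetical and are not taken from GO or from GSAn.

```python
import math

# Hypothetical toy ontology: each term maps to its set of parent terms.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "sugar_metabolism": {"metabolism"},
}

# Hypothetical direct annotations: term -> genes annotated to it.
ANNOTATIONS = {
    "lipid_metabolism": {"g1", "g2"},
    "sugar_metabolism": {"g3"},
    "signaling": {"g4", "g5", "g6", "g7"},
    "metabolism": {"g8"},
}


def distances_to_ancestors(term):
    """Breadth-first search upward: edge distance from term to itself and each ancestor."""
    dist = {term: 0}
    queue = [term]
    while queue:
        current = queue.pop(0)
        for parent in PARENTS[current]:
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def information_content():
    """IC(t) = -log(fraction of genes annotated to t or any of its descendants)."""
    all_genes = set().union(*ANNOTATIONS.values())
    genes_under = {term: set() for term in PARENTS}
    for term, genes in ANNOTATIONS.items():
        for ancestor in distances_to_ancestors(term):
            genes_under[ancestor] |= genes
    # Terms with no genes at all are skipped; the toy data has none.
    return {t: -math.log(len(g) / len(all_genes)) for t, g in genes_under.items() if g}


def resnik(term_a, term_b, ic):
    """Node-based: IC of the most informative common ancestor."""
    common = distances_to_ancestors(term_a).keys() & distances_to_ancestors(term_b).keys()
    return max(ic[t] for t in common)


def edge_distance(term_a, term_b):
    """Edge-based: fewest edges between the two terms through a common ancestor."""
    dist_a = distances_to_ancestors(term_a)
    dist_b = distances_to_ancestors(term_b)
    return min(dist_a[t] + dist_b[t] for t in dist_a.keys() & dist_b.keys())


if __name__ == "__main__":
    ic = information_content()
    print(resnik("lipid_metabolism", "sugar_metabolism", ic))
    print(edge_distance("lipid_metabolism", "sugar_metabolism"))
```

The node-based score depends on how many genes are annotated under the shared ancestor, while the edge-based score depends only on the graph shape, which is one reason the two families can rank term pairs differently.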
Manser, Paul. "Methods for Integrative Analysis of Genomic Data." VCU Scholars Compass, 2014. http://scholarscompass.vcu.edu/etd/3638.
Gerst, Michelle Marie. "Improving methods to isolate bacteria producing antibacterial compounds followed by identification and characterization of select antimicrobials." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1512070391589857.
Stephens, Alex J. "The development of rapid genotyping methods for methicillin-resistant Staphylococcus aureus." Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/20172/1/Alexander_Stephens_Thesis.pdf.
Stephens, Alex J. "The development of rapid genotyping methods for methicillin-resistant Staphylococcus aureus." Queensland University of Technology, 2008. http://eprints.qut.edu.au/20172/.
Cui, Lingfei. "A Likelihood Method to Estimate/Detect Gene Flow and A Distance Method to Estimate Species Trees in the Presence of Gene Flow." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1406158261.
Chavan, Archana G. "Exploring the molecular architecture of proteins: Method developments in structure prediction and design." Thesis, University of the Pacific, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3609082.
Proteins are molecular machines of life in the truest sense. Being the expressors of genotype, proteins have been a focus in structural biology. Since the first characterization and structure determination of a protein molecule more than half a century ago, our understanding of protein structure has improved only incrementally. While computational analysis and experimental techniques have helped scientists view the structural features of proteins, our concepts about protein folding remain at the level of simple hydrophobic interactions packing side chains at the core of the protein. Furthermore, because the rate of genome sequencing is far more rapid than protein structure characterization, much more needs to be achieved in the field of structural biology. As a step in this direction, my dissertation research uses computational analysis and experimental techniques to elucidate the fine structural features of the tertiary packing in proteins. With this set of studies, the knowledge of the field of structural biology extends to the fine details of higher order protein structure.
Jiménez, Sánchez Alejandro. "Characterisation of the tumour microenvironment in ovarian cancer." Thesis, University of Cambridge, 2019. https://www.repository.cam.ac.uk/handle/1810/287935.
Rivas, Cruz Manuel A. "Medical relevance and functional consequences of protein truncating variants." Thesis, University of Oxford, 2015. http://ora.ox.ac.uk/objects/uuid:a042ca18-7b35-4a62-aef0-e3ba2e8795f7.
Sinclair, Lucas. "Molecular methods for microbial ecology : Developments, applications and results." Doctoral thesis, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-297613.
"Development of bioinformatics algorithms for trisomy 13 and 18 detection by next generation sequencing of maternal plasma DNA." 2011. http://library.cuhk.edu.hk/record=b5894869.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (p. 109-114).
Abstracts in English and Chinese.
ABSTRACT
摘要
ACKNOWLEDGEMENTS
PUBLICATIONS
CONTRIBUTORS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
SECTION I: BACKGROUND
CHAPTER 1: PRENATAL DIAGNOSIS OF FETAL TRISOMY BY NEXT GENERATION SEQUENCING TECHNOLOGY
1.1 FETAL TRISOMY
1.2 CONVENTIONAL PRENATAL DIAGNOSIS OF FETAL TRISOMIES
1.3 CELL FREE FETAL DNA AND ITS APPLICATION IN PRENATAL DIAGNOSIS
1.4 NEXT GENERATION SEQUENCING TECHNOLOGY
1.5 SUBSTANTIAL BIAS IN THE NEXT GENERATION SEQUENCING PLATFORM
1.6 PRENATAL DIAGNOSIS OF TRISOMY BY NEXT GENERATION SEQUENCING
1.7 AIMS OF THIS THESIS
SECTION II: MATERIALS AND METHODS
CHAPTER 2: METHODS FOR NONINVASIVE PRENATAL DIAGNOSIS OF FETAL TRISOMY MATERNAL PLASMA DNA SEQUENCING
2.1 STUDY DESIGN AND PARTICIPANTS
2.1.1 Ethics Statement
2.1.2 Study design, setting and participants
2.2 MATERNAL PLASMA DNA SEQUENCING
2.3 SEQUENCING DATA ANALYSIS
SECTION III: TRISOMY 13 AND 18 DETECTION BY THE T21 BIOINFORMATICS ANALYSIS PIPELINE
CHAPTER 3: THE T21 BIOINFORMATICS ANALYSIS PIPELINE FOR TRISOMY 13 AND 18 DETECTION
3.1 INTRODUCTION
3.2 METHODS
3.2.1 Bioinformatics analysis pipeline for trisomy 13 and 18 detection
3.3 RESULTS
3.3.1 Performance of the T21 bioinformatics analysis pipeline for trisomy 13 and 18 detection
3.3.2 The precision of quantifying chr13 and chr18
3.4 DISCUSSION
SECTION IV: IMPROVING THE T21 BIOINFORMATICS ANALYSIS PIPELINE FOR TRISOMY 13 AND 18 DETECTION
CHAPTER 4: IMPROVING THE ALIGNMENT
4.1 INTRODUCTION
4.2 METHODS
4.2.1 Allowing mismatches in the index sequences
4.2.2 Calculating the mappability of the human reference genome
4.2.3 Aligning reads to the non-repeat masked human reference genome
4.2.4 Trisomy 13 and 18 detection
4.3 RESULTS
4.3.1 Increasing read numbers by allowing mismatches in the index sequences
4.3.2 Increasing read numbers by using the non-masked reference genome for alignment
4.3.3 Allowing mismatches in the read alignment
4.3.4 The performance of trisomy 13 and 18 detection after improving the alignment
4.4 DISCUSSION
CHAPTER 5: REDUCING THE GC BIAS BY CORRECTION OF READ COUNTS
5.1 INTRODUCTION
5.2 METHODS
5.2.1 Read alignment
5.2.2 Calculating the correlation between GC content and read counts
5.2.3 GC correction in read counts
5.2.4 Trisomy 13 and 18 detection
5.3 RESULTS
5.3.1 GC bias in plasma DNA sequencing
5.3.2 Correcting the GC bias in read counts by linear regression
5.3.3 Correcting the GC bias in read counts by LOESS regression
5.3.4 Bin size
5.4 DISCUSSION
CHAPTER 6: REDUCING THE GC BIAS BY MODIFYING THE GENOMIC REPRESENTATION CALCULATION
6.1 INTRODUCTION
6.2 METHODS
6.2.1 Modifying the genomic representation calculation
6.2.2 Trisomy 13 and 18 detection
6.2.3 Combining GC correction and modified genomic representation
6.3 RESULTS
6.3.1 Reducing the GC bias by modifying genomic representation calculation
6.3.2 Combining GC correction and modified genomic representation
6.4 DISCUSSION
CHAPTER 7: IMPROVING THE STATISTICS FOR TRISOMY 13 AND 18 DETECTION
7.1 INTRODUCTION
7.2 METHODS
7.2.1 Comparing chr13 or chr18 with other chromosomes within the sample
7.2.2 Comparing chr13 or chr18 with the artificial chromosomes
7.3 RESULTS
7.3.1 Determining the trisomy 13 and 18 status by comparing chromosomes within the samples
7.3.2 Determining the trisomy 13 and 18 status by comparing chr13 or chr18 with artificial chromosomes
7.4 DISCUSSION
SECTION V: CONCLUDING REMARKS
CHAPTER 8: CONCLUSION AND FUTURE PERSPECTIVES
8.1 THE PERFORMANCE OF THE T21 BIOINFORMATICS ANALYSIS PIPELINE DEVELOPED FOR TRISOMY 21 DETECTION IS SUBOPTIMAL FOR TRISOMY 13 AND 18 DETECTION
8.2 THE ALIGNMENT COULD BE IMPROVED BY ALLOWING ONE MISMATCH IN THE INDEX AND USING THE NON-REPEAT MASKED HUMAN REFERENCE GENOME AS THE ALIGNMENT REFERENCE
8.3 THE PRECISION OF QUANTIFYING CHR13 AND CHR18 COULD BE IMPROVED BY THE GC CORRECTION OR THE MODIFIED GENOMIC REPRESENTATION
8.4 THE STATISTICS FOR TRISOMY 13 AND 18 DETECTION COULD BE IMPROVED BY COMPARING CHR13 OR CHR18 WITH ARTIFICIAL CHROMOSOMES WITHIN THE SAMPLE
8.5 PROSPECTS FOR FUTURE WORK
REFERENCE
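Chapter 5 of the contents above refers to correcting GC bias in read counts by linear regression. The catalog record gives no formula, so the following is only a minimal sketch of one common form of such a correction: fit a straight line of per-bin read count against per-bin GC fraction, subtract the fitted trend, and add back the mean count. The function names `fit_line` and `gc_correct` and the toy numbers are hypothetical, and the thesis may well use a different formulation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gc_correct(gc_fractions, read_counts):
    """Remove the fitted linear GC trend from per-bin counts, keeping the overall mean."""
    slope, intercept = fit_line(gc_fractions, read_counts)
    mean_count = sum(read_counts) / len(read_counts)
    return [
        count - (slope * gc + intercept) + mean_count
        for gc, count in zip(gc_fractions, read_counts)
    ]


if __name__ == "__main__":
    # Invented bins whose counts rise with GC content.
    gc = [0.30, 0.40, 0.50, 0.60]
    counts = [90.0, 100.0, 110.0, 120.0]
    print(gc_correct(gc, counts))
```

Because least-squares residuals sum to zero, this correction leaves the total read count unchanged while flattening the dependence on GC content. A LOESS-based variant, as named in section 5.3.3, would replace the straight line with a locally fitted curve.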
Rohde, Christian. "Development of experimental and bioinformatics methods for high resolution DNA methylation analysis of gene promoters on human chromosome 21 /." 2009. http://www.jacobs-university.de/phd/files/1254838231.pdf.
Reed, Eric R. "Development of advanced methods for large-scale transcriptomic profiling and application to screening of metabolism disrupting compounds." Thesis, 2020. https://hdl.handle.net/2144/41943.
Venkatraman, Anand. "Validation of a novel expressed sequence tag (EST) clustering method and development of a phylogenetic annotation pipeline for livestock gene families." Thesis, 2008. http://hdl.handle.net/1969.1/ETD-TAMU-3112.
Mainz, Indra [Verfasser]. "Development and implementation of techniques for ontology engineering and an ontology-based search for bioinformatics tools and methods / vorgelegt von Indra Mainz." 2008. http://d-nb.info/99269776X/34.
Krishnadev, O. "Development And Applications Of Computational Methods To Aid Recognition Of Protein Functions And Interactions." Thesis, 2010. http://etd.iisc.ernet.in/handle/2005/1457.
Bhar, Anirban. "Application of A Novel Triclustering Method in Analyzing Three Dimensional Transcriptomics Data." Doctoral thesis, 2015. http://hdl.handle.net/11858/00-1735-0000-0022-602C-1.
Liu, Yu. "A phylogenomics approach to resolving fungal evolution, and phylogenetic method development." Thèse, 2009. http://hdl.handle.net/1866/5096.
Despite the popularity of fungi as eukaryotic model systems, several questions on their phylogenetic relationships continue to be controversial. These include the classification of zygomycetes that are potentially paraphyletic, i.e. a combination of several not directly related fungal lineages. The phylogenetic position of Schizosaccharomyces species has also been controversial: do they belong to Taphrinomycotina (previously known as archiascomycetes) as predicted by analyses with nuclear genes, or are they instead related to Saccharomycotina (budding yeast) as in mitochondrial phylogenies? Another question concerns the precise phylogenetic position of nucleariids, a group of amoeboid eukaryotes that are believed to be close relatives of Fungi. Previously conducted multi-gene analyses have been inconclusive, because of limited taxon sampling and the use of only six nuclear genes. We have addressed these issues by assembling phylogenomic nuclear and mitochondrial datasets for phylogenetic inference and statistical testing. According to our results zygomycetes appear to be paraphyletic (Chapter 2), but the phylogenetic signal in the available mitochondrial dataset is insufficient for resolving their branching order with statistical confidence. In Chapter 3 we show with a large nuclear dataset (more than 100 proteins) and conclusive support that Schizosaccharomyces species are part of Taphrinomycotina. We further demonstrate that the conflicting grouping of Schizosaccharomyces with budding yeasts, obtained with mitochondrial sequences, results from a phylogenetic error known as long-branch attraction (LBA, a common artifact that leads to the regrouping of species with high evolutionary rates irrespective of their true phylogenetic positions). In Chapter 4, using again a large nuclear dataset we demonstrate with significant statistical support that nucleariids are the closest known relatives of Fungi.
We also confirm paraphyly of traditional zygomycetes as previously suggested, with significant support, but without placing all members of this group with confidence. Our results question aspects of a recent taxonomical reclassification of zygomycetes and their chytridiomycete neighbors (a group of zoospore-producing Fungi). Overcoming or minimizing phylogenetic artifacts such as LBA has been among our most recurring questions. We have therefore developed a new method (Chapter 5) that identifies and eliminates sequence sites with highly uneven evolutionary rates (highly heterotachous sites, or HH sites) that are known to contribute significantly to LBA. Our method is based on a likelihood ratio test (LRT). Two previously published datasets are used to demonstrate that gradual removal of HH sites in fast-evolving species (suspected for LBA) significantly increases the support for the expected ‘true’ topology, in a more effective way than comparable, published methods of sequence site removal. Yet in general, data manipulation prior to analysis is far from ideal. Future development should aim at integration of HH site identification and weighting into the phylogenetic inference process itself.
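The abstract states that the site-removal method is based on a likelihood ratio test (LRT) but gives no details of the models compared. As a generic illustration only, the sketch below computes the LRT statistic from two log-likelihoods and its p-value under a chi-square distribution with one degree of freedom. The single degree of freedom, the function name `lrt_pvalue_df1`, and the example log-likelihood values are assumptions for this example, not details from the thesis.

```python
import math


def lrt_pvalue_df1(loglik_null, loglik_alt):
    """Likelihood ratio test for nested models differing by one free parameter.

    Returns (statistic, p_value), where statistic = 2 * (loglik_alt - loglik_null)
    and the p-value is the chi-square (1 d.f.) upper tail, erfc(sqrt(statistic / 2)).
    """
    statistic = max(2.0 * (loglik_alt - loglik_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Invented log-likelihoods for a single alignment site under two models.
    stat, p = lrt_pvalue_df1(loglik_null=-104.2, loglik_alt=-100.9)
    print(stat, p)
```

In a site-filtering setting, a test like this could be run per site, with sites whose p-values fall below a chosen threshold flagged for removal. The threshold and the underlying rate models would have to come from the method itself.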