Theses on the topic "Short read and long read sequencing"
Consult the top 22 theses for your research on the topic "Short read and long read sequencing".
Soundiramourtty, Abirami. "Exploring the transpositional landscape and recent transposable element activity in beech trees using long read mobilome and genome sequencing and with new computational tools". Electronic Thesis or Diss., Perpignan, 2024. http://www.theses.fr/2024PERP0043.
The adaptation of organisms to environmental changes has become a fundamental research question, particularly in the context of climate change. A key area of this research is to identify underlying genetic elements, such as transposable elements (TEs), that contribute to this process. TEs are repetitive DNA sequences found across all eukaryotes, possessing the unique ability to move within the genome, a phenomenon known as active transposition. They can cause mutations by generating transposable element insertion polymorphisms (TIPs) between individuals, and even somatic insertions. Generally, TEs are kept inactive by epigenetic mechanisms that limit their uncontrolled proliferation. They can nevertheless be reactivated by various environmental stimuli, although active transposition remains relatively rare. TE mobility can be detected using extrachromosomal circular DNA (eccDNA) as a marker of transposition. The transpositional landscape of TEs and their recent activity have been documented in model organisms but remain underexplored in perennial species such as trees. This study aims to investigate recent transpositional activity and ongoing mobility of TEs in non-model perennial species, using European beech (Fagus sylvatica) as our model. We sought to study recent TE activity and their continuous mobility by identifying TE-induced variants within a population and in an individual (at the somatic scale) using whole-genome sequencing (WGS) and mobilome sequencing (eccDNA). We conducted WGS and mobilome sequencing of trees from the Verzy forest, known for its dwarf and tortuous beeches, also referred to as "mutants." These trees exhibit unstable phenotypic traits, with some trees developing new normal branches.
We identified two TEs belonging to the Miniature Inverted Repeat Transposable Elements (MITEs) type, named SQUIRREL1 and SQUIRREL2, which are actively mobilizing in these trees, producing large amounts of eccDNA and even causing somatic variations. SQUIRREL1 and SQUIRREL2 are also active in beech trees from the Massane forest. Furthermore, in all these trees, several other TEs, mainly MITEs, produce significant amounts of eccDNA, although their activity levels appear to vary depending on the tissue, suggesting that TE activity could be tissue-specific and indicating MITE-dominated transposition in beech. Simultaneously, we investigated TIPs in a population of beech trees from the Massane forest, an ancient forest classified as a UNESCO World Heritage site. By sequencing 150 trees, we aimed to understand how TEs contribute to the genetic diversity of the entire population by detecting TIPs generated by Long Terminal Repeat retrotransposons (LTR-RTs) and MITEs using WGS. We detected approximately 30,000 LTR-RT TIPs in each individual, compared to 70,000 MITE TIPs. While most of these TIPs remain at low frequency, many MITE TIPs are located near functional genes and are more conserved within the population. Using these TIPs, we identified several hotspots of variation and conserved regions along the beech genome, providing insights into genome structure in this species. In conclusion, our study highlights the importance of TEs in shaping the genomic landscape of trees, particularly in understanding how these elements contribute to the evolution of long-lived species. Future research could expand this work to other tree species and explore whether the patterns observed in beeches are common in other types of trees.
Whiteford, Nava. "String matching in DNA sequences : implications for short read sequencing and repeat visualisation". Thesis, University of Southampton, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.438668.
Chacon, de San Baldomero Alejandro. "Read mapping on heterogeneous systems: scalability strategies for bioinformatic primitives". Doctoral thesis, Universitat Autònoma de Barcelona, 2021. http://hdl.handle.net/10803/671736.
Genomic sequencing is a key component of new advances in medicine, and its democratization is an important step in improving accessibility for the patient. The benefits of discovering new genomic variations are vast and range from early cancer detection to personalized medicine, drug design and genome editing. All of these potential uses have greatly increased the interest of the scientific community in the field of bioinformatics in recent years. Moreover, the emergence of next-generation sequencing methods has contributed to the rapid reduction of sequencing costs, enabling new applications of genomics in precision medicine. The main goal of this thesis is to improve the state of the art in performance and accuracy for genome sequencing through the use of heterogeneous computing platforms and hybrid hardware systems. More specifically, the work focuses on accelerating short-read mapping, which is described as one of the most computationally expensive stages of the pipeline. Overall, we aim to reduce the processing time and cost of genome sequencing, thereby increasing the availability of this type of analysis. The main contribution of this thesis is the full GPU integration of the GEM3 mapper (GEM3-GPU), reporting significant improvements in performance and competitive accuracy results. The mapper reports the same output files for CPU and GPU and is one of the first GPU mappers to allow the alignment of very long and variable-length reads. The proposals have been validated using real data, since the mapper has been running in production at a genomic sequencing center (Centro Nacional de Análisis Genómico (CNAG)). Together with the GEM3-GPU mapper, a complete bioinformatics CUDA library (GEM-cutter) has been created. The library provides the basic building blocks for genomic applications, which are highly optimised to run on GPUs.
GEM-cutter offers an API based on send and receive primitives (message passing) and incorporates a scheduler to balance the work. Furthermore, the library supports all GPU architectures and multi-GPU execution.
Universitat Autònoma de Barcelona. Programa de Doctorat en Informàtica
Targon, Robin. "A novel method for the production of long DNA sequences from short reads". Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3424278.
The advent of next-generation sequencing (NGS) has profoundly changed our approach to the study of the genome and of gene expression: over the last ten years, an incredible amount of data and experimental evidence has been produced on the complexity of the transcriptome and on the interactions between specific proteins and DNA or RNA molecules, paving the way for exciting discoveries and technological applications. Unfortunately, the short length of the sequences produced by second-generation sequencers limits the potential of this technology. Specifically, several interesting applications, such as the analysis of alternative splicing and RNA editing, de novo genome assembly, haplotype characterization and the identification of genomic structural variation, would certainly benefit from a technology able to produce long, high-quality sequences. The study I carried out during my PhD aimed to produce long, high-quality sequences using current second-generation sequencers. The main motivation behind this study was the desire to characterize the different transcript isoforms at the nucleotide sequence level, in order to test the hypothesis of a functional relationship between the use of specific transcription start sites and the alternative splicing of exons. A further motivation was the possibility of obtaining the sequence of long DNA fragments in order to facilitate genome assembly. Since the length of the sequences produced by second-generation sequencers cannot be changed, I developed a strategy that yields long nucleotide sequences through the precise assembly of short sequences derived from a single molecule. This strategy is based on the concept of molecular barcoding.
A barcode is a short DNA fragment of known nucleotide sequence that is added to all the molecules of a specific sample. This makes it possible to sequence several samples simultaneously and to assign each sequence to its sample of origin simply by reading the associated barcode. In my project, the purpose and nature of the barcodes are different: the barcodes used have a random sequence, so that every single molecule in the sample can be tagged with a unique sequence. The presence of a unique barcode allows the sequences produced to be assigned to their molecule of origin and, therefore, to be assembled correctly. A considerable part of this work was devoted to developing genetic engineering strategies for building mate-pair libraries in which one part of the sequence consists of the barcode, while the other part represents a random portion of the original DNA or RNA molecule. Every step of the protocol was optimized to make the method simpler and more robust. Several sequencing runs were performed to evaluate the efficiency of the method; although the analysis of these runs was constrained by low sequencing coverage, we showed that mate-pair sequences sharing the same barcode align, as expected, to the same genomic position. The results obtained, although preliminary, show that the method works. Although some steps of the protocol require further optimization, the method will shortly be used to produce long genomic sequences by increasing the sequencing coverage. In the near future, a few minor modifications to the protocol will extend its use to transcriptome analysis.
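The core idea of the molecular barcoding strategy in the abstract above, assigning each short sequence to its molecule of origin through a shared random barcode, can be illustrated with a minimal Python sketch. This is not the thesis's pipeline: the function name `group_reads_by_barcode`, the representation of a mate pair as a `(barcode, insert)` tuple, and the example sequences are all assumptions made for illustration, and real data would also need barcode error tolerance and a proper assembly step.

```python
from collections import defaultdict


def group_reads_by_barcode(pairs):
    """Group insert sequences by the barcode read from their mate.

    `pairs` is an iterable of (barcode, insert) tuples; this layout is a
    hypothetical simplification of a barcoded mate-pair library.
    """
    groups = defaultdict(list)
    for barcode, insert in pairs:
        # Reads sharing a barcode are assumed to come from the same molecule.
        groups[barcode].append(insert)
    return dict(groups)


# Invented example data: two molecules, three mate pairs.
pairs = [
    ("ACGTTGCA", "GATTACAGAT"),
    ("TTGACCGA", "CCGTAAGCTT"),
    ("ACGTTGCA", "ACAGATTTGC"),
]
groups = group_reads_by_barcode(pairs)
for barcode, inserts in groups.items():
    print(barcode, inserts)
```

Each resulting group would then be assembled independently, which is what keeps reads from different molecules from being merged by mistake.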
Long, Evan Michael. "Genomic Structural Variation Across Five Continental Populations of Drosophila melanogaster". BYU ScholarsArchive, 2018. https://scholarsarchive.byu.edu/etd/7335.
Fuente, Lorente Lorena de la. "Development of a bioinformatics approach for the functional analysis of alternative splicing". Doctoral thesis, Universitat Politècnica de València, 2019. http://hdl.handle.net/10251/124974.
One of the most exciting aspects of transcriptome biology is the contextual adaptability of eukaryotic transcriptomes and proteomes by post-transcriptional regulation (PTR). PTR mechanisms such as alternative splicing (AS) and alternative polyadenylation (APA) have emerged as tightly regulated processes playing a key role in generating transcriptome complexity and coordinating cell differentiation or tissue development. However, how these mechanisms imprint distinct functional characteristics on the resulting set of isoforms to define the observed phenotype remains poorly understood. The number of PTR variants and their resulting range of potentially functional consequences makes their functional validation an impractical task if done on a case-by-case basis. Moreover, because isoform-oriented functional profiling approaches are lacking, much of the computational work done to elucidate transcriptome-wide functional questions has either involved ad hoc computational pipelines applied to specific biological systems or has relied on simple GO-enrichment analyses that are not informative about the impact of PTR on isoform properties. Thus, despite more than 60,000 publications on AS, few existing isoforms have been associated with specific properties, while the number of novel AS/APA variants with unknown and even unexplored functions is increasing exponentially thanks to the use of next-generation sequencing (NGS). Because of the technical limitations of NGS in reconstructing transcript structure, high-throughput sequencing of full-length transcripts using third-generation technologies (TGS) is opening up a new transcriptomics era that enhances the definition of gene models and, for the first time, makes it possible to precisely associate functional events within the RNA molecule. This thesis addresses three major challenges to the progression of the study of isoform function.
First, with the emergence and increasing popularity of TGS, the accurate definition and comprehensive characterisation of de novo transcriptomes is essential to ensure the quality of any conclusions on transcriptome diversity drawn from these data. The lack of long-read-oriented, quality-aware analysis motivated the development of SQANTI (https://bitbucket.org/ConesaLab/sqanti), an automated pipeline for the structural characterization and quality assessment of full-length transcriptomes. Secondly, the gene-centric nature of functional resources remained the major limitation to the extended study of functional isoform variability, especially for novel isoforms, which cannot be characterised by static databases. Thus, we designed IsoAnnot, which dynamically constructs a rich, isoform-resolved database of functional annotations by taking transcript sequences as input and integrating information disseminated across several databases and prediction methods. Finally, because no methods to interrogate the functional impact of PTR were available, we developed novel approaches and user-friendly tools such as tappAS (http://tappas.org/), designed to facilitate the transcriptome-wide functional study of context-specific isoform regulation. This thesis thus describes the development of an analysis framework that tackles the fundamental challenges of isoform functional analysis by providing a set of novel methods and tools that offer a unique opportunity to explore how the phenotype is specified by altering the functional characteristics of expressed isoforms.
Applied to a murine neural differentiation system, our pipeline profiled the effect of isoform regulation on the inclusion of several functional elements within transcripts between motor-neuron and oligodendrocyte differentiation systems. Specifically, we discovered isoform-specific transmembrane regions whose modulation by PTR might contribute to the control of cell type-specific mitochondrial dynamics during neural fate determination.
This work was funded by the following grants: From 2014 to 2018. FPU: Training programme for Academic Staff. Spanish Ministry of Education, FPU2013/02348. From 2016 to 2019. NOVELSEQ: Novel methods for new challenges in the analysis of high-throughput sequencing data. MINECO, BIO2015-1658-R. From 2014 to 2017. DEANN: Developing a European American NGS Network. EU Marie Curie IRSES, GA-612583.
Fuente Lorente, LDL. (2019). Development of a bioinformatics approach for the functional analysis of alternative splicing [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/124974
Vogel, Alexander. "Long-read sequencing for de novo genome assembly in bioeconomic context". Advisors: Björn Usadel, Ingo Kurth, Ulrich Schaffrath. Aachen: Universitätsbibliothek der RWTH Aachen, 2020. http://d-nb.info/123506946X/34.
Zhang, Panpan. "Étude du paysage des éléments transposables sous forme d'ADN circulaire extrachromosomique et dans l'assemblage des génomes de plantes à l'aide du séquençage en lectures longues". Thesis, Université de Montpellier (2022-….), 2022. http://www.theses.fr/2022UMONG016.
Transposable elements (TEs) are repetitive DNA sequences with the intrinsic ability to move and amplify in genomes. Active transposition of TEs is linked to the formation of extrachromosomal circular DNA (eccDNA). However, the complete landscape of this eccDNA compartment and its interactions with the genome were not well defined. In addition, at the beginning of my thesis, there were no bioinformatics tools available to identify eccDNAs from long-read sequencing data. To address these questions during my PhD, we first developed a tool, called ecc_finder, to automate eccDNA detection from long-read sequencing and optimized detection from short-read sequences to characterize TE mobility. By applying ecc_finder to Arabidopsis, human and wheat eccDNA-seq data (with genome sizes ranging from 120 Mb to 17 Gb), we documented the broad applicability of ecc_finder as well as its optimization of computational time, sensitivity and accuracy. In the second project, we developed a meta-assembly tool called SASAR to reconcile the results of different genome assemblies from long-read sequencing data. For different plant species, SASAR obtained high-quality genome assemblies in an efficient time and resolved structural variations caused by TEs. In the last project, we used the SASAR-assembled genome and ecc_finder-detected eccDNA to characterize eccDNA-genome interactions. In Arabidopsis hypomethylated epigenetic mutants, we highlighted the role of the epigenome in protecting genome stability not only from TE mobility but also from genomic rearrangements and gene chimerism. Overall, our findings on eccDNA, genome assembly and their interactions, as well as the development of tools, offer new insights into the role of TEs in the adaptive evolution of plants to rapid environmental change.
Jaudou, Sandra. "Metadetect : detection of Shiga toxin-producing Escherichia coli with novel metagenomics approaches and its application on dairy farms in France and Germany". Electronic Thesis or Diss., Maisons-Alfort, École nationale vétérinaire d'Alfort, 2023. http://www.theses.fr/2023ENVA0004.
Current methodologies for the characterization of Shiga toxin-producing Escherichia coli (STEC) require strain isolation, which is complicated by the fact that there is no specific isolation medium that clearly distinguishes STECs from non-pathogenic commensal E. coli. Obtaining strain information using a metagenomics approach would therefore avoid the need to isolate a strain in order to fully characterize it. In the framework of the project, in collaboration with the BfR in Germany, we will evaluate whether new, long-read metagenomics approaches could unambiguously determine whether specific markers of typical EHECs (Enterohemorrhagic E. coli) are co-located in the same strain. Third-generation hybrid sequencing approaches will be evaluated. Appropriate bioinformatic pipelines developed in collaboration with the BfR will be evaluated for analyzing the metagenomic results. These methods will be applied in a pilot study to examine the microbiota of raw milk from French and German dairy farms and to tentatively identify a common STEC-associated microbiome. We aim to define a 'molecular score'-based system to identify the status of the farms, in line with the objective of refining the notion of a 'STEC molecular risk assessment approach' at the farm level.
Šalanda, Vojtěch. "Optimalizace zarovnání dat z next-generation sekvenování". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236077.
Herzel, Lydia. "Co-transcriptional splicing in two yeasts". Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-179274.
Kuderna, Lukas 1989. "Application of genome assembly methods to human and non-human primate genomics". Doctoral thesis, Universitat Pompeu Fabra, 2020. http://hdl.handle.net/10803/668648.
Genomic analyses are at the core of contemporary biology. These studies depend heavily on the assembly of reference genomes, although these are generally highly fragmented. Having accurate representations of complex genomes, or parts of them, is crucial for studying disease and evolution in humans and primates. In the following studies, we develop and apply new sequencing strategies and technologies to improve reference assemblies. First, we explore the potential of combining different data sets to generate a substantially improved reference for the chimpanzee, a crucial species for the study of human origins. We are able to close 77% of the more than 159,000 gaps present in the previous iteration of the assembly for this species, and to increase contiguity by more than 750%. Next, we develop a protocol to assemble the first human Y chromosome of African ancestry, using native chromosomes isolated by flow cytometry and sequenced on a Nanopore device. In this way, we assemble the Y chromosome to reference quality and with unprecedented sequence resolution in structurally complex regions. These results open new avenues for comparative studies involving the chimpanzee genome or human Y chromosomes.
FORMENTI, GIULIO PAOLO. "THIRD-GENERATION SEQUENCING AND ASSEMBLY OF THE BARN SWALLOW GENOME AND A STUDY ON THE EVOLUTION OF THE HUNTINGTIN GENE". Doctoral thesis, Università degli Studi di Milano, 2019. http://hdl.handle.net/2434/611650.
Fruchard, Cécile. "Étude des chromosomes sexuels et du déterminisme du sexe chez les plantes : comparaison des systèmes Silene et Coccinia". Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1108/document.
Although rarer than in animals, separate sexes (dioecy) have evolved in ∼15,600 angiosperm species (∼6% of all angiosperm species). How sex is controlled is a central question in plant sciences and also in agronomy, as many crops are dioecious (∼20% of crops) with only one useful sex (usually female). Only three master sex-determining genes have been identified in dioecious plants so far, namely in persimmons, asparagus and strawberry. Dioecy likely evolved several times independently in angiosperms, suggesting that sex-determining genes are of diverse origins. Hermaphroditism is the predicted ancestral state of the angiosperm flower. Two main pathways have been identified that explain the evolution from hermaphroditism towards dioecy: either through a monoecious state (with both unisexual male and female flowers on the same individual) or a gynodioecious state (with females and individuals having hermaphroditic flowers). My aim is to compare two plant systems, each representing one of these two pathways. In Coccinia grandis, a Cucurbitaceae with an XY chromosome system, dioecy evolved through monoecy. In Silene latifolia, a well-studied dioecious plant with XY sex chromosomes, dioecy evolved through gynodioecy. Three genes controlling monoecy have been identified in melon, and it was suggested that these genes act as sex-determining genes in closely related dioecious species such as C. grandis. I therefore chose a candidate gene approach in this species. Very few genetic and genomic data are available for C. grandis, and we chose to use SEX-DETector, a probabilistic method that uses RNA-seq data to genotype parents and their offspring and infers sex-linked genes with no need for a reference genome. This method allowed me to identify 1,364 genes that are present on the sex chromosomes of C. grandis.
I found that the sex chromosomes are enriched in sex-biased genes when compared to autosomes, and I characterized Y chromosome degeneration in terms of decreased expression and gene loss. Finally, I showed that dosage compensation occurs in C. grandis. Testing of the three candidate genes is ongoing. In S. latifolia, three regions involved in sex determination have already been identified on the Y chromosome. We chose to sequence this chromosome to identify sex-determining genes. The sequencing of Y chromosomes remains one of the greatest challenges of current genomics. The assembly step is very difficult because of their highly repetitive content. Consequently, fully sequenced Y chromosomes are rare and mainly available for research in animals. To overcome the difficulty of assembling reads with many repeats, I used third-generation sequencing (TGS, producing long reads). I produced a dataset using the Oxford Nanopore MinION sequencer with Y chromosome DNA. Assembly was performed using a combination of Illumina, MinION and PacBio sequencing data. The final assembly had a total length of 563 Mb with a scaffold N50 of 6,114 bp, and contained 16,219 de novo annotated genes.
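The scaffold N50 reported in the abstract above is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The following Python sketch shows one common way to compute it. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the thesis.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one scaffold length")


# Invented toy assembly: total length 54, so half is 27.
# 10 + 9 + 8 = 27 reaches the halfway point, giving an N50 of 8.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

A low N50 relative to the total assembly length, as in the figures quoted above, indicates that the assembly is split into many short scaffolds.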
Lehmann, Nathalie. "Development of bioinformatics tools for single-cell transcriptomics applied to the search for signatures of symmetric versus asymmetric division mode in neural progenitors". Electronic Thesis or Diss., Université Paris sciences et lettres, 2021. http://www.theses.fr/2021UPSLE070.
In recent years, single-cell RNA-seq (scRNA-seq) has fostered the characterization of cell heterogeneity at remarkably high resolution. Despite its democratization, the analysis of scRNA-seq data remains a challenge, particularly for organisms whose genomic annotations are partial. During my PhD, I observed that the chick genomic annotations are often incomplete, resulting in the loss of a large number of sequencing reads. I investigated how an enriched annotation affects the biological results and conclusions drawn from these analyses. We developed a novel approach based on the re-annotation of the genome with scRNA-seq data and long-read bulk RNA-seq. This computational biology project capitalises on a tight collaboration with the experimental team of Xavier Morin (IBENS). The main biological focus is the search for signatures of symmetric versus asymmetric division mode in neural progenitors. In order to identify the key transcriptional switches that occur during the neurogenic transition, I have implemented bioanalysis approaches dedicated to the search for gene signatures from scRNA-seq data.
Hsieh, Yi-Te and 謝憶得. "Long Read Error Correction by Short-Read Alignment Using FM-Index". Thesis, 2015. http://ndltd.ncl.edu.tw/handle/k3c65a.
國立中正大學
資訊工程研究所
103
Third-generation sequencing can generate multi-kilobase sequences and has the potential to improve genome assembly. Nevertheless, it has higher error rates than second-generation sequencing, which has limited its use for improving assembly. In this thesis, we introduce a hybrid correction algorithm that corrects third-generation sequencing reads by finding overlapping high-quality short reads. We improved the efficiency of a previous method that aligns short reads onto long PacBio reads using an FM-index. The results indicate that the accuracy of corrected PacBio reads exceeds 93%, memory consumption is lower, and running time is shorter than with the previous method.
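The abstract does not describe the alignment procedure in detail, so the following is only a minimal sketch of the textbook FM-index backward search that such methods build on. It indexes a toy long read and looks up exact occurrences of a short read; the function names, the choice to index the long read, and the sequences are all assumptions for illustration. A practical corrector would also need to tolerate mismatches and indels and would use compressed data structures rather than the plain tables shown here.

```python
def build_fm_index(text):
    """Build a naive FM-index (suffix array, C table, Occ table) for text."""
    text += "$"  # unique terminator, sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: character preceding each sorted suffix
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # c_table[ch]: number of characters in text strictly smaller than ch
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i]: number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return suffix_array, c_table, occ


def find_exact(pattern, suffix_array, c_table, occ):
    """Return sorted start positions of exact matches via backward search."""
    lo, hi = 0, len(suffix_array)  # half-open interval of matching suffixes
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


# Toy example: one "long read" and one "short read" (hypothetical data)
long_read = "ACGTACGTTAGC"
index = build_fm_index(long_read)
print(find_exact("ACGT", *index))
```

The search cost depends on the length of the short read rather than on the length of the indexed text, which is the property that makes FM-indexes attractive for mapping many short reads.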
Tsai, Cheng-Wei and 蔡承洧. "Scaffolding Pre-Assembled Contigs Using Long-Read Sequencing". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/97232567550873046854.
國立中正大學
資訊工程研究所
101
In recent years, third-generation sequencing platforms, which can sequence a single DNA molecule in real time and generate longer reads, have been applied to improve genome assembly. Unfortunately, these long reads often have higher error rates than reads from previous sequencing technologies, and most of the errors are indels. The high error rates greatly reduce the usability of long reads for improving genome assembly. In this thesis, we design and implement a program, called SACLR, for scaffolding pre-assembled contigs using long reads generated by the Pacific Biosciences platform. Given a set of pre-assembled contigs and long reads, SACLR determines the mapped boundaries of contigs using a novel clustering alignment approach that tolerates the various errors of the platform. The linkage between contigs across multiple long reads is established and integrated to further increase scaffold length. Notably, the gaps within our scaffolds can be filled directly, and the two ends of each scaffold may be further extended by long reads. SACLR has been tested on a variety of real data sets. The experimental results showed that SACLR produced more contiguous and accurate sequences.
Bi, Chongwei. "Long Read Based Individual Molecule Sequencing and Real-time Pathogen Detection". Diss., 2021. http://hdl.handle.net/10754/672109.
TANG, YU-YU and 湯玉宇. "Hybrid error correction for long-read sequencing using adaptive seeding strategies". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/qztabn.
國立中正大學
資訊工程研究所
107
Next-generation sequencing (NGS) and third-generation sequencing (TGS) technologies are popular choices in de novo assembly projects. NGS achieves the highest sequencing accuracy, but the assembled genomes are often highly fragmented because of repeats longer than the short reads. On the other hand, long reads generated by TGS can span larger repeats and thus assemble a complete genome. To date, the acceptance of TGS is still limited by its high error rate and cost. Previously, we combined the advantages of both NGS and TGS by developing a hybrid correction strategy called PBHC. PBHC corrects error-prone long reads with highly accurate short reads using both alignment-based and alignment-free methods. However, the assembly contiguity of PBHC drops significantly on large-genome data sets. This thesis investigated the root cause of poorly assembled regions in large genomes and found that reads are largely uncorrected within repetitive regions. Further investigation revealed that the seeds within repeat regions are mostly error-prone. We devised an adaptive seeding strategy to improve seed accuracy. Each long read is partitioned into repeat and unique regions, and a different seeding strategy is applied to each to identify seeds. The experimental results indicated that the new seeding algorithm improved genome contiguity under lower sequencing coverage.
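The abstract does not say how a long read is split into repeat and unique regions. One plausible way, shown here purely as an assumption and not as the thesis's algorithm, is to count k-mers in the accurate short reads and label each k-mer position of the long read as "repeat" when its count exceeds a threshold. The names `count_kmers`, `partition_read` and `repeat_threshold`, and the toy sequences, are hypothetical.

```python
from collections import Counter
from itertools import groupby


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of short reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def partition_read(long_read, kmer_counts, k, repeat_threshold):
    """Split a long read into runs of 'repeat' and 'unique' k-mer positions.

    Returns (start, end, label) tuples, where start/end are half-open
    indices over k-mer start positions in the long read.
    """
    labels = [
        "repeat" if kmer_counts[long_read[i:i + k]] > repeat_threshold else "unique"
        for i in range(len(long_read) - k + 1)
    ]
    regions, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        regions.append((start, start + length, label))
        start += length
    return regions


# Toy data (hypothetical): "AC"/"CA" are over-represented in the short reads
short_reads = ["ACACAC", "ACACAC", "GGTTCA"]
counts = count_kmers(short_reads, k=2)
print(partition_read("ACACGGTT", counts, k=2, repeat_threshold=2))
```

A downstream step could then choose, for example, longer seeds inside the regions labelled "repeat" and shorter seeds elsewhere. Note that a k-mer containing a sequencing error will usually have a count of zero and therefore be labelled "unique" in this sketch, so a real implementation would need some smoothing across neighbouring positions.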
Bachmann, J. A., Andrew Tedder, B. Laenen, K. A. Steige and T. Slotte. "Targeted long-read sequencing of a locus under long-term balancing selection in Capsella". 2018. http://hdl.handle.net/10454/17277.
Rapid advances in short-read DNA sequencing technologies have revolutionized population genomic studies, but there are genomic regions where this technology reaches its limits. Limitations mostly arise due to the difficulties in assembly or alignment to genomic regions of high sequence divergence and high repeat content, which are typical characteristics for loci under strong long-term balancing selection. Studying genetic diversity at such loci therefore remains challenging. Here, we investigate the feasibility and error rates associated with targeted long-read sequencing of a locus under balancing selection. For this purpose, we generated bacterial artificial chromosomes (BACs) containing the Brassicaceae S-locus, a region under strong negative frequency-dependent selection which has previously proven difficult to assemble in its entirety using short reads. We sequence S-locus BACs with single-molecule long-read sequencing technology and conduct de novo assembly of these S-locus haplotypes. By comparing repeated assemblies resulting from independent long-read sequencing runs on the same BAC clone we do not detect any structural errors, suggesting that reliable assemblies are generated, but we estimate an indel error rate of 5.7 × 10⁻⁵. A similar error rate was estimated based on comparison of Illumina short-read sequences and BAC assemblies. Our results show that, until de novo assembly of multiple individuals using long-read sequencing becomes feasible, targeted long-read sequencing of loci under balancing selection is a viable option with low error rates for single nucleotide polymorphisms or structural variation. We further find that short-read sequencing is a valuable complement, allowing correction of the relatively high rate of indel errors that result from this approach.
This study was supported by a grant from the Swedish Research Council to T.S.
Natarajan, Santhi. "Accelerated and Accurate Alignment of Short Reads in High Throughput Next Generation Sequencing [NGS] Platforms". Thesis, 2016. http://etd.iisc.ac.in/handle/2005/4073.
Bachmann, J. A., Andrew Tedder, B. Laenen, M. Fracassetti, A. Désamoré, C. Lafon-Placette, K. A. Steige et al. "Genetic basis and timing of a major mating system shift in Capsella". 2019. http://hdl.handle.net/10454/17270.
A crucial step in the transition from outcrossing to self-fertilization is the loss of genetic self-incompatibility (SI). In the Brassicaceae, SI involves the interaction of female and male specificity components, encoded by the genes SRK and SCR at the self-incompatibility locus (S-locus). Theory predicts that S-linked mutations, and especially dominant mutations in SCR, are likely to contribute to loss of SI. However, few studies have investigated the contribution of dominant mutations to loss of SI in wild plant species. Here, we investigate the genetic basis of loss of SI in the self-fertilizing crucifer species Capsella orientalis, by combining genetic mapping, long-read sequencing of complete S-haplotypes, gene expression analyses and controlled crosses. We show that loss of SI in C. orientalis occurred <2.6 Mya and maps as a dominant trait to the S-locus. We identify a fixed frameshift deletion in the male specificity gene SCR and confirm loss of male SI specificity. We further identify an S-linked small RNA that is predicted to cause dominance of self-compatibility. Our results agree with predictions on the contribution of dominant S-linked mutations to loss of SI, and thus provide new insights into the molecular basis of mating system transitions.
Work at Uppsala Genome Center is funded by RFI / VR and Science for Life Laboratory, Sweden. The SNP&SEQ Platform is supported by the Swedish Research Council and the Knut and Alice Wallenberg Foundation. V.C. acknowledges support by a grant from the European Research Council (NOVEL project, grant #648321). The authors thank the French Ministère de l’Enseignement Supérieur et de la Recherche, the Hauts de France Region and the European Funds for Regional Economical Development for their financial support to this project. This work was supported by a grant from the Swedish Research Council (grant #D0432001) and by a grant from the Science for Life Laboratory, Swedish Biodiversity Program to T.S. The Swedish Biodiversity Program is supported by the Knut and Alice Wallenberg Foundation.