Dissertations / Theses on the topic 'Séquençage à haut débit (NGS)'
Consult the top 50 dissertations / theses for your research on the topic 'Séquençage à haut débit (NGS).'
Becmeur-Lefebvre, Mathilde. "Identification de nouveaux genes responsables d'anomalies du développement par séquençage haut débit d'exome." Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCK080.
Multiple congenital anomalies (MCA) are often genetic conditions with a risk of recurrence. Establishing the etiologic diagnosis of these conditions in fetuses is essential for genetic counseling in future pregnancies. With current diagnostic tests (fetal autopsy, cytogenetic testing and targeted molecular tests), the diagnostic rate in MCA fetuses is about 30%, allowing genetic counseling in only one third of families. Exome sequencing (ES) has made it possible to identify the molecular basis of many new syndromes. We aimed to assess the contribution of a solo ES strategy to identifying new developmental genes in fetuses presenting with MCA and no etiological diagnosis after standard investigations, using an original multistep strategy. We performed solo ES in 95 MCA fetuses from 10 prenatal diagnostic centers in France. First, we focused on OMIM disease-related genes, with a first step using bioinformatic scores and public databases independently of phenotype, a second step using genotype-phenotype correlation, and a third step of research analysis extended to the whole exome. Variant confirmation and parental segregation were done by Sanger sequencing. ES identified causative variants in 23 fetuses (24%), variants of unknown significance (VUS) in 7 fetuses (7%) and variants in new candidate genes in 6 fetuses (6%). Among causative variants, 50% showed autosomal recessive inheritance, 42% were sporadic and 4% showed autosomal dominant inheritance. The additional strategy identified 17/23 causative variants, including 2 new causative variants not identified by the classical approach because of an atypical or extreme fetal phenotype, and 2 new VUS.
No new candidate gene was identified by this strategy. To conclude, solo ES with the classical and additional strategies has low efficiency for identifying new genes involved in embryonic development, but it allows the clinical spectrum of well-known pediatric pathologies to be extended to the prenatal period. Trio ES or genome sequencing would now be interesting strategies to explore.
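The percentages quoted in the abstract above can be re-derived from the reported counts. The following is only a minimal arithmetic sketch in Python; the names `COUNTS` and `yield_pct` are hypothetical and do not come from the thesis.

```python
# Counts as reported in the abstract: 95 fetuses analysed in total.
TOTAL_FETUSES = 95
COUNTS = {
    "causative": 23,       # fetuses with a causative variant
    "vus": 7,              # fetuses with a variant of unknown significance
    "candidate_gene": 6,   # fetuses with a variant in a new candidate gene
}


def yield_pct(count: int, total: int = TOTAL_FETUSES) -> float:
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total


# Percentage of the cohort in each category (hypothetical helper output).
percentages = {label: yield_pct(n) for label, n in COUNTS.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct:.1f}%")
```

Rounded to whole numbers, these values match the 24%, 7% and 6% given in the abstract.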
Lacoste, Deixonne Caroline. "Apport du séquençage haut débit dans l'amélioration de la prise en charge des maladies monogéniques." Thesis, Aix-Marseille, 2016. http://www.theses.fr/2016AIXM5062/document.
The spread of Next Generation Sequencing (NGS) technologies is bringing about an important change that modifies the indications for molecular diagnostics and prompts laboratories to rethink their diagnostic strategies, which until now have been based on routine Sanger sequencing. Several high-throughput approaches are available, from the sequencing of a gene panel to a whole exome or even a whole genome. In all cases, a tremendous amount of data is generated, which has to be filtered, interpreted and analyzed using powerful bioinformatics tools. In part 1, existing strategies and the difficulties and challenges of high-throughput sequencing for the molecular diagnosis of genetic diseases are discussed. In part 2, the setup and technical validation of this diagnostic approach in the Molecular Genetics Laboratory of the Timone Hospital in Marseille are presented and illustrated by 3 examples of complex diagnoses solved thanks to NGS. NGS promises to significantly shorten the time needed for analysis and reporting of results, and to expand the number of genes tested. It also promises to increase the proportion of positive diagnoses. Finally, NGS can identify new variants and new genes involved in human pathology, and will thus improve patient clinical care overall.
Bisseux, Maxime. "Dynamique de la circulation des Entérovirus de l'homme à l'environnement : Etude par séquençage haut débit." Thesis, Université Clermont Auvergne (2017-2020), 2017. http://www.theses.fr/2017CLFAS013.
Enteroviruses (EVs) are picornaviruses (non-enveloped, positive-sense RNA viruses) characterized by large genetic and antigenic diversity (116 types classified within 4 taxonomic species, EV-A to D) and rapid evolution. Human infections are frequent, are highly contagious through stools and occur as outbreaks. The infections are mainly asymptomatic or benign, but severe or fatal cases can occur in young children. Poliomyelitis is the model EV infection. Combined with clinical and virological surveillance, mass vaccination is closer than ever to achieving the goal of the WHO Global Polio Eradication Initiative. However, the detection of wild-type polioviruses in polio-free countries and the recent worldwide emergence of non-polio enteroviruses (EV-A71, EV-D68) associated with severe clinical manifestations underscore the importance of surveilling EV circulation in the general population. The aim of the PhD thesis was the detection and identification of EV strains in wastewater treated in the sewage treatment plant at Clermont-Ferrand (France). The viral data were compared with those reported through clinical surveillance to obtain a comprehensive picture of viral circulation in the local population. A method was developed to concentrate viruses from raw and treated wastewater, and molecular assays were used to detect EVs and 6 other human enteric viruses. Viral genomes were detected in all samples from October 2014 to October 2015, with a median of 6 and 4 different viruses in raw and treated wastewater, respectively. Phylogenetic analysis of viral sequences (EV, hepatitis A and E viruses) determined in wastewater and reported in patients during the sampling period showed the efficiency of the method for surveilling enteric viruses in the community. The EV diversity in raw wastewater was analyzed by sequencing amplicons with Illumina high-throughput technology (metabarcoding).
The analysis revealed a large viral diversity and the silent circulation of 25 types not detected from hospital data (in particular 9 EV-C types, including sequences of vaccine poliovirus 1). The phylogenetic analyses of intra-typic variants showed different epidemic patterns in the predominant EV types circulating over the study period. The data demonstrate the feasibility and sensitivity of the strategy developed for the detection and characterization of EVs in wastewater and open the prospect of implementing environmental surveillance of non-polio EV infections for epidemiological studies, epidemic prevention and health alerts. Combining the surveillance of enteric viruses in the environment and in the clinical setting allows a better understanding of their prevalence. This global approach to virus circulation and ecological health represents an important investment for laboratories, which will require integration into national and international collaboration networks beyond the scope of enterovirus surveillance.
Croville, Guillaume. "Séquençage et PCR à haut débit : application à la détection et la caractérisation d'agents pathogènes respiratoires aviaires et au contrôle de pureté microbiologique des vaccins." Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLEP028/document.
Detection of pathogens is an increasing challenge, since infectious diseases represent major risks for both human and animal health. Globalization of trade and travel, the evolution of farming practices, global climate change and mass migrations are affecting the biology of pathogens and their potential to emerge. This manuscript describes three approaches, based on three innovative molecular biology technologies applied to the detection of pathogens in three different settings: (i) detection of a list of pathogens using real-time quantitative PCR on a microfluidic platform, (ii) unbiased detection of pathogens in complex matrices, using metagenomics and Illumina (MiSeq) sequencing, and (iii) genotyping of pathogens without isolation or PCR enrichment, using a third-generation NGS (Next Generation Sequencing) platform, the MinION from Oxford Nanopore Technologies. The three studies showed the contribution of these techniques, each of which has distinctive features suited to its respective application. Beyond the application of these techniques to microbial diagnostics, their use for the control of veterinary immunological drugs is a priority of this project. Veterinary vaccines are subject not only to mandatory testing for the absence of listed pathogens, but also to validation of the genetic identity of vaccine strains. The rapidly growing availability and performance of new PCR and sequencing technologies open up cutting-edge perspectives in the field of microbial diagnostics and control.
Mansour-Hendili, Lamisse. "Mise en place d’une stratégie de validation fonctionnelle de variations de signification incertaine dans les pathologies constitutionnelles du globule rouge." Electronic Thesis or Diss., Paris 12, 2022. http://www.theses.fr/2022PA120057.
The deployment of next generation sequencing (NGS) over the past ten years in hospital genetics laboratories in France and around the world has revolutionized the management of rare diseases, including constitutional hemolytic anemia (CHA). It has led to a multiplication of variations of uncertain significance (VUS), which require functional tests to allow their reclassification. The objective of this work is to propose a realistic and effective strategy for the functional exploration of VUS associated with CHA. This approach is based on family genetic studies, the study of transcripts from Paxgene tubes, the on-site development of methods such as the LORRCA MaxSis for studying RBC membrane properties, the improvement of techniques such as RBC density measurement by phthalate gradient, and the establishment of a collaborative network (for example, with the CNRS in Roscoff for electrophysiological studies). We have shown the value of NGS in these patients with suspected CHA and have highlighted associations of variations of interest in different RBC pathology genes in the same patient (Mansour-Hendili et al 2020). We identified a new pathological entity in two patients with “autoimmune direct antiglobulin test negative” hemolytic anemia who did not respond to immunomodulators. It is a mechanism of acquired spherocytosis caused by a point mutation of the ANK1 gene, probably due to clonal hematopoiesis in the elderly (submission in progress). Whole genome sequencing led to a diagnosis for a child suffering from unexplained transfusion-dependent hemolysis with neurodevelopmental delay due to the VPS4A gene (Lunati-Rozie et al 2021). Additional explorations were carried out via a patient recall system. Twenty-five patients underwent a transcript study, allowing the reclassification of sixteen variations. Ten family studies were carried out, one of which excluded the deleterious nature of a VUS in the GATA1 gene.
We have shown the value of measuring RBC density as a screening tool for RBC membrane diseases. Its use as a functional test in cases of associated variations in RBC membrane genes has highlighted the usefulness of the dense cell rate as a differential marker of the presence or absence of the associated variation. With the LORRCA, osmoscan profiles make it possible to discriminate patients with associations of variations from “positive” controls without an association. Stability studies conducted for these phenotypic tests at different storage times and temperatures show the importance of pre-analytical conditions. We illustrated this problem with the known KCNN4 gene mutation p.R352H, which has been described with normal osmoscan and ektacytometry profiles. In two independent samples and experiments performed on day 0 without storage, we found abnormal osmoscan profiles both times. In addition, we show the value of studying the electrophysiological properties of the PIEZO1 and KCNN4 channels, carried out in Roscoff, for the classification of VUS (the case of one patient with a new KCNN4 mutation and thrombosis, Mansour-Hendili et al 2021). For associations of variations of interest, the profiles are more complex to interpret but also show differences compared with well-chosen controls. This work has demonstrated the usefulness, in addition to family and transcript studies, of RBC phenotypic diagnostic or monitoring tools (LORRCA, RBC density) to help with the functional validation of isolated or associated VUS in CHA patients. This requires means of recalling patients, adequate positive controls (intrafamilial cases) and compliance with pre-analytical conditions. The establishment of collaborative networks also brings real usefulness and reciprocal intellectual and human added value. Returning to the phenotype is an essential recourse for the classification of VUS, in particular for CHA.
Piorkowski, Geraldine. "Étude des quasi-espèces du virus Ebola en réponse au traitement par favipiravir dans un modèle de primate non-humain par séquençage haut débit." Thesis, Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0216.
Ebola virus disease (EVD) is a major public health issue due to the lack of an antiviral treatment or candidate vaccine with market authorisation. The scale of the recent outbreaks (2014-2016 and 2018) has highlighted the urgent need to develop efficient treatments. The first part of this thesis concerns the implementation of a non-human primate model (Mauritian cynomolgus macaques) of Ebola virus (EBOV-Gabon 2001 strain) infection. Following intramuscular administration of EBOV, vital parameters and viral genomic evolution (consensus mutations and viral quasispecies) were observed over the disease course. The results demonstrated that the evolution of EVD in this model is closer to that in humans than in previously described models (clinical and biological parameters deteriorate later, and death occurs later). Lethality is 100%. Viral variability is low and the infectious dose has a limited impact on the disease course. The second part highlights the antiviral efficacy of different favipiravir (T-705) doses (100, 150 and 180 mg/kg) administered intravenously in this model. Clinical and biological parameters and viral variability were evaluated during the disease course. Administration of the highest favipiravir dose (180 mg/kg) was associated with the survival of 60% of the monkeys. Next generation sequencing of viral quasispecies over the disease course has given some insights into the proposed mechanism of action of favipiravir. The number of viral quasispecies was higher by a factor of five in treated monkeys than in negative controls. Favipiravir is a GTP analogue that inhibits the viral polymerase and induces C to T and G to A mutations, leading to an error catastrophe mechanism.
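The C to T and G to A mutation signature mentioned in the abstract above is the kind of pattern that can be tallied by comparing aligned sequences position by position. The sketch below is purely illustrative: the function name `substitution_spectrum` and the two short toy sequences are assumptions made for this example and are not Ebola virus data or the method used in the thesis.

```python
from collections import Counter


def substitution_spectrum(reference: str, variant: str) -> Counter:
    """Count each substitution type (e.g. 'C>T') between two aligned sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return Counter(
        f"{ref_base}>{var_base}"
        for ref_base, var_base in zip(reference, variant)
        if ref_base != var_base
    )


# Hypothetical toy sequences, chosen only to show the tally.
spectrum = substitution_spectrum("GACCGTGA", "AACTGTAA")
for change, count in sorted(spectrum.items()):
    print(change, count)
```

In a real analysis the comparison would be made per read or per variant call against a consensus genome; the toy example only shows the counting step.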
Robitaille, Alexis. "Detection and identification of papillomavirus sequences in NGS data of human DNA samples : a bioinformatic approach." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1358.
Human papillomaviruses (HPV) are a family of small double-stranded DNA viruses that have a tropism for the mucosal and cutaneous epithelia. More than 200 types of HPV have been discovered so far and are classified into several genera based on their DNA sequence. Due to the role of some HPV types in human disease, ranging from benign anogenital warts to cancer, methods to detect and characterize HPV populations in DNA samples have been developed. These detection methods are needed to clarify the implications of HPV at the various stages of the disease. The detection of HPV by targeted wet-lab approaches has traditionally used PCR-based methods coupled with cloning and Sanger sequencing. With the introduction of next generation sequencing (NGS), these approaches can be improved by integrating the sequencing power of NGS. While computational tools have been developed for metagenomic approaches to search for known or novel viruses in NGS data, no appropriate bioinformatic tool has been available for the classification and identification of novel viral sequences from data produced by amplicon-based methods. In this thesis, we initially describe five fully reconstructed novel HPV genomes detected in skin samples after amplification using degenerate L1 primers. Then, in the second part, we present PVAmpliconFinder, a data analysis workflow designed to rapidly identify and classify known and potentially new Papillomaviridae sequences from NGS amplicon sequencing with degenerate PV primers. This thesis describes the features of PVAmpliconFinder and presents several applications using biological data obtained from amplicon sequencing of human specimens, leading to the identification of new HPV types.
Jourdain, Anne-Sophie. "Déterminisme moléculaire du développement des membres : apport des nouvelles technologies d’étude du génome." Thesis, Lille 2, 2019. http://www.theses.fr/2019LIL2S037.
Limb development is a complex process whose mechanism is today only partially known. Abnormalities of embryological development of genetic origin are rare entities. Such abnormalities can be unique or multiple, isolated or syndromic, sporadic or familial. The study of large cohorts of patients with malformations of the limb extremities is an excellent tool for identifying the genes or regulatory elements involved in their pathology and, consequently, in the development of the limb. In most cases, the genetic event involved is a point mutation in a gene encoding a transcription factor or in a regulatory sequence. However, copy number variations are also involved. Today, new genome analysis technologies, from high-throughput sequencing of a targeted gene panel to whole exome or genome sequencing, can allow the identification of these new targets. Thanks to these technological advances, we decided to study the molecular determinism of limb development. To do so, we analyzed a very large cohort of 684 patients, all with a limb malformation, using different gene panels of different sizes, as well as whole exome analysis and pangenomic array CGH. In the first part, the results of this work allowed us to establish a gene panel that is suitable for a molecular analysis laboratory and for bioinformatic analysis at an optimized cost, and that can identify both SNVs and CNVs in a single analysis. In the second part, we identified 5 genes, not yet described in human pathology, that seem to have a role in limb development. For one of these genes, a promising functional analysis has started.
Rudewicz, Justine. "Méthodes bioinformatiques pour l'analyse de données de séquençage dans le contexte du cancer." Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0635/document.
Cancer results from the excessive proliferation of cells descending from the same founder cell and following a Darwinian process of diversification and selection. This process is defined by the accumulation of genetic and epigenetic alterations whose characterization is a key element for establishing a therapy that would specifically target tumor cells. The advent of new high-throughput sequencing technologies enables this characterization at the molecular level. This technological revolution has led to the development of numerous bioinformatics methods. In this thesis, we are particularly interested in the development of new computational methods for the analysis of sequencing data from tumor samples, allowing precise identification of tumor-specific alterations and an accurate description of tumor subpopulations. In the first chapter, we explore methods for identifying single nucleotide alterations in targeted sequencing data and apply them to a cohort of breast cancer patients. We introduce two new methods of analysis, each tailored to a particular sequencing technology, namely Roche 454 and Pacific Biosciences. In the first case, we adapted existing approaches to the particular case of transcript sequencing. In the second case, when using conventional approaches, we were confronted with a high background noise resulting in a high rate of false positives. We have developed a new method, MICADo, based on De Bruijn graphs, which makes it possible to distinguish effectively between patient-specific alterations and alterations common to the cohort, and so makes the results usable in a clinical context. The second chapter deals with the identification of copy number alterations. We describe the approach put in place for their efficient identification from very low coverage data.
The main contribution of this work is the development of a statistical analysis strategy to highlight local and global changes in the genome that occurred during the treatment administered to patients with breast cancer. Our method is based on the construction of a linear model to establish scores of differences between samples before and after treatment. In the third chapter, we focus on the problem of clonal reconstruction. This problem has recently attracted a lot of interest, but it still lacks a well-established formal framework. We first propose a formalization of the clonal reconstruction problem. Then we use this formalism to put in place a method based on Gaussian mixture models. Our method uses single nucleotide and copy number alterations, such as those discussed in the previous two chapters, to characterize and quantify the different clonal populations present in a tumor sample.
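The abstract above mentions De Bruijn graphs as the basis of MICADo. The following Python sketch only illustrates the general idea of building such a graph from reads and comparing two graphs; it is not MICADo's implementation, and the function names, the choice of k = 4 and the toy reads are all assumptions made for this example.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: its prefix -> its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def edges(graph):
    """Flatten the adjacency mapping into a set of (source, target) edges."""
    return {(src, dst) for src, dsts in graph.items() for dst in dsts}


# Hypothetical toy reads: the sample differs from the reference by one base.
reference = build_de_bruijn(["ACGTACGA"], k=4)
sample = build_de_bruijn(["ACGTTCGA"], k=4)

# Edges present only in the sample point to a candidate alteration.
novel_edges = edges(sample) - edges(reference)
print(sorted(novel_edges))
```

Comparing edge sets in this way is one simple means of separating sequence shared with a reference from sequence specific to a sample; a practical tool would also need to handle sequencing errors and coverage.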
Nemoz, Benjamin. "Exploration longitudinale à haut débit et en cellule unique du répertoire d'anticorps neutralisants à large spectre chez un neutraliseur d'élite du VIH-1." Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALV012.
Human Immunodeficiency Virus type 1 (HIV-1) infection remains a major global health concern, with an estimated 37.7 million people living with the virus worldwide and more than a million new infections each year. Efficient antiretroviral therapies are available, providing sustained relief for infected individuals. These therapeutics have also contributed to better prevention and helped curb the epidemic, notably in high-income countries. However, a vaccine is still highly awaited for controlling this epidemic, especially in lower-income regions and precarious settings. The protective role of neutralizing antibodies (NAbs) has been unequivocally demonstrated both in animal models of HIV infection and in humans. Consequently, the development of a B-cell-based vaccine capable of eliciting antibodies (Abs) able to neutralize the majority of circulating viruses, namely broadly neutralizing antibodies (bNAbs), could be an answer to the HIV pandemic. The investigation of bNAb development in HIV-1 elite neutralizers provides valuable insights to inform the design of such vaccines. To date, most studies have relied on conventional single B-cell FACS sorting to isolate bNAbs. In the present study, we used the Chromium Single Cell Immune Profiling approach to conduct a high-throughput longitudinal single-cell exploration of the B-cell repertoire in an HIV-1 elite neutralizer. Importantly, this novel method enables the use of a much greater number of HIV envelope glycoprotein (Env) baits than regular FACS-based Ab isolation studies, providing a more comprehensive view of the anti-Env Ab repertoire. In addition, this approach yields a wealth of information on the nature of the specific Abs identified and the corresponding B-cells. The study uncovered the sequences of 12,130 putative HIV Env-specific Abs.
Antibodies from 39 lineages were produced and tested for neutralization, revealing 21 distinct neutralizing lineages. The results thus demonstrated the ability of the method to explore large antigen-specific Ab repertoires from longitudinal samples. The neutralizing activity of Abs from four neutralizing lineages together recapitulated the serum activity of the donor, achieving neutralization against 62.4% of a large predictive panel of 126 pseudoviruses. One of these neutralizing Ab lineages was shown to target the gp120 high-mannose patch supersite with great breadth and potency; Abs from this lineage were sensitive to the presence of a glycan at position N332. A single one of those Abs achieved most of the neutralization breadth (51.1%) with high potency (mean IC50 of 91.1 ng/mL). This Ab exhibited a 23 AA-long CDRH3 and 20% somatic hypermutation (SHM). The lineage showed continuous evolution over 6.5 years of maturation, with observed SHM rates ranging from 2.0% to 30.6% for the heavy chain, without any insertions or deletions. Conventional FACS-based sorting was previously used to isolate bNAbs from the same donor. In comparison, the single-cell high-throughput approach made it possible to isolate orders of magnitude more Abs. Furthermore, the newly isolated NAbs were overall more potent and broader than those isolated previously, indicating the superiority of the novel method in recovering neutralizing lineages. Ongoing structural studies will elucidate the epitopes responsible for the broad neutralization observed in this donor. Together, the findings may help the design of reverse vaccine approaches, which show promise in the development of an effective AIDS vaccine.
Curk, Franck. "Organisation du complexe d’espèce et décryptage des structures des génomes en mosaïque interspécifiques chez les agrumes cultivés." Thesis, Montpellier 2, 2014. http://www.theses.fr/2014MON20223/document.
Citrus, the most important fruit crop in the world, shows wide phenotypic diversity. Previous studies (molecular markers) identified four ancestral taxa (Citrus reticulata Blanco, mandarins; C. maxima (Burm.) Merr., pummelos; C. medica L., citrons; C. micrantha Wester, papedas) as the ancestors of all cultivated Citrus after reticulate evolution. As a result, modern citrus varieties have complex and highly heterozygous genotypic structures, generally fixed by apomixis, and formed by a mosaic of large chromosomal fragments of different phylogenetic origins. Furthermore, the structure of the phenotypic variability suggests that the initial differentiation of the basic taxa is the main source of most of the useful phenotypic diversity in citrus. A thorough knowledge of the origin of cultivated citrus and of their phylogenomic structure is essential for the management of biological resources and the optimization of breeding programs. This thesis explores different approaches for analyzing genome diversity in order to identify the phylogenetic origins of the various horticultural citrus groups and to decipher their phylogenomic genome structures. We focused on limes and lemons. This thesis takes advantage of the rapid evolution of NGS and proposes a rational use of the available tools, based on research questions. Roche 454 parallel sequencing of amplicons provides multi-locus haplotype information on 500-base fragments. It was used to decipher the interspecific mosaic structure of chromosome 2 for fifty varieties and to identify SNP markers diagnostic of the ancestral taxa. Genotyping all limes and lemons of the Inra/Cirad and Ivia germplasm collections with these markers, in association with SSR and indel markers, made it possible to propose new hypotheses on the origins of limes and lemons.
Data from Illumina whole genome re-sequencing of 7 varieties of limes and lemons, compared with those of representatives of the ancestral taxa, made it possible to infer the interspecific structure of their genomes and to map out, for the first time, their phylogenomic karyotypes. The different approaches led to similar conclusions. Our results confirm previous hypotheses about the evolutionary steps at the origin of sour orange (C. aurantium), sweet orange (C. sinensis) and grapefruit (C. paradisi), involving the C. maxima and C. reticulata gene pools. They highlight frequent introgressions of C. maxima in the genome of mandarin varieties, even though these were considered representative of C. reticulata. We were also able to quantify the relative proportions of these two ancestral taxa in the genome of many varieties of small citrus fruit (mandarin hybrids, tangors and tangelos). Our work on limes and lemons demonstrates that C. medica is the male parent of this varietal group at the diploid level. Two groups of lemons are clearly differentiated: one from direct hybridizations between C. reticulata and C. medica, and one from crosses between hybrids (C. maxima × C. reticulata) and C. medica. Sour orange seems to be the female parent of 'Eureka' type lemons (C. limon). The 'Mexican' limes (C. aurantifolia) seem to come from a direct hybridization C. micrantha × C. medica. Finally, large-fruited triploid limes have two major origins. The 'Tahiti' type probably results from an ovule of an 'Eureka' type lemon (C. limon) fertilized by a diploid gamete of a 'Mexican' type lime (C. aurantifolia), while the other type would come from a back-cross between C. aurantifolia (diploid gamete) and C. medica. These new insights into the genomic structure of secondary species make it possible to consider reconstructing these ideotypes from ancestral taxa germplasm.
They also open new avenues for association genetics studies based on the phylogenomics of genes involved in the development of quality, resistance and adaptation traits. Finally, the taxon-specific diagnostic markers developed here will find many applications in the characterization of collections and in further genetic studies.
Chiarello, Marlène. "Biodiversité du microbiome cutané des organismes marins : variabilité, déterminants et importance dans l’écosystème." Thesis, Montpellier, 2017. http://www.theses.fr/2017MONTT092/document.
Oceans contain thousands of microbial species that play crucial roles in the functioning of the marine ecosystem. These microorganisms are present everywhere in the water column. Some microorganisms also colonize the surface and the digestive tract of marine macro-organisms, forming communities called microbiomes. These microbiomes have positive effects on their host's fitness. The diversity of these marine animal surface microbiomes is still largely understudied, despite recent progress in molecular biology that now makes it possible to fully assess the different facets of their biodiversity, i.e. taxonomic, phylogenetic and functional. The goal of this thesis is therefore to describe the diversity of the surface microbiome of marine animals, to assess its variability at different levels, as well as its determinants, and the significance of such diversity at the scale of the ecosystem. Firstly, I assessed the efficiency of various diversity indices in detecting ecological signals in the specific case of microbial communities. Secondly, I described the surface microbiome of major marine animal clades (teleost fishes, cetaceans and several classes of invertebrates). I found that these microbiomes are highly distinct from the surrounding planktonic communities. I demonstrated that these microbiomes vary both between individuals of the same species and between species, but do not show a phylosymbiosis pattern. Lastly, I assessed the contribution of surface microbiomes to the global microbial community at the scale of a coral reef ecosystem. I demonstrated that marine animal surfaces host almost twenty times more microbial species than the water column, and 75% of the phylogenetic richness present in the ecosystem. In a context of massive erosion of marine macroscopic organisms, it is therefore urgent to exhaustively assess marine microbial biodiversity and its vulnerability to anthropogenic pressures.
Mandon, Perrine. "Origines et évolution de lignées hydrothermales." Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUS467.
The originality of the hydrothermal vent fauna led to the classification of some organisms under new high taxonomic ranks. However, previous molecular studies reassigned them to known lineages, leading to major reductions in such ranking. Classically in phylogenetic studies, optimizing both taxonomic sampling and molecular markers is challenging. This Ph.D. project illustrates this limitation, but still provides breakthroughs in the understanding of the origin and evolution of three hydrothermal taxa. In Polynoidae worms, the multigenic approach, conducted on a large taxonomic and ecological sampling, indicates at least two colonization events of hydrothermal vents. However, the limited resolution of these markers for deep nodes prevented a clear understanding of such events. A similar limitation was previously encountered for the Alvinocarididae shrimp and Bythograeidae crab families in their respective infra-orders (Caridea and Brachyura). Here, two approaches aiming to search for and identify markers were tested on these groups. The first one, based on the sequencing of the mitochondrial genome (easily generalizable), resolves deep nodes in Brachyura, and places the available Bythograeidae species near the Xanthidae. The second, based on transcriptome sequencing, allows the identification of molecular markers conserved enough to resolve inter-familial relationships in Caridea. Although this approach is less generalizable, the identified markers could be targeted a posteriori on a wide taxonomic scale using marker-specific probes.
Lucasson, Aude. "Caractérisation et diversité des mécanismes du syndrome de mortalité affectant les juvéniles de Crassostrea gigas." Thesis, Montpellier, 2018. http://www.theses.fr/2018MONTG076/document.
Infectious diseases are very often explored using reductionist approaches, despite repeated evidence showing them to be strongly influenced by numerous interacting host and environmental factors. Many diseases with complex etiology therefore remain misunderstood. In this thesis, by developing a holistic approach to tackle the complexity of the interaction, (i) we deciphered the complex intra-host interactions underlying the Pacific oyster mortality syndrome affecting juveniles of Crassostrea gigas, the main oyster species exploited worldwide, and (ii) we validated this mechanism in different infectious environments and oyster genotypes. Using ecologically realistic experimental infections combined with thorough molecular (metabarcoding, transcriptomics, pathogen monitoring) and histological analyses on oyster families with contrasting susceptibilities, we demonstrated that the disease is caused by a multiple infection whose initial and necessary step is the infection of oyster haemocytes by a herpesvirus. Viral replication leads to an immune-compromised state of the host, evolving toward subsequent bacteremia by opportunistic bacteria. By identifying critical intra-host interactions between microorganisms and host immunity, this study cracks the code of the Pacific oyster mortality syndrome and provides important molecular data for the design of prophylactic measures and breeding programs dedicated to the production of oysters resistant to the mortality syndrome. We believe that such a systems biology approach could be applied to decipher other multi-factorial diseases that affect non-model invertebrate species worldwide.
Redin, Claire. "NGS-based approaches for the diagnosis of intellectual disability and other genetically heterogeneous developmental disorders." Thesis, Strasbourg, 2014. http://www.theses.fr/2014STRAJ129/document.
Some monogenic disorders are characterized by vast genetic heterogeneity. In individuals with a similar clinical phenotype, causative mutations can be found in any one gene from a set described as implicated in the disease. Such genetic heterogeneity considerably limits the diagnostic options available to patients, and a majority are left without a molecular diagnosis. We developed an alternative diagnostic approach based on targeted high-throughput sequencing (specific to the coding regions of genes of interest, using an exon capture technique) and applied it to three genetically heterogeneous disorders: Bardet-Biedl syndrome (19 genes reported), leukodystrophies (50 genes), and intellectual disability (>400 genes). In light of its efficiency, this approach has since been implemented in routine diagnostics for Bardet-Biedl syndrome and intellectual disability (diagnostic yields of 80% and 25%, respectively, significantly higher than those of previous methods). Beyond diagnosis, this approach provides an unbiased means to assess the contribution of each gene to the disease, highlight recurrent genes, and establish new genotype-phenotype correlations, overall providing much insight into the genetics of a particular disease.
Debladis, Emilie. "Etude de l'activité transpositionnelle en condition de stress chez le riz, Oryza sativa." Thesis, Perpignan, 2016. http://www.theses.fr/2016PERP0026/document.
Transposable elements (TEs) are ubiquitous among eukaryotic genomes and are sometimes predominant in plants. Due to their ability to replicate and transpose, they are potentially mutagenic and are recognized as actors of genome evolution. However, the analysis of the transpositional activity of TEs in different plant species has shown that most of them are maintained in a transcriptionally inactive state through powerful and specific epigenetic mechanisms. These silencing processes can nevertheless be alleviated under stress conditions, leading to TE reactivation. Are these stresses sufficient to activate transposition in natural populations? Is repeated heat stress able to trigger transposition and therefore lead to bursts of transposition? In recent reports, reactivation of retrotransposons has been shown in Arabidopsis thaliana mutants impaired in the RdDM pathway (RNA-directed DNA Methylation) and subjected to heat stress. My PhD work reports the study of a wild rice and a new rice mutant, affected in the RdDM pathway, cultivated under optimal or heat stress conditions over generations. Here, we propose to determine (1) the impact of the mutation at the different levels leading to retrotranspositional activation and (2) the retrotranspositional activity in response to heat stress. An important part of this work has been devoted to the development and comparison of different methods to identify TE movements, and different -omics approaches have been used. The reactivation of 5 new TEs in mutants suggests that the RdDM pathway is involved in the control of the repression of these TEs. Furthermore, our results confirm that not all TEs are regulated through the same pathways but are under the control of different locks.
Delhomme, Tiffany. "Using the systematic nature of errors in NGS data to efficiently detect mutations : computational methods and application to early cancer detection." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1098/document.
Comprehensive characterization of DNA variations can help advance multiple fields of cancer genomics. Next Generation Sequencing (NGS) is currently the most efficient technique to determine a DNA sequence, due to its low experimental cost and time compared to traditional Sanger sequencing. Nevertheless, detection of mutations from NGS data is still a difficult problem, in particular for somatic mutations present in very low abundance, as when trying to identify tumor subclonal mutations, tumor-derived mutations in cell-free DNA, or somatic mutations from histologically normal tissue. The main difficulty is to precisely distinguish true mutations from sequencing artifacts, as they reach similar levels. In this thesis we have studied the systematic nature of errors in NGS data to propose efficient methodologies in order to accurately identify mutations potentially present in low proportion. In a first chapter, we describe needlestack, a new variant caller based on the modelling of systematic errors across multiple samples to extract candidate mutations. In a second chapter, we propose two post-calling variant filtering methods based on new summary statistics and on machine learning, with the aim of boosting the precision of mutation detection through the identification of non-systematic errors. Finally, in a last chapter we apply these approaches to develop biomarkers for early cancer detection using circulating tumor DNA.
Pichon, Maxime. "Caractérisation du microbiome respiratoire et de la diversité génomique virale au cours des formes de grippes sévères." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1271.
Influenza is a respiratory infection that can cause respiratory or neurological complications and requires rapid and appropriate management. The emergence of next-generation sequencing (NGS) allows the study of resident microbial communities as well as an in-depth study of the genome of pathogens. This thesis aimed to characterize the respiratory microbiome and the viral genomic diversity of influenza virus-infected patients, correlating these data with the collected clinical data. After sampling of respiratory specimens from children hospitalized between 2010 and 2014, the sequencing of their respiratory microbiome revealed an increase in microbial diversity and a differential microbial signature between clinical forms. A differential taxon distribution (OTU) allows the prediction of complications in infected children. The study of adult respiratory samples will complete the predictive signature. After validation of the analytical and bioinformatic processes by artificial reconstitution of quasi-species and collection of 125 respiratory clinical specimens, whole-genome sequencing of the influenza viruses by NGS made it possible to differentiate the initial diversities according to the nature of the infecting virus and the complication. Compared to early samples, specimens sampled successively show a differential diversification between the different segments of influenza viruses, whether in immunocompetent patients or in an immunocompromised patient with prolonged excretion.
Garcia, del Rio Diego Fernando. "Studying protein complexes for assessing the function of ghost proteins (Ghost in the Cell)." Electronic Thesis or Diss., Université de Lille (2022-....), 2023. https://pepite-depot.univ-lille.fr/ToutIDP/EDBSL/2023/2023ULILS115.pdf.
Ovarian cancer (OvCa) has the highest mortality rate among female reproductive cancers worldwide. OvCa is often referred to as a stealth killer because it is commonly diagnosed late or misdiagnosed. Once diagnosed, OvCa treatment options include surgery or chemotherapy. However, chemotherapy resistance is a significant obstacle. Therefore, there is an urgent need to identify new targets and develop novel therapeutic strategies to overcome therapy resistance. In this context, the ghost proteome is a potentially rich source of biomarkers. The ghost proteome, also known as the alternative proteome, consists of proteins translated from alternative open reading frames (AltORFs). These AltORFs originate from different start codons within mRNA molecules, such as the coding DNA sequence (CDS) in frameshifts (+1, +2), the 5'-UTR, 3'-UTR, and possible translation products from non-coding RNAs (ncRNA). Studies on alternative proteins (AltProts) are often limited due to their case-by-case occurrence and complexity. Obtaining functional protein information for AltProts requires complex and costly biomolecular studies. However, their functions can be inferred by profiling their interaction partners, known as "guilty by association" approaches. Indeed, assessing AltProts' protein-protein interactions (PPIs) with reference proteins (RefProts) can help identify their function and set them as research targets. Since there is a lack of antibodies against AltProts, crosslinking mass spectrometry (XL-MS) is an appropriate tool for this task. Additionally, bioinformatic tools that link protein functional information through networks and gene ontology (GO) analysis are also powerful. 
These tools enable the visualization of signaling pathways and the grouping of RefProts based on their biological process, molecular function, or cellular localization, thus enhancing our understanding of cellular mechanisms. In this work, we developed a methodology that combines XL-MS and subcellular fractionation. The key step of subcellular fractionation allowed us to reduce the complexity of the samples analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS). To assess the validity of crosslinked interactions, we performed molecular modeling of the 3D structures of the AltProts, followed by docking studies and measurement of the corresponding crosslink distances. Network analysis indicated potential roles for AltProts in biological functions and processes. The advantages of this workflow include non-targeted AltProt identification and subcellular identification. Additionally, a proteogenomic analysis was performed to investigate the proteomes of two ovarian cancer cell lines (PEO-4 and SKOV-3 cells) in comparison to a normal ovarian epithelial cell line (T1074 cell). Using RNA-seq data, customized protein databases for each cell line were generated. Differential expression of several proteins, including AltProts, was identified between the cancer and normal cell lines. The expression of some RefProts and their transcripts was associated with cancer-related pathways. Moreover, the XL-MS methodology described above was used to identify PPIs in the cancerous cell lines. This work highlights the significant potential of proteogenomics in uncovering new aspects of ovarian cancer biology. It enables us to identify previously unknown proteins and variants that may have functional significance. The use of customized protein databases and the crosslinking approach have shed light on the "ghost proteome," an area that has remained unexplored until now.
Lerat, Justine. "Neuropathies Périphériques Génétiques et Surdité : Etude des Relations Génétiques et Mécanistiques." Thesis, Limoges, 2018. http://www.theses.fr/2018LIMO0055.
Hereditary Peripheral Neuropathies (PN) are characterized by various phenotypes and great genetic heterogeneity. Charcot-Marie-Tooth disease (CMT) accounts for most sensori-motor peripheral neuropathies. Other symptoms, such as deafness, can also be associated. No precise estimate of deafness within this population exists and its pathogenicity is uncertain. The aim of this PhD was to better understand the physiopathology of deafness in patients suffering from PN. Various complementary approaches were used: 1) a clinical approach on a French cohort of patients suffering from both PN and hearing loss, with molecular genetic tests by NGS (PN panels, deafness panels, and/or exomes), 2) a biochemical approach on murine and human cochlear nerve samples and 3) a bioinformatic approach to identify protein hubs implicated in the onset of PN-associated deafness. This has enabled us to characterize the various phenotypes of patients suffering from both hereditary PN and deafness, and to observe that deafness can be endo-, retro- or endo- and retrocochlear. Thirty-six genes were reported to be associated with both PN and hearing impairment. Sixty percent of our patients were genotyped, highlighting seven novel pathogenic variants in five different genes. Our research also suggests that PMP22, the most frequently involved gene in CMT, is probably not or only weakly implicated in deafness onset in PN patients. In two of our patients with PMP22 pathogenic variants, a second involved gene was found, COCH and MYH14 respectively. Genotype-phenotype correlations were found with the ABHD12, SH3TC2, NEFL and PRPS1 genes. Secondly, the preliminary immunohistochemical study on wild-type rat auditory nerves highlighted the expression of pmp22, mpz, nefl and trpv4 on the cochlear nerve and detected a different expression in CMTpmp22/+ rats. However, the study on humans was not conclusive. 
Recently, an in silico search for pathways common to the different genes described as involved in both PN and deafness found a direct link between PMP22 and MPZ. Indirect links between several other proteins have also been identified. This thesis also shows that hearing impairment is most probably under-diagnosed in this population of genetic PN sufferers. We suggest regular audiologic follow-up for PN patients and neurological assessment for deaf children.
Ric, Audrey Marie Amélie. "Caractérisation d'aptamères par électrophorèse capillaire couplée au séquençage haut-débit Illumina." Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30388/document.
Aptamers are small single-stranded DNA or RNA oligomers which can have strong and specific interactions with some targets when they fold into three-dimensional structures. The objective of this thesis was to complete existing studies on the use of capillary electrophoresis (CE) in order to develop a method for the selection of aptamers by CE coupled to laser-induced fluorescence and Illumina high-throughput sequencing. In a first step, we developed a method of detection and separation by capillary electrophoresis coupled with dual UV-LEDIF detection of a DNA library interacting with a target: thrombin. It is a model that has already been studied and for which two aptamers have been published. We used aptamer T29 as part of our study because it has the best affinity. Capillary electrophoresis is a powerful analytical tool that improves the efficiency of aptamer selection and allows the interaction parameters to be determined. We were thus able to determine the affinity constant KD by CE-UV-LEDIF on the basic model: thrombin. Moreover, we also show how the use of Tris buffer can degrade single-stranded DNA during capillary electrophoresis, and we propose as an alternative the use of a dibasic sodium phosphate buffer, which avoids this degradation. Finally, we explain the difficulty of amplifying, by qPCR and PCR, an aptamer such as T29 with a G-quadruplex structure. We showed that Illumina high-throughput sequencing allowed us to find a correlation between the number of sequenced molecules and the number of sequences obtained. Analysis of the sequences obtained shows a significant amount (20%) of T29 sequences which do not correspond to the sequence of this aptamer. This shows that the PCR and high-throughput sequencing steps for the detection of G-quadruplexes can induce bias in the identification of these molecules.
Martinez, Palacios Paulina. "Réponse des agents non codants du génome – éléments transposables et petits ARN – à un événement d'allopolyploïdie : le génome du colza (Brassica napus) comme modèle d'étude." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112055/document.
The evolutionary success of polyploid species is partly due to the dynamic changes in genome organization and gene expression patterns that occur at the onset of polyploid formation. These changes are promoted by the merging of divergent genomes into a single nucleus (i.e. allopolyploidy) that causes a “genomic shock”; they are thought to provide a rich source of new genetic material upon which selection can act to promote adaptation and evolution. Many studies have thus aimed to uncover molecular mechanisms that are responsible for the evolutionary success of allopolyploid species, most of them focusing on gene expression changes. In the present PhD thesis, my interest has been concentrated on the non-coding components of the genome: transposable elements and small non-coding RNAs. My study involves oilseed rape (Brassica napus, AACC), a relatively young allopolyploid species that originated from hybridizations between B. rapa (AA) and B. oleracea (CC). Specifically, I have used resynthesized B. napus polyploids advanced by self-pollination of single plants for several generations; I have analyzed these plants at different generations for genomic changes accompanying polyploid formation and subsequent evolution. In a first part, sequence-specific amplification polymorphism (SSAP) targeting the C genome-specific transposable element Bot1 was used to evaluate the transposition rate of Bot1 in resynthesized B. napus in comparison with the diploid parents. Only a few transposition events were identified. When combined with the results obtained for two other TEs, this work suggests that allopolyploidy has only a moderate impact on TE transposition and restructuring. The changes observed in SSAP profiles led us to hypothesize that some of them resulted from changes in DNA methylation, resulting in rare but highly specific TE activation and transposition. 
In a second part, I have concentrated on small non-coding RNAs (sRNAs), which are thought to mediate different aspects of the response to the “genomic shock” induced by allopolyploid formation. Comprehensive analyses of sRNA expression in resynthesized B. napus allopolyploids have been carried out by deep sequencing sRNAs from 11 libraries prepared from stems of three allotetraploids (surveyed at the two generations S1 and S5) and the two diploid parents. Characterization of sRNA distributions in these plants indicates that sRNAs show an immediate but transient response to allopolyploidy. The sRNAs derived from transposable elements (down-regulated in the S1) or targeting unknown sequences (no Blast hit against any available public database) were particularly affected. The use of B. napus mRNAseq data revealed that the latter unknown candidates, which are 21 nt long and over-expressed in the earliest generations (F1, S0, S1), were derived from endogenous viral elements (EVEs). We confirmed that these EVEs showed the same expression patterns as the 21-nt-long sRNAs that specifically target them (over-expression in the F1, S0 and S1). These results suggest that (at least) some EVEs might be reactivated as a response to the merging of divergent genomes (in interspecific hybrids and newly formed allopolyploids). Altogether, our results have demonstrated a succession of sRNA pathways that counteract the reactivation of some specific TEs and/or EVEs at the onset of polyploid formation, with reactivated TEs and/or EVEs being immediately repressed at the post-transcriptional level (PTGS), and then fully repressed by transcriptional gene silencing (TGS) in the subsequent generations. Such data lead us to hypothesize that sRNAs are essential to overcome interspecific hybrid incompatibilities due to the uncontrolled and deleterious reactivation of TEs / EVEs. Therefore, sRNAs should be considered as the guardians of genome integrity even in newly formed allopolyploids.
Kopylova, Evguenia. "Algorithmes bio-informatiques pour l'analyse de données de séquençage à haut débit." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2013. http://tel.archives-ouvertes.fr/tel-00919185.
Kopylova, Evguenia. "Algorithmes bio-informatiques pour l’analyse de données de séquençage à haut débit." Thesis, Lille 1, 2013. http://www.theses.fr/2013LIL10181/document.
Sequence alignment algorithms are at the heart of bioinformatic sequence analysis. In this thesis we focus on the alignment of millions of short sequences produced by Next-Generation Sequencing (NGS) technologies, in particular for the analysis of metagenomic and metatranscriptomic data, that is, the DNA and RNA directly extracted from an environment. Two major challenges were confronted in our developed algorithms. First, all NGS technologies today are susceptible to sequencing errors in the form of nucleotide substitutions, insertions and deletions. Second, metagenomic samples can contain hundreds of unknown organisms and the standard approach to identifying them is to align against known closely related species. To overcome these challenges we designed a new approximate matching technique based on the universal Levenshtein automaton which quickly locates short regions of similarity (seeds) between two sequences allowing 1 error of any type. Using seeds to detect possible high-scoring alignments is a widely used heuristic for rapid sequence alignment, although most existing software is optimized for performing high-similarity searches and applies exact seeds. Furthermore, we describe a new indexing data structure based on the Burst trie which optimizes the search for approximate seeds. We demonstrate the efficacy of our method in two implemented software tools, SortMeRNA and SortMeDNA. The former can quickly filter ribosomal RNA fragments from metatranscriptomic data and the latter performs full alignment for genomic and metagenomic data.
Latypova, Martin Xénia. "Etude fonctionnelle de variants identifiés par séquençage haut-débit : apports et perspectives." Thesis, Nantes, 2018. http://www.theses.fr/2018NANT1024.
Technological advances have opened unparalleled opportunities to detect genetic variation. Interpretation of these data using in vivo disease modeling approaches provides helpful input to inform Medical Genetics clinical practice. Neurodevelopmental disorders, including intellectual disability and autism spectrum disorder, pose a major challenge for genomic data interpretation and disease modeling, given the extensive locus heterogeneity, high contribution of de novo variation to etiologic burden and low accessibility of cell types of interest. Using anatomical surrogate phenotypes in zebrafish, we established relevance to disease and tested pathogenicity of point mutations in the novel neurodevelopmental disease-causing genes RORA and SIN3B. First, we categorized the RORA-associated disorder into two clinical subtypes depending on the presence of cerebellar features in addition to intellectual disability and autism spectrum disorder. Nonsynonymous variant testing in zebrafish indicated that the direction of variant effect varied, which was consistent with the clinical subtypes observed. Additionally, we supported SIN3B involvement in a syndromic form of intellectual disability by demonstrating that disruption of craniofacial architecture, a comorbid feature, was caused by sin3b targeting in zebrafish. This work highlights the utility of the zebrafish model organism as an informative experimental tool for variant interpretation in genomic medicine, especially in neurodevelopmental disorders.
Vervier, Kevin. "Méthodes d’apprentissage structuré pour la microbiologie : spectrométrie de masse et séquençage haut-débit." Thesis, Paris, ENMP, 2015. http://www.theses.fr/2015ENMP0081/document.
The use of high-throughput technologies is changing scientific practices and the landscape in microbiology. On the one hand, mass spectrometry is already used in clinical microbiology laboratories. On the other hand, dramatic progress in sequencing technologies over the last ten years allows cheap and fast characterization of microbial diversity in complex clinical samples. Consequently, both technologies are being considered for future diagnostic solutions. This thesis aims to contribute to new in vitro diagnostics (IVD) systems based on high-throughput technologies, like mass spectrometry or next-generation sequencing, and their applications in microbiology. Because of the volume of data generated by these new technologies and the complexity of the measured parameters, we develop innovative and versatile statistical learning methods for applications in IVD and microbiology. The statistical learning field is well suited for tasks relying on high-dimensional raw data that can hardly be used by medical experts, like mass-spectrum classification or assigning a sequencing read to the right organism. Here, we propose to use additional known structures in order to improve the quality of the answer. For instance, we convert a sequencing read (raw data) into a vector in a nucleotide composition space and use it as a structured input for machine learning approaches. We also add prior information related to the hierarchical structure that organizes the reachable micro-organisms (structured output).
Haidar, Zahraa. "Identification de gènes responsables de maladies neurologiques héréditaires par séquençage à haut débit." Thesis, Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0662.
My work is a joint PhD between Saint Joseph University in Beirut (Lebanon) and Aix Marseille University in Marseille (France). My PhD project aims at identifying genes responsible for rare neurological diseases by next-generation sequencing (NGS) in consanguineous Lebanese families. Neurological diseases are characterized by extensive phenotypic and genetic heterogeneity, and affect the structure and function of different regions of the central and peripheral nervous system. During my PhD work, I studied several of these families, trying to identify the molecular basis of the disease under study using NGS technologies. First, I performed the bioinformatics analysis of the exome and genome data, as well as the familial segregation analysis, by Sanger sequencing, of the candidate variants identified by NGS. For some diseases in which a new mutation or gene was identified, I carried out further functional studies in order to understand the underlying physiopathological mechanisms.
Mersch, Marjorie. "Analyse de la méthylation de l'ADN par séquençage haut-débit chez la Poule." Thesis, Toulouse, INPT, 2018. http://www.theses.fr/2018INPT0107/document.
Anticipating the impact of environmental changes (in climate and feed) is a crucial issue for livestock production systems, including poultry. The influence of the environment on phenotypes is partly mediated by epigenetic phenomena, including DNA methylation, which may be involved in the regulation of gene expression. These mechanisms do not affect the DNA sequence but can be inherited through mitosis or meiosis. The interactions between epigenomes and gene expression are increasingly being studied in animal models and in plants. However, the mechanisms of regulation of genome expression through DNA methylation are relatively unknown in birds. This thesis work is based on two experimental designs carried out in chicken, aiming to characterize the methylome by high-throughput sequencing. The methylation patterns across the genome, and their link with expression, were first established by whole-genome bisulfite sequencing (WGBS) in whole embryos, followed by reduced representation bisulfite sequencing (RRBS) of adult hypothalamus. To date, no specific chicken RRBS study has been published. These two analyses were carried out by developing an optimized bioinformatics pipeline, available to the scientific community. Overall, the pattern of methylation in chicken is like that in mammals: CpG islands (regions rich in CG dinucleotides, which are found mainly in the promoter regions of the genome) are generally poorly methylated in promoters in both WGBS and RRBS data. Embryo methylome analyses confirmed the absence of a dose-compensation phenomenon on sex chromosomes, and the presence of a hypermethylated region on the Z chromosome. The analyses of RRBS data revealed an overall hypermethylation of CGs across the genome, suggesting a methylation response to environmental stress. From the analysis of WGBS data, we found that the level of methylation in promoters was negatively correlated with the expression of the associated gene. 
For the first time, allele-specific methylation was also detected between chicken lines, at a frequency comparable to that observed in humans. On the RRBS data, preliminary results on the methylome response to environmental stresses showed the complex nature of this relationship. The use of a low-energy diet would lead to greater mobilization of body fat, while individuals exposed to heat stress had a lighter body weight. Integrating these data with phenotypic measurements would make it possible to link methylation and environment. Beyond the fundamental aspect of this thesis, the method developed in this work could be applied to livestock systems to breed animals better adapted to a changing environment, by improving production traits.
Fermey, Pierre. "Identification de nouvelles bases moléculaires des cancers précoces par séquençage à haut débit." Thesis, Normandie, 2017. http://www.theses.fr/2017NORMR110/document.
One of the greatest advances in oncology and genetics over the past 20 years has been the identification of hereditary forms of cancer and of cancer genes. Nevertheless, in a majority of patients suspected of having an inherited form of cancer, analyses of the genes known to be involved in Mendelian predispositions to cancer remain negative. Thanks to the emergence of high-throughput sequencing (NGS), it is now possible to sequence all the exons of an individual (the exome) or several hundred genes in a short period of time and at a reasonable cost. In this context, we applied several strategies based on these new tools in order to identify new molecular bases of early-onset cancers. First, we applied an intra-familial exome analysis strategy to an atypical family with chondrosarcomas of the chest, for which no molecular basis had been identified. Using this strategy, we were able to identify a truncating alteration of the EXT2 gene (NM_000401.3; c.237G>A; p.Trp79*). Documented loss-of-function alterations of this gene are implicated in a disease called multiple osteochondromas (OM), associated with benign lesions. Interestingly, these patients showed no clinical signs of OM, indicating a potential phenotypic extension of EXT2 mutations. In addition, this work allowed us to change the clinical management of this family. We then used a subtractive exome analysis strategy on a trio (affected child and healthy parents) in order to identify de novo mutations in a young patient who developed a medulloblastoma of the cerebellum at 8 years of age followed by a meningioma at 22 years of age. The analysis of the trio revealed a de novo mutation affecting a highly conserved amino acid of the HID-1 protein. HID-1 is specifically expressed in neuronal and secretory cells, and seems to function around the Golgi apparatus to regulate the sorting of newly formed vesicles.
Our hypothesis is that a defect of the HID-1 protein, linked to a mutation of the HID-1 gene, could alter the secretory pathway and thereby contribute to the development of the tumor. This work, which is still ongoing, demonstrates the strength of the trio strategy for the rapid identification of de novo mutations and illustrates the difficulty of interpreting variants detected in genes not yet implicated in cancer. Then, thanks to the recruitment of the Laboratory of Molecular Genetics of the CHU of Rouen, we collected a cohort of 10 patients who developed an adrenocortical carcinoma (ACC) at a very early age and for whom no molecular basis could be identified. Despite subtractive and inter-familial exome analyses, we were unable to identify new molecular bases for these cases of pediatric ACC. Finally, under the assumption that rare or private mutations in a limited number of genes involved in cancer could contribute to inherited forms of cancer, we undertook a project to sequence 201 genes involved in cancer in patients who developed tumors at a pediatric age. The first results of this project confirmed the robustness of this technique and suggested a phenotypic extension of the DICER1 mutation spectrum as well as an oligogenic contribution of DNA repair genes in pediatric tumors. Soon, these results will be compiled in a database and will undergo statistical analysis with the objective of identifying enrichment of rare variants in specific genes or biological pathways in these patients compared to control individuals.
Nguyen, Quang Nam. "Utilisation du séquençage à haut débit pour la sélection et l'ingénierie des aptamères." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS238.
SELEX is a directed molecular evolution technique which, after several rounds of selection, enriches a library of random nucleic acids in sequences able to bind a target specifically. Sequencing techniques are then used to identify these sequences, called "aptamers". Since the arrival of High-Throughput Sequencing (HTS), it has been possible to analyse millions of sequences. The aim of the thesis was to develop methods for the processing and analysis of HTS data, in order to facilitate the identification of the best aptamers within a SELEX. During this thesis, a semi-automatic binding test on adherent living cells was developed to measure the affinity of aptamers identified in SELEX directed against specific cells (cell-SELEX). Then, the evolution of sequence enrichment during a cell-SELEX was analysed by HTS. This analysis made it possible to design a new phylogenetic approach named FREDROGRAM. This evolutionary approach identified variants of an aptamer family with a better affinity. Finally, HTS of two SELEX directed against proteins contributed to a better understanding of the impact of selection parameters on the library and to the identification of new aptamers, notably while reducing the number of SELEX rounds. To conclude, this work shows the importance of HTS in the identification of the best aptamers and suggests new protocols for monitoring future SELEX experiments in a different manner.
Mambu, Mambueni Hendrick. "Identification de nouveaux variants rares associés à la spondyloarthrite par séquençage haut-débit." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASL064.
Spondyloarthritis (SpA) is a multifactorial disease with an estimated heritability of over 90%, mainly related to HLA-B27. All identified susceptibility factors, including HLA-B27, explain less than one third of the heritability. The involvement of rare variants could explain part of this missing heritability. The aim of this work was to identify rare variants associated with SpA via a combined family analysis and high-throughput sequencing approach. First, we sequenced a 1.4 Mb region significantly linked to SpA at 13q13 in 71 patients and 21 healthy controls from families with a high linkage score in this region. We identified a rare variant in the FREM2 gene present in 9 patients from a family with high linkage to the region and not found in other families or isolated cases of SpA. We then sequenced the exome of 48 patients from 20 multiplex families. Unfortunately, we did not observe any recurrent variants between families. We then focused on a second, previously known genetic linkage peak on chromosome 9. The study of the family most linked to this region, which includes 12 patients, led to the identification of several rare coding variants segregating with the disease. However, subsequent studies have shown equivalent allelic frequencies of these variants between cases and controls. Finally, whole genome sequencing of 413 patients from 76 multiplex families with 4 or more patients was performed. We identified 1203 rare, coding, non-synonymous variants shared by at least all affected family members. Genetic and functional validation analyses of these variants are underway, as is the analysis of non-coding variants. In conclusion, these different approaches suggest significant genetic heterogeneity in SpA and also highlight the difficulty of confirming the involvement of rare variants in complex diseases.
Gicquel, Evelyne. "Etude par approches globales de la sélectivité d’atteinte dans les dystrophies des ceintures." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLE041.
Limb Girdle Muscular Dystrophies are a group of genetic diseases affecting the muscles of the body with different degrees of severity. The factors behind these differences of impairment have not been identified. The objective of this thesis work is to identify the molecular differences existing under normal conditions between muscles known to show a difference of impairment in case of genetic deficiencies associated with Limb Girdle Muscular Dystrophy. We based our work on the assumption that the differences of impairment between muscles would be caused by mechanisms leading to modifications of the expression of protective or sensitizer genes in the muscle. Therefore, we explored these mechanisms through a global approach. Analyses by high-throughput sequencing in primate muscles allowed the identification of several genes and regulatory elements whose expression differs between the sensitive and the resistant muscles. These genes interact in a common network, which could be targeted for therapeutic purposes. Some of these differences were shown to be conserved in the mouse. We then explored the mechanisms by which the identified regulatory elements may be involved in selectivity of impairment. The results of this thesis provide a deeper understanding of the pathophysiological mechanisms of Limb Girdle Muscular Dystrophies. They will also pave the way for the development of new treatments for this group of diseases.
Liais, Etienne. "Identification et caractérisation de virus aviaires par des approches de séquençage à haut débit." Thesis, Toulouse, INPT, 2014. http://www.theses.fr/2014INPT0134/document.
Infectious diseases are considered the most prevalent cause of mortality in humans as well as other animals worldwide. Since the advent of high throughput sequencing technologies, diagnostic methods for these conditions have quickly changed and evolved, as the continuously decreasing cost of mass sequencing is making this tool available to larger numbers of people. As part of my thesis project, an Illumina®-based sequencing method (on a MiSeq machine) was designed for diagnostic purposes in clinical cases in poultry. We first used this method to identify the causative agent of the fulminating disease of guinea fowl. This validated the use of our protocol to identify the pathogenic infectious agent behind a specific condition. This newly identified Coronavirus was further analysed and characterised. In a second study we used an unbiased mass sequencing approach to describe the RNA virus populations present in the duck respiratory tract during clinical episodes (respiratory illness or egg drops). Data showed an important viral diversity and we identified some candidate pathogens. Taken together, these results validate the use of high throughput sequencing as a powerful diagnostic tool.
Mirauta, Bogdan. "Etude du transcriptome à partir de données de comptages issues de séquençage haut débit." Thesis, Paris 6, 2014. http://www.theses.fr/2014PA066424/document.
In this thesis we address the problem of reconstructing the transcription profile from RNA-Seq reads in cases where the reference genome is available, but without making use of existing annotation. The first two chapters consist of an introduction to the biological context, high-throughput sequencing and the statistical methods that can be used in the analysis of series of counts. Then we present our contribution to the RNA-Seq read count model, the inference of the transcription profile using Particle Gibbs, and the reconstruction of DE regions. The analysis of several data sets showed that using Negative Binomial distributions to model the read count emission is not generally valid. We develop a mechanistic model which accounts for the randomness generated within all RNA-Seq protocol steps. Such a model is particularly important for the assessment of the credibility intervals associated with the transcription level and coverage changes. Next, we describe a State Space Model in which the read count profile is the observation and the transcription profile is the latent variable. For the transition kernel we design a mixture model combining the possibility of making, between two adjacent positions, no move, a drift move or a shift move. We detail our approach for the reconstruction of the transcription profile and the estimation of parameters using the Particle Gibbs algorithm. In the fifth chapter we complete the results by presenting an approach for analysing differences in expression without making use of existing annotation. The proposed method first approximates these differences for each base pair and then aggregates continuous DE regions.
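The three-way transition kernel described in this abstract (no move, drift move, shift move between adjacent positions) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the move probabilities, the Gaussian drift, the uniform shift and the function name `simulate_profile` are hypothetical choices made for illustration and are not taken from the thesis.

```python
import random

def simulate_profile(n_positions, p_drift=0.05, p_shift=0.01,
                     drift_sd=0.05, start=1.0, max_level=5.0, seed=0):
    """Simulate a latent transcription level along consecutive positions.

    At each position one of three moves is drawn (all parameters are
    illustrative assumptions, not values from the thesis):
      - shift: abrupt jump to a new level (e.g. a transcript boundary)
      - drift: small local change around the current level
      - no move: the level is carried over unchanged
    """
    rng = random.Random(seed)
    level = start
    profile = []
    for _ in range(n_positions):
        u = rng.random()
        if u < p_shift:
            level = rng.uniform(0.0, max_level)                 # shift move
        elif u < p_shift + p_drift:
            level = max(0.0, level + rng.gauss(0.0, drift_sd))  # drift move
        # otherwise: no move
        profile.append(level)
    return profile

if __name__ == "__main__":
    profile = simulate_profile(1000)
    print(len(profile), min(profile), max(profile))
```

In a full state space model, read counts would then be emitted conditionally on this latent level; that emission step is left out here because the abstract does not specify it.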
Mirauta, Bogdan. "Etude du transcriptome à partir de données de comptages issues de séquençage haut débit." Electronic Thesis or Diss., Paris 6, 2014. http://www.theses.fr/2014PA066424.
Da, Silva Ophélie. "Structure de l'écosystème planctonique : apport des données à haut débit de séquençage et d'imagerie." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS183.
Planktonic organisms are key actors in oceanic ecosystems, which support trophic networks and play a major role in biogeochemical cycles and climate regulation. While the spatio-temporal distribution of planktonic diversity can be investigated at several levels, from the gene to the ecosystem, identifying the underlying mechanisms is challenging. Indeed, the structure of diversity results from different evolutionary and ecological processes that can act simultaneously. Since the beginning of the 21st century, the oceanic environment has been increasingly monitored. Numerous observation platforms have been deployed, leading to the acquisition of a large amount of data for multiple environmental characteristics. At the same time, technologies for studying living organisms have been developed. Thus, an unprecedented sampling of planktonic organisms has taken place. In particular, high-throughput sequencing and imaging data provide molecular, taxonomic and functional information at several biological levels. The objective of this thesis was to explore the structure of planktonic ecosystems using high-throughput sequencing and imaging data. Coupling with environmental data could contribute to a better understanding of the spatial distribution of planktonic diversity, from species to communities. In the first part, the genetic diversity of protists was studied at the species level. The hypothesis was that metagenomics could provide access to the poorly characterized spatial organization of the intraspecific protist genetic diversity, as well as to the mechanisms underlying it. In a second part, the link between genetic diversity and functional diversity was explored. Transparency was targeted. This functional trait is little explored at the community level and its molecular basis is poorly identified. A data-driven approach allowed this trait to emerge from imaging data, leading to the exploration of its biogeography and molecular basis.
In the last part, the strong complementarity between sequencing, imaging and environmental datasets was explored, in order to highlight the multi-scale structure of the planktonic ecosystem and to identify its global structure. Finally, all the results were discussed to highlight the contributions that these data can make to the understanding of planktonic ecosystems, as well as the limitations they can face.
Chaaya, Nancy. "Anticorps catalytiques et répertoires immuns murins : analyse génétique, biochimique et bio-informatique." Thesis, Compiègne, 2019. http://www.theses.fr/2019COMP2495.
In the late 1980s, catalytic antibodies were discovered in the serum of patients, especially patients with autoimmune diseases. Some catalytic antibodies appear to have a beneficial effect on health while others are deleterious. In order to understand the link between catalytic antibodies and immune system pathologies, previous work led to 4 single-chain Fragment variable (scFv) libraries displayed on the phage surface, representing different genetic backgrounds and immunological states. The scFvs, composed of the variable regions of the heavy (H) and light (L) chains, are encoded by immunoglobulin gene subgroups V(H), D(H), J(H), V(L) and J(L). With the objective of deciphering the potential origin of catalytic antibodies, a statistical representation of each subgroup within each repertoire was produced, based on more than 300 000 sequences. The NGS data analysis showed a variable expression of some gene subgroups (including "rare" ones) between the 4 libraries, showing that the genetic background and/or the immunological state influence immunoglobulin gene subgroup expression. Then, we investigated the presence of antibodies with potential active sites in the libraries by molecular modelling. Some libraries express more putative catalytic antibodies than others, depending on the genetic background and the immunological state profile. Finally, in order to validate this in silico approach, an in vitro approach was considered. 5 scFvs displayed on the phage surface had been selected during a previous work by an iterative process on the basis of their catalytic activity: β-lactamase-like activity. Each of them displays a unique primary and tertiary structure. The scFvs displayed on the phage surface must also be catalytically active when expressed in soluble form. One of the selected scFvs, P90C2, was optimized and expressed in E. coli BL21 (DE3) bacteria in the form of inclusion bodies and then solubilized and refolded.
Although soluble P90C2 fully retained its binding activity, its catalytic potency was completely lost. Further experiments aimed to i) optimize the refolding protocol, ii) study the impact of scFv codon optimization, and iii) show the influence of the pIII fusion protein on the scFv catalytic activity.
Hurel, Julie. "Détection d'organismes génétiquement modifiés (OGM) inconnus par analyse statistique de données de séquençage haut débit." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1B027.
The European Union has adopted a very restrictive policy towards the dissemination and use of genetically modified organisms (GMOs), whose use in food is not well accepted by consumers. Although a maximum threshold exists for a food to be labelled "GM-free", only known GMOs are easily detectable. A GMO consists mainly of a host genome and a sequence inserted by a non-natural process that confers a particular property on the organism, such as resistance to certain diseases. In recent years, GMOs whose inserted sequence is not known have been produced, and these are not detectable by the approaches used until now (PCR-type). Hence the need for a tool for the detection of unknown GMOs, the subject of this thesis, based on recent advances in high-throughput sequencing. Statistically, each organism has a specific frequency of nucleotide use in its genome. Any introduction of foreign genetic material will locally alter the nucleotide use frequencies in that region, resulting in frequencies that differ from those of the host organism. Based on this assertion, a tool for detecting unknown GMOs was developed from bacterial sequencing data, for cases where the GMO results from the insertion of a foreign gene, or from the truncation or fusion of a gene that may belong to the host genome. The tool was tested on 4 GMO bacterial genomes, 7 wild bacterial genomes and 42 synthetic bacterial genomes. The results demonstrate the effectiveness of the method, with only one false positive gene and more than 99% of the genes of GMO inserts identified.
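The principle stated in this abstract, that foreign material locally shifts nucleotide usage relative to the host genome, can be sketched as a sliding-window comparison. The abstract does not say which statistic the thesis uses, so the following is an assumption-laden illustration: dinucleotide frequencies, an L1 distance, and the names `kmer_freqs` and `window_divergence` are all hypothetical choices.

```python
from collections import Counter

def kmer_freqs(seq, k=2):
    """Relative frequency of each k-mer in seq (empty dict if seq is too short)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}

def window_divergence(genome, window=200, step=100, k=2):
    """Score each window by the L1 distance between its k-mer frequencies
    and the genome-wide frequencies. High scores mark regions whose
    composition departs from the host background (illustrative statistic,
    not the one used in the thesis)."""
    background = kmer_freqs(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        local = kmer_freqs(genome[start:start + window], k)
        keys = set(background) | set(local)
        dist = sum(abs(local.get(x, 0.0) - background.get(x, 0.0)) for x in keys)
        scores.append((start, dist))
    return scores

if __name__ == "__main__":
    # Toy "host" made of AT repeats with a GC-rich "insert" in the middle.
    toy = "AT" * 500 + "GC" * 100 + "AT" * 500
    best_start, best_score = max(window_divergence(toy), key=lambda s: s[1])
    print(best_start, round(best_score, 3))
```

On real data the window size, the k-mer length and the decision threshold would all need calibration against known host genomes.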
Brinda, Karel. "Nouvelles techniques informatiques pour la localisation et la classification de données de séquençage haut débit." Thesis, Paris Est, 2016. http://www.theses.fr/2016PESC1027/document.
Since their emergence around 2006, Next-Generation Sequencing technologies have been revolutionizing biological and medical research. Obtaining instantly an extensive amount of short or long reads from almost any biological sample enables detecting genomic variants, revealing the composition of species in a metagenome, deciphering cancer biology, decoding the evolution of living or extinct species, or understanding human migration patterns and human history in general. The pace at which the throughput of sequencing technologies is increasing surpasses the growth of storage and computer capacities, which keeps creating new computational challenges in NGS data processing. In this thesis, we present novel computational techniques for the problems of read mapping and taxonomic classification. With more than a hundred published mappers, read mapping might be considered fully solved. However, the vast majority of mappers follow the same paradigm and only little attention has been paid to non-standard mapping approaches. Here, we propose so-called dynamic mapping, which we show to significantly improve the resulting alignments compared to traditional mapping approaches. Dynamic mapping is based on exploiting the information from previously computed alignments, helping to improve the mapping of subsequent reads. We provide the first comprehensive overview of this method and demonstrate its qualities using Dynamic Mapping Simulator, a pipeline that compares various dynamic mapping scenarios to static mapping and iterative referencing. An important component of a dynamic mapper is an online consensus caller, i.e., a program collecting alignment statistics and guiding updates of the reference in an online fashion. We provide OCOCO, the first online consensus caller, which maintains compact statistics for individual genomic positions using bit counters.
Beyond its application to dynamic mapping, OCOCO can be employed as an online SNP caller in various analysis pipelines, enabling SNPs to be called from a stream without saving the alignments on disk. Metagenomic classification of NGS reads is another major problem studied in the thesis. Given a database of thousands of reference genomes placed on a taxonomic tree, the task is to rapidly assign a huge number of NGS reads to tree nodes, and possibly to estimate the relative abundance of the species involved. In this thesis, we propose improved computational techniques for this task. In a series of experiments, we show that spaced seeds consistently improve the classification accuracy. We provide Seed-Kraken, a spaced seed extension of Kraken, the most popular classifier at present. Furthermore, we suggest a new indexing strategy based on a BWT-index, obtaining a much smaller and more informative index compared to Kraken. We provide a modified version of BWA that improves the BWT-index for a quick k-mer look-up.
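The idea behind the spaced seeds mentioned in this abstract can be shown in a few lines: instead of taking every contiguous k-mer, only the positions marked `1` in a mask are kept, so a mismatch falling on a `0` position does not change the seed. This is a minimal sketch; the mask `1101` and the function name `spaced_kmers` are hypothetical and are not taken from Seed-Kraken.

```python
def spaced_kmers(read, mask="1101"):
    """Extract spaced seeds from a read.

    mask: string of '1' (position kept) and '0' (position ignored).
    The mask used here is an arbitrary illustration, not a mask from
    the thesis or from Seed-Kraken.
    """
    span = len(mask)
    keep = [i for i, m in enumerate(mask) if m == "1"]
    return [
        "".join(read[pos + i] for i in keep)
        for pos in range(len(read) - span + 1)
    ]

if __name__ == "__main__":
    print(spaced_kmers("ACGTAC"))
    # Two reads differing only at a masked ("0") position share a seed.
    print(spaced_kmers("ACGT") == spaced_kmers("ACTT"))
```

The trade-off is that a spaced seed of weight w spans more than w bases, so fewer seeds fit in a read, while each seed tolerates substitutions at its masked positions.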
Caporossi, Alban. "Apport du séquençage haut débit dans l'analyse bioinformatique du génome du virus de l'hépatite C." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAS021/document.
High-throughput sequencing was used in this work to reconstruct, with adapted methods, the whole genome of the hepatitis C virus (HCV), particularly for accurately typing the virus. Thus, in one study, we managed to detect a recombinant form of HCV circulating within a patient. In another study, we typed several HCV strains of different genotypes and detected their resistance mutations. Finally, a last study based on this approach uncovered an HCV strain belonging to a new subtype. High-throughput sequencing was also used in this work to detect multiple infections and analyze viral evolution, with targeted HCV genes and non-specific methods, for 2 HCV patients under treatment. This retrospective study made it possible to define the composition of each temporal sample, assess their nucleotide diversity, investigate viral population genetic structure and temporal evolution, and date secondary infections. Results of this analysis support the hypothesis of an onset mechanism of treatment resistance (selective sweeps).
Chaara, Wahiba. "Caractérisation de la diversité du répertoire TCR par modélisation de données de séquençage haut-débit." Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066410/document.
T lymphocytes (LT) are key players in the immune system, a complex and dynamic system evolving over the organism's life. The concept of "lymphocyte repertoire" designates a collection of lymphocytes sharing the same phenotype, the same function or any other criterion. Each LT is characterized by a unique membrane receptor, called TCR, allowing it to recognize antigens specifically. TCRs are characterized by variable regions produced by a series of somatic rearrangements that occur during thymic differentiation; these regions underlie the diversity of LT recognition. The "TCR repertoire" approach focuses the clonal characterisation of LT populations on the diversity of the TCRs expressed at the scale of the population. The high-throughput sequencing of TCR chains (RepSeq) describes this diversity with unprecedented precision. However, this approach requires adapted tools to enable a relevant deciphering of the analysed TCR repertoire diversity. My thesis aimed to: i) deepen the concept of diversity of the lymphocyte repertoire, ii) develop an appropriate methodology to exploit RepSeq data optimally while taking into account the limits of this technology, and iii) develop a tool providing immunologists with a thorough characterisation of their TCR repertoires of interest.
Chaara, Wahiba. "Caractérisation de la diversité du répertoire TCR par modélisation de données de séquençage haut-débit." Electronic Thesis or Diss., Paris 6, 2016. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2016PA066410.pdf.
Glouzon, Jean-Pierre. "Étude de la dynamique des populations du viroïde de la mosaïque latente du pêcher par séquençage à haut débit et segmentation." Mémoire, Université de Sherbrooke, 2012. http://hdl.handle.net/11143/6582.
Mbareche, Hamza. "Molecular tools for the study of fungal aerosols." Doctoral thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/35697.
Since the rapid development of high-throughput sequencing methods in molecular ecology, fungi have been the underdogs of the microbial world, especially in bioaerosol studies. In particular, studies describing fungal exposure in different occupational environments have been limited by traditional culture methods that underestimate the broad spectrum of fungi present in the air. There are potential risks in the human inhalation of fungal spores in an occupational scenario where the quantity and diversity of fungi is high. Although some health problems are already known to be associated with fungal exposure in certain work environments, the risk may be underestimated due to the methods used. Applying high-throughput sequencing to soil samples has helped explain the role of fungi in ecosystems. However, the literature is not decisive in terms of the genomic region to use as a target for the enrichment and sequencing of fungi. The present thesis deals with the challenge of determining which of the two universally used regions, ITS1 and ITS2, is best suited for the study of fungal aerosols. In tandem with this challenge came another: addressing the loss of fungal cells during the centrifugation of liquid impaction air samples for purposes of concentration. This thesis describes a new filtration-based method to circumvent such losses during centrifugation. These two challenges represent the first part of the thesis, which focuses on methodology development. In synopsis, the treatment of air samples prior to DNA extraction is considered, along with the identification of the best region to target in amplicon-based high throughput sequencing. In the second part of the thesis, the focus turns to the application of the developed methodology to characterize fungal exposure in three different work environments: compost, biomethanization, and dairy farms. All three are of special interest due to potentially high fungal exposure.
Results show that ITS1 outperformed ITS2 in disclosing higher levels of fungal diversity in aerosol samples. Due to complementarity in the taxonomic profiles disclosed by the two regions, the author suggests the use of both regions to cover the greatest possible number of taxa when taxonomy is the main interest of the study. However, ITS1 should be the first choice in other studies, mainly because of the high diversity it reveals and its concordance with results obtained via shotgun metagenomic profiling. In addition, the new filtration-based approach proposed in this work might be the best alternative available for compensating the loss of propagules in centrifugation done prior to DNA extraction. Taken together, these methods allowed a profound characterization of fungal exposure in occupational environments.
Karaouzene, Thomas. "Bioinformatique et infertilité : analyse des données de séquençage haut-débit et caractérisation moléculaire du gène DPY19L2." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAS041/document.
In the last decade, the investigation of genetic diseases has been revolutionized by the rise of high throughput sequencing (HTS). Thanks to these new techniques it is now possible to analyze the totality of the coding sequences of an individual (exome sequencing) or even the sequences of their entire genome or transcriptome. The understanding of a pathology and of the genes associated with it now depends on our ability to identify causal variants within a plethora of technical artifacts and benign variants. HTS is expected to be particularly useful in the field of infertility, as this pathology is expected to be highly genetically heterogeneous and only a few genes have so far been associated with it. My thesis focuses on male infertility and is divided into two main parts: HTS data analysis of infertile men and the molecular characterization of a specific phenotype, globozoospermia. Several thousand distinct variants can be identified in a single exome, so effective informatics is essential in order to obtain a short and actionable list of variants. It is for this purpose that I developed an HTS data analysis pipeline performing all bioinformatics analysis steps in succession: 1) read mapping along a reference genome, 2) genotype calling, 3) variant annotation and 4) the filtering of the variants considered non-relevant for the analysis. Performing all these independent steps within a single pipeline is a good way to calibrate them and therefore to reduce the number of erroneous calls. This pipeline has been used in five studies and allowed the identification of variants impacting candidate genes that may explain the patients' infertility phenotype.
All these variants have been experimentally validated using Sanger sequencing.I also took part in the genetic and molecular investigations which permitted to demonstrate that the absence of the DPY192 gene induces male infertility due to globozoospermia, the presence in the ejaculate of only round-headed and acrosomeless spermatozoa. Most patients with globozoospermia have a homozygous deletion of the whole gene. I contributed to the characterization of the mechanisms responsible for this recurrent deletion, then, using Dpy19l2 knockout (KO) mice, I realized the comparative study of testicular transcriptome of wild type and Dpy19l2 -/- KO mice. This study highlighted a dysregulation of 76 genes in KO mice. Among them, 23 are involved in nucleic acid and protein binding, which may explain acrosome anchoring defaults observed in the sperm of globozoospermic patients.My work allowed a better understanding of globozoospermia and the development of a HTS data analysis pipeline. The latter allowed the identification of more than 15 human gametogenesis genes involved in different infertility phenotypes
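The fourth step of the pipeline described in this abstract, filtering out variants judged non-relevant, can be illustrated with a minimal sketch. The record fields (`gene`, `consequence`, `pop_freq`, `homozygous`), the consequence labels and the 1% frequency cutoff are hypothetical choices made for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass

# Hypothetical consequence labels treated as potentially damaging.
DAMAGING = {"stop_gained", "frameshift", "splice_site", "missense"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str   # annotation produced by the annotation step
    pop_freq: float    # allele frequency in a reference population
    homozygous: bool


def filter_variants(variants, max_freq=0.01):
    """Keep rare, potentially damaging variants (illustrative thresholds)."""
    return [
        v for v in variants
        if v.consequence in DAMAGING and v.pop_freq <= max_freq
    ]


# Made-up calls, only to exercise the filter.
calls = [
    Variant("GENE_A", "synonymous", 0.20, False),
    Variant("GENE_B", "stop_gained", 0.0001, True),
    Variant("GENE_C", "missense", 0.15, False),
]
shortlist = filter_variants(calls)
print([v.gene for v in shortlist])
```

In practice such filters are usually tuned to the expected inheritance model; the sketch only shows how thousands of calls can be reduced to a short list.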
Limasset, Antoine. "Nouvelles approches pour l'exploitation des données de séquences génomique haut débit." Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S049/document.
Novel approaches for the exploitation of high-throughput sequencing data. In this thesis we discuss computational methods for handling the DNA sequences produced by high-throughput sequencers. We mostly focus on the reconstruction of genomes from DNA fragments (genome assembly) and closely related problems. These tasks combine huge amounts of data with combinatorial problems. Various graph structures are used to handle them, each presenting a trade-off between scalability and assembly quality. This thesis introduces several contributions to cope with these tasks. First, novel representations of assembly graphs are proposed to allow better scaling. We also present novel uses of those graphs beyond assembly, and we propose tools that use such graphs as references when a fully assembled genome is not available. Finally, we show how to use those methods to produce less fragmented assemblies while remaining tractable.
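As an illustration of the kind of graph structure used in assembly, here is a minimal sketch of a de Bruijn graph, one widely used assembly graph. The abstract does not name a specific structure, so this choice, the function names and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_unambiguously(graph, start):
    """Follow edges from `start` while each node has exactly one successor."""
    path, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (node,) = graph[node]
        if node in seen:  # stop on cycles
            break
        seen.add(node)
        path += node[-1]
    return path


# Toy overlapping reads, invented for this example.
reads = ["ACGTT", "CGTTG", "GTTGA"]
graph = de_bruijn(reads, k=4)
print(extend_unambiguously(graph, "ACG"))
```

Larger values of `k` resolve more repeats but require more overlap between reads, which is one form of the scalability versus quality trade-off mentioned in the abstract.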
Doan, Trung-Tung. "Epidémiologie moléculaire et métagénomique à haut débit sur la grille." Phd thesis, Université Blaise Pascal - Clermont-Ferrand II, 2012. http://tel.archives-ouvertes.fr/tel-00778073.
Muller, Etienne. "Les défis du séquençage à haut débit dans l'exploration génétique des cancers du sein et de l'ovaire." Thesis, Normandie, 2017. http://www.theses.fr/2017NORMR100/document.
In 5 to 10% of cases, breast and ovarian cancers arise in a context of genetic predisposition, only a small proportion of which is explained by the presence of a pathogenic variant in the BRCA1, BRCA2 and PALB2 genes. High-throughput sequencing can explore this missing heritability, but it represents a new challenge in computing, statistics and biology alike. Three approaches using this new technology were used to investigate new predisposition factors. First, the risks associated with 34 genes known or suspected to be involved in predisposition were estimated from the analysis of 5,131 index cases and the development of a new statistical approach. Second, the contribution of mosaic de novo mutations to the syndrome was explored in 1,750 index cases from the previous study, using software developed specifically for detecting poorly represented variants: outLyzer. Finally, the sequencing-based exploration of the missing heritability was extended to a panel of 201 genes involved in cancer, in 118 patients selected for the early onset of their disease, a feature highly suggestive of a predisposition factor. The results of this work validated the relevance of analyzing PALB2, RAD51C and RAD51D for patient management, and also suggested an underestimated involvement of mosaic variants. However, other highly penetrant genetic factors very likely remain to be discovered, although their risk modulation would follow an oligogenic model.
Nguyen, Do Ngoc Linh. "Mise au point de l’analyse par séquençage à haut-débit du microbiote fongique et bactérien respiratoire chez les patients atteints de mucoviscidose." Thesis, Lille 2, 2016. http://www.theses.fr/2016LIL2S011/document.
Chronic pulmonary infection results in an irreversible decline in lung function in patients with cystic fibrosis (CF). While several bacteria are known to be the main causes of these infections (for example: Pseudomonas aeruginosa, Staphylococcus aureus, Burkholderia cepacia, Achromobacter xylosoxidans...), some fungal genera, including filamentous fungi (such as Aspergillus, Scedosporium...), have more recently been identified as emerging or re-emerging pathogens able to cause invasive mycosis. Thus, identifying the microorganisms involved in respiratory colonization and/or infection has become essential. Culture methods remain the gold standard for diagnosing microbial infections. However, they cannot identify non-culturable or difficult-to-cultivate microorganisms. Thanks to the development of high-throughput sequencing (next generation sequencing or NGS), recent studies have shown that the lung of patients with CF harbors a complex polymicrobial flora, also called the CF lung microbiota, which includes not only bacteria but also fungi (yeasts and/or filamentous fungi), viruses and phages. Dysbiosis (loss of abundance and/or diversity) of the lung microbiota has been associated with decreased lung function and poor clinical status in patients. While the lung bacteriota and its role in pathogenesis have been widely studied, few studies focus on the fungal component (mycobiota/mycobiome) of the lungs. Our thesis (PhD work) focuses on NGS analysis of the pro- and eukaryotic lung microbiota in CF patients, in particular on the comparison of different methodological approaches to optimize and standardize the NGS protocol. This project was developed under the supervision of Pr. Laurence Delhaes in the “Biology and Diversity of Eukaryotic Emerging Pathogens” team directed by Dr. Eric Viscogliosi. Firstly, we present the state of the art on the risk of fungal colonization and infection in CF, as well as the development of the new concepts of lung microbiota and lung mycobiota on which our team focuses. Secondly, we applied the NGS approach to study the pro- and eukaryotic microbiota in sputum samples from the lungs of CF patients. NGS is a powerful technique, but biases may be introduced at numerous methodological steps. One of the most important is that the technique cannot differentiate among living microorganisms, dead or damaged cells, and extracellular DNA. In the context of the CF lung microbiota, which is often exposed to high-dose intravenous antibiotics, NGS analysis might inaccurately evaluate the abundance and diversity of the lung microbiota. Pretreating samples with propidium monoazide (PMA), which makes it possible to selectively target the DNA of viable cells, could be a solution to overcome this limitation. Our study aimed to determine whether sample pretreatment with PMA modified the lung pro- and eukaryotic microbiota analyzed by NGS. We discuss the clinical relevance of this "PMA - NGS" approach for better quantification of living microorganisms in CF patients.
Padioleau, Ismaël. "Étude génomique de l'interférence entre la réplication et la transcription comme source du stress réplicatif." Thesis, Montpellier, 2017. http://www.theses.fr/2017MONTT053/document.
Oncogene activation promotes aberrant cell proliferation, increasing replication stress and DNA damage. It has been proposed that genomic instability leads to checkpoint inhibition and promotes cancer development (Halazonetis et al. 2008). However, the link between aberrant proliferation, replication stress and DNA breaks is still unclear. We hypothesized that aberrant proliferation leads to more incidents in which DNA and RNA polymerases encounter each other and stall. When the two polymerases meet, the accumulation of positively supercoiled DNA between them induces fork stalling, resulting in the formation of fragile structures such as single-stranded DNA (ssDNA). The ssDNA formed at stalled forks could be a source of DNA breaks, promoting the development of cancer cells. To validate this hypothesis, biologists from our team worked on HeLa cell lines with increased replication-transcription conflicts. I performed the bioinformatics analysis of the following genomic data:
- DRIP-seq: positioning of R-loops on the genome using immunoprecipitation of DNA/RNA hybrids.
- γ-H2AX ChIP-Seq: γ-H2AX is a histone mark found at DNA breaks.
- pRPA ChIP-Seq: positioning of stalled forks using the substrate of ATR kinase, phospho-RPA (S33), as a marker.
Each dataset was produced on control cells and on two cell lines in which TOP1 and ASF/SF2 were depleted by inducible shRNA (shTOP1 and shASF). Topoisomerase 1 is a topological enzyme that relaxes DNA when supercoiling accumulates. ASF/SF2 is part of the splicing complexes that process mRNPs (messenger ribonucleoprotein particles) to prevent the accumulation of R-loops during transcription. Using these data and others from the literature, I determined that the regions at higher risk of inducing replication stress are located downstream of highly transcribed, early-replicated genes, preferentially where DNA and RNA polymerases collide head-on. I also revealed that cancer-related genes are enriched in these regions of the genome.
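The distinction between head-on and co-directional collisions mentioned in this abstract depends only on the relative directions of the replication fork and of transcription. A minimal sketch follows; the function name and the string encodings for strand and fork direction are hypothetical, and the thesis may represent these quite differently.

```python
def collision_type(gene_strand, fork_direction):
    """Classify a replication-transcription encounter.

    gene_strand: "+" (transcribed left to right) or "-" (right to left).
    fork_direction: "right" or "left", the direction in which the fork travels.
    """
    if gene_strand not in {"+", "-"} or fork_direction not in {"left", "right"}:
        raise ValueError("unexpected strand or fork direction")
    transcription_direction = "right" if gene_strand == "+" else "left"
    if transcription_direction == fork_direction:
        return "co-directional"
    return "head-on"


print(collision_type("+", "left"))
print(collision_type("-", "left"))
```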