Dissertations / Theses on the topic 'Genomics'

Consult the top 50 dissertations / theses for your research on the topic 'Genomics.'


1

Batzoglou, Serafim. "Computational genomics : mapping, comparison, and annotation of genomes." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/8629.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.
Includes bibliographical references (leaves 180-191).
The field of genomics provides many challenges to computer scientists and mathematicians. The area of computational genomics has been expanding recently, and the timely application of computer science in this field is proving to be an essential component of the large international effort in genomics. In this thesis we address key issues in the different stages of genome research: planning of a genome sequencing project, obtaining and assembling sequence information, and ultimately study, cross-species comparison, and annotation of finished genomic sequence. We present applications of computational techniques to the above areas: (1) In relation to the early stages of a genome project, we address physical mapping, and we present results on the theoretical problem of finding minimum superstrings of hypergraphs, a combinatorial problem motivated by physical mapping. We also present a statistical and simulation study of "walking with clone-end sequences", an important method for sequencing a large genome.
(2) Turning to the problem of obtaining the finished genomic sequence, we present ARACHNE, a prototype software system for assembling sequence data that are derived from sequencing a genome with the "shotgun" method. (3) Finally, we turn to the computational analysis of finished genomic sequence. We present GLASS, a software system for obtaining global pairwise alignments of orthologous finished sequences. We finally use GLASS to perform a comparative structure and sequence analysis of orthologous human and mouse genomic regions, and develop ROSETTA, the first cross-species comparison-based system for the prediction of protein coding regions in genomic sequences.
by Serafim Batzoglou.
Ph.D.
2

Roidl, Andreas. "“Functional Genomics”." Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-67491.

3

Tiwari, Jitesh. "Assembly and Automated Annotation of the Clostridium scatologenes Genome." TopSCHOLAR®, 2012. http://digitalcommons.wku.edu/theses/1175.

Abstract:
Clostridium scatologenes is an anaerobic bacterium that demonstrates some unusual metabolic traits, such as the production of 3-methyl indole. The availability of genome-level sequencing has lent itself to the exploration and elucidation of unique metabolic pathways in other organisms such as Clostridium botulinum. The Clostridium scatologenes genome, with an estimated length of 4.2 million bp, was sequenced by the Applied Biosystems SOLiD method and the Roche 454 pyrosequencing method. The resulting DNA sequences were combined and assembled with the Newbler Assembler program into 8267 contigs with an average length of 1250 bp. Comparison of the published subunits of the csd gene with the assembled contigs identified one contig that contained all three subunits. In addition, a gene with similarity to Clostridium carboxidivorans butyrate kinase was found next to the csd gene. An alignment of the contig and csd gene sequences identified three deletions in the contig within the 4066 bases of the alignment. This implies an error rate of about 0.07% in the sequencing itself, which requires more finishing. Even without finishing the genome assembly into a single contig, the contigs were annotated in the RAST pipeline, which predicted 2521 protein-encoding genes (PEGs). The PEGs were classified by their metabolic function and compared with the classified PEGs found in the closely related Clostridium species Clostridium carboxidivorans and Clostridium ljungdahlii, which have similarly sized genomes. According to the RAST analysis, Clostridium scatologenes had 35% subsystem coverage of all known metabolic processes with its 2521 PEGs. This compares to 41% for Clostridium carboxidivorans with 4174 PEGs (29) and 42% for Clostridium ljungdahlii with 4184 PEGs (30), indicating that Clostridium scatologenes may still have more genes to be identified. The percentage of genes found in each metabolic subsystem was similar across species, except in motility and chemotaxis.
The contigs, on which the csd gene and tryptophan metabolizing genes lay, were examined to see if additional genes might support these metabolic pathways. Butyrate kinase was associated with the csd genes but no other associations were found for the two tryptophan metabolizing genes. The tryptophan biosynthesis operon genes were all found on one contig (contig 6771) and were syntenic with other bacterial species.
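The error-rate figure quoted in the abstract above follows from simple arithmetic: three deletions across 4066 aligned bases. A minimal sketch of that calculation in Python follows; the function name `error_rate_percent` is illustrative and does not come from the thesis.

```python
def error_rate_percent(errors: int, aligned_bases: int) -> float:
    """Return discrepancies per aligned base, expressed as a percentage."""
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    return 100.0 * errors / aligned_bases


# Figures taken from the abstract: 3 deletions within 4066 aligned bases.
rate = error_rate_percent(3, 4066)
print(f"{rate:.2f}%")  # prints 0.07%
```

The estimate only counts the discrepancies seen in this one alignment, so it should be read as a rough indicator rather than a genome-wide error rate.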
4

St. Jean, Andrew Louis. "Haloarchaeal comparative genomics and the local context model of genomic evolution." Thesis, University of Ottawa (Canada), 1996. http://hdl.handle.net/10393/10308.

Abstract:
Genomics is a rapidly expanding field of research that seeks to study the structure, function and evolution of an organism's genome. Genomic investigations were conducted on three species of haloarchaea, a monophyletic group of prokaryotes belonging to the kingdom Euryarchaeota of the domain Archaea that are adapted to high-salt environments. A physical and genetic map of the genome of Halobacterium salinarum GRB is described. This map and the previously published map of the genome of Haloferax volcanii DS2 were compared with the object of detecting any conservation in the order or spacing of homologous loci between the two genomes. A computer program--COMPAGEN--was developed to aid in the analysis of the data generated by this comparison. No map order conservation could be detected at the 15 kbp average resolution of this comparison between genomes estimated to have diverged 600 million years ago. A second comparison was performed between the chromosomes of Haloferax volcanii DS2 and Haloferax mediterranei ATCC 33500 (R-4). Extensive conservation was found between these two genomes which diverged approximately 80 million years ago showing only three rearrangements: two inversions and a transposition. Conclusions drawn from an analysis of the comparisons include: (1) that higher resolution is required to deal with distantly related genomes, likely making use of sequence data, and (2) that it is important to compare genomes that have diverged at different times if one wishes to investigate the dynamics of genomic evolution within a phylogenetic group. The local context model was developed in an effort to explain the pattern of conservation and divergence seen in these and other prokaryotic genome comparisons. This model states that since the expression of genes is affected by flanking genetic elements, genes will resist changing their position relative to one another so long as this change is likely to alter gene expression in a way deleterious to the cell. 
The local context model thus provides a force promoting the conservation of genomic map order. The implications of this model for the evolution of the haloarchaea are discussed, and future directions of prokaryotic genomics in general are explored.
5

Al-Nuaimi, Bashar. "Ancestral Reconstruction and Investigations of Genomics Recombination on Chloroplasts Genomes." Thesis, Bourgogne Franche-Comté, 2017. http://www.theses.fr/2017UBFCD042/document.

Abstract:
La théorie de l’évolution repose sur la biologie moderne. Toutes les nouvelles espèces émergent d’une espèce existante. Il en résulte que différentes espèces partagent une ascendance commune, telle que représentée dans la classification phylogénétique. L’ascendance commune peut expliquer les similitudes entre tous les organismes vivants, tels que la chimie générale, la structure cellulaire, l’ADN comme matériau génétique et le code génétique. Les individus d’une espèce partagent les mêmes gènes mais (d’ordinaire) différentes séquences d’allèles de ces gènes. Un individu hérite des allèles de leur ascendance ou de leurs parents. Le but des études phylogénétiques est d’analyser les changements qui se produisent dans différents organismes pendant l’évolution en identifiant les relations entre les séquences génomiques et en déterminant les séquences ancestrales et leurs descendants. Une étude de phylogénie peut également estimer le temps de divergence entre les groupes d’organismes qui partagent un ancêtre commun. Les arbres phylogénétiques sont utiles dans les domaines de la biologie, comme la bio informatique, pour une phylogénétique systématique et comparative. L’arbre évolutif ou l’arbre phylogénétique est une exposition ramifiée les relations évolutives entre divers organismes biologiques ou autre existence en fonction des différences et des similitudes dans leurs caractéristiques génétiques. Les arbres phylogénétiques sont construits à partir de données moléculaires comme les séquences d’ADN et les séquences de protéines. Dans un arbre phylogénétique, les nœuds représentent des séquences génomiques et s’appellent des unités taxonomiques. Chaque branche relie deux nœuds adjacents. Chaque séquence similaire sera un voisin sur les branches extérieures, et une branche interne commune les reliera à un ancêtre commun. Les branches internes sont appelées unités taxonomiques hypothétiques. 
Ainsi, les unités taxonomiques réunies dans l’arbre impliquent d’être descendues d’un ancêtre commun. Notre recherche réalisée dans cette dissertation met l’accent sur l’amélioration des prototypes évolutifs appropriés et des algorithmes robustes pour résoudre les problèmes d’inférence phylogénétiques et ancestrales sur l’ordre des gènes et les données ADN dans l’évolution du génome complet, ainsi que leurs applications.[...]
The theory of evolution is based on modern biology. All new species emerge from an existing species. As a result, different species share common ancestry, as represented in the phylogenetic classification. Common ancestry may explain the similarities between all living organisms, such as general chemistry, cell structure, DNA as genetic material, and the genetic code. Individuals of one species share the same genes but (usually) different allele sequences of these genes. An individual inherits alleles from its ancestors or parents. The goal of phylogenetic studies is to analyze the changes that occur in different organisms during evolution by identifying the relationships between genomic sequences and determining the ancestral sequences and their descendants. A phylogenetic study can also estimate the time of divergence between groups of organisms that share a common ancestor. Phylogenetic trees are useful in fields of biology such as bioinformatics, for systematic and comparative phylogenetics. The evolutionary tree, or phylogenetic tree, is a branching diagram showing the evolutionary relationships among various biological organisms or other entities, based on the differences and similarities in their genetic characteristics. Phylogenetic trees are built from molecular data such as DNA sequences and protein sequences. In a phylogenetic tree, the nodes represent genomic sequences and are called taxonomic units. Each branch connects two adjacent nodes. Similar sequences will be neighbors on the outer branches, and a common internal branch will link them to a common ancestor. Internal branches are called hypothetical taxonomic units. Thus, taxonomic units joined in the tree are implied to descend from a common ancestor.
Our research in this dissertation focuses on improving appropriate evolutionary prototypes and robust algorithms for solving phylogenetic and ancestral inference problems on gene order and DNA data in whole-genome evolution, as well as their applications.
6

Gaiarsa, S. "EVOLUTION, COMPARATIVE GENOMICS AND GENOMIC EPIDEMIOLOGY OF BACTERIA OF PUBLIC HEALTH IMPORTANCE." Doctoral thesis, Università degli Studi di Milano, 2017. http://hdl.handle.net/2434/525881.

Abstract:
La presente tesi è incentrata sull'epidemiologia genomica delle infezioni batteriche ospedaliere. L'ambiente ospedaliero è peculiare, in quanto al suo interno si concentrano un elevato numero di agenti batterici, pazienti con un sistema immunitario debole e un uso massiccio di sostanze antimicrobiche. Questa combinazione favorisce lo sviluppo e la selezione di ceppi resistenti agli antibiotici e la diffusione di infezioni opportunistiche: in generale il prosperare dei patogeni nosocomiali. Alcune tecniche all'avanguardia per lo studio di questo tipo di infezioni sono basate sull’uso della genomica e di approcci evoluzionistici: esse permettono di conoscere le caratteristiche genomiche dei ceppi batterici e di ricostruire la loro storia evolutiva. Grazie alla possibilità di sequenziare il DNA ad un prezzo sempre più economico, i progetti di ricerca sono supportati da un numero sempre crescente di genomi e i dati genomici depositati nelle banche dati sono in crescita esponenziale: questo rende possibile eseguire una varietà sempre maggiore di analisi. Il primo lavoro qui riportato descrive l'evoluzione del Clonal Complex 258 (CC258) di Klebsiella pneumoniae. Le mutazioni puntiformi (single nucleotide polymorphism, SNP) hanno permesso di ricostruire la filogenesi globale di tutta la specie e di collocare il CC258 nel suo contesto evolutivo. Successivamente, è stato possibile rilevare la presenza di una ricombinazione di 1,3 Mb nei genomi del clade in analisi. Un’analisi del molecular clock ha poi consentito di datare sia questo che gli altri eventi di ricombinazione scoperti in lavori precedenti. Questi risultati sono stati usati per completare il quadro della storia evolutiva del CC258, caratterizzata da frequenti eventi di macro-ricombinazione. Un’evoluzione rapida e caratterizzata da scambi di elevate quantità di informazioni genomiche è una caratteristica comune ad altri patogeni nosocomiali che sviluppano fenotipi da "superbatteri". 
Sebbene frequente, il modello di evoluzione per macro-ricombinazioni non è comune a tutti i batteri responsabili di infezioni nosocomiali. Un’eccezione è il ceppo SMAL di Acinetobacter baumannii, presentato in un altro sottoprogetto di questa tesi. In questo lavoro sono stati analizzati i genomi del sequence type (ST) 78 di A. baumannii. La filogenesi e la genomica comparativa hanno rivelato la presenza di due differenti cladi all'interno del ST che presentano differenti "stili" evolutivi. Un gruppo (contenente i genomi SMAL) è caratterizzato da una minore variabilità del contenuto genico e dalla presenza di un numero più elevato di copie di insertion sequence (IS). Una IS interrompe il gene comEC/rec2 in tutti i genomi SMAL. Questo gene codifica per una proteina coinvolta nell’acquisizione del DNA esogeno, quindi la sua inattivazione limita lo scambio di geni. Questo suggerisce una spiegazione per la bassa plasticità genomica. In un altro lavoro presentato in questa tesi, l'epidemiologia genomica è stata applicata per ricostruire la diffusione di un focolaio epidemico di K. pneumoniae in un’unità di terapia intensiva ospedaliera. In un primo momento, è stato utilizzato un approccio filogenetico per separare gli isolati appartenenti all'epidemia da quelli sporadici. Poi le date di isolamento e gli SNP genomici hanno permesso di costruire una rete genomica che modellasse la propagazione delle infezioni nel reparto. La ricostruzione ha indicato una diffusione radiale del patogeno dal paziente zero a tutti gli altri infetti, rivelando così un errore sistematico nelle procedure di biosicurezza dell'ospedale. Questa applicazione quasi forense dell'epidemiologia genomica è stata utilizzata anche in altri due lavori qui presentati, entrambi riguardanti la ricostruzione di infezioni alimentari. In uno degli articoli, incentrato su Salmonella enterica, l’analisi filogenetica è stata eseguita solamente con gli SNP sinonimi al fine di filtrare le mutazioni patoadattative. 
Nell'altro lavoro sono stati utilizzati dati epidemiologici, tipizzazione molecolare e filogenesi basata sugli SNP per studiare l'infezione di nove isolati di Listeria monocytogenes, che si ritenevano essere parte dello stesso focolaio e alla fine sono risultati genomicamente non correlati. Infine, viene qui presentato anche un articolo di review riguardante l'epidemiologia genomica. L'articolo è focalizzato sulle ultime pubblicazioni ad alto impatto che analizzano l'evoluzione genomica degli agenti patogeni batterici e le dinamiche di propagazione delle epidemie in brevi periodi di tempo. L'articolo descrive, infine, le ultime ricostruzioni epidemiologiche a livello storico, che sono possibili grazie alle moderne tecnologie di isolamento e sequenza del DNA.
The present thesis focuses on the genomic epidemiology of bacterial hospital infections. The hospital environment is unique, as it concentrates a high number of bacterial agents, frequent antibiotic use, and patients with weak immune systems. This combination favours the development and selection of antibiotic-resistant strains and the spread of opportunistic infections: in general, the thriving of nosocomial pathogens. Genomics and evolutionary approaches have emerged as cutting-edge tools for studying this kind of infection, making it possible to study the genomic features of bacterial strains and their evolution. Because DNA can be sequenced at an ever lower price, research projects are supported by a growing number of genomes, and a considerable amount of genomic data is available in the databases, expanding the range of investigations that can be performed. The first work presented here describes the evolution of Clonal Complex 258 (CC258) of Klebsiella pneumoniae. Single nucleotide polymorphisms (SNPs) made it possible to reconstruct the global phylogeny of the entire species and to place CC258 in its evolutionary context. Furthermore, it was possible to detect the presence of a 1.3 Mb recombination in the genomes of the clade under analysis. A molecular clock approach made it possible to date this and other previously discovered recombination events. These findings were used to complete the picture of the evolutionary history of CC258, which is characterized by frequent macro-recombination events. A rapid evolutionary strategy characterized by the exchange of large amounts of genomic information is a feature shared with other nosocomial pathogens that develop “superbug” phenotypes. Although common, the macro-recombination evolution model is not shared by all bacteria that cause nosocomial infections. One exception is the SMAL strain of Acinetobacter baumannii, presented in another subproject of this thesis. In this work, the genomes of Sequence Type (ST) 78 of A. baumannii were analyzed. Phylogeny and comparative genomics revealed the presence of two different clades within the ST, with different evolutionary “lifestyles”. One group (containing the SMAL genomes) was characterized by lower variability in gene content and by a higher copy number of insertion sequences (ISs). One IS interrupts the comEC/rec2 gene in all the SMAL genomes. This gene codes for a protein involved in the import of exogenous DNA, so its inactivation limits gene exchange, suggesting an explanation for the low genomic plasticity. In another work presented in this document, genomic epidemiology was applied to reconstruct the spreading routes of a K. pneumoniae outbreak in a hospital intensive care unit. At first, a phylogenetic approach was used to separate the isolates that belonged to the outbreak from the sporadic ones. Then the isolation dates and genomic SNPs were used to build a genomic network that modelled the chain of infection events in the ward. The reconstruction suggested a star-like diffusion of the pathogen from patient zero to the other infected patients, thus revealing a systematic error in the biosafety procedures of the hospital. This almost-forensic application of genomic epidemiology was also used in two other works presented here, both concerning the reconstruction of food-borne infections. In one of the works, focused on Salmonella enterica, only synonymous SNPs were used as input to a phylogeny-based investigation, in order to filter out pathoadaptive mutations. In the other article, epidemiological data, molecular typing and SNP-based phylogeny were used to investigate nine Listeria monocytogenes isolates, which were believed to be part of the same outbreak and in the end proved to be genomically unrelated. Lastly, a review paper on genomic epidemiology is also presented. The article focuses on the latest high-impact publications analyzing the genome evolution of bacterial pathogens as well as the propagation dynamics of epidemic outbreaks over very short periods of time. The article also describes the latest historical epidemiological studies, which are possible thanks to modern DNA isolation and sequencing technologies.
7

Kern, Andrew David. "Drosophila population genomics." For electronic version search Digital dissertations database. Restricted to UC campuses. Access is free to UC campus dissertations, 2005. http://uclibs.org/PID/11984.

8

Loman, Nicholas James. "Comparative bacterial genomics." Thesis, University of Birmingham, 2012. http://etheses.bham.ac.uk//id/eprint/2839/.

Abstract:
For the most part, diagnostic clinical microbiology still relies on 19th century ideas and techniques, particularly microscopy and laboratory culture. In this thesis I investigate the utility of a new approach, whole-genome sequencing (WGS), to tackle current issues in infectious disease. I present four studies. The first demonstrates the utility of WGS in a hospital outbreak of Acinetobacter baumannii. The second study uses WGS to examine the evolution of drug resistance following antibiotic treatment. I then explore the use of WGS prospectively during an international outbreak of food-borne Escherichia coli infection, which caused over 50 deaths. The final study compares the performance of benchtop sequencers applied to the genome of this outbreak strain and touches on the issue of whether WGS is ready for routine use by clinical and public health laboratories. In conclusion, through this programme of work, I provide ample evidence that whole-genome sequencing of bacterial pathogens has great potential in clinical and public health microbiology. However, a number of technical and logistical challenges have yet to be addressed before such approaches can become routine.
9

Lin, Ying. "Development and assessment of machine learning attributes for ortholog detection." Access to citation, abstract and download form provided by ProQuest Information and Learning Company; downloadable PDF file 0.31 Mb., 65 p, 2006. http://wwwlib.umi.com/dissertations/fullcit/3220791.

10

Meng, Da. "Bioinformatics tools for evaluating microbial relationships." Pullman, Wash. : Washington State University, 2009. http://www.dissertations.wsu.edu/Dissertations/Spring2009/d_meng_042209.pdf.

Abstract:
Thesis (Ph. D.)--Washington State University, May 2009.
Title from PDF title page (viewed on June 8, 2009). "School of Electrical Engineering and Computer Science." Includes bibliographical references.
11

Shearer, Aiden Eliot. "Deafness in the genomics era." Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/4750.

Abstract:
Deafness is the most common sensory deficit in humans, affecting 278 million people worldwide. Non-syndromic hearing loss (NSHL), hearing loss not associated with other symptoms, is the most common type of hearing loss, and most NSHL in developed countries is due to a genetic cause. The inner ear is a remarkably complex organ, and as such, there are estimated to be hundreds of genes with mutations that can cause hearing loss. To date, 62 of these genes have been identified. This extreme genetic heterogeneity has made comprehensive genetic testing for deafness all but impossible with low-throughput genetic testing methods that sequence a single gene at a time. The Human Genome Project was completed in 2003. Soon after, genomic technologies, including massively parallel sequencing (MPS), were developed. MPS makes it possible to sequence millions or billions of DNA base pairs of the genome simultaneously. The goal of my thesis work was to use these newly developed genomic technologies to create a comprehensive genetic testing platform for deafness and use this platform to answer key scientific questions about genetic deafness. This platform would need to be relatively inexpensive, highly sensitive, and accurate enough for clinical diagnostics. In order to accomplish this goal, we first determined the best methods to use for this platform by comparing available methods for isolating all exons of all genes implicated in deafness, as well as available massively parallel sequencers. We performed this pilot study on a limited number of patient samples, but were able to determine that solution-phase targeted genomic enrichment (TGE) and Illumina sequencing presented the best combination of sensitivity and cost. We decided to call this platform and diagnostic pipeline OtoSCOPE®. Also during this study, we identified several weaknesses in the standard method for TGE that we sought to improve.
The next aim was to focus on these weaknesses to develop an improved protocol for TGE that was highly reproducible and efficient. We developed a new protocol and tested the limits of sequencer capacity. These findings allowed us to translate OtoSCOPE® to the clinical setting and use it to perform comprehensive genetic testing on a large number of individuals in research studies. Finally, we used the OtoSCOPE® platform to answer crucial questions about genetic deafness that had remained unanswered due to the low-throughput genetic testing methods available previously. By screening 1,000 normal hearing individuals from 6 populations we determined the carrier frequency for non-DFNB1 recessive deafness-causing mutations to be 3.3%. Our findings will also help us to interpret variants uncovered during analysis of deafness genes in affected individuals. When we used OtoSCOPE® to screen 100 individuals with apparent genetic deafness, we were able to provide a genetic diagnosis in 45%, a large increase compared to previous gene-by-gene sequencing methods. Because it provides a pinpointed etiological diagnosis, genetic testing with a comprehensive platform like OtoSCOPE® could provide an attractive alternative to the newborn hearing screen. In addition, this research lays the groundwork for molecular therapies to restore or reverse hearing loss that are tailored to specific genes or genetic mutations. Therefore, a molecular diagnosis with a comprehensive platform like OtoSCOPE® is integral for those affected by hearing loss.
12

Seibert, Sara Rose. "Host-parasite interactions: comparative analyses of population genomics, disease-associated genomic regions, and host use." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1590585260282244.

13

De, Lazzari Eleonora. "Gene families distributions across bacterial genomes : from models to evolutionary genomics data." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066406/document.

Abstract:
La génomique comparative est un sujet essentiel pour éclaircir la biologie évolutionnaire. La première étape pour dépasser une connaissance seulement descriptive est de développer une méthode pour représenter le contenu du génome. Nous avons choisi la représentation modulaire des génomes pour étudier les lois quantitatives qui réglementent leur composition en unités élémentaires de type fonctionnel ou évolutif. La première partie de la thèse se fonde sur l'observation que le nombre de domaines ayant la même fonction est lié à la taille du génome par une loi de puissance. Puisque les catégories fonctionnelles sont des agrégats de familles de domaines, on se demande comment le nombre de domaines dans la même catégorie fonctionnelle est lié à l'évolution des familles. Le résultat est que les familles suivent également une loi de puissance. Le deuxième partie présente un modèle positif qui construit une réalisation à partir des composants liés dans un réseau de dépendance. L'ensemble de toutes les réalisations reproduit la distribution des composants partagés et la relation entre le nombre de familles distinctes et la taille du génome. Le dernier chapitre étend l'approche modulaire aux écosystèmes microbiens. Sur la base des constatations que nous avons faites sur les lois de puissance pour les familles de domaines, nous avons analysé comment le nombre de familles dans un metagénome en est influencé. Par conséquence, nous avons défini une nouvelle observable dont la forme fonctionnelle comprend des informations quantitatives sur la composition originelle du metagénome
Comparative genomics is a fundamental discipline for unravelling evolutionary biology. To move beyond merely descriptive knowledge, the first challenge is to develop a higher-level description of the content of a genome. We therefore used the modular representation of genomes to explore the quantitative laws that govern how genomes are built from elementary functional and evolutionary ingredients. The first part starts from the observation that the number of domains sharing the same function increases as a power law of the genome size. Since functional categories are aggregates of domain families, we asked how the abundance of domains performing a specific function emerges from evolutionary moves at the family level. We found that domain families are also characterized by family-dependent scaling laws. The second chapter provides a theoretical framework for the emergence of shared components from dependency in empirical component systems with non-binary abundances. We defined a positive model that builds a realization from a set of components linked in a dependency network. The ensemble of resulting realizations reproduces both the distribution of shared components and the law for the growth of the number of distinct families with genome size. The last chapter extends the component systems approach to microbial ecosystems. Using our findings about family scaling laws, we analyzed how the abundance of domain families in a metagenome is affected by the constraint of power-law scaling of family abundance in individual genomes. The result is the definition of an observable whose functional form contains quantitative information on the original composition of the metagenome.
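The power-law relation mentioned in the abstract above (domain count growing as a power of genome size) can be illustrated with a small fit. The sketch below estimates the exponent by ordinary least squares on log-transformed values, which is one common approach and not necessarily the method used in the thesis. The function name `fit_power_law` and all numbers are hypothetical; the data are synthetic and generated only for illustration.

```python
import math


def fit_power_law(genome_sizes, domain_counts):
    """Fit counts = c * size**beta by least squares on log-transformed data."""
    xs = [math.log(s) for s in genome_sizes]
    ys = [math.log(n) for n in domain_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope in log-log space = exponent
    c = math.exp(mean_y - beta * mean_x)  # intercept gives the prefactor
    return c, beta


# Synthetic illustration only: counts generated from an exact power law.
sizes = [1000, 2000, 4000, 8000]
counts = [0.01 * s ** 1.5 for s in sizes]
c, beta = fit_power_law(sizes, counts)
print(round(beta, 3), round(c, 4))
```

On this noise-free input the fit recovers the exponent and prefactor used to generate the data, up to floating-point error. Real domain counts would scatter around the fitted line.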
14

Mariotti, Marco. "Computational genomics of selenoproteins." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/295583.

Abstract:
Selenoproteins are a diverse class of proteins containing selenocysteine, the 21st amino acid. Selenocysteine is inserted co-translationally, recoding very specific UGA codons through dedicated machinery. Standard gene prediction programs consider UGA only as a translational stop, and for this reason selenoprotein genes are typically misannotated. In the past years, we developed computational tools to predict selenoproteins at genomic scale. With these, we characterized the set of selenoproteins across many sequenced genomes, and we inferred their phylogenetic history. We dedicated particular attention to selenophosphate synthetase, a selenoprotein family required for selenocysteine biosynthesis, which can be used as a marker of the selenocysteine coding trait. We show that selenoproteins went through a very diverse evolution in different lineages. While very conserved in vertebrates, selenoproteins were lost independently in many other organisms. Using genome sequencing, we traced with precision the path of genomic events that led to recent selenoprotein extinctions in certain fruit flies.
Selenoproteins form a heterogeneous class of proteins that contain selenocysteine, the 21st amino acid. Selenocysteine is inserted during translation, recoding very specific UGA codons by means of dedicated machinery. Standard gene prediction programs interpret the UGA codon only as a translational stop signal, and for this reason selenoprotein genes are usually misannotated. In recent years, we have developed computational tools to predict selenoproteins at genomic scale. With these, we characterized the set of selenoproteins in the genomes that have been sequenced and inferred their phylogenetic history. We paid particular attention to the selenophosphate synthetase family, a selenoprotein required for selenocysteine synthesis, which can therefore be used as a marker of selenocysteine coding. We show that selenoproteins have undergone very diverse evolution in different lineages. Although highly conserved in vertebrates, selenoproteins were lost independently in many other organisms. Thanks to genome sequencing, we traced precisely the events that led to the extinction of selenoproteins in several Drosophila species.
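The misannotation problem described in this abstract can be made concrete with a toy translator. The sketch below is only an illustration under stated assumptions: it uses the standard genetic code, and it takes the positions of recoded UGA codons as a given input (the hypothetical `recoded_uga` parameter), whereas real selenoprotein prediction has to infer those positions from sequence signals that are not modelled here. It is not the tool developed in the thesis.

```python
from itertools import product

# Standard genetic code, bases ordered U, C, A, G with the third base varying fastest.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna, recoded_uga=frozenset()):
    """Translate an in-frame RNA sequence until the first stop codon.

    A UGA codon whose start index is listed in recoded_uga is read as
    selenocysteine (one-letter code U) instead of as a stop.
    """
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon == "UGA" and i in recoded_uga:
            protein.append("U")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# Hypothetical mini transcript: AUG GCU UGA GGA UAA
mrna = "AUGGCUUGAGGAUAA"
standard_reading = translate(mrna)               # UGA treated as stop
recoded_reading = translate(mrna, {6})           # UGA at index 6 read as Sec
```

With the standard reading the product is truncated at the in-frame UGA, which is how a conventional gene predictor would see it; with recoding the reading frame continues to the next stop codon.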
15

Axelsson, Erik. "Comparative Genomics in Birds." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7432.

17

Östlund, David. "Genomics in the Cloud." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-453802.

Abstract:
The continued cost reduction for sequencing genomics data is causing an exponential growth in the amount of data available. Moving both storage and calculation of this data to the cloud has been a common trend, but the way to do it is not always obvious. This report compares three different alternatives for doing ad-hoc queries in a cloud-based setting: two solutions using data lakes and one solution using a relational database hosted in the cloud. The data lake solutions proved to be easy to set up and fully functional for querying genomics data. The relational database was more complicated to set up, but the queries were more time efficient and more cost efficient when performing more than 1200 queries per month on at least 100 GB of data. To make cloud computing possible for genomics data, the data had to be transformed into a file format supported by the cloud providers. For this purpose the Parquet file format was chosen, tested, and proven to work well.
18

van Tonder, Andries J. "Pneumococcal genomics and evolution." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:de6ba2cc-a1a6-4c86-ad43-6e4e785be46a.

Abstract:
Streptococcus pneumoniae (the 'pneumococcus') is a global pathogen responsible for nearly a million deaths each year in children under the age of five. The polysaccharide capsule or serotype is the primary pneumococcal virulence factor and is encoded by the cps locus. The introduction of pneumococcal conjugate vaccines has led to a significant reduction in the prevalence of invasive diseases caused by vaccine serotypes and a concomitant reduction of vaccine serotypes recovered from the nasopharynx of healthy children, although the overall rates of carriage have remained constant due to the replacement of vaccine serotypes with non-vaccine serotypes. The increasing numbers of genomes from large well-characterised pneumococcal datasets allow key biological questions related to the pneumococcus, such as capsular diversity, to be examined in unprecedented detail. This thesis describes the application of genomics to investigating important aspects of pneumococcal biology including the pan-genome, cps locus diversity and genomic regions potentially associated with invasive disease. The first part of the thesis described a Bayesian decision model for estimating bacterial core genomes. This model was applied to genome datasets from five different bacterial species. The same model was later used to estimate core genomes for four different pneumococcal carriage datasets. These estimated core genomes were then compared and revealed a very small shared core genome of 303 genes. The results also highlighted the unexpected diversity of a pneumococcal dataset from Thailand. The second part of the thesis examined the diversity of the cps locus starting with serogroup 6 and in particular the recently described novel serotype 6E.
This analysis revealed that the majority of what had previously been described as serotype 6B pneumococci were in fact serotype 6E and serological assays demonstrated that vaccine-induced serotype 6B antibodies were able to kill serotype 6E pneumococci. The analysis of cps loci was then extended to consider the diversity of 49 serotypes and included 5,405 genomes in the analyses. These analyses revealed several hybrid and putative novel serotypes. The final part of the thesis used a genome-wide association study to identify 89 candidate loci significantly associated with invasive disease. The results in this thesis showed that large numbers of genome sequences could provide a detailed insight into important aspects of pneumococcal biology.
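The abstract estimates core genomes with a Bayesian decision model whose details are not given here. As a much simpler stand-in, the following Python sketch uses a plain presence-frequency threshold to show what "core genome" and "shared core genome" mean operationally. The function name `core_genome`, the `min_fraction` parameter, and the toy isolate and gene names are all hypothetical and do not come from the thesis.

```python
def core_genome(genomes, min_fraction=1.0):
    """Return genes present in at least min_fraction of the genomes.

    genomes maps an isolate name to an iterable of gene identifiers.
    This is a naive threshold rule, not a Bayesian decision model.
    """
    counts = {}
    for genes in genomes.values():
        for gene in set(genes):            # count each gene once per genome
            counts[gene] = counts.get(gene, 0) + 1
    needed = min_fraction * len(genomes)
    return {gene for gene, seen in counts.items() if seen >= needed}

# Two toy datasets with made-up gene names
dataset_a = {
    "iso1": {"g1", "g2", "g3"},
    "iso2": {"g1", "g2"},
    "iso3": {"g1", "g3", "g4"},
}
dataset_b = {
    "iso4": {"g1", "g5"},
    "iso5": {"g1", "g2", "g5"},
}

strict_core_a = core_genome(dataset_a)                     # present in every isolate
relaxed_core_a = core_genome(dataset_a, min_fraction=0.6)  # present in at least 60%
shared_core = core_genome(dataset_a) & core_genome(dataset_b)
```

Intersecting per-dataset cores, as in the last line, mirrors the comparison step in the abstract: the shared core can only shrink as more datasets are added, which is one reason a shared core can be small.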
19

Sheikh, Sanea. "Genomics of Sorocarpic Amoebae." Doctoral thesis, Uppsala universitet, Systematisk biologi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-320432.

Abstract:
Sorocarpy is the aggregation of unicellular organisms to form multicellular fruiting bodies (sorocarps). This thesis is about the two best-known groups of sorocarpic amoebae, Dictyostelids and Acrasids. Paper I describes assembly and analysis of a multigene dataset to identify the root of the dictyostelid tree. Phylogenetic analyses of 213 genes (conserved in all sequenced dictyostelid genomes and an outgroup) place the root between Groups 1+2 and 3+4 (now: Cavenderiaceae + Acytosteliaceae and Raperosteliaceae + Dictyosteliaceae). Resolution of the dictyostelid root made it possible to proceed with a major taxonomic revision of the group. Paper II focuses on the taxonomic revision of Dictyostelia based on molecular phylogeny and SSU ribosomal RNA sequence signatures. The two major divisions were treated at the rank of order as Acytosteliales ord. nov. and Dictyosteliales. The two major clades within each of these orders were given the rank of family. Twelve genera were recognized. This is the first revision of a major protist taxon using molecular signatures and offers guidelines for taxonomic revision of protist groups where morphology is insufficient. Paper III presents the mitochondrial genome (mtDNA) of Acrasis kona. Over a quarter of the genome consists of novel open reading frames, while 16 genes present in the mtDNA of its relative, Naegleria gruberi, are missing. We identified many of these genes in the A. kona nuclear DNA, and used phylogenetic analyses to show that most of these genes arose by transfer from mtDNA. Paper IV presents the nuclear genome of A. kona, the second genome sequence of a free-living excavate. The 44 Mb genome has 15,868 open reading frames of which 4,987 are novel. A surprising number of genes are most similar to homologs in distant relatives, suggesting acquisition by horizontal gene transfer (HGT). 
Most HGT candidates are expressed and many constitute multi-gene families and/or have acquired introns and membrane targeting sequences. Strong HGT candidates include some genes essential to development and signaling in Dictyostelia. Flagellar motility and meiosis genes are also present and conserved, suggesting cryptic flagellar and sexual stages.
20

Padmanabhan, Babu roshan. "Taxano-genomics, a strategy incorporating genomic data into the taxonomic description of human bacteria." Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM5056.

Abstract:
My doctoral project was to create a taxono-genomics pipeline for comparing multiple bacterial genomes. Second, I automated the assembly (NGS) and annotation process using various open-source software as well as in-house scripts written for the laboratory. Finally, we incorporated the pipeline into the description of several bacterial species from our laboratory. This thesis is divided mainly into taxono-genomics and microbiogenomics. The reviews in the taxono-genomics section describe the technological advances in genomics and metagenomics relevant to medical microbiology, present the taxono-genomic strategy in detail, and explain how the polyphasic strategy combined with genomic approaches is reshaping the definition of bacterial taxonomy. The articles describe clinically important bacteria, their whole-genome sequencing, and the genomic, comparative genomic and taxono-genomic studies of these bacteria. In this thesis, I have included the articles describing these organisms: Megasphaera massiliensis, Corynebacterium ihumii, Collinsella massiliensis, Clostridium dakarense, Bacillus dielmoensis, Oceanobacillus jeddahense, Occidentia massiliensis, Necropsobacter rosorum and Pantoea septica.
My PhD project was to create a taxono-genomics pipeline for the comparison of multiple bacterial genomes. Secondly, I automated the process of assembly (NGS) and annotation using various open-source software as well as in-house scripts created for the lab. Finally, we incorporated the pipeline in describing several bacterial species from our lab. This thesis is subdivided mainly into taxono-genomics and microbiogenomics. The reviews in the taxono-genomics section describe the technological advances in genomics and metagenomics relevant to the field of medical microbiology, present the taxono-genomics strategy in detail, and explain how the polyphasic strategy along with genomic approaches is reshaping the definition of bacterial taxonomy. The articles describe clinically important bacteria, their whole-genome sequencing, and the genomic, comparative genomic and taxono-genomic studies of these bacteria.
21

Eriksen, Niklas. "Combinatorial methods in comparative genomics." Doctoral thesis, KTH, Mathematics, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3508.

22

Lemay, Matthew Alexander. "Ecological genomics of kokanee salmon." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/48528.

Abstract:
Divergent natural selection across a heterogeneous landscape can drive the evolution of locally adapted populations in which phenotypic variation is fine-tuned to the environment. At the molecular level, such processes can be inferred by identifying correlations between genetic variation and environmental variables. In this dissertation, I used multiple complementary approaches to investigate the genetic basis of adaptation in natural populations of kokanee, the freshwater form of sockeye salmon (Oncorhynchus nerka). In Chapter 2, I found that the frequency and length of alleles in the circadian regulation gene, OtsClock1b, display a predictable distribution with respect to latitude among lakes sampled from British Columbia to Alaska, providing evidence that variation at this locus may be locally adapted to divergent photoperiod regimes. In Chapter 3, I tested for transcriptome-wide patterns of sequence divergence in reproductive ecotypes of kokanee within Okanagan Lake and found evidence for differential gene expression and asymmetrical pathogen load between ecotypes. In Chapter 4, I used restriction site associated DNA sequencing to identify ~6,000 single nucleotide polymorphisms (SNPs) from multiple spawning populations of kokanee within Okanagan Lake; statistical outlier tests revealed 20 SNPs that were putatively under divergent selection between ecotypes, many of which annotated to genes associated with early development. While there was no evidence for neutral genetic divergence, outlier SNPs demonstrated significant structure with respect to ecotype and had high assignment accuracy (>99%) in mixed composition simulations, suggesting the utility of these loci for genetic stock identification. These data support the hypothesis that kokanee ecotypes are in the early stages of ecological differentiation, making them an ideal system for investigating the genomic basis of adaptation.
Results from this study will be used to assist conservation and management initiatives by providing molecular tools for in-season monitoring of ecotype abundance.
Irving K. Barber School of Arts and Sciences (Okanagan)
Biology, Department of (Okanagan)
Graduate
23

Schéele, Camilla. "Functional genomics studies of PINK1." Stockholm : Karolinska institutet, 2007. http://diss.kib.ki.se/2007/978-91-7357-376-4/.

24

Sampson, Joshua Neil. "Clustering genes in genetical genomics." Thesis, 2007. http://hdl.handle.net/1773/9549.

25

Chen, Lin. "Causal modeling in quantitative genomics." Thesis, 2008. http://hdl.handle.net/1773/9577.

26

Greninger, Alexander L. "Genomics and Proteomics of Picornaviruses." Thesis, University of California, San Francisco, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3558423.

Abstract:

Viruses have long been noted to be composed simply of nucleic acid and protein. This thesis explores that confluence, studying viruses at the interface of genomics and proteomics. Chapter 2 describes the discovery of klassevirus, a new picornavirus in pediatric diarrhea. Chapter 3 shows that klassevirus is likely a human pathogen given the seroconversion of klassevirus-positive individuals against a klassevirus non-structural protein that is not present in the picornavirus virion. Subsequent work failed to obtain a culturable virus from klassevirus-positive stool samples, prompting the transition to culture-independent methods of characterizing picornavirus-host protein interactions. Chapter 4 describes the use of affinity purification mass spectrometry (AP-MS) to discover a novel picornavirus 3A-ACBD3-PI4KB complex that promotes viral replication in the enteroviruses and kobuviruses. Chapter 5 extends the methodology to describe a novel host protein interactor of ACBD3 (TBC1D22A/B), whose interaction is altered specifically by the kobuvirus 3A protein. This complex also demonstrates significant interaction with the klassevirus 3A protein, suggesting that the AP-MS work may inform the biology of the uncultured virus. Finally, Chapter 6 describes future directions that are opened up by this work.

27

Wang, Ping 1968 Feb 17. "Structure genomics of zebrafish granulins." Thesis, McGill University, 2004. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=81452.

Abstract:
This study focused on the structural genomics of granulin/epithelin modules (GEMs). Expression, purification and NMR studies of 14 of the 19 known zebrafish GEMs allowed an assessment of the degree of structural diversity in their C-terminal subdomains. Notably, one well-folded zebrafish GEM was obtained, and its three-dimensional structure was determined with high accuracy using NOE, H-bond, dihedral angle and residual dipolar coupling constraints. Solution structure determination and 15N NMR relaxation measurements indicate that one unique proline residue of the zebrafish GEM may confer the well-structured and stable folding of the stack of all four beta-hairpins, in contrast to other members of the GEM protein family.
28

Hershman, Steven Gregory. "Personal Genomics and Mitochondrial Disease." Thesis, Harvard University, 2013. http://dissertations.umi.com/gsas.harvard:10863.

Abstract:
Mitochondrial diseases involving dysfunction of the respiratory chain are the most common inborn errors of metabolism. Mitochondria are found in all cell types besides red blood cells; consequently, patients can present with any symptom in any organ at any age. These diseases are genetically heterogeneous, and exhibit maternal, autosomal dominant, autosomal recessive and X-linked modes of inheritance. Historically, clinical genetic evaluation of mitochondrial disease has been limited to sequencing of the mitochondrial DNA (mtDNA) or several candidate genes. As human genome sequencing transformed from a research grade effort costing $250,000 to a clinical test orderable by doctors for under $10,000, it has become practical for researchers to sequence individual patients. This thesis describes our experiences in applying "MitoExome" sequencing of the mtDNA and exons of >1000 nuclear genes encoding mitochondrial proteins in ~200 patients with suspected mitochondrial disease. In 42 infants, we found that 55% harbored pathogenic mtDNA variants or compound heterozygous mutations in candidate genes. The pathogenicity of two nuclear genes not previously linked to disease, NDUFB3 and AGK, was supported by complementation studies and evidence from multiple patients, respectively. In an additional two unrelated children presenting with Leigh syndrome and combined OXPHOS deficiency, we identified compound heterozygous mutations in MTFMT. Patient fibroblasts exhibit severe defects in mitochondrial translation that can be rescued by exogenous expression of MTFMT. Furthermore, patient fibroblasts have dramatically reduced fMet-tRNA(Met) levels and an abnormal formylation profile of mitochondrially translated COX1. These results demonstrate that MTFMT is critical for human mitochondrial translation. Lastly, to facilitate evaluation of copy number variants (CNVs), we developed a web-interface that integrates CNV calling with genetic and phenotypic information.
Additional diagnoses are suggested, and in a male with ataxia, neuropathy, azoospermia, and hearing loss we found a deletion compounded with a missense variant in D-bifunctional protein, HSD17B4, a peroxisomal enzyme that catalyzes beta-oxidation of very long chain fatty acids. Retrospective review of metabolic testing from this patient revealed alterations of long- and very-long chain fatty acid metabolism consistent with a peroxisomal disorder. This work expands the molecular basis of mitochondrial disease and has implications for clinical genomics.
29

Rodrigues, Ana Paula da Conceição. "Target selection in structural genomics." Thesis, University of York, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.428428.

30

Manee, Manee. "Comparative genomics of noncoding DNA." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/comparative-genomics-of-noncoding-dna(d16aa46c-b8a2-4e6c-b825-d4246d3775fa).html.

Abstract:
High levels of primary sequence conservation are observed in many noncoding regions of eukaryotic genomes. These conserved noncoding elements (CNEs) have been shown to be robust indicators of functionally constrained elements. Nevertheless, the function of only a small fraction of such CNEs is known and their role in genome biology remains largely a mystery. Comparative genomics analysis in model organisms can shed light on CNE function and evolution of noncoding DNA in general. Recently, it has been reported that short CNEs in the Drosophila genome are typically very AT-rich but have unusually high levels of GC content in a much larger (~500 bp) window around them. To understand whether these "side effects" are dependent on their CNE definition or are a more general feature of the Drosophila genome, we analysed base composition of CNEs from two different CNE detection methods. We found that side effects are real, but are restricted to a subset of CNEs in the genome. An alternative hypothesis to explain the existence of CNEs is the mutational cold spot hypothesis. Previous work using SNPs has shown evidence that CNEs are not mutational cold spots. Here, non-reference transposable elements (TEs) were used to test the cold spot hypothesis. A significant reduction in levels of non-reference TEs was found in intronic and intergenic CNEs compared to the expected number of insertions. TEs in intergenic CNEs were also found at lower allele frequencies than TEs in intergenic spacers. Furthermore, we used simulation to explore the effects of insertion/deletion (indel) evolution on noncoding DNA sequences with and without constrained noncoding elements. We assessed several indel-capable simulators to test expected outcomes with no selectively constrained elements. Simulations with constrained elements show that sequences grow in length even when the deletion rate is exactly the same as the insertion rate.
This result can be interpreted as being due to purifying selection on CNEs acting to remove an excess of deletions over insertions. Together, the results presented here provide insights into the evolution of noncoding DNA in one of the most important model organisms.
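The base-composition "side effect" described in this abstract (AT-rich elements sitting in a GC-rich surrounding window) amounts to comparing GC fraction inside an element with GC fraction in its flanks. The Python sketch below illustrates that comparison under simple assumptions: the function names, the default of 250 bp per flank (chosen so that the two flanks together span roughly the ~500 bp mentioned above), and the toy sequence are hypothetical and are not taken from the thesis.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq; returns 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def element_and_flank_gc(sequence, start, end, flank=250):
    """GC fraction of sequence[start:end] and of its two flanks combined.

    Coordinates are 0-based and end-exclusive. Flanks are truncated at the
    sequence boundaries rather than padded.
    """
    element = sequence[start:end]
    left = sequence[max(0, start - flank):start]
    right = sequence[end:end + flank]
    return gc_fraction(element), gc_fraction(left + right)

# Toy sequence: a 6 bp AT-only element embedded in GC-only flanks
toy = "GC" * 5 + "AT" * 3 + "GC" * 5
element_gc, flank_gc = element_and_flank_gc(toy, start=10, end=16, flank=10)
```

In a real analysis the same comparison would be repeated over every annotated element and contrasted with matched non-conserved regions, since a single element says nothing about whether the pattern is general.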
31

Mikkelsen, Tarjei Sigurd 1978. "Mammalian comparative genomics and epigenomics." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/52808.

Abstract:
Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2009.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student submitted PDF version of thesis.
Includes bibliographical references.
The human genome sequence can be thought of as an instruction manual for our species, written and rewritten over more than a billion years of evolution. Taking a complete inventory of our genome, dissecting its genes and their functional components, and elucidating how these genes are selectively used to establish and maintain cell types with markedly different behaviors, are key challenges of modern biology. In this thesis we present contributions to our understanding of the structure, function and evolution of the human genome. We rely on two complementary approaches. First, we study signatures of evolutionary processes that have acted on the genome using comparative sequence analysis. We generate high quality draft genome sequences of the chimpanzee, the dog and the opossum. These species share a last common ancestor with humans approximately 6 million, 80 million and 140 million years ago, respectively, and therefore provide distinct perspectives on our evolutionary history. We apply computational methods to explore the functional organization of the genome and to identify genes that contribute to shared and species-specific traits. Second, we study how the genome is bound by proteins and packaged into chromatin in distinct cell types. We develop new methods to map protein-DNA interactions and DNA methylation using single-molecule based sequencing technology. We apply these methods to identify new functional sequence elements based on characteristic chromatin signatures, and to explore the relationship between DNA sequence, chromatin and cellular state.
by Tarjei Sigurd Mikkelsen.
Ph.D.
32

Ryder, Carol D. "Comparative genomics of Brassica oleracea." Thesis, University of Warwick, 2012. http://wrap.warwick.ac.uk/51651/.

Abstract:
The scientific case made by the author's comparative Brassica oleracea genomics work is presented through five peer-reviewed research papers. A comprehensive understanding of the evolution of B. oleracea requires the identification of unique genome characteristics, established using comparative genomics. The genome characteristics established within these papers deliver significant contributions to original knowledge. These include a detailed illustration of how macro scale synteny varies markedly between the B. oleracea and A. thaliana genomes; unambiguous integration of the B. oleracea cytogenetic and genetic linkage maps; a cross species characterisation of a large collinear inverted segmental duplication on a single B. oleracea chromosome, establishing that the relative physical distances have stayed approximately the same; retrotransposon copy number estimations and characterisation of their genomic organisation; and isolation, characterisation and cross species analysis of a C genome specific repeat. For each paper the author's individual scientific contribution to each aspect of the work is described in detail. Both individually and as a body of work these publications substantially advance the fields of comparative, Brassica and genomic research.
33

Pereira, Anirene Galvão Tavares. "Nellore meat quality and genomics." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11141/tde-01082016-181552/.

Abstract:
This study was developed to explore chromosomal regions associated with carcass and meat traits in the Nellore cattle breed, to identify the metabolic and genetic pathways involved in the expression of these traits, and to generate additional phenotypes for future genomic association studies, in order to fully describe the parameters related to final product quality. To this end, 995 bulls genotyped for more than 770,000 single nucleotide polymorphisms (SNPs) were evaluated for body weight at birth, weight gain at weaning and yearling, and conformation, finishing precocity and muscling at weaning and yearling. Because these traits are correlated, genomic mapping methods were applied to identify pleiotropic regions. The results highlighted previously described genomic regions associated with weight gain and growth traits in beef cattle, particularly the PLAG1 gene, located in the region of the most significantly associated marker, which other studies have associated with weight, height and sexual precocity in the Nellore breed. To evaluate carcass and meat quality traits, 576 young bulls were evaluated for hot carcass weight, ribeye area, fat thickness, pH 24 hours after slaughter and color parameters (L*, a*, b*); drip loss, cooking loss and shear force were evaluated at different aging times (7, 14 and 21 days). The animals were genotyped on two platforms, Illumina® BovineHD BeadChip (HD) and Bovine GeneSeek® Genomic Profiler™ HD Illumina Infinium® (GGP). Animals genotyped at the lower density (GGP) were imputed to the high-density chip (HD). Drip loss, cooking loss and shear force, the latter used to measure tenderness, were associated with cytoskeleton structure and proteolytic enzyme activity, pointing to the serine/serpin enzyme complex as the main candidate for regulating proteolysis and the degradation of muscle fiber structure. The intramuscular fat content of the Longissimus thoracis et lumborum muscle was evaluated in 148 animals.
This was approached from a human health perspective: samples were classified according to the effects of their fatty acids on the human body ("beneficial", "harmful" or "neutral"), and the data also provide phenotypic information for future genomic association studies. The identification of 42 fatty acids and 16 indexes generated detailed information on the fat composition of the meat from these animals. Principal component analysis (PCA) showed that a large proportion of the variation in fat composition among samples appears to be due to differences in the expression of desaturase and elongase enzymes. It is therefore expected that the data, information and knowledge generated here can help animal breeding programs improve Brazilian herds according to the interests of the meat production chain.
This work was developed to explore chromosomal regions associated with carcass and meat traits in Nellore cattle, to explore their functions in metabolic and gene pathways related to the expression of these traits, and to generate new phenotypes for future genomic association studies, with a view to fully describing the traits related to final product quality. To this end, 995 intact males, genotyped for more than 770,000 single nucleotide polymorphism (SNP) markers, were evaluated for body weight at birth, weight gain at weaning and yearling, and conformation, finishing precocity and muscling at weaning and yearling. Because these traits are correlated, genomic mapping methods were applied to identify pleiotropic regions. The results highlighted regions of the bovine genome containing genes described as influencing growth and weight gain traits in these animals, notably the PLAG1 gene, located in the region of the marker most significantly associated with the phenotypes and previously associated with weight, height and sexual precocity in animals of this breed. To assess carcass and meat quality attributes, 576 intact males were evaluated for hot carcass weight, ribeye area, subcutaneous fat thickness, pH 24 hours after slaughter, color (L*, a*, b*), and drip loss, cooking loss and shear force at different maturation times (7, 14 and 21 days). The animals were genotyped on two platforms, Illumina® BovineHD BeadChip (HD) and GeneSeek® Genomic Profiler Bovine HD™ Illumina Infinium® (GGP), and genotypes from the latter were imputed to the higher-density set.
The evaluations of drip loss, cooking loss and shear force, the latter used to measure tenderness, reveal the influence of cytoskeleton structure and of the action of proteolytic enzymes, pointing to the serine/serpin enzyme complex as a candidate in the regulation of proteolysis and the degradation of muscle fiber structure. Fatty acids in the Longissimus thoracis et lumborum muscle of 148 animals were evaluated in order to classify the samples by their expected effects on the human organism ("beneficial", "harmful" or "neutral") and to provide phenotypic information for future genomic association studies. The identification of 42 fatty acids and 16 indexes generated detailed information on the fat present in the meat of these animals; principal component analysis (PCA) showed that the largest variation in composition among the evaluated samples appears to result from differences in the expression of elongase and desaturase enzymes. The data, information and knowledge generated by this work are thus expected to help animal breeding programs improve the Brazilian herd according to traits of interest to the meat production chain.
34

Sargsyan, Alex. "Genetics and Genomics in Nursing." Digital Commons @ East Tennessee State University, 2019. https://dc.etsu.edu/etsu-works/8471.

35

Belal, Nahla Ahmed. "Two Problems in Computational Genomics." Diss., Virginia Tech, 2011. http://hdl.handle.net/10919/26318.

Abstract:
This work addresses two novel problems in the field of computational genomics. The first is whole genome alignment and the second is inferring horizontal gene transfer using posets. We define these two problems and present algorithmic approaches for solving them. For whole genome alignment, we define alignment graphs for representing different evolutionary events, and define a scoring function for those graphs. The problem so defined is proven to be NP-complete. Two heuristics are presented to solve it: one is a dynamic programming approach that is optimal for a class of sequences that we define in this work as breakable arrangements; the other is a greedy approach that is not necessarily optimal but, unlike the dynamic programming approach, allows for reversals. For inferring horizontal gene transfer, we define partially ordered sets (posets) among species with respect to different genes, and infer genes involved in horizontal gene transfer by comparing posets for different genes. The posets are used to construct a tree for each gene. Those trees are then compared and tested for contradiction, where contradictory trees correspond to genes that are candidates for horizontal gene transfer.
Ph. D.
36

Pereira, Pagarete Antonio Joaquim. "Functional Genomics of Coccolithophore Viruses." Paris 6, 2010. http://hal.upmc.fr/tel-01111009v1.

Abstract:
Emiliania huxleyi virus (EhV) is an NCLDV. It belongs to the family of algal viruses, the Phycodnaviridae. It infects Emiliania huxleyi, the most abundant coccolithophore in modern oceans. We showed, on a phylogenetic basis, the transfer of 29 genes between the genomes of Emiliania huxleyi and EhV, including 7 genes involved in sphingolipid biosynthesis (SBP). This is the first clear case, in a eukaryotic virus-phytoplankton system, of horizontal transfer of multiple genes encoding functionally linked enzymes. For two of the most important SBP enzymes, serine palmitoyl transferase and dihydroceramide desaturase, the transcriptomic study defined three stages during the formation and decline of E. huxleyi blooms, over which coccolithovirus transcripts are progressively activated, culminating in their control of the SBP during stages 2 and 3. Using DNA microarrays, we carried out the first global host-virus transcriptomic study within a natural oceanic community. Our results show that during E. huxleyi blooms there is a synchronous episode of viral dominance that is clearly visible in the resulting transcriptomic signals. Among the genes whose transcript abundance increases significantly between pre- and post-viral dominance, we found functions involved in genetic information transfer, as well as genes probably involved in post-translational control, intracellular trafficking mechanisms, and even the control of apoptosis.
Emiliania huxleyi virus (EhV) is a giant nucleo-cytoplasmic double-stranded DNA virus that belongs to the Phycodnavirus family. It can infect Emiliania huxleyi, the most abundant coccolithophore in today’s oceans. The population dynamics of these eukaryotic microalgae are clearly controlled by the severe lytic action of EhV. After an extended bibliographic review of current knowledge on these viruses, we present a series of bioinformatic and experimental analyses conducted to unveil important functional genomic features of EhV. Evidence for the transfer of 29 genes between the E. huxleyi and EhV genomes is presented. In particular, we investigate the origin of seven genes involved in the unique viral sphingolipid biosynthesis pathway (SBP) encoded in the EhV genome. This is the first clear case of horizontal gene transfer of multiple functionally linked enzymes in a eukaryotic host-virus system. We then focus on a field E. huxleyi/EhV system from a mesocosm experiment in Norway. The expression dynamics of two of the most important genes of this pathway with homologs in both host and virus, serine palmitoyl transferase and dihydroceramide desaturase, are investigated. Three defined transcriptional stages are reported during the bloom, with the coccolithovirus transcripts taking over and controlling the SBP. Finally, global host and virus transcript abundance over the course of the mesocosm experiment was investigated. The majority of the genes that significantly increased in abundance from pre- to post-viral takeover corresponded to viral sequences for which there is so far no match in the protein databases. Nonetheless, novel transcription features associated with EhV infection were discovered, namely the utilization of genes potentially related to genetic information processing, posttranslational control, intracellular trafficking mechanisms, and control of programmed cell death.
In conclusion, the entire dataset analysed herein is discussed, followed by the potential implications of these findings and future research perspectives in the field of plankton virology.
37

Bryant, Josephine Maria. "Evolutionary genomics of pathogenic mycobacteria." Thesis, University of Cambridge, 2015. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.708462.

38

Dong, Xin. "Comparative genomics of rickettsia species." Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM5054/document.

Abstract:
The genus Rickettsia comprises small Gram-negative bacteria that are obligate intracellular symbionts of eukaryotes. Rickettsia are best known for their pathogenicity and for causing severe diseases in humans and other animals. To date, 26 valid Rickettsia species have been identified worldwide, 20 of which are proven pathogens. All validated Rickettsia species are associated with arthropods. Phylogenies based on various molecular markers have yielded discordant topologies, with only R. bellii and R. canadensis classified in neither the spotted fever group nor the typhus group rickettsiae. Using advanced whole-genome sequencing methods, we obtained and analyzed four Rickettsia genome sequences: R. helvetica, R. honei, R. australis and R. japonica. Phylogenomics, a new strategy for better understanding their evolution, shows that these microorganisms underwent reductive genome evolution as they specialized to intracellular life. Several evolutionary features, such as gene rearrangement, genome reduction, horizontal gene transfer and the acquisition of selfish DNA, have shaped today's Rickettsia genomes. These processes may play an important role in balancing genome size in order to adapt to the intracellular lifestyle. In addition, the pathogenicity of rickettsiae may be associated with genome reduction.
The Rickettsia genus is composed of small, Gram-negative bacteria that are obligate intracellular eukaryotic symbionts. Members of the genus Rickettsia are best known for infecting and causing severe diseases in humans and other animals. To date, 26 valid Rickettsia species have been identified worldwide, including 20 that are proven pathogens. All validated Rickettsia species are associated with arthropods that act as vectors and/or reservoirs. The phylogenies based on various molecular markers have resulted in discrepant topologies, with R. bellii and R. canadensis being classified neither among spotted fever nor typhus group rickettsiae. In this thesis, using advanced whole-genome sequencing methods, we obtained and analyzed the genomic sequences of four Rickettsia species: R. helvetica, R. honei, R. australis and R. japonica. Phylogenomics constitutes a new strategy to better understand their evolution. These microorganisms underwent reductive genomic evolution during their specialization to an intracellular lifestyle. Several evolutionary characteristics, such as gene rearrangement, reduction, horizontal gene transfer and acquisition of selfish DNA, have shaped Rickettsia genomes. These processes may play an important role in free-living bacteria for balancing the size of the genome in order to adapt to the intracellular lifestyle. In addition, in contrast with the concept of bacteria becoming pathogens by acquisition of virulence factors, rickettsial pathogenicity may be linked to genomic reduction of metabolism and regulation pathways.
39

Sentausa, Erwin. "Intraspecies comparative genomics of Rickettsia." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM5082/document.

Abstract:
The genus Rickettsia is composed of Gram-negative, obligate intracellular bacteria that cause a range of human diseases worldwide. New techniques have advanced the identification and classification of Rickettsia, including the introduction of molecular methods such as gene sequence comparison (16S rRNA, ompA, ompB, gltA, sca4 …) and the creation of subspecies status. Genomics and next-generation sequencing techniques have provided a new way to learn more about the pathogenesis and evolution of Rickettsia. The first part of this thesis is a review of the advantages and limitations of genomics in prokaryotic taxonomy, while the second part consists of genomic analyses of five Rickettsia subspecies and one new Rickettsia species. Using high-throughput sequencing methods, we obtained the genomes of R. sibirica sibirica, R. sibirica mongolitimonae, R. conorii indica, R. conorii caspia, R. conorii israelensis and R. gravesii. This work forms the basis for further studies that will lead to a better understanding of the pathophysiological mechanisms, evolution and taxonomy of rickettsiae.
The Rickettsia genus is composed of Gram-negative, obligate intracellular bacteria that cause a range of human diseases around the world. New techniques have led to progress in the identification and classification of Rickettsia, including the introduction of molecular methods like sequence comparison (16S rRNA, ompA, ompB, gltA, sca4 …) and the creation of subspecies status. Genomics and next-generation sequencing have opened a new way to learn more about the pathogenesis and evolution of Rickettsia. The first part of this thesis is a review of the advantages and limitations of genomics in prokaryotic taxonomy, while the second part consists of the genomic analyses of five Rickettsia subspecies and a new Rickettsia species. Using high-throughput sequencing methods, we obtained the draft genomes of R. sibirica sibirica, R. sibirica mongolitimonae, R. conorii indica, R. conorii caspia, R. conorii israelensis, and R. gravesii. This work can serve as a basis for further studies to increase the understanding of the disease-causing mechanisms, evolutionary relationships, and taxonomy of rickettsiae.
40

Tsai, Isheng Jason. "Population genomics of Saccharomyces yeasts." Thesis, Imperial College London, 2009. http://hdl.handle.net/10044/1/4361.

Abstract:
This thesis examines genome-wide polymorphisms amongst 20 strains of Saccharomyces paradoxus, a yeast species that has recently emerged as a model organism for population genetic studies. Three major studies are included in this thesis. The first study attempts to quantify the life cycle of yeast undergoing different modes of reproduction in nature. Measures of mutational and recombinational diversity are used to make two independent estimates of the population size. In an obligately sexual population these estimates should be approximately equal. Instead, there is a discrepancy of about three orders of magnitude, indicating that S. paradoxus goes through a sexual cycle only once in every ~1,000 asexual generations. This study illustrates the utility of population genomic data in quantifying the life cycles of organisms undergoing different modes of reproduction. Second, a map showing the distribution of population recombination rates along chromosome III of S. paradoxus is presented. Several regions of very high recombination (hotspots) are identified in chromosome III. Comparison of hotspot regions between S. paradoxus and S. cerevisiae shows evidence of conservation of recombination hotspot regions. I argue that these observations reflect the weak impact of recombination due to the reduced frequency of sex of yeasts in nature. Recombination rates correlate with GC content, consistent with various studies in yeasts and humans, but there is no correlation with diversity or divergence. In addition, regions of extremely high recombination (hotspots) show a higher rate of GC→AT changes than the rest of the chromosome. The reason for this is not clear at present. Finally, a catalogue of polymorphisms within each population, and of divergences between the two populations of S. paradoxus, is presented. Tests of selection on the chromosomal sequence of S. paradoxus suggest a predominant mode of purifying selection.
At least a third of mutations in synonymous sites and ~90% of mutations in replacement sites are removed by purifying selection. We estimate that around 12-31% of replacement mutations are deleterious in S. paradoxus, and that the average selection strength acting on these mutations is 1%. I also present a summary of data and subsequent analyses from the Saccharomyces Genome Resequencing Project (SGRP). Population genetic measures are applied to data using different base-calling quality cutoffs. From the results, I recommend a quality score of at least 40 to achieve the confidence required for data to be used in population genomic analyses.
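To put the recommended quality cutoff in concrete terms, the short sketch below converts a quality score into a per-base error probability. It assumes the thesis's quality scores are Phred-scaled (Q = -10 log10 P), which the abstract does not state explicitly; the function names are illustrative only.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call is wrong, assuming a Phred-scaled score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Inverse conversion: Phred-scaled score for an error probability p."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Compare a few common cutoffs, including the recommended threshold of 40.
    for q in (20, 30, 40):
        p = phred_to_error_prob(q)
        print(f"Q{q}: about 1 error in {round(1 / p):,} base calls")
```

Under this assumption, a cutoff of 40 corresponds to roughly one erroneous base call in 10,000, which suggests why a stringent threshold may matter when the polymorphisms being counted are themselves rare.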
41

Beatson, Scott. "Pseudomonas aeruginosa genomics and pathogenesis." [St. Lucia, Qld.], 2002. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe16848.pdf.

42

Benevides, Leandro. "Comparative Genomics of Faecalibacterium spp." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS129.

Abstract:
In the human colon, the genus Faecalibacterium is the main member of the Clostridium leptum group and is the second most common representative genus in fecal samples, after Clostridium coccoides. It has been recognized as an important bacterium promoting intestinal health and is now considered a next-generation probiotic. Until recently it was believed that this genus contained only one species, but since 2012 some studies have begun to suggest the existence of two phylogroups within the genus. This proposed reclassification increases the importance of new studies, covering all strains, to better understand the diversity, host interactions and safety aspects of its use as a probiotic. Briefly, in this work we introduce comparative genomic analyses to the genus Faecalibacterium, performing an in-depth phylogenetic study and evaluating the safety aspects of its use as a probiotic. The phylogenetic analyses included not only the classical use of the 16S rRNA gene but also the use of 17 complete genomes and techniques such as whole-genome multilocus sequence typing (wgMLST), average nucleotide identity (ANI), gene synteny and the pangenome. This is also the first work to combine an analysis of pangenome development with ANI analysis to corroborate the assignment of strains to new species. The phylogenetic analyses confirmed the existence of more than one species in the genus Faecalibacterium. In addition, the safety assessment involved (1) prediction of horizontally acquired regions (antibiotic resistance islands, metabolic islands and phage regions), (2) prediction of metabolic pathways, and (3) a search for genes related to antibiotic resistance and for bacteriocins.
These analyses identified genomic islands in all genomes, but none of them is exclusive to a single strain or genospecies. In addition, 8 genes related to antibiotic resistance mechanisms were identified, distributed among the genomes. A total of 126 metabolic pathways were predicted, of which some were highlighted: bisphenol A degradation, butanoate metabolism and streptomycin biosynthesis. We also studied the genomic context of a protein (microbial anti-inflammatory molecule, MAM) first described by our group. This investigation shows that MAM is located close to genes related to the sporulation process and, in some strains, close to an ABC transporter.
Within the human colon, the genus Faecalibacterium is the main member of the Clostridium leptum cluster and comprises the second-most common representative genus in fecal samples, after Clostridium coccoides. It has been recognized as an important bacterium promoting intestinal health and today is considered a potential next-generation probiotic. Until recently, it was believed that there was only one species in this genus, but since 2012, some studies have begun to suggest the existence of two phylogroups within the genus. This proposed reclassification increases the importance of new studies, covering all strains, to better understand the diversity, the interactions with the host and the safety aspects of its use as a probiotic. Briefly, in this work we introduce comparative genomic analyses to the genus Faecalibacterium, performing a deep phylogenetic study and evaluating the safety aspects of its use as a probiotic. The phylogenetic analyses included not only the classical use of the 16S rRNA gene, but also the utilization of 17 complete genomes and techniques like whole genome Multi-Locus Sequence Typing (wgMLST), Average Nucleotide Identity (ANI), gene synteny, and pangenome. Also, this is the first work to combine an analysis of pangenome development with ANI analysis in order to corroborate the assignment of strains to new species. The phylogenetic analyses confirmed the existence of more than one species within the genus Faecalibacterium. Moreover, the safety assessment involved (1) prediction of horizontally acquired regions (antibiotic resistance islands, metabolic islands and phage regions), (2) prediction of metabolic pathways, (3) a search for genes related to antibiotic resistance and (4) a search for bacteriocins. These analyses identified genomic islands in all genomes, but none of them is exclusive to one strain or genospecies.
In addition, 8 genes related to antibiotic resistance mechanisms were identified, distributed among the genomes. A total of 126 metabolic pathways were predicted, and among them some were highlighted: Bisphenol A degradation, Butanoate metabolism and Streptomycin biosynthesis. We also studied the genomic context of one protein (Microbial Anti-inflammatory Molecule - MAM) first described by our group. This investigation shows that MAM appears close to genes related to the sporulation process and, in some strains, close to an ABC transporter.
43

Calabria, Andrea. "Data integration for clinical genomics." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2011. http://hdl.handle.net/10281/19219.

Abstract:
Genetics and Molecular Biology are key to understanding the mechanisms of many human diseases that have strong harmful effects. The empirical mission of Genetics is to translate these mechanisms into clinical benefits, thus bridging in-silico findings to the patient's bedside: approaching this goal means achieving what is commonly referred to as clinical genomics or personalized medicine. In this process, technologies are assuming an increasing role. With the introduction of new experimental platforms (microarrays, sequencing, etc.), today's analyses are much more detailed and can cover a wide spectrum of applications, from gene expression to Copy Number Variant detection. The advantages of technological improvements are usually accompanied by data management drawbacks: the explosion in data throughput creates a real need for new systems for data rationalization and management, data access, query and extraction. Our partner genetics laboratories encountered all of these issues: what they need is a tool that allows data integration and supports biological data analysis by exploiting computational infrastructures in distributed environments. From these needs, we defined two main goals: (1) Computer Science goal: to design and implement a framework that integrates and manages data and genetic analyses; (2) Genetics and Molecular Biology goal (application domains): to solve biological problems through the framework and develop new methods. Given these requirements and related specifications, we designed an extensible framework based on three interconnected layers: (1) Experimental data layer, which provides integration of data from high-throughput platforms (also called horizontal data integration); (2) Knowledge data layer, which provides integration of knowledge data (also called vertical integration); (3) Computational layer, which provides access to distributed environments for data analysis, in our case GRID and Cluster technologies.
On top of the three design blocks, individual biological problems can be supported and custom user interfaces implemented. From our partner laboratories, two main relevant biological problems have been addressed: (1) Linkage Analysis: given a large pedigree in which subjects were genotyped with chips of 1 million SNPs, the linkage analysis problem presented real computational limits. We designed a heuristic method to overcome computational restrictions and implemented it within our framework, exploiting GRID and Cluster environments. Using our approach, we obtained genetic results that were successfully validated by end users. We also tested the performance of the system and report comparative results. (2) SNP selection and ranking: given the problem of ranking SNPs based on a priori information, we developed a novel method for biological data mining on gene annotations. The method has been implemented as a web tool, SNP Ranker, which is undergoing thorough validation by our partner laboratories. The framework designed and implemented here demonstrated that this approach is consistent and can have a potential impact on the scientific community.
44

Rousset, Francois. "CRISPRi screens in bacterial genomics." Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS373.

Abstract:
Bacterial genomics has boomed over the past decade thanks to advances in DNA sequencing methods. New experimental techniques are needed to better understand gene function. The discovery of CRISPR-Cas adaptive immune systems in bacteria led to the development of numerous technologies for targeting a nucleic acid in a sequence-specific manner. In particular, the dCas9 enzyme can be guided to a DNA sequence by a short RNA called an sgRNA to specifically inhibit the expression of a gene, a mechanism called CRISPRi. This thesis describes the development of a high-throughput screening technique based on sgRNA libraries synthesized and cloned in parallel. We first showed how this technique can be used to identify essential genes in E. coli. We also used it in the context of infection by different bacteriophages to identify the genes required for infection. Because most genomic studies rely on model strains that do not represent the diversity of the species, we then adapted a CRISPRi system compatible with most isolates of E. coli or closely related species. A library of sgRNAs targeting ~3,300 conserved E. coli genes was created and used in a collection of natural isolates to determine the impact of genetic diversity on the essentiality of the species' conserved genes. We thus showed how horizontally transferred genes can modulate the essentiality of conserved genes. This work demonstrates the potential of high-throughput CRISPRi screens in bacterial genomics.
Advances in sequencing technologies over the past decade have significantly expanded the field of bacterial genomics. In this context, new experimental methods are still required to better understand gene function. The discovery of CRISPR-Cas systems in bacterial adaptive immunity led to the development of a variety of biotechnological tools to target DNA in a sequence-specific manner. In particular, the dCas9 protein can be guided towards a target DNA sequence by short RNAs called sgRNAs to inhibit gene expression in a mechanism called CRISPRi. The present thesis describes the development of a high-throughput screening method based on the pooled synthesis and cloning of sgRNA libraries. We first showed that CRISPRi screens can confidently predict essential genes in E. coli. We also exploited this method during infection by different bacteriophages to determine which host genes are required for a successful infection. While most genomics studies rely on model strains which fail to represent the genetic diversity of the species, we next developed a CRISPRi platform that is compatible with most isolates from E. coli and closely related species. An sgRNA library targeting ~3,300 persistent genes from the E. coli species was designed and implemented in a collection of natural isolates to determine the impact of genetic diversity on the essentiality of core genes. We demonstrated how horizontally transferred genes can modulate core gene essentiality. Altogether, this work shows the potential of high-throughput CRISPRi screens in bacterial genomics.
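As a rough illustration of how a pooled screen of this kind can be read out, the sketch below compares sgRNA read counts before and after growth and reports a log2 fold change per guide. This is a generic, simplified approach and not the analysis used in the thesis; the guide names, the counts and the pseudocount are all hypothetical.

```python
import math


def log2_fold_changes(before, after, pseudocount=1.0):
    """Per-guide log2 fold change of relative abundance (after vs. before).

    `before` and `after` map guide names to raw read counts. Counts are
    normalised by library size so that sequencing depth does not drive
    the result; the pseudocount avoids taking the log of zero.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    changes = {}
    for guide, count_before in before.items():
        freq_before = (count_before + pseudocount) / total_before
        freq_after = (after.get(guide, 0) + pseudocount) / total_after
        changes[guide] = math.log2(freq_after / freq_before)
    return changes


if __name__ == "__main__":
    # Hypothetical counts: one guide drops out strongly during growth.
    before = {"guide_gene_A": 100, "guide_gene_B": 100, "guide_control": 100}
    after = {"guide_gene_A": 5, "guide_gene_B": 110, "guide_control": 120}
    for guide, lfc in sorted(log2_fold_changes(before, after).items()):
        print(f"{guide}: {lfc:+.2f}")
```

A guide that becomes strongly depleted (a large negative value) hints that silencing its target impairs growth, which is the kind of signal an essentiality screen looks for; a real analysis would also need replicates and a statistical model.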
45

Woo, Yong. "Characterizing the Dynamics of Genome Evolution in Tumorigenesis." Fogler Library, University of Maine, 2009. http://www.library.umaine.edu/theses/pdf/WooY2009.pdf.

46

Jansen van Rensburg, Melissa. "The genomic epidemiology of Campylobacter from the Republic of South Africa." Thesis, University of Oxford, 2015. https://ora.ox.ac.uk/objects/uuid:983f9d81-94ea-4149-adae-d5ec397504a8.

Abstract:
As the leading cause of bacterial gastroenteritis, Campylobacter represents a significant public health burden; however, our knowledge of its epidemiology in low- and middle-income countries remains limited. Recent studies have demonstrated the power of whole-genome sequencing (WGS) for public health microbiology. The primary aim of this thesis was to exploit WGS to improve our understanding of the epidemiology of Campylobacter from the Republic of South Africa, a middle-income country. In the first half of this thesis, in silico approaches were developed to evaluate diagnostic assays and methods of species identification. Large-scale analyses of publicly available WGS data identified a robust real-time PCR assay for the detection of Campylobacter jejuni and Campylobacter coli, the primary causes of human campylobacteriosis. Evaluation of in silico speciation methods demonstrated that the atpA gene and ribosomal multilocus sequence typing can be used to identify Campylobacter from WGS data. The second half of this thesis extended concepts developed in the first half to investigate the epidemiology of Campylobacter from animals and humans from South Africa. Isolates from a study of Campylobacter from free-range broiler carcasses belonged to the agriculture-associated ST-828 lineage, but were atypically homogenous and differed at only 46/1,513 (3%) loci, providing novel insights into clonal infections in chickens. Analyses of human disease isolates collected in Cape Town in 1991, 2011, and 2012 confirmed that the local epidemiology of Campylobacter is distinct from that of high-income countries: in addition to major agriculture-associated C. jejuni and C. coli lineages, a putative novel C. jejuni subsp. jejuni/C. jejuni subsp. doylei hybrid clade and genetically diverse C. jejuni subsp. doylei and C. upsaliensis isolates were identified. 
This work delivers further evidence of the utility of WGS for clinical microbiology, presents approaches that address general problems in Campylobacter diagnostics and public health microbiology, and provides insights into the epidemiology of this important group of pathogens in South Africa.
47

Nelson, Thomas. "The Origins and Maintenance of Genomic Variation in the Threespine Stickleback (Gasterosteus aculeatus)." Thesis, University of Oregon, 2017. http://hdl.handle.net/1794/22660.

Abstract:
Genetic variation is the raw material of evolution. The sources of this variation within a population, and its maintenance within a species, have been mysterious since the birth of the field of evolutionary genetics. In this work, I study divergently adapted freshwater and marine populations of the threespine stickleback (Gasterosteus aculeatus) as an evolutionary model to track the origin of adaptive genetic variation and to describe the evolutionary processes maintaining variation across the genome. The stickleback is a small fish with a large geographic range encompassing the northern half of the Northern Hemisphere and composed of coastal marine habitats, freshwater lakes, and river systems. Populations of stickleback adapt rapidly to changes in habitat, and fossil evidence suggests that similar adaptive transitions have been ongoing in this lineage for at least ten million years. In this work, I develop a significant extension of restriction site-associated DNA sequencing (RAD-seq) to generate phased haplotype information to estimate gene tree topologies and divergence times at thousands of loci simultaneously. I find anciently derived clades of variation associated with marine and freshwater habitats in genomic regions involved in recent adaptive divergence; some divergence times extend to over ten million years ago. This history of adaptive divergence has had profound effects on genetic variation elsewhere in the genome: chromosomes harboring freshwater-adaptive variants retain anciently derived variation in linked genomic regions, while marine chromosomes have much more recent ancestry. I present a conceptual model of asymmetric selective and demographic processes to explain this result, which will form a nucleus for future research in this species. Lastly, by incorporating genome-wide recombination rates estimated from multiple genetic maps, I describe a recombination landscape that is favorable to the maintenance of marine-freshwater genomic divergence. 
Low recombination rates in key chromosomal regions condense widespread divergence of the physical genome, encompassing many megabases, into a small number of Mendelian loci. Combined, my results demonstrate the interconnectedness of evolutionary processes taking place on ecological and geological timescales. The genetic variation available for adaptive evolution today is a product of the long-term evolutionary history of a species.
48

Bertoldi, Loris. "Bioinformatics for personal genomics: development and application of bioinformatic procedures for the analysis of genomic data." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3421950.

Abstract:
Over the last decade, the sharp decrease in sequencing costs brought about by high-throughput technologies has completely changed the approach to genetic problems. In particular, whole-exome and whole-genome sequencing are contributing to extraordinary progress in the study of human variants, opening up new perspectives in personalized medicine. Because the field is relatively new and fast-developing, appropriate tools and specialized knowledge are required for efficient data production and analysis. Accordingly, in 2014 the University of Padua funded the BioInfoGen Strategic Project with the goal of developing technology and expertise in bioinformatics and molecular biology applied to personal genomics. The aim of my PhD was to contribute to this challenge by implementing a series of innovative tools and applying them to investigate, and possibly solve, the case studies included in the project. I first developed an automated pipeline for Illumina data that sequentially performs each step needed to go from raw reads to somatic or germline variant detection. The system's performance was tested by means of internal controls and by applying it to a cohort of patients affected by gastric cancer, obtaining interesting results. Once variants are called, they must be annotated to define properties such as their position at the transcript and protein level, their impact on the protein sequence, their pathogenicity, and more. As most of the publicly available annotators were affected by systematic errors that caused low consistency in the final annotation, I implemented VarPred, a new variant annotation tool, which guarantees the best accuracy (>99%) compared with state-of-the-art programs while also showing good processing times. To make VarPred easy to use, I equipped it with an intuitive web interface that allows not only graphical evaluation of results but also a simple filtering strategy.
Furthermore, for effective user-driven prioritization of human genetic variants, I developed QueryOR, a web platform suitable for searching among known candidate genes as well as for finding novel gene-disease associations. QueryOR combines several innovative features that make it comprehensive, flexible and easy to use. Prioritization is achieved by a global positive selection process that promotes the emergence of the most reliable variants, rather than filtering out those that do not satisfy the applied criteria. QueryOR has been used to analyze the two case studies framed within the BioInfoGen project. In the first, it made it possible to detect causative variants in patients affected by lysosomal storage diseases, also highlighting the efficacy of the designed sequencing panel. In the second, QueryOR simplified the identification of the LRP2 gene as a possible candidate to explain subjects with a Dent disease-like phenotype but no mutation in the previously identified disease-associated genes, CLCN5 and OCRL. As a final corollary, an extensive analysis of recurrent exome variants was performed, showing that their origin can be mainly explained by inaccuracies in the reference genome, including misassembled regions and uncorrected bases, rather than by platform-specific errors.
49

Schwartz, Marín Ernesto. "Genomic sovereignty and "the Mexican genome"." Thesis, University of Exeter, 2011. http://hdl.handle.net/10036/3500.

Abstract:
This PhD seeks to explore the development of a bio-molecular (i.e., genomic) map as a sovereign resource in Mexico. The basic analytical thread of the dissertation is the circulation of genomic variability through the policy/legal and scientific social worlds that compose the Mexican medical-population genomics arena. It follows the construction of the Mexican Institute of Genomic Medicine (INMEGEN), the notion of genomic sovereignty, and the Mexican Genome Diversity Project (MGDP). The key argument for the construction of the INMEGEN relied on a nationalist policy framing, which considered the Mexican genome a sovereign resource, coupling Mexican “uniqueness” to the very nature of genomic science. Nevertheless, the notion of genomic sovereignty was nothing like a paradigm and was not based on shared visions of causality, since the very “nature” of the policy object, the Mexican Genome, was, and still is, a disputed reality. It was through the rhetoric of independence, emancipation and biopiracy, i.e. experiences of dispossession “in archaeology, botany or zoology” (IFS 2001: 25), that the novelty of population genomics became amenable to being understood as a sovereign matter. The strategic reification of Mexicanhood therefore fuelled the whole policy and legal agenda of the INMEGEN, which permitted cooperation without consensus and opened the process of policy innovation. Scientists, for their part, considered genomic sovereignty an unfounded exaggeration, yet they cooperated and even created a new policy and scientific enterprise. Genomic sovereignty exemplifies the process of cooperation without consensus in its most extreme version. As the notion circulated and gradually became a law to protect Mexican genomic patrimony, the initial coalition of scientists, lawyers and policy makers disaggregated.
Many of the original members of the coalition now think of genomic sovereignty as a strategy of the INMEGEN to monopolise genomic research in the country. This dissertation additionally explores the way in which the MGDP is constructed in mass media, in INMEGEN's communication and in laboratory practices. These different dimensions of the MGDP depict the difficulties that emerge between the probabilistic, relative and multiple constructions of population genomics and the rhetorical strategies that continually assert the existence of the unique “Mexican Genome”. I argue that the Mexican case study provides an entry point to what I and others (Benjamin 2009; Schwartz-Marin 2011) have identified as a postcolonial biopolitics in which the nation state is reasserted rather than diluted. However, the relation between sovereignty, race and nation is mediated not by the biological purification of the nation (Agamben 1998; Foucault 2007), or by the active participation of citizens looking to increase their vitality (Rose 2008; Rose & Rabinow 2006), but by an awareness of subalternity in the genomic arena and a collective desire to compete in the biomedical global economy.
50

Migeon, Pierre. "Comparative genomics of repetitive elements between maize inbred lines B73 and Mo17." Thesis, Kansas State University, 2017. http://hdl.handle.net/2097/35377.

Abstract:
Master of Science
Genetics Interdepartmental Program
Sanzhen Liu
The major component of complex genomes is repetitive elements, which remain recalcitrant to characterization. Using maize as a model system, we analyzed whole genome shotgun (WGS) sequences for the two maize inbred lines B73 and Mo17 using k-mer analysis to quantify the differences between the two genomes. Significant differences were identified in highly repetitive sequences, including centromere, 45S ribosomal DNA (rDNA), knob, and telomere repeats. Genotype-specific 45S rDNA sequences were discovered. The B73 and Mo17 polymorphic k-mers were used to examine allele-specific expression of 45S rDNA in the hybrids. Although Mo17 contains a higher copy number than B73, equivalent levels of overall 45S rDNA expression indicate that transcriptional or post-transcriptional regulation mechanisms operate for the 45S rDNA in the hybrids. Using WGS sequences of B73xMo17 doubled haploids, genomic locations showing differential repetitive content were genetically mapped, revealing differences in the organization of highly repetitive sequences between the two genomes. In an analysis of WGS sequences of HapMap2 lines, including the maize wild progenitor, landraces, and improved lines, decreases and increases in abundance of additional sets of k-mers associated with centromere, 45S rDNA, knob, and retrotransposons were found among groups, revealing global evolutionary trends of genomic repeats during maize domestication and improvement.
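Illustrative note: the k-mer comparison mentioned in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the thesis's actual pipeline. The function names (`count_kmers`, `differential_kmers`), the fold-change threshold, and the pseudocount are all hypothetical choices made for illustration. Reverse-complement (canonical) k-mers are deliberately ignored to keep the example short.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer made only of A/C/G/T across a collection of reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def normalized(counts):
    """Convert raw counts to frequencies so samples of different depth compare."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def differential_kmers(counts_a, counts_b, min_fold=2.0, pseudo=1e-6):
    """Return k-mers whose normalized abundance differs by at least min_fold.

    The value is the fold change (sample A over sample B); the pseudocount
    is an assumption that avoids division by zero for k-mers absent from
    one sample.
    """
    freq_a = normalized(counts_a)
    freq_b = normalized(counts_b)
    result = {}
    for kmer in freq_a.keys() | freq_b.keys():
        fold = (freq_a.get(kmer, 0.0) + pseudo) / (freq_b.get(kmer, 0.0) + pseudo)
        if fold >= min_fold or fold <= 1.0 / min_fold:
            result[kmer] = fold
    return result


if __name__ == "__main__":
    # Toy reads standing in for WGS data from two genotypes.
    sample_a = ["AAAAAA"]
    sample_b = ["AAACCC"]
    diff = differential_kmers(count_kmers(sample_a, 3), count_kmers(sample_b, 3))
    for kmer, fold in sorted(diff.items()):
        print(kmer, fold)
```

Normalizing by the total k-mer count is one simple way to make samples sequenced at different depths comparable. A real analysis would likely use a dedicated k-mer counter and a statistical test rather than a fixed fold-change cutoff.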