Dissertations / Theses on the topic 'Comparative Genomics Analysis'

Consult the top 50 dissertations / theses for your research on the topic 'Comparative Genomics Analysis.'

1

Prakash, Amol. "Algorithms for comparative sequence analysis and comparative proteomics." Thesis, 2006. http://hdl.handle.net/1773/6904.

2

Karanam, Suresh Kumar. "Automation of comparative genomic promoter analysis of DNA microarray datasets." Thesis, Georgia Institute of Technology, 2003. http://etd.gatech.edu/theses/available/etd-04062004-164658/unrestricted/karanam%5Fsuresh%5Fk%5F200312%5Fms.pdf.

3

Jordan, Gregory. "Analysis of alignment error and sitewise constraint in mammalian comparative genomics." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610693.

4

Jentzsch, Iris Miriam Vargas. "Comparative genomics of microsatellite abundance: a critical analysis of methods and definitions." Thesis, University of Canterbury. Biological Sciences, 2009. http://hdl.handle.net/10092/4282.

Abstract:
This PhD dissertation focuses on microsatellites: short, tandemly repeated nucleotide patterns that occur extremely often across DNA sequences. The main characteristic of microsatellites, and probably the reason why they are so abundant across genomes, is the extremely high frequency of specific replication errors occurring within their sequences, which usually cause addition or deletion of one or more complete tandem repeat units. Due to these errors, frequent fluctuations in the number of repetitive units can be observed among cellular and organismal generations. The molecular mechanisms as well as the consequences of these microsatellite mutations, both on a generational and on an evolutionary scale, have sparked debate and controversy among the scientific community. Furthermore, the bioinformatic approaches used to study microsatellites and the ways microsatellites are referred to in the general literature are often not rigorous, leading to misinterpretations and inconsistencies among studies. As an introduction to this complex topic, in Chapter I, I present a review of the knowledge accumulated on microsatellites during the past two decades. A major part of this chapter has been published in the Encyclopedia of Life Sciences in a chapter about microsatellite evolution (see Publication 1 in Appendix II). The ongoing controversy about the rates and patterns of microsatellite mutation was evident to me even before I started this PhD thesis. However, the subtler problems inherent to the computational analyses of microsatellites within genomes only became apparent when retrieving information on microsatellite distribution and abundance for the design of comparative genomic analyses.
There are numerous publications analyzing the microsatellite content of genomes but, in most cases, the results presented can neither be reliably compared nor reproduced, mainly due to the lack of details on the microsatellite search process (particularly the program’s algorithm and the search parameters used) and because the results are expressed in terms that are relative to the search process (i.e. measures based on the absolute number of microsatellites). Therefore, in Chapter II, I present a critical review of all available software tools designed to scan DNA sequences for microsatellites. My aim in undertaking this review was to assess the comparability of search results among microsatellite programs, and to identify the programs most suitable for the generation of microsatellite datasets for a thorough and reproducible comparative analysis of microsatellite content among genomic sequences. Using sequence data where the number and types of microsatellites were empirically known, I compared the ability of 19 programs to accurately identify and report microsatellites. I then chose the two programs which, based on the algorithm and its parameters as well as the informativeness of the output, offered the information most suitable for biological interpretation, while also reflecting as closely as possible the microsatellite content of the test files. From the analysis of microsatellite search results generated by the various programs available, it became apparent that the program’s search parameters, which are specified by the user in order to define the microsatellite characteristics to the program, dramatically influence the resulting datasets. This is especially true for programs designed to allow imperfections within tandem repeats, because imperfect repetitions cannot be defined as accurately as perfect ones, and because several different algorithms have been proposed to address this problem.
The detection of approximate microsatellites is, however, essential for the study of microsatellite evolution and for comparative analyses based on microsatellites. It is now well accepted that small deviations from perfect tandem repeat structure are common within microsatellites and larger repeats, and a number of different algorithms have been developed to confront the challenge of finding and registering microsatellites with all expectable kinds of imperfection. However, biologists have yet to apply these tools to their full potential. In biological analyses, single tandem repeat hits are consistently interpreted as isolated and independent repeats. This interpretation also depends on the search strategy used to report the microsatellites in DNA sequences and, therefore, I was particularly interested in the capacity of repeat-finding programs to report imperfect microsatellites in a way that allows biologically useful interpretations. After analyzing a series of tandem repeat-finding programs, I optimized my microsatellite searches to yield the best possible datasets for assessing and comparing the degree of imperfection of microsatellites among different genomes (Chapter III). In the program comparisons performed in Chapter II, I show that the most critical search parameter influencing microsatellite search results is the minimum length threshold. Biologically speaking, there is no consensus with respect to the minimum length beyond which a short tandem repeat is expected to become prone to microsatellite-like mutations. Usually, a single absolute value of ~12 nucleotides is assigned irrespective of motif length. In other cases, thresholds are assigned in terms of number of repeat units (i.e. 3 to 5 repeats or more), which are better applied individually for each motif. The variation in these thresholds is considerable and not always justifiable.
In addition, any current minimum length measures are likely naïve because it is clear that different microsatellite motifs undergo replication slippage at different length thresholds. Therefore, in Chapter III, I apply two probabilistic models to predict the minimum length at which microsatellites of varying motif types become overrepresented in different genomes, based on the individual oligonucleotide frequency data of these genomes. Finally, after a range of optimizations and critical analyses, I performed a preliminary analysis of microsatellite abundance among 24 high-quality complete eukaryotic genomes, along with 8 prokaryotic and 5 archaeal genomes for contrast. The availability of the methodologies and the microsatellite datasets generated in this project will allow informed formulation of questions for more specific genome research, either about microsatellites or about other genomic features microsatellites could influence. These datasets are what I would have needed at the beginning of my PhD to support my experimental design, and are essential for the adequate interpretation of microsatellite data in the context of the major evolutionary units: chromosomes and genomes.
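The abstract's point that the minimum length threshold drives search results can be illustrated with a minimal sketch. It is not one of the 19 programs the thesis reviews: it only finds perfect repeats, and the function names, the 12-nucleotide default and the motif-size limit of 6 are assumptions chosen for illustration.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_perfect_microsatellites(seq, min_length=12, max_motif=6):
    """Report perfect tandem repeats as (start, end, motif, repeat_count).

    min_length is an absolute threshold in nucleotides (an assumed default,
    not a recommended value); coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            length = m.end() - m.start()
            # Skip motifs such as 'ATAT' that duplicate a shorter motif's hit.
            if is_primitive(motif) and length >= min_length:
                hits.append((m.start(), m.end(), motif, length // k))
    return sorted(hits)


# Hypothetical test sequence: an (AT)7 run and an (A)5 run.
example = "GGC" + "AT" * 7 + "CCG" + "AAAAA" + "T"
strict = find_perfect_microsatellites(example, min_length=12)
relaxed = find_perfect_microsatellites(example, min_length=5)
```

On this toy sequence the strict search reports only the (AT)7 run, while lowering the threshold to 5 also reports the (A)5 run, so the "microsatellite content" of the same sequence changes with the parameter alone.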
5

Chen, Lu. "Comparative and functional analysis of alternative splicing in eukaryotic genomes." Thesis, University of Bath, 2012. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.558885.

Abstract:
Alternative splicing (AS) is a common post-transcriptional process in eukaryotic organisms, by which multiple distinct functional transcripts are produced from a single gene. Because of its potential role in expanding transcript diversity, interest in alternative splicing has been increasing over the last decade, ever since the release of the human genome draft showed it contained little more than the number of genes of a worm. Although recent studies have shown that 94% of human multi-exon genes undergo AS, and that aberrant AS may cause disease or cancer, the evolution of AS in eukaryotic genomes remains largely unexplored, mainly due to the lack of comparable AS estimates. In this thesis I built a Eukaryote Comprehensive & Comparable Alternative Splicing Events Database (ECCASED) based on the analyses of over 30 million Expressed Sequence Tags (ESTs) for 114 eukaryotic genomes, including protists (22), plants (20), fungi (23), non-vertebrate metazoans (29) and vertebrates (20). Using this database, I addressed two main questions: 1) How does alternative splicing relate to gene duplication (GD) as an alternative mechanism to increase transcript diversity? and 2) What is the contribution of alternative splicing to eukaryote transcript diversity? I found that the previous “interchangeable model” of AS and gene duplication is a by-product of an existing relation between gene expression breadth, AS and gene family size. I also show that alternative splicing has played a key role in the expansion of transcript diversity and that this expansion is the best predictor reported to date of organism complexity, assayed as number of cell types. In addition, by comparing alternative splicing patterns in cancer and normal transcript libraries, I found that cancer-derived transcript libraries have increased levels of “noisy splicing”.
6

Nelson, A. D. L., E. S. Forsythe, U. K. Devisetty, D. S. Clausen, A. K. Haug-Batzell, A. M. R. Meldrum, M. R. Frank, E. Lyons, and M. A. Beilstein. "A Genomic Analysis of Factors Driving lincRNA Diversification: Lessons from Plants." Genetics Society of America, 2016. http://hdl.handle.net/10150/621708.

Abstract:
Transcriptomic analyses from across eukaryotes indicate that most of the genome is transcribed at some point in the developmental trajectory of an organism. One class of these transcripts is termed long intergenic noncoding RNAs (lincRNAs). Recently, attention has focused on understanding the evolutionary dynamics of lincRNAs, particularly their conservation within genomes. Here, we take a comparative genomic and phylogenetic approach to uncover factors influencing lincRNA emergence and persistence in the plant family Brassicaceae, to which Arabidopsis thaliana belongs. We searched 10 genomes across the family for evidence of >5000 lincRNA loci from A. thaliana. From loci conserved in the genomes of multiple species, we built alignments and inferred phylogeny. We then used gene tree/species tree reconciliation to examine the duplication history and timing of emergence of these loci. Emergence of lincRNA loci appears to be linked to local duplication events, but, surprisingly, not whole genome duplication events (WGD), or transposable elements. Interestingly, WGD events are associated with the loss of loci for species having undergone relatively recent polyploidy. Lastly, we identify 1180 loci of the 6480 previously annotated A. thaliana lincRNAs (18%) with elevated levels of conservation. These conserved lincRNAs show higher expression, and are enriched for stress-responsiveness and cis-regulatory motifs known as conserved noncoding sequences (CNSs). These data highlight potential functional pathways and suggest that CNSs may regulate neighboring genes at both the genomic and transcriptomic level. In sum, we provide insight into processes that may influence lincRNA diversification by providing an evolutionary context for previously annotated lincRNAs.
7

Jiang, Xiaofang. "Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi." Diss., Virginia Tech, 2016. http://hdl.handle.net/10919/79959.

Abstract:
Anopheles stephensi is a potent vector of malaria throughout the Indian subcontinent and Middle East. An. stephensi is emerging as a model for molecular and genetic studies of mosquito-parasite interactions. Here we conducted a series of genomic and transcriptomic studies to improve the understanding of the biology of Anopheles stephensi and mosquitoes in general. First, we reported the genome sequence and annotation of the Indian strain of the type form of An. stephensi. The 221 Mb genome assembly was produced using a combination of 454, Illumina, and PacBio sequencing. This hybrid assembly method was significantly better than assemblies generated from a single data source. A total of 11,789 protein-encoding genes were annotated using a combination of homology and de novo prediction. Secondly, we demonstrated the presence of complete dosage compensation in An. stephensi by determining that autosomal and X-linked genes have very similar levels of expression in both males and females. The uniformity of average expression levels of autosomal and X-linked genes remained when An. stephensi gene expression was normalized by that of their Ae. aegypti orthologs, strengthening the conclusion of complete dosage compensation in Anopheles. Lastly, we investigated trans-splicing events in Anopheles stephensi. We identified six trans-splicing events, and all the trans-splicing sites are conserved and present in Ae. aegypti. The proteins encoded by the trans-spliced mRNAs are also highly conserved, and their orthologs are co-linearly transcribed in out-groups of family Culicidae. This finding indicates the need to preserve the intact mRNA and protein function of the broken-up genes by trans-splicing during evolution. In summary, we presented the first genome assembly of Anopheles stephensi and studied two interesting evolutionary events, dosage compensation and trans-splicing, via transcriptomic analysis.
Ph. D.
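The dosage-compensation test described in the abstract above compares expression of X-linked and autosomal genes. A minimal sketch of that kind of comparison follows; the use of the median, the function name and the toy gene identifiers are assumptions for illustration, not the dissertation's actual pipeline.

```python
from statistics import median


def x_to_autosome_ratio(expression, x_linked):
    """Ratio of median X-linked expression to median autosomal expression.

    expression: dict mapping gene id -> expression value for one sample.
    x_linked:   set of gene ids located on the X chromosome.
    A ratio near 1 is what complete dosage compensation would predict.
    """
    x_values = [v for gene, v in expression.items() if gene in x_linked]
    a_values = [v for gene, v in expression.items() if gene not in x_linked]
    if not x_values or not a_values:
        raise ValueError("need at least one X-linked and one autosomal gene")
    # Raises ZeroDivisionError if the autosomal median is zero.
    return median(x_values) / median(a_values)


# Hypothetical expression values for a single male sample.
male_sample = {"g1": 10.0, "g2": 12.0, "g3": 8.0, "gx1": 9.0, "gx2": 11.0}
ratio = x_to_autosome_ratio(male_sample, x_linked={"gx1", "gx2"})
```

In practice the ratio would be computed separately for male and female samples and compared; the single toy sample here only shows the calculation.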
8

Motro, Yair. "Comparative genomics analysis and development of bioinformatics tools for two newly sequenced spirochaete species." PhD thesis, Murdoch University, 2008. https://researchrepository.murdoch.edu.au/id/eprint/41679/.

Abstract:
The bacterial order Spirochaetales contains a number of potent pathogens responsible for serious and well-known diseases, such as Lyme disease (Borrelia burgdorferi), leptospirosis (Leptospira interrogans), and syphilis (Treponema pallidum). Though these species have been extensively investigated, there still remain spirochaete genera, and the spirochaete family as a whole, that have been minimally characterised. The genus Brachyspira includes species primarily responsible for gastro-intestinal diseases. Some biological characteristics of the two species B. hyodysenteriae and B. pilosicoli are known. For example, B. hyodysenteriae causes disease in swine, while B. pilosicoli causes disease in a wide range of animals and humans. As there are no whole genome sequences available for any Brachyspira species, their underlying molecular mechanisms, evolution and function are not understood. This work is part of a large project which aims to sequence the two whole genomes for vaccine design and development. This thesis represents the first report of an in-depth comparative genome analysis (CGA) of the novel whole genome sequences of both Brachyspira species, providing greater understanding of their genomic functional relationships, evolution and diversity, while also identifying elements for potential vaccine and drug design and development.
9

Sturgill, David Matthew. "Comparative Genome Analysis of Three Brucella spp. and a Data Model for Automated Multiple Genome Comparison." Thesis, Virginia Tech, 2003. http://hdl.handle.net/10919/10163.

Abstract:
Comparative analysis of multiple genomes presents many challenges, ranging from management of information about thousands of local similarities to definition of features by combination of evidence from multiple analyses and experiments. This research represents the development stage of a database-backed pipeline for comparative analysis of multiple genomes. The genomes of three recently sequenced species of Brucella were compared, and a superset of known and hypothetical coding sequences was identified to be used in the design of a discriminatory genomic cDNA array for comparative functional genomics experiments. Comparisons were made of coding regions from the public, annotated sequence of B. melitensis (GenBank) to the annotated sequence of B. suis (TIGR) and to the newly sequenced B. abortus (personal communication, S. Halling, National Animal Disease Center, USDA). A systematic approach to analysis of multiple genome sequences is described, including a data model for storage of defined features along with necessary descriptive information such as input parameters and scores from the methods used to define features. A collection of adjacency relationships between features is also stored, creating a unified database that can be mined for patterns of features which repeat among or within genomes. The biological utility of the data model was demonstrated by a detailed analysis of the multiple genome comparison used to create the sample data set. This examination of genetic differences between three Brucella species with different virulence patterns and host preferences enabled investigation of the genomic basis of virulence. In the B. suis genome, seventy-one differentiating genes were found, including a contiguous 17.6 kb region unique to the species. Although only one species-specific gene was identified in the B. melitensis genome and none in the B. abortus genome, seventy-nine differentiating genes were found to be present in only two of the three Brucella species. These differentiating features may be significant in explaining differences in virulence or host specificity. RT-PCR analysis was performed to determine whether these genes are transcribed in vitro. Detailed comparisons were performed on a putative B. suis pathogenicity island (PAI). An overview of these genomic differences and a discussion of their significance in the context of host preference and virulence are presented.
Master of Science
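The abstract above sorts genes into those unique to one species and those present in only two of the three. A minimal sketch of that classification, assuming genes have already been grouped into shared identifiers, is shown below; the function name, the gene identifiers and the toy data are hypothetical and do not reproduce the thesis's data model.

```python
from collections import Counter


def classify_gene_presence(genomes):
    """Split genes by how many genomes contain them.

    genomes: dict mapping genome name -> set of gene (ortholog group) ids.
    Returns (unique, partially_shared, core):
      unique            dict genome name -> genes found only in that genome
      partially_shared  genes in more than one but not all genomes
      core              genes present in every genome
    """
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    total = len(genomes)
    unique = {
        name: {gene for gene in genes if counts[gene] == 1}
        for name, genes in genomes.items()
    }
    partially_shared = {gene for gene, c in counts.items() if 1 < c < total}
    core = {gene for gene, c in counts.items() if c == total}
    return unique, partially_shared, core


# Hypothetical gene sets; the ids carry no biological meaning.
toy = {
    "B. suis": {"g1", "g2", "g3", "g5"},
    "B. melitensis": {"g1", "g2", "g4"},
    "B. abortus": {"g1", "g4"},
}
unique, partially_shared, core = classify_gene_presence(toy)
```

The hard part in a real comparison is deciding which coding sequences count as the same gene; this sketch takes that grouping as given.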
10

McAdam, Paul R. "Population analysis of bacterial pathogens on distinct temporal and spatial scales." Thesis, University of Edinburgh, 2014. http://hdl.handle.net/1842/17852.

Abstract:
Bacteria have been the causative agents of major infectious disease pandemics throughout human history. Over the past 4 decades, a combination of changing medical practices, industrialization, and globalisation have led to a number of emergences and re-emergences of bacterial pathogens. The design of rational control programs and bespoke therapies will require an enhanced understanding of the dynamics underpinning the emergence and transmission of pathogenic clones. The recent development of new technologies for sequencing bacterial genomes rapidly and economically has led to a greatly enhanced understanding of the diversity of bacterial populations. This thesis describes the application of whole genome sequencing of 2 bacterial pathogens, Staphylococcus aureus and Legionella pneumophila, in order to understand the dynamics of bacterial infections on different temporal and spatial scales. The first study involves the examination of S. aureus evolution during a chronic infection of a single patient over a period of 26 months revealing differences in antibiotic resistance profiles and virulence factor expression over time. The genetic variation identified correlated with differences in growth rate, haemolytic activity, and antibiotic sensitivity, implying a profound effect on the ecology of S. aureus. Importantly, polymorphisms were identified in global regulators of virulence, with a high frequency of polymorphisms within the SigB locus identified, suggesting this region may be under selection in this patient. The identification of genes under diversifying selection during long-term infection may inform the design of novel therapeutics for the control of refractory chronic infections. Secondly, the emergence and transmission of 3 pandemic lineages derived from S. aureus clonal complex 30 (CC30) were investigated. 
Independent origins for each pandemic lineage were identified, with striking molecular correlates of hospital- or community-associated pandemics represented by mobile genetic elements, such as bacteriophage and Staphylococcal pathogenicity islands, and non-synonymous mutations affecting antibiotic resistance and virulence. Hospitals in large cities were identified as hubs for the transmission of MRSA to regional health care centres. In addition, comparison of whole genome sequences revealed that at least 3 independent acquisitions of TSST-1 have occurred in CC30, but a single distinct clade of diverse community-associated CC30 strains was responsible for the TSS epidemic of the late 1970s, and for subsequent cases of TSS in the UK and USA. Finally, whole genome sequencing was used as a tool for investigating a recent outbreak of legionellosis in Edinburgh. An unexpectedly high level of genomic diversity was identified among the outbreak strains, with respect to core genome polymorphisms, and accessory genome content. The data indicate that affected individuals may be infected with heterogeneous strains. The findings highlight the complexities in identifying environmental sources and suggest possible differences in pathogenic potential among isolates from a single outbreak. Taken together, the findings demonstrate applications of bacterial genome sequencing leading to enhanced understanding of bacterial pathogen evolution, emergence, and transmission, which may ultimately inform appropriate infection control measures.
11

Plass Pórtulas, Mireya. "Comparative analysis of splicing in eukaryotes." Doctoral thesis, Universitat Pompeu Fabra, 2011. http://hdl.handle.net/10803/78124.

Abstract:
Splicing is the mechanism by which introns are removed from the pre-mRNA to create a mature transcript. This process is performed by a macromolecular complex, the spliceosome, and involves the recognition of the splicing signals in the pre-mRNA. These signals are not always perfectly recognized, which allows the production of different mature transcripts from a single pre-mRNA through a process called alternative splicing. This process can be regulated by specific protein factors or by other mechanisms that affect the recognition of the splicing signals, such as the secondary structure adopted by the pre-mRNA. In this thesis we have investigated the mechanisms of splicing regulation in eukaryotes using computational approaches. Moreover, we have also studied the relationship that exists between protein factors involved in splicing regulation and splicing signals, and how they have co-evolved across species. Finally, considering the possibilities that alternative splicing can offer from the evolutionary point of view, we have also analyzed the impact of alternative splicing on gene evolution.
13

Stephan, Taylorlyn. "What's In A Neanderthal: A Comparative Analysis." Oberlin College Honors Theses / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=oberlin159282580067034.

14

Page, Justin Thomas. "Bioinformatics for the Comparative Genomic Analysis of the Cotton (Gossypium) Polyploid Complex." BYU ScholarsArchive, 2015. https://scholarsarchive.byu.edu/etd/5557.

Abstract:
Understanding the composition, evolution, and function of the cotton (Gossypium) genome is complicated by the joint presence of two genomes in its nucleus (AT and DT genomes). Specifically, read-mapping (a fundamental part of next-generation sequence analysis) cannot adequately differentiate reads as belonging to one genome or the other. These two genomes were derived from progenitor A-genome and D-genome diploids involved in ancestral allopolyploidization. To better understand the allopolyploid genome, we developed PolyCat to categorize reads according to their genome of origin based on homoeo-SNPs that differentiate the two genomes. We re-sequenced the genomes of extant diploid relatives of tetraploid cotton that contain the A1 (Gossypium herbaceum), A2 (Gossypium arboreum), or D5 (Gossypium raimondii) genomes. We identified 24 million SNPs between the A-diploid and D-diploid genomes. These analyses facilitated the construction of a robust index of conserved SNPs between the A-genomes and D-genomes at all detected polymorphic loci. This index can be used by PolyCat to assign reads from an allotetraploid to their genome of origin. Continued characterization of the Gossypium genomes will further enhance our ability to manipulate fiber and agronomic production of cotton. We re-sequenced 34 allotetraploid cotton lines, representing all 7 tetraploid cotton species. The analysis of these genomes, using PolyCat and PolyDog, provides us with the beginnings of a HapMap-like resource for cotton species, including indices of both homoeo-SNPs and allele-SNPs. With this information, we explore the phylogenetic relationships among cotton species, including the newly characterized species G. ekmanianum and G. stephensii.
We examine gene conversion, both recent and ancient, discovering that recent gene conversion is extremely rare and that ancient gene conversion is far less extensive than previously believed, with many previously identified conversion events more probably being due to autapomorphic SNPs in the descent of diploid relatives. In order to carry out these experiments, many tools for next-generation sequence analysis were developed. These tools, along with PolyCat and PolyDog, comprise the tool suite BamBam.
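The idea of assigning a read to its genome of origin from homoeo-SNPs, as the abstract above describes for PolyCat, can be sketched as a simple vote over the SNP positions a read covers. This is an illustrative sketch only, not PolyCat's implementation: it assumes an ungapped alignment, and the function name, the SNP index layout and the category labels are all hypothetical.

```python
def categorize_read(start, read_seq, homoeo_snps):
    """Assign a read to the A or D genome from the homoeo-SNPs it overlaps.

    start:        0-based reference position of the read's first base
                  (ungapped alignment is assumed for simplicity).
    read_seq:     read bases as a string.
    homoeo_snps:  dict mapping reference position -> (A_base, D_base).
    Returns "A", "D", "conflict" (both alleles seen) or "unassigned".
    """
    a_votes = 0
    d_votes = 0
    for offset, base in enumerate(read_seq.upper()):
        alleles = homoeo_snps.get(start + offset)
        if alleles is None:
            continue  # position is not a known homoeo-SNP
        a_base, d_base = alleles
        if base == a_base:
            a_votes += 1
        elif base == d_base:
            d_votes += 1
        # Bases matching neither allele are ignored in this sketch.
    if a_votes and d_votes:
        return "conflict"
    if a_votes:
        return "A"
    if d_votes:
        return "D"
    return "unassigned"


# Hypothetical index with two homoeo-SNPs.
snp_index = {102: ("A", "G"), 105: ("C", "T")}
label = categorize_read(100, "TTATTC", snp_index)
```

A real tool would also have to handle indels, base qualities and paired reads; the sketch leaves those out to keep the voting logic visible.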
15

Ryan, Karon Magdalene Leanne. "Variation of flour colour in Western Australia adapted wheat: comparative genomics, molecular markers and QTL analysis." Murdoch University, 2005. http://wwwlib.murdoch.edu.au/adt/browse/view/adt-MU20061019.130337.

Abstract:
The yellowness of flour is an important quality trait in wheat for end-use products and is determined by the accumulation of carotenoids in the endosperm. The aims of this study were to develop EST-based molecular markers for genes encoding enzymes of the carotenoid biosynthetic pathway leading to xanthophyll accumulation and to identify quantitative trait loci for flour colour (b*) and xanthophyll content in Western Australian adapted germplasm. A novel bioinformatic strategy was developed to identify rice genes encoding key enzymes of the carotenoid biosynthetic pathway and to predict wheat orthologues on the short arm of chromosome 3 or long arm of chromosome 7. The bioinformatic strategy involved the identification of rice carotenoid genes on BAC/PAC contigs aligned to wheat mapped ESTs. Rice genes predicted to have wheat orthologues were selected based on ESTs mapping to regions on wheat homoeologous chromosomes 3 and 7 known to be involved in flour colour. The rice genes predicted to have wheat orthologues were Geranylgeranyltransferase I β-subunit (GGT-Ibeta) and Rab geranylgeranyltransferase component A (RGGT-A) on the short arm of chromosome 3, Lycopene β-cyclase (LBC) on the long arm of chromosome 3 and Lycopene ε-cyclase (LEC) on the long arm of chromosome 7. The prediction of these wheat orthologues provided the basis for development of EST-based molecular markers for detecting variation in xanthophyll content. Wheat ESTs with unknown chromosomal locations and having the highest similarity to GGT-Ibeta, RGGT-A and LBC were selected for the development of molecular markers. No EST homologues were identified for LEC and therefore this gene was not further considered. Orthology was confirmed by sequencing, and deletion lines were used to confirm chromosomal locations. Two partial orthologues of GGT-Ibeta were identified on the short arms of chromosomes 3B and 3D.
A partial orthologue of RGGT-A was mapped to the proximal regions of the short and long arms of chromosome 3B. At least two or more orthologues of LBC were identified from nullisomic-tetrasomic lines. An EST-based molecular marker for GGT-Ibeta was found to be involved in minor variation of xanthophyll content in a Westonia*2/Janz doubled haploid population. QTL analysis from three doubled haploid populations indicated variation in WA-adapted germplasm may be due to different alleles controlling flour colour. QTLs for b* and xanthophyll content were found to coincide on the short arms of chromosomes 3A, 4D, and 7B and the long arm of chromosomes 7A and 7B in WA-adapted germplasm. Homoeologous expression of regions controlling variation in b* and xanthophyll content on the long arm of chromosomes 7A and 7B suggests the shut-down of genes in the same region on chromosome 7D. The main outcome of this study is flour colour and identification of gene orthologues in wheat controlling variation in xanthophyll content is complex most likely because of the interaction of the carotenoid biosynthetic pathway with other pathways.
16

Bonnardeaux, Yumiko Graciela. "Seed dormancy in barley (Hordeum vulgare L.) : comparative genomics, quantitative trait loci analysis and molecular genetics." University of Western Australia. Faculty of Natural and Agricultural Sciences, 2008. http://theses.library.uwa.edu.au/adt-WU2009.0019.

Abstract:
[Truncated abstract] Under prolonged wet and damp conditions, barley grain with low dormancy can germinate precociously, a condition known as preharvest sprouting that has a number of detrimental effects on grain quality. In particular, preharvest sprouting renders the grain unsuitable for malting. The aim of this study was to take a genomics approach to identify and characterise candidate genes that could be linked to the control of seed dormancy in barley. This thesis developed a bioinformatic strategy that exploited the availability of gene sequences with functional evidence in the model species Arabidopsis and rice. The strategy integrated phenotypic data (QTL data) and comparative genomics in a targeted approach to identify candidate genes with a high probability of having a conserved function in cereals. This bioinformatic study identified two candidate genes, ERA1 and ABI2, with strong evidence for a role in seed dormancy based on their function in abscisic acid (ABA) signal transduction in Arabidopsis and their co-location with seed dormancy QTLs in Arabidopsis, rice and wheat. To establish whether the candidate genes mapped to seed dormancy QTLs in barley, QTL analyses were performed on a previously unstudied doubled haploid population developed from a cross between Stirling, a major Australian malting cultivar, and Harrington, a major Canadian malting cultivar. This cross was specifically chosen because elucidating chromosomal regions associated with seed dormancy in the background of a malting cultivar would make a significant contribution to the malting industry. '...' Identification of a seed dormancy QTL on the long arm of 3H, in a region syntenic to the wheat chromosome locations of ESTs aligning to the ERA1 and ABI2 genes, laid the foundation for physical and genetic mapping of the candidate genes to investigate whether the genes co-located with the QTL on 3H.
Physical mapping of the genes in wheat-barley addition lines confirmed their positions on the long arm of 3H. Genetic mapping of the ERA1 gene was performed using a CAPS marker developed in this thesis. The genetic mapping did not place ERA1 within either of the minor QTLs on 3HL, although segregation distortion may have influenced the map position of this gene. Further investigation is required to resolve the positioning of the ERA1 and ABI2 genes in relation to the 3H seed dormancy QTL. The main outcomes of this study have been 1) identification of candidate genes for further study; 2) identification of previously unknown QTLs on the long arm of 3H; 3) demonstration of the potential differences in dormancy that can be achieved through the use of specific gene combinations, highlighting the importance of minor genes and the epistatic interactions that occur between them; and 4) the development of a CAPS marker for the ERA1 gene, which can be used to track the gene in barley breeding programs to observe its association with important agronomic traits. This thesis also pioneered the implementation of several new technologies, including multiplex-ready PCR (Hayden et al. 2008) for fluorescence-based SSR genotyping and QTLNetwork (Yang et al. 2008) for statistical analysis of QTLs. Seed dormancy is a complex trait and is likely to involve the interplay of a number of genes that have a role in other developmental and regulatory processes.
17

Ryan, Karon Magdalene Leanne. "Variation of flour colour in Western Australia adapted wheat : comparative genomics, molecular markers and QTL analysis /." Access via Murdoch University Digital Theses Project, 2005. http://wwwlib.murdoch.edu.au/adt/browse/view/adt-MU20061019.130337.

18

Ryan, Karon Magdalene Leanne. "Variation of flour colour in Western Australia adapted wheat: comparative genomics, molecular markers and QTL analysis." PhD thesis, Murdoch University, 2005. https://researchrepository.murdoch.edu.au/id/eprint/285/.

19

Ryan, Karon Magdalene Leanne. "Variation of flour colour in Western Australia adapted wheat: comparative genomics, molecular markers and QTL analysis." PhD thesis, Murdoch University, 2005. http://researchrepository.murdoch.edu.au/285/.

20

Shoja, Valia. "A Broad Analysis of Tandemly Arrayed Genes in the Genomes of Human, Mouse, and Rat." Thesis, Virginia Tech, 2006. http://hdl.handle.net/10919/35800.

Abstract:
Tandemly arrayed genes (TAGs) play an important functional and physiological role in the genome. Most previous studies have focused on individual TAG families in a few species, and a broad characterization of TAGs has not been available. We identified all the TAGs in the genomes of human, chimp, mouse, and rat and performed a comprehensive analysis of TAG distribution, TAG sizes, TAG gene orientations and intergenic distances, and TAG gene functions. TAGs account for about 14-17% of all the genomic genes and nearly one third of all the duplicated genes in the four genomes, highlighting the predominant role that tandem duplication plays in gene duplication. For all species, TAG distribution is highly heterogeneous along chromosomes: some chromosomes are enriched with TAG forests while others are enriched with TAG deserts. The majority of TAGs are of size two in all genomes, similar to previous findings in C. elegans, A. thaliana, and O. sativa, suggesting that this is a rather general phenomenon in eukaryotes. Comparison with genome-wide patterns shows that TAG members have a significantly higher proportion of parallel gene orientation in all species, corroborating Graham's claim that parallel orientation is the preferred form of orientation in TAGs. Moreover, TAG members with parallel orientation tend to be closer to each other than neighboring genes with parallel orientation in the genome as a whole. The analysis of GO function indicates that genes with receptor or binding activities are significantly over-represented among TAGs. Simulation reveals that random gene rearrangements have little effect on the statistics of TAGs in all genomes. Notably, gene family sizes are significantly correlated with the extent of tandem duplication, suggesting that tandem duplication is a preferred form of duplication, especially in large families. There has not been any systematic study of TAG genes' expression patterns in the genome.
Taking advantage of recent large-scale microarray data, we studied expression divergence of those TAGs of size two in human and mouse for which expression data are available, and examined the effect of sequence divergence, gene orientation, and physical proximity on the divergence of gene expression patterns. Our results show that there is a weak negative correlation between sequence divergence and expression similarity between the two members of a TAG, and also a weak negative correlation between physical proximity of two genes and their expression similarity. No significant relationship was detected between gene orientation and expression similarity. Moreover, we compared the expression breadth of upstream and downstream duplicate copies and found that the downstream duplicate does not show significantly narrower expression breadth. We also compared TAG gene pairs with their neighboring non-TAG pairs for both physical proximity and expression similarity. Our results show that TAG gene pairs do not differ distinctly from their neighboring gene pairs in either respect, suggesting that these duplicated genes have diverged sufficiently during evolution that the original similarity conferred by duplication has decayed to a level comparable to that of their surrounding regions.
Master of Science
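The abstract above does not state the exact criteria used to call a TAG, so the following Python fragment is only a minimal sketch of one common working definition: members of the same gene family that lie next to each other on a chromosome, allowing at most `max_spacers` unrelated genes in between. The function name, the input tuple layout, and the `max_spacers` parameter are assumptions made for illustration and are not taken from the thesis.

```python
from collections import defaultdict


def find_tandem_arrays(genes, max_spacers=0):
    """Group same-family neighbours into tandem arrays (illustrative only).

    genes: iterable of (gene_id, chromosome, start, family) tuples;
           family is None for genes without a family assignment.
    max_spacers: hypothetical parameter, the number of unrelated genes
                 tolerated between two consecutive members of an array.
    Returns a list of arrays, each a list of gene ids with >= 2 members.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start, family in genes:
        by_chrom[chrom].append((start, gene_id, family))

    arrays = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])  # gene order along the chromosome
        open_arrays = {}  # family -> (index of last member, member ids)
        for index, (_, gene_id, family) in enumerate(ordered):
            if family is None:
                continue
            last = open_arrays.get(family)
            if last is not None and index - last[0] - 1 <= max_spacers:
                # Close enough to the previous member: extend the array.
                last[1].append(gene_id)
                open_arrays[family] = (index, last[1])
            else:
                # Too far away: close the old array and start a new one.
                if last is not None and len(last[1]) >= 2:
                    arrays.append(last[1])
                open_arrays[family] = (index, [gene_id])
        arrays.extend(m for _, m in open_arrays.values() if len(m) >= 2)
    return arrays


example_genes = [
    ("g1", "chr1", 100, "A"), ("g2", "chr1", 200, "A"),
    ("g3", "chr1", 300, "B"), ("g4", "chr1", 400, "A"),
    ("g5", "chr2", 50, "B"), ("g6", "chr2", 60, "B"), ("g7", "chr2", 70, "B"),
]
strict_arrays = find_tandem_arrays(example_genes, max_spacers=0)
relaxed_arrays = find_tandem_arrays(example_genes, max_spacers=1)
```

With no spacers allowed, `g4` is separated from `g1` and `g2` by `g3` and is left out of their array; allowing one spacer pulls it in. How many spacers to tolerate is a design choice that directly changes the TAG counts reported for a genome.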
21

Wang, Jixin. "Bioinformatic analysis of chicken chemokines, chemokine receptors, and Toll-like receptor 21." Texas A&M University, 2006. http://hdl.handle.net/1969.1/4212.

Abstract:
Chemokines triggered by Toll-like receptors (TLRs) are small chemoattractant proteins, which mainly regulate leukocyte trafficking in inflammatory reactions via interaction with G protein-coupled receptors. Forty-two chemokines and 19 cognate receptors have been found in the human genome. Prior to this study, only 11 chicken chemokines and 7 receptors had been reported. The objectives of this study were to systematically identify chemokines and their cognate receptor genes in the chicken genome and to annotate these genes and ligand-receptor binding by a comparative genomics approach. Twenty-three chemokine and 14 chemokine receptor genes were identified in the chicken genome. The number of coding exons in these genes and the syntenies are highly conserved between human, mouse, and chicken, although the amino acid sequence homologies are generally low between mammalian and chicken chemokines. Chicken genes were named with the systematic nomenclature used in humans and mice based on phylogeny, synteny, and sequence homology. The independent nomenclature of chicken chemokines and chemokine receptors suggests that the chicken may have ligand-receptor pairings similar to mammals. The TLR family represents evolutionarily conserved components of the pattern-recognizing receptors (PRRs) of the innate immune system that recognize specific pathogen-associated molecular patterns (PAMPs) through their ectodomains (ECDs). TLR ECDs contain 19 to 25 tandem copies of leucine-rich repeat (LRR) motifs. TLRs play important roles in the activation of pro-inflammatory cytokines and chemokines and in the modulation of antigen-specific adaptive immune responses. To date, nine TLRs have been reported in chicken, along with a non-functional TLR8. Two non-mammalian TLRs, TLR21 and TLR22, have been identified in pufferfish and zebrafish.
The objectives of this study were to determine whether chicken genes homologous to fish-specific TLRs exist and, if so, to identify possible ligands of these receptors. After searching the chicken genome sequence and EST database, a novel chicken TLR homologous to fish TLR21 was identified. Phylogenetic analysis indicated that the identified chicken TLR is the orthologue of TLR21 in fish. Bioinformatic analysis of potential PAMP binding sites within LRR insertions showed that CpG DNA is the putative ligand of this receptor.
22

Håfström, Therese. "Genome closure and bioinformatic analysis of the parallel sequenced bacterium Brachyspira intermedia PWS/AT." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-164335.

Abstract:
Brachyspira species are bacteria that colonize the intestines of some mammalian and avian species with different degrees of pathogenicity. Brachyspira intermedia is a mild pig and bird pathogen whose genome sequence was previously unknown. In this project, we completed the genome of Brachyspira intermedia PWS/AT and performed a comparative genomic analysis between B. intermedia PWS/AT and the already completed genomes of B. hyodysenteriae WA1, B. murdochii 56-150T and B. pilosicoli 95/1000. A table containing 15 classes of unique and shared genes was developed and analyzed in order to gain a better understanding of species-specific traits and clues to the differing degrees of pathogenicity. Our results show that genes are overall poorly annotated and that further studies are of great importance for understanding differing and shared properties. The largest numbers of unique features were found in B. intermedia and B. murdochii. B. hyodysenteriae and B. pilosicoli have most likely developed independently towards different biological niches, and B. pilosicoli has undergone a major reductive evolution. One plasmid and six prophages were found in B. intermedia, and two of the phages appear to be capable of horizontal gene transfer. Genome sequencing of more strains will probably further increase the understanding of species-specific traits.
23

Tanaka, Sunao. "In silico analysis-based identification of the target residue of integrin α6 for metastasis inhibition of basal-like breast cancer." Kyoto University, 2020. http://hdl.handle.net/2433/253190.

24

Raborn, R. Taylor. "Genome-wide analysis of transcription initiation and promoter architecture in eukaryotes." Diss., University of Iowa, 2012. https://ir.uiowa.edu/etd/4728.

Abstract:
The transcriptome represents the entirety of RNA molecules within a cell or tissue at a given time. Recent advances have facilitated large-scale, global interrogations of transcriptomes, which have found that genomes are extensively transcribed and contain diverse classes of RNAs (Dinger et al., 2009). Information generated by high-throughput analyses of mRNA transcription start sites (TSSs) such as CAGE (Cap Analysis of Gene Expression) indicates that eukaryotic genomes have complex landscapes of transcription initiation. The TSS is important for the annotation of cis-regulatory sequences, because it provides a link between the mRNA transcript and the promoter. The patterns of TSS distributions observed in mRNA 5' end profiling studies prevent straightforward annotation of putative promoters. To address this challenge, we developed a method to identify, on a genome-wide basis, the putative promoter, which we define by TSS distributions and designate the transcription start region (TSR). We applied a clustering method to identify and annotate TSRs within the budding yeast Saccharomyces cerevisiae using a full-length cDNA dataset (Miura et al., 2006). To validate these TSR annotations, we performed an integrative genomic analysis using multiple datasets. Our method identified TSRs at positions consistent with bona fide promoters in S. cerevisiae. In addition, using 5'RACE, we find overall agreement between computationally defined TSRs and TSSs identified experimentally. From this analysis, we find that a significant proportion of genes exhibiting alternative promoter usage during sporulation are associated with respiration, suggesting that this is regulated on a condition-specific basis in budding yeast. We further developed our TSS clustering method into a bioinformatics tool called TSRchitect, which identifies and annotates TSRs from large-scale TSS profiling information.
TSRchitect is capable of handling both tag-based and sequence-based TSS information and efficiently computes TSRs from global TSS datasets on a desktop computer. We find support for TSRchitect's annotations in human from a CAGE experiment from the ENCODE (Encyclopedia of DNA Elements) project. Finally, we use TSRchitect to identify TSRs from the transcriptomes of diverse eukaryotes. We investigated the conservation of TSRs among orthologous genes. We frequently identify multiple TSRs for a given gene, suggesting that alternative promoter usage is widespread. Overall, using TSS profiling data derived from separate tissues within mouse and human, we find that the positions of TSRs are relatively stable across the tissues surveyed; however, a small fraction of genes exhibit tissue-specific differences in TSR use. As transcriptome profiling information continues to be generated at a rapid pace, computational approaches are increasingly important. It is anticipated that the method and approach we describe within this dissertation will contribute to an improved understanding of gene regulation and promoter architecture in eukaryotes.
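The abstract describes clustering TSSs into transcription start regions but does not give the clustering rule. As a rough illustration only, the Python sketch below merges TSS positions on a single strand that lie within a fixed distance of each other and discards weakly supported clusters. The function name and the `max_gap` and `min_tags` parameters are hypothetical; this is not the TSRchitect algorithm or its API.

```python
def cluster_tss(tss_counts, max_gap=20, min_tags=2):
    """Merge nearby TSS positions into candidate start regions.

    tss_counts: dict mapping a genomic position to the number of tags
                (reads or cDNA 5' ends) observed there, for one strand
                of one chromosome.
    max_gap:    assumed maximum distance in bp between neighbouring TSSs
                that are placed in the same region.
    min_tags:   assumed minimum total tag count for a region to be kept.
    """
    regions = []
    current = None
    for pos in sorted(tss_counts):
        count = tss_counts[pos]
        if current is not None and pos - current["end"] <= max_gap:
            # Close to the previous TSS: extend the open region.
            current["end"] = pos
            current["tags"] += count
        else:
            # Gap too large: close the open region and start a new one.
            if current is not None:
                regions.append(current)
            current = {"start": pos, "end": pos, "tags": count}
    if current is not None:
        regions.append(current)
    return [r for r in regions if r["tags"] >= min_tags]


example_tss = {100: 3, 105: 1, 118: 2, 300: 1, 900: 4, 910: 1}
example_regions = cluster_tss(example_tss, max_gap=20, min_tags=2)
```

In this toy input the positions 100, 105 and 118 collapse into one region, the isolated single-tag site at 300 is dropped, and 900 and 910 form a second region. A gene with two such regions upstream would be a candidate for alternative promoter usage in the sense used in the abstract.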
25

Findeiß, Sven. "Expanding the repertoire of bacterial (non-)coding RNAs." Doctoral thesis, Universitätsbibliothek Leipzig, 2011. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-67816.

Full text
Abstract:
The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory modes of action has moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated into proteins but function directly at the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration. Bacterial ncRNAs either arise from independent primary transcripts or have their mature sequence generated by processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimeras with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene. The quality and completeness of genome annotation is essential for all subsequent analyses. In contrast to protein-coding genes, ncRNAs lack clear statistical signals at the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines, and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently have experimental setups been developed to systematically identify these elements on a genome-wide scale in prokaryotes. The first part of this thesis describes three exploratory experimental studies that analyze transcript organization in pathogenic bacteria.
To identify ncRNAs in Pseudomonas aeruginosa we used a combination of an experimental RNomics approach and ncRNA prediction. Besides already known ncRNAs we identified and validated the expression of six novel RNA genes. Global detection of transcripts by next generation RNA sequencing techniques unraveled an unexpectedly complex transcript organization in many bacteria. These ultra high-throughput methods give us the appealing opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptome of Helicobacter pylori and Xanthomonas campestris. For the first time we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis we developed new tools to annotate TSS, detect small protein-coding genes and to infer homology of newly detected transcripts. We discovered hundreds of TSS in intergenic regions, upstream of protein-coding genes, within operons and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that have previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes. Experimental setups are time consuming and often combined with huge costs. Another limitation of experimental approaches is that genes which are expressed in specific developmental stages or stress conditions are likely to be missed. 
Bioinformatic tools offer an alternative that overcomes such restraints. General approaches usually depend on comparative genomic data, and evolutionary signatures are used to analyze the (non-)coding potential of multiple sequence alignments. In the second part of my thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to assess the local protein-coding potential of genomic regions. RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model, we significantly improved the prediction accuracy of RNAz. During transcriptome analysis we repeatedly identified small protein-coding genes that had not been annotated so far. Only a few such genes are known to date, and standard protein-coding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses small protein-coding genes. We have implemented RNAcode, which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, evaluating the protein-coding potential of these fragments is an essential task. RNAcode reports local regions of high coding potential instead of complete protein-coding genes.
Training on known protein-coding sequences is not necessary, and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome, where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data we found compelling evidence that several of the identified candidates are bona fide proteins. In summary, this thesis clearly demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode, we helped complete the repertoire of gene finding software, which will help to unearth hidden treasures of the RNA World.
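The abstract notes that gene finders run with a length cutoff of 40-50 amino acids miss small protein-coding genes. The Python sketch below illustrates only that cutoff effect with a naive forward-strand open reading frame scan. It is not how RNAcode works (RNAcode evaluates evolutionary signatures in multiple sequence alignments), and the function name and `min_aa` parameter are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=40):
    """Naive forward-strand ORF scan with a minimum-length cutoff.

    Returns (start, end, aa_length) tuples for ATG...stop reading frames,
    where end is exclusive and includes the stop codon, and aa_length
    counts the initiator methionine but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                aa_length = (i - start) // 3
                if aa_length >= min_aa:
                    orfs.append((start, i + 3, aa_length))
                start = None  # look for the next ORF in this frame
    return orfs


# A toy 11-codon ORF: invisible at the usual cutoff, found at a lower one.
small_gene = "ATG" + "GCT" * 10 + "TAA"
hits_default_cutoff = find_orfs(small_gene, min_aa=40)
hits_low_cutoff = find_orfs(small_gene, min_aa=10)
```

Lowering the cutoff recovers the short ORF, but on a real genome it would also admit many spurious ORFs, which is the false-positive problem the abstract cites as the reason the cutoff exists.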
APA, Harvard, Vancouver, ISO, and other styles
26

Jean, Géraldine. "In silico methods for genome rearrangement analysis : from identification of common markers to ancestral reconstruction." Thesis, Bordeaux 1, 2008. http://www.theses.fr/2008BOR13704/document.

Abstract:
The growing number of fully sequenced genomes makes it increasingly effective to study evolutionary mechanisms by comparing contemporary genomes. One of the main problems lies in reconstructing plausible ancestral genome architectures in order to provide hypotheses about both the history of existing genomes and the mechanisms of their formation. Ancestral reconstruction methods do not necessarily converge on the same results, but all are based on the same three steps: identifying common markers in contemporary genomes, building comparative maps of the genomes, and reconciling these maps using the maximum parsimony criterion. The large quantity of data to be analysed requires automated processing, and solving these problems poses formidable computational challenges. Refining existing mathematical models and tools by adding strong biological constraints makes the resulting hypotheses more biologically realistic. In this thesis, we propose a new method for identifying common markers in evolutionarily distant species. We then apply to the reconstructed comparative maps a new method for reconstructing ancestral architectures, based on the adjacencies between the computed markers and the genomic distances between the contemporary genomes. Finally, after correcting the existing algorithm for determining an optimal sequence of rearrangements that occurred during the evolution of existing genomes from their common ancestor, we propose a new tool called VIRAGE that provides animated visualisation of rearrangement scenarios between species.
27

Valles, Ibáñez Guillem de 1986. "Evolutionary analysis of the genome load of loss-of-function variants and their contribution to immunodeficiencies." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/565404.

Abstract:
Human genomes have been found to harbor an unexpectedly high number of loss-of-function (LoF) variants, about 100 per genome, with about 20 of them in a homozygous state, in most cases without a visible effect despite their potential to truncate proteins. This suggests that some of these variants are neutral, but also that a fraction could be lethal alleles. In this work we study the implications of LoF variants in two different fields. In comparative genomics, we explore for the first time the mutational load of LoF variants segregating in 79 genomes belonging to six different great ape populations, together with its possible detrimental effects. In medical genomics, we examine the involvement of LoF variants, along with other functional variants, in 36 patients diagnosed with Common Variable Immunodeficiency, a heterogeneous disease with several genes implicated in its etiology, using both monogenic and oligogenic models for this antibody deficiency.
28

Paschoal, Alexandre Rossi. "GINGA - Graphical Interface for Comparative Genome Analysis: o desenvolvimento de um sistema computacional de visualização gráfica para a análise comparativa de genomas de bactérias." Laboratório Nacional de Computação Científica, 2007. http://www.lncc.br/tdmc/tde_busca/arquivo.php?codArquivo=124.

Abstract:
This study aimed to develop a computational system for the graphical visualization of comparative analyses of prokaryotic genomes. The system, named GINGA (Graphical Interface for comparative Genome Analysis), was developed to analyse a draft genome sequence in comparison with a complete genome. It shows the alignment of reads, contigs and scaffolds from a partially sequenced genome against the complete sequence of another genome, and allows the identification of shared and unique regions as well as rearrangements. GINGA is a web-based system developed in the PERL language to access a MySQL database, where all the information from the comparative analysis is stored. The PERL interface module to the GD graphics library was used to build the visualization tool. The graphical view supports zooming in and out and provides information on assembly, annotation of coding sequences and the organization of the sequences between the genomes. Supplementary information can be accessed in the form of reports. GINGA was used to compare the genomes of Leifsonia xyli subsp. cynodontis (Lxc, draft genome sequence) and Leifsonia xyli subsp. xyli (Lxx, complete genome sequence). Lxx causes ratoon stunting disease in sugarcane, whereas Lxc can colonize sugarcane without causing disease symptoms. The main goal was to identify, while the Lxc genome was still being sequenced, genetic differences that may help to explain the pathogenicity of Lxx towards sugarcane. A total of 9,754 reads from the Lxc genome, assembled into 1,064 contigs and 317 scaffolds and comprising 1,470,731 non-redundant bases, were used in the analysis. GINGA allowed the identification of 206,320 bp (~20%) of Lxc-specific sequence in contigs and 56,884 bp in 19 specific scaffolds (5.9%), around 1 million bp aligned to the Lxx genome, and at least 6 large-scale genomic rearrangements. These results were presented in a graphical interface and in reports, which helped to guide the partial genome sequencing by indicating which regions should be sequenced further, while also supporting the formulation of hypotheses about important biological aspects of these microorganisms.
29

Martin, Kyle. "Investigating the evolutionary impact of the teleost genome duplication through comparative genomics and phylogenetic analysis of homeobox genes in the Osteoglossomorpha." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:7c2df1c9-0aa6-4a63-a38b-757b3f12f664.

Abstract:
Multiple rounds of whole genome duplication (WGD) have played a pivotal role in the expansion, elaboration, and evolutionary diversification of vertebrate genomes. In addition to sharing two rounds of whole genome duplication with all other vertebrates, a teleost-specific genome duplication (TGD) occurred in the stem of the teleost lineage ~350 million years ago (MYA) and is thus a genomic synapomorphy shared by all ~26,000 extant species. The TGD has variously been implicated in accelerated speciation, evolution of morphological complexity, increased rates of molecular evolution, and the evolution of novelty, and is therefore of significant interest for its impact on teleost evolution and also as a model for understanding the evolutionary patterns and processes which accompany WGDs more generally. Investigation of the TGD has contributed extensively to the general understanding of WGDs; however, until the present work, only a relatively narrow taxonomic sampling of species within a single teleost subdivision, Clupeocephala, has been investigated. This taxonomic bias has left potentially relevant evolutionary changes to the teleost genome in the immediate wake of the TGD obscured. Due to their deeply branching ancestry, species belonging to the two other major teleost subdivisions, Osteoglossomorpha and Elopomorpha, are well positioned for deeper comparative genomic analyses of the TGD and the accompanying phenomenon of diploidization. The focus of the present work has been to develop the first genomic resources specifically for osteoglossomorphs and to investigate the evolutionary patterns and processes which accompanied diploidization prior to the deep divergence of the three extant teleost subdivisions.
To this end, I have generated de novo genome and transcriptome data from four osteoglossomorph taxa (Pantodon buchholzi, Osteoglossum bicirrhosum, Chitala ornata, and Gnathonemus petersii) and conducted comparative genomic and phylogenetic analysis with other teleosts and pre-TGD vertebrates including the gar Lepisosteus oculatus. With a focus on Hox and other ANTP class homeobox-containing transcription factor families, I provide evidence that speciation of the major teleost subdivisions occurred prior to the termination of the diploidization process following TGD and discuss the evolutionary implications of this model. Beginning with an analysis of the Hox clusters in P. buchholzi, I show that divergent resolution of TGD-generated Hox duplicates occurred both at the individual gene level and at the level of whole cluster losses. Detailed phylogenetic analyses of the P. buchholzi Hox clusters further revealed that the transition from polyploid alleles to full paralogs during the diploidization process can occur independently in different lineages when speciation rapidly follows WGDs, causing duplicated genes to exhibit a special case of four-way gene homology which I have termed 'tetralogy'. A genome-wide survey of ANTP class homeobox genes in a de novo assembly of the P. buchholzi genome revealed that ancient TGD duplicates of at least 14 subfamilies were preserved uniquely in the P. buchholzi genome and lost from clupeocephalan teleosts. Finally, by comparing the Hox complements in gar and P. buchholzi with three additional osteoglossomorphs, I show that the diversity in potential duplicate resolution patterns is also highly variable between osteoglossomorph families. Overall, this work highlights the importance of considering not only the relative timing of gene duplication and speciation in comparative genomic analyses but also their timing relative to diploidization.
Going forward, the research community will need to carefully evaluate how differences in diploidization rate and pattern, both between lineages and across the genome, have influenced the fate of individual gene duplicates as well as the macroevolutionary phenomena frequently correlated with WGDs more generally.
30

Zhou, Bin. "Construction of a minimal tiling path across the euchromatic arms of sorghum chromosome 3 and comparative analysis with the rice chromosome 1 pseudomolecule." College Station, Tex. : Texas A&M University, 2006. http://hdl.handle.net/1969.1/ETD-TAMU-1162.

31

Abril, Ferrando Josep Francesc. "Comparative analysis of eukaryotic gene sequence features." Doctoral thesis, Universitat Pompeu Fabra, 2005. http://hdl.handle.net/10803/7108.

Abstract:
The constantly increasing amount of available genome sequences, along
with an increasing number of experimental techniques, will help to
produce the complete catalog of cellular functions for different
organisms, including humans. Such a catalog will define the base from
which we will better understand how organisms work at the molecular
level. At the same time it will shed light on which changes are
associated with disease. Therefore, the raw sequence from genome
sequencing projects is worthless without the complete analysis and
further annotation of the genomic features that define those functions.
This dissertation presents our contribution to three related aspects of
gene annotation on eukaryotic genomes.

First, a comparison at sequence level of human and mouse genomes was
performed by developing a semi-automatic analysis pipeline. The SGP2
gene-finding tool was developed from procedures used in this pipeline.
The concept behind SGP2 is that similarity regions obtained by TBLASTX
are used to increase the score of exons predicted by geneid, in order to
produce a more accurate set of gene structures. SGP2 provides a
specificity that is high enough for its predictions to be experimentally
verified by RT-PCR. The RT-PCR validation of predicted splice junctions
also serves as example of how combined computational and experimental
approaches will yield the best results.
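The idea described above, raising the score of a predicted exon when it overlaps a region of cross-species similarity, can be sketched as follows. This is only an illustration under stated assumptions: the `Exon` class, the `(start, end, score)` layout for similarity regions, and the linear `weight` are hypothetical and do not reproduce the actual SGP2 or geneid scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class Exon:
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    score: float


def overlap_length(a_start, a_end, b_start, b_end):
    """Number of positions shared by two closed intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def boosted_score(exon, similarity_regions, weight=0.5):
    """Add a bonus proportional to the exon fraction covered by each region.

    similarity_regions: iterable of (start, end, score) tuples, standing in
    for similarity hits projected onto the query sequence (an assumption).
    """
    exon_length = exon.end - exon.start + 1
    bonus = 0.0
    for start, end, region_score in similarity_regions:
        covered = overlap_length(exon.start, exon.end, start, end)
        bonus += weight * region_score * covered / exon_length
    return exon.score + bonus


exon = Exon(start=100, end=199, score=5.0)
regions = [(150, 249, 4.0)]  # covers the second half of the exon
print(boosted_score(exon, regions))
```

Exons that do not overlap any similarity region keep their original score, so the similarity evidence can only promote candidates, never penalize them, in this simplified version.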

Then, we performed a descriptive analysis at sequence level of the
splice site signals from a reliable set of orthologous genes for human,
mouse, rat and chicken. We have explored the differences at nucleotide
sequence level between U2 and U12 for the set of orthologous introns
derived from those genes. We found that orthologous splice signals
between human and rodents and within rodents are more conserved than
unrelated splice sites. However, additional conservation can be
explained mostly by background intron conservation. Additional
conservation over background is detectable in orthologous mammalian and
chicken splice sites. Our results also indicate that the U2 and U12
intron classes have evolved independently since the split of mammals and
birds. We found neither a convincing case of interconversion between these
two classes in our sets of orthologous introns, nor any single case of
switching between AT-AC and GT-AG subtypes within U12 introns. In
contrast, switching between GT-AG and GC-AG U2 subtypes does not appear
to be unusual.
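The boundary subtypes named above (GT-AG, GC-AG, AT-AC) are defined by the first and last two nucleotides of an intron, so a switch between orthologous introns can be detected with a very small helper. The sketch below uses hypothetical function names and toy sequences; it only reads terminal dinucleotides and does not attempt to tell U2 from U12 introns, which requires additional signals such as the donor and branch point consensus.

```python
def boundary_subtype(intron):
    """Return the terminal dinucleotides of an intron, e.g. 'GT-AG'."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have both boundaries")
    return f"{seq[:2]}-{seq[-2:]}"


def subtype_switch(intron_a, intron_b):
    """Return (subtype_a, subtype_b) if two orthologous introns differ, else None."""
    subtype_a = boundary_subtype(intron_a)
    subtype_b = boundary_subtype(intron_b)
    return None if subtype_a == subtype_b else (subtype_a, subtype_b)


# Toy sequences, not real introns.
print(boundary_subtype("gtaagtcccag"))
print(subtype_switch("GTAAGTTTTCAG", "GCAAGTTTTCAG"))
```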

Finally, we implemented visualization tools to integrate annotation
features for gene-finding and comparative analyses. One of those tools,
gff2ps, was used to draw the whole genome maps for human, fruitfly and
mosquito. gff2aplot and the accompanying parsers facilitate the task of
integrating sequence annotations with the output of homology-based tools,
like BLAST. We have also adapted the concept of pictograms to the
comparative analysis of orthologous splice sites, by developing compi.
32

Shaw, Daniel 1993. "Streamlining minimal bacterial genomes : Analysis of the pan bacterial essential genome, and a novel strategy for random genome deletions in Mycoplasma pneumoniae." Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/668244.

Abstract:
Understanding what constitutes a true Minimal Cell is a key challenge in synthetic biology. In this work, we present two new tools to aid in this endeavour. i) A novel methodology for minimising the Mycoplasma pneumoniae genome via random deletions of genetic material. This protocol utilises the Cre-lox system coupled with random transposon mutagenesis to create a population with lox sites dispersed randomly around the genome. This yields a population of M. pneumoniae cells with highly variable large- and small-scale deletions, ranging from 50 bp to 25 kb. ii) The first large-scale analysis of the essentiality of genes from multiple bacterial species, and of how the composition and function of the essential genome of a bacterium change with the genome’s complexity.
33

Brandström, Mikael. "Bioinformatic Analysis of Mutation and Selection in the Vertebrate Non-coding Genome." Doctoral thesis, Uppsala University, Department of Evolution, Genomics and Systematics, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-8240.

Abstract:

The majority of the vertebrate genome sequence is not coding for proteins. In recent years, the evolution of this noncoding fraction of the genome has gained interest. These studies have been greatly facilitated by the availability of full genome sequences. The aim of this thesis is to study evolution of the noncoding vertebrate genome through bioinformatic analysis of large-scale genomic datasets.

In a first analysis we addressed the use of conservation of sequence between highly diverged genomes to infer function. We provided evidence for a turnover of the patterns of negative selection. Hence, measures of constraint based on comparisons of diverged genomes might underestimate the functional proportion of the genome.

In the following analyses we focused on length variation as found in small-scale insertion and deletion (indel) polymorphisms and microsatellites. For indels in chicken, replication slippage is a likely mutation mechanism, as a large proportion of the indels are parts of tandem-duplicates. Using a set of microsatellite polymorphisms in chicken, where we avoid ascertainment bias, we showed that polymorphism is positively correlated with microsatellite length and AT-content. Furthermore, interruptions in the microsatellite sequence decrease the levels of polymorphism.
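To make the terms in this paragraph concrete, the sketch below locates perfect (uninterrupted) microsatellites with a regular expression and shows how a single interruption breaks the repeat. The unit length limit of 6 bp and the minimum of 4 copies are assumptions chosen for illustration; definitions vary between studies, and this is not the procedure used in the thesis.

```python
import re

# A unit of 1-6 bases followed by at least 3 further copies (4 copies in total).
# Both thresholds are illustrative assumptions.
MICROSATELLITE = re.compile(r"([ACGT]{1,6}?)\1{3,}")


def find_microsatellites(sequence):
    """Return (start, unit, copies) for each perfect repeat in the sequence."""
    hits = []
    for match in MICROSATELLITE.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


print(find_microsatellites("TTGACACACACACGG"))  # one uninterrupted AC repeat
print(find_microsatellites("ACACATACAC"))       # interrupted by a T: no hit
```

Under a stricter or looser minimum copy number the same sequence can yield different counts, which is one reason microsatellite abundance figures are hard to compare across studies.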

We also analysed the association between microsatellite polymorphism and recombination in the human genome. Here we found increased levels of microsatellite polymorphism in human recombination hotspots and also similar increases in the frequencies of single nucleotide polymorphisms (SNPs) and indels. This points towards natural selection shaping the levels of variation. Alternatively, recombination is mutagenic for all three kinds of polymorphisms.

Finally, I present the program ILAPlot. It is a tool for visualisation, exploration and data extraction based on BLAST.

Our combined results highlight the intricate connections between evolutionary phenomena. They also emphasise the importance of length variability in genome evolution, as well as the gradual nature of the distinction between indels and microsatellites.

34

Zhong, Cuncong. "Computational Methods for Comparative Non-coding RNA Analysis: From Structural Motif Identification to Genome-wide Functional Classification." Doctoral diss., University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5894.

Abstract:
Non-coding RNA (ncRNA) plays critical functional roles in biological systems, such as regulation, catalysis, and modification. Non-coding RNAs exert their functions through their specific structures, which makes a thorough understanding of those structures a key step towards complete functional annotation. In this dissertation, we cover a suite of computational methods for the comparison of ncRNA secondary and 3D structures, and their applications to ncRNA molecular structural annotation and genome-wide functional survey. Specifically, we have contributed the following five computational methods. First, we have developed an alignment algorithm to compare RNA structural motifs, which are recurrent RNA 3D structural fragments. Second, we have improved upon the previous alignment algorithm by incorporating base-stacking information and devising a new branch-and-bound algorithm. Third, we have developed a clustering pipeline for RNA structural motif classification using the above alignment methods. Fourth, we have generalized the clustering pipeline to a genome-wide analysis of RNA secondary structures. Finally, we have devised an ultra-fast alignment algorithm for RNA secondary structure by using the sparse dynamic programming technique. A large number of novel RNA structural motif instances and ncRNA elements have been discovered throughout these studies. We anticipate that these computational methods will significantly facilitate the analysis of ncRNA structures in the future.
Ph.D.
Doctorate
Computer Science
Engineering and Computer Science
Computer Science
35

Cragun, Deborah Le. "Universal Tumor Screening for Lynch Syndrome: Identification of system-level implementation factors influencing patient reach." Scholar Commons, 2013. http://scholarcommons.usf.edu/etd/4658.

Abstract:
Lynch syndrome (LS) is the most prevalent cause of hereditary colorectal cancer (CRC) and confers high risks for several other types of cancer. Universal tumor screening (UTS) of all newly diagnosed patients with CRC can improve LS identification and decrease associated morbidity and mortality among patients and family members. However, for UTS to be effective, patients who screen positive must pursue genetic counseling and confirmatory germline testing (i.e., high patient reach). The purposes of this study were to characterize UTS programs, identify barriers and facilitators to implementation, document whether there have been negative outcomes, and determine institutional and implementation conditions that are associated with high and low patient reach. Using two conceptual frameworks, RE-AIM and Consolidated Framework for Implementation Research, a baseline survey was conducted of 25 representatives from different institutions performing UTS. Descriptive statistics were used to illustrate similarities and differences among programs. A multiple-case study was then conducted by extracting data from surveys and interviews of representatives from 15 different institutions where UTS programs had been operational for over 6 months and where aggregated patient outcome data were available. Qualitative comparative analysis was performed to make systematic cross-case comparisons and identify conditions uniquely associated with high or low patient reach. Data were triangulated to create models explaining how UTS implementation and system-level factors influence patient reach. Few patient concerns or negative outcomes were reported. UTS procedures and patient reach were highly variable. All 5 high-reach (H-R) centers have genetics professionals disclose positive screening results and either do not require a referral from another health care provider or have streamlined the referral process. 
Although 2 of the 5 mid-reach (M-R) centers also share these conditions, they have a less automated follow-up procedure and report difficulty contacting patients as a barrier. Neither of the two academic institutions with low patient reach (L-R) received patient information that would allow them to follow up on positive screening results. The three non-academic L-R institutions reported a high ratio of challenges to facilitators during implementation and did not have genetics professionals disclose positive screening results to patients. Implementing a combination of measures that streamline UTS protocols and procedures, eliminate barriers to patient follow-through after a positive tumor screen, and involve genetics professionals closely in contacting patients and disclosing screening results is expected to improve patient reach.
36

Wilkinson, Tracey Nicole. "Evolutionary analysis of the relaxin peptide family and their receptors." Connect to thesis, 2006. http://repository.unimelb.edu.au/10187/2315.

Abstract:
The relaxin-like peptide family consists of relaxin-1, 2 and 3, and the insulin-like peptides (INSL)-3, 4, 5 and 6. The evolution of this family has been controversial; points of contention include the existence of an invertebrate relaxin and the absence of a ruminant relaxin. Using the known members of the relaxin peptide family, all available vertebrate and invertebrate genomes were searched for relaxin peptide sequences. Contrary to previous reports, an invertebrate relaxin was not found; sequence similarity searches indicate that the family emerged during early vertebrate evolution. Phylogenetic analyses revealed the presence of potential relaxin-3, relaxin and INSL5 homologs in fish, dating their emergence far earlier than previously believed. Furthermore, estimates of mutation rates suggested that the expansion of the family within mammals (i.e. the emergence of INSL6, INSL4 and relaxin-1) was driven by positive Darwinian selection. In contrast, relaxin-3 is constrained by strong purifying selection, implying a highly conserved function. (For complete abstract open document)
37

Ullrich, Sophie. "Genomic and transcriptomic characterization of novel iron oxidizing bacteria of the genus “Ferrovum”." Doctoral thesis, Technische Universitaet Bergakademie Freiberg Universitaetsbibliothek "Georgius Agricola", 2016. http://nbn-resolving.de/urn:nbn:de:bsz:105-qucosa-205981.

Abstract:
Acidophilic iron oxidizing bacteria of the betaproteobacterial genus “Ferrovum” are ubiquitously distributed in acid mine drainage (AMD) habitats worldwide. Since their isolation and maintenance in the laboratory has proved to be extremely difficult, members of this genus are not accessible to a “classical” microbiological characterization, with the exception of the designated type strain “Ferrovum myxofaciens” P3G. The present study reports the characterization of “Ferrovum” strains at the genome and transcriptome level. “Ferrovum” sp. JA12, “Ferrovum” sp. PN-J185 and “F. myxofaciens” Z-31 represent the iron oxidizers of the mixed cultures JA12, PN-J185 and Z-31. The mixed cultures were derived from the mine water treatment plant Tzschelln close to the lignite mining site in Nochten (Lusatia, Germany). The mixed cultures also contain a heterotrophic strain of the genus Acidiphilium. The genome analysis of Acidiphilium sp. JA12-A1, the heterotrophic contaminant of the mixed culture JA12, indicates an interspecies carbon and phosphate transfer between Acidiphilium and “Ferrovum” in the mixed culture, and possibly also in their natural habitat. The comparison of the inferred metabolic potentials of four “Ferrovum” strains and the analysis of their phylogenetic relationships suggest the existence of two subgroups within the genus “Ferrovum” (i.e. the operational taxonomic units OTU-1 and OTU-2) harboring characteristic metabolic profiles. OTU-1 includes the “F. myxofaciens” strains P3G and Z-31, which are predicted to be motile and diazotrophic, and to have a higher acid tolerance than OTU-2. The latter includes two closely related proposed species represented by the strains JA12 and PN-J185, which appear to lack the abilities of motility, chemotaxis and molecular nitrogen fixation. Instead, both OTU-2 strains harbor the potential to use urea as an alternative nitrogen source to ammonium, and even nitrate in the case of the JA12-like species.
The analysis of the genome architectures of the four “Ferrovum” strains suggests that horizontal gene transfer and loss of metabolic genes, accompanied by genome reduction, have contributed to the evolution of the OTUs. A trial transcriptome study of “Ferrovum” sp. JA12 supports the ferrous iron oxidation model inferred from its genome sequence, and reveals the potential relevance of several hypothetical proteins in ferrous iron oxidation. Although the inferred models in “Ferrovum” spp. share common features with the acidophilic iron oxidizers of the Acidithiobacillia, they appear to be more similar to those of the neutrophilic iron oxidizers Mariprofundus ferrooxydans (“Zetaproteobacteria”) and Sideroxydans lithotrophicus (Betaproteobacteria). These findings suggest a common origin of ferrous iron oxidation in the Beta- and “Zetaproteobacteria”, while the acidophilic lifestyle of “Ferrovum” spp. may have been acquired later, allowing them to also colonize acid mine drainage habitats.
38

Sahlqvist, Anna-Stina. "Genetic Characterization of Chicken Models for Autoimmune Disease." Doctoral thesis, Uppsala universitet, Autoimmunitet, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-182843.

Abstract:
Autoimmune diseases are endemic, but the disease mechanisms are poorly understood. One way to better understand them is to find disease-regulating genes. However, this is difficult as the diseases are complex, with several genes as well as environmental factors influencing the development of disease. A way to facilitate the search for genes responsible for the diseases is to use comparative genomic studies. Animal models are relatively easy to analyze since environment and breeding can be controlled. The University of California at Davis – line 200 (UCD-200) chickens have a hereditary disease that is similar to systemic sclerosis. Using a backcross between UCD-200 chickens and red junglefowl (RJF) chickens we identified three loci linked to the disease. The loci contained immune-regulatory genes suggested to be involved in systemic sclerosis in humans, as well as a previously unidentified linkage between systemic sclerosis in UCD-200 chickens and IGFBP3. The Dark brown (Db) gene enhances red pheomelanin and restricts expression of eumelanin in chickens. The Db phenotype is regulated by an 8 kb deletion upstream of SOX10. Pigmentation studies are potentially useful when trying to identify pathogenic mechanisms and candidate genes in vitiligo. The Obese strain (OS) of chickens spontaneously develops an autoimmune thyroiditis which closely resembles human Hashimoto’s thyroiditis. By using an intercross between OS chickens and RJF chickens, we found several disease phenotypes that can be used in an ongoing linkage analysis with the goal of finding candidate genes for autoimmune disease. An important phenotype to record and add to the linkage analysis is autoantibodies against thyroid peroxidase, since this phenotype is a key feature in Hashimoto’s thyroiditis. Previous attempts to measure these titres in OS chickens have failed, hence an assay was developed for this purpose.
39

Seibert, Sara Rose. "Host-parasite interactions: comparative analyses of population genomics, disease-associated genomic regions, and host use." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1590585260282244.

40

Herzog, Rebecca. "Global change genomics - comparative genomic analyses on environmental associated speciation and adaptation processes in Odonata." Hannover : Gottfried Wilhelm Leibniz Universität, 2021. http://d-nb.info/1238221785/34.

41

Soares, Siomar de Castro. "Pan-genomic analyses of Corynebacterium pseudotuberculosis and characterization of the biovars ovis and equi through comparative genomics." Universidade Federal de Minas Gerais, 2013. http://hdl.handle.net/1843/BUOS-9B8JTZ.

Abstract:
Corynebacterium pseudotuberculosis is the causative agent of diverse communicable diseases in small ruminants (biovar ovis), horses, camels, buffalo and other animals (biovar equi), which mainly differ in symptoms and site of infection. Additionally, the diseases present a highly important economic problem worldwide and there is still a lack of efficient treatments against C. pseudotuberculosis. In this work, we describe the steps from the first genome sequencing of a strain of C. pseudotuberculosis to the pan-genomic analyses of 15 strains isolated from different hosts and countries with diverse symptoms. Briefly, we introduce the genus Corynebacterium and the in silico analyses performed in pathogenic species of this genus to date. Then, we describe the implementation of a software tool for the prediction of pathogenicity islands (PAIs) in bacteria (PIPS), which outperformed the other available software and identified 7 PAIs with important virulence factors in C. pseudotuberculosis biovar ovis. Moreover, we extend the analyses of PAIs to strains of C. pseudotuberculosis biovar equi and predict, in silico, 49 putative vaccine targets that are shared by both biovars, ovis and equi. Finally, we present the phylogenomic, pan-genomic, core genomic, singleton and genomic plasticity analyses of the 15 strains of C. pseudotuberculosis, from both biovars. All the analyses performed here point to a clonal-like behavior of C. pseudotuberculosis, which could be the result of the facultative intracellular behavior of the species. Moreover, the biovar equi presents a higher variability in gene content when compared to biovar ovis, especially in PAI regions. Notably, the strains from biovar ovis present a high degree of similarity in pili clusters of genes, whereas the biovar equi strains are very variable.
The conservation of pili clusters of genes in biovar ovis could account for the ability of these strains to spread inside host tissues and penetrate live cells to live intracellularly, where they would have less contact to other organisms, thus, possibly explaining the clonal-like behavior of the biovar ovis.
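The pan-genome, core-genome and singleton categories named in this abstract can be illustrated with plain set operations. The following is only a minimal sketch, not the pipeline used in the thesis; the function name, strain names and gene-family identifiers are hypothetical and chosen for illustration.

```python
from collections import Counter

def partition_pangenome(gene_families_by_strain):
    """Split gene families into pan-genome, core genome and singletons.

    gene_families_by_strain: dict mapping strain name -> set of gene-family IDs
    (hypothetical input format; real studies derive families by clustering).
    """
    strains = list(gene_families_by_strain.values())
    # Count in how many strains each family occurs (sets: once per strain).
    counts = Counter(family for families in strains for family in families)
    pan = set(counts)                                            # seen anywhere
    core = {f for f, n in counts.items() if n == len(strains)}   # in every strain
    singletons = {f for f, n in counts.items() if n == 1}        # in one strain only
    return pan, core, singletons

# Hypothetical toy data, not taken from the thesis.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
pan, core, singletons = partition_pangenome(toy)
```

In this toy input only `fam1` is shared by all three strains, so it alone forms the core genome, while `fam3`, `fam4` and `fam5` are singletons.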
42

Pethick, Florence Elizabeth. "Comparative genomic analyses of Corynebacterium pseudotuberculosis." Thesis, University of Glasgow, 2013. http://theses.gla.ac.uk/4287/.

Abstract:
This study set out to sequence the genome of Corynebacterium pseudotuberculosis (Cp) 3/99-5, an ovine strain isolated from a naturally-occurring case of caseous lymphadenitis (CLA) in Scotland. The isolate was sequenced and assembled by 454 Life Sciences, and then gap closure performed by ‘PCR bridging’. The resulting sequence consisted of three contigs with a length of 2,319,079 bp and a G+C content of 52.18%. The genome was then annotated and predicted to contain 2,153 coding sequences. Analysis of the coding sequences revealed the presence of several putative virulence factors, including four sortases with multiple sortase target proteins containing LPXTG motifs. A further two Cp strains, an Australian ovine and a North American equine isolate, as well as C. ulcerans NCTC 12077 were sequenced for comparison. Comparative genomics, both intra- and inter-species showed all the genomes to be highly homologous. However, the C. ulcerans genome is larger than the Cp genomes and is more distinct; it was found to be more similar to the equine Cp 1/06-A isolate which is the most diverged of the Cp isolates. Phylogenetic analyses of the Corynebacterium genus were performed using house-keeping loci but also secreted protein loci from Cp 3/99-5. Bayesian analysis of house-keeping loci distinguished the bacteria to a species level. Inclusion of secreted protein loci did not distinguish the isolates any further. The main objective of this work was to utilise the Cp genome sequence to identify potential diagnostic targets which could be used to augment the available ELITEST CLA or replace it. The ELITEST CLA is the only diagnostic test for CLA that exists on the commercial market in the UK. However, due to low specificity and sensitivity, it is only operated on a flock/group basis. Analyses of the Cp 3/99-5 genome identified several potential diagnostic candidates and seven protein targets were investigated further. 
Attempts were made to express these candidates as recombinant proteins, however, only two recombinants were successfully expressed and purified, Cp3995_0570 and CP40. The seroreactivity of these were then assessed by IgG ELISA using a panel of ten positive and ten negative CLA ovine sera. The sera were previously defined as positive or negative by PLD and whole cell ELISAs; both of which showed a significant difference between sera types. However, neither Cp3995_0570 nor CP40 distinguished between sera originating from Cp-infected and Cp-naïve animals.
43

Wilson, Ian. "Comparative genomic analyses of Entamoeba species." Thesis, University of Liverpool, 2014. http://livrepository.liverpool.ac.uk/2007270/.

Abstract:
Amoebiasis is the third-most common cause of mortality worldwide from a disease borne of a parasitic infection. It affects up to 50 million people annually, of which 40,000 to 100,000 cases are fatal. Entamoeba histolytica is an obligate protozoon parasite of humans and is the aetiological agent of the disease. Recent suggestions that other members of the Entamoeba genus are human-infective, and potentially pathogenic, have been investigated here. A draft assembly and annotation of the 25 Mb genome of E. moshkovskii strain Laredo is presented, to which multiple E. moshkovskii strains were mapped. The E. moshkovskii genome was found to be approximately 200 times more variable than that of E. histolytica. Performance of the four-haplotype test revealed that genetic recombination does not seem to occur in E. moshkovskii. As such, it is suggested that it be referred to as a ‘species complex’, rather than an individual species. A comparative genomic analysis of E. histolytica HM-1:IMSS, E. moshkovskii Laredo, E. invadens IP-1 and the avirulent E. dispar SAW760 was performed. Subsequent comparative analyses against members of genera representative of the diversity in the Unikonts clade enabled the identification of orthologous gene families unique to the Entamoeba genus. Analysis of virulence factors within this set revealed that gene families involved in adhesion of amoebic trophozoites to host cells play a key role in the development of invasive amoebiasis. The Gal/GalNAc lectins and members of the BspA family are of particular interest, being present in all analysed species, except for E. dispar. The presence of these key families, plus cysteine proteases, in the E. moshkovskii genome suggests that some sequence types within this species complex may be pathogenic. E. invadens was found to possess larger numbers of more variable genes within many virulence factor families, including the BspA family and the Gal/GalNAc lectins. 
This suggests that sequence diversity facilitates E. invadens’ polyxenous lifestyle. Finally, a novel species recently isolated from a human faecal sample - E. bangladeshi, strain 8237 – was sequenced. Its genome was assembled using multiple de novo genome assemblers and coding sequences were assembled individually. A combination of all methods tested was found to be beneficial in maximising the number of gene sequences assembled, which is advised as good practice in future similar assemblies. The phylogeny of E. bangladeshi, achieved using the combined assemblies’ outputs suggested that the novel species is human-infective. The work presented here utilised modern comparative genomic techniques to improve understanding of Entamoeba species, their capacity for causing disease and their potential impact upon the epidemiology of amoebiasis.
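The abstract mentions a four-haplotype test for recombination without describing it. Assuming it follows the logic of the commonly described four-gamete test (if all four allele combinations at two biallelic sites are observed, recombination or recurrent mutation must have occurred), a minimal sketch could look as follows. The function name and the toy haplotypes are hypothetical and not taken from the thesis.

```python
from itertools import combinations

def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four two-site haplotypes occur.

    haplotypes: list of equal-length strings, each site assumed biallelic
    (for example coded "0"/"1"). An empty result is consistent with no
    detectable recombination under this test.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct allele combinations seen at sites i and j.
        observed = {(h[i], h[j]) for h in haplotypes}
        if len(observed) == 4:
            violations.append((i, j))
    return violations

# Hypothetical toy data.
clonal = ["000", "100", "110", "111"]      # compatible with a single tree
recombinant = ["00", "01", "10", "11"]     # all four combinations present
```

Here `four_gamete_violations(clonal)` returns an empty list, whereas `four_gamete_violations(recombinant)` reports the site pair `(0, 1)`.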
44

Moreno, Luisa Zanolli. "Caracterização e análise comparativa de genomas de estirpes de Leptospira isoladas no Brasil." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/10/10134/tde-03082017-155135/.

Abstract:
The present study aimed to characterize the genomes of Leptospira strains isolated in Brazil and to compare them with the genomes available in GenBank. Seventeen strains isolated from distinct animal species, in different regions of Brazil, from 1998 to 2012 were characterized. These had previously been typed through 16S rRNA gene sequencing and microscopic agglutination into six species (L. interrogans, L. santarosai, L. inadai, L. kirschneri, L. borgpetersenii and L. noguchii) and over eight serogroups. Illumina™ MiSeq sequencing and genome assembly with an ab initio algorithm were performed. For ordering and annotation, reference genomes of the respective species were used. The in silico analysis of Multilocus Sequence Typing (MLST) was performed for the three current Leptospira protocols. The comparative genomic analysis, including wgSNP, was performed within species, evaluating the existing variations between the serogroups of the studied Leptospira species. The L. interrogans strains presented MLST results congruent with their previous identification. In the case of L. kirschneri, only one strain presented new alleles in the three MLST protocols and was distant from the other Brazilian L. kirschneri strains. The L. santarosai strains, as well as those of L. borgpetersenii and L. noguchii, presented new alleles and/or allelic profiles for at least two of the current MLST protocols, and also stand out in a separate group of Brazilian origin. Even though the L. interrogans genomes presented high identity and synteny with the serovar Copenhageni reference, they also presented regions of difference between the respective serogroups. The genomes of serogroups Australis and Sejroe stood out for having insertions and deletions, respectively, mainly in chromosome 2. The L. borgpetersenii genome also presented great variation in composition, as expected for the species, which is driven by insertion sequences and transposition of mobile elements.
The serogroups Canicola and Pomona were closest to each other in the wgSNP analysis. Two plasmids were also identified in the serogroup Canicola genomes with high identity to the plasmids described in the Chinese strain of the same serovar. In the L. kirschneri species, the strain 47 (M36/05) presented high identity and synteny with the serovar Mozdok genomes, as expected, including the Brazilian strain of human origin. The strain 55 (M110/06) differed from other L. kirschneri genomes in both MLST and wgSNP. The Brazilian L. inadai genome presented high identity to the American reference of human origin, including the presence of a bacteriophage specific to the species. The distinction of the Brazilian L. santarosai strains in the MLST was also evidenced in the comparative analysis and in the wgSNP, and the strain 68 (M52/8-19), which showed no reactivity to the tested serogroups, also differs from the others, reaffirming the possibility of a new serogroup/serovar. Therefore, the genomic study allowed the identification of particularities of Brazilian Leptospira strains, including the existence of extrachromosomal elements and proximity to strains of human origin, indicating a greater risk for public health, in addition to the possibility of a new L. santarosai serogroup.
45

Waterhouse, Robert Michael. "Computational comparative analysis of insect genomes." Thesis, Imperial College London, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.512005.

46

Dehal, Paramvir Singh. "Comparative evolutionary analysis of chordate genomes /." For electronic version search Digital dissertations database. Restricted to UC campuses. Access is free to UC campus dissertations, 2003. http://uclibs.org/PID/11984.

47

Wasbrough, Elizabeth. "Comparative genomic and evolutionary analysis of sperm proteomes." Thesis, University of Bath, 2011. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.538135.

Abstract:
While the central role of spermatozoa in sexual reproduction and fertilization is well understood, many functional attributes of sperm have yet to be elucidated at the molecular level. One key to ultimately understanding the molecular basis of sperm function is to comprehensively characterize its biochemical composition. This crucial information has been lacking, as molecular characterization of the sperm cell cannot be assessed by classic gene expression assays since mature spermatozoa are transcriptionally inert. Whole-cell shotgun proteomic approaches have revolutionized the molecular analysis of sperm form and function. We have utilized improved methodologies to re-analyze the D. melanogaster sperm proteome and characterize five additional Drosophila species sperm proteomes. This methodology, which included a 1D SDS-PAGE prefractionation step, resulted in good reproducibility between biological replicates and high quality sperm proteomes. An interspecific analysis of the sperm proteomes revealed that despite variation in protein composition, Drosophila sperm proteomes have a consistent functional profile, and 519 proteins were identified as being conserved across the melanogaster subgroup within a phylogenetic framework. Evolution of the sperm proteome was explored in Mus musculus through the use of completed targeted proteomic datasets, which provided subcellular localizations for sperm components. This study resulted in several novel findings, including evidence for accelerated evolution as well as an enrichment of positive selection on genes found in the cell membrane and acrosome. This may be a result of the selective pressures encountered by these membrane proteins during sperm development, maturation and transit through the female reproductive tract where the sperm cell membrane, and eventually the acrosome, are exposed to the extracellular milieu and are available for direct cell-cell interactions.
These findings not only reveal the varying evolutionary pressures acting on a single cell type but also highlight the utility of the proteomics technique in clarifying protein interaction and evolutionary history.
48

Anastasi, Elisa. "Comparative genomics and emerging antibiotic resistance in Rhodococcus equi." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/25887.

Abstract:
Rhodococcus equi is a soil-dwelling facultative intracellular pathogen that can infect many mammals, including humans. R. equi is best known for its ability to cause severe pyogranulomatous disease in foals, primarily involving the lungs although other body systems may also be affected. The disease is endemic on many horse-breeding farms worldwide and poses a severe threat to the horse breeding industry because there is no vaccine available. Current prophylaxis is based on systematic preventative treatments with macrolides combined with rifampicin, which are also used to treat clinical cases of the disease in foals. In this thesis I have used a combination of wet laboratory and bioinformatic approaches to identify the molecular basis of emerging combined resistance to macrolides and rifampicin in R. equi foal isolates from the USA. The genomes of a selection of resistant and susceptible strains from across the USA were sequenced and assembled. Resistance genes were systematically searched by reciprocal best-match BLASTP comparisons to known antibiotic resistance determinants. This led to the discovery of a novel erythromycin ribosomal methylase (erm) gene, erm(46), in all resistant strains. Complementation analysis in a susceptible R. equi strain showed that erm(46) was sufficient to confer resistance to all macrolides, lincosamides, and streptogramin B. The erm(46) gene is carried by an integrative conjugative element (ICE) which is transferable between R. equi strains. The ICE is formed by two distinct parts, a class I integron associated with an IS6100 sequence and the erm(46) determinant carried by a sub-element which contains putative actinobacterial conjugative translocase apparatus and a transposase/integrase. All resistant strains also carry the same nonsynonymous point mutation in rpoB conferring rifampicin resistance. Thus, these strains carry double resistance to the antibiotics most commonly used to treat R. equi worldwide.
Phylogenetic analysis based on the core genome demonstrated that all resistant strains are clonal. This indicates that although conjugal acquisition of the erm(46) conjugative element may occur at a high frequency, the need for the concurrent presence of a second rpoB mutation for survival in the macrolide and rifampicin dominated farm environment has effectively selected for the spread of a single clone. In the second section of this work, we sequenced a further 20 R. equi genomes from different sources (equine, porcine, bovine, human), including representatives of each of the seven major genogroups previously defined in our laboratory based on pulsed field gel electrophoresis. I have used the newly acquired genetic information to study the genome of R. equi and analyse its diversity within and outwith its species group. This enabled us to explore the pan genome and establish that R. equi is a genetically well-defined bacterial species. Our results provide definitive evidence that resolves the current dispute over R. equi classification; specifically, they do not support the recent proposal (based on classical polyphasic bacterial taxonomical methods) that R. equi should be transferred to a new genus. Our core-genome phylogenomic analyses unambiguously show that the genus Rhodococcus is monophyletic and that R. equi forms a clade together with the most recently described related environmental species R. defluvii that radiates from within the genus. Together with other shared biological and genetic characteristics, namely the unique niche-adaptive mechanism based on evolutionarily related extrachromosomal replicons, these results indicate that R. equi should be considered a bona fide member of the genus Rhodococcus. We also confirm that Rhodococcus spp. and Nocardia spp. are sufficiently distinct to warrant them belonging to different genera.
In conclusion, this work used whole genome sequencing to characterize the molecular basis underlying the emergence and clonal spread of multi-resistant R. equi in horse breeding farms in the USA. This work also highlights the limitations of classical taxonomical approaches in bacterial systematics, and illustrates the importance of incorporating modern phylogenomic approaches to understand the evolutionary relationships between bacterial strains and their accurate taxonomic position.
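The reciprocal best-match search mentioned in this abstract can be sketched in a few lines. The example below is a hedged illustration rather than the author's workflow: it does not run BLASTP, it assumes the hit scores have already been parsed into dictionaries, and all identifiers and scores in the toy data are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> bit score (assumed input format).
    On ties, the first hit encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(forward, reverse):
    """Return sorted (query, subject) pairs that are each other's best hit.

    forward: scores for genome proteins searched against reference proteins.
    reverse: scores for reference proteins searched against genome proteins.
    """
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)

# Hypothetical toy scores, not taken from the thesis.
forward = {
    ("geneA", "erm_known"): 250.0,
    ("geneA", "other"): 80.0,
    ("geneB", "erm_known"): 120.0,
}
reverse = {
    ("erm_known", "geneA"): 245.0,
    ("erm_known", "geneB"): 118.0,
    ("other", "geneB"): 60.0,
}
pairs = reciprocal_best_hits(forward, reverse)
```

In this toy input both `geneA` and `geneB` have `erm_known` as their best forward hit, but only `geneA` is the best reverse hit of `erm_known`, so a single pair is retained.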
49

Itou, Junji. "Functional and comparative genomics analyses of pmp22 in medaka fish." Kyoto University, 2009. http://hdl.handle.net/2433/126464.

50

Ongen, Halit. "Comparative Genomic Analysis of Susceptibility to Coronary Artery Disease." Thesis, University of Oxford, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.514975.
