Dissertations / Theses on the topic 'Protein sequence evolution'

Consult the top 41 dissertations / theses for your research on the topic 'Protein sequence evolution.'

1

Hollich, Volker. "Orthology and protein domain architecture evolution." Stockholm, 2006. http://diss.kib.ki.se/2006/91-7140-783-9/.

2

Kosiol, Carolin. "Markov models for protein sequence evolution." Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.614166.

3

Warnecke, Tobias. "Determinants of coding sequence evolution - beyond protein function." Thesis, University of Bath, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531341.

4

Guney, Tacettin Dogacan. "Prediction Of Protein-protein Interactions From Sequence Using Evolutionary Relations Of Proteins And Species." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12611058/index.pdf.

Abstract:
Prediction of protein-protein interactions is an important part of understanding the biological processes in a living cell. There are completely sequenced organisms that do not yet have experimentally verified protein-protein interaction networks. For such organisms, we cannot generally use a supervised method, where a portion of the protein-protein interaction network is used as a training set. Furthermore, for newly sequenced organisms, many other data sources that are used to identify protein-protein interaction networks, such as gene expression data and gene ontology annotations, may not be available. In this thesis work, our aim is to identify and cluster likely protein-protein interaction pairs using only protein sequences and evolutionary information. We use a protein's phylogenetic profile because the co-evolutionary pressure hypothesis suggests that proteins with similar phylogenetic profiles are likely to interact. We also divide the phylogenetic profile into smaller profiles based on the evolutionary lines. These divided profiles are then used to score the similarity between all possible protein pairs. Since not all profile groups have the same number of elements, it is a difficult task to assess the similarity between such pairs. We show that many commonly used measures do not work well and that the end result greatly depends on the type of similarity measure used. We also introduce a novel similarity measure. The resulting dense putative interaction network contains many false-positive interactions; therefore, we apply the Markov Clustering algorithm to cluster the protein-protein interaction network and filter out the weaker edges. The end result is a set of clusters where proteins within the clusters are likely to be functionally linked and to interact. While this method does not perform as well as supervised methods, it has the advantage of not requiring a training set and of working with only sequence data and evolutionary information. It can therefore be used as a first step in identifying protein-protein interactions in newly sequenced organisms.
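The idea of scoring protein pairs by the similarity of their phylogenetic profiles can be sketched as follows. This is only an illustration: the Jaccard index is used as a stand-in similarity measure (the abstract does not specify the thesis's novel measure), and the protein names and presence/absence vectors are hypothetical.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles.

    Each profile is a sequence of 0/1 flags, one per reference genome,
    marking whether a homolog of the protein was found in that genome.
    """
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def score_pairs(profiles):
    """Score every unordered protein pair by profile similarity."""
    return {
        (p, q): jaccard(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }


# Hypothetical profiles over five reference genomes (assumption, not thesis data).
profiles = {
    "protA": [1, 1, 0, 1, 0],
    "protB": [1, 1, 0, 1, 0],
    "protC": [0, 0, 1, 0, 1],
}
scores = score_pairs(profiles)
```

Pairs with high scores, such as `("protA", "protB")` here, would be the candidate interactions passed on to a clustering step.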
5

Davies, L. "Sequence database searching using structural models of protein evolution." Thesis, University of Cambridge, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.598371.

Abstract:
Commonly used programs to search sequence databases such as BLAST, FASTA and SSEARCH identify sequence homology through pairwise alignment techniques. These programs are good at detecting closely related sequences but have problems accurately detecting homologous sequences with low sequence identity. This thesis describes a new approach that attempts to improve the detection of distantly related sequences by rejecting the assumption that all sites in a protein behave in an identical manner. This is done without the use of profile techniques, which require the preliminary collection of a set of homologs. Existing programs use general properties of proteins to generate alignment scores, which simplify calculations but may also result in a decrease in accuracy. In reality, amino acid replacement probabilities and rates, amino acid frequencies and gap probabilities all vary according to where a residue lies in a protein structure. Typical patterns of these structure-specific variations in evolutionary dynamics can be incorporated into a database search program through the use of hidden Markov models (HMMs), and hence potentially improve the detection of more distantly related sequences. In this thesis, the utility of including structure-specific evolutionary information in a database search program has been assessed. I have developed a general methodology permitting structure-based evolutionary models to be used for database searching, and specific algorithms that incorporate either solvent accessibility distinctions or protein secondary structure distinctions for globular proteins. In addition I have developed a database search algorithm for transmembrane proteins. The improvement afforded by adding the extra information has then been evaluated through the use of both simulated sequences, which exactly fit the models, and real sequences from the SCOP database. The success rate of each of these programs has been compared to a simplified model that contains the general properties of proteins but with no structural distinctions.
6

Wistrand, Markus. "Hidden Markov models for remote protein homology detection." Stockholm, 2005. http://diss.kib.ki.se/2006/91-7140-598-4/.

7

Nordesjö, Olle. "Searching for novel protein-protein specificities using a combined approach of sequence co-evolution and local structural equilibration." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-275040.

Abstract:
A greater understanding of how protein simulations and statistical characteristics of biomolecular interfaces can serve as proxies for biological function will enable major advances in protein engineering. Here we show how to use calculated change in binding affinity and coevolutionary scores to predict the functional effect of mutations in the interface between a Histidine Kinase and a Response Regulator. These proteins participate in the Two-Component Regulatory system, a system for intracellular signalling found in bacteria. We find that both scores work as proxies for functional mutants and demonstrate a ~30-fold improvement in initial positive predictive value in the top 20 mutants, compared with choosing randomly from a sequence space of 160 000 variants. We also demonstrate qualitative differences in the predictions of the two scores, primarily a tendency for the coevolutionary score to miss one class of functional mutants with an enriched frequency of the amino acid threonine in one position.
8

Höglund, Pär J. "Identification, Characterization and Evolution of Membrane-bound Proteins." Uppsala : Acta Universitatis Upsaliensis, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-9329.

9

Dubey, Anshul. "Search and Analysis of the Sequence Space of a Protein Using Computational Tools." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14115.

Abstract:
A new approach to the process of Directed Evolution is proposed, which utilizes different machine learning algorithms. Directed Evolution is a process of improving a protein for catalytic purposes by introducing random mutations in its sequence to create variants. Through these mutations, Directed Evolution explores the sequence space, which is defined as all the possible sequences for a given number of amino acids. Each variant sequence is assigned to one of two classes, positive or negative, according to its activity or stability. By employing machine learning algorithms for feature selection on the sequences of these variants, the attributes (amino acids) in the sequence that are important for the classification into positive or negative can be identified. Support Vector Machines (SVMs) were utilized to identify the important individual amino acids for any protein, which have to be preserved to maintain its activity. The results for the case of beta-lactamase show that such residues can be identified with high accuracy while using a small number of variant sequences. Another class of machine learning problems, Boolean Learning, was used to extend this approach to identifying interactions between the different amino acids in a protein's sequence using the variant sequences. It was shown through simulations that such interactions can be identified for any protein with a reasonable number of variant sequences. For experimental verification of this approach, two fluorescent proteins, mRFP and DsRed, were used to generate variants, which were screened for fluorescence. Using Boolean Learning, an interacting pair was identified, which was shown to be important for the fluorescence. It was also shown through experiments and simulations that knowing such pairs can increase the fraction of active variants in the library. A Boolean Learning algorithm was also developed for this application, which can learn Boolean functions from data in the presence of classification noise.
10

Randall, Ryan Nicole. "Experimental phylogenetics: a benchmark for ancestral sequence reconstruction." Thesis, Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/48998.

Abstract:
The field of molecular evolution has benefited greatly from the use of ancestral sequence reconstruction as a methodology to better understand the molecular mechanisms associated with functional divergence. The method of ancestral sequence reconstruction has never been experimentally validated despite the method being exploited to generate high profile publications and gaining wider use in many laboratories. The failure to validate such a method is a consequence of 1) our inability to travel back in time to document evolutionary transitions and 2) the slow pace of natural evolutionary processes that prevent biologists from ‘witnessing’ evolution in action (pace viruses). In this thesis research, we have generated an experimentally known phylogeny of fluorescent proteins in order to benchmark ancestral sequence reconstruction methods. The tips/leaves of the fluorescent protein experimental phylogeny are used to determine the performances of various ASR methods. This is the first example of combining experimental phylogenetics and ancestral sequence reconstruction.
11

Chan, Yvonne H. "The Complex Role of Sequence and Structure in the Stability and Function of the TIM Barrel Proteins." eScholarship@UMMS, 2011. http://escholarship.umassmed.edu/gsbs_diss/934.

Abstract:
Sequence divergence of orthologous proteins enables adaptation to a plethora of environmental stresses and promotes evolution of novel functions. As one of the most common motifs in biology capable of diverse enzymatic functions, the TIM barrel represents an ideal model system for mapping the phenotypic manifestations of protein sequence. Limits on evolution imposed by constraints on sequence and structure were investigated using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Exploration of fitness landscapes of phylogenetically distant orthologs provides a strategy for elucidating the complex interrelationship in the context of a protein fold. Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape. These results suggest that fitness landscapes of point mutants can be successfully translocated in sequence space, where knowledge of one landscape may be predictive for the landscape of another ortholog. Analysis of a surprising class of beneficial mutations in all three IGPS orthologs pointed to a long-range allosteric pathway towards the active site of the protein. Biophysical and biochemical analyses provided insights into the molecular mechanism of these beneficial fitness effects. Epistatic interactions suggest that the helical shell may be involved in the observed allostery. Taken together, knowledge of the fundamental properties of the TIM protein architecture will provide new strategies for de novo protein design of a highly targeted protein fold.
12

Chan, Yvonne H. "The Complex Role of Sequence and Structure in the Stability and Function of the TIM Barrel Proteins." eScholarship@UMMS, 2017. https://escholarship.umassmed.edu/gsbs_diss/934.

13

Menlove, Kit J. "Model Detection Based upon Amino Acid Properties." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2253.

Abstract:
Similarity searches are an essential component of most bioinformatic applications. They form the basis of structural motif identification, gene identification, and insights into functional associations. With the rapid increase in the available genetic data through a wide variety of databases, similarity searches are an essential tool for accessing these data in an informative and productive way. In our chapter, we provide an overview of similarity searching approaches, related databases, and parameter options to achieve the best results for a variety of applications. We then provide a worked example and some notes for consideration. Homology detection is one of the most basic and fundamental problems at the heart of bioinformatics. It is central to problems currently under intense investigation in protein structure prediction, phylogenetic analyses, and computational drug development. Currently, discriminative methods for homology detection, which are not readily interpretable, are substantially more powerful than their more interpretable counterparts, particularly when sequence identity is very low. Here I present a computational graph-based framework for homology inference using physicochemical amino acid properties, which aims both to reduce the gap in accuracy between discriminative and generative methods and to provide a framework for easily identifying the physicochemical basis for the structural similarity between proteins. The accuracy of my method slightly improves on the accuracy of PSI-BLAST, the most popular generative approach, and underscores the potential of this methodology given a more robust statistical foundation.
14

Stern, Joshua Gallant. "STORI: selectable taxon ortholog retrieval iteratively." Thesis, Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/53377.

Abstract:
Speciation and gene duplication are fundamental evolutionary processes that enable biological innovation. For over a decade, biologists have endeavored to distinguish orthology (homology caused by speciation) from paralogy (homology caused by duplication). Disentangling orthology and paralogy is useful to diverse fields such as phylogenetics, protein engineering, and genome content comparison. A common step in ortholog detection is the computation of Bidirectional Best Hits (BBH). However, we found this computation impractical for more than 24 Eukaryotic proteomes. Attempting to retrieve orthologs in less time than previous methods require, we developed a novel algorithm and implemented it as a suite of Perl scripts. This software, Selectable Taxon Ortholog Retrieval Iteratively (STORI), retrieves orthologous protein sequences for a set of user-defined proteomes and query sequences. While the time complexity of the BBH method is O(#taxa^2), we found that the average CPU time used by STORI may increase linearly with the number of taxa. To demonstrate one aspect of STORI’s usefulness, we used this software to infer the orthologous sequences of 26 ribosomal proteins (rProteins) from the large ribosomal subunit (LSU), for a set of 115 Bacterial and 94 Archaeal proteomes. Next, we used established tree-search methods to seek the most probable evolutionary explanation of these data. The current implementation of STORI runs on Red Hat Enterprise Linux 6.0 with installations of Moab 5.3.7, Perl 5 and several Perl modules. STORI is available at: .
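The Bidirectional Best Hits computation mentioned in this abstract can be sketched for a single pair of proteomes as below. This is a minimal illustration, not the STORI algorithm or its Perl implementation; the function names, sequence identifiers and scores are hypothetical, and the similarity scores are assumed to come from some prior search step.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query_id, subject_id) to a similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def pairwise_comparisons(n_taxa):
    """Number of proteome pairs an all-against-all BBH scheme must process."""
    return n_taxa * (n_taxa - 1) // 2


# Hypothetical scores between proteome A and proteome B (assumption).
a_vs_b = {("a1", "b1"): 250, ("a1", "b2"): 90, ("a2", "b2"): 180, ("a3", "b2"): 120}
b_vs_a = {("b1", "a1"): 245, ("b2", "a2"): 175, ("b2", "a3"): 110}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

Here `a3` is dropped because its best hit `b2` prefers `a2`. The `pairwise_comparisons` helper shows where the quadratic growth in the number of taxa comes from: every pair of proteomes has to be compared.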
15

Capella Gutiérrez, Salvador Jesús. "Analysis of multiple protein sequence alignments and phylogenetic trees in the context of phylogenomics studies." Doctoral thesis, Universitat Pompeu Fabra, 2012. http://hdl.handle.net/10803/97289.

Abstract:
Phylogenomics is a biological discipline which can be understood as the intersection of the fields of genomics and evolution. Its main focuses are the analysis of genomes through the evolutionary lens and the understanding of how different organisms relate to each other. Moreover, phylogenomics allows accurate functional annotation of newly sequenced genomes. This discipline has grown in response to the deluge of data coming from different genome projects. To achieve its objectives, phylogenomics depends heavily on the accuracy of different methods to generate precise phylogenetic trees. Phylogenetic trees are the basic tool of this field and serve to represent how sequences or species relate to each other through common ancestry. During my thesis, I have centered my efforts on improving an automated pipeline to generate accurate phylogenetic trees and on their subsequent publication through a public database. Among the efforts to improve the pipeline, I have especially focused on the problem of multiple sequence alignment post-processing, which has been shown to be central to the reliability of subsequent analyses. Subsequently I have applied this pipeline, and a battery of other phylogenomics tools, to the study of the phylogenetic position of Microsporidia, a group of fast-evolving intracellular parasites. Due to their special genomic features, Microsporidia evolution constitutes one of the classical examples of challenging problems for phylogenomics. Finally, I have also used the pipeline as a part of a newly designed method for selecting robust combinations of phylogenetic gene markers. I have used this method for selecting optimal gene sets to assess the phylogenetic relationships within fungi and cyanobacteria, showing that the potential of these genes as phylogenetic markers goes well beyond the species used for their selection.
16

Kratzer, James Timothy. "Reengineering a human-like uricase for the treatment of gout." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/52149.

Abstract:
There is an unmet medical need in the treatment of gout. This type of inflammatory arthritis can be efficiently alleviated by the enzyme uricase. This enzyme breaks down uric acid, the causative agent of gout, so it can be flushed from the body. In humans and the other great apes, uricase is a pseudogene and as such is inactive. Research on therapeutic uricases has focused on using enzymes from naturally occurring sources; however, these foreign proteins can be very antigenic and present a potentially life-threatening safety risk to patients. We address the challenges of developing a safer uricase therapeutic by exploiting evidence that, while inactive, the human pseudogene is expressed in the human body and may be recognized as self by the immune system. To develop a "human-like" uricase, we apply the hybrid computational and experimental approach of Ancestral Sequence Reconstruction to search the functional sequence space of uricase proteins and engineer an enzyme with high sequence identity to the human pseudogene that possesses therapeutic levels of activity for the breakdown of uric acid. This dissertation describes the development and characterization of several uricase leads. The most active ancestral uricase possesses both enhanced in vitro and in vivo stability (in healthy rats) when assayed head-to-head against Pegloticase, the only FDA-approved uricase for the treatment of gout.
17

WEBER, M. ELISABETH. "Transporteurs de pyrimidines chez saccharomyces cerevisiae : sequence de deux genes et prediction de structure des proteines correspondantes." Université Louis Pasteur (Strasbourg) (1971-2008), 1987. http://www.theses.fr/1987STR13197.

18

Pond, Sergei L. "Modeling evolution of protein coding DNA sequences." Diss., The University of Arizona, 2003. http://hdl.handle.net/10150/289906.

Abstract:
We develop a new class of computationally feasible stochastic models for statistical analysis of genetic sequence evolution and inference of properties of the underlying substitution processes within a maximum likelihood framework. Existing models for evolution of protein coding sequences allow site-to-site variation in non-synonymous substitution rates, but assume that the rate of synonymous substitutions is constant for all sites. The new models provide a rigorous statistical framework for testing the hypothesis of synonymous rate constancy, and enable a host of data exploration and analysis tools. For several indicative data sets, the constancy assumption is shown to be violated, and some possible explanations are given. We also present an algorithm for improving the efficiency of maximum likelihood evaluations, and discuss HyPhy, a user-friendly and publicly distributed software implementation of our methods.
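The distinction between synonymous and non-synonymous substitutions that these models rest on can be illustrated with a short sketch. It only classifies a single-nucleotide codon change under the standard genetic code; it is not the likelihood model described in the abstract and does not use HyPhy. The function and variable names are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_from, codon_to):
    """Label a one-nucleotide codon change as synonymous or non-synonymous."""
    differences = sum(1 for x, y in zip(codon_from, codon_to) if x != y)
    if len(codon_from) != 3 or len(codon_to) != 3 or differences != 1:
        raise ValueError("expected two codons differing at exactly one position")
    same_residue = CODON_TABLE[codon_from] == CODON_TABLE[codon_to]
    return "synonymous" if same_residue else "non-synonymous"


silent = classify_substitution("CTT", "CTC")    # Leu -> Leu
changing = classify_substitution("GAA", "GTA")  # Glu -> Val
```

A model that lets the synonymous rate vary across sites would, in effect, allow changes of the first kind to occur at different rates at different codon positions in the alignment, rather than at one shared rate.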
19

Gregory, Matthew Alan. "Characterisation and evolution of homoimmune Streptomyces bacteriophages." Thesis, University of Nottingham, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.324534.

20

Ekman, Diana, and Arne Elofsson. "Identifying and Quantifying Orphan Protein Sequences in Fungi." Stockholms universitet, Institutionen för biokemi och biofysik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-49277.

Abstract:
For large regions of many proteins, and even entire proteins, no homology to known domains or proteins can be detected. These sequences are often referred to as orphans. Surprisingly, it has been reported that the large number of orphans is sustained in spite of a rapid increase in available genomic sequences. However, it is believed that de novo creation of coding sequences is rare in comparison to mechanisms such as domain shuffling and gene duplication; hence, most sequences should have homologs in other genomes. To investigate this, the sequences of 19 complete fungal genomes were compared. By using the phylogenetic relationship between these genomes, we could identify potentially de novo created orphans in Saccharomyces cerevisiae. We found that only a small fraction, <2%, of the S. cerevisiae proteome is orphan, which confirms that de novo creation of coding sequences is indeed rare. Furthermore, we found it necessary to compare the most closely related species to distinguish between de novo created sequences and rapidly evolving sequences where homologs are present but cannot be detected. Next, the orphan proteins (OPs) and orphan domains (ODs) were characterized. First, it was observed that both OPs and ODs are short. In addition, at least some of the OPs have been shown to be functional in experimental assays, showing that they are not pseudogenes. Furthermore, in contrast to what has been reported before and what is seen for older orphans, S. cerevisiae-specific ODs and proteins are not more disordered than other proteins. This might indicate that many of the older, and earlier classified, orphans are indeed fast-evolving sequences. Finally, >90% of the detected ODs are located at the protein termini, which suggests that these orphans could have been created by mutations that have affected the start or stop codons.

21

Bianchetti, Laurent. "Intégration de l'évolution pour contribuer à l'étude de la relation séquence, structure, fonction des protéines." Thesis, Strasbourg, 2019. https://publication-theses.unistra.fr/restreint/theses_doctorat/2019/bianchetti_laurent_2019_ED414.pdf.

Abstract:
Evolution can provide valuable information for understanding the relationship between protein sequence, structure and function. In a first project, I used molecular phylogeny to show the coevolution of the “Testis expressed 19” (Tex19) and “Secreted and Transmembrane 1” (Sectm1) genes. Although Tex19 and Sectm1 are involved in different biological pathways, i.e. transposon regulation and immunity respectively, coevolution supports a strong functional relationship between both genes. Since Tex19 is expressed only in adult healthy testis and cancer cells, this result may be useful for cancer immunotherapy. In a second project, I used molecular modelling and sequence evolution analysis to question the validity of the glucocorticoid receptor α (GRα) ligand binding domain (LBD) homodimeric assembly [Bledsoe R.K. et al, 2002]. First, this complex is likely a crystallization artefact. Second, I have identified an alternative assembly that presents the molecular characteristics of a biological interface.
22

Holder, Mark Travis. "Using a complex model of sequence evolution to evaluate and improve phylogenetic methods." Access restricted to users with UT Austin EID Full text (PDF) from UMI/Dissertation Abstracts International, 2001. http://wwwlib.umi.com/cr/utexas/fullcit?p3037500.

23

Hill, E. E. "Evolution of protein families : genome sequences and three dimensional structures." Thesis, University of Cambridge, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.604054.

Abstract:
The aim here is to investigate the relationship between sequence and structure for families of structurally related proteins with low sequence identity, in order to identify and define any conserved positions. We do this in two main ways. (i) Evolution of three-dimensional structures. The members of the 4-helical cytokine superfamily of proteins have no significant sequence identity. Despite this superfamily's low to non-existent sequence similarity, the homology of its members is inferred from their common structures and functions. We carry out an in-depth analysis of the long- and short-chain families, both separately and together, to determine the conserved structural regions. From an examination of the residues that occur at equivalent sites within these regions, we identified the only positions at which there is any conservation. We then determined the structural role of these conserved sites so as to understand how the members of this family can maintain similar structures but have very different sequences. (ii) Evolution of sequences within genomes. For the cadherin superfamily, we use automated methods and hand analysis (incorporating information gathered previously on the structurally important residues) to identify all cadherin domains in their respective proteins within two sequenced eukaryotic genomes, Caenorhabditis elegans and Drosophila melanogaster. Identification of the entire cadherin protein repertoires within the two eukaryotic genomes allowed us to carry out a comparative analysis. This shows that the cadherin repertoires in the two organisms are surprisingly different. The ability to identify all genes within an organism that encode certain structural domains is certainly a huge achievement, and must be part of the way towards understanding an organism in its entirety.
24

Wintz, Henri. "Contribution a l'etude de l'organisation et de la structure des genes de rna de transfert mitochondriaux des plantes." Université Louis Pasteur (Strasbourg) (1971-2008), 1988. http://www.theses.fr/1988STR13008.

Abstract:
The genes encoding tRNA(Phe), tRNA(Trp), tRNA(Tyr), tRNA(Met), tRNA(Pro), tRNA(Asp) and tRNA(His) were identified using purified isoaccepting tRNAs and synthetic oligonucleotides. The chloroplast tRNA genes present in the maize mitochondrial genome were located. The nucleotide sequences of two tRNA genes (tRNA(Cys) and tRNA(Ser)) and of a tRNA(Phe) pseudogene, inactivated by an insertion in the coding sequence of the gene, were determined. The tRNA(Cys) gene is part of an insertion of chloroplast DNA in the mitochondrial genome. Two open reading frames were found that could encode subunit 3 of NADH dehydrogenase and protein S12 of the small ribosomal subunit. Comparative study of these genes suggests that these mitochondrial and chloroplast genes have a common origin.
25

Anderson, Jon Paul. "Molecular diversity and evolution of human immunodeficiency virus type 1 /." Thesis, Connect to this title online; UW restricted, 1999. http://hdl.handle.net/1773/8049.

26

Radó, i. Trilla Núria 1985. "Low-complexity regions in proteins as a source of evolutionary innovation." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/113603.

Abstract:
In this thesis we aimed to study the evolutionary implications of low-complexity regions (LCRs), protein sequences of very simple amino acid composition. Their uncontrolled expansion causes several human diseases, including Huntington’s disease and other neurodegenerative and developmental diseases. However, they are surprisingly abundant in proteins, which seems paradoxical given their high pathogenic potential. Moreover, experimental data have shown that the formation of novel LCRs, or the modification of existing ones, can have functional consequences. First, we performed a descriptive analysis of low-complexity regions in chordates, focusing on lineage- and age-related features of LCR evolution. Second, we wanted to assess why low-complexity regions are so common in eukaryotic proteins. Two hypotheses have been proposed. On the one hand, they may be an important source of genetic variability and might be involved in adaptive processes. To investigate whether LCRs are important players in the acquisition of novel functions, we examined transcription factor gene duplicates. On the other hand, low-complexity regions may also contribute to the formation of novel coding sequences, facilitating the generation of novel protein functions. We tested this hypothesis by examining the content of low-complexity sequences in proteins of different ages. Both analyses led us to conclude that low-complexity regions may be involved in protein diversification, either by providing new functional sequences that modify existing proteins or by being involved in the formation of novel protein-coding sequences.
27

Groussin, Mathieu. "Résurrection du passé à l’aide de modèles hétérogènes d’évolution des séquences protéiques." Thesis, Lyon 1, 2013. http://www.theses.fr/2013LYO10201/document.

Abstract:
The molecular reconstruction and resurrection of ancestral proteins is the central topic of this thesis. While fossil molecular data are almost nonexistent, phylogenetic methods make it possible to estimate the most likely ancestral protein sequences along a phylogenetic tree describing the relationships between extant sequences. With these ancestral sequences, several biological hypotheses can be tested, from the evolution of protein function to the inference of the ancient environments to which the ancestors were adapted. These probabilistic estimations of ancestral sequences depend on substitution models that give the probabilities of substitution between all pairs of amino acids. Classically, substitution models make the simplifying assumption that the evolutionary process is homogeneous (constant) among the sites of the multiple sequence alignment and between lineages. During the last decade, several methodological improvements have been made, with the description of substitution models that account for the heterogeneity of the process among sites and over time. During my thesis, I developed new heterogeneous substitution models in Maximum Likelihood that were shown to fit the data better than any other homogeneous or heterogeneous models. I also demonstrated their better performance regarding the accuracy of ancestral sequence reconstruction. Using these models to reconstruct or resurrect ancestral proteins, my coworkers and I showed that adaptation to temperature is a major determinant of evolutionary rates in Archaea. Furthermore, we deciphered the nature of the phylogenetic signal that leads substitution models to infer a non-parsimonious scenario for the adaptation to temperature during early life on Earth, with a non-hyperthermophilic last universal common ancestor living at lower temperatures than its two descendants, namely the bacterial and archaeal ancestors. Finally, we showed that the use of heterogeneous models improves the functionality of resurrected proteins, opening the way to a better understanding of the evolutionary mechanisms acting on biological sequences.
28

Wood, Natasha Tandi. "Modelling the Evolution of HIV-1 Protein-Coding Sequences with Particular focus on the early stages of Infection." Doctoral thesis, University of Cape Town, 2010. http://hdl.handle.net/11427/4352.

Abstract:
Modelling the Evolution of HIV-1 Protein-Coding Sequences with Particular Focus on the Early Stages of Infection. Natasha Thandi Wood, February 2010. The evolution of the viral genome sequence over the course of HIV-1 infection is of interest for vaccine and drug design, and for the development of effective treatment strategies. Characteristics of the transmitted viral genome that could render the virus more sensitive to host immune responses are of particular interest for vaccine studies. However, sequence samples from the earliest phase of HIV infection are scarce, and inferences about the nature of the infecting virus and its evolution during the course of early infection are often made from samples isolated from later stages, or from chronic infections. To establish in detail the adaptive changes that occur in early infection, an investigation was carried out on a large dataset consisting of sequences isolated from individuals in early infection. The majority of these infections were inferred to have resulted from transmission of a single virion or virally infected cell, which permitted a detailed investigation of HIV-1 diversification in early infection for the first time. Comparing viral diversification across multiple patients, it was possible to identify specific evolutionary patterns in the HIV-1 genome that occur frequently during the earliest stages of infection. The analyses revealed that APOBEC-mediated hypermutation has an important role in early viral diversification and may enable rapid escape from the first wave of host immune responses. Several mutations in early infection that were likely to result in immune escape were identified, some of which have subsequently been confirmed experimentally. In general, experimental verification of model-based inferences is necessary, but can be expensive and time-consuming. To reduce the costs involved, it is essential that the evolutionary methods produce accurate results.
Simulation results presented in this thesis show that inferences made about viral evolution can be subject to bias when key aspects of viral biology are not accounted for by the models used. In particular, some previous comparisons between sequence groups that share genealogical histories, positive selection studies that fail to account for recombination, and research on HIV covariation, may need to be revisited, using more accurate evolutionary models. The results presented in this thesis demonstrate the importance of accurate evolutionary models to understand the selection pressures acting on the virus during various stages of infection. Furthermore, using a phylogenetic model it was possible to identify sites in the HIV genome that were evolving adaptively and are implicated in CTL immune escape during early infection. Characterising escape mutations in the transmitted virus may lead to novel approaches to develop vaccines and antiviral drugs.
29

Delorme, Marie-Odile. "Analyse des sequences biologiques par des methodes d'apprentissage numerique et symbolique." Paris 6, 1988. http://www.theses.fr/1988PA066188.

30

Nadaradjane, Aravindan. "Exploring the use of Deep Mutational Scanning and of Evolution for the Structural Prediction of Protein Complexes." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASS014.

Abstract:
This thesis project aimed to develop computational strategies that exploit the information generated by deep mutational scanning (DMS) technologies to predict the structures of protein assemblies. To that end, I explored how to improve the agreement between the models simulated by molecular docking and experimental constraints. Two reference complexes from the literature, whose structures have been solved experimentally and for which DMS data have been published, were used for the methodological development: the parD3-parE3 and dockerin-cohesin complexes. For each of the many mutants generated by DMS, an experimental score quantifying the affinity of the complex could be extracted from the available data. For the simulations, a number of protocols based on the Rosetta software were tested and optimized to predict the effect of mutations on interface stability. A compromise was found between efficiency and precision, allowing a fair estimation of the effect of mutations on native complex structures. The agreement between the predicted and the experimental data was quantified using two metrics: the correlation between the predicted and experimental binding scores, and the area under the ROC (Receiver Operating Characteristic) curve, which measures how efficiently the predictor identifies the mutations with the strongest impact. Applied to a set of 1000 decoys of complexes generated by docking, both metrics were assessed for their ability to discriminate correct from incorrect models. For both reference systems, the second metric, based on ROC curves, was found to be the most useful. This methodology was then applied to an antibody-antigen complex studied by DMS in the group of B. Maillère. My PhD work was also dedicated to processing the raw data from DMS experiments generated by our collaborators, O. Pereira-Ramos and L. Martin, in order to design a high-affinity peptide for the protein Asf1 and to screen interaction surfaces between Asf1 and its binding partners. Finally, throughout my PhD I participated in all the targets submitted to the docking community by the organizers of the 7th edition of CAPRI, an international challenge for the assessment of methods for the structural prediction of protein interactions. The manuscript details all the strategies that were set up to tackle these challenges, in which our team eventually ranked first by generating the highest number of both correct and precise models.
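As an illustration of the second metric mentioned in this abstract, the area under the ROC curve can be computed directly from predicted scores and binary experimental labels. The sketch below is a generic, dependency-free version using the pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the toy data are assumptions made for this example and are not taken from the thesis.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: a predicted effect for six mutations, and whether
# an experiment classified each mutation as strongly disruptive.
predicted = [2.1, 0.3, 1.7, 0.2, 0.9, 0.1]
disruptive = [True, False, True, False, False, True]
auc = roc_auc(predicted, disruptive)
```

An AUC of 1.0 means every disruptive mutation is scored above every non-disruptive one, while 0.5 corresponds to a ranking no better than chance.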
31

Nosek, Ondřej. "Hardwarová akcelerace algoritmu pro hledání podobnosti dvou DNA řetězců." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2007. http://www.nusl.cz/ntk/nusl-236882.

Abstract:
Methods for approximate string matching of the various sequences used in bioinformatics are a crucial part of development in this field. These tasks have very high time complexity, so we want to create a hardware platform to accelerate the computations. The goal of this work is to design a generalized architecture based on FPGA technology that can work with various types of sequences. The designed acceleration card will mainly use dynamic programming algorithms such as Needleman-Wunsch and Smith-Waterman.
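For readers unfamiliar with the algorithms named in this abstract, the following is a minimal software sketch of the Needleman-Wunsch recurrence for the global alignment score. It is not the hardware design from the thesis; the function name and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for the example.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and first row: aligning a prefix against gaps only.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell depends only on its upper, left and upper-left neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)
    return score[-1][-1]
```

Because each cell depends only on three neighbours, all cells on the same anti-diagonal can be computed independently, which is the property that parallel hardware implementations typically exploit. Smith-Waterman uses the same recurrence for local alignment, but clamps each cell at zero and reports the maximum cell instead of the bottom-right one.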
32

Buck, Michael Joseph. "Protein evolution from sequence to structure." 2003. http://www.lib.ncsu.edu/theses/available/etd-05192003-144950/unrestricted/etd.pdf.

33

Bhaskara, Ramachandra M. "Structure, Stability and Evolution of Multi-Domain Proteins." Thesis, 2013. http://etd.iisc.ernet.in/2005/3384.

Abstract:
Analyses of protein sequences from diverse genomes have revealed the ubiquitous nature of multi-domain proteins. They form up to 70% of proteomes of most eukaryotic organisms. Yet, our understanding of protein structure, folding and evolution has been dominated by extensive studies on single-domain proteins. We provide quantitative treatment and proof for prevailing intuitive ideas on the strategies employed by nature to stabilize otherwise unstable domains. We find that domains incapable of independent stability are stabilized by favourable interactions with tethered domains in the multi-domain context. Natural variations (nsSNPs) at these sites alter communication between domains and affect stability leading to disease manifestation. We emphasize this by using explicit all-atom molecular dynamics simulations to study the interface nsSNPs of human Glutathione S-transferase omega 1. We show that domain-domain interface interactions constrain inter-domain geometry (IDG) which is evolutionarily well conserved. The inter-domain linkers modulate the interactions by varying their lengths, conformations and local structure, thereby affecting the overall IDG. These findings led to the development of a method to predict interfacial residues in multi-domain proteins based on difference evolutionary information extracted from at least two diverse domain architectures (single and multi-domain). Our predictions are highly accurate (∼85%) and specific (∼95%). Using predicted residues to constrain domain–domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. Further, we developed and employed an alignment-free approach based on local amino-acid fragment matching to compare sequences of multi-domain proteins. This is especially effective in the absence of proper alignments, which is usually the case for multi-domain proteins. 
Using this, we were able to recreate the existing Hanks and Hunter classification scheme for protein kinases. We also showed functional relationships among Immunoglobulin sequences. The clusters obtained were functionally distinct and also showed unique domain-architectures. Our analysis provides guidelines toward rational protein and interaction design which have attractive applications in obtaining stable fragments and domain constructs essential for structural studies by crystallography and NMR. These studies enable a deeper understanding of rapport of protein domains in the multi-domain context.
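The abstract does not specify how the alignment-free fragment matching is implemented. As a generic illustration of the idea only, two sequences can be compared by the overlap of their short amino-acid fragments (k-mers) without building an alignment. The function names, the fragment length `k=3` and the Jaccard index used here are assumptions for this sketch, not the method of the thesis.

```python
def kmers(seq, k=3):
    """Set of all overlapping fragments of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fragment_similarity(a, b, k=3):
    """Jaccard index of the k-mer sets of a and b (0.0 if either set is empty)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)
```

For example, `fragment_similarity("MKVLAA", "KVLAAM")` compares the fragment sets {MKV, KVL, VLA, LAA} and {KVL, VLA, LAA, AAM}, which share three of five distinct fragments. A matrix of such pairwise similarities could then be passed to any standard clustering procedure.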
34

Mohaddes, Zia. "Modeling protein evolution using secondary structures." Thèse, 2010. http://hdl.handle.net/1866/4767.

Abstract:
Protein evolution is an important field of research in bioinformatics and drives the need for alignment tools that can be used to model the evolution of a protein family reliably and accurately. TM-Align (Zhang and Skolnick, 2005) is considered to be the ideal tool for such a task, in terms of both speed and accuracy. Therefore, in this study, TM-Align was used as a point of reference to help identify other alignment tools that are able to model protein evolution accurately. In parallel, we extended the existing protein secondary structure explorer tool, Helix Explorer (Marrakchi, 2006), so that it can also be used as a tool to model protein evolution.
35

Pandya, Chetanya. "Sequence- and structure-based approaches to deciphering enzyme evolution in the Haloalkonoate Dehalogenase superfamily." Thesis, 2014. https://hdl.handle.net/2144/15107.

Abstract:
Understanding how changes in the functional requirements of the cell select for changes in protein sequence and structure is a fundamental challenge in molecular evolution. This dissertation delineates some of the underlying evolutionary forces using the Haloalkanoate Dehalogenase Superfamily (HADSF) as a model system. HADSF members have a unique cap-core architecture, with the Rossmann-fold core domain accessorized by variable cap domain insertions (delineated by length, topology, and point of insertion). To identify the boundaries of variable domain insertions in protein sequences, I have developed a comprehensive computational strategy (CapPredictor or CP) using a novel sequence alignment algorithm in conjunction with a structure-guided sequence profile. Analysis of more than 40,000 HADSF sequences led to the following observations: (i) cap-type classes exhibit similar distributions across different phyla, indicating the existence of all cap types in the last universal common ancestor, and (ii) comparative analysis of the predicted cap type and functional diversity indicated that cap type does not dictate the divergence of substrate recognition and chemical pathway, and hence biological function. By analyzing a unique dataset of core- and cap-domain-only protein structures, I investigated the consequences of the accessory cap domain on the sequence-structure relationship of the core domain. The relationship between sequence and structure divergence in the core fold was shown to be monotonic and independent of the corresponding cap type. However, core domains with the same cap type bore a greater similarity than core domains with different cap types, suggesting coevolution of the cap and core domains. Remarkably, only a few degrees of freedom are needed to describe the structural diversity in the Rossmann fold, and these account for the majority of the observed structural variance.
Finally, I examined the location and role of conserved residue positions and co-evolving residue pairs in the core domain in the context of the cap domain. Positions critical for function were conserved while non-conserved positions mapped to highly mobile regions. Notably, we found exponential dependence of co-variance on inter-residue distance. Collectively, these novel algorithms and analyses contribute to an improved understanding of enzyme evolution, especially in the context of the use of domain insertions to expand substrate specificity and chemical mechanism.
36

Scherrer, Michael Paul. "From the inside out : determining sequence conservation within the context of relative solvent accessibility." 2013. http://hdl.handle.net/2152/21613.

Abstract:
Evolutionary rates vary vastly across intraspecific genes, and the determinants of these rates are of central concern to the field of comparative genomics. Tradition has held that preservation of protein function conserves the sequence; however, mounting evidence implicates the biophysical properties of proteins themselves as the elements that constrain sequence evolution. Of these properties, the exposure of a residue to solvent is the most prevalent determinant of its evolutionary rate, due to pressures to maintain proper synthesis and folding of the structure. In this work, we have developed a model that considers the microenvironment of a residue in the estimation of its evolutionary rate. By working within the structural context of a protein's residues, we show that our model is better able to capture the overall evolutionary trends affecting conservation of both the coding sequences and the protein structures, from the genomic level down to individual genes.
37

Ptáčková, Barbora. "Strukturní charakterizace vybraných náhodných proteinových sekvencí s vysokým obsahem neuspořádanosti." Master's thesis, 2018. http://www.nusl.cz/ntk/nusl-379356.

Abstract:
During evolution, an infinitesimal fraction of the practically infinite sequence space has achieved an enormous functional diversity of proteins. Intrinsically disordered proteins (IDPs), which lack a fully defined three-dimensional structure, are the most likely precursors of today's proteins because of their flexible conformation and functional diversity. But how have these proteins evolved into often rigid and highly specialized protein structures? This evolutionary trajectory has the greatest support in the theory of induced fold, whereby the development of structure was mediated by the interaction and coevolution of primordial unstructured proteins with different cofactors or RNA molecules. Although some random sequences from the part of sequence space that is not used by nature are also able to form folded proteins, the more suitable candidates for the evolution of structure and function appear to be random sequences with a high disorder content and low aggregation propensity. In this work, selected random protein sequences with high disorder content were structurally characterized for their further use in evolutionary studies. Three artificial proteins were selected from a random-sequence library based on a previous study in our laboratory. In the present work they were purified and...
38

van Hazel, Ilke. "Molecular Evolution and Functional Characterization of the Visual Pigment Proteins of the Great Bowerbird (Chlamydera nuchalis) and Other Vertebrates." Thesis, 2012. http://hdl.handle.net/1807/43401.

Abstract:
Visual pigments are light sensitive receptors in the eye that form the basis of sensory visual transduction. This thesis presents three studies that explore visual pigment proteins in vertebrates using a number of computational and experimental methods in an evolutionary framework. The objective is not only to identify, but also to experimentally investigate the functional consequences of genetic variation in vertebrate visual pigments. The focus is on great bowerbirds (Chlamydera nuchalis), which are a model system in visual ecology due to their spectacular behaviour of building and decorating courtship bowers. There are 4 chapters: Chapter 1 introduces background information on visual pigments and vision in birds. Among visual pigment types, the short-wavelength-sensitive (SWS1) pigments have garnered particular interest due to the broad spectral range among vertebrates and the importance of UV signals in communication. Chapter 2 investigates the evolutionary history of SWS1 in vertebrates with a view toward its utility as a phylogenetic marker. Chapter 3 investigates SWS1 evolution and short-wavelength vision in birds, with particular focus on C. nuchalis and its SWS1. The evolution of spectral tuning mechanisms mediating UV/violet vision in passerines and parrots is elucidated in this chapter using site-directed mutagenesis, protein expression, and phylogenetic recreation of ancestral opsins. While cone opsins mediate colour vision in bright light, the rhodopsin visual pigment contained in rod photoreceptors is critical for dim light vision. Detailed characterization of rhodopsin function has only been conducted on a few model systems. Chapter 4 examines C. nuchalis RH1 using a number of functional assays in addition to absorbance spectra, including hydroxylamine sensitivity and the rate of retinal release. 
This chapter includes an investigation into the role of amino acid mutations typical of dim-light adapted vertebrates, D83N and A292S, in regulating functional properties of bovine and avian RH1s using site-directed mutagenesis. Together these chapters describe naturally occurring mutations in visual pigments and explore the way they can influence visual perception. These represent one of the few investigations of visual pigments from a species that is not a model lab organism and form a significant contribution to the field of visual pigment biochemistry and evolution.
39

Tsai, Tsung Yu, and 蔡宗佑. "Analyze the relationships among functions、sequences and structures of protein folds based on structure evolution." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/82960481964409613076.

Abstract:
Master's degree
National Tsing Hua University
Institute of Information Systems and Applications
93
The central dogma of molecular biology states that DNA carries genetic information that is transcribed into mRNA and subsequently translated into protein, and the function of a protein is determined by its structure. Much existing research combines sequence alignment with structural information to uncover latent function. This approach often fails to find the function of sequences that are dissimilar or evolutionarily distant. This thesis therefore uses structure alignment to build an evolutionary tree, and compares the structure alignment tree with the sequence alignment tree to analyze the relationships among sequence, structure, function, and evolution. In the first step of our method, the protein structures are aligned with the CE tool and the results are saved as a distance matrix. The UPGMA algorithm is then applied to the distance matrix to group the structures into clusters, which are presented as an evolutionary tree. Each protein structure is labeled with the EC number it belongs to, to mark its function clearly. The dataset used in this thesis is the TIM-barrel fold under the SCOP classification. Mutants are discarded, and PDB files that contain multiple domains are split. Following Species, the lowest level of the SCOP classification, 239 protein structures are picked at random for structure alignment. Once the structure alignment tree has been built and labeled with EC numbers, it is compared with the sequence alignment tree produced by ClustalW. The results show that, whether sequence identity is high or low, the functional clusters in the structure alignment tree are more consistent, and evolutionary relationships can be found between rather distant sequences within the same functional cluster.
Proteins that belong to different superfamilies can even be found in the same functional cluster, with varying sequence, structure, function, and evolution. For this reason, this research suggests carrying out sequence alignment on a structural basis, and using the structure alignment results for mutual analysis to uncover latent relationships.
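The clustering step described in this abstract (a pairwise distance matrix fed to UPGMA to obtain a tree) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name `upgma`, the labels `A` to `D`, and the distance values are all hypothetical, and the abstract does not say how CE alignment results were converted into distances.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster items with UPGMA.

    names: list of labels; dist: symmetric matrix (list of lists) of distances.
    Returns (tree, merges): a nested-tuple tree and a list of
    (left_subtree, right_subtree, node_height) in merge order.
    """
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        (tree_a, na), (tree_b, nb) = clusters.pop(a), clusters.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted mean, i.e. the average over all leaf pairs.
        new_d = {}
        for c in clusters:
            dac = d[tuple(sorted((a, c)))]
            dbc = d[tuple(sorted((b, c)))]
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        # Drop every distance that involved the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), na + nb)
        # UPGMA places the new node at half the merge distance.
        merges.append((tree_a, tree_b, dab / 2))
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Hypothetical distances between four structures (illustrative values only).
names = ["A", "B", "C", "D"]
dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
tree, merges = upgma(names, dist)
print(tree)
for left, right, height in merges:
    print(left, right, height)
```

In this toy matrix the two closest items are joined first and the most distant item joins last. Labeling each leaf with a functional annotation such as an EC number would then show whether items with the same function fall into the same subtree.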
40

Dickinson, GH, IE Vega, KJ Wahl, B. Orihuela, V. Beyley, EN Rodriguez, RK Everett, J. Bonaventura, and D. Rittschof. "Barnacle cement: a polymerization model based on evolutionary concepts." Thesis, 2009. http://hdl.handle.net/10161/653.

Abstract:
Enzymes and biochemical mechanisms essential to survival are under extreme selective pressure and are highly conserved through evolutionary time. We applied this evolutionary concept to barnacle cement polymerization, a process critical to barnacle fitness that involves aggregation and cross-linking of proteins. The biochemical mechanisms of cement polymerization remain largely unknown. We hypothesized that this process is biochemically similar to blood clotting, a critical physiological response that is also based on aggregation and cross-linking of proteins. Like key elements of vertebrate and invertebrate blood clotting, barnacle cement polymerization was shown to involve proteolytic activation of enzymes and structural precursors, transglutaminase cross-linking and assembly of fibrous proteins. Proteolytic activation of structural proteins maximizes the potential for bonding interactions with other proteins and with the surface. Transglutaminase cross-linking reinforces cement integrity. Remarkably, epitopes and sequences homologous to bovine trypsin and human transglutaminase were identified in barnacle cement with tandem mass spectrometry and/or western blotting. Akin to blood clotting, the peptides generated during proteolytic activation functioned as signal molecules, linking a molecular level event (protein aggregation) to a behavioral response (barnacle larval settlement). Our results draw attention to a highly conserved protein polymerization mechanism and shed light on a long-standing biochemical puzzle. We suggest that barnacle cement polymerization is a specialized form of wound healing. The polymerization mechanism common between barnacle cement and blood may be a theme for many marine animal glues.
Dissertation
41

Shen, Yaoqing. "In silico analysis of mitochondrial proteins." Thesis, 2009. http://hdl.handle.net/1866/3766.

Abstract:
The important role of mitochondria in the eukaryotic cell has long been appreciated, but their exact composition and the biological processes taking place in mitochondria are not yet fully understood. The two main factors that slow down progress in this field are inefficient recognition and imprecise annotation of mitochondrial proteins. Therefore, we developed a new computational tool, YimLoc, which effectively predicts mitochondrial proteins from genomic sequences. This tool integrates the strengths of existing predictors and yields higher performance than any individual predictor. We applied YimLoc to ~60 fungal genomes in order to address the controversy about the localization of beta oxidation in these organisms. Our results show that, in contrast to previous studies, most fungal groups do possess mitochondrial beta oxidation. This work also revealed the diversity of beta oxidation in fungi, which correlates with their utilization of fatty acids as energy and carbon sources. Further, we conducted an investigation of the key component of the mitochondrial beta oxidation pathway, the acyl-CoA dehydrogenase (ACAD). We combined subcellular localization prediction with subfamily classification and phylogenetic inference of ACAD enzymes from 250 species covering all three domains of life. Our study suggests that ACAD genes are an ancient family with innovative evolutionary strategies to generate a large enzyme toolset for utilizing most diverse fatty acids and amino acids. Finally, to enable the prediction of mitochondrial proteins from data beyond genome sequences, we designed the tool TESTLoc that uses expressed sequence tags (ESTs) as input. TESTLoc performs significantly better than known tools.
In addition to providing two new tools for subcellular localization designed for different data, our studies demonstrate the power of combining subcellular localization prediction with other in silico analyses to gain insights into the function of mitochondrial proteins. Most importantly, this work proposes clear hypotheses that are easily testable, with great potential for advancing our knowledge of mitochondrial metabolism.