
Journal articles on the topic 'Protein sequence evolution'

Consult the top 50 journal articles for your research on the topic 'Protein sequence evolution.'

1

Trifonov, Edward N. "Early Molecular Evolution." Israel Journal of Ecology and Evolution 52, no. 3-4 (April 12, 2006): 375–87. http://dx.doi.org/10.1560/ijee_52_3-4_375.

Abstract:
Four fundamentally novel, recent developments make a basis for the Theory of Early Molecular Evolution. The theory outlines the molecular events from the onset of the triplet code to the formation of the earliest sequence/structure/function modules of proteins. These developments are: (1) Reconstruction of the evolutionary chart of codons; (2) Discovery of omnipresent protein sequence motifs, apparently conserved since the last common ancestor; (3) Discovery of closed loops—standard structural modules of modern proteins; (4) Construction of protein sequence space of module size fragments, with far-reaching evolutionary implications. The theory generates numerous predictions, confirmed by massive nucleotide and protein sequence analyses, such as existence of two distinct classes of amino acids, and their periodical distribution along the sequences. The emerging picture of the earliest molecular evolutionary events is outlined: consecutive engagement of codons, formation of the earliest short peptides, and growth of the polypeptide chains to the size of loop closure, 25-30 residues.
2

Sikosek, Tobias, and Hue Sun Chan. "Biophysics of protein evolution and evolutionary protein biophysics." Journal of The Royal Society Interface 11, no. 100 (November 6, 2014): 20140419. http://dx.doi.org/10.1098/rsif.2014.0419.

Abstract:
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence–structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by ‘hidden’ conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
3

Yomo, Tetsuya. "S12A2 Protein evolution from random sequence (Unifying Comprehension from Genome to Cells through Reconsideration of Protein)." Seibutsu Butsuri 47, supplement (2007): S17. http://dx.doi.org/10.2142/biophys.47.s17_3.

4

Chang, P. C., M. L. Hsieh, J. H. Shien, D. A. Graham, M. S. Lee, and H. K. Shieh. "Complete nucleotide sequence of avian paramyxovirus type 6 isolated from ducks." Journal of General Virology 82, no. 9 (September 1, 2001): 2157–68. http://dx.doi.org/10.1099/0022-1317-82-9-2157.

Abstract:
There are nine serotypes of avian paramyxovirus (APMV). Only the genome of APMV type 1 (APMV-1), also called Newcastle disease virus (NDV), has been completely sequenced. In this study, the complete nucleotide sequence of an APMV-6 serotype isolated from ducks is reported. The 16236 nt genome encodes eight proteins, nucleocapsid protein (NP), phosphoprotein (P), V protein, matrix protein (M), fusion protein (F), small hydrophobic (SH) protein, haemagglutinin–neuraminidase (HN) protein and large (L) protein, which are flanked by a 55 nt leader sequence and a 54 nt trailer sequence. Sequence comparison reveals that the protein sequences of APMV-6 are most closely related to those of APMV-1 (NDV) and -2, with sequence identities ranging from 22 to 44%. However, APMV-6 contains a gene that might encode the SH protein, which is absent in APMV-1, but present in the rubulaviruses simian virus type 5 and mumps virus. The presence of an SH gene in APMV-6 might provide a link between the evolution of APMV and rubulaviruses. Phylogenetic analysis demonstrates that APMV-6, -1, -2 (only the F and HN sequences were available for analysis) and -4 (only the HN sequences were available for analysis) all cluster into a single lineage that is distinct from other paramyxoviruses. This result suggests that APMV should constitute a new genus within the subfamily Paramyxovirinae.
5

Bitard-Feildel, Tristan. "Navigating the amino acid sequence space between functional proteins using a deep learning framework." PeerJ Computer Science 7 (September 17, 2021): e684. http://dx.doi.org/10.7717/peerj-cs.684.

Abstract:
Motivation: Shedding light on the relationships between protein sequences and functions is a challenging task with many implications in protein evolution, disease understanding, and protein design. The mapping of protein sequence space to specific functions is, however, hard to comprehend due to its complexity. Generative models help to decipher complex systems thanks to their ability to learn and recreate data specificity. Applied to proteins, they can capture the sequence patterns associated with functions and point out important relationships between sequence positions. By learning these dependencies between sequences and functions, they can ultimately be used to generate new sequences and navigate through uncharted areas of molecular evolution. Results: This study presents an Adversarial Auto-Encoder (AAE) approach, an unsupervised generative model, to generate new protein sequences. AAEs are tested on three protein families known for their multiple functions: the sulfatase, the HUP and the TPP families. Clustering results on the encoded sequences from the latent space computed by AAEs display a high level of homogeneity regarding the protein sequence functions. The study also reports and analyzes for the first time two sampling strategies based on latent space interpolation and latent space arithmetic to generate intermediate protein sequences sharing sequential properties of original sequences linked to known functional properties issued from different families and functions. Sequences generated by interpolation between latent space data points demonstrate the ability of the AAE to generalize and produce meaningful biological sequences from an evolutionarily uncharted area of the biological sequence space. Finally, 3D structure models computed by comparative modelling using generated sequences and templates of different sub-families point to the ability of the latent space arithmetic to successfully transfer protein sequence properties linked to function between different sub-families. All in all, this study confirms the ability of deep learning frameworks to model biological complexity and bring new tools to explore amino acid sequence and functional spaces.
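The abstract above mentions sampling by latent space interpolation without giving details. As a rough illustration only, the sketch below assumes plain linear interpolation between two latent vectors. The function name `interpolate_latent` and the toy two-dimensional vectors are hypothetical and are not taken from the paper. In a real pipeline each intermediate point would be passed to a trained decoder, which is omitted here.

```python
from typing import List


def interpolate_latent(z_start: List[float], z_end: List[float],
                       n_steps: int) -> List[List[float]]:
    """Return n_steps points on the straight line from z_start to z_end.

    Assumption: linear interpolation. The paper may use a different scheme.
    """
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    if n_steps < 2:
        raise ValueError("need at least two steps (the two endpoints)")
    points = []
    for i in range(n_steps):
        t = i / (n_steps - 1)  # t runs from 0.0 to 1.0 inclusive
        points.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return points


# Hypothetical 2-D latent codes of two encoded sequences.
path = interpolate_latent([0.0, 2.0], [1.0, 0.0], n_steps=3)
print(path)  # [[0.0, 2.0], [0.5, 1.0], [1.0, 0.0]]
```

The endpoints are included so that decoding the first and last points should approximately recover the two original sequences, which gives a simple sanity check on any decoder used with this path.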
6

Choi, I. G., and S. H. Kim. "Evolution of protein structural classes and protein sequence families." Proceedings of the National Academy of Sciences 103, no. 38 (September 7, 2006): 14056–61. http://dx.doi.org/10.1073/pnas.0606239103.

7

Chen, Z., F. Wen, N. Sun, and H. Zhao. "Directed evolution of homing endonuclease I-SceI with altered sequence specificity." Protein Engineering Design and Selection 22, no. 4 (January 10, 2009): 249–56. http://dx.doi.org/10.1093/protein/gzp001.

8

Pascual-García, Alberto, Miguel Arenas, and Ugo Bastolla. "The Molecular Clock in the Evolution of Protein Structures." Systematic Biology 68, no. 6 (April 23, 2019): 987–1002. http://dx.doi.org/10.1093/sysbio/syz022.

Abstract:
The molecular clock hypothesis, which states that substitutions accumulate in protein sequences at a constant rate, plays a fundamental role in molecular evolution, but it is violated when selective or mutational processes vary with time. Such violations of the molecular clock have been widely investigated for protein sequences, but not yet for protein structures. Here, we introduce a novel statistical test (Significant Clock Violations) and perform a large-scale assessment of the molecular clock in the evolution of both protein sequences and structures in three large superfamilies. After validating our method with computer simulations, we find that clock violations are generally consistent in sequence and structure evolution, but they tend to be larger and more significant in structure evolution. Moreover, changes of function assessed through Gene Ontology and InterPro terms are associated with large and significant clock violations in structure evolution. We found that almost one third of significant clock violations are significant in structure evolution but not in sequence evolution, highlighting the advantage of using structure information for assessing accelerated evolution and gathering hints of positive selection. Clock violations between closely related pairs are frequently significant in sequence evolution, consistent with the observed time dependence of the substitution rate attributed to segregation of neutral and slightly deleterious polymorphisms, but not in structure evolution, suggesting that these substitutions do not affect protein structure although they may affect stability. These results are consistent with the view that natural selection, both negative and positive, constrains protein structures more strongly than protein sequences. Our code for computing clock violations is freely available at https://github.com/ugobas/Molecular_clock.
9

Waters, E. R. "The molecular evolution of the small heat-shock proteins in plants." Genetics 141, no. 2 (October 1, 1995): 785–95. http://dx.doi.org/10.1093/genetics/141.2.785.

Abstract:
The small heat-shock proteins have undergone a tremendous diversification in plants; whereas only a single small heat-shock protein is found in fungi and many animals, over 20 different small heat-shock proteins are found in higher plants. The small heat-shock proteins in plants have diversified in both sequence and cellular localization and are encoded by at least five gene families. In this study, 44 small heat-shock protein DNA and amino acid sequences were examined, using both phylogenetic analysis and analysis of nucleotide substitution patterns to elucidate the evolutionary history of the small heat-shock proteins. The phylogenetic relationships of the small heat-shock proteins, estimated using parsimony and distance methods, reveal that gene duplication, sequence divergence and gene conversion have all played a role in the evolution of the small heat-shock proteins. Analysis of nonsynonymous substitutions and conservative and radical replacement substitutions (in relation to hydrophobicity) indicates that the small heat-shock protein gene families are evolving at different rates. This suggests that the small heat-shock proteins may have diversified in function as well as in sequence and cellular localization.
10

Haimel, Matthias, Karin Pröll, and Michael Rebhan. "ProteinArchitect: Protein Evolution above the Sequence Level." PLoS ONE 4, no. 7 (July 15, 2009): e6176. http://dx.doi.org/10.1371/journal.pone.0006176.

11

Facco, Elena, Andrea Pagnani, Elena Tea Russo, and Alessandro Laio. "The intrinsic dimension of protein sequence evolution." PLOS Computational Biology 15, no. 4 (April 8, 2019): e1006767. http://dx.doi.org/10.1371/journal.pcbi.1006767.

12

Huang, Chi-Ruei, and Szecheng J. Lo. "Evolution and Diversity of the Human Hepatitis D Virus Genome." Advances in Bioinformatics 2010 (February 24, 2010): 1–9. http://dx.doi.org/10.1155/2010/323654.

Abstract:
Human hepatitis delta virus (HDV) is the RNA virus with the smallest genome. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources, and the HDV genome was eventually constituted through RNA recombination. The genome subsequently diversified through the accumulation of mutations, selected by interactions of the mutated RNA and proteins with host factors, to successfully form infectious virions. Therefore, we propose that the conservation of the HDV nucleotide sequence is closely related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of the large delta antigen (LDAg) are more diverse than other regions of the protein-coding sequence, yet variants that retain the biological functionality to interact with the heavy chain of clathrin can be selected and maintained. Since viruses interact with many host factors, including when escaping the host immune response, designing a program to predict RNA genome evolution remains a great challenge.
13

Wolstenholme, David R., and Douglas O. Clary. "SEQUENCE EVOLUTION OF DROSOPHILA MITOCHONDRIAL DNA." Genetics 109, no. 4 (April 1, 1985): 725–44. http://dx.doi.org/10.1093/genetics/109.4.725.

Abstract:
We have compared nucleotide sequences of corresponding segments of the mitochondrial DNA (mtDNA) molecules of Drosophila yakuba and Drosophila melanogaster, which contain the genes for six proteins and seven tRNAs. The overall frequency of substitution between the nucleotide sequences of these protein genes is 7.2%. As was found for mtDNAs from closely related mammals, most substitutions (86%) in Drosophila mitochondrial protein genes do not result in an amino acid replacement. However, the frequencies of transitions and transversions are approximately equal in Drosophila mtDNAs, which is in contrast to the vast excess of transitions over transversions in mammalian mtDNAs. In Drosophila mtDNAs the frequency of C ↔ T substitutions per codon in the third position is 2.5 times greater among codons of two-codon families than among codons of four-codon families; this is contrary to the hypothesis that third position silent substitutions are neutral in regard to selection. In the third position of codons of four-codon families transversions are 4.6 times more frequent than transitions and A ↔ T substitutions account for 86% of all transversions. Ninety-four percent of all codons in the Drosophila mtDNA segments analyzed end in A or T. However, as this alone cannot account for the observed high frequency of A ↔ T substitutions there must be either a disproportionately high rate of A ↔ T mutation in Drosophila mtDNA or selection bias for the products of A ↔ T mutation. Consideration of the frequencies of interchange of AGA and AGT codons in the corresponding D. yakuba and D. melanogaster mitochondrial protein genes provides strong support for the view that AGA specifies serine in the Drosophila mitochondrial genetic code.
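The abstract above compares the frequencies of transitions (purine ↔ purine or pyrimidine ↔ pyrimidine) and transversions (purine ↔ pyrimidine). The following is a minimal sketch of how such counts can be obtained from two aligned nucleotide sequences. The function name `count_ts_tv` and the toy sequences are hypothetical illustrations and are not data or code from the paper.

```python
from typing import Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Count (transitions, transversions) between two aligned DNA sequences.

    Positions that are identical, gapped, or ambiguous are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # any purine<->pyrimidine change
    return transitions, transversions


# Toy alignment: A->G and C->T are transitions, T->A is a transversion.
print(count_ts_tv("ACGT", "GTGA"))  # (2, 1)
```

Restricting the comparison to third codon positions, as the abstract does for four-codon families, would only require slicing both sequences with `[2::3]` before the call, assuming the alignment starts in frame.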
14

D’Costa, Sameer, Emily C. Hinds, Chase R. Freschlin, Hyebin Song, and Philip A. Romero. "Inferring protein fitness landscapes from laboratory evolution experiments." PLOS Computational Biology 19, no. 3 (March 1, 2023): e1010956. http://dx.doi.org/10.1371/journal.pcbi.1010956.

Abstract:
Directed laboratory evolution applies iterative rounds of mutation and selection to explore the protein fitness landscape and provides rich information regarding the underlying relationships between protein sequence, structure, and function. Laboratory evolution data consist of protein sequences sampled from evolving populations over multiple generations and this data type does not fit into established supervised and unsupervised machine learning approaches. We develop a statistical learning framework that models the evolutionary process and can infer the protein fitness landscape from multiple snapshots along an evolutionary trajectory. We apply our modeling approach to dihydrofolate reductase (DHFR) laboratory evolution data and the resulting landscape parameters capture important aspects of DHFR structure and function. We use the resulting model to understand the structure of the fitness landscape and find numerous examples of epistasis but an overall global peak that is evolutionarily accessible from most starting sequences. Finally, we use the model to perform an in silico extrapolation of the DHFR laboratory evolution trajectory and computationally design proteins from future evolutionary rounds.
15

Perron, Umberto, Alexey M. Kozlov, Alexandros Stamatakis, Nick Goldman, and Iain H. Moal. "Modeling Structural Constraints on Protein Evolution via Side-Chain Conformational States." Molecular Biology and Evolution 36, no. 9 (May 22, 2019): 2086–103. http://dx.doi.org/10.1093/molbev/msz122.

Abstract:
Few models of sequence evolution incorporate parameters describing protein structure, despite its high conservation, essential functional role and increasing availability. We present a structurally aware empirical substitution model for amino acid sequence evolution in which proteins are expressed using an expanded alphabet that relays both amino acid identity and structural information. Each character specifies an amino acid as well as information about the rotamer configuration of its side-chain: the discrete geometric pattern of permitted side-chain atomic positions, as defined by the dihedral angles between covalently linked atoms. By assigning rotamer states in 251,194 protein structures and identifying 4,508,390 substitutions between closely related sequences, we generate a 55-state “Dayhoff-like” model that shows that the evolutionary properties of amino acids depend strongly upon side-chain geometry. The model performs as well as or better than traditional 20-state models for divergence time estimation, tree inference, and ancestral state reconstruction. We conclude that not only is rotamer configuration a valuable source of information for phylogenetic studies, but that modeling the concomitant evolution of sequence and structure may have important implications for understanding protein folding and function.
16

Papadopoulos, Chris, Isabelle Callebaut, Jean-Christophe Gelly, Isabelle Hatin, Olivier Namy, Maxime Renard, Olivier Lespinet, and Anne Lopes. "Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution." Genome Research 31, no. 12 (November 22, 2021): 2303–15. http://dx.doi.org/10.1101/gr.275638.121.

Abstract:
The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences’ properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states’ diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
17

Lichtinger, Simon M., Adiran Garaizar, Rosana Collepardo-Guevara, and Aleks Reinhardt. "Targeted modulation of protein liquid–liquid phase separation by evolution of amino-acid sequence." PLOS Computational Biology 17, no. 8 (August 24, 2021): e1009328. http://dx.doi.org/10.1371/journal.pcbi.1009328.

Abstract:
Rationally and efficiently modifying the amino-acid sequence of proteins to control their ability to undergo liquid–liquid phase separation (LLPS) on demand is not only highly desirable, but can also help to elucidate which protein features are important for LLPS. Here, we propose a computational method that couples a genetic algorithm to a sequence-dependent coarse-grained protein model to evolve the amino-acid sequences of phase-separating intrinsically disordered protein regions (IDRs), and purposely enhance or inhibit their capacity to phase-separate. We validate the predicted critical solution temperatures of the mutated sequences with ABSINTH, a more accurate all-atom model. We apply the algorithm to the phase-separating IDRs of three naturally occurring proteins, namely FUS, hnRNPA1 and LAF1, as prototypes of regions that exist in cells and undergo homotypic LLPS driven by different types of intermolecular interaction, and we find that the evolution of amino-acid sequences towards enhanced LLPS is driven in these three cases, among other factors, by an increase in the average size of the amino acids. However, the direction of change in the molecular driving forces that enhance LLPS (such as hydrophobicity, aromaticity and charge) depends on the initial amino-acid sequence. Finally, we show that the evolution of amino-acid sequences to modulate LLPS is strongly coupled to the make-up of the medium (e.g. the presence or absence of RNA), which may have significant implications for our understanding of phase separation within the many-component mixtures of biological systems.
18

Podgornaia, Anna I., and Michael T. Laub. "Pervasive degeneracy and epistasis in a protein-protein interface." Science 347, no. 6222 (February 5, 2015): 673–77. http://dx.doi.org/10.1126/science.1257360.

Abstract:
Mapping protein sequence space is a difficult problem that necessitates the analysis of 20^N combinations for sequences of length N. We systematically mapped the sequence space of four key residues in the Escherichia coli protein kinase PhoQ that drive recognition of its substrate PhoP. We generated a library containing all 160,000 variants of PhoQ at these positions and used a two-step selection coupled to next-generation sequencing to identify 1659 functional variants. Our results reveal extensive degeneracy in the PhoQ-PhoP interface and epistasis, with the effect of individual substitutions often highly dependent on context. Together, epistasis and the genetic code create a pattern of connectivity of functional variants in sequence space that likely constrains PhoQ evolution. Consequently, the diversity of PhoQ orthologs is substantially lower than that of functional PhoQ variants.
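The library size quoted in the abstract above follows directly from the combinatorics: with 20 amino acids and N = 4 positions, there are 20^4 = 160,000 variants. The short sketch below reproduces that arithmetic and the fraction of variants reported as functional. The helper names `sequence_space_size` and `enumerate_variants` are hypothetical and are not from the paper.

```python
from itertools import product
from typing import Iterator

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes, 20 standard residues


def sequence_space_size(n_positions: int,
                        alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Number of distinct sequences of length n_positions."""
    return alphabet_size ** n_positions


def enumerate_variants(n_positions: int) -> Iterator[str]:
    """Lazily yield every amino acid combination at n_positions sites."""
    return ("".join(combo) for combo in product(AMINO_ACIDS, repeat=n_positions))


library_size = sequence_space_size(4)
print(library_size)                   # 160000
print(f"{1659 / library_size:.2%}")   # 1.04% of variants reported functional
```

The generator form matters for larger N: the space grows exponentially, so even ten positions give 20^10 (about 10^13) combinations, far too many to hold in memory at once.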
19

Khatun, Mst Shamima, Watshara Shoombuatong, Md Mehedi Hasan, and Hiroyuki Kurata. "Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction." Current Genomics 21, no. 6 (September 16, 2020): 454–63. http://dx.doi.org/10.2174/1389202921999200625103936.

Abstract:
Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identification of the PPIs is pivotal, which contributes to many biological processes including protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive. Bioinformatics approaches are expected to solve such restrictions. In this review, our main goal is to provide an inclusive view of the existing sequence-based computational prediction of PPIs. Initially, we briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, working principles, and their performances. Finally, we discuss the caveats and future perspective of the next generation algorithms for the prediction of PPIs.
20

Tzul, Franco O., Daniel Vasilchuk, and George I. Makhatadze. "Evidence for the principle of minimal frustration in the evolution of protein folding landscapes." Proceedings of the National Academy of Sciences 114, no. 9 (February 14, 2017): E1627–E1632. http://dx.doi.org/10.1073/pnas.1613892114.

Abstract:
Theoretical and experimental studies have firmly established that protein folding can be described by a funneled energy landscape. This funneled energy landscape is the result of foldable protein sequences evolving following the principle of minimal frustration, which allows proteins to rapidly fold to their native biologically functional conformations. For a protein family with a given functional fold, the principle of minimal frustration suggests that, independent of sequence, all proteins within this family should fold with similar rates. However, depending on the optimal living temperature of the organism, proteins also need to modulate their thermodynamic stability. Consequently, the difference in thermodynamic stability should be primarily caused by differences in the unfolding rates. To test this hypothesis experimentally, we performed comprehensive thermodynamic and kinetic analyses of 15 different proteins from the thioredoxin family. Eight of these thioredoxins were extant proteins from psychrophilic, mesophilic, or thermophilic organisms. The other seven protein sequences were obtained using ancestral sequence reconstruction and can be dated back over 4 billion years. We found that all studied proteins fold with very similar rates but unfold with rates that differ up to three orders of magnitude. The unfolding rates correlate well with the thermodynamic stability of the proteins. Moreover, proteins that unfold slower are more resistant to proteolysis. These results provide direct experimental support to the principle of minimal frustration hypothesis.
21

Markovič, Oskar, and Štefan Janeček. "Pectin degrading glycoside hydrolases of family 28: sequence-structural features, specificities and evolution." Protein Engineering, Design and Selection 14, no. 9 (September 2001): 615–31. http://dx.doi.org/10.1093/protein/14.9.615.

22

Ferrada, Evandro. "Gene Families, Epistasis and the Amino Acid Preferences of Protein Homologs." Evolutionary Bioinformatics 15 (January 2019): 117693431987048. http://dx.doi.org/10.1177/1176934319870485.

Abstract:
In order to preserve structure and function, proteins tend to preferentially conserve amino acids at particular sites along the sequence. Because mutations can affect structure and function, the question arises whether the preference of a protein site for a particular amino acid varies between protein homologs, and to what extent that variation depends on sequence divergence. Answering these questions can help in the development of models of sequence evolution, as well as provide insights on the dependence of the fitness effects of mutations on the genetic background of sequences, a phenomenon known as epistasis. Here, I comment on recent computational work providing a systematic analysis of the extent to which the amino acid preferences of proteins depend on the background mutations of protein homologs.
23

Studer, Romain A., and Marc Robinson-Rechavi. "Evidence for an episodic model of protein sequence evolution." Biochemical Society Transactions 37, no. 4 (July 22, 2009): 783–86. http://dx.doi.org/10.1042/bst0370783.

Abstract:
The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.
24

Gumulya, Yosephine, and Elizabeth M. J. Gillam. "Exploring the past and the future of protein evolution with ancestral sequence reconstruction: the ‘retro’ approach to protein engineering." Biochemical Journal 474, no. 1 (December 22, 2016): 1–19. http://dx.doi.org/10.1042/bcj20160507.

Abstract:
A central goal in molecular evolution is to understand the ways in which genes and proteins evolve in response to changing environments. In the absence of intact DNA from fossils, ancestral sequence reconstruction (ASR) can be used to infer the evolutionary precursors of extant proteins. To date, ancestral proteins belonging to eubacteria, archaea, yeast and vertebrates have been inferred that have been hypothesized to date from between several million to over 3 billion years ago. ASR has yielded insights into the early history of life on Earth and the evolution of proteins and macromolecular complexes. Recently, however, ASR has developed from a tool for testing hypotheses about protein evolution to a useful means for designing novel proteins. The strength of this approach lies in the ability to infer ancestral sequences encoding proteins that have desirable properties compared with contemporary forms, particularly thermostability and broad substrate range, making them good starting points for laboratory evolution. Developments in technologies for DNA sequencing and synthesis and computational phylogenetic analysis have led to an escalation in the number of ancient proteins resurrected in the last decade and greatly facilitated the use of ASR in the burgeoning field of synthetic biology. However, the primary challenge of ASR remains in accurately inferring ancestral states, despite the uncertainty arising from evolutionary models, incomplete sequences and limited phylogenetic trees. This review will focus, firstly, on the use of ASR to uncover links between sequence and phenotype and, secondly, on the practical application of ASR in protein engineering.
25

Daza, Daniel Ocampo, Görel Sundström, Christina A. Bergqvist, Cunming Duan, and Dan Larhammar. "Evolution of the Insulin-Like Growth Factor Binding Protein (IGFBP) Family." Endocrinology 152, no. 6 (April 19, 2011): 2278–89. http://dx.doi.org/10.1210/en.2011-0047.

Abstract:
The evolution of the IGF binding protein (IGFBP) gene family has been difficult to resolve. Both chromosomal and serial duplications have been suggested as mechanisms for the expansion of this gene family. We have identified and annotated IGFBP sequences from a wide selection of vertebrate species as well as Branchiostoma floridae and Ciona intestinalis. By combining detailed sequence analysis with sequence-based phylogenies and chromosome information, we arrive at the following scenario: the ancestral chordate IGFBP gene underwent a local gene duplication, resulting in a gene pair adjacent to a HOX cluster. Subsequently, the gene family expanded in the two basal vertebrate tetraploidizations (2R), resulting in the six IGFBP types that are presently found in placental mammals. The teleost fish ancestor underwent a third tetraploidization (3R) that further expanded the IGFBP repertoire. The five sequenced teleost fish genomes retain 9–11 IGFBP genes. This scenario is supported by the phylogenies of three adjacent gene families in the HOX gene regions, namely the epidermal growth factor receptors (EGFR) and the Ikaros and distal-less (DLX) transcription factors. Our sequence comparisons show that several important structural components in the IGFBPs are ancestral vertebrate features that have been maintained in all orthologs, for instance the integrin interaction motif Arg-Gly-Asp in IGFBP-2. In contrast, the Arg-Gly-Asp motif in IGFBP-1 has arisen independently in mammals. The large degree of retention of IGFBP genes after the ancient expansion of the gene family strongly suggests that each gene evolved distinct and important functions early in vertebrate evolution.
26

Shafee, Thomas, Antony Bacic, and Kim Johnson. "Evolution of Sequence-Diverse Disordered Regions in a Protein Family: Order within the Chaos." Molecular Biology and Evolution 37, no. 8 (May 2, 2020): 2155–72. http://dx.doi.org/10.1093/molbev/msaa096.

Abstract:
Approaches for studying the evolution of globular proteins are now well established yet are unsuitable for disordered sequences. Our understanding of the evolution of proteins containing disordered regions therefore lags that of globular proteins, limiting our capacity to estimate their evolutionary history, classify paralogs, and identify potential sequence–function relationships. Here, we overcome these limitations by using new analytical approaches that project representations of sequence space to dissect the evolution of proteins with both ordered and disordered regions, and the correlated changes between these. We use the fasciclin-like arabinogalactan proteins (FLAs) as a model family, since they contain a variable number of globular fasciclin domains as well as several distinct types of disordered regions: proline (Pro)-rich arabinogalactan (AG) regions and longer Pro-depleted regions. Sequence space projections of fasciclin domains from 2019 FLAs from 78 species identified distinct clusters corresponding to different types of fasciclin domains. Clusters can be similarly identified in the seemingly random Pro-rich AG and Pro-depleted disordered regions. Sequence features of the globular and disordered regions clearly correlate with one another, implying coevolution of these distinct regions, as well as with the N-linked and O-linked glycosylation motifs. We reconstruct the overall evolutionary history of the FLAs, annotated with the changing domain architectures, glycosylation motifs, number and length of AG regions, and disordered region sequence features. Mapping these features onto the functionally characterized FLAs therefore enables their sequence–function relationships to be interrogated. These findings will inform research on the abundant disordered regions in protein families from all kingdoms of life.
27

Xia, Yu, and Michael Levitt. "Simulating protein evolution in sequence and structure space." Current Opinion in Structural Biology 14, no. 2 (April 2004): 202–7. http://dx.doi.org/10.1016/j.sbi.2004.03.001.

28

Wilke, Claus O., and D. Allan Drummond. "Signatures of protein biophysics in coding sequence evolution." Current Opinion in Structural Biology 20, no. 3 (June 2010): 385–89. http://dx.doi.org/10.1016/j.sbi.2010.03.004.

29

Thorne, Jeffrey L. "Models of protein sequence evolution and their applications." Current Opinion in Genetics & Development 10, no. 6 (December 2000): 602–5. http://dx.doi.org/10.1016/s0959-437x(00)00142-8.

30

Heringa, Jaap. "The evolution and recognition of protein sequence repeats." Computers & Chemistry 18, no. 3 (September 1994): 233–43. http://dx.doi.org/10.1016/0097-8485(94)85018-6.

31

Ortiz, Angel R., and Jeffrey Skolnick. "Sequence Evolution and the Mechanism of Protein Folding." Biophysical Journal 79, no. 4 (October 2000): 1787–99. http://dx.doi.org/10.1016/s0006-3495(00)76430-7.

32

Zhang, Jianzhi, and Jian-Rong Yang. "Determinants of the rate of protein sequence evolution." Nature Reviews Genetics 16, no. 7 (June 9, 2015): 409–20. http://dx.doi.org/10.1038/nrg3950.

33

Tan, C. S. H. "Sequence, Structure, and Network Evolution of Protein Phosphorylation." Science Signaling 4, no. 182 (July 12, 2011): mr6. http://dx.doi.org/10.1126/scisignal.2002093.

34

Kosiol, C., I. Holmes, and N. Goldman. "An Empirical Codon Model for Protein Sequence Evolution." Molecular Biology and Evolution 24, no. 7 (March 8, 2007): 1464–79. http://dx.doi.org/10.1093/molbev/msm064.

35

Kosiol, C., I. Holmes, and N. Goldman. "An Empirical Codon Model for Protein Sequence Evolution." Molecular Biology and Evolution 24, no. 9 (May 23, 2007): 2151. http://dx.doi.org/10.1093/molbev/msm154.

36

Grahnen, Johan A., Priyanka Nandakumar, Jan Kubelka, and David A. Liberles. "Biophysical and structural considerations for protein sequence evolution." BMC Evolutionary Biology 11, no. 1 (2011): 361. http://dx.doi.org/10.1186/1471-2148-11-361.

37

Yuan, Ling, Itzhak Kurek, James English, and Robert Keenan. "Laboratory-Directed Protein Evolution." Microbiology and Molecular Biology Reviews 69, no. 3 (September 2005): 373–92. http://dx.doi.org/10.1128/mmbr.69.3.373-392.2005.

Abstract:
Systematic approaches to directed evolution of proteins have been documented since the 1970s. The ability to recruit new protein functions arises from the considerable substrate ambiguity of many proteins. The substrate ambiguity of a protein can be interpreted as the evolutionary potential that allows a protein to acquire new specificities through mutation or to regain function via mutations that differ from the original protein sequence. All organisms have evolutionarily exploited this substrate ambiguity. When exploited in a laboratory under controlled mutagenesis and selection, it enables a protein to “evolve” in desired directions. One of the most effective strategies in directed protein evolution is to gradually accumulate mutations, either sequentially or by recombination, while applying selective pressure. This is typically achieved by the generation of libraries of mutants followed by efficient screening of these libraries for targeted functions and subsequent repetition of the process using improved mutants from the previous screening. Here we review some of the successful strategies in creating protein diversity and the more recent progress in directed protein evolution in a wide range of scientific disciplines and its impacts in chemical, pharmaceutical, and agricultural sciences.
38

Aadland, Kelsey, and Bryan Kolaczkowski. "Alignment-Integrated Reconstruction of Ancestral Sequences Improves Accuracy." Genome Biology and Evolution 12, no. 9 (August 12, 2020): 1549–65. http://dx.doi.org/10.1093/gbe/evaa164.

Abstract:
Ancestral sequence reconstruction (ASR) uses an alignment of extant protein sequences, a phylogeny describing the history of the protein family and a model of the molecular-evolutionary process to infer the sequences of ancient proteins, allowing researchers to directly investigate the impact of sequence evolution on protein structure and function. Like all statistical inferences, ASR can be sensitive to violations of its underlying assumptions. Previous studies have shown that, whereas phylogenetic uncertainty has only a very weak impact on ASR accuracy, uncertainty in the protein sequence alignment can more strongly affect inferred ancestral sequences. Here, we show that errors in sequence alignment can produce errors in ASR across a range of realistic and simplified evolutionary scenarios. Importantly, sequence reconstruction errors can lead to errors in estimates of structural and functional properties of ancestral proteins, potentially undermining the reliability of analyses relying on ASR. We introduce an alignment-integrated ASR approach that combines information from many different sequence alignments. We show that integrating alignment uncertainty improves ASR accuracy and the accuracy of downstream structural and functional inferences, often performing as well as highly accurate structure-guided alignment. Given the growing evidence that sequence alignment errors can impact the reliability of ASR studies, we recommend that future studies incorporate approaches to mitigate the impact of alignment uncertainty. Probabilistic modeling of insertion and deletion events has the potential to radically improve ASR accuracy when the model reflects the true underlying evolutionary history, but further studies are required to thoroughly evaluate the reliability of these approaches under realistic conditions.
39

Cohen, H. M., D. S. Tawfik, and A. D. Griffiths. "Altering the sequence specificity of HaeIII methyltransferase by directed evolution using in vitro compartmentalization." Protein Engineering Design and Selection 17, no. 1 (January 1, 2004): 3–11. http://dx.doi.org/10.1093/protein/gzh001.

40

Boucher, Jeffrey I., Troy W. Whitfield, Ann Dauphin, Gily Nachum, Carl Hollins, Konstantin B. Zeldovich, Ronald Swanstrom, Celia A. Schiffer, Jeremy Luban, and Daniel N. A. Bolon. "Constrained Mutational Sampling of Amino Acids in HIV-1 Protease Evolution." Molecular Biology and Evolution 36, no. 4 (February 4, 2019): 798–810. http://dx.doi.org/10.1093/molbev/msz022.

Abstract:
The evolution of HIV-1 protein sequences should be governed by a combination of factors including nucleotide mutational probabilities, the genetic code, and fitness. The impact of these factors on protein sequence evolution is interdependent, making it challenging to infer the individual contribution of each factor from phylogenetic analyses alone. We investigated the protein sequence evolution of HIV-1 by determining an experimental fitness landscape of all individual amino acid changes in protease. We compared our experimental results to the frequency of protease variants in a publicly available data set of 32,163 sequenced isolates from drug-naïve individuals. The most common amino acids in sequenced isolates supported robust experimental fitness, indicating that the experimental fitness landscape captured key features of selection acting on protease during viral infections of hosts. Amino acid changes requiring multiple mutations from the likely ancestor were slightly less likely to support robust experimental fitness than single mutations, consistent with the genetic code favoring chemically conservative amino acid changes. Amino acids that were common in sequenced isolates were predominantly accessible by single mutations from the likely protease ancestor. Multiple mutations commonly observed in isolates were accessible by mutational walks with highly fit single mutation intermediates. Our results indicate that the prevalence of multiple-base mutations in HIV-1 protease is strongly influenced by mutational sampling.
41

Yan, Zhiqiang, and Jin Wang. "Funneled energy landscape unifies principles of protein binding and evolution." Proceedings of the National Academy of Sciences 117, no. 44 (October 16, 2020): 27218–23. http://dx.doi.org/10.1073/pnas.2013822117.

Abstract:
Most proteins have evolved to spontaneously fold into native structure and specifically bind with their partners for the purpose of fulfilling biological functions. According to Darwin, protein sequences evolve through random mutations, and only the fittest survives. The understanding of how the evolutionary selection sculpts the interaction patterns for both biomolecular folding and binding is still challenging. In this study, we incorporated the constraint of functional binding into the selection fitness based on the principle of minimal frustration for the underlying biomolecular interactions. Thermodynamic stability and kinetic accessibility were derived and quantified from a global funneled energy landscape that satisfies the requirements of both the folding into the stable structure and binding with the specific partner. The evolution proceeds via a bowl-like evolution energy landscape in the sequence space with a closed-ring attractor at the bottom. The sequence space is increasingly reduced until this ring attractor is reached. The molecular-interaction patterns responsible for folding and binding are identified from the evolved sequences, respectively. The residual positions participating in the interactions responsible for folding are highly conserved and maintain the hydrophobic core under additional evolutionary constraints of functional binding. The positions responsible for binding constitute a distributed network via coupling conservations that determine the specificity of binding with the partner. This work unifies the principles of protein binding and evolution under minimal frustration and sheds light on the evolutionary design of proteins for functions.
42

Caetano-Anollés, Gustavo, Minglei Wang, Derek Caetano-Anollés, and Jay E. Mittenthal. "The origin, evolution and structure of the protein world." Biochemical Journal 417, no. 3 (January 16, 2009): 621–37. http://dx.doi.org/10.1042/bj20082063.

Abstract:
Contemporary protein architectures can be regarded as molecular fossils, historical imprints that mark important milestones in the history of life. Whereas sequences change at a considerable pace, higher-order structures are constrained by the energetic landscape of protein folding, the exploration of sequence and structure space, and complex interactions mediated by the proteostasis and proteolytic machineries of the cell. The survey of architectures in the living world that was fuelled by recent structural genomic initiatives has been summarized in protein classification schemes, and the overall structure of fold space explored with novel bioinformatic approaches. However, metrics of general structural comparison have not yet unified architectural complexity using the ‘shared and derived’ tenet of evolutionary analysis. In contrast, a shift of focus from molecules to proteomes and a census of protein structure in fully sequenced genomes were able to uncover global evolutionary patterns in the structure of proteins. Timelines of discovery of architectures and functions unfolded episodes of specialization, reductive evolutionary tendencies of architectural repertoires in proteomes and the rise of modularity in the protein world. They revealed a biologically complex ancestral proteome and the early origin of the archaeal lineage. Studies also identified an origin of the protein world in enzymes of nucleotide metabolism harbouring the P-loop-containing triphosphate hydrolase fold and the explosive discovery of metabolic functions that recapitulated well-defined prebiotic shells and involved the recruitment of structures and functions. These observations have important implications for origins of modern biochemistry and diversification of life.
43

Trifonov, Edward N., Alla Kirzhner, Valery M. Kirzhner, and Igor N. Berezovsky. "Distinct Stages of Protein Evolution as Suggested by Protein Sequence Analysis." Journal of Molecular Evolution 53, no. 4-5 (October 1, 2001): 394–401. http://dx.doi.org/10.1007/s002390010229.

44

Liu, Ying, Annie Huang, Rebecca M. Booth, Gabriela Geraldo Mendes, Zabeena Merchant, Kathleen S. Matthews, and Sarah E. Bondos. "Evolution of the activation domain in a Hox transcription factor." International Journal of Developmental Biology 62, no. 11-12 (2018): 745–53. http://dx.doi.org/10.1387/ijdb.180151sb.

Abstract:
Linking changes in amino acid sequences to the evolution of transcription regulatory domains is often complicated by the low sequence complexity and high mutation rates of intrinsically disordered protein regions. For the Hox transcription factor Ultrabithorax (Ubx), conserved motifs distributed throughout the protein sequence enable direct comparison of specific protein regions, despite variations in the length and composition of the intervening sequences. In cell culture, the strength of transcription activation by Drosophila melanogaster Ubx correlates with the presence of a predicted helix within its activation domain. Curiously, this helix is not preserved in species more divergent than flies, suggesting the nature of transcription activation may have evolved. To determine whether this helix contributes to Drosophila Ubx function in vivo, wild-type and mutant proteins were ectopically expressed in the developing wing and the phenotypes evaluated. Helix mutations alter Drosophila Ubx activity in the developing wing, demonstrating its functional importance in vivo. The locations of activation domains in Ubx orthologues were identified by testing the ability of truncation mutants to activate transcription in yeast one-hybrid assays. In Ubx orthologues representing 540 million years of evolution, the ability to activate transcription varies substantially. The sequence and the location of the activation domains also differ. Consequently, analogous regions of Ubx orthologues change function over time, and may activate transcription in one species, but have no activity, or even inhibit transcription activation in another species. Unlike homeodomain-DNA binding, the nature of transcription activation by Ubx has substantially evolved.
45

Harris, AJ, and Aaron David Goldman. "The very early evolution of protein translocation across membranes." PLOS Computational Biology 17, no. 3 (March 8, 2021): e1008623. http://dx.doi.org/10.1371/journal.pcbi.1008623.

Abstract:
In this study, we used a computational approach to investigate the early evolutionary history of a system of proteins that, together, embed and translocate other proteins across cell membranes. Cell membranes comprise the basis for cellularity, which is an ancient, fundamental organizing principle shared by all organisms and a key innovation in the evolution of life on Earth. Two related requirements for cellularity are that organisms are able to both embed proteins into membranes and translocate proteins across membranes. One system that accomplishes these tasks is the signal recognition particle (SRP) system, in which the core protein components are the paralogs, FtsY and Ffh. Complementary to the SRP system is the Sec translocation channel, in which the primary channel-forming protein is SecY. We performed phylogenetic analyses that strongly supported prior inferences that FtsY, Ffh, and SecY were all present by the time of the last universal common ancestor of life, the LUCA, and that the ancestor of FtsY and Ffh existed before the LUCA. Further, we combined ancestral sequence reconstruction and protein structure and function prediction to show that the LUCA had an SRP system and Sec translocation channel that were similar to those of extant organisms. We also show that the ancestor of Ffh and FtsY that predated the LUCA was more similar to FtsY than Ffh but could still have comprised a rudimentary protein translocation system on its own. Duplication of the ancestor of FtsY and Ffh facilitated the specialization of FtsY as a membrane bound receptor and Ffh as a cytoplasmic protein that could bind nascent proteins with specific membrane-targeting signal sequences. Finally, we analyzed amino acid frequencies in our ancestral sequence reconstructions to infer that the ancestral Ffh/FtsY protein likely arose prior to or just after the completion of the canonical genetic code. Taken together, our results offer a window into the very early evolutionary history of cellularity.
46

Wolfe, Kenneth H. "Comparative genomics and genome evolution in yeasts." Philosophical Transactions of the Royal Society B: Biological Sciences 361, no. 1467 (February 2006): 403–12. http://dx.doi.org/10.1098/rstb.2005.1799.

Abstract:
Yeasts provide a powerful model system for comparative genomics research. The availability of multiple complete genome sequences from different fungal groups—currently 18 hemiascomycetes, 8 euascomycetes and 4 basidiomycetes—enables us to gain a broad perspective on genome evolution. The sequenced genomes span a continuum of divergence levels ranging from multiple individuals within a species to species pairs with low levels of protein sequence identity and no conservation of gene order. One of the most interesting emerging areas is the growing number of events such as gene losses, gene displacements and gene relocations that can be attributed to the action of natural selection.
47

Samuel, Selvaraj, and Mary Rajathei. "A Web Database IR-PDB for Sequence Repeats of Proteins in the Protein Data Bank." International Journal of Knowledge Discovery in Bioinformatics 7, no. 2 (July 2017): 1–10. http://dx.doi.org/10.4018/ijkdb.2017070101.

Abstract:
Amino acid repeats play significant roles in the evolution of structure and function of many large proteins. Analysis of internal repeats in proteins of known structure helps to clarify the importance of those repeats. A database, IR-PDB, of sequence repeats in the proteins of the PDB has been developed for analyzing the impact of repeats in proteins. Using the state-of-the-art repeat detection method RADAR, internal repeats were detected in 148202 sequences out of 285714 sequences belonging to 115031 PDB structures. The identified sequence repeats were annotated with secondary structural information with a view to analyzing the structural consequences and conservation of the repeats. The tertiary structure of the repeats and their functional involvement can be explored through web links to PDB, PDBsum and Pfam. IR-PDB systematically annotates the proteins in the PDB that contain sequence repeats, together with their structure, and the dataset can be accessed interactively through web services.
48

Waterborg, Jakob H. "Evolution of histone H3: emergence of variants and conservation of post-translational modification sites." Biochemistry and Cell Biology 90, no. 1 (February 2012): 79–95. http://dx.doi.org/10.1139/o11-036. (This article is part of Special Issue entitled Asilomar Chromatin and has undergone the Journal’s usual peer review process.)

Abstract:
Histone H3 proteins are highly conserved across all eukaryotes and are dynamically modified by many post-translational modifications (PTMs). Here we describe a method that defines the evolution of the family of histone H3 proteins, including the emergence of functionally distinct variants. It combines information from histone H3 protein sequences in eukaryotic species with the evolution of these species as described by the tree of life (TOL) project. This so-called TOL analysis identified the time when the few observed protein sequence changes occurred and when distinct, co-existing H3 protein variants arose. Four distinct ancient duplication events were identified where replication-coupled (RC) H3 variants diverged from replication-independent (RI) forms, like histone H3.3 in animals. These independent events occurred in ancestral lineages leading to the clades of metazoa, viridiplantae, basidiomycota, and alveolata. The proto-H3 sequence in the last eukaryotic common ancestor (LECA) was expanded to at least 133 of its 135 residues. Extreme conservation of known acetylation and methylation sites of lysines and arginines predicts that these PTMs will exist across the eukaryotic crown phyla and in protists with canonical chromatin structures. Less complete conservation was found for most serine and threonine phosphorylation sites. This study demonstrates that TOL analysis can determine the evolution of slowly evolving proteins in sequence-saturated datasets.
49

Letarov, A., X. Manival, C. Desplats, and H. M. Krisch. "gpwac of the T4-Type Bacteriophages: Structure, Function, and Evolution of a Segmented Coiled-Coil Protein That Controls Viral Infectivity." Journal of Bacteriology 187, no. 3 (February 1, 2005): 1055–66. http://dx.doi.org/10.1128/jb.187.3.1055-1066.2005.

Abstract:
The wac gene product (gpwac) or fibritin of bacteriophage T4 forms the six fibers that radiate from the phage neck. During phage morphogenesis these whiskers bind the long tail fibers (LTFs) and facilitate their attachment to the phage baseplate. After the cell lysis, the gpwac fibers function as part of an environmental sensing device that retains the LTFs in a retracted configuration and thus prevents phage adsorption in unfavorable conditions. A comparative analysis of the sequences of 5 wac gene orthologs from various T4-type phages reveals that the ∼50-amino-acid N-terminal domain is the only highly conserved segment of the protein. This sequence conservation is probably a direct consequence of the domain's strong and specific interactions with the neck proteins. The sequence of the central fibrous region of gpwac is highly plastic, with only the heptad periodicity of the coiled-coil structure being conserved. In the various gpwac sequences, the small C-terminal domain essential for initiation of the folding of T4 gpwac is replaced by unrelated sequences of unknown origin. When a distant T4-type phage has a novel C-terminal gpwac sequence, the phage's gp36 sequence that is located at the knee joint of the LTF invariably has a novel domain in its C terminus as well. The covariance of these two sequences is compatible with genetic data suggesting that the C termini of gpwac and gp36 engage in a protein-protein interaction that controls phage infectivity. These results add to the limited evidence for domain swapping in the evolution of phage structural proteins.
50

Kuznetsov, Vladimir A. "Hypergeometric Model of Evolution of Conserved Protein Coding Sequences in the Proteomes." Fluctuation and Noise Letters 03, no. 03 (September 2003): L295–L324. http://dx.doi.org/10.1142/s0219477503001397.

Abstract:
The diversity of protein sequences that exists today has probably evolved from antecedent evolutionarily conserved domain-like sequences (i.e. motifs, repeats, structural domains) encoded by short ancient genes. We have studied the statistical distributions of the occurrences of the domain-like families within proteins in the proteomes. A generalized hypergeometric stochastic process is introduced in order to model the evolution dynamics of these conserved sequences. We found that the limiting probability function associated with this process fits the empirical distributions for the 90 fully sequenced bacterial, archaeal and eukaryotic organisms. For eukaryotes, our limiting distribution is reduced to Waring's distribution. However, for many archaeal and bacterial organisms the empirical distributions degenerate to the Yule-like distribution. Comparison of all of these distributions implies critical evolutionary events, which lead to the proportional growth of the number of new protein-coding genes and proteome complexity in the eukaryotic organisms, and suggests that the evolution of many archaeal and bacterial organisms is subject to external global (ecological) forces. Best-fit model data predict that (1) there are only ~5500 or so distinct InterPro domains in a given higher eukaryotic organism and that (2) a general trend in eukaryotic proteome evolution is described by the increase in frequency of multi-domain proteins composed of already-existing (older) distinct domains as opposed to creating new ones. Our model can be applied to the analysis of the evolution of word distributions in texts and used in other large-scale evolutionary systems like the Internet, the economy and the universe.