Dissertations / Theses: 'Protein Structure Networks (PSN)'

1

Zhao, Jing. "Protein Structure Prediction Based on Neural Networks." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/23636.


Abstract:

Proteins are the basic building blocks of biological organisms, and are responsible for a variety of functions within them. Proteins are composed of unique amino acid sequences. Some have only one sequence, while others contain several sequences that are combined together. These combined amino acid sequences fold to form a unique three-dimensional (3D) shape. Although the sequences may fold proteins into different 3D shapes in diverse environments, proteins with similar amino acid sequences typically have similar 3D shapes and functions. Knowledge of the 3D shape of a protein is important in both protein function analysis and drug design, for example when assessing the toxicity reduction associated with a given drug. Due to the complexity of protein 3D shapes and the close relationship between shapes and functions, the prediction of protein 3D shapes has become an important topic in bioinformatics. This research introduces a new approach to predict proteins’ 3D shapes, utilizing a multilayer artificial neural network. Our novel solution allows one to learn and predict the representations of the 3D shape associated with a protein by starting directly from its amino acid sequence descriptors. The input of the artificial neural network is a set of amino acid sequence descriptors we created based on a set of probability density functions. In our algorithm, the probability density functions are calculated from the correlation between the constituent amino acids, according to the substitution matrix. The output layer of the network is formed by 3D shape descriptors provided by an information retrieval system, called CAPRI. This system contains the pose invariant 3D shape descriptors, and retrieves proteins having the closest structures. The network is trained on proteins with known amino acid sequences and 3D shapes. Once the network has been trained, it is able to predict the 3D shape descriptors of the query protein.
Based on the predicted 3D shape descriptors, the CAPRI system allows the retrieval of known proteins with 3D shapes closest to the query protein. These retrieved proteins may be verified as to whether they are in the same family as the query protein, since proteins in the same family generally have similar 3D shapes. The search for similar 3D shapes is done against a database of more than 45,000 known proteins. We present the results when evaluating our approach against a number of protein families of various sizes. Further, we consider a number of different neural network architectures and optimization algorithms. When the neural network is trained with proteins that are from large families where the proteins in the same family have similar amino acid sequences, the accuracy for finding proteins from the same family is 100%. When we employ proteins whose family members have dissimilar amino acid sequences, or proteins from a small family, neural networks with one hidden layer produce more promising results than networks with two hidden layers, and performance may be improved by increasing the number of hidden nodes in the single hidden layer.


2

Zotenko, Elena. "Computational methods in protein structure comparison and analysis of protein interaction networks." College Park, Md.: University of Maryland, 2007. http://hdl.handle.net/1903/7621.


Abstract:

Thesis (Ph. D.) -- University of Maryland, College Park, 2007.
Thesis research directed by: Dept. of Computer Science. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.


3

Grochow, Joshua A. "On the structure and evolution of protein interaction networks." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/42053.


Abstract:

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 107-114).
The study of protein interactions from the network point of view has yielded new insights into systems biology [Bar03, MA03, RSM+02, WS98]. In particular, "network motifs" have become apparent as a useful and systematic tool for describing and exploring networks [BP06, MKFV06, MSOI+02, SOMMA02, SV06]. Finding motifs has involved either exact counting (e.g. [MSOI+02]) or subgraph sampling (e.g. [BP06, KIMA04a, MZW05]). In this thesis we develop an algorithm to count all instances of a particular subgraph, which can be used to query whether a given subgraph is a significant motif. This method can be used to perform exact counting of network motifs faster and with less memory than previous methods, and can also be combined with subgraph sampling to find larger motifs than ever before -- we have found motifs with up to 15 nodes and explored subgraphs up to 20 nodes. Unlike previous methods, this method can also be used to explore motif clustering and can be combined with network alignment techniques [FNS+06, KSK+03]. We also present new methods of estimating parameters for models of biological network growth, and present a new model based on these parameters and underlying binding domains. Finally, we propose an experiment to explore the effect of the whole genome duplication [KBL04] on the protein-protein interaction network of S. cerevisiae, allowing us to distinguish between cases of subfunctionalization and neofunctionalization.
by Joshua A. Grochow.
M.Eng.
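
To make the notion of "counting all instances of a particular subgraph" concrete, the following is a naive brute-force sketch. It is not the algorithm developed in the thesis, which the abstract describes as faster and more memory-efficient than earlier methods; the function name and the choice to count non-induced occurrences are assumptions made for illustration. It enumerates every injective mapping of pattern nodes onto graph nodes and divides by the number of automorphisms of the pattern so that each instance is counted once.

```python
from itertools import permutations


def count_subgraph_instances(graph_edges, pattern_edges):
    """Count non-induced occurrences of an undirected pattern in an undirected graph.

    Brute force: only practical for very small patterns and graphs.
    Node labels must be sortable within each edge list.
    """
    graph = {frozenset(e) for e in graph_edges}
    graph_nodes = sorted({v for e in graph_edges for v in e})
    pattern = [tuple(e) for e in pattern_edges]
    pattern_nodes = sorted({v for e in pattern for v in e})

    def embeddings(target_nodes, target_edges):
        # Number of injective, edge-preserving maps from the pattern into the target.
        total = 0
        for image in permutations(target_nodes, len(pattern_nodes)):
            mapping = dict(zip(pattern_nodes, image))
            if all(frozenset((mapping[a], mapping[b])) in target_edges
                   for a, b in pattern):
                total += 1
        return total

    # Each instance is hit once per automorphism of the pattern.
    automorphisms = embeddings(pattern_nodes, {frozenset(e) for e in pattern})
    return embeddings(graph_nodes, graph) // automorphisms


# Toy data: the complete graph on four nodes and a triangle pattern.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
print(count_subgraph_instances(k4, triangle))
```

Deciding whether a subgraph is a significant motif would additionally require comparing this count against counts in randomized networks, which the sketch does not attempt.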


4

Valenta, Martin. "Predikce proteinových domén." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236163.


Abstract:

This work focuses on proteins and their domains. It briefly describes methods for determining protein structure at the various levels of the hierarchy. This is followed by an examination of existing tools for protein domain prediction and of databases containing domain information. The next part of the work introduces selected representatives of prediction methods. These methods work with information about the internal structure of the molecule or the amino acid sequence. The relevant chapter outlines the procedure applied to predict domain boundaries. The prediction is derived from the primary structure of the protein, using a neural network. The implemented procedure and the possibilities for its further development in the related thesis are presented in the conclusion of this work.


5

Tsilo, Lipontseng Cecilia. "Protein secondary structure prediction using neural networks and support vector machines." Thesis, Rhodes University, 2009. http://hdl.handle.net/10962/d1002809.


Abstract:

Predicting the secondary structure of proteins is important in biochemistry because the 3D structure can be determined from the local folds that are found in secondary structures. Moreover, knowing the tertiary structure of proteins can assist in determining their functions. The objective of this thesis is to compare the performance of Neural Networks (NN) and Support Vector Machines (SVM) in predicting the secondary structure of 62 globular proteins from their primary sequence. For each of NN and SVM, we created six binary classifiers to distinguish between the classes helix (H), strand (E), and coil (C). For NN we use Resilient Backpropagation training with and without early stopping. We use NN with either no hidden layer or with one hidden layer with 1, 2, ..., 40 hidden neurons. For SVM we use a Gaussian kernel with the kernel parameter fixed at 0.1 and varying cost parameters C in the range [0.1, 5]. 10-fold cross-validation is used to obtain overall estimates for the probability of making a correct prediction. Our experiments indicate for NN and SVM that the different binary classifiers have varying accuracies: from 69% correct predictions for coil vs. non-coil up to 80% correct predictions for strand vs. non-strand. It is further demonstrated that NN with no hidden layer or with no more than 2 hidden neurons in the hidden layer are sufficient for better predictions. For SVM we show that the estimated accuracies do not depend on the value of the cost parameter. As a major result, we demonstrate that the accuracy estimates of NN and SVM binary classifiers cannot be distinguished. This contradicts a modern belief in bioinformatics that SVM outperforms other predictors.


6

Alistair, Chalk. "PREDICTION OF PROTEIN SECONDARY STRUCTURE by Incorporating Biophysical Information into Artificial Neural Networks." Thesis, University of Skövde, Department of Computer Science, 1998. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-235.


Abstract:

This project applied artificial neural networks to the field of secondary structure prediction of proteins. A NETtalk architecture with a window size of 13 was used. Over-fitting was avoided by the use of 3 real numbers to represent amino acids, reducing the number of adjustable weights to 840. Two alternative representations of amino acids that incorporated biophysical data were created and tested. They were tested both separately and in combination on a standard 7-fold cross-validation set of 126 proteins. The best performance was achieved using an average result from two predictions. This was then filtered and gave the following results. Accuracy levels for core structures were: Q3 total accuracy of 61.3%, consisting of Q3 accuracies of 54.0%, 38.1% and 77.0% for helix, strand and coil respectively, with Matthews correlation coefficients Ca = 0.34, Cb = 0.26, Cc = 0.31. The average lengths of structures predicted were 9.8, 4.9 and 11.0 for helix, sheet and coil respectively. These results are lower than those of other methods using single sequences and localist representations. The most likely reason for this is over-generalisation caused by using a small number of units.
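
The sliding-window input described in this abstract can be sketched as follows. The three-number codes in `toy_encoding` are placeholders rather than the biophysical values used in the project, and zero-padding at the sequence ends is an assumption. The weight count is also only a plausibility check: the abstract does not state the hidden-layer size, but 840 weights is what a 13 x 3 input with 20 hidden units and 3 outputs would give if bias terms are not counted.

```python
def window_features(sequence, encoding, window=13):
    """Build one input vector per residue from a centred sliding window.

    Positions that fall outside the sequence are zero-padded (an assumption).
    """
    dims = len(next(iter(encoding.values())))
    half = window // 2
    padding = [0.0] * dims
    features = []
    for center in range(len(sequence)):
        vector = []
        for pos in range(center - half, center + half + 1):
            if 0 <= pos < len(sequence):
                vector.extend(encoding[sequence[pos]])
            else:
                vector.extend(padding)
        features.append(vector)
    return features


def count_weights(window, dims_per_residue, hidden_units, outputs):
    """Weights of a one-hidden-layer network, ignoring bias terms."""
    inputs = window * dims_per_residue
    return inputs * hidden_units + hidden_units * outputs


# Hypothetical 3-number codes for a three-letter toy alphabet.
toy_encoding = {"A": (0.1, 0.2, 0.3), "G": (0.4, 0.5, 0.6), "L": (0.7, 0.8, 0.9)}
vectors = window_features("GALLAG", toy_encoding)
print(len(vectors), len(vectors[0]))
print(count_weights(window=13, dims_per_residue=3, hidden_units=20, outputs=3))
```

For comparison, a 20-dimensional one-hot code per residue with the same hypothetical hidden layer would need 13 x 20 x 20 + 20 x 3 weights, which illustrates why a compact 3-number code reduces the risk of over-fitting.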


7

Reyaz-Ahmed, Anjum B. "Protein Secondary Structure Prediction Using Support Vector Machines, Nueral Networks and Genetic Algorithms." Digital Archive @ GSU, 2007. http://digitalarchive.gsu.edu/cs_theses/43.


Abstract:

Bioinformatics techniques for protein secondary structure prediction mostly depend on the information available in the amino acid sequence. Support vector machines (SVM) have shown strong generalization ability in a number of application areas, including protein structure prediction. In this study, a new sliding window scheme with multiple windows is introduced to form the protein data for training and testing SVM. An orthogonal encoding scheme coupled with the BLOSUM62 matrix is used to make the prediction. First, the predictions of binary classifiers using multiple windows are compared with those of the single-window scheme; the results show that a single window is not good in all cases. Two new classifiers are introduced for effective tertiary classification. These new classifiers use neural networks and genetic algorithms to optimize the accuracy of the tertiary classifier. The accuracy levels of the new architectures are determined and compared with other studies. The tertiary architecture is better than most available techniques.


8

Mulnaes, Daniel [Verfasser]. "TopSuite: A meta-suite for protein structure prediction using deep neural networks / Daniel Mulnaes." Düsseldorf : Universitäts- und Landesbibliothek der Heinrich-Heine-Universität Düsseldorf, 2020. http://d-nb.info/1222261634/34.



9

Royer, Loic. "Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-62562.


Abstract:

Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph theoretic approach to lossless and compact representation of complex networks. The method groups edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high quality interactomes.
Third, we apply power graph analysis to networks derived from text-mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.
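
The idea that cliques and bicliques allow a lossless, compact representation can be illustrated with a small sketch. The data layout (a power edge as a pair of node sets, with a pair of identical sets standing for a clique) and the use of relative edge reduction as a stand-in for compressibility are assumptions made here for illustration; the abstract does not define the measure, and the sketch does not implement the power graph search itself.

```python
from itertools import combinations, product


def expand_power_edges(power_edges):
    """Recover the ordinary edges represented by a list of power edges.

    A power edge (A, A) stands for a clique on A; a power edge (A, B) with
    disjoint A and B stands for the biclique joining every node of A to
    every node of B.
    """
    edges = set()
    for left, right in power_edges:
        pairs = combinations(left, 2) if left == right else product(left, right)
        edges.update(frozenset(pair) for pair in pairs)
    return edges


def edge_reduction(power_edges):
    """Fraction of edges saved by the power-edge representation."""
    original = len(expand_power_edges(power_edges))
    return 1 - len(power_edges) / original


# Toy example: one biclique (2 x 3 = 6 edges) and one clique (3 edges).
toy_power_edges = [
    (frozenset({"a", "b"}), frozenset({"c", "d", "e"})),
    (frozenset({"x", "y", "z"}), frozenset({"x", "y", "z"})),
]
print(len(expand_power_edges(toy_power_edges)))
print(edge_reduction(toy_power_edges))
```

Because `expand_power_edges` reproduces every original edge, the representation is lossless in the sense used above, while the two power edges replace nine ordinary ones.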


10

Planas, Iglesias Joan 1980. "On the study of 3D structure of proteins for developing new algorithms to complete the interactome and cell signalling networks." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/104152.


Abstract:

Proteins are indispensable players in virtually all biological events. The functions of proteins are determined by their three-dimensional (3D) structure and coordinated through intricate networks of protein-protein interactions (PPIs). Hence, a deep comprehension of such networks turns out to be crucial for understanding cellular biology. Computational approaches have become critical tools for analysing PPI networks. In silico methods take advantage of the existing PPI knowledge to both predict new interactions and predict the function of proteins. Regarding the task of predicting PPIs, several methods have already been developed. However, recent findings demonstrate that such methods could take advantage of the knowledge on non-interacting protein pairs (NIPs). On the task of predicting the function of proteins, the Guilt-by-Association (GBA) principle can be exploited to extend the functional annotation of proteins over PPI networks. In this thesis, a new algorithm for PPI prediction and a protocol to complete cell signalling networks are presented. iLoops is a method that uses NIP data and structural information of proteins to predict the binding fate of protein pairs. A novel protocol for completing signalling networks, a task related to predicting the function of a protein, has also been developed. The protocol is based on the application of the GBA principle in PPI networks.


11

Senekal, Frederick Petrus. "Protein secondary structure prediction using amino acid regularities." Diss., Pretoria : [s.n.], 2008. http://upetd.up.ac.za/thesis/available/etd-01232009-120040/.



12

Clayton, Arnshea. "The Relative Importance of Input Encoding and Learning Methodology on Protein Secondary Structure Prediction." Digital Archive @ GSU, 2006. http://digitalarchive.gsu.edu/cs_theses/19.


Abstract:

In this thesis the relative importance of input encoding and learning algorithm on protein secondary structure prediction is explored. A novel input encoding, based on multidimensional scaling applied to a recently published amino acid substitution matrix, is developed and shown to be superior to an arbitrary input encoding. Both decimal valued and binary input encodings are compared. Two neural network learning algorithms, Resilient Propagation and Learning Vector Quantization, which have not previously been applied to the problem of protein secondary structure prediction, are examined. Input encoding is shown to have a greater impact on prediction accuracy than learning methodology with a binary input encoding providing the highest training and test set prediction accuracy.


13

Zhu, Shaoming. "Multiscale analysis of protein functions and stochastic modelling of gene transcriptional regulatory networks." Thesis, Queensland University of Technology, 2010. https://eprints.qut.edu.au/41693/1/Shaoming_Zhu_Thesis.pdf.


Abstract:

Genomic and proteomic analyses have attracted a great deal of interest in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations inspire further research in biological fields in return. These biological sequences, which may be considered as multiscale sequences, have some specific features which need further efforts to characterise using more refined methods. This project aims to study some of these biological challenges with multiscale analysis methods and a stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motivates us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because the resemblance at the protein sequence level normally indicates the similarity of functions and structures. However, some proteins may have similar functions with low sequence identity. We consider protein sequences at a detailed level to investigate the problem of comparison of proteins.
The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expressions has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expressions still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.


14

Sardana, Divya. "Analysis of Meso-scale Structures in Weighted Graphs." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1510927111275038.



15

Di, Domenico Tomás. "Computational Analysis and Annotation of Proteome Data: Sequence, Structure, Function and Interactions." Doctoral thesis, Università degli studi di Padova, 2014. http://hdl.handle.net/11577/3423805.


Abstract:

With the advent of modern sequencing technologies, the amount of biological data available has begun to challenge our ability to process it. The development of new tools and methods has become essential for the production of results based on such a vast amount of information. This thesis focuses on the development of such computational tools and methods for the study of protein data. I first present the work done towards the understanding of intrinsic protein disorder. Through the development of novel disorder predictors, we were able to expand the available data sources to cover any protein of known sequence. By storing these predicted annotations, together with data from other sources, we created MobiDB, a resource that provides a comprehensive view of available disorder annotations for a protein of interest, covering all sequences in the UniProt database. Based on observations obtained from this resource, we proceeded to create a data analysis workflow with the goal of furthering our understanding of intrinsic protein disorder. The second part focuses on tandem repeat proteins. The RAPHAEL method was developed to assist in the identification of tandem repeat protein structures from PDB files. Identified repeat structures were then manually classified into a formal classification schema, and published as part of the RepeatsDB database. Finally, I describe the development of network-based tools for the analysis of protein data. RING allows the user to visualise and study the structure of a protein as a network of nodes, linked by physico-chemical properties. The second method, PANADA, enables the user to create protein similarity networks and to assess the transferability of functional annotations between clusters of proteins.
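
Viewing a protein structure as a network of residues can be sketched in a few lines. The version below is a deliberate simplification and not RING's method: it links residues whose representative atoms lie within a distance cutoff, whereas RING is described above as linking nodes by physico-chemical properties. The coordinates, the 5.0 cutoff and the function names are all hypothetical.

```python
from math import dist


def contact_network(coords, cutoff=5.0):
    """Link every pair of residues whose representative atoms are within `cutoff`.

    `coords` maps a residue number to an (x, y, z) tuple, for example the
    position of its C-alpha atom.
    """
    residues = sorted(coords)
    edges = set()
    for i, a in enumerate(residues):
        for b in residues[i + 1:]:
            if dist(coords[a], coords[b]) <= cutoff:
                edges.add((a, b))
    return edges


def degrees(edges):
    """Number of contacts per residue."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Made-up coordinates for four residues.
toy_coords = {
    1: (0.0, 0.0, 0.0),
    2: (3.8, 0.0, 0.0),
    3: (7.6, 0.0, 0.0),
    4: (3.8, 3.0, 0.0),
}
network = contact_network(toy_coords)
print(sorted(network))
print(degrees(network))
```

In practice the coordinates would be read from a PDB file, and edges would typically carry an interaction type rather than a bare distance test.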


16

Royer, Loic [Verfasser], Michael [Akademischer Betreuer] Schroeder, and Ralf [Gutachter] Zimmer. "Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis / Loic Royer ; Gutachter: Ralf Zimmer ; Betreuer: Michael Schroeder." Dresden : Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://d-nb.info/1150309210/34.



17

Karami, Yasaman. "Joint analysis of dynamically correlated networks and coevolved residue clusters : large-scale analysis and methods for predicting the effects of genetic disease associated mutations." Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066375/document.


Abstract:

We presented COMMA, a method to describe and compare the dynamical architectures of different proteins or different variants of the same protein. COMMA extracts dynamical properties from conformational ensembles to identify communication pathways, chains of residues linked by stable interactions that move together, and independent cliques, clusters of residues that fluctuate in a concerted way. It provides a description of the infostery of a protein or protein complex that goes beyond the notions of chain, domain and secondary structure element/motif, and beyond classical measures of how a protein moves and/or changes its shape. We showed the efficiency of our approach in providing mechanistic insights into the effects of deleterious mutations by pinpointing residues playing key roles in the propagation of these effects. In addition, COMMA reveals a link between clusters of coevolving residues and networks of dynamical correlations. It makes it possible to contrast the different types of communication occurring between residues and to rank the different regions of a protein according to their communication efficiency. Furthermore, we presented an approach that exploits both sequences and structural dynamics to predict a mutational landscape. The discussion of examples provided a physical interpretation of how the study of conservation brings significant insights into the sensitivity of conserved positions to mutations. Our proposed method can detect protein regions that are prone to disorder or substantial conformational rearrangements. Moreover, it enabled us to suggest mutations that regulate the stability of the disordered coiled-coils.


18

Malik, Sheriff Rahuman S. [Verfasser], Eli [Akademischer Betreuer] Zamir, Philippe I. [Gutachter] Bastiaens, and Katja [Gutachter] Ickstadt. "Spatially resolving the dynamics and structure of protein networks in adhesion sites / Rahuman S. Malik Sheriff. Betreuer: Eli Zamir. Gutachter: Philippe I. Bastiaens ; Katja Ickstadt." Dortmund : Universitätsbibliothek Dortmund, 2014. http://d-nb.info/1104947404/34.


19

Villar, Gabriel. "Aqueous droplet networks for functional tissue-like materials." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:602f9161-368c-48c0-9619-7974f743f2f2.


Abstract:

An aqueous droplet in a solution of lipids in oil acquires a lipid monolayer coat, and two such droplets adhere to form a bilayer at their interface. Networks of droplets have been constructed in this way that function as light sensors, batteries and electrical circuits by using membrane proteins incorporated into the bilayers. However, the droplets have been confined to a bulk oil phase, which precludes direct communication with physiological environments. Further, the networks typically have been assembled manually, which limits their scale and complexity. This thesis addresses these limitations, and thereby enables prospective medical and technological applications for droplet networks. In the first part of the work, defined droplet networks are encapsulated within mm-scale drops of oil in water to form structures called multisomes. The encapsulated droplets adhere to one another and to the surface of the oil drop to form interface bilayers that allow them to communicate with each other and with the surrounding aqueous environment through membrane pores. The contents of the droplets can be released by changing the pH or temperature of the surrounding solution. Multisomes have potential applications in synthetic biology and medicine. In the second part of the work, a three-dimensional printing technique is developed that allows the construction of complex networks of tens of thousands of heterologous droplets ~50 µm in diameter. The droplets form a self-supporting material in bulk oil or water analogous to biological tissue. The mechanical properties of the material are calculated to be similar to those of soft tissues. Membrane proteins can be printed in specific droplets, for example to establish a conductive pathway through an otherwise insulating network. Further, the networks can be programmed by osmolarity gradients to fold into designed shapes. 
Printed droplet networks can serve as platforms for soft devices, and might be interfaced with living tissues for medical applications.


20

Hellenkamp, Björn [author], Thorsten [academic supervisor] [reviewer] Hugel, Martin [reviewer] Zacharias, and Ben [reviewer] Schuler. "Dynamic structure of a multi-domain protein: uncovered using self-consistent FRET networks and time-correlated distance distributions / Björn Hellenkamp; Reviewers: Thorsten Hugel, Martin Zacharias, Ben Schuler; Supervisor: Thorsten Hugel." München: Universitätsbibliothek der TU München, 2016. http://d-nb.info/1132773954/34.


21

Toufighi, Kiana 1980. "Integrative study of gene expression and protein complexes." Doctoral thesis, Universitat Pompeu Fabra, 2014. http://hdl.handle.net/10803/380907.


Abstract:

Over the last several decades, the emerging ‘integrated’ view of the cell has triumphed over the ‘one gene/one protein/one function’ paradigm. This is illustrated by the biologically opposite effects of key regulatory proteins in different cell types, in established versus primary cells, and in vitro versus in vivo situations. The persistent theme throughout this dissertation is the integration of a wide range of data sources for the purpose of understanding distinct cellular contexts. We first use circadian expression data from human epidermal stem cells to discover waves of transcripts expressed in tune with known clock genes and show that time-of-day dependent responses to proliferation/differentiation cues are important for skin homeostasis. We then combine these expression data with information on protein structures and complexes to describe how protein-complex assembly is temporally regulated during differentiation. Lastly, we show that human protein complexes are composed of a stable ‘core’ and a plastic ‘periphery’ whose tissue-specific expression allows protein complexes to function in a context-dependent manner.
In recent decades, the emerging integrative view of the cell has triumphed over the historical paradigm of ‘one gene/one protein/one function’. This is illustrated by the opposite biological effects of key regulatory proteins in immortalized versus primary cell cultures and in vitro versus in vivo. The persistent theme of this dissertation is the integration of a wide range of data to study distinct cellular contexts. First, we use gene expression data obtained from epidermal stem cells to discover waves of transcription expressed in tune with known circadian rhythm genes. In this study we show that the responses of stem cells to proliferation/differentiation signals depend on the time of day, and that circadian time is important for skin homeostasis. Next, we combine these expression data with structural information on proteins and protein complexes to describe the temporal regulation of complexes during differentiation. Finally, we show that human protein complexes are composed of a stable ‘core’ and a plastic ‘periphery’ whose tissue-specific expression allows protein complexes to function in a context-dependent manner.


22

Dorn, Márcio. "MOIRAE : a computational strategy to predict 3-D structures of polypeptides." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2012. http://hdl.handle.net/10183/142870.


Abstract:

Currently, one of the main research problems in Structural Bioinformatics is the study and prediction of the 3-D structure of proteins. The genome projects of the 1990s resulted in a large increase in the number of known protein sequences. However, the number of experimentally determined 3-D protein structures has not followed the same growth trend, so the number of protein sequences is much higher than the number of known 3-D structures. Many computational methodologies, systems and algorithms have been proposed to address the protein structure prediction problem. However, the problem remains challenging because of the complexity and high dimensionality of the protein conformational search space. This work presents a new computational strategy for the 3-D protein structure prediction problem. A first-principles strategy that uses database information to predict the 3-D structure of polypeptides was developed. The proposed technique manipulates structural information from the PDB in order to generate torsion angle intervals. These intervals are used as input to a genetic algorithm with a local-search operator, which searches the protein conformational space and predicts the 3-D structure. Results show that the 3-D structures obtained by the proposed method were topologically comparable to their corresponding experimental structures.
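
The idea of searching within torsion angle intervals using a genetic algorithm with a local-search step can be illustrated with a minimal sketch. This is not the MOIRAE implementation: the intervals, the target angles and the `energy` function below are invented stand-ins (a real method would derive intervals from PDB data and score conformations with a physical potential), and all function names are hypothetical.

```python
import random

# Hypothetical (phi, psi) intervals in degrees, one pair per residue.
# In the thesis these would come from PDB-derived information; here they are invented.
INTERVALS = [
    ((-90.0, -30.0), (-70.0, -10.0)),    # helix-like region
    ((-90.0, -30.0), (-70.0, -10.0)),
    ((-160.0, -80.0), (100.0, 170.0)),   # strand-like region
    ((-160.0, -80.0), (100.0, 170.0)),
]

# Toy stand-in for an energy function: squared distance to arbitrary target angles.
TARGET = [(-60.0, -45.0), (-60.0, -45.0), (-120.0, 130.0), (-120.0, 130.0)]


def energy(conf):
    return sum((phi - tp) ** 2 + (psi - ts) ** 2
               for (phi, psi), (tp, ts) in zip(conf, TARGET))


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def random_conformation(rng):
    # Sample every angle uniformly inside its allowed interval.
    return [(rng.uniform(*phi_iv), rng.uniform(*psi_iv)) for phi_iv, psi_iv in INTERVALS]


def mutate(conf, rng, step=15.0):
    # Perturb one residue, keeping both angles inside their intervals.
    i = rng.randrange(len(conf))
    (phi_lo, phi_hi), (psi_lo, psi_hi) = INTERVALS[i]
    phi, psi = conf[i]
    new = list(conf)
    new[i] = (clamp(phi + rng.uniform(-step, step), phi_lo, phi_hi),
              clamp(psi + rng.uniform(-step, step), psi_lo, psi_hi))
    return new


def crossover(a, b, rng):
    # One-point crossover along the chain.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def local_search(conf, rng, tries=20):
    # Simple hill climbing with small moves; accepts only improvements.
    best = conf
    for _ in range(tries):
        cand = mutate(best, rng, step=5.0)
        if energy(cand) < energy(best):
            best = cand
    return best


def run_ga(generations=40, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [random_conformation(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=energy)
        parents = pop[:pop_size // 2]          # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = mutate(crossover(a, b, rng), rng)
            children.append(local_search(child, rng))
        pop = parents + children
    return min(pop, key=energy)
```

Because mutation clamps each angle to its interval, every candidate stays inside the allowed region, which is the point of restricting the search space before running the genetic algorithm.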


23

Pereira, José Geraldo de Carvalho. "Redes neurais residuais profundas e autômatos celulares como modelos para predição que fornecem informação sobre a formação de estruturas secundárias proteicas." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/95/95131/tde-03052018-095932/.


Abstract:

The process by which a protein structure self-organizes from its amino acid chain is known as folding. Although we know the three-dimensional structure of many proteins, for most of them we do not have sufficient understanding to describe in detail how the structure is organized from the amino acid sequence. It is well known that the formation of nuclei of local structure, known as secondary structure, plays a fundamental role in the final folding of the protein. Thus, the development of methods that make it possible not only to predict the secondary structure adopted by a given residue, but also the way in which this process should occur over time, is highly relevant to several areas of structural biology. In this work, we developed two methods for secondary structure prediction using models with the potential to provide more detailed information about the prediction process. One of these models was built using cellular automata, a type of dynamic model from which spatial and temporal information can be obtained. The other model was developed using deep residual neural networks. With this model it is possible to extract spatial and probabilistic information from its multiple internal convolution layers, which seems to reflect, in some sense, the states of secondary structure formation during folding. The prediction accuracy obtained by this model was ~78% for residues that showed consensus in the structure assigned by the DSSP, STRIDE, KAKSI and PROSS methods. This accuracy, although lower than that obtained by PSIPRED, which uses PSSM matrices as input, is higher than that obtained by other methods that predict secondary structure directly from the amino acid sequence.
The process of self-organization of the protein structure is known as folding. Although we know the structure of many proteins, for the majority of them we do not have enough understanding to describe in detail how the structure is organized from its amino acid sequence. In this work, we developed two methods for secondary structure prediction using models that have the potential to provide detailed information about the prediction process. One of these models was constructed using cellular automata, a type of dynamic model from which it is possible to obtain spatial and temporal information. The other model was developed using deep residual neural networks. With this model it is possible to extract spatial and probabilistic information from its multiple internal convolution layers. The prediction accuracy obtained by this model was ~78% for residues that showed consensus in the structure assigned by the DSSP, STRIDE, KAKSI and PROSS methods. This value is higher than that obtained by other methods that predict secondary structure from the amino acid sequence only.
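
The evaluation protocol mentioned above, in which accuracy is measured only on residues where several assignment methods agree, can be sketched as follows. The strings are invented toy data in a three-state alphabet (H, E, C); the function names are hypothetical and this is not the thesis code.

```python
def consensus_positions(assignments):
    """Return the indices at which every assignment string gives the same state."""
    length = len(assignments[0])
    if any(len(a) != length for a in assignments):
        raise ValueError("all assignments must have the same length")
    return [i for i in range(length) if len({a[i] for a in assignments}) == 1]


def consensus_accuracy(predicted, assignments):
    """Fraction of consensus residues whose predicted state matches the consensus."""
    if len(predicted) != len(assignments[0]):
        raise ValueError("prediction and assignments must have the same length")
    idx = consensus_positions(assignments)
    if not idx:
        return float("nan")
    reference = assignments[0]  # any assignment works at consensus positions
    return sum(predicted[i] == reference[i] for i in idx) / len(idx)


# Invented toy assignments standing in for DSSP, STRIDE, KAKSI and PROSS output.
dssp   = "CHHHHCEEEC"
stride = "CHHHHCEEEC"
kaksi  = "CCHHHCEEEC"
pross  = "CHHHHCEECC"
prediction = "CHHHCCEEEC"

methods = [dssp, stride, kaksi, pross]
print(consensus_positions(methods))             # positions where all four agree
print(consensus_accuracy(prediction, methods))  # accuracy restricted to those positions
```

Restricting the evaluation to consensus positions removes residues whose "true" state is itself ambiguous, so reported accuracies on this subset are not directly comparable with accuracies computed over all residues.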


24

Bhattacharyya, Moitrayee. "Probing Ligand Induced Perturbations In Protien Structure Networks : Physico-Chemical Insights From MD Simulations And Graph Theory." Thesis, 2012. http://etd.iisc.ernet.in/handle/2005/2341.


Abstract:

The fidelity of biological processes and reactions, in spite of the widespread diversity, is programmed by highly specific physico-chemical principles. This underlies our basic understanding of different interesting phenomena of biological relevance, ranging from enzyme specificity to allosteric communication, from selection of fold to structural organization / states of oligomerization, from half-sites-reactivity to reshuffling of the conformational free energy landscape, encompassing the dogma of sequence-structure-dynamics-function of macromolecules. The role of striking an optimal balance between rigidity and flexibility in macromolecular 3D structural organisation is yet another concept that needs attention from the functional perspective. Needless to say, the variety of protein structures and conformations naturally leads to the diversity of their function and consequently of many other biological functions in general. Classical models of allostery like the ‘MWC model’ or the ‘KNF model’ and the more recently proposed ‘population shift model’ have advanced our understanding of the underlying principles of long range signal transfer in macromolecules. Extensive studies have also reported the importance of fold selection and 3D structural organisation in the context of macromolecular function. Also, ligand induced conformational changes in macromolecules, both subtle and drastic, form the basis for controlling several biological processes in an ordered manner by re-organizing the free energy landscape. The above mentioned biological phenomena have been observed using several different biochemical and biophysical approaches. Although these processes may often seem independent of each other and are associated with regulation of specialized functions in macromolecules, it is worthwhile to investigate whether they share any commonality or interdependence at the detailed atomic level of the 3D structural organisation.
So the nagging question is: do these diverse biological processes have a unifying theme when probed at a level that takes into account even subtle re-orchestrations of the interactions and energetics at the protein/nucleic acid side-chain level? This is a complex problem to address and here we have made attempts to examine this problem using computational tools. Two methods have been extensively applied: Molecular Dynamics (MD) simulations and network theory and related parameters. Network theory has been extensively used in the past in several studies, ranging from analysis of social networks to systems level networks in biology (e.g., metabolic networks), and has also found applications in the varied fields of physics, economics, cartography and psychology. More recently, this concept has been applied to study the intricate details of the structural organisation in proteins, providing a local view of molecular interactions from a global perspective. On the other hand, MD simulations capture the dynamics of interactions and the conformational space associated with a given state (e.g., different ligand-bound states) of the macromolecule. The union of these two methods enables the detection and investigation of the energetic and geometric re-arrangements of the 3D structural organisation of macromolecules/macromolecular complexes from a dynamical or ensemble perspective, and this has been one of the thrust areas of the current study. So we not only correlate structure and function in terms of subtle changes in interactions but also bring conformational dynamics into the picture by studying such changes along the MD ensemble. The focus was to identify the subtle rearrangements of interactions between non-covalently interacting partners in proteins and the interacting nucleic acids.
We propose that these rearrangements in interactions between residues (amino acids in proteins, nucleic acids in RNA/DNA) form the common basis for different biological phenomena which regulate several apparently unrelated processes in biology. Broadly, the major goal of this work is to elucidate, in molecular detail, the physico-chemical principles underlying some of the important biological phenomena, such as allosteric communication, ligand induced modulation of rigidity/flexibility, half-sites-reactivity and so on. We have investigated several proteins and protein-RNA/DNA complexes to formulate general methodologies to address these questions from a molecular perspective. In the process we have also specifically shed light on the mechanistic aspects of the aminoacylation reaction by aminoacyl-tRNA synthetases like tryptophanyl and pyrrolysyl tRNA synthetase, and on structural details related to an enzyme catalyzed reaction that influences the process of quorum sensing in bacteria. Further, we have also examined the ‘dynamic allosterism’ that manipulates the activity of MutS, a prominent component of the DNA bp ‘mismatch repair’ machinery. Additionally, our protein structure network (PSN) based studies on a dataset of Rossmann fold containing proteins have provided insights into the structural signatures that drive the adoption of a fold from a repertoire of diverse sequences. Ligand induced percolations distant from the active sites, which may be of functional relevance, have also been probed in the context of the S1A family of serine proteases. In the course of our investigation, we have borrowed several concepts of network parameters from social network analysis and have developed new concepts. The Introduction (Chapter-1) summarizes the relevant literature and lays down a suitable background for the subsequent chapters in the thesis.
The major questions addressed and the main goal of this thesis are described to set an appropriate stage for the detailed discussions. The methodologies involved are discussed in Chapter-2. Chapter-3 deals with a protein, LuxS, that is involved in bacterial quorum sensing; the first part of the chapter describes the application of network analysis to the static structures of several LuxS proteins from different organisms and the second part describes the application of a dynamic network approach to analyze the MD trajectories of H.pylori LuxS. Chapter-4 focuses on the investigation of human tryptophanyl-tRNA synthetase (hTrpRS), with an emphasis on identifying ligand induced subtle conformational changes in terms of the alternation of rigidity/flexibility at different sites and the re-organisation of the free energy landscape. Chapter-5 presents a novel application of a quantum clustering (QC) technique, popular in the field of pattern recognition, to objectively cluster the conformations sampled by molecular dynamics simulations performed on different ligand bound structures of the protein. The protein structure networks (PSNs) in the earlier studies were constructed on the basis of geometric interactions. In Chapters 6 and 7, we describe the networks (proteins+nucleic acids) using interaction energies as edges, thus incorporating the detailed chemistry in terms of an energy-weighted complex network. Chapter-6 describes an application of the energy weighted network formalism to probe allosteric communication in D.hafniense pyrrolysyl-tRNA synthetase. The methodology developed for the in-depth study of ligand induced changes in DhPylRS has been adapted to the protein MutS, the first ‘check-point protein’ for DNA base pair (bp) mismatch repair. In Chapter-7, we describe the network analysis and the biological insights derived from this study (the work was done in collaboration with Prof. David Beveridge and Dr. Susan Pieniazek).
Chapter-8 describes the application of a network approach to capture the ligand-induced subtle global changes in protein structures, using a dataset of high resolution structures from the S1A family of serine proteases. Chapter-9 deals with probing the structural rationale behind diverse sequences adopting the same fold with the NAD(P)-binding Rossmann fold as a case study. Future directions are discussed in the final chapter of the thesis (Chapter-10).


25

Brinda, K. V. "Protein Structure Networks : Implications To Protein Stabiltiy And Protein-Protein Interactions." Thesis, 2005. http://etd.iisc.ernet.in/handle/2005/1504.


26

Vijayabaskar, M. S. "Protein-DNA Graphs And Interaction Energy Based Protein Structure Networks." Thesis, 2011. http://etd.iisc.ernet.in/handle/2005/1904.


Abstract:

Proteins orchestrate a number of cellular processes either alone or in concert with other biomolecules like nucleic acids, carbohydrates, and lipids. They exhibit an intrinsic ability to fold de novo to their functional states. The three–dimensional structure of a protein, dependent on its amino acid sequence, is important for its function. Understanding this sequence– structure–function relationship has become one of the primary goals in biophysics. Various experimental techniques like X–ray crystallography, Nuclear Magnetic Resonance (NMR), and site–directed mutagenesis have been used extensively towards this goal. Computational studies include mainly sequence based, and structure based approaches. The sequence based approaches such as sequence alignments, phylogenetic analysis, domain identification, statistical coupling analysis etc., aim at deriving meaningful information from the primary sequence of the protein. The structure based approaches, on the other hand, use structures of folded proteins. Recent advances in structure determination and efforts by various structural consortia have resulted in an enormous amount of structures available for analysis. Innumerable observations such as the allowed and disallowed regions in the conformations of a peptide unit, hydrophobic core in globular proteins, existence of regular secondary structures like helices, sheets, and turns and a limited fold space have been landmarks in understanding the characteristics of protein structures. The uniqueness of protein structure is attained through non–covalent interactions among the constituent amino acids. Analyses of protein structures show that different types of non–covalent interactions like hydrophobic interactions, hydrogen bonding, salt bridges, aromatic stacking, cation–π interactions, and solvent interactions hold protein structures together. 
Although such structure analyses have provided a wealth of information, they have largely been performed at a pair–wise level and an investigation involving such pair–wise interactions alone is not sufficient to capture all the determinants of protein structures, since they happen at a global level. This consideration has led to the development of graphs/networks for proteins. Graphs, or networks, are collections of nodes connected by edges. Protein Structure Networks (PSNs) can be constructed using various definitions of nodes and edges. Nodes may vary from atoms to secondary structures in proteins, and the edges can range from simple atom–atom distances to distances between secondary structures. To study the interplay of amino acids in structure formation, the most commonly used PSNs consider amino acids as nodes. The criterion for edge definition, however, varies. PSNs can be constructed at a coarse–grained level by considering the distances between Cα/Cβ atoms, any side–chain atoms, or the centroids of the amino acids. At a finer level, PSNs can be constructed using atomic details by considering the interaction types or by computing the extent of interaction between amino acids. Representation of proteins as networks and their analysis has given us a unique perspective on various aspects such as protein structure organization, stability, folding, function, oligomerization and so on. A variety of network properties like the degree distribution, clustering coefficient, characteristic path lengths, clusters, and hubs have been investigated. Most of these studies are carried out on protein structures alone. However, the interaction of proteins with other biopolymers like nucleic acids is vital for many crucial biological processes like transcription and translation. In this thesis, we have attempted to address this problem by constructing and analyzing combined graphs of the structures of protein and DNA.
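
As a concrete illustration of the coarse–grained construction described above, the following minimal sketch builds a residue-level network from Cα coordinates and computes two of the properties mentioned (hubs and the clustering coefficient). The 6.5 Å cutoff, the sequence-separation filter, the hub threshold and the toy coordinates are assumptions made for the example, not values taken from the thesis.

```python
import math


def build_psn(ca_coords, cutoff=6.5, min_seq_sep=2):
    """Residues are nodes; an edge joins residues whose C-alpha atoms lie within
    `cutoff` angstroms and that are at least `min_seq_sep` apart in sequence."""
    n = len(ca_coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + min_seq_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def hubs(adj, min_degree=4):
    """Nodes whose degree reaches an (assumed) hub threshold."""
    return sorted(node for node, nbrs in adj.items() if len(nbrs) >= min_degree)


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k) if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))


# Invented toy coordinates (angstroms), one C-alpha per residue.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
              (3.8, 3.8, 0.0), (20.0, 0.0, 0.0)]
toy_psn = build_psn(toy_coords)
print(toy_psn)
print(hubs(toy_psn, min_degree=2))
```

Changing the edge definition (side–chain atoms, centroids, interaction strength) only changes the test inside `build_psn`; the downstream network parameters are computed in the same way.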
Also, in almost all of the PSN studies, the connections have been made solely on the basis of geometric criteria. In the later part of the thesis, we have taken PSNs a step further by defining the non–covalent connections based on chemical considerations in the form of the energies of interactions. The thesis contains two sections. The first part mainly involves the construction and application of PSNs to study DNA binding proteins. The DNA binding proteins are involved in several high fidelity processes like DNA recombination, DNA replication, and transcription. Although the protein–DNA interfaces have been extensively analyzed using pair–wise interactions, we gain an additional global perspective from the network approach. Furthermore, most of the earlier investigations have been carried out from the protein point of view (protein centric) and the present network approach aims to combine both the protein centric and the DNA centric view points by construction and analysis of protein–DNA graphs. These studies are described in Chapters 3 and 4. The second part of the thesis discusses the development, characterization, and application of protein structure networks based on non–covalent interaction energies. The investigations are presented in Chapters 5 and 6. Chapter 3 discusses the development of Protein–DNA Graphs (PDGs) where the protein–DNA interfaces are represented as networks. PDG is a bipartite network in which amino acids form a set of nodes and the nucleotides form the other set. The extent of interaction between the two diverse types of biopolymers is normalized to define the strength of interaction. Edges are then constructed based on the interaction strength between amino acids and nucleotides. Such a representation, reported here for the first time, provides a holistic view of the interacting surface.
The developed PDGs are further analyzed in terms of clusters of interacting residues and identification of highly connected residues, known as hubs, along the protein–DNA interface and discussed in terms of their interacting motifs. Important clusters have been identified in a set of protein–DNA complexes, where the amino acids interact with different chemical components of DNA such as phosphate, deoxyribose and base with varying degrees of connectivity. An analysis of such fragment based PDGs provided insights into the nature of protein–DNA interaction, which could not have been obtained by conventional pair–wise analysis. The predominance of deoxyribose–amino acid clusters in beta–sheet proteins, distinction of the interface clusters in helix–turn–helix and the zipper type proteins are some of the new findings from the analysis of PDGs. Additionally, a potential classification scheme has been proposed for protein–DNA complexes on the basis of their interface clusters. This classification scheme gives a general idea of how the proteins interact with different components of DNA in various complexes. The present graph–based method has provided a deeper insight into the analysis of the protein–DNA recognition mechanisms from both protein and DNA view points, thus throwing more light on the nature and specificity of these interactions (Sathyapriya, Vijayabaskar et al. 2008). Chapter 4 delineates the application of PSN to an important problem in molecular biology. An analysis of interface clusters from multimeric proteins provides a clue to the important residues contributing to the stability of the oligomers. One such prediction was made on the DNA binding protein under starvation from Mycobacterium smegmatis (Ms–Dps) using PSNs. Two types of trimers, Trimer A (tA) and Trimer B (tB) can be derived from the dodecamer because of the inherent three fold symmetry of the spherical crystal structure. 
The irreversible dodecamerization of these native Ms–Dps trimers, in vitro, is known to be directly associated with the bimodal function (DNA binding and iron storage) of this protein. Interface clusters, which were identified from the PSNs of the derived trimers, allowed us to convincingly predict the residues E146 and F47 for mutation studies. The prediction was followed up by our experimental collaborators (Rakhi PC and Dipankar Chatterji), which led to the elucidation of the molecular mechanism behind the in vitro oligomerization of Ms–Dps. The F47E mutant was impaired in dodecamerization, and the double mutant (E146AF47E) was a native monomer in solution. These two observations suggested that the two trimers are important for dodecamerization and that the residues selected are important for the structural stability of the protein in vitro. From the structural and functional characterizations of the mutants, we have proposed an oligomerization pathway of Ms–Dps (Chowdhury, Vijayabaskar et al. 2008). The second part of the thesis involves the development, characterization (Chapter 5) and application (Chapter 6) of Protein Energy Networks (PENs). As mentioned above, the PSNs constructed on the geometric basis efficiently capture the topology and associated properties at the level of atom–atom contact. The chemistry, however, is not completely captured by these network representations, and a wealth of information can be extracted by incorporating the details of chemical interactions. This study is an advancement over the existing PSNs, in terms of edges being defined on the basis of interaction energies among the amino acids. This interaction energy is the resultant of various types of interactions within a protein. Use of such realistic interaction energies in a weighted network captures all the essential features responsible for maintaining the protein structure.
The methodology involved in representing proteins as interaction energy weighted networks, with realistic edge weights obtained from standard force fields, is described in Chapter 5. The interaction energies were derived from equilibrium ensembles (obtained using molecular dynamics simulations) to account for the structural plasticity, which is essential for function elucidation. The suitability of this method to study single static structures was validated by obtaining interaction energies on minimized crystal structures of proteins. The PENs were then characterized using network parameters like edge weight distributions, clusters, hubs, and shortest paths. The PENs exhibited three distinct behaviors in terms of the size of the largest connected cluster as a function of interaction energy; namely, the pre–transition, transition, and post–transition regions, irrespective of the topology of the proteins. The pre–transition region (energies < –20 kJ/mol) comprises smaller clusters with mainly charged and polar residues as hubs. Crucial topological changes take place in the transition region (–10 to –20 kJ/mol), where the smaller clusters aggregate, through low energy van der Waals interactions, to form a single large cluster in the post–transition region (energies > –10 kJ/mol). These behaviors reinforce the concept that hydrophobic interactions hold together local clusters of highly interacting residues, keeping the protein topology intact (Vijayabaskar and Vishveshwara 2010). The applications of PENs in studying protein organization, allosteric communication, thermophilic stability and the structural relation of remote homologues of TIM barrel families have been outlined in Chapter 6.
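
The dependence of the largest connected cluster on the energy cutoff, described above, can be illustrated with a small union-find sketch. The edge list is an invented toy network with energies in kJ/mol (more negative means more favourable); only edges at least as favourable as the cutoff are kept. The function name and data are hypothetical and do not reproduce the thesis results.

```python
def largest_cluster_size(n_nodes, weighted_edges, cutoff):
    """Size of the largest connected cluster when only edges with
    interaction energy <= cutoff (kJ/mol) are retained."""
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(x):
        # Find the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, e in weighted_edges:
        if e <= cutoff:
            ru, rv = find(u), find(v)
            if ru != rv:
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru          # merge the smaller cluster into the larger
                size[ru] += size[rv]
    return max(size[find(i)] for i in range(n_nodes))


# Invented toy network: three strong pairs joined by two weaker interactions.
toy_edges = [(0, 1, -35.0), (2, 3, -28.0), (4, 5, -22.0),
             (1, 2, -15.0), (3, 4, -8.0)]

for cutoff in (-20.0, -10.0, -5.0):
    print(cutoff, largest_cluster_size(6, toy_edges, cutoff))
```

In this toy case the strict cutoff leaves only small, strongly bound pairs, and relaxing it lets the weaker edges merge them into a single cluster, which mirrors qualitatively the pre–transition to post–transition behaviour reported for PENs.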
In the first case, the weighted networks were used to identify stabilization regions in protein structures and hierarchical organization in the folded proteins, which may provide some insights into the general mechanism of protein folding and stabilization (Vijayabaskar and Vishveshwara 2010). In the second case the features of communication paths in proteins were elucidated from PENs, and specific paths have been extensively discussed in the case of PDZ domain, which is known to bring together protein partners, mediating various cellular processes. Changes in PEN upon ligand binding, resulting in alterations of the shortest paths (energetically most favorable paths) for a small fraction of residues, indicated that allosteric communication is anisotropic in PDZ. The observations also establish that the shortest paths between functionally important sites traverse through key residues in PDZ2 domain. Furthermore, shortest paths in PENs provide us the exact pathways of communication between residues. Although the communication in PDZ has been extensively investigated, detailed information of pathways at the energy level has emerged for the first time from the present study from PEN analysis (Vijayabaskar and Vishveshwara 2010). In the third case, a set of thermophilic and mesophilic proteins were compared to determine the factors responsible for their thermal stability from a network perspective using PENs. The sub– graph parameters such as cluster population, hubs and cliques were the prominent contributing factors for thermal stability. Also, the thermophilic proteins have a better–packed hydrophobic core. The property of thermophilic protein to increase stability by increasing the connectivity but retain conformational flexibility is discussed from a cliques and communities (higher order inter–connection of residues) perspective (Vijayabaskar and Vishveshwara 2010). 
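
The notion of a shortest (energetically most favourable) path between two residues can be sketched with Dijkstra's algorithm. How the thesis converts interaction energies into path costs is not stated in the abstract, so the choice below (cost = 1/|E| for favourable edges, so that stronger interactions are cheaper to traverse) is an assumption, as are the residue labels and energies.

```python
import heapq


def favourable_path(edges, source, target):
    """Dijkstra over an energy-weighted network.
    Assumed cost model: cost = 1 / |energy| for edges with negative energy."""
    graph = {}
    for u, v, energy in edges:
        if energy >= 0:
            continue  # ignore unfavourable or zero interactions
        cost = 1.0 / abs(energy)
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None  # no path through favourable interactions
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Invented residues and interaction energies (kJ/mol).
toy_pen = [("A", "B", -40.0), ("B", "C", -40.0), ("A", "C", -10.0), ("C", "D", -20.0)]
print(favourable_path(toy_pen, "A", "D"))
```

With this cost model the route through two strong interactions (A, B, C) is preferred over the single weak A–C contact. Comparing such paths between ligand-free and ligand-bound networks is one way to look for the changes in communication that the abstract describes.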
Finally, the remote homologues from the TIM barrel fold have been analyzed using PENs to identify the interactions responsible for the maintenance of the fold despite low sequence similarity. A study of conserved interactions in family specific PENs reveals that the formation of the central beta barrel is vital for the TIM barrel formation. The beta barrel is formed by either conserved long range electrostatic interactions or by tandem arrangement of low energy hydrophobic interactions. The contributions of helix–sheet and helix–helix interactions are not conserved in the families. This study suggests that the sequentially near residues forming the helix–sheet interactions are common in many folds and hence formed despite non–conservation, whereas formation of the beta barrel requires long range interactions, which are thus more conserved within the families. The thesis also includes an appendix in which a web–tool, developed to express proteins as networks and analyze these networks using different network parameters, is discussed. The web–based program, GraProStr, allows us to represent proteins as structure graphs/networks by considering the amino acid residues as nodes and representing non–covalent interactions among them as edges. The different networks (classified based on edge definition) which can be obtained using GraProStr are Protein Side–chain Networks (PScNs), Cα/Cβ distance based networks (PcNs) and Protein–Ligand Networks (PLNs). The parameters which can be generated include clusters, hubs, cliques (rigid regions in proteins) and communities (groups of cliques). It is also possible to differentiate the above mentioned parameters for monomers and interfaces in multimeric proteins. The well tested tool is now made available to the scientific community for the first time. GraProStr is available online and can be accessed from http://vishgraph.mbu.iisc.ernet.in/GraProStr/index.html. 
With a variety of structure networks, and a set of easily interpretable network parameters, GraProStr can be useful in analyzing protein structures from a global paradigm (Vijayabaskar, Vidya et al. 2010). In summary, we have extensively studied DNA binding proteins using side–chain based protein structure networks and by integrating the DNA molecule into the network. Also, we have upgraded the existing methodology of generating structure networks, by representing both the geometry and the chemistry of residues as interaction energies among them. Using this energy based network we have studied diverse problems like protein structure formation, stabilization, and allosteric communication in detail. The above mentioned methodologies are a considerable advancement over existing structure network representations and have been shown in this thesis to shed more light on the structural features of proteins.
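
The cluster behavior described in this abstract (largest connected cluster size as a function of the interaction energy cutoff) can be illustrated with a minimal sketch. This is not the thesis code: the function name, the union-find approach, and the toy edge list are assumptions made for illustration, and the energies are invented values in kJ/mol.

```python
from collections import Counter

def largest_cluster_size(n_residues, edges, cutoff):
    """Size of the largest connected cluster when only edges with
    interaction energy <= cutoff (kJ/mol) are kept."""
    parent = list(range(n_residues))

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, energy in edges:
        if energy <= cutoff:
            parent[find(i)] = find(j)  # merge the two clusters

    sizes = Counter(find(i) for i in range(n_residues))
    return max(sizes.values())

# Hypothetical toy network: (residue i, residue j, interaction energy).
toy_edges = [(0, 1, -25.0), (2, 3, -22.0), (1, 2, -15.0),
             (3, 4, -8.0), (4, 5, -5.0)]

for cutoff in (-20.0, -10.0, -5.0):
    print(cutoff, largest_cluster_size(6, toy_edges, cutoff))
```

In this toy case the strong edges alone give only small clusters, and relaxing the cutoff through the intermediate range merges them into one cluster, which mirrors the pre-transition, transition and post-transition picture in a very simplified way.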

27

Sykes, JE. "Protein structure and evolution." Thesis, 2021. https://eprints.utas.edu.au/37906/1/Sykes_whole_thesis.pdf.

Abstract:

Proteins are a major interface between the genotype and phenotype of living things. Understanding the structure of these molecules, and how their structure interacts with their sequence and function, is vital to our knowledge of all life. Specifically, knowledge of protein structure evolution has a multitude of applications in health and biology, as it could allow us to accurately predict the effects of any protein mutation and use this knowledge for drug discovery and to combat disease. Protein structural relationships and changes can be effectively mapped using a network, with the proteins represented as nodes and connections between them indicating degree of similarity or possible evolutionary relationships. Many different approaches to determining where these connections should lie have been presented, with all producing complex pictures of the protein universe. However, constructing a model of protein structural change that captures enough physical and chemical information to make accurate predictions remains a major challenge. With this ultimate goal in mind, I present three original studies that contribute to our understanding of protein structure. The first is a benchmarking study of protein structure alignment methods, with efficacy assessed through their ability to determine levels of structural similarity between protein domains and to cluster domains into those of equivalent structure. Sorting proteins by structure is relevant to many biochemical problems, including constructing networks based on structural similarity. The second study assesses the completeness of our current understanding of protein structure space by focusing on triplets of secondary structure elements. Determining whether or not the current Protein Data Bank (PDB) contains all structures of proteins will give an idea of the level of novelty we can expect in a complete network of protein evolution. 
The third study looks at the relationships between contact density, protein age and protein sequence diversity. These are all potential contributors to determining possible evolutionary links between proteins. Also, relationships between these variables could result in patterns of similarity in networks that could otherwise be explained through convergent or divergent evolution. It is our hope that the results of this work will inform future network-based representations of the protein universe.

28

Tsilo, Lipontseng Cecilia. "Protein secondary structure prediction using neural networks and support vector machines /." 2008. http://eprints.ru.ac.za/1675/.

Abstract:

Thesis (M.Sc. (Statistics)) - Rhodes University, 2009.
A thesis submitted to Rhodes University in partial fulfillment of the requirements for the degree of Master of Science in Mathematical Statistics.

29

Ahmed, Hazem Radwan A. "Pattern Discovery in Protein Structures and Interaction Networks." Thesis, 2014. http://hdl.handle.net/1974/12051.

Abstract:

Pattern discovery in protein structures is a fundamental task in computational biology, with important applications in protein structure prediction, profiling and alignment. We propose a novel approach for pattern discovery in protein structures using Particle Swarm-based flying windows over potentially promising regions of the search space. A heuristic search based on Particle Swarm Optimization (PSO) is, however, easily trapped in local optima due to the sparse nature of the problem search space. Thus, we introduce a novel fitness-based stagnation detection technique that effectively and efficiently restarts the search process to escape potential local optima. The proposed fitness-based method significantly outperforms the commonly-used distance-based method when tested on eight classical and advanced (shifted/rotated) benchmark functions, as well as on two other applications for proteomic pattern matching and discovery. The main idea is to make use of the already-calculated fitness values of swarm particles, instead of their pairwise distance values, to predict an imminent stagnation situation. That is, the proposed fitness-based method does not incur the computational overhead of repeatedly calculating pairwise distances between all particles at each iteration. Moreover, the fitness-based method is less dependent on the problem search space, compared with the distance-based method. The proposed pattern discovery algorithms are first applied to protein contact maps, which are the 2D compact representation of protein structures. Then, they are extended to work on actual protein 3D structures and interaction networks, offering a novel and low-cost approach to protein structure classification and interaction prediction. Concerning protein structure classification, the proposed PSO-based approach correctly distinguishes between the positive and negative examples in two protein datasets over 50 trials. 
As for protein interaction prediction, the proposed approach works effectively on complex, mostly sparse protein interaction networks, and predicts high-confidence protein-protein interactions — validated by more than one computational and experimental source — through knowledge transfer between topologically-similar interaction patterns of close proximity. Such encouraging results demonstrate that pattern discovery in protein structures and interaction networks are promising new applications of the fast-growing and far-reaching PSO algorithms, which is the main argument of this thesis.
Thesis (Ph.D, Computing) -- Queen's University, 2014-04-21 12:54:03.37
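
The contrast drawn in this abstract between fitness-based and distance-based stagnation detection can be sketched as follows. The abstract does not give the actual criterion, so both tests below are assumptions: the fitness-based check looks at the spread of already-computed fitness values, while the distance-based check compares all particle pairs. The function names and tolerances are hypothetical.

```python
import math
from itertools import combinations

def fitness_stagnated(fitness_values, tolerance=1e-6):
    """Assumed criterion: the swarm is stagnating when all particles
    have (almost) the same fitness. Costs O(n) per iteration and
    reuses values the optimizer has already computed."""
    return max(fitness_values) - min(fitness_values) < tolerance

def distance_stagnated(positions, tolerance=1e-3):
    """Assumed baseline: the swarm is stagnating when every pair of
    particles is closer than the tolerance. Costs O(n^2) distance
    evaluations per iteration."""
    return all(math.dist(a, b) < tolerance
               for a, b in combinations(positions, 2))

# Hypothetical swarm snapshot: particle positions and their fitness.
positions = [(0.0, 0.0), (0.0, 0.0001), (0.0002, 0.0)]
fitness = [1.0, 1.0000001, 1.0]
print(fitness_stagnated(fitness), distance_stagnated(positions))
```

When either check fires, a restart step would re-initialize some or all particles; that step is left out here because the abstract does not describe it.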

30

Correia, Fernanda Maria dos Reis Brito e. Rodrigues. "Prediction and analysis of biological networks structure and dynamics." Doctoral thesis, 2019. http://hdl.handle.net/10773/29200.

Abstract:

Increasing knowledge about the biological processes that govern the dynamics of living organisms has fostered a better understanding of the origin of many diseases as well as the identification of potential therapeutic targets. Biological systems can be modeled through biological networks, which makes it possible to apply and explore methods of graph theory in their investigation and characterization. The main motivation of this work was the inference of patterns and rules that underlie the organization of biological networks. Through the integration of different types of data, such as gene expression, interaction between proteins and other biomedical concepts, computational methods have been developed so that they can be used to predict and study diseases. The first contribution was the characterization of a subsystem of the human protein interactome through the topological properties of the networks that model it. As a second contribution, an unsupervised method using biological criteria and network topology was used to improve the understanding of the genetic mechanisms and risk factors of a disease through co-expression networks. As a third contribution, a methodology was developed to remove noise (denoise) in protein networks, to obtain more accurate models, using the network topology. As a fourth contribution, a supervised methodology was proposed to model the protein interactome dynamics, using exclusively the topology of the protein interaction networks that are part of the dynamic model of the system. The proposed methodologies contribute to the creation of more precise static and dynamic biological models through the identification and use of topological patterns of protein interaction networks, which can be used to predict and study diseases.
Doctoral Program in Informatics Engineering

31

Lee, Yun, and 李昀. "Prediction of Protein Secondary Structure with Dependency Graphs and Their Expanded Bayesian Networks." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/15829735362099350751.

Abstract:

Master's thesis
National Tsing Hua University
Department of Electrical Engineering
93
The completion of the Human Genome Project has triggered a wave of investigations into various biological problems directly through the string of nucleotides and its derived amino acid sequence. The urgent need to predict protein three-dimensional structure from the amino acid sequence alone propels us to develop a model-based method to predict the composition of the fundamental structural elements, that is, the secondary structures, of any protein chain. To accomplish this goal, we first represent all the eligible secondary structure sequences as specific paths in a secondary structure trellis. Then we employ the method of dependency graphs and their expanded Bayesian networks to quantify the relationship between primary and secondary structures. Following a procedure similar to that used in coding theory, we finally assign a secondary structure element to each amino acid through the use of two decoding algorithms: the Viterbi algorithm and the sum-product algorithm. The simulation results reveal that our proposed method achieves an accuracy that is indistinguishable from other existing sequence-only methods, and that a better outcome is reached when the target sequences are confined to a specific protein fold.
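
The trellis decoding step mentioned in this abstract can be illustrated with a minimal Viterbi sketch. It assumes a plain first-order hidden Markov model as a stand-in, not the dependency graphs or expanded Bayesian networks of the thesis. The two states (H for helix, C for coil), the two residue classes ("a", "g") and all probabilities are invented for illustration.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most probable state path through the trellis (log-space)."""
    # best[t][s]: log-probability of the best path ending in state s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical toy model: "a" favors helix, "g" favors coil.
states = ["H", "C"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.9 if p == s else 0.1) for s in states}
             for p in states}
log_emit = {"H": {"a": math.log(0.8), "g": math.log(0.2)},
            "C": {"a": math.log(0.2), "g": math.log(0.8)}}

print("".join(viterbi("aaaggg", states, log_start, log_trans, log_emit)))
```

The sum-product algorithm named in the abstract would instead compute per-position marginal probabilities over the same trellis; it is not shown here.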

32

Royer, Loic. "Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis." Doctoral thesis, 2010. https://tud.qucosa.de/id/qucosa%3A24399.

Abstract:

Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph theoretic approach to lossless and compact representation of complex networks. The approach groups edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), which is sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high quality interactomes. 
Third, we apply power graph analysis to networks derived from text-mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.

33

Hsu, Wei-Lun. "Mechanisms of binding diversity in protein disorder : molecular recognition features mediating protein interaction networks." Thesis, 2014. http://hdl.handle.net/1805/4035.

Abstract:

Indiana University-Purdue University Indianapolis (IUPUI)
Intrinsically disordered proteins are proteins characterized by lack of stable tertiary structures under physiological conditions. Evidence shows that disordered proteins are not only highly involved in protein interactions, but also have the capability to associate with more than one partner. Short disordered protein fragments, called “molecular recognition features” (MoRFs), were hypothesized to facilitate the binding diversity of highly-connected proteins termed “hubs”. MoRFs often couple folding with binding while forming interaction complexes. Two protein disorder mechanisms were proposed to facilitate multiple partner binding and enable hub proteins to bind to multiple partners: 1. One region of disorder could bind to many different partners (one-to-many binding), so the hub protein itself uses disorder for multiple partner binding; and 2. Many different regions of disorder could bind to a single partner (many-to-one binding), so the hub protein is structured but binds to many disordered partners via interaction with disorder. Thousands of MoRF-partner protein complexes were collected from Protein Data Bank in this study, including 321 one-to-many binding examples and 514 many-to-one binding examples. The conformational flexibility of MoRFs was observed at atomic resolution to help the MoRFs to adapt themselves to various binding surfaces of partners or to enable different MoRFs with non-identical sequences to associate with one specific binding pocket. Strikingly, in one-to-many binding, post-translational modification, alternative splicing and partner topology were revealed to play key roles for partner selection of these fuzzy complexes. On the other hand, three distinct binding profiles were identified in the collected many-to-one dataset: similar, intersecting and independent. For the similar binding profile, the distinct MoRFs interact with almost identical binding sites on the same partner. 
The MoRFs can also interact with binding sites that are partially the same and partially different, giving the intersecting binding profile. Finally, the MoRFs can interact with completely different binding sites, thus giving the independent binding profile. In conclusion, we suggest that protein disorder, post-translational modifications and alternative splicing all work together to rewire the protein interaction networks.

34

Dantas, Joana Margarida Franco. "Characterization of extracellular electron transfer networks in Geobacter sulfurreducens, a key bacterium for bioremediation and bioenergy applications." Doctoral thesis, 2017. http://hdl.handle.net/10362/27867.

Abstract:

Geobacter bacteria have attracted significant attention because of their impact on natural environments and biotechnological applications that include the bioremediation of organic and inorganic contaminants, bioenergy production and bioelectronics. In addition to electron transfer towards extracellular terminal acceptors, Geobacter cells can also accept electrons from electrodes, in current-consuming biofilms, a process that is currently explored in microbial electrosynthesis. These practical applications rely on an efficient transfer of electrons between the cell and its exterior, a process designated extracellular electron transfer (EET). However, the precise mechanisms underlying EET processes are still under debate. Genetic and proteomics studies have identified several c-type cytochromes as key components for EET in G. sulfurreducens. These proteins are located at the inner membrane (IM), periplasm and outer membrane (OM). Examples of such cytochromes include the IM-associated cytochrome MacA, periplasmic cytochromes PpcA-E and PccH, as well as the OM cytochrome OmcF, which were studied in this thesis. Molecular interactions between PpcA-E and their putative redox partners, including a humic substance analogue molecule, MacA or PccH, were probed by NMR spectroscopy, stopped-flow kinetics and molecular docking. For the interacting pairs, their binding affinity was also determined by NMR chemical shift perturbation experiments. The results obtained showed that the interacting molecules establish reversible low binding-affinity complexes in specific regions of the proteins to warrant a rapid and selective electron transfer, a typical feature observed for electron transfer reactions between redox partners. In addition, NMR spectroscopy was also used to determine the solution structure of OmcF in the reduced state, its pH-dependent conformational changes and backbone dynamics. 
A biochemical and structural characterization of the cytochrome PccH was also carried out using circular dichroism, UV-visible and NMR spectroscopic techniques. The structure of PccH determined by X-ray crystallography showed that it is unique among the monoheme c-type cytochromes. The reduction potentials determined for PccH at different pH values by visible redox titrations are unusually low compared to those reported for other monoheme c-type cytochromes. Considering the structural and functional features of PccH, it was proposed that this protein represents the first characterized example of a new subclass of monoheme c-type cytochromes. Overall, the results obtained constitute an important contribution to the current understanding of the G. sulfurreducens extracellular electron transfer mechanisms.

35

(10137641), Ahmadreza Ghanbarpour Ghouchani. "Applications of Deep Neural Networks in Computer-Aided Drug Design." Thesis, 2021.

Abstract:

Deep neural networks (DNNs) have gained tremendous attention over the recent years due to their outstanding performance in solving many problems in different fields of science and technology. Currently, this field is of interest to many researchers and growing rapidly. The ability of DNNs to learn new concepts with minimal instructions facilitates applying current DNN-based methods to new problems. Here in this dissertation, three methods based on DNNs are discussed, tackling different problems in the field of computer-aided drug design.

The first method described addresses the problem of predicting hydration properties from 3D structures of proteins without requiring molecular dynamics simulations. Water plays a major role in protein-ligand interactions, and identifying (de)solvation contributions of water molecules can assist drug design. Two different model architectures are presented for the prediction of the hydration information of proteins. The performance of the methods is compared with other conventional methods and experimental data. In addition, their applications in ligand optimization and pose prediction are shown.

The design of de novo molecules has always been of interest in the field of drug discovery. The second method describes a generative model that learns to derive features from protein sequences to design de novo compounds. We show how the model can be used to generate molecules similar to the known ones for targets the model has not seen before, and compare it with benchmark generative models.

Finally, it is demonstrated how DNNs can learn to predict secondary structure propensity values derived from NMR ensembles. Secondary structure propensities are important in identifying flexible regions in proteins. Protein flexibility has a major role in drug-protein binding, and identifying such regions can assist in development of methods for ligand binding prediction. The prediction performance of the method is shown for several proteins with two or more known secondary structure conformations.

36

Sathyapriya, R. "Exploring Protein-Nucleic Acid Interactions Using Graph And Network Approaches." Thesis, 2007. http://hdl.handle.net/2005/624.

Abstract:

The flow of genetic information from genes to proteins is mediated through proteins which interact with the nucleic acids at several stages to successfully transmit the information from the nucleus to the cell cytoplasm. Unlike in the case of protein-protein interactions, the principles behind protein-nucleic acid interactions are still not very well understood (Pabo and Nekludova, 2000) and efforts are still underway to arrive at the basic principles behind the specific recognition of nucleic acids by proteins (Prabakaran et al., 2006). This is mainly due to the innate complexity involved in recognition of nucleotides by proteins, where, even within a given family of DNA binding proteins, different modes of binding and recognition strategies are employed to suit their function (Luscombe et al., 2000). Such difficulties have also prevented a thorough classification of DNA/RNA binding proteins based on the mode of interaction as well as the specificity of recognition of the nucleotides. The availability of a large number of structures of protein-nucleic acid complexes (albeit fewer than the number of protein structures present in the PDB) in the past few decades has provided the knowledge-base for understanding the details behind their molecular mechanisms (Berman et al., 1992). Previously, studies have been carried out to characterize these interactions by analyzing specific non-covalent interactions such as hydrogen bonds, van der Waals, and hydrophobic interactions between a given amino acid and the nucleic acid (DNA, RNA) in a pair-wise manner, or through the analysis of interface areas of the protein-nucleic acid complexes (Nadassy et al., 1998; Jones et al., 1999). Though the studies have deciphered the common pairing preferences of a particular amino acid with a given nucleotide of DNA or RNA, there is little room for understanding these specificities in the context of spatial interactions at a global level from the protein-nucleic acid complexes. 
Representing the amino acids and the nucleotides as components of graphs, and exploring the nature of the interactions at a level higher than the individual pair-wise interactions, could provide greater detail about the nature of these interactions and their specificity. This thesis reports the study of protein-nucleic acid interactions using graph and network based approaches. The evaluation of the parameters for characterizing protein-nucleic acid graphs has been carried out for the first time, and these parameters have been successfully employed to capture biologically important non-covalent interactions as clusters of interacting amino acids and nucleotides from different protein-DNA and protein-RNA complexes. Graph and network based approaches are well established in the field of protein structure analysis for analyzing protein structure, stability and function (Kannan and Vishveshwara, 1999; Brinda and Vishveshwara, 2005). However, the use of graph and network principles for analyzing structures of protein-nucleic acid complexes has not so far been accomplished and is reported for the first time in this thesis. The matter embodied in the thesis is presented as ten chapters. Chapter 1 lays the foundation for the study, surveying relevant literature from the field. Chapter 2 describes in detail the methods used in constructing graphs and networks from protein-nucleic acid complexes. Initially, only protein structure graphs and networks were constructed from proteins known to interact with specific DNA or RNA, and inferences with regard to nucleic acid binding and recognition were obtained indirectly. Subsequently, parameters were evaluated for representing both the interacting amino acids and the nucleotides as components of graphs, and a direct evaluation of protein-DNA and protein-RNA interactions as graphs has been carried out. 
Chapters 3 and 4 discuss the graph and network approaches applied to proteins from a dataset of DNA binding proteins complexed with DNA. In Chapter 3, the protein structure graphs were constructed on the basis of the non-covalent interactions existing between the side chains of amino acids. Clusters of interacting side chains from the graphs were obtained using the graph spectral method. The clusters from the protein-DNA interface were analyzed in detail for the interaction geometry and biological importance (Sathyapriya and Vishveshwara, 2004). Chapter 4 also uses the same dataset of DNA binding proteins, but a network-based approach is presented. From the analysis of the protein structure networks from these DNA binding proteins, interesting observations emerged relating the presence of highly connected nodes (or hubs) of the network to functionally important amino acids in the structure. A comparison between the hubs identified from the protein-protein and the protein-DNA interfaces in terms of their amino acid composition and their connectivity is also presented (Sathyapriya and Vishveshwara, 2006). Chapters 5 and 6 deal with the graph and network applications to a specific system of protein-RNA complexes (aminoacyl-tRNA synthetases) to gain insights into their interface biology based on amino acid connectivity. Chapter 5 deals with a dataset of aminoacyl-tRNA synthetase (aaRS) complexes obtained with various ligands like ATP, tRNA and L-amino acids. A graph based identification of side chain clusters from these ligand-bound aaRS structures has highlighted important features of ligand-binding at the catalytic sites of the two structurally different classes of aaRS (Class I and Class II). Side chain clusters from other regions of aaRS such as the anticodon binding region and the ligand-activation sites are discussed. 
In Chapter 6, a network approach is applied to a specific aaRS system, E. coli glutaminyl-tRNA synthetase (GlnRS) complexed with its ligands, to understand the effects of binding different ligands. The structure networks of E. coli GlnRS in the ligand-free and different ligand-bound states are constructed. The ligand-free and the ligand-bound complexes are compared by analyzing their network properties and the presence of hubs to understand the effect of ligand-binding. These properties have elegantly captured the effects of ligand-binding on the GlnRS structure and have also provided an alternate method for comparing three dimensional structures of proteins in different ligand-bound states (Sathyapriya and Vishveshwara, 2007). In contrast to protein structure graphs (PSG), both the interacting amino acids and nucleotides (DNA/RNA) form the components of the protein-nucleic acid graphs (PNG) from protein-nucleic acid complexes. These graphs are constructed based on the non-covalent interactions existing between the side chains of the amino acids and nucleotides. After representing the interacting nucleotides and amino acids as graphs, clusters of the interacting components are identified. These clusters are the strongly interacting amino acids and nucleotides from the protein-nucleic acid complexes. These clusters can be generated at different strengths of interaction between the amino acid side chain and the nucleotide (measured in terms of its atomic connectivity) and can be used for detecting clusters of non-specific as well as specific interactions of amino acids and nucleotides. Though the methodology of graph construction and cluster identification is given in Chapter 2, the details of the parameters evaluated for constructing PNG are given in Chapter 7. Unlike the previous chapters, the succeeding chapters deal exclusively with results that are obtained from the analyses of PNG. 
Two examples of obtaining clusters from a PNG are given, one each for a protein-DNA and a protein-RNA complex. In the first example, a nucleosome core particle is subjected to the graph-based analysis, and different clusters of amino acids with different regions of the DNA chain, such as the phosphate, the deoxyribose sugar and the base, are identified. In the second example, an aminoacyl-tRNA synthetase complexed with its cognate tRNA is used to illustrate the method with a protein-RNA complex. Further, the method of constructing and analyzing protein-nucleic acid graphs has been applied to the macromolecular machinery of the pre-translocation complex of the T. thermophilus 70S ribosome. Chapter 8 deals exclusively with the results obtained from the analysis of this magnificent macromolecular ensemble. The availability of a method that can handle interactions between both the amino acids and the nucleotides of protein-nucleic acid complexes has given us the basis for evaluating these interactions at a level higher than that of pair-wise interactions. A study on the evaluation of short hydrogen bonds (SHB) in proteins, which does not fall under the main objective of the thesis, is discussed in Chapter 9. The short hydrogen bonds, defined by geometrical distance and angle parameters, are identified from a non-redundant dataset of proteins. Insights into their occurrence, amino acid composition and secondary structural preferences are discussed. The SHB are present in distinct regions of protein three-dimensional structures, such that they mediate specific geometrical constraints that are necessary for the stability of the structure (Sathyapriya and Vishveshwara, 2005). The significant conclusions of the various studies carried out are summarized in the last chapter (Chapter 10). In conclusion, this thesis reports the analyses performed with protein-nucleic acid complexes using graph and network based methods. 
The parameters necessary for representing both the amino acids and the nucleotides as components of a graph are evaluated for the first time and can be used subsequently for other analyses. More importantly, the use of graph-based methods has made it possible to consider the interactions between the amino acids and the nucleotides at a global level, with respect to the topology of the protein-nucleic acid complexes. Such studies performed on a wide variety of protein-nucleic acid complexes could provide more insights into the details of protein-nucleic acid recognition mechanisms. The results of these studies can be used for the rational design of experimental mutations that ascertain the structure-function relationships in proteins and protein-nucleic acid complexes.
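
As a rough illustration of the network view described in this abstract, the sketch below builds a small residue graph from a list of pairwise side chain interaction strengths and reports the highly connected nodes (hubs). It is only a minimal sketch: the residue names, the strength values, the strength cutoff and the degree threshold for calling a node a hub are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def build_psn(interactions, min_strength):
    """Return an adjacency map keeping residue pairs at or above min_strength."""
    graph = defaultdict(set)
    for res_a, res_b, strength in interactions:
        if res_a != res_b and strength >= min_strength:
            graph[res_a].add(res_b)
            graph[res_b].add(res_a)
    return dict(graph)


def find_hubs(graph, min_degree=4):
    """Return (residue, degree) pairs with degree >= min_degree, highest first."""
    hubs = [(node, len(nbrs)) for node, nbrs in graph.items() if len(nbrs) >= min_degree]
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: (residue, residue, interaction strength).
toy_interactions = [
    ("ARG10", "ASP45", 6.2),
    ("ARG10", "GLU48", 5.1),
    ("ARG10", "TYR72", 4.4),
    ("ARG10", "PHE90", 3.9),
    ("ASP45", "GLU48", 1.0),   # below the cutoff used here, so dropped
    ("TYR72", "PHE90", 4.8),
    ("LEU5", "VAL7", 2.0),     # below the cutoff used here, so dropped
]

psn = build_psn(toy_interactions, min_strength=3.0)  # cutoff is an assumption
hubs = find_hubs(psn, min_degree=4)                  # hub threshold is an assumption
print(hubs)
```

Raising or lowering `min_strength` changes which edges survive, which mirrors the idea in the abstract of generating clusters at different strengths of interaction.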

37

Pavithra, S. "Functional Role Of Heat Shock Protein 90 From Plasmodium Falciparum." Thesis, 2006. http://hdl.handle.net/2005/433.

Abstract:

Molecular chaperones have emerged in recent years as major players in many aspects of cell biology. Molecular chaperones are also known as heat shock proteins (HSPs) since many were originally discovered due to their increased synthesis in response to heat shock. They were initially identified when Drosophila salivary gland cells were exposed to a heat shock at 37°C for 30 min and then returned to their normal temperature of 25°C for recovery. A “puffing” of genes was found to have occurred in the chromosome of recovering cells, which was later shown to be accompanied by an increase in the synthesis of proteins with molecular masses of 70 and 26 kDa. These proteins were hence named “heat shock proteins”. The first identification of a function for HSPs was the discovery in Escherichia coli that five proteins synthesized in response to heat shock were involved in λ phage growth. The products of the groEL and groES genes were found to be essential for phage head assembly while the dnaK, dnaJ and grpE gene products were essential for λ phage replication. It was later shown that GroEL and GroES are part of a chaperonin system for protein folding in the prokaryotic cytosol while DnaK is a member of the Hsp70 family that works in conjunction with the DnaJ (Hsp40) co-chaperone and the nucleotide exchange factor GrpE to promote phage replication by dissociating the DnaB helicase from the phage-encoded P protein. Since then, a large number of other proteins collectively referred to as HSPs have been discovered. However, heat shock is not the only signal that induces synthesis of heat shock proteins. Stress of any kind, such as nutrient deprivation, chemical treatment and oxidative stress among others causes increased production of HSPs and therefore, they are also known as stress proteins. 
The term “molecular chaperone” was originally used to describe the function of nucleoplasmin, a Xenopus oocyte protein that promotes nucleosome assembly by binding tightly to histones and donating the bound histone to chromatin. However, since then, chaperones have been defined as “a family of unrelated classes of proteins that mediate the correct assembly of other proteins, but are not themselves components of the final functional structure”. This view of molecular chaperones, though undoubtedly correct, doesn’t capture the multifaceted roles they have since been discovered to play in cellular processes. In recent years, molecular chaperones have been shown to perform other functions in addition to the maintenance of protein homeostasis: translocation of proteins across organelle membranes, quality control in the endoplasmic reticulum, turnover of misfolded proteins as well as signal transduction. As a result, many chaperones are also essential under non-stress conditions and play crucial roles in cell growth and development, cell-cell communication and regulation of gene expression. Heat shock protein 90 (Hsp90) is one of the most abundant and highly conserved molecular chaperones in organisms ranging from bacteria to all branches of eukarya. It has been shown to be essential for cell viability in Saccharomyces cerevisiae, Schizosaccharomyces pombe and Drosophila melanogaster. Although the bacterial homolog HtpG is dispensable under normal conditions, it is important for cell survival during heat shock. In addition to its role as general chaperone in protein folding following stress, Hsp90 has a more specialized role as a chaperone for several protein kinases and transcription factors. Many Hsp90 client proteins are signaling proteins involved in regulation of cell growth and survival. These proteins are critically dependent on Hsp90 for their maturation and conformational maintenance resulting in a key role for Hsp90 in these processes. 
Recent reports have also highlighted a role for Hsp90 in linking the expression of genetic and epigenetic variation in response to environmental stress with morphological development in Drosophila melanogaster and Arabidopsis thaliana. In Candida albicans, Hsp90 augments the development of drug resistance, implicating a role for Hsp90 in the evolution of infectious diseases. The malarial parasite, Plasmodium falciparum, is the causative agent of the most lethal form of human malaria. The parasite life cycle involves two hosts: an invertebrate mosquito vector and a vertebrate human host. As the parasite moves from the mosquito to the human body, it experiences an increase in temperature resulting in a severe heat shock. The mechanisms by which the parasite adapts to changes in temperature have not been deciphered. Our laboratory has been interested in investigating the role of heat shock proteins during acclimatization of the parasite to such temperature fluctuations. Heat shock proteins of the Hsp40, Hsp60, Hsp70 and Hsp90 families have been characterized in the parasite and are being examined in our laboratory. This thesis pertains to understanding the functional role of Plasmodium falciparum Hsp90 (PfHsp90) during adaptation of the parasite to fluctuations in environmental temperature. The parasite expresses a single gene for cytosolic Hsp90 on chromosome 7 (PlasmoDB accession no.: PF07_0029) coding for a protein of 745 amino acids with a pI of 4.94 and Mw of 86 kDa. Eukaryotic Hsp90 regulates several protein kinases and transcription factors involved in cell growth and differentiation pathways resulting in a crucial role for Hsp90 in developmental processes. A role for PfHsp90 in parasite development, therefore, seems likely. Indeed, PfHsp90 has previously been implicated in parasite development from the ring stage to the trophozoite stage during the intra-erythrocytic cycle. 
Pharmacological inhibition of PfHsp90 function using geldanamycin (GA), a specific inhibitor of Hsp90 activity, abrogates stage progression. These experiments suggest that PfHsp90 may play a critical role in parasite development. This is further substantiated by the fact that several pathogenic protozoan parasites such as Leishmania donovani, Trypanosoma cruzi, Toxoplasma gondii and Eimeria tenella depend on Hsp90 function during different stages of their life cycles. It appears, therefore, that a principal role of Hsp90 in protozoan parasites may be the regulation of their developmental cycles. However, the precise functions of PfHsp90 during the intra-erythrocytic cycle of the malarial parasite are not clear. In this study we have carried out a functional analysis of PfHsp90 in the malarial parasite. We have examined the role of PfHsp90 in parasite development during repeated exposure to febrile temperatures. We have investigated its involvement in parasite development during a commonly used synchronization protocol involving cyclical changes in temperature. We have examined the interaction of GA with the Hsp90 multi-chaperone complex from P. falciparum as well as the human host. Finally, we have carried out a systems level analysis of chaperone networks in the malarial parasite as well as its human host using an in silico approach. We have analyzed the protein-protein interactions of PfHsp90 in the chaperone network and predicted putative cellular processes likely to be regulated by parasite chaperones, particularly PfHsp90.

38

Filippi, Michal. "Predikce sekundární struktury proteinu pomocí hlubokých neuronových sítí" [Protein secondary structure prediction using deep neural networks]. Master's thesis, 2017. http://www.nusl.cz/ntk/nusl-365184.

Abstract:

Determination of protein structure in space is a crucial part of protein function analysis. However, structure determination is an expensive and time-consuming process, so structure prediction models have grown in popularity. The most notable subproblem of protein structure prediction is the prediction of the local conformation of adjacent amino acids, i.e. the secondary structure. This thesis studies the use of deep neural networks for protein secondary structure prediction. We implemented a prediction model and evaluated several modifications of it; in particular, LSTM and GRU memory cells were compared. Furthermore, two new preprocessing methods are evaluated: a fast PSSM calculation method is proposed, and a prediction of tertiary structure is used as input for the prediction model. The last part of this thesis examines the application of filtering methods to models predicting secondary structure with eight classes.
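
For readers unfamiliar with the PSSM input mentioned in this abstract, the sketch below shows one plain, textbook-style way of turning a small gapless alignment into a position-specific scoring matrix of log-odds values. It is not the fast method proposed in the thesis, whose details the abstract does not give; the uniform background frequency, the pseudocount and the toy alignment are assumptions made only for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_alignment(sequences, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column."""
    if not sequences or len({len(s) for s in sequences}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    background = 1.0 / len(AMINO_ACIDS)  # uniform background: an assumption
    pssm = []
    for column in zip(*sequences):
        # Start every residue at the pseudocount so no frequency is zero.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for residue in column:
            if residue in counts:  # gaps and unknown symbols are skipped
                counts[residue] += 1.0
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


# Hypothetical toy alignment of three sequences, four columns.
toy_alignment = ["ACDA", "ACEA", "ACDG"]
profile = pssm_from_alignment(toy_alignment)
print(round(profile[0]["A"], 3), round(profile[0]["C"], 3))
```

Each column of the resulting profile can then be fed to a sequence model as a 20-dimensional feature vector, which is the usual role of a PSSM in secondary structure prediction.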

39

Ghosh, Soma. "A Multiscale Modeling Study of Iron Homeostasis in Mycrobacterium Tuberculosis." Thesis, 2014. http://etd.iisc.ernet.in/2005/3519.

Abstract:

Mycobacterium tuberculosis (M.tb), the causative agent of tuberculosis (TB), has remained the largest killer among infectious diseases for over a century. The increasing emergence of drug resistant varieties such as the multidrug resistant (MDR) and extremely drug resistant (XDR) strains is only increasing the global burden of the disease. Available statistics indicate that nearly one-third of the world’s population is infected, where the bacterium remains in the latent state but can reactivate into an actively growing stage to cause disease when the individual is immunocompromised. It is thus immensely important to devise newer strategies for containing and combating the spread of this disease. Extraction of iron from the host cell is one of the many factors that enable the bacterium to survive in the harsh environment of the host macrophages and promote tuberculosis. Host–pathogen interactions can be interpreted as the battle of two systems, each aiming to overcome the other. From the host’s perspective, iron is essential for diverse processes such as oxygen transport, repression, detoxification and DNA synthesis. In fact, during infection, both the host and the pathogen are known to fight for the available iron, thereby influencing the outcome of the infection. It is no surprise, therefore, that many studies have investigated several components of the iron regulatory machinery of M.tb and the host. However, very few attempts have been made to study the interactions between these components and how such interactions lead to a better adapted phenotype. Such studies require exploration at multiple levels of structural and functional complexity, thereby necessitating the use of a multiscale approach. Systems biology adopts an integrated approach to study and understand the function of biological systems. 
It involves building large scale models based on individual biochemical interactions, followed by model validation and predictions of the system’s response to perturbations, such as a gene knock-out or exposure to a drug. In multiscale modeling, the approach employed in this thesis, a particular biological phenomenon is studied at different spatiotemporal levels. Studying responses at multiple scales provides a broader picture of the communications that occur between a host and a pathogen. Moreover, such an analysis also provides valuable insights into how a perturbation at a particular level can elicit responses at another level, and helps in the identification of crucial inter-level communications that can possibly be hindered or activated for a desired physiological outcome. The broad objectives of this thesis were to obtain a comprehensive in silico understanding of mycobacterial iron homeostasis and metabolism, the influence of iron on host-pathogen interactions, the identification of key players that mediate such interactions, the determination of the molecular consequences of inhibiting the key players and, finally, the global response of M.tb to altered iron concentration. Perturbation of iron homeostasis holds strong therapeutic potential, given its essentiality in both the host and the pathogen. Understanding the workings of iron metabolism and regulation in M.tb has been a main objective, so as to ultimately obtain insights about specific therapeutic strategies that capitalize on the criticality of iron concentration. An in-depth study of iron metabolism and regulation is performed at different temporal and spatial scales using diverse methods, each appropriate for investigating the biological events associated with the different scales. The specific investigations carried out in the thesis are as follows: a) Reconstruction of a host-pathogen interaction (HPI) model, with focus on iron homeostasis. 
This study represented the inter-cellular level analysis and was crucial for the identification of key players that mediate communication between the host and pathogen. Additionally, the model also provided a mathematical framework to study the effect of perturbations and gene knock-outs. b) Understanding the influence of iron on IdeR, an iron-responsive transcription factor, also identified as a key player in the HPI model. The study was carried out at the molecular level to identify atomistic details of how IdeR senses iron and of the resulting structural modifications, which finally enable IdeR-DNA interaction. The study enabled identification of residues important for the functioning of IdeR. c) Genome scale identification of genes that are regulated by IdeR, to obtain an overview of the various biological processes affected by changing iron concentrations and IdeR mutation in M.tb. d) Understanding the direct and indirect influences of iron and IdeR on the M.tb proteome using a large scale protein-protein interaction network. The study enabled identification of the most differentially regulated genes and of the altered activity of the different biological processes under differing iron concentrations and regulation. e) Systems level analysis of the M.tb metabolome to investigate the metabolic re-adjustments undertaken by M.tb to adapt to altered iron concentration and regulation. The conceptual details and the background of each of the methods used to study the specific aims are provided in the Methodology chapter (Chapter 2). Construction of the host-pathogen interaction (HPI) model and the insights obtained from this study are presented in Chapter 3. A rule based HPI model was built with a focus on the iron regulatory mechanisms in both the host and pathogen. The model consisted of 194 rules, of which 4 rules represented interactions between the host and pathogen. 
The model not only represented an overview of iron metabolism but also allowed prediction of critical interactions that had the potential to form bottlenecks in the system and so control bacterial proliferation. In fact, model simulation led to the identification of 5 bottlenecks or chokepoints in the system which, if perturbed, could successfully interfere with the host-pathogen dynamics in favour of the host. The model also provided a framework to test perturbation strategies based on the bottlenecks. The study also established the importance of an iron responsive transcription factor, IdeR, for regulating iron concentration in the pathogen and mediating host-pathogen interactions. Additionally, the importance of mycobactin and transferrin as key molecular players involved in host-pathogen dynamics was also determined. The model provided a mathematical framework to test TB pathogenesis and provided significant insights about key molecular players and perturbation strategies that can be used to enhance therapeutic strategies. Given the importance of IdeR in HPI, its molecular mechanism of activation and dimerization was explored in Chapter 4. The main objective of the study was to explore the structural details of IdeR and its iron sensing capacity at the molecular level. A combination of molecular dynamics and protein structure network (PSN) analysis was used to analyse IdeR monomers and dimers in the presence and absence of iron. PSNs used in this thesis are based on non-covalent interactions between sidechain atoms and are quite efficient in identifying subtle iron-induced conformational variations. The study distinctly indicated the role of iron in IdeR stability. Further, it was observed that IdeR monomers can take up two major conformations, the ‘open’ and the ‘close’ conformation, with the iron bound structure preferring the ‘close’ conformation. 
Major structural changes, such as the N-terminal folding and increased propensity for dimerization were observed upon iron binding. Interestingly, careful analysis of structure suggests a role of these structural modifications towards DNA binding and has been tested in the next chapter. Overall, the results clearly highlight the influence of iron on IdeR activation and dimerization. The predisposition of IdeR to bind to DNA in the presence of metal is clearly visible even when the simulations are performed solely on protein molecules. However, to confirm the conjectures proposed in this chapter and to obtain the atomistic details of IdeR-DNA interactions, the IdeR-DNA complex was investigated. Chapter 5 focuses on the mechanistic details of IdeR-DNA interactions and the influence of iron on the same. IdeR is known to bind to a specific stretch of DNA, known as the ‘iron-box’ motif to form a dimer-of-dimer complex. Molecular dynamics followed by protein-DNA bipartite network analysis was performed on a set of four IdeR-DNA complexes to obtain a molecular level understanding of IdeR-DNA interactions. A striking observation was the dissociation of IdeR-DNA complex in the absence of iron, undoubtedly establishing the importance of iron for IdeR-DNA binding. At the residue level, hydrogen bond and non-covalent interactions clearly established the importance of N-terminal residues for DNA binding, thereby confirming the conjecture put forth in the previous chapter. An important aspect studied in this chapter is the allosteric nature of IdeR-DNA binding. Recent years have witnessed a paradigm shift in the understanding of allostery. Unlike the classical definition of allostery that was based on static structures, the newer definition is based on the conformational ensemble as represented by the shift in the energy landscape of the protein. 
The allosteric nature of IdeR-DNA complex was probed using simulated trajectories and indeed they suggest iron to be an allosteric regulator of the protein. Finally, based on the known experimental data and observations presented in Chapters 4 and 5, a multi-step model of IdeR activation and DNA binding has been proposed. In chapter 6, a global perspective of IdeR regulation in M.tb was obtained. This was important to gain insights about the influences of iron and its regulation at the M.tb cellular level. A genome scale identification of all possible IdeR targets based on the presence of ‘iron-box’ motif in the promoter region of the genes was carried out. An interesting aspect of this study was the use of energetic information from previous molecular dynamics study as an input for generation of the motif. A total of 255 such IdeR targets were identified and converted into an IdeR target network (IdeRnet). Along with IdeRnet, an unbiased systems level protein-protein interaction network was also generated. To study the response of the pathogen to external perturbations, iron-specific gene expression data was integrated into the network as node weights and edge weights. Analysis of IdeRnet provides interesting associations between fatty acid metabolism and IdeR regulations. Specific genes such as fadD32, DesA3 or lppW have been found to be affected by IdeR mutation. While IdeRnet discusses the direct associations, the global level responses are monitored by analysing pathways for the flow of information in the protein-protein interaction network (PPInet). Comparisons of the PPInets under conditions such as altering iron concentrations and lack of iron homeostasis led to the identification of the ‘top-most’ active paths under the different conditions. The study clearly suggests a halt in the protein synthesis machinery and decreased energy consumption under iron scarcity and an uninhibited consumption of energy when iron homeostasis is perturbed. 
In the final chapter (Chapter 7), flux balance analysis has been used to investigate the influence of iron on M.tb metabolism. The importance of iron for metabolic enzymes has already been established in the previous chapter. Additionally, M.tb is known to produce siderophores, important metabolites that require amino acids as their precursors, for iron extraction. Together, these observations highlight the importance of iron and of its regulation for M.tb metabolism. Flux balance analysis has been used previously to study the metabolic alterations that occur in an organism under different conditions. For this study, iron specific gene expression data was also incorporated into the model as reaction bounds, and the flux values so obtained were compared across different environmental conditions. The study provided valuable insights into the metabolic adjustments taken up by M.tb under iron stress conditions, and these correlate well with the responses observed from the interactome as well as with experimental observations. Most significantly, changes were observed in the energy preferences of the cell. For instance, it was noted that while the wild type strain of M.tb prefers synthesis of ATP via glycolysis, the IdeR mutant strain preferred oxidative phosphorylation. The picture becomes clearer when one accounts for the uncontrolled utilization of energy and rapid activation of the protein synthesis machinery in the IdeR mutant strain. Biological systems are inherently multiscale in nature and therefore, for a successful drug target regime, analysis from the genome to the phenome, which captures interactions at multiple levels, is essential. In this thesis, a detailed understanding of iron homeostasis and regulation in M.tb at multiple levels has been attempted. More importantly, insights obtained at one level formed the questions for the next level. The study was initiated at the inter-cellular level, where the influence of iron on HPI was modeled and analysed. 
From this study, IdeR, an iron-responsive transcription factor, was identified as a key player that had the potential to alter host-pathogen interactions in favour of the host. For a complete understanding of how IdeR regulates iron homeostasis, it was imperative to obtain a molecular level insight into its mechanism of action. Finally, the various aspects of IdeR regulation were investigated at the cellular level by analysing the direct and indirect influences of IdeR on the M.tb proteome and metabolome. The study suggests certain therapeutic interventions, such as 1) reduction in the concentration of free transferrin, 2) mutations at the N-terminal sites of IdeR, 3) regulation by iron of proteins involved in the production of mycolic acids and 4) perturbation of alternative energy sources, which capitalize on iron and should be investigated in detail. In summary, the consequences of iron on TB infection were studied by threading different levels. This is based on the belief that most biological functions involve multiple spatio-temporal levels with frequent cross talk between the different levels, thereby making such multiscale approaches very useful.
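
The idea of integrating expression data into an interaction network as weights and then looking for the most active paths can be sketched with a standard shortest-path search. The example below is a minimal sketch under stated assumptions: the abstract does not give the weighting formula, so the edge cost used here (the inverse of the geometric mean of the two node weights) is an assumption, and the node names and expression values are hypothetical.

```python
import heapq


def edge_cost(weight_u, weight_v):
    """Turn two positive node weights into a cost; highly expressed pairs are cheap."""
    return 1.0 / (weight_u * weight_v) ** 0.5


def most_active_path(edges, node_weight, source, target):
    """Dijkstra's algorithm; returns (total_cost, path) or (inf, []) if unreachable."""
    adjacency = {}
    for u, v in edges:
        cost = edge_cost(node_weight[u], node_weight[v])
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            # Walk back through the predecessor map to rebuild the path.
            path = [node]
            while path[-1] != source:
                path.append(previous[path[-1]])
            return dist, path[::-1]
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in adjacency.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf"), []


# Hypothetical toy network and expression-derived node weights.
toy_edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
toy_expression = {"A": 1.0, "B": 4.0, "C": 0.25, "D": 1.0}
cost, path = most_active_path(toy_edges, toy_expression, "A", "D")
print(cost, path)
```

Re-running the search with node weights taken from a different condition and comparing the resulting paths is one simple way to see how the preferred routes through the network shift between conditions.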
