Dissertations / Theses: 'Structural Bioinformatic'

1

Roberts, Rick Lee. "Structural and bioinformatic analysis of ethylmalonyl-CoA decarboxylase." Thesis, State University of New York at Buffalo, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=1600817.

Abstract:

Many enzymes of the major metabolic pathways are categorized into superfamilies that share common folds. Current models postulate that these superfamilies are the result of gene duplications coupled with mutations that lead to the acquisition of new functions. Some of these new functions are advantageous and selected for, while others may simply be tolerated. The latter can produce, at low rates, metabolites that have no known use to the cell and can become toxic when they accumulate. Concurrent with the evolution of this tolerable or potentially detrimental metabolism, organisms are selected to evolve a means of correcting or “proofreading” these non-canonical metabolites to counterbalance their detrimental effects. Metabolite proofreading is a process of intermediary metabolism, analogous to DNA proofreading, that acts on these abnormal metabolites to prevent their accumulation and toxic effects.

Here we structurally characterize ethylmalonyl-CoA decarboxylase (EMCD), a member of the family of enoyl-CoA hydratases within the crotonase superfamily of proteins, which is encoded by the ECHDC1 (enoyl-CoA hydratase domain containing 1) gene. EMCD has been shown to have a metabolite proofreading function, acting on the metabolic byproduct ethylmalonyl-CoA to prevent its accumulation, which could result in oxidative damage. We use the complementary methods of in situ crystallography, small-angle X-ray scattering, and single-crystal X-ray crystallography to structurally characterize EMCD, followed by homology analysis in order to propose a mechanism of action. This represents the first structure of a crotonase superfamily member thought to function as a metabolite proofreading enzyme.

2

Stahl, Morgan A. "The Perilipin Family of Proteins: Structural and Bioinformatic Analysis." Otterbein University Honors Theses / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=otbnhonors1620460421392971.

3

Chiara, M. "BIOINFORMATIC TOOLS FOR NEXT GENERATION GENOMICS." Doctoral thesis, Università degli Studi di Milano, 2012. http://hdl.handle.net/2434/173424.

Abstract:

New sequencing strategies have redefined the concept of “high-throughput sequencing”, and many companies, researchers, and recent reviews now use the term “Next-Generation Sequencing” (NGS) instead. These advances have introduced a new era in genomics and bioinformatics. During my years as a PhD student I developed various software tools, algorithms and procedures for the analysis of Next-Generation Sequencing data, as required by distinct biological research projects and collaborations in which our research group was involved. The tools and algorithms are therefore presented in their appropriate biological contexts. Initially I developed scripts and pipelines that were used to assemble and annotate the mitochondrial genome of the model plant Vitis vinifera. The sequence was subsequently used as a reference to study RNA editing of mitochondrial transcripts, using data produced by the Illumina and SOLiD platforms. I then developed a new approach and a new software package for the detection of relatively small indels between a donor and a reference genome, using NGS paired-end (PE) data and machine learning algorithms. I was able to show that, contrary to previous assertions, suitable paired-end data can be used to detect very small indels with high confidence in low-complexity genomic contexts. Finally, I participated in a project aimed at reconstructing the genomic sequences of two distinct strains of the biotechnologically relevant fungus Fusarium. In this context I performed the sequence assembly to obtain the initial contigs, and devised and implemented a new scaffolding algorithm that proved to be particularly efficient.

4

Gendoo, Deena. "Bioinformatic sequence and structural analysis for Amyloidogenicity in Prions and other proteins." Thesis, McGill University, 2012. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=110518.

Abstract:

Detection of amyloidogenic peptides or domains in proteins is of paramount importance for understanding their role in amyloidosis in conformational diseases. This thesis explores different approaches to the detection and prediction of amyloidogenic peptides using a variety of bioinformatic analytical methods. Bioinformatic analysis of secondary structural changes is employed to determine whether classes of structurally ambivalent peptides, mainly discordant and chameleon sequences, are efficient predictors of amyloidogenic segments. This analysis elucidates statistical relationships between discordance, chameleonism, and amyloidogenicity across a database of protein domains (SCOP), a subset of amyloid-forming proteins, and the prion family. The results highlight the limitations of these peptides as predictors of amyloidogenicity, and raise questions about the predictive power that can be obtained from secondary structure prediction methods. In another bioinformatic approach, conformationally variable segments in tertiary structures of PrP globular domains are detected using Principal Component Analysis. This technique succeeded in identifying five conformationally variable subdomains within PrP, and in ranking these subdomains by their ability to differentiate PrPs based on non-local structural response to pathogenic mutation and prion disease susceptibility. The results are corroborated by previous observations from experimental methods and molecular dynamics simulations, suggesting that this approach serves as a fast and reliable method for detecting potential amyloidogenic segments in amyloid-forming proteins. Finally, a structural, functional, and evolutionary bioinformatic analysis is conducted to assess the prevalence in nature of the first experimentally verified amyloid fibril fold, and whether this fold can serve as a prototype for other amyloid-forming proteins. The results indicate a limited scope of this fold in amyloid-forming proteins and across the protein universe, and have implications for the future identification of amyloid-forming proteins that share this fold. Collectively, the thesis compares these different methods and discusses their efficacy in detecting amyloidogenic segments.

5

MOZZICAFREDDO, MATTEO. "Structural bioinformatic analyses of (macro)molecular interactions of biomedical relevance: an experimental validation." Doctoral thesis, Università degli Studi di Camerino, 2014. http://hdl.handle.net/11581/401775.

Abstract:

Structural bioinformatics, like many other subdisciplines within bioinformatics, is characterized by the establishment of general-purpose methods for manipulating information about biological macromolecules, and the application of these methods to solving problems in biology and creating new knowledge. Among its capabilities, structural bioinformatics can analyse feasible (macro)molecular interactions and assist, or sometimes anticipate, experimental approaches in biological research, even starting from a predictive analysis of the three-dimensional structures of the partners. This thesis reports on the in silico and in vitro characterization of a selection of physiologically relevant processes involving binding between proteins and endogenous and exogenous ligands, with results confirming the capability of bioinformatic methods to clarify these issues. The general approach consisted of: (i) the in silico derivation of predictive structural and equilibrium parameters for the ligand-receptor complexes, starting from either deposited crystallographic structures (where available) or homology-modeled structures; (ii) the experimental validation of the computational data by both "in solution" and "on surface" in vitro studies; (iii) the final evaluation of the effects of the interactions in cell-based models. Specifically, during the PhD period my interest was mainly focused on the characterization of the molecular basis of systemic sclerosis (SSc), a rare human autoimmune disease, with particular emphasis on the interaction between platelet-derived growth factor receptor and a selection of human autoantibodies expressed in SSc patients (the revised version of this manuscript is currently under evaluation in Nature Communications). This project was paralleled by several other studies, among them the modulation by natural polyphenols of two human enzymes, HMG-CoA reductase and plasmin, which are involved in the cholesterol biosynthetic pathway and in cellular adhesion and mobility, respectively. The results of these studies were published in scientific journals with an impact factor.

6

Martínez Fundichely, Alexander. "Bioinformatic characterization and analysis of polymorphic inversions in the human genome." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/384837.

Abstract:

Despite the great interest in the characterization of genomic structural variants (SVs) in the human genome, inversions present unique challenges and have been little studied. This thesis has developed "GRIAL", a new algorithm designed specifically to detect and accurately map inversions from paired-end mapping (PEM) data, which is the most widely used method to detect SVs. GRIAL is based on geometrical rules to cluster, merge and refine both breakpoints of putative inversions. In this way, we have been able to predict hundreds of inversions in the human genome. In addition, thanks to the different GRIAL quality scores, we have been able to identify spurious PEM patterns and their causes, and discard a large fraction of the predicted inversions as false positives. Furthermore, we have created "InvFEST", the first database of human polymorphic inversions, which represents the most reliable catalogue of inversions and integrates all the associated information from multiple sources. Currently, InvFEST combines information from 30 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Finally, the analysis of all the data generated has provided information on the genomic patterns of inversions, contributing decisively to the understanding of the map of human polymorphic inversions.
Genomic inversion
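
The paired-end signature that inversion callers such as the one described in the abstract above start from can be sketched briefly. In standard paired-end mapping, read pairs spanning an inversion breakpoint map with both mates on the same strand, so a first pass can collect such pairs and group nearby ones. The sketch below is only an illustration of that general idea: `ReadPair`, `inversion_candidates` and the `max_gap` threshold are hypothetical names and values, and the grouping rule is a simple distance cut-off, not GRIAL's actual geometrical rules or quality scores.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """One mapped read pair (hypothetical minimal record)."""
    chrom: str
    pos1: int      # mapping position of the first mate
    strand1: str   # "+" or "-"
    pos2: int      # mapping position of the second mate
    strand2: str   # "+" or "-"


def inversion_candidates(pairs, max_gap=500):
    """Group same-strand pairs whose first-mate positions lie within max_gap.

    Assumption: max_gap is an illustrative cut-off, not a published value.
    """
    # Pairs with both mates on the same strand are the inversion-type signal.
    discordant = sorted(
        (p for p in pairs if p.strand1 == p.strand2),
        key=lambda p: (p.chrom, p.strand1, p.pos1),
    )
    clusters = []
    for p in discordant:
        last = clusters[-1][-1] if clusters else None
        if (last is not None and last.chrom == p.chrom
                and last.strand1 == p.strand1
                and p.pos1 - last.pos1 <= max_gap):
            clusters[-1].append(p)   # extend the current breakpoint cluster
        else:
            clusters.append([p])     # start a new cluster
    return clusters
```

Each returned cluster is a list of read pairs that support the same putative breakpoint; "+/+" and "-/-" pairs are kept apart because they point to opposite breakpoints of an inversion.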

7

Moss, Tiffanie. "CHARACTERIZATION OF STRUCTURAL VARIANTS AND ASSOCIATED MICRORNAS IN FLAX FIBER AND LINSEED GENOTYPES BY BIOINFORMATIC ANALYSIS AND HIGH-THROUGHPUT SEQUENCING." Case Western Reserve University School of Graduate Studies / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=case1333648149.

8

RIZZA, FABIO. "Structural modelling of biological macromolecules: the cases of neurofibromin, bifurcating Electron Transferring Flavoprotein and Amyloid-β (1-16) peptide." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2021. http://hdl.handle.net/10281/310480.

Abstract:

In this thesis, three independent projects were addressed, all sharing a computational approach based on molecular modeling and in particular molecular dynamics. In the first project, the Sec14-PH domain of neurofibromin (NF1) was investigated. Sec14 domains have been identified in many different proteins, from prokaryotes to humans, where they serve as exchangers of lipid molecules between membranes by means of a pocket whose opening is controlled by the motion of a specific alpha-helix (called the lid helix). The crystal structure of the NF1 Sec14 domain (of both the wild type and some mutants associated with the onset of neurofibromatosis) has revealed the peculiarity that it is structurally coupled to a PH domain, which strongly interacts with the lid helix through a long loop (called the lid-lock loop). On this basis, a mechanism for the opening of the Sec14 lipid pocket was formulated that would involve a concerted movement of the lid-lock loop, but this movement has never actually been observed. Guided by available experimental data on the thermal denaturation of the Sec14-PH domain of NF1, for both the wild type and some neurofibromatosis-related mutants, several simulations at high temperature were carried out to compare the dynamics of the wild-type domain with those of a pathological mutant associated with the onset of neurofibromatosis. Our simulations led us to suggest an opening mechanism for the lid helix and to provide a hypothesis for the structural and dynamic basis of the onset of the disease in the case of the specific mutant.

The second project addressed the study of a protein called EtfAB, which catalyzes a recently discovered process known as Flavin-Based Electron Bifurcation (FBEB). This mechanism is exploited only by some anaerobic microorganisms as a third way of energy coupling. So far, four evolutionarily unrelated protein families are known that are able to catalyze FBEB. One of these is EtfAB, for which it is unclear how electron transfer between the two FAD molecules bound to it can occur. The distance between these two FADs, as observed in the crystal structure of EtfAB, is 18 Å, whereas biological electron transfer is considered unlikely to occur over distances greater than 14 Å. To explain this, a possible mechanism has been suggested that could bring the two FAD molecules closer together. Using molecular dynamics, it was possible to test, and discard, the proposed mechanism. Furthermore, with Density Functional Theory (DFT), it was possible to provide an interpretation of some spectroscopic data regarding the possible electron transfer between the two FAD molecules.

In the third project, I collaborated with Prof. Luca Bertini on the production and propagation of some reactive oxygen species (ROS) in the context of the amyloid-beta peptide involved in the pathogenesis of Alzheimer's disease. In the amyloid hypothesis on the onset of Alzheimer's disease, an important role has been attributed to the damage caused by ROS, in particular the hydroxyl radical (•OH), produced by a metal complex within the amyloid peptide itself. However, the details of how these radicals propagate and react have not yet been clarified. While Prof. Bertini's DFT calculations addressed the oxidative capacity of the hydroxyl radical and the possible reaction products in the context of the amyloid-beta peptide, my molecular dynamics simulations provided an overview of which possible targets could actually come into contact, and so react, with the hydroxyl radical coordinated to the Cu ion of the complex as a result of the dynamic motions of the peptide.

9

LAURENZI, TOMMASO. "STUDY ON THE HDL::LCAT INTERACTION AND INSIGHTS INTO LCAT PHARMACOLOGICAL MODULATION." Doctoral thesis, Università degli Studi di Milano, 2021. http://hdl.handle.net/2434/835127.

Abstract:

Lecithin:cholesterol-acyl-transferase (LCAT) plays a major role in cholesterol metabolism, as it is the only extracellular enzyme able to esterify cholesterol. LCAT activity is required for lipoprotein remodeling and, more specifically, for the growth and maturation of HDLs. In fact, genetic alterations affecting LCAT functionality may cause a severe reduction in plasma levels of HDL-cholesterol with important clinical consequences, for which, at present, no optimal treatment is available. Within this project, we ultimately aim to establish landmarks for future structure-based drug discovery of novel small-molecule activators able to rescue the defective enzyme in patients with LCAT deficiency. To this end, we thoroughly studied the LCAT::HDL recognition and activation mechanism and investigated some aspects of LCAT pharmacological modulation. Although several hypotheses have been formulated, the exact molecular recognition mechanism between LCAT and HDLs is still unknown. We employed a combination of structural bioinformatics procedures to gain deeper insight into the HDL-LCAT interplay that promotes LCAT activation and cholesterol esterification. We generated a data-driven model of reconstituted HDL (rHDL) and studied the dynamics of an assembled rHDL::LCAT supramolecular complex, pinpointing the conformational changes that originate from the interaction between LCAT and apolipoprotein A-I (apoA-I) and are necessary for LCAT activation. Specifically, we propose a mechanism in which the anchoring of the LCAT lid to apoA-I helices allows the formation of a hydrophobic hood that expands the LCAT active site and shields it from the solvent, allowing the enzyme to process large hydrophobic substrates. Using the atomistic knowledge gained from our modeling work, we then studied the mechanism of action of some members of two known classes of small-molecule LCAT modulators and their interaction with a subset of LCAT mutants, rationalizing the basis for the future design of novel activators with higher efficacy.

10

Liu, Xiao. "Comprehensive bioinformatic analysis of kinesin classification and prediction of structural changes from a closed to an open conformation of the motor domain." Diss., LMU, 2009. http://nbn-resolving.de/urn:nbn:de:bvb:19-108430.

11

Hvidsten, Torgeir R. "Predicting Function of Genes and Proteins from Sequence, Structure and Expression Data." Doctoral thesis, Uppsala: Acta Universitatis Upsaliensis: Univ.-bibl. [distributor], 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-4490.

12

Björkholm, Patrik. "Method for recognizing local descriptors of protein structures using Hidden Markov Models." Thesis, Linköping University, The Department of Physics, Chemistry and Biology, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-11408.

Abstract:

Being able to predict the sequence-structure relationship in proteins will extend the scope of many bioinformatics tools that rely on structure information. Here we use Hidden Markov models (HMMs) to recognize and pinpoint the location in target sequences of local structural motifs (local descriptors of protein structure, LDPS). These substructures are composed of three or more segments of amino acid backbone structure that are in proximity to each other in space but not necessarily along the amino acid sequence. We were able to align descriptors to their proper locations in 41.1% of the cases when using models built solely from amino acid information. Using models that also incorporated secondary structure information, we were able to assign 57.8% of the local descriptors to their proper locations. A further improvement in performance was obtained when a profile was threaded through the Hidden Markov models together with the secondary structure; with this material we were able to assign 58.5% of the descriptors to their proper locations. Hidden Markov models were thus shown to be able to locate LDPS in target sequences, and accuracy increases when secondary structure and the profile for the target sequence are used in the models.

13

Lysholm, Fredrik. "Structural characterization of overrepresented." Thesis, Linköping University, The Department of Physics, Chemistry and Biology, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-12325.

Abstract:

Background: Over the last decades, vast amounts of sequence information have been produced by various protein sequencing projects, which enables studies of sequential patterns. One of the best-known efforts to chart short peptide sequences is the Prosite pattern data bank. While sequential patterns like those of Prosite have proved very useful for classifying protein families, functions, etc., structural analysis may provide more information and possibly crucial clues linked to protein folding. Today the PDB, which is the main repository for protein structures, contains more than 50,000 entries, which enables structural protein studies.

Result: Strongly folded pentapeptides, defined as pentapeptides that retained a specific conformation in several significantly structurally different proteins, were extracted from the PDB and studied. Among these, several groups were found. Possibly the most well defined is the “double Cys” pentapeptide group, with two amino acids between the cysteines (CXXCX|XCXXC), which were found to form backbone loops in which the two cysteine residues formed a possible Cys-Cys bridge. Other structural motifs were found in both helices and sheets, such as "ECSAM" and "TIKIW", respectively.

Conclusion: There is much information to be extracted by structural analysis of pentapeptides and other oligopeptides. There is no doubt that some pentapeptides are more likely to adopt a specific fold than others and that there are many strongly folded pentapeptides. By using such patterns in a protein folding model, such as the hydrophobic-polar model, improvements in speed and accuracy can be obtained. Comparing structural conformations for important overrepresented pentapeptides can also help identify and refine both structural information data banks such as SCOP and sequential pattern data banks such as Prosite.
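
The “double Cys” pattern quoted in the abstract above (CXXCX|XCXXC) can be located in a sequence with a short regular expression. The sketch below is only an illustration of the sequence-level pattern, not the thesis's structural analysis; `find_double_cys` is a hypothetical helper name, and X is assumed to stand for any single residue.

```python
import re

# A lookahead is used so that overlapping five-residue windows are all reported.
# "." stands for the X (any residue) positions of CXXCX and XCXXC.
DOUBLE_CYS = re.compile(r"(?=(C..C.|.C..C))")


def find_double_cys(sequence):
    """Return (start_index, pentapeptide) for every window matching CXXCX or XCXXC."""
    return [(m.start(), m.group(1)) for m in DOUBLE_CYS.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence, chosen only to show overlapping hits.
    for start, peptide in find_double_cys("ACGHCK"):
        print(start, peptide)
```

Start indices are zero-based. A window that satisfies both alternatives is reported once, because the lookahead yields a single match per start position.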

14

Freyhult, Eva. "New techniques for analysing RNA structure." Uppsala, 2004. http://www.math.uu.se/research/pub/Freyhult1.pdf.

15

Capuccini, Marco. "Structure-Based Virtual Screening in Spark." Thesis, Uppsala universitet, Institutionen för farmaceutisk biovetenskap, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-257028.

16

Wallner, Björn. "Protein Structure Prediction : Model Building and Quality Assessment." Doctoral thesis, Stockholm University, Department of Biochemistry and Biophysics, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-649.

Abstract:

Proteins play a crucial role in all biological processes. The wide range of protein functions is made possible by the many different conformations that the protein chain can adopt. The structure of a protein is extremely important for its function, but determining the structure of a protein experimentally is both difficult and time-consuming. In fact, with current methods it is not possible to study all the billions of proteins in the world by experiment. Hence, for the vast majority of proteins the only way to obtain structural information is to use a method that predicts the structure of a protein from its amino acid sequence.

This thesis focuses on improving current protein structure prediction methods by combining different prediction approaches with machine-learning techniques. This work has resulted in some of the best automatic servers in the world, Pcons and Pmodeller. As part of the improvement of our automatic servers, I have also developed one of the best methods for predicting the quality of a protein model, ProQ. In addition, I have developed methods to predict the local quality of a protein model, based on the structure (ProQres) and based on evolutionary information (ProQprof). Finally, I have performed the first large-scale benchmark of publicly available homology modeling programs.

17

Novotny, Marian. "Applications of Structural Bioinformatics for the Structural Genomics Era." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7593.

18

Peng, Zeshan. "Structure comparison in bioinformatics." Click to view the E-thesis via HKUTO, 2006. http://sunzi.lib.hku.hk/hkuto/record/B36271299.

19

Peng, Zeshan, and 彭澤山. "Structure comparison in bioinformatics." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B36271299.

20

Viklund, Håkan. "Formalizing life : Towards an improved understanding of the sequence-structure relationship in alpha-helical transmembrane proteins." Doctoral thesis, Stockholm University, Department of Biochemistry and Biophysics, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-7144.

Abstract:

Genes coding for alpha-helical transmembrane proteins constitute roughly 25% of the total number of genes in a typical organism. As these proteins are vital parts of many biological processes, an improved understanding of them is important for achieving a better understanding of the mechanisms that constitute life.

All proteins consist of an amino acid sequence that folds into a three-dimensional structure in order to perform its biological function. The work presented in this thesis is directed towards improving the understanding of the relationship between sequence and structure for alpha-helical transmembrane proteins. Specifically, five original methods for predicting the topology of alpha-helical transmembrane proteins have been developed: PRO-TMHMM, PRODIV-TMHMM, OCTOPUS, Toppred III and SCAMPI.

A general conclusion from these studies is that approaches that use multiple sequence information achieve the best prediction accuracy. Further, the properties of reentrant regions have been studied, both with respect to sequence and structure. One result of this study is an improved definition of the topological grammar of transmembrane proteins, which is used in OCTOPUS and shown to further improve topology prediction. Finally, Z-coordinates, an alternative system for representing topological information for transmembrane proteins that is based on distance to the membrane center, have been introduced, and a method for predicting Z-coordinates from amino acid sequence, Z-PRED, has been developed.
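
As a rough illustration of the Z-coordinate idea described above, the sketch below computes the distance from a residue to the membrane center along the membrane normal. The function name `z_coordinate`, the representation of the membrane mid-plane by a point `center` and a vector `normal`, and the example coordinates are all assumptions made for illustration. This is not Z-PRED, which predicts such values from sequence.

```python
import math

def z_coordinate(atom, center=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0)):
    """Absolute distance from `atom` to the membrane mid-plane.

    The mid-plane is assumed to pass through `center` with normal vector
    `normal` (not necessarily unit length). Units follow the input.
    """
    norm = math.sqrt(sum(n * n for n in normal))
    # Project the vector center->atom onto the unit normal.
    signed = sum((a - c) * n for a, c, n in zip(atom, center, normal)) / norm
    return abs(signed)

# Hypothetical residue coordinates, for illustration only.
residues = {"A10": (3.0, 4.0, -12.5), "A25": (1.0, -2.0, 0.5)}
z_by_residue = {name: z_coordinate(xyz) for name, xyz in residues.items()}
```

Taking the absolute value makes the two sides of the membrane equivalent, which matches the description of the representation as a distance rather than a signed position.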

21

Michel, Mirco. "From Sequence to Structure : Using predicted residue contacts to facilitate template-free protein structure prediction." Doctoral thesis, Stockholms universitet, Institutionen för biokemi och biofysik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-141946.

Abstract:

Despite the fundamental role of experimental protein structure determination, computational methods are essential to bridge the ever-growing gap between available protein sequence and structure data. Common structure prediction methods rely on experimental data, which is not available for about half of the known protein families. Recent advancements in amino acid contact prediction have revolutionized the field of protein structure prediction. Contacts can be used to guide template-free structure predictions that do not rely on experimentally solved structures of homologous proteins. Such methods are now able to produce accurate models for a wide range of protein families. We developed PconsC2, an approach that improved existing contact prediction methods by recognizing intra-molecular contact patterns and reducing noise. An inherent problem of contact prediction based on maximum entropy models is that large alignments with over 1000 effective sequences are needed to infer contacts accurately. These are, however, not available for more than 80% of all protein families that do not have a representative structure in PDB. With PconsC3, we could extend the applicability of contact prediction to families as small as 100 effective sequences by combining global inference methods with machine learning based on local pairwise measures. By introducing PconsFold, a pipeline for contact-based structure prediction, we could show that improvements in contact prediction accuracy translate to more accurate models. Finally, we applied a similar technique to Pfam, a comprehensive database of known protein families. In addition to using a faster folding protocol, we employed model quality assessment methods, crucial for estimating the confidence in the accuracy of predicted models. We propose models to be accurate for 558 families that do not have a representative known structure. Out of those, over 75% have not been reported before.
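
To make the notion of a residue contact concrete, the following sketch derives a contact set from known coordinates, which is the kind of target a contact predictor tries to recover from sequence alone. The 8 Å distance cutoff is a commonly used convention and is an assumption here, since the abstract does not state one. The function name, the `min_separation` parameter and the toy coordinates are likewise illustrative and are not taken from PconsC2, PconsC3 or PconsFold.

```python
import math

def contact_map(coords, threshold=8.0, min_separation=1):
    """Return the set of residue index pairs (i, j), i < j, whose
    representative atoms lie closer than `threshold`.

    `min_separation` skips pairs that are close in sequence, which are
    trivially close in space.
    """
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts.add((i, j))
    return contacts

# Toy chain: one representative atom per residue (hypothetical values).
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
long_range = contact_map(chain, min_separation=2)
```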

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 2: Submitted. Paper 4: In press.

22

Nordström, Rickard. "3DPOPS : From carbohydrate sequence to 3D structure." Thesis, University of Skövde, Department of Computer Science, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-713.

Abstract:

In this project, a web-based system called 3DPOPS has been designed, developed and implemented. The system creates initial 3D structures of oligosaccharides according to user input data and is intended to be integrated with an automated 3D prediction system for saccharides. The web interface uses a novel approach with a dynamically updated graphical representation of the input carbohydrate. The interface is embedded in a web page as a Java applet. The needs of both expert and novice users are met by informative messages, a familiar concept and a dynamically updated graphical user interface in which only valid input can be created.

A set of test sequences was collected from the CarbBank database. An initial structure could be created for each sequence. All contained the information necessary to serve as starting points in a conformation search carried out by a 3D prediction system for carbohydrates.

23

Freyhult, Eva. "A Study in RNA Bioinformatics : Identification, Prediction and Analysis." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-8305.

24

Veanes, Margus. "Identification of novel loss of heterozygosity collateral lethality genes for potential applications in cancer." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-433768.

Abstract:

Over the course of this project, I demonstrate the utility of a 4-phase analysis pipeline in the context of cancer therapy and the associated search for antineoplastic drug candidates. I showcase a repeatable means for generating lists of potential targets which may be used in conjunction with methods like small molecule screening as part of a search for broadly effective antineoplastic agents. By using publicly available variant call format (VCF) data sourced from the 1000 genomes project, global human population-wide data for non-sex chromosomes was filtered and transformed in a 4-phase process to obtain high population frequency, heterozygotic, nonsynonymous single nucleotide variants (nsSNVs) residing in functional domains of proteins. Through manual filtration combined with software-assisted annotation, I obtained a ranked list of 50 top scoring annotated variants across the human autosome, all residing in known protein domains. Additionally, a single top variant was selected for proof-of-concept structure prediction and visualization. When the methodology outlined herein is coupled to additional loss-of-heterozygosity (LOH) prevalence data across cancer genomes, it may be used to identify candidate variants which collectively represent potential loss-of-heterozygosity based collateral lethalities (CL) in the underlying cancer. Furthermore, under the assumption that subsequent methods like small molecule screening succeed in finding molecule(s) targeting a structural aspect of one of these variants, any subsequently developed therapeutic approaches may possess broader therapeutic utility dependent upon the strictness of the initial heterozygotic filtering threshold applied at the onset of the project pipeline. 
When combined with additional cancer data, the recreation of such gene lists at other degrees of heterozygotic thresholding can allow for the creation of lists of autosomal loss-of-heterozygosity gene candidates, representing potential collateral lethality targets with varied degrees of utility dependent upon the strictness of the initial filtration threshold.

25

Bliven, Spencer Edward. "Structure-Preserving Rearrangements| Algorithms for Structural Comparison and Protein Analysis." Thesis, University of California, San Diego, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3716489.

Abstract:

Protein structure is fundamental to a deep understanding of how proteins function. Since structure is highly conserved, structural comparison can provide deep information about the evolution and function of protein families. The Protein Data Bank (PDB) continues to grow rapidly, providing copious opportunities for advancing our understanding of proteins through large-scale searches and structural comparisons. In this work I present several novel structural comparison methods for specific applications, as well as apply structure comparison tools systematically to better understand global properties of protein fold space.

Circular permutation describes a relationship between two proteins where the N-terminal portion of one protein is related to the C-terminal portion of the other. Proteins that are related by a circular permutation generally share the same structure despite the rearrangement of their primary sequence. This non-sequential relationship makes them difficult for many structure alignment tools to detect. Combinatorial Extension for Circular Permutations (CE-CP) was developed to align proteins that may be related by a circular permutation. It is widely available due to its incorporation into the RCSB PDB website.
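
To make the definition concrete, here is a toy check at the level of exact sequences: two equal-length strings are circular permutations of each other exactly when one occurs inside the other concatenated with itself. This is only an illustration with a hypothetical function name. CE-CP itself aligns three-dimensional structures and tolerates divergence between the two proteins, which this sketch does not attempt.

```python
def circular_shift(a: str, b: str):
    """Return k such that b == a[k:] + a[:k], or None if no such k exists."""
    if len(a) != len(b):
        return None
    # Every rotation of `a` appears as a substring of `a + a`.
    k = (a + a).find(b)
    return k if k != -1 else None

# The N-terminal half of one sequence matches the C-terminal half of the other.
shift = circular_shift("ABCDEF", "DEFABC")
```

The returned offset marks where the second sequence starts within the first, which plays the role of the permutation site in this simplified setting.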

Symmetry and structural repeats are common in protein structures at many levels. The CE-Symm tool was developed in order to detect internal pseudosymmetry within individual polypeptide chains. Such internal symmetry can arise from duplication events, so aligning the individual symmetry units provides insights about conservation and evolution. In many cases, internal symmetry can be shown to be important for a number of functions, including ligand binding, allostery, folding, stability, and evolution.

Structural comparison tools were applied comprehensively across all PDB structures for systematic analysis. Pairwise structural comparisons of all proteins in the PDB have been computed using the Open Science Grid computing infrastructure, and are kept continually up-to-date with the release of new structures. These provide a network-based view of protein fold space. CE-Symm was also applied to systematically survey the PDB for internally symmetric proteins. It is able to detect symmetry in ~20% of all protein families. Such PDB-wide analyses give insights into the complex evolution of protein folds.

26

Brown, Peter G. "Structural Alignments for Similarity Detection in Bioinformatics." Thesis, Griffith University, 2019. http://hdl.handle.net/10072/390033.

Abstract:

This thesis addresses problems involving structural alignments for similarity detection between entities. In the general computational context, a structural alignment is defined as an optimization problem where representative inputs are assigned to relative positions subject to the minimization of some objective function. The output is an inferred relationship based upon the resultant value of the objective function, and/or the arrangement of aligned positions. Two bioinformatics similarity detection applications were used as case studies within this work, the structural alignment of biomolecular proteins and the document similarity detection problem in biomedical literature. The structural alignment of protein biomolecules involves generating residue pair correspondences of maximal overlap with minimal geometric divergence using each protein’s set of three-dimensional atomic coordinates. As protein structure decides its functionality, similarity in structure usually implies similarity in function. During the investigation of this structural alignment problem, it became apparent that a fast and approximate asymmetric linear sum assignment algorithm was required. Accordingly, a new heuristic algorithm, Asymmetric Greedy Search (AGS), was developed. Extensive computational experiments using a range of model graphs demonstrated the effectiveness of the algorithm. In addition, a new type of deterministic model graph that is suitable for reproducible benchmarking of these types of algorithms was also developed. Incorporating AGS, a new non-sequential protein structure alignment method, SPalignNS, was then developed. As compared to existing methods, SPalignNS achieved greater alignment accuracy with commonly used protein alignment test datasets, and also achieved the highest agreement with manually curated reference alignments. 
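
For readers unfamiliar with the asymmetric linear sum assignment problem mentioned above, the sketch below shows a deliberately naive greedy baseline on a rectangular cost matrix, where rows might stand for residues of the smaller protein and columns for residues of the larger one. It is not the AGS algorithm from the thesis, whose details the abstract does not give. The function name and the example costs are assumptions, and a greedy choice like this is fast but not guaranteed to be optimal.

```python
def greedy_assignment(cost):
    """Assign rows to distinct columns by repeatedly taking the cheapest
    remaining (row, column) pair. The matrix may be rectangular."""
    pairs = sorted(
        (cost[r][c], r, c)
        for r in range(len(cost))
        for c in range(len(cost[r]))
    )
    used_rows, used_cols, assignment = set(), set(), {}
    for _, r, c in pairs:
        if r not in used_rows and c not in used_cols:
            assignment[r] = c
            used_rows.add(r)
            used_cols.add(c)
    total = sum(cost[r][c] for r, c in assignment.items())
    return assignment, total

# Two rows, three columns: one column necessarily stays unassigned.
costs = [[4, 1, 3],
         [2, 0, 5]]
assignment, total = greedy_assignment(costs)
```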
The document similarity detection problem is a fundamental application of natural language processing, and constitutes the basis of information retrieval systems. Document matching systems for locating relevant literature have mostly relied on methods developed over a decade ago, largely due to the unavailability of a common evaluation framework. A database of relevance annotations for over 180,000 PubMed-listed document pairs was developed with a subsequent application in training a sentence-based transferred learning model, HuBERT (Hierarchical PubMed BERT). When applied to relevant biomedical literature searches in PubMed, the new HuBERT method produced superior results compared to those attained by the baseline methods from existing document matching systems.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Info & Comm Tech
Science, Environment, Engineering and Technology

27

Stamatelou, Ismini Christina. "Clustering approaches for extracting structural determinants of enzyme active sites." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-426221.

Abstract:

The study of enzyme binding sites is an essential but rather demanding process of increased complexity, since the amino acids lining these areas are not rigid. At the same time, minimizing side effects and ensuring the specificity of new ligands are great challenges in structure-based drug design. Using glycogen phosphorylase, a validated target for the development of new antidiabetic agents, as a case study, this project focuses on the examination of side-chain conformations of amino acids that play a key role in the catalytic site of the enzyme. Specifically, different rotamers of each amino acid were collected to build a dataset of different conformations of the catalytic site. The rotamers were filtered by their probability of occurrence, and subsequently all rotamers that create steric clashes were rejected. Then, these conformations were clustered based on their similarity. Three different clustering algorithms and multiple numbers of clusters were tested, using silhouette scores to evaluate the clustering. To measure similarity, the Euclidean metric was used, which, owing to the correspondence of the coordinates between the conformations, was very similar to the cRMSD metric. Two-level clustering was applied to the dataset for more in-depth observations. According to the clustering results, specific amino acids with major geometrical variations in their rotamers play the most important role in the separation of the clusters. Additionally, all rotamers of an amino acid can be grouped based on their structure, something that was confirmed using the “Chimera” software as a visualization tool. The ultimate aim of this study is to examine whether the clustering of conformations produces clusters with points geometrically similar to each other, in order to identify near neighbors, i.e. conformations that are quite similar in structure but do not play a determinant role in the function, and those that are quite diverse and could be further exploited.
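
The relationship between the Euclidean metric and cRMSD mentioned above can be written out directly. For two conformations with the same N atoms in the same order, and with no superposition step, the Euclidean distance between the flattened coordinate vectors equals sqrt(N) times the coordinate RMSD, so both metrics rank neighbours identically. The sketch assumes exactly that setting. The function names and toy coordinates are illustrative and are not taken from the thesis.

```python
import math

def squared_deviation(conf_a, conf_b):
    """Sum of squared coordinate differences over corresponding atoms."""
    return sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(conf_a, conf_b)
        for p, q in zip(atom_a, atom_b)
    )

def euclidean(conf_a, conf_b):
    """Euclidean distance between the flattened coordinate vectors."""
    return math.sqrt(squared_deviation(conf_a, conf_b))

def crmsd(conf_a, conf_b):
    """Coordinate RMSD without superposition; atoms must correspond 1:1."""
    return math.sqrt(squared_deviation(conf_a, conf_b) / len(conf_a))

# Two toy conformations of the same two atoms (hypothetical values).
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
ratio = euclidean(conf_a, conf_b) / crmsd(conf_a, conf_b)  # sqrt(N) for N atoms
```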

28

Jakobsson, Jenny. "Structural variation identification in non-reference cattle breed genomes." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-448593.

Abstract:

Cattle are essential for the global food industry through meat and milk production. From an economic point of view, it is in our best interest to make cattle as efficient as possible, whether for milk or beef production, without negatively influencing their health and welfare. That has led to a steady increase in interest in the genetic analysis of cattle. The sequencing and identification of genomic variation has led to the association of genotypes with phenotypes of interest and the discovery of the underlying genetic risk factors for many diseases and traits. Diseases or monogenic traits caused by a single nucleotide polymorphism (SNP), small deletions and insertions, or other small mutations are often easy to identify if the correct region is found. Diseases caused by structural variants (SVs), variants larger than 50 base pairs (bp), are still challenging because SVs are harder to identify, especially using short-read sequencing technologies. SVs are therefore still a rather unexplored area for cattle and other domestic species. This thesis looks at SVs found in Swedish Red and Brown (SRB) cattle to discover breed-specific SVs. This was done by creating a pipeline with VCF files as input. The identified SVs were filtered and overlapped with externally identified SVs. The pipeline was tested with two SRB datasets. The structural variant caller, DELLY, performed poorly with low read depth data when comparing single-replicate data and combined-replicate data. Multiple SVs were identified in all individuals and overlapped with both functional and gene annotation. Overlap was also found with datasets in the European Variation Archive (EVA). This indicates that the identified SVs are shared among multiple breeds of cattle and that DELLY can be used to develop future pipelines to include long-read sequencing technologies and/or data with higher read depth.
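
The filtering and overlap steps described above can be sketched as follows. The 50 bp size cut-off comes from the abstract. The `(chrom, start, end)` tuple layout, the half-open coordinates, the any-overlap criterion and the function names are assumptions for illustration and do not reproduce the thesis pipeline or DELLY's output format.

```python
MIN_SV_LENGTH = 50  # bp; the abstract defines SVs as variants larger than 50 bp

def is_structural_variant(start, end):
    """True if the half-open interval [start, end) spans more than 50 bp."""
    return end - start > MIN_SV_LENGTH

def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def shared_svs(called, external):
    """Keep called variants that pass the size filter and overlap at least
    one externally identified SV."""
    kept = [sv for sv in called if is_structural_variant(sv[1], sv[2])]
    return [sv for sv in kept if any(overlaps(sv, ext) for ext in external)]

# Hypothetical calls and external SVs, for illustration only.
called = [("chr1", 100, 130), ("chr1", 1000, 1200), ("chr2", 500, 900)]
external = [("chr1", 1150, 1300), ("chr3", 500, 900)]
shared = shared_svs(called, external)
```

In practice a stricter criterion such as a minimum reciprocal overlap fraction is often preferred, because breakpoints called from short reads are imprecise.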

29

Sontheimer, Jana. "Functional characterization of proteins involved in cell cycle by structure-based computational methods." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-86778.

Abstract:

In recent years, a rapidly increasing amount of experimental data has been generated by high-throughput technologies. Despite these large quantities of protein-related data and the development of computational prediction methods, the function of many proteins is still unknown. In the human proteome, at least 20% of the annotated proteins are not characterized. Thus, the question of how to predict protein function from amino acid sequence remains to be answered for many proteins. Classical bioinformatics approaches for function prediction are based on inferring function from well-characterized homologs, which are identified based on sequence similarity. However, these methods fail to identify distant homologs with low sequence similarity. As protein structure is more conserved than sequence in protein families, structure-based methods (e.g. fold recognition) may recognize possible structural similarities even at low sequence similarity and therefore provide information for function inference. These fold recognition methods have already been proven to be successful for individual proteins, but their automation for high-throughput application is difficult due to intrinsic challenges of these techniques, mainly caused by a high false positive rate. Automated identification of remote homologs based on fold recognition methods would allow a significant improvement in functional annotation of proteins. My approach was to combine structure-based computational prediction methods with experimental data from genome-wide RNAi screens to support the establishment of functional hypotheses by improving the analysis of protein structure prediction results. In the first part of my thesis, I characterized proteins from the Ska complex by computational methods.
I showed the benefit of including experimental information to identify remote homologs: Integration of functional data helped to reduce the number of false positives in fold recognition results and made it possible to establish interesting functional hypotheses based on high confidence structural predictions. Based on the structural hypothesis of a GLEBS motif in c13orf3 (Ska3), I could derive a potential molecular mechanism that could explain the observed phenotype. In the second part of my thesis, my goal was to develop computational tools and automated analysis techniques to be able to perform structure-based functional annotation in a high-throughput way. I designed and implemented key tools that were successfully integrated into a computational platform, called StrAnno, which I set up together with my colleagues. These novel computational modules include a domain prediction algorithm and a graphical overview that facilitates and accelerates the analysis of results. StrAnno can be seen as a first step towards automatic functional annotation of proteins by structure-based methods. First, the analysis of long hit lists to identify promising candidates for further analysis is substantially facilitated by integration and combination of various sequence-based computational tools and data from functional databases. Second, the developed post-processing tools accelerate the evaluation of structural and functional hypotheses. False positives from the threading result lists are removed by various filters, and analysis of the possible true positives is greatly enhanced by the graphical overview. With these two essential benefits, fold recognition techniques are applicable to large-scale approaches. By applying this developed methodology to hits from a genome-wide cell cycle RNAi screen and evaluating structural hypotheses by molecular modeling techniques, I aimed to associate biological functions to human proteins and link the RNAi phenotype to a molecular function.
For two selected human proteins, c20orf43 and HJURP, I could establish interesting structural and functional hypotheses. These predictions were based on templates with low sequence identity (10-20%). The uncharacterized human protein c20orf43 might be an E3 SUMO-ligase that could be involved either in DNA repair or rRNA regulatory processes. Based on the structural hypotheses of two domains of HJURP, I predicted a potential link to ubiquitylation processes and direct DNA binding. In addition, I substantiated the cell cycle arrest phenotype of these two genes upon RNAi knockdown. Fold recognition methods are a promising alternative for functional annotation of proteins that escape sequence-based annotation due to their low sequence identity to well-characterized protein families. The structural and functional hypotheses I established in my thesis open the door to investigate the molecular mechanisms of previously uncharacterized proteins, which may provide new insights into cellular mechanisms.

30

Peterson, Mark Erik. "Evolutionary constraints on the structural similarity of proteins and applications to comparative protein structure modeling." Diss., Search in ProQuest Dissertations & Theses. UC Only, 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3339202.

31

Bottoms, Christopher A. "Bioinformatics of protein bound water." Diss., Columbia, Mo. : University of Missouri-Columbia, 2005. http://hdl.handle.net/10355/4188.

Abstract:

Thesis (Ph. D.)--University of Missouri-Columbia, 2005.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file viewed on (July 17, 2006) Vita. Includes bibliographical references.

32

Leonardi, Emanuela. "Bioinformatic Analysis of Protein Mutations." Doctoral thesis, Università degli studi di Padova, 2012. http://hdl.handle.net/11577/3426280.

Abstract:

Many gene defects have been associated with genetic disorders, but the details of the molecular mechanisms by which they contribute to disease are often unclear. The study of mutation effects at the protein level can help elucidate the biological processes involved in the disease and the role of the protein in it. Bioinformatics can help to address this problem, being the connection between different disciplines including clinical medicine, genetics, structural biology, and biochemistry. Using a computational approach, I analyzed some examples of biomedically interesting proteins, integrating various sources of data and addressing experimental and clinical investigations. Experimentally defined structures and molecular modelling were used as a basis to determine the protein structure-function relationship, which is essential to gain insights into disease genotype-phenotype correlation. Proteins have been further analyzed in their context, considering the interactions in which they take part in specific cellular compartments. The results have been used to formulate functional hypotheses, which in some cases have been tested and confirmed by further investigations performed by collaborating groups. Mutations found in genes encoding these proteins have been evaluated for their impact on protein structure and function by using several available prediction methods. These studies provided the idea for developing novel approaches, using residue interaction networks and an ensemble of methods. A novel strategy has also been designed to evaluate genomic data obtained by next-generation sequencing technology. This consists of using available resources and software to prioritize rare functional variants and estimate their contribution to the disease. The novel approaches developed in this thesis have been applied and assessed at the Critical Assessment of Genome Interpretation (CAGI) experiment in 2011, providing in some cases very successful results.
Genetic alterations have been identified for many diseases of genetic origin, but in many cases the molecular mechanisms that contribute to the onset of the disease are still unclear. Studying the effects of mutations at the protein level makes it possible to clarify the biological processes involved in the disease and the role of the protein in it. Bioinformatics can help address this problem by acting as the point of connection between different disciplines such as clinical medicine, genetics, structural biology, and biochemistry. In this thesis I used a computational approach to analyze some examples of proteins of biomedical interest, integrating various data resources and directing experimental and clinical research. Protein structures determined experimentally or by molecular modelling were used as a basis to determine the structure-function relationship, which is essential for obtaining information on genotype-phenotype correlation. The proteins under study were also analyzed in their context, considering the interactions that occur with other proteins or ligands in the different cellular compartments. The results of the bioinformatic analysis were then used to formulate functional hypotheses, which in some cases were verified and confirmed experimentally by other research groups. The mutations identified in the genes encoding the proteins under study were evaluated for their impact on protein structure and function using numerous prediction methods available online. The various applications described in this thesis provided the idea for developing new computational approaches for the structural and functional characterization of proteins and their mutants. Prediction was found to improve when an ensemble of the different available prediction methods is used. In addition, a new computational approach was devised for predicting the effects of mutations, which uses residue interaction networks to represent the protein structure. These methods were also used in the analysis of genomic data produced by new sequencing technologies. This field needs new investigation strategies to identify a few causative variants among an enormous number of identified variants of uncertain significance. To this end, an analysis strategy is proposed that uses information derived from protein interaction networks. The new approaches formulated in this thesis were applied and evaluated in a new international experiment called the Critical Assessment of Genome Interpretation (CAGI), in some cases with excellent results.

33

Baez, William David. "RNA Secondary Structures: from Biophysics to Bioinformatics." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1525714439675315.

34

Brown, David K. "Bioinformatics tool development with a focus on structural bioinformatics and the analysis of genetic variation in humans." Thesis, Rhodes University, 2018. http://hdl.handle.net/10962/60708.

Abstract:

This thesis is divided into three parts, united under the general theme of bioinformatics tool development and variation analysis. Part 1 describes the design and development of the Job Management System (JMS), a workflow management system for high performance computing (HPC). HPC has become an integral part of bioinformatics. Computational methods for molecular dynamics and next generation sequencing (NGS) analysis, which require complex calculations on large datasets, are not yet feasible on desktop computers. As such, powerful computer clusters have been employed to perform these calculations. However, making use of these HPC clusters requires familiarity with command line interfaces. This excludes a large number of researchers from taking advantage of these resources. JMS was developed as a tool to make it easier for researchers without a computer science background to make use of HPC. Additionally, JMS can be used to host computational tools and pipelines and generates both web-based interfaces and RESTful APIs for those tools. The web-based interfaces can be used to quickly and easily submit jobs to the underlying cluster. The RESTful web API, on the other hand, allows JMS to provide backend functionality for external tools and web servers that want to run jobs on the cluster. Numerous tools and workflows have already been added to JMS, several of which have been incorporated into external web servers. One such web server is the Human Mutation Analysis (HUMA) web server and database. HUMA, the topic of part 2 of this thesis, is a platform for the analysis of genetic variation in humans. HUMA aggregates data from various existing databases into a single, connected and related database. The advantages of this are realized in the powerful querying abilities that it provides. HUMA includes protein, gene, disease, and variation data and can be searched from the angle of any one of these categories.
For example, searching for a protein will return the protein data (e.g. protein sequences, structures, domains and families, and other meta-data). However, the related nature of the database means that genes, diseases, variation, and literature related to the protein will also be returned, giving users a powerful and holistic view of all data associated with the protein. HUMA also provides links to the original sources of the data, allowing users to follow the links to find additional details. HUMA aims to be a platform for the analysis of genetic variation. As such, it also provides tools to visualize and analyse the data (several of which run on the underlying cluster, via JMS). These tools include alignment and 3D structure visualization, homology modeling, variant analysis, and the ability to upload custom variation datasets and map them to proteins, genes and diseases. HUMA also provides collaboration features, allowing users to share and discuss datasets and job results. Finally, part 3 of this thesis focused on the development of a suite of tools, MD-TASK, to analyse genetic variation at the protein structure level via network analysis of molecular dynamics simulations. The use of MD-TASK in combination with the tools developed in the previous parts of this thesis is showcased via the analysis of variation in the renin-angiotensinogen complex, a vital part of the renin-angiotensin system.

35

Johansson, Joakim. "Modifying a Protein-Protein Interaction Identifier with a Topology and Sequence-Order Independent Structural Comparison Method." Thesis, Linköpings universitet, Bioinformatik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-147777.

Abstract:

Using computational methods to identify protein-protein interactions (PPIs) supports experimental techniques by requiring less time and fewer resources. PPIs can be identified through a template-based approach that describes how unstudied proteins interact by aligning a common structural template that exists in both interacting proteins. One pipeline that uses this approach is InterPred, which combines homology modelling and massive template comparison to construct coarse interaction models. These models are reviewed by a machine learning classifier that identifies models showing traits of being true, which can be further refined with a docking technique. However, InterPred depends on complex structural information that might not be available for unstudied proteins, while it has been suggested that PPIs depend on the shape and interface of proteins. A method that aligns structures based on interface attributes is InterComp, which uses topology- and sequence-order-independent structural comparison. Implementing this method in InterPred restricts the structural information to the interface of proteins, which could lead to the discovery of previously undetected PPI models. The results showed that the modified pipeline was not comparable to InterPred in terms of receiver operating characteristic (ROC) performance. However, the modified pipeline could identify new potential PPIs that were undetected by InterPred.
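
The ROC-based comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from InterPred or InterComp: the function name `roc_auc` and the toy scores are hypothetical, and the area under the ROC curve is computed here as the fraction of positive/negative pairs that the classifier ranks correctly, with ties counted as half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier scores (higher means "more likely a true interaction").
    labels: 1 for true interaction models, 0 for false ones.
    Both inputs are illustrative; no real pipeline output is used here.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: one positive (0.2) is ranked below one negative (0.8).
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 0.75
```

Two pipelines can be compared by computing this value on the same labelled set of candidate models; the quadratic pairwise loop is chosen for clarity, not speed.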

36

Bittencourt, Valnaide Gomes. "Aplicação de técnicas de aprendizado de máquina no reconhecimento de classes estruturais de proteínas." Universidade Federal do Rio Grande do Norte, 2005. http://repositorio.ufrn.br:8080/jspui/handle/123456789/15423.

Abstract:

Previous issue date: 2005-11-25
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Nowadays, classifying proteins into structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand for automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods were chosen because they represent different paradigms of learning and have been widely used in the Bioinformatics literature. To improve on the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work suffers from imbalanced classes, artificial class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to minimize this problem. To evaluate the ML methods, a cross-validation procedure is applied, in which the accuracy of the classifiers is measured using the mean classification error rate on independent test sets. These means are compared pairwise by a hypothesis test to evaluate whether there is a statistically significant difference between them.
With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. The multi-classification systems (homogeneous and heterogeneous) showed, in general, performance superior or similar to that of the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as meta-classifier. The Voting method, despite its simplicity, proved adequate for solving the problem presented in this work. The class-balancing techniques, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique proved the most appropriate.
Currently, the structural classification of proteins, which concerns the inference of patterns in their 3D conformation, is one of the main open problems in Molecular Biology. This problem has been receiving the attention of many researchers in Bioinformatics because the functions of proteins are intrinsically related to their different spatial conformations, which are difficult to obtain experimentally in the laboratory. Considering the large difference between the number of known protein sequences and the number of experimentally determined three-dimensional structures, the demand for automated techniques for the structural classification of proteins is high. In this context, computational tools, mainly Machine Learning (ML) techniques, have become essential alternatives for addressing this problem. In this work, ML techniques are employed in the recognition of protein structural classes: Decision Tree, k-Nearest Neighbors, Naïve Bayes, Support Vector Machines and Artificial Neural Networks. These methods were chosen because they represent different learning paradigms and are widely cited in the literature. To improve performance on the problem addressed, homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are applied in this research, using the aforementioned ML techniques as base classifiers. Moreover, because the protein database considered in this work suffers from imbalanced classes, artificial class-balancing techniques (Random Under-sampling, Tomek Links, CNN, NCL and OSS) are used to minimize this problem and improve classifier performance. To evaluate the ML methods, a cross-validation procedure is employed, in which the accuracy of the classifiers is measured through the means of the misclassification rate on independent test sets.
These means are compared pairwise by hypothesis testing to assess whether there is a statistically significant difference between them. Among the base classifiers, the results show the superior performance of the Support Vector Machine method. The multi-classification systems (homogeneous and heterogeneous), in turn, generally showed accuracy superior or similar to that of the base classifiers, notably Boosting built on Decision Trees and StackingC with Linear Regression as meta-classifier. The Voting method, despite its simplicity, also proved adequate for solving the problem considered in this dissertation. As for the class-balancing techniques, the datasets obtained by applying them did not yield better overall classification results. However, they did allow better classification of the minority class, which is difficult to learn. The NCL technique proved the most appropriate for balancing the classes of the protein database.
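
One of the class-balancing techniques named in this abstract, Tomek Links, can be sketched in a few lines. This is an illustrative reading of the standard definition (two samples of different classes that are each other's nearest neighbor), not the thesis's implementation; the function names, the Euclidean distance, and the toy data are all assumptions.

```python
import math


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A Tomek link is a pair of samples from different classes that are
    mutual nearest neighbors (Euclidean distance assumed here).
    """
    def nearest(i):
        best, best_d = None, math.inf
        for j, q in enumerate(points):
            if j == i:
                continue
            d = math.dist(points[i], q)
            if d < best_d:
                best, best_d = j, d
        return best

    nn = [nearest(i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if j is not None and i < j and nn[j] == i and labels[i] != labels[j]
    ]


def undersample(points, labels, majority):
    """Drop the majority-class member of every Tomek link."""
    drop = set()
    for i, j in tomek_links(points, labels):
        drop.add(i if labels[i] == majority else j)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Hypothetical 2D feature vectors; class "A" is the majority class.
pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]
lab = ["A", "B", "A", "A", "B"]
print(tomek_links(pts, lab))  # [(0, 1)]
```

Removing only the majority-class member of each link thins the class boundary without discarding minority samples, which is consistent with the abstract's observation that balancing helped the minority class more than the global error.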

37

Wu, Man-kit Edward, and 胡文傑. "Improved indexes for next generation bioinformatics applications." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B43224222.

38

Wu, Man-kit Edward. "Improved indexes for next generation bioinformatics applications." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://sunzi.lib.hku.hk/hkuto/record/B43224222.

39

Grimbs, Sergio. "Towards structure and dynamics of metabolic networks." PhD thesis, Universität Potsdam, 2009. http://opus.kobv.de/ubp/volltexte/2009/3239/.

Abstract:

This work presents mathematical and computational approaches to cover various aspects of metabolic network modelling, especially regarding the limited availability of detailed kinetic knowledge on reaction rates. It is shown that precise mathematical formulations of problems are needed i) to find appropriate and, if possible, efficient algorithms to solve them, and ii) to determine the quality of the approximate solutions found. Furthermore, some means are introduced to gain insights into dynamic properties of metabolic networks either directly from the network structure or by additionally incorporating steady-state information. Finally, an approach to identify key reactions in a metabolic network is introduced, which helps to develop simple yet useful kinetic models. The rise of novel techniques renders genome sequencing increasingly fast and cheap. In the near future, this will allow biological networks to be analyzed not only for species but also for individuals. Hence, automatic reconstruction of metabolic networks offers itself as a means for evaluating this huge amount of experimental data. A mathematical formulation as an optimization problem is presented, taking into account existing knowledge and experimental data as well as the probabilistic predictions of various bioinformatical methods. The reconstructed networks are optimized for having large connected components of high accuracy, hence avoiding fragmentation into small isolated subnetworks. The usefulness of this formalism is exemplified on the reconstruction of the sucrose biosynthesis pathway in Chlamydomonas reinhardtii. The problem is shown to be computationally demanding and therefore necessitates efficient approximation algorithms. The problem of minimal nutrient requirements for genome-scale metabolic networks is analyzed.
Given a metabolic network and a set of target metabolites, the inverse scope problem has as its objective determining a minimal set of metabolites that have to be provided in order to produce the target metabolites. These target metabolites might stem from experimental measurements and therefore are known to be produced by the metabolic network under study, or are given as the desired end-products of a biotechnological application. The inverse scope problem is shown to be computationally hard to solve. However, I assume that the complexity strongly depends on the number of directed cycles within the metabolic network. This might guide the development of efficient approximation algorithms. Assuming mass-action kinetics, chemical reaction network theory (CRNT) allows for eliciting conclusions about multistability directly from the structure of metabolic networks. Although CRNT is based on mass-action kinetics originally, it is shown how to incorporate further reaction schemes by emulating molecular enzyme mechanisms. CRNT is used to compare several models of the Calvin cycle, which differ in size and level of abstraction. Definite results are obtained for small models, but the available set of theorems and algorithms provided by CRNT cannot be applied to larger models due to the computational limitations of the currently available implementations of the provided algorithms. Given the stoichiometry of a metabolic network together with steady-state fluxes and concentrations, structural kinetic modelling allows the dynamic behavior of the metabolic network to be analyzed, even if the explicit rate equations are not known. In particular, this sampling approach is used to study the stabilizing effects of allosteric regulation in a model of human erythrocytes. Furthermore, the reactions of that model can be ranked according to their impact on stability of the steady state.
The most important reactions in that respect are identified as hexokinase, phosphofructokinase and pyruvate kinase, which are known to be highly regulated and almost irreversible. Kinetic modelling approaches using standard rate equations are compared and evaluated against reference models for erythrocytes and hepatocytes. The results from these simplified kinetic models simulate the temporal behavior acceptably for small changes around a given steady state, but fail to capture important characteristics for larger changes. The aforementioned approach to rank reactions according to their influence on stability is used to identify a small number of key reactions. These reactions are modelled in detail, including knowledge about allosteric regulation, while all other reactions are still described by simplified reaction rates. These so-called hybrid models can capture the characteristics of the reference models significantly better than the simplified models alone. The resulting hybrid models might serve as a good starting point for kinetic modelling of genome-scale metabolic networks, as they provide reasonable results in the absence of experimental data, regarding, for instance, allosteric regulations, for a vast majority of enzymatic reactions.
This thesis presents mathematical and computational approaches to various problems related to the modelling of metabolic networks, particularly in view of the limited availability of detailed enzyme kinetics. It is shown that precise mathematical formulations of the problems are necessary, first, to develop appropriate and, where possible, efficient algorithms for solving them and, second, to assess the quality of the solutions found. Furthermore, methods for analyzing dynamic properties of metabolic networks are introduced, which either rely solely on the structure of the networks or additionally take information about steady states into account. In addition, a strategy for determining key reactions of a network is presented, which simplifies the development of kinetic models. The success of new technologies makes genome sequencing ever cheaper and faster. In the near future this will allow the analysis of biological networks not only for species but also for individuals. The automatic reconstruction of metabolic networks is well suited to evaluating these large amounts of data. A mathematical formulation of the reconstruction as an optimization problem is presented that takes into account both existing knowledge and theoretical predictions of a wide range of bioinformatic methods. The reconstructed networks are optimized for connected components that are as large and plausible as possible, in order to avoid fragmented and isolated subnetworks. The reconstruction of sucrose synthesis in Chlamydomonas reinhardtii serves as an example. It is shown that the problem is computationally very demanding and therefore requires approximation algorithms.
The optimization goal of the 'inverse scope' problem is to determine, for a given metabolic network, the minimal set of metabolites required to produce a likewise given set of desired target metabolites. These target metabolites can either be determined by experimental measurements or be the desired end products of a biotechnological application. It is shown that the 'inverse scope' problem is computationally demanding. However, it is assumed that the computational complexity depends strongly on the number of directed cycles within the metabolic network. This could enable the development of efficient approximation algorithms. Under the assumption of mass-action kinetics, chemical reaction network theory (CRNT) makes it possible to draw conclusions about multistability from the structure of metabolic networks. Further kinetics can also be taken into account by modelling enzyme mechanisms. CRNT is used to compare several models of the Calvin cycle that differ in size and level of abstraction. Although results are obtained for smaller models, the available theorems and algorithms of CRNT do not permit statements about larger models, because the current implementations of the algorithms reach their computational limits. If the stoichiometry of a metabolic network as well as the metabolite concentrations and fluxes at steady state are known, structural kinetic modelling can be applied to analyze the dynamic behavior of the network, even if the explicit rate equations are unknown. This approach is used to investigate the stabilizing influence of allosteric regulation in human erythrocytes. Furthermore, the reactions are ranked according to their importance for the stability of the steady state.
The most important reactions in this ranking are hexokinase, phosphofructokinase and pyruvate kinase, which are known to be strongly regulated and irreversible. Kinetic models based on generic rate equations are compared with detailed reference models for erythrocytes and hepatocytes. The generic models simulate the behavior reasonably well only in the vicinity of a given steady state. The aforementioned approach of identifying reactions that are important for stability is used to determine key reactions. These key reactions are modelled in detail, while generic rate equations continue to be used for all other reactions. The resulting hybrid models describe the behavior of the reference model significantly better. The hybrid models can serve as a starting point for building genome-scale kinetic models.
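
The inverse scope problem described in this abstract can be made concrete with a small sketch. It assumes one common reading of "scope": the set of metabolites reachable from a seed set by repeatedly firing every reaction whose substrates are all available. The function names, the toy reactions, and the brute-force search are hypothetical and are not the thesis's algorithm; the exhaustive search is exponential in the number of candidates, in line with the abstract's remark that the problem is computationally hard.

```python
from itertools import combinations


def scope(reactions, seeds):
    """Metabolites producible from `seeds`.

    reactions: iterable of (substrates, products) tuples.
    A reaction fires once all of its substrates are available.
    """
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


def minimal_seed_sets(reactions, candidates, targets):
    """Brute-force inverse scope: all smallest seed sets covering `targets`.

    Only feasible for tiny toy networks.
    """
    targets = set(targets)
    ordered = sorted(candidates)
    for size in range(len(ordered) + 1):
        hits = [
            set(combo)
            for combo in combinations(ordered, size)
            if targets <= scope(reactions, combo)
        ]
        if hits:
            return hits
    return []


# Hypothetical toy network: A + B -> C, C -> D, E -> C.
rxns = [(("A", "B"), ("C",)), (("C",), ("D",)), (("E",), ("C",))]
print(minimal_seed_sets(rxns, {"A", "B", "E"}, {"D"}))  # [{'E'}]
```

In this toy network the target D can be reached from {A, B} or from {E}; the search returns only the smaller seed set.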

40

Cuthbertson, Jonathan M. "Structural bioinformatics and simulation studies of α-helical membrane proteins." Thesis, University of Oxford, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.420449.

41

Hatherley, Rowan. "Structural bioinformatics studies and tool development related to drug discovery." Thesis, Rhodes University, 2016. http://hdl.handle.net/10962/d1020021.

Abstract:

This thesis is divided into two distinct sections which can be combined under the broad umbrella of structural bioinformatics studies related to drug discovery. The first section involves the establishment of an online South African natural products database. Natural products (NPs) are chemical entities synthesised in nature and are unrivalled in their structural complexity, chemical diversity, and biological specificity, which has long made them crucial to the drug discovery process. South Africa is rich in both plant and marine biodiversity and a great deal of research has gone into isolating compounds from organisms found in this country. However, there is no official database containing this information, making it difficult to access for research purposes. This information was extracted manually from literature to create a database of South African natural products. In order to make the information accessible to the general research community, a website, named “SANCDB”, was built to enable compounds to be quickly and easily searched for and downloaded in a number of different chemical formats. The content of the database was assessed and compared to other established natural product databases. Currently, SANCDB is the only database of natural products in Africa with an online interface. The second section of the thesis was aimed at performing structural characterisation of proteins with the potential to be targeted for antimalarial drug therapy. This looked specifically at 1) The interactions between an exported heat shock protein (Hsp) from Plasmodium falciparum (P. falciparum), PfHsp70-x and various host and exported parasite J proteins, as well as 2) The interface between PfHsp90 and the heat shock organising protein (PfHop). The PfHsp70-x:J protein study provided additional insight into how these two proteins potentially interact. 
Analysis of the PfHsp90:PfHop also provided a structural insight into the interaction interface between these two proteins and identified residues that could be targeted due to their contribution to the stability of the Hsp90:Hop binding complex and differences between parasite and human proteins. These studies inspired the development of a homology modelling tool, which can be used to assist researchers with homology modelling, while providing them with step-by-step control over the entire process. This thesis presents the establishment of a South African NP database and the development of a homology modelling tool, inspired by protein structural studies. When combined, these two applications have the potential to contribute greatly towards in silico drug discovery research.

42

Repo, Susanna. "Structural bioinformatics in the study of protein function and evolution." Turku, Finland: Dept. of Biochemistry and Pharmacy, Åbo Akademi University, 2008. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=017048818&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.

43

Shu, Nanjiang. "Protein structure prediction: zinc-binding sites, one-dimensional structure and remote homology." Doctoral thesis, Stockholm: Department of Materials and Environmental Chemistry (MMK), Stockholm University, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-34094.

Abstract:

Doctoral dissertation (summary). Stockholm: Stockholm University, 2010.
At the time of the doctoral defense, the following paper was unpublished and had the following status: Paper 3: Manuscript. Includes 4 papers.

44

Tamura, Takeyuki. "Graph Algorithmic Approaches for Structure Inferences in Bioinformatics." 京都大学 (Kyoto University), 2006. http://hdl.handle.net/2433/68893.

45

Carlsson, Jonas. "Mutational effects on protein structure and function." Doctoral thesis, Linköpings universitet, Bioinformatik, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-50491.

Abstract:

In this thesis several important proteins are investigated from a structural perspective. Some of the proteins are disease related while others have important but not completely characterised functions. The techniques used are general, as demonstrated by applications to metabolic proteins (CYP21, CYP11B1, IAPP, ADH3), regulatory proteins (p53, GDNF) and a transporter protein (ANTR1). When the protein CYP21 (steroid 21-hydroxylase) is deficient it causes CAH (congenital adrenal hyperplasia). For this protein, there are about 60 known mutations with characterised clinical phenotypes. Using manual structural analysis we managed to explain the severity of all but one of the mutations. By observing the properties of these mutations we could make good predictions for mutations that were, at the time, not yet classified. For the cancer suppressor protein p53, there are over a thousand mutations with known activity. To be able to analyse such a large number of mutations we developed an automated method for evaluating mutation effects, called PREDMUT. In this method we include twelve different prediction parameters, including two energy parameters calculated using an energy minimization procedure. The method manages to differentiate severe mutations from non-severe mutations with 77% accuracy on all possible single base substitutions and with 88% on mutations found in breast cancer patients. The automated prediction was further applied to CYP11B1 (steroid 11-beta-hydroxylase), which, like CYP21, causes CAH when deficient. A generalized method applicable to any kind of globular protein was developed. The method was subsequently evaluated on nine additional proteins for which mutants were known with annotated disease phenotypes. This prediction achieved 84% accuracy on CYP11B1 and 81% accuracy in total on the evaluation proteins while leaving 8% as unclassified.
By increasing the number of unclassified mutations, the accuracy on the remaining mutations of the evaluation proteins could be increased, substantially improving the classification quality as measured by the Matthews correlation coefficient. Servers with predictions for all possible single base substitutions are provided for p53, CYP21 and CYP11B1. The amyloid formation of IAPP (islet amyloid polypeptide) is strongly connected to diabetes and has been studied using both molecular dynamics and Monte Carlo energy minimization. The effects of mutations on the amount and speed of amyloid formation were investigated using three approaches. Applying a consensus of the three methods to a number of interesting mutations, 94% of the mutations could be correctly classified as amyloid forming or not, as evaluated against in vitro measurements. In the brain there are many proteins whose functions and interactions are largely unknown. GDNF (glial cell line-derived neurotrophic factor) and NCAM (neural cell adhesion molecule) are two such neuron-connected proteins that are known to interact. The form of interaction was studied using protein-protein docking, where a docking interface was found mediated by four oppositely charged residues in each protein. This interface was subsequently confirmed by mutagenesis experiments. The NCAM dimer interface upon binding to the GDNF dimer was also mapped, as was an additional interacting protein, GFRα1, which was successfully added to the protein complex without any clashes. A large and well-studied protein family is the alcohol dehydrogenase family, ADH. One class of this family is ADH3 (alcohol dehydrogenase class III), which has several known substrates and inhibitors. By using virtual screening we tried to characterize new ligands.
As some ligands were already known, we could incorporate this knowledge when the compound docking simulations were scored and thereby find two new substrates and two new inhibitors, which were subsequently successfully tested in vitro. ANTR1 (anion transporter 1) is a membrane-bound transporter important in photosynthesis in plants. To be able to study the amino acid residues involved in inorganic phosphate transportation, a homology model of the protein was created. Important residues were then mapped onto the structure using conservation analysis, and we were in this way able to propose roles of amino acid residues involved in the transportation of inorganic phosphate. Key residues were subsequently mutated in vitro and a transportation process could be postulated. To conclude, we have used several molecular modelling techniques to find functional clues, interaction sites and new ligands. Furthermore, we have investigated the effect of mutations on the function and structure of a multitude of disease-related proteins.
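
The abstract reports both accuracy and the Matthews correlation coefficient (MCC). A minimal sketch of the two measures shows why MCC is the more demanding one; the function names and the confusion-matrix counts below are hypothetical and are not results from the thesis.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified mutations."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction);
    defined as 0 here when any row or column of the confusion matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: a predictor that calls every mutation "severe"
# on a set with 90 severe and 10 non-severe mutations.
print(accuracy(tp=90, tn=0, fp=10, fn=0))     # 0.9
print(matthews_cc(tp=90, tn=0, fp=10, fn=0))  # 0.0
```

In this toy case accuracy looks high only because the classes are imbalanced, whereas MCC is zero because the predictor carries no information about the minority class.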

46

Garma, L. D. (Leonardo D.). "Structural bioinformatics tools for the comparison and classification of protein interactions." Doctoral thesis, Oulun yliopisto, 2017. http://urn.fi/urn:isbn:9789526216065.

Abstract:

Most proteins carry out their functions through interactions with other molecules. Thus, proteins taking part in similar interactions are likely to carry out related functions. One way to determine whether two proteins take part in similar interactions is by quantifying the likeness of their structures. This work focuses on the development of methods for the comparison of protein-protein and protein-ligand interactions, as well as their application to structure-based classification schemes. A method based on the MultiMer-align (or MM-align) program was developed and used to compare all known dimeric protein complexes. The results of the comparison demonstrate that the method improves over MM-align in a significant number of cases. The data were employed to classify the complexes, resulting in 1,761 different protein-protein interaction types. Through a statistical model, the number of existing protein-protein interaction types in nature was estimated at around 4,000. The model allowed the establishment of a relationship between the number of quaternary families (sequence-based groups of protein-protein complexes) and quaternary folds (structure-based groups). The interactions between proteins and small organic ligands were studied using sequence-independent methodologies. A new method was introduced to test three similarity metrics. The best of these metrics was subsequently employed, together with five other existing methodologies, to conduct an all-to-all comparison of all the known protein-FAD (Flavin-Adenine Dinucleotide) complexes. The results demonstrate that the new methodology best captures the similarities between complexes in terms of protein-ligand contacts. Based on the all-to-all comparison, the protein-FAD complexes were subsequently separated into 237 groups. In the majority of cases, the classification divided the complexes according to their annotated function.
Using a graph-based description of the FAD-binding sites, each group could be further characterized and uniquely described. The study demonstrates that the newly developed methods are superior to the existing ones. The results indicate that both the known protein-protein and the protein-FAD interactions can be classified into a reduced number of types and that in general terms these classifications are consistent with the proteins' functions.
Summary (Tiivistelmä, translated from Finnish): Most protein function takes place through interaction with other molecules. Proteins that take part in similar interactions are likely to function in the same way. The likelihood that two proteins occur in similar interaction settings can be determined by examining their structural similarity. This doctoral thesis deals with the development of methods for comparing protein-protein and protein-ligand interactions and with their application in structure-based classification schemes. Known dimeric protein complexes were studied with a new method based on the MultiMer-align (MM-align) program. The results of the comparison show that the new method outperformed MM-align in a significant share of cases. The results were also used to classify the complexes, yielding 1,761 different protein-protein interaction types. Using a statistical model, the number of protein-protein interaction types occurring in nature was estimated at about 4,000. The statistical model also made it possible to compare the numbers of protein complexes grouped by sequence ("quaternary families") and by structure ("quaternary folds"). Interactions between proteins and small organic ligands were studied with sequence-independent methods. A new method was used to test three similarity metrics. The best of these was used, together with five other known methods, to compare all known protein-FAD (flavin adenine dinucleotide) complexes. With respect to protein-ligand contacts, the new method described the similarity of the complexes better than the other methods. Using the results of the comparison, the protein-FAD complexes were further classified into 237 groups. In most cases, the classification scheme succeeded in dividing the complexes into groups according to their function. The groups could be defined unambiguously by describing the FAD-binding site as a graph. The thesis shows that the methods developed in it are better than previously used methods. The results show that both protein-protein and protein-FAD interactions can be classified into a limited number of interaction types and that, in general, the classification is consistent with the proteins' function.

47

Mengiste, Simachew Abebe. "Computational Approaches to the Degeneration of Brain Networks and Other Complex Networks." Doctoral thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-213729.

Abstract:

Networks are ubiquitous and vary in complexity, configuration, hierarchy and function. Many micro- and macro-scale biological and non-biological interactions define complex systems. Our most sophisticated organ, the brain, accommodates the interaction of its billions of neurons through trillions of synapses and is a good example of a complex system. Network structure has been shown to be key to determining network function. For instance, communities or modules in the network explain functional segregation, and modular interactions reveal functional integration. Moreover, the dynamics of cortical networks have been shown experimentally to be linked to the behavioral states of the animal. The levels of rate and synchrony have been shown to be related to the sleep (inactive) and awake (active) states of animals. The structure of brain networks is not static. New synapses form, and some existing synapses or neurons are lost through neurodegenerative disease, environmental influences, development, learning and other processes. Although there are many studies on the function of brain networks, the changes caused by neuronal and synaptic degeneration have so far received little attention. In fact, there is no known mathematical model of the progressive pattern of synaptic pruning and neurodegeneration. The goal of this dissertation is to develop various models of progressive network degeneration and to analyze their impact on structural and functional features of the networks. To go beyond the commonly used "random network" approach, the "small-world" and "scale-free" network topologies, which have recently been proposed as alternatives, are also considered. The effect of four progressive synaptic pruning strategies on the size of critical sites of brain networks and other complex networks is analyzed. Different measures are used to estimate the levels of population rate, regularity, synchrony and pair-wise correlation of neuronal networks. Our analysis reveals that the network degree, rather than the network topology, strongly affects the mean population activity.

48

Cury, Jean. "Evolutionary genomics of conjugative elements and integrons." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB062/document.

49

Liu, Tsunglin. "Physics and bioinformatics of RNA." Columbus, Ohio: Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1141407392.

50

Bahena, Silvia. "Computational Methods for the structural and dynamical understanding of GPCR-RAMP interactions." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-416790.

Abstract:

Protein-protein interactions dominate all major biological processes in living cells. Recent studies suggest that the surface expression and activity of G protein-coupled receptors (GPCRs), the largest family of receptors in human cells, can be modulated by receptor activity-modifying proteins (RAMPs). Computational tools are essential to complement experimental approaches for understanding the molecular activity of living cells, and molecular dynamics simulations are well suited to providing molecular details of protein function and structure. Classical atom-level molecular modeling of biological systems is limited to small systems and short time scales. Its application is therefore difficult for systems such as protein-protein interactions in the cell-surface membrane. For this reason, coarse-grained (CG) models have become widely used, and they represent an important step in the study of large biomolecular systems. CG models are computationally more efficient because they simplify the protein structure, allowing simulations to reach longer time scales. The aim of this degree project was to determine whether coarse-grained molecular simulations are suitable for understanding the dynamics and structural basis of GPCR-RAMP interactions in a membrane environment. Results indicate that the study of protein-protein interactions using CG models needs further improvement, with a more accurate parameterization that will allow the study of complex systems.
