Dissertations / Theses on the topic 'Translational and applied bioinformatics'

To see the other types of publications on this topic, follow the link: Translational and applied bioinformatics.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Translational and applied bioinformatics.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Siangphoe, Umaporn. "Meta-Analysis of Gene Expression Studies." VCU Scholars Compass, 2015. http://scholarscompass.vcu.edu/etd/4040.

Full text
Abstract:
Combining effect sizes from individual studies using random-effects models is a common approach in the analysis of high-dimensional gene expression data. However, unknown study heterogeneity can arise from inconsistent sample quality and experimental conditions. High heterogeneity of effect sizes can reduce the statistical power of the models. We proposed two new methods for random-effects estimation and measures of model variation and of the strength of study heterogeneity. We then developed a statistical technique to test for the significance of random effects and to identify heterogeneous genes. We also proposed another meta-analytic approach that incorporates informative weights in the random-effects meta-analysis models. We compared the proposed methods with standard and existing meta-analytic techniques in the classical and Bayesian frameworks. We demonstrate our results through a series of simulations and an application to gene expression data from neurodegenerative diseases.
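As background for the approach described above, here is a minimal sketch (illustrative only; the function name and example numbers are invented) of the standard DerSimonian-Laird random-effects combination of per-study effect sizes for a single gene, the baseline model on which the proposed estimators build:

import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects combination of per-study effect sizes for one gene
    (DerSimonian-Laird estimate of the between-study variance tau^2)."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect weights
    theta_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - theta_fe) ** 2)           # Cochran's Q (heterogeneity)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # between-study variance
    w_re = 1.0 / (v + tau2)                       # random-effects weights
    theta_re = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return theta_re, se, tau2, q

# e.g. one gene's standardized effects and variances from four studies (made up):
# theta, se, tau2, q = dersimonian_laird([0.8, 0.5, 1.1, 0.3], [0.04, 0.06, 0.05, 0.08])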
APA, Harvard, Vancouver, ISO, and other styles
2

Podowski, Raf M. "Applied bioinformatics for gene characterization /." Stockholm, 2006. http://diss.kib.ki.se/2006/91-7140-818-5/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Vyas, Hiten. "Information management applied to bioinformatics." Thesis, Loughborough University, 2006. https://dspace.lboro.ac.uk/2134/12906.

Full text
Abstract:
Bioinformatics, the discipline concerned with biological information management, is essential in the post-genome era, where the complexity of data processing allows for contemporaneous multi-level research, including research at the genome, transcriptome, proteome and metabolome levels, and the integration of these -omic studies towards gaining an understanding of biology at the systems level. This research is also having a major impact on disease research and drug discovery, particularly through pharmacogenomics studies. In this study innovative resources have been generated via the use of two case studies. One was of the Research & Development Genetics (RDG) department at AstraZeneca, Alderley Park and the other was of the Pharmacogenomics Group at the Sanger Institute in Cambridge, UK. In the AstraZeneca case study senior scientists were interviewed using semi-structured interviews to determine information behaviour through the study of scientific workflows. Document analysis was used to generate an understanding of the underpinning concepts and formed one of the sources of context-dependent information on which the interview questions were based. The objectives of the Sanger Institute case study were slightly different, as interviews were carried out with eight scientists together with the use of participant observation, to collect data to develop a database standard for one process of their pharmacogenomics workflow. The results indicated that AstraZeneca would benefit from upgrading their data management solutions in the laboratory and from developing resources for the storage of data from larger-scale projects such as whole-genome scans. These studies will also generate very large amounts of data, and the analysis of these will require more sophisticated statistical methods. At the Sanger Institute a minimum information standard was reported for the manual design of primers and included in a decision-making tree developed for Polymerase Chain Reactions (PCRs). This tree also illustrates problems that can be encountered when designing primers, along with procedures that can be taken to address such issues.
APA, Harvard, Vancouver, ISO, and other styles
4

Andrade, Jorge. "Grid and High-Performance Computing for Applied Bioinformatics." Doctoral thesis, Stockholm : Bioteknologi, Kungliga Tekniska högskolan, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4573.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Zheng, Chunfang. "Genome rearrangement algorithms applied to comparative maps." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27313.

Full text
Abstract:
The Hannenhalli-Pevzner algorithm for computing the evolutionary distance between two genomes is very efficient when the genomes are signed and totally ordered. But in real comparative maps, the data suffer from problems such as coarseness, missing data, missing signs (unknown marker orientation), paralogy, order conflicts and mapping noise. In this thesis we have developed a suite of algorithms for genome rearrangement analysis in the presence of noise and incomplete information. For coarseness and missing data, we represent each chromosome as a partial order, summarized by a directed acyclic graph (DAG). We augment each DAG to a directed graph (DG) in which all possible linearizations are embedded. The chromosomal DGs representing two genomes are combined to produce a single bicoloured graph. The major contribution of the thesis is an algorithm for extracting a maximal decomposition of some subgraph into alternating coloured cycles, determining an optimal sequence of rearrangements, and hence the genomic distance. Also based on this framework, we have proposed an algorithm to solve all the above problems of comparative maps simultaneously by adding heuristic preprocessing to the exact algorithm approach. We have applied this to the comparison of maize and sorghum genomic maps in the GRAMENE database. A further contribution treats the inflation of genome distance by high levels of noise due to incorrectly resolved paralogy and error at the mapping, sequencing and alignment levels. We have developed an algorithm to remove the noise by maximizing strips and tested its robustness as noise levels increase.
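For orientation, the sketch below illustrates the classical Hannenhalli-Pevzner setting that the thesis extends: it counts alternating cycles in the breakpoint graph of a fully signed, totally ordered permutation, giving the standard lower bound (n + 1) - cycles on reversal distance. It is not the thesis's DAG-based algorithm for partial orders, and the function name is illustrative.

def breakpoint_graph_cycles(perm):
    """Count alternating cycles in the breakpoint graph of a signed permutation
    measured against the identity. perm is a list of non-zero signed integers,
    e.g. [3, -1, 2]. Reversal distance is at least (len(perm) + 1) - cycles
    (hurdles and fortresses, handled by Hannenhalli-Pevzner, are ignored here)."""
    n = len(perm)
    ext = [0]
    for x in perm:                       # unsigned extension: +x -> (2x-1, 2x), -x -> (2x, 2x-1)
        ext += [2 * x - 1, 2 * x] if x > 0 else [-2 * x, -2 * x - 1]
    ext.append(2 * n + 1)
    black, grey = {}, {}
    for k in range(n + 1):               # reality (black) edges between adjacent elements
        a, b = ext[2 * k], ext[2 * k + 1]
        black[a], black[b] = b, a
    for i in range(n + 1):               # desire (grey) edges toward the identity
        grey[2 * i], grey[2 * i + 1] = 2 * i + 1, 2 * i
    seen, cycles = set(), 0
    for start in range(2 * n + 2):
        if start in seen:
            continue
        cycles += 1
        v, use_black = start, True
        while v not in seen:             # walk one alternating cycle
            seen.add(v)
            v = black[v] if use_black else grey[v]
            use_black = not use_black
    return cycles

# identity [1, 2]: 3 cycles, so the reversal-distance lower bound is (2 + 1) - 3 = 0
assert breakpoint_graph_cycles([1, 2]) == 3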
APA, Harvard, Vancouver, ISO, and other styles
6

Abrams, Zachary. "A Translational Bioinformatics Approach to Parsing and Mapping ISCN Karyotypes: A Computational Cytogenetic Analysis of Chronic Lymphocytic Leukemia (CLL)." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1461078174.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Bentele, Kajetan. "Mechanisms of translational regulation in bacteria." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2013. http://dx.doi.org/10.18452/16839.

Full text
Abstract:
Diese Arbeit untersucht den Zusammenhang zwischen Mechanismen der translationalen Regulation und der Genomorganisation in Bakterien. Der erste Teil der Arbeit analysiert die Beziehung zwischen der Translationseffizienz von Genen und der Häufigkeit bestimmter Codons am Genanfang. Es ist bekannt, dass die Häufigkeitsverteilung der Codons am Anfang der Gene bei einigen Organismen eine andere ist als sonst im Genom. Durch die systematische Analyse von ungefähr 400 bakteriellen Genomen, evolutionären Simulationen und experimentellen Untersuchungen sind wir zu dem Schluss gekommen, dass die beobachtete Abweichung der Codonhäufigkeiten wohl eine Konsequenz der Notwendigkeit ist, RNA Sekundärstruktur in der Nähe des Translationsstarts zu vermeiden und somit eine effiziente Initiation der Translation zu gewährleisten. Im zweiten Teil der Arbeit untersuchen wir den Einfluss der Genreihenfolge innerhalb eines Operons auf die Fitness von E. coli. In bakteriellen Genomen vereint ein Operon funktionell zusammengehörige Gene, die in einer mRNA zusammen transkribiert werden und somit in der Expression stark korreliert sind. Daneben kann die translationale Kopplung, d. h. die Interdependenz der Translationseffizienz zwischen benachbarten Genen innerhalb einer solchen mRNA, eine bestimmte Proteinstöchiometrie weiter stabilisieren. Mithilfe eines Modells für die translationale Kopplung sowie für den Chemotaxis Signalweg konnten wir zeigen, dass die native Genreihenfolge eine der Permutationen ist, die am meisten zur Robustheit der Chemotaxis beitragen. Die translationale Kopplung ist daher ein wichtiger Faktor, der die Anordnung der Gene innerhalb des Chemotaxis Operon bestimmt. Diese Arbeit zeigt, dass die Anforderungen einer effizienten Genexpression sowie die Robustheit wichtiger zellulärer Funktionen einen Einfluss auf die Organisation eines Genoms haben können: einerseits bei der Wahl der Codons am Anfang der Gene, andererseits auf die Ordnung der Gene innerhalb eines Operons.
This work investigates the relationship between mechanisms of translational regulation and genome organization in bacteria. The first part analyzes the connection between translational efficiency and codon usage at the beginning of genes. It is known for some organisms that usage of synonymous codons at the gene start deviates from the codon usage elsewhere in the genome. Through systematic analysis of about 400 bacterial genomes, evolutionary simulations and experimental investigations, we conclude that the observed deviation of codon usage at the beginning of genes is most likely a consequence of the need to suppress mRNA structure around the ribosome binding site, thereby allowing efficient initiation of translation. We investigate further driving forces of genome organization by studying the impact of gene order within an operon on the fitness of bacterial cells. Operons group functionally related genes which are transcribed together as single mRNAs in E. coli and other bacteria. Correlation of protein levels is thus to a large extent attributed to this coupling on the transcriptional level. In addition, translational coupling, i.e. the interdependence of translational efficiency between neighboring genes within such an mRNA, can stabilize a desired stoichiometry between proteins. Here, we study the role of translational coupling in the robustness of E. coli chemotaxis. By employing a model of translational coupling and simulating the underlying signal transduction network, we show that the native gene order ranks among the permutations contributing most to the robustness of chemotaxis. We therefore conclude that translational coupling is an important determinant of the gene order within the chemotaxis operon. Both these findings show that requirements for efficient gene expression and robustness of cellular function have a pronounced impact on genomic organization, influencing the local codon usage at the beginning of genes and the order of genes within operons.
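The codon-usage deviation at gene starts described in the abstract can be quantified with a short script; the sketch below (input format assumed for illustration, not taken from the thesis) compares codon frequencies in the first ten codons of each coding sequence with frequencies in the remainder of the genes.

from collections import Counter

def start_vs_bulk_codon_usage(cds_list, head_codons=10):
    """Compare codon frequencies near gene starts with the rest of the genes.
    cds_list: iterable of in-frame coding sequences (length divisible by 3)."""
    head, bulk = Counter(), Counter()
    for cds in cds_list:
        codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
        head.update(codons[1:head_codons])   # skip the start codon itself
        bulk.update(codons[head_codons:])
    h_tot, b_tot = sum(head.values()), sum(bulk.values())
    # ratio > 1: codon enriched near the start relative to the bulk of the genome
    return {c: (head[c] / h_tot) / (bulk[c] / b_tot)
            for c in bulk if head[c] and bulk[c]}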
APA, Harvard, Vancouver, ISO, and other styles
8

Moreno, Cabrera José Marcos. "A translational bioinformatics approach to improve genetic diagnostics of hereditary cancer using next-generation sequencing data." Doctoral thesis, Universitat de Barcelona, 2021. http://hdl.handle.net/10803/672364.

Full text
Abstract:
This PhD thesis has been carried out with the aim of improving, using a bioinformatics-based approach, the genetic diagnostics of hereditary cancer. More specifically, the aims were: 1. To perform a comprehensive evaluation of tools suitable for detecting CNVs from NGS panel data at single-exon resolution. 2. To select the best candidate tool to implement in the genetic diagnostics pipeline of the ICO-IGTP program on hereditary cancer. 3. After implementing it, to evaluate the impact of including the selected NGS CNV detection tool as a first-tier screening step prior to MLPA validation. 4. To develop a tool to identify false positives produced by germline NGS CNV detection tools. 5. To develop a web-based tool to support the entire diagnostic process during the laboratory routine.
APA, Harvard, Vancouver, ISO, and other styles
9

Chou, Hsin-Jung. "Transcriptome-Wide Analysis of Roles for Transfer RNA Modifications in Translational Regulation." eScholarship@UMMS, 2017. https://escholarship.umassmed.edu/gsbs_diss/943.

Full text
Abstract:
Covalent nucleotide modifications in RNAs affect numerous biological processes, and novel functions are continually being revealed even for well-known modifications. Among all RNA species, transfer RNAs (tRNAs) are highly enriched with diverse modifications, which are known to play roles in decoding and in tRNA stability, charging, and cellular trafficking. However, studies of tRNA modifications have been limited in scale and performed by groups with different methodologies. To systematically compare the functions of a large set of noncoding RNA modifications in translational regulation, I carried out ribosome profiling in 57 budding yeast mutants lacking nonessential genes involved in tRNA modification. Deletion mutants lacking enzymes known to modify the anticodon loop or non-tRNA substrates such as rRNA exhibited the most dramatic translational perturbations, including altered dwell time of ribosomes on relevant codons and altered ribosome density in protein-coding regions or untranslated regions of specific genes. Several mutants that result in loss of tRNA modifications in locations away from the anticodon loop also exhibited altered dwell time of ribosomes on relevant codons. Translational upregulation of the nutrient-responsive transcription factor Gcn4 was observed in roughly half of the mutants, consistent with previous studies of Gcn4 in response to numerous tRNA perturbations. This work also discovered unexpected roles for tRNA-modifying enzymes in rRNA 2’-O-methylation and in transcriptional regulation of Ty retroelements. Taken together, this work reveals the importance and novel functions of tRNA modifications and provides a rich resource for discovery of additional links between tRNA modifications and gene regulation.
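As a rough illustration of the dwell-time analysis mentioned above, the sketch below summarizes per-codon ribosome occupancy for a single gene, assuming footprint counts have already been assigned to codon positions; the function and input names are hypothetical rather than the thesis's pipeline.

import numpy as np
from collections import defaultdict

def codon_occupancy(cds_seq, codon_counts):
    """Relative ribosome occupancy per codon for one gene.
    codon_counts[i] = footprints whose A site maps to codon i of cds_seq
    (lengths assumed consistent); values are normalized by the gene's mean
    coverage so that genes of different expression levels can be pooled."""
    counts = np.asarray(codon_counts, dtype=float)
    norm = counts / max(counts.mean(), 1e-9)
    per_codon = defaultdict(list)
    for i, occ in enumerate(norm):
        per_codon[cds_seq[3 * i:3 * i + 3]].append(occ)
    return {codon: float(np.mean(v)) for codon, v in per_codon.items()}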
APA, Harvard, Vancouver, ISO, and other styles
10

Raiford, Douglas W. III. "Algorithmic Techniques Employed in the Isolation of Codon Usage Biases in Prokaryotic Genomes." Wright State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=wright1211902424.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Clarke, Declan. "Leveraging Mathematical Models to Predict Allosteric Hotspots in the Age of Deep Sequencing." Thesis, Yale University, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10160849.

Full text
Abstract:

A mathematical model is an abstraction that distills quantifiable behaviors and properties into a well-defined formalism in order to learn or predict something about a system. Such models may be as light as pencil-and-paper calculations on the back of an envelope or heavy enough to require modern supercomputers. They may be as simple as predicting the trajectory of a baseball or as complex as forecasting the weather. By using macromolecular protein structures as substrates, the objective of this thesis is to improve upon and leverage mathematical models in order to address what is both a growing challenge and a burgeoning opportunity in the age of next-generation sequencing. The rapidly growing volume of data being produced by emerging deep sequencing technologies is enabling more in-depth analyses of protein conservation than previously possible. Increasingly, deep sequencing is bringing to light many disease-associated loci and localized signatures of strong conservation. These signatures in sequence space are the "shadows" of selective pressures that have been acting on proteins over the course of many years. However, despite the rapidly growing abundance of available data on such signatures, as well as the finer resolution with which they may be detected, an intuitive biophysical or functional rationale behind such genomic shadows is often missing (such intuition may otherwise be provided, for instance, by the need to engage in protein-protein interactions, undergo post-translational modification, or achieve a close-packed hydrophobic core). Allostery may frequently provide the missing conceptual link. Allosteric mechanisms act through changes in the dynamic behavior of protein architectures. Because selective evolutionary pressures often act through processes that are intrinsically dynamic in nature, static renderings can fail to provide any plausible rationale for constraint. In the work outlined here, models of protein conformational change are used to predict allosteric residues that either a) act as essential cavities on the protein surface which serve as sources or sinks in allosteric communication; or b) function as important information flow bottlenecks within the allosteric communication pathways of the protein interior. Though most existing approaches entail computationally expensive methods (such as MD) or rely on less direct measures (such as sequence features), the framework discussed herein is simultaneously both computationally tractable and fundamentally structural in nature: conformational change and topology are directly included in the search for allosteric residues, thereby enabling allosteric site prediction across the Protein Data Bank. Large-scale (i.e., general) properties of the predicted allosteric residues are then evaluated with respect to conservation. Multiple threads of evidence (using different sources of data and employing a variety of metrics) are used to demonstrate that the predicted allosteric residues tend to be significantly conserved across diverse evolutionary time scales. In addition, specific examples in which these residues can help to explain previously poorly understood disease-associated variants are discussed. Finally, a practical and computationally rapid software tool that enables users to perform this analysis on their own proteins of interest has been made available to the scientific public.
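As a simplified illustration of the "information flow bottleneck" idea, the sketch below ranks residues by betweenness centrality in a C-alpha contact network. The thesis's method is based on models of conformational change rather than this generic static-graph stand-in, and the inputs here are hypothetical.

import numpy as np
import networkx as nx

def contact_network_bottlenecks(ca_coords, cutoff=8.0, top_n=10):
    """Rank residues by betweenness centrality in a C-alpha contact graph.
    ca_coords: dict {residue_id: (x, y, z)}; an edge joins two residues whose
    C-alpha atoms lie within `cutoff` angstroms. High-betweenness residues are
    candidate communication bottlenecks."""
    graph = nx.Graph()
    ids = list(ca_coords)
    xyz = np.array([ca_coords[r] for r in ids])
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if np.linalg.norm(xyz[i] - xyz[j]) <= cutoff:
                graph.add_edge(ids[i], ids[j])
    centrality = nx.betweenness_centrality(graph)
    return sorted(centrality, key=centrality.get, reverse=True)[:top_n]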

APA, Harvard, Vancouver, ISO, and other styles
12

Szekeres, Ferenc. "Bioinformatics applied to chlorophyll a/b binding proteins in Avena sativa (oat)." Thesis, University of Skövde, Department of Computer Science, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-820.

Full text
Abstract:

The chlorophyll a/b binding (CAB) genes play a very central role in all photosynthetic systems and are, for Avena sativa (oat), totally unexplored. This dissertation investigates a large number of EST sequences and characterises the CAB genes in oat, drawing on the evolutionary background of oat and on comparison with a reference organism and similar species.

APA, Harvard, Vancouver, ISO, and other styles
13

Ahmed, Ashraf. "Investigation of immunity related genes in a disease host using applied bioinformatics." Thesis, University of Sheffield, 2017. http://etheses.whiterose.ac.uk/18247/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Rho, Mina. "Probabilistic models in computational molecular biology applied to the identification of mobile genetic elements and gene finding." [Bloomington, Ind.] : Indiana University, 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3386714.

Full text
Abstract:
Thesis (Ph.D.)--Indiana University, School of Informatics and Computing, 2009.
Title from PDF t.p. (viewed on Jul 22, 2010). Source: Dissertation Abstracts International, Volume: 70-12, Section: B, page: 7299. Adviser: Haixu Tang.
APA, Harvard, Vancouver, ISO, and other styles
15

Diboun, I. "Bioinformatics protocols for analysis of functional genomics data applied to neuropathy microarray datasets." Thesis, University College London (University of London), 2010. http://discovery.ucl.ac.uk/19298/.

Full text
Abstract:
Microarray technology allows the simultaneous measurement of the abundance of thousands of transcripts in living cells. The high-throughput nature of microarray technology means that automatic analytical procedures are required to handle the sheer amount of data typically generated in a single microarray experiment. Along these lines, this work presents a contribution to the automatic analysis of microarray data by attempting to construct protocols for the validation of publicly available methods for microarray analysis. At the experimental level, an evaluation of amplification of RNA targets prior to hybridisation with the physical array was undertaken. This had the important consequence of revealing the extent to which the significance of intensity ratios between varying biological conditions may be compromised following amplification, as well as identifying the underlying cause of this effect. On the basis of these findings, recommendations regarding the usability of RNA amplification protocols with microarray screening were drawn in the context of varying microarray experimental conditions. On the data analysis side, this work has had the important outcome of developing an automatic framework for the validation of functional analysis methods for microarray data. This is based on using a GO semantic similarity scoring metric to assess the similarity between functional terms found enriched by functional analysis of a model dataset and those anticipated from prior knowledge of the biological phenomenon under study. Using such a validation system, this work has shown, for the first time, that 'Catmap', an early functional analysis method, performs better than the more recent and most popular methods of its kind. Crucially, the effectiveness of this validation system implies that it may be reliably adopted for validation of newly developed functional analysis methods for microarray data.
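For readers unfamiliar with GO semantic similarity scoring, here is a minimal Resnik-style sketch over a toy ontology; the abstract does not specify which similarity metric was used, so this is only one common choice, with invented terms and annotation frequencies.

import math

def go_ancestors(term, parents):
    """All ancestors of a GO term (including the term itself) in a parent map
    {term: [parent terms]} describing the GO DAG."""
    seen, stack = {term}, [term]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def resnik_similarity(t1, t2, parents, ic):
    """Resnik similarity: information content of the most informative common
    ancestor, where ic[t] = -log(annotation frequency of term t)."""
    common = go_ancestors(t1, parents) & go_ancestors(t2, parents)
    return max((ic[t] for t in common), default=0.0)

# toy ontology with hypothetical terms and annotation frequencies
parents = {"GO:B": ["GO:A"], "GO:C": ["GO:A"], "GO:D": ["GO:B", "GO:C"]}
ic = {t: -math.log(f) for t, f in
      {"GO:A": 1.0, "GO:B": 0.4, "GO:C": 0.5, "GO:D": 0.1}.items()}
print(resnik_similarity("GO:D", "GO:B", parents, ic))   # equals ic["GO:B"]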
APA, Harvard, Vancouver, ISO, and other styles
16

Davies, Robert William. "On the Generation of a Classification Algorithm from DNA Based Microarray Studies." Thesis, University of Ottawa (Canada), 2010. http://hdl.handle.net/10393/28583.

Full text
Abstract:
The purpose of this thesis is to build a classification algorithm using a Genome Wide Association (GWA) study. Briefly, a GWA is a case-control study using genotypes derived from DNA microarrays for thousands of people. These microarrays are able to acquire the genotypes of hundreds of thousands of Single Nucleotide Polymorphisms (SNPs) for a person at a time. In this thesis, we first describe the processes necessary to prepare the data for analysis. Next, we introduce the Naive Bayes classification algorithm and a modification so that effects of a SNP on the disease of interest are weighted by a Bayesian posterior probability of association. This thesis then uses the data from three coronary artery disease GWAs, one as a training set and two as test sets, to build and test the classifier. Finally, this thesis discusses the relevance of the results and the generalizability of this method to future studies.
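A minimal sketch of the kind of classifier described above: a naive Bayes log-odds score in which each SNP's contribution is weighted by its posterior probability of association. The input layout is invented for illustration and is not the thesis's exact implementation.

import numpy as np

def weighted_nb_log_odds(genotypes, p_case, p_ctrl, assoc_post, prior_log_odds=0.0):
    """Naive Bayes log-odds of disease for one individual, with each SNP's
    contribution weighted by its posterior probability of association.
    genotypes[j] in {0, 1, 2}; p_case[j][g] and p_ctrl[j][g] are the genotype
    frequencies among cases and controls for SNP j (pre-computed inputs)."""
    score = prior_log_odds
    for j, g in enumerate(genotypes):
        log_lr = np.log(p_case[j][g]) - np.log(p_ctrl[j][g])
        score += assoc_post[j] * log_lr
    return score   # positive values favour "case" under the naive model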
APA, Harvard, Vancouver, ISO, and other styles
17

Huseby, Carol. "Molecular Neuropathology in Alzheimer's Disease." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1543314678552794.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Dabdoub, Shareef Majed. "Applied Visual Analytics in Molecular, Cellular, and Microbiology." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1322602183.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Kierczak, Marcin. "From Physicochemical Features to Interdependency Networks : A Monte Carlo Approach to Modeling HIV-1 Resistome and Post-translational Modifications." Doctoral thesis, Uppsala universitet, Centrum för bioinformatik, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-109873.

Full text
Abstract:
The availability of new technologies has supplied life scientists with large amounts of experimental data. The data sets are large not only in terms of the number of observations, but also in terms of the number of recorded features. One of the aims of modeling is to explain a given phenomenon in the simplest possible way, hence the need for selection of suitable features. We extended a Monte Carlo-based approach to selecting statistically significant features with discovery of feature interdependencies and used it in modeling sequence-function relationships in proteins. Our approach led to compact and easy-to-interpret predictive models. First, we represented protein sequences in terms of their physicochemical properties. This was followed by our feature selection and discovery of feature interdependencies. Finally, predictive models based on, e.g., decision trees or rough sets were constructed. We applied the method to model two important biological problems: 1) HIV-1 resistance to reverse transcriptase-targeted drugs and 2) post-translational modifications of proteins. In the case of HIV resistance, we were not only able to predict whether the mutated protein is resistant to a drug or not, but we also suggested some new, previously neglected mutations that possibly contribute to drug resistance. For all these mutations we proposed probable molecular mechanisms of action using literature and 3D structure studies. In the case of predicting PTMs, we built high-accuracy models of modifications. In comparison to other methods, we were able to resolve whether the closest neighborhood of a residue (the nanomer) is sufficient to determine its modification status. Importantly, the application of our method yields networks of interdependent physicochemical properties of amino acids that show how these properties collaborate in establishing a given modification. We believe that the presented methods will help researchers to analyze a large class of important biological problems and will guide them in their research.
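The Monte Carlo flavour of the feature selection can be illustrated with a simple permutation-based scheme; the sketch below is a generic stand-in (using shallow decision trees from scikit-learn), not the specific procedure or interdependency discovery developed in the thesis.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def mc_feature_scores(X, y, n_iter=500, subset_frac=0.5, seed=0):
    """Monte Carlo feature scoring: train many shallow decision trees on random
    feature subsets and accumulate importances; repeating the procedure with
    permuted labels gives a per-feature null distribution to judge significance.
    X: (samples, features) array of physicochemical descriptors; y: class labels."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    scores, null = np.zeros(p), np.zeros(p)
    k = max(1, int(subset_frac * p))
    for _ in range(n_iter):
        cols = rng.choice(p, size=k, replace=False)
        scores[cols] += DecisionTreeClassifier(max_depth=3).fit(X[:, cols], y).feature_importances_
        null[cols] += DecisionTreeClassifier(max_depth=3).fit(X[:, cols], rng.permutation(y)).feature_importances_
    return scores, null   # compare each feature's score with its null counterpart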
APA, Harvard, Vancouver, ISO, and other styles
20

Stokes, Todd Hamilton. "Development of a visualization and information management platform in translational biomedical informatics." Diss., Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/33967.

Full text
Abstract:
Translational Biomedical Informatics (TBMI) is an emerging discipline expanding beyond traditional bioinformatics, with a focus on developing computational technologies for real-world biomedical practice. The goal of my Ph.D. research is to address a few key challenges in TBMI, including: (1) the high quality and reproducibility required by medical applications when processing high-throughput data, (2) the need for knowledge management solutions that allow molecular data to be handled and evaluated by researchers, regulators, and doctors collectively, (3) the need for near real-time, efficient access to decision-oriented visualizations of integrated data and data processing results, and (4) the need for an integrated solution that can evolve as medical consensus evolves, without requiring retraining, overhaul or replacement. This dissertation resulted in the development and adoption of concrete web-based application deliverables in regular use by bioinformaticians, clinicians, biologists and nanotechnologists. These include the Chip Artifact Correction (caCORRECT) web site and grid services, the ArrayWiki community microarray repository, and the SimpleVisGrid visualization grid services (including eGOMiner, nanoDRIVE, PathwayVis and SphingoVisGrid).
APA, Harvard, Vancouver, ISO, and other styles
21

Gupta, Manish. "Complexity Reduction for Near Real-Time High Dimensional Filtering and Estimation Applied to Biological Signals." Thesis, Harvard University, 2016. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493389.

Full text
Abstract:
Real-time processing of physiological signals collected from wearable sensors, performed with low computational power, is a requirement for continuous health monitoring. Such processing involves identifying an underlying physiological state x from a measured biomedical signal y, the two being related stochastically: y = f(x; e) (here e is a random variable). Often the state space of x is large, and the dimensionality of y is low: if y has dimension N and S is the state space of x, then |S| >> N, since the purpose is to infer a complex physiological state from minimal measurements. This makes real-time inference a challenging task. We present algorithms that address this problem by using lower dimensional approximations of the state. Our algorithms are based on two techniques often used for state dimensionality reduction: (a) decomposition, where variables can be grouped into smaller sets, and (b) factorization, where variables can be factored into smaller sets. The algorithms are computationally inexpensive and permit online application. We demonstrate their use in dimensionality reduction by successfully solving two real complex problems in medicine and public safety. Motivated originally by the problem of predicting cognitive fatigue state from EEG (Chapter 1), we developed the Correlated Sparse Signal Recovery (CSSR) algorithm and successfully applied it to the problem of elimination of blink artifacts in EEG from awake subjects (Chapter 2). Finding the decomposition x = x1 + x2 into a low dimensional representation of the artifact signal x1 is a non-trivial problem, and currently there are no online real-time methods that accurately solve the problem for small N (the dimensionality of y). By using a skew-Gaussian dictionary and a novel method to represent group statistical structure, CSSR is able to identify and remove blink artifacts even from few (e.g. 4-6) channels of EEG recordings in near real-time. The method uses a Bayesian framework. It results in more effective decomposition, as measured by spectral and entropy properties of the decomposed signals, compared to some state-of-the-art artifact subtraction and structured sparse recovery methods. CSSR is novel in structured sparsity: unlike existing group sparse methods (such as block sparse recovery) it does not rely on the assumption of a common sparsity profile. It is also a novel EEG denoising method: unlike state-of-the-art artifact removal techniques such as independent components analysis, it does not require manual intervention, long recordings or high density (e.g. 32 or more channels) recordings. Potentially this method of denoising is of tremendous utility to the medical community, since EEG artifact removal is usually done manually, which is a lengthy, tedious process requiring trained technicians and often making entire epochs of data unusable. Identification of the artifact in itself can be used to determine some relevant physiological state from the artifact properties (for example, blink duration and frequency can be used as a marker of fatigue). A potential application of CSSR is to determine whether a structurally decomposed cortical EEG (i.e. non-spectral) representation can instead be used for fatigue prediction. A new EM-based active learning algorithm for ensemble classification is presented in Chapter 3 and applied to the problem of detection of artifactual epochs based upon several criteria, including the sparse features obtained from CSSR.
The algorithm offers higher accuracy than existing ensemble methods for unsupervised learning, such as similarity- and graph-based ensemble clustering, as well as higher accuracy and lower computational complexity than several active learning methods such as Query-by-Committee and Importance-Weighted Active Learning, when tested on data comprising noisy Gaussian mixtures. In one case we were able to identify artifacts with approximately 98% accuracy based upon 31-dimensional data from 700,000 epochs in a matter of seconds on a personal laptop, using less than 10% active labels. This is to be compared to a maximum of 94% from other methods. As far as we know, active learning for ensemble-based classification has not previously been applied to biomedical signal classification, including artifact detection; it can also be applied to other medical areas, including classification of polysomnographic signals into sleep stages. Algorithms based upon state-space factorization in the case where there is unidirectional dependence amongst the dynamics of groups of variables (the "Cascade Markov Model") are presented in Chapter 4. An algorithm for estimating the factored state from observations, where the dynamics follow a Markov model, is developed using EM (i.e. a version of the Baum-Welch algorithm on factored state spaces) and applied to real-time human gait and fall detection. The application of factored HMMs to gait and fall detection is novel; falls in the elderly are a major safety issue. Results from the algorithm show higher fall detection accuracy (95%) than that achieved with PCA-based estimation (70%). In this chapter, a new algorithm for optimal control on factored Markov decision processes is derived. The algorithm, in the form of decoupled matrix differential equations, (i) is computationally efficient, requiring solution of a one-point instead of a two-point boundary value problem, and (ii) obviates the "curse of dimensionality" inherent in HJB equations, thereby facilitating real-time solution. The algorithm may have application to medicine, such as finding optimal schedules of light exposure for correction of circadian misalignment and optimal schedules for drug intervention in patients. The thesis demonstrates the development of new methods for complexity reduction in high-dimensional systems and shows that their application solves some problems in medicine and public safety more efficiently than state-of-the-art methods.
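For intuition only, the decomposition of a recorded channel into an artifact estimate plus a cleaned residual can be sketched with an ordinary Lasso fit against a dictionary of candidate artifact shapes; this is a deliberately simplified stand-in for the structured, skew-Gaussian-dictionary CSSR method described above, and the function and variable names are invented.

import numpy as np
from sklearn.linear_model import Lasso

def sparse_artifact_split(y, dictionary, alpha=0.1):
    """Split a 1-D signal y into an artifact estimate plus a cleaned residual by
    fitting a sparse set of dictionary coefficients with a plain Lasso.
    dictionary: array of shape (len(y), n_atoms); its columns are candidate
    artifact waveforms (a placeholder for a proper artifact dictionary)."""
    model = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    model.fit(dictionary, y)
    artifact = dictionary @ model.coef_
    return artifact, y - artifact   # (x1: artifact estimate, x2: cleaned signal)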
Engineering and Applied Sciences - Applied Math
APA, Harvard, Vancouver, ISO, and other styles
22

Liu, Shaolin 1968. "Oligonucleotides applied in genomics, bioinformatics and development of molecular markers for rice and barley." Thesis, McGill University, 2004. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=85569.

Full text
Abstract:
A genome sequence can be conceptualized as a 'book' written with four nucleotide 'letters' in oligonucleotide (oligo) 'words'. These words can be used in genomics, bioinformatics and the development of molecular markers. The whole-genome sequence for rice (Oryza sativa L.) is almost finished and has been assembled into pseudomolecules. For barley (Hordeum vulgare L.), expressed sequence tags (ESTs) have been assembled into 21,981 tentative consensus sequences (TCs). The availability of such sequence information provides opportunities to investigate oligo usage within and between genomes. For the first of three studies reported in this thesis, a C++ program was written to automatically design oligos that are conserved between two sets of sequence information. In silico mapping between rice coding sequences (CDS) and barley TCs indicated that oligos between 18 and 24 bp provide good specificity and sensitivity (83% and 86%, respectively, for 20mers). Conserved oligos used as PCR primers had a high (91%) success rate on barley lines. Sequencing of PCR products revealed conservation in exon sequence, size and order between barley and rice. Introns were not conserved in sequence but were relatively stable in size. Map locations of eight new markers in barley revealed both genome colinearity and rearrangements between barley and rice. The second study reported in this thesis examined word frequency within the rice genome. A non-random landscape composed of high-frequency and low-frequency zones was observed. Interestingly, high-frequency words seemed to be rice specific while single-copy words were gene specific and conserved across species. As in the first study, oligos of 12 bp or less were not specific, and 18 bp seemed to be a critical length for the specificity of oligos. The third study reported in this thesis involved the development of molecular markers for known genes using public sequence information. Six new polymorphic markers were d
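The 'word' counting at the heart of the second study reduces to tallying oligo frequencies; the short sketch below counts all overlapping oligos of a chosen length in a genome sequence and pulls out the single-copy words (illustrative only, not the thesis's C++ program).

from collections import Counter

def oligo_frequencies(genome_seq, k=18):
    """Count every overlapping oligo ('word') of length k in a genome sequence
    and report the words that occur exactly once (single-copy words)."""
    seq = genome_seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    single_copy = [w for w, c in counts.items() if c == 1]
    return counts, single_copy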
APA, Harvard, Vancouver, ISO, and other styles
23

Garcia, Krystine. "Bioinformatics Pipeline for Improving Identification of Modified Proteins by Neutral Loss Peak Filtering." Ohio University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1440157843.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Piñero, González Janet 1977. "Computational approaches and resources to support translational research in human diseases." Doctoral thesis, Universitat Pompeu Fabra, 2015. http://hdl.handle.net/10803/328417.

Full text
Abstract:
In the last two decades, the volume and variety of biomedical data has dramatically increased. The data is heterogeneous and scattered across many resources. This produces bottlenecks in the analysis and extraction of knowledge from this sea of information. To overcome this hurdle, better catalogs that integrate different data types, offer easy access to users, and support automatic workflows, are needed. With this in mind, we have developed DisGeNET, a discovery platform that contains information on more than 17,000 genes related to over 14,000 diseases. We have used DisGeNET to study the properties of disease genes in the context of protein interaction networks. To produce an accurate analysis of the mesoscale properties of the human interactome, we first compared the network partitions generated by two popular clustering algorithms, to assess how this would impact the follow-up biological analysis. Using the best performing algorithm we then explored the network properties of disease genes. Then we evaluated the relationship between the network properties of different groups of disease genes and their tolerance to likely deleterious germline variants across human populations. Finally, we have developed a new network medicine approach to study disease comorbidities, and applied it to the analysis of COPD comorbidities.
Los avances tecnológicos de las últimas dos décadas han producido un incremento dramático en la cantidad y la diversidad de datos biomédicos disponibles. Este proceso ha ocurrido de manera fragmentada, y en consecuencia los datos se encuentran almacenados en distintos repositorios, lo cual impone barreras a la hora de integrarlos, analizarlos y extraer conocimiento a partir de ellos. Para superar estas barreras, es necesario contar con recursos computacionales que integren esta información, y ofrezcan un fácil acceso a la misma, permitiendo al mismo tiempo su análisis automatizado. En respuesta a esta necesidad hemos desarrollado DisGeNET, una plataforma orientada a la exploración de las causas genéticas de las enfermedades humanas, que contiene actualmente información sobre más de 14.000 enfermedades y 17.000 genes. En esta tesis, describimos el uso de DisGeNET para el estudio de las propiedades de los genes asociados a enfermedades en el contexto de redes de interacción entre proteínas. Para ello, evaluamos previamente cómo la utilización de distintos algoritmos de reconocimiento de comunidades en redes afecta a los resultados de los análisis e influencia su interpretación biológica. A continuación, caracterizamos las propiedades de redes de los genes asociados a enfermedades como conjunto y también en sub-grupos, empleando diferentes criterios de clasificaciones de las enfermedades. Posteriormente, evaluamos cómo estas propiedades están relacionadas con la tolerancia a mutaciones posiblemente deletéreas en distintos grupos de genes, mediante el análisis de datos generados por las nuevas tecnologías de secuenciación. Finalmente, desarrollamos una nueva metodología de medicina de sistemas para explorar los mecanismos moleculares de la comorbilidades, y la aplicamos al estudio de las comorbilidades de la enfermedad pulmonar obstructiva crónica
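A minimal sketch of the kind of network-property calculation described in the abstract, assuming a protein interaction network held in an undirected networkx Graph and a set of disease-gene identifiers (both hypothetical inputs, not DisGeNET itself):

import networkx as nx

def disease_gene_network_properties(ppi_graph, disease_genes):
    """Basic topological properties of a disease-gene set inside a protein
    interaction network: node degrees, clustering coefficients, and the size of
    the largest connected component induced by the disease genes."""
    genes = [g for g in disease_genes if g in ppi_graph]
    degrees = {g: ppi_graph.degree(g) for g in genes}
    clustering = nx.clustering(ppi_graph, nodes=genes)
    induced = ppi_graph.subgraph(genes)
    largest_cc = max((len(c) for c in nx.connected_components(induced)), default=0)
    return degrees, clustering, largest_cc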
APA, Harvard, Vancouver, ISO, and other styles
25

Niklasson, Markus. "Coding to cure : NMR and thermodynamic software applied to congenital heart disease research." Doctoral thesis, Linköpings universitet, Kemi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-142785.

Full text
Abstract:
Regardless of scientific field, computers have become pivotal tools for data analysis, and the field of structural biology is no exception. Here, computers are the main tools used for tasks including structural calculations of proteins, spectral analysis of nuclear magnetic resonance (NMR) spectroscopy data and fitting mathematical models to data. As results reported in papers heavily rely on software and scripts, it is of key importance that the employed computational methods are robust and yield reliable results. However, as many scientific fields are niched and possess a small potential user base, the task of developing the necessary software often falls on the researchers themselves. This can cause divergence when comparing data analyzed by different measures or by using subpar methods. Therein lies the importance of development of accurate computational methods that can be employed by the scientific community. The main theme of this thesis is software development applied to structural biology, with the purpose of aiding research in this scientific field by speeding up the process of data analysis as well as ensuring that acquired data is properly analyzed. Among the original results of this thesis are three user-friendly software tools: COMPASS - resonance assignment software for NMR spectroscopy data, capable of analyzing chemical shifts and providing the user with suggestions for potential resonance assignments based on a meticulous database comparison. CDpal - curve-fitting software used to fit thermal and chemical denaturation data of proteins acquired by circular dichroism (CD) spectroscopy or fluorescence spectroscopy. PINT - line-shape fitting and downstream analysis software for NMR spectroscopy data, designed with the important purpose of easily and accurately fitting peaks in NMR spectra and extracting parameters such as relaxation rates, intensities and volumes of peaks. This thesis also describes a study performed on variants of the life-essential regulatory protein calmodulin that have been associated with the congenital life-threatening heart disease long QT syndrome (LQTS). The study provided novel insights, revealing that all variants are distinct from the wild type with regard to structure and dynamics at a detailed level; the presented results are useful for the interpretation of results from protein interaction studies. The underlying research of this paper makes use of all three developed software tools, which validates that all developed methods fulfil a scientific purpose and are capable of producing solid results.
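To illustrate the type of fit CDpal performs, here is a sketch of a two-state thermal unfolding model with linear baselines fitted via scipy.optimize.curve_fit; the parametrization shown is the standard van't Hoff form with illustrative starting values, and is not claimed to be CDpal's exact model.

import numpy as np
from scipy.optimize import curve_fit

R = 8.314e-3   # gas constant in kJ / (mol K)

def two_state_thermal(T, Tm, dHm, yf, mf, yu, mu):
    """Two-state thermal unfolding with linear folded/unfolded baselines.
    T in kelvin; Tm is the midpoint; dHm is the van't Hoff enthalpy at Tm (kJ/mol)."""
    dG = dHm * (1.0 - T / Tm)           # delta-Cp term omitted for simplicity
    K = np.exp(-dG / (R * T))           # unfolding equilibrium constant
    fu = K / (1.0 + K)                  # fraction unfolded
    return (yf + mf * T) * (1.0 - fu) + (yu + mu * T) * fu

# hypothetical CD melting data (temps in K, ellipticity in arbitrary units):
# popt, pcov = curve_fit(two_state_thermal, temps, signal,
#                        p0=(330.0, 300.0, -20.0, 0.0, -2.0, 0.0))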
APA, Harvard, Vancouver, ISO, and other styles
26

Mooney, Alex M. "The Influence of DNA Sequence and Post Translational Modifications on Nucleosome Positioning and Stability." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1354733493.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

ZANIN, João Luiz Baldim. "Bioinformatics applied to natural products discovery processes: systematization, biosynthetic evidences, and isolation of promising species." Universidade Federal de Alfenas, 2016. https://bdtd.unifal-mg.edu.br:8443/handle/tede/983.

Full text
Abstract:
Estratégias guiadas por genoma foram utilizadas a fim de examinar o potencial biossintético de microorganismos da classe Betaproteobacteria no âmbito de Produtos Naturais. Uma estratégia capaz de ser expandida para todos os tipos de microorganismos foi criada para estimar as reações enzimáticas das Peptideo Sintetases Não Ribossomais a fim de sistematizar a e analisar suas similaridades biossintéticas. Todas as bases de dados e software user-friendly foram adotadas a fim de tornar esta estratégia simples e mais abrangente. Elas foram NCBI, KEGG, NORINE, antiSMASH, Cystoscape, Gitools, MEGA e Clustal. Os resultados tornaram possível a criação de uma stratégia, chamada XPAIRT (eXPAndable Identification of amino acids in nonRibosomal peptides Tendencies) correlacionando pares de peptídeos e seus genomas similares via Jaccard Index e filogenia. Neste contexto, espécies Betaproteobacteria mostraram sintetizar produtos naturais seguindo certa similaridade biossintética na montagem de monômeros para a construção do esqueleto peptídico. Subunidades estruturais tais como asp.ser e orn.ser foram amplamente encontradas. Essas similaridades foram correlacionados gerando índice de similaridade entre espécies e sua distribuição entre genomas semelhantes, que foram nomeados como contribuíntes. Quanto maior a identidade genômica de um cluster de gene biossintético para um produto natural de forma geral, maior a chance de um contribuínte expressar pares similares relativos ao cluster em questão. A partir de análises de contagem de clusteres de genes biossintéticos, pôde-se eleger microorganismos promissores para isolamento de amostras ambientais. Essas análises mostraram que espécies do gênero Burkholderia são as mais promissoras quando comparadas a todos os genomas disponíveis da subclasse Betaproteobacteria. Análises genômicas da espécie padrão do gênero, Burkholderia thailandensis mostraram que cromossomos 1 e 2, em comparação a uma cepa produtora de antibióticos padrão, S. coelicolor, não apresentarem mesmas informações para biossíntese de compostos, mas apresentam similaridades de classes, sendo elas, Terpenos, T1PKS, Bacteriocinas e Peptídeos Não Ribossomais. Todos os resultados não tiveram correlações com os clusteres de S. coelicolor evidenciando que B. thailandensis apresenta-se promissora para a descoberta de novos compostos. Como espécies do gênero Burkholderia foram o principal alvo neste trabalho, um método guiado por genoma foi desenvolvido para isolar tanto quanto possível cepas de amostras ambientais. O método levou em consideração as necessidades básicas de um microorganismo para sobreviver: a) o tipo de microbioma que os microorganismos de interesse se encontram, analizados através de resultados de metagenômica, b) resistência à antibióticos e metais, c) capacidade de metabolizar compostos com papel biológico, d) crescimento celular e nutrientes, e e) variações de pH e crescimento celular. Todas as análises foram cruzadas e os melhores candidatos à composição de meios de culturas celular específicos para o isolamento de microorganismos do gênero Burkholderia foram selecionados. A estratégia foi bem-sucedida para diversos tipos de amostras. Estes experimentos excepcionais demonstraram a eficácia na resolução de problemas químico-biológicos auxiliando a análise posterior de novos produtos naturais.
Genome-guided strategies were applied to examine the potential of Betaproteobacteria species for the biosynthesis of nonribosomal peptides. A generalizable strategy was created to track similarities in the enzymatic reactions of nonribosomal peptide synthetases in order to organize their capability of assembling the monomers that build peptide backbones. Databases and user-friendly software, namely NCBI, KEGG, NORINE, antiSMASH, Cytoscape, Gitools, MEGA and Clustal, were adopted to make this a comprehensive strategy. Betaproteobacteria species showed biosynthetic similarities in assembling monomers for the peptide backbone of a nonribosomal peptide. This evidence was correlated, giving similarity indexes between species and their distribution among similar genomes. Predictions were fragmented in several ways, for example into monomers, pairs and triads. Correlation analyses showed that pairs are the best way of tracking similarities. This result made it possible to create a strategy, named XPAIRT (eXPAndable Identification of amino acids in nonRibosomal peptides Tendencies), correlating pairs of peptides and their similar genomes via the Jaccard index and phylogeny. Through these investigations it was noticed that Betaproteobacteria species, mainly Burkholderia species, generally assemble asp.orn and orn.ser, among other pairs of peptides. Further analysis showed that species from the genus Burkholderia are the most promising ones due to their Biosynthetic Gene Cluster counts across all available Betaproteobacteria genomes. These species were further analyzed, and a standard strain, Burkholderia thailandensis, was used for the identification of intraspecific variation in biosynthetic potential. A specific study on Biosynthetic Gene Cluster variation was conducted to discover disparities between chromosomes 1 and 2, and a standard antibiotic-producer strain, S. coelicolor. Results showed that B. thailandensis has different possibilities for biosynthesizing natural products. Even so, common classes of compounds, such as Terpenes, Bacteriocins, T1PKS and Nonribosomal Peptides, were identified for all strains. As Burkholderia species were the main target in this work, a genome-guided method was developed for isolating as many strains as possible from environmental samples. This method took into account the basic needs for a microorganism to survive: a) the type of microbiome in which microorganisms of interest coexist, analyzed through metagenomics, b) resistance to antibiotics and metals, c) ability to metabolize compounds with a biological role, d) cell growth related to different nutrients, and e) cell growth under pH variations. The strategy was successful for diverse types of samples. These exceptional experiments are part of a novel way of working with Natural Products, using genomics, bioinformatics and visual statistical analysis to access common characteristics and uniqueness of species, guiding the search for medically relevant natural products.
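The pair-based similarity underlying XPAIRT reduces, at its core, to a Jaccard index between the sets of predicted monomer pairs of two genomes; the sketch below shows that calculation with invented inputs and omits the rest of the pipeline.

def pair_jaccard(genome_a_pairs, genome_b_pairs):
    """Jaccard index between two genomes' sets of predicted NRPS monomer pairs,
    e.g. {("asp", "ser"), ("orn", "ser")}."""
    a, b = set(genome_a_pairs), set(genome_b_pairs)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# pair_jaccard({("asp", "ser"), ("orn", "ser")}, {("orn", "ser"), ("asp", "orn")}) -> 1/3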
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
APA, Harvard, Vancouver, ISO, and other styles
28

Aniba, Mohamed Radhouane. "Knowledge based expert system development in bioinformatics : applied to multiple sequence alignment of protein sequences." Strasbourg, 2010. https://publication-theses.unistra.fr/public/theses_doctorat/2010/ANIBA_Mohamed_Radhouane_2010.pdf.

Full text
Abstract:
L'objectif de ce projet de thèse a été le développement d'un système expert afin de tester, évaluer et d'optimiser toutes les étapes de la construction et l'analyse d'un alignement multiple de séquences. Le nouveau système a été validé en utilisant des alignements de référence et apporte une nouvelle vision pour le développement de logiciels en bioinformatique: les systèmes experts basés sur la connaissance. L'architecture utilisée pour construire le système expert est très modulaire et flexible, permettant à AlexSys d'évoluer en même temps que de nouveaux algorithmes seront mis à disposition. Ultérieurement, AlexSys sera utilisé pour optimiser davantage chaque étape du processus d'alignement, par exemple en optimisant les paramètres des différents programmes d 'alignement. Le moteur d'inférence pourrait également être étendu à identification des combinaisons d'algorithmes qui pourraient fournir des informations complémentaires sur les séquences. Par exemple, les régions bien alignées par différents algorithmes pourraient être identifiées et regroupées en un alignement consensus unique. Des informations structurales et fonctionnelles supplémentaires peuvent également être utilisées pour améliorer la précision de l'alignement final. Enfin, un aspect crucial de tout outil bioinformatique consiste en son accessibilité et la convivialité d' utilisation. Par conséquent, nous sommes en train de développer un serveur web, et un service web, nous allons également concevoir un nouveau module de visualisation qui fournira une interface intuitive et conviviale pour toutes les informa ions récupérées et construites par AlexSys
The objective of this PhD project was the development of an integrated expert system to test, evaluate and optimize all the stages of the construction and analysis of a multiple sequence alignment. The new system was validated using standard benchmark cases and brings a new vision to software development in bioinformatics: knowledge-guided systems. The architecture used to build the expert system is highly modular and flexible, allowing AlexSys to evolve as new algorithms are made available. In the future, AlexSys will be used to further optimize each stage of the alignment process, for example by optimizing the input parameters of the different algorithms. The inference engine could also be extended to identify combinations of algorithms that could potentially provide complementary information about the input sequences. For example, well-aligned regions from different aligners could be identified and combined into a single consensus alignment. Additional structural and functional information could also be exploited to improve the final alignment accuracy. Finally, a crucial aspect of any bioinformatics tool is its accessibility and usability. Therefore, we are currently developing a web server, and a web services based distributed system. We will also design a novel visualization module that will provide an intuitive, user-friendly interface to all the information retrieved and constructed by AlexSys.
APA, Harvard, Vancouver, ISO, and other styles
29

Shahalizadeh, Kalkhoran Solmaz. "An integrative bioinformatics approach for developing predictors of recurrence for the triple negative and basal subtypes of breast cancer." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=97150.

Full text
Abstract:
The triple-negative (TN) and basal-like breast cancers have poor outcomes, and lack both targeted therapies and accurate prognostic markers of outcome. Despite the application of microarray technologies to molecular profiling of breast tumors, most current genomics-derived predictors are incapable of stratifying TN or basal breast cancer patients by outcome. We have collected all publicly available breast cancer gene expression datasets to build a human-Compendium; from this, we selected TN and basal patient cohorts to build a TN-Compendium (TN-C) and basal-Compendium (basal-C). Using a de novo machine learning methodology, we have built 25-gene predictors of recurrence for TN and basal patients. Compared to previously reported predictors, these classifiers exhibit superior performance, and highlight multiple biological processes, including immune response, cytoskeletal regulation, signaling and ligand gated ion channels, as being differentially present between recurrent and non-recurrent patients. The small size of these predictors makes them potential candidates for use in clinical settings.
Le Cancer du sein triple négatif (TN) ou basal-like ont de mauvais résultats, et manque les thérapies ciblées et précises marqueurs pronostiques du résultat. Malgré l'application des technologies des biopuces pour le profilage moléculaire des tumeurs du sein, la plus récente des prédicteurs de génomique provenant sont incapables de stratification TN ou basale cancer du sein par résultat. Nous avons rassemblé tous les ensembles de données accessible au public du cancer du sein par l'expression des gènes pour construire un homme-Compendium; de cela, nous avons sélectionné AMT et à la base des cohortes de patients pour construire un TN-Compendium (TN-C) et basal-Compendium (basal-C). En utilisant une machine de novo méthodologie d'apprentissage, nous avons construit des prédicteurs 25-gène de récidive pour les TN et les patients de la base. Par rapport à des prédicteurs signalés précédemment, la performance exposer ces classificateurs supérieure, et mettre en évidence plusieurs processus biologiques, y compris la réponse immunitaire, la régulation du cytosquelette, signalisation et canaux ioniques ligand, comme étant présents de façon différentielle entre les patients récurrents et non récurrents. La petite taille de ces prédicteurs en fait des candidats potentiels pour une utilisation en milieu clinique.
APA, Harvard, Vancouver, ISO, and other styles
30

Dong, Siyuan. "A time dependent adaptive learning process for estimating drug exposure from register data - applied to insulin and its analogues." Thesis, KTH, Beräkningsbiologi, CB, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-128438.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Ferretti, Yuri. "Ferramenta computacional para análise integrada de dados clínicos e biomoleculares." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/95/95131/tde-05042016-093735/.

Full text
Abstract:
The proliferation of translational medicine studies allows researchers to draw on data sources from many different fields. One area of great importance is bioinformatics, which combines the high computational processing capacity available today with the endless amounts of data generated by next-generation sequencing methods to give researchers a rich body of data to analyse. Despite the availability of such data, the expertise required to analyse it makes it difficult for professionals with little background in bioinformatics, statistics or computer science to conduct research with it. Given this situation, this work created a tool that takes advantage of the multiple-database integration capabilities provided by the IPTrans framework and allows users from the biomedical field to analyse the data contained in those databases. Based on a study of existing tools and a requirements-gathering exercise with potential users, the most important features were identified, and the IPTrans Advanced Analysis Tool (IPTrans A2Tool) was designed and implemented. The tool lets users perform common differential expression analyses such as heatmaps, volcano plots, consensus clustering and box plots. In addition, it provides a data-mining algorithm based on the extraction of association rules between clinical and biomolecular data, allowing users to discover new associations between gene expression and clinical or phenotypic data. As a by-product of this work, the BioBank Warden was also created: a management system for clinical data and biomolecular samples that served as one of the data sources for IPTrans A2Tool. This system allows users to register patients' clinical information as well as the samples collected for studies, and it provides strong research-group and project permission management that ensures only authorised people have access to patient data. A preliminary usability evaluation carried out with professionals from the biomedical field showed that the tools have the potential to be used in the context of translational medicine.
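As a hedged illustration of the association-rule idea behind the data-mining component described above, the following self-contained Python sketch computes the support and confidence of one candidate rule on invented records; it is not the IPTrans A2Tool implementation.

```python
# Illustrative sketch only: support/confidence for a rule linking a (hypothetical)
# expression-level attribute to a clinical attribute. Records are invented.
records = [
    {"GENE1_high", "relapse"},
    {"GENE1_high", "relapse", "smoker"},
    {"GENE1_low", "no_relapse"},
    {"GENE1_high", "no_relapse"},
    {"GENE1_low", "no_relapse", "smoker"},
]

def support(itemset):
    """Fraction of records containing every item in the itemset."""
    return sum(itemset <= record for record in records) / len(records)

antecedent, consequent = {"GENE1_high"}, {"relapse"}
rule_support = support(antecedent | consequent)
confidence = rule_support / support(antecedent)
print(f"GENE1_high => relapse: support={rule_support:.2f}, confidence={confidence:.2f}")
```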
APA, Harvard, Vancouver, ISO, and other styles
32

Schröder, Michael, Rainer Winnenburg, and Conrad Plake. "Improved mutation tagging with gene identifiers applied to membrane protein stability prediction." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-177379.

Full text
Abstract:
Background: The automated retrieval and integration of information about protein point mutations in combination with structure, domain and interaction data from literature and databases promises to be a valuable approach to study structure-function relationships in biomedical data sets. Results: We developed a rule- and regular expression-based protein point mutation retrieval pipeline for PubMed abstracts, which shows an F-measure of 87% for the mutation retrieval task on a benchmark dataset. In order to link mutations to their proteins, we utilize a named entity recognition algorithm for the identification of gene names co-occurring in the abstract, and establish links based on sequence checks. Vice versa, we could show that gene recognition improved from 77% to 91% F-measure when considering mutation information given in the text. To demonstrate practical relevance, we utilize mutation information from text to evaluate a novel solvation energy based model for the prediction of stabilizing regions in membrane proteins. For five G protein-coupled receptors we identified 35 relevant single mutations and associated phenotypes, of which none had been annotated in the UniProt or PDB database. In 71% of cases, the reported phenotypes were in compliance with the model predictions, supporting a relation between mutations and stability issues in membrane proteins. Conclusion: We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest. We further demonstrate how amino acid substitution information from text can be utilized for protein structure stability studies on the basis of a novel energy model.
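As a hedged sketch of the regular-expression component of such a pipeline, the Python pattern below matches point mutations written in one-letter (A123T) or three-letter (Ala123Thr) notation; the full pipeline in the thesis combines a much larger rule set with gene-name recognition and sequence checks.

```python
# Illustrative sketch only: a minimal point-mutation regular expression.
import re

ONE = "ACDEFGHIKLMNPQRSTVWY"
THREE = "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"

MUTATION_RE = re.compile(rf"\b(?:[{ONE}]\d+[{ONE}]|(?:{THREE})\d+(?:{THREE}))\b")

text = ("The substitution Asp80Asn destabilised the receptor, "
        "whereas A123T had no measurable effect.")
print(MUTATION_RE.findall(text))   # -> ['Asp80Asn', 'A123T']
```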
APA, Harvard, Vancouver, ISO, and other styles
33

Glass, Edmund. "Power Analysis in Applied Linear Regression for Cell Type-Specific Differential Expression Detection." VCU Scholars Compass, 2016. http://scholarscompass.vcu.edu/etd/4516.

Full text
Abstract:
The goal of many human disease-oriented studies is to detect molecular mechanisms that differ between healthy controls and patients. Yet, commonly used gene expression measurements from any tissue suffer from variability of cell composition. This variability hinders the detection of differentially expressed genes and is often ignored. However, this variability may actually be advantageous, as heterogeneous gene expression measurements coupled with cell counts may provide deeper insights into gene expression differences at the cell type-specific level. Published computational methods use linear regression to estimate cell type-specific differential expression. Yet, they do not consider many artifacts hidden in high-dimensional gene expression data that may negatively affect the performance of linear regression. In this dissertation we specifically address the parameter space involved in the most rigorous use of linear regression to estimate cell type-specific differential expression and report under which conditions significant detection is probable. We define the parameters affecting the sensitivity of cell type-specific differential expression estimation as follows: sample size, cell type-specific proportion variability, mean squared error (spread of observations around the linear regression line), conditioning of the cell proportions predictor matrix, and the size of the actual cell type-specific differential expression. Each parameter, with the exception of cell type-specific differential expression (effect size), affects the variability of cell type-specific differential expression estimates. We have developed a power-analysis approach to cell-type-by-cell-type and genomic-site-by-site differential expression detection that relies upon Welch’s two-sample t-test, factors in differences in the variability of cell type-specific expression estimates, and reduces false discovery. To this end we have published an R package, LRCDE, available on GitHub (http://www.github.com/ERGlass/lrcde.dev), which outputs observed statistics of cell type-specific differential expression, including the two-sample t-statistic, its p-value, and power calculated from the two-sample t-statistic on a genomic site-by-site basis.
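To illustrate the role of Welch's two-sample t-test in such a power analysis, here is a hedged Python sketch that estimates power by simulation for a single hypothetical site; it is not the LRCDE implementation (which is analytical and written in R), and the group sizes, effect size and spreads are invented.

```python
# Illustrative sketch only: simulation-based power of Welch's two-sample t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def welch_power(n1, n2, mean_diff, sd1, sd2, alpha=0.05, n_sim=5_000):
    """Fraction of simulated Welch t-tests rejecting H0 at level alpha."""
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(0.0, sd1, n1)
        y = rng.normal(mean_diff, sd2, n2)
        p = stats.ttest_ind(x, y, equal_var=False).pvalue   # Welch's t-test
        rejections += p < alpha
    return rejections / n_sim

# Hypothetical cell type-specific difference of 1.0 with unequal spread in the two groups.
print(round(welch_power(n1=15, n2=15, mean_diff=1.0, sd1=1.0, sd2=1.5), 2))
```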
APA, Harvard, Vancouver, ISO, and other styles
34

Thölken, Clemens [Verfasser], and Marcus [Akademischer Betreuer] Lechner. "Applied Bioinformatics for ncRNA Characterization - Case Studies Combining Next Generation Sequencing & Genomics / Clemens Thölken ; Betreuer: Marcus Lechner." Marburg : Philipps-Universität Marburg, 2020. http://d-nb.info/1204199736/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Jones, Derek. "Scalable Feature Selection and Extraction with Applications in Kinase Polypharmacology." UKnowledge, 2018. https://uknowledge.uky.edu/cs_etds/65.

Full text
Abstract:
In order to reduce the time and costs associated with drug discovery, machine learning is being used to automate much of the work in this process. However, the size and complex nature of molecular data make the application of machine learning especially challenging. Much work must go into the process of engineering features that are then used to train machine learning models, costing considerable amounts of time and requiring the knowledge of domain experts to be most effective. The purpose of this work is to demonstrate data-driven approaches to the feature selection and extraction steps, in order to decrease the amount of expert knowledge required to model interactions between proteins and drug molecules.
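As a hedged, generic illustration of data-driven feature selection and extraction (a scikit-learn sketch on an invented fingerprint matrix, not the methods developed in the thesis):

```python
# Illustrative sketch only: feature selection (variance filter) followed by
# feature extraction (PCA) on a hypothetical compound-by-fingerprint-bit matrix.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(200, 1024)).astype(float)   # 200 compounds x 1024 bits

selected = VarianceThreshold(threshold=0.01).fit_transform(X)            # drop near-constant bits
embedded = PCA(n_components=16, random_state=0).fit_transform(selected)  # learn 16 components
print(selected.shape, embedded.shape)
```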
APA, Harvard, Vancouver, ISO, and other styles
36

Heyer, Erin E. "Optimizing RNA Library Preparation to Redefine the Translational Status of 80S Monosomes: A Dissertation." eScholarship@UMMS, 2015. https://escholarship.umassmed.edu/gsbs_diss/810.

Full text
Abstract:
Deep sequencing of strand-specific cDNA libraries is now a ubiquitous tool for identifying and quantifying RNAs in diverse sample types. The accuracy of conclusions drawn from these analyses depends on precise and quantitative conversion of the RNA sample into a DNA library suitable for sequencing. Here, we describe an optimized method of preparing strand-specific RNA deep sequencing libraries from small RNAs and variably sized RNA fragments obtained from ribonucleoprotein particle footprinting experiments or fragmentation of long RNAs. Because all enzymatic reactions were optimized and driven to apparent completion, sequence diversity and species abundance in the input sample are well preserved. This optimized method was used in an adapted ribosome-profiling approach to sequence mRNA footprints protected either by 80S monosomes or polysomes in S. cerevisiae. Contrary to popular belief, we show that 80S monosomes are translationally active as demonstrated by strong three-nucleotide phasing of monosome footprints across open reading frames. Most mRNAs exhibit some degree of monosome occupancy, with monosomes predominating on upstream ORFs, canonical ORFs shorter than ~590 nucleotides and any ORF for which the total time required to complete elongation is substantially shorter than the time required for initiation. Additionally, endogenous NMD targets tend to be monosome-enriched. Thus, rather than being inactive, 80S monosomes are significant contributors to overall cellular translation.
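To make the three-nucleotide phasing argument concrete, the short Python sketch below tallies invented footprint 5'-end offsets (relative to a start codon) by reading frame; a strong bias toward one frame is the signature of active translation.

```python
# Illustrative sketch only: reading-frame phasing from footprint 5'-end offsets.
# The offsets below are invented; real values come from aligned ribosome-profiling reads.
from collections import Counter

footprint_5p_offsets = [12, 15, 18, 19, 21, 24, 24, 27, 30, 31, 33, 36, 39, 42, 45, 45]

frame_counts = Counter(offset % 3 for offset in footprint_5p_offsets)
total = sum(frame_counts.values())
for frame in (0, 1, 2):
    share = frame_counts.get(frame, 0) / total
    print(f"frame {frame}: {share:.0%} of footprints")
```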
APA, Harvard, Vancouver, ISO, and other styles
37

Heyer, Erin E. "Optimizing RNA Library Preparation to Redefine the Translational Status of 80S Monosomes: A Dissertation." eScholarship@UMMS, 2010. http://escholarship.umassmed.edu/gsbs_diss/810.

Full text
Abstract:
Deep sequencing of strand-specific cDNA libraries is now a ubiquitous tool for identifying and quantifying RNAs in diverse sample types. The accuracy of conclusions drawn from these analyses depends on precise and quantitative conversion of the RNA sample into a DNA library suitable for sequencing. Here, we describe an optimized method of preparing strand-specific RNA deep sequencing libraries from small RNAs and variably sized RNA fragments obtained from ribonucleoprotein particle footprinting experiments or fragmentation of long RNAs. Because all enzymatic reactions were optimized and driven to apparent completion, sequence diversity and species abundance in the input sample are well preserved. This optimized method was used in an adapted ribosome-profiling approach to sequence mRNA footprints protected either by 80S monosomes or polysomes in S. cerevisiae. Contrary to popular belief, we show that 80S monosomes are translationally active as demonstrated by strong three-nucleotide phasing of monosome footprints across open reading frames. Most mRNAs exhibit some degree of monosome occupancy, with monosomes predominating on upstream ORFs, canonical ORFs shorter than ~590 nucleotides and any ORF for which the total time required to complete elongation is substantially shorter than the time required for initiation. Additionally, endogenous NMD targets tend to be monosome-enriched. Thus, rather than being inactive, 80S monosomes are significant contributors to overall cellular translation.
APA, Harvard, Vancouver, ISO, and other styles
38

Rajkovic, Andrei. "Promoting Bacterial Synthesis of Oligo-prolines by Modifying Elongation Factor P Post-translationally." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1469123846.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Lin, Frank Po-Yen Centre for Health Informatics Faculty of Medicine UNSW. "In silico virulence prediction and virulence gene discovery of Streptococcus agalactiae." Awarded By: University of New South Wales. Centre for Health Informatics, 2009. http://handle.unsw.edu.au/1959.4/44382.

Full text
Abstract:
Physicians frequently face challenges in predicting which bacterial subpopulations are likely to cause severe infections. A more accurate prediction of virulence would improve diagnostics and limit the extent of antibiotic resistance. Nowadays, bacterial pathogens can be typed with high accuracy with advanced genotyping technologies. However, effective translation of bacterial genotyping data into assessments of clinical risk remains largely unexplored. The discovery of unknown virulence genes is another key determinant of successful prediction of infectious disease outcomes. The trial-and-error method for virulence gene discovery is time-consuming and resource-intensive. Selecting candidate genes with higher precision can thus reduce the number of futile trials. Several in silico candidate gene prioritisation (CGP) methods have been proposed to aid the search for genes responsible for inherited diseases in humans. It remains uninvestigated how the CGP concept can assist with virulence gene discovery in bacterial pathogens. The main contribution of this thesis is to demonstrate the value of translational bioinformatics methods in addressing challenges in virulence prediction and virulence gene discovery. This thesis studied an important perinatal bacterial pathogen, group B streptococcus (GBS), the leading cause of neonatal sepsis and meningitis in developed countries. While several antibiotic prophylactic programs have successfully reduced the number of early-onset neonatal diseases (infections that occur within 7 days of life), the prevalence of late-onset infections (infections that occur between 7–30 days of life) has remained constant. In addition, the widespread use of intrapartum prophylactic antibiotics may introduce undue risk of penicillin allergy and may trigger the development of antibiotic-resistant microorganisms. To minimise such potential harm, a more targeted approach to antibiotic use is required. Distinguishing virulent GBS strains from their colonising counterparts thus lays the cornerstone of achieving the goal of tailored therapy. There are three aims of this thesis: 1. Prediction of virulence by analysis of bacterial genotype data: To identify markers that may be associated with GBS virulence, statistical analysis was performed on GBS genotype data consisting of 780 invasive and 132 colonising S. agalactiae isolates. From a panel of 18 molecular markers studied, only the alp3 gene (which encodes a surface protein antigen commonly associated with serotype V) showed an increased association with invasive diseases (OR=2.93, p=0.0003, Fisher's exact test). Molecular serotype II (OR=10.0, p=0.0007) was found to have a significant association with early-onset neonatal disease when compared with late-onset diseases. To investigate whether clinical outcomes can be predicted by the panel of genotype markers, logistic regression and machine learning algorithms were applied to distinguish invasive isolates from colonising isolates. Nevertheless, the predictive analysis only yielded weak predictive power (area under ROC curve, AUC: 0.56–0.71, stratified 10-fold cross-validation). It was concluded that a definitive predictive relationship between the molecular markers and clinical outcomes may be lacking, and more discriminative markers of GBS virulence need to be investigated.
2. Development of two computational CGP methods to assist with functional discovery of prokaryotic genes: Two in silico CGP methods were developed based on comparative genomics: statistical CGP exploits the differences in gene frequency against phenotypic groups, while inductive CGP applies supervised machine learning to identify genes with similar occurrence patterns across a range of bacterial genomes. Three rediscovery experiments were carried out to evaluate the CGP methods: a) Rediscovery of peptidoglycan genes was attempted with 417 published bacterial genome sequences. Both CGP methods achieved their best AUC >0.911 in Escherichia coli K-12 and >0.978 in Streptococcus agalactiae 2603 (SA-2603) genomes, with an average improvement in precision of >3.2-fold and a maximum of >27-fold using statistical CGP. A median AUC of >0.95 could still be achieved with as few as 10 genome examples in each group in the rediscovery of the peptidoglycan metabolism genes. b) A maximum of 109-fold improvement in precision was achieved in the rediscovery of anaerobic fermentation genes. c) In the rediscovery experiment with genes of 31 metabolic pathways in SA-2603, 14 pathways achieved an AUC >0.9 and 28 pathways achieved an AUC >0.8 with the best inductive CGP algorithms. The results from the rediscovery experiments demonstrated that the two CGP methods can assist with the study of functionally uncategorised genomic regions and the discovery of bacterial gene-function relationships. 3. Application of the CGP methods to discover GBS virulence genes: Both statistical and inductive CGP were applied to assist with the discovery of unknown GBS virulence factors. Among a list of hypothetical protein genes, several highly ranked genes were plausibly involved in molecular mechanisms of GBS pathogenesis, including several genes encoding family 8 glycosyltransferases, family 1 and family 2 glycosyltransferases, multiple adhesins, streptococcal neuraminidase, staphylokinase, and other factors that may contribute to GBS virulence. Such genes may be candidates for further biological validation. In addition, the co-occurrence of these genes with currently known virulence factors suggested that the virulence mechanisms by which GBS causes perinatal diseases are multifactorial. The procedure demonstrated in this prioritisation task should assist with the discovery of virulence genes in other pathogenic bacteria.
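As a hedged illustration of the marker-association test used in the first aim, the Python sketch below runs Fisher's exact test on a hypothetical presence/absence-by-outcome table; the actual 2x2 counts behind the reported OR=2.93 and p=0.0003 are not given in the abstract and are not reproduced here.

```python
# Illustrative sketch only: a 2x2 marker-vs-outcome association test. Counts are hypothetical.
from scipy.stats import fisher_exact

#            marker present   marker absent
table = [[170, 610],   # invasive isolates (n=780)
         [ 12, 120]]   # colonising isolates (n=132)

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4g}")
```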
APA, Harvard, Vancouver, ISO, and other styles
40

Vila, Casadesús Maria. "Design of bioinformatic tools for integrative analysis of microRNA-mRNA interactome applied to digestive cancers." Doctoral thesis, Universitat de Barcelona, 2017. http://hdl.handle.net/10803/663087.

Full text
Abstract:
In this thesis, several bioinformatic tools were developed and implemented to enable the study of miRNA-mRNA interactions in specific cellular contexts. Specifically, an R package (miRComb) was created that computes miRNA-mRNA interactions from miRNA and mRNA expression data together with bioinformatic predictions from pre-existing databases. The final miRNA-mRNA interactions are those that show a negative correlation and have been predicted by at least one database. As an added value, the miRComb package produces a PDF summary of the basic results of the analysis (number of interactions, number of target mRNAs per miRNA, functional analysis, etc.), which allows data from different studies to be compared. We applied this methodology in the context of digestive cancers. In a first study, we used public data from five digestive cancers (colon, rectum, oesophagus, stomach and liver) and determined the miRNA-mRNA interactions common to all of them as well as those specific to each one. In a second study, we used the same methodology to analyse miRNA-mRNA data from biopsies of pancreatic cancer patients at the Hospital Clínic de Barcelona. In this study we described miRNA-mRNA interactions in the context of pancreatic cancer and were able to validate two of them experimentally. In summary, we conclude that the miRComb package is a useful tool for studying the miRNA-mRNA interactome and has served to establish biological hypotheses that could subsequently be tested in the laboratory.
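As a hedged illustration of the core miRComb-style filter (negative expression correlation combined with database support), here is a minimal Python sketch; miRComb itself is an R package, and the expression values, significance threshold and prediction set below are invented.

```python
# Illustrative sketch only: keep miRNA-mRNA pairs that are negatively correlated
# AND predicted by at least one database. All data here are invented.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n_samples = 30
mirna_expr = {"miR-21": rng.normal(size=n_samples)}
mrna_expr = {
    "PTEN": -0.8 * mirna_expr["miR-21"] + rng.normal(scale=0.5, size=n_samples),
    "GAPDH": rng.normal(size=n_samples),
}
predicted = {("miR-21", "PTEN"), ("miR-21", "GAPDH")}   # e.g. union of prediction databases

interactions = []
for mir, m_vals in mirna_expr.items():
    for gene, g_vals in mrna_expr.items():
        r, p = pearsonr(m_vals, g_vals)
        if r < 0 and p < 0.05 and (mir, gene) in predicted:
            interactions.append((mir, gene, round(r, 2)))
print(interactions)   # the anticorrelated, database-supported pair should remain
```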
APA, Harvard, Vancouver, ISO, and other styles
41

Katz, Lee Scott. "Computational tools for molecular epidemiology and computational genomics of Neisseria meningitidis." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/42934.

Full text
Abstract:
Neisseria meningitidis is a gram-negative, and sometimes encapsulated, diplococcus that causes devastating disease worldwide. For the worldwide genetic surveillance of N. meningitidis, the gold standard for profiling the bacterium uses genetic loci found around the genome. Unfortunately, the software for analyzing the data for these profiles is difficult to use for a variety of reasons. This thesis presents my suite of tools, called the Meningococcus Genome Informatics Platform, for the analysis of these profiling data. To better understand N. meningitidis, the CDC Meningitis Laboratory and other world-class laboratories have adopted a whole-genome approach. To facilitate this approach, I have developed a computational genomics assembly and annotation pipeline called the CG-Pipeline. It assembles a genome, predicts the locations of various features, and then annotates those features. Next, I developed a comparative genomics browser and database called NBase. Using CG-Pipeline and NBase, I addressed two open questions in N. meningitidis research. First, some N. meningitidis isolates cause disease while many others do not: what is the genomic basis of disease-associated versus asymptomatically carried isolates of N. meningitidis? Second, the capsule type of some isolates cannot be easily determined. Isolates are grouped into one of many serogroups based on this capsule, which aids epidemiological studies and the public health response to N. meningitidis, so an isolate whose capsule type cannot be determined often cannot be grouped. Thus the question is: what is the genomic basis of nongroupability? This thesis addresses both of these questions on a whole-genome level.
APA, Harvard, Vancouver, ISO, and other styles
42

Dutt, Mohini D. "Adverse Childhood Experiences and its Association with Cognitive Impairment in Non-Patient Older Population." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/7019.

Full text
Abstract:
This study explores cognitive impairment and its correlation with early-life adverse experiences in a non-patient population between the ages of 50 and 65. This developmental, observational study design explores cognition in pre-clinical Alzheimer’s disease (AD). Using a standardized neuropsychological instrument, the Montreal Cognitive Assessment (MoCA), and a clinically administered questionnaire, the ACE (Adverse Childhood Experiences), I hypothesized that participants with high ACE scores would inversely have low MoCA scores. My goal was to use a multiple linear regression model with 3 covariates and 1 predictor of interest (ACEs). At 80% power, a sample size of 40 was calculated as needed; this would mean that the results had an 80% chance of declaring statistical significance, and it corresponds to an R-squared value (the percentage of variation in MoCA score explained by the predictor) of 17.2%. The desired sample size was not attained due to several barriers in receiving sample data from the collaborating site and a drop in participation caused by the 2017 Hurricane Irma; overall, 13 participants successfully took part. The analysis of the results is demonstrated in a line graph indicating a relationship between ACE and MoCA scores, although the accuracy of the descriptive statistics could be argued against because of the low sample size. The analysis of the ethnographic interviews brings out some trends in the participant responses. The focus here has been to discuss how these responses advocate for the entanglement theory of aging: in other words, how exposure to social and environmental factors at various stages of an individual’s lifecourse can interact with one’s physiology, resulting in exposure-specific health conditions at later life stages. Among the periods of exposure, my focus in this study is specifically on early exposures in the lifecourse. This is facilitated by the use of the ACE questionnaire regarding exposures to adverse experiences such as sexual/physical abuse, familial mental health issues, alcohol/drug abuse in the family, and loss of or separation from parents. The entanglement theory further allows for race- or culture-specific exposures to adversity, which raises the question of varying health consequences among cultural or racial groups and the need for a more critical approach to providing access to healthcare and to healthcare policy development. Trends in the ethnographic results have allowed for critical discourse on the transgenerational effects of social adversity, the effects of resilience-building from adversity, and the need for caregiver mental health services. The study brought out critiques of how the ACE module could be made more inclusive of experiences specific to diverse cultures and regions, as well as the need to address the severity of individual experiences. We conclude by discussing how the effects of social or environmental experiences can be used in AD and aging research and what supporting literature and initiatives currently exist. The discussion is also inspired by the existing political discourse around the medicalization of AD and how that influences reductionist methods in AD research. This new direction of applied and holistic research derives its perspective from neuroanthropology and applied medical anthropology.
The overall aim of this study is to ask questions that challenge existing research methods, with the ultimate hope of redirecting AD research and risk-reduction efforts toward an interdisciplinary focus and funding that involve early-life lived experiences and life-course perspectives.
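As a hedged check of the reported design, the Python sketch below runs a generic noncentral-F power calculation with the numbers quoted in the abstract (n = 40, three covariates plus the ACE predictor, predictor R-squared = 17.2%, alpha = 0.05); the noncentrality convention used (lambda = f-squared times N) is an assumption about how the original calculation was done.

```python
# Illustrative sketch only: power of the test for one predictor in a multiple linear regression.
from scipy.stats import f, ncf

n, n_predictors, alpha = 40, 4, 0.05      # 3 covariates + the ACE predictor
r2_predictor = 0.172                      # variance in MoCA explained by ACE (from the abstract)
f2 = r2_predictor / (1 - r2_predictor)    # Cohen's f-squared effect size

df_num = 1                                # testing the single predictor of interest
df_den = n - n_predictors - 1
noncentrality = f2 * n                    # lambda = f^2 * N convention (an assumption)

f_crit = f.ppf(1 - alpha, df_num, df_den)
power = 1 - ncf.cdf(f_crit, df_num, df_den, noncentrality)
print(f"power = {power:.2f}")             # roughly the 80% stated in the abstract
```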
APA, Harvard, Vancouver, ISO, and other styles
43

Schröder, Michael, Rainer Winnenburg, and Conrad Plake. "Improved mutation tagging with gene identifiers applied to membrane protein stability prediction." BioMed Central, 2009. https://tud.qucosa.de/id/qucosa%3A28888.

Full text
Abstract:
Background: The automated retrieval and integration of information about protein point mutations in combination with structure, domain and interaction data from literature and databases promises to be a valuable approach to study structure-function relationships in biomedical data sets. Results: We developed a rule- and regular expression-based protein point mutation retrieval pipeline for PubMed abstracts, which shows an F-measure of 87% for the mutation retrieval task on a benchmark dataset. In order to link mutations to their proteins, we utilize a named entity recognition algorithm for the identification of gene names co-occurring in the abstract, and establish links based on sequence checks. Vice versa, we could show that gene recognition improved from 77% to 91% F-measure when considering mutation information given in the text. To demonstrate practical relevance, we utilize mutation information from text to evaluate a novel solvation energy based model for the prediction of stabilizing regions in membrane proteins. For five G protein-coupled receptors we identified 35 relevant single mutations and associated phenotypes, of which none had been annotated in the UniProt or PDB database. In 71% of cases, the reported phenotypes were in compliance with the model predictions, supporting a relation between mutations and stability issues in membrane proteins. Conclusion: We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest. We further demonstrate how amino acid substitution information from text can be utilized for protein structure stability studies on the basis of a novel energy model.
APA, Harvard, Vancouver, ISO, and other styles
44

Taslim, Cenny. "Multi-Stage Experimental Planning and Analysis for Forward-Inverse Regression Applied to Genetic Network Modeling." The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1213286112.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Llorach, Parés Laura. "Computer-Aided Drug Design applied to marine drug discovery = Disseny de fàrmacs assistit per ordinador aplicat a la cerca de possibles fàrmacs marins." Doctoral thesis, Universitat de Barcelona, 2019. http://hdl.handle.net/10803/668298.

Full text
Abstract:
The potential of natural products in general, and marine natural products in particular, as pharmacological entities has been widely demonstrated in recent years. Marine benthic ecosystems contain an extraordinary range of diverse organisms that possess bioactive natural compounds, which are commonly used as defensive or protective chemical mechanisms. These effective defensive strategies are based on secondary metabolites that are crucial for the species' survival. The pharmacological properties of these unique chemical compounds constitute an interesting and emerging line of research based on exploiting them for the development of new drugs. The evolution, biodiversity, and specific environmental conditions found in marine ecosystems, such as Antarctica and the Mediterranean Sea, make them a remarkable source of potential therapeutic agents. Interestingly, some of these natural products are capable of modulating protein functions in pathogenesis-related pathways. The process of discovering and developing new drugs, for instance small molecules with the aforementioned capacity to modulate protein functions, is a tedious procedure that requires economic resources and time. To reduce these drawbacks, computer-aided drug design (CADD) has emerged as one of the most effective methods. Computational methods allow a rapid exploration of the chemical space and are very interesting and useful complements to experimental approaches. CADD techniques can be applied at different steps of the drug discovery pipeline and can cover several of its phases. To that end, several objectives were set and reached in this thesis: 1. To find possible therapeutic activities of marine molecules and to establish their capability to modulate protein functions in pathogenesis-related pathways by using different CADD tools and techniques: I. Improving the drug discovery pipeline by elucidating the possible therapeutic potential of a set of marine molecules against a list of targets related to different pathologies. II. Elucidating the pharmacophoric features of marine compounds through a precise in silico binding study, highlighting the power of CADD techniques and reporting the inhibitory activity of different natural products and indole-scaffold derivatives as GSK3β, CK1δ, DYRK1A, and CLK1 inhibitors. III. A computational study and experimental validation of meridianins and lignarenones as possible inhibitors of GSK3β binding at the ATP and/or substrate pockets. The main conclusions of this thesis are that marine molecules can be used as therapeutic agents against protein kinases related to Alzheimer's disease (AD), and that they exemplify the potential of CADD applied to marine drug discovery.
APA, Harvard, Vancouver, ISO, and other styles
46

Nielsen, Michael Lund. "Characterization of Polypeptides by Tandem Mass Spectrometry Using Complementary Fragmentation Techniques." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7409.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Franaszek, Krzysztof. "Translation-mediated stress responses : mining of ribosome profiling data." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/269473.

Full text
Abstract:
Advances in next-generation sequencing platforms during the past decade have resulted in exponential increases in biological data generation. Besides applications in determining the sequences of genomes and other DNA elements, these platforms have allowed the characterization of cell-wide mRNA pools under different conditions and in different tissues. In 2009, Ingolia and colleagues developed an extension of high-throughput sequencing that provides a snapshot of all cellular mRNA fragments protected by translating ribosomes, dubbed ribosome profiling. This approach allows detection of differential translation activity, annotation of novel protein coding sequences and variants, identification of ribosome pause sites and estimates of de novo protein synthesis. As with other sequencing based methodologies, a major challenge of ribosome profiling has been sorting, filtering and interpreting the gigabytes of data produced during the course of a typical experiment. In this thesis, I developed and applied computational pipelines to interrogate ribosome profiling data in relation to gene expression in several viruses and eukaryotic species, as well as to identify sites of ribosomal pausing and sites of non-canonical translation activity. Specifically, I applied various control analyses for characterizing the quality of profiling data and developed scripts for visualizing genome-based (exon-by-exon) rather than transcript-based ribosome footprint alignments. I also examined the challenge of mapping footprints to repetitive sequences in the genome and propose ways to mitigate the associated problems. I performed differential expression analyses on data from coronavirus-infected murine cells, retrovirus-infected human cells and temperature-stressed Arabidopsis thaliana plants. Dissection of translational responses in Arabidopsis thaliana during heat shock or cold shock revealed several groups of genes that were highly upregulated within 10 minutes of temperature challenge. Analysis of the branches of the unfolded protein and integrated stress responses during coronavirus infection allowed for deconvolution of transcriptional and translational contributions. During the course of these analyses, I identified errors in a recently publicized algorithm for detection of differential translation, and wrote corrections that have now been pulled into the repository for this package. Comparison of the translational kinetics of the dengue virus infection in mosquito and human cell lines revealed host-specific sites of ribosome pausing and RNA accumulation. Analysis of HIV profiling data revealed footprint peaks which were in agreement with previously proposed models of peptide or RNA mediated ribosome stalling. I also developed a simulation to identify transcripts that are prone to generating RPFs with multiple alignments during the read mapping process. Together, the scripts and pipelines developed during the course of this work will serve to expedite future analyses of ribosome profiling data, and the results will inform future studies of several important pathogens and temperature stress in plants.
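As a hedged illustration of separating transcriptional from translational contributions (one of the analyses described above), the Python sketch below computes RNA and translational-efficiency fold changes for invented, already-normalised counts; a real analysis would use dedicated count-based statistical tools.

```python
# Illustrative sketch only: RNA fold change vs. translational-efficiency (TE) fold change.
import math

genes = {
    #          rna_ctrl  rna_stress  ribo_ctrl  ribo_stress   (invented, normalised counts)
    "GENE_A": (100,      400,        120,       900),
    "GENE_B": (500,      480,        450,       440),
}
for name, (rna_c, rna_s, ribo_c, ribo_s) in genes.items():
    rna_fc = math.log2(rna_s / rna_c)                          # transcriptional change
    te_fc = math.log2((ribo_s / rna_s) / (ribo_c / rna_c))     # translational change
    print(f"{name}: log2 RNA FC = {rna_fc:+.2f}, log2 TE FC = {te_fc:+.2f}")
```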
APA, Harvard, Vancouver, ISO, and other styles
48

Wilman, Henry R. "Computational studies of protein helix kinks." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:21225f0e-efed-49c6-af27-5d3fe78fa731.

Full text
Abstract:
Kinks are functionally important structural features found in the alpha-helices of many proteins, particularly membrane proteins. Structurally, they are points at which a helix abruptly changes direction. Previous kink definition and identification methods often disagree with one another. Here I describe three novel methods to characterise kinks, which improve on existing approaches. First, Kink Finder, a computational method that consistently locates kinks and estimates the error in the kink angle. Second, the B statistic, a statistically robust method for identifying kinks. Third, Alpha Helices Assessed by Humans, a crowdsourcing approach that provided a gold-standard data set on which to train and compare existing kink identification methods. In this thesis, I show that kinks are a feature of long α-helices in both soluble and membrane proteins, rather than just transmembrane α-helices. Characteristics of kinks in the two types of proteins are similar, with proline being the dominant feature in both types of protein. In soluble proteins, kinked helices also have a clear structural preference in that they typically point into the solvent. I also explored the conservation of kinks in homologous proteins. I found examples of conserved and non-conserved kinks in both the helix pairs and the helix families. Helix pairs with non-conserved kinks generally have less similar sequences than helix pairs with conserved kinks. I identified helix families that show highly conserved kinks, and families that contain non-conserved kinks, suggesting that some kinks may be flexible points in protein structures.
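To illustrate the basic geometry behind a kink angle, here is a simplified Python sketch that measures the angle between direction vectors fitted to the two halves of a run of C-alpha coordinates; Kink Finder itself fits cylinders and also estimates the error on the angle, and the coordinates below are invented.

```python
# Illustrative sketch only: angle between the axes of two halves of a "helix".
import numpy as np

def direction(points):
    """First principal axis of a set of 3D points (unit vector, sign arbitrary)."""
    centred = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centred)
    return vt[0]

def kink_angle(ca_coords, kink_index):
    v1, v2 = direction(ca_coords[:kink_index]), direction(ca_coords[kink_index:])
    cos_angle = abs(np.dot(v1, v2))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

first = np.array([[i * 1.5, 0.0, 0.0] for i in range(8)])                 # straight segment
second = np.array([[10.5 + i * 1.2, i * 0.9, 0.0] for i in range(1, 8)])  # bent continuation
print(f"kink angle = {kink_angle(np.vstack([first, second]), 8):.1f} degrees")
```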
APA, Harvard, Vancouver, ISO, and other styles
49

Gopalappa, Chaitra. "Three Essays on Analytical Models to Improve Early Detection of Cancer." Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1647.

Full text
Abstract:
Development of approaches for early detection of cancer requires a comprehensive understanding of the cellular functions that lead to cancer, as well as implementing strategies for population-wide early detection. Cell functions are supported by proteins that are produced by active or expressed genes. Identifying cancer biomarkers, i.e., the genes that are expressed and the corresponding proteins present only in a cancer state of the cell, can lead to their use for early detection of cancer and for developing drugs. There are approximately 30,000 genes in the human genome producing over 500,000 proteins, thereby posing significant analytical challenges in linking specific genes to proteins and subsequently to cancer. Along with developing diagnostic strategies, effective population-wide implementation of these strategies depends on the behavior of and interaction between the entities that comprise the cancer care system, such as patients, physicians, and insurance policies. Hence, obtaining effective early cancer detection requires developing models for a systemic study of cancer care. In this research, we develop models to address some of the analytical challenges in three distinct areas of early cancer detection, namely proteomics, genomics, and disease progression. The specific research topics (and models) are: 1) identification and quantification of proteins for obtaining biomarkers for early cancer detection (mixed integer-nonlinear programming (MINLP) and wavelet-based model), 2) denoising of gene values for use in identification of biomarkers (wavelet-based multiresolution denoising algorithm), and 3) estimation of disease progression time of colorectal cancer for developing early cancer intervention strategies (computational probability model and an agent-based simulation).
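As a hedged, generic illustration of the wavelet-based denoising idea in topic 2 (a PyWavelets sketch on a synthetic signal; the wavelet, threshold rule and data are not those used in the dissertation):

```python
# Illustrative sketch only: soft-threshold wavelet denoising of a noisy 1D signal.
import numpy as np
import pywt

rng = np.random.default_rng(7)
clean = np.sin(np.linspace(0, 4 * np.pi, 256))
noisy = clean + rng.normal(scale=0.3, size=clean.size)

coeffs = pywt.wavedec(noisy, "db4", level=4)              # multiresolution decomposition
threshold = 0.3 * np.sqrt(2 * np.log(noisy.size))         # universal-threshold style cutoff
coeffs = [coeffs[0]] + [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(coeffs, "db4")[: noisy.size]

rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
print(f"RMSE vs clean signal, before: {rmse(noisy, clean):.3f}, after: {rmse(denoised, clean):.3f}")
```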
APA, Harvard, Vancouver, ISO, and other styles
50

Miñarro, Giménez José Antonio. "Entorno para la Gestión Semántica de Información Biomédica en Investigación Traslacional." Doctoral thesis, Universidad de Murcia, 2012. http://hdl.handle.net/10803/92299.

Full text
Abstract:
Translational research aims to connect basic biomedical research with clinical research in order to reach new conclusions based on biomedical evidence. To facilitate translational research, biological and biomedical information must be related, which requires integrating biological and biomedical repositories. The life sciences are a knowledge-based discipline in which data and knowledge are represented through vast amounts of complex and changing information stored in disparate resources and in machine-unfriendly formats. Because of the complexity, volume, diversity and rapid evolution of biological information, managing biological repositories manually would demand a prohibitive investment of time and effort, so the availability of computational methods for organizing, accessing and retrieving information in a systematic way has become crucial for the progress of research in the life sciences. In this thesis, we present a framework for semantic management and integration using Semantic Web technologies, which are used to represent, store, exploit and guide the process of integrating information and knowledge. The framework assists life scientists in exploring ortholog/genetic-disease research paths by providing a precise, explicit meaning for information units and interlinking such information. As its main result, repositories of orthologous genes and proteins were integrated with genetic diseases.
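As a hedged illustration of the kind of semantic integration described above, the rdflib sketch below links a mouse gene to a disease through a human ortholog; the mini-ontology, namespaces and facts are invented, whereas the thesis builds much larger OWL/RDF resources.

```python
# Illustrative sketch only: a tiny RDF graph plus a SPARQL query over invented facts.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/bio#")
g = Graph()
g.add((EX.GeneA_human, RDF.type, EX.Gene))
g.add((EX.GeneA_mouse, RDF.type, EX.Gene))
g.add((EX.GeneA_human, EX.orthologOf, EX.GeneA_mouse))
g.add((EX.GeneA_human, EX.associatedWith, EX.DiseaseX))

query = """
PREFIX ex: <http://example.org/bio#>
SELECT ?mouseGene ?disease WHERE {
    ?humanGene ex:orthologOf ?mouseGene ;
               ex:associatedWith ?disease .
}
"""
for row in g.query(query):
    print(row.mouseGene, "->", row.disease)
```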
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography