Dissertations / Theses: 'Computational biology'

1

Istrail, Sorin. "Computational molecular biology /." Amsterdam [u.a.] : Elsevier, 2003. http://www.loc.gov/catdir/toc/fy037/2003051360.html.

2

Stegle, Oliver. "Probabilistic models in computational biology." Thesis, University of Cambridge, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.611560.

3

Athanasakis, D. "Feature selection in computational biology." Thesis, University College London (University of London), 2014. http://discovery.ucl.ac.uk/1432346/.

Abstract:

This thesis concerns feature selection, with a particular emphasis on the computational biology domain and the possibility of non-linear interaction between features. To this end, it establishes a two-step approach: feature selection first, followed by the learning of a kernel machine in the reduced representation. Optimization of kernel target alignment is proposed as a model selection criterion, and its properties are established for a number of feature selection algorithms, including some novel variants of stability selection. The thesis further studies greedy and stochastic approaches for optimizing alignment, proposing a fast stochastic method with substantial probabilistic guarantees. The proposed stochastic method compares favorably to its deterministic counterparts in terms of computational complexity and resulting accuracy. Its computational complexity and applicability to multi-class problems make it invaluable to a deep learning architecture which we propose. Very encouraging results of this architecture on a recent challenge dataset further justify this approach, with good further results on a signal peptide cleavage prediction task. These proposals are evaluated in terms of generalization accuracy, interpretability and numerical stability of the models, and speed on a number of real datasets arising from infectious disease bioinformatics, with encouraging results.
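
As a rough illustration of the model selection criterion named in this abstract, the sketch below computes kernel target alignment in its commonly used form, the Frobenius inner product of the Gram matrix K with the target kernel y yᵀ divided by the product of their Frobenius norms, for a linear kernel restricted to a chosen feature subset. The toy data, the function names and the choice of a linear kernel are assumptions made for illustration; the thesis's own selection algorithms are not reproduced here.

```python
import math

def select_features(X, cols):
    """Keep only the listed feature columns of each sample."""
    return [[row[j] for j in cols] for row in X]

def linear_kernel(X):
    """Gram matrix of inner products between all pairs of samples."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def frobenius_inner(A, B):
    """Frobenius inner product: sum of element-wise products."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def kernel_target_alignment(K, y):
    """Alignment between Gram matrix K and the ideal target kernel y y^T."""
    Y = [[yi * yj for yj in y] for yi in y]
    return frobenius_inner(K, Y) / math.sqrt(
        frobenius_inner(K, K) * frobenius_inner(Y, Y)
    )

# Hypothetical toy data: feature 0 equals the label, feature 1 is unrelated to it.
X = [[1.0, 0.5], [-1.0, 0.5], [1.0, -0.5], [-1.0, -0.5]]
y = [1, -1, 1, -1]

for cols in ([0], [1], [0, 1]):
    K = linear_kernel(select_features(X, cols))
    print(cols, round(kernel_target_alignment(K, y), 3))
```

A subset whose alignment is closer to 1 induces a kernel that better matches the labels, which is what makes the score usable for ranking candidate feature subsets.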

4

Wu, Yichao Hurd Harry L. Ji Chuanshu. "Probability approximations with applications in computational finance and computational biology." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2006. http://dc.lib.unc.edu/u?/etd,247.

Abstract:

Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2006.
Title from electronic title page (viewed Oct. 10, 2007). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Statistics and Operations Research." Discipline: Statistics and Operations Research; Department/School: Statistics and Operations Research.

5

Ranjard, Louis. "Computational biology of bird song evolution." e-Thesis University of Auckland, 2010. http://hdl.handle.net/2292/5719.

Abstract:

Individuals of a given population share more behavioural traits with each other than with members of other populations. For example, in humans, traditions are specific to regions or countries. These cultural relationships can tell us about the history of the populations, their origin and the amount of exchange between them. In birds, regional dialects have been described in many species. However, the mechanisms by which dialects form in populations are not fully understood because they are difficult to analyse experimentally. Translocated populations, with their known histories, offer an opportunity to study these mechanisms. From the study of bird vocalisations we can make inferences regarding population structure and relationships as well as their history, individual behavioural state, neuronal and physiological mechanisms or development of neuronal learning. To achieve this, cross-disciplinary approaches are necessary, combining field work, bioacoustic methods, statistical tools such as machine learning, ecological knowledge and phylogenetic methods. Here, I will describe computational methods for the treatment and classification of bird vocalisations and will use them to depict the relationships between bird populations. First, I discretise the data in order to define the cultural traits. Then phylogenetic tree-building methods are used. Two approaches are possible: first, to map these traits onto known phylogenies and, second, to directly build the phylogeny of these traits. I describe the application of these methods to test several hypotheses on bird song evolution related to both their history and the mechanisms by which they evolve. Evidence for the presence of dialects in the Puget Sound white-crowned sparrow (Zonotrichia leucophrys pugetensis) is provided on the basis of the syllable content of the songs.
The absence of vocal sexual dimorphism is reported in the Australasian gannet (or takapu, Morus serrator), a member of the Sulidae family for which extensive sexual dimorphism has been reported in other species. Subsequently, convergence between the begging calls of several cuckoo species and their respective hosts is suggested by various bioacoustic methods. In addition, the male calls of the hihi (or stitchbird, Notiomystis cincta) are analysed in an island population. The corresponding pattern of variation suggests a post-dispersal acquisition of calls via learning, which is in agreement with the most closely related species in the revised phylogeny of the hihi. Finally, the mechanisms of song evolution are depicted in translocated populations of tieke (or saddleback, Philesturnus carunculatus rufusater), resulting in the development of island dialects.

6

Lanctôt, J. Kevin. "Some string problems in computational biology." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0023/NQ51207.pdf.

7

Miller, David J. Ghosh Avijit. "New methods in computational systems biology /." Philadelphia, Pa. : Drexel University, 2008. http://hdl.handle.net/1860/2810.

8

Li, Limin, and 李丽敏. "Machine learning methods for computational biology." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2010. http://hub.hku.hk/bib/B44546749.

9

Vialette, Stéphane. "Algorithmic Contributions to Computational Molecular Biology." Habilitation à diriger des recherches, Université Paris-Est, 2010. http://tel.archives-ouvertes.fr/tel-00862069.

10

Selega, Alina. "Computational methods for RNA integrative biology." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/29630.

Abstract:

Ribonucleic acid (RNA) is an essential molecule, which carries out a wide variety of functions within the cell, from its crucial involvement in protein synthesis to catalysing biochemical reactions and regulating gene expression. Such a diverse functional repertoire is owed to the complex structures that RNA can adopt and to its flexibility as an interacting molecule. It has become possible to experimentally measure these two crucial aspects of RNA's regulatory role with technological advancements such as next-generation sequencing (NGS). NGS methods can rapidly obtain the nucleotide sequence of many molecules in parallel. Designing experiments in which only the desired parts of the molecule (or specific parts of the transcriptome) are sequenced makes it possible to study various aspects of RNA biology. Analysis of NGS data is infeasible without computational methods. One such experimental method is RNA structure probing, which aims to infer RNA structure from sequencing chemically altered transcripts. RNA structure probing data is inherently noisy, affected both by technological biases and the stochasticity of the underlying process. Most existing methods do not adequately address the issue of noise, resorting to heuristics and limiting the informativeness of their output. In this thesis, a statistical pipeline was developed for modelling RNA structure probing data, which explicitly captures biological variability, provides automated bias-correcting strategies, and generates a probabilistic output based on experimental measurements. The output of our method agrees with known RNA structures, can be used to constrain structure prediction algorithms, and remains robust to reduced sequence coverage, thereby increasing sensitivity of the technology. Another recent experimental innovation maps RNA-protein interactions at very high temporal resolution, making it possible to study rapid binding events happening on a minute time scale.
In this thesis, a non-parametric algorithm was developed for identifying significant changes in RNA-protein binding time-series between different conditions. The method was applied to novel yeast RNA-protein binding time-course data to study the role of RNA degradation in stress response. It revealed pervasive changes in the binding to the transcriptome of the yeast transcription termination factor Nab3 and the cytoplasmic exoribonuclease Xrn1 under nutrient stress. This challenged the common assumption of viewing transcriptional changes as the major driver of changes in RNA expression during stress and highlighted the importance of degradation. These findings inspired a dynamical model for RNA expression, where transcription and degradation rates are modelled using RNA-protein binding time-series data.

11

Simoni, Giulia. "Modeling Strategies for Computational Systems Biology." Doctoral thesis, Università degli studi di Trento, 2020. http://hdl.handle.net/11572/254361.

Abstract:

Mathematical models and their associated computer simulations are nowadays widely used in several research fields, such as the natural sciences, engineering, and the social sciences. In the context of systems biology, they provide a rigorous way to investigate how complex regulatory pathways are connected and how the disruption of these processes may contribute to the development of a disease, ultimately investigating the suitability of specific molecules as novel therapeutic targets. In the last decade, the launching of the precision medicine initiative has motivated the necessity to define innovative computational techniques that could be used for customizing therapies. In this context, the combination of mathematical models and computer strategies is an essential tool for biologists, who can analyze complex system pathways, as well as for the pharmaceutical industry, which is involved in promoting programs for drug discovery. In this dissertation, we explore different modeling techniques that are used for the simulation and the analysis of complex biological systems. We analyze the state of the art for simulation algorithms both in the stochastic and in the deterministic frameworks. The same dichotomy has been studied in the context of sensitivity analysis, identifying the main pros and cons of the two approaches. Moreover, we studied the quantitative system pharmacology (QSP) modeling approach that elucidates the mechanism of action of a drug on the biological processes underlying a disease. Specifically, we present the definition, calibration and validation of a QSP model describing Gaucher disease type 1 (GD1), one of the most common rare lysosomal storage disorders. All of these techniques are finally combined to define a novel computational pipeline for patient stratification.
Our approach uses modeling techniques, such as model simulations, sensitivity analysis and QSP modeling, in combination with experimental data to identify the key mechanisms responsible for the stratification. The pipeline has been applied to three test cases in different biological contexts: a whole-body model of dyslipidemia, the QSP model of GD1 and a QSP model of cardiac electrophysiology. In these test cases, the pipeline proved to be accurate and robust, allowing the interpretation of the mechanistic differences underlying the phenotype classification.

13

Zagordi, Osvaldo. "Statistical physics methods in computational biology." Doctoral thesis, SISSA, 2007. http://hdl.handle.net/20.500.11767/3971.

Abstract:

The interest of statistical physics in combinatorial optimization is not new; it suffices to think of a famous tool such as simulated annealing. Recently, statistical physics has also turned to statistical inference to address some "hard" optimization problems, developing a new class of message passing algorithms. Three applications to computational biology are presented in this thesis, namely: 1) Boolean networks, a model for gene regulatory networks; 2) haplotype inference, to study the genetic information present in a population; 3) clustering, a general machine learning tool.

14

Pettersson, Fredrik. "A multivariate approach to computational molecular biology." Doctoral thesis, Umeå : Univ, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-609.

15

Ziehm, Matthias Fritz. "Computational biology of longevity in model organisms." Thesis, University of Cambridge, 2014. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.648888.

16

Small, Benjamin Gavin. "The chemical and computational biology of inflammation." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/the-chemical-and-computational-biology-of-inflammation(4de5c19c-e377-4783-acfb-ad168ad35d46).html.

Abstract:

Non-communicable diseases (NCD) such as cancer, heart disease and cerebrovascular injury are dependent on or aggravated by inflammation. Their prevention and treatment is arguably one of the greatest challenges to medicine in the 21st century. The pleiotropic, proinflammatory cytokine interleukin-1 beta (IL-1β) is a primary, causative messenger of inflammation. Lipopolysaccharide (LPS) induction of IL-1β expression via toll-like receptor 4 (TLR4) in myeloid cells is a robust experimental model of inflammation and is driven in large part via p38-MAPK and NF-κB signaling networks. The control of signaling networks involved in IL-1β expression is distributed and highly complex, so to perturb intracellular networks effectively it is often necessary to modulate several steps simultaneously. However, the number of possible permutations for intervention leads to a combinatorial explosion in the experiments that would have to be performed in a complete analysis. We used a multi-objective evolutionary algorithm (EA) to optimise reagent combinations from a dynamic chemical library of 33 compounds with established or predicted targets in the regulatory network controlling IL-1β expression. The EA converged on excellent solutions within 11 generations during which we studied just 550 combinations out of the potential search space of ~9 billion. The top five reagents with the greatest contribution to combinatorial effects throughout the EA were then optimised pairwise with respect to their concentrations, using an adaptive, dose matrix search protocol. A p38α MAPK inhibitor (30 ± 10% inhibition alone) with either an inhibitor of IκB kinase (12 ± 9% inhibition alone) or a chelator of poorly liganded iron (19 ± 8% inhibition alone) yielded synergistic inhibition (59 ± 5% and 59 ± 4% respectively, n=7, p≥0.04 for both combinations, tested by one way ANOVA with Tukey's multiple test correction) of macrophage IL-1β expression.
Utilising the above data, in conjunction with the literature, an LPS-directed transcriptional map of IL-1β expression was constructed. Transcription factors (TFs) targeted by the signaling networks coalesce at precise nucleotide binding elements within the IL-1β regulatory DNA. Constitutive binding of the PU.1 and C/EBP-β TFs is obligate for IL-1β expression. The findings in this thesis suggest that the PU.1 and C/EBP-β TFs form scaffolds facilitating dynamic control exerted by other TFs, as exemplified by c-Jun. Similarly, evidence is emerging that epigenetic factors, such as the hetero-euchromatin balance, are also important in the relative transcriptional efficacy in different cell types. Evolutionary searches provide a powerful and general approach to the discovery of novel combinations of pharmacological agents with potentially greater therapeutic indices than those of single drugs. Similarly, the construction of signaling network maps aids the elucidation of pharmacological mechanism and is a mandatory precursor to the development of dynamic models. The symbiosis of both approaches has provided further insight into the mechanisms responsible for IL-1β expression, and the results reported here provide a platform for further developments in understanding NCDs dependent on or aggravated by inflammation.
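
The search-space figure quoted in this abstract can be checked with a few lines of arithmetic, under the assumption (not stated in the abstract) that each of the 33 compounds is simply either included in or left out of a combination. The per-generation count below is likewise inferred from 550 combinations over 11 generations and should be read as an assumption.

```python
# Assumption: a combination is any subset of the 33-compound library.
n_compounds = 33
search_space = 2 ** n_compounds  # number of possible subsets

# Figures taken from the abstract.
generations = 11
combinations_tested = 550

# Inferred, assuming every generation tested the same number of combinations.
per_generation = combinations_tested // generations
fraction_explored = combinations_tested / search_space

print(f"search space: {search_space:,}")
print(f"combinations per generation: {per_generation}")
print(f"fraction explored: {fraction_explored:.1e}")
```

Under that assumption the space holds about 8.6 billion subsets, which is consistent with the rounded figure given in the abstract.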

17

Futamura, Natsuhiko. "Algorithms for large-scale problems in computational biology." Related electronic resource: Current Research at SU : database of SU dissertations, recent titles available full text, 2002. http://wwwlib.umi.com/cr/syr/main.

18

Ding, Jiarui. "Computational methods for systems biology data of cancer." Thesis, University of British Columbia, 2016. http://hdl.handle.net/2429/58164.

Abstract:

High-throughput genome sequencing and other techniques provide a cost-effective way to study cancer biology and seek precision treatment options. In this dissertation I address three challenges in cancer systems biology research: 1) predicting somatic mutations, 2) interpreting mutation functions, and 3) stratifying patients into biologically meaningful groups. Somatic single nucleotide variants are frequent therapeutically actionable mutations in cancer, e.g., the ‘hotspot’ mutations in known cancer driver genes such as EGFR, KRAS, and BRAF. However, only a small proportion of cancer patients harbour these known driver mutations. Therefore, there is a great need to systematically profile a cancer genome to identify all the somatic single nucleotide variants. I develop methods to discover these somatic mutations from cancer genomic sequencing data, taking into account the noise in high-throughput sequencing data and valuable validated genuine somatic mutations and non-somatic mutations. Of the somatic alterations acquired for each cancer patient, only a few mutations ‘drive’ the initialization and progression of cancer. To better understand the evolution of cancer, as well as to apply precision treatments, we need to assess the functions of these mutations to pinpoint the driver mutations. I address this challenge by predicting the mutations correlated with gene expression dysregulation. The method is based on hierarchical Bayes modelling of the influence of mutations on gene expression, and can predict the mutations that impact gene expression in individual patients. Although probably no two cancer genomes share exactly the same set of somatic mutations because of the stochastic nature of acquired mutations across the three billion base pairs, some cancer patients share common driver mutations or disrupted pathways. These patients may have similar prognoses and potentially benefit from the same kind of treatment options. 
I develop an efficient clustering algorithm to cluster high-throughput and high-dimensional biological datasets, with the potential to put cancer patients into biologically meaningful groups for treatment selection.

19

Uys, Lafras. "Computational systems biology of sucrose accumulation in sugarcane." Thesis, Link to the online version, 2006. http://hdl.handle.net/10019/245.

20

Dinescu, Adriana Cundari Thomas R. "Metals in chemistry and biology computational chemistry studies /." [Denton, Tex.] : University of North Texas, 2007. http://digital.library.unt.edu/permalink/meta-dc-3678.

21

Jones, Neil Christopher. "Computational tools for high-throughput discovery in biology." Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2007. http://wwwlib.umi.com/cr/ucsd/fullcit?p3267820.

Abstract:

Thesis (Ph. D.)--University of California, San Diego, 2007.
Title from first page of PDF file (viewed August 7, 2007). Available via ProQuest Digital Dissertations. Vita. Includes bibliographical references (p. 115-127).

22

Cong, Yang, and 丛阳. "Optimization models and computational methods for systems biology." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B47752841.

Abstract:

Systems biology is a comprehensive quantitative analysis of the manner in which all the components of a biological system interact functionally over time. Mathematical modeling and computational methods are indispensable in such studies, especially for interpreting and predicting the complex interactions among all the components so as to obtain some desirable system properties. System dynamics, system robustness and control methods are three crucial properties in systems biology. In this thesis, the above properties are studied in four different biological systems. The outbreak and spread of infectious diseases have been studied for years. Understanding the spread mechanism and predicting the course of a disease could enable scientists to evaluate isolation plans that have significant effects on a particular epidemic. A differential equation model is proposed to study the dynamics of HIV spread in a network of prisons. In prisons, screening and quarantining are both efficient control measures. An optimization model is proposed to study optimal strategies for the control of HIV spread in a prison system. A primordium (plural: primordia) is an organ or tissue in its earliest recognizable stage of development. Primordial development in plants is critical to the proper positioning and development of plant organs. An optimization model and two control mechanisms are proposed to study the dynamics and robustness of primordial systems. Probabilistic Boolean Networks (PBNs) are mathematical models for studying the switching behavior in genetic regulatory networks. An algorithm is proposed to identify singleton and small attractors in PBNs, which correspond to cell types and cell states. The underlying problem is NP-hard in general. Our algorithm is theoretically and computationally demonstrated to be much more efficient than the naive algorithm that examines all the possible states.
The goal of studying the long-term behavior of a genetic regulatory network is to find control strategies such that the system can obtain desired properties. A control method is proposed to study multiple external interventions while minimizing the control cost. Robustness is a paramount property for living organisms. The impact degree is a measure of robustness of a metabolic system against the deletion of single or multiple reaction(s). An algorithm is proposed to study the impact degree in the Escherichia coli metabolic system. Moreover, an approximation method based on a branching process is proposed for estimating the impact degree of metabolic networks. The effectiveness of our method is confirmed by testing with real-world Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae and Homo sapiens metabolic systems.
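
For orientation, the naive baseline mentioned in this abstract, which examines every possible state, can be sketched for a small deterministic Boolean network; a singleton attractor is a state that the update rules map to itself. The three-gene network and the function names below are hypothetical, and the sketch covers neither the probabilistic setting nor the more efficient algorithm proposed in the thesis.

```python
from itertools import product

def singleton_attractors(update_rules):
    """Return every state that the synchronous update maps to itself.

    update_rules[i] takes the full state tuple and returns the next
    value (0 or 1) of gene i.
    """
    n = len(update_rules)
    fixed_points = []
    for state in product((0, 1), repeat=n):  # all 2**n states
        next_state = tuple(rule(state) for rule in update_rules)
        if next_state == state:
            fixed_points.append(state)
    return fixed_points

# Hypothetical three-gene network, chosen only for illustration.
rules = [
    lambda s: s[1],         # gene 0 copies gene 1
    lambda s: s[0],         # gene 1 copies gene 0
    lambda s: s[0] & s[1],  # gene 2 is on only if genes 0 and 1 are both on
]

print(singleton_attractors(rules))
```

The loop visits all 2^n states, which is why this approach is workable only for very small networks.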

23

Trybilo, Maciej. "Computational design of orthogonal microRNAs for synthetic biology." Thesis, Brunel University, 2013. http://bura.brunel.ac.uk/handle/2438/8409.

Abstract:

Upcoming applications of synthetic biology will require access to a wide array of robust genetic components (parts). The logic of a genetic system is encoded with regulatory elements such as pairs of transcription factors:promoters, miRNAs:target sites, or ribozymes:aptamers among others. Due to a relatively simple form and mode of operation of miRNAs, it is possible to design their synthetic variants. Out of all possible miRNA sequences the ones chosen should perform efficiently and should avoid cross-talk with both the host system circuits and within the imported synthetic ones. In this work, a computational method involving a series of heuristics is developed that can be used to design ensembles of such sequences depending on the host transcriptome. As an example, an ensemble of eight such miRNA sequences is produced using this method for use in a human host. Those have then been validated experimentally against the above-mentioned requirements by transfection into HEK 293 cells and flow cytometry measurements of fluorescent markers. The produced sequences are available for use from pENTR vectors of the Gateway cloning system. The required computations were facilitated by a modern cluster computing system—Kaichu—especially developed for this project, but fit for general purpose use and available under an open-source license.

24

Dinescu, Adriana. "Metals in Chemistry and Biology: Computational Chemistry Studies." Thesis, University of North Texas, 2007. https://digital.library.unt.edu/ark:/67531/metadc3678/.

Abstract:

Numerous enzymatic reactions are controlled by the chemistry of metallic ions. This dissertation investigates the electronic properties of three transition metal (copper, chromium, and nickel) complexes and describes modeling studies performed on glutathione synthetase. (1) Copper nitrene complexes were computationally characterized, as these complexes have yet to be experimentally isolated. (2) Multireference calculations were carried out on a symmetric C2v chromium dimer derived from the crystal structure of the [(tBu3SiO)Cr(µ-OSitBu3)]2 complex. (3) The T-shaped geometry of a three-coordinate β-diketiminate nickel(I) complex with a CO ligand was compared and contrasted with isoelectronic and isosteric copper(II) complexes. (4) Glutathione synthetase (GS), an enzyme that belongs to the ATP-grasp superfamily, catalyzes the (Mg, ATP)-dependent biosynthesis of glutathione (GSH) from γ-glutamylcysteine and glycine. The free and reactant forms of human GS (wild-type and glycine mutants) were modeled computationally by employing molecular dynamics simulations, as these currently have not been structurally characterized.

25

Fratkin, Eugene. "Application of non-parametric algorithms to computational biology /." May be available electronically:, 2009. http://proquest.umi.com/login?COPT=REJTPTU1MTUmSU5UPTAmVkVSPTI=&clientId=12498.

26

Bardini, Roberta. "A diversity-aware computational framework for systems biology." Doctoral thesis, Politecnico di Torino, 2019. http://hdl.handle.net/11583/2752792.

27

Codó, Tarraubella Laia. "Computational Infrastructures for biomolecular research." Doctoral thesis, Universitat de Barcelona, 2019. http://hdl.handle.net/10803/668536.

Abstract:

Recently, research processes in the life sciences have evolved at a rapid pace. This evolution, mainly due to technological advances, offers more powerful equipment and generalizes the digital format of research data. In the context of the data deluge, we need to overcome the current tsunami of data and prepare for the future. The current model, which consists of regularly adding hardware resources to centralized core facilities without global coordination, is no longer sustainable. Scientific data management and analysis should be enhanced in order to offer services and developments corresponding to the new e-Science uses, and infrastructures are the vehicles to achieve this. We propose and implement research support infrastructures in line with new science directives, adapting them to the scenarios presented by the divergent use cases. Three different domain-specific infrastructures, framed in three different scientific projects, are assembled and introduced in this dissertation. The first case is framed in the clinical data management field, and focuses on the data platforms built around two epidemiologic case studies on Immune Mediated Inflammatory diseases (IMIDs), IMID-clinica and IMID-longitudinal. Making the leap to infrastructures more oriented to supporting the analysis process, the transPLANT infrastructure represents a first foray into the topical cloud computing model. It is focused on plant genomics and its design became the seed for a more integrative cloud-based solution, this time developed for the non-programmer members of the 3D/4D genomics community. MuGVRE is the front cover of the resulting platform. As the transversal potential of cloud-based computational infrastructures as virtual research environments became obvious, openVRE was implemented as an abstraction of MuGVRE. It offers a vanilla platform encompassing computation, data and administration services ready to be adopted and customized by other scientific communities.
They all represent an opportunity to establish better research processes through enhanced collaboration, data management, analysis practices and resources optimization.

28

Camacho, Diogo Mayo. "In silico cell biology and biochemistry: a systems biology approach." Diss., Virginia Tech, 2007. http://hdl.handle.net/10919/27960.

Abstract:

In the post-"omic" era, the analysis of high-throughput data is regarded as one of the major challenges faced by researchers. One focus of this data analysis is uncovering biological network topologies and dynamics. It is believed that this kind of research will allow the development of new mathematical models of biological systems as well as aid in the improvement of existing ones. The work presented in this dissertation addresses the analysis of highly complex data sets, with the aim of developing a methodology that enables the reconstruction of a biological network from time series data through an iterative process. The first part of this dissertation analyses existing methodologies that aim to infer network structures from experimental data. These range from statistical tools such as correlation analysis (presented in Chapter 2) to more complex mathematical frameworks (presented in Chapter 3). A novel methodology that focuses on the inference of biological networks from time series data by least squares fitting is then introduced. Using a set of carefully designed inference rules, one can gain important information about the system, which can aid in the inference process. The application of the method to a data set from the response of the yeast Saccharomyces cerevisiae to cumene hydroperoxide is explored in Chapter 5. The results show that this method can be used to generate a coarse-level mathematical model of the biological system at hand. Possible developments of this method are discussed in Chapter 6.
Ph. D.

29

Weis, Michael Christian. "Computational Models of the Mammalian Cell Cycle." Case Western Reserve University School of Graduate Studies / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=case1323278159.

30

Hallett, Michael Trevor. "An integrated complexity analysis of problems from computational biology." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1996. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq21933.pdf.

31

Rahman, Muhammad Arifur. "Gaussian process in computational biology : covariance functions for transcriptomics." Thesis, University of Sheffield, 2018. http://etheses.whiterose.ac.uk/19460/.

Abstract:

In the field of machine learning, Gaussian process models are a widely used family of stochastic processes for modelling data observed over time, space or both. Gaussian process models are nonparametric, meaning that the models are developed on an infinite-dimensional parameter space. The parameter space is then typically learnt as the set of all possible solutions for a given learning problem. Gaussian process distributions are distributions over functions. The covariance function determines the properties of function samples drawn from the process. Once the decision to model with a Gaussian process has been made, the choice of the covariance function is a central step in modelling. In molecular biology and genetics, a transcription factor is a protein that binds to specific DNA sequences and controls the flow of genetic information from DNA to mRNA. To develop models of cellular processes, quantitative estimation of the regulatory relationship between transcription factors and genes is a basic requirement. Quantitative estimation is complex for several reasons. Many of the transcription factors' activities and their own transcription levels are post-transcriptionally modified, and very often the expression levels of the transcription factors are low and noisy. It is therefore useful to infer the activity of the transcription factors from the expression levels of their target genes. Here we developed a Gaussian process based nonparametric regression model to infer the exact transcription factor activities from a combination of mRNA expression levels and DNA-protein binding measurements. Clustering of gene expression time series gives insight into which genes may be coregulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different conditions or genetic backgrounds.
In this thesis, we developed a new clustering method that allows each cluster to be parametrized according to the behaviour of the genes across conditions, whether they are correlated or anti-correlated. By specifying the correlation between such genes, we gain more information within the cluster about how the genes interrelate. Our study shows the effectiveness of sharing information between replicates and different model conditions while modelling gene expression time series.
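
The abstract's point that the covariance function determines the properties of function samples can be illustrated with a small sketch. It is not the thesis's model: the squared-exponential kernel, the lengthscale values, the jitter and all function names below are assumptions chosen for illustration, using only the Python standard library.

```python
import math
import random


def rbf(x1, x2, variance=1.0, lengthscale=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def cov_matrix(xs, lengthscale, jitter=1e-6):
    """Covariance matrix over xs; diagonal jitter keeps it positive definite."""
    return [[rbf(a, b, lengthscale=lengthscale) + (jitter if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]


def cholesky(K):
    """Return lower-triangular L such that L times its transpose equals K."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L


def sample_gp(xs, lengthscale, rng):
    """Draw one zero-mean function sample at the points xs."""
    L = cholesky(cov_matrix(xs, lengthscale))
    z = [rng.gauss(0.0, 1.0) for _ in xs]
    # Multiplying standard normal draws by L gives a sample with covariance K.
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(xs))]


xs = [float(t) for t in range(8)]      # e.g. eight hypothetical time points
rng = random.Random(0)
wiggly = sample_gp(xs, lengthscale=0.5, rng=rng)   # neighbours weakly correlated
smooth = sample_gp(xs, lengthscale=2.0, rng=rng)   # neighbours strongly correlated
```

A longer lengthscale raises the covariance between neighbouring time points, so samples vary more slowly; changing only the kernel changes the kind of function the prior favours.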

32

Castillo, Andrea R. (Andrea Redwing). "Assessing computational methods and science policy in systems biology." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/51655.

Abstract:

Thesis (S.M. in Technology and Policy)--Massachusetts Institute of Technology, Engineering Systems Division, Technology and Policy Program, 2009.
Includes bibliographical references (p. 109-112).
In this thesis, I discuss the development of systems biology and issues in the progression of this scientific discipline. Traditional molecular biology has been driven by reductionism, with the belief that breaking down a biological system into its fundamental biomolecular components will elucidate such phenomena. We have reached the limits of this approach due to the complex and dynamical nature of life and our inability to intuit biological behavior from a modular perspective [37]. Mathematical modeling has been integral to current systems biology endeavors, since detailed analysis would be invasive if performed on humans experimentally or in clinical trials [17]. The interspecies commonalities in systemic properties and molecular mechanisms suggest that certain behaviors transcend species differentiation and therefore lend themselves to generalization from simpler organisms to more complex organisms such as humans [7, 17]. Current methodologies in mathematical modeling and analysis have been diverse and numerous, with no standardization to progress the discipline in a collaborative manner. Without collaboration during this formative period, successful development and application of systems biology for societal welfare may be at risk. Furthermore, such collaboration has to be standardized in a fundamental approach to discover generic principles, in the manner of preceding long-standing scientific disciplines. This study implements and analyzes a mathematical model of a three-protein biochemical network, the Synechococcus elongatus circadian clock.
I use mass action theory expressed in Kronecker products to exploit the ability to apply numerical methods (including sensitivity analysis via boundary value formulation (BVP) and the trapezoidal integration rule) and experimental techniques (including partial reaction fitting and enzyme-driven activations) when mathematically modeling large-scale biochemical networks. Among other applicable methodologies, my approach is grounded in the law of mass action because it is based on experimental data and biomolecular mechanistic properties, yet provides predictive power in the complete delineation of the biological system dynamics for all future time points. The results of my research demonstrate the holistic approach that mass action methodologies offer in determining emergent properties of biological systems. I further stress the necessity to enforce collaboration and standardization in future policymaking, with reconsideration of current stakeholder incentives, to redirect the focus of academia and industry from new molecular entities to a holistic understanding of the complexities and dynamics of living entities. Such redirection away from reductionism could further progress basic and applied scientific research to improve our circumstances through new treatments and preventive measures for health, and development of new strains and disease control in agriculture and ecology [13].
by Andrea R. Castillo.
S.M. in Technology and Policy
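
As a rough illustration of the two ingredients the abstract names, the law of mass action and the trapezoidal integration rule, the sketch below integrates a toy reversible reaction A <-> B. It is not the thesis's three-protein clock model: the reaction, the rate constants `kf` and `kr`, the step size and the function names are all assumptions made for the example.

```python
def trapezoid_step(a, total, kf, kr, h):
    """One implicit trapezoidal step for A <-> B under mass action.

    dA/dt = -kf*A + kr*B with B = total - A. The rate is linear in A,
    so the implicit trapezoidal equation is solved in closed form here.
    """
    s = kf + kr
    return (a * (1.0 - 0.5 * h * s) + h * kr * total) / (1.0 + 0.5 * h * s)


def simulate(a0, total, kf, kr, h, steps):
    """Return the trajectory of A over the given number of steps."""
    traj = [a0]
    for _ in range(steps):
        traj.append(trapezoid_step(traj[-1], total, kf, kr, h))
    return traj


# Hypothetical parameters: all mass starts as A, forward rate twice the reverse.
traj = simulate(a0=1.0, total=1.0, kf=2.0, kr=1.0, h=0.1, steps=200)
equilibrium = 1.0 * 1.0 / (2.0 + 1.0)   # kr * total / (kf + kr)
```

The trajectory relaxes towards the mass-action equilibrium kr * total / (kf + kr). For nonlinear networks the implicit step would need an iterative solver instead of the closed form used here.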

33

Haider, Syed Abbas. "Computational systems biology-based feature selection for cancer prognosis." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610378.

34

Picart, Armada Sergio. "Statistical normalisation of network propagation methods for computational biology." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/672381.

Abstract:

The advent of high-throughput technologies and their decreasing cost have fostered the creation of a rich ecosystem of public database resources. In an era of affordable data acquisition, the core challenge has shifted to improve data interpretation, in order to understand normal and disease states. To that end, leveraging the current contextual knowledge in the form of annotations and biological networks is a powerful data amplifier to elucidate novel hypotheses. Label propagation and diffusion are the linchpin of the state of the art in network algorithms. In its simplest form, label propagation predicts the labels of a given node (for instance a gene, protein or metabolite) using those of its interactors. More elaborated approaches propagate beyond direct interactors, with robust performance in many computational biology domains. It has been pointed out that the topological structure of biological networks can bias propagation algorithms. Poorly known entities are overlooked and harder to link to experimental findings, which in turn keeps them barely annotated. Some efforts try to break this circularity by statistically normalising the topological bias, but the properties of the bias and the real benefit of its removal are yet to be carefully examined. This thesis covers two blocks. First, a characterisation of the bias in diffusion-based algorithms, with the implementation of statistical normalisations. Second, the application of such normalisation in classical computational biology problems: pathway analysis for metabolomics data and target gene prediction for drug development. In the first block, the presence of the bias is confirmed and linked to the network topology, albeit dependent on which nodes have labels. Equivalences are proven between diffusion processes with variations on their definitions, thus easing its choice. 
Closed forms on the first and second statistical moments of the null distributions of the diffusion scores are provided and linked to the spectral features of the network. The normalisation can be detrimental if the bias favours nodes with positive labels. An ad-hoc study of the data and the expected properties of the findings is recommended for an optimal choice. To that end, this thesis contributes the diffuStats software package, easing the computation and benchmark of several normalised and unnormalised diffusion scores. The second block starts with pathway analysis for metabolomics data. This choice is driven by the relative lack of computational solutions for metabolomics, whose output still requires an effortful interpretation. Here, a knowledge graph is conceived to connect the metabolites to the biological pathways through intermediate entities, like reactions and enzymes. Given the metabolites of interest, a propagation process is run to prioritise a relevant sub-network, suitable for manual inspection. The statistical normalisation is required due to the network design and properties. The usefulness of this approach is proven not only regarding pathway findings, but also examining the metabolites and reactions within the suggested sub-networks. The knowledge network construction and the propagation algorithm are distributed in the FELLA software package. The second practical application is the prediction of plausible gene targets in disease. Besides benchmarking the effect of the statistical normalisation, particular care is put into obtaining meaningful performance estimates for practical drug development. Target data is usually known at the protein complex level, which leads to performance over-estimation if ignored. Here, this effect is corrected in a varied comparison of prioritisation algorithms, networks, performance metrics and diseases. The results support that the statistical normalisation has a small but negative impact. 
After correcting for the protein complex structure, network-based algorithms are still deemed useful for drug discovery.
The emergence of high-throughput experimental technologies has fostered a rich environment of databases that gather all kinds of molecular annotations. Given the growing ease of acquiring data at several molecular levels, the central challenge of computational biology has shifted towards interpreting this volume of data. Understanding the processes of health and disease behind the changes observed in experimental studies is the engine that expands the frontier of human knowledge. To that end, it is essential to take advantage of the body of prior knowledge, collected in databases in the form of annotations and biological networks, and to mine it for new patterns and hypotheses. The most widespread algorithms for extracting knowledge from biological networks are the so-called propagation and diffusion methods. They rest on the guilt-by-association principle, which postulates that biological entities that are related or interact are more likely to share functions and properties. These algorithms use the known interactions, in network form, to predict properties of nodes (for example, genes, proteins or metabolites) from the properties of their interactors. There is evidence that the topological structure of networks biases propagation algorithms, so that the best-described nodes enjoy a systematic advantage. The less well-known nodes are at a disadvantage: discovering their involvement in experiments is hindered, which in turn perpetuates our poor knowledge of them. The literature offers some studies in which this effect is normalised, but the intrinsic properties of the bias and the real benefit of such normalisation require a more detailed study. The aim of this thesis is twofold.
First, the characterisation of the statistics of the bias in propagation algorithms, the design of statistical normalisations, and their distribution as scientific software. Second, the application of such normalisation to classical problems in computational biology, specifically pathway analysis for metabolomics data and the prediction of genes as therapeutic targets in drug development. Both problems can be tackled with propagation techniques and are therefore potentially sensitive to the effect of the topological bias. In the first block, the existence of the bias is corroborated, along with its dependence not only on the network structure but also on the nodes on which the propagation is defined. Mathematical equivalences are proven between certain variations in the definition of the propagation, thus easing the choice among them. Closed-form expressions for the statistical moments of the diffusion are provided, and a connection with the spectral properties of the networks is found. An important point is that normalisation does not always help; its applicability depends on each particular case and on the hypotheses about the topology of the nodes to be discovered. To that end, this thesis delivers diffuStats, a software package available in a public repository, which computes and compares propagation under certain variants, with or without normalisation. In the second block, pathway analysis in metabolomics is chosen given the relative youth of metabolomic studies and, hence, their lack of dedicated computational tools. Classical pathway analysis starts from a list of metabolites of interest, usually derived from a study, and reports a list of pathways or metabolic processes statistically related to them. Some variants use metabolite networks to give more biological context, but interpreting the data still requires extensive manual effort.
The contribution of this thesis is the creation of a knowledge network that relates metabolites to pathways through the annotated intermediate entities, such as reactions and enzymes. Propagation algorithms are applied on this network to identify the entities most related to the metabolites of interest. Statistical normalisation is necessary given the structure and characteristics of the network. The coherence of the proposed metabolic pathways is demonstrated, as is that of the prioritised metabolites and reactions. The FELLA software package makes the knowledge network construction and the diffusion algorithm available to the scientific community. FELLA is accompanied by six application cases in human and animal studies. Separately, the problem of predicting genes as therapeutic targets through biological networks is addressed. Besides testing the effect of statistical normalisation, emphasis is placed on estimating the real performance expected in a drug development scenario. Therapeutic target data are usually known not at the protein level but at the level of the protein complex or family. Most studies do not take this into account and arrive at optimistic estimates of the expected performance. This thesis proposes an exhaustive study that corrects for the effect of protein complexes, compares propagation algorithms using different performance metrics according to their informativeness, and explores the role of the biological network and of the disease in question. It is shown that statistical normalisation has little effect on performance and that, in general, propagation methods remain useful in drug development after correcting the optimistic estimates of their performance.
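
The topological bias described in this abstract, and the idea of normalising it statistically, can be sketched on a toy graph. This is not the diffuStats or FELLA implementation: the one-step neighbour score, the label-permutation null model, the graph and every function name are assumptions made for illustration. The mean and variance are the standard moments of a neighbour count when a fixed number of positive labels is shuffled uniformly over the nodes.

```python
from math import sqrt


def raw_scores(adj, labels):
    """One-step propagation: a node's score is the sum of its neighbours' labels."""
    return [sum(labels[j] for j in adj[i]) for i in range(len(adj))]


def z_scores(adj, labels):
    """Standardise each raw score against a label-permutation null (assumed null)."""
    n = len(labels)
    p = sum(labels) / n                      # fraction of positive labels
    out = []
    for i, s in enumerate(raw_scores(adj, labels)):
        d = len(adj[i])                      # node degree
        mean = p * d                         # expected positive neighbours
        var = p * (1 - p) * d * (n - d) / (n - 1)
        out.append((s - mean) / sqrt(var) if var > 0 else 0.0)
    return out


# Hypothetical network: node 0 is a hub, node 5 has a single neighbour.
adj = [[1, 2, 3, 4], [0], [0], [0], [0, 5], [4]]
labels = [0, 1, 0, 0, 1, 0]                  # nodes 1 and 4 carry positive labels

raw = raw_scores(adj, labels)
z = z_scores(adj, labels)
# The raw score ranks hub 0 above node 5; after normalisation node 5 ranks higher.
```

The hub collects a high raw score partly because it has many neighbours; standardising against the null removes that degree effect. As the abstract notes, such a correction is not always beneficial, so the choice depends on the data at hand.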

35

McFarlane, Ross. "High-performance computing for computational biology of the heart." Thesis, University of Liverpool, 2010. http://livrepository.liverpool.ac.uk/3173/.

Abstract:

This thesis describes the development of Beatbox, a simulation environment for computational biology of the heart. Beatbox aims to provide an adaptable, approachable simulation tool and an extensible framework with which High Performance Computing may be harnessed by researchers. Beatbox is built upon the QUI software package, which is studied in Chapter 2. The chapter discusses QUI's functionality and common patterns of use, and describes its underlying software architecture, in particular its extensibility through the addition of new software modules called 'devices'. The chapter summarises good practice for device developers in the Laws of Devices. Chapter 3 discusses the parallel architecture of Beatbox and its implementation for distributed memory clusters. The chapter discusses strategies for domain decomposition and halo swapping, and introduces an efficient method for exchange of data with diagonal neighbours called Magic Corners. The development of Beatbox's parallel Input/Output facilities is detailed, and its impact on scaling performance discussed. The chapter discusses the way in which parallelism can be hidden from the user, even while permitting the runtime execution of user-defined functions. The chapter goes on to show how QUI's extensibility can be continued in a parallel environment by providing implicit parallelism for devices and defining Laws of Parallel Devices to guide third-party developers. Beatbox's parallel performance is evaluated and discussed. Chapter 4 describes the extension of Beatbox to simulate anatomically realistic tissue geometry. Representation of irregular geometries is described, along with associated user controls. A technique to compute no-flux boundary conditions on irregular boundaries is introduced. The Laws of Devices are further developed to include irregular geometries. Finally, parallel performance of anatomically realistic meshes is evaluated.
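
Domain decomposition with halo swapping, as mentioned in this abstract, can be sketched in one dimension without any message-passing library. This is not Beatbox code and does not cover Magic Corners, which concerns diagonal neighbours in higher dimensions. The explicit diffusion update, the mirrored no-flux ends, the chunk layout and all names are assumptions for illustration, and the "processes" are simply lists handled in a loop.

```python
def serial_step(u, a):
    """Explicit diffusion step on the whole domain with mirrored (no-flux) ends."""
    padded = [u[0]] + list(u) + [u[-1]]
    return [padded[i + 1] + a * (padded[i] - 2 * padded[i + 1] + padded[i + 2])
            for i in range(len(u))]


def split(u, parts):
    """Cut u into equal subdomains, each padded with one ghost cell per side."""
    size = len(u) // parts                   # assumes len(u) is divisible by parts
    return [[0.0] + u[k * size:(k + 1) * size] + [0.0] for k in range(parts)]


def halo_swap(chunks):
    """Fill ghost cells from neighbouring subdomains (or mirror at the ends)."""
    last = len(chunks) - 1
    for k, c in enumerate(chunks):
        c[0] = chunks[k - 1][-2] if k > 0 else c[1]
        c[-1] = chunks[k + 1][1] if k < last else c[-2]


def local_step(c, a):
    """Update only the interior cells of one subdomain."""
    interior = [c[i] + a * (c[i - 1] - 2 * c[i] + c[i + 1])
                for i in range(1, len(c) - 1)]
    return [0.0] + interior + [0.0]


def gather(chunks):
    """Reassemble the global field from subdomain interiors."""
    return [v for c in chunks for v in c[1:-1]]


u0 = [0.0] * 8
u0[3] = 1.0                                  # a single initial excitation
serial = u0
chunks = split(u0, 4)
for _ in range(5):
    serial = serial_step(serial, 0.25)
    halo_swap(chunks)                        # exchange boundary values first
    chunks = [local_step(c, 0.25) for c in chunks]
decomposed = gather(chunks)
```

After the same number of steps the decomposed result matches the serial one, which is the property a halo exchange has to preserve. In a distributed-memory setting the reads from neighbouring lists would become messages between processes.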

36

Kudahl, Ulrich Johan. "A computational biology approach to studying algae-bacterial interactions." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276956.

Abstract:

Microalgae have a profound effect on the world due to their large contribution to net carbon fixation. Although they are phototrophic, more than 50% of microalgae are thought to depend on an external supply of metabolites such as B vitamins. In the oceans, algae are therefore often found together with a community of bacteria, forming intricate networks in which metabolites are exchanged. Currently, only a fraction of the mechanisms and metabolite exchanges between algae and bacteria have been uncovered, and many more are likely to exist. The work presented in this thesis is based on a model system for algae-bacterial interactions made up of the green alga Lobomonas rostrata and the alpha-proteobacterium Mesorhizobium loti. In this model system, the bacterium is known to provide vitamin B12 to the alga and itself, whilst the alga provides fixed carbon. I have applied methods from the field of computational biology to study the interactions between these organisms and other similar partnerships, with the aim of uncovering new insights. The thesis is made up of three research chapters, each focused on using a specific method to study algae-bacterial interactions. I developed a genome-scale metabolic model of M. loti that enabled simulation of growth. The model simulates 1908 enzymatic reactions and takes 1804 metabolites into account. Using the model, I simulated growth of the bacterium on 1018 different substrates with the aim of identifying substrates supplied by L. rostrata when the two organisms are co-cultured. In addition, I carried out a set of simulations studying the bacterium's ability to produce B12 from 1368 different substrates. The modelling efforts in this project were successful in enabling simulations, but it was not possible to validate the simulations with experimental data. A transcriptomics experiment was undertaken with the aim of identifying genes related to the interaction between L. rostrata and M. loti.
In the experiment, the partners from the model system were grown in axenic and co-culture conditions and RNA samples were taken from each state. The RNA samples were sequenced using RNA-seq, and from this a candidate transcriptome was created. The expression of each putative gene was then quantified and differentially expressed genes were identified. Based on sequence similarity, candidate functions were assigned where possible. The analysis of differentially expressed genes suggested increased expression of a transporter responsible for uptake of the plant hormone auxin. Currently, only a small fraction of all bacteria have been shown to produce B12, and it is not clear in which phylogenetic groups this is a common trait. I therefore applied methods from comparative genomics to study the synthesis of this metabolite in more than 8000 bacterial species. This involved developing a computational framework that allowed me to search rapidly for the presence of more than 50 genes in more than 8000 genomes. I found that 37.2% of bacteria can synthesise B12 and that this capability is very common in some phylogenetic groups such as Cyanobacteria, but extremely rare in others such as Lactobacillus. I was also able to confirm that cyanobacteria are not able to make cobalamin, a variant of B12 used by eukaryotic algae, and thus they are unlikely to support algal growth in the photic zone. In the final section of the thesis, I discuss the application of computational biology methods in this field and summarise my experience from applying genome-scale modelling, comparative genomics and transcriptomics to study algae-bacterial interactions.

37

Warne, David James. "Computational inference in mathematical biology: Methodological developments and applications." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/202835/1/David_Warne_Thesis.pdf.

Abstract:

Complexity in living organisms occurs on multiple spatial and temporal scales. The function of tissues depends on interactions of cells, and in turn, cell dynamics depends on intercellular and intracellular biochemical networks. A diverse range of mathematical modelling frameworks are applied in quantitative biology. Effective application of models in practice depends upon reliable statistical inference methods for experimental design, model calibration and model selection. In this thesis, new results are obtained for quantification of contact inhibition and cell motility mechanisms in prostate cancer cells, and novel computationally efficient inference algorithms suited for the study of biochemical systems are developed.

38

Subramanian, Ayshwarya. "Inferring tumor evolution using computational phylogenetics." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/275.

Abstract:

Cancer research has made tremendous progress in understanding the basic biology of tumors. One of the key insights that has informed work in this area is the recognition that a tumor is an evolutionary system, in which individual cells undergo a process of rapid mutation and selection leading to a progression in phenotypes and, typically, aggressiveness of the tumor. Tumor phylogenetics is a strategy for interpreting the evolution of tumors using computer algorithms for phylogenetics, i.e., the inference of evolutionary trees. The approach takes advantage of a large body of phylogenetic theory and algorithms, developed primarily for inferring evolution among species, to interpret complex tumor data sets as evidence for evolutionary processes. The result is a tumor phylogeny, or phylogenetic tree, a reconstruction of the sequences of mutations that cells within a tumor or class of tumors accumulate over the course of their progression. The goals of finding such trees are to better interpret heterogeneity within and among tumors, identify and classify tumor subtypes with possible underlying mechanisms of action, learn markers of progression for key steps in tumor evolution, and enable predictive modeling of likely tumor progression steps that may ultimately assist in diagnosis and treatment. In this dissertation, we discuss a computational framework for reconstructing phylogenies from genome-scale tumor array and sequencing data. We first present a novel phylogenetic pipeline for building tumor phylogenies from whole-genome copy number variation data. The steps included computational unmixing for resolving heterogeneity in genomic data from tumors, a statistical method for progression marker discovery, a statistical method for data discretization, application of character-based phylogeny reconstruction, and analyses of the resulting trees to draw biological significance. 
We then describe HMM-CNA, an improved model for discovering progression markers from cohorts of patient tumor copy number data that are especially relevant for phylogeny reconstruction via a custom multi-sample Hidden Markov model (HMM). We next present a novel strategy for phylogeny building from single cell sequencing data by inferring features that can accurately capture the composition of the individual genome sequences and distinguish among stages of tumor progression. We demonstrate these contributions on both simulated and human breast tumor biopsy and cell line data assuming a maximum parsimony model of evolution. Finally, we discuss future directions for building a more realistic model of tumor evolution by integrating patterns in genome structural changes with the functional elements they encode. We close with a discussion of recent research, current trends, and challenges and opportunities facing the field.

39

Ballweg, Richard A. III. "Computational Analysis of Heterogeneous Cellular Responses." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin159216973756476.

40

Karathia, Hiren Mahendrabhai. "Development and application of computational methdologies for Integrated Molecular Systems Biology." Doctoral thesis, Universitat de Lleida, 2012. http://hdl.handle.net/10803/110518.

Abstract:

L'objectiu del treball presentat en aquesta tesi va ser el desenvolupament i l'aplicació de metodologies computacionals que integren l’anàlisis de informació sobre seqüències proteiques, informació funcional i genòmica per a la reconstrucció, anotació i organització de proteomes complets, de manera que els resultats es poden comparar entre qualsevol nombre d'organismes amb genomes completament seqüenciats. Metodològicament, m'he centrat en la identificació de l'organització molecular dins d'un proteoma complet d'un organisme de referència i comparació amb proteomes d'altres organismes, en espacial, estructural i funcional, el teixit cel • lular de desenvolupament, o els nivells de la fisiologia. La metodologia es va aplicar per abordar la qüestió de la identificació de organismes model adequats per a estudiar diferents fenòmens biològics. Això es va fer mitjançant la comparació d’un conjunt de proteines involucrades en diferents fenòmens biològics en Saccharomyces cerevisiae i Homo sapiens amb els conjunts corresponents d'altres organismes amb genomes. La tesi conclou amb la presentació d'un servidor web, Homol-MetReS, en què s'implementa la metodologia. Homol-MetReS proporciona un entorn de codi obert a la comunitat científica en què es poden realitzar múltiples nivells de comparació i anàlisi de proteomes.
El objetivo del trabajo presentado en esta tesis fue el desarrollo y la aplicación de metodologías computacionales que integran el análisis de la secuencia y de la información funcional y genómica, con el objetivo de reconstruir, anotar y organizar proteomas completos, de tal manera que estos proteomas se puedan comparar entre cualquier número de organismos con genomas completamente secuenciados. Metodológicamente, I centrado en la identificación de organización molecular dentro de un proteoma completo de un organismo de referencia, vinculando cada proteína en que proteoma a las proteínas de otros organismos, de tal manera que cualquiera puede comparar los dos proteomas en espacial, estructural, funcional tejido, celular, el desarrollo o los niveles de la fisiología. La metodología se aplicó para abordar la cuestión de la identificación de organismos modelo adecuados para estudiar diferentes fenómenos biológicos. Esto se hizo comparando conjuntos de proteínas involucradas en diferentes fenómenos biológicos en Saccharomyces cerevisiae y Homo sapiens con los conjuntos correspondientes de otros organismos con genomas completamente secuenciados. La tesis concluye con la presentación de un servidor web, Homol-MetReS, en el que se implementa la metodología. Homol-MetReS proporciona un entorno de código abierto a la comunidad científica en la que se pueden realizar múltiples niveles de comparación y análisis de proteomas.
The aim of the work presented in this thesis was the development and application of computational methodologies that integrate sequence, functional, and genomic information to provide tools for the reconstruction, annotation and organization of complete proteomes in such a way that the results can be compared between any number of organisms with fully sequenced genomes. Methodologically, I focused on identifying molecular organization within a complete proteome of a reference organism and comparing it with the proteomes of other organisms at the spatial, structural, functional, cellular, tissue, developmental, or physiological levels. The methodology was applied to address the issue of identifying appropriate model organisms to study different biological phenomena. This was done by comparing the protein sets involved in different biological phenomena in Saccharomyces cerevisiae and Homo sapiens with the corresponding sets from other organisms with fully sequenced genomes. This thesis concludes by presenting a web server, Homol-MetReS, on which the methodology is implemented. It provides an open source environment to the scientific community on which they can perform multi-level comparison and analysis of proteomes.

41

Weirather, Jason Lee. "Computational approaches to the study of human trypanosomatid infections." Thesis, The University of Iowa, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3609102.

Abstract:

Trypanosomatids cause human diseases such as leishmaniasis and African trypanosomiasis. Trypanosomatids are protists from the order Trypanosomatida and include species of the genera Trypanosoma and Leishmania, which occupy a similar ecological niche. Both have digenetic life cycles, alternating between an insect vector and a range of mammalian hosts. However, the strategies used to subvert the host immune system differ greatly between species, as do the clinical outcomes of infection. The genomes of both the host and the parasite instruct us about strategies the pathogens use to subvert the human immune system, and adaptations by the human host allowing us to better survive infections. We have applied unsupervised learning algorithms to aid visualization of amino acid sequence similarity and the potential for recombination events within Trypanosoma brucei's large repertoire of variant surface glycoproteins (VSGs). Methods developed here reveal five groups of VSGs within a single sequenced genome of T. brucei, indicating many likely recombination events occurring between VSGs of the same type, but not between those of different types. These tools and methods can be broadly applied to identify groups of non-coding regulatory sequences within other trypanosomatid genomes. To aid in the detection, quantification, and species identification of Leishmania DNA isolated from environmental or clinical specimens, we developed a set of quantitative-PCR primers and probes targeting a taxonomically and geographically broad spectrum of Leishmania species. This assay has been applied to DNA extracted from both human and canine hosts as well as the sand fly vector, demonstrating its flexibility and utility in a variety of research applications.
Within the host genomes, fine mapping SNP analysis was performed to detect polymorphisms in a family study of subjects in a region of Northeast Brazil that is endemic for Leishmania infantum chagasi, the parasite causing visceral leishmaniasis. These studies identified associations between genetic loci and the development of visceral leishmaniasis, with a single polymorphism associated with an asymptomatic outcome after infection. The methods and results presented here have capitalized on the large amount of genomics data becoming available that will improve our understanding of both parasite and host genetics and their role in human disease.

42

Facchetti, Giuseppe. "Computational approaches to complex biological networks." Doctoral thesis, SISSA, 2013. http://hdl.handle.net/20.500.11767/4822.

Abstract:

The need to understand and model biological networks is one of the raisons d'être of, and one of the driving forces behind, the emergence of Systems Biology. Because of its holistic approach and because of the widely different levels of complexity of the networks, different mathematical methods have been developed over the years. Some of these computational methods are used in this thesis in order to investigate various properties of different biological systems. The first part deals with the prediction of the perturbation of cellular metabolism induced by drugs. Using Flux Balance Analysis to describe the reconstructed genome-wide metabolic networks, we consider the problem of identifying the most selective drug synergisms for given therapeutic targets. The second part of this thesis considers gene regulatory and large social networks as signed graphs (activation/deactivation or friendship/hostility are rephrased as positive/negative coupling between spins). Using the analogy with an Ising spin glass, an analysis of the energy landscape and of the content of “disorder” is carried out. Finally, the last part concerns the study of the spatial heterogeneity of the signaling pathway of rod photoreceptors. The electrophysiological data produced by our collaborators in the Neurobiology laboratory have been analyzed with various dynamical systems, giving an insight into the process of ageing of photoreceptors and into the role of diffusion in the pathway.
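
As a hedged illustration of the signed-graph view described in this abstract (a sketch, not the thesis's own code), the Python below treats each edge sign as an Ising coupling of +1 or -1 and brute-forces the spin assignment that minimizes the number of frustrated edges, meaning edges for which sign × s_i × s_j = -1. The function name and the toy triangles are hypothetical, and exhaustive search is only feasible for very small graphs.

```python
from itertools import product


def min_frustration(n_nodes, signed_edges):
    """Return (minimum number of frustrated edges, one optimal spin assignment).

    signed_edges: list of (i, j, sign) with sign = +1 (activation/friendship)
    or -1 (deactivation/hostility). An edge is frustrated when
    sign * s_i * s_j == -1 for spins s in {-1, +1}.
    """
    best_count, best_spins = None, None
    # Fix spin 0 to +1: flipping every spin leaves each edge term unchanged.
    for rest in product((1, -1), repeat=n_nodes - 1):
        spins = (1,) + rest
        frustrated = sum(
            1 for i, j, sign in signed_edges if sign * spins[i] * spins[j] == -1
        )
        if best_count is None or frustrated < best_count:
            best_count, best_spins = frustrated, spins
    return best_count, best_spins


# Toy data (hypothetical): a balanced triangle and an unbalanced one.
balanced = [(0, 1, 1), (1, 2, -1), (0, 2, -1)]
unbalanced = [(0, 1, 1), (1, 2, 1), (0, 2, -1)]
```

A triangle with an even number of negative edges can be satisfied completely, while one with an odd number always leaves at least one edge frustrated; this residual frustration is one simple way to quantify the "disorder" of a signed network.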

43

Jones-Rhoades, Matthew W. (Matthew William). "Computational and experimental analysis of plant microRNAs." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/31191.

Abstract:

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Biology, 2005.
Includes bibliographical references.
MicroRNAs (miRNAs) are small, endogenous, non-coding RNAs that mediate gene regulation in plants and animals. We demonstrated that Arabidopsis thaliana miRNAs are highly complementary (0-3 mispairs in an ungapped alignment) to more mRNAs than would be expected by chance. These mRNAs are therefore putative regulatory targets of their complementary miRNAs. Many miRNA complementary sites are conserved in the monocot Oryza sativa (rice), implying evolutionary conservation based on function at the nucleotide level. The majority of predicted miRNA targets encode transcription factors and other proteins with known or inferred roles in developmental patterning, implying that the miRNAs themselves are high-level regulators of development. Our findings indicated that miRNAs are key components of numerous regulatory circuits in plants and set the stage for numerous additional experiments to investigate in depth the significance of miRNA-mediated regulation for particular target families and genes. We developed a comparative genomics approach to identify miRNAs and miRNA targets conserved between Arabidopsis and Oryza. Seven previously unknown miRNA families were experimentally verified, bringing the total number of known miRNA genes in Arabidopsis to 92, representing 22 families. We expanded the range of functionalities known to be regulated by miRNAs to include F-box proteins, laccases, superoxide dismutases, and ATP-sulfurylases. The expression of miR395, which targets sulfate-metabolizing enzymes, is induced by sulfate starvation, demonstrating that miRNA expression can be responsive to growth conditions.
We investigated the biological role of miR394-mediated regulation of At1g27340, an F-box gene of previously unknown function. Transgenic plants expressing a miR394-resistant version of At1g27340 displayed a range of developmental abnormalities, including radialized and fused cotyledons, absent shoot apical meristems, curled and radialized leaves, and abortive flowers. The severity of these abnormalities correlated with the overaccumulation of At1g27340 mRNA. These findings confirm the biological relevance of the interaction between miR394 and At1g27340, and represent the first insights into the roles of miRNA-mediated regulation of F-box genes. Our results establish that both MIR394 and At1g27340 are important regulators of meristem identity, and suggest that At1g27340 targets an activator of class III HD-ZIP function for ubiquitination and proteolysis.
by Matthew W. Jones-Rhoades.
Ph.D.
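
The complementarity criterion quoted in this abstract (0-3 mispairs in an ungapped alignment) can be sketched as a sliding-window scan. The Python below is a minimal illustration under stated assumptions, not the thesis's pipeline: only Watson-Crick pairs count as matches (G:U wobble pairs are treated as mispairs, which may differ from the scoring actually used), and the function names and toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string over A, C, G, U."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def complementary_sites(mirna, mrna, max_mispairs=3):
    """List (start, mispairs) for ungapped mRNA windows that pair with mirna.

    Assumption: any position that is not a Watson-Crick pair is a mispair.
    """
    # A perfectly complementary site reads as the miRNA's reverse complement.
    target = reverse_complement(mirna)
    width = len(target)
    sites = []
    for start in range(len(mrna) - width + 1):
        window = mrna[start:start + width]
        mispairs = sum(1 for a, b in zip(window, target) if a != b)
        if mispairs <= max_mispairs:
            sites.append((start, mispairs))
    return sites


# Toy sequences (hypothetical, far shorter than a real ~21-nt miRNA).
toy_mirna = "UGCCAAAGGA"
toy_mrna = "AAAA" + reverse_complement(toy_mirna) + "GGGG"
```

Assessing whether such sites occur more often than expected by chance would additionally need a background model, for example scanning shuffled miRNA sequences, which this sketch leaves out.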

44

Kasap, Server. "High performance reconfigurable architectures for bioinformatics and computational biology applications." Thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/24757.

Abstract:

The field of Bioinformatics and Computational Biology (BCB), a relatively new discipline which spans the boundaries of Biology, Computer Science and Engineering, aims to develop systems that help organise, store, retrieve and analyse genomic and other biological information in a convenient and speedy way. This new discipline emerged mainly as a result of the Human Genome Project, which succeeded in transcribing the complete DNA sequence of the human genome, hence making it possible to address many problems which were impossible to even contemplate before, with a plethora of applications including disease diagnosis, drug engineering, bio-material engineering and genetic engineering of plants and animals; all with a real impact on the quality of life of ordinary individuals. Due to the sheer immensity of the data sets involved in BCB algorithms (often measured in tens/hundreds of Gigabytes) as well as their computational demands (often measured in Tera-Ops), high performance supercomputers and computer clusters have been used as implementation platforms for high performance BCB computing. However, the high cost as well as the lack of suitable programming interfaces for these platforms still impede a wider uptake of this technology in the BCB community. Moreover, with increased heat dissipation, supercomputers are now often augmented with special-purpose hardware (or ASICs) in order to speed up their operations while reducing their power dissipation. However, since ASICs are fully customised to implement particular tasks/algorithms, they suffer from increased development times, higher Non-Recurring-Engineering (NRE) costs, and inflexibility, as they cannot be reused to implement tasks/algorithms other than those they have been designed to perform.
On the other hand, Field Programmable Gate Arrays (FPGAs) have recently been proposed as a viable alternative implementation platform for BCB applications due to their flexible computing and memory architecture which gives them ASIC-like performance with the added programmability feature. In order to counter the aforementioned limitations of both supercomputers and ASICs, this research proposes the use of state-of-the-art reprogrammable system-on-chip technology, in the form of platform FPGAs, as a relatively low cost, high performance and reprogrammable implementation platform for BCB applications. This research project aims to develop a sophisticated library of FPGA architectures for bio-sequence analysis, phylogenetic analysis, and molecular dynamics simulation.

45

Misirli, Goksel. "Data integration strategies for informing computational design in synthetic biology." Thesis, University of Newcastle upon Tyne, 2013. http://hdl.handle.net/10443/1873.

Abstract:

The potential design space for biological systems is complex, vast and multidimensional. Therefore, effective large-scale synthetic biology requires computational design and simulation. By constraining this design space, the time- and cost-efficient design of biological systems can be facilitated. One way in which a tractable design space can be achieved is to use the extensive and growing amount of biological data available to inform the design process. By using existing knowledge, design efforts can be focused on biologically plausible areas of design space. However, biological data is large, incomplete, heterogeneous, and noisy. Data must be integrated in a systematic fashion in order to maximise its benefit. To date, data integration has not been widely applied to design in synthetic biology. The aim of this project is to apply data integration techniques to facilitate the efficient design of novel biological systems. The specific focus is on the development and application of integration techniques for the design of genetic regulatory networks in the model bacterium Bacillus subtilis. A dataset was constructed by integrating data from a range of sources in order to capture existing knowledge about B. subtilis 168. The dataset is represented as a computationally-accessible, semantically-rich network which includes information concerning biological entities and their relationships. Also included are sequence-based features mined from the B. subtilis genome, which are a useful source of parts for synthetic biology. In addition, information about the interactions of these parts has been captured, in order to facilitate the construction of circuits with desired behaviours. This dataset was also modelled in the form of an ontology, providing a formal specification of parts and their interactions. The ontology is a major step towards the unification of the data required for modelling with a range of part catalogues specifically designed for synthetic biology.
The data from the ontology is available to existing reasoners for implicit knowledge extraction. The ontology was applied to the automated identification of promoters, operators and coding sequences. Information from the ontology was also used to generate dynamic models of parts. The work described here contributed to the development of a formalism called Standard Virtual Parts (SVPs), which aims to represent models of biological parts in a standardised manner. SVPs comprise a mapping between biological parts and modular computational models. A genetic circuit designed at a part-level abstraction can be investigated in detail by analysing a circuit model composed of SVPs. The ontology was used to construct SVPs in the form of standard Systems Biology Markup Language models. These models are publicly available from a computationally-accessible repository, and include metadata which facilitates the computational composition of SVPs in order to create models of larger biological systems. To test a genetic circuit in vitro or in vivo, the genetic elements necessary to encode the entities in the in silico model, and their associated behaviour, must be derived. Ultimately, this process results in the specification for a synthesisable DNA sequence. For large models, particularly those that are produced computationally, the transformation process is challenging. To automate this process, a model-to-sequence conversion algorithm was developed. The algorithm was implemented as a Java application called MoSeC. Using MoSeC, both CellML and SBML models built with SVPs can be converted into DNA sequences ready to synthesise. Selection of the host bacterial cell for a synthetic genetic circuit is very important. In order not to interfere with the existing cellular machinery, orthogonal parts from other species are used, since these parts are less likely to have undesired interactions with the host.
In order to find orthogonal transcription factors (OTFs), and their target binding sequences, a subset of the data from the integrated B. subtilis dataset was used. B. subtilis gene regulatory networks were used to re-construct regulatory networks in closely related Bacillus species. The system, called BacillusRegNet, stores both experimental data for B. subtilis and homology predictions in other species. BacillusRegNet was mined to extract OTFs and their binding sequences, in order to facilitate the engineering of novel regulatory networks in other Bacillus species. Although the techniques presented here were demonstrated using B. subtilis, they can be applied to any other organism. The approaches and tools developed as part of this project demonstrate the utility of this novel integrated approach to synthetic biology.

46

Fu, Yan. "Computational Systems Biology Analysis of Cell Reprogramming and Activation Dynamics." Diss., Virginia Tech, 2012. http://hdl.handle.net/10919/28414.

Abstract:

In the past two decades, molecular cell biology has transitioned from a traditional descriptive science into a quantitative science that systematically measures cellular dynamics at the levels of the genome, transcriptome and proteome. Along with this transition has emerged the interdisciplinary field of systems biology, which aims to unravel complex interactions in biological systems by integrating experimental data into qualitative or quantitative models and computer simulations. In this dissertation, we applied various systems biology tools to investigate two important problems with respect to cellular activation dynamics and reprogramming. Specifically, in the first section of the dissertation, we focused on lipopolysaccharide (LPS)-mediated priming and tolerance: a reprogramming of cytokine production in macrophages pretreated with specific doses of LPS. Though both priming and tolerance are important in the immune system's response to pathogens, the molecular mechanisms remain unclear. We computationally investigated all network topologies and dynamics that are able to generate priming or tolerance in a generic three-node model. Accordingly, we found three basic priming mechanisms and one tolerance mechanism. Existing experimental evidence supports these mechanisms found in silico. In the second part of the dissertation, we applied stochastic modeling and simulations to investigate the phenotypic transition of the bacterium E. coli between normally growing cells and persister cells (a growth-arrested phenotype), and how this process can contribute to drug resistance. We built a complex computational model capturing the molecular mechanism at both the single-cell and population levels. The work also proposes a novel way to accelerate the phenotypic transition from persister cells to normally growing cells under resonance activation.
The general picture of phenotypic transitions should be applicable to a broader context of biological systems, such as T cell differentiation and stem cell reprogramming.
Ph. D.
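
To make the phrase "all network topologies ... in a generic three-node model" concrete, the Python sketch below enumerates every signed wiring of three nodes. It is only an illustration under assumptions: the node labels, the inclusion of self-loops, and the reachability filter are hypothetical choices, and the dissertation may define its search space differently.

```python
from itertools import product

NODES = ("A", "B", "C")  # hypothetical node labels
# All ordered pairs, self-loops included: 9 possible directed edges.
EDGES = [(src, dst) for src in NODES for dst in NODES]


def enumerate_topologies():
    """Yield each topology as a dict mapping (src, dst) to -1, 0 or +1.

    -1 = inhibition, 0 = no edge, +1 = activation.
    """
    for signs in product((-1, 0, 1), repeat=len(EDGES)):
        yield dict(zip(EDGES, signs))


def is_reachable(topology, source="A", target="C"):
    """True if a directed path of nonzero edges leads from source to target."""
    frontier, reached = [source], {source}
    while frontier:
        node = frontier.pop()
        for dst in NODES:
            if topology[(node, dst)] != 0 and dst not in reached:
                reached.add(dst)
                frontier.append(dst)
    return target in reached
```

With three possible states for each of nine edges, the unrestricted space holds 3^9 = 19,683 topologies, small enough to simulate each one under sampled kinetic parameters and keep those that reproduce a behaviour of interest.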

47

Donaldson, Eric F. Baric Ralph S. "Computational and molecular biology approaches to viral replication and pathogenesis." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2008. http://dc.lib.unc.edu/u?/etd,1731.

Abstract:

Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2008.
Title from electronic title page (viewed Sep. 16, 2008). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Microbiology and Immunology Virology." Discipline: Microbiology and Immunology; Department/School: Medicine.

48

Yang, Pengyi. "Ensemble methods and hybrid algorithms for computational and systems biology." Thesis, The University of Sydney, 2012. https://hdl.handle.net/2123/28979.

Abstract:

Modern molecular biology increasingly relies on the application of high-throughput technologies for studying the function, interaction, and integration of genes, proteins, and a variety of other molecules on a large scale. The application of those high-throughput technologies has led to the exponential growth of biological data, making modern molecular biology a data-intensive science. Huge effort has been directed to the development of robust and efficient computational algorithms in order to make sense of these extremely large and complex biological data, giving rise to several interdisciplinary fields, such as computational and systems biology. Machine learning and data mining are disciplines dealing with knowledge discovery from large data, and their application to computational and systems biology has been extremely fruitful. However, the ever-increasing size and complexity of the biological data require novel computational solutions to be developed. This thesis attempts to contribute to these interdisciplinary fields by developing and applying different ensemble learning methods and hybrid algorithms for solving a variety of problems in computational and systems biology. Through the study of different types of data generated from a variety of biological systems using different high-throughput approaches, we demonstrate that ensemble learning methods and hybrid algorithms are general, flexible, and highly effective tools for computational and systems biology.

49

Khan, Maria Mohammad. "Computational Biology in the Analysis of Epigenetic Nuclear Self-Organization." Thesis, The University of Arizona, 2010. http://hdl.handle.net/10150/146042.

Abstract:

The function of the nucleus is central to the survival of cells and thus life as a whole. Among other processes, it is the site of gene expression, DNA repair, and the maintenance of genome stability. These functions are carried out in the context of a complex nuclear architecture. The nucleus is compartmentalized both spatially and functionally. These compartments are proteinaceous nuclear bodies or chromatin domains, neither of which is segregated from other compartments by membranes, as are the organelles of cells. Specifically, proteinaceous nuclear bodies are characterized as regions within the nucleus with distinct sets of inhabitant proteins. Examples of such proteinaceous nuclear bodies include the nucleolus, splicing factor compartments, and the Cajal body. The nucleolus is the location of the transcription and processing of ribosomal RNA and the Cajal body is the site of snRNP assembly, while the splicing factor compartments are a storage and assembly site for spliceosomal components.

50

Ghaffarizadeh, Ahmadreza. "COMPUTATIONAL MODELS OF INTRACELLULAR AND INTERCELLULAR PROCESSES IN DEVELOPMENTAL BIOLOGY." DigitalCommons@USU, 2014. https://digitalcommons.usu.edu/etd/3103.

Abstract:

Systems biology takes a holistic approach to biological questions as it applies mathematical modeling to link and understand the interaction of components in complex biological systems. Multiscale modeling is the only method that can fully accomplish this aim. Multiscale models consider processes at different levels that are coupled within the modeling framework. A first requirement in creating such models is a clear understanding of processes that operate at each level. This research focuses on modeling aspects of biological development as a complex process that occurs at many scales. Two of these scales were considered in this work: cellular differentiation, the process in which less specialized cells acquire the specialized properties of mature cell types, and morphogenesis, the process in which an organism develops its shape and tissue architecture. In development, cellular differentiation typically is required for morphogenesis. Therefore, cellular differentiation is at a lower scale than morphogenesis in the overall process of development. In this work, cellular differentiation and morphogenesis were modeled in a variety of biological contexts, with the ultimate goal of linking these different scales of developmental events into a unified model of development. Three aspects of cellular differentiation were investigated, all united by the theme of how the dynamics of gene regulatory networks (GRNs) control differentiation. Two of the projects of this dissertation studied the effect of noise and robustness in switching between cell types during differentiation, and a third deals with the evaluation of hypothetical GRNs that allow the differentiation of specific cell types. All these projects view cell types as high-dimensional attractors in the GRNs and use random Boolean networks as the modeling framework for studying network dynamics. Morphogenesis was studied using the emergence of three-dimensional structures in biofilms as a relatively simple model.
Many strains of bacteria form complex structures during growth as colonies on a solid medium. The morphogenesis of these structures was modeled using an agent-based framework and the outcomes were validated using structures of biofilm colonies reported in the literature.
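
The idea of cell types as attractors of a random Boolean network can be sketched in a few lines of Python. This is a minimal illustration, assuming synchronous updates and uniformly random wiring and truth tables; the function names, parameters, and update scheme are hypothetical and are not taken from the dissertation.

```python
import random


def random_boolean_network(n_genes, k_inputs, seed=0):
    """Build random wiring and truth tables for an N-K Boolean network."""
    rng = random.Random(seed)
    # Each gene reads k_inputs distinct genes chosen at random.
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    # One random output bit per combination of input values.
    tables = [
        [rng.randint(0, 1) for _ in range(2 ** k_inputs)] for _ in range(n_genes)
    ]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every gene from the current state."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for src in gene_inputs:
            index = (index << 1) | state[src]  # pack input bits into an index
        new_state.append(table[index])
    return tuple(new_state)


def find_attractor(state, inputs, tables):
    """Iterate from state until a state repeats; return the cycle of states."""
    first_seen = {}
    trajectory = []
    while state not in first_seen:
        first_seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, inputs, tables)
    return trajectory[first_seen[state]:]
```

Because the state space is finite and the update is deterministic, every trajectory must eventually enter a cycle. In this picture, noise can be modeled as occasional bit flips that may push the network from one attractor's basin into another.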
