Dissertations / Theses: 'Gene network reconstruction'

1

ACERBI, ENZO. "Continuos time Bayesian networks for gene networks reconstruction." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2014. http://hdl.handle.net/10281/52709.

Abstract:

Dynamic aspects of gene regulatory networks are typically investigated by measuring system variables at multiple time points. Current state-of-the-art computational approaches for reconstructing gene networks build directly on such data, making the strong assumption that the system evolves synchronously at fixed points in time. However, omics data are now being generated with increasing time course granularity. Modellers can therefore represent the system as evolving in continuous time and improve the models' expressiveness. Continuous time Bayesian networks are proposed as a new approach for gene network reconstruction from time course expression data. Their performance was compared to two state-of-the-art methods: dynamic Bayesian networks and Granger causality analysis. On simulated data, the methods were compared for networks of increasing dimension, for measurements taken at different time granularities, and for measurements evenly vs. unevenly spaced over time. Continuous time Bayesian networks outperformed the other methods in terms of the accuracy of regulatory interactions learnt from data for all network dimensions. Furthermore, their performance degraded smoothly as the dimension of the network increased. Continuous time Bayesian networks were significantly better than dynamic Bayesian networks for all time granularities tested and better than Granger causality for dense time series. Both continuous time Bayesian networks and Granger causality performed robustly for unevenly spaced time series, with no significant loss of performance compared to the evenly spaced case, while the same did not hold true for dynamic Bayesian networks. The comparison included the IRMA experimental datasets, which confirmed the effectiveness of the proposed method.
Continuous time Bayesian networks were then applied to elucidate the regulatory mechanisms controlling murine T helper 17 (Th17) cell differentiation and were found to be effective in discovering well-known regulatory mechanisms as well as new plausible biological insights. Continuous time Bayesian networks proved effective on networks of both small and large dimensions and particularly suitable when the measurements are not evenly distributed over time. Reconstruction of the murine Th17 cell differentiation network using continuous time Bayesian networks revealed several autocrine loops, suggesting that Th17 cells may autoregulate their own differentiation process.
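
As a rough illustration of why a continuous-time model copes with unevenly spaced measurements, the sketch below treats a single gene as a two-state continuous-time Markov process. It is not the thesis's method: in a continuous time Bayesian network each node would have one intensity matrix per configuration of its parents, and the rates `a` and `b` and the sampling gaps here are hypothetical values chosen for illustration.

```python
import math

def transition_matrix(a, b, dt):
    """Transition probabilities over an interval dt for a two-state gene.

    Intensity matrix Q = [[-a, a], [b, -b]], where a is the off->on rate and
    b is the on->off rate. For two states, exp(Q * dt) has this closed form.
    """
    total = a + b
    decay = math.exp(-total * dt)
    return [
        [(b + a * decay) / total, (a - a * decay) / total],
        [(b - b * decay) / total, (a + b * decay) / total],
    ]

def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical rates and unevenly spaced sampling gaps, chosen for illustration.
a, b = 0.4, 1.1
gaps = [0.1, 0.7, 2.0]

# Chaining the per-gap matrices matches one step over the total elapsed time.
stepwise = [[1.0, 0.0], [0.0, 1.0]]
for dt in gaps:
    stepwise = matmul2(stepwise, transition_matrix(a, b, dt))
direct = transition_matrix(a, b, sum(gaps))
```

Because the transition probabilities are defined for any interval length, no fixed time slice has to be imposed on the data, which is the assumption a discrete-time dynamic Bayesian network makes.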

2

Fichtenholtz, Alexander Michael. "In silico bacterial gene regulatory network reconstruction from sequence." Thesis, Boston University, 2012. https://hdl.handle.net/2144/32880.

Abstract:

Thesis (Ph.D.)--Boston University
DNA sequencing techniques have evolved to the point where one can sequence millions of bases per minute, while our capacity to use this information has been left behind. One particularly notorious example is in the area of gene regulatory networks. A molecular study of gene regulation proceeds one protein at a time, requiring bench scientists months of work purifying transcription factors and performing DNA footprinting studies. Massive scale options like ChIP-Seq and microarrays are a step up, but still require considerable resources in terms of manpower and materials. While computational biologists have developed methods to predict protein function from sequence, gene locations from sequence, and even metabolic networks from sequence, the space of regulatory network reconstruction from sequence remains virtually untouched. Part of the reason comes from the fact that the components of a regulatory interaction, such as transcription factors and binding sites, are difficult to detect. The other, more prominent reason, is that there exists no "recognition code" to determine which transcription factors will bind which sites. I've created a pipeline to reconstruct regulatory networks starting from an unannotated complete genomic sequence for a prokaryotic organism. The pipeline predicts necessary information, such as gene locations and transcription factor sequences, using custom tools and third party software. The core step is to determine the likelihood of interaction between a TF and a binding site using a black box style recognition code developed by applying machine learning methods to databases of prokaryotic regulatory interactions. I show how one can use this pipeline to reconstruct the virtually unknown regulatory network of Bacillus anthracis.
2031-01-01

3

Li, Song. "Integrate qualitative biological knowledge for gene regulatory network reconstruction with dynamic Bayesian networks." [Ames, Iowa : Iowa State University], 2007.

4

Steiger, Edgar [Verfasser]. "Efficient Sparse-Group Bayesian Feature Selection for Gene Network Reconstruction / Edgar Steiger." Berlin : Freie Universität Berlin, 2018. http://d-nb.info/1170876633/34.

5

Kröger, Stefan. "Bioinformatic analyses for T helper cell subtypes discrimination and gene regulatory network reconstruction." Doctoral thesis, Humboldt-Universität zu Berlin, 2017. http://dx.doi.org/10.18452/18122.

Abstract:

The establishment of high-throughput technologies for gene expression measurements has led to a steadily growing amount of available data over the last 20 years. By combining individual experiments, these data make it possible to assemble new comparative studies or to merge experiments from different studies into large data sets. This procedure is called meta-analysis and is used in this thesis to build a large gene expression data set from publicly available T cell experiments. T cells are immune cells that initiate and control a wide variety of immune system functions. They can differentiate into various subtypes with distinct functions. The data set built by meta-analysis contains only experiments on one T cell subtype, regulatory T cells (Treg), and its two subgroups, natural Treg (nTreg) and induced Treg (iTreg) cells. A question that remains unanswered is which subtype-specific gene regulatory mechanisms control T cell differentiation. To this end, the thesis addresses two specific challenges of Treg research: (i) the identification of cell surface markers for distinguishing and characterizing the subtypes, and (ii) the reconstruction of Treg cell specific gene regulatory networks (GRNs) that describe the differentiation mechanisms. The implemented meta-analysis combines more than 150 microarray experiments from over 30 studies into one data set. This data set is used to identify cell-specific surface markers from their expression profiles by means of machine learning. The method developed in this thesis extracted 41 genes, six of which are surface markers. Additional validation experiments showed that these six genes can reliably distinguish the experiments of the two T cell subtypes.
For the reconstruction of GRNs, we compare 11 different algorithms using the assembled data set and evaluate the results with information from interaction databases. The evaluation shows that currently available methods are not able to extend the state of knowledge on Treg-specific regulatory mechanisms. Finally, we present a data integration strategy for GRN reconstruction, using Th2 cells as an example. From high-throughput experiments, a Th2-specific GRN consisting of 100 genes is reconstructed. While 89 of these genes are known in the context of Th2 cell differentiation, 11 new candidate genes with no previous association with Th2 differentiation were identified. The results show that data integration makes GRN reconstruction possible in principle. As more data of better quality become available, reconstruction methods can be expected to contribute substantially to a better understanding of cellular differentiation in the immune system and beyond, and thus ultimately to enable research into the causes of dysfunctions and diseases of the immune system.
Within the last two decades, high-throughput gene expression screening technologies have led to a rapid accumulation of experimental data. The amount of information available has enabled researchers to contrast and combine multiple experiments by synthesis; one such approach is called meta-analysis. In this thesis, we build a large gene expression data set based on publicly available studies for further research on T cell subtype discrimination and the reconstruction of T cell specific gene regulatory events. T cells are immune cells which have the ability to differentiate into subtypes with distinct functions, initiating and contributing to a variety of immune processes. To date, an unsolved problem in understanding the immune system is how T cells obtain a specific subtype differentiation program, which relates to subtype-specific gene regulatory mechanisms. We present an assembled expression data set which describes a specific T cell subset, regulatory T (Treg) cells, which can be further categorized into natural Treg (nTreg) and induced Treg (iTreg) cells. In our analysis we have addressed specific challenges in regulatory T cell research: (i) discriminating between different Treg cell subtypes for characterization and functional analysis, and (ii) reconstructing T cell subtype specific gene regulatory mechanisms which determine the differences in subtype-specific roles for the immune system. Our meta-analysis strategy combines more than one hundred microarray experiments. This data set is applied to a machine learning based strategy of extracting surface protein markers to enable Treg cell subtype discrimination. We identified a set of 41 genes which distinguish between nTregs and iTregs based on gene expression profile only. Evaluation of six of these genes confirmed their discriminative power, which indicates that our approach is suitable to extract candidates for robust discrimination between experiment classes.
Next, we identify gene regulatory interactions using existing reconstruction algorithms, aiming to extend the number of known gene-gene interactions for Treg cells. We applied eleven GRN reconstruction tools based on expression data only and compared their performance. Taken together, our results suggest that the available methods are not yet sufficient to extend the current knowledge by inferring so far unreported Treg specific interactions. Finally, we present an approach of integrating multiple data sets based on different high-throughput technologies to reconstruct a subtype-specific GRN. We constructed a Th2 cell specific gene regulatory network of 100 genes. While 89 of these are known to be related to Th2 cell differentiation, we were able to attribute 11 new candidate genes with a function in Th2 cell differentiation. We show that our approach to data integration does, in principle, allow for the reconstruction of a complex network. Future availability of more, and more consistent, data may enable the use of the concept of GRN reconstruction to improve understanding of the causes and mechanisms of cellular differentiation in the immune system and beyond and, ultimately, of their dysfunctions and diseases.

6

Chen, Wei, and 陈玮. "A factor analysis approach to transcription regulatory network reconstruction using gene expression data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B49617783.

Abstract:

Reconstruction of Transcription Regulatory Network (TRN) and Transcription Factor Activity (TFA) from gene expression data is an important problem in systems biology. Currently, there exist various factor analysis methods for TRN reconstruction, but most approaches have specific assumptions not satisfied by real biological data. Network Component Analysis (NCA) can handle such limitations and is considered to be one of the most effective methods. The prerequisite for NCA is knowledge of the structure of the TRN. Such structure can be obtained from ChIP-chip or ChIP-seq experiments, which however have quite limited applications. To cope with this difficulty, we resort to a heuristic optimization algorithm such as Particle Swarm Optimization (PSO) to explore the possible structures of the TRN and choose the most plausible one. Regarding the structure estimation problem, we extend classical PSO and propose a novel Probabilistic binary PSO. Furthermore, an improved NCA called FastNCA is adopted to compute the objective function accurately and fast, which enables PSO to run efficiently. Since heuristic optimization cannot guarantee global convergence, we run PSO multiple times and integrate the results. Then GCV-LASSO (Generalized Cross Validation - Least Absolute Shrinkage and Selection Operator) is performed to estimate the TRN. We apply our approach and other factor analysis methods on the synthetic data. The results indicate that the proposed PSO-FastNCA-GCV-LASSO algorithm gives better estimation. In order to incorporate more prior information on TRN structure and gene expression dynamics in the linear factor analysis model for improved estimation of TRN and TFAs, a linear Bayesian framework is adopted. Under the unified Bayesian framework, Bayesian Linear Sparse Factor Analysis Model (BLSFM) and Bayesian Linear State Space Model (BLSSM) are developed for instantaneous and dynamic TRN, respectively.
Various approaches to incorporate partial and ambiguous prior network structure information in the Bayesian framework are proposed to improve performance in practical applications. Furthermore, we propose a novel mechanism for estimating the hyper-parameters of the distribution priors in our BLSFM and BLSSM models, which can significantly improve the estimation compared to traditional ways of hyper-parameter setting. With this development, reasonably good estimation of TFAs and TRN can be obtained even without use of any structure prior of TRN. Extensive numerical experiments are performed to investigate our developed methods under various settings, with comparison to some existing alternative approaches. It is demonstrated that our hyper-parameter estimation method improves the estimation of TFA and TRN in most settings and has superior performance, and that structure priors in general lead to improved estimation performance. Regarding application to real biological data, we execute the PSO-FastNCA-GCV-LASSO algorithm developed in the thesis using E. coli microarray data and obtain sensible estimation of TFAs and TRN. We apply BLSFM without structure priors of TRN, and BLSSM both without structure priors and with partial structure priors, to yeast (S. cerevisiae) microarray data and obtain a reasonable estimation of TFAs and TRN.
Electrical and Electronic Engineering
Doctoral
Doctor of Philosophy
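
The factor model that NCA fits can be sketched as follows. This is a minimal alternating least squares illustration of the decomposition E = A P with a known zero pattern in A, assuming NumPy is available. It is not the FastNCA or PSO procedure of the thesis, and the function name `nca_als`, the connectivity `mask` and the synthetic data are all hypothetical.

```python
import numpy as np

def nca_als(E, mask, n_iter=50, seed=0):
    """Alternating least squares for E ~ A @ P with A zero outside `mask`.

    E: genes x samples expression matrix.
    mask: genes x TFs boolean matrix, True where a TF may regulate a gene.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    A = rng.normal(size=(n_genes, n_tfs)) * mask  # respect the known zero pattern
    errors = []
    for _ in range(n_iter):
        # With A fixed, the TF activities P are an ordinary least-squares fit.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # With P fixed, refit each gene's row of A on its allowed TFs only.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            if idx.size:
                A[g, idx] = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]
        errors.append(float(np.linalg.norm(E - A @ P)))
    return A, P, errors

# Synthetic example: 5 genes, 2 transcription factors, 8 samples.
rng = np.random.default_rng(1)
mask = np.array([[1, 0], [1, 1], [0, 1], [1, 0], [0, 1]], dtype=bool)
A_true = rng.normal(size=mask.shape) * mask
P_true = rng.normal(size=(2, 8))
E = A_true @ P_true
A_hat, P_hat, errors = nca_als(E, mask)
```

The recovered factors are only defined up to a scaling of each transcription factor, so the reconstruction error of E, rather than A itself, is the natural quantity to monitor. When the structure `mask` is unknown, as the abstract describes, a search over candidate masks would wrap a fit of this kind.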

7

Henderson, David Allen. "Reconstruction of metabolic pathways by the exploration of gene expression data with factor analysis." Diss., Virginia Tech, 2001. http://hdl.handle.net/10919/30089.

Abstract:

Microarray gene expression data for thousands of genes in many organisms is quickly becoming available. The information this data can provide the experimental biologist is powerful. This data may provide information clarifying the regulatory linkages between genes within a single metabolic pathway, or alternative pathway routes under different environmental conditions, or provide information leading to the identification of genes for selection in animal and plant genetic improvement programs or targets for drug therapy. Many analysis methods to unlock this information have been both proposed and utilized, but not evaluated under known conditions (e.g. simulations). Within this dissertation, an analysis method is proposed and evaluated for identifying independent and linked metabolic pathways and compared to a popular analysis method. Also, this same analysis method is investigated for its ability to identify regulatory linkages within a single metabolic pathway. Lastly, a variant of this same method is used to analyze time series microarray data. In Chapter 2, Factor Analysis is shown to identify and group genes according to membership within independent metabolic pathways for steady state microarray gene expression data. There were cases, however, where the allocation of all genes to a pathway was not complete. A competing analysis method, Hierarchical Clustering, was shown to perform poorly when negatively correlated genes are assumed unrelated, but performance improved when the sign of the correlation coefficient was ignored. In Chapter 3, Factor Analysis is shown to identify regulatory relationships between genes within a single metabolic pathway. These relationships can be explained using metabolic control analysis, along with external knowledge of the pathway structure and activation and inhibition of transcription regulation. In this chapter, it is also shown why factor analysis can group genes by metabolic pathway using metabolic control analysis. 
In Chapter 4, a Bayesian exploratory factor analysis is developed and used to analyze microarray gene expression data. This Bayesian model differs from a previous implementation in that it is purely exploratory and can be used with vague or uninformative priors. Additionally, 95% highest posterior density regions can be calculated for each factor loading to aid in interpretation of factor loadings. A correlated Bayesian exploratory factor analysis model is also developed in this chapter for application to time series microarray gene expression data. While this method is appropriate for the analysis of correlated observation vectors, it fails to group genes by metabolic pathway for simulated time series data.
Ph. D.
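
The remark about negatively correlated genes in hierarchical clustering can be illustrated with two common correlation-based distances. This is a sketch under the assumption that the "unrelated" treatment corresponds to a signed distance 1 - r and the improved treatment to 1 - |r|; the dissertation's exact distance definitions are not given in the abstract, and the expression values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def signed_distance(x, y):
    """1 - r: anti-correlated genes end up as far apart as possible."""
    return 1.0 - pearson(x, y)

def unsigned_distance(x, y):
    """1 - |r|: the sign is ignored, so anti-correlated genes are close."""
    return 1.0 - abs(pearson(x, y))

# Made-up expression profiles: gene_b falls linearly as gene_a rises.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [10.0, 8.0, 6.0, 4.0, 2.0]

d_signed = signed_distance(gene_a, gene_b)      # close to 2, the maximum
d_unsigned = unsigned_distance(gene_a, gene_b)  # close to 0, the minimum
```

A repressor and its target in the same pathway would typically look like this pair, which is one plausible reading of why ignoring the sign helped the clustering.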

8

Kröger, Stefan [Verfasser], Ulf [Gutachter] Leser, Joachim [Gutachter] Selbig, and Nils [Gutachter] Blüthgen. "Bioinformatic analyses for T helper cell subtypes discrimination and gene regulatory network reconstruction / Stefan Kröger ; Gutachter: Ulf Leser, Joachim Selbig, Nils Blüthgen." Berlin : Humboldt-Universität zu Berlin, 2017. http://d-nb.info/118933108X/34.

9

Aravena, Duarte Andrés Octavio. "Probabilistic and constraint based modelling to determine regulation events from heterogeneous biological data." Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00988255.

Abstract:

This thesis proposes a method to build realistic causal regulatory networks that has a lower false positive rate than traditional methods. The first contribution of this thesis is to integrate heterogeneous information from two types of network predictions to determine a causal explanation of the observed gene co-expression. The second contribution is to model this integration as a combinatorial optimization problem. We demonstrate that this problem belongs to the NP-hard complexity class. The third contribution is the proposition of a heuristic approach to obtain an approximate solution in a practical execution time. Our evaluation shows that the E. coli regulatory network resulting from the application of this method has a higher accuracy than the putative one built with traditional tools. The bacterium Acidithiobacillus ferrooxidans is particularly challenging for the experimental determination of its regulatory network. Using the tools we developed, we propose a putative regulatory network and analyze it to rank the relevance of central regulators. In a second part of this thesis we explore how these regulatory relationships are manifested in a case linked to human health, developing a method to complete a network linked to Alzheimer's disease. As an addendum we address the mathematical problem of microarray probe design. We conclude that, to fully predict the hybridization dynamics, we need a modified energy function for secondary structures of surface-attached DNA molecules and propose a scheme for determining such a function.

10

Molnar, Istvan, David Lopez, Jennifer Wisecaver, Timothy Devarenne, Taylor Weiss, Matteo Pellegrini, and Jeremiah Hackett. "Bio-crude transcriptomics: Gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa)*." BioMed Central, 2012. http://hdl.handle.net/10150/610020.

Abstract:

BACKGROUND: Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. Biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy.
RESULTS: A de novo assembly of 1,334,609 next-generation pyrosequencing reads from the Showa strain of the B race of B. braunii yielded a transcriptomic database of 46,422 contigs with an average length of 756 bp. Contigs were annotated with pathway, ontology, and protein domain identifiers. Manual curation allowed the reconstruction of pathways that produce terpenoid liquid hydrocarbons from primary metabolites, and pathways that divert photosynthetic carbon into tetraterpenoid carotenoids, diterpenoids, and the prenyl chains of meroterpenoid quinones and chlorophyll. Inventories of machine-assembled contigs are also presented for reconstructed pathways for the biosynthesis of competing storage compounds including triacylglycerol and starch. Regeneration of S-adenosylmethionine, and the extracellular localization of the hydrocarbon oils by active transport and possibly autophagy are also investigated.
CONCLUSIONS: The construction of an annotated transcriptomic database, publicly available in a web-based data depository and annotation tool, provides a foundation for metabolic pathway and network reconstruction, and facilitates further omics studies in the absence of a genome sequence for the Showa strain of B. braunii, race B. Further, the transcriptome database empowers future biosynthetic engineering approaches for strain improvement and the transfer of desirable traits to heterologous hosts.

11

Werhli, Adriano Velasque. "Reconstruction of gene regulatory networks from postgenomic data." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/3198.

Abstract:

An important problem in systems biology is the inference of biochemical pathways and regulatory networks from postgenomic data. The recent substantial increase in the availability of such data has stimulated the interest in inferring the networks and pathways from the data themselves. The main interests of this thesis are the application, evaluation and the improvement of machine learning methods applied to the reverse engineering of biochemical pathways and networks. The thesis starts with the application of an established method to newly available gene expression data related to the interferon pathway of the human immune system in order to identify active subpathways under different experimental conditions. The thesis continues with the comparative evaluation of various machine learning methods (Relevance networks, Graphical Gaussian Models, Bayesian networks) using observational and interventional data from cytometry experiments as well as simulated data from a gold-standard network. The thesis also extends and improves existing methods to include biological prior knowledge under the Bayesian approach in order to increase the accuracy of the predicted networks and it quantifies to what extent the reconstruction accuracy can be improved in this way.

12

Deng, Wenping. "Algorithms for Reconstruction of Gene Regulatory Networks from High-Throughput Gene Expression Data." Thesis, Michigan Technological University, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=13420080.

Abstract:

Understanding gene interactions in complex living systems is one of the central tasks in systems biology. With the availability of microarray and RNA-Seq technologies, a multitude of gene expression datasets has been generated towards novel biological knowledge discovery through statistical analysis and reconstruction of gene regulatory networks (GRN). Reconstruction of GRNs can reveal the interrelationships among genes and identify the hierarchies of genes and hubs in networks. The new algorithms I developed in this dissertation are specifically focused on the reconstruction of GRNs with increased accuracy from microarray and RNA-Seq high-throughput gene expression data sets.

The first algorithm (Chapter 2) focuses on modeling the transcriptional regulatory relationships between transcription factors (TF) and pathway genes. Multiple linear regression and its regularized versions, such as Ridge regression and LASSO, are common tools for modeling the relationship between predictor variables and a dependent variable. To deal with outliers in gene expression data and the group effect of TFs in regulation, and to improve statistical efficiency, it is proposed to use the Huber function as the loss function and the Berhu function as the penalty function to model the relationships between a pathway gene and many or all TFs. A proximal gradient descent algorithm was developed to solve the corresponding optimization problem. This algorithm is much faster than the general convex optimization solver CVX. This Huber-Berhu regression was then embedded into the partial least squares (PLS) framework to deal with the high dimensionality and multicollinearity of gene expression data. The results showed that this method can identify the true regulatory TFs for each pathway gene with high efficiency.
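
For orientation, the two functions named in this paragraph can be written down in their commonly used textbook forms. This is a sketch, not the dissertation's code: the threshold `delta` and its default value are assumptions, and the thesis may parameterize or scale the functions differently.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones.

    The linear tails limit the influence of outlying expression values.
    """
    size = abs(residual)
    if size <= delta:
        return 0.5 * residual * residual
    return delta * (size - 0.5 * delta)

def berhu_penalty(coef, delta=1.0):
    """Berhu ("reverse Huber") penalty: L1-like near zero, quadratic beyond delta.

    The L1 part encourages sparsity among small coefficients, while the
    quadratic part treats large coefficients in a ridge-like way.
    """
    size = abs(coef)
    if size <= delta:
        return size
    return (coef * coef + delta * delta) / (2.0 * delta)

# Both functions are continuous where they switch form (at |x| == delta).
small, large = 0.5, 3.0
values = {
    "huber_small": huber_loss(small),     # 0.125
    "huber_large": huber_loss(large),     # 2.5
    "berhu_small": berhu_penalty(small),  # 0.5
    "berhu_large": berhu_penalty(large),  # 5.0
}
```

A regression of a pathway gene on all TFs would then minimize the sum of Huber losses over samples plus a multiple of the summed Berhu penalties over coefficients.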

The second algorithm (Chapter 3) focuses on building multilayered hierarchical gene regulatory networks (ML-hGRNs). A backward elimination random forest (BWERF) algorithm was developed for constructing an ML-hGRN operating above a biological pathway or a biological process. The algorithm first divided construction of ML-hGRN into multiple regression tasks; each involves a regression between a pathway gene and all TFs. Random forest models with backward elimination were used to determine the importance of each TF to a pathway gene. Then the importance of a TF to the whole pathway was computed by aggregating all the importance values of the TF to the individual pathway gene. Next, an expectation maximization algorithm was used to cut the TFs to form the first layer of direct regulatory relationships. The upper layers of GRN were constructed in the same way only replacing the pathway genes by the newly cut TFs. Both simulated and real gene expression data were used to test the algorithms and demonstrated the accuracy and efficiency of the method.

The third algorithm (Chapter 4) focuses on Joint Reconstruction of Multiple Gene Regulatory Networks (JRmGRN) using gene expression data from multiple tissues or conditions. In the formulation, shared hub genes across different tissues or conditions were assumed. Under the framework of the Gaussian graphical model, the JRmGRN method constructs the GRNs by maximizing a penalized log-likelihood function. The problem was formulated as a convex optimization problem and then solved with an alternating direction method of multipliers (ADMM) algorithm. Tests on both simulated and real gene expression data showed that JRmGRN performed better than existing methods.

13

Kentzoglanakis, Kyriakos. "Reconstructing gene regulatory networks : a swarm intelligence framework." Thesis, University of Portsmouth, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.523619.

14

Groß, Torsten. "Network Inference from Perturbation Data: Robustness, Identifiability and Experimental Design." Doctoral thesis, Humboldt-Universität zu Berlin, 2021. http://dx.doi.org/10.18452/22355.

Abstract:

High-throughput methods quantify a large number of cellular components but can rarely describe their interactions. A wide variety of network reconstruction methods have therefore been developed over the last 20 years. Perturbation data in particular allow conclusions to be drawn about functional mechanisms in gene regulation, signal transduction, intracellular communication and other processes. Nevertheless, network inference remains an unsolved problem, because most methods rest on unsuitable assumptions and do not clarify the identifiability of network edges. In this regard, this dissertation describes a new reconstruction method based on simple assumptions about perturbation propagation. It is therefore applicable in a wide range of contexts and outperforms other methods in standard benchmarks. For MAPK and PI3K signalling pathways in an adenocarcinoma cell line, it generates plausible network hypotheses that convincingly explain the different sensitivities of PI3K mutants to various inhibitors. Furthermore, it is shown that network identifiability can be described by an intuitive max-flow problem. This analytical result makes it possible to determine effective, identifiable networks and to optimize the experimental design of costly perturbation experiments. Extensive tests show that, compared with randomly generated perturbation sequences, the approach reduces the number of perturbations needed for full identifiability to less than a third. Finally, the dissertation describes a mathematical extension of Modular Response Analysis. It is shown that the problem can be approximated as an analytically solvable orthogonal regression. This allows a drastic reduction in numerical effort, so that considerably larger networks can be reconstructed and the latest high-throughput perturbation data can be analysed.
'Omics' technologies provide extensive quantifications of components of biological systems but rarely characterize the interactions between them. To fill this gap, various network reconstruction methods have been developed over the past twenty years. Using perturbation data, these methods can deduce functional mechanisms in gene regulation, signal transduction, intra-cellular communication and many other cellular processes. Nevertheless, this reverse engineering problem remains essentially unsolved because inferred networks are often based on inapt assumptions and lack interpretability as well as a rigorous description of identifiability. To overcome these shortcomings, this thesis first presents a novel inference method which is based on a simple response logic. The underlying assumptions are so mild that the approach is suitable for a wide range of applications while also outperforming existing methods in standard benchmark data sets. For MAPK and PI3K signalling pathways in an adenocarcinoma cell line, it derived plausible network hypotheses, which explain distinct sensitivities of PI3K mutants to targeted inhibitors. Second, an intuitive maximum-flow problem is shown to describe identifiability of network interactions. This analytical result makes it possible to devise identifiable effective network models in underdetermined settings and to optimize the design of costly perturbation experiments. Benchmarked on a database of human pathways, full network identifiability is obtained with less than a third of the perturbations that are needed in random experimental designs. Finally, the thesis presents mathematical advances within Modular Response Analysis (MRA), which is a popular framework to quantify network interaction strengths. It is shown that MRA can be approximated as an analytically solvable total least squares problem. This insight drastically reduces computational complexity, which makes it possible to model much bigger networks and to handle novel large-scale perturbation data.
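
As background for the last part of the abstract, the sketch below shows the classic, noise-free form of Modular Response Analysis, in which local interaction strengths are read off the inverse of the measured global response matrix. It is not the total least squares approximation developed in the thesis. It assumes NumPy, one perturbation per node with each perturbation hitting exactly one node, and a made-up three-node Jacobian `J` and perturbation strengths `p`.

```python
import numpy as np

def local_responses(R):
    """Local response matrix r from a global response matrix R.

    R[i, j] is the steady-state response of node i to a perturbation that
    directly targets node j only. Classic MRA gives
    r[i, j] = -inv(R)[i, j] / inv(R)[i, i], so every r[i, i] equals -1.
    """
    R_inv = np.linalg.inv(R)
    return -R_inv / np.diag(R_inv)[:, None]

# Made-up three-node network, described by its Jacobian at steady state.
J = np.array([[-1.0, 0.0, 0.5],
              [0.8, -2.0, 0.0],
              [0.0, -0.6, -1.5]])
r_true = -J / np.diag(J)[:, None]   # normalised interaction strengths

# Simulated experiment: perturbation j acts on node j with unknown strength p[j].
p = np.array([0.3, 1.2, 0.7])
R = -np.linalg.inv(J) @ np.diag(p)  # global responses that would be measured

r_hat = local_responses(R)          # recovers r_true without knowing p
```

The normalisation by the diagonal is what removes the unknown perturbation strengths. With noisy measurements this exact inversion becomes unreliable, which is the kind of situation a regression formulation addresses.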


15

Pournara, Iosifina-Vasiliki. "Reconstructing gene networks by passive and active Bayesian learning." Thesis, Birkbeck (University of London), 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.417001.


16

Thomas, Spencer Angus. "Synthesis, analysis and reconstruction of gene regulatory networks using evolutionary algorithms." Thesis, University of Surrey, 2014. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.659110.


Abstract:

Large and complex biological networks are thought to be built from small functional modules called motifs. These motifs have not yet been studied in sufficient depth, which has resulted in a lack of consensus on their role and presence in biology. Here we investigate two networks that produce biologically important dynamics, an oscillation and a toggle switch. We couple these motifs and observe multiple sets of combined dynamic behaviour and evidence of gene connectivity preferences between the two networks. Such fundamental studies of networks can be performed computationally with detailed mathematical analysis that may not be possible from experimental data due to noise and experimental costs. Computational studies can also be used in conjunction with experimental data to analyse and interpret large-scale data sets such as high-throughput data. Here we use such an approach to go beyond fundamental networks and model a system of particular interest in biology, the bacterium Streptomyces coelicolor, which produces a plethora of antibiotics and medicinal compounds. The regulatory network of genes in S. coelicolor is vast, and sub-networks can span hundreds or even thousands of genes. Currently there is insufficient data to statistically reverse engineer large regulatory networks, a situation known as an underdetermined problem. The complexity of real data due to noise is also a problem for inferring networks, and as a result much of the research community focuses on small artificial data sets to benchmark their algorithms. Here we develop a novel algorithm which uses data integration and processing with a multi-objective set-up that enhances convergence through multiobjectivization. Additionally, our algorithm uses a decoupled optimisation approach to improve the optimisation and parallel computation to significantly reduce computational run times.
Our algorithm is general and can be applied to any network with time series data of any size. We compare biologically relevant sub-networks of various sizes within S. coelicolor under several optimisation arrangements and demonstrate that our novel approach performs best at every network size. Furthermore, we apply our algorithm to the PhoP sub-network of 911 genes within S. coelicolor, which is strongly linked to antibiotic production. All networks here are reconstructed from real experimental data. Our algorithm is able to build a regulatory model for 911 genes in the PhoP network from time series data sets of up to 32 points, both of which are far larger than what current methods handle.


17

Malysheva, Valeriya. "Reconstruction of gene regulatory networks defining the cell fate transition processes." Thesis, Strasbourg, 2016. http://www.theses.fr/2016STRAJ084/document.


Abstract:

L’établissement de l’identité cellulaire est un phénomène très complexe qui implique pléthore de signaux instructifs intrinsèques et extrinsèques. Cependant, malgré les progrès importants qui ont été faits pour l’identification des régulateurs clés, les liens mécanistiques entre facteurs de transcription, épigénome, et structure de la chromatine lors de la différenciation cellulaire, et de la transformation tumorigénique des cellules, sont peu connus. Pour résoudre ces problématiques nous avons utilisé deux modèles de transition de l’identité cellulaire : la différenciation neuronale et endodermique induites par un même morphogène, l’acide rétinoïque. Concernant la transformation tumorale des cellules nous avons utilisé un système de tumorigenèse par étape de cellules primaires humaines. Nous avons conduit des études intégratives incluant des données transcriptomiques, épigénomiques, et des données concernant l’architecture de la chromatine. Notre approche systématique pour caractériser l’acquisition de l’identité cellulaire, combinée à la modélisation de la transduction du signal, renforce donc nos connaissances sur les mécanismes responsables de la plasticité cellulaire. Une meilleure compréhension des mécanismes régulateurs de l’identité cellulaire non seulement nous éclaire sur les relations de cause à effet entre les différents niveaux de régulation dans la cellule, mais aussi ouvre de nouvelles possibilités en terme de transdifférenciation dirigée
Cell fate acquisition is a highly complex phenomenon that involves a plethora of intrinsic and extrinsic instructive signals. However, despite important progress in the identification of key regulatory factors of this process, the mechanistic links between transcription factors, epigenome and chromatin structure which coordinate the regulation of cell differentiation and the deregulation of gene networks during cell transformation are largely unknown. To address these questions for two model systems of cell fate transitions, namely the neuronal and endodermal cell differentiation induced by the morphogen retinoic acid and the stepwise tumorigenesis of primary human cells, we conducted integrative transcriptome, epigenome and chromatin architecture studies. Through extensive integration with thousands of available genomic data sets, we deciphered the gene regulatory networks of these processes and gained new insights into the molecular circuitry of cell fate acquisition. Understanding the regulatory mechanisms that underlie cell fate decision processes not only provides a fundamental understanding of cause-and-consequence relationships inside the cell, but also opens the door to directed trans-differentiation.


18

Ghanbari, Mahsa [Verfasser]. "Association measures and prior information in the reconstruction of gene networks / Mahsa Ghanbari." Berlin : Freie Universität Berlin, 2016. http://d-nb.info/1104733757/34.


19

Liu, Jinhua [Verfasser]. "Bioinformatic Reconstruction of Gene Regulatory Networks Controlling EMT and Mesoderm Formation / Jinhua Liu." Berlin : Freie Universität Berlin, 2020. http://d-nb.info/1218530537/34.


20

Hasegawa, Takanori. "Reconstructing Biological Systems Incorporating Multi-Source Biological Data via Data Assimilation Techniques." 京都大学 (Kyoto University), 2015. http://hdl.handle.net/2433/195985.


21

Ait-Hamlat, Adel. "Reconstruction de réseaux de gènes à partir de données d'expression par déconvolution centrée autour des hubs." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS011.


Abstract:

Les réseaux de régulation de gènes (RRG) sont des graphes dans lesquels les nœuds sont des gènes et les arcs représentent les relations de régulation entre gènes régulateurs et gènes cibles. La topologie d’un RRG est caractérisée par un petit nombre de gènes qui ont un grand nombre de connexions alors que la majorité des autres gènes a peu de connexions. Les nœuds hautement connectés s’appellent des hubs ; ils permettent à deux nœuds quelconques d’être connectés par des chemins relativement courts dans les réseaux peu denses que sont les RRGs. HubNeD (Hub-centered network deconvolution) est une nouvelle méthode qui exploite les propriétés topologiques des RRGs pour les reconstruire à partir des profils d’expression des gènes à l’état d’équilibre. La méthode HubNeD se compose de trois étapes : premièrement, une étape de regroupement des gènes considérés comme uniquement régulés en les regroupant dans des communautés de co-régulation hautement homogènes. Deuxièmement, les hubs du RRG sont sélectionnés à partir des gènes restants en analysant les similitudes de leurs profils de corrélation avec les gènes des communautés de co-régulation. Troisièmement, une matrice d’adjacence est calculée par une déconvolution centrée sur les hubs des scores de corrélation de Pearson. Cette dernière étape pénalise les connexions directes entre deux gènes qui n’ont pas été choisis comme hubs, réduisant ainsi le taux de faux positifs. La stratégie originale consistant à reconstituer le RRG après une étape de sélection des hubs permet à HubNeD d’obtenir les meilleures performances sur les jeux de données d’expression associés aux deux RRGs bien établis de E. Coli et Saccharomyces cerevisiae
Gene regulatory networks (GRNs) are graphs in which nodes are genes and edges represent causal relationships from regulator genes towards their downstream targets. One important topological property of GRNs is that a small number of their nodes have a large number of connections whereas the majority of the genes have few connections. The highly connected nodes are called hubs; they allow any two nodes to be connected by relatively short paths in sparse networks. HubNeD (Hub-centered network deconvolution) is a novel method that exploits topological properties of GRNs to reconstruct them from steady-state expression profiles. It works in three steps. First, a clustering step extracts genes that are considered solely regulated by grouping them in highly homogeneous co-regulation communities. Second, hubs are inferred from the remaining genes by analyzing the similarities of their correlation profiles to the genes in the co-regulation communities. Third, an adjacency matrix is computed by a hub-centered deconvolution of the Pearson correlation scores. This last step penalizes direct connections between non-hubs, thus reducing the rate of false positives. The original strategy of preceding GRN reconstruction by a hub selection step allows HubNeD to achieve the highest performance on expression datasets associated with the two well-established, experimentally curated GRNs of E. coli and Saccharomyces cerevisiae.
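
The idea behind the third step, scoring gene pairs by Pearson correlation while down-weighting pairs in which neither gene is a hub, can be sketched in Python as follows. This is a deliberately simplified stand-in and not the HubNeD deconvolution itself: the multiplicative `penalty`, the function names and the toy profiles are all assumptions made for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equally long expression profiles."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (norm_u * norm_v)

def hub_weighted_scores(profiles, hubs, penalty=0.5):
    """Score every gene pair; shrink scores of pairs that contain no hub.

    profiles: dict mapping gene name -> list of expression values
    hubs:     set of gene names selected as hubs (assumed given)
    penalty:  hypothetical factor in [0, 1] applied to non-hub pairs
    """
    genes = sorted(profiles)
    scores = {}
    for i, first in enumerate(genes):
        for second in genes[i + 1:]:
            score = abs(pearson(profiles[first], profiles[second]))
            if first not in hubs and second not in hubs:
                score *= penalty  # discourage direct non-hub edges
            scores[(first, second)] = score
    return scores

# Hypothetical steady-state profiles for one hub and two target genes.
profiles = {"hub": [1, 2, 3, 4], "a": [2, 4, 6, 8], "b": [8, 6, 4, 2]}
scores = hub_weighted_scores(profiles, hubs={"hub"})
```

In this toy case the two targets are perfectly (anti-)correlated with each other only because both follow the hub, and the penalty lowers the score of their direct edge relative to the hub edges.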


22

江益志. "Integrating gene expression, SNP markers, and gene-gene interaction towards network reconstruction." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/97695667099035745199.


23

Lai, Jhih-Siang, and 賴至祥. "Graph-Based Clustering Approaches for Gene Network Reconstruction." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/17481425002494051273.


Abstract:

Master's thesis
National Taiwan University
Graduate Institute of Biomedical Engineering
97
To understand regulatory relationships between genes in living systems, biologists often use RNA interference (RNAi) or gene knockouts and observe the system's response. Informaticians try to reconstruct regulatory relationships between genes from mRNA expression profiles using algorithms or mathematical models. Gene regulation involves several phases, such as transcription, post-transcriptional modification, translation, mRNA degradation and post-translational modification. Each of these phases takes time to complete, and many studies analyze regulation via these features. In this study, we use two methods to reconstruct regulatory relationships between genes. The first is a graph partitioning algorithm, Normalized Cuts, which partitions genes into functional gene networks. The second, PARE (Pattern Recognition Approach), is an algorithm based on time-lagged non-linear features of the expression profile and is used to infer regulation between genes. In addition, we use yeast microarray data to construct gene regulatory networks and check the results against the KEGG pathway database, the BioGRID interaction database and the MIPS database. A comparison of F-scores shows that our method performs better than the Dynamic Bayesian Network developed by Kim et al. Finally, we apply our method to a real case, a yeast microarray experiment in which yox1 and yhp1 are both deleted, and analyze its mRNA expression time profile. Although the mechanisms linking the phases of the cell cycle are not clear, yox1 and yhp1 are two genes known to control the duration of a cell's G1 phase by negative feedback. We successfully find networks associated with the cell cycle, one of which is associated with mitosis. In the future, we hope to decipher more mechanisms linking the phases of the cell cycle.
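
The objective that Normalized Cuts minimises can be written down compactly. The Python sketch below evaluates the standard normalized-cut value of a given two-way partition of a weighted gene graph; it does not perform the spectral optimisation, and the function name and the toy weights are assumptions made for this example.

```python
def normalized_cut(weights, part_a):
    """Normalized-cut value of splitting a weighted graph into A and B.

    weights: symmetric matrix (list of lists) of non-negative edge weights
    part_a:  iterable of node indices forming side A; the rest form side B

    Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)
    """
    nodes = range(len(weights))
    side_a = set(part_a)
    side_b = set(nodes) - side_a
    if not side_a or not side_b:
        raise ValueError("both sides of the partition must be non-empty")
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in side_a for j in side_b)
    # Total weight attached to each side, counted against all nodes.
    assoc_a = sum(weights[i][j] for i in side_a for j in nodes)
    assoc_b = sum(weights[i][j] for i in side_b for j in nodes)
    if assoc_a == 0 or assoc_b == 0:
        raise ValueError("a side with no edges has an undefined Ncut value")
    return cut / assoc_a + cut / assoc_b

# Hypothetical 4-gene graph: two tightly linked pairs joined by a weak edge.
weights = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
balanced = normalized_cut(weights, [0, 1])   # cuts only the weak edge
lopsided = normalized_cut(weights, [0])      # cuts a strong edge
```

A lower value indicates a better partition, so here splitting between the two pairs is preferred over isolating a single gene.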


24

"Computational models for efficient reconstruction of gene regulatory network." Thesis, 2011. http://library.cuhk.edu.hk/record=b6075380.


Abstract:

Zhang, Qing.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (leaves 129-148).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese.


25

Lin, Chung-Hsun, and 林仲訓. "Using Microarray Time Series Data and Gene Ontology for Gene Clustering and Network Reconstruction." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/83763564848362542147.


Abstract:

Master's thesis
National Sun Yat-sen University
Department of Information Management
101
In recent years, using microarray time series data to reconstruct gene regulatory networks has become very popular. However, the number of genes involved is usually very large. Before reconstruction, we therefore cluster the genes so that genes that interact with each other are placed in the same group. Our approach combines multiple data sources. On the one hand, this avoids depending on a single data source, whose quality would otherwise have a great impact on the result. On the other hand, we hope to improve clustering performance and thereby the accuracy of the subsequent gene regulatory network reconstruction. In our study, we combine two different types of data source: microarray time series data and the Gene Ontology. We quantify the Gene Ontology and combine it with the time series data. Finally, we cluster the genes with a partition clustering algorithm and use BoolNet to reconstruct the gene regulatory network. Our experiments show better performance when microarray time series data and the Gene Ontology are used together. In the subsequent reconstruction, a better clustering result yields a better reconstruction of the gene regulatory network. Our method for gene clustering is therefore effective and feasible.


26

Richard, Guilhem. "Affecting the macrophage response to infection by integrating signaling and gene-regulatory networks." Thesis, 2014. https://hdl.handle.net/2144/14270.


Abstract:

Obesity has reached epidemic proportions in recent years. The World Health Organization estimated in 2008 that 1.4 billion people were overweight, of whom 500 million were obese. Obesity is associated with a wide range of conditions, such as cardiovascular diseases, cancer, diabetes, and neurological disorders, and causes approximately 2.8 million deaths each year. Many studies have established that obesity strongly impacts the normal function of the immune system: it dysregulates production of inflammatory and anti-inflammatory cytokines, alters numbers of immune cells, and causes an overall weaker immune response. Developing therapies that aim to improve the immune response is crucial in order to increase the quality of life of obese subjects and to reduce their ever-increasing healthcare-related costs. The long-term objective of this work is to contribute to the development of therapies that can increase the immune response in obese macrophages. In particular, gene modifications adjusting the response to infection in obese macrophages closer to that of lean macrophages are desired. To this end, the present work focused on the Toll-like Receptors (TLRs), which play an essential role in the detection of pathogens and the initiation of both innate and acquired immune responses. Genes essential to the transmission of the infection signal were first identified using a model of the TLR signaling pathways. These genes provided the basis for reconstructing a gene regulatory network that not only accounts for information coming from the TLRs, but also regulates key reactions within the pathways. The topology and regulatory functions of this network were identified by applying novel computational techniques to time-series gene-expression datasets. The TLR signaling and gene-regulatory networks were then integrated to develop a modeling framework for macrophages that predicts the time behavior of several markers for infection.
Finally, formal verification techniques were used to demonstrate that the model satisfies several properties characteristic of the response to infection in macrophage. The work detailed in this dissertation offers a suitable platform for developing and testing biological hypotheses that aim to improve responses to infection.


27

Padolina, Joanna Melinda. "Phylogenetic reconstruction of Phalaenopsis (Orchidaceae) using nuclear and chloroplast DNA sequence data and using Phalaenopsis as a natural system for assessing methods to reconstruct hybrid evolution in phylogenetic analyses." 2006. http://hdl.handle.net/2152/20172.


Abstract:

Two phylogenies of Phalaenopsis (Orchidaceae) are presented, one from combined chloroplast DNA data and one from a nuclear actin gene. We used these phylogenies to assess and modify the classification of Phalaenopsis and to examine several morphological characters and geographical distribution patterns. Our results support Christenson’s (2001) treatment of Phalaenopsis as a broadly defined genus that includes the species previously placed in the genera Doritis and Kingidium. Some of Christenson’s subgeneric groups needed to be recircumscribed to reflect a natural classification. We recognized four subgenera and six sections: subgenera Aphyllae, Parishianae (with sections Conspicuum, Delisiosae, Esmeralda, and Parishianae), Phalaenopsis, and Polychilos (with sections Fuscatae and Polychilos). In order to find a set of universally amplifiable, phylogenetically informative, single-copy nuclear regions, we conducted a whole genome comparison of the rice (Oryza sativa) and Arabidopsis thaliana genomes. We constructed a database of both genomes and searched for pairs of sequences using criteria we felt would ensure primers that would reliably amplify using standard PCR protocols. We tested the most promising 142 primer pairs in the lab on eighteen taxa and found four potentially informative markers in Phalaenopsis and one in Helianthus. Our results indicated that it will be difficult to find universal nuclear markers; however, our database provides an important tool for finding informative nuclear markers within specific groups. The full set of primer combinations is available online at “The Conserved Primer Pair Project,” http://aug.csres.utexas.edu:8080/cpp/index.html. We used fourteen Phalaenopsis species and seven horticultural hybrids to create a real dataset with which to test phylogenetic network reconstruction methods.
We tested the performance of Neighbor-Net, implemented in SplitsTree, under four different categories of complexity: one hybrid, two independent hybrids (hybrids with no parents in common), three independent hybrids, and two non-independent hybrids (one parent was shared between hybrids). Neighbor-Net was able to predict accurately the parents of hybrids in only about half of the datasets we tested, and there were so many false positives that it was impossible to distinguish the hybrids from the species. We plan to use this dataset to test methods, such as RIATA and RGNet, when they become available.

28

Chang, Chih-Jung, and 張志榮. "Gene Networks Reconstruction based on Structural Equation Modeling." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/21699798502766537048.


Abstract:

Master's thesis
National Taiwan University
Graduate Institute of Biomedical Engineering
96
With the continual progress of human genome research, more and more genes have been found to be closely related to human diseases. Accordingly, the exploration of gene function has become one of the major foci of biotechnology research. It is well known that genes do not work alone; a biological process may involve an enormous number of complicated interactions among genes. Because of the complexity of physiological and biochemical processes, the relations between genes and most diseases are currently not clear. The ultimate goal of gene network reconstruction is therefore to analyze the regulatory mechanisms among genes and understand how genes are involved in biological processes. Limited by the high cost of microarrays, most biological experiments cannot offer a large number of observations for gene network reconstruction. To overcome this limitation, this study proposes a new gene network model, the linear dynamic factor model, which is based on structural equation modeling. Besides observed variables, the linear dynamic factor model also incorporates hidden factors to depict regulation by proteins and other molecules that are not included in the gene network but influence it. We simulated data from a 6-gene network with different numbers of observations to see how the number of observations influences the performance of the algorithm. We also applied the algorithm to microarray data to reconstruct the gene networks of the focal adhesion pathway, of SGS1 and its synthetic sick or lethal (SSL) partners, and of the G2/M DNA damage checkpoint of Saccharomyces cerevisiae. On the simulated data, the algorithm performs well with 14 observations and better still with 52 observations. For the microarray data, the sensitivity (true positive rate) can be in the neighborhood of 50%.
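
The sensitivity quoted above is the fraction of true regulatory edges that the reconstructed network recovers. A minimal Python sketch of that calculation follows; the edge lists are hypothetical and are not taken from the thesis.

```python
def sensitivity(true_edges, predicted_edges):
    """True positive rate: recovered true edges / all true edges.

    Edges are (regulator, target) tuples, so direction matters.
    """
    true_set = set(true_edges)
    if not true_set:
        raise ValueError("sensitivity is undefined without any true edges")
    true_positives = len(true_set & set(predicted_edges))
    return true_positives / len(true_set)

# Hypothetical reference network and reconstruction.
reference = [("g1", "g2"), ("g2", "g3"), ("g3", "g1"), ("g4", "g1")]
reconstructed = [("g1", "g2"), ("g2", "g3"), ("g1", "g4")]
rate = sensitivity(reference, reconstructed)
```

In this toy case two of the four reference edges are recovered; the edge ("g1", "g4") does not count because its direction is reversed.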


29

"Reconstructing gene regulatory networks with new datasets." 2013. http://library.cuhk.edu.hk/record=b5549309.


Abstract:

競爭性內源核糖核酸(ceRNA) 假設最近已成為生物訊息學研究中最熱門的話題之一。Cell 是在生物科學界上經常被引用的學術期刊，早前亦有一班學者在Cell 2011年同一期成功發佈四篇關於ceRNA 假設的學術文章。跟據有關ceRNA 假設的學術文章，大部份學者均以不同的個別例子成功驗證假定，可是，欠缺一個大規模的及全面性的分析。
在我兩年碩士的研究中，我引入了一個新的概念微核糖核酸及其目標對向聚類(MTB) 運用了ceRNA 的假設，還提出算法，成功從微核糖核酸與信使核糖核酸的相互數據中找出一系列的MTB' 還利用GENCODE 項目上大量的微核糖核酸及信使核糖核酸的表達數據去驗証MTB 的概念。一方面，我從大量的表達數據中成功推斷出微核糖核酸與信使核糖核酸之間的相反關連、信使核糖核酸之間的正面關運和微核糖核酸之間的正面關連;另一方面，這些關連進一步肯定ceRNA 假設的真實性。此外，我提出一個從大量基因組中找出基因功能分析的方法，並在大量的MTB 的基因組中找出重要的基因註解。最後，我提出另一個MTB 概念的應用一新算法來預測微核糖核酸與信使核糖核酸的相互影響。總括而吉， MTB 概念從複雜且混亂的微核糖核酸與信使核糖核酸網絡中定義簡單且穩固的模姐，提供一個系統生物學分析微核糖核酸調節能力的方法。
The competing endogenous RNA (ceRNA) hypothesis has recently become one of the hottest topics in bioinformatics research. In 2011, four papers related to the ceRNA hypothesis were published simultaneously in Cell, a top journal in the life sciences. Most studies related to the ceRNA hypothesis have validated it on individual examples, without a large-scale and comprehensive analysis.
In my Master of Philosophy study, a novel concept, the miRNA Target Bicluster (MTB), is introduced to model the ceRNA hypothesis. MTBs are identified computationally from validated and/or predicted miRNA-mRNA interaction pairs. The MTB models were tested with mRNA and miRNA expression data from the GENCODE Project. Statistically significant miRNA-mRNA anti-correlation, mRNA-mRNA correlation and miRNA-miRNA correlation are found in the expression data, verifying with large-scale data support the correlation relations among mRNAs and miRNAs stated in the ceRNA hypothesis. Moreover, a novel large-scale functional enrichment analysis is performed, and the mRNAs selected by the MTBs are found to be biologically relevant. In addition, new target prediction algorithms are suggested as another application of the MTBs. Overall, the concept of MTB defines simple and robust modules within the complex and noisy miRNA-mRNA network, suggesting ways to carry out systems biology analyses of miRNA-mediated regulation.
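
The abstract does not define an MTB precisely. As a rough illustration only, the Python sketch below assumes the strictest possible reading, in which a bicluster is a group of miRNAs and mRNAs where every miRNA is reported to target every mRNA (a complete block in the binary association matrix). The function name, the gene and miRNA names, and this definition itself are assumptions, not the thesis's own.

```python
def is_complete_bicluster(interactions, mirnas, mrnas):
    """Check whether every listed miRNA targets every listed mRNA.

    interactions: set of (miRNA, mRNA) pairs, i.e. the 1-entries of a
                  binary miRNA-mRNA association matrix
    mirnas, mrnas: candidate rows and columns of the bicluster
    """
    if not mirnas or not mrnas:
        return False  # an empty side does not form a bicluster
    return all(
        (mirna, mrna) in interactions
        for mirna in mirnas
        for mrna in mrnas
    )

# Hypothetical interaction pairs.
interactions = {
    ("miR-a", "gene1"), ("miR-a", "gene2"),
    ("miR-b", "gene1"), ("miR-b", "gene2"), ("miR-b", "gene3"),
}
strict = is_complete_bicluster(interactions, ["miR-a", "miR-b"], ["gene1", "gene2"])
loose = is_complete_bicluster(interactions, ["miR-a", "miR-b"], ["gene1", "gene3"])
```

Under this assumed definition the first candidate qualifies and the second does not, because miR-a is not reported to target gene3.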
Detailed summary in vernacular field only.
Yip, Kit Sang Danny.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves [117]-126).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts also in Chinese.
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Contributions
Chapter 1.2 --- Thesis Outline
Chapter 2 --- Background
Chapter 2.1 --- Bioinformatics
Chapter 2.2 --- Biological Background
Chapter 2.2.1 --- The Central Dogma of Molecular Biology
Chapter 2.2.2 --- RNAs
Chapter 2.2.3 --- Competing Endogenous RNA (ceRNA) hypothesis
Chapter 2.2.4 --- Biological Considerations in Functional Enrichment Analysis
Chapter 2.3 --- Computational Background
Chapter 2.3.1 --- miRNA Genomic Annotation Prediction
Chapter 2.3.2 --- miRNA Target Interaction Prediction
Chapter 2.3.3 --- Applying Computational Algorithms on Related Problems
Chapter 2.3.4 --- Algorithms in Functional Enrichment Analysis
Chapter 2.4 --- Experiments and Data
Chapter 2.4.1 --- miRNA Target Interactions
Chapter 2.4.2 --- Expression Data
Chapter 2.4.3 --- Annotation Datasets
Chapter 2.5 --- Research Motivations
Chapter 3 --- Definitions of miRNA Target Biclusters (MTB)
Chapter 3.1 --- Representations
Chapter 3.1.1 --- Binary Association Matrix Representation
Chapter 3.1.2 --- Bipartite Graph Representation
Chapter 3.1.3 --- Mathematical Representation
Chapter 3.2 --- Concept of MTB
Chapter 3.2.1 --- MTB Restrictive Type (Type R)
Chapter 3.2.2 --- MTB Restrictive Type on miRNA (Type Rmi)
Chapter 3.2.3 --- MTB Restrictive Type on mRNA (Type Rm)
Chapter 3.2.4 --- MTB Restrictive and General Type (Type Rgen)
Chapter 3.2.5 --- MTB Loose Type (Type L)
Chapter 3.2.6 --- MTB Loose Type but restricts on miRNA (Type Lmi)
Chapter 3.2.7 --- MTB Loose Type but restricts on mRNA (Type Lm)
Chapter 3.2.8 --- MTB Loose and General Type (Type Lgen)
Chapter 3.2.9 --- A General Definition on all Eight Types
Chapter 3.2.10 --- Discussions
Chapter 4 --- MTB Workflow in Checking Correlation Relations
Chapter 4.1 --- MTB Workflow in Checking Correlation Relations
Chapter 4.1.1 --- MTB Identification
Chapter 4.1.2 --- Correlation Coefficients
Chapter 4.1.3 --- Scoring Scheme
Chapter 4.1.4 --- Background Construction
Chapter 4.1.5 --- Wilcoxon Rank-sum Test
Chapter 4.1.6 --- Preliminary Studies
Chapter 4.2 --- miRNA-mRNA Anti-correlation in Expression Data
Chapter 4.2.1 --- Interaction Datasets
Chapter 4.2.2 --- Expression Datasets
Chapter 4.2.3 --- Independence of the Choices of Datasets
Chapter 4.2.4 --- Independence of the Types of MTBs
Chapter 4.2.5 --- Independence of the Choices of Correlation Coefficients
Chapter 4.2.6 --- Dependence on the Way to Score
Chapter 4.2.7 --- Independence of the Way to Construct Background
Chapter 4.2.8 --- Independence of Natural Bias in Datasets
Chapter 4.3 --- mRNA-mRNA Correlation in Expression Data
Chapter 4.3.1 --- Variations in the Analysis
Chapter 4.3.2 --- Discussions
Chapter 4.4 --- miRNA-miRNA Correlation in Expression Data
Chapter 4.4.1 --- Variations in the Analysis
Chapter 4.4.2 --- Discussions
Chapter 5 --- Target Prediction Aided by MTB
Chapter 5.1 --- Workflow in Target Prediction
Chapter 5.2 --- Contingency Table Approach
Chapter 5.2.1 --- One-tailed Hypothesis Testing
Chapter 5.3 --- Ranked List Approach
Chapter 5.3.1 --- Wilcoxon Signed Rank Test
Chapter 5.4 --- Results and Discussions
Chapter 6 --- Large-scale Functional Enrichment Analysis
Chapter 6.1 --- Principles in Functional Enrichment Analysis
Chapter 6.1.1 --- Annotation Files
Chapter 6.1.2 --- Functional Enrichment Analysis on a gene set
Chapter 6.1.3 --- Functional Enrichment Analysis on many gene sets
Chapter 6.2 --- Results and Discussions
Chapter 7 --- Future Perspectives and Conclusions
Chapter 7.1 --- Applying MTB definition on other problems
Chapter 7.2 --- Matrix Definitions and Optimization Problems
Chapter 7.3 --- Non-binary association matrix problem settings
Chapter 7.4 --- Limitations
Chapter 7.5 --- Conclusions
Bibliography
Chapter A --- Publications
Chapter A.1 --- Publications


30

Whitehead, Dion [Verfasser]. "Reconstructing gene function and gene regulatory networks in prokaryotes / by Dion Whitehead." 2005. http://d-nb.info/977558460/34.


31

Su, Chu-Hsien, and 蘇矩賢. "Reconstruction of Interaction Networks of Escherichia coli through Literature Mining of Gene-Gene Relations." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/80389978697198390403.


Abstract:

Master's thesis
National Yang-Ming University
Institute of Biomedical Informatics
102
Previous studies on reconstructing interaction networks indicated that network-based methods represent the relationships between the genes and gene products of a target organism. In this study we used E. coli K-12 MG1655, an important model organism, as the target organism for reconstructing interaction networks. Interactions between E. coli macromolecules are usually obtained from databases and the literature. Most interactions in the databases lack literature support, and retrieving traceable literature citations for them manually is challenging. We applied text mining methods to extract interactions from 310,378 abstracts of E. coli research in the PubMed database and provide sentence-level annotations of the interactions. In random sampling evaluations, text mining achieved F-scores of 0.81, 0.86 and 0.93 for identifying gene regulations, physical interactions and signal transductions, respectively. Text mining extracted 1,084 interactions, of which 394 were newly identified compared with the interactions collected from the E. coli databases. These 394 newly identified interactions provided new insights and bridged gaps in the interaction networks of E. coli. A precision of 52% was achieved for the identification of interactions through text mining. We provided sentence-level annotations for 12% of the interactions collected from the E. coli databases. We performed functional enrichment analysis of the genes involved in the newly identified interactions extracted by text mining. The enriched functional categories are DNA replication and repair, biofilm formation, and cell motility associated with RpoS-centered stress responses of E. coli. After combining interactions collected from the databases with those extracted through text mining, we reconstructed integrated networks of E. coli.
From the integrated networks, we found that the newly identified interactions filled the gaps between separate components of the interaction networks built from the interactions collected from the databases. The newly identified interactions also changed the hierarchical organization of E. coli’s gene regulatory networks.
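
For readers unfamiliar with the metric, the F-score reported above is the harmonic mean of precision and recall. A minimal Python sketch follows; the precision and recall values passed in are hypothetical, since the abstract reports only the resulting F-scores, while the counts 394 and 1,084 are taken from the abstract.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # no correct predictions at all
    return 2 * precision * recall / (precision + recall)

# Hypothetical precision and recall for one interaction type.
example = f_score(precision=0.9, recall=0.75)

# Share of text-mined interactions that were new relative to the databases.
new_share = 394 / 1084
```

With the counts from the abstract, `new_share` comes out at roughly 36% of the extracted interactions.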
