Dissertations / Theses on the topic 'Multi-omics Integration'

Consult the top 24 dissertations / theses for your research on the topic 'Multi-omics Integration.'

1

Sathyanarayanan, Anita. "Integration of multi-omics data in cancer." Thesis, Queensland University of Technology, 2021. https://eprints.qut.edu.au/225924/1/Anita_Sathyanarayanan_Thesis.pdf.

Abstract:
Cancer is a complex disease with multiple molecular (omics) factors influencing its risk, development, prognosis, and treatment. The availability of large-scale multi-omics data has provided the opportunity to analyse these data jointly using advanced statistical approaches and to identify the cancer drivers and regulatory pathways underpinning the disease. In the first study, this thesis provides much-needed guidance for conducting multi-omics analysis using open-source software tools. Next, it introduces an enrichment pipeline that was developed using imputation-based integration of multi-omics data and applied to breast and prostate cancers to identify the associated biomarkers and genes.
2

Zandonà, Alessandro. "Predictive networks for multi meta-omics data integration." Doctoral thesis, Università degli studi di Trento, 2017. https://hdl.handle.net/11572/367893.

Abstract:
The role of the microbiome in disease onset and in equilibrium is being exposed by a wealth of high-throughput omics methods. All key research directions, e.g., the study of gut microbiome dysbiosis in IBD/IBS, indicate the need for bioinformatics methods that can model the complexity of microbial community ecology and unravel its disease-associated perturbations. A most promising direction is the “meta-omics” approach, which allows profiling based on various biological molecules at the metagenomic scale (e.g., metaproteomics, metametabolomics) as well as different “microbial” omes (eukaryotes and viruses) within a systems biology approach. This thesis introduces a bioinformatic framework for microbiota datasets that combines predictive profiling, differential network analysis and meta-omics integration. In detail, the framework identifies biomarkers discriminating amongst clinical phenotypes through machine learning techniques (Random Forest or SVM) based on a complete Data Analysis Protocol derived from two FDA-funded initiatives: the MicroArray Quality Control-II and Sequencing Quality Control projects. The biomarkers are interpreted in terms of biological networks: the framework provides a setup for network inference, quantification of network differences based on the glocal Hamming and Ipsen-Mikhailov (HIM) distance, and detection of network communities. The differential analysis of networks allows the study of microbiota structural organization as well as the evolving trajectories of microbial communities associated with the dynamics of the target phenotypes. Moreover, the framework combines a novel similarity network fusion method and machine learning to identify biomarkers from the integration of multiple meta-omics data. The framework implementation requires only standard open source computational biology tools, as a combination of R/Bioconductor and Python functions.
In particular, full scripts for meta-omics integration are available in a GitHub repository to ease reuse (https://github.com/AleZandona/INF). The pipeline has been validated on original data from three different clinical datasets. First, the predictive profiling and the network differential analysis were applied to a pediatric Inflammatory Bowel Disease (IBD) cohort (in faecal vs biopsy environments) and controls, in collaboration with a multidisciplinary team at the Ospedale Pediatrico Bambino Gesù (Rome, I). Then, the meta-omics integration was tested on paired bacterial and fungal gut microbiota human IBD datasets from the Gastroenterology Department of the Saint Antoine Hospital (Paris, F), thanks to the collaboration with the “Commensals and Probiotics-Host Interactions” team at INRA (Jouy-en-Josas, F). Finally, the framework was validated on a bacterial-fungal gut microbiota dataset from children affected by Rett syndrome. The different nature of the datasets used for validation naturally supports the extension of the framework to different omics datasets. Besides, clinical practice can take advantage of our framework, given the reproducibility and robustness of results, ensured by the adopted Data Analysis Protocol, as well as the biological relevance of the findings, confirmed by the clinical collaborators. Specifically, the omics-based dysbiosis profiles and the inferred biological networks can support the current diagnostic tools in revealing disease-associated perturbations at a much earlier, prodromal stage of disease and may be used for disease prevention, diagnosis and prognosis.
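The HIM distance named in the abstract combines a local Hamming component with a spectral Ipsen-Mikhailov component. As a minimal sketch of the Hamming part only, assuming two undirected, unweighted networks on the same node set, the snippet below counts the node pairs linked in exactly one network. The function name and the taxon labels are hypothetical, and this is not the code from the thesis repository.

```python
def hamming_distance(nodes, edges_a, edges_b):
    """Normalized Hamming distance between two undirected networks.

    Counts node pairs that are linked in exactly one of the two networks
    and divides by the number of possible pairs, giving a value in [0, 1].
    Assumes both networks share the same node set and have no self-loops.
    """
    a = {frozenset(edge) for edge in edges_a}
    b = {frozenset(edge) for edge in edges_b}
    n = len(nodes)
    possible_pairs = n * (n - 1) // 2
    if possible_pairs == 0:
        return 0.0
    return len(a ^ b) / possible_pairs


# Hypothetical co-occurrence networks for the same four taxa in two phenotypes.
taxa = ["A", "B", "C", "D"]
healthy = [("A", "B"), ("B", "C")]
disease = [("A", "B"), ("C", "D")]
distance = hamming_distance(taxa, healthy, disease)  # 2 differing pairs out of 6
```

A distance of 0 means the two networks have identical edges; the spectral component of HIM would additionally capture global structural differences that this edge-by-edge count misses.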
3

Serra, Angela. "Multi-view learning and data integration for omics data." Doctoral thesis, Università degli studi di Salerno, 2017. http://hdl.handle.net/10556/2580.

Abstract:
2015 - 2016
In recent years, the advancement of high-throughput technologies, combined with the constant decrease of data-storage costs, has led to the production of large amounts of data from different experiments that characterise the same entities of interest. This information may relate to specific aspects of a phenotypic entity (e.g., gene expression), or can include the comprehensive and parallel measurement of multiple molecular events (e.g., DNA modifications, RNA transcription and protein translation) in the same samples. Exploiting such complex and rich data is needed in the frame of systems biology for building global models able to explain complex phenotypes. For example, the use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug development. Moreover, the integration of gene expression data regarding cell treatment by drugs, and information regarding the chemical structure of the drugs, has allowed scientists to perform more accurate drug repositioning tasks. Unfortunately, there is a big gap between the amount of information and the knowledge into which it is translated. Moreover, there is a huge need for computational methods able to integrate and analyse data to fill this gap. Current research in this area follows two different integrative methods: one uses the complementary information of different measurements for the study of complex phenotypes on the same samples (multi-view learning); the other tends to infer knowledge about the phenotype of interest by integrating and comparing the experiments relating to it with respect to those of different phenotypes already known through comparative methods (meta-analysis).
Meta-analysis can be thought of as an integrative study of previous results, usually performed by aggregating the summary statistics from different studies. Due to its nature, meta-analysis usually involves homogeneous data. On the other hand, multi-view learning is a more flexible approach that considers the fusion of different data sources to get more stable and reliable estimates. Based on the type of data and the stage of integration, new methodologies have been developed spanning a landscape of techniques comprising graph theory, machine learning and statistics. Depending on the nature of the data and on the statistical problem to address, the integration of heterogeneous data can be performed at different levels: early, intermediate and late. Early integration consists in concatenating data from different views in a single feature space. Intermediate integration consists in transforming all the data sources into a common feature space before combining them. In the late integration methodologies, each view is analysed separately and the results are then combined. The purpose of this thesis is twofold: the first objective is the definition of a data integration methodology for patient sub-typing (MVDA) and the second is the development of a tool for phenotypic characterisation of nanomaterials (INSIdEnano). In this PhD thesis, I present the methodologies and the results of my research. MVDA is a multi-view methodology that aims to discover new statistically relevant patient sub-classes. Identifying patient subtypes of a specific disease is a challenging task, especially in early diagnosis. This is a crucial point for the treatment, because not all the patients affected by the same disease will have the same prognosis or need the same drug treatment. This problem is usually solved by using transcriptomic data to identify groups of patients that share the same gene patterns.
The main idea underlying this research work is to combine more omics data for the same patients to obtain a better characterisation of their disease profile. The proposed methodology is a late integration approach based on clustering. It works by evaluating the patient clusters in each single view and then combining the clustering results of all the views by factorising the membership matrices in a late integration manner. The effectiveness and the performance of our method were evaluated on six multi-view cancer datasets related to breast cancer, glioblastoma, prostate and ovarian cancer. The omics data used for the experiment are gene and miRNA expression, RNASeq and miRNASeq, Protein Expression and Copy Number Variation. In all the cases, patient sub-classes with statistical significance were found, identifying novel sub-groups previously not emphasised in literature. The experiments were also conducted by using prior information, as a new view in the integration process, to obtain higher accuracy in patients’ classification. The method outperformed the single view clustering on all the datasets; moreover, it performs better when compared with other multi-view clustering algorithms and, unlike other existing methods, it can quantify the contribution of single views in the results. The method has also been shown to be stable when perturbation is applied to the datasets by removing one patient at a time and evaluating the normalized mutual information between all the resulting clusterings. These observations suggest that integration of prior information with genomic features in sub-typing analysis is an effective strategy in identifying disease subgroups. INSIdE nano (Integrated Network of Systems bIology Effects of nanomaterials) is a novel tool for the systematic contextualisation of the effects of engineered nanomaterials (ENMs) in the biomedical context.
In recent years, omics technologies have been increasingly used to thoroughly characterise the molecular mode of action of ENMs. It is possible to contextualise the molecular effects of different types of perturbations by comparing their patterns of alterations. While this approach has been successfully used for drug repositioning, a comprehensive contextualisation of the ENM mode of action is still missing to date. The idea behind the tool is to use analytical strategies to contextualise or position the ENM with respect to relevant phenotypes that have been studied in literature (such as diseases, drug treatments, and other chemical exposures) by comparing their patterns of molecular alteration. This could greatly increase the knowledge on the ENM molecular effects and in turn contribute to the definition of relevant pathways of toxicity as well as help in predicting the potential involvement of ENM in pathogenetic events or in novel therapeutic strategies. The main hypothesis is that suggestive patterns of similarity between sets of phenotypes could be an indication of a biological association to be further tested in toxicological or therapeutic frames. Based on the expression signature associated with each phenotype, the strength of similarity between each pair of perturbations has been evaluated and used to build a large network of phenotypes. To ensure the usability of INSIdE nano, a robust and scalable computational infrastructure has been developed to scan this large phenotypic network, and an effective web-based graphic user interface has been built. Particularly, INSIdE nano was scanned to search for clique sub-networks, quadruplet structures of heterogeneous nodes (a disease, a drug, a chemical and a nanomaterial) completely interconnected by strong patterns of similarity (or anti-similarity).
The predictions have been evaluated for a set of known associations between diseases and drugs, based on drug indications in clinical practice, and between diseases and chemicals, based on literature-based causal exposure evidence, and focused on the possible involvement of nanomaterials in the most robust cliques. The evaluation of INSIdE nano confirmed that it highlights known disease-drug and disease-chemical connections. Moreover, disease similarities agree with the information based on their clinical features, and the similarities of drugs and chemicals mirror their resemblance based on chemical structure. Altogether, the results suggest that INSIdE nano can also be successfully used to contextualise the molecular effects of ENMs and infer their connections to other better studied phenotypes, speeding up their safety assessment as well as opening new perspectives concerning their usefulness in biomedicine. [edited by author]
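The contrast between early and late integration described in this abstract can be sketched in a few lines. The example below is illustrative only: all names and toy values are hypothetical, the per-view cluster labels are assumed to come from any clustering algorithm, and a simple co-association count stands in for the matrix factorisation that MVDA applies to the membership matrices.

```python
def early_integration(views, patients):
    """Early integration: concatenate each patient's features across all views."""
    return {p: [x for view in views for x in view[p]] for p in patients}


def membership_matrix(labels, patients):
    """One row per patient, one column per cluster; 1 marks membership."""
    clusters = sorted(set(labels.values()))
    return [[1 if labels[p] == c else 0 for c in clusters] for p in patients]


def late_integration(view_labels, patients):
    """Late integration: place the per-view membership matrices side by side."""
    matrices = [membership_matrix(labels, patients) for labels in view_labels]
    return [sum((m[i] for m in matrices), []) for i in range(len(patients))]


def co_association(stacked):
    """For each patient pair, count the views that put both in the same cluster."""
    n = len(stacked)
    return [[sum(a * b for a, b in zip(stacked[i], stacked[j])) for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: two views measured on the same four patients.
patients = ["p1", "p2", "p3", "p4"]
expression = {"p1": [1.0, 2.0], "p2": [1.1, 2.1], "p3": [5.0, 0.2], "p4": [5.2, 0.1]}
mirna = {"p1": [0.5], "p2": [3.0], "p3": [3.1], "p4": [2.9]}
early = early_integration([expression, mirna], patients)

# Cluster labels per view, assumed to be produced by a separate clustering step.
expression_labels = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}
mirna_labels = {"p1": 0, "p2": 1, "p3": 1, "p4": 1}
stacked = late_integration([expression_labels, mirna_labels], patients)
agreement = co_association(stacked)
```

Here `agreement[2][3]` equals 2 because both views place p3 and p4 together, while p1 and p2 agree in only one view. A factorisation of `stacked`, as the abstract describes, would derive consensus clusters from the same input.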
The advancement of high-throughput technologies, combined with the steady decrease in storage costs, has led to the production of large amounts of data from different experiments that characterise the same entities of interest. This information may relate to specific phenotypic aspects (for example, gene expression), or may include global and parallel measurements of different molecular aspects (for example, DNA modifications, RNA transcription and protein translation) in the same samples. Analysing such complex data is useful in the field of systems biology for building models capable of explaining complex phenotypes. For example, the use of genome-wide data in cancer research to identify groups of patients with similar molecular characteristics has become a standard approach for more accurate early prognosis and for identifying specific therapies. Moreover, the integration of gene expression data on the treatment of cells with drugs has allowed scientists to achieve high accuracy in drug repositioning. Unfortunately, there is a large gap between the data produced by the numerous experiments and the information into which they are translated. The scientific community therefore has a strong need for computational methods to integrate and analyse such data in order to fill this gap. Research in the field of multi-view analysis follows two different integrative methods: one uses the complementary information of different measurements to study complex phenotypes on different samples (multi-view learning); the other tends to infer knowledge about the phenotype of interest of an entity by comparing the experiments relating to it with those of other phenotypic entities already known in the literature (meta-analysis).
Meta-analysis can be thought of as a comparative study of the results identified in a particular experiment against those of previous studies. By its nature, meta-analysis usually involves homogeneous data. Multi-view learning, on the other hand, is a more flexible approach that considers the fusion of different data sources to obtain more stable and reliable estimates. Depending on the type of data and the level of integration, new methodologies have been developed from techniques based on graph theory, machine learning and statistics. Depending on the nature of the data and the statistical problem to be solved, the integration of heterogeneous data can be performed at different levels: early, intermediate and late integration. Early integration techniques consist in concatenating the data from the different views into a single feature space. Intermediate integration techniques consist in transforming all data sources into a single common space before combining them. In late integration techniques, each view is analysed separately and the results are then combined. The purpose of this thesis is twofold: the first objective is the definition of a data integration methodology for patient sub-typing (MVDA), and the second is the development of a tool for the phenotypic characterisation of nanomaterials (INSIdEnano). In this doctoral thesis I present the methodologies and results of my research. MVDA is a multi-view technique that aims to discover new, statistically relevant patient subtypes. Identifying patient subtypes for a specific disease is a goal of high importance in clinical practice, especially for the early diagnosis of diseases. This problem is generally solved by using transcriptomic data to identify groups of patients that share the same patterns of gene alteration.
The main idea underlying this research is to combine several types of omics data for the same patients to obtain a better characterisation of their profile. The proposed methodology is a late integration approach based on clustering. For each view, a clustering of the patients is performed and represented in the form of membership matrices. The results of all the views are then combined through a matrix factorisation technique to obtain the final multi-view metaclusters. The feasibility and performance of our method were evaluated on six multi-view datasets relating to breast cancer, glioblastoma, prostate cancer and ovarian cancer. The omics data used for the experiments relate to gene expression, miRNA expression, RNASeq, miRNASeq, protein expression and Copy Number Variation. In all datasets, statistically significant patient subtypes were identified, revealing new subgroups not previously known in the literature. Further experiments were conducted using prior knowledge of the patients' macro-classes. This information was treated as an additional view in the integration process to obtain higher accuracy in patient classification. The proposed method performs better than classical clustering algorithms on all datasets. MVDA obtained better results than other early and intermediate integration algorithms. In addition, the method can compute the contribution of each single view to the final result. The results also show that the method is stable under perturbations of the dataset performed by removing one patient at a time (leave-one-out).
These observations suggest that integrating prior information and genomic features, used jointly during the analysis, is a successful strategy for identifying disease subtypes. INSIdE nano (Integrated Network of Systems bIology Effects of nanomaterials) is an innovative tool for the systematic contextualisation of the effects of nanoparticles (ENMs) in biomedical contexts. In recent years, omics technologies have been widely applied to characterise nanomaterials at the molecular level. It is possible to contextualise the molecular-level effect of different types of perturbations by comparing their patterns of gene alteration. While this approach has been applied successfully in the field of drug repositioning, an extensive contextualisation of the effect of nanomaterials on cells is currently missing. The idea behind the tool is to use comparative analysis strategies to contextualise or position nanomaterials relative to relevant phenotypes that have been studied in the literature (such as human diseases, drug treatments or exposures to chemical substances) by comparing their patterns of molecular alteration. This could increase knowledge of the molecular effect of nanomaterials and contribute to the definition of new toxicological pathways, or identify possible involvement of nanomaterials in pathological events or in new therapeutic strategies. The underlying hypothesis is that the identification of similarity patterns between sets of phenotypes could indicate a biological association that must subsequently be tested in a toxicological or therapeutic setting. Based on the gene expression signature associated with each phenotype, the similarity between each pair of perturbations was evaluated and used to build a large interaction network between phenotypes.
To ensure that INSIdE nano can be used, a robust and scalable computational infrastructure was developed for analysing this network. In addition, a website was built that allows users to query and visualise the network simply and efficiently. In particular, INSIdE nano was analysed by searching for all possible cliques of four heterogeneous elements (a nanomaterial, a drug, a disease and a chemical substance). A clique is a completely connected sub-network in which every element is linked to all the others. Of all the cliques, only those for which the associations between drug and disease and between drug and chemical substances are known were considered significant. The known connections between drugs and diseases are based on the fact that the drug is prescribed to treat that disease. The known connections between disease and chemical substances are based on evidence in the literature that those substances cause the disease. The focus was placed on the possible involvement of nanomaterials with the diseases present in these cliques. The evaluation of INSIdE nano confirmed that it highlights known connections between diseases and drugs and between diseases and chemical substances. Moreover, the similarity between diseases computed from genes agrees with the information based on their clinical features. Likewise, the similarities between drugs and chemical substances mirror their similarities based on chemical structure. Overall, the results suggest that INSIdE nano can be used to contextualise the molecular effect of nanomaterials and to infer their connections to phenotypes previously studied in the literature. This method speeds up the assessment of their toxicity and opens new perspectives for their use in biomedicine. [edited by the author]
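The stability check mentioned for MVDA (removing one patient at a time and comparing the resulting clusterings by normalized mutual information) can be illustrated as follows. This is a sketch under stated assumptions: NMI is normalised here by the arithmetic mean of the two entropies, which is one of several common conventions; each reduced clustering is compared against the full clustering rather than against every other reduced clustering; and the mean-threshold `toy_cluster` function is a hypothetical stand-in for a real clustering algorithm.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same items."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in joint.items())


def nmi(a, b):
    """Mutual information divided by the mean of the two entropies."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 and h_b == 0.0:
        return 1.0  # both assignments are a single cluster
    return 2.0 * mutual_information(a, b) / (h_a + h_b)


def leave_one_out_scores(values, cluster_fn):
    """NMI between the full clustering and each clustering with one item removed."""
    full = cluster_fn(values)
    scores = []
    for i in range(len(values)):
        reduced = cluster_fn(values[:i] + values[i + 1:])
        reference = full[:i] + full[i + 1:]
        scores.append(nmi(reference, reduced))
    return scores


def toy_cluster(values):
    """Hypothetical clustering: split the values at their mean."""
    mean = sum(values) / len(values)
    return [int(v > mean) for v in values]


scores = leave_one_out_scores([1.0, 1.2, 0.9, 5.0, 5.3, 4.8], toy_cluster)
```

Scores close to 1 for every removed patient indicate that the clustering barely changes under this perturbation. NMI is invariant to relabelling, so swapping cluster names between runs does not lower the score.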
XV n.s.
4

Lu, Yingzhou. "Multi-omics Data Integration for Identifying Disease Specific Biological Pathways." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/83467.

Abstract:
Pathway analysis is an important task for gaining novel insights into the molecular architecture of many complex diseases. With the advancement of new sequencing technologies, a large amount of quantitative gene expression data has been continuously acquired. The emergence of omics data sets such as proteomics has facilitated the investigation of disease-relevant pathways. Although much work has previously been done to explore single-omics data, little work has been reported using multi-omics data integration, mainly due to methodological and technological limitations. While a single omics data set can provide useful information about the underlying biological processes, multi-omics data integration would be much more comprehensive about the cause-effect processes responsible for diseases and their subtypes. This project investigates the combination of miRNAseq, proteomics, and RNAseq data on seven types of muscular dystrophies and a control group. These unique multi-omics data sets provide us with the opportunity to identify disease-specific and most relevant biological pathways. We first perform the t-test and the OVEPUG test separately to define the differentially expressed genes in the protein and mRNA data sets. In multi-omics data sets, miRNA also plays a significant role in muscle development by regulating its target genes in the mRNA data set. To exploit the relationship between miRNA and gene expression, we consult the commonly used library TargetScan to collect all miRNA-mRNA and miRNA-protein co-expression pairs. Next, by conducting statistical analysis such as Pearson's correlation coefficient or the t-test, we measure the biologically expected correlation of each gene with its upstream miRNAs and identify those showing negative correlation among the aforementioned miRNA-mRNA and miRNA-protein pairs.
Furthermore, we identify and assess the most relevant disease-specific pathways by inputting the differentially expressed genes and the negatively correlated genes into the gene-set libraries respectively, and further characterize these prioritized marker subsets using IPA (Ingenuity Pathway Analysis) or KEGG. We then use Fisher's method to combine the p-values derived from the separate gene sets into a joint significance test assessing common pathway relevance. In conclusion, we find all negatively correlated miRNA-mRNA and miRNA-protein pairs and identify several pathophysiological pathways related to muscular dystrophies by gene set enrichment analysis. This novel multi-omics data integration study and the subsequent pathway identification will shed new light on pathophysiological processes in muscular dystrophies and improve our understanding of the molecular pathophysiology of muscle disorders, supporting disease prevention and treatment and, in the long term, better health.
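Fisher's method, which the abstract uses to merge p-values from separate gene sets, can be written compactly. The sketch below assumes the p-values are independent and relies on the closed-form upper tail of the chi-square distribution for an even number of degrees of freedom, so no statistics library is needed. The function name and the example p-values are hypothetical and do not come from the thesis.

```python
from math import exp, log


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the upper tail is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i  # builds (X/2)^i / i! incrementally
        total += term
    return exp(-half_stat) * total


# Hypothetical enrichment p-values for one pathway in the mRNA and protein data.
combined = fisher_combine([0.05, 0.05])
```

With a single p-value the function returns that value unchanged, which is a quick sanity check on the formula. Two moderately small p-values combine into a smaller one, reflecting the accumulated evidence.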
Master of Science
5

Zampieri, Guido. "Prioritisation of candidate disease genes via multi-omics data integration." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3421826.

Abstract:
The uncovering of genes linked to human diseases is a pressing challenge in molecular biology, towards the full achievement of precision medicine. Next-generation technologies provide an unprecedented amount of biological information, but at the same time they unveil enormous numbers of candidate disease genes and pose novel challenges at multiple analytical levels. Multi-omics data integration is currently the principal strategy to prioritise candidate disease genes. In particular, kernel-based methods are a powerful resource for the integration of biological knowledge, but their use is often precluded by their limited scalability. In this thesis, we propose a novel scalable kernel-based method for gene prioritisation which implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimisation of the margin distribution in binary problems. Our method is optimised to cope with strongly unbalanced settings where known disease genes are few and large scale predictions are required. Importantly, it is able to efficiently deal both with a large amount of candidate genes and with an arbitrary number of data sources. Through the simulation of real case studies, we show that our method outperforms a wide range of state-of-the-art methods and has enhanced scalability compared to existing kernel-based approaches for genomic data. We apply the proposed method to investigate the potential role for disease gene prediction of metabolic rearrangements caused by genetic perturbations. To this end, we use constraint-based modelling of metabolism to generate gene-specific information at a genome scale, which is mined via machine learning. Moreover, we compare constraint-based modelling and our kernel-based method as alternative integration strategies for omics data such as transcriptional profiles. 
Experimental assessments across various cancers demonstrate that information on metabolic rewiring reconstructed in silico can be valuable to prioritise associated genes, although accuracy strongly depends on the cancer type. Despite these fluctuations, predictions achieved starting from metabolic modelling are largely complementary to those from gene expression or pathway annotations, highlighting the potential of this approach to identify novel genes involved in cancer.
The discovery of genes linked to human diseases is a pressing challenge in molecular biology, with a view to the full achievement of precision medicine. Next-generation technologies provide an unprecedented amount of biological information, but at the same time they reveal enormous numbers of candidate disease genes and pose new challenges at multiple levels of analysis. Multi-omics data integration is currently the main strategy for prioritising candidate disease genes. In particular, kernel-based methods are a powerful resource for integrating biological knowledge, yet their use is often precluded by their limited scalability. In this thesis, we propose a new scalable kernel method for gene prioritisation, which applies a new multiple kernel learning approach based on a semi-supervised perspective and on the optimisation of the margin distribution in binary problems. Our method is optimised to cope with strongly unbalanced conditions in which few known disease genes are available and large-scale predictions are required. Significantly, it can handle both a large number of candidates and an arbitrary number of information sources. Through the simulation of real case studies, we show that our method outperforms a wide range of state-of-the-art methods and scales better than existing kernel methods for genomic data. We apply the proposed method to study the potential role, for disease gene prediction, of metabolic rearrangements caused by genetic perturbations. To this end, we use constraint-based models of metabolism to generate genome-scale information on genes, which is analysed through machine learning.
Furthermore, we compare constraint-based models and our kernel-based method as alternative integration strategies for omics data such as transcriptional profiles. Experimental evaluations on various cancers show that metabolic rearrangements reconstructed in silico can be useful for prioritising the associated genes, although accuracy depends strongly on the cancer type. Despite these fluctuations, predictions based on metabolic models are largely complementary to those based on gene expression or pathway annotations, highlighting the potential of this approach to identify new genes involved in cancer.
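The core idea behind kernel-based integration is that each data source yields a gene-by-gene similarity matrix (a kernel) and that a non-negative weighted sum of kernels is itself a valid kernel. The sketch below illustrates that idea with fixed, equal weights and a simple mean-similarity ranking. It is not the method proposed in the thesis, which learns the combination by optimising the margin distribution, and all names and toy values are hypothetical.

```python
def linear_kernel(features):
    """Gram matrix of dot products between the rows of one data source."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]


def normalise(kernel):
    """Cosine normalisation: every gene gets unit self-similarity.

    Assumes no gene has an all-zero feature vector.
    """
    n = len(kernel)
    d = [kernel[i][i] ** 0.5 for i in range(n)]
    return [[kernel[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]


def combine_kernels(kernels, weights):
    """Weighted sum of kernels; non-negative weights keep the result a kernel."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]


def rank_candidates(kernel, seeds, candidates):
    """Order candidate genes by mean similarity to the known disease genes."""
    score = {c: sum(kernel[c][s] for s in seeds) / len(seeds) for c in candidates}
    return sorted(candidates, key=lambda c: score[c], reverse=True)


# Hypothetical data sources for four genes (rows are genes 0 to 3).
expression = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
pathways = [[1, 1, 0], [1, 1, 0], [0, 1, 1], [1, 0, 0]]

kernels = [normalise(linear_kernel(source)) for source in (expression, pathways)]
combined = combine_kernels(kernels, [0.5, 0.5])
ranking = rank_candidates(combined, seeds=[0], candidates=[1, 2, 3])
```

In this toy example gene 0 is the only known disease gene, and gene 1 ranks first because it resembles gene 0 in both sources. A multiple kernel learning method would replace the fixed weights with weights learned from the labelled genes.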
6

Xiao, Hui. "Network-based approaches for multi-omic data integration." Thesis, University of Cambridge, 2019. https://www.repository.cam.ac.uk/handle/1810/289716.

Abstract:
The advent of advanced high-throughput biological technologies provides opportunities to measure the whole genome at different molecular levels in biological systems, producing different types of omic data such as the genome, epigenome, transcriptome, translatome, proteome, metabolome and interactome. Biological systems are highly dynamic and complex mechanisms that involve not only within-level functionality but also between-level regulation. To uncover the complexity of biological systems, it is desirable to integrate multi-omic data and so transform the multi-level data into biological knowledge about the underlying mechanisms. Because multi-omic data are heterogeneous and high-dimensional, effective and efficient methods for their integration are needed. This thesis aims to develop efficient approaches for multi-omic data integration using machine learning methods and network theory. We assume that a biological system can be represented by a network with nodes denoting molecules and edges indicating functional links between molecules, in which multi-omic data can be integrated as attributes of nodes and edges. We propose four network-based approaches for multi-omic data integration using machine learning methods. Firstly, we propose an approach for gene module detection that integrates multi-condition transcriptome data and interactome data using a network overlapping-module detection method. We apply the approach to transcriptome data of human pre-implantation embryos across multiple developmental stages, and identify several stage-specific dynamic functional modules and genes that provide interesting biological insights. We evaluate the reproducibility of the modules by comparison with other widely used methods and show that the intra-module genes overlap significantly between the different methods.
Secondly, we propose an approach for gene module detection that integrates transcriptome, translatome, and interactome data using a multilayer network. We apply the approach to ribosome profiling data of mTOR-perturbed human prostate cancer cells and mine several translation-efficiency-regulated modules associated with mTOR perturbation. We develop an R package, TERM, that implements the proposed approach and offers a useful tool for the research field. Next, we propose an approach for feature selection that integrates transcriptome and interactome data using network-constrained regression. We develop a more efficient network-constrained regression method, eGBL. We evaluate its performance in terms of variable selection and prediction, and show that eGBL outperforms other related regression methods. Applying it to transcriptome data of human blastocysts, we select several genes of interest associated with time-lapse parameters. Finally, we propose an approach for classification that integrates epigenome and transcriptome data using neural networks. We introduce a superlayer neural network (SNN) model which learns DNA methylation and gene expression data in parallel in superlayers, with cross-connections allowing crosstalk between them. We evaluate its performance on human breast cancer classification, where the SNN outperforms several other common machine learning methods. The approaches proposed in this thesis offer effective and efficient solutions for integrating heterogeneous high-dimensional datasets, and can be easily applied to other datasets with similar structures. They are therefore applicable to many fields including but not limited to bioinformatics and computer science.
7

Di Nanni, Noemi. "A network diffusion method for the integration of multi-omics data with applications in precision medicine." Doctoral thesis, Università degli studi di Pavia, 2020. http://hdl.handle.net/11571/1315930.

8

Schulte-Sasse, Roman. "Integration of multi-omics data with graph convolutional networks to identify cancer-associated genes." Berlin: Freie Universität Berlin, 2021. http://nbn-resolving.de/urn:nbn:de:kobv:188-refubium-31311-1.

9

Bussola, Nicole. "AI for Omics and Imaging Models in Precision Medicine and Toxicology." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/348706.

Abstract:
This thesis develops an Artificial Intelligence (AI) approach intended for accurate patient stratification and precise diagnostics/prognostics in clinical and preclinical applications. The rapid advance in high-throughput technologies and bioinformatics tools is still far from precisely linking genome-phenotype interactions with the biological mechanisms that underlie pathophysiological conditions. In practice, incomplete knowledge of individual heterogeneity in complex diseases keeps forcing clinicians to settle for surrogate endpoints and therapies based on a generic one-size-fits-all approach. The working hypothesis is that AI can add new tools to elaborate and integrate, in new features or structures, the rich information now available from high-throughput omics and bioimaging data, and that such restructured information can be applied through predictive models for the precision medicine paradigm, thus favoring the creation of safer tailored treatments for specific patient subgroups. The computational techniques in this thesis combine dimensionality reduction methods with Deep Learning (DL) architectures to learn meaningful transformations between the input and the predictive endpoint space. The rationale is that such transformations can introduce intermediate spaces offering more succinct representations, where data from different sources are summarized. The research goal was approached at increasing levels of complexity, starting from single input modalities (omics and bioimaging of different types and scales) and moving to their multimodal integration. The approach also deals with the key challenges for machine learning (ML) on biomedical data, i.e. reproducibility, stability, and interpretability of the models. Along this path, the thesis contributes a set of specialized AI models and a core framework of three tools of general applicability: i. A Data Analysis Plan (DAP) for model selection and evaluation of classifiers on omics and imaging data to avoid selection bias. ii. The histolab Python package, which standardizes the reproducible pre-processing of Whole Slide Images (WSIs), is supported by automated testing and is easily integrable in DL pipelines for Digital Pathology. iii. Unsupervised and dimensionality reduction techniques based on the UMAP and TDA frameworks for patient subtyping. The framework has been successfully applied to public as well as original data in precision oncology and predictive toxicology.
In the clinical setting, this thesis has developed: 1. (DAPPER) A deep learning framework for evaluation of predictive models in Digital Pathology that controls for selection bias through properly designed data partitioning schemes. 2. (RADLER) A unified deep learning framework that combines radiomics features and imaging on PET-CT images for prognostic biomarker development in head and neck squamous cell carcinoma. The mixed deep learning/radiomics approach is more accurate than using only one feature type. 3. An ML framework for automated quantification of tumor infiltrating lymphocytes (TILs) in onco-immunology, validated on original pathology Neuroblastoma data from the Bambino Gesù Children's Hospital, with high agreement with trained pathologists. 4. The network-based INF pipeline, which applies machine learning models over the combination of multiple omics layers and also provides compact biomarker signatures. INF was validated on three TCGA oncogenomic datasets.
In the preclinical setting the framework has been applied for: 1. Deep and machine learning algorithms to predict drug-induced liver injury (DILI) status from gene expression (GE) data derived from cancer cell lines on the CMap Drug Safety dataset. 2. (ML4TOX) Deep Learning and Support Vector Machine models to predict potential endocrine disruption of environmental chemicals on the CERAPP dataset. 3. (PathologAI) A deep learning pipeline combining generative and convolutional models for preclinical digital pathology. Developed as an internal project within the FDA/NCTR AIRForce initiative and applied to predict necrosis on images from the TG-GATEs project, PathologAI aims to improve accuracy and reduce labor in the identification of lesions in predictive toxicology. Furthermore, GE microarray data were integrated with histology features in a unified multi-modal scheme combining imaging and omics data. The solutions were developed in collaboration with domain experts and considered promising for application.
10

Samaras, Patroklos E. "Multi-omics data integration and data model optimization in ProteomicsDB." Supervisor: Bernhard Küster. Reviewers: Bernhard Küster, Martin Eisenacher, Julien Gagneur. München: Universitätsbibliothek der TU München, 2020. http://d-nb.info/1223616886/34.

11

Balázs, Kinga. "Multi-omics data integration approaches to study glucocorticoid receptor function." Supervisor: Nina Henriette Uhlenhaut. Reviewers: Nina Henriette Uhlenhaut, Hans-Werner Mewes. München: Universitätsbibliothek der TU München, 2021. http://d-nb.info/1238781640/34.

12

Kim, Jieun. "Computational tools for the integrative analysis of multi-omics data to decipher trans-omics networks." Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/28524.

Abstract:
Regulatory networks define the phenotype, morphology, and function of cells. These networks are built from the basic building blocks of the cell (DNA, RNA, and proteins) and cut across the respective omics layers (genome, transcriptome, and proteome). The resulting omics networks comprise a near-infinite number of possible nodes and edges that intricately connect the 'omes'. With the rapid advancement of technologies that generate omics data in bulk samples and now at single-cell resolution, the life sciences face the challenge of connecting these omes to generate trans-omics networks. To this end, this thesis addresses some of the pressing challenges in trans-omics network reconstruction and the integrative analysis of omics data at both bulk and single-cell resolution: 1) the lack of an integrated pipeline for processing and downstream analysis of lesser-studied omics layers; 2) the need for an integrative framework to reconstruct transcriptional networks and discover novel transcriptional regulators; and 3) the development of tools for the reconstruction of single-cell multi-modal TRNs. I envision that the work in this thesis will contribute to the integrative study of bulk and single-cell trans-omics data, which I believe will become an essential and standard part of molecular biological studies as the comprehensiveness and accuracy of omics measurements, and of the databases that connect different omics layers, improve.
13

Ding, Hao. "Visualization and Integrative analysis of cancer multi-omics data." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1467843712.

14

Liu, Yunpeng. "Integrative multi-omics dissection of cancer cell states and susceptibility." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/130818.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Biology, February, 2021
Cataloged from the official PDF of thesis. "February 2021."
Includes bibliographical references (pages 217-239).
Cancer cells are characterized by a broad spectrum of unique genetic, epigenetic and transcriptional states, which are often concomitant with high degrees of plasticity in cell identity. These cell states and the fluidity therein are a major source of resistance to both chemotherapy and targeted therapy. Combinatorial efforts in experimental assays and computational modeling are pivotal for understanding the origins of cancer cell plasticity and exposing cell state-specific vulnerabilities. In this thesis, I will first present my studies on two clinically challenging types of hematopoietic malignancies and discuss key genes that sustain cell identity and survival programs revealed through multi-omics approaches.
In the first study, a combination of expression, chromatin binding and chromatin accessibility analyses revealed novel functions of the plant homeodomain finger-like family protein PHF6 as a lineage identity regulator in a mouse model of BCR-ABL-driven B cell acute lymphoblastic leukemia. In the second case, single-cell transcriptomic profiling, computational inference of cell cycle trajectories and unbiased functional genomics jointly identified RAD51B as a uniquely essential gene in near-haploid leukemia. Finally, to systematically model heterogeneous cell states and generate readily testable predictions of susceptibilities in cancer, I proposed a novel computational pipeline that integrates multiple data types to construct a quantitative model of transcription regulation, which can in turn be used to infer changes in gene expression in response to transcription factor perturbation.
The pipeline then uses these gene expression responses to perturbations to estimate changes in protein activity and finds a combination of protein activity score changes that best predicts changes in cell fitness. Applying the pipeline to glioblastoma multiforme, a cancer type that lacks effective targeted therapy, I prioritized a small set of genes including MYBL2 as subtype-specific candidate targets. My thesis work demonstrates the power of integrative, multi-omics approaches for effective discovery of susceptibilities in cancer and highlights an emerging paradigm for understanding the information flow in the cellular circuitry.
15

Noack, Stephan. "Integrative Auswertung von Multi-Omics-Daten aus dem Zentralstoffwechsel von Corynebacterium glutamicum." Siegen: Universitätsbibliothek der Universität Siegen, 2011. http://d-nb.info/1017706166/34.

16

Zhou, Shanshan. "Integrating multi-omics to investigate the correlation between the quality and efficacy of ginseng." HKBU Institutional Repository, 2019. https://repository.hkbu.edu.hk/etd_oa/693.

Abstract:
Ginseng, the root and rhizome of Panax ginseng C. A. Mey. (Araliaceae), is one of the most famed dietary and medicinal herbs worldwide due to its multifaceted efficacies. Ginsenosides and carbohydrates have been shown to be the major bioactive components of ginseng. Ginseng materials are always formed under various conditions, e.g. different growth years or different post-harvest processing/handling manners. These conditions can alter chemical profiles and thereby lead to differences in the quality and efficacy of ginseng. To address this issue, it is necessary to understand the correlation between the quality and efficacy of ginseng materials formed under different conditions. Previous studies have attempted to investigate how growth years and post-harvest processing/handling manners affect the quality and efficacy of ginseng. In most of these cases, several chemical components and biological parameters were selected as the indicators for evaluating the quality and efficacy of ginseng, respectively. However, it is well recognized that ginseng therapy is characterized by "multiple components against multiple targets". Therefore, a few selected indicators may fail to comprehensively characterize the quality and efficacy of ginseng, and thus cannot accurately reveal their correlations. Instead, holism-based approaches should be employed. In this study, we integrated chemomics, metabolomics and gut microbiota genomics to investigate the correlation between the quality and efficacy of ginseng under the conditions of growth years, steam-processing and sulfur-fumigation.
First, a chemomics approach was developed to qualitatively and quantitatively determine major ginsenosides and carbohydrates (poly-, oligo- and monosaccharides) by ultra-high performance liquid chromatography-tandem triple quadrupole mass spectrometry (UHPLC-QqQ-MS/MS) and high performance liquid chromatography coupled with an evaporative light scattering detector (HPLC-ELSD), in order to characterize the overall quality of ginseng. Second, ultra-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UPLC-QTOF-MS/MS)-based metabolomics and 16S rRNA gene sequencing-based gut microbiota genomics, coupled with the determination of biochemical parameters, were performed to evaluate the anti-fatigue and anti-obesity activities of the different ginseng materials in animal models. Third, the obtained multi-omics data were processed by multivariate statistical analysis and then integrated to discuss the correlation between the quality and efficacy of ginseng materials in different conditions. The results indicated that: 1) ginseng with 4-6 growth years possessed different anti-fatigue activity in multiple targets due to the different effects of ginsenosides and carbohydrates on endogenous metabolism and gut microbiota; 2) steam-processing qualitatively and quantitatively altered ginsenosides and carbohydrates in ginseng, resulting in different anti-obesity activity between white ginseng and red ginseng, and the mechanisms potentially involve chemically structural/compositional specificity to gut microbiota; 3) SO2 residual content caused by sulfur-fumigation did not correlate with the quality, efficacy and toxicity changes of sulfur-fumigated ginseng; more specifically, less SO2 residue did not indicate higher quality, better efficacy or weaker toxicity. The research provides scientific insights for guiding the clinical and dietary practice of ginseng and offers a new methodology for comprehensively exploring the correlation between the quality and efficacy of herbal medicines.
17

Schneider, Lara Kristina. "Multi-omics integrative analyses for decision support systems in personalized cancer treatment." Supervisor: Hans-Peter Lenhof. Saarbrücken: Saarländische Universitäts- und Landesbibliothek, 2020. http://d-nb.info/1213723973/34.

18

Wery, Méline. "Identification de signature causale pathologie par intégration de données multi-omiques." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S071.

Abstract:
Systemic lupus erythematosus is an example of a complex, heterogeneous and multifactorial disease. Identifying signatures that can explain the cause of a disease remains an important challenge for patient stratification. Moreover, classical statistical analyses are difficult to apply when the populations of interest are heterogeneous, and they do not reveal causes. This thesis therefore presents two methods that address these issues. First, a trans-omic model is described that structures all the omics data using the Semantic Web (RDF). The model is populated through a patient-level analysis. Querying it with SPARQL made it possible to identify expression Individually-Consistent Trait Loci (eICTLs): SNP-gene pairs, associated by reasoning, in which the presence of the SNP influences the variation in the gene's expression. These elements reduce the dimensionality of the omics data and are more informative than the genomic data. This first method relies solely on omics data. The second method is based on the dependencies between regulations in biological networks. By combining the dynamics of biological systems with formal concept analysis, the generated stable states are automatically classified. This classification made it possible to enrich biological signatures characteristic of a phenotype. In addition, new hybrid phenotypes were identified.
19

Ronen, Jonathan. "Integrative analysis of data from multiple experiments." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/21612.

Abstract:
The development of high-throughput sequencing (HTS) was followed by a swarm of protocols that use HTS to measure different molecular aspects such as gene expression (transcriptome), DNA methylation (methylome) and more. This opened opportunities for developing data analysis algorithms and procedures that consider data produced by different experiments. Considering data from seemingly unrelated experiments is particularly beneficial for single-cell RNA sequencing (scRNA-seq). scRNA-seq produces particularly noisy data, owing to the loss of nucleic acids when handling the small amounts present in single cells and to various technical biases. To address these challenges, I developed a method called netSmooth, which de-noises and imputes scRNA-seq data by applying network diffusion over a gene network that encodes expectations of co-expression patterns. The gene network is constructed from other experimental data. Using a gene network constructed from protein-protein interactions, I show that netSmooth outperforms other state-of-the-art scRNA-seq imputation methods at identifying blood cell types in hematopoiesis, elucidating time series data in an embryonic development dataset, and identifying the tumor of origin for scRNA-seq of glioblastomas. netSmooth has a free parameter, the diffusion distance, which I show can be selected using data-driven metrics. Thus, netSmooth may be used even when the diffusion distance cannot be optimized explicitly using ground-truth labels. Another task that requires in-tandem analysis of data from different experiments arises when different omics protocols are applied to the same biological samples. Analyzing such multi-omics data in an integrated fashion, rather than each data type (RNA-seq, DNA-seq, etc.) on its own, is beneficial, as each omics experiment elucidates only part of an integrated cellular system. The simultaneous analysis may reveal a more comprehensive view. I developed a method called maui to find latent factor representations of multi-omics data.
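The network diffusion idea described in this abstract can be illustrated with a minimal sketch. It is not the netSmooth implementation: it assumes a random-walk-with-restart formulation over a column-normalized gene network, and the function name `diffuse_expression`, the restart weight `alpha`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def diffuse_expression(expr, adjacency, alpha=0.5):
    """Smooth a genes x cells matrix over a gene network (illustrative only).

    Assumed formulation: smoothed = (1 - alpha) * (I - alpha * W)^-1 @ expr,
    where W is the column-normalized adjacency matrix. Larger alpha lets
    signal travel further across the network; alpha = 0 returns the input.
    """
    expr = np.asarray(expr, dtype=float)
    adj = np.array(adjacency, dtype=float)  # np.array copies, so the input is not mutated
    n = adj.shape[0]
    if adj.shape != (n, n) or expr.shape[0] != n:
        raise ValueError("adjacency must be genes x genes and match expr rows")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")

    # Give isolated genes a self-loop so their values pass through unchanged.
    isolated = np.flatnonzero(adj.sum(axis=0) == 0)
    adj[isolated, isolated] = 1.0

    # Column-normalize so that each column of W sums to 1.
    transition = adj / adj.sum(axis=0)

    # Solve the linear system instead of forming an explicit inverse.
    system = np.eye(n) - alpha * transition
    return (1.0 - alpha) * np.linalg.solve(system, expr)


# Toy example: three genes in a chain (0 - 1 - 2), two cells.
network = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]])
counts = np.array([[4.0, 0.0],
                   [0.0, 0.0],
                   [2.0, 6.0]])
smoothed = diffuse_expression(counts, network, alpha=0.5)
```

In this toy case the middle gene, measured as zero in both cells, receives signal from its neighbours, which is the sense in which diffusion imputes dropouts. With a column-normalized network the total signal per cell is preserved.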
20

Teng, Sin Yong. "Intelligent Energy-Savings and Process Improvement Strategies in Energy-Intensive Industries." Doctoral thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-433427.

Abstract:
As new technologies for energy-intensive industries continue to be developed, existing plants gradually fall behind in efficiency and productivity. Fierce market competition and environmental legislation force these traditional plants to cease operation and shut down. Process improvement and retrofit projects are essential for maintaining the operational performance of these plants. Current approaches to process improvement are mainly process integration, process optimization and process intensification. In general, these areas rely on mathematical optimization, the practitioner's experience and operational heuristics. These approaches serve as the basis for process improvement. However, their performance can be further improved with modern computational intelligence. The purpose of this work is therefore to apply advanced artificial intelligence and machine learning techniques to process improvement in energy-intensive industrial processes. The work takes an approach that addresses this problem by simulating industrial systems and makes the following contributions: (i) Application of machine learning techniques, including one-shot learning and neuro-evolution, for data-driven modelling and optimization of individual units. (ii) Application of dimensionality reduction (e.g. principal component analysis, autoencoder) for multi-objective optimization of multi-unit processes. (iii) Design of a new tool for analysing problematic parts of a system in order to remove them (bottleneck tree analysis, BOTA). An extension of the tool that allows multi-dimensional problems to be solved with a data-driven approach was also proposed. (iv) Demonstration of the effectiveness of Monte Carlo simulation, neural networks and decision trees for decision-making when integrating new process technology into existing processes. (v) Comparison of Hierarchical Temporal Memory (HTM) and dual optimization with several predictive tools for supporting real-time operations management.
(vi) Implementation of an artificial neural network within an interface for the conventional process graph (P-graph). (vii) Highlighting the future of artificial intelligence and process engineering in biosystems through a commercially based multi-omics paradigm.
21

Pavel, Ana Brandusa. "Multi-omics data integration for the detection and characterization of smoking related lung diseases." Thesis, 2017. https://hdl.handle.net/2144/24073.

Abstract:
Lung cancer is the leading cause of death from cancer in the world. First, we hypothesized that microRNA expression is altered in the bronchial epithelium of patients with lung cancer and that incorporating microRNA expression into an existing mRNA biomarker may improve its performance. Using bronchial brushings collected from current and former smokers, we profiled microRNA expression via small RNA sequencing for 347 patients with available mRNA data. We found that four microRNAs were under-expressed in cancer patients compared to controls (p<0.002, FDR<0.2). We explored the role of these microRNAs and their gene targets in cancer. In addition, we found that adding a microRNA feature to an existing 23-gene biomarker significantly improves its performance (AUC) in a test set (p<0.05). Next, we generalized the biomarker discovery process, and developed a visualization tool for biomarker selection. We built upon an existing biomarker discovery pipeline and created a web-based interface to visualize the performance of multiple predictors. The “visualization” component is the key to sorting through a thousand potential biomarkers, and developing clinically useful molecular predictors. Finally, we explored the molecular events leading to the development of COPD and ILD, two heterogeneous diseases with high mortality. We hypothesized that integrative genetic and expression networks can help identify drivers and elucidate mechanisms of genetic susceptibility. We utilized 262 lung tissue specimens profiled with microRNA sequencing, microarray gene expression and SNP chip genotyping. Next, we built condition specific integrative networks using a causality inference test for predicting SNP-microRNA-mRNA associations, where the microRNA is a predicted mediator of the SNP’s effect on gene expression. We identified the microRNAs predicted to affect the most genes within each network. 
Members of miR-34/449 family, known to promote airway differentiation by repressing the Notch pathway, were among the top ranked microRNAs in COPD and ILD networks, but not in the non-disease network. In addition, the miR-34/449 gene module was enriched among genes that increase in expression over time when airway basal cells are differentiated at an air-liquid interface and among genes that increase in expression with the airway wall thickening in patients with emphysema.
22

Papież, Anna. "Integrative data analysis methods in multi-omics molecular biology studies for disease of affluence biomarker research." Doctoral dissertation, 2019. https://repolis.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=59005.

23

Papież, Anna. "Integrative data analysis methods in multi-omics molecular biology studies for disease of affluence biomarker research." Doctoral dissertation, 2019. https://delibra.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=59005.

24

Abd-Rabbo, Diala. "Beyond hairballs: depicting complexity of a kinase-phosphatase network in the budding yeast." Thesis, 2017. http://hdl.handle.net/1866/19318.
