Dissertations on the topic "Multi-omics Integration"

Browse the top 33 dissertations for research on the topic "Multi-omics Integration".

1

Sathyanarayanan, Anita. "Integration of multi-omics data in cancer." Thesis, Queensland University of Technology, 2021. https://eprints.qut.edu.au/225924/1/Anita_Sathyanarayanan_Thesis.pdf.

Abstract:
Cancer is a complex disease with multiple molecular (omics) factors influencing the risk, development, prognosis, and treatment. Availability of large-scale multiple omics data has provided the opportunity to jointly analyse these data using advanced statistical approaches and identify cancer drivers and regulatory pathways underpinning the disease. In the first study, this thesis provides the much-needed guidance for conducting multi-omics analysis using open-source software tools. Next, it introduces an enrichment pipeline developed using imputation-based integration of multi-omics data, which was applied to breast and prostate cancers to identify the associated biomarkers and genes.
2

Zandonà, Alessandro. "Predictive networks for multi meta-omics data integration." Doctoral thesis, Università degli studi di Trento, 2017. https://hdl.handle.net/11572/367893.

Abstract:
The role of the microbiome in disease onset and in equilibrium is being exposed by a wealth of high-throughput omics methods. All key research directions, e.g., the study of gut microbiome dysbiosis in IBD/IBS, indicate the need for bioinformatics methods that can model the complexity of microbial community ecology and unravel its disease-associated perturbations. A most promising direction is the “meta-omics” approach, which allows profiling based on various biological molecules at the metagenomic scale (e.g., metaproteomics, metametabolomics) as well as different “microbial” omes (eukaryotes and viruses) within a systems biology approach. This thesis introduces a bioinformatic framework for microbiota datasets that combines predictive profiling, differential network analysis and meta-omics integration. In detail, the framework identifies biomarkers discriminating amongst clinical phenotypes, through machine learning techniques (Random Forest or SVM) based on a complete Data Analysis Protocol derived from two initiatives funded by the FDA: the MicroArray Quality Control-II and Sequencing Quality Control projects. The biomarkers are interpreted in terms of biological networks: the framework provides a setup for network inference, quantification of network differences based on the glocal Hamming and Ipsen-Mikhailov (HIM) distance and detection of network communities. The differential analysis of networks allows the study of microbiota structural organization as well as the evolving trajectories of microbial communities associated with the dynamics of the target phenotypes. Moreover, the framework combines a novel similarity network fusion method and machine learning to identify biomarkers from the integration of multiple meta-omics data. The framework implementation requires only standard open source computational biology tools, as a combination of R/Bioconductor and Python functions.
In particular, full scripts for meta-omics integration are available in a GitHub repository to ease reuse (https://github.com/AleZandona/INF). The pipeline has been validated on original data from three different clinical datasets. First, the predictive profiling and the network differential analysis have been applied on a pediatric Inflammatory Bowel Disease (IBD) cohort (in faecal vs biopsy environments) and controls, in collaboration with a multidisciplinary team at the Ospedale Pediatrico Bambino Gesù (Rome, I). Then, the meta-omics integration has been tested on paired bacterial and fungal gut microbiota human IBD datasets from the Gastroenterology Department of the Saint Antoine Hospital (Paris, F), thanks to the collaboration with the “Commensals and Probiotics-Host Interactions” team at INRA (Jouy-en-Josas, F). Finally, the framework has been validated on a bacterial-fungal gut microbiota dataset from children affected by Rett syndrome. The different nature of datasets used for validation naturally supports the extension of the framework on different omics datasets. Besides, clinical practice can take advantage of our framework, given the reproducibility and robustness of results, ensured by the adopted Data Analysis Protocol, as well as the biological relevance of the findings, confirmed by the clinical collaborators. Specifically, the omics-based dysbiosis profiles and the inferred biological networks can support the current diagnostic tools to reveal disease-associated perturbations at a much earlier, prodromal stage of disease and may be used for disease prevention, diagnosis and prognosis.
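The Hamming component of the HIM distance named in the abstract can be illustrated with a minimal sketch. It assumes two undirected, unweighted networks on the same node set, given as adjacency matrices; the function name `hamming_distance` and the toy matrices are hypothetical, and the spectral Ipsen-Mikhailov component is not shown. This is not the code from the thesis repository.

```python
def hamming_distance(a, b):
    """Normalized Hamming distance between two adjacency matrices.

    Counts the node pairs whose link status differs and divides by the
    number of possible pairs, so the result lies in [0, 1].
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("networks must share the same node set (n >= 2)")
    differing = 0
    for i in range(n):
        for j in range(i + 1, n):  # undirected: visit each pair once
            if a[i][j] != b[i][j]:
                differing += 1
    return differing / (n * (n - 1) / 2)


# Toy example (hypothetical data): two 4-node networks.
net_a = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
net_b = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
distance = hamming_distance(net_a, net_b)  # 2 of the 6 node pairs differ
```

A purely local measure like this one only sees individual links, which is presumably why the abstract pairs it with a spectral (global) distance.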
3

Zandonà, Alessandro. "Predictive networks for multi meta-omics data integration." Doctoral thesis, University of Trento, 2017. http://eprints-phd.biblio.unitn.it/2547/1/zandona2017_phdthesis.pdf.

4

Patrizi, Sara. "Multi-omics approaches to complex diseases in children." Doctoral thesis, Università degli Studi di Trieste, 2022. http://hdl.handle.net/11368/3015193.

Abstract:
“-Omic” technologies can detect the entirety of the molecules in the biological sample of interest, in a non-targeted and non-biased fashion. The integration of multiple types of omics data, known as “multi-omics” or “vertical omics”, can provide a better understanding of how the cause of disease leads to its functional consequences, which is particularly valuable in the study of complex diseases, that are caused by the interaction of multiple genetic and regulatory factors with contributions from the environment. In the present work appropriate multi-omics approaches are applied to two complex conditions that usually first manifest in childhood, have rising incidence and gaps in the knowledge of their molecular pathology, specifically Congenital Lung Malformations and Coeliac Disease. The aims are, respectively, to verify if cancer-associated genomic variants or DNA methylation features exist in the malformed lung tissue and to find common alterations in the methylome and the transcriptome of small intestine epithelial cells of children with CD. The methods used in the Congenital Lung Malformations project are Whole Genome Methylation microarrays and Whole Genome Sequencing, and for the Coeliac Disease the whole genome methylation microarrays and mRNA sequencing. Differentially methylated regions in possibly cancer-related genes were found in each one of the 20 lung malformation samples included. Moreover, 5 malformed samples had at least one somatic missense single nucleotide variant in genes known as lung cancer drivers, and 5 malformed samples had a total of 2 deletions of lung cancer driver tumour suppressor and 10 amplifications of lung cancer driver oncogenes. The data showed that congenital lung malformations can have premalignant genetic and epigenetic features, that are impossible to predict with clinical information only. 
In the second project, Principal Component Analysis of the whole genome methylation data showed that CD patients divide into two clusters, one of which overlaps with controls. 174 genes were differentially methylated compared to the controls in both clusters. Principal Component Analysis of gene expression data (mRNA-Seq) showed a distribution that is similar to the methylation data, and 442 genes were differentially expressed in both clusters. Six genes, mainly related to interferon response and antigen processing and presentation, were differentially expressed and methylated in both clusters. These results show that the intestinal epithelial cells of individuals with CD are highly variable from a molecular point of view, but they share some fundamental differences that make them able to respond to interferons, process, and present antigens more efficiently than controls. Despite the limitations of the present studies, they have shown that targeted multi-omics approaches can be set up to answer the relevant disease-specific questions by investigating many cellular functions at once, often generating new hypotheses and making unexpected discoveries in the process.
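The gene-overlap step described in the abstract amounts to intersecting gene sets. The sketch below assumes each analysis yields one set of gene identifiers per patient cluster; the helper `shared_hits` and all identifiers are hypothetical placeholders, not data or code from the thesis.

```python
def shared_hits(methylated_by_cluster, expressed_by_cluster):
    """Genes differentially methylated AND expressed in every cluster."""
    methylated_everywhere = set.intersection(*methylated_by_cluster.values())
    expressed_everywhere = set.intersection(*expressed_by_cluster.values())
    return methylated_everywhere & expressed_everywhere


# Hypothetical placeholder identifiers, not results from the thesis.
methylated = {
    "cluster_1": {"GENE_A", "GENE_B", "GENE_C"},
    "cluster_2": {"GENE_A", "GENE_C", "GENE_D"},
}
expressed = {
    "cluster_1": {"GENE_A", "GENE_C", "GENE_E"},
    "cluster_2": {"GENE_C", "GENE_E", "GENE_F"},
}
overlap = shared_hits(methylated, expressed)
```

Intersecting within each data type first mirrors the order in the abstract: genes altered in both clusters per omics layer, then the genes common to both layers.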
5

Serra, Angela. "Multi-view learning and data integration for omics data." Doctoral thesis, Università degli Studi di Salerno, 2017. http://hdl.handle.net/10556/2580.

Abstract:
2015 - 2016
In recent years, the advancement of high-throughput technologies, combined with the constant decrease of the data-storage costs, has led to the production of large amounts of data from different experiments that characterise the same entities of interest. This information may relate to specific aspects of a phenotypic entity (e.g., gene expression), or can include the comprehensive and parallel measurement of multiple molecular events (e.g., DNA modifications, RNA transcription and protein translation) in the same samples. Exploiting such complex and rich data is needed in the frame of systems biology for building global models able to explain complex phenotypes. For example, the use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug development. Moreover, the integration of gene expression data regarding cell treatment by drugs, and information regarding chemical structure of the drugs allowed scientists to perform more accurate drug repositioning tasks. Unfortunately, there is a big gap between the amount of information and the knowledge in which it is translated. Moreover, there is a huge need of computational methods able to integrate and analyse data to fill this gap. Current research in this area follows two different integrative methods: one uses the complementary information of different measurements for the study of complex phenotypes on the same samples (multi-view learning); the other tends to infer knowledge about the phenotype of interest by integrating and comparing the experiments relating to it with respect to those of different phenotypes already known through comparative methods (meta-analysis).
Meta-analysis can be thought of as an integrative study of previous results, usually performed aggregating the summary statistics from different studies. Due to its nature, meta-analysis usually involves homogeneous data. On the other hand, multi-view learning is a more flexible approach that considers the fusion of different data sources to get more stable and reliable estimates. Based on the type of data and the stage of integration, new methodologies have been developed spanning a landscape of techniques comprising graph theory, machine learning and statistics. Depending on the nature of the data and on the statistical problem to address, the integration of heterogeneous data can be performed at different levels: early, intermediate and late. Early integration consists in concatenating data from different views in a single feature space. Intermediate integration consists in transforming all the data sources in a common feature space before combining them. In the late integration methodologies, each view is analysed separately and the results are then combined. The purpose of this thesis is twofold: the former objective is the definition of a data integration methodology for patient sub-typing (MVDA) and the latter is the development of a tool for phenotypic characterisation of nanomaterials (INSIdEnano). In this PhD thesis, I present the methodologies and the results of my research. MVDA is a multi-view methodology that aims to discover new statistically relevant patient sub-classes. Identifying patient subtypes of a specific disease is a challenging task especially in the early diagnosis. This is a crucial point for the treatment, because not all the patients affected by the same disease will have the same prognosis or need the same drug treatment. This problem is usually solved by using transcriptomic data to identify groups of patients that share the same gene patterns.
The main idea underlying this research work is to combine more omics data for the same patients to obtain a better characterisation of their disease profile. The proposed methodology is a late integration approach based on clustering. It works by evaluating the patient clusters in each single view and then combining the clustering results of all the views by factorising the membership matrices in a late integration manner. The effectiveness and the performance of our method were evaluated on six multi-view cancer datasets related to breast cancer, glioblastoma, prostate and ovarian cancer. The omics data used for the experiment are gene and miRNA expression, RNASeq and miRNASeq, Protein Expression and Copy Number Variation. In all the cases, patient sub-classes with statistical significance were found, identifying novel sub-groups previously not emphasised in literature. The experiments were also conducted by using prior information, as a new view in the integration process, to obtain higher accuracy in patients’ classification. The method outperformed the single view clustering on all the datasets; moreover, it performs better when compared with other multi-view clustering algorithms and, unlike other existing methods, it can quantify the contribution of single views in the results. The method has also been shown to be stable when perturbation is applied to the datasets by removing one patient at a time and evaluating the normalized mutual information between all the resulting clusterings. These observations suggest that integration of prior information with genomic features in sub-typing analysis is an effective strategy in identifying disease subgroups. INSIdE nano (Integrated Network of Systems bIology Effects of nanomaterials) is a novel tool for the systematic contextualisation of the effects of engineered nanomaterials (ENMs) in the biomedical context.
In recent years, omics technologies have been increasingly used to thoroughly characterise the ENMs molecular mode of action. It is possible to contextualise the molecular effects of different types of perturbations by comparing their patterns of alterations. While this approach has been successfully used for drug repositioning, a comprehensive contextualisation of the ENM mode of action is still missing to date. The idea behind the tool is to use analytical strategies to contextualise or position the ENM with respect to relevant phenotypes that have been studied in literature (such as diseases, drug treatments, and other chemical exposures) by comparing their patterns of molecular alteration. This could greatly increase the knowledge on the ENM molecular effects and in turn contribute to the definition of relevant pathways of toxicity as well as help in predicting the potential involvement of ENM in pathogenetic events or in novel therapeutic strategies. The main hypothesis is that suggestive patterns of similarity between sets of phenotypes could be an indication of a biological association to be further tested in toxicological or therapeutic frames. Based on the expression signature, associated to each phenotype, the strength of similarity between each pair of perturbations has been evaluated and used to build a large network of phenotypes. To ensure the usability of INSIdE nano, a robust and scalable computational infrastructure has been developed, to scan this large phenotypic network and a web-based effective graphic user interface has been built. Particularly, INSIdE nano was scanned to search for clique sub-networks, quadruplet structures of heterogeneous nodes (a disease, a drug, a chemical and a nanomaterial) completely interconnected by strong patterns of similarity (or anti-similarity).
The predictions have been evaluated for a set of known associations between diseases and drugs, based on drug indications in clinical practice, and between diseases and chemicals, based on literature-based causal exposure evidence, and focused on the possible involvement of nanomaterials in the most robust cliques. The evaluation of INSIdE nano confirmed that it highlights known disease-drug and disease-chemical connections. Moreover, disease similarities agree with the information based on their clinical features, as well as drugs and chemicals, mirroring their resemblance based on the chemical structure. Altogether, the results suggest that INSIdE nano can also be successfully used to contextualise the molecular effects of ENMs and infer their connections to other better studied phenotypes, speeding up their safety assessment as well as opening new perspectives concerning their usefulness in biomedicine. [edited by author]
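The clique search described in the abstract can be sketched in a few lines. The sketch assumes the phenotype network has already been reduced to a list of undirected links that passed some similarity (or anti-similarity) threshold; the function `heterogeneous_cliques`, the node names, and the edge list are hypothetical and do not come from the INSIdE nano implementation.

```python
from itertools import combinations, product


def heterogeneous_cliques(nodes_by_type, edges):
    """Yield quadruplets (one node per type) that are fully interconnected."""
    linked = {frozenset(edge) for edge in edges}  # undirected links
    types = ("nanomaterial", "drug", "disease", "chemical")
    # Try every combination of one node per type and keep it only if
    # all six pairs within the quadruplet are linked.
    for quad in product(*(nodes_by_type[t] for t in types)):
        if all(frozenset(pair) in linked for pair in combinations(quad, 2)):
            yield quad


# Hypothetical toy network: links are assumed to be pre-thresholded.
nodes = {
    "nanomaterial": ["nano_1", "nano_2"],
    "drug": ["drug_1"],
    "disease": ["disease_1"],
    "chemical": ["chem_1"],
}
edges = [
    ("nano_1", "drug_1"), ("nano_1", "disease_1"), ("nano_1", "chem_1"),
    ("drug_1", "disease_1"), ("drug_1", "chem_1"), ("disease_1", "chem_1"),
    ("nano_2", "drug_1"), ("nano_2", "disease_1"),  # nano_2 lacks chem_1
]
cliques = list(heterogeneous_cliques(nodes, edges))
```

Exhaustive enumeration is only practical for small node sets; a network of the size the abstract describes would presumably need the scalable infrastructure it mentions.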
XV n.s.
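The clique search described in the abstract above can be illustrated with a minimal Python sketch. It assumes an undirected network given as an edge list and a mapping from each phenotype type to its nodes; the function name `heterogeneous_cliques` and every node name are hypothetical and are not taken from INSIdE nano itself.

```python
from itertools import combinations, product


def heterogeneous_cliques(nodes_by_type, edges):
    """Return every tuple with one node per type in which all pairs are linked."""
    linked = {frozenset(edge) for edge in edges}
    groups = list(nodes_by_type.values())
    return [
        combo
        for combo in product(*groups)
        if all(frozenset(pair) in linked for pair in combinations(combo, 2))
    ]


# Toy network: all names are made up for illustration.
nodes_by_type = {
    "nanomaterial": ["nano_A", "nano_B"],
    "drug": ["drug_X"],
    "disease": ["disease_Y"],
    "chemical": ["chem_Z"],
}
edges = [
    ("nano_A", "drug_X"), ("nano_A", "disease_Y"), ("nano_A", "chem_Z"),
    ("drug_X", "disease_Y"), ("drug_X", "chem_Z"), ("disease_Y", "chem_Z"),
    ("nano_B", "drug_X"),  # nano_B has no link to the disease or the chemical
]
cliques = heterogeneous_cliques(nodes_by_type, edges)
print(cliques)
```

Brute-force enumeration over one node per type is only practical for small networks; a large phenotype network would need a dedicated clique-finding routine.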
6

Lu, Yingzhou. "Multi-omics Data Integration for Identifying Disease Specific Biological Pathways." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/83467.

Abstract:
Pathway analysis is an important task for gaining novel insights into the molecular architecture of many complex diseases. With the advancement of new sequencing technologies, large amounts of quantitative gene expression data have been continuously acquired. Emerging omics data sets such as proteomics have facilitated the investigation of disease-relevant pathways. Although much work has been done to explore single-omics data, little work has been reported using multi-omics data integration, mainly due to methodological and technological limitations. While a single omics data set can provide useful information about the underlying biological processes, multi-omics data integration gives a much more comprehensive picture of the cause-effect processes responsible for diseases and their subtypes. This project investigates the combination of miRNAseq, proteomics, and RNAseq data on seven types of muscular dystrophies and a control group. These unique multi-omics data sets provide us with the opportunity to identify the most relevant disease-specific biological pathways. We first perform the t-test and the OVEPUG test separately to define the differentially expressed genes in the protein and mRNA data sets. In multi-omics data sets, miRNAs also play a significant role in muscle development by regulating their target genes in the mRNA data set. To exploit the relationship between miRNA and gene expression, we consult the commonly used gene library TargetScan to collect all miRNA-mRNA and miRNA-protein co-expression pairs. Next, using statistical analyses such as Pearson's correlation coefficient or the t-test, we measure the biologically expected correlation of each gene with its upstream miRNAs and identify those showing negative correlation among the aforementioned miRNA-mRNA and miRNA-protein pairs. 
Furthermore, we identify and assess the most relevant disease-specific pathways by inputting the differentially expressed genes and the negatively correlated genes into the gene-set libraries respectively, and further characterize these prioritized marker subsets using IPA (Ingenuity Pathway Analysis) or KEGG. We will then use Fisher's method to combine the p-values derived from the separate gene sets into a joint significance test assessing common pathway relevance. In conclusion, we will find all negatively correlated miRNA-mRNA and miRNA-protein pairs and identify several pathophysiological pathways related to muscular dystrophies by gene set enrichment analysis. This novel multi-omics data integration study and subsequent pathway identification will shed new light on pathophysiological processes in muscular dystrophies and improve our understanding of the molecular pathophysiology of muscle disorders, supporting disease prevention and treatment and, in the long term, better health.
Master of Science
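The abstract above mentions combining p-values from separate gene sets with Fisher's method. The following is a minimal standard-library sketch of that statistic, not the thesis code; the function name and the example p-values are hypothetical, and the method assumes the combined p-values are independent, which may not hold for mRNA-derived and protein-derived gene sets.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees
    of freedom under the null hypothesis. For even degrees of freedom the
    upper tail has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not pvalues or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail


# Hypothetical enrichment p-values for one pathway from two gene sets.
pathway_pvalues = [0.05, 0.05]
print(fisher_combined_pvalue(pathway_pvalues))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form.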
7

Zampieri, Guido. "Prioritisation of candidate disease genes via multi-omics data integration." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3421826.

Abstract:
The uncovering of genes linked to human diseases is a pressing challenge in molecular biology, towards the full achievement of precision medicine. Next-generation technologies provide an unprecedented amount of biological information, but at the same time they unveil enormous numbers of candidate disease genes and pose novel challenges at multiple analytical levels. Multi-omics data integration is currently the principal strategy to prioritise candidate disease genes. In particular, kernel-based methods are a powerful resource for the integration of biological knowledge, but their use is often precluded by their limited scalability. In this thesis, we propose a novel scalable kernel-based method for gene prioritisation which implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimisation of the margin distribution in binary problems. Our method is optimised to cope with strongly unbalanced settings where known disease genes are few and large scale predictions are required. Importantly, it is able to efficiently deal both with a large amount of candidate genes and with an arbitrary number of data sources. Through the simulation of real case studies, we show that our method outperforms a wide range of state-of-the-art methods and has enhanced scalability compared to existing kernel-based approaches for genomic data. We apply the proposed method to investigate the potential role for disease gene prediction of metabolic rearrangements caused by genetic perturbations. To this end, we use constraint-based modelling of metabolism to generate gene-specific information at a genome scale, which is mined via machine learning. Moreover, we compare constraint-based modelling and our kernel-based method as alternative integration strategies for omics data such as transcriptional profiles. 
Experimental assessments across various cancers demonstrate that information on metabolic rewiring reconstructed in silico can be valuable to prioritise associated genes, although accuracy strongly depends on the cancer type. Despite these fluctuations, predictions achieved starting from metabolic modelling are largely complementary to those from gene expression or pathway annotations, highlighting the potential of this approach to identify novel genes involved in cancer.
La scoperta dei geni legati alle malattie nell'uomo è una sfida pressante in biologia molecolare, in vista del pieno raggiungimento della medicina di precisione. Le tecnologie di nuova generazione forniscono una quantità di informazioni biologiche senza precedenti, ma allo stesso tempo rivelano numeri enormi di geni malattia candidati e pongono nuove sfide a molteplici livelli di analisi. L'integrazione di dati multi-omici è attualmente la strategia principale per prioritizzare geni malattia candidati. In particolare, i metodi basati su kernel sono una potente risorsa per l'integrazione della conoscenza biologica, tuttavia il loro utilizzo è spesso precluso dalla loro limitata scalabilità. In questa tesi, proponiamo un nuovo metodo kernel scalabile per la prioritizzazione di geni, che applica un nuovo approccio di multiple kernel learning basato su una prospettiva semi-supervisionata e sull'ottimizzazione della distribuzione dei margini in problemi binari. Il nostro metodo è ottimizzato per fare fronte a condizioni fortemente sbilanciate in cui si disponga di pochi geni malattia noti e siano richieste predizioni su larga scala. Significativamente, è capace di gestire sia un gran numero di candidati sia un numero arbitrario di sorgenti di informazione. Attraverso la simulazione di casi studio reali, mostriamo che il nostro metodo supera in prestazioni un'ampia gamma di metodi allo stato dell'arte ed è dotato di migliore scalabilità rispetto a metodi kernel esistenti per dati genomici. Applichiamo il metodo proposto per studiare il potenziale ruolo per la predizione di geni malattia dei riarrangiamenti metabolici causati da perturbazioni genetiche. A questo scopo, utilizziamo modelli del metabolismo basati su vincoli per generare informazione sui geni a scala genomica, che viene analizzata tramite apprendimento automatico. 
Inoltre, compariamo modelli basati su vincoli ed il nostro metodo basato su kernel come strategie di integrazione alternative per dati omici come profili trascrizionali. Valutazioni sperimentali su vari cancri dimostrano come i riarrangiamenti metabolici ricostruiti in silico possano essere utili per prioritizzare i geni associati, nonostante l'accuratezza dipenda fortemente dalla tipologia di cancro. Malgrado queste fluttuazioni, le predizioni basate su modelli metabolici sono largamente complementari a quelle basate su espressione genica o annotazioni di pathway, evidenziando il potenziale di questo approccio per identificare nuovi geni implicati nel cancro.
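Kernel-based integration, as discussed in the abstract above, represents each data source as a gene-by-gene similarity (kernel) matrix and merges the matrices. The sketch below shows only the merging step with fixed, hand-chosen weights; it is an illustration under that assumption and is not the multiple kernel learning method proposed in the thesis, which learns the weights. All function names, feature values and weights are hypothetical.

```python
def linear_kernel(features):
    """Gram matrix of dot products between per-gene feature vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in features] for u in features]


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized kernel matrices with non-negative weights."""
    if len(kernels) != len(weights) or any(w < 0 for w in weights):
        raise ValueError("need one non-negative weight per kernel")
    n = len(kernels[0])
    return [
        [sum(w * kernel[i][j] for kernel, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


# Two toy data sources describing the same three genes.
expression = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pathway = [[1.0], [1.0], [0.0]]
combined = combine_kernels(
    [linear_kernel(expression), linear_kernel(pathway)],
    weights=[0.5, 0.5],
)
print(combined)
```

A non-negative weighted sum of valid kernels is itself a valid kernel, which is why this form is a common starting point for multi-source integration.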
8

Xiao, Hui. "Network-based approaches for multi-omic data integration." Thesis, University of Cambridge, 2019. https://www.repository.cam.ac.uk/handle/1810/289716.

Abstract:
The advent of advanced high-throughput biological technologies provides opportunities to measure the whole genome at different molecular levels in biological systems, which produces different types of omic data such as genome, epigenome, transcriptome, translatome, proteome, metabolome and interactome. Biological systems are highly dynamic and complex mechanisms which involve not only within-level functionality but also between-level regulation. In order to uncover the complexity of biological systems, it is desirable to integrate multi-omic data to transform the multiple-level data into biological knowledge about the underlying mechanisms. Due to the heterogeneity and high dimensionality of multi-omic data, it is necessary to develop effective and efficient methods for multi-omic data integration. This thesis aims to develop efficient approaches for multi-omic data integration using machine learning methods and network theory. We assume that a biological system can be represented by a network with nodes denoting molecules and edges indicating functional links between molecules, in which multi-omic data can be integrated as attributes of nodes and edges. We propose four network-based approaches for multi-omic data integration using machine learning methods. Firstly, we propose an approach for gene module detection by integrating multi-condition transcriptome data and interactome data using a network overlapping module detection method. We apply the approach to study the transcriptome data of human pre-implantation embryos across multiple development stages, and identify several stage-specific dynamic functional modules and genes which provide interesting biological insights. We evaluate the reproducibility of the modules by comparing with other widely used methods and show that the intra-module genes overlap significantly between the different methods. 
Secondly, we propose an approach for gene module detection by integrating transcriptome, translatome, and interactome data using a multilayer network. We apply the approach to study the ribosome profiling data of mTOR-perturbed human prostate cancer cells and mine several translation-efficiency-regulated modules associated with mTOR perturbation. We develop an R package, TERM, implementing the proposed approach, which offers a useful tool for the research field. Next, we propose an approach for feature selection by integrating transcriptome and interactome data using network-constrained regression. We develop a more efficient network-constrained regression method, eGBL. We evaluate its performance in terms of variable selection and prediction, and show that eGBL outperforms the other related regression methods. Applying it to the transcriptome data of human blastocysts, we select several genes of interest associated with time-lapse parameters. Finally, we propose an approach for classification by integrating epigenome and transcriptome data using neural networks. We introduce a superlayer neural network (SNN) model which learns DNA methylation and gene expression data in parallel in superlayers, with cross-connections allowing crosstalk between them. We evaluate its performance on human breast cancer classification. The SNN outperforms several other common machine learning methods. The approaches proposed in this thesis offer effective and efficient solutions for the integration of heterogeneous high-dimensional datasets, and can be easily applied to other datasets with similar structures. They are therefore applicable to many fields including but not limited to Bioinformatics and Computer Science.
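Network-constrained regression, mentioned in the abstract above, typically adds a penalty that discourages large coefficient differences between genes linked in the interactome. The sketch below computes one common form of that penalty, the graph Laplacian quadratic form; it is a generic illustration and makes no claim about how eGBL defines its penalty. The function name, gene names and coefficient values are hypothetical.

```python
def network_penalty(coefficients, edges):
    """Sum of squared coefficient differences over linked gene pairs.

    With each undirected edge listed once, this equals beta^T L beta for
    the unnormalised graph Laplacian L of the network.
    """
    return sum((coefficients[a] - coefficients[b]) ** 2 for a, b in edges)


# Toy interactome: a chain of three genes.
edges = [("gene_1", "gene_2"), ("gene_2", "gene_3")]
smooth = {"gene_1": 0.5, "gene_2": 0.5, "gene_3": 0.5}
rough = {"gene_1": 1.0, "gene_2": -1.0, "gene_3": 1.0}

smooth_penalty = network_penalty(smooth, edges)
rough_penalty = network_penalty(rough, edges)
print(smooth_penalty, rough_penalty)
```

Coefficients that vary smoothly over the network incur no penalty, while coefficients that flip sign between neighbours are penalised heavily, which is what steers feature selection toward connected gene groups.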
9

Jagtap, Surabhi. "Multilayer Graph Embeddings for Omics Data Integration in Bioinformatics." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPAST014.

Abstract:
Les systèmes biologiques sont composés de biomolécules en interaction à différents niveaux moléculaires. D’un côté, les avancées technologiques ont facilité l’obtention des données omiques à ces divers niveaux. De l’autre, de nombreuses questions se posent, pour donner du sens et élucider les interactions importantes dans le flux d’informations complexes porté par cette énorme variété et quantité des données multi-omiques. Les réponses les plus satisfaisantes seront celles qui permettront de dévoiler les mécanismes sous-jacents à la condition biologique d’intérêt. On s’attend souvent à ce que l’intégration de différents types de données omiques permette de mettre en lumière les changements causaux potentiels qui conduisent à un phénotype spécifique ou à des traitements ciblés. Avec les avancées récentes de la science des réseaux, nous avons choisi de traiter ce problème d’intégration en représentant les données omiques à travers les graphes. Dans cette thèse, nous avons développé trois modèles à savoir BraneExp, BraneNet et BraneMF pour l’apprentissage d’intégrations de noeuds à partir de réseaux biologiques multicouches générés à partir de données omiques. Notre objectif est de résoudre divers problèmes complexes liés à l’intégration de données multiomiques, en développant des méthodes expressives et évolutives capables de tirer parti de la riche sémantique structurelle latente des réseaux du monde réel
Biological systems are composed of interacting bio-molecules at different molecular levels. With the advent of high-throughput technologies, omics data at their respective molecular level can be easily obtained. These huge, complex multi-omics data can be useful to provide insights into the flow of information at multiple levels, unraveling the mechanisms underlying the biological condition of interest. Integration of different omics data types is often expected to elucidate potential causative changes that lead to specific phenotypes, or targeted treatments. With the recent advances in network science, we choose to handle this integration issue by representing omics data through networks. In this thesis, we have developed three models, namely BraneExp, BraneNet, and BraneMF, for learning node embeddings from multilayer biological networks generated with omics data. We aim to tackle various challenging problems arising in multi-omics data integration, developing expressive and scalable methods capable of leveraging rich structural semantics of real-world networks.
10

DI, NANNI NOEMI. "A network diffusion method for the integration of multi-omics data with applications in precision medicine." Doctoral thesis, Università degli studi di Pavia, 2020. http://hdl.handle.net/11571/1315930.

11

Schulte-Sasse, Roman [Verfasser]. "Integration of multi-omics data with graph convolutional networks to identify cancer-associated genes / Roman Schulte-Sasse." Berlin : Freie Universität Berlin, 2021. http://nbn-resolving.de/urn:nbn:de:kobv:188-refubium-31311-1.

12

Benkirane, Hakim. "Deep learning methods for the integration of multi-omics and histopathology data for precision medicine in oncology." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASR022.

Abstract:
La médecine de précision est une approche émergente pour le traitement et la prévention des maladies qui prend en compte la variabilité individuelle dans les gènes, l'environnement et le mode de vie. L'objectif est de prédire plus précisément quelles stratégies de traitement et de prévention pour une maladie particulière fonctionneront dans quels groupes de personnes. En oncologie, la médecine de précision s'accompagne d'une augmentation drastique des données collectées pour chaque individu, caractérisée par une grande diversité de sources de données. Par exemple, les patients recevant un traitement contre le cancer sont souvent soumis à un profilage moléculaire complet, en plus du profilage clinique et des images de pathologie anatomique. Par conséquent, l'intégration de données multimodales (images, cliniques, moléculaires) est une question critique pour permettre la définition de modèles prédictifs individuels. Cette thèse aborde le développement de modèles computationnels et de stratégies d'apprentissage capables de déchiffrer des interactions complexes et de haute dimension. Un accent significatif est également mis sur l'explicabilité de ces modèles pilotés par l'IA, assurant que les prédictions soient compréhensibles et cliniquement exploitables
Precision medicine is an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle. The objective is to predict more accurately which treatment and prevention strategies for a particular disease will work in which groups of people. In oncology, precision medicine comes with a drastic increase in the data collected for each individual, characterized by a large diversity of data sources. Advanced cancer patients receiving cancer treatment, for instance, are often subject to complete molecular profiling, on top of clinical profiling and pathology images. As a consequence, the integration of multi-modal data (image, clinical, molecular) is a critical issue for defining individual predictive models. This thesis tackles the development of computational models and learning strategies adept at deciphering complex, high-dimensional interactions. A significant focus is also placed on the explainability of these AI-driven models, ensuring that predictions are understandable and clinically actionable.
13

Bussola, Nicole. "AI for Omics and Imaging Models in Precision Medicine and Toxicology." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/348706.

Abstract:
This thesis develops an Artificial Intelligence (AI) approach intended for accurate patient stratification and precise diagnostics/prognostics in clinical and preclinical applications. The rapid advance in high-throughput technologies and bioinformatics tools is still far from linking precisely the genome-phenotype interactions with the biological mechanisms that underlie pathophysiological conditions. In practice, the incomplete knowledge of individual heterogeneity in complex diseases keeps forcing clinicians to settle for surrogate endpoints and therapies based on a generic one-size-fits-all approach. The working hypothesis is that AI can add new tools to elaborate the rich information now available from high-throughput omics and bioimaging data and integrate it into new features or structures, and that such restructured information can be applied through predictive models for the precision medicine paradigm, thus favoring the creation of safer tailored treatments for specific patient subgroups. The computational techniques in this thesis are based on the combination of dimensionality reduction methods with Deep Learning (DL) architectures to learn meaningful transformations between the input and the predictive endpoint space. The rationale is that such transformations can introduce intermediate spaces offering more succinct representations, where data from different sources are summarized. The research goal was addressed at increasing levels of complexity, starting from single input modalities (omics and bioimaging of different types and scales), to their multimodal integration. The approach also deals with the key challenges for machine learning (ML) on biomedical data, i.e. reproducibility, stability, and interpretability of the models. Along this path, the thesis contribution is thus the development of a set of specialized AI models and a core framework of three tools of general applicability: i. 
A Data Analysis Plan (DAP) for model selection and evaluation of classifiers on omics and imaging data to avoid selection bias. ii. The histolab Python package that standardizes the reproducible pre-processing of Whole Slide Images (WSIs), supported by automated testing and easily integrable in DL pipelines for Digital Pathology. iii. Unsupervised and dimensionality reduction techniques based on the UMAP and TDA frameworks for patient subtyping. The framework has been successfully applied on public as well as original data in precision oncology and predictive toxicology. In the clinical setting, this thesis has developed: 1. (DAPPER) A deep learning framework for evaluation of predictive models in Digital Pathology that controls for selection bias through properly designed data partitioning schemes. 2. (RADLER) A unified deep learning framework that combines radiomics features and imaging on PET-CT images for prognostic biomarker development in head and neck squamous cell carcinoma. The mixed deep learning/radiomics approach is more accurate than using only one feature type. 3. An ML framework for automated quantification of tumor-infiltrating lymphocytes (TILs) in onco-immunology, validated on original pathology Neuroblastoma data of the Bambino Gesù Children’s Hospital, with high agreement with trained pathologists. 4. The network-based INF pipeline, which applies machine learning models over the combination of multiple omics layers and also provides compact biomarker signatures. INF was validated on three TCGA oncogenomic datasets. In the preclinical setting the framework has been applied for: 1. Deep and machine learning algorithms to predict DILI status from gene expression (GE) data derived from cancer cell lines on the CMap Drug Safety dataset. 2. (ML4TOX) Deep Learning and Support Vector Machine models to predict potential endocrine disruption of environmental chemicals on the CERAPP dataset. 3. 
(PathologAI) A deep learning pipeline combining generative and convolutional models for preclinical digital pathology. Developed as an internal project within the FDA/NCTR AIRForce initiative and applied to predict necrosis on images from the TG-GATEs project, PathologAI aims to improve accuracy and reduce labor in the identification of lesions in predictive toxicology. Furthermore, GE microarray data were integrated with histology features in a unified multi-modal scheme combining imaging and omics data. The solutions were developed in collaboration with domain experts and considered promising for application.
14

Samaras, Patroklos E. [Verfasser], Bernhard [Akademischer Betreuer] Küster, Bernhard [Gutachter] Küster, Martin [Gutachter] Eisenacher, and Julien [Gutachter] Gagneur. "Multi-omics data integration and data model optimization in ProteomicsDB / Patroklos E. Samaras ; Gutachter: Bernhard Küster, Martin Eisenacher, Julien Gagneur ; Betreuer: Bernhard Küster." München : Universitätsbibliothek der TU München, 2020. http://d-nb.info/1223616886/34.

15

Balázs, Kinga [Verfasser], Nina Henriette [Akademischer Betreuer] Uhlenhaut, Nina Henriette [Gutachter] Uhlenhaut, and Hans-Werner [Gutachter] Mewes. "Multi-omics data integration approaches to study glucocorticoid receptor function / Kinga Balázs ; Gutachter: Nina Henriette Uhlenhaut, Hans-Werner Mewes ; Betreuer: Nina Henriette Uhlenhaut." München : Universitätsbibliothek der TU München, 2021. http://d-nb.info/1238781640/34.

16

Xia, Yao. "Artificial intelligence-assisted prediction, feature selection, and multi-omics integration in exploring the interaction between IgG N-glycome and transcriptome and constructing the ageing clock." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2023. https://ro.ecu.edu.au/theses/2646.

17

Gantzer, Justine. "Integrative multi-omics characterization of mesenchymal tumors." Electronic Thesis or Diss., Strasbourg, 2024. http://www.theses.fr/2024STRAJ056.

Abstract:
Ce travail de thèse s’articule autour de trois projets indépendants dont le but est de mieux caractériser trois tumeurs mésenchymateuses grâce à une approche multi-omique intégrative. Les tumeurs thoraciques indifférenciées SMARCA4-déficientes (SMARCA4-UT), initialement « sarcomes » semblaient répondre aux inhibiteurs de point de contrôle immunitaire (ICIs) comme d’autres tumeurs SWI/SNF déficientes, sans qu’aucune caractérisation de leur microenvironnement tumoral (TME) ne soit faite pour le comprendre. Grâce à des immunomarquages et à une analyse transcriptomique, nous avons mis en évidence un TME désertique avec une efficacité limitée des ICIs, en lien avec la cellule d’origine. Les tumeurs des cellules épithélioïdes périvasculaires (PEComes) forment un ensemble hétérogène de tumeurs coexprimant des marqueurs mélanocytaires et musculaires lisses pour lesquelles deux types génomiques se distinguent. Grâce à notre analyse, nous avons démontré qu’il existait d’autres réarrangements que ceux impliquant TFE3 et qu’il existait une classification transcriptomique pronostique de quatre sous types de PEComes, chacun étant enrichi d’un profil génomique et présentant des vulnérabilités différentes sur le plan thérapeutique. Les tumeurs desmoïdes (TDs) sont des tumeurs bénignes localement agressives dont l’hétérogénéité dans l’évolution tumorale est peu comprise. Grâce à notre analyse, nous avons mis en évidence que plus de 50% des TDs avaient des mutations dans un des gènes remodelant la chromatine et que parmi les deux sous-types transcriptomiques identifiés, le type immuno-myogénique doté d’un programme transcriptomique proche de celui des muscles, activait les voies de l’immunité évoquant un potentiel thérapeutique des ICIs.
This thesis work takes the form of three independent projects aimed at better characterizing three mesenchymal tumors through an integrative multi-omics approach. The thoracic undifferentiated SMARCA4-deficient tumors (SMARCA4-UT), initially classified as "sarcomas," appeared to respond to immune checkpoint inhibitors (ICIs) similarly to other SWI/SNF-deficient tumors, despite no characterization of their tumor microenvironment (TME) being done to understand this response. Through immunostaining and transcriptomic analysis, we highlighted a desert-like TME with limited ICI efficacy, linked to the tumor’s cell of origin. Perivascular epithelioid cell tumors (PEComas) form a heterogeneous group of tumors co-expressing melanocytic and smooth muscle markers, with two distinct molecular types identified. Our analysis demonstrated that there are additional rearrangements beyond those involving TFE3 and provided a prognostic transcriptomic classification of four PEComa subtypes, each enriched with a unique genomic profile and presenting different therapeutic vulnerabilities. Desmoid tumors (TDs) are benign, locally aggressive tumors with poorly understood heterogeneity in tumor evolution. Our analyses revealed that more than 50% of TDs had mutations in chromatin remodeling genes and that among the two identified transcriptomic subtypes, the immuno-myogenic subtype, with a transcriptomic program similar to muscles, activated immune pathways suggesting a potential therapeutic benefit from ICIs.
18

Ali, Baber. "Prédiction et compréhension des interactions génotypes x environnements par des approches d'intégration multi-omique chez le maïs." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASB060.

Abstract:
Les programmes de sélection du maïs s'appuient sur des réseaux d'essais pour évaluer la performance phénotypique (P) des hybrides dans diverses conditions de culture. Dans ces essais, les interactions entre le génotype et l'environnement (IGE) ont un effet substantiel sur la variabilité phénotypique qui peut parfois dépasser l'effet génétique principal (G). Il est donc important de mieux modéliser ces IGE pour garantir l'amélioration génétique du maïs. Les modèles classiques de prédiction génomique, même ceux modélisant les IGE, ne tiennent pas compte de la complexité du génome et de la manière dont les régions génomiques réagissent aux stimuli environnementaux. En supposant un modèle infinitésimal, ils agissent comme des boîtes noires s'appuyant sur des relations statistiques plutôt que biologiques. Les chercheurs ont suggéré que les annotations fonctionnelles des gènes et les données multi-omiques pourraient permettre de mieux expliquer la relation entre génotype et phénotype. Des études ont montré que la hiérarchisation des marqueurs génomiques sur la base d'information biologique ou fonctionnelle peut contribuer à améliorer les capacités prédictives des modèles. Des résultats similaires ont été rapportés dans des études reposant sur des données multi-omiques, telles que les transcrits et les protéines. Toutefois, la plupart de ces études portent sur un seul essai ou sur des expériences réalisées en un seul lieu. Leur capacité à modéliser les IGE pour des caractères quantitatifs dans un réseau d'essais étendu doit être davantage étudiée. 
Par conséquent, cette thèse vise à (i) évaluer le potentiel des annotations fonctionnelles pour améliorer les prédictions en donnant la priorité à certaines régions génomiques, (ii) étudier le potentiel des données multi-omiques pour prendre en compte les IGE tout en améliorant la prédiction des caractères complexes, et (iii) identifier les gènes qui sont associés aux caractères de productivité et qui répondent aux conditions environnementales pour mieux comprendre la biologie derrière les IGE. Notre étude repose sur un panel de 244 hybrides de maïs évalués pour leur productivité dans des essais menés en Europe et au Chili sous des régimes hydriques contrastés. En outre, les annotations fonctionnelles des gènes ont été obtenues à partir de bases de données publiques. Les mêmes génotypes ont également été évalués pour leurs caractéristiques écophysiologiques, et les profils transcriptomiques et protéomiques ont été mesurés dans des conditions hydriques contrastées sur une plateforme. Dans le chapitre 1, nous avons montré que les marqueurs situés à proximité des gènes de certaines catégories fonctionnelles peuvent améliorer les prédictions de caractères de plein champ ou de plateforme. Dans le chapitre 2, nous avons pu montrer que les données omiques pouvaient accroître la capacité de prédiction par rapport à la sélection génomique, en particulier pour les caractères phénotypés dans les mêmes conditions que celles dans lesquelles les données omiques ont été acquises. Nous avons également intégré des covariables environnementales et les informations multi-omiques dans un même modèle, ce qui, à notre connaissance, n'a encore jamais été testé dans la littérature. Dans le chapitre 3, l'étude d'association transcriptomique (TWAS) a montré que les données omiques mesurées dans des conditions de plateforme contrôlées peuvent aider à disséquer l'architecture génétique du rendement mesuré dans des conditions de plein champ. 
Nous avons également constaté que certains des transcrits significativement associés ont déjà été identifiés dans la littérature comme étant associés à la réponse au stress. En outre, nous avons observé que la TWAS est complémentaire de la génétique d'association car elle peut améliorer la résolution et la puissance de détection.Pour conclure, cette thèse indique que les annotations fonctionnelles et les données multi-omiques sont utiles pour comprendre et prédire les IGE
Maize breeding programs rely heavily on multi-environment trials (MET) to evaluate the phenotypic (P) performance of hybrids under diverse field conditions. Within these trials, genotype-by-environment (GxE) interactions have a substantial effect on phenotypic variability and can sometimes exceed the main genetic effect (G). Predicting and understanding GxE interactions is therefore of utmost importance to ensure the genetic improvement of maize. Classical genomic prediction models, even those that model the GxE component separately from G, do not consider the complexity of the maize genome or how genomic regions respond differently to environmental stimuli. By assuming an infinitesimal model, they act as black boxes that rely on statistical rather than biological relationships. Researchers have suggested that functional genome annotations and multi-omics information could better explain the genotype-phenotype relationship. Studies have shown that prioritizing genomic markers, i.e., SNPs, based on prior biological or functional information can help improve the predictive ability of models. Similar results have been reported in studies that include multi-omics information, such as transcripts, proteins, and metabolites, in genomic prediction. However, most of these studies cover either a single experiment or a set of experiments within a single location. 
Their potential to capture GxE interactions for complex quantitative traits in a large MET setting needs further validation. Therefore, this thesis aims to (i) evaluate the potential of genomic functional annotations to improve maize predictions by prioritizing the genomic regions that respond to environmental stimuli for a given trait, (ii) investigate the potential of multi-omics data to account for GxE while improving the prediction of complex traits, and (iii) identify genes that are associated with productivity traits and respond to environmental conditions, to offer insights into the biology behind GxE interactions. Our study uses a set of 244 maize hybrids evaluated for productivity traits in field trials carried out across Europe and Chile under contrasting watering regimes. Environmental covariates (ECs) related to key developmental stages of the plants in the field were also obtained. In addition, gene ontology (GO) functional annotations for the maize genome were obtained from publicly available databases. The same genotypes were also evaluated for ecophysiological traits, and transcriptomic and proteomic profiles were measured under contrasting watering regimes in controlled conditions on a platform. In Chapter 1, we showed that when the right GO terms are considered, biologically relevant SNPs can account for variance separately from the rest of the SNPs, ultimately improving predictions of both field productivity and platform ecophysiological traits. In Chapter 2, we showed that the omics data could increase predictive ability compared with genomic selection, in particular for the traits phenotyped in the controlled experiment in which the omics were measured. 
We also integrated ECs and multi-omics information within the same model, which, to our knowledge, is the first such example in the literature. In Chapter 3, a transcriptome-wide association study (TWAS) showed that omics measured in controlled platform conditions can help dissect the genetic architecture of grain yield measured in field MET. We also found that some of the significantly associated transcripts have already been reported in the literature as associated with stress response. Importantly, we observed that TWAS complements GWAS, as it can improve the resolution and detection power of association analysis. Overall, this thesis indicates that functional annotations and multi-omics data are useful for understanding and predicting GxE interactions.
19

Ding, Hao. "Visualization and Integrative analysis of cancer multi-omics data." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1467843712.

20

Kim, Jieun. "Computational tools for the integrative analysis of muti-omics data to decipher trans-omics networks." Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/28524.

Abstract:
Regulatory networks define the phenotype, morphology, and function of cells. These networks are built from the basic building blocks of the cell (DNA, RNA, and proteins) and cut across the respective omics layers (genome, transcriptome, and proteome). The resulting omics networks depict a near-infinite number of possible nodes and edges that intricately connect the ‘omes’. With the rapid advancement of technologies that generate omics data in bulk samples and now at single-cell resolution, the life sciences now face the challenge of connecting these omes to generate trans-omics networks. To this end, this thesis addressed some of the pressing challenges in trans-omics network reconstruction and the integrative analysis of omics data at both bulk and single-cell resolution: 1) the lack of an integrated pipeline for processing and downstream analysis of lesser-studied omics layers; 2) the need for an integrative framework to reconstruct transcriptional networks and discover novel regulators of transcriptional regulation; and 3) the development of tools for the reconstruction of single-cell multi-modal transcriptional regulatory networks (TRNs). I envision that the work in this thesis will contribute to the integrative study of bulk and single-cell trans-omics data, which I believe will become essential and standard practice in molecular biology as the comprehensiveness and accuracy of omics measurements and of the databases connecting different omics improve.
21

Liu, Yunpeng. "Integrative multi-omics dissection of cancer cell states and susceptibility." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/130818.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Biology, February, 2021
Cataloged from the official PDF of thesis. "February 2021."
Includes bibliographical references (pages 217-239).
Cancer cells are characterized by a broad spectrum of unique genetic, epigenetic and transcriptional states, which are often concomitant with high degrees of plasticity in cell identity. These cell states and the fluidity therein are a major source of resistance to both chemotherapy and targeted therapy. Combinatorial efforts in experimental assays and computational modeling are pivotal for understanding the origins of cancer cell plasticity and exposing cell state-specific vulnerabilities. In this thesis, I will first present my studies on two clinically challenging types of hematopoietic malignancies and discuss key genes that sustain cell identity and survival programs revealed through multi-omics approaches.
In the first study, a combination of expression, chromatin binding and chromatin accessibility analyses revealed novel functions of the plant homeodomain finger-like family protein PHF6 as a lineage identity regulator in a mouse model of BCR-ABL-driven B cell acute lymphoblastic leukemia. In the second study, single-cell transcriptomic profiling, computational inference of cell cycle trajectories and unbiased functional genomics jointly identified RAD51B as a uniquely essential gene in near-haploid leukemia. Finally, to systematically model heterogeneous cell states and generate readily testable predictions of susceptibilities in cancer, I proposed a novel computational pipeline that integrates multiple data types to construct a quantitative model of transcription regulation, which can in turn be used to infer changes in gene expression in response to transcription factor perturbation.
The pipeline then uses these gene expression responses to perturbations to estimate changes in protein activity and finds the combination of protein activity score changes that best predicts changes in cell fitness. Applying the pipeline to glioblastoma multiforme, a cancer type that lacks effective targeted therapy, I prioritized a small set of genes, including MYBL2, as subtype-specific candidate targets. My thesis work demonstrates the power of integrative, multi-omics approaches for effective discovery of susceptibilities in cancer and highlights an emerging paradigm for understanding the information flow in the cellular circuitry.
22

Duperret, Léo. "Caractérisation des mécanismes moléculaires de la permissivité au Syndrome de Mortalité de l'Huître du Pacifique (POMS) sous influence de la température et du régime alimentaire." Electronic Thesis or Diss., Perpignan, 2024. http://www.theses.fr/2024PERP0042.

Abstract:
Over the past decades, food production systems have had to meet the growing demand for food driven by the exponential increase in the global human population. This demand has led to intensified agriculture, livestock farming, and fishing practices, often at the expense of natural resources and planetary health. In the marine environment, intensified fishing has resulted in the depletion of certain stocks and the introduction of fishing quotas. The decline in marine resources has prompted the development of aquaculture, the farming of blue resources. However, with overproduction and global environmental change, we have witnessed an upsurge in epizootics since 1970, particularly among ectothermic organisms. Pacific Oyster Mortality Syndrome (POMS) is a prime example: every year it causes significant mortality episodes in juvenile oysters of the species Magallana gigas in all producing countries. This polymicrobial disease emerged in France in 2008, and its pathogenicity depends on several factors, including temperature (between 16°C and 24°C along the French coasts) and the availability of nutritional resources. Although extensive research has helped characterize its pathogenesis and identify the factors that influence the development of the disease, the molecular mechanisms underlying variations in permissiveness according to these factors remain largely unknown. This thesis addresses that gap. Through a rigorous experimental design, a holistic approach, and an integrative comparative analysis at multiple scales under conditions permissive and non-permissive to the disease, we identified the molecular mechanisms underlying permissiveness related to temperature and nutritional resources. These findings improve our understanding of the complexity of this host-pathogen-environment interaction and will ultimately contribute to the development of predictive models of epidemiological risk.
23

Noack, Stephan. "Integrative Auswertung von Multi-Omics-Daten aus dem Zentralstoffwechsel von Corynebacterium glutamicum." Siegen: Universitätsbibliothek der Universität Siegen, 2011. http://d-nb.info/1017706166/34.

24

Zhou, Shanshan. "Integrating multi-omics to investigate the correlation between the quality and efficacy of ginseng." HKBU Institutional Repository, 2019. https://repository.hkbu.edu.hk/etd_oa/693.

Abstract:
Ginseng, the root and rhizome of Panax ginseng C. A. Mey. (Araliaceae), is one of the most famed dietary and medicinal herbs worldwide owing to its multifaceted efficacy. Ginsenosides and carbohydrates have been shown to be the major bioactive components of ginseng. Ginseng materials are produced under various conditions, e.g. different growth years or different post-harvest processing and handling methods. These conditions can alter chemical profiles and thereby lead to differences in the quality and efficacy of ginseng. To address this issue, it is necessary to understand the correlation between the quality and efficacy of ginseng materials produced under different conditions. Previous studies have attempted to investigate how growth years and post-harvest processing and handling affect the quality and efficacy of ginseng. In most of these cases, several chemical components and biological parameters were selected as indicators of the quality and efficacy of ginseng, respectively. However, it is well recognized that ginseng therapy is characterized by "multiple components against multiple targets". A few selected indicators may therefore fail to characterize the quality and efficacy of ginseng comprehensively, and thus cannot accurately reveal their correlation. Instead, holism-based approaches should be employed. In this study, we integrated chemomics, metabolomics and gut microbiota genomics to investigate the correlation between the quality and efficacy of ginseng with respect to growth years, steam-processing and sulfur-fumigation. 
First, a chemomics approach was developed to qualitatively and quantitatively determine major ginsenosides and carbohydrates (poly-, oligo- and monosaccharides) by ultra-high performance liquid chromatography-tandem triple quadrupole mass spectrometry (UHPLC-QqQ-MS/MS) and high performance liquid chromatography coupled with an evaporative light scattering detector (HPLC-ELSD), in order to characterize the overall quality of ginseng. Second, ultra-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UPLC-QTOF-MS/MS)-based metabolomics and 16S rRNA gene sequencing-based gut microbiota genomics, coupled with the determination of biochemical parameters, were performed to evaluate the anti-fatigue and anti-obesity activities of the different ginseng materials in animal models. Third, the multi-omics data obtained were processed by multivariate statistical analysis and then integrated to examine the correlation between the quality and efficacy of ginseng materials under the different conditions. The results indicated that: 1) ginseng of 4-6 growth years possessed different anti-fatigue activity in multiple targets owing to the different effects of ginsenosides and carbohydrates on endogenous metabolism and gut microbiota; 2) steam-processing qualitatively and quantitatively altered ginsenosides and carbohydrates in ginseng, resulting in different anti-obesity activity between white ginseng and red ginseng, and the mechanisms potentially involve chemically structural/compositional specificity to gut microbiota; 3) the residual SO2 content caused by sulfur-fumigation did not correlate with changes in the quality, efficacy and toxicity of sulfur-fumigated ginseng; more specifically, a lower SO2 residue did not indicate higher quality, better efficacy or weaker toxicity. The research provides scientific insights to guide the clinical and dietary use of ginseng and offers a new methodology for comprehensively exploring the correlation between the quality and efficacy of herbal medicines.
25

Schneider, Lara Kristina. "Multi-omics integrative analyses for decision support systems in personalized cancer treatment." Supervised by Hans-Peter Lenhof. Saarbrücken: Saarländische Universitäts- und Landesbibliothek, 2020. http://d-nb.info/1213723973/34.

26

Wery, Méline. "Identification de signature causale pathologie par intégration de données multi-omiques." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S071.

Abstract:
Systemic lupus erythematosus is an example of a complex, heterogeneous and multifactorial disease. Identifying a signature that can explain the cause of a disease remains an important challenge for patient stratification. Moreover, classical statistical analyses are difficult to apply when the populations of interest are heterogeneous, and they do not reveal the cause. This thesis therefore presents two methods that address this problem. First, a transomic model is described that structures all the omics data using the Semantic Web (RDF). The model is populated through an analysis at the level of the individual patient. Querying the model with SPARQL allowed the identification of expression Individually-Consistent Trait Loci (eICTLs). An eICTL is an association, obtained by reasoning, between a SNP and a gene such that the presence of the SNP influences the variation in the gene's expression. These elements reduce the dimensionality of the omics data and are more informative than genomic data alone. This first method relies solely on omics data. The second method is based on the dependencies between regulations that exist in biological networks. By combining the dynamics of biological systems with formal concept analysis, the generated stable states are classified automatically. This classification made it possible to enrich biological signatures that are characteristic of a phenotype. In addition, new hybrid phenotypes were identified.
27

Ronen, Jonathan. "Integrative analysis of data from multiple experiments." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/21612.

Abstract:
The development of high-throughput sequencing (HTS) was followed by a swarm of protocols that use HTS to measure different molecular aspects such as gene expression (transcriptome), DNA methylation (methylome) and more. This opened opportunities to develop data analysis algorithms and procedures that consider data produced by different experiments. Considering data from seemingly unrelated experiments is particularly beneficial for single-cell RNA sequencing (scRNA-seq). scRNA-seq produces particularly noisy data, owing to the loss of nucleic acids when handling the small amounts present in single cells, and to various technical biases. To address these challenges, I developed a method called netSmooth, which de-noises and imputes scRNA-seq data by applying network diffusion over a gene network that encodes expectations of co-expression patterns. The gene network is constructed from other experimental data. Using a gene network constructed from protein-protein interactions, I show that netSmooth outperforms other state-of-the-art scRNA-seq imputation methods at identifying blood cell types in hematopoiesis, elucidating time series data in an embryonic development dataset, and identifying the tumor of origin for scRNA-seq of glioblastomas. netSmooth has one free parameter, the diffusion distance, which I show can be selected using data-driven metrics. Thus, netSmooth may be used even when the diffusion distance cannot be optimized explicitly using ground-truth labels. Another task that requires joint analysis of data from different experiments arises when different omics protocols are applied to the same biological samples. Analyzing such multi-omics data in an integrated fashion, rather than each data type (RNA-seq, DNA-seq, etc.) on its own, is beneficial, as each omics experiment elucidates only part of an integrated cellular system, whereas the simultaneous analysis may reveal a more comprehensive view. I developed a method called maui to find latent factor representations of multi-omics data.
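The network diffusion step that the abstract attributes to netSmooth can be illustrated with a small sketch. The code below is not netSmooth's implementation; it is a generic random-walk-with-restart smoothing written for illustration. The names `column_normalize`, `diffuse`, and `alpha` (a restart weight standing in for the diffusion distance), and the choice to give isolated genes a self-loop so that they are left unchanged, are all assumptions.

```python
from typing import List

Matrix = List[List[float]]


def column_normalize(adj: Matrix) -> Matrix:
    """Scale each column of the adjacency matrix to sum to 1.

    Assumption for this sketch: a gene with no neighbours gets a
    self-loop, so diffusion leaves its value unchanged.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    norm = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if col_sums[j] > 0:
                norm[i][j] = adj[i][j] / col_sums[j]
            elif i == j:
                norm[i][j] = 1.0
    return norm


def diffuse(adj: Matrix, expr: List[float], alpha: float = 0.5,
            tol: float = 1e-9, max_iter: int = 1000) -> List[float]:
    """Smooth one cell's expression vector over a gene network.

    Iterates f <- alpha * A_norm * f + (1 - alpha) * expr until the
    largest change between iterations falls below `tol`.
    """
    a_norm = column_normalize(adj)
    n = len(expr)
    f = list(expr)
    for _ in range(max_iter):
        nxt = [
            alpha * sum(a_norm[i][j] * f[j] for j in range(n))
            + (1 - alpha) * expr[i]
            for i in range(n)
        ]
        if max(abs(x - y) for x, y in zip(nxt, f)) < tol:
            return nxt
        f = nxt
    return f


if __name__ == "__main__":
    # Hypothetical toy network: genes 0 and 1 interact, gene 2 is isolated.
    network = [[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
    # Gene 1 has a zero count, as a dropout would produce.
    print(diffuse(network, [4.0, 0.0, 2.0], alpha=0.5))
```

In this toy network the zero recorded for gene 1 is partly filled in from its neighbour, gene 0, while the isolated gene 2 keeps its value. Because the normalized matrix is column-stochastic, the total signal across genes is preserved.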
28

Teng, Sin Yong. "Intelligent Energy-Savings and Process Improvement Strategies in Energy-Intensive Industries." Doctoral thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-433427.

Abstract:
As new technologies for energy-intensive industries continue to evolve, existing plants gradually fall behind in efficiency and productivity. Fierce market competition and environmental legislation force these traditional plants to cease operation and shut down. Process improvement and retrofit projects are essential for maintaining the operational performance of these plants. Current approaches to process improvement are mainly process integration, process optimization, and process intensification. In general, these areas rely on mathematical optimization, the practitioner's experience, and operational heuristics. These approaches serve as the basis for process improvement. However, their performance can be further improved with modern computational intelligence. The purpose of this work is therefore to apply advanced artificial intelligence and machine learning techniques to process improvement in energy-intensive industrial processes. The work adopts an approach that addresses this problem through simulation of industrial systems and makes the following contributions: (i) Application of machine learning techniques, including one-shot learning and neuro-evolution, for data-driven modelling and optimization of individual units. (ii) Application of dimensionality reduction (e.g., principal component analysis, autoencoder) for multi-objective optimization of multi-unit processes. (iii) Design of a new tool for analysing problematic parts of a system in order to remove them (bottleneck tree analysis, BOTA). An extension of the tool that allows multi-dimensional problems to be solved with a data-driven approach was also proposed. (iv) Demonstration of the effectiveness of Monte Carlo simulation, neural networks, and decision trees for decision-making when integrating new process technology into existing processes. (v) Comparison of Hierarchical Temporal Memory (HTM) and dual optimization with several predictive tools for supporting real-time operations management. 
(vi) Implementation of an artificial neural network within the interface for the conventional process graph (P-graph). (vii) Highlighting the future of artificial intelligence and process engineering in biosystems through a commercially based multi-omics paradigm.
29

Pavel, Ana Brandusa. "Multi-omics data integration for the detection and characterization of smoking related lung diseases." Thesis, 2017. https://hdl.handle.net/2144/24073.

Abstract:
Lung cancer is the leading cause of death from cancer in the world. First, we hypothesized that microRNA expression is altered in the bronchial epithelium of patients with lung cancer and that incorporating microRNA expression into an existing mRNA biomarker may improve its performance. Using bronchial brushings collected from current and former smokers, we profiled microRNA expression via small RNA sequencing for 347 patients with available mRNA data. We found that four microRNAs were under-expressed in cancer patients compared to controls (p<0.002, FDR<0.2). We explored the role of these microRNAs and their gene targets in cancer. In addition, we found that adding a microRNA feature to an existing 23-gene biomarker significantly improves its performance (AUC) in a test set (p<0.05). Next, we generalized the biomarker discovery process, and developed a visualization tool for biomarker selection. We built upon an existing biomarker discovery pipeline and created a web-based interface to visualize the performance of multiple predictors. The “visualization” component is the key to sorting through a thousand potential biomarkers, and developing clinically useful molecular predictors. Finally, we explored the molecular events leading to the development of COPD and ILD, two heterogeneous diseases with high mortality. We hypothesized that integrative genetic and expression networks can help identify drivers and elucidate mechanisms of genetic susceptibility. We utilized 262 lung tissue specimens profiled with microRNA sequencing, microarray gene expression and SNP chip genotyping. Next, we built condition specific integrative networks using a causality inference test for predicting SNP-microRNA-mRNA associations, where the microRNA is a predicted mediator of the SNP’s effect on gene expression. We identified the microRNAs predicted to affect the most genes within each network. 
Members of miR-34/449 family, known to promote airway differentiation by repressing the Notch pathway, were among the top ranked microRNAs in COPD and ILD networks, but not in the non-disease network. In addition, the miR-34/449 gene module was enriched among genes that increase in expression over time when airway basal cells are differentiated at an air-liquid interface and among genes that increase in expression with the airway wall thickening in patients with emphysema.
30

Lovino, Marta. "Algorithms for complex systems in the life sciences: AI for gene fusion prioritization and multi-omics data integration." Doctoral thesis, 2021. https://hdl.handle.net/11583/2973149.

Abstract:
Due to the continuous increase in the volume and complexity of genomic and biological data, new computer science techniques are needed to analyse these data and provide insight into their main features. This thesis designs and develops bioinformatics methods for complex systems in the life sciences in order to provide informative models of biological processes. It is divided into two main sub-topics. The first concerns machine learning and deep learning techniques applied to the analysis of aberrant genetic sequences such as gene fusions. The second is the development of statistical and deep learning techniques for the integration of heterogeneous biological and clinical data.

Regarding the first sub-topic, a gene fusion is a biological event in which two distinct regions of the DNA create a new fused gene. Gene fusions are relevant in medicine because many of them are involved in cancer, and some can even be used as cancer predictors. However, not all of them are oncogenic. The first part of this thesis is devoted to the automated recognition of oncogenic gene fusions, an open and challenging problem in the analysis of cancer development. In this context, an automated model has been developed that recognises oncogenic gene fusions relying exclusively on the amino acid sequence of the resulting proteins. The main contributions are: 1. creation of a database used to train and test the model; 2. development of the methodology through the design and implementation of a predictive model based on a Convolutional Neural Network (CNN) followed by a bidirectional Long Short-Term Memory (LSTM) network; 3. an extensive comparative analysis with other reference tools in the literature; 4. engineering of the developed method through the implementation and release of an automated tool for gene fusion prioritization downstream of gene fusion detection tools. Since this approach does not consider post-transcriptional regulation effects, additional biological features (e.g., microRNA data, gene ontologies, and transcription factors) were considered to improve overall performance, and a new integrated approach based on a multi-layer perceptron (MLP) was designed. Finally, extensive comparisons with other methods in the literature were made. These contributions led to an improved model that outperforms the previous ones and competes with state-of-the-art tools.

The rationale behind the second sub-topic is the following: owing to the widespread adoption of Next Generation Sequencing (NGS) technologies, a large amount of heterogeneous, complex data on several diseases and on healthy individuals is now available (e.g., RNA-seq, gene expression data, miRNA expression data, methylation sequencing data, and many others). Each of these data types is called an omic, and their integrative study is called multi-omics. The aim is to integrate multi-omics data involving thousands of features (genes, microRNAs) and to identify which of them are relevant to a specific biological process. From a computational point of view, finding the best strategies for multi-omics analysis and relevant feature identification remains an open challenge.

The first chapter dedicated to this second sub-topic focuses on the integrative analysis of gene expression and connectivity data from mouse brains using machine learning techniques. The rationale behind this study is to explore whether the degree of physical connection between brain regions can be evaluated from their gene expression data. Many studies have considered the functional connection of two or more brain areas (which areas are activated in response to a specific stimulus), whereas analyzing physical connections (i.e., axon bundles) starting from gene expression data is still an open problem. Although such a study is scientifically very relevant to understanding how the human brain functions, ethical reasons strongly limit the availability of samples. For this reason, several studies have been carried out on the mouse brain, which is anatomically similar to the human one. The neuronal connection data of mouse brains (obtained with viral tracers) were processed to identify physically connected brain regions and then evaluated against the gene expression data of these areas. A multi-layer perceptron was applied to classify regions as connected or unconnected, taking gene expression data as input. Furthermore, a second model was created to infer the degree of connection between distinct brain regions. The implemented models successfully performed the binary classification task (connected versus unconnected regions) and distinguished connection intensity as low, medium, or high.

A second chapter describes a statistical method to reveal pathology-determining microRNA targets in multi-omic datasets. Two multi-omics datasets are used: a breast cancer dataset and a medulloblastoma dataset. Both are composed of miRNA, mRNA, and proteomics data from the same patients. The main computational contribution consists of designing and implementing an algorithm based on conditional probability to infer the impact of miRNA post-transcriptional regulation on target genes using the protein expression values. The developed methodology allowed a more in-depth understanding and identification of target genes. The resulting targets also proved to be significantly enriched in three well-known databases (miRDB, TargetScan, and miRTarBase), leading to relevant biological insights.

Another chapter deals with the classification of multi-omics samples. The main approaches in the literature either integrate all the features available for each sample upstream of the classifier (early integration) or create a separate classifier for each omic and subsequently define a set of consensus rules (late integration). In this context, the main contribution consists of introducing the concept of probability by creating a model based on Bayesian and MLP networks to achieve a consensus guided by the class label and its probability. This approach has shown that a probabilistic late integration classification is more specific than an early integration approach and can identify samples outside the training domain.

Class labels could be helpful for providing new molecular profiles and patient categorization. However, they are not always available. The need to cluster samples based on their intrinsic characteristics is therefore addressed in a specific chapter. Multi-omic clustering is mainly addressed in the literature by graph-based methods or by methods based on multidimensional data reduction. The main contribution to this field is a model based on deep learning techniques that implements an MLP with a specifically designed loss function. The loss represents the input samples in a reduced-dimensional space by calculating the intra-cluster and inter-cluster distances at each epoch. This approach reported performance comparable to that of the most commonly cited methods in the literature while avoiding pre-processing steps for either feature selection or dimensionality reduction. Moreover, it has no limitation on the number of omics to integrate.
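
The late integration idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's Bayesian/MLP model: it assumes that each omic already has its own classifier producing class probabilities, takes the plain average of those probabilities as the consensus, and marks a sample as unassigned when the consensus confidence is low. The function name `late_integration`, the threshold `min_confidence`, and the toy probabilities are all hypothetical.

```python
import numpy as np


def late_integration(prob_per_omic, min_confidence=0.6):
    """Combine per-omic class probabilities into one consensus prediction.

    prob_per_omic: list of arrays of shape (n_samples, n_classes),
                   one array per omic, rows summing to 1.
    min_confidence: hypothetical threshold; samples whose consensus
                    probability falls below it are labelled -1 (unassigned).
    """
    stacked = np.stack(prob_per_omic)      # (n_omics, n_samples, n_classes)
    consensus = stacked.mean(axis=0)       # average over omics
    labels = consensus.argmax(axis=1)      # most probable class per sample
    confidence = consensus.max(axis=1)     # probability of that class
    labels = np.where(confidence >= min_confidence, labels, -1)
    return labels, consensus


# Toy probabilities from two hypothetical per-omic classifiers
# (three samples, two classes).
mrna_probs = np.array([[0.90, 0.10], [0.20, 0.80], [0.55, 0.45]])
mirna_probs = np.array([[0.80, 0.20], [0.30, 0.70], [0.40, 0.60]])

labels, consensus = late_integration([mrna_probs, mirna_probs])
print(labels)
print(consensus)
```

In this toy case the two classifiers disagree on the third sample, so its averaged confidence stays below the threshold and it is left unassigned. An early integration approach would instead concatenate the mRNA and miRNA features into one vector and train a single classifier on it.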
31

Papież, Anna. "Integrative data analysis methods in multi-omics molecular biology studies for disease of affluence biomarker research." Rozprawa doktorska, 2019. https://repolis.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=59005.

32

Papież, Anna. "Integrative data analysis methods in multi-omics molecular biology studies for disease of affluence biomarker research." Rozprawa doktorska, 2019. https://delibra.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=59005.

33

Abd-Rabbo, Diala. "Beyond hairballs: depicting complexity of a kinase-phosphatase network in the budding yeast." Thèse, 2017. http://hdl.handle.net/1866/19318.
