Dissertations / Theses on the topic 'Analisi dati omici'

To see the other types of publications on this topic, follow the link: Analisi dati omici.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 21 dissertations / theses for your research on the topic 'Analisi dati omici.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Berti, Elisa. "Applicazione del metodo QDanet_PRO alla classificazione di dati omici." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/9411/.

Full text
Abstract:
Il presente lavoro di tesi si pone nell'ambito dell'analisi dati attraverso un metodo (QDanet_PRO), elaborato dal Prof. Remondini in collaborazine coi Dott. Levi e Malagoli, basato sull'analisi discriminate a coppie e sulla Teoria dei Network, che ha come obiettivo la classificazione di dati contenuti in dataset dove il numero di campioni è molto ridotto rispetto al numero di variabili. Attraverso questo studio si vogliono identificare delle signature, ovvero un'insieme ridotto di variabili che siano in grado di classificare correttamente i campioni in base al comportamento delle variabili stesse. L'elaborazione dei diversi dataset avviene attraverso diverse fasi; si comincia con una un'analisi discriminante a coppie per identificare le performance di ogni coppia di variabili per poi passare alla ricerca delle coppie più performanti attraverso un processo che combina la Teoria dei Network con la Cross Validation. Una volta ottenuta la signature si conclude l'elaborazione con una validazione per avere un'analisi quantitativa del successo o meno del metodo.
APA, Harvard, Vancouver, ISO, and other styles
2

MASPERO, DAVIDE. "Computational strategies to dissect the heterogeneity of multicellular systems via multiscale modelling and omics data analysis." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2022. http://hdl.handle.net/10281/368331.

Full text
Abstract:
L'eterogeneità pervade i sistemi biologici e si manifesta in differenze strutturali e funzionali osservate sia tra diversi individui di uno stesso gruppo (es. organismi o patologie), sia fra gli elementi costituenti di un singolo individuo (es. cellule). Lo studio dell’eterogeneità dei sistemi biologici e, in particolare, di quelli multicellulari è fondamentale per la comprensione meccanicistica di fenomeni fisiologici e patologici complessi (es. il cancro), così come per la definizione di strategie prognostiche, diagnostiche e terapeutiche efficaci. Questo lavoro è focalizzato sullo sviluppo e l’applicazione di metodi computazionali e modelli matematici per la caratterizzazione dell’eterogeneità di sistemi multicellulari e delle sottopopolazioni di cellule tumorali che sottendono l’evoluzione di una patologia neoplastica. Analoghe metodologie sono state sviluppate per caratterizzare efficacemente l’evoluzione e l’eterogeneità virale. La ricerca è suddivisa in due porzioni complementari, la prima finalizzata alla definizione di metodi per l’analisi e l’integrazione di dati omici generati da esperimenti di sequenziamento, la seconda alla modellazione e simulazione multiscala di sistemi multicellulari. Per quanto riguarda il primo filone, le tecnologie di next-generation sequencing permettono di generare enormi moli di dati omici, relativi per esempio al genoma o trascrittoma di un determinato individuo, attraverso esperimenti di bulk o single-cell sequencing. Una delle sfide principale in informatica è quella di definire metodi computazionali per estrarre informazione utile da tali dati, tenendo conto degli alti livelli di errori dato-specifico, dovuti principalmente a limiti tecnologici. In particolare, nell’ambito di questo lavoro, ci si è concentrati sullo sviluppo di metodi per l’analisi di dati di espressione genica e di mutazioni genomiche. In dettaglio, è stata effettuata una comparazione esaustiva dei metodi di machine-learning per il denoising e l’imputation di dati di single-cell RNA-sequencing. Inoltre, sono stati sviluppati metodi per il mapping dei profili di espressione su reti metaboliche, attraverso un framework innovativo che ha consentito di stratificare pazienti oncologici in base al loro metabolismo. Una successiva estensione del metodo ha permesso di analizzare la distribuzione dei flussi metabolici all'interno di una popolazione di cellule, via un approccio di flux balance analysis. Per quanto riguarda l’analisi dei profili mutazionali, è stato ideato e implementato il primo metodo per la ricostruzione di modelli filogenomici a partire da dati longitudinali a risoluzione single-cell, che sfrutta un framework che combina una Markov Chain Monte Carlo con una nuova funzione di likelihood pesata. Analogamente, è stato sviluppato un framework che sfrutta i profili delle mutazioni a bassa frequenza per ricostruire filogenie robuste e probabili catene di infenzione, attraverso l’analisi dei dati di sequenziamento di campioni virali. Gli stessi profili mutazionali permettono anche di deconvolvere il segnale nelle firme associati a specifici meccanismi molecolari che generano tali mutazioni, attraverso un approccio basato su non-negative matrix factorization. La ricerca condotta per quello che riguarda la simulazione computazionale ha portato allo sviluppo di un modello multiscala, in cui la simulazione della dinamica di popolazioni cellulari, rappresentata attraverso un Cellular Potts Model, è accoppiata all'ottimizzazione di un modello metabolico associato a ciascuna cellula sintetica. Co modello è possibile rappresentare ipotesi in termini matematici e osservare proprietà emergenti da tali assunti. Infine, un primo tentativo per combinare i due approcci metodologici ha condotto all'integrazione di dati di single-cell RNA-seq all'interno del modello multiscala, consentendo di formulare ipotesi data-driven sulle proprietà emergenti del sistema.
Heterogeneity pervades biological systems and manifests itself in the structural and functional differences observed both among different individuals of the same group (e.g., organisms or disease systems) and among the constituent elements of a single individual (e.g., cells). The study of the heterogeneity of biological systems and, in particular, of multicellular systems is fundamental for the mechanistic understanding of complex physiological and pathological phenomena (e.g., cancer), as well as for the definition of effective prognostic, diagnostic, and therapeutic strategies. This work focuses on developing and applying computational methods and mathematical models for characterising the heterogeneity of multicellular systems and, especially, cancer cell subpopulations underlying the evolution of neoplastic pathology. Similar methodologies have been developed to characterise viral evolution and heterogeneity effectively. The research is divided into two complementary portions, the first aimed at defining methods for the analysis and integration of omics data generated by sequencing experiments, the second at modelling and multiscale simulation of multicellular systems. Regarding the first strand, next-generation sequencing technologies allow us to generate vast amounts of omics data, for example, related to the genome or transcriptome of a given individual, through bulk or single-cell sequencing experiments. One of the main challenges in computer science is to define computational methods to extract useful information from such data, taking into account the high levels of data-specific errors, mainly due to technological limitations. In particular, in the context of this work, we focused on developing methods for the analysis of gene expression and genomic mutation data. In detail, an exhaustive comparison of machine-learning methods for denoising and imputation of single-cell RNA-sequencing data has been performed. Moreover, methods for mapping expression profiles onto metabolic networks have been developed through an innovative framework that has allowed one to stratify cancer patients according to their metabolism. A subsequent extension of the method allowed us to analyse the distribution of metabolic fluxes within a population of cells via a flux balance analysis approach. Regarding the analysis of mutational profiles, the first method for reconstructing phylogenomic models from longitudinal data at single-cell resolution has been designed and implemented, exploiting a framework that combines a Markov Chain Monte Carlo with a novel weighted likelihood function. Similarly, a framework that exploits low-frequency mutation profiles to reconstruct robust phylogenies and likely chains of infection has been developed by analysing sequencing data from viral samples. The same mutational profiles also allow us to deconvolve the signal in the signatures associated with specific molecular mechanisms that generate such mutations through an approach based on non-negative matrix factorisation. The research conducted with regard to the computational simulation has led to the development of a multiscale model, in which the simulation of cell population dynamics, represented through a Cellular Potts Model, is coupled to the optimisation of a metabolic model associated with each synthetic cell. Using this model, it is possible to represent assumptions in mathematical terms and observe properties emerging from these assumptions. Finally, we present a first attempt to combine the two methodological approaches which led to the integration of single-cell RNA-seq data within the multiscale model, allowing data-driven hypotheses to be formulated on the emerging properties of the system.
APA, Harvard, Vancouver, ISO, and other styles
3

Tellaroli, Paola. "Three topics in omics research." Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3423912.

Full text
Abstract:
The rather generic title of this Thesis is due to the fact that several aspects of biological phenomena have been investigated. Most of this work was addressed at the investigation of the limitations of one of the essential tools for analyzing gene expression data: cluster analysis. With several hundred of clustering methods in existence, there is clearly no shortage of clustering algorithms but, at the same time, satisfactory answers to some basic questions are still to come. In particular, we present a novel algorithm for the clustering of static data and a new strategy for the clustering of short-length time-course data. Finally, we analyzed data coming from Cap Analysis Gene Expression, a relatively new technology useful for the genome-wide promoter analysis and still mostly unexplored.
Il titolo piuttosto generico di questa tesi è dovuto al fatto che sono stati indagati diversi aspetti di fenomeni biologici. La maggior parte di questo lavoro è stato rivolto alla ricerca dei limiti di uno degli strumenti essenziali per l'analisi di dati di espressione genica: l'analisi dei gruppi. Esistendo diverse centinaia di metodi di raggruppamento, chiaramente non c'è carenza di algoritmi di analisi dei gruppi, ma, allo stesso tempo, alcuni quesiti fondamentali non hanno ancora ricevuto risposte soddisfacenti. In particolare, presentiamo un nuovo algoritmo di analisi dei gruppi per dati statici ed una nuova strategia per il raggruppamento di dati temporali di breve lunghezza. Infine, abbiamo analizzato dati provenienti da una tecnologia relativamente nuova, chiamata Cap Analysis Gene Expression, utile per l'analisi dei promotori su tutto il genoma e ancora in gran parte inesplorata.
APA, Harvard, Vancouver, ISO, and other styles
4

Ayati, Marzieh. "Algorithms to Integrate Omics Data for Personalized Medicine." Case Western Reserve University School of Graduate Studies / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case1527679638507616.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Zuo, Yiming. "Differential Network Analysis based on Omic Data for Cancer Biomarker Discovery." Diss., Virginia Tech, 2017. http://hdl.handle.net/10919/78217.

Full text
Abstract:
Recent advances in high-throughput technique enables the generation of a large amount of omic data such as genomics, transcriptomics, proteomics, metabolomics, glycomics etc. Typically, differential expression analysis (e.g., student's t-test, ANOVA) is performed to identify biomolecules (e.g., genes, proteins, metabolites, glycans) with significant changes on individual level between biologically disparate groups (disease cases vs. healthy controls) for cancer biomarker discovery. However, differential expression analysis on independent studies for the same clinical types of patients often led to different sets of significant biomolecules and had only few in common. This may be attributed to the fact that biomolecules are members of strongly intertwined biological pathways and highly interactive with each other. Without considering these interactions, differential expression analysis could lead to biased results. Network-based methods provide a natural framework to study the interactions between biomolecules. Commonly used data-driven network models include relevance network, Bayesian network and Gaussian graphical models. In addition to data-driven network models, there are many publicly available databases such as STRING, KEGG, Reactome, and ConsensusPathDB, where one can extract various types of interactions to build knowledge-driven networks. While both data- and knowledge-driven networks have their pros and cons, an appropriate approach to incorporate the prior biological knowledge from publicly available databases into data-driven network model is desirable for more robust and biologically relevant network reconstruction. Recently, there has been a growing interest in differential network analysis, where the connection in the network represents a statistically significant change in the pairwise interaction between two biomolecules in different groups. From the rewiring interactions shown in differential networks, biomolecules that have strongly altered connectivity between distinct biological groups can be identified. These biomolecules might play an important role in the disease under study. In fact, differential expression and differential network analyses investigate omic data from two complementary perspectives: the former focuses on the change in individual biomolecule level between different groups while the latter concentrates on the change in pairwise biomolecules level. Therefore, an approach that can integrate differential expression and differential network analyses is likely to discover more reliable and powerful biomarkers. To achieve these goals, we start by proposing a novel data-driven network model (i.e., LOPC) to reconstruct sparse biological networks. The sparse networks only contains direct interactions between biomolecules which can help researchers to focus on the more informative connections. Then we propose a novel method (i.e., dwgLASSO) to incorporate prior biological knowledge into data-driven network model to build biologically relevant networks. Differential network analysis is applied based on the networks constructed for biologically disparate groups to identify cancer biomarker candidates. Finally, we propose a novel network-based approach (i.e., INDEED) to integrate differential expression and differential network analyses to identify more reliable and powerful cancer biomarker candidates. INDEED is further expanded as INDEED-M to utilize omic data at different levels of human biological system (e.g., transcriptomics, proteomics, metabolomics), which we believe is promising to increase our understanding of cancer. Matlab and R packages for the proposed methods are developed and available at Github (https://github.com/Hurricaner1989) to share with the research community.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
6

Lu, Yingzhou. "Multi-omics Data Integration for Identifying Disease Specific Biological Pathways." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/83467.

Full text
Abstract:
Pathway analysis is an important task for gaining novel insights into the molecular architecture of many complex diseases. With the advancement of new sequencing technologies, a large amount of quantitative gene expression data have been continuously acquired. The springing up omics data sets such as proteomics has facilitated the investigation on disease relevant pathways. Although much work has previously been done to explore the single omics data, little work has been reported using multi-omics data integration, mainly due to methodological and technological limitations. While a single omic data can provide useful information about the underlying biological processes, multi-omics data integration would be much more comprehensive about the cause-effect processes responsible for diseases and their subtypes. This project investigates the combination of miRNAseq, proteomics, and RNAseq data on seven types of muscular dystrophies and control group. These unique multi-omics data sets provide us with the opportunity to identify disease-specific and most relevant biological pathways. We first perform t-test and OVEPUG test separately to define the differential expressed genes in protein and mRNA data sets. In multi-omics data sets, miRNA also plays a significant role in muscle development by regulating their target genes in mRNA dataset. To exploit the relationship between miRNA and gene expression, we consult with the commonly used gene library - Targetscan to collect all paired miRNA-mRNA and miRNA-protein co-expression pairs. Next, by conducting statistical analysis such as Pearson's correlation coefficient or t-test, we measured the biologically expected correlation of each gene with its upstream miRNAs and identify those showing negative correlation between the aforementioned miRNA-mRNA and miRNA-protein pairs. Furthermore, we identify and assess the most relevant disease-specific pathways by inputting the differential expressed genes and negative correlated genes into the gene-set libraries respectively, and further characterize these prioritized marker subsets using IPA (Ingenuity Pathway Analysis) or KEGG. We will then use Fisher method to combine all these p-values derived from separate gene sets into a joint significance test assessing common pathway relevance. In conclusion, we will find all negative correlated paired miRNA-mRNA and miRNA-protein, and identifying several pathophysiological pathways related to muscular dystrophies by gene set enrichment analysis. This novel multi-omics data integration study and subsequent pathway identification will shed new light on pathophysiological processes in muscular dystrophies and improve our understanding on the molecular pathophysiology of muscle disorders, preventing and treating disease, and make people become healthier in the long term.
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
7

Elhezzani, Najla Saad R. "New statistical methodologies for improved analysis of genomic and omic data." Thesis, King's College London (University of London), 2018. https://kclpure.kcl.ac.uk/portal/en/theses/new-statistical-methodologies-for-improved-analysis-of-genomic-and-omic-data(eb8d95f4-e926-4c54-984f-94d86306525a).html.

Full text
Abstract:
We develop statistical tools for analyzing different types of phenotypic data in genome-wide settings. When the phenotype of interest is a binary case-control status, most genome-wide association studies (GWASs) use randomly selected samples from the population (hereafter bases) as the control set. This approach is successful when the trait of interest is very rare; otherwise, a loss in the statistical power to detect disease-associated variants is expected. To address this, we propose a joint analysis of the three types of samples; cases, bases and controls. This is done by modeling the bases as a mixture of multinomial logistic functions of cases and controls, according to disease prevalence. In a typical GWAS, where thousands of single-nucleotide polymorphisms (SNPs) are available for testing, score-based test statistics are ideal in this case. Other tests of associations such as Wald’s and likelihood ratio tests are known to be asymptotically equivalent to the score test, however their performance under small sample sizes can vary significantly. In order to allow the test comparison to be performed under the proposed case-base-control (CBC) design, we provide an estimation procedure using the maximum likelihood (ML) method along with the expectation-maximization (EM) algorithm. Simulations show that combining the three samples can increase the power to detect disease-associated variants, though a very large base sample set can compensate for lack of controls. In the second part of the thesis, we consider a joint analysis of both genome-wide SNPs as well as multiple phenotypes, with a focus on the challenges they present in the estimation of SNP heritability. The current standard for performing this task is fit-ting a variance component model, despite its tendency to produce boundary estimates when small sample sizes are used. We propose a Bayesian covariance component model (BCCM) that takes into account genetic correlation among phenotypes and genetic correlation among individuals. The use of Bayesian methods allows us to circumvent some issues related to small sample sizes, mainly overfitting and boundary estimates. Using gene expression pathways, we demonstrate a significant improvement in SNP heritability estimates over univariate and ML-based methods, thus explaining why recent progress in eQTL identification has been limited. I published this work as an article in the European Journal of Human genetics. In the third part of the thesis, we study the prospects of using the proposed BCCM for phenotype prediction. Results from real data show consistency in accuracy between ML based methods and the proposed Bayesian method, when effect sizes are estimated using their posterior mode. It is also noted that an initial imputation step relatively increases the predictive accuracy.
APA, Harvard, Vancouver, ISO, and other styles
8

Hafez, Khafaga Ahmed Ibrahem 1987. "Bioinformatics approaches for integration and analysis of fungal omics data oriented to knowledge discovery and diagnosis." Doctoral thesis, TDX (Tesis Doctorals en Xarxa), 2021. http://hdl.handle.net/10803/671160.

Full text
Abstract:
Aquesta tesi presenta una sèrie de recursos bioinformàtics desenvolupats per a donar suport en l'anàlisi de dades de NGS i altres òmics en el camp d'estudi i diagnòstic d'infeccions fúngiques. Hem dissenyat tècniques de computació per identificar nous biomarcadors i determinar potencial trets de resistència, pronosticant les característiques de les seqüències d'ADN/ARN, i planejant estratègies optimitzades de seqüenciació per als estudis de hoste-patogen transcriptomes (Dual RNA-seq). Hem dissenyat i desenvolupat tambe una solució bioinformàtica composta per un component de costat de servidor (constituït per diferents pipelines per a fer anàlisi VariantSeq, Denovoseq i RNAseq) i un altre component constituït per eines software basades en interfícies gràfiques (GUIs) per permetre a l'usuari accedir, gestionar i executar els pipelines mitjançant interfícies amistoses. També hem desenvolupat i validat un software per a l'anàlisi de seqüències i el disseny dels primers (SeqEditor) orientat a la identificació i detecció d'espècies en el diagnòstic de la PCR. Finalment, hem desenvolupat CandidaMine una base de dades integrant dades omiques de fongs patògens.
The aim of this thesis has been to develop a series of bioinformatic resources for analysis of NGS data, proteomics, or other omics technologies in the field of study and diagnosis of yeast infections. In particular, we have explored and designed distinct computational techniques to identify novel biomarker candidates of resistance traits, to predict DNA/RNA sequences’ features, and to optimize sequencing strategies for host-pathogen transcriptome sequencing studies (Dual RNA-seq). We have designed and developed an efficient bioinformatic solution composed of a server-side component constituted by distinct pipelines for VariantSeq, Denovoseq and RNAseq analyses as well as another component constituted by distinct GUI-based software to let the user to access, manage and run the pipelines with friendly-to-use interfaces. We have also designed and developed SeqEditor a software for sequence analysis and primers design for species identification and detection in PCR diagnosis. We also have developed CandidaMine an integrated data warehouse of fungal omics and for data analysis and knowledge discovery.
APA, Harvard, Vancouver, ISO, and other styles
9

Li, Yichao. "Algorithmic Methods for Multi-Omics Biomarker Discovery." Ohio University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1541609328071533.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Ronen, Jonathan. "Integrative analysis of data from multiple experiments." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/21612.

Full text
Abstract:
Auf die Entwicklung der Hochdurchsatz-Sequenzierung (HTS) folgte eine Reihe von speziellen Erweiterungen, die erlauben verschiedene zellbiologischer Aspekte wie Genexpression, DNA-Methylierung, etc. zu messen. Die Analyse dieser Daten erfordert die Entwicklung von Algorithmen, die einzelne Experimenteberücksichtigen oder mehrere Datenquellen gleichzeitig in betracht nehmen. Der letztere Ansatz bietet besondere Vorteile bei Analyse von einzelligen RNA-Sequenzierung (scRNA-seq) Experimenten welche von besonders hohem technischen Rauschen, etwa durch den Verlust an Molekülen durch die Behandlung geringer Ausgangsmengen, gekennzeichnet sind. Um diese experimentellen Defizite auszugleichen, habe ich eine Methode namens netSmooth entwickelt, welche die scRNA-seq-Daten entrascht und fehlende Werte mittels Netzwerkdiffusion über ein Gennetzwerk imputiert. Das Gennetzwerk reflektiert dabei erwartete Koexpressionsmuster von Genen. Unter Verwendung eines Gennetzwerks, das aus Protein-Protein-Interaktionen aufgebaut ist, zeige ich, dass netSmooth anderen hochmodernen scRNA-Seq-Imputationsmethoden bei der Identifizierung von Blutzelltypen in der Hämatopoese, zur Aufklärung von Zeitreihendaten unter Verwendung eines embryonalen Entwicklungsdatensatzes und für die Identifizierung von Tumoren der Herkunft für scRNA-Seq von Glioblastomen überlegen ist. netSmooth hat einen freien Parameter, die Diffusionsdistanz, welche durch datengesteuerte Metriken optimiert werden kann. So kann netSmooth auch dann eingesetzt werden, wenn der optimale Diffusionsabstand nicht explizit mit Hilfe von externen Referenzdaten optimiert werden kann. Eine integrierte Analyse ist auch relevant wenn multi-omics Daten von mehrerer Omics-Protokolle auf den gleichen biologischen Proben erhoben wurden. Hierbei erklärt jeder einzelne dieser Datensätze nur einen Teil des zellulären Systems, während die gemeinsame Analyse ein vollständigeres Bild ergibt. Ich entwickelte eine Methode namens maui, um eine latente Faktordarstellungen von multiomics Daten zu finden.
The development of high throughput sequencing (HTS) was followed by a swarm of protocols utilizing HTS to measure different molecular aspects such as gene expression (transcriptome), DNA methylation (methylome) and more. This opened opportunities for developments of data analysis algorithms and procedures that consider data produced by different experiments. Considering data from seemingly unrelated experiments is particularly beneficial for Single cell RNA sequencing (scRNA-seq). scRNA-seq produces particularly noisy data, due to loss of nucleic acids when handling the small amounts in single cells, and various technical biases. To address these challenges, I developed a method called netSmooth, which de-noises and imputes scRNA-seq data by applying network diffusion over a gene network which encodes expectations of co-expression patterns. The gene network is constructed from other experimental data. Using a gene network constructed from protein-protein interactions, I show that netSmooth outperforms other state-of-the-art scRNA-seq imputation methods at the identification of blood cell types in hematopoiesis, as well as elucidation of time series data in an embryonic development dataset, and identification of tumor of origin for scRNA-seq of glioblastomas. netSmooth has a free parameter, the diffusion distance, which I show can be selected using data-driven metrics. Thus, netSmooth may be used even in cases when the diffusion distance cannot be optimized explicitly using ground-truth labels. Another task which requires in-tandem analysis of data from different experiments arises when different omics protocols are applied to the same biological samples. Analyzing such multiomics data in an integrated fashion, rather than each data type (RNA-seq, DNA-seq, etc.) on its own, is benefitial, as each omics experiment only elucidates part of an integrated cellular system. The simultaneous analysis may reveal a comprehensive view.
APA, Harvard, Vancouver, ISO, and other styles
11

Denecker, Thomas. "Bioinformatique et analyse de données multiomiques : principes et applications chez les levures pathogènes Candida glabrata et Candida albicans Functional networks of co-expressed genes to explore iron homeostasis processes in the pathogenic yeast Candida glabrata Efficient, quick and easy-to-use DNA replication timing analysis with START-R suite FAIR_Bioinfo: a turnkey training course and protocol for reproducible computational biology Label-free quantitative proteomics in Candida yeast species: technical and biological replicates to assess data reproducibility Rendre ses projets R plus accessibles grâce à Shiny Pixel: a content management platform for quantitative omics data Empowering the detection of ChIP-seq "basic peaks" (bPeaks) in small eukaryotic genomes with a web user-interactive interface A hypothesis-driven approach identifies CDK4 and CDK6 inhibitors as candidate drugs for treatments of adrenocortical carcinomas Characterization of the replication timing program of 6 human model cell lines." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL010.

Full text
Abstract:
Plusieurs évolutions sont constatées dans la recherche en biologie. Tout d’abord, les études menées reposent souvent sur des approches expérimentales quantitatives. L’analyse et l’interprétation des résultats requièrent l’utilisation de l’informatique et des statistiques. Également, en complément des études centrées sur des objets biologiques isolés, les technologies expérimentales haut débit permettent l’étude des systèmes (caractérisation des composants du système ainsi que des interactions entre ces composants). De très grandes quantités de données sont disponibles dans les bases de données publiques, librement réutilisables pour de nouvelles problématiques. Enfin, les données utiles pour les recherches en biologie sont très hétérogènes (données numériques, de textes, images, séquences biologiques, etc.) et conservées sur des supports d’information également très hétérogènes (papiers ou numériques). Ainsi « l’analyse de données » s’est petit à petit imposée comme une problématique de recherche à part entière et en seulement une dizaine d’années, le domaine de la « Bioinformatique » s’est en conséquence totalement réinventé. Disposer d’une grande quantité de données pour répondre à un questionnement biologique n’est souvent pas le défi principal. La vraie difficulté est la capacité des chercheurs à convertir les données en information, puis en connaissance. Dans ce contexte, plusieurs problématiques de recherche en biologie ont été abordées lors de cette thèse. La première concerne l’étude de l’homéostasie du fer chez la levure pathogène Candida glabrata. La seconde concerne l’étude systématique des modifications post-traductionnelles des protéines chez la levure pathogène Candida albicans. Pour ces deux projets, des données « omiques » ont été exploitées : transcriptomiques et protéomiques. Des outils bioinformatiques et des outils d’analyses ont été implémentés en parallèle conduisant à l’émergence de nouvelles hypothèses de recherche en biologie. Une attention particulière et constante a aussi été portée sur les problématiques de reproductibilité et de partage des résultats avec la communauté scientifique
Biological research is changing. First, studies are often based on quantitative experimental approaches. The analysis and the interpretation of the obtained results thus need computer science and statistics. Also, together with studies focused on isolated biological objects, high throughput experimental technologies allow to capture the functioning of biological systems (identification of components as well as the interactions between them). Very large amounts of data are also available in public databases, freely reusable to solve new open questions. Finally, the data in biological research are heterogeneous (digital data, texts, images, biological sequences, etc.) and stored on multiple supports (paper or digital). Thus, "data analysis" has gradually emerged as a key research issue, and in only ten years, the field of "Bioinformatics" has been significantly changed. Having a large amount of data to answer a biological question is often not the main challenge. The real challenge is the ability of researchers to convert the data into information and then into knowledge. In this context, several biological research projects were addressed in this thesis. The first concerns the study of iron homeostasis in the pathogenic yeast Candida glabrata. The second concerns the systematic investigation of post-translational modifications of proteins in the pathogenic yeast Candida albicans. In these two projects, omics data were used: transcriptomics and proteomics. Appropriate bioinformatics and analysis tools were developed, leading to the emergence of new research hypotheses. Particular and constant attention has also been paid to the question of data reproducibility and sharing of results with the scientific community
APA, Harvard, Vancouver, ISO, and other styles
12

Czerwińska, Urszula. "Unsupervised deconvolution of bulk omics profiles : methodology and application to characterize the immune landscape in tumors Determining the optimal number of independent components for reproducible transcriptomic data analysis Application of independent component analysis to tumor transcriptomes reveals specific and reproducible immune-related signals A multiscale signalling network map of innate immune response in cancer reveals signatures of cell heterogeneity and functional polarization." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCB075.

Full text
Abstract:
Les tumeurs sont entourées d'un microenvironnement complexe comprenant des cellules tumorales, des fibroblastes et une diversité de cellules immunitaires. Avec le développement actuel des immunothérapies, la compréhension de la composition du microenvironnement tumoral est d'une importance critique pour effectuer un pronostic sur la progression tumorale et sa réponse au traitement. Cependant, nous manquons d'approches quantitatives fiables et validées pour caractériser le microenvironnement tumoral, facilitant ainsi le choix de la meilleure thérapie. Une partie de ce défi consiste à quantifier la composition cellulaire d'un échantillon tumoral (appelé problème de déconvolution dans ce contexte), en utilisant son profil omique de masse (le profil quantitatif global de certains types de molécules, tels que l'ARNm ou les marqueurs épigénétiques). La plupart des méthodes existantes utilisent des signatures prédéfinies de types cellulaires et ensuite extrapolent cette information à des nouveaux contextes. Cela peut introduire un biais dans la quantification de microenvironnement tumoral dans les situations où le contexte étudié est significativement différent de la référence. Sous certaines conditions, il est possible de séparer des mélanges de signaux complexes, en utilisant des méthodes de séparation de sources et de réduction des dimensions, sans définitions de sources préexistantes. Si une telle approche (déconvolution non supervisée) peut être appliquée à des profils omiques de masse de tumeurs, cela permettrait d'éviter les biais contextuels mentionnés précédemment et fournirait un aperçu des signatures cellulaires spécifiques au contexte. Dans ce travail, j'ai développé une nouvelle méthode appelée DeconICA (Déconvolution de données omiques de masse par l'analyse en composantes immunitaires), basée sur la méthodologie de séparation aveugle de source. DeconICA a pour but l'interprétation et la quantification des signaux biologiques, façonnant les profils omiques d'échantillons tumoraux ou de tissus normaux, en mettant l'accent sur les signaux liés au système immunitaire et la découverte de nouvelles signatures. Afin de rendre mon travail plus accessible, j'ai implémenté la méthode DeconICA en tant que librairie R. En appliquant ce logiciel aux jeux de données de référence, j'ai démontré qu'il est possible de quantifier les cellules immunitaires avec une précision comparable aux méthodes de pointe publiées, sans définir a priori des gènes spécifiques au type cellulaire. DeconICA peut fonctionner avec des techniques de factorisation matricielle telles que l'analyse indépendante des composants (ICA) ou la factorisation matricielle non négative (NMF). Enfin, j'ai appliqué DeconICA à un grand volume de données : plus de 100 jeux de données, contenant au total plus de 28 000 échantillons de 40 types de tumeurs, générés par différentes technologies et traités indépendamment. Cette analyse a démontré que les signaux immunitaires basés sur l'ICA sont reproductibles entre les différents jeux de données. D'autre part, nous avons montré que les trois principaux types de cellules immunitaires, à savoir les lymphocytes T, les lymphocytes B et les cellules myéloïdes, peuvent y être identifiés et quantifiés. Enfin, les métagènes dérivés de l'ICA, c'est-à-dire les valeurs de projection associées à une source, ont été utilisés comme des signatures spécifiques permettant d'étudier les caractéristiques des cellules immunitaires dans différents types de tumeurs. L'analyse a révélé une grande diversité de phénotypes cellulaires identifiés ainsi que la plasticité des cellules immunitaires, qu'elle soit dépendante ou indépendante du type de tumeur. Ces résultats pourraient être utilisés pour identifier des cibles médicamenteuses ou des biomarqueurs pour l'immunothérapie du cancer
Tumors are engulfed in a complex microenvironment (TME) including tumor cells, fibroblasts, and a diversity of immune cells. Currently, a new generation of cancer therapies based on modulation of the immune system response is in active clinical development with first promising results. Therefore, understanding the composition of TME in each tumor case is critically important to make a prognosis on the tumor progression and its response to treatment. However, we lack reliable and validated quantitative approaches to characterize the TME in order to facilitate the choice of the best existing therapy. One part of this challenge is to be able to quantify the cellular composition of a tumor sample (called deconvolution problem in this context), using its bulk omics profile (global quantitative profiling of certain types of molecules, such as mRNA or epigenetic markers). In recent years, there was a remarkable explosion in the number of methods approaching this problem in several different ways. Most of them use pre-defined molecular signatures of specific cell types and extrapolate this information to previously unseen contexts. This can bias the TME quantification in those situations where the context under study is significantly different from the reference. In theory, under certain assumptions, it is possible to separate complex signal mixtures, using classical and advanced methods of source separation and dimension reduction, without pre-existing source definitions. If such an approach (unsupervised deconvolution) is feasible to apply for bulk omic profiles of tumor samples, then this would make it possible to avoid the above mentioned contextual biases and provide insights into the context-specific signatures of cell types. In this work, I developed a new method called DeconICA (Deconvolution of bulk omics datasets through Immune Component Analysis), based on the blind source separation methodology. DeconICA has an aim to decipher and quantify the biological signals shaping omics profiles of tumor samples or normal tissues. A particular focus of my study was on the immune system-related signals and discovering new signatures of immune cell types. In order to make my work more accessible, I implemented the DeconICA method as an R package named "DeconICA". By applying this software to the standard benchmark datasets, I demonstrated that DeconICA is able to quantify immune cells with accuracy comparable to published state-of-the-art methods but without a priori defining a cell type-specific signature genes. The implementation can work with existing deconvolution methods based on matrix factorization techniques such as Independent Component Analysis (ICA) or Non-Negative Matrix Factorization (NMF). Finally, I applied DeconICA to a big corpus of data containing more than 100 transcriptomic datasets composed of, in total, over 28000 samples of 40 tumor types generated by different technologies and processed independently. This analysis demonstrated that ICA-based immune signals are reproducible between datasets and three major immune cell types: T-cells, B-cells and Myeloid cells can be reliably identified and quantified. Additionally, I used the ICA-derived metagenes as context-specific signatures in order to study the characteristics of immune cells in different tumor types. The analysis revealed a large diversity and plasticity of immune cells dependent and independent on tumor type. Some conclusions of the study can be helpful in identification of new drug targets or biomarkers for immunotherapy of cancer
APA, Harvard, Vancouver, ISO, and other styles
13

Voillet, Valentin. "Approche intégrative du développement musculaire afin de décrire le processus de maturation en lien avec la survie néonatale." Thesis, Toulouse, INPT, 2016. http://www.theses.fr/2016INPT0067/document.

Full text
Abstract:
Depuis plusieurs années, des projets d'intégration de données omiques se sont développés, notamment avec objectif de participer à la description fine de caractères complexes d'intérêt socio-économique. Dans ce contexte, l'objectif de cette thèse est de combiner différentes données omiques hétérogènes afin de mieux décrire et comprendre le dernier tiers de gestation chez le porc, période influençant la mortinatalité porcine. Durant cette thèse, nous avons identifié les bases moléculaires et cellulaires sous-jacentes de la fin de gestation, en particulier au niveau du muscle squelettique. Ce tissu est en effet déterminant à la naissance car impliqué dans l'efficacité de plusieurs fonctions physiologiques comme la thermorégulation et la capacité à se déplacer. Au niveau du plan expérimental, les tissus analysés proviennent de foetus prélevés à 90 et 110 jours de gestation (naissance à 114 jours), issus de deux lignées extrêmes pour la mortalité à la naissance, Large White et Meishan, et des deux croisements réciproques. Au travers l'application de plusieurs études statistiques et computationnelles (analyses multidimensionnelles, inférence de réseaux, clustering et intégration de données), nous avons montré l'existence de mécanismes biologiques régulant la maturité musculaire chez les porcelets, mais également chez d'autres espèces d'intérêt agronomique (bovin et mouton). Quelques gènes et protéines ont été identifiées comme étant fortement liées à la mise en place du métabolisme énergétique musculaire durant le dernier tiers de gestation. Les porcelets ayant une immaturité du métabolisme musculaire seraient sujets à un plus fort risque de mortalité à la naissance. Un second volet de cette thèse concerne l'imputation de données manquantes (tout un groupe de variables pour un individu) dans les méthodes d'analyses multidimensionnelles, comme l'analyse factorielle multiple (AFM) (ou multiple factor analysis (MFA)). Dans notre contexte, l'AFM fut particulièrement intéressante pour l'intégration de données d'un ensemble d'individus sur différents tissus (deux ou plus). Afin de conserver ces individus manquants pour tout un groupe de variables, nous avons développé une méthode, appelée MI-MFA (multiple imputation - MFA), permettant l'estimation des composantes de l'AFM pour ces individus manquants
Over the last decades, some omics data integration studies have been developed to participate in the detailed description of complex traits with socio-economic interests. In this context, the aim of the thesis is to combine different heterogeneous omics data to better describe and understand the last third of gestation in pigs, period influencing the piglet mortality at birth. In the thesis, we better defined the molecular and cellular basis underlying the end of gestation, with a focus on the skeletal muscle. This tissue is specially involved in the efficiency of several physiological functions, such as thermoregulation and motor functions. According to the experimental design, tissues were collected at two days of gestation (90 or 110 days of gestation) from four fetal genotypes. These genotypes consisted in two extreme breeds for mortality at birth (Meishan and Large White) and two reciprocal crosses. Through statistical and computational analyses (descriptive analyses, network inference, clustering and biological data integration), we highlighted some biological mechanisms regulating the maturation process in pigs, but also in other livestock species (cattle and sheep). Some genes and proteins were identified as being highly involved in the muscle energy metabolism. Piglets with a muscular metabolism immaturity would be associated with a higher risk of mortality at birth. A second aspect of the thesis was the imputation of missing individual row values in the multidimensional statistical method framework, such as the multiple factor analysis (MFA). In our context, MFA was particularly interesting in integrating data coming from the same individuals on different tissues (two or more). To avoid missing individual row values, we developed a method, called MI-MFA (multiple imputation - MFA), allowing the estimation of the MFA components for these missing individuals
APA, Harvard, Vancouver, ISO, and other styles
14

Hulot, Audrey. "Analyses de données omiques : clustering et inférence de réseaux Female ponderal index at birth and idiopathic infertility." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL034.

Full text
Abstract:
Le développement des méthodes de biologie haut-débit (séquençage et spectrométrie de masse) a permis de générer de grandes masses de données, dites -omiques, qui nous aident à mieux comprendre les processus biologiques.Cependant, isolément, chaque source -omique ne permet d'expliquer que partiellement ces processus. Mettre en relation les différentes sources de donnés -omiques devrait permettre de mieux comprendre les processus biologiques mais constitue un défi considérable.Dans cette thèse, nous nous intéressons particulièrement aux méthodes de clustering et d’inférence de réseaux, appliquées aux données -omiques.La première partie du manuscrit présente trois méthodes. Les deux premières méthodes sont applicables dans un contexte où les données peuvent être de nature hétérogène.La première concerne un algorithme d’agrégation d’arbres, permettant la construction d’un clustering hiérarchique consensus. La complexité sous-quadratique de cette méthode a fait l’objet d’une démonstration, et permet son application dans un contexte de grande dimension. Cette méthode est disponible dans le package R mergeTrees, accessible sur le CRAN.La seconde méthode concerne l’intégration de données provenant d’arbres ou de réseaux, en transformant les objets via la distance cophénétique ou via le plus court chemin, en matrices de distances. Elle utilise le Multidimensional Scaling et l’Analyse Factorielle Multiple et peut servir à la construction d’arbres et de réseaux consensus.Enfin, dans une troisième méthode, on se place dans le contexte des modèles graphiques gaussiens, et cherchons à estimer un graphe, ainsi que des communautés d’entités, à partir de plusieurs tables de données. Cette méthode est basée sur la combinaison d’un Stochastic Block Model, un Latent block Model et du Graphical Lasso.Cette thèse présente en deuxième partie les résultats d’une étude de données transcriptomiques et métagénomiques, réalisée dans le cadre d’un projet appliqué, sur des données concernant la Spondylarthrite ankylosante
The development of biological high-throughput technologies (next-generation sequencing and mass spectrometry) have provided researchers with a large amount of data, also known as -omics, that help better understand the biological processes.However, each source of data separately explains only a very small part of a given process. Linking the differents -omics sources between them should help us understand more of these processes.In this manuscript, we will focus on two approaches, clustering and network inference, applied to omics data.The first part of the manuscript presents three methodological developments on this topic. The first two methods are applicable in a situation where the data are heterogeneous.The first method is an algorithm for aggregating trees, in order to create a consensus out of a set of trees. The complexity of the process is sub-quadratic, allowing to use it on data leading to a great number of leaves in the trees. This algorithm is available in an R-package named mergeTrees on the CRAN.The second method deals with the integration data from trees and networks, by transforming these objects into distance matrices using cophenetic and shortest path distances, respectively. This method relies on Multidimensional Scaling and Multiple Factor Analysis and can be also used to build consensus trees or networks.Finally, we use the Gaussian Graphical Models setting and seek to estimate a graph, as well as communities in the graph, from several tables. This method is based on a combination of Stochastic Block Model, Latent Block Model and Graphical Lasso.The second part of the manuscript presents analyses conducted on transcriptomics and metagenomics data to identify targets to gain insight into the predisposition of Ankylosing Spondylitis
APA, Harvard, Vancouver, ISO, and other styles
15

Elmansy, Dalia F. "Computational Methods to Characterize the Etiology of Complex Diseases at Multiple Levels." Case Western Reserve University School of Graduate Studies / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=case1583416431321447.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Teng, Sin Yong. "Intelligent Energy-Savings and Process Improvement Strategies in Energy-Intensive Industries." Doctoral thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-433427.

Full text
Abstract:
S tím, jak se neustále vyvíjejí nové technologie pro energeticky náročná průmyslová odvětví, stávající zařízení postupně zaostávají v efektivitě a produktivitě. Tvrdá konkurence na trhu a legislativa v oblasti životního prostředí nutí tato tradiční zařízení k ukončení provozu a k odstavení. Zlepšování procesu a projekty modernizace jsou zásadní v udržování provozních výkonů těchto zařízení. Současné přístupy pro zlepšování procesů jsou hlavně: integrace procesů, optimalizace procesů a intenzifikace procesů. Obecně se v těchto oblastech využívá matematické optimalizace, zkušeností řešitele a provozní heuristiky. Tyto přístupy slouží jako základ pro zlepšování procesů. Avšak, jejich výkon lze dále zlepšit pomocí moderní výpočtové inteligence. Účelem této práce je tudíž aplikace pokročilých technik umělé inteligence a strojového učení za účelem zlepšování procesů v energeticky náročných průmyslových procesech. V této práci je využit přístup, který řeší tento problém simulací průmyslových systémů a přispívá následujícím: (i)Aplikace techniky strojového učení, která zahrnuje jednorázové učení a neuro-evoluci pro modelování a optimalizaci jednotlivých jednotek na základě dat. (ii) Aplikace redukce dimenze (např. Analýza hlavních komponent, autoendkodér) pro vícekriteriální optimalizaci procesu s více jednotkami. (iii) Návrh nového nástroje pro analýzu problematických částí systému za účelem jejich odstranění (bottleneck tree analysis – BOTA). Bylo také navrženo rozšíření nástroje, které umožňuje řešit vícerozměrné problémy pomocí přístupu založeného na datech. (iv) Prokázání účinnosti simulací Monte-Carlo, neuronové sítě a rozhodovacích stromů pro rozhodování při integraci nové technologie procesu do stávajících procesů. (v) Porovnání techniky HTM (Hierarchical Temporal Memory) a duální optimalizace s několika prediktivními nástroji pro podporu managementu provozu v reálném čase. (vi) Implementace umělé neuronové sítě v rámci rozhraní pro konvenční procesní graf (P-graf). (vii) Zdůraznění budoucnosti umělé inteligence a procesního inženýrství v biosystémech prostřednictvím komerčně založeného paradigmatu multi-omics.
APA, Harvard, Vancouver, ISO, and other styles
17

"Sparse Models For Multimodal Imaging And Omics Data Integration." 2015.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
18

Nersisyan, Lilit. "Telomere analysis based on high-throughput multi-omics data." Doctoral thesis, 2017. https://ul.qucosa.de/id/qucosa%3A16297.

Full text
Abstract:
Telomeres are repeated sequences at the ends of eukaryotic chromosomes that play prominent role in normal aging and disease development. They are dynamic structures that normally shorten over the lifespan of a cell, but can be elongated in cells with high proliferative capacity. Telomere elongation in stem cells is an advantageous mechanism that allows them to maintain the regenerative capacity of tissues, however, it also allows for survival of cancer cells, thus leading to development of malignancies. Numerous studies have been conducted to explore the role of telomeres in health and disease. However, the majority of these studies have focused on consequences of extreme shortening of telomeres that lead to telomere dysfunction, replicative arrest or chromosomal instability. Very few studies have addressed the regulatory roles of telomeres, and the association of genomic, transcriptomic and epigenomic characteristics of a cell with telomere length dynamics. Scarcity of such studies is partially conditioned by the low-throughput nature of experimental approaches for telomere length measurement and the fact that they do not easily integrate with currently available high-throughput data. In this thesis, we have attempted to build algorithms, in silico pipelines and software packages to utilize high-throughput –omics data for telomere biology research. First, we have developed a software package Computel, to compute telomere length from whole genome next generation sequencing data. We show that it can be used to integrate telomere length dynamics into systems biology research. Using Computel, we have studied the association of telomere length with genomic variations in a healthy human population, as well as with transcriptomic and epigenomic features of lung cancers. Another aim of our study was to develop in silico models to assess the activity of telomere maintenance machanisms (TMM) based on gene expression data. There are two main TMMs: one based on the catalytic activity of ribonucleoprotein complex telomerase, and the other based on recombination events between telomeric sequences. Which type of TMM gets activated in a cancer cell determines the aggressiveness of the tumor and the outcome of the disease. Investigation into TMM mechanisms is valuable not only for basic research, but also for applied medicine, since many anticancer therapies attempt to inhibit the TMM in cancer cells to stop their growth. Therefore, studying the activation mechanisms and regulators of TMMs is of paramount importance for understanding cancer pathomechanisms and for treatment. Many studies have addressed this topic, however many aspects of TMM activation and realization still remain elusive. Additionally, current data-mining pipelines and functional annotation approaches of phenotype-associated genes are not adapted for identification of TMMs. To overcome these limitations, we have constructed pathway networks for the two TMMs based on literature, and have developed a methodology for assessment of TMM pathway activities from gene expression data. We have described the accuracy of our TMM-based approach on a set of cancer samples with experimentally validated TMMs. We have also applied it to explore TMM activity states in lung adenocarcinoma cell lines. In summary, recent developments of high-throughput technologies allow for production of data on multiple levels of cellular organization – from genomic and transcriptiomic to epigenomic. This has allowed for rapid development of various directions in molecular and cellular biology. In contrast, telomere research, although at the heart of stem cell and cancer studies, is still conducted with low-throughput experimental approaches. Here, we have attempted to utilize the huge amount of currently accumulated multi-omics data to foster telomere research and to bring it to systems biology scale.
APA, Harvard, Vancouver, ISO, and other styles
19

Papież, Anna. "Integrative data analysis methods in multi-omics molecular biology studies for disease of affluence biomarker research." Rozprawa doktorska, 2019. https://repolis.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=59005.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Papież, Anna. "Integrative data analysis methods in multi-omics molecular biology studies for disease of affluence biomarker research." Rozprawa doktorska, 2019. https://delibra.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=59005.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

DONATO, LUIGI. "New omics approaches improving classification and personalized retinitis pigmentosa diagnosis." Doctoral thesis, 2018. http://hdl.handle.net/11570/3130272.

Full text
Abstract:
La Retinite Pigmentosa (RP) è una patologia ereditaria oculare altamente eterogenea, caratterizzata da una progressiva degenerazione retinica. È considerata una malattia rara, anche se la prevalenza a livello mondiale dimostra il contrario, variando approssimativamente da 1:9000 ad 1:750, in base alla localizzazione geografica. I soggetti affetti da RP sono circa 1-5 su 10.000 in Italia, mentre non sono disponibili dati precisi sulla frequenza in Sicilia. Il termine “pigmentosa” fa riferimento al caratteristico aspetto dato da accumuli di pigmento a livello retinico negli stadi più avanzati della patologia. Il primo sintomo generalmente presente è dato dalla ridotta visione notturna, seguito da una progressiva perdita del campo visivo periferico (visione a “tunnel”). L’elevatissima eterogeneità mostrata dai pazienti affetti da RP è sapientemente dimostrata dall’ampio numero di difetti genetici associati alla patologia. Ad oggi, più di 80 geni risultano coinvolti nelle forme non sindromiche di RP, ma questo elenco è in costante aggiornamento. Ognuno di questi geni appartiene ad un sottotipo gene-specifico di RP con un peculiare spettro fenotipico. In più, molti fattori possono variare ampiamente all’interno di questi particolari sottotipi, anche all’interno dei membri appartenenti alla stessa famiglia, suggerendo la presenza di fattori genetici/ambientali ancora sconosciuti che possono influenzare l’eziopatogenesi della malattia. Scopo primario di questo lavoro è migliorare la classificazione di forme orfane di RP, unitamente al chiarire maggiormente i meccanismi eziopatogenetici di quelle già note. Tali obiettivi sono stati realizzati utilizzando un approccio omico complesso, in grado di trarre beneficio dall’utilizzo delle moderne tecniche di Next Generation Sequencing e della derivata analisi del “Big Data”, portando alla luce i lati più eterogenei e nascosti della RP. È stato identificato un cluster complesso di varianti-geni candidati associati a forme di RP le cui cause genetiche erano prima sconosciute. Questi geni (PDE4DIP, LTF, CCDC175, GLOI, PPEF2, RNF144B, RDH13, FLT3, RCVRN, GNB1, GNGT1, GRK7, ARRB1, RLBP1, CACNG8, RXRG, PAX2, EYS, MYO7A, RP1, ABCA4, ELOVL4) codificano per varie proteine coinvolte nella fisiologia dell’occhio e della retina. In più, sono stati trovati nuovi “macro – pathway” candidati con numerosi sub – pathway che potrebbero essere coinvolti nell’eziopatogenesi della RP. Tra questi, sono risultate di notevole rilievo le variazioni di espressione in geni coinvolti nella regolazione del ciclo cellulare, nel trafficking cellulare, nella migrazione cellulare, nello stress del reticolo endoplasmatico, nell’attività chaperonica, nel signaling mediato dalle small GTPasi, nel ciclo dell’acido retinoico in relazione all’epigenetica, in difetti microvascolari, nell’instabilità cromosomica, nei ritmi circadiani correlati al metabolismo lipidico, nell’integrità delle sinapsi, nel complesso JUN e nella sopravvivenza delle cellule retiniche. Oltre ai geni codificanti, 3155 lncRNAs e 162 sncRNAs hanno mostrato alterazioni di espressione in esperimenti di trascrittomica, ncRNAs aventi come target host gene coinvolti in numerosi pathway biochimici correlati alle funzioni visive. Uno di questi, corroborato da analisi di Whole Genome Sequencing, è inerente alla compromissione sinaptica della neurotrasmissione retinica, che potrebbe essere considerato, per la prima volta, un serio candidato all’eziopatogenesi della RP. I risultati ottenuti potrebbero rappresentare un importante passo verso la scoperta e la classificazione di forme di RP ad oggi sconosciute. Gli step successivi dovrebbero prevedere l’analisi genetica di pazienti affetti da tali forme orfane di RP, alla ricerca di varianti nei geni candidati precedentemente selezionati, al fine di stabilire nuove correlazioni genotipo – fenotipo. Infine, la “caratterizzazione molecolare” di tali pazienti potrà permettere future inclusioni degli stessi in trial clinici basati sulla terapia genica.
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography