Journal articles on the topic 'Omics data analysis'

Author: Grafiati

Published: 9 March 2023

Last updated: 10 March 2023

Consult the top 50 journal articles for your research on the topic 'Omics data analysis'.

1

Rappoport, Nimrod, and Ron Shamir. "NEMO: cancer subtyping by integration of partial multi-omic data." Bioinformatics 35, no. 18 (January 30, 2019): 3348–56. http://dx.doi.org/10.1093/bioinformatics/btz058.

Abstract:

Motivation: Cancer subtypes were usually defined based on molecular characterization of single omic data. Increasingly, measurements of multiple omic profiles for the same cohort are available. Defining cancer subtypes using multi-omic data may improve our understanding of cancer, and suggest more precise treatment for patients. Results: We present NEMO (NEighborhood based Multi-Omics clustering), a novel algorithm for multi-omics clustering. Importantly, NEMO can be applied to partial datasets in which some patients have data for only a subset of the omics, without performing data imputation. In extensive testing on ten cancer datasets spanning 3168 patients, NEMO achieved results comparable to the best of nine state-of-the-art multi-omics clustering algorithms on full data and showed an improvement on partial data. On some of the partial data tests, PVC, a multi-view algorithm, performed better, but it is limited to two omics and to positive partial data. Finally, we demonstrate the advantage of NEMO in detailed analysis of partial data of AML patients. NEMO is fast and much simpler than existing multi-omics clustering algorithms, and avoids iterative optimization. Availability and implementation: Code for NEMO and for reproducing all NEMO results in this paper is on GitHub: https://github.com/Shamir-Lab/NEMO. Supplementary information: Supplementary data are available at Bioinformatics online.

2

Lancaster, Samuel M., Akshay Sanghi, Si Wu, and Michael P. Snyder. "A Customizable Analysis Flow in Integrative Multi-Omics." Biomolecules 10, no. 12 (November 27, 2020): 1606. http://dx.doi.org/10.3390/biom10121606.

Abstract:

The number of researchers using multi-omics is growing. Though still expensive, every year it is cheaper to perform multi-omic studies, often exponentially so. In addition to its increasing accessibility, multi-omics reveals a view of systems biology to an unprecedented depth. Thus, multi-omics can be used to answer a broad range of biological questions in finer resolution than previous methods. We used six omic measurements—four nucleic acid (i.e., genomic, epigenomic, transcriptomics, and metagenomic) and two mass spectrometry (proteomics and metabolomics) based—to highlight an analysis workflow on this type of data, which is often vast. This workflow is not exhaustive of all the omic measurements or analysis methods, but it will provide an experienced or even a novice multi-omic researcher with the tools necessary to analyze their data. This review begins with analyzing a single ome and study design, and then synthesizes best practices in data integration techniques that include machine learning. Furthermore, we delineate methods to validate findings from multi-omic integration. Ultimately, multi-omic integration offers a window into the complexity of molecular interactions and a comprehensive view of systems biology.

3

Oromendia, Ana, Dorina Ismailgeci, Michele Ciofii, Taylor Donnelly, Linda Bojmar, John Jyazbek, Arnaub Chatterjee, David Lyden, Kenneth H. Yu, and David Paul Kelsen. "Error-free, automated data integration of exosome cargo protein data with extensive clinical data in an ongoing, multi-omic translational research study." Journal of Clinical Oncology 38, no. 15_suppl (May 20, 2020): e16743-e16743. http://dx.doi.org/10.1200/jco.2020.38.15_suppl.e16743.

Abstract:

Background: Major advances in understanding the biology of cancer have come from genomic analysis of tumor and normal tissue. Integrating extensive patient-related data with deep analysis of omic data is crucial to informing omic data interpretation. Currently, such integrations are a highly manual, asynchronous, and costly process as well as error-prone and time-consuming. To develop new blood assays that may detect very early stage PDAC, a multi-omic investigation with deep clinical annotation is needed. Using pilot data from an on-going study, we test a new platform allowing automated error-free integration of an extensive clinical database with extensive omic data. Methods: Demographic, clinical, family pedigree and pathology data were collected on the Rave EDC platform. Exosomes were purified from 46 plasma samples from 14 controls and 24 PDAC patients and cargo proteins were quantified via SILAC. The Rave Omics platform was used to ingest and integrate clinical and omic data, run quality checks and generate integrated clinical-omic datasets. Data fidelity was validated by systematically computing differences between corresponding values in the source files and those present in the extracted data object (integrated data). The root mean squared error (RMSE) was calculated for numeric values in each sample. Additional validation was conducted by manual inspection to ascertain data integrity. Results: We demonstrated automatic integration, without human intervention, of a subset of the clinical data and all available SILAC data into an analysis-ready data object. Data transfer was completely faithful, with 100% concordance between the source and the integrated data without loss of features. All proteins (n = 1515) and clinical variables (n = 64) were imported. Their nomenclature and corresponding sample values (n = 69690) and clinical values (n = 2432) matched exactly between datasets. In all samples, the RMSE was exactly zero, indicating no deviation between data sources. Conclusions: We demonstrated that automatic, efficient, and reliable integration of clinical-omic data is achievable during an in-flight PDAC trial. Automatic exploratory analytics supporting biomarker discovery are currently being used to uncover associations between omic and clinical features. The Rave Omics platform is disease-agnostic and we plan to expand to trials of varying size, indication, and completion status where systematic, automated integration of clinical and (multi)omic data is needed.
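
The fidelity check described in this abstract (differences between source and integrated values, summarised as a per-sample RMSE) can be illustrated with a minimal Python sketch. It is not the Rave Omics implementation: the function name `per_sample_rmse`, the nested-dictionary layout, and the toy protein values are all assumptions made for illustration.

```python
import math


def per_sample_rmse(source, integrated):
    """Return {sample_id: RMSE} between two {sample_id: {feature: value}} tables.

    Hypothetical helper: the data layout is an assumption, not taken from
    the abstract. Raises ValueError if samples or features do not match.
    """
    if source.keys() != integrated.keys():
        raise ValueError("sample identifiers differ between tables")
    result = {}
    for sample_id, src_values in source.items():
        int_values = integrated[sample_id]
        if src_values.keys() != int_values.keys():
            raise ValueError(f"feature names differ for sample {sample_id}")
        if not src_values:
            raise ValueError(f"no numeric features for sample {sample_id}")
        # Squared difference for every feature measured in this sample.
        squared = [(src_values[f] - int_values[f]) ** 2 for f in src_values]
        result[sample_id] = math.sqrt(sum(squared) / len(squared))
    return result


if __name__ == "__main__":
    # Toy values, invented for the example.
    source = {"s1": {"protA": 1.5, "protB": 2.0}, "s2": {"protA": 0.3, "protB": 4.1}}
    integrated = {"s1": {"protA": 1.5, "protB": 2.0}, "s2": {"protA": 0.3, "protB": 4.1}}
    # Identical tables give an RMSE of 0.0 for every sample.
    print(per_sample_rmse(source, integrated))
```

An RMSE of zero for a sample means every numeric value survived the transfer unchanged; any non-zero value points to the sample that needs inspection.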

4

Madrid-Márquez, Laura, Cristina Rubio-Escudero, Beatriz Pontes, Antonio González-Pérez, José C. Riquelme, and Maria E. Sáez. "MOMIC: A Multi-Omics Pipeline for Data Analysis, Integration and Interpretation." Applied Sciences 12, no. 8 (April 14, 2022): 3987. http://dx.doi.org/10.3390/app12083987.

Abstract:

Background and Objectives: The burst of high-throughput omics technologies has given rise to a new era in systems biology, offering an unprecedented scenario for deriving meaningful biological knowledge through the integration of different layers of information. Methods: We have developed a new software tool, MOMIC, that guides the user through the application of different analysis on a wide range of omic data, from the independent single-omics analysis to the combination of heterogeneous data at different molecular levels. Results: The proposed pipeline is developed as a collection of Jupyter notebooks, easily editable, reproducible and well documented. It can be modified to accommodate new analysis workflows and data types. It is accessible via momic.us.es, and as a docker project available at github that can be locally installed. Conclusions: MOMIC offers a complete analysis environment for analysing and integrating multi-omics data in a single, easy-to-use platform.

5

Ugidos, Manuel, Sonia Tarazona, José M. Prats-Montalbán, Alberto Ferrer, and Ana Conesa. "MultiBaC: A strategy to remove batch effects between different omic data types." Statistical Methods in Medical Research 29, no. 10 (March 4, 2020): 2851–64. http://dx.doi.org/10.1177/0962280220907365.

Abstract:

Diversity of omic technologies has expanded in the last years together with the number of omic data integration strategies. However, multiomic data generation is costly, and many research groups cannot afford research projects where many different omic techniques are generated, at least at the same time. As most researchers share their data in public repositories, different omic datasets of the same biological system obtained at different labs can be combined to construct a multiomic study. However, data obtained at different labs or moments in time are typically subjected to batch effects that need to be removed for successful data integration. While there are methods to correct batch effects on the same data types obtained in different studies, they cannot be applied to correct lab or batch effects across omics. This impairs multiomic meta-analysis. Fortunately, in many cases, at least one omics platform (i.e. gene expression) is repeatedly measured across labs, together with the additional omic modalities that are specific to each study. This creates an opportunity for batch analysis. We have developed MultiBaC (Multiomics Batch-effect Correction), a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. Our strategy is based on the existence of at least one shared data type which allows data prediction across omics. We validate this approach both on simulated data and on a case where the multiomic design is fully shared by two labs, hence batch effect correction within the same omic modality using traditional methods can be compared with the MultiBaC correction across data types. Finally, we apply MultiBaC to a true multiomic data integration problem to show that we are able to improve the detection of meaningful biological effects.

6

Yang, Xiaoxi, Yuqi Wen, Xinyu Song, Song He, and Xiaochen Bo. "Exploring the classification of cancer cell lines from multiple omic views." PeerJ 8 (August 18, 2020): e9440. http://dx.doi.org/10.7717/peerj.9440.

Abstract:

Background: Cancer classification is of great importance to understanding its pathogenesis, making diagnosis and developing treatment. The accumulation of extensive omics data from abundant cancer cell lines provides a basis for large-scale classification of cancer at low cost. However, the reliability of cell lines as in vitro models of cancer has been controversial. Methods: In this study, we explore the classification of pan-cancer cell lines with single and integrated multiple omics data from the Cancer Cell Line Encyclopedia (CCLE) database. The representative omics data of cancer, mRNA data, miRNA data, copy number variation data, DNA methylation data and reverse-phase protein array data were taken into the analysis. The TumorMap web tool was used to illustrate the landscape of molecular classification. The molecular classification of patient samples was compared with cancer cell lines. Results: Eighteen molecular clusters were identified using integrated multiple omics clustering. Three pan-cancer clusters were found in integrated multiple omics clustering. By comparing with single omics clustering, we found that integrated clustering could capture both shared and complementary information from each omics data. Omics contribution analysis for clustering indicated that, although all the five omics data were of value, mRNA and proteomics data were particularly important. While the classifications were generally consistent, samples from cancer patients were more diverse than cancer cell lines. Conclusions: The clustering analysis based on integrated omics data provides a novel multi-dimensional map of cancer cell lines that can reflect the extent to which pan-cancer cell lines represent primary tumors, and an approach to evaluate the importance of omic features in cancer classification.

7

Chauvel, Cécile, Alexei Novoloaca, Pierre Veyre, Frédéric Reynier, and Jérémie Becker. "Evaluation of integrative clustering methods for the analysis of multi-omics data." Briefings in Bioinformatics 21, no. 2 (February 14, 2019): 541–52. http://dx.doi.org/10.1093/bib/bbz015.

Abstract:

Recent advances in sequencing, mass spectrometry and cytometry technologies have enabled researchers to collect large-scale omics data from the same set of biological samples. The joint analysis of multiple omics offers the opportunity to uncover coordinated cellular processes acting across different omic layers. In this work, we present a thorough comparison of a selection of recent integrative clustering approaches, including Bayesian (BCC and MDI) and matrix factorization approaches (iCluster, moCluster, JIVE and iNMF). Based on simulations, the methods were evaluated on their sensitivity and their ability to recover both the correct number of clusters and the simulated clustering at the common and data-specific levels. Standard non-integrative approaches were also included to quantify the added value of integrative methods. For most matrix factorization methods and one Bayesian approach (BCC), the shared and specific structures were successfully recovered with high and moderate accuracy, respectively. An opposite behavior was observed on non-integrative approaches, i.e. high performances on specific structures only. Finally, we applied the methods on the Cancer Genome Atlas breast cancer data set to check whether results based on experimental data were consistent with those obtained in the simulations.

8

Alizadeh, Madeline, Natalia Sampaio Moura, Alyssa Schledwitz, Seema A. Patil, Jacques Ravel, and Jean-Pierre Raufman. "Big Data in Gastroenterology Research." International Journal of Molecular Sciences 24, no. 3 (January 27, 2023): 2458. http://dx.doi.org/10.3390/ijms24032458.

Abstract:

Studying individual data types in isolation provides only limited and incomplete answers to complex biological questions and particularly falls short in revealing sufficient mechanistic and kinetic details. In contrast, multi-omics approaches to studying health and disease permit the generation and integration of multiple data types on a much larger scale, offering a comprehensive picture of biological and disease processes. Gastroenterology and hepatobiliary research are particularly well-suited to such analyses, given the unique position of the luminal gastrointestinal (GI) tract at the nexus between the gut (mucosa and luminal contents), brain, immune and endocrine systems, and GI microbiome. The generation of ‘big data’ from multi-omic, multi-site studies can enhance investigations into the connections between these organ systems and organisms and more broadly and accurately appraise the effects of dietary, pharmacological, and other therapeutic interventions. In this review, we describe a variety of useful omics approaches and how they can be integrated to provide a holistic depiction of the human and microbial genetic and proteomic changes underlying physiological and pathophysiological phenomena. We highlight the potential pitfalls and alternatives to help avoid the common errors in study design, execution, and analysis. We focus on the application, integration, and analysis of big data in gastroenterology and hepatobiliary research.

9

Misra, Biswapriya B., Carl Langefeld, Michael Olivier, and Laura A. Cox. "Integrated omics: tools, advances and future approaches." Journal of Molecular Endocrinology 62, no. 1 (January 2019): R21–R45. http://dx.doi.org/10.1530/jme-18-0055.

Abstract:

With the rapid adoption of high-throughput omic approaches to analyze biological samples such as genomics, transcriptomics, proteomics and metabolomics, each analysis can generate tera- to peta-byte sized data files on a daily basis. These data file sizes, together with differences in nomenclature among these data types, make the integration of these multi-dimensional omics data into biologically meaningful context challenging. Variously named as integrated omics, multi-omics, poly-omics, trans-omics, pan-omics or shortened to just ‘omics’, the challenges include differences in data cleaning, normalization, biomolecule identification, data dimensionality reduction, biological contextualization, statistical validation, data storage and handling, sharing and data archiving. The ultimate goal is toward the holistic realization of a ‘systems biology’ understanding of the biological question. Commonly used approaches are currently limited by the 3 i’s – integration, interpretation and insights. Post integration, these very large datasets aim to yield unprecedented views of cellular systems at exquisite resolution for transformative insights into processes, events and diseases through various computational and informatics frameworks. With the continued reduction in costs and processing time for sample analyses, and increasing types of omics datasets generated such as glycomics, lipidomics, microbiomics and phenomics, an increasing number of scientists in this interdisciplinary domain of bioinformatics face these challenges. We discuss recent approaches, existing tools and potential caveats in the integration of omics datasets for development of standardized analytical pipelines that could be adopted by the global omics research community.

10

Pan, Jianqiao, Baoshan Ma, Xiaoyu Hou, Chongyang Li, Tong Xiong, Yi Gong, and Fengju Song. "The construction of transcriptional risk scores for breast cancer based on lightGBM and multiple omics data." Mathematical Biosciences and Engineering 19, no. 12 (2022): 12353–70. http://dx.doi.org/10.3934/mbe.2022576.

Abstract:

Background: Polygenic risk score (PRS) can evaluate the individual-level genetic risk of breast cancer. However, standalone single nucleotide polymorphisms (SNP) data used for PRS may not provide satisfactory prediction accuracy. Additionally, current PRS models based on linear regression have insufficient power to leverage non-linear effects from thousands of associated SNPs. Here, we proposed a transcriptional risk score (TRS) based on multiple omics data to estimate the risk of breast cancer. Methods: The multiple omics data and clinical data of breast invasive carcinoma (BRCA) were collected from the cancer genome atlas (TCGA) and the gene expression omnibus (GEO). First, we developed a novel TRS model for BRCA utilizing single omic data and LightGBM algorithm. Subsequently, we built a combination model of TRS derived from each omic data to further improve the prediction accuracy. Finally, we performed association analysis and prognosis prediction to evaluate the utility of the TRS generated by our method. Results: The proposed TRS model achieved better predictive performance than the linear models and other ML methods in single omic dataset. An independent validation dataset also verified the effectiveness of our model. Moreover, the combination of the TRS can efficiently strengthen prediction accuracy. The analysis of prevalence and the associations of the TRS with phenotypes including case-control and cancer stage indicated that the risk of breast cancer increases with the increases of TRS. The survival analysis also suggested that TRS for the cancer stage is an effective prognostic metric of breast cancer patients. Conclusions: Our proposed TRS model expanded the current definition of PRS from standalone SNP data to multiple omics data and outperformed the linear models, which may provide a powerful tool for diagnostic and prognostic prediction of breast cancer.

11

Brink, Benedikt G., Annica Seidel, Nils Kleinbölting, Tim W. Nattkemper, and Stefan P. Albaum. "Omics Fusion – A Platform for Integrative Analysis of Omics Data." Journal of Integrative Bioinformatics 13, no. 4 (October 1, 2016): 43–46. http://dx.doi.org/10.1515/jib-2016-296.

Abstract:

Summary: We present Omics Fusion, a new web-based platform for integrative analysis of omics data. Omics Fusion provides a collection of new and established tools and visualization methods to support researchers in exploring omics data, validating results or understanding how to adjust experiments in order to make new discoveries. It is easily extendible and new visualization methods are added continuously. It is available for free under: https://fusion.cebitec.uni-bielefeld.de/

12

Sung, Wing-Kin. "Pan-omics analysis of biological data." Methods 102 (June 2016): 1–2. http://dx.doi.org/10.1016/j.ymeth.2016.05.004.

13

Dinalankara, Wikum, Qian Ke, Donald Geman, and Luigi Marchionni. "An R package for divergence analysis of omics data." PLOS ONE 16, no. 4 (April 5, 2021): e0249002. http://dx.doi.org/10.1371/journal.pone.0249002.

Abstract:

Given the ever-increasing amount of high-dimensional and complex omics data becoming available, it is increasingly important to discover simple but effective methods of analysis. Divergence analysis transforms each entry of a high-dimensional omics profile into a digitized (binary or ternary) code based on the deviation of the entry from a given baseline population. This is a novel framework that is significantly different from existing omics data analysis methods: it allows digitization of continuous omics data at the univariate or multivariate level, facilitates sample level analysis, and is applicable on many different omics platforms. The divergence package, available on the R platform through the Bioconductor repository collection, provides easy-to-use functions for carrying out this transformation. Here we demonstrate how to use the package with data from the Cancer Genome Atlas.
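
The ternary digitization described in this abstract can be illustrated with a minimal Python sketch. It is a simplified stand-in, not the algorithm of the Bioconductor divergence package: the per-feature percentile interval used as the baseline range, the function name `ternary_divergence`, and the default quantiles are all assumptions chosen for illustration.

```python
import numpy as np


def ternary_divergence(baseline, samples, lower_q=0.025, upper_q=0.975):
    """Code each entry as -1, 0 or +1 relative to a per-feature baseline interval.

    baseline: array-like of shape (n_baseline, n_features)
    samples:  array-like of shape (n_samples, n_features)
    The interval [lower_q, upper_q] quantile of the baseline is an assumed,
    simplified definition of the "normal" range for each feature.
    """
    baseline = np.asarray(baseline, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # Per-feature lower and upper bounds estimated from the baseline population.
    low = np.quantile(baseline, lower_q, axis=0)
    high = np.quantile(baseline, upper_q, axis=0)
    codes = np.zeros(samples.shape, dtype=int)
    codes[samples < low] = -1   # below the baseline range
    codes[samples > high] = 1   # above the baseline range
    return codes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    baseline = rng.normal(size=(50, 4))       # toy baseline population
    samples = rng.normal(loc=1.0, size=(3, 4))  # toy shifted samples
    print(ternary_divergence(baseline, samples))
```

Taking the absolute value of the result gives the binary variant (divergent or not) mentioned in the abstract.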

14

Wright, Michelle L., Melinda Higgins, Jacquelyn Y. Taylor, and Vicki Stover Hertzberg. "NuRsing Research in the 21st Century: R You Ready?" Biological Research For Nursing 21, no. 1 (November 1, 2018): 114–20. http://dx.doi.org/10.1177/1099800418810514.

Abstract:

Nurse scientists are adept at translating findings from basic science into useful clinical- and community-based interventions to improve health. Over time, the focus of some nursing research has grown to include the assessment and evaluation of genomic and other output from high-throughput, or “omic,” technologies as indicators related to health and disease. To date, the growth in the application of omics technologies in nursing research has included calls to increase attention to omics in nursing school curricula and educational training opportunities, such as the Summer Genetics Institute offered by the National Institute of Nursing Research. However, there has been scant attention paid in the nursing literature to the complexity of data analysis or issues of reproducibility related to omics studies. The goals of this article are to (1) familiarize nurse scientists with tools that encourage reproducibility in omics studies, with a focus on the free and open-source data processing and analysis pipeline, and (2) provide a baseline understanding of how these tools can be used to improve collaboration and cohesion among interdisciplinary research team members. Knowledge of these tools and skill in applying them will be important for communication across disciplines and imperative for the advancement of omics research in nursing.

15

Rodosthenous, Theodoulos, Vahid Shahrezaei, and Marina Evangelou. "Integrating multi-OMICS data through sparse canonical correlation analysis for the prediction of complex traits: a comparison study." Bioinformatics 36, no. 17 (May 21, 2020): 4616–25. http://dx.doi.org/10.1093/bioinformatics/btaa530.

Abstract:

Motivation: Recent developments in technology have enabled researchers to collect multiple OMICS datasets for the same individuals. The conventional approach for understanding the relationships between the collected datasets and the complex trait of interest would be through the analysis of each OMIC dataset separately from the rest, or to test for associations between the OMICS datasets. In this work we show that integrating multiple OMICS datasets together, instead of analysing them separately, improves our understanding of their in-between relationships as well as the predictive accuracy for the tested trait. Several approaches have been proposed for the integration of heterogeneous and high-dimensional (p≫n) data, such as OMICS. The sparse variant of canonical correlation analysis (CCA) approach is a promising one that seeks to penalize the canonical variables for producing sparse latent variables while achieving maximal correlation between the datasets. Over the last years, a number of approaches for implementing sparse CCA (sCCA) have been proposed, where they differ on their objective functions, iterative algorithm for obtaining the sparse latent variables and make different assumptions about the original datasets. Results: Through a comparative study we have explored the performance of the conventional CCA proposed by Parkhomenko et al., penalized matrix decomposition CCA proposed by Witten and Tibshirani and its extension proposed by Suo et al. The aforementioned methods were modified to allow for different penalty functions. Although sCCA is an unsupervised learning approach for understanding of the in-between relationships, we have twisted the problem as a supervised learning one and investigated how the computed latent variables can be used for predicting complex traits. The approaches were extended to allow for multiple (more than two) datasets where the trait was included as one of the input datasets. Both ways have shown improvement over conventional predictive models that include one or multiple datasets. Availability and implementation: https://github.com/theorod93/sCCA. Supplementary information: Supplementary data are available at Bioinformatics online.

16

Li, Chao, Zhenbo Gao, Benzhe Su, Guowang Xu, and Xiaohui Lin. "Data analysis methods for defining biomarkers from omics data." Analytical and Bioanalytical Chemistry 414, no. 1 (December 24, 2021): 235–50. http://dx.doi.org/10.1007/s00216-021-03813-7.

17

Jiang, Xue. "Recent Advance in Biomedical Omics Data Analysis." American Journal of Biomedical Science & Research 3, no. 6 (July 10, 2019): 529–30. http://dx.doi.org/10.34297/ajbsr.2019.03.000731.

18

Chen, Luonan. "Computational systems biology for omics data analysis." Journal of Molecular Cell Biology 11, no. 8 (August 2019): 631–32. http://dx.doi.org/10.1093/jmcb/mjz095.

19

Fisch, Kathleen M., Tobias Meißner, Louis Gioia, Jean-Christophe Ducom, Tristan M. Carland, Salvatore Loguercio, and Andrew I. Su. "Omics Pipe: a community-based framework for reproducible multi-omics data analysis." Bioinformatics 31, no. 11 (January 30, 2015): 1724–28. http://dx.doi.org/10.1093/bioinformatics/btv061.

20

Wang, Zhigang, and Yongqun He. "Precision omics data integration and analysis with interoperable ontologies and their application for COVID-19 research." Briefings in Functional Genomics 20, no. 4 (June 22, 2021): 235–48. http://dx.doi.org/10.1093/bfgp/elab029.

Abstract:

Omics technologies are widely used in biomedical research. Precision medicine focuses on individual-level disease treatment and prevention. Here, we propose the usage of the term ‘precision omics’ to represent the combinatorial strategy that applies omics to translate large-scale molecular omics data for precision disease understanding and accurate disease diagnosis, treatment and prevention. Given the complexity of both omics and precision medicine, precision omics requires standardized representation and integration of heterogeneous data types. Ontology has emerged as an important artificial intelligence component to become critical for standard data and metadata representation, standardization and integration. To support precision omics, we propose a precision omics ontology hypothesis, which hypothesizes that the effectiveness of precision omics is positively correlated with the interoperability of ontologies used for data and knowledge integration. Therefore, to make effective precision omics studies, interoperable ontologies are required to standardize and incorporate heterogeneous data and knowledge in a human- and computer-interpretable manner. Methods for efficient development and application of interoperable ontologies are proposed and illustrated. With the interoperable omics data and knowledge, omics tools such as OmicsViz can also be evolved to process, integrate, visualize and analyze various omics data, leading to the identification of new knowledge and hypotheses of molecular mechanisms underlying the outcomes of diseases such as COVID-19. Given extensive COVID-19 omics research, we propose the strategy of precision omics supported by interoperable ontologies, accompanied with ontology-based semantic reasoning and machine learning, leading to systematic disease mechanism understanding and rational design of precision treatment and prevention.

Short Abstract: Precision medicine focuses on individual-level disease treatment and prevention. Precision omics is a new strategy that applies omics for precision medicine research, which requires standardized representation and integration of individual genetics and phenotypes, experimental conditions, and data analysis settings. Ontology has emerged as an important artificial intelligence component to become critical for standard data and metadata representation, standardization and integration. To support precision omics, interoperable ontologies are required in order to standardize and incorporate heterogeneous data and knowledge in a human- and computer-interpretable manner. With the interoperable omics data and knowledge, omics tools such as OmicsViz can also be evolved to process, integrate, visualize and analyze various omics data, leading to the identification of new knowledge and hypotheses of molecular mechanisms underlying disease outcomes. The precision COVID-19 omics study is provided as the primary use case to illustrate the rationale and implementation of the precision omics strategy.

21

Bodein, Antoine, Marie-Pier Scott-Boyer, Olivier Perin, Kim-Anh Lê Cao, and Arnaud Droit. "Interpretation of network-based integration from multi-omics longitudinal data." Nucleic Acids Research 50, no. 5 (December 9, 2021): e27-e27. http://dx.doi.org/10.1093/nar/gkab1200.

Abstract:

Multi-omics integration is key to fully understanding complex biological processes in a holistic manner. Furthermore, multi-omics combined with new longitudinal experimental designs can reveal dynamic relationships between omics layers and identify key players or interactions in system development or complex phenotypes. However, integration methods have to address various experimental designs and do not guarantee interpretable biological results. The new challenge of multi-omics integration is to solve interpretation and unlock the hidden knowledge within the multi-omics data. In this paper, we go beyond integration and propose a generic approach to face the interpretation problem. From multi-omics longitudinal data, this approach builds and explores hybrid multi-omics networks composed of both inferred and known relationships within and between omics layers. With smart node labelling and propagation analysis, this approach predicts regulation mechanisms and multi-omics functional modules. We applied the method to three case studies with various multi-omics designs and identified new multi-layer interactions involved in key biological functions that could not be revealed with single omics analysis. Moreover, we highlighted interplay in the kinetics that could help identify novel biological mechanisms. This method is available as an R package, netOmics, to readily suit any application.

22

Dong, Xianjun, Chunyu Liu, and Mikhail Dozmorov. "Review of multi-omics data resources and integrative analysis for human brain disorders." Briefings in Functional Genomics 20, no. 4 (May 8, 2021): 223–34. http://dx.doi.org/10.1093/bfgp/elab024.

Abstract:

In the last decade, massive omics datasets have been generated for human brain research. The field is evolving so fast that a timely update is urgently needed. In this review, we summarize the main multi-omics data resources for the human brains of both healthy controls and patients with neuropsychiatric disorders, including schizophrenia, autism, bipolar disorder, Alzheimer’s disease, Parkinson’s disease, progressive supranuclear palsy, etc. We also review the recent development of single-cell omics in brain research, such as single-nucleus RNA-seq, single-cell ATAC-seq and spatial transcriptomics. We further investigate the integrative multi-omics analysis methods for both tissue and single-cell data. Finally, we discuss the limitations and future directions of the multi-omics study of human brain disorders.

23

Park, Mira, Doyoen Kim, Kwanyoung Moon, and Taesung Park. "Integrative Analysis of Multi-Omics Data Based on Blockwise Sparse Principal Components." International Journal of Molecular Sciences 21, no. 21 (November 2, 2020): 8202. http://dx.doi.org/10.3390/ijms21218202.

Abstract:

The recent development of high-throughput technology has allowed us to accumulate vast amounts of multi-omics data. Because even single omics data have a large number of variables, integrated analysis of multi-omics data suffers from problems such as computational instability and variable redundancy. Most multi-omics data analyses apply single supervised analysis, repeatedly, for dimensional reduction and variable selection. However, these approaches cannot avoid the problems of redundancy and collinearity of variables. In this study, we propose a novel approach using blockwise component analysis. This would solve the limitations of current methods by applying variable clustering and sparse principal component (sPC) analysis. Our approach consists of two stages. The first stage identifies homogeneous variable blocks, and then extracts sPCs, for each omics dataset. The second stage merges sPCs from each omics dataset, and then constructs a prediction model. We also propose a graphical method showing the results of sparse PCA and model fitting, simultaneously. We applied the proposed methodology to glioblastoma multiforme data from The Cancer Genome Atlas. The comparison with other existing approaches showed that our proposed methodology is more easily interpretable than other approaches, and has comparable predictive power, with a much smaller number of variables.

24

von der Heyde, Silvia, Margarita Krawczyk, Julia Bischof, Thomas Corwin, Peter Frommolt, Jonathan Woodsmith, and Hartmut Juhl. "Clinically relevant multi-omic analysis of colorectal cancer." Journal of Clinical Oncology 38, no. 15_suppl (May 20, 2020): e16063-e16063. http://dx.doi.org/10.1200/jco.2020.38.15_suppl.e16063.

Abstract:

Background: Cancer is a highly heterogeneous disease, both intra- and inter-individually consisting of complex phenotypes and systems biology. Although genomic data has contributed greatly towards the identification of cancer-specific mutations and the progress of precision medicine, genomic alterations are only one of several important biological drivers of cancer. Furthermore, single-layer omics represent only a small piece of the cancer biology puzzle and provide only partial clues to connecting genotype with clinically relevant phenotypic data. A more integrated approach is urgently needed to unravel the underpinnings of molecular signatures and the phenotypic manifestation of cancer hallmarks. Methods: Here we characterize a colorectal cancer (CRC) cohort of 500 patients across multiple distinct omic data types. Across this CRC cohort, we defined clinically relevant whole genome sequencing based metrics such as microsatellite instability (MSI) status, and furthermore investigated gene expression at the transcript level using RNA-Seq, as well as at the proteomic level using tandem mass spectrometry. We further characterized a subgroup of 100 of these patients through 16S rRNA sequencing to identify associated microbiome profiles. Results: We combined these analyses with comprehensive clinical data to observe the impact of ascertained molecular signatures on the CRC patient cohort. Here, we report how patient survival correlates both with specific molecular events across individual omic data types and with combined multi-omic analyses. Conclusions: This project highlights the utility of integrating multiple distinct data types to obtain a more comprehensive overview of the molecular mechanisms underpinning colorectal cancer. Furthermore, through combining identified aberrant molecular mechanisms with clinical reports, multi-omic data can be prioritized through their impact on patient cohort survival.

25

Mirza, Bilal, Wei Wang, Jie Wang, Howard Choi, Neo Christopher Chung, and Peipei Ping. "Machine Learning and Integrative Analysis of Biomedical Big Data." Genes 10, no. 2 (January 28, 2019): 87. http://dx.doi.org/10.3390/genes10020087.

Abstract:

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

26

Chang, Sheng-Mao, Meng Yang, Wenbin Lu, Yu-Jyun Huang, Yueyang Huang, Hung Hung, Jeffrey C. Miecznikowski, Tzu-Pin Lu, and Jung-Ying Tzeng. "Gene-set integrative analysis of multi-omics data using tensor-based association test." Bioinformatics 37, no. 16 (March 1, 2021): 2259–65. http://dx.doi.org/10.1093/bioinformatics/btab125.

Abstract:

Motivation: Facilitated by technological advances and the decrease in costs, it is feasible to gather subject data from several omics platforms. Each platform assesses different molecular events, and the challenge lies in efficiently analyzing these data to discover novel disease genes or mechanisms. A common strategy is to regress the outcomes on all omics variables in a gene set. However, this approach suffers from problems associated with high-dimensional inference. Results: We introduce a tensor-based framework for variable-wise inference in multi-omics analysis. By accounting for the matrix structure of an individual’s multi-omics data, the proposed tensor methods incorporate the relationship among omics effects, reduce the number of parameters, and boost the modeling efficiency. We derive the variable-specific tensor test and enhance computational efficiency of tensor modeling. Using simulations and data applications on the Cancer Cell Line Encyclopedia (CCLE), we demonstrate our method performs favorably over baseline methods and will be useful for gaining biological insights in multi-omics analysis. Availability and implementation: R function and instruction are available from the authors’ website: https://www4.stat.ncsu.edu/~jytzeng/Software/TR.omics/TRinstruction.pdf. Supplementary information: Supplementary data are available at Bioinformatics online.

27

López de Maturana, Evangelina, Lola Alonso, Pablo Alarcón, Isabel Adoración Martín-Antoniano, Silvia Pineda, Lucas Piorno, M. Luz Calle, and Núria Malats. "Challenges in the Integration of Omics and Non-Omics Data." Genes 10, no. 3 (March 20, 2019): 238. http://dx.doi.org/10.3390/genes10030238.

Abstract:

Omics data integration is already a reality. However, few omics-based algorithms show enough predictive ability to be implemented into clinics or public health domains. Clinical/epidemiological data tend to explain most of the variation of health-related traits, and its joint modeling with omics data is crucial to increase the algorithm’s predictive ability. Only a small number of published studies performed a “real” integration of omics and non-omics (OnO) data, mainly to predict cancer outcomes. Challenges in OnO data integration regard the nature and heterogeneity of non-omics data, the possibility of integrating large-scale non-omics data with high-throughput omics data, the relationship between OnO data (i.e., ascertainment bias), the presence of interactions, the fairness of the models, and the presence of subphenotypes. These challenges demand the development and application of new analysis strategies to integrate OnO data. In this contribution we discuss different attempts of OnO data integration in clinical and epidemiological studies. Most of the reviewed papers considered only one type of omics data set, mainly RNA expression data. All selected papers incorporated non-omics data in a low-dimensionality fashion. The integrative strategies used in the identified papers adopted three modeling methods: Independent, conditional, and joint modeling. This review presents, discusses, and proposes integrative analytical strategies towards OnO data integration.

28

Iuliano, Antonella, Annalisa Occhipinti, Claudia Angelini, Italia De Feis, and Pietro Liò. "COSMONET: An R Package for Survival Analysis Using Screening-Network Methods." Mathematics 9, no. 24 (December 15, 2021): 3262. http://dx.doi.org/10.3390/math9243262.

Abstract:

Identifying relevant genomic features that can act as prognostic markers for building predictive survival models is one of the central themes in medical research, affecting the future of personalized medicine and omics technologies. However, the high dimension of genome-wide omic data, the strong correlation among the features, and the low sample size significantly increase the complexity of cancer survival analysis, demanding the development of specific statistical methods and software. Here, we present a novel R package, COSMONET (COx Survival Methods based On NETworks), that provides a complete workflow from the pre-processing of omics data to the selection of gene signatures and prediction of survival outcomes. In particular, COSMONET implements (i) three different screening approaches to reduce the initial dimension of the data from a high-dimensional space p to a moderate scale d, (ii) a network-penalized Cox regression algorithm to identify the gene signature, (iii) several approaches to determine an optimal cut-off on the prognostic index (PI) to separate high- and low-risk patients, and (iv) a prediction step for patients’ risk class based on the evaluation of PIs. Moreover, COSMONET provides functions for data pre-processing, visualization, survival prediction, and gene enrichment analysis. We illustrate COSMONET through a step-by-step R vignette using two cancer datasets.
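
Steps (iii) and (iv) of the workflow described above can be illustrated with a minimal Python sketch. It is not the COSMONET API (COSMONET is an R package): the function names, gene labels, coefficients and the median cut-off are all assumptions chosen for illustration, with the prognostic index taken as the usual Cox linear predictor.

```python
from statistics import median

def prognostic_index(expression, coefficients):
    """Cox-style linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def risk_groups(pi_by_patient, cutoff=None):
    """Label each patient 'high' or 'low' risk; the median PI is used if no cutoff is given."""
    if cutoff is None:
        cutoff = median(pi_by_patient.values())
    return {p: ("high" if pi > cutoff else "low") for p, pi in pi_by_patient.items()}

# Hypothetical two-gene signature and three patients (toy values, not real data).
coefficients = {"G1": 0.5, "G2": -1.0}
patients = {
    "P1": {"G1": 2.0, "G2": 1.0},
    "P2": {"G1": 4.0, "G2": 0.0},
    "P3": {"G1": 0.0, "G2": 2.0},
}
pi = {p: prognostic_index(expr, coefficients) for p, expr in patients.items()}
groups = risk_groups(pi)
print(pi, groups)
```

The median is only one possible cut-off; the abstract states that several approaches are offered, without naming them.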

29

Casey, Fergal, Soumya Negi, Jing Zhu, Yu H. Sun, Maria Zavodszky, Derrick Cheng, Dongdong Lin, et al. "OmicsView: Omics data analysis through interactive visual analytics." Computational and Structural Biotechnology Journal 20 (2022): 1277–85. http://dx.doi.org/10.1016/j.csbj.2022.02.022.

30

Farag, Yehia, Frode S. Berven, Inge Jonassen, Kjell Petersen, and Harald Barsnes. "Distributed and interactive visual analysis of omics data." Journal of Proteomics 129 (November 2015): 78–82. http://dx.doi.org/10.1016/j.jprot.2015.05.029.

31

Darzi, Youssef, Gwen Falony, Sara Vieira-Silva, and Jeroen Raes. "Towards biome-specific analysis of meta-omics data." ISME Journal 10, no. 5 (December 1, 2015): 1025–28. http://dx.doi.org/10.1038/ismej.2015.188.

32

Zhao, Qing, Xingjie Shi, Jian Huang, Jin Liu, Yang Li, and Shuangge Ma. "Integrative analysis of ‘-omics’ data using penalty functions." Wiley Interdisciplinary Reviews: Computational Statistics 7, no. 1 (July 7, 2014): 99–108. http://dx.doi.org/10.1002/wics.1322.

33

Huang, Eunchong, Sarah Kim, and TaeJin Ahn. "Deep Learning for Integrated Analysis of Insulin Resistance with Multi-Omics Data." Journal of Personalized Medicine 11, no. 2 (February 15, 2021): 128. http://dx.doi.org/10.3390/jpm11020128.

Abstract:

Technological advances in next-generation sequencing (NGS) have made it possible to uncover extensive and dynamic alterations in diverse molecular components and biological pathways across healthy and diseased conditions. Large amounts of multi-omics data originating from emerging NGS experiments require feature engineering, which is a crucial step in the process of predictive modeling. The underlying relationship among multi-omics features in terms of insulin resistance is not well understood. In this study, using the multi-omics data of type II diabetes from the Integrative Human Microbiome Project, from 10,783 features, we conducted a data analytic approach to elucidate the relationship between insulin resistance and multi-omics features, including microbiome data. To better explain the impact of microbiome features on insulin classification, we used a developed deep neural network interpretation algorithm for each microbiome feature’s contribution to the discriminative model output in the samples.

34

Wu, Cen, Fei Zhou, Jie Ren, Xiaoxi Li, Yu Jiang, and Shuangge Ma. "A Selective Review of Multi-Level Omics Data Integration Using Variable Selection." High-Throughput 8, no. 1 (January 18, 2019): 4. http://dx.doi.org/10.3390/ht8010004.

Abstract:

High-throughput technologies have been used to generate a large amount of omics data. In the past, single-level analysis has been extensively conducted where the omics measurements at different levels, including mRNA, microRNA, CNV and DNA methylation, are analyzed separately. As the molecular complexity of disease etiology exists at all different levels, integrative analysis offers an effective way to borrow strength across multi-level omics data and can be more powerful than single level analysis. In this article, we focus on reviewing existing multi-omics integration studies by paying special attention to variable selection methods. We first summarize published reviews on integrating multi-level omics data. Next, after a brief overview on variable selection methods, we review existing supervised, semi-supervised and unsupervised integrative analyses within parallel and hierarchical integration studies, respectively. The strength and limitations of the methods are discussed in detail. No existing integration method can dominate the rest. The computation aspects are also investigated. The review concludes with possible limitations and future directions for multi-level omics data integration.

35

Mangul, Serghei. "Interpreting and integrating big data in the life sciences." Emerging Topics in Life Sciences 3, no. 4 (June 26, 2019): 335–41. http://dx.doi.org/10.1042/etls20180175.

Abstract:

Recent advances in omics technologies have led to the broad applicability of computational techniques across various domains of life science and medical research. These technologies provide an unprecedented opportunity to collect the omics data from hundreds of thousands of individuals and to study the gene–disease association without the aid of prior assumptions about the trait biology. Despite the many advantages of modern omics technologies, interpretations of big data produced by such technologies require advanced computational algorithms. I outline key challenges that biomedical researchers are facing when interpreting and integrating big omics data. I discuss the reproducibility aspect of big data analysis in the life sciences and review current practices in reproducible research. Finally, I explain the skills that biomedical researchers need to acquire to independently analyze big omics data.

36

Eren, A. Murat, Özcan C. Esen, Christopher Quince, Joseph H. Vineis, Hilary G. Morrison, Mitchell L. Sogin, and Tom O. Delmont. "Anvi’o: an advanced analysis and visualization platform for ‘omics data." PeerJ 3 (October 8, 2015): e1319. http://dx.doi.org/10.7717/peerj.1319.

Abstract:

Advances in high-throughput sequencing and ‘omics technologies are revolutionizing studies of naturally occurring microbial communities. Comprehensive investigations of microbial lifestyles require the ability to interactively organize and visualize genetic information and to incorporate subtle differences that enable greater resolution of complex data. Here we introduce anvi’o, an advanced analysis and visualization platform that offers automated and human-guided characterization of microbial genomes in metagenomic assemblies, with interactive interfaces that can link ‘omics data from multiple sources into a single, intuitive display. Its extensible visualization approach distills multiple dimensions of information about each contig, offering a dynamic and unified work environment for data exploration, manipulation, and reporting. Using anvi’o, we re-analyzed publicly available datasets and explored temporal genomic changes within naturally occurring microbial populations through de novo characterization of single nucleotide variations, and linked cultivar and single-cell genomes with metagenomic and metatranscriptomic data. Anvi’o is an open-source platform that empowers researchers without extensive bioinformatics skills to perform and communicate in-depth analyses on large ‘omics datasets.

37

Lin, Dongdong, Hima B. Yalamanchili, Xinmin Zhang, Nathan E. Lewis, Christina S. Alves, Joost Groot, Johnny Arnsdorf, et al. "CHOmics: A web-based tool for multi-omics data analysis and interactive visualization in CHO cell lines." PLOS Computational Biology 16, no. 12 (December 22, 2020): e1008498. http://dx.doi.org/10.1371/journal.pcbi.1008498.

Abstract:

Chinese hamster ovary (CHO) cell lines are widely used in industry for biological drug production. During cell culture development, considerable effort is invested to understand the factors that greatly impact cell growth, specific productivity and product qualities of the biotherapeutics. While high-throughput omics approaches have been increasingly utilized to reveal cellular mechanisms associated with cell line phenotypes and guide process optimization, comprehensive omics data analysis and management have been a challenge. Here we developed CHOmics, a web-based tool for integrative analysis of CHO cell line omics data that provides an interactive visualization of omics analysis outputs and efficient data management. CHOmics has a built-in comprehensive pipeline for RNA sequencing data processing and multi-layer statistical modules to explore relevant genes or pathways. Moreover, advanced functionalities were provided to enable users to customize their analysis and visualize the output systematically and interactively. The tool was also designed with the flexibility to accommodate other types of omics data and thereby enabling multi-omics comparison and visualization at both gene and pathway levels. Collectively, CHOmics is an integrative platform for data analysis, visualization and management with expectations to promote the broader use of omics in CHO cell research.

38

Li, Peng, and Bo Sun. "Integration of Multi-Omics Data to Identify Cancer Biomarkers." Journal of Information Technology Research 15, no. 1 (January 2022): 1–15. http://dx.doi.org/10.4018/jitr.2022010105.

Abstract:

A novel method for integrating multi-omics data, including gene expression, copy number variation, DNA methylation, and miRNA data, is proposed to identify biomarkers of cancer prognosis. First, survival analysis was performed for these four types of omics data to obtain survival-related genes. Next, survival-related genes detected in at least two types of omics data were selected as candidate genes. The four types of omics data only composed of candidate genes were subjected to dimension reduction using an autoencoder to obtain a one-dimensional data representation. The mRMR algorithm was used to screen for key genes. This method was applied to lung squamous cell carcinoma and 20 cancer-related genes were identified. Gene function analysis revealed that the genes were related to cancer. Using survival analysis, the genes were verified to distinguish between high- and low-risk groups. These results indicate that the genes can be used as biomarkers for cancer.
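
The candidate-gene step described above (keeping genes that are survival-related in at least two omics types) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name, the layer labels and the gene sets are hypothetical, and the per-layer survival analysis is assumed to have been run already.

```python
from collections import Counter

def candidate_genes(survival_genes_by_omics, min_layers=2):
    """Return genes flagged as survival-related in at least `min_layers` omics types."""
    counts = Counter()
    for genes in survival_genes_by_omics.values():
        counts.update(set(genes))  # count each gene at most once per omics type
    return sorted(g for g, n in counts.items() if n >= min_layers)

# Hypothetical per-layer results of a survival analysis (illustrative gene sets only).
survival_hits = {
    "expression": {"TP53", "EGFR", "KRAS"},
    "cnv": {"EGFR", "MYC"},
    "methylation": {"TP53", "CDKN2A"},
    "mirna": {"MYC", "EGFR"},
}
selected = candidate_genes(survival_hits)
print(selected)
```

Only the genes in `selected` would then be passed on to the dimension-reduction and mRMR screening steps mentioned in the abstract.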

39

Xu, Chao, Ji-Gang Zhang, Dongdong Lin, Lan Zhang, Hui Shen, and Hong-Wen Deng. "A Systemic Analysis of Transcriptomic and Epigenomic Data To Reveal Regulation Patterns for Complex Disease." G3 Genes|Genomes|Genetics 7, no. 7 (July 1, 2017): 2271–79. http://dx.doi.org/10.1534/g3.117.042408.

Abstract:

Integrating diverse genomics data can provide a global view of the complex biological processes related to human complex diseases. Although substantial efforts have been made to integrate different omics data, there are at least three challenges for multi-omics integration methods: (i) how to simultaneously consider the effects of various genomic factors, since these factors jointly influence the phenotypes; (ii) how to effectively incorporate the information from publicly accessible databases and omics datasets to fully capture the interactions among (epi)genomic factors from diverse omics data; and (iii) how to combine more than two omics datasets, which has so far been poorly explored. Current integration approaches are not sufficient to address all of these challenges together. We proposed a novel integrative analysis framework by incorporating sparse model, multivariate analysis, Gaussian graphical model, and network analysis to address these three challenges simultaneously. Based on this strategy, we performed a systemic analysis for glioblastoma multiforme (GBM) integrating genome-wide gene expression, DNA methylation, and miRNA expression data. We identified three regulatory modules of genomic factors associated with GBM survival time and revealed a global regulatory pattern for GBM by combining the three modules, with respect to the common regulatory factors. Our method can not only identify disease-associated dysregulated genomic factors from different omics, but more importantly, it can incorporate the information from publicly accessible databases and omics datasets to infer a comprehensive interaction map of all these dysregulated genomic factors. Our work represents an innovative approach to enhance our understanding of molecular genomic mechanisms underlying human complex diseases.

40

Bradshaw, Michael S., and Samuel H. Payne. "Detecting fabrication in large-scale molecular omics data." PLOS ONE 16, no. 11 (November 30, 2021): e0260395. http://dx.doi.org/10.1371/journal.pone.0260395.

Abstract:

Fraud is a pervasive problem and can occur as fabrication, falsification, plagiarism, or theft. The scientific community is not exempt from this universal problem and several studies have recently been caught manipulating or fabricating data. Current measures to prevent and deter scientific misconduct come in the form of the peer-review process and on-site clinical trial auditors. As recent advances in high-throughput omics technologies have moved biology into the realm of big data, fraud detection methods must be updated for sophisticated computational fraud. In the financial sector, machine learning and digit frequencies are successfully used to detect fraud. Drawing from these sources, we develop methods of fabrication detection in biomedical research and show that machine learning can be used to detect fraud in large-scale omic experiments. Using the gene copy-number data as input, machine learning models correctly predicted fraud with 58–100% accuracy. With digit frequency as input features, the models detected fraud with 82–100% accuracy. All of the data and analysis scripts used in this project are available at https://github.com/MSBradshaw/FakeData.
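
To make the digit-frequency idea concrete, the sketch below turns a list of measurements into a nine-element feature vector of leading-digit frequencies. It is an illustration under stated assumptions rather than the authors' pipeline: the abstract does not say which digit positions were used, so the choice of the leading non-zero digit, the function name and the sample values are all hypothetical.

```python
from collections import Counter

def first_digit_frequencies(values):
    """Relative frequency of each leading non-zero digit (1-9); zeros are skipped."""
    digits = []
    for v in values:
        v = abs(v)
        if v == 0:
            continue
        # In scientific notation the first character is the leading digit.
        digits.append(int(f"{v:.15e}"[0]))
    counts = Counter(digits)
    total = len(digits)
    return [counts.get(d, 0) / total if total else 0.0 for d in range(1, 10)]

# Hypothetical measurements for one sample (toy values, not real copy-number data).
features = first_digit_frequencies([1.5, 12.0, 0.0019, 250, -3.2, 0])
print(features)
```

A vector like `features` could be computed per sample and supplied to any standard classifier; which models the authors trained is not specified in the abstract.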

41

Koppad, Saraswati, Annappa B, Georgios V. Gkoutos, and Animesh Acharjee. "Cloud Computing Enabled Big Multi-Omics Data Analytics." Bioinformatics and Biology Insights 15 (January 2021): 117793222110359. http://dx.doi.org/10.1177/11779322211035921.

Abstract:

High-throughput experiments enable researchers to explore complex multifactorial diseases through large-scale analysis of omics data. Challenges for such high-dimensional data sets include storage, analyses, and sharing. Recent innovations in computational technologies and approaches, especially in cloud computing, offer a promising, low-cost, and highly flexible solution in the bioinformatics domain. Cloud computing is rapidly proving increasingly useful in molecular modeling, omics data analytics (eg, RNA sequencing, metabolomics, or proteomics data sets), and for the integration, analysis, and interpretation of phenotypic data. We review the adoption of advanced cloud-based and big data technologies for processing and analyzing omics data and provide insights into state-of-the-art cloud bioinformatics applications.

42

Palla, Giovanni, Hannah Spitzer, Michal Klein, David Fischer, Anna Christina Schaar, Louis Benedikt Kuemmerle, Sergei Rybakov, et al. "Squidpy: a scalable framework for spatial omics analysis." Nature Methods 19, no. 2 (January 31, 2022): 171–78. http://dx.doi.org/10.1038/s41592-021-01358-2.

Abstract:

Spatial omics data are advancing the study of tissue organization and cellular communication at an unprecedented scale. Flexible tools are required to store, integrate and visualize the large diversity of spatial omics data. Here, we present Squidpy, a Python framework that brings together tools from omics and image analysis to enable scalable description of spatial molecular data, such as transcriptome or multivariate proteins. Squidpy provides efficient infrastructure and numerous analysis methods that make it possible to efficiently store, manipulate and interactively visualize spatial omics data. Squidpy is extensible and can be interfaced with a variety of already existing libraries for the scalable analysis of spatial omics data.

43

Chen, Chuming, Peter B. McGarvey, Hongzhan Huang, and Cathy H. Wu. "Protein Bioinformatics Infrastructure for the Integration and Analysis of Multiple High-Throughput “omics” Data." Advances in Bioinformatics 2010 (March 29, 2010): 1–19. http://dx.doi.org/10.1155/2010/423589.

Abstract:

High-throughput “omics” technologies bring new opportunities for biological and biomedical researchers to ask complex questions and gain new scientific insights. However, the voluminous, complex, and context-dependent data being maintained in heterogeneous and distributed environments plus the lack of well-defined data standard and standardized nomenclature imposes a major challenge which requires advanced computational methods and bioinformatics infrastructures for integration, mining, visualization, and comparative analysis to facilitate data-driven hypothesis generation and biological knowledge discovery. In this paper, we present the challenges in high-throughput “omics” data integration and analysis, introduce a protein-centric approach for systems integration of large and heterogeneous high-throughput “omics” data including microarray, mass spectrometry, protein sequence, protein structure, and protein interaction data, and use scientific case study to illustrate how one can use varied “omics” data from different laboratories to make useful connections that could lead to new biological knowledge.

44

Zhang, Qiang, Xiang-He Meng, Chuan Qiu, Hui Shen, Qi Zhao, Lan-Juan Zhao, Qing Tian, Chang-Qing Sun, and Hong-Wen Deng. "Integrative analysis of multi-omics data to detect the underlying molecular mechanisms for obesity in vivo in humans." Human Genomics 16, no. 1 (May 14, 2022). http://dx.doi.org/10.1186/s40246-022-00388-x.

Abstract:

Background: Obesity is a complex, multifactorial condition in which genetics plays an important role. Most systematic studies currently focus on an individual omics aspect and provide insightful yet limited knowledge about the comprehensive and complex crosstalk between various omics levels. Subjects and methods: Therefore, we performed a comprehensive trans-omics study with various omics data from 104 subjects to identify interactions/networks, and particularly causal regulatory relationships within and especially between omic molecules, with the purpose of discovering molecular genetic mechanisms underlying obesity etiology in vivo in humans. Results: By applying differential analysis, we identified 8 differentially expressed hub genes (DEHGs), 14 differentially methylated regions (DMRs) and 12 differentially accumulated metabolites (DAMs) for obesity individually. By integrating those multi-omics biomarkers using Mendelian Randomization (MR) and network MR analyses, we identified 18 causal pathways with mediation effect. For the 20 biomarkers involved in those 18 pairs, 17 biomarkers were implicated in the pathophysiology of obesity or related diseases. Conclusions: The integration of trans-omics and MR analyses may provide us with a holistic understanding of the underlying functional mechanisms, molecular regulatory information flow and the interactive molecular systems among different omic molecules for obesity risk and other complex diseases/traits.

45

Das, Sarmistha, and Indranil Mukhopadhyay. "TiMEG: an integrative statistical method for partially missing multi-omics data." Scientific Reports 11, no. 1 (December 2021). http://dx.doi.org/10.1038/s41598-021-03034-z.

Abstract:

Multi-omics data integration is widely used to understand the genetic architecture of disease. In multi-omics association analysis, data collected on multiple omics for the same set of individuals are immensely important for biomarker identification. But when the sample size of such data is limited, the presence of partially missing individual-level observations poses a major challenge in data integration. More often, genotype data are available for all individuals under study but gene expression and/or methylation information are missing for different subsets of those individuals. Here, we develop a statistical model, TiMEG, for the identification of disease-associated biomarkers in a case–control paradigm by integrating the above-mentioned data types, especially in the presence of missing omics data. Based on a likelihood approach, TiMEG exploits the inter-relationship among multiple omics data to capture weaker signals that remain unidentified in single-omic analysis or common imputation-based methods. Its application to a real tuberous sclerosis dataset identified functionally relevant genes in the disease pathway.

46

Planell, Nuria, Vincenzo Lagani, Patricia Sebastian-Leon, Frans van der Kloet, Ewoud Ewing, Nestoras Karathanasis, Arantxa Urdangarin, et al. "STATegra: Multi-Omics Data Integration – A Conceptual Scheme With a Bioinformatics Pipeline." Frontiers in Genetics 12 (March 4, 2021). http://dx.doi.org/10.3389/fgene.2021.620453.

Abstract:

Technologies for profiling samples using different omics platforms have been at the forefront since the human genome project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decision over which tools to use and how to combine them. Therefore, it is an unmet need to conceptualize how to integrate such data and implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), aiming for it to be as generic as possible for multi-omics analysis, combining available multi-omic analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two The Cancer Genome Atlas (TCGA) case studies. For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as OpenSource in the STATegRa Bioconductor package.

47

Ogris, Christoph, Yue Hu, Janine Arloth, and Nikola S. Müller. "Versatile knowledge guided network inference method for prioritizing key regulatory factors in multi-omics data." Scientific Reports 11, no. 1 (March 24, 2021). http://dx.doi.org/10.1038/s41598-021-85544-4.

Abstract:

Constantly decreasing costs of high-throughput profiling on many molecular levels generate vast amounts of multi-omics data. Studying one biomedical question on two or more omic levels provides deeper insights into underlying molecular processes or disease pathophysiology. For the majority of multi-omics data projects, the data analysis is performed level-wise, followed by a combined interpretation of results. Hence the full potential of integrated data analysis is not leveraged yet, presumably due to the complexity of the data and the lack of toolsets. We propose a versatile approach to perform a multi-level, fully integrated analysis: the Knowledge guIded Multi-Omics Network inference approach, KiMONo (https://github.com/cellmapslab/kimono). KiMONo performs network inference by using statistical models for combining omics measurements coupled to a powerful knowledge-guided strategy exploiting prior information from existing biological sources. Within the resulting multimodal network, nodes represent features of all input types, e.g. variants and genes, while edges refer to knowledge-supported and statistically derived associations. In a comprehensive evaluation, we show that our method is robust to noise and exemplify the general applicability to the full spectrum of multi-omics data, demonstrating that KiMONo is a powerful approach towards leveraging the full potential of data sets for detecting biomarker candidates.

48

Zhang, Hui, Minghui Ao, Arianna Boja, Michael Schnaubelt, and Yingwei Hu. "OmicsOne: associate omics data with phenotypes in one-click." Clinical Proteomics 18, no. 1 (December 2021). http://dx.doi.org/10.1186/s12014-021-09334-w.

Abstract:

Background: The rapid advancements of high-throughput “omics” technologies have brought a massive amount of data to process during and after experiments. Multi-omic analysis facilitates a deeper interrogation of a dataset and the discovery of interesting genes, proteins, lipids, glycans, metabolites, or pathways related to the corresponding phenotypes in a study. Many individual software tools have been developed for data analysis and visualization. However, an efficient way to investigate phenotypes with multiple omics data is still lacking. Here, we present OmicsOne as an interactive web-based framework for rapid phenotype association analysis of multi-omic data by integrating quality control, statistical analysis, and interactive data visualization in ‘one click’. Materials and methods: OmicsOne was applied to the previously published proteomic and glycoproteomic data sets of high-grade serous ovarian carcinoma (HGSOC) and the published proteome data set of lung squamous cell carcinoma (LSCC) to confirm its performance. The data were analyzed through six main functional modules implemented in OmicsOne: (1) phenotype profiling, (2) data preprocessing and quality control, (3) knowledge annotation, (4) phenotype-associated feature discovery, (5) correlation and regression model analysis for phenotype association analysis on individual features, and (6) enrichment analysis for phenotype association analysis on feature sets of interest. Results: We developed an integrated software solution, OmicsOne, for phenotype association analysis on multi-omics data sets. The application of OmicsOne to the public ovarian cancer data set showed that the software could confirm the previous observations consistently and discover new evidence for HNRNPU and a glycopeptide of HYOU1 as potential biomarkers for HGSOC data sets. The performance of OmicsOne was further demonstrated in the Tumor and NAT comparison study on the proteome data set of LSCC. Conclusions: OmicsOne can effectively simplify data analysis and reveal the significant associations between phenotypes and potential biomarkers, including genes, proteins, and glycopeptides, in minutes to help users understand aberrant biological processes.

49

Blum, Benjamin C., and Andrew Emili. "Omics Notebook: Robust, reproducible, and flexible automated multi-omics exploratory analysis and reporting." Bioinformatics Advances, September 21, 2021. http://dx.doi.org/10.1093/bioadv/vbab024.

Abstract:

Summary: Mass spectrometry is an increasingly important tool for the global interrogation of diverse biomolecules. Unfortunately, the complexity of downstream data analysis is a major challenge for the routine use of these data by investigators from broader training backgrounds. Omics Notebook is an open-source framework for exploratory analysis, reporting, and integrating multi-omic data that is automated, reproducible, and customizable. Built-in functions allow processing of proteomic data from MaxQuant and metabolomic data from XCMS, along with other omics data in standardized input formats as specified in the documentation. Additionally, the use of containerization manages R package installation requirements and is tailored for shared HPC or cloud environments. Availability and Implementation: Omics Notebook is implemented in Python and R and is available for download from https://github.com/cnsb-boston/Omics_Notebook with additional documentation under a GNU GPLv3 license.

50

Zou, Guanhua, Yilong Lin, Tianyang Han, and Le Ou-Yang. "DEMOC: a deep embedded multi-omics learning approach for clustering single-cell CITE-seq data." Briefings in Bioinformatics, August 31, 2022. http://dx.doi.org/10.1093/bib/bbac347.

Abstract:

Advances in single-cell RNA sequencing (scRNA-seq) technologies have provided an unprecedented opportunity for cell-type identification. As clustering is an effective strategy towards cell-type identification, various computational approaches have been proposed for clustering scRNA-seq data. Recently, with the emergence of cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), the cell surface expression of specific proteins and the RNA expression on the same cell can be captured, which provides more comprehensive information for cell analysis. However, existing single-cell clustering algorithms are mainly designed for single-omic data, and have difficulties in handling multi-omics data with diverse characteristics efficiently. In this study, we propose a novel deep embedded multi-omics clustering with collaborative training (DEMOC) model to perform joint clustering on CITE-seq data. Our model can take into account the characteristics of transcriptomic and proteomic data, and make use of the consistent and complementary information provided by different data sources effectively. Experimental results on two real CITE-seq datasets demonstrate that our DEMOC model not only outperforms state-of-the-art single-omic clustering methods, but also achieves better and more stable performance than existing multi-omics clustering methods. We also apply our model to three scRNA-seq datasets to assess its performance in rare cell-type identification, novel cell-subtype detection and cellular heterogeneity analysis. Experimental results illustrate the effectiveness of our model in discovering the underlying patterns of data.
