Journal articles on the topic 'Novel Python package'

Consult the top 50 journal articles for your research on the topic 'Novel Python package.'

1

Biová, Jana, Nicholas Dietz, Yen On Chan, Trupti Joshi, Kristin Bilyeu, and Mária Škrabišová. "AccuCalc: A Python Package for Accuracy Calculation in GWAS." Genes 14, no. 1 (January 1, 2023): 123. http://dx.doi.org/10.3390/genes14010123.

Abstract:
The genome-wide association study (GWAS) is a popular genomic approach that identifies genomic regions associated with a phenotype and, thus, aims to discover causative mutations (CM) in the genes underlying the phenotype. However, GWAS discoveries are limited by many factors and typically identify associated genomic regions without the further ability to compare the viability of candidate genes and actual CMs. Therefore, the current methodology is limited to CM identification. In our recent work, we presented a novel approach to an empowered “GWAS to Genes” strategy that we named Synthetic phenotype to causative mutation (SP2CM). We established this strategy to identify CMs in soybean genes and developed a web-based tool for accuracy calculation (AccuTool) for a reference panel of soybean accessions. Here, we describe our further development of the tool that extends its utilization for other species and named it AccuCalc. We enhanced the tool for the analysis of datasets with a low-frequency distribution of a rare phenotype by automated formatting of a synthetic phenotype and added another accuracy-based GWAS evaluation criterion to the accuracy calculation. We designed AccuCalc as a Python package for GWAS data analysis for any user-defined species-independent variant calling format (vcf) or HapMap format (hmp) as input data. AccuCalc saves analysis outputs in user-friendly tab-delimited formats and also offers visualization of the GWAS results as Manhattan plots accentuated by accuracy. Under the hood of Python, AccuCalc is publicly available and, thus, can be used conveniently for the SP2CM strategy utilization for every species.
2

Staudt, Christian L., Aleksejs Sazonovs, and Henning Meyerhenke. "NetworKit: A tool suite for large-scale complex network analysis." Network Science 4, no. 4 (December 2016): 508–30. http://dx.doi.org/10.1017/nws.2016.20.

Abstract:
We introduce NetworKit, an open-source software package for analyzing the structure of large complex networks. Appropriate algorithmic solutions are required to handle increasingly common large graph data sets containing up to billions of connections. We describe the methodology applied to develop scalable solutions to network analysis problems, including techniques like parallelization, heuristics for computationally expensive problems, efficient data structures, and modular software architecture. Our goal for the software is to package results of our algorithm engineering efforts and put them into the hands of domain experts. NetworKit is implemented as a hybrid combining the kernels written in C++ with a Python frontend, enabling integration into the Python ecosystem of tested tools for data analysis and scientific computing. The package provides a wide range of functionality (including common and novel analytics algorithms and graph generators) and does so via a convenient interface. In an experimental comparison with related software, NetworKit shows the best performance on a range of typical analysis tasks.
3

Bodory, Hugo, Hannah Busshoff, and Michael Lechner. "High Resolution Treatment Effects Estimation: Uncovering Effect Heterogeneities with the Modified Causal Forest." Entropy 24, no. 8 (July 28, 2022): 1039. http://dx.doi.org/10.3390/e24081039.

Abstract:
There is great demand for inferring causal effect heterogeneity and for open-source statistical software, which is readily available for practitioners. The mcf package is an open-source Python package that implements Modified Causal Forest (mcf), a causal machine learner. We replicate three well-known studies in the fields of epidemiology, medicine, and labor economics to demonstrate that our mcf package produces aggregate treatment effects, which align with previous results, and in addition, provides novel insights on causal effect heterogeneity. For all resolutions of treatment effects estimation, which can be identified, the mcf package provides inference. We conclude that the mcf constitutes a practical and extensive tool for a modern causal heterogeneous effects analysis.
4

Jadoul, Yannick, Diandra Duengen, and Andrea Ravignani. "Parselmouth for bioacoustics: Integrating Praat into the Python scientific ecosystem." Journal of the Acoustical Society of America 151, no. 4 (April 2022): A29. http://dx.doi.org/10.1121/10.0010550.

Abstract:
As collected datasets become larger and computational analyses become ever more complex, the efficient processing of bioacoustical data is a crucial problem to tackle. Often, during data exploration and analysis, different research software packages need to be flexibly combined in a script. A typical example of such a multi-faceted workflow is the extraction of acoustic parameters from a recording, which are then plotted and tested for statistical significance. Parselmouth is an open-source Python library for Praat, a widely used acoustics and phonetics software package implementing acoustic algorithms and analyses regularly adopted in bioacoustics research. Parselmouth's goal is to provide a full-fledged Python library that integrates efficiently into the larger Python ecosystem. This way, it not only simplifies the application of Praat’s functionality within a typical data analysis workflow but also enables the creation of new experimental tools. Parselmouth’s contribution to bioacoustics research can be highlighted through concrete examples of studies we have conducted, e.g., on vocal flexibility in seals. Moreover, the integration of Praat’s functionality into a general-purpose programming language allows for novel, more complex experimental setups: for example, the integration of Parselmouth into a custom-created software tool permits live-monitoring and instantly evaluating the vocal development during animal training.
5

Romano, Joseph D., Trang T. Le, William La Cava, John T. Gregg, Daniel J. Goldberg, Praneel Chakraborty, Natasha L. Ray, Daniel Himmelstein, Weixuan Fu, and Jason H. Moore. "PMLB v1.0: an open-source dataset collection for benchmarking machine learning methods." Bioinformatics 38, no. 3 (October 22, 2021): 878–80. http://dx.doi.org/10.1093/bioinformatics/btab727.

Abstract:
Motivation: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows. Results: This release of PMLB (Penn Machine Learning Benchmarks) provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community. Availability and implementation: PMLB is available at https://github.com/EpistasisLab/pmlb. Python and R interfaces for PMLB can be installed through the Python Package Index and Comprehensive R Archive Network, respectively.
6

Dallilar, Y., S. von Fellenberg, M. Bauboeck, P. T. de Zeeuw, A. Drescher, F. Eisenhauer, R. Genzel, et al. "Flaremodel: An open-source Python package for one-zone numerical modelling of synchrotron sources." Astronomy & Astrophysics 658 (February 2022): A111. http://dx.doi.org/10.1051/0004-6361/202142458.

Abstract:
Synchrotron processes, the radiative processes associated with the interaction of energetic charged particles with magnetic field, are of interest in many areas in astronomy, from the interstellar medium to extreme environments near compact objects. Consequently, observations of synchrotron sources carry information on the physical properties of the sources themselves and those of their close vicinity. In recent years, novel observations of such sources with multi-wavelength collaborations reveal complex features and peculiarities, especially near black holes. Exploring the nature of these sources in more detail necessitates numerical tools complementary to analytical one-zone modelling efforts. In this paper, we introduce an open-source Python package tailored to this purpose, FLAREMODEL. The core of the code consists of low-level utility functions to describe physical processes relevant to synchrotron sources, which are written in C for performance and parallelised with OpenMP for scalability. The Python interface provides access to these functions and built-in source models are provided as a guidance. At the same time, the modular design of the code and the generic nature of these functions enable users to build a variety of source models applicable to many astrophysical synchrotron sources. We describe our methodology and the structure of our code along with selected examples demonstrating capabilities and options for future modelling efforts.
7

Bucur, C. "Artificial intelligence driven speed controller for DC motor in series." Scientific Bulletin of Naval Academy XIV, no. 2 (December 15, 2021): 83–88. http://dx.doi.org/10.21279/1454-864x-21-i2-007.

Abstract:
Recently, a lot of work has been done to implement artificial intelligence controllers in the field of electrical motors. This paper presents a novel speed controller, developed through reinforcement learning techniques, applied to series DC motors. We emphasize the ease of deploying the developed controller on available off-the-shelf hardware for industrial use. We used the open-source Python package gym-electric-motor [1] for environment setup, the PyTorch framework for developing the controller, and .NET for performance evaluation.
8

Gruenstaeudl, Michael. "annonex2embl: automatic preparation of annotated DNA sequences for bulk submissions to ENA." Bioinformatics 36, no. 12 (March 30, 2020): 3841–48. http://dx.doi.org/10.1093/bioinformatics/btaa209.

Abstract:
Motivation: The submission of annotated sequence data to public sequence databases constitutes a central pillar in biological research. The surge of novel DNA sequences awaiting database submission due to the application of next-generation sequencing has increased the need for software tools that facilitate bulk submissions. This need has yet to be met with the concurrent development of tools to automate the preparatory work preceding such submissions. Results: The author introduces annonex2embl, a Python package that automates the preparation of complete sequence flatfiles for large-scale sequence submissions to the European Nucleotide Archive. The tool enables the conversion of DNA sequence alignments that are co-supplied with sequence annotations and metadata to submission-ready flatfiles. Among other features, the software automatically accounts for length differences among the input sequences while maintaining correct annotations, automatically interlaces metadata to each record and displays a design suitable for easy integration into bioinformatic workflows. As proof of its utility, annonex2embl is employed in preparing a dataset of more than 1500 fungal DNA sequences for database submission. Availability and implementation: annonex2embl is freely available via the Python package index at http://pypi.python.org/pypi/annonex2embl. Supplementary information: Supplementary data are available at Bioinformatics online.
9

Heybrock, Simon, Owen Arnold, Igor Gudich, Daniel Nixon, and Neil Vaytet. "Scipp: Scientific data handling with labeled multi-dimensional arrays for C++ and Python." Journal of Neutron Research 22, no. 2-3 (October 20, 2020): 169–81. http://dx.doi.org/10.3233/jnr-190131.

Abstract:
scipp is heavily inspired by the Python library xarray. It enriches raw NumPy-like multi-dimensional arrays of data by adding named dimensions and associated coordinates. Multiple arrays are combined into datasets. On top of this, scipp introduces (i) implicit handling of physical units, (ii) implicit propagation of uncertainties, (iii) support for histograms, i.e., bin-edge coordinate axes, which exceed the data’s dimension extent by one, and (iv) support for event data. In conjunction these features enable a more natural and more concise user experience. The combination of named dimensions, coordinates, and units helps to drastically reduce the risk for programming errors. The core of scipp is written in C++ to open opportunities for performance improvements that a Python-based solution would not allow for. On top of the C++ core, scipp’s Python components provide functionality for plotting and content representations, e.g., for use in Jupyter Notebooks. While none of scipp’s concepts in isolation is novel per-se, we are not aware of any project combining all of these aspects in a single coherent software package.
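
The combination of named dimensions, units, and uncertainty propagation described in this abstract can be illustrated with a minimal sketch. This is a toy written with the Python standard library only; it is not scipp's API, and the class name LabeledArray and its fields are hypothetical. It assumes a single named dimension, units stored as plain strings, and independent inputs, so the variances of a sum simply add.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LabeledArray:
    """Toy 1-D labeled array (hypothetical; NOT scipp's API)."""

    dim: str                      # name of the single dimension
    unit: str                     # physical unit, kept as a plain string
    values: Tuple[float, ...]
    variances: Tuple[float, ...]  # squared standard deviations

    def __add__(self, other: "LabeledArray") -> "LabeledArray":
        # Refuse to combine data whose labels or units disagree.
        if self.dim != other.dim:
            raise ValueError(f"dimension mismatch: {self.dim} vs {other.dim}")
        if self.unit != other.unit:
            raise ValueError(f"unit mismatch: {self.unit} vs {other.unit}")
        if len(self.values) != len(other.values):
            raise ValueError("length mismatch")
        # Assumption: inputs are independent, so variances of a sum add.
        return LabeledArray(
            dim=self.dim,
            unit=self.unit,
            values=tuple(a + b for a, b in zip(self.values, other.values)),
            variances=tuple(
                a + b for a, b in zip(self.variances, other.variances)
            ),
        )
```

Adding two arrays that share the dimension "x" and the unit "m" returns a new array with summed values and summed variances, while adding an array in "m" to one in "s" raises a ValueError instead of silently producing a meaningless number. This is the kind of programming error the abstract says labels and units help to prevent.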
10

Reininghaus, Maximilian, and Ralf Ulrich. "CORSIKA 8 – Towards a modern framework for the simulation of extensive air showers." EPJ Web of Conferences 210 (2019): 02011. http://dx.doi.org/10.1051/epjconf/201921002011.

Abstract:
Current and future challenges in astroparticle physics require novel simulation tools to achieve higher precision and more flexibility. For three decades the FORTRAN version of CORSIKA served the community in an excellent way. However, the effort to maintain and further develop this complex package is getting increasingly difficult. To overcome existing limitations, and designed as a very open platform for all particle cascade simulations in astroparticle physics, we are developing CORSIKA 8 based on modern C++ and Python concepts. Here, we give a brief status report of the project.
11

Choudhary, Saket. "pysradb: A Python package to query next-generation sequencing metadata and data from NCBI Sequence Read Archive." F1000Research 8 (April 23, 2019): 532. http://dx.doi.org/10.12688/f1000research.18676.1.

Abstract:
The NCBI Sequence Read Archive (SRA) is the primary archive of next-generation sequencing datasets. SRA makes metadata and raw sequencing data available to the research community to encourage reproducibility and to provide avenues for testing novel hypotheses on publicly available data. However, methods to programmatically access this data are limited. We introduce the Python package, pysradb, which provides a collection of command line methods to query and download metadata and data from SRA, utilizing the curated metadata database available through the SRAdb project. We demonstrate the utility of pysradb on multiple use cases for searching and downloading SRA datasets. It is available freely at https://github.com/saketkc/pysradb.
12

Radu, Cristian, Ioana D. Vlaicu, and Andrei C. Kuncser. "A new method for obtaining the magnetic shape anisotropy directly from electron tomography images." Beilstein Journal of Nanotechnology 13 (July 5, 2022): 590–98. http://dx.doi.org/10.3762/bjnano.13.51.

Abstract:
A new methodology to obtain magnetic information on magnetic nanoparticle (MNP) systems via electron tomography techniques is reported in this work. The new methodology is implemented in an under-development software package called Magn3t, written in Python and C++. A novel image-filtering technique that reduces the highly undesired diffraction effects in the tomography tilt-series has been also developed in order to increase the reliability of the correlations between morphology and magnetism. Using the Magn3t software, the magnetic shape anisotropy magnitude and direction of magnetite nanoparticles has been extracted for the first time directly from transmission electron tomography.
13

Tice, Alexander K., David Žihala, Tomáš Pánek, Robert E. Jones, Eric D. Salomaki, Serafim Nenarokov, Fabien Burki, et al. "PhyloFisher: A phylogenomic package for resolving eukaryotic relationships." PLOS Biology 19, no. 8 (August 6, 2021): e3001365. http://dx.doi.org/10.1371/journal.pbio.3001365.

Abstract:
Phylogenomic analyses of hundreds of protein-coding genes aimed at resolving phylogenetic relationships is now a common practice. However, no software currently exists that includes tools for dataset construction and subsequent analysis with diverse validation strategies to assess robustness. Furthermore, there are no publicly available high-quality curated databases designed to assess deep (>100 million years) relationships in the tree of eukaryotes. To address these issues, we developed an easy-to-use software package, PhyloFisher (https://github.com/TheBrownLab/PhyloFisher), written in Python 3. PhyloFisher includes a manually curated database of 240 protein-coding genes from 304 eukaryotic taxa covering known eukaryotic diversity, a novel tool for ortholog selection, and utilities that will perform diverse analyses required by state-of-the-art phylogenomic investigations. Through phylogenetic reconstructions of the tree of eukaryotes and of the Saccharomycetaceae clade of budding yeasts, we demonstrate the utility of the PhyloFisher workflow and the provided starting database to address phylogenetic questions across a large range of evolutionary time points for diverse groups of organisms. We also demonstrate that undetected paralogy can remain in phylogenomic “single-copy orthogroup” datasets constructed using widely accepted methods such as all vs. all BLAST searches followed by Markov Cluster Algorithm (MCL) clustering and application of automated tree pruning algorithms. Finally, we show how the PhyloFisher workflow helps detect inadvertent paralog inclusions, allowing the user to make more informed decisions regarding orthology assignments, leading to a more accurate final dataset.
14

Chen, Danze, Qianqian Zhao, Leiming Jiang, Shuaiyuan Liao, Zhigang Meng, and Jianzhen Xu. "TGStools: A Bioinformatics Suit to Facilitate Transcriptome Analysis of Long Reads from Third Generation Sequencing Platform." Genes 10, no. 7 (July 10, 2019): 519. http://dx.doi.org/10.3390/genes10070519.

Abstract:
Recent analyses show that transcriptome sequencing can be utilized as a diagnostic tool for rare Mendelian diseases. The third generation sequencing de novo detects long reads of thousands of base pairs, thus greatly expanding the isoform discovery and identification of novel long noncoding RNAs. In this study, we developed TGStools, a bioinformatics suite to facilitate routine tasks such as characterizing full-length transcripts, detecting shifted types of alternative splicing, and long noncoding RNAs (lncRNAs) identification in transcriptome analysis. It also prioritizes the transcripts with a visualization framework that automatically integrates rich annotation with known genomic features. TGStools is a Python package freely available at Github.
15

Domanskyi, Sergii, Alex Hakansson, Thomas J. Bertus, Giovanni Paternostro, and Carlo Piermarocchi. "Digital Cell Sorter (DCS): a cell type identification, anomaly detection, and Hopfield landscapes toolkit for single-cell transcriptomics." PeerJ 9 (January 13, 2021): e10670. http://dx.doi.org/10.7717/peerj.10670.

Abstract:
Motivation: Analysis of single-cell RNA sequencing (scRNA-seq) typically consists of different steps including quality control, batch correction, clustering, cell identification and characterization, and visualization. The amount of scRNA-seq data is growing extremely fast, and novel algorithmic approaches improving these steps are key to extract more biological information. Here, we introduce: (i) two methods for automatic cell type identification (i.e., without expert curator) based on a voting algorithm and a Hopfield classifier, (ii) a method for cell anomaly quantification based on isolation forest, and (iii) a tool for the visualization of cell phenotypic landscapes based on Hopfield energy-like functions. These new approaches are integrated in a software platform that includes many other state-of-the-art methodologies and provides a self-contained toolkit for scRNA-seq analysis. Results: We present a suite of software elements for the analysis of scRNA-seq data. This Python-based open source software, Digital Cell Sorter (DCS), consists of an extensive toolkit of methods for scRNA-seq analysis. We illustrate the capability of the software using data from large datasets of peripheral blood mononuclear cells (PBMC), as well as plasma cells of bone marrow samples from healthy donors and multiple myeloma patients. We test the novel algorithms by evaluating their ability to deconvolve cell mixtures and detect small numbers of anomalous cells in PBMC data. Availability: The DCS toolkit is available for download and installation through the Python Package Index (PyPI). The software can be deployed using the Python import function following installation. Source code is also available for download on Zenodo: DOI 10.5281/zenodo.2533377. Supplementary information: Supplemental Materials are available at PeerJ online.
16

Song, Sungyoon, Sungchul Hwang, Baekkyeong Ko, Seungtae Cha, and Gilsoo Jang. "Novel Transient Power Control Schemes for BTB VSCs to Improve Angle Stability." Applied Sciences 8, no. 8 (August 11, 2018): 1350. http://dx.doi.org/10.3390/app8081350.

Abstract:
This paper proposes two novel power control strategies to improve the angle stability of generators using a Back-to-Back (BTB) system-based voltage source converter (VSC). The proposed power control strategies have two communication systems: a bus angle monitoring system and a special protection system (SPS), respectively. The first power control strategy can emulate the behaviour of the ac transmission to improve the angle stability while supporting the ac voltage at the primary level of the control structure. The second power control scheme uses an SPS signal to contribute stability to the power system under severe contingencies involving the other generators. The results for the proposed control scheme were validated using the PSS/E software package with a sub-module written in the Python language, and the simple assistant power control with two communication systems is shown to improve the angle stability. In conclusion, BTB VSCs can contribute their power control strategies to ac grid in addition to offering several existing advantages, which makes them applicable for use in the commensurate protection of large ac grid.
17

Kreklow, Jennifer. "Facilitating radar precipitation data processing, assessment and analysis: a GIS-compatible python approach." Journal of Hydroinformatics 21, no. 4 (May 15, 2019): 652–70. http://dx.doi.org/10.2166/hydro.2019.048.

Abstract:
A review of existing tools for radar data processing revealed a lack of open source software for automated processing, assessment and analysis of weather radar composites. The ArcGIS-compatible Python package radproc attempts to reduce this gap. Radproc provides an automated raw data processing workflow for nationwide, freely available German weather radar climatology (RADKLIM) and operational (RADOLAN) composite products. Raw data are converted into a uniform HDF5 file structure used by radproc's analysis and data quality assessment functions. This enables transferability of the developed analysis and export functionality to other gridded or point-scale precipitation data. Thus, radproc can be extended by additional import routines to support any other German or non-German precipitation dataset. Analysis methods include temporal aggregations, detection of heavy rainfall and an automated processing of rain gauge point data into the same HDF5 format for comparison to gridded radar data. A set of functions for data exchange with ArcGIS allows for visualisation and further geospatial analysis. The application on a 17-year time series of hourly RADKLIM data showed that radproc greatly facilitates radar data processing and analysis by avoiding manual programming work and helps to lower the barrier for non-specialists to work with these novel radar climatology datasets.
18

Bakurov, Illya, Marco Buzzelli, Mauro Castelli, Leonardo Vanneschi, and Raimondo Schettini. "General Purpose Optimization Library (GPOL): A Flexible and Efficient Multi-Purpose Optimization Library in Python." Applied Sciences 11, no. 11 (May 23, 2021): 4774. http://dx.doi.org/10.3390/app11114774.

Abstract:
Several interesting libraries for optimization have been proposed. Some focus on individual optimization algorithms, or limited sets of them, and others focus on limited sets of problems. Frequently, the implementation of one of them does not precisely follow the formal definition, and they are difficult to personalize and compare. This makes it difficult to perform comparative studies and propose novel approaches. In this paper, we propose to solve these issues with the General Purpose Optimization Library (GPOL): a flexible and efficient multipurpose optimization library that covers a wide range of stochastic iterative search algorithms, through which flexible and modular implementation can allow for solving many different problem types from the fields of continuous and combinatorial optimization and supervised machine learning problem solving. Moreover, the library supports full-batch and mini-batch learning and allows carrying out computations on a CPU or GPU. The package is distributed under an MIT license. Source code, installation instructions, demos and tutorials are publicly available in our code hosting platform (the reference is provided in the Introduction).
19

Cheon, Minjong, Hyodong Ha, Ook Lee, and Changbae Mun. "A Novel Hybrid Deep Learning Approach to Code Generation Aimed at Mitigating the Real-Time Network Attack in the Mobile Experiment Via GRU-LM and Word2vec." Mobile Information Systems 2022 (September 29, 2022): 1–11. http://dx.doi.org/10.1155/2022/3999868.

Abstract:
As the use of devices in mobile environments increases, network attacks such as DDoS have a malicious attempt to flood the network's regular traffic to overload the target and surrounding infrastructure. This research proposed machine learning and deep learning approaches to dealing with DDoS attacks, and the results are described as follows. First, this research successfully detected DDoS attacks through an LGBM with a 100% accuracy score. Second, the proposed model (GRU-LM), which consists of a trained Word2vec layer with the Python dataset, is far more effective than the standard GRU model. Since Python is quite similar to English, language model-based GRU yields superior results. Various preprocessing steps were performed through the NLTK package, and each number was assigned to the tokenized one for constructing the GRU language model. The result reveals that the proposed model achieved an accuracy score of 87% for predicting the following words in the source code, while the rest achieved below 30% accuracy. This conclusion is significant because its relatively simple and light structure overcomes tradeoff problems between time and accuracy and is adaptable to the mobile setting. Discovering traffic patterns for the underlying data of DDOS assaults and retrieving them using statistical data analysis is the value of this research. Furthermore, since public cloud application vulnerability assaults are rising due to expanding cloud infrastructure, this finding could be used in such attacks.
20

Brennan, S. J., and M. Fraser. "The Automated Photometry of Transients pipeline (AutoPhOT)." Astronomy & Astrophysics 667 (November 2022): A62. http://dx.doi.org/10.1051/0004-6361/202243067.

Abstract:
We present the Automated Photometry of Transients (AutoPhOT) package, a novel automated pipeline that is designed for rapid, publication-quality photometry of astronomical transients. AutoPhOT is built from the ground up using Python 3 – with no dependencies on legacy software. Capabilities of AutoPhOT include aperture and point-spread-function photometry, template subtraction, and calculation of limiting magnitudes through artificial source injection. AutoPhOT is also capable of calibrating photometry against either survey catalogues, or using a custom set of local photometric standards, and is designed primarily for ground-based optical and infrared images. We show that both aperture and point-spread-function photometry from AutoPhOT is consistent with commonly used software, for example, DAOPHOT, and also demonstrate that AutoPhOT can reproduce published light curves for a selection of transients with minimal human intervention.
21

Jain, Manju, C. S. Rai, and Jai Jain. "A Novel Method for Differential Prognosis of Brain Degenerative Diseases Using Radiomics-Based Textural Analysis and Ensemble Learning Classifiers." Computational and Mathematical Methods in Medicine 2021 (August 5, 2021): 1–13. http://dx.doi.org/10.1155/2021/7965677.

Abstract:
We propose a novel approach to develop a computer-aided decision support system for radiologists to help them classify brain degeneration process as physiological or pathological, aiding in early prognosis of brain degenerative diseases. Our approach applies computational and mathematical formulations to extract quantitative information from biomedical images. Our study explores the longitudinal OASIS-3 dataset, which consists of 4096 brain MRI scans collected over a period of 15 years. We perform feature extraction using Pyradiomics python package that quantizes brain MRI images using different texture analysis methods. Studies indicate that Radiomics has rarely been used for analysis of brain cognition; hence, our study is also a novel effort to determine the efficiency of Radiomics features extracted from structural MRI scans for classification of brain degenerative diseases and to create awareness about Radiomics. For classification tasks, we explore various ensemble learning classification algorithms such as random forests, bagging-based ensemble classifiers, and gradient-boosted ensemble classifiers such as XGBoost and AdaBoost. Such ensemble learning classifiers have not been used for biomedical image classification. We also propose a novel texture analysis matrix, Decreasing Gray-Level Matrix or DGLM. The features extracted from this filter helped to further improve the accuracy of our decision support system. The proposed system based on XGBoost ensemble learning classifiers achieves an accuracy of 97.38%, with sensitivity 99.82% and specificity 97.01%.
22

Kern, Fabian, Tobias Fehlmann, Jeffrey Solomon, Louisa Schwed, Nadja Grammes, Christina Backes, Kendall Van Keuren-Jensen, David Wesley Craig, Eckart Meese, and Andreas Keller. "miEAA 2.0: integrating multi-species microRNA enrichment analysis and workflow management systems." Nucleic Acids Research 48, W1 (May 6, 2020): W521–W528. http://dx.doi.org/10.1093/nar/gkaa309.

Abstract:
Gene set enrichment analysis has become one of the most frequently used applications in molecular biology research. Originally developed for gene sets, the same statistical principles are now available for all omics types. In 2016, we published the miRNA enrichment analysis and annotation tool (miEAA) for human precursor and mature miRNAs. Here, we present miEAA 2.0, supporting miRNA input from ten frequently investigated organisms. To facilitate inclusion of miEAA in workflow systems, we implemented an Application Programming Interface (API). Users can perform miRNA set enrichment analysis using either the web interface, a dedicated Python package, or custom remote clients. Moreover, the number of category sets was raised by an order of magnitude. We implemented novel categories like annotation confidence level or localisation in biological compartments. In combination with the miRBase miRNA-version and miRNA-to-precursor converters, miEAA supports research settings where older releases of miRBase are in use. The web server also offers novel comprehensive visualizations such as heatmaps and running sum curves with background distributions. We demonstrate the new features with case studies for human kidney cancer, a biomarker study on Parkinson’s disease from the PPMI cohort, and a mouse model for breast cancer. The tool is freely accessible at: https://www.ccb.uni-saarland.de/mieaa2.
23

Paardekooper, Sijme-Jan, Colin P. McNally, and Francesco Lovascio. "Polydisperse streaming instability – II. Methods for solving the linear stability problem." Monthly Notices of the Royal Astronomical Society 502, no. 2 (January 19, 2021): 1579–95. http://dx.doi.org/10.1093/mnras/stab111.

Abstract:
Occurring in protoplanetary discs composed of dust and gas, streaming instabilities are a favoured mechanism to drive the formation of planetesimals. The polydisperse streaming instability is a generalization of the streaming instability to a continuum of dust sizes. This second paper in the series provides a more in-depth derivation of the governing equations and presents novel numerical methods for solving the associated linear stability problem. In addition to the direct discretization of the eigenproblem at second order introduced in the previous paper, a new technique based on numerically reducing the system of integral equations to a complex polynomial combined with root finding is found to yield accurate results at much lower computational cost. A related method for counting roots of the dispersion relation inside a contour without locating those roots is also demonstrated. Applications of these methods show they can reproduce and exceed the accuracy of previous results in the literature, and new benchmark results are provided. Implementations of the methods described are made available in an accompanying Python package, psitools.
24

Ochoa, Rodrigo, and Pilar Cossio. "PepFun: Open Source Protocols for Peptide-Related Computational Analysis." Molecules 26, no. 6 (March 16, 2021): 1664. http://dx.doi.org/10.3390/molecules26061664.

Abstract:
Peptide research has grown in recent years owing to the applications of peptides as biomarkers, therapeutic alternatives, or antigenic subunits in vaccines. The implementation of computational resources has facilitated the identification of novel sequences, the prediction of properties, and the modelling of structures. However, there is still a lack of open source protocols that enable their straightforward analysis. Here, we present PepFun, a compilation of bioinformatics and cheminformatics functionalities that are easy to implement and customize for studying peptides at different levels: sequence, structure and their interactions with proteins. PepFun enables calculating multiple characteristics for massive sets of peptide sequences, and obtaining different structural observables derived from protein-peptide complexes. In addition, random or guided library design of peptide sequences can be customized for screening campaigns. The package is written in Python, based on built-in functions and methods available in the open source projects BioPython and RDKit. We present two tutorials where we tested peptide binders of the MHC class II and the Granzyme B protease.
25

Cui, Kaiming, Junjie Liu, Fabo Feng, and Jifeng Liu. "Identify Light-curve Signals with Deep Learning Based Object Detection Algorithm. I. Transit Detection." Astronomical Journal 163, no. 1 (December 17, 2021): 23. http://dx.doi.org/10.3847/1538-3881/ac3482.

Abstract:
Deep learning techniques have been well explored in the transiting exoplanet field; however, previous work mainly focuses on classification and inspection. In this work, we develop a novel detection algorithm based on a well-proven object detection framework in the computer vision field. Through training the network on the light curves of the confirmed Kepler exoplanets, our model yields about 90% precision and recall for identifying transits with signal-to-noise ratio higher than 6 (with the confidence threshold set to 0.6). Given a slightly lower confidence threshold, recall can exceed 95%. We also transfer the trained model to the TESS data and obtain similar performance. The results of our algorithm match the intuition of human visual perception and make it useful for finding single-transiting candidates. Moreover, the parameters of the output bounding boxes can also help to find multiplanet systems. Our network and detection functions are implemented in the Deep-Transit toolkit, which is an open-source Python package hosted on GitHub and PyPI.
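The trade-off mentioned in this abstract, where lowering the confidence threshold raises recall, can be illustrated with a minimal sketch. This is not code from the Deep-Transit toolkit: the function name `precision_recall`, the data layout, and the example detections are all hypothetical, and the sketch assumes each correct detection matches a distinct true transit.

```python
def precision_recall(detections, n_true, threshold):
    """Precision and recall of detections kept at a confidence threshold.

    detections: list of (confidence, is_correct) pairs, one per predicted box.
    n_true:     number of real transits in the data (assumed known).
    Assumes each correct detection corresponds to a different real transit.
    """
    kept = [is_correct for confidence, is_correct in detections
            if confidence >= threshold]
    true_positives = sum(kept)
    # Convention chosen for this sketch: precision is 1.0 when nothing is kept.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / n_true if n_true else 0.0
    return precision, recall


# Hypothetical detections for a light curve containing 4 real transits.
example_detections = [(0.9, True), (0.8, True), (0.7, False),
                      (0.5, True), (0.3, False)]

strict = precision_recall(example_detections, n_true=4, threshold=0.6)
loose = precision_recall(example_detections, n_true=4, threshold=0.4)
```

In this toy data the lower threshold keeps one more correct detection, so recall rises; in general precision can move in either direction.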
26

Alexandre, Leonardo, Rafael S. Costa, and Rui Henriques. "DISA tool: Discriminative and informative subspace assessment with categorical and numerical outcomes." PLOS ONE 17, no. 10 (October 19, 2022): e0276253. http://dx.doi.org/10.1371/journal.pone.0276253.

Abstract:
Pattern discovery and subspace clustering play a central role in the biological domain, supporting for instance putative regulatory module discovery from omics data for both descriptive and predictive ends. In the presence of target variables (e.g. phenotypes), regulatory patterns should further satisfy delineate discriminative power properties, well-established in the presence of categorical outcomes, yet largely disregarded for numerical outcomes, such as risk profiles and quantitative phenotypes. DISA (Discriminative and Informative Subspace Assessment), a Python software package, is proposed to evaluate patterns in the presence of numerical outcomes using well-established measures together with a novel principle able to statistically assess the correlation gain of the subspace against the overall space. Results confirm the possibility to soundly extend discriminative criteria towards numerical outcomes without the drawbacks well-associated with discretization procedures. Results from four case studies confirm the validity and relevance of the proposed methods, further unveiling critical directions for research on biotechnology and biomedicine. Availability: DISA is freely available at https://github.com/JupitersMight/DISA under the MIT license.
27

Rivas-Barragan, Daniel, Sarah Mubeen, Francesc Guim Bernat, Martin Hofmann-Apitius, and Daniel Domingo-Fernández. "Drug2ways: Reasoning over causal paths in biological networks for drug discovery." PLOS Computational Biology 16, no. 12 (December 2, 2020): e1008464. http://dx.doi.org/10.1371/journal.pcbi.1008464.

Abstract:
Elucidating the causal mechanisms responsible for disease can reveal potential therapeutic targets for pharmacological intervention and, accordingly, guide drug repositioning and discovery. In essence, the topology of a network can reveal the impact a drug candidate may have on a given biological state, leading the way for enhanced disease characterization and the design of advanced therapies. Network-based approaches, in particular, are highly suited for these purposes as they hold the capacity to identify the molecular mechanisms underlying disease. Here, we present drug2ways, a novel methodology that leverages multimodal causal networks for predicting drug candidates. Drug2ways implements an efficient algorithm which reasons over causal paths in large-scale biological networks to propose drug candidates for a given disease. We validate our approach using clinical trial information and demonstrate how drug2ways can be used for multiple applications to identify: i) single-target drug candidates, ii) candidates with polypharmacological properties that can optimize multiple targets, and iii) candidates for combination therapy. Finally, we make drug2ways available to the scientific community as a Python package that enables conducting these applications on multiple standard network formats.
28

Heller, David, and Martin Vingron. "SVIM: structural variant identification using mapped long reads." Bioinformatics 35, no. 17 (January 21, 2019): 2907–15. http://dx.doi.org/10.1093/bioinformatics/btz041.

Abstract:
Motivation: Structural variants are defined as genomic variants larger than 50 bp. They have been shown to affect more bases in any given genome than single-nucleotide polymorphisms or small insertions and deletions. Additionally, they have great impact on human phenotype and diversity and have been linked to numerous diseases. Due to their size and association with repeats, they are difficult to detect by shotgun sequencing, especially when based on short reads. Long read, single-molecule sequencing technologies like those offered by Pacific Biosciences or Oxford Nanopore Technologies produce reads with a length of several thousand base pairs. Despite the higher error rate and sequencing cost, long-read sequencing offers many advantages for the detection of structural variants. Yet, available software tools still do not fully exploit the possibilities. Results: We present SVIM, a tool for the sensitive detection and precise characterization of structural variants from long-read data. SVIM consists of three components for the collection, clustering and combination of structural variant signatures from read alignments. It discriminates five different variant classes including similar types, such as tandem and interspersed duplications and novel element insertions. SVIM is unique in its capability of extracting both the genomic origin and destination of duplications. It compares favorably with existing tools in evaluations on simulated data and real datasets from Pacific Biosciences and Nanopore sequencing machines. Availability and implementation: The source code and executables of SVIM are available on GitHub: github.com/eldariont/svim. SVIM has been implemented in Python 3 and published on bioconda and the Python Package Index. Supplementary information: Supplementary data are available at Bioinformatics online.
29

Collonge, M., P. Busca, P. Fajardo, and M. Williams. "Monte Carlo simulations for XIDer, a novel digital integration X-ray detector for the next generation of synchrotron radiation sources." Journal of Instrumentation 17, no. 01 (January 1, 2022): C01037. http://dx.doi.org/10.1088/1748-0221/17/01/c01037.

Abstract:
This work presents the first simulation results of the incremental digital integration readout, a charge-integrating front-end scheme with in-pixel digitisation and accumulation. This novel readout concept is at the core of the XIDer (X-ray Integrating Detector) project, which aims to design 2D pixelated X-ray detectors optimised for high energy scattering and diffraction applications for the next generation of synchrotron radiation sources such as the ESRF Extremely Brilliant Source (EBS). The digital integration readout and the XIDer detector open the possibilities for high-duty-cycle operation under very high photon flux, fast frame-rate and high dynamic range with single-photon sensitivity in the 30–100 keV energy range. The readout method allows for noise-free effective X-ray detection. The digital integration concept is currently under investigation to evaluate the impact of main critical design parameters to identify the strengths and weaknesses of the readout scheme and consequently to propose refinements in the final implementation. Simulations have been performed with a dedicated Monte Carlo simulation tool, X-DECIMO, a modular Python package designed to recreate the complete detection chain of X-ray detectors for synchrotron radiation experiments. Losses and non-linearities of the readout scheme are simulated and quantified. In addition to presenting simulation results for this novel readout scheme, this work underlines the potential of the approach and some of its limitations.
30

Shen, Zeyang, Marten A. Hoeksema, Zhengyu Ouyang, Christopher Benner, and Christopher K. Glass. "MAGGIE: leveraging genetic variation to identify DNA sequence motifs mediating transcription factor binding and function." Bioinformatics 36, Supplement_1 (July 1, 2020): i84–i92. http://dx.doi.org/10.1093/bioinformatics/btaa476.

Abstract:
Motivation: Genetic variation in regulatory elements can alter transcription factor (TF) binding by mutating a TF binding motif, which in turn may affect the activity of the regulatory elements. However, it is unclear which motifs are prone to impact transcriptional regulation if mutated. Current motif analysis tools either prioritize TFs based on motif enrichment without linking to a function or are limited in their applications due to the assumption of linearity between motifs and their functional effects. Results: We present MAGGIE (Motif Alteration Genome-wide to Globally Investigate Elements), a novel method for identifying motifs mediating TF binding and function. By leveraging measurements from diverse genotypes, MAGGIE uses a statistical approach to link mutations of a motif to changes of an epigenomic feature without assuming a linear relationship. We benchmark MAGGIE across various applications using both simulated and biological datasets and demonstrate its improvement in sensitivity and specificity compared with the state-of-the-art motif analysis approaches. We use MAGGIE to gain novel insights into the divergent functions of distinct NF-κB factors in pro-inflammatory macrophages, revealing the association of p65–p50 co-binding with transcriptional activation and the association of p50 binding lacking p65 with transcriptional repression. Availability and implementation: The Python package for MAGGIE is freely available at https://github.com/zeyang-shen/maggie. The accession number for the NF-κB ChIP-seq data generated for this study is Gene Expression Omnibus: GSE144070. Supplementary information: Supplementary data are available at Bioinformatics online.
31

Ceschin, Rafael, Ashok Panigrahy, and Vanathi Gopalakrishnan. "sfDM: Open-Source Software for Temporal Analysis and Visualization of Brain Tumor Diffusion MR Using Serial Functional Diffusion Mapping." Cancer Informatics 14s2 (January 2015): CIN.S17293. http://dx.doi.org/10.4137/cin.s17293.

Abstract:
A major challenge in the diagnosis and treatment of brain tumors is tissue heterogeneity leading to mixed treatment response. Additionally, these tumors are often difficult or very high-risk to biopsy, further hindering the clinical management process. To overcome this, novel advanced imaging methods are increasingly being adapted clinically to identify useful noninvasive biomarkers capable of disease stage characterization and treatment response prediction. One promising technique is called functional diffusion mapping (fDM), which uses diffusion-weighted imaging (DWI) to generate parametric maps between two imaging time points in order to identify significant voxel-wise changes in water diffusion within the tumor tissue. Here we introduce serial functional diffusion mapping (sfDM), an extension of existing fDM methods, to analyze the entire tumor diffusion profile along the temporal course of the disease. sfDM provides the tools necessary to analyze a tumor data set in the context of spatiotemporal parametric mapping: the image registration pipeline, biomarker extraction, and visualization tools. We present the general workflow of the pipeline, along with a typical use case for the software. sfDM is written in Python and is freely available as an open-source package under the Berkeley Software Distribution (BSD) license to promote transparency and reproducibility.
32

Lockhart, Brandon, Jinglin Peng, Weiyuan Wu, Jiannan Wang, and Eugene Wu. "Explaining inference queries with bayesian optimization." Proceedings of the VLDB Endowment 14, no. 11 (July 2021): 2576–85. http://dx.doi.org/10.14778/3476249.3476304.

Abstract:
Obtaining an explanation for an SQL query result can enrich the analysis experience, reveal data errors, and provide deeper insight into the data. Inference query explanation seeks to explain unexpected aggregate query results on inference data; such queries are challenging to explain because an explanation may need to be derived from the source, training, or inference data in an ML pipeline. In this paper, we model an objective function as a black-box function and propose BOExplain, a novel framework for explaining inference queries using Bayesian optimization (BO). An explanation is a predicate defining the input tuples that should be removed so that the query result of interest is significantly affected. BO, a technique for finding the global optimum of a black-box function, is used to find the best predicate. We develop two new techniques (individual contribution encoding and warm start) to handle categorical variables. We perform experiments showing that the predicates found by BOExplain have a higher degree of explanation compared to those found by the state-of-the-art query explanation engines. We also show that BOExplain is effective at deriving explanations for inference queries from source and training data on a variety of real-world datasets. BOExplain is open-sourced as a Python package at https://github.com/sfu-db/BOExplain.
33

Cluet, David, Ikram Amri, Blandine Vergier, Jérémie Léault, Astrid Audibert, Clémence Grosjean, Dylan Calabrési, and Martin Spichty. "A Quantitative Tri-fluorescent Yeast Two-hybrid System: From Flow Cytometry to In cellula Affinities." Molecular & Cellular Proteomics 19, no. 4 (February 3, 2020): 701–15. http://dx.doi.org/10.1074/mcp.tir119.001692.

Abstract:
We present a technological advancement for the estimation of the affinities of Protein-Protein Interactions (PPIs) in living cells. A novel set of vectors is introduced that enables a quantitative yeast two-hybrid system based on fluorescent fusion proteins. The vectors allow simultaneous quantification of the reaction partners (Bait and Prey) and the reporter at the single-cell level by flow cytometry. We validate the applicability of this system on a small but diverse set of PPIs (eleven protein families from six organisms) with different affinities; the dissociation constants range from 117 pM to 17 μM. After only two hours of reaction, expression of the reporter can be detected even for the weakest PPI. Through a simple gating analysis, it is possible to select only cells with identical expression levels of the reaction partners. As a result of this standardization of expression levels, the mean reporter levels directly reflect the affinities of the studied PPIs. With a set of PPIs with known affinities, it is straightforward to construct an affinity ladder that permits rapid classification of PPIs with thus far unknown affinities. Conventional software can be used for this analysis. To permit automated analysis, we provide a graphical user interface for the Python-based FlowCytometryTools package.
34

Mehdi, Tahmid F., Gurdeep Singh, Jennifer A. Mitchell, and Alan M. Moses. "Variational infinite heterogeneous mixture model for semi-supervised clustering of heart enhancers." Bioinformatics 35, no. 18 (February 7, 2019): 3232–39. http://dx.doi.org/10.1093/bioinformatics/btz064.

Abstract:
Motivation: Mammalian genomes can contain thousands of enhancers but only a subset are actively driving gene expression in a given cellular context. Integrated genomic datasets can be harnessed to predict active enhancers. One challenge in integration of large genomic datasets is the increasing heterogeneity: continuous, binary and discrete features may all be relevant. Coupled with the typically small numbers of training examples, semi-supervised approaches for heterogeneous data are needed; however, current enhancer prediction methods are not designed to handle heterogeneous data in the semi-supervised paradigm. Results: We implemented a Dirichlet Process Heterogeneous Mixture model that infers Gaussian, Bernoulli and Poisson distributions over features. We derived a novel variational inference algorithm to handle semi-supervised learning tasks where certain observations are forced to cluster together. We applied this model to enhancer candidates in mouse heart tissues based on heterogeneous features. We constrained a small number of known active enhancers to appear in the same cluster, and 47 additional regions clustered with them. Many of these are located near heart-specific genes. The model also predicted 1176 active promoters, suggesting that it can discover new enhancers and promoters. Availability and implementation: We created the ‘dphmix’ Python package: https://pypi.org/project/dphmix/. Supplementary information: Supplementary data are available at Bioinformatics online.
35

Zhao, Zhengqiao, Stephen Woloszynek, Felix Agbavor, Joshua Chang Mell, Bahrad A. Sokhansanj, and Gail L. Rosen. "Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network." PLOS Computational Biology 17, no. 9 (September 22, 2021): e1009345. http://dx.doi.org/10.1371/journal.pcbi.1009345.

Abstract:
Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short and long term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, on the read/sequence level. In this paper, we develop this novel deep learning approach and evaluate its application to amplicon sequences. We apply our approach to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Training Read2Pheno models will encode sequences (reads) into dense, meaningful representations: learned embedded vectors output from the intermediate layer of the network model, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences which are particularly informative for classification. As such, this novel approach can avoid pre/post-processing and manual interpretation required with conventional approaches to microbiome sequence classification. We further show, as proof-of-concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a python package) and https://github.com/EESI/seq2att (a command line tool).
36

Miller, Tim B., and Pieter van Dokkum. "Bayesian Fitting of Multi-Gaussian Expansion Models to Galaxy Images." Astrophysical Journal 923, no. 1 (December 1, 2021): 124. http://dx.doi.org/10.3847/1538-4357/ac2b30.

Abstract:
Fitting parameterized models to images of galaxies has become the standard for measuring galaxy morphology. This forward-modeling technique allows one to account for the point-spread function to effectively study semi-resolved galaxies. However, using a specific parameterization for a galaxy’s surface brightness profile can bias measurements if it is not an accurate representation. Furthermore, it can be difficult to assess systematic errors in parameterized profiles. To overcome these issues we employ the Multi-Gaussian expansion (MGE) method of representing a galaxy’s profile together with a Bayesian framework for fitting images. MGE flexibly represents a galaxy’s profile using a series of Gaussians. We introduce a novel Bayesian inference approach that uses pre-rendered Gaussian components, which greatly speeds up computation time and makes it feasible to run the fitting code on large samples of galaxies. We demonstrate our method with a series of validation tests. By injecting galaxies, with properties similar to those observed at z ∼ 1.5, into deep Hubble Space Telescope observations we show that it can accurately recover total fluxes and effective radii of realistic galaxies. Additionally we use degraded images of local galaxies to show that our method can recover realistic galaxy surface brightness and color profiles. Our implementation is available in an open source Python package, imcascade, which contains all methods needed for the preparation of images, fitting, and analysis of results.
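As a rough illustration of the multi-Gaussian expansion idea described in this abstract, the sketch below evaluates a circular radial profile as a sum of two-dimensional Gaussians. It is not the imcascade API and leaves out the point-spread function and the Bayesian fitting entirely; the function name `mge_profile` and the example fluxes and widths are hypothetical.

```python
import math


def mge_profile(r, fluxes, sigmas):
    """Surface brightness at radius r for a circular multi-Gaussian expansion.

    Each component is a 2D Gaussian normalised so that it integrates to its
    flux over the plane; the profile is the sum of all components.
    """
    total = 0.0
    for flux, sigma in zip(fluxes, sigmas):
        peak = flux / (2.0 * math.pi * sigma ** 2)
        total += peak * math.exp(-r ** 2 / (2.0 * sigma ** 2))
    return total


# Hypothetical components: a compact core plus two broader Gaussians.
example_fluxes = [1.0, 2.5, 4.0]
example_sigmas = [0.5, 2.0, 6.0]

central = mge_profile(0.0, example_fluxes, example_sigmas)
```

With this normalisation the total flux of the model is simply the sum of the component fluxes, which is one reason a Gaussian basis is convenient for recovering total fluxes.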
37

Moyer, Devlin, Alan R. Pacheco, David B. Bernstein, and Daniel Segrè. "Stoichiometric Modeling of Artificial String Chemistries Reveals Constraints on Metabolic Network Structure." Journal of Molecular Evolution 89, no. 7 (July 6, 2021): 472–83. http://dx.doi.org/10.1007/s00239-021-10018-0.

Abstract:
Uncovering the general principles that govern the structure of metabolic networks is key to understanding the emergence and evolution of living systems. Artificial chemistries can help illuminate this problem by enabling the exploration of chemical reaction universes that are constrained by general mathematical rules. Here, we focus on artificial chemistries in which strings of characters represent simplified molecules, and string concatenation and splitting represent possible chemical reactions. We developed a novel Python package, ARtificial CHemistry NEtwork Toolbox (ARCHNET), to study string chemistries using tools from the field of stoichiometric constraint-based modeling. In addition to exploring the topological characteristics of different string chemistry networks, we developed a network-pruning algorithm that can generate minimal metabolic networks capable of producing a specified set of biomass precursors from a given assortment of environmental nutrients. We found that the composition of these minimal metabolic networks was influenced more strongly by the metabolites in the biomass reaction than the identities of the environmental nutrients. This finding has important implications for the reconstruction of organismal metabolic networks and could help us better understand the rise and evolution of biochemical organization. More generally, our work provides a bridge between artificial chemistries and stoichiometric modeling, which can help address a broad range of open questions, from the spontaneous emergence of an organized metabolism to the structure of microbial communities.
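The string-chemistry setup in this abstract, with strings as molecules and concatenation or splitting as reactions, can be sketched as a small stoichiometric matrix. This is a toy illustration under stated assumptions, not the ARCHNET implementation: the function name `build_string_chemistry` and the choice to list only concatenation (treating splitting as the reverse direction) are hypothetical.

```python
from itertools import product


def build_string_chemistry(alphabet, max_length):
    """Enumerate species and concatenation reactions for a toy string chemistry.

    Species are all strings over `alphabet` of length 1..max_length.
    Each reaction joins two species into their concatenation; splitting is
    the same reaction run in reverse, so it is not listed separately.
    """
    species = ["".join(chars)
               for n in range(1, max_length + 1)
               for chars in product(alphabet, repeat=n)]
    index = {s: i for i, s in enumerate(species)}
    reactions = [(a, b, a + b) for a in species for b in species
                 if len(a) + len(b) <= max_length]
    # Stoichiometric matrix: rows are species, columns are reactions.
    matrix = [[0] * len(reactions) for _ in species]
    for j, (a, b, joined) in enumerate(reactions):
        matrix[index[a]][j] -= 1
        matrix[index[b]][j] -= 1  # when a == b the entry becomes -2
        matrix[index[joined]][j] += 1
    return species, reactions, matrix


species, reactions, S = build_string_chemistry("ab", 2)
```

Each column of `S` conserves the total number of characters, which is the kind of mass-balance constraint that stoichiometric constraint-based modeling relies on.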
38

Gumbsch, Thomas, Christian Bock, Michael Moor, Bastian Rieck, and Karsten Borgwardt. "Enhancing statistical power in temporal biomarker discovery through representative shapelet mining." Bioinformatics 36, Supplement_2 (December 2020): i840–i848. http://dx.doi.org/10.1093/bioinformatics/btaa815.

Abstract:
Motivation: Temporal biomarker discovery in longitudinal data is based on detecting reoccurring trajectories, the so-called shapelets. The search for shapelets requires considering all subsequences in the data. While the accompanying issue of multiple testing has been mitigated in previous work, the redundancy and overlap of the detected shapelets result in an a priori unbounded number of highly similar and structurally meaningless shapelets. As a consequence, current temporal biomarker discovery methods are impractical and underpowered. Results: We find that the pre- or post-processing of shapelets does not sufficiently increase the power and practical utility. Consequently, we present a novel method for temporal biomarker discovery: Statistically Significant Submodular Subset Shapelet Mining (S5M), which retrieves short subsequences that (i) occur in the data, (ii) are statistically significantly associated with the phenotype and (iii) are of manageable quantity while maximizing structural diversity. Structural diversity is achieved by pruning non-representative shapelets via submodular optimization. This increases the statistical power and utility of S5M compared to state-of-the-art approaches on simulated and real-world datasets. For patients admitted to the intensive care unit (ICU) showing signs of severe organ failure, we find temporal patterns in the sequential organ failure assessment score that are associated with in-ICU mortality. Availability and implementation: S5M is an option in the Python package of S3M: github.com/BorgwardtLab/S3M.
39

Schindler, Daniel, Ted Moldenhawer, Maike Stange, Valentino Lepro, Carsten Beta, Matthias Holschneider, and Wilhelm Huisinga. "Analysis of protrusion dynamics in amoeboid cell motility by means of regularized contour flows." PLOS Computational Biology 17, no. 8 (August 23, 2021): e1009268. http://dx.doi.org/10.1371/journal.pcbi.1009268.

Abstract:
Amoeboid cell motility is essential for a wide range of biological processes including wound healing, embryonic morphogenesis, and cancer metastasis. It relies on complex dynamical patterns of cell shape changes that pose long-standing challenges to mathematical modeling and raise a need for automated and reproducible approaches to extract quantitative morphological features from image sequences. Here, we introduce a theoretical framework and a computational method for obtaining smooth representations of the spatiotemporal contour dynamics from stacks of segmented microscopy images. Based on a Gaussian process regression we propose a one-parameter family of regularized contour flows that allows us to continuously track reference points (virtual markers) between successive cell contours. We use this approach to define a coordinate system on the moving cell boundary and to represent different local geometric quantities in this frame of reference. In particular, we introduce the local marker dispersion as a measure to identify localized membrane expansions and provide a fully automated way to extract the properties of such expansions, including their area and growth time. The methods are available as an open-source software package called AmoePy, a Python-based toolbox for analyzing amoeboid cell motility (based on time-lapse microscopy data), including a graphical user interface and detailed documentation. Due to the mathematical rigor of our framework, we envision it to be of use for the development of novel cell motility models. We mainly use experimental data of the social amoeba Dictyostelium discoideum to illustrate and validate our approach.
40

Higgins, Jenny A., Madison Lands, Taryn M. Valley, Emma Carpenter, and Laura Jacques. "Real-Time Effects of Payer Restrictions on Reproductive Healthcare: A Qualitative Analysis of Cost-Related Barriers and Their Consequences among U.S. Abortion Seekers on Reddit." International Journal of Environmental Research and Public Health 18, no. 17 (August 26, 2021): 9013. http://dx.doi.org/10.3390/ijerph18179013.

Abstract:
Objective: The Hyde Amendment and related policies limit or prohibit Medicaid coverage of abortion services in the United States. Most research on cost-related abortion barriers relies on clinic-based samples, but people who desire abortions may never make it to a healthcare center. To examine a novel, pre-abortion population, we analyzed a unique qualitative dataset of posts from Reddit, a widely used social media platform increasingly leveraged by researchers, to assess financial obstacles among anonymous posters considering abortion. Methods: In February 2020, we used Python to web-scrape the 250 most recent posts that mentioned abortion, removing all identifying information and usernames. After transferring all posts into NVivo, a qualitative software package, the team identified all datapoints related to cost. Three qualitatively trained evaluators established and applied codes, reaching saturation after 194 posts. The research team used a descriptive qualitative approach, using both inductive and deductive elements, to identify and analyze themes related to financial barriers. Results: We documented multiple cost-related deterrents, including lack of funds for both the procedure and attendant travel costs, inability to afford desired abortion modality (i.e., medication or surgical), and for some, consideration of self-managed abortion options due to cost barriers. Conclusions: Findings from this study underscore the centrality of cost barriers and third-party payer restrictions to stymying reproductive health access in the United States. Results may contribute to the growing evidence base and building political momentum focused on repealing the Hyde Amendment.
41

Rieger, Marcel. "Design Pattern for Analysis Automation on Distributed Resources using Luigi Analysis Workflows." EPJ Web of Conferences 245 (2020): 05025. http://dx.doi.org/10.1051/epjconf/202024505025.

Abstract:
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo event generation. However, physicists performing data analyses are usually required to steer their individual workflows manually, which is time-consuming and often leads to undocumented relations between particular workloads. We present the Luigi Analysis Workflows (Law) Python package, which is based on the open-source pipelining tool Luigi, originally developed by Spotify. It establishes a generic design pattern for analyses of arbitrary scale and complexity, and shifts the focus from executing to defining the analysis logic. Law provides the building blocks to seamlessly integrate interchangeable remote resources without, however, limiting itself to a specific choice of infrastructure. In particular, it encourages and enables the separation of analysis algorithms on the one hand, and run locations, storage locations, and software environments on the other hand. To cope with the sophisticated demands of end-to-end HEP analyses, Law supports job execution on WLCG infrastructure (ARC, gLite) as well as on local computing clusters (HTCondor, LSF), remote file access via most common protocols through the GFAL2 library, and an environment sandboxing mechanism with support for Docker and Singularity containers. Moreover, the novel approach ultimately aims for analysis preservation out-of-the-box. Law is entirely experiment-independent and developed open-source. It is successfully used in tt̄H cross section measurements and searches for di-Higgs boson production with the CMS experiment.
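The idea of defining analysis logic rather than steering its execution can be illustrated with a minimal sketch in plain Python. This is not the Law or Luigi API: the `Task` base class, the `build` helper, and the `Select` and `Histogram` tasks are hypothetical names chosen for illustration. They only mimic the general pattern in which each task declares what it requires and a small scheduler resolves the order.

```python
class Task:
    """Hypothetical base class: a task declares dependencies and a run step."""

    def requires(self):
        # By default a task depends on nothing.
        return []

    def run(self):
        raise NotImplementedError


class Select(Task):
    """Illustrative first step of an analysis (e.g. event selection)."""

    def run(self):
        return "selected events"


class Histogram(Task):
    """Illustrative second step that depends on the selection."""

    def requires(self):
        return [Select()]

    def run(self):
        return "histogram"


def build(task, completed=None):
    """Run dependencies depth-first, then the task itself, each at most once."""
    completed = [] if completed is None else completed
    for dependency in task.requires():
        build(dependency, completed)
    name = type(task).__name__
    if name not in completed:
        task.run()
        completed.append(name)
    return completed


# The user only asks for the final task; the order is derived from requires().
order = build(Histogram())
print(order)
```

In this sketch the execution order is never written down explicitly; it follows from the declared dependencies, which is the separation of definition and execution that the abstract describes.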
42

zadeh, Zeinab Khorasani, and Mohamed M. Ouf. "Optimizing occupant-centric building controls given stochastic occupant behaviour." Journal of Physics: Conference Series 2069, no. 1 (November 1, 2021): 012140. http://dx.doi.org/10.1088/1742-6596/2069/1/012140.

Abstract:
Occupant-centric control (OCC) strategies represent a novel approach for indoor climate control in which occupancy patterns and occupant preferences are embedded within control sequences. They aim to improve both occupant comfort and energy efficiency by learning and predicting occupant behaviour, then optimizing building operations accordingly. Previous studies estimate that OCC can increase energy savings by up to 60% while improving occupant comfort. However, their performance is subject to several factors, including uncertainty due to occupant behaviour, OCC configurational settings, as well as building design parameters. To this end, testing OCCs and adjusting their configurational settings are critical to ensure optimal performance. Furthermore, identifying building design alternatives that can optimize such performance given different occupant preferences is an important step that cannot be investigated during field implementations of OCC due to logistical constraints. This paper presents a framework to optimize OCC performance in a simulation environment, which entails coupling synthetic occupant behaviour models with OCCs that learn their preferences. A genetic algorithm is then used to identify the configurational settings and design parameters that minimize energy consumption under three different occupant scenarios. To demonstrate the proposed framework, three OCCs were implemented in the building simulation program, EnergyPlus, and executed through a Python package, EPPY, to optimize OCC configurational settings and design parameters. Results revealed significant improvement of OCC performance under the identified optimal configurational settings and design parameters for each of the investigated occupant scenarios. This approach would improve OCC performance in actual buildings and avoid discomfort issues that arise during the initial implementation phases.
43

Mora-Márquez, Fernando, José Luis Vázquez-Poletti, and Unai López de Heredia. "NGScloud2: optimized bioinformatic analysis using Amazon Web Services." PeerJ 9 (April 16, 2021): e11237. http://dx.doi.org/10.7717/peerj.11237.

Abstract:
Background NGScloud was a bioinformatic system developed to perform de novo RNAseq analysis of non-model species by exploiting the cloud computing capabilities of Amazon Web Services. The rapid changes undergone in the way this cloud computing service operates, along with the continuous release of novel bioinformatic applications to analyze next generation sequencing data, have made the software obsolete. NGScloud2 is an enhanced and expanded version of NGScloud that permits access to ad hoc cloud computing infrastructure, scaled according to the complexity of each experiment. Methods NGScloud2 presents major technical improvements, such as the possibility of running spot instances and the most up-to-date AWS instance types, that can lead to significant cost savings. As compared to its initial implementation, this improved version updates and includes common applications for de novo RNAseq analysis, and incorporates tools to operate workflows of bioinformatic analysis of reference-based RNAseq, RADseq and functional annotation. NGScloud2 optimizes the access to Amazon’s large computing infrastructures to easily run popular bioinformatic software applications, otherwise inaccessible to non-specialized users lacking suitable hardware infrastructures. Results The correct performance of the pipelines for de novo RNAseq, reference-based RNAseq, RADseq and functional annotation was tested with real experimental data, providing workflow performance estimates and tips to make optimal use of NGScloud2. Further, we provide a qualitative comparison of NGScloud2 vs. the Galaxy framework. NGScloud2 code and instructions for software installation and use are available at https://github.com/GGFHF/NGScloud2. NGScloud2 includes a companion package, NGShelper, which contains Python utilities to post-process the output of the pipelines for downstream analysis and is available at https://github.com/GGFHF/NGShelper.
44

Weis, Caroline, Max Horn, Bastian Rieck, Aline Cuénod, Adrian Egli, and Karsten Borgwardt. "Topological and kernel-based microbial phenotype prediction from MALDI-TOF mass spectra." Bioinformatics 36, Supplement_1 (July 1, 2020): i30–i38. http://dx.doi.org/10.1093/bioinformatics/btaa429.

Abstract:
Motivation Microbial species identification based on matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) has become a standard tool in clinical microbiology. The resulting MALDI-TOF mass spectra also harbour the potential to deliver prediction results for other phenotypes, such as antibiotic resistance. However, the development of machine learning algorithms specifically tailored to MALDI-TOF MS-based phenotype prediction is still in its infancy. Moreover, current spectral pre-processing typically involves a parameter-heavy chain of operations without analyzing their influence on the prediction results. In addition, classification algorithms lack quantification of uncertainty, which is indispensable for predictions potentially influencing patient treatment. Results We present a novel prediction method for antimicrobial resistance based on MALDI-TOF mass spectra. First, we compare the complex conventional pre-processing to a new approach that exploits topological information and requires only a single parameter, namely the number of peaks of a spectrum to keep. Second, we introduce PIKE, the peak information kernel, a similarity measure specifically tailored to MALDI-TOF mass spectra which, combined with a Gaussian process classifier, provides well-calibrated uncertainty estimates about predictions. We demonstrate the utility of our approach by predicting antibiotic resistance of three clinically highly relevant bacterial species. Our method consistently outperforms competitor approaches, while demonstrating improved performance and security by rejecting out-of-distribution samples, such as bacterial species that are not represented in the training data. Ultimately, our method could contribute to an earlier and precise antimicrobial treatment in clinical patient care. Availability and implementation We make our code publicly available as an easy-to-use Python package under https://github.com/BorgwardtLab/maldi_PIKE.
45

Terrasi, Andrea, Swathi Subramanian, Christine Klement, Sruthi Ramesh, Heike Bollig, Chiara Falcomatà, Katja Steiger, et al. "Abstract 2350: Foxj1 is a new master regulator of activated PI3K pathway pancreatic cancer." Cancer Research 82, no. 12_Supplement (June 15, 2022): 2350. http://dx.doi.org/10.1158/1538-7445.am2022-2350.

Abstract:
Background: Pancreatic ductal adenocarcinoma (PDAC) is predicted to become the second leading cause of cancer mortality within a decade with overall 5-year survival of 8% for all stages combined. Currently, it is well documented that mechanisms driving PDAC progression involve epigenetic and transcriptional rewiring. Here we combined assay for transposase-accessible chromatin using sequencing (ATAC-seq) and enrichment for H3K27 acetylation chromatin immunoprecipitation (H3K27ac ChIP-seq) measures to explore the epigenetic landscape of different mouse primary pancreatic tumor (PPT) cell lines. Methods: Kras-driven (n=36) and PI3K-driven PPT cell lines (n=9) were cultured in DMEM medium (Gibco). DNA was extracted using the manufacturer's protocols (Qiagen, MinElute PCR Purification Kit), then DNA libraries were prepared and high-throughput sequencing was performed. Bioinformatics analysis (ROSE2 Python script) was conducted on H3K27ac ChIP-seq data to define super-enhancers (SEs) and SE-associated genes. Then, ATAC-seq data was explored using the Coltron Python package to distinguish enriched Transcription Factor (TF) motifs in SEs. Transcriptomic data was used to slim down the list of potential cis-regulatory elements. We developed knockout (ko) PPT cell lines using the CRISPR/CAS9 gene editing method to better characterize the role of Foxj1 as a novel potential master regulator in pancreatic cancer. Lastly, immunohistochemistry (IHC) staining for FOXJ1 was conducted on a human PDAC cohort. Results: By k-means clustering, we identified 463 SE-associated genes. Many of them are associated with Kras-driven (epithelial or mesenchymal) or PI3K-driven cell lines exclusively. Surprisingly, we found Foxj1 as an SE-associated TF exclusively in PI3K-driven PPT cell lines. Consistent with the epigenetic data, transcriptomic analysis confirmed higher expression of Foxj1 in PI3K-driven PPT cell lines.
Then, RNA-seq data revealed downregulation of predicted Foxj1 target genes and enhanced EMT and Wnt/β-catenin signatures in Foxj1 ko cells. These data suggest that epithelial properties of PDAC cells are stabilized by Foxj1 activity. Consistent with these results we detect a higher potential of TGFβ treatment to induce mesenchymal features in Foxj1 ko cells. Furthermore, overexpression of β-catenin protein was confirmed by immunofluorescence. Enhanced Wnt/β-catenin signaling could be responsible for the higher proliferation of Foxj1 ko cells as revealed by proliferation assay. Finally, we investigated FOXJ1 protein level in our PDAC human cohort. Interestingly, we found high nuclear FOXJ1 expression in 23% of cases which is linked with better overall survival. Conclusions: In summary, our data revealed Foxj1 as a novel PDAC associated TF with the ability to reduce the cancer aggressiveness blocking epithelial to mesenchymal transition and β-catenin activity elucidating the better prognosis into the FOXJ1 high expressed patients. Citation Format: Andrea Terrasi, Swathi Subramanian, Christine Klement, Sruthi Ramesh, Heike Bollig, Chiara Falcomatà, Katja Steiger, Rupert Öllinger, Dieter Saur, Roland Rad, Maximilian Reichert, Günter Schneider, Gunnar Schotta. Foxj1 is a new master regulator of activated PI3K pathway pancreatic cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 2350.
46

Liu, Qian, Zequan Zheng, Jiabin Zheng, Qiuyi Chen, Guan Liu, Sihan Chen, Bojia Chu, et al. "Health Communication Through News Media During the Early Stage of the COVID-19 Outbreak in China: Digital Topic Modeling Approach." Journal of Medical Internet Research 22, no. 4 (April 28, 2020): e19118. http://dx.doi.org/10.2196/19118.

Abstract:
Background In December 2019, a few coronavirus disease (COVID-19) cases were first reported in Wuhan, Hubei, China. Soon after, increasing numbers of cases were detected in other parts of China, eventually leading to a disease outbreak in China. As this dreadful disease spreads rapidly, the mass media has been active in community education on COVID-19 by delivering health information about this novel coronavirus, such as its pathogenesis, spread, prevention, and containment. Objective The aim of this study was to collect media reports on COVID-19 and investigate the patterns of media-directed health communications as well as the role of the media in this ongoing COVID-19 crisis in China. Methods We adopted the WiseSearch database to extract related news articles about the coronavirus from major press media between January 1, 2020, and February 20, 2020. We then sorted and analyzed the data using Python software and Python package Jieba. We sought a suitable topic number with evidence of the coherence number. We operated latent Dirichlet allocation topic modeling with a suitable topic number and generated corresponding keywords and topic names. We then divided these topics into different themes by plotting them into a 2D plane via multidimensional scaling. Results After removing duplications and irrelevant reports, our search identified 7791 relevant news reports. We listed the number of articles published per day. According to the coherence value, we chose 20 as the number of topics and generated the topics’ themes and keywords. These topics were categorized into nine main primary themes based on the topic visualization figure. The top three most popular themes were prevention and control procedures, medical treatment and research, and global or local social and economic influences, accounting for 32.57% (n=2538), 16.08% (n=1258), and 11.79% (n=919) of the collected reports, respectively. 
Conclusions Topic modeling of news articles can produce useful information about the significance of mass media for early health communication. Comparing the number of articles for each day and the outbreak development, we noted that mass media news reports in China lagged behind the development of COVID-19. The major themes accounted for around half the content and tended to focus on the larger society rather than on individuals. The COVID-19 crisis has become a worldwide issue, and society has become concerned about donations and support as well as mental health among others. We recommend that future work addresses the mass media’s actual impact on readers during the COVID-19 crisis through sentiment analysis of news data.
47

Waters, Michael R., Matthew Inkman, Perry W. Grigsby, Stephanie Markovina, Julie K. Schwarz, and Jin Zhang. "Abstract 3475: An 18-gene expression model predicts resistance to standard of care therapy on 3-month follow up 18FDG-PET in locally advanced cervical cancer." Cancer Research 82, no. 12_Supplement (June 15, 2022): 3475. http://dx.doi.org/10.1158/1538-7445.am2022-3475.

Abstract:
Introduction: As many as 30-50% of patients with locally advanced cervical cancer (LACC) experience recurrence after chemoradiation therapy (CRT), and five-year survival rates for these patients are only ~10%. While local recurrence and overall survival are useful metrics for analyzing genomic risk factors in LACC, such time to event analysis can be complicated by confounding variables which impede the elucidation of biologically relevant signals. Our group previously reported upregulation of genes from the PI3K pathway in cervical tumors with residual abnormal 18F-Fluoro-deoxy-glucose (FDG) uptake on positron emission tomography (PET) performed 3 months after CRT. Here, we analyzed whole transcriptome data using RNASeq from N=86 prospectively collected pretreatment cervix tumor biopsies to identify novel gene expression signatures associated with persistent or new FDG uptake on 3 month post-therapy FDG-PET. Objective: To identify and validate a predictive gene expression signature able to distinguish primary tumors likely to be resistant to standard of care CRT as assessed by PET scan obtained 3-months following completion of therapy. Methods: Whole tumor RNA-seq analysis was performed on 86 pre-treatment tumor specimens and compared with associated 3-month post-CRT FDG-PET. Response was dichotomized into metabolic complete response (mCR) and persistent/new disease (mP/N) defined as persistent or new areas of FDG-uptake on post-CRT PET per the nuclear medicine report. Differential expression analysis was conducted using the edgeR R package. Pathway enrichment analysis was conducted using reference libraries as specified and EnrichR software. Decision tree modeling, hyperparameter optimization, and k-fold cross-validation were performed using the scikit-learn Python package. Results: We identified an 18-gene expression signature predictive for mCR vs. mP/N in patients treated with definitive CRT with an overall prospective accuracy of 0.92.
This model reliably identified both classes of patients (mCR: F1 = 0.95, precision = 0.90, recall = 1.0; mP/N: F1 = 0.80, precision = 0.67, recall = 1.0). Crowd source enrichment analysis using EnrichR identified an NRF2 driven expression program (p=0.0006), and an altered inflammatory response (p=0.000006) within the predictive 18 gene signature. Conclusions: We generated an 18-gene model to predict response to CRT as assessed by 3 month PET imaging. This novel expression signature further identified the NRF2 transcription factor as an important marker for resistance to CRT, consistent with previous studies showing NRF2 promotes cervical cancer growth. Our findings highlight the need for further elucidation of the NRF2 pathway in LACC, and can be used to accurately assess high risk patients who would benefit from treatment escalation. Citation Format: Michael R. Waters, Matthew Inkman, Perry W. Grigsby, Stephanie Markovina, Julie K. Schwarz, Jin Zhang. An 18-gene expression model predicts resistance to standard of care therapy on 3-month follow up 18FDG-PET in locally advanced cervical cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 3475.
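The per-class F1 values quoted in this abstract can be recomputed from the reported precision and recall using the standard definition of F1 as their harmonic mean. The short sketch below does only that arithmetic; the function and variable names are illustrative, and nothing here reproduces the authors' model or data.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for each response class.
reported = {
    "mCR": (0.90, 1.0),
    "mP/N": (0.67, 1.0),
}

# Round to two decimals, matching the precision used in the abstract.
f1_by_class = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
print(f1_by_class)
```

Rounded to two decimals, the results agree with the F1 values of 0.95 and 0.80 given in the abstract.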
48

Seal, Souvik, Qunhua Li, Elle Butler Basner, Laura M. Saba, and Katerina Kechris. "RCFGL: Rapid Condition adaptive Fused Graphical Lasso and application to modeling brain region co-expression networks." PLOS Computational Biology 19, no. 1 (January 6, 2023): e1010758. http://dx.doi.org/10.1371/journal.pcbi.1010758.

Abstract:
Inferring gene co-expression networks is a useful process for understanding gene regulation and pathway activity. The networks are usually undirected graphs where genes are represented as nodes and an edge represents a significant co-expression relationship. When expression data of multiple (p) genes in multiple (K) conditions (e.g., treatments, tissues, strains) are available, joint estimation of networks harnessing shared information across them can significantly increase the power of analysis. In addition, examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. Condition adaptive fused graphical lasso (CFGL) is an existing method that incorporates condition specificity in a fused graphical lasso (FGL) model for estimating multiple co-expression networks. However, with computational complexity of O(p²K log K), the current implementation of CFGL is prohibitively slow even for a moderate number of genes and can only be used for a maximum of three conditions. In this paper, we propose a faster alternative of CFGL named rapid condition adaptive fused graphical lasso (RCFGL). In RCFGL, we incorporate the condition specificity into another popular model for joint network estimation, known as fused multiple graphical lasso (FMGL). We use a more efficient algorithm in the iterative steps compared to CFGL, enabling faster computation with complexity of O(p²K) and making it easily generalizable for more than three conditions. We also present a novel screening rule to determine if the full network estimation problem can be broken down into estimation of smaller disjoint sub-networks, thereby reducing the complexity further. We demonstrate the computational advantage and superior performance of our method compared to two non-condition adaptive methods, FGL and FMGL, and one condition adaptive method, CFGL in both simulation study and real data analysis.
We used RCFGL to jointly estimate the gene co-expression networks in different brain regions (conditions) using a cohort of heterogeneous stock rats. We also provide an accommodating C and Python based package that implements RCFGL.
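The abstract does not spell out the screening rule, so the following is only a generic illustration of how a network estimation problem can split into disjoint sub-networks. It assumes a simple thresholding criterion (two genes are linked when the absolute value of their covariance entry exceeds a threshold) and groups genes into connected blocks with a union-find structure. The function name `connected_blocks`, the threshold, and the toy matrix are all hypothetical and are not taken from the RCFGL package.

```python
from collections import defaultdict


def connected_blocks(cov, threshold):
    """Group indices into blocks linked by entries with |cov[i][j]| > threshold."""
    p = len(cov)
    parent = list(range(p))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if abs(cov[i][j]) > threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    blocks = defaultdict(list)
    for i in range(p):
        blocks[find(i)].append(i)
    return sorted(blocks.values())


# Toy 4-gene matrix (made up for illustration): genes 0-1 and 2-3 are strongly
# linked, while the cross-pair entries are small.
toy_cov = [
    [1.0, 0.8, 0.0, 0.05],
    [0.8, 1.0, 0.02, 0.0],
    [0.0, 0.02, 1.0, 0.6],
    [0.05, 0.0, 0.6, 1.0],
]

blocks = connected_blocks(toy_cov, threshold=0.1)
print(blocks)
```

Under this assumed criterion, each block could be estimated separately, so the cost depends on the block sizes rather than on the full number of genes p.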
49

Ali, Mehdi, Charles Tapley Hoyt, Daniel Domingo-Fernández, Jens Lehmann, and Hajira Jabeen. "BioKEEN: a library for learning and evaluating biological knowledge graph embeddings." Bioinformatics 35, no. 18 (February 15, 2019): 3538–40. http://dx.doi.org/10.1093/bioinformatics/btz117.

Abstract:
Summary Knowledge graph embeddings (KGEs) have received significant attention in other domains due to their ability to predict links and create dense representations for graphs’ nodes and edges. However, the software ecosystem for their application to bioinformatics remains limited and inaccessible for users without expertise in programming and machine learning. Therefore, we developed BioKEEN (Biological KnowlEdge EmbeddiNgs) and PyKEEN (Python KnowlEdge EmbeddiNgs) to facilitate their easy use through an interactive command line interface. Finally, we present a case study in which we used a novel biological pathway mapping resource to predict links that represent pathway crosstalks and hierarchies. Availability and implementation BioKEEN and PyKEEN are open source Python packages publicly available under the MIT License at https://github.com/SmartDataAnalytics/BioKEEN and https://github.com/SmartDataAnalytics/PyKEEN. Supplementary information Supplementary data are available at Bioinformatics online.
50

Park, Soo, Brian M. Reilly, Timothy Luger, Dan Zhao, Robert J. Fram, Ajeeta B. Dash, and Rafael Bejar. "DNA Methylation Analysis before and during Treatment with Azacitidine Plus Pevonedistat or Azacitidine Alone in Patients with MDS, CMML, and AML Previously Untreated with Hypomethylating Agents." Blood 136, Supplement 1 (November 5, 2020): 29–30. http://dx.doi.org/10.1182/blood-2020-134484.

Abstract:
Background: Clinical responses to hypomethylating agents (HMA) occur in the minority of patients with myelodysplastic syndromes (MDS) and related disorders and require months of treatment. Robust biomarkers of HMA response would have great clinical utility but remain elusive. Combinations of HMA and novel agents may improve response rates and latency and be associated with distinct biomarkers. Predictive baseline DNA methylation (DNAm) profiles have yet to be identified in MDS, although responders show more global hypomethylation after response occurs. DNAm changes shortly after treatment may add predictive power as they reflect additional factors including effective cellular exposure and the rate of DNAm recovery between treatment cycles. To determine if DNAm profiling at baseline and early during treatment can improve our ability to predict response, we examined targeted DNAm profiles in samples from patients with higher-risk MDS (HR-MDS), chronic myelomonocytic leukemia (CMML), and low-blast count acute myeloid leukemia (LB-AML), previously untreated with HMA who were enrolled in the Pevonedistat-2001 (P-2001) Phase 2 study of azacitidine (AZA) with or without pevonedistat (PEV). Methods: Targeted DNAm was quantified using bisulfite padlock probes (BSPP) at ~141,000 unique regions known to contain differentially methylated CpG sites. BSPP sequencing was performed on 130 genetically and clinically annotated bone marrow mononuclear cell DNA samples collected at screening and Cycle 2 Day 22±3 (C2D22). Data from 15 samples were removed due to low CpG coverage resulting in 115 samples with >10X coverage of 322,387 CpGs each. There were 27 patients with paired samples, 30 with screening only, and 31 with C2D22 only (Table 1, Figure 1A). Responses for HR-MDS and CMML were categorized according to the IWG-2006 modified criteria and for LB-AML according to the IWG-2003 criteria. Differentially methylated CpGs were identified using the R package 'DSS' v2.35.0. 
Non-negative matrix factorization-based unsupervised clustering was performed using the Onco-GPS method within the Python 'ccal' package. Statistical analyses were carried out using R version 3.6.2. Results: Differential DNAm analysis between screening and C2D22 for the 27 paired samples shows global demethylation after treatment in nearly all cases but is more pronounced in responders (mean change in global DNAm: response = -5.5%; non-response = -2.0%; t-test p < 0.001). There was no significant difference in demethylation between AZA and AZA+PEV treated groups (AZA: -5.3%; AZA+PEV: -4.1%; t-test p = 0.32). Patients who attained a complete response (CR) in the AZA+PEV arm tended to have more demethylated CpGs than those with a CR in the AZA arm (Figure 1B. Mean number of demethylated CpGs in complete responders: AZA = 16,459; AZA+PEV = 24,731; t-test p = 0.396). In general, responding patients strictly defined by CR or partial response had steeper declines in mean DNAm levels (Figure 1C) than patients with other types of responses, but did not show an association between demethylation and time to response (TTR). Unsupervised clustering of the top 3% most variably methylated CpGs using Onco-GPS did not identify clusters associated with response at screening, but did resolve a cluster highly enriched for responders assessed at C2D22 (Figure 1D. Cluster B, Fisher's exact test p = 0.011; odds-ratio 95% CI = 1.52-Inf). Analysis of the CpG sites that define Cluster B and creation of a formalized statistical predictor of response based on DNAm patterns is ongoing. Conclusions: DNAm patterns early in the course of HMA treatment correlate with eventual response, with CRs associated with the greatest degree of demethylation. Responders are more likely to have greater demethylation that may reflect greater effective AZA exposure or slower remethylation dynamics. 
Addition of PEV may intensify site-specific demethylation in responders and increase the proportion of responding patients. With a median TTR of 3-4 months in the P-2001 study, depth of demethylation at the C2D22 timepoint may help predict eventual response. Strategies to augment demethylation with adjunctive agents or HMA dose escalation in patients with early inadequate demethylation should be investigated. Ongoing work will incorporate mutation status and focus on refining DNAm signatures that can predict eventual response. Disclosures Luger: Arcturus Therapeutics: Current Employment. Zhao: Millennium Pharmaceuticals Inc., a wholly owned subsidiary of Takeda Pharmaceutical Company Limited: Current Employment. Fram: Takeda, Bristol Myers, Gilead, Pfizer, Baxter, Teva: Current equity holder in publicly-traded company; BeyondSpring Pharmaceuticals Inc.: Consultancy; Takeda Pharmaceuticals Intl. Co.: Current Employment; Vertex Pharmaceuticals: Patents & Royalties: No Royalties; Patent 10/728,114 (Vertex Pharmaceuticals). Dash: Millennium Pharmaceuticals, Inc., a wholly owned subsidiary of Takeda Pharmaceutical Company Limited: Current Employment, Other: Stockholder. Bejar: Forty-Seven/Gilead: Honoraria; Genoptix/NeoGenomics: Honoraria; Daiichi-Sankyo: Honoraria; Takeda: Honoraria, Research Funding; Celgene/BMS: Honoraria, Research Funding; Astex/Otsuka: Honoraria; AbbVie/Genentech: Honoraria; Aptose Biosciences: Current Employment.