Journal articles on the topic 'Bioinformatics and computational biology not elsewhere classified'

Consult the top 36 journal articles for your research on the topic 'Bioinformatics and computational biology not elsewhere classified.'

1

Pons, Joan Carles, David Paez-Espino, Gabriel Riera, Natalia Ivanova, Nikos C. Kyrpides, and Mercè Llabrés. "VPF-Class: taxonomic assignment and host prediction of uncultivated viruses based on viral protein families." Bioinformatics 37, no. 13 (January 20, 2021): 1805–13. http://dx.doi.org/10.1093/bioinformatics/btab026.

Abstract:
Motivation: Two key steps in the analysis of uncultured viruses recovered from metagenomes are the taxonomic classification of the viral sequences and the identification of putative host(s). Both steps rely mainly on the assignment of viral proteins to orthologs in cultivated viruses. Viral Protein Families (VPFs) can be used for the robust identification of new viral sequences in large metagenomics datasets. Despite the importance of VPF information for viral discovery, VPFs have not yet been explored for determining viral taxonomy and host targets. Results: In this work, we classified the set of VPFs from the IMG/VR database and developed VPF-Class. VPF-Class is a tool that automates the taxonomic classification and host prediction of viral contigs based on the assignment of their proteins to a set of classified VPFs. Applying VPF-Class on 731K uncultivated virus contigs from the IMG/VR database, we were able to classify 363K contigs at the genus level and predict the host of over 461K contigs. In the RefSeq database, VPF-Class reported an accuracy of nearly 100% to classify dsDNA, ssDNA and retroviruses, at the genus level, considering a membership ratio and a confidence score of 0.2. The accuracy in host prediction was 86.4%, also at the genus level, considering a membership ratio of 0.3 and a confidence score of 0.5. In the prophages dataset, the accuracy in host prediction was 86% considering a membership ratio of 0.6 and a confidence score of 0.8. Moreover, from the Global Ocean Virome dataset, over 817K viral contigs out of 1 million were classified. Availability and implementation: The implementation of VPF-Class can be downloaded from https://github.com/biocom-uib/vpf-tools. Supplementary information: Supplementary data are available at Bioinformatics online.
2

Xu, Jing, Han Zhang, Jinfang Zheng, Philippe Dovoedo, and Yanbin Yin. "eCAMI: simultaneous classification and motif identification for enzyme annotation." Bioinformatics 36, no. 7 (December 3, 2019): 2068–75. http://dx.doi.org/10.1093/bioinformatics/btz908.

Abstract:
Motivation: Carbohydrate-active enzymes (CAZymes) are extremely important to bioenergy, human gut microbiome, and plant pathogen research and industries. Here we developed a new amino acid k-mer-based CAZyme classification, motif identification and genome annotation tool using a bipartite network algorithm. Using this tool, we classified 390 CAZyme families into thousands of subfamilies, each with distinguishing k-mer peptides. These k-mers represented the characteristic motifs (in the form of a collection of conserved short peptides) of each subfamily, and thus were further used to annotate new genomes for CAZymes. This idea was also generalized to extract characteristic k-mer peptides for all the Swiss-Prot enzymes classified by the EC (enzyme commission) numbers and applied to enzyme EC prediction. Results: This new tool was implemented as a Python package named eCAMI. Benchmark analysis of eCAMI against the state-of-the-art tools on CAZyme and enzyme EC datasets found that: (i) eCAMI has the best performance in terms of accuracy and memory use for CAZyme and enzyme EC classification and annotation; (ii) the k-mer-based tools (including PPR-Hotpep, CUPP and eCAMI) perform better than homology-based tools and deep-learning tools in enzyme EC prediction. Lastly, we confirmed that the k-mer-based tools have the unique ability to identify the characteristic k-mer peptides in the predicted enzymes. Availability and implementation: https://github.com/yinlabniu/eCAMI and https://github.com/zhanglabNKU/eCAMI. Supplementary information: Supplementary data are available at Bioinformatics online.
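As a rough illustration of the k-mer idea described in this abstract, the Python sketch below extracts amino acid k-mers and assigns a query to the subfamily whose characteristic k-mer set it overlaps most. It is not eCAMI's bipartite network algorithm and does not use the eCAMI package. The function names, the toy sequences, the subfamily labels, the choice of k = 3, and the simplistic definition of "characteristic" are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k peptides (k-mers) in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def characteristic_kmers(subfamilies, k):
    """For each subfamily, keep the k-mers present in all of its members and
    in no other subfamily (an assumed, simplistic notion of 'characteristic')."""
    shared = {
        name: set.intersection(*(kmers(s, k) for s in seqs))
        for name, seqs in subfamilies.items()
    }
    # Count in how many subfamilies each conserved k-mer occurs.
    counts = Counter(km for kms in shared.values() for km in kms)
    return {name: {km for km in kms if counts[km] == 1}
            for name, kms in shared.items()}


def classify(query, motifs, k):
    """Assign the query to the subfamily with the most matching k-mers.
    Returns (None, 0) when no characteristic k-mer is found."""
    q = kmers(query, k)
    best = max(motifs, key=lambda name: len(q & motifs[name]))
    hits = len(q & motifs[best])
    return (best, hits) if hits else (None, 0)


# Toy, made-up sequences and labels; real CAZyme families are far larger.
toy = {
    "famA_sub1": ["MKWDGLLA", "TTWDGLLQ"],
    "famA_sub2": ["MKHEPNRA", "GGHEPNRS"],
}
motifs = characteristic_kmers(toy, k=3)
label, hits = classify("AAWDGLLAA", motifs, k=3)
```

In this sketch the matched k-mers double as the reported motif, which mirrors the abstract's point that k-mer-based tools can return the peptides that drove a prediction.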
3

Li, Zhi, Tianyue Zhang, Haojie Lei, Liyan Wei, Yuanning Liu, Yadi Shi, Shuyi Li, et al. "Research on Gastric Cancer’s Drug-resistant Gene Regulatory Network Model." Current Bioinformatics 15, no. 3 (May 23, 2020): 225–34. http://dx.doi.org/10.2174/1574893614666190722102557.

Abstract:
Objective: Using bioinformatics methods, differentially expressed gene data on drug resistance in gastric cancer were analyzed, screened and mined through modeling and network modeling to find valuable data associated with multi-drug resistance of gastric cancer. Methods: First, data sets were preprocessed from three aspects: data processing, data annotation and classification, and functional clustering. Second, based on the preprocessed data, each classified primary gene regulatory network was constructed by mining interactions among the genes. This paper computed the values of each node in each classified primary gene regulatory network and ranked these nodes according to their scores. On this basis, the appropriate core node was selected and the corresponding core network was developed. Results and Conclusion: Finally, the mined core network modules were analyzed. After the correlation analysis, the results showed that the constructed network module had 20 core genes. This module contained valuable data associated with multi-drug resistance in gastric cancer.
4

Cleemput, Sara, Wim Dumon, Vagner Fonseca, Wasim Abdool Karim, Marta Giovanetti, Luiz Carlos Alcantara, Koen Deforche, and Tulio de Oliveira. "Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes." Bioinformatics 36, no. 11 (February 28, 2020): 3552–55. http://dx.doi.org/10.1093/bioinformatics/btaa145.

Abstract:
Summary: Genome Detective is a web-based, user-friendly software application to quickly and accurately assemble all known virus genomes from next-generation sequencing datasets. This application allows the identification of phylogenetic clusters and genotypes from assembled genomes in FASTA format. Since its release in 2019, we have produced a number of typing tools for emergent viruses that have caused large outbreaks, such as Zika and Yellow Fever Virus in Brazil. Here, we present the Genome Detective Coronavirus Typing Tool that can accurately identify the novel severe acute respiratory syndrome (SARS)-related coronavirus (SARS-CoV-2) sequences isolated in China and around the world. The tool can accept up to 2000 sequences per submission and the analysis of a new whole-genome sequence will take approximately 1 min. The tool has been tested and validated with hundreds of whole genomes from 10 coronavirus species, and correctly classified all of the SARS-related coronavirus (SARSr-CoV) and all of the available public data for SARS-CoV-2. The tool also allows tracking of new viral mutations as the outbreak expands globally, which may help to accelerate the development of novel diagnostics, drugs and vaccines to stop the COVID-19 disease. Availability and implementation: https://www.genomedetective.com/app/typingtool/cov. Contact: koen@emweb.be or deoliveira@ukzn.ac.za. Supplementary information: Supplementary data are available at Bioinformatics online.
5

Pham, Vu V. H., Lin Liu, Cameron P. Bracken, Gregory J. Goodall, Jiuyong Li, and Thuc D. Le. "DriverGroup: a novel method for identifying driver gene groups." Bioinformatics 36, Supplement_2 (December 2020): i583–i591. http://dx.doi.org/10.1093/bioinformatics/btaa797.

Abstract:
Motivation: Identifying cancer driver genes is a key task in cancer informatics. Most existing methods are focused on individual cancer drivers which regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesize that there are driver gene groups that work in concert to regulate cancer, and we develop a novel computational method to detect those driver gene groups. Results: We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (i) constructing the gene network, (ii) discovering critical nodes of the constructed network and (iii) identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we first assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify driver groups for breast cancer. The identified driver groups are promising as several group members are confirmed to be related to cancer in the literature. We further use the predicted driver groups in survival analysis and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup. Availability and implementation: DriverGroup is available at https://github.com/pvvhoang/DriverGroup. Supplementary information: Supplementary data are available at Bioinformatics online.
6

Meirson, Tomer, David Bomze, Liron Kahlon, Hava Gil-Henn, and Abraham O. Samson. "A helical lock and key model of polyproline II conformation with SH3." Bioinformatics 36, no. 1 (June 28, 2019): 154–59. http://dx.doi.org/10.1093/bioinformatics/btz527.

Abstract:
Motivation: More than half of the human proteome contains the proline-rich motif, PxxP. This motif has a high propensity for adopting a left-handed polyproline II (PPII) helix and can potentially bind SH3 domains. SH3 domains are generally grouped into two classes, based on whether the PPII binds in a positive (N-to-C terminal) or negative (C-to-N terminal) orientation. Since the discovery of this structural motif, over six decades ago, a systematic understanding of its binding remains poor and the consensus amino acid sequence that binds SH3 domains is still ill defined. Results: Here, we show that the PPII interaction with SH3 domains is governed by the helix backbone and its prolines, and their rotation angle around the PPII helical axis. Based on a geometric analysis of 131 experimentally solved SH3 domains in complex with PPIIs, we observed a rotary translation along the helical screw axis, and separated them by 120° into three categories we name α (0–120°), β (120–240°) and γ (240–360°). Furthermore, we found that PPII helices are distinguished by a shifting PxxP motif preceded by positively charged residues which act as a structural reading frame and dictate the organization of SH3 domains; however, there is no one single consensus motif for all classified PPIIs. Our results demonstrate a remarkable apparatus of a lock with a rotating and translating key with no known equivalent machinery in molecular biology. We anticipate our model to be a starting point for deciphering the PPII code, which can unlock an exponential growth in our understanding of the relationship between protein structure and function. Availability and implementation: We have implemented the proposed methods in the R software environment and in an R package freely available at https://github.com/Grantlab/bio3d. Supplementary information: Supplementary data are available at Bioinformatics online.
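The three rotation categories named in this abstract amount to binning an angle into 120° sectors. The Python sketch below shows that binning only; it does not compute the rotation angle from a structure, and it is unrelated to the authors' R implementation. The function name and the handling of the exact boundaries (120° is placed in β, 240° in γ, and angles are wrapped modulo 360°) are assumptions, since the abstract does not specify them.

```python
def ppii_category(angle_deg):
    """Map a rotation angle around the PPII helical axis (in degrees) to the
    alpha/beta/gamma categories named in the abstract.

    Boundary handling is an assumption: each sector is treated as half-open,
    so 120 falls in beta and 240 falls in gamma.
    """
    a = angle_deg % 360.0  # wrap any angle into [0, 360)
    if a < 120.0:
        return "alpha"
    if a < 240.0:
        return "beta"
    return "gamma"


categories = [ppii_category(a) for a in (30, 150, 300)]
```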
7

Xie, Xiaoli, and Yunxiu Zhao. "A 2D Non-degeneracy Graphical Representation of Protein Sequence and Its Applications." Current Bioinformatics 15, no. 7 (December 15, 2020): 758–66. http://dx.doi.org/10.2174/1574893615666200106114337.

Abstract:
Background: The comparison of protein sequences is an important research field in bioinformatics. Many alignment-free methods have been proposed. Objective: To mine more information from protein sequences, this study focuses on a new alignment-free method based on the physicochemical properties of amino acids. Methods: An average physicochemical value (Apv) is defined. For a given protein sequence, a 2D curve is drawn based on the Apv and the position of each amino acid; the curve has no loops or intersections. From the curve, the similarity or dissimilarity of protein sequences can be analyzed. Results and Conclusion: Two groups of protein sequences are taken as examples to illustrate the new method. The protein sequences can be classified correctly, and the results are highly correlated with those of ClustalW. The new method is simple and effective.
8

Wei, Pan, XiaoDong Xie, Ran Wang, JianFeng Zhang, Feng Li, ZhaoPeng Luo, Zhong Wang, MingZhu Wu, Jun Yang, and PeiJian Cao. "Genetic Diversity of Blattella germanica Isolates from Central China based on Mitochondrial Genes." Current Bioinformatics 14, no. 7 (September 17, 2019): 574–80. http://dx.doi.org/10.2174/1574893614666190204153041.

Abstract:
Background: Blattella germanica is a widespread urban invader insect that can spread numerous types of human pathogens, including bacteria, fungi, and protozoa. Despite the medical significance of B. germanica, the genetic diversity of this species has not been investigated across its wide geographical distribution in China. Objective: In this study, the genetic variation of B. germanica was evaluated in central China. Methods: Fragments of the mitochondrial cytochrome c oxidase subunit I (COI) gene and the 16S rRNA gene were amplified in 36 B. germanica isolates from 7 regions. The sequence data for COI and 16S rRNA genes were analyzed using bioinformatics methods. Results: In total, 13 haplotypes were found among the concatenated sequences. Each sampled population, and the total population, had high haplotype diversity (Hd) that was accompanied by low nucleotide diversity (Pi). Molecular genetic variation analysis indicated that 84.33% of the genetic variation derived from intra-region sequences. Phylogenetic analysis indicated that the B. germanica isolates from central China should be classified as a single population. Demographic analysis rejected the hypothesis of sudden population expansion of the B. germanica population. Conclusion: The 36 isolates of B. germanica sampled in this study had high genetic variation and belonged to the same species. They should be classified as a single population. The mismatch distribution analysis and BSP analysis did not support a demographic population expansion of the B. germanica population, which provided useful knowledge for monitoring changes in parasite populations for future control strategies.
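For readers unfamiliar with the two summary statistics reported in this abstract, the Python sketch below computes haplotype diversity (Hd) and nucleotide diversity (Pi) using their standard textbook definitions (Nei's estimators). The abstract does not say which software or exact estimators the authors used, so this is an assumption; the function names and the toy aligned sequences are made up for illustration and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies).
    Requires at least two sequences."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Pi = mean proportion of differing sites over all sequence pairs.
    Assumes the sequences are aligned and of equal length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy aligned sequences (hypothetical): three haplotypes among four isolates.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

A high Hd with a low Pi, as reported in the abstract, means that many distinct haplotypes exist but they differ from one another at only a few sites.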
9

Nguyen, Hien Thi, Van Mai Do, Thanh Thuy Phan, and Dung Tam Nguyen Huynh. "The Potential of Ameliorating COVID-19 and Sequelae From Andrographis paniculata via Bioinformatics." Bioinformatics and Biology Insights 17 (January 2023): 117793222211496. http://dx.doi.org/10.1177/11779322221149622.

Abstract:
The current coronavirus disease 2019 (COVID-19) outbreak is alarmingly escalating and raises challenges in finding efficient compounds for treatment. Repurposing phytochemicals in herbs is an ideal and economical approach for screening potential herbal components against COVID-19. Andrographis paniculata, also known as Chuan Xin Lian, has traditionally been used as an anti-inflammatory and antibacterial herb for centuries and has recently been classified as a promising herbal remedy for adjuvant therapy in treating respiratory diseases. This study aimed to screen Chuan Xin Lian’s bioactive components and elicit the potential pharmacological mechanisms and plausible pathways for treating COVID-19 using network pharmacology combined with molecular docking. The results found terpenoid (andrographolide) and flavonoid (luteolin, quercetin, kaempferol, and wogonin) derivatives had remarkable potential against COVID-19 and sequelae owing to their high degrees in the component-target-pathway network and strong binding capacities in docking scores. In addition, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the PI3K-AKT signaling pathway might be the most vital molecular pathway in the pathophysiology of COVID-19 and long-term sequelae whereby therapeutic strategies can intervene.
10

Pournoor, Ehsan, Naser Elmi, and Ali Masoudi-Nejad. "CatbNet: A Multi Network Analyzer for Comparing and Analyzing the Topology of Biological Networks." Current Genomics 20, no. 1 (February 27, 2019): 69–75. http://dx.doi.org/10.2174/1389202919666181213101540.

Abstract:
Background: The complexity and dynamic nature of biological events is a reason to use comprehensive and holistic approaches to deal with their difficulty. Currently, with advances in omics data generation, network-based approaches are frequently used in different areas of computational biology and bioinformatics to solve problems in a systematic way. There are also many applications and tools for network data analysis and manipulation whose goal is to improve our understanding of inter- and intracellular interactions. Methods: In this article, we introduce CatbNet, a multi-network analyzer application designed for network comparison. Result and Conclusion: CatbNet uses many topological features of networks to compare their structure and foundations. One of the most prominent properties of this application is classified network analysis, in which groups of networks are compared with each other.
11

Chao, Haoyu, Yueming Hu, Liang Zhao, Saige Xin, Qingyang Ni, Peijing Zhang, and Ming Chen. "Biogenesis, Functions, Interactions, and Resources of Non-Coding RNAs in Plants." International Journal of Molecular Sciences 23, no. 7 (March 28, 2022): 3695. http://dx.doi.org/10.3390/ijms23073695.

Abstract:
Plant transcriptomes encompass a large number of functional non-coding RNAs (ncRNAs), only some of which have protein-coding capacity. Since their initial discovery, ncRNAs have been classified into two broad categories based on their biogenesis and mechanisms of action: housekeeping ncRNAs and regulatory ncRNAs. With advances in RNA sequencing technology and computational methods, bioinformatics resources continue to emerge and update rapidly, including workflows for in silico ncRNA analysis, up-to-date platforms, databases, and tools dedicated to ncRNA identification and functional annotation. In this review, we aim to describe the biogenesis, biological functions, and interactions with DNA, RNA, protein, and microorganisms of five major regulatory ncRNAs (miRNA, siRNA, tsRNA, circRNA, lncRNA) in plants. Then, we systematically summarize tools for analysis and prediction of plant ncRNAs, as well as databases. Furthermore, we discuss the in silico analysis process of these ncRNAs and present a protocol for step-by-step computational analysis of ncRNAs. In general, this review will help researchers better understand the world of ncRNAs at multiple levels.
12

Rai, Devendra K., Paul Lawrence, Steve J. Pauszek, Maria E. Piccone, Nick J. Knowles, and Elizabeth Rieder. "Bioinformatics and Molecular Analysis of the Evolutionary Relationship between Bovine Rhinitis A Viruses and Foot-And-Mouth Disease Virus." Bioinformatics and Biology Insights 9s2 (January 2015): BBI.S37223. http://dx.doi.org/10.4137/bbi.s37223.

Abstract:
Bovine rhinitis viruses (BRVs) cause mild respiratory disease of cattle. In this study, a near full-length genome sequence of a virus named RS3X (formerly classified as bovine rhinovirus type 1), isolated from infected cattle from the UK in the 1960s, was obtained and analyzed. Compared to other closely related Aphthoviruses, major differences were detected in the leader protease (Lpro), P1, 2B, and 3A proteins. Phylogenetic analysis revealed that RS3X was a member of the species bovine rhinitis A virus (BRAV). Using different codon-based and branch-site selection models for Aphthoviruses, including BRAV RS3X and foot-and-mouth disease virus, we observed no clear evidence for genomic regions undergoing positive selection. However, within each of the BRV species, multiple sites under positive selection were detected. The results also suggest that the probability (determined by Recombination Detection Program) for recombination events between BRVs and other Aphthoviruses, including foot-and-mouth disease virus was not significant. In contrast, within BRVs, the probability of recombination increases. The data reported here provide genetic information to assist in the identification of diagnostic signatures and research tools for BRAV.
13

Zeeshan, Saman, Ruoyun Xiong, Bruce T. Liang, and Zeeshan Ahmed. "100 Years of evolving gene–disease complexities and scientific debutants." Briefings in Bioinformatics 21, no. 3 (April 11, 2019): 885–905. http://dx.doi.org/10.1093/bib/bbz038.

Abstract:
It has been over 100 years since the word 'gene' came into use, and the concept has progressively evolved in several scientific directions. Successive technological advancements have revolutionized the field of genomics, for example through triple code development, gene number proposition, genetic mapping, data banks, gene–disease maps, catalogs of human genes and genetic disorders, CRISPR/Cas9, big data and next generation sequencing. In this manuscript, we present the progress of genomics from pea plant genetics to the human genome project and highlight the molecular, technical and computational developments. Studying the genome and epigenome led to the fundamentals of the development and progression of human diseases, which include chromosomal, monogenic, multifactorial and mitochondrial diseases. The World Health Organization has classified, standardized and maintained all human diseases, while many academic and commercial online systems share information about genes and link it to associated diseases. To efficiently fathom the wealth of this biological data, there is a crucial need to generate appropriate gene annotation repositories and resources. Our focus has been how many gene–disease databases are available worldwide and which sources are authentic, timely updated and recommended for research and clinical purposes. In this manuscript, we have discussed and compared 43 such databases and bioinformatics applications, which enable users to connect, explore and, if possible, download gene–disease data.
14

Nwadiugwu, Martin C. "Expression, Interaction, and Role of Pseudogene Adh6-ps1 in Cancer and other Disease Phenotypes." Bioinformatics and Biology Insights 15 (January 2021): 117793222110405. http://dx.doi.org/10.1177/11779322211040591.

Abstract:
Pseudogenes have been classified as functionless and their annotation is an ongoing problem. Adh6-ps1, a mouse pseudogene belonging to the alcohol dehydrogenase gene complex (Adh), was analyzed to review its conservation, homology, expression and interactions, and to identify any role it plays in disease phenotypes using bioinformatics databases. Results showed that Adh6-ps1 has 2 transcripts (processed and unprocessed) which may have emerged from a transposition and duplication event, respectively, and that induced inversions (Uox gene, In(3)11Rk) involving gene complexes associated with Adh6-ps1 have been implicated in a diverse range of diseases. Adh6-ps1 is highly conserved in vertebrates, particularly rodents, and expressed in the liver. The top 5 miRNA targets identified are Mir455, Mir511, Mir1903, Mir361, and Mir669o markers. While much is unknown about Mir1903 and Mir669o, the silencing of Mir455 and Mir511 is linked with hepatocellular carcinoma (HCC), and Mir361 is implicated in endometrial cancers. Given the identified miRNA interactions with Adh6-ps1 and their expression in cancer phenotypes, and Adh6-ps1 associations with induced inversions, it may well have a role in tumorigenesis and disease phenotypes. Nonetheless, further studies are required to establish these facts and to add to the growing efforts to understand pseudogenes and their potential involvement in disease conditions.
15

Poot Velez, Albros Hermes, Fernando Fontove, and Gabriel Del Rio. "Protein–Protein Interactions Efficiently Modeled by Residue Cluster Classes." International Journal of Molecular Sciences 21, no. 13 (July 6, 2020): 4787. http://dx.doi.org/10.3390/ijms21134787.

Abstract:
Predicting protein–protein interactions (PPI) represents an important challenge in structural bioinformatics. Current computational methods display different degrees of accuracy when predicting these interactions. Different factors were proposed to help improve these predictions, including choosing the proper descriptors of proteins to represent these interactions, among others. In the current work, we provide a representative protein structure that is amenable to PPI classification using machine learning approaches, referred to as residue cluster classes. Through sampling and optimization, we identified the best algorithm–parameter pair to classify PPI from more than 360 different training sets. We tested these classifiers against PPI datasets that were not included in the training set but shared sequence similarity with proteins in the training set to reproduce the situation of most proteins sharing sequence similarity with others. We identified a model with almost no PPI error (96–99% of correctly classified instances) and showed that residue cluster classes of protein pairs displayed a distinct pattern between positive and negative protein interactions. Our results indicated that residue cluster classes are structural features relevant to model PPI and provide a novel tool to mathematically model the protein structure/function relationship.
16

Agrawal, A., MJ Khan, DE Graugnard, M. Vailati-Riboni, SL Rodriguez-Zas, JS Osorio, and JJ Loor. "Prepartal Energy Intake Alters Blood Polymorphonuclear Leukocyte Transcriptome During the Peripartal Period in Holstein Cows." Bioinformatics and Biology Insights 11 (January 1, 2017): 117793221770466. http://dx.doi.org/10.1177/1177932217704667.

Abstract:
In the dairy industry, cow health and farmer profits depend on the balance between diet (ie, nutrient composition, daily intake) and metabolism. This is especially true during the transition period, where dramatic physiological changes foster vulnerability to immunosuppression, negative energy balance, and clinical and subclinical disorders. Using an Agilent microarray platform, this study examined changes in the transcriptome of bovine polymorphonuclear leukocytes (PMNLs) due to prepartal dietary intake. Holstein cows were fed a high-straw, control-energy diet (CON; NEL = 1.34 Mcal/kg) or overfed a moderate-energy diet (OVE; NEL = 1.62 Mcal/kg) during the dry period. Blood for PMNL isolation and metabolite analysis was collected at −14 and +7 days relative to parturition. At an analysis of variance false discovery rate <0.05, energy intake (OVE vs CON) influenced 1806 genes. Dynamic Impact Approach bioinformatics analysis classified treatment effects on Kyoto Encyclopedia of Genes and Genomes pathways, including activated oxidative phosphorylation and biosynthesis of unsaturated fatty acids and inhibited RNA polymerase, proteasome, and toll-like receptor signaling pathway. This analysis indicates that processes critical for energy metabolism and cellular and immune function were affected with mixed results. However, overall interpretation of the transcriptome data agreed in part with literature documenting a potentially detrimental, chronic activation of PMNL in response to overfeeding. The widespread, transcriptome-level changes captured here confirm the importance of dietary energy adjustments around calving on the immune system.
17

Smail, Harem othman. "Evolution of human diseases." International Journal of Applied Biology 4, no. 1 (June 29, 2020): 52–67. http://dx.doi.org/10.20956/ijab.v4i1.9914.

Abstract:
The main aim of this review was to understand the role of evolutionary processes in human disease. Humans may have suffered from disease for millions of years and continue to do so, and human diseases can be classified into many types based on their sources, such as bacterial, genetic and viral. Over the past sixty years, scientists have carried out a large number of experiments to understand the impact of evolutionary processes on human disease. The main example of the effect of evolution on human health is the overuse of antibiotics against bacterial infections, which has resulted in the rapid evolution of bacteria that are resistant to multiple antibiotics, even vancomycin. The process of natural selection proposed by Charles Darwin plays a vital role in biological and medical processes and also helps to predict and find the relationship between natural selection and phenotypic traits. Understanding of the developmental and genetic underpinnings of unique evolutionary changes has been hindered by insufficient databases of evolutionary anatomy and by the lack of a computational method for identifying the underlying candidate genes and regulators; the study of evolution is aided by other branches of modern science such as genetics, bioinformatics, epidemiology, ecology, microbiology, molecular biology and biochemistry.
18

Grudman, Steven, J. Eduardo Fajardo, and Andras Fiser. "INTERCAAT: identifying interface residues between macromolecules." Bioinformatics, September 9, 2021. http://dx.doi.org/10.1093/bioinformatics/btab596.

Abstract:
Summary: The Interface Contact definition with Adaptable Atom Types (INTERCAAT) was developed to determine the atomic interactions between molecules that form a known three-dimensional structure. First, INTERCAAT creates a Voronoi tessellation where each atom acts as a seed. Interactions are defined by atoms that share a hyperplane and whose distance is less than the sum of the atoms' van der Waals radii plus the diameter of a solvent molecule. Interacting atoms are then classified and interactions are filtered based on compatibility. INTERCAAT implements an adaptive atom classification method; therefore, it can explore interfaces between a variety of macromolecules. Availability and implementation: Source code is freely available at: https://gitlab.com/fiserlab.org/intercaat. Supplementary information: Supplementary data are available at Bioinformatics online.
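The distance part of the contact definition in this abstract can be written down directly. The Python sketch below checks only that criterion (distance less than the sum of the two van der Waals radii plus a solvent diameter); it omits the Voronoi shared-face test, the atom classification, and the compatibility filter, and it is not taken from the INTERCAAT source code. The radii are commonly cited Bondi values, and the 2.8 Å solvent diameter (a water probe of radius 1.4 Å) is an assumption; INTERCAAT's own parameters may differ.

```python
import math

# Commonly cited Bondi van der Waals radii in angstroms (assumed here;
# INTERCAAT's atom typing and radii may differ).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

# Assumed solvent diameter in angstroms (water probe radius of 1.4 A).
SOLVENT_DIAMETER = 2.8


def within_contact_distance(atom_a, atom_b, solvent_diameter=SOLVENT_DIAMETER):
    """Each atom is (element, (x, y, z)).

    Returns True if the interatomic distance is less than
    r_a + r_b + solvent_diameter. This covers only the distance criterion,
    not the requirement that the atoms share a Voronoi face.
    """
    (el_a, xyz_a), (el_b, xyz_b) = atom_a, atom_b
    cutoff = VDW_RADII[el_a] + VDW_RADII[el_b] + solvent_diameter
    return math.dist(xyz_a, xyz_b) < cutoff


# Hypothetical coordinates: a carbon at the origin and two oxygens.
carbon = ("C", (0.0, 0.0, 0.0))
near_oxygen = ("O", (5.0, 0.0, 0.0))   # 5.0 A away, cutoff is 6.02 A
far_oxygen = ("O", (7.0, 0.0, 0.0))    # 7.0 A away
near = within_contact_distance(carbon, near_oxygen)
far = within_contact_distance(carbon, far_oxygen)
```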
19

He, Hongxin, Manhong Shi, Yuxin Lin, Chaoying Zhan, Rongrong Wu, Cheng Bi, Xingyun Liu, Shumin Ren, and Bairong Shen. "HFBD: a biomarker knowledge database for heart failure heterogeneity and personalized applications." Bioinformatics, March 23, 2021. http://dx.doi.org/10.1093/bioinformatics/btab470.

Abstract:
Abstract Motivation Heart failure (HF) is a cardiovascular disease with a high incidence around the world. Accumulating studies have focused on the identification of biomarkers for HF precision medicine. To understand the HF heterogeneity and provide biomarker information for the personalized diagnosis and treatment of HF, a knowledge database collecting the distributed and multiple-level biomarker information is necessary. Results In this study, the HF biomarker knowledge database (HFBD) was established by manually collecting the data and knowledge from literature in PubMed. HFBD contains 2618 records and 868 HF biomarkers (731 single and 137 combined) extracted from 1237 original articles. The biomarkers were classified into proteins, RNAs, DNAs and the others at molecular, image, cellular and physiological levels. The biomarkers were annotated with biological, clinical and article information as well as the experimental methods used for the biomarker discovery. With its user-friendly interface, this knowledge database provides a unique resource for the systematic understanding of HF heterogeneity and personalized diagnosis and treatment of HF in the era of precision medicine. Availability and implementation The platform is openly available at http://sysbio.org.cn/HFBD/.
20

Kim, Bong-Hyun, Kijin Yu, and Peter C. W. Lee. "Cancer classification of single-cell gene expression data by neural network." Bioinformatics, October 11, 2019. http://dx.doi.org/10.1093/bioinformatics/btz772.

Abstract:
Abstract Motivation Cancer classification based on gene expression profiles has provided insight on the causes of cancer and cancer treatment. Recently, machine learning-based approaches have been attempted in downstream cancer analysis to address the large differences in gene expression values, as determined by single-cell RNA sequencing (scRNA-seq). Results We designed cancer classifiers that can identify 21 types of cancers and normal tissues based on bulk RNA-seq as well as scRNA-seq data. Training was performed with 7398 cancer samples and 640 normal samples from 21 tumors and normal tissues in TCGA based on the 300 most significant genes expressed in each cancer. Then, we compared neural network (NN), support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF) methods. The NN performed consistently better than other methods. We further applied our approach to scRNA-seq transformed by kNN smoothing and found that our model successfully classified cancer types and normal samples. Availability and implementation Cancer classification by neural network. Supplementary information Supplementary data are available at Bioinformatics online.
21

Guo, Jingwen, David Starr, and Huazhang Guo. "Classification and review of free PCR primer design software." Bioinformatics, October 26, 2020. http://dx.doi.org/10.1093/bioinformatics/btaa910.

Abstract:
Abstract Motivation Polymerase chain reaction (PCR) has been a revolutionary biomedical advancement. However, for PCR to be appropriately used, one must spend a significant amount of effort on PCR primer design. Carefully designed PCR primers not only increase sensitivity and specificity, but also decrease effort spent on experimental optimization. Computer software removes the human element by performing and automating the complex and rigorous calculations required in PCR primer design. Classification and review of the available software options and their capabilities should be a valuable resource for any PCR application. Results This article focuses on currently available free PCR primer design software and their major functions (https://pcrprimerdesign.github.io/). The software are classified according to their PCR applications, such as Sanger sequencing, reverse transcription quantitative PCR, single nucleotide polymorphism detection, splicing variant detection, methylation detection, microsatellite detection, multiplex PCR and targeted next generation sequencing, and conserved/degenerate primers to clone orthologous genes from related species, new gene family members in the same species, or to detect a group of related pathogens. Each software is summarized to provide a technical review of their capabilities and utilities. Contact huazhang.guo@health.slu.edu
22

Zhang, Xiaolong, Zhikai Qian, Ye Wang, Qingfeng Zhang, Kai Yu, Yongqiang Zheng, Zekun Liu, Qi Zhao, and Ze-Xian Liu. "DrugCVar: a platform for evidence-based drug annotation for genetic variants in cancer." Bioinformatics, April 15, 2022. http://dx.doi.org/10.1093/bioinformatics/btac273.

Abstract:
Abstract Motivation Targeted therapy for cancer-related genetic variants is critical for precision medicine. Although several databases, including The Clinical Interpretation of Variants in Cancer (CIViC), The Oncology Knowledge Base (OncoKB), The Cancer Genome Interpreter (CGI) and My Cancer Genome (MCG), provide clinical interpretations of variants in cancer, the clinical evidence was limited and miscellaneous. In this study, we developed the DrugCVar database, which integrated our manually curated cancer variant-drug targeting evidence from literature and the interpretations from the public resources. Results In total, 7,830 pieces of clinical evidence for cancer variant-drug targeting were integrated and classified into ten evidence tiers. Searching and browsing functions were provided for quick queries of cancer variant-drug targeting evidence. A batch annotation module was also developed for large sets of user-provided genetic variants in various formats. Details such as the mutation function, location of the variants in gene and protein structures, and mutation statistics of queried genes in various tumor types were also provided for further investigations. Thus, DrugCVar could serve as a comprehensive annotation tool to interpret potential drugs for cancer variants, especially the large sets from clinical cancer genomics studies. Availability and implementation The database is available at http://drugcvar.omicsbio.info. Supplementary information Supplementary data are available at Bioinformatics online.
23

Rawat, Puneet, R. Prabakaran, Sandeep Kumar, and M. Michael Gromiha. "AggreRATE-Pred: a mathematical model for the prediction of change in aggregation rate upon point mutation." Bioinformatics, October 10, 2019. http://dx.doi.org/10.1093/bioinformatics/btz764.

Abstract:
Abstract Motivation Protein aggregation is a major unsolved problem in biochemistry with implications for several human diseases, biotechnology and biomaterial sciences. A majority of sequence-structural properties known for their mechanistic roles in protein aggregation do not correlate well with the aggregation kinetics. This limits the practical utility of predictive algorithms. Results We analyzed experimental data on 183 unique single point mutations that lead to change in aggregation rates for 23 polypeptides and proteins. Our initial mathematical model obtained a correlation coefficient of 0.43 between predicted and experimental change in aggregation rate upon mutation (P-value <0.0001). However, when the dataset was classified based on protein length and conformation at the mutation sites, the average correlation coefficient almost doubled to 0.82 (range: 0.74–0.87; P-value <0.0001). We observed that distinct sequence and structure-based properties determine protein aggregation kinetics in each class. In conclusion, the protein aggregation kinetics are impacted by local factors and not by global ones, such as overall three-dimensional protein fold, or mechanistic factors such as the presence of aggregation-prone regions. Availability and implementation The web server is available at http://www.iitm.ac.in/bioinfo/aggrerate-pred/. Supplementary information Supplementary data are available at Bioinformatics online.
24

Kuang, Da, Rebecca Truty, Jochen Weile, Britt Johnson, Keith Nykamp, Carlos Araya, Robert L. Nussbaum, and Frederick P. Roth. "Prioritizing genes for systematic variant effect mapping." Bioinformatics, December 10, 2020. http://dx.doi.org/10.1093/bioinformatics/btaa1008.

Abstract:
Abstract Motivation When rare missense variants are clinically interpreted as to their pathogenicity, most are classified as variants of uncertain significance (VUS). Although functional assays can provide strong evidence for variant classification, such results are generally unavailable. Multiplexed assays of variant effect can generate experimental ‘variant effect maps’ that score nearly all possible missense variants in selected protein targets for their impact on protein function. However, these efforts have not always prioritized proteins for which variant effect maps would have the greatest impact on clinical variant interpretation. Results Here, we mined databases of clinically interpreted variants and applied three strategies, each building on the previous, to prioritize genes for systematic functional testing of missense variation. The strategies ranked genes (i) by the number of unique missense VUS that had been reported to ClinVar; (ii) by movability- and reappearance-weighted impact scores, to give extra weight to reappearing, movable VUS and (iii) by difficulty-adjusted impact scores, to account for the more resource-intensive nature of generating variant effect maps for longer genes. Our results could be used to guide systematic functional testing of missense variation toward greater impact on clinical variant interpretation. Availability and implementation Source code available at: https://github.com/rothlab/mave-gene-prioritization Supplementary information Supplementary data are available at Bioinformatics online.
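The three ranking strategies listed in this abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: the abstract does not give the scoring formulas, so multiplying a movability weight by a reappearance count and dividing by protein length are assumptions, as are all gene names, variants and numbers.

```python
from collections import defaultdict

# Hypothetical records: (gene, variant, movability weight, reappearance count).
vus_records = [
    ("GENE_A", "p.A10T", 0.9, 3),
    ("GENE_A", "p.R25C", 0.5, 1),
    ("GENE_B", "p.G5D", 0.8, 1),
    ("GENE_B", "p.L7P", 0.8, 1),
    ("GENE_B", "p.K9E", 0.2, 1),
]

# Hypothetical protein lengths in residues, used as a proxy for difficulty.
protein_length = {"GENE_A": 300, "GENE_B": 1500}


def rank(scores):
    """Return gene names sorted by descending score, ties broken by name."""
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


def prioritize(records, lengths):
    """Rank genes by (i) unique VUS count, (ii) weighted impact,
    and (iii) impact adjusted for protein length."""
    unique = defaultdict(set)
    weighted = defaultdict(float)
    for gene, variant, movability, reappearance in records:
        if variant not in unique[gene]:  # count each variant once per gene
            unique[gene].add(variant)
            weighted[gene] += movability * reappearance  # assumed formula
    counts = {gene: len(variants) for gene, variants in unique.items()}
    adjusted = {gene: weighted[gene] / lengths[gene] for gene in weighted}
    return rank(counts), rank(dict(weighted)), rank(adjusted)


by_count, by_impact, by_adjusted = prioritize(vus_records, protein_length)
```

With these made-up inputs the gene with more VUS leads the first ranking, while the weighted and length-adjusted rankings favor the shorter gene with the reappearing variant, which is the kind of reordering the three strategies are meant to expose.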
25

Zeng, Wenhuan, Anupam Gautam, and Daniel H. Huson. "DeepToA: An Ensemble Deep-Learning Approach to Predicting the Theater of Activity of a Microbiome." Bioinformatics, August 27, 2022. http://dx.doi.org/10.1093/bioinformatics/btac584.

Abstract:
Abstract Motivation Metagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a “theater of activity” (ToA). An important question is, to what degree does the taxonomic and functional content of the former depend on the (details of the) latter? Here we investigate a related technical question: Given a taxonomic and/or functional profile estimated from metagenomic sequencing data, how to predict the associated ToA? We present a deep-learning approach to this question. We use both taxonomic and functional profiles as input. We apply node2vec to embed hierarchical taxonomic profiles into numerical vectors. We then perform dimension reduction using clustering, to address the sparseness of the taxonomic data and thus make the problem more amenable to deep-learning algorithms. Functional features are combined with textual descriptions of protein families or domains. We present an ensemble deep-learning framework DeepToA for predicting the “theater of activity” of a microbial community, based on taxonomic and functional profiles. We use SHAP (SHapley Additive exPlanations) values to determine which taxonomic and functional features are important for the prediction. Results Based on 7,560 metagenomic profiles downloaded from MGnify, classified into ten different theaters of activity, we demonstrate that DeepToA has an accuracy of 98.30%. We show that adding textual information to functional features increases the accuracy. Availability Our approach is available at http://ab.inf.uni-tuebingen.de/software/deeptoa. Supplementary information Supplementary data are available at Bioinformatics online.
26

Kumar, Prasun, and Derek N. Woolfson. "Socket2: a program for locating, visualizing and analyzing coiled-coil interfaces in protein structures." Bioinformatics, September 8, 2021. http://dx.doi.org/10.1093/bioinformatics/btab631.

Abstract:
Abstract Motivation Protein–protein interactions are central to all biological processes. One frequently observed mode of such interactions is the α-helical coiled coil (CC). Thus, an ability to extract, visualize and analyze CC interfaces quickly and without expert guidance would facilitate a wide range of biological research. In 2001, we reported Socket, which locates and characterizes CCs in protein structures based on the knobs-into-holes (KIH) packing between helices in CCs. Since then, studies of natural and de novo designed CCs have boomed, and the number of CCs in the RCSB PDB has increased rapidly. Therefore, we have updated Socket and made it accessible to expert and nonexpert users alike. Results The original Socket only classified CCs with up to six helices. Here, we report Socket2, which rectifies this oversight to identify CCs with any number of helices, and KIH interfaces with any of the 20 proteinogenic residues or incorporating nonnatural amino acids. In addition, we have developed a new and easy-to-use web server with additional features. These include the use of NGL Viewer for instantly visualizing CCs, and tabs for viewing the sequence repeats, helix-packing angles and core-packing geometries of CCs identified and calculated by Socket2. Availability and implementation Socket2 has been tested on all modern browsers. It can be accessed freely at http://coiledcoils.chm.bris.ac.uk/socket2/home.html. The source code is distributed using an MIT licence and available to download under the Downloads tab of the Socket2 home page.
27

Shen, Yifei, Qinjie Chu, Michael P. Timko, and Longjiang Fan. "scDetect: a rank-based ensemble learning algorithm for cell type identification of single-cell RNA sequencing in cancer." Bioinformatics, May 28, 2021. http://dx.doi.org/10.1093/bioinformatics/btab410.

Abstract:
Abstract Motivation Single-cell RNA sequencing (scRNA-seq) has enabled the characterization of different cell types in many tissues and tumor samples. Cell type identification is essential for single-cell RNA profiling, currently transforming the life sciences. Often, this is achieved by searching for combinations of genes that have previously been implicated as being cell-type specific, an approach that is not quantitative and does not explicitly take advantage of other scRNA-seq studies. Batch effects and different data platforms greatly decrease the predictive performance in inter-laboratory and different data type validation. Results Here, we present a new ensemble learning method named ‘scDetect’ that combines gene expression rank-based analysis and a majority vote ensemble machine-learning probability-based prediction method capable of highly accurate classification of cells based on scRNA-seq data from different sequencing platforms. Because of tumor heterogeneity, in order to accurately predict tumor cells in the single-cell RNA-seq data, we have also incorporated cell copy number variation consensus clustering and epithelial score in the classification. We applied scDetect to scRNA-seq data from pancreatic tissue, mononuclear cells and tumor biopsy cells and show that scDetect classified individual cells with high accuracy and better than other publicly available tools. Availability and implementation scDetect is open-source software. Source code and test data are freely available from GitHub (https://github.com/IVDgenomicslab/scDetect/) and Zenodo (https://zenodo.org/record/4764132#.YKCOlrH5AYN). The examples and tutorial page is at https://ivdgenomicslab.github.io/scDetect-Introduction/. scDetect will also be available from Bioconductor. Supplementary information Supplementary data are available at Bioinformatics online.
28

Schwarz, Dominik, Guy Georges, Sebastian Kelm, Jiye Shi, Anna Vangone, and Charlotte M. Deane. "Co-evolutionary distance predictions contain flexibility information." Bioinformatics, August 12, 2021. http://dx.doi.org/10.1093/bioinformatics/btab562.

Abstract:
Abstract Motivation Co-evolution analysis can be used to accurately predict residue–residue contacts from multiple sequence alignments. The introduction of machine-learning techniques has enabled substantial improvements in precision and a shift from predicting binary contacts to predicting distances between pairs of residues. These developments have significantly improved the accuracy of de novo prediction of static protein structures. With AlphaFold2 lifting the accuracy of some predicted protein models close to experimental levels, structure prediction research will move on to other challenges. One of those areas is the prediction of more than one conformation of a protein. Here, we examine the potential of residue–residue distance predictions to be informative of protein flexibility rather than simply static structure. Results We used DMPfold to predict distance distributions for every residue pair in a set of proteins that showed both rigid and flexible behaviour. Residue pairs that were in contact in at least one reference structure were classified as rigid, flexible or neither. The predicted distance distribution of each residue pair was analysed for local maxima of probability indicating the most likely distance or distances between a pair of residues. We found that rigid residue pairs tended to have only a single local maximum in their predicted distance distributions while flexible residue pairs more often had multiple local maxima. These results suggest that the shape of predicted distance distributions contains information on the rigidity or flexibility of a protein and its constituent residues. Supplementary information Supplementary data are available at Bioinformatics online.
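The local-maximum analysis described in this abstract can be sketched for a binned distance distribution. This is a minimal illustration under assumptions: the function name `local_maxima`, the `min_prob` noise floor and the example distributions are hypothetical, and the paper's actual peak-detection procedure may differ.

```python
def local_maxima(probs, min_prob=0.05):
    """Return indices of bins that are strict local maxima.

    A bin counts as a peak if its probability is strictly greater than
    both neighbours (a missing neighbour at either end is treated as
    minus infinity) and at least `min_prob`. Flat plateaus are not
    reported as peaks in this simple version.
    """
    peaks = []
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else float("-inf")
        right = probs[i + 1] if i < len(probs) - 1 else float("-inf")
        if p > left and p > right and p >= min_prob:
            peaks.append(i)
    return peaks


# Hypothetical predicted distributions over six or fewer distance bins.
single_peak = [0.05, 0.2, 0.5, 0.2, 0.05]
double_peak = [0.05, 0.3, 0.1, 0.05, 0.35, 0.15]

single_peak_bins = local_maxima(single_peak)
double_peak_bins = local_maxima(double_peak)
```

Counting the returned indices gives the number of likely distances for a residue pair. According to the abstract, one peak was more typical of rigid pairs and several peaks of flexible ones, but this is a tendency and not a classification rule.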
29

Ma, Ziyang, and Jeongyoun Ahn. "Feature-weighted ordinal classification for predicting drug response in multiple myeloma." Bioinformatics, May 11, 2021. http://dx.doi.org/10.1093/bioinformatics/btab320.

Abstract:
Abstract Motivation Ordinal classification problems arise in a variety of real-world applications, in which samples need to be classified into categories with a natural ordering. An example of classifying high-dimensional ordinal data is to use gene expressions to predict the ordinal drug response, which has been increasingly studied in pharmacogenetics. Classical ordinal classification methods are typically not able to tackle high-dimensional data and standard high-dimensional classification methods discard the ordering information among the classes. Existing high-dimensional ordinal classification approaches usually assume a linear ordinality among the classes. We argue that manually labeled ordinal classes may not be linearly arranged in the data space, especially in high-dimensional complex problems. Results We propose a new approach that can project high-dimensional data into a lower discriminating subspace, where the innate ordinal structure of the classes is uncovered. The proposed method weights the features based on their rank correlations with the class labels and incorporates the weights into the framework of linear discriminant analysis. We apply the method to predict the response to each of two types of drugs for patients with multiple myeloma. A comparative analysis with both ordinal and nominal existing methods demonstrates that the proposed method can achieve a competitive predictive performance while honoring the intrinsic ordinal structure of the classes. We provide interpretations on the genes that are selected by the proposed approach to understand their drug-specific response mechanisms. Availability and implementation The data underlying this article are available in the Gene Expression Omnibus Database at https://www.ncbi.nlm.nih.gov/geo/ and can be accessed with accession numbers GSE9782 and GSE68871. The source code for FWOC can be accessed at https://github.com/pisuduo/Feature-Weighted-Ordinal-Classification-FWOC.
Supplementary information Supplementary data are available at Bioinformatics online.
30

Kim, Mirae, Soonwoo Hong, Thomas E. Yankeelov, Hsin-Chih Yeh, and Yen-Liang Liu. "Deep learning-based classification of breast cancer cells using transmembrane receptor dynamics." Bioinformatics, August 13, 2021. http://dx.doi.org/10.1093/bioinformatics/btab581.

Abstract:
Abstract Motivation Motions of transmembrane receptors on cancer cell surfaces can reveal biophysical features of the cancer cells, thus providing a method for characterizing cancer cell phenotypes. While conventional analysis of receptor motions in the cell membrane mostly relies on the mean-squared displacement plots, much information is lost when producing these plots from the trajectories. Here we employ deep learning to classify breast cancer cell types based on the trajectories of epidermal growth factor receptor (EGFR). Our model is an artificial neural network trained on the EGFR motions acquired from six breast cancer cell lines of varying invasiveness and receptor status: MCF7 (hormone receptor positive), BT474 (HER2-positive), SKBR3 (HER2-positive), MDA-MB-468 (triple negative, TN), MDA-MB-231 (TN) and BT549 (TN). Results The model successfully classified the trajectories within individual cell lines with 83% accuracy and predicted receptor status with 85% accuracy. To further validate the method, epithelial–mesenchymal transition (EMT) was induced in benign MCF10A cells, noninvasive MCF7 cancer cells and highly invasive MDA-MB-231 cancer cells, and EGFR trajectories from these cells were tested. As expected, after EMT induction, both MCF10A and MCF7 cells showed higher rates of classification as TN cells, but not the MDA-MB-231 cells. Whereas deep learning-based cancer cell classifications are primarily based on the optical transmission images of cell morphology and the fluorescence images of cell organelles or cytoskeletal structures, here we demonstrated an alternative way to classify cancer cells using a dynamic, biophysical feature that is readily accessible. Availability and implementation A python implementation of deep learning-based classification can be found at https://github.com/soonwoohong/Deep-learning-for-EGFR-trajectory-classification. Supplementary information Supplementary data are available at Bioinformatics online.
31

Goyal, Mohit, Guillermo Serrano, Josepmaria Argemi, Ilan Shomorony, Mikel Hernaez, and Idoia Ochoa. "JIND: joint integration and discrimination for automated single-cell annotation." Bioinformatics, March 7, 2022. http://dx.doi.org/10.1093/bioinformatics/btac140.

Abstract:
Abstract Motivation An important step in the transcriptomic analysis of individual cells involves manually determining the cellular identities. To ease this labor-intensive annotation of cell-types, there has been a growing interest in automated cell annotation, which can be achieved by training classification algorithms on previously annotated datasets. Existing pipelines employ dataset integration methods to remove potential batch effects between source (annotated) and target (unannotated) datasets. However, the integration and classification steps are usually independent of each other and performed by different tools. We propose JIND (joint integration and discrimination for automated single-cell annotation), a neural-network-based framework for automated cell-type identification that performs integration in a space suitably chosen to facilitate cell classification. To account for batch effects, JIND performs a novel asymmetric alignment in which unseen cells are mapped onto the previously learned latent space, avoiding the need of retraining the classification model for new datasets. JIND also learns cell-type-specific confidence thresholds to identify cells that cannot be reliably classified. Results We show on several batched datasets that the joint approach to integration and classification of JIND outperforms in accuracy existing pipelines, and a smaller fraction of cells is rejected as unlabeled as a result of the cell-specific confidence thresholds. Moreover, we investigate cells misclassified by JIND and provide evidence suggesting that they could be due to outliers in the annotated datasets or errors in the original approach used for annotation of the target batch. Availability and implementation Implementation for JIND is available at https://github.com/mohit1997/JIND and the data underlying this article can be accessed at https://doi.org/10.5281/zenodo.6246322. Supplementary information Supplementary data are available at Bioinformatics online.
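The rejection step mentioned in this abstract, where cells that cannot be reliably classified are left unlabeled, can be sketched as below. This is an illustration only, not JIND's code: JIND learns its cell-type-specific thresholds, whereas here they are fixed hypothetical values, and the function name `annotate`, the label `Unassigned` and the probabilities are assumptions.

```python
def annotate(probabilities, thresholds, unassigned="Unassigned"):
    """Assign each cell its most probable type, or reject it.

    `probabilities` is a list of dicts mapping cell type to predicted
    probability. A prediction is kept only if its probability reaches
    the threshold of the predicted cell type.
    """
    labels = []
    for probs in probabilities:
        best = max(probs, key=probs.get)
        labels.append(best if probs[best] >= thresholds[best] else unassigned)
    return labels


# Hypothetical classifier outputs for three cells.
cells = [
    {"T cell": 0.90, "B cell": 0.10},
    {"T cell": 0.55, "B cell": 0.45},
    {"T cell": 0.30, "B cell": 0.70},
]

# Hypothetical per-type thresholds (JIND would learn these from data).
type_thresholds = {"T cell": 0.80, "B cell": 0.60}

labels = annotate(cells, type_thresholds)
```

Using a separate threshold per cell type lets a well-separated type accept lower-confidence calls than a type that is easily confused with its neighbours, which is one way to reduce the fraction of rejected cells.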
32

Agapito, Giuseppe, Chiara Pastrello, and Igor Jurisica. "Comprehensive pathway enrichment analysis workflows: COVID-19 case study." Briefings in Bioinformatics, December 29, 2020. http://dx.doi.org/10.1093/bib/bbaa377.

Abstract:
Abstract The coronavirus disease 2019 (COVID-19) outbreak due to the novel coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been classified as a pandemic disease by the World Health Organization on the 12th March 2020. This world-wide crisis created an urgent need to identify effective countermeasures against SARS-CoV-2. In silico methods, artificial intelligence and bioinformatics analysis pipelines provide effective and useful infrastructure for comprehensive interrogation and interpretation of available data, helping to find biomarkers, explainable models and eventually cures. One class of such tools, pathway enrichment analysis (PEA) methods, helps researchers to find possible key targets present in biological pathways of host cells that are targeted by SARS-CoV-2. Since many software tools are available, it is not easy for non-computational users to choose the best one for their needs. In this paper, we highlight how to choose the most suitable PEA method based on the type of COVID-19 data to analyze. We aim to provide a comprehensive overview of PEA techniques and the tools that implement them.
33

Wang, Zitao, Anyu Bao, Shiyi Liu, Fangfang Dai, Yiping Gong, and Yanxiang Cheng. "A Pyroptosis-Related Gene Signature Predicts Prognosis and Immune Microenvironment for Breast Cancer Based on Computational Biology Techniques." Frontiers in Genetics 13 (April 7, 2022). http://dx.doi.org/10.3389/fgene.2022.801056.

Abstract:
Breast cancer (BC) is a malignant tumor with high morbidity and mortality, which seriously threatens women’s health worldwide. Pyroptosis is closely correlated with the immune landscape and with the tumorigenesis and development of various cancers. However, studies of pyroptosis and the immune microenvironment in BC are limited. Therefore, our study aimed to investigate the potential prognostic value of pyroptosis-related genes (PRGs) and their relationship to the immune microenvironment in BC. First, we identified 38 differentially expressed PRGs between BC and normal tissues. Next, the least absolute shrinkage and selection operator (LASSO) Cox regression and computational biology techniques were applied to construct a four-gene signature based on PRGs, and patients in The Cancer Genome Atlas (TCGA) cohort were classified into high- and low-risk groups. Patients in the high-risk group showed significantly lower survival probabilities than those in the low-risk group, which was also verified in an external cohort. Furthermore, the risk model was characterized as an independent factor for predicting the overall survival (OS) of BC patients. More importantly, functional enrichment analyses demonstrated a robust correlation between risk score and immune infiltration. We therefore summarized the genetic mutation variation of PRGs and evaluated the relationship between PRGs, the different risk groups and immune infiltration, tumor mutation burden (TMB), microsatellite instability (MSI), and immune checkpoint blockers (ICB), which indicated that the low-risk group was enriched in higher TMB and more abundant immune cells, and subsequently had a better prognosis. In addition, the lower expression of PRGs such as GZMB, IL18, IRF1, and GZMA represented better survival, which verified the association between pyroptosis and immune landscape.
In conclusion, we performed a comprehensive bioinformatics analysis and established a four-PRG signature consisting of GZMB, IL18, IRF1, and GZMA, which could robustly predict the prognosis of BC patients.
34

Ahmed, Zeeshan, Eduard Gibert Renart, Saman Zeeshan, and XinQi Dong. "Advancing clinical genomics and precision medicine with GVViZ: FAIR bioinformatics platform for variable gene-disease annotation, visualization, and expression analysis." Human Genomics 15, no. 1 (June 26, 2021). http://dx.doi.org/10.1186/s40246-021-00336-1.

Abstract:
Abstract Background Genetic disposition is considered critical for identifying subjects at high risk for disease development. Investigating disease-causing and high and low expressed genes can support finding the root causes of uncertainties in patient care. However, independent and timely high-throughput next-generation sequencing data analysis is still a challenge for non-computational biologists and geneticists. Results In this manuscript, we present a findable, accessible, interactive, and reusable (FAIR) bioinformatics platform, i.e., GVViZ (visualizing genes with disease-causing variants). GVViZ is a user-friendly, cross-platform, and database application for RNA-seq-driven variable and complex gene-disease data annotation and expression analysis with a dynamic heat map visualization. GVViZ has the potential to find patterns across millions of features and extract actionable information, which can support the early detection of complex disorders and the development of new therapies for personalized patient care. The execution of GVViZ is based on a set of simple instructions that users without a computational background can follow to design and perform customized data analysis. It can assimilate patients’ transcriptomics data with the public, proprietary, and our in-house developed gene-disease databases to query, easily explore, and access information on gene annotation and classified disease phenotypes with greater visibility and customization. To test its performance and understand the clinical and scientific impact of GVViZ, we present GVViZ analysis for different chronic diseases and conditions, including Alzheimer’s disease, arthritis, asthma, diabetes mellitus, heart failure, hypertension, obesity, osteoporosis, and multiple cancer disorders. 
The results are visualized using GVViZ and can be exported as image (PNF/TIFF) and text (CSV) files that include gene names, Ensembl (ENSG) IDs, quantified abundances, expressed transcript lengths, and annotated oncology and non-oncology diseases. Conclusions We emphasize that automated and interactive visualization should be an indispensable component of modern RNA-seq analysis, which is currently not the case. However, experts in clinics and researchers in life sciences can use GVViZ to visualize and interpret the transcriptomics data, making it a powerful tool to study the dynamics of gene expression and regulation. Furthermore, with successful deployment in clinical settings, GVViZ has the potential to enable high-throughput correlations between patient diagnoses based on clinical and transcriptomics data.
35

Liang, Pengfei, Lei Zheng, Chunshen Long, Wuritu Yang, Lei Yang, and Yongchun Zuo. "HelPredictor models single-cell transcriptome to predict human embryo lineage allocation." Briefings in Bioinformatics, May 26, 2021. http://dx.doi.org/10.1093/bib/bbab196.

Abstract:
Abstract The in-depth study of cell fate decisions in human preimplantation embryos has prompted investigations into changes in lineage allocation, which are far from trivial and remain time-consuming to characterize by experimental methods. It is desirable to develop a novel, effective bioinformatics strategy that considers transitions of coordinated embryo lineage allocation and stage-specific patterns. Machine learning models are increasingly applied to interpret complex datasets and to identify candidate development-related factors and lineage-determining molecular events. Here we developed the first machine learning platform, HelPredictor, that integrates three feature selection methods (principal components analysis, the F-score algorithm and the squared coefficient of variation) with four classical machine learning classifiers; each combination of method and classifier produces an independent output through the incremental feature selection method. Applied to single-cell sequencing data of human embryos, HelPredictor not only achieved 94.9% and 90.9% with cross-validation and an independent test, respectively, but also rapidly classified different embryonic lineages and their development trajectories using fewer HelPredictor-predicted factors. The resulting candidate lineage-specific genes were discussed in detail and were clustered to explore transitions of embryonic heterogeneity. Our tool can quickly and efficiently reveal potential lineage-specific and stage-specific biomarkers and provide insights into how advanced computational tools contribute to development research. The source code is available at https://github.com/liameihao/HelPredictor.
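As a rough illustration of two ideas named in this abstract, the sketch below ranks features by the squared coefficient of variation (variance divided by the squared mean) and then applies an incremental feature selection loop that keeps the best-scoring prefix of the ranking. It is not HelPredictor's code: the function names, the toy expression matrix, the gene names and the `toy_evaluate` scores are all hypothetical, and a real run would replace `toy_evaluate` with a cross-validated classifier.

```python
from statistics import mean, pvariance


def squared_cv(values):
    """Squared coefficient of variation: population variance / mean**2."""
    m = mean(values)
    if m == 0:
        return 0.0  # avoid division by zero for all-zero features
    return pvariance(values) / (m * m)


def rank_features(matrix, names):
    """Rank features (columns of `matrix`) by squared CV, highest first."""
    columns = list(zip(*matrix))
    scores = {name: squared_cv(col) for name, col in zip(names, columns)}
    return sorted(names, key=lambda n: scores[n], reverse=True)


def incremental_feature_selection(ranked, evaluate):
    """Add ranked features one at a time and keep the best-scoring prefix."""
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical toy data: rows are cells, columns are genes.
matrix = [
    [1.0, 5.0, 0.0],
    [1.0, 5.0, 10.0],
    [1.0, 6.0, 0.0],
    [1.0, 4.0, 10.0],
]
names = ["geneA", "geneB", "geneC"]
ranked = rank_features(matrix, names)


def toy_evaluate(subset):
    """Stand-in for a classifier's cross-validated score (made-up values)."""
    return {1: 0.80, 2: 0.90, 3: 0.85}[len(subset)]


best_subset, best_score = incremental_feature_selection(ranked, toy_evaluate)
print(ranked, best_subset, best_score)
```

Passing the scoring function in as `evaluate` keeps the selection loop independent of any particular classifier, which mirrors the abstract's description of pairing each feature selection method with each classifier.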
36

Wenzel, Marius A., Berndt Müller, and Jonathan Pettitt. "SLIDR and SLOPPR: flexible identification of spliced leader trans-splicing and prediction of eukaryotic operons from RNA-Seq data." BMC Bioinformatics 22, no. 1 (March 22, 2021). http://dx.doi.org/10.1186/s12859-021-04009-7.

Abstract:
Abstract Background Spliced leader (SL) trans-splicing replaces the 5′ end of pre-mRNAs with the spliced leader, an exon derived from a specialised non-coding RNA originating from elsewhere in the genome. This process is essential for resolving polycistronic pre-mRNAs produced by eukaryotic operons into monocistronic transcripts. SL trans-splicing and operons may have independently evolved multiple times throughout Eukarya, yet our understanding of these phenomena is limited to only a few well-characterised organisms, most notably C. elegans and trypanosomes. The primary barrier to systematic discovery and characterisation of SL trans-splicing and operons is the lack of computational tools for exploiting the surge of transcriptomic and genomic resources for a wide range of eukaryotes. Results Here we present two novel pipelines that automate the discovery of SLs and the prediction of operons in eukaryotic genomes from RNA-Seq data. SLIDR assembles putative SLs from 5′ read tails present after read alignment to a reference genome or transcriptome, which are then verified by interrogating corresponding SL RNA genes for sequence motifs expected in bona fide SL RNA molecules. SLOPPR identifies RNA-Seq reads that contain a given 5′ SL sequence, quantifies genome-wide SL trans-splicing events and predicts operons via distinct patterns of SL trans-splicing events across adjacent genes. We tested both pipelines with organisms known to carry out SL trans-splicing and organise their genes into operons, and demonstrate that (1) SLIDR correctly detects expected SLs and often discovers novel SL variants; (2) SLOPPR correctly identifies functionally specialised SLs, correctly predicts known operons and detects plausible novel operons. Conclusions SLIDR and SLOPPR are flexible tools that will accelerate research into the evolutionary dynamics of SL trans-splicing and operons throughout Eukarya and improve gene discovery and annotation for a wide range of eukaryotic genomes. 
Both pipelines are implemented in Bash and R and are built upon readily available software commonly installed on most bioinformatics servers. Biological insight can be gleaned even from sparse, low-coverage datasets, implying that an untapped wealth of information can be retrieved from existing RNA-Seq datasets as well as from novel full-isoform sequencing protocols as they become more widely available.
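The idea of identifying reads that carry a given 5′ SL sequence can be sketched as follows. This is only an illustration under stated assumptions, not the SLOPPR implementation (which the abstract says is written in Bash and R): the function `sl_tail_length`, the `min_len` threshold and the toy sequences are hypothetical. The sketch assumes that a trans-spliced read begins with some 3′ portion of the SL, so it looks for the longest suffix of the SL that is a prefix of the read.

```python
def sl_tail_length(read, sl, min_len=8):
    """Return the length of the longest SL suffix that prefixes the read.

    Returns 0 when no suffix of at least `min_len` bases matches.
    """
    min_len = max(min_len, 1)  # a zero-length tail is never a match
    for k in range(min(len(read), len(sl)), min_len - 1, -1):
        if read.startswith(sl[-k:]):
            return k
    return 0


# Hypothetical toy sequences, not real spliced leaders or reads.
toy_sl = "ACGTTGCAAGTTTGAG"
reads = {
    "read1": "AAGTTTGAGATGGCTCCA",  # starts with the last 9 bases of toy_sl
    "read2": "ATGGCTCCAGGT",        # carries no SL tail
}

tails = {name: sl_tail_length(seq, toy_sl) for name, seq in reads.items()}
sl_reads = [name for name, k in tails.items() if k > 0]
print(tails, sl_reads)
```

Exact prefix matching is the simplest possible choice here; a practical tool would also need to tolerate sequencing errors and would work from aligned reads rather than raw strings.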