Journal articles on the topic 'Genomics bioinformatics variant discovery sequence analysis'

1

Ahmed, Zeeshan, Eduard Gibert Renart, and Saman Zeeshan. "Genomics pipelines to investigate susceptibility in whole genome and exome sequenced data for variant discovery, annotation, prediction and genotyping." PeerJ 9 (July 26, 2021): e11724. http://dx.doi.org/10.7717/peerj.11724.

Abstract:
Over the last few decades, genomics has been leading toward an audacious future and has been changing our views about conducting biomedical research, studying diseases, and understanding diversity in our society across the human species. Whole genome and exome sequencing (WGS/WES) are two of the most popular next-generation sequencing (NGS) methodologies currently used to detect genetic variations of clinical significance. Investigating WGS/WES data for variant discovery and genotyping is based on the nexus of different data analytic applications. Although several bioinformatics applications have been developed, and many of those are freely available and published, timely finding and interpreting genetic variants are still challenging tasks among diagnostic laboratories and clinicians. In this study, we are interested in understanding, evaluating, and reporting the current state of solutions available to process NGS data of variable lengths and types for the identification of variants, alleles, and haplotypes. Within this scope, we consulted high-quality peer-reviewed literature published in the last 10 years. We focused on the standalone and networked bioinformatics applications proposed to efficiently process WGS and WES data and to support downstream analysis for gene-variant discovery, annotation, prediction, and interpretation. We discuss our findings in this manuscript, which include but are not limited to the set of operations, workflow, data handling, involved tools, technologies and algorithms, and limitations of the assessed applications.
2

Wiggans, G. R., D. J. Null, J. B. Cole, and H. D. Norman. "256 Genomic Evaluation of Fertility Traits and Discovery of Haplotypes That Affect Fertility of US Dairy Cattle." Reproduction, Fertility and Development 28, no. 2 (2016): 260. http://dx.doi.org/10.1071/rdv28n2ab256.

Abstract:
Genomic evaluations of dairy cattle became official in the United States in January 2009 for Holsteins and Jerseys, and later for Brown Swiss, Ayrshires, and Guernseys. Up to 33 yield, fitness, calving, and conformation traits are evaluated, and the fertility traits included daughter pregnancy rate and heifer and cow conception rates. Additional fertility traits, such as age at first calving and days from calving to first insemination, also are being studied. Male fertility (sire conception rate) is evaluated phenotypically rather than through genomics. Over 1 million animals have genotypes in the national database, which reflects collaboration with Canada and Europe. Most of the genotypes are from females and are from genotyping chips with <30 000 single nucleotide polymorphisms (SNP). To combine data across chips, genotypes are imputed to a set of >77 000 SNP. The imputation process involves dividing the chromosome into segments of approximately equal length and determining the paternal or maternal origin of the alleles. Because some segments were never homozygous, they were assumed to contain an abnormality that resulted in early embryonic death. If a decrease in sire conception rate could be associated with a bull that was a carrier of such a chromosomal segment, the haplotype was designated as affecting fertility. Once the region was identified, bioinformatic analysis was used to discover the causative variant for many of those haplotypes. Accuracy of genomic evaluations is determined by size of the reference population and heritability of the trait. The reference population for Holsteins includes >180 000 bulls and cows. Because fertility traits have low heritabilities, genomic information is particularly useful in improving evaluation accuracy. Accuracy of fertility evaluations is expected to increase further by discovering causative variants for various aspects of conception and gestation through investigation of sequence data.
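The reasoning behind flagging a chromosomal segment that is never homozygous can be illustrated with a short calculation. The sketch below is a deliberate simplification: it assumes random mating and a single population-wide haplotype frequency, which is not necessarily the test used in the evaluations the abstract describes. The function names and the example numbers are hypothetical.

```python
def expected_homozygotes(haplotype_freq: float, n_animals: int) -> float:
    """Expected number of homozygous animals under random mating (assumption)."""
    return n_animals * haplotype_freq ** 2


def prob_no_homozygotes(haplotype_freq: float, n_animals: int) -> float:
    """Probability of seeing zero homozygotes if the haplotype were harmless.

    Each animal is treated as an independent draw with probability
    haplotype_freq**2 of being homozygous (a simplifying assumption).
    """
    return (1.0 - haplotype_freq ** 2) ** n_animals
```

With a hypothetical haplotype frequency of 2% among 100,000 genotyped animals, about 40 homozygotes would be expected, so observing none would be very unlikely by chance alone under these assumptions.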
3

Smith, Frances, David Brawand, Laura Steedman, Matthew Oakley, Christopher Wall, Peter Rushton, Margaret Allchurch, et al. "A Comprehensive Next Generation Sequencing Gene Panel Focused on Unexplained Anemia." Blood 126, no. 23 (December 3, 2015): 946. http://dx.doi.org/10.1182/blood.v126.23.946.946.

Abstract:
Congenital anemia is difficult to diagnose once common causes have been excluded; for example, 80% of cases of congenital non-spherocytic hemolytic anemia are undiagnosed once pyruvate kinase and G6PD deficiencies have been excluded using phenotypic analysis. We describe a next generation sequencing strategy, targeting 147 genes, to facilitate the diagnosis of these conditions. The coding regions, splice sites and 200 bp into the untranslated regions were examined in each gene. All clinically significant variants were confirmed by Sanger sequencing, including confirmation in any appropriate family members. Illumina MiSeq data were analysed using a bespoke bioinformatics pipeline, which has been validated to a UK certified standard. The pipeline implements detection of genetic variants using multiple base callers and discovery of copy number variants based on sequencing depth. Variants are annotated with information from ClinVar, and population frequency data from ExAC and the 1000 Genomes Project. All genes are sequenced in every individual but data analysis can easily be restricted to virtual subpanels, excluding analysis of genes not requested. Here we present three cases, highlighting the diagnostic utility of the panel as well as the underlying bioinformatics analysis.
Case 1. A male Caucasian child of <1 year presented with haemolysis (LDH 539 IU/L, total bilirubin 39 µmol/L), haematology (Hb 92g/L, MCV 84.4, MCH 28.9, absolute retic count 313.8 × 10^9/L); his film showed marked anisopoikilocytes, microspherocytes and polychromasia. He had frontal bossing and a palpable spleen and had suffered several infections; the child was transfused once. His father's film showed elliptocytes, FBC (Hb 127g/L, MCV 89.6, MCH 30.6, absolute retic count 230.5 × 10^9/L) but he had never been transfused. The mother's FBC was normal (Hb 113g/L, MCV 87.0, MCH 29.2, absolute retic count 48.4 × 10^9/L) but her film also showed elliptocytes. Analysis using the red cell panel found the child to be compound heterozygous for c.83G>A; p.Arg28His and c.[5572C>G; 6531-12C>T]; p.[Leu1858Val;?] in the SPTA1 gene, suggesting the diagnosis of hereditary pyropoikilocytosis. The c.83G>A; p.Arg28His mutation was inherited from the father and the c.[5572C>G; 6531-12C>T]; p.[Leu1858Val;?] low expression allele was inherited from the mother, who was homozygous.
Case 2. The post mortem report from a hydropic stillbirth (36/40) showed extensive extramedullary hematopoiesis and severe anemia. A DNA sample was sent to the laboratory accompanied by blood samples from both parents, whose hematology was normal. The DNA sample from the proband was relatively small, so only the parental samples were analyzed using the red cell panel. Sequence analysis showed that the mother carried the c.3173dupG; p.Gln1659fs pathogenic variant and the father carried the c.2867_2868+1dupCCG pathogenic variant in the CDAN1 gene. Sanger sequencing showed that the child had inherited both mutations from the parents. Variants in CDAN1 are associated with CDA type 1, which is documented to be a rare form of anemia that can be lethal.
Case 3. An Italian girl carrying a paternally inherited c.118C>T β0 thalassemia variant presented with a severe form of microcytic anemia (FBC, Hb 86g/L, RBC 4.87 × 10^12/L, MCV 55.2, MCH 17.7 and HbA2=5%). The severity of her anemia (not transfused) and palpable spleen suggested she had an additional pathogenic variant that had not been identified. Her mother had normal hematology (FBC, Hb 133g/L, RBC 4.82 × 10^12/L, MCV 79.9, MCH 27.6). After sequencing, Exome Depth analysis of the proband's LCR identified a novel deletion which removed the 5' HS1 and HS2 sites but left HS3-5 intact (confirmed by MLPA in the mother and proband). The combination of this mild downregulation of the beta globin locus with the c.118C>T β0 thalassemia variant caused her phenotype to be more severe than that of a beta thalassemia carrier alone.
Identifying pathogenic variants in these families is important as it facilitates prognosis and treatment, and allows prenatal diagnosis to be offered in future. To date the panel has assessed 10 cases of anemia with unknown cause and has made a definitive diagnosis in 8 (80%). Of the two undiagnosed, one was a child who died at 3 weeks, received multiple intrauterine and neonatal transfusions and had severe anemia, and the other was a suspected case of CDA with little associated phenotype.
Disclosures: No relevant conflicts of interest to declare.
4

Bao, Riyue, Lei Huang, Jorge Andrade, Wei Tan, Warren A. Kibbe, Hongmei Jiang, and Gang Feng. "Review of Current Methods, Applications, and Data Management for the Bioinformatics Analysis of Whole Exome Sequencing." Cancer Informatics 13s2 (January 2014): CIN.S13779. http://dx.doi.org/10.4137/cin.s13779.

Abstract:
The advent of next-generation sequencing technologies has greatly promoted advances in the study of human diseases at the genomic, transcriptomic, and epigenetic levels. Exome sequencing, where the coding region of the genome is captured and sequenced at a deep level, has proven to be a cost-effective method to detect disease-causing variants and discover gene targets. In this review, we outline the general framework of whole exome sequence data analysis. We focus on established bioinformatics tools and applications that support five analytical steps: raw data quality assessment, preprocessing, alignment, post-processing, and variant analysis (detection, annotation, and prioritization). We evaluate the performance of open-source alignment programs and variant calling tools using simulated and benchmark datasets, and highlight the challenges posed by the lack of concordance among variant detection tools. Based on these results, we recommend adopting multiple tools and resources to reduce false positives and increase the sensitivity of variant calling. In addition, we briefly discuss the current status and solutions for big data management, analysis, and summarization in the field of bioinformatics.
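The recommendation to combine several variant callers can be sketched as a simple support count. This is a minimal illustration and not the review's benchmarking procedure: it assumes each caller's output has already been reduced to a set of (chromosome, position, ref, alt) tuples, and the function name `consensus_calls` and the `min_support` parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(calls_by_tool: Dict[str, Set[Variant]],
                    min_support: int = 2) -> Set[Variant]:
    """Return the variants reported by at least `min_support` callers."""
    support = Counter(v for calls in calls_by_tool.values() for v in calls)
    return {variant for variant, n in support.items() if n >= min_support}
```

Setting `min_support` to the number of callers gives the strict intersection (fewer false positives), while `min_support=1` gives the union (higher sensitivity); intermediate values trade one against the other.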
5

Yang, Junmeng, Anna Liu, Isabella He, and Yongsheng Bai. "Bioinformatics Analysis Revealed Novel 3′UTR Variants Associated with Intellectual Disability." Genes 11, no. 9 (August 26, 2020): 998. http://dx.doi.org/10.3390/genes11090998.

Abstract:
MicroRNAs (or miRNAs) are short nucleotide sequences (~17–22 bp long) that play important roles in gene regulation through targeting genes in the 3′untranslated regions (UTRs). Variants located in genomic regions might have different biological consequences in changing gene expression. Exonic variants (e.g., coding variant and 3′UTR variant) are often causative of diseases due to their influence on gene product. Variants harbored in the 3′UTR region where miRNAs perform their targeting function could potentially alter the binding relationships for target pairs, which could relate to disease causation. We gathered miRNA–mRNA targeting pairs from published studies and then employed the database of microRNA Target Site single nucleotide variants (SNVs) (dbMTS) to discover novel SNVs within the selected pairs. We identified a total of 183 SNVs for the 114 pairs of accurate miRNA–mRNA targeting pairs selected. Detailed bioinformatics analysis of the three genes with identified variants that were exclusively located in the 3′UTR section indicated their association with intellectual disability (ID). Our result showed an exceptionally high expression of GPR88 in brain tissues based on GTEx gene expression data, while WNT7A expression data were relatively high in brain tissues when compared to other tissues. Motif analysis for the 3′UTR region of WNT7A showed that five identified variants were well-conserved across three species (human, mouse, and rat); the motif that contains the variant identified in GPR88 is significant at the level of the 3′UTR of the human genome. Studies of pathways, protein–protein interactions, and relations to diseases further suggest potential association with intellectual disability of our discovered SNVs. Our results demonstrated that 3′UTR variants could change target interactions of miRNA–mRNA pairs in the context of their association with ID. We plan to automate the methods through developing a bioinformatics pipeline for identifying novel 3′UTR SNVs harbored by miRNA-targeted genes in the future.
6

Tremblay, Olivier, Zachary Thow, and A. Rod Merrill. "Several New Putative Bacterial ADP-Ribosyltransferase Toxins Are Revealed from In Silico Data Mining, Including the Novel Toxin Vorin, Encoded by the Fire Blight Pathogen Erwinia amylovora." Toxins 12, no. 12 (December 11, 2020): 792. http://dx.doi.org/10.3390/toxins12120792.

Abstract:
Mono-ADP-ribosyltransferase (mART) toxins are secreted by several pathogenic bacteria that disrupt vital host cell processes in deadly diseases like cholera and whooping cough. In the last two decades, the discovery of mART toxins has helped uncover the mechanisms of disease employed by pathogens impacting agriculture, aquaculture, and human health. Due to the current abundance of mARTs in bacterial genomes, and an unprecedented availability of genomic sequence data, mART toxins are amenable to discovery using an in silico strategy involving a series of sequence pattern filters and structural predictions. In this work, a bioinformatics approach was used to discover six bacterial mART sequences, one of which was a functional mART toxin encoded by the plant pathogen, Erwinia amylovora, called Vorin. Using a yeast growth-deficiency assay, we show that wild-type Vorin inhibited yeast cell growth, while catalytic variants reversed the growth-defective phenotype. Quantitative mass spectrometry analysis revealed that Vorin may cause eukaryotic host cell death by suppressing the initiation of autophagic processes. The genomic neighbourhood of Vorin indicated that it is a Type-VI-secreted effector, and co-expression experiments showed that Vorin is neutralized by binding of a cognate immunity protein, VorinI. We demonstrate that Vorin may also act as an antibacterial effector, since bacterial expression of Vorin was not achieved in the absence of VorinI. Vorin is the newest member of the mART family; further characterization of the Vorin/VorinI complex may help refine inhibitor design for mART toxins from other deadly pathogens.
7

Alsamman, Alsamman M., Shafik D. Ibrahim, and Aladdin Hamwieh. "KASPspoon: an in vitro and in silico PCR analysis tool for high-throughput SNP genotyping." Bioinformatics 35, no. 17 (January 8, 2019): 3187–90. http://dx.doi.org/10.1093/bioinformatics/btz004.

Abstract:
Motivation: Fine mapping becomes a routine trial following quantitative trait loci (QTL) mapping studies to shrink the size of genomic segments underlying causal variants. The availability of whole genome sequences can facilitate the development of high marker density and predict gene content in genomic segments of interest. Correlations between genetic and physical positions of these loci require handling of different experimental genetic data types, and ultimately converting them into positioning markers using a routine and efficient tool.
Results: To convert classical QTL markers into KASP assay primers, KASPspoon simulates a PCR by running an approximate-match searching analysis on user-entered primer pairs against the provided sequences, and then comparing in vitro and in silico PCR results. KASPspoon reports amplimers close to or adjoining genes/SNPs/simple sequence repeats and those that are shared between in vitro and in silico PCR results to select the most appropriate amplimers for gene discovery. KASPspoon compares physical and genetic maps, and reports the primer set genome coverage for PCR-walking. KASPspoon could be used to design KASP assay primers to convert QTL acquired by classical molecular markers into high-throughput genotyping assays and to provide major SNP resource for the dissection of genotypic and phenotypic variation. In addition to human-readable output files, KASPspoon creates Circos configurations that illustrate different in silico and in vitro results.
Availability and implementation: Code available under GNU GPL at (http://www.ageri.sci.eg/index.php/facilities-services/ageri-softwares/kaspspoon).
Supplementary information: Supplementary data are available at Bioinformatics online.
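The idea of simulating a PCR by approximate-match searching can be sketched as follows. This is not KASPspoon's code or algorithm: it is a minimal illustration that assumes a Hamming-distance (substitution-only) match on one strand, and the function names and the `max_mismatches` and `max_product` parameters are hypothetical.

```python
from typing import List, Tuple

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def approx_find(template: str, primer: str, max_mismatches: int) -> List[int]:
    """Start positions where primer matches with at most max_mismatches substitutions."""
    hits = []
    for i in range(len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        mismatches = sum(1 for a, b in zip(window, primer) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def in_silico_pcr(template: str, fwd: str, rev: str,
                  max_mismatches: int = 1,
                  max_product: int = 2000) -> List[Tuple[int, int, str]]:
    """Return (start, end, amplicon) for each primer-pair hit on the given strand."""
    template = template.upper()
    fwd = fwd.upper()
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we search for its reverse complement.
    rev_site = reverse_complement(rev)
    products = []
    for start in approx_find(template, fwd, max_mismatches):
        for rev_start in approx_find(template, rev_site, max_mismatches):
            end = rev_start + len(rev_site)
            if rev_start >= start + len(fwd) and end - start <= max_product:
                products.append((start, end, template[start:end]))
    return products
```

The brute-force scan is quadratic in the worst case and is only meant to show the principle; a genome-scale tool would need an indexed search.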
8

Blaxter, M., M. Aslett, D. Guiliano, J. Daub, and The Filarial Genome Project. "Parasitic helminth genomics." Parasitology 118, no. 7 (October 1999): 39–51. http://dx.doi.org/10.1017/s0031182099004060.

Abstract:
The initiation of genome projects on helminths of medical importance promises to yield new drug targets and vaccine candidates in unprecedented numbers. In order to exploit this emerging data it is essential that the user community is aware of the scope and quality of data available, and that the genome projects provide analyses of the raw data to highlight potential genes of interest. Core bioinformatics support for the parasite genome projects has promoted these approaches. In the Brugia genome project, a combination of expressed sequence tag sequencing from multiple cDNA libraries representing the complete filarial nematode lifecycle, and comparative analysis of the sequence dataset, particularly using the complete genome sequence of the model nematode C. elegans, has proved very effective in gene discovery.
9

Karabayev, Daniyar, Askhat Molkenov, Kaiyrgali Yerulanuly, Ilyas Kabimoldayev, Asset Daniyarov, Aigul Sharip, Ainur Seisenova, Zhaxybay Zhumadilov, and Ulykbek Kairov. "re-Searcher: GUI-based bioinformatics tool for simplified genomics data mining of VCF files." PeerJ 9 (May 3, 2021): e11333. http://dx.doi.org/10.7717/peerj.11333.

Abstract:
Background: High-throughput sequencing platforms generate a massive amount of high-dimensional genomic datasets that are available for analysis. Modern and user-friendly bioinformatics tools for analysis and interpretation of genomics data become essential during the analysis of sequencing data. Different standard data types and file formats have been developed to store and analyze sequence and genomics data. Variant Call Format (VCF) is the most widespread genomics file type and standard format containing genomic information and variants of sequenced samples.
Results: Existing tools for processing VCF files usually do not have an intuitive graphical interface and offer only a command-line interface, which may be challenging to use for the broader biomedical community interested in genomics data analysis. re-Searcher addresses this problem and pre-processes VCF files in chunks to avoid overloading the computer's RAM. The tool can be used as a standalone, user-friendly, multiplatform GUI application as well as a web application (https://nla-lbsb.nu.edu.kz). The software, including source code, tested VCF files and additional information, is publicly available in the GitHub repository (https://github.com/LabBandSB/re-Searcher).
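Processing a VCF file in chunks, so that the whole file never has to sit in memory, can be sketched with the standard library alone. This is not re-Searcher's implementation: the function names, the `chunk_size` parameter and the tiny `EXAMPLE_VCF` string are hypothetical, and only the eight fixed VCF columns are parsed.

```python
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO

# The eight fixed, tab-separated columns of a VCF data line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_record(line: str) -> Dict[str, str]:
    """Split one VCF data line into its fixed columns (values kept as strings)."""
    fields = line.rstrip("\n").split("\t")
    return dict(zip(VCF_COLUMNS, fields[:8]))


def iter_vcf_chunks(handle: TextIO,
                    chunk_size: int = 1000) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_size records, skipping header lines."""
    records = (parse_record(line) for line in handle
               if line.strip() and not line.startswith("#"))
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


def count_by_chrom(handle: TextIO, chunk_size: int = 1000) -> Dict[str, int]:
    """Count records per chromosome while holding only one chunk at a time."""
    counts: Dict[str, int] = {}
    for chunk in iter_vcf_chunks(handle, chunk_size):
        for record in chunk:
            counts[record["CHROM"]] = counts.get(record["CHROM"], 0) + 1
    return counts


# Hypothetical three-record file used only for illustration.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t200\t.\tC\tT\t60\tPASS\t.\n"
    "chr2\t5\t.\tG\tA\t30\tPASS\t.\n"
)
```

For a real file, an open file handle would be passed in place of `io.StringIO(EXAMPLE_VCF)`; memory use is then bounded by `chunk_size` rather than by file size.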
10

Knight, Samantha JL, Ruth Clifford, Pauline Robbe, Sara DC Ramos, Adam Burns, Adele T. Timbs, Reem Alsolami, et al. "The Identification of Further Minimal Regions of Overlap in Chronic Lymphocytic Leukemia Using High-Resolution SNP Arrays." Blood 124, no. 21 (December 6, 2014): 3315. http://dx.doi.org/10.1182/blood.v124.21.3315.3315.

Abstract:
Background: Historically, the identification of minimal deleted regions (MDRs) has been a useful approach for pinpointing genes involved in the pathogenesis of human malignancies and constitutional disorders. Microarray technology has offered increased capability for newly identifying or refining existing MDRs and minimal overlapping regions (MORs) in cancer. Despite this, in chronic lymphocytic leukemia (CLL), published MORs that pinpoint only a few candidate genes have been limited and with the advent of NGS, the utility of high resolution array work as a discovery tool has become uncertain. Here, we show that profiling copy number abnormalities (CNAs) and cnLOH using arrays in a large patient series can still be a valuable approach for the identification of genes that are disrupted or mutated in CLL and have a role in CLL development and/or progression.
Methods: 250 CLL patient DNAs from individuals enrolled in two UK-based Phase II randomised controlled trials (AdMIRe and ARCTIC trials) were tested using Infinium HumanOmni2.5-8 v1.1 according to manufacturer’s guidelines (Illumina Inc, San Diego, CA). Data were processed using GenomeStudioV2009.2 (Illumina Inc.) and analysed using Nexus Discovery Edition v6.1 (BioDiscovery, Hawthorne, CA). All Nexus plots were inspected visually to verify calls made, identify uncalled events and exclude likely false positives. To exclude common germline CNVs, the Database of Genomic Variants (DGV), a comprehensive catalog of structural variation in control data, was used. Copy number (CN) changes that encompassed fully changes noted in the DGV were excluded from further analysis. Regions of copy neutral loss of heterozygosity (cnLOH) were recorded if >1Mb in size, but were not used to define or refine MORs. Data from 1275 age-appropriate control samples minimised the reporting of common cnLOH events. All genomic coordinates were noted with reference to the GRCh37, hg19 assembly. MORs were investigated using Microsoft Excel filtering functions. A subset of genes (n=91) selected from MORs mainly on the basis of event frequency and/or number of genes within the MOR and/or literature interest were taken forward for targeted sequencing (exons only) of appropriate samples with/without CN Losses or cnLOH (Set 1 n=124; Set 2 n=126). These were tested using custom designed TruSeq Custom Amplicon panels (Illumina Inc) and processed according to manufacturer’s instructions. SAMHD1 was excluded from these panels since it had been studied separately within our laboratory. The data were analysed using an in-house bioinformatics pipeline that uses the sequence aligners MSR and Stampy and the variant callers GATK and Platypus, followed by stringent filtering.
Results: Using our datasets we have identified >50 MORs previously unreported in the literature. Six of these showed copy number (CN) losses in >3% of patients studied. Furthermore, we have refined 14 MORs that overlapped with regions described previously and that also had a CN loss frequency of >3%. Thirteen MORs involved only a single reference gene, often a gene implicated previously in cancer (e.g. SAMHD1, MTSS1, DCC and RFC1). Of the 91 genes taken forward for targeted sequencing, stringent data filtering led to a subset of 19 genes of interest harbouring exonic mutations. Genes with mutations identified include DCC, BAP1 and FBXW7, also implicated previously in cancer.
Conclusion: We have generated high resolution CNA and cnLOH profiles for 250 first-line chemo-immunotherapy treated CLL patients and used this information to document newly identified MORs, to refine MORs reported previously and to identify mutation harbouring genes using targeted NGS. Functional knowledge supports our hypothesis that these genes may have a contributory role in CLL. For two genes, SAMHD1 and FBXW7, relevance in CLL has been established already. Taken together, our data validate the utility of high resolution array studies for the identification of candidate genes that may be involved in CLL development or progression when disrupted. Further studies are required to confirm a role for these genes in CLL and to elucidate the nature of the underlying biological mechanisms.
Disclosures: No relevant conflicts of interest to declare.
11

Sun, Yawei, Hongxing Ding, Feifan Zhao, Quanhui Yan, Yuwan Li, Xinni Niu, Weijun Zeng, et al. "Genomic Characteristics and E Protein Bioinformatics Analysis of JEV Isolates from South China from 2011 to 2018." Vaccines 10, no. 8 (August 12, 2022): 1303. http://dx.doi.org/10.3390/vaccines10081303.

Abstract:
Japanese encephalitis is a mosquito-borne zoonotic epidemic caused by the Japanese encephalitis virus (JEV). JEV is not only the leading cause of Asian viral encephalitis, but also one of the leading causes of viral encephalitis worldwide. To understand the genetic evolution and E protein characteristics of JEV, 263 suspected porcine JE samples collected from South China from 2011 to 2018 were inspected. It was found that 78 aborted porcine fetuses were JEV-nucleic-acid-positive, with a positive rate of 29.7%. Furthermore, four JEV variants were isolated from JEV-nucleic-acid-positive materials, namely, CH/GD2011/2011, CH/GD2014/2014, CH/GD2015/2015, and CH/GD2018/2018. The cell culture and virus titer determination of four JEV isolates showed that four JEV isolates could proliferate stably in Vero cells, and the virus titer was as high as 10^8.5 TCID50/mL. The whole-genome sequences of four JEV isolates were sequenced. Based on the phylogenetic analysis of the JEV E gene and whole genome, it was found that CH/GD2011/2011 and CH/GD2015/2015 belonged to the GIII type, while CH/GD2014/2014 and CH/GD2018/2018 belonged to the GI type, which was significantly different from that of the JEV classical strain CH/BJ-1/1995. Bioinformatics tools were used to analyze the E protein phosphorylation site, glycosylation site, B cell antigen epitope, and modeled 3D structures of E protein in four JEV isolates. The analysis of the prevalence of JEV and the biological function of E protein can provide a theoretical basis for the prevention and control of JEV and the design of antiviral drugs.
12

Gobalan K and Ahamed John. "Applications of Bioinformatics in Genomics and Proteomics." Journal of Advanced Applied Scientific Research 1, no. 3 (December 15, 2021): 29–42. http://dx.doi.org/10.46947/joaasr13201616.

Abstract:
Bioinformatics is the application of statistics and computer science to the field of molecular biology. The term bioinformatics was coined by Paulien Hogeweg in 1979 for the study of informatic processes in biotic systems. Its primary use since at least the late 1980s has been in genomics and proteomics, particularly in those areas of genomics involving large-scale DNA sequencing and those areas of proteomics involving protein structure prediction. Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques and theory to solve formal and practical problems arising from the management and analysis of biological data. Over the past few decades, rapid developments in genomic and proteomic research technologies and developments in information technologies have combined to produce a tremendous amount of information related to molecular biology. Bioinformatics is the name given to the mathematical and computing approaches used to gain a clearer understanding of biological processes. Common activities in bioinformatics include mapping and analyzing DNA and protein sequences, aligning different DNA and protein sequences to compare them, and creating and viewing 3-D models of protein structures. The primary goal of bioinformatics is to increase the understanding of biological processes. Bioinformatics focuses on developing and applying computationally intensive techniques (e.g., data mining, machine learning algorithms, and visualization) to achieve this goal. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, genome-wide association studies and the modeling of evolution.
13

Dourmishev, Lyubomir A., Assen L. Dourmishev, Diana Palmeri, Robert A. Schwartz, and David M. Lukac. "Molecular Genetics of Kaposi's Sarcoma-Associated Herpesvirus (Human Herpesvirus 8) Epidemiology and Pathogenesis." Microbiology and Molecular Biology Reviews 67, no. 2 (June 2003): 175–212. http://dx.doi.org/10.1128/mmbr.67.2.175-212.2003.

Abstract:
Summary: Kaposi's sarcoma had been recognized as a unique human cancer for a century before it manifested as an AIDS-defining illness with a suspected infectious etiology. The discovery of Kaposi's sarcoma-associated herpesvirus (KSHV), also known as human herpesvirus-8, in 1994 by using representational difference analysis, a subtractive method previously employed for cloning differences in human genomic DNA, was a fitting harbinger for the powerful bioinformatic approaches since employed to understand its pathogenesis in KS. Indeed, the discovery of KSHV was rapidly followed by publication of its complete sequence, which revealed that the virus had coopted a wide armamentarium of human genes; in the short time since then, the functions of many of these viral gene variants in cell growth control, signaling apoptosis, angiogenesis, and immunomodulation have been characterized. This critical literature review explores the pathogenic potential of these genes within the framework of current knowledge of the basic herpesvirology of KSHV, including the relationships between viral genotypic variation and the four clinicoepidemiologic forms of Kaposi's sarcoma, current viral detection methods and their utility, primary infection by KSHV, tissue culture and animal models of latent- and lytic-cycle gene expression and pathogenesis, and viral reactivation from latency. Recent advances in models of de novo endothelial infection, microarray analyses of the host response to infection, receptor identification, and cloning of full-length, infectious KSHV genomic DNA promise to reveal key molecular mechanisms of the candidate pathogenic genes when expressed in the context of viral infection.
14

Bug, Dmitrii S., Ildar M. Barkhatov, Yana V. Gudozhnikova, Artem V. Tishkov, Igor B. Zhulin, and Natalia V. Petukhova. "Identification and Characterization of a Novel CLCN7 Variant Associated with Osteopetrosis." Genes 11, no. 11 (October 22, 2020): 1242. http://dx.doi.org/10.3390/genes11111242.

Abstract:
Osteopetrosis is a group of rare inheritable disorders of the skeleton characterized by increased bone density. The disease is remarkably heterogeneous in clinical presentation and often misdiagnosed. Therefore, genetic testing and molecular pathogenicity analysis are essential for precise diagnosis and for identifying new targets for preventive pharmacotherapy. Mutations in the CLCN7 gene give rise to the complete spectrum of osteopetrosis phenotypes and are responsible for about 75% of cases of autosomal dominant osteopetrosis. In this study, we report the identification of a novel variant in the CLCN7 gene in a patient diagnosed with osteopetrosis and provide evidence for its significance (likely deleterious) based on extensive comparative genomics, protein sequence and structure analysis. A set of automated bioinformatics tools used to predict consequences of this variant identified it as deleterious or pathogenic. Structure analysis revealed that the variant is located at the same “hot spot” as the most common CLCN7 mutations causing osteopetrosis. Deep phylogenetic reconstruction showed that not only Leu614Arg, but any non-aliphatic substitution at this position is not tolerated evolutionarily, further supporting the deleterious nature of the variant. The present study provides further evidence that reconstructing a precise evolutionary history of a gene helps in predicting phenotypical consequences of variants of uncertain significance.
15

Bortoluzzi, Stefania, Andrea Bisognin, Marta Biasiolo, Paola Guglielmelli, Flavia Biamonte, Ruggiero Norfo, Rossella Manfredini, and Alessandro M. Vannucchi. "Characterization and discovery of novel miRNAs and moRNAs in JAK2V617F-mutated SET2 cells." Blood 119, no. 13 (March 29, 2012): e120-e130. http://dx.doi.org/10.1182/blood-2011-07-368001.

Abstract:
To gain insights into a possible role of microRNAs in myeloproliferative neoplasms, we performed short RNA massive sequencing and extensive bioinformatic analysis in the JAK2V617F-mutated SET2 cell line. Overall, 652 known mature miRNAs were detected, of which 21 were highly expressed and are thus responsible for most miRNA-mediated gene repression. Putative microRNA targets were enriched in specific signaling pathways, providing information about cell activities under massive posttranscriptional regulation. The majority of miRNAs were mixtures of sequence variants, called isomiRs, mainly because of alternative, noncanonical processing of hairpin precursors. We also identified 78 novel miRNAs (miRNA*) derived from known hairpin precursors. Both major and minor (*) forms of miRNAs were expressed concurrently from half of expressed hairpins, highlighting the relevance of miRNA* and the complexity of strand selection bias regulation. Finally, we discovered that SET2 cells express a number of miRNA-offset RNAs (moRNAs), short RNAs derived from genomic regions flanking mature miRNAs. We provide novel data about the possible origin of moRNAs, although their functional role remains to be elucidated. Overall, this study sheds light on the complexity of microRNA-mediated gene regulation in SET2 cells and represents the basis for future studies in JAK2V617F-mutated cellular models.
16

Lin, Bichen, Yang Liu, Lanxin Su, Hangbo Liu, Hailan Feng, Miao Yu, and Haochen Liu. "A Novel CDH1 Variant Identified in a Chinese Family with Blepharocheilodontic Syndrome." Diagnostics 12, no. 12 (November 24, 2022): 2936. http://dx.doi.org/10.3390/diagnostics12122936.

Abstract:
The goal of the current study was to identify the pathogenic gene variant in a Chinese family with Blepharocheilodontic (BCD) syndrome. Whole-exome sequencing (WES) and Sanger sequencing were used to identify the pathogenic gene variant. The pathogenicity of the variant was predicted using bioinformatics tools. We identified a novel heterozygous missense variant c.1198G>A (p.Asp400Asn) in the CDH1 gene in the proband and his mother with BCD syndrome. The sequencing results of three healthy individuals in this family were wild type. This result is consistent with familial co-segregation. According to ReVe, REVEL, CADD, gnomAD, dbSNP, and the classification of pathogenic variants with the standards of the 2015 American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG), c.1198G>A (p.Asp400Asn) is predicted to be likely pathogenic. We observed that variant c.1198G>A (p.Asp400Asn) was located in the extracellular cadherin-type repeats in CDH1. Amino acid sequence alignment of the CDH1 protein among multiple species showed that Asp400 was highly evolutionarily conserved. The conformational analysis showed that this variant might cause structural damage to the CDH1 protein. Phenotypic analysis revealed unique dental phenotypes in patients with BCD syndrome, such as oligodontia, conical-shaped teeth, and notching of the incisal edges. Our results broaden the variation spectrum of BCD syndrome and phenotype spectrum of CDH1, which can help with the clinical diagnosis, treatment, and genetic counseling in relation to BCD syndrome.
17

Yang, Andrian, Joshua Y. S. Tang, Michael Troup, and Joshua W. K. Ho. "Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads." F1000Research 8 (October 13, 2022): 1587. http://dx.doi.org/10.12688/f1000research.19426.2.

Abstract:
Read alignment is an important step in RNA-seq analysis as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads which should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome compared to previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is relatively small compared to the total number of reads, the addition of these recovered reads can impact downstream analyses, especially in terms of estimating the expression and differential expression of lowly expressed genes, such as pseudogenes.
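The idea of finding a putative location for an unaligned read from its similarity to already aligned reads can be illustrated with a toy sketch. This is not Scavenger's implementation: the shared k-mer voting scheme, the function names (`build_index`, `putative_location`), the parameters `k` and `min_shared`, and the example reads are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(aligned_reads, k):
    """Map each k-mer to the locations of the aligned reads that contain it.

    aligned_reads is a list of (sequence, chromosome, position) tuples.
    """
    index = defaultdict(set)
    for seq, chrom, pos in aligned_reads:
        for km in kmers(seq, k):
            index[km].add((chrom, pos))
    return index


def putative_location(read, index, k, min_shared=2):
    """Return the location of the aligned read sharing the most k-mers.

    Returns None when no aligned read shares at least min_shared k-mers
    (min_shared is an arbitrary, hypothetical threshold).
    """
    votes = Counter()
    for km in set(kmers(read, k)):
        for loc in index.get(km, ()):
            votes[loc] += 1
    if not votes:
        return None
    loc, shared = votes.most_common(1)[0]
    return loc if shared >= min_shared else None


# Made-up example: two aligned reads and one unaligned read that differs
# from the first aligned read by a single mismatch.
aligned = [
    ("ACGTACGTTAGC", "chr1", 100),
    ("TTGACCGATAGG", "chr2", 500),
]
index = build_index(aligned, k=5)
unaligned = "ACGTACCTTAGC"
result = putative_location(unaligned, index, k=5)
print(result)  # ('chr1', 100)
```

In this sketch the mismatched read still shares three 5-mers with the first aligned read, so it is assigned that read's location; a real pipeline would then attempt a proper re-alignment at the candidate site rather than trusting the vote.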
18

Yang, Andrian, Joshua Y. S. Tang, Michael Troup, and Joshua W. K. Ho. "Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads." F1000Research 8 (September 4, 2019): 1587. http://dx.doi.org/10.12688/f1000research.19426.1.

Abstract:
Read alignment is an important step in RNA-seq analysis as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads which should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome compared to previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is relatively small compared to the total number of reads, the addition of these recovered reads can impact downstream analyses, especially in terms of estimating the expression and differential expression of lowly expressed genes, such as pseudogenes.
19

Feau, Nicolas, David L. Joly, and Richard C. Hamelin. "Poplar leaf rusts: model pathogens for a model tree. This minireview is one of a selection of papers published in the Special Issue on Poplar Research in Canada." Canadian Journal of Botany 85, no. 12 (December 2007): 1127–35. http://dx.doi.org/10.1139/b07-102.

Abstract:
With the availability of the entire genome of the model tree Populus trichocarpa Torr. & A. Gray and the current genome sequencing project of its rust pathogen Melampsora larici-populina Kleb., rust–poplar interaction research has entered the genomic era. Recent genomics research on poplars has attempted to connect the genetic localizations of loci for qualitative and quantitative disease resistance with putative genes encoding resistance or signalling proteins. The interactions between these putative resistance genes and rust effectors remain unknown. Genomic resources developed for Melampsora spp. promise to contribute to our understanding of the molecular basis of pathogenicity by facilitating the isolation of pathogenicity genes. A multifaceted approach for the identification of such genes that relies largely on trimming and sequence data analysis has been developed. The strategy takes advantage of the resources available and combines EST libraries, bioinformatics data mining for extracellularly expressed secreted proteins, intra- and inter-specific comparative genomics, and testing for the presence of positive selection. It has resulted in the discovery of several putative candidate genes. In silico evidence for candidate genes will be further validated by robust experimental evidence through functional analyses.
20

Li, Juyi, Shan Sun, Xiufang Wang, Yarong Li, Hong Zhu, Hongmei Zhang, and Aiping Deng. "A Missense Mutation in IRS1 is Associated with the Development of Early-Onset Type 2 Diabetes." International Journal of Endocrinology 2020 (January 25, 2020): 1–8. http://dx.doi.org/10.1155/2020/9569126.

Abstract:
There could be an overlap of monogenic diabetes and early-onset type 2 diabetes mellitus. Precise diagnosis of early-onset diabetes has proven valuable for understanding the mechanism of diabetes and selecting optimal therapy. The majority of maturity onset diabetes of the young (MODY) pathogenic genes in China are still unknown. In this study, a family with suspected MODY was enrolled. Whole-exome sequencing (WES) was used to analyze the variants of the proband. Variants were filtered according to their frequency, location, functional consequences, and bioinformatics software predictions. Candidate pathogenic variants were validated by Sanger sequencing and tested for cosegregation in other members of the family and nonrelated healthy controls. KEGG (Kyoto Encyclopedia of Genes and Genomes) and PPI (protein-protein interaction) analyses were conducted using the DAVID (Database for Annotation, Visualization, and Integrated Discovery) and the STRING online analysis tools for the candidate pathogenic gene. A total of 123291 variants including 105344 SNPs and 17947 InDels were found in WES. A likely pathogenic rare missense heterozygous mutation in diabetes genes (c.2137C > T, p.His713Tyr in IRS1) was identified, which cosegregated in this family and was not found in nonrelated healthy controls. The position of the mutation in the amino acid sequence of the gene is highly conserved among the species. Two significantly enriched KEGG pathways were identified including bta04930, type II diabetes mellitus (GCK, INS, PDX1, ABCC8, and IRS1), and bta04910, insulin signaling pathway (GCK, INS, and IRS1). PPI analysis displayed that IRS1 interacts with 3 known pathogenic proteins including INS, KCNJ11, and GCK. We conclude that WES could be an initial option for genetic testing in patients with early-onset diabetes.
IRS1 p.His713Tyr is implicated as a possible pathogenic mutation in monogenic diabetes, which might require further validation, and the precise molecular mechanism underlying the influence of IRS1 p.His713Tyr on the development of diabetes remains to be determined in further prospective studies.
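The filtering step described in this abstract (by frequency, location, functional consequence, and software prediction) can be sketched as a simple predicate over annotated variants. This is only an illustration under stated assumptions: the `Variant` fields, the 1% allele-frequency cutoff, the accepted regions, and every record in the example list are hypothetical placeholders, not values or criteria taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                 # placeholder gene symbol
    hgvs: str                 # placeholder variant name
    population_af: float      # allele frequency in a reference population
    region: str               # e.g. "exonic", "splicing", "intronic"
    consequence: str          # e.g. "missense", "synonymous", "frameshift"
    predicted_damaging: bool  # consensus of in silico predictors


def passes_filters(v, max_af=0.01):
    """Keep rare, coding or splice-site, non-synonymous, predicted-damaging variants.

    The thresholds are assumptions for illustration only.
    """
    return (
        v.population_af <= max_af
        and v.region in {"exonic", "splicing"}
        and v.consequence != "synonymous"
        and v.predicted_damaging
    )


# Entirely made-up records used to exercise the filter.
variants = [
    Variant("GENE_A", "var1", 0.0002, "exonic", "missense", True),
    Variant("GENE_B", "var2", 0.2500, "exonic", "missense", True),     # too common
    Variant("GENE_C", "var3", 0.0001, "intronic", "intronic", False),  # non-coding
    Variant("GENE_D", "var4", 0.0005, "exonic", "synonymous", False),  # silent
]

candidates = [v.hgvs for v in variants if passes_filters(v)]
print(candidates)  # ['var1']
```

Candidates surviving such a filter would still need the orthogonal checks the abstract mentions, namely Sanger validation and cosegregation testing.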
21

Adawiah, Rabiatul, A. R. Shahril Firdaus, A. Norzihan, and A. B. Umi Kalsom. "Mining of single nucleotide polymorphism (SNP) and simple sequence repeats (SSRs) from EST tropical fruits." Asian Journal of Plant Biology 2, no. 2 (December 30, 2014): 48–52. http://dx.doi.org/10.54987/ajpb.v2i2.181.

Abstract:
The advancement in genomics technology has produced a vast amount of expressed sequence tag (EST) sequences from tropical fruits. These resources have increased the public availability of EST sequences year after year. Therefore, this effort permits mining of single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) from tropical fruit ESTs. SNPs and SSRs are types of molecular markers commonly used in modern genetic analysis for a wide range of applications such as diversity analysis, linkage analysis and association studies. In this study, a small-scale set of EST sequences from tropical fruits (pineapple, mango, coconut and banana) was retrieved from the dbEST database (www.ncbi.nlm.nih.gov/dbEST/) as of March 2013. Various bioinformatics tools were applied for rapid discovery of SNP and SSR markers from EST sequences. We analyzed 31,920 unigenes (contigs and singletons) representing a total of 77,418 ESTs from four tropical fruits for their potential use in developing SNP and SSR markers. A total of 13,709 EST-SNPs and 4853 EST-SSRs were discovered from these four tropical fruits. The most abundant EST-SSR repeat type was trinucleotide (15,957 repeats), followed by dinucleotide (13,797 repeats) and tetranucleotide (973 repeats). Here, 1738 primers from SNPs and 2033 primers from SSRs passed the selection criteria and were selected for validation using a genotyping platform. This study not only serves as a resource for marker development in tropical fruits but can also provide better insight into the selection of candidate genes of interest.
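SSR mining of the kind described above is often done by scanning sequences for tandemly repeated short motifs. The following is a minimal sketch of that general technique using regular expressions; it is not the tooling used in the study, and the minimum repeat counts in `MIN_REPEATS`, the function names, and the example sequence are assumptions chosen for illustration.

```python
import re

# Hypothetical minimum number of tandem copies per motif length
# (di-, tri-, and tetranucleotide); real tools make these configurable.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # Group 2 captures the motif; \2{n,} requires n further tandem copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1)
        )
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)


# Made-up sequence with an (AT)7 and a (CAG)5 repeat.
example = "GGC" + "AT" * 7 + "CCGT" + "CAG" * 5 + "TT"
print(find_ssrs(example))  # [(3, 'AT', 7), (21, 'CAG', 5)]
```

The primitive-motif check keeps a long dinucleotide run from also being reported as a tetranucleotide repeat, which would otherwise inflate the per-class counts.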
22

Maison, David P., Sean B. Cleveland, and Vivek R. Nerurkar. "Genomic analysis of SARS-CoV-2 variants of concern circulating in Hawai’i to facilitate public-health policies." PLOS ONE 17, no. 12 (December 1, 2022): e0278287. http://dx.doi.org/10.1371/journal.pone.0278287.

Abstract:
Using genomics, bioinformatics and statistics, herein we demonstrate the effect of statewide and nationwide quarantine on the introduction of SARS-CoV-2 variants of concern (VOC) in Hawai’i. To define the origins of introduced VOC, we analyzed 260 VOC sequences from Hawai’i, and 301,646 VOC sequences worldwide, deposited in GenBank and the Global Initiative on Sharing All Influenza Data (GISAID), and constructed phylogenetic trees. The trees define the most recent common ancestor as the origin. Further, the multiple sequence alignment used to generate the phylogenetic trees identified the consensus single nucleotide polymorphisms in the VOC genomes. These consensus sequences allow for VOC comparison and identification of mutations of interest in relation to viral immune evasion and host immune activation. Of note, the P71L substitution within the E protein, the protein sensed by TLR2 to produce cytokines, found in the B.1.351 VOC, may diminish the efficacy of some vaccines. Based on the phylogenetic trees, the B.1.1.7, B.1.351, B.1.427, and B.1.429 VOC have been introduced in Hawai’i multiple times since December 2020 from several definable geographic regions. The time from the first worldwide report of a VOC in GenBank and GISAID to the first arrival of that VOC in Hawai’i averaged 320 days with quarantine and 132 days without quarantine. As such, quarantine is shown to significantly affect the time to arrival of VOC in Hawai’i. Further, the collective 2020 quarantine of 43 states in the United States demonstrates a profound impact in delaying the arrival of VOC in states that did not practice quarantine, such as Utah. Our data demonstrate that at least 76% of all definable SARS-CoV-2 VOC have entered Hawai’i from California, with the B.1.351 variant in Hawai’i originating exclusively from the United Kingdom.
These data provide a foundation for policy-makers and public-health officials to apply precision public health genomics to real-world policies such as mandatory screening and quarantine.
23

Hasan, Imtiaj, Marco Gerdol, Yuki Fujii, and Yasuhiro Ozeki. "Functional Characterization of OXYL, A SghC1qDC LacNAc-specific Lectin from The Crinoid Feather Star Anneissia Japonica." Marine Drugs 17, no. 2 (February 25, 2019): 136. http://dx.doi.org/10.3390/md17020136.

Abstract:
We identified a lectin (carbohydrate-binding protein) belonging to the complement 1q (C1q) family in the feather star Anneissia japonica (a crinoid pertaining to the phylum Echinodermata). The combination of Edman degradation and bioinformatics sequence analysis characterized the primary structure of this novel lectin, named OXYL, as a secreted 158 amino acid-long globular head (sgh)C1q domain containing (C1qDC) protein. Comparative genomics analyses revealed that OXYL pertains to a family of intronless genes found with several paralogous copies in different crinoid species. Immunohistochemistry assays identified the tissues surrounding coelomic cavities and the arms as the main sites of production of OXYL. Glycan array confirmed that this lectin could quantitatively bind to type-2 N-acetyllactosamine (LacNAc: Galβ1-4GlcNAc), but not to type-1 LacNAc (Galβ1-3GlcNAc). Although OXYL displayed agglutinating activity towards Pseudomonas aeruginosa, it had no effect on bacterial growth. On the other hand, it showed a significant anti-biofilm activity. We provide evidence that OXYL can adhere to the surface of human cancer cell lines BT-474, MCF-7, and T47D, with no cytotoxic effect. In BT-474 cells, OXYL led to a moderate activation of the p38 kinase in the MAPK signaling pathway, without affecting the activity of caspase-3. Bacterial agglutination, anti-biofilm activity, cell adhesion, and p38 activation were all suppressed by co-presence of LacNAc. This is the first report on a type-2 LacNAc-specific lectin characterized by a C1q structural fold.
24

Tenedini, Elena, Isabella Bernardis, Valentina Artusi, Lucia Artuso, Enrica Roncaglia, Paola Guglielmelli, Lisa Pieri, et al. "Targeted Cancer Exome Sequencing Discovers Novel Recurrent Mutations In MPN." Blood 122, no. 21 (November 15, 2013): 4099. http://dx.doi.org/10.1182/blood.v122.21.4099.4099.

Abstract:
The discovery of the JAK2V617F mutation in 2005 [Kralovics R, N Engl J Med 2005] represented a major breakthrough in the understanding of the molecular pathogenesis of Philadelphia chromosome negative chronic myeloproliferative neoplasms (MPN). Nevertheless, several observations suggest that the JAK2V617F mutation may not be the disease-founding mutation, at least in most instances. Therefore, a great deal of effort is ongoing with the aim of identifying novel genetic lesions contributing to the disease pathogenesis. The two major theoretical and technical drawbacks to the identification of new somatic mutations are represented, respectively, by the huge number of genes potentially involved in tumorigenesis of MPN and by the availability of a “pure” germline control DNA. Buccal swabs and saliva have been generally considered as readily available sources of DNA of non-hematopoietic origin, but detection of the JAK2V617F mutation in at least some of these samples indeed suggested the presence of myeloid cell contamination [Levine RL, Cancer Cell 2005]. So, in order to discover novel mutations in MPN using upfront technologies based on next-generation sequencing (NGS) we designed a “cancer exome” capture panel of 2000 unique genes and microRNAs. This panel was used to capture libraries generated from genomic DNA extracted from granulocytes and in vitro expanded CD3+ T-lymphocytes as germline control, in a cohort of 20 MPN patients. These captured libraries were then massively sequenced using the Roche 454 FLX platform. DNA samples had been collected at the diagnosis of PV in 9 subjects and PMF in 6 subjects, while the remaining 5 DNA samples were from 5 of the 9 PV patients at the time they evolved to post-PV myelofibrosis.
After extensive bioinformatics analysis and multiple control adjustments, we finally produced a list of 171 novel “true” somatic mutations occurring in genes and microRNAs coding regions of those MPN samples; some of these mutations have been already described in MPN, whereas novel variants represent the vast majority. Although patients harbored different numbers of somatic mutations, ranging from four to twenty-one variants, only 22 genes appear recurrently mutated. It is worth noting the acquisition of additional mutations and/or the loss of some mutations at the time of disease evolution from PV to post-PV myelofibrosis in the five patients for whom samples were available at both disease phases. Some of them, either acquired (NTRK1, PRDM2, BRCA2 and BARD1) or lost (APC, CARS, MLL3 and FAT2), had also been found in another PV or PMF sample. To test the recurrence of these mutations, we screened a different cohort of 189 patients composed of PMF (91 samples), PV (50 patients) and post-PV myelofibrosis (48 samples) by Ion AmpliSeq technology on an Ion Torrent PGM platform. Deep amplicon sequencing of granulocyte DNA achieved a sample median of 1000-fold coverage. Excluding JAK2, MPL, IDH2, ASXL1 known variants, for 7 genes (SCRIB, MIR662, BARD1, TCF12, FAT4, DAP3, NRAS) we demonstrated in MPN a global mutation frequency greater than 3%. Whereas some new variants need functional validation to prove causal mechanisms, some other mutations have a well-known pathogenic role in solid cancers but here are described for the first time in MPN. Disclosures: No relevant conflicts of interest to declare.
25

Melidis, Damianos P., Christian Landgraf, Gunnar Schmidt, Anja Schöner-Heinisch, Sandra von Hardenberg, Anke Lesinski-Schiedat, Wolfgang Nejdl, and Bernd Auber. "GenOtoScope: Towards automating ACMG classification of variants associated with congenital hearing loss." PLOS Computational Biology 18, no. 9 (September 21, 2022): e1009785. http://dx.doi.org/10.1371/journal.pcbi.1009785.

Abstract:
Since next-generation sequencing (NGS) has become widely available, large gene panels containing up to several hundred genes can be sequenced cost-efficiently. However, the interpretation of the often large numbers of sequence variants detected when using NGS is laborious, prone to errors and is often difficult to compare across laboratories. To overcome this challenge, the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) have introduced standards and guidelines for the interpretation of sequencing variants. Additionally, disease-specific refinements have been developed that include accurate thresholds for many criteria, enabling highly automated processing. This is of particular interest for common but heterogeneous disorders such as hearing impairment. With more than 200 genes associated with hearing disorders, the manual inspection of possible causative variants is particularly difficult and time-consuming. To this end, we developed the open-source bioinformatics tool GenOtoScope, which automates the analysis of all ACMG/AMP criteria that can be assessed without further individual patient information or human curator investigation, including the refined loss of function criterion (“PVS1”). Two types of interfaces are provided: (i) a command line application to classify sequence variants in batches for a set of patients and (ii) a user-friendly website to classify single variants. We compared the performance of our tool with two other variant classification tools using two hearing loss data sets, which were manually annotated either by the ClinGen Hearing Loss Gene Curation Expert Panel or the diagnostics unit of our human genetics department. GenOtoScope achieved the best average accuracy and precision for both data sets. Compared to the second-best tool, GenOtoScope improved the accuracy metric by 25.75% and 4.57% and precision metric by 52.11% and 12.13% on the two data sets, respectively. 
The web interface is accessible via: http://genotoscope.mh-hannover.de:5000 and the command line interface via: https://github.com/damianosmel/GenOtoScope.
26

Tiong, Ing Soo, Clarissa Wilson, Satwica Yerneni, John Markham, Karen Dun, Ashish Bajel, Ella R. Thompson, David Alan Westerman, and Piers Blombery. "Mutational and Copy Number Profiling of Circulating Tumor DNA in Acute Myeloid Leukemia Using Targeted Next Generation Sequencing." Blood 136, Supplement 1 (November 5, 2020): 39–40. http://dx.doi.org/10.1182/blood-2020-138933.

Abstract:
The assessment of circulating tumor DNA (ctDNA), released by tumor cells undergoing apoptosis or necrosis, has established utility in solid tumors due to the advantage of a non-invasive "liquid biopsy" replacing multiple site-specific biopsies. However, its role in acute myeloid leukemia (AML) is uncertain, where a significant proportion of variants detected in the bone marrow (BM) may not be detected in ctDNA (Short, Blood Adv 2020). We have previously demonstrated the possibility of comprehensive genomic characterization of lymphoid malignancy from ctDNA using a single targeted next generation sequencing (NGS) hybridization-based panel (Blombery, BJH 2017). We aimed to assess the performance of this same genomic approach in ctDNA and to compare it against BM in AML. In addition, we aimed to assess the integration of a sensitive variant caller (Mutect2; Benjamin, bioRxiv 2019) to the bioinformatics suite in an attempt to improve low-level variant detection. Nineteen patients were identified from sequential patients with AML treated at our institutions where paired ctDNA and BM aspirate DNA were available. ctDNA was analyzed using a hybridization-based NGS panel targeting genes recurrently mutated in hematological malignancy followed by a suite of bioinformatics tools including HaplotypeCaller (GATK)/Mutect2 (GATK) for variant calling, CNSpector/CNSpectorX (Markham, Sci Reports 2019) for copy number variation (CNV) assessment and GRIDSS (Cameron, Genome Res 2017) for structural variant detection. The cohort clinical details are summarized in Table 1; none had documented extramedullary disease at time of collection. A total of 66 unique variants in 27 genes were detected, summarized in Figure 1. Median number of variants detected was 3 per patient sample, including NPM1 (n=6), IDH1/2 (n=4) and FLT3 point mutation (n=2). Three patient samples had FLT3-ITD detected by fragment length analysis; none were detected by the NGS panel in either ctDNA or BM. 
Variant allele frequency (VAF) from both compartments were highly correlated (R² = 0.87). Higher VAFs in ctDNA were more commonly observed for kinase activating mutations (12/17 variants) and TP53 (5/6). Using HaplotypeCaller alone, 58 and 61 variants were detected in the BM and ctDNA, respectively. Of the 2 variants "specific" to the BM, both (IDH1 and NRAS) were called by Mutect2 in ctDNA and confirmed by visual inspection of sequence read alignments. Of the 5 variants "specific" to the ctDNA, 4 were detected in the BM at low VAF: CBL (n=2), KRAS and TET2. One discrepant case was patient #15 with prior breast cancer: KRAS G13D 21% in ctDNA but absent in the BM. Analysis by Mutect2 additionally detected 3 variants not called by HaplotypeCaller: NRAS and KIT in both ctDNA and BM, and TP53 P278R (VAF 6%) specific to the ctDNA in patient #9 with normal karyotype AML without history of prior malignancy. Overall, 3/5 variants in ctDNA and 6/6 in BM with low VAF were kinase activating mutations. We then performed genome-wide alignment of off-target reads to generate a low-resolution digital karyotype and CNV from ctDNA (Figure 2) which was compared with CNV and conventional karyotyping from BM. CNVs were detected in ctDNA in 10/11 patients with abnormal karyotype (Figure 1). Of these, 6/7 with non-complex abnormal karyotypes had consistent ctDNA CNVs including (i) 3 patients (#1, #10 and #15) with either rearranged/amplified KMT2A by FISH were all found to have gains at 11q23.3-qter, and (ii) 1 patient (#4) had 2 marker chromosomes of unknown origin on karyotyping which were resolved as additional copies of 4p using ctDNA sequencing. Although no CNVs were detected in any of the 8 patients with normal karyotype, analysis of B-allele frequency from ctDNA revealed one patient (#5) with copy neutral loss of heterozygosity (CN-LOH) in 7q (with a concurrent EZH2 mutation >95% VAF).
In summary, we have demonstrated the ability to detect sequence variants, perform low-resolution digital karyotyping, CNV detection and CN-LOH detection from ctDNA in AML using a single hybridization based NGS assay. When using this approach, we show a high degree of concordance for both sequence variant and CNV detection, supporting the use of ctDNA as an alternative to BM (e.g. in cases of dry tap, hypocellular AML or failed karyotyping). Finally, using a sensitive variant caller, additional mutations were able to be detected in the ctDNA, the significance of which requires evaluation in future studies. Disclosures Tiong: Amgen: Consultancy, Honoraria; Pfizer: Consultancy; Servier: Consultancy. Wilson: Illumina: Other: Illumina Diagnostic Genomics Scholarship. Bajel: Novartis: Honoraria; Astellas: Honoraria; Abbvie: Honoraria; Amgen: Honoraria, Speakers Bureau; Pfizer: Honoraria. Blombery: Invivoscribe: Honoraria; Novartis: Consultancy; Janssen: Honoraria; Amgen: Consultancy.
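The concordance reported in this abstract rests on two simple quantities: the variant allele frequency in each compartment and the squared correlation between the paired values. A minimal sketch of both follows, assuming VAF is computed as variant-supporting reads over total reads at the site and that the reported statistic is the square of the Pearson correlation coefficient; the abstract does not state either definition, and the function names and numbers below are made up for illustration.

```python
def vaf(alt_reads, total_reads):
    """Variant allele frequency: variant-supporting reads / total reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Made-up paired VAFs for the same variants in two compartments.
bm_vafs = [0.10, 0.20, 0.30]
ctdna_vafs = [0.20, 0.40, 0.60]

print(vaf(21, 100))  # 0.21
print(round(r_squared(bm_vafs, ctdna_vafs), 6))  # 1.0
```

A high squared correlation only says the paired VAFs move together; it does not by itself show that the two compartments give equal VAFs, as the perfectly correlated but doubled values in the example make clear.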
27

Yang, Yunyun, Song Yang, Xiaolu Jiao, Juan Li, Miaomiao Zhu, Luya Wang, and Yanwen Qin. "ANGPTL3 Mutations in Unrelated Chinese Han Patients with Familial Hypercholesterolemia." Current Pharmaceutical Design 25, no. 2 (May 28, 2019): 190–200. http://dx.doi.org/10.2174/1381612825666190228000932.

Abstract:
Background and objective: Familial hypercholesterolemia (FH) is a severe genetic hyperlipidemia characterized by increased levels of low-density lipoprotein cholesterol (LDL-C), leading to premature atherosclerosis. Angiopoietin-like protein 3 (ANGPTL3) is a hepatocyte-specific protein that can be used to lower LDL in FH. However, it was unknown whether ANGPTL3 variants are present in FH patients. This study was performed to identify ANGPTL3 variants in unrelated Chinese Han patients with FH. Methods and Results: We screened 80 patients with FH (total cholesterol >7.8 mmol/L, LDL-cholesterol >4.9 mmol/L) and 77 controls using targeted next-generation sequencing (NGS) of six FH candidate genes (LDLR, ApoB100, PCSK9, ABCG5, ABCG8, and ANGPTL3). Candidate pathogenic variants identified by NGS were validated by Sanger sequencing. Mutant and wild-type plasmids containing the variant sequence were constructed and verified by Sanger sequencing. The gene expression profile was analyzed by an expression profile chip in transfected HepG2 cells using quantitative real-time (qRT)-PCR. We identified 41 variants in 28 FH patients, including two ANGPTL3 mutations: one exonic (c.A956G: p.K319R) and one in the untranslated region (c.*249G>A). Gene ontology analyses found that the cholesterol metabolic process and ANGPTL3 expression were significantly up-regulated in the ANGPTL3 K319R mutation group compared with the wild-type group. qRT-PCR findings were consistent with the expression profile analysis. Conclusion: Rare ANGPTL3 variants were identified in Chinese patients with FH, including ANGPTL3: p.(Lys319Arg) which affected the expression of ANGPTL3 and the cholesterol metabolic process as determined by bioinformatics analysis. Clinical Trial Registration: Chinese Clinical Trial Registration (ChiCTR-ROC-17011027) http://www.chictr.org.cn/listbycreater.aspx
28

König, Simone, Wolfgang M. J. Obermann, and Johannes A. Eble. "The Current State-of-the-Art Identification of Unknown Proteins Using Mass Spectrometry Exemplified on De Novo Sequencing of a Venom Protease from Bothrops moojeni." Molecules 27, no. 15 (August 5, 2022): 4976. http://dx.doi.org/10.3390/molecules27154976.

Abstract:
(1) Background: The amino acid sequence elucidation of peptides from the gas phase fragmentation mass spectra, de novo sequencing, is a valuable method for the identification of unknown proteins complementary to Edman sequencing. It is increasingly used in shot-gun mass spectrometry (MS)-based proteomics experiments. We review the current state-of-the-art and use the identification of an unknown snake venom protein targeting the human tissue factor (TF) as an example to describe the analysis process based on manual spectrum interrogation. (2) Methods: The immobilized TF was incubated with a crude B. moojeni venom solution. The potential binding partners were eluted and further purified by gel electrophoresis. Edman degradation was performed to elucidate the N-terminus of the 31 kDa protein of interest. High-resolution MS with collision-induced dissociation was employed to generate peptide fragmentation spectra. Sequence tags were deduced and used for searches in the NCBI and Uniprot databases. Protein matches from the snake species were further validated by target MS/MS. (3) Results: Sequence tag D [K/Q] D [I/L] VDD [K/Q] led to a snake venom serine protease (SVSP) from lancehead B. jararaca (P81824). With target MS/MS, 24% of the SVSP sequence were confirmed; an additional 41% were tentatively assigned by data-independent MS. Edman sequencing provided information for 10 N-terminal amino acid residues, also confirming the match to SVSP. (4) Conclusions: The identification of unknown proteins continues to be a challenge despite major advances in MS instrumentation and bioinformatic tools. The main requirement is the generation of meaningful, high-quality MS peptide fragmentation spectra. These are used to elucidate sufficiently long sequence tags, which can subsequently be submitted to searches in protein databases. This basic method does not require extensive bioinformatics because peptide MS/MS spectra, especially of doubly-charged ions, can be analysed manually. 
We demonstrated the procedure with the elucidation of SVSP. While de novo sequencing quickly indicates the correct protein group, validating the entire protein sequence amino acid by amino acid will take time. The reasons are the need to properly assign isobaric amino acid residues and modifications. With the ongoing efforts in genomics and transcriptomics and the availability of ever more data in public databases, the need for de novo MS sequencing will decrease. Still, not every animal and plant species will be sequenced, so the combination of MS and Edman sequencing will continue to be of importance for the identification of unknown proteins.
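The sequence-tag search described in this abstract can be illustrated with a minimal Python sketch. It assumes that the bracketed positions in the tag D [K/Q] D [I/L] VDD [K/Q] stand for alternative residues that the spectra could not tell apart. The helper names and the toy protein string are hypothetical; they are not taken from the paper or from any database entry.

```python
import re
from typing import List


def tag_to_regex(tag: str) -> "re.Pattern[str]":
    """Turn a tag such as 'D [K/Q] D [I/L] VDD [K/Q]' into a compiled regex."""
    compact = tag.replace(" ", "")
    # Each '[A/B]' group becomes the character class '[AB]'.
    pattern = re.sub(
        r"\[([A-Z](?:/[A-Z])*)\]",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        compact,
    )
    return re.compile(pattern)


def find_tag(tag: str, protein: str) -> List[int]:
    """Return 0-based start positions of non-overlapping tag matches."""
    return [m.start() for m in tag_to_regex(tag).finditer(protein)]


if __name__ == "__main__":
    toy_protein = "MKDQDLVDDKAA"  # invented sequence, for illustration only
    print(find_tag("D [K/Q] D [I/L] VDD [K/Q]", toy_protein))
```

A real search would run against NCBI or UniProt entries rather than a single string, and would also need to handle modifications, which this sketch ignores.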
29

Malek, Sami N., Denzil Bernard, Zhang Xiao Ying, Luke F. Peterson, Nisar A. Amin, Shaomeng Wang, Kamlai Saiya-Cork, Mark S. Kaminski, and Alfred Chang. "Analysis of 54 Follicular Lymphomas By Whole Exome Sequencing Identifies Multiple Novel Recurrently Mutated Pathways." Blood 126, no. 23 (December 3, 2015): 112. http://dx.doi.org/10.1182/blood.v126.23.112.112.

Abstract:
Abstract Introduction: Follicular lymphoma (FL) constitutes the second most common non-Hodgkin's lymphoma in the Western world. FL carries multiple recurrently mutated genes that are under active investigation. However, due to the relatively small number of published sequenced cases, knowledge regarding the coding genome in FL is still evolving. Methods: To further our understanding of the genetic basis of FL, we used solution exon capture of sheared and processed genomic DNA isolated from highly purified light chain restricted B-cells and paired CD3+ T-cells from 54 FL cases for paired-end massively parallel sequencing (WES). Data were subsequently analyzed using bioinformatics pipelines including the variant callers MuTect v.1.1.4, Strelka v.1.0.13, and VarScan2 v.2.3.7. Candidate somatically acquired gene mutations with variant allele frequencies (VAFs) >0.15 were confirmed using Sanger sequencing. Selected mutations were validated in an expansion cohort of 120 FL. Results: We identified heterozygous missense mutations in the mTOR regulator RRAGC in 10% of FL. The RRAGC mutations targeted multiple hotspot residues (amino acid 115, 118 and 119). RRAGC forms heterodimers with either RRAGA or RRAGB that under conditions of amino acid sufficiency facilitate recruitment of mTOR through the raptor subunit to lysosomal membranes. At the lysosomal surface, multiple protein complexes, each containing various proteins regulate mTOR activation through RHEB. To gain insights into the functional consequences of RRAGC mutations, we performed 3-dimensional modeling of FL-associated RRAGC mutations and located the mutations into relatively close proximity to the RRAGC GTP/GDP binding site. Energy calculations did not identify strong effects of mutated amino acid residues on the binding of GTP/GDP to RRAGC. We performed studies of the effects of RRAGC mutants on mTOR activity as measured through S6-kinase phosphorylation. 
In transient transfection systems (293T and HELA) achieving expression slightly above endogenous RRAGC levels, performed under conditions of leucine starvation or sufficiency, we did not identify differences in baseline mTOR activation. In stably transfected 293T cell lines (expressing RRAGB and RRAGC proteins above endogenous levels), that were starved for leucine for 1 hour, we detected modestly elevated p-S6K levels in RRAGC mutant versus wild type transfectants, suggesting a mild intrinsic activation phenotype of RRAGC mutations. Experiments in lentivirally-transfected lymphoma cell lines, including RRAGC binding studies to raptor and folliculin (a RRAGC regulator) are in progress and will be updated at the meeting. Curiously, we did not identify mutations in the other three small GTP binding proteins that are part of the same amino acid sensing pathway (RRAGA, RRAGB or RRAGD), potentially pointing to a unique advantage conferred by RRAGC mutants on FL B cells. We identified additional mutations (combined ~15%) in other mTOR components linked to lysosomal amino acid sensing, including recurrent mutations in the v-ATPase subunit ATP6V1B2 and the accessory subunit ATP6VAP1. The mutations in RRAGC and v-ATPase together highlight a previously unidentified role of the amino acid sensing pathway that regulates mTOR in FL pathogenesis. We have discovered a high frequency of mutations (40%) in the surrogate light chain gene IGLL5 in FL, a critical component of the pre-B-cell receptor. Mutations sharply cluster in the N-terminal 70 amino acid of IGLL5, a region known as the non-Ig domain of IGLL5. The non-Ig domain of IGLL5 has been implicated in influencing pre-B-cell receptor signaling and receptor surface expression as well as interaction with extracellular ligands. The mutational data suggest an unexpected role of IGLL5 in the pathogenesis of FL and work is in progress studying IGLL5 expression in primary FL samples. 
Conclusion: This large WES study of 54 FL identifies novel recurrently mutated genes and pathways in FL, including frequent mutations in genes involved in amino acid signaling to mTOR (RRAGC and v-ATPase) as well as pre-B-cell receptor signaling (the surrogate light chain gene IGLL5) and multiple other novel recurrently mutated genes that will be updated at the meeting. These data substantially broaden our understanding of the genetic basis of FL and provide clues to therapeutically targeting specific pathways in FL. Disclosures Malek: Abbvie: Equity Ownership; Gilead Sciences: Equity Ownership; Janssen Pharmaceuticals: Research Funding.
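The abstract mentions that candidate somatic mutations with variant allele frequencies (VAFs) above 0.15 were taken forward for Sanger confirmation. A minimal sketch of such a filter follows, assuming VAF is computed as alternate-allele reads divided by total reads at the site. The `Call` record, the gene labels and the read counts are hypothetical and do not come from the study or from the output format of MuTect, Strelka or VarScan2.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Call:
    gene: str        # hypothetical label, not a real result
    ref_reads: int   # reads supporting the reference allele
    alt_reads: int   # reads supporting the variant allele


def vaf(call: Call) -> float:
    """Variant allele frequency: alt reads over total reads (0.0 if no coverage)."""
    depth = call.ref_reads + call.alt_reads
    return call.alt_reads / depth if depth else 0.0


def filter_by_vaf(calls: Iterable[Call], threshold: float = 0.15) -> List[Call]:
    """Keep calls whose VAF is strictly greater than the threshold."""
    return [c for c in calls if vaf(c) > threshold]


if __name__ == "__main__":
    calls = [Call("GENE_A", 80, 20), Call("GENE_B", 90, 10), Call("GENE_C", 0, 0)]
    for c in filter_by_vaf(calls):
        print(c.gene, round(vaf(c), 2))
```

The strict inequality mirrors the ">0.15" wording in the abstract. Whether the authors also applied depth or strand filters is not stated there.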
30

Takei, Tomomi, Kazuaki Yokoyama, Nozomi Yusa, Sousuke Nakamura, Miho Ogawa, Kanya Kondoh, Masayuki Kobayashi, et al. "Artificial Intelligence Guided Precision Medicine Approach to Hematological Disease." Blood 132, Supplement 1 (November 29, 2018): 2254. http://dx.doi.org/10.1182/blood-2018-99-117941.

Abstract:
Abstract Background: Next-generation sequencing (NGS) is an attractive tool for prospective use in the field of precision medicine. Using NGS to guide therapy has provided a large volume of genomic data and therapeutic actionability of somatic NGS results. These data are evolving too rapidly to rely solely on human curation. So the interpretation of the clinical significance of such large amounts of genetic data remains the most severe bottleneck preventing the realization of precision medicine. Watson for Genomics (WfG) is a representative artificial intelligence (AI) software, which analyzes and categorizes genetic alterations that are related to disease progression and provides a list of potential therapeutic options within 3 minutes per sample. Recent reports suggested that WfG could empower tumor boards and improve patient care by providing a rapid, comprehensive approach for data analysis and consideration of the up-to-date availability of clinical trials (Patel NM, et.al. Oncologist. 2018). However, only limited data are available regarding the utility of AI-guided precision medicine approach in the field of hematological disease. The purpose of this study is to test the utility of AI in assisting the interpretation of high throughput genomic data from patients with the hematological disease. Methods: After obtaining written informed consent, we enrolled patients with hematological disease at our research hospital between May 2015 to June 2018. Genomic DNA was prepared from malignant cell fractions and normal tissues in each patient and subjected to comparative NGS, mainly targeted deep sequencing (TDS) with ready-made panels and, on demand, whole exome sequencing (WES). Sequence data was analyzed using a pipeline of in-house semi-automated medical informatics. After initial bioinformatics filtering, we used WfG to identify potential driver mutations, which were annotated as "pathogenic" or "likely pathogenic" (WfG version 39.132 and 39.135 as of July 2018). 
The results were compared with the findings of expert hematologists. Results: 247 paired samples (TDS, n=143; WES, n=104) collected from 187 patients were analyzed. Our cohort consisted of 63 patients with acute myeloid leukemia, 40 with myelodysplastic syndromes (MDS), 19 with myeloproliferative neoplasms (MPN), 9 with MDS/MPN, 10 with acute lymphoblastic leukemia/lymphoma, 17 with non-Hodgkin lymphoma, 6 with multiple myeloma (MM) and others. In 151 of 187 patients, a total of 290 somatic driver mutations were identified by human curation. The frequently mutated genes were TP53 (n=31), NRAS (n=17), TET2 (n=16), U2AF1 (n=14), FLT3/ASXL1/WT1 (n=13 each), and DNMT3A/RUNX1 (n=12 each). WfG identified 79% (n=229) of the driver mutations identified by human experts. There was some discordance between WfG and the human (Figure 1): sixteen mutations were interpreted as "variant of unknown significance" by WfG, but these mutations were deduced to be driver mutations by the human. Conversely, in two representative cases, WfG identified a relevant driver mutation that the human did not: FAM46C and SOCS1, from a patient with MM and with primary mediastinal large B-cell lymphoma, respectively. These examples indicate the potential for a mutually complementary or cooperative relationship between AI "software" and the human expert "hardware" in the interpretation of high throughput genomic data. Conclusion: Combining AI "software" and the human expert "hardware" will allow for the quick delivery of comprehensive information needed for patient care that outperforms what either can achieve individually in the field of hematological disease. Figure 1. Comparison of potential driver mutations between human curation and Watson for Genomics. The size of the gene symbol indicates the total number of mutations identified. Disclosures: No relevant conflicts of interest to declare.
31

Cannon, Matthew, Kori Kuzma, James Stevenson, Jiachen Liu, Colin O'Sullivan, Bimal P. Chaudhari, Matthew Brush, et al. "Abstract 1177: Introduction of the GA4GH Variation Representation Specification (VRS) and supporting tools for discovery and exchange of clinical genomic and cytogenomic knowledge in cancers." Cancer Research 82, no. 12_Supplement (June 15, 2022): 1177. http://dx.doi.org/10.1158/1538-7445.am2022-1177.

Abstract:
Abstract Precision oncology is the practice of interpreting the clinical significance of observed molecular changes in patient neoplasms, potentially impacting medical decision making and care. This process is labor-intensive and (among other challenges) involves accurately translating between variation representation conventions from one resource to the next. For example, differences in representations of Copy Number Variation (CNV) from genomic regions, cytogenomic bands, or gene features create challenges in knowledge matching due to lack of standards covering all of these modalities of observed variation. The Global Alliance for Genomics and Health (GA4GH; ga4gh.org) is an international collaborative of genomic data sharing initiatives (Driver Projects) developing genomic data sharing standards within a human rights framework. GA4GH recently published the Variation Representation Specification (VRS; pronounced “verse”), a standard for the computational representation of biomolecular variation. VRS is a terminology, schema, and associated conventions for creating uniquely identifiable and federatable representations of molecular variation. VRS has formal data classes well-suited to differentiating between variation on a single molecule (e.g. tandem duplications) from variation measured at a systemic level (e.g. genome-wide copy number variation). In addition to molecular sequence variation, VRS also supports variation on cytogenetic coordinate systems and genes, making it well-suited to representing variation associated with cancer biomarkers. We demonstrate the use of VRS to model reported gene-associated CNVs from the AACR Project GENIE cohort, to aid in the computational discovery of evidence from clinico-genomic knowledgebases with genomic or cytogenomic CNV representations. 
We highlight the use case of knowledge matching to the Atlas of Genetics and Cytogenetics in Oncology and Haematology (“the Atlas”; atlasgeneticsoncology.org), a cytogenetics resource historically driven by user website navigation. Using VRS search tools we developed for the Variant Interpretation for Cancer Consortium (VICC; cancervariants.org) GA4GH Driver Project, we found that 64% of GENIE samples with reported CNVs matched clinically relevant knowledge in the Atlas. This work was enabled by programmatic search tools leveraging standard VRS object structures, demonstrating how VRS enables collection of real-world evidence across more resources without manual interpretation or custom normalization methods. We conclude with a survey of open-source tools supporting this analysis as well as search of other clinico-genomic knowledgebases with VRS, including CIViC (civicdb.org), BRCA Exchange (brcaexchange.org), and the Molecular Oncology Almanac (moalmanac.org). Citation Format: Matthew Cannon, Kori Kuzma, James Stevenson, Jiachen Liu, Colin O'Sullivan, Bimal P. Chaudhari, Matthew Brush, Robert R. Freimuth, Tristan Nelson, Michael Baudis, Obi L. Griffith, Malachi Griffith, Lawrence Babb, Melissa S. Cline, Xuelu Liu, Brian Walsh, Alex H. Wagner. Introduction of the GA4GH Variation Representation Specification (VRS) and supporting tools for discovery and exchange of clinical genomic and cytogenomic knowledge in cancers [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1177.
32

Wohler, Elizabeth, Renan Martin, Sean Griffith, Eliete da S. Rodrigues, Corina Antonescu, Jennifer E. Posey, Zeynep Coban-Akdemir, et al. "PhenoDB, GeneMatcher and VariantMatcher, tools for analysis and sharing of sequence data." Orphanet Journal of Rare Diseases 16, no. 1 (August 18, 2021). http://dx.doi.org/10.1186/s13023-021-01916-z.

Abstract:
Abstract Background With the advent of whole exome (ES) and genome sequencing (GS) as tools for disease gene discovery, rare variant filtering, prioritization and data sharing have become essential components of the search for disease genes and variants potentially contributing to disease phenotypes. The computational storage, data manipulation, and bioinformatic interpretation of thousands to millions of variants identified in ES and GS, respectively, is a challenging task. To aid in that endeavor, we constructed PhenoDB, GeneMatcher and VariantMatcher. Results PhenoDB is an accessible, freely available, web-based platform that allows users to store, share, analyze and interpret their patients’ phenotypes and variants from ES/GS data. GeneMatcher is accessible to all stakeholders as a web-based tool developed to connect individuals (researchers, clinicians, health care providers and patients) around the globe with interest in the same gene(s), variant(s) or phenotype(s). Finally, VariantMatcher was developed to enable public sharing of variant-level data and phenotypic information from individuals sequenced as part of multiple disease gene discovery projects. Here we provide updates on PhenoDB and GeneMatcher applications and implementation and introduce VariantMatcher. Conclusion Each of these tools has facilitated worldwide data sharing and data analysis and improved our ability to connect genes to phenotypic traits. Further development of these platforms will expand variant analysis, interpretation, novel disease-gene discovery and facilitate functional annotation of the human genome for clinical genomics implementation and the precision medicine initiative.
33

Choi, Hyejin, Kwanghwan Lee, Donghyo Kim, Sanguk Kim, and Jae Hoon Lee. "The implication of holocytochrome c synthase mutation in Korean familial hypoplastic amelogenesis imperfecta." Clinical Oral Investigations, March 3, 2022. http://dx.doi.org/10.1007/s00784-022-04413-0.

Abstract:
Abstract Objectives This study aimed to comprehensively characterise genetic variants of amelogenesis imperfecta in a single Korean family through whole-exome sequencing and bioinformatics analysis. Material and methods Thirty-one individuals of a Korean family, 9 of whom were affected and 22 unaffected by amelogenesis imperfecta, were enrolled. Whole-exome sequencing was performed on 12 saliva samples, including samples from 8 affected and 4 unaffected individuals. The possible candidate genes associated with the disease were screened by segregation analysis and variant filtering. In silico mutation impact analysis was then performed on the filtered variants based on sequence conservation and protein structure. Results Whole-exome sequencing data revealed an X-linked dominant, heterozygous genomic missense mutation in the mitochondrial gene holocytochrome c synthase (HCCS). We also found that HCCS is potentially related to the role of mitochondria in amelogenesis. The HCCS variant was expected to be deleterious in both evolution-based and large population-based analyses. Further, the variant was predicted to have a negative effect on catalytic function of HCCS by in silico analysis of protein structure. In addition, HCCS had significant association with amelogenesis in literature mining analysis. Conclusions These findings suggest new evidence for the relationship between amelogenesis and mitochondria function, which could be implicated in the pathogenesis of amelogenesis imperfecta. Clinical relevance The discovery of HCCS mutations and a deeper understanding of the pathogenesis of amelogenesis imperfecta could lead to finding solutions for the fundamental treatment of this disease. Furthermore, it enables dental practitioners to establish predictable prosthetic treatment plans at an early stage by early detection of amelogenesis imperfecta through personalised medicine.
34

Su, Zhiguang, Allison Cox, Yuan Shen, Ioannis Stylianou, and Beverly Paigen. "Abstract 1388: Hdlq14 Gene, A New Gene Regulating HDL Levels." Circulation 116, suppl_16 (October 16, 2007). http://dx.doi.org/10.1161/circ.116.suppl_16.ii_285-a.

Abstract:
Background: The discovery of new genes responsible for regulation of high-density lipoprotein cholesterol (HDL) has great clinical relevance since increases in HDL can reduce cardiovascular disease risk. Quantitative trait locus (QTL) analysis is a means of finding novel genes that regulate complex traits, such as atherosclerosis and HDL. Hdlq14 and Hdlq15, two closely linked QTLs for HDL on mouse Chr 1, have been detected by using an intercross between strains C57BL/6 (B6) and 129S1/SvImJ (129). Apoa2 is the gene for the Hdlq15 locus, but the gene for Hdlq14 is unknown. Methods: To confirm Hdlq14 and identify the candidate gene, we performed QTL analysis in an F2 population generated from strains NZB and NZW, which are identical at Apoa2, to avoid its strong effect on the nearby QTL. Hdlq14 was further narrowed by several strategies including combining crosses, comparative genomics, and haplotype analysis. The reduced lists of candidate genes were evaluated by their expression or sequence differences between the strains that caused Hdlq14. Finally, other HDL crosses, including NZOxNON, B6xC3H, and Pera x D2, were examined to pinpoint the QTL gene. The relationship between the polymorphism at the Hdlq14 gene and HDL was analyzed in 43 genetically diverse mouse strains. Results: Hdlq14 was confirmed in the NZBxNZW cross, and the critical interval was reduced from 45 Mb harboring 271 genes to 1.65 Mb containing 15 genes by using bioinformatics tools. Six of these 15 genes have polymorphisms that changed an amino acid, and two genes were found to have a significant expression difference between strains B6 and 129. The Hdlq14 gene was further pinpointed using HDL QTL identified in crosses including NZOxNON, B6xC3H, and PeraxDBA. In 43 genetically diverse mouse strains, we found that strains with one allele of Hdlq14 had significantly higher plasma HDL levels than those with the other variant. Conclusions: Hdlq14 was identified as a new HDL-regulating gene.
35

Yun, Taedong, Helen Li, Pi-Chuan Chang, Michael F. Lin, Andrew Carroll, and Cory Y. McLean. "Accurate, scalable cohort variant calls using DeepVariant and GLnexus." Bioinformatics, January 5, 2021. http://dx.doi.org/10.1093/bioinformatics/btaa1081.

Abstract:
Abstract Motivation Population-scale sequenced cohorts are foundational resources for genetic analyses, but processing raw reads into analysis-ready cohort-level variants remains challenging. Results We introduce an open-source cohort-calling method that uses the highly-accurate caller DeepVariant and scalable merging tool GLnexus. Using callset quality metrics based on variant recall and precision in benchmark samples and Mendelian consistency in father-mother-child trios, we optimized the method across a range of cohort sizes, sequencing methods, and sequencing depths. The resulting callsets show consistent quality improvements over those generated using existing best practices with reduced cost. We further evaluate our pipeline in the deeply sequenced 1000 Genomes Project (1KGP) samples and show superior callset quality metrics and imputation reference panel performance compared to an independently-generated GATK Best Practices pipeline. Availability and Implementation We publicly release the 1KGP individual-level variant calls and cohort callset (https://console.cloud.google.com/storage/browser/brain-genomics-public/research/cohort/1KGP) to foster additional development and evaluation of cohort merging methods as well as broad studies of genetic variation. Both DeepVariant (https://github.com/google/deepvariant) and GLnexus (https://github.com/dnanexus-rnd/GLnexus) are open-sourced, and the optimized GLnexus setup discovered in this study is also integrated into GLnexus public releases v1.2.2 and later. Supplementary information Supplementary data are available at Bioinformatics online.
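One of the callset quality metrics named in this abstract, Mendelian consistency in father-mother-child trios, can be sketched as follows. This is a simplified illustration that assumes diploid autosomal genotypes encoded as pairs of allele indices (0 = reference). It is not the evaluation code used with DeepVariant or GLnexus, and the function names are hypothetical.

```python
from typing import Iterable, Tuple

Genotype = Tuple[int, int]
Trio = Tuple[Genotype, Genotype, Genotype]  # (father, mother, child)


def is_mendelian_consistent(father: Genotype, mother: Genotype, child: Genotype) -> bool:
    """True if the child could have inherited one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def consistency_rate(trios: Iterable[Trio]) -> float:
    """Fraction of trio genotype sets that are Mendelian-consistent."""
    trios = list(trios)
    if not trios:
        return 0.0
    consistent = sum(is_mendelian_consistent(f, m, c) for f, m, c in trios)
    return consistent / len(trios)


if __name__ == "__main__":
    example = [
        ((0, 1), (0, 0), (0, 1)),  # consistent: alt allele from father
        ((0, 0), (0, 0), (0, 1)),  # violation: alt allele seen in neither parent
    ]
    print(consistency_rate(example))
```

A violation can reflect a genotyping error in any of the three samples or, rarely, a true de novo mutation, so the rate is a proxy for accuracy rather than a direct measurement.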
36

Khorsand, Parsoa, Luca Denti, Paola Bonizzoni, Rayan Chikhi, and Fereydoun Hormozdiari. "Comparative genome analysis using sample-specific string detection in accurate long reads." Bioinformatics Advances 1, no. 1 (January 1, 2021). http://dx.doi.org/10.1093/bioadv/vbab005.

Abstract:
Abstract Motivation Comparative genome analysis of two or more whole-genome sequenced (WGS) samples is at the core of most applications in genomics. These include the discovery of genomic differences segregating in populations, case-control analysis in common diseases and diagnosing rare disorders. With the current progress of accurate long-read sequencing technologies (e.g. circular consensus sequencing from PacBio sequencers), we can dive into studying repeat regions of the genome (e.g. segmental duplications) and hard-to-detect variants (e.g. complex structural variants). Results We propose a novel framework for comparative genome analysis through the discovery of strings that are specific to one genome (‘sample-specific’ strings). We have developed a novel, accurate and efficient computational method for the discovery of sample-specific strings between two groups of WGS samples. The proposed approach will give us the ability to perform comparative genome analysis without the need to map the reads and is not hindered by shortcomings of the reference genome and mapping algorithms. We show that the proposed approach is capable of accurately finding sample-specific strings representing nearly all variation (>98%) reported across pairs or trios of WGS samples using accurate long reads (e.g. PacBio HiFi data). Availability and implementation Data, code and instructions for reproducing the results presented in this manuscript are publicly available at https://github.com/Parsoa/PingPong. Supplementary information Supplementary data are available at Bioinformatics Advances online.
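The idea of mapping-free comparison through strings specific to one sample can be illustrated with a deliberately naive sketch. It swaps in fixed-length k-mers and a plain set difference, which is much simpler than the method the abstract describes and is not the PingPong implementation. The function names and the toy reads are hypothetical, and reverse complements and sequencing errors are ignored.

```python
from typing import Iterable, Set


def kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Collect every substring of length k that occurs in any read."""
    found: Set[str] = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            found.add(read[i:i + k])
    return found


def sample_specific_kmers(target_reads: Iterable[str],
                          other_reads: Iterable[str],
                          k: int) -> Set[str]:
    """k-mers present in the target sample but absent from the other sample."""
    return kmers(target_reads, k) - kmers(other_reads, k)


if __name__ == "__main__":
    target = ["ACGTAC"]  # toy read carrying an A at position 5
    other = ["ACGTTC"]   # toy read carrying a T at the same position
    print(sorted(sample_specific_kmers(target, other, k=4)))
```

Holding whole k-mer sets in memory does not scale to human WGS data, which is one reason practical tools rely on compressed indexes instead.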
37

Farkas, Carlos, Andy Mella, Maxime Turgeon, and Jody J. Haigh. "A Novel SARS-CoV-2 Viral Sequence Bioinformatic Pipeline Has Found Genetic Evidence That the Viral 3′ Untranslated Region (UTR) Is Evolving and Generating Increased Viral Diversity." Frontiers in Microbiology 12 (June 21, 2021). http://dx.doi.org/10.3389/fmicb.2021.665041.

Abstract:
An unprecedented amount of SARS-CoV-2 sequencing has been performed; however, novel bioinformatic tools to cope with and process these large datasets are needed. Here, we have devised a bioinformatic pipeline that inputs SARS-CoV-2 genome sequencing in FASTA/FASTQ format and outputs a single Variant Call Format (VCF) file that can be processed to obtain variant annotations and perform downstream population genetic testing. As proof of concept, we have analyzed over 229,000 SARS-CoV-2 viral sequences up until November 30, 2020. We have identified over 39,000 variants worldwide with increased polymorphisms, spanning the ORF3a gene as well as the 3′ untranslated (UTR) regions, specifically in the conserved stem loop region of SARS-CoV-2, which is accumulating greater observed viral diversity relative to chance variation. Our analysis pipeline has also discovered the existence of SARS-CoV-2 hypermutation with low frequency (in less than 2% of genomes), likely arising through host immune responses and not due to sequencing errors. Among annotated non-sense variants with a population frequency over 1%, recurrent inactivation of the ORF8 gene was found. This was found to be present in the newly identified B.1.1.7 SARS-CoV-2 lineage that originated in the United Kingdom. Almost all VOC-containing genomes possess one stop codon in the ORF8 gene (Q27∗); however, 13% of these genomes also contain another stop codon (K68∗), suggesting that ORF8 loss does not interfere with SARS-CoV-2 spread and may play a role in its increased virulence. We have developed this computational pipeline to assist researchers in the rapid analysis and characterization of SARS-CoV-2 variation.
38

Srivastava, Himangi, Drew Ferrell, and George V. Popescu. "NetSeekR: a network analysis pipeline for RNA-Seq time series data." BMC Bioinformatics 23, no. 1 (January 28, 2022). http://dx.doi.org/10.1186/s12859-021-04554-1.

Abstract:
Abstract Background Recent development of bioinformatics tools for Next Generation Sequencing data has facilitated complex analyses and prompted large scale experimental designs for comparative genomics. When combined with the advances in network inference tools, this can lead to powerful methodologies for mining genomics data, allowing development of pipelines that stretch from sequence reads mapping to network inference. However, integrating various methods and tools available over different platforms requires a programmatic framework to fully exploit their analytic capabilities. Integrating multiple genomic analysis tools faces challenges from standardization of input and output formats, normalization of results for performing comparative analyses, to developing intuitive and easy to control scripts and interfaces for the genomic analysis pipeline. Results We describe here NetSeekR, a network analysis R package that includes the capacity to analyze time series of RNA-Seq data, to perform correlation and regulatory network inferences and to use network analysis methods to summarize the results of a comparative genomics study. The software pipeline includes alignment of reads, differential gene expression analysis, correlation network analysis, regulatory network analysis, gene ontology enrichment analysis and network visualization of differentially expressed genes. The implementation provides support for multiple RNA-Seq read mapping methods and allows comparative analysis of the results obtained by different bioinformatics methods. Conclusion Our methodology increases the level of integration of genomics data analysis tools to network inference, facilitating hypothesis building, functional analysis and genomics discovery from large scale NGS data. When combined with network analysis and simulation tools, the pipeline allows for developing systems biology methods using large scale genomics data.
39

Lo, Chien-Chi, Migun Shakya, Ryan Connor, Karen Davenport, Mark Flynn, Adán Myers y. Gutiérrez, Bin Hu, et al. "EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts." Bioinformatics, March 24, 2022. http://dx.doi.org/10.1093/bioinformatics/btac176.

Abstract:
Abstract Summary Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. A range of technologies and strategies for pathogen genome enrichment and sequencing are being used by laboratories worldwide, together with different and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process for raw sequence to consensus genome determination, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making. We have developed a web-based platform and integrated bioinformatic workflows that help to provide consistent high-quality analysis of SARS-CoV-2 sequencing data generated with either the Illumina or Oxford Nanopore Technologies (ONT). Using an intuitive web-based interface, this workflow automates data quality control, SARS-CoV-2 reference-based genome variant and consensus calling, lineage determination and provides the ability to submit the consensus sequence and necessary metadata to GenBank, GISAID and INSDC raw data repositories. We tested workflow usability using real world data and validated the accuracy of variant and lineage analysis using several test datasets, and further performed detailed comparisons with results from the COVID-19 Galaxy Project workflow. Our analyses indicate that EC-19 workflows generate high-quality SARS-CoV-2 genomes. Finally, we share a perspective on patterns and impact observed with Illumina versus ONT technologies on workflow congruence and differences. Availability and implementation https://edge-covid19.edgebioinformatics.org, and https://github.com/LANL-Bioinformatics/EDGE/tree/SARS-CoV2. Supplementary information Supplementary data are available at Bioinformatics online.
40

Sserwadda, Ivan, and Gerald Mboowa. "rMAP: the Rapid Microbial Analysis Pipeline for ESKAPE bacterial group whole-genome sequence data." Microbial Genomics 7, no. 6 (June 10, 2021). http://dx.doi.org/10.1099/mgen.0.000583.

Abstract:
The recent re-emergence of multidrug-resistant pathogens has exacerbated their threat to worldwide public health. The evolution of the genomics era has led to the generation of huge volumes of sequencing data at an unprecedented rate due to the ever-reducing costs of whole-genome sequencing (WGS). We have developed the Rapid Microbial Analysis Pipeline (rMAP), a user-friendly pipeline capable of profiling the resistomes of ESKAPE pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacter species) using WGS data generated from Illumina’s sequencing platforms. rMAP is designed for individuals with little bioinformatics expertise, and automates the steps required for WGS analysis directly from the raw genomic sequence data, including adapter and low-quality sequence read trimming, de novo genome assembly, genome annotation, single-nucleotide polymorphism (SNP) variant calling, phylogenetic inference by maximum likelihood, antimicrobial resistance (AMR) profiling, plasmid profiling, virulence factor determination, multi-locus sequence typing (MLST), pangenome analysis and insertion sequence (IS) characterization. Once the analysis is finished, rMAP generates an interactive web-like HTML report. rMAP installation is very simple, and it can be run using very simple commands. It represents a rapid and easy way to perform comprehensive bacterial WGS analysis using a personal laptop in low-income settings where high-performance computing infrastructure is limited.
41

Camiolo, Salvatore, Nicolás M. Suárez, Antonia Chalka, Cristina Venturini, Judith Breuer, and Andrew J. Davison. "GRACy: a tool for analysing human cytomegalovirus sequence data." Virus Evolution, December 30, 2020. http://dx.doi.org/10.1093/ve/veaa099.

Abstract:
Modern DNA sequencing has instituted a new era in human cytomegalovirus (HCMV) genomics. A key development has been the ability to determine the genome sequences of HCMV strains directly from clinical material. This involves the application of complex and often non-standardized bioinformatics approaches to analysing data of variable quality in a process that requires substantial manual intervention. To relieve this bottleneck, we have developed GRACy (Genome Reconstruction and Annotation of Cytomegalovirus), an easy-to-use toolkit for analysing HCMV sequence data. GRACy automates and integrates modules for read filtering, genotyping, genome assembly, genome annotation, variant analysis and data submission. These modules were tested extensively on simulated and experimental data and outperformed generic approaches. GRACy is written in Python and is embedded in a graphical user interface with all required dependencies installed by a single command. It runs on the Linux operating system, and is designed to allow the future implementation of a cross-platform version. GRACy is distributed under a GPL 3.0 license and is freely available at https://bioinformatics.cvr.ac.uk/software/ with the manual and a test dataset.
42

Rana, Shashank, Preeti P, Vartika Singh, and Nikunj Bhardwaj. "Bioinformatics in Microbial Biotechnology: A Genomics and Proteomics Perspective." Innovations in Information and Communication Technology Series, February 28, 2021, 54–69. http://dx.doi.org/10.46532/978-81-950008-7-6_005.

Abstract:
Biological data is a new era with new growth in numerical and memory retention capacity, many microbial and eukaryotic genomes encapsulate the human genome's pure structure, followed by raising the prospect of higher viral control. The goal is as high as the development of drug development based on the study of the structures and functions of target molecules (rational drug) and antimicrobial agents, the growth is simple to manage drugs, protein biomarkers that develop different bacterial infections and healthier considerate of protein(host)-protein(bacteria) interactions to avert bacterial disease. In addition to many bioinformatics processes and cross-reference, databases have made easy the understanding of these goals. The current study is divided into (I) genomics - sequencing and gene-related studies to determine the genetic function and genetic engineering, (II) proteomics - classification of associated properties of protein and rebuilding of the metabolic and regulatory pathway, (III) growth of drug and antimicrobial agents' application. The current chapter focuses on genomics and proteomics strategies and their limitations. Bioinformatics study can be grouped under several main criteria: (1) research based on existing wet-lab testing data, (2) new data obtained from the use of mathematical modelling and (3) an incorporated method that combines exploration procedure with a mathematical model.
The main implications of bioinformatics examined area have automated genetic sequence, robotic expansion of integrated data of genomics and proteomics, computer-assisted comparison to find genome utility, the automatic origin of a metabolic pathway, gene expression analysis which was derived from the regulatory pathway, clustering techniques and strategies of data mining to identify the interaction of protein-protein and protein-DNA and in silico modelling of three-dimensional protein arrangement and docking between proteins and biological chemicals for rational drug design, investigation of differences among infectious and non-infectious species to recognise genes drugs and antimicrobial agents and all genome comparisons to be aware of the development of microorganisms. Advanced bioinformatics has the potential to help (i) cause disease detection, (ii) develop new drugs and (iii) improve cost-effective bioremediation agents. Recent research is a part of the lack of genetic functionality found in wet laboratories information, the absence of computer algorithms to test large amounts of information on unidentified function and the continuous discovery of protein-to-protein, protein-to-DNA and protein-to-RNA interactions.
43

Bradbury, P. J., T. Casstevens, S. E. Jensen, L. C. Johnson, Z. R. Miller, B. Monier, M. C. Romay, B. Song, and E. S. Buckler. "The Practical Haplotype Graph, a platform for storing and using pangenomes for imputation." Bioinformatics, June 24, 2022. http://dx.doi.org/10.1093/bioinformatics/btac410.

Abstract:
Motivation: Pangenomes provide novel insights for population and quantitative genetics, genomics, and breeding not available from studying a single reference genome. Instead, a species is better represented by a pangenome or collection of genomes. Unfortunately, managing and using pangenomes for genomically diverse species is computationally and practically challenging. We developed a trellis graph representation anchored to the reference genome that represents most pangenomes well and can be used to impute complete genomes from low density sequence or variant data. Results: The Practical Haplotype Graph (PHG) is a pangenome pipeline, database (PostGRES & SQLite), data model (Java, Kotlin, or R), and Breeding API (BrAPI) web service. The PHG has already been able to accurately represent diversity in four major crops including maize, one of the most genomically diverse species, with up to 1000-fold data compression. Using simulated data, we show that, at even 0.1X coverage, with appropriate reads and sequence alignment, imputation results in extremely accurate haplotype reconstruction. The PHG is a platform and environment for the understanding and application of genomic diversity. Availability: All resources listed here are freely available. The PHG Docker used to generate the simulation results is https://hub.docker.com/ as maizegenetics/phg:0.0.27. PHG source code is at https://bitbucket.org/bucklerlab/practicalhaplotypegraph/src/master/. The code used for the analysis of simulated data is at https://bitbucket.org/bucklerlab/phg-manuscript/src/master/. The PHG database of NAM parent haplotypes is in the CyVerse data store (https://de.cyverse.org/de/) and named /iplant/home/shared/panzea/panGenome/PHG_db_maize/phg_v5Assemblies_20200608.db. Supplementary information: Supplementary data are available at Bioinformatics online.
44

Bersell, Kevin, Tao Yang, and Dan Roden. "Abstract 96: A Unique Jervell Lange-Nielsen Syndrome Mutation Modeled in Induced Pluripotent Stem Cell Derived Cardiomyocytes." Circulation Research 117, suppl_1 (July 17, 2015). http://dx.doi.org/10.1161/res.117.suppl_1.96.

Abstract:
Introduction: Current screening for mutations in human disease is turning increasingly to next-generation methods that map short reads to a reference sequence. We report here an unusual variant that was undetected by next-generation sequencing in a patient diagnosed with Jervell Lange-Nielsen syndrome (JLNS) and initial results in an induced pluripotent stem cell-derived cardiomyocyte (iPSC-CM) model. Methods and Results: A diagnosis of JLNS was made in a middle-aged woman with congenital deafness and QT intervals as long as 800 msec. However, next-generation sequencing found only a heterozygous KCNQ1 mutation, R518X. Convinced by the clinical phenotype that a second causative variant was highly likely, we used Sanger sequencing of PCR KCNQ1 amplicons to identify a 36-basepair poly-adenine tract, encoding 12 lysines, inserted within the coding sequence at the 5’ end of exon 15. Electrophysiological studies in patient-specific iPSC-CMs revealed marked prolongation of ventricular-like action potentials (Figure). Conclusion: Long inserts of the type we identified here have not been previously reported in the long QT syndromes. We speculate that next-generation short reads containing this variant could not be mapped to a reference sequence and thus this type of variant will be missed by next-generation analysis unless bioinformatics filters are specifically modified to include this possibility. Validation of this long QT syndrome iPSC-CM model provides a human cell-based platform for drug discovery and mechanistic studies to further our understanding of disease pathogenesis.
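The statement that a 36-basepair poly-adenine tract encodes 12 lysines can be checked with a few lines of Python. This is a minimal sketch, not the authors' analysis: the insert string is reconstructed from the abstract's description rather than taken from patient data, and the one-entry codon table and the `translate` helper are illustrative names. It relies only on the standard genetic code, in which AAA encodes lysine (K).

```python
# Illustrative only: the insert is rebuilt from the abstract's wording
# (36 adenines); it is not the patient's actual sequence data.
insert = "A" * 36

# Standard genetic code: the DNA codon AAA encodes lysine (K).
# Only the codon needed for this example is listed.
CODON_TABLE = {"AAA": "K"}


def translate(seq):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


peptide = translate(insert)

# An insertion whose length is a multiple of 3 keeps the reading frame intact.
in_frame = len(insert) % 3 == 0

print(peptide, len(peptide), in_frame)
```

Because 36 is divisible by 3, the insertion adds residues without shifting the downstream reading frame, which is consistent with the abstract's description of an in-frame run of lysines.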
45

Peng, Qi, Wenyan Qin, Siping Li, Meihua Huang, Chunbao Rao, and Xiaomei Lu. "A Novel IRF6 Frameshift Mutation in a Large Chinese Pedigree With Van der Woude syndrome." Cleft Palate-Craniofacial Journal, April 28, 2021, 105566562110109. http://dx.doi.org/10.1177/10556656211010909.

Abstract:
Aims: Van der Woude syndrome (VWS) is one of the most common craniofacial anomalies, causing significant functional and psychological burden to the patients. This study aimed to identify the genetic cause of VWS in a Chinese family. Methods: Whole genome sequencing (WGS) was performed to screen for pathogenic mutations. Various bioinformatics tools were used to assess the pathogenicity of the variants. Cosegregation analysis of the candidate variant was carried out. Interpretation of variants was performed according to the American College of Medical Genetics and Genomics guidelines. Results: A novel frameshift duplication c.373_374dupAA (p.Asn125Lysfs*43) was identified in exon 4 of the interferon regulatory factor 6 (IRF6) gene in all 3 affected members, and was not found in unaffected family members. The novel mutation leads to a frameshift and a premature stop codon, resulting in a putative truncated protein. Protein alignment indicated high evolutionary conservation of the p.N125 residue, and this mutation was predicted by online tools to be damaging and deleterious. Conclusions: This study demonstrates that the novel mutation c.373_374dupAA (p.Asn125Lysfs*43) in the IRF6 gene corresponds to the VWS in this family. The discovery of this pathogenic variant enriches the genotypic spectrum of the IRF6 gene and contributes to genetic diagnosis and counseling of families with VWS.
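The relationship between the coding-level name c.373_374dupAA and the protein-level consequence at residue 125 can be traced with simple arithmetic. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the helper names are hypothetical, and because the abstract does not give the third base of the asparagine codon, both standard asparagine codons (AAT and AAC) are tried.

```python
def codon_number(cds_position):
    """Return the 1-based codon that contains a 1-based coding position."""
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position):
    """Return 1, 2 or 3: where the coding position falls inside its codon."""
    return (cds_position - 1) % 3 + 1


def is_frameshift(inserted_length):
    """An insertion or duplication shifts the frame unless its length is a multiple of 3."""
    return inserted_length % 3 != 0


# Coordinates taken from the variant name c.373_374dupAA.
dup_start, dup_end = 373, 374
dup_length = dup_end - dup_start + 1

# Assumption: the wild-type codon 125 is one of the two standard asparagine
# codons. Duplicating its first two bases (AA) gives a new first codon.
ASN_CODONS = ("AAT", "AAC")
mutated_codons = [(codon[:2] + "AA" + codon[2:])[:3] for codon in ASN_CODONS]

print(codon_number(dup_start), position_in_codon(dup_start))
print(dup_length, is_frameshift(dup_length))
print(mutated_codons)
```

Position 373 is the first base of codon 125, a two-base duplication cannot preserve the reading frame, and in either case the new codon 125 reads AAA (lysine), which matches the p.Asn125Lysfs notation in the abstract.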
46

Chen, Jia, Yuting Ma, Hong Li, Zhuo Lin, Zhe Yang, Qin Zhang, Feng Wang, Yanping Lin, Zebing Ye, and Yubi Lin. "Rare and potential pathogenic mutations of LMNA and LAMA4 associated with familial arrhythmogenic right ventricular cardiomyopathy/dysplasia with right ventricular heart failure, cerebral thromboembolism and hereditary electrocardiogram abnormality." Orphanet Journal of Rare Diseases 17, no. 1 (May 7, 2022). http://dx.doi.org/10.1186/s13023-022-02348-z.

Abstract:
Background: Arrhythmogenic right ventricular cardiomyopathy/dysplasia (ARVC/D) is associated with ventricular arrhythmia, heart failure (HF), and sudden death. Thromboembolism is also an important and serious complication of ARVC/D. However, the etiology of ARVC/D and thromboembolism and their association with genetic mutations are unclear. Methods: Genomic DNA samples from peripheral blood were subjected to whole-exome sequencing (WES) and Sanger sequencing in the ARVC/D family. Then, we performed bioinformatics analysis for genes susceptible to cardiomyopathies and arrhythmias. Further, we analyzed how the potential pathogenic mutations were affecting the hydrophobicity and phosphorylation of amino acids and their joint pathogenicity by ProtScale, NetPhos and ORVAL algorithms. Results: We discovered a Chinese Han family of ARVC/D with right ventricular HF (RVHF), cerebral thromboembolism, arrhythmias (atrial fibrillation, atrial standstill, multifocal ventricular premature, complete right bundle block and third-degree atrioventricular block) and sudden death. Based on the WES data, the variants of LMNA p.A242V, LAMA4 p.A225P and RYR2 p.T858M are highly conserved and predicted as “deleterious” by SIFT and MetaSVM algorithms. Their CADD predicting scores are 33, 27.4 and 25.8, respectively. These variants increase the hydrophobicity of their corresponding amino acid residues and their nearby sequences by 0.378, 0.266 and 0.289, respectively. The LAMA4 and RYR2 variants lead to changes in protein phosphorylation at or near their corresponding amino acid sites. There were high risks of joint pathogenicity for cardiomyopathy among these three variants. Cosegregation analysis indicated that LMNA p.A242V might be an important risk factor for ARVC/D, electrocardiogram abnormality and cerebral thromboembolism, while LAMA4 p.A225P may be a pathogenic etiology of ARVC/D and hereditary electrocardiogram abnormality.
Conclusions: The LMNA p.A242V may participate in the pathogenesis of familial ARVC/D with RVHF and cerebral thromboembolism, while LAMA4 p.A225P may be associated with ARVC/D and hereditary electrocardiogram abnormality.
47

Reid, Thomas, and Jordyn Bergsveinson. "How Do the Players Play? A Post-Genomic Analysis Paradigm to Understand Aquatic Ecosystem Processes." Frontiers in Molecular Biosciences 8 (May 7, 2021). http://dx.doi.org/10.3389/fmolb.2021.662888.

Abstract:
Culture-independent and meta-omics sequencing methods have shed considerable light on the so-called “microbial dark matter” of Earth’s environmental microbiome, improving our understanding of phylogeny, the tree of life, and the vast functional diversity of microorganisms. This influx of sequence data has led to refined and reimagined hypotheses about the role and importance of microbial biomass, that paradoxically, sequencing approaches alone are unable to effectively test. Post-genomic approaches such as metabolomics are providing more sensitive and insightful data to unravel the fundamental operations and intricacies of microbial communities within aquatic systems. We assert that the implementation of integrated post-genomic approaches, specifically metabolomics and metatranscriptomics, is the new frontier of environmental microbiology and ecology, expanding conventional assessments toward a holistic systems biology understanding. Progressing beyond siloed phylogenetic assessments and cataloging of metabolites, toward integrated analysis of expression (metatranscriptomics) and activity (metabolomics) is the most effective approach to provide true insight into microbial contributions toward local and global ecosystem functions. This data in turn creates opportunity for improved regulatory guidelines, biomarker discovery and better integration of modeling frameworks. To that end, critical aquatic environmental issues related to climate change, such as ocean warming and acidification, contamination mitigation, and macro-organism health have reasonable opportunity of being addressed through such an integrative approach. Lastly, we argue that the “post-genomics” paradigm is well served to proactively address the systemic technical issues experienced throughout the genomics revolution and focus on collaborative assessment of field-wide experimental standards of sampling, bioinformatics and statistical treatments.
48

Chu, Chunfang, Lin Li, Shenghui Li, Qi Zhou, Ping Zheng, Yu-Di Zhang, Ai-hong Duan, Dan Lu, and Yu-Mei Wu. "Variants in genes related to development of the urinary system are associated with Mayer–Rokitansky–Küster–Hauser syndrome." Human Genomics 16, no. 1 (March 31, 2022). http://dx.doi.org/10.1186/s40246-022-00385-0.

Abstract:
Mayer–Rokitansky–Küster–Hauser (MRKH) syndrome, also known as Müllerian agenesis, is characterized by uterovaginal aplasia in an otherwise phenotypically normal female with a normal 46,XX karyotype. Previous studies have associated sequence variants of PAX8, TBX6, GEN1, WNT4, WNT9B, BMP4, BMP7, HOXA10, EMX2, LHX1, GREB1L, LAMC1, and other genes with MRKH syndrome. The purpose of this study was to identify the novel genetic causes of MRKH syndrome. Ten patients with MRKH syndrome were recruited at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing, China. Whole-exome sequencing was performed for each patient. Sanger sequencing confirmed the potential causative genetic variants in each patient. In silico analysis and American College of Medical Genetics and Genomics (ACMG) guidelines helped to classify the pathogenicity of each variant. The Robetta online protein structure prediction tool determined whether the variants affected protein structures. Eleven variants were identified in 90% (9/10) of the patients and were considered a molecular genetic diagnosis of MRKH syndrome. These 11 variants were related to nine genes: TBC1D1, KMT2D, HOXD3, DLG5, GLI3, HIRA, GATA3, LIFR, and CLIP1. Sequence variants of TBC1D1 were found in two unrelated patients. All variants were heterozygous. These changes included one frameshift variant, one stop-codon variant, and nine missense variants. All identified variants were absent or rare in gnomAD East Asian populations. Two of the 11 variants (18.2%) were classified as pathogenic according to the ACMG guidelines, and the remaining nine (81.8%) were classified as variants of uncertain significance. Robetta online protein structure prediction analysis suggested that missense variants in TBC1D1 (p.E357Q), HOXD3 (p.P192R), and GLI3 (p.L299V) proteins caused significant structural changes compared to those in wild-type proteins, which in turn may lead to changes in protein function.
This study identified many novel genes, especially TBC1D1, related to the pathogenesis of MRKH syndrome. The identification of these variants provides new insights into the etiology of MRKH syndrome and a new molecular genetic reference for the development of the reproductive tract.
49

Samaha, Georgina, Claire M. Wade, Hamutal Mazrier, Catherine E. Grueber, and Bianca Haase. "Exploiting genomic synteny in Felidae: cross-species genome alignments and SNV discovery can aid conservation management." BMC Genomics 22, no. 1 (August 6, 2021). http://dx.doi.org/10.1186/s12864-021-07899-2.

Abstract:
Background: While recent advances in genomics have enabled vast improvements in the quantification of genome-wide diversity and the identification of adaptive and deleterious alleles in model species, wildlife and non-model species have largely not reaped the same benefits. This has been attributed to the resources and infrastructure required to develop essential genomic datasets such as reference genomes. In the absence of a high-quality reference genome, cross-species alignments can provide reliable, cost-effective methods for single nucleotide variant (SNV) discovery. Here, we demonstrated the utility of cross-species genome alignment methods in gaining insights into population structure and functional genomic features in cheetah (Acinonyx jubatus), snow leopard (Panthera uncia) and Sumatran tiger (Panthera tigris sumatrae), relative to the domestic cat (Felis catus). Results: Alignment of big cats to the domestic cat reference assembly yielded nearly complete sequence coverage of the reference genome. From this, 38,839,061 variants in cheetah, 15,504,143 in snow leopard and 13,414,953 in Sumatran tiger were discovered and annotated. This method was able to delineate population structure but was limited in its ability to adequately detect rare variants. Enrichment analysis of fixed and species-specific SNVs revealed insights into adaptive traits, evolutionary history and the pathogenesis of heritable diseases. Conclusions: The high degree of synteny among felid genomes enabled the successful application of the domestic cat reference in high-quality SNV detection. The datasets presented here provide a useful resource for future studies into population dynamics, evolutionary history and genetic and disease management of big cats. This cross-species method of variant discovery provides genomic context for identifying annotated gene regions essential to understanding adaptive and deleterious variants that can improve conservation outcomes.
50

Vasconcelos, Ana M., Maria Beatriz Carmo, Beatriz Ferreira, Inês Viegas, Margarida Gama-Carvalho, António Ferreira, and Andreia J. Amaral. "IsomiR_Window: a system for analyzing small-RNA-seq data in an integrative and user-friendly manner." BMC Bioinformatics 22, no. 1 (February 1, 2021). http://dx.doi.org/10.1186/s12859-021-03955-6.

Abstract:
Background: IsomiRs are miRNA variants that vary in length and/or sequence when compared to their canonical forms. These variants display differences in length and/or sequence, including additions or deletions of one or more nucleotides (nts) at the 5′ and/or 3′ end, internal editings or untemplated 3′ end additions. Most available tools for small RNA-seq data analysis do not allow the identification of isomiRs and often require advanced knowledge of bioinformatics. To overcome this, we have developed IsomiR Window, a platform that supports the systematic identification, quantification and functional exploration of isomiR expression in small RNA-seq datasets, accessible to users with no computational skills. Methods: IsomiR Window enables the discovery of isomiRs and identification of all annotated non-coding RNAs in RNA-seq datasets from animals and plants. It comprises two main components: the IsomiR Window pipeline for data processing; and the IsomiR Window Browser interface. It integrates over ten third-party software tools for the analysis of small-RNA-seq data and holds a new algorithm that allows the detection of all possible types of isomiRs. These include 3′ and 5′ end isomiRs, 3′ end tailings, isomiRs with single nucleotide polymorphisms (SNPs) or potential RNA editings, as well as all possible fuzzy combinations. IsomiR Window includes all required databases for analysis and annotation, and is freely distributed as a Linux virtual machine, including all required software. Results: IsomiR Window processes several datasets in an automated manner, without restrictions of input file size. It generates high quality interactive figures and tables which can be exported into different formats. The performance of isomiR detection and quantification was assessed using simulated small-RNA-seq data. For correctly mapped reads, it identified different types of isomiRs with high confidence and 100% accuracy.
The analysis of small RNA-seq data from Basal Cell Carcinomas (BCCs) using IsomiR Window confirmed that miR-183-5p is up-regulated in Nodular BCCs, but revealed that this effect was predominantly due to a novel 5′ end variant. This variant displays a different seed region motif and 1756 isoform-exclusive mRNA targets that are significantly associated with disease pathways, underscoring the biological relevance of isomiR-focused analysis. IsomiR Window is available at https://isomir.fc.ul.pt/.
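The abstract's point that a 5′ end variant carries a different seed region can be illustrated with a toy example. This is a rough sketch and not the IsomiR Window algorithm: the precursor string is made up, the coordinates and the `classify_isomir` and `seed` helpers are hypothetical, and the seed is taken as nucleotides 2 to 8 of the mature sequence, a common convention assumed here.

```python
# Hypothetical precursor sequence, invented for illustration only.
PRECURSOR = "ACGTTGCAAGCTTAGGCATCGATCCGTAAGCT"


def classify_isomir(canonical, read):
    """Compare (start, end) coordinates on the same precursor; end is exclusive."""
    c_start, c_end = canonical
    r_start, r_end = read
    labels = []
    if r_start != c_start:
        labels.append("5' end isomiR")
    if r_end != c_end:
        labels.append("3' end isomiR")
    return labels or ["canonical"]


def seed(coords):
    """Return nucleotides 2-8 of the mature sequence (assumed seed definition)."""
    start, end = coords
    mature = PRECURSOR[start:end]
    return mature[1:8]


canonical = (5, 27)       # hypothetical annotated mature miRNA
five_prime_var = (4, 27)  # read starting one nucleotide upstream
three_prime_var = (5, 29) # read extending two nucleotides downstream

for name, coords in [("5' variant", five_prime_var), ("3' variant", three_prime_var)]:
    print(name, classify_isomir(canonical, coords), seed(coords))
```

In this toy case only the read with a shifted 5′ start has a seed that differs from the canonical one, while the 3′ variant keeps the same seed, which is the general reason 5′ isomiRs can acquire a different target set.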