Academic literature on the topic 'Genotypability, targeted resequencing, variant calling'


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Genotypability, targeted resequencing, variant calling.'


Journal articles on the topic "Genotypability, targeted resequencing, variant calling"

1

Chen, Li. "Concurrent targeted resequencing for translocations and mutations in tumor samples." Journal of Clinical Oncology 37, no. 15_suppl (May 20, 2019): e14597-e14597. http://dx.doi.org/10.1200/jco.2019.37.15_suppl.e14597.

Abstract:
Background: Genetic variations are diverse, and proper nucleic acid templates should be analyzed for optimal outcomes. Previous NGS assays treat DNA and/or RNA separately, are costly, and produce limited information. Here we present a concurrent assay that simultaneously converts both RNA and DNA templates in a single-tube format to streamline the variant detection process. Methods: We developed a high-throughput targeted resequencing assay utilizing both DNA and RNA templates for mutation and fusion detection, and applied it to a cohort of over 1000 lung tumor samples. Total nucleic acids from tissue samples were processed in a single-tube format throughout, and post-assay data analysis split RNA and DNA signals for corresponding variant calling. Sensitivity and specificity were sufficient for tissue samples. Results: In addition to common EGFR, KRAS, BRAF, PIK3CA mutations and ALK, ROS1, RET fusions and MET exon 14 skipping, we also identified rare HLA-DRB1–MET and MSN–NTRK2 fusions. We found that MET exon 14 skipping was abundant, and that EGFR T790M is more associated with exon 19 deletion than with L858R. We also found a baseline mutation frequency for FFPE samples of 4-5%. Conclusions: Overall, this method is robust and convenient for clinical molecular diagnosis of tissue samples where variant types are diverse and time and budgets are constrained. Extreme caution is suggested when making positive calls below 5% MAF.
2

Tozaki, Teruaki, Aoi Ohnuma, Kotono Nakamura, Kazuki Hano, Masaki Takasu, Yuji Takahashi, Norihisa Tamura, et al. "Detection of Indiscriminate Genetic Manipulation in Thoroughbred Racehorses by Targeted Resequencing for Gene-Doping Control." Genes 13, no. 9 (September 4, 2022): 1589. http://dx.doi.org/10.3390/genes13091589.

Abstract:
The creation of genetically modified horses is prohibited in horse racing as it falls under the banner of gene doping. In this study, we developed a test to detect gene editing based on amplicon sequencing using next-generation sequencing (NGS). We designed 1012 amplicons to target 52 genes (481 exons) and 147 single-nucleotide variants (SNVs). NGS analyses showed that 97.7% of the targeted exons were sequenced to sufficient coverage (depth > 50) for calling variants. The targets of artificial editing were defined as homozygous alternative (HomoALT) and compound heterozygous alternative (ALT1/ALT2) insertion/deletion (INDEL) mutations in this study. Four models of gene editing (three HomoALT with 1-bp insertions, one REF/ALT with a 77-bp deletion) were constructed by editing the myostatin gene in horse fibroblasts using CRISPR/Cas9. The edited cells and 101 samples from thoroughbred horses were screened using the developed test, which was capable of identifying the three HomoALT cells containing 1-bp insertions. Furthermore, 147 SNVs were investigated for their utility in confirming biological parentage. Of these, 120 SNVs were amenable to consistent and accurate genotyping. Surrogate (nonbiological) dams were excluded by 9.8 SNVs on average, indicating that the 120 SNVs could be used to detect foals that have been produced by somatic cloning or embryo transfer, two practices that are prohibited in thoroughbred racing and breeding. These results indicate that gene-editing tests that include variant calling and SNV genotyping are useful to identify genetically modified racehorses.
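The coverage figure in this abstract (share of targeted exons with depth > 50) can be illustrated with a minimal sketch. The abstract does not say how depth was summarized per exon, so this example assumes an exon passes only when its lowest per-base depth exceeds the threshold. The function name and the toy depths are hypothetical and are not data from the study.

```python
def fraction_covered(exon_depths, min_depth=50):
    """Fraction of exons whose minimum per-base depth exceeds min_depth.

    exon_depths maps an exon name to a list of per-base read depths.
    Assumption: an exon is "sufficiently covered" only if every base passes.
    """
    if not exon_depths:
        raise ValueError("no exons supplied")
    passed = sum(
        1 for depths in exon_depths.values()
        if depths and min(depths) > min_depth
    )
    return passed / len(exon_depths)


# Toy data (hypothetical): two of the four exons pass the depth > 50 rule.
toy_depths = {
    "exon_1": [120, 98, 75],
    "exon_2": [60, 49, 80],   # one base at 49 fails the exon
    "exon_3": [51, 55, 300],
    "exon_4": [10, 12, 9],
}

print(fraction_covered(toy_depths))  # 0.5
```

Using the mean depth per exon instead of the minimum would be a more lenient rule and would generally give a higher percentage.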
3

Royo, Carolina, Pablo Carbonell-Bejerano, Rafael Torres-Pérez, Luisa Freire, Javier Ibáñez, José Miguel Martínez-Zapater, and Mar Vilanova. "Is aromatic terpenoid composition of grapes in Northwestern Iberian wine cultivars related to variation in VviDXS1 gene?" Journal of Berry Research 11, no. 2 (June 14, 2021): 187–200. http://dx.doi.org/10.3233/jbr-200609.

Abstract:
BACKGROUND: Monoterpenes and C13-norisoprenoids are key terpenoid compounds for wine aroma. The enzyme encoded by VviDXS1 participates in terpenoid biosynthesis in grapevine fruits and gain-of-function mutations in this gene lead to characteristic muscat aroma. OBJECTIVE: To assess for VviDXS1 contribution to aroma variation in Northwestern Iberian wine cultivars, we resequenced this gene in 111 cultivars and compared grape juice terpenic composition in 12 of them. METHODS: VviDXS1 was capture-targeted for resequencing with Illumina paired-end reads, SAMtools was used for variant calling and gene haplotypes were reconstructed with PHASE. Monoterpenes and C13-norisoprenoids were quantified in free and glycosidically-bound forms from grape juice by gas chromatography-mass spectrometry. RESULTS: Terpenic composition discriminated between muscat, terpenic and neutral profiles across cultivars. While the terpenic profile of Loureira and Albariño white cultivars was not associated with muscat-like mutations, Albariño carries a V34L substitution in VviDXS1 that is also present in other aromatic cultivars and was not reported before. Tempranillo and Cabernet Sauvignon red cultivars accumulated higher levels of C13-norisoprenoids, which was not associated with specific variation in VviDXS1. CONCLUSIONS: Apart from the uncharacterized substitution present in Albariño, findings suggest that terpenoid pathway-related genes other than VviDXS1 could contribute to the aromatic attributes of these cultivars.
4

Das, Reena, Manu Jamwal, Prashant Sharma, Deepak Bansal, Amita Trehan, Pankaj Malhotra, and Arindam Maitra. "Genetic Spectrum of Inherited/Congenital Hemolytic Anemias in Indian Patients." Blood 138, Supplement 1 (November 5, 2021): 4151. http://dx.doi.org/10.1182/blood-2021-154452.

Abstract:
Introduction Hemolytic anemias are a group of disorders caused by the premature destruction of red blood cells with reticulocytosis. Common causes of inherited/congenital hemolysis are hemoglobinopathies and thalassemia syndromes, red blood cell membrane disorders, and enzyme disorders. Most of the common causes (thalassemia, glucose-6-phosphate dehydrogenase (G6PD) deficiency, hereditary spherocytosis, etc.) are diagnosed based on laboratory testing; however, for the remaining causes laboratory tests are either inaccessible or cumbersome. We follow a stepwise diagnostic pipeline, and red cell morphology is helpful with membrane disorders. Phenotypes vary from severe hemolysis (transfusion-dependent) to mild/asymptomatic patients. Undiagnosed hemolytic anemias are taken up for multi-gene panel-based targeted resequencing, which is rapid, accurate, and cost-effective. The use of these panels expedites the diagnosis of inherited hemolytic anemias and is eventually helpful for evidence-based genetic counseling. Objectives This study aimed to determine the genetic defects in inherited/congenital hemolytic anemias which remained unexplained after routine laboratory tests. Methods Seventy-five families were enrolled based on the clinical and laboratory features of inherited/congenital hemolytic anemias. Common causes of inherited hemolysis (G6PD deficiency, hemoglobinopathies and thalassemia syndromes, autoimmune hemolytic anemias, hereditary spherocytosis, and pyruvate kinase (PK) deficiency) were excluded on the basis of biochemical and molecular tests. DNA was extracted using the QIAamp DNA Blood Mini Kit. Quantity and quality of DNA were verified using NanoDrop and Qubit Fluorometer respectively. DNA libraries were prepared using amplicon custom panels for genes implicated in hemolytic anemias and sequenced on an Illumina MiSeq sequencer. Alignment and variant calling were done in Illumina Local Run Manager, and variant annotation was done in BaseSpace Variant Interpreter.
Sanger sequencing was done as orthogonal validation in the index case. Predictive testing was performed for the family members. Results After targeted resequencing of the total 75 index cases, 19 patients were found to have red blood cell enzymopathies, 15 patients had stomatocytosis, 13 had membranopathies and three patients had unstable hemoglobins. In 8 patients the cause was not established, either because only a heterozygous variant was found for an autosomal recessive disorder or because samples from family members were not available for screening. Seventeen cases remained unexplained even after next-generation sequencing. Of the 19 patients with enzymopathies, unexpected PK deficiency was found in 12 patients and G6PD deficiency was found in 3 patients, despite the enzyme assay being normal in these cases. We also found 2 patients with glucose-6-phosphate isomerase deficiency. One case each with hexokinase deficiency and glutathione synthetase deficiency was found. Among 15 patients with stomatocytosis, 8 had Mediterranean stomatocytosis/macrothrombocytopenia (ABCG5/ABCG8). These 8 patients showed the presence of stomatocytosis along with giant platelets on peripheral smear evaluation. Of the remaining 7 cases, 2 were found to have overhydrated hereditary stomatocytosis (RHAG), and dehydrated stomatocytosis/xerocytosis was found in 5 (PIEZO1/KCNN4). We also found 13 cases of hemolytic anemia to have a genetic defect in red blood cell membrane protein-coding genes. Of these, 5 had probably pathogenic variants in the ANK1 gene, 5 had a pathogenic variant in SPTA1, 2 had variants in SPTB, and 1 patient had a variant in SLC4A1. We also encountered 3 cases of unstable hemoglobins where no abnormality was noted in Hb-HPLC patterns. A total of seven patients underwent splenectomy and are transfusion free. Conclusions Our cohort of 75 families of hemolytic anemia of unexplained etiology showed a highly heterogeneous genetic spectrum. Of the total cases, a confirmed diagnosis was achieved in 67% of the patients.
This approach of using a multi-gene panel is cost-effective and can provide a rapid and accurate diagnosis. Unexpected PK deficiency, G6PD deficiency, and unstable hemoglobins suggest that such cases can be missed. Providing accurate diagnosis in such cases provides evidence-based counseling and saves the families from inappropriate treatments. Disclosures No relevant conflicts of interest to declare.
5

Herold, Sylvia, Thoralf Stange, Matthias Kuhn, Ingo Roeder, Christoph Röllig, Gerhard Ehninger, and Christian Thiede. "Targeted Resequencing of MLL-PTD Positive AML Patients Reveals a High Prevalence of Co-Ocurring Mutations in Epigenetic Regulator Genes." Blood 124, no. 21 (December 6, 2014): 1035. http://dx.doi.org/10.1182/blood.v124.21.1035.1035.

Abstract:
Background Partial tandem duplication mutations of the Mixed Lineage Leukemia gene (MLL-PTD) can be found in about 10% of patients with AML, especially in patients with normal karyotype AML. The mutation generates a self-fusion within the N-terminal part of MLL and has been shown to be leukemogenic in mouse models. In patients, the presence of the mutation is associated with poor prognosis. Little is known about the molecular profile of patients with MLL-PTD or about the cooperating mutations. In order to identify accompanying molecular alterations, we performed whole exome sequencing (WES) of eight AML patients harbouring MLL-PTD mutations. Based on the observed alterations we then designed a custom amplicon panel and performed targeted resequencing in a cohort of 90 MLL-PTD mutated AML patients. Materials and Methods All patients included in this analysis were treated in prospective treatment protocols of the Study Alliance Leukemia (SAL). To enrich for malignant cells and to obtain germline reference material (T-cells), FACS sorting was performed on viable cells banked at diagnosis. After whole genome amplification of the primary DNA, whole exomes were enriched (TruSeq chemistry; Illumina), and paired-end sequenced using Illumina HiSeq2000 2x100 bp runs. Resulting data were mapped against the human genome (hg19). Only somatic single nucleotide variants (SNVs) were included in the final analysis. Based on the SNVs identified by whole exome sequencing (WES), a custom amplicon panel (TruSeq Custom Amplicon, TSCA, Illumina) for targeted resequencing was designed. The assay included either the entire coding region or mutational hot spots of 56 genes (Fig. 1). In total, 700 targets were amplified in a single reaction for each patient and paired-end sequenced on a MiSeq NGS system (Illumina).
Paired-end reads were mapped with BWA against the targeted regions, and data analysis was done using the Sequence Pilot software package (JSI Medical Systems) with a 20% variant allele frequency (VAF) mutation calling cutoff. Only non-synonymous variants not specified as SNP in the db137 database and predicted as deleterious (Provean) were included in the final analysis. All variations were confirmed by Sanger sequencing. Results WES of eight MLL-PTD (7/8 FLT3-ITD negative) patients revealed a total of 490 SNVs (range 13-254 per patient). The most frequently mutated genes were DNMT3A, IDH1/2 and TET2. Somatic mutations were also found in genes rarely mutated in AML, such as ATM, GNAS, TET1 and EP300. Based on the WES data, 90 MLL-PTD patients were screened for a panel of 56 genes using the TSCA assay, which revealed a total of 169 mutations. 18 genes were not found to be mutated, and in 8 patients no co-occurring mutations were identified. Due to poor assay performance, EP300, EZH1, JAK3, MLL2, MLL3 and NOTCH1 were excluded from the data analysis. Here again, the most frequently mutated genes were DNMT3A (34.4%), IDH1 (20.0%), IDH2R140 (18.9%), IDH2R172 (7.9%), TET2 (16.7%) and FLT3 (11.3%). Mutations were less frequently found in RUNX1 (8.9%) and ASXL1, SMC1A, U2AF1 (5.6% each) (Fig. 1). In addition to these known genes, the most prevalent mutations were found in ATM (8.9%) as well as DNMT3B and TET1 (4.4% each). Overall, we observed a low frequency of mutations in typical class 1 genes such as NRAS, KRAS and FLT3, which was lower than reported in the TCGA data set. Conclusions This analysis in a large set of patients with MLL-PTD mutations did not reveal any new and specific individual mutation present in patients with this alteration. Instead, our finding of a very high prevalence of alterations in epigenetic regulator genes, found in more than 85% of patients with MLL-PTD, strongly argues for a particular disease biology in these patients.
These findings might also imply that treatment based on demethylating agents or histone-deacetylase inhibitors might be especially attractive in patients with MLL-PTD. Figure 1: Distribution of mutations in MLL-PTD patients. The assay included either the entire coding region or mutational hot spots of the following 56 genes: ASXL1, ATM, BCOR, BRAF, CBL, DDR1, DNMT1, DNMT3A, DNMT3B, EIF4A2, EP300, ETV6, EZH1, EZH2, FLT3, GATA1, GATA2, GNAS, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDM4A, KDM5A, KDM5C, KDM6A, KIT, KRAS, MET, MLL, MLL2, MLL3, NOTCH1, NOTCH4, NPM1, NRAS, PDGFRA, PDGFRB, PHF6, PTEN, PTPN11, RAD21, RUNX1, SF3A1, SF3B4, SMC1A, SMC3, SMC4, TET1, TET2, TP53, U2AF1 and WT1. Disclosures Thiede: AgenDix GmbH: Equity Ownership, Research Funding; Illumina: Research Support, Research Support Other.
6

Clifford, Ruth M., Pauline Robbe, Susanne Weller, Adele T. Timbs, Michalis Titsias, Adam Burns, Maite Cabes, et al. "Towards Response Prediction Using Integrated Genomics in Chronic Lymphocytic Leukaemia: Results on 250 First-Line FCR Treated Patients from UK Clinical Trials." Blood 124, no. 21 (December 6, 2014): 1942. http://dx.doi.org/10.1182/blood.v124.21.1942.1942.

Abstract:
Background: Major progress has been made in understanding disease biology and therapeutic options for patients with chronic lymphocytic leukaemia (CLL). Recurrent mutations have been discovered using next generation sequencing, but with the exception of TP53 disruption their potential impact on response to treatment is unknown. In order to address this question, we characterised the genomic landscape of 250 first-line chemo-immunotherapy treated CLL patients within UK clinical trials using targeted resequencing and whole-genome SNP array. Methods: We studied patients from two UK-based Phase II randomised controlled trials (AdMIRe and ARCTIC) receiving FCR-based treatment in a first-line treatment setting. A TruSeq Custom Amplicon panel (TSCA, Illumina) was designed targeting 10 genes recurrently mutated in CLL based on recent publications. Average sequencing depth was 2260X. The cumulated length of targets sequenced was 7.87 kb from 330 amplicons covering 160 exons. Alignment and variant calling included a combination of three pipelines to confidently detect SNVs, indels and low level frequency mutations. SNP array testing was performed using HumanOmni2.5-8 BeadChips (Illumina) and data analysed using Nexus 6.1 Discovery Edition, Biodiscovery. We performed targeted resequencing and genome-wide SNP arrays using selected samples’ germline material to confirm somatic mutations (n=40). Univariate and multivariate analyses using minimal residual disease (MRD) as the outcome measure were performed for 220 of the 250 patients. Results: Pathogenic mutations were identified in 165 (66%) patients, totalling 268 mutations in 10 genes. ATM was the most frequently mutated gene affecting 67 patients (29%) followed by SF3B1 (n=56, 24%), NOTCH1 (n=32, 14%), TP53 (n=21, 9%), BIRC3 (n=17, 7%) and XPO1 (n=14, 6%). Less frequently recurrent mutations were seen in SAMHD1 (n=8, 3%), MYD88 (n=4, 2%), MED12 (n=7, 3%) and ZFPM2 (n=5, 2%).
Integrating sequencing and array results increased the patients with one or more CLL driver mutation from 66% to 94%. As previously reported, del17p and TP53 mutations are co-occurring and associate with MRD positivity in all cases (n=15, p=0.0002). We report on minor TP53 subclones in 11 patients (VAF 1-5%), 8 of whom have MRD data available and were also associated with MRD positivity. Deletions of 11q were present in 44 patients. These lesions always included ATM but not always BIRC3. Biallelic disruption was present in ATM for 27 patients (significantly associated with MRD positivity) and in BIRC3 for 4 patients. Rather surprisingly, trisomy 12 (n=33) and NOTCH1 mutations (n=28) were associated with MRD negativity (p=0.006 and 0.097, respectively). Analysing clonal and subclonal mutations per gene revealed the majority of mutations in SF3B1 and BIRC3 were subclonal (65% and 87% respectively). In contrast, almost all SAMHD1 and MYD88 mutations were clonally distributed. There was an association between NOTCH1 subclonal mutations and MRD negativity, compared to clonal mutations, but this difference was not seen in the remaining mutated genes. From our copy number data, the presence of subclones was associated with MRD positivity (p=0.05). Combining important lesions in a multiple logistic regression analysis to predict MRD positivity, biallelic ATM disruption, together with TP53 disruption, were the strongest predictors, followed by SAMHD1, whereas BIRC3 monoallelic mutations were a medium predictor for MRD negativity. Conclusion: This is the first integrated genome-wide analysis of the distribution and associations of CLL drivers, using targeted deep resequencing and whole genome SNP arrays in an FCR-based first-line treatment setting. We have shown subclonal and clonal mutation profiles in all patients. For patients with two or more CLL-associated mutations we have begun to unravel clonal hierarchies.
We have developed a comprehensive model using MRD as an outcome measure and have found biallelic ATM mutations and SAMHD1 disruption to strongly predict for MRD positivity. Using MRD status as a robust proxy for PFS not only enables us to confirm results of previous studies, but is advantageous also in considerably reducing the timeframe for results. Indeed, we suggest that MRD status should be assessed routinely in future studies to complement modern integrated genomics approaches. Disclosures Hillmen: Pharmacyclics, Janssen, Gilead, Roche: Honoraria, Research Funding.
7

Hamblin, Angela, Adam Burns, Christopher Tham, Ruth Clifford, Pauline Robbe, Adele Timbs, Joanne Mason, et al. "Development and Evaluation of the Clinical Utility of a Next Generation Sequencing (NGS) Tool for Myeloid Disorders." Blood 124, no. 21 (December 6, 2014): 2373. http://dx.doi.org/10.1182/blood.v124.21.2373.2373.

Abstract:
Background Historically, diagnosis and prognosis of myeloid disorders including acute myeloid leukemia (AML) have been determined using a combination of morphology, immunophenotype, cytogenetic and more recently single gene, if not single mutation, analysis. The introduction of NGS technology has resulted in an explosion in the quantity of mutation data available. However, the feasibility and utility of NGS technology with regard to decision-making in routine clinical practice of myeloid disorders is currently unknown. We therefore developed an advanced NGS tool for simultaneous assessment of multiple myeloid candidate genes from low amounts of input DNA and present clinical utility analysis below. Methods We designed a targeted resequencing assay using a TruSeq Custom Amplicon panel with the MiSeq platform (both Illumina) consisting of 341 amplicons (~56 kb) designed around exons of genes frequently mutated in myeloid malignancies (ASXL1, ATRX, CBL, CBLB, CBLC, CEBPA, CSF3R, DNMT3A, ETV6, EZH2, FLT3, HRAS, IDH1, IDH2, JAK2, KIT, KRAS, MPL, NPM1, NRAS, PDGFRA, PHF6, PTEN, RUNX1, SETBP1, SF3B1, SRSF2, TET2, TP53, U2AF1, WT1 & ZRSR2). Filtering, variant calling and annotation were performed using Basespace and Variant Studio (Illumina) with additional indel detection achieved using Pindel. A cohort of samples previously characterised with conventional techniques was used for validation and the lower limit of detection established using qPCR. Post-validation, DNA from 152 diagnostic blood or bone marrow samples from patients with confirmed or suspected myeloid disorders, both AML (n=46) and disorders with the potential to transform to AML, i.e. myelodysplasia (confirmed n=54, suspected n=10) and myeloproliferative neoplasms (n=42), was analysed using this assay.
To gather clinical utility data we developed a reporting algorithm to feed back information to clinicians; only those variants with a variant allele frequency (VAF) of >10% and described as acquired in publicly available databases were reported, with the exception of novel mutations predicted to result in a truncated protein. Further utility data were obtained using published mutation algorithms to determine the proportion of patients in whom mutation data altered prognosis. Results In the validation cohort, initial concordance for detection of clinically significant mutations was 88%, rising to 100% once Pindel was used to identify FLT3 ITDs. The lower limit of detection was 3% VAF, and mean amplicon coverage was 390 reads. Using our reporting algorithm, 66% of patients in the post-validation cohort had a suspected pathogenic mutation relevant to a myeloid disorder, rising to 74% in patients with confirmed diagnoses. The median number of reported variants per sample for all diagnoses was one (range 0-6). When mutation data for patients with AML with intermediate risk cytogenetics were analysed using the algorithm of Patel et al (N Engl J Med. 2012;366:1079-1089), 4/22 (18%) moved into another risk category. A further two patients had double CEBPA mutations, improving their prognosis. Identification of complex mutations in KIT exon 8 in 2/6 patients with core binding factor AML resulted in more intensive MRD monitoring due to the increased risk of relapse. Interpretation of mutation data for patients with confirmed myelodysplasia using the work of Bejar et al (N Engl J Med. 2011;364:2496-2506) revealed 13/54 (24%) had a high risk mutation independently associated with poor overall survival. 2/8 (25%) patients with chronic myelomonocytic leukemia and 1/12 (8.3%) patients with primary myelofibrosis had high risk ASXL1 exon 12 mutations, independently associated with a poor prognosis.
Among suspected diagnoses confirmatory mutations were found in 2/19 (11%), while the absence of mutations reduced the probability of myeloid disease in 11/19 (58%), in some cases sparing elderly patients invasive bone marrow sampling. A further 20 patients had clinically relevant mutations. Conclusions The NGS Myeloid Gene Panel provided extra information to clinicians in 57/152 patients (38%) helping inform diagnosis, individualize disease monitoring schedules and support treatment decisions. The targeted panel approach requires rigorous validation and standardisation in particular of bio-informatics pipelines, but can be adapted to incorporate new genes as their relevance is described and will become central to treatment decisions. Disclosures No relevant conflicts of interest to declare.
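The reporting rule described in this abstract (VAF above 10%, described as acquired in public databases, with an exception for novel truncating mutations) can be sketched as a small filter. This is only an illustration under stated assumptions: the abstract does not say whether the truncating exception also waives the VAF threshold, and this sketch assumes it waives only the database requirement. The Variant class, its field names and the example read counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    alt_reads: int          # reads supporting the variant allele
    total_reads: int        # total reads at the position
    known_acquired: bool    # listed as acquired in a public database (assumed flag)
    truncating: bool        # predicted to truncate the protein (assumed flag)

    @property
    def vaf(self) -> float:
        """Variant allele frequency as a fraction between 0 and 1."""
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def reportable(variant: Variant, min_vaf: float = 0.10) -> bool:
    """Apply the VAF threshold, then the database rule with its exception."""
    if variant.vaf <= min_vaf:
        return False
    return variant.known_acquired or variant.truncating


# Hypothetical calls at roughly the mean amplicon coverage quoted in the abstract.
calls = [
    Variant("TET2", 45, 390, known_acquired=True, truncating=False),   # VAF ~0.115
    Variant("ASXL1", 20, 390, known_acquired=True, truncating=False),  # VAF ~0.051
    Variant("RUNX1", 80, 400, known_acquired=False, truncating=True),  # novel, truncating
    Variant("KIT", 80, 400, known_acquired=False, truncating=False),   # novel, not truncating
]

print([v.gene for v in calls if reportable(v)])  # ['TET2', 'RUNX1']
```

Keeping the reporting threshold (10%) separate from the assay's lower limit of detection (3% VAF) means a variant can be detectable yet still not reported.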

Dissertations / Theses on the topic "Genotypability, targeted resequencing, variant calling"

1

Iadarola, Barbara. "Enhanced genotypability for a more accurate variant calling in targeted resequencing." Doctoral thesis, 2020. http://hdl.handle.net/11562/1019475.

Abstract:
The analysis of Next-Generation Sequencing (NGS) data for the identification of DNA genetic variants presents several bioinformatics challenges. The main requirements of the analysis are the accuracy and the reproducibility of results, as their clinical interpretation may be influenced by many variables, from the sample processing to the adopted bioinformatics algorithms. Targeted resequencing, whose aim is the enrichment of genomic regions to identify genetic variants possibly associated with clinical diseases, bases the quality of its data on the depth and uniformity of coverage, for the differentiation between true- and false-positive findings. Many variant callers have been developed to reach the best accuracy considering these metrics, but they cannot work in regions of the genome where short reads cannot align uniquely (uncallable regions). The misalignment of reads on the reference genome can arise when reads are too short to span repetitive regions of the genome, causing the software to assign a low mapping quality score to the read pairs of the same fragment. A limitation of this process is that variant callers are not able to call variants in these regions unless the mapping quality of one of the two read mates can be increased. Moreover, current metrics are not able to define these regions accurately, and so fail to provide this information to the end user. For this reason, a more accurate metric is needed to clearly report the uncallable genomic regions, with the prospect of improving the data analysis so that they can be investigated. This work aimed to improve the callability (genotypability) of the target regions for a more accurate data analysis and to provide a high-quality variant calling. Different experiments have been conducted to prove the relevance of genotypability for the evaluation of targeted resequencing performance.
Firstly, this metric showed that increasing the depth of sequencing to rescue variants is not necessary at thresholds where genotypability reaches saturation (70X). To improve this metric and to evaluate the accuracy and reproducibility of results on different enrichment technologies for WES sample processing, the genotypability was evaluated on four exome platforms using three different DNA fragment lengths (short: ~200, medium: ~350, long: ~500 bp). Results showed that mapping quality could be increased on all platforms by extending the fragment, hence increasing the distance between the read pairs. The genotypability of many genes, including several associated with a clinical phenotype, could strongly improve. Moreover, longer libraries increased uniformity of coverage for platforms that have not been completely optimized for short fragments, further improving their genotypability. Given the relevance of the quality of the data obtained, especially from the extension of the short fragments to the medium ones, a deeper investigation was performed to identify a potential threshold of fragment length above which the improvement in genotypability was significant. On the enrichment platform producing the highest enrichment uniformity (Twist), fragments above 230 bp obtained a meaningful improvement of genotypability (almost 1%) and a high uniformity of coverage of the target. Interestingly, the extension of the DNA fragment showed a greater influence on genotypability than uniformity of coverage alone. The enhancement of genotypability for a more accurate bioinformatics analysis of the target regions enabled, at limited cost (less sequencing), the investigation of regions of the genome previously defined as uncallable by current NGS methodologies.
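The abstract gives no formula for genotypability, so the following is a minimal sketch under an assumed definition: a target base counts as genotypable when at least min_depth reads with mapping quality of min_mapq or higher cover it. Both thresholds, the function name and the toy data are hypothetical and are not taken from the thesis. The sketch shows why raw depth alone can overstate how much of a target is callable.

```python
def genotypability(per_base_mapqs, min_mapq=20, min_depth=10):
    """Fraction of target bases covered by enough confidently mapped reads.

    per_base_mapqs holds, for each target base, the list of mapping
    qualities of the reads overlapping that base.
    """
    if not per_base_mapqs:
        raise ValueError("empty target")
    callable_bases = sum(
        1 for mapqs in per_base_mapqs
        if sum(q >= min_mapq for q in mapqs) >= min_depth
    )
    return callable_bases / len(per_base_mapqs)


# Toy target of four bases (hypothetical mapping qualities).
toy_target = [
    [60] * 30,             # deep, uniquely mapped: callable
    [0] * 30,              # deep, but reads map ambiguously: not callable
    [60] * 5,              # uniquely mapped, but too shallow: not callable
    [60] * 12 + [0] * 20,  # enough unique reads despite ambiguous ones: callable
]

raw_depth_ok = sum(len(m) >= 10 for m in toy_target) / len(toy_target)
print(raw_depth_ok)                # 0.75 by depth alone
print(genotypability(toy_target))  # 0.5 once mapping quality is required
```

In this toy case the second base has 30X coverage yet is not callable, which mirrors the abstract's point that adding depth does not rescue regions where reads cannot align uniquely.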