Journal articles: 'Sequence data'

1

Harris, T. J. R. "Sequence data." Nature 355, no. 6361 (February 1992): 581. http://dx.doi.org/10.1038/355581c0.

2

Kennard, Olga. "Sequence data." Nature 314, no. 6011 (April 1985): 492. http://dx.doi.org/10.1038/314492c0.

3

Song, Bosheng, Zimeng Li, Xuan Lin, Jianmin Wang, Tian Wang, and Xiangzheng Fu. "Pretraining model for biological sequence data." Briefings in Functional Genomics 20, no. 3 (May 2021): 181–95. http://dx.doi.org/10.1093/bfgp/elab025.

Abstract:

With the development of high-throughput sequencing technology, biological sequence data reflecting life information becomes increasingly accessible. Particularly on the background of the COVID-19 pandemic, biological sequence data play an important role in detecting diseases, analyzing the mechanism and discovering specific drugs. In recent years, pretraining models that have emerged in natural language processing have attracted widespread attention in many research fields not only to decrease training cost but also to improve performance on downstream tasks. Pretraining models are used for embedding biological sequence and extracting feature from large biological sequence corpus to comprehensively understand the biological sequence data. In this survey, we provide a broad review on pretraining models for biological sequence data. Moreover, we first introduce biological sequences and corresponding datasets, including brief description and accessible link. Subsequently, we systematically summarize popular pretraining models for biological sequences based on four categories: CNN, word2vec, LSTM and Transformer. Then, we present some applications with proposed pretraining models on downstream tasks to explain the role of pretraining models. Next, we provide a novel pretraining scheme for protein sequences and a multitask benchmark for protein pretraining models. Finally, we discuss the challenges and future directions in pretraining models for biological sequences.

4

Baxter, Catherine. "Sequence data wanted!" Nature Reviews Genetics 4, no. 3 (March 2003): 164. http://dx.doi.org/10.1038/nrg1040.

5

Biemann, Torsten, and Deepak K. Datta. "Analyzing Sequence Data." Organizational Research Methods 17, no. 1 (September 5, 2013): 51–76. http://dx.doi.org/10.1177/1094428113499408.

6

Schwager, Sabine, Dennis Rünger, Robert Gaschler, and Peter Frensch. "Data-driven sequence learning or search: What are the prerequisites for the generation of explicit sequence knowledge?" Advances in Cognitive Psychology 8, no. 2 (June 28, 2012): 132–43. http://dx.doi.org/10.5709/acp-0110-4.

7

Biswas, Abhishek, David T. Gauthier, Desh Ranjan, and Mohammad Zubair. "ISQuest: finding insertion sequences in prokaryotic sequence fragment data." Bioinformatics 31, no. 21 (June 27, 2015): 3406–12. http://dx.doi.org/10.1093/bioinformatics/btv388.

8

Oh, Seung-Joon, and Jae-Yearn Kim. "A Sequence-Element-Based Hierarchical Clustering Algorithm for Categorical Sequence Data." International Journal of Information Technology & Decision Making 04, no. 01 (March 2005): 81–96. http://dx.doi.org/10.1142/s0219622005001398.

Abstract:

Recently, there has been enormous growth in the amount of commercial and scientific data, such as protein sequences, retail transactions, and web-logs. Such datasets consist of sequence data that have an inherent sequential nature. However, few existing clustering algorithms consider sequentiality. In this paper, we study how to cluster these sequence datasets. We propose a new similarity measure to compute the similarity between two sequences. In the proposed measure, subsets of a sequence are considered, and the more identical subsets there are, the more similar the two sequences. In addition, we propose a hierarchical clustering algorithm and an efficient method for measuring similarity. Using a splice dataset and synthetic datasets, we show that the quality of clusters generated by our proposed approach is better than that of clusters produced by traditional clustering algorithms.

9

Hontzeas, Nikos, and Bernard R. Glick. "Manipulating DNA sequence data." Biotechnology Advances 19, no. 4 (July 2001): 319–20. http://dx.doi.org/10.1016/s0734-9750(01)00057-x.

10

Holmes, Edward C., Andrew J. Leigh Brown, and Peter Simmonds. "Sequence data as evidence." Nature 364, no. 6440 (August 1993): 766. http://dx.doi.org/10.1038/364766b0.

11

Kawahara, T., and K. Yamane. "Phylogeny of Aegilops and Triticum inferred from sequence data of cpDNA." Czech Journal of Genetics and Plant Breeding 41, Special Issue (July 31, 2012): 56. http://dx.doi.org/10.17221/6133-cjgpb.

12

K., Sahityabhilash. "Impact of Loss Function Using M-LSTM Classifier for Sequence Data." International Journal of Psychosocial Rehabilitation 24, no. 5 (April 20, 2020): 3487–94. http://dx.doi.org/10.37200/ijpr/v24i5/pr202059.

13

Tang, Qi, Guoli Ma, Weiming Zhang, and Nenghai Yu. "Reversible Data Hiding for DNA Sequences and Its Applications." International Journal of Digital Crime and Forensics 6, no. 4 (October 2014): 1–13. http://dx.doi.org/10.4018/ijdcf.2014100101.

Abstract:

As the blueprint of vital activities of most living things on earth, DNA has important status and must be protected perfectly. And in current DNA databases, each sequence is stored with several notes that help to describe that sequence. However, these notes have no contribution to the protection of sequences. In this paper, the authors propose a reversible data hiding method for DNA sequences, which could be used either to embed sequence-related annotations, or to detect and restore tampers. When embedding sequence annotations, the method works in low embedding rate mode. Only several bits of annotations are embedded. When used for tamper detection and tamper restoration, all possible embedding positions are utilized to assure the maximum restoration capacity.

14

Elkan, C. "Access to genetic sequence data." Science 255, no. 5045 (February 7, 1992): 663. http://dx.doi.org/10.1126/science.1738833.

15

Zhu, Yang-Yong. "DNA Sequence Data Mining Technique." Journal of Software 18, no. 11 (2007): 2766. http://dx.doi.org/10.1360/jos182766.

16

Harris, Nomi L. "Annotating Sequence Data Using Genotator." Molecular Biotechnology 16, no. 3 (2000): 221–32. http://dx.doi.org/10.1385/mb:16:3:221.

17

Hyman, R. W. "Sequence Data: Posted vs. Published." Science 291, no. 5505 (February 2, 2001): 827b–827. http://dx.doi.org/10.1126/science.291.5505.827b.

18

Hertz-Fowler, Christiane, and Arnab Pain. "Sequence data swell for nematodes." Nature Reviews Microbiology 6, no. 11 (November 2008): 800–801. http://dx.doi.org/10.1038/nrmicro2021.

19

Roberts, Richard J. "Sequence data: expand comprehensive access." Nature 591, no. 7849 (March 4, 2021): 202. http://dx.doi.org/10.1038/d41586-021-00575-1.

20

Holmes, J. Bradley, Eric Moyer, Lon Phan, Donna Maglott, and Brandi Kattman. "SPDI: data model for variants and applications at NCBI." Bioinformatics 36, no. 6 (November 18, 2019): 1902–7. http://dx.doi.org/10.1093/bioinformatics/btz856.

Abstract:

Motivation: Normalizing sequence variants on a reference, projecting them across congruent sequences and aggregating their diverse representations are critical to the elucidation of the genetic basis of disease and biological function. Inconsistent representation of variants among variant callers, local databases and tools result in discrepancies that complicate analysis. NCBI’s genetic variation resources, dbSNP and ClinVar, require a robust, scalable set of principles to manage asserted sequence variants. Results: The SPDI data model defines variants as a sequence of four attributes: sequence, position, deletion and insertion, and can be applied to nucleotide and protein variants. NCBI web services convert representations among HGVS, VCF and SPDI and provide two functions to aggregate variants. One, based on the NCBI Variant Overprecision Correction Algorithm, returns a unique, normalized representation termed the ‘Contextual Allele’. The SPDI data model, with its four operations, defines exactly the reference subsequence affected by the variant, even in repeat regions, such as homopolymer and other sequence repeats. The second function projects variants across congruent sequences and depends on an alignment dataset of non-assembly NCBI RefSeq sequences (prefixed NM, NR and NG), as well as inter- and intra-assembly-associated genomic sequences (NCs, NTs and NWs), supporting robust projection of variants across congruent sequences and assembly versions. The variant is projected to all congruent Contextual Alleles. One of these Contextual Alleles, typically the allele based on the latest assembly version, represents the entire set, is designated the unique ‘Canonical Allele’ and is used directly to aggregate variants across congruent sequences. Availability and implementation: The SPDI services are available for open access at: https://api.ncbi.nlm.nih.gov/variation/v0. Supplementary information: Supplementary data are available at Bioinformatics online.

21

Own, C., A. Bleloch, W. Lerach, C. Bowell, M. Hamalainen, J. Herschleb, C. Melville, J. Stark, M. Andregg, and W. Andregg. "First Nucleotide Sequence Data from an Electron Microscopy Based DNA Sequencer." Microscopy and Microanalysis 19, S2 (August 2013): 208–9. http://dx.doi.org/10.1017/s1431927613003036.

22

Zhang, Shu Lu, Dong Sheng Zhou, and Qiang Zhang. "Human Motion Capture Data Segmentation Based on LLE Algorithm." Applied Mechanics and Materials 538 (April 2014): 481–85. http://dx.doi.org/10.4028/www.scientific.net/amm.538.481.

Abstract:

In this paper, we propose the motion sequence segmentation based on LLE (Locally Linear Embedding) algorithm. The method is to reduce the dimension of the high dimension motion sequence to obtain one-dimension feature curve. Then we use the feature curve to achieve motion sequence segmentation. Simulation results demonstrate that this method can achieve motion sequences segmentation and improve the accuracy rate greatly compared with the traditional algorithm.

23

Hanage, William P., Tarja Kaijalainen, Elja Herva, Annika Saukkoriipi, Ritva Syrjänen, and Brian G. Spratt. "Using Multilocus Sequence Data To Define the Pneumococcus." Journal of Bacteriology 187, no. 17 (September 1, 2005): 6223–30. http://dx.doi.org/10.1128/jb.187.17.6223-6230.2005.

Abstract:

We investigated the genetic relationships between serotypeable pneumococci and nonserotypeable presumptive pneumococci using multilocus sequence typing (MLST) and partial sequencing of the pneumolysin gene (ply). Among 121 nonserotypeable presumptive pneumococci from Finland, we identified isolates of three classes: those with sequence types (STs) identical to those of serotypeable pneumococci, suggesting authentic pneumococci in which capsular expression had been downregulated or lost; isolates that clustered among serotypeable pneumococci on a tree based on the concatenated sequences of the MLST loci but which had STs that differed from those of serotypeable pneumococci in the MLST database; and a more diverse collection of isolates that did not cluster with serotypeable pneumococci. The latter isolates typically had sequences at all seven MLST loci that were 5 to 10% divergent from those of authentic pneumococci and also had distinct and divergent ply alleles. These isolates are proposed to be distinct from pneumococci but cannot be resolved from them by optochin susceptibility, bile solubility, or the presence of the ply gene. Complete resolution of pneumococci from the related but distinct population is problematic, as recombination between them was evident, and a few isolates of each population possessed alleles at one or occasionally more MLST loci from the other population. However, a tree based on the concatenated sequences of the MLST loci in most cases unambiguously distinguished whether a nonserotypeable isolate was or was not a pneumococcus, and the sequence of the ply gene fragment was found to be useful to resolve difficult cases.

24

Yao, Haichang, Yimu Ji, Kui Li, Shangdong Liu, Jing He, and Ruchuan Wang. "HRCM: An Efficient Hybrid Referential Compression Method for Genomic Big Data." BioMed Research International 2019 (November 16, 2019): 1–13. http://dx.doi.org/10.1155/2019/3108950.

Abstract:

With the maturity of genome sequencing technology, huge amounts of sequence reads as well as assembled genomes are generating. With the explosive growth of genomic data, the storage and transmission of genomic data are facing enormous challenges. FASTA, as one of the main storage formats for genome sequences, is widely used in the Gene Bank because it eases sequence analysis and gene research and is easy to be read. Many compression methods for FASTA genome sequences have been proposed, but they still have room for improvement. For example, the compression ratio and speed are not so high and robust enough, and memory consumption is not ideal, etc. Therefore, it is of great significance to improve the efficiency, robustness, and practicability of genomic data compression to reduce the storage and transmission cost of genomic data further and promote the research and development of genomic technology. In this manuscript, a hybrid referential compression method (HRCM) for FASTA genome sequences is proposed. HRCM is a lossless compression method able to compress single sequence as well as large collections of sequences. It is implemented through three stages: sequence information extraction, sequence information matching, and sequence information encoding. A large number of experiments fully evaluated the performance of HRCM. Experimental verification shows that HRCM is superior to the best-known methods in genome batch compression. Moreover, HRCM memory consumption is relatively low and can be deployed on standard PCs.

25

Yang, Chao, Zhongwen Guo, and Lintao Xian. "Time Series Data Prediction Based on Sequence to Sequence Model." IOP Conference Series: Materials Science and Engineering 692 (November 27, 2019): 012047. http://dx.doi.org/10.1088/1757-899x/692/1/012047.

26

Goggin, C. L., and L. J. Newman. "Use of molecular data to discriminate pseudocerotid turbellarians." Journal of Helminthology 70, no. 2 (June 1996): 123–26. http://dx.doi.org/10.1017/s0022149x00015261.

Abstract:

Nucleotide sequence data from the Internal Transcribed Spacer 1 (ITS1) in the ribosomal RNA (rRNA) gene cluster were used to determine the utility of molecular data to discriminate species and genera of pseudocerotid turbellarians. We sequenced 388, 379 and 415 bp from the ITS1 of Pseudoceros jebborum, Pseudoceros paralaticlavus and Pseudobiceros gratus respectively, to give an aligned sequence for this region of 421 positions. The nucleotide sequence of the ITS1 of Pseudoceros jebborum differed from that of Pseudoceros paralaticlavus by 6% (24/421 positions) and from that of Pseudobiceros gratus by 36% (152/421 positions). These results indicate that sequence data from the ITS1 will be a useful taxonomic tool to discriminate pseudocerotid flatworms.

27

Shi, Joel, John Culkin, and David Dinauer. "138-P: Proposal for reporting sequence data with group specific sequence primer and oligotyping data." Human Immunology 68, no. 1 (October 2007): S85. http://dx.doi.org/10.1016/j.humimm.2007.08.161.

28

Varaljay, Vanessa A., Erinn C. Howard, Shulei Sun, and Mary Ann Moran. "Deep Sequencing of a Dimethylsulfoniopropionate-Degrading Gene (dmdA) by Using PCR Primer Pairs Designed on the Basis of Marine Metagenomic Data." Applied and Environmental Microbiology 76, no. 2 (November 30, 2009): 609–17. http://dx.doi.org/10.1128/aem.01258-09.

Abstract:

In silico design and testing of environmental primer pairs with metagenomic data are beneficial for capturing a greater proportion of the natural sequence heterogeneity in microbial functional genes, as well as for understanding limitations of existing primer sets that were designed from more restricted sequence data. PCR primer pairs targeting 10 environmental clades and subclades of the dimethylsulfoniopropionate (DMSP) demethylase protein, DmdA, were designed using an iterative bioinformatic approach that took advantage of thousands of dmdA sequences captured in marine metagenomic data sets. Using the bioinformatically optimized primers, dmdA genes were amplified from composite free-living coastal bacterioplankton DNA (from 38 samples over 5 years and two locations) and sequenced using 454 technology. An average of 6,400 amplicons per primer pair represented more than 700 clusters of environmental dmdA sequences across all primers, with clusters defined conservatively at >90% nucleotide sequence identity (∼95% amino acid identity). Degenerate and inosine-based primers did not perform better than specific primer pairs in determining dmdA richness and sometimes captured a lower degree of richness of sequences from the same DNA sample. A comparison of dmdA sequences in free-living versus particle-associated bacteria in southeastern U.S. coastal waters showed that sequence richness in some dmdA subgroups differed significantly between size fractions, though most gene clusters were shared (52 to 91%) and most sequences were affiliated with the shared clusters (∼90%). The availability of metagenomic sequence data has significantly enhanced the design of quantitative PCR primer pairs for this key functional gene, providing robust access to the capabilities and activities of DMSP demethylating bacteria in situ.

29

Woodwark, K. Cara, Simon J. Hubbard, and Stephen G. Oliver. "Sequence Search Algorithms for Single Pass Sequence Identification: Does One Size Fit All?" Comparative and Functional Genomics 2, no. 1 (2001): 4–9. http://dx.doi.org/10.1002/cfg.61.

Abstract:

Bioinformatic tools have become essential to biologists in their quest to understand the vast quantities of sequence data, and now whole genomes, which are being produced at an ever increasing rate. Much of these sequence data are single-pass sequences, such as sample sequences from organisms closely related to other organisms of interest which have already been sequenced, or cDNAs or expressed sequence tags (ESTs). These single-pass sequences often contain errors, including frameshifts, which complicate the identification of homologues, especially at the protein level. Therefore, sequence searches with this type of data are often performed at the nucleotide level. The most commonly used sequence search algorithms for the identification of homologues are Washington University’s and the National Center for Biotechnology Information’s (NCBI) versions of the BLAST suites of tools, which are to be found on websites all over the world. The work reported here examines the use of these tools for comparing sample sequence datasets to a known genome. It shows that care must be taken when choosing the parameters to use with the BLAST algorithms. NCBI’s version of gapped BLASTn gives much shorter, and sometimes different, top alignments to those found using Washington University’s version of BLASTn (which also allows for gaps), when both are used with their default parameters. Most of the differences in performance were found to be due to the choices of default parameters rather than underlying differences between the two algorithms. Washington University’s version, used with defaults, compares very favourably with the results obtained using the accurate but computationally intensive Smith–Waterman algorithm.

30

Wu, Fone-Mao, and Peter M. Muriana. "Cloning, Sequencing, and Characterization of Genomic Subtracted Sequences from Listeria monocytogenes." Applied and Environmental Microbiology 65, no. 12 (December 1, 1999): 5427–30. http://dx.doi.org/10.1128/aem.65.12.5427-5430.1999.

Abstract:

Individual sequences of a genomic subtracted, PCR-amplified, mixed-sequence probe (GS probe) were cloned and sequenced. The GS probe differentiated restriction fragment length polymorphism patterns for Listeria monocytogenes but did not hybridize with members of other bacterial genera. Sequence analysis identified several L. monocytogenes sequences already present in the GenBank database; the putative identities of other sequences were inferred from homology data, and still other sequences did not exhibit significant levels of homology with any GenBank sequences.

31

Maia, Vitor H., Matthew A. Gitzendanner, Pamela S. Soltis, Gane Ka-Shu Wong, and Douglas E. Soltis. "Angiosperm Phylogeny Based on 18S/26S rDNA Sequence Data: Constructing a Large Data Set Using Next-Generation Sequence Data." International Journal of Plant Sciences 175, no. 6 (July 2014): 613–50. http://dx.doi.org/10.1086/676675.

32

Matarangas, D., and V. Skourtsis-Coroneou. "Stratigraphical data from a metamorphic sequence of the North Sporades (Pelagonian zone, Greece)." Neues Jahrbuch für Geologie und Paläontologie - Monatshefte 1989, no. 3 (March 29, 1989): 182–92. http://dx.doi.org/10.1127/njgpm/1989/1989/182.

33

Hilger, Hartmut H., and Nadja Diane. "A systematic analysis of Heliotropiaceae (Boraginales) based on trnL and ITS1 sequence data." Botanische Jahrbücher für Systematik, Pflanzengeschichte und Pflanzengeographie 125, no. 1 (December 17, 2003): 19–51. http://dx.doi.org/10.1127/0006-8152/2003/0125-0019.

34

Koenemann, Stefan, and Frederick R. Schram. "The limitations of ontogenetic data in phylogenetic analyses." Contributions to Zoology 71, no. 1-3 (2002): 47–65. http://dx.doi.org/10.1163/18759866-0710103005.

Abstract:

The analysis of consecutive ontogenetic stages, or events, introduces a new class of data to phylogenetic systematics that are distinctly different from traditional morphological characters and molecular sequence data. Ontogenetic event sequences are distinguished by varying degrees of both a collective and linear type of dependence and, therefore, violate the criterion of character independence. We applied different methods of phylogenetic reconstruction to ontogenetic data including maximum parsimony and distance (cluster) analyses. Two different data sets were investigated: (1) four simulated ontogenies with defined phylogenies of six hypothetical taxa, and (2) a set of “real” data comprising sequences of 29 ontogenetic events from 11 vertebrate taxa. We confirm that heterochronic event sequences do contain a phylogenetic signal. However, based on our results we argue that maximum parsimony is a biased method to analyze such developmental sequence data. Ontogenetic events require a special analytical algorithm that would not neglect instances of chronological (horizontal) dependence of this type of data. One coding method, “event-pairing”, appeared to fulfill this requirement in the vertebrate analyses. However, to accurately analyze ontogenetic sequence data, a more sophisticated coding method and algorithm are needed, for example, measuring distances of dependent events.

35

Brandon, M. C., D. C. Wallace, and P. Baldi. "Data structures and compression algorithms for genomic sequence data." Bioinformatics 25, no. 14 (May 15, 2009): 1731–38. http://dx.doi.org/10.1093/bioinformatics/btp319.

36

Yang, Tae-Jin, Jung-Sun Kim, Ki-Byung Lim, Soo-Jin Kwon, Jin-A. Kim, Mina Jin, Jee Young Park, et al. "The Korea Brassica Genome Project: a Glimpse of the Brassica Genome Based on Comparative Genome Analysis With Arabidopsis." Comparative and Functional Genomics 6, no. 3 (2005): 138–46. http://dx.doi.org/10.1002/cfg.465.

Abstract:

A complete genome sequence provides unlimited information in the sequenced organism as well as in related taxa. According to the guidance of the Multinational Brassica Genome Project (MBGP), the Korea Brassica Genome Project (KBGP) is sequencing chromosome 1 (cytogenetically oriented chromosome #1) of Brassica rapa. We have selected 48 seed BACs on chromosome 1 using EST genetic markers and FISH analyses. Among them, 30 BAC clones have been sequenced and 18 are on the way. Comparative genome analyses of the EST sequences and sequenced BAC clones from Brassica chromosome 1 revealed their homeologous partner regions on the Arabidopsis genome and a syntenic comparative map between Brassica chromosome 1 and Arabidopsis chromosomes. In silico chromosome walking and clone validation have been successfully applied to extending sequence contigs based on the comparative map and BAC end sequences. In addition, we have defined the (peri)centromeric heterochromatin blocks with centromeric tandem repeats, rDNA and centromeric retrotransposons. In-depth sequence analyses of five homeologous BAC clones and an Arabidopsis chromosomal region reveal overall co-linearity, with 82% sequence similarity. The data indicate that the Brassica genome has undergone triplication and subsequent gene losses after the divergence of Arabidopsis and Brassica. Based on in-depth comparative genome analyses, we propose a comparative genomics approach for conquering the Brassica genome. In 2005 we intend to construct an integrated physical map, including sequence information from 500 BAC clones and integration of fingerprinting data and end sequence data of more than 100 000 BAC clones.
The sequences have been submitted to GenBank with accession numbers: 10 204 BAC ends of the KBrH library (CW978640–CW988843); KBrH138P04, AC155338; KBrH117N09, AC155337; KBrH097M21, AC155348; KBrH093K03, AC155347; KBrH081N08, AC155346; KBrH080L24, AC155345; KBrH077A05, AC155343; KBrH020D15, AC155340; KBrH015H17, AC155339; KBrH001H24, AC155335; KBrH080A08, AC155344; KBrH004D11, AC155341; KBrH117M18, AC146875; KBrH052O08, AC155342.

37

Zhang, Shao Xuan, Xin Rui Liu, Bo Chuan Wang, Yun Hui Ling, De Jun Sun, and Guang Zhu Lin. "Comparison of the ITS Sequences of 5 Common Potentilla Species in Jilin Province of China." Advanced Materials Research 554-556 (July 2012): 1690–93. http://dx.doi.org/10.4028/www.scientific.net/amr.554-556.1690.

Abstract:

To find the differences in the internal transcribed spacer (ITS) sequences and provide scientific data for the authentication of Potentilla chinensis and its related species, we extracted the genome DNA from the leaves of 5 common Potentilla species in Jilin Province, amplified the ITS region using ITS universal primers of angiosperm, and sequenced the purified PCR products directly. Polymorphism of ITS sequences was found within P. chinensis and the sequence data suggested that our samples of this species might be related to hybridization. The other 4 species showed intraspecies stability in ITS sequence. The ITS sequences of these 5 Potentilla species are significantly different. So ITS sequence analysis and other methods derived from it can be used in authentication of Potentilla.

38

Huyen, Do Thi, Nguyen Minh Giang, Nguyen Thu Nguyet, and Truong Nam Hai. "Probe design for mining and selection of genes coding endo 1- 4 xylanase from DNA metagenome data." TAP CHI SINH HOC 40, no. 1 (January 25, 2018): 39–50. http://dx.doi.org/10.15625/0866-7160/v40n1.9200.

Abstract:

According to the CAZy classification, endo 1- 4 xylanase belongs to GH 5, 8, 10, 11, 30, 51, 98. However, only 3 sequences of GH8, 27 sequences of GH10, 18 sequences of GH11, and one sequence each of GH30 and GH51 from the CAZy and NCBI databases were thoroughly experimentally studied for biological activity and characteristics of the enzyme. From the collected sequences, two probes for endo 1- 4 xylanase of GH10 and GH11 were designed, based on the sequence homology. The GH10 probe was 338 amino acids in length and contained all the conserved amino acid residues (16 residues conserved in all sequences, 13 residues similar in almost all sequences, 14 residues conserved in many sequences), with the lowest max score of 189, coverage of 88% and identity of 39%. The GH11 probe was 204 amino acids in length and contained all the conserved amino acid residues (54 residues identical in all sequences, 25 residues similar in almost all sequences, 24 residues conserved in many sequences), with the lowest max score of 165, coverage of 84% and identity of 50%. Using the two probes, we mined only one sequence (GL0018509) for endo 1- 4 xylanase from metagenomic DNA data of free-living bacteria in Coptotermes termite gut. Prediction of the three-dimensional structure of the GL0018509 sequence by Phyre2 and Swiss Prot showed that this sequence was highly similar (95% by Phyre2 and 93.4% by Swiss Prot) to endo 1- 4 xylanase, with 100% confidence.

39

Margos, Gabriele, Volker Fingerle, Andreas Sing, and Keith Jolley. "Sequence data management for scientific purposes." Infection, Genetics and Evolution 54 (October 2017): 508. http://dx.doi.org/10.1016/j.meegid.2017.06.030.

40

Pandey, Subhash Chandra, and Saket Kumar Singh. "DNA sequence based data classification technique." CSI Transactions on ICT 3, no. 1 (March 2015): 59–69. http://dx.doi.org/10.1007/s40012-015-0072-x.

41

AL-Rawi, Muhanned, and Muaayed AL-Rawi. "Combined Detector with Retraining Data Sequence." Technological Engineering 15, no. 1 (October 1, 2018): 47–50. http://dx.doi.org/10.1515/teen-2018-0009.

Abstract:

Two detectors are presented in this paper which are used to handle intersymbol interference introduced by the communication channels. These two detectors are based on combination of nonlinear equalizer and Viterbi detector. The first detector, which was previously developed, is named Combined Detector-1 (CDR1), while the second detector, which is the contribution of this paper, is named Combined Detector-2 (CDR2). CDR2 is similar to CDR1 but with retraining data sequence. These detectors are tested beside nonlinear equalizer using data transmission at 9.6 kb/s over telephone channel. Simulation results show that the performance of CDR2 is better than the performance of CDR1 while the performance of CDR1 is better than the performance of nonlinear equalizer.

42

Li, H. "Logging Data High-Resolution Sequence Stratigraphy." Journal of China University of Geosciences 17, no. 2 (June 2006): 173–80. http://dx.doi.org/10.1016/s1002-0705(06)60025-3.

43

Zhu, Z., Y. Zhang, Z. Ji, S. He, and X. Yang. "High-throughput DNA sequence data compression." Briefings in Bioinformatics 16, no. 1 (December 3, 2013): 1–15. http://dx.doi.org/10.1093/bib/bbt087.

44

Chatterjee, Samprit, and B. S. Weir. "Statistical Analysis of DNA Sequence Data." Journal of the American Statistical Association 80, no. 390 (June 1985): 495. http://dx.doi.org/10.2307/2287952.

45

Bai, Ran, Wing Kai Hon, Eric Lo, Zhian He, and Kenny Zhu. "Historic Moments Discovery in Sequence Data." ACM Transactions on Database Systems 44, no. 1 (January 29, 2019): 1–33. http://dx.doi.org/10.1145/3276975.

46

Carr, Ian M., Sanjeev Bhaskar, James O’Sullivan, Mohammed A. Aldahmesh, Hanan E. Shamseldin, Alexander F. Markham, David T. Bonthron, Graeme Black, and Fowzan S. Alkuraya. "Autozygosity Mapping with Exome Sequence Data." Human Mutation 34, no. 1 (October 22, 2012): 50–56. http://dx.doi.org/10.1002/humu.22220.

47

Rubin, Benjamin E. R., Richard H. Ree, and Corrie S. Moreau. "Inferring Phylogenies from RAD Sequence Data." PLoS ONE 7, no. 4 (April 6, 2012): e33394. http://dx.doi.org/10.1371/journal.pone.0033394.

48

Bell, E. "Publication Rights for Sequence Data Producers." Science 290, no. 5497 (December 1, 2000): 1696b–1698. http://dx.doi.org/10.1126/science.290.5497.1696b.

49

States, D. J. "Describing the Release of Sequence Data." Science 292, no. 5519 (May 11, 2001): 1066–67. http://dx.doi.org/10.1126/science.292.5519.1066.

50

Barbieri, Nicola, Giuseppe Manco, Ettore Ritacco, Marco Carnuccio, and Antonio Bevacqua. "Probabilistic topic models for sequence data." Machine Learning 93, no. 1 (July 3, 2013): 5–29. http://dx.doi.org/10.1007/s10994-013-5391-2.
