Dissertations / Theses on the topic 'Sequence data'
Chui, Chun-kit, and 崔俊傑. "OLAP on sequence data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2010. http://hub.hku.hk/bib/B45823996.
Pray, Keith A. "Apriori Sets And Sequences: Mining Association Rules from Time Sequence Attributes." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0506104-150831/.
Keywords: mining complex data; temporal association rules; computer system performance; stock market analysis; sleep disorder data. Includes bibliographical references (p. 79-85).
Brine, A. "Direct sequence data transmission systems." Thesis, University of Kent, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.379274.
Zhang, Minghua, and 張明華. "Sequence mining algorithms." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B44570119.
Ibeh, Neke. "Inferring Viral Dynamics from Sequence Data." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/35317.
Parsons, Jeremy David. "Computer analysis of molecular sequences." Thesis, University of Cambridge, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.282922.
Hamby, Stephen Edward. "Data mining techniques for protein sequence analysis." Thesis, University of Nottingham, 2010. http://eprints.nottingham.ac.uk/11498/.
Chung, Jimmy Hok Leung. "Application of sequence prediction to data compression." Thesis, Manchester Metropolitan University, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.322411.
Maydt, Jochen. "Analysis of recombination in molecular sequence data." Aachen Shaker, 2008. http://d-nb.info/993318045/04.
Layton, Martin Ian. "Augmented statistical models for classifying sequence data." Thesis, University of Cambridge, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.613094.
Ragonnet-Cronin, Manon Lily. "Transmission networks inferred from HIV sequence data." Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/16151.
Chan, Wing-yan Sarah, and 陳詠欣. "Emerging substrings for sequence classification." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2003. http://hub.hku.hk/bib/B2971672X.
Pustułka-Hunt, Elżbieta Katarzyna. "Biological sequence indexing using persistent Java." Thesis, University of Glasgow, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.270957.
Huang, Tzu-Kuo. "Exploiting Non-Sequence Data in Dynamic Model Learning." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/561.
Parry-Smith, David John. "Algorithms and data structures for protein sequence analysis." Thesis, University of Leeds, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.277404.
Pei, Shermin. "Identification of functional RNA structures in sequence data." Thesis, Boston College, 2016. http://hdl.handle.net/2345/bc-ir:107275.
Thesis advisor: Peter Clote
Structured RNAs have many biological functions ranging from catalysis of chemical reactions to gene regulation. Many of these homologous structured RNAs display most of their conservation at the secondary or tertiary structure level. As a result, strategies for natural structured RNA discovery rely heavily on identification of sequences sharing a common stable secondary structure. However, correctly identifying the functional elements of the structure continues to be challenging. In addition to studying natural RNAs, we improve our ability to distinguish functional elements by studying sequences derived from in vitro selection experiments to select structured RNAs that bind specific proteins. In this thesis, we seek to improve methods for distinguishing functional RNA structures from arbitrarily predicted structures in sequencing data. To do so, we developed novel algorithms that prioritize the structural properties of the RNA that are under selection. In order to identify natural structured ncRNAs, we bring concepts from evolutionary biology to bear on the de novo RNA discovery process. Since there is selective pressure to maintain the structure, we apply molecular evolution concepts such as neutrality to identify functional RNA structures. We hypothesize that alignments corresponding to structured RNAs should consist of neutral sequences. During the course of this work, we developed a novel measure of neutrality, the structure ensemble neutrality (SEN), which calculates neutrality by averaging the magnitude of structure retained over all single point mutations to a given sequence. In order to analyze in vitro selection data for RNA-protein binding motifs, we developed a novel framework that identifies enriched substructures in the sequence pool. Our method accounts for both sequence and structure components by abstracting the overall secondary structure into smaller substructures composed of a single base-pair stack. 
Unlike many current tools, our algorithm is designed to deal with the large data sets coming from high-throughput sequencing. In conclusion, our algorithms have similar performance to existing programs. However, unlike previous methods, our algorithms are designed to leverage the evolutionary selective pressures in order to emphasize functional structure conservation.
Thesis (PhD) — Boston College, 2016
Submitted to: Boston College. Graduate School of Arts and Sciences
Discipline: Biology
Swenson, Hugo. "Detection of artefacts in FFPE-sample sequence data." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-392623.
Winarko, Edi. "The Discovery and Retrieval of Temporal Rules in Interval Sequence Data." Flinders University. Informatics and Engineering, 2007. http://catalogue.flinders.edu.au./local/adt/public/adt-SFU20080107.164033.
Harshbarger, Stuart D. "Measured noise performance of a data clock circuit derived from the local M-sequence in direct-sequence spread spectrum systems." Thesis, Monterey, California : Naval Postgraduate School, 1990. http://handle.dtic.mil/100.2/ADA238335.
Thesis Advisor(s): Myers, Glen. Second Reader: Ha, Tri. "September 1990." Description based on title screen as viewed on December 21, 2009. DTIC Identifiers: Direct sequence spread spectrum, data clocks, delay lock loops, sequence generators. Author(s) subject terms: Direct-sequence spread spectrum, communications, data clock recovery, M-sequence, delay-lock loop, spread spectrum, binary sequence generation. Includes bibliographical references (p. 40). Also available in print.
Chen, Liangzhe. "Segmenting, Summarizing and Predicting Data Sequences." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/83573.
Ph. D.
Maydt, Jochen [Verfasser]. "Analysis of Recombination in Molecular Sequence Data / Jochen Maydt." Aachen : Shaker, 2009. http://d-nb.info/1126378321/34.
Myers, Simon R. "The detection of recombination events using DNA sequence data." Thesis, University of Oxford, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.289117.
Henderson, Daniel Adrian. "Modelling and analysis of non-coding DNA sequence data." Thesis, University of Newcastle Upon Tyne, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.299427.
Frusher, Marie J. "Predicting protein-protein interactions from sequence and structure data." Thesis, University of Essex, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412108.
Menke, Matthew Ewald 1978. "Predicting the beta-trefoil fold from protein sequence data." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/30093.
Includes bibliographical references (p. 45-47).
A method is presented that uses β-strand interactions at both the sequence and the atomic level to predict the β-structural motifs in protein sequences. A program called Wrap-and-Pack implements this method, and is shown to recognize β-trefoils, an important class of globular β-structures, in the Protein Data Bank with 92% specificity and 92.3% sensitivity in cross-validation. It is demonstrated that Wrap-and-Pack learns each of the ten known SCOP β-trefoil families, when trained primarily on β-structures that are not β-trefoils, together with 3D structures of known β-trefoils from outside the family. Wrap-and-Pack also predicts many proteins of unknown structure to be β-trefoils. The computational method used here may generalize to other β-structures for which strand topology and profiles of residue accessibility are well conserved.
by Matthew Ewald Menke.
S.M.
Powell, David Richard 1973. "Algorithms for sequence alignment." Monash University, School of Computer Science and Software Engineering, 2001. http://arrow.monash.edu.au/hdl/1959.1/8051.
Tang, Fung Michael, and 鄧峰. "Sequence classification and melody tracks selection." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B29742973.
Tang, Fung Michael. "Sequence classification and melody tracks selection /." Hong Kong : University of Hong Kong, 2001. http://sunzi.lib.hku.hk/hkuto/record.jsp?B25017470.
Ho, Ngai-lam, and 何毅林. "Algorithms on constrained sequence alignment." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B30201949.
Hung, Rong-I. "Computational studies of protein sequence and structure." Thesis, University of Oxford, 1999. http://ora.ox.ac.uk/objects/uuid:9905c946-86dd-4bb3-8824-7c50df136913.
Liu, Kai. "Detecting stochastic motifs in network and sequence data for human behavior analysis." HKBU Institutional Repository, 2014. https://repository.hkbu.edu.hk/etd_oa/60.
Fritz, Markus Hsi-Yang. "Exploiting high throughput DNA sequencing data for genomic analysis." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610819.
Peng, Yu, and 彭煜. "Iterative de Bruijn graph assemblers for second-generation sequencing reads." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B50534051.
Computer Science
Doctoral
Doctor of Philosophy
Ozarar, Mert. "Prediction Of Protein Subcellular Localization Based On Primary Sequence Data." Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/1082320/index.pdf.
Shaolong, Chen. "Efficient data management strategies for sequence alignment on heterogeneous clusters." Doctoral thesis, Universitat Autònoma de Barcelona, 2019. http://hdl.handle.net/10803/667227.
Among high-performance computing systems, the Intel Xeon Phi is an accelerator that turns out to be a very attractive alternative for improving the performance of applications with intense computing needs that are traditionally executed on systems based on multicore servers. These applications can be migrated from a multicore server to an accelerator with low coding effort because both systems are based on cores with the same basic architecture. In our study, we focused our attention on BWA, one of the most popular sequence aligners, and we have analyzed different modes of execution of BWA in various heterogeneous computing systems that incorporate an accelerator. The alignment of sequences is a fundamental phase in the analysis of genomic variants and has a high computational cost. Although coding it to run on a multicore system can be simple, achieving good performance is not easy in this type of system, as our results show. We have developed and evaluated different strategies that have been applied to BWA and, of all of them, we conclude that the MDPR variant, which combines data parallelization and data replication, is the one that provides the best results in all systems evaluated. MDPR has a generic design that allows it to be used in different heterogeneous systems. On the one hand, we have applied it in a system consisting of a server with Intel Xeon multicore processors and a Xeon Phi accelerator. On the other hand, we have also evaluated it in other heterogeneous systems based on multicore servers equipped with AMD and Intel processors. In all these hardware configurations, we have tested two dynamic modes and one static mode of data distribution in MDPR. Our experimental results show that the best results for MDPR are obtained when the static mode of data distribution is applied. The dynamic strategy based on round robin achieves a similar performance without the off-line overhead incurred by the static mode.
Although our proposal was applied to BWA using human genome data samples, this strategy can be easily applied to other sequence data and other alignment tools that have operating principles similar to those of the BWA aligner.
Raza, Atif [Verfasser]. "Metaheuristics for Pattern Mining in Big Sequence Data / Atif Raza." Mainz : Universitätsbibliothek der Johannes Gutenberg-Universität Mainz, 2021. http://d-nb.info/1231992875/34.
Scanlon, Eben Louis 1974. "Predicting the triple beta-spiral fold from primary sequence data." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/16617.
Includes bibliographical references (leaves 118-125).
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
The Triple β-Spiral is a novel protein structure that plays a role in viral attachment and pathogenesis. At present, there are two Triple β-Spiral structures with solved crystallographic coordinates - one from Adenovirus and the other from Reovirus. There is evidence that the fold also occurs in Bacteriophage SF6. In this thesis, we present a computational analysis of the Triple β-Spiral fold. Our goal is to discover new instances of the fold in protein sequence databases. In Chapter 2, we present a series of sequence-based methods for the discovery of the fold. The final method in this Chapter is an iterative profile-based search that outperforms existing sequence-based algorithms. In Chapter 3, we introduce specific knowledge of the protein's structure into our prediction algorithms. Although this additional information does not improve the profile-based methods in Chapter 2, it does provide insight into the important forces that drive the Triple β-Spiral folding process. In Chapter 4, we employ logistic regression to integrate the score information from the previous Chapter into a single unified framework. This framework outperforms all previous methods in cross-validation tests. We do not discover a great number of additional instances of the Triple β-Spiral fold outside of the Adenovirus and Reovirus families. The results of our profile based templates and score integration tools, however, suggest that these methods might well succeed for other protein structures.
by Eben Louis Scanlon.
M.B.A.
S.M.
Simpson, Jared Thomas. "Efficient sequence assembly and variant calling using compressed data structures." Thesis, University of Cambridge, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.607828.
Szalay, Tamas. "Improved Analysis of Nanopore Sequence Data and Scanning Nanopore Techniques." Thesis, Harvard University, 2016. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493548.
Engineering and Applied Sciences - Applied Physics
Di, Nardo Antonello. "Phylodynamic modelling of foot-and-mouth disease virus sequence data." Thesis, University of Glasgow, 2016. http://theses.gla.ac.uk/7558/.
Shrestha, Ram Krishna. "Management and analysis of HIV-1 ultra-deep sequence data." University of the Western Cape, 2014. http://hdl.handle.net/11394/8466.
The continued success of antiretroviral programmes in the treatment of HIV is dependent on access to a cost-effective HIV drug resistance test (HIV-DRT). HIV-DRT involves sequencing a fragment of the HIV genome and characterising the presence/absence of mutations that confer resistance to one or more drugs. HIV-DRT using conventional DNA sequencing is prohibitively expensive (~US$150 per patient) for routine use in resource-limited settings such as many African countries. While the advent of ultra deep pyrosequencing (UDPS) approaches has considerably reduced (3-5 fold reduction) the cost of generating the sequence data, there has been an even more significant increase in the volume of data generated and the complexity involved in its analysis. In order to address this issue we have developed Seq2Res, a computational pipeline for HIV drug resistance testing from UDPS genotypic data. We have developed QTrim, software that undertakes high throughput quality trimming of UDPS sequencing data to ensure that subsequently analyzed data is of high quality. The comparison of QTrim to other widely used tools showed that it is equivalent to the next best method at trimming good quality data but outperforms all methods at trimming poor quality data. Further, we have developed, and evaluated, a computational approach for the analysis of UDPS sequence data generated using the novel Primer ID that enables the generation of a consensus sequence from all sequence reads originating from the same viral template, thus reducing the presence of PCR and sequencing induced errors in the dataset as well as reducing. We see that while the Primer ID approach does undoubtedly reduce the prevalence of PCR and sequencing induced errors, it artificially reduces the diversity of the subsequently analysed data due to the large volume of data that is discarded as a result of there being an insufficient number of sequences for consensus sequence generation.
We validated the sensitivity of the Seq2Res pipeline using two real biological datasets from the Stanford HIV Database and five simulated datasets. The Seq2Res results correlated fully with those of the Stanford database, and Seq2Res also identified a drug resistance mutation (DRM) that had been incorrectly interpreted by the Stanford approach. Further, the analysis of the simulated datasets showed that Seq2Res is capable of accurately identifying DRMs at all prevalence levels down to at least 1% of the sequence data generated from a viral population. Finally, we applied Seq2Res to UDPS resistance data generated from as many as 641 individuals as part of the CIPRA-SA study to evaluate the effectiveness of UDPS HIV drug resistance genotyping in resource-limited settings with a high burden of HIV infections. We find that, despite the FLX coverage being almost three times as much as that of the Junior platform, resistance genotyping results are directly comparable between both of the approaches at a range of prevalence levels to as low as 1%. Further, we find no significant difference between UDPS sequencing and the "gold standard" Sanger based approach, thus indicating that pooling as many as 48 patients' data and sequencing using the Roche/454 Junior platform is a viable approach for HIV drug resistance genotyping. Further, we explored the presence of resistant minor variants in individuals' viral populations and find that the identification of minor resistant variants in individuals exposed to nevirapine through PMTCT correlates with the time since exposure. We conclude that HIV resistance genotyping is now a viable prospect for resource-limited settings with a high burden of HIV infections and that UDPS approaches are at least as sensitive as the currently used Sanger-based sequencing approaches. Further, the development of Seq2Res has provided a sensitive, easy to use and scalable technology that facilitates the routine use of UDPS for HIV drug resistance genotyping.
Thorell, Stina. "The transaldolase family : structure, function and evolution /." Stockholm, 2001. http://diss.kib.ki.se/2001/91-628-4923-9/.
Li, Yaoman, and 李耀满. "Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013. http://hdl.handle.net/10722/195977.
Computer Science
Master
Master of Philosophy
Wang, Yi, and 王毅. "Binning and annotation for metagenomic next-generation sequencing reads." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/208040.
Computer Science
Doctoral
Doctor of Philosophy
Stapert, R. P. "A segmental mixture model, maximising data use with time sequence information." Thesis, Swansea University, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.639099.
Zhang, Qi Wang Wei. "Mining emerging massive scientific sequence data using block-wise decomposition methods." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2009. http://dc.lib.unc.edu/u?/etd,2530.
Title from electronic title page (viewed Oct. 5, 2009). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science." Discipline: Computer Science; Department/School: Computer Science.
Mumpower, Eric J. P. "FITSL : a language for directed exploration and analysis of sequence data." Thesis, Massachusetts Institute of Technology, 2007. http://hdl.handle.net/1721.1/41653.
Includes bibliographical references (p. 81-84).
This thesis describes a sequence-data processing toolkit for analysis of Intelligent Tutoring System (ITS) log data, that unlike other tools allows directed exploration of sequence patterns. This system provides a powerful yet straightforward abstraction for sequence-data processing, and a set of high-level manipulation primitives which allow arbitrarily complex transformations of such data. Using this language, very sophisticated queries can be performed using only a few lines of code. Furthermore, queries can be constructed interactively, allowing for rapid development, refinement, and comparison of hypotheses. Importantly, this system is not limited to ITS logs, but is equally applicable to the manipulation of any form of (potentially multidimensional) sequence data.
by Eric J.P. Mumpower.
M.Eng.
Svärd, Karl. "Developing new methods for estimating population divergence times from sequence data." Thesis, Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-450123.
Bajalan, Amanj. "Improved methods for virus detection and discovery in metagenomic sequence data." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-412478.
Álvarez-Carretero, Sandra. "BACTpipe : Characterization of bacterial isolates based on whole-genome sequence data." Thesis, Högskolan i Skövde, Institutionen för biovetenskap, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-15033.