Dissertations / Theses on the topic 'Structured data'
Consult the top 50 dissertations / theses for your research on the topic 'Structured data.'
Amornsinlaphachai, Pensri. "Updating semi-structured data." Thesis, Northumbria University, 2007. http://nrl.northumbria.ac.uk/3422/.
Yang, Lei. "Querying Graph Structured Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=case1410434109.
Al-Wasil, Fahad M. "Querying distributed heterogeneous structured and semi-structured data sources." Thesis, Cardiff University, 2007. http://orca.cf.ac.uk/56144/.
Su, Wei. "Motif Mining On Structured And Semi-structured Biological Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=case1365089538.
Tripney, Brian Grieve. "Data value storage for compressed semi-structured data." Thesis, University of Strathclyde, 2012. http://oleg.lib.strath.ac.uk:80/R/?func=dbin-jump-full&object_id=18962.
Mintram, Robert C. "Vector representations of structured data." Thesis, Southampton Solent University, 2002. http://ssudl.solent.ac.uk/624/.
Zhang, Chiyuan. "Deep learning and structured data." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/115643.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 135-150).
In recent years, deep learning has been applied successfully in many domains, such as visual object recognition, detection and segmentation, automatic speech recognition, natural language processing, and reinforcement learning. In this thesis, we investigate deep learning from a spectrum of perspectives. First, we study the question of generalization, one of the most fundamental notions in machine learning theory. We show how, in the deep learning regime, the characterization of generalization differs from the conventional one, and propose alternative ways to approach it. Moving from theory to practice, we show two applications of deep learning. One originates from the real-world problem of automatic geophysical feature detection from seismic recordings to help oil and gas exploration; the other is motivated by computational neuroscientific modeling and study of the human auditory system. More specifically, we show how deep learning can be adapted to work with the unique structures associated with problems from different domains. Lastly, we move to the computer system design perspective and present our efforts to build better deep learning systems that allow efficient and flexible computation in both academic and industrial settings.
by Chiyuan Zhang.
Ph. D.
Pan, Jiajun. "Metric learning for structured data." Thesis, Nantes, 2019. http://www.theses.fr/2019NANT4076.
Metric distance learning is a branch of representation learning in machine learning. We summarize the development and current state of metric distance learning algorithms for flat and non-flat databases. Algorithms based on the Mahalanobis distance for flat databases fail to make full use of the intersection of three or more dimensions; to address this, we propose a metric learning algorithm based on a submodular function. To address the lack of metric learning algorithms for relational databases among non-flat databases, we propose LSCS (Relational Link-strength Constraints Selection), which selects constraints for metric learning algorithms with side information, and MRML (Multi-Relation Metric Learning), which sums the loss from relationship constraints and label constraints. In experiments on real databases, the proposed algorithms outperform existing algorithms.
Qiao, Shi. "QUERYING GRAPH STRUCTURED RDF DATA." Case Western Reserve University School of Graduate Studies / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1447198654.
Fok, Lordique (Lordique S.). "Techniques for structured data discovery." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/121671.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 63-64).
The discovery of structured data, or data that is tagged by key-value pairs, is a problem that can be subdivided into two issues: how best to structure information architecture and user interaction for discovery; and how to intelligently display data in a way that optimizes the discovery of "useful" (i.e. relevant and helpful for a user's current use case) data. In this thesis, I investigate multiple methods of addressing both issues, and report the results of evaluating these methods qualitatively and quantitatively. Specifically, I implement and evaluate: a novel interface design which combines different aspects of existing interfaces, two methods of diversifying data subsets given a search query, three methods of incorporating relevance in data subsets given a search query and information about the user's historic queries, a novel method of visualizing structured data, and two methods of inducing hierarchy on structured data in the presence of a partial data schema. These implementations and evaluations are shown to be effective in structuring information architecture and user interaction for structured data discovery, but are only partially effective in intelligently displaying data to optimize discovery of useful structured data.
by Lordique Fok.
M. Eng.
Miner, Andrew S. "Data structures for the analysis of large structured Markov models." W&M ScholarWorks, 2000. https://scholarworks.wm.edu/etd/1539623985.
Schönauer, Stefan. "Efficient similarity search in structured data." [S.l.] : [s.n.], 2004. http://edoc.ub.uni-muenchen.de/archive/00001802.
Schönauer, Stefan. "Efficient Similarity Search in Structured Data." Diss., lmu, 2004. http://nbn-resolving.de/urn:nbn:de:bvb:19-18022.
Wackersreuther, Bianca. "Efficient Knowledge Extraction from Structured Data." Diss., lmu, 2011. http://nbn-resolving.de/urn:nbn:de:bvb:19-138079.
Kashima, Hisashi. "Machine learning approaches for structured data." 京都大学 (Kyoto University), 2007. http://hdl.handle.net/2433/135953.
Maksimovic, Gordana. "Query Languages for Semi-structured Data." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik och datavetenskap, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-4332.
Ng, Kee Siong. "Learning Comprehensible Theories from Structured Data." The Australian National University. Research School of Information Sciences and Engineering, 2005. http://thesis.anu.edu.au./public/adt-ANU20051031.105726.
Thomson, Susan Elizabeth. "A storage service for structured data." Thesis, University of Cambridge, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.385486.
NUNES, BERNARDO PEREIRA. "AUTOMATIC CLASSIFICATION OF SEMI-STRUCTURED DATA." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2009. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=14382@1.
The problem of data classification goes back to the creation of taxonomies covering areas of knowledge. With the advent of the Web, the amount of data available has increased by several orders of magnitude, making manual organization of data practically impossible. This dissertation proposes a method to automatically organize semi-structured data, represented by frames, without any prior class structure. The dissertation introduces an algorithm, based on K-Medoid, capable of organizing a set of frames into classes, structured as a strict hierarchy. The classification of the frames is based on a closeness criterion that takes into account the attributes and values of each frame.
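As a rough illustration of the idea in this abstract, the sketch below clusters frames with a plain alternating k-medoids loop. It is not the dissertation's algorithm: the closeness criterion here is assumed to be the Jaccard distance over (attribute, value) pairs, the seeding rule is arbitrary, and the output is a flat partition rather than a strict hierarchy. The names `frame_distance` and `k_medoids` are hypothetical.

```python
def frame_distance(a: dict, b: dict) -> float:
    """Jaccard distance over (attribute, value) pairs (an assumed closeness criterion).

    Returns 0.0 for identical frames and 1.0 for frames sharing no pair.
    Values must be hashable.
    """
    pairs_a, pairs_b = set(a.items()), set(b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return 1.0 - len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def k_medoids(frames, k, max_iter=100):
    """Alternate between assigning frames to the nearest medoid and re-picking medoids.

    Returns a dict mapping each medoid index to the list of member indices.
    """
    if not 0 < k <= len(frames):
        raise ValueError("k must be between 1 and the number of frames")
    medoids = list(range(k))  # assumption: the first k frames seed the medoids
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each frame joins its closest medoid.
        clusters = {m: [] for m in medoids}
        for i, frame in enumerate(frames):
            nearest = min(medoids, key=lambda m: frame_distance(frame, frames[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance to its cluster.
        new_medoids = []
        for medoid, members in clusters.items():
            if not members:  # can happen when two medoids are identical frames
                new_medoids.append(medoid)
                continue
            best = min(
                members,
                key=lambda c: sum(frame_distance(frames[c], frames[j]) for j in members),
            )
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break  # medoids are stable, so the assignment will not change
        medoids = new_medoids
    return clusters
```

A hierarchy could be approximated by applying the same procedure recursively inside each cluster, though whether that matches the dissertation's construction is not stated in the abstract.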
Evanco, Kathleen L. (Kathleen Lee). "Customized data visualization using structured video." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/29106.
Lee, John Boaz T. "Deep Learning on Graph-structured Data." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/570.
Bandyopadhyay, Bortik. "Querying Structured Data via Informative Representations." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1595447189545086.
Folkesson, Carl. "Anonymization of directory-structured sensitive data." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160952.
Ng, Kee Siong. "Learning comprehensible theories from structured data /." View thesis entry in Australian Digital Theses Program, 2005. http://thesis.anu.edu.au/public/adt-ANU20051031.105726/index.html.
Da, San Martino Giovanni <1979>. "Kernel Methods for Tree Structured Data." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2009. http://amsdottorato.unibo.it/1400/1/thesis.pdf.
Da, San Martino Giovanni <1979>. "Kernel Methods for Tree Structured Data." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2009. http://amsdottorato.unibo.it/1400/.
Tatikonda, Shirish. "Towards Efficient Data Analysis and Management of Semi-structured Data." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1275414859.
Eichhorn, Jan. "Applications of kernel machines to structured data." [S.l.] : [s.n.], 2006. http://opus.kobv.de/tuberlin/volltexte/2007/1507.
Lipton, Zachary C. "Learning from Temporally-Structured Human Activities Data." Thesis, University of California, San Diego, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10683703.
Despite the extraordinary success of deep learning on diverse problems, these triumphs are too often confined to large, clean datasets and well-defined objectives. Face recognition systems train on millions of perfectly annotated images. Commercial speech recognition systems train on thousands of hours of painstakingly-annotated data. But for applications addressing human activity, data can be noisy, expensive to collect, and plagued by missing values. In electronic health records, for example, each attribute might be observed on a different time scale. Complicating matters further, deciding precisely what objective warrants optimization requires critical consideration of both algorithms and the application domain. Moreover, deploying human-interacting systems requires careful consideration of societal demands such as safety, interpretability, and fairness.
The aim of this thesis is to address the obstacles to mining temporal patterns in human activity data. The primary contributions are: (1) the first application of RNNs to multivariate clinical time series data, with several techniques for bridging long-term dependencies and modeling missing data; (2) a neural network algorithm for forecasting surgery duration while simultaneously modeling heteroscedasticity; (3) an approach to quantitative investing that uses RNNs to forecast company fundamentals; (4) an exploration strategy for deep reinforcement learners that significantly speeds up dialogue policy learning; (5) an algorithm to minimize the number of catastrophic mistakes made by a reinforcement learner; (6) critical works addressing model interpretability and fairness in algorithmic decision-making.
Paaßen, Benjamin [Verfasser]. "Metric Learning for Structured Data / Benjamin Paaßen." Bielefeld : Universitätsbibliothek Bielefeld, 2019. http://d-nb.info/1186887818/34.
Blampied, Paul Alexander. "Structured recursion for non-uniform data-types." Thesis, University of Nottingham, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.342028.
Cottee, Michaela J. "The graphical representation of structured multivariate data." Thesis, Open University, 1996. http://oro.open.ac.uk/57616/.
Sun, Yizhi. "Statistical Analysis of Structured High-dimensional Data." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/97505.
PhD
Tu, Ying. "Focus-based Interactive Visualization for Structured Data." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1366198735.
Morris, Henry. "Sparse nonlinear methods for predicting structured data." Thesis, Imperial College London, 2012. http://hdl.handle.net/10044/1/9548.
Bala, Saimir. "Mining Projects from Structured and Unstructured Data." Jens Gulden, Selmin Nurcan, Iris Reinhartz-Berger, Widet Guédria, Palash Bera, Sérgio Guerreiro, Michael Fellman, Matthias Weidlich, 2017. http://epub.wu.ac.at/7205/1/ProjecMining%2DCamera%2DReady.pdf.
King, Michael Allen. "Ensemble Learning Techniques for Structured and Unstructured Data." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/51667.
Ph. D.
Sundaravadivelu, Rathinasabapathy. "Interoperability between heterogeneous and distributed biodiversity data sources in structured data networks." Thesis, Cardiff University, 2010. http://orca.cf.ac.uk/18086/.
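Otaki, Keisuke. "Algorithmic Approaches to Pattern Mining from Structured Data." 京都大学 (Kyoto University), 2016. http://hdl.handle.net/2433/215673.
Kyoto University (京都大学)
0048
New system, course-based doctorate (新制・課程博士)
Doctor of Informatics (博士(情報学))
Degree number: 甲第19846号
Informatics doctorate number: 情博第597号
Call number: 新制||情||104 (University Library)
32882
Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
(Chief examiner) Professor Akihiro Yamamoto, Professor Hisashi Kashima, Professor Tatsuya Akutsu
Conferred under Article 4, Paragraph 1 of the Degree Regulations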
Zhao, Xiaoyan 1966. "Trie methods for structured data on secondary storage." Thesis, McGill University, 2000. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=36855.
We apply the trie structures to indexing, storing and querying structured data on secondary storage. We are interested in the storage compactness, the I/O efficiency, the order-preserving properties, the general orthogonal range queries and the exact match queries for very large files and databases. We also apply the trie structures to relational joins (set operations).
We compare trie structures to various data structures on secondary storage: multipaging and grid files in the direct access method category, R-trees/R*-trees and X-trees in the logarithmic access cost category, as well as some representative join algorithms for performing join operations. Our results show that range queries by the trie method are superior to these competitors in search cost when queries return more than a few records and are competitive with direct access methods for exact match queries. Furthermore, as the trie structure compresses data, it is the winner in terms of storage compared to all other methods mentioned above.
We also present a new tidy function for order-preserving key-to-address transformation. Our tidy function is easy to construct and cheaper in access time and storage cost compared to its closest competitor.
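To make the order-preserving property and the two query types concrete, here is a minimal in-memory sketch of a character trie supporting exact-match and one-dimensional range queries. It is only an illustration under simplifying assumptions: the thesis concerns tries on secondary storage and orthogonal (multi-dimensional) range queries, neither of which is modelled here, and the class and method names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps one character to the child node
        self.is_key = False  # True if a stored key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key: str) -> None:
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_key = True

    def contains(self, key: str) -> bool:
        """Exact-match query: follow one path, one node per character."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_key

    def range(self, lo: str, hi: str) -> list:
        """Return stored keys k with lo <= k <= hi, in lexicographic order."""
        found = []

        def walk(node, prefix):
            # Every key below this node starts with `prefix`, so whole
            # subtrees can be pruned once the prefix leaves the interval.
            if prefix > hi:
                return  # all extensions are >= prefix, hence > hi
            if prefix < lo[:len(prefix)]:
                return  # all extensions differ from lo at an earlier, smaller character
            if node.is_key and lo <= prefix:
                found.append(prefix)
            for ch in sorted(node.children):  # sorted visit keeps output ordered
                walk(node.children[ch], prefix + ch)

        walk(self.root, "")
        return found
```

Because children are visited in sorted order, results come out already ordered, and shared prefixes are stored once, which hints at why the abstract describes tries as both order-preserving and compact.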
NUNES, IAN MONTEIRO. "CLUSTERING TEXT STRUCTURED DATA BASED ON TEXT SIMILARITY." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2008. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=25796@1.
COORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
PROGRAMA DE EXCELENCIA ACADEMICA
This document reports our findings on a set of text clustering experiments in which a wide variety of models and algorithms were applied. The objective of these experiments is to determine the best approaches for processing the large volumes of information generated by growing demands for data quality in many sectors of the economy. The deduplication process was accelerated by dividing the data set into subsets of similar items. In the best-case scenario, each subset contains all duplicate occurrences of each record, which reduces the error in forming each group to zero. However, an intrinsic tolerance rate of 5 percent was established after clustering. The experiments show that processing time is significantly lower and that accuracy reaches up to 98.92 percent. The best trade-off between accuracy and performance is achieved with the K-Means algorithm using a trigram-based model.
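The trigram model mentioned in this abstract can be sketched as follows. This is not the dissertation's K-Means pipeline: it only shows how records might be turned into character trigram sets, compared with Jaccard similarity, and split into subsets of similar items by a simple greedy grouping. The function names, the padding scheme, and the 0.5 threshold are all assumptions made for illustration.

```python
def trigrams(text: str) -> set:
    """Character trigrams of a lower-cased, whitespace-normalised, padded string."""
    padded = "  " + " ".join(text.lower().split()) + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def group_similar(records, threshold=0.5):
    """Greedy grouping (not K-Means): a record joins the first group whose
    representative (its first member) is at least `threshold` similar."""
    groups = []
    for record in records:
        for group in groups:
            if trigram_similarity(record, group[0]) >= threshold:
                group.append(record)
                break
        else:
            groups.append([record])  # no close group found, start a new one
    return groups
```

Duplicate detection then only needs to compare records inside the same group, which is the source of the speed-up the abstract describes; the cost is that a duplicate placed in the wrong group is missed.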
Gianniotis, Nikolaos. "Visualisation of structured data through generative probabilistic modeling." Thesis, University of Birmingham, 2008. http://etheses.bham.ac.uk//id/eprint/4803/.
Stachowiak, Maciej 1976. "Automated extraction of structured data from HTML documents." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9896.
Includes bibliographical references (leaf 45).
by Maciej Stachowiak.
M.Eng.
Zhang, Peng. "Structured sensing for estimation of high-dimensional data." Thesis, Imperial College London, 2016. http://hdl.handle.net/10044/1/49415.
Forshaw, Gareth William. "Semi-automatic matching of semi-structured data updates." Master's thesis, University of Cape Town, 2014. http://hdl.handle.net/11427/12930.
Data matching, also referred to as data linkage or field matching, is a technique used to combine multiple data sources into one data set. Data matching is used for data integration in a number of sectors and industries; from politics and health care to scientific applications. The motivation for this study was the observation of the day-to-day struggles of a large non-governmental organisation (NGO) in managing their membership database. With a membership base of close to 2.4 million, the challenges they face with regard to the capturing and processing of the semi-structured membership updates are monumental. Updates arrive from the field in a multitude of formats, often incomplete and unstructured, and expert knowledge is geographically localised. These issues are compounded by an extremely complex organisational hierarchy and a general lack of data validation processes. An online system was proposed for pre-processing input and then matching it against the membership database. Termed the Data Pre-Processing and Matching System (DPPMS), it allows for single or bulk updates. Based on the success of the DPPMS with the NGO’s membership database, it was subsequently used for pre-processing and data matching of semi-structured patient and financial customer data. Using the semi-automated DPPMS rather than a clerical data matching system, true positive matches increased by 21% while false negative matches decreased by 20%. The Recall, Precision and F-Measure values all improved and the risk of false positives diminished. The DPPMS was unable to match approximately 8% of provided records; this was largely due to human error during initial data capture. While the DPPMS greatly diminished the reliance on experts, their role remained pivotal during the final stage of the process.
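For readers unfamiliar with the measures named in this abstract, the sketch below computes Precision, Recall and F-Measure (here the balanced F1 score) from counts of true positives, false positives and false negatives using their standard definitions. The function name is hypothetical, and the thesis does not give raw counts, so no figures from it are reproduced here.

```python
def match_quality(true_pos: int, false_pos: int, false_neg: int):
    """Return (precision, recall, f_measure) for a set of match decisions.

    precision = TP / (TP + FP): share of reported matches that are correct
    recall    = TP / (TP + FN): share of real matches that were found
    f_measure = harmonic mean of precision and recall
    Each value falls back to 0.0 when its denominator is zero.
    """
    reported = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Raising true positives while lowering false negatives, as the abstract reports, increases recall directly; precision depends on how false positives change at the same time.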
Ni, Weizeng. "Ontology-based Feature Construction on Non-structured Data." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340.
Herrmann, Kai, Hannes Voigt, and Wolfgang Lehner. "Cinderella - Adaptive Online Partitioning of Irregularly Structured Data." IEEE, 2014. https://tud.qucosa.de/id/qucosa%3A75273.
Weng, Daiyue. "Extracting structured data from Web query result pages." Thesis, Queen's University Belfast, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.709858.
Doan, AnHai. "Learning to map between structured representations of data /." Thesis, Connect to this title online; UW restricted, 2002. http://hdl.handle.net/1773/6968.
Bui, Dang Bach. "Mining complex structured data: Enhanced methods and applications." Thesis, Curtin University, 2015. http://hdl.handle.net/20.500.11937/480.