Dissertations / Theses on the topic "Data models, storage and indexing"
The top 39 dissertations and theses for research on the topic "Data models, storage and indexing".
Munishwar, Vikram P. "Storage and indexing issues in sensor networks". Diss., Online access via UMI:, 2006.
Ottoson, Patrik. "Geographic Indexing and Data Management for 3D-Visualisation". Doctoral thesis, Stockholm : Tekniska högsk, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3235.
Vasaitis, Vasileios. "Novel storage architectures and pointer-free search trees for database systems". Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/6240.
Jia, Yanan. "Generalized Bilinear Mixed-Effects Models for Multi-Indexed Multivariate Data". The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1469180629.
Habtu, Simon. "Indexing file metadata using a distributed search engine for searching files on a public cloud storage". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-232064.
Visma Labs AB (Visma) wanted to run experiments to see whether file metadata could be indexed in order to search for files on a public cloud. Since storing files on a public cloud is cheaper than the current storage solution, the implementation could save Visma money otherwise spent on expensive storage. This study therefore aims to find and evaluate an approach for indexing file metadata and searching files on public cloud storage with the chosen distributed search engine, Elasticsearch. The architecture of the proposed solution resembles a file service and was implemented with several containerized services. The results show that the file service solution is indeed feasible but would need further modifications and more resources to work according to Visma's requirements.
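The idea of indexing file metadata so that files can be found by search can be illustrated with a small in-memory inverted index. This is only a sketch in plain Python: it does not use Elasticsearch or reproduce the thesis's architecture, and the class name `MetadataIndex`, the `tokenize` helper, and the sample file records are all hypothetical.

```python
from collections import defaultdict


def tokenize(value):
    """Lowercase a metadata value and split it on non-alphanumeric characters."""
    token, tokens = [], []
    for ch in str(value).lower():
        if ch.isalnum():
            token.append(ch)
        elif token:
            tokens.append("".join(token))
            token = []
    if token:
        tokens.append("".join(token))
    return tokens


class MetadataIndex:
    """Hypothetical inverted index: term -> set of file ids whose metadata contains it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, file_id, metadata):
        # Index every term of every metadata value under the file id.
        for value in metadata.values():
            for term in tokenize(value):
                self._postings[term].add(file_id)

    def search(self, query):
        # AND semantics: a file matches only if it contains every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self._postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result


# Made-up sample records, for illustration only.
index = MetadataIndex()
index.add("f1", {"name": "invoice-2018.pdf", "owner": "alice"})
index.add("f2", {"name": "invoice-2017.pdf", "owner": "bob"})
index.add("f3", {"name": "report.docx", "owner": "alice"})
```

A distributed search engine adds sharding, replication, and relevance ranking on top of this basic term-to-document mapping; the sketch only shows why a metadata query does not have to scan the stored files themselves.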
Singh, Aameek. "Secure Management of Networked Storage Services: Models and Techniques". Diss., Available online, Georgia Institute of Technology, 2007. http://etd.gatech.edu/theses/available/etd-04092007-004039/.
Liu, Ling, Committee Chair ; Aberer, Karl, Committee Member ; Ahamad, Mustaque, Committee Member ; Blough, Douglas, Committee Member ; Pu, Calton, Committee Member ; Voruganti, Kaladhar, Committee Member.
Paul, Arnab Kumar. "An Application-Attuned Framework for Optimizing HPC Storage Systems". Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/99793.
Doctor of Philosophy
Clusters of multiple computers connected through internet are often deployed in industry and laboratories for large scale data processing or computation that cannot be handled by standalone computers. In such a cluster, resources such as CPU, memory, disks are integrated to work together. With the increase in popularity of applications that read and write a tremendous amount of data, we need a large number of disks that can interact effectively in such clusters. This forms the part of high performance computing (HPC) storage systems. Such HPC storage systems are used by a diverse set of applications coming from organizations from a vast range of domains from earth sciences, financial services, telecommunication to life sciences. Therefore, the HPC storage system should be efficient to perform well for the different read and write (I/O) requirements from all the different sets of applications. But current HPC storage systems do not cater to the varied I/O requirements. To this end, this dissertation designs and develops a framework for HPC storage systems that is application-attuned and thus provides much improved performance than other state-of-the-art HPC storage systems without such optimizations.
Regin, Måns, and Gunnarsson Emil. "Refactoring Existing Database Layers for Improved Performance, Readability and Simplicity". Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-105277.
Chan, Wing Sze. "Semantic search of multimedia data objects through collaborative intelligence". HKBU Institutional Repository, 2010. http://repository.hkbu.edu.hk/etd_ra/1171.
Tandon, Ashish. "Analysis and optimization of data storage using enhanced object models in the .NET framework". Thesis, Edinburgh Napier University, 2007. http://researchrepository.napier.ac.uk/Output/4047.
Fritz, Eric Ryan. "Relational database models and other software and their importance in data analysis, storage, and communication". [Ames, Iowa : Iowa State University], 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:1468081.
Caliguri, Ryan P. "Comparison of Sensible Water Cooling, Ice building, and Phase Change Material in Thermal Energy Storage Tank Charging: Analytical Models and Experimental Data". University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1627666292483648.
Wu, Bruce Jiinpo. "The effects of data models and conceptual models of the structured query language on the task of query writing by end users". Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc332680/.
Nobles, Royce Anthony. "Evaluation of spelling correction and concept-based searching models in a data entry application". View electronic thesis (PDF), 2009. http://dl.uncw.edu/etd/2009-2/noblesr/roycenobles.pdf.
Slabber, Frans Bresler. "Semi-automated extraction of structural orientation data from aerospace imagery combined with digital elevation models". Thesis, Rhodes University, 1996. http://hdl.handle.net/10962/d1005614.
Maples, Glenn (Glenn Edward). "Information System Quality: An Examination of Service-Based Models and Alternatives". Thesis, University of North Texas, 1997. https://digital.library.unt.edu/ark:/67531/metadc277952/.
Munalula, Themba. "Measuring the applicability of Open Data Standards to a single distributed organisation: an application to the COMESA Secretariat". Thesis, University of Cape Town, 2008. http://pubs.cs.uct.ac.za/archive/00000461/.
Dawson, Linda Louise 1954. "An investigation of the use of object-oriented models in requirements engineering practice". Monash University, School of Information Management and Systems, 2001. http://arrow.monash.edu.au/hdl/1959.1/8031.
Xiong, Li. "Resilient Reputation and Trust Management: Models and Techniques". Diss., Georgia Institute of Technology, 2005. http://hdl.handle.net/1853/7483.
Mohammed, Jafaru. "Impact of Solar Resource and Atmospheric Constituents on Energy Yield Models for Concentrated Photovoltaic Systems". Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/24342.
Camacho, Rodriguez Jesus. "Efficient techniques for large-scale Web data management". Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112229/document.
The recent development of commercial cloud computing environments has strongly impacted research and development in distributed software platforms. Cloud providers offer a distributed, shared-nothing infrastructure that may be used for data storage and processing. In parallel with the development of cloud platforms, programming models that seamlessly parallelize the execution of data-intensive tasks over large clusters of commodity machines have received significant attention, starting with the now well-known MapReduce model and continuing through other novel and more expressive frameworks. As these models are increasingly used to express analytical-style data processing tasks, the need arises for higher-level languages that ease the burden of writing complex queries for these systems. This thesis investigates the efficient management of Web data on large-scale infrastructures. In particular, we study the performance and cost of exploiting cloud services to build Web data warehouses, and the parallelization and optimization of query languages that are tailored towards querying Web data declaratively. First, we present AMADA, an architecture for warehousing large-scale Web data in commercial cloud platforms. AMADA operates in a Software as a Service (SaaS) approach, allowing users to upload, store, and query large volumes of Web data. Since cloud users bear monetary costs directly connected to their consumption of resources, our focus is not only on query performance from an execution time perspective, but also on the monetary costs associated with this processing.
In particular, we study the applicability of several content indexing strategies, and show that they lead not only to reducing query evaluation time, but also, importantly, to reducing the monetary costs associated with the exploitation of the cloud-based warehouse. Second, we consider the efficient parallelization of the execution of complex queries over XML documents, implemented within our system PAXQuery. We provide novel algorithms showing how to translate such queries into plans expressed in the PArallelization ConTracts (PACT) programming model. These plans are then optimized and executed in parallel by the Stratosphere system. We demonstrate the efficiency and scalability of our approach through experiments on hundreds of GB of XML data. Finally, we present a novel approach for identifying and reusing common subexpressions occurring in Pig Latin scripts. In particular, we lay the foundation of our reuse-based algorithms by formalizing the semantics of the Pig Latin query language with extended nested relational algebra for bags. Our algorithm, named PigReuse, operates on the algebraic representations of Pig Latin scripts, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and merges other equivalent expressions to share their results. We bring several extensions to the algorithm to improve its performance. Our experimental results demonstrate the efficiency and effectiveness of our reuse-based algorithms and optimization strategies.
Černý, Petr. "Vyhledávání ve videu". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236590.
Zampetakis, Stamatis. "Scalable algorithms for cloud-based Semantic Web data management". Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112199/document.
In order to build smart systems, where machines are able to reason exactly like humans, data with semantics is a major requirement. This need led to the advent of the Semantic Web, proposing standard ways for representing and querying data with semantics. RDF is the prevalent data model used to describe web resources, and SPARQL is the query language that allows expressing queries over RDF data. Being able to store and query data with semantics triggered the development of many RDF data management systems. The rapid evolution of the Semantic Web provoked the shift from centralized data management systems to distributed ones. The first systems to appear relied on P2P and client-server architectures, while recently the focus moved to cloud computing. Cloud computing environments have strongly impacted research and development in distributed software platforms. Cloud providers offer distributed, shared-nothing infrastructures that may be used for data storage and processing. The main features of cloud computing involve scalability, fault-tolerance, and elastic allocation of computing and storage resources following the needs of the users. This thesis investigates the design and implementation of scalable algorithms and systems for cloud-based Semantic Web data management. In particular, we study the performance and cost of exploiting commercial cloud infrastructures to build Semantic Web data repositories, and the optimization of SPARQL queries for massively parallel frameworks. First, we introduce the basic concepts around the Semantic Web and the main components and frameworks interacting in massively parallel cloud-based systems. In addition, we provide an extended overview of existing RDF data management systems in the centralized and distributed settings, emphasizing the critical concepts of storage, indexing, query optimization, and infrastructure. Second, we present AMADA, an architecture for RDF data management using public cloud infrastructures.
We follow the Software as a Service (SaaS) model, where the complete platform is running in the cloud and appropriate APIs are provided to the end-users for storing and retrieving RDF data. We explore various storage and querying strategies, revealing pros and cons with respect to performance and also to monetary cost, which is an important new dimension to consider in public cloud services. Finally, we present CliqueSquare, a distributed RDF data management system built on top of Hadoop, incorporating a novel optimization algorithm that is able to produce massively parallel plans for SPARQL queries. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest possible plans. Inspired by existing partitioning and indexing techniques, we present a generic storage strategy suitable for storing RDF data in HDFS (Hadoop's Distributed File System). Our experimental results validate the efficiency and effectiveness of the optimization algorithm, demonstrating also the overall performance of the system.
Amaral, Simone Silmara Werner Gurgel do. "Modelos lineares mistos para análise de dados longitudinais bivariados provenientes de ensaios agropecuários". Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-22112013-105455/.
In longitudinal studies, repeated measurements of a response variable are taken in the same experimental unit over time. Since different observations are measured on the same experimental unit, correlation among the repeated measurements and heterogeneity of variances across occasions are expected. Multivariate longitudinal data are obtained when a number of different response variables are measured in the same experimental unit repeatedly over time; in this case, a correlation between the different response variables should also be observed. One way to analyze bivariate longitudinal data is to use a mixed model for each of the response variables and unite them in a bivariate mixed model by specifying the joint distribution of the random effects. Parameter estimates of this common distribution may be used to evaluate the relationship between the different responses. As an example of the use of the technique, UHT milk storage data were used. Models were fitted using SAS software and the graphical analysis was done with R. For model selection, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used, and the likelihood ratio test was used to compare nested models. The bivariate mixed linear model made it possible to model the heteroscedasticity across occasions, the correlation between the different measurements in the same experimental unit, and the correlation between the different response variables.
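The model-selection criteria named in this abstract follow standard textbook definitions, sketched below in Python. The abstract reports no numbers, so the function names and any values passed to them are hypothetical, and the thesis's SAS output may use a different parameter-counting convention (for example under restricted maximum likelihood).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """Test statistic for nested models: 2 (ln L_full - ln L_reduced).

    Under the usual regularity conditions it is compared with a chi-square
    distribution whose degrees of freedom equal the number of extra
    parameters in the full model.
    """
    return 2 * (loglik_full - loglik_reduced)
```

Both criteria penalize the maximized log-likelihood by model size; BIC's penalty grows with the number of observations, so it tends to favor smaller models than AIC on large data sets.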
Douieb, Karim. "Hotlinks and dictionaries". Doctoral thesis, Universite Libre de Bruxelles, 2008. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210471.
A fundamental objective of computer science is to store and retrieve information efficiently. This is known as the dictionary problem. A dictionary asks for a data structure which essentially supports the search operation. In general, information that is important and popular at a given time has to be accessed faster than less relevant information. This can be achieved by dynamically managing the data structure periodically such that relevant information is located closer to the search starting point. The second part of this thesis is devoted to the development and the understanding of self-adjusting dictionaries in various models of computation. In particular, we focus our attention on dictionaries which do not have any knowledge of future accesses. Those dictionaries have to adapt themselves to be competitive with dictionaries specifically tuned for a given access sequence.
This approach, which transforms the information structure, is not always feasible. One reason can be that the structure is based on the semantics of the information, such as a categorization. In this context, the search procedure is linked to the structure itself, and modifying the structure will affect how a search is performed. A solution developed to improve search in a static structure is hotlink assignment. It is a way to enhance a structure without altering its original design. This approach speeds up the search by creating shortcuts in the structure. The first part of this thesis is devoted to this approach.
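The effect of a hotlink can be sketched on a small tree. The example below is a hypothetical illustration, not an algorithm from the thesis: it assumes a "greedy user" who, at each node on the way to the target, follows whichever available link (tree edge or hotlink) leads deepest along the root-to-target path. The names `path_from_root`, `search_cost`, and the sample tree are made up.

```python
def path_from_root(parent, target):
    """Return the list of nodes from the root down to target."""
    path = [target]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    path.reverse()
    return path


def search_cost(parent, hotlinks, target):
    """Count the links followed from the root to target under the greedy model."""
    path = path_from_root(parent, target)
    position = {node: i for i, node in enumerate(path)}
    i, steps = 0, 0
    while i < len(path) - 1:
        nxt = i + 1  # default: follow the ordinary tree edge
        for dest in hotlinks.get(path[i], ()):
            # A hotlink helps only if it lands further down the same path.
            if dest in position and position[dest] > nxt:
                nxt = position[dest]
        i = nxt
        steps += 1
    return steps


# Made-up tree: root -> a -> b -> c -> d, plus a second child x of root.
parent = {"root": None, "a": "root", "b": "a", "c": "b", "d": "c", "x": "root"}
no_hotlinks = {}
one_hotlink = {"root": ["c"]}  # shortcut from the root to node c
```

With the shortcut in place the tree itself is unchanged, which is the point made in the abstract: the original design stays intact while frequently requested nodes become reachable in fewer steps.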
Doctorate in Sciences
Ton, That Dai Hai. "Gestion efficace et partage sécurisé des traces de mobilité". Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLV003/document.
Advances in the development of mobile devices and embedded sensors now provide users with an unprecedented number of services. At the same time, most mobile devices continuously generate, store and communicate a large amount of personal information. While managing personal information on mobile devices is still a big challenge, sharing and accessing this information in a safe and secure way remains an open and active topic. Personal mobile devices come in various form factors, such as mobile phones, smart devices, stick computers and secure tokens. They can be used to record, sense and store data about the user's context or surrounding environment. The most common contextual information is the user's location. Personal data generated and stored on these devices is valuable for many applications and services, but it is sensitive and needs to be protected in order to ensure individual privacy. In particular, most mobile applications have access to accurate and real-time location information, raising serious privacy concerns for their users. In this dissertation, we dedicate the first two parts to managing location traces, i.e. spatio-temporal data on mobile devices. In particular, we offer an extension of spatio-temporal data types and operators for embedded environments. These data types reconcile the features of spatio-temporal data with embedded requirements by offering an optimal data representation called Spatio-temporal object (STOB), dedicated to embedded devices. More importantly, in order to optimize query processing, we also propose an efficient indexing technique for spatio-temporal data called TRIFL, designed for flash storage. TRIFL stands for TRajectory Index for Flash memory. It exploits unique properties of trajectory insertion, and optimizes the data structure for the behavior of flash and the buffer cache.
These ideas allow TRIFL to achieve much better performance on both flash and magnetic storage compared to its competitors. Additionally, in the remaining part of this thesis we investigate the protection of users' sensitive information by offering a privacy-aware protocol for participatory sensing applications called PAMPAS. PAMPAS relies on secure hardware solutions and proposes a user-centric privacy-aware protocol that fully protects personal data while taking advantage of distributed computing. To this end, we also propose a partitioning algorithm and an aggregation algorithm in PAMPAS. This combination drastically reduces the overall costs, making it possible to run the protocol in near real-time with a large number of participants, without any personal information leakage.
Pacheco, Urubatan Rocha. "Análise de redes sociais em dados bibliográficos". [s.n.], 2010. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275784.
Dissertation (Master's) - Universidade Estadual de Campinas, Instituto de Computação
Summary: The focus of this work is to enable structural analysis of scientific collaboration social networks built from bibliographic databases. The bibliographic data are used to obtain social networks of the authors' affiliation to scientific research institutions, and the relations of the publications to ontologies of research areas are extracted from the publications. Methods that use social network analysis to resolve or reduce ambiguities in the identities of researcher names, institutions, and scientific venues were studied and applied. Another subject studied was the approach to measuring the quality of the results and the problems that affect that quality. To fulfil the objective of this work, metrics and tools were built that allow the comparison of scientific production among institutions, departments, research areas, countries, etc. The tools also produced a ranking of universities based on the prestige of these universities' researchers in the co-authorship social network. This result made it possible to demonstrate that the structural information on prestige was properly captured, by correlating this ranking with others that assess the quality of the universities' scientific production using similar criteria.
Abstract: This work performs social network analysis of the scientific collaborations extracted from bibliographic databases. The analysis also includes the authors' affiliation to scientific institutions, and its relation with the main scientific publications and with research subject ontologies. We studied and applied methods that use social network analysis to solve or mitigate the problem of ambiguity in researchers' identities. We also applied the methods for ambiguity resolution to names of institutions, scientific meeting venues, country/state names, etc. Another subject of study was measuring the quality of the results. Finally, we developed metrics and implemented tools that allow the comparison of the scientific production of institutions, researcher groups, research subject fields, countries, etc. The tools also produced a ranking of universities based on the prestige of these universities' researchers in the co-authorship social network. These results demonstrated that structural prestige information was properly captured, showing its correlation with other works that assess the quality of the scientific production of universities using similar criteria.
Master's
Computing Methodology and Techniques
Master in Computer Science
De, Vega Rodrigo Miguel. "Modeling future all-optical networks without buffering capabilities". Doctoral thesis, Universite Libre de Bruxelles, 2008. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210455.
In the first part we introduce the basic functionality and structure of OBS and OPS networks. We identify the blocking probability as the main performance parameter of interest.
In the second part we study the statistical properties of the traffic that will likely run through these networks. We use for this purpose a set of traffic traces obtained from the Universidad Politécnica de Catalunya. Our conclusion is that traffic entering the optical domain in future OBS/OPS networks will be long-range dependent (LRD).
In the third part we present the model for bufferless OBS/OPS networks. This model takes into account the results from the second part of the thesis concerning the LRD nature of traffic. It also takes into account specific issues concerning the functionality of a typical bufferless packet-switching network. The resulting model presents scalability problems, so we propose an approximative method to compute the blocking probability from it. We empirically evaluate the accuracy of this method, as well as its scalability.
Doctorate in Engineering Sciences
Fan, Yang, Hidehiko Masuhara, Tomoyuki Aotani, Flemming Nielson and Hanne Riis Nielson. "AspectKE*: Security aspects with program analysis for distributed systems". Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4136/.
Samoladas, Vasilis. "On indexing large databases for advanced data models". 2001. http://hdl.handle.net/2152/10823.
"Redundancy on content-based indexing". 1997. http://library.cuhk.edu.hk/record=b5889125.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1997.
Includes bibliographical references (leaves 108-110).
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Motivation
Chapter 1.2 --- Problems in Content-Based Indexing
Chapter 1.3 --- Contributions
Chapter 1.4 --- Thesis Organization
Chapter 2 --- Content-Based Indexing Structures
Chapter 2.1 --- R-Tree
Chapter 2.2 --- R+-Tree
Chapter 2.3 --- R*-Tree
Chapter 3 --- Searching in Both R-Tree and R*-Tree
Chapter 3.1 --- Exact Search
Chapter 3.2 --- Nearest Neighbor Search
Chapter 3.2.1 --- Definition of Searching Metrics
Chapter 3.2.2 --- Pruning Heuristics
Chapter 3.2.3 --- Nearest Neighbor Search Algorithm
Chapter 3.2.4 --- Generalization to N-Nearest Neighbor Search
Chapter 4 --- An Improved Nearest Neighbor Search Algorithm for R-Tree
Chapter 4.1 --- Introduction
Chapter 4.2 --- New Pruning Heuristics
Chapter 4.3 --- An Improved Nearest Neighbor Search Algorithm
Chapter 4.4 --- Replacing Heuristics
Chapter 4.5 --- N-Nearest Neighbor Search
Chapter 4.6 --- Performance Evaluation
Chapter 5 --- Overlapping Nodes in R-Tree and R*-Tree
Chapter 5.1 --- Overlapping Nodes
Chapter 5.2 --- Problem Induced By Overlapping Nodes
Chapter 5.2.1 --- Backtracking
Chapter 5.2.2 --- Inefficient Exact Search
Chapter 5.2.3 --- Inefficient Nearest Neighbor Search
Chapter 6 --- Redundancy On R-Tree
Chapter 6.1 --- Motivation
Chapter 6.2 --- Adding Redundancy on Index Tree
Chapter 6.3 --- R-Tree with Redundancy
Chapter 6.3.1 --- Previous Models of R-Tree with Redundancy
Chapter 6.3.2 --- Redundant R-Tree
Chapter 6.3.3 --- Level List
Chapter 6.3.4 --- Inserting Redundancy to R-Tree
Chapter 6.3.5 --- Properties of Redundant R-Tree
Chapter 7 --- Searching in Redundant R-Tree
Chapter 7.1 --- Exact Search
Chapter 7.2 --- Nearest Neighbor Search
Chapter 7.3 --- Avoidance of Multiple Accesses
Chapter 8 --- Experiment
Chapter 8.1 --- Experimental Setup
Chapter 8.2 --- Exact Search
Chapter 8.2.1 --- Clustered Data
Chapter 8.2.2 --- Real Data
Chapter 8.3 --- Nearest Neighbor Search
Chapter 8.3.1 --- Clustered Data
Chapter 8.3.2 --- Uniform Data
Chapter 8.3.3 --- Real Data
Chapter 8.4 --- Discussion
Chapter 9 --- Conclusions and Future Research
Chapter 9.1 --- Conclusions
Chapter 9.2 --- Future Research
Bibliography
Wang, Chun-Jen, and 王俊仁. "Chinese Speech Information Retrieval--Data-Driven and Predefined Indexing Features,Different Retrieval Models and Improved Approaches". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/24116093457135065144.
Sadoghi, Hamedani Mohammad. "An Efficient, Extensible, Hardware-aware Indexing Kernel". Thesis, 2013. http://hdl.handle.net/1807/65515.
Jiang, Hou-Sian, and 江侯弦. "The Study of Wireless Senesor Network Data Storage and Web Models for Presentation". Thesis, 2010. http://ndltd.ncl.edu.tw/handle/27803366827990914919.
National Taiwan Ocean University
Department of Systems Engineering and Naval Architecture
98
In this study, we aim to build a web-page model for presenting WSN data and to reduce query time over massive WSN data, improving the related operating efficiency. The system consists of two parts: a data storage program that supports a fast query mechanism, and a web-page model for presentation. Both were developed using Visual Basic .NET. The data storage program integrates database indexing and partitioning, two database performance-tuning techniques. It automatically analyzes and collates the data arriving from the remote end, stores them in the database, and interacts with the database server regularly to maintain the data structure. In the web-page model, users can design a web page with simple functions and check the location status of sensors in real time through an intuitive graphical interface and scalable visual graphics. Furthermore, the "Add sensors information" function can store a sensor's location information for any field covered by the wireless network. The model avoids cognitive differences between field workers and web-page developers, including paperwork mistakes, so the system makes it possible to build display interfaces quickly.
"ACTION: automatic classification for Chinese documents". Chinese University of Hong Kong, 1994. http://library.cuhk.edu.hk/record=b5895378.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1994.
Includes bibliographical references (p. 107-109).
Abstract
Acknowledgement
List of Tables
List of Figures
Chapter 1 --- Introduction
Chapter 2 --- Chinese Information Processing
Chapter 2.1 --- Chinese Word Segmentation
Chapter 2.1.1 --- Statistical Method
Chapter 2.1.2 --- Probabilistic Method
Chapter 2.1.3 --- Linguistic Method
Chapter 2.2 --- Automatic Indexing
Chapter 2.2.1 --- Title Indexing
Chapter 2.2.2 --- Free-Text Searching
Chapter 2.2.3 --- Citation Indexing
Chapter 2.3 --- Information Retrieval Systems
Chapter 2.3.1 --- Users' Assessment of IRS
Chapter 2.4 --- Concluding Remarks
Chapter 3 --- Survey on Classification
Chapter 3.1 --- Text Classification
Chapter 3.2 --- Survey on Classification Schemes
Chapter 3.2.1 --- Commonly Used Classification Systems
Chapter 3.2.2 --- Classification of Newspapers
Chapter 3.3 --- Concluding Remarks
Chapter 4 --- System Models and the ACTION Algorithm
Chapter 4.1 --- Factors Affecting Systems Performance
Chapter 4.1.1 --- Specificity
Chapter 4.1.2 --- Exhaustivity
Chapter 4.2 --- Assumptions and Scope
Chapter 4.2.1 --- Assumptions
Chapter 4.2.2 --- System Scope: Data Flow Diagrams
Chapter 4.3 --- System Models
Chapter 4.3.1 --- Article
Chapter 4.3.2 --- Matching Table
Chapter 4.3.3 --- Forest
Chapter 4.3.4 --- Matching
Chapter 4.4 --- Classification Rules
Chapter 4.5 --- The ACTION Algorithm
Chapter 4.5.1 --- Algorithm Design Objectives
Chapter 4.5.2 --- Measuring Node Significance
Chapter 4.5.3 --- Pseudocodes
Chapter 4.6 --- Concluding Remarks
Chapter 5 --- Analysis of Results and Validation
Chapter 5.1 --- Seeking for Exhaustivity Rather Than Specificity
Chapter 5.1.1 --- The News Article
Chapter 5.1.2 --- The Matching Results
Chapter 5.1.3 --- The Keyword Values
Chapter 5.1.4 --- Analysis of Classification Results
Chapter 5.2 --- Catering for Hierarchical Relationships Between Classes and Subclasses
Chapter 5.2.1 --- The News Article
Chapter 5.2.2 --- The Matching Results
Chapter 5.2.3 --- The Keyword Values
Chapter 5.2.4 --- Analysis of Classification Results
Chapter 5.3 --- A Representative With Zero Occurrence
Chapter 5.3.1 --- The News Article
Chapter 5.3.2 --- The Matching Results
Chapter 5.3.3 --- The Keyword Values
Chapter 5.3.4 --- Analysis of Classification Results
Chapter 5.4 --- Statistical Analysis
Chapter 5.4.1 --- Classification Results with Highest Occurrence Frequency
Chapter 5.4.2 --- Classification Results with Zero Occurrence Frequency
Chapter 5.4.3 --- Distribution of Classification Results on Level Numbers
Chapter 5.5 --- Concluding Remarks
Chapter 5.5.1 --- Advantageous Characteristics of ACTION
Chapter 6 --- Conclusion
Chapter 6.1 --- Perspectives in Document Representation
Chapter 6.2 --- Classification Schemes
Chapter 6.3 --- Classification System Model
Chapter 6.4 --- The ACTION Algorithm
Chapter 6.5 --- Advantageous Characteristics of the ACTION Algorithm
Chapter 6.6 --- Testing and Validating the ACTION algorithm
Chapter 6.7 --- Future Work --- p.99
Chapter 6.8 --- A Final Remark --- p.100
Chapter A --- System Models --- p.102
Chapter B --- Classification Rules --- p.104
Chapter C --- Node Significance Definitions --- p.105
References --- p.107
Du, Lan. "Non-parametric Bayesian methods for structured topic models". PhD thesis, 2011. http://hdl.handle.net/1885/149800.
"Unsupervised extraction and normalization of product attributes from web pages". 2010. http://library.cuhk.edu.hk/record=b5894490.
"July 2010."
Thesis (M.Phil.)--Chinese University of Hong Kong, 2010.
Includes bibliographical references (p. 59-63).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Background --- p.1
Chapter 1.2 --- Motivation --- p.4
Chapter 1.3 --- Our Approach --- p.8
Chapter 1.4 --- Potential Applications --- p.12
Chapter 1.5 --- Research Contributions --- p.13
Chapter 1.6 --- Thesis Organization --- p.15
Chapter 2 --- Literature Survey --- p.16
Chapter 2.1 --- Supervised Extraction Approaches --- p.16
Chapter 2.2 --- Unsupervised Extraction Approaches --- p.19
Chapter 2.3 --- Attribute Normalization --- p.21
Chapter 2.4 --- Integrated Approaches --- p.22
Chapter 3 --- Problem Definition and Preliminaries --- p.24
Chapter 3.1 --- Problem Definition --- p.24
Chapter 3.2 --- Preliminaries --- p.27
Chapter 3.2.1 --- Web Pre-processing --- p.27
Chapter 3.2.2 --- Overview of Our Framework --- p.31
Chapter 3.2.3 --- Background of Graphical Models --- p.32
Chapter 4 --- Our Proposed Framework --- p.36
Chapter 4.1 --- Our Proposed Graphical Model --- p.36
Chapter 4.2 --- Inference --- p.41
Chapter 4.3 --- Product Attribute Information Determination --- p.47
Chapter 5 --- Experiments and Results --- p.49
Chapter 6 --- Conclusion --- p.57
Bibliography --- p.59
Chapter A --- Dirichlet Process --- p.64
Chapter B --- Hidden Markov Models --- p.68
"Parameter free document stream classification". Thesis, 2006. http://library.cuhk.edu.hk/record=b6074286.
In this information-overwhelmed century, information is becoming ever more pervasive. A new class of data-intensive applications has arisen in which data is best modeled as an open-ended stream; we call such data a data stream. A document stream is a variation of a data stream, consisting of a sequence of chronologically ordered documents. A fundamental problem in mining document streams is to extract meaningful structure from them, so that their contents can be organized systematically. This dissertation focuses on that problem. Specifically, it studies two problems: identifying the bursty topics in a document stream, and constructing classifiers for the bursty topics. A bursty topic is a topic in the document stream to which a large number of documents are related during a bounded time interval.
In this dissertation, two heuristics, PFreeBT and PNLH, are proposed to tackle these problems. PFreeBT aims at identifying the bursty topics in a document stream, whereas PNLH aims at constructing a reliable classifier for a given bursty topic. It is worth noting that both heuristics are parameter free: users do not need to provide any parameter explicitly, and all of the required variables can be computed automatically from the given document stream.
For the problem of identifying bursty topics, PFreeBT adopts what we term the feature-pivot clustering approach. Given a document stream, PFreeBT first identifies a set of bursty features in it. The identification process is based on computing probability distributions. From the patterns of the bursty features and two newly defined concepts (equivalent and map-to), a set of bursty topics can then be extracted.
For the problem of constructing a reliable classifier, we formulate it as a partially supervised classification problem. In this problem, only a few training examples are labeled as positive (P); all other training examples (U) remain unlabeled. Here, U is a mixture of the negative examples (N) and some other positive examples (P'). Existing techniques that tackle this problem all focus on finding N in U; none of them attempts to extract P' from U. Indeed, doing so is difficult because the topics in U are diverse and its features are sparse. In this dissertation, PNLH is proposed for extracting high-quality P' and N from U.
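The idea of finding bursty features by computing probability distributions can be illustrated with a small sketch. It is not PFreeBT itself, whose details the abstract does not give. It assumes a simple binomial model: if a feature appears in each document independently with its overall rate, a time window is flagged as bursty when the observed count would be very unlikely under that model. The function names and the cutoff `alpha` are hypothetical, and `alpha` is an explicit parameter that the parameter-free heuristic in the dissertation avoids.

```python
from math import comb


def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bursty_windows(counts, totals, alpha=0.01):
    """Indices of windows where a feature occurs unusually often.

    counts[i]: documents in window i that contain the feature
    totals[i]: all documents in window i
    alpha:     illustrative significance cutoff (an assumption of this sketch)
    """
    # Overall probability that a document contains the feature.
    p = sum(counts) / sum(totals)
    return [
        i
        for i, (k, n) in enumerate(zip(counts, totals))
        if n > 0 and binom_tail(k, n, p) < alpha
    ]
```

For example, with 50 documents per window and a feature seen in 1, 2, 1, 15, and 1 of them, only the fourth window (index 3) is flagged, since 15 occurrences are far above the roughly 4 expected at the overall rate.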
Fung Pui Cheong Gabriel.
"August 2006."
Adviser: Jeffrey Xu Yu.
Source: Dissertation Abstracts International, Volume: 68-03, Section: B, page: 1720.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2006.
Includes bibliographical references (p. 122-130).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts in English and Chinese.
School code: 1307.
"Data organization for routing on the multi-modal public transportation system: a GIS-T prototype of Hong Kong Island". 2001. http://library.cuhk.edu.hk/record=b5890808.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 130-138).
Abstracts in English and Chinese.
ABSTRACT IN ENGLISH --- p.i-ii
ABSTRACT IN CHINESE --- p.iii
ACKNOWLEDGEMENTS --- p.iv-v
TABLE OF CONTENTS --- p.vi-viii
LIST OF TABLES --- p.ix
LIST OF FIGURES --- p.x-xi
CHAPTER I --- INTRODUCTION
Chapter 1.1 --- Problem Statement --- p.1
Chapter 1.2 --- Research Purpose --- p.5
Chapter 1.3 --- Significance --- p.7
Chapter 1.4 --- Methodology --- p.8
Chapter 1.5 --- Outline of the Thesis --- p.9
CHAPTER II --- LITERATURE REVIEW
Chapter 2.1 --- Introduction --- p.12
Chapter 2.2 --- Origin of GIS --- p.12
Chapter 2.3 --- Development of GIS-T --- p.15
Chapter 2.4 --- Capabilities of GIS-T --- p.18
Chapter 2.5 --- Structure of a GIS-T --- p.19
Chapter 2.5.1 --- Data Models for GIS-T --- p.19
Chapter 2.5.2 --- Relational DBMS and Dueker-Butler's Data Model for Transportation --- p.22
Chapter 2.5.3 --- Object-oriented Approach --- p.25
Chapter 2.6 --- Main Techniques of GIS-T --- p.26
Chapter 2.6.1 --- Linear Location Reference System --- p.26
Chapter 2.6.2 --- Dynamic Segmentation --- p.27
Chapter 2.6.3 --- Planar and Non-planar Networks --- p.28
Chapter 2.6.4 --- Turn-table --- p.28
Chapter 2.7 --- Algorithms for Finding Shortest Paths on a Network --- p.29
Chapter 2.7.1 --- Overview of Routing Algorithms --- p.29
Chapter 2.7.2 --- Dijkstra's Algorithm --- p.31
Chapter 2.7.3 --- Routing Models for the Multi-modal Network --- p.32
Chapter 2.8 --- Recent Researches on GIS Data Models for the Multi-modal Transportation System --- p.33
Chapter 2.9 --- Main Software Packages for GIS-T --- p.36
Chapter 2.10 --- Summary --- p.37
CHAPTER III --- MODELING THE MULTI-MODAL PUBLIC TRANSPORTATION SYSTEM
Chapter 3.1 --- Introduction --- p.40
Chapter 3.2 --- Elaborated Stages and Methods for GIS Modeling --- p.40
Chapter 3.3 --- Application Domain: The Multi-modal Public Transportation System --- p.43
Chapter 3.3.1 --- Definition of a Multi-modal Public Transportation System --- p.43
Chapter 3.3.2 --- Descriptions of the Multi-modal Public Transportation System --- p.44
Chapter 3.3.3 --- Objective of the Modeling Work --- p.46
Chapter 3.4 --- A Layer-cake Based Application Domain Model for the Multi-modal Public Transportation System --- p.46
Chapter 3.5 --- A Conceptual Model for the Multi-modal Public Transportation System --- p.49
Chapter 3.6 --- Logical and Physical Implementation of the Data Model for the Multi-modal Public Transportation System --- p.54
Chapter 3.7 --- Criteria for Routing on the Multi-modal Public Transportation System --- p.57
Chapter 3.7.1 --- Least-time Routing --- p.58
Chapter 3.7.2 --- Least-fare Routing --- p.60
Chapter 3.7.3 --- Least-transfer Routing --- p.60
Chapter 3.8 --- Summary --- p.61
CHAPTER IV --- DATA PREPARATION FOR THE STUDY AREA
Chapter 4.1 --- Introduction --- p.53
Chapter 4.2 --- The Study Area: Hong Kong Island --- p.63
Chapter 4.2.1 --- General Information of the Transportation System on Hong Kong Island --- p.63
Chapter 4.2.2 --- Reasons for Choosing Hong Kong Island as the Study Area --- p.66
Chapter 4.2.3 --- Mass Transit Routes Selected for the Prototype --- p.67
Chapter 4.3 --- Data Source and Data Collection --- p.67
Chapter 4.4 --- Geographical Data Preparation --- p.71
Chapter 4.4.1 --- Data Conversion --- p.73
Chapter 4.4.2 --- Geographical Data Input --- p.79
Chapter 4.5 --- Attribute Data Input --- p.86
Chapter 4.6 --- Summary --- p.88
CHAPTER V --- IMPLEMENTATION OF THE PROTOTYPE
Chapter 5.1 --- Introduction --- p.89
Chapter 5.2 --- Construction of the Route Service Network --- p.89
Chapter 5.2.1 --- Generation of the Geographical Network --- p.90
Chapter 5.2.2 --- Setting Attribute Data for the Route Service Network --- p.95
Chapter 5.3 --- A GIS-T Prototype for the Study Area --- p.102
Chapter 5.4 --- General GIS Functions of the Prototype --- p.104
Chapter 5.4.1 --- Information Retrieval --- p.104
Chapter 5.4.2 --- Display --- p.105
Chapter 5.4.3 --- Data Query --- p.105
Chapter 5.5 --- Routing in the Prototype --- p.105
Chapter 5.5.1 --- Routing Procedure --- p.108
Chapter 5.5.2 --- Examples and Results --- p.110
Chapter 5.5.3 --- Comparison and Analysis --- p.113
Chapter 5.6 --- Summary --- p.118
CHAPTER VI --- CONCLUSION
Chapter 6.1 --- Research Findings --- p.123
Chapter 6.2 --- Research Limitations --- p.126
Chapter 6.3 --- Direction of Further Studies --- p.128
BIBLIOGRAPHY --- p.130