Dissertations on the topic "Query processing and optimisation"
The top 50 dissertations for research on the topic "Query processing and optimisation".
Manolescu, Ioana. "Efficient XML query processing." Habilitation à diriger des recherches, Université Paris Sud - Paris XI, 2009. http://tel.archives-ouvertes.fr/tel-00542801.
Al-Hoqani, Noura Y. S. "In-network database query processing for wireless sensor networks." Thesis, Loughborough University, 2018. https://dspace.lboro.ac.uk/2134/36226.
Belghoul, Abdeslem. "Optimizing Communication Cost in Distributed Query Processing." Thesis, Université Clermont Auvergne (2017-2020), 2017. http://www.theses.fr/2017CLFAC025/document.
In this thesis, we take a complementary look at the problem of optimizing the time for communicating query results in distributed query processing, by investigating the relationship between the communication time and the middleware configuration. Indeed, the middleware determines, among other things, how data is divided into batches and messages before being communicated over the network. Concretely, we focus on the research question: given a query Q and a network environment, what is the best middleware configuration that minimizes the time for transferring the query result over the network? To the best of our knowledge, the database research community does not have well-established strategies for middleware tuning. We first present an intensive experimental study that emphasizes the crucial impact of middleware configuration on the time for communicating query results. We focus on two middleware parameters that we empirically identified as having an important influence on the communication time: (i) the fetch size F (i.e., the number of tuples in a batch that is communicated at once to an application consuming the data) and (ii) the message size M (i.e., the size in bytes of the middleware buffer, which corresponds to the amount of data that can be communicated at once from the middleware to the network layer; a batch of F tuples can be communicated via one or several messages of M bytes). Then, we describe a cost model for estimating the communication time, which is based on how data is communicated between computation nodes. Precisely, our cost model is based on two crucial observations: (i) batches and messages are communicated differently over the network: batches are communicated synchronously, whereas messages in a batch are communicated in pipeline (asynchronously), and (ii) due to network latency, it is more expensive to communicate the first message in a batch compared to any other message that is not the first in its batch. We propose an effective strategy for calibrating the network-dependent parameters of the communication time estimation function, i.e., the costs of the first message and of a non-first message in a batch. Finally, we develop an optimization algorithm to effectively compute the values of the middleware parameters F and M that minimize the communication time. The proposed algorithm quickly finds (in a small fraction of a second) the values of the middleware parameters F and M that strike a good trade-off between low resource consumption and low communication time. The proposed approach has been evaluated using a dataset from an astronomy application.
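The two observations behind the cost model in this abstract can be illustrated with a minimal sketch. It assumes a simple additive form: batches are sent one after another, and within a batch the first message costs `first_msg_cost` while every later message costs `next_msg_cost`. The function name, parameter names, and the additive form are illustrative assumptions, not the estimation function defined in the thesis.

```python
import math


def estimate_comm_time(n_tuples, tuple_bytes, fetch_size, message_bytes,
                       first_msg_cost, next_msg_cost):
    """Estimate the time to ship n_tuples under an assumed additive model.

    fetch_size    -- F, number of tuples per batch
    message_bytes -- M, size of one middleware message in bytes
    The two cost arguments stand in for calibrated, network-dependent values.
    """
    total = 0.0
    remaining = n_tuples
    while remaining > 0:
        batch_tuples = min(fetch_size, remaining)
        # A batch of F tuples is split into one or several messages of M bytes.
        messages = math.ceil(batch_tuples * tuple_bytes / message_bytes)
        # Batches are synchronous: each one pays the expensive first message,
        # then the cheaper pipelined messages.
        total += first_msg_cost + (messages - 1) * next_msg_cost
        remaining -= batch_tuples
    return total
```

Under these assumptions, raising F reduces the number of expensive first messages, at the price of larger batches held by the consumer, which is the trade-off the abstract refers to.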
Mesmoudi, Amin. "Declarative parallel query processing on large scale astronomical databases." Thesis, Lyon 1, 2015. http://www.theses.fr/2015LYO10326.
This work is carried out in the framework of the PetaSky project. The objective of this project is to provide a set of tools for managing petabytes of data from astronomical observations. Our work is concerned with the design of a scalable approach. We first analyzed the ability of MapReduce-based systems supporting SQL to manage the LSST data and to ensure optimization capabilities for certain types of queries. We analyzed the impact of data partitioning, indexing and compression on query performance. From our experiments, it follows that there is no “magic” technique to partition, store and index data; the efficiency of dedicated techniques depends mainly on the type of queries and the typology of data that are considered. Based on our work on benchmarking, we identified some techniques to be integrated into large-scale data management systems. We designed a new system that supports multiple partitioning mechanisms and several evaluation operators. We used the BSP (Bulk Synchronous Parallel) model as a parallel computation paradigm. Unlike the MapReduce model, we send intermediate results to workers that can continue their processing. Data is logically represented as a graph. The evaluation of queries is performed by exploring the data graph using forward and backward edges. We also offer a semi-automatic partitioning approach, i.e., we provide the system administrator with a set of tools allowing her/him to choose the manner of partitioning data using the schema of the database and domain knowledge. The first experiments show that our approach provides a significant performance improvement with respect to MapReduce systems.
Oğuz, Damla. "Méthodes d'optimisation pour le traitement de requêtes réparties à grande échelle sur des données liées." Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30067/document.
Linked Data is a term that denotes a set of best practices for publishing and interlinking structured data on the Web. As the number of Linked Data providers increases, the Web becomes a huge global data space. Query federation is one of the approaches for efficiently querying this distributed data space. It is employed via a federated query engine which aims to minimize the response time and the completion time. Response time is the time to generate the first result tuple, whereas completion time refers to the time to provide all result tuples. There are three basic steps in a federated query engine: data source selection, query optimization, and query execution. This thesis contributes to the subject of query optimization for query federation. Most studies focus on static query optimization, which generates the query plans before the execution and needs statistics. However, the environment of Linked Data has several difficulties such as unpredictable data arrival rates and unreliable statistics. As a consequence, static query optimization can produce inefficient execution plans. These constraints show that adaptive query optimization should be used for federated query processing on Linked Data. In this thesis, we first propose an adaptive join operator which aims to minimize the response time and the completion time for federated queries over SPARQL endpoints. Second, we extend the first proposal to further reduce the completion time. Both proposals can change the join method and the join order during the execution by using adaptive query optimization. The proposed operators can handle different data arrival rates of relations and the lack of statistics about them. The performance evaluation of this thesis shows the efficiency of the proposed adaptive operators. They provide faster completion times and almost the same response times, compared to symmetric hash join. Compared to bind join, the proposed operators perform substantially better with respect to the response time and can also provide faster completion times. In addition, the second proposed operator provides considerably faster response time than bind-bloom join and can improve the completion time as well. The second proposal also provides faster completion times than the first proposal in all conditions. In conclusion, the proposed adaptive join operators provide the best trade-off between the response time and the completion time. Even though our main objective is to manage different data arrival rates of relations, the performance evaluation reveals that they are successful in both fixed and different data arrival rates.
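For readers unfamiliar with the baseline named in this abstract, here is a minimal sketch of a symmetric hash join. It is the comparison operator, not the adaptive operator proposed in the thesis. The `arrivals` stream of `("L" | "R", row)` pairs and the dict-shaped rows are assumptions made for illustration.

```python
from collections import defaultdict


def symmetric_hash_join(arrivals, left_key, right_key):
    """Yield joined (left_row, right_row) pairs as soon as both sides have arrived.

    arrivals -- iterable of (side, row), side is "L" or "R", row is a dict
    """
    left_table = defaultdict(list)   # join key -> left rows seen so far
    right_table = defaultdict(list)  # join key -> right rows seen so far
    for side, row in arrivals:
        if side == "L":
            key = row[left_key]
            left_table[key].append(row)          # build into own table
            for match in right_table.get(key, []):
                yield (row, match)               # probe the opposite table
        else:
            key = row[right_key]
            right_table[key].append(row)
            for match in left_table.get(key, []):
                yield (match, row)
```

Because each arriving tuple is both inserted and probed immediately, the first result appears as soon as a matching pair exists, which is why this operator is a natural reference point for response time.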
Gillani, Syed. "Semantically-enabled stream processing and complex event processing over RDF graph streams." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSES055/document.
There is a paradigm shift in the nature and processing means of today’s data: data used to be mostly static and stored in large databases to be queried. Today, with the advent of new applications and means of collecting data, most applications on the Web and in enterprises produce data in a continuous manner in the form of streams. Thus, the users of these applications expect to process a large volume of data with fresh, low-latency results. This has resulted in the introduction of Data Stream Management Systems (DSMSs) and a Complex Event Processing (CEP) paradigm – both with distinctive aims: DSMSs are mostly employed to process traditional query operators (mostly stateless), while CEP systems focus on temporal pattern matching (stateful operators) to detect changes in the data that can be thought of as events. In the past decade or so, a number of scalable and performance-intensive DSMSs and CEP systems have been proposed. Most of them, however, are based on relational data models, which raises the question of support for heterogeneous data sources, i.e., variety of the data. Work in RDF stream processing (RSP) systems partly addresses the challenge of variety by promoting the RDF data model. Nonetheless, challenges like volume and velocity are overlooked by existing approaches. These challenges require customised optimisations which consider RDF as a first-class citizen and scale the process of continuous graph pattern matching. To gain insights into these problems, this thesis focuses on developing scalable RDF graph stream processing, and semantically-enabled CEP systems (i.e., Semantic Complex Event Processing, SCEP). In addition to our optimised algorithmic and data structure methodologies, we also contribute to the design of a new query language for SCEP. Our contributions in these two fields are as follows:
• RDF Graph Stream Processing. We first propose an RDF graph stream model, where each data item/event within streams is comprised of an RDF graph (a set of RDF triples). Second, we implement customised indexing techniques and data structures to continuously process RDF graph streams in an incremental manner.
• Semantic Complex Event Processing. We extend the idea of RDF graph stream processing to enable SCEP over such RDF graph streams, i.e., temporal pattern matching. Our first contribution in this context is to provide a new query language that encompasses the RDF graph stream model and employs a set of expressive temporal operators such as sequencing, kleene-+, negation, optional, conjunction, disjunction and event selection strategies. Based on this, we implement a scalable system that employs a non-deterministic finite automata model to evaluate these operators in an optimised manner.
We leverage techniques from diverse fields, such as relational query optimisations, incremental query processing, sensor and social networks in order to solve real-world problems. We have applied our proposed techniques to a wide range of real-world and synthetic datasets to extract the knowledge from RDF structured data in motion. Our experimental evaluations confirm our theoretical insights, and demonstrate the viability of our proposed methods.
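The automaton-based evaluation mentioned in this abstract can be illustrated, for the sequencing operator only, with a small non-deterministic matcher. This is a sketch under strong simplifying assumptions: events are plain tuples rather than RDF graphs, the pattern is a list of predicates, and every partial match is kept alive so that later events can also extend it. The function name `match_sequence` and this selection strategy are illustrative choices, not the thesis's query language or engine.

```python
def match_sequence(events, pattern):
    """Return all matches of a sequence pattern over an event stream.

    pattern -- list of predicates; state i of the automaton waits for an
               event satisfying pattern[i]
    A "run" is the tuple of events matched so far by one automaton instance.
    """
    runs = []
    matches = []
    for ev in events:
        new_runs = []
        for run in runs:
            # Non-determinism: the run may ignore this event and stay alive.
            new_runs.append(run)
            if pattern[len(run)](ev):
                extended = run + (ev,)
                if len(extended) == len(pattern):
                    matches.append(extended)   # accepting state reached
                else:
                    new_runs.append(extended)  # advance to the next state
        if pattern[0](ev):
            # Every event that satisfies the first step starts a new run.
            if len(pattern) == 1:
                matches.append((ev,))
            else:
                new_runs.append((ev,))
        runs = new_runs
    return matches
```

Keeping every run alive makes the number of runs grow with the stream, which is one reason real engines pair such automata with windows and stricter event selection strategies.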
Alrammal, Muath. "Algorithms for XML stream processing : massive data, external memory and scalable performance." Phd thesis, Université Paris-Est, 2011. http://tel.archives-ouvertes.fr/tel-00779309.
Phan, Duy-Hung. "Algorithmes d'aggrégation pour applications Big Data." Electronic Thesis or Diss., Paris, ENST, 2016. http://www.theses.fr/2016ENST0043.
Traditional databases face problems of scalability and efficiency when dealing with vast amounts of data. Thus, modern data management systems that scale to thousands of nodes, like Apache Hadoop and Spark, have emerged and become the de facto platforms for processing data at massive scales. In such systems, many data processing optimizations that were well studied in the database domain have become futile because of the novel architectures and programming models. In this context, this dissertation set out to optimize one of the most predominant operations in data processing for such systems: data aggregation. Our main contributions were the logical and physical optimizations for large-scale data aggregation, including several algorithms and techniques. These optimizations are so intimately related that without one or the other, the data aggregation optimization problem would not be solved entirely. Moreover, we integrated these optimizations in our multi-query optimization engine, which is totally transparent to users. The engine and the logical and physical optimizations proposed in this dissertation formed a complete package that is runnable and ready to answer data aggregation queries at massive scales. We evaluated our optimizations both theoretically and experimentally. The theoretical analyses showed that our algorithms and techniques are much more scalable and efficient than prior works. The experimental results, using a real cluster with synthetic and real datasets, confirmed our analyses, showed a significant performance boost and revealed various angles of our work. Last but not least, our work is published as open source for public use and study.
Camacho, Rodriguez Jesus. "Efficient techniques for large-scale Web data management." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112229/document.
The recent development of commercial cloud computing environments has strongly impacted research and development in distributed software platforms. Cloud providers offer a distributed, shared-nothing infrastructure that may be used for data storage and processing. In parallel with the development of cloud platforms, programming models that seamlessly parallelize the execution of data-intensive tasks over large clusters of commodity machines have received significant attention, starting with the now well-known MapReduce model and continuing through other novel and more expressive frameworks. As these models are increasingly used to express analytical-style data processing tasks, the need arises for higher-level languages that ease the burden of writing complex queries for these systems. This thesis investigates the efficient management of Web data on large-scale infrastructures. In particular, we study the performance and cost of exploiting cloud services to build Web data warehouses, and the parallelization and optimization of query languages that are tailored towards querying Web data declaratively. First, we present AMADA, an architecture for warehousing large-scale Web data in commercial cloud platforms. AMADA operates in a Software as a Service (SaaS) approach, allowing users to upload, store, and query large volumes of Web data. Since cloud users bear monetary costs directly connected to their consumption of resources, our focus is not only on query performance from an execution time perspective, but also on the monetary costs associated with this processing. In particular, we study the applicability of several content indexing strategies, and show that they lead not only to reducing query evaluation time, but also, importantly, to reducing the monetary costs associated with the exploitation of the cloud-based warehouse. Second, we consider the efficient parallelization of the execution of complex queries over XML documents, implemented within our system PAXQuery. We provide novel algorithms showing how to translate such queries into plans expressed in the PArallelization ConTracts (PACT) programming model. These plans are then optimized and executed in parallel by the Stratosphere system. We demonstrate the efficiency and scalability of our approach through experiments on hundreds of GB of XML data. Finally, we present a novel approach for identifying and reusing common subexpressions occurring in Pig Latin scripts. In particular, we lay the foundation of our reuse-based algorithms by formalizing the semantics of the Pig Latin query language with extended nested relational algebra for bags. Our algorithm, named PigReuse, operates on the algebraic representations of Pig Latin scripts, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and merges other equivalent expressions to share their results. We bring several extensions to the algorithm to improve its performance. Our experimental results demonstrate the efficiency and effectiveness of our reuse-based algorithms and optimization strategies.
Geng, Ke. "XML semantic query optimisation." Thesis, University of Auckland, 2011. http://hdl.handle.net/2292/6815.
Papadopoulos, Stavros. "Authenticated query processing /." View abstract or full-text, 2010. http://library.ust.hk/cgi/db/thesis.pl?CSED%202010%20PAPADO.
Khelil, Abdallah. "Gestion et optimisation des données massives issues du Web Combining graph exploration and fragmentation for scalable rdf query processing Should We Be Afraid of Querying Billions of Triples in a Graph-Based Centralized System? EXGRAF : Exploration et Fragmentation de Graphes au Service du Traitement Scalable de Requêtes RDF". Thesis, Chasseneuil-du-Poitou, Ecole nationale supérieure de mécanique et d'aérotechnique, 2020. http://www.theses.fr/2020ESMA0009.
Big Data represents a challenge not only for the socio-economic world but also for scientific research. Indeed, as has been pointed out in several scientific articles and strategic reports, modern computer applications are facing new problems and issues that are mainly related to the storage and the exploitation of data generated by modern observation and simulation instruments. The management of such data represents a real bottleneck which slows down the exploitation of the various data collected not only in the framework of international scientific programs but also by companies, the latter relying increasingly on the analysis of large-scale data. Much of this data is published today on the Web. Indeed, we are witnessing an evolution of the traditional Web, designed basically to manage documents, to a Web of data that offers mechanisms for querying semantic information. Several data models have been proposed to represent this information on the Web. The most important is the Resource Description Framework (RDF), which provides a simple and abstract representation of knowledge for resources on the Web. Each Semantic Web fact can be encoded with an RDF triple. In order to explore and query structured information expressed in RDF, several query languages have been proposed over the years. In 2008, SPARQL became the official W3C Recommendation language for querying RDF data. The need to efficiently manage and query RDF data has led to the development of new systems specifically designed to process this data format. These approaches can be categorized as centralized, relying on a single machine to manage RDF data, and distributed, combining multiple machines connected by a computer network. Some of these approaches are based on an existing data management system, such as Virtuoso and Jena; others rely on an approach specifically designed for the management of RDF triples, such as GRIN, RDF-3X and gStore. With the evolution of RDF datasets (e.g. DBPedia) and SPARQL, most systems have become obsolete and/or inefficient. For example, none of the existing centralized systems is able to manage the 1 billion triples provided under the WatDiv benchmark. Distributed systems would, under certain conditions, improve this point, but at the cost of a performance degradation. In this PhD thesis, we propose the centralized system "RDF_QDAG", which finds a good compromise between scalability and performance. We propose to combine physical data fragmentation and data graph exploration. "RDF_QDAG" supports multiple types of queries, based not only on basic graph patterns but also on filters using regular expressions and on aggregation and sorting functions. "RDF_QDAG" relies on the Volcano execution model, which allows controlling the main memory and avoiding any overflow even if the hardware configuration is limited. To the best of our knowledge, "RDF_QDAG" is the only centralized system that achieves good performance when managing several billion triples. We compared this system with other systems that represent the state of the art in RDF data management: a relational approach (Virtuoso), a graph-based approach (g-Store), an intensive indexing approach (RDF-3X) and two parallel approaches (CliqueSquare and g-Store-D). "RDF_QDAG" surpasses existing systems when it comes to ensuring both scalability and performance.
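The memory behaviour attributed to the Volcano execution model in this abstract comes from its pull-based iterator interface: each operator exposes `open`, `next`, and `close`, and produces one tuple per call. The following is a generic sketch of that interface, not RDF_QDAG's actual operators. The `Scan` and `Select` classes, the `run` helper, and the use of `None` as the end-of-stream marker are assumptions for illustration.

```python
class Scan:
    """Leaf operator: iterates over an in-memory list of triples."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Select:
    """Filter operator: pulls from its child until a row passes the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()


def run(plan):
    """Drive a plan to completion, pulling one row at a time from the root."""
    plan.open()
    out = []
    while True:
        row = plan.next()
        if row is None:
            break
        out.append(row)
    plan.close()
    return out
```

Only the row currently travelling up the operator tree needs to be held by the pipeline; `run` collects results into a list purely so the example is easy to inspect.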
Lu, Yu-En. "Distributed proximity query processing." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612165.
He, Bingsheng. "Cache-oblivious query processing /." View abstract or full-text, 2008. http://library.ust.hk/cgi/db/thesis.pl?CSED%202008%20HE.
Jiang, Zhewei. "On XML query processing /." Available to subscribers only, 2008. http://proquest.umi.com/pqdweb?did=1757062851&sid=2&Fmt=2&clientId=1509&RQT=309&VName=PQD.
"Department of Electrical and Computer Engineering." Keywords: XML, Queries, Processing, Query processing. Includes bibliographical references (p. 58-65). Also available online.
Marathe, Arunprasad Prabhakar. "Query processing techniques for arrays." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/NQ60556.pdf.
Stokes, Alan Barry. "Resilient sensor network query processing." Thesis, University of Manchester, 2014. https://www.research.manchester.ac.uk/portal/en/theses/resilient-sensor-network-query-processing(208a729a-5d48-47a9-b1f5-3d156932e197).html.
Bondiombouy, Carlyna. "Query Processing in Multistore Systems." Thesis, Montpellier, 2017. http://www.theses.fr/2017MONTS056/document.
Cloud computing is having a major impact on data management, with a proliferation of new, scalable data management solutions such as distributed file and object storage, NoSQL databases and big data processing frameworks. This also leads to a wide diversification of DBMS interfaces and the loss of a common programming paradigm, making it very hard for users to integrate their data sitting in specialized data stores, e.g. relational, document and graph data stores. In this thesis, we address the problem of query processing with multiple cloud data stores, where the data stores have different models, languages and APIs. This thesis has been prepared in the context of the CoherentPaaS European project and, in particular, the CloudMdsQL multistore system. CloudMdsQL is a functional query language able to exploit the full power of local data stores, by simply allowing some local data store native queries to be called as functions, and at the same time be optimized, e.g. by pushing down select predicates, using bind join, performing join ordering, or planning intermediate data shipping. In this thesis, we propose an extension of CloudMdsQL to take full advantage of the functionality of the underlying data processing frameworks such as Spark by allowing the ad-hoc usage of user-defined map/filter/reduce (MFR) operators in combination with traditional SQL statements. This allows performing joins between relational and HDFS big data. Our solution allows for optimization by enabling subquery rewriting so that bind join can be used and filter conditions can be pushed down and applied by the data processing framework as early as possible. We validated our solution by implementing the MFR extension as part of the CloudMdsQL query engine. Based on this prototype, we provide an experimental validation of multistore query processing in a cluster to evaluate the impact of optimization on performance. More specifically, we explore the performance benefit of using bind join and select pushdown under different conditions. Overall, our performance evaluation illustrates the CloudMdsQL query engine’s ability to optimize a query and choose the most efficient execution strategy.
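The bind join optimization referred to in this abstract can be sketched as follows: the join keys found in the outer result are passed to the inner data store as a filter, so that only matching inner rows are shipped back. This is a generic illustration and not CloudMdsQL code. The `bind_join` function, the `inner_query` callback (standing in for a native subquery with a pushed-down key filter), and the dict-shaped rows are hypothetical.

```python
def bind_join(outer_rows, outer_key, inner_query, inner_key):
    """Join outer_rows with an inner store, shipping only matching inner rows.

    inner_query -- callable taking a list of keys and returning the inner
                   rows whose inner_key is in that list (hypothetical stand-in
                   for a subquery rewritten with an IN-style filter)
    """
    # Step 1: collect the distinct join keys produced by the outer side.
    keys = sorted({row[outer_key] for row in outer_rows})
    # Step 2: push the keys down so the inner store filters before shipping.
    inner_rows = inner_query(keys)
    # Step 3: finish with an ordinary hash join on the reduced inner input.
    index = {}
    for row in inner_rows:
        index.setdefault(row[inner_key], []).append(row)
    return [
        {**outer, **inner}
        for outer in outer_rows
        for inner in index.get(outer[outer_key], [])
    ]
```

The approach pays off when the outer side is selective, because the inner store then returns a small fraction of its data; with many distinct outer keys the pushed-down filter itself becomes expensive.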
Prasher, Sham. "Query processing in multiresolution spatial databases /." [St. Lucia, Qld.], 2005. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe18682.pdf.
Yiu, Man-lung. "Advanced query processing on spatial networks." Click to view the E-thesis via HKUTO, 2006. http://sunzi.lib.hku.hk/hkuto/record/B36279365.
Chen, Yingwen. "XQuery Query Processing in Relational Systems." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/1201.
Nagel, Fabian Oliver. "Efficient query processing in managed runtimes." Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/15869.
Lei, Ma. "Distributed query processing using composite semijoins." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/MQ62238.pdf.
Katchaounov, Timour. "Query Processing for Peer Mediator Databases." Doctoral thesis, Uppsala : Acta Universitatis Upsaliensis : Univ.-bibl. [distributör], 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-3687.
Liu, Ying. "Query optimization for distributed stream processing." [Bloomington, Ind.] : Indiana University, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3274258.
Source: Dissertation Abstracts International, Volume: 68-07, Section: B, page: 4597. Adviser: Beth Plale. Title from dissertation home page (viewed Apr. 21, 2008).
Kissinger, Thomas, Benjamin Schlegel, Dirk Habich, and Wolfgang Lehner. "QPPT: Query Processing on Prefix Trees." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-113269.
Eurviriyanukul, Kwanchai. "Adaptive query processing in pipelined plans." Thesis, University of Manchester, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.492914.
Yiu, Man-lung, and 姚文龍. "Advanced query processing on spatial networks." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B36279365.
Liu, Fuyu. "Query processing in location-based services." Doctoral diss., University of Central Florida, 2010. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4634.
ID: 029050964; System requirements: World Wide Web browser and PDF reader.; Mode of access: World Wide Web.; Thesis (Ph.D.)--University of Central Florida, 2010.; Includes bibliographical references (p. 138-145).
Ph.D.
Doctorate
Department of Electrical Engineering and Computer Science
Engineering and Computer Science
Zhang, Yang S. M. Massachusetts Institute of Technology. "ICEDB : intermittently-connected continuous query processing." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/43064.
Includes bibliographical references (leaves 61-64).
Several emerging wireless sensor network applications must cope with a combination of node mobility (e.g., sensors on moving cars) and high data rates (media-rich sensors capturing videos, images, sounds, etc.). Due to their mobility, these sensor networks display intermittent and variable network connectivity, and often have to deliver large quantities of data relative to the bandwidth available during periods of connectivity. Unfortunately, existing distributed data management and stream processing are not appropriate for such applications because they assume that the network connecting nodes in the data processor is "always on," and that the absence of a network connection is a fault that needs to be masked to avoid failure. This thesis describes ICEDB (Intermittently Connected Embedded Database), a continuous query processing system for intermittently connected mobile sensor networks. ICEDB incorporates two key ideas: (1) a delay-tolerant continuous query processor, coordinated by a central server and distributed across the mobile nodes, and (2) algorithms for prioritizing certain query results to improve application-defined "utility" metrics. We describe the results of several experiments that use data collected from a deployed fleet of cabs driving in Boston.
by Yang Zhang.
S.M.
Shenoy, Sreekumar Thrivikrama. "Semantic query processing in database systems." Case Western Reserve University School of Graduate Studies / OhioLINK, 1990. http://rave.ohiolink.edu/etdc/view?acc_num=case1054585075.
Xu, Cheng. "Authenticated query processing in the cloud." HKBU Institutional Repository, 2019. https://repository.hkbu.edu.hk/etd_oa/620.
Tran, Van-Hoang. "Range query processing over untrustworthy clouds." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S072.
Cloud computing has increasingly become a standard for saving costs and enabling elasticity. While cloud providers expand their services, concerns about the security of outsourced data hinder the widespread adoption of cloud technologies. To address this, encryption is usually used to protect confidential data stored and processed on untrustworthy clouds. Encrypting outsourced data, however, limits the functionality of applications, since support for some fundamental functions on encrypted data is still limited. This thesis focuses on the problem of supporting range queries over encrypted data stored on clouds. Many studies have been introduced in this line of work. Nevertheless, none of the prior schemes exhibits satisfactory performance for modern systems, which require not only low-latency responses but also high scalability. In particular, most existing solutions suffer from either inefficient range query processing or privacy leaks. Even if some can achieve both strong privacy protection and fast processing, they do not satisfy scalability requirements, namely high ingestion throughput, practical storage overhead, and lightweight updates. To overcome this limitation, we propose scalable solutions for secure range query processing that still preserve efficiency and strong security. Our contributions are: (1) We adapt one of the state-of-the-art solutions to the context of a high rate of incoming data, which often creates bottlenecks. In other words, we introduce and integrate the notion of index template into one of the state-of-the-art solutions so that it can cope with the target context. (2) We develop an intensive ingestion framework dedicated to secure range query processing on encrypted data. In particular, we re-design the architecture of the first contribution to make it fully distributed. A data presentation and an asynchronous method are then introduced. Together, they significantly increase the intake ability of the system. Besides, we adapt the framework to a stronger type of adversaries (e.g., online attackers) and enhance its practicality. (3) We propose a scalable scheme for private range query processing on outsourced datasets. This scheme addresses the need for a scalable solution in terms of efficiency, high security, practical storage overhead, and numerous updates, which cannot be supported by existing protocols. To this end, we develop our solution relying on equal-size chunks (buckets) of data and secure indexes. The former helps to protect the privacy of the underlying data from the adversary, while the latter enables efficiency. To support lightweight updates, we propose to decouple secure indexes from their buckets by using equal-size bitmaps.
Cheng, James Sheung-Chak. "Efficient query processing on graph databases /." View abstract or full-text, 2008. http://library.ust.hk/cgi/db/thesis.pl?CSED%202008%20CHENG.
Unnava, Vasundhara. "Query processing in distributed database systems." Connect to resource, 1992. http://rave.ohiolink.edu/etdc/view.cgi?acc%5Fnum=osu1261314105.
Sankaranarayanan, Jagan. "Scalable query processing on spatial networks." College Park, Md.: University of Maryland, 2008. http://hdl.handle.net/1903/8130.
Thesis research directed by: Dept. of Computer Science. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
Ives, Zachary G. "Efficient query processing for data integration /." Thesis, Connect to this title online; UW restricted, 2002. http://hdl.handle.net/1773/6864.
Lian, Xiang. "Efficient query processing over uncertain data /." View abstract or full-text, 2009. http://library.ust.hk/cgi/db/thesis.pl?CSED%202009%20LIAN.
Frosini, Riccardo. "Flexible query processing of SPARQL queries." Thesis, Birkbeck (University of London), 2018. http://bbktheses.da.ulcc.ac.uk/319/.
Kotto, Kombi Roland. "Distributed query processing over fluctuating streams." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEI050/document.
In a Big Data context, stream processing has become a very active research domain. In order to manage ephemeral data (Velocity) arriving at high rates (Volume), specific solutions, called data stream management systems (DSMSs), have been developed. DSMSs take as input queries, called continuous queries, defined on a set of data streams. A continuous query generates new results as long as new data arrive as input. In many application domains, data streams have input rates and value distributions that change over time. These variations may significantly impact the processing requirements of each continuous query. This thesis takes place in the ANR project Socioplug (ANR-13-INFR-0003). In this context, we consider a collaborative platform for stream processing. Each user can submit multiple continuous queries and contributes to the execution support of the platform. However, as each processing unit supporting treatments has limited resources in terms of CPU and memory, a significant increase in input rate may cause congestion of the system. The problem is then how to dynamically adjust resource usage to the processing requirements of each continuous query. It raises several challenges: (i) how to detect a need for reconfiguration? (ii) when to reconfigure the system to avoid its congestion at runtime? In this work, we are interested in the different processing steps involved in the treatment of a continuous query over a distributed infrastructure. From this global analysis, we extract mechanisms enabling dynamic adaptation of resource usage for each continuous query. We focus on automatic parallelization, or auto-parallelization, of the operators composing the execution plan of a continuous query. We suggest an original approach based on the monitoring of operators and an estimation of processing requirements in the near future. Thus, we can increase (scale-out) or decrease (scale-in) the parallelism degree of operators in a proactive manner, so that resource usage fits processing requirements dynamically. Compared to a static configuration defined by an expert, we show that it is possible to avoid congestion of the system in many cases, or to delay it in the most critical cases. Moreover, we show that resource usage can be reduced significantly while delivering equivalent throughput and result quality. We also suggest combining this approach with complementary mechanisms for dynamic adaptation of continuous queries at runtime. These different approaches have been implemented within a widely used DSMS and have been tested over multiple reproducible micro-benchmarks.
Lei, Chuan. "Recurring Query Processing on Big Data." Digital WPI, 2015. https://digitalcommons.wpi.edu/etd-dissertations/550.
Yi, Peipei. "Graph query autocompletion." HKBU Institutional Repository, 2018. https://repository.hkbu.edu.hk/etd_oa/557.
Alyoubi, Khaled Hamed. "Database query optimisation based on measures of regret." Thesis, Birkbeck (University of London), 2016. http://bbktheses.da.ulcc.ac.uk/224/.
Luong, Vu Ngoc Duy. "Optimisation for image processing." Thesis, Imperial College London, 2014. http://hdl.handle.net/10044/1/24904.
Golab, Lukasz. "Sliding Window Query Processing over Data Streams." Thesis, University of Waterloo, 2006. http://hdl.handle.net/10012/2930.
This dissertation begins with the observation that the two fundamental requirements of a DSMS are dealing with transient (time-evolving) rather than static data and answering persistent rather than transient queries. One implication of the first requirement is that data maintenance costs have a significant effect on the performance of a DSMS. Additionally, traditional query processing algorithms must be re-engineered for the sliding window model because queries may need to re-process expired data and "undo" previously generated results. The second requirement suggests that a DSMS may execute a large number of persistent queries at the same time, therefore there exist opportunities for resource sharing among similar queries.
The purpose of this dissertation is to develop solutions for efficient query processing over sliding windows by focusing on these two fundamental properties. In terms of the transient nature of streaming data, this dissertation is based upon the following insight. Although the data keep changing over time as the windows slide forward, the changes are not random; on the contrary, the inputs and outputs of a DSMS exhibit patterns in the way the data are inserted and deleted. It will be shown that the knowledge of these patterns leads to an understanding of the semantics of persistent queries, lower window maintenance costs, as well as novel query processing, query optimization, and concurrency control strategies. In the context of the persistent nature of DSMS queries, the insight behind the proposed solution is that various queries may need to be refreshed at different times, therefore synchronizing the refresh schedules of similar queries creates more opportunities for resource sharing.
Thomo, Alex-Imir. "Query processing using views in semistructured databases." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/MQ59343.pdf.
Ding, Luping. "Metadata-aware query processing over data streams." Worcester, Mass. : Worcester Polytechnic Institute, 2008. http://www.wpi.edu/Pubs/ETD/Available/etd-042208-194826/.
Wu, Hejun. "Scheduling for in-network sensor query processing /." View abstract or full-text, 2008. http://library.ust.hk/cgi/db/thesis.pl?CSED%202008%20WU.
Jonassen, Simon. "Efficient Query Processing in Distributed Search Engines." Doctoral thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-20206.
Mühleisen, Hannes [Verfasser]. "Architecture-independent distributed query processing / Hannes Mühleisen." Berlin : Freie Universität Berlin, 2013. http://d-nb.info/1031100261/34.