Dissertations / Theses on the topic 'Distributed information retrieval'
Consult the top 50 dissertations / theses for your research on the topic 'Distributed information retrieval.'
Craswell, Nicholas Eric. "Methods for Distributed Information Retrieval." The Australian National University. Faculty of Engineering and Information Technology, 2001. http://thesis.anu.edu.au./public/adt-ANU20020315.142540.
Powell, Allison L. "Database selection in distributed information retrieval: a study of multi-collection information retrieval." 2001. http://viva.lib.virginia.edu/etd/diss/SEAS/ComputerScience/2001/Powell/etd.pdf.
Baumgarten, Christoph. "Probabilistic information retrieval in a distributed heterogeneous environment." Doctoral thesis, [S.l. : s.n.], 1999. http://deposit.ddb.de/cgi-bin/dokserv?idn=963555316.
Baumgarten, Christoph. "Probabilistic information retrieval in a distributed heterogeneous environment." Doctoral thesis, Technische Universität Dresden, 1998. https://tud.qucosa.de/id/qucosa%3A24785.
Yang, Hui. "Methodologies for information source selection under distributed information environments." 2005. http://www.library.uow.edu.au/adt-NWU/public/adt-NWU20060511.123303/index.html.
Liu, Yang. "A resource aware distributed LSI algorithm for scalable information retrieval." Thesis, Brunel University, 2011. http://bura.brunel.ac.uk/handle/2438/5559.
Fu, R. "The quality of probabilistic search in unstructured distributed information retrieval systems." Thesis, University College London (University of London), 2012. http://discovery.ucl.ac.uk/1370031/.
Schatz, Bruce Raymond. "Interactive retrieval in information spaces distributed across a wide-area network." Diss., The University of Arizona, 1991. http://hdl.handle.net/10150/185363.
Tran, Allen Quoc-Luan. "A network management facility for a fault-tolerant distributed information retrieval system." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0010/MQ53394.pdf.
Macfarlane, Andrew. "Distributed inverted files and performance : a study of parallelism and data distribution methods in IR." Thesis, City University London, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.342722.
Shokouhi, Milad. "Federated Text Retrieval from Independent Collections." RMIT University. Computer Science and Information Technology, 2008. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080521.151632.
Scherle, Ryan. "Looking for a haystack: selecting data sources in a distributed retrieval system." [Bloomington, Ind.] : Indiana University, 2006. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3240033.
"Title from dissertation home page (viewed July 17, 2007)." Source: Dissertation Abstracts International, Volume: 67-10, Section: B, page: 5859. Advisers: David B. Leake; Michael Gasser.
Al-Shakarchi, Ahmad. "Scalable audio processing across heterogeneous distributed resources : an investigation into distributed audio processing for Music Information Retrieval." Thesis, Cardiff University, 2013. http://orca.cf.ac.uk/47855/.
Stegmaier, Florian [author], and Harald [academic supervisor] Kosch. "Unified Retrieval in Distributed and Heterogeneous Multimedia Information Systems / Florian Stegmaier. Supervisor: Harald Kosch." Passau : Universitätsbibliothek der Universität Passau, 2014. http://d-nb.info/1053119267/34.
Abusukhon, Ahmad Salameh. "An investigation into improving the load balance and query throughput of distributed information retrieval." Thesis, University of Bath, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.505715.
Lu, Chengye. "Peer to peer English/Chinese cross-language information retrieval." Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/26444/1/Chengye_Lu_Thesis.pdf.
Lu, Chengye. "Peer to peer English/Chinese cross-language information retrieval." Queensland University of Technology, 2008. http://eprints.qut.edu.au/26444/.
Li, Xiaodong. "RDSS: a reliable and efficient distributed storage system." Ohio University / OhioLINK, 2004. http://www.ohiolink.edu/etd/view.cgi?ohiou1103127547.
Chilappagari, Sairam. "Role of web services for globally distributed information retrieval systems in a grid environment: implementation and performance analysis of a prototype." Fairfax, VA : George Mason University, 2008. http://hdl.handle.net/1920/3220.
Vita: p. 108. Thesis director: J. Mark Pullen. Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science. Title from PDF t.p. (viewed Aug. 28, 2008). Includes bibliographical references (p. 101-107). Also issued in print.
Petratos, Panagiotis. "A heuristic information retrieval study : an investigation of methods for enhanced searching of distributed data objects exploiting bidirectional relevance feedback." Thesis, University of Bedfordshire, 2004. http://hdl.handle.net/10547/319931.
Milliner, Stephen William. "Dynamic resolution of conceptual heterogeneity in large scale distributed information systems." Thesis, Queensland University of Technology, 2001.
Koch, Douglas J. "Positioning the Reserve Headquarters Support (RHS) system for multi-layered enterprise use." Thesis, Monterey, California : Naval Postgraduate School, 2009. http://edocs.nps.edu/npspubs/scholarly/theses/2009/Sep/09Sep%5FKoch.pdf.
Thesis Advisor(s): Cook, Glenn. "September 2009." Description based on title screen as viewed on 6 November 2009. Author(s) subject terms: Enterprise architecture, project management, business process transformation, operating model, IT governance, IT systems, data quality, data migration, business operating model, personnel IT systems, HRM, ERP. Includes bibliographical references (p. 89-92). Also available in print.
Miranda, Ackerman Eduardo Jacobo. "Extracting Causal Relations between News Topics from Distributed Sources." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-130066.
Paques, Henrique Wiermann. "The Ginga Approach to Adaptive Query Processing in Large Distributed Systems." Diss., Georgia Institute of Technology, 2003. http://hdl.handle.net/1853/5277.
Wang, Xinyu. "Toward Scalable Hierarchical Clustering and Co-clustering Methods : application to the Cluster Hypothesis in Information Retrieval." Thesis, Lyon, 2017. http://www.theses.fr/2017LYSE2123/document.
As a major type of unsupervised machine learning method, clustering has been widely applied in various tasks. Different clustering methods have different characteristics. Hierarchical clustering, for example, can output a binary tree-like structure that explicitly illustrates the interconnections among data instances. Co-clustering, on the other hand, generates co-clusters, each containing a subset of data instances and a subset of data attributes. Applying clustering to textual data makes it possible to organize input documents and reveal connections among them. This characteristic is helpful in many cases, for example in cluster-based Information Retrieval tasks. As the size of available data increases, the demand for computing power increases. In response to this demand, many distributed computing platforms have been developed. These platforms use the collective computing power of commodity machines to parallelize data, assign computing tasks and perform computation concurrently. In this thesis, we first address text clustering tasks by proposing two clustering methods, Sim_AHC and SHCoClust. They respectively represent a similarity-based hierarchical clustering and a similarity-based hierarchical co-clustering. We examine their properties and performance through mathematical deduction, experimental verification and evaluation. Then we apply these methods in testing the cluster hypothesis, which is the fundamental assumption in cluster-based Information Retrieval. In such tests, we apply the optimal cluster search to evaluate the retrieval effectiveness of different clustering methods. We examine the computing efficiency and compare the results of the proposed tests. In order to perform clustering on larger datasets, we select the Apache Spark platform and provide distributed implementations of Sim_AHC and of SHCoClust. For distributed Sim_AHC, we present the designed computing procedure, illustrate the difficulties encountered and provide possible solutions. For SHCoClust, we provide a distributed implementation of its core, spectral embedding. In this implementation, we use several datasets that vary in size to examine scalability.
Ducrou, Amanda Joanne. "Complete interoperability in healthcare technical, semantic and process interoperability through ontology mapping and distributed enterprise integration techniques /." Access electronically, 2009. http://ro.uow.edu.au/theses/3048.
Osborn, Viola. "Identifying At-Risk Students: An Assessment Instrument for Distributed Learning Courses in Higher Education." Thesis, University of North Texas, 2000. https://digital.library.unt.edu/ark:/67531/metadc2457/.
Nguyen, The An Binh [author], Ralf [academic supervisor] Steinmetz, and Michael [academic supervisor] Zink. "Quality-aware Tasking in Mobile Opportunistic Networks - Distributed Information Retrieval and Processing utilizing Opportunistic Heterogeneous Resources. / The An Binh Nguyen ; Ralf Steinmetz, Michael Zink." Darmstadt : Universitäts- und Landesbibliothek Darmstadt, 2018. http://d-nb.info/1167926331/34.
Brand, Jacobus Edwin. "An instrument analysis system based on a modern relational database and distributed software architecture." Thesis, Stellenbosch : Stellenbosch University, 2003. http://hdl.handle.net/10019.1/53269.
ENGLISH ABSTRACT: This document discusses the development of a personal-computer-based financial instrument analysis system that draws on information from a relatively old, sequential-file-based data source. The aim is to modernise the system to use the latest software and data storage technology. The principles used for the design of the system are discussed in Chapter 2. Principles for the development of relational databases are discussed first, after which the development of personal-computer-based software architecture is discussed, to explain the choices made in the design of the system. Chapter 3 discusses the design and implementation of the system in more detail, based on the principles discussed in Chapter 2. Recommendations include a possible shift in architectural layout as well as recommendations for expansion of both the data stored and the analysis performed on the information.
AFRIKAANS SUMMARY (translated): This document discusses the development of a personal-computer-based financial instrument analysis system, based on information from a relatively old, sequential-file-based data source. The aim is to modernise the system so that it uses the latest software and hardware technology. The principles used in designing the system are discussed briefly in Chapter 2. The principles for designing a relational database are discussed first, followed by the development of personal-computer-based software architecture, to shed more light on the choices made in designing the system's architecture. Chapter 3 discusses the design and implementation of the system in more detail, based on the principles discussed in Chapter 2. Suggestions for improving the system include detailed changes to its architecture, as well as suggestions for extending the system with respect to the types of data stored and its analytical capabilities.
Augusto, Luiz Daniel Creao. "Arquitetura e implementação de um sistema distribuído e recuperação de informação." Universidade de São Paulo, 2010. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-12072010-110036/.
Finding relevant documents for the end user becomes more expensive as databases grow. Distributed systems offer a solution because of their scalability and fault tolerance. The development of systems focused on enormous databases, including the World Wide Web, is a worldwide industry worth billions of dollars and has created giants. This work presents and discusses data structures and distributed architectures for indexing and searching large document collections in distributed systems with high performance and scalability. We also discuss some of the biggest search engines, such as Google and Apache Solr, and the planning of an application with a prototype under development. Finally, a new design for a distributed search system, based on Lucene, is presented and implemented, combining ideas from other works with new ideas of our own. In our tests, the system developed in this work achieved 37.4% higher throughput than Apache Solr and outperformed non-distributed solutions running on hardware more expensive than our cluster.
Ives, Zachary G. "Efficient query processing for data integration." Thesis, 2002. http://hdl.handle.net/1773/6864.
Vilsmaier, Christian. "Contextualized access to distributed and heterogeneous multimedia data sources." Thesis, Lyon, INSA, 2014. http://www.theses.fr/2014ISAL0094/document.
Making multimedia data available online becomes less expensive and more convenient on a daily basis. This development promotes web phenomena such as Facebook, Twitter, and Flickr. These phenomena and their increased acceptance in society in turn multiply the number of images available online. This vast number of images, frequently public and therefore searchable, already exceeds the zettabyte bound. Executing a similarity search at the scale of the publicly available images and receiving a top-quality result is a challenge that the scientific community has recently attempted to rise to. One approach to this problem assumes the use of distributed heterogeneous Content Based Image Retrieval systems (CBIRs). Given this assumption, the problems that emerge from a distributed query scenario must be dealt with: for example, the involved CBIRs' use of distinct metadata formats for describing their content, as well as their unequal technical and structural information. An additional issue is the individual metrics that the CBIRs use to calculate the similarity between pictures, as well as the specific ways in which those metrics are combined. Overall, obtaining good results in this environment is a very labor-intensive task that has been explored scientifically but not yet comprehensively. The problem primarily addressed in this work is the collection of pictures from CBIRs that are similar to a given picture, as a response to a distributed multimedia query. The main contribution of this thesis is the construction of a network of Content Based Image Retrieval systems that are able to extract and exploit information about an input image's semantic concept. This so-called semantic CBIRn is mainly composed of CBIRs that are configured by the semantic CBIRn itself. In addition, specialized external sources can be integrated. The semantic CBIRn is able to collect and merge the results of all of these attached CBIRs. To integrate external sources that are willing to join the network but not willing to disclose their configuration, an algorithm was developed that approximates these configurations. By categorizing existing as well as external CBIRs and analyzing incoming queries, image queries are forwarded exclusively to the most suitable CBIRs. In this way, images that are of no use to the user can be omitted beforehand. The returned images are then made comparable so that they can be merged into a single result list of images similar to the input image. The feasibility of the approach and the resulting improvement of the search process are demonstrated by a prototype implementation. With this prototype, the number of returned images that share the semantic concept of the input image increases by a factor of 4.75 relative to a predefined non-semantic CBIRn.
Domínguez, Sal David. "Analysis and optimization of question answering systems." Doctoral thesis, Universitat Politècnica de Catalunya, 2010. http://hdl.handle.net/10803/78011.
Gerbier, Emilie. "Effet du type d’agencement temporel des répétitions d’une information sur la récupération explicite." Thesis, Lyon 2, 2011. http://www.theses.fr/2011LYO20029/document.
How information is repeated over time determines future recollection of this information. Studies in psychology revealed a distributed practice effect, that is, one retains information better when its occurrences are separated by long lags rather than by short lags. Our studies focused specifically on cases in which items were repeated over several days. We compared the efficiency of three different temporal schedules of repetitions: a uniform schedule, in which repetitions occurred at equal intervals; an expanding schedule, in which repetitions occurred at longer and longer intervals; and a contracting schedule, in which repetitions occurred at shorter and shorter intervals. In Experiments 1 and 2, the learning phase lasted one week and the retention interval lasted two days. It was shown that the expanding and uniform schedules were more efficient than the contracting schedule. In Experiment 3, the learning phase lasted two weeks and the retention interval lasted 2, 6, or 13 days. It was shown that the superiority of the expanding schedule over the other two schedules appeared gradually as the retention interval increased, suggesting that different schedules yielded different forgetting rates. We also tried to test major theories of the distributed practice effect, such as the encoding variability (Experiment 4) and the study-phase retrieval (Experiment 2) theories. Our results appeared to be consistent with the study-phase retrieval theory. We concluded our dissertation by emphasizing the importance of considering findings from other areas in cognitive science, especially neuroscience and computer science, in the study of the distributed practice effect.
Conte, Simone Ivan. "The Sea of Stuff : a model to manage shared mutable data in a distributed environment." Thesis, University of St Andrews, 2019. http://hdl.handle.net/10023/16827.
Harvesf, Cyrus Mehrabaun. "The design and implementation of a robust, cost-conscious peer-to-peer lookup service." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26559.
Committee Chair: Blough, Douglas; Committee Member: Liu, Ling; Committee Member: Owen, Henry; Committee Member: Riley, George; Committee Member: Yalamanchili, Sudhakar. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Cerqueus, Thomas. "Contributions au problème d'hétérogénéité sémantique dans les systèmes pair-à-pair : application à la recherche d'information." Phd thesis, Université de Nantes, 2012. http://tel.archives-ouvertes.fr/tel-00763914.
Villaça, Rodolfo da Silva 1974. "Hamming DHT e HCube : arquiteturas distribuídas para busca por similaridade." [s.n.], 2013. http://repositorio.unicamp.br/jspui/handle/REPOSIP/261007.
Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
Summary (translated from Portuguese): Currently, the amount of data available on the Internet exceeds the order of zettabytes (ZB), defining a scenario known in the literature as Big Data. Although traditional database solutions are efficient at searching for and retrieving specific, exact content, they are inefficient in this Big Data scenario, since they were not designed for it. Another difficulty is that these data are essentially unstructured and are scattered across the vastness of the Internet. New database infrastructure solutions are therefore needed to support the inexact search and retrieval of similar data, that is, similarity search: searching for groups of data that share some resemblance. In this scenario, the proposal of this thesis is to exploit the Hamming similarity that exists between object identifiers generated with the Random Hyperplane Hashing function. This property of the identifiers serves as the basis for proposals of distributed data storage infrastructures capable of efficiently supporting similarity search. This thesis presents Hamming DHT, a P2P solution based on overlay networks, and HCube, a server-based solution for data centers. Evaluations of both solutions are presented and show that they are able to reduce the distances between similar content in distributed environments, which contributes to increased recall in similarity search scenarios.
Abstract: Nowadays, the amount of data available on the Internet exceeds the zettabyte (ZB) scale, a scenario known in the literature as Big Data. Although traditional database solutions are very efficient at finding and retrieving specific content, they are inefficient in the Big Data scenario, since the great majority of such data is unstructured and scattered across the Internet. New databases are therefore required to support queries capable of finding and retrieving similar datasets, i.e., retrieving groups of data that share a common meaning. To handle this challenging scenario, this thesis proposes to exploit the Hamming similarity that exists between content identifiers generated using the Random Hyperplane Hashing function. Such identifiers provide the basis for building distributed infrastructures that facilitate similarity search. In this thesis, we present two different approaches: a P2P solution named Hamming DHT, and a Data Center solution named HCube. Evaluations of both solutions are presented and indicate that they are capable of reducing the distance between similar content, improving the recall of a similarity search.
Doctorate
Computer Engineering
Doctor of Electrical Engineering
REIS, JUNIOR JOSE S. B. "Métodos e softwares para análise da produção científica e detecção de frentes emergentes de pesquisa." Repositório Institucional do IPEN, 2015. http://repositorio.ipen.br:8080/xmlui/handle/123456789/26929.
Progress in earlier projects highlighted the need to address the problem of software for detecting emerging research and development trends from databases of scientific publications. It became evident that there is a lack of efficient computational applications dedicated to this purpose, which are highly useful for better planning of research and development programs in institutions. A review of the currently available software was therefore carried out in order to delineate clearly the opportunity to develop new tools. As a result, an application called Citesnake was implemented, designed specifically to assist in detecting and studying emerging trends through the analysis of networks of various types extracted from scientific databases. Using this robust and effective computational tool, analyses were conducted of emerging research and development fronts in the area of Generation IV Nuclear Energy Systems, so as to identify, among the reactor types selected as the most promising by the GIF (Generation IV International Forum), those that have developed the most over the last ten years and that currently appear most capable of fulfilling the promises made for their innovative concepts.
Dissertation (Master's in Nuclear Technology)
IPEN/D
Instituto de Pesquisas Energéticas e Nucleares - IPEN-CNEN/SP
Duguépéroux, Joris. "Protection des travailleurs dans les plateformes de crowdsourcing : une perspective technique." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S023.
This work focuses on protecting workers in a crowdsourcing context. Workers are especially vulnerable in online work, and both surveillance by platforms and lack of regulation are frequently denounced for endangering them. Our first contribution focuses on protecting their privacy while allowing their anonymized data to be used, for example, to assign them to tasks or to help requesters with task design. Our second contribution considers a multi-platform context and proposes a set of tools for law-makers to regulate platforms, allowing them to enforce limits on interactions in various ways (to limit work time, for instance), while also guaranteeing transparency and privacy. Both of these approaches make use of many technical tools, such as cryptography, distribution, and anonymization tools, and include security proofs and experimental validations. A last, smaller contribution draws attention to a limitation and possible security issue of one of these technical tools, the PIR, when it is used multiple times, which has been ignored in current state-of-the-art contributions.
Novaes, Tiago Fernandes de Athayde. "Processamento distribuído da consulta espaço textual top-k." Universidade Estadual de Feira de Santana, 2017. http://localhost:8080/tede/handle/tede/530.
With the popularization of databases containing objects with spatial and textual information (spatio-textual objects), interest in new queries and techniques for retrieving these objects has increased. In this scenario, the main query is the top-k spatio-textual query. This query retrieves the k best spatio-textual objects, considering the distance of the object to the query location and the textual similarity between the query keywords and the textual information of the objects. However, most of the studies related to the top-k spatio-textual query are performed in centralized environments and do not address real-world problems such as scalability. In this work, we study different strategies for partitioning the data and processing the top-k spatio-textual query in a distributed environment. We evaluate each strategy in a real distributed environment, employing real datasets.
Summary (translated from Portuguese): With the popularization of databases containing objects that have spatial and textual information (spatio-textual objects), interest has grown in new queries and techniques capable of retrieving these objects efficiently. One of the main queries for spatio-textual objects is the top-k spatio-textual query. This query aims to retrieve the k best objects, considering the distance from the object to a location given in the query and the textual similarity between the search keywords and the textual information of the objects. However, most studies of top-k spatio-textual queries assume centralized environments and do not address problems common in real-world applications, such as scalability. This dissertation studies different ways of partitioning the data and the impact of these partitionings on the processing of the top-k spatio-textual query in a distributed environment. All the proposed strategies are evaluated in a real distributed environment, using real data.
Gaignard, Alban. "Partage et production de connaissances distribuées dans des plateformes scientifiques collaboratives." Phd thesis, Université de Nice Sophia-Antipolis, 2013. http://tel.archives-ouvertes.fr/tel-00827926.
Moin, Afshin. "Les Techniques De Recommandation Et De Visualisation Pour Les Données A Une Grande Echelle." Phd thesis, Université Rennes 1, 2012. http://tel.archives-ouvertes.fr/tel-00724121.
Elofson, Gregg Steven. "Facilitating knowledge sharing in organizations: Semiautonomous agents that learn to gather, classify, and distribute environmental scanning knowledge." Diss., The University of Arizona, 1989. http://hdl.handle.net/10150/184743.
Lee, Chin Siong. "NPS AUV workbench: collaborative environment for autonomous underwater vehicles (AUV) mission planning and 3D visualization." Thesis, Monterey, California. Naval Postgraduate School, 2004. http://hdl.handle.net/10945/1658.
[...]alities. The Extensible Markup Language (XML) is used for data storage and message exchange, Extensible 3D (X3D) Graphics for visualization, and XML Schema-based Binary Compression (XSBC) for data compression. The AUV Workbench provides an intuitive cross-platform-capable tool with extensibility to provide for future enhancements such as agent-based control, asynchronous reporting and communication, loss-free message compression and built-in support for mission data archiving. This thesis also investigates the Jabber instant messaging protocol, showing its suitability for text and file messaging in a tactical environment. Exemplars show that the XML backbone of this open-source technology can be leveraged to enable both human and agent messaging with improvements over current systems. Integrated Jabber instant messaging support makes the NPS AUV Workbench the first custom application supporting XML Tactical Chat (XTC). Results demonstrate that the AUV Workbench provides a capable testbed for diverse AUV technologies, assisting in the development of traditional single-vehicle operations and agent-based multiple-vehicle methodologies. The flexible design of the Workbench further encourages integration of new extensions to serve operational needs. Exemplars demonstrate how in-mission and post-mission event monitoring by human operators can be achieved via a simple web page, standard clients, or a custom instant messaging client. Finally, the AUV Workbench's potential as a tool in the development of multiple-AUV tactics and doctrine is discussed.
Civilian, Singapore Defence Science and Technology Agency
El, Mahdaouy Abdelkader. "Accès à l'information dans les grandes collections textuelles en langue arabe." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM091/document.
Given the amount of Arabic textual information available on the web, developing effective Information Retrieval Systems (IRSs) has become essential to retrieve relevant information. Most current Arabic IRSs are based on the bag-of-words representation, where documents are indexed using surface words, roots or stems. Two main drawbacks of this representation are the ambiguity of Single Word Terms (SWTs) and term mismatch. The aim of this work is to deal with SWT ambiguity and term mismatch. Accordingly, we propose four contributions to improve Arabic content representation, indexing, and retrieval. The first contribution consists of representing Arabic documents using Multi-Word Terms (MWTs). It is motivated by the fact that MWTs are more precise representational units and less ambiguous than isolated SWTs. Hence, we propose a hybrid method to extract Arabic MWTs, which combines linguistic and statistical filtering of MWT candidates. The linguistic filter uses POS tagging to identify MWT candidates that fit a set of syntactic patterns and handles the problem of MWT variation. Then, the statistical filter ranks MWT candidates using our proposed association measure, which combines contextual information with both termhood and unithood measures. In the second contribution, we explore and evaluate several IR models for ranking documents using both SWTs and MWTs. Additionally, we investigate a wide range of proximity-based IR models for Arabic IR. Then, we introduce a formal condition that IR models should satisfy to deal adequately with term dependencies. The third contribution consists of a method based on distributed representations of words, namely Word Embedding (WE), for Arabic IR. It relies on incorporating WE semantic similarities into existing probabilistic IR models in order to deal with term mismatch. The aim is to allow distinct but semantically similar terms to contribute to document scores. The last contribution is a method to incorporate WE similarity into Pseudo-Relevance Feedback (PRF) for Arabic Information Retrieval. The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. The experimental validation of all the proposed contributions is performed using the standard Arabic TREC 2002/2001 collection.
Craswell, Nicholas Eric. "Methods for Distributed Information Retrieval." Phd thesis, 2000. http://hdl.handle.net/1885/46255.
Viles, Charles L. "Maintaining retrieval effectiveness in distributed, dynamic information retrieval systems." 1996. http://books.google.com/books?id=g8rgAAAAMAAJ.
Lu, Zhihong. "Scalable distributed architectures for information retrieval." 1999. https://scholarworks.umass.edu/dissertations/AAI9932326.
Hawking, David Anthony. "Text retrieval over distributed collections." Phd thesis, 1998. http://hdl.handle.net/1885/147205.