Journal articles: 'Ad-Hoc Information Retrieval'

1

Ormeño, Pablo, Marcelo Mendoza, and Carlos Valle. "Topic Models Ensembles for AD-HOC Information Retrieval." Information 12, no. 9 (September 1, 2021): 360. http://dx.doi.org/10.3390/info12090360.

Abstract:

Ad hoc information retrieval (ad hoc IR) is a challenging task consisting of ranking text documents for bag-of-words (BOW) queries. Classic approaches based on query and document text vectors use term-weighting functions to rank the documents. One limitation of these methods is their inability to handle polysemous concepts. In addition, these methods introduce spurious orthogonalities between semantically related words. To address these limitations, model-based IR approaches based on topics have been explored. Specifically, topic models based on Latent Dirichlet Allocation (LDA) allow text documents to be represented in the latent space of topics, which models polysemy better and avoids orthogonal representations of related terms. We extend LDA-based IR strategies using different ensemble strategies. Model selection follows the ensemble learning paradigm, for which we test two successful approaches widely used in supervised learning. We study Boosting and Bagging techniques for topic models, using each model as a weak IR expert. Then, we merge the ranking lists obtained from each model using a simple but effective top-k list fusion approach. We show that our proposal improves precision and recall, outperforming classic IR models and strong baselines based on topic models.
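
The orthogonality problem mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the helper names (`bow_vector`, `cosine`) and the toy texts are assumptions chosen for illustration, and raw term counts stand in for a real term-weighting function. Two texts that share no surface terms get a cosine similarity of exactly zero under a bag-of-words representation, even when they are semantically related.

```python
import math
from collections import Counter


def bow_vector(text):
    """Build a bag-of-words vector (term -> raw count) from whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy query and documents (not from the paper).
query = bow_vector("car repair")
doc_related = bow_vector("automobile maintenance")  # related meaning, no shared terms
doc_overlap = bow_vector("car wash")                # shares one surface term

sim_related = cosine(query, doc_related)  # no term overlap, so the vectors are orthogonal
sim_overlap = cosine(query, doc_overlap)  # one of two terms matches
```

Here the semantically closer document scores lower than the one that merely shares the word "car", which is the kind of mismatch that topic-space representations are meant to reduce.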

2

Kurland, Oren, and Lillian Lee. "Clusters, language models, and ad hoc information retrieval." ACM Transactions on Information Systems 27, no. 3 (May 2009): 1–39. http://dx.doi.org/10.1145/1508850.1508851.

3

Ensan, Faezeh, and Feras Al-Obeidat. "Relevance-based entity selection for ad hoc retrieval." Information Processing & Management 56, no. 5 (September 2019): 1645–66. http://dx.doi.org/10.1016/j.ipm.2019.05.005.

4

Cai, Weihong, Zijun Hu, Yalan Luo, Daoyuan Liang, Yifan Feng, and Jiaxin Chen. "Multi-Layer Contextual Passage Term Embedding for Ad-Hoc Retrieval." Information 13, no. 5 (April 25, 2022): 221. http://dx.doi.org/10.3390/info13050221.

Abstract:

Pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) are becoming a basic building block in Information Retrieval tasks. Nevertheless, there are several limitations when applying BERT to the query-document matching task: (1) relevance assessments are made at the document level, and the tokens of a document often exceed the maximum input length of BERT; (2) applying BERT to long documents consumes a great deal of memory and run time, owing to the computational cost of the interactions between tokens. This paper explores a novel multi-layer contextual passage architecture that leverages extractive text summarization to generate passage-level evidence for the pre-selected document passages, thus opening new possibilities for the long-document relevance task. Experiments were conducted on two standard ad hoc retrieval collections with different characteristics: the Text Retrieval Conference (TREC) 2004 Robust Track (Robust04) and ClueWeb09. Experimental results show that our approach significantly outperforms the strong baselines and that, in terms of precision, it compares favorably even with other BERT-based models and state-of-the-art neural ranking models.

6

Zhou, Xiaohua, Xiaohua Hu, and Xiaodan Zhang. "Topic Signature Language Models for Ad hoc Retrieval." IEEE Transactions on Knowledge and Data Engineering 19, no. 9 (September 2007): 1276–87. http://dx.doi.org/10.1109/tkde.2007.1058.

7

Dang, Edward Kai Fung, Robert Wing Pong Luk, and James Allan. "A Comparison between Term-Independence Retrieval Models for Ad Hoc Retrieval." ACM Transactions on Information Systems 40, no. 3 (July 31, 2022): 1–37. http://dx.doi.org/10.1145/3483612.

Abstract:

In Information Retrieval, numerous retrieval models or document ranking functions have been developed in the quest for better retrieval effectiveness. Apart from some formal retrieval models formulated on a theoretical basis, various recent works have applied heuristic constraints to guide the derivation of document ranking functions. While many recent methods are shown to improve over established and successful models, comparison among these new methods under a common environment is often missing. To address this issue, we perform an extensive and up-to-date comparison of leading term-independence retrieval models implemented in our own retrieval system. Our study focuses on the following questions: (RQ1) Is there a retrieval model that consistently outperforms all other models across multiple collections? (RQ2) What are the important features of an effective document ranking function? Our retrieval experiments, performed on several TREC test collections of a wide range of sizes (up to the terabyte-sized ClueWeb09 Category B), enable us to answer these research questions. This work also serves as a reproducibility study for leading retrieval models. While our experiments show that no single retrieval model outperforms all others across all tested collections, some recent retrieval models, such as MATF and MVD, consistently perform better than the common baselines.

8

Deveaud, Romain, Eric SanJuan, and Patrice Bellot. "Accurate and effective latent concept modeling for ad hoc information retrieval." Document numérique 17, no. 1 (April 30, 2014): 61–84. http://dx.doi.org/10.3166/dn.17.1.61-84.

9

Khodabakhsh, Maryam, and Ebrahim Bagheri. "Semantics-enabled query performance prediction for ad hoc table retrieval." Information Processing & Management 58, no. 1 (January 2021): 102399. http://dx.doi.org/10.1016/j.ipm.2020.102399.

10

Garigliotti, Darío, Faegheh Hasibi, and Krisztian Balog. "Identifying and exploiting target entity type information for ad hoc entity retrieval." Information Retrieval Journal 22, no. 3-4 (December 5, 2018): 285–323. http://dx.doi.org/10.1007/s10791-018-9346-x.

11

Azzopardi, Leif. "Incorporating context within the language modeling approach for ad hoc information retrieval." ACM SIGIR Forum 40, no. 1 (June 2006): 70. http://dx.doi.org/10.1145/1147197.1147211.

12

Su, Weihang, Qingyao Ai, Xiangsheng Li, Jia Chen, Yiqun Liu, Xiaolong Wu, and Shengluan Hou. "Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19026–34. http://dx.doi.org/10.1609/aaai.v38i17.29869.

Abstract:

With the development of deep learning and natural language processing techniques, pre-trained language models have been widely used to solve information retrieval (IR) problems. Benefiting from the pre-training and fine-tuning paradigm, these models achieve state-of-the-art performance. In previous works, plain texts in Wikipedia have been widely used in the pre-training stage. However, the rich structured information in Wikipedia, such as the titles, abstracts, hierarchical heading (multi-level title) structure, relationship between articles, references, hyperlink structures, and the writing organizations, has not been fully explored. In this paper, we devise four pre-training objectives tailored for IR tasks based on the structured knowledge of Wikipedia. Compared to existing pre-training methods, our approach can better capture the semantic knowledge in the training corpus by leveraging the human-edited structured data from Wikipedia. Experimental results on multiple IR benchmark datasets show the superior performance of our model in both zero-shot and fine-tuning settings compared to existing strong retrieval baselines. Besides, experimental results in biomedical and legal domains demonstrate that our approach achieves better performance in vertical domains compared to previous models, especially in scenarios where long text similarity matching is needed. The code is available at https://github.com/oneal2000/Wikiformer.

13

P. Bhopale, Bhopale, and Ashish Tiwari. "Leveraging Neural Network Phrase Embedding Model for Query Reformulation in Ad-Hoc Biomedical Information Retrieval." Malaysian Journal of Computer Science 34, no. 2 (April 30, 2021): 151–70. http://dx.doi.org/10.22452/mjcs.vol34no2.2.

Abstract:

This study presents a spark-enhanced neural network phrase embedding model to leverage query representation for relevant biomedical literature retrieval. Information retrieval for clinical decision support demands high precision. In recent years, word embeddings have evolved as a solution to such requirements. They represent vocabulary words as low-dimensional vectors in the context of their similar words; however, they are inadequate for dealing with semantic phrases or multi-word units. Learning vector embeddings for phrases while maintaining word meanings is a challenging task. This study proposes a scalable phrase embedding technique to embed multi-word units into vector representations using a state-of-the-art word embedding technique, keeping both words and phrases in the same vector space. It will enhance the effectiveness and efficiency of query language models by expanding unseen query terms and phrases for the semantically associated query terms. Embedding vectors are evaluated via a query expansion technique for the ad hoc retrieval task over two benchmark corpora, viz. the TREC-CDS 2014 collection with 733,138 PubMed articles and the OHSUMED corpus with 348,566 articles collected from a Medline database. The results show that the proposed technique has significantly outperformed other state-of-the-art retrieval techniques.

14

Lin, Sheng-Chieh, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin. "Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting." ACM Transactions on Information Systems 39, no. 4 (October 31, 2021): 1–29. http://dx.doi.org/10.1145/3446426.

Abstract:

Conversational search plays a vital role in conversational information seeking. As queries in information seeking dialogues are ambiguous for traditional ad hoc information retrieval (IR) systems due to the coreference and omission resolution problems inherent in natural language dialogue, resolving these ambiguities is crucial. In this article, we tackle conversational passage retrieval, an important component of conversational search, by addressing query ambiguities with query reformulation integrated into a multi-stage ad hoc IR system. Specifically, we propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting. For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals. For the latter, we reformulate conversational queries into natural, stand-alone, human-understandable queries with a pretrained sequence-to-sequence model. Detailed analyses of the two CQR methods are provided quantitatively and qualitatively, explaining their advantages, disadvantages, and distinct behaviors. Moreover, to leverage the strengths of both CQR methods, we propose combining their output with reciprocal rank fusion, yielding state-of-the-art retrieval effectiveness, 30% improvement in terms of NDCG@3 compared to the best submission of Text REtrieval Conference (TREC) Conversational Assistant Track (CAsT) 2019.
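
Reciprocal rank fusion, which this abstract uses to combine the outputs of the two CQR methods, can be sketched as follows. This is a generic illustration rather than the authors' implementation: the function name, the passage ids, and the two toy runs are hypothetical, and the constant k = 60 is the value commonly used in the rank-fusion literature, not one confirmed by the abstract. Each passage receives the sum of 1 / (k + rank) over every ranked list in which it appears.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list.

    Each id scores sum(1 / (k + rank)) over the lists that contain it,
    with ranks starting at 1. Ties are broken by id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical runs: one from a term-expansion query, one from a rewritten query.
term_expansion_run = ["p3", "p1", "p2"]
query_rewriting_run = ["p1", "p4", "p3"]

fused = reciprocal_rank_fusion([term_expansion_run, query_rewriting_run])
# fused == ["p1", "p3", "p4", "p2"]
```

Passages ranked well by both runs ("p1", "p3") rise above passages found by only one run, which is why this kind of fusion suits combining methods with complementary strengths.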

15

Cheng, Yuan-Po, Chia-Yi Wu, Yao-Jen Tang, and Ming-Jer Tsai. "Retrieval-Guaranteed Location-Aware Information Brokerage Scheme in 3D Wireless Ad Hoc Networks." IEEE Transactions on Computers 62, no. 4 (April 2013): 798–812. http://dx.doi.org/10.1109/tc.2012.16.

16

Yang, Bo, and Manohar Mareboyana. "Location-Aware Caching for Semantic-Based Image Queries in Mobile AD HOC Networks." International Journal of Multimedia Data Engineering and Management 3, no. 1 (January 2012): 17–35. http://dx.doi.org/10.4018/jmdem.2012010102.

Abstract:

Mobile image information retrieval, i.e., the processing of mobile image queries, has attracted research attention due to the recent technological advances in mobile and ubiquitous computing, network infrastructures, and multimedia streaming. The previous research focuses on data delivery, while few works have reported on content-based mobile information retrieval. Therefore, it is important to devise effective means to describe the semantics as well as content distribution of mobile data. Caching is an attractive solution that helps reveal semantic relationships among mobile data sources. However, traditional caching techniques rely on exact match of fixed values and are not efficient in dealing with imprecisely described image data. To address these issues, the authors propose a location-aware caching model which reflects the distribution of images based on the analysis of earlier queries. Through extensive simulations, the authors show that the proposed model can perform search with less cost for voluminous data.

17

Kwok, Kui-Lam, Laszlo Grunfeld, and Peter Deng. "Employing web mining and data fusion to improve weak ad hoc retrieval." Information Processing & Management 43, no. 2 (March 2007): 406–19. http://dx.doi.org/10.1016/j.ipm.2006.07.008.

18

Ben Basat, Ran, Moshe Tennenholtz, and Oren Kurland. "A Game Theoretic Analysis of the Adversarial Retrieval Setting." Journal of Artificial Intelligence Research 60 (December 30, 2017): 1127–64. http://dx.doi.org/10.1613/jair.5547.

Abstract:

The main goal of search engines is ad hoc retrieval: ranking documents in a corpus by their relevance to the information need expressed by a query. The Probability Ranking Principle (PRP) --- ranking the documents by their relevance probabilities --- is the theoretical foundation of most existing ad hoc document retrieval methods. A key observation that motivates our work is that the PRP does not account for potential post-ranking effects; specifically, changes to documents that result from a given ranking. Yet, in adversarial retrieval settings such as the Web, authors may consistently try to promote their documents in rankings by changing them. We prove that, indeed, the PRP can be sub-optimal in adversarial retrieval settings. We do so by presenting a novel game theoretic analysis of the adversarial setting. The analysis is performed for different types of documents (single-topic and multi-topic) and is based on different assumptions about the writing qualities of documents' authors. We show that in some cases, introducing randomization into the document ranking function yields an overall user utility that transcends that of applying the PRP.

19

Zhai, Chengxiang, and John Lafferty. "A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval." ACM SIGIR Forum 51, no. 2 (August 2, 2017): 268–76. http://dx.doi.org/10.1145/3130348.3130377.

20

Mylonas, Ph., D. Vallet, P. Castells, M. Fernández, and Y. Avrithis. "Personalized information retrieval based on context and ontological knowledge." Knowledge Engineering Review 23, no. 1 (March 2008): 73–100. http://dx.doi.org/10.1017/s0269888907001282.

Abstract:

Context modeling has long been acknowledged as a key aspect in a wide variety of problem domains. In this paper we focus on the combination of contextualization and personalization methods to improve the performance of personalized information retrieval. The key aspects in our proposed approach are (1) the explicit distinction between historic user context and live user context, (2) the use of ontology-driven representations of the domain of discourse, as a common, enriched representational ground for content meaning, user interests, and contextual conditions, enabling the definition of effective means to relate the three of them, and (3) the introduction of fuzzy representations as an instrument to properly handle the uncertainty and imprecision involved in the automatic interpretation of meanings, user attention, and user wishes. Based on a formal grounding at the representational level, we propose methods for the automatic extraction of persistent semantic user preferences, and live, ad-hoc user interests, which are combined in order to improve the accuracy and reliability of personalization for retrieval.

21

Alowish, Mazen, Yoshiaki Shiraishi, Yasuhiro Takano, Masami Mohri, and Masakatu Morii. "A novel software-defined networking controlled vehicular named-data networking for trustworthy emergency data dissemination and content retrieval assisted by evolved interest packet." International Journal of Distributed Sensor Networks 16, no. 3 (March 2020): 155014772090928. http://dx.doi.org/10.1177/1550147720909280.

Abstract:

The vehicle ad hoc network is a key technology for the future Internet of vehicles and intelligent transport systems. However, the involvement of a vast number of vehicles in the Internet of vehicles limits the performance of vehicle ad hoc networks. To tackle this problem, this article proposes a novel vehicle ad hoc network architecture that combines two technologies: software-defined networking and named-data networking. In the proposed software-defined networking controlled vehicular named-data networking, the IP addressing issue is resolved by named-data networking, and a global view of the network is attained by software-defined networking. Emergency data dissemination is initiated with packet classification. For packet classification, a policy-based bifold classifier is proposed in the roadside unit and supported by an evolved interest packet. Subsequently, the best disseminator is selected by a trustworthy weighted graph scheme based on a novel weight value, which is computed by considering significant metrics. Content retrieval is accomplished by the roadside unit and assisted by a controller. The location of the content producer is obtained from the controller, and the optimal route is selected by the roadside unit. Optimal route selection is performed by the roadside unit for both content retrieval and vehicle-to-vehicle communication using a novel region-based hybrid cuckoo search algorithm. The hybrid algorithm combines cuckoo search and particle swarm optimization to perform efficient route selection. The software-defined networking controller supports numerous users by providing a global view of the network, which includes network status and traffic information. Extensive simulation in NS-3 shows better interest satisfaction rate, interest satisfaction delay, forwarder interest packets, average hop count, and scalability gain for software-defined networking controlled vehicular named-data networking than for a traditional vehicle ad hoc network.

22

Müller, Christof, Iryna Gurevych, and Max Mühlhäuser. "Closing the Vocabulary Gap for Computing Text Similarity and Information Retrieval." International Journal of Semantic Computing 02, no. 02 (June 2008): 253–72. http://dx.doi.org/10.1142/s1793351x08000452.

Abstract:

This paper studies the integration of lexical semantic knowledge in two related semantic computing tasks: ad-hoc information retrieval and computing text similarity. For this purpose, we compare the performance of two algorithms: (i) using semantic relatedness, and (ii) using a conventional extended Boolean model [13] with additional query expansion. For the evaluation, we use two different test collections in the German language especially suitable to study the vocabulary gap problem: (i) GIRT [5] for the information retrieval task, and (ii) a collection of descriptions of professions built to evaluate a system for electronic career guidance in the information retrieval and text similarity tasks. We found that integrating lexical semantic knowledge increases the performance for both tasks. On the GIRT corpus, the performance is improved only for short queries. The performance on the collection of professional descriptions is improved, but crucially depends on the accurate preprocessing of the natural language essays employed as topics.

23

Goeuriot, Lorraine, Gareth J. F. Jones, Liadh Kelly, Johannes Leveling, Mihai Lupu, Joao Palotti, and Guido Zuccon. "An analysis of evaluation campaigns in ad-hoc medical information retrieval: CLEF eHealth 2013 and 2014." Information Retrieval Journal 21, no. 6 (May 3, 2018): 507–40. http://dx.doi.org/10.1007/s10791-018-9331-4.

24

Zhang, Peng, Wenjie Hui, Benyou Wang, Donghao Zhao, Dawei Song, Christina Lioma, and Jakob Grue Simonsen. "Complex-valued Neural Network-based Quantum Language Models." ACM Transactions on Information Systems 40, no. 4 (October 31, 2022): 1–31. http://dx.doi.org/10.1145/3505138.

Abstract:

Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words. This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM. The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM’s performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.

25

T, Gurumekala, Indira Gandhi S, and Senthil Sivakumar M. "Routing Protocols for AANET." International Journal of Engineering and Advanced Technology 9, no. 1s5 (December 30, 2019): 122–26. http://dx.doi.org/10.35940/ijeat.a1030.1291s519.

Abstract:

An ad hoc network is an infrastructure-less, self-configuring, and dynamic network in which nodes forward information to other nodes based on connectivity and the routing algorithm they follow. Recently, ad hoc networking has been introduced among aircraft to provide in-flight communication and to manage the increased flow of data produced by civil aviation. Aircraft communication can be established by either satellite or cellular-based systems. Using satellites for communication is very expensive and prone to high propagation delay. Cellular-based systems provide a direct link to the aircraft at minimum cost and with less delay. As the line-of-sight range of cellular systems is limited, aircraft over oceanic regions are unable to communicate with ground stations. To overcome the drawbacks of these technologies for aircraft communication, Aeronautical Ad hoc Networks (AANETs) have been developed; they create an ad hoc network among aircraft in which each aircraft is a self-aware node that communicates with ground stations and other aircraft irrespective of its flight region. AANET shares some similarities with existing wireless ad hoc networks while facing unique challenges in supporting greater mobility, network size, node density, and bandwidth limitations. Because of these unique challenges, routing in AANET is a difficult task. In this paper, various routing algorithms for AANET, with their merits and demerits, are thoroughly studied. Finally, the unsolved problems and research issues of routing in AANET are identified.

26

Faggioli, Guglielmo, Nicola Ferro, Josiane Mothe, Fiana Raiber, and Maik Fröbe. "Report on the 1st Workshop on Query Performance Prediction and Its Evaluation in New Tasks (QPP++ 2023) at ECIR 2023." ACM SIGIR Forum 57, no. 1 (June 2023): 1–7. http://dx.doi.org/10.1145/3636341.3636356.

Abstract:

Query Performance Prediction (QPP) is currently primarily applied to ad-hoc retrieval tasks. The Information Retrieval (IR) field is reaching new heights thanks to recent advances in large language models and neural networks, as well as emerging new ways of searching, such as conversational search. Such advancements are quickly spreading to adjacent research areas, including QPP, necessitating a reconsideration of how we perform and evaluate QPP. This workshop sought to elicit discussion on three topics related to the future of QPP: exploiting advances in IR to improve QPP, instantiating QPP on new search paradigms, and evaluating QPP on new tasks. Date: 6 April 2023. Website: https://qpp.dei.unipd.it/.

27

Jermey, Jonathan. "Locating files on computer disks." Indexer: The International Journal of Indexing 22, no. 3 (April 1, 2001): 130–32. http://dx.doi.org/10.3828/indexer.2001.22.3.7.

Abstract:

Over the last 20 years file storage capacity on personal computer systems has grown from the equivalent of a desk drawer to that of a medium-sized public library, and the trend is continuing. File retrieval methods have been developed on an ad hoc basis which has generally failed to keep up with this growth, but there are some encouraging and interesting developments in this area.

28

Valcarce, Daniel. "Information retrieval models for recommender systems." ACM SIGIR Forum 53, no. 1 (June 2019): 44–45. http://dx.doi.org/10.1145/3458537.3458545.

Abstract:

Information retrieval addresses the information needs of users by delivering relevant pieces of information but requires users to convey their information needs explicitly. In contrast, recommender systems offer personalized suggestions of items automatically. Ultimately, both fields help users cope with information overload by providing them with relevant items of information. This thesis aims to explore the connections between information retrieval and recommender systems. Our objective is to devise recommendation models inspired by information retrieval techniques. We begin by borrowing ideas from the information retrieval evaluation literature to analyze evaluation metrics in recommender systems [2]. Second, we study the applicability of pseudo-relevance feedback models to different recommendation tasks [1]. We investigate the conventional top-N recommendation task [5, 4, 6, 7], but we also explore the recently formulated user-item group formation problem [3] and propose a novel task based on the liquidation of long tail items [8]. Third, we exploit ad hoc retrieval models to compute neighborhoods in a collaborative filtering scenario [9, 10, 12]. Fourth, we explore the opposite direction by adapting an effective recommendation framework to pseudo-relevance feedback [13, 11]. Finally, we discuss the results and present our conclusions. In summary, this doctoral thesis adapts a series of information retrieval models to recommender systems. Our investigation shows that many retrieval models can be adapted to deal with different recommendation tasks. Moreover, we find that taking the opposite path is also possible. Exhaustive experimentation confirms that the proposed models are competitive. Finally, we also perform a theoretical analysis of some models to explain their effectiveness. Advisors: Álvaro Barreiro and Javier Parapar. Committee members: Gabriella Pasi, Pablo Castells and Fidel Cacheda.
The dissertation is available at: https://www.dc.fi.udc.es/~dvalcarce/thesis.pdf.

29

Ai, Qingyao. "Neural generative models and representation learning for information retrieval." ACM SIGIR Forum 53, no. 2 (December 2019): 97. http://dx.doi.org/10.1145/3458553.3458565.

Abstract:

Information Retrieval (IR) concerns the structure, analysis, organization, storage, and retrieval of information. Among the different retrieval models proposed in the past decades, generative retrieval models, especially those under the statistical probabilistic framework, are among the most popular techniques and have been widely applied to Information Retrieval problems. While they are famous for their well-grounded theory and good empirical performance in text retrieval, their applications in IR are often limited by their complexity and low extendability in the modeling of high-dimensional information. Recently, advances in deep learning techniques provide new opportunities for representation learning and generative models for information retrieval. In contrast to statistical models, neural models have much more flexibility because they model information and data correlation in latent spaces without explicitly relying on any prior knowledge. Previous studies on pattern recognition and natural language processing have shown that semantically meaningful representations of text, images, and many types of information can be acquired with neural models through supervised or unsupervised training. Nonetheless, the effectiveness of neural models for information retrieval is mostly unexplored. In this thesis, we study how to develop new generative models and representation learning frameworks with neural models for information retrieval.
Specifically, our contributions include three main components: (1) Theoretical Analysis: We present the first theoretical analysis and adaptation of existing neural embedding models for ad-hoc retrieval tasks; (2) Design Practice: Based on our experience and knowledge, we show how to design an embedding-based neural generative model for practical information retrieval tasks such as personalized product search; and (3) Generic Framework: We further generalize our proposed neural generative framework for complicated heterogeneous information retrieval scenarios that concern text, images, knowledge entities, and their relationships. Empirical results show that the proposed neural generative framework can effectively learn information representations and construct retrieval models that outperform the state-of-the-art systems in a variety of IR tasks.

30

Arguello, Jaime, Jonathan L. Elsas, Jamie Callan, and Jaime Carbonell. "Document Representation and Query Expansion Models for Blog Recommendation." Proceedings of the International AAAI Conference on Web and Social Media 2, no. 1 (September 25, 2021): 10–18. http://dx.doi.org/10.1609/icwsm.v2i1.18605.

Abstract:

We explore several different document representation models and two query expansion models for the task of recommending blogs to a user in response to a query. Blog relevance ranking differs from traditional document ranking in ad-hoc information retrieval in several ways: (1) the unit of output (the blog) is composed of a collection of documents (the blog posts) rather than a single document, (2) the query represents an ongoing and typically multifaceted interest in the topic rather than a passing ad-hoc information need and (3) due to the propensity of spam, splogs, and tangential comments, the blogosphere is particularly challenging to use as a source for high-quality query expansion terms. We address these differences at the document representation level, by comparing retrieval models that view either the blog or its constituent posts as the atomic units of retrieval, and at the query expansion level, by making novel use of the links and anchor text in Wikipedia to expand a user's initial query. We develop two complementary models of blog retrieval that perform at comparable levels of precision and recall. We also show consistent and significant improvement across all models using our Wikipedia expansion strategy.

31

MARAGOUDAKIS, MANOLIS, DIMITRIOS P. LYRAS, and KYRIAKOS SGARBAS. "BAYESIAN RETRIEVAL USING A SIMILARITY-BASED LEMMATIZER." International Journal on Artificial Intelligence Tools 21, no. 05 (October 2012): 1250024. http://dx.doi.org/10.1142/s0218213012500248.

Abstract:

The present paper describes a Bayesian network approach to Information Retrieval (IR) from Web documents. The network structure provides an intuitive representation of uncertainty relationships and the embedded conditional probability table is used by inference algorithms in an attempt to identify documents that are relevant to the user's needs, expressed in the form of Boolean queries. Our research has been directed at constructing a probabilistic IR framework that focuses on assisting users in performing ad hoc retrieval of documents from various domains such as economics, news, sports, etc. Furthermore, users can integrate feedback regarding the relevance of the retrieved documents in an attempt to improve performance on upcoming requests. Towards these goals, we have expanded the traditional Bayesian network IR system and tested it on several Greek web corpora on different application domains. We have developed two different approaches with regard to the structure: a simple one, where the structure is manually provided, and an automated one, where data mining is used in order to extract the network's structure. Results show competitive performance against successful IR models of different theoretical backgrounds, such as the vector space model utilizing tf-idf and the probabilistic model of BM25, in terms of precision-recall curves. In order to further improve the performance of the IR system, we have implemented a novel similarity-based lemmatization framework, thus reducing the ambiguity posed by the plethora of morphological variations of the languages in question. The employed lemmatization framework comprises three core components (i.e. the word segregation, the data cleansing and the lemmatization modules) and is language-independent (i.e. it can be applied to other languages with morphological peculiarities and thus improve ad hoc retrieval), since it achieves the mapping of an input word to its normalized form by employing two state-of-the-art language-independent distance metric models, namely the Levenshtein edit distance and the Dice coefficient similarity measure, combined with a language model describing the most frequent inflectional suffixes of the examined language. Experimental results support our claim on the significance of this incorporation to Greek texts web retrieval, as results improve by 4% to 11%.
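
As an illustration of the two distance measures named in this abstract, the following is a minimal Python sketch of the Levenshtein edit distance and a character-bigram Dice coefficient, used to map a word to its closest entry in a lexicon. The function names, the toy lexicon, and the tie-breaking rule (edit distance first, then Dice) are assumptions made for this example; the abstract does not specify how the authors combine the two measures, and the suffix language model it mentions is omitted here.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def dice(a: str, b: str) -> float:
    """Dice coefficient over the sets of character bigrams of two words."""
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a and not bigrams_b:
        return 1.0 if a == b else 0.0
    return 2 * len(bigrams_a & bigrams_b) / (len(bigrams_a) + len(bigrams_b))


def nearest_lemma(word: str, lexicon: list[str]) -> str:
    """Hypothetical combination rule: smallest edit distance, ties broken by higher Dice."""
    return min(lexicon, key=lambda lemma: (levenshtein(word, lemma), -dice(word, lemma)))


if __name__ == "__main__":
    # Toy English lexicon, assumed purely for illustration.
    print(nearest_lemma("walked", ["walk", "talk", "work"]))
```

Both measures operate on character strings only, which is what makes this kind of approach applicable across languages without language-specific rules.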

32

Arnold, Jeffrey L., Brian Neil Levine, R. Manmatha, Francis Lee, Prashant Shenoy, Ming-Che Tsai, Taha K. Ibrahim, Daniel J. O'Brien, and Donald A. Walsh. "Information-Sharing in Out-of-Hospital Disaster Response: The Future Role of Information Technology." Prehospital and Disaster Medicine 19, no. 3 (September 2004): 201–7. http://dx.doi.org/10.1017/s1049023x00001783.

Abstract:

Numerous examples exist of the benefits of the timely access to information in emergencies and disasters. Information technology (IT) is playing an increasingly important role in information-sharing during emergencies and disasters. The effective use of IT in out-of-hospital (OOH) disaster response is accompanied by numerous challenges at the human, applications, communication, and security levels. Most reports of IT applications to emergencies or disasters to date concern applications that are hospital-based or occur during non-response phases of events (i.e., mitigation, planning and preparedness, or recovery phases). Few reports address the application of IT to OOH disaster response. Wireless peer networks that involve ad hoc wireless routing networks and peer-to-peer application architectures offer a promising solution to the many challenges of information-sharing in OOH disaster response. These networks offer several services that are likely to improve information-sharing in OOH emergency response, including needs and capacity assessment databases, victim tracking, event logging, information retrieval, and overall incident management system support.

33

Klaeren, H., and F. Banhart. "A Graphical Query Generator for Clinical Research Databases." Methods of Information in Medicine 34, no. 04 (July 1995): 328–39. http://dx.doi.org/10.1055/s-0038-1634607.

Abstract:

Clinical research involves recording, storage and retrieval of disease-related patient data, typically using a database system. In order to facilitate ad hoc queries to clinical databases we have developed a query generator with a graphical interface. The query generator uses an object-oriented data model which is visualized by directed graphs. The main focus of our work was the definition of object-oriented user views to the partly complex data structures of a relational database. Furthermore, we tried to define graphical abstractions for all common types of queries. Thus, even for non-expert database users such as clinicians, it is easy to assemble highly complex queries for a thorough examination of the content of large research databases.

34

Devezas, José. "Graph-based entity-oriented search." ACM SIGIR Forum 55, no. 1 (June 2021): 1–2. http://dx.doi.org/10.1145/3476415.3476430.

Abstract:

Entity-oriented search has revolutionized search engines. In the era of Google Knowledge Graph and Microsoft Satori, users demand an effortless process of search. Whether they express an information need through a keyword query, expecting documents and entities, or through a clicked entity, expecting related entities, there is an inherent need for the combination of corpora and knowledge bases to obtain an answer. Such integration frequently relies on independent signals extracted from inverted indexes, and from quad indexes indirectly accessed through queries to a triplestore. However, relying on two separate representation models inhibits the effective cross-referencing of information, discarding otherwise available relations that could lead to a better ranking. Moreover, different retrieval tasks often demand separate implementations, although the problem is, at its core, the same. With the goal of harnessing all available information to optimize retrieval, we explore joint representation models of documents and entities, while taking a step towards the definition of a more general retrieval approach. Specifically, we propose that graphs should be used to incorporate explicit and implicit information derived from the relations between text found in corpora and entities found in knowledge bases. We also take advantage of this framework to elaborate a general model for entity-oriented search, proposing a universal ranking function for the tasks of ad hoc document retrieval (leveraging entities), ad hoc entity retrieval, and entity list completion. At a conceptual stage, we begin by proposing the graph-of-entity, based on the relations between combinations of term and entity nodes. We introduce the entity weight as the corresponding ranking function, relying on the idea of seed nodes for representing the query, either directly through term nodes, or based on the expansion to adjacent entity nodes. 
The score is computed based on a series of geodesic distances to the remaining nodes, providing a ranking for the documents (or entities) in the graph. In order to improve on the low scalability of the graph-of-entity, we then redesigned this model in a way that reduced the number of edges in relation to the number of nodes, by relying on the hypergraph data structure. The resulting model, which we called hypergraph-of-entity, is the main contribution of this thesis. The obtained reduction was achieved by replacing binary edges with n-ary relations based on sets of nodes and entities (undirected document hyperedges), sets of entities (undirected hyperedges, either based on cooccurrence or a grouping by semantic subject), and pairs of a set of terms and a set of one entity (directed hyperedges, mapping text to an object). We introduce the random walk score as the corresponding ranking function, relying on the same idea of seed nodes, similar to the entity weight in the graph-of-entity. Scoring based on this function is highly reliant on the structure of the hypergraph, which we call representation-driven retrieval. As such, we explore several extensions of the hypergraph-of-entity, including relations of synonymy, or contextual similarity, as well as different weighting functions per node and hyperedge type. We also propose TF-bins as a discretization for representing term frequency in the hypergraph-of-entity. For the random walk score, we propose and explore several parameters, including length and repeats, with or without seed node expansion, direction, or weights, and with or without a certain degree of node and/or hyperedge fatigue, a concept that we also propose. For evaluation, we took advantage of TREC 2017 OpenSearch track, which relied on an online evaluation process based on the Living Labs API, and we also participated in TREC 2018 Common Core track, which was based on the newly introduced TREC Washington Post Corpus.
Our main experiments were supported on the INEX 2009 Wikipedia collection, which proved to be a fundamental test collection for assessing retrieval effectiveness across multiple tasks. At first, our experiments solely focused on ad hoc document retrieval, ensuring that the model performed adequately for a classical task. We then expanded the work to cover all three entity-oriented search tasks. Results supported the viability of a general retrieval model, opening novel challenges in information retrieval, and proposing a new path towards generality in this area.

35

MacAvaney, Sean. "Effective and practical neural ranking." ACM SIGIR Forum 55, no. 1 (June 2021): 1–2. http://dx.doi.org/10.1145/3476415.3476432.

Abstract:

Supervised machine learning methods that use neural networks ("deep learning") have yielded substantial improvements to a multitude of Natural Language Processing (NLP) tasks in the past decade. Improvements to Information Retrieval (IR) tasks, such as ad-hoc search, lagged behind those in similar NLP tasks, despite considerable community efforts. Although there are several contributing factors, I argue in this dissertation that early attempts were not more successful because they did not properly consider the unique characteristics of IR tasks when designing and training ranking models. I first demonstrate this by showing how large-scale datasets containing weak relevance labels can successfully replace training on in-domain collections. This technique improves the variety of queries encountered when training and helps mitigate concerns of over-fitting particular test collections. I then show that dataset statistics available in specific IR tasks can be easily incorporated into neural ranking models alongside the textual features, resulting in more effective ranking models. I also demonstrate that contextualized representations, particularly those from transformer-based language models, considerably improve neural ad-hoc ranking performance. I find that this approach is neither limited to the task of ad-hoc ranking (as demonstrated by ranking clinical reports) nor English content (as shown by training effective cross-lingual neural rankers). These efforts demonstrate that neural approaches can be effective for ranking tasks. However, I observe that these techniques are impractical due to their high query-time computational costs. To overcome this, I study approaches for offloading computational cost to index-time, substantially reducing query-time latency. These techniques make neural methods practical for ranking tasks. 
Finally, I take a deep dive into better understanding the linguistic biases of the methods I propose compared to contemporary and traditional approaches. The findings from this analysis highlight potential pitfalls of recent methods and provide a way to measure progress in this area going forward.

36

Yeshambel, Tilahun, Josiane Mothe, and Yaregal Assabie. "Learned Text Representation for Amharic Information Retrieval and Natural Language Processing." Information 14, no. 3 (March 20, 2023): 195. http://dx.doi.org/10.3390/info14030195.

Abstract:

Over the past few years, word embeddings and bidirectional encoder representations from transformers (BERT) models have brought better solutions to learning text representations for natural language processing (NLP) and other tasks. Many NLP applications rely on pre-trained text representations, leading to the development of a number of neural network language models for various languages. However, this is not the case for Amharic, which is known to be a morphologically complex and under-resourced language. Usable pre-trained models for automatic Amharic text processing are not available. This paper presents an investigation on the essence of learned text representation for information retrieval and NLP tasks using word embeddings and BERT language models. We explored the most commonly used methods for word embeddings, including word2vec, GloVe, and fastText, as well as the BERT model. We investigated the performance of query expansion using word embeddings. We also analyzed the use of a pre-trained Amharic BERT model for masked language modeling, next sentence prediction, and text classification tasks. Amharic ad hoc information retrieval test collections that contain word-based, stem-based, and root-based text representations were used for evaluation. We conducted a detailed empirical analysis on the usability of word embeddings and BERT models on word-based, stem-based, and root-based corpora. Experimental results show that word-based query expansion and language modeling perform better than stem-based and root-based text representations, and fastText outperforms other word embeddings on word-based corpus.

37

Achemoukh, Farida, and Rachid Ahmed-Ouamer. "Integration of User Profile in Search Process according to the Bayesian Approach." International Journal of Recent Contributions from Engineering, Science & IT (iJES) 6, no. 4 (December 19, 2018): 32. http://dx.doi.org/10.3991/ijes.v6i4.9716.

Abstract:

Most information retrieval systems (IRS) rely on the so-called system-centred approach, in which the system behaves as a black box that produces the same answer to the same query, independently of the user's specific information needs. Without considering the user, it is hard to know which sense is intended in a query. To satisfy user needs, personalization is an appropriate solution to improve the IRS usability. Modeling the user profile can be the first step towards personalization of information search. The user profile refers to his/her interests built across his/her interactions with the retrieval system. In this paper, we present a personalized information retrieval approach for building and exploiting the user profile in the search process, based on a Bayesian network. The theoretical framework provided by these networks makes it possible to better capture the relationships between different pieces of information. Experiments carried out on TREC-1 ad hoc and TREC 2011 Track collections show that our approach achieves significant improvements over a personalized search approach described in the state of the art and also over a baseline search process that does not consider the user profile.

38

Shukla, Abhishek Kumar, Sujoy Das, Pushpendra Kumar, and Afroj Alam. "Relevance Feedback and Deep Neural Network-Based Semantic Method for Query Expansion." Wireless Communications and Mobile Computing 2022 (July 18, 2022): 1–11. http://dx.doi.org/10.1155/2022/6789044.

Abstract:

Machine learning techniques have been widely used in almost every area of arts, science, and technology for the last two decades. Document analysis and query expansion also use machine learning techniques at a broad scale for information retrieval tasks. The state-of-the-art models like the Bo1 model, Bo2 model, KL divergence model, and chi-square model are probabilistic, and they work on DFR-based retrieval models. These models focus heavily on term frequency and do not take into account the semantic relationships among the terms. The proposed model applies a semantic method to find the semantic similarity among the terms in order to expand the query. The proposed method uses relevance feedback: the user selects the most relevant document from the top-k initially retrieved documents, and a deep neural network technique is then applied to select the most informative terms related to the original query terms. The results are evaluated on the FIRE 2011 ad hoc English test collection. The mean average precision of the proposed method is 0.3568. The proposed method is also compared with the state-of-the-art models. The proposed model achieved 19.77% and 8.05% improvement in mean average precision (MAP) with respect to the original query and the Bo1 model, respectively.
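
For readers unfamiliar with the metric reported in this abstract, the following is a minimal Python sketch of mean average precision (MAP) under binary relevance judgments. The function names, document identifiers, and toy rankings are hypothetical and are not taken from the paper or from the FIRE 2011 collection.

```python
def average_precision(ranked: list[str], relevant: set[str]) -> float:
    """Average of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Dividing by the total number of relevant documents penalizes
    # relevant documents that were never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean of the per-query average precision over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Two toy queries with made-up document identifiers.
    toy_runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["d5", "d6"], {"d6"}),
    ]
    print(mean_average_precision(toy_runs))
```

Because each query contributes equally to the mean, a gain in MAP reflects better rankings across the query set as a whole, not just on a few queries with many relevant documents.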

39

Zhang, Xinyu, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, and Jimmy Lin. "MIRACL: A Multilingual Retrieval Dataset Covering 18 Diverse Languages." Transactions of the Association for Computational Linguistics 11 (2023): 1114–31. http://dx.doi.org/10.1162/tacl_a_00595.

Abstract:

MIRACL is a multilingual dataset for ad hoc retrieval across 18 languages that collectively encompass over three billion native speakers around the world. This resource is designed to support monolingual retrieval tasks, where the queries and the corpora are in the same language. In total, we have gathered over 726k high-quality relevance judgments for 78k queries over Wikipedia in these languages, where all annotations have been performed by native speakers hired by our team. MIRACL covers languages that are both typologically close as well as distant from 10 language families and 13 sub-families, associated with varying amounts of publicly available resources. Extensive automatic heuristic verification and manual assessments were performed during the annotation process to control data quality. In total, MIRACL represents an investment of around five person-years of human annotator effort. Our goal is to spur research on improving retrieval across a continuum of languages, thus enhancing information access capabilities for diverse populations around the world, particularly those that have traditionally been underserved. MIRACL is available at http://miracl.ai/.

40

Singh, Jagendra, and Rakesh Kumar. "Lexical Co-Occurrence and Contextual Window-Based Approach with Semantic Similarity for Query Expansion." International Journal of Intelligent Information Technologies 13, no. 3 (July 2017): 57–78. http://dx.doi.org/10.4018/ijiit.2017070104.

Abstract:

Query expansion (QE) is an efficient method for enhancing the efficiency of an information retrieval system. In this work, we try to capture the limitations of the pseudo-feedback based QE approach and propose a hybrid approach for enhancing the efficiency of feedback based QE by combining corpus-based information, contextual information about query terms, and semantic knowledge of query terms. First of all, this paper explores the use of different corpus-based lexical co-occurrence approaches to select an optimal combination of query terms from a pool of terms obtained using pseudo-feedback based QE. Next, we explore a semantic similarity approach based on word2vec for ranking the QE terms obtained from the top pseudo-feedback documents. Further, we combine co-occurrence statistics, contextual window statistics, and semantic similarity based approaches to select the best expansion terms for query reformulation. The experiments were performed on FIRE ad-hoc and TREC-3 benchmark datasets. Our experimental results show significant improvement over the baseline method.

41

Rao, Jinfeng, Wei Yang, Yuhao Zhang, Ferhan Ture, and Jimmy Lin. "Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 232–40. http://dx.doi.org/10.1609/aaai.v33i01.3301232.

Abstract:

Despite substantial interest in applications of neural networks to information retrieval, neural ranking models have mostly been applied to “standard” ad hoc retrieval tasks over web pages and newswire articles. This paper proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network), a novel neural ranking model specifically designed for ranking short social media posts. We identify document length, informal language, and heterogeneous relevance signals as features that distinguish documents in our domain, and present a model specifically designed with these characteristics in mind. Our model uses hierarchical convolutional layers to learn latent semantic soft-match relevance signals at the character, word, and phrase levels. A pooling-based similarity measurement layer integrates evidence from multiple types of matches between the query, the social media post, as well as URLs contained in the post. Extensive experiments using Twitter data from the TREC Microblog Tracks 2011–2014 show that our model significantly outperforms prior feature-based as well as existing neural ranking models. To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models. Our code and data are publicly available.

42

Gall, W., G. Duftschmid, and W. Dorda. "Clinical Data Retrieval: 25 Years of Temporal Query Management at the University of Vienna Medical School." Methods of Information in Medicine 41, no. 02 (2002): 89–97. http://dx.doi.org/10.1055/s-0038-1634291.

Abstract:

Summary Objectives: Today, many clinical information systems include analysis components which allow clinicians to apply a selection of predefined statistical functions that satisfy typical cases. They are mostly too inflexible to handle complex, non-standard problems, however. The focus of this paper, therefore, is to present an approach that enables clinicians to autonomously create ad hoc queries including temporal relations in an interactive environment. Methods: We developed the query language AMAS, which was specifically customized for users from the medical domain to flexibly retrieve and interpret temporal, clinical data. AMAS provides for a significant temporal expressiveness in data retrieval using timestamped clinical databases and relies on an operator-operand concept for the specification of a query. Results: Within the last 25 years, four different clinical retrieval systems have been implemented at the Department of Medical Computer Sciences, based on the AMAS query language. Currently, these systems allow access to the medical records of more than 2 million patients. Physicians of 46 different departments at the University of Vienna and Graz Medical Schools have made extensive use of these systems in the course of clinical research and patient care, executing more than 10,000 queries per year. Conclusions: We discuss a list of 20 issues that represent the most essential lessons we have learned in the development of the four systems mentioned above. Amongst others, our experiences indicate that the operator-operand concept allows an intuitive specification of complex, temporal queries. Further, customization to different user classes, based on their statistical background, is essential.

43

Deris, A., I. Trigonis, A. Aravanis, and E. K. Stathopoulou. "DEPTH CAMERAS ON UAVs: A FIRST APPROACH." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W3 (February 23, 2017): 231–36. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w3-231-2017.

Abstract:

Accurate depth information retrieval of a scene is a field under investigation in the research areas of photogrammetry, computer vision and robotics. Various technologies, active, as well as passive, are used to serve this purpose such as laser scanning, photogrammetry and depth sensors, with the latter being a promising innovative approach for fast and accurate 3D object reconstruction using a broad variety of measuring principles including stereo vision, infrared light or laser beams. In this study we investigate the use of the newly designed Stereolab's ZED depth camera based on passive stereo depth calculation, mounted on an Unmanned Aerial Vehicle with an ad-hoc setup, specially designed for outdoor scene applications. Towards this direction, the results of its depth calculations and scene reconstruction generated by Simultaneous Localization and Mapping (SLAM) algorithms are compared and evaluated based on qualitative and quantitative criteria with respect to the ones derived by a typical Structure from Motion (SfM) and Multiple View Stereo (MVS) pipeline for a challenging cultural heritage application.

44

Järvelin, Kalervo, Peter Ingwersen, and Timo Niemi. "A user‐oriented interface for generalised informetric analysis based on applying advanced data modelling techniques." Journal of Documentation 56, no. 3 (June 1, 2000): 250–78. http://dx.doi.org/10.1108/eum0000000007115.

Abstract:

This article presents a novel user‐oriented interface for generalised informetric analysis and demonstrates how informetric calculations can easily and declaratively be specified through advanced data modelling techniques. The interface is declarative and at a high level. Therefore it is easy to use, flexible and extensible. It enables end users to perform basic informetric ad hoc calculations easily and often with much less effort than in contemporary online retrieval systems. It also provides several fruitful generalisations of typical informetric measurements like impact factors. These are based on substituting traditional foci of analysis, for instance journals, by other object types, such as authors, organisations or countries. In the interface, bibliographic data are modelled as complex objects (non‐first normal form relations) and terminological and citation networks involving transitive relationships are modelled as binary relations for deductive processing. The interface is flexible, because it makes it easy to switch focus between various object types for informetric calculations, e.g. from authors to institutions. Moreover, it is demonstrated that all informetric data can easily be broken down by criteria that foster advanced analysis, e.g. by years or content‐bearing attributes. Such modelling allows flexible data aggregation along many dimensions. These salient features emerge from the query interface's general data restructuring and aggregation capabilities combined with transitive processing capabilities. The features are illustrated by means of sample queries and results in the article.

45

Pinto, Jaime A., and Mauricio B. Almeida. "An applied ontology-Oriented Case Study to Distinguish Public and Private Institutions Through Their Documents." KNOWLEDGE ORGANIZATION 47, no. 7 (2020): 582–91. http://dx.doi.org/10.5771/0943-7444-2020-7-582.

Abstract:

The institutions we create shape many of the activities we engage in, insofar as they are pervasive entities in our society. In an era full of new technologies, including the semantic web, there is a movement toward sound conceptual modeling for socio-technical solutions applied to government institutions. To develop these complex solutions, one needs to deepen the ontological status of entities in the institutional domain, because the literature is full of ambiguous and ad-hoc hypotheses about distinctions between public and private corporations. We believe we can find better explanations for such distinctions in the interdisciplinary field of library and information science. Within an ongoing semantic web project, we focus on a case study of official documents. First, we analyze theories about public and private corporations, seeking a reliable ontological distinction between them; then, by focusing on documents produced by each type of corporation, we hope to provide a well-founded analysis. Second, we adopt the aforementioned theories and the new analysis as recommendations for improving access to and understanding of public documents, through their appropriate classification within government information systems. This project, ultimately, aims to maximize the transparency of public government documents by favoring retrieval and comprehension by a society with plenty of automated information systems.

46

Álvarez-Rodríguez, Jose María, Ricardo Colomo-Palacios, and Vladimir Stantchev. "Skillrank: Towards a Hybrid Method to Assess Quality and Confidence of Professional Skills in Social Networks." Scientific Programming 2015 (2015): 1–13. http://dx.doi.org/10.1155/2015/451476.

Abstract:

The present paper introduces a hybrid technique to measure the expertise of users by analyzing their profiles and activities in social networks. Currently, both job seekers and talent hunters are looking for new and innovative techniques to filter jobs and candidates, while candidates are trying to improve their profiles and make them more attractive. In this sense, the Skillrank approach is based on the conjunction of existing and well-known information and expertise retrieval techniques that fit the existing web and social media environment, delivering an intelligent component that integrates the user context in the analysis of skills confidence. A major outcome of this approach is that it takes advantage of existing data and information available on the web to produce both a ranked list of experts in a field and a confidence value for every professional skill. Thus, expertise and experts can be detected, verified, and ranked using a suitable trust metric. An experiment to validate the Skillrank technique based on precision and recall metrics is also presented using two different datasets: (1) one created ad hoc using real data from a professional social network and (2) real data extracted from the LinkedIn API.

47

De La Iglesia, B., S. Donell, V. Rayward-Smith, and J. Bettencourt-Silva. "On Creating a Patient-centric Database from Multiple Hospital Information Systems." Methods of Information in Medicine 51, no. 03 (2012): 210–20. http://dx.doi.org/10.3414/me10-01-0069.

Abstract:

Summary. Background: The information present in Hospital Information Systems (HIS) is heterogeneous and is used primarily by health practitioners to support and improve patient care. Conducting clinical research, data analyses or knowledge discovery projects using electronic patient data in secondary care centres relies on accurate data collection, which is often an ad-hoc process poorly described in the literature. Objectives: This paper aims at facilitating and expanding on the process of retrieving and collating patient-centric data from multiple HIS for the purpose of creating a research database. The development of a process roadmap for this purpose illustrates and exposes the constraints and drawbacks of undertaking such work in secondary care centres. Methods: A data collection exercise was carried out using a combined approach based on segments of well established data mining and knowledge discovery methodologies, previous work on clinical data integration and local expert consultation. A case study on prostate cancer was carried out at an English regional National Health Service (NHS) hospital. Results: The process for data retrieval described in this paper allowed patient-centric data, pertaining to the case study on prostate cancer, to be successfully collected from multiple heterogeneous hospital sources, and collated in a format suitable for further clinical research. Conclusions: The data collection exercise described in this paper exposes the lengthy and difficult journey of retrieving and collating patient-centric, multi-source data from a hospital, which is indeed a non-trivial task, and one which will greatly benefit from further attention from researchers and hospital IT management.

48

Saleem, Qudsia, Ikram Ud Din, Ahmad Almogren, Ibrahim Alkhalifa, Hasan Ali Khattak, and Joel J. P. C. Rodrigues. "Named Data Networking-Based On-Demand Secure Vehicle-To-Vehicle Communications." Wireless Communications and Mobile Computing 2021 (November 26, 2021): 1–15. http://dx.doi.org/10.1155/2021/1615015.

Abstract:

The detection of secure vehicles for content placement in vehicle-to-vehicle (V2V) communications is challenging because of the dynamic nature of vehicular ad hoc networks (VANETs). With the increase in the demand for efficient and adoptable content delivery, information-centric networking (ICN) can be a promising solution for the future needs of the network. ICN provides direct retrieval of content through its unique name, which is independent of location. It also performs better in content retrieval with its in-network caching and name-based routing capabilities. Since vehicles are mobile devices, it is crucial to select a caching node that is secure and reliable. The security of data is important in the vehicular named data networking (VNDN) environment because of its role in saving the lives of drivers and pedestrians. To overcome the issue of security and reduce network load, in addition to detecting malicious activity, we define a blockchain-based distributive trust model to achieve security, trust, and privacy of the communicating entities in VNDN, named the secure vehicle communication caching (SVC-caching) mechanism for the placement of on-demand data. The proposed trust management mechanism is decentralized in nature and is used to select a trustworthy node for cluster-based V2V communications in the VNDN environment. The SVC-caching strategy is simulated in the NS-2 simulator. The results are evaluated based on one-hop count, delivery ratio, cache hit ratio, and malicious node detection. The results demonstrate that the proposed technique improves the performance based on the selected parameters.

49

Jiang, Nanlan, Sai Yang, and Pingping Xu. "Enabling Location Privacy Preservation in MANETs Based on Distance, Angle, and Spatial Cloaking." Electronics 9, no. 3 (March 8, 2020): 458. http://dx.doi.org/10.3390/electronics9030458.

Abstract:

Preserving the location privacy of users in Mobile Ad hoc Networks (MANETs) is a significant challenge for location information. Most of the conventional Location Privacy Preservation (LPP) methods protect the privacy of the user while sacrificing the capability of retrieval on the server side; that is, legitimate devices other than the user itself cannot retrieve the location in most cases. On the other hand, applications such as geographic routing and location verification require the retrievability of locations on the access point, the base station, or a trusted server. Besides, with the development of networking technology such as caching technology, it is expected that more and more distributed location-based services will be deployed, which results in the risk of leaking location information in the wireless channel. Therefore, preserving location privacy in wireless channels without losing the retrievability of the real location is essential. In this paper, by focusing on the wireless channel, we propose a novel LPP enabled by distance (ranging result), angle, and the idea of spatial cloaking (DSC-LPP) to preserve location privacy in MANETs. DSC-LPP runs without a trusted third party or traditional cryptography tools in the line-of-sight environment, and it is suitable for MANETs such as the Internet of Things, even when the communication and computation capabilities of users are limited. Qualitative evaluation indicates that DSC-LPP can reduce the communication overhead when compared with k-anonymity, and the computation overhead of DSC-LPP is limited when compared with conventional cryptography. Meanwhile, the retrievability of DSC-LPP is higher than that of k-anonymity and differential privacy. Simulation results show that with the proper design of spatial divisions and parameters, other legitimate devices in a MANET can correctly retrieve the location of users with a high probability when adopting DSC-LPP.

50

Zafar, Waseeq Ul Islam, Muhammad Atif Ur Rehman, Farhana Jabeen, Byung-Seo Kim, and Zobia Rehman. "Context-Aware Naming and Forwarding in NDN-Based VANETs." Sensors 21, no. 14 (July 6, 2021): 4629. http://dx.doi.org/10.3390/s21144629.

Abstract:

A vehicular ad-hoc network (VANET) is a technology that allows ubiquitous mobility to mobile users. Inter-vehicle communication is an integral component of intelligent transportation systems that enables a wide variety of applications where vehicles interact and cooperate with each other, from safety applications to non-safety applications. VANET applications have different needs (e.g., latency, reliability, delivery priorities) in terms of delivery effectiveness. In the last decade, named data networking (NDN) gained the attention of the research community for effective content retrieval and dissemination in mobile environments such as VANETs. In NDN, the content’s name has a vital role in storing and retrieving the content effectively and efficiently. In NDN-based VANETs, adaptive content dissemination solutions must be introduced that can make decisions related to forwarding, cache management, etc., based on context information represented by a content name. In this context, our main contributions are two-fold: (i) we present the hierarchical context-aware content-naming (CACN) scheme for NDN-based VANETs that enables naming the safety and non-safety applications, and (ii) we present a decentralized context-aware notification (DCN) protocol that broadcasts event notification information for awareness within the application-based geographical area. Simulation results show that the proposed DCN protocol succeeds in achieving reduced transmissions, bandwidth, and energy compared to existing critical contents dissemination protocols.
