Journal articles on the topic 'Evaluation of XML retrieval effectiveness'


Consult the top 50 journal articles for your research on the topic 'Evaluation of XML retrieval effectiveness.'

Next to every source in the list of references there is an 'Add to bibliography' button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Lalmas, Mounia, and Anastasios Tombros. "Evaluating XML retrieval effectiveness at INEX." ACM SIGIR Forum 41, no. 1 (June 2007): 40–57. http://dx.doi.org/10.1145/1273221.1273225.

2

Gövert, Norbert, Norbert Fuhr, Mounia Lalmas, and Gabriella Kazai. "Evaluating the effectiveness of content-oriented XML retrieval methods." Information Retrieval 9, no. 6 (September 1, 2006): 699–722. http://dx.doi.org/10.1007/s10791-006-9008-2.

3

Selvaganesan, S., Su-Cheng Haw, and Lay-Ki Soon. "XDMA: A Dual Indexing and Mutual Summation Based Keyword Search Algorithm for XML Databases." International Journal of Software Engineering and Knowledge Engineering 24, no. 04 (May 2014): 591–615. http://dx.doi.org/10.1142/s0218194014500223.

Abstract:
Achieving effectiveness in terms of the relevance of query results is the most crucial part of XML keyword search. Developing an XML keyword search approach that addresses user search intention, keyword ambiguity and query/search result grading (ranking) is still challenging. In this paper, we propose a novel approach called XDMA for keyword search in XML databases that builds two indices to resolve these problems. A keyword search technique based on two-level matching between the two indices is then presented. Further, utilizing logarithmic and probability functions, a Mutual Score is defined to find the desired T-typed node. We also introduce a similarity measure to retrieve the exact data through the selected T-typed node. In addition, grading is applied to query results with comparable relevance scores. Finally, we demonstrate the effectiveness of the proposed approach, XDMA, with a comprehensive experimental evaluation on the DBLP, WSU and eBay datasets.
4

Blanke, Tobias. "Theoretical evaluation of XML retrieval." ACM SIGIR Forum 46, no. 1 (May 20, 2012): 82–83. http://dx.doi.org/10.1145/2215676.2215689.

5

Yu, Hong, Xiao Lei Huang, Zhi Ling Wei, and Chen Xia Yang. "Study on XML Retrieval Results Classification." Applied Mechanics and Materials 263-266 (December 2012): 1773–77. http://dx.doi.org/10.4028/www.scientific.net/amm.263-266.1773.

Abstract:
Mining (classifying or clustering) retrieval results to support a search engine's relevance feedback mechanism is an important way to improve retrieval effectiveness. Unlike plain text documents, XML documents are semi-structured, so classifying XML retrieval results can exploit structural features of the documents, such as tag paths and edges. We propose using a Support Vector Machine (SVM) classifier to classify XML retrieval results based on both their content and structural features. We implemented the classification method on XML retrieval results from the IEEE SC corpus. Compared with k-nearest neighbor classification (KNN) on the same dataset, the SVM performs better. The experimental results also show that the use of structural features, especially tag paths and edges, improves classification performance significantly.
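As a rough illustration of the kind of pipeline described above (not the authors' implementation), the sketch below extracts tag-path and edge tokens plus text content from XML results and compares an SVM against a k-NN classifier; scikit-learn is assumed and the documents, labels and feature choices are placeholders.

```python
# Hedged sketch: classify XML retrieval results using content + structural features.
# Assumes scikit-learn; documents, labels and feature choices are illustrative only.
import xml.etree.ElementTree as ET
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def xml_tokens(xml_string):
    """Return content text plus tag-path and edge tokens for one XML string."""
    root = ET.fromstring(xml_string)
    tokens = []
    def walk(node, path):
        path = path + "/" + node.tag
        tokens.append(path)                          # structural feature: tag path
        if node.text and node.text.strip():
            tokens.append(node.text.strip())         # content feature: element text
        for child in node:
            tokens.append(node.tag + ">" + child.tag)  # structural feature: edge
            walk(child, path)
    walk(root, "")
    return " ".join(tokens)

docs = ["<article><sec><title>XML retrieval</title></sec></article>",
        "<article><sec><p>relevance feedback</p></sec></article>"] * 10  # toy data
labels = [0, 1] * 10

features = [xml_tokens(d) for d in docs]
for name, clf in [("SVM", LinearSVC()), ("KNN", KNeighborsClassifier(n_neighbors=3))]:
    model = make_pipeline(TfidfVectorizer(token_pattern=r"\S+"), clf)
    print(name, cross_val_score(model, features, labels, cv=2).mean())
```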
6

Pal, Sukomal, Mandar Mitra, and Jaap Kamps. "Evaluation effort, reliability and reusability in XML retrieval." Journal of the American Society for Information Science and Technology 62, no. 2 (December 14, 2010): 375–94. http://dx.doi.org/10.1002/asi.21403.

7

Wichaiwong, Tanakorn. "An Exponentiation Method for XML Element Retrieval." Scientific World Journal 2014 (2014): 1–10. http://dx.doi.org/10.1155/2014/404518.

Abstract:
XML is now widely used for modelling and storing structured documents. The structure is very rich and carries important information about contents and their relationships, for example in e-commerce. XML data-centric collections require query terms that allow users to specify constraints on the document structure; mapping structural queries and assigning weights are significant for identifying the set of possibly relevant documents with respect to structural conditions. In this paper, we present an extension to the MEXIR search system that supports the combination of structural and content queries in the form of content-and-structure queries, which we call the Exponentiation function. The structural information has been shown to improve the effectiveness of the search system by up to 52.60% over the BM25 baseline in terms of MAP.
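The abstract does not give the exact form of the Exponentiation function, so the snippet below is only a speculative illustration of the general idea of raising a structural-match weight to a tunable power and combining it with a BM25 content score; the formula, the `alpha` parameter and the values are invented for illustration, not MEXIR's actual scoring.

```python
# Speculative sketch only: combine a BM25 content score with a structural weight
# via an exponentiation term. The real MEXIR formula is not given in the abstract;
# alpha and struct_weight are invented parameters.
def combined_score(bm25_score: float, struct_weight: float, alpha: float = 2.0) -> float:
    """Scale the content score by struct_weight ** alpha, with struct_weight in [0, 1]."""
    return bm25_score * (struct_weight ** alpha)

# Example: a strong structural match is penalised far less than a weak one.
print(combined_score(12.3, 0.9))   # close to the raw BM25 score
print(combined_score(12.3, 0.3))   # heavily down-weighted
```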
8

WANG, JASON T. L., JIANGHUI LIU, and JUNHAN WANG. "XML CLUSTERING AND RETRIEVAL THROUGH PRINCIPAL COMPONENT ANALYSIS." International Journal on Artificial Intelligence Tools 14, no. 04 (August 2005): 683–99. http://dx.doi.org/10.1142/s0218213005002326.

Abstract:
XML is increasingly important in data exchange and information management. A great deal of effort has been spent in developing efficient techniques for storing, querying, indexing and accessing XML documents. In this paper we propose a new approach to clustering XML data. In contrast to previous work, which focused on documents defined by different DTDs, the proposed method works for documents with the same DTD. Our approach is to extract features from documents, modeled by ordered labeled trees, and transform the documents to vectors in a high-dimensional Euclidean space based on the occurrences of the features in the documents. We then reduce the dimensionality of the vectors by principal component analysis (PCA) and cluster the vectors in the reduced dimensional space. The PCA enables one to identify vectors with co-occurring features, thereby enhancing the accuracy of the clustering. We also discuss an extension of our techniques to XML retrieval. Experimental results based on documents obtained from Wisconsin's XML data bank show the effectiveness and good performance of the proposed techniques.
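A compact sketch of the feature-vector, PCA, then clustering idea, using scikit-learn under assumed settings (not the authors' code); the random "feature counts" stand in for occurrences of ordered-labeled-tree features.

```python
# Hedged sketch: reduce XML feature vectors with PCA, then cluster in the
# reduced space. The toy counts are not real data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 40 documents x 200 structural features (toy data with two latent groups)
X = np.vstack([rng.poisson(1.0, size=(20, 200)),
               rng.poisson(3.0, size=(20, 200))]).astype(float)

X_reduced = PCA(n_components=10).fit_transform(X)   # keep directions of co-occurring features
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_reduced)
print(clusters)
```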
9

Blanke, Tobias, Mounia Lalmas, and Theo Huibers. "A framework for the theoretical evaluation of XML retrieval." Journal of the American Society for Information Science and Technology 63, no. 12 (November 8, 2012): 2463–73. http://dx.doi.org/10.1002/asi.22674.

10

Fuhr, Norbert, and Norbert Gövert. "Retrieval quality vs. effectiveness of specificity-oriented search in XML collections." Information Retrieval 9, no. 1 (January 2006): 55–70. http://dx.doi.org/10.1007/s10791-005-5721-5.

11

Savoy, Jacques. "Statistical inference in retrieval effectiveness evaluation." Information Processing & Management 33, no. 4 (July 1997): 495–512. http://dx.doi.org/10.1016/s0306-4573(97)00027-7.

12

Roko, Abubakar, Shyamala Doraisamy, Azrul Hazri Jantan, and Azreen Azman. "Effective keyword query structuring using NER for XML retrieval." International Journal of Web Information Systems 11, no. 1 (April 20, 2015): 33–53. http://dx.doi.org/10.1108/ijwis-06-2014-0022.

Abstract:
Purpose – The purpose of this paper is to propose and evaluate XKQSS, a query structuring method that shifts the task of generating structured queries from the user to the search engine while retaining a simple keyword search interface. A more effective way of searching an XML database is to use structured queries. However, expressing queries in query languages proves difficult for most users, since it requires learning a query language and knowledge of the underlying data schema. On the other hand, the success of Web search engines has made many users familiar with keyword search, so they prefer a keyword search interface for querying XML data. Design/methodology/approach – Existing query structuring approaches require users to provide structural hints in their input keyword queries even though their interface is keyword based. Other problems with existing systems include their failure to take keyword query ambiguities into consideration during query structuring and the difficulty of selecting the generated structured query that best represents a given keyword query. To address these problems, this study allows users to submit a schema-independent keyword query, uses named entity recognition (NER) to categorize query keywords and resolve query ambiguities, and computes semantic information for a node from its data content. Algorithms are proposed that find user search intentions and convert them into a set of ranked structured queries. Findings – Experiments with the Sigmod and IMDB datasets were conducted to evaluate the effectiveness of the method. The experimental results show that XKQSS is about 20 per cent more effective in terms of return node identification than XReal, a state-of-the-art system for XML retrieval. Originality/value – Existing systems do not take keyword query ambiguities into account. XKQSS includes two guidelines based on NER that help resolve these ambiguities before converting the submitted query. It also includes a ranking function that computes a score for each generated query using both semantic information and data statistics, as opposed to the statistics-only approach used by existing systems.
13

Sayed, Awny, Ahmed A. Radwan, and Mohamed M. Abdallah. "Efficient evaluation of relevance feedback algorithms for XML content‐based retrieval systems." International Journal of Web Information Systems 6, no. 2 (June 22, 2010): 121–31. http://dx.doi.org/10.1108/17440081011053113.

14

Kazai, Gabriella, and Mounia Lalmas. "eXtended cumulated gain measures for the evaluation of content-oriented XML retrieval." ACM Transactions on Information Systems 24, no. 4 (October 2006): 503–42. http://dx.doi.org/10.1145/1185877.1185883.

15

Liakos, Panagiotis, Panagiota Koltsida, George Kakaletris, Peter Baumann, Yannis Ioannidis, and Alex Delis. "A Distributed Infrastructure for Earth-Science Big Data Retrieval." International Journal of Cooperative Information Systems 24, no. 02 (June 2015): 1550002. http://dx.doi.org/10.1142/s0218843015500021.

Abstract:
Earth-Science data are composite, multi-dimensional and of significant size, and as such continue to pose a number of ongoing problems regarding their management. With new and diverse information sources emerging and rates of generated data continuously increasing, a persistent challenge becomes more pressing: to make the information existing in multiple heterogeneous resources readily available. The widespread use of the XML data-exchange format has enabled the rapid accumulation of semi-structured metadata for Earth-Science data. In this paper, we exploit this popular use of XML and present the means for querying metadata emanating from multiple sources in a succinct and effective way, thereby releasing the user from the very tedious and time-consuming task of examining individual XML descriptions one by one. Our approach, termed Meta-Array Data Search (MAD Search), brings together diverse data sources while enhancing the user-friendliness of the underlying information sources. We gather metadata using different standards and construct an amalgamated service with the help of tools that discover and harvest such metadata; this service facilitates the end-user by offering easy and timely access to all metadata. The main contribution of our work is a novel query language, termed xWCPS, which builds on top of two widely adopted standards: XQuery and the Web Coverage Processing Service (WCPS). xWCPS furnishes a rich set of features regarding the way scientific data can be queried. Our proposed unified language allows for requesting metadata while also giving processing directives. Consequently, the xWCPS-enabled MAD Search helps in both retrieval and processing of large data sets hosted in a heterogeneous infrastructure. We demonstrate the effectiveness of our approach through diverse use-cases that provide insights into the syntactic power and overall expressiveness of xWCPS. We evaluate MAD Search in a distributed environment that comprises five high-volume array databases whose sizes range between 20 and 100 GB, and so ascertain the applicability and potential of our proposal.
16

Blair, David C., and M. E. Maron. "An evaluation of retrieval effectiveness for a full-text document-retrieval system." Communications of the ACM 28, no. 3 (March 1985): 289–99. http://dx.doi.org/10.1145/3166.3197.

17

Machi, Paolo, Franck Jourdan, Dominique Ambard, Cedric Reynaud, Kyriakos Lobotesis, Mathieu Sanchez, Alain Bonafé, and Vincent Costalat. "Experimental evaluation of stent retrievers’ mechanical properties and effectiveness." Journal of NeuroInterventional Surgery 9, no. 3 (March 25, 2016): 257–63. http://dx.doi.org/10.1136/neurintsurg-2015-012213.

Abstract:
Background: Five randomized controlled trials recently appeared in the literature demonstrating that early mechanical thrombectomy in patients with acute ischemic stroke is significantly related to an improved outcome. Stent retrievers are accepted as the most effective devices for intracranial thrombectomy. Objective: To analyze the mechanical properties of stent retrievers, their behavior during retrieval, and interaction with different clots and to identify device features that might correlate with the effectiveness of thrombus removal. Materials and methods: All stent retrievers available in France up to June 2015 were evaluated by mechanical and functional tests aimed at investigating the variation of their radial force and their behavior during retrieval. Devices were also tested during in vitro thrombectomies using white and red experimental thrombi produced with human blood. Functional tests and in vitro thrombectomies were conducted using a rigid 3D printed vascular model. Results: Mechanical tests showed a variation in radial force during retrieval for each stent. A constant radial force during retrieval was related to continuous cohesion over the vessel wall and a higher rate of clot removal efficacy. All stent retrievers failed when interacting with white large thrombi (diameter ≥6 mm). Conclusions: None of the tested devices were effective in removing white clots of large diameter (≥6 mm). Constant radial force during retrieval allows constant cohesion to the vessel wall and pressure over the clot; such features allow for a higher rate of clot removal.
18

Cheng Haw, Su, Samini Subramaniam, Wei Siang Lim, and Fang Fang Chua. "Hybridation of Labeling Schemes for Efficient Dynamic Updates." Indonesian Journal of Electrical Engineering and Computer Science 4, no. 1 (October 1, 2016): 184. http://dx.doi.org/10.11591/ijeecs.v4.i1.pp184-194.

Abstract:
With XML as the leading standard for data representation over the Web, it is crucial to store and query XML data. However, relational databases are the dominant database technology in most organizations, so replacing a relational database with a pure XML database is not a wise choice. One prominent solution is to map XML into a relational database. This paper introduces a robust hybrid labeling scheme that combines the desirable features of the extended-range and ORDPATH schemes to support dynamic updates. In addition, we also propose a mapping scheme based on the hybrid labeling scheme. Our proposed approach is evaluated in terms of (i) loading time, (ii) storage size, (iii) query retrieval time, and (iv) dynamic update time, as compared to the ORDPATH and ME schemes. The experimental evaluation results show that our proposed approach is scalable to support huge datasets and dynamic updates.
19

Taghva, Kazem, Julie Borsack, and Allen Condit. "Evaluation of model-based retrieval effectiveness with OCR text." ACM Transactions on Information Systems 14, no. 1 (January 11, 1996): 64–93. http://dx.doi.org/10.1145/214174.214180.

20

Sumida, T., K. Yamamoto, T. Shinogi, S. Tsuruoka, and H. Kawanaka. "Document Recognition and XML Generation of Tabular Form Discharge Summaries for Analogous Case Search System." Methods of Information in Medicine 46, no. 06 (2007): 700–708. http://dx.doi.org/10.1055/s-0038-1625431.

Abstract:
Summary Objectives: This paper discusses and develops a document image recognition, keyword extraction and automatic XML generation system to search analogous cases from paper-based documents. We propose a document structure recognition method and an automatic XML generation method for tabular-form discharge summary documents, and develop a prototype system using the proposed methods. Evaluation experiments using actual documents are done to discuss the effectiveness of the developed system. Methods: The developed system consists of the following steps. Paper-based summary documents are first scanned at 300 dpi. Noise and tilt in the image are reduced by pre-processing, and the table structures are identified. Characters in the table are recognized and converted to text data by the OCR engine. XML documents are automatically generated from the obtained results. Results: Patient discharge summary documents archived at Mie University Hospital were used. The results show that XML documents can be automatically generated when standard tabular-form documents are input into the developed system. In this experiment, it takes about 20 seconds to generate an XML document on a general personal computer. This paper also compares the developed system with a commercial product to discuss the effectiveness of the present system. Experimental results also show that the accuracy of table structure recognition is high and that it can be used in a practical situation. Conclusions: This paper showed the effectiveness of the proposed method for recognizing tabular-form document images and generating XML documents.
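To make the "recognize the table, then emit XML" step concrete, here is a minimal sketch using Python's standard library; the element names and sample values are invented placeholders, not the schema used by the system in the paper.

```python
# Hedged sketch: turn OCR'd discharge-summary table fields into an XML document.
# The element names and sample values are placeholders, not the paper's schema.
import xml.etree.ElementTree as ET

ocr_fields = {                      # output of table-structure recognition + OCR
    "patient_id": "12345",
    "admission_date": "2006-04-01",
    "diagnosis": "example diagnosis text",
}

summary = ET.Element("discharge_summary")
for name, value in ocr_fields.items():
    ET.SubElement(summary, name).text = value

print(ET.tostring(summary, encoding="unicode"))
```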
21

Klusch, Matthias, Patrick Kapahnke, and Ingo Zinnikus. "Adaptive Hybrid Semantic Selection of SAWSDL Services with SAWSDL-MX2." International Journal on Semantic Web and Information Systems 6, no. 4 (October 2010): 1–26. http://dx.doi.org/10.4018/jswis.2010100101.

Abstract:
In this paper, the authors present an adaptive, hybrid semantic matchmaker for SAWSDL services, called SAWSDL-MX2. It determines three types of semantic matching of an advertised service with a requested one, which are described in standard SAWSDL: logic-based, text-similarity-based and XML-tree edit-based structural similarity. Before selection, SAWSDL-MX2 learns the optimal aggregation of these different matching degrees off-line over a random subset of a given SAWSDL service retrieval test collection by exploiting a binary support vector machine-based classifier with ranking. The authors present a comparative evaluation of the retrieval performance of SAWSDL-MX2.
22

Park, Young-Ho, Kyu-Young Whang, Byung Suk Lee, and Wook-Shin Han. "Efficient evaluation of linear path expressions on large-scale heterogeneous XML documents using information retrieval techniques." Journal of Systems and Software 79, no. 2 (February 2006): 180–90. http://dx.doi.org/10.1016/j.jss.2005.05.009.

23

Benkoussas, Chahinez, and Patrice Bellot. "Information Retrieval and Graph Analysis Approaches for Book Recommendation." Scientific World Journal 2015 (2015): 1–8. http://dx.doi.org/10.1155/2015/926418.

Abstract:
A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user queries. We used different theoretical retrieval models, a probabilistic model (InL2, from the Divergence from Randomness framework) and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach over a network of related documents connected by social links. We call this network, constructed from documents and the social information provided with each of them, the Directed Graph of Documents (DGD). Specifically, this work tackles the problem of book recommendation in the context of the INEX (INitiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrates that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.
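A small sketch of the two ingredients described above: linearly interpolating two retrieval models' scores and mixing in PageRank computed on a directed document graph. It assumes networkx, and the scores, edges and mixing weights are illustrative placeholders, not values from the paper.

```python
# Hedged sketch: interpolate two retrieval scores and mix in PageRank from a
# directed graph of documents. All numbers here are toy values.
import networkx as nx

inl2 = {"d1": 0.8, "d2": 0.5, "d3": 0.3}      # DFR (InL2) scores
lm   = {"d1": 0.6, "d2": 0.7, "d3": 0.4}      # language-model scores
lam = 0.5
combined = {d: lam * inl2[d] + (1 - lam) * lm[d] for d in inl2}

dgd = nx.DiGraph([("d1", "d2"), ("d2", "d3"), ("d3", "d1"), ("d1", "d3")])
pr = nx.pagerank(dgd)                          # authority in the document graph

mu = 0.3
final = {d: (1 - mu) * combined[d] + mu * pr[d] for d in combined}
print(sorted(final, key=final.get, reverse=True))
```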
24

Roitero, Kevin. "Cheap IR evaluation." ACM SIGIR Forum 54, no. 2 (December 2020): 1–2. http://dx.doi.org/10.1145/3483382.3483400.

Abstract:
To evaluate Information Retrieval (IR) effectiveness, a possible approach is to use test collections, which are composed of a collection of documents, a set of description of information needs (called topics), and a set of relevant documents to each topic. Test collections are modelled in a competition scenario: for example, in the well known TREC initiative, participants run their own retrieval systems over a set of topics and they provide a ranked list of retrieved documents; some of the retrieved documents (usually the first ranked) constitute the so called pool, and their relevance is evaluated by human assessors; the document list is then used to compute effectiveness metrics and rank the participant systems. Private Web Search companies also run their in-house evaluation exercises; although the details are mostly unknown, and the aims are somehow different, the overall approach shares several issues with the test collection approach. The aim of this work is to: (i) develop and improve some state-of-the-art work on the evaluation of IR effectiveness while saving resources, and (ii) propose a novel, more principled and engineered, overall approach to test collection based effectiveness evaluation. In this thesis we focus on three main directions: the first part details the usage of few topics (i.e., information needs) in retrieval evaluation and shows an extensive study detailing the effect of using fewer topics for retrieval evaluation in terms of number of topics, topics subsets, and statistical power. The second part of this thesis discusses the evaluation without relevance judgements, reproducing, extending, and generalizing state-of-the-art methods and investigating their combinations by means of data fusion techniques and machine learning. Finally, the third part uses crowdsourcing to gather relevance labels, and in particular shows the effect of using fine grained judgement scales; furthermore, explores methods to transform judgements between different relevance scales. Awarded by: University of Udine, Udine, Italy on 19 March 2020. Supervised by: Professor Stefano Mizzaro. Available at: https://kevinroitero.com/resources/kr-phd-thesis.pdf.
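The thesis abstract above centres on evaluating with fewer topics; one generic way to quantify that effect (not taken from the thesis) is to compare the system ranking produced by a topic subset with the full-set ranking using Kendall's tau, as sketched below with toy data and scipy assumed.

```python
# Hedged sketch: compare system rankings from a random topic subset against
# rankings from the full topic set, using Kendall's tau. Scores are toy data.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(42)
scores = rng.random((20, 50))                 # 20 systems x 50 topics (toy AP values)

full_means = scores.mean(axis=1)
subset = rng.choice(50, size=10, replace=False)
subset_means = scores[:, subset].mean(axis=1)

tau, _ = kendalltau(full_means, subset_means)
print(f"Kendall's tau between full-set and 10-topic rankings: {tau:.2f}")
```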
25

Sunny, Sanjeev K., and Mallikarjun Angadi. "Evaluating the effectiveness of thesauri in digital information retrieval systems." Electronic Library 36, no. 1 (February 5, 2018): 55–70. http://dx.doi.org/10.1108/el-02-2017-0033.

Abstract:
Purpose The purpose of this study is to carry out a systematic literature review for evidence-based assessment of the effectiveness of thesaurus in digital information retrieval systems. It also aimed to identify the evaluation methods, evaluation measures and data collection tools which may be used in evaluating digital information retrieval systems. Design/methodology/approach A systematic literature review (SLR) of 344 publications from LISA and 238 from Scopus has been carried out to identify the evaluation studies for analysis, and 15 evaluation studies have been analyzed. Findings This study presents evidences for the effectiveness of thesaurus in digital information retrieval systems. Various methods for evaluating digital information systems have been identified. Also, a wide range of evaluation measures and data collection tools have been identified. Research limitations/implications The study was limited to the literature published in English language and indexed in LISA and Scopus. The evaluation methods, evaluation measures and data collection tools identified in this study may be used to design more cognizant evaluation studies for digital information retrieval systems. Practical implications The findings have significant implications for the administrators of any type of digital information retrieval systems in making more informed decisions toward implementation of thesaurus in resource description and access to digital collections. Originality/value This study extends our knowledge on the potentials of thesauri in digital information retrieval systems. It also provides cues for designing more cognizant evaluation studies for digital information systems.
26

Kazai, Gabriella, Mounia Lalmas, Norbert Fuhr, and Norbert Gövert. "A report on the first year of the INitiative for the Evaluation of XML retrieval (INEX'02)." Journal of the American Society for Information Science and Technology 55, no. 6 (January 21, 2004): 551–56. http://dx.doi.org/10.1002/asi.10386.

27

Ayalew, Yirsaw, Barbara Moeng, and Gontlafetse Mosweunyane. "Experimental evaluation of ontology-based HIV/AIDS frequently asked question retrieval system." Health Informatics Journal 25, no. 4 (May 23, 2018): 1434–50. http://dx.doi.org/10.1177/1460458218775147.

Abstract:
This study presents the results of experimental evaluations of an ontology-based frequently asked question retrieval system in the domain of HIV and AIDS. The main purpose of the system is to provide answers to questions on HIV/AIDS using ontology. To evaluate the effectiveness of the frequently asked question retrieval system, we conducted two experiments. The first experiment focused on the evaluation of the quality of the ontology we developed using the OQuaRE evaluation framework which is based on software quality metrics and metrics designed for ontology quality evaluation. The second experiment focused on evaluating the effectiveness of the ontology in retrieving relevant answers. For this we used an open-source information retrieval platform, Terrier, with retrieval models BM25 and PL2. For the measurement of performance, we used the measures mean average precision, mean reciprocal rank, and precision at 5. The results suggest that frequently asked question retrieval with ontology is more effective than frequently asked question retrieval without ontology in the domain of HIV/AIDS.
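For reference, here are minimal generic implementations of the three measures named in the abstract (mean reciprocal rank, precision at 5, and mean average precision) for binary relevance; this is a sketch of the standard definitions, not the Terrier-based setup used in the study.

```python
# Hedged sketch: standard binary-relevance measures used in the evaluation.
# `ranking` is a list of retrieved ids; `relevant` is the set of relevant ids.
def reciprocal_rank(ranking, relevant):
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            return 1.0 / i
    return 0.0

def precision_at_k(ranking, relevant, k=5):
    return sum(d in relevant for d in ranking[:k]) / k

def average_precision(ranking, relevant):
    hits, total = 0, 0.0
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / i
    return total / max(len(relevant), 1)

ranking = ["q12", "q7", "q3", "q9", "q1"]
relevant = {"q7", "q1"}
print(reciprocal_rank(ranking, relevant),       # 0.5
      precision_at_k(ranking, relevant, 5),     # 0.4
      average_precision(ranking, relevant))     # (1/2 + 2/5) / 2 = 0.45
```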
28

Sliusar, V. V. "Model for Evaluation of Information Retrieval Effectiveness within Semantic Web Concept." Proceedings of Universities. Electronics 23, no. 3 (2018): 308–12. http://dx.doi.org/10.24151/1561-5405-2018-23-3-308-312.

29

Di Lecce, Vincenzo, and Andrea Guerriero. "An Evaluation of the Effectiveness of Image Features for Image Retrieval." Journal of Visual Communication and Image Representation 10, no. 4 (December 1999): 351–62. http://dx.doi.org/10.1006/jvci.1999.0423.

30

Tamine-Lechani, Lynda, Mohand Boughanem, and Mariam Daoud. "Evaluation of contextual information retrieval effectiveness: overview of issues and research." Knowledge and Information Systems 24, no. 1 (July 15, 2009): 1–34. http://dx.doi.org/10.1007/s10115-009-0231-1.

31

Behnert, Christiane, and Dirk Lewandowski. "A framework for designing retrieval effectiveness studies of library information systems using human relevance assessments." Journal of Documentation 73, no. 3 (May 8, 2017): 509–27. http://dx.doi.org/10.1108/jd-08-2016-0099.

Abstract:
Purpose The purpose of this paper is to demonstrate how to apply traditional information retrieval (IR) evaluation methods based on standards from the Text REtrieval Conference and web search evaluation to all types of modern library information systems (LISs) including online public access catalogues, discovery systems, and digital libraries that provide web search features to gather information from heterogeneous sources. Design/methodology/approach The authors apply conventional procedures from IR evaluation to the LIS context considering the specific characteristics of modern library materials. Findings The authors introduce a framework consisting of five parts: search queries, search results, assessors, testing, and data analysis. The authors show how to deal with comparability problems resulting from diverse document types, e.g., electronic articles vs printed monographs and what issues need to be considered for retrieval tests in the library context. Practical implications The framework can be used as a guideline for conducting retrieval effectiveness studies in the library context. Originality/value Although a considerable amount of research has been done on IR evaluation, and standards for conducting retrieval effectiveness studies do exist, to the authors’ knowledge this is the first attempt to provide a systematic framework for evaluating the retrieval effectiveness of twenty-first-century LISs. The authors demonstrate which issues must be considered and what decisions must be made by researchers prior to a retrieval test.
32

Marijan, Robert, and Robert Leskovar. "A library’s information retrieval system (In)effectiveness: case study." Library Hi Tech 33, no. 3 (September 21, 2015): 369–86. http://dx.doi.org/10.1108/lht-07-2015-0071.

Abstract:
Purpose – The purpose of this paper is to evaluate the effectiveness of the information retrieval component of a daily newspaper publisher’s integrated library system (ILS) in comparison with the open source alternatives and observe the impact of the scale of metadata, generated daily by library administrators, on retrieved result sets. Design/methodology/approach – In Experiment 1, the authors compared the result sets of the information retrieval system (IRS) component of the publisher’s current ILS and the result sets of proposed ones with human-assessed relevance judgment set. In Experiment 2, the authors compared the performance of proposed IRS components with the publisher’s current production IRS, using result sets of current IRS classified as relevant. Both experiments were conducted using standard information retrieval (IR) evaluation methods: precision, recall, precision at k, F-measure, mean average precision and 11-point interpolated average precision. Findings – Results showed that: first, in Experiment 1, the publisher’s current production ILS ranked last of all participating IRSs when compared to a relevance document set classified by the senior library administrator; and second, in Experiment 2, the tested IR components’ request handlers that used only automatically generated metadata performed slightly better than request handlers that used all of the metadata fields. Therefore, regarding the effectiveness of IR, the daily human effort of generating the publisher’s current set of metadata attributes is unjustified. Research limitations/implications – The experiments’ collections contained Slovene language with large number of variations of the forms of nouns, verbs and adjectives. The results could be different if the experiments’ collections contained languages with different grammatical properties. Practical implications – The authors have confirmed, using standard IR methods, that the IR component used in the publisher’s current ILS, could be adequately replaced with an open source component. Based on the research, the publisher could incorporate the suggested open source IR components in practice. In the research, the authors have described the methods that can be used by libraries for evaluating the effectiveness of the IR of their ILSs. Originality/value – The paper provides a framework for the evaluation of an ILS’s IR effectiveness for libraries. Based on the evaluation results, the libraries could replace the IR components if their current information system setup allows it.
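Among the measures listed in the abstract, 11-point interpolated average precision is the least self-explanatory; a generic sketch for a single ranked list with binary judgments is given below, assuming the standard definition (interpolated precision at each recall level is the maximum precision at any recall at or above that level). This is not the authors' evaluation code.

```python
# Hedged sketch: 11-point interpolated average precision for one ranked list.
def eleven_point_iap(ranking, relevant):
    points = []          # (recall, precision) after each retrieved document
    hits = 0
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / i))
    interpolated = []
    for level in [i / 10 for i in range(11)]:
        precs = [p for r, p in points if r >= level]
        interpolated.append(max(precs) if precs else 0.0)
    return sum(interpolated) / 11

ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2"}
print(round(eleven_point_iap(ranking, relevant), 3))
```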
33

Ro, Jung Soon. "An evaluation of the applicability of ranking algorithms to improve the effectiveness of full-text retrieval. I. On the effectiveness of full-text retrieval." Journal of the American Society for Information Science 39, no. 2 (March 1988): 73–78. http://dx.doi.org/10.1002/(sici)1097-4571(198803)39:2<73::aid-asi1>3.0.co;2-x.

34

Rajagopal, Prabha, Sri Devi Ravana, Yun Sing Koh, and Vimala Balakrishnan. "Evaluating the effectiveness of information retrieval systems using effort-based relevance judgment." Aslib Journal of Information Management 71, no. 1 (January 21, 2019): 2–17. http://dx.doi.org/10.1108/ajim-04-2018-0086.

Abstract:
Purpose The effort in addition to relevance is a major factor for satisfaction and utility of the document to the actual user. The purpose of this paper is to propose a method in generating relevance judgments that incorporate effort without human judges’ involvement. Then the study determines the variation in system rankings due to low effort relevance judgment in evaluating retrieval systems at different depth of evaluation. Design/methodology/approach Effort-based relevance judgments are generated using a proposed boxplot approach for simple document features, HTML features and readability features. The boxplot approach is a simple yet repeatable approach in classifying documents’ effort while ensuring outlier scores do not skew the grading of the entire set of documents. Findings The retrieval systems evaluation using low effort relevance judgments has a stronger influence on shallow depth of evaluation compared to deeper depth. It is proved that difference in the system rankings is due to low effort documents and not the number of relevant documents. Originality/value Hence, it is crucial to evaluate retrieval systems at shallow depth using low effort relevance judgments.
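The boxplot idea in the abstract can be illustrated generically: use quartiles and the interquartile range of a document feature (for example word count) to grade documents into effort levels while keeping outliers from skewing the cut-offs. This is an assumed reading of the approach, not the authors' exact procedure, and the feature values are toy data.

```python
# Hedged sketch: grade documents into low/medium/high effort from one feature
# (e.g. word count) using boxplot statistics, so outliers do not skew the grades.
import numpy as np

word_counts = np.array([120, 150, 180, 200, 240, 300, 320, 400, 450, 5000])  # toy feature

q1, q3 = np.percentile(word_counts, [25, 75])
iqr = q3 - q1
upper_whisker = q3 + 1.5 * iqr          # classic boxplot outlier fence

def effort(value):
    if value > upper_whisker:
        return "outlier/high"
    if value <= q1:
        return "low"
    if value <= q3:
        return "medium"
    return "high"

print([effort(v) for v in word_counts])
```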
35

JING, YIXIN, DONGWON JEONG, and DOO-KWON BAIK. "AN INFERENCE-ENABLED ACCESS CONTROL MODEL FOR RDF ONTOLOGY." International Journal of Software Engineering and Knowledge Engineering 19, no. 03 (May 2009): 339–60. http://dx.doi.org/10.1142/s0218194009004209.

Abstract:
Although RDF ontologies are expressed based on XML syntax, existing methods to protect XML documents are not suitable for securing RDF ontologies. The graph style and inference feature of RDF ontologies demands new methods for access control. Driven by this goal, this paper proposes a query-oriented model for RDF ontology access control. The model adopts the concept of ontology view to rewrite user queries. In our approach, ontology views define accessible ontology concepts and instances a user can visit, and enables a controlled inference capability for the user. The design of the views guarantees that the views are free of conflict. Based on that, the paper describes algorithms for rewriting queries according to different views, and provides a system architecture along with an implemented prototype. In the evaluation, the system exhibits a promising result in terms of effectiveness and soundness.
36

Demian, Peter, Kirti Ruikar, Tarun Sahu, and Anne Morris. "Three-Dimensional Information Retrieval (3DIR)." International Journal of 3-D Information Modeling 5, no. 1 (January 2016): 67–78. http://dx.doi.org/10.4018/ij3dim.2016010105.

Abstract:
An increasing amount of information is packed into BIMs, with the 3D geometry serving as a central index leading to other information. The 3DIR project investigates information retrieval from such environments. Here, the 3D visualization can be exploited when formulating queries, computing the relevance of information items, or visualizing search results. The need for such a system was specified using workshops with end users. A prototype was built on a commercial BIM platform. Following an evaluation, the system was enhanced to exploit model topology. Relationships between 3D objects are used to widen the search, whereby relevant information items linked to a related 3D object (rather than linked directly to objects selected by the user) are still retrieved but ranked lower. An evaluation of the enhanced prototype demonstrates its effectiveness but highlights its added complexity. Care needs to be taken when exploiting topological relationships, but that a tight coupling between text-based retrieval and the 3D model is generally effective in information retrieval from BIMs.
37

Miao, Yi Chan, Gang Zhao, and Guang Rong Yan. "Comparison and Evaluation of Lightweight Formats of 3D Models." Applied Mechanics and Materials 157-158 (February 2012): 1490–97. http://dx.doi.org/10.4028/www.scientific.net/amm.157-158.1490.

Abstract:
3D models created by different CAD software packages have different internal formats, which leads to poor software compatibility. Loading 3D models that contain large amounts of information into tools such as CATIA also takes a long time. During the digital collaboration process of product design, quick transmission of 3D models is indispensable. However, 3D models occupy a large amount of storage space, resulting in longer transmission times and reduced real-time performance of the collaborative process. A good industry solution is to use lightweight formats in collaborative product design, and a variety of lightweight formats now exist. This paper focuses on a comparison of three lightweight formats (3D XML, JT, U3D) often used in industry. Experiments are given to evaluate the effectiveness of the formats. In the end, the trend of lightweight formats is also predicted.
38

Ro, Jung Soon. "An evaluation of the applicability of ranking algorithms to improve the effectiveness of full-text retrieval. II. On the effectiveness of ranking algorithms on full-text retrieval." Journal of the American Society for Information Science 39, no. 3 (May 1988): 147–60. http://dx.doi.org/10.1002/(sici)1097-4571(198805)39:3<147::aid-asi1>3.0.co;2-o.

39

Li, Dan. "Effective collection construction for information retrieval evaluation and optimization." ACM SIGIR Forum 54, no. 2 (December 2020): 1–2. http://dx.doi.org/10.1145/3483382.3483401.

Abstract:
The availability of test collections in the Cranfield paradigm has significantly benefited the development of models, methods and tools in information retrieval. Such test collections typically consist of a set of topics, a document collection and a set of relevance assessments. Constructing these test collections requires effort on various fronts, such as topic selection, document selection, relevance assessment, and relevance label aggregation. The work in the thesis provides a fundamental way of constructing and utilizing test collections in information retrieval in an effective, efficient and reliable manner. To that end, we have focused on four aspects. We first study the document selection issue when building test collections. We devise an active sampling method for efficient large-scale evaluation [Li and Kanoulas, 2017]. Different from past sampling-based approaches, we account for the fact that some systems are of higher quality than others, and we design the sampling distribution to over-sample documents from these systems. At the same time, the estimated evaluation measures are unbiased, and assessments can be used to evaluate new, novel systems without introducing any systematic error. Then a natural further step is determining when to stop the document selection and assessment procedure. This is an important but understudied problem in the construction of test collections. We consider both the gain of identifying relevant documents and the cost of assessing documents as the optimization goals. We handle the problem under the continuous active learning framework by jointly training a ranking model to rank documents, and estimating the total number of relevant documents in the collection using a "greedy" sampling method [Li and Kanoulas, 2020]. The next stage of constructing a test collection is assessing relevance. We study how to denoise relevance assessments by aggregating from multiple crowd annotation sources to obtain high-quality relevance assessments. This helps to boost the quality of relevance assessments acquired in a crowdsourcing manner. We assume a Gaussian process prior on query-document pairs to model their correlation. The proposed model shows good performance in terms of inferring true relevance labels. Besides, it allows predicting relevance labels for new tasks that have no crowd annotations, which is a new functionality of CrowdGP. Ablation studies demonstrate that the effectiveness is attributed to the modelling of task correlation based on the auxiliary information of tasks and the prior relevance information of documents to queries. After a test collection is constructed, it can be used to either evaluate retrieval systems or train a ranking model. We propose to use it to optimize the configuration of retrieval systems. We use a Bayesian optimization approach to model the effect of a δ-step in the configuration space on the effectiveness of the retrieval system, suggesting the use of different similarity functions (covariance functions) for continuous and categorical values, and examine their ability to effectively and efficiently guide the search in the configuration space [Li and Kanoulas, 2018]. Beyond the algorithmic and empirical contributions, work done as part of this thesis also contributed to the research community as the CLEF Technology Assisted Reviews in Empirical Medicine Tracks in 2017, 2018, and 2019 [Kanoulas et al., 2017, 2018, 2019]. Awarded by: University of Amsterdam, Amsterdam, The Netherlands. Supervised by: Evangelos Kanoulas. Available at: https://dare.uva.nl/search?identifier=3438a2b6-9271-4f2c-add5-3c811cc48d42.
40

Ahlgren, Per, and Leif Grönqvist. "Evaluation of retrieval effectiveness with incomplete relevance data: Theoretical and experimental comparison of three measures." Information Processing & Management 44, no. 1 (January 2008): 212–25. http://dx.doi.org/10.1016/j.ipm.2007.01.011.

41

Akritidis, Leonidas, Dimitrios Katsaros, and Panayiotis Bozanis. "Improved retrieval effectiveness by efficient combination of term proximity and zone scoring: A simulation-based evaluation." Simulation Modelling Practice and Theory 22 (March 2012): 74–91. http://dx.doi.org/10.1016/j.simpat.2011.12.002.

42

Bakar, Zainab Abu, Tengku Mohd T. Sembok, and Mohammed Yusoff. "An evaluation of retrieval effectiveness using spelling-correction and string-similarity matching methods on Malay texts." Journal of the American Society for Information Science 51, no. 8 (2000): 691–706. http://dx.doi.org/10.1002/(sici)1097-4571(2000)51:8<691::aid-asi20>3.0.co;2-u.

43

Qiao, Feng Cai, Xin Zhang, Hui Wang, and Jian Ping Cao. "Evaluation of Near-Duplicate Image Retrieval Algorithms for the Identification of Celebrities in Web Images." Advanced Materials Research 765-767 (September 2013): 1431–35. http://dx.doi.org/10.4028/www.scientific.net/amr.765-767.1431.

Abstract:
Near-duplicate image retrieval is a classical research problem in computer vision, for which a large number of diverse approaches have been proposed. Recent studies have revealed that it can be used as an intermediate step to implement search-based celebrity identification given the existence of huge volume of user-tagged or text-surrounded celebrity images on the web. However, the effectiveness of existing near-duplicate image retrieval methods for such a task still remains unclear. To address this issue, this paper presents a comprehensive study of the existing near-duplicate image retrieval methods in a structural way. Four representatives of the existing methods, i.e. hash signature, mean SSIM, BoVW with SIFT features and ARG, are experimentally evaluated using a self-constructed dataset containing 24762 images of 15 top searched celebrities collected using 6 news search engines and the Google image search engine. The experimental results reveal that, compared with global feature based methods, local feature based ones are usually more appropriate for the task of celebrity identification in web images, as they can deal with partial duplicate and scene similar images better. In particular, BoVW with SIFT features is recommended as it provides the best trade-off between on-line speed and retrieval accuracy.
44

Elkerton, Jay, and Susan Palmiter. "Designing Help Systems Using the GOMS Model: An Information Retrieval Evaluation." Proceedings of the Human Factors Society Annual Meeting 33, no. 5 (October 1989): 281–85. http://dx.doi.org/10.1177/154193128903300510.

Abstract:
Using the GOMS model (Card, Moran, and Newell, 1983), a help system was developed which was complete and well structured. The content of this help system was determined from the goals, operators, methods, and selection rules needed to perform HyperCard™ authoring tasks. The index to these methods, which was an integrated part of the system, was determined from the hierarchical goal tree provided by the GOMS analysis. To determine the effectiveness of using GOMS as a design aid for help systems, the GOMS help system was compared to a state-of-the art interface developed by Apple® Computer which was modified slightly for experimental purposes (Original help system). Two groups of 14 users, using one of the two help systems, retrieved help information about 56 tasks separated into 4 sessions. The results indicated that the GOMS users were significantly faster than the Original users with the largest speed difference occurring in the first session. However, no reliable differences were found for retrieval accuracy between the two groups. This is not surprising since the Original help system was found to have 85.9% of the procedural information contained in the GOMS help system. Interestingly, participants subjectively rated the GOMS help system higher than the Original help system. Overall, the results from this information retrieval study suggest that a GOMS model can aid in the development of help systems which are easy to use, easy to learn, and well liked.
45

Du, Liying, and Nabila H. Saleh. "Medical Image Retrieval Algorithm Based on Content." Open Electrical & Electronic Engineering Journal 8, no. 1 (December 31, 2014): 675–79. http://dx.doi.org/10.2174/1874129001408010675.

Abstract:
With the rapid development of science and technology and the improving medical service, medical image's role in clinical diagnosis and treatment becomes increasingly prominent. It has become a high-profile task to help the doctors pick out desired targets from massive medical images. Currently, techniques of text-based medical image retrieval have failed to meet the need of massive medical image retrieval. On the other hand, techniques of content-based medical image retrieval are already established and hold vast research potential. Starting with the introduction of matured techniques of medical image retrieval, the paper expounds on the evaluation criteria of the effectiveness of such techniques, then on the modified text-and-content-based medical image retrieval algorithm. The last part is the verification of the research conclusion with contrast experiments illustrated by the sample figures.
46

Yoosooka, Burasakorn, and Vilas Wuwongse. "An Adaptive System for Retrieval and Composition of Learning Objects." International Journal of Systems and Service-Oriented Engineering 2, no. 4 (October 2011): 42–59. http://dx.doi.org/10.4018/jssoe.2011100103.

Abstract:
This paper proposes a new approach to automatic retrieval and composition of Learning Objects (LOs) in an Adaptive Educational Hypermedia System (AEHS) using multidimensional learner characteristics to enhance learning effectiveness. The approach focuses on adaptive techniques in four components of AEHS: Learning Paths, LO Retrieval, LO Sequencing, and Examination Difficulty Levels. This approach has been designed to enable the adaptation of rules to become generic. Hence, the application to various domains is possible. The approach dynamically selects, sequences, and composes LOs into an individual learning package based on the use of domain ontology, learner profiles, and LO metadata. The Sharable Content Object Reference Model is employed to represent LO metadata and learning packages in order to support LO sharing. The IMS Learner Information Package Specification is used to represent learner profiles. A preliminary evaluation of the developed system indicates the system’s effectiveness in terms of learners’ satisfaction.
47

Rahmati, Marzie, and Mohammad Ali Zare Chahooki. "Improvement in bug localization based on kernel extreme learning machine." Journal of Communications Technology, Electronics and Computer Science 5 (April 30, 2016): 1. http://dx.doi.org/10.22385/jctecs.v5i0.77.

Abstract:
Bug localization uses bug reports received from users, developers and testers to locate buggy files. Since finding a buggy file among thousands of files is time consuming and tedious for developers, various methods based on information retrieval have been suggested to automate this process. In addition to information retrieval methods, machine learning methods are also used for bug localization. Machine learning-based approaches improve the description of bug reports and program code by representing them as feature vectors. Hypothesis learning with the Extreme Learning Machine (ELM) has recently proved effective in many areas. This paper shows the effectiveness of the non-linear kernel ELM in bug localization. Furthermore, the effectiveness of different kernels in ELM compared to other kernel-based learning methods is analyzed. The experimental results of the hypothesis evaluation on the Mozilla Firefox dataset show the effectiveness of kernel ELM for bug localization in software projects.
48

Hashimoto, Tomomi, Yuuki Munakata, Ryousuke Yamanaka, and Akinari Kurosu. "Proposal of Episodic Memory Retrieval Method on Mood Congruence Effects." Journal of Advanced Computational Intelligence and Intelligent Informatics 21, no. 4 (July 20, 2017): 722–29. http://dx.doi.org/10.20965/jaciii.2017.p0722.

Abstract:
Developments in robotics have advanced the development of robots that can communicate with humans. However, a few robots are only capable of stereotypic responses, and this leads to barriers in smooth communication between humans and robots. In this study, in order to represent mood congruence effects, an episodic memory retrieval method is proposed based on a robot’s emotional values that represent its internal state. In the study, impression evaluation experiments were conducted to prove the effectiveness of the proposed method.
49

SHEN, JIALIE, JOHN SHEPHERD, and ANNE H. H. NGU. "AN EMPIRICAL STUDY OF QUERY EFFECTIVENESS IMPROVEMENT VIA MULTIPLE VISUAL FEATURE INTEGRATION." International Journal of Image and Graphics 07, no. 03 (July 2007): 551–81. http://dx.doi.org/10.1142/s0219467807002751.

Abstract:
This article is a comprehensive evaluation of a new framework for indexing image data, called CMVF, which can combine multiple data properties with a hybrid architecture. The goal of this system is to allow straightforward incorporation of multiple visual feature vectors, based on properties such as color, texture and shape, into a single low-dimension vector that is more effective for retrieval than the larger individual feature vectors. Moreover, CMVF is not only constrained to visual properties, but can also incorporate human classification criteria to further strengthen image retrieval process. The controlled study present in this paper concentrates on CMVF's performance on images, examining how the incorporation of extra features into the indexing affects both efficiency and effectiveness. Analysis and empirical evidence suggest that the inclusion of extra visual features can significantly improve system performance. Furthermore, it demonstrated that CMVF's effectiveness is robust against various kinds of common image distortions and initial (random) configuration of neural network.
50

Li, Fei, Jia Jia Huang, Min Peng, and Rui Cai. "Feedback Ranking Method in Topic-Based Retrieval." Applied Mechanics and Materials 339 (July 2013): 269–74. http://dx.doi.org/10.4028/www.scientific.net/amm.339.269.

Abstract:
Ranking has an extensive application in analyzing public opinions of social network (SN), such as searching the most hot topic or the most relevant articles that the user concerning. In these scenarios, due to the different requirements of users, there is need to rank the object set from different aspects and to re-rank the object set by integrating these different results to acquire a synthesize rank result.In this paper, we proposed a novel Feedback Ranking method, which lets two basic rankers learn from each other during the mutual process by providing each one's result as feedback to the other so as to boost the ranking performance. During the mutual ranking refinement process, we utilize iSRCC---an improvement on Spearman Rank Correlation to calculate the weight of each basic rankers dynamically. We apply this method into the article ranking problem on topic-query retrieval and evaluate its effectiveness on the TAC09 data set. Overall evaluation results are promising.
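To ground the role of rank correlation in the mutual-refinement loop described above, here is a generic Spearman rank correlation computation with scipy used to derive a mixing weight; the scores and the weighting rule are illustrative assumptions, and the paper's iSRCC variant itself is not reproduced.

```python
# Hedged sketch: weight two basic rankers by how much their rankings agree,
# using plain Spearman rank correlation (the paper's iSRCC is not reproduced).
from scipy.stats import spearmanr

ranker_a = {"doc1": 0.9, "doc2": 0.7, "doc3": 0.4, "doc4": 0.1}
ranker_b = {"doc1": 0.8, "doc2": 0.3, "doc3": 0.6, "doc4": 0.2}

docs = sorted(ranker_a)
rho, _ = spearmanr([ranker_a[d] for d in docs], [ranker_b[d] for d in docs])

# Turn the agreement into an (illustrative) mixing weight for the two rankers.
w_a = (1 + rho) / 2
combined = {d: w_a * ranker_a[d] + (1 - w_a) * ranker_b[d] for d in docs}
print(rho, sorted(combined, key=combined.get, reverse=True))
```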