Journal articles on the topic "LM. Automatic text retrieval"

Consult the top 50 journal articles for your research on the topic "LM. Automatic text retrieval".

1

Salton, G. "Developments in Automatic Text Retrieval." Science 253, no. 5023 (August 30, 1991): 974–80. http://dx.doi.org/10.1126/science.253.5023.974.

2

Wai Lam, M. Ruiz, and P. Srinivasan. "Automatic text categorization and its application to text retrieval." IEEE Transactions on Knowledge and Data Engineering 11, no. 6 (1999): 865–79. http://dx.doi.org/10.1109/69.824599.

3

Salton, Gerard. "Another look at automatic text-retrieval systems." Communications of the ACM 29, no. 7 (July 1986): 648–56. http://dx.doi.org/10.1145/6138.6149.

4

Foo, Schubert, Siu Cheung Hui, Hong Koon Lim, and Li Hui. "Automatic thesaurus for enhanced Chinese text retrieval." Library Review 49, no. 5 (July 2000): 230–40. http://dx.doi.org/10.1108/00242530010331754.

5

Salton, Gerard, and Christopher Buckley. "Term-weighting approaches in automatic text retrieval." Information Processing & Management 24, no. 5 (January 1988): 513–23. http://dx.doi.org/10.1016/0306-4573(88)90021-0.

6

Heinrich, Helen, and Eric Willis. "Automated storage and retrieval system: a time-tested innovation." Library Management 35, no. 6/7 (August 5, 2014): 444–53. http://dx.doi.org/10.1108/lm-09-2013-0086.

Abstract:
Purpose – The purpose of this paper is to examine the ongoing life cycle of the world's first library Automated Storage and Retrieval System (ASRS) at the Oviatt Library at the California State University, Northridge (CSUN). Born from the pilot project at the California State University Chancellor's Office, CSUN's ASRS was inaugurated in 1991 and cost over $2,000,000 to implement. It survived a devastating 6.8 Northridge earthquake and protected the collection housed within. Almost 20 years later the CSUN ASRS underwent a major renovation of hardware. With the changing concept of library as space and the construction of Learning Commons at the Oviatt, the demand for ASRS capacity is higher than ever. Design/methodology/approach – In addition to the history and overview, the paper explores the major aspects of ASRS administration: specifications of storage layout and arrangement of the materials, collection policy for storing materials, communication of retrieval requests and ASRS interface and compatibility with successive Integrated Library Systems. Findings – The first ASRS served as proof of concept that a library collection does not lose its effectiveness when low-circulating materials are removed from the open stacks. Furthermore, with the changing concept of library as space and the construction of Learning Commons at the Oviatt, the provision of the nimble, just-in-time collection becomes paramount, and the demand for ASRS increases exponentially. Practical implications – Administrators and librarians who consider investing in ASRS will learn about the principles of storage organization, imperatives and challenges of its conception and long-term management on the example of CSUN. Originality/value – The paper carries unique qualities as it describes the formation and evolution of the world's first library ASRS. 
The visionary undertaking not only withstood the test of time and nature, it continues to play a pivotal role in Oviatt Library's adaption to the new generation of users’ demands and expectations.
7

Salton, Gerard, James Allan, and Chris Buckley. "Automatic structuring and retrieval of large text files." Communications of the ACM 37, no. 2 (February 1994): 97–108. http://dx.doi.org/10.1145/175235.175243.

8

Wu, Zimin, and Gwyneth Tseng. "ACTS: An automatic Chinese text segmentation system for full text retrieval." Journal of the American Society for Information Science 46, no. 2 (March 1995): 83–96. http://dx.doi.org/10.1002/(sici)1097-4571(199503)46:2<83::aid-asi2>3.0.co;2-0.

9

Hamdy, Abeer, and Mohamed Elsayed. "Automatic Recommendation of Software Design Patterns: Text Retrieval Approach." Journal of Software 13, no. 4 (April 2018): 260–68. http://dx.doi.org/10.17706/jsw.13.4.260-268.

10

Kim, Jin-Suk, Du-Seok Jin, Kwang-Young Kim, and Ho-Seop Choe. "Automatic In-Text Keyword Tagging based on Information Retrieval." Journal of Information Processing Systems 5, no. 3 (September 30, 2009): 159–66. http://dx.doi.org/10.3745/jips.2009.5.3.159.

11

Loukachevitch, Natalia, and Boris Dobrov. "The Sociopolitical Thesaurus as a resource for automatic document processing in Russian." Terminology 21, no. 2 (December 30, 2015): 237–62. http://dx.doi.org/10.1075/term.21.2.05lou.

Abstract:
This paper presents the structure and current state of the Sociopolitical thesaurus, which was developed for automatic document analysis and information-retrieval applications in Russian in a broad domain of public affairs. The scope of the Sociopolitical thesaurus resembles traditional information-retrieval thesauri for broad domains such as the EUROVOC or UNBIS thesauri, but the Sociopolitical thesaurus is intended as a tool for automatic document processing and this difference leads to considerable distinctions in the thesaurus structure and principles of its development. The knowledge representation in the Sociopolitical thesaurus is based on the combination of three existing traditions of developing information-retrieval thesauri, wordnets, and formal ontology research, which facilitates the consistent representation for such a broad scope of concepts and automatic document analysis of unstructured texts. The Sociopolitical thesaurus is used in such applications as conceptual indexing in information-retrieval systems, knowledge-based text categorization, automatic summarization of single and multiple documents, and question-answering. This paper presents an evaluation of the Sociopolitical thesaurus in automatic knowledge-based text categorization.
12

Lancaster, F. W. "Retrieval experiments: Full text versus human indexing versus automatic indexing." Journal of the American Society for Information Science 49, no. 5 (1998): 483–84. http://dx.doi.org/10.1002/(sici)1097-4571(19980415)49:5<483::aid-asi13>3.0.co;2-a.

13

Lancaster, F. W. "Retrieval experiments: Full text versus human indexing versus automatic indexing." Journal of the American Society for Information Science 49, no. 5 (1998): 484. http://dx.doi.org/10.1002/(sici)1097-4571(19980415)49:5<484::aid-asi14>3.0.co;2-6.

14

Lancaster, F. W. "Retrieval experiments: Full text versus human indexing versus automatic indexing." Journal of the American Society for Information Science 49, no. 5 (April 15, 1998): 484. http://dx.doi.org/10.1002/(sici)1097-4571(19980415)49:5<484::aid-asi14>3.3.co;2-y.

15

Ni, Pin, Yuming Li, and Victor Chang. "Research on Text Classification Based on Automatically Extracted Keywords." International Journal of Enterprise Information Systems 16, no. 4 (October 2020): 1–16. http://dx.doi.org/10.4018/ijeis.2020100101.

Abstract:
Automatic keywords extraction and classification tasks are important research directions in the domains of NLP (natural language processing), information retrieval, and text mining. As the fine granularity abstracted from text data, keywords are also the most important feature of text data, which has great practical and potential value in document classification, topic modeling, information retrieval, and other aspects. The compact representation of documents can be achieved through keywords, which contains massive significant information. Therefore, it may be quite advantageous to realize text classification with high-dimensional feature space. For this reason, this study designed a supervised keyword classification method based on TextRank keyword automatic extraction technology and optimize the model with the genetic algorithm to contribute to modeling the keywords of the topic for text classification.
16

Hkiri, Emna, Souheyl Mallat, and Mounir Zrigui. "Events Automatic Extraction from Arabic Texts." International Journal of Information Retrieval Research 6, no. 1 (January 2016): 36–51. http://dx.doi.org/10.4018/ijirr.2016010103.

Abstract:
The event extraction task consists in determining and classifying events within an open-domain text. It is very new for the Arabic language, whereas it attained its maturity for some languages such as English and French. Events extraction was also proved to help Natural Language Processing tasks such as Information Retrieval and Question Answering, text mining, machine translation etc… to obtain a higher performance. In this article, we present an ongoing effort to build a system for event extraction from Arabic texts using Gate platform and other tools.
17

Yadav, Niharika, and Vinay Kumar. "A novel technique for automatic retrieval of embedded text from books." Optik 127, no. 20 (October 2016): 9538–50. http://dx.doi.org/10.1016/j.ijleo.2016.05.122.

18

Zhu, Hong Mei, Yong Quan Liang, Qi Jia Tian, and Shu Juan Ji. "Agricultural Policy-Oriented Ontology-Based Semantic Information Retrieval." Key Engineering Materials 439-440 (June 2010): 572–76. http://dx.doi.org/10.4028/www.scientific.net/kem.439-440.572.

Abstract:
Research on architecture of ontology-based information semantic representation and Retrieval is done. As a case study, a prototype for agricultural policy-oriented ontology-based semantic information retrieval system (APOSIRS) is established. Ontology plays a role that providing a shared terminology and supporting for the retrieval process. The architecture allows APOSIRS-based applications to perform automatic semantic information Retrieval of agricultural policy text at more length: automatic and dynamic semantic annotation of unstructured and semi-structured content, semantically-enabled information extraction, indexing, retrieval, as well as ontology management, such as querying and modifying the underlying ontology and knowledge bases. Main components of this architecture have been implemented and their results are reported.
19

Takada, Tomoki, Mizuki Arai, and Tomohiro Takagi. "Automatic Keyword Annotation System Using Newspapers." Journal of Advanced Computational Intelligence and Intelligent Informatics 18, no. 3 (May 20, 2014): 340–46. http://dx.doi.org/10.20965/jaciii.2014.p0340.

Abstract:
Nowadays, an increasingly large amount of information exists on the web. Therefore, a method is needed that enables us to find necessary information quickly because this is becoming increasingly difficult for users. To solve this problem, information retrieval systems like Google and recommendation systems like that on Amazon are used. In this paper, we focus on information retrieval systems. These retrieval systems require index terms, which affect the precision of retrieval. Two methods generally decide index terms. One is analyzing a text using natural language processing and deciding index terms using varying amounts of statistics. The other is someone choosing document keywords as index terms. However, the latter method requires too much time and effort and becomes more impractical as information grows. Therefore, we propose the Nikkei annotator system, which is based on the model of the human brain and learns patterns of past keyword annotation and automatically outputs keywords that users prefer. The purposes of the proposed method are automating manual keyword annotation and achieving high speed and high accuracy keyword annotation. Experimental results showed that the proposed method is more accurate than TFIDF and Naive Bayes in P@5 and P@10. Moreover, these results also showed that the proposed method could annotate about 19 times faster than Naive Bayes.
20

Chen, Rong, Feng Chen, and Yi Sun. "Research on Automatic Text Classification Algorithm Based on ITF-IDF and KNN." Applied Mechanics and Materials 713-715 (January 2015): 1830–34. http://dx.doi.org/10.4028/www.scientific.net/amm.713-715.1830.

Abstract:
We consider how to efficiently text classification on all pairs of documents. This information can be used to information retrieval, digital library, information filtering, and search engine, among others. This paper describes text classification model which based on KNN algorithm. The text feature extraction algorithm, TF-IDF, can loss related information between text features, an improved ITF-IDF algorithm has been presented in order to overcome it. Our experiments show that our algorithm is better than others.
21

Fjeldvig, Tove, and Anne Golden. "Experiments with Language-based Aids in Information Retrieval Systems." Nordic Journal of Linguistics 11, no. 1-2 (June 1988): 33–46. http://dx.doi.org/10.1017/s0332586500001736.

Abstract:
The fact that a lexeme can appear in various forms causes problems in information retrieval. As a solution to this problem, we have developed methods for automatic root lemmatization, automatic truncation and automatic splitting of compound words. All the methods have as their basis a set of rules which contain information regarding inflected and derived forms of words – and not a dictionary. The methods have been tested on several collections of texts, and have produced very good results. By controlled experiments in text retrieval, we have studied the effects on search results. These results show that both the method of automatic root lemmatization and the method of automatic truncation make a considerable improvement on search quality. The experiments with splitting of compound words did not give quite the same improvement, however, but all the same this experiment showed that such a method could contribute to a richer and more complete search request.
22

Gragnaniello, Diego, Andrea Bottino, Sandro Cumani, and Wonjoon Kim. "Special Issue on Advances in Deep Learning." Applied Sciences 10, no. 9 (May 2, 2020): 3172. http://dx.doi.org/10.3390/app10093172.

Abstract:
Nowadays, deep learning is the fastest growing research field in machine learning and has a tremendous impact on a plethora of daily life applications, ranging from security and surveillance to autonomous driving, automatic indexing and retrieval of media content, text analysis, speech recognition, automatic translation, and many others [...]
23

Zhou, Ning, and Jianping Fan. "Automatic image–text alignment for large-scale web image indexing and retrieval." Pattern Recognition 48, no. 1 (January 2015): 205–19. http://dx.doi.org/10.1016/j.patcog.2014.07.001.

24

Zhang, Baopeng, Yanyun Qu, Jinye Peng, and Jianping Fan. "An automatic image-text alignment method for large-scale web image retrieval." Multimedia Tools and Applications 76, no. 20 (October 27, 2016): 21401–21. http://dx.doi.org/10.1007/s11042-016-4059-x.

25

Zhang, Hongli. "Voice Keyword Retrieval Method Using Attention Mechanism and Multimodal Information Fusion." Scientific Programming 2021 (January 23, 2021): 1–11. http://dx.doi.org/10.1155/2021/6662841.

Abstract:
A cross-modal speech-text retrieval method using interactive learning convolution automatic encoder (CAE) is proposed. First, an interactive learning autoencoder structure is proposed, including two inputs of speech and text, as well as processing links such as encoding, hidden layer interaction, and decoding, to complete the modeling of cross-modal speech-text retrieval. Then, the original audio signal is preprocessed and the Mel frequency cepstrum coefficient (MFCC) feature is extracted. In addition, the word bag model is used to extract the text features, and then the attention mechanism is used to combine the text and speech features. Through interactive learning CAE, the shared features of speech and text modes are obtained and then sent to modal classifier to identify modal information, so as to realize cross-modal voice text retrieval. Finally, experiments show that the performance of the proposed algorithm is better than that of the contrast algorithm in terms of recall rate, accuracy rate, and false recognition rate.
26

Boyce, Bert R. "Concepts of information retrieval and automatic text processing: The transformation analysis, and retrieval of information by computer." Journal of the American Society for Information Science 41, no. 2 (March 1990): 150–51. http://dx.doi.org/10.1002/(sici)1097-4571(199003)41:2<150::aid-asi12>3.0.co;2-8.

27

Ruggiero, Francesco, and Reinier Van Kleij. "ON-LINE HYPERMEDIA NEWSPAPERS: AN EXPERIMENT WITH ‘L’UNIONE SARDA’." International Journal of Modern Physics C 05, no. 05 (October 1994): 899–906. http://dx.doi.org/10.1142/s0129183194001033.

Abstract:
In this brief paper we present a prototype of an On-line hypermedia newspaper, the first example of daily electronic publishing in Italy, based on the results of a collaboration between CRS4 and L’UNIONE SARDA. The on-line newspaper (text and picture) is created by automatic retrieval, compression, transmission and conversion of newspaper data. The prototype is under development and currently allows automatic hypertextual links, article retrieval facilities and a simple mechanism for creating a personal newspaper. Some HyperText Markup Language (HTML) pages are shown to give an impression of the prototype.
28

Periyasamy, A. R. Pon. "Reversible N-grams Stemming Stripping Algorithm for Classification of Text Data." International Journal of Advanced Research in Computer Science and Software Engineering 7, no. 7 (July 30, 2017): 465. http://dx.doi.org/10.23956/ijarcsse/v7i4/0210.

Abstract:
Stemming methods traces the root or stem of a word that is possibly used for Information retrieval (IR) tasks for increasing the recall rate to enhance most relevant searches. There are numerous ways ranging from manual and automatic, language dependent to language independent of methods available for performing the task of stemming. Those algorithms are designed for the purpose of overcoming the challenges involved with the existing methods. This paper represents a comparative study of various available stemming algorithms widely used to enhance the effectiveness and efficiency of information retrieval.
29

Pavlick, Ellie, and Chris Callison-Burch. "Extracting Structured Information via Automatic + Human Computation." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 3 (September 23, 2015): 26–27. http://dx.doi.org/10.1609/hcomp.v3i1.13253.

Abstract:
We present a system for extracting structured information from unstructured text using a combination of information retrieval, natural language processing, machine learning, and crowdsourcing. We test our pipeline by building a structured database of gun violence incidents in the United States. The results of our pilot study demonstrate that the proposed methodology is a viable way of collecting large-scale, up-to-date data for public health, public policy, and social science research.
30

Sorokowska, Agnieszka, Marie Nord, Michał Mikołaj Stefańczyk, and Maria Larsson. "Odor-based context-dependent memory: influence of olfactory cues on declarative and nondeclarative memory indices." Learning & Memory 29, no. 5 (April 28, 2022): 136–41. http://dx.doi.org/10.1101/lm.053562.121.

Abstract:
Reinstating the olfactory learning context can increase access to memory information, but it is not fully clear which memory functions are subject to an enhancing odor context reinstatement effect. Here, we tested whether congruent odor context during encoding and recall positively affected declarative and nondeclarative memory scores using a novel method for manipulation of an odorous environment; namely, intranasal Nosa plugs. Recall of a text and a complex figure as well as performance in a priming task were assessed immediately and 1 wk after encoding. We found that congruent odor exposure at encoding and recall aided free retrieval of a story at delayed testing but had no significant effect on a complex figure recall or a word completion task. Differences between the assessed memory indices suggest that olfactory environmental cues may be primarily efficient in free verbal recall tasks.
31

Ostovar, Ahmad, Suna Bensch, and Thomas Hellström. "Natural language guided object retrieval in images." Acta Informatica 58, no. 4 (July 19, 2021): 243–61. http://dx.doi.org/10.1007/s00236-021-00400-2.

Abstract:
The ability to understand the surrounding environment and being able to communicate with interacting humans are important functionalities for many automated systems where visual input (e.g., images, video) and natural language input (speech or text) have to be related to each other. Possible applications are automatic image caption generation, interactive surveillance systems, or human robot interaction. In this paper, we propose algorithms for automatic responses to natural language queries about an image. Our approach uses a predefined neural net for detection of bounding boxes and objects in images, spatial relations between bounding boxes are modeled with a neural net, the queries are analyzed with a syntactic parser, and algorithms to map natural language to properties in the images are introduced. The algorithms make use of semantic similarity and antonyms. We evaluate the performance of our approach with test users assessing the quality of our system’s generated answers.
32

W Zaki, W. Mimi Diyana, Ling Chei Siong, Aini Hussain, W. Siti Halimatul Munirah W Ahmad, and Hamzaini Abdul Hamid. "12-APR Segmentation and Global Hu-F Descriptor for Human Spine MRI Image Retrieval." Jurnal Kejuruteraan 34, no. 4 (July 30, 2022): 659–70. http://dx.doi.org/10.17576/jkukm-2022-34(4)-14.

Abstract:
The image retrieval system has been used to provide the needed correct images to the physicians while the diagnosis and treatment process is being conducted. The earlier image retrieval system was a text-based image retrieval system (TBIRS) that used keywords for the image context and it requires human’s help to manually make text annotation on the images. The text annotation process is a laborious task especially when dealing with a huge database and is prone to human errors. To overcome the aforementioned issues, the approach of a content-based image retrieval system (CBIRS) with automatic indexing using visual features such as colour, shape and texture becomes popular. Thus, this study proposes a semi-automated shape segmentation method using a 12-anatomical point representation method of the human spine vertebrae for CBIRS. The 12 points, which are annotated manually on the region of interest (ROI), is followed by automatic ROI extraction. The segmentation method performs excellently, as evidenced by the highest accuracy of 0.9987, specificity of 0.9989, and sensitivity of 0.9913. The features of the segmented ROI are extracted with a novel global Hu-F descriptor that combines a global shape descriptor, a Hu moment invariant, and a Fourier descriptor based on the ANOVA selection approach. The retrieval phase is implemented using 100 MRI data of the human spine for thoracic, lumbar, and sacral bones. The highest obtained precision is 0.9110 using a normalized Manhattan metric for lumbar bones. In a conclusion, a retrieval system to retrieve lumbar bones of the MRI human spine has been successfully developed to help radiologists in diagnosing human spine diseases.
33

Polpinij, Jantima, and Chumsak Sribunruang. "Automatic Retrieval of Particular Oncology Documents from PubMed by Semantic-based Text Clustering." International Journal of Advancements in Computing Technology 5, no. 11 (July 31, 2013): 65–74. http://dx.doi.org/10.4156/ijact.vol5.issue11.8.

34

Bichi, Abdulkadir Abubakar, Ruhaidah Samsudin, and Rohayanti Hassan. "Automatic construction of generic stop words list for hausa text." Indonesian Journal of Electrical Engineering and Computer Science 25, no. 3 (March 1, 2022): 1501. http://dx.doi.org/10.11591/ijeecs.v25.i3.pp1501-1507.

Abstract:
Stop-words are words having the highest frequencies in a document without any significant information. They are characterized by having common relations within a cluster. They are the noise of the text that are evenly distributed over a document. Removal of stop words improve the performance and accuracy of information retrieval algorithms and machine learning at large. It saves the storage space by reducing the vector space dimension, and helps in effective documents indexing. This research generated a list of Hausa stop words automatically using aggregated method by combining frequency and statistics methods. The experiments are conducted using a primarily collected Hausa corpus consisting of 841 Hausa news articles of size 646862 words and finally a list of distinct 81 Hausa stop words is generated.
35

Klochko, Andriy. "INTRODUCTION OF INTELLECTUAL ANALYSIS TECHNOLOGIES OF TEXT DOCUMENTS INTO FIELD OF TECHNICAL REGULATION IN CONSTRUCTION." Management of Development of Complex Systems, no. 47 (September 27, 2021): 63–70. http://dx.doi.org/10.32347/2412-9933.2021.47.63-70.

Abstract:
The article is devoted to the introduction of intellectual analyses technology of text documents into the field of technical regulation in Ukraine construction. The main attention in the paper is directed on the decision of questions of automatic collection and intellectual analysis of construction branch's normative documents. These issues are becoming extremely important in connection with the digitalization of all sectors of the economy. Urgent problems of the technical regulation system in construction are highlighted. It is shown that these problems bring to the fore the task of increasing the speed and reliability of processing text documents in electronic information systems. The solution to this problem involves the development of automatic systems that are capable of intelligent document search in uncertainty conditions that caused by the presence of redundant textual information. The overview of information retrieval systems used to processing text documents in electronic information resources is conducted. Preconditions of introduction of intellectual analysis technologies of text documents into the technical regulation sphere in Ukraine construction are investigated. The timeliness of the technology implementation substantiated. The basic concepts used in the models and methods development of automatic extraction of meaningful information from texts are given. Process of data mining is studied and Models of textual information mining are analyzed that used in different information retrieval systems. Scheme of introduction of intellectual analysis technology of text documents into the Unified state electronic system in the field of construction is offered. Solution of clustering of text documents problem apply artificial neural networks is supposed. The possibility of using such models as Deep Structured Semantic Model and Self-Organizing Map is considered. The choice of these models is based on their ability to determine the proximity degree of information retrieval images of text documents. Work practical significance is seen in the improvement of search engines in the field of technical regulation in construction and the ability to significantly accelerate the process of restructuring into the field of technical regulation in Ukraine construction.
36

Ahn, Hyeokju, and Harksoo Kim. "Enhanced Spoken Sentence Retrieval Using a Conventional Automatic Speech Recognizer in Smart Home." International Journal on Artificial Intelligence Tools 25, no. 03 (June 2016): 1650017. http://dx.doi.org/10.1142/s0218213016500172.

Abstract:
With the rapid evolution of smart home environment, the demand for spoken information retrieval (e.g., voice-activated FAQ retrieval) on information appliances is increasing. In spoken information retrieval, users’ spoken queries are converted into text queries using automatic speech recognition (ASR) engines. If top-1 results of the ASR engines are incorrect, the errors are propagated to information retrieval systems. If a document collection is a small set of sentences such as frequently asked questions (FAQs), the errors have additional effect on the performance of information retrieval systems. To improve the performance of such a sentence retrieval system, we propose a post-processing model of an ASR engine. The post-processing model consists of a re-ranking and a query term generation model. The re-ranking model rearranges top-n outputs of the ASR engines using the ranking support vector machine (Ranking SVM). The query term generation model extracts meaningful content words from the re-ranked queries based on term frequencies and query rankings. In the experiments, the re-ranking model improved the top-1 performance results of an underlying ASR engine with 4.4% higher precision and 6.4% higher recall rate. The query term generation model improved the performance results of an underlying information retrieval system with an accuracy 2.4% to 2.6% higher. Based on the experimental result, the proposed model revealed that it could improve the performance of a spoken sentence retrieval system in a restricted domain.
37

Fkih, Fethi, and Mohamed Nazih Omri. "Information Retrieval from Unstructured Web Text Document Based on Automatic Learning of the Threshold". International Journal of Information Retrieval Research 2, no. 4 (October 2012): 12–30. http://dx.doi.org/10.4018/ijirr.2012100102.

Abstract:
A collocation is defined as a sequence of lexical tokens that habitually co-occur. This type of information is widely used in applications such as information retrieval, document indexing, machine translation, and lexicography. Many techniques have therefore been developed for the automatic retrieval of collocations from textual documents. These techniques use statistical measures based on a joint frequency calculation to quantify the strength of the connection between the tokens of a candidate collocation. Relevant and irrelevant collocations are discriminated using an a priori fixed threshold. Generally, this discrimination threshold is estimated manually by a domain expert. This supervised estimation is considered an additional cost that reduces system performance. In this paper, the authors propose a new technique for automatically learning the threshold in order to retrieve information from web text documents. The technique is mainly based on the usual performance evaluation measures (such as ROC and Precision-Recall curves). The results show that a statistical threshold can be estimated automatically, independently of the treated corpus.
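As a rough illustration of choosing a discrimination threshold from precision and recall, the sketch below sweeps the candidate thresholds over a small labelled sample and keeps the one with the best F1 score. The use of F1 as the selection criterion, the sample data, and the name `learn_threshold` are assumptions; the abstract only states that the technique relies on ROC and Precision-Recall curves.

```python
def learn_threshold(scores, labels):
    """Return (threshold, f1) for the cut-off that maximizes F1.

    scores: association-measure values of candidate collocations.
    labels: 1 if the candidate is a relevant collocation, else 0.
    A candidate is accepted when its score is >= the threshold.
    """
    total_relevant = sum(labels)
    best_threshold, best_f1 = None, -1.0
    for threshold in sorted(set(scores)):
        accepted = [y for s, y in zip(scores, labels) if s >= threshold]
        true_pos = sum(accepted)
        if true_pos == 0:
            continue  # precision and recall are both zero here
        precision = true_pos / len(accepted)
        recall = true_pos / total_relevant
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


# Hypothetical scores and expert labels for six candidate collocations.
sample_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
sample_labels = [1, 1, 0, 1, 0, 0]
print(learn_threshold(sample_scores, sample_labels))
```

Each point of the sweep corresponds to one point on the Precision-Recall curve, so other selection criteria (for example, the point closest to the top-left corner of a ROC curve) could be substituted for F1.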
38

Alsayadi, Hamzah A., and Mohammed Hadwan. "Automatic Speech Recognition for Qur’an Verses using Traditional Technique". Journal of Artificial Intelligence and Metaheuristics 1, no. 2 (2022): 17–23. http://dx.doi.org/10.54216/jaim.010202.

Abstract:
Deep learning is one of the approaches to machine learning that uses algorithms to build a model based on complex unstructured data. The Muslim Holy Qur’an is written in diacritized Arabic text. In this paper, a traditional method for building a robust Qur’an verse recognition system is proposed. MFCC is used to extract features. These features are adapted using minimum phone error (MPE) as a discriminative model. The acoustic model was built using a deep neural network (DNN) model. We present an n-gram language model (LM). The dataset of Qur’an verses used for training and evaluating the proposed model consists of 10 hours of .wav recitations performed by 60 reciters. The experimental results showed that the proposed DNN model achieved a notably low character error rate (CER) of 4.09% and a word error rate (WER) of 8.46%.
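For readers unfamiliar with the two metrics reported above, the following sketch shows the standard way CER and WER are computed: the Levenshtein edit distance between the reference and the recognized output, divided by the reference length. The function names and the English example strings are illustrative assumptions; the paper's own evaluation tooling is not described in the abstract.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One substitution and one deletion against a six-word reference.
print(wer("the cat sat on the mat", "the cat sit on mat"))
print(cer("abcd", "abed"))
```

Because insertions count as errors, both rates can exceed 100% when the hypothesis is much longer than the reference.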
39

Zheng, Xin, and Ai Ping Cai. "The Method of Web Image Annotation Classification Automatic". Advanced Materials Research 889-890 (February 2014): 1323–26. http://dx.doi.org/10.4028/www.scientific.net/amr.889-890.1323.

Abstract:
Finding related pictures on the Internet without annotation is laborious, so automatic image annotation is extremely important in image retrieval. Traditional methods simply translated the visual features of an image into keywords, ignoring the similarity problem between low-level visual features and high-level semantics. This is the image "gap" problem, and it kept annotation quality very low. This paper puts forward a hybrid, classification-based technique for automatically tagging web image content: first, the visual features of an image are mapped to one or more rough images; then the text of the web page is preprocessed; finally, a semantic similarity processing module selects similar keywords as the image annotation. The method thus combines image and text for automatic annotation and achieves high annotation precision.
40

Sakran, Alaa Ehab, Mohsen Rashwan and Sherif Mahdy Abdou. "Automatic Phonemes Segmentation for Quran Verses Using Kaldi Toolkit". International Journal of Computer Science and Mobile Computing 10, no. 12 (30 December 2021): 39–45. http://dx.doi.org/10.47760/ijcsmc.2021.v10i12.006.

Abstract:
In this paper, an automatic phoneme-level segmentation system was built with the Kaldi toolkit for a Quran verses data set comprising an 80-hour speech corpus and its corresponding text corpus, with 1100 recorded Quran verses from 100 non-Arab reciters. Starting with the extraction of Mel Frequency Cepstral Coefficients (MFCCs), the Language Model (LM) building and Acoustic Model (AM) training phases proceeded up to the Deep Neural Network (DNN) level using 770 waves (70 reciters). The system was tested using 220 waves (20 reciters), and the development data set consisted of 280 waves (10 reciters). Automatic and manual segmentation were compared; the result was 99% for the test set and 99% for the development set with Time Delay Neural Network (TDNN) based acoustic modelling.
41

Kim, Pan Koo. "An automatic indexing of compound words based on mutual information for Korean text retrieval". Library and Information Science 34 (31 March 1997): 29–38. http://dx.doi.org/10.46895/lis.34.29.
42

Brandt, Cynthia, and Prakash Nadkarni. "Web-based UMLS concept retrieval by automatic text scanning: a comparison of two methods". Computer Methods and Programs in Biomedicine 64, no. 1 (January 2001): 37–43. http://dx.doi.org/10.1016/s0169-2607(00)00092-4.
43

Küçük, Dilek, and Adnan Yazıcı. "A semi-automatic text-based semantic video annotation system for Turkish facilitating multilingual retrieval". Expert Systems with Applications 40, no. 9 (July 2013): 3398–411. http://dx.doi.org/10.1016/j.eswa.2012.12.048.
44

Mao, Wenlei, and Wesley W. Chu. "The phrase-based vector space model for automatic retrieval of free-text medical documents". Data & Knowledge Engineering 61, no. 1 (April 2007): 76–92. http://dx.doi.org/10.1016/j.datak.2006.02.008.
45

Mosbah, Mawloud. "Improving the Results of Google Scholar Engine through Automatic Query Expansion Mechanism and Pseudo Re-ranking using MVRA". Journal of information and organizational sciences 42, no. 2 (10 December 2018): 219–29. http://dx.doi.org/10.31341/jios.42.2.5.

Abstract:
In this paper, we address enhancing the Google Scholar engine, in the context of text retrieval, through two mechanisms related to the interrogation protocol: query expansion and query reformulation. Both schemes are applied with re-ranking of the results using a pseudo relevance feedback algorithm that we previously proposed in the context of Content-Based Image Retrieval (CBIR), namely the Majority Voting Re-ranking Algorithm (MVRA). The experiments, conducted using ten queries, reveal very promising results in terms of effectiveness.
46

Ting, Yu. "Multiple Features Image Fusion Based on Visual Dictionary". Advanced Materials Research 889-890 (February 2014): 1111–14. http://dx.doi.org/10.4028/www.scientific.net/amr.889-890.1111.

Abstract:
Finding related pictures on the Internet without annotation is laborious, so automatic image annotation is extremely important in image retrieval. Traditional methods simply translated the visual features of an image into keywords, ignoring the similarity problem between low-level visual features and high-level semantics. This is the image "gap" problem, and it kept annotation quality very low. This paper puts forward a hybrid, classification-based technique for automatically tagging web image content: first, the visual features of an image are mapped to one or more rough images; then the text of the web page is preprocessed; finally, a semantic similarity processing module selects similar keywords as the image annotation. The method thus combines image and text for automatic annotation and achieves high annotation precision.
47

Zheng, Min, Bo Liu and Le Sun. "LawRec: Automatic Recommendation of Legal Provisions Based on Legal Text Analysis". Computational Intelligence and Neuroscience 2022 (14 September 2022): 1–7. http://dx.doi.org/10.1155/2022/6313161.

Abstract:
Smart court technologies make full use of modern science, for example artificial intelligence, the Internet of Things, and cloud computing, to promote the modernization of the trial system and trial capabilities. These technologies can improve the efficiency of case handling and provide convenience for the public. Article recommendation is an important part of intelligent trial. For ordinary people without a legal background, a traditional information retrieval system that searches laws and regulations by keyword is not applicable, because they cannot extract professional legal vocabulary from complex case processes. This paper proposes a law recommendation framework, called LawRec, based on Bidirectional Encoder Representation from Transformers (BERT) and Skip-Recurrent Neural Network (Skip-RNN) models. It integrates the knowledge of legal provisions with the case description and uses the BERT model to learn the case description text and the legal knowledge, respectively. Finally, laws and regulations can be recommended for cases. Experimental results show that the proposed LawRec achieves better performance than state-of-the-art methods.
48

Yang, Chang, Hao Zhang, Xue Guang Zhou and Yang Wang. "A Novel Approach on Automatic Building of Word Correlation Net Based on Statistic". Advanced Materials Research 734-737 (August 2013): 2887–92. http://dx.doi.org/10.4028/www.scientific.net/amr.734-737.2887.

Abstract:
Semantic knowledge bases are important for increasing the depth of NLP. Some comparatively mature semantic knowledge bases, such as WordNet, HowNet and Thesaurus, were developed manually and pose many difficulties in practical application. To capture knowledge about how Chinese words relate to one another more automatically and demonstrably, this paper presents the concept of word correlation and a statistics-based method for calculating it. A correlation net was then built from Chinese words with strong domain characteristics. To handle the difficulty of processing the huge amount of data, a hard-disk storage method based on array segmentation was designed. The semantic knowledge gained in the experiment has the advantage of being empirical. Its accuracy and generalization are strong, so it can play an important role in many fields such as text categorization, text retrieval, and text filtering.
49

Wang, Qicai, Peiyu Liu, Zhenfang Zhu, Hongxia Yin, Qiuyue Zhang and Lindong Zhang. "A Text Abstraction Summary Model Based on BERT Word Embedding and Reinforcement Learning". Applied Sciences 9, no. 21 (4 November 2019): 4701. http://dx.doi.org/10.3390/app9214701.

Abstract:
As a core task of natural language processing and information retrieval, automatic text summarization is widely applied in many fields. There are currently two approaches to the text summarization task: abstractive and extractive. On this basis, we propose a novel hybrid extractive-abstractive model that combines BERT (Bidirectional Encoder Representations from Transformers) word embedding with reinforcement learning. Firstly, we convert the human-written abstractive summaries to ground truth labels. Secondly, we use BERT word embedding as the text representation and pre-train two sub-models respectively. Finally, the extraction network and the abstraction network are bridged by reinforcement learning. To verify the performance of the model, we compare it with currently popular automatic text summarization models on the CNN/Daily Mail dataset, using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics as the evaluation method. Extensive experimental results show that the accuracy of the model is clearly improved.
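The ROUGE metrics mentioned above compare n-gram overlap between a generated summary and a reference summary. The sketch below computes only the unigram variant (ROUGE-1) with whitespace tokenization; the function name `rouge_1` and the example sentences are assumptions, and published evaluations normally rely on an established ROUGE package with stemming and several variants.

```python
from collections import Counter


def rouge_1(reference, candidate):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps, per word, the smaller of the two counts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Five of the six reference words also appear in the candidate.
print(rouge_1("the cat sat on the mat", "the cat lay on the mat"))
```

Recall is the component the metric's name emphasizes: it measures how much of the reference summary the generated summary recovers.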
50

Azmi, Aqil M., and Reham S. Almajed. "A survey of automatic Arabic diacritization techniques". Natural Language Engineering 21, no. 3 (10 October 2013): 477–95. http://dx.doi.org/10.1017/s1351324913000284.

Abstract:
In Modern Standard Arabic, texts are typically written without diacritical markings. The diacritics are important for clarifying the sense and meaning of words. The lack of these markings may lead to ambiguity even for native speakers. Native speakers often successfully disambiguate the meaning through context; however, many Arabic applications, such as machine translation, text-to-speech, and information retrieval, are vulnerable to the lack of diacritics. The process of automatically restoring diacritical marks is called diacritization or diacritic restoration. In this paper we discuss the properties of the Arabic language and the issues related to the lack of diacritical marking. This is followed by a survey of recent algorithms developed to solve the diacritization problem. We also look at future trends for researchers working in this area.
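To make the problem concrete, the sketch below performs the inverse of diacritization: it strips the short-vowel and related marks in the Unicode range U+064B to U+0652 from a fully diacritized word, which is a common way to obtain undiacritized input paired with a diacritized target. The function name and the restriction to that code-point range are assumptions for illustration; the surveyed algorithms themselves are not reproduced here.

```python
def strip_diacritics(text):
    """Remove Arabic diacritics (fathatan through sukun, U+064B..U+0652)."""
    return "".join(ch for ch in text if not ("\u064B" <= ch <= "\u0652"))


# The word kataba ("he wrote") with a fatha (U+064E) after each consonant.
diacritized = "\u0643\u064E\u062A\u064E\u0628\u064E"
bare = strip_diacritics(diacritized)
print(len(diacritized), len(bare))
```

The bare three-letter form is shared by several differently vocalized words, which is the ambiguity that diacritic restoration has to resolve.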