Journal articles on the topic 'Natural Language Information Analysis Method (NIAM)'

Consult the top 50 journal articles for your research on the topic 'Natural Language Information Analysis Method (NIAM).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Din, Roshidi, Rosmadi Bakar, Raihan Sabirah Sabri, Mohamad Yusof Darus, and Shamsul Jamel Elias. "Performance analysis on secured data method in natural language steganography." Bulletin of Electrical Engineering and Informatics 8, no. 1 (March 1, 2019): 298–304. http://dx.doi.org/10.11591/eei.v8i1.1441.

Abstract:
The rapid growth of information exchange that drove the expansion of the internet during the last decade has motivated research in this field, and steganography approaches have recently received unexpected attention. The aim of this paper is therefore to review different performance metrics, covering the decoding, decrypting, and extracting metrics. Data decoding interprets the received hidden message into a code word. Data encryption is the best way to provide secure communication, and decrypting takes an encrypted text and converts it back into the original text. Data extracting is the reverse of the data embedding process. The success of an evaluation is mainly determined by its performance metrics, and researchers aim to improve these metric characteristics. The objective of this paper is to present a review of studies of steganography in natural language based on the criteria of performance analysis. The findings clarify the preferred performance metric aspects in use. This review is intended to help future research in evaluating the performance analysis of natural language steganography in general and of the proposed secured data methods for it in particular.
2

Chen, Jinyan, Susanne Becken, and Bela Stantic. "Lexicon based Chinese language sentiment analysis method." Computer Science and Information Systems 16, no. 2 (2019): 639–55. http://dx.doi.org/10.2298/csis181015013c.

Abstract:
The growing number of social media users and the vast volume of posts could provide valuable information about sentiment toward different locations, services, and people. Recent advances in Big Data analytics and natural language processing often make it possible to calculate the sentiment in these posts automatically. Sentiment analysis is a challenging and computationally demanding task due to the volume of data, misspellings, emoticons, and abbreviations. While significant work has been directed toward the sentiment analysis of English text, the literature has paid limited attention to sentiment analysis of the Chinese language. In this work we propose a method to identify the sentiment in Chinese social media posts; to test it, we rely on posts about the Great Barrier Reef (GBR) sent by users of the most popular Chinese social media platform, Sina Weibo. We elaborate the process of capturing Weibo posts, describe the creation of a lexicon, and develop and explain an algorithm for sentiment calculation. In a case study of sentiment toward different GBR destinations, we demonstrate that the proposed method is effective in obtaining the information and is suitable for monitoring visitors' opinions.
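A lexicon-driven scorer of this kind is straightforward to prototype. The sketch below is our illustration, not the authors' implementation: the lexicon entries, weights, and greedy longest-match segmenter are invented assumptions standing in for the paper's Weibo-derived lexicon and algorithm.

```python
# Minimal lexicon-based sentiment scoring for Chinese text (illustrative).
LEXICON = {
    "美丽": 2.0,   # "beautiful" (invented weight)
    "喜欢": 1.5,   # "to like"
    "失望": -2.0,  # "disappointed"
}

def segment(text, lexicon, max_len=4):
    """Greedy longest-match segmentation against the lexicon keys."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens

def sentiment_score(text, lexicon=LEXICON):
    """Sum the weights of matched lexicon entries; the sign gives polarity."""
    return sum(lexicon.get(tok, 0.0) for tok in segment(text, lexicon))

print(sentiment_score("大堡礁很美丽我喜欢"))  # > 0: positive post
print(sentiment_score("这次旅行让人失望"))    # < 0: negative post
```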
3

Chiu, Ivey, and L. H. Shu. "Biomimetic design through natural language analysis to facilitate cross-domain information retrieval." Artificial Intelligence for Engineering Design, Analysis and Manufacturing 21, no. 1 (January 2007): 45–59. http://dx.doi.org/10.1017/s0890060407070138.

Abstract:
Biomimetic, or biologically inspired, design uses analogous biological phenomena to develop solutions for engineering problems. Several instances of biomimetic design result from personal observations of biological phenomena. However, many engineers' knowledge of biology may be limited, thus reducing the potential of biologically inspired solutions. Our approach to biomimetic design takes advantage of the large amount of biological knowledge already available in books, journals, and so forth, by performing keyword searches on these existing natural-language sources. Because of the ambiguity and imprecision of natural language, challenges inherent to natural language processing were encountered. One challenge of retrieving relevant cross-domain information involves differences in domain vocabularies, or lexicons. A keyword meaningful to biologists may not occur to engineers. For an example problem that involved cleaning, that is, removing dirt, a biochemist suggested the keyword “defend.” Defend is not an obvious keyword to most engineers for this problem, nor are the words defend and “clean/remove” directly related within lexical references. However, previous work showed that biological phenomena retrieved by the keyword defend provided useful stimuli and produced successful concepts for the clean/remove problem. In this paper, we describe a method to systematically bridge the disparate biology and engineering domains using natural language analysis. For the clean/remove example, we were able to algorithmically generate several biologically meaningful keywords, including defend, that are not obviously related to the engineering problem. We developed a method to organize and rank the set of biologically meaningful keywords identified, and confirmed that we could achieve similar results for two other examples in encapsulation and microassembly. Although we specifically address cross-domain information retrieval from biology, the bridging process presented in this paper is not limited to biology, and can be used for any other domain given the availability of appropriate domain-specific knowledge sources and references.
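The keyword-bridging idea can be approximated with a general-purpose lexical reference. The sketch below is only an illustration of the expansion step, assuming NLTK's WordNet interface (it requires the `wordnet` corpus download); the paper's algorithm for generating and ranking biologically meaningful keywords is considerably richer.

```python
# Illustrative keyword expansion via WordNet synonyms and hypernyms.
# Requires: pip install nltk, then nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def expand_keywords(seed, pos=wn.VERB):
    """Collect synonym and hypernym lemmas as candidate search keywords."""
    candidates = set()
    for synset in wn.synsets(seed, pos=pos):
        for lemma in synset.lemmas():
            candidates.add(lemma.name().replace("_", " "))
        for hypernym in synset.hypernyms():
            for lemma in hypernym.lemmas():
                candidates.add(lemma.name().replace("_", " "))
    return sorted(candidates)

# Expanding the engineering keyword "clean" suggests related verbs that
# may retrieve cross-domain (e.g., biological) texts.
print(expand_keywords("clean"))
```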
4

Akita, Chie, Motohiro Mase, and Yasuhiko Kitamura. "Natural Language Questions and Answers for RDF Information Resources." Journal of Advanced Computational Intelligence and Intelligent Informatics 14, no. 4 (May 20, 2010): 384–89. http://dx.doi.org/10.20965/jaciii.2010.p0384.

Abstract:
We propose a Questions and Answers (Q&A) method for responding to natural language questions about an RDF information resource. When a natural language question and an RDF graph are given, keywords are extracted from the question using morphological analysis and converted to key elements by referencing lexica that describe correspondence relationships between keywords and elements. A question subgraph containing all key elements corresponding to keywords in the question is extracted from the RDF graph, and this subgraph is searched for an answer. We evaluate performance using an RDF information resource that describes our laboratory.
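On a toy graph the pipeline looks like the sketch below; the triples, lexicon, and namespace prefixes are invented for illustration, and the real method uses morphological analysis rather than whitespace splitting.

```python
# Keywords -> key elements -> question subgraph, on an invented toy graph.
TRIPLES = [
    ("lab:Kitamura", "rdf:type", "lab:Professor"),
    ("lab:Kitamura", "lab:leads", "lab:AgentGroup"),
    ("lab:Akita", "lab:memberOf", "lab:AgentGroup"),
]

# Lexicon giving keyword -> element correspondences (assumed).
LEXICON = {"professor": "lab:Professor", "group": "lab:AgentGroup"}

def question_subgraph(question):
    """Collect triples touching any key element mentioned in the question."""
    words = question.lower().rstrip("?").split()
    keys = {LEXICON[w] for w in words if w in LEXICON}
    return [t for t in TRIPLES if keys & set(t)]

for triple in question_subgraph("Who is the professor of the group?"):
    print(triple)
```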
5

Litvin, A. A., V. Yu Velychko, and V. V. Kaverynskyi. "Method of information obtaining from ontology on the basis of a natural language phrase analysis." PROBLEMS IN PROGRAMMING, no. 2-3 (September 2020): 322–30. http://dx.doi.org/10.15407/pp2020.02-03.322.

Abstract:
A method for analyzing phrases in inflective natural languages (Ukrainian and Russian) has been developed. The method outlines the main ideas expressed in a text and the groups of words by which they are stated. The semantic trees of propositions formed in this way, each of which expresses one specific idea, are a convenient source material for constructing queries to the ontology in the SPARQL language. The analysis algorithm is based on the following sequence of basic steps: word tokenization, determining marker words and phrases, identifying the type of proposition present, identifying noun groups, building a syntactic graph of the sentence, building semantic trees of propositions based on the identified proposition types, and substituting parameters from the semantic trees of propositions into the corresponding SPARQL query templates. The choice of an appropriate template depends on the type of proposition expressed by a given semantic tree. The sets of concepts received as an answer are tied as corresponding answers to the previously defined semantic tree of the proposition. If no information is received from the ontology, the noun groups are reduced to express more general concepts and queries are built from them; this yields some answer, although a less precise one than when the full noun group is used. The use of SPARQL query templates requires an a priori known ontology structure, which is also proposed in this paper. Such a system is applicable for dialogue with chat-bots or for automatically answering questions about a text.
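The template step at the end of the pipeline can be sketched as follows; this is a minimal illustration with a hypothetical proposition type and ontology vocabulary, not the ontology structure the paper itself proposes.

```python
# Choose a SPARQL template by proposition type and fill in the noun group.
# The proposition type, template text, and ontology properties used here
# (rdfs:label, skos:definition) are assumptions for illustration.

TEMPLATES = {
    "what-is": (
        "SELECT ?definition WHERE {{\n"
        '  ?concept rdfs:label "{noun_group}"@uk .\n'
        "  ?concept skos:definition ?definition .\n"
        "}}"
    ),
}

def build_query(proposition_type, noun_group):
    return TEMPLATES[proposition_type].format(noun_group=noun_group)

def relax(noun_group):
    """Fallback from the paper: shorten the noun group so a retry queries
    a more general concept when the full group returns no answer."""
    words = noun_group.split()
    return " ".join(words[1:]) if len(words) > 1 else noun_group

print(build_query("what-is", "напружений стан"))
print(relax("напружений стан"))  # retry with the more general "стан"
```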
6

Korycinski, C., and Alan F. Newell. "Natural-language processing and automatic indexing." The Indexer: The International Journal of Indexing 17, no. 1 (April 1, 1990): 21–29. http://dx.doi.org/10.3828/indexer.1990.17.1.8.

Abstract:
The task of producing satisfactory indexes by automatic means has been tackled on two fronts: by statistical analysis of text and by attempting content analysis of the text in much the same way as a human indexer does. Though statistical techniques have a lot to offer for free-text database systems, neither method has had much success with back-of-the-book indexing. This review examines some problems associated with the application of natural-language processing techniques to book texts.
7

Rybinski, Maciej, Xiang Dai, Sonit Singh, Sarvnaz Karimi, and Anthony Nguyen. "Extracting Family History Information From Electronic Health Records: Natural Language Processing Analysis." JMIR Medical Informatics 9, no. 4 (April 30, 2021): e24020. http://dx.doi.org/10.2196/24020.

Abstract:
Background The prognosis, diagnosis, and treatment of many genetic disorders and familial diseases significantly improve if the family history (FH) of a patient is known. Such information is often written in the free text of clinical notes. Objective The aim of this study is to develop automated methods that enable access to FH data through natural language processing. Methods We performed information extraction by using transformers to extract disease mentions from notes. We also experimented with rule-based methods for extracting family member (FM) information from text and coreference resolution techniques. We evaluated different transfer learning strategies to improve the annotation of diseases. We provided a thorough error analysis of the contributing factors that affect such information extraction systems. Results Our experiments showed that the combination of domain-adaptive pretraining and intermediate-task pretraining achieved an F1 score of 81.63% for the extraction of diseases and FMs from notes when it was tested on a public shared task data set from the National Natural Language Processing Clinical Challenges (N2C2), providing a statistically significant improvement over the baseline (P<.001). In comparison, in the 2019 N2C2/Open Health Natural Language Processing Shared Task, the median F1 score of all 17 participating teams was 76.59%. Conclusions Our approach, which leverages a state-of-the-art named entity recognition model for disease mention detection coupled with a hybrid method for FM mention detection, achieved an effectiveness that was close to that of the top 3 systems participating in the 2019 N2C2 FH extraction challenge, with only the top system convincingly outperforming our approach in terms of precision.
8

Joby, P. P. "Expedient Information Retrieval System for Web Pages Using the Natural Language Modeling." Journal of Artificial Intelligence and Capsule Networks 2, no. 2 (June 1, 2020): 100–110. http://dx.doi.org/10.36548/jaicn.2020.2.003.

Abstract:
Retrieving information from the huge volumes of data produced by day-to-day developments in technology has become popular, as it assists in searching for valuable information in structured, unstructured, or semi-structured data sets such as text, databases, multimedia, documents, and the internet. Information retrieval is performed using one of several models, from the simple Boolean retrieval model to frameworks such as probabilistic, vector space, and natural language models. This paper emphasizes natural-language-model-based information retrieval to recover meaningful insights from enormous amounts of data. The proposed method uses latent semantic analysis to retrieve significant information from the questions raised by the user or from bulk documents, utilizing the semantic factors occurring in the data set to identify useful insights. The experimental analysis of the proposed method is carried out with standard test collections such as TIME, LISA, CACM, and NPL, and the results obtained demonstrate the superiority of the proposed method in terms of precision, recall, and F-score.
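As a rough sketch of latent semantic analysis for retrieval (an invented toy corpus, not the paper's system or datasets), documents and a query can be projected into a low-rank TF-IDF space and ranked by cosine similarity using scikit-learn:

```python
# LSA retrieval sketch: TF-IDF -> truncated SVD -> cosine ranking.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "information retrieval from web pages",
    "semantic analysis of natural language",
    "boolean and vector space retrieval models",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)

svd = TruncatedSVD(n_components=2, random_state=0)  # latent semantic space
doc_vectors = svd.fit_transform(tfidf)

query = svd.transform(vectorizer.transform(["retrieve web information"]))
scores = cosine_similarity(query, doc_vectors)[0]
print(scores.argsort()[::-1])  # document indices, best match first
```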
9

Chen, Xieling, Ruoyao Ding, Kai Xu, Shan Wang, Tianyong Hao, and Yi Zhou. "A Bibliometric Review of Natural Language Processing Empowered Mobile Computing." Wireless Communications and Mobile Computing 2018 (June 28, 2018): 1–21. http://dx.doi.org/10.1155/2018/1827074.

Abstract:
Natural Language Processing (NLP) empowered mobile computing is the use of NLP techniques in the context of mobile environments. Research in this field has drawn much attention given the continually increasing number of publications over the last five years. This study presents the status and development trends of the research field through an objective, systematic, and comprehensive review of relevant publications available from the Web of Science. Analysis techniques including descriptive statistics, geographic visualization, social network analysis, latent Dirichlet allocation, and affinity propagation clustering are used. We quantitatively analyze the publications in terms of statistical characteristics, geographical distribution, cooperation relationships, and topic discovery and distribution. This systematic analysis illustrates the evolution of publications over time and identifies current research interests and potential directions for future research. Our work can assist researchers in keeping abreast of the research status and can help in monitoring new scientific and technological developments in the field.
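One of the techniques named, latent Dirichlet allocation, fits in a few lines; the sketch below runs on an invented toy corpus rather than Web of Science records.

```python
# Topic discovery with LDA on a toy corpus (illustrative only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [
    "speech recognition on mobile devices",
    "mobile text entry and word prediction",
    "sentiment analysis of social media posts",
    "opinion mining from mobile app reviews",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)
print(doc_topics.round(2))  # per-document topic mixtures
```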
10

Jamadi Khiabani, Parisa, Mohammad Ehsan Basiri, and Hamid Rastegari. "An improved evidence-based aggregation method for sentiment analysis." Journal of Information Science 46, no. 3 (March 18, 2019): 340–60. http://dx.doi.org/10.1177/0165551519837187.

Abstract:
Sentiment analysis is a natural language processing task used to find reviews expressed in online texts and classify them into different classes. One of the most important factors affecting the efficiency of sentiment analysis methods is the aggregation algorithm used to combine scores. Recently, the Dempster–Shafer algorithm has been used for score aggregation. This algorithm has higher precision than common methods such as average, weighted average, product, and voting, but its problem is that a dominant high or low score is always selected by the algorithm as the overall score. In the current research, a new method is proposed for score aggregation that employs both the most probable and the second most probable classes to predict the final score. The proposed approach considers every review as a set of sentences, each with its own sentiment orientation and score, and computes the probability of every sentence belonging to different classes on a five-star scale using a pure lexicon-based system. These probabilities are then used for document-level sentiment detection. To this end, a two-point structure is used to improve the Dempster–Shafer aggregation algorithm. The proposed method is applied to review datasets from TripAdvisor and CitySearch that have been used in previous studies. The obtained results show that, in comparison with the original Dempster–Shafer aggregation method, the precision of the proposed method is 23% and 27% higher on the two datasets, respectively.
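Dempster's rule of combination, which the improved method builds on, is compact to state for singleton sentiment classes. The sketch below is a generic illustration of the rule with invented masses, not the authors' two-point variant.

```python
# Dempster's rule for two mass functions over singleton sentiment classes.

def combine(m1, m2):
    """Fuse two bodies of evidence; masses on each side must sum to 1."""
    classes = set(m1) | set(m2)
    # Conflict K: mass assigned to incompatible (different) singletons.
    k = sum(m1.get(a, 0.0) * m2.get(b, 0.0)
            for a in classes for b in classes if a != b)
    return {c: m1.get(c, 0.0) * m2.get(c, 0.0) / (1.0 - k) for c in classes}

sentence1 = {"pos": 0.7, "neg": 0.2, "neu": 0.1}
sentence2 = {"pos": 0.5, "neg": 0.4, "neu": 0.1}
print(combine(sentence1, sentence2))  # pos mass dominates after fusion
```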
11

Shen, Qi, and Meng Zhang. "A Semantic Retrieval Method Based on Ontology." Advanced Materials Research 989-994 (July 2014): 2179–83. http://dx.doi.org/10.4028/www.scientific.net/amr.989-994.2179.

Abstract:
Semantic retrieval methods stand at the crossroads between Natural Language Processing and Machine Intelligence. This paper analyzes semantic search methods, investigates a concept similarity algorithm, and discusses the influence of weight factors on concept similarity. On this basis, it proposes a new ontology-based semantic search method and applies it to tourism information retrieval, making the tourism information retrieval service more intelligent.
12

Ding, Yunnian, Yangli Jia, and Zhenling Zhang. "A conceptual similarity and correlation discrimination method based on HowNet." MATEC Web of Conferences 309 (2020): 03020. http://dx.doi.org/10.1051/matecconf/202030903020.

Abstract:
The analysis of similarity and correlation between word concepts has a wide range of applications in natural language processing and important research significance in information retrieval, text classification, data mining, and other fields. This paper analyzes and summarizes sememe-relationship information in the word definitions of HowNet and proposes a method to distinguish the similarity and correlation of words. First, a combination of part of speech and sememes is used to distinguish similarity from correlation between word concepts. Second, the similarity and correlation calculation results between vocabulary concepts are used to further optimize the judgment. Finally, the distinction between similarity and correlation of vocabulary concepts is realized. The experimental results show that the method reduces the complexity of the algorithm and greatly improves efficiency. The resulting semantic similarity and correlation judgments are more in line with human intuition and improve the accuracy of computer understanding of natural language, providing an important theoretical basis for the development of natural language processing.
13

Boutilier, Robert G., and Kyle Bahr. "A Natural Language Processing Approach to Social License Management." Sustainability 12, no. 20 (October 13, 2020): 8441. http://dx.doi.org/10.3390/su12208441.

Abstract:
Dealing with the social and political impacts of large complex projects requires monitoring and responding to concerns from an ever-evolving network of stakeholders. This paper describes the use of text analysis algorithms to identify stakeholders’ concerns across the project life cycle. The social license (SL) concept has been used to monitor the level of social acceptance of a project. That acceptance can be assessed from the texts produced by stakeholders on sources ranging from social media to personal interviews. The same texts also contain information on the substance of stakeholders’ concerns. Until recently, extracting that information necessitated manual coding by humans, which is a method that takes too long to be useful in time-sensitive projects. Using natural language processing algorithms, we designed a program that assesses the SL level and identifies stakeholders’ concerns in a few hours. To validate the program, we compared it to human coding of interview texts from a Bolivian mining project from 2009 to 2018. The program’s estimation of the annual average SL was significantly correlated with rating scale measures. The topics of concern identified by the program matched the most mentioned categories defined by human coders and identified the same temporal trends.
14

Shah, Parita Vishal, and Priya Swaminarayan. "Sentiment Analysis on Gujarati Text: A Survey." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 4075–82. http://dx.doi.org/10.1166/jctn.2020.9022.

Abstract:
The internet is a source of huge amounts of information generated by users on blogs, social websites, forums, and so on. In today's world, information available on the internet plays an important role in human life, and analyzing such huge amounts of information requires an automated classification method. The high usage of web and mobile technologies and the increasing volume of user-generated content in Gujarati on the web are the motivation behind this work. Sentiment analysis is the process of identifying a user's opinion in a section of text; these opinions help in making decisions, and web documents are now a new source of user opinions. Sentiment analysis is a natural language processing task that extracts information from various sources such as news, social networking sites, blogs, and forums and classifies it as positive, negative, or neutral on the basis of its polarity. A lot of research has been done on the English language, but it is also important to perform sentiment analysis on Gujarati, the 6th official language in India. This paper gives an overview of how sentiment analysis can be performed on the Gujarati language.
15

Chen, Kuan-Lin, and Meng-Han Tsai. "Conversation-Based Information Delivery Method for Facility Management." Sensors 21, no. 14 (July 13, 2021): 4771. http://dx.doi.org/10.3390/s21144771.

Abstract:
Facility management platforms are widely used in the facility maintenance phase of the building life cycle. However, a large amount of complex building information affects facility managers’ efficiency and user experience in retrieving specific information on the facility management platform. Therefore, this research aims to develop a conversation-based method to improve the efficiency and user experience of facility management information delivery. The proposed method contains four major modules: decision mechanism, equipment dataset, intent analysis, and knowledge base. A chatbot prototype was developed based on the proposed method. The prototype was then validated through a feasibility test and field test at the Shulin Arts Comprehensive Administration Building in Taiwan. The results showed that the proposed method changes the traditional information delivery between users and the facility management platform. By integrating natural language processing (NLP), building information modelling (BIM), and ontological techniques, the proposed method can increase the efficiency of FM information retrieval.
16

Ghabayen, Ayman S., and Basem H. Ahmed. "Polarity Analysis of Customer Reviews Based on Part-of-Speech Subcategory." Journal of Intelligent Systems 29, no. 1 (August 15, 2019): 1535–44. http://dx.doi.org/10.1515/jisys-2018-0356.

Abstract:
Nowadays, sentiment analysis is a method used to analyze the sentiment of feedback given by a user in an online document, such as a blog, comment, or review, and to classify it as negative, positive, or neutral. The classification process relies upon analysis of the polarity features of the natural language text given by users. Polarity analysis is an important subtask in sentiment analysis; however, detecting the correct polarity has been a major issue. Different researchers have utilized different polarity features, such as standard part-of-speech (POS) tags: adjectives, adverbs, verbs, and nouns. However, there has been a lack of research focusing on the subcategories of these tags. The aim of this research was to propose a method that better recognizes the polarity of natural language text by utilizing polarity features drawn from the standard POS categories and their subcategory combinations in order to capture the specific polarity of text. Several experiments were conducted to examine and compare the efficacy of the proposed method in terms of F-measure, recall, and precision using an Amazon dataset. The results showed that JJ + NN + VB + RB + VBP + RP, a POS subcategory combination, obtained better accuracy than the baseline approaches by 4.4% in terms of F-measure.
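The winning combination can be reproduced as a simple POS filter. Below is a minimal sketch using NLTK's off-the-shelf tagger (our illustration, not the authors' exact pipeline; it needs the `punkt` and `averaged_perceptron_tagger` downloads):

```python
# Keep only tokens whose POS tag is in the reported best combination.
# Requires: pip install nltk, plus nltk.download("punkt") and
# nltk.download("averaged_perceptron_tagger")
import nltk

KEEP = {"JJ", "NN", "VB", "RB", "VBP", "RP"}

def polarity_features(review):
    """Return the words whose tags fall in the JJ+NN+VB+RB+VBP+RP set."""
    tagged = nltk.pos_tag(nltk.word_tokenize(review))
    return [word for word, tag in tagged if tag in KEEP]

print(polarity_features("The battery works well but drains quickly"))
```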
17

Yogish, Deepa, T. N. Manjunath, and Ravindra S. Hegadi. "Analysis of Vector Space Method in Information Retrieval for Smart Answering System." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 4468–72. http://dx.doi.org/10.1166/jctn.2020.9099.

Abstract:
In the world of the internet, searching plays a vital role in retrieving relevant answers to user-specific queries. One of the most promising applications of natural language processing and information retrieval is the question answering system, which directly provides an accurate answer instead of a set of documents. The main objective of information retrieval is to retrieve relevant documents from the huge volumes of data on the internet using an appropriate model. Many models have been proposed for the retrieval process, such as the Boolean, vector space, and probabilistic methods. The vector space model (VSM) is the best method in information retrieval for document ranking, offering an efficient document representation that combines simplicity and clarity. The VSM adopts a similarity function to measure the match between documents and user intent and assigns scores from biggest to smallest. Documents and queries are assigned weights using the term frequency and inverse document frequency method. To retrieve the documents most relevant to the user's query term, the cosine similarity ranking function is applied to every document and user query. The documents with higher similarity scores are considered relevant to the query term and are ranked by these scores. This paper emphasizes different techniques of information retrieval; the vector space model offers a realistic compromise in IR processing, allowing a weighting scheme that ranks the set of documents in order of relevance to the user query.
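The TF-IDF weighting and cosine ranking described here fit in a short, self-contained sketch; the corpus and query below are invented for illustration.

```python
# Plain-Python TF-IDF weighting and cosine-similarity ranking (toy data).
import math
from collections import Counter

corpus = [
    "smart answering system for user queries".split(),
    "vector space model for document ranking".split(),
    "probabilistic retrieval of relevant documents".split(),
]

def df(term):
    """Number of documents containing the term."""
    return sum(term in doc for doc in corpus)

def tfidf_vector(tokens):
    """Term frequency times inverse document frequency, per term."""
    tf = Counter(tokens)
    return {t: tf[t] * math.log(len(corpus) / df(t))
            for t in tf if 0 < df(t) < len(corpus)}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

query = tfidf_vector("ranking documents for user queries".split())
ranked = sorted(corpus, key=lambda d: cosine(query, tfidf_vector(d)),
                reverse=True)
for doc in ranked:
    print(" ".join(doc))
```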
18

Selot, Smita, Neeta Tripathi, and A. S. Zadgaonkar. "Neural Network Model for Semantic Analysis of Sanskrit Text." International Journal of Natural Computing Research 7, no. 1 (January 2018): 1–14. http://dx.doi.org/10.4018/ijncr.2018010101.

Abstract:
Semantic analysis is the process of extracting the meaning of a sentence in a given language. From the perspective of computer processing, the challenge lies in making the computer understand the meaning of the given sentence. Understandability depends upon the grammar, the syntactic and semantic representation of the language, and the methods employed for extracting these parameters. Methods of semantic interpretation vary from language to language, as the grammatical structure and morphological representation of one language may differ from another. One ancient Indian language, Sanskrit, has its own unique way of embedding syntactic information within the words of relevance in a sentence. Sanskrit grammar, defined in 4,000 rules by Pāṇini, reveals the mechanism of adding suffixes to words according to their use in a sentence. This article presents a method of extracting meaningful information through suffixes and classifying words into defined semantic categories. The application of neural-network-based classification has improved the processing of the text.
19

Qi, Qingfu, Liyuan Lin, and Rui Zhang. "Feature Extraction Network with Attention Mechanism for Data Enhancement and Recombination Fusion for Multimodal Sentiment Analysis." Information 12, no. 9 (August 24, 2021): 342. http://dx.doi.org/10.3390/info12090342.

Abstract:
Multimodal sentiment analysis and emotion recognition represent a major research direction in natural language processing (NLP). With the rapid development of online media, people often express their emotions on a topic in the form of video, and the signals transmitted are multimodal, including language, visual, and audio signals. The traditional unimodal sentiment analysis method is therefore no longer applicable, and a fusion model of multimodal information is required to obtain sentiment understanding. In previous studies, scholars used the feature-vector cascade method when fusing multimodal data at each time step in the middle layer. This method places the information of each modality in the same position, does not distinguish strong from weak modal information among the multiple modalities, and does not attend to the embedding characteristics of multimodal signals across the time dimension. In response to these problems, this paper proposes a new method and model for processing multimodal signals that takes into account the delay and hysteresis characteristics of multimodal signals across the time dimension, with the aim of obtaining a fused multimodal feature representation for sentiment analysis. We evaluate our method on the multimodal sentiment analysis benchmark dataset CMU Multimodal Opinion Sentiment and Emotion Intensity Corpus (CMU-MOSEI), compare our proposed method with state-of-the-art models, and show excellent results.
20

Al-Moslmi, Tareq, Mohammed Albared, Adel Al-Shabi, Nazlia Omar, and Salwani Abdullah. "Arabic senti-lexicon: Constructing publicly available language resources for Arabic sentiment analysis." Journal of Information Science 44, no. 3 (February 1, 2017): 345–62. http://dx.doi.org/10.1177/0165551516683908.

Abstract:
Sentiment analysis is held to be one of the highly dynamic recent research fields in Natural Language Processing, facilitated by the quickly growing volume of Web opinion data. Most of the approaches in this field are focused on English due to the lack of sentiment resources in other languages such as the Arabic language and its large variety of dialects. In most sentiment analysis applications, good sentiment resources play a critical role. Based on that, in this article, several publicly available sentiment analysis resources for Arabic are introduced. This article introduces the Arabic senti-lexicon, a list of 3880 positive and negative synsets annotated with their part of speech, polarity scores, dialects synsets and inflected forms. This article also presents a Multi-domain Arabic Sentiment Corpus (MASC) with a size of 8860 positive and negative reviews from different domains. In this article, an in-depth study has been conducted on five types of feature sets for exploiting effective features and investigating their effect on performance of Arabic sentiment analysis. The aim is to assess the quality of the developed language resources and to integrate different feature sets and classification algorithms to synthesise a more accurate sentiment analysis method. The Arabic senti-lexicon is used for generating feature vectors. Five well-known machine learning algorithms: naïve Bayes, k-nearest neighbours, support vector machines (SVMs), logistic linear regression and neural network are employed as base-classifiers for each of the feature sets. A wide range of comparative experiments on standard Arabic data sets were conducted, discussion is presented and conclusions are drawn. The experimental results show that the Arabic senti-lexicon is a very useful resource for Arabic sentiment analysis. Moreover, results show that classifiers which are trained on feature vectors derived from the corpus using the Arabic sentiment lexicon are more accurate than classifiers trained using the raw corpus.
21

Mészáros, Tamás, and Margit Kiss. "Knowledge Acquisition from Critical Annotations." Information 9, no. 7 (July 20, 2018): 179. http://dx.doi.org/10.3390/info9070179.

Abstract:
Critical annotations are important knowledge sources when researching an author's oeuvre. They describe literary, historical, cultural, linguistic, and other kinds of information in natural language. Acquiring knowledge from these notes is a complex task due to the limited natural language understanding capability of computerized tools. The aim of the research was to extract knowledge from existing annotations and to develop new authoring methods that facilitate knowledge acquisition. After a structural and semantic analysis of critical annotations, the authors developed a software tool that transforms existing annotations into a structured form encoding referral and factual knowledge. The authors also propose a new method for authoring annotations based on controlled natural languages. This method ensures that annotations are semantically processable by computer programs while the authoring process remains simple for non-technical users.
22

Smith, Michael J., Nikhil Arora, Connor Stone, Stéphane Courteau, and James E. Geach. "Pix2Prof: fast extraction of sequential information from galaxy imagery via a deep natural language ‘captioning’ model." Monthly Notices of the Royal Astronomical Society 503, no. 1 (February 13, 2021): 96–105. http://dx.doi.org/10.1093/mnras/stab424.

Abstract:
We present 'Pix2Prof', a deep learning model that can eliminate any manual steps taken when measuring galaxy profiles. We argue that a galaxy profile of any sort is conceptually similar to a natural language image caption. This idea allows us to leverage image captioning methods from the field of natural language processing, and so we design Pix2Prof as a float-sequence 'captioning' model suitable for galaxy profile inference. We demonstrate the technique by approximating a galaxy surface brightness (SB) profile fitting method that contains several manual steps. Pix2Prof processes ∼1 image per second on an Intel Xeon E5-2650 v3 CPU, improving on the speed of the manual interactive method by more than two orders of magnitude. Crucially, Pix2Prof requires no manual interaction, and since galaxy profile estimation is an embarrassingly parallel problem, we can further increase the throughput by running many Pix2Prof instances simultaneously. In perspective, Pix2Prof would take under an hour to infer profiles for 10^5 galaxies on a single NVIDIA DGX-2 system; a single human expert would take approximately 2 yr to complete the same task. Automated methodology such as this will accelerate the analysis of the next generation of large-area sky surveys expected to yield hundreds of millions of targets. In such instances, all manual approaches, even those involving a large number of experts, will be impractical.
23

Shelke, Nilesh M., and Shrinivas P. Deshpande. "Exploiting Chi Square Method for Sentiment Analysis of Product Reviews." International Journal of Synthetic Emotions 9, no. 2 (July 2018): 76–93. http://dx.doi.org/10.4018/ijse.2018070105.

Abstract:
Sentiment analysis is an extension of data mining that employs natural language processing and information extraction to recognize people's opinions toward entities such as products, services, issues, organizations, individuals, events, and topics, and their attributes. It gives the summarized opinion of a writer or speaker and has received a lot of attention due to the increasing number of posts and tweets on social sites. The proposed system is meant to classify a given review text into the positive, negative, or neutral category. The primary objective of this article is to provide a method of exploiting permutation, combination, and chi-square values for the sentiment analysis of product reviews. The freely and publicly available dictionary SentiWordNet 3.0 is used for review classification. The proposed system is domain independent and context aware. Another objective of the proposed system is to identify the feature-specific intensity with which the reviewer has expressed an opinion. The effectiveness of the proposed system has been verified through performance metrics and compared with other research work.
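The chi-square value such a method relies on comes straight from a 2x2 term/class contingency table. Below is a minimal sketch of the calculation with invented counts (not the paper's data):

```python
# Chi-square score of a term's association with the positive class.

def chi_square(pos_with, pos_without, neg_with, neg_without):
    """Chi-square statistic of a 2x2 (class x term presence) table."""
    n = pos_with + pos_without + neg_with + neg_without
    observed = [pos_with, pos_without, neg_with, neg_without]
    row = [pos_with + pos_without, neg_with + neg_without]   # class totals
    col = [pos_with + neg_with, pos_without + neg_without]   # term totals
    expected = [row[0] * col[0] / n, row[0] * col[1] / n,
                row[1] * col[0] / n, row[1] * col[1] / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# "excellent" appears in 30 of 40 positive reviews, 5 of 40 negative ones:
print(round(chi_square(30, 10, 5, 35), 2))  # ~31.75: strong association
```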
24

Ozhiganova, Marina, Irina Dergacheva, and Anastasija Kalita. "Analysis of Linguistic Methods of Informational Impact on Human Consciousness." NBI Technologies, no. 2 (October 2019): 17–24. http://dx.doi.org/10.15688/nbit.jvolsu.2019.2.3.

Abstract:
Modern mass media provide the opportunity to use a variety of technologies for manipulating consciousness; it is not important by what way or method this manipulation takes place, but rather what information needs to be conveyed to the ideologists of the organization. It is fundamentally important that almost every technology of mind manipulation creates its own image of the enemy, which provokes the cruelty and aggression of terrorist organization members. The authors analyze various linguistic methods for detecting informational influence on human consciousness. The paper shows that the classical approach to determining the types of speech actions is very convenient. However, this approach almost completely misses the real semantics of the natural language phrases that carry out speech actions, which can be compensated for by content analysis and the use of databases in the information system.
25

Névéol, A., and P. Zweigenbaum. "Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare." Yearbook of Medical Informatics 24, no. 01 (August 2015): 194–98. http://dx.doi.org/10.15265/iy-2015-035.

Abstract:
Objective: To summarize recent research and present a selection of the best papers published in 2014 in the field of clinical Natural Language Processing (NLP). Method: A systematic review of the literature was performed by the two section editors of the IMIA Yearbook NLP section by searching bibliographic databases with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. A shortlist of candidate best papers was first selected by the section editors before being peer-reviewed by independent external reviewers. Results: The clinical NLP best paper selection shows that the field is tackling text analysis methods of increasing depth. The full review process highlighted five papers addressing foundational methods in clinical NLP using clinically relevant texts from online forums or encyclopedias, clinical texts from Electronic Health Records, and included studies specifically aiming at a practical clinical outcome. The increased access to clinical data that was made possible with the recent progress of de-identification paved the way for the scientific community to address complex NLP problems such as word sense disambiguation, negation, temporal analysis and specific information nugget extraction. These advances in turn allowed for efficient application of NLP to clinical problems such as cancer patient triage. Another line of research investigates online clinically relevant texts and brings interesting insight on communication strategies to convey health-related information. Conclusions: The field of clinical NLP is thriving through the contributions of both NLP researchers and healthcare professionals interested in applying NLP techniques for concrete healthcare purposes. Clinical NLP is becoming mature for practical applications with a significant clinical impact.
26

Waheeb, Samer Abdulateef, Naseer Ahmed Khan, Bolin Chen, and Xuequn Shang. "Machine Learning Based Sentiment Text Classification for Evaluating Treatment Quality of Discharge Summary." Information 11, no. 5 (May 23, 2020): 281. http://dx.doi.org/10.3390/info11050281.

Abstract:
Patients' discharge summaries (documents) are health sensors that are used for measuring the quality of treatment in medical centers. However, extracting information automatically from discharge summaries written in unstructured natural language is considered challenging. These documents include various aspects of patient information that could be used to assess treatment quality and improve medical-related decisions. One of the significant techniques in the literature for discharge summary classification is feature extraction on text data, drawn from the domain of natural language processing. We propose a novel sentiment analysis method for discharge summary classification that relies on vector space models, statistical methods, association rules, and an extreme learning machine autoencoder (ELM-AE). Our novel hybrid model is based on statistical methods that build the lexicon in a domain related to health and medical records. Meanwhile, our method examines treatment quality based on an idea inspired by sentiment analysis. Experiments prove that our proposed method obtains a higher F1 value of 0.89 with good TPR (True Positive Rate) and FPR (False Positive Rate) values compared with various well-known state-of-the-art methods across different sizes of training and testing datasets. The results also prove that our method provides a flexible and effective technique to examine treatment quality based on positive, negative, and neutral terms at the sentence level in each discharge summary.
27

Emadi, Mehdi, and Maseud Rahgozar. "Twitter sentiment analysis using fuzzy integral classifier fusion." Journal of Information Science 46, no. 2 (February 21, 2019): 226–42. http://dx.doi.org/10.1177/0165551519828627.

Abstract:
A thorough analysis of people’s sentiment about a business, an event or an individual is necessary for business development, event analysis and popularity assessment. Social networks are rich sources of obtaining user opinions about people, events and products. Sentiment analysis conducted using multiple user comments and messages on microblogs is an interesting field of data mining and natural language processing (NLP). Different techniques and algorithms have recently been developed for conducting sentiment analysis on Twitter. Different proposed classification and pure NLP-based methods have different behaviours in predicting sentiment orientation. In this study, we combined the results of the classic classifiers and NLP-based methods to propose a new approach for Twitter sentiment analysis. The proposed method uses a fuzzy measure for determining the importance of each classifier to make the final decision. Fuzzy measures are used with the Choquet fuzzy integral for fusing the classifier outputs in order to generate the final label. Our experiments with different Twitter sentiment datasets show that fuzzy integral-based classifier fusion improves the average accuracy of sentiment classification.
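At the core of this approach is the discrete Choquet integral over classifier outputs. The sketch below is illustrative only: the classifier names, scores, and hand-picked fuzzy measure are assumptions, and the paper determines classifier importance differently.

```python
# Discrete Choquet integral fusing three classifiers' positive-class
# scores. The measure g assigns importance to every classifier coalition
# and must be monotone (a set never matters less than its subsets).

def choquet(scores, g):
    """scores: {classifier: score in [0, 1]}; g: measure over frozensets."""
    items = sorted(scores.items(), key=lambda kv: kv[1])  # ascending scores
    names = [name for name, _ in items]
    total, prev = 0.0, 0.0
    for i, (name, x) in enumerate(items):
        coalition = frozenset(names[i:])  # classifiers scoring at least x
        total += (x - prev) * g[coalition]
        prev = x
    return total

g = {
    frozenset({"svm", "nb", "lexicon"}): 1.0,
    frozenset({"svm", "nb"}): 0.8,
    frozenset({"svm", "lexicon"}): 0.7,
    frozenset({"nb", "lexicon"}): 0.5,
    frozenset({"svm"}): 0.5,
    frozenset({"nb"}): 0.3,
    frozenset({"lexicon"}): 0.3,
}

print(choquet({"svm": 0.9, "nb": 0.6, "lexicon": 0.7}, g))  # 0.77
```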
28

Gridach, Mourad, and Noureddine Chenfour. "An XML Approach of Coding a Morphological Database for Arabic Language." Advances in Human-Computer Interaction 2011 (2011): 1–15. http://dx.doi.org/10.1155/2011/629305.

Abstract:
We present an XML approach for the production of a morphological database for the Arabic language that will be used in morphological analysis for Modern Standard Arabic (MSA). Optimizing the production, maintenance, and extension of a morphological database is one of the crucial aspects impacting natural language processing (NLP). For Arabic, producing a morphological database is not an easy task, because the language has particularities such as agglutination and a great deal of morphological ambiguity. The method presented can be exploited by NLP applications such as syntactic analysis, semantic analysis, information retrieval, and orthographical correction.
29

Darusalam, Darusalam, and Helen Ashman. "Profiling and Identifying Individual Users by Their Command Line Usage and Writing Style." Knowledge Engineering and Data Science 1, no. 2 (August 23, 2018): 55. http://dx.doi.org/10.17977/um018v1i22018p55-63.

Abstract:
Profiling and identifying individual users is an approach to intrusion detection in a computer system. User profiles are important in many applications since they record highly user-specific information: profiles are built to record information about users or for users to share experiences with each other. This research extends previous work on re-authenticating users with their user profiles and focuses on the potential to add psychometric user characteristics to the user model so as to detect unauthorized users who may be masquerading as a genuine user. Five participants were involved in the investigation of formal-language user identification. Additionally, we analyze the natural language of two famous writers, Jane Austen and William Shakespeare, in their written works to determine whether the same principles apply to natural language use. This research used the n-gram analysis method to characterize a user's style, which can potentially provide accurate user identification. As a result, n-gram analysis of a user's typed inputs offers another method for intrusion detection, as it may be able to both positively and negatively identify users. The contribution of this research is to assess the use of a user's writing style in both formal and natural language as a user-profile characteristic that could enable intrusion detection where intruders masquerade as real users.
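The n-gram profiling step can be sketched directly: build a normalized character-trigram profile per text and compare profiles with cosine similarity. The command-line and SQL snippets below are invented stand-ins for real user input.

```python
# Character-trigram profiles compared with cosine similarity (toy inputs).
import math
from collections import Counter

def profile(text, n=3):
    """Normalized character n-gram frequency profile of a text."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    norm = math.sqrt(sum(c * c for c in grams.values()))
    return {g: c / norm for g, c in grams.items()}

def similarity(p, q):
    """Cosine similarity of two profiles (both already normalized)."""
    return sum(w * q.get(g, 0.0) for g, w in p.items())

known = profile("cd /var/log && grep -i error syslog | tail -n 50")
similar = profile("cd /var/log; grep -i warn syslog | tail -n 20")
other = profile("SELECT name FROM users WHERE active = 1;")

print(similarity(known, similar))  # higher: similar command-line habits
print(similarity(known, other))    # lower: a different style entirely
```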
30

Masculo, Felipe, Jorn op den Buijs, Mariana Simons, and Aki Harma. "Natural Language Processing of Medical Alert Service Notes Reveals Reasons for Emergency Admissions." Iproceedings 5, no. 1 (October 2, 2019): e15225. http://dx.doi.org/10.2196/15225.

Abstract:
Background A Personal Emergency Response Service (PERS) enables an aging population to receive help quickly when an emergency situation occurs. The reasons that trigger a PERS alert are varied, including a sudden worsening of a chronic condition, a fall, or other injury. Every PERS case is documented by the response center using a combination of structured variables and free text notes. The text notes, in particular, contain a wealth of information in case of an incident such as contextual information, details about the situation, symptoms and more. Analysis of these notes at a population level could provide insight into the various situations that cause PERS medical alerts. Objective The objectives of this study were to (1) develop methods to enable the large-scale analysis of text notes from a PERS response center, and (2) to apply these methods to a large dataset and gain insight into the different situations that cause medical alerts. Methods More than 2.5 million deidentified PERS case text notes were used to train a document embedding model (ie, a deep learning Recurrent Neural Network [RNN] that takes the medical alert text note as input and produces a corresponding fixed length vector representation as output). We applied this model to 100,000 PERS text notes related to medical incidents that resulted in emergency department admission. Finally, we used t-SNE, a nonlinear dimensionality reduction method, to visualize the vector representation of the text notes in 2D as part of a graphical user interface that enabled interactive exploration of the dataset and visual analytics. Results Visual analysis of the vectors revealed the existence of several well-separated clusters of incidents such as fall, stroke/numbness, seizure, breathing problems, chest pain, and nausea, each of them related to the emergency situation encountered by the patient as recorded in an existing structured variable. In addition, subclusters were identified within each cluster which grouped cases based on additional features extracted from the PERS text notes and not available in the existing structured variables. For example, the incidents labeled as falls (n=37,842) were split into several subclusters corresponding to falls with bone fracture (n=1437), falls with bleeding (n=4137), falls caused by dizziness (n=519), etc. Conclusions The combination of state-of-the-art natural language processing, deep learning, and visualization techniques enables the large-scale analysis of medical alert text notes. This analysis demonstrates that, in addition to falls alerts, the PERS service is broadly used to signal for help in situations often related to underlying chronic conditions and acute symptoms such as respiratory distress, chest pain, diabetic reaction, etc. Moreover, the proposed techniques enable the extraction of structured information related to the medical alert from unstructured text with minimal human supervision. This structured information could be used, for example, to track trends over time, to generate concise medical alert summaries, and to create predictive models for desired outcomes.
31

Zweigenbaum, P., and A. Névéol. "Clinical Natural Language Processing in 2015: Leveraging the Variety of Texts of Clinical Interest." Yearbook of Medical Informatics 25, no. 01 (August 2016): 234–39. http://dx.doi.org/10.15265/iy-2016-049.

Abstract:
Objective: To summarize recent research and present a selection of the best papers published in 2015 in the field of clinical Natural Language Processing (NLP). Method: A systematic review of the literature was performed by the two section editors of the IMIA Yearbook NLP section by searching bibliographic databases with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. Section editors first selected a shortlist of candidate best papers that were then peer-reviewed by independent external reviewers. Results: The clinical NLP best paper selection shows that clinical NLP is making use of a variety of texts of clinical interest to contribute to the analysis of clinical information and the building of a body of clinical knowledge. The full review process highlighted five papers analyzing patient-authored texts or seeking to connect and aggregate multiple sources of information. They provide a contribution to the development of methods, resources, applications, and sometimes a combination of these aspects. Conclusions: The field of clinical NLP continues to thrive through the contributions of both NLP researchers and healthcare professionals interested in applying NLP techniques to impact clinical practice. Foundational progress in the field makes it possible to leverage a larger variety of texts of clinical interest for healthcare purposes.
32

Grabar, Anna, Lyubov Manukhina, and Nataliya Nikonova. "Indicative assessment method of the public perception of environmental marketing ideas." E3S Web of Conferences 244 (2021): 10023. http://dx.doi.org/10.1051/e3sconf/202124410023.

Abstract:
Natural Language Processing is a machine learning method based on mathematical linguistics that can identify trends in public opinion. The article analyzes the possibility of implementing LDA and NLP methods to identify growing public interest in the problems of ecology and environmental conservation, providing a basis for manufacturers to make eco-marketing decisions. Reorienting production toward creating green goods, introducing new ecological products to the market, and promoting energy-saving technologies require significant investment; to get a return, the steady demand of contractors and consumers for ecologization must be captured. The article offers a comparative analysis of obtaining information by classical methods (for example, through surveys) and by machine learning methods. The most important sources of data collection are highlighted on the basis of their popularity, public attention, and the number of individuals participating in the discourse. The authors developed the key categories and keywords with which Russian society associates the perception of environmental marketing, and the result of Natural Language Processing is presented to assess public perception of ecological marketing ideas.
33

Mahendhiran, P. D., and S. Kannimuthu. "Deep Learning Techniques for Polarity Classification in Multimodal Sentiment Analysis." International Journal of Information Technology & Decision Making 17, no. 03 (May 2018): 883–910. http://dx.doi.org/10.1142/s0219622018500128.

Abstract:
Contemporary research in Multimodal Sentiment Analysis (MSA) using deep learning is becoming popular in Natural Language Processing. Enormous amounts of data are obtainable every day from social media such as Facebook, WhatsApp, YouTube, Twitter, and microblogs, and it is difficult to identify the relevant information in such large multimodal data from social media websites. Hence, there is a need for more intelligent MSA. Here, deep learning is used to improve the understanding and performance of MSA. Deep learning delivers automatic feature extraction and helps achieve the best performance in a combined model that integrates linguistic, acoustic, and video information extraction. This paper focuses on the various techniques used for classifying a given portion of natural language text, audio, and video according to the thoughts, feelings, or opinions expressed in it, i.e., whether the general attitude is neutral, positive, or negative. From the results, it is observed that the deep learning classification algorithm gives better results than other machine learning classifiers such as KNN, Naive Bayes, Random Forest, Random Tree, and Neural Net models. The proposed deep learning MSA identifies sentiment in web videos; in preliminary proof-of-concept experiments using the ICT-YouTube dataset, our proposed multimodal system achieves an accuracy of 96.07%.
34

Vanyushkin, Alexander, and Leonid Graschenko. "Analysis of Text Collections for the Purposes of Keyword Extraction Task." Journal of information and organizational sciences 44, no. 1 (June 25, 2020): 171–84. http://dx.doi.org/10.31341/jios.44.1.8.

Abstract:
The article discusses the evaluation of automatic keyword extraction algorithms (AKEAs) and points out that an AKEA's measured effectiveness depends on the properties of the test collection. As a result, it is difficult to compare different algorithms whose tests were based on different test datasets, and difficult to predict the effectiveness of different systems on real-world natural language processing (NLP) problems. We take into consideration a number of characteristics, such as the distribution of text lengths in words and the method of keyword assignment. Our analysis of publicly available analytical exposition texts, which are typical for the keyword extraction domain, revealed that their length distributions are very regular and follow a lognormal form, with most article lengths ranging between 400 and 2500 words. Additionally, the paper presents a brief review of eleven corpora that have been used to evaluate AKEAs.
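The lognormal claim is easy to operationalize: if lengths are lognormal, their logarithms are normal, so the maximum-likelihood parameters are just the mean and standard deviation of the log-lengths. Below is a minimal sketch with invented sample lengths, not the article's corpora.

```python
# Maximum-likelihood lognormal fit to article lengths (invented sample).
import math
import statistics

lengths = [420, 650, 800, 980, 1200, 1500, 1800, 2100, 2400]
logs = [math.log(x) for x in lengths]

mu = statistics.fmean(logs)      # mean of log-lengths
sigma = statistics.pstdev(logs)  # standard deviation of log-lengths

print(f"mu={mu:.2f}, sigma={sigma:.2f}")
print(f"lognormal median ~ {math.exp(mu):.0f} words")
print(f"lognormal mode   ~ {math.exp(mu - sigma ** 2):.0f} words")
```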
APA, Harvard, Vancouver, ISO, and other styles
35

Dhanasekaran, Kuttiyapillai, and Ramachandran Rajeswari. "Text Mining Approach for Discovering Useful Knowledge from Information Sources of E-Waste." Advanced Materials Research 984-985 (July 2014): 1335–42. http://dx.doi.org/10.4028/www.scientific.net/amr.984-985.1335.

Full text
Abstract:
The proposed method introduces a K-Nearest Neighbor approach combined with a relevance vector machine, which finds entities and related information on waste materials to make the processing of waste materials more domain friendly. A corpus analysis was incorporated to support the extraction of accurate information through the elimination of unrelated tokens. Term weights were assigned through a vector space model. Parallel verification of various entities was carried out during testing, which reduces the time taken for mapping and discovering useful information in documents (datasets) on e-waste management. Because recent computer-aided tools cannot check the consistency and correctness of faulty requirement definitions, this paper introduces a text processing method based on natural language techniques; this enables effective maintenance and utilization of waste materials by presenting task-specific information through computer-assisted text mining and analysis.
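
A minimal sketch of the vector-space weighting plus K-Nearest Neighbor classification described above, using scikit-learn; the e-waste documents and labels are invented for illustration, and the paper's relevance vector machine component is omitted.

    # TF-IDF gives the vector space model's term weights; KNN classifies.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neighbors import KNeighborsClassifier

    docs = [
        "discarded circuit boards contain recoverable copper",
        "battery disposal requires hazardous waste handling",
        "crt monitors need lead-safe dismantling procedures",
        "lithium cells must be stored away from moisture",
    ]
    labels = ["metal_recovery", "hazardous", "hazardous", "hazardous"]

    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(docs)

    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(X, labels)
    print(knn.predict(vec.transform(["copper recovery from circuit boards"])))
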
APA, Harvard, Vancouver, ISO, and other styles
36

Xu, Dongxin, Jeffrey A. Richards, and Jill Gilkerson. "Automated Analysis of Child Phonetic Production Using Naturalistic Recordings." Journal of Speech, Language, and Hearing Research 57, no. 5 (October 2014): 1638–50. http://dx.doi.org/10.1044/2014_jslhr-s-13-0037.

Full text
Abstract:
Purpose Conventional resource-intensive methods for child phonetic development studies are often impractical for sampling and analyzing child vocalizations in sufficient quantity. The purpose of this study was to provide new information on early language development through an automated analysis of child phonetic production using naturalistic recordings. The new approach was evaluated relative to conventional manual transcription methods, and its effectiveness was demonstrated by a case study with 106 children with typical development (TD) ages 8–48 months, 71 children with autism spectrum disorder (ASD) ages 16–48 months, and 49 children with language delay (LD) not related to ASD ages 10–44 months. Method A small digital recorder in the chest pocket of clothing captured full-day natural child vocalizations, which were automatically categorized into consonant, vowel, nonspeech, and silence, producing the average count per utterance (ACPU) for consonants and vowels. Results Clear child utterances were identified with above 72% accuracy. Correlations between machine-estimated and human-transcribed ACPUs were above 0.82. Children with TD produced significantly more consonants and vowels per utterance than did the other children. Children with LD produced significantly more consonants, but not more vowels, than did children with ASD. Conclusion The authors provide new information on typical and atypical language development in children with TD, ASD, and LD using an automated computational approach.
APA, Harvard, Vancouver, ISO, and other styles
37

Gulyamova, Shakhnoza Kakhramonovna. "SEMANTIC ANALYSIS AND SYNTHESIS IN THE AUTOMATIC ANALYSIS OF THE TEXT." Scientific Reports of Bukhara State University 5, no. 1 (February 26, 2021): 112–24. http://dx.doi.org/10.52297/2181-1466/2021/5/1/9.

Full text
Abstract:
Introduction. In information retrieval systems, semantic analysis and synthesis occupy a leading place. By automatic semantic analysis we mean a set of methods and techniques, implemented on a computer with the help of specially developed linguistic algorithms, that can express the meaning of arbitrary natural language speech with sufficient accuracy using a rigorous, precise formalism. The importance of the semantic analyzer in an information retrieval system is associated, first of all, with studying the process of semantic analysis and synthesis in automatic text analysis and with eliminating its problems. Research methods. The direct semantic analysis and synthesis method was used to demonstrate the importance of semantic analysis and synthesis in the automatic analysis of text. This revealed their leading role in automatic text analysis: first the morphological and syntactic analysis of the text is carried out, and then the semantic analysis is performed. Semantic analysis works with meaning; moreover, semantics is closely related to philosophy, psychology, and other sciences, in addition to knowledge of the structure of the language. In semantic analysis, both the social and the cultural features of the native language must be taken into account. Human thinking and the means of expressing ideas make language a difficult process to formalize. Results and discussions.
APA, Harvard, Vancouver, ISO, and other styles
38

Li, Zhen, Shuo Xu, and Tianyu Wang. "A Method of Interest Degree Mining Based on Behavior Data Analysis." International Journal of Pattern Recognition and Artificial Intelligence 34, no. 09 (December 2, 2019): 2059030. http://dx.doi.org/10.1142/s0218001420590302.

Full text
Abstract:
Based on big data, this paper starts from the behavior data of social media users and explores the core issues of user modeling for personalized services. Focusing on user interest modeling, the paper proposes improvements to existing interest models, which differ greatly in how they describe interests across users and struggle to detect changes in user interest in time. To address these problems, the paper takes user-generated content and user behavior information as the objects of analysis and applies natural language processing, knowledge warehouses, data fusion, and other methods and techniques to the numerical analysis of user interest mining based on text mining and multi-source data fusion. We propose a user interest label space mapping method to avoid the data sparsity problem caused by excessive dimensionality in interest analysis. At the same time, we propose a method to extract and blend long-term and short-term interests, enabling a comprehensive evaluation of interests. In the big data analysis phase, by examining regularities in users' social attributes and application preference values, the approach is expected to achieve mining of users' Internet social media application preferences from a big data perspective.
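
The long-term/short-term blending idea can be illustrated with a simple recency-decay formula; the half-life, mixing weight, and topic scores below are hypothetical assumptions, not the paper's actual model.

    # A minimal sketch of blending long- and short-term interest scores.
    import math

    def blend_interest(long_term, short_term, days_since, half_life=30.0, alpha=0.6):
        """Combine stable long-term interest with recency-decayed short-term interest."""
        decay = math.exp(-math.log(2) * days_since / half_life)
        return alpha * long_term + (1 - alpha) * short_term * decay

    # topic: (long-term score, short-term score, days since last activity)
    profile = {"travel": (0.8, 0.2, 90), "gadgets": (0.3, 0.9, 2)}
    for topic, (lt, st, days) in profile.items():
        print(topic, round(blend_interest(lt, st, days), 3))
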
APA, Harvard, Vancouver, ISO, and other styles
39

Yang, Hao, Qin He, Zhenyan Liu, and Qian Zhang. "Malicious Encryption Traffic Detection Based on NLP." Security and Communication Networks 2021 (August 3, 2021): 1–10. http://dx.doi.org/10.1155/2021/9960822.

Full text
Abstract:
The development of the Internet and network applications has brought with it the development of encrypted communication technology. However, malicious traffic also uses encryption to evade traditional security protection and detection, which cannot accurately detect encrypted malicious traffic. In recent years, the rise of artificial intelligence has made it possible to use machine learning and deep learning methods to detect encrypted malicious traffic without decryption, with very accurate results. Current research on malicious encrypted traffic detection focuses mainly on the analysis of encrypted traffic characteristics and the selection of machine learning algorithms. In this paper, a method combining natural language processing and machine learning is proposed: a detection model based on TF-IDF. During data preprocessing, the method applies a natural language processing technique, the TF-IDF model, to extract information from the data, obtain the importance of keywords, and reconstruct the features of the data. The TF-IDF-based detection method does not need to analyze every field of the dataset. Compared with the usual machine learning preprocessing approach, namely data encoding, the experimental results show that preprocessing the data with natural language processing effectively improves detection accuracy. A gradient boosting classifier, a random forest classifier, an AdaBoost classifier, and an ensemble model based on these three classifiers are each used to construct the subsequent models. A convolutional neural network (CNN) is also trained, since CNNs can effectively extract information from data. With identical input data for the classifiers and the neural network, comparative analysis shows that the accuracy of the one-dimensional CNN is slightly higher than that of the machine-learning-based classifiers.
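
A minimal sketch of the TF-IDF preprocessing feeding the three named classifiers and their ensemble, using scikit-learn; the tokenized "flow" strings and labels are invented stand-ins for real traffic metadata.

    # TF-IDF turns flow metadata tokens into weighted features.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import (GradientBoostingClassifier,
                                  RandomForestClassifier, AdaBoostClassifier,
                                  VotingClassifier)

    flows = [
        "tls1.2 cipher_rsa cert_selfsigned sni_missing",
        "tls1.3 cipher_aes cert_letsencrypt sni_example.com",
        "tls1.0 cipher_rc4 cert_selfsigned sni_missing",
        "tls1.3 cipher_aes cert_digicert sni_shop.example",
    ]
    labels = [1, 0, 1, 0]  # 1 = malicious, 0 = benign

    X = TfidfVectorizer(token_pattern=r"\S+").fit_transform(flows)

    # Ensemble of the three classifiers named in the abstract.
    ensemble = VotingClassifier([
        ("gb", GradientBoostingClassifier()),
        ("rf", RandomForestClassifier()),
        ("ada", AdaBoostClassifier()),
    ])
    ensemble.fit(X, labels)
    print(ensemble.score(X, labels))
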
APA, Harvard, Vancouver, ISO, and other styles
40

Chan, Patrick, Yoshinori Hijikata, Toshiya Kuramochi, and Shogo Nishida. "Semantic Relatedness Estimation using the Layout Information of Wikipedia Articles." International Journal of Cognitive Informatics and Natural Intelligence 7, no. 2 (April 2013): 30–48. http://dx.doi.org/10.4018/ijcini.2013040103.

Full text
Abstract:
Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. Explicit Semantic Analysis (ESA), a state-of-the-art approach to this problem, uses word frequency to estimate relevance, so the relevance of low-frequency words cannot always be estimated well. To improve the relevance estimates for low-frequency words and concepts, the authors apply regression to word frequency, its location in an article, and its text style to calculate relevance. The relevance value is subsequently used to compute semantic relatedness. Empirical evaluation shows that, for low-frequency words, the authors' method achieves a better estimate of semantic relatedness than ESA. Furthermore, when all words of the dataset are considered, the combination of the authors' proposed method and the conventional approach outperforms the conventional approach alone.
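
The regression idea — predicting relevance from frequency, location, and text style — might be sketched as follows; the feature encoding and the human-judged scores are hypothetical assumptions.

    # A minimal sketch of regressing relevance on layout-derived features.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Columns: term frequency, position (0 = title ... 1 = end of article),
    # style flag (1 = bold/heading, 0 = body text).
    features = np.array([
        [12, 0.05, 1],
        [3,  0.40, 0],
        [1,  0.90, 0],
        [5,  0.10, 1],
    ])
    relevance = np.array([0.95, 0.40, 0.10, 0.70])  # human-judged scores

    reg = LinearRegression().fit(features, relevance)
    print(reg.predict([[2, 0.20, 1]]))  # estimate for a low-frequency word
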
APA, Harvard, Vancouver, ISO, and other styles
41

Lin, Kangcheng, and Harrison Kim. "AN AUTOMATED METHOD TO CONDUCT IMPORTANCE-PERFORMANCE ANALYSIS OF PRODUCT ATTRIBUTES FROM ONLINE REVIEWS - AN EXTENSION WITH A CASE STUDY." Proceedings of the Design Society 1 (July 27, 2021): 417–26. http://dx.doi.org/10.1017/pds.2021.42.

Full text
Abstract:
With the growth of online marketplaces and social media, product designers have seen an exponential growth in available data, which can serve as an extremely valuable source of information communicated by customers without geographical limitations. The data reveal customers' preferences, which can be expensive and slow to obtain via traditional methods such as surveys and questionnaires. While existing methods in the literature have been proposed to extract product information and make inferences from online data, they have limitations, especially in providing reliable results and in dealing with data sparsity. Therefore, this paper proposes a method to conduct an importance-performance analysis from online reviews. The major steps of this method involve using latent Dirichlet allocation (LDA) to identify product attributes, using the IBM Watson Natural Language Understanding tool to perform aspect-based sentiment analysis, and using an XGBoost model to infer product attribute importance from the collected dataset. In our case study, we collected over 150,000 text reviews of more than 3,000 laptops from Amazon.
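
A minimal sketch of the attribute-importance step, assuming the xgboost package; the synthetic aspect-sentiment matrix and ratings are stand-ins for the paper's review data, and the recovered importances should roughly track the coefficients used to generate the ratings.

    # Infer attribute importance from aspect sentiment with gradient boosting.
    import numpy as np
    from xgboost import XGBRegressor

    rng = np.random.default_rng(0)
    n = 500
    # Columns: per-review sentiment toward battery, screen, keyboard, price.
    aspect_sentiment = rng.uniform(-1, 1, size=(n, 4))
    overall_rating = (
        2.0 * aspect_sentiment[:, 0]   # battery drives satisfaction most
        + 1.0 * aspect_sentiment[:, 1]
        + 0.3 * aspect_sentiment[:, 2]
        + 0.7 * aspect_sentiment[:, 3]
        + rng.normal(scale=0.2, size=n)
    )

    model = XGBRegressor(n_estimators=200, max_depth=3)
    model.fit(aspect_sentiment, overall_rating)

    for name, imp in zip(["battery", "screen", "keyboard", "price"],
                         model.feature_importances_):
        print(f"{name}: {imp:.3f}")
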
APA, Harvard, Vancouver, ISO, and other styles
42

Arman, Nabil, and Sari Jabbarin. "Generating Use Case Models from Arabic User Requirements in a Semiautomated Approach Using a Natural Language Processing Tool." Journal of Intelligent Systems 24, no. 2 (June 1, 2015): 277–86. http://dx.doi.org/10.1515/jisys-2014-0092.

Full text
Abstract:
Automated software engineering has attracted a large amount of research effort. The use of object-oriented methods for software systems development has made it necessary to develop approaches that semiautomatically construct different Unified Modeling Language (UML) models from textual user requirements. UML use case models are an essential artifact that provides a perspective of the system under analysis or development, and their development is crucial in an object-oriented development method. The main principles used in obtaining these models are described. A natural language processing tool is used to parse statements of user requirements written in Arabic to obtain lists of nouns, noun phrases, verbs, verb phrases, etc., that aid in finding potential actors and use cases. A set of steps representing our approach for constructing a use case model is presented. Finally, the proposed approach is validated in an experiment involving a group of graduate students who are familiar with use case modeling.
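
The parsing step might look like the NLTK sketch below; the original work parses Arabic, but an English requirement sentence is used here for brevity, and the sentence itself is invented.

    # Tag a requirement sentence; nouns suggest actors, verbs suggest use cases.
    import nltk

    # One-time downloads: nltk.download("punkt");
    # nltk.download("averaged_perceptron_tagger")
    sentence = "The librarian registers a new member and issues a library card."
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)

    nouns = [w for w, t in tagged if t.startswith("NN")]
    verbs = [w for w, t in tagged if t.startswith("VB")]
    print("candidate actors:", nouns)
    print("candidate use cases:", verbs)
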
APA, Harvard, Vancouver, ISO, and other styles
43

Du, Changshun, and Lei Huang. "Sentiment Analysis Method based on Piecewise Convolutional Neural Network and Generative Adversarial Network." International Journal of Computers Communications & Control 14, no. 1 (February 14, 2019): 7–20. http://dx.doi.org/10.15837/ijccc.2019.1.3374.

Full text
Abstract:
Text sentiment analysis is one of the most important tasks in public opinion monitoring, service evaluation, and satisfaction analysis in network environments. Compared with traditional Natural Language Processing analysis tools, convolutional neural networks can automatically learn useful features from sentences and improve the performance of the sentiment analysis model. However, the original convolutional neural network model ignores sentence structure information, which is very important for text sentiment analysis. In this paper, we add piecewise pooling to the convolutional neural network, which allows the model to capture sentence structure, and the main features of different sentences are extracted to analyze the emotional tendency of the text. At the same time, user feedback spans many different fields, and labeled data are scarce. To alleviate data sparsity, this paper also uses a generative adversarial network for common feature extraction, so that the model captures the emotion-related features shared across fields, improving the model's generalization ability with less training data. Experiments on different datasets demonstrate the effectiveness of this method.
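
Piecewise pooling can be sketched in a few lines of NumPy: instead of one max over the whole sentence, the feature map is split into segments and pooled per segment; the segment boundaries and sizes below are illustrative assumptions.

    # A minimal sketch of piecewise max pooling over a convolution output.
    import numpy as np

    def piecewise_max_pool(feature_map, boundaries):
        """feature_map: (seq_len, n_filters); boundaries: segment split points."""
        segments = np.split(feature_map, boundaries, axis=0)
        return np.concatenate([seg.max(axis=0) for seg in segments])

    # 10 token positions, 4 convolution filters; split into 3 pieces.
    conv_out = np.random.default_rng(0).normal(size=(10, 4))
    pooled = piecewise_max_pool(conv_out, boundaries=[3, 7])
    print(pooled.shape)  # (12,) = 3 segments x 4 filters
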
APA, Harvard, Vancouver, ISO, and other styles
44

Knippenberg, S. C. M., L. F. P. Etman, T. Wilschut, and J. A. van de Mortel-Fronczak. "Specifying Process Activities for Multi-Domain Matrix Analysis Using a Structured Textual Format." Proceedings of the Design Society: International Conference on Engineering Design 1, no. 1 (July 2019): 1613–22. http://dx.doi.org/10.1017/dsi.2019.167.

Full text
Abstract:
This paper proposes a method to automatically generate a multi-domain matrix (MDM) from textual activity specifications. The format for specifying these activities is based on a structured grammar derived from natural language and consists of two types of activities: goal activities and transformation activities. A goal activity describes the purpose of an action performed by an actor for the benefit of another actor in the system. A transformation activity describes an activity from the viewpoint of a single actor, who receives, generates, and outputs information or artifacts. If one describes activities using these two types of activity specifications, dependencies between actors, activities, and parameters of the system can be automatically derived and visualized in an MDM. The generated MDM thus presents an organization DSM (actors), a process DSM (activities), and a parameter DSM (flows of information or objects), as well as the mapping matrices coupling the different domains. An illustrative house construction example demonstrates the effectiveness of the proposed activity specification format. The method may prove useful in understanding and managing complex systems.
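
A minimal sketch of deriving an actor-level DSM from transformation-activity triples; the house-construction activities below are invented in the spirit of the paper's example, and the grammar parsing itself is omitted.

    # Build an actor DSM from (performer, activity, receiver) triples.
    activities = [
        ("architect", "deliver floor plan", "contractor"),
        ("contractor", "pour foundation", "bricklayer"),
        ("bricklayer", "build walls", "roofer"),
    ]

    actors = sorted({a for a, _, b in activities} | {b for _, _, b in activities})
    index = {name: i for i, name in enumerate(actors)}

    # DSM cell [i][j] = 1 when actor j passes information or artifacts to actor i.
    dsm = [[0] * len(actors) for _ in actors]
    for src, _, dst in activities:
        dsm[index[dst]][index[src]] = 1

    for name, row in zip(actors, dsm):
        print(f"{name:12s} {row}")
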
APA, Harvard, Vancouver, ISO, and other styles
45

Huang, Kui, Wen Nie, and Nianxue Luo. "A Method of Constructing Marine Oil Spill Scenarios from Flat Text Based on Semantic Analysis." International Journal of Environmental Research and Public Health 17, no. 8 (April 13, 2020): 2659. http://dx.doi.org/10.3390/ijerph17082659.

Full text
Abstract:
Constructed emergency response scenarios provide a basis for decision makers to make management decisions, and the development of such scenarios draws on earlier historical cases. Over the decades, the development of emergency response scenarios has mainly used elements of historical cases to describe the severity and impact of an accident. This paper focuses on scenario construction and proposes a corresponding framework based on natural language processing (NLP) using text reports of marine oil spill accidents. For each accident, the original textual reports are first divided into sentence sets corresponding to the temporal evolution, and each sentence set is regarded as a textual description of a marine oil spill scenario. A method based on parsing, named entity recognition (NER), and open information extraction (OpenIE) is proposed to process the relation triples extracted from the sentence sets. Finally, the relation triples are semantically clustered into different marine oil spill domains to construct scenarios. The research results are validated and indicate that the proposed scenario construction framework can be used effectively in practical applications.
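
The NER step might be sketched with spaCy as below; the abstract does not name a toolkit, so spaCy and its en_core_web_sm model (installed separately) are assumptions, and the report sentence is invented.

    # A minimal sketch of entity extraction from an oil spill report sentence.
    import spacy

    nlp = spacy.load("en_core_web_sm")  # requires the model to be installed
    report = ("On 20 April 2010 the Deepwater Horizon rig spilled oil "
              "into the Gulf of Mexico.")
    doc = nlp(report)

    for ent in doc.ents:
        print(ent.text, ent.label_)  # e.g., DATE, ORG/FAC, LOC entities
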
APA, Harvard, Vancouver, ISO, and other styles
46

Shi, Lei, Yulin Zhu, Youpeng Zhang, and Zhongji Su. "Fault Diagnosis of Signal Equipment on the Lanzhou-Xinjiang High-Speed Railway Using Machine Learning for Natural Language Processing." Complexity 2021 (July 28, 2021): 1–13. http://dx.doi.org/10.1155/2021/9126745.

Full text
Abstract:
The Lanzhou-Xinjiang (Lan-Xin) high-speed railway is one of the principal sections of the railway network in western China, and signal equipment is of great importance in ensuring the railway's safe and efficient operation. Over a long period of railway operation and maintenance, the railway signaling and communications department has recorded a large amount of unstructured text information about equipment faults in the form of natural language. However, due to irregularities in how these data were recorded, they are difficult to use directly. In this paper, a method based on natural language processing (NLP) was adopted to analyze and classify this information. First, the Latent Dirichlet Allocation (LDA) topic model was used to extract the semantic features of the text, which were then expressed in the corresponding topic feature space. Next, the Support Vector Machine (SVM) algorithm was used to construct a signal equipment fault diagnostic model that reduces the impact of sample imbalance on classification accuracy. This was compared with the traditional Naive Bayes (NB), Logistic Regression (LR), Random Forest (RF), and K-Nearest Neighbor (KNN) algorithms. The study used signal equipment failure text data from the Lan-Xin high-speed railway for experimental analysis to verify the effectiveness of the proposed method. Experiments showed that the accuracy of the SVM classifier could reach 0.84 when combined with the LDA topic model, which verifies that the natural language processing method can effectively realize fault diagnosis of signal equipment and offers practical guidance for the maintenance of field signal equipment.
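
A minimal sketch of the LDA-to-SVM pipeline, where topic proportions become the classifier's feature vector; the fault records, labels, and topic count are invented stand-ins for the Lan-Xin data, assuming scikit-learn.

    # Topic proportions from LDA feed an SVM fault classifier.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline

    records = [
        "track circuit shows red band no train present",
        "signal lamp filament burned out replaced on site",
        "switch machine failed to lock in reverse position",
        "track circuit intermittent occupancy after rain",
    ]
    fault_type = ["track_circuit", "signal_lamp", "switch", "track_circuit"]

    pipeline = make_pipeline(
        CountVectorizer(),
        LatentDirichletAllocation(n_components=3, random_state=0),
        SVC(class_weight="balanced"),  # mitigates class imbalance, as in the paper
    )
    pipeline.fit(records, fault_type)
    print(pipeline.predict(["red band on track circuit during storm"]))
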
APA, Harvard, Vancouver, ISO, and other styles
47

Ali, Manal Mostafa. "Arabic sentiment analysis about online learning to mitigate covid-19." Journal of Intelligent Systems 30, no. 1 (January 1, 2021): 524–40. http://dx.doi.org/10.1515/jisys-2020-0115.

Full text
Abstract:
The Covid-19 pandemic is forcing organizations to innovate and change their strategies for a new reality. This study collects online learning-related tweets in Arabic to perform comprehensive emotion mining and sentiment analysis (SA) during the pandemic. The study exploits Natural Language Processing (NLP) and Machine Learning (ML) algorithms to extract subjective information, determine polarity, and detect the underlying feeling. We begin by pulling the tweets using the Twitter APIs and then performing intensive preprocessing. Second, the National Research Council Canada (NRC) Word-Emotion Lexicon was used to calculate the presence of the eight basic emotions and their emotional weights. Third, Information Gain (IG) is used as a filtering technique. Fourth, the latent reasons behind the negative sentiments were identified and analyzed. Finally, different classification algorithms, including Naïve Bayes (NB), Multinomial Naïve Bayes (MNB), K-Nearest Neighbor (KNN), Logistic Regression (LR), and Support Vector Machine (SVM), were examined. The experiments reveal that the proposed model performs well in analyzing people's perceptions of coronavirus, with a maximum accuracy of about 89.6% using the SVM classifier. From a practical perspective, the method could be generalized to other topical domains, such as public health monitoring and crisis management. It would help public health officials identify the progression and peaks of concern about a disease in space and time, enabling the implementation of appropriate preventive actions.
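
Lexicon-based emotion scoring in the spirit of the NRC Word-Emotion Lexicon can be sketched as below; the tiny English lexicon is a hypothetical stand-in for the real resource and for the Arabic tokens the study processes.

    # Count emotion-bearing tokens against a (toy) word-emotion lexicon.
    from collections import Counter

    emotion_lexicon = {
        "afraid": ["fear"], "cancel": ["sadness"], "hope": ["anticipation"],
        "exam": ["fear", "anticipation"], "success": ["joy", "trust"],
    }

    def score_emotions(tweet):
        counts = Counter()
        for token in tweet.lower().split():
            counts.update(emotion_lexicon.get(token, []))
        return counts

    print(score_emotions("students afraid exam may cancel but hope for success"))
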
APA, Harvard, Vancouver, ISO, and other styles
48

DRAGGIOTIS, ANTHONY, MARIA GRIGORIADOU, and GIORGOS PHILOKYPROU. "The DINOUS parser." Natural Language Engineering 4, no. 2 (June 1998): 145–73. http://dx.doi.org/10.1017/s1351324997001800.

Full text
Abstract:
This paper deals with the development of parsing techniques for the analysis of natural language sentences. We present a paradigm of a multi-path shift-reduce parser which combines two differently structured computational subsystems: the first uses information concerning native speakers' preferences, and the second deals with linguistic knowledge. To apply preferences during parsing, we propose a method to rank the alternative partial analyses on the basis of parse context and frequency-of-use effects. The method is mainly based on psycholinguistic evidence, since we hope eventually to build a parser that works as closely as possible to the way native speakers analyse natural sentences. We also discuss in detail techniques for optimizing the effectiveness of the proposed model. The system has worked successfully in parsing sentences in Modern Greek, a language whose relatively free word order results in many ambiguity problems. The proposed parsing model is consistent with many directions in the field of preference-based parsing, and it proves adequate for building effective and maintainable natural language analysers. It is believed that this model can also be used for parsing sentences in languages other than Greek.
APA, Harvard, Vancouver, ISO, and other styles
49

Baud, R. H., A. M. Rassinoux, J. C. Wagner, C. Lovis, C. Juge, L. L. Alpay, P. A. Michel, P. Degoulet, and J. R. Scherrer. "Representing Clinical Narratives Using Conceptual Graphs." Methods of Information in Medicine 34, no. 01/02 (1995): 176–86. http://dx.doi.org/10.1055/s-0038-1634586.

Full text
Abstract:
The analysis of medical narratives and the generation of natural language expressions are strongly dependent on the existence of an adequate representation language. Such a language has to be expressive enough to handle the complexity of human reasoning in the domain. Sowa's Conceptual Graphs (CG) are an answer, and this paper presents a multilingual implementation using French, English, and German. Current developments demonstrate the feasibility of an approach to Natural Language Understanding where semantic aspects are dominant, in contrast to syntax-driven methods. The basic idea is to aggregate blocks of words according to semantic compatibility rules, following a method called Proximity Processing. The CG representation is built gradually, starting from single words in a semantic lexicon, to finally give a complete representation of the sentence in the form of a single CG. The process depends on specific rules of the medical domain and for this reason is largely controlled by the declarative knowledge of the medical Linguistic Knowledge Base.
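
A toy data structure for a CG-style representation of a clinical phrase might look as follows; the concept types and relation labels are simplified assumptions in the spirit of Sowa's linear notation, not the paper's implementation.

    # A minimal sketch of conceptual-graph edges for "the patient has a fever".
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Concept:
        ctype: str
        referent: str = "*"

    # Edges of the graph: (source concept, relation, target concept).
    edges = [
        (Concept("Experience", "e1"), "EXPR", Concept("Patient", "p1")),
        (Concept("Experience", "e1"), "THME", Concept("Fever")),
    ]

    for src, rel, dst in edges:
        print(f"[{src.ctype}: {src.referent}] -> ({rel}) -> "
              f"[{dst.ctype}: {dst.referent}]")
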
APA, Harvard, Vancouver, ISO, and other styles
50

Zeng, Qingtian, Xishi Zhao, Xiaohui Hu, Hua Duan, Zhongying Zhao, and Chao Li. "Learning emotional word embeddings for sentiment analysis." Journal of Intelligent & Fuzzy Systems 40, no. 5 (April 22, 2021): 9515–27. http://dx.doi.org/10.3233/jifs-201993.

Full text
Abstract:
Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text ignore emotional information, a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. The resulting document vectors are then input to a classification model and used to train a neural-network-based text sentiment classifier; in this way, the emotional polarity of the text is propagated into the word vectors. Experimental results on three kinds of real-world datasets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
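
One of the linear weighting schemes the abstract mentions can be sketched as a term-frequency-weighted average of pre-trained word vectors; the toy 4-dimensional vectors below are hypothetical stand-ins for real embeddings.

    # Build a document vector as a TF-weighted average of word vectors.
    import numpy as np
    from collections import Counter

    word_vectors = {
        "great":  np.array([0.9, 0.1, 0.3, 0.0]),
        "movie":  np.array([0.2, 0.8, 0.1, 0.4]),
        "boring": np.array([-0.7, 0.2, 0.0, 0.1]),
    }

    def document_vector(tokens):
        tf = Counter(t for t in tokens if t in word_vectors)
        total = sum(tf.values())
        return sum((n / total) * word_vectors[w] for w, n in tf.items())

    print(document_vector(["great", "great", "movie"]))
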
APA, Harvard, Vancouver, ISO, and other styles