Journal articles on the topic 'Real word Data'

Consult the top 50 journal articles for your research on the topic 'Real word Data.'

1

Li, Quanzhi, Sameena Shah, Xiaomo Liu, and Armineh Nourbakhsh. "Data Sets: Word Embeddings Learned from Tweets and General Data." Proceedings of the International AAAI Conference on Web and Social Media 11, no. 1 (May 3, 2017): 428–36. http://dx.doi.org/10.1609/icwsm.v11i1.14859.

Abstract:
A word embedding is a low-dimensional, dense and real-valued vector representation of a word. Word embeddings have been used in many NLP tasks. They are usually generated from a large text corpus. The embedding of a word captures both its syntactic and semantic aspects. Tweets are short, noisy and have unique lexical and semantic features that are different from other types of text. Therefore, it is necessary to have word embeddings learned specifically from tweets. In this paper, we present ten word embedding data sets. In addition to the data sets learned from just tweet data, we also built embedding sets from the general data and the combination of tweets and the general data. The general data consist of news articles, Wikipedia data and other web data. These ten embedding models were learned from about 400 million tweets and 7 billion words from the general data. In this paper, we also present two experiments demonstrating how to use the data sets in some NLP tasks, such as tweet sentiment analysis and tweet topic classification tasks.
2

Yamada, Kenta, Hideki Takayasu, and Misako Takayasu. "Estimation of Economic Indicator Announced by Government From Social Big Data." Entropy 20, no. 11 (November 6, 2018): 852. http://dx.doi.org/10.3390/e20110852.

Abstract:
We introduce a systematic method to estimate an economic indicator from the Japanese government by analyzing big Japanese blog data. Explanatory variables are monthly word frequencies. We adopt 1352 words in the section of economics and industry of the Nikkei thesaurus for each candidate word to illustrate the economic index. From this large volume of words, our method automatically selects the words which have strong correlation with the economic indicator and resolves some difficulties in statistics such as the spurious correlation and overfitting. As a result, our model reasonably illustrates the real economy index. The announcement of an economic index from government usually has a time lag, while our proposed method can be real time.
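The word-selection idea in this abstract (keeping only candidate words whose monthly frequency correlates strongly with the indicator) can be illustrated with a minimal sketch. This is not the authors' method: the function names, the toy data, and the 0.8 threshold are all assumptions, and the sketch does nothing about the spurious correlation or overfitting that the paper says it resolves.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def select_words(word_freqs, indicator, threshold=0.8):
    """Keep words whose monthly frequency correlates strongly (in absolute
    value) with the indicator. The threshold is a hypothetical choice."""
    selected = {}
    for word, freqs in word_freqs.items():
        r = pearson(freqs, indicator)
        if abs(r) >= threshold:
            selected[word] = r
    return selected


# Toy monthly frequencies for three hypothetical candidate words.
word_freqs = {
    "export": [1, 2, 3, 4],
    "weather": [2, 1, 2, 1],
    "layoff": [4, 3, 2, 1],
}
indicator = [10, 20, 30, 40]  # made-up monthly index values
selected = select_words(word_freqs, indicator)
```

In this toy data "export" and "layoff" track the index (positively and negatively), while "weather" does not and is dropped.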
3

Tafiadis, Dionysios, Vasiliki Zarokanellou, Alexandra Prentza, Louiza Voniati, and Nafsika Ziavra. "Oral diadochokinetic rates for real words and non-words in Greek-speaking children." Open Linguistics 7, no. 1 (January 1, 2021): 722–38. http://dx.doi.org/10.1515/opli-2020-0178.

Abstract:
This study examined the performance of Greek monolingual typically developing (TD) children on diadochokinetic (DDK) rates in real words and non-words and attempted to establish normative data for Greek. The effects of age, type of stimuli and gender were investigated. A total of 380 children aged 4.0–15.0 years as well as a control group of 313 adults participated in the study. Age significantly affected DDK performance, yet normative data differ from other studies. DDK rates for bisyllabic stimuli were faster than DDK rates for trisyllabic stimuli and real words were articulated faster than non-words. Adolescents aged 13.0–15.0 years were slower than adults both in real word and in non-word /ˈpataka/ repetition. Additionally, overall boys were significantly faster than girls. These findings show the need to: (a) implement real word stimuli in DDK tasks in order to better depict an individual’s oral-motor abilities and (b) establish language-specific normative data for TD children.
4

Maloberti, Alessandro, Elisa Andrian, Filippo Leidi, Massimiliano Monticelli, Michele Galasso, Valentina Colombo, and Cristina Giannattasio. "CARDIOLOGICAL HYPERTENSIVE EMERGENCIES: REAL WORD DATA COMPARED TO GUIDELINES INDICATIONS." Journal of Hypertension 41, Suppl 3 (June 2023): e123. http://dx.doi.org/10.1097/01.hjh.0000940008.49861.5c.

5

Vulić, Ivan, and Marie-Francine Moens. "Bilingual Distributed Word Representations from Document-Aligned Comparable Data." Journal of Artificial Intelligence Research 55 (April 12, 2016): 953–94. http://dx.doi.org/10.1613/jair.4986.

Abstract:
We propose a new model for learning bilingual word representations from non-parallel document-aligned data. Following the recent advances in word representation learning, our model learns dense real-valued word vectors, that is, bilingual word embeddings (BWEs). Unlike prior work on inducing BWEs which heavily relied on parallel sentence-aligned corpora and/or readily available translation resources such as dictionaries, the article reveals that BWEs may be learned solely on the basis of document-aligned comparable data without any additional lexical resources nor syntactic information. We present a comparison of our approach with previous state-of-the-art models for learning bilingual word representations from comparable data that rely on the framework of multilingual probabilistic topic modeling (MuPTM), as well as with distributional local context-counting models. We demonstrate the utility of the induced BWEs in two semantic tasks: (1) bilingual lexicon extraction, (2) suggesting word translations in context for polysemous words. Our simple yet effective BWE-based models significantly outperform the MuPTM-based and context-counting representation models from comparable data as well as prior BWE-based models, and acquire the best reported results on both tasks for all three tested language pairs.
6

Mustikarini, Wening, Risanuri Hidayat, and Agus Bejo. "Real-Time Indonesian Language Speech Recognition with MFCC Algorithms and Python-Based SVM." IJITEE (International Journal of Information Technology and Electrical Engineering) 3, no. 2 (October 29, 2019): 55. http://dx.doi.org/10.22146/ijitee.49426.

Abstract:
Automatic Speech Recognition (ASR) is a technology that uses machines to process and recognize human voice. One way to increase recognition rate is to use a model of language you want to recognize. In this paper, a speech recognition application is introduced to recognize words "atas" (up), "bawah" (down), "kanan" (right), and "kiri" (left). This research used 400 samples of speech data, 75 samples from each word for training data and 25 samples for each word for test data. This speech recognition system was designed using Mel Frequency Cepstral Coefficient (MFCC) as many as 13 coefficients as features and Support Vector Machine (SVM) as identifiers. The system was tested with linear kernels and RBF, various cost values, and three sample sizes (n = 25, 75, 50). The best average accuracy value was obtained from SVM using linear kernels, a cost value of 100 and a data set consisted of 75 samples from each class. During the training phase, the system showed a f1-score (trade-off value between precision and recall) of 80% for the word "atas", 86% for the word "bawah", 81% for the word "kanan", and 100% for the word "kiri". Whereas by using 25 new samples per class for system testing phase, the f1-score was 76% for the "atas" class, 54% for the "bawah" class, 44% for the "kanan" class, and 100% for the "kiri" class.
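For readers unfamiliar with the f1-score mentioned in this abstract, it is the harmonic mean of precision and recall. The sketch below shows the standard formula only; the function name and the example values are illustrative assumptions and are not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # conventional value when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical class: every true sample is found (recall 1.0), but only
# half of the predictions for the class are correct (precision 0.5).
example = f1_score(0.5, 1.0)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a class needs both high precision and high recall to reach a high score.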
7

Whiting, Caroline, Yury Shtyrov, and William Marslen-Wilson. "Real-time Functional Architecture of Visual Word Recognition." Journal of Cognitive Neuroscience 27, no. 2 (February 2015): 246–65. http://dx.doi.org/10.1162/jocn_a_00699.

Abstract:
Despite a century of research into visual word recognition, basic questions remain unresolved about the functional architecture of the process that maps visual inputs from orthographic analysis onto lexical form and meaning and about the units of analysis in terms of which these processes are conducted. Here we use magnetoencephalography, supported by a masked priming behavioral study, to address these questions using contrasting sets of simple (walk), complex (swimmer), and pseudo-complex (corner) forms. Early analyses of orthographic structure, detectable in bilateral posterior temporal regions within a 150–230 msec time frame, are shown to segment the visual input into linguistic substrings (words and morphemes) that trigger lexical access in left middle temporal locations from 300 msec. These are primarily feedforward processes and are not initially constrained by lexical-level variables. Lexical constraints become significant from 390 msec, in both simple and complex words, with increased processing of pseudowords and pseudo-complex forms. These results, consistent with morpho-orthographic models based on masked priming data, map out the real-time functional architecture of visual word recognition, establishing basic feedforward processing relationships between orthographic form, morphological structure, and lexical meaning.
8

Maloberti, A., E. Andrian, F. Leidi, M. Massimiliano, M. Galasso, V. Colombo, and C. Giannattasio. "C89 CARDIOLOGICAL HYPERTENSIVE EMERGENCIES: REAL WORD DATA COMPARED TO GUIDELINES INDICATIONS." European Heart Journal Supplements 25, Supplement_D (May 2023): D37. http://dx.doi.org/10.1093/eurheartjsupp/suad111.086.

Abstract:
Background: The 2018 ESH guidelines have revised the therapeutic goals of cardiological Hypertensive Emergencies (HE) with an indication for a more intensive (target < 140/90 mmHg) and rapid (immediate) Blood Pressure (BP) reduction. Cardiac acute organ damage during HE includes acute myocardial infarction, pulmonary edema, unstable angina pectoris and aortic dissection. However, how much these indications have been applied in clinical practice to date is unknown. Aims: The first purpose of our study is to analyze the prevalence and clinical characteristics of cardiological HE in our institution. The second purpose is to compare the year before the release of the 2018 guidelines (2017) with a subsequent year (2019) in order to verify adherence to the guidelines. Methods: This is a single-center retrospective study conducted at the Niguarda Hospital. All patients aged ≥ 18 years with Systolic BP ≥ 180 mmHg and/or Diastolic BP ≥ 120 mmHg with Cardiological Emergency were enrolled. Clinical, anamnestic, blood pressure, symptom, drug treatment and target achievement data were registered from the Emergency Department (ED) records. Results: Patients with BP > 180/120 mmHg in 2017 were 706 out of a total of 73795 accesses (0.96%) and 601 out of 67273 (0.89%) in 2019. 246 (34.84%) in 2017 were HE of which 144 (58.53%) were cardiological: aortic dissection 1 (0.69%), acute coronary syndrome 52 (36.11%), acute pulmonary edema 35 (24.30%), cardiac decompensation 91 (63.19%). During 2019 similar figures were found with 286 (47.58%) HE of which 286 (47.58%) were cardiological: aortic dissection 2 (1.43%), acute coronary syndrome 43 (30.93%), acute pulmonary edema 20 (14.39%), cardiac decompensation 76 (54.68%). The reduction in BP obtained in ED was significantly greater in 2017 than in 2019 (44.7±31.4 vs 35.4±24.5 mmHg, p = 0.011) with a lower target reaching in 2019 (28.9 vs 51.4%, p<0.001). Pulmonary edema is the cardiological HE in which the greatest pressure reduction is obtained and therefore in which the target set by the guidelines is more frequently reached. Conclusions: The recommendation for a more intense and rapid BP reduction in cardiological HE seems not to be accepted by ED clinicians, who continue to reduce BP according to previous guidelines.
9

Damholdt, Malene Flensborg, Christina Vestergaard, Anna Kryvous, Catharina Vesterager Smedegaard, and Johanna Seibt. "What is in three words? Exploring a three-word methodology for assessing impressions of a social robot encounter online and in real life." Paladyn, Journal of Behavioral Robotics 10, no. 1 (December 31, 2019): 438–53. http://dx.doi.org/10.1515/pjbr-2019-0034.

Abstract:
We explore the impressions and conceptualisations produced by participants after their first encounter with the teleoperated robot, Telenoid R1. Participants were invited to freely report the first three words that came to mind after seeing the robot. Here we triangulate (i) three-word data from an online survey (n=340) where respondents saw a brief video of the Telenoid with (ii) three-word data from an interaction study where participants interacted with a physically present Telenoid (n=75) and (iii) data from qualitative interviews (n=7) with participants who had engaged with the Telenoid. Data were subjected to sentiment analysis, linguistic analysis and regression analysis. Ranking of the most frequently produced words in the two groups revealed an overlap on the top-10 produced words (6 out of 10 words). Sentiment analysis and regression revealed an association between negative predicates and the online condition. Sentiments were not convincingly associated with age or gender. Linguistic categorisations of the data revealed that especially adjectives expressing response-dependent features were frequent. We did not find any consistent statistical effect on categorising the words into cognitive and emotional predicates. The proposed three-word method offers an unguided approach to explore initial conceptualisations of robots.
10

Roberto, Giuseppe, Andrea Spini, Claudia Bartolini, Valentino Moscatelli, Alessandro Barchielli, Davide Paoletti, Silvano Giorgi, et al. "Real word evidence on rituximab utilization: Combining administrative and hospital-pharmacy data." PLOS ONE 15, no. 3 (March 12, 2020): e0229973. http://dx.doi.org/10.1371/journal.pone.0229973.

11

De la Torre Rubio, N., N. Franco Domingo, I. Pérez - Sancristóbal, M. Pavía Pascual, L. F. Montaño Tapia, M. Machattou, P. Navarro Palomo, et al. "AB1516 APREMILAST FOR TREATMENT OF ORAL APHTOSIS: REAL-WORD DATA MULTICENTER STUDY." Annals of the Rheumatic Diseases 82, Suppl 1 (May 30, 2023): 1990.1–1990. http://dx.doi.org/10.1136/annrheumdis-2023-eular.3977.

Abstract:
Background: The main manifestation of Behçet’s disease is oral aphthae. Treatment consists of topical corticosteroids and colchicine in recurrent cases. Drugs such as azathioprine, thalidomide, interferon-alpha, tumour necrosis factor-alpha inhibitors or apremilast should be considered in selected cases [1]. A phase 3, double-blind, placebo-controlled RELIEF study suggests that apremilast is an effective agent for the management of oral aphtae in Behçet’s disease and benefits were sustained up to 64 weeks. Its onset of action occurred within 2 weeks after treatment was started. Adverse events, including nausea, vomiting, and diarrhea, occurred more frequently with apremilast than with placebo [2]. It has also been shown to be effective in the treatment of psoriasis [3]. Objectives: To assess the effectiveness of apremilast in the treatment of oral aphtosis. Methods: Retrospective descriptive multicentre study of patients treated with apremilast for oral aphtosis associated with Behçet’s disease, as well as patients with oral aphtosis who have received apremilast for another indication (psoriasis) or off-label use. Four hospitals participated in the study. The following variables were collected: age, sex, diagnosis, oral aphtosis at the time of starting apremilast, aphtosis response at 3 and 6 months (partial improvement, total improvement, lack of response, discontinuation due to side effects), ESR, CRP, ferritin, 25-OH-Vitamin D, folic acid, cyanocobalamin and concomitant treatment (none, corticosteroids, methotrexate, colchicine or anti-TNF). Descriptive statistics were performed using mean and standard deviation (SD) for continuous quantitative variables. Results: A total of 15 patients were being treated with apremilast for oral aphtosis. Ten of them were diagnosed with Behçet’s disease, one with psoriasis with oral aphtosis and four with recurrent oral aphtosis. Mean age was 47.1 ± 12.4 years, 67 % were female. 
All patients had active oral aphtosis at the time of starting apremilast. After 3 months of treatment, 87 % of the patients had improvement of the aphtosis (61.5 % partially and 38.5 % completely resolved); 13 % of the patients had to discontinue treatment due to side effects (diarrhea), none had to discontinue treatment due to lack of response. After 6 months, all the patients maintained improvement and only two of them discontinued apremilast after 12 months due to lack of response. Results of serological data analyzed were (mean ± SD): ESR 11.4 ± 6.7, CRP 3.7 ± 6.5, ferritin 100.8 ± 167.3, 25-OH-Vitamin D 31.4 ± 18.9, folic acid 11.7 ± 9.8, cyanocobalamin 585.8 ± 527. One patient maintained concomitant treatment with an anti-TNF drug, two with corticoids and five with colchicine. Conclusion: Treatment with apremilast seems to improve oral aphthae in over 87% of patients. References: [1] Hatemi G, Christensen R, Bang D, Bodaghi B, Celik AF, Fortune F, et al. 2018 update of the EULAR recommendations for the management of Behçet’s syndrome. Ann Rheum Dis 2018-06;77(6):808-818. [2] Hatemi G, Mahr A, Takeno M, Kim DY, Saadoun D, Direskeneli H, Melikoğlu M, Cheng S, McCue S, Paris M, Chen M, Yazici Y. Apremilast for oral ulcers associated with active Behçet’s syndrome over 68 weeks: long-term results from a phase 3 randomised clinical trial. Clin Exp Rheumatol. 2021 Sep-Oct;39 Suppl 132(5):80-87. doi: 10.55563/clinexprheumatol/ra8ize. Epub 2021 Oct 6. PMID: 34622764. [3] Stein Gold L, Papp K, Pariser D, Green L, Bhatia N, Sofen H, et al. Efficacy and safety of apremilast in patients with mild-to-moderate plaque psoriasis: Results of a phase 3, multicenter, randomized, double-blind, placebo-controlled trial. J Am Acad Dermatol 2022-01;86(1):77-85. Graph 1. Treatment response of oral aphtae after 12 months. Acknowledgements: NIL. Disclosure of Interests: None Declared.
12

Su, Na, Shujuan Ji, and Jimin Liu. "Real-Time Topic Detection with Dynamic Windows." Computer Journal 63, no. 3 (June 4, 2019): 469–78. http://dx.doi.org/10.1093/comjnl/bxz042.

Abstract:
Microblog is a popular social network in which hot topics propagate online rapidly. Real-time topic detection can not only help in understanding public opinion but also bring high commercial value. We design a method for real-time microblog data analysis in order to detect popular long-lasting events as well as emerging events. Firstly, a frequent-items mining algorithm on the microblog data stream is proposed to count approximate word frequency. This algorithm can find the words that are frequent over a period of time. Secondly, the window size of the monitored words is adjusted dynamically according to the duration and evolution of events. Lastly, new topics and trends of existing topics can be detected by using a dynamic clustering algorithm based on the vector space model. Experimental results show that the proposed algorithms can improve performance in terms of running time and accuracy.
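The abstract does not say which frequent-items algorithm the authors use. As an illustration of approximate word counting over a stream in bounded memory, the sketch below uses the classic Misra-Gries summary as a stand-in; the function name, the parameter `k`, and the toy stream are assumptions for this example and are not taken from the paper.

```python
def misra_gries(stream, k):
    """Misra-Gries summary: keeps at most k - 1 counters.

    Any item occurring more than n / k times in a stream of length n is
    guaranteed to be among the returned keys; the stored counts are
    underestimates of the true frequencies.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Toy stream: "a" makes up 5 of 9 words, i.e. more than half of the stream.
words = "a b a c a d a e a".split()
summary = misra_gries(words, k=2)
```

With `k=2` only one counter is kept, so the summary can retain just the majority word; a larger `k` trades memory for the ability to track more candidate words.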
13

Elgort, Irina, Marc Brysbaert, Michaël Stevens, and Eva Van Assche. "CONTEXTUAL WORD LEARNING DURING READING IN A SECOND LANGUAGE." Studies in Second Language Acquisition 40, no. 2 (June 21, 2017): 341–66. http://dx.doi.org/10.1017/s0272263117000109.

Abstract:
Reading affords opportunities for L2 vocabulary acquisition. Empirical research into the pace and trajectory of this acquisition has both theoretical and applied value. Charting the development of different aspects of word knowledge can verify and inform theoretical frameworks of word learning and reading comprehension. It can also inform practical decisions about using L2 readings in academic study. Monitoring readers’ eye movements provides real-time data on word learning, under the conditions that closely approximate adult L2 vocabulary acquisition from reading. In this study, Dutch-speaking university students read an English expository text, while their eye movements were recorded. Of interest were patterns of change in the eye movements on the target low-frequency words that occurred multiple times in the text, and whether differences in the processing of target and control (known) words decreased over time. Target word reading outside of the familiar text was examined in a posttest using semantically neutral sentences. The findings show that orthographic processing develops relatively quickly and reliably. However, online retrieval of meaning remains insufficient for fluent word-to-text integration even after multiple contextual encounters.
14

Plotnick, Roy E. "Creating Models for Interpreting Data." Paleontological Society Special Publications 11 (2002): 275–88. http://dx.doi.org/10.1017/s2475262200009989.

Abstract:
For many of us, the word model may trigger remembrances of long-ago afternoons spent painstakingly gluing a small plastic car or airplane model together (for others, it may conjure up images of a somewhat emaciated young woman staring out of the cover of Vogue or Elle). The model car is not the same as a real car; it is made of different materials, it has many fewer parts, and it does not move. Nevertheless, it resembles a real automobile sufficiently that we recognize it as a realistic representation. Similarly, scientists use the term model to refer to a reconstruction of nature for the purpose of study (Levins, 1966). In other words, in order to understand nature, one may not always want to study it directly. Instead, understanding may come from studying a facsimile of nature that captures what is perceived to be its essential properties. In the same way, a child may learn how a car is built by building a plastic model of it, even if this model contains no moving parts.
15

Plotnick, Roy E. "Creating Models for Interpreting Data." Paleontological Society Special Publications 9 (1999): 343–58. http://dx.doi.org/10.1017/s2475262200014180.

Abstract:
For many of us, the word model may trigger remembrances of long-ago afternoons spent painstakingly gluing a small plastic car or airplane model together (for others, it may conjure up images of a somewhat emaciated young woman staring out of the cover of Vogue or Elle). The model car is not the same as a real car; it is made of different materials; it has many fewer parts, and it does not move. Nevertheless, it resembles a real automobile sufficiently that we recognize it as a realistic representation. Similarly, scientists use the term model to refer to a reconstruction of nature for the purpose of study (Levins, 1966). In other words, in order to understand nature, one may not always want to study it directly. Instead, understanding may come from studying a facsimile of nature that captures what is perceived to be its essential properties. In the same way, a child may learn how a car is built by building a plastic model of it, even if this model contains no moving parts.
16

Aisyah, Siti, and Afriliana Larasati. "An Analysis of Slang Word in Avril Lavigne Song Lyrics “Head Above Water”." JELT: Journal of English Language Teaching 5, no. 2 (November 13, 2021): 180. http://dx.doi.org/10.33087/jelt.v5i2.94.

Abstract:
This study analyzes the slang words in the song lyrics of Avril Lavigne's “Head Above Water” album. It aimed to identify the types of slang words found in the lyrics, to find out their meanings, and to analyze how the slang words are realized in the lyrics. Descriptive qualitative research was applied. The data were taken from the lyrics of the “Head Above Water” album, in which 74 words were found. The data were analyzed by identifying the four types of word-formation process in the songs, classifying the words into each type of slang, and analyzing them as slang words. Four types of slang were found in the selected lyrics, namely blending, coinage, clipping, and compounding. Clipping was the most widely used type of slang in the 8 selected songs of the album. The researcher identified the slang words by reading all the lyrics in detail and analyzing each word to find its correct or real meaning, and then decided whether the word could be called a slang word. Key words: Analysis, Slang Word, Avril Lavigne Song Lyrics
17

Pinzón-Arenas, Javier Orlando, and Robinson Jiménez-Moreno. "Comparison between handwritten word and speech record in real-time using CNN architectures." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 4 (August 1, 2020): 4313. http://dx.doi.org/10.11591/ijece.v10i4.pp4313-4321.

Abstract:
This paper presents the development of a system for comparing spoken and written words by means of deep learning techniques. Ten words are acquired by means of an audio function, and these same words are written by hand and acquired by a webcam, in order to verify whether the two inputs match and show whether or not it is the required word. For this, 2 different CNN architectures were used, one for each function: for voice recognition, a suitable CNN was used to identify complete words by means of their features obtained with mel frequency cepstral coefficients, while for handwriting, a Faster R-CNN was used, so that it both locates and identifies the captured word. To implement the system, an easy-to-use graphical interface was developed, which unites the two neural networks for its operation. With this, tests were performed in real time, obtaining a general accuracy of 95.24%, which shows the good performance of the implemented system, together with a response time of less than 200 ms in making the comparison.
18

Dhanani, Jenish, Rupa Mehta, and Dipti Rana. "Sentiment Weighted Word Embedding for Big Text Data." International Journal of Web-Based Learning and Teaching Technologies 16, no. 6 (November 2021): 1–17. http://dx.doi.org/10.4018/ijwltt.20211101.oa2.

Abstract:
Sentiment analysis is the practice of eliciting a sentiment orientation of people's opinions (i.e. positive, negative and neutral) toward the specific entity. Word embedding technique like Word2vec is an effective approach to encode text data into real-valued semantic feature vectors. However, it fails to preserve sentiment information that results in performance deterioration for sentiment analysis. Additionally, big sized textual data consisting of large vocabulary and its associated feature vectors demands huge memory and computing power. To overcome these challenges, this research proposed a MapReduce based Sentiment weighted Word2Vec (MSW2V), which learns the sentiment and semantic feature vectors using sentiment dictionary and big textual data in a distributed MapReduce environment, where memory and computing power of multiple computing nodes are integrated to accomplish the huge resource demand. Experimental results demonstrate the outperforming performance of the MSW2V compared to the existing distributed and non-distributed approaches.
19

Nugroho, Radityo Tri, and Afifah Linda Sari. "A STUDY ON THE REGISTER OF POLITY." LET: Linguistics, Literature and English Teaching Journal 9, no. 1 (June 30, 2019): 22. http://dx.doi.org/10.18592/let.v9i1.3076.

Abstract:
The language of Indonesian politics is quite interesting. There are many phenomena of language abuse, such as language manipulation and meaning reduction, which often occur. This research is meant to find 1) the form of the polity register, 2) the kinds of meaning of the polity register and 3) the characteristics of the polity register. As for the research method, the writer applies a qualitative method. The writer takes the data from the Indonesian newspaper “Suara Merdeka”, which contains the polity register. She collects the data by reading, scrutinizing, and listing them. To analyze the data, the writer uses semantic analysis based on Poedjosoedarmo's theory to determine the changing meaning of the register. Having analyzed the data, the writer finds two forms of polity register, namely word and phrase. The words are in the form of simple word (26.7%), compound word (26.7%), and complex word (24.4%), while abbreviation and blended each account for 2.2%. The form of phrase accounts for 17.8%. Related to the meaning, the writer finds five kinds of changing meaning, namely: narrower than the real meaning (8.9%), share some features whereas each of them has different meaning (24.5%), identical to the real meaning (13.3%), and different from the real meaning (53.3%). Meanwhile, the language styles of the Indonesian polity register are euphemism (57.7%), metaphor (6.7%), hyperbole (13.3%), and metonymy (22.2%). The result of this research shows that the largest share of the register meanings of the Indonesian polity is different from the real meaning and tends to be euphemism. Those phenomena indicate that the polity register used by Indonesian politicians tends to hide the real meaning. This may happen since there are some interests behind the Indonesian polity register, among others to maintain and to retain power. In sum, the register is used to keep the authority going.
20

Ruiz, L., S. Hartz, I. Redondo, H. Lunagaria, and Ö. Åkerborg. "PSY27 Real Word DATA on Surgery in a Cohort of Ulcerative Colitis Patients." Value in Health 23 (December 2020): S747. http://dx.doi.org/10.1016/j.jval.2020.08.2035.

21

Giyatmi, Giyatmi, Sihindun Arumi, and Mas Sulis Setiyono. "WORD FORMATION OF MESSAGING APPLICATIONS FOUND IN PLAY STORE." Lire Journal (Journal of Linguistics and Literature) 5, no. 1 (March 21, 2021): 92–109. http://dx.doi.org/10.33019/lire.v5i1.106.

Abstract:
This study aims at describing the process of word formation used in the names of messaging applications found in the Play Store. This is descriptive qualitative research. The data are names of messaging applications written in English and in the form of words. To collect the data, the researchers use observation. The data analysis consists of three steps, namely data reduction, data display, and verification. There are 56 data found. There are 6 types of word formation: affixation (4 data), compounding (15 data), blending (4 data), coinage (8 data), clipping (4 data), and reduplication (1 data). However, there are 20 messaging applications that cannot be classified into a type of word formation, such as Path, Line, Lemon, etc. They are simple words that already exist in English and are used in everyday communication. Nowadays they are also used as names of messaging applications and have meanings different from their real meaning. The suffixes used in the affixation process are –er, -ous, -ster. There are 6 formations of compounding used in the messaging applications, such as N+N, V+V, N+V, V+N, Adv. + Prep. There are 3 ways of blending: taking the whole of the first word and the first syllable of the second, taking the first syllable of the first word and the whole of the second word, and taking two syllables from the front part of the first word and the last syllable of the second word. Coinage consists of the name of the company and the name of the product. There are two types of clipping found, namely fore-clipping and back-clipping. Reduplication happens when part of the word is copied. Apparently, morphological processes such as word formation are used in everyday life to name messaging applications.
22

Liu, Jiguo, Chao Liu, Nan Li, Shihao Gao, Mingqi Liu, and Dali Zhu. "LADA-Trans-NER: Adaptive Efficient Transformer for Chinese Named Entity Recognition Using Lexicon-Attention and Data-Augmentation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (June 26, 2023): 13236–45. http://dx.doi.org/10.1609/aaai.v37i11.26554.

Abstract:
Recently, word enhancement has become very popular for Chinese Named Entity Recognition (NER), reducing segmentation errors and increasing the semantic and boundary information of Chinese words. However, these methods tend to ignore the semantic relationship before and after the sentence after integrating lexical information. Therefore, the regularity of word length information has not been fully explored in various word-character fusion methods. In this work, we propose a Lexicon-Attention and Data-Augmentation (LADA) method for Chinese NER. We discuss the challenges of using existing methods in incorporating word information for NER and show how our proposed methods could be leveraged to overcome those challenges. LADA is based on a Transformer Encoder that utilizes a lexicon to construct a directed graph and fuses word information through updating the optimal edge of the graph. Specifically, we introduce an advanced data augmentation method to obtain the optimal representation for the NER task. Experimental results show that the augmentation done using LADA can considerably boost the performance of our NER system and achieve significantly better results than previous state-of-the-art methods and variant models in the literature on four publicly available NER datasets, namely Resume, MSRA, Weibo, and OntoNotes v4. We also observe better generalization and application to a real-world setting from LADA on multi-source complex entities.
23

Pirró, Giuseppe. "REWOrD: Semantic Relatedness in the Web of Data." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 129–35. http://dx.doi.org/10.1609/aaai.v26i1.8107.

Abstract:
This paper presents REWOrD, an approach to compute semantic relatedness between entities in the Web of Data representing real word concepts. REWOrD exploits the graph nature of RDF data and the SPARQL query language to access this data. Through simple queries, REWOrD constructs weighted vectors keeping the informativeness of RDF predicates used to make statements about the entities being compared. The most informative path is also considered to further refine informativeness. Relatedness is then computed by the cosine of the weighted vectors. Differently from previous approaches based on Wikipedia, REWOrD does not require any preprocessing or custom data transformation. Indeed, it can leverage whatever RDF knowledge base as a source of background knowledge. We evaluated REWOrD in different settings by using a new dataset of real word entities and investigate its flexibility. As compared to related work on classical datasets, REWOrD obtains comparable results while, on one side, it avoids the burden of preprocessing and data transformation and, on the other side, it provides more flexibility and applicability in a broad range of domains.
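The cosine step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each entity is already described by a sparse vector mapping RDF predicates to informativeness weights; the predicate names and weights below are hypothetical, and how REWOrD actually derives the weights via SPARQL is not reproduced here.

```python
import math


def cosine_relatedness(weights_a, weights_b):
    """Cosine of two sparse weighted vectors given as {predicate: weight} dicts."""
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[p] * weights_b[p] for p in shared)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


# Hypothetical predicate weights for two entities (not taken from the paper).
entity_a = {"dbo:director": 2.0, "dbo:starring": 1.0}
entity_b = {"dbo:director": 2.0, "dbo:producer": 1.0}

print(cosine_relatedness(entity_a, entity_b))  # about 0.8
```

Only predicates shared by both entities contribute to the dot product, so entities described by disjoint predicate sets get a relatedness of zero in this sketch.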
24

Herrick, Dylan. "AN ACOUSTIC DESCRIPTION OF CENTRAL CATALAN VOWELS BASED ON REAL AND NONSENSE WORD DATA." Catalan Review 21, no. 1 (January 1, 2007): 231–56. http://dx.doi.org/10.3828/catr.21.10.

Abstract:
This paper examines the extent to which vowel height data taken from real words differs from data taken from nonsense words, and it finds no significant differences. As a result, it provides quantitative acoustic data for the seven stressed and three unstressed vowels of Standard Catalan (as uttered by female speakers). The data are drawn from three distinct phonetic contexts, i.e., /bVp/, /bVt/, and /bVk/, and the /bVp/ context consists entirely of nonsense words (the other contexts were all real words). A comparison and statistical analysis of the data for each vowel phoneme show that there are neither considerable nor statistically significant differences in the vowel height (F1 values) among the data from the three different phonetic contexts. In terms of vowel height, nonsense words provide as accurate a picture of the Catalan data as real words do.
25

Gu Yueguo. "From real-life situated discourse to video-stream data-mining." International Journal of Corpus Linguistics 14, no. 4 (December 15, 2009): 433–66. http://dx.doi.org/10.1075/ijcl.14.4.01gu.

Abstract:
This paper presents an argument for agent-oriented modeling (AOM) as a research methodology and a metalanguage for corpus linguistics. It is triggered by three closely related issues arising from compiling multimodal corpora such as the Spoken Chinese Corpora of Situated Discourse (SCCSD). Given a real-life situation, there are three types of representation: (i) the Written Word representation, (ii) audio recording, and (iii) video recording. It is shown that the three types are all data-transformative and involve data loss, and that they are intrinsically flawed. The current multiple-layered approach to data integration is also shown to be inadequate. AOM is proposed to be a potential solution to the problems. Modeling decision tree, levels of modeling, and modeling schema written in XML are demonstrated. The philosophical basis of AOM, and its theoretical implications are also discussed.
26

Charaabi, L. "FPGA-Based Fixed Point Implementation of a Real-Time Induction Motor Emulator." Advances in Power Electronics 2012 (October 31, 2012): 1–10. http://dx.doi.org/10.1155/2012/409671.

Abstract:
This paper investigates the numerical issue of a discrete-time induction-motor emulator implementation. The stability analysis of the finite-word-length implementation shows a coupling between required word length and the sample rate. We propose specific guidelines to analyze this coupling and to estimate the required data word length for both signals and coefficients of the model. To respect algorithm requirements, an FPGA-based implementation was used for architecture development. The direct torque control is implemented to verify in real time the AC-motor emulator prototype.
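The coupling between word length and sample rate noted in this abstract can be illustrated with a generic sketch that is not the paper's analysis. It assumes a single first-order pole exp(-T/tau) stored as a signed fixed-point coefficient; the time constant, word lengths, and fraction-bit split are hypothetical choices for illustration.

```python
import math


def quantize(value, word_length, frac_bits):
    """Round to signed fixed point with saturation; return the represented value."""
    scale = 1 << frac_bits
    lowest = -(1 << (word_length - 1))
    highest = (1 << (word_length - 1)) - 1
    raw = max(lowest, min(highest, round(value * scale)))
    return raw / scale


TAU = 0.1  # hypothetical time constant in seconds


def quantized_pole(sample_period, word_length, frac_bits):
    """Discrete-time pole exp(-T/tau) after coefficient quantization."""
    pole = math.exp(-sample_period / TAU)
    return quantize(pole, word_length, frac_bits)


for sample_period in (1e-3, 1e-4):
    for word_length in (8, 16):
        pole = quantized_pole(sample_period, word_length, word_length - 2)
        status = "stable" if abs(pole) < 1.0 else "not strictly stable"
        print(sample_period, word_length, pole, status)
```

In this toy setting, shortening the sample period pushes the pole toward 1, and an 8-bit coefficient rounds it onto the unit circle while a 16-bit coefficient keeps it inside. That is one simple way a faster sample rate can demand a longer word.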
27

Hsu, Chung-Chian, Wei-Cyun Tsao, Arthur Chang, and Chuan-Yu Chang. "Analyzing mixed-type data by using word embedding for handling categorical features." Intelligent Data Analysis 25, no. 6 (October 29, 2021): 1349–68. http://dx.doi.org/10.3233/ida-205453.

Abstract:
Most real-world datasets are of mixed type, including both numeric and categorical attributes. Unlike numbers, operations on categorical values are limited, and the degree of similarity between distinct values cannot be measured directly. In order to properly analyze mixed-type data, dedicated methods to handle categorical values in the datasets are needed. The limitation of most existing methods is the lack of appropriate numeric representations of categorical values. Consequently, some analysis algorithms cannot be applied. In this paper, we address this deficiency by transforming categorical values to their numeric representation so as to facilitate various analyses of mixed-type data. In particular, the proposed transformation method preserves the semantics of categorical values with respect to the other values in the dataset, resulting in better performance on data analyses including classification and clustering. The proposed method is verified and compared with other methods on extensive real-world datasets.
28

Hagedorn, Christina, Michael Proctor, Louis Goldstein, Stephen M. Wilson, Bruce Miller, Maria Luisa Gorno-Tempini, and Shrikanth S. Narayanan. "Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging." Journal of Speech, Language, and Hearing Research 60, no. 4 (April 14, 2017): 877–91. http://dx.doi.org/10.1044/2016_jslhr-s-15-0112.

Abstract:
Purpose: Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. Analysis of apraxic speech errors within a dynamic systems framework is provided and the nature of pathomechanisms of apraxic speech discussed. Method: One adult male speaker with apraxia of speech was imaged using real-time MRI while producing spontaneous speech, repeated naming tasks, and self-paced repetition of word pairs designed to elicit speech errors. Articulatory data were analyzed, and speech errors were detected using time series reflecting articulatory activity in regions of interest. Results: Real-time MRI captured two types of apraxic gestural intrusion errors in a word pair repetition task. Gestural intrusion errors in nonrepetitive speech, multiple silent initiation gestures at the onset of speech, and covert (unphonated) articulation of entire monosyllabic words were also captured. Conclusion: Real-time MRI and accompanying analytical methods capture and quantify many features of apraxic speech that have been previously observed using other modalities while offering high spatial resolution. This patient's apraxia of speech affected the ability to select only the appropriate vocal tract gestures for a target utterance, suppressing others, and to coordinate them in time.
29

Md, Abdul Quadir, Raghav V. Anand, Senthilkumar Mohan, Christy Jackson Joshua, Sabhari S. Girish, Anthra Devarajan, and Celestine Iwendi. "Data-Driven Analysis of Privacy Policies Using LexRank and KL Summarizer for Environmental Sustainability." Sustainability 15, no. 7 (March 29, 2023): 5941. http://dx.doi.org/10.3390/su15075941.

Abstract:
Natural language processing (NLP) is a field in machine learning that analyses and manipulates huge amounts of data and generates human language. There are a variety of applications of NLP such as sentiment analysis, text summarization, spam filtering, language translation, etc. Since privacy documents are important and legal, they play a vital part in any agreement. These documents are very long, but the important points still have to be read thoroughly. Customers might not have the necessary time or the knowledge to understand all the complexities of a privacy policy document. In this context, this paper proposes an optimal model to summarize the privacy policy in the best possible way. The methodology of text summarization is the process where the summaries from the original huge text are extracted without losing any vital information. Using the proposed idea of a common word reduction process combined with natural language processing algorithms, this paper extracts the sentences in the privacy policy document that hold high weightage and displays them to the customer, and it can save the customer’s time from reading through the entire policy while also providing the customers with only the important lines that they need to know before signing the document. The proposed method uses two different extractive text summarization algorithms, namely LexRank and Kullback Leibler (KL) Summarizer, to summarize the obtained text. According to the results, the summarized sentences obtained via the common word reduction process and text summarization algorithms were more significant than the raw privacy policy text. The introduction of this novel methodology helps to find certain important common words used in a particular sector to a greater depth, thus allowing more in-depth study of a privacy policy. Using the common word reduction process, the sentences were reduced by 14.63%, and by applying extractive NLP algorithms, significant sentences were obtained.
The results after applying NLP algorithms showed a 191.52% increase in the repetition of common words in each sentence using the KL summarizer algorithm, while the LexRank algorithm showed a 361.01% increase in the repetition of common words. This implies that common words play a large role in determining a sector’s privacy policies, making our proposed method a real-world solution for environmental sustainability.
30

Schain, Frida, Annica Dominicus, Fredrik Borgsten, Marlene Mozart, and Magnus Björkholm. "Real-word data on autologous stem cell transplantation in older patients with multiple myeloma." Annals of Hematology 99, no. 2 (December 10, 2019): 375–76. http://dx.doi.org/10.1007/s00277-019-03878-6.

31

Mei, Xiangping, Qiaoyun Tan, and Jing Yao. "Toxicities associated with sacituzumab govitecan: Data from clinical trials and real-word pharmacovigilance database." Journal of Clinical Oncology 41, no. 16_suppl (June 1, 2023): e24093-e24093. http://dx.doi.org/10.1200/jco.2023.41.16_suppl.e24093.

Abstract:
Background: This study aimed to analyze the adverse effects of sacituzumab govitecan (SG) using data from multiple sources, to provide a reference for clinical medication safety management. Methods: Clinical trials of SG with available safety data were searched and included in the pooled analysis (up to January 5th, 2023). The adverse drug reaction (ADR) signals of SG were collected from the FDA Adverse Event Reporting System (FAERS) database up to January 1st, 2023. The ADR signals were presented and sorted by incidence frequency and reporting odds ratio (ROR) strength, respectively. We also searched and summarized drug interactions with SG in the DDInter database. Results: A total of 6 clinical trials enrolling 1737 patients were included in the pooled analysis. The most common adverse events of grade ≥3 were neutropenia (51.55%), leukopenia (14.50%), anemia (10.69%), diarrhea (7.30%), and fatigue and asthenia (4.23%). In the pharmacovigilance study, 1024 AE reports were extracted; the most common toxicities of SG were hematologic and gastrointestinal. AEs not included in the drug instructions also showed high signals, such as meningitis, colitis and lymphedema. A total of 40 drugs were identified that can induce drug-drug interactions when administered concomitantly with SG. Conclusions: This study provides a comprehensive profile of SG based on clinical trials and the FAERS and DDInter databases. Attention should be paid not only to the common ADRs, but also to ADRs not reported in the drug instructions and to drugs that may induce drug-drug interactions. [Table: see text]
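The reporting odds ratio (ROR) used to rank signals in this abstract is commonly computed from a 2x2 table of spontaneous reports. The sketch below shows that standard calculation with a Wald-type 95% confidence interval; the counts are hypothetical and are not FAERS data, and the "lower bound above 1" signal rule is a widely used convention assumed here, not a criterion stated in the abstract.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """ROR with 95% CI from a 2x2 report table.

    a: target drug, target event      b: target drug, other events
    c: other drugs, target event      d: other drugs, other events
    """
    ror = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * standard_error)
    upper = math.exp(math.log(ror) + 1.96 * standard_error)
    return ror, lower, upper


# Hypothetical counts for illustration only.
ror, lower, upper = reporting_odds_ratio(a=20, b=980, c=100, d=98900)
is_signal = lower > 1.0  # assumed convention: CI lower bound above 1
print(round(ror, 2), round(lower, 2), round(upper, 2), is_signal)
```

All four cells must be non-zero for this formula; real pharmacovigilance pipelines usually add a minimum case count or a continuity correction, which this sketch omits.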
32

Kovács, László, Katalin Orosz, and Peter Pollner. "Growing Networks – Modelling the Growth of Word Association Networks for Hungarian and English." Investigationes Linguisticae, no. 45 (December 30, 2021): 67–82. http://dx.doi.org/10.14746/il.2021.45.5.

Abstract:
In the new era of information and communication technology, the representation of information is of increasing importance. Knowing how words are connected to each other in the mind and what processes facilitate the creation of connections could result in better optimized applications, e.g. in computer aided education or in search engines. This paper models the growth process of a word association database with an algorithm. We present the network structure of word associations for an agglutinative language and compare it with the network of English word associations. Using the real-world data so obtained, we create a model that reproduces the main features of the observed growth process and show the evolution of the network. The model describes the growth of the word association data as a mixture of a topic based process and a random process. The model makes it possible to gain insight into the overall processes which are responsible for creating an interconnected mental lexicon.
33

Reddy, Vookanti Anurag, CH Vamsidhar Reddy, and Dr R. Lakshminarayanan. "Fake News Detection using Machine Learning." International Journal for Research in Applied Science and Engineering Technology 10, no. 4 (April 30, 2022): 227–30. http://dx.doi.org/10.22214/ijraset.2022.41124.

Abstract:
This project applies NLP (Natural Language Processing) techniques to detecting 'fake news', that is, misleading news stories that come from non-reputable sources. Building a model based only on a count vectorizer (using word tallies) or a tf-idf (Term Frequency-Inverse Document Frequency) matrix (word tallies relative to how often they’re used in other articles in your dataset) can only get you so far, because these models do not consider important qualities like word ordering and context. It is very possible that two articles that are similar in their word counts will be completely different in their meaning. The data science community has responded by taking action against the problem. There is a Kaggle competition called the “Fake News Challenge”, and Facebook is employing AI to filter fake news stories out of users’ feeds. Combatting fake news is a classic text classification project with a straightforward proposition: is it possible to build a model that can differentiate between “real” news and “fake” news? The proposed work assembles a dataset of both fake and real news and employs a Naive Bayes classifier to create a model that classifies an article as fake or real based on its words and phrases.
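The word-tally plus Naive Bayes pipeline described in this abstract can be sketched with the standard library alone. This is a minimal multinomial Naive Bayes with add-one smoothing over whitespace tokens; the four training headlines are invented for illustration and the authors' actual dataset and tooling are not reproduced.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Collect per-label word tallies, label counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    vocabulary = set()
    for text, label in zip(documents, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy headlines, for illustration only.
documents = [
    "shocking miracle cure doctors hate",
    "celebrity secret shocking truth revealed",
    "parliament passes budget bill",
    "central bank raises interest rates",
]
labels = ["fake", "fake", "real", "real"]
model = train_naive_bayes(documents, labels)
print(classify("shocking secret cure", model))
print(classify("bank budget bill", model))
```

Because the model only sees word tallies, reordering the words of a headline does not change its score, which is the limitation the abstract points out.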
34

Kim, Kangmin, and Chanjun Chun. "Synthetic Data Generator for Solving Korean Arithmetic Word Problem." Mathematics 10, no. 19 (September 27, 2022): 3525. http://dx.doi.org/10.3390/math10193525.

Abstract:
A math word problem (MWP) comprises mathematical logic, numbers, and natural language. To solve these problems, a solver model requires an understanding of language and the ability to reason. Since the 1960s, research on the design of a model that provides automatic solutions for mathematical problems has been continuously conducted, and numerous methods and datasets have been published. However, the published datasets in Korean are insufficient. In this study, we propose a Korean data generator for the first time to address this issue. The proposed data generator comprised problem types and data variations. Moreover, it has 4 problem types and 42 subtypes. The data variation has four categories, which adds robustness to the model. In total, 210,311 pieces of data were used for the experiment, of which 210,000 data points were generated. The training dataset had 150,000 data points. Each validation and test dataset had 30,000 data points. Furthermore, 311 problems were sourced from commercially available books on mathematical problems. We used these problems to evaluate the validity of our data generator on actual math word problems. The experiments confirm that models developed using the proposed data generator can be applied to real data. The proposed generator can be used to solve Korean MWPs in the field of education and the service industry, as well as serve as a basis for future research in this field.
35

Zhao, Yi, Chong Wang, Jian Wang, and Keqing He. "Incorporating LDA With Word Embedding for Web Service Clustering." International Journal of Web Services Research 15, no. 4 (October 2018): 29–44. http://dx.doi.org/10.4018/ijwsr.2018100102.

Abstract:
With the rapid growth of web services on the internet, web service discovery has become a hot topic in services computing. Faced with the heterogeneous and unstructured service descriptions, many service clustering approaches have been proposed to promote web service discovery, and many other approaches leveraged auxiliary features to enhance the classical LDA model to achieve better clustering performance. However, these extended LDA approaches still have limitations in processing data sparsity and noise words. This article proposes a novel web service clustering approach by incorporating LDA with word embedding, which leverages relevant words obtained based on word embedding to improve the performance of web service clustering. Especially, the semantically relevant words of service keywords by Word2vec were used to train the word embeddings and then incorporated into the LDA training process. Finally, experiments conducted on a real-world dataset published on ProgrammableWeb show that the authors' proposed approach can achieve better clustering performance than several classical approaches.
36

Karsi, Redouane, Mounia Zaim, and Jamila El Alami. "Leveraging Pre-Trained Contextualized Word Embeddings to Enhance Sentiment Classification of Drug Reviews." Revue d'Intelligence Artificielle 35, no. 4 (August 31, 2021): 307–14. http://dx.doi.org/10.18280/ria.350405.

Abstract:
Traditionally, pharmacovigilance data are collected during clinical trials on a small sample of patients and are therefore insufficient to adequately assess drugs. Nowadays, consumers use online drug forums to share their opinions and experiences about medication. These feedbacks, which are widely available on the web, are automatically analyzed to extract relevant information for decision-making. Currently, sentiment analysis methods are being put forward to leverage consumers' opinions and produce useful drug monitoring indicators. However, these methods' effectiveness depends on the quality of word representation, which presents a real challenge because the information contained in user reviews is noisy and very subjective. Over time, several sentiment classification problems use machine learning methods based on the traditional bag of words model, sometimes enhanced with lexical resources. In recent years, word embedding models have significantly improved classification performance due to their ability to capture words' syntactic and semantic properties. Unfortunately, these latter models are weak in sentiment classification tasks because they are unable to encode sentiment information in the word representation. Indeed, two words with opposite polarities can have close word embeddings as they appear together in the same context. To overcome this drawback, some studies have proposed refining pre-trained word embeddings with lexical resources or learning word embeddings using training data. However, these models depend on external resources and are complex to implement. This work proposes a deep contextual word embeddings model called ELMo that inherently captures the sentiment information by providing separate vectors for words with opposite polarities. Different variants of our proposed model are compared with a benchmark of pre-trained word embeddings models using SVM classifier trained on Drug Review Dataset. 
Experimental results show that ELMo embeddings improve classification performance in sentiment analysis tasks on the pharmaceutical domain.
37

He, Yuejun, Bradley Camburn, Jianxi Luo, Maria C. Yang, and Kristin L. Wood. "Visual Sensemaking of Massive Crowdsourced Data for Design Ideation." Proceedings of the Design Society: International Conference on Engineering Design 1, no. 1 (July 2019): 409–18. http://dx.doi.org/10.1017/dsi.2019.44.

Abstract:
Textual idea data from online crowdsourcing contains rich information of the concepts that underlie the original ideas and can be recombined to generate new ideas. But representing such information in a way that can stimulate new ideas is not a trivial task, because crowdsourced data are often vast and in unstructured natural languages. This paper introduces a method that uses natural language processing to summarize a massive number of idea descriptions and represents the underlying concept space as word clouds with a core-periphery structure to inspire recombinations of such concepts into new ideas. We report the use of this method in a real public-sector-sponsored project to explore ideas for future transportation system design. Word clouds that represent the concept space underlying original crowdsourced ideas are used as ideation aids and stimulate many new ideas with varied novelty, usefulness and feasibility. The new ideas suggest that the proposed method helps expand the idea space. Our analysis of these ideas and a survey with the designers who generated them shed light on how people perceive and use the word clouds as ideation aids and suggest future research directions.
38

Ibrahim, Valentina, Juhaid Abu Bakar, Nor Hazlyna Harun, and Alaa Fareed Abdulateef. "A Word Cloud Model based on Hate Speech in an Online Social Media Environment." Baghdad Science Journal 18, no. 2(Suppl.) (June 20, 2021): 0937. http://dx.doi.org/10.21123/bsj.2021.18.2(suppl.).0937.

Abstract:
Social media platforms are known as detectors that are used to measure the activities of users in the real world. However, the huge and unfiltered feed of messages posted on social media triggers social warnings, particularly when these messages contain hate speech towards a specific individual or community. The negative effect of these messages on individuals or the society at large is of great concern to governments and non-governmental organizations. Word clouds provide a simple and efficient means of visually transferring the most common words from text documents. This research aims to develop a word cloud model based on hateful words in online social media environments such as Google News. Several steps are involved, including data acquisition and pre-processing, feature extraction, model development, visualization and viewing of the word cloud model result. The results present an image in a series of text describing the top words. This model can be considered as a simple way to exchange high-level information without overloading the user with details.
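The counting step that underlies any word cloud, finding the most common words after basic pre-processing, can be sketched as follows. The stop-word list, tokenization rule, and sample texts are hypothetical placeholders; the abstract does not specify the authors' pre-processing or rendering tools.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "and", "of", "to", "is", "in"}


def top_words(texts, n=3):
    """Return the n most frequent non-stop-words across all texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Neutral placeholder texts standing in for collected posts.
sample_texts = ["the cat and the dog", "a dog is a dog"]
print(top_words(sample_texts, n=2))
```

A word cloud renderer would then scale each word's font size by its count, so the tallies returned here are the only input it needs.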
39

Hudeček, Lana. "Hair Colour Stereotypes in Croatian Language Corpora." Collegium antropologicum 46, no. 3 (2022): 197–206. http://dx.doi.org/10.5671/ca.46.3.3.

Abstract:
The paper shows how hair colour stereotypes are reflected in two Croatian language corpora: the Croatian Language Corpus hrWaC and the Croatian Language Repository. Both were searched with the Sketch Engine corpus tool, utilizing the word sketches function, which shows the information on the most common collocations in which a lemma occurs. Synonymous words denoting female and male persons with fair, brown, black, ginger, or red hair were explored. The following hypotheses were confirmed or partially confirmed: women are more often defined by hair colour than men; more synonyms denote a female person of a particular hair colour than a male person; some synonyms appear in contexts suggesting stereotypes more often than others; in the formation of words especially denoting female persons of particular hair colour, some word-formation models are used to form pejorative and depreciative words and (by onymisation) animal names; and the adjectives pravi (‘real’), jedan (‘one’), and običan (‘ordinary’) serve as focus markers and suggest expressions reflecting stereotypes. Based on the conducted collocation and word-formation analysis, it is concluded that the collocations and word-formation models associated with hair colour words suggest various extralinguistic data, including the social status of women.
40

Hofmann, Klaus. "Stress in real time." Journal of Historical Linguistics 10, no. 3 (December 8, 2020): 452–86. http://dx.doi.org/10.1075/jhl.19030.hof.

Abstract:
This contribution reviews a series of studies by Kelly (and Bock), suggesting that stress preferences of English nouns and verbs for left-hand and right-hand stress patterns are partly a result of alternating rhythm in real utterances. This claim is tested on diachronic corpus data to verify its historical implications. By using verse evidence to calibrate stress values for historical word classes, the quantitative analysis confirms that distributional asymmetries regarding strong and weak syllables in the contexts of nouns and verbs have existed at least since Late Middle English. In addition, the claim that stem-final segments predict the likelihood of right-hand stress is not only confirmed but the effect is found to be independent of etymological origin.
41

Sanz-Garcia, Enrique, Benjamin Haibe-Kains, and Lillian L. Siu. "Using real-word data to evaluate the effects of broadening eligibility criteria in oncology trials." Cancer Cell 39, no. 6 (June 2021): 750–52. http://dx.doi.org/10.1016/j.ccell.2021.05.012.

42

KUMAKI, T., Y. KURODA, M. ISHIZAKI, T. KOIDE, H. J. MATTAUSCH, H. NODA, K. DOSAKA, K. ARIMOTO, and K. SAITO. "Real-Time Huffman Encoder with Pipelined CAM-Based Data Path and Code-Word-Table Optimizer." IEICE Transactions on Information and Systems E90-D, no. 1 (January 1, 2007): 334–45. http://dx.doi.org/10.1093/ietisy/e90-1.1.334.

43

Marzano, Gilberto, and Velta Lubkina. "CYBERBULLING AND REAL REALITY." SOCIETY, INTEGRATION, EDUCATION. Proceedings of the International Scientific Conference 2 (May 30, 2015): 412. http://dx.doi.org/10.17770/sie2013vol2.598.

Abstract:
There are various risks tied to cyberspace. Some of them are social risks because they are cultural risks, being related to new forms of relationships and interactions among people. In the last decade, toxic evils like cyberbullying and other malicious cyber violence have been growing, and the search for antidotes is becoming a common concern for governments, educational authorities, teachers, parents and children alike. The available data shows clear evidence that the number of persons affected by cyber violence is increasing (Shariff and Churchill, 2009; U.S. Department of Education, 2011; Dilmac, 2012; Catalano, 2012): a Google search of the word “cyberbullying” returns more than 11 million items. Despite the popularity of the word, there is limited knowledge of this issue, and many of the first conceptual formulations about it continue to be spread in the literature, such as that the characteristics of bullies who act face-to-face and those who do so in cyberspace are very different. The paper analyzes the classic model of cyberbullying behavior, as described in the literature, introducing a new element to be considered. It is that, especially for young people, the Web and the physical world are increasingly becoming a whole: virtual-web and real reality are a continuum that we could define as an e-real-reality. Analyzing two of the best-known cases of cyberbullying and considering other evidence that has emerged from recent research, we are theoretically convinced that a better understanding of this element could lead to the development of more effective strategies for combating cyberbullying.
44

Wang, Dingquan, and Jason Eisner. "The Galactic Dependencies Treebanks: Getting More Data by Synthesizing New Languages." Transactions of the Association for Computational Linguistics 4 (December 2016): 491–505. http://dx.doi.org/10.1162/tacl_a_00113.

Abstract:
We release Galactic Dependencies 1.0—a large set of synthetic languages not found on Earth, but annotated in Universal Dependencies format. This new resource aims to provide training and development data for NLP methods that aim to adapt to unfamiliar languages. Each synthetic treebank is produced from a real treebank by stochastically permuting the dependents of nouns and/or verbs to match the word order of other real languages. We discuss the usefulness, realism, parsability, perplexity, and diversity of the synthetic languages. As a simple demonstration of the use of Galactic Dependencies, we consider single-source transfer, which attempts to parse a real target language using a parser trained on a “nearby” source language. We find that including synthetic source languages somewhat increases the diversity of the source pool, which significantly improves results for most target languages.
45

Yoshida, Yasuhisa, Tsutomu Hirao, Tomoharu Iwata, Masaaki Nagata, and Yuji Matsumoto. "Transfer Learning for Multiple-Domain Sentiment Analysis — Identifying Domain Dependent/Independent Word Polarity." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (August 4, 2011): 1286–91. http://dx.doi.org/10.1609/aaai.v25i1.8081.

Abstract:
Sentiment analysis is the task of determining the attitude (positive or negative) of documents. While the polarity of words in the documents is informative for this task, polarity of some words cannot be determined without domain knowledge. Detecting word polarity thus poses a challenge for multiple-domain sentiment analysis. Previous approaches tackle this problem with transfer learning techniques, but they cannot handle multiple source domains and multiple target domains. This paper proposes a novel Bayesian probabilistic model to handle multiple source and multiple target domains. In this model, each word is associated with three factors: Domain label, domain dependence/independence and word polarity. We derive an efficient algorithm using Gibbs sampling for inferring the parameters of the model, from both labeled and unlabeled texts. Using real data, we demonstrate the effectiveness of our model in a document polarity classification task compared with a method not considering the differences between domains. Moreover our method can also tell whether each word's polarity is domain-dependent or domain-independent. This feature allows us to construct a word polarity dictionary for each domain.
46

Nguyen, Van Quan, Tien Nguyen Anh, and Hyung-Jeong Yang. "Real-time event detection using recurrent neural network in social sensors." International Journal of Distributed Sensor Networks 15, no. 6 (June 2019): 155014771985649. http://dx.doi.org/10.1177/1550147719856492.

Abstract:
We proposed an approach for temporal event detection using deep learning and multi-embedding on a set of text data from social media. First, a convolutional neural network augmented with multiple word-embedding architectures is used as a text classifier for the pre-processing of the input textual data. Second, an event detection model using a recurrent neural network is employed to learn time series data features by extracting temporal information. Recently, convolutional neural networks have been used in natural language processing problems and have obtained excellent results when operating on available embedding vectors. In this article, word-embedding features at the embedding layer are combined and fed to a convolutional neural network. The proposed method has no size limitation, supports more embeddings than standard multichannel-based approaches, and obtains similar performance (accuracy score) on some benchmark data sets, especially on an imbalanced data set. For event detection, a long short-term memory network is used as a predictor that learns higher-level temporal features so as to predict future values. An error distribution estimation model is built to calculate the anomaly score of each observation. Events are detected using a window-based method on the anomaly scores.
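The last two steps of the abstract, scoring prediction errors against an estimated error distribution and then detecting events over windows of scores, can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian error model, the squared z-score as the anomaly score, the mean-over-window rule, and the `window` and `threshold` values are all hypothetical choices, not details taken from the article.

```python
from statistics import mean, pstdev

def anomaly_scores(errors, train_errors):
    """Score each prediction error by its squared z-score.

    Assumption: errors are modeled as Gaussian, with mean and standard
    deviation estimated from errors observed on normal (event-free) data.
    """
    mu = mean(train_errors)
    sigma = pstdev(train_errors) or 1.0  # guard against zero spread
    return [((e - mu) / sigma) ** 2 for e in errors]

def detect_events(scores, window=3, threshold=30.0):
    """Return the start index of every window whose mean score exceeds threshold.

    Both `window` and `threshold` are illustrative values, not tuned ones.
    """
    events = []
    for start in range(len(scores) - window + 1):
        if mean(scores[start:start + window]) > threshold:
            events.append(start)
    return events

# Errors between predicted and observed values (made-up numbers).
train_errors = [0.1, -0.1, 0.0, 0.2, -0.2]
errors = [0.0, 0.1, 0.0, 1.0, 1.2, 0.9, 0.0, 0.1]
scores = anomaly_scores(errors, train_errors)
print(detect_events(scores))
```

Averaging over a window rather than thresholding single points makes the detector less sensitive to one-off spikes, at the cost of a short detection delay.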
47

Simic-Muller, Ksenija, and Anthony Fernandes. "Preservice Teachers' Understanding of 'Real world': Developing a Typology." International Journal for Mathematics Teaching and Learning 21, no. 1 (September 20, 2020): 31–53. http://dx.doi.org/10.4256/ijmtl.v21i1.244.

Abstract:
This study examines the beliefs that 33 preservice teachers (PSTs) from the U.S. have about using different types of real-world contexts in the mathematics classroom. Qualitative data about the participants' reactions to specially designed word problems that varied in contexts from 'neutral' to controversial were collected. A thematic analysis of the responses indicated that they could be arranged into three typologies on a continuum based on their openness towards the use of controversial issues in the mathematics classroom. Drawing on the analysis of PSTs' responses and the literature, a fourth typology was inferred. The typologies can be useful to teacher educators and education programs as they seek to prepare PSTs to work with increasingly diverse students in their future mathematics classes. The study also highlights the potential of using word problems as a tool to understand PSTs' beliefs.
48

., Pavani, U. V. Anbazhagu, Bhavadharani ., M. Latha, and J. Senthil. "Opinion Mining Embedding with Applications to Opinions." International Journal of Engineering & Technology 7, no. 3.27 (August 15, 2018): 192. http://dx.doi.org/10.14419/ijet.v7i3.27.17760.

Abstract:
The main objective of this project is to portray strategies to consequently create and score another estimation vocabulary, called sentimental analysis. Sentimental analysis is one of the real errands of machine learning processing. Individuals post their own emotions and contemplations about items on an internet business website (for example, Amazon, Flip card, etc.). Sometimes individuals need to know whether these posts are positive, negative or unbiased. Existing word inserting learning calculations regularly just utilize the settings of words yet disregard the assumption of writings. Now we are applying enclose to word level assumption and stepwise level supposition arrangement, and estimation vocabularies. The information utilized in this study consists of online item data sets gathered from amazon.com. Experiments at both the sentence level and the word level are performed.
49

Binder, J. R., K. A. McKiernan, M. E. Parsons, C. F. Westbury, E. T. Possing, J. N. Kaufman, and L. Buchanan. "Neural Correlates of Lexical Access during Visual Word Recognition." Journal of Cognitive Neuroscience 15, no. 3 (April 1, 2003): 372–93. http://dx.doi.org/10.1162/089892903321593108.

Abstract:
People can discriminate real words from nonwords even when the latter are orthographically and phonologically word-like, presumably because words activate specific lexical and/or semantic information. We investigated the neural correlates of this identification process using event-related functional magnetic resonance imaging (fMRI). Participants performed a visual lexical decision task under conditions that encouraged specific word identification: Nonwords were matched to words on orthographic and phonologic characteristics, and accuracy was emphasized over speed. To identify neural responses associated with activation of nonsemantic lexical information, processing of words and nonwords with many lexical neighbors was contrasted with processing of items with no neighbors. The fMRI data showed robust differences in activation by words and word-like nonwords, with stronger word activation occurring in a distributed, left hemisphere network previously associated with semantic processing, and stronger nonword activation occurring in a posterior inferior frontal area previously associated with grapheme-to-phoneme mapping. Contrary to lexicon-based models of word recognition, there were no brain areas in which activation increased with neighborhood size. For words, activation in the left prefrontal, angular gyrus, and ventrolateral temporal areas was stronger for items without neighbors, probably because accurate responses to these items were more dependent on activation of semantic information. The results show neural correlates of access to specific word information. The absence of facilitatory lexical neighborhood effects on activation in these brain regions argues for an interpretation in terms of semantic access. Because subjects performed the same task throughout, the results are unlikely to be due to task-specific attentional, strategic, or expectancy effects.
50

Craig, Chie H. "Effects of Aging on Time-Gated Isolated Word-Recognition Performance." Journal of Speech, Language, and Hearing Research 35, no. 1 (February 1992): 234–38. http://dx.doi.org/10.1044/jshr.3501.234.

Abstract:
This investigation was designed to study real-time isolated monosyllabic word-recognition performance and the feasibility of applying time-gated NU-6 word-recognition test materials for real-time assessment of older listeners. Methods and materials developed in a previous investigation were used to obtain time-gated performance measures from 37 older listeners (mean age=69 years). The older listener performance measures were compared with extant data from 20 normally hearing young adult listeners (mean age=22 years). Specifically, listener confidence and accuracy by gate as well as listener isolation point, confidence at the isolation point, and total acceptance point measures were evaluated. The results show that major events in the real-time understanding process occur at a slower pace among older listeners. The data indicate that the time-gating method has excellent potential for future research among elderly listeners.