Journal articles on the topic "IMDb DATASET"

Author: Grafiati

Published: September 11, 2023

Explore the top 50 journal articles for research on the topic "IMDb DATASET".

1

Jung, Soon-Gyo, Joni Salminen, and Bernard J. Jansen. "Engineers, Aware! Commercial Tools Disagree on Social Media Sentiment: Analyzing the Sentiment Bias of Four Major Tools." Proceedings of the ACM on Human-Computer Interaction 6, EICS (June 14, 2022): 1–20. http://dx.doi.org/10.1145/3532203.

Abstract:

Large commercial sentiment analysis tools are often deployed in software engineering due to their ease of use. However, it is not known how accurate these tools are, and whether the sentiment ratings given by one tool agree with those given by another tool. We use two datasets - (1) NEWS consisting of 5,880 news stories and 60K comments from four social media platforms: Twitter, Instagram, YouTube, and Facebook; and (2) IMDB consisting of 7,500 positive and 7,500 negative movie reviews - to investigate the agreement and bias of four widely used sentiment analysis (SA) tools: Microsoft Azure (MS), IBM Watson, Google Cloud, and Amazon Web Services (AWS). We find that the four tools assign the same sentiment on less than half (48.1%) of the analyzed content. We also find that AWS exhibits neutrality bias in both datasets, Google exhibits bi-polarity bias in the NEWS dataset but neutrality bias in the IMDB dataset, and IBM and MS exhibit no clear bias in the NEWS dataset but have bi-polarity bias in the IMDB dataset. Overall, IBM has the highest accuracy relative to the known ground truth in the IMDB dataset. Findings indicate that psycholinguistic features - especially affect, tone, and use of adjectives - explain why the tools disagree. Engineers are urged to exercise caution when implementing SA tools for applications, as the tool selection affects the obtained sentiment labels.
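The agreement measure described in this abstract (the share of items on which all tools assign the same sentiment) can be sketched as follows. This is a minimal illustration, not the authors' code: the tool names, labels, and the helper names `full_agreement_rate` and `label_share` are hypothetical.

```python
from collections import Counter


def full_agreement_rate(labels_by_tool):
    """Fraction of items on which every tool assigned the same label."""
    columns = list(labels_by_tool.values())
    n_items = len(columns[0])
    if any(len(col) != n_items for col in columns):
        raise ValueError("every tool must label the same items")
    # zip(*columns) yields, per item, the tuple of labels from all tools.
    agreed = sum(1 for item in zip(*columns) if len(set(item)) == 1)
    return agreed / n_items


def label_share(labels):
    """Share of each label in one tool's output (a rough view of neutrality bias)."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


# Hypothetical toy labels for five items from four unnamed tools.
toy = {
    "tool_a": ["pos", "neg", "neu", "pos", "neg"],
    "tool_b": ["pos", "neg", "neu", "neu", "neg"],
    "tool_c": ["pos", "neu", "neu", "pos", "neg"],
    "tool_d": ["pos", "neg", "neu", "pos", "neg"],
}
rate = full_agreement_rate(toy)
shares_b = label_share(toy["tool_b"])
```

On the toy data the four tools agree on three of five items. A tool whose `label_share` is dominated by the neutral label would be a candidate for the neutrality bias the abstract mentions, though the paper's own bias definition may differ.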

2

Jnoub, Nour, Fadi Al Machot, and Wolfgang Klas. "A Domain-Independent Classification Model for Sentiment Analysis Using Neural Models." Applied Sciences 10, no. 18 (September 8, 2020): 6221. http://dx.doi.org/10.3390/app10186221.

Abstract:

Most people nowadays depend on the Web as a primary source of information. Statistical studies show that young people obtain information mainly from Facebook, Twitter, and other social media platforms. By relying on these data, people may risk drawing incorrect conclusions when reading the news or planning to buy a product. Therefore, systems that can detect and classify sentiments and assist users in finding the correct information on the Web are highly needed in order to prevent Web surfers from being easily deceived. This paper proposes an intensive study regarding domain-independent classification models for sentiment analysis that should be trained only once. The study consists of two phases: the first phase trains a neural network model once, after extracting robust features, and saves the model and its parameters. The second phase applies the trained model to a totally new dataset, aiming at correctly classifying reviews as positive or negative. The proposed model is trained on the IMDb dataset and then tested on three different datasets: IMDb dataset, Movie Reviews dataset, and our own dataset collected from Amazon reviews that rate users’ opinions regarding Apple products. The work shows high performance using different evaluation metrics compared to the state-of-the-art results.

3

Kamaru Zaman, Fadhlan Hafizhelmi. "Gender classification using custom convolutional neural networks architecture." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 6 (December 1, 2020): 5758. http://dx.doi.org/10.11591/ijece.v10i6.pp5758-5771.

Abstract:

Gender classification demonstrates high accuracy in many previous works. However, it does not generalize very well in unconstrained settings and environments. Furthermore, many proposed Convolutional Neural Network (CNN) based solutions vary significantly in their characteristics and architectures, which calls for an optimal CNN architecture for this specific task. In this work, a hand-crafted, custom CNN architecture is proposed to distinguish between male and female facial images. This custom CNN requires smaller input image resolutions and significantly fewer trainable parameters than some popular state-of-the-art models such as GoogleNet and AlexNet. It also employs batch normalization layers, which results in better computation efficiency. Based on experiments using publicly available datasets such as LFW, CelebA and IMDB-WIKI datasets, the proposed custom CNN delivered the fastest inference time in all tests, where it needs only 0.92ms to classify 1200 images on GPU, 1.79ms on CPU, and 2.51ms on VPU. The custom CNN also delivers performance on par with state-of-the-art methods and even surpassed these methods in CelebA gender classification, where it delivered the best result at 96% accuracy. Moreover, in a more challenging cross-dataset inference, the custom CNN trained using the CelebA dataset gives the best gender classification accuracy for tests on the IMDB and WIKI datasets at 97% and 96% accuracy, respectively.

4

Alghazzawi, Daniyal M., Anser Ghazal Ali Alquraishee, Sahar K. Badri, and Syed Hamid Hasan. "ERF-XGB: Ensemble Random Forest-Based XG Boost for Accurate Prediction and Classification of E-Commerce Product Review." Sustainability 15, no. 9 (April 23, 2023): 7076. http://dx.doi.org/10.3390/su15097076.

Abstract:

Recently, the concept of e-commerce product review evaluation has become a research topic of significant interest in sentiment analysis. The sentiment polarity estimation of product reviews is a great way to obtain a buyer’s opinion on products. It offers significant advantages for online shopping customers to evaluate the service and product qualities of the purchased products. However, the issues related to polysemy, disambiguation, and word dimension mapping create prediction problems in analyzing online reviews. In order to address such issues and enhance the sentiment polarity classification, this paper proposes a new sentiment analysis model, the Ensemble Random Forest-based XG boost (ERF-XGB) approach, for the accurate binary classification of online e-commerce product review sentiments. Two different Internet Movie Database (IMDB) datasets and the Chinese Emotional Corpus (ChnSentiCorp) dataset are used for estimating online reviews. First, the datasets are preprocessed through tokenization, lemmatization, and stemming operations. The Harris hawk optimization (HHO) algorithm selects two datasets’ corresponding features. Finally, the sentiments from online reviews are classified into positive and negative categories regarding the proposed ERF-XGB approach. Hyperparameter tuning is used to find the optimal parameter values that improve the performance of the proposed ERF-XGB algorithm. The performance of the proposed ERF-XGB approach is analyzed using evaluation indicators, namely accuracy, recall, precision, and F1-score, for different existing approaches. Compared with the existing method, the proposed ERF-XGB approach effectively predicts sentiments of online product reviews with an accuracy rate of about 98.7% for the ChnSentiCorp dataset and 98.2% for the IMDB dataset.

5

Effendi, Fery Ardiansyah, and Yuliant Sibaroni. "Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 5, no. 5 (October 24, 2021): 863–75. http://dx.doi.org/10.29207/resti.v5i5.3400.

Abstract:

The film business and individual film reviews cannot be separated, and film review sites such as IMDb are a credible source of reviews posted in public forums. Because IMDb site reviews are unstructured and bias-heavy, classification methods that reduce additional sentiment bias are needed to create a balanced classification with lower polarity bias. Eliminating additional sentiment bias improves the model because polarity is defined by a non-biased method, so that the model correctly identifies which sequences of words are positive or negative. This research limits the dataset to 50,000 rows of randomly extracted reviews from the IMDb website, prepared using methods such as preprocessing, POS tagging, and word embeddings. The preprocessed data is then used in classification methods such as ANN, SWN, and SO-Cal. This paper also uses bias processing methods such as hyperparameter tuning and BPM, with outputs evaluated using accuracy and PBR metrics. This research yields 77.39% for ANN, 66.32% for BPM, 75.6% for SO-Cal, and 76.26% for hybrid classification. The best PBR was obtained by two lexicon-based methods: 0.0009 for BPM and 0.00006 for SO-Cal. A more advanced ANN configuration could improve the model, and more complex lexicon models are a direction for future research.

6

Aribowo, Agus Sasmito, Halizah Basiron, and Noor Fazilla Abd Yusof. "Semi-supervised learning for sentiment classification with ensemble multi-classifier approach." International Journal of Advances in Intelligent Informatics 8, no. 3 (November 30, 2022): 349. http://dx.doi.org/10.26555/ijain.v8i3.929.

Abstract:

Supervised sentiment analysis ideally uses a fully labeled data set for modeling. However, this ideal condition requires a struggle in the label annotation process. Semi-supervised learning (SSL) has emerged as a promising method to avoid time-consuming and expensive data labeling without reducing model performance. However, the research on SSL is still limited and its performance needs to be improved. Thus, this study aims to create a new SSL-Model for sentiment analysis. The Ensemble Classifier SSL model for sentiment classification is introduced. The research went through pre-processing, vectorization, and feature extraction using TF-IDF and n-grams. Support Vector Machine (SVM) or Random Forest for tokenization was used to separate unigram, bigram, and trigram in model generation. Then, the outputs of these models were combined using stacking ensemble approach. Accuracy and F1-score were used for the evaluation. IMDB datasets and US Airlines were used to test the new SSL models. The conclusion is that the sentiment annotation accuracy is highly dependent on the suitability of the dataset with the machine learning algorithm. In IMDB dataset, which consists of two classes, it is better to use SVM. In the US Airlines consisting of three classes, SVM is better at improving the model performance against the baseline, but RF is better at achieving the baseline performance even though it fails to maintain the model performance.

7

Shaddeli, Aitak, Farhad Soleimanian Soleimanian Gharehchopogh, Mohammad Masdari, and Vahid Solouk. "An Improved African Vulture Optimization Algorithm for Feature Selection Problems and Its Application of Sentiment Analysis on Movie Reviews." Big Data and Cognitive Computing 6, no. 4 (September 28, 2022): 104. http://dx.doi.org/10.3390/bdcc6040104.

Abstract:

The African vulture optimization algorithm (AVOA) is inspired by African vultures’ feeding and orienting behaviors. It comprises powerful operators while maintaining the balance of exploration and efficiency in solving optimization problems. To be used in discrete applications, this algorithm needs to be discretized. This paper introduces two versions based on the S-shaped and V-shaped transfer functions of AVOA and BAOVAH. Moreover, the increase in computational complexity is avoided. Disruption operator and Bitwise strategy have also been used to maximize this model’s performance. A multi-strategy version of the AVOA called BAVOA-v1 is presented. In the proposed approach, i.e., BAVOA-v1, different strategies such as IPRS, mutation neighborhood search strategy (MNSS) (balance between exploration and exploitation), multi-parent crossover (increasing exploitation), and Bitwise (increasing diversity and exploration) are used to provide solutions with greater variety and to assure the quality of solutions. The proposed methods are evaluated on 30 UCI datasets with different dimensions. The simulation results showed that the proposed BAOVAH algorithm performed better than other binary meta-heuristic algorithms. So that the proposed BAOVAH algorithm set is the most accurate in 67% of the data set, and 93% of the data set is the best value of the fitness functions. In terms of feature selection, it has shown high performance. Finally, the proposed method in a case study to determine the number of neurons and the activator function to improve deep learning results was used in the sentiment analysis of movie viewers. In this paper, the CNNEM model is designed. The results of experiments on three datasets of sentiment analysis—IMDB, Amazon, and Yelp—show that the BAOVAH algorithm increases the accuracy of the CNNEM network in the IMDB dataset by 6%, the Amazon dataset by 33%, and the Yelp dataset by 30%.
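The abstract mentions S-shaped and V-shaped transfer functions for discretizing a continuous optimizer. A minimal sketch of how such functions are commonly used in binary metaheuristics is given below. The specific forms (logistic sigmoid and |tanh|) and the function names are assumptions chosen for illustration; the abstract does not state which variants the paper uses.

```python
import math
import random


def s_shaped(x):
    """Logistic sigmoid: maps a continuous value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """|tanh(x)|: maps a continuous value to a flip probability in [0, 1)."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: each bit becomes 1 with probability S(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: each current bit is flipped with probability V(x)."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, current_bits)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mask_s = binarize_s([-2.0, 0.0, 3.5], rng)
mask_v = binarize_v([0.0, 0.0, 0.0], [1, 0, 1], rng)
```

In a feature selection setting, each bit of the resulting mask would mark whether the corresponding feature is kept. With the V-shaped rule, a position component of zero never flips the bit, so `mask_v` equals the starting bits here.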

8

Zhou, Yancong, Qian Zhang, Dongdong Wang, and Xiaoying Gu. "Text Sentiment Analysis Based on a New Hybrid Network Model." Computational Intelligence and Neuroscience 2022 (December 28, 2022): 1–15. http://dx.doi.org/10.1155/2022/6774320.

Abstract:

The research of text sentiment analysis based on deep learning is increasingly rich, but the current models still have different degrees of deviation in understanding of semantic information. In order to reduce the loss of semantic information and improve the prediction accuracy as much as possible, the paper creatively combines the doc2vec model with the deep learning model and attention mechanism and proposes a new hybrid sentiment analysis model based on the doc2vec + CNN + BiLSTM + Attention. The new hybrid model effectively exploits the structural features of each part. In the model, the understanding of the overall semantic information of the sentence is enhanced through the paragraph vector pretrained by the doc2vec structure which can effectively reduce the loss of semantic information. The local features of the text are extracted through the CNN structure. The context information interaction is completed through the bidirectional cycle structure of the BiLSTM. The performance is improved by allocating weight and resources to the text information of different importance through the attention mechanism. The new model was built based on Keras framework, and performance comparison experiments and analysis were performed on the IMDB dataset and the DailyDialog dataset. The results have shown that the accuracy of the new model on the two datasets is 91.3% and 93.3%, respectively, and the loss rate is 22.1% and 19.9%, respectively. The accuracy on the IMDB datasets is 1.0% and 0.5% higher than that of the CNN-BiLSTM-Attention model and ATT-MCNN-BGRUM model in the references. Comprehensive comparison has shown the overall performance is improved, and the new model is effective.

9

Gore, Mohini, Aishwarya Sheth, Samrudhi Abbad, Paryul Jain, and Prof Pooja Mishra. "IMDB Box Office Prediction Using Machine Learning Algorithms." International Journal for Research in Applied Science and Engineering Technology 10, no. 5 (May 31, 2022): 2438–42. http://dx.doi.org/10.22214/ijraset.2022.42653.

Abstract:

Movies are a big part of our world! But nobody knows how a movie will perform at the box office. There are some big-budget movies that bomb and there are smaller movies that are smashing successes. This project tries to predict the overall worldwide box office revenue of movies using data such as the movie cast, crew, posters, plot keywords, budget, production companies, release dates, languages, and countries. The dataset on Kaggle contains all these data points that you can use to predict how a movie will fare at the box office. Among many movies that have been released, some generate high profit while the others do not. This paper studies the relationship between movie factors and revenue and builds prediction models. Besides analysis on aggregate data, we also divide data into groups using different methods and compare accuracy across these techniques, as well as explore whether clustering techniques could help improve accuracy.

Keywords: regression; predictive analytics; clustering; expectation-maximization; K-means; movies

10

Abdullah Haje, Umran, Mohammed Hussein Abdalla, Reben Mohammed Saleem Kurda, and Zhwan Mohammed Khalid. "A New Model for Emotions Analysis in Social Network Text Using Ensemble Learning and Deep learning." Academic Journal of Nawroz University 11, no. 1 (March 9, 2022): 130–40. http://dx.doi.org/10.25007/ajnu.v11n1a1250.

Abstract:

Recently, emotion analysis has become widely used. Therefore, increasing the accuracy of existing methods has become a challenge for researchers. The proposed method in this paper is a hybrid model to improve the accuracy of emotion analysis, which uses a combination of a convolutional neural network and ensemble learning. In the proposed method, after receiving the dataset, the data is pre-processed and converted into processable samples. Then the new dataset is split into training and test sets. The proposed model is a structure for machine learning in the form of ensemble learning. It contains blocks consisting of a combination of convolutional networks and basic classification algorithms. In each convolutional network, the base classification algorithms replace the fully connected layer. Evaluation of the proposed method on the IMDB, PL04 and SemEval datasets with accuracy, precision, recall and F1 criteria shows that, on average over all three datasets, the precision of polarity detection is 90%, the recall is 93%, the F1 is 91%, and the accuracy is 92%.
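The four evaluation criteria named in this abstract can be computed from binary predictions as in the sketch below. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def harmonic_f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": harmonic_f1(precision, recall),
    }


# Hypothetical toy labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
reported_f1 = harmonic_f1(0.90, 0.93)
```

As a consistency check, the harmonic mean of the reported 90% precision and 93% recall is about 0.915, which matches the reported F1 of 91% after rounding.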

11

Naeem, Muhammad Zaid, Furqan Rustam, Arif Mehmood, Mui-zzud-din, Imran Ashraf, and Gyu Sang Choi. "Classification of movie reviews using term frequency-inverse document frequency and optimized machine learning algorithms." PeerJ Computer Science 8 (March 15, 2022): e914. http://dx.doi.org/10.7717/peerj-cs.914.

Abstract:

The Internet Movie Database (IMDb), being one of the popular online databases for movies and personalities, provides a wide range of movie reviews from millions of users. This provides a diverse and large dataset to analyze users’ sentiments about various personalities and movies. Despite being helpful to provide the critique of movies, the reviews on IMDb cannot be read as a whole and requires automated tools to provide insights on the sentiments in such reviews. This study provides the implementation of various machine learning models to measure the polarity of the sentiments presented in user reviews on the IMDb website. For this purpose, the reviews are first preprocessed to remove redundant information and noise, and then various classification models like support vector machines (SVM), Naïve Bayes classifier, random forest, and gradient boosting classifiers are used to predict the sentiment of these reviews. The objective is to find the optimal process and approach to attain the highest accuracy with the best generalization. Various feature engineering approaches such as term frequency-inverse document frequency (TF-IDF), bag of words, global vectors for word representations, and Word2Vec are applied along with the hyperparameter tuning of the classification models to enhance the classification accuracy. Experimental results indicate that the SVM obtains the highest accuracy when used with TF-IDF features and achieves an accuracy of 89.55%. The sentiment classification accuracy of the models is affected due to the contradictions in the user sentiments in the reviews and assigned labels. For tackling this issue, TextBlob is used to assign a sentiment to the dataset containing reviews before it can be used for training. Experimental results on TextBlob assigned sentiments indicate that an accuracy of 92% can be obtained using the proposed model.
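The TF-IDF weighting that this abstract pairs with SVM can be sketched in plain Python. This is the basic textbook variant (relative term frequency times the log of inverse document frequency), used here as an assumption; library implementations typically apply smoothing and normalization, and the abstract does not say which variant the study used. The function name and the two sample reviews are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical two-review corpus.
vectors = tf_idf(["great movie", "bad movie"])
```

A term that occurs in every document, such as "movie" here, gets weight zero, while the terms that distinguish the reviews ("great", "bad") carry all the weight. These vectors would then be passed to a classifier.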

12

Gupta, Ketan, Nasmin Jiwani, and Neda Afreen. "A Combined Approach of Sentimental Analysis Using Machine Learning Techniques." Revue d'Intelligence Artificielle 37, no. 1 (February 28, 2023): 1–6. http://dx.doi.org/10.18280/ria.370101.

Abstract:

Sentiment analysis is a vital area of current research. It is extensively used for observing text data and identifying the sentiment element. Every day, e-commerce sites produce a massive amount of text information from customers' comments, reviews, tweets, and feedback. One of the most recent technological advances in web development is the emergence of social networking websites, which aid in communication and knowledge gathering. Aspect-based evaluation of this information can help businesses to gain a greater understanding of their consumers' expectations and then shape their plans accordingly. It is difficult to convey the exact sentiment of a review. In this study, we demonstrated an approach that focuses on sentimental aspects of the item's characteristics. Consumer reviews on Amazon and IMDB have been presented and evaluated. We obtained the dataset from the UCI repository, where each analysis's opinion rates are first observed. To get meaningful information from the datasets and to eliminate noise, the system performs pre-processing operations such as tokenization and the removal of punctuation, whitespace, special characters, and stop words. To accurately represent the preprocessed data, feature selection methods such as term frequency-inverse document frequency (TF-IDF) are utilized. The customer reviews from three datasets (Amazon, Yelp, and IMDB) are merged, and classification is performed using classifiers such as Naïve Bayes, Random Forest, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Finally, we provide some insight into future text classification work.

13

Islam, Md Mahbubul, and Joong-Hwan Baek. "A Hierarchical Approach toward Prediction of Human Biological Age from Masked Facial Image Leveraging Deep Learning Techniques." Applied Sciences 12, no. 11 (May 24, 2022): 5306. http://dx.doi.org/10.3390/app12115306.

Abstract:

The lifestyle of humans has changed noticeably since the contagious COVID-19 disease struck globally. People should wear a face mask as a protective measure to curb the spread of the contagious disease. Consequently, real-world applications (i.e., electronic customer relationship management) dealing with human ages extracted from face images must migrate to a robust system proficient to estimate the age of a person wearing a face mask. In this paper, we proposed a hierarchical age estimation model from masked facial images in a group-to-specific manner rather than a single regression model because age progression across different age groups is quite dissimilar. Our intention was to squeeze the feature space among limited age classes so that the model could fairly discern age. We generated a synthetic masked face image dataset over the IMDB-WIKI face image dataset to train and validate our proposed model due to the absence of a benchmark masked face image dataset with real age annotations. We somewhat mitigated the data sparsity problem of the large public IMDB-WIKI dataset using off-the-shelf down-sampling and up-sampling techniques as required. The age estimation task was fully modeled like a deep classification problem, and expected ages were formulated from SoftMax probabilities. We performed a classification task by deploying multiple low-memory and higher-accuracy-based convolutional neural networks (CNNs). Our proposed hierarchical framework demonstrated marginal improvement in terms of mean absolute error (MAE) compared to the one-off model approach for masked face real age estimation. Moreover, this research is perhaps the maiden attempt to estimate the real age of a person from his/her masked face image.
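The abstract states that expected ages were formulated from SoftMax probabilities. One common reading of this is the probability-weighted mean over the age classes, sketched below. The class ages, logits, and function names are hypothetical; the paper's exact formulation, including how the hierarchical groups are combined, is not given in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(logits, ages):
    """Expected age as the probability-weighted mean of the class ages."""
    probs = softmax(logits)
    return sum(p * a for p, a in zip(probs, ages))


# Hypothetical classifier output over three age classes.
uniform_estimate = expected_age([0.0, 0.0, 0.0], [20, 30, 40])
skewed_estimate = expected_age([0.0, 0.0, 5.0], [20, 30, 40])
```

With equal scores the estimate is the plain mean of the class ages. As one class dominates, the estimate moves toward that class's age while staying inside the range of the class ages.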

14

Kwon, Hyun, and Sanghyun Lee. "Textual Adversarial Training of Machine Learning Model for Resistance to Adversarial Examples." Security and Communication Networks 2022 (April 7, 2022): 1–12. http://dx.doi.org/10.1155/2022/4511510.

Abstract:

Deep neural networks provide good performance for image recognition, speech recognition, text recognition, and pattern recognition. However, such networks are vulnerable to attack by adversarial examples. Adversarial examples are created by adding a small amount of noise to an original sample in such a way that no problem is perceptible to humans, yet the sample will be incorrectly recognized by a model. Adversarial examples have been studied mainly in the context of images, but research has expanded to include the text domain. In the textual context, an adversarial example is a sample of text in which certain important words have been changed so that the sample will be misclassified by a model even though to humans it is the same as the original text in terms of meaning and grammar. In the text domain, there have been relatively few studies on defenses against adversarial examples compared with the number of studies on adversarial example attacks. In this paper, we propose an adversarial training method to defend against adversarial examples that target the latest text model, bidirectional encoder representations from transformers (BERT). In the proposed method, adversarial examples are generated using various parameters and then are applied in additional training of the target model to instill robustness against unknown adversarial examples. Experiments were conducted using five datasets (AG’s News, a movie review dataset, the IMDB Large Movie Review Dataset (IMDB), the Stanford Natural Language Inference (SNLI) corpus, and the Multi-Genre Natural Language Inference (MultiNLI) corpus), with TensorFlow as the machine learning library. According to the experimental results, the baseline model had an accuracy of 88.1% on the original sentences and an accuracy of 9.2% on the adversarial sentences, whereas the model that underwent the proposed training method maintained an average accuracy of 87.2% on the original sentences and had an average accuracy of 22.5% on the adversarial sentences.

15

Guo, Haolan. "Comparison Of Neural Network and Traditional Classifiers for Twitter Sentiment Analysis." Highlights in Science, Engineering and Technology 38 (March 16, 2023): 1062–70. http://dx.doi.org/10.54097/hset.v38i.5996.

Abstract:

Sentiment analysis has been a popular topic of study in the field of social media analysis, particularly when it comes to analyzing the emotions expressed in online comments. This is particularly relevant when it comes to IMDb Movie reviews, where users often express their opinions on the films they have watched. By using sentiment analysis techniques, researchers can gain insights into the overall sentiment of a movie and how it is perceived by the public. This information can be useful for movie studios and producers, as it can help them gauge the success of their films and make decisions about future productions. In this analysis, the performance of neural network models is compared with that of traditional classification methods when applied to the task of sentiment classification of tweets. A dataset of tweets collected from IMDb movie reviews is used for training. Three different models are trained on this dataset: a sequential neural network with two dense layers activated by ReLU and SoftMax functions, logistic regression, and random forest. The performance of these models is evaluated using a variety of metrics, including confusion matrices, AUC graphs, and accuracy and loss curves. It is found that the neural network model achieves an accuracy of approximately 90%, outperforming the logistic regression and random forest models, which achieve accuracies of approximately 90% and 83%, respectively.

16

Ng, Hu, Glenn Jun Weng Chia, Timothy Tzen Vun Yap, and Vik Tor Goh. "Modelling sentiments based on objectivity and subjectivity with self-attention mechanisms." F1000Research 10 (May 17, 2022): 1001. http://dx.doi.org/10.12688/f1000research.73131.2.

Abstract:

Background: The proliferation of digital commerce has allowed merchants to reach out to a wider customer base, prompting a study of customer reviews to gauge service and product quality through sentiment analysis. Sentiment analysis can be enhanced through subjectivity and objectivity classification with attention mechanisms. Methods: This research includes input corpora of contrasting levels of subjectivity and objectivity from different databases to perform sentiment analysis on user reviews, incorporating attention mechanisms at the aspect level. Three large corpora are chosen as the subjectivity and objectivity datasets: the Shopee user review dataset (ShopeeRD) for subjectivity, together with the Wikipedia English dataset (Wiki-en) and Internet Movie Database (IMDb) for objectivity. Word embeddings are created using Word2Vec with Skip-Gram. Then, a bidirectional LSTM with an attention layer (LSTM-ATT) is imposed on the word vectors. The performance of the model is evaluated and benchmarked against classification models of Logistic Regression (LR) and Linear SVC (L-SVC). Three models are trained with subjectivity (70% of ShopeeRD) and the objectivity (Wiki-en) embeddings, with ten-fold cross-validation. Next, the three models are evaluated against two datasets (IMDb and 20% of ShopeeRD). The experiments are based on benchmark comparisons, embedding comparison and model comparison with 70-10-20 train-validation-test splits. Data augmentation using AUG-BERT is performed, and selected models incorporating AUG-BERT are compared. Results: L-SVC scored the highest accuracy with 56.9% for objective embeddings (Wiki-en) while the LSTM-ATT scored 69.0% on subjective embeddings (ShopeeRD). Improved performances were observed with data augmentation using AUG-BERT, where the LSTM-ATT+AUG-BERT model scored the highest accuracy at 60.0% for objective embeddings and 70.0% for subjective embeddings, compared to 57% (objective) and 69% (subjective) for L-SVC+AUG-BERT, and 56% (objective) and 68% (subjective) for L-SVC. Conclusions: Utilizing attention layers with subjectivity and objectivity notions has shown improvement to the accuracy of sentiment analysis models.
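The 70-10-20 train-validation-test split mentioned in the abstract can be sketched as a seeded shuffle followed by slicing. The function name, the seed, and the placeholder review identifiers are hypothetical; the abstract does not say whether the authors stratified by label.

```python
import random


def split_70_10_20(items, seed=0):
    """Shuffle with a fixed seed, then slice into 70% / 10% / 20% parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = n * 70 // 100
    n_val = n * 10 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical placeholder records standing in for reviews.
reviews = [f"review_{i}" for i in range(100)]
train, val, test = split_70_10_20(reviews)
```

Any remainder from the integer division goes to the test part, so the three parts always cover every item exactly once.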

17

Ng, Hu, Glenn Jun Weng Chia, Timothy Tzen Vun Yap, and Vik Tor Goh. "Modelling sentiments based on objectivity and subjectivity with self-attention mechanisms." F1000Research 10 (October 4, 2021): 1001. http://dx.doi.org/10.12688/f1000research.73131.1.

Abstract:

Background: The proliferation of digital commerce has allowed merchants to reach out to a wider customer base, prompting a study of customer reviews to gauge service and product quality through sentiment analysis. Sentiment analysis can be enhanced through subjectivity and objectivity classification with attention mechanisms. Methods: This research includes input corpora of contrasting levels of subjectivity and objectivity from different databases to perform sentiment analysis on user reviews, incorporating attention mechanisms at the aspect level. Three large corpora are chosen as the subjectivity and objectivity datasets: the Shopee user review dataset (ShopeeRD) for subjectivity, together with the Wikipedia English dataset (Wiki-en) and Internet Movie Database (IMDb) for objectivity. Word embeddings are created using Word2Vec with Skip-Gram. Then, a bidirectional LSTM with an attention layer (LSTM-ATT) is imposed on the word vectors. The performance of the model is evaluated and benchmarked against the classification models Logistic Regression (LR) and Linear SVC (L-SVC). The three models are trained with the subjectivity (70% of ShopeeRD) and objectivity (Wiki-en) embeddings, with ten-fold cross-validation. Next, the three models are evaluated against two datasets (IMDb and 20% of ShopeeRD). The experiments are based on benchmark comparisons, embedding comparison and model comparison with 70-10-20 train-validation-test splits. Data augmentation using AUG-BERT is performed, and selected models incorporating AUG-BERT are compared. Results: L-SVC scored the highest accuracy with 56.9% for objective embeddings (Wiki-en) while the LSTM-ATT scored 69.0% on subjective embeddings (ShopeeRD). Improved performances were observed with data augmentation using AUG-BERT, where the LSTM-ATT+AUG-BERT model scored the highest accuracy at 60.0% for objective embeddings and 70.0% for subjective embeddings, compared to 57% (objective) and 69% (subjective) for L-SVC+AUG-BERT, and 56% (objective) and 68% (subjective) for L-SVC. Conclusions: Utilizing attention layers with subjectivity and objectivity notions has improved the accuracy of sentiment analysis models.
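
The 70-10-20 train-validation-test split mentioned in this abstract can be illustrated with a short sketch. The function name `split_70_10_20`, the fixed seed, and the toy data are assumptions made for illustration; the abstract does not describe how the authors actually partitioned their data.

```python
import random


def split_70_10_20(items, seed=0):
    """Shuffle items and cut them into 70% train, 10% validation, 20% test.

    Hypothetical helper for illustration only; integer arithmetic is used
    for the cut points so that no floating-point rounding is involved.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n = len(shuffled)
    n_train = n * 7 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, about 20%
    return train, val, test


# Toy stand-in for a review corpus: 100 document ids.
train, val, test = split_70_10_20(range(100))
print(len(train), len(val), len(test))
```

Every item lands in exactly one of the three parts, so the held-out test portion never overlaps with the data used for training or validation.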

18

Sarker, Kamal Uddin, Mohammed Saqib, Raza Hasan, Salman Mahmood, Saqib Hussain, Ali Abbas, and Aziz Deraman. "A Ranking Learning Model by K-Means Clustering Technique for Web Scraped Movie Data." Computers 11, no. 11 (November 8, 2022): 158. http://dx.doi.org/10.3390/computers11110158.

Abstract:

Business organizations experience cut-throat competition in the e-commerce era, where a smart organization needs to come up with faster innovative ideas to enjoy competitive advantages. A smart user decides from the review information of an online product. Data-driven smart machine learning applications use real data to support immediate decision making. Web scraping technologies support supplying sufficient relevant and up-to-date well-structured data from unstructured data sources like websites. Machine learning applications generate models for in-depth data analysis and decision making. The Internet Movie Database (IMDB) is one of the largest movie databases on the internet. IMDB movie information is applied for statistical analysis, sentiment classification, genre-based clustering, and rating-based clustering with respect to movie release year, budget, etc., for repository dataset. This paper presents a novel clustering model with respect to two different rating systems of IMDB movie data. This work contributes to three areas: (i) the “grey area” of web scraping to extract data for research purposes; (ii) statistical analysis to correlate the required data fields and to understand the purposes of implementing machine learning; and (iii) k-means clustering applied to the movie critics’ rank (Metascore) and the users’ star rank (Rating). Different Python libraries are used for web data scraping, data analysis, data visualization, and k-means clustering application. Only 42.4% of records were accepted from the extracted dataset for research purposes after cleaning. Statistical analysis showed that votes, ratings, and Metascore have a linear relationship, while random characteristics are observed for the income of the movie. On the other hand, experts’ feedback (Metascore) and customers’ feedback (Rating) are negatively correlated (−0.0384) due to the bias of additional features like genre, actors, budget, etc. Both rankings have a nonlinear relationship with the income of the movies. Six optimal clusters were selected by the elbow technique, and the calculated silhouette score is 0.4926 for the proposed k-means clustering model; we found that only one cluster is in the logical relationship of the two ranking systems.
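
The correlation between Metascore and Rating reported in this abstract is the kind of figure a Pearson coefficient gives. The sketch below assumes that measure (the abstract does not name it) and uses invented numbers, not the paper's data; the function name `pearson` is likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, then each variable's own spread.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented critic scores (0-100) and user ratings (0-10) for five films.
metascore = [45, 60, 72, 80, 91]
rating = [6.1, 5.8, 7.4, 6.9, 8.2]
print(round(pearson(metascore, rating), 4))
```

The coefficient ranges from −1 to 1, and a value close to zero indicates little linear association between the two variables.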

19

Rawat, Deesha. "TV Show Popularity Analysis Using Data Mining." International Journal for Research in Applied Science and Engineering Technology 9, no. 12 (December 31, 2021): 147–52. http://dx.doi.org/10.22214/ijraset.2021.39206.

Abstract:

Abstract: This work evaluates and updates TV programs using data mining. Predicting the popularity of a TV show is an engaging task, given people's growing interest in TV dramas. Simple predictions of trending TV shows can be made from individual ratings and target-audience attributes (Age, Year of Release, Rotten Tomatoes, etc.). Keywords: TV Show, Data Mining, Popularity Analysis, Data Visualization, IMDB Dataset

20

Dubey, Gaurav, Richa Khera, Ashish Grover, Amandeep Kaur, Abhishek Goyal, Rajkumar Rajkumar, Harsh Khatter, and Somya Srivastava. "A Hybrid Convolutional Network and Long Short-Term Memory (HBCNLS) model for Sentiment Analysis on Movie Reviews." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 4 (May 4, 2023): 341–48. http://dx.doi.org/10.17762/ijritcc.v11i4.6458.

Abstract:

This paper proposes a hybrid model (HBCNLS) for sentiment analysis that combines the strengths of multiple machine learning approaches. The model consists of a convolutional neural network (CNN) for feature extraction, a long short-term memory (LSTM) network for capturing sequential dependencies, and a fully connected layer for classification on a movie review dataset. We evaluate the performance of the HBCNLS on the IMDb movie review dataset and compare it to other state-of-the-art models, including BERT. Our results show that the hybrid model outperforms the other models in terms of accuracy, precision, and recall, demonstrating the effectiveness of the hybrid approach. The research work also compares the performance of BERT, a pre-trained transformer model, with long short-term memory (LSTM) networks and convolutional neural networks (CNNs) for the task of sentiment analysis on a movie review dataset.

21

Jain, Siddhartha, Ge Liu, Jonas Mueller, and David Gifford. "Maximizing Overall Diversity for Improved Uncertainty Estimates in Deep Ensembles." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4264–71. http://dx.doi.org/10.1609/aaai.v34i04.5849.

Abstract:

The inaccuracy of neural network models on inputs that do not stem from the distribution underlying the training data is problematic and at times unrecognized. Uncertainty estimates of model predictions are often based on the variation in predictions produced by a diverse ensemble of models applied to the same input. Here we describe Maximize Overall Diversity (MOD), an approach to improve ensemble-based uncertainty estimates by encouraging larger overall diversity in ensemble predictions across all possible inputs. We apply MOD to regression tasks including 38 Protein-DNA binding datasets, 9 UCI datasets, and the IMDB-Wiki image dataset. We also explore variants that utilize adversarial training techniques and data density estimation. For out-of-distribution test examples, MOD significantly improves predictive performance and uncertainty calibration without sacrificing performance on test data drawn from the same distribution as the training data. We also find that in Bayesian optimization tasks, the performance of UCB acquisition is improved via MOD uncertainty estimates.

22

Buslim, Nurhayati, Lee Kyung Oh, Muhammad Hugo Athallah Hardy, and Yusuf Wijaya. "Comparative Analysis of KNN, Naïve Bayes and SVM Algorithms for Movie Genres Classification Based on Synopsis." JURNAL TEKNIK INFORMATIKA 15, no. 2 (December 23, 2022): 169–77. http://dx.doi.org/10.15408/jti.v15i2.29302.

Abstract:

Text classification is the process of assigning a text to the correct label. Text classification in natural language processing is a challenging task that requires accuracy to get the correct results; manual text classification tends to be inefficient because it requires a lot of time as well as experts. The utilization of machine learning for automatic text classification can be a solution to this problem. KNN, Naive Bayes, and SVM are known as some of the most widely used algorithms for solving classification problems, especially text classification. In this study, we compare the KNN, Naive Bayes, and SVM algorithms for text classification on the problem of classifying movie genres based on a synopsis, using datasets obtained from Kaggle.com and the IMDB Dataset. The results of this study indicate that of the 12 experiments, Support Vector Machine (SVM) is the best-performing algorithm with an accuracy of 90%, 93%, 65%, and 63%. It is hoped that this research can help to determine the best algorithm in the text classification process.

23

Agarwal, Manav, Shreya Venugopal, Rishab Kashyap, and R. Bharathi. "Movie Success Prediction and Performance Comparison using Various Statistical Approaches." International Journal of Artificial Intelligence & Applications 13, no. 1 (January 31, 2022): 19–36. http://dx.doi.org/10.5121/ijaia.2022.13102.

Abstract:

Movies are among the most prominent contributors to the global entertainment industry today, and they are among the biggest revenue-generating industries from a commercial standpoint. It's vital to divide films into two categories: successful and unsuccessful. To categorize the movies in this research, a variety of models were utilized, including regression models such as Simple Linear, Multiple Linear, and Logistic Regression, clustering techniques such as SVM and K-Means, Time Series Analysis, and an Artificial Neural Network. The models stated above were compared on a variety of factors, including their accuracy on the training and validation datasets as well as the testing dataset, the availability of new movie characteristics, and a variety of other statistical metrics. During the course of this study, it was discovered that certain characteristics have a greater impact on the likelihood of a film's success than others. For example, the existence of the genre action may have a significant impact on the forecasts, although another genre, such as sport, may not. The testing dataset for the models and classifiers has been taken from the IMDb website for the year 2020. The Artificial Neural Network, with an accuracy of 86 percent, is the best performing model of all the models discussed.

24

Oyewola, David Opeoluwa, and Emmanuel Gbenga Dada. "Machine Learning Methods for Predicting the Popularity of Movies." Journal of Artificial Intelligence and Systems 4, no. 1 (2022): 65–82. http://dx.doi.org/10.33969/ais.2022040105.

Abstract:

The movie industry has grown into a multibillion-dollar enterprise, and there is now a ton of information online about it. Numerous machine learning techniques have been created by academics and can produce effective classification models. In this study, different machine learning classification techniques are applied to our own movie dataset for multiclass classification. This paper's main objective is to compare the effectiveness of various machine learning techniques. This study examined five methods: Multinomial Logistic Regression (MLR), Support Vector Machine (SVM), Bagging (BAG), Naive Bayes (NBS) and K-Nearest Neighbor (KNN), while noise was removed using All K-Edited Nearest Neighbors (AENN). These techniques all utilize a previous IMDb dataset to predict a movie's net profit value. The algorithms predict the profit at the box office for each of these five techniques. Based on the dataset used in this paper, which consists of 5043 rows and 14 columns of movies, this study evaluates the performance of all seven machine learning techniques. Bagging outperformed other machine learning techniques with a 99.56% accuracy rate.

25

Chatterjee, Shuvamoy, Kushal Chakrabarti, Avishek Garain, Friedhelm Schwenker, and Ram Sarkar. "JUMRv1: A Sentiment Analysis Dataset for Movie Recommendation." Applied Sciences 11, no. 20 (October 9, 2021): 9381. http://dx.doi.org/10.3390/app11209381.

Abstract:

Nowadays, we can observe the applications of machine learning in every field, ranging from the quality testing of materials to the building of powerful computer vision tools. One such recent application is the recommendation system, which is a method that suggests products to users based on their preferences. In this paper, our focus is on a specific recommendation system called movie recommendation. Here, we make use of user reviews of movies in order to establish a general outlook about the movie and then use that outlook to recommend that movie to other users. However, a huge number of available reviews has baffled sophisticated review systems. Consequently, there is a need to find a method of extracting meaningful information from the available reviews and use that in classifying a movie review and predicting the sentiment in each one. In a typical scenario, a review can either be positive, negative, or indifferent about a movie. However, the available research articles in the field mainly consider this as a two-class classification problem—positive and negative. The most popular work in this field was performed on Stanford and Rotten Tomatoes datasets, which are somewhat outdated. Our work is based on self-scraped reviews from the IMDB website, and we have annotated the reviews into one of the three classes—positive, negative, and neutral. Our dataset is called JUMRv1—Jadavpur University Movie Recommendation dataset version 1. For the evaluation of JUMRv1, we took an exhaustive approach by testing various combinations of word embeddings, feature selection methods, and classifiers. We also analysed the performance trends, if there were any, and attempted to explain them. Our work sets a benchmark for movie recommendation systems that is based on the newly developed dataset using a three-class sentiment classification.

26

Elsa Vania, Salma Nuraini, and Dhian Satria Yudha Kartika. "PENGGUNAAN ALGORITMA K-MEANS CLUSTERING UNTUK MENENTUKAN REKOMENDASI FILM INDONESIA." Prosiding Seminar Nasional Teknologi dan Sistem Informasi 2, no. 1 (September 18, 2022): 207–14. http://dx.doi.org/10.33005/sitasi.v2i1.299.

Abstract:

As the film industry grows, more and more films are produced. This abundance makes it hard for viewers to choose which film to watch. The k-means clustering algorithm can help group films by their characteristics so that viewers can sort through them easily. The clustering stages follow the CRISP-DM method, and the algorithm applied is K-Means. The dataset, taken from Kaggle, contains Indonesian film data scraped from the IMDB website and covers 1272 films. The clustering stage found two groups of films: recommended films and less recommended films. These clusters can yield recommendations of Indonesian films that may serve as a reference for what to watch.

27

Ariatmanto, Dhani, and Muhammad Ilham Arief. "PREDIKSI PELUANG KESUKSESAN FILM DALAM PRA PRODUKSI MENGGUNAKAN ALGORITMA DECISION TREE." JATI (Jurnal Mahasiswa Teknik Informatika) 7, no. 1 (February 16, 2023): 222–26. http://dx.doi.org/10.36040/jati.v7i1.6277.

Abstract:

Film is one of the most popular forms of entertainment in the world. A film's level of success depends on the number of viewers. However, audience size is related not only to the lead actors but also to the story plot and genre of the film. Pre-production is one stage of the filmmaking process; during pre-production, ideas and concepts are developed for the script. This study predicts the success of film production through classification based on language, country, title years, imdb_score, movie_title, content_rating, director_name, budget, gross, genre, and actor_name. The dataset used is public, from IMDB, and the method used is a decision tree (DT). The stages run from data collection through pre-processing and classification to, finally, model testing. The experiments achieved better accuracy than previous research, with an accuracy of 68%.

28

Rehman, Muhammad Zubair, Kamal Z. Zamli, Mubarak Almutairi, Haruna Chiroma, Muhammad Aamir, Md Abdul Kader, and Nazri Mohd Nawi. "A novel state space reduction algorithm for team formation in social networks." PLOS ONE 16, no. 12 (December 2, 2021): e0259786. http://dx.doi.org/10.1371/journal.pone.0259786.

Abstract:

Team formation (TF) in social networks exploits graphs (i.e., vertices = experts and edges = skills) to represent a possible collaboration between the experts. These networks lead us towards building cost-effective research teams irrespective of the geolocation of the experts and the size of the dataset. Previously, large datasets were not closely inspected for the large-scale distributions and relationships among the researchers, resulting in the algorithms failing to scale well on the data. Therefore, this paper presents a novel TF algorithm for expert team formation called SSR-TF, based on two metrics, communication cost and graph reduction, that will become a basis for future TF algorithms. In SSR-TF, communication cost finds the possibility of collaboration between researchers. The graph reduction scales the large data down to only the appropriate skills and experts, resulting in real-time extraction of experts for collaboration. This approach is tested on five organic and benchmark datasets, i.e., UMP, DBLP, ACM, IMDB, and Bibsonomy. The SSR-TF algorithm is able to build cost-effective teams with the most appropriate experts, resulting in the formation of more communicative teams with high expertise levels.

29

Gunawan, Putu Harry, Tb Dzulfiqar Alhafidh, and Bambang Ari Wahyudi. "The Sentiment Analysis of Spider-Man: No Way Home Film Based on IMDb Reviews." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 6, no. 1 (February 27, 2022): 177–82. http://dx.doi.org/10.29207/resti.v6i1.3851.

Abstract:

Sentiment analysis is used to determine the overall sentiment in a movie review. The goal of this paper is to investigate sentiment analysis using multiple classification methods on Spider-Man: No Way Home movie reviews. The review dataset is procured from the IMDb website. Preprocessing methods are used and compared to determine the difference in accuracy score. The methods proposed for this study include Naïve-Bayes, Support Vector Machine (SVM), Stochastic Gradient Descent (SGD), and Decision Tree to find the best accuracy possible. The sentiment analysis of the movie reviews resulted in 94 positive reviews and 65 negative reviews. The highest accuracy and F1 score for this study are obtained from the SVM and the SGD classifier, with an accuracy of 82% and an F1 score of 81% respectively.
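
Accuracy and F1 score, the two metrics quoted in this abstract, can be computed from predicted and true labels as sketched below. The function name `accuracy_and_f1`, the label strings, and the five toy labels are assumptions for illustration and do not come from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive="pos"):
    """Return (accuracy, F1) for a binary task, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard the divisions so an empty positive class yields 0.0 instead of an error.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy labels: three positive and two negative reviews.
y_true = ["pos", "pos", "pos", "neg", "neg"]
y_pred = ["pos", "pos", "neg", "neg", "pos"]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, round(f1, 4))
```

F1 is the harmonic mean of precision and recall, so on an imbalanced review set it can diverge from accuracy, which is one reason papers report both.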

30

Basarslan, Muhammet Sinan, and Fatih Kayaalp. "Sentiment Analysis with Machine Learning Methods on Social Media." ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal 9, no. 3 (September 17, 2020): 5–15. http://dx.doi.org/10.14201/adcaij202093515.

Abstract:

Social media has become an important part of our everyday life due to the widespread use of the Internet. Of the social media services, Twitter is among the most used ones around the world. People share their opinions by writing tweets about numerous subjects, such as politics, sports, economy, etc. Millions of tweets per day create a huge dataset, which has drawn the attention of data scientists to these data for sentiment analysis. Sentiment analysis aims to identify the social media posts of users about a specific topic and categorize them as positive, negative or neutral. Thus, the study aims to investigate the effect of types of text representation on the performance of sentiment analysis. In this study, two datasets were used in the experiments. The first one is the user reviews about movies from the IMDB, which has been labeled by Kotzias, and the second one is the Twitter tweets, including the tweets of users about health topics in English in 2019, collected using the Twitter API. The Python programming language was used in the study both for implementing the classification models using the Naïve Bayes (NB), Support Vector Machines (SVM) and Artificial Neural Networks (ANN) algorithms, and for categorizing the sentiments as positive, negative and neutral. The feature extraction from the dataset was performed using Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec (W2V) modeling techniques. The success percentages of the classification algorithms were compared at the end. According to the experimental results, Artificial Neural Network had the best accuracy performance in both datasets compared to the others.
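
TF-IDF, one of the two text representations named in this abstract, weights a term by how often it occurs in a document and how rare it is across the collection. The sketch below uses the plain unsmoothed textbook formula with whitespace tokenization; both choices, the function name `tf_idf`, and the two toy reviews are assumptions, and library implementations typically add smoothing and normalization that this sketch omits.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency).
    Assumes every document contains at least one token.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Two toy reviews: "movie" occurs in both, so its weight is zero.
for doc_weights in tf_idf(["great movie", "bad movie"]):
    print(doc_weights)
```

Terms shared by every document carry no discriminating information under this formula, which is why "movie" drops out while "great" and "bad" keep positive weights.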

31

Zhang, Bowen. "A BERT-CNN Based Approach on Movie Review Sentiment Analysis." SHS Web of Conferences 163 (2023): 04007. http://dx.doi.org/10.1051/shsconf/202316304007.

Abstract:

Sentiment analysis plays a vital role in the decision-making of multiple fields. Specifically, in movies and television, audiences’ reviews can help with casting, the direction of the plot, etc. To further improve the performance of the original BERT model, a BERT-CNN-based approach is proposed in this paper to do sentiment analysis on IMDb dataset. Although their performances are nearly the same throughout the research, the BERT-CNN approach is better at negative sentiment detection. It got an average elevation of 3.6% in accuracy after the ensemble. Apart from that, topic modeling is also performed to show that most negative reviews are commented on from multiple aspects instead of criticizing only, making sentiment analysis of movie reviews a complex problem.

32

Mahajan, Manan, Abhedya Mishra, Praveen Kumar, and Ms Sapna Gupta. "Moviebox: A Movie Recommendation System." International Journal for Research in Applied Science and Engineering Technology 11, no. 6 (June 30, 2023): 3410–14. http://dx.doi.org/10.22214/ijraset.2023.54258.

Abstract:

Abstract: In the vast world of cinema, numerous critically acclaimed movies often go unnoticed, despite their significant artistic and cultural value. Recognizing this disparity, our research endeavors to bridge the gap by developing a comprehensive movie recommendation system that highlights these "Out of the box" films. In the initial phase, we meticulously collected a bespoke dataset by scraping data from IMDb, encompassing a wide range of movies from various genres and regions. In the subsequent phase, we constructed an advanced algorithm utilizing content-based filtering techniques. This algorithm analyzes both user behavior and movie features to provide personalized recommendations that align with users' unique preferences. By embracing this approach, our research aims to enhance the discoverability and appreciation of lesser-known yet meaningful movies, empowering users to explore a diverse array of cinematic experiences.

33

Guha, Tapas, and K. G. Mohan. "A Hybrid Deep Learning Model for Long-Term Sentiment Classification." Webology 17, no. 2 (December 21, 2020): 663–76. http://dx.doi.org/10.14704/web/v17i2/web17059.

Abstract:

With the omnipresence of user feedback in social media, mining relevant opinions and extracting the underlying sentiment to analyze synthetic emotion towards a specific product, person, topic or event has become a vast domain of research in recent times. A thorough survey of the early unimodal and multimodal sentiment classification approaches reveals that researchers mostly relied on either corpus-based techniques or those based on machine learning algorithms. Lately, deep learning models have progressed profoundly in the area of image processing. This success has been efficiently directed towards enhancements in sentiment categorization. A hybrid deep learning model consisting of a Convolutional Neural Network (CNN) and stacked bidirectional Long Short Term Memory (BiLSTM) over pre-trained word vectors is proposed in this paper to achieve long-term sentiment analysis. This work experiments with various hyperparameters and optimization techniques to rid the model of overfitting and to achieve optimal performance. It has been validated on two standard sentiment datasets, Stanford Large Movie Review (IMDB) and Stanford Sentiment Treebank2 Dataset (SST2). It achieves a competitive advantage over other models like CNN, LSTM and an ensemble of CNN-LSTM by attaining better accuracy, and it also produces a high F-measure.

34

Hourrane, Oumaima, El Habib Benlahmar, and Ahmed Zellou. "Comparative study of deep learning models for sentiment analysis." International Journal of Engineering & Technology 7, no. 2.14 (April 12, 2018): 5726. http://dx.doi.org/10.14419/ijet.v7i4.24459.

Abstract:

Sentiment analysis is one of the new absorbing areas that have appeared in natural language processing with the emergence of community sites on the web. Taking advantage of the amount of information now available, research and industry have been seeking ways to automatically analyze the sentiments expressed in texts. The challenge for this task is the ambiguity of human language, and also the lack of labeled data. In order to solve this issue, sentiment analysis and deep learning have been merged, as deep learning models are effective due to their automatic learning capability. In this paper, we provide a comparative study on the IMDB movie review dataset: we compare word embeddings and further deep learning models on sentiment analysis and give broad empirical outcomes for those keen on taking advantage of deep learning for sentiment analysis in real-world settings.

35

Yechuri, Praveen Kumar, and Suguna Ramadass. "Semantic Web Mining for Analyzing Retail Environment Using Word2Vec and CNN-FK." Ingénierie des systèmes d information 26, no. 3 (June 30, 2021): 311–18. http://dx.doi.org/10.18280/isi.260308.

Abstract:

Digital Technology is becoming increasingly essential to organizations. Related knowledge is important for a company to allow optimal use of its IT services. The use of Big Data is relatively new to this field. Handling Big data is not, at this stage, a problem for large business organizations in particular; it has also become a challenge for small and medium-sized businesses. Although Semantic Web analysis is largely focused on fundamental advances that are expected to make the Semantic Web a reality, there has not been much work done to demonstrate the feasibility and effect of the Semantic Web on business issues. The infrastructure of electronic information executives and business types has provided various enhancements for companies, such as the automated process of buying and selling products. Nevertheless, undertakings are checked for the multifaceted nature of the extension required to deal with an ever-increasing number of electronic details and procedures. This paper suggests a model with a neural network design and a word representation system named Word2Vec for analyzing retail environment. Firstly, Word2vec manages the text data and shows it as a function diagram and a feature map is given to the Convolution Neural Network (CNN) that extracts the features and classifies them. The IMDB dataset, the Cornell dataset, the Amazon Products Dataset and the Twitter dataset were analyzed in the proposed model. The proposed Convolution Neural Network Fisher Kernel (CNN-FK) model is compared with the existing SVM model for analyzing retail environment in semantic web mining. The new approach has increased efficiency when compared to existing models.

36

Muhadzdzib Ramadhan, Muhammad Tsaqif, and Erwin Budi Setiawan. "Netflix Movie Recommendation System Using Collaborative Filtering With K-Means Clustering Method on Twitter." JURNAL MEDIA INFORMATIKA BUDIDARMA 6, no. 4 (October 25, 2022): 2056. http://dx.doi.org/10.30865/mib.v6i4.4571.

Abstract:

Nowadays, technology is developing very rapidly, so watching movies at home has become a means of entertainment. Netflix is one of the platforms for watching movies and provides various movie titles. However, the large number of movie titles makes it difficult for users to decide which movie they want to watch. The solution to this problem is to provide a recommendation system that can provide movie recommendations to watch. Collaborative filtering is a recommendation-system method that provides recommendations based on the ratings given by other users. Collaborative filtering is divided into two types, item-based and user-based. Twitter is a social media platform used to write posts called tweets. For this system, tweets serve as data that will be processed into ratings. This research was conducted using k-means clustering with collaborative filtering and collaborative filtering only. The dataset was obtained by crawling Twitter and supplemented with ratings from IMDb, Rotten Tomatoes, and Metacritic, resulting in a dataset with 35 users, 785 movie titles, and 6184 reviews. The data were then preprocessed with text processing, polarity, and labeling to produce the dataset used for this experiment. The results of this research test show that k-means clustering with collaborative filtering gets the best results with the best prediction of 2.8466, getting an MAE value of 0.5029, and an RMSE value of 0.6354.
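
MAE and RMSE, the two error measures reported in this abstract, compare predicted ratings with actual ones. The sketch below shows the standard definitions; the function names and the four toy ratings are assumptions for illustration, not the paper's data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the average squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Toy ratings on a 1-5 scale.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]
print(mae(actual, predicted))             # 0.5
print(round(rmse(actual, predicted), 4))  # 0.6124
```

RMSE is never smaller than MAE on the same data because squaring penalizes large errors more heavily, which is consistent with the abstract's RMSE exceeding its MAE.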

37

Prabhakar, Sunil Kumar, Harikumar Rajaguru, and Dong-Ok Won. "Performance Analysis of Hybrid Deep Learning Models with Attention Mechanism Positioning and Focal Loss for Text Classification." Scientific Programming 2021 (October 26, 2021): 1–12. http://dx.doi.org/10.1155/2021/2420254.

Abstract:

Over the past few decades, text classification problems have been widely utilized in many real time applications. Leveraging the text classification methods by means of developing new applications in the field of text mining and Natural Language Processing (NLP) is very important. In order to accurately classify tasks in many applications, a deeper insight into deep learning methods is required as there is an exponential growth in the number of complex documents. The success of any deep learning algorithm depends on its capacity to understand the nonlinear relationships of the complex models within data. Thus, a huge challenge for researchers lies in the development of suitable techniques, architectures, and models for text classification. In this paper, hybrid deep learning models, with an emphasis on positioning of attention mechanism analysis, are considered and analyzed well for text classification. The first hybrid model proposed is called convolutional Bidirectional Long Short-Term Memory (Bi-LSTM) with attention mechanism and output (CBAO) model, and the second hybrid model is called convolutional attention mechanism with Bi-LSTM and output (CABO) model. In the first hybrid model, the attention mechanism is placed after the Bi-LSTM, and then the output Softmax layer is constructed. In the second hybrid model, the attention mechanism is placed after convolutional layer and followed by Bi-LSTM and the output Softmax layer. The proposed hybrid models are tested on three datasets, and the results show that when the proposed CBAO model is implemented for IMDB dataset, a high classification accuracy of 92.72% is obtained and when the proposed CABO model is implemented on the same dataset, a high classification accuracy of 90.51% is obtained.

38

Babhulkar, Shubham. "Application of Machine Learning for Emotion Classification." International Journal for Research in Applied Science and Engineering Technology 9, no. VII (July 20, 2021): 1567–72. http://dx.doi.org/10.22214/ijraset.2021.36459.

Abstract:

In this paper we propose and implement a general convolutional neural network (CNN) building framework for designing real-time CNNs. We validate our models by creating a real-time vision system which accomplishes the tasks of face detection, gender classification and emotion classification simultaneously in one blended step using our proposed CNN architecture. After presenting the details of the training procedure setup we proceed to evaluate on standard benchmark sets. We report accuracies of 96% in the IMDB gender dataset and 66% in the FER-2013 emotion dataset. Along with this we also introduce the very recent real-time enabled guided back-propagation visualization technique. Guided back-propagation uncovers the dynamics of the weight changes and evaluates the learned features. We argue that the careful implementation of modern CNN architectures, the use of the current regularization methods and the visualization of previously hidden features are necessary in order to reduce the gap between slow performances and real-time architectures. Our system has been validated by its deployment on a Care-O-bot 3 robot used during RoboCup@Home competitions. All our code, demos and pre-trained architectures have been released under an open-source license in our public repository.

39

Sivakumar, Soubraylu, and Ratnavel Rajalakshmi. "Analysis of Sentiment on Movie Reviews Using Word Embedding Self-Attentive LSTM." International Journal of Ambient Computing and Intelligence 12, no. 2 (April 2021): 33–52. http://dx.doi.org/10.4018/ijaci.2021040103.

Abstract:

In the contemporary world, people share their thoughts rapidly on social media. Mining and extracting knowledge from this information for sentiment analysis is a complex task. Even though automated machine learning algorithms and techniques are available, extracting semantic and relevant key terms from a sparse representation of a review is difficult. Word embedding improves text classification by addressing the problems of sparse matrices and word semantics. In this paper, a novel architecture is proposed that combines long short-term memory (LSTM) with word embedding to extract the semantic relationship between neighboring words; in addition, weighted self-attention is applied to extract the key terms from the reviews. Based on experimental analysis on the IMDB dataset, the authors show that the proposed word-embedding self-attention LSTM architecture achieved an F1 score of 88.67%, while the LSTM and word-embedding LSTM models achieved F1 scores of 84.42% and 85.69%, respectively.

40

Qiu, Ningjia, Zhuorui Shen, Xiaojuan Hu, and Peng Wang. "A novel sentiment classification model based on online learning." Journal of Algorithms & Computational Technology 13 (January 2019): 174830261984576. http://dx.doi.org/10.1177/1748302619845764.

Abstract:

Memory limitation and slow training speed are two important problems in sentiment analysis. In this paper, we propose a sentiment classification model based on online learning to improve the training speed of sentiment classification. First, combining the adaptive learning-rate adjustment of the Adadelta algorithm with the Adam algorithm's ability to avoid frequent jitter in the later stage of training, we present a novel Adamdelta algorithm. It solves the problem that the learning rate of the traditional follow-the-regularized-leader (FTRL)-Proximal online learning algorithm vanishes as the number of training iterations increases. Moreover, we obtain an optimized logistic regression (LR) model and apply it to sentiment classification with online learning. Finally, we compare the proposed algorithm with five similar models on the IMDb movie review dataset. Experimental results show that the improved algorithm classifies better and can effectively improve the precision and recall of the classifier.

41

Pimpalkar, Amit, and Jeberson Retna Raj. "A Bi-Directional GRU Architecture for the Self-Attention Mechanism: An Adaptable, Multi-Layered Approach with Blend of Word Embedding." International Journal of Engineering and Technology Innovation 13, no. 3 (July 4, 2023): 251–64. http://dx.doi.org/10.46604/ijeti.2023.11510.

Abstract:

Sentiment analysis (SA) has become an essential component of natural language processing (NLP) with numerous practical applications to understanding “what other people think”. Various techniques have been developed to tackle SA using deep learning (DL); however, current research lacks comprehensive strategies incorporating multiple-word embeddings. This study proposes a self-attention mechanism that leverages DL and involves the contextual integration of word embedding with a time-dispersed bidirectional gated recurrent unit (Bi-GRU). This work employs word embedding approaches GloVe, word2vec, and fastText to achieve better predictive capabilities. By integrating these techniques, the study aims to improve the classifier’s capability to precisely analyze and categorize sentiments in textual data from the domain of movies. The investigation seeks to enhance the classifier’s performance in NLP tasks by addressing the challenges of underfitting and overfitting in DL. To evaluate the model’s effectiveness, an openly available IMDb dataset was utilized, achieving a remarkable testing accuracy of 99.70%.

42

Unal, Fatima Zehra, Mehmet Serdar Guzel, Erkan Bostanci, Koray Acici, and Tunc Asuroglu. "Multilabel Genre Prediction Using Deep-Learning Frameworks." Applied Sciences 13, no. 15 (July 27, 2023): 8665. http://dx.doi.org/10.3390/app13158665.

Abstract:

In this study, transfer learning has been used to tackle multilabel classification tasks. As a case study, movie genre classification from posters has been chosen. Six state-of-the-art pretrained models, VGG16, ResNet, DenseNet, Inception, MobileNet, and ConvNeXt, have been employed for this experiment. The movie posters have been obtained from the Internet Movie Database (IMDB). The dataset has been divided using an iterative stratification technique. A sequence of dense layers has been added on top of each model, and these models have been trained and fine-tuned. The models were compared in terms of accuracy, loss, Hamming loss, F1-score, precision, and AUC. Across these metrics, the best accuracy was obtained by the modified DenseNet architecture, at 90%. ConvNeXt, the newest of the models, also performed quite satisfactorily, reaching over 90% accuracy. This study uses an iterative stratification method to split an unbalanced dataset, which provides more reliable results than the classical splitting method commonly used in the literature. The feature extraction capabilities of the six pretrained models have also been compared. The outcome of this study shows promising results for multilabel classification. As future work, it is planned to enhance this study with natural language processing and ensemble methods.
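
Hamming loss, one of the metrics named in the abstract above, is less familiar than accuracy or F1. A minimal sketch of its usual definition for multilabel outputs follows; the genre columns and the label matrices are made-up assumptions for illustration, not data from the study.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots predicted incorrectly.

    y_true and y_pred are lists of equal-length 0/1 label vectors,
    one vector per sample.
    """
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


# Hypothetical genre indicators per poster: [action, comedy, drama].
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 1], [0, 1, 1]]

print(hamming_loss(y_true, y_pred))  # 2 wrong slots out of 6
```

Unlike exact-match accuracy, which counts a sample as wrong if any single genre is missed, Hamming loss gives partial credit for each correctly predicted label.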

43

Farkhod, Akhmedov, Akmalbek Abdusalomov, Fazliddin Makhmudov, and Young Im Cho. "LDA-Based Topic Modeling Sentiment Analysis Using Topic/Document/Sentence (TDS) Model." Applied Sciences 11, no. 23 (November 23, 2021): 11091. http://dx.doi.org/10.3390/app112311091.

Abstract:

Customer reviews on the Internet reflect users’ sentiments about the product, service, and social events. As sentiments can be divided into positive, negative, and neutral forms, sentiment analysis processes identify the polarity of information in the source materials toward an entity. Most studies have focused on document-level sentiment classification. In this study, we apply an unsupervised machine learning approach to discover sentiment polarity not only at the document level but also at the word level. The proposed topic document sentence (TDS) model is based on joint sentiment topic (JST) and latent Dirichlet allocation (LDA) topic modeling techniques. The IMDB dataset, comprising user reviews, was used for data analysis. First, we applied the LDA model to discover topics from the reviews; then, the TDS model was implemented to identify the polarity of the sentiment from topic to document, and from document to word levels. The LDAvis tool was used for data visualization. The experimental results show that the analysis not only obtained good topic partitioning results, but also achieved high sentiment analysis accuracy in document- and word-level sentiment classifications.

44

Le Busque, Brianna, and Carla Litchfield. "Sharks, spiders, snakes, oh my: A review of creature feature films." Journal of Environmental Media 4, no. 1 (April 1, 2023): 49–75. http://dx.doi.org/10.1386/jem_00096_1.

Abstract:

Media are conduits for people to obtain information about animal species and may therefore influence how people think about these species. This study advances our understanding of animals (and plants) in the media by analysing a final dataset of 638 films categorized in the genre ‘Creature Features’. Through analysing the biography, film poster and trailer on the IMDb database, it was found that sharks were the most depicted species in creature feature films, with insects and arachnids, dinosaurs and snakes also being frequently featured. There were changes in the types of animal species commonly portrayed in creature feature films across time, with dinosaurs and primates being more frequently depicted in the 1920s–30s and sharks being more frequently depicted in recent decades. This study is the first to investigate which animal/plant species are evident in creature feature films, which is a broader genre incorporating mythology, extant and general unrealistic portrayals of animals. This allows for new understandings regarding the influence the media can have on perceptions of animal and plant species.

45

Wassan, Sobia, Tian Shen, Chen Xi, Kamal Gulati, Danish Vasan, and Beenish Suhail. "Customer Experience towards the Product during a Coronavirus Outbreak." Behavioural Neurology 2022 (February 2, 2022): 1–18. http://dx.doi.org/10.1155/2022/4279346.

Abstract:

Nowadays, sentiment analysis of consumer reviews is becoming crucial in the marketing world. It not only gives firms an idea of how consumers like their product or service but also helps them improve their service. In this article, a statistical method identifies the relationships among many factors in consumer feedback. It introduces a deep learning-based method called DSC (deep sentiment classifier) to determine whether or not the reviewed product is recommended. Our suggested method also investigates the effect sizes of the feedback, such as positives, negatives, and neutrals. We used the women’s clothing review dataset, which contained 22,642 records after preprocessing. Experimental studies show that recommendations are an excellent indicator of positive sentiment. In comparison, ratings become fuzzy performance metrics in product reviews. The 10-fold cross-validation analysis shows that the proposed method has the top average F1 score in both sentiment classification (93.56%) and recommendation classification (88.32%). A comparison with other machine learning classifiers, for example, KNN, random forest, logistic regression, decision tree, support vector machine, multilayer perceptron, and naïve Bayes, also demonstrates that DSC gives the best result. We also tested DSC on the IMDB (Internet Movie Database) dataset, which contains the sentiment of 50,000 movie reviews (25,000 for training and 25,000 for testing). In comparison to other baseline methods, DSC obtained an excellent classification score in this experiment.

46

Islam, Md Mahbubul, and Joong-Hwan Baek. "Deep Learning Based Real Age and Gender Estimation from Unconstrained Face Image towards Smart Store Customer Relationship Management." Applied Sciences 11, no. 10 (May 17, 2021): 4549. http://dx.doi.org/10.3390/app11104549.

Abstract:

The COVID-19 pandemic markedly changed the human shopping nature, necessitating a contactless shopping system to curb the spread of the contagious disease efficiently. Consequently, a customer opts for a store where it is possible to avoid physical contacts and shorten the shopping process with extended services such as personalized product recommendations. Automatic age and gender estimation of a customer in a smart store strongly benefit the consumer by providing personalized advertisement and product recommendation; similarly, it aids the smart store proprietor to promote sales and develop an inventory perpetually for the future retail. In our paper, we propose a deep learning-founded enterprise solution for smart store customer relationship management (CRM), which allows us to predict the age and gender from a customer’s face image taken in an unconstrained environment to facilitate the smart store’s extended services, as it is expected for a modern venture. For the age estimation problem, we mitigate the data sparsity problem of the large public IMDB-WIKI dataset by image enhancement from another dataset and perform data augmentation as required. We handle our classification tasks utilizing an empirically leading pre-trained convolutional neural network (CNN), the VGG-16 network, and incorporate batch normalization. Especially, the age estimation task is posed as a deep classification problem followed by a multinomial logistic regression first-moment refinement. We validate our system for two standard benchmarks, one for each task, and demonstrate state-of-the-art performance for both real age and gender estimation.

47

Schneider, Frank M., Emese Domahidi, and Felix Dietrich. "What Is Important When We Evaluate Movies? Insights from Computational Analysis of Online Reviews." Media and Communication 8, no. 3 (August 13, 2020): 153–63. http://dx.doi.org/10.17645/mac.v8i3.3134.

Abstract:

The question of what is important when we evaluate movies is crucial for understanding how lay audiences experience and evaluate entertainment products such as films. In line with this, subjective movie evaluation criteria (SMEC) have been conceptualized as mental representations of important attitudes toward specific film features. Based on exploratory and confirmatory factor analyses of self-report data from online surveys, previous research has found and validated eight dimensions. Given the large-scale evaluative information that is available in online users’ comments in movie databases, it seems likely that what online users write about movies may enrich our knowledge about SMEC. As a first fully exploratory attempt, drawing on an open-source dataset including movie reviews from IMDb, we estimated a correlated topic model to explore the underlying topics of those reviews. In 35,136 online movie reviews, the most prevalent topics tapped into three major categories—Hedonism, Actors’ Performance, and Narrative—and indicated what reviewers mostly wrote about. Although a qualitative analysis of the reviews revealed that users mention certain SMEC, results of the topic model covered only two SMEC: Story Innovation and Light-heartedness. Implications for SMEC and entertainment research are discussed.

48

Ashish, Roy, and B. G. Prasad. "Temporal face feature progression with cycle GAN." Journal of Physics: Conference Series 2161, no. 1 (January 1, 2022): 012008. http://dx.doi.org/10.1088/1742-6596/2161/1/012008.

Abstract:

The aging process creates significant changes in the appearance of people’s faces. Compared to other causes of variation in face imaging, aging-related variation has specific distinct properties. Facial aging variation, for example, is unique to each person; it occurs gradually and is significantly influenced by other characteristics, including health, gender, and lifestyle. The proposed effort therefore uses Generative Adversarial Networks (GANs) to address these critical concerns. A GAN is made up of a generator network and a discriminator network. The generator produces images that the discriminator analyses to determine whether they are real or fake. This paper provides a Temporal Face Feature Progressive framework with Cycle GAN, which maintains the initial appearance and identity in the elderly aspect of their facial structure. To address aging concerns, our goal is to transform an image from an initial age category into a targeted age through age progression. We show that our temporal face features progressive cycle GAN learns and transfers facial traits from the source group to the targeted group by training on various images. The results were obtained using the IMDB-WIKI Face dataset.

49

Jain, Mansi, Purvit Vashishtha, Aman Satyam, and Smriti Sehgal. "CNN LSTM Hybrid Approach for Sentiment Analysis." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (May 31, 2023): 3096–107. http://dx.doi.org/10.22214/ijraset.2023.52191.

Abstract:

In recent years, sentiment analysis has been one of the most popular study subjects. It is employed to ascertain a text's actual intention and is primarily concerned with the processing and analysis of natural language data. The development of technology and the phenomenal rise of social media have produced a vast volume of confusing textual information. It is critical to examine the feelings that underlie such writings. Sentiment analysis reveals the core of irrational beliefs kept in enormous volumes of text. The primary objective is to get the computer to comprehend the context of the data so that it can be divided into positive and negative material. (i) Several machine learning models, including Naive Bayes, XGBoost, Random Forest, LGB Machine, etc., are trained in this study. (ii) The deep learning model Bi-LSTM, whose accuracy has shown promise, is implemented. (iii) Bidirectional Encoder Representations from Transformers (BERT), a pre-trained language model, is implemented together with an external Bi-LSTM model. Finally, a new CNN-LSTM hybrid model is applied to the IMDb dataset and performs better than all the other models.

50

Bacco, Luca, Andrea Cimino, Felice Dell’Orletta, and Mario Merone. "Explainable Sentiment Analysis: A Hierarchical Transformer-Based Extractive Summarization Approach." Electronics 10, no. 18 (September 8, 2021): 2195. http://dx.doi.org/10.3390/electronics10182195.

Abstract:

In recent years, the explainable artificial intelligence (XAI) paradigm has been gaining wide research interest. The natural language processing (NLP) community is also approaching the shift of paradigm: building a suite of models that provide an explanation of the decision on some main task, without affecting performance. It is certainly not an easy job, especially when very poorly interpretable models are involved, like the almost ubiquitous (at least in the NLP literature of recent years) transformers. Here, we propose two different transformer-based methodologies exploiting the inner hierarchy of the documents to perform a sentiment analysis task while extracting the most important (with regard to the model decision) sentences to build a summary as the explanation of the output. For the first architecture, we placed two transformers in cascade and leveraged the attention weights of the second one to build the summary. For the other architecture, we employed a single transformer to classify the individual sentences in the document and then combined the probability scores of each to perform the classification and build the summary. We compared the two methodologies by using the IMDB dataset, both in terms of classification and explainability performance. To assess the explainability part, we propose two kinds of metrics, based on benchmarking the models’ summaries against human annotations. We recruited four independent operators to annotate a few documents retrieved from the original dataset. Furthermore, we conducted an ablation study to highlight how implementing some strategies leads to important improvements in the explainability performance of the cascade transformers model.
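
The second architecture in the abstract above classifies sentences individually and then combines their probability scores. The abstract does not say how the scores are combined, so the sketch below is only one plausible reading: it assumes a plain average of per-sentence positive probabilities and a summary made of the sentences that most support the predicted label. The function name, the 0.5 threshold, the value of `k`, and the sample sentences are all hypothetical.

```python
def classify_and_summarize(sentences, pos_probs, k=2):
    """Aggregate sentence-level scores into a document label and a summary.

    Assumption (not from the paper): the document score is the mean of the
    per-sentence positive-class probabilities, and the summary keeps the k
    sentences that most strongly support the predicted label.
    """
    doc_score = sum(pos_probs) / len(pos_probs)
    label = "positive" if doc_score >= 0.5 else "negative"
    # Support for the predicted label: p for positive, 1 - p for negative.
    support = [p if label == "positive" else 1.0 - p for p in pos_probs]
    top = sorted(range(len(sentences)), key=lambda i: support[i], reverse=True)[:k]
    # Restore document order so the summary reads naturally.
    summary = [sentences[i] for i in sorted(top)]
    return label, doc_score, summary


# Hypothetical review and made-up classifier outputs, for illustration only.
review = [
    "The plot drags.",
    "The acting is superb.",
    "I loved the ending.",
    "The score is forgettable.",
]
probs = [0.2, 0.95, 0.9, 0.4]

label, score, summary = classify_and_summarize(review, probs)
print(label, summary)
```

In a real system the per-sentence probabilities would come from the sentence-level transformer; here they are hard-coded so the aggregation step can be read in isolation.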
