Journal articles on the topic 'Attention based models'

Consult the top 50 journal articles for your research on the topic 'Attention based models.'

1

Qin, Chu-Xiong, and Dan Qu. "Towards Understanding Attention-Based Speech Recognition Models." IEEE Access 8 (2020): 24358–69. http://dx.doi.org/10.1109/access.2020.2970758.

2

Steelman, Kelly S., Jason S. McCarley, and Christopher D. Wickens. "Theory-based Models of Attention in Visual Workspaces." International Journal of Human–Computer Interaction 33, no. 1 (September 16, 2016): 35–43. http://dx.doi.org/10.1080/10447318.2016.1232228.

3

Hashemi, Seyyed Mohammad Reza. "A Survey of Visual Attention Models." Ciência e Natura 37 (December 19, 2015): 297. http://dx.doi.org/10.5902/2179460x20786.

Abstract:
The present paper surveys visual attention models and categorizes the factors that drive attention. It compares bottom-up models with top-down ones, spatial models with spatio-temporal ones, overt attention with covert attention, and space-based models with object-based ones. It also categorizes several challenging modeling issues, including biological computations, correlation with eye-movement datasets, and bottom-up versus top-down integration, explaining each in detail.
4

Zhou, Qifeng, Xiang Liu, and Qing Wang. "Interpretable duplicate question detection models based on attention mechanism." Information Sciences 543 (January 2021): 259–72. http://dx.doi.org/10.1016/j.ins.2020.07.048.

5

Kramer, Arthur F., and Andrew Jacobson. "A comparison of Space-Based and Object-Based Models of Visual Attention." Proceedings of the Human Factors Society Annual Meeting 34, no. 19 (October 1990): 1489–93. http://dx.doi.org/10.1177/154193129003401915.

6

Wang, Lei, Ed X. Wu, and Fei Chen. "EEG-based auditory attention decoding using speech-level-based segmented computational models." Journal of Neural Engineering 18, no. 4 (May 25, 2021): 046066. http://dx.doi.org/10.1088/1741-2552/abfeba.

7

Rosenberg, Monica D., Wei-Ting Hsu, Dustin Scheinost, R. Todd Constable, and Marvin M. Chun. "Connectome-based Models Predict Separable Components of Attention in Novel Individuals." Journal of Cognitive Neuroscience 30, no. 2 (February 2018): 160–73. http://dx.doi.org/10.1162/jocn_a_01197.

Abstract:
Although we typically talk about attention as a single process, it comprises multiple independent components. But what are these components, and how are they represented in the functional organization of the brain? To investigate whether long-studied components of attention are reflected in the brain's intrinsic functional organization, here we apply connectome-based predictive modeling (CPM) to predict the components of Posner and Petersen's influential model of attention: alerting (preparing and maintaining alertness and vigilance), orienting (directing attention to a stimulus), and executive control (detecting and resolving cognitive conflict) [Posner, M. I., & Petersen, S. E. The attention system of the human brain. Annual Review of Neuroscience, 13, 25–42, 1990]. Participants performed the Attention Network Task (ANT), which measures these three factors, and rested during fMRI scanning. CPMs tested with leave-one-subject-out cross-validation successfully predicted novel individuals' overall ANT accuracy, RT variability, and executive control scores from functional connectivity observed during ANT performance. CPMs also generalized to predict participants' alerting scores from their resting-state functional connectivity alone, demonstrating that connectivity patterns observed in the absence of an explicit task contain a signature of the ability to prepare for an upcoming stimulus. Suggesting that significant variance in ANT performance is also explained by an overall sustained attention factor, the sustained attention CPM, a model defined in prior work to predict sustained attentional abilities, predicted accuracy, RT variability, and executive control from task-based data and predicted RT variability from resting-state data. Our results suggest that, whereas executive control may be closely related to sustained attention, the infrastructure that supports alerting is distinct and can be measured at rest. In the future, CPM may be applied to elucidate additional independent components of attention and relationships between the functional brain networks that predict them.
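For readers new to the technique, the core CPM protocol is compact enough to sketch: correlate every connectivity edge with behavior across training subjects, keep the significant edges, summarize each subject by their summed strength, and fit a linear model. A minimal leave-one-subject-out illustration in Python (the threshold, the sum-score summary, and all names are generic assumptions drawn from the published CPM protocol, not the authors' code):

import numpy as np
from scipy.stats import pearsonr

def cpm_loso(conn, behav, p_thresh=0.01):
    """conn: (subjects, edges) functional connectivity; behav: (subjects,) scores."""
    n = len(behav)
    preds = np.zeros(n)
    for test in range(n):
        train = np.arange(n) != test
        X, y = conn[train], behav[train]
        # Select edges whose strength correlates with behavior in the training set
        # (the sketch assumes at least a few edges pass the threshold).
        mask = np.array([pearsonr(X[:, j], y)[1] < p_thresh for j in range(X.shape[1])])
        # Summarize each subject as the summed strength of the selected edges.
        scores = X[:, mask].sum(axis=1)
        slope, intercept = np.polyfit(scores, y, 1)      # simple linear model
        preds[test] = slope * conn[test, mask].sum() + intercept
    return preds  # evaluate by correlating preds with behav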
8

Kristensen, Terje. "Towards Spike based Models of Visual Attention in the Brain." International Journal of Adaptive, Resilient and Autonomic Systems 6, no. 2 (July 2015): 117–38. http://dx.doi.org/10.4018/ijaras.2015070106.

Abstract:
A numerical solution of the Hodgkin–Huxley equations is presented to simulate the spiking behavior of a biological neuron. The solution is illustrated by building a graphical chart interface to finely tune the behavior of the neuron under different stimulations. In addition, a Multi-Agent System (MAS) has been developed to simulate the Visual Attention Network Model of the brain. Tasks are assigned to the agents according to the Attention Network Theory developed by neuroscientists. A sequential communication model based on simple objects has been constructed, aiming to show the relations and the workflow between the different visual attention networks. Each agent is used as an analogy to a role or function of the visual attention systems in the brain. Some experimental results based on this model were presented in an earlier paper. The two approaches are at the moment not integrated. The long-term goal is to develop an integrated parallel layered object model of the visual attention process, as a tool for simulating neuron interactions described by the Hodgkin–Huxley equations or the leaky integrate-and-fire model.
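The Hodgkin–Huxley system referred to above is routinely integrated with a plain forward-Euler loop. A minimal sketch using the standard squid-axon parameters (an illustration of this kind of simulation, not the author's implementation):

import numpy as np

C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3      # membrane capacitance and conductances
ENa, EK, EL = 50.0, -77.0, -54.387          # reversal potentials (mV)

def a_m(V): return 0.1 * (V + 40) / (1 - np.exp(-(V + 40) / 10))
def b_m(V): return 4.0 * np.exp(-(V + 65) / 18)
def a_h(V): return 0.07 * np.exp(-(V + 65) / 20)
def b_h(V): return 1.0 / (1 + np.exp(-(V + 35) / 10))
def a_n(V): return 0.01 * (V + 55) / (1 - np.exp(-(V + 55) / 10))
def b_n(V): return 0.125 * np.exp(-(V + 65) / 80)

dt, T, I_ext = 0.01, 50.0, 10.0             # step (ms), duration (ms), input current
V, m, h, n = -65.0, 0.05, 0.6, 0.32         # resting membrane state
trace = []
for _ in range(int(T / dt)):
    INa = gNa * m**3 * h * (V - ENa)        # sodium current
    IK = gK * n**4 * (V - EK)               # potassium current
    IL = gL * (V - EL)                      # leak current
    V += dt * (I_ext - INa - IK - IL) / C
    m += dt * (a_m(V) * (1 - m) - b_m(V) * m)
    h += dt * (a_h(V) * (1 - h) - b_h(V) * h)
    n += dt * (a_n(V) * (1 - n) - b_n(V) * n)
    trace.append(V)                         # spikes appear as ~100 mV excursions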
9

Tiawongsombat, Prasertsak, Mun-Ho Jeong, Alongkorn Pirayawaraporn, Joong-Jae Lee, and Joo-Seop Yun. "Vision-Based Attentiveness Determination Using Scalable HMM Based on Relevance Theory." Sensors 19, no. 23 (December 3, 2019): 5331. http://dx.doi.org/10.3390/s19235331.

Abstract:
Attention capability is an essential component of human–robot interaction. Several robot attention models have been proposed that aim to enable a robot to identify the attentiveness of the humans with which it communicates and give them its attention accordingly. However, previously proposed models are often susceptible to noisy observations and result in the robot's frequent and undesired shifts in attention. Furthermore, most approaches have difficulty adapting to changes in the number of participants. To address these limitations, a novel attentiveness determination algorithm is proposed for determining the most attentive person, as well as prioritizing people based on attentiveness. The proposed algorithm, which is based on relevance theory, is named the Scalable Hidden Markov Model (Scalable HMM). The Scalable HMM allows effective computation and contributes an adaptation approach for human attentiveness; unlike conventional HMMs, the Scalable HMM has a scalable number of states and observations and online adaptability of its state transition probabilities in response to changes in the current number of states, i.e., the number of participants in the robot's view. The proposed approach was successfully tested on image sequences (7567 frames) of individuals exhibiting a variety of actions (speaking, walking, turning the head, and entering or leaving the robot's view). In these experiments, the Scalable HMM showed a detection rate of 76% in determining the most attentive person and over 75% in prioritizing people's attention under variation in the number of participants. Compared to recent attention approaches, the Scalable HMM's performance in attention prioritization represents an approximately 20% improvement.
10

Si, Nianwen, Wenlin Zhang, Dan Qu, Xiangyang Luo, Heyu Chang, and Tong Niu. "Spatial-Channel Attention-Based Class Activation Mapping for Interpreting CNN-Based Image Classification Models." Security and Communication Networks 2021 (May 31, 2021): 1–13. http://dx.doi.org/10.1155/2021/6682293.

Abstract:
Convolutional neural networks (CNNs) have been applied widely in various fields. However, they are hindered by their lack of explainability: users cannot know why a CNN-based model produces certain recognition results, which is a vulnerability of CNNs from the security perspective. To alleviate this problem, this study first analyzes in detail three existing feature visualization methods for CNNs and presents a unified visualization framework for interpreting CNN recognition results, in which the class activation weight (CAW) is the most important factor. The different types of CAWs are then further analyzed, leading to the conclusion that a linear correlation exists between them. Finally, on this basis, a spatial-channel attention-based class activation mapping (SCA-CAM) method is proposed. This method uses different types of CAWs as attention weights and combines spatial and channel attention to generate class activation maps, which allows richer features to be used for interpreting CNN results. Experiments on four different networks verify the linear correlation between different CAWs. In addition, compared with existing methods, the proposed SCA-CAM effectively improves the visualization quality of the class activation map while offering higher flexibility with respect to network structure.
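In its basic form, a class activation map is simply a class-weighted sum over the final convolutional feature maps. A generic sketch of that step (plain CAM rather than the proposed SCA-CAM; array names are illustrative):

import numpy as np

def class_activation_map(feature_maps, class_weights):
    """feature_maps: (channels, H, W) from the last conv layer;
    class_weights: (channels,) class activation weights (CAWs) for the target class."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0)                                 # keep positive evidence only
    return cam / (cam.max() + 1e-8)                          # normalize for visualization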
11

Cheng, Yepeng, Zuren Liu, and Yasuhiko Morimoto. "Attention-Based SeriesNet: An Attention-Based Hybrid Neural Network Model for Conditional Time Series Forecasting." Information 11, no. 6 (June 5, 2020): 305. http://dx.doi.org/10.3390/info11060305.

Abstract:
Traditional time series forecasting techniques cannot extract sufficiently informative features from sequence data, so their accuracy is limited. The deep learning architecture SeriesNet is an advanced method that adopts hybrid neural networks, including a dilated causal convolutional neural network (DC-CNN) and a long short-term memory recurrent neural network (LSTM-RNN), to learn multi-range and multi-level features from multi-conditional time series with higher accuracy. However, SeriesNet does not use attention mechanisms to learn temporal features; moreover, its conditioning method for the CNN and RNN is not specialized, and the number of parameters in each layer is very large. This paper proposes conditioning methods for the two types of neural networks and uses a gated recurrent unit (GRU) network and dilated depthwise separable temporal convolutional networks (DDSTCNs) in place of the LSTM and DC-CNN, respectively, to reduce the number of parameters. Furthermore, this paper presents a lightweight RNN-based hidden state attention module (HSAM) combined with the proposed CNN-based convolutional block attention module (CBAM) for time series forecasting. Experimental results show our model is superior to other models in terms of forecasting accuracy and computational efficiency.
12

Sun, Zhaohong, Wei Dong, Jinlong Shi, Kunlun He, and Zhengxing Huang. "Attention-Based Deep Recurrent Model for Survival Prediction." ACM Transactions on Computing for Healthcare 2, no. 4 (October 31, 2021): 1–18. http://dx.doi.org/10.1145/3466782.

Abstract:
Survival analysis exhibits profound effects on health service management. Traditional approaches for survival analysis have a pre-assumption on the time-to-event probability distribution and seldom consider sequential visits of patients on medical facilities. Although recent studies leverage the merits of deep learning techniques to capture non-linear features and long-term dependencies within multiple visits for survival analysis, the lack of interpretability prevents deep learning models from being applied to clinical practice. To address this challenge, this article proposes a novel attention-based deep recurrent model, named AttenSurv, for clinical survival analysis. Specifically, a global attention mechanism is proposed to extract essential/critical risk factors for interpretability improvement. Thereafter, Bi-directional Long Short-Term Memory is employed to capture the long-term dependency on data from a series of visits of patients. To further improve both the prediction performance and the interpretability of the proposed model, we propose another model, named GNNAttenSurv, by incorporating a graph neural network into AttenSurv, to extract the latent correlations between risk factors. We validated our solution on three public follow-up datasets and two electronic health record datasets. The results demonstrated that our proposed models yielded consistent improvement compared to the state-of-the-art baselines on survival analysis.
13

Gao, Tianyu, Xu Han, Zhiyuan Liu, and Maosong Sun. "Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6407–14. http://dx.doi.org/10.1609/aaai.v33i01.33016407.

Abstract:
The existing methods for relation classification (RC) primarily rely on distant supervision (DS) because large-scale supervised training datasets are not readily available. Although DS automatically annotates adequate amounts of data for model training, the coverage of this data is still quite limited, and meanwhile many long-tail relations still suffer from data sparsity. Intuitively, people can grasp new knowledge by learning few instances. We thus provide a different view on RC by formalizing RC as a few-shot learning (FSL) problem. However, the current FSL models mainly focus on low-noise vision tasks, which makes them hard to directly deal with the diversity and noise of text. In this paper, we propose hybrid attention-based prototypical networks for the problem of noisy few-shot RC. We design instance-level and feature-level attention schemes based on prototypical networks to highlight the crucial instances and features respectively, which significantly enhances the performance and robustness of RC models in a noisy FSL scenario. Besides, our attention schemes accelerate the convergence speed of RC models. Experimental results demonstrate that our hybrid attention-based models require fewer training iterations and outperform the state-of-the-art baseline models. The code and datasets are released on https://github.com/thunlp/HATT-Proto.
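The instance-level attention idea is straightforward to sketch: support instances that resemble the query contribute more to the class prototype, which dampens noisy instances. A simplified PyTorch illustration (the dot-product scoring and the shapes are assumptions for illustration, not the paper's exact model):

import torch
import torch.nn.functional as F

def attentive_prototype(support, query):
    """support: (k, d) embeddings of one class's support set; query: (d,) embedding."""
    alpha = F.softmax(support @ query, dim=0)          # instance-level attention weights
    return (alpha.unsqueeze(1) * support).sum(dim=0)   # noise-robust class prototype

def classify(query, supports):
    """supports: list of (k_c, d) tensors, one per class; returns class probabilities."""
    protos = torch.stack([attentive_prototype(s, query) for s in supports])
    dists = ((protos - query) ** 2).sum(dim=1)         # squared Euclidean distances
    return F.softmax(-dists, dim=0)                    # nearer prototype, higher probability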
14

Su, Jinsong, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, and Jiebo Luo. "Enhanced aspect-based sentiment analysis models with progressive self-supervised attention learning." Artificial Intelligence 296 (July 2021): 103477. http://dx.doi.org/10.1016/j.artint.2021.103477.

15

Rasoulidanesh, Maryamsadat, Srishti Yadav, Sachini Herath, Yasaman Vaghei, and Shahram Payandeh. "Deep Attention Models for Human Tracking Using RGBD." Sensors 19, no. 4 (February 13, 2019): 750. http://dx.doi.org/10.3390/s19040750.

Abstract:
Visual tracking performance has long been limited by the lack of good appearance models. Such models fail either when the appearance changes rapidly, as in motion-based tracking, or when accurate information about the object is unavailable, as in color camouflage (where background and foreground colors are similar). This paper proposes a robust, adaptive appearance model that works accurately in situations of color camouflage, even in the presence of complex natural objects. The proposed model includes depth as an additional feature in a hierarchical modular neural framework for online object tracking. The model adapts to confusing appearances by identifying the stable depth relation between the target and the surrounding object(s). Depth complements the existing RGB features in scenarios where RGB features fail to adapt and hence become unstable over a long duration of time. The parameters of the model are learned efficiently in a deep network consisting of three modules: (1) the spatial attention layer, which discards the majority of the background by selecting a region containing the object of interest; (2) the appearance attention layer, which extracts appearance and spatial information about the tracked object; and (3) the state estimation layer, which enables the framework to predict future object appearance and location. Three different models were trained and tested to analyze the effect of depth along with RGB information, and a further model is proposed that uses only depth as a standalone input for tracking. The proposed models were also evaluated in real time using a Kinect V2 and showed very promising results. Comparison with a state-of-the-art RGB tracking model demonstrates that adding depth significantly improves tracking accuracy in more challenging (i.e., cluttered and camouflaged) environments. Furthermore, the results of the depth-based models show that depth data can provide enough information for accurate tracking, even without RGB information.
16

Zhang, Su Xian, Dong Zhang, Su Xiang Zhang, Bing Zhen Zhao, and Lin Yan Xie. "Topic Detection Research Based on Multi-Models." Applied Mechanics and Materials 740 (March 2015): 866–70. http://dx.doi.org/10.4028/www.scientific.net/amm.740.866.

Abstract:
This paper proposes a novel multi-model approach to topic detection. It considers content similarity, time similarity, and location similarity; in addition, a Bayesian model is investigated and atomic characteristic words are extracted. Experiments combining expert knowledge with the multiple models show that the approach is effective.
17

Cai, Wenjie, Zheng Xiong, Xianfang Sun, Paul L. Rosin, Longcun Jin, and Xinyi Peng. "Panoptic Segmentation-Based Attention for Image Captioning." Applied Sciences 10, no. 1 (January 4, 2020): 391. http://dx.doi.org/10.3390/app10010391.

Abstract:
Image captioning is the task of generating textual descriptions of images. In order to obtain a better image representation, attention mechanisms have been widely adopted in image captioning. However, in existing models with detection-based attention, the rectangular attention regions are not fine-grained, as they contain irrelevant regions (e.g., background or overlapped regions) around the object, making the model generate inaccurate captions. To address this issue, we propose panoptic segmentation-based attention that performs attention at a mask-level (i.e., the shape of the main part of an instance). Our approach extracts feature vectors from the corresponding segmentation regions, which is more fine-grained than current attention mechanisms. Moreover, in order to process features of different classes independently, we propose a dual-attention module which is generic and can be applied to other frameworks. Experimental results showed that our model could recognize the overlapped objects and understand the scene better. Our approach achieved competitive performance against state-of-the-art methods. We made our code available.
18

Li, Wenkuan, Dongyuan Li, Hongxia Yin, Lindong Zhang, Zhenfang Zhu, and Peiyu Liu. "Lexicon-Enhanced Attention Network Based on Text Representation for Sentiment Classification." Applied Sciences 9, no. 18 (September 6, 2019): 3717. http://dx.doi.org/10.3390/app9183717.

Abstract:
Text representation learning is an important but challenging issue for various natural language processing tasks. Recently, deep learning-based representation models have achieved great success for sentiment classification. However, these existing models focus on more semantic information rather than sentiment linguistic knowledge, which provides rich sentiment information and plays a key role in sentiment analysis. In this paper, we propose a lexicon-enhanced attention network (LAN) based on text representation to improve the performance of sentiment classification. Specifically, we first propose a lexicon-enhanced attention mechanism by combining the sentiment lexicon with an attention mechanism to incorporate sentiment linguistic knowledge into deep learning methods. Second, we introduce a multi-head attention mechanism in the deep neural network to interactively capture the contextual information from different representation subspaces at different positions. Furthermore, we stack a LAN model to build a hierarchical sentiment classification model for large-scale text. Extensive experiments are conducted to evaluate the effectiveness of the proposed models on four popular real-world sentiment classification datasets at both the sentence level and the document level. The experimental results demonstrate that our proposed models can achieve comparable or better performance than the state-of-the-art methods.
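One plausible reading of the lexicon-enhanced attention idea is to bias each token's attention score by its lexicon polarity, so sentiment-bearing words draw more weight. A hedged sketch (the additive combination and the lam parameter are assumptions for illustration, not the paper's exact formulation):

import torch
import torch.nn as nn
import torch.nn.functional as F

class LexiconAttention(nn.Module):
    def __init__(self, hidden_dim, lam=1.0):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)  # content-based attention score
        self.lam = lam                         # weight on lexicon evidence

    def forward(self, h, lex):
        """h: (seq, hidden) encoder states; lex: (seq,) lexicon scores, e.g. |polarity|."""
        s = self.score(torch.tanh(h)).squeeze(-1) + self.lam * lex
        alpha = F.softmax(s, dim=0)
        return (alpha.unsqueeze(1) * h).sum(dim=0)   # attention-pooled sentence vector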
19

Kohlhas, Alexandre N., and Ansgar Walther. "Asymmetric Attention." American Economic Review 111, no. 9 (September 1, 2021): 2879–925. http://dx.doi.org/10.1257/aer.20191432.

Abstract:
We document that the expectations of households, firms, and professional forecasters in standard surveys simultaneously extrapolate from recent events and underreact to new information. Existing models of expectation formation, whether behavioral or rational, cannot account for these observations. We develop a rational theory of extrapolation based on limited attention, which is consistent with this evidence. In particular, we show that limited, asymmetric attention to procyclical variables can explain the coexistence of extrapolation and underreactions. We illustrate these mechanisms in a microfounded macroeconomic model, which generates expectations consistent with the survey data, and show that asymmetric attention increases business cycle fluctuations. (JEL C53, D83, D84, E23, E27, E32)
20

Chen, Xu, Yongfeng Zhang, and Zheng Qin. "Dynamic Explainable Recommendation Based on Neural Attentive Models." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 53–60. http://dx.doi.org/10.1609/aaai.v33i01.330153.

Abstract:
Providing explanations in a recommender system is receiving more and more attention in both industry and research communities. Most existing explainable recommender models regard user preferences as invariant and generate static explanations. However, in real scenarios a user's preference is always dynamic, and she may be interested in different product features at different states. A mismatch between the explanation and the user's preference may degrade customers' satisfaction, confidence, and trust in the recommender system. To fill this gap, in this paper we build a novel Dynamic Explainable Recommender (called DER) for more accurate user modeling and explanations. Specifically, we design a time-aware gated recurrent unit (GRU) to model user dynamic preferences, and profile an item by its review information based on a sentence-level convolutional neural network (CNN). By attentively learning the important review information according to the user's current state, we are not only able to improve the recommendation performance but can also provide explanations tailored to the user's current preferences. We conduct extensive experiments to demonstrate the superiority of our model in improving recommendation performance. To evaluate the explainability of our model, we first present examples providing intuitive analysis of the highlighted review information, and then conduct crowd-sourcing-based evaluations to quantitatively verify our model's superiority.
21

Kardakis, Spyridon, Isidoros Perikos, Foteini Grivokostopoulou, and Ioannis Hatzilygeroudis. "Examining Attention Mechanisms in Deep Learning Models for Sentiment Analysis." Applied Sciences 11, no. 9 (April 25, 2021): 3883. http://dx.doi.org/10.3390/app11093883.

Abstract:
Attention-based methods for deep neural networks constitute a technique that has attracted increased interest in recent years. Attention mechanisms can focus on important parts of a sequence and, as a result, enhance the performance of neural networks in a variety of tasks, including sentiment analysis, emotion recognition, machine translation and speech recognition. In this work, we study attention-based models built on recurrent neural networks (RNNs) and examine their performance in various contexts of sentiment analysis. Self-attention, global-attention and hierarchical-attention methods are examined under various deep neural models, training methods and hyperparameters. Even though attention mechanisms are a powerful recent concept in the field of deep learning, their exact effectiveness in sentiment analysis is yet to be thoroughly assessed. A comparative analysis is performed in a text sentiment classification task where baseline models are compared with and without the use of attention for every experiment. The experimental study additionally examines the proposed models’ ability in recognizing opinions and emotions in movie reviews. The results indicate that attention-based models lead to great improvements in the performance of deep neural models showcasing up to a 3.5% improvement in their accuracy.
22

Zachary, Wayne. "A Context-Based Model of Attention Switching in Computer-Human Interaction Domains." Proceedings of the Human Factors Society Annual Meeting 33, no. 5 (October 1989): 286–90. http://dx.doi.org/10.1177/154193128903300511.

Abstract:
COGNET (Cognitive Network of Tasks) is a model-building framework for real-time attention sharing cognitive processes. It is particularly designed for the construction of computational models of human-computer interaction. COGNET is unique in that it leads to context-sensitive models of attention switching based on the human operator's knowledge of the real-world domain being modeled. A COGNET model combines an augmented version of the GOMS task analysis language with the blackboard architecture of control. This paper discusses the theoretical organization of the COGNET framework, as well as the augmented GOMS/blackboard tools used to build COGNET models.
23

Xue, Lanqing, Xiaopeng Li, and Nevin L. Zhang. "Not All Attention Is Needed: Gated Attention Network for Sequence Data." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6550–57. http://dx.doi.org/10.1609/aaai.v34i04.6129.

Abstract:
Although deep neural networks generally have fixed network structures, the concept of dynamic mechanism has drawn more and more attention in recent years. Attention mechanisms compute input-dependent dynamic attention weights for aggregating a sequence of hidden states. Dynamic network configuration in convolutional neural networks (CNNs) selectively activates only part of the network at a time for different inputs. In this paper, we combine the two dynamic mechanisms for text classification tasks. Traditional attention mechanisms attend to the whole sequence of hidden states for an input sentence, while in most cases not all attention is needed especially for long sequences. We propose a novel method called Gated Attention Network (GA-Net) to dynamically select a subset of elements to attend to using an auxiliary network, and compute attention weights to aggregate the selected elements. It avoids a significant amount of unnecessary computation on unattended elements, and allows the model to pay attention to important parts of the sequence. Experiments in various datasets show that the proposed method achieves better performance compared with all baseline models with global or local attention while requiring less computation and achieving better interpretability. It is also promising to extend the idea to more complex attention-based models, such as transformers and seq-to-seq models.
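The core mechanism fits in a few lines: an auxiliary gate first decides which positions may be attended to at all, and the softmax is taken only over the survivors. A simplified illustration (GA-Net learns the gates with an auxiliary network and relaxed discrete variables; here they are given as inputs):

import torch
import torch.nn.functional as F

def gated_attention(scores, gates):
    """scores: (seq,) raw attention scores; gates: (seq,) booleans from an auxiliary net."""
    masked = scores.masked_fill(~gates, float('-inf'))  # closed positions get zero weight
    return F.softmax(masked, dim=0)

scores = torch.tensor([2.0, 0.1, 1.5, -0.3])
gates = torch.tensor([True, False, True, False])
print(gated_attention(scores, gates))  # weight is distributed over positions 0 and 2 only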
24

Qi, Feng, Debin Zhao, Xiaopeng Fan, and Tingting Jiang. "Stereoscopic video quality assessment based on visual attention and just-noticeable difference models." Signal, Image and Video Processing 10, no. 4 (August 6, 2015): 737–44. http://dx.doi.org/10.1007/s11760-015-0802-4.

25

Schneider, W. X. "Space-based visual attention models and object selection: Constraints, problems, and possible solutions." Psychological Research 56, no. 1 (1993): 35–43. http://dx.doi.org/10.1007/bf00572131.

26

Barić, Domjan, Petar Fumić, Davor Horvatić, and Tomislav Lipic. "Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions." Entropy 23, no. 2 (January 25, 2021): 143. http://dx.doi.org/10.3390/e23020143.

Abstract:
The adoption of deep learning models within safety-critical systems cannot rely only on good prediction performance; it also requires interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to endow deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetic datasets with a transparent underlying generating process of multiple time series interactions of increasing complexity. The benchmark enables empirical evaluation of attention-based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfying and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfying performance score and correct interpretability is IMV-LSTM, which captures both autocorrelations and cross-correlations between multiple time series. Interestingly, while evaluating IMV-LSTM on simulated data from statistical and mechanistic models, the correctness of interpretability increases with more complex datasets.
27

Hu, Feng, Jin-Li Guo, Fa-Xu Li, and Hai-Xing Zhao. "Hypernetwork models based on random hypergraphs." International Journal of Modern Physics C 30, no. 08 (August 2019): 1950052. http://dx.doi.org/10.1142/s0129183119500529.

Abstract:
Hypernetworks are ubiquitous in real-world systems. They provide a powerful means of accurately depicting networks with different types of entities and will attract more attention from researchers in the future. Most previous hypernetwork research has focused on the application and modeling of uniform hypernetworks, which are based on uniform hypergraphs. However, random hypernetworks are generally more common; it is therefore useful to investigate their evolution mechanisms. In this paper, we construct three dynamic evolution models of hypernetworks: the equal-probability random hypernetwork model, the Poisson-probability random hypernetwork model, and the certain-probability random hypernetwork model. Furthermore, we analyze the hyperdegree distributions of the three models with mean-field theory, and we simulate each model numerically with different parameter values. The simulation results agree well with our theoretical analysis, and the findings indicate that the models could help in understanding the structure and evolution mechanisms of real systems.
28

Abdallah, Abdelrahman, Mohamed Hamada, and Daniyar Nurseitov. "Attention-Based Fully Gated CNN-BGRU for Russian Handwritten Text." Journal of Imaging 6, no. 12 (December 18, 2020): 141. http://dx.doi.org/10.3390/jimaging6120141.

Abstract:
This article considers the task of handwritten text recognition using attention-based encoder–decoder networks trained for the Kazakh and Russian languages. We have developed a novel deep neural network model based on a fully gated CNN, supported by multiple bidirectional gated recurrent units (BGRU) and attention mechanisms to manipulate sophisticated features, achieving a 0.045 Character Error Rate (CER), 0.192 Word Error Rate (WER), and 0.253 Sequence Error Rate (SER) on the first test dataset and 0.064 CER, 0.24 WER, and 0.361 SER on the second test dataset. Our proposed model is the first to handle handwriting recognition in the Kazakh and Russian languages. Our results confirm the importance of the proposed Attention-Gated-CNN-BGRU approach for handwritten text recognition and indicate that it leads to statistically significant improvements (p-value < 0.05) in sensitivity (recall) on the test dataset. The proposed method's performance was evaluated using handwritten text databases in three languages: English, Russian, and Kazakh. It demonstrates better results on the Handwritten Kazakh and Russian (HKR) dataset than other well-known models.
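The CER and WER figures quoted above follow the standard definition: edit (Levenshtein) distance between hypothesis and reference divided by reference length. A self-contained illustration (not the authors' evaluation script; WER is the same computation applied to word lists):

def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cer(hyp, ref):
    return levenshtein(hyp, ref) / max(len(ref), 1)

print(cer("превет", "привет"))                   # one substitution over six chars ~ 0.167
print(cer("the cat sat".split(), "the cat sat".split()))  # WER-style use on words: 0.0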
29

Xu, Jie, Haoliang Wei, Linke Li, Qiuru Fu, and Jinhong Guo. "Video Description Model Based on Temporal-Spatial and Channel Multi-Attention Mechanisms." Applied Sciences 10, no. 12 (June 23, 2020): 4312. http://dx.doi.org/10.3390/app10124312.

Abstract:
Video description plays an important role in the field of intelligent imaging technology. Attention perception mechanisms are extensively applied in video description models based on deep learning, and most existing models use a temporal-spatial attention mechanism to enhance accuracy. Temporal attention mechanisms can obtain the global features of a video, whereas spatial attention mechanisms obtain local features. Nevertheless, because each channel of the convolutional neural network (CNN) feature maps carries certain spatial semantic information, it is insufficient to merely divide the CNN features into regions and then apply a spatial attention mechanism. In this paper, we propose a temporal-spatial and channel attention mechanism that enables the model to take advantage of various video features and ensures consistency between visual features and sentence descriptions. Moreover, to demonstrate the effectiveness of the attention mechanism, this paper proposes a video visualization model based on video description. Experimental results show that our model achieves good performance on the Microsoft Video Description (MSVD) dataset and a certain improvement on the Microsoft Research-Video to Text (MSR-VTT) dataset.
30

Demiris, Yiannis, and Bassam Khadhouri. "Content-based control of goal-directed attention during human action perception." Interaction Studies 9, no. 2 (May 26, 2008): 353–76. http://dx.doi.org/10.1075/is.9.2.10dem.

Abstract:
During the perception of human actions by robotic assistants, the robotic assistant needs to direct its computational and sensor resources to relevant parts of the human action. In previous work we have introduced HAMMER (Hierarchical Attentive Multiple Models for Execution and Recognition) (Demiris and Khadhouri, 2006), a computational architecture that forms multiple hypotheses with respect to what the demonstrated task is, and multiple predictions with respect to the forthcoming states of the human action. To confirm their predictions, the hypotheses request information from an attentional mechanism, which allocates the robot’s resources as a function of the saliency of the hypotheses. In this paper we augment the attention mechanism with a component that considers the content of the hypotheses’ requests, with respect to the content’s reliability, utility and cost. This content-based attention component further optimises the utilisation of the resources while remaining robust to noise. Such computational mechanisms are important for the development of robotic devices that will rapidly respond to human actions, either for imitation or collaboration purposes.
31

Liu, Chen, Feng Li, Xian Sun, and Hongzhe Han. "Attention-Based Joint Entity Linking with Entity Embedding." Information 10, no. 2 (February 1, 2019): 46. http://dx.doi.org/10.3390/info10020046.

Abstract:
Entity linking (also called entity disambiguation) aims to map the mentions in a given document to their corresponding entities in a target knowledge base. In order to build a high-quality entity linking system, efforts are made in three parts: encoding of the entity, encoding of the mention context, and modeling the coherence among mentions. For the encoding of the entity, we use a long short-term memory (LSTM) network and a convolutional neural network (CNN) to encode the entity context and entity description, respectively. Then, we design a function to combine all the different aspects of entity information in order to generate unified, dense entity embeddings. For the encoding of the mention context, unlike standard attention mechanisms, which can only capture important individual words, we introduce a novel attention-mechanism-based LSTM model that can effectively capture the important text spans around a given mention with a conditional random field (CRF) layer. In addition, we take the coherence among mentions into consideration with a forward-backward algorithm, which is less time-consuming than previous methods. Our experimental results show that our model obtains a competitive, or even better, performance than state-of-the-art models across different datasets.
32

Hong, Huiting, Hantao Guo, Yucheng Lin, Xiaoqing Yang, Zang Li, and Jieping Ye. "An Attention-Based Graph Neural Network for Heterogeneous Structural Learning." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4132–39. http://dx.doi.org/10.1609/aaai.v34i04.5833.

Abstract:
In this paper, we focus on graph representation learning of heterogeneous information network (HIN), in which various types of vertices are connected by various types of relations. Most of the existing methods conducted on HIN revise homogeneous graph embedding models via meta-paths to learn low-dimensional vector space of HIN. In this paper, we propose a novel Heterogeneous Graph Structural Attention Neural Network (HetSANN) to directly encode structural information of HIN without meta-path and achieve more informative representations. With this method, domain experts will not be needed to design meta-path schemes and the heterogeneous information can be processed automatically by our proposed model. Specifically, we implicitly represent heterogeneous information using the following two methods: 1) we model the transformation between heterogeneous vertices through a projection in low-dimensional entity spaces; 2) afterwards, we apply the graph neural network to aggregate multi-relational information of projected neighborhood by means of attention mechanism. We also present three extensions of HetSANN, i.e., voices-sharing product attention for the pairwise relationships in HIN, cycle-consistency loss to retain the transformation between heterogeneous entity spaces, and multi-task learning with full use of information. The experiments conducted on three public datasets demonstrate that our proposed models achieve significant and consistent improvements compared to state-of-the-art solutions.
33

Xia, Hongbin, Yang Luo, and Yuan Liu. "Attention neural collaboration filtering based on GRU for recommender systems." Complex & Intelligent Systems 7, no. 3 (January 30, 2021): 1367–79. http://dx.doi.org/10.1007/s40747-021-00274-4.

Abstract:
The collaborative filtering method is widely used in traditional recommender systems. Collaborative filtering based on matrix factorization treats the user's preference for an item as a linear combination of the user and item latent vectors and cannot learn deeper feature representations; in addition, cold start and data sparsity remain major problems for collaborative filtering. To tackle these problems, some scholars have proposed using deep neural networks to extract text information, but did not consider the impact of long-distance dependency information and key information on their models. In this paper, we propose a neural collaborative filtering recommender method that integrates user and item auxiliary information. This method fully integrates user-item rating information, user auxiliary information, and item text auxiliary information for feature extraction. First, a Stacked Denoising Auto Encoder is used to extract user features, and a Gated Recurrent Unit with auxiliary information is used to extract items' latent vectors; an attention mechanism is used to learn key information when extracting text features. Second, the latent vectors learned by deep learning techniques are fed into multi-layer nonlinear networks to learn more abstract and deeper feature representations for predicting user preferences. According to the verification results on the MovieLens dataset, the proposed model outperforms other traditional approaches and deep learning models, making it state of the art.
34

Blair, R. J. R., and D. G. V. Mitchell. "Psychopathy, attention and emotion." Psychological Medicine 39, no. 4 (August 14, 2008): 543–55. http://dx.doi.org/10.1017/s0033291708003991.

Abstract:
Psychopathy is a developmental disorder marked by emotional hypo-responsiveness and an increased risk for antisocial behavior. Influential attention-based accounts of psychopathy have long been made; however, these accounts have made relatively little reference to general models of attention in healthy individuals. This review has three aims: (1) to summarize current cognitive neuroscience data on differing attentional systems; (2) to examine the functional integrity of these attentional systems in individuals with psychopathy; and (3) to consider the implications of these data for attention and emotion dysfunction accounts of psychopathy.
35

Li, Shengwen, Renyao Chen, Bo Wan, Junfang Gong, Lin Yang, and Hong Yao. "DAWE: A Double Attention-Based Word Embedding Model with Sememe Structure Information." Applied Sciences 10, no. 17 (August 21, 2020): 5804. http://dx.doi.org/10.3390/app10175804.

Abstract:
Word embedding is an important foundation for natural language processing tasks, generating distributed representations of words from large amounts of text data. Recent evidence demonstrates that introducing sememe knowledge is a promising strategy to improve the performance of word embedding. However, previous works ignored the structural information of sememe knowledge. To fill the gap, this study implicitly synthesizes the structural features of sememes into word embedding models based on an attention mechanism. Specifically, we propose a novel double attention word-based embedding (DAWE) model that encodes the characteristics of sememes into words by a “double attention” strategy. DAWE is integrated with two specific word training models through context-aware semantic matching techniques. The experimental results show that, in the word similarity task and the word analogy reasoning task, the performance of word embedding can be effectively improved by synthesizing the structural information of sememe knowledge. A case study also verifies the power of the DAWE model in the word sense disambiguation task. Furthermore, DAWE is a general framework for encoding sememes into words and can be integrated into other existing word embedding models to provide more options for various natural language processing downstream tasks.
36

Tian, Jinkai, Peifeng Yan, and Da Huang. "Kernel Analysis Based on Dirichlet Processes Mixture Models." Entropy 21, no. 9 (September 2, 2019): 857. http://dx.doi.org/10.3390/e21090857.

Abstract:
Kernels play a crucial role in Gaussian process regression, and analyzing kernels in the spectral domain has attracted extensive attention in recent years. Gaussian mixture models (GMMs) are used to model the spectra of kernels; however, the number of components in a GMM is fixed, so the model suffers from overfitting or underfitting. In this paper, we combine the spectral domain of kernels with nonparametric Bayesian models: Dirichlet process mixture models resolve this problem by changing the number of components according to the data size. Multiple experiments have been conducted on this model, and it shows competitive performance.
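In practice, such a model can be fit with scikit-learn's truncated variational approximation of a Dirichlet process, which keeps only as many mixture components as the data support (an illustration of the modeling idea on toy data, not the paper's experiments):

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

X = np.concatenate([np.random.randn(200, 1) - 4,
                    np.random.randn(200, 1) + 4])  # two well-separated clusters

dpmm = BayesianGaussianMixture(
    n_components=20,                               # truncation level, not a fixed choice
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)
print(np.round(dpmm.weights_, 2))  # most of the 20 components receive negligible weight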
37

Zou, Xiaochun, Xinbo Zhao, Jian Wang, and Yongjia Yang. "Learning to Model Task-Oriented Attention." Computational Intelligence and Neuroscience 2016 (2016): 1–12. http://dx.doi.org/10.1155/2016/2381451.

Abstract:
For many applications in graphics, design, and human-computer interaction, it is essential to understand where humans look in a scene while performing a particular task. Models of saliency can be used to predict fixation locations, but a large body of previous saliency models focused on the free-viewing task. They are based on bottom-up computation that does not consider task-oriented image semantics and often does not match actual eye movements. To address this problem, we collected eye-tracking data from 11 subjects performing particular search tasks in 1307 images, along with annotations of 2,511 segmented objects with fine contours and 8 semantic attributes. Using this database as training and testing examples, we learn a model of saliency based on bottom-up image features and a target position feature. Experimental results demonstrate the importance of target information in the prediction of task-oriented visual attention.
38

Nahmias-Biran, Bat-hen, Yafei Han, Shlomo Bekhor, Fang Zhao, Christopher Zegras, and Moshe Ben-Akiva. "Enriching Activity-Based Models using Smartphone-Based Travel Surveys." Transportation Research Record: Journal of the Transportation Research Board 2672, no. 42 (October 19, 2018): 280–91. http://dx.doi.org/10.1177/0361198118798475.

Abstract:
Smartphone-based travel surveys have attracted much attention recently, for their potential to improve data quality and response rate. One of the first such survey systems, Future Mobility Sensing (FMS), leverages sensors on smartphones, and machine learning techniques to collect detailed personal travel data. The main purpose of this research is to compare data collected by FMS and traditional methods, and study the implications of using FMS data for travel behavior modeling. Since its initial field test in Singapore, FMS has been used in several large-scale household travel surveys, including one in Tel Aviv, Israel. We present comparative analyses that make use of the rich datasets from Singapore and Tel Aviv, focusing on three main aspects: (1) richness in activity behaviors observed, (2) completeness of travel and activity data, and (3) data accuracy. Results show that FMS has clear advantages over traditional travel surveys: it has higher resolution and better accuracy of times, locations, and paths; FMS represents out-of-work and leisure activities well; and reveals large variability in day-to-day activity pattern, which is inadequately captured in a one-day snapshot in typical traditional surveys. FMS also captures travel and activities that tend to be under-reported in traditional surveys such as multiple stops in a tour and work-based sub-tours. These richer and more complete and accurate data can improve future activity-based modeling.
39

Markevičiūtė, Jurgita, Jolita Bernatavičienė, Rūta Levulienė, Viktor Medvedev, Povilas Treigys, and Julius Venskus. "Attention-Based and Time Series Models for Short-Term Forecasting of COVID-19 Spread." Computers, Materials & Continua 70, no. 1 (2022): 695–714. http://dx.doi.org/10.32604/cmc.2022.018735.

40

Chen, Lei, and Li Sun. "Self-Attention-Based Real-Time Signal Detector for Communication Systems With Unknown Channel Models." IEEE Communications Letters 25, no. 8 (August 2021): 2639–43. http://dx.doi.org/10.1109/lcomm.2021.3082708.

41

Shi, Jiaqi, Chaoran Liu, Carlos Toshinori Ishi, and Hiroshi Ishiguro. "Skeleton-Based Emotion Recognition Based on Two-Stream Self-Attention Enhanced Spatial-Temporal Graph Convolutional Network." Sensors 21, no. 1 (December 30, 2020): 205. http://dx.doi.org/10.3390/s21010205.

Abstract:
Emotion recognition has drawn consistent attention from researchers recently. Although the gesture modality plays an important role in expressing emotion, it is seldom considered in the field of emotion recognition; a key reason is the scarcity of labeled data containing 3D skeleton data. Some studies in action recognition have applied graph-based neural networks to explicitly model the spatial connections between joints, but this method had not previously been considered in the field of gesture-based emotion recognition. In this work, we applied a pose-estimation-based method to extract 3D skeleton coordinates for the IEMOCAP database. We propose a self-attention enhanced spatial-temporal graph convolutional network for skeleton-based emotion recognition, in which the spatial convolutional part models the skeletal structure of the body as a static graph and the self-attention part dynamically constructs additional connections between the joints, providing supplementary information. Our experiments demonstrate that the proposed model significantly outperforms other models and that the features of the extracted skeleton data improve the performance of multimodal emotion recognition.
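The spatial half of such graph convolutional models reduces to mixing joint features through a normalized skeleton adjacency matrix. A generic NumPy sketch of one layer (the four-joint chain and the dimensions are toy assumptions, not the skeleton used in the paper):

import numpy as np

def normalized_adjacency(edges, n_joints):
    A = np.eye(n_joints)                    # self-loops keep each joint's own feature
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return d_inv_sqrt @ A @ d_inv_sqrt      # A_hat = D^(-1/2) (A + I) D^(-1/2)

edges = [(0, 1), (1, 2), (2, 3)]            # a toy 4-joint chain, e.g. an arm
A_hat = normalized_adjacency(edges, 4)
X = np.random.randn(4, 3)                   # per-joint 3D coordinates
W = np.random.randn(3, 8)                   # a learnable projection in a real model
H = A_hat @ X @ W                           # one spatial graph-convolution step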
42

Kaldy, Joanne. "Population Health: An Old Idea Gets New Attention." Senior Care Pharmacist 34, no. 5 (May 1, 2019): 293–301. http://dx.doi.org/10.4140/tcp.n.2019.293.

Abstract:
A focus on patient populations—as opposed to care settings—encompasses a broad array of health care models: accountable care organizations, managed care, bundled payments, and other value-based care medical models. Pharmacists have a key role to play in streamlining medication management within these settings, ensuring a smooth transition as patients move through the care continuum, and preventing avoidable hospitalizations and readmissions.
43

Lu, Huimin, Rui Yang, Zhenrong Deng, Yonglin Zhang, Guangwei Gao, and Rushi Lan. "Chinese Image Captioning via Fuzzy Attention-based DenseNet-BiLSTM." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 1s (March 31, 2021): 1–18. http://dx.doi.org/10.1145/3422668.

Abstract:
Chinese image description generation tasks usually face challenges such as single-feature extraction, lack of global information, and lack of detailed description of the image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract features of the image at different scales and to enhance the model's ability to capture weak features. At the same time, a bidirectional LSTM is used as the decoder to enhance the use of context information. The introduction of an improved fuzzy attention mechanism effectively improves the correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that, compared with other models, our proposed model achieves higher scores on objective quantitative evaluation metrics, including BLEU, METEOR, ROUGE-L, and CIDEr. The generated description sentences can accurately express the image content.
44

Roy, Aurko, Mohammad Saffar, Ashish Vaswani, and David Grangier. "Efficient Content-Based Sparse Attention with Routing Transformers." Transactions of the Association for Computational Linguistics 9 (February 2021): 53–68. http://dx.doi.org/10.1162/tacl_a_00353.

Abstract:
Self-attention has recently been adopted for a wide range of sequence modeling problems. Despite its effectiveness, self-attention suffers from quadratic computation and memory requirements with respect to sequence length. Successful approaches to reduce this complexity focused on attending to local sliding windows or a small set of locations independent of content. Our work proposes to learn dynamic sparse attention patterns that avoid allocating computation and memory to attend to content unrelated to the query of interest. This work builds upon two lines of research: it combines the modeling flexibility of prior work on content-based sparse attention with the efficiency gains from approaches based on local, temporal sparse attention. Our model, the Routing Transformer, endows self-attention with a sparse routing module based on online k-means while reducing the overall complexity of attention to O(n^1.5 d) from O(n^2 d) for sequence length n and hidden dimension d. We show that our model outperforms comparable sparse attention models on language modeling on Wikitext-103 (15.8 vs 18.3 perplexity), as well as on image generation on ImageNet-64 (3.43 vs 3.44 bits/dim) while using fewer self-attention layers. Additionally, we set a new state of the art on the newly released PG-19 dataset, obtaining a test perplexity of 33.2 with a 22-layer Routing Transformer model trained on sequences of length 8192. We open-source the code for the Routing Transformer in TensorFlow.
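The complexity claim can be checked with a one-line count (assuming, as the routing scheme intends, clusters of size about sqrt(n)): each of the n queries attends to roughly sqrt(n) keys, so scoring costs on the order of n · sqrt(n) · d = n^1.5 · d operations rather than n^2 · d. For the PG-19 setting with n = 8192, that is roughly 7.4 × 10^5 query-key scores per head instead of 6.7 × 10^7, about a 90-fold reduction.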
45

Ma, Jiajia, Chao Che, and Qiang Zhang. "Medical Answer Selection Based on Two Attention Mechanisms with BiRNN." MATEC Web of Conferences 176 (2018): 01024. http://dx.doi.org/10.1051/matecconf/201817601024.

Abstract:
The contradiction between China's large population and its limited medical resources makes it difficult for people to obtain medical services. The emergence of question answering (QA) systems in the medical field allows people to receive timely treatment at home and alleviates the burden on hospitals and doctors. To this end, this paper proposes a new model called Att-BiRNN-Att, which combines a bidirectional RNN (recurrent neural network) with two attention mechanisms. The model employs a BiRNN to capture more information from the context than a traditional unidirectional RNN. Two attention mechanisms are used to produce a better feature representation of the answer: one is applied before the input of the BiRNN, and the other after its output. The combination of the two attentions makes full use of the relevant information between the answer and the question. Experiments on the HealthTap medical QA dataset show that our model outperforms four state-of-the-art deep learning models, confirming the effectiveness of the Att-BiRNN-Att model.
APA, Harvard, Vancouver, ISO, and other styles
46

Tan, Zhen, Bo Li, Peixin Huang, Bin Ge, and Weidong Xiao. "Neural Relation Classification Using Selective Attention and Symmetrical Directional Instances." Symmetry 10, no. 9 (August 21, 2018): 357. http://dx.doi.org/10.3390/sym10090357.

Full text
Abstract:
Relation classification (RC) is an important task in information extraction from unstructured text. Recently, several neural methods based on various network architectures have been adopted for RC. Among them, convolutional neural network (CNN)-based models stand out due to their simple structure, low model complexity, and "good" performance. Nevertheless, at least two limitations remain in existing CNN-based RC models. First, when handling samples with long distances between entities, they fail to extract effective features and may even pick up distracting features from intervening clauses, which decreases accuracy. Second, existing RC models tend to produce inconsistent results when fed the forward and backward instances of an identical sample. Therefore, we present a novel CNN-based sentence encoder with selective attention that leverages shortest dependency paths, and devise a classification framework that fuses information from symmetrical directional (forward and backward) instances. Comprehensive experiments verify the superior performance of the proposed RC model over mainstream competitors without additional handcrafted features.
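The directional-fusion step can be summarized in a few lines. The sketch below assumes placeholder `encoder` and `classifier` modules and uses simple logit averaging as the fusion; the paper's exact fusion scheme may differ.

```python
import torch

def classify_with_direction_fusion(encoder, classifier, fwd_tokens, bwd_tokens):
    """Encode the forward and backward (entity-order-reversed) instances of the
    same sample and fuse the two predictions, so the classifier behaves
    consistently in both directions. `encoder`/`classifier` are placeholders."""
    logits_fwd = classifier(encoder(fwd_tokens))
    logits_bwd = classifier(encoder(bwd_tokens))
    return (logits_fwd + logits_bwd) / 2  # averaging as one simple fusion choice
```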
APA, Harvard, Vancouver, ISO, and other styles
47

Wan, Haifeng, Lei Gao, Manman Su, Qirun Sun, and Lei Huang. "Attention-Based Convolutional Neural Network for Pavement Crack Detection." Advances in Materials Science and Engineering 2021 (April 7, 2021): 1–13. http://dx.doi.org/10.1155/2021/5520515.

Full text
Abstract:
Achieving high detection accuracy for pavement cracks with complex textures under different lighting conditions is still challenging. In this context, an encoder-decoder architecture named CrackResAttentionNet was proposed in this study, in which a position attention module and a channel attention module are connected after each encoder to summarize long-range contextual information. The experimental results demonstrated that, compared with other popular models (ENet, ExFuse, FCN, LinkNet, SegNet, and UNet) on the public dataset, CrackResAttentionNet with the BCE loss function and PReLU activation function achieved the best performance in terms of precision (89.40), mean IoU (71.51), recall (81.09), and F1 (85.04). Meanwhile, on a self-developed dataset (the Yantai dataset), CrackResAttentionNet with the BCE loss and PReLU activation also performed better, with precision of 96.17, mean IoU of 83.69, recall of 93.44, and F1 of 94.79. In particular, on the public dataset, the BCE loss and PReLU activation improved precision by 3.21; on the Yantai dataset, they improved precision by 0.99, mean IoU by 0.74, recall by 1.1, and F1 by 1.24.
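To make the channel attention component concrete, here is a sketch in the style of dual-attention segmentation networks, which compute channel-wise affinities and reweight the feature map through a learned residual scale. It illustrates the kind of module attached after each encoder, not the authors' exact code.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: channel-to-channel affinities reweight the feature
    map, with a zero-initialized residual scale so training starts from the
    identity mapping."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        flat = x.view(b, c, -1)                         # (B, C, HW)
        energy = torch.bmm(flat, flat.transpose(1, 2))  # (B, C, C) channel affinity
        attn = torch.softmax(energy, dim=-1)
        out = torch.bmm(attn, flat).view(b, c, h, w)
        return self.gamma * out + x                     # residual connection
```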
APA, Harvard, Vancouver, ISO, and other styles
48

Zhang, Jinsong, Yongtao Peng, Bo Ren, and Taoying Li. "PM2.5 Concentration Prediction Based on CNN-BiLSTM and Attention Mechanism." Algorithms 14, no. 7 (July 13, 2021): 208. http://dx.doi.org/10.3390/a14070208.

Full text
Abstract:
The concentration of PM2.5 is an important index for measuring the degree of air pollution. When it exceeds the standard value, air quality is considered polluted, which is harmful to human health and can cause a variety of diseases, e.g., asthma and chronic bronchitis. Therefore, predicting the PM2.5 concentration helps reduce its harm. In this paper, a hybrid model called CNN-BiLSTM-Attention is proposed to predict the PM2.5 concentration over the next two days. First, we select hourly PM2.5 concentration data from January 2013 to February 2017 for Shunyi District, Beijing; the auxiliary data include air quality data and meteorological data. We use the sliding window method for preprocessing and divide the data into a training set, a validation set, and a test set. Second, CNN-BiLSTM-Attention is composed of a convolutional neural network, a bidirectional long short-term memory network, and an attention mechanism. The parameters of this network structure, including the size of the convolution kernel, activation function, batch size, dropout rate, learning rate, etc., are determined by the minimum error during training. We determine the input and output feature sizes by evaluating the performance of the model, finding the best output horizon to be the next 48 h. Third, in the experimental part, we use the test set to check the performance of the proposed CNN-BiLSTM-Attention model on PM2.5 prediction and compare it with other models, namely lasso regression, ridge regression, XGBoost, SVR, CNN-LSTM, and CNN-BiLSTM. We conduct short-term prediction (48 h) and long-term prediction (72 h, 96 h, 120 h, 144 h). The results demonstrate that even the 144 h predictions of CNN-BiLSTM-Attention are better than the 48 h predictions of the comparison models in terms of mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²).
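The sliding-window preprocessing mentioned above can be sketched as follows; the window lengths are illustrative rather than the paper's tuned values, and PM2.5 is assumed to sit in column 0 of the series.

```python
import numpy as np

def sliding_windows(series, in_len=72, out_len=48):
    """Split an hourly multivariate series (T, F) into overlapping model
    inputs of length `in_len` and PM2.5 targets of length `out_len`."""
    X, y = [], []
    for start in range(len(series) - in_len - out_len + 1):
        X.append(series[start:start + in_len])                    # all features
        y.append(series[start + in_len:start + in_len + out_len, 0])  # PM2.5 only
    return np.stack(X), np.stack(y)
```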
APA, Harvard, Vancouver, ISO, and other styles
49

Park, Sangmin, Eum Han, Sungho Park, Harim Jeong, and Ilsoo Yun. "Deep Q-network-based traffic signal control models." PLOS ONE 16, no. 9 (September 2, 2021): e0256405. http://dx.doi.org/10.1371/journal.pone.0256405.

Full text
Abstract:
Traffic congestion has become common in urban areas worldwide. To solve this problem, methods that search for solutions using artificial intelligence have recently attracted widespread attention because they can handle complex problems such as traffic signal control. This study developed two traffic signal control models using reinforcement learning and microscopic simulation-based evaluation, one for an isolated intersection and one for two coordinated intersections. To develop these models, a deep Q-network (DQN), a promising reinforcement learning algorithm, was used. Performance was evaluated by comparing the developed traffic signal control models with a fixed-time signal plan optimized by the Synchro model, a traffic signal optimization tool. The evaluation validated the developed model for the isolated intersection and showed that the coordinated-intersection model was superior to the fixed-time signal control method.
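For orientation, a single DQN learning step for such a controller might look like the sketch below, where states encode the intersection (e.g., queue lengths) and actions select signal phases. The study's exact state, action, and reward design is not reproduced; all names are hypothetical.

```python
import torch
import torch.nn as nn

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One DQN step: fit Q(s, a) toward r + gamma * max_a' Q_target(s', a').
    `batch` holds (states, actions, rewards, next_states, dones) tensors."""
    states, actions, rewards, next_states, dones = batch
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target from the slowly updated target network.
        target = rewards + gamma * target_net(next_states).max(dim=1).values * (1 - dones)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```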
APA, Harvard, Vancouver, ISO, and other styles
50

Wang, Yingying, Yibin Li, Yong Song, and Xuewen Rong. "Facial Expression Recognition Based on Auxiliary Models." Algorithms 12, no. 11 (October 31, 2019): 227. http://dx.doi.org/10.3390/a12110227.

Full text
Abstract:
In recent years, with the development of artificial intelligence and human–computer interaction, more attention has been paid to the recognition and analysis of facial expressions. Despite great success, many problems remain, because facial expressions are subtle and complex; hence, facial expression recognition is still challenging. In most papers, the entire face image is chosen as the input. In daily life, however, people can perceive others' current emotions from only a few facial components (such as the eyes, mouth, and nose), while other areas of the face (such as hair, skin tone, ears, etc.) play a smaller role in determining one's emotion. If the entire face image is used as the only input, the system will produce unnecessary information and miss important information during feature extraction. To solve this problem, this paper proposes a method that combines multiple sub-regions and the entire face image by weighting, which captures more of the important feature information and thereby improves recognition accuracy. The proposed method was evaluated on four well-known publicly available facial expression databases: JAFFE, CK+, FER2013, and SFEW. The new method showed better performance than most state-of-the-art methods.
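The weighted combination of sub-regions and the whole face can be sketched as a small fusion head. Region extraction and the feature backbone are placeholders, and the learnable softmax weights are one plausible reading of "combining by weighting".

```python
import torch
import torch.nn as nn

class WeightedRegionFusion(nn.Module):
    """Merge per-region features (e.g., eyes, nose, mouth) with the whole-face
    feature using learnable weights, then classify the expression."""
    def __init__(self, feat_dim, n_regions=3, n_classes=7):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(n_regions + 1))  # one weight per source
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, region_feats, face_feat):
        # region_feats: (B, n_regions, D); face_feat: (B, D)
        all_feats = torch.cat([region_feats, face_feat.unsqueeze(1)], dim=1)
        w = torch.softmax(self.weights, dim=0).view(1, -1, 1)   # normalized weights
        return self.classifier((w * all_feats).sum(dim=1))
```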
APA, Harvard, Vancouver, ISO, and other styles