To see the other types of publications on this topic, follow the link: Context Encoder.

Journal articles on the topic "Context Encoder"

Cite a source in APA, MLA, Chicago, Harvard, and other citation styles

Consult the top 50 journal articles for research on the topic "Context Encoder".

Next to every entry in the bibliography you will find the "Add to bibliography" option. Use it, and the bibliographic reference for the selected work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read its online abstract, provided the relevant parameters are available in the work's metadata.

Browse journal articles from a wide range of subject areas and compile your bibliography correctly.

1

Pinho, M. S., and W. A. Finamore. "Context-based LZW encoder". Electronics Letters 38, no. 20 (2002): 1172. http://dx.doi.org/10.1049/el:20020807.

2

Han, Jialong, Aixin Sun, Haisong Zhang, Chenliang Li, and Shuming Shi. "CASE: Context-Aware Semantic Expansion". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7871–78. http://dx.doi.org/10.1609/aaai.v34i05.6293.

Annotation:
In this paper, we define and study a new task called Context-Aware Semantic Expansion (CASE). Given a seed term in a sentential context, we aim to suggest other terms that well fit the context as the seed. CASE has many interesting applications such as query suggestion, computer-assisted writing, and word sense disambiguation, to name a few. Previous explorations, if any, only involve some similar tasks, and all require human annotations for evaluation. In this study, we demonstrate that annotations for this task can be harvested at scale from existing corpora, in a fully automatic manner. On a dataset of 1.8 million sentences thus derived, we propose a network architecture that encodes the context and seed term separately before suggesting alternative terms. The context encoder in this architecture can be easily extended by incorporating seed-aware attention. Our experiments demonstrate that competitive results are achieved with appropriate choices of context encoder and attention scoring function.
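A minimal PyTorch sketch of the separate-encoder design described in the abstract above (illustrative only, not the authors' code: the bidirectional GRU context encoder, the bilinear-style seed-aware scoring, and all layer sizes are assumptions):

    import torch
    import torch.nn as nn

    class ContextSeedScorer(nn.Module):
        """Encode the sentential context and the seed term separately,
        attend over the context with a seed-aware score, and rank candidate terms."""

        def __init__(self, vocab_size, emb_dim=128, hidden=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.context_enc = nn.GRU(emb_dim, hidden // 2, bidirectional=True, batch_first=True)
            self.seed_proj = nn.Linear(emb_dim, hidden)
            self.attn = nn.Linear(hidden, hidden, bias=False)  # bilinear-style scoring matrix
            self.out = nn.Linear(2 * hidden, vocab_size)

        def forward(self, context_ids, seed_ids):
            ctx, _ = self.context_enc(self.embed(context_ids))        # (B, T, H) context states
            seed = self.seed_proj(self.embed(seed_ids).mean(dim=1))   # (B, H) seed representation
            scores = torch.bmm(ctx, self.attn(seed).unsqueeze(2))     # (B, T, 1) seed-aware attention
            pooled = (torch.softmax(scores, dim=1) * ctx).sum(dim=1)  # (B, H) context summary
            return self.out(torch.cat([pooled, seed], dim=-1))        # logits over candidate terms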
3

Marafioti, Andres, Nathanael Perraudin, Nicki Holighaus, and Piotr Majdak. "A Context Encoder For Audio Inpainting". IEEE/ACM Transactions on Audio, Speech, and Language Processing 27, no. 12 (December 2019): 2362–72. http://dx.doi.org/10.1109/taslp.2019.2947232.

4

Yun, Hyeongu, Yongkeun Hwang, and Kyomin Jung. "Improving Context-Aware Neural Machine Translation Using Self-Attentive Sentence Embedding". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9498–506. http://dx.doi.org/10.1609/aaai.v34i05.6494.

Annotation:
Fully Attentional Networks (FAN) like Transformer (Vaswani et al. 2017) has shown superior results in Neural Machine Translation (NMT) tasks and has become a solid baseline for translation tasks. More recent studies also have reported experimental results that additional contextual sentences improve translation qualities of NMT models (Voita et al. 2018; Müller et al. 2018; Zhang et al. 2018). However, those studies have exploited multiple context sentences as a single long concatenated sentence, that may cause the models to suffer from inefficient computational complexities and long-range dependencies. In this paper, we propose Hierarchical Context Encoder (HCE) that is able to exploit multiple context sentences separately using the hierarchical FAN structure. Our proposed encoder first abstracts sentence-level information from preceding sentences in a self-attentive way, and then hierarchically encodes context-level information. Through extensive experiments, we observe that our HCE records the best performance measured in BLEU score on English-German, English-Turkish, and English-Korean corpus. In addition, we observe that our HCE records the best performance in a crowd-sourced test set which is designed to evaluate how well an encoder can exploit contextual information. Finally, evaluation on English-Korean pronoun resolution test suite also shows that our HCE can properly exploit contextual information.
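A rough sketch of the hierarchical encoding idea summarized above, i.e. sentence-level encoding followed by context-level encoding (an illustration assuming small Transformer encoders and mean pooling; this is not the authors' implementation):

    import torch
    import torch.nn as nn

    class HierarchicalContextEncoder(nn.Module):
        """Encode each context sentence on its own, then encode the sequence of sentence vectors."""

        def __init__(self, vocab_size, d_model=256, nhead=4, layers=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.sentence_enc = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers=layers)
            self.context_enc = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers=layers)

        def forward(self, context_ids):
            # context_ids: (batch, n_sentences, n_tokens) token ids of the preceding sentences
            b, n, t = context_ids.shape
            words = self.embed(context_ids).view(b * n, t, -1)
            sent_vecs = self.sentence_enc(words).mean(dim=1).view(b, n, -1)  # one vector per sentence
            return self.context_enc(sent_vecs)  # (batch, n_sentences, d_model) context memory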
5

Dakwale, Praveen, and Christof Monz. "Convolutional over Recurrent Encoder for Neural Machine Translation". Prague Bulletin of Mathematical Linguistics 108, no. 1 (June 1, 2017): 37–48. http://dx.doi.org/10.1515/pralin-2017-0007.

Annotation:
Neural machine translation is a recently proposed approach which has shown competitive results to traditional MT approaches. Standard neural MT is an end-to-end neural network where the source sentence is encoded by a recurrent neural network (RNN) called encoder and the target words are predicted using another RNN known as decoder. Recently, various models have been proposed which replace the RNN encoder with a convolutional neural network (CNN). In this paper, we propose to augment the standard RNN encoder in NMT with additional convolutional layers in order to capture wider context in the encoder output. Experiments on English to German translation demonstrate that our approach can achieve significant improvements over a standard RNN-based baseline.
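The augmentation described here, convolutional layers stacked on top of a recurrent encoder to widen its context, can be sketched as follows; the GRU, kernel size, and residual combination are assumptions made for illustration, not the paper's exact configuration:

    import torch
    import torch.nn as nn

    class ConvOverRecurrentEncoder(nn.Module):
        """Bidirectional RNN encoder whose outputs are refined by 1-D convolutions."""

        def __init__(self, vocab_size, emb_dim=256, hidden=512, kernel=3, n_conv=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.GRU(emb_dim, hidden // 2, bidirectional=True, batch_first=True)
            self.convs = nn.ModuleList(
                nn.Conv1d(hidden, hidden, kernel, padding=kernel // 2) for _ in range(n_conv))

        def forward(self, src_ids):
            h, _ = self.rnn(self.embed(src_ids))  # (B, T, H) recurrent states
            x = h.transpose(1, 2)                 # (B, H, T) layout expected by Conv1d
            for conv in self.convs:
                x = torch.relu(conv(x))           # each layer widens the receptive field
            return x.transpose(1, 2) + h          # combine convolutional and recurrent context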
6

Dligach, Dmitriy, Majid Afshar, and Timothy Miller. "Toward a clinical text encoder: pretraining for clinical natural language processing with applications to substance misuse". Journal of the American Medical Informatics Association 26, no. 11 (June 24, 2019): 1272–78. http://dx.doi.org/10.1093/jamia/ocz072.

Annotation:
Objective: Our objective is to develop algorithms for encoding clinical text into representations that can be used for a variety of phenotyping tasks. Materials and Methods: Obtaining large datasets to take advantage of highly expressive deep learning methods is difficult in clinical natural language processing (NLP). We address this difficulty by pretraining a clinical text encoder on billing code data, which is typically available in abundance. We explore several neural encoder architectures and deploy the text representations obtained from these encoders in the context of clinical text classification tasks. While our ultimate goal is learning a universal clinical text encoder, we also experiment with training a phenotype-specific encoder. A universal encoder would be more practical, but a phenotype-specific encoder could perform better for a specific task. Results: We successfully train several clinical text encoders, establish a new state-of-the-art on comorbidity data, and observe good performance gains on substance misuse data. Discussion: We find that pretraining using billing codes is a promising research direction. The representations generated by this type of pretraining have universal properties, as they are highly beneficial for many phenotyping tasks. Phenotype-specific pretraining is a viable route for trading the generality of the pretrained encoder for better performance on a specific phenotyping task. Conclusions: We successfully applied our approach to many phenotyping tasks. We conclude by discussing potential limitations of our approach.
7

Trisedya, Bayu, Jianzhong Qi, and Rui Zhang. "Sentence Generation for Entity Description with Content-Plan Attention". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9057–64. http://dx.doi.org/10.1609/aaai.v34i05.6439.

Annotation:
We study neural data-to-text generation. Specifically, we consider a target entity that is associated with a set of attributes. We aim to generate a sentence to describe the target entity. Previous studies use encoder-decoder frameworks where the encoder treats the input as a linear sequence and uses LSTM to encode the sequence. However, linearizing a set of attributes may not yield the proper order of the attributes, and hence leads the encoder to produce an improper context to generate a description. To handle disordered input, recent studies propose two-stage neural models that use pointer networks to generate a content-plan (i.e., content-planner) and use the content-plan as input for an encoder-decoder model (i.e., text generator). However, in two-stage models, the content-planner may yield an incomplete content-plan, due to missing one or more salient attributes in the generated content-plan. This will in turn cause the text generator to generate an incomplete description. To address these problems, we propose a novel attention model that exploits content-plan to highlight salient attributes in a proper order. The challenge of integrating a content-plan in the attention model of an encoder-decoder framework is to align the content-plan and the generated description. We handle this problem by devising a coverage mechanism to track the extent to which the content-plan is exposed in the previous decoding time-step, and hence it helps our proposed attention model select the attributes to be mentioned in the description in a proper order. Experimental results show that our model outperforms state-of-the-art baselines by up to 3% and 5% in terms of BLEU score on two real-world datasets, respectively.
8

Cai, Yuanyuan, Min Zuo, Qingchuan Zhang, Haitao Xiong, and Ke Li. "A Bichannel Transformer with Context Encoding for Document-Driven Conversation Generation in Social Media". Complexity 2020 (September 17, 2020): 1–13. http://dx.doi.org/10.1155/2020/3710104.

Annotation:
Along with the development of social media on the internet, dialogue systems are becoming more and more intelligent to meet users’ needs for communication, emotion, and social intercourse. Previous studies usually use sequence-to-sequence learning with recurrent neural networks for response generation. However, recurrent-based learning models heavily suffer from the problem of long-distance dependencies in sequences. Moreover, some models neglect crucial information in the dialogue contexts, which leads to uninformative and inflexible responses. To address these issues, we present a bichannel transformer with context encoding (BCTCE) for document-driven conversation. This conversational generator consists of a context encoder, an utterance encoder, and a decoder with attention mechanism. The encoders aim to learn the distributed representation of input texts. The multihop attention mechanism is used in BCTCE to capture the interaction between documents and dialogues. We evaluate the proposed BCTCE by both automatic evaluation and human judgment. The experimental results on the dataset CMU_DoG indicate that the proposed model yields significant improvements over the state-of-the-art baselines on most of the evaluation metrics, and the generated responses of BCTCE are more informative and more relevant to dialogues than baselines.
9

Zhang, Biao, Deyi Xiong, Jinsong Su, and Hong Duan. "A Context-Aware Recurrent Encoder for Neural Machine Translation". IEEE/ACM Transactions on Audio, Speech, and Language Processing 25, no. 12 (December 2017): 2424–32. http://dx.doi.org/10.1109/taslp.2017.2751420.

10

Pan, Yirong, Xiao Li, Yating Yang, and Rui Dong. "Multi-Source Neural Model for Machine Translation of Agglutinative Language". Future Internet 12, no. 6 (June 3, 2020): 96. http://dx.doi.org/10.3390/fi12060096.

Annotation:
Benefitting from the rapid development of artificial intelligence (AI) and deep learning, the machine translation task based on neural networks has achieved impressive performance in many high-resource language pairs. However, the neural machine translation (NMT) models still struggle in the translation task on agglutinative languages with complex morphology and limited resources. Inspired by the finding that utilizing the source-side linguistic knowledge can further improve the NMT performance, we propose a multi-source neural model that employs two separate encoders to encode the source word sequence and the linguistic feature sequences. Compared with the standard NMT model, we utilize an additional encoder to incorporate the linguistic features of lemma, part-of-speech (POS) tag, and morphological tag by extending the input embedding layer of the encoder. Moreover, we use a serial combination method to integrate the conditional information from the encoders with the outputs of the decoder, which aims to enhance the neural model to learn a high-quality context representation of the source sentence. Experimental results show that our approach is effective for the agglutinative language translation, which achieves the highest improvements of +2.4 BLEU points on Turkish–English translation task and +0.6 BLEU points on Uyghur–Chinese translation task.
11

He, Xiang, Sibei Yang, Guanbin Li, Haofeng Li, Huiyou Chang, and Yizhou Yu. "Non-Local Context Encoder: Robust Biomedical Image Segmentation against Adversarial Attacks". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 8417–24. http://dx.doi.org/10.1609/aaai.v33i01.33018417.

Annotation:
Recent progress in biomedical image segmentation based on deep convolutional neural networks (CNNs) has drawn much attention. However, its vulnerability towards adversarial samples cannot be overlooked. This paper is the first one that discovers that all the CNN-based state-of-the-art biomedical image segmentation models are sensitive to adversarial perturbations. This limits the deployment of these methods in safety-critical biomedical fields. In this paper, we discover that global spatial dependencies and global contextual information in a biomedical image can be exploited to defend against adversarial attacks. To this end, non-local context encoder (NLCE) is proposed to model short- and long-range spatial dependencies and encode global contexts for strengthening feature activations by channel-wise attention. The NLCE modules enhance the robustness and accuracy of the non-local context encoding network (NLCEN), which learns robust enhanced pyramid feature representations with NLCE modules, and then integrates the information across different levels. Experiments on both lung and skin lesion segmentation datasets have demonstrated that NLCEN outperforms any other state-of-the-art biomedical image segmentation methods against adversarial attacks. In addition, NLCE modules can be applied to improve the robustness of other CNN-based biomedical image segmentation methods.
12

Sediqi, Khwaja Monib, and Hyo Jong Lee. "A Novel Upsampling and Context Convolution for Image Semantic Segmentation". Sensors 21, no. 6 (March 20, 2021): 2170. http://dx.doi.org/10.3390/s21062170.

Annotation:
Semantic segmentation, which refers to pixel-wise classification of an image, is a fundamental topic in computer vision owing to its growing importance in the robot vision and autonomous driving sectors. It provides rich information about objects in the scene such as object boundary, category, and location. Recent methods for semantic segmentation often employ an encoder-decoder structure using deep convolutional neural networks. The encoder part extracts features of the image using several filters and pooling operations, whereas the decoder part gradually recovers the low-resolution feature maps of the encoder into a full input resolution feature map for pixel-wise prediction. However, the encoder-decoder variants for semantic segmentation suffer from severe spatial information loss, caused by pooling operations or stepwise convolutions, and does not consider the context in the scene. In this paper, we propose a novel dense upsampling convolution method based on a guided filter to effectively preserve the spatial information of the image in the network. We further propose a novel local context convolution method that not only covers larger-scale objects in the scene but covers them densely for precise object boundary delineation. Theoretical analyses and experimental results on several benchmark datasets verify the effectiveness of our method. Qualitatively, our approach delineates object boundaries at a level of accuracy that is beyond the current excellent methods. Quantitatively, we report a new record of 82.86% and 81.62% of pixel accuracy on ADE20K and Pascal-Context benchmark datasets, respectively. In comparison with the state-of-the-art methods, the proposed method offers promising improvements.
13

Gu, Zaiwang, Jun Cheng, Huazhu Fu, Kang Zhou, Huaying Hao, Yitian Zhao, Tianyang Zhang, Shenghua Gao, and Jiang Liu. "CE-Net: Context Encoder Network for 2D Medical Image Segmentation". IEEE Transactions on Medical Imaging 38, no. 10 (October 2019): 2281–92. http://dx.doi.org/10.1109/tmi.2019.2903562.

14

Wen, F., Y. Zhang, and B. Zhang. "GLOBAL CONTEXT AIDED SEMANTIC SEGMENTATION FOR CLOUD DETECTION OF REMOTE SENSING IMAGES". ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-2-2020 (August 3, 2020): 583–89. http://dx.doi.org/10.5194/isprs-annals-v-2-2020-583-2020.

Annotation:
Abstract. Cloud detection is a vital preprocessing step for remote sensing image applications, which has been widely studied through Convolutional Neural Networks (CNNs) in recent years. However, the available CNN-based works only extract local/non-local features by stacked convolution and pooling layers, ignoring global contextual information of the input scenes. In this paper, a novel segmentation-based network is proposed for cloud detection of remote sensing images. We add a multi-class classification branch to a U-shaped semantic segmentation network. Through the encoder-decoder architecture, pixelwise classification of cloud, shadow and landcover can be obtained. Besides, the multi-class classification branch is built on top of the encoder module to extract global context by identifying what classes exist in the input scene. Linear representation encoded global contextual information is learned in the added branch, which is to be combined with featuremaps of the decoder and can help to selectively strengthen class-related features or weaken class-unrelated features at different scales. The whole network is trained and tested in an end-to-end fashion. Experiments on two Landsat-8 cloud detection datasets show better performance than other deep learning methods, which finally achieves 90.82% overall accuracy and 0.6992 mIoU on the SPARCS dataset, demonstrating the effectiveness of the proposed framework for cloud detection in remote sensing images.
15

Cheng, Jinfeng, Weiqin Tong, and Weian Yan. "Capsule Network Improved Multi-Head Attention for Word Sense Disambiguation". Applied Sciences 11, no. 6 (March 10, 2021): 2488. http://dx.doi.org/10.3390/app11062488.

Annotation:
Word sense disambiguation (WSD) is one of the core problems in natural language processing (NLP), which is to map an ambiguous word to its correct meaning in a specific context. There has been a lively interest in incorporating sense definition (gloss) into neural networks in recent studies, which makes great contribution to improving the performance of WSD. However, disambiguating polysemes of rare senses is still hard. In this paper, while taking gloss into consideration, we further improve the performance of the WSD system from the perspective of semantic representation. We encode the context and sense glosses of the target polysemy independently using encoders with the same structure. To obtain a better presentation in each encoder, we leverage the capsule network to capture different important information contained in multi-head attention. We finally choose the gloss representation closest to the context representation of the target word as its correct sense. We do experiments on English all-words WSD task. Experimental results show that our method achieves good performance, especially having an inspiring effect on disambiguating words of rare senses.
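The final selection step mentioned above, choosing the gloss representation closest to the context representation, reduces to a nearest-neighbour lookup; cosine similarity is assumed here since the abstract does not name the distance measure:

    import torch
    import torch.nn.functional as F

    def pick_sense(context_vec: torch.Tensor, gloss_vecs: torch.Tensor) -> int:
        """Return the index of the sense whose gloss representation is closest to the context."""
        sims = F.cosine_similarity(gloss_vecs, context_vec.unsqueeze(0), dim=-1)  # (n_senses,)
        return int(sims.argmax())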
16

Su, Shaojing, Jing Zhou, Zhiping Huang, Chunwu Liu, and Yimeng Zhang. "Blind Identification of Convolutional Encoder Parameters". Scientific World Journal 2014 (2014): 1–9. http://dx.doi.org/10.1155/2014/798612.

Annotation:
This paper gives a solution to the blind parameter identification of a convolutional encoder. The problem can be addressed in the context of the noncooperative communications or adaptive coding and modulations (ACM) for cognitive radio networks. We consider an intelligent communication receiver which can blindly recognize the coding parameters of the received data stream. The only knowledge is that the stream is encoded using binary convolutional codes, while the coding parameters are unknown. Some previous literatures have significant contributions for the recognition of convolutional encoder parameters in hard-decision situations. However, soft-decision systems are applied more and more as the improvement of signal processing techniques. In this paper we propose a method to utilize the soft information to improve the recognition performances in soft-decision communication systems. Besides, we propose a new recognition method based on correlation attack to meet low signal-to-noise ratio situations. Finally we give the simulation results to show the efficiency of the proposed methods.
17

Sybrandt, Justin, and Ilya Safro. "CBAG: Conditional biomedical abstract generation". PLOS ONE 16, no. 7 (July 6, 2021): e0253905. http://dx.doi.org/10.1371/journal.pone.0253905.

Annotation:
Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene with an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed the problem of text generation in a more general context, applications, such as scientific writing assistants, or hypothesis generation systems, could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model following the transformer architecture. This model uses the “encoder stack” to encode concepts that a user wishes to discuss in the generated text. The “decoder stack” then follows the masked self-attention pattern to perform text generation, using both prior tokens as well as the encoded condition. We demonstrate that this approach provides significant control, while still producing reasonable biomedical text.
18

López-Granado, Otoniel Mario, Miguel Onofre Martínez-Rach, Antonio Martí-Campoy, Marco Antonio Cruz-Chávez, and Manuel Pérez Malumbres. "A General Model for the Design of Efficient Sign-Coding Tools for Wavelet-Based Encoders". Electronics 9, no. 11 (November 12, 2020): 1899. http://dx.doi.org/10.3390/electronics9111899.

Annotation:
Traditionally, it has been assumed that the compression of the sign of wavelet coefficients is not worth the effort because they form a zero-mean process. However, several image encoders such as JPEG 2000 include sign-coding capabilities. In this paper, we analyze the convenience of including sign-coding techniques into wavelet-based image encoders and propose a methodology that allows the design of sign-prediction tools for whatever kind of wavelet-based encoder. The proposed methodology is based on the use of metaheuristic algorithms to find the best sign prediction with the most appropriate context distribution that maximizes the resulting sign-compression rate of a particular wavelet encoder. Following our proposal, we have designed and implemented a sign-coding module for the LTW wavelet encoder, to evaluate the benefits of the sign-coding tool provided by our proposed methodology. The experimental results show that sign compression can save up to 18.91% of bit-rate when enabling sign-coding capabilities. Also, we have observed two general behaviors when coding the sign of wavelet coefficients: (a) the best results are provided from moderate to high compression rates; and (b) the sign redundancy may be better exploited when working with high-textured images.
19

Messaoudi, Mohamed, Majdi Benzarti, and Salem Hasnaoui. "4x4 Time-Domain MIMO encoder with OFDM Scheme in WIMAX Context". International Journal of Management Excellence 1, no. 1 (April 30, 2013): 01. http://dx.doi.org/10.17722/ijme.v1i1.2.

20

Li, Zhiqiang, Zhouzhong Zhang, and Hongchen Guo. "An Improved Image Inpainting Method Based on Feature Similarity Context Encoder". Journal of Physics: Conference Series 1069 (August 2018): 012181. http://dx.doi.org/10.1088/1742-6596/1069/1/012181.

21

Xiaohua Tian, T. M. Le, Xi Jiang, and Yong Lian. "Full RDO-Support Power-Aware CABAC Encoder With Efficient Context Access". IEEE Transactions on Circuits and Systems for Video Technology 19, no. 9 (September 2009): 1262–73. http://dx.doi.org/10.1109/tcsvt.2009.2020326.

22

Yun, Hyeongu, Yongil Kim, Taegwan Kang, and Kyomin Jung. "Pairwise Context Similarity for Image Retrieval System Using Variational Auto-Encoder". IEEE Access 9 (2021): 34067–77. http://dx.doi.org/10.1109/access.2021.3061765.

23

Deepthi, Godavarthi, and A. Mary Sowjanya. "Query-Based Retrieval Using Universal Sentence Encoder". Revue d'Intelligence Artificielle 35, no. 4 (August 31, 2021): 301–6. http://dx.doi.org/10.18280/ria.350404.

Annotation:
In Natural language processing, various tasks can be implemented with the features provided by word embeddings. But for obtaining embeddings for larger chunks like sentences, the efforts applied through word embeddings will not be sufficient. To resolve such issues sentence embeddings can be used. In sentence embeddings, complete sentences along with their semantic information are represented as vectors so that the machine finds it easy to understand the context. In this paper, we propose a Question Answering System (QAS) based on sentence embeddings. Our goal is to obtain the text from the provided context for a user-query by extracting the sentence in which the correct answer is present. Traditionally, infersent models have been used on SQUAD for building QAS. In recent times, Universal Sentence Encoder with USECNN and USETrans have been developed. In this paper, we have used another variant of the Universal sentence encoder, i.e. Deep averaging network in order to obtain pre-trained sentence embeddings. The results on the SQUAD-2.0 dataset indicate our approach (USE with DAN) performs well compared to Facebook’s infersent embedding.
24

Wen, Ying, Kai Xie, and Lianghua He. "Segmenting Medical MRI via Recurrent Decoding Cell". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 12452–59. http://dx.doi.org/10.1609/aaai.v34i07.6932.

Annotation:
The encoder-decoder networks are commonly used in medical image segmentation due to their remarkable performance in hierarchical feature fusion. However, the expanding path for feature decoding and spatial recovery does not consider the long-term dependency when fusing feature maps from different layers, and the universal encoder-decoder network does not make full use of the multi-modality information to improve the network robustness especially for segmenting medical MRI. In this paper, we propose a novel feature fusion unit called Recurrent Decoding Cell (RDC) which leverages convolutional RNNs to memorize the long-term context information from the previous layers in the decoding phase. An encoder-decoder network, named Convolutional Recurrent Decoding Network (CRDN), is also proposed based on RDC for segmenting multi-modality medical MRI. CRDN adopts CNN backbone to encode image features and decode them hierarchically through a chain of RDCs to obtain the final high-resolution score map. The evaluation experiments on BrainWeb, MRBrainS and HVSMR datasets demonstrate that the introduction of RDC effectively improves the segmentation accuracy as well as reduces the model size, and the proposed CRDN owns its robustness to image noise and intensity non-uniformity in medical MRI.
25

Lei, S. F., C. C. Lo, C. C. Kuo, and M. D. Shieh. "Low-power context-based adaptive binary arithmetic encoder using an embedded cache". IET Image Processing 6, no. 4 (2012): 309. http://dx.doi.org/10.1049/iet-ipr.2010.0473.

26

Yang, Libin, Zeqing Zhang, Xiaoyan Cai, and Tao Dai. "Attention-Based Personalized Encoder-Decoder Model for Local Citation Recommendation". Computational Intelligence and Neuroscience 2019 (June 3, 2019): 1–7. http://dx.doi.org/10.1155/2019/1232581.

Annotation:
With a tremendous growth in the number of scientific papers, researchers have to spend too much time and struggle to find the appropriate papers they are looking for. Local citation recommendation that provides a list of references based on a text segment could alleviate the problem. Most existing local citation recommendation approaches concentrate on how to narrow the semantic difference between the scientific papers’ and citation context’s text content, completely neglecting other information. Inspired by the successful use of the encoder-decoder framework in machine translation, we develop an attention-based encoder-decoder (AED) model for local citation recommendation. The proposed AED model integrates venue information and author information in attention mechanism and learns relations between variable-length texts of the two text objects, i.e., citation contexts and scientific papers. Specifically, we first construct an encoder to represent a citation context as a vector in a low-dimensional space; after that, we construct an attention mechanism integrating venue information and author information and use RNN to construct a decoder, then we map the decoder’s output into a softmax layer, and score the scientific papers. Finally, we select papers which have high scores and generate a recommended reference paper list. We conduct experiments on the DBLP and ACL Anthology Network (AAN) datasets, and the results illustrate that the performance of the proposed approach is better than the other three state-of-the-art approaches.
27

Yang, Zhenjian, Jiamei Shang, Zhongwei Zhang, Yan Zhang, and Shudong Liu. "A new end-to-end image dehazing algorithm based on residual attention mechanism". Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 39, no. 4 (August 2021): 901–8. http://dx.doi.org/10.1051/jnwpu/20213940901.

Annotation:
Traditional image dehazing algorithms based on prior knowledge and deep learning rely on the atmospheric scattering model and are easy to cause color distortion and incomplete dehazing. To solve these problems, an end-to-end image dehazing algorithm based on residual attention mechanism is proposed in this paper. The network includes four modules: encoder, multi-scale feature extraction, feature fusion and decoder. The encoder module encodes the input haze image into feature map, which is convenient for subsequent feature extraction and reduces memory consumption; the multi-scale feature extraction module includes residual smoothed dilated convolution module, residual block and efficient channel attention, which can expand the receptive field and extract different scale features by filtering and weighting; the feature fusion module with efficient channel attention adjusts the channel weight dynamically, acquires rich context information and suppresses redundant information so as to enhance the ability to extract haze density image of the network; finally, the encoder module maps the fused feature nonlinearly to obtain the haze density image and then restores the haze free image. The qualitative and quantitative tests based on SOTS test set and natural haze images show good objective and subjective evaluation results. This algorithm improves the problems of color distortion and incomplete dehazing effectively.
28

Girbau, Dolors, and Humbert Boada. "Accurate Referential Communication and its Relation with Private and Social Speech in a Naturalistic Context". Spanish Journal of Psychology 7, no. 2 (November 2004): 81–92. http://dx.doi.org/10.1017/s1138741600004789.

Annotation:
Research into human communication has been grouped under two traditions: referential and sociolinguistic. The study of a communication behavior simultaneously from both paradigms appears to be absent. Basically, this paper analyzes the use of private and social speech, through both a referential task (Word Pairs) and a naturalistic dyadic setting (Lego-set) administered to a sample of 64 children from grades 3 and 5. All children, of 8 and 10 years of age, used speech that was not adapted to the decoder, and thus ineffective for interpersonal communication, in both referential and sociolinguistic communication. Pairs of high-skill referential encoders used significantly more task-relevant social speech, that is, cognitively more complex, than did low-skill dyads in the naturalistic context. High-skill referential encoder dyads showed a trend to produce more inaudible private speech than did low-skill ones during spontaneous communication. Gender did not affect the results.
29

Varade, Saurabh, Ejaaz Sayyed, Vaibhavi Nagtode, and Shilpa Shinde. "Text Summarization using Extractive and Abstractive Methods". ITM Web of Conferences 40 (2021): 03023. http://dx.doi.org/10.1051/itmconf/20214003023.

Annotation:
Text summarization is a process in which a large text file is converted into a summarized version that preserves the original meaning and context. The main aim of any text summarization is to provide an accurate and precise summary. One approach is to use a sentence ranking algorithm; this comes under extractive summarization. Here, a graph-based ranking algorithm is used to rank the sentences in the text, and the top k-scored sentences are included in the summary. A graph-based ranking algorithm decides the importance of any vertex in a graph based on the information retrieved from the graph; TextRank is one of the most efficient ranking algorithms of this kind and is also used for Web link analysis, that is, for measuring the importance of website pages. Another approach is abstractive summarization, where an LSTM encoder-decoder model is used along with an attention mechanism that focuses on important words from the input. The encoder encodes the input sequence, and the decoder, together with the attention mechanism, produces the summary as the output.
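The extractive, graph-based ranking step described in this abstract can be sketched roughly as follows (a TextRank-style ranking over a TF-IDF cosine-similarity graph; the library choices and the similarity measure are assumptions, not the authors' code):

    import networkx as nx
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def textrank_summary(sentences, k=3):
        """Score sentences with PageRank on a similarity graph and keep the top k in document order."""
        tfidf = TfidfVectorizer().fit_transform(sentences)
        graph = nx.from_numpy_array(cosine_similarity(tfidf))
        scores = nx.pagerank(graph)
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        return [sentences[i] for i in sorted(top)]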
30

Liu, Hai, Yuanxia Liu, Leung-Pun Wong, Lap-Kei Lee, and Tianyong Hao. "A Hybrid Neural Network BERT-Cap Based on Pre-Trained Language Model and Capsule Network for User Intent Classification". Complexity 2020 (November 21, 2020): 1–11. http://dx.doi.org/10.1155/2020/8858852.

Annotation:
User intent classification is a vital component of a question-answering system or a task-based dialogue system. In order to understand the goals of users’ questions or discourses, the system categorizes user text into a set of pre-defined user intent categories. User questions or discourses are usually short in length and lack sufficient context; thus, it is difficult to extract deep semantic information from these types of text and the accuracy of user intent classification may be affected. To better identify user intents, this paper proposes a BERT-Cap hybrid neural network model with focal loss for user intent classification to capture user intents in dialogue. The model uses multiple transformer encoder blocks to encode user utterances and initializes encoder parameters with a pre-trained BERT. Then, it extracts essential features using a capsule network with dynamic routing after utterances encoding. Experiment results on four publicly available datasets show that our model BERT-Cap achieves a F1 score of 0.967 and an accuracy of 0.967, outperforming a number of baseline methods, indicating its effectiveness in user intent classification.
31

Fan, Zhun, Chong Li, Ying Chen, Jiahong Wei, Giuseppe Loprencipe, Xiaopeng Chen, and Paola Di Mascio. "Automatic Crack Detection on Road Pavements Using Encoder-Decoder Architecture". Materials 13, no. 13 (July 2, 2020): 2960. http://dx.doi.org/10.3390/ma13132960.

Annotation:
Automatic crack detection from images is an important task that is adopted to ensure road safety and durability for Portland cement concrete (PCC) and asphalt concrete (AC) pavement. Pavement failure depends on a number of causes including water intrusion, stress from heavy loads, and all the climate effects. Generally, cracks are the first distress that arises on road surfaces and proper monitoring and maintenance to prevent cracks from spreading or forming is important. Conventional algorithms to identify cracks on road pavements are extremely time-consuming and high cost. Many cracks show complicated topological structures, oil stains, poor continuity, and low contrast, which are difficult for defining crack features. Therefore, the automated crack detection algorithm is a key tool to improve the results. Inspired by the development of deep learning in computer vision and object detection, the proposed algorithm considers an encoder-decoder architecture with hierarchical feature learning and dilated convolution, named U-Hierarchical Dilated Network (U-HDN), to perform crack detection in an end-to-end method. Crack characteristics with multiple context information are automatically able to learn and perform end-to-end crack detection. Then, a multi-dilation module embedded in an encoder-decoder architecture is proposed. The crack features of multiple context sizes can be integrated into the multi-dilation module by dilation convolution with different dilatation rates, which can obtain much more cracks information. Finally, the hierarchical feature learning module is designed to obtain a multi-scale features from the high to low- level convolutional layers, which are integrated to predict pixel-wise crack detection. Some experiments on public crack databases using 118 images were performed and the results were compared with those obtained with other methods on the same images. The results show that the proposed U-HDN method achieves high performance because it can extract and fuse different context sizes and different levels of feature maps than other algorithms.
32

Luo, Junyu, Min Yang, Ying Shen, Qiang Qu, and Haixia Chai. "Learning Document Embeddings with Crossword Prediction". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9993–94. http://dx.doi.org/10.1609/aaai.v33i01.33019993.

Annotation:
In this paper, we propose a Document Embedding Network (DEN) to learn document embeddings in an unsupervised manner. Our model uses the encoder-decoder architecture as its backbone, which tries to reconstruct the input document from an encoded document embedding. Unlike the standard decoder for text reconstruction, we randomly block some words in the input document, and use the incomplete context information and the encoded document embedding to predict the blocked words in the document, inspired by the crossword game. Thus, our decoder can keep the balance between the known and unknown information, and consider both global and partial information when decoding the missing words. We evaluate the learned document embeddings on two tasks: document classification and document retrieval. The experimental results show that our model substantially outperforms the compared methods.
33

Wang, Shuyang, Xiaodong Mu, Dongfang Yang, Hao He, and Peng Zhao. "Attention Guided Encoder-Decoder Network With Multi-Scale Context Aggregation for Land Cover Segmentation". IEEE Access 8 (2020): 215299–309. http://dx.doi.org/10.1109/access.2020.3040862.

34

Dong, Yuying, Liejun Wang, Shuli Cheng, and Yongming Li. "FAC-Net: Feedback Attention Network Based on Context Encoder Network for Skin Lesion Segmentation". Sensors 21, no. 15 (July 30, 2021): 5172. http://dx.doi.org/10.3390/s21155172.

Annotation:
Considerable research and surveys indicate that skin lesions are an early symptom of skin cancer. Segmentation of skin lesions is still a hot research topic. Dermatological datasets in skin lesion segmentation tasks generated a large number of parameters when data augmented, limiting the application of smart assisted medicine in real life. Hence, this paper proposes an effective feedback attention network (FAC-Net). The network is equipped with the feedback fusion block (FFB) and the attention mechanism block (AMB), through the combination of these two modules, we can obtain richer and more specific feature mapping without data enhancement. Numerous experimental tests were given by us on public datasets (ISIC2018, ISBI2017, ISBI2016), and a good deal of metrics like the Jaccard index (JA) and Dice coefficient (DC) were used to evaluate the results of segmentation. On the ISIC2018 dataset, we obtained results for DC equal to 91.19% and JA equal to 83.99%, compared with the based network. The results of these two main metrics were improved by more than 1%. In addition, the metrics were also improved in the other two datasets. It can be demonstrated through experiments that without any enhancements of the datasets, our lightweight model can achieve better segmentation performance than most deep learning architectures.
35

Chen, Songle, Xuejian Zhao, Bingqing Luo, and Zhixin Sun. "Visual Browse and Exploration in Motion Capture Data with Phylogenetic Tree of Context-Aware Poses". Sensors 20, no. 18 (September 13, 2020): 5224. http://dx.doi.org/10.3390/s20185224.

Annotation:
Visual browse and exploration in motion capture data take resource acquisition as a human–computer interaction problem, and it is an essential approach for target motion search. This paper presents a progressive schema which starts from pose browse, then locates the interesting region and then switches to online relevant motion exploration. It mainly addresses three core issues. First, to alleviate the contradiction between the limited visual space and ever-increasing size of real-world database, it applies affinity propagation to numerical similarity measure of pose to perform data abstraction and obtains representative poses of clusters. Second, to construct a meaningful neighborhood for user browsing, it further merges logical similarity measures of pose with the weight quartets and casts the isolated representative poses into a structure of phylogenetic tree. Third, to support online motion exploration including motion ranking and clustering, a biLSTM-based auto-encoder is proposed to encode the high-dimensional pose context into compact latent space. Experimental results on CMU’s motion capture data verify the effectiveness of the proposed method.
36

Ai, Xinbo, Yunhao Xie, Yinan He, and Yi Zhou. "Improve SegNet with feature pyramid for road scene parsing". E3S Web of Conferences 260 (2021): 03012. http://dx.doi.org/10.1051/e3sconf/202126003012.

Annotation:
Road scene parsing is a common task in semantic segmentation. Its images have characteristics of containing complex scene context and differing greatly among targets of the same category from different scales. To address these problems, we propose a semantic segmentation model combined with edge detection. We extend the segmentation network with an encoder-decoder structure by adding an edge feature pyramid module, namely Edge Feature Pyramid Network (EFPNet, for short). This module uses edge detection operators to get boundary information and then combines the multiscale features to improve the ability to recognize small targets. EFPNet can make up the shortcomings of convolutional neural network features, and it helps to produce smooth segmentation. After extracting features of the encoder and decoder, EFPNet uses Euclidean distance to compare the similarity between the presentation of the encoder and the decoder, which can increase the decoder’s ability to restore from the encoder. We evaluated the proposed method on Cityscapes datasets. The experiment on Cityscapes datasets demonstrates that the accuracies are improved by 7.5% and 6.2% over the popular SegNet and ENet. And the ablation experiment validates the effectiveness of our method.
37

Zhou, Zexun, Zhongshi He, Yuanyuan Jia, Jinglong Du, Lulu Wang, and Ziyu Chen. "Context prior-based with residual learning for face detection: A deep convolutional encoder–decoder network". Signal Processing: Image Communication 88 (October 2020): 115948. http://dx.doi.org/10.1016/j.image.2020.115948.

38

Sriraam, N. "A High-Performance Lossless Compression Scheme for EEG Signals Using Wavelet Transform and Neural Network Predictors". International Journal of Telemedicine and Applications 2012 (2012): 1–8. http://dx.doi.org/10.1155/2012/302581.

Annotation:
Developments of new classes of efficient compression algorithms, software systems, and hardware for data intensive applications in today's digital health care systems provide timely and meaningful solutions in response to exponentially growing patient information data complexity and associated analysis requirements. Of the different 1D medical signals, electroencephalography (EEG) data is of great importance to the neurologist for detecting brain-related disorders. The volume of digitized EEG data generated and preserved for future reference exceeds the capacity of recent developments in digital storage and communication media and hence there is a need for an efficient compression system. This paper presents a new and efficient high performance lossless EEG compression using wavelet transform and neural network predictors. The coefficients generated from the EEG signal by integer wavelet transform are used to train the neural network predictors. The error residues are further encoded using a combinational entropy encoder, Lempel-Ziv-arithmetic encoder. Also a new context-based error modeling is also investigated to improve the compression efficiency. A compression ratio of 2.99 (with compression efficiency of 67%) is achieved with the proposed scheme with less encoding time thereby providing diagnostic reliability for lossless transmission as well as recovery of EEG signals for telemedicine applications.
39

Tackenberg, Michael C., and Douglas G. McMahon. "Photoperiodic Programming of the SCN and Its Role in Photoperiodic Output". Neural Plasticity 2018 (2018): 1–9. http://dx.doi.org/10.1155/2018/8217345.

Annotation:
Though the seasonal response of organisms to changing day lengths is a phenomenon that has been scientifically reported for nearly a century, significant questions remain about how photoperiod is encoded and effected neurobiologically. In mammals, early work identified the master circadian clock, the suprachiasmatic nuclei (SCN), as a tentative encoder of photoperiodic information. Here, we provide an overview of research on the SCN as a coordinator of photoperiodic responses, the intercellular coupling changes that accompany that coordination, as well as the SCN’s role in a putative brain network controlling photoperiodic input and output. Lastly, we discuss the importance of photoperiodic research in the context of tangible benefits to human health that have been realized through this research as well as challenges that remain.
40

Guo, Hui, and Yong Qing Fu. "An Improved CAVLC Entropy Encoder of H.264/AVC and FPGA Implementation". Key Engineering Materials 474-476 (April 2011): 241–46. http://dx.doi.org/10.4028/www.scientific.net/kem.474-476.241.

Annotation:
Context-based Adaptive Variable Length Coding (CAVLC) as a new entropy coding algorithm has been introduced into H.264/AVC standard. Through analysing the CAVLC coding algorithm detailedly, the paper proposes an overlapping coverage storage method and a new stream merger method, and gives the specific implementation. This idea improves the structure and performance of the complex module and reduces the implementation complexity. The experimental results show that the proposed entropy encoder is correct, and the highest coding frequency of 81.70MHz can be achieved. Meanwhile all the hardware resource consumption is less than 2% of total hardware resources. The new entropy encoder achieves a better balance in system performance and resource consumption.
41

Jin, Shih-Chun, Chia-Jui Hsieh, Jyh-Cheng Chen, Shih-Huan Tu, Ya-Chen Chen, Tzu-Chien Hsiao, Angela Liu, Wen-Hsiang Chou, Woei-Chyn Chu, and Chih-Wei Kuo. "Development of Limited-Angle Iterative Reconstruction Algorithms with Context Encoder-Based Sinogram Completion for Micro-CT Applications". Sensors 18, no. 12 (December 16, 2018): 4458. http://dx.doi.org/10.3390/s18124458.

Annotation:
Limited-angle iterative reconstruction (LAIR) reduces the radiation dose required for computed tomography (CT) imaging by decreasing the range of the projection angle. We developed an image-quality-based stopping-criteria method with a flexible and innovative instrument design that, when combined with LAIR, provides the image quality of a conventional CT system. This study describes the construction of different scan acquisition protocols for micro-CT system applications. Fully-sampled Feldkamp (FDK)-reconstructed images were used as references for comparison to assess the image quality produced by these tested protocols. The insufficient portions of a sinogram were inpainted by applying a context encoder (CE), a type of generative adversarial network, to the LAIR process. The context image was passed through an encoder to identify features that were connected to the decoder using a channel-wise fully-connected layer. Our results evidence the excellent performance of this novel approach. Even when we reduce the radiation dose by 1/4, the iterative-based LAIR improved the full-width half-maximum, contrast-to-noise and signal-to-noise ratios by 20% to 40% compared to a fully-sampled FDK-based reconstruction. Our data support that this CE-based sinogram completion method enhances the efficacy and efficiency of LAIR and that would allow feasibility of limited angle reconstruction.
42

Wu, Yu, Furu Wei, Shaohan Huang, Yunli Wang, Zhoujun Li, and Ming Zhou. "Response Generation by Context-Aware Prototype Editing". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 7281–88. http://dx.doi.org/10.1609/aaai.v33i01.33017281.

Annotation:
Open domain response generation has achieved remarkable progress in recent years, but sometimes yields short and uninformative responses. We propose a new paradigm, prototype-then-edit, for response generation, that first retrieves a prototype response from a pre-defined index and then edits the prototype response according to the differences between the prototype context and current context. Our motivation is that the retrieved prototype provides a good start-point for generation because it is grammatical and informative, and the post-editing process further improves the relevance and coherence of the prototype. In practice, we design a context-aware editing model that is built upon an encoder-decoder framework augmented with an editing vector. We first generate an edit vector by considering lexical differences between a prototype context and current context. After that, the edit vector and the prototype response representation are fed to a decoder to generate a new response. Experiment results on a large scale dataset demonstrate that our new paradigm significantly increases the relevance, diversity and originality of generation results, compared to traditional generative models. Furthermore, our model outperforms retrieval-based methods in terms of relevance and originality.
43

Liang, Wenkai, Yan Wu, Ming Li, and Yice Cao. "High-Resolution SAR Image Classification Using Context-Aware Encoder Network and Hybrid Conditional Random Field Model". IEEE Transactions on Geoscience and Remote Sensing 58, no. 8 (August 2020): 5317–35. http://dx.doi.org/10.1109/tgrs.2019.2963699.

44

Licciardo, G. D., and L. Freda Albanese. "Design of a context-adaptive variable length encoder for real-time video compression on reconfigurable platforms". IET Image Processing 6, no. 4 (2012): 301. http://dx.doi.org/10.1049/iet-ipr.2010.0510.

45

Hwang, Yongkeun, Yanghoon Kim, and Kyomin Jung. "Context-Aware Neural Machine Translation for Korean Honorific Expressions". Electronics 10, no. 13 (June 30, 2021): 1589. http://dx.doi.org/10.3390/electronics10131589.

Annotation:
Neural machine translation (NMT) is one of the text generation tasks which has achieved significant improvement with the rise of deep neural networks. However, language-specific problems such as handling the translation of honorifics received little attention. In this paper, we propose a context-aware NMT to promote translation improvements of Korean honorifics. By exploiting the information such as the relationship between speakers from the surrounding sentences, our proposed model effectively manages the use of honorific expressions. Specifically, we utilize a novel encoder architecture that can represent the contextual information of the given input sentences. Furthermore, a context-aware post-editing (CAPE) technique is adopted to refine a set of inconsistent sentence-level honorific translations. To demonstrate the efficacy of the proposed method, honorific-labeled test data is required. Thus, we also design a heuristic that labels Korean sentences to distinguish between honorific and non-honorific styles. Experimental results show that our proposed method outperforms sentence-level NMT baselines both in overall translation quality and honorific translations.
APA, Harvard, Vancouver, ISO and other citation styles
46

Chen, Yunfan, and Hyunchul Shin. "Pedestrian Detection at Night in Infrared Images Using an Attention-Guided Encoder-Decoder Convolutional Neural Network". Applied Sciences 10, no. 3 (January 23, 2020): 809. http://dx.doi.org/10.3390/app10030809.

Full text of the source
Annotation:
Pedestrian-related accidents are much more likely to occur at night, when visible-light (VI) cameras are much less effective. Unlike VI cameras, infrared (IR) cameras can work in total darkness. However, IR images have several drawbacks, such as low resolution, noise, and thermal-energy characteristics that can differ depending on the weather. To overcome these drawbacks, we propose an IR camera system to identify pedestrians at night that uses a novel attention-guided encoder-decoder convolutional neural network (AED-CNN). In AED-CNN, encoder-decoder modules are introduced to generate multi-scale features, and new skip-connection blocks are incorporated into the decoder to combine the feature maps from the encoder and decoder modules. This architecture increases context information, which helps extract discriminative features from low-resolution and noisy IR images. Furthermore, we propose an attention module to re-weight the multi-scale features generated by the encoder-decoder module. The attention mechanism effectively highlights pedestrians while eliminating background interference, which helps to detect pedestrians under various weather conditions. Experiments on two challenging datasets demonstrate that our method shows superior performance: it improves the precision of the state-of-the-art method by 5.1% and 23.78% on the Keimyung University (KMU) and Computer Vision Center (CVC)-09 pedestrian datasets, respectively.
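The following is a minimal PyTorch sketch of the general pattern described here (encoder-decoder features fused through a skip connection, then re-weighted per pixel by an attention module); the layer sizes and two-scale setup are hypothetical and far smaller than the authors' AED-CNN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAED(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(nn.Conv2d(ch * 2, ch, 3, padding=1), nn.ReLU())
        # Skip-connection block: fuse encoder and decoder feature maps.
        self.fuse = nn.Conv2d(ch * 2, ch, 1)
        # Attention module: one per-pixel weight map per scale (two scales here).
        self.att = nn.Conv2d(ch, 2, 1)

    def forward(self, x):                        # x: (B, 1, H, W) IR image
        e1 = self.enc1(x)                         # full-resolution features
        e2 = self.enc2(e1)                        # half-resolution features
        d = F.interpolate(self.dec(e2), size=e1.shape[-2:],
                          mode="bilinear", align_corners=False)  # decode, upsample
        fused = self.fuse(torch.cat([e1, d], dim=1))   # skip connection
        w = torch.softmax(self.att(fused), dim=1)      # per-pixel scale weights
        # Re-weight the two scales with the attention maps and sum them.
        return w[:, :1] * e1 + w[:, 1:] * d

out = TinyAED()(torch.randn(2, 1, 64, 64))
print(out.shape)  # torch.Size([2, 16, 64, 64])
```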
APA, Harvard, Vancouver, ISO and other citation styles
47

Liu, Hongtao, Wenjun Wang, Qiyao Peng, Nannan Wu, Fangzhao Wu and Pengfei Jiao. "Toward Comprehensive User and Item Representations via Three-tier Attention Network". ACM Transactions on Information Systems 39, no. 3 (February 23, 2021): 1–22. http://dx.doi.org/10.1145/3446341.

Full text of the source
Annotation:
Product reviews can provide rich information about the opinions users have of products. However, it is nontrivial to effectively infer user preferences and item characteristics from reviews, owing to the complexity of semantic understanding. Existing methods usually learn features for users and items from reviews in a single, static fashion and cannot fully capture user preferences and item features. In this article, we propose a neural review-based recommendation approach that aims to learn comprehensive representations of users and items under a three-tier attention framework. We design a review encoder that learns review features from words via word-level attention, an aspect encoder that learns aspect features via review-level attention, and a user/item encoder that learns the final representations of users and items via aspect-level attention. In the word- and review-level attentions, we adopt a context-aware mechanism that weights words and reviews dynamically instead of using static attention weights. In addition, the attentions at the word and review levels use multiple paradigms to learn multiple features effectively, reflecting the diversity of user and item features. Furthermore, we propose a personalized aspect-level attention module in the user/item encoder to learn the final comprehensive features. Extensive experiments on rating prediction validate the effectiveness of our method.
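A minimal sketch of the three-tier idea, using plain NumPy and a single shared query vector for all tiers; the paper instead uses learned, personalized, and context-aware attention at each level, so this is only a structural illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def attention_pool(items, query):
    """Soft attention pooling: weight each item vector by its match to the query."""
    scores = items @ query                      # (n,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ items                      # weighted sum -> single vector

d = 8
user_query = rng.normal(size=d)                          # hypothetical query vector
reviews = [rng.normal(size=(5, d)) for _ in range(3)]    # 3 reviews x 5 word vectors

# Tier 1: word-level attention inside each review -> one vector per review.
review_vecs = np.stack([attention_pool(words, user_query) for words in reviews])
# Tier 2: review-level attention over the review vectors -> aspect vector.
aspect_vec = attention_pool(review_vecs, user_query)
# Tier 3: aspect-level attention (here over a single aspect) -> user representation.
user_vec = attention_pool(aspect_vec[None, :], user_query)
print(user_vec.shape)  # (8,)
```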
APA, Harvard, Vancouver, ISO and other citation styles
48

Xing, Yongfeng, Luo Zhong and Xian Zhong. "An Encoder-Decoder Network Based FCN Architecture for Semantic Segmentation". Wireless Communications and Mobile Computing 2020 (July 7, 2020): 1–9. http://dx.doi.org/10.1155/2020/8861886.

Full text of the source
Annotation:
In recent years, convolutional neural networks (CNNs) have made remarkable achievements in semantic segmentation, a method with promising applications. Most current methods use an encoder-decoder architecture to generate pixel-by-pixel segmentation predictions: the encoder extracts feature maps, and the decoder recovers feature-map resolution. We propose an improved semantic segmentation method based on the encoder-decoder architecture that obtains better segmentation accuracy on several hard classes and significantly reduces computational complexity by modifying the backbone and applying several refinement techniques. The framework achieves good performance on many datasets. In comparison with the traditional architecture, ours does not need an additional decoding layer and further reuses the encoder weights, thus reducing the total number of parameters needed for processing. We also put forward a modified focal loss function as a replacement for the cross-entropy loss, to better handle the class-imbalance problem in the training data. In addition, more context information is added to the decoder module to improve the segmentation results. Experiments show that the presented method obtains better segmentation results. As an integral part of a smart city, multimedia information plays an important role, and semantic segmentation is an important basic technology for building a smart city.
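The abstract does not give the exact form of the modified focal loss, so the sketch below shows the standard binary focal loss that it presumably builds on; the alpha and gamma values are the usual defaults, not necessarily the paper's settings.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss per pixel: down-weights easy, well-classified examples.
    p: predicted foreground probabilities, y: binary ground-truth labels."""
    p_t = np.where(y == 1, p, 1.0 - p)               # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -(alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + 1e-8)).mean()

p = np.array([0.9, 0.2, 0.7, 0.05])
y = np.array([1, 0, 1, 0])
print(focal_loss(p, y))        # small: all pixels are fairly well classified
print(focal_loss(p, 1 - y))    # large: every pixel is misclassified
```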
APA, Harvard, Vancouver, ISO and other citation styles
49

Markovnikov, Nikita, and Irina Kipyatkova. "Encoder-decoder models for recognition of Russian speech". Information and Control Systems, no. 4 (October 4, 2019): 45–53. http://dx.doi.org/10.31799/1684-8853-2019-4-45-53.

Full text of the source
Annotation:
Problem: Classical systems of automatic speech recognition are traditionally built using an acoustic model based on hidden Markov models and a statistical language model. Such systems demonstrate high recognition accuracy but consist of several independent, complex parts, which can cause problems when building models. Recently, an end-to-end recognition method using deep artificial neural networks has spread. This approach makes it easy to implement models using just one neural network, and end-to-end models often demonstrate better performance in terms of speed and accuracy of speech recognition. Purpose: Implementation of end-to-end models for the recognition of continuous Russian speech, their adjustment, and their comparison with hybrid baseline models in terms of recognition accuracy and computational characteristics, such as the speed of learning and decoding. Methods: Creating an encoder-decoder speech recognition model using an attention mechanism; applying techniques for stabilization and regularization of neural networks; augmenting the training data; using parts of words as the output of the neural network. Results: We obtained an encoder-decoder model with an attention mechanism for recognizing continuous Russian speech without extracting features or using a language model. As elements of the output sequence, we used parts of words from the training set. The resulting model could not surpass the basic hybrid models, but it surpassed the other baseline end-to-end models in both recognition accuracy and decoding/learning speed. The word recognition error was 24.17% and the decoding speed was 0.3 of real time, which is 6% faster than the baseline end-to-end model and 46% faster than the basic hybrid model. We showed that end-to-end models can work without language models for the Russian language while demonstrating a higher decoding speed than hybrid models. The resulting model was trained on raw data without extracting any features. We found that for the Russian language the hybrid type of attention mechanism gives the best result compared to location-based or context-based attention mechanisms. Practical relevance: The resulting models require less memory and less speech decoding time than traditional hybrid models, which can allow them to be used locally on mobile devices without relying on calculations on remote servers.
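Since the hybrid attention mechanism is the key ingredient reported here, the sketch below shows one common formulation of hybrid (content plus location) attention scoring, in which the alignment depends on the encoder features, the decoder state, and a convolution of the previous attention weights; the weight shapes and random initialization are purely illustrative, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(2)
T, d, k = 6, 4, 3                          # encoder steps, feature dim, conv width

def hybrid_attention(enc, dec_state, prev_align, W, V, U, v):
    """Hybrid (content + location) attention: scores depend on the encoder
    features, the decoder state, and a convolution of the previous alignment."""
    # Location features: 1-D convolution over the previous attention weights.
    pad = np.pad(prev_align, (k // 2, k // 2))
    loc = np.array([pad[t:t + k] for t in range(T)])          # (T, k)
    scores = np.tanh(enc @ V + dec_state @ W + loc @ U) @ v   # (T,)
    e = np.exp(scores - scores.max())
    return e / e.sum()

enc = rng.normal(size=(T, d))              # encoder outputs
dec_state = rng.normal(size=d)             # current decoder state
prev_align = np.full(T, 1.0 / T)           # previous attention weights
W, V, U, v = (rng.normal(size=(d, d)), rng.normal(size=(d, d)),
              rng.normal(size=(k, d)), rng.normal(size=d))
align = hybrid_attention(enc, dec_state, prev_align, W, V, U, v)
print(align.round(3), align.sum())         # a distribution over the 6 encoder steps
```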
APA, Harvard, Vancouver, ISO and other citation styles
50

Li, Liunian Harold, Patrick H. Chen, Cho-Jui Hsieh and Kai-Wei Chang. "Efficient Contextual Representation Learning With Continuous Outputs". Transactions of the Association for Computational Linguistics 7 (November 2019): 611–24. http://dx.doi.org/10.1162/tacl_a_00289.

Full text of the source
Annotation:
Contextual representation models have achieved great success in improving various downstream natural language processing tasks. However, these language-model-based encoders are difficult to train due to their large parameter size and high computational complexity. By carefully examining the training procedure, we observe that the softmax layer, which predicts a distribution over the target word, often induces significant overhead, especially when the vocabulary size is large. Therefore, we revisit the design of the output layer and consider directly predicting the pre-trained embedding of the target word for a given context. When applied to ELMo, the proposed approach achieves a 4-fold speedup and eliminates 80% of trainable parameters while achieving competitive performance on downstream tasks. Further analysis shows that the approach maintains its speed advantage under various settings, even when the sentence encoder is scaled up.
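A minimal sketch of the contrast being drawn: a softmax output layer whose cost scales with the vocabulary size versus a continuous output layer that regresses onto the target word's pre-trained embedding. The cosine distance and the absence of a learned projection are simplifying assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(3)
vocab_size, d = 50_000, 300
pretrained = rng.normal(size=(vocab_size, d))   # frozen pre-trained word embeddings
hidden = rng.normal(size=d)                     # encoder output for one position
target_id = 123

# Conventional softmax output layer: cost grows with the vocabulary size.
logits = pretrained @ hidden                    # (vocab_size,)
log_probs = logits - np.log(np.exp(logits - logits.max()).sum()) - logits.max()
softmax_loss = -log_probs[target_id]

# Continuous-output layer: predict the target word's pre-trained embedding
# directly; an illustrative cosine distance, independent of vocab_size.
def cosine_distance(a, b):
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

continuous_loss = cosine_distance(hidden, pretrained[target_id])
print(softmax_loss, continuous_loss)
```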
APA, Harvard, Vancouver, ISO and other citation styles