Academic literature on the topic 'LSTM unit'

Author: Grafiati

Published: 28 June 2021

Last updated: 1 February 2022

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'LSTM unit.'

Journal articles on the topic "LSTM unit"

1

Dangovski, Rumen, Li Jing, Preslav Nakov, Mićo Tatalović, and Marin Soljačić. "Rotational Unit of Memory: A Novel Representation Unit for RNNs with Scalable Applications." Transactions of the Association for Computational Linguistics 7 (November 2019): 121–38. http://dx.doi.org/10.1162/tacl_a_00258.

Abstract:

Stacking long short-term memory (LSTM) cells or gated recurrent units (GRUs) as part of a recurrent neural network (RNN) has become a standard approach to solving a number of tasks ranging from language modeling to text summarization. Although LSTMs and GRUs were designed to model long-range dependencies more accurately than conventional RNNs, they nevertheless have problems copying or recalling information from the long distant past. Here, we derive a phase-coded representation of the memory state, Rotational Unit of Memory (RUM), that unifies the concepts of unitary learning and associative memory. We show experimentally that RNNs based on RUMs can solve basic sequential tasks such as memory copying and memory recall much better than LSTMs/GRUs. We further demonstrate that by replacing LSTM/GRU with RUM units we can apply neural networks to real-world problems such as language modeling and text summarization, yielding results comparable to the state of the art.

2

Han, Shipeng, Zhen Meng, Xingcheng Zhang, and Yuepeng Yan. "Hybrid Deep Recurrent Neural Networks for Noise Reduction of MEMS-IMU with Static and Dynamic Conditions." Micromachines 12, no. 2 (2021): 214. http://dx.doi.org/10.3390/mi12020214.

Abstract:

Micro-electro-mechanical system inertial measurement unit (MEMS-IMU), a core component in many navigation systems, directly determines the accuracy of inertial navigation system; however, MEMS-IMU system is often affected by various factors such as environmental noise, electronic noise, mechanical noise and manufacturing error. These can seriously affect the application of MEMS-IMU used in different fields. Focus has been on MEMS gyro since it is an essential and, yet, complex sensor in MEMS-IMU which is very sensitive to noises and errors from the random sources. In this study, recurrent neural networks are hybridized in four different ways for noise reduction and accuracy improvement in MEMS gyro. These are two-layer homogenous recurrent networks built on long short term memory (LSTM-LSTM) and gated recurrent unit (GRU-GRU), respectively; and another two-layer but heterogeneous deep networks built on long short term memory-gated recurrent unit (LSTM-GRU) and a gated recurrent unit-long short term memory (GRU-LSTM). Practical implementation with static and dynamic experiments was carried out for a custom MEMS-IMU to validate the proposed networks, and the results show that GRU-LSTM seems to be overfitting large amount data testing for three-dimensional axis gyro in the static test. However, for X-axis and Y-axis gyro, LSTM-GRU had the best noise reduction effect with over 90% improvement in the three axes. For Z-axis gyroscope, LSTM-GRU performed better than LSTM-LSTM and GRU-GRU in quantization noise and angular random walk, while LSTM-LSTM shows better improvement than both GRU-GRU and LSTM-GRU networks in terms of zero bias stability. In the dynamic experiments, the Hilbert spectrum carried out revealed that time-frequency energy of the LSTM-LSTM, GRU-GRU, and GRU-LSTM denoising are higher compared to LSTM-GRU in terms of the whole frequency domain. 
Similarly, Allan variance analysis also shows that LSTM-GRU has a better denoising effect than the other networks in the dynamic experiments. Overall, the experimental results demonstrate the effectiveness of deep learning algorithms in MEMS gyro noise reduction, among which LSTM-GRU network shows the best noise reduction effect and great potential for application in the MEMS gyroscope area.

3

Huang, Zhongzhan, Senwei Liang, Mingfu Liang, and Haizhao Yang. "DIANet: Dense-and-Implicit Attention Network." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (2020): 4206–14. http://dx.doi.org/10.1609/aaai.v34i04.5842.

Abstract:

Attention networks have successfully boosted the performance in various vision problems. Previous works lay emphasis on designing a new attention module and individually plug them into the networks. Our paper proposes a novel-and-simple framework that shares an attention module throughout different network layers to encourage the integration of layer-wise information and this parameter-sharing module is referred to as Dense-and-Implicit-Attention (DIA) unit. Many choices of modules can be used in the DIA unit. Since Long Short Term Memory (LSTM) has a capacity of capturing long-distance dependency, we focus on the case when the DIA unit is the modified LSTM (called DIA-LSTM). Experiments on benchmark datasets show that the DIA-LSTM unit is capable of emphasizing layer-wise feature interrelation and leads to significant improvement of image classification accuracy. We further empirically show that the DIA-LSTM has a strong regularization ability on stabilizing the training of deep networks by the experiments with the removal of skip connections (He et al. 2016a) or Batch Normalization (Ioffe and Szegedy 2015) in the whole residual network.

4

Wang, Jianyong, Lei Zhang, Yuanyuan Chen, and Zhang Yi. "A New Delay Connection for Long Short-Term Memory Networks." International Journal of Neural Systems 28, no. 06 (2018): 1750061. http://dx.doi.org/10.1142/s0129065717500617.

Abstract:

Connections play a crucial role in neural network (NN) learning because they determine how information flows in NNs. Suitable connection mechanisms may extensively enlarge the learning capability and reduce the negative effect of gradient problems. In this paper, a new delay connection is proposed for the Long Short-Term Memory (LSTM) unit to develop a more sophisticated recurrent unit, called Delay Connected LSTM (DCLSTM). The proposed delay connection brings two main merits to DCLSTM while introducing no extra parameters. First, it allows the output of the DCLSTM unit to maintain LSTM, which is absent in the LSTM unit. Second, the proposed delay connection helps to bridge the error signals to previous time steps and allows them to be back-propagated across several layers without vanishing too quickly. To evaluate the performance of the proposed delay connections, the DCLSTM model with and without peephole connections was compared with four state-of-the-art recurrent models on two sequence classification tasks. The DCLSTM model outperformed the other models with higher accuracy and F1 score. Furthermore, the networks with multiple stacked DCLSTM layers and the standard LSTM layer were evaluated on Penn Treebank (PTB) language modeling. The DCLSTM model achieved lower perplexity (PPL)/bit-per-character (BPC) than the standard LSTM model. The experiments demonstrate that the learning of the DCLSTM models is more stable and efficient.

5

He, Wei, Jufeng Li, Zhihe Tang, et al. "A Novel Hybrid CNN-LSTM Scheme for Nitrogen Oxide Emission Prediction in FCC Unit." Mathematical Problems in Engineering 2020 (August 17, 2020): 1–12. http://dx.doi.org/10.1155/2020/8071810.

Abstract:

Fluid Catalytic Cracking (FCC), a key unit for secondary processing of heavy oil, is one of the main pollutant emissions of NOx in refineries which can be harmful for the human health. Owing to its complex behaviour in reaction, product separation, and regeneration, it is difficult to accurately predict NOx emission during FCC process. In this paper, a novel deep learning architecture formed by integrating Convolutional Neural Network (CNN) and Long Short-Term Memory Network (LSTM) for nitrogen oxide emission prediction is proposed and validated. CNN is used to extract features among multidimensional data. LSTM is employed to identify the relationships between different time steps. The data from the Distributed Control System (DCS) in one refinery was used to evaluate the performance of the proposed architecture. The results indicate the effectiveness of CNN-LSTM in handling multidimensional time series datasets with the RMSE of 23.7098, and the R2 of 0.8237. Compared with previous methods (CNN and LSTM), CNN-LSTM overcomes the limitation of high-quality feature dependence and handles large amounts of high-dimensional data with better efficiency and accuracy. The proposed CNN-LSTM scheme would be a beneficial contribution to the accurate and stable prediction of irregular trends for NOx emission from refining industry, providing more reliable information for NOx risk assessment and management.

6

Donoso-Oliva, C., G. Cabrera-Vives, P. Protopapas, R. Carrasco-Davis, and P. A. Estevez. "The effect of phased recurrent units in the classification of multiple catalogues of astronomical light curves." Monthly Notices of the Royal Astronomical Society 505, no. 4 (2021): 6069–84. http://dx.doi.org/10.1093/mnras/stab1598.

Abstract:

In the new era of very large telescopes, where data are crucial to expand scientific knowledge, we have witnessed many deep learning applications for the automatic classification of light curves. Recurrent neural networks (RNNs) are one of the models used for these applications, and the Long Short-Term Memory (LSTM) unit stands out for being an excellent choice for the representation of long time series. In general, RNNs assume observations at discrete times, which may not suit the irregular sampling of light curves. A traditional technique to address irregular sequences consists of adding the sampling time to the network’s input, but this is not guaranteed to capture sampling irregularities during training. Alternatively, the Phased LSTM (PLSTM) unit has been created to address this problem by updating its state using the sampling times explicitly. In this work, we study the effectiveness of the LSTM- and PLSTM-based architectures for the classification of astronomical light curves. We use seven catalogues containing periodic and non-periodic astronomical objects. Our findings show that LSTM outperformed PLSTM on six of seven data sets. However, the combination of both units enhances the results in all data sets.

7

Pan, Yu, Jing Xu, Maolin Wang, et al. "Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4683–90. http://dx.doi.org/10.1609/aaai.v33i01.33014683.

Abstract:

Recurrent Neural Networks (RNNs) and their variants, such as Long-Short Term Memory (LSTM) networks, and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling. The hidden layers in RNNs can be regarded as the memory units, which are helpful in storing information in sequential contexts. However, when dealing with high dimensional input data, such as video and text, the input-to-hidden linear transformation in RNNs brings high memory usage and huge computational cost. This makes the training of RNNs very difficult. To address this challenge, we propose a novel compact LSTM model, named as TR-LSTM, by utilizing the low-rank tensor ring decomposition (TRD) to reformulate the input-to-hidden transformation. Compared with other tensor decomposition methods, TR-LSTM is more stable. In addition, TR-LSTM can complete an end-to-end training and also provide a fundamental building block for RNNs in handling large input data. Experiments on real-world action recognition datasets have demonstrated the promising performance of the proposed TR-LSTM compared with the tensor-train LSTM and other state-of-the-art competitors.

8

Shafqat, Wafa, and Yung-Cheol Byun. "A Context-Aware Location Recommendation System for Tourists Using Hierarchical LSTM Model." Sustainability 12, no. 10 (2020): 4107. http://dx.doi.org/10.3390/su12104107.

Abstract:

The significance of contextual data has been recognized by analysts and specialists in numerous disciplines such as customization, data recovery, ubiquitous and versatile processing, information mining, and management. While a generous research has just been performed in the zone of recommender frameworks, by far most of the existing approaches center on prescribing the most relevant items to customers. It usually neglects extra-contextual information, for example time, area, climate or the popularity of different locations. Therefore, we proposed a deep long-short term memory (LSTM) based context-enriched hierarchical model. This proposed model had two levels of hierarchy and each level comprised of a deep LSTM network. In each level, the task of the LSTM was different. At the first level, LSTM learned from user travel history and predicted the next location probabilities. A contextual learning unit was active between these two levels. This unit extracted maximum possible contexts related to a location, the user and its environment such as weather, climate and risks. This unit also estimated other effective parameters such as the popularity of a location. To avoid feature congestion, XGBoost was used to rank feature importance. The features with no importance were discarded. At the second level, another LSTM framework was used to learn these contextual features embedded with location probabilities and resulted into top ranked places. The performance of the proposed approach was elevated with an accuracy of 97.2%, followed by gated recurrent unit (GRU) (96.4%) and then Bidirectional LSTM (94.2%). We also performed experiments to find the optimal size of travel history for effective recommendations.

9

Wu, Beng, Wei He, Jing Wang, Huaqing Liang, and Chong Chen. "A convolutional-LSTM model for nitrogen oxide emission forecasting in FCC unit." Journal of Intelligent & Fuzzy Systems 40, no. 1 (2021): 1537–45. http://dx.doi.org/10.3233/jifs-192086.

Abstract:

As environmental issues move onto the agenda, air pollution has become a major concern. Nitrogen oxide (NOx) is an important factor affecting air pollution and is also the main gas emission in the smoke and waste gas of FCC units in the petrochemical industry. It is important for the petrochemical industry to predict NOx emissions accurately in advance to avoid air pollution incidents. In this paper, a convolutional neural network (CNN) and long short-term memory (LSTM) are combined to predict the NOx emission of a Fluid Catalytic Cracking (FCC) unit. Convolutional-LSTM (CLSTM) is able to extract the spatial and temporal features, which are essential information for predicting the NOx emission. The features in the production factors that affect the NOx emission are extracted by the CNN, which prepares time series data for the LSTM. The LSTM layer is connected after the CNN to model the irregular trends in the time series. CNN, multi-layer perceptron (MLP), random forest (RF), support vector machine (SVM), and LSTM are implemented as baseline models. The proposed CLSTM model performed better than all the baseline models. The mean absolute error and root mean square error for CLSTM were 16.8267 and 23.7089, respectively, the lowest among all the models. The Pearson correlation coefficient and R2 for the proposed CLSTM model were 0.9263 and 0.8237, respectively, the highest among all the models. Furthermore, the residual graphs indicate a good match between the observations and the predictions. The study provides a model reference for forecasting the NOx concentration emitted by FCC units in the petrochemical industry.
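
The four scores named in this abstract (mean absolute error, root mean square error, Pearson correlation coefficient, and R2) can be illustrated with a minimal sketch. It assumes R2 means the usual coefficient of determination, which the abstract does not state; the function name `regression_metrics` and the toy values are hypothetical and are not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, Pearson r and R2 for two equal-length sequences.

    Hypothetical helper for illustration; R2 is assumed to be the
    coefficient of determination (1 - SS_res / SS_tot).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)   # spread of observations
    ss_res = sum(e * e for e in errors)               # unexplained spread
    r2 = 1.0 - ss_res / ss_tot

    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    pearson = cov / math.sqrt(ss_tot * ss_pred)

    return {"mae": mae, "rmse": rmse, "pearson": pearson, "r2": r2}


# Toy values chosen only to exercise the function (not from the paper).
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(scores)
```

Lower MAE and RMSE and higher Pearson correlation and R2 indicate a closer fit, which is the sense in which the abstract ranks CLSTM against its baselines.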

10

Appati, Justice Kwame, Ismail Wafaa Denwar, Ebenezer Owusu, and Michael Agbo Tettey Soli. "Construction of an Ensemble Scheme for Stock Price Prediction Using Deep Learning Techniques." International Journal of Intelligent Information Technologies 17, no. 2 (2021): 72–95. http://dx.doi.org/10.4018/ijiit.2021040104.

Abstract:

This study proposes a deep learning approach for stock price prediction by bridging the long short-term memory with gated recurrent unit. In its evaluation, the mean absolute error and mean square error were used. The model proposed is an extension of the study of Hossain et al. established in 2018 with an MSE of 0.00098 as its lowest error. The current proposed model is a mix of the bidirectional LSTM and bidirectional GRU resulting in 0.00000008 MSE as the lowest error recorded. The LSTM model recorded 0.00000025 MSE, the GRU model recorded 0.00000077 MSE, and the LSTM + GRU model recorded 0.00000023 MSE. Other combinations of the existing models such as the bi-directional LSTM model recorded 0.00000019 MSE, bi-directional GRU recorded 0.00000011 MSE, bidirectional LSTM + GRU recorded 0.00000027 MSE, LSTM and bi-directional GRU recorded 0.00000020 MSE.

Dissertations / Theses on the topic "LSTM unit"

1

Sarika, Pawan Kumar. "Comparing LSTM and GRU for Multiclass Sentiment Analysis of Movie Reviews." Thesis, Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20213.

Abstract:

Today, we are living in a data-driven world. Due to a surge in data generation, there is a need for efficient and accurate techniques to analyze data. One kind of data that needs to be analyzed is text reviews of movies. Rather than classifying the reviews as positive or negative, we classify the sentiment of the reviews on a scale of one to ten. In doing so, we compare two recurrent neural network algorithms: long short-term memory (LSTM) and gated recurrent unit (GRU). The main objective of this study is to compare the accuracies of LSTM and GRU models. For training the models, we collected data from two different sources. For filtering the data, we used Porter stemming and stop words. We coupled LSTM and GRU with convolutional neural networks to increase performance. After conducting experiments, we observed that LSTM performed better in predicting border values, whereas GRU predicted every class equally well. Overall, GRU was able to predict multiclass text data of movie reviews slightly better than LSTM. GRU was computationally expensive compared to LSTM.

2

Imramovská, Klára. "Detekce komorových extrasystol v EKG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442489.

Abstract:

The thesis deals with the automatic detection of premature ventricular contractions (PVCs) in ECG records. One detection method, which uses a convolutional neural network and LSTM units, is implemented in Python. Cardiac cycles extracted from one-lead ECG were used for detection. The F1 score on the test dataset reached 96.41% for binary classification (PVC and normal beat) and 81.76% for three-class classification (PVC, normal beat, and other arrhythmias). Lastly, the accuracy of the classification is evaluated and discussed; the results achieved for binary classification are comparable to those of methods described in other papers.

3

Mealey, Thomas C. "Binary Recurrent Unit: Using FPGA Hardware to Accelerate Inference in Long Short-Term Memory Neural Networks." University of Dayton / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1524402925375566.

4

Radhakrishnan, Saieshwar. "Domain Adaptation of IMU sensors using Generative Adversarial Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286821.

Abstract:

Autonomous vehicles rely on sensors for a clear understanding of the environment and in a heavy duty truck, the sensors are placed at multiple locations like the cabin, chassis and the trailer in order to increase the field of view and reduce the blind spot area. Usually, these sensors perform best when they are stationary relative to the ground, hence large and fast movements, which are quite common in a truck, may lead to performance reduction, erroneous data or in the worst case, a sensor failure. This enforces a need to validate the sensors before using them for making life-critical decisions. This thesis proposes Domain Adaptation as one of the strategies to co-validate Inertial Measurement Unit (IMU) sensors. The proposed Generative Adversarial Network (GAN) based framework predicts the data of one IMU using other IMUs in the truck by implicitly learning the internal dynamics. This prediction model along with other sensor fusion strategies would be used by the supervising system to validate the IMUs in real-time. Through data collected from real-world experiments, it is shown that the proposed framework is able to accurately transform raw IMU sequences across domains. A further comparison is made between Long Short Term Memory (LSTM) and WaveNet based architectures to show the superiority of WaveNets in terms of performance and computational efficiency.

5

Gattoni, Giacomo. "Improving the reliability of recurrent neural networks while dealing with bad data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Abstract:

In practical applications, machine learning and deep learning models can have difficulty in achieving generalization, especially when dealing with training samples that are either noisy or limited in quantity. Standard neural networks do not guarantee the monotonicity of the input features with respect to the output, therefore they lack interpretability and predictability when it is known a priori that the input-output relationship should be monotonic. This problem can be encountered in the CPG industry, where it is not possible to ensure that a deep learning model will learn the increasing monotonic relationship between promotional mechanics and sales. To overcome this issue, it is proposed the combined usage of recurrent neural networks, a type of artificial neural networks specifically designed to deal with data structured as sequences, with lattice networks, conceived to guarantee monotonicity of the desired input features with respect to the output. The proposed architecture has proven to be more reliable when new samples are fed to the neural network, demonstrating its ability to infer the evolution of the sales depending on the promotions, even when it is trained on bad data.

6

Anbil, Parthipan Sarath Chandar. "On challenges in training recurrent neural networks." Thèse, 2019. http://hdl.handle.net/1866/23435.

Abstract:

In a multi-step prediction problem, the prediction at each time step can depend on the input at any of the previous time steps far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, they can only model short-term dependencies due to the problem of vanishing and exploding gradients. This thesis explores the problem of vanishing gradient in recurrent neural networks and proposes novel solutions for the same. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate of gradients vanishing in an LSTM. These discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture which has significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning. In the lifelong learning setting, the network is expected to learn a series of tasks over its lifetime. The dependencies in lifelong learning are not just within a task, but also across the tasks. This chapter discusses the two fundamental problems in lifelong learning: (i) catastrophic forgetting of old tasks, and (ii) network capacity saturation. Further, it proposes a solution to solve both these problems while training a recurrent neural network.

Books on the topic "LSTM unit"

1

D, Ashari. Antara tugas & hobby [i.e. hobi]: Otobiografi unik seorang pejuang, diplomat, menteri, pelukis, atlet, vegetarian, dan pelopor LSM. Yayasan Wiratama 45, 1999.

Book chapters on the topic "LSTM unit"

1

Li, Yancui, Chunxiao Lai, Jike Feng, and Hongyu Feng. "Chinese and English Elementary Discourse Units Recognition Based on Bi-LSTM-CRF Model." In Lecture Notes in Computer Science. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-63031-7_24.

2

Okut, Hayrettin. "Deep Learning for Subtyping and Prediction of Diseases: Long-Short Term Memory." In Deep Learning Applications. IntechOpen, 2021. http://dx.doi.org/10.5772/intechopen.96180.

Abstract:

The long short-term memory neural network (LSTM) is a type of recurrent neural network (RNN). During the training of an RNN architecture, sequential information travels through the neural network from the input vector to the output neurons, while the error is calculated and propagated back through the network to update the network parameters. These networks incorporate loops into the hidden layer. Loops allow information to flow multi-directionally so that the hidden state signifies past information held at a given time step. Consequently, the output is dependent on the previous predictions, which are already known. However, RNNs have limited capacity to bridge more than a certain number of steps. This is mainly due to vanishing gradients, which cause the predictions to capture only short-term dependencies as information from earlier steps decays. As more layers containing activation functions are added to an RNN, the gradient of the loss function approaches zero. The LSTM neural networks (LSTM-ANNs) enable learning of long-term dependencies. LSTM introduces a memory unit and a gate mechanism to capture the long dependencies in a sequence. Therefore, LSTM networks can selectively remember or forget information and are capable of learning over thousands of time steps by means of structures called cell states and three gates.
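
The cell state and three gates mentioned in this abstract can be sketched for a single scalar LSTM unit. This is a minimal illustration of the standard LSTM update equations, not code from the chapter; the function name `lstm_unit_step`, the parameter layout, and all weight values are hypothetical and chosen only to make the gate behaviour visible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_unit_step(x, h_prev, c_prev, params):
    """One time step of a single scalar LSTM unit (illustrative only).

    params maps a gate name to (input weight, recurrent weight, bias);
    this layout is an assumption made for the sketch.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much new candidate to write
    o = sigmoid(pre_activation("output"))       # how much cell state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate cell content

    c = f * c_prev + i * g     # additive cell-state update
    h = o * math.tanh(c)       # hidden state / unit output
    return h, c


# Hypothetical weights: forget gate near 1 and input gate near 0, so the
# unit "remembers" its initial cell state and ignores the inputs.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0
for x in [0.5, -0.3, 0.8]:
    h, c = lstm_unit_step(x, h, c, params)
print(h, c)
```

With the forget gate saturated near one, the cell state stays close to its initial value across steps, which is the selective remembering the abstract describes; practical LSTM layers use vector states and learned weight matrices instead of fixed scalars.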

3

Husna, Asma, Saman Hassanzadeh Amin, and Bharat Shah. "Demand Forecasting in Supply Chain Management Using Different Deep Learning Methods." In Advances in Logistics, Operations, and Management Science. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-3805-0.ch005.

Abstract:

Supply chain management (SCM) is a fast-growing and widely studied field of research. Forecasting the required materials and parts is an important task in companies and can have a significant impact on the total cost. To obtain a reliable forecast, advanced methods such as deep learning techniques are helpful. The main goal of this chapter is to forecast the unit sales of thousands of items sold at different chain stores located in Ecuador with holistic techniques. Three deep learning approaches, namely artificial neural network (ANN), convolutional neural network (CNN), and long short-term memory (LSTM), are adopted here for predictions from the Corporación Favorita grocery sales forecasting dataset obtained from the Kaggle website. Finally, the performances of the applied models are evaluated and compared. The results show that the LSTM network tends to outperform the other two approaches. All experiments are conducted using Python with the Keras and TensorFlow deep learning packages.

4

Saxena, Suchitra, Shikha Tripathi, and Sudarshan Tsb. "Deep Robot-Human Interaction with Facial Emotion Recognition Using Gated Recurrent Units & Robotic Process Automation." In Machine Learning and Artificial Intelligence. IOS Press, 2020. http://dx.doi.org/10.3233/faia200773.

Abstract:

This research work proposes a Facial Emotion Recognition (FER) system using the deep learning algorithm Gated Recurrent Units (GRUs) and Robotic Process Automation (RPA) for real-time robotic applications. GRUs are used in the proposed architecture to reduce training time and to capture temporal information. Most work reported in the literature uses Convolutional Neural Networks (CNNs) and hybrid architectures of CNNs with Long Short Term Memory (LSTM) and GRUs. In this work, GRUs are used for feature extraction from raw images and dense layers are used for classification. The performance of CNNs, GRUs and LSTM is compared in the context of facial emotion recognition. The proposed FER system is implemented on a Raspberry Pi 3 B+ and on Robotic Process Automation (RPA) using the UiPath RPA tool for robot-human interaction, achieving 94.66% average accuracy in real time.

Conference papers on the topic "LSTM unit"

1

Chen, Zhenzhong, and Wanjie Sun. "Scanpath Prediction for Visual Attention using IOR-ROI LSTM." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/89.

Abstract:

Predicting the scanpath when a certain stimulus is presented plays an important role in modeling visual attention and search. This paper presents a model that integrates a convolutional neural network and long short-term memory (LSTM) to generate realistic scanpaths. The core part of the proposed model is a dual LSTM unit, i.e., an inhibition of return LSTM (IOR-LSTM) and a region of interest LSTM (ROI-LSTM), capturing IOR dynamics and gaze shift behavior simultaneously. IOR-LSTM simulates the visual working memory to adaptively integrate and forget scene information. ROI-LSTM is responsible for predicting the next ROI given the inhibited image features. Experimental results indicate that the proposed architecture can achieve superior performance in predicting scanpaths.

2

Nina, Oliver, and Andres Rodriguez. "Simplified LSTM unit and search space probability exploration for image description." In 2015 10th International Conference on Information, Communications and Signal Processing (ICICS). IEEE, 2015. http://dx.doi.org/10.1109/icics.2015.7459976.

3

Wan, Vincent, Yannis Agiomyrgiannakis, Hanna Silen, and Jakub Vít. "Google’s Next-Generation Real-Time Unit-Selection Synthesizer Using Sequence-to-Sequence LSTM-Based Autoencoders." In Interspeech 2017. ISCA, 2017. http://dx.doi.org/10.21437/interspeech.2017-1107.

4

Ho, Thi-Nga, Duy-Cat Can, and EngSiong Chng. "An Investigation of Word Embeddings with Deep Bidirectional LSTM for Sentence Unit Detection in Automatic Speech Transcription." In 2018 International Conference on Asian Language Processing (IALP). IEEE, 2018. http://dx.doi.org/10.1109/ialp.2018.8629114.

5

Feng, Yufei, Fuyu Lv, Weichen Shen, et al. "Deep Session Interest Network for Click-Through Rate Prediction." In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/319.

Abstract:

Click-Through Rate (CTR) prediction plays an important role in many industrial applications, such as online advertising and recommender systems. How to capture users' dynamic and evolving interests from their behavior sequences remains a continuing research topic in CTR prediction. However, most existing studies overlook the intrinsic structure of the sequences: the sequences are composed of sessions, where sessions are user behaviors separated by their occurring time. We observe that user behaviors are highly homogeneous within each session and heterogeneous across sessions. Based on this observation, we propose a novel CTR model named Deep Session Interest Network (DSIN) that leverages users' multiple historical sessions in their behavior sequences. We first use a self-attention mechanism with bias encoding to extract users' interests in each session. Then we apply Bi-LSTM to model how users' interests evolve and interact among sessions. Finally, we employ the local activation unit to adaptively learn the influences of various session interests on the target item. Experiments are conducted on both advertising and production recommender datasets, and DSIN outperforms other state-of-the-art models on both datasets.

6

Gupta, Ashit, Vishal Jadhav, Mukul Patil, Anirudh Deodhar, and Venkataramana Runkana. "Forecasting of Fouling in Air Pre-Heaters Through Deep Learning." In ASME 2021 Power Conference. American Society of Mechanical Engineers, 2021. http://dx.doi.org/10.1115/power2021-64665.

Abstract:

Thermal power plants employ regenerative type air pre-heaters (APH) for recovering heat from the boiler flue gases. APH fouling occurs due to deposition of ash particles and products formed by reactions between leaked ammonia from the upstream selective catalytic reduction (SCR) unit and sulphur oxides (SOx) present in the flue gases. Fouling is strongly influenced by the concentrations of ammonia and sulphur oxides as well as the flue gas temperature within the APH. It increases the differential pressure across the APH over time, ultimately leading to forced outages. Owing to the lack of sensors within the APH and the complex thermo-chemical phenomena, fouling is quite unpredictable. We present a deep learning based model for forecasting the gas differential pressure across the APH using Long Short Term Memory (LSTM) networks. The model is trained and tested with data generated by a plant model, validated against an industrial scale APH. The model forecasts the gas differential pressure across the APH within an accuracy band of 5–10% up to 3 months in advance, as a function of operating conditions. We also propose a digital twin of the APH that can provide real-time insights into the progression of fouling and preempt forced outages.

7

Yang, Ruiyue, Wei Liu, Xiaozhou Qin, et al. "A Physics-Constrained Data-Driven Workflow for Predicting Coalbed Methane Well Production Using A Combined Gated Recurrent Unit and Multi-Layer Perception Neural Network Model." In SPE Annual Technical Conference and Exhibition. SPE, 2021. http://dx.doi.org/10.2118/205903-ms.

Abstract:

Coalbed methane (CBM) has emerged as one of the clean unconventional resources to supplement the rising demand for conventional hydrocarbons. Analyzing and predicting CBM production performance is critical in choosing the optimal completion methods and parameters. However, conventional numerical simulation faces complicated gridding issues and expensive computational costs. The huge amount of production data collected at field sites opens up a new opportunity to develop data-driven approaches for predicting the production rate. Here, we propose a novel physics-constrained data-driven workflow to effectively forecast CBM productivity based on a combined Gated Recurrent Unit (GRU) and Multi-Layer Perceptron (MLP) neural network (GRU-MLP model). The model architecture is optimized by a multiobjective algorithm, the nondominated sorting genetic algorithm Ⅱ (NSGA Ⅱ). The proposed framework was used to predict synthetic cases with various fracture-network complexities and two multistage-fractured wells at field sites located in the Qinshui basin and the Ordos basin, China. The results indicated that the proposed GRU-MLP combined neural network was able to predict the production performance of multi-fractured horizontal CBM wells accurately, stably, and quickly. Compared with a Simple Recurrent Neural Network (RNN), a Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), the proposed GRU-MLP had the highest accuracy and stability, especially for late-time gas production. Consequently, a physics-constrained data-driven approach performed better than a pure data-driven method. Moreover, the optimum GRU-MLP model architecture was a group of optimized solutions rather than a single solution. Engineers can evaluate the tradeoffs within this set according to the field-site requirements.
This study provides a novel machine learning approach based on a GRU-MLP combined neural network model to estimate production performance in CBM wells. The method is simple and gridless, yet capable of predicting productivity in a computationally cost-effective way. The key findings of this work are expected to provide theoretical guidance for intelligent development in the oil and gas industry.

8

Wang, Fuyong, Yun Zai, Jiuyu Zhao, and Siyi Fang. "Field Application of Deep Learning for Flow Rate Prediction with Downhole Temperature and Pressure." In International Petroleum Technology Conference. IPTC, 2021. http://dx.doi.org/10.2523/iptc-21364-ms.

Abstract:

Real-time well flow rate is one of the most important production parameters in an oilfield, and accurate flow rate information is crucial for production monitoring and optimization. With the wide application of permanent downhole gauges (PDG), the high frequency and large volume of downhole temperature and pressure data make it possible to apply deep learning techniques to predict flow rate. The flow rate of a production well is predicted with a long short-term memory (LSTM) network using downhole temperature and pressure production data. The specific parameters of the LSTM neural network are given, as well as the methods of data preprocessing and neural network training. The developed model has been validated with two production wells in the Volve Oilfield, North Sea. The field application demonstrates that deep learning is applicable to flow rate prediction in oilfields. LSTM predicts flow rate better than five other machine learning methods, including support vector machine (SVM), linear regression, tree, and Gaussian process regression. The LSTM with a dropout layer performs better than a standard LSTM network. The numbers of LSTM layers and hidden units can be adjusted to obtain the best prediction results, but more LSTM layers and hidden units lead to longer training and prediction times, and the LSTM model may become unstable and fail to converge. Compared with using only downhole pressure or only temperature data as input, flow rate prediction using both downhole pressure and temperature as input parameters has higher accuracy.

9

Lau, Zhi Jie, and Chris Philips. "Advanced T-LSIM System Detections using Amplified External Isolated Source-Sense Unit." In ISTFA 2018. ASM International, 2018. http://dx.doi.org/10.31399/asm.cp.istfa2018p0200.

Abstract:

Thermal-Laser Signal Injection Microscopy (T-LSIM) is a widely used fault isolation technique. Although there are several T-LSIM systems on the market, each is limited in terms of the voltage and current it can produce. In this paper, the authors explain how they incorporated an Amplified External Isolated Source-Sense (AxISS) unit into their T-LSIM platform, increasing its current sourcing capability and voltage biasing range. They also provide examples highlighting the types of faults and failures that the modified system can detect.

10

Alencar, Victor Aquiles Soares de Barros, Lucas Ribeiro Pessamilio, Felipe Rooke Da Silva, Heder Soares Bernardino, and Alex Borges Vieira. "Predição de Séries Temporais de Demanda em Modelos de Compartilhamento de Veículos para Modelos Uni e Multi Variáveis." In Workshop de Computação Urbana. Sociedade Brasileira de Computação - SBC, 2020. http://dx.doi.org/10.5753/courb.2020.12355.

Abstract:

Vehicle sharing is an alternative for urban mobility that has been widely adopted. However, this approach is subject to problems such as fleet imbalance throughout the day, owing to varying demand in large urban centers. In this work, we apply two time series techniques, LSTM and Prophet, to infer the demand for three real vehicle-sharing services. In addition to historical data, weather attributes were also considered in one of the LSTM applications. As a result, it was observed that adding meteorological data improved the model's performance: an average MAE (Mean Absolute Error) of approximately 6.01% is obtained with the demand data, while an MAE of 5.9% is observed when the weather data are added. It can also be noted that the performance of LSTM is better than that obtained by Prophet (average MAE of 10.4%) for the datasets adopted here and considering only the demand for the services.
