Journal articles on the topic 'ReLU neural networks'


Consult the top 50 journal articles for your research on the topic 'ReLU neural networks.'


Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Liang, XingLong, and Jun Xu. "Biased ReLU neural networks." Neurocomputing 423 (January 2021): 71–79. http://dx.doi.org/10.1016/j.neucom.2020.09.050.

2

Huang, Changcun. "ReLU Networks Are Universal Approximators via Piecewise Linear or Constant Functions." Neural Computation 32, no. 11 (November 2020): 2249–78. http://dx.doi.org/10.1162/neco_a_01316.

Abstract:
This letter proves that a ReLU network can approximate any continuous function with arbitrary precision by means of piecewise linear or constant approximations. For a univariate function, we use a composite of ReLUs to produce a line segment; all of the subnetworks of line segments together comprise a ReLU network, which is a piecewise linear approximation to the target function. For a multivariate function, ReLU networks are constructed to approximate a piecewise linear function derived from triangulation methods approximating it. A neural unit called TRLU is designed by a ReLU network; piecewise constant approximation, such as Haar wavelets, is implemented by rectifying the linear output of a ReLU network via TRLUs. New interpretations of deep layers, as well as some other results, are also presented.
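To make the piecewise-linear idea concrete, here is a minimal numpy sketch (a generic textbook-style construction, not the specific one from the cited letter) in which a one-hidden-layer ReLU network reproduces the piecewise linear interpolant of a univariate function:

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def pwl_relu_net(x, knots, values):
        """One-hidden-layer ReLU net reproducing the piecewise linear
        interpolant of (knots, values) on [knots[0], knots[-1]]."""
        slopes = np.diff(values) / np.diff(knots)        # slope on each segment
        jumps = np.diff(slopes, prepend=0.0)             # change of slope at each knot
        hidden = relu(x[:, None] - knots[:-1][None, :])  # hidden units relu(x - t_i)
        return values[0] + hidden @ jumps                # output layer: weighted sum + bias

    # Example: approximate sin(x) on [0, pi] with 9 knots.
    knots = np.linspace(0, np.pi, 9)
    x = np.linspace(0, np.pi, 200)
    y_net = pwl_relu_net(x, knots, np.sin(knots))
    print(np.max(np.abs(y_net - np.sin(x))))             # small uniform error

Refining the knots drives the uniform error to zero, which is the univariate case of the approximation result summarized above.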
3

Kulathunga, Nalinda, Nishath Rajiv Ranasinghe, Daniel Vrinceanu, Zackary Kinsman, Lei Huang, and Yunjiao Wang. "Effects of Nonlinearity and Network Architecture on the Performance of Supervised Neural Networks." Algorithms 14, no. 2 (February 5, 2021): 51. http://dx.doi.org/10.3390/a14020051.

Abstract:
The nonlinearity of activation functions used in deep learning models is crucial for the success of predictive models. Several simple nonlinear functions, including the Rectified Linear Unit (ReLU) and Leaky-ReLU (L-ReLU), are commonly used in neural networks to impose the nonlinearity. In practice, these functions remarkably enhance the model accuracy. However, there is limited insight into the effects of nonlinearity in neural networks on their performance. Here, we investigate the performance of neural network models as a function of nonlinearity using ReLU and L-ReLU activation functions in the context of different model architectures and data domains. We use entropy as a measure of randomness to quantify the effects of nonlinearity in different architecture shapes on the performance of neural networks. We show that the ReLU nonlinearity is a better choice of activation function mostly when the network has a sufficient number of parameters. However, we found that image classification models with transfer learning seem to perform well with L-ReLU in fully connected layers. We show that the entropy of hidden layer outputs in neural networks can fairly represent the fluctuations in information loss as a function of nonlinearity. Furthermore, we investigate the entropy profile of shallow neural networks as a way of representing their hidden layer dynamics.
4

Dung, D., V. K. Nguyen, and M. X. Thao. "ON COMPUTATION COMPLEXITY OF HIGH-DIMENSIONAL APPROXIMATION BY DEEP ReLU NEURAL NETWORKS." BULLETIN of L.N. Gumilyov Eurasian National University. MATHEMATICS. COMPUTER SCIENCE. MECHANICS Series 133, no. 4 (2020): 8–18. http://dx.doi.org/10.32523/2616-7182/2020-133-4-8-18.

Abstract:
We investigate the computation complexity of deep ReLU neural networks for approximating functions in Hölder-Nikol'skii spaces of mixed smoothness $H_\infty^\alpha(\mathbb{I}^d)$ on the unit cube $\mathbb{I}^d:=[0,1]^d$. For any function $f\in H_\infty^\alpha(\mathbb{I}^d)$, we explicitly construct nonadaptive and adaptive deep ReLU neural networks having an output that approximates $f$ with a prescribed accuracy $\varepsilon$, and prove dimension-dependent bounds for the computation complexity of this approximation, characterized by the size and depth of this deep ReLU neural network, explicitly in $d$ and $\varepsilon$. Our results show the advantage of the adaptive method of approximation by deep ReLU neural networks over the nonadaptive one.
5

Gühring, Ingo, Gitta Kutyniok, and Philipp Petersen. "Error bounds for approximations with deep ReLU neural networks in Ws,p norms." Analysis and Applications 18, no. 05 (September 19, 2019): 803–59. http://dx.doi.org/10.1142/s0219530519410021.

Abstract:
We analyze to what extent deep Rectified Linear Unit (ReLU) neural networks can efficiently approximate Sobolev regular functions if the approximation error is measured with respect to weaker Sobolev norms. In this context, we first establish upper approximation bounds by ReLU neural networks for Sobolev regular functions by explicitly constructing the approximate ReLU neural networks. Then, we establish lower approximation bounds for the same type of function classes. A trade-off between the regularity used in the approximation norm and the complexity of the neural network can be observed in upper and lower bounds. Our results extend recent advances in the approximation theory of ReLU networks to the regime that is most relevant for applications in the numerical analysis of partial differential equations.
6

Dũng, Dinh, Van Kien Nguyen, and Mai Xuan Thao. "COMPUTATION COMPLEXITY OF DEEP RELU NEURAL NETWORKS IN HIGH-DIMENSIONAL APPROXIMATION." Journal of Computer Science and Cybernetics 37, no. 3 (September 28, 2021): 291–320. http://dx.doi.org/10.15625/1813-9663/37/3/15902.

Abstract:
The purpose of the present paper is to study the computation complexity of deep ReLU neural networks to approximate functions in Hölder-Nikol'skii spaces of mixed smoothness $H_\infty^\alpha(\mathbb{I}^d)$ on the unit cube $\mathbb{I}^d:=[0,1]^d$. In this context, for any function $f\in H_\infty^\alpha(\mathbb{I}^d)$, we explicitly construct nonadaptive and adaptive deep ReLU neural networks having an output that approximates $f$ with a prescribed accuracy $\varepsilon$, and prove dimension-dependent bounds for the computation complexity of this approximation, characterized by the size and the depth of this deep ReLU neural network, explicitly in $d$ and $\varepsilon$. Our results show the advantage of the adaptive method of approximation by deep ReLU neural networks over the nonadaptive one.
7

Полковникова, Н. А., Е. В. Тузинкевич, and А. Н. Попов. "Application of convolutional neural networks for monitoring of marine objects." MORSKIE INTELLEKTUAL`NYE TEHNOLOGII, no. 4(50) (December 17, 2020): 53–61. http://dx.doi.org/10.37220/mit.2020.50.4.097.

Abstract:
The article considers computer vision technologies based on deep convolutional neural networks. The application of neural networks is particularly effective for solving problems that are difficult to formalize. A convolutional neural network architecture was developed for the recognition and classification of marine objects in images. In the course of the research, a retrospective analysis of computer vision technologies was performed and a number of problems associated with the use of neural networks were identified: the vanishing gradient, overfitting, and computational complexity. To address these problems in the development of the network architecture, it was proposed to use the ReLU activation function, the training of randomly selected neurons, and normalization in order to simplify the network architecture. A comparison of the ReLU, LeakyReLU, Exponential ReLU, and SOFTMAX activation functions used in the network was performed in Matlab R2020a. Based on the convolutional neural network, a program for recognizing marine objects was implemented in the Visual C# programming language in the MS Visual Studio integrated development environment. The program is designed for automated identification of marine objects; it performs detection (locating objects in an image) and object recognition with a high probability of detection.
8

Gao, Hongyang, Lei Cai, and Shuiwang Ji. "Adaptive Convolutional ReLUs." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3914–21. http://dx.doi.org/10.1609/aaai.v34i04.5805.

Abstract:
Rectified linear units (ReLUs) are currently the most popular activation function used in neural networks. Although ReLUs can solve the gradient vanishing problem and accelerate training convergence, they suffer from the dying ReLU problem, in which some neurons are never activated if the weights are not updated properly. In this work, we propose a novel activation function, known as the adaptive convolutional ReLU (ConvReLU), that can better mimic brain neuron activation behaviors and overcome the dying ReLU problem. With our novel parameter sharing scheme, ConvReLUs can be applied to convolution layers that allow each input neuron to be activated by different trainable thresholds without involving a large number of extra parameters. We employ the zero initialization scheme in ConvReLU to encourage trainable thresholds to be close to zero. Finally, we develop a partial replacement strategy that only replaces the ReLUs in the early layers of the network. This resolves the dying ReLU problem and retains sparse representations for linear classifiers. Experimental results demonstrate that our proposed ConvReLU has consistently better performance compared to ReLU, LeakyReLU, and PReLU. In addition, the partial replacement strategy is shown to be effective not only for our ConvReLU but also for LeakyReLU and PReLU.
9

Petzka, Henning, Martin Trimmel, and Cristian Sminchisescu. "Notes on the Symmetries of 2-Layer ReLU-Networks." Proceedings of the Northern Lights Deep Learning Workshop 1 (February 6, 2020): 6. http://dx.doi.org/10.7557/18.5150.

Abstract:
Symmetries in neural networks allow different weight configurations leading to the same network function. For odd activation functions, the set of transformations mapping between such configurations has been studied extensively, but less is known for neural networks with the ReLU activation function. We give a complete characterization for fully-connected networks with two layers. Apart from two well-known transformations, only degenerate situations allow additional transformations that leave the network function unchanged. Reduction steps can remove only part of the degenerate cases. Finally, we present a non-degenerate situation for deep neural networks that leads to new transformations leaving the network function intact.
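One of the two well-known transformations referred to above is the positive rescaling of a hidden ReLU unit: scaling its incoming weights by c > 0 and its outgoing weights by 1/c leaves the network function unchanged, because ReLU is positively homogeneous. A short numpy check of this fact (a generic illustration, not code from the paper):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def two_layer_net(x, W1, b1, W2, b2):
        return relu(x @ W1 + b1) @ W2 + b2

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
    W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)
    x = rng.normal(size=(10, 3))

    # Rescale one hidden unit: incoming weights and bias by c, outgoing by 1/c.
    c = 3.7
    W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
    W1s[:, 0] *= c
    b1s[0] *= c
    W2s[0, :] /= c
    print(np.allclose(two_layer_net(x, W1, b1, W2, b2),
                      two_layer_net(x, W1s, b1s, W2s, b2)))   # True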
10

Zheng, Shuxin, Qi Meng, Huishuai Zhang, Wei Chen, Nenghai Yu, and Tie-Yan Liu. "Capacity Control of ReLU Neural Networks by Basis-Path Norm." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5925–32. http://dx.doi.org/10.1609/aaai.v33i01.33015925.

Abstract:
Recently, the path norm was proposed as a new capacity measure for neural networks with the Rectified Linear Unit (ReLU) activation function, which takes the rescaling-invariant property of ReLU into account. It has been shown that the generalization error bound in terms of the path norm explains the empirical generalization behavior of ReLU neural networks better than that of other capacity measures. Moreover, optimization algorithms which take the path norm as a regularization term added to the loss function, like Path-SGD, have been shown to achieve better generalization performance. However, the path norm counts the values of all paths, and hence the capacity measure based on the path norm could be improperly influenced by the dependency among different paths. It is also known that each path of a ReLU network can be represented by a small group of linearly independent basis paths with multiplication and division operations, which indicates that the generalization behavior of the network depends on only a few basis paths. Motivated by this, we propose a new norm, the Basis-path Norm, based on a group of linearly independent paths, to measure the capacity of neural networks more accurately. We establish a generalization error bound based on this basis-path norm, and show that it explains the generalization behavior of ReLU networks more accurately than previous capacity measures via extensive experiments. In addition, we develop optimization algorithms which minimize the empirical risk regularized by the basis-path norm. Our experiments on benchmark datasets demonstrate that the proposed regularization method achieves clearly better performance on the test set than the previous regularization approaches.
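For orientation, the ordinary path norm mentioned above — the sum over all input-to-output paths of the absolute product of the weights along the path — is easy to compute for a bias-free two-layer ReLU network; the basis-path norm itself is defined in the paper and is not reproduced here. A small numpy sketch of the plain path norm:

    import numpy as np

    def path_norm_two_layer(W1, W2):
        """Ordinary path norm of a bias-free two-layer ReLU net: the sum over
        all input->hidden->output paths of |W1[j, i]| * |W2[i, k]|."""
        return np.sum(np.abs(W1) @ np.abs(W2))

    rng = np.random.default_rng(1)
    W1 = rng.normal(size=(4, 8))   # input dim 4, hidden dim 8
    W2 = rng.normal(size=(8, 3))   # hidden dim 8, output dim 3
    print(path_norm_two_layer(W1, W2))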
11

Liu, Bo, and Yi Liang. "Optimal function approximation with ReLU neural networks." Neurocomputing 435 (May 2021): 216–27. http://dx.doi.org/10.1016/j.neucom.2021.01.007.

12

Bodyanskiy, Yevgeniy, and Serhii Kostiuk. "Adaptive hybrid activation function for deep neural networks." System research and information technologies, no. 1 (April 25, 2022): 87–96. http://dx.doi.org/10.20535/srit.2308-8893.2022.1.07.

Abstract:
The adaptive hybrid activation function (AHAF) is proposed, which combines the properties of rectifier units and squashing functions. The proposed function can be used as a drop-in replacement for the ReLU, SiL and Swish activations in deep neural networks and can evolve into one of these functions during training. The effectiveness of the function was evaluated on the image classification task using the Fashion-MNIST and CIFAR-10 datasets. The evaluation shows that the neural networks with AHAF activations achieve better classification accuracy compared to their base implementations that use ReLU and SiL. A double-stage parameter tuning process for training the neural networks with AHAF is proposed. The proposed approach is sufficiently simple from the implementation standpoint and provides high performance for the neural network training process.
13

Moon, Sunghwan. "ReLU Network with Bounded Width Is a Universal Approximator in View of an Approximate Identity." Applied Sciences 11, no. 1 (January 4, 2021): 427. http://dx.doi.org/10.3390/app11010427.

Abstract:
Deep neural networks have shown very successful performance in a wide range of tasks, but a theory of why they work so well is still at an early stage. Recently, the expressive power of neural networks, important for understanding deep learning, has received considerable attention. Classic results, provided by Cybenko, Barron, etc., state that a network with a single hidden layer and suitable activation functions is a universal approximator. A few years ago, researchers began to study how width affects the expressiveness of neural networks, i.e., to seek a universal approximation theorem for deep neural networks with the Rectified Linear Unit (ReLU) activation function and bounded width. Here, we show how any continuous function on a compact subset of $\mathbb{R}^{n_{\mathrm{in}}}$, $n_{\mathrm{in}}\in\mathbb{N}$, can be approximated by a ReLU network having hidden layers with at most $n_{\mathrm{in}}+5$ nodes, in view of an approximate identity.
14

Bai, Yuhan. "RELU-Function and Derived Function Review." SHS Web of Conferences 144 (2022): 02006. http://dx.doi.org/10.1051/shsconf/202214402006.

Abstract:
The activation function plays an important role in training and improving the performance of deep neural networks (DNNs). The Rectified Linear Unit (ReLU) function provides the necessary nonlinear properties in a DNN. However, few papers sort out and compare the various ReLU-derived activation functions: most focus on the efficiency and accuracy of a particular activation function in a given model, but do not pay attention to the nature of, and differences between, these activation functions. Therefore, this paper organizes the ReLU function and its derived functions, and compares the accuracy of different ReLU-based functions (and their derivatives) on the MNIST dataset. From the experimental point of view, the ReLU function performs the best, while the SELU and ELU functions perform poorly.
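For reference, the ReLU-derived functions compared in reviews of this kind have simple closed forms. A plain numpy sketch of the usual definitions (the parameter values are the commonly used defaults, not taken from the cited paper):

    import numpy as np

    # Standard definitions of ReLU and some of its derived functions.
    def relu(x):
        return np.maximum(x, 0.0)

    def leaky_relu(x, alpha=0.01):
        return np.where(x > 0, x, alpha * x)

    def elu(x, alpha=1.0):
        return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

    def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
        return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

    x = np.linspace(-3, 3, 7)
    for f in (relu, leaky_relu, elu, selu):
        print(f.__name__, np.round(f(x), 3))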
15

Abdeljawad, Ahmed, and Philipp Grohs. "Approximations with deep neural networks in Sobolev time-space." Analysis and Applications 20, no. 03 (May 2022): 499–541. http://dx.doi.org/10.1142/s0219530522500014.

Abstract:
Solutions of evolution equations generally lie in certain Bochner–Sobolev spaces, in which the solution may have regularity and integrability properties for the time variable that differ from those for the space variables. Therefore, in this paper, we develop a framework that shows that deep neural networks can approximate Sobolev-regular functions with respect to Bochner–Sobolev spaces. In our work, we use the so-called Rectified Cubic Unit (ReCU) as the activation function in our networks. This activation function allows us to deduce approximation results for the neural networks while avoiding issues caused by the non-regularity of the most commonly used Rectified Linear Unit (ReLU) activation function.
16

He, Juncai. "ReLU Deep Neural Networks and Linear Finite Elements." Journal of Computational Mathematics 38, no. 3 (June 2020): 502–27. http://dx.doi.org/10.4208/jcm.1901-m2018-0160.

17

Dũng, Dinh, and Van Kien Nguyen. "Deep ReLU neural networks in high-dimensional approximation." Neural Networks 142 (October 2021): 619–35. http://dx.doi.org/10.1016/j.neunet.2021.07.027.

18

Dureja, Aman, and Payal Pahwa. "Analysis of Non-Linear Activation Functions for Classification Tasks Using Convolutional Neural Networks." Recent Patents on Computer Science 12, no. 3 (May 8, 2019): 156–61. http://dx.doi.org/10.2174/2213275911666181025143029.

Abstract:
Background: Activation functions play an important role in building deep neural networks, and the choice of activation function affects the network in terms of optimization and of the quality of the results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks has not been clearly identified. Objective: The primary objective of this analysis was to determine which activation function should be used at the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured using a dataset of two classes (Cat/Dog). The network used three convolutional layers, each followed by a pooling layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the next 2000 images for testing it. Results: The experimental comparison was carried out by analyzing the network with different activation functions (ReLU, Tanh, SELU, PReLU, ELU) at the hidden layers, recording the validation error and accuracy on the Cat/Dog dataset. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: It is found that a CNN model with ReLU at the hidden layers (three hidden layers here) gives the best results and improves overall performance in terms of accuracy and speed. These advantages of ReLU at the hidden layers of a CNN are helpful for effective and fast retrieval of images from databases.
19

Li, Qunwei, Shaofeng Zou, and Wenliang Zhong. "Learning Graph Neural Networks with Approximate Gradient Descent." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 8438–46. http://dx.doi.org/10.1609/aaai.v35i10.17025.

Abstract:
The first provably efficient algorithm for learning graph neural networks (GNNs) with one hidden layer for node information convolution is provided in this paper. Two types of GNNs are investigated, depending on whether labels are attached to nodes or graphs. A comprehensive framework for designing and analyzing convergence of GNN training algorithms is developed. The algorithm proposed is applicable to a wide range of activation functions including ReLU, Leaky ReLU, Sigmoid, Softplus and Swish. It is shown that the proposed algorithm guarantees a linear convergence rate to the underlying true parameters of GNNs. For both types of GNNs, sample complexity in terms of the number of nodes or the number of graphs is characterized. The impact of feature dimension and GNN structure on the convergence rate is also theoretically characterized. Numerical experiments are further provided to validate our theoretical analysis.
20

Sun, Yichen, Mingli Dong, Mingxin Yu, Jiabin Xia, Xu Zhang, Yuchen Bai, Lidan Lu, and Lianqing Zhu. "Nonlinear All-Optical Diffractive Deep Neural Network with 10.6 μm Wavelength for Image Classification." International Journal of Optics 2021 (February 27, 2021): 1–16. http://dx.doi.org/10.1155/2021/6667495.

Abstract:
A photonic artificial intelligence chip based on an optical neural network (ONN) offers low power consumption, low delay, and strong anti-interference ability. The all-optical diffractive deep neural network has recently demonstrated its inference capabilities on image classification tasks. However, the physical model has not been miniaturized and integrated, and optical nonlinearity has not been incorporated into the diffractive neural network. By introducing nonlinear characteristics into the network, complex tasks can be completed with high accuracy. In this study, a nonlinear all-optical diffractive deep neural network (N-D2NN) model operating at a 10.6 μm wavelength is constructed by combining the ONN and complex-valued neural networks, with a nonlinear activation function introduced into the structure. Specifically, improved variants of the rectified linear unit (ReLU), i.e., Leaky-ReLU, parametric ReLU (PReLU), and randomized ReLU (RReLU), are selected as the activation functions of the N-D2NN model. Numerical simulation shows that the N-D2NN model based on the 10.6 μm wavelength has excellent representation ability, which enables it to perform classification learning tasks on the MNIST handwritten digit dataset and the Fashion-MNIST dataset, respectively. The results show that the N-D2NN model with the RReLU activation function has the highest classification accuracy, 97.86% and 89.28%, respectively. These results provide a theoretical basis for the preparation of miniaturized and integrated N-D2NN photonic artificial intelligence chips.
21

Thakur, Amey. "Fundamentals of Neural Networks." International Journal for Research in Applied Science and Engineering Technology 9, no. VIII (August 15, 2021): 407–26. http://dx.doi.org/10.22214/ijraset.2021.37362.

Abstract:
The purpose of this study is to familiarise the reader with the foundations of neural networks. Artificial Neural Networks (ANNs) are algorithm-based systems that are modelled after Biological Neural Networks (BNNs). Neural networks are an effort to use the human brain's information processing skills to address challenging real-world AI issues. The evolution of neural networks and their significance are briefly explored. ANNs and BNNs are contrasted, and their qualities, benefits, and disadvantages are discussed. The drawbacks of the perceptron model and their improvement by the sigmoid neuron and ReLU neuron are briefly discussed. In addition, we give a bird's-eye view of the different Neural Network models. We study neural networks (NNs) and highlight the different learning approaches and algorithms used in Machine Learning and Deep Learning. We also discuss different types of NNs and their applications. A brief introduction to Neuro-Fuzzy and its applications with a comprehensive review of NN technological advances is provided.
22

Botoeva, Elena, Panagiotis Kouvaros, Jan Kronqvist, Alessio Lomuscio, and Ruth Misener. "Efficient Verification of ReLU-Based Neural Networks via Dependency Analysis." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3291–99. http://dx.doi.org/10.1609/aaai.v34i04.5729.

Abstract:
We introduce an efficient method for the verification of ReLU-based feed-forward neural networks. We derive an automated procedure that exploits dependency relations between the ReLU nodes, thereby pruning the search tree that needs to be considered by MILP-based formulations of the verification problem. We augment the resulting algorithm with methods for input domain splitting and symbolic interval propagation. We present Venus, the resulting verification toolkit, and evaluate it on the ACAS collision avoidance networks and models trained on the MNIST and CIFAR-10 datasets. The experimental results obtained indicate considerable gains over the present state-of-the-art tools.
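Symbolic interval propagation, one of the ingredients mentioned above, can be illustrated in its simplest (non-symbolic) form: pushing an elementwise input box through an affine layer followed by ReLU. The sketch below is a generic interval-arithmetic illustration, not the Venus toolkit:

    import numpy as np

    def affine_relu_bounds(lower, upper, W, b):
        """Propagate a box [lower, upper] through x -> relu(W @ x + b).
        Returns valid (generally loose) elementwise output bounds."""
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        pre_low = W_pos @ lower + W_neg @ upper + b   # smallest possible pre-activation
        pre_up = W_pos @ upper + W_neg @ lower + b    # largest possible pre-activation
        return np.maximum(pre_low, 0.0), np.maximum(pre_up, 0.0)

    # Example: bounds on one hidden layer for inputs in an L-infinity ball.
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(4, 3)), rng.normal(size=4)
    x0, eps = np.array([0.2, -0.1, 0.5]), 0.05
    low, up = affine_relu_bounds(x0 - eps, x0 + eps, W, b)
    print(low, up)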
23

Płaczek, Stanisław, and Aleksander Płaczek. "Learning algorithm analysis for deep neural network with ReLu activation functions." ITM Web of Conferences 19 (2018): 01009. http://dx.doi.org/10.1051/itmconf/20181901009.

Abstract:
In this article, emphasis is put on the modern artificial neural network structure, which in the literature is known as a deep neural network. The network includes more than one hidden layer and comprises many standard modules with the ReLU nonlinear activation function. The learning algorithm includes two standard steps, forward and backward, and its effectiveness depends on the way the learning error is transported back through all the layers to the first layer. Taking into account the dimensionalities of the matrices and the nonlinear characteristics of the ReLU activation function, the problem is very difficult from a theoretical point of view. To introduce simplifying assumptions into the analysis, formal formulas are used to describe the relations between the structure of every layer and its internal input vector. In practical tasks, the internal layer matrices of neural networks with ReLU activation functions contain many zero-valued weight coefficients. This phenomenon has a negative impact on the convergence of the learning algorithm. A theoretical analysis could help to build more effective algorithms.
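For a single hidden layer, the forward and backward steps discussed above reduce to a masking of the back-propagated error by the ReLU derivative. A generic numpy sketch (not the authors' formalism):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    rng = np.random.default_rng(0)
    x = rng.normal(size=(16, 10))          # batch of inputs
    W, b = rng.normal(size=(10, 6)), np.zeros(6)

    pre = x @ W + b                        # pre-activations
    h = relu(pre)                          # forward step

    grad_h = rng.normal(size=h.shape)      # error arriving from the layer above
    grad_pre = grad_h * (pre > 0)          # ReLU derivative: 1 where pre > 0, else 0
    grad_W = x.T @ grad_pre                # gradient w.r.t. this layer's weights
    grad_x = grad_pre @ W.T                # error transported to the previous layer
    print(grad_W.shape, grad_x.shape)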
24

He, Juncai, Lin Li, and Jinchao Xu. "ReLU deep neural networks from the hierarchical basis perspective." Computers & Mathematics with Applications 120 (August 2022): 105–14. http://dx.doi.org/10.1016/j.camwa.2022.06.006.

25

Dey, Santanu S., Guanyi Wang, and Yao Xie. "Approximation Algorithms for Training One-Node ReLU Neural Networks." IEEE Transactions on Signal Processing 68 (2020): 6696–706. http://dx.doi.org/10.1109/tsp.2020.3039360.

26

Liu, Wan-Wei, Fu Song, Tang-Hao-Ran Zhang, and Ji Wang. "Verifying ReLU Neural Networks from a Model Checking Perspective." Journal of Computer Science and Technology 35, no. 6 (November 2020): 1365–81. http://dx.doi.org/10.1007/s11390-020-0546-7.

27

Chieng, Hock Hung, Noorhaniza Wahid, Ong Pauline, and Sai Raj Kishore Perla. "Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning." International Journal of Advances in Intelligent Informatics 4, no. 2 (July 31, 2018): 76. http://dx.doi.org/10.26555/ijain.v4i2.249.

Abstract:
Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has become the default activation function across the deep learning community since 2012. Although ReLU is popular, its hard-zero property heavily hinders negative values from propagating through the network, so deep neural networks have not benefited from negative representations. In this work, an activation function called Flatten-T Swish (FTS) that leverages the benefit of negative values is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained on the MNIST dataset on five different deep fully connected neural networks (DFNNs) with depths varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T=-0.20 has the best overall performance. Compared with ReLU, FTS (T=-0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on the wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs, respectively. Apart from this, the study also notes that FTS converges twice as fast as ReLU. Although other existing activation functions were also evaluated, this study takes ReLU as the baseline activation function.
28

Salam, Abdulwahed, Abdelaaziz El Hibaoui, and Abdulgabbar Saif. "A comparison of activation functions in multilayer neural network for predicting the production and consumption of electricity power." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 1 (February 1, 2021): 163. http://dx.doi.org/10.11591/ijece.v11i1.pp163-170.

Abstract:
Predicting electricity power is an important task, which helps power utilities improve their systems' performance in terms of effectiveness, productivity, management and control. Several studies have introduced this task using three main model types: engineering, statistical and artificial intelligence. Based on experiments that used artificial intelligence models, the multilayer neural network model has proven its success in predicting many evaluation datasets. However, the performance of this model depends mainly on the type of activation function. Therefore, this paper introduces an experimental study investigating the performance of the multilayer neural network model with respect to different activation functions and different depths of hidden layers. The experiments cover a comparison among eleven activation functions on four benchmark electricity datasets. The activation functions under examination are sigmoid, hyperbolic tangent, SoftSign, SoftPlus, ReLU, Leaky ReLU, Gaussian, ELU, SELU, Swish and Adjust-Swish. Experimental results show that the ReLU and Leaky ReLU activation functions outperform their counterparts on all datasets.
29

Yuan, Xiaoyong, Zheng Feng, Matthew Norton, and Xiaolin Li. "Generalized Batch Normalization: Towards Accelerating Deep Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1682–89. http://dx.doi.org/10.1609/aaai.v33i01.33011682.

Abstract:
Utilizing recently introduced concepts from statistics and quantitative risk management, we present a general variant of Batch Normalization (BN) that offers accelerated convergence of Neural Network training compared to conventional BN. In general, we show that mean and standard deviation are not always the most appropriate choice for the centering and scaling procedure within the BN transformation, particularly if ReLU follows the normalization step. We present a Generalized Batch Normalization (GBN) transformation, which can utilize a variety of alternative deviation measures for scaling and statistics for centering, choices which naturally arise from the theory of generalized deviation measures and risk theory in general. When used in conjunction with the ReLU non-linearity, the underlying risk theory suggests natural, arguably optimal choices for the deviation measure and statistic. Utilizing the suggested deviation measure and statistic, we show experimentally that training is accelerated more so than with conventional BN, often with improved error rate as well. Overall, we propose a more flexible BN transformation supported by a complementary theoretical framework that can potentially guide design choices.
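The idea of replacing the mean and standard deviation in the normalization step can be sketched with a toy function that accepts an arbitrary centering statistic and deviation measure; this is only an illustration of the flexibility described above, not the authors' GBN transformation:

    import numpy as np

    def generalized_batch_norm(x, center_stat=np.median, deviation=None, eps=1e-5):
        """Toy batch normalization with a pluggable centering statistic and
        deviation measure (applied per feature over the batch dimension)."""
        if deviation is None:
            # Mean absolute deviation as an alternative to the standard deviation.
            deviation = lambda z: np.mean(np.abs(z - np.mean(z, axis=0)), axis=0)
        center = center_stat(x, axis=0)
        scale = deviation(x) + eps
        return (x - center) / scale

    rng = np.random.default_rng(0)
    x = rng.normal(size=(32, 8))
    print(generalized_batch_norm(x).std(axis=0).round(2))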
30

Butt, F. M., L. Hussain, S. H. M. Jafri, K. J. Lone, M. Alajmi, I. Abunadi, F. N. Al-Wesabi, and M. A. Hamza. "Optimizing Parameters of Artificial Intelligence Deep Convolutional Neural Networks (CNN) to improve Prediction Performance of Load Forecasting System." IOP Conference Series: Earth and Environmental Science 1026, no. 1 (May 1, 2022): 012028. http://dx.doi.org/10.1088/1755-1315/1026/1/012028.

Abstract:
Load forecasting is an approach implemented to foresee future load demand based on physical parameters such as loading on lines, temperature, losses, pressure and weather conditions. This study specifically aims to optimize the parameters of deep convolutional neural networks (CNN) to improve short-term load forecasting (STLF) and medium-term load forecasting (MTLF), i.e., one day, one week, one month and three months ahead. The models were tested on a real-world case by conducting detailed experiments to validate their stability and practicality. Performance was measured in terms of squared error, Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Mean Absolute Error (MAE). We optimized the parameters using three different cases. In the first case, we used a single layer with the Rectified Linear Unit (ReLU) activation function. In the second case, we used a double layer with ReLU – ReLU activation functions. In the third case, we used a double layer with ReLU – Sigmoid activation functions. The number of neurons in each case was 2, 4, 6, 8, 10 and 12. For one-day-ahead load forecasting, the lowest prediction error was yielded by the double layer with ReLU – Sigmoid activation functions. For one-week-ahead load forecasting, the lowest error was obtained with the single-layer ReLU activation function. Likewise, the double layer with ReLU – Sigmoid activation functions gave the lowest error for one-month-ahead forecasting, and also produced the lowest prediction error for three-months-ahead forecasting. The results reveal that optimizing the parameters further improved the ahead-forecasting performance, and that predicting the nonstationary and nonlinear dynamics of longer forecasting horizons requires a more complex activation function and more neurons. The results can be very useful in real-time implementation of this model to meet load demands and for further planning.
31

Lee, Seunghye, Qui X. Lieu, Thuc P. Vo, and Jaehong Lee. "Deep Neural Networks for Form-Finding of Tensegrity Structures." Mathematics 10, no. 11 (May 25, 2022): 1822. http://dx.doi.org/10.3390/math10111822.

Abstract:
Analytical paradigms have limited conventional form-finding methods of tensegrities; therefore, an innovative approach is urgently needed. This paper proposes a new form-finding method based on state-of-the-art deep learning techniques. One of the statical paradigms, a force density method, is substituted for trained deep neural networks to obtain necessary information of tensegrities. It is based on the differential evolution algorithm, where the eigenvalue decomposition process of the force density matrix and the process of the equilibrium matrix are not needed to find the feasible sets of nodal coordinates. Three well-known tensegrity examples including a 2D two-strut, a 3D-truncated tetrahedron and an icosahedron tensegrity are presented for numerical verifications. The cases of the ReLU and Leaky ReLU activation functions show better results than those of the ELU and SELU. Moreover, the results of the proposed method are in good agreement with the analytical super-stable lines. Three examples show that the proposed method exhibits more uniform final shapes of tensegrity, and much faster convergence history than those of the conventional one.
32

Akintunde, Michael E., Andreea Kevorchian, Alessio Lomuscio, and Edoardo Pirovano. "Verification of RNN-Based Neural Agent-Environment Systems." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6006–13. http://dx.doi.org/10.1609/aaai.v33i01.33016006.

Abstract:
We introduce agent-environment systems where the agent is stateful and executing a ReLU recurrent neural network. We define and study their verification problem by providing equivalences of recurrent and feed-forward neural networks on bounded execution traces. We give a sound and complete procedure for their verification against properties specified in a simplified version of LTL on bounded executions. We present an implementation and discuss the experimental results obtained.
33

Daróczy, Bálint. "Gaussian Perturbations in ReLU Networks and the Arrangement of Activation Regions." Mathematics 10, no. 7 (March 31, 2022): 1123. http://dx.doi.org/10.3390/math10071123.

Abstract:
Recent articles indicate that deep neural networks are efficient models for various learning problems. However, they are often highly sensitive to various changes that cannot be detected by an independent observer. As our understanding of deep neural networks with traditional generalisation bounds still remains incomplete, there are several measures which capture the behaviour of the model in case of small changes at a specific state. In this paper we consider Gaussian perturbations in the tangent space and suggest tangent sensitivity in order to characterise the stability of gradient updates. We focus on a particular kind of stability with respect to changes in parameters that are induced by individual examples without known labels. We derive several easily computable bounds and empirical measures for feed-forward fully connected ReLU (Rectified Linear Unit) networks and connect tangent sensitivity to the distribution of the activation regions in the input space realised by the network.
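The activation regions mentioned above are the pieces of input space on which the on/off pattern of the ReLU units is constant. A generic numpy sketch that samples inputs and counts the distinct patterns realised by a small network (an illustration of the concept, not the paper's tangent-sensitivity measure):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def activation_pattern(x, weights, biases):
        """Return the binary on/off pattern of every hidden ReLU unit for input x."""
        pattern = []
        h = x
        for W, b in zip(weights, biases):
            pre = h @ W + b
            pattern.append(pre > 0)
            h = relu(pre)
        return np.concatenate(pattern)

    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(2, 16)), rng.normal(size=(16, 16))]
    biases = [rng.normal(size=16), rng.normal(size=16)]

    # Count distinct activation patterns (i.e., activation regions hit) on random inputs.
    X = rng.uniform(-1, 1, size=(5000, 2))
    patterns = {activation_pattern(x, weights, biases).tobytes() for x in X}
    print(len(patterns))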
34

Zheng, Jing, Shuaishuai Shen, Tianqi Jiang, and Weiqiang Zhu. "Deep neural networks design and analysis for automatic phase pickers from three-component microseismic recordings." Geophysical Journal International 220, no. 1 (November 4, 2019): 323–34. http://dx.doi.org/10.1093/gji/ggz441.

Abstract:
It is essential to pick P-wave and S-wave arrival times rapidly and accurately for microseismic monitoring systems, yet it is not easy to identify the arrivals of a true phase automatically using traditional picking methods. This is one of the reasons many researchers are trying to introduce deep neural networks to solve these problems. Convolutional neural networks (CNNs) are very attractive for designing automatic phase pickers, especially after introducing the fundamental network structure from the semantic segmentation field, which can give probability outputs for every labelled phase at every sample in the recordings. The typical segmentation architecture consists of two main parts: (1) an encoder trained to extract coarse semantic features; (2) a decoder responsible not only for recovering the input resolution at the output but also for obtaining a sparse representation of the objects. The fundamental segmentation structure performs well; however, the influence of its parameters on the pickers has not been investigated, which means that the structure design depends only on experience and tests. In this paper, we address two main questions to give some guidance on network design. First, we show what sparse features are learned from three-component microseismic recordings using CNNs. Second, the influence of two key parameters, namely the depth of the decoder and the activation functions, on the pickers is analysed. Increasing the number of levels for a certain layer in the decoder increases the demand for trainable parameters but is beneficial to the accuracy of the model. A reasonable decoder depth can balance prediction accuracy against the demand for labelled data, which is important for microseismic systems because the manual labelling process decreases the real-time performance of monitoring tasks. The standard rectified linear unit (ReLU) and the leaky rectified linear unit (Leaky ReLU) with different negative slopes are compared in the analysis. Leaky ReLU with a small negative slope can improve the performance of a given model over the ReLU activation function by keeping some information about the negative parts.
35

Katz, Justin, Iosif Pappas, Styliani Avraamidou, and Efstratios N. Pistikopoulos. "The Integration of Explicit MPC and ReLU based Neural Networks." IFAC-PapersOnLine 53, no. 2 (2020): 11350–55. http://dx.doi.org/10.1016/j.ifacol.2020.12.544.

36

Schmidt-Hieber, Johannes. "Nonparametric regression using deep neural networks with ReLU activation function." Annals of Statistics 48, no. 4 (August 2020): 1875–97. http://dx.doi.org/10.1214/19-aos1875.

37

Ohn, Ilsang, and Yongdai Kim. "Smooth Function Approximation by Deep Neural Networks with General Activation Functions." Entropy 21, no. 7 (June 26, 2019): 627. http://dx.doi.org/10.3390/e21070627.

Abstract:
There has been a growing interest in the expressivity of deep neural networks. However, most of the existing work on this topic focuses only on specific activation functions such as ReLU or sigmoid. In this paper, we investigate the approximation ability of deep neural networks with a broad class of activation functions. This class includes most of the frequently used activation functions. We derive the required depth, width and sparsity of a deep neural network to approximate any Hölder smooth function up to a given approximation error for this large class of activation functions. Based on our approximation error analysis, we derive the minimax optimality of the deep neural network estimators with general activation functions in both regression and classification problems.
38

Opschoor, Joost A. A., Philipp C. Petersen, and Christoph Schwab. "Deep ReLU networks and high-order finite element methods." Analysis and Applications 18, no. 05 (February 21, 2020): 715–70. http://dx.doi.org/10.1142/s0219530519410136.

Abstract:
Approximation rate bounds for emulations of real-valued functions on intervals by deep neural networks (DNNs) are established. The approximation results are given for DNNs based on ReLU activation functions. The approximation error is measured with respect to Sobolev norms. It is shown that ReLU DNNs allow for essentially the same approximation rates as nonlinear, variable-order, free-knot (or so-called "$hp$-adaptive") spline approximations and spectral approximations, for a wide range of Sobolev and Besov spaces. In particular, exponential convergence rates in terms of the DNN size for univariate, piecewise Gevrey functions with point singularities are established. Combined with recent results on ReLU DNN approximation of rational, oscillatory, and high-dimensional functions, this corroborates that continuous, piecewise affine ReLU DNNs afford algebraic and exponential convergence rate bounds which are comparable to "best in class" schemes for several important function classes of high and infinite smoothness. Using composition of DNNs, we also prove that radial-like functions obtained as compositions of the above with the Euclidean norm and, possibly, anisotropic affine changes of co-ordinates can be emulated at exponential rate in terms of the DNN size and depth without the curse of dimensionality.
39

Ryffel, Théo, Pierre Tholoniat, David Pointcheval, and Francis Bach. "AriaNN: Low-Interaction Privacy-Preserving Deep Learning via Function Secret Sharing." Proceedings on Privacy Enhancing Technologies 2022, no. 1 (November 20, 2021): 291–316. http://dx.doi.org/10.2478/popets-2022-0015.

Abstract:
We propose AriaNN, a low-interaction privacy-preserving framework for private neural network training and inference on sensitive data. Our semi-honest 2-party computation protocol (with a trusted dealer) leverages function secret sharing, a recent lightweight cryptographic protocol that allows us to achieve an efficient online phase. We design optimized primitives for the building blocks of neural networks such as ReLU, MaxPool and BatchNorm. For instance, we perform private comparison for ReLU operations with a single message of the size of the input during the online phase, and with preprocessing keys close to 4× smaller than previous work. Last, we propose an extension to support n-party private federated learning. We implement our framework as an extensible system on top of PyTorch that leverages CPU and GPU hardware acceleration for cryptographic and machine learning operations. We evaluate our end-to-end system for private inference between distant servers on standard neural networks such as AlexNet, VGG16 or ResNet18, and for private training on smaller networks like LeNet. We show that computation rather than communication is the main bottleneck and that using GPUs together with reduced key size is a promising solution to overcome this barrier.
40

Boopathy, Akhilan, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, and Luca Daniel. "CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3240–47. http://dx.doi.org/10.1609/aaai.v33i01.33013240.

Abstract:
Verifying robustness of neural network classifiers has attracted great interest and attention due to the success of deep neural networks and their unexpected vulnerability to adversarial perturbations. Although finding the minimum adversarial distortion of neural networks (with ReLU activations) has been shown to be an NP-complete problem, obtaining a non-trivial lower bound on the minimum distortion as a provable robustness guarantee is possible. However, most previous works only focused on simple fully-connected layers (multilayer perceptrons) and were limited to ReLU activations. This motivates us to propose a general and efficient framework, CNN-Cert, that is capable of certifying robustness on general convolutional neural networks. Our framework is general – we can handle various architectures including convolutional layers, max-pooling layers, batch normalization layers, residual blocks, as well as general activation functions; our approach is efficient – by exploiting the special structure of convolutional layers, we achieve up to 17 and 11 times speed-up compared to the state-of-the-art certification algorithms (e.g. Fast-Lin, CROWN) and 366 times speed-up compared to the dual-LP approach, while our algorithm obtains similar or even better verification bounds. In addition, CNN-Cert generalizes state-of-the-art algorithms such as Fast-Lin and CROWN. We demonstrate by extensive experiments that our method outperforms state-of-the-art lower-bound-based certification algorithms in terms of both bound quality and speed.
41

Hatano, Naoya, Masahiro Ikeda, Isao Ishikawa, and Yoshihiro Sawano. "A Global Universality of Two-Layer Neural Networks with ReLU Activations." Journal of Function Spaces 2021 (September 17, 2021): 1–3. http://dx.doi.org/10.1155/2021/6637220.

Abstract:
In the present study, we investigate a universality of neural networks, which concerns a density of the set of two-layer neural networks in function spaces. There are many works that handle the convergence over compact sets. In the present paper, we consider a global convergence by introducing a norm suitably, so that our results will be uniform over any compact set.
42

Herrera, Oscar, and Belém Priego. "Wavelets as activation functions in Neural Networks." Journal of Intelligent & Fuzzy Systems 42, no. 5 (March 31, 2022): 4345–55. http://dx.doi.org/10.3233/jifs-219225.

Abstract:
Traditionally, only a few activation functions have been considered in neural networks, including bounded functions such as the threshold, sigmoid and hyperbolic tangent, as well as unbounded ones such as ReLU, GELU and Softplus for deep learning, but the search for new activation functions is still an open research area. In this paper, wavelets are reconsidered as activation functions in neural networks, and the performance of Gaussian-family wavelets (first, second and third derivatives) is studied together with other functions available in Keras-TensorFlow. Experimental results show how the combination of these activation functions can improve performance and support the idea of extending the list of activation functions to wavelets, which can be made available on high-performance platforms.
43

Ali, Mahmoud Emad Aldin, and Dinesh Kumar. "The Impact of Optimization Algorithms on The Performance of Face Recognition Neural Networks." Journal of Advanced Engineering and Computation 6, no. 4 (December 31, 2022): 248. http://dx.doi.org/10.55579/jaec.202264.370.

Abstract:
Face recognition has aroused great interest in a range of industries due to its practical applications. It is a biometric method used to identify and verify people by unique biological traits in a reliable and timely manner. Although iris and fingerprint recognition technologies are more accurate, face recognition technology is the most common and frequently utilized, since it is simple to deploy and execute and does not require any physical input from the user. This study compares neural networks using SGD, Adam, or L-BFGS-B optimizers, with different activation functions (Sigmoid, Tanh, or ReLU), and deep learning feature extraction methodologies including SqueezeNet, VGG19, and the Inception model. The Inception model outperforms SqueezeNet and VGG19 in terms of accuracy. Based on the findings with the Inception model, we achieved 93.6% accuracy in a neural network with four layers and forty neurons by utilizing the SGD optimizer with the ReLU activation function. We also noticed that using the ReLU activation function with any of the three optimizers achieved the best results based on the findings of the Inception model, as it achieved 93.6%, 89.1%, and 94% accuracy for the optimization algorithms SGD, Adam, and L-BFGS-B, respectively.
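The comparison grid described above (three optimizers crossed with three activation functions on a four-layer, forty-neuron network) can be reproduced in outline with scikit-learn, whose 'lbfgs' solver is built on L-BFGS-B. The sketch below uses a public digits dataset as a hypothetical stand-in, since the authors' face-recognition features and data are not part of this listing:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Small stand-in dataset for illustrating the optimizer/activation grid.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for solver in ("sgd", "adam", "lbfgs"):
        for activation in ("logistic", "tanh", "relu"):
            clf = MLPClassifier(hidden_layer_sizes=(40, 40, 40, 40),
                                activation=activation, solver=solver,
                                max_iter=2000, random_state=0)
            clf.fit(X_train, y_train)
            print(solver, activation, round(clf.score(X_test, y_test), 3))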
44

Yan, Zhiqi, Shisheng Zhong, Lin Lin, and Zhiquan Cui. "Adaptive Levenberg–Marquardt Algorithm: A New Optimization Strategy for Levenberg–Marquardt Neural Networks." Mathematics 9, no. 17 (September 6, 2021): 2176. http://dx.doi.org/10.3390/math9172176.

Abstract:
Engineering data are often highly nonlinear and contain high-frequency noise, so the Levenberg–Marquardt (LM) algorithm may not converge when a neural network optimized by the algorithm is trained with engineering data. In this work, we analyzed the reasons for the LM neural network’s poor convergence commonly associated with the LM algorithm. Specifically, the effects of different activation functions such as Sigmoid, Tanh, Rectified Linear Unit (RELU) and Parametric Rectified Linear Unit (PRLU) were evaluated on the general performance of LM neural networks, and special values of LM neural network parameters were found that could make the LM algorithm converge poorly. We proposed an adaptive LM (AdaLM) algorithm to solve the problem of the LM algorithm. The algorithm coordinates the descent direction and the descent step by the iteration number, which can prevent falling into the local minimum value and avoid the influence of the parameter state of LM neural networks. We compared the AdaLM algorithm with the traditional LM algorithm and its variants in terms of accuracy and speed in the context of testing common datasets and aero-engine data, and the results verified the effectiveness of the AdaLM algorithm.
45

Veness, Joel, Tor Lattimore, David Budden, Avishkar Bhoopchand, Christopher Mattern, Agnieszka Grabska-Barwinska, Eren Sezener, et al. "Gated Linear Networks." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 10015–23. http://dx.doi.org/10.1609/aaai.v35i11.17202.

Abstract:
This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target, forgoing the ability to learn feature representations in favor of rapid online learning. Individual neurons are able to model nonlinear functions via the use of data-dependent gating in conjunction with online convex optimization. We show that this architecture gives rise to universal learning capabilities in the limit, with effective model capacity increasing as a function of network size in a manner comparable with deep ReLU networks. Furthermore, we demonstrate that the GLN learning mechanism possesses extraordinary resilience to catastrophic forgetting, performing almost on par with an MLP with dropout and Elastic Weight Consolidation on standard benchmarks.
46

Inoue, Kenta. "Expressive Numbers of Two or More Hidden Layer ReLU Neural Networks." International Journal of Networking and Computing 10, no. 2 (2020): 293–307. http://dx.doi.org/10.15803/ijnc.10.2_293.

47

Petersen, Philipp, and Felix Voigtlaender. "Optimal approximation of piecewise smooth functions using deep ReLU neural networks." Neural Networks 108 (December 2018): 296–330. http://dx.doi.org/10.1016/j.neunet.2018.08.019.

48

Oostwal, Elisa, Michiel Straat, and Michael Biehl. "Hidden unit specialization in layered neural networks: ReLU vs. sigmoidal activation." Physica A: Statistical Mechanics and its Applications 564 (February 2021): 125517. http://dx.doi.org/10.1016/j.physa.2020.125517.

49

Schmidt-Hieber, Johannes. "Rejoinder: “Nonparametric regression using deep neural networks with ReLU activation function”." Annals of Statistics 48, no. 4 (August 2020): 1916–21. http://dx.doi.org/10.1214/19-aos1931.

50

Velasco, Lemuel Clark, John Frail Bongat, Ched Castillon, Jezreil Laurente, and Emily Tabanao. "Days-ahead water level forecasting using artificial neural networks for watersheds." Mathematical Biosciences and Engineering 20, no. 1 (2022): 758–74. http://dx.doi.org/10.3934/mbe.2023035.

Abstract:
Watersheds of tropical countries having only dry and wet seasons exhibit contrasting water level behaviour compared to countries having four seasons. With the changing climate, the ability to forecast the water level in watersheds enables decision-makers to come up with sound resource management interventions. This study presents a strategy for days-ahead water level forecasting models using an Artificial Neural Network (ANN) for watersheds. Water level data captured from a Water Level Monitoring Station (WLMS) and two Automatic Rain Gauge (ARG) sensors were prepared, divided into the two major seasons in the Philippines, and fed into multiple ANN models with different combinations of training algorithms, activation functions, and numbers of hidden neurons. The implemented ANN model for the rainy season, RPROP-Leaky ReLU, produced a MAPE of 6.731 and an RMSE of 0.00918, while the implemented ANN model for the dry season, SCG-Leaky ReLU, produced a MAPE of 7.871 and an RMSE of 0.01045. By conducting appropriate water level data correction, data transformation, and ANN model implementation, the error computation and assessment show the promising performance of ANN in days-ahead water level forecasting of watersheds in tropical countries.
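The two error metrics quoted above have standard definitions; a small numpy sketch with hypothetical water level values (not the study's data):

    import numpy as np

    def mape(y_true, y_pred):
        # Mean Absolute Percentage Error, in percent.
        return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

    def rmse(y_true, y_pred):
        # Root Mean Square Error.
        return np.sqrt(np.mean((y_true - y_pred) ** 2))

    # Hypothetical water level readings (metres) and model forecasts.
    y_true = np.array([1.20, 1.35, 1.10, 0.95, 1.05])
    y_pred = np.array([1.25, 1.30, 1.15, 1.00, 1.02])
    print(round(mape(y_true, y_pred), 3), round(rmse(y_true, y_pred), 5))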