Journal articles on the topic "Adaptive gradient methods with momentum"

Below are the top 50 journal articles for research on the topic "Adaptive gradient methods with momentum".

1

Abdulkadirov, R. I., and P. A. Lyakhov. "A new approach to training neural networks using natural gradient descent with momentum based on Dirichlet distributions." Computer Optics 47, no. 1 (February 2023): 160–69. http://dx.doi.org/10.18287/2412-6179-co-1147.

Abstract:
In this paper, we propose a natural gradient descent algorithm with momentum based on Dirichlet distributions to speed up the training of neural networks. This approach takes into account not only the direction of the gradients, but also the convexity of the minimized function, which significantly accelerates the process of searching for the extremes. Calculations of natural gradients based on Dirichlet distributions are presented, with the proposed approach introduced into an error backpropagation scheme. The results of image recognition and time series forecasting during the experiments show that the proposed approach gives higher accuracy and does not require a large number of iterations to minimize loss functions compared to the methods of stochastic gradient descent, adaptive moment estimation and adaptive parameter-wise diagonal quasi-Newton method for nonconvex stochastic optimization.
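
The abstract does not spell out the Dirichlet-based Fisher calculations, so purely as a rough illustration of the general scheme, here is a minimal sketch of natural gradient descent with a heavy-ball momentum term, assuming the Fisher information matrix is supplied externally (a fixed matrix A stands in for it below):

    import numpy as np

    def ngd_momentum_step(params, grad, fisher, velocity, lr=0.05, beta=0.9, eps=1e-8):
        # Natural gradient: precondition the raw gradient with the inverse
        # Fisher information matrix (solve a linear system instead of inverting).
        nat_grad = np.linalg.solve(fisher + eps * np.eye(len(params)), grad)
        velocity = beta * velocity - lr * nat_grad  # heavy-ball momentum
        return params + velocity, velocity

    # Toy run on the quadratic loss 0.5 * p^T A p, with A standing in for the Fisher.
    A = np.array([[3.0, 0.5], [0.5, 1.0]])
    p, v = np.array([5.0, -3.0]), np.zeros(2)
    for _ in range(200):
        p, v = ngd_momentum_step(p, A @ p, A, v)
    print(p)  # approaches the minimizer at the origin
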
2

Rehman, Muhammad Zubair, and Nazri Mohd Nawi. "Studying the effect of adaptive momentum in improving the accuracy of gradient descent back propagation algorithm on classification problems." International Journal of Modern Physics: Conference Series 09 (January 2012): 432–39. http://dx.doi.org/10.1142/s201019451200551x.

Abstract:
Despite being widely used in practical problems around the world, the Gradient Descent Back-propagation algorithm suffers from slow convergence and convergence to local minima. Previous researchers have suggested modifications to improve its convergence, such as careful selection of input weights and biases, learning rate, momentum, network topology, activation function and the value of 'gain' in the activation function. This research proposes an algorithm for improving the performance of back-propagation, 'Gradient Descent with Adaptive Momentum (GDAM)', which keeps the gain value fixed during all network trials. The performance of GDAM is compared with 'Gradient Descent with fixed Momentum (GDM)' and 'Gradient Descent Method with Adaptive Gain (GDM-AG)'. The learning rate is fixed to 0.4 and maximum epochs are set to 3000, while the sigmoid activation function is used for the experimentation. The results show that GDAM is a better approach than the previous methods, with an accuracy ratio of 1.0 on classification problems such as Wine Quality, Mushroom and Thyroid disease.
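
The abstract fixes the learning rate at 0.4 but does not state the momentum adaptation rule itself; the snippet below is only an illustrative stand-in (not the paper's rule) that raises the momentum while the training error keeps falling and cuts it when the error rises:

    def adapt_momentum(mom, prev_err, curr_err, up=1.05, down=0.5, mom_max=0.95):
        # Illustrative schedule: reward steady descent with more momentum,
        # damp oscillation when the error increases.
        if curr_err < prev_err:
            return min(mom * up, mom_max)
        return mom * down

    # Demo on a fake error trace; inside back-propagation the weight update
    # would be of the form: delta_w = -0.4 * grad_w + mom * delta_w_prev
    errors = [1.0, 0.8, 0.7, 0.75, 0.6, 0.5]
    mom = 0.5
    for prev, curr in zip(errors, errors[1:]):
        mom = adapt_momentum(mom, prev, curr)
    print(round(mom, 3))
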
3

Chen, Ruijuan, Xiaoquan Tang, and Xiuting Li. "Adaptive Stochastic Gradient Descent Method for Convex and Non-Convex Optimization." Fractal and Fractional 6, no. 12 (November 29, 2022): 709. http://dx.doi.org/10.3390/fractalfract6120709.

Abstract:
Stochastic gradient descent is the method of choice for solving large-scale optimization problems in machine learning. However, effectively selecting the step sizes in stochastic gradient descent methods is challenging and can greatly influence the performance of these algorithms. In this paper, we propose a class of faster adaptive gradient descent methods, named AdaSGD, for solving both convex and non-convex optimization problems. The novelty of this method is that it uses a new adaptive step size that depends on the expectation of the past stochastic gradient and its second moment, which makes it efficient and scalable for big data and high parameter dimensions. We show theoretically that the proposed AdaSGD algorithm has a convergence rate of O(1/T) in both convex and non-convex settings, where T is the maximum number of iterations. In addition, we extend the proposed AdaSGD to the case of momentum and obtain the same convergence rate for AdaSGD with momentum. To illustrate our theoretical results, several numerical experiments on problems arising in machine learning are conducted to verify the promise of the proposed method.
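
The exact AdaSGD step-size formula is not given in the abstract; the sketch below assumes exponential moving averages as the "expectation of the past stochastic gradient and its second moment" and uses an Adam-like normalization as a stand-in:

    import numpy as np

    def adasgd_step(x, grad, m, s, t, lr=0.1, beta=0.9, eps=1e-8):
        # Running estimates of E[g] and E[g^2], with bias correction,
        # build the adaptive step size.
        m = beta * m + (1 - beta) * grad
        s = beta * s + (1 - beta) * grad ** 2
        m_hat = m / (1 - beta ** (t + 1))
        s_hat = s / (1 - beta ** (t + 1))
        return x - lr * m_hat / (np.sqrt(s_hat) + eps), m, s

    # Toy run on f(x) = x^2 with noisy gradient samples.
    rng = np.random.default_rng(0)
    x, m, s = np.array([4.0]), np.zeros(1), np.zeros(1)
    for t in range(300):
        g = 2 * x + rng.normal(scale=0.1, size=1)
        x, m, s = adasgd_step(x, g, m, s, t)
    print(x)  # settles near the minimizer at 0
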
4

Zhang, Yue, Seong-Yoon Shin, Xujie Tan, and Bin Xiong. "A Self-Adaptive Approximated-Gradient-Simulation Method for Black-Box Adversarial Sample Generation." Applied Sciences 13, no. 3 (January 18, 2023): 1298. http://dx.doi.org/10.3390/app13031298.

Abstract:
Deep neural networks (DNNs) have been widely applied to a variety of everyday tasks. However, DNNs are sensitive to adversarial attacks which, by adding imperceptible perturbation samples to an original image, can easily alter the output. In state-of-the-art white-box attack methods, perturbation samples can successfully fool DNNs through the network gradient. However, these methods generate perturbation samples by considering only the sign information of the gradient and dropping the magnitude. Accordingly, gradients of different magnitudes may adopt the same sign to construct perturbation samples, resulting in inefficiency. Unfortunately, it is often impractical to acquire the gradient in real-world scenarios. Consequently, we propose a self-adaptive approximated-gradient-simulation method for black-box adversarial attacks (SAGM) to generate efficient perturbation samples. Our proposed method uses knowledge-based differential evolution to simulate gradients and the self-adaptive momentum gradient to generate adversarial samples. To estimate the efficiency of the proposed SAGM, a series of experiments were carried out on two datasets, namely MNIST and CIFAR-10. Compared to state-of-the-art attack techniques, our proposed method can quickly and efficiently search for perturbation samples to misclassify the original samples. The results reveal that the SAGM is an effective and efficient technique for generating perturbation samples.
5

Long, Sheng, Wei Tao, Shuohao Li, Jun Lei, and Jun Zhang. "On the Convergence of an Adaptive Momentum Method for Adversarial Attacks." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 13 (March 24, 2024): 14132–40. http://dx.doi.org/10.1609/aaai.v38i13.29323.

Abstract:
Adversarial examples are commonly created by solving a constrained optimization problem, typically using sign-based methods like Fast Gradient Sign Method (FGSM). These attacks can benefit from momentum with a constant parameter, such as Momentum Iterative FGSM (MI-FGSM), to enhance black-box transferability. However, the monotonic time-varying momentum parameter is required to guarantee convergence in theory, creating a theory-practice gap. Additionally, recent work shows that sign-based methods fail to converge to the optimum in several convex settings, exacerbating the issue. To address these concerns, we propose a novel method which incorporates both an innovative adaptive momentum parameter without monotonicity assumptions and an adaptive step-size scheme that replaces the sign operation. Furthermore, we derive a regret upper bound for general convex functions. Experiments on multiple models demonstrate the efficacy of our method in generating adversarial examples with human-imperceptible noise while achieving high attack success rates, indicating its superiority over previous adversarial example generation methods.
6

Zhang, Jiahui, Xinhao Yang, Ke Zhang, and Chenrui Wen. "An Adaptive Deep Learning Optimization Method Based on Radius of Curvature." Computational Intelligence and Neuroscience 2021 (November 10, 2021): 1–10. http://dx.doi.org/10.1155/2021/9882068.

Abstract:
An adaptive clamping method (SGD-MS) based on the radius of curvature is designed to alleviate the local optimal oscillation problem in deep neural networks; it combines the radius of curvature of the objective function with the gradient descent of the optimizer. The radius of curvature is used as the threshold to adaptively select either the momentum term or the future gradient moving average term. On this basis, we further propose an accelerated version (SGD-MA), which improves the convergence speed by using the method of aggregated momentum. Experimental results on several datasets show that the proposed methods effectively alleviate the local optimal oscillation problem and greatly improve the convergence speed and accuracy. A novel parameter updating algorithm for deep neural networks is thus provided in this paper.
7

Zang, Yu, Zhe Xue, Shilong Ou, Lingyang Chu, Junping Du, and Yunfei Long. "Efficient Asynchronous Federated Learning with Prospective Momentum Aggregation and Fine-Grained Correction." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 16642–50. http://dx.doi.org/10.1609/aaai.v38i15.29603.

Abstract:
Asynchronous federated learning (AFL) is a distributed machine learning technique that allows multiple devices to collaboratively train deep learning models without sharing local data. However, AFL suffers from low efficiency due to poor client model training quality and slow server model convergence speed, which are a result of the heterogeneous nature of both data and devices. To address these issues, we propose Efficient Asynchronous Federated Learning with Prospective Momentum Aggregation and Fine-Grained Correction (FedAC). Our framework consists of three key components. The first component is client weight evaluation based on temporal gradient, which evaluates the client weight based on the similarity between the client and server update directions. The second component is adaptive server update with prospective weighted momentum, which uses an asynchronous buffered update strategy and a prospective weighted momentum with adaptive learning rate to update the global model in server. The last component is client update with fine-grained gradient correction, which introduces a fine-grained gradient correction term to mitigate the client drift and correct the client stochastic gradient. We conduct experiments on real and synthetic datasets, and compare with existing federated learning methods. Experimental results demonstrate effective improvements in model training efficiency and AFL performance by our framework.
8

Liu, Miaomiao, Dan Yao, Zhigang Liu, Jingfeng Guo, and Jing Chen. "An Improved Adam Optimization Algorithm Combining Adaptive Coefficients and Composite Gradients Based on Randomized Block Coordinate Descent." Computational Intelligence and Neuroscience 2023 (January 10, 2023): 1–14. http://dx.doi.org/10.1155/2023/4765891.

Abstract:
An improved Adam optimization algorithm combining adaptive coefficients and composite gradients based on randomized block coordinate descent is proposed to address issues of the Adam algorithm such as slow convergence, the tendency to miss the global optimal solution, and ineffectiveness in processing high-dimensional vectors. First, the adaptive coefficient is used to adjust the gradient deviation value and correct the search direction. Then, the predicted gradient is introduced, and the current gradient and the first-order momentum are combined into a composite gradient to improve the global optimization ability. Finally, the randomized block coordinate method is used to determine the gradient update mode, which reduces the computational overhead. Simulation experiments on two standard classification datasets show that the convergence speed and accuracy of the proposed algorithm are higher than those of six gradient descent methods, while CPU and memory utilization are significantly reduced. In addition, based on logging data, BP neural networks optimized by the six algorithms are used to predict reservoir porosity. Results show that the proposed method has lower system overhead, higher accuracy, and stronger stability, with the absolute error of more than 86% of the data within 0.1%, further verifying its effectiveness.
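
The abstract names the three ingredients but not their coefficients; the heavily hedged sketch below shows one way a composite gradient and randomized block coordinate updates might be wired into Adam (lam and block are illustrative assumptions, not the paper's values):

    import numpy as np

    def rbc_adam_step(x, g, m, v, t, rng, lr=0.05, b1=0.9, b2=0.999,
                      lam=0.5, block=0.25, eps=1e-8):
        # Composite gradient: blend the current gradient with the first-order
        # momentum to stabilize the search direction.
        comp = lam * g + (1 - lam) * m
        m = b1 * m + (1 - b1) * comp
        v = b2 * v + (1 - b2) * comp ** 2
        m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
        # Randomized block coordinate descent: update a random subset of
        # coordinates per step to reduce the computational overhead.
        mask = rng.random(x.shape) < block
        return x - mask * lr * m_hat / (np.sqrt(v_hat) + eps), m, v

    rng = np.random.default_rng(1)
    x, m, v = np.full(8, 3.0), np.zeros(8), np.zeros(8)
    for t in range(1, 2001):
        x, m, v = rbc_adam_step(x, 2 * x, m, v, t, rng)  # f(x) = sum(x^2)
    print(np.round(x, 2))  # coordinates end up oscillating close to 0
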
9

Jiang, Shuoran, Qingcai Chen, Youcheng Pan, Yang Xiang, Yukang Lin, Xiangping Wu, Chuanyi Liu, and Xiaobao Song. "ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (March 24, 2024): 18363–71. http://dx.doi.org/10.1609/aaai.v38i16.29796.

Abstract:
Lowering the memory requirement of full-parameter training on large models has become a hot research area. MeZO fine-tunes large language models (LLMs) using only forward passes in a zeroth-order SGD optimizer (ZO-SGD), demonstrating excellent performance with the same GPU memory usage as inference. However, the simulated perturbation stochastic approximation used for gradient estimation in MeZO leads to severe oscillations and incurs a substantial time overhead. Moreover, without momentum regularization, MeZO shows severe over-fitting. Lastly, perturbation-irrelevant momentum on ZO-SGD does not improve the convergence rate. This study proposes ZO-AdaMU, which resolves the above problems by adapting the simulated perturbation with momentum in its stochastic approximation. Unlike existing adaptive momentum methods, we relocate the momentum onto the simulated perturbation in the stochastic gradient approximation. Our convergence analysis and experiments prove this is a better way to improve convergence stability and rate in ZO-SGD. Extensive experiments demonstrate that ZO-AdaMU yields better generalization for LLM fine-tuning across various NLP tasks than MeZO and its momentum variants.
10

Sineglazov, Victor, and Anatoly Kot. "Design of hybrid neural networks of the ensemble structure." Eastern-European Journal of Enterprise Technologies 1, no. 4 (109) (February 26, 2021): 31–45. http://dx.doi.org/10.15587/1729-4061.2021.225301.

Abstract:
This paper considers the structural-parametric synthesis (SPS) of neural networks (NNs) of deep learning, in particular convolutional neural networks (CNNs), which are used in image processing. It has been shown that modern neural networks may possess a variety of topologies, ensured by using unique blocks that determine their essential features, namely the compression and excitation unit, the attention module convolution unit, the channel attention module, the spatial attention module, the residual unit, and the ResNeXt block. This, first of all, is due to the need to increase their efficiency in the processing of images. Due to the large architectural space of parameters, including the type of unique block, its location in the structure of the convolutional neural network, and its connections with other blocks and layers, computing costs grow nonlinearly. To minimize computational costs while maintaining the specified accuracy, this work sets the tasks of both generating possible topologies and performing structural-parametric synthesis of convolutional neural networks. To resolve them, the use of a genetic algorithm (GA) has been proposed. Parameter configuration was implemented using a genetic algorithm and modern gradient methods (GM), for example stochastic gradient descent with momentum, Nesterov accelerated gradient, the adaptive gradient algorithm (AdaGrad), root mean square propagation (RMSProp), adaptive moment estimation (Adam), and adaptive Nesterov momentum (Nadam). Such networks are intended for use in an intelligent medical diagnostic system (IMDS) for determining the activity of tuberculosis. To improve the accuracy of solving the classification problem in the processing of images, the ensemble structure of hybrid convolutional neural networks (HCNNs) is proposed in the current work. The parallel structure of the ensemble with a merged layer was used. Algorithms for the optimal choice and integration of features in the construction of the ensemble have been developed.
11

Zhang, Jack, Guan Xiong Qiao, Alexandru Lopotenco, and Ian Tong Pan. "Understanding Stochastic Optimization Behavior at the Layer Update Level (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (June 28, 2022): 13109–10. http://dx.doi.org/10.1609/aaai.v36i11.21691.

Abstract:
Popular first-order stochastic optimization methods for deep neural networks (DNNs) are usually either accelerated schemes (e.g. stochastic gradient descent (SGD) with momentum) or adaptive step-size methods (e.g. Adam/AdaMax, AdaBelief). In many contexts, including image classification with DNNs, adaptive methods tend to generalize poorly compared to SGD, i.e. get stuck in non-robust local minima; however, SGD typically converges slower. We analyze possible reasons for this behavior by modeling gradient updates as vectors of random variables and comparing them to probabilistic bounds to identify "meaningful" updates. Through experiments, we observe that only layers close to the output have "definitely non-random" update behavior. In the future, the tools developed here may be useful in rigorously quantifying and analyzing intuitions about why some optimizers and particular DNN architectures perform better than others.
12

Zhang, Qikun, Yuzhi Zhang, Yanling Shao, Mengqi Liu, Jianyong Li, Junling Yuan, and Ruifang Wang. "Boosting Adversarial Attacks with Nadam Optimizer." Electronics 12, no. 6 (March 20, 2023): 1464. http://dx.doi.org/10.3390/electronics12061464.

Abstract:
Deep neural networks are extremely vulnerable to attacks and threats from adversarial examples. These adversarial examples deliberately crafted by attackers can easily fool classification models by adding imperceptibly tiny perturbations on clean images. This brings a great challenge to image security for deep learning. Therefore, studying and designing attack algorithms for generating adversarial examples is essential for building robust models. Moreover, adversarial examples are transferable in that they can mislead multiple different classifiers across models. This makes black-box attacks feasible for practical applications. However, most attack methods have low success rates and weak transferability against black-box models. This is because they often overfit the model during the production of adversarial examples. To address this issue, we propose a Nadam iterative fast gradient method (NAI-FGM), which combines an improved Nadam optimizer with gradient-based iterative attacks. Specifically, we introduce the look-ahead momentum vector and the adaptive learning rate component based on the Momentum Iterative Fast Gradient Sign Method (MI-FGSM). The look-ahead momentum vector is dedicated to making the loss function converge faster and get rid of the poor local maximum. Additionally, the adaptive learning rate component is used to help the adversarial example to converge to a better extreme point by obtaining adaptive update directions according to the current parameters. Furthermore, we also carry out different input transformations to further enhance the attack performance before using NAI-FGM for attack. Finally, we consider attacking the ensemble model. Extensive experiments show that the NAI-FGM has stronger transferability and black-box attack capability than advanced momentum-based iterative attacks. In particular, when using the adversarial examples produced by way of ensemble attack to test the adversarially trained models, the NAI-FGM improves the success rate by 8% to 11% over the other attack methods. Last but not least, the NAI-DI-TI-SI-FGM combined with the input transformation achieves a success rate of 91.3% on average.
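
For orientation, the MI-FGSM baseline that NAI-FGM extends accumulates an L1-normalized gradient into a momentum vector and steps along its sign; the look-ahead below mimics a Nadam-style update as a sketch, and the paper's adaptive learning-rate component is omitted. model_grad is a hypothetical callable returning the loss gradient with respect to the input:

    import numpy as np

    def momentum_attack_step(x, x_orig, g, model_grad, alpha=0.01, mu=0.9, eps_ball=0.03):
        grad = model_grad(x)
        grad = grad / (np.sum(np.abs(grad)) + 1e-12)  # L1-normalize, as in MI-FGSM
        g = mu * g + grad                             # accumulate momentum
        look_ahead = mu * g + grad                    # Nadam-style look-ahead (sketch)
        x = x + alpha * np.sign(look_ahead)
        # Project back into the eps-ball around the clean image and into [0, 1].
        x = np.clip(x, x_orig - eps_ball, x_orig + eps_ball)
        return np.clip(x, 0.0, 1.0), g

    # Toy demo: push a linear score w . x upward.
    w = np.array([0.5, -1.0, 0.25])
    x0 = np.array([0.2, 0.8, 0.5])
    x, g = x0.copy(), np.zeros(3)
    for _ in range(10):
        x, g = momentum_attack_step(x, x0, g, lambda z: w)
    print(x - x0)  # perturbation capped at +/- eps_ball per pixel
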
13

Yi, Dokkyun, Sangmin Ji, and Sunyoung Bu. "An Enhanced Optimization Scheme Based on Gradient Descent Methods for Machine Learning." Symmetry 11, no. 7 (July 20, 2019): 942. http://dx.doi.org/10.3390/sym11070942.

Abstract:
The learning process of machine learning consists of finding values of unknown weights in a cost function by minimizing the cost function based on learning data. However, since the cost function is not convex, it is difficult to find its minimum value. The existing methods used to find the minimum values usually use the first derivative of the cost function. When even a local minimum (but not a global minimum) is reached, the first derivative of the cost function becomes zero, so these methods return the local minimum value and the desired global minimum cannot be found. To overcome this problem, in this paper we modify one of the existing schemes, the adaptive moment estimation scheme, by adding a new term that prevents the new optimizer from staying at a local minimum. The convergence condition for the proposed scheme and the convergence value are also analyzed, and further explained through several numerical experiments whose cost function is non-convex.
14

Sun, Yunyun, Yutong Liu, Haocheng Zhou, and Huijuan Hu. "Plant Diseases Identification through a Discount Momentum Optimizer in Deep Learning." Applied Sciences 11, no. 20 (October 12, 2021): 9468. http://dx.doi.org/10.3390/app11209468.

Abstract:
Deep learning has proven promising in various domains, and the automatic identification of plant diseases with deep convolutional neural networks currently attracts a lot of attention. This article extends the stochastic gradient descent momentum optimizer and presents a discount momentum (DM) deep learning optimizer for plant disease identification. To examine the recognition and generalization capability of the DM optimizer, we discuss hyper-parameter tuning and convolutional neural network models on the PlantVillage dataset. We further conduct comparison experiments with popular non-adaptive learning rate methods. The proposed approach achieves an average validation accuracy of no less than 97% for plant disease prediction on several state-of-the-art deep learning models and shows low sensitivity to hyper-parameter settings. Experimental results demonstrate that the DM method brings higher identification performance while maintaining competitive training speed and generalization compared with other non-adaptive learning rate methods.
15

Koudounas, Alkis, and Simone Fiori. "Gradient-based Learning Methods Extended to Smooth Manifolds Applied to Automated Clustering." Journal of Artificial Intelligence Research 68 (August 17, 2020): 777–816. http://dx.doi.org/10.1613/jair.1.12192.

Abstract:
Grassmann manifold based sparse spectral clustering is a classification technique that consists in learning a sparse latent representation of data, formed by a subspace basis. In order to learn the latent representation, spectral clustering is formulated in terms of a loss minimization problem over a smooth manifold known as the Grassmannian. Such a minimization problem cannot be tackled by traditional gradient-based learning algorithms, which are only suitable for optimization in the absence of constraints among parameters. It is, therefore, necessary to develop specific optimization/learning algorithms that can efficiently look for a local minimum of a loss function under smooth constraints. Such a need calls for manifold optimization methods. In this paper, we extend classical gradient-based learning algorithms on flat parameter spaces (from classical gradient descent to adaptive momentum) to curved spaces (smooth manifolds) by means of tools from manifold calculus. We compare the clustering performances of these methods with known methods from the scientific literature. The obtained results confirm that the proposed learning algorithms prove lighter in computational complexity than existing ones without detriment to clustering efficacy.
16

Tchórzewski, Jerzy, and Tomasz Mielcarz. "Selection of an algorithm for classifying data quoted on the Day Ahead Market of TGE S.A. in MATLAB and Simulink using Deep Learning Toolbox." Studia Informatica. System and information technology 28, no. 1 (December 1, 2023): 83–108. http://dx.doi.org/10.34739/si.2023.28.05.

Abstract:
The article contains an analysis leading to the selection of an algorithm for classifying data quoted on the Day-Ahead Market of TGE S.A. in MATLAB and Simulink using Deep Learning Toolbox. In this regard, an introduction to deep learning methods, classification methods, and classification algorithms is provided first. Particular attention was paid to the essence of three important deep learning methods for classification, i.e. the methods called Stochastic Gradient Descent Momentum, Root Mean Square Prop and Adaptive Moment Estimation. Then, three architectures of artificial neural networks used in deep learning were characterized, i.e. the Deep Belief Network, the Convolutional Neural Network and the Recurrent Neural Network. Attention was paid to the selection parameters of algorithms for training deep artificial neural networks that can be used in classification, such as accuracy, information losses and learning time. Practical aspects of the research experiments were also shown, including selected results of research conducted on volume and fixing 1 data quoted on the TGE S.A. Day-Ahead Market. After analyzing the obtained test results for the hourly system, it was noted that the least suitable algorithm for classification purposes was the Stochastic Gradient Descent Momentum algorithm, which in each case performed worse than the other two algorithms, i.e. the Adaptive Moment Estimation algorithm and the Root Mean Square Prop algorithm. The best algorithm turned out to be the Adaptive Moment Estimation algorithm, which obtained the highest accuracy, at a level comparable to the Root Mean Square Prop algorithm, with the latter having larger losses.
17

Song, Ci. "The performance analysis of Adam and SGD in image classification and generation tasks." Applied and Computational Engineering 5, no. 1 (June 14, 2023): 757–63. http://dx.doi.org/10.54254/2755-2721/5/20230697.

Abstract:
Optimization problems hold a very important position in machine learning: a great deal of machine learning algorithms end up solving optimization problems. Among optimization algorithms, gradient methods are the simplest and most commonly used, compared to algorithms like Particle Swarm Optimization and Ant Colony Optimization. Among the gradient methods, Adaptive Moment Estimation (Adam) and stochastic gradient descent (SGD) are both outstanding algorithms that have helped solve all kinds of deep learning tasks. But which one is better under certain conditions is still unknown, which means programmers need to try many optimizers to make the best choice. Building on previous research, this paper studies the impact of L2 regularization and weight decay in Adam and SGD with momentum, and finds that in adaptive methods L2 regularization is not as effective as it is in SGD. This gives the intuition that SGD should outperform Adam in image classification tasks. However, this paper finds things go the other way around in an experiment using LeNet-5 on MNIST. In addition, this paper describes an experiment on Fashion-MNIST using DCGAN with both Adam and SGD as optimizers in the generator and discriminator. The result shows that the generator trained with SGD produces fake images of higher quality.
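
The L2-versus-weight-decay distinction the paper examines is easy to state in code: with Adam, an L2 penalty passes through the adaptive normalization, while decoupled decay (the AdamW form) does not. A minimal sketch:

    import numpy as np

    def adam_l2_step(w, g, m, v, t, lr=1e-3, lam=1e-2, b1=0.9, b2=0.999, eps=1e-8):
        # L2 regularization: the penalty gradient lam*w is rescaled per
        # coordinate by the adaptive denominator.
        g = g + lam * w
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        w = w - lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
        return w, m, v

    def adamw_step(w, g, m, v, t, lr=1e-3, lam=1e-2, b1=0.9, b2=0.999, eps=1e-8):
        # Decoupled weight decay: the shrinkage bypasses the normalization,
        # so it acts on all weights alike, as it does with plain SGD.
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        w = w - lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
        return w - lr * lam * w, m, v

With the adaptive normalization, the L2 penalty on a coordinate with large historical gradients is damped; decoupled decay shrinks all weights uniformly.
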
18

Sen, Alper, and Kutalmis Gumus. "Comparison of Different Parameters of Feedforward Backpropagation Neural Networks in DEM Height Estimation for Different Terrain Types and Point Distributions." Systems 11, no. 5 (May 19, 2023): 261. http://dx.doi.org/10.3390/systems11050261.

Abstract:
Digital Elevation Models (DEMs) are commonly used for environment, engineering, and architecture-related studies. One of the most important factors for the accuracy of DEM generation is the process of spatial interpolation, which is used for estimating the height values of the grid cells. The use of machine learning methods, such as artificial neural networks, contributes to more accurate spatial interpolation. In this study, the performances of FBNN interpolation based on different parameters, such as the number of hidden layers and neurons, epoch number, processing time, and training functions (gradient optimization algorithms), were compared, and the differences were evaluated statistically using an analysis of variance (ANOVA) test. This research offers significant insights into the optimization of neural network gradients, with a particular focus on spatial interpolation. The accuracy of the Levenberg–Marquardt training function was the best, whereas the worst, and most significantly different, training functions were gradient descent backpropagation and gradient descent with momentum and adaptive learning rate backpropagation. Thus, this study contributes to the investigation of ANN parameter selection for spatial interpolation in DEM height estimation for different terrain types and point distributions.
19

Jin, Yong, Yiwen Yang, Baican Yang, and Yunfu Zhang. "An Adaptive BP Neural Network Model for Teaching Quality Evaluation in Colleges and Universities." Wireless Communications and Mobile Computing 2021 (August 10, 2021): 1–7. http://dx.doi.org/10.1155/2021/4936873.

Abstract:
There is currently no fair, rational, or scientific approach for evaluating college teachers' teaching abilities. Mathematical methods are frequently used to measure the teaching capacity of college instructors in order to make the evaluation more scientific. Traditional statistical analysis evaluation models, fuzzy evaluation methods, grey decision methods, and the analytic hierarchy process (AHP) are only a few examples. Because teacher assessment is a nonlinear problem, even though the preceding methods have produced some positive results, they are vulnerable to some subjectivity. In this paper, an adaptive vector and momentum term are incorporated into a BP neural network trained by a gradient descent method to boost the model's convergence speed; the model is thoroughly researched to evaluate university teaching quality, and the network structure is streamlined to address the complex nonlinear problem of college and university teaching quality assessment. The model's comprehensive evaluation of teaching activities is then bolstered by the addition of new evaluation indexes to the existing ones.
20

Han, Bao Ru, Jing Bing Li, and Heng Yu Wu. "Tolerance Analog Circuit Hard Fault and Soft Fault Diagnosis Based on Particle Swarm Neural Network." Advanced Materials Research 712-715 (June 2013): 1965–69. http://dx.doi.org/10.4028/www.scientific.net/amr.712-715.1965.

Abstract:
This paper presents a hard fault and soft fault diagnosis method for tolerance analog circuits based on a BP neural network and the particle swarm optimization (PSO) algorithm. First, the mean square error function of the BP neural network is selected as the fitness function of the PSO algorithm. Second, instead of relying on gradient information to adjust the network weights and thresholds, the method exploits the parallel group search of the particle swarm algorithm to find more suitable weights and thresholds. The BP neural network is then trained with an adaptive learning rate and momentum BP algorithm. Finally, the network is applied to fault diagnosis of analog circuits, where it can diagnose circuit faults quickly and effectively.
21

An, Feng-Ping, Jun-e Liu, and Lei Bai. "Pedestrian Reidentification Algorithm Based on Deconvolution Network Feature Extraction-Multilayer Attention Mechanism Convolutional Neural Network." Journal of Sensors 2021 (January 7, 2021): 1–12. http://dx.doi.org/10.1155/2021/9463092.

Abstract:
Pedestrian reidentification is a key technology in large-scale distributed camera systems. It can quickly and efficiently detect and track target people in large-scale distributed surveillance networks. The existing traditional pedestrian reidentification methods have problems such as low recognition accuracy, low calculation efficiency, and weak adaptive ability. Pedestrian reidentification algorithms based on deep learning have been widely used in the field of pedestrian reidentification due to their strong adaptive ability and high recognition accuracy. However, the pedestrian recognition method based on deep learning has the following problems: first, during the learning process of the deep learning model, the initial value of the convolution kernel is usually randomly assigned, which makes the model learning process easily fall into a local optimum. The second is that the model parameter learning method based on the gradient descent method exhibits gradient dispersion. The third is that the information transfer of pedestrian reidentification sequence images is not considered. In view of these issues, this paper first examines the feature map matrix from the original image through a deconvolution neural network, uses it as a convolution kernel, and then performs layer-by-layer convolution and pooling operations. Then, the second derivative information of the error function is directly obtained without calculating the Hessian matrix, and the momentum coefficient is used to improve the convergence of the backpropagation, thereby suppressing the gradient dispersion phenomenon. At the same time, to solve the problem of information transfer of pedestrian reidentification sequence images, this paper proposes a memory network model based on a multilayer attention mechanism, which uses the network to effectively store image visual information and pedestrian behavior information, respectively. It can solve the problem of information transmission. Based on the above ideas, this paper proposes a pedestrian reidentification algorithm based on deconvolution network feature extraction-multilayer attention mechanism convolutional neural network. Experiments are performed on the related data sets using this algorithm and other major popular human reidentification algorithms. The results show that the pedestrian reidentification method proposed in this paper not only has strong adaptive ability but also has significantly improved average recognition accuracy and rank-1 matching rate compared with other mainstream methods.
22

Zhang, Lin, Yian Zhu, Xianchen Shi, and Xuesi Li. "A Situation Assessment Method with an Improved Fuzzy Deep Neural Network for Multiple UAVs." Information 11, no. 4 (April 4, 2020): 194. http://dx.doi.org/10.3390/info11040194.

Abstract:
To improve the intelligence and accuracy of Situation Assessment (SA) in complex scenes, this work develops an improved fuzzy deep neural network approach to situation assessment for multiple Unmanned Aerial Vehicles (UAVs). Firstly, this work normalizes the scene data based on time series and uses the normalized data as the input for an improved fuzzy deep neural network. Secondly, adaptive momentum and Elastic SGD (Elastic Stochastic Gradient Descent) are introduced into the training process of the neural network to improve the learning performance. Lastly, in the real-time situation assessment task for multiple UAVs, conventional methods often produce inaccurate results because they do not consider the fuzziness of task situations. This work uses an improved fuzzy deep neural network to calculate the results of situation assessment and normalizes these results. Then, the degree of trust of the current result, relative to each situation label, is calculated from the normalized results using fuzzy logic. Simulation results show that the proposed method outperforms competitors.
23

Wu, Xue-Ting, Jun-Ning Liu, Adel Alowaisy, Noriyuki Yasufuku, Ryohei Ishikura, and Meilani Adriyati. "Settlement Forecast of Marine Soft Soil Ground Improved with Prefabricated Vertical Drain-Assisted Staged Riprap Filling." Buildings 14, no. 5 (May 7, 2024): 1316. http://dx.doi.org/10.3390/buildings14051316.

Abstract:
By comparing different settlement forecast methods, eight methods were selected considering the creep of marine soft soils in this case study, including the Hyperbolic Method (HM), Exponential Curve Method (ECM), Pearl Growth Curve Modeling (PGCM), Gompertz Growth Curve Modeling (GGCM), Grey (1, 1) Model (GM), Grey Verhulst Model (GVM), Back Propagation of Artificial Neural Network (BPANN) with Levenberg–Marquardt Algorithm (BPLM), and BPANN with Gradient Descent of Momentum and Adaptive Learning Rate (BPGD). Taking Lingni Seawall soil ground improved with prefabricated vertical drain-assisted staged riprap filling as an example, forecasts of the short-term, medium-term, long-term, and final settlements at different locations of the soft ground were performed with the eight selected methods. The forecasting values were compared with each other and with the monitored data. When relative errors were between 0 and −1%, both the forecasting accuracy and engineering safety were appropriate and reliable. It was concluded that the appropriate forecast methods were different not only due to the time periods during the settlement process, but also the locations of soft ground. Among these methods, only BPGD was appropriate for all the time periods and locations, such as at the edge of the berm, and at the center of the berm and embankment.
24

Özaltın, Öznur, and Özgür Yeniay. "Detection of monkeypox disease from skin lesion images using Mobilenetv2 architecture." Communications Faculty of Sciences University of Ankara Series A1 Mathematics and Statistics 72, no. 2 (June 23, 2023): 482–99. http://dx.doi.org/10.31801/cfsuasmas.1202806.

Abstract:
Monkeypox has recently become an endemic disease that threatens the whole world. The most distinctive feature of this disease is the occurrence of skin lesions. However, skin lesions can also be seen in other diseases such as chickenpox, measles, and smallpox. The main aim of this study was to quickly distinguish monkeypox from these other diseases through deep learning approaches based on skin images. In this study, MobileNetv2 was used to determine whether an image showed monkeypox or not. A comprehensive analysis was performed to find the best data-splitting and optimization methods. The splitting methods included training and testing (70:30 and 80:20) and 10-fold cross-validation; the optimization methods were adaptive moment estimation (adam), root mean square propagation (rmsprop), and stochastic gradient descent momentum (sgdm). Then, MobileNetv2 was used as a deep feature extractor, and features were obtained from the global pooling layer. The Chi-Square feature selection method was used to reduce feature dimensions. Finally, the selected features were classified using a Support Vector Machine (SVM) with different kernel functions. In this study, 10-fold cross-validation and adam were found to be the best splitting and optimization methods, respectively, with an accuracy of 98.59%. Then, significant features were selected via the Chi-Square method, and when classifying 500 features with SVM, an accuracy of 99.69% was observed.
25

Gao, Yiping. "News Video Classification Model Based on ResNet-2 and Transfer Learning." Security and Communication Networks 2021 (December 16, 2021): 1–9. http://dx.doi.org/10.1155/2021/5865200.

Abstract:
A large amount of useful information is included in news video, and how to classify this information has become an important research topic in the field of multimedia technology. News videos are enormously informative, and manual classification methods are too time-consuming and vulnerable to subjective judgment. Developing an automated news video analysis and retrieval method has therefore become one of the most important research topics in current multimedia information systems. To this end, this paper proposes a news video classification model based on ResNet-2 and transfer learning. First, a model-based transfer method was adopted to transfer the commonality knowledge of the Inception-ResNet-v2 network pretrained on ImageNet, and a news video classification model was constructed. Then, a momentum update rule is introduced on the basis of the Adam algorithm, and an improved gradient descent method is proposed in order to obtain an optimal solution among the local minima of the function during learning. The experimental results show that the improved Adam algorithm can iteratively update the network weights through the adaptive learning rate to reach the fastest convergence. Compared with other convolutional neural network models, the modified Inception-ResNet-v2 network model achieves 91.47% classification accuracy on common news video datasets.
26

Li, Yanan, Xuebin Ren, Fangyuan Zhao, and Shusen Yang. "A Zeroth-Order Adaptive Learning Rate Method to Reduce Cost of Hyperparameter Tuning for Deep Learning." Applied Sciences 11, no. 21 (October 30, 2021): 10184. http://dx.doi.org/10.3390/app112110184.

Abstract:
Due to its powerful data representation ability, deep learning has dramatically improved the state of the art in many practical applications. However, its utility depends highly on the fine-tuning of hyper-parameters, including the learning rate, batch size, and network initialization. Although many first-order adaptive methods (e.g., Adam, Adagrad) have been proposed to adjust the learning rate based on gradients, they are susceptible to the initial learning rate and network architecture. Therefore, the main challenge of using deep learning in practice is how to reduce the cost of tuning hyper-parameters. To address this, we propose a heuristic zeroth-order learning rate method, Adacomp, which adaptively adjusts the learning rate based only on values of the loss function. The main idea is that Adacomp penalizes large learning rates to ensure convergence and compensates small learning rates to accelerate training. Therefore, Adacomp is robust to the initial learning rate. Extensive experiments, including comparisons with six typical adaptive methods (Momentum, Adagrad, RMSprop, Adadelta, Adam, and Adamax) on several benchmark image classification datasets (MNIST, KMNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100), were conducted. Experimental results show that Adacomp is robust not only to the initial learning rate but also to the network architecture, network initialization, and batch size.
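
Adacomp's exact penalty and compensation rules are not given in the abstract; the toy rule below only captures the stated idea (consult loss values alone, shrink the learning rate after an overshoot, grow it while the loss falls) under assumed factors up and down:

    def adacomp_like_lr(lr, loss_prev, loss_curr, up=1.1, down=0.5):
        # Zeroth-order rule: no gradients, only loss values are consulted.
        if loss_curr > loss_prev:
            return lr * down  # large step likely overshot: penalize
        return lr * up        # progress made: gently compensate

    # Demo on a fake loss trace.
    lr, losses = 0.1, [1.0, 0.9, 1.2, 0.8]
    for prev, curr in zip(losses, losses[1:]):
        lr = adacomp_like_lr(lr, prev, curr)
    print(lr)
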
27

Kim, Kyung-Soo, and Yong-Suk Choi. "HyAdamC: A New Adam-Based Hybrid Optimization Algorithm for Convolution Neural Networks." Sensors 21, no. 12 (June 12, 2021): 4054. http://dx.doi.org/10.3390/s21124054.

Abstract:
As the performance of devices that conduct large-scale computations has rapidly improved, various deep learning models have been successfully utilized in many applications. In particular, convolution neural networks (CNN) have shown remarkable performance in image processing tasks such as image classification and segmentation. Accordingly, more stable and robust optimization methods are required to train them effectively. However, the traditional optimizers used in deep learning still show unsatisfactory training performance for models with many layers and weights. Accordingly, in this paper, we propose a new Adam-based hybrid optimization method called HyAdamC for training CNNs effectively. HyAdamC uses three new velocity control functions to adjust its search strength carefully in terms of initial, short-term, and long-term velocities. Moreover, HyAdamC utilizes an adaptive coefficient computation method to prevent the search direction determined by the first momentum from being distorted by outlier gradients. These components are then combined into one hybrid method. In our experiments, HyAdamC showed not only notable test accuracies but also significantly stable and robust optimization behavior when training various CNN models. Furthermore, we found that HyAdamC can be applied not only to image classification but also to image segmentation tasks.
28

Liu, Yiqi, Longhua Yuan, Dong Li, Yan Li, and Daoping Huang. "Process Monitoring of Quality-Related Variables in Wastewater Treatment Using Kalman-Elman Neural Network-Based Soft-Sensor Modeling." Water 13, no. 24 (December 20, 2021): 3659. http://dx.doi.org/10.3390/w13243659.

Abstract:
Proper monitoring of quality-related but hard-to-measure effluent variables in wastewater plants is imperative. Soft sensors, such as dynamic neural networks, are widely used to predict and monitor these variables and then to optimize plant operations. However, the traditional training methods of dynamic neural networks may lead to poor local optima and low learning rates, resulting in inaccurate parameter estimates and prediction deviations. This study introduces a general Kalman-Elman method to monitor effluent qualities, such as biochemical oxygen demand (BOD), chemical oxygen demand (COD), and total nitrogen (TN). The method couples an Elman neural network with the square-root unscented Kalman filter (SR-UKF) to build a soft-sensor model. In the proposed methodology, adaptive noise estimation and weight constraining are introduced to estimate the unknown noise and constrain the parameter values. The main merits of the proposed approach include the following: first, improving the mapping accuracy of the model and overcoming the underprediction phenomena in data-driven process monitoring; second, implementing the parameter constraint and avoiding large weight values; and finally, providing a new way to update the parameters online. The proposed method is verified on a dataset from the University of California database (UCI database). The obtained results show that the proposed soft-sensor model achieved better prediction performance, with a root mean square error (RMSE) at least 50% better than the Elman network based on back propagation through time (Elman-BPTT), the Elman network based on momentum gradient descent (Elman-GDM), and the Elman network based on the Levenberg-Marquardt algorithm (Elman-LM). This method can give satisfactory predictions of quality-related effluent variables, with the largest correlation coefficient (R) being approximately 0.85 for output suspended solids (SS-S) and 0.95 for BOD and COD.
29

Lin, Rong-Ho, Benjamin Kofi Kujabi, Chun-Ling Chuang, Ching-Shun Lin, and Chun-Jen Chiu. "Application of Deep Learning to Construct Breast Cancer Diagnosis Model." Applied Sciences 12, no. 4 (February 13, 2022): 1957. http://dx.doi.org/10.3390/app12041957.

Abstract:
(1) Background: According to Taiwan's ministry of health statistics, the rate of breast cancer in women is increasing annually. Each year, more than 10,000 women suffer from breast cancer, and over 2000 die of the disease. The mortality rate is increasing annually, but if breast cancer tumors are detected earlier and appropriate treatment is provided immediately, the survival rate of patients increases enormously. (2) Methods: This research aimed to develop a stepwise breast cancer model architecture to improve diagnostic accuracy and reduce the misdiagnosis rate of breast cancer. In the first stage, a breast cancer risk factor dataset was utilized. After pre-processing, an Artificial Neural Network (ANN) and a support vector machine (SVM) were applied to the dataset to classify breast cancer tumors and compare their performances. The ANN achieved 76.6% classification accuracy, and the SVM using radial functions achieved the best classification accuracy of 91.6%. Therefore, the SVM was utilized in determining the results concerning the relevant breast cancer risk factors. In the second stage, we trained AlexNet, ResNet101, and InceptionV3 networks using transfer learning. The networks were trained using Adaptive Moment Estimation (ADAM) and Stochastic Gradient Descent with Momentum (SGDM) based optimization algorithms to diagnose benign and malignant tumors, and the results were evaluated. (3) Results: According to the results, AlexNet obtained 81.16%, ResNet101 85.51%, and InceptionV3 achieved a remarkable accuracy of 91.3%. The results of the three models were utilized in establishing a voting combination, and the soft-voting method was applied to average the prediction results, for which a test accuracy of 94.20% was obtained. (4) Conclusions: Despite the small number of images in this study, the accuracy is higher compared to other literature. The proposed method demonstrates the need for an additional productive tool in clinical settings when radiologists are evaluating mammography images of patients.
30

Liu, Hailiang, and Xuping Tian. "An Adaptive Gradient Method with Energy and Momentum." Annals of Applied Mathematics 38, no. 2 (June 2022): 183–222. http://dx.doi.org/10.4208/aam.oa-2021-0095.

31

Liu, Jian-Qiang, Da-Zheng Feng, and Wei-Wei Zhang. "Adaptive Improved Natural Gradient Algorithm for Blind Source Separation." Neural Computation 21, no. 3 (March 2009): 872–89. http://dx.doi.org/10.1162/neco.2008.07-07-562.

Abstract:
We propose an adaptive improved natural gradient algorithm for blind separation of independent sources. First, inspired by the well-known backpropagation algorithm, we incorporate a momentum term into the natural gradient learning process to accelerate the convergence rate and improve the stability. Then an estimation function for the adaptation of the separation model is obtained to adaptively control a step-size parameter and a momentum factor. The proposed natural gradient algorithm with variable step-size parameter and variable momentum factor is therefore particularly well suited to blind source separation in a time-varying environment, such as an abruptly changing mixing matrix or signal power. The expected improvement in the convergence speed, stability, and tracking ability of the proposed algorithm is demonstrated by extensive simulation results in both time-invariant and time-varying environments. The ability of the proposed algorithm to separate extremely weak or badly scaled sources is also verified. In addition, simulation results show that the proposed algorithm is suitable for separating mixtures of many sources (e.g., the number of sources is 10) in the complete case.
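
The natural-gradient separation rule with a momentum term can be sketched compactly; the paper's online adaptation of the step-size parameter and momentum factor is omitted here, and tanh is assumed as the score nonlinearity (appropriate for super-Gaussian sources):

    import numpy as np

    def natgrad_bss_step(W, x, dW_prev, eta=0.01, alpha=0.5):
        # One natural-gradient update of the separation matrix W with momentum.
        y = W @ x                                         # current source estimate
        nat = (np.eye(W.shape[0]) - np.outer(np.tanh(y), y)) @ W
        dW = eta * nat + alpha * dW_prev                  # momentum term
        return W + dW, dW

    # Toy usage: unmix a 2x2 instantaneous mixture of Laplacian sources.
    rng = np.random.default_rng(0)
    A = np.array([[1.0, 0.6], [0.4, 1.0]])                # unknown mixing matrix
    W, dW = np.eye(2), np.zeros((2, 2))
    for _ in range(20000):
        s = rng.laplace(size=2)                           # independent sources
        W, dW = natgrad_bss_step(W, A @ s, dW)
    print(np.round(W @ A, 2))  # should be close to a scaled permutation matrix
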
32

Liu, Guoqi, Zhiheng Zhou, Huiqiang Zhong, and Shengli Xie. "Gradient descent with adaptive momentum for active contour models." IET Computer Vision 8, no. 4 (August 2014): 287–98. http://dx.doi.org/10.1049/iet-cvi.2013.0089.

33

Hamid, Norhamreeza Abdul, Nazri Mohd Nawi, Rozaida Ghazali, and Mohd Najib Mohd Salleh. "Solving local minima problem in back propagation algorithm using adaptive gain, adaptive momentum and adaptive learning rate on classification problems." International Journal of Modern Physics: Conference Series 09 (January 2012): 448–55. http://dx.doi.org/10.1142/s2010194512005533.

Abstract:
This paper presents a new method to keep the back propagation algorithm from getting stuck in local minima and from the slow convergence caused by neuron saturation in the hidden layer. In the proposed algorithm, each training pattern has its own activation functions of neurons in the hidden layer, which are adjusted by adapting the gain parameters together with adaptive momentum and learning rate values during the learning process. The efficiency of the proposed algorithm is compared with conventional back propagation gradient descent and the current back propagation gradient descent with adaptive gain by means of simulations on three benchmark problems, namely iris, glass and thyroid.
34

Zhang, Wei Tang, and Shao Gang Huang. "Adaptive Neural Network for Image Edge Detection." Advanced Materials Research 524-527 (May 2012): 3792–96. http://dx.doi.org/10.4028/www.scientific.net/amr.524-527.3792.

Abstract:
This paper presents an adaptive neural network method for image edge detection. Starting from neural network theory, it gives the formulas of the adaptive neural network algorithm, quantifies the relationship between the momentum factor, the error, and the norm of the gradient of the weight vector, and provides the algorithm flow diagram. Through experiments, we reach the conclusion that using this adaptive neural network for image edge detection is feasible and that it has good generalization ability.
35

Ou, Shi-Feng, Ying Gao, and Xiao-Hui Zhao. "Stochastic Gradient Based Variable Momentum Factor Algorithm for Adaptive Whitening." Acta Automatica Sinica 38, no. 8 (2012): 1370. http://dx.doi.org/10.3724/sp.j.1004.2012.01370.

36

Xue, Liqi. "Research on SGD Algorithm Using Momentum Strategy." Applied and Computational Engineering 2, no. 1 (March 22, 2023): 141–50. http://dx.doi.org/10.54254/2755-2721/2/20220622.

Abstract:
With the continuous development of stochastic gradient descent algorithms, many efficient momentum algorithms have appeared. Stochastic gradient descent (SGD) is one of the classic algorithms in optimization, and its accelerated version, the SGD algorithm with a momentum strategy, has been a hot research topic in recent years. This paper therefore analyzes and summarizes this series of algorithms, starting with the classical momentum algorithm, and introduces some improved versions of the momentum algorithm. Numerical experiments on real problems are also conducted to evaluate the performance of these algorithms. They show that the addition of momentum and an adaptive learning rate effectively improves the performance of these algorithms. In future research, some cutting-edge momentum algorithms and other basic networks should be analyzed.
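
As a quick reference for the two canonical updates such surveys start from, here is a compact sketch of classical (heavy-ball) momentum and its Nesterov variant, which evaluates the gradient at a look-ahead point:

    import numpy as np

    def sgd_momentum(x, grad_fn, lr=0.01, beta=0.9, steps=200, nesterov=False):
        v = np.zeros_like(x)
        for _ in range(steps):
            point = x + beta * v if nesterov else x  # look ahead if Nesterov
            v = beta * v - lr * grad_fn(point)
            x = x + v
        return x

    # Both variants minimize f(x) = x^2; Nesterov typically damps oscillation.
    f_grad = lambda x: 2 * x
    print(sgd_momentum(np.array([5.0]), f_grad))
    print(sgd_momentum(np.array([5.0]), f_grad, nesterov=True))
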
37

Yaqub, Muhammad, Jinchao Feng, M. Sultan Zia, Kaleem Arshid, Kebin Jia, Zaka Ur Rehman, and Atif Mehmood. "State-of-the-Art CNN Optimizer for Brain Tumor Segmentation in Magnetic Resonance Images." Brain Sciences 10, no. 7 (July 3, 2020): 427. http://dx.doi.org/10.3390/brainsci10070427.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Brain tumors have become a leading cause of death around the globe, mainly because of the difficulty of conducting a timely diagnosis of the tumor. Fortunately, magnetic resonance images (MRI) are utilized to diagnose tumors in most cases. The performance of a Convolutional Neural Network (CNN) depends on many factors (i.e., weight initialization, optimization, batches and epochs, learning rate, activation function, loss function, and network topology), on data quality, and on specific combinations of these model attributes. When dealing with a segmentation or classification problem, utilizing a single optimizer is considered weak testing or validity unless the selection of that optimizer is backed by a strong argument; optimizer selection processes are therefore important to validate the usage of a single optimizer for these decision problems. In this paper, we provide a comprehensive comparative analysis of popular CNN optimizers to benchmark the segmentation for improvement. In detail, we perform a comparative analysis of 10 different state-of-the-art gradient descent-based optimizers, namely Adaptive Gradient (Adagrad), Adaptive Delta (AdaDelta), Stochastic Gradient Descent (SGD), Adaptive Momentum (Adam), Cyclic Learning Rate (CLR), Adaptive Max Pooling (Adamax), Root Mean Square Propagation (RMS Prop), Nesterov Adaptive Momentum (Nadam), and Nesterov accelerated gradient (NAG), for CNN. The experiments were performed on the BraTS2015 data set. The Adam optimizer achieved the best accuracy, 99.2%, in enhancing the CNN's ability in classification and segmentation.
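Since Adam comes out on top in this comparison, a minimal numpy sketch of the standard Adam update (bias-corrected first- and second-moment estimates, as commonly defined) may be helpful; it shows the update rule only, not the authors' segmentation pipeline.

    import numpy as np

    def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        # t is the 1-based step count, g the current gradient.
        m = b1 * m + (1 - b1) * g            # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g        # second-moment (uncentered variance)
        m_hat = m / (1 - b1 ** t)            # bias correction
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
        return w, m, v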
38

Liang, Dong, Fanfan Ma and Wenyan Li. "New Gradient-Weighted Adaptive Gradient Methods With Dynamic Constraints". IEEE Access 8 (2020): 110929–42. http://dx.doi.org/10.1109/access.2020.3002590.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Zhou, Bin, Li Gao and Yu-Hong Dai. "Gradient Methods with Adaptive Step-Sizes". Computational Optimization and Applications 35, no. 1 (31 March 2006): 69–86. http://dx.doi.org/10.1007/s10589-006-6446-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Tseng, Paul. "An Incremental Gradient(-Projection) Method with Momentum Term and Adaptive Stepsize Rule". SIAM Journal on Optimization 8, no. 2 (May 1998): 506–31. http://dx.doi.org/10.1137/s1052623495294797.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Shao, Hongmei, Dongpo Xu and Gaofeng Zheng. "Convergence of a Batch Gradient Algorithm with Adaptive Momentum for Neural Networks". Neural Processing Letters 34, no. 3 (22 July 2011): 221–28. http://dx.doi.org/10.1007/s11063-011-9193-x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Boffi, Nicholas M., and Jean-Jacques E. Slotine. "Implicit Regularization and Momentum Algorithms in Nonlinearly Parameterized Adaptive Control and Prediction". Neural Computation 33, no. 3 (March 2021): 590–673. http://dx.doi.org/10.1162/neco_a_01360.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Stable concurrent learning and control of dynamical systems is the subject of adaptive control. Despite being an established field with many practical applications and a rich theory, much of the development in adaptive control for nonlinear systems revolves around a few key algorithms. By exploiting strong connections between classical adaptive nonlinear control techniques and recent progress in optimization and machine learning, we show that there exists considerable untapped potential in algorithm development for both adaptive nonlinear control and adaptive dynamics prediction. We begin by introducing first-order adaptation laws inspired by natural gradient descent and mirror descent. We prove that when there are multiple dynamics consistent with the data, these non-Euclidean adaptation laws implicitly regularize the learned model. Local geometry imposed during learning may thus be used to select parameter vectors, out of the many that achieve perfect tracking or prediction, for desired properties such as sparsity. We apply this result to regularized dynamics predictor and observer design, and as concrete examples we consider Hamiltonian systems, Lagrangian systems, and recurrent neural networks. We subsequently develop a variational formalism based on the Bregman Lagrangian. We show that its Euler-Lagrange equations lead to natural gradient and mirror-descent-like adaptation laws with momentum, and we recover their first-order analogues in the infinite-friction limit. We illustrate our analyses with simulations demonstrating our theoretical results.
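Schematically, and with symbols chosen here purely for illustration, a Bregman Lagrangian behind such momentum adaptation laws takes the form

    \mathcal{L}(a,\dot a,t) = e^{\alpha_t+\gamma_t}\Big( D_\psi\big(a + e^{-\alpha_t}\dot a,\; a\big) - e^{\beta_t}\,\ell(a,t) \Big),

where D_\psi is the Bregman divergence of a convex potential \psi, \ell is a tracking or prediction loss, and \alpha_t, \beta_t, \gamma_t set the time scaling. Its Euler-Lagrange equations yield mirror-descent-like adaptation laws with momentum, and in the infinite-friction limit one recovers a first-order law of the form

    \frac{d}{dt}\,\nabla\psi(a) = -k\,\nabla_a \ell(a,t).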
43

Fang, Qionglin, and X. U. E. Han. "A Nonlinear Gradient Domain-Guided Filter Optimized by Fractional-Order Gradient Descent with Momentum RBF Neural Network for Ship Image Dehazing". Journal of Sensors 2021 (2 January 2021): 1–15. http://dx.doi.org/10.1155/2021/8864906.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
To avoid the blurred edges, noise, and halos caused by the guided image filtering algorithm, this paper proposes a nonlinear gradient domain-guided image filtering algorithm for image dehazing. To dynamically adjust the edge preservation and smoothness of dehazed images, it proposes a fractional-order gradient descent with momentum RBF neural network to optimize the nonlinear gradient domain-guided filtering (NGDGIF-FOGDMRBF), and its convergence is proved. To speed up convergence, an adaptive learning rate is used to adjust the training process. The results confirm the theoretical properties of the proposed algorithm, such as its monotonicity and convergence; the descending error curve of FOGDM is smoother than those of gradient descent and gradient descent with momentum. The influence of the regularization parameter is also analyzed and compared. Compared with the dark channel prior, histogram equalization, homomorphic filtering, and multiple exposure fusion, the generated halo and noise are significantly reduced, with a higher peak signal-to-noise ratio and structural similarity index.
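As a rough illustration of the fractional-order gradient descent with momentum (FOGDM) ingredient, one common Caputo-style discretization scales the gradient by a fractional power of the most recent weight increment; the helper below is an assumption for illustration and may differ from the paper's exact scheme.

    import numpy as np
    from math import gamma

    def fogdm_step(w, w_prev, g, v, alpha=0.9, lr=0.01, mom=0.5, eps=1e-8):
        # Caputo-style factor: |w - w_prev|^(1 - alpha) / Gamma(2 - alpha).
        frac_g = g * (np.abs(w - w_prev) + eps) ** (1 - alpha) / gamma(2 - alpha)
        v = mom * v - lr * frac_g        # momentum on the fractional gradient
        return w + v, v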
44

Arthur, C. K., V. A. Temeng and Y. Y. Ziggah. "Performance Evaluation of Training Algorithms in Backpropagation Neural Network Approach to Blast-Induced Ground Vibration Prediction". Ghana Mining Journal 20, no. 1 (7 July 2020): 20–33. http://dx.doi.org/10.4314/gm.v20i1.3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Backpropagation Neural Network (BPNN) is an artificial intelligence technique that has seen several applications in many fields of science and engineering. It is well known that the critical task in developing an effective and accurate BPNN model depends on an appropriate training algorithm, transfer function, number of hidden layers, and number of hidden neurons. Despite the numerous factors contributing to the development of a BPNN model, the training algorithm is key to achieving optimum performance. This study evaluates and compares the performance of 13 training algorithms in BPNN for the prediction of blast-induced ground vibration. The training algorithms considered include: Levenberg-Marquardt, Bayesian Regularisation, Broyden–Fletcher–Goldfarb–Shanno (BFGS) Quasi-Newton, Resilient Backpropagation, Scaled Conjugate Gradient, Conjugate Gradient with Powell/Beale Restarts, Fletcher-Powell Conjugate Gradient, Polak-Ribière Conjugate Gradient, One Step Secant, Gradient Descent with Adaptive Learning Rate, Gradient Descent with Momentum, Gradient Descent, and Gradient Descent with Momentum and Adaptive Learning Rate. Using ranking values for the performance indicators of Mean Squared Error (MSE), correlation coefficient (R), number of training epochs (iterations), and duration to convergence, the performance of the various training algorithms used to build the BPNN models was evaluated. The overall ranking results showed that the BFGS Quasi-Newton algorithm outperformed the other training algorithms, even though the Levenberg-Marquardt algorithm had the best computational speed and used the smallest number of epochs.
Keywords: Artificial Intelligence, Blast-induced Ground Vibration, Backpropagation Training Algorithms
45

Wanto, Anjar. "Prediksi Angka Partisipasi Sekolah dengan Fungsi Pelatihan Gradient Descent With Momentum & Adaptive LR". ALGORITMA : JURNAL ILMU KOMPUTER DAN INFORMATIKA 3, no. 1 (30 April 2019): 9. http://dx.doi.org/10.30829/algoritma.v3i1.4431.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
School Participation Rate (APS) is known as one of the indicators of the success of educational services development in Indonesian regions, whether province, regency, or city. The higher the School Participation Rate, the more successful the area is considered in providing access to education services. The purpose of this study is to predict School Participation Rates for the provinces of Indonesia, from Aceh to Papua. The prediction algorithm used is backpropagation with the gradient descent with momentum & adaptive LR (traingdx) training function. Traingdx is a network training function that updates weight and bias values based on gradient descent with momentum and an adaptive learning rate. The backpropagation algorithm usually uses the gradient descent (traingd) training function, but this study uses traingdx instead. The data used are School Participation Rates for each province in Indonesia in 2011-2017 for ages 19-24, taken from the Indonesian Central Bureau of Statistics (BPS); this age range was chosen because it is one of the factors that determine the success of education in a country, especially Indonesia. The study uses three network architecture models, namely 5-5-1, 5-15-1, and 5-25-1. Of the three, the best model is 5-5-1, with 130 iterations, 94% accuracy, and an MSE of 0.0008708473. This model is then used to predict School Participation Rates in each province of Indonesia over the next three years (2018-2020). These results are expected to help the Indonesian government further increase scholarships and improve the quality of education in the future.
Keywords: Prediction, APS, Backpropagation, Traingdx.
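The traingdx rule the study relies on, momentum combined with a learning rate that grows while the error keeps falling and shrinks when an update overshoots, can be sketched as follows; the parameter defaults mirror common descriptions of that training function, not this study's settings.

    import numpy as np

    def traingdx_like(grad_fn, loss_fn, w, lr=0.01, mom=0.9,
                      lr_inc=1.05, lr_dec=0.7, max_ratio=1.04, epochs=500):
        v = np.zeros_like(w)
        prev_loss = loss_fn(w)
        for _ in range(epochs):
            v_try = mom * v - lr * grad_fn(w)
            w_try = w + v_try
            loss = loss_fn(w_try)
            if loss > max_ratio * prev_loss:
                lr *= lr_dec              # overshoot: shrink the rate,
                v = np.zeros_like(w)      # suppress momentum, reject the step
            else:
                if loss < prev_loss:
                    lr *= lr_inc          # steady progress: grow the rate
                w, v, prev_loss = w_try, v_try, loss
        return w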
46

Yang, Yang, Lipo Mo, Yusen Hu and Fei Long. "The Improved Stochastic Fractional Order Gradient Descent Algorithm". Fractal and Fractional 7, no. 8 (18 August 2023): 631. http://dx.doi.org/10.3390/fractalfract7080631.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This paper proposes improved stochastic gradient descent (SGD) algorithms with a fractional-order gradient for the online optimization problem. For three scenarios (standard learning rate, adaptive-gradient learning rate, and momentum learning rate), three new SGD algorithms combining a fractional-order gradient are designed, and the corresponding regret functions are shown to converge at a sub-linear rate. The impact of the fractional order on convergence and monotonicity is then discussed, and it is proved that better performance can be obtained by adjusting the order of the fractional gradient. Finally, several practical examples verify the superiority and validity of the proposed algorithm.
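One of the three scenarios, the adaptive-gradient learning rate combined with a fractional-order gradient, can be sketched in a few lines; the fractional surrogate (a Caputo-style scaling by the last weight increment) and the Adagrad-style accumulator are illustrative assumptions, not the paper's exact design.

    import numpy as np
    from math import gamma

    def frac_sgd_adaptive_step(w, w_prev, g, G, alpha=0.9, lr=0.1, eps=1e-8):
        # Fractional-order gradient surrogate.
        frac_g = g * (np.abs(w - w_prev) + eps) ** (1 - alpha) / gamma(2 - alpha)
        G = G + frac_g ** 2                          # accumulated squared gradient
        w_new = w - lr * frac_g / (np.sqrt(G) + eps) # Adagrad-style damped step
        return w_new, G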
47

Frassoldati, Giacomo, Luca Zanni and Gaetano Zanghirati. "New adaptive stepsize selections in gradient methods". Journal of Industrial & Management Optimization 4, no. 2 (2008): 299–312. http://dx.doi.org/10.3934/jimo.2008.4.299.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Gong, Xiaolin, and Xiaoshuang Ding. "Adaptive CDKF Based on Gradient Descent With Momentum and its Application to POS". IEEE Sensors Journal 21, no. 14 (15 July 2021): 16201–12. http://dx.doi.org/10.1109/jsen.2021.3076071.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Shao, Hongmei, Dongpo Xu, Gaofeng Zheng and Lijun Liu. "Convergence of an online gradient method with inner-product penalty and adaptive momentum". Neurocomputing 77, no. 1 (February 2012): 243–52. http://dx.doi.org/10.1016/j.neucom.2011.09.003.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Han, Xiaohui, and Jianping Dong. "Applications of fractional gradient descent method with adaptive momentum in BP neural networks". Applied Mathematics and Computation 448 (July 2023): 127944. http://dx.doi.org/10.1016/j.amc.2023.127944.

Full text
APA, Harvard, Vancouver, ISO, and other styles
