Journal articles on the topic 'Selection of hyperparameters'

To see the other types of publications on this topic, follow the link: Selection of hyperparameters.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'Selection of hyperparameters.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Sun, Yunlei, Huiquan Gong, Yucong Li, and Dalin Zhang. "Hyperparameter Importance Analysis based on N-RReliefF Algorithm." International Journal of Computers Communications & Control 14, no. 4 (August 5, 2019): 557–73. http://dx.doi.org/10.15837/ijccc.2019.4.3593.

Full text
Abstract:
Hyperparameter selection has always been key to machine learning. The Bayesian optimization algorithm has recently achieved great success, but it has certain constraints and limitations in selecting hyperparameters. In response to these limitations, this paper proposes the N-RReliefF algorithm, which can evaluate the importance of individual hyperparameters and the importance weights between hyperparameters. The N-RReliefF algorithm estimates the contribution of a single hyperparameter to the performance according to the degree of influence of each hyperparameter on the performance, and calculates the importance weights between the hyperparameters according to an improved normalization formula. It analyses the set of hyperparameter configurations and performances generated by Bayesian optimization, and identifies the important hyperparameters of the random forest and SVM algorithms. The experimental results verify the effectiveness of the N-RReliefF algorithm.
APA, Harvard, Vancouver, ISO, and other styles
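To make the idea above concrete, here is a minimal sketch of an RReliefF-style importance estimate computed from hyperparameter configurations and their observed performances. The update rule and the data are illustrative stand-ins, not the paper's exact N-RReliefF formulas or its improved normalization.

```python
import numpy as np

def rrelieff_weights(X, y, n_neighbors=10):
    """Simplified RReliefF-style weights for hyperparameter importance.
    X holds hyperparameter configurations (one per row), y the observed
    model performances. A schematic sketch, not the paper's exact
    N-RReliefF update or its improved normalisation formula."""
    n, d = X.shape
    ndA = np.zeros(d)    # accumulated attribute differences
    ndC = 0.0            # accumulated performance differences
    ndCdA = np.zeros(d)  # accumulated joint differences
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbours = np.argsort(dist)[1:n_neighbors + 1]
        for j in neighbours:
            diff_a = np.abs(X[i] - X[j])    # per-hyperparameter difference
            diff_c = abs(y[i] - y[j])       # performance difference
            ndA += diff_a
            ndC += diff_c
            ndCdA += diff_c * diff_a
    total = n * n_neighbors
    return ndCdA / ndC - (ndA - ndCdA) / (total - ndC)

rng = np.random.default_rng(0)
X = rng.random((100, 3))    # 100 configurations of 3 hyperparameters
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0.0, 0.05, 100)
print(rrelieff_weights(X, y))   # first hyperparameter should dominate
```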
2

Bengio, Yoshua. "Gradient-Based Optimization of Hyperparameters." Neural Computation 12, no. 8 (August 1, 2000): 1889–900. http://dx.doi.org/10.1162/089976600300015187.

Full text
Abstract:
Many machine learning algorithms can be formulated as the minimization of a training criterion that involves a hyperparameter. This hyperparameter is usually chosen by trial and error with a model selection criterion. In this article we present a methodology to optimize several hyper-parameters, based on the computation of the gradient of a model selection criterion with respect to the hyperparameters. In the case of a quadratic training criterion, the gradient of the selection criterion with respect to the hyperparameters is efficiently computed by backpropagating through a Cholesky decomposition. In the more general case, we show that the implicit function theorem can be used to derive a formula for the hyper-parameter gradient involving second derivatives of the training criterion.
APA, Harvard, Vancouver, ISO, and other styles
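The hyperparameter gradient described in this abstract can be sketched as the standard implicit-function-theorem identity (the notation here is mine, not the paper's). Writing $T(\theta,\lambda)$ for the training criterion, $\hat\theta(\lambda)=\arg\min_\theta T(\theta,\lambda)$, and $C$ for the model selection criterion evaluated at $\hat\theta(\lambda)$:

```latex
\frac{\mathrm{d}C\bigl(\hat\theta(\lambda)\bigr)}{\mathrm{d}\lambda}
  = \frac{\partial C}{\partial \theta}\Big|_{\hat\theta}^{\!\top}
    \frac{\mathrm{d}\hat\theta}{\mathrm{d}\lambda},
\qquad
\frac{\mathrm{d}\hat\theta}{\mathrm{d}\lambda}
  = -\left(\frac{\partial^{2} T}{\partial \theta\,\partial \theta^{\top}}\right)^{-1}
    \frac{\partial^{2} T}{\partial \theta\,\partial \lambda}
    \;\Bigg|_{\theta=\hat\theta(\lambda)}
```

The second identity follows from differentiating the first-order condition $\partial T/\partial\theta = 0$ with respect to $\lambda$, which is where the second derivatives of the training criterion mentioned in the abstract enter.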
3

Lohvithee, Manasavee, Wenjuan Sun, Stephane Chretien, and Manuchehr Soleimani. "Ant Colony-Based Hyperparameter Optimisation in Total Variation Reconstruction in X-ray Computed Tomography." Sensors 21, no. 2 (January 15, 2021): 591. http://dx.doi.org/10.3390/s21020591.

Full text
Abstract:
In this paper, a computer-aided training method for hyperparameter selection in limited-data X-ray computed tomography (XCT) reconstruction was proposed. The proposed method employed the ant colony optimisation (ACO) approach to assist in hyperparameter selection for the adaptive-weighted projection-controlled steepest descent (AwPCSD) algorithm, which is a total-variation (TV) based regularisation algorithm. During the implementation, a colony of artificial ants swarmed through the AwPCSD algorithm. Each ant chose a set of hyperparameters required for its iterative CT reconstruction, and a correlation coefficient (CC) score was given for the reconstructed image compared to the reference image. A colony of ants in one generation left a pheromone along its chosen path, representing a choice of hyperparameters; a higher score meant stronger pheromones/probabilities to attract more ants in the next generations. At the end of the implementation, the hyperparameter configuration with the highest score was chosen as the optimal set of hyperparameters. In the experimental results section, the reconstruction using hyperparameters from the proposed method was compared with results from three other cases: the conjugate gradient least squares (CGLS) algorithm, the AwPCSD algorithm using a set of arbitrary hyperparameters, and the cross-validation method. The experiments showed that the results from the proposed method were superior to those of the CGLS algorithm and the AwPCSD algorithm using arbitrary hyperparameters. Although the results of the ACO algorithm were slightly inferior to those of the cross-validation method as measured by the quantitative metrics, the ACO algorithm was over 10 times faster than cross-validation. The optimal set of hyperparameters from the proposed method was also robust against an increase of noise in the data and is applicable to different imaging samples with a similar context. The ACO approach was able to identify optimal hyperparameter values for a dataset and, as a result, produced a good-quality reconstructed image from a limited number of projections. The proposed method successfully solves the problem of hyperparameter selection, a major challenge in implementing TV-based reconstruction algorithms.
APA, Harvard, Vancouver, ISO, and other styles
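A toy version of the pheromone mechanism described above can be written in a few lines. Everything here is a stand-in: the hyperparameter names and the `evaluate` function are placeholders for the AwPCSD reconstruction and its correlation-coefficient score, and the update rules are a generic ACO sketch rather than the paper's exact scheme.

```python
import random

# Hypothetical discretised search space for two placeholder hyperparameters.
space = {
    "lambda_tv": [0.001, 0.01, 0.1, 1.0],
    "step_size": [0.1, 0.2, 0.5],
}

def evaluate(config):
    """Placeholder for running a reconstruction with `config` and
    returning its correlation-coefficient score in [0, 1]."""
    return random.random()

# One pheromone value per candidate value of each hyperparameter.
pheromone = {k: [1.0] * len(v) for k, v in space.items()}
rho, n_ants, n_generations = 0.1, 10, 20

def pick(name):
    weights = pheromone[name]
    return random.choices(range(len(weights)), weights=weights)[0]

best, best_score = None, -1.0
for _ in range(n_generations):
    for _ant in range(n_ants):
        idx = {k: pick(k) for k in space}
        config = {k: space[k][i] for k, i in idx.items()}
        score = evaluate(config)
        for k, i in idx.items():            # deposit pheromone on the path
            pheromone[k][i] += score
        if score > best_score:
            best, best_score = config, score
    for k in pheromone:                     # evaporation
        pheromone[k] = [(1 - rho) * p for p in pheromone[k]]

print(best, best_score)
```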
4

Adewole, Ayoade I., and Olusoga A. Fasoranbaku. "Determination of Quantile Range of Optimal Hyperparameters Using Bayesian Estimation." Tanzania Journal of Science 47, no. 3 (August 13, 2021): 988–98. http://dx.doi.org/10.4314/tjs.v47i3.10.

Full text
Abstract:
Bayesian estimation has the advantage of taking into account the uncertainty of all parameter estimates, which virtually allows the use of vague priors. This study focused on determining the quantile range at which the optimal hyperparameters of normally distributed data with vague information could be obtained in Bayesian estimation of linear regression models. A Monte Carlo simulation approach was used to generate a dataset of sample size 200. Observation precisions and posterior precisions were estimated from the regression output to determine the posterior mean estimates for each model and derive the new dependent variables. The variances were divided into 10 equal parts to obtain the hyperparameters of the prior distribution. The average absolute deviation for model selection was used to validate the adequacy of each model. The study revealed that the optimal hyperparameters were located at the 5th and 7th deciles. The research simplifies the process of selecting the hyperparameters of a prior distribution from data with vague information in empirical Bayesian inference. Keywords: Optimal Hyperparameters; Quantile Ranges; Bayesian Estimation; Vague Prior
APA, Harvard, Vancouver, ISO, and other styles
5

Johnson, Kara Layne, and Nicole Bohme Carnegie. "Calibration of an Adaptive Genetic Algorithm for Modeling Opinion Diffusion." Algorithms 15, no. 2 (January 28, 2022): 45. http://dx.doi.org/10.3390/a15020045.

Full text
Abstract:
Genetic algorithms mimic the process of natural selection in order to solve optimization problems with minimal assumptions and perform well when the objective function has local optima on the search space. These algorithms treat potential solutions to the optimization problem as chromosomes, consisting of genes which undergo biologically-inspired operators to identify a better solution. Hyperparameters or control parameters determine the way these operators are implemented. We created a genetic algorithm in order to fit a DeGroot opinion diffusion model using limited data, making use of selection, blending, crossover, mutation, and survival operators. We adapted the algorithm from a genetic algorithm for design of mixture experiments, but the new algorithm required substantial changes due to model assumptions and the large parameter space relative to the design space. In addition to introducing new hyperparameters, these changes mean the hyperparameter values suggested for the original algorithm cannot be expected to result in optimal performance. To make the algorithm for modeling opinion diffusion more accessible to researchers, we conduct a simulation study investigating hyperparameter values. We find the algorithm is robust to the values selected for most hyperparameters and provide suggestions for initial, if not default, values and recommendations for adjustments based on algorithm output.
APA, Harvard, Vancouver, ISO, and other styles
6

Raji, Ismail Damilola, Habeeb Bello-Salau, Ime Jarlath Umoh, Adeiza James Onumanyi, Mutiu Adesina Adegboye, and Ahmed Tijani Salawudeen. "Simple Deterministic Selection-Based Genetic Algorithm for Hyperparameter Tuning of Machine Learning Models." Applied Sciences 12, no. 3 (January 24, 2022): 1186. http://dx.doi.org/10.3390/app12031186.

Full text
Abstract:
Hyperparameter tuning is a critical function necessary for the effective deployment of most machine learning (ML) algorithms. It is used to find the optimal hyperparameter settings of an ML algorithm in order to improve its overall output performance. To this effect, several optimization strategies have been studied for fine-tuning the hyperparameters of many ML algorithms, especially in the absence of model-specific information. However, because most ML training procedures need a significant amount of computational time and memory, it is frequently necessary to build an optimization technique that converges within a small number of fitness evaluations. As a result, a simple deterministic selection genetic algorithm (SDSGA) is proposed in this article. The SDSGA was realized by ensuring that both chromosomes and their accompanying fitness values in the original genetic algorithm are selected in an elitist-like way. We assessed the SDSGA over a variety of mathematical test functions. It was then used to optimize the hyperparameters of two well-known machine learning models, namely, the convolutional neural network (CNN) and the random forest (RF) algorithm, with application on the MNIST and UCI classification datasets. The SDSGA’s efficiency was compared to that of the Bayesian Optimization (BO) and three other popular metaheuristic optimization algorithms (MOAs), namely, the genetic algorithm (GA), particle swarm optimization (PSO) and biogeography-based optimization (BBO) algorithms. The results obtained reveal that the SDSGA performed better than the other MOAs in solving 11 of the 17 known benchmark functions considered in our study. While optimizing the hyperparameters of the two ML models, it performed marginally better in terms of accuracy than the other methods while taking less time to compute.
APA, Harvard, Vancouver, ISO, and other styles
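The elitist-like deterministic selection described above can be sketched as follows; this is a generic illustration of rank-based truncation selection, not the authors' exact SDSGA operator.

```python
def deterministic_selection(population, fitnesses, n_parents):
    """Elitist-like deterministic selection: rank chromosomes by
    fitness and keep the top slice, rather than sampling
    proportionally to fitness as a roulette wheel would."""
    ranked = sorted(zip(fitnesses, population),
                    key=lambda pair: pair[0], reverse=True)
    return [chromosome for _, chromosome in ranked[:n_parents]]

# Toy usage: the two fittest of four candidate chromosomes survive.
parents = deterministic_selection([[0, 1], [1, 1], [1, 0], [0, 0]],
                                  [0.6, 0.9, 0.4, 0.8], n_parents=2)
```

Replacing stochastic roulette-wheel sampling with this deterministic top slice keeps the best chromosomes and their fitness values from one generation to the next, which is the elitist behaviour the abstract describes.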
7

Lu, Wanjie, Hongpeng Mao, Fanhao Lin, Zilin Chen, Hua Fu, and Yaosong Xu. "Recognition of rolling bearing running state based on genetic algorithm and convolutional neural network." Advances in Mechanical Engineering 14, no. 4 (April 2022): 168781322210956. http://dx.doi.org/10.1177/16878132221095635.

Full text
Abstract:
In this study, the GA-CNN model is proposed to realize the automatic recognition of the rolling bearing running state. Firstly, to avoid over-fitting and gradient dispersion in the training process of the CNN model, the BN layer and Dropout technology are introduced into the LeNet-5 model. Secondly, to obtain automatic selection of the hyperparameters in the CNN model, a hyperparameter selection method combined with the genetic algorithm (GA) is proposed. In the proposed method, each hyperparameter is encoded as a chromosome, and each hyperparameter has a mapping relationship with the corresponding gene position on the chromosome. After the process of chromosome selection, crossover, and variation, a fitness value is calculated to indicate the superiority of the current chromosome. Chromosomes with high fitness values are more likely to be selected in the next genetic iteration; in this way, the optimal hyperparameters of the CNN model are obtained. Then, vibration signals from CWRU are used for time-frequency analysis, and the obtained time-frequency image set is used to train and test the proposed GA-CNN model. The accuracy of the proposed model reaches 99.85% on average, and the training speed is four times faster than that of the LeNet-5 model. Finally, experiments on a laboratory test platform confirm the superiority of the method and the transferability of the optimized model.
APA, Harvard, Vancouver, ISO, and other styles
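The chromosome-to-hyperparameter mapping described above might look like the following sketch; the gene map and the value choices are hypothetical, not the paper's actual search space.

```python
import random

# Hypothetical gene map: each chromosome position corresponds to one
# CNN hyperparameter (names and candidate values are illustrative).
GENE_MAP = [
    ("n_filters",     [8, 16, 32, 64]),
    ("kernel_size",   [3, 5, 7]),
    ("dropout_rate",  [0.1, 0.2, 0.3, 0.5]),
    ("learning_rate", [1e-4, 1e-3, 1e-2]),
]

def random_chromosome():
    """A chromosome is a list of gene indices, one per hyperparameter."""
    return [random.randrange(len(choices)) for _, choices in GENE_MAP]

def decode(chromosome):
    """Map a list of gene indices to a concrete hyperparameter dict."""
    return {name: choices[gene]
            for (name, choices), gene in zip(GENE_MAP, chromosome)}

print(decode(random_chromosome()))  # e.g. {'n_filters': 32, 'kernel_size': 5, ...}
```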
8

Han, Junjie, Cedric Gondro, and Juan Steibel. "98 Using differential evolution to improve predictive accuracy of deep learning models applied to pig production data." Journal of Animal Science 98, Supplement_3 (November 2, 2020): 27. http://dx.doi.org/10.1093/jas/skaa054.048.

Full text
Abstract:
Deep learning (DL) is being used for prediction in precision livestock farming and in genomic prediction. However, optimizing hyperparameters in DL models is critical for their predictive performance. Grid search is the traditional approach to select hyperparameters in DL, but it requires exhaustive search over the parameter space. We propose hyperparameter selection using differential evolution (DE), which is a heuristic algorithm that does not require exhaustive search. The goal of this study was to design and apply DE to optimize hyperparameters of DL models for genomic prediction and image analysis in pig production systems. One dataset consisted of 910 pigs genotyped with 28,916 SNP markers to predict their post-mortem meat pH. Another dataset consisted of 1,334 images of pigs eating inside a single-spaced feeder classified as: “single pig” or “multiple pigs.” The accuracy of genomic prediction was defined as the correlation between the predicted pH and the observed pH. The image classification prediction accuracy was the proportion of correctly classified images. For genomic prediction, a multilayer perceptron (MLP) was optimized. For image classification, MLP and convolutional neural networks (CNN) were optimized. For genomic prediction, the initial hyperparameter set resulted in an accuracy of 0.032 and for image classification, the initial accuracy was between 0.72 and 0.76. After optimization using DE, the genomic prediction accuracy was 0.3688 compared to 0.334 using GBLUP. The top selected models included one layer, 60 neurons, sigmoid activation and L2 penalty = 0.3. The accuracy of image classification after optimization was between 0.89 and 0.92. Selected models included three layers, adamax optimizer and relu or elu activation for the MLP, and one layer, 64 filters and 5×5 filter size for the CNN. DE can adapt the hyperparameter selection to each problem, dataset and model, and it significantly increased prediction accuracy with minimal user input.
APA, Harvard, Vancouver, ISO, and other styles
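A minimal rand/1/bin differential evolution loop over a continuous hyperparameter box illustrates the heuristic the authors use; the control parameters, bounds, and objective below are assumptions for illustration, not the study's settings.

```python
import random

def de_optimize(evaluate, bounds, pop_size=10, F=0.8, CR=0.9, n_gen=30):
    """Minimal DE (rand/1/bin) maximising `evaluate` over a continuous
    hyperparameter box; `bounds` is a list of (low, high) pairs."""
    dim = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(pop_size)]
    fit = [evaluate(x) for x in pop]
    for _ in range(n_gen):
        for i in range(pop_size):
            a, b, c = random.sample(
                [x for j, x in enumerate(pop) if j != i], 3)
            j_rand = random.randrange(dim)
            trial = []
            for j in range(dim):
                if random.random() < CR or j == j_rand:
                    v = a[j] + F * (b[j] - c[j])      # mutation
                    lo, hi = bounds[j]
                    trial.append(min(max(v, lo), hi))  # clip to the box
                else:
                    trial.append(pop[i][j])            # crossover keeps gene
            f = evaluate(trial)
            if f > fit[i]:                  # greedy survivor selection
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

# Toy usage: maximise a stand-in "accuracy" over two hyperparameters.
best_x, best_f = de_optimize(
    lambda x: -(x[0] - 0.3) ** 2 - (x[1] - 3.0) ** 2,
    bounds=[(0.0, 1.0), (1.0, 10.0)])
```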
9

Wang, Chung-Ying, Chien-Yao Huang, and Yen-Han Chiang. "Solutions of Feature and Hyperparameter Model Selection in the Intelligent Manufacturing." Processes 10, no. 5 (April 27, 2022): 862. http://dx.doi.org/10.3390/pr10050862.

Full text
Abstract:
In the era of Industry 4.0, numerous AI technologies have been widely applied. However, implementation of AI technology requires observation, analysis, and pre-processing of the obtained data, which takes up 60–90% of the total time after data collection. Next, sensors and features are selected. Finally, AI algorithms are used for clustering or classification. Even after data pre-processing is complete, the subsequent feature selection and hyperparameter tuning of the AI model affect the sensitivity, accuracy, and robustness of the system. In this study, two novel approaches are proposed: a sensor and feature selection system and a hyperparameter tuning mechanism. In the sensor and feature selection system, the Shapley Additive ExPlanations model is used to calculate the contribution of individual features or sensors and to make the black-box AI model transparent, whereas in the hyperparameter tuning mechanism, Hyperopt is used for tuning to improve model performance. Implementation of these two systems is expected to reduce the most frequently occurring problems: selecting the most sensitive features in the pre-processing stage and tuning the hyperparameters. These methods are also applicable to tool wear monitoring systems in intelligent manufacturing.
APA, Harvard, Vancouver, ISO, and other styles
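A Hyperopt tuning loop of the kind this abstract mentions looks roughly like the sketch below; the search-space names and the objective are invented placeholders for the authors' actual models and tool-wear data.

```python
from hyperopt import fmin, tpe, hp, Trials

# Illustrative search space; the hyperparameter names are stand-ins.
space = {
    "n_estimators": hp.quniform("n_estimators", 50, 500, 50),
    "max_depth": hp.quniform("max_depth", 2, 12, 1),
    "learning_rate": hp.loguniform("learning_rate", -5, 0),
}

def objective(params):
    # Placeholder for training a model with `params` and returning its
    # validation loss; here a synthetic stand-in to keep the sketch runnable.
    return (params["max_depth"] - 6) ** 2 + params["learning_rate"]

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)
print(best)
```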
10

Hendriks, Jacob, and Patrick Dumond. "Exploring the Relationship between Preprocessing and Hyperparameter Tuning for Vibration-Based Machine Fault Diagnosis Using CNNs." Vibration 4, no. 2 (April 3, 2021): 284–309. http://dx.doi.org/10.3390/vibration4020019.

Full text
Abstract:
This paper demonstrates the differences between popular transformation-based input representations for vibration-based machine fault diagnosis. This paper highlights the dependency of different input representations on hyperparameter selection with the results of training different configurations of classical convolutional neural networks (CNNs) with three common benchmarking datasets. Raw temporal measurement, Fourier spectrum, envelope spectrum, and spectrogram input types are individually used to train CNNs. Many configurations of CNNs are trained, with variable input sizes, convolutional kernel sizes and stride. The results show that each input type favors different combinations of hyperparameters, and that each of the datasets studied yield different performance characteristics. The input sizes are found to be the most significant determiner of whether overfitting will occur. It is demonstrated that CNNs trained with spectrograms are less dependent on hyperparameter optimization over all three datasets. This paper demonstrates the wide range of performance achieved by CNNs when preprocessing method and hyperparameters are varied as well as their complex interaction, providing researchers with useful background information and a starting place for further optimization.
APA, Harvard, Vancouver, ISO, and other styles
11

Franchini, Giorgia, Valeria Ruggiero, Federica Porta, and Luca Zanni. "Neural architecture search via standard machine learning methodologies." Mathematics in Engineering 5, no. 1 (2022): 1–21. http://dx.doi.org/10.3934/mine.2023012.

Full text
Abstract:
In the context of deep learning, the most expensive computational phase is the full training of the learning methodology. Indeed, its effectiveness depends on the choice of proper values for the so-called hyperparameters, namely the parameters that are not trained during the learning process, and such a selection typically requires an extensive numerical investigation with the execution of a significant number of experimental trials. The aim of the paper is to investigate how to choose the hyperparameters related both to the architecture of a Convolutional Neural Network (CNN), such as the number of filters and the kernel size at each convolutional layer, and to the optimisation algorithm employed to train the CNN itself, such as the steplength, the mini-batch size, and the potential adoption of variance reduction techniques. The main contribution of the paper consists in introducing an automatic Machine Learning technique to set these hyperparameters in such a way that a measure of the CNN performance can be optimised. In particular, given a set of values for the hyperparameters, we propose a low-cost strategy to predict the performance of the corresponding CNN, based on its behaviour after only a few steps of the training process. To achieve this goal, we generate a dataset whose input samples are provided by a limited number of hyperparameter configurations, together with the corresponding CNN measures of performance obtained with only a few steps of the CNN training process, while the label of each input sample is the performance corresponding to a complete training of the CNN. This dataset is used as the training set for Support Vector Regression and/or Random Forest techniques to predict the performance of the considered learning methodology, given its performance at the initial iterations of its learning process. Furthermore, by a probabilistic exploration of the hyperparameter space, we are able to find, at quite low cost, the setting of the CNN hyperparameters which provides the optimal performance. The results of an extensive numerical experimentation, carried out on CNNs, together with the use of our performance predictor with NAS-Bench-101, highlight how the proposed methodology for hyperparameter setting appears very promising.
APA, Harvard, Vancouver, ISO, and other styles
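The low-cost performance predictor described above can be approximated in a few lines of scikit-learn: train a regressor whose inputs are hyperparameter configurations plus early-training metrics and whose target is the fully-trained performance. The data here are synthetic stand-ins, and the sketch uses only a Random Forest where the paper also considers Support Vector Regression.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: each row holds 4 hyperparameter values plus
# the validation accuracy after a few training steps; the target is the
# accuracy after full training (all values invented for illustration).
rng = np.random.default_rng(0)
X = rng.random((200, 5))                 # 4 hyperparameters + 1 early metric
y = 0.5 * X[:, -1] + 0.1 * X[:, 0] + rng.normal(0.0, 0.02, 200)

predictor = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Probabilistic exploration: probe many candidate configurations cheaply
# through the predictor instead of training each CNN to completion.
candidates = rng.random((1000, 5))
best = candidates[np.argmax(predictor.predict(candidates))]
print(best)
```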
12

Li, Yang, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, and Bin Cui. "Efficient Automatic CASH via Rising Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4763–71. http://dx.doi.org/10.1609/aaai.v34i04.5910.

Full text
Abstract:
The Combined Algorithm Selection and Hyperparameter optimization (CASH) is one of the most fundamental problems in Automatic Machine Learning (AutoML). The existing Bayesian optimization (BO) based solutions turn the CASH problem into a Hyperparameter Optimization (HPO) problem by combining the hyperparameters of all machine learning (ML) algorithms, and use BO methods to solve it. As a result, these methods suffer from low efficiency due to the huge hyperparameter space in CASH. To alleviate this issue, we propose the alternating optimization framework, where the HPO problem for each ML algorithm and the algorithm selection problem are optimized alternately. In this framework, the BO methods are used to solve the HPO problem for each ML algorithm separately, incorporating a much smaller hyperparameter space for BO methods. Furthermore, we introduce Rising Bandits, a CASH-oriented Multi-Armed Bandits (MAB) variant, to model the algorithm selection in CASH. This framework can take advantage of both BO in solving the HPO problem with a relatively small hyperparameter space and MABs in accelerating the algorithm selection. Moreover, we further develop an efficient online algorithm to solve the Rising Bandits with provable theoretical guarantees. The extensive experiments on 30 OpenML datasets demonstrate the superiority of the proposed approach over the competitive baselines.
APA, Harvard, Vancouver, ISO, and other styles
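The bandit view of algorithm selection can be illustrated with a generic UCB-style loop: each arm is an ML algorithm, and pulling an arm runs one more HPO iteration for it. Note that this is plain UCB for illustration only; the paper's Rising Bandits policy instead exploits the rising, bounded reward structure of HPO, and `hpo_step` is a placeholder.

```python
import math
import random

# Arms are candidate ML algorithms; pulling an arm means running one
# more HPO iteration for that algorithm and observing the new score.
arms = ["svm", "random_forest", "knn"]

def hpo_step(algo):
    """Placeholder for one Bayesian-optimisation step on `algo`;
    returns the validation accuracy of the newly tried configuration."""
    return random.random()

best_seen = {a: 0.0 for a in arms}
pulls = {a: 0 for a in arms}
for t in range(1, 101):
    # Optimistic choice: best score so far plus an exploration bonus.
    ucb = {a: best_seen[a] + math.sqrt(2 * math.log(t) / max(pulls[a], 1))
           for a in arms}
    a = max(ucb, key=ucb.get)
    best_seen[a] = max(best_seen[a], hpo_step(a))
    pulls[a] += 1

print(max(best_seen, key=best_seen.get))   # the selected algorithm
```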
13

Lin, Sijie, Ke Xu, Hui Feng, and Bo Hu. "Sequential Sampling and Estimation of Approximately Bandlimited Graph Signals." Sensors 21, no. 4 (February 19, 2021): 1460. http://dx.doi.org/10.3390/s21041460.

Full text
Abstract:
Graph signal sampling has been widely studied in recent years, but the accurate signal models required by most of the existing sampling methods are usually unavailable prior to any observations made in a practical environment. In this paper, a sequential sampling and estimation algorithm is proposed for approximately bandlimited graph signals, in the absence of prior knowledge concerning signal properties. We approach the problem from a Bayesian perspective in which we formulate the signal prior by a multivariate Gaussian distribution with unknown hyperparameters. To overcome the interconnected problems associated with the parameter estimation, in the proposed algorithm, hyperparameter estimation and sample selection are performed in an alternating way. At each step, the unknown hyperparameters are updated by an expectation maximization procedure based on historical observations, and then the next node in the sampling operation is chosen by uncertainty sampling with the latest hyperparameters. We prove that under some specific conditions, signal estimation in the proposed algorithm is consistent. Subsequent validation of the approach through simulations shows that the proposed procedure yields significantly better performance than existing state-of-the-art approaches, along with robustness across a broad range of signal attributes.
APA, Harvard, Vancouver, ISO, and other styles
14

Abu, Masyitah, Nik Adilah Hanin Zahri, Amiza Amir, Muhammad Izham Ismail, Azhany Yaakub, Said Amirul Anwar, and Muhammad Imran Ahmad. "A Comprehensive Performance Analysis of Transfer Learning Optimization in Visual Field Defect Classification." Diagnostics 12, no. 5 (May 18, 2022): 1258. http://dx.doi.org/10.3390/diagnostics12051258.

Full text
Abstract:
Numerous studies have demonstrated that Convolutional Neural Network (CNN) models are capable of classifying visual field (VF) defects with great accuracy. In this study, we evaluated the performance of different pre-trained models (VGG-Net, MobileNet, ResNet, and DenseNet) in classifying VF defects and produced a comprehensive comparative analysis of the performance of different CNN models before and after hyperparameter tuning and fine-tuning. Using a batch size of 32, 50 epochs, and ADAM as the optimizer to optimize weight, bias, and learning rate, VGG-16 obtained the highest accuracy of 97.63 percent, according to the experimental findings. Subsequently, Bayesian optimization was utilized to execute automated hyperparameter tuning and automated fine-tuning of the layers of the pre-trained models, to determine the optimal hyperparameters and fine-tuning layer for classifying VF defects with the highest accuracy. We found that the combination of different hyperparameters and fine-tuning of the pre-trained models significantly impacts the performance of deep learning models for this classification task. In addition, we discovered that the automated selection of optimal hyperparameters and fine-tuning by Bayesian optimization significantly enhanced the performance of the pre-trained models. The best performance was observed for the DenseNet-121 model, with a validation accuracy of 98.46% and a test accuracy of 99.57% on the tested datasets.
APA, Harvard, Vancouver, ISO, and other styles
15

Zambelli, Antoine. "Ensemble method for cluster number determination and algorithm selection in unsupervised learning." F1000Research 11 (May 25, 2022): 573. http://dx.doi.org/10.12688/f1000research.121486.1.

Full text
Abstract:
Unsupervised learning, and more specifically clustering, suffers from the need for expertise in the field to be of use. Researchers must make careful and informed decisions on which algorithm to use with which set of hyperparameters for a given dataset. Additionally, researchers may need to determine the number of clusters in the dataset, which is unfortunately itself an input to most clustering algorithms; all of this before embarking on their actual subject matter work. After quantifying the impact of algorithm and hyperparameter selection, we propose an ensemble clustering framework which can be leveraged with minimal input. It can be used to determine both the number of clusters in the dataset and a suitable choice of algorithm to use for a given dataset. A code library is included in the Conclusions for ease of integration.
APA, Harvard, Vancouver, ISO, and other styles
16

Utsugi, Akio. "Hyperparameter Selection for Self-Organizing Maps." Neural Computation 9, no. 3 (March 1, 1997): 623–35. http://dx.doi.org/10.1162/neco.1997.9.3.623.

Full text
Abstract:
The self-organizing map (SOM) algorithm for finite data is derived as an approximate maximum a posteriori estimation algorithm for a gaussian mixture model with a gaussian smoothing prior, which is equivalent to a generalized deformable model (GDM). For this model, objective criteria for selecting hyperparameters are obtained on the basis of empirical Bayesian estimation and cross-validation, which are representative model selection methods. The properties of these criteria are compared by simulation experiments. These experiments show that the cross-validation methods favor more complex structures than the expected log likelihood supports, which is a measure of compatibility between a model and data distribution. On the other hand, the empirical Bayesian methods have the opposite bias.
APA, Harvard, Vancouver, ISO, and other styles
17

Efimova, V. A. "Reinforcement-based simultaneous classification model and its hyperparameters selection." Machine Learning and Data Analysis 2, no. 2 (2016): 244–54. http://dx.doi.org/10.21469/22233792.2.2.09.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Tsirikoglou, P., S. Abraham, F. Contino, C. Lacor, and G. Ghorbaniasl. "A hyperparameters selection technique for support vector regression models." Applied Soft Computing 61 (December 2017): 139–48. http://dx.doi.org/10.1016/j.asoc.2017.07.017.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Chen, Yuejian, Meng Rao, Ke Feng, and Ming J. Zuo. "Physics-Informed LSTM hyperparameters selection for gearbox fault detection." Mechanical Systems and Signal Processing 171 (May 2022): 108907. http://dx.doi.org/10.1016/j.ymssp.2022.108907.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Sun, Yang, Hangdong Zhao, and Jonathan Scarlett. "On Architecture Selection for Linear Inverse Problems with Untrained Neural Networks." Entropy 23, no. 11 (November 9, 2021): 1481. http://dx.doi.org/10.3390/e23111481.

Full text
Abstract:
In recent years, neural network based image priors have been shown to be highly effective for linear inverse problems, often significantly outperforming conventional methods that are based on sparsity and related notions. While pre-trained generative models are perhaps the most common, it has additionally been shown that even untrained neural networks can serve as excellent priors in various imaging applications. In this paper, we seek to broaden the applicability and understanding of untrained neural network priors by investigating the interaction between architecture selection, measurement models (e.g., inpainting vs. denoising vs. compressive sensing), and signal types (e.g., smooth vs. erratic). We motivate the problem via statistical learning theory, and provide two practical algorithms for tuning architectural hyperparameters. Using experimental evaluations, we demonstrate that the optimal hyperparameters may vary significantly between tasks and can exhibit large performance gaps when tuned for the wrong task. In addition, we investigate which hyperparameters tend to be more important, and which are robust to deviations from the optimum.
APA, Harvard, Vancouver, ISO, and other styles
21

Menapace, Andrea, Ariele Zanfei, and Maurizio Righetti. "Tuning ANN Hyperparameters for Forecasting Drinking Water Demand." Applied Sciences 11, no. 9 (May 10, 2021): 4290. http://dx.doi.org/10.3390/app11094290.

Full text
Abstract:
The evolution of smart water grids leads to new Big Data challenges, boosting the development and application of Machine Learning techniques to support efficient and sustainable drinking water management. These powerful techniques rely on hyperparameters, making the models' tuning a tricky and crucial task. We hence propose an insightful analysis of the tuning of Artificial Neural Networks for drinking water demand forecasting. This study focuses on fitting the layer and node hyperparameters of different Neural Network architectures through a grid search, varying the dataset, prediction horizon, and set of inputs. In particular, the architectures involved are the Feed Forward Neural Network, the Long Short Term Memory, the Simple Recurrent Neural Network, and the Gated Recurrent Unit, while the prediction interval ranges from 1 h to 1 week. To counter the stochasticity of Neural Network tuning, we propose selecting the median model among several repetitions of each hyperparameter configuration. The proposed iterative tuning procedure highlights how the required number of layers and nodes changes with the Neural Network architecture, prediction horizon, and dataset. Significant trends and considerations are pointed out to support Neural Network application in drinking water prediction.
APA, Harvard, Vancouver, ISO, and other styles
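The median-of-repetitions grid search proposed above can be sketched as follows; the grids, the repetition count, and the `train_and_score` stand-in are illustrative assumptions, not the study's settings.

```python
import itertools
import random
import statistics

layers_grid = [1, 2, 3]
nodes_grid = [16, 32, 64]
n_repeats = 7

def train_and_score(n_layers, n_nodes, seed):
    """Placeholder for building and training one forecasting network
    and returning its validation error; here a noisy stand-in."""
    random.seed(hash((n_layers, n_nodes, seed)))
    return abs(n_layers - 2) + abs(n_nodes - 32) / 32 + random.gauss(0, 0.1)

results = {}
for n_layers, n_nodes in itertools.product(layers_grid, nodes_grid):
    scores = [train_and_score(n_layers, n_nodes, s) for s in range(n_repeats)]
    # The median across repetitions damps the stochasticity of single runs.
    results[(n_layers, n_nodes)] = statistics.median(scores)

best = min(results, key=results.get)   # lowest median forecasting error
print(best, results[best])
```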
22

Nazdryukhin, A. S., A. M. Fedrak, and N. A. Radeev. "Neural networks for classification problem on tabular data." Journal of Physics: Conference Series 2142, no. 1 (December 1, 2021): 012013. http://dx.doi.org/10.1088/1742-6596/2142/1/012013.

Full text
Abstract:
This work presents the results of using self-normalizing neural networks with automatic selection of hyperparameters, TabNet, and NODE to solve the problem of tabular data classification. A method for the automatic selection of hyperparameters was realised. Testing was carried out with the open-source framework OpenML AutoML Benchmark. As part of the work, a comparative analysis was carried out against seven classification methods, with experiments run on 39 datasets with 5 methods. NODE shows the best results among these methods and outperformed the standard methods on four datasets.
APA, Harvard, Vancouver, ISO, and other styles
23

Jervis, Michael, Mingliang Liu, and Robert Smith. "Deep learning network optimization and hyperparameter tuning for seismic lithofacies classification." Leading Edge 40, no. 7 (July 2021): 514–23. http://dx.doi.org/10.1190/tle40070514.1.

Full text
Abstract:
Deep learning is increasingly being applied in many aspects of seismic processing and interpretation. Here, we look at a deep convolutional neural network approach to multiclass seismic lithofacies characterization using well logs and seismic data. In particular, we focus on network performance and hyperparameter tuning. Several hyperparameter tuning approaches are compared, including true random search and directed search methods such as very fast simulated annealing and Bayesian hyperparameter optimization. The results show that improvements in predictive capability are possible by using automatic optimization compared with manual parameter selection. In addition to evaluating the prediction accuracy's sensitivity to hyperparameters, we test various types of data representations. The choice of input seismic data can significantly impact the overall accuracy and computation speed of the optimized networks for the classification challenge under consideration. This is validated on a 3D synthetic seismic lithofacies example with acoustic and lithologic properties based on real well data and structure from an onshore oil field.
APA, Harvard, Vancouver, ISO, and other styles
24

Bouktif, Salah, Ali Fiaz, Ali Ouni, and Mohamed Adel Serhani. "Multi-Sequence LSTM-RNN Deep Learning and Metaheuristics for Electric Load Forecasting." Energies 13, no. 2 (January 13, 2020): 391. http://dx.doi.org/10.3390/en13020391.

Full text
Abstract:
Short term electric load forecasting plays a crucial role for utility companies, as it allows for the efficient operation and management of power grid networks, optimal balancing between production and demand, as well as reduced production costs. As the volume and variety of energy data provided by building automation systems, smart meters, and other sources are continuously increasing, long short-term memory (LSTM) deep learning models have become an attractive approach for energy load forecasting. These models are characterized by their capabilities of learning long-term dependencies in collected electric data, which lead to accurate prediction results that outperform several alternative statistical and machine learning approaches. Unfortunately, applying LSTM models may not produce acceptable forecasting results, not only because of the noisy electric data but also due to the naive selection of its hyperparameter values. Therefore, an optimal configuration of an LSTM model is necessary to describe the electric consumption patterns and discover the time-series dynamics in the energy domain. Finding such an optimal configuration is, on the one hand, a combinatorial problem where selection is done from a very large space of choices; on the other hand, it is a learning problem where the hyperparameters should reflect the energy consumption domain knowledge, such as the influential time lags, seasonality, periodicity, and other temporal attributes. To handle this problem, we use in this paper metaheuristic-search-based algorithms, known for their ability to alleviate search complexity as well as their capacity to learn from the domain where they are applied, to find optimal or near-optimal values for the set of tunable LSTM hyperparameters in the electrical energy consumption domain. We tailor both a genetic algorithm (GA) and particle swarm optimization (PSO) to learn hyperparameters for load forecasting in the context of big-data energy consumption. The statistical analysis of the obtained results shows that the multi-sequence deep learning model tuned by the metaheuristic search algorithms provides more accurate results than the benchmark machine learning models and the LSTM model whose inputs and hyperparameters were established through limited experience and a limited number of experiments.
APA, Harvard, Vancouver, ISO, and other styles
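A minimal particle swarm optimization loop over a continuous hyperparameter box shows the second metaheuristic the authors tailor; the inertia and acceleration constants below are textbook defaults, and `evaluate` stands in for training the LSTM and scoring its forecasts.

```python
import random

def pso(evaluate, bounds, n_particles=12, n_iter=30, w=0.7, c1=1.5, c2=1.5):
    """Minimal PSO maximising `evaluate` over a continuous box;
    `bounds` is a list of (low, high) pairs, one per hyperparameter."""
    dim = len(bounds)
    pos = [[random.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [evaluate(p) for p in pos]
    g = max(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                lo, hi = bounds[j]
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lo), hi)
            f = evaluate(pos[i])
            if f > pbest_f[i]:              # update personal best
                pbest[i], pbest_f[i] = pos[i][:], f
                if f > gbest_f:             # update global best
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

# Toy usage over two stand-in hyperparameters (hidden units, dropout).
best_cfg, best_score = pso(
    lambda x: -(x[0] - 128) ** 2 / 1e4 - (x[1] - 0.2) ** 2,
    bounds=[(16, 256), (0.0, 0.9)])
```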
25

Santos, Carlos Eduardo da Silva, Renato Coral Sampaio, Leandro dos Santos Coelho, Guillermo Alvarez Bestard, and Carlos Humberto Llanos. "Multi-objective adaptive differential evolution for SVM/SVR hyperparameters selection." Pattern Recognition 110 (February 2021): 107649. http://dx.doi.org/10.1016/j.patcog.2020.107649.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Beck, Daniel, Trevor Cohn, Christian Hardmeier, and Lucia Specia. "Learning Structural Kernels for Natural Language Processing." Transactions of the Association for Computational Linguistics 3 (December 2015): 461–73. http://dx.doi.org/10.1162/tacl_a_00151.

Full text
Abstract:
Structural kernels are a flexible learning paradigm that has been widely used in Natural Language Processing. However, the problem of model selection in kernel-based methods is usually overlooked. Previous approaches mostly rely on setting default values for kernel hyperparameters or using grid search, which is slow and coarse-grained. In contrast, Bayesian methods allow efficient model selection by maximizing the evidence on the training data through gradient-based methods. In this paper we show how to perform this in the context of structural kernels by using Gaussian Processes. Experimental results on tree kernels show that this procedure results in better prediction performance compared to hyperparameter optimization via grid search. The framework proposed in this paper can be adapted to other structures besides trees, e.g., strings and graphs, thereby extending the utility of kernel-based methods.
APA, Harvard, Vancouver, ISO, and other styles
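The evidence maximization the authors perform for kernel hyperparameters is what, for instance, scikit-learn's GP regressor does during `fit`; the sketch below uses a plain RBF kernel on synthetic vectors purely as an illustration, whereas the paper works with tree kernels over linguistic structures.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.random((40, 3))            # stand-in feature vectors
y = np.sin(X).sum(axis=1)          # stand-in regression targets

kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0, 1.0])
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
gp.fit(X, y)   # fit() maximises the log marginal likelihood (the evidence)

print(gp.kernel_)                          # evidence-optimised hyperparameters
print(gp.log_marginal_likelihood_value_)   # the maximised evidence
```

Because the evidence is maximised by gradient-based search rather than a grid, the hyperparameter selection is both finer-grained and cheaper, which is the advantage the abstract highlights.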
27

Trierweiler Ribeiro, Gabriel, João Guilherme Sauer, Naylene Fraccanabbia, Viviana Cocco Mariani, and Leandro dos Santos Coelho. "Bayesian Optimized Echo State Network Applied to Short-Term Load Forecasting." Energies 13, no. 9 (May 11, 2020): 2390. http://dx.doi.org/10.3390/en13092390.

Full text
Abstract:
Load forecasting directly impacts financial returns and information in electrical systems planning. A promising approach to load forecasting is the Echo State Network (ESN), a recurrent neural network for the processing of temporal dependencies. The low computational cost and powerful performance of ESN make it widely used in a range of applications including forecasting tasks and nonlinear modeling. This paper presents a Bayesian optimization algorithm (BOA) of ESN hyperparameters in load forecasting, with its main contributions including helping the selection of optimization algorithms for tuning ESN to solve real-world forecasting problems, as well as the evaluation of the performance of Bayesian optimization with different acquisition function settings. For this purpose, the ESN hyperparameters were set as variables to be optimized. Then, the adopted BOA employs a probabilistic model using a Gaussian process to find the best set of ESN hyperparameters, using three different options of acquisition function and a surrogate utility function. Finally, the optimized hyperparameters are used by the ESN for predictions. Two datasets have been used to test the effectiveness of the proposed forecasting ESN model using BOA approaches, one from Poland and another from Brazil. The results of optimization statistics, convergence curves, execution time profile, and the hyperparameters' best solution frequencies indicate that each problem requires a different setting for the BOA. Simulation results are promising in terms of short-term load forecasting quality, and low-error predictions may be achieved given that the correct option settings are used. Furthermore, since no globally optimal solution is known for real-world problems, correlations among certain values of hyperparameters are useful to guide the selection of such a solution.
APA, Harvard, Vancouver, ISO, and other styles
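A Bayesian optimization run with an explicit acquisition-function choice, of the kind evaluated in this paper, can be sketched with scikit-optimize; the ESN hyperparameter names, bounds, and objective below are illustrative assumptions, not the paper's configuration.

```python
from skopt import gp_minimize
from skopt.space import Real

# Illustrative ESN hyperparameter box (names are stand-ins).
space = [Real(0.1, 1.5, name="spectral_radius"),
         Real(0.0, 1.0, name="leaking_rate"),
         Real(1e-8, 1e-2, prior="log-uniform", name="ridge_penalty")]

def objective(params):
    spectral_radius, leaking_rate, ridge_penalty = params
    # Placeholder for training an ESN and returning its forecast error.
    return (spectral_radius - 0.9) ** 2 + (leaking_rate - 0.3) ** 2

# acq_func can be "EI", "PI" or "LCB" -- the acquisition-function
# settings whose effect the paper evaluates.
result = gp_minimize(objective, space, acq_func="EI", n_calls=30,
                     random_state=0)
print(result.x, result.fun)
```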
28

El-Hasnony, Ibrahim M., Omar M. Elzeki, Ali Alshehri, and Hanaa Salem. "Multi-Label Active Learning-Based Machine Learning Model for Heart Disease Prediction." Sensors 22, no. 3 (February 4, 2022): 1184. http://dx.doi.org/10.3390/s22031184.

Full text
Abstract:
The rapid growth and adaptation of medical information to identify significant health trends and help with timely preventive care have been recent hallmarks of the modern healthcare data system. Heart disease is the deadliest condition in the developed world. Cardiovascular disease and its complications, including dementia, can be averted with early detection. Further research in this area is needed to prevent strokes and heart attacks. An optimal machine learning model can help achieve this goal with a wealth of healthcare data on heart disease. Heart disease can be predicted and diagnosed using machine-learning-based systems. Active learning (AL) methods improve classification quality by incorporating user-expert feedback with sparsely labelled data. In this paper, five selection strategies for multi-label active learning (MMC, Random, Adaptive, QUIRE, and AUDI) were applied and used to reduce labelling costs by iteratively selecting the most relevant data to query their labels. The selection methods, with a label ranking classifier whose hyperparameters were optimized by a grid search, were used to implement predictive modelling in each scenario for the heart disease dataset. The experimental evaluation includes accuracy and F-score with and without hyperparameter optimization. The results show that, in terms of accuracy, the optimized label ranking model generalizes beyond the existing data better with the chosen selection method than with the others; with regard to the F-score, the selection method stood out under optimized settings.
APA, Harvard, Vancouver, ISO, and other styles
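The grid search over a classifier's hyperparameters mentioned above is a one-liner in scikit-learn; this sketch uses a synthetic binary dataset and a plain random forest rather than the paper's label ranking classifier and heart disease data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=13, random_state=0)

# Illustrative grid; the paper's tuned classifier and ranges differ.
param_grid = {"n_estimators": [100, 300],
              "max_depth": [3, 6, None],
              "min_samples_leaf": [1, 5]}

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, scoring="f1", cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```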
29

Сіряк, Р. В., І. С. Скарга-Бандурова, and T. O. Білобородова. "Towards an empirical hyperparameters optimization in CNN." ВІСНИК СХІДНОУКРАЇНСЬКОГО НАЦІОНАЛЬНОГО УНІВЕРСИТЕТУ імені Володимира Даля, no. 5(253) (September 5, 2019): 87–91. http://dx.doi.org/10.33216/1998-7927-2019-253-5-87-91.

Full text
Abstract:
The necessity of creating a gesture recognition model based on a convolutional neural network that is effective not only in pattern recognition but also in terms of learning speed and resource intensity is substantiated. In this regard, the work solves the problem of hyperparameter optimization and the selection of the best backpropagation optimizer. To implement these tasks, a model was created that can recognize hand gestures, both from a single image and from streaming video. When choosing an optimizer, two adaptive methods were tested, Adadelta and Adam. The experiments confirmed the high efficiency of Adadelta; however, compared with Adam, it required more than twice as long to train the network.
APA, Harvard, Vancouver, ISO, and other styles
30

KNÜRR, TIMO, ESA LÄÄRÄ, and MIKKO J. SILLANPÄÄ. "Genetic analysis of complex traits via Bayesian variable selection: the utility of a mixture of uniform priors." Genetics Research 93, no. 4 (July 18, 2011): 303–18. http://dx.doi.org/10.1017/s0016672311000164.

Full text
Abstract:
Summary: A new estimation-based Bayesian variable selection approach is presented for the genetic analysis of complex traits based on linear or logistic regression. By assigning a mixture of uniform priors (MU) to genetic effects, the approach provides an intuitive way of specifying hyperparameters controlling the selection of multiple influential loci. It aims at avoiding the difficulty of interpreting assumptions made in the specification of priors. The method is compared on two real datasets with two other approaches, stochastic search variable selection (SSVS) and a re-formulation of Bayes B utilizing indicator variables and adaptive Student's t-distributions (IAt). The Markov Chain Monte Carlo (MCMC) sampling performance of the three methods is evaluated using the publicly available software OpenBUGS (model scripts are provided in the Supplementary material). The sensitivity of MU to the specification of hyperparameters is assessed in one of the data examples.
APA, Harvard, Vancouver, ISO, and other styles
31

Chan, Joshua C. C., Liana Jacobi, and Dan Zhu. "Efficient selection of hyperparameters in large Bayesian VARs using automatic differentiation." Journal of Forecasting 39, no. 6 (March 2, 2020): 934–43. http://dx.doi.org/10.1002/for.2660.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Baldin, N., and V. Spokoiny. "Bayesian Model Selection and the Concentration of the Posterior of Hyperparameters." Journal of Mathematical Sciences 203, no. 6 (November 16, 2014): 761–76. http://dx.doi.org/10.1007/s10958-014-2166-7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Mathew, Steve Koshy, and Yu Zhang. "Acoustic-Based Engine Fault Diagnosis Using WPT, PCA and Bayesian Optimization." Applied Sciences 10, no. 19 (October 1, 2020): 6890. http://dx.doi.org/10.3390/app10196890.

Full text
Abstract:
Engine fault diagnosis aims to assist engineers in undertaking vehicle maintenance in an efficient manner. This paper presents an automatic model and hyperparameter selection scheme for engine combustion fault classification, using acoustic signals captured from cylinder heads of the engine. Wavelet Packet Transform (WPT) is utilized for time–frequency analysis, and statistical features are extracted from both high- and low-level WPT coefficients. Then, the extracted features are used to compare three models: (i) a standard classification model; (ii) Bayesian optimization for automatic model and hyperparameter selection; and (iii) Principal Component Analysis (PCA) for feature space dimensionality reduction combined with Bayesian optimization. The latter two models both demonstrated improved accuracy and other performance metrics compared to the standard model. Moreover, at a similar accuracy level, the PCA with Bayesian optimized model achieved around 20% less total evaluation time and 8–19% less testing time compared to the second model for all fault conditions, which thus shows a promising solution for further development in real-time engine fault diagnosis.
APA, Harvard, Vancouver, ISO, and other styles
34

Sethi, Monika, Sachin Ahuja, Shalli Rani, Puneet Bawa, and Atef Zaguia. "Classification of Alzheimer’s Disease Using Gaussian-Based Bayesian Parameter Optimization for Deep Convolutional LSTM Network." Computational and Mathematical Methods in Medicine 2021 (October 4, 2021): 1–16. http://dx.doi.org/10.1155/2021/4186666.

Full text
Abstract:
Alzheimer’s disease (AD) is one of the most important causes of mortality in elderly people, and it is often challenging to use traditional manual procedures when diagnosing the disease in its early stages. The successful implementation of machine learning (ML) techniques has shown their effectiveness and reliability as one of the better options for early diagnosis of AD. But the heterogeneous dimensions and composition of the disease data have undoubtedly made diagnostics more difficult, needing a suitable model choice to overcome the difficulty. Therefore, in this paper, four different 2D and 3D convolutional neural network (CNN) frameworks based on Bayesian search optimization are proposed to develop an optimized deep learning model for predicting the early onset of AD, as binary and ternary classification on magnetic resonance imaging (MRI) scans. Moreover, certain hyperparameters such as the learning rate, optimizer, and number of hidden units must be set and adjusted to boost the performance of the deep learning model. Bayesian optimization provides leverage throughout the experiments: persistent testing of the hyperparameter space yields not only the output but also information about promising nearby configurations, so the series of experiments needed to explore the space can be substantially reduced. Finally, alongside the Bayesian approaches, long short-term memory (LSTM) with augmentation found better model settings in fewer iterations, with relative improvements (RI) of 7.03%, 12.19%, 10.80%, and 11.99% over the four systems optimized with manual hyperparameter tuning, in which hyperparameters were chosen based on what looked appealing from past data and on conventional manual selection techniques.
APA, Harvard, Vancouver, ISO, and other styles
35

Utsugi, Akio. "Density Estimation by Mixture Models with Smoothing Priors." Neural Computation 10, no. 8 (November 1, 1998): 2115–35. http://dx.doi.org/10.1162/089976698300016990.

Full text
Abstract:
In the statistical approach for self-organizing maps (SOMs), learning is regarded as an estimation algorithm for a gaussian mixture model with a gaussian smoothing prior on the centroid parameters. The values of the hyperparameters and the topological structure are selected on the basis of a statistical principle. However, since the component selection probabilities are fixed to a common value, the centroids concentrate on areas with high data density. This deforms a coordinate system on an extracted manifold and makes smoothness evaluation for the manifold inaccurate. In this article, we study an extended SOM model whose component selection probabilities are variable. To stabilize the estimation, a smoothing prior on the component selection probabilities is introduced. An estimation algorithm for the parameters and the hyperparameters based on empirical Bayesian inference is obtained. The performance of density estimation by the new model and the SOM model is compared via simulation experiments.
APA, Harvard, Vancouver, ISO, and other styles
36

Goh, Rui Ying, Lai Soon Lee, Hsin-Vonn Seow, and Kathiresan Gopal. "Hybrid Harmony Search–Artificial Intelligence Models in Credit Scoring." Entropy 22, no. 9 (September 4, 2020): 989. http://dx.doi.org/10.3390/e22090989.

Full text
Abstract:
Credit scoring is an important tool used by financial institutions to correctly identify defaulters and non-defaulters. Support Vector Machines (SVM) and Random Forest (RF) are the Artificial Intelligence techniques that have been attracting interest due to their flexibility to account for various data patterns. Both are black-box models which are sensitive to hyperparameter settings. Feature selection can be performed on SVM to enable explanation with the reduced features, whereas feature importance computed by RF can be used for model explanation. The benefits of accuracy and interpretation allow for significant improvement in the area of credit risk and credit scoring. This paper proposes the use of Harmony Search (HS), to form a hybrid HS-SVM to perform feature selection and hyperparameter tuning simultaneously, and a hybrid HS-RF to tune the hyperparameters. A Modified HS (MHS) is also proposed with the main objective to achieve comparable results as the standard HS with a shorter computational time. MHS consists of four main modifications in the standard HS: (i) Elitism selection during memory consideration instead of random selection, (ii) dynamic exploration and exploitation operators in place of the original static operators, (iii) a self-adjusted bandwidth operator, and (iv) inclusion of additional termination criteria to reach faster convergence. Along with parallel computing, MHS effectively reduces the computational time of the proposed hybrid models. The proposed hybrid models are compared with standard statistical models across three different datasets commonly used in credit scoring studies. The computational results show that MHS-RF is most robust in terms of model performance, model explainability and computational time.
APA, Harvard, Vancouver, ISO, and other styles
37

Piccolo, Stephen R., Avery Mecham, Nathan P. Golightly, Jérémie L. Johnson, and Dustin B. Miller. "The ability to classify patients based on gene-expression data varies by algorithm and performance metric." PLOS Computational Biology 18, no. 3 (March 11, 2022): e1009926. http://dx.doi.org/10.1371/journal.pcbi.1009926.

Full text
Abstract:
By classifying patients into subgroups, clinicians can provide more effective care than using a uniform approach for all patients. Such subgroups might include patients with a particular disease subtype, patients with a good (or poor) prognosis, or patients most (or least) likely to respond to a particular therapy. Transcriptomic measurements reflect the downstream effects of genomic and epigenomic variations. However, high-throughput technologies generate thousands of measurements per patient, and complex dependencies exist among genes, so it may be infeasible to classify patients using traditional statistical models. Machine-learning classification algorithms can help with this problem. However, hundreds of classification algorithms exist—and most support diverse hyperparameters—so it is difficult for researchers to know which are optimal for gene-expression biomarkers. We performed a benchmark comparison, applying 52 classification algorithms to 50 gene-expression datasets (143 class variables). We evaluated algorithms that represent diverse machine-learning methodologies and have been implemented in general-purpose, open-source, machine-learning libraries. When available, we combined clinical predictors with gene-expression data. Additionally, we evaluated the effects of performing hyperparameter optimization and feature selection using nested cross validation. Kernel- and ensemble-based algorithms consistently outperformed other types of classification algorithms; however, even the top-performing algorithms performed poorly in some cases. Hyperparameter optimization and feature selection typically improved predictive performance, and univariate feature-selection algorithms typically outperformed more sophisticated methods. Together, our findings illustrate that algorithm performance varies considerably when other factors are held constant and thus that algorithm selection is a critical step in biomarker studies.
APA, Harvard, Vancouver, ISO, and other styles
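The nested cross-validation used in this benchmark separates hyperparameter optimization and feature selection (inner loop) from performance estimation (outer loop); here is a minimal scikit-learn sketch on a stand-in dataset rather than the paper's gene-expression data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: univariate feature selection and hyperparameter tuning;
# outer loop: an unbiased estimate of the tuned pipeline's performance.
pipe = Pipeline([("select", SelectKBest(f_classif)),
                 ("clf", SVC())])
grid = {"select__k": [10, 20, 30],
        "clf__C": [0.1, 1, 10],
        "clf__gamma": ["scale", 0.01]}
inner = GridSearchCV(pipe, grid, cv=3)
outer_scores = cross_val_score(inner, X, y, cv=5)
print(outer_scores.mean())
```

Keeping the feature selector inside the tuned pipeline matters: selecting features on the full dataset before cross-validation would leak information into the outer performance estimate.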
38

Nahhas, Faten Hamed, Helmi Z. M. Shafri, Maher Ibrahim Sameen, Biswajeet Pradhan, and Shattri Mansor. "Deep Learning Approach for Building Detection Using LiDAR–Orthophoto Fusion." Journal of Sensors 2018 (August 5, 2018): 1–12. http://dx.doi.org/10.1155/2018/7212307.

Full text
Abstract:
This paper reports on a building detection approach based on deep learning (DL) using the fusion of Light Detection and Ranging (LiDAR) data and orthophotos. The proposed method utilized object-based analysis to create objects, a feature-level fusion, an autoencoder-based dimensionality reduction to transform low-level features into compressed features, and a convolutional neural network (CNN) to transform compressed features into high-level features, which were used to classify objects into buildings and background. The proposed architecture was optimized using the grid search method, and its sensitivity to hyperparameters was analyzed and discussed. The proposed model was evaluated on two datasets selected from an urban area with different building types. Results show that the dimensionality reduction by the autoencoder approach from 21 features to 10 features can improve detection accuracy from 86.06% to 86.19% in the working area and from 77.92% to 78.26% in the testing area. The sensitivity analysis also shows that the selection of the hyperparameter values of the model significantly affects detection accuracy. The best hyperparameters of the model are 128 filters in the CNN model, the Adamax optimizer, 10 units in the fully connected layer of the CNN model, a batch size of 8, and a dropout of 0.2. These hyperparameters are critical to improving the generalization capacity of the model. Furthermore, comparison experiments with the support vector machine (SVM) show that the proposed model with or without dimensionality reduction outperforms the SVM models in the working area. However, the SVM model achieves better accuracy in the testing area than the proposed model without dimensionality reduction. This study generally shows that the use of an autoencoder in DL models can improve the accuracy of building recognition in fused LiDAR–orthophoto data.
APA, Harvard, Vancouver, ISO, and other styles
39

Deshmukh, Maithili, and M. A. Pund. "Implementation Paper on Network Data Verification Using Machine Learning Classifiers Based on Reduced Feature Dimensions." International Journal for Research in Applied Science and Engineering Technology 10, no. 4 (April 30, 2022): 2921–24. http://dx.doi.org/10.22214/ijraset.2022.41938.

Full text
Abstract:
With the rapid development of network-based applications, new risks arise, and additional security mechanisms require attention to enhance speed and accuracy. Although many new security tools have been developed, the rapid rise of malicious activity is a major problem, and the ever-evolving attacks pose serious threats to network security. Network administrators rely heavily on intrusion detection systems to detect such network intrusion activity. A major approach is machine learning methods for intrusion detection, where we learn models from data to differentiate between abnormal and normal traffic. Although machine learning methods are often used, there are some drawbacks to deep analysis of machine learning algorithms in terms of intrusion detection. In this work, we present a comprehensive analysis of some existing machine learning classifiers within the context of known intrusions into network traffic. Specifically, we analyze classification along different dimensions, that is, feature selection, sensitivity to hyperparameter selection, and class imbalance problems involved in intrusion detection. We evaluate several classifiers using the NSL-KDD dataset and summarize their effectiveness using a detailed experimental evaluation. Keywords: IDS, Machine Learning, Classification Algorithms, NSL-KDD Dataset, Network Intrusion Detection, Data Mining, Feature Selection, WEKA, Hyperparameters, Hyperparameter Optimization.
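A minimal sketch of the comparison protocol the abstract outlines — several off-the-shelf classifiers evaluated behind a shared feature-selection step — is shown below. Synthetic, imbalanced data stands in for NSL-KDD, and the classifier list and k value are assumptions.

```python
# Hedged sketch: compare classifiers behind one feature-selection step.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# 41 features mirrors NSL-KDD's width; weights induce class imbalance.
X, y = make_classification(n_samples=500, n_features=41,
                           weights=[0.8], random_state=0)

for name, clf in [("NaiveBayes", GaussianNB()),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0)),
                  ("RandomForest", RandomForestClassifier(random_state=0))]:
    pipe = make_pipeline(SelectKBest(mutual_info_classif, k=15), clf)
    print(name, cross_val_score(pipe, X, y, cv=5, scoring="f1").mean())
```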
APA, Harvard, Vancouver, ISO, and other styles
40

Zhang, Xuan, and Kevin Duh. "Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems." Transactions of the Association for Computational Linguistics 8 (July 2020): 393–408. http://dx.doi.org/10.1162/tacl_a_00322.

Full text
Abstract:
Hyperparameter selection is a crucial part of building neural machine translation (NMT) systems across both academia and industry. Fine-grained adjustments to a model’s architecture or training recipe can mean the difference between a positive and negative research result or between a state-of-the-art and underperforming system. While recent literature has proposed methods for automatic hyperparameter optimization (HPO), there has been limited work on applying these methods to neural machine translation (NMT), due in part to the high costs associated with experiments that train large numbers of model variants. To facilitate research in this space, we introduce a lookup-based approach that uses a library of pre-trained models for fast, low cost HPO experimentation. Our contributions include (1) the release of a large collection of trained NMT models covering a wide range of hyperparameters, (2) the proposal of targeted metrics for evaluating HPO methods on NMT, and (3) a reproducible benchmark of several HPO methods against our model library, including novel graph-based and multiobjective methods.
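The core idea — replacing expensive training runs with table lookups over pre-trained configurations — can be sketched in a few lines. Everything below (the search space, the BLEU values, the random-search baseline) is invented for illustration; the released model library itself is a separate download.

```python
# Sketch of a lookup-based HPO benchmark: every configuration's metric
# is precomputed, so an 'evaluation' is a dictionary lookup.
import itertools
import random

space = {"embed_dim": [256, 512], "layers": [2, 4, 6], "lr": [3e-4, 1e-3]}
keys = sorted(space)
random.seed(0)
# Stand-in for the library of pre-trained NMT models: each hyperparameter
# configuration maps to a precomputed BLEU score.
table = {cfg: round(random.uniform(20, 35), 2)
         for cfg in itertools.product(*(space[k] for k in keys))}

def evaluate(cfg):
    """Zero-cost 'training run': look the configuration up."""
    return table[cfg]

# A random-search HPO method benchmarked against the lookup table.
sampled = random.sample(sorted(table), 5)
best = max(sampled, key=evaluate)
print(dict(zip(keys, best)), evaluate(best))
```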
APA, Harvard, Vancouver, ISO, and other styles
41

Shalamov, Viacheslav, Valeria Efimova, Sergey Muravyov, and Andrey Filchenkov. "Reinforcement-based Method for Simultaneous Clustering Algorithm Selection and its Hyperparameters Optimization." Procedia Computer Science 136 (2018): 144–53. http://dx.doi.org/10.1016/j.procs.2018.08.247.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Lebrun, Gilles, Christophe Charrier, Olivier Lezoray, and Hubert Cardot. "Tabu Search Model Selection for SVM." International Journal of Neural Systems 18, no. 01 (February 2008): 19–31. http://dx.doi.org/10.1142/s0129065708001348.

Full text
Abstract:
A model selection method based on tabu search is proposed to build support vector machines (binary decision functions) of reduced complexity and efficient generalization. The aim is to build a fast and efficient support vector machine classifier. A criterion is defined to evaluate the quality of a decision function, blending the recognition rate with the complexity of the binary decision function. The selection of the simplification level by vector quantization, of a feature subset, and of the support vector machine hyperparameters is performed by the tabu search method to optimize the defined quality criterion, in order to find a good sub-optimal model in tractable time.
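A rough sketch of the idea follows, under stated assumptions: a small discrete grid over C and γ, a fixed move neighbourhood, and a quality criterion that subtracts a complexity penalty based on the support-vector count. The paper's vector-quantization and feature-subset moves are omitted.

```python
# Hedged sketch of tabu search over SVM hyperparameters, scoring each
# model by accuracy minus a complexity penalty (support-vector count).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
Cs, gammas = [0.1, 1, 10, 100], [1e-3, 1e-2, 1e-1, 1]

def quality(i, j):
    clf = SVC(C=Cs[i], gamma=gammas[j]).fit(X, y)
    acc = cross_val_score(clf, X, y, cv=3).mean()
    return acc - 0.1 * clf.n_support_.sum() / len(X)  # accuracy vs. size

current, tabu = (1, 1), []
best = (quality(*current), current)
for _ in range(10):
    moves = [(current[0] + di, current[1] + dj)
             for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]]
    moves = [m for m in moves
             if 0 <= m[0] < len(Cs) and 0 <= m[1] < len(gammas)
             and m not in tabu]
    if not moves:
        break
    current = max(moves, key=lambda m: quality(*m))  # best non-tabu move
    tabu = (tabu + [current])[-5:]                   # short-term memory
    best = max(best, (quality(*current), current))
print(best)
```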
APA, Harvard, Vancouver, ISO, and other styles
43

Chen, Xu, and Brett Wujek. "AutoDAL: Distributed Active Learning with Automatic Hyperparameter Selection." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3537–44. http://dx.doi.org/10.1609/aaai.v34i04.5759.

Full text
Abstract:
Automated machine learning (AutoML) strives to establish an appropriate machine learning model for any dataset automatically, with minimal human intervention. Although extensive research has been conducted on AutoML, most of it has focused on supervised learning. Research on automated semi-supervised learning and active learning algorithms is still limited. Implementation becomes more challenging when the algorithm is designed for a distributed computing environment. With this as motivation, we propose a novel automated learning system for distributed active learning (AutoDAL) to address these challenges. First, automated graph-based semi-supervised learning is conducted by aggregating the proposed cost functions from different compute nodes in a distributed manner. Subsequently, automated active learning is addressed by jointly optimizing hyperparameters in both the classification and query selection stages, leveraging graph loss minimization and entropy regularization. Moreover, we propose an efficient distributed active learning algorithm that is scalable for big data: in the classification stage, the unlabeled data is partitioned and the labeled data replicated across worker nodes, and in the query selection stage the data is aggregated in the controller. The proposed AutoDAL algorithm is applied to multiple benchmark datasets and a real-world electrocardiogram (ECG) dataset for classification. We demonstrate that the proposed AutoDAL algorithm is capable of achieving significantly better performance compared to several state-of-the-art AutoML approaches and active learning algorithms.
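The query-selection step in particular lends itself to a compact illustration: label the unlabeled points whose predicted class distribution has the highest entropy. The single-node sketch below uses logistic regression on synthetic data and omits the paper's distributed machinery and graph-based components.

```python
# Minimal sketch of entropy-based query selection for active learning.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           random_state=0)
labeled, unlabeled = np.arange(30), np.arange(30, 300)

for _ in range(5):  # active-learning rounds
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    p = clf.predict_proba(X[unlabeled])
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)  # predictive entropy
    query = unlabeled[np.argsort(entropy)[-10:]]    # most uncertain points
    labeled = np.concatenate([labeled, query])      # oracle supplies labels
    unlabeled = np.setdiff1d(unlabeled, query)
print(clf.score(X, y))
```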
APA, Harvard, Vancouver, ISO, and other styles
44

Deshmukh, Maithili, and M. A. Pund. "Review Paper on Network Data Verification Using Machine Learning Classifiers Based on Reduced Feature Dimensions." International Journal for Research in Applied Science and Engineering Technology 10, no. 4 (April 30, 2022): 1592–95. http://dx.doi.org/10.22214/ijraset.2022.41586.

Full text
Abstract:
With the rapid development of network-based applications, new risks arise, and additional security mechanisms require attention to improve speed and accuracy. Although many new security tools have been developed, the rapid rise of malicious activity is a serious problem, and the ever-evolving attacks pose serious threats to network security. Network administrators rely heavily on intrusion detection systems to detect such network intrusion activity. A major approach is machine learning methods for intrusion detection, where we learn models from data to differentiate between abnormal and normal traffic. Although machine learning methods are often used, there are some shortcomings in the in-depth analysis of machine learning algorithms in terms of intrusion detection. In this work, we present a comprehensive analysis of some existing machine learning classifiers with respect to known intrusions into network traffic. Specifically, we analyze classification along different dimensions, that is, feature selection, sensitivity to hyperparameter selection, and class imbalance problems that are involved in intrusion detection. We evaluate several classifiers using the NSL-KDD dataset and summarize their effectiveness using a detailed experimental evaluation. Keywords: IDS, Machine Learning, Classification Algorithms, NSL-KDD Dataset, Network Intrusion Detection, Data Mining, Feature Selection, WEKA, Hyperparameters, Hyperparameter Optimization.
APA, Harvard, Vancouver, ISO, and other styles
45

Domingos, Edvaldo, Blessing Ojeme, and Olawande Daramola. "Experimental Analysis of Hyperparameters for Deep Learning-Based Churn Prediction in the Banking Sector." Computation 9, no. 3 (March 16, 2021): 34. http://dx.doi.org/10.3390/computation9030034.

Full text
Abstract:
Until recently, traditional machine learning techniques (TMLTs) such as multilayer perceptrons (MLPs) and support vector machines (SVMs) have been used successfully for churn prediction, but with significant efforts expended on the configuration of the training parameters. The selection of the right training parameters for supervised learning is almost always experimentally determined in an ad hoc manner. Deep neural networks (DNNs) have shown significant predictive strength over TMLTs when used for churn predictions. However, the more complex architecture of DNNs and their capacity to process huge amounts of non-linear input data demand more time and effort to configure the training hyperparameters for DNNs during churn modeling. This makes the process more challenging for inexperienced machine learning practitioners and researchers. So far, limited research has been done to establish the effects of different hyperparameters on the performance of DNNs during churn prediction. There is a lack of empirically derived heuristic knowledge to guide the selection of hyperparameters when DNNs are used for churn modeling. This paper presents an experimental analysis of the effects of different hyperparameters when DNNs are used for churn prediction in the banking sector. The results from three experiments revealed that the deep neural network (DNN) model performed better than the MLP when a rectifier function was used for activation in the hidden layers and a sigmoid function was used in the output layer. The performance of the DNN was better when the batch size was smaller than the size of the test set data, while the RMSProp training algorithm had better accuracy when compared with the stochastic gradient descent (SGD), Adam, AdaGrad, Adadelta, and AdaMax algorithms. The study provides heuristic knowledge that could guide researchers and practitioners in machine learning-based churn prediction from tabular data for customer relationship management in the banking sector when DNNs are used.
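A minimal Keras sketch of the configuration the experiments favoured — ReLU in the hidden layers, a sigmoid output, the RMSprop optimizer, and a batch size smaller than the held-out set — is shown below; the layer widths and random tabular data are assumptions.

```python
# Hedged sketch of the favoured churn-prediction configuration.
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 12)       # stand-in for tabular churn features
y = np.random.randint(0, 2, 1000)  # churned vs. retained

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(12,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # churn probability
])
model.compile(optimizer=keras.optimizers.RMSprop(),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=32, epochs=5, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))
```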
APA, Harvard, Vancouver, ISO, and other styles
46

Nguyen, Thanh-Tam, Son-Thai Le, and Van-Thuy Le. "Adaptive Hyperparameter for Face Recognition." International Journal of Innovative Technology and Exploring Engineering 10, no. 2 (January 10, 2021): 116–19. http://dx.doi.org/10.35940/ijitee.c8409.0110321.

Full text
Abstract:
One of the widely used prominent biometric techniques for identity authentication is face recognition. It plays an essential role in many areas, such as daily life, public security, finance, the military, and the smart school. The facial recognition task is identifying or verifying the identity of a person based on their face. The first step is face detection, which detects and locates human faces in images and videos. The face matching process then determines the identity of the detected face. In recent years, many face recognition systems have improved their performance using deep learning models, which learn representations of the face through multiple processing layers with multiple levels of feature extraction. This approach has brought substantial improvements to face recognition since 2014, launched by the breakthroughs of DeepFace and DeepID. However, finding a way to choose the best hyperparameters remains an open question. In this paper, we introduce a method for adaptive hyperparameter selection to improve recognition accuracy. The proposed method achieves improvements on three datasets.
APA, Harvard, Vancouver, ISO, and other styles
47

Pascal, Barbara, Samuel Vaiter, Nelly Pustelnik, and Patrice Abry. "Automated Data-Driven Selection of the Hyperparameters for Total-Variation-Based Texture Segmentation." Journal of Mathematical Imaging and Vision 63, no. 7 (May 29, 2021): 923–52. http://dx.doi.org/10.1007/s10851-021-01035-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Ibrahim, Y., E. Okafor, and B. Yahaya. "Optimization of RBF-SVM Hyperparameters Using Genetic Algorithm for Face Recognition." Nigerian Journal of Technology 39, no. 4 (March 24, 2021): 1190–97. http://dx.doi.org/10.4314/njt.v39i4.27.

Full text
Abstract:
Manual grid-search tuning of machine learning hyperparameters is very time-consuming. Hence, to curb this problem, we propose the use of a genetic algorithm (GA) for the selection of optimal radial-basis-function based support vector machine (RBF-SVM) hyperparameters: the regularization parameter C and the kernel parameter γ. The resulting optimal parameters were used during the training of face recognition models. To train the models, we independently extracted features from the ORL face image dataset using local binary patterns (handcrafted) and deep learning architectures (pretrained variants of VGGNet). The resulting features were passed as input to either linear-SVM or optimized RBF-SVM. The results show that the models from optimized RBF-SVM combined with deep learning or hand-crafted features yielded performances that surpass models obtained from linear-SVM combined with the aforementioned features in most of the data splits. The study demonstrated that it is profitable to optimize the hyperparameters of an SVM to obtain the best classification performance. Keywords: Face Recognition, Feature Extraction, Local Binary Patterns, Transfer Learning, Genetic Algorithm, Support Vector Machines.
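The GA loop itself is simple enough to sketch: individuals encode (log10 C, log10 γ), fitness is cross-validated accuracy, and truncation selection with averaging crossover and Gaussian mutation produces each generation. The population size, mutation scale, and the synthetic data replacing the ORL features are all assumptions.

```python
# Hedged sketch of a GA over the two RBF-SVM hyperparameters.
import random
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
random.seed(0)

def fitness(ind):
    C, gamma = 10 ** ind[0], 10 ** ind[1]  # decode log-scale genes
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

pop = [[random.uniform(-2, 3), random.uniform(-4, 1)] for _ in range(10)]
for _ in range(15):                        # generations
    pop.sort(key=fitness, reverse=True)
    parents = pop[:4]                      # truncation selection
    children = []
    while len(children) < 6:
        a, b = random.sample(parents, 2)
        child = [(ga + gb) / 2 for ga, gb in zip(a, b)]  # crossover
        children.append([g + random.gauss(0, 0.3) for g in child])  # mutate
    pop = parents + children
winner = max(pop, key=fitness)
print(winner, fitness(winner))
```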
APA, Harvard, Vancouver, ISO, and other styles
49

Brodzicki, Andrzej, Michał Piekarski, and Joanna Jaworek-Korjakowska. "The Whale Optimization Algorithm Approach for Deep Neural Networks." Sensors 21, no. 23 (November 30, 2021): 8003. http://dx.doi.org/10.3390/s21238003.

Full text
Abstract:
One of the biggest challenges in the field of deep learning is the parameter selection and optimization process. In recent years, different algorithms have been proposed, including bio-inspired solutions, to solve this problem; however, many challenges remain, including local minima, saddle points, and vanishing gradients. In this paper, we introduce the Whale Optimisation Algorithm (WOA), based on the swarm foraging behavior of humpback whales, to optimise neural network hyperparameters. We wish to stress that, to the best of our knowledge, this is the first attempt to use the Whale Optimisation Algorithm for the task of hyperparameter optimisation. After a detailed description of the WOA algorithm, we formulate and explain its application in deep learning, present the implementation, and compare the proposed algorithm with other well-known algorithms, including the widely used Grid and Random Search methods. Additionally, we have extended the original WOA algorithm with a third-dimension feature analysis to utilize a 3D search space (3D-WOA). Simulations show that the proposed algorithm can be successfully used for hyperparameter optimization, achieving accuracies of 89.85% and 80.60% for the Fashion MNIST and Reuters datasets, respectively.
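For orientation, the WOA update rules (encircling prey, spiral bubble-net attack, and random search for prey) reduce to a short loop. The sketch below applies them to a toy two-dimensional objective standing in for a validation loss, with the spiral constant b fixed to 1 — all assumptions rather than the paper's setup.

```python
# Minimal sketch of the Whale Optimisation Algorithm on a toy objective.
import math
import random

random.seed(0)

def loss(x):  # placeholder for a validation loss over two hyperparameters
    return (x[0] - 0.3) ** 2 + (x[1] + 1.2) ** 2

dim, n, iters = 2, 10, 50
whales = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
best = min(whales, key=loss)[:]           # copy: whales mutate in place

for t in range(iters):
    a = 2 - 2 * t / iters                 # 'a' decreases linearly 2 -> 0
    for w in whales:
        A = 2 * a * random.random() - a
        C = 2 * random.random()
        l = random.uniform(-1, 1)
        if random.random() < 0.5:         # encircling / search for prey
            ref = best if abs(A) < 1 else random.choice(whales)
            for d in range(dim):
                w[d] = ref[d] - A * abs(C * ref[d] - w[d])
        else:                             # spiral bubble-net attack (b = 1)
            for d in range(dim):
                D = abs(best[d] - w[d])
                w[d] = D * math.exp(l) * math.cos(2 * math.pi * l) + best[d]
    cand = min(whales, key=loss)
    if loss(cand) < loss(best):
        best = cand[:]
print(best, loss(best))
```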
APA, Harvard, Vancouver, ISO, and other styles
50

Ahmad, Waqas, Nasir Ayub, Tariq Ali, Muhammad Irfan, Muhammad Awais, Muhammad Shiraz, and Adam Glowacz. "Towards Short Term Electricity Load Forecasting Using Improved Support Vector Machine and Extreme Learning Machine." Energies 13, no. 11 (June 5, 2020): 2907. http://dx.doi.org/10.3390/en13112907.

Full text
Abstract:
Forecasting the electricity load reveals future trends, consumption patterns, and usage. There is no proper strategy to monitor energy consumption and generation, and there is high variation between them. Many strategies are used to overcome this problem. The correct selection of parameter values for a classifier is still an issue. Therefore, an optimization algorithm is applied with deep learning and machine learning techniques to select optimized values for the classifier's hyperparameters. In this paper, a novel deep learning-based method is implemented for electricity load forecasting. A three-step model is also implemented, including feature selection using a hybrid feature selector (XGBoost and decision tree), redundancy removal using a feature extraction technique (Recursive Feature Elimination), and classification/forecasting using improved Support Vector Machine (SVM) and Extreme Learning Machine (ELM). The hyperparameters of the ELM are tuned with a meta-heuristic algorithm, i.e., a Genetic Algorithm (GA), and the hyperparameters of the SVM are tuned with the Grid Search Algorithm. The simulation results, presented in graphs and tabular form, clearly show that our improved methods outperform State Of The Art (SOTA) methods in terms of accuracy and performance. The forecasting accuracy of the Extreme Learning Machine based Genetic Algorithm (ELM-GA) and the Support Vector Machine based Grid Search (SVM-GS) is 96.3% and 93.25%, respectively. The accuracies of our improved techniques, i.e., ELM-GA and SVM-GS, are 10% and 7% higher, respectively, than those of the SOTA techniques.
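As an illustration of the SVM-GS step, the sketch below grid-searches SVR hyperparameters on a synthetic hourly load series using lagged values as features; the lag window, grid values, and data are assumptions, and the ELM-GA branch is not reproduced.

```python
# Hedged sketch of grid search for SVR on a synthetic load series.
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

rng = np.random.default_rng(0)
load = np.sin(np.linspace(0, 20, 400)) + 0.1 * rng.standard_normal(400)
lags = 24  # use the previous 24 hours to predict the next hour
X = np.stack([load[i:i + lags] for i in range(len(load) - lags)])
y = load[lags:]

grid = GridSearchCV(SVR(),
                    {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1]},
                    cv=TimeSeriesSplit(n_splits=3),  # respect time order
                    scoring="neg_mean_absolute_error")
grid.fit(X, y)
print(grid.best_params_, -grid.best_score_)
```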
APA, Harvard, Vancouver, ISO, and other styles