A selection of scholarly literature on the topic "Benign overfitting"

Browse lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Benign overfitting".

Next to each work in the reference list you will find an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in your preferred citation style: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of a publication in .pdf format and read its abstract online, whenever these are available in the metadata.

Journal articles on the topic "Benign overfitting":

1. Bartlett, Peter L., Philip M. Long, Gábor Lugosi, and Alexander Tsigler. "Benign overfitting in linear regression." Proceedings of the National Academy of Sciences 117, no. 48 (April 24, 2020): 30063–70. http://dx.doi.org/10.1073/pnas.1907378117.

Abstract:
The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider when a perfect fit to training data in linear regression is compatible with accurate prediction. We give a characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization is in terms of two notions of the effective rank of the data covariance. It shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size. By studying examples of data covariance properties that this characterization shows are required for benign overfitting, we find an important role for finite-dimensional data: the accuracy of the minimum norm interpolating prediction rule approaches the best possible accuracy for a much narrower range of properties of the data distribution when the data lie in an infinite-dimensional space vs. when the data lie in a finite-dimensional space with dimension that grows faster than the sample size.
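
For readers who want to experiment with this characterization, both the minimum-norm interpolator and the two effective ranks take only a few lines of NumPy. The sketch below is our illustration with arbitrary parameter choices, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 500                                  # overparameterized: p >> n

# Diagonal covariance with a slowly decaying eigenvalue tail (our choice)
eigs = 1.0 / np.arange(1, p + 1) ** 0.5
X = rng.normal(size=(n, p)) * np.sqrt(eigs)     # rows ~ N(0, diag(eigs))
y = X[:, 0] + 0.1 * rng.normal(size=n)          # noisy labels

# Minimum-norm interpolating rule: theta = X^+ y fits the training data exactly
theta = np.linalg.pinv(X) @ y
assert np.allclose(X @ theta, y)

# The paper's two effective ranks of the covariance tail beyond index k:
# r_k = (sum_{i>k} lambda_i) / lambda_{k+1},  R_k = (sum_{i>k} lambda_i)^2 / sum_{i>k} lambda_i^2
def effective_ranks(lam, k):
    tail = lam[k:]                              # eigenvalues sorted in decreasing order
    return tail.sum() / tail[0], tail.sum() ** 2 / (tail ** 2).sum()

print(effective_ranks(eigs, k=10))
```
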
2. Peters, Evan, and Maria Schuld. "Generalization despite overfitting in quantum machine learning models." Quantum 7 (December 20, 2023): 1210. http://dx.doi.org/10.22331/q-2023-12-20-1210.

Abstract:
The widespread success of deep neural networks has revealed a surprise in classical machine learning: very complex models often generalize well while simultaneously overfitting training data. This phenomenon of benign overfitting has been studied for a variety of classical models with the goal of better understanding the mechanisms behind deep learning. Characterizing the phenomenon in the context of quantum machine learning might similarly improve our understanding of the relationship between overfitting, overparameterization, and generalization. In this work, we provide a characterization of benign overfitting in quantum models. To do this, we derive the behavior of a classical interpolating Fourier features model for regression on noisy signals, and show how a class of quantum models exhibits analogous features, thereby linking the structure of quantum circuits (such as data-encoding and state preparation operations) to overparameterization and overfitting in quantum models. We intuitively explain these features according to the ability of the quantum model to interpolate noisy data with locally "spiky" behavior and provide a concrete demonstration of benign overfitting.
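
The classical half of this comparison is easy to reproduce. Below is a minimal NumPy sketch of an overparameterized Fourier-features model interpolating a noisy signal via the minimum-norm least-squares fit; the degree and noise level are our arbitrary choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20, 40                                   # 2d + 1 = 81 features >> n samples
x = np.sort(rng.uniform(0, 2 * np.pi, n))
y = np.sin(x) + 0.3 * rng.normal(size=n)        # noisy signal

def fourier_features(t, d):
    # [1, cos(t), sin(t), ..., cos(d t), sin(d t)]
    cols = [np.ones_like(t)]
    for k in range(1, d + 1):
        cols += [np.cos(k * t), np.sin(k * t)]
    return np.column_stack(cols)

w = np.linalg.pinv(fourier_features(x, d)) @ y  # minimum-norm interpolant
grid = np.linspace(0, 2 * np.pi, 1000)
f_hat = fourier_features(grid, d) @ w           # spiky near the data, tame elsewhere
print(np.abs(fourier_features(x, d) @ w - y).max())  # ~0: perfect fit to the noise
```
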
3. Bartlett, Peter L., Andrea Montanari, and Alexander Rakhlin. "Deep learning: a statistical viewpoint." Acta Numerica 30 (May 2021): 87–201. http://dx.doi.org/10.1017/s0962492921000027.

Abstract:
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behaviour of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favourable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
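
One implicit-regularization example from the survey can be checked numerically in a few lines: on an underdetermined least-squares problem, gradient descent started from zero converges to the minimum-norm interpolating solution. The sketch below is our toy verification, not code from the article:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 200
X, y = rng.normal(size=(n, p)), rng.normal(size=n)

w = np.zeros(p)                                 # zero initialization is essential
lr = 1.0 / np.linalg.norm(X, 2) ** 2            # safe step size (spectral norm)
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)                 # gradient of 0.5 * ||Xw - y||^2

w_min_norm = np.linalg.pinv(X) @ y
print(np.linalg.norm(w - w_min_norm))           # ~0: GD found the min-norm interpolant
```
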
4. Wang, Ke, and Christos Thrampoulidis. "Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting, and Regularization." SIAM Journal on Mathematics of Data Science 4, no. 1 (March 2022): 260–84. http://dx.doi.org/10.1137/21m1415121.

5. Hu, Wei. "Understanding Surprising Generalization Phenomena in Deep Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 20 (March 24, 2024): 22669. http://dx.doi.org/10.1609/aaai.v38i20.30285.

Abstract:
Deep learning has exhibited a number of surprising generalization phenomena that are not captured by classical statistical learning theory. This talk will survey some of my work on the theoretical characterizations of several such intriguing phenomena: (1) Implicit regularization: A major mystery in deep learning is that deep neural networks can often generalize well despite their excessive expressive capacity. Towards explaining this mystery, it has been suggested that commonly used gradient-based optimization algorithms enforce certain implicit regularization which effectively constrains the model capacity. (2) Benign overfitting: In certain scenarios, a model can perfectly fit noisily labeled training data but still achieve near-optimal test error, which is very different from the classical notion of overfitting. (3) Grokking: In certain scenarios, a model initially achieves perfect training accuracy but no generalization (i.e., no better than a random predictor), and upon further training, transitions to almost perfect generalization. Theoretically establishing these properties often involves making appropriate high-dimensional assumptions on the problem as well as a careful analysis of the training dynamics.
6. Montaha, Sidratul, Sami Azam, A. K. M. Rakibul Haque Rafid, Sayma Islam, Pronab Ghosh, and Mirjam Jonkman. "A shallow deep learning approach to classify skin cancer using down-scaling method to minimize time and space complexity." PLOS ONE 17, no. 8 (August 4, 2022): e0269826. http://dx.doi.org/10.1371/journal.pone.0269826.

Abstract:
The complex feature characteristics and low contrast of cancer lesions, a high degree of inter-class resemblance between malignant and benign lesions, and the presence of various artifacts including hairs make automated melanoma recognition in dermoscopy images quite challenging. To date, various computer-aided solutions have been proposed to identify and classify skin cancer. In this paper, a deep learning model with a shallow architecture is proposed to classify the lesions into benign and malignant. To achieve effective training while limiting overfitting problems due to limited training data, image preprocessing and data augmentation processes are introduced. After this, the 'box blur' down-scaling method is employed, which adds efficiency to our study by reducing the overall training time and space complexity significantly. Our proposed shallow convolutional neural network (SCNN_12) model is trained and evaluated on the Kaggle skin cancer ISIC archive, which was augmented to 16,485 images by implementing different augmentation techniques. The model was able to achieve an accuracy of 98.87% with the Adam optimizer and a learning rate of 0.001. In this regard, the parameters and hyper-parameters of the model are determined by performing ablation studies. To assert no occurrence of overfitting, experiments are carried out exploring k-fold cross-validation and different dataset split ratios. Furthermore, to affirm the robustness, the model is evaluated on noisy data to examine the performance when the image quality gets corrupted. This research corroborates that effective training for medical image analysis, addressing training time and space complexity, is possible even with a lightweight network using a limited amount of training data.
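
The 'box blur' down-scaling step can be reproduced with Pillow in a few lines. The sketch below is a generic illustration; the blur radius, target size, and resampling filter are our assumptions, since the abstract does not specify them:

```python
from PIL import Image, ImageFilter  # Pillow

def box_blur_downscale(path, size=(120, 120), radius=2):
    """Apply a box blur before resizing so that downscaling discards less signal.
    All parameter values here are illustrative guesses, not the paper's settings."""
    img = Image.open(path).convert("RGB")
    return img.filter(ImageFilter.BoxBlur(radius)).resize(size, Image.BILINEAR)
```
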
7. Windisch, Paul, Carole Koechli, Susanne Rogers, Christina Schröder, Robert Förster, Daniel R. Zwahlen, and Stephan Bodis. "Machine Learning for the Detection and Segmentation of Benign Tumors of the Central Nervous System: A Systematic Review." Cancers 14, no. 11 (May 27, 2022): 2676. http://dx.doi.org/10.3390/cancers14112676.

Abstract:
Objectives: To summarize the available literature on using machine learning (ML) for the detection and segmentation of benign tumors of the central nervous system (CNS) and to assess the adherence of published ML/diagnostic accuracy studies to best practice. Methods: The MEDLINE database was searched for the use of ML in patients with any benign tumor of the CNS, and the records were screened according to PRISMA guidelines. Results: Eleven retrospective studies focusing on meningioma (n = 4), vestibular schwannoma (n = 4), pituitary adenoma (n = 2) and spinal schwannoma (n = 1) were included. The majority of studies attempted segmentation. Links to repositories containing code were provided in two manuscripts, and no manuscripts shared imaging data. Only one study used an external test set, which raises the question as to whether some of the good performances that have been reported were caused by overfitting and may not generalize to data from other institutions. Conclusion: Using ML for detecting and segmenting benign brain tumors is still in its infancy. Stronger adherence to ML best practices could facilitate easier comparisons between studies and contribute to the development of models that are more likely to one day be used in clinical practice.
8. Liang, ShuFen, HuiLin Liu, FangChen Yang, Chuanbo Qin, and Yue Feng. "Classification of Benign and Malignant Pulmonary Nodules Using a Regularized Extreme Learning Machine." Journal of Medical Imaging and Health Informatics 11, no. 8 (August 1, 2021): 2117–23. http://dx.doi.org/10.1166/jmihi.2021.3448.

Abstract:
An L1/L2-norm-bound extreme learning machine classification algorithm is proposed to improve the accuracy of distinguishing between benign and malignant pulmonary nodules. In this algorithm, features extracted from the segmented lung nodule using the histogram of oriented gradients method are used as inputs. The L1-norm promotes sparsity in the output-layer weights, and the L2-norm smooths them. Combining the two norms reduces the complexity of the network and prevents overfitting, improving classification accuracy. For each newly tested lung nodule, the algorithm outputs a class label of either benign or malignant. The accuracy, sensitivity, and specificity reached 94.12%, 93%, and 95%, respectively, on the Lung Image Database Consortium and Image Database Resource Initiative dataset. Compared with other algorithms, the average values of the three metrics increased by 6.5%, 7.94%, and 4.32%, respectively. An accuracy score of 95.83% can be achieved over a set of 120 urinary sediment images. Therefore, the algorithm classifies pulmonary nodules effectively.
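
A rough stand-in for this setup takes a random, untrained hidden layer (the extreme learning machine) and fits only the output weights with a combined L1/L2 penalty. The scikit-learn sketch below uses ElasticNet for that solve and synthetic features in place of HOG descriptors; it is our approximation of the formulation, not the authors' code:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 36))                   # stand-in for HOG features of nodule ROIs
y = rng.integers(0, 2, 200).astype(float)        # 0 = benign, 1 = malignant (synthetic)

# ELM: a random, untrained hidden layer ...
W, b = rng.normal(size=(36, 500)), rng.normal(size=500)
H = np.tanh(X @ W + b)

# ... then an L1+L2-penalized linear solve for the output weights only.
clf = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(H, y)
labels = (clf.predict(H) > 0.5).astype(int)      # thresholded benign/malignant labels
```
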
9. Liu, Xinwei, Xiaojun Jia, Jindong Gu, Yuan Xun, Siyuan Liang, and Xiaochun Cao. "Does Few-Shot Learning Suffer from Backdoor Attacks?" Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 18 (March 24, 2024): 19893–901. http://dx.doi.org/10.1609/aaai.v38i18.29965.

Abstract:
The field of few-shot learning (FSL) has shown promising results in scenarios where training data is limited, but its vulnerability to backdoor attacks remains largely unexplored. We explore this topic by first evaluating the performance of the existing backdoor attack methods on few-shot learning scenarios. Unlike in standard supervised learning, existing backdoor attack methods failed to perform an effective attack in FSL due to two main issues. Firstly, the model tends to overfit to either benign features or trigger features, causing a tough trade-off between attack success rate and benign accuracy. Secondly, due to the small number of training samples, the dirty label or visible trigger in the support set can be easily detected by victims, which reduces the stealthiness of attacks. It might seem, then, that FSL can survive backdoor attacks. However, in this paper, we propose the Few-shot Learning Backdoor Attack (FLBA) to show that FSL can still be vulnerable to backdoor attacks. Specifically, we first generate a trigger to maximize the gap between poisoned and benign features. It enables the model to learn both benign and trigger features, which solves the problem of overfitting. To make it more stealthy, we hide the trigger by optimizing two types of imperceptible perturbation, namely attractive and repulsive perturbation, instead of attaching the trigger directly. Once we obtain the perturbations, we can poison all samples in the benign support set into a hidden poisoned support set and fine-tune the model on it. Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms while preserving clean accuracy and maintaining stealthiness. This study reveals that few-shot learning still suffers from backdoor attacks, and its security should be given attention.
10. Doimo, Diego, Aldo Glielmo, Sebastian Goldt, and Alessandro Laio. "Redundant representations help generalization in wide neural networks." Journal of Statistical Mechanics: Theory and Experiment 2023, no. 11 (November 1, 2023): 114011. http://dx.doi.org/10.1088/1742-5468/aceb4f.

Abstract:
Deep neural networks (DNNs) defy the classical bias-variance trade-off; adding parameters to a DNN that interpolates its training data will typically improve its generalization performance. Explaining the mechanism behind this 'benign overfitting' in deep networks remains an outstanding challenge. Here, we study the last hidden layer representations of various state-of-the-art convolutional neural networks and find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information and differ from each other only by statistically independent noise. The number of these groups increases linearly with the width of the layer, but only if the width is above a critical value. We show that redundant neurons appear only when the training is regularized and the training error is zero.

Dissertations on the topic "Benign overfitting":

1. Sigalla, Suzanne. "Contributions to structured high-dimensional inference." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAG013.

Abstract:
In this thesis, we consider the following three problems: clustering in the Bipartite Stochastic Block Model, estimation of the topic-document matrix in topic models, and benign overfitting in nonparametric regression. First, we consider the graph clustering problem in the Bipartite Stochastic Block Model (BSBM). The BSBM is a non-symmetric generalization of the Stochastic Block Model, with two sets of vertices. We provide an algorithm called the Hollowed Lloyd's algorithm, which allows one to classify vertices of the smaller set with high probability. We provide statistical guarantees on this algorithm, which is computationally fast and simple to implement. We establish a sufficient condition for clustering in the BSBM. Our results improve on previous work on the BSBM, in particular in the high-dimensional regime. Second, we study the problem of assigning topics to documents using topic models. Topic models allow one to discover hidden structures in a large corpus of documents through dimension reduction. Each topic is considered as a probability distribution on the dictionary of words, and each document is considered as a mixture of topics. We introduce an algorithm called the Successive Projection Overlapping Clustering (SPOC) algorithm, inspired by the Successive Projection Algorithm for Non-negative Matrix Factorization. The SPOC algorithm is computationally fast and simple to implement. We provide statistical guarantees on the outcome of the algorithm. In particular, we provide near-matching minimax upper and lower bounds on its estimation risk under the Frobenius and the l1-norm. Our clustering procedure is adaptive in the number of topics. Finally, the third problem we study is a nonparametric regression problem. We consider local polynomial estimators with singular kernels, which we prove to be minimax optimal, adaptive to unknown smoothness, and interpolating with high probability. This property is called benign overfitting.
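
The interpolation mechanism behind the third contribution can be illustrated with the degree-zero case of a local polynomial estimator (Nadaraya-Watson) and a kernel that diverges at the origin. The following sketch is our toy one-dimensional version, with arbitrary bandwidth and singularity exponent, not the thesis's estimator:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 1, 40))
y = np.sin(6 * x) + 0.2 * rng.normal(size=40)    # noisy regression sample

def singular_nw(x0, h=0.2, a=0.49):
    """Nadaraya-Watson with singular kernel K(u) = |u|^(-a) on [-1, 1]."""
    if np.any(np.isclose(x0, x)):                # the kernel blows up at a sample point,
        return y[np.argmin(np.abs(x0 - x))]      # so the limit there is exact interpolation
    u = (x0 - x) / h
    w = np.where(np.abs(u) <= 1.0, np.abs(u) ** -a, 0.0)
    return np.sum(w * y) / np.sum(w)

grid = np.linspace(0.01, 0.99, 500)
f_hat = np.array([singular_nw(t) for t in grid]) # smooth overall, yet hits every (x_i, y_i)
```
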

Conference papers on the topic "Benign overfitting":

1. Wang, Ke, and Christos Thrampoulidis. "Benign Overfitting in Binary Classification of Gaussian Mixtures." In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413946.

2. Chretien, Stephane, and Emmanuel Caron-Parte. "Benign overfitting of fully connected Deep Nets: A Sobolev space viewpoint." In ESANN 2021 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2021. http://dx.doi.org/10.14428/esann/2021.es2021-37.

3. Chinthapally, Srinivas, Sidhardha Nuli, Arnab Das, and Akshay Hedaoo. "Method to Backout Load From Strain Gauges Using Machine Learning." In ASME 2023 Gas Turbine India Conference. American Society of Mechanical Engineers, 2023. http://dx.doi.org/10.1115/gtindia2023-118279.

Abstract:
During vibration tests, loads need to be measured for machine components such as bearings, mounts, lugs, etc. As load cells cannot be placed at certain locations in the interior, strain gauges are used instead to measure the strain. But we need a load-strain relation for the components. Most real-life components experience axial, bending, and torsional loads, so a multi-dimensional force-strain relationship needs to be established for each component. Therefore, prior to the actual tests, calibration tests are performed on each component separately. These calibration tests are most often static load tests, in which load is applied in one direction at a time in small increments. In addition to the unidirectional loads, combined loads are also applied to establish the complete load-strain surface. Multidimensional load-strain data is compiled and pre-processed to develop a multivariable load-strain relationship, which is later used to back out loads from the strain data. Currently, methods such as Newton-Raphson, plate smoothing splines, surface fitting, etc. are used to develop this relationship. Newton-Raphson is an iterative technique which solves for the loads simultaneously at each strain point, iterating until the error is small enough to achieve convergence. Newton-Raphson and surface fitting methods require structured data for developing a multivariable load-strain relationship. However, deriving load-strain relations for complex geometries exhibiting highly nonlinear relationships is mathematically complex and computationally expensive. Also, the formulation of these methods limits the use of a large number of variables (more than 3). In this paper, an ensemble of MARS (multivariate adaptive regression splines) with the AdaBoost boosting algorithm has been explored to predict loads from strain data by developing a multivariable correlation between loads and strains. MARS is an improved multivariable spline regression technique that generates piecewise polynomial functions between the variables and automatically determines the number and size of segments to achieve high accuracy or best fit on nonlinear problems. On complex, high-dimensional, noisy non-linear problems, AdaBoost helps to reduce overfitting and improve the performance of the MARS model. The proposed method is well suited for dealing with a large number of variables and developing complex non-linear load-strain relationships, as it does not require structured data or iterations, unlike existing methods, to back out loads. The proposed method reduces the run time by more than 90% compared to the conventional methods without compromising on accuracy. The advantages of the proposed method over conventional techniques are demonstrated in this paper.
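
A skeletal version of the proposed pipeline is shown below using scikit-learn. scikit-learn ships no MARS implementation, so a shallow decision tree stands in as the boosted weak learner; the paper's method would slot a MARS regressor (e.g. py-earth's Earth) into the same place, and the synthetic linear load-strain data is ours:

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)
strains = rng.normal(size=(500, 6))              # six strain-gauge channels (synthetic)
loads = strains @ rng.normal(size=6) + 0.05 * rng.normal(size=500)  # one load axis

# AdaBoost ensemble; DecisionTreeRegressor is a stand-in for a MARS base model.
model = AdaBoostRegressor(
    estimator=DecisionTreeRegressor(max_depth=4),  # 'estimator' kwarg: scikit-learn >= 1.2
    n_estimators=100,
    learning_rate=0.5,
).fit(strains, loads)

predicted_loads = model.predict(strains)         # back out loads from measured strains
```
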
