Academic literature on the topic 'Neural Network Pruning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Neural Network Pruning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Neural Network Pruning":

1

Jorgensen, Thomas D., Barry P. Haynes, and Charlotte C. F. Norlund. "Pruning Artificial Neural Networks Using Neural Complexity Measures." International Journal of Neural Systems 18, no. 05 (October 2008): 389–403. http://dx.doi.org/10.1142/s012906570800166x.

Abstract:
This paper describes a new method for pruning artificial neural networks, using a measure of the neural complexity of the neural network. This measure is used to determine the connections that should be pruned. The measure computes the information-theoretic complexity of a neural network, which is similar to, yet different from, previous research on pruning. The method proposed here shows how overly large and complex networks can be reduced in size, whilst retaining learnt behaviour and fitness. The technique proposed here helps to discover a network topology that matches the complexity of the problem it is meant to solve. This novel pruning technique is tested in a robot control domain, simulating a racecar. It is shown that the proposed pruning method is a significant improvement over the most commonly used pruning method, Magnitude Based Pruning. Furthermore, some of the pruned networks prove to be faster learners than the benchmark network that they originate from. This means that this pruning method can also help to unleash hidden potential in a network, because the learning time decreases substantially for a pruned network, due to the reduced dimensionality of the network.
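For context, the Magnitude Based Pruning baseline that the abstract compares against simply deletes the weights with the smallest absolute values. A minimal NumPy sketch of that baseline (illustrative only; the paper's complexity-based criterion is not reproduced here):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0)

w = np.random.randn(256, 128)
w_pruned = magnitude_prune(w, sparsity=0.9)
print(f"achieved sparsity: {(w_pruned == 0).mean():.2f}")  # ~0.90
```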
2

Ganguli, Tushar, and Edwin K. P. Chong. "Activation-Based Pruning of Neural Networks." Algorithms 17, no. 1 (January 21, 2024): 48. http://dx.doi.org/10.3390/a17010048.

Abstract:
We present a novel technique for pruning called activation-based pruning to effectively prune fully connected feedforward neural networks for multi-object classification. Our technique is based on the number of times each neuron is activated during model training. We compare the performance of activation-based pruning with a popular pruning method: magnitude-based pruning. Further analysis demonstrated that activation-based pruning can be considered a dimensionality reduction technique, as it leads to a sparse low-rank matrix approximation for each hidden layer of the neural network. We also demonstrate that the rank-reduced neural network generated using activation-based pruning has better accuracy than a rank-reduced network using principal component analysis. We provide empirical results to show that, after each successive pruning, the amount of reduction in the magnitude of singular values of each matrix representing the hidden layers of the network is equivalent to introducing the sum of singular values of the hidden layers as a regularization parameter to the objective function.
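A sketch of how the activation-counting idea described above could be realized in PyTorch; the counting rule (output > 0), the pruning schedule, and all names here are illustrative assumptions, not the authors' exact procedure:

```python
import torch
import torch.nn as nn

class ActivationCounter:
    """Tally how often each unit of a linear layer produces a positive
    pre-activation (a stand-in for 'neuron is activated')."""
    def __init__(self, layer: nn.Linear):
        self.counts = torch.zeros(layer.out_features)
        layer.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        self.counts += (output > 0).float().sum(dim=0)

def prune_least_active(layer: nn.Linear, counts: torch.Tensor, frac: float):
    """Zero the rows of the least-activated units; zeroed rows lower the
    effective rank of the layer, matching the low-rank view above."""
    n_prune = int(frac * layer.out_features)
    idx = torch.argsort(counts)[:n_prune]
    with torch.no_grad():
        layer.weight[idx] = 0.0
        layer.bias[idx] = 0.0
```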
3

Koene, Randal A., and Yoshio Takane. "Discriminant Component Pruning: Regularization and Interpretation of Multilayered Backpropagation Networks." Neural Computation 11, no. 3 (April 1, 1999): 783–802. http://dx.doi.org/10.1162/089976699300016665.

Abstract:
Neural networks are often employed as tools in classification tasks. The use of large networks increases the likelihood of the task's being learned, although it may also lead to increased complexity. Pruning is an effective way of reducing the complexity of large networks. We present discriminant components pruning (DCP), a method of pruning matrices of summed contributions between layers of a neural network. Pruning the network can aid attempts to interpret the underlying functions it has learned, and generalization performance should be maintained at its optimal level following pruning. We demonstrate DCP's effectiveness at maintaining generalization performance, its applicability to a wider range of problems, and the usefulness of such pruning for network interpretation. Possible enhancements are discussed for the identification of the optimal reduced rank and the inclusion of nonlinear neural activation functions in the pruning algorithm.
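DCP's discriminant criterion is not reproduced here, but the generic reduced-rank operation it builds on is easy to sketch with a plain SVD truncation:

```python
import numpy as np

def reduced_rank(W: np.ndarray, rank: int) -> np.ndarray:
    """Best least-squares approximation of W with the given rank."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

W = np.random.randn(100, 50)
W5 = reduced_rank(W, rank=5)  # storable as two factors: 100x5 and 5x50
```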
4

Ling, Xing. "Summary of Deep Neural Network Pruning Algorithms." Applied and Computational Engineering 8, no. 1 (August 1, 2023): 352–61. http://dx.doi.org/10.54254/2755-2721/8/20230182.

Abstract:
As deep learning has rapidly progressed in the 21st century, artificial neural networks have been continuously enhanced with deeper structures and larger parameter sets to tackle increasingly complex problems. However, this development also brings about the drawbacks of high computational and storage costs, which limit the application of neural networks in some practical scenarios. As a result, in recent years, more researchers have suggested and implemented network pruning techniques to decrease neural networks' computational and storage expenses while retaining the same level of accuracy. This paper reviews the research progress of network pruning techniques and categorizes them into unstructured and structured pruning. Finally, the shortcomings of current pruning techniques and possible future development directions are pointed out.
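The survey's two categories can be made concrete with a schematic PyTorch sketch; the layer sizes, the median threshold, and the keep-every-other-filter rule are arbitrary illustrations:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(64, 128, kernel_size=3)

# Unstructured pruning: zero individual weights. The tensor keeps its
# shape, so real speedups require sparse kernels or hardware support.
with torch.no_grad():
    mask = conv.weight.abs() > conv.weight.abs().median()
    conv.weight.mul_(mask)

# Structured pruning: drop whole filters. The layer physically shrinks,
# so dense hardware benefits directly.
keep = torch.arange(0, 128, 2)
slim = nn.Conv2d(64, len(keep), kernel_size=3)
with torch.no_grad():
    slim.weight.copy_(conv.weight[keep])
    slim.bias.copy_(conv.bias[keep])
```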
5

Gong, Ziyi, Huifu Zhang, Hao Yang, Fangjun Liu, and Fan Luo. "A Review of Neural Network Lightweighting Techniques." Innovation & Technology Advances 1, no. 2 (January 16, 2024): 1–16. http://dx.doi.org/10.61187/ita.v1i2.36.

Abstract:
The application of portable devices based on deep learning has become increasingly widespread, which has made the deployment of complex neural networks on embedded devices a hot research topic. Neural network lightweighting is one of the key technologies for applying neural networks to embedded devices. This paper elaborates and analyzes neural network lightweighting techniques from two aspects: model pruning and network structure design. For model pruning, a comparison of methods from different periods is conducted, highlighting their advantages and limitations. Regarding network structure design, the principles of four classical lightweight network designs are described from a mathematical perspective, and the latest optimization methods for these networks are reviewed. Finally, potential research directions for lightweight neural network pruning and structure design optimization are discussed.
6

Guo, Changyi, and Ping Li. "Hybrid Pruning Method Based on Convolutional Neural Network Sensitivity and Statistical Threshold." Journal of Physics: Conference Series 2171, no. 1 (January 1, 2022): 012055. http://dx.doi.org/10.1088/1742-6596/2171/1/012055.

Abstract:
The hybrid pruning algorithm can not only preserve the precision of the network but also achieve a good balance between pruning ratio and computation. Traditional pruning algorithms, however, use either coarse-grained or fine-grained pruning, which leads to a trade-off between pruning rate and amount of computation. To this end, this paper presents a hybrid pruning method based on sensitivity and statistical thresholds. First, coarse-grained pruning is carried out on the network: a fast sensitivity test is conducted on the network's convolutional layers to determine the channels that need pruning within the tolerated range of precision decline. Then, fine-grained pruning is performed: the weights of the pruned network are tallied, per-layer weight thresholds are calculated, and weights below the thresholds are deleted so as to further reduce the size of the network and the amount of calculation. Hybrid pruning performs very well on AlexNet and ResNet networks. In particular, with the proposed method on the CIFAR-10 dataset, FLOPs are compressed by 60% while the parameter count is compressed by nearly 80%. Compared with a single pruning method, hybrid pruning is better.
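The fine-grained step of the method, a per-layer statistical threshold, might be sketched as below; the particular formula (mean plus k standard deviations of the absolute weights) is an assumption, since the abstract does not give it:

```python
import numpy as np

def statistical_threshold_prune(W: np.ndarray, k: float = 1.0) -> np.ndarray:
    """Delete weights whose magnitude falls below a threshold computed
    from the layer's own weight statistics (assumed form)."""
    a = np.abs(W)
    t = a.mean() + k * a.std()
    return np.where(a >= t, W, 0.0)
```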
7

Zou, Yunhuan. "Research On Pruning Methods for Mobilenet Convolutional Neural Network." Highlights in Science, Engineering and Technology 81 (January 26, 2024): 232–36. http://dx.doi.org/10.54097/a742e326.

Abstract:
This paper comprehensively reviews pruning methods for MobileNet convolutional neural networks. MobileNet is a lightweight convolutional neural network suitable for resource-constrained environments such as mobile devices. Various pruning methods can be applied to reduce the model's storage space and computational complexity, including channel pruning, kernel pruning, and weight pruning. Channel pruning removes unimportant channels to reduce redundant parameters and computations in the model, while kernel pruning reduces redundant calculations by pruning convolutional kernels. Weight pruning involves setting small-weighted elements to zero to remove unimportant weights. These pruning methods can be used individually or in combination. After pruning, fine-tuning is necessary to restore the model's performance. Factors such as pruning rate, pruning order, and pruning location need to be considered to balance the reduction in model size and computational complexity against performance loss. Pruning methods based on MobileNet convolutional neural networks reduce the parameter count and computational complexity, making models lighter and more efficient at inference. These methods are of significant value in resource-constrained environments such as mobile devices. This review provides insights into pruning methods for MobileNet convolutional neural networks and their applications in lightweight and efficient model deployment. Further advancements, such as automated pruning driven by reinforcement learning, can enhance the pruning process to achieve optimal model compression. Future research should focus on adapting and optimizing these pruning methods for specific problem domains and on achieving even higher compression ratios and computational speedups.
8

Liang, Ling, Lei Deng, Yueling Zeng, Xing Hu, Yu Ji, Xin Ma, Guoqi Li, and Yuan Xie. "Crossbar-Aware Neural Network Pruning." IEEE Access 6 (2018): 58324–37. http://dx.doi.org/10.1109/access.2018.2874823.

9

Tsai, Feng-Sheng, Yi-Li Shih, Chin-Tzong Pang, and Sheng-Yi Hsu. "Formulation of Pruning Maps with Rhythmic Neural Firing." Mathematics 7, no. 12 (December 17, 2019): 1247. http://dx.doi.org/10.3390/math7121247.

Abstract:
Rhythmic neural firing is thought to underlie the operation of neural function. This triggers the construction of dynamical network models to investigate how the rhythms interact with each other. Recently, an approach concerning neural path pruning has been proposed in a dynamical network system, in which critical neuronal connections are identified and adjusted according to the pruning maps, enabling neurons to produce rhythmic, oscillatory activity in simulation. Here, we construct a sort of homomorphic functions based on different rhythms of neural firing in network dynamics. Armed with the homomorphic functions, the pruning maps can be simply expressed in terms of interactive rhythms of neural firing and allow a concrete analysis of coupling operators to control network dynamics. Such formulation of pruning maps is applied to probe the consolidation of rhythmic patterns between layers of neurons in feedforward neural networks.
10

Wang, Miao, Xu Yang, Yunchong Qian, Yunlin Lei, Jian Cai, Ziyi Huan, Xialv Lin, and Hao Dong. "Adaptive Neural Network Structure Optimization Algorithm Based on Dynamic Nodes." Current Issues in Molecular Biology 44, no. 2 (February 7, 2022): 817–32. http://dx.doi.org/10.3390/cimb44020056.

Abstract:
Large-scale artificial neural networks have many redundant structures, making the network prone to local optima and extended training time. Moreover, existing neural network topology optimization algorithms suffer from heavy computation and complex network structure modeling. We propose a Dynamic Node-based neural network Structure optimization algorithm (DNS) to handle these issues. DNS consists of two steps: a generation step and a pruning step. In the generation step, the network generates hidden layers layer by layer until accuracy reaches the threshold. Then, in the pruning step, the network uses a pruning algorithm based on Hebb's rule or Pearson's correlation for adaptation. In addition, we combine a genetic algorithm with DNS to optimize it (GA-DNS). Experimental results show that, compared with traditional neural network topology optimization algorithms, GA-DNS can generate neural networks with higher construction efficiency, lower structural complexity, and higher classification accuracy.
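The Pearson-correlation pruning step mentioned above suggests a simple sketch: correlate neurons' recorded activations over a batch and drop one neuron of each highly correlated pair. The 0.95 threshold and the keep-the-earlier-neuron rule are assumptions:

```python
import numpy as np

def correlated_neurons(acts: np.ndarray, threshold: float = 0.95):
    """acts: (n_samples, n_neurons) activations recorded over a batch.
    Returns indices of neurons that duplicate an earlier, kept neuron."""
    corr = np.corrcoef(acts.T)
    drop = set()
    for i in range(corr.shape[0]):
        if i in drop:
            continue
        for j in range(i + 1, corr.shape[0]):
            if j not in drop and corr[i, j] > threshold:
                drop.add(j)
    return sorted(drop)
```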

Dissertations / Theses on the topic "Neural Network Pruning":

1

Scalco, Alberto <1993>. "Feature Selection Using Neural Network Pruning." Master's Degree Thesis, Università Ca' Foscari Venezia, 2019. http://hdl.handle.net/10579/14382.

Abstract:
Feature selection is a well-known technique for data preprocessing, with the purpose of removing redundant and irrelevant information, with benefits including improved generalization and a reduced curse of dimensionality. This thesis investigates an approach based on a trained neural network model, where features are selected by iteratively removing a node in the input layer. This pruning process comprises a node selection criterion and a subsequent weight correction: after a node is eliminated, the remaining weights are adjusted so that the overall network behaviour does not worsen over the entire training set. The pruning problem is formulated as a system of linear equations solved in a least-squares sense. This method allows the direct evaluation of performance at each iteration, and a stopping condition is also proposed. Finally, experimental results are presented in comparison to another feature selection method.
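The pruning step described in the abstract, removing an input node and correcting the remaining weights in a least-squares sense, might look like the following sketch; the plain `lstsq` formulation and the names are assumptions:

```python
import numpy as np

def prune_input_feature(X: np.ndarray, W: np.ndarray, drop: int):
    """X: (n_samples, n_features) training inputs; W: (n_features, n_hidden)
    first-layer weights. After deleting one input node, refit the surviving
    weights so the layer's pre-activations over the training set change as
    little as possible."""
    target = X @ W                      # behaviour to preserve
    X_red = np.delete(X, drop, axis=1)
    W_new, *_ = np.linalg.lstsq(X_red, target, rcond=None)
    return X_red, W_new
```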
2

Labarge, Isaac E. "Neural Network Pruning for ECG Arrhythmia Classification." DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2136.

Abstract:
Convolutional Neural Networks (CNNs) are a widely accepted means of solving complex classification and detection problems in imaging and speech. However, problem complexity often leads to considerable increases in computation and parameter storage costs. Many successful attempts have been made to effectively reduce these overheads by pruning and compressing large CNNs with only a slight decline in model accuracy. In this study, two pruning methods are implemented and compared on the CIFAR-10 database and an ECG arrhythmia classification task. Each pruning method employs a pruning phase interleaved with a fine-tuning phase. It is shown that when performing the scale-factor pruning algorithm on ECG, fine-tuning time can be expedited by 1.4 times over the traditional approach with only 10% of expensive floating-point operations retained, while experiencing no significant impact on accuracy.
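Scale-factor pruning is commonly implemented by ranking channels by the magnitude of their batch-normalization scale γ (the "network slimming" recipe); whether the thesis follows exactly this recipe is an assumption of the sketch below:

```python
import torch
import torch.nn as nn

def channels_to_prune(bn: nn.BatchNorm2d, frac: float) -> torch.Tensor:
    """Return indices of the channels with the smallest BN scale |gamma|,
    which the scale-factor heuristic treats as least important."""
    gamma = bn.weight.detach().abs()
    n_prune = int(frac * gamma.numel())
    return torch.argsort(gamma)[:n_prune]
```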
3

Brantley, Kiante. "BCAP: An Artificial Neural Network Pruning Technique to Reduce Overfitting." Thesis, University of Maryland, Baltimore County, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10140605.

Abstract:

Determining the optimal size of a neural network is complicated. Neural networks, with many free parameters, can be used to solve very complex problems. However, these neural networks are susceptible to overfitting. BCAP (Brantley-Clark Artificial Neural Network Pruning Technique) addresses overfitting by combining duplicate neurons in a neural network hidden layer, thereby forcing the network to learn more distinct features. We compare hidden units using cosine similarity, and combine those that are similar to each other within a threshold ϵ. By doing so, the co-adaptation of the neurons in the network is reduced, because hidden units that are highly correlated (i.e., similar) are combined. In this paper we show evidence that BCAP is successful in reducing network size while maintaining accuracy, or improving accuracy of neural networks during and after training.
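A minimal sketch of the compare-and-combine step as the abstract describes it; how the outgoing weights of merged units are combined (summed here) is an assumption:

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def merge_similar_units(W_in: np.ndarray, W_out: np.ndarray, eps: float = 0.05):
    """W_in: (n_hidden, n_in) incoming weights per hidden unit;
    W_out: (n_hidden, n_out) outgoing weights. Merge pairs of units whose
    incoming-weight cosine similarity exceeds 1 - eps."""
    keep = list(range(W_in.shape[0]))
    i = 0
    while i < len(keep):
        j = i + 1
        while j < len(keep):
            if cosine(W_in[keep[i]], W_in[keep[j]]) > 1 - eps:
                W_out[keep[i]] += W_out[keep[j]]  # fold unit j into unit i
                keep.pop(j)
            else:
                j += 1
        i += 1
    return W_in[keep], W_out[keep]
```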

4

Hubens, Nathan. "Towards lighter and faster deep neural networks with parameter pruning." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS025.

Abstract:
Since their resurgence in 2012, Deep Neural Networks have become ubiquitous in most disciplines of Artificial Intelligence, such as image recognition, speech processing, and Natural Language Processing. However, over the last few years, neural networks have grown exponentially deeper, involving more and more parameters. Nowadays, it is not unusual to encounter architectures involving several billions of parameters, while they mostly contained thousands less than ten years ago. This generalized increase in the number of parameters makes such large models compute-intensive and essentially energy inefficient. This makes deployed models costly to maintain, and their use in resource-constrained environments very challenging. For these reasons, much research has been conducted to provide techniques reducing the amount of storage and computing required by neural networks. Among those techniques, neural network pruning, consisting in creating sparsely connected models, has been recently at the forefront of research. However, although pruning is a prevalent compression technique, there is currently no standard way of implementing or evaluating novel pruning techniques, making the comparison with previous research challenging. Our first contribution thus concerns a novel description of pruning techniques, developed according to four axes, and allowing us to unequivocally and completely define currently existing pruning techniques. Those components are: the granularity, the context, the criteria, and the schedule. Defining the pruning problem according to those components allows us to subdivide the problem into four mostly independent subproblems and also to better determine potential research lines. Moreover, pruning methods are still in an early development stage, and primarily designed for the research community. Indeed, most pruning works are usually implemented in a self-contained and sophisticated way, making it troublesome for non-researchers to apply such techniques without having to learn all the intricacies of the field. To fill this gap, we proposed the FasterAI toolbox, intended to be helpful to researchers, eager to create and experiment with different compression techniques, but also to newcomers, who desire to compress their neural network for concrete applications. In particular, the sparsification capabilities of FasterAI have been built according to the previously defined pruning components, allowing for a seamless mapping between research ideas and their implementation. We then propose four theoretical contributions, each one aiming at providing new insights and improving on state-of-the-art methods in each of the four identified description axes. Also, those contributions have been realized by using the previously developed toolbox, thus validating its scientific utility. Finally, to validate the applicative character of the pruning technique, we have selected a use case: the detection of facial manipulation, also called DeepFakes Detection. The goal is to demonstrate that the developed tool, as well as the different proposed scientific contributions, can be applicable to a complex and actual problem. This last contribution is accompanied by a proof-of-concept application, providing DeepFake detection capabilities in a web-based environment, thus allowing anyone to perform detection on an image or video of their choice. This Deep Learning era has emerged thanks to the considerable improvements in high-performance hardware and access to a large amount of data. However, since the decline of Moore's Law, experts have been suggesting that we might observe a shift in how we conceptualize hardware, going from task-agnostic to domain-specialized computations, thus leading to a new era of collaboration between the software, hardware, and machine learning communities. This new quest for more efficiency will thus undeniably go through neural network compression techniques, and particularly sparse computations.
5

Santacroce, Michael. "Neural Classification of Malware-As-Video with Considerations for In-Hardware Inferencing." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1554216974556897.

6

Dupont, Robin. "Deep Neural Network Compression for Visual Recognition." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS565.

Abstract:
Thanks to the miniaturisation of electronics, embedded devices have become ubiquitous since the 2010s, performing various tasks around us. As their usage expands, there's an increasing demand for efficient data processing and decision-making. Deep neural networks are apt tools for this, but they are often too large and intricate for embedded systems. Therefore, methods to compress these networks without affecting their performance are crucial. This PhD thesis introduces two methods focused on pruning to compress networks, maintaining accuracy. The thesis first details a budget-aware method for compressing large neural networks using weight reparametrisation and a budget loss, eliminating the need for fine-tuning. Traditional pruning methods often use post-training indicators to cut weights, ignoring desired pruning rates. Our method incorporates a budget loss, directing pruning during training, enabling simultaneous topology and weight optimisation. By soft-pruning smaller weights via reparametrisation, we reduce accuracy loss compared to standard pruning. We validate our method on several datasets and architectures. Later, the thesis examines extracting efficient subnetworks without weight training. We aim to discern the optimal subnetwork topology within a large network, bypassing weight optimisation yet ensuring strong performance. This is realized with our Arbitrarily Shifted Log Parametrisation, a differentiable method for discrete topology sampling, facilitating masks' training to denote weight selection probability. Additionally, a weight recalibration technique, Smart Rescale, is presented. It boosts extracted subnetworks' performance and hastens their training. Our method identifies the best pruning rate in a single training cycle, averting exhaustive hyperparameter searches and various rate training. Through extensive tests, our technique consistently surpasses similar state-of-the-art methods, creating streamlined networks that achieve high sparsity without notable accuracy drops
7

Prono, Luciano. "Methods and Applications for Low-power Deep Neural Networks on Edge Devices." Doctoral thesis, Politecnico di Torino, 2023. https://hdl.handle.net/11583/2976593.

8

Zullich, Marco. "Un'analisi delle Tecniche di Potatura in Reti Neurali Profonde: Studi Sperimentali ed Applicazioni." Doctoral thesis, Università degli Studi di Trieste, 2023. https://hdl.handle.net/11368/3041099.

Abstract:
Pruning, in the context of Machine Learning, denotes the act of removing parameters from parametric models, such as linear models, decision trees, and ANNs. Pruning can be motivated by several necessities, first and foremost the reduction in the size and the memory footprint of a model, possibly without hurting its accuracy. The interest of the scientific community in pruning applied to ANNs has increased substantially in the last decade due to the dramatic expansion in the size of these models. This can hinder the implementation of ANNs on lower-end computers, also posing a burden to the democratization of Artificial Intelligence. Recent advances in pruning techniques have shown empirically that a large portion of parameters (even over 99%) can be removed with no or minimal loss in accuracy. Despite this, open questions on the matter still remain, especially regarding the inner dynamics of pruning concerning, e.g., the way features learned by pruned ANNs relate to those of their dense versions, or the ability of pruned ANNs to generalize to data or environments unseen during training. In addition, pruning is often computationally expensive and poses notable issues concerning high energy consumption and pollution. We present some approaches for tackling the aforementioned issues: comparing representations/features learned by pruned ANNs, improving the time-efficiency of pruning, and applying pruning to simulated robots, with an eye on generalization. Finally, we showcase the use of pruning for deploying, on a low-end device with limited memory, a large object detection model for face mask detection, envisioning an application of the model to video surveillance.
9

Yvinec, Edouard. "Efficient Neural Networks : Post Training Pruning and Quantization." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS581.

Abstract:
Deep neural networks have grown to be the most widely adopted models for most computer vision and natural language processing tasks. Since the renewed interest sparked in 2012 for these architectures in machine learning, their size, in terms of both memory footprint and computational cost, has increased tremendously, which has hindered their deployment. In particular, with the rising interest in generative AI, such as large language models and diffusion models, this phenomenon has recently reached new heights, as these models can weigh several billion parameters and require multiple high-end GPUs in order to infer in real time. In response, the deep learning community has researched methods to compress and accelerate these models. These methods are: efficient architecture design, tensor decomposition, pruning, and quantization. In this manuscript, I paint a landscape of the current state of the art in deep neural network compression and acceleration, as well as my contributions to the field. First, I propose a general introduction to the aforementioned techniques and highlight their shortcomings and current challenges. Second, I provide a detailed discussion of my contributions to the field of deep neural network pruning. These contributions led to the publication of three articles: RED, RED++ and SInGE. In RED and RED++, I introduced a novel way to perform data-free pruning and tensor decomposition based on redundancy reduction. On the flip side, in SInGE, I proposed a new importance-based criterion for data-driven pruning. This criterion was inspired by attribution techniques, which consist in ranking inputs by their relative importance with respect to the final prediction. In SInGE, I adapted one of the most effective attribution techniques to weight importance ranking for pruning. In the third chapter, I lay out my contributions to the field of deep quantization: SPIQ, PowerQuant, REx, NUPES, and a best-practice paper. Each of these methods addresses one of the previous limitations of post-training quantization. In SPIQ, PowerQuant and REx, I provide a solution to the granularity limitations of quantization, a novel non-uniform format which is particularly effective on transformer architectures, and a technique for quantization decomposition which eliminates the need for unsupported bit-widths, respectively. In the two remaining articles, I provide significant improvements over existing gradient-based post-training quantization techniques, bridging the gap between such techniques and non-uniform quantization. In the last chapter, I propose a set of leads for future work which I believe to be the current most important unanswered questions in the field.
10

Brigandì, Camilla. "Utilizzo della omologia persistente nelle reti neurali." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022.

Abstract:
The aim of this thesis is to introduce some applications of algebraic topology, and in particular of persistent homology theory, to neural networks. To this end, the first chapter introduces the concepts of the artificial neuron and the artificial neural network. Particular attention is paid to the training of a network, also explaining related issues and characteristics such as the overfitting problem and generalization ability. The same chapter also presents the concept of similarity between two networks and the concept of pruning, and rigorously defines classification problems. The second chapter introduces the basic notions of persistent homology, provides useful tools for visualizing and comparing these notions (barcodes and persistence diagrams), and presents methods for constructing simplicial complexes from graphs or sets of points in R^d. The third and final chapter reports the application results referred to at the beginning of this abstract. In particular, it presents research based on persistent homology concerning the creation of expressiveness and similarity measures for neural architectures, the development of a pruning method, the creation of a neural network resistant to adversarial attacks of a given magnitude, and some results on the topological modification of the data processed by a neural architecture.

Books on the topic "Neural Network Pruning":

1

Hong, X. A Givens rotation based fast backward elimination algorithm for RBF neural network pruning. Sheffield: University of Sheffield, Dept. of Automatic Control and Systems Engineering, 1996.

2

Jorgensen, Charles C., and Ames Research Center, eds. Toward a more robust pruning procedure for MLP networks. Moffett Field, Calif: National Aeronautics and Space Administration, Ames Research Center, 1998.

3

Multiple Comparison Pruning of Neural Networks. Storming Media, 1999.


Book chapters on the topic "Neural Network Pruning":

1

Chen, Jinting, Zhaocheng Zhu, Cheng Li, and Yuming Zhao. "Self-Adaptive Network Pruning." In Neural Information Processing, 175–86. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-36708-4_15.

2

Gridin, Ivan. "Model Pruning." In Automated Deep Learning Using Neural Network Intelligence, 319–55. Berkeley, CA: Apress, 2022. http://dx.doi.org/10.1007/978-1-4842-8149-9_6.

3

Pei, Songwen, Jie Luo, and Sheng Liang. "DRP: Discrete Rank Pruning for Neural Network." In Lecture Notes in Computer Science, 168–79. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-21395-3_16.

4

Widmann, Thomas, Florian Merkle, Martin Nocker, and Pascal Schöttle. "Pruning for Power: Optimizing Energy Efficiency in IoT with Neural Network Pruning." In Engineering Applications of Neural Networks, 251–63. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-34204-2_22.

5

Gong, Saijun, Lin Chen, and Zhicheng Dong. "Neural Network Pruning via Genetic Wavelet Channel Search." In Neural Information Processing, 348–58. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-92270-2_30.

6

Li, Wenrui, and Jo Plested. "Pruning Convolutional Neural Network with Distinctiveness Approach." In Communications in Computer and Information Science, 448–55. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-36802-9_48.

7

Wu, Jia-Liang, Haopu Shang, Wenjing Hong, and Chao Qian. "Robust Neural Network Pruning by Cooperative Coevolution." In Lecture Notes in Computer Science, 459–73. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-14714-2_32.

8

Yang, Yang, and Baoliang Lu. "Structure Pruning Strategies for Min-Max Modular Network." In Advances in Neural Networks — ISNN 2005, 646–51. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11427391_103.

9

Zhao, Feifei, Tielin Zhang, Yi Zeng, and Bo Xu. "Towards a Brain-Inspired Developmental Neural Network by Adaptive Synaptic Pruning." In Neural Information Processing, 182–91. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-70093-9_19.

10

Pei, Songwen, Yusheng Wu, and Meikang Qiu. "Neural Network Compression and Acceleration by Federated Pruning." In Algorithms and Architectures for Parallel Processing, 173–83. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60239-0_12.


Conference papers on the topic "Neural Network Pruning":

1

Shang, Haopu, Jia-Liang Wu, Wenjing Hong, and Chao Qian. "Neural Network Pruning by Cooperative Coevolution." In Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/667.

Abstract:
Neural network pruning is a popular model compression method which can significantly reduce the computing cost with negligible loss of accuracy. Recently, filters are often pruned directly by designing proper criteria or using auxiliary modules to measure their importance, which, however, requires expertise and trial-and-error. Due to the advantage of automation, pruning by evolutionary algorithms (EAs) has attracted much attention, but the performance is limited for deep neural networks as the search space can be quite large. In this paper, we propose a new filter pruning algorithm CCEP by cooperative coevolution, which prunes the filters in each layer by EAs separately. That is, CCEP reduces the pruning space by a divide-and-conquer strategy. The experiments show that CCEP can achieve a competitive performance with the state-of-the-art pruning methods, e.g., prune ResNet56 for 63.42% FLOPs on CIFAR10 with -0.24% accuracy drop, and ResNet50 for 44.56% FLOPs on ImageNet with 0.07% accuracy drop.
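As a toy illustration of the layer-wise evolutionary search described above (not CCEP's actual encoding, fitness function, or coevolution loop), a (1+1)-style search over a binary keep-mask for one layer could look like this, with the fitness supplied by the caller:

```python
import numpy as np

def evolve_layer_mask(n_filters: int, fitness, iters: int = 100,
                      p_flip: float = 0.1, seed: int = 0) -> np.ndarray:
    """fitness(mask) -> float should trade accuracy off against FLOPs;
    its definition is an assumption left to the caller."""
    rng = np.random.default_rng(seed)
    mask = np.ones(n_filters, dtype=bool)
    best = fitness(mask)
    for _ in range(iters):
        child = mask ^ (rng.random(n_filters) < p_flip)  # bit-flip mutation
        score = fitness(child)
        if score >= best:                                # accept if no worse
            mask, best = child, score
    return mask
```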
2

Wang, Huan, Can Qin, Yue Bai, Yulun Zhang, and Yun Fu. "Recent Advances on Neural Network Pruning at Initialization." In Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/786.

Abstract:
Neural network pruning typically removes connections or neurons from a pretrained, converged model, while a newer pruning paradigm, pruning at initialization (PaI), attempts to prune a randomly initialized network. This paper offers the first survey concentrated on this emerging pruning fashion. We first introduce a generic formulation of neural network pruning, followed by the major classic pruning topics. Then, as the main body of this paper, a thorough and structured literature review of PaI methods is presented, consisting of two major tracks (sparse training and sparse selection). Finally, we summarize the surge of PaI compared to pruning after training (PaT) and discuss the open problems. Apart from the dedicated literature review, this paper also offers a code base for easy sanity-checking and benchmarking of different PaI methods.
3

Zhao, Chenglong, Bingbing Ni, Jian Zhang, Qiwei Zhao, Wenjun Zhang, and Qi Tian. "Variational Convolutional Neural Network Pruning." In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019. http://dx.doi.org/10.1109/cvpr.2019.00289.

4

Cai, Xingyu, Jinfeng Yi, Fan Zhang, and Sanguthevar Rajasekaran. "Adversarial Structured Neural Network Pruning." In CIKM '19: The 28th ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3357384.3358150.

5

Lin, Chih-Chia, Chia-Yin Liu, Chih-Hsuan Yen, Tei-Wei Kuo, and Pi-Cheng Hsiu. "Intermittent-Aware Neural Network Pruning." In 2023 60th ACM/IEEE Design Automation Conference (DAC). IEEE, 2023. http://dx.doi.org/10.1109/dac56929.2023.10247825.

6

Shahhosseini, Sina, Ahmad Albaqsami, Masoomeh Jasemi, and Nader Bagherzadeh. "Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks." In 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). IEEE, 2020. http://dx.doi.org/10.1109/pdp50117.2020.00053.

7

Jeong, Taehee, Ehsam Ghasemi, Jorn Tuyls, Elliott Delaye, and Ashish Sirasao. "Neural network pruning and hardware acceleration." In 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC). IEEE, 2020. http://dx.doi.org/10.1109/ucc48980.2020.00069.

8

Xu, Sheng, Anran Huang, Lei Chen, and Baochang Zhang. "Convolutional Neural Network Pruning: A Survey." In 2020 39th Chinese Control Conference (CCC). IEEE, 2020. http://dx.doi.org/10.23919/ccc50068.2020.9189610.

9

Molchanov, Pavlo, Arun Mallya, Stephen Tyree, Iuri Frosio, and Jan Kautz. "Importance Estimation for Neural Network Pruning." In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019. http://dx.doi.org/10.1109/cvpr.2019.01152.

10

Setiono, R., and A. Gaweda. "Neural network pruning for function approximation." In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium. IEEE, 2000. http://dx.doi.org/10.1109/ijcnn.2000.859435.


Reports on the topic "Neural Network Pruning":

1

Guan, Hui, Xipeng Shen, Seung-Hwan Lim, and Robert M. Patton. Composability-Centered Convolutional Neural Network Pruning. Office of Scientific and Technical Information (OSTI), February 2018. http://dx.doi.org/10.2172/1427608.

