Doctoral dissertations on the topic "Neural Network Pruning"
Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles
Consult the 34 best doctoral dissertations for your research on the topic "Neural Network Pruning".
Next to every source in the list of references there is an "Add to bibliography" button. Use it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a ".pdf" file and read its abstract online, whenever the relevant parameters are available in the work's metadata.
Browse doctoral dissertations on a wide variety of disciplines and compile your bibliography correctly.
Scalco, Alberto <1993>. "Feature Selection Using Neural Network Pruning". Master's Degree Thesis, Università Ca' Foscari Venezia, 2019. http://hdl.handle.net/10579/14382.
Labarge, Isaac E. "Neural Network Pruning for ECG Arrhythmia Classification". DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2136.
Brantley, Kiante. "BCAP: An Artificial Neural Network Pruning Technique to Reduce Overfitting". Thesis, University of Maryland, Baltimore County, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10140605.
Determining the optimal size of a neural network is complicated. Neural networks, with many free parameters, can be used to solve very complex problems. However, these neural networks are susceptible to overfitting. BCAP (Brantley-Clark Artificial Neural Network Pruning Technique) addresses overfitting by combining duplicate neurons in a neural network hidden layer, thereby forcing the network to learn more distinct features. We compare hidden units using cosine similarity, and combine those that are similar to each other within a threshold ϵ. By doing so, the co-adaptation of the neurons in the network is reduced, because hidden units that are highly correlated (i.e., similar) are combined. In this paper we show evidence that BCAP is successful in reducing network size while maintaining, or even improving, the accuracy of neural networks during and after training.
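The BCAP abstract is concrete enough to sketch: hidden units whose feature vectors are nearly parallel under cosine similarity get merged, reducing co-adaptation. Below is a minimal NumPy illustration, assuming the compared vectors are the units' incoming weights and that merging sums the absorbed unit's outgoing weights into the kept one; the thesis's exact comparison and merge rules may differ, and all names here are hypothetical.

```python
import numpy as np

def bcap_merge(W_in, W_out, eps=0.05):
    """Merge near-duplicate hidden units (cosine similarity >= 1 - eps).

    W_in:  (n_hidden, n_inputs)  incoming weights of one hidden layer
    W_out: (n_outputs, n_hidden) outgoing weights of that layer
    Returns the reduced (W_in, W_out). Illustrative sketch, not the thesis code.
    """
    keep = list(range(W_in.shape[0]))
    i = 0
    while i < len(keep):
        j = i + 1
        while j < len(keep):
            a, b = W_in[keep[i]], W_in[keep[j]]
            cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
            if cos >= 1.0 - eps:                        # near-duplicate feature
                W_out[:, keep[i]] += W_out[:, keep[j]]  # absorb unit j into unit i
                keep.pop(j)                             # prune the duplicate
            else:
                j += 1
        i += 1
    return W_in[keep], W_out[:, keep]
```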
Hubens, Nathan. "Towards lighter and faster deep neural networks with parameter pruning". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS025.
Since their resurgence in 2012, Deep Neural Networks have become ubiquitous in most disciplines of Artificial Intelligence, such as image recognition, speech processing, and Natural Language Processing. However, over the last few years, neural networks have grown exponentially deeper, involving more and more parameters. Nowadays, it is not unusual to encounter architectures involving several billion parameters, while ten years ago they mostly contained thousands. This generalized increase in the number of parameters makes such large models compute-intensive and essentially energy-inefficient, which makes deployed models costly to maintain and their use in resource-constrained environments very challenging. For these reasons, much research has been conducted to provide techniques that reduce the amount of storage and computation required by neural networks. Among those techniques, neural network pruning, which consists in creating sparsely connected models, has recently been at the forefront of research. However, although pruning is a prevalent compression technique, there is currently no standard way of implementing or evaluating novel pruning techniques, making comparison with previous research challenging. Our first contribution thus concerns a novel description of pruning techniques, developed along four axes, which allows us to unequivocally and completely define currently existing pruning techniques. Those components are: the granularity, the context, the criteria, and the schedule. Defining the pruning problem according to those components allows us to subdivide it into four mostly independent subproblems and also to better identify potential research directions. Moreover, pruning methods are still at an early development stage, and primarily designed for the research community. Indeed, most pruning works are usually implemented in a self-contained and sophisticated way, making it troublesome for non-researchers to apply such techniques without having to learn all the intricacies of the field. To fill this gap, we proposed the FasterAI toolbox, intended to be helpful both to researchers eager to create and experiment with different compression techniques, and to newcomers who wish to compress their neural networks for concrete applications. In particular, the sparsification capabilities of FasterAI have been built according to the previously defined pruning components, allowing for a seamless mapping between research ideas and their implementation. We then propose four theoretical contributions, each aiming at providing new insights and improving on state-of-the-art methods along each of the four identified description axes. Those contributions have been realized using the previously developed toolbox, thus validating its scientific utility. Finally, to validate the applicative character of the pruning technique, we have selected a use case: the detection of facial manipulation, also called DeepFake detection. The goal is to demonstrate that the developed tool, as well as the different proposed scientific contributions, is applicable to a complex, real-world problem. This last contribution is accompanied by a proof-of-concept application providing DeepFake detection capabilities in a web-based environment, allowing anyone to perform detection on an image or video of their choice. This Deep Learning era has emerged thanks to considerable improvements in high-performance hardware and access to large amounts of data.
However, since the decline of Moore's Law, experts suggest that we may observe a shift in how we conceptualize hardware, going from task-agnostic to domain-specialized computation, thus leading to a new era of collaboration between the software, hardware, and machine learning communities. This new quest for efficiency will thus undeniably go through neural network compression techniques, and particularly sparse computation.
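The four description axes named in the abstract (granularity, context, criteria, schedule) decompose a pruning step quite naturally. The sketch below illustrates that decomposition in PyTorch; it is not the actual FasterAI API, and every function and parameter name in it is an assumption made for illustration.

```python
import torch

def magnitude(w):
    """One possible criterion: per-weight magnitude."""
    return w.abs()

def prune_step(weights, sparsity, granularity="weight",
               context="global", criterion=magnitude):
    """Zero out the lowest-scoring `sparsity` fraction of parameters.

    granularity: 'weight' scores parameters individually; 'filter'
                 scores whole output channels (dim 0) together.
    context:     'global' uses one threshold across all tensors,
                 'local' one threshold per tensor.
    A schedule would simply call this repeatedly with growing sparsity.
    """
    scores = []
    for w in weights:
        s = criterion(w)
        if granularity == "filter" and w.dim() > 1:
            dims = tuple(range(1, w.dim()))
            s = s.mean(dim=dims, keepdim=True).expand_as(w)
        scores.append(s)
    if context == "global":
        flat = torch.cat([s.reshape(-1) for s in scores])
        k = max(1, int(sparsity * flat.numel()))
        thresholds = [flat.kthvalue(k).values] * len(weights)
    else:
        thresholds = [s.reshape(-1).kthvalue(max(1, int(sparsity * s.numel()))).values
                      for s in scores]
    for w, s, t in zip(weights, scores, thresholds):
        w.data.mul_((s > t).to(w.dtype))   # apply the binary mask in place
```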
Santacroce, Michael. "Neural Classification of Malware-As-Video with Considerations for In-Hardware Inferencing". University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1554216974556897.
Dupont, Robin. "Deep Neural Network Compression for Visual Recognition". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS565.
Thanks to the miniaturisation of electronics, embedded devices have become ubiquitous since the 2010s, performing various tasks around us. As their usage expands, there is an increasing demand for efficient data processing and decision-making. Deep neural networks are apt tools for this, but they are often too large and intricate for embedded systems. Therefore, methods to compress these networks without affecting their performance are crucial. This PhD thesis introduces two pruning-based methods to compress networks while maintaining accuracy. The thesis first details a budget-aware method for compressing large neural networks using weight reparametrisation and a budget loss, eliminating the need for fine-tuning. Traditional pruning methods often use post-training indicators to cut weights, ignoring the desired pruning rate. Our method incorporates a budget loss that directs pruning during training, enabling simultaneous topology and weight optimisation. By soft-pruning smaller weights via reparametrisation, we reduce accuracy loss compared to standard pruning. We validate our method on several datasets and architectures. Later, the thesis examines extracting efficient subnetworks without weight training. We aim to discern the optimal subnetwork topology within a large network, bypassing weight optimisation yet ensuring strong performance. This is realised with our Arbitrarily Shifted Log Parametrisation, a differentiable method for discrete topology sampling, which facilitates the training of masks denoting the probability of weight selection. Additionally, a weight recalibration technique, Smart Rescale, is presented. It boosts the performance of extracted subnetworks and hastens their training. Our method identifies the best pruning rate in a single training cycle, averting exhaustive hyperparameter searches and training at various rates. Through extensive tests, our technique consistently surpasses similar state-of-the-art methods, creating streamlined networks that achieve high sparsity without notable accuracy drops.
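Read literally, the budget-aware recipe above has two ingredients: a differentiable reparametrisation that softly gates small weights towards zero, and a loss term that penalises exceeding a target parameter budget, so that topology and weights are optimised jointly. A hedged PyTorch sketch of that pattern follows; the thesis's actual reparametrisation and budget loss are not reproduced here, and `tau`, `temperature` and `target_ratio` are illustrative knobs.

```python
import torch

def soft_prune(w, tau, temperature=0.05):
    """Soft magnitude gate: weights well below tau are squashed towards zero,
    but the gate stays differentiable, so pruning happens during training."""
    gate = torch.sigmoid((w.abs() - tau) / temperature)
    return w * gate

def budget_loss(weights, tau, target_ratio, temperature=0.05):
    """Penalise the (soft) fraction of surviving weights above the budget."""
    gates = [torch.sigmoid((w.abs() - tau) / temperature) for w in weights]
    kept = sum(g.sum() for g in gates)
    total = sum(g.numel() for g in gates)
    return torch.relu(kept / total - target_ratio)

# Training objective: task_loss + lambda_budget * budget_loss(weights, tau, 0.1)
```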
Prono, Luciano. "Methods and Applications for Low-power Deep Neural Networks on Edge Devices". Doctoral thesis, Politecnico di Torino, 2023. https://hdl.handle.net/11583/2976593.
Zullich, Marco. "Un'analisi delle Tecniche di Potatura in Reti Neurali Profonde: Studi Sperimentali ed Applicazioni". Doctoral thesis, Università degli Studi di Trieste, 2023. https://hdl.handle.net/11368/3041099.
Pruning, in the context of Machine Learning, denotes the act of removing parameters from parametric models, such as linear models, decision trees, and ANNs. Pruning can be motivated by several necessities, first and foremost the reduction in the size and memory footprint of a model, possibly without hurting its accuracy. The interest of the scientific community in pruning applied to ANNs has increased substantially in the last decade due to the dramatic expansion in the size of these models. This can hinder the implementation of ANNs on lower-end computers, also posing a burden to the democratization of Artificial Intelligence. Recent advances in pruning techniques have empirically been shown to effectively remove a large portion of parameters (even over 99%) with no, or minimal, loss in accuracy. Despite this, open questions on the matter still remain, especially regarding the inner dynamics of pruning, concerning, e.g., the way features learned by pruned ANNs relate to those of their dense versions, or the ability of pruned ANNs to generalize to data or environments unseen during training. In addition, pruning is often computationally expensive and poses notable issues concerning high energy consumption and pollution. We hereby present some approaches for tackling the aforementioned issues: comparing representations/features learned by pruned ANNs, improving the time-efficiency of pruning, and applying pruning to simulated robots, with an eye on generalization. Finally, we showcase the usage of pruning for deploying, on a low-end device with limited memory, a large object detection model for face mask detection, envisioning an application of the model to video surveillance.
Yvinec, Edouard. "Efficient Neural Networks : Post Training Pruning and Quantization". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS581.
Deep neural networks have grown to be the most widely adopted models for solving most computer vision and natural language processing tasks. Since the renewed interest in these architectures, sparked in 2012, their size in terms of memory footprint and computational cost has increased tremendously, which has hindered their deployment. In particular, with the rising interest in generative AI, such as large language models and diffusion models, this phenomenon has recently reached new heights, as these models can weigh several billion parameters and require multiple high-end GPUs in order to infer in real time. In response, the deep learning community has searched for methods to compress and accelerate these models. These methods are: efficient architecture design, tensor decomposition, pruning, and quantization. In this manuscript, I paint a landscape of the current state of the art in deep neural network compression and acceleration, as well as my contributions to the field. First, I propose a general introduction to the aforementioned techniques and highlight their shortcomings and current challenges. Second, I provide a detailed discussion of my contributions to the field of deep neural network pruning. These contributions led to the publication of three articles: RED, RED++ and SInGE. In RED and RED++, I introduced a novel way to perform data-free pruning and tensor decomposition based on redundancy reduction. On the flip side, in SInGE, I proposed a new importance-based criterion for data-driven pruning. This criterion was inspired by attribution techniques, which consist in ranking inputs by their relative importance with respect to the final prediction. In SInGE, I adapted one of the most effective attribution techniques to weight importance ranking for pruning. In the third chapter, I lay out my contributions to the field of deep quantization: SPIQ, PowerQuant, REx, NUPES, and a best-practice paper. Each of these methods addresses one of the previous limitations of post-training quantization. In SPIQ, PowerQuant and REx, I provide a solution to the granularity limitations of quantization, a novel non-uniform format which is particularly effective on transformer architectures, and a technique for quantization decomposition which eliminates the need for unsupported bit-widths, respectively. In the two remaining articles, I provide significant improvements over existing gradient-based post-training quantization techniques, bridging the gap between such techniques and non-uniform quantization. In the last chapter, I propose a set of leads for future work which I believe to be the current most important unanswered questions in the field.
Brigandì, Camilla. "Utilizzo della omologia persistente nelle reti neurali". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022.
Riera Villanueva, Marc. "Low-power accelerators for cognitive computing". Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/669828.
Deep neural networks (DNNs) have achieved enormous success in cognitive applications, and are especially effective in classification and decision-making problems such as speech recognition or machine translation. Mobile devices increasingly rely on DNNs to understand the world. Smartphones and smartwatches, and even cars, perform discriminative tasks such as face or object recognition on a daily basis. Despite the growing popularity of DNNs, running them on mobile systems poses several challenges: providing high accuracy and performance within a small memory and energy budget. Modern DNNs consist of millions of parameters that require enormous computational and memory resources and, therefore, cannot be used directly in resource-limited low-power systems. The goal of this thesis is to address these problems and propose new solutions for designing efficient accelerators for DNN-based cognitive computing systems. First, we focus on optimizing DNN inference for sequence-processing applications. We analyze the similarity of the inputs between consecutive executions of DNNs. We then propose DISC, an accelerator that implements a differential computation technique, based on the high degree of similarity of the inputs, to reuse the computations of the previous execution instead of computing the whole network. We observe that, on average, more than 60% of the inputs of any layer of the DNNs used exhibit minor changes with respect to the previous execution. Avoiding the memory accesses and computations for these inputs yields energy savings of 63% on average. Second, we propose to optimize the inference of DNNs based on FC layers. We first analyze the number of unique weights per input neuron in several networks. Exploiting common optimizations such as linear quantization, we observe a very small number of unique weights per input in several FC layers of modern DNNs. Then, to improve the energy efficiency of FC-layer computation, we present CREW, an accelerator that implements an efficient computation-reuse mechanism and weight-storage scheme. CREW reduces the number of multiplications and provides significant savings in memory usage. We evaluate CREW on a diverse set of modern DNNs. CREW provides, on average, a 2.61x performance improvement and 2.42x energy savings. Third, we propose a mechanism to optimize the inference of RNNs. The cells of recurrent networks perform element-wise multiplications of the activations of different gates, with sigmoid and tanh being the usual activation functions. We analyze the values of the activation functions and show that a significant fraction is saturated towards zero or one in a set of popular RNNs. We then propose CGPA to dynamically prune RNN activations at a coarse granularity. CGPA avoids the evaluation of entire neurons whenever the outputs of paired neurons are saturated. CGPA significantly reduces the amount of computation and memory accesses, achieving on average a 12% improvement in performance and energy savings. Finally, in the last contribution of this thesis we focus on static DNN pruning methodologies. Pruning reduces the memory footprint and computational work by removing redundant connections or neurons. However, we show that previous pruning schemes use a very long iterative process that requires training the DNNs many times to tune the pruning parameters. We then propose a pruning scheme based on principal component analysis and the relative importance of each neuron's connections that automatically optimizes the DNN in one shot, without the need to manually tune multiple parameters.
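The observation driving CREW, that linear quantization leaves only a handful of unique weights per input neuron in FC layers, is easy to verify on any weight matrix. Below is a toy NumPy check, purely illustrative of the statistic (not of the accelerator itself); all names are hypothetical.

```python
import numpy as np

def unique_weights_per_input(W, bits=8):
    """Linearly quantize W and report the unique-weight count per input neuron.

    W: (n_outputs, n_inputs) FC weight matrix. With few unique weights per
    column, each product input * weight can be computed once and reused for
    every output sharing that weight (the computation-reuse idea in CREW)."""
    lo, hi = W.min(), W.max()
    q = np.round((W - lo) / (hi - lo) * (2**bits - 1)).astype(np.int32)
    return [len(np.unique(q[:, j])) for j in range(W.shape[1])]

W = np.random.randn(1024, 256).astype(np.float32)
counts = unique_weights_per_input(W, bits=4)
print(f"mean unique weights per input: {np.mean(counts):.1f} out of {W.shape[0]}")
```

With 4-bit quantization, each column can contain at most 16 distinct values, so each of the 1024 products per input collapses to at most 16 distinct multiplications.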
Faraone, Julian. "Simplification Of Deep Neural Networks For Efficient Inference". Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/25846.
Chan, Kin Wah. "Pruning of hidden Markov model with optimal brain surgeon". View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202003%20CHAN.
Includes bibliographical references (leaves 72-76). Also available in electronic version. Access restricted to campus users.
Gaopande, Meghana Laxmidhar. "Exploring Accumulated Gradient-Based Quantization and Compression for Deep Neural Networks". Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/98617.
Master of Science
Neural networks are being employed in many different real-world applications. By learning the complex relationship between the input data and ground-truth output data during the training process, neural networks can predict outputs on new input data obtained in real time. To do so, a typical deep neural network often needs millions of numerical parameters, stored in memory. In this research, we explore techniques for reducing the storage requirements for neural network parameters. We propose software methods that convert 32-bit neural network parameters to values that can be stored using fewer bits. Our methods also convert a majority of numerical parameters to zero. Using special storage methods that only require storage of non-zero parameters, we gain significant compression benefits. On typical benchmarks like LeNet-300-100 (MNIST dataset), LeNet-5 (MNIST dataset), AlexNet (CIFAR-10 dataset) and VGG-16 (CIFAR-10 dataset), our methods can achieve up to 57.22x, 50.19x, 13.15x and 13.53x compression respectively. Storage benefits are achieved at the cost of classification accuracy, and we present our work in the light of the accuracy-compression trade-off.
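As a rough illustration of where compression figures of this order can come from: once most parameters are zero, only the non-zero values (at a reduced bit-width) plus their indices need storing. The back-of-the-envelope estimate below assumes a simple value-plus-relative-index sparse format, which is not necessarily the storage scheme used in the thesis.

```python
def compression_ratio(n_params, sparsity, value_bits=8, index_bits=5):
    """Estimate compression versus dense 32-bit storage.

    sparsity:   fraction of parameters that are exactly zero.
    index_bits: bits per stored non-zero (e.g. a relative index, CSR-style).
    """
    dense_bits = n_params * 32
    nnz = n_params * (1.0 - sparsity)
    sparse_bits = nnz * (value_bits + index_bits)
    return dense_bits / sparse_bits

# e.g. a LeNet-300-100-sized model with 95% zeros and 8-bit values: ~49x
print(f"{compression_ratio(266_000, 0.95):.1f}x")
```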
Wolinski, Pierre. "Structural Learning of Neural Networks". Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASS026.
The structure of a neural network determines to a large extent its cost of training and use, as well as its ability to learn. These two aspects are usually in competition: the larger a neural network is, the better it will perform the task assigned to it, but the more memory and computing time it will require for training. Automating the search for efficient network structures (of reasonable size and performing well) is thus a much-studied question in this area. Within this context, neural networks with various structures are trained, which requires a new set of training hyperparameters for each new structure tested. The aim of the thesis is to address different aspects of this problem. The first contribution is a training method that operates within a large perimeter of network structures and tasks, without needing to adjust the learning rate. The second contribution is a network training and pruning technique designed to be insensitive to the initial width of the network. The last contribution is mainly a theorem that makes it possible to translate an empirical training penalty into a theoretically well-founded Bayesian prior. This work results from a search for properties that training and pruning algorithms must theoretically satisfy in order to be valid over a wide range of neural networks and objectives.
Kubisz, Jan. "Využití umělé inteligence k monitorování stavu obráběcího stroje". Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-417752.
Weman, Nicklas. "Empirical Investigation of the Effect of Pruning Artificial Neural Networks With Respect to Increased Generalization Ability". Thesis, Linköpings universitet, Institutionen för datavetenskap, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-60112.
Strömberg, Lucas. "Optimizing Convolutional Neural Networks for Inference on Embedded Systems". Thesis, Uppsala universitet, Signaler och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444802.
Bonfiglioli, Luca. "Identificazione efficiente di reti neurali sparse basata sulla Lottery Ticket Hypothesis". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020.
Znajdź pełny tekst źródłaMedeiros, ClÃudio Marques de SÃ. "Uma contribuiÃÃo ao problema de seleÃÃo de modelos neurais usando o princÃpio de mÃxima correlaÃÃo dos erros". Universidade Federal do CearÃ, 2008. http://www.teses.ufc.br/tde_busca/arquivo.php?codArquivo=2132.
This thesis proposes a new pruning method which eliminates redundant weights in a multilayer perceptron (MLP). Conventional pruning techniques, like Optimal Brain Surgeon (OBS) and Optimal Brain Damage (OBD), are based on weight sensitivity analysis, which requires the inversion of the Hessian matrix of the loss function (i.e., the mean squared error). This inversion is especially susceptible to numerical problems due to poor conditioning of the Hessian matrix, and it demands great computational effort. Another kind of pruning method is based on the regularization of the loss function, but it requires the determination of the regularization parameter by trial and error. The proposed method is based on the "Maximum Correlation Errors Principle" (MAXCORE). The idea behind this principle is to evaluate the importance of each network connection by calculating the cross-correlation between the errors in a layer and the back-propagated errors in the preceding layer, starting from the output layer and working through the network until the input layer is reached. The connections which have larger correlations remain, and the others are pruned from the network. The evident advantage of this procedure is its simplicity, since matrix inversion and parameter adjustment are not necessary. The performance of the proposed method is evaluated in pattern classification tasks, and the results are compared to those achieved by the OBS/OBD techniques and by a regularization-based method. For this purpose, artificial data sets are used to highlight some important characteristics of the proposed methodology. Furthermore, well-known benchmarking data sets, such as Iris, Wine and Dermatology, are also used for the sake of evaluation. A real-world biomedical data set related to pathologies of the vertebral column is also used. The results obtained show that the proposed method achieves performance equivalent or superior to that of conventional pruning methods, with the additional advantages of low computational cost and simplicity. The proposed method also presents efficient behavior in pruning the input units, which suggests its use as a feature selection method.
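In spirit, the MAXCORE criterion scores a connection by how strongly the error observed at its output unit correlates with the error backpropagated to its input unit. The NumPy sketch below is one loose reading of that principle, scoring all connections of a single layer from sampled error signals; the thesis's exact definition and its layer-by-layer sweep from output to input are not reproduced.

```python
import numpy as np

def maxcore_scores(delta_out, delta_in):
    """Cross-correlation score for each connection (i -> j) of one layer.

    delta_out: (n_samples, n_out) errors at the layer's output units
    delta_in:  (n_samples, n_in)  errors backpropagated to its input units
    Returns an (n_out, n_in) score matrix; low-scoring connections are
    pruning candidates. Illustrative reading of the MAXCORE principle.
    """
    delta_out = delta_out - delta_out.mean(axis=0)
    delta_in = delta_in - delta_in.mean(axis=0)
    corr = delta_out.T @ delta_in / len(delta_out)          # cross-covariance
    std = np.outer(delta_out.std(axis=0), delta_in.std(axis=0)) + 1e-12
    return np.abs(corr / std)                               # |correlation|
```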
You, Shi Xian, and 游世賢. "The growing and pruning of neural network learning". Thesis, 1996. http://ndltd.ncl.edu.tw/handle/91967731658427789731.
Gaikwad, Akash (5931047). "Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment". Thesis, 2019.
In recent years, deep learning models have become popular in real-time embedded applications, but there are many complexities for hardware deployment because of limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolution Neural Network (CNN) through various compression techniques like architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the most promising techniques to solve these problems.
This thesis proposes methods to prune the convolution neural network (SqueezeNet) without introducing network sparsity in the pruned model.
This thesis proposes three methods to prune the CNN to decrease the model size of CNN without a significant drop in the accuracy of the model.
1: Pruning based on Taylor expansion of change in cost function Delta C.
2: Pruning based on L2 normalization of activation maps.
3: Pruning based on a combination of method 1 and method 2.
The proposed methods use various ranking criteria to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. A transfer learning technique is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the SqueezeNet model by 72% without a significant drop in the accuracy of the model (optimal pruning efficiency result). Results also show that pruning based on a combination of Taylor expansion of the cost function and L2 normalization of activation maps achieves better pruning efficiency than the individual pruning criteria, and that most of the pruned kernels come from the mid- and high-level layers. The pruned model is deployed on BlueBox 2.0 using RTMaps software, and its performance is evaluated.
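Both ranking criteria in the abstract have standard formulations: the first-order Taylor estimate of the change in cost C when a filter's activation map is zeroed is |a * dC/da| averaged over the map, and the second criterion is the L2 norm of the map itself. A minimal PyTorch sketch under those standard definitions follows; the thesis's normalization and the exact way methods 1 and 2 are combined may differ.

```python
import torch

def taylor_rank(activation, grad):
    """|a * dC/da| averaged per filter: first-order estimate of the change
    in cost C if that filter's activation map were zeroed out (method 1).
    activation, grad: (batch, channels, H, W) from a forward/backward pass."""
    return (activation * grad).abs().mean(dim=(0, 2, 3))

def l2_rank(activation):
    """L2 norm of each filter's activation map, averaged over the batch (method 2)."""
    return activation.pow(2).sum(dim=(2, 3)).sqrt().mean(dim=0)

def combined_rank(activation, grad, alpha=0.5):
    """Method 3: mix the two criteria after per-layer normalization."""
    t = taylor_rank(activation, grad)
    l = l2_rank(activation)
    t = t / (t.norm() + 1e-12)
    l = l / (l.norm() + 1e-12)
    return alpha * t + (1 - alpha) * l   # lowest-ranked filters get pruned
```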
Gaikwad, Akash S. "Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment". Thesis, 2018. http://hdl.handle.net/1805/17923.
Fan, En-Yu, and 樊恩宇. "Convolutional Neural Network Pruning by Training-based Important Channel Identification". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/rch56c.
國立臺灣大學
電子工程學研究所
107
Despite the tremendous success of convolutional neural networks (CNNs) in various applications, their deployment is greatly obstructed by their high computational cost and large memory usage. Many approaches have been proposed to prune networks channel-wise. Nevertheless, most consider the interrelations of channels independently of training, or they prune the network in a layer-by-layer manner, leveraging only the statistics of an individual layer or of two consecutive layers. In this work, we devise a strategy that introduces the concepts of Scoring Network (SN) and Importance of Channels (IofC) into training for channel pruning. Specifically, we take the interdependencies of channels into account by incorporating them into the training phase, and we jointly prune the channels of every layer based on the trained model. Experimental results evaluated on multiple datasets with several modern CNN models demonstrate that our method can produce promising reductions for modern CNN frameworks in both parameters and floating-point operations (FLOPs), while the performance loss is negligible, or performance even improves relative to the unpruned counterparts.
AlShahrani, Mona. "Towards an Efficient Artificial Neural Network Pruning and Feature Ranking Tool". Thesis, 2015. http://hdl.handle.net/10754/555862.
Pełny tekst źródła"Extended Kalman filter based pruning algorithms and several aspects of neural network learning". 1998. http://library.cuhk.edu.hk/record=b6073079.
Thesis (Ph.D.)--Chinese University of Hong Kong, 1998.
Includes bibliographical references (p. 155-[163]).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Mode of access: World Wide Web.
Xu, Zhiwei. "Applications of Markov Random Field Optimization and 3D Neural Network Pruning in Computer Vision". PhD thesis, 2022. http://hdl.handle.net/1885/258295.
Alfarra, Motasem. "Applications of Tropical Geometry in Deep Neural Networks". Thesis, 2020. http://hdl.handle.net/10754/662473.
Rao, Sreenivasa M. "DNN: A new neural network architecture of associative memory with pruning and order-sensitive learning and its applications". Thesis, 1998. http://hdl.handle.net/2009/726.
Laurent, César. "Advances in parameterisation, optimisation and pruning of neural networks". Thesis, 2020. http://hdl.handle.net/1866/25592.
Neural networks are a family of Machine Learning models able to learn complex tasks directly from the data. Although already producing impressive results in many areas such as speech recognition, computer vision or machine translation, there are still a lot of challenges in both training and deployment of neural networks. In particular, training neural networks typically requires huge amounts of computational resources, and trained models are often too big or too computationally expensive to be deployed on resource-limited devices, such as smartphones or low-power chips. The articles presented in this thesis investigate solutions to these different issues. The first couple of articles focus on improving the training of Recurrent Neural Networks (RNNs), networks specially designed to process sequential data. RNNs are notoriously hard to train, so we propose to improve their parameterisation by upgrading them with Batch Normalisation (BN), a very effective parameterisation which was hitherto used only in feed-forward networks. In the first article, we apply BN to the input-to-hidden connections of the RNNs, thereby reducing internal covariate shift between layers. In the second article, we show how to apply it to both input-to-hidden and hidden-to-hidden connections of the Long Short-Term Memory (LSTM), a popular RNN architecture, thus also reducing internal covariate shift between time steps. Our experiments show that these proposed parameterisations allow for faster and better training of RNNs on several benchmarks. In the third article, we propose a new optimiser to accelerate the training of neural networks. Traditional diagonal optimisers, such as RMSProp, operate in parameter coordinates, which is not optimal when several parameters are updated at the same time. Instead, we propose to apply such optimisers in a basis in which the diagonal approximation is likely to be more effective. We leverage the same approximation used in Kronecker-factored Approximate Curvature (K-FAC) to efficiently build this Kronecker-factored Eigenbasis (KFE). Our experiments show improvements over K-FAC in training speed for several deep network architectures. The last article focuses on network pruning, the action of removing parameters from the network in order to reduce its memory footprint and computational cost. Typical pruning methods rely on first- or second-order Taylor approximations of the loss landscape to identify which parameters can be discarded. We propose to study the impact of the assumptions behind such approximations. Moreover, we systematically compare methods based on first- and second-order approximations with Magnitude Pruning (MP), showing how they perform both before and after a fine-tuning phase. Our experiments show that better preserving the original network function does not necessarily transfer to better-performing networks after fine-tuning, suggesting that only considering the impact of pruning on the loss might not be a sufficient objective for designing good pruning criteria.
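The pruning criteria compared in the last article can be stated in a few lines: magnitude pruning scores a weight by |w|, while a first-order Taylor criterion scores it by |w * g|, the estimated change in loss from setting w to zero. A small sketch under those textbook definitions, not the article's exact implementation:

```python
import torch

def pruning_scores(w, g, criterion="magnitude"):
    """Per-weight saliency under two classic pruning criteria.

    w: weight tensor; g: gradient of the loss w.r.t. w (same shape).
    'magnitude': |w| (Magnitude Pruning)
    'taylor1':   |w * g|, first-order estimate of the loss change when
                 w is zeroed (delta L ~= -g * w for that perturbation)."""
    if criterion == "magnitude":
        return w.abs()
    if criterion == "taylor1":
        return (w * g).abs()
    raise ValueError(criterion)
```

A second-order criterion in the OBD tradition would additionally weight each term by the (approximated) Hessian diagonal; it is omitted here since the article studies the assumptions behind such approximations rather than a single formula.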
Fletcher, Lizelle. "Statistical modelling by neural networks". Thesis, 2002. http://hdl.handle.net/10500/600.
Mathematical Sciences
D. Phil. (Statistics)
Hübsch, Ondřej. "Redukce počtu parametrů v konvolučních neuronových sítích". Master's thesis, 2021. http://www.nusl.cz/ntk/nusl-447970.
Petříčková, Zuzana. "Umělé neuronové sítě a jejich využití při extrakci znalostí". Doctoral thesis, 2015. http://www.nusl.cz/ntk/nusl-352245.
ElAraby, Mostafa. "Optimizing ANN Architectures using Mixed-Integer Programming". Thesis, 2020. http://hdl.handle.net/1866/24312.
Over-parameterized networks, where the number of parameters exceeds the number of data points, generalize well on various tasks. However, large networks are costly in terms of training and inference time. Moreover, the lottery ticket hypothesis states that a subnetwork of a randomly initialized network can, after training on a specific task, reach a loss within a margin of the reference network's. Consequently, there is a need to optimize inference and training time, which is possible with more compact neural architectures. We introduce a new approach, "Optimizing ANN Architectures using Mixed-Integer Programming" (OAMIP), to find such subnetworks by identifying important neurons and removing the unimportant ones, which speeds up inference. The proposed OAMIP approach relies on a mixed-integer program (MIP) to assign importance scores to each neuron in deep model architectures. Our MIP is guided by the impact on the network's main learning task while simultaneously pruning neurons. By carefully defining the MIP objective function, the solver tends to minimize the number of neurons, restricting the network to the critical neurons, i.e., those with high importance scores, that must be kept to maintain the overall accuracy of the trained neural network. Furthermore, the proposed formulation generalizes the recently considered lottery ticket hypothesis by identifying multiple "lucky" subnetworks. This yields optimized architectures that not only perform well on a single dataset, but also generalize across different datasets when the network weights are retrained. Finally, we present a scalable implementation of our method by decoupling the importance scores between layers, using auxiliary networks, and between the different classes. We demonstrate the ability of our formulation to prune neural networks with a marginal loss of accuracy and generalizability on popular datasets and architectures.
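To make the MIP idea concrete, the toy model below assigns one binary keep/drop variable per neuron and maximizes total importance under a neuron budget, using the PuLP package. This is a deliberate simplification under stated assumptions: it treats importance scores as precomputed inputs, whereas the OAMIP formulation described above derives them inside the MIP from the impact on the network's learning task.

```python
import pulp

def select_neurons(importance, budget):
    """Toy neuron-selection MIP: keep at most `budget` neurons while
    maximizing total importance. importance: one float per neuron.
    OAMIP's actual MIP additionally couples these variables to the
    network's task loss; this sketch only shows the selection skeleton."""
    prob = pulp.LpProblem("neuron_selection", pulp.LpMaximize)
    keep = [pulp.LpVariable(f"keep_{i}", cat="Binary")
            for i in range(len(importance))]
    prob += pulp.lpSum(s * k for s, k in zip(importance, keep))  # objective
    prob += pulp.lpSum(keep) <= budget                           # neuron budget
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i, k in enumerate(keep) if k.value() == 1]

print(select_neurons([0.9, 0.1, 0.7, 0.05, 0.6], budget=3))  # -> [0, 2, 4]
```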