Dissertations / Theses on the topic 'Convolutive Neural Networks'

Consult the top 50 dissertations / theses for your research on the topic 'Convolutive Neural Networks.'

1

Heuillet, Alexandre. "Exploring deep neural network differentiable architecture design." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG069.

Abstract:
Artificial Intelligence (AI) has gained significant popularity in recent years, primarily due to its successful applications in domains such as textual data analysis, computer vision, and audio processing. The resurgence of deep learning techniques has played a central role in this success. The groundbreaking AlexNet paper by Krizhevsky et al. narrowed the gap between human and machine performance in image classification tasks. Subsequent architectures such as Xception and ResNet further solidified deep learning as a leading technique, opening new horizons for the AI community. The success of deep learning lies in its architectures, which are manually designed using expert knowledge and empirical validation. However, these architectures are not guaranteed to be optimal. To address this issue, recent papers introduced Neural Architecture Search (NAS), which automates the design of deep architectures. However, most initial approaches focused on large architectures with specific targets (e.g., supervised learning) and relied on computationally expensive optimization techniques such as reinforcement learning and evolutionary algorithms. In this thesis, we further investigate this idea by exploring automatic deep architecture design, with a particular emphasis on differentiable NAS (DNAS), which represents the current trend in NAS due to its computational efficiency. While our primary focus is on Convolutional Neural Networks (CNNs), we also explore Vision Transformers (ViTs) with the goal of designing cost-effective architectures suitable for real-time applications.
2

Maragno, Alessandro. "Programmazione di Convolutional Neural Networks orientata all'accelerazione su FPGA." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/12476/.

Abstract:
Computer vision, the discipline concerned with extracting information from digital images, is currently one of the most active areas of computing. Thanks to recent advances, the field has reached a level of maturity that allows it to be applied in many settings, from industry to applications closer to everyday life. In particular, an increasingly solid state of the art has been reached in object recognition (object detection) thanks to the development of Convolutional Neural Networks (CNNs): systems based on a mathematical model that is gradually refined according to the system's own experience in performing the task, acquired through machine learning techniques. As a result, CNNs can recognize and classify the content of images, giving them semantics. These systems, however, require large computational capacity and a considerable amount of memory, so they are mostly run on powerful architectures such as GPUs. Nevertheless, one of the most important current challenges is real-time image classification by running convolutional neural networks on architectures with limited energy budgets and computational capacity, such as embedded systems. This thesis therefore proposes a reconfigurable CNN implementation written in C. The result is a simple, modular system that, with several ad hoc optimizations, can be considered a good candidate for porting to reconfigurable embedded FPGA architectures.
3

Abbasi, Mahdieh. "Toward robust deep neural networks." Doctoral thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/67766.

Abstract:
In this thesis, our goal is to develop robust and reliable yet accurate learning models, particularly Convolutional Neural Networks (CNNs), in the presence of adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to predict adversarial instances with high uncertainty by encouraging diversity in an ensemble of CNNs. To this end, we devise an ensemble of diverse specialists along with a simple and computationally efficient voting mechanism that predicts adversarial examples with low confidence while keeping the predictive confidence on clean samples high. In the presence of high entropy (disagreement) in our ensemble, we prove that the predictive confidence can be upper-bounded, leading to a globally fixed threshold (τ = 0.5) over the predictive confidence for identifying adversaries. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several benchmark datasets.
The second contribution addresses the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how can the many available OOD sets be differentiated with respect to a given in-distribution task in order to select the most appropriate one, which in turn induces a model with a high detection rate on unseen OOD sets? To answer this question, we hypothesize that the "protection" level of in-distribution sub-manifolds by each OOD set is a good property for differentiating OOD sets. To measure the protection level, we design three novel, simple, and cost-effective metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented-CNN (A-CNN) and an explicitly calibrated CNN to detect a significantly larger portion of unseen OOD samples when they are trained on the most protective OOD set. Interestingly, we also observe that the A-CNN trained on the most protective OOD set (called A-CNN) can also detect black-box Fast Gradient Sign (FGS) adversarial examples.
As the third contribution, we investigate more closely the capacity of the A-CNN to detect wider types of black-box adversaries. To increase the capability of the A-CNN to detect a larger number of adversaries, we augment its OOD training set with inter-class interpolated samples. We then demonstrate that the A-CNN trained on the most protective OOD set along with the interpolated samples has a consistent detection rate on all types of unseen adversarial examples, whereas training an A-CNN on Projected Gradient Descent (PGD) adversaries does not lead to a stable detection rate on all types of adversaries, particularly the unseen types. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and its augmented counterpart in the presence of adversarial and clean samples. With a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with small risk rates on both clean samples and unusual ones, e.g. adversarial and OOD samples.
The last contribution is to show a use case of the A-CNN for training a robust object detector on a partially labeled dataset, particularly a merged dataset. Merging various datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way to craft a large-scale dataset that covers a larger spectrum of OoIs. Moreover, merging datasets makes it possible to build a unified object detector instead of several separate ones, reducing computational and time costs. However, merging datasets, especially from a similar context, causes many missing-label instances. With the goal of training an integrated robust object detector on a partially labeled but large-scale dataset, we propose a self-supervised training framework to overcome the issue of missing-label instances in merged datasets. Our framework is evaluated on a merged dataset with a high missing-label rate. The empirical results confirm the viability of our generated pseudo-labels for enhancing the performance of YOLO, the state-of-the-art object detector at the time of writing.
4

Kapoor, Rishika. "Malaria Detection Using Deep Convolution Neural Network." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613749143868579.

5

Yu, Xiafei. "Wide Activated Separate 3D Convolution for Video Super-Resolution." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39974.

Abstract:
Video super-resolution (VSR) aims to recover a realistic high-resolution (HR) frame from its corresponding center low-resolution (LR) frame and several neighbouring supporting frames. The neighbouring LR frames provide extra information that helps recover the HR frame. However, these frames are not aligned with the center frame because of object motion. With the rapid development of neural networks, many deep-learning-based video super-resolution methods have recently been proposed. Most of these methods use motion estimation and compensation models as preprocessing to handle the spatio-temporal alignment problem. The accuracy of these motion estimation models is therefore critical for predicting the high-resolution frames: inaccurate motion compensation leads to artifacts and blur, which in turn damage the recovery of high-resolution frames. We propose an effective wide activated separate 3-dimensional (3D) Convolution Neural Network (CNN) for video super-resolution that avoids the drawbacks of motion compensation models. Separate 3D convolution factorizes the 3D convolution into convolutions in the spatial and temporal domains, which benefits the optimization of the spatial and temporal convolution components. Our method can therefore capture the temporal and spatial information of the input frames simultaneously without an additional motion estimation and compensation model. Experimental results demonstrate the effectiveness of the proposed wide activated separate 3D CNN.
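To make the factorization mentioned in this abstract concrete, here is a minimal sketch that only counts layer parameters. It assumes a common reading of "separate 3D convolution" (a spatial 1 x k x k convolution followed by a temporal k x 1 x 1 convolution); the abstract does not give kernel sizes or channel widths, so the values 64 and 3 and both function names are hypothetical. The abstract motivates the factorization by easier optimization, so the parameter saving shown here is a side effect and not a claim of the thesis.

```python
def conv3d_params(c_in, c_out, kt, kh, kw, bias=True):
    """Number of weights (plus optional biases) in one 3D convolution layer."""
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)


def separate_conv3d_params(c_in, c_mid, c_out, k, bias=True):
    """Spatial 1 x k x k convolution followed by a temporal k x 1 x 1 convolution."""
    spatial = conv3d_params(c_in, c_mid, 1, k, k, bias)
    temporal = conv3d_params(c_mid, c_out, k, 1, 1, bias)
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, kernel size 3.
full = conv3d_params(64, 64, 3, 3, 3)             # 110656
separate = separate_conv3d_params(64, 64, 64, 3)  # 49280
print(full, separate)
```

With these assumed sizes the factorized pair uses less than half the parameters of the full 3 x 3 x 3 layer while still covering the same spatio-temporal receptive field.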
6

Messou, Ehounoud Joseph Christopher. "Handling Invalid Pixels in Convolutional Neural Networks." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/98619.

Abstract:
Most neural networks use a normal convolutional layer that assumes that all input pixels are valid pixels. However, pixels added to the input through padding result in adding extra information that was not initially present. This extra information can be considered invalid. Invalid pixels can also be inside the image where they are referred to as holes in completion tasks like image inpainting. In this work, we look for a method that can handle both types of invalid pixels. We compare on the same test bench two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method that was designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs the best in image classification while Gated convolution has the advantage on semantic segmentation. As for hotel recognition with masked regions, none of the methods seem appropriate to generate embeddings that leverage the masked regions.
Master of Science
A module at the heart of deep neural networks built for Artificial Intelligence is the convolutional layer. When multiple convolutional layers are used together with other modules, a Convolutional Neural Network (CNN) is obtained. These CNNs can be used for tasks such as image classification where they tell if the object in an image is a chair or a car, for example. Most CNNs use a normal convolutional layer that assumes that all parts of the image fed to the network are valid. However, most models zero pad the image at the beginning to maintain a certain output shape. Zero padding is equivalent to adding a black frame around the image. These added pixels result in adding information that was not initially present. Therefore, this extra information can be considered invalid. Invalid pixels can also be inside the image where they are referred to as holes in completion tasks like image inpainting where the network is asked to fill these holes and give a realistic image. In this work, we look for a method that can handle both types of invalid pixels. We compare on the same test bench two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method that was designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs the best in image classification while Gated convolution has the advantage on semantic segmentation. As for hotel recognition with masked regions, none of the methods seem appropriate to generate embeddings that leverage the masked regions.
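The effect of zero padding described above can be illustrated with a small sketch. It is not the thesis code: it is a single-channel, pure-Python illustration that assumes the usual mask-renormalization idea behind partial convolution, where each output is rescaled by (window size / number of valid pixels). The function names `zero_pad` and `conv2d` and the 3 x 3 averaging kernel are hypothetical choices for this example.

```python
def zero_pad(image, pad):
    """Return (padded image, validity mask); the mask is 1 for real pixels, 0 for padding."""
    h, w = len(image), len(image[0])
    height, width = h + 2 * pad, w + 2 * pad
    padded = [[0.0] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    for i in range(h):
        for j in range(w):
            padded[i + pad][j + pad] = float(image[i][j])
            mask[i + pad][j + pad] = 1
    return padded, mask


def conv2d(padded, mask, kernel, renormalize):
    """Sliding-window filter; optionally rescale by the share of valid pixels per window."""
    k = len(kernel)
    out = []
    for i in range(len(padded) - k + 1):
        row = []
        for j in range(len(padded[0]) - k + 1):
            acc, valid = 0.0, 0
            for di in range(k):
                for dj in range(k):
                    m = mask[i + di][j + dj]
                    acc += kernel[di][dj] * padded[i + di][j + dj] * m
                    valid += m
            if renormalize and valid > 0:
                acc *= (k * k) / valid  # compensate for the invalid (padded) pixels
            row.append(acc)
        out.append(row)
    return out


image = [[1.0] * 4 for _ in range(4)]    # constant image
box = [[1.0 / 9] * 3 for _ in range(3)]  # 3 x 3 averaging kernel
padded, mask = zero_pad(image, 1)
plain = conv2d(padded, mask, box, renormalize=False)
partial = conv2d(padded, mask, box, renormalize=True)
print(plain[0][0], partial[0][0])
```

On a constant image, the plain convolution darkens the corners (only 4 of the 9 pixels in the corner window are real), while the renormalized version returns the true average there.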
7

Ngo, Kalle. "FPGA Hardware Acceleration of Inception Style Parameter Reduced Convolution Neural Networks." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-205026.

Abstract:
Some researchers have noted that the growth in the number of network parameters of many recently proposed state-of-the-art CNN topologies places unrealistic demands on hardware resources and limits the practical applications of neural networks. This is particularly apparent given that many of the projected applications (IoT, autonomous vehicles, etc.) use embedded systems with even greater restrictions on computation and memory bandwidth than the typical research-class computer cluster on which the CNN was designed. In 2014, the GoogLeNet CNN proposed a new level of organization (the "Inception Module") that was demonstrated in competition to achieve similar or better performance while using an order of magnitude fewer network parameters than competing topologies. This thesis explores the characteristics of the GoogLeNet inception modules and their implications for current CNN accelerator architectures. A custom FPGA accelerator is proposed to offset the inception module's increased need to buffer large intermediate convolution arrays, through array partitioning and by cascading two convolution operations into a single pipeline pass. The architecture was implemented on a Xilinx Artix-7 FPGA, where it was able to continuously supply data to the 331 utilized DSP blocks (approximately half of the total available) while using only a quarter of the DDR bandwidth, achieving a peak throughput of 9.11 GFLOPS. The low utilization of the DDR bandwidth suggests that, with some optimization, the design can be scaled up to make better use of the available resources and increase throughput.
8

Pappone, Francesco. "Graph neural networks: theory and applications." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23893/.

Abstract:
In recent years, artificial neural networks have seen dramatic growth in their applications and in the architectures of the models employed. In this thesis we introduce neural networks on Euclidean domains, showing in particular the importance of translation equivariance in convolutional networks, and we introduce, by analogy, an extension of convolution to data structured as graphs. We also present the architectures of the main Graph Neural Networks and, for each of the three architectures considered (Spectral Graph Convolutional Network, Graph Convolutional Network, Graph Attention Network), describe an application that shows both how it works and why it matters. Finally, we discuss the implementation of a classification algorithm based on two variants of the Graph Convolutional Network architecture, trained and tested on the PROTEINS dataset, which classifies the proteins in the dataset into two categories: enzymes and non-enzymes.
9

Sung, Wei-Hong. "Investigating minimal Convolution Neural Networks (CNNs) for realtime embedded eye feature detection." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281338.

Abstract:
With the rapid rise of neural networks, many tasks that were difficult to solve with traditional methods can now be solved well, especially in computer vision. However, as the tasks have become more complex, the neural networks in use have become deeper and larger. Although some embedded systems are now powerful, most still suffer from memory and computation limitations, which makes it hard to deploy large neural networks on these devices. This project explores different methods for compressing an original large model. We first train a baseline model, YOLOv3[1], a well-known object detection network, and then compress it with two methods. The first method is pruning via sparsity training: after sparsity training, we prune channels according to the values of their scaling factors. Building on this idea, we make three explorations. First, we adopt a union mask strategy to solve the dimension problem of the shortcut-related layers in YOLOv3[1]. Second, we try to absorb the shifting-factor information into subsequent layers. Finally, we implement layer pruning and combine it with channel pruning. The second method is pruning via Neural Architecture Search (NAS), which uses a deep reinforcement learning framework to automatically find the best compression ratio for each layer. At the end of this report, we analyze the key findings and conclusions of our experiments and propose future work that could improve our project.
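The channel-pruning step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: it assumes the scaling factors are per-channel values such as batch-normalization scales, that channels are pruned against one global magnitude threshold, and that the "union mask" keeps a channel in shortcut-connected layers if any of those layers keeps it. All function names and the example values are hypothetical.

```python
def global_threshold(gammas_per_layer, prune_ratio):
    """Magnitude below which a channel is pruned, chosen over all layers at once."""
    all_gammas = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    cut = int(len(all_gammas) * prune_ratio)
    return all_gammas[cut] if cut < len(all_gammas) else float("inf")


def channel_mask(gammas, threshold):
    """True for channels to keep; never prune every channel of a layer."""
    mask = [abs(g) >= threshold for g in gammas]
    if not any(mask):
        strongest = max(range(len(gammas)), key=lambda i: abs(gammas[i]))
        mask[strongest] = True
    return mask


def union_mask(masks):
    """Layers joined by a shortcut must agree on channels: keep a channel if any layer keeps it."""
    return [any(column) for column in zip(*masks)]


# Hypothetical scaling factors of two shortcut-connected layers after sparsity training.
layers = [[0.90, 0.01, 0.40, 0.02], [0.70, 0.03, 0.05, 0.60]]
thr = global_threshold(layers, prune_ratio=0.5)
masks = [channel_mask(g, thr) for g in layers]
shared = union_mask(masks)
print(thr, masks, shared)
```

In this toy case each layer alone would keep two channels, but the shared mask keeps three, which shows why shortcut connections reduce the achievable pruning ratio.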
10

Wu, Jindong. "Pooling strategies for graph convolution neural networks and their effect on classification." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288953.

Abstract:
As graph neural networks have developed, they have been applied in an increasingly broad range of fields. One of the thorny problems researchers face is selecting a suitable pooling method for a specific research task from the many existing pooling methods. In this work, based on the existing mainstream graph pooling methods, we develop a benchmark neural network framework that can be used to compare them. Using the framework, we compare four mainstream graph pooling methods and explore their characteristics. Furthermore, we extend two methods for explaining neural network decisions from convolution neural networks to graph neural networks and compare them with the existing GNNExplainer. We run experiments on standard graph classification tasks using the developed framework and discuss the distinctive characteristics of the different pooling methods. We also verify the correctness of the proposed extensions of the explanation methods and measure the agreement among the explanations they produce. Finally, by applying these explanation methods, we explore the characteristics of the different methods for explaining neural network decisions and the insights they give into the different pooling methods.
11

GIACOPELLI, Giuseppe. "An Original Convolution Model to analyze Graph Network Distribution Features." Doctoral thesis, Università degli Studi di Palermo, 2022. https://hdl.handle.net/10447/553177.

Abstract:
Modern Graph Theory is an emerging field that encompasses the approaches that study graphs differently from Classic Graph Theory. The main difference between Classic and Modern Graph Theory concerns the analysis and use of graph structure (micro/macro). The former aims to solve tasks hosted on graph nodes, most of the time with no insight into the global graph structure; the latter aims to analyze and discover the most salient features characterizing each graph as a whole network, such as degree distributions, hubs, clustering coefficients, and network motifs. The activities carried out during the PhD concerned, after a careful preliminary study of the applications of Modern Graph Theory, the development of an innovative Convolutional Model of brain connections at the cellular level, capable of combining exponential and power-law models. This new theoretical framework was first introduced in an aspatial graph formulation and then extended to a spatial graph model with Convolutive connectivity able to fit the degree distributions of data-driven Connectome reconstructions. To evaluate the Convolutional Model, theoretical graphical models capable of characterizing brain activity were considered. Specifically, the model examined characterizes epileptic activity through a simple system of Hindmarsh-Rose point neurons and reproduces the functional characteristics observed in the data-driven model. Such a model provides insight into the deep impact of micro-connectivity on macro-scale brain activity.
Further evaluations were carried out in other applications: in cell image segmentation with Explainable Artificial Intelligence neuronal agents, using a methodology that is not only explainable but also resistant to adversarial noise; and in modelling the Covid-19 outbreak, to gain insight into vaccines and the role of individual habits in the spread of the pandemic. The core of the thesis is therefore to introduce Modern Graph Theory with a new competitive Convolutive Model and then present applications to real-world problems: characterization of brain networks, simulation and analysis of brain dynamics with a particular focus on epilepsy, segmentation of immunofluorescence images with neuron-based agents, and modelling of the Covid-19 epidemic spread with a specific interest in human social networks. Throughout, the thesis keeps in view the dialogue between Graph Theory and its applications.
12

Ioannou, Yani Andrew. "Structural priors in deep neural networks." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278976.

Abstract:
Deep learning has in recent years come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite breakthroughs in training deep networks, there remains a lack of understanding of both the optimization and structure of deep networks. The approach advocated by many researchers in the field has been to train monolithic networks with excess complexity and strong regularization, an approach that leaves much to be desired in efficiency. Instead we propose that carefully designing networks in consideration of our prior knowledge of the task and learned representation can improve the memory and compute efficiency of state-of-the-art networks, and even improve generalization. We propose to call such design choices structural priors. We present two such novel structural priors for convolutional neural networks, and evaluate them in state-of-the-art image classification CNN architectures. The first of these methods proposes to exploit our knowledge of the low-rank nature of most filters learned for natural images by structuring a deep network to learn a collection of mostly small, low-rank filters. The second addresses the filter/channel extents of convolutional filters, by learning filters with limited channel extents. The size of these channel-wise basis filters increases with the depth of the model, giving a novel sparse connection structure that resembles a tree root. Both methods are found to improve the generalization of these architectures while also decreasing the size and increasing the efficiency of their training and test-time computation. Finally, we present work towards conditional computation in deep neural networks, moving towards a method of automatically learning structural priors in deep networks.
We propose a new discriminative learning model, conditional networks, that jointly exploits the accurate representation learning capabilities of deep neural networks and the efficient conditional computation of decision trees. Conditional networks yield smaller models, and offer test-time flexibility in the trade-off of computation vs. accuracy.
13

Jackman, Simeon. "Football Shot Detection using Convolutional Neural Networks." Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157438.

Abstract:
In this thesis, three different neural network architectures are investigated to detect the action of a shot within a football game using video data. The first architecture uses conventional convolution and pooling layers as feature extraction. It acts as a baseline and gives insight into the challenges faced during shot detection. The second architecture uses a pre-trained feature extractor. The last architecture uses three-dimensional convolution. All these networks are trained using short video clips extracted from football game video streams. Apart from investigating network architectures, different sampling methods are evaluated as well. This thesis shows that amongst the three evaluated methods, the approach using MobileNetV2 as a feature extractor works best. However, when applying the networks to a video stream there are a multitude of challenges, such as false positives and incorrect annotations that inhibit the potential of detecting shots.
14

Cranston, Daniel, and Filip Skarfelt. "Normalized Convolution Network and Dataset Generation for Refining Stereo Disparity Maps." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158449.

Abstract:
Finding disparity maps between stereo images is a well-studied topic within computer vision. While both classical and machine learning approaches exist in the literature, they frequently struggle to correctly solve the disparity in regions with low texture, sharp edges or occlusions. Finding approximate solutions to these problem areas is frequently referred to as disparity refinement, and is usually carried out separately after an initial disparity map has been generated. In the recent literature, the use of Normalized Convolution in Convolutional Neural Networks has shown remarkable results when applied to the task of stereo depth completion. This thesis investigates how well this approach performs in the case of disparity refinement. Specifically, we investigate how well such a method can improve the initial disparity maps generated by the stereo matching algorithm developed at Saab Dynamics using a rectified stereo rig. To this end, a dataset of ground truth disparity maps was created using equipment at Saab, namely a setup for structured light and the stereo rig cameras. Because the end goal is a dataset fit for training networks, we investigate an approach that allows for efficient creation of significant quantities of dense ground truth disparities. The method for generating ground truth disparities produces several disparity maps for every measured scene by using several stereo pairs. A densified disparity map is generated by merging the disparity maps from the neighbouring stereo pairs. This resulted in a dataset of 26 scenes and 104 dense and accurate disparity maps. Our evaluation results show that the chosen Normalized Convolution Network based method can be adapted for disparity map refinement, but is dependent on the quality of the input disparity map.
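For readers unfamiliar with the operation the abstract above builds on, the following is a minimal sketch of classic zeroth-order normalized convolution in one dimension. It is not the network from the thesis: the function name `normalized_convolution`, the box kernel and the toy disparity row are all illustrative assumptions. The idea is to convolve the confidence-weighted signal and divide by the convolved confidence, so that samples with zero confidence are filled in from their trusted neighbours.

```python
import numpy as np

def normalized_convolution(signal, confidence, kernel, eps=1e-8):
    """Confidence-weighted filtering: conv(signal * confidence) / conv(confidence)."""
    signal = np.asarray(signal, dtype=float)
    confidence = np.asarray(confidence, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Numerator: only samples with non-zero confidence contribute.
    numerator = np.convolve(signal * confidence, kernel, mode="same")
    # Denominator: how much trusted evidence falls under the kernel at each position.
    support = np.convolve(confidence, kernel, mode="same")
    # Guard against division by zero where no trusted sample is in range.
    return numerator / np.maximum(support, eps), support

# Toy disparity row with one missing value (confidence 0) in the middle.
disparity = np.array([10.0, 10.0, 0.0, 10.0, 10.0])
confidence = np.array([1.0, 1.0, 0.0, 1.0, 1.0])
kernel = np.array([1.0, 1.0, 1.0])

filled, support = normalized_convolution(disparity, confidence, kernel)
print(filled)
```

In this toy case the missing middle sample is reconstructed from its two neighbours, while plain convolution of the raw signal would have pulled it towards zero.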
15

Highlander, Tyler. "Efficient Training of Small Kernel Convolutional Neural Networks using Fast Fourier Transform." Wright State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=wright1432747175.

16

Sunesson, Albin. "Establishing Effective Techniques for Increasing Deep Neural Networks Inference Speed." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-213833.

Abstract:
A recent trend in deep learning research is to build ever deeper networks (i.e. increase the number of layers) to solve real-world classification/optimization problems. This introduces challenges for latency-sensitive applications. The problem arises from the number of computations that need to be performed for each evaluation, and is addressed by reducing inference time. In this study we analyze two different methods for speeding up the evaluation of deep neural networks. The first method reduces the number of weights in a convolutional layer by decomposing its convolutional kernel. The second method lets samples exit a network through early exit branches when classifications are certain. Both methods were evaluated on several network architectures with consistent results. Convolutional kernel decomposition shows 20-70% speed up with no more than 1% loss in classification accuracy in the setups evaluated. Early exit branches show up to 300% speed up with no loss in classification accuracy when evaluated on CPUs.
De senaste årens trend inom deep learning har varit att addera fler och fler lager till neurala nätverk. Det här introducerar nya utmaningar i applikationer med latensberoende. Problemet uppstår från mängden beräkningar som måste utföras vid varje evaluering. Detta adresseras med en reducering av inferenshastigheten. Jag analyserar två olika metoder för att snabba upp evalueringen av djupa neurala näverk. Den första metoden reducerar antalet vikter i ett faltningslager via en tensordekomposition på dess kärna. Den andra metoden låter samples lämna nätverket via tidiga förgreningar när en klassificering är säker. Båda metoderna utvärderas på flertalet nätverksarkitekturer med konsistenta resultat. Dekomposition på fältningskärnan visar 20-70% hastighetsökning med mindre än 1% försämring av klassifikationssäkerhet i evaluerade konfigurationer. Tidiga förgreningar visar upp till 300% hastighetsökning utan någon försämring av klassifikationssäkerhet när de evalueras på CPU.
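The early-exit idea in the abstract above can be illustrated with a short sketch. This is a toy illustration under stated assumptions, not the implementation evaluated in the thesis: `predict_with_early_exit`, the confidence threshold of 0.9 and the lambda "stages" and "heads" are all hypothetical stand-ins for real network blocks and exit classifiers.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_with_early_exit(x, stages, exit_heads, threshold=0.9):
    """Run stages in order and return (class, exit_index) at the first confident head.

    The last head always answers, so every sample gets a prediction.
    """
    h = x
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        h = stage(h)
        probs = softmax(head(h))
        is_last = i == len(stages) - 1
        if probs.max() >= threshold or is_last:
            return int(np.argmax(probs)), i

# Hypothetical two-stage "network": each stage transforms the features,
# each head maps features to class logits (identity here for simplicity).
stages = [lambda h: h * 2.0, lambda h: h + 1.0]
exit_heads = [lambda h: h, lambda h: h]

easy = predict_with_early_exit(np.array([0.0, 3.0]), stages, exit_heads)
hard = predict_with_early_exit(np.array([0.0, 0.1]), stages, exit_heads)
print(easy, hard)
```

The "easy" sample leaves at the first branch and skips the remaining computation, while the "hard" one runs through the whole network; the threshold controls the trade-off between speed and accuracy.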
17

Shuvo, Md Kamruzzaman. "Hardware Efficient Deep Neural Network Implementation on FPGA." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/theses/2792.

Abstract:
In recent years, there has been a significant push to implement Deep Neural Networks (DNNs) on edge devices, which requires power- and hardware-efficient circuits to carry out the intensive matrix-vector multiplication (MVM) operations. This work presents hardware-efficient MVM implementation techniques using bit-serial arithmetic and a novel MSB-first computation circuit. The proposed designs take advantage of the pre-trained network weight parameters, which are already known at the design stage. Thus, the partial computation results can be pre-computed and stored in look-up tables. The MVM results can then be computed in a bit-serial manner without using multipliers. The proposed novel circuit implementation for convolution filters and the rectified linear activation function used in deep neural networks conducts computation in an MSB-first bit-serial manner. It can predict early whether the outcome of a filter computation will be negative and then terminate the remaining computations to save power. The benefits of the proposed MVM implementation techniques are demonstrated by comparing the proposed design with a conventional implementation. The proposed circuit is implemented on an FPGA. It shows significant power and performance improvements compared to the conventional designs implemented on the same FPGA.
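The multiplier-free, look-up-table approach described in the abstract above can be sketched in software. The following is a minimal illustration of a bit-serial dot product with precomputed partial sums (often called distributed arithmetic), assuming unsigned inputs; the function names `build_lut` and `bit_serial_dot` are hypothetical, and the thesis's early termination for negative results is deliberately left out.

```python
import numpy as np

def build_lut(weights):
    """Precompute the weighted sum for every on/off pattern of the inputs."""
    n = len(weights)
    lut = np.zeros(1 << n, dtype=np.int64)
    for pattern in range(1 << n):
        # Bit i of the pattern says whether weight i is included in the sum.
        lut[pattern] = sum(int(weights[i]) for i in range(n) if (pattern >> i) & 1)
    return lut

def bit_serial_dot(lut, inputs, n_bits):
    """Dot product of fixed weights with unsigned n_bits-wide inputs, MSB first."""
    acc = 0
    for bit in range(n_bits - 1, -1, -1):
        # Gather the current bit of every input into one table index.
        pattern = 0
        for i, x in enumerate(inputs):
            pattern |= ((int(x) >> bit) & 1) << i
        # Shift-and-add: only a table look-up and an addition per bit plane.
        acc = acc * 2 + int(lut[pattern])
    return acc

weights = [3, -2, 5, 1]          # known at design time, as in the abstract
inputs = [7, 1, 4, 15]           # 4-bit unsigned activations
lut = build_lut(weights)
result = bit_serial_dot(lut, inputs, n_bits=4)
print(result)
```

Because the weights are fixed, the table is built once; each input bit plane then costs one look-up and one addition, and no multiplication is performed at run time.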
18

Andersson, Viktor. "Semantic Segmentation : Using Convolutional Neural Networks and Sparse dictionaries." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139367.

Abstract:
The two main bottlenecks in using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network, based on sparse dictionaries. A sparse dictionary optimized on domain-specific data can be seen as a set of intelligent feature-extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers of the neural network: how do they affect the training time and final performance? The dataset used here is the Cityscapes dataset, which is a library of 25000 labeled road scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks whose performance was tested individually. One of the architectures is much deeper than the other. Results are presented for both networks. They show that filter initialization is an important aspect that should be taken into consideration when training deep networks for semantic segmentation.
19

Bereczki, Márk. "Graph Neural Networks for Article Recommendation based on Implicit User Feedback and Content." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300092.

Abstract:
Recommender systems are widely used in websites and applications to help users find relevant content based on their interests. Graph neural networks have achieved state-of-the-art results in the field of recommender systems, working on data represented in the form of a graph. However, most graph-based solutions face challenges regarding computational complexity or the ability to generalize to new users. Therefore, we propose a novel graph-based recommender system by modifying Simple Graph Convolution, an approach for efficient graph node classification, and adding the capability of generalizing to new users. We build our proposed recommender system for recommending the articles of Peltarion Knowledge Center. By incorporating two data sources, implicit user feedback based on pageview data as well as the content of articles, we propose a hybrid recommender solution. Throughout our experiments, we compare our proposed solution with a matrix factorization approach as well as a popularity-based and a random baseline, analyse the hyperparameters of our model, and examine the capability of our solution to give recommendations to new users who were not part of the training data set. Our model results in slightly lower, but similar Mean Average Precision and Mean Reciprocal Rank scores to the matrix factorization approach, and outperforms the popularity-based and random baselines. The main advantages of our model are computational efficiency and its ability to give relevant recommendations to new users without the need for retraining the model, which are key features for real-world use cases.
Rekommendationssystem används ofta på webbplatser och applikationer för att hjälpa användare att hitta relevant innehåll baserad på deras intressen. Med utvecklingen av grafneurala nätverk nådde toppmoderna resultat inom rekommendationssystem och representerade data i form av en graf. De flesta grafbaserade lösningar har dock svårt med beräkningskomplexitet eller att generalisera till nya användare. Därför föreslår vi ett nytt grafbaserat rekommendatorsystem genom att modifiera Simple Graph Convolution. De här tillvägagångssätt är en effektiv grafnodsklassificering och lägga till möjligheten att generalisera till nya användare. Vi bygger vårt föreslagna rekommendatorsystem för att rekommendera artiklarna från Peltarion Knowledge Center. Genom att integrera två datakällor, implicit användaråterkoppling baserad på sidvisningsdata samt innehållet i artiklar, föreslår vi en hybridrekommendatörslösning. Under våra experiment jämför vi vår föreslagna lösning med en matrisfaktoriseringsmetod samt en popularitetsbaserad och en slumpmässig baslinje, analyserar hyperparametrarna i vår modell och undersöker förmågan hos vår lösning att ge rekommendationer till nya användare som inte deltog av träningsdatamängden. Vår modell resulterar i något mindre men liknande Mean Average Precision och Mean Reciprocal Rank poäng till matrisfaktoriseringsmetoden och överträffar de popularitetsbaserade och slumpmässiga baslinjerna. De viktigaste fördelarna med vår modell är beräkningseffektivitet och dess förmåga att ge relevanta rekommendationer till nya användare utan behov av omskolning av modellen, vilket är nyckelfunktioner för verkliga användningsfall.
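As background for the abstract above, here is a minimal sketch of the feature propagation step of plain Simple Graph Convolution, which precomputes S^K X with S the symmetrically normalized adjacency matrix with self-loops and then trains only a linear model on the result. It shows the unmodified baseline, not the thesis's extension to new users; the function name `sgc_features` and the two-node toy graph are illustrative assumptions.

```python
import numpy as np

def sgc_features(adjacency, features, k=2):
    """Propagate node features k hops: S^k X with S = D^-1/2 (A + I) D^-1/2."""
    a_tilde = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    degree = a_tilde.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalization, applied row-wise and column-wise.
    s = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = np.asarray(features, dtype=float)
    for _ in range(k):
        out = s @ out                                   # one hop of smoothing
    return out

# Toy graph: two nodes joined by one edge, one scalar feature per node.
adjacency = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
features = np.array([[1.0],
                     [3.0]])

propagated = sgc_features(adjacency, features, k=2)
print(propagated)
```

Since the propagation has no trainable parameters, it can be computed once before training, which is the source of the computational efficiency usually attributed to this approach.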
20

Schembri, Massimo. "Anomaly Prediction in Production Supercomputer with Convolution and Semi-supervised autoencoder." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/22379/.

Abstract:
An HPC (High Performance Computing) system is a system with very high computational capacity, suited to tasks that are very demanding in terms of resources. Among the fundamental properties of such a system are availability and reliability, which can be put at risk by hardware and software problems. In this thesis, an anomaly detection system was built and its performance analyzed in terms of its ability to detect and predict anomalies on various nodes of an HPC system, in particular using data from the MARCONI system of the CINECA consortium.
21

Lamouret, Marie. "Traitement automatisés des données acoustiques issues de sondeurs multifaisceaux pour la cartographie des fonds marins." Electronic Thesis or Diss., Toulon, 2022. http://www.theses.fr/2022TOUL0002.

Abstract:
Le sondeur multifaisceaux (SMF) est l'une des technologies d'acoustique sous-marine les plus avancées pour l'étude des fonds et de la colonne d'eau. Il requiert une réelle expertise pour son déploiement sur le terrain ainsi que pour l'élaboration de cartographies à partir des différentes données acquises. Ces traitements sont souvent chronophages en raison de la quantité de données acquises et demandent à être automatisés pour alléger le travail à l'hydrographe. C'est ce sur quoi portent les travaux réalisés durant cette thèse. Après des rappels sur des notions d'acoustique sous-marine, le fonctionnement du SMF est décrit et les types de données manipulées tout au long des traitements sont présentés. Le manuscrit s'articule ensuite autour de deux thématiques ˸ la cartographie bathymétrique et la cartographie biocénotique. Les développements sont intégrés dans les logiciels de l'entreprise Seaviews pour laquelle les travaux sont réalisés. Ils répondent à des besoins particuliers de l'entreprise.En ce qui concerne la cartographie bathymétrique, la donnée bathymétrique doit être préalablement triée pour écarter les sondes aberrantes et éviter qu'elles ne pénalisent la précision topographique. Ce tri d'innombrables sondes est une tâche que réalisent les hydrographes, assistés aujourd'hui d'outils numériques. Nous proposerons une méthode statistique rapide pour trier les sondes tout en réalisant une carte de profondeurs marines. Ce qui amène à se demander si les images de la colonne d'eau acquises également par le sondeur ne seraient pas exploitables pour déduire une bathymétrie exempte d'aberration. Nous testerons cette hypothèse à l'aide de l'apprentissage profond (deep learning) et en particulier par des réseaux de neurones convolutifs qui ont permis des progrès considérables en vision par ordinateur. 
La cartographie des habitats marins (les biocénoses) est un travail de classification de la nature des fonds à partir des données acoustiques du SMF en concordance avec les espèces vivant sur les lieux. La société Seaviews a développé une méthode de préparation des données SMF pour l'analyse des habitats. Nous nous orientons vers des méthodes de classification des habitats, à partir de ces données, par des techniques d'apprentissage automatique (machine learning). Plusieurs méthodes sont mises en place et testées, puis une zone d'étude est choisie pour évaluer et comparer les résultats des différentes approches
Among underwater acoustic technologies, the multibeam echo sounder (MBES) is one of the most advanced tools for studying and mapping the seafloor and the water column above it. Deploying it on site requires expertise, as does the data processing needed to map the information. This processing is very time-consuming because of the massive quantity of recorded data, and thus needs to be automated to shorten and lighten the hydrographer's task. This PhD research focuses on automating the current activities of the company Seaviews. After some reminders on underwater acoustics, the operation of the MBES is described, as well as the data produced, which are manipulated throughout the developments. This document addresses two themes: bathymetric (depth) mapping and marine habitat mapping. The developments are integrated into Seaviews' software so that they can be used by all employees. For seafloor depth mapping, the bathymetric soundings have to be sorted so that outliers do not distort the results. Sorting the countless measurements is cumbersome but necessary, although hydrographers are nowadays computer-assisted. We propose a fast statistical method to exclude the outliers while mapping the information. This raises the question of whether water column imagery could be used to derive an outlier-free bathymetry. We test this hypothesis with deep learning techniques, especially convolutional neural networks. Marine habitat mapping is a classification of the nature of the seabed according to the local life. Seaviews has developed a way to prepare MBES data for habitat analysis. For the classification method itself, we move towards machine learning techniques. Several methods are implemented and assessed, and then an area is chosen to evaluate and compare the results.
22

Belharbi, Soufiane. "Neural networks regularization through representation learning." Thesis, Normandie, 2018. http://www.theses.fr/2018NORMIR10/document.

Abstract:
Les modèles de réseaux de neurones et en particulier les modèles profonds sont aujourd'hui l'un des modèles à l'état de l'art en apprentissage automatique et ses applications. Les réseaux de neurones profonds récents possèdent de nombreuses couches cachées ce qui augmente significativement le nombre total de paramètres. L'apprentissage de ce genre de modèles nécessite donc un grand nombre d'exemples étiquetés, qui ne sont pas toujours disponibles en pratique. Le sur-apprentissage est un des problèmes fondamentaux des réseaux de neurones, qui se produit lorsque le modèle apprend par coeur les données d'apprentissage, menant à des difficultés à généraliser sur de nouvelles données. Le problème du sur-apprentissage des réseaux de neurones est le thème principal abordé dans cette thèse. Dans la littérature, plusieurs solutions ont été proposées pour remédier à ce problème, tels que l'augmentation de données, l'arrêt prématuré de l'apprentissage ("early stopping"), ou encore des techniques plus spécifiques aux réseaux de neurones comme le "dropout" ou la "batch normalization". Dans cette thèse, nous abordons le sur-apprentissage des réseaux de neurones profonds sous l'angle de l'apprentissage de représentations, en considérant l'apprentissage avec peu de données. Pour aboutir à cet objectif, nous avons proposé trois différentes contributions. La première contribution, présentée dans le chapitre 2, concerne les problèmes à sorties structurées dans lesquels les variables de sortie sont à grande dimension et sont généralement liées par des relations structurelles. Notre proposition vise à exploiter ces relations structurelles en les apprenant de manière non-supervisée avec des autoencodeurs. Nous avons validé notre approche sur un problème de régression multiple appliquée à la détection de points d'intérêt dans des images de visages. Notre approche a montré une accélération de l'apprentissage des réseaux et une amélioration de leur généralisation. 
La deuxième contribution, présentée dans le chapitre 3, exploite la connaissance a priori sur les représentations à l'intérieur des couches cachées dans le cadre d'une tâche de classification. Cet à priori est basé sur la simple idée que les exemples d'une même classe doivent avoir la même représentation interne. Nous avons formalisé cet à priori sous la forme d'une pénalité que nous avons rajoutée à la fonction de perte. Des expérimentations empiriques sur la base MNIST et ses variantes ont montré des améliorations dans la généralisation des réseaux de neurones, particulièrement dans le cas où peu de données d'apprentissage sont utilisées. Notre troisième et dernière contribution, présentée dans le chapitre 4, montre l'intérêt du transfert d'apprentissage ("transfer learning") dans des applications dans lesquelles peu de données d'apprentissage sont disponibles. L'idée principale consiste à pré-apprendre les filtres d'un réseau à convolution sur une tâche source avec une grande base de données (ImageNet par exemple), pour les insérer par la suite dans un nouveau réseau sur la tâche cible. Dans le cadre d'une collaboration avec le centre de lutte contre le cancer "Henri Becquerel de Rouen", nous avons construit un système automatique basé sur ce type de transfert d'apprentissage pour une application médicale où l'on dispose d’un faible jeu de données étiquetées. Dans cette application, la tâche consiste à localiser la troisième vertèbre lombaire dans un examen de type scanner. L’utilisation du transfert d’apprentissage ainsi que de prétraitements et de post traitements adaptés a permis d’obtenir des bons résultats, autorisant la mise en oeuvre du modèle en routine clinique
Neural network models, and deep models in particular, are among the leading state-of-the-art models in machine learning and have been applied in many different domains. The most successful deep neural models are those with many layers, which greatly increases their number of parameters. Training such models requires a large number of training samples, which is not always available. One of the fundamental issues in neural networks is overfitting, which is the issue tackled in this thesis. This problem often occurs when large models are trained using few training samples. Many approaches have been proposed to prevent the network from overfitting and improve its generalization performance, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout, batch normalization, etc. In this thesis, we tackle the neural network overfitting issue from a representation learning perspective by considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first, presented in chapter 2, deals with structured output problems, performing multivariate regression when the output variable y contains structural dependencies between its components. Our proposal aims mainly at exploiting these dependencies by learning them in an unsupervised way. Validated on a facial landmark detection problem, learning the structure of the output data was shown to improve the network's generalization and speed up its training. The second contribution, described in chapter 3, deals with the classification task, where we propose to exploit prior knowledge about the internal representation of the hidden layers in neural networks. This prior is based on the idea that samples within the same class should have the same internal representation. We formulate this prior as a penalty that we add to the training cost to be minimized.
Empirical experiments on MNIST and its variants showed an improvement in network generalization when using only a few training samples. Our last contribution, presented in chapter 4, shows the value of transfer learning in applications where only a few samples are available. The idea consists in re-using the filters of pre-trained convolutional networks that have been trained on large datasets such as ImageNet. Such pre-trained filters are plugged into a new convolutional network with new dense layers. Then, the whole network is trained on a new task. In this contribution, we provide an automatic system based on this learning scheme with an application to the medical domain. In this application, the task consists in localizing the third lumbar vertebra in a 3D CT scan. The proposed system includes a pre-processing step that turns the 3D CT scan into a 2D representation and a post-processing step that refines the decision. This work was done in collaboration with the clinic "Rouen Henri Becquerel Center", which provided us with the data.
23

Karimi, Ahmad Maroof. "DATA SCIENCE AND MACHINE LEARNING TO PREDICT DEGRADATION AND POWER OF PHOTOVOLTAIC SYSTEMS: CONVOLUTIONAL AND SPATIOTEMPORAL GRAPH NEURAL NETWORK." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1601082841477951.

24

Mocko, Štefan. "Využitie pokročilých segmentačných metód pre obrazy z TEM mikroskopov." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2018. http://www.nusl.cz/ntk/nusl-378145.

Abstract:
This master's thesis deals with the use of convolutional neural networks for segmentation purposes in the field of transmission electron microscopy. It also describes the chosen neural network topology, U-Net, the augmentation techniques used, and the software environment. Thermo Fisher Scientific (formerly FEI Czech Republic s.r.o.) provided the image data for this work. The segmentation results obtained are presented as curves (ROC, PRC) and as numerical values (ARI, DSC, confusion matrix). The chosen U-Net topology achieved excellent results in pixel-wise segmentation. These results will most likely serve as a stepping stone for internal company research.
25

Elavarthi, Pradyumna. "Semantic Segmentation of RGB images for feature extraction in Real Time." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1573575765136448.

26

Lamma, Tommaso. "A mathematical introduction to geometric deep learning." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23886/.

Abstract:
The aim of geometric deep learning is to extend the deep learning algorithm developed for image classification to non-Euclidean domains such as graphs and simplicial complexes. In this thesis we set out to give a mathematical definition of the key concepts used in geometric deep learning, such as equivariance and convolution on graphs. We will also see how to define a convolutional network that is invariant with respect to group actions.
27

Ďuriš, Denis. "Detekce ohně a kouře z obrazového signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412968.

Abstract:
This diploma thesis deals with the detection of fire and smoke from the image signal. The approach uses a combination of convolutional and recurrent neural networks. The machine learning models created in this work contain inception modules and long short-term memory blocks. The research part describes selected machine learning models used to solve the problem of fire detection in static and dynamic image data. As part of the solution, a data set containing videos and still images was created and used to train the designed neural networks. The results of this approach are evaluated in the conclusion.
28

Sparr, Henrik. "Object detection for a robotic lawn mower with neural network trained on automatically collected data." Thesis, Uppsala universitet, Datorteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444627.

Abstract:
Machine vision is a hot research topic, with findings being published at a high pace and more and more companies developing automated vehicles. Robotic lawn mowers are also increasing in popularity, but most mowers still use relatively simple methods for cutting the lawn. No previous work has been published on machine learning networks that improve between cutting sessions by automatically collecting data and then using it for training. A data acquisition pipeline and a neural network architecture that could help the mower avoid collisions were therefore developed. Nine neural networks were tested, of which a convolutional one reached the highest accuracy. The performance of the data acquisition routine and the networks shows that it is possible to design an object detection model that improves between runs.
29

Trčka, Jan. "Zlepšování kvality digitalizovaných textových dokumentů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-417278.

Abstract:
The aim of this work is to increase the accuracy of the transcription of text documents. The work focuses mainly on texts printed on degraded materials such as newspapers or old books. To solve this problem, current methods and the problems associated with text recognition are analyzed. Based on the acquired knowledge, a method based on the GAN network architecture is chosen and implemented. Experiments are performed on these networks in order to find their appropriate size and learning parameters. Subsequently, testing is performed to compare different learning methods and their results. Both training and testing are performed on an artificial data set. Using the trained networks increases the transcription accuracy from 65.61 % for the raw damaged text lines to 93.23 % for lines processed by the network.
30

Oquab, Maxime. "Convolutional neural networks : towards less supervision for visual recognition." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE061.

Abstract:
Convolutional Neural Networks are flexible learning algorithms for computer vision that scale particularly well with the amount of data provided for training them. Although these methods already had successful applications in the '90s, they were not used in visual recognition pipelines because of their lesser performance on realistic natural images. It was only after the amount of data and the computational power both reached a critical point that these algorithms revealed their potential during the ImageNet challenge of 2012, leading to a paradigm shift in visual recognition. The first contribution of this thesis is a transfer learning setup with a Convolutional Neural Network for image classification. Using a pre-training procedure, we show that image representations learned in a network generalize to other recognition tasks, and that their performance scales up with the amount of data used in pre-training. The second contribution of this thesis is a weakly supervised setup for image classification that can predict the location of objects in complex cluttered scenes, based on a dataset indicating only the presence or absence of objects in training images. The third contribution of this thesis aims at finding possible paths for progress in unsupervised learning with neural networks. We study the recent trend of Generative Adversarial Networks and propose two-sample tests for evaluating models. We investigate possible links with concepts related to causality, and propose a two-sample test method for the task of causal discovery. Finally, building on a recent connection with optimal transport, we investigate what these generative algorithms are learning from unlabeled data.
31

Vančo, Timotej. "Self-supervised učení v aplikacích počítačového vidění." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442510.

Abstract:
The aim of this diploma thesis is to survey self-supervised learning in computer vision applications, choose a suitable test task with an extensive data set, apply self-supervised methods, and evaluate them. The theoretical part describes methods in computer vision, gives a detailed description of neural and convolutional networks, and provides an extensive explanation and categorization of self-supervised methods. It concludes with practical applications of self-supervised methods. The practical part describes the code created for working with datasets and the application of the SSL methods Rotation, SimCLR, MoCo and BYOL to classification and semantic segmentation. Each application of a method is explained in detail and evaluated for various parameters on the large STL10 dataset. Subsequently, the success of the methods is evaluated on different datasets, and the limiting conditions in the classification task are named. The practical part concludes with the application of SSL methods to pre-training the encoder for semantic segmentation on the Cityscapes dataset.
32

Tiensuu, Jacob, Maja Linderholm, Sofia Dreborg, and Fredrik Örn. "Detecting exoplanets with machine learning : A comparative study between convolutional neural networks and support vector machines." Thesis, Uppsala universitet, Institutionen för teknikvetenskaper, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385690.

Abstract:
In this project two machine learning methods, Support Vector Machine (SVM) and Convolutional Neural Network (CNN), are studied to determine which performs best on a labeled data set containing time series of light intensity from extrasolar stars. The main difficulty is that the data set contains far more stars without exoplanets than stars with orbiting exoplanets. This results in a so-called imbalanced data set, which in this case is mitigated by, for example, mirroring the curves of stars with an orbiting exoplanet and adding them to the set. To improve the results further, some preprocessing is done before applying the methods to the data set. For the SVM, feature extraction and a Fourier transform of the time series are important measures, but further preprocessing alternatives are investigated. For the CNN, the time series are both detrended and smoothed, giving two inputs for the same light curve. All code is implemented in Python. Of all the validation metrics, recall is considered the main priority, since it is more important to find all exoplanets than to find all non-exoplanets. The CNN turned out to be the best-performing method for the chosen configurations, with a recall of 1.000, which exceeds the SVM's recall of 0.800. On the second validation metric, precision, the CNN is also the best-performing method, with a precision of 0.769 against the SVM's 0.571.
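As a rough illustration of two ideas in this abstract (not the authors' code), the sketch below oversamples the minority class by mirroring and computes precision and recall from confusion-matrix counts. It assumes that "mirroring" means reversing a light curve in time; the function names, the toy curves, and the counts are all hypothetical, and the counts were chosen only so that the ratios match the figures quoted above.

```python
def mirror_positive_curves(curves, labels):
    """Append a time-reversed copy of every positive (label 1) light curve.

    Assumption: "mirroring" is taken to mean reversing the time axis.
    """
    extra = [list(reversed(c)) for c, y in zip(curves, labels) if y == 1]
    return curves + extra, labels + [1] * len(extra)


def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: one exoplanet star among three light curves.
curves = [[1.0, 0.9, 0.95], [1.0, 1.0, 1.0], [1.0, 1.0, 0.99]]
labels = [1, 0, 0]
aug_curves, aug_labels = mirror_positive_curves(curves, labels)

# Hypothetical counts, not taken from the thesis.
cnn_precision, cnn_recall = precision_recall(tp=10, fp=3, fn=0)
svm_precision, svm_recall = precision_recall(tp=4, fp=3, fn=1)
print(round(cnn_precision, 3), cnn_recall, round(svm_precision, 3), svm_recall)
```

Prioritizing recall, as the abstract does, means tolerating false positives (lower precision) in exchange for missing no exoplanet candidates.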
33

HSIEH, PO-FENG, and 謝柏鋒. "Visualization of Convolution Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/qna29g.

Abstract:
Master's
National Taipei University of Technology
Department of Computer Science and Information Engineering
107
In recent years, convolutional neural networks have seen many groundbreaking developments. The goal of this paper is to analyze the recent YOLO (You Only Look Once), which performs very well in classification for object detection. The paper explains the operation of the convolutional neural network in a simple way. The presented process can make the general public more aware of how machine learning works and also makes it convenient for experts to analyze the network's structure, so that the original architecture can be quickly improved and accelerated.
34

Checg, Chung-Sheng, and 鄭仲勝. "An Accelerative Convolution Neural Network Model." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/2tycsb.

Abstract:
Master's
National Taipei University of Technology
Graduate Institute of Automation Technology
106
Machine learning is a technology that allows computers to learn rules from vast amounts of information and to correct their own mistakes. It shows superiority over conventional hand-crafted methods. However, in shallow learning the capability of modelling complex functions is limited when samples are finite. Thus, shallow learning models are not sufficient to simulate human brains in solving difficult problems. More recently, deep learning was proposed to model complex functions that shallow learning cannot capture and to extract data features automatically. Deep learning generalizes well even without data pre-processing. Its disadvantage, however, is its computational cost. In order to automatically learn the characteristics of data and to model complex functions, a multi-layered network is required. Usually, the more layers a network has, the better its prediction performance. The downside is that the network has millions of weight parameters. Because a great number of parameters causes over-fitting and heavy use of computer memory, we propose a method to reduce the parameters of a convolutional neural network. The goal is to reduce the number of network parameters as much as possible while only slightly degrading the accuracy. To validate the proposed method, the THUR15K [36], Caltech-101 [37], Caltech-256 [38], and GHIM10k [39] databases are used. The experimental results show that the number of parameters is greatly reduced with a slight drop in accuracy (about 1.34%).
35

KANG, NAN-RAN, and 康乃人. "Speaker Verification using Convolution Neural Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/c8e3qe.

Abstract:
Master's
Feng Chia University
Department of Computer Science and Information Engineering
106
Biometric systems are no longer new in daily life and have become more and more popular in recent years. Fingerprint recognition, iris recognition, voiceprint recognition, and the iPhone's Face ID are all biometric systems, and speaker verification is one of them. Speaker recognition can be divided into two parts: feature extraction and classification. In the past, the two parts were solved by different methods; with the rapid development of deep learning, neural networks for speaker recognition have developed broadly. For speaker verification, the Recurrent Neural Network is often used, while the Convolutional Neural Network is less common. Since the Convolutional Neural Network is effective at extracting feature details, this paper uses a convolutional neural network as the basis for addressing both parts of speaker recognition and proposes a successful method for speaker verification.
36

Sun, Tzu-Chun, and 孫梓鈞. "Fruit Recognition Using Deep Convolution Neural Network." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/76315586416634324332.

Abstract:
Master's
National Chi Nan University
Department of Computer Science and Information Engineering
102
This thesis focuses on developing a fruit recognition method. It can be used to improve convenience in daily life by shortening supermarket checkout time. Existing methods for fruit recognition use handcrafted image features, such as the texture, color, and shape of a fruit. However, image features extracted with a set of specific algorithms do not necessarily provide enough information for pattern recognition. In this work, we use a deep convolution neural network (DCNN) to learn discriminative fruit features automatically. In order to achieve high recognition accuracy, we tested many different DCNN configurations. DCNNs of different depths and with different numbers of nodes in each network layer are trained and tested to determine the best configuration. To test the implemented DCNN fruit recognition method, we collected a fruit image database containing big Fuji apples, small Fuji apples, Washington apples, Granny Smith apples, papayas, Hami melons, muskmelons, guavas, bananas, Sunkist, grapefruit, wax apples, limes, peaches, and kiwis. The fruit images are divided into five parts. Four of them are used for training and the remaining one for testing. We have also implemented two existing fruit recognition methods as baselines against which to compare the recognition results of our DCNN approach. Experimental results show that the DCNN method outperforms the other two methods, with a recognition accuracy of about 92.91%.
37

WEI, TSUNG-HSIN, and 魏崇訓. "Video Super-resolution via Convolution Neural Network." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/823kfa.

Abstract:
Master's
National Kaohsiung University of Applied Sciences
Department of Computer Science and Information Engineering
104
Nowadays, people may need super resolution to obtain clearer and more useful information. Image processing technology keeps improving, and more and more researchers present work in this field. Super-resolution algorithms enhance high-frequency information (texture or edges) to improve image quality. Super resolution enables further applications, such as road surveillance systems. The view may be affected by illumination, angle, distance, and other conditions, which makes it harder to recognize license plate numbers or human faces. Interpolation is a useful method for super resolution, but it does not recover high-frequency information. Researchers therefore proposed example-based, learning-based approaches to solve this problem. In recent years, deep learning has achieved strong results and has become faster than before. Even so, it still needs some time to reconstruct an image. This time may be acceptable for a single image, but enhancing a video takes a long time. Our research addresses this problem with the Three Step Search algorithm. We present a faster deep-learning-based super-resolution method for video. We find the blocks that differ between consecutive frames, feed only these blocks into the neural network, and rebuild the high-resolution image, lowering the total computation time.
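The abstract does not spell out how the Three Step Search is applied, so the following is only a generic sketch of the classic block-matching algorithm, not the author's implementation. It assumes grayscale frames stored as lists of rows, a hypothetical block size `n`, and a sum-of-absolute-differences cost; a block whose best match still has a non-zero cost would be a candidate for re-processing by the network.

```python
def sad(ref, cur, ry, rx, cy, cx, n):
    """Sum of absolute differences between the n x n block of `cur` at (cy, cx)
    and the n x n block of `ref` at (ry, rx)."""
    return sum(abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
               for i in range(n) for j in range(n))


def three_step_search(ref, cur, cy, cx, n=4, step=4):
    """Return (dy, dx, cost): the displacement into `ref` that best matches the
    block of `cur` at (cy, cx), found with step sizes 4, 2, 1."""
    h, w = len(ref), len(ref[0])
    best_y, best_x = cy, cx
    best_cost = sad(ref, cur, cy, cx, cy, cx, n)
    while step >= 1:
        center_y, center_x = best_y, best_x
        # Probe the 3 x 3 grid of candidates around the current center.
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = center_y + dy, center_x + dx
                if 0 <= y <= h - n and 0 <= x <= w - n:
                    cost = sad(ref, cur, y, x, cy, cx, n)
                    if cost < best_cost:
                        best_cost, best_y, best_x = cost, y, x
        step //= 2
    return best_y - cy, best_x - cx, best_cost


def frame_with_square(top, left, size=16, n=4):
    """Hypothetical test frame: zeros with an n x n square of value 9."""
    frame = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            frame[top + i][left + j] = 9
    return frame


ref = frame_with_square(8, 8)   # previous frame
cur = frame_with_square(4, 4)   # current frame: square moved up and left
moved = three_step_search(ref, cur, 4, 4)
static = three_step_search(ref, ref, 8, 8)
print(moved, static)
```

The search is greedy, so it probes far fewer positions than an exhaustive scan but is not guaranteed to find the global best match on arbitrary content.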
38

Chang, Yao-Ren, and 張耀仁. "Convolution neural network on WIFI indoor localization." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/s7dhny.

Abstract:
Master's
National Taiwan University
Graduate Institute of Electrical Engineering
106
Mobile payment has been growing very quickly in recent years, and our lives have become more and more convenient. Once a user's position can be located precisely, advertisements can be pushed to the user to increase sales. For example, when you walk into a restaurant, the system immediately sends you a coupon for that restaurant; when you walk into an apparel store, the system lists all the clothes you might like; when you leave a parking lot, the system automatically debits your parking fee. In the past, Wi-Fi localization systems were based on RFID localization and triangulation. Nowadays, with the growth of machine learning methods such as DBSCAN, deep learning, and KNN, a user's location can be determined more precisely. In this paper, we use the Alipay real-time payment dataset for our experiments. We rebuild the geographic information from the Wi-Fi signal and train the model with convolution neural networks. In addition, we reduce the training and testing time overhead through feature engineering. We then evaluate the results with the three most representative machine learning models: LightGBM (multiclass classifier), LightGBM (binary classifier), and Keras (deep neural network). Finally, we evaluate the pros and cons of each machine learning model and discuss the results.
39

HUANG, JIAN-JHIH, and 黃健智. "Apply Convolution Neural Network on Vehicle Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4g44kf.

Abstract:
Master's
National Changhua University of Education
Department of Electrical Engineering
107
With the rapid development of computer hardware and software technology, complex image processing and recognition can now be performed by computers. Traditional image recognition suffers from a lack of flexibility and poor accuracy, disadvantages that neural networks have reduced. The earlier multi-layer perceptron is widely applied in various areas; however, its hidden layers are simple and it converges slowly, which leads to long training times. Most researchers apply R-CNN to real-time recognition systems. These methods have higher accuracy, but their layers are more complicated and require substantial hardware resources. This thesis proposes a CNN with fewer layers, implemented on the Raspberry Pi, a small single-board computer, and uses it to recognize vehicles in real time. To speed up network training, data augmentation and normalization are applied to the training dataset before input. Convolution layers then extract features, pooling layers compress them, and backpropagation is used to update the network parameters. The proposed method has a short training convergence time and better efficiency. Thanks to its simple structure, it can easily be implemented on other single-board modules, which improves its generality. Based on the experiments, the best model after optimizing the network parameters has 11 layers and is used for subsequent inference. The accuracy on the training dataset is 96% after training, and the accuracy on the test dataset used for validation is 94%. The proposed method offers a simple structure and high accuracy for real-time recognition systems.
40

Hu, Yi-Chun, and 胡依淳. "Analysis and Comparison of Convolution Layer in Deep Convolution Neural Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/686x7u.

Abstract:
Master's
National Chi Nan University
Department of Electrical Engineering
106
With the rapid development of information technology, big data has become mainstream, and many identification systems have been greatly affected. Deep learning, which requires a large database to train its models, has therefore become mainstream as well. Deep learning lets machines learn task objectives automatically, and deep learning architectures have thus become a very popular technology in academia. Nowadays, neural networks are popular in the field of visual imaging, where the best-performing model is the convolutional neural network. The progress of deep learning is closely tied to Convolutional Neural Networks (CNNs). Convolutional neural networks, also known as CNNs or ConvNets, are the main development in the field of deep neural networks and can even be more accurate than humans in image recognition. If any method lives up to the expectations placed on deep learning, convolutional neural networks are the first choice. A key part of the convolutional neural network is the weight values in the kernels of the convolutional layer. There are usually three ways to change a convolutional layer in a convolutional neural network: the kernel size, the activation function, and the number of kernels used in the convolution. A neural network may use different activation functions or kernel sizes. The accuracy is not high.
41

XIE, YOU-TING, and 謝侑廷. "CT Images segmentation using Deep Convolution Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4xkkpc.

Abstract:
Master's
National Taiwan University of Science and Technology
Department of Electronic Engineering
107
Because hospitals cannot provide a complete 3D model and current technology cannot produce stereoscopic images without blind spots, doctors may misjudge during surgical evaluation or diagnosis. Doctors can therefore rely only on their own experience to interpret computerized tomography (CT) images during diagnosis and preoperative evaluation. However, images are a two-dimensional form of information and cannot give doctors accurate information about three-dimensional space. With professional training and extensive clinical experience, doctors construct the stereoscopic shape of the liver in their minds and then explain the patient's current liver condition orally. Patients, however, have no professional training in reading computerized tomography (CT) images and thus cannot imagine the three-dimensional shape of the liver, which causes many problems between doctor and patient. This paper proposes a system that can automatically identify the liver and build a 3D model in real time from computed tomography (CT) images. The image segmentation convolutional neural network (SegNet) can automatically segment and classify computerized tomography (CT) images. Because the image thresholds of the organs in the body are similar, it is impossible to segment the liver region directly with the image segmentation convolutional neural network (SegNet). Therefore, this paper first segments the image with the convolutional neural network (SegNet). The liver part of the result is then identified by an algorithm, and the resulting image is used to generate point cloud data. Finally, point cloud reconstruction is performed to generate a 3D mesh of the liver.
42

Vijayan, Raghavendran. "Forecasting retweet count during elections using graph convolution neural networks." Thesis, 2018. https://doi.org/10.7912/C2JM2C.

43

GUAN, HONG-CHENG, and 管宏成. "Detection of Scooters in Taiwan Based on Convolution Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/54627521654968631599.

Abstract:
Master's
Feng Chia University
Department of Computer Science and Information Engineering
105
Because modern people pay increasing attention to traffic safety, drivers in Taiwan install driving recorders that record daily traffic or accidents. Taiwanese commonly use scooters as commuter vehicles. Riders in Taiwan often weave through traffic jams, and it is difficult for car drivers to notice them, which often causes car accidents. In this paper, we propose a system that reminds car drivers of the danger posed by riders in traffic jams or on the street. The system draws rectangles on the driving recorder video to mark objects of interest, namely cars, scooters, buses, persons, and bicycles. Building on the work of R. Girshick [1], we modify Fast R-CNN [2] with the RPN architecture and adjust its learning algorithms. For the R-CNN [3] architecture, the original author uses VGG16 [4] and ZF Net [5]; we use these two types of neural networks and modify the number of RPN categories and related parameters. This paper optimizes the network architecture by adding special samples and tuning its parameters.
44

Cai, Pei Yun, and 蔡佩妘. "Design of A Flexible Accelerator for Deep Convolution Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/76s688.

Abstract:
Master's
National Tsing Hua University
Department of Computer Science
105
The Convolution Neural Network (CNN) is a deep learning method for visual recognition. Its state-of-the-art accuracy makes it widely used in artificial intelligence, computer vision, self-driving cars, etc. However, CNNs are computationally complex and demand high memory bandwidth. Although highly parallel computation can be exploited to achieve effective throughput, data movement must be carefully orchestrated to limit the resulting increase in memory bandwidth. To address these problems, we present a specialized dataflow with spatial hardware (extended from MIT Eyeriss, an energy-efficient reconfigurable CNN accelerator) to reduce memory access without sacrificing performance. Existing works typically improve either the computation or the memory access aspect. However, computational parallelism and memory bandwidth affect each other, so both should be considered at the same time. Convolution operations in CNNs exhibit various types of data reuse and show high parallelism. We apply a highly parallel PE array to improve throughput. To minimize data access, we propose a dataflow that leverages data reuse opportunities and the local buffer inside each PE. Data can then be reused temporally without repeated accesses between high-level memory and the PEs. In addition, a large amount of intermediate data can be accumulated immediately, which could otherwise put additional pressure on storage. For that reason, we propose a tiling methodology that trades off performance against local buffer size. The larger the local buffer, the more data can be reused and the more intermediate data can be consumed, which alleviates the data streaming bottleneck and enables efficient computation. Furthermore, our dataflow and hardware can adapt to different layers with varying shapes, so we can maximize the throughput in each layer of a CNN.
For layer2 of Alexnet, we obtain a speedup of 4.64 times with an additional buffer of 2.63 KB over the initial buffer size, which is one row of input data. Compared to Eyeriss, we obtain a speedup of 1.07 times by using an additional buffer of 18.38 KB. For layer3 of Alexnet, we obtain a speedup of 14.55 times with an additional buffer of 12.8 KB. Compared to Eyeriss, we obtain a speedup of 1.03 times by using an additional buffer of 28.55 KB. As a result, the frame rate reaches 41.7 fps (55.5 GOPS) at a frequency of 250 MHz for the convolution layers of Alexnet. Compared to Eyeriss, under the same frequency of 200 MHz and using Alexnet, we achieve better performance in layer2~4, ranging from 1.03 to 1.07 times, by using an additional on-chip buffer of 16 KB.
45

Ho, Pin-Hui, and 何品蕙. "Design of an Inference Accelerator for Compressed Convolution Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/bxwfdn.

Abstract:
Master's
National Tsing Hua University
Department of Computer Science
106
State-of-the-art convolution neural networks (CNNs) are widely used in intelligent applications such as AI systems, natural language processing, and image recognition. The huge and growing computation latency of high-dimensional convolution has become a critical issue. Most of the multiplications in the convolution layers are ineffectual, as one or both of the input operands are zero. By pruning the redundant connections in CNN models and clamping features to zero with the rectified linear unit (ReLU), the input data of the convolution layers reach a higher sparsity. This high data sparsity leaves considerable room for improvement by compressing the data and skipping the ineffectual multiplications. In this thesis, we propose a CNN accelerator design that accelerates the convolution layers by performing sparse matrix multiplications and reducing the amount of off-chip data transfer. The state-of-the-art Eyeriss architecture is an energy-efficient reconfigurable CNN accelerator with a specialized data flow, a multi-level memory hierarchy, and limited hardware resources. Improving on the Eyeriss architecture, our approach can perform sparse matrix multiplications effectively. With pruned and compressed kernel data and dynamic encoding of the input features into the compressed sparse row (CSR) format, our accelerator significantly reduces off-chip data transfer, minimizing memory accesses and skipping ineffectual multiplications. One disadvantage of the compression scheme is that, after compression, the input workload becomes dynamically imbalanced. The technique therefore lowers data reusability and increases off-chip data transfer. To analyze the data transfer further, we explore the relationship between the on-chip buffer size and off-chip data transfer.
Our design needs to re-access off-chip data when the on-chip buffer cannot store all the output features. Nevertheless, by greatly reducing the amount of computation and memory accesses, our accelerator still achieves better performance. With Alexnet, a popular CNN model, as the benchmark, at 35% input data sparsity and 89.5% kernel data sparsity, our accelerator achieves a 1.12X speedup compared with Eyeriss and a 3.42X speedup compared with our baseline architecture.
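To make the compressed sparse row (CSR) idea concrete, here is a minimal software sketch, not the thesis's hardware design. It encodes a small kernel matrix in CSR form and computes a matrix-vector product that skips every multiplication in which either operand is zero. The function names and the example matrix are hypothetical.

```python
def to_csr(dense):
    """Encode a dense matrix (list of rows) as CSR: (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's run of non-zeros
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by vector x, counting the multiplications performed."""
    out, mults = [], 0
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):  # stored non-zero weights only
            a = x[col_idx[k]]
            if a != 0:                               # skip zero activations as well
                acc += values[k] * a
                mults += 1
        out.append(acc)
    return out, mults


# Hypothetical pruned kernel matrix and a ReLU-style input with a zero entry.
kernel = [[0, 2, 0, 0],
          [1, 0, 0, 3],
          [0, 0, 0, 0]]
activations = [4, 0, 5, 6]

values, col_idx, row_ptr = to_csr(kernel)
result, mults = csr_matvec(values, col_idx, row_ptr, activations)
print(values, col_idx, row_ptr, result, mults)
```

In this toy case a dense product would perform 12 multiplications, while the sparse version performs 2. The per-row work also varies (0, 2, and 0 multiplications), which is a small-scale picture of the workload imbalance the abstract mentions.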
46

WANG, WEI-CHUN, and 王薇淳. "Scene Recognition of Remote Sensing Images Using Convolution Neural Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/nd35c3.

Abstract:
Master's
Chung Cheng Institute of Technology, National Defense University
Master's Program in Space Science
106
Geospatial information work often relies on spatial information and observation systems for spatial data processing and analysis to meet the needs of applications in different fields. However, faced with big-data satellite imagery and the various types of geospatial data on the Internet, users find it hard to analyze such a heavy load of images and data. To conduct image interpretation and earth observation applications effectively, it is no longer possible to work in the traditional manner. In this study, we first integrate remote sensing images and POI geospatial databases to automatically generate the large volume of remote sensing images needed for developing artificial intelligence technology. Image classification in computer vision has always been a fundamental research topic in the field of image recognition. In recent years, image classification using the Convolution Neural Network (CNN) in deep learning has become increasingly popular, and neural network automation has improved recognition capabilities. This opens up a new horizon for image recognition. Therefore, this study employed a transfer learning technique, with a fine-tuning strategy, on the popular InceptionV3 network. The trained model made predictions for a total of 1,296 image tiles from the three test areas in order to assess binary classification between airport and non-airport images. With the fine-tuning strategy added, the average accuracy was improved by 77.37%. The F1 score rose from 0.17 to 0.54, and the F2 score rose from 0.28 to 0.74.
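For readers unfamiliar with the F1 and F2 scores quoted above, both are instances of the general F-beta score, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. The sketch below is a generic illustration; the precision and recall values are hypothetical and are not taken from the thesis.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta = 1 gives the usual F1 score.
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# Hypothetical classifier with low precision but high recall.
p, r = 0.4, 0.9
f1 = f_beta(p, r, beta=1)
f2 = f_beta(p, r, beta=2)
print(round(f1, 3), round(f2, 3))
```

Because F2 weights recall more heavily, an F2 score above the F1 score, as in the abstract, indicates that recall is higher than precision.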
47

LIU, HONG-CEN, and 劉泓岑. "Combining 1D & 2D Convolution Neural Networks For Fall Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/hk76sd.

Abstract:
Master's
Tatung University
Department of Computer Science and Engineering
107
Fall detection mechanisms can be classified by where the sensor is placed: either in the environment or attached to the human body. In the first approach, the sensor usually has to be placed in a specific location, and the sensor is expensive. In contrast, the sensor in the second approach is affordable and is not restricted to a particular place. The second approach usually uses a smartphone as the sensor, because the smartphone can handle detection, determination, and notification by itself. Early research on smartphone-based detection required the smartphone to be placed at a specific position on the body. In reality, however, phone users have different habits regarding where they keep their smartphones. It is important to consider that users will place their smartphones in different positions rather than in a single position. In view of this, this paper proposes a mechanism that uses an accelerometer and two neural networks for fall detection. The first neural network excludes non-fall events, and the second neural network confirms the fall event. In addition, the mechanism does not require users to place the smartphone on a single body part, allowing them to keep it in a garment, trousers, or jacket pocket. According to the experimental results in this paper, the specificity and accuracy of this mechanism are better than those of the three other methods. The sensitivity of this method is slightly lower than that of one of the three methods but better than that of the other two.
48

Fu, Chien-Chun, and 傅建鈞. "A System for Disguised Face Recognition with Convolution Neural Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/f5gfn3.

Abstract:
Master's
Tamkang University
In-service Master's Program, Department of Electrical Engineering
106
In this paper, we propose a disguised face recognition system based on the Deep Normalization and Convolution Neural Network (DNCNN). The system includes two trained DNCNN identification networks. The first network identifies the type of disguise in the input face image, classifying it into one of three categories: no disguise, upper-half-face disguise, and lower-half-face disguise. After classification, the system removes the disguised upper or lower half of the face image and passes the remaining non-disguised half-face image to the second recognition network, which recognizes the identity of the input half-face image. To reduce the over-fitting caused by imbalanced and insufficient training samples, the original image samples must be pre-processed before the two DNCNN networks are trained and tested. Pre-processing uses the Viola-Jones face detection algorithm, which first locates the face region in the original image; the pre-processing step then rotates and crops the face image or half-face images for training and testing the recognition networks. After pre-processing is complete, the DNCNN recognition networks can be trained and tested. The experimental results show that the system achieves a recognition rate similar to that of the reference.
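The routing step between the two networks (drop the disguised half, keep the other) can be sketched as follows. This is an illustration only: the image representation, the label strings, the split at the vertical midpoint, and the choice to pass the full face through when no disguise is detected are all assumptions, not details confirmed by the abstract.

```python
from typing import List

Image = List[List[int]]  # assumed: grayscale image as rows of pixel values


def keep_undisguised_half(face: Image, disguise: str) -> Image:
    """Return the part of the face image that the disguise does not cover."""
    mid = len(face) // 2  # assumed split at the vertical midpoint
    if disguise == "upper":
        return face[mid:]   # upper half is disguised, keep the lower half
    if disguise == "lower":
        return face[:mid]   # lower half is disguised, keep the upper half
    if disguise == "none":
        return face         # assumption: undisguised faces pass through whole
    raise ValueError(f"unknown disguise label: {disguise!r}")


# Tiny 4-row "image" so the effect of each route is easy to see.
face = [[1, 1], [2, 2], [3, 3], [4, 4]]
lower_half = keep_undisguised_half(face, "upper")
upper_half = keep_undisguised_half(face, "lower")
```

The output of this step would be the input to the second (identity) network, with the disguise label supplied by the first.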
49

(5931047), Akash Gaikwad. "Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment." Thesis, 2019.

Abstract:

In recent years, deep learning models have become popular in real-time embedded applications, but hardware deployment is complicated by limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolution Neural Network (CNN) through compression techniques such as architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the promising techniques for solving these problems.

This thesis proposes methods to prune the convolution neural network (SqueezeNet) without introducing network sparsity in the pruned model.

This thesis proposes three methods to prune the CNN to decrease the model size of CNN without a significant drop in the accuracy of the model.

1: Pruning based on Taylor expansion of change in cost function Delta C.

2: Pruning based on L2 normalization of activation maps.

3: Pruning based on a combination of method 1 and method 2.

The proposed methods use various ranking methods to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. A transfer learning technique is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the size of the SqueezeNet model by 72% without a significant drop in the accuracy of the model (optimal pruning efficiency result). Results also show that pruning based on a combination of the Taylor expansion of the cost function and L2 normalization of activation maps achieves better pruning efficiency than the individual pruning criteria, and that most of the pruned kernels come from mid- and high-level layers. The pruned model was deployed on BlueBox 2.0 using RTMaps software, and its performance was evaluated.
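The three ranking criteria listed above can be sketched in a few lines. This is one plausible reading rather than the thesis's implementation: the Taylor score follows the commonly used first-order form |mean(activation × gradient)|, the second score is taken to be the L2 norm of a filter's activation map, and the combination (sum of the two scores after each is L2-normalized across filters) is an assumption made for illustration. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def taylor_score(activation: Sequence[float], gradient: Sequence[float]) -> float:
    """First-order estimate of the change in cost if this filter is removed."""
    m = len(activation)
    return abs(sum(a * g for a, g in zip(activation, gradient)) / m)


def l2_score(activation: Sequence[float]) -> float:
    """L2 norm of a filter's (flattened) activation map."""
    return math.sqrt(sum(a * a for a in activation))


def normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores across filters so the two criteria are comparable."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores] if norm else list(scores)


def filters_to_prune(activations, gradients, n_prune: int) -> List[int]:
    """Indices of the n_prune lowest-ranked filters under the combined score."""
    taylor = normalize([taylor_score(a, g) for a, g in zip(activations, gradients)])
    l2 = normalize([l2_score(a) for a in activations])
    combined = [t + n for t, n in zip(taylor, l2)]  # assumed combination rule
    order = sorted(range(len(combined)), key=lambda i: combined[i])
    return sorted(order[:n_prune])


# Three filters with flattened activation maps and matching gradients.
activations = [[1.0, 1.0], [0.1, 0.0], [2.0, 2.0]]
gradients = [[0.5, 0.5], [0.1, 0.1], [0.0, 0.1]]
pruned = filters_to_prune(activations, gradients, n_prune=1)
```

Removing whole filters, rather than zeroing individual weights, is what keeps the pruned model dense, which matches the abstract's stated aim of pruning without introducing sparsity. Fine-tuning by backpropagation would follow each pruning step.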

50

Gaikwad, Akash S. "Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment." Thesis, 2018. http://hdl.handle.net/1805/17923.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
In recent years, deep learning models have become popular in real-time embedded applications, but hardware deployment is complicated by limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolution Neural Network (CNN) through compression techniques such as architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the promising techniques for solving these problems. This thesis proposes methods to prune the convolution neural network (SqueezeNet) without introducing network sparsity in the pruned model. This thesis proposes three methods to prune the CNN to decrease the model size of CNN without a significant drop in the accuracy of the model. 1: Pruning based on Taylor expansion of change in cost function Delta C. 2: Pruning based on L2 normalization of activation maps. 3: Pruning based on a combination of method 1 and method 2. The proposed methods use various ranking methods to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. A transfer learning technique is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the size of the SqueezeNet model by 72% without a significant drop in the accuracy of the model (optimal pruning efficiency result). Results also show that pruning based on a combination of the Taylor expansion of the cost function and L2 normalization of activation maps achieves better pruning efficiency than the individual pruning criteria, and that most of the pruned kernels come from mid- and high-level layers. The pruned model was deployed on BlueBox 2.0 using RTMaps software, and its performance was evaluated.