Dissertations / Theses on the topic 'Convolutive Neural Networks'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Convolutive Neural Networks.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Heuillet, Alexandre. "Exploring deep neural network differentiable architecture design." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG069.
Full text
Artificial Intelligence (AI) has gained significant popularity in recent years, primarily due to its successful applications in various domains, including textual data analysis, computer vision, and audio processing. The resurgence of deep learning techniques has played a central role in this success. The groundbreaking paper by Krizhevsky et al., AlexNet, narrowed the gap between human and machine performance in image classification tasks. Subsequent papers such as Xception and ResNet have further solidified deep learning as a leading technique, opening new horizons for the AI community. The success of deep learning lies in its architecture, which is manually designed with expert knowledge and empirical validation. However, these architectures lack the certainty of an optimal solution. To address this issue, recent papers introduced the concept of Neural Architecture Search (NAS), enabling the learning of deep architectures. However, most initial approaches focused on large architectures with specific targets (e.g., supervised learning) and relied on computationally expensive optimization techniques such as reinforcement learning and evolutionary algorithms. In this thesis, we further investigate this idea by exploring automatic deep architecture design, with a particular emphasis on differentiable NAS (DNAS), which represents the current trend in NAS due to its computational efficiency. While our primary focus is on Convolutional Neural Networks (CNNs), we also explore Vision Transformers (ViTs) with the goal of designing cost-effective architectures suitable for real-time applications.
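The differentiable NAS idea mentioned above can be pictured with a small sketch. The PyTorch-style snippet below is an illustration only; the candidate operations, their number, and all names are assumptions, not taken from the thesis. It shows the core trick of DNAS methods such as DARTS: the discrete choice between candidate operations is relaxed into a softmax over learnable architecture parameters, so the architecture itself can be optimized by gradient descent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One edge of a DNAS search cell: a softmax-weighted sum of candidate ops."""
    def __init__(self, channels: int):
        super().__init__()
        # Hypothetical candidate operations; real search spaces differ.
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.Conv2d(channels, channels, 5, padding=2, bias=False),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),
        ])
        # Architecture parameters, learned jointly with the network weights.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)   # relax the discrete choice
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# After the search, the operation with the largest alpha is typically kept
# and the others are discarded to obtain the final discrete architecture.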
Maragno, Alessandro. "Programmazione di Convolutional Neural Networks orientata all'accelerazione su FPGA." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/12476/.
Full text
Abbasi, Mahdieh. "Toward robust deep neural networks." Doctoral thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/67766.
Full text
In this thesis, our goal is to develop robust and reliable yet accurate learning models, particularly Convolutional Neural Networks (CNNs), in the presence of adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to predict adversarial instances with high uncertainty by encouraging diversity in an ensemble of CNNs. To this end, we devise an ensemble of diverse specialists along with a simple and computationally efficient voting mechanism to predict adversarial examples with low confidence while keeping the predictive confidence of clean samples high. In the presence of high entropy in our ensemble, we prove that the predictive confidence can be upper-bounded, leading to a globally fixed threshold over the predictive confidence for identifying adversaries. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several benchmark datasets. The second contribution aims to address the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how can we differentiate the many available OOD sets with respect to a given in-distribution task in order to select the most appropriate one, which in turn induces a model with a high detection rate on unseen OOD sets? To answer this question, we hypothesize that the "protection" level of in-distribution sub-manifolds by each OOD set can be a good property for differentiating OOD sets. To measure the protection level, we design three novel, simple, and cost-effective metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented-CNN (A-CNN) and an explicitly-calibrated CNN to detect a significantly larger portion of unseen OOD samples when they are trained on the most protective OOD set. Interestingly, we also observe that the A-CNN trained on the most protective OOD set can also detect black-box Fast Gradient Sign (FGS) adversarial examples. As the third contribution, we investigate more closely the capacity of the A-CNN for detecting a wider range of black-box adversaries. To increase the capability of the A-CNN to detect a larger number of adversaries, we augment its OOD training set with inter-class interpolated samples. We then demonstrate that the A-CNN trained on the most protective OOD set along with the interpolated samples has a consistent detection rate on all types of unseen adversarial examples, whereas training an A-CNN on Projected Gradient Descent (PGD) adversaries does not lead to a stable detection rate on all types of adversaries, particularly the unseen types. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and its augmented counterpart in the presence of adversarial and clean samples. With a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with small risk rates on both clean samples and unusual ones, e.g. adversarial and OOD samples. The last contribution is to show a use-case of the A-CNN for training a robust object detector on a partially-labeled dataset, particularly a merged dataset.
Merging various datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way to craft a large-scale dataset which covers a larger spectrum of OoIs. Moreover, merging datasets allows achieving a unified object detector, instead of having several separate ones, resulting in a reduction of computational and time costs. However, merging datasets, especially from a similar context, causes many missing-label instances. With the goal of training an integrated robust object detector on a partially-labeled but large-scale dataset, we propose a self-supervised training framework to overcome the issue of missing-label instances in merged datasets. Our framework is evaluated on a merged dataset with a high missing-label rate. The empirical results confirm the viability of our generated pseudo-labels in enhancing the performance of YOLO, as the current (to date) state-of-the-art object detector.
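As a rough illustration of the first contribution's voting idea (the exact mechanism and threshold in the thesis differ; everything below is an assumption made for illustration), an ensemble's averaged prediction can be rejected whenever its confidence falls below a single global threshold, so inputs on which the members disagree, such as many adversarial examples, end up flagged as uncertain:

import numpy as np

def ensemble_predict(member_probs, threshold=0.7):
    """member_probs: list of (num_classes,) softmax vectors, one per ensemble member."""
    avg = np.mean(member_probs, axis=0)      # disagreement flattens this average
    label = int(np.argmax(avg))
    confidence = float(avg[label])
    if confidence < threshold:               # low confidence: flag as suspicious
        return None, confidence              # e.g. a possible adversarial example
    return label, confidence

# Members that agree yield a confident prediction; members that disagree do not.
agree = [np.array([0.9, 0.1]), np.array([0.8, 0.2]), np.array([0.85, 0.15])]
split = [np.array([0.9, 0.1]), np.array([0.1, 0.9]), np.array([0.5, 0.5])]
print(ensemble_predict(agree))   # accepted: (0, 0.85)
print(ensemble_predict(split))   # rejected: (None, 0.5)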
Kapoor, Rishika. "Malaria Detection Using Deep Convolution Neural Network." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613749143868579.
Full text
Yu, Xiafei. "Wide Activated Separate 3D Convolution for Video Super-Resolution." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39974.
Full text
Messou, Ehounoud Joseph Christopher. "Handling Invalid Pixels in Convolutional Neural Networks." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/98619.
Full textMaster of Science
A module at the heart of deep neural networks built for Artificial Intelligence is the convolutional layer. When multiple convolutional layers are used together with other modules, a Convolutional Neural Network (CNN) is obtained. These CNNs can be used for tasks such as image classification, where they tell whether the object in an image is, for example, a chair or a car. Most CNNs use a normal convolutional layer that assumes that all parts of the image fed to the network are valid. However, most models zero-pad the image at the beginning to maintain a certain output shape. Zero padding is equivalent to adding a black frame around the image. These added pixels introduce information that was not initially present, so this extra information can be considered invalid. Invalid pixels can also lie inside the image, where they are referred to as holes in completion tasks such as image inpainting, in which the network is asked to fill these holes and produce a realistic image. In this work, we look for a method that can handle both types of invalid pixels. We compare, on the same test bench, two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs the best in image classification, while Gated convolution has the advantage in semantic segmentation. As for hotel recognition with masked regions, none of the methods seems appropriate for generating embeddings that leverage the masked regions.
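For readers unfamiliar with the methods compared above, the snippet below sketches the renormalization idea behind partial convolution. It is a simplified illustration written for this listing, not the implementation evaluated in the thesis: the kernel only sees pixels marked as valid, and the response is rescaled according to how many valid pixels fell under the window.

import torch
import torch.nn.functional as F

def partial_conv2d(x, mask, weight, bias=None, padding=1):
    """x: (N,C,H,W) image, mask: (N,1,H,W) with 1 = valid pixel, 0 = invalid/hole."""
    ones = torch.ones(1, 1, *weight.shape[2:])            # all-ones window
    valid_count = F.conv2d(mask, ones, padding=padding)   # valid pixels per window
    out = F.conv2d(x * mask, weight, padding=padding)     # convolve only valid data
    scale = weight.shape[2] * weight.shape[3] / valid_count.clamp(min=1.0)
    out = out * scale                                      # renormalize the response
    if bias is not None:
        out = out + bias.view(1, -1, 1, 1)
    new_mask = (valid_count > 0).float()                   # window saw any valid pixel
    return out, new_mask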
Ngo, Kalle. "FPGA Hardware Acceleration of Inception Style Parameter Reduced Convolution Neural Networks." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-205026.
Full text
Pappone, Francesco. "Graph neural networks: theory and applications." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23893/.
Full text
Sung, Wei-Hong. "Investigating minimal Convolution Neural Networks (CNNs) for realtime embedded eye feature detection." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281338.
Full textMed den snabba ökningen av neurala nätverk kan många uppgifter som brukade vara svåra att utföra i traditionella metoder nu lösas bra, särskilt inom datorsynsfältet. Men eftersom uppgifterna vi måste lösa har blivit mer och mer komplexa, blir de neurala nätverken vi använder djupare och större. Därför, även om vissa inbäddade system är kraftfulla för närvarande, lider de flesta inbäddade system fortfarande av minnes- och beräkningsbegränsningar, vilket innebär att det är svårt att distribuera våra stora neurala nätverk på dessa inbäddade enheter. Projektet syftar till att utforska olika metoder för att komprimera den ursprungliga stora modellen. Det vill säga, vi tränar först en baslinjemodell, YOLOv3[1], som är ett berömt objektdetekteringsnätverk, och sedan använder vi två metoder för att komprimera basmodellen. Den första metoden är beskärning med hjälp av sparsity training, och vi kanalskärning enligt skalningsfaktorvärdet efter sparsity training. Baserat på idén om denna metod har vi gjort tre utforskningar. För det första tar vi unionens maskstrategi för att lösa dimensionsproblemet för genvägsrelaterade lager i YOLOv3[1]. För det andra försöker vi absorbera informationen om skiftande faktorer i efterföljande lager. Slutligen implementerar vi lagerskärningen och kombinerar det med kanalbeskärning. Den andra metoden är beskärning med NAS, som använder en djup förstärkningsram för att automatiskt hitta det bästa kompressionsförhållandet för varje lager. I slutet av denna rapport analyserar vi de viktigaste resultaten och slutsatserna i vårt experiment och syftar till det framtida arbetet som potentiellt kan förbättra vårt projekt.
Wu, Jindong. "Pooling strategies for graph convolution neural networks and their effect on classification." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288953.
Full textMed utvecklingen av grafneurala nätverk har detta nya neurala nätverk tillämpats i olika område. Ett av de svåra problemen för forskare inom detta område är hur man väljer en lämplig poolningsmetod för en specifik forskningsuppgift från en mängd befintliga poolningsmetoder. I den här arbetet, baserat på de befintliga vanliga grafpoolingsmetoderna, utvecklar vi ett riktmärke för neuralt nätverk ram som kan användas till olika diagram pooling metoders jämförelse. Genom att använda ramverket jämför vi fyra allmängiltig diagram pooling metod och utforska deras egenskaper. Dessutom utvidgar vi två metoder för att förklara beslut om neuralt nätverk från convolution neurala nätverk till diagram neurala nätverk och jämföra dem med befintliga GNNExplainer. Vi kör experiment av grafisk klassificering uppgifter under benchmarkingramverk och hittade olika egenskaper av olika diagram pooling metoder. Dessutom verifierar vi korrekthet i dessa förklarningsmetoder som vi utvecklade och mäter överenskommelserna mellan dem. Till slut, vi försöker utforska egenskaper av olika metoder för att förklara neuralt nätverks beslut och deras betydelse för att välja pooling metoder i grafisk neuralt nätverk.
GIACOPELLI, Giuseppe. "An Original Convolution Model to analyze Graph Network Distribution Features." Doctoral thesis, Università degli Studi di Palermo, 2022. https://hdl.handle.net/10447/553177.
Full textIoannou, Yani Andrew. "Structural priors in deep neural networks." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278976.
Full textJackman, Simeon. "Football Shot Detection using Convolutional Neural Networks." Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157438.
Full textCranston, Daniel, and Filip Skarfelt. "Normalized Convolution Network and Dataset Generation for Refining Stereo Disparity Maps." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158449.
Full textHighlander, Tyler. "Efficient Training of Small Kernel Convolutional Neural Networks using Fast Fourier Transform." Wright State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=wright1432747175.
Full textSunesson, Albin. "Establishing Effective Techniques for Increasing Deep Neural Networks Inference Speed." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-213833.
Full textDe senaste årens trend inom deep learning har varit att addera fler och fler lager till neurala nätverk. Det här introducerar nya utmaningar i applikationer med latensberoende. Problemet uppstår från mängden beräkningar som måste utföras vid varje evaluering. Detta adresseras med en reducering av inferenshastigheten. Jag analyserar två olika metoder för att snabba upp evalueringen av djupa neurala näverk. Den första metoden reducerar antalet vikter i ett faltningslager via en tensordekomposition på dess kärna. Den andra metoden låter samples lämna nätverket via tidiga förgreningar när en klassificering är säker. Båda metoderna utvärderas på flertalet nätverksarkitekturer med konsistenta resultat. Dekomposition på fältningskärnan visar 20-70% hastighetsökning med mindre än 1% försämring av klassifikationssäkerhet i evaluerade konfigurationer. Tidiga förgreningar visar upp till 300% hastighetsökning utan någon försämring av klassifikationssäkerhet när de evalueras på CPU.
Shuvo, Md Kamruzzaman. "Hardware Efficient Deep Neural Network Implementation on FPGA." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/theses/2792.
Full textAndersson, Viktor. "Semantic Segmentation : Using Convolutional Neural Networks and Sparse dictionaries." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139367.
Full textBereczki, Márk. "Graph Neural Networks for Article Recommendation based on Implicit User Feedback and Content." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300092.
Full textRekommendationssystem används ofta på webbplatser och applikationer för att hjälpa användare att hitta relevant innehåll baserad på deras intressen. Med utvecklingen av grafneurala nätverk nådde toppmoderna resultat inom rekommendationssystem och representerade data i form av en graf. De flesta grafbaserade lösningar har dock svårt med beräkningskomplexitet eller att generalisera till nya användare. Därför föreslår vi ett nytt grafbaserat rekommendatorsystem genom att modifiera Simple Graph Convolution. De här tillvägagångssätt är en effektiv grafnodsklassificering och lägga till möjligheten att generalisera till nya användare. Vi bygger vårt föreslagna rekommendatorsystem för att rekommendera artiklarna från Peltarion Knowledge Center. Genom att integrera två datakällor, implicit användaråterkoppling baserad på sidvisningsdata samt innehållet i artiklar, föreslår vi en hybridrekommendatörslösning. Under våra experiment jämför vi vår föreslagna lösning med en matrisfaktoriseringsmetod samt en popularitetsbaserad och en slumpmässig baslinje, analyserar hyperparametrarna i vår modell och undersöker förmågan hos vår lösning att ge rekommendationer till nya användare som inte deltog av träningsdatamängden. Vår modell resulterar i något mindre men liknande Mean Average Precision och Mean Reciprocal Rank poäng till matrisfaktoriseringsmetoden och överträffar de popularitetsbaserade och slumpmässiga baslinjerna. De viktigaste fördelarna med vår modell är beräkningseffektivitet och dess förmåga att ge relevanta rekommendationer till nya användare utan behov av omskolning av modellen, vilket är nyckelfunktioner för verkliga användningsfall.
Schembri, Massimo. "Anomaly Prediction in Production Supercomputer with Convolution and Semi-supervised autoencoder." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/22379/.
Full textLamouret, Marie. "Traitement automatisés des données acoustiques issues de sondeurs multifaisceaux pour la cartographie des fonds marins." Electronic Thesis or Diss., Toulon, 2022. http://www.theses.fr/2022TOUL0002.
Full text
Among underwater acoustic technologies, the multibeam echo sounder (MBES) is one of the most advanced tools for studying and mapping the seafloor and the water column above it. Its deployment on-site requires expertise, as does the whole processing chain that turns the data into maps. This processing is very time-consuming due to the massive quantity of recorded data, and it therefore needs to be automated to shorten and lighten the hydrographer's task. This PhD research focuses on automating the current activities of the Seaviews company. After some reminders on underwater acoustics, the operation of the MBES is described, together with the data it produces, which are manipulated throughout the developments. This document covers two themes: bathymetric (depth) mapping and marine habitat mapping. The developments are integrated into Seaviews' software so that they can be used by all employees. Regarding seafloor depth mapping, the bathymetric soundings have to be filtered so that outliers do not distort the results. Sorting the countless measurements is cumbersome but necessary, even though hydrographers are now computer-assisted. We propose a fast statistical method to exclude outliers while mapping the information. This raises the question of whether water-column imagery could be used to deduce the bathymetry reliably. We test this hypothesis with deep learning techniques, especially convolutional neural networks. Marine habitat mapping classifies the nature of the seabed according to the local marine life. Seaviews has worked on a way to prepare MBES data for habitat analysis. Concerning the classification method itself, we turn to machine learning techniques. Several methods are implemented and assessed, and an area is then chosen to evaluate and compare the results.
Belharbi, Soufiane. "Neural networks regularization through representation learning." Thesis, Normandie, 2018. http://www.theses.fr/2018NORMIR10/document.
Full text
Neural network models and deep models are among the leading state-of-the-art models in machine learning. They have been applied in many different domains. The most successful deep neural models are the ones with many layers, which greatly increases their number of parameters. Training such models requires a large number of training samples, which is not always available. One of the fundamental issues in neural networks is overfitting, which is the issue tackled in this thesis. Such a problem often occurs when the training of large models is performed using few training samples. Many approaches have been proposed to prevent the network from overfitting and improve its generalization performance, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout, batch normalization, etc. In this thesis, we tackle the neural network overfitting issue from a representation learning perspective by considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first one, presented in chapter 2, is dedicated to dealing with structured output problems to perform multivariate regression when the output variable y contains structural dependencies between its components. Our proposal aims mainly at exploiting these dependencies by learning them in an unsupervised way. Validated on a facial landmark detection problem, learning the structure of the output data has been shown to improve the network generalization and speed up its training. The second contribution, described in chapter 3, deals with the classification task, where we propose to exploit prior knowledge about the internal representation of the hidden layers in neural networks. This prior is based on the idea that samples within the same class should have the same internal representation. We formulate this prior as a penalty that we add to the training cost to be minimized. Empirical experiments over MNIST and its variants showed an improvement of the network generalization when using only few training samples. Our last contribution, presented in chapter 4, shows the interest of transfer learning in applications where only few samples are available. The idea consists in re-using the filters of pre-trained convolutional networks that have been trained on large datasets such as ImageNet. Such pre-trained filters are plugged into a new convolutional network with new dense layers. Then, the whole network is trained on a new task. In this contribution, we provide an automatic system based on such a learning scheme, with an application to the medical domain. In this application, the task consists in localizing the third lumbar vertebra in a 3D CT scan. A pre-processing of the 3D CT scan to obtain a 2D representation and a post-processing to refine the decision are included in the proposed system. This work has been done in collaboration with the clinic "Rouen Henri Becquerel Center", which provided us with data.
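The prior of the second contribution, that samples within the same class should share an internal representation, can be written as a penalty added to the training cost. The sketch below is one possible reading of that idea (the layer choice, the squared-distance-to-centroid form, and the weight lam are assumptions), not the thesis' exact formulation:

import torch
import torch.nn.functional as F

def same_class_penalty(hidden: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """hidden: (batch, dim) activations of a hidden layer; labels: (batch,) class ids."""
    penalty = hidden.new_tensor(0.0)
    for c in labels.unique():
        h_c = hidden[labels == c]
        center = h_c.mean(dim=0, keepdim=True)       # class centroid in feature space
        penalty = penalty + ((h_c - center) ** 2).sum(dim=1).mean()
    return penalty / labels.unique().numel()

def total_loss(logits, hidden, labels, lam: float = 0.1):
    # supervised loss plus a weighted pull toward each class mean
    return F.cross_entropy(logits, labels) + lam * same_class_penalty(hidden, labels)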
Karimi, Ahmad Maroof. "DATA SCIENCE AND MACHINE LEARNING TO PREDICT DEGRADATION AND POWER OF PHOTOVOLTAIC SYSTEMS: CONVOLUTIONAL AND SPATIOTEMPORAL GRAPH NEURAL NETWORK." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1601082841477951.
Full text
Mocko, Štefan. "Využitie pokročilých segmentačných metód pre obrazy z TEM mikroskopov." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2018. http://www.nusl.cz/ntk/nusl-378145.
Full text
Elavarthi, Pradyumna. "Semantic Segmentation of RGB images for feature extraction in Real Time." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1573575765136448.
Full text
Lamma, Tommaso. "A mathematical introduction to geometric deep learning." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23886/.
Full text
Ďuriš, Denis. "Detekce ohně a kouře z obrazového signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412968.
Full text
Sparr, Henrik. "Object detection for a robotic lawn mower with neural network trained on automatically collected data." Thesis, Uppsala universitet, Datorteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444627.
Full text
Trčka, Jan. "Zlepšování kvality digitalizovaných textových dokumentů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-417278.
Full text
Oquab, Maxime. "Convolutional neural networks : towards less supervision for visual recognition." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE061.
Full text
Convolutional Neural Networks are flexible learning algorithms for computer vision that scale particularly well with the amount of data provided for training them. Although these methods had successful applications already in the '90s, they were not used in visual recognition pipelines because of their lesser performance on realistic natural images. It is only after the amount of data and the computational power both reached a critical point that these algorithms revealed their potential during the ImageNet challenge of 2012, leading to a paradigm shift in visual recognition. The first contribution of this thesis is a transfer learning setup with a Convolutional Neural Network for image classification. Using a pre-training procedure, we show that image representations learned in a network generalize to other recognition tasks, and their performance scales up with the amount of data used in pre-training. The second contribution of this thesis is a weakly supervised setup for image classification that can predict the location of objects in complex cluttered scenes, based on a dataset indicating only the presence or absence of objects in training images. The third contribution of this thesis aims at finding possible paths for progress in unsupervised learning with neural networks. We study the recent trend of Generative Adversarial Networks and propose two-sample tests for evaluating models. We investigate possible links with concepts related to causality, and propose a two-sample test method for the task of causal discovery. Finally, building on a recent connection with optimal transport, we investigate what these generative algorithms are learning from unlabeled data.
Vančo, Timotej. "Self-supervised učení v aplikacích počítačového vidění." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442510.
Full textTiensuu, Jacob, Maja Linderholm, Sofia Dreborg, and Fredrik Örn. "Detecting exoplanets with machine learning : A comparative study between convolutional neural networks and support vector machines." Thesis, Uppsala universitet, Institutionen för teknikvetenskaper, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385690.
Full textHSIEH, PO-FENG, and 謝柏鋒. "Visualization of Convolution Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/qna29g.
Full text國立臺北科技大學
資訊工程系
107
In recent years, convolutional neural networks have seen many groundbreaking developments. The goal of this paper is to analyze the recent YOLO (You Only Look Once) model, an object detection technique with very good classification performance. This paper explains the operation of the convolutional neural network in a simple way and presents the process, which can make the general public more aware of how machine learning works while also making it convenient for experts to analyze the structure of the network, quickly improve the original architecture, and accelerate it.
Checg, Chung-Sheng, and 鄭仲勝. "An Accelerative Convolution Neural Network Model." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/2tycsb.
Full text國立臺北科技大學
自動化科技研究所
106
Machine learning is a technology that allows computers to learn rules from vast amounts of information and correct their mistakes themselves, and it has shown its superiority over conventional manual methods. In shallow learning, however, the capability of modelling complex functions is limited in the case of finite samples, so shallow learning models are not enough to simulate human brains in solving difficult problems. More recently, deep learning was proposed to model complex functions that shallow learning cannot achieve and to automatically extract data features. Deep learning generalizes well even without data pre-processing, but its disadvantage is its computational cost. In order to automatically learn the characteristics of data and to model complex functions, a multi-layered network is required. Usually, when a network has more layers, its prediction performance is better, but it comes with the downside that the network has millions of weighting parameters. Because a great number of parameters causes over-fitting and massive use of computer memory, we propose a method to reduce the parameters of a convolutional neural network. The goal is to reduce the number of network parameters as much as possible while only slightly degrading the accuracy. To validate the proposed method, the THUR15K[36], Caltech-101[37], Caltech-256[38], and GHIM10k[39] databases are used. The experimental results show that the parameters are greatly reduced with a slight drop in accuracy (about 1.34%).
KANG, NAN-RAN, and 康乃人. "Speaker Verification using Convolution Neural Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/c8e3qe.
Full text逢甲大學
資訊工程學系
106
Biometric systems are no longer a new thing in daily life and have become more and more popular in recent years; fingerprint recognition, iris recognition, voiceprint recognition, and the iPhone's Face ID are all biometric systems, and speaker verification is one of them. Speaker recognition can be divided into two parts: feature extraction and classification. In the past, the two parts were solved by different methods; with the rapid development of deep learning, neural networks for speaker recognition have developed broadly. For speaker verification, the Recurrent Neural Network is often used to solve the problem, and it is less common to use the Convolution Neural Network. Since the Convolution Neural Network is good at extracting detailed features, this paper uses a convolutional neural network as the basis to tackle the two parts of speaker recognition and proposes a successful method for speaker verification.
Sun, Tzu-Chun, and 孫梓鈞. "Fruit Recognition Using Deep Convolution Neural Network." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/76315586416634324332.
Full text國立暨南國際大學
資訊工程學系
102
This thesis focuses on developing a fruit recognition method that can be used to improve convenience in daily life by shortening the supermarket checkout time. Existing methods for fruit recognition use handcrafted image features, such as the texture, the color, and the shape of a fruit. However, image features extracted with a set of specific algorithms do not necessarily provide enough information for pattern recognition. In this work, we use a deep convolution neural network (DCNN) to learn discriminative fruit features automatically. In order to achieve high recognition accuracy, we tested many different DCNN configurations: DCNNs of different depths and with different numbers of nodes in each network layer are trained and tested to determine the best configuration. To test the implemented DCNN fruit recognition method, we collect a fruit image database containing big Fuji apples, small Fuji apples, Washington apples, Granny Smith apples, papayas, Hami melons, muskmelons, guavas, bananas, Sunkist, grapefruit, wax apples, limes, peaches, and kiwis. The fruit images are divided into five parts; four of them are used for training and the other one for testing. We have also implemented two existing fruit recognition methods as baselines to compare against the recognition results of our DCNN approach. Experimental results show that the DCNN method outperforms the other two methods, and its recognition accuracy is about 92.91%.
WEI, TSUNG-HSIN, and 魏崇訓. "Video Super-resolution via Convolution Neural Network." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/823kfa.
Full text國立高雄應用科技大學
資訊工程系
104
Nowadays, people need super resolution to obtain clearer and more useful information. Image processing technology keeps improving, and more and more people present their research in this field. A super resolution algorithm enhances high-frequency information (texture or edges) to improve image quality. Super resolution has many applications, such as road surveillance systems: the view might be influenced by illumination, angle, distance, and other conditions, which makes it difficult to recognize a license plate number or a human face. Interpolation is a simple method for super resolution, but it does not recover high-frequency information; therefore, researchers proposed the example-based concept within learning-based approaches to solve this problem. In recent years, deep learning has achieved great results and has become faster than before, but it still needs some time to reconstruct an image. That time may be acceptable for a single image, but enhancing a whole video would take a long time. Our research addresses this problem with the Three Step Search algorithm. We present a faster super resolution method for video based on deep learning: we find the blocks that differ between consecutive frames, feed only these blocks into the neural network, and rebuild the high resolution image, which lowers the total computation time.
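As a simplified illustration of the block-selection step (the thesis relies on the Three Step Search algorithm; the fixed block size, the threshold, and the mean-absolute-difference test below are stand-ins chosen for this sketch), only blocks that changed noticeably since the previous frame would be sent through the super-resolution network, while unchanged blocks could reuse the previously upscaled result:

import numpy as np

def changed_blocks(prev_frame: np.ndarray, cur_frame: np.ndarray,
                   block: int = 16, threshold: float = 4.0):
    """Return (row, col) block offsets whose mean absolute difference exceeds threshold."""
    h, w = cur_frame.shape[:2]
    changed = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            diff = np.abs(cur_frame[r:r+block, c:c+block].astype(np.float32)
                          - prev_frame[r:r+block, c:c+block].astype(np.float32))
            if diff.mean() > threshold:       # this block changed: re-run SR on it
                changed.append((r, c))
    return changed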
Chang, Yao-Ren, and 張耀仁. "Convolution neural network on WIFI indoor localization." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/s7dhny.
Full text國立臺灣大學
電機工程學研究所
106
Mobile payment has been growing very quickly in recent years, and our life has become more and more convenient. Once we can locate a user's position precisely, we can broadcast advertisements to the user to increase sales performance. For example, when you walk into a restaurant, the system sends you a coupon for this restaurant immediately; when you walk into an apparel store, the system lists all the clothes you might like; and when you are leaving a parking lot, the system automatically debits your parking fee. In the past, WIFI localization systems were based on RFID localization and triangulation. Nowadays, with the growth of machine learning methods such as DBSCAN, deep learning, and KNN, we can localize the user's position more precisely. In this paper, we use an Alipay real-time payment dataset for our experiments. We rebuild the geographic information from the WIFI signal and train the model with convolution neural networks. Besides, we reduce the training/testing time overhead by feature engineering. Then we evaluate the result with three representative machine learning models: LightGBM (multi-class classifier), LightGBM (binary classifier), and Keras (deep neural network). Finally, we evaluate the pros and cons of each machine learning model and discuss the results.
HUANG, JIAN-JHIH, and 黃健智. "Apply Convolution Neural Network on Vehicle Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4g44kf.
Full text國立彰化師範大學
電機工程學系
107
With the rapid development of computer hardware and software technology, complex image processing and recognition can now be performed by computers. Traditional image recognition suffers from a lack of flexibility and poor accuracy; these disadvantages have been improved by neural networks. The earlier multi-layer perceptron is widely applied in various areas, but its hidden layers are simple and it converges slowly, which causes long training times. Most researchers apply R-CNN to real-time recognition systems; these methods have higher accuracy, but their layers are more complicated and need a lot of hardware resources. This thesis proposes a CNN with lighter layers implemented on the Raspberry Pi, a small single-board computer, and uses this method to recognize vehicles in real time. To speed up network training, this thesis uses data augmentation and normalization to enhance the training dataset before input. Then convolution layers extract features, pooling layers compress them, and backpropagation is used to update the network parameters. The proposed method converges quickly during training and is efficient. Due to its simple structure, it can easily be implemented on other single-board modules, which improves generalization. Based on the experiments, the best model has 11 layers and is used for inference after the network parameters are optimized. The accuracy on the training dataset is 96% after the training stage, and the test dataset used for validation reaches an accuracy of 94%. The proposed method has a simple structure and high accuracy for real-time recognition systems.
Hu, Yi-Chun, and 胡依淳. "Analysis and Comparison of Convolution Layer in Deep Convolution Neural Network." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/686x7u.
Full text國立暨南國際大學
電機工程學系
106
With the rapid development of information technology, big data has become mainstream and many recognition systems have been greatly affected. Deep learning, which requires a large database to learn its models, has therefore become mainstream as well. Deep learning can automatically learn to perform a task, and this architecture has become a very popular technology in academia. Nowadays, neural networks are popular in the field of visual imaging, and the best-performing model is the convolutional neural network; the progress of deep learning is closely tied to Convolutional Neural Networks (CNN). Convolutional neural networks, also known as CNNs or ConvNets, are the main development in the field of deep neural networks and can even be more accurate than humans in image recognition. If anything can live up to the expectations placed on deep learning, convolutional neural networks are definitely the first choice. A key part of the convolutional neural network is the weights in the kernels of the convolutional layer. There are usually three ways to change the convolutional layer in a convolutional neural network: the kernel size, the activation function, and the number of kernels used in the convolution. A neural network may use different activation functions or kernel sizes, and with a poor choice the accuracy is not high.
XIE, YOU-TING, and 謝侑廷. "CT Images segmentation using Deep Convolution Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/4xkkpc.
Full text國立臺灣科技大學
電子工程系
107
Because hospitals are unable to provide a complete 3D model and current technology cannot produce stereoscopic images without blind spots, doctors may misjudge when evaluating surgery or making a diagnosis. Therefore, doctors can only rely on their own experience to interpret computerized tomography (CT) images during diagnosis and preoperative evaluation. However, an image is a two-dimensional form of information and cannot provide doctors with accurate information in three-dimensional space. With professional training and extensive clinical experience, a doctor constructs the stereoscopic shape of the liver in their mind and then explains the patient's current liver condition orally. But patients have no professional training in interpreting computerized tomography (CT) images, so they cannot imagine the three-dimensional shape of the liver, which causes many communication problems between doctor and patient. This paper proposes a system that can automatically identify liver regions in computed tomography (CT) images and perform instant 3D modeling. An image segmentation convolutional neural network (SegNet) automatically segments and classifies the computerized tomography (CT) images. Because the image threshold of each organ in the body is similar, the liver region cannot be segmented directly by the image segmentation convolutional neural network (SegNet). Therefore, the results of the segmentation network for the liver are refined by an algorithm, and the resulting image is used to generate point cloud data. Finally, point cloud reconstruction is performed to generate a 3D mesh of the liver.
Vijayan, Raghavendran. "Forecasting retweet count during elections using graph convolution neural networks." Thesis, 2018. https://doi.org/10.7912/C2JM2C.
Full text
GUAN, HONG-CHENG, and 管宏成. "Detection of Scooters in Taiwan Based on Convolution Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/54627521654968631599.
Full text逢甲大學
資訊工程學系
105
Because modern people pay increasing attention to traffic safety, drivers in Taiwan install driving recorders which record daily traffic and accidents. Taiwanese commonly use scooters as commuter vehicles. Riders in Taiwan often weave through traffic jams, which makes it very dangerous for car drivers to detect them and frequently causes accidents. In this paper, we propose a system to remind car drivers of the danger posed by riders in traffic jams or on the street. The system draws rectangles on the driving recorder video to mark objects of interest, namely cars, scooters, buses, persons, and bicycles. We use R. Girshick [1] as a basis, modifying Fast R-CNN [2] with the RPN architecture and adjusting its learning algorithms. For the R-CNN [3] architecture, the original author uses VGG16 [4] and ZF Net [5]; we use these two types of neural networks and modify the number of RPN categories and related parameters. This paper optimizes the network architecture by adding special samples and tuning its parameters.
Cai, Pei Yun, and 蔡佩妘. "Design of A Flexible Accelerator for Deep Convolution Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/76s688.
Full text國立清華大學
資訊工程學系
105
Convolution Neural Networks (CNNs) are a deep learning method for visual recognition. Their state-of-the-art accuracy makes them widely used in artificial intelligence, computer vision, self-driving cars, etc. However, CNNs are highly computationally complex and demand high memory bandwidth. Although we exploit highly parallel computation to achieve effective throughput, a good orchestration of data movements must be taken into consideration to limit the increased memory bandwidth. To address these problems, we present a specialized dataflow with spatial hardware (extended from MIT Eyeriss, an energy-efficient reconfigurable CNN accelerator) to reduce memory access without sacrificing performance. Existing works typically improve either the computation or the memory access aspect. However, computational parallelism and memory bandwidth interact with each other, so we should take both into consideration at the same time. Convolution operations in CNNs exhibit various types of data reuse and show high parallelism. We apply a highly parallel PE array to improve throughput. To minimize data access, we propose a dataflow leveraging data reuse opportunities and the local buffer inside each PE. Data can then be reused temporally without iterative accesses between high-level memory and the PEs. In addition, a large amount of intermediate data can be accumulated immediately, which could put additional pressure on storage. For that reason, we propose a tiling methodology with a tradeoff between performance and local buffer size. The larger the local buffer, the more data can be reused and the more intermediate data can be consumed, which alleviates the data streaming bottleneck and enables efficient computation. Furthermore, our dataflow and hardware can adapt to different layers with varying shapes, so we can maximize the throughput in each layer of a CNN. For layer 2 of AlexNet, we can have a speedup of 4.64 times with an additional buffer of 2.63 KB over the initial buffer size, which is one row of input data. Compared to Eyeriss, we have a speedup of 1.07 times by using an additional buffer of 18.38 KB. For layer 3 of AlexNet, we can have a speedup of 14.55 times with an additional buffer of 12.8 KB. Compared to Eyeriss, we have a speedup of 1.03 times by using an additional buffer of 28.55 KB. As a result, the frame rate reaches 41.7 fps (55.5 GOPS) at a 250 MHz frequency for the convolution layers of AlexNet. Compared to Eyeriss, under the same 200 MHz frequency and using AlexNet, we achieve better performance in layers 2-4, ranging from 1.03 to 1.07 times, by using an additional on-chip buffer of 16 KB.
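The tiling idea described above can be pictured with a plain loop nest. The sketch below (tile sizes, tensor shapes, and stride are illustrative placeholders, not the dimensions analyzed in the thesis) computes one output tile at a time so that the corresponding input window, the data that would sit in a PE-local buffer, is reused for every output element of that tile:

import numpy as np

def tiled_conv2d(inp, weight, tile_h=8, tile_w=8):
    """inp: (C_in, H, W), weight: (C_out, C_in, K, K), stride 1, no padding."""
    c_out, c_in, k, _ = weight.shape
    out_h, out_w = inp.shape[1] - k + 1, inp.shape[2] - k + 1
    out = np.zeros((c_out, out_h, out_w), dtype=inp.dtype)
    for th in range(0, out_h, tile_h):                  # iterate over output tiles
        for tw in range(0, out_w, tile_w):
            h_end, w_end = min(th + tile_h, out_h), min(tw + tile_w, out_w)
            # the input window below is what would live in the PE-local buffer
            window = inp[:, th:h_end + k - 1, tw:w_end + k - 1]
            for oc in range(c_out):
                for oh in range(th, h_end):
                    for ow in range(tw, w_end):
                        patch = window[:, oh - th:oh - th + k, ow - tw:ow - tw + k]
                        out[oc, oh, ow] = np.sum(patch * weight[oc])
    return out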
Ho, Pin-Hui, and 何品蕙. "Design of an Inference Accelerator for Compressed Convolution Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/bxwfdn.
Full text國立清華大學
資訊工程學系所
106
State-of-the-art convolution neural networks (CNNs) are widely used in intelligent applications such as AI systems, natural language processing, and image recognition. The huge and growing computation latency of high-dimensional convolution has become a critical issue. Most of the multiplications in the convolution layers are ineffectual, as they involve operands of which one or both are zero. By pruning the redundant connections in the CNN models and clamping the features to zero with the rectified linear unit (ReLU), the input data of the convolution layers achieve a higher sparsity. This high data sparsity leaves a big room for improvement through compressing the data and skipping the ineffectual multiplications. In this thesis, we propose a design of a CNN accelerator which can accelerate the convolution layers by performing sparse matrix multiplications and reducing the amount of off-chip data transfer. The state-of-the-art Eyeriss architecture is an energy-efficient reconfigurable CNN accelerator which has a specialized dataflow with a multi-level memory hierarchy and limited hardware resources. Improving on the Eyeriss architecture, our approach can perform sparse matrix multiplications effectively. With the pruned and compressed kernel data, and dynamic encoding of the input features into the compressed sparse row (CSR) format, our accelerator can reduce a significant amount of off-chip data transfer, minimizing memory accesses and skipping the ineffectual multiplications. One of the disadvantages of the compression scheme is that after compression the input workload becomes dynamically imbalanced, which lowers data reusability and increases off-chip data transfer. To analyze the data transfer further, we explore the relationship between the on-chip buffer size and off-chip data transfer. Our design needs to re-access the off-chip data when the on-chip buffer cannot store all the output features. As a result, by reducing a significant amount of computation and memory accesses, our accelerator can still achieve better performance. With AlexNet, a popular CNN model, with 35% input data sparsity and 89.5% kernel data sparsity as the benchmark, our accelerator can achieve a 1.12X speedup as compared with Eyeriss, and a 3.42X speedup as compared with our baseline architecture.
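The CSR encoding mentioned above can be illustrated in a few lines. This is a software sketch of the format only; the accelerator's on-chip representation and scheduling are more involved. Only nonzero activations are stored, so every multiply that is actually issued involves a nonzero operand.

import numpy as np

def to_csr(dense: np.ndarray):
    """Encode a 2D feature map row by row into (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x where A is stored in CSR; zero entries are never multiplied."""
    y = np.zeros(len(row_ptr) - 1)
    for r in range(len(y)):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            y[r] += values[k] * x[col_idx[k]]
    return y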
WANG, WEI-CHUN, and 王薇淳. "Scene Recognition of Remote Sensing Images Using Convolution Neural Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/nd35c3.
Full text國防大學理工學院
空間科學碩士班
106
Geospatial information applications often use spatial information and observation systems for spatial data processing and analysis to meet the needs of different fields. However, faced with big-data satellite imagery and the various types of geospatial data on the Internet, it is hard for users to analyze such huge volumes of images and data. In order to carry out image interpretation and earth observation applications effectively, it is no longer possible to work in the traditional manner. In this study, we first integrate remote sensing images and POI geospatial databases to automatically generate the large set of remote sensing images necessary for developing artificial intelligence techniques. Image classification has always been a fundamental research topic in computer vision and image recognition. In recent years, image classification using Convolutional Neural Networks (CNN) in deep learning has become increasingly popular, and the automation provided by neural networks has improved recognition capabilities, opening a new horizon for image recognition. Therefore, this study employed the transfer learning technique with a fine-tuning strategy on the popular InceptionV3 network. The trained model predicted a total of 1,296 image tiles from the three test areas in order to assess the binary classification between airport and non-airport images. With the fine-tuning strategy added, the average accuracy was improved by 77.37%, the F1 score improved from 0.17 to 0.54, and the F2 score from 0.28 to 0.74.
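For reference, the F1 and F2 scores quoted above are instances of the general F-beta measure, which weights recall more heavily as beta grows (beta = 2 for the F2 score). With P denoting precision and R recall:

F_\beta = (1 + \beta^2)\,\frac{P \cdot R}{\beta^2 P + R},
\qquad
F_1 = \frac{2PR}{P + R},
\qquad
F_2 = \frac{5PR}{4P + R}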
LIU, HONG-CEN, and 劉泓岑. "Combining 1D & 2D Convolution Neural Networks For Fall Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/hk76sd.
Full text大同大學
資訊工程學系(所)
107
Fall detection mechanisms can be categorized by where the sensor is placed: either in the environment or attached to the human body. Sensors of the first kind usually have to be installed in a specific place and are expensive. In contrast, the sensor of the second kind is affordable and places no restrictions on location. The second approach usually uses a smartphone as the sensor, because the smartphone can perform the detection, the decision, and the notification by itself. Early research on smartphone-based detection required the smartphone to be placed at a specific position on the body. In reality, however, users have different habits of smartphone placement, so it is important to allow the smartphone to be carried in different positions rather than a single one. In view of this, this paper proposes a fall detection mechanism that uses an accelerometer and two neural networks: the first neural network excludes non-fall events, and the second neural network confirms the fall event. In addition, the mechanism does not require users to place the smartphone on a single body part; users can carry the smartphone in a garment pocket, trouser pocket, or jacket pocket. According to the experimental results in this paper, the specificity and accuracy of this mechanism are better than those of three other methods. The sensitivity of this method is slightly lower than one of the three methods, but better than the other two.
Fu, Chien-Chun, and 傅建鈞. "A System for Disguised Face Recognition with Convolution Neural Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/f5gfn3.
Full text淡江大學
電機工程學系碩士在職專班
106
In this paper, we propose a disguised face recognition system based on the Deep Normalization and Convolution Neural Network (DNCNN); the system includes two trained DNCNN identification networks. The first trained identification network identifies the type of disguise in the input face image and classifies disguised face images into three categories: no disguise, upper-half-face disguise, and lower-half-face disguise. After the classification is completed, the system removes the upper-half or lower-half disguised part of the face image, keeps the non-disguised half of the face, and feeds it into the second recognition network. The second recognition network recognizes the identity of the input half-face image. To reduce the over-fitting caused by imbalanced and insufficient training samples, we pre-process the original image samples before training and testing the two DNCNN recognition networks. The pre-processing uses the Viola-Jones face detection algorithm: it first finds the face region in the original images, then rotates and crops the face region or half-face images for the training and testing of the recognition networks. After the pre-processing is completed, we train and test the DNCNN recognition networks. The experimental results show that the system achieves recognition rates similar to the reference.
(5931047), Akash Gaikwad. "Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment." Thesis, 2019.
Find full text
In recent years, deep learning models have become popular in real-time embedded applications, but there are many complexities for hardware deployment because of limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolution Neural Network (CNN) by various compression techniques like architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the promising techniques to solve these problems.
This thesis proposes methods to prune the convolution neural network (SqueezeNet) without introducing network sparsity in the pruned model.
This thesis proposes three methods to prune the CNN in order to decrease its model size without a significant drop in the accuracy of the model.
1: Pruning based on Taylor expansion of change in cost function Delta C.
2: Pruning based on L2 normalization of activation maps.
3: Pruning based on a combination of method 1 and method 2.
The proposed methods use various ranking methods to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. The transfer learning technique is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the SqueezeNet model by 72% without a significant drop in the accuracy of the model (optimal pruning efficiency result). Results also show that pruning based on a combination of Taylor expansion of the cost function and L2 normalization of activation maps achieves better pruning efficiency than the individual pruning criteria, and that most of the pruned kernels are from mid- and high-level layers. The pruned model is deployed on BlueBox 2.0 using RTMaps software and its performance is evaluated.
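As an illustration of criterion 2 above (the tensor shapes, the calibration-batch averaging, and the 50% ratio are assumptions made for this sketch, not the thesis' settings), filters whose activation maps have the smallest L2 norm over a batch are ranked lowest and become the pruning candidates:

import numpy as np

def rank_filters_by_l2(activations: np.ndarray) -> np.ndarray:
    """activations: (batch, filters, H, W) feature maps from one convolution layer.
    Returns filter indices sorted from least to most important."""
    # L2 norm of each filter's map, averaged over the batch
    l2 = np.sqrt((activations ** 2).sum(axis=(2, 3))).mean(axis=0)
    return np.argsort(l2)                      # smallest norm first = prune first

def select_prune_candidates(activations: np.ndarray, prune_ratio: float = 0.5):
    order = rank_filters_by_l2(activations)
    n_prune = int(len(order) * prune_ratio)
    return order[:n_prune]                     # indices of filters to remove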
Gaikwad, Akash S. "Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment." Thesis, 2018. http://hdl.handle.net/1805/17923.
Full text