Dissertations / Theses: 'CNN'

1

Garbay, Thomas. "Zip-CNN." Electronic Thesis or Diss., Sorbonne université, 2023. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2023SORUS210.pdf.

Abstract:

Digital systems for the Internet of Things (IoT) and embedded systems have seen increasing use in recent decades. Embedded systems based on microcontroller units (MCUs) address a wide range of problems by collecting large amounts of data. Today, about 250 billion MCUs are in use, and projections for the coming years point to very strong growth. Artificial intelligence saw a resurgence of interest around 2012. Convolutional Neural Networks (CNNs) have helped solve many problems in computer vision and natural language processing. Running these algorithms within embedded systems would greatly improve the exploitation of the collected data. However, the inference cost of a CNN makes its implementation within embedded systems challenging. This thesis focuses on exploring the solution space in order to guide the integration of CNNs into microcontroller-based embedded systems. For this purpose, the ZIP-CNN methodology is defined. It takes into account both the embedded system and the CNN to be implemented, and it provides an embedded designer with information about the impact of CNN inference on the system. A designer can explore the impact of design choices, with the objective of meeting the constraints of the targeted application. A model quantitatively estimates the latency, energy consumption and memory space required for one CNN inference on an embedded target, whatever the topology of the CNN. This model takes into account algorithmic reductions such as knowledge distillation, pruning or quantization. Implementing state-of-the-art CNNs on MCUs experimentally validated the accuracy of the estimations. The models developed in this thesis democratize the implementation of CNNs on MCUs by guiding designers of embedded systems. Moreover, the results open a path toward applying the developed models to other target hardware, such as multi-core architectures or FPGAs. The estimation results can also be used in Neural Architecture Search (NAS).

2

Carpani, Valerio. "CNN-based video analytics." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Abstract:

This thesis describes six months of work done during my internship at TKH Security Solutions - Siqura B.V. in Gouda, Netherlands. Its aim is to investigate possible uses of convolutional neural networks from two points of view: first, we propose a novel algorithm for person re-identification; second, we propose a deployment chain for bringing research concepts to product-ready solutions. Existing work assumes that the person re-identification task is independent of the person detection task. In this thesis we instead treat the two tasks as linked: features produced by an object detection convolutional neural network (CNN) contain useful information that current re-identification methods do not use. We propose several solutions for learning a metric on CNN features to distinguish between different identities. The best of these solutions is then compared with state-of-the-art alternatives on the popular Market-1501 dataset. Results show that our method outperforms them in computational efficiency, with only a reasonable loss in accuracy. For this reason, we believe that the proposed method can be more appropriate than current state-of-the-art methods where computational efficiency is critical, such as in embedded applications. The deployment chain we propose has two main goals: it must be flexible enough to accommodate new advances in network architecture, and it must be able to deploy neural networks on both server and embedded platforms. We tested several frameworks on several platforms and ended up with a deployment chain that relies on the open-source format ONNX.

3

Lara, Teodoro. "Controllability and applications of CNN." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/28921.

4

Samal, Kruttidipta. "FPGA acceleration of CNN training." Thesis, Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54467.

Abstract:

This thesis presents the results of an architectural study on the design of FPGA-based architectures for convolutional neural networks (CNNs). We analyzed the memory access patterns of a CNN (one of the biggest networks in the family of deep learning algorithms) by creating a trace of a well-known CNN architecture and by developing a trace-driven DRAM simulator. The simulator uses the traces to analyze the effect that different storage patterns, and the speed mismatch between memory and processing elements, can have on the CNN system. This insight is then used to create an initial design for a layer architecture for the CNN on an FPGA platform. The FPGA is designed to have multiple parallel-executing units. The number of these parallel units (and hence the parallelism) depends on the memory layout of inputs and outputs, in particular on whether parallel read and write accesses can be scheduled. We therefore design a data layout for the FPGA's on-chip memory that increases parallelism and minimizes access contention while the parallel units operate. The result is an SoC (System on Chip) that acts as an accelerator and can have more parallel units than previous work. The improvement was also observed by comparing post-synthesis loop latency tables between our design and a single-unit design. This initial design can help in designing FPGAs targeted at deep learning algorithms that can compete with GPUs in terms of performance.

5

Mohamed, Moussa Elmokhtar. "Conversion d’écriture hors-ligne en écriture en-ligne et réseaux de neurones profonds." Electronic Thesis or Diss., Nantes Université, 2024. http://www.theses.fr/2024NANU4001.

Abstract:

This thesis focuses on the conversion of static images of offline handwriting into temporal signals of online handwriting. The goal is to extend neural network approaches beyond images of isolated letters and to generalize them to other, more complex types of content. The thesis explores two distinct neural approaches. The first is a fully convolutional multitask UNet-based network, inspired by the method of [ZYT18]. This approach gave good skeletonization results but suboptimal stroke extraction, partly because of the inherent temporal modeling limitations of the CNN architecture. The second approach builds on the previous skeletonization model to extract sub-strokes and proposes sub-stroke-level modeling with two Transformers: a sub-stroke embedding transformer (SET) and a sub-stroke ordering transformer (SORT), which orders the sub-strokes using their descriptor vectors and predicts pen-up events. This approach outperformed the state of the art on databases of words, text lines and mathematical equations and addressed several limitations identified in the literature. These advances have expanded the scope of offline-to-online conversion to include entire text lines and to generalize to bidimensional content such as mathematical equations.

6

Rossetto, Andrea. "CNN per view synthesis da mappe depth." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16570/.

Abstract:

A brief introduction to neural networks and deep learning, with a description of the systems used for the models and the tests performed. An explanation of how the systems created work and a presentation of the results obtained.

7

Castelli, Filippo Maria. "3D CNN methods in biomedical image segmentation." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18796/.

Abstract:

A definite trend in Biomedical Imaging is the integration of increasingly complex interpretative layers into the pure data acquisition process. One of the most interesting and anticipated goals in the field is the automatic segmentation of objects of interest in extensive acquisition data, a target that would allow Biomedical Imaging to move beyond its use as a purely assistive tool and become a cornerstone in ambitious large-scale challenges such as the extensive quantitative study of the Human Brain. In 2019, Convolutional Neural Networks represent the state of the art in Biomedical Image segmentation, and scientific interest from a variety of fields, ranging from automotive to natural resource exploration, converges on their development. While most applications of CNNs focus on single-image segmentation, biomedical image data (whether MRI, CT scans, microscopy, etc.) often benefits from three-dimensional volumetric expression. This work explores a reformulation of the CNN segmentation problem that is native to the 3D nature of the data, with particular interest in applications to Fluorescence Microscopy volumetric data produced at the European Laboratories for Nonlinear Spectroscopy in the context of two large international human brain study projects: the Human Brain Project and the White House BRAIN Initiative.

8

Ringenson, Josefin. "Efficiency of CNN on Heterogeneous Processing Devices." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155034.

Abstract:

In the development of advanced driver assistance systems, computer vision problems need to be optimized to run efficiently on embedded platforms. Convolutional neural network (CNN) accelerators have proven to be very efficient for embedded camera platforms, such as the ones used for automotive vision systems. Therefore, the focus of this thesis is to evaluate the efficiency of a CNN on a future embedded heterogeneous processing device. The memory size in an embedded system is often very limited, and it is necessary to divide the input into multiple tiles. In addition, there are power and speed constraints that need to be met to be able to use a computer vision system in a car. To increase efficiency and optimize memory usage, different methods for CNN layer fusion are proposed and evaluated for a variety of tile sizes. Several different layer fusion methods and input tile sizes are chosen as optimal solutions, depending on the depth of the layers in the CNN. The solutions investigated in the thesis are most efficient for deep CNN layers, where the number of channels is high.

9

Kristin, Hallberg. "Islam, BBC och CNN : Palestinska inbördeskriget 2006-2007." Thesis, Uppsala universitet, Teologiska institutionen, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-295888.

Abstract:

The topic of this paper is how CNN and BBC, two of the largest media companies in the world, presented Islam in the Palestinian civil war during the years 2006-2007. Articles that CNN and BBC published on the Palestinian civil war have been analyzed in order to answer this question. The purpose is to see whether Islam is portrayed in an Islamophobic way by CNN and BBC and whether discursive traces of the Clash of Civilizations theory can be found in the analyzed articles. The findings indicate that there are elements of Islamophobia and discursive traces of the Clash of Civilizations in how Islam was presented during the Palestinian civil war. A further conclusion is that CNN and BBC presented Islam in different ways during the civil war.

10

Eklund, Anton. "Cascade Mask R-CNN and Keypoint Detection used in Floorplan Parsing." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-415371.

Abstract:

Parsing floorplans has long been a problem in automatic document analysis and, until recent years, was approached with algorithmic methods. With the rise of convolutional neural networks (CNNs), this problem too has seen an upswing in performance. In this thesis the task is to recover, as accurately as possible, spatial and geometric information from floorplans. The project builds on instance segmentation models such as Cascade Mask R-CNN to extract the bulk of the information from a floorplan image. To complement the segmentation, a new way of using a keypoint CNN is presented to find the precise locations of corners. These are then combined in a post-processing step to give the resulting segmentation. The resulting segmentation scores exceed the current baseline on the CubiCasa5k floorplan dataset, with a mean IoU of 72.7% compared to 57.5%. Further, the IoU for individual classes is also improved for almost every class. It is also shown that Cascade Mask R-CNN is better suited than Mask R-CNN for this task.
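
The mean IoU figure quoted above is an average of per-class intersection-over-union scores. As a minimal sketch of how such a score can be computed on integer label maps (the function name, the toy arrays and the choice to skip classes absent from both maps are assumptions for illustration, not details taken from the thesis or from the CubiCasa5k benchmark):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class IoU between two integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps: skipped (an assumption)
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 example with two classes (hypothetical data).
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)  # class 0: 1/2, class 1: 2/3
```

Benchmarks differ in how they treat classes that are missing from an image, so a reported number is only comparable when the same convention is used.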

11

Gu, Dongfeng. "3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36739.

Abstract:

In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most of the neural networks used in action recognition are not very deep compared with CNNs in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network expands Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected with each other in a feed-forward fashion. In each layer, the feature maps of all preceding layers are concatenated along the last dimension and are used as inputs to all subsequent layers. We propose two different versions of 3D-DenseNet: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of training parameters at the beginning, so that we can reach a deeper network. We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experimental results demonstrate that our method performs better than the state-of-the-art action recognition method on the MERL shopping dataset and achieves a competitive result on the KTH dataset.
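
The dense connectivity described above, where each layer receives the concatenation of all preceding feature maps along the last dimension, can be sketched in a few lines of NumPy. This is an illustration under stated assumptions: a per-voxel matrix product with ReLU stands in for a real 3D convolution, and `dense_block`, `growth_rate` and the tensor sizes are hypothetical names and values, not the thesis's code.

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Grow a (T, H, W, C) feature tensor by concatenating each layer's output."""
    features = x
    for _ in range(num_layers):
        in_channels = features.shape[-1]
        # Stand-in for a 3D convolution: a per-voxel linear map followed by ReLU.
        weights = rng.standard_normal((in_channels, growth_rate))
        new_maps = np.maximum(features @ weights, 0.0)
        # Dense connectivity: the next layer sees every earlier feature map.
        features = np.concatenate([features, new_maps], axis=-1)
    return features

rng = np.random.default_rng(0)
clip = rng.standard_normal((4, 8, 8, 16))  # frames, height, width, channels
out = dense_block(clip, num_layers=3, growth_rate=12, rng=rng)
```

With 16 input channels, three layers and a growth rate of 12, the channel count grows to 16 + 3 * 12 = 52, which shows why the input width of later layers increases linearly with depth.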

12

Chen, Tairui. "Going Deeper with Convolutional Neural Network for Intelligent Transportation." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-theses/144.

Abstract:

Over the last several decades, computer vision researchers have been devoted to finding good features for different tasks: object recognition, object detection, object segmentation, activity recognition and so forth. Ideal features transform raw pixel intensity values into a representation in which these computer vision problems are easier to solve. Recently, deep features from convolutional neural networks (CNNs) have attracted many researchers working on computer vision problems. In the supervised setting, these hierarchies are trained to solve specific problems by minimizing an objective function for different tasks. More recently, features learned from large-scale image datasets have proved to be very effective and generic for many computer vision tasks. Features learned from a recognition task can be used in an object detection task. This work aims to uncover the principles that lead to these generic feature representations in transfer learning, which does not need to train on the dataset again but transfers the rich features from a CNN trained on the ImageNet dataset. We begin by summarizing related prior work, particularly papers on object recognition, object detection and segmentation. We then introduce deep features to computer vision tasks in intelligent transportation systems. First, we apply deep features to object detection, especially vehicle detection. Second, to make full use of objectness proposals, we apply a proposal generator to road marking detection and recognition. Third, to fully understand the transportation situation, we introduce deep features into road scene understanding. We experiment with each task on different public datasets and show that our framework is robust.

13

Mukhtar, Hind. "Machine Learning Enabled-Localization in 5G and LTE Using Image Classification and Deep Learning." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42449.

Abstract:

Demand for localization has been growing due to the increase in location-based services and high-bandwidth applications requiring precise localization of users to improve resource management and beamforming. Outdoor localization has traditionally been done through the Global Positioning System (GPS); however, its performance degrades in urban settings due to obstruction and multipath effects, creating the need for better localization techniques. This thesis proposes a cascaded approach composed of image classification and deep learning, using LIDAR or satellite images and Channel State Information (CSI) data from base stations to predict the location of moving vehicles and users outdoors. The algorithm's performance is assessed using three different datasets. The first two use simulated data in the millimeter wave (mmWave) band and LIDAR images collected from the neighbourhood of Rosslyn in Arlington, Virginia. The results show an improvement in localization accuracy as a result of the hierarchical architecture, with a Mean Absolute Error (MAE) of 6.55 m for the proposed technique in comparison to an MAE of 9.82 m using one Convolutional Neural Network (CNN). The third dataset uses measurements from an LTE mobile communication system, along with satellite images, taken at the University of Denmark. The results achieve an MAE of 9.45 m for the hierarchical approach in comparison to an MAE of 15.74 m for one Feed-Forward Neural Network (FFNN).

14

Hossain, Md Tahmid. "Towards robust convolutional neural networks in challenging environments." Thesis, Federation University Australia, 2021. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/181882.

Abstract:

Image classification is one of the fundamental tasks in the field of computer vision. Although Artificial Neural Network (ANN) showed a lot of promise in this field, the lack of efficient computer hardware subdued its potential to a great extent. In the early 2000s, advances in hardware coupled with better network design saw the dramatic rise of Convolutional Neural Network (CNN). Deep CNNs pushed the State-of-The-Art (SOTA) in a number of vision tasks, including image classification, object detection, and segmentation. Presently, CNNs dominate these tasks. Although CNNs exhibit impressive classification performance on clean images, they are vulnerable to distortions, such as noise and blur. Fine-tuning a pre-trained CNN on mutually exclusive or a union set of distortions is a brute-force solution. This iterative fine-tuning process with all known types of distortion is, however, exhaustive and the network struggles to handle unseen distortions. CNNs are also vulnerable to image translation or shift, partly due to common Down-Sampling (DS) layers, e.g., max-pooling and strided convolution. These operations violate the Nyquist sampling rate and cause aliasing. The textbook solution is low-pass filtering (blurring) before down-sampling, which can benefit deep networks as well. Even so, non-linearity units, such as ReLU, often re-introduce the problem, suggesting that blurring alone may not suffice. Another important but under-explored issue for CNNs is unknown or Open Set Recognition (OSR). CNNs are commonly designed for closed set arrangements, where test instances only belong to some ‘Known Known’ (KK) classes used in training. As such, they predict a class label for a test sample based on the distribution of the KK classes. However, when used under the OSR setup (where an input may belong to an ‘Unknown Unknown’ or UU class), such a network will always classify a test instance as one of the KK classes even if it is from a UU class. 
Historically, CNNs have struggled with detecting objects in images with large difference in scale, especially small objects. This is because the DS layers inside a CNN often progressively wipe out the signal from small objects. As a result, the final layers are left with no signature from these objects leading to degraded performance. In this work, we propose solutions to the above four problems. First, we improve CNN robustness against distortion by proposing DCT based augmentation, adaptive regularisation, and noise suppressing Activation Functions (AF). Second, to ensure further performance gain and robustness to image transformations, we introduce anti-aliasing properties inside the AF and propose a novel DS method called blurpool. Third, to address the OSR problem, we propose a novel training paradigm that ensures detection of UU classes and accurate classification of the KK classes. Finally, we introduce a novel CNN that enables a deep detector to identify small objects with high precision and recall. We evaluate our methods on a number of benchmark datasets and demonstrate that they outperform contemporary methods in the respective problem set-ups.
Doctor of Philosophy
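
The abstract above notes that strided down-sampling causes aliasing and that the textbook remedy is low-pass filtering before subsampling. A minimal 1D sketch of that textbook remedy follows; the `[1, 2, 1] / 4` kernel, the edge padding and the function names are assumptions chosen for illustration and are not the blurpool layer proposed in the thesis.

```python
import numpy as np

def naive_downsample(x, stride=2):
    """Keep every `stride`-th sample with no filtering."""
    return x[::stride]

def blur_downsample(x, stride=2):
    """Low-pass filter with a [1, 2, 1] / 4 kernel, then subsample."""
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0
    padded = np.pad(x, 1, mode="edge")  # keeps the output length equal to len(x)
    blurred = np.convolve(padded, kernel, mode="valid")
    return blurred[::stride]

# A signal alternating at the highest representable frequency, and a one-sample shift of it.
signal = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
shifted = np.roll(signal, 1)

naive_gap = np.abs(naive_downsample(signal) - naive_downsample(shifted)).max()
blur_gap = np.abs(blur_downsample(signal)[1:] - blur_downsample(shifted)[1:]).max()
```

On this input the unfiltered outputs flip between all zeros and all ones after a one-sample shift, while the filtered outputs agree away from the padded border, which is the shift sensitivity the abstract refers to.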

15

Bark, Filip. "Embedded Implementation of Lane Keeping Functionality Using CNN." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230193.

Abstract:

Interest in autonomous vehicles has recently increased, and as a consequence many companies and researchers have begun working on their own solutions to the many issues that arise when a car has to make complicated decisions on its own. This project looks into the possibility of relegating as many decisions as possible to a single sensor and engine control unit (ECU); in this work, a Raspberry Pi with a camera attached controls a vehicle following a road. To solve this problem, image processing, or more specifically convolutional neural networks (CNNs) from machine learning, is used to steer a car by monitoring the path with a single camera. The proposed CNN is designed and implemented using Keras, a machine learning library for Python. The design of the network is based on the well-known LeNet, but it has been downscaled to increase computation speed and reduce memory size while still maintaining sufficient accuracy. The network was run on the ECU, which was fastened to an RC car together with the camera. For control purposes, wires were soldered to the remote controller and connected to the Raspberry Pi. For steering, a simple bang-bang controller was implemented. Glass box testing was used to assess the effectiveness of the code and to ensure continuous evaluation of the results. Larger experiments were performed to check that the network met its requirements for both accuracy and computation speed. The final experiments showed that the network achieved sufficient accuracy and performance to steer the prototype car in real-time tasks, such as following model roads and stopping at the end of the path, as planned. This shows that, despite being small with moderate accuracy, this CNN can handle lane keeping using only the data from a single camera. Since the CNN could do this while running on a small computer such as the Raspberry Pi, using a CNN for lane keeping in an embedded system looks promising.
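
The abstract mentions a simple bang-bang controller for steering without describing its inputs. A hedged sketch of the general idea is given below: the controller switches between fixed commands rather than producing a proportional output. The `lateral_offset` input, the `deadband` value and the sign convention are hypothetical, and the dead band makes this strictly a three-position variant of a bang-bang controller.

```python
def bang_bang_steering(lateral_offset, deadband=0.05):
    """Return -1 (full left), 0 (straight) or +1 (full right).

    lateral_offset > 0 is assumed to mean the car sits right of the lane center.
    """
    if lateral_offset > deadband:
        return -1  # drifted right: steer fully left
    if lateral_offset < -deadband:
        return 1   # drifted left: steer fully right
    return 0       # inside the dead band: keep the wheels straight

# Hypothetical sequence of offsets, e.g. derived from a network's output per frame.
commands = [bang_bang_steering(o) for o in (0.30, 0.02, -0.20)]
```

A controller of this kind suits a remote control with on/off steering inputs, at the cost of some oscillation around the lane center.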

16

Fernandez, Brillet Lucas. "Réseaux de neurones CNN pour la vision embarquée." Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM043.

Abstract:

To achieve high detection rates, CNNs require a large number of parameters to be stored and, depending on the application, a large number of operations as well. This seriously complicates the deployment of such solutions in embedded systems. This manuscript proposes several solutions to this problem, aiming at a co-adaptation between the algorithm, the application and the hardware. The main levers for setting the computational complexity of a CNN-based object detector are identified and studied. When a CNN is used to detect objects in a scene, it must be applied across all possible positions and scales. This becomes very costly when small objects must be found in high-resolution images. To make the solution efficient and adjustable, the process is divided into two stages. A first CNN specializes in finding regions of interest efficiently, which yields flexible trade-offs between detection rate and number of operations. The second stage consists of a CNN that classifies the set of proposals, which reduces the complexity of the task and therefore the computational complexity. Moreover, CNNs exhibit several properties that confirm their over-dimensioning. This over-dimensioning is one of the reasons for the success of CNNs, since it eases the optimization process by allowing a large number of equivalent solutions. However, it complicates their implementation in systems with strong computational constraints. To this end, a CNN compression method based on Principal Component Analysis (PCA) is proposed. PCA makes it possible to find, for each layer of the network, a new representation of the set of filters learned by the network by expressing them in a more suitable PCA basis. This PCA basis is hierarchical, meaning that the basis terms are ordered by importance; by removing the least important terms, optimal trade-offs between approximation error and number of parameters can be found. With this method it is possible to obtain, for example, a 2x reduction in the number of parameters and operations of a ResNet-32 network, with an accuracy loss below 2%. It is also shown that the method is compatible with other known state-of-the-art methods, notably pruning, Winograd and quantization. Combining them all, the size of a ResNet-110 can be reduced from 6.88 Mbytes to 370 kBytes (a 19x memory gain) with an accuracy degradation of 3.9%. All these techniques are then put into practice in a face detection application. The resulting solution has a model size of 29.3 kBytes, a 65x reduction compared with the state of the art at an equal detection rate. The solution is also compared with a classical method such as Viola-Jones, which confirms roughly an order of magnitude fewer computations, together with the ability to reach higher detection rates without high computational overhead. Both networks are then evaluated on an embedded multiprocessor, which verifies that the theoretical compression rates obtained remain consistent with the measured figures. In the case of face detection, parallelizing the PCA-compressed network over 8 processors increases computation speed by a factor of 11.68 compared with the original network on a single processor.
Recently, Convolutional Neural Networks have become the state-of-the-art soluion(SOA) to most computer vision problems. In order to achieve high accuracy rates, CNNs require a high parameter count, as well as a high number of operations. This greatly complicates the deployment of such solutions in embedded systems, which strive to reduce memory size. Indeed, while most embedded systems are typically in the range of a few KBytes of memory, CNN models from the SOA usually account for multiple MBytes, or even GBytes in model size. Throughout this thesis, multiple novel ideas allowing to ease this issue are proposed. This requires to jointly design the solution across three main axes: Application, Algorithm and Hardware.In this manuscript, the main levers allowing to tailor computational complexity of a generic CNN-based object detector are identified and studied. Since object detection requires scanning every possible location and scale across an image through a fixed-input CNN classifier, the number of operations quickly grows for high-resolution images. In order to perform object detection in an efficient way, the detection process is divided into two stages. The first stage involves a region proposal network which allows to trade-off recall for the number of operations required to perform the search, as well as the number of regions passed on to the next stage. Techniques such as bounding box regression also greatly help reduce the dimension of the search space. This in turn simplifies the second stage, since it allows to reduce the task’s complexity to the set of possible proposals. Therefore, parameter counts can greatly be reduced.Furthermore, CNNs also exhibit properties that confirm their over-dimensionment. This over-dimensionement is one of the key success factors of CNNs in practice, since it eases the optimization process by allowing a large set of equivalent solutions. 
However, this also greatly increases computational complexity, and therefore complicates deploying the inference stage of these algorithms on embedded systems. In order to ease this problem, we propose a CNN compression method which is based on Principal Component Analysis (PCA). PCA allows to find, for each layer of the network independently, a new representation of the set of learned filters by expressing them in a more appropriate PCA basis. This PCA basis is hierarchical, meaning that basis terms are ordered by importance, and by removing the least important basis terms, it is possible to optimally trade-off approximation error for parameter count. Through this method, it is possible to compress, for example, a ResNet-32 network by a factor of ×2 both in the number of parameters and operations with a loss of accuracy <2%. It is also shown that the proposed method is compatible with other SOA methods which exploit other CNN properties in order to reduce computational complexity, mainly pruning, winograd and quantization. Through this method, we have been able to reduce the size of a ResNet-110 from 6.88Mbytes to 370kbytes, i.e. a x19 memory gain with a 3.9 % accuracy loss.All this knowledge, is applied in order to achieve an efficient CNN-based solution for a consumer face detection scenario. The proposed solution consists of just 29.3kBytes model size. This is x65 smaller than other SOA CNN face detectors, while providing equal detection performance and lower number of operations. Our face detector is also compared to a more traditional Viola-Jones face detector, exhibiting approximately an order of magnitude faster computation, as well as the ability to scale to higher detection rates by slightly increasing computational complexity.Both networks are finally implemented in a custom embedded multiprocessor, verifying that theorical and measured gains from PCA are consistent. 
Furthermore, parallelizing the PCA compressed network over 8 PEs achieves a x11.68 speed-up with respect to the original network running on a single PE
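
The PCA-based compression idea summarized in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the helper names (`pca_basis`, `relative_error`), the toy layer of 12 flattened 3×3 filters, and the power-iteration solver are all assumptions chosen to keep the example dependency-free. It only shows that keeping more of the importance-ordered basis terms lowers the approximation error.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pca_basis(rows, k, iters=200):
    """Return (mean, basis): the k leading principal directions of `rows`,
    ordered by importance, found by power iteration with deflation.
    Assumes k does not exceed the rank of the centered data."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    rng = random.Random(0)
    basis = []
    for _term in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _step in range(iters):
            # Apply the (unnormalized) covariance matrix: X^T (X v).
            proj = [dot(r, v) for r in centered]
            w = [sum(p * r[j] for p, r in zip(proj, centered)) for j in range(d)]
            # Deflation: stay orthogonal to the directions already found.
            for b in basis:
                c = dot(w, b)
                w = [wj - c * bj for wj, bj in zip(w, b)]
            norm = dot(w, w) ** 0.5
            v = [wj / norm for wj in w]
        basis.append(v)
    return mean, basis

def relative_error(rows, mean, basis):
    """Relative Frobenius error when each filter keeps only its basis coefficients."""
    num = den = 0.0
    for r in rows:
        c = [x - m for x, m in zip(r, mean)]
        coeffs = [dot(c, b) for b in basis]  # one coefficient per kept basis term
        approx = [sum(a * b[j] for a, b in zip(coeffs, basis)) for j in range(len(c))]
        num += sum((x - y) ** 2 for x, y in zip(c, approx))
        den += sum(x * x for x in r)
    return (num / den) ** 0.5

# Hypothetical redundant layer: 12 filters of size 3x3 built from 2 latent patterns plus noise.
rng = random.Random(1)
latent = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(12)]
mixing = [[rng.gauss(0, 1) for _ in range(9)] for _ in range(2)]
filters = [[sum(l[i] * mixing[i][j] for i in range(2)) + 0.01 * rng.gauss(0, 1)
            for j in range(9)] for l in latent]

errors = {}
for k in (1, 2, 4):
    mean, basis = pca_basis(filters, k)
    errors[k] = relative_error(filters, mean, basis)
    print(k, errors[k])
```

Because the toy filters are built from two underlying patterns, two basis terms already capture almost all of the layer; this is the kind of redundancy the abstract attributes to over-dimensioned networks.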

17

Lind, Johan. "Evaluating CNN-based models for unsupervised image denoising." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176092.

Abstract:

Images are often corrupted by noise, which reduces their visual quality and interferes with analysis. Convolutional Neural Networks (CNNs) have become a popular method for denoising images, but their training typically relies on access to thousands of pairs of noisy and clean versions of the same underlying picture. Unsupervised methods lack this requirement and can instead be trained purely on noisy images. This thesis evaluated two unsupervised denoising algorithms, Noise2Self (N2S) and Parametric Probabilistic Noise2Void (PPN2V), both of which train an internal CNN to denoise images. Four different CNNs were tested to investigate how the performance of these algorithms is affected by the network architecture. The testing used two datasets: one containing clean images corrupted by synthetic noise, and one containing images damaged by real noise originating from the camera used to capture them. Two of the networks, UNet and a CBAM-augmented UNet, achieved high performance competitive with the strong classical denoisers BM3D and NLM. The other two networks, GRDN and MultiResUNet, generally performed poorly.

18

Söderström, Douglas. "Comparing pre-trained CNN models on agricultural machines." Thesis, Umeå universitet, Institutionen för fysik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-185333.

19

Li, Xile. "Real-time Multi-face Tracking with Labels based on Convolutional Neural Networks." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36707.

Abstract:

This thesis presents a real-time multi-face tracking system that can track multiple faces in live video, broadcasts, real-time conference recordings, etc. Real-time output is one of its most significant advantages. The proposed tracking system consists of three parts: face detection, feature extraction and tracking. We deploy a three-layer Convolutional Neural Network (CNN) to detect a face, a one-layer CNN to extract the features of a detected face, and a shallow network for face tracking based on the extracted feature maps of the face. The tracker runs in real time without any on-line training. The algorithm does not need to change any parameters for different input video conditions, and the runtime cost is not significantly affected by an increase in the number of faces being tracked. In addition, the proposed tracker can overcome most of the generally difficult tracking conditions, including camera cuts, face occlusion, false positive face detections and false negative face detections, e.g. due to faces at the image boundary or faces shown in profile. We use two commonly used metrics to evaluate the performance of our multi-face tracking system, demonstrating that it achieves accurate results. Our multi-face tracker achieves an average runtime cost of around 0.035 s with GPU acceleration, and this runtime cost remains close to stable even if the number of tracked faces increases. All evaluation results and comparisons are obtained on four commonly used video data sets.

20

El-Shafei, Ahmed. "Time multiplexing of cellular neural networks." Thesis, University of Kent, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.365221.

21

Кириченко, І. О. "Інтелектуальна технологія детектування стану трубопроводів з аугментацією даних в режимі екзамену." Master's thesis, Сумський державний університет, 2021. https://essuir.sumdu.edu.ua/handle/123456789/86859.

Abstract:

A classifier for detecting the condition of pipelines was designed and developed. The pipe condition assessment task was solved using an image augmentation approach, and the technology itself operates in exam mode. The developed algorithm is implemented as software created with the Python 3.0 programming environment.

22

Gustafsson, Magnus, and Niclas Hagel. "Al-Jazeera och CNN - En jämförande fallstudie i krigsjournalistik." Thesis, Halmstad University, School of Social and Health Sciences (HOS), 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-2234.

Abstract:

Authors: Magnus Gustafsson, Niclas Hagel

Supervisor: Thomas Knoll

Examiner: Martin Danielsson

Title: Al-Jazeera och CNN - En jämförande fallstudie i krigsjournalistik (Al-Jazeera and CNN: a comparative case study in war journalism)

Type of report: C-level essay

Subject: Media and Communication Studies

Year: Autumn term 2008

Section: School of Social and Health Sciences

Purpose: Our purpose is to study and compare al-Jazeera's and CNN's coverage of an event in the Afghanistan conflict in order to account for any differences. We want to see how different factors affect journalism. An analysis from a gender perspective will also be carried out.

Method: Case study was applied as the main method, and content analysis and critical discourse analysis were used to analyze the material.

Conclusions: After comparing the two news channels, we can clearly see that there are large differences in the reporting of an American air strike on an Afghan village. CNN, as an American news channel, shows that its reporting is influenced by the American media climate, where neutral war reporting can be seen as offensive and journalists are constantly subjected to pressure. From a gender perspective, however, we see clear similarities between the channels.

23

Berg, Albin. "Jämförelse av CNN modeller för objektidentifiering och automatisk markering." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18637.

Abstract:

One difficulty in using artificial intelligence is the resources required to carry out the computations within an acceptable time frame while also achieving good accuracy. The aim of this thesis is to compare different convolutional neural network models in terms of accuracy and speed, in order to find the most efficient model. In addition, the most efficient model is evaluated through a web solution that can annotate images with text. The results show that each model has different advantages in speed and accuracy, but that VGG16 comes close to the best results without the problems that other models have.

24

El, Ahmar Wassim. "Head and Shoulder Detection using CNN and RGBD Data." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39448.

Abstract:

Alex Krizhevsky and his colleagues changed the world of machine vision and image processing in 2012 when their deep learning model, named AlexNet, won the ImageNet Large Scale Visual Recognition Challenge with an error rate more than 10.8% lower than that of their closest competitor. Ever since, deep learning approaches have been an area of extensive research for tasks such as object detection, classification and pose estimation. This thesis presents a comprehensive analysis of different deep learning models and architectures that have delivered state-of-the-art performance in various machine vision tasks. These models are compared with each other and their strengths and weaknesses are highlighted. We introduce a new approach for human head and shoulder detection from RGB-D data based on a combination of image processing and deep learning approaches. Candidate head-top locations (CHL) are generated by a fast and accurate image processing algorithm that operates on depth data. We propose enhancements to the CHL algorithm that make it three times faster. Different deep learning models are then evaluated for the tasks of classification and detection on the candidate head-top locations to regress the head bounding boxes and detect shoulder keypoints. We propose three different small models based on convolutional neural networks for this problem. Experimental results for different architectures of our model are highlighted. We also compare the performance of our model to MobileNet. Finally, we show the differences between three types of input to the CNN models: RGB images, a 3-channel representation generated from depth data (depth map, multi-order depth template and height difference map, or DMH), and a 4-channel input composed of RGB+D data.

25

Grogan, Andree Marie. "Observations on the news factory a case study of CNN /." restricted, 2005. http://etd.gsu.edu/theses/available/etd-11172005-173426/.

Abstract:

Thesis (M.A.)--Georgia State University, 2005.
Title from title screen. Merrill Morris, committee chair; Marian Meyers, Douglas Barthlow, committee members. Electronic text (98 p.) : digital, PDF file. Description based on contents viewed June 21, 2007. Includes bibliographical references (p. 89-96).

26

Grogan, Andree Marie. "Observations on the News Factory: A Case Study of CNN." Digital Archive @ GSU, 2006. http://digitalarchive.gsu.edu/communication_theses/6.

Abstract:

News provides us with information about our world so we can make decisions about the matters that affect our daily lives, both for our personal good and for the public good. Television news is a pervasive force in our society, and it is important to study because of the influence it exerts on human action. But news is produced by human beings, and those human beings must make selections and rejections regarding what makes it into a newscast and what doesn’t. In addition, decisions have to be made on how to frame, present, order, word, edit and shape the news items that are included. Many forces influence these decisions throughout the complex television news process. Media sociology scholars urge researchers to examine these influences at five levels: the individual, newsroom, organization, extra-organization and societal or cultural levels. This gatekeeping study examined this complex news process at work and revealed the complex set of forces that influence news decisions by news producers at CNN, a global 24-hour news network. By exposing the processes by which the news is made, one can better understand the influences that shape the end product: the news.

27

Hiselius, Leo. "Igenkänning av musikalisk genre med CNN-nätverk och transfer learning." Thesis, KTH, Skolan för teknikvetenskap (SCI), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254764.

Abstract:

This project studies the effects of transfer learning on music information retrieval tasks using CNN-based audio data representations. Several neural networks are fed mel-spectrogram matrices and trained from random initial weights on three classification tasks, 'genre', 'region' and 'year', and classification performance is measured. Transfer learning is then applied and classification performance is measured again. The F1-score for individual classes within each task is also measured. Comparing the results shows that transfer learning is applicable in this task domain.

28

Lee, Yi-Jou, and 李依柔. "A Reconfigurable CNN Accelerator Design." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/15122663000772368149.

Abstract:

Master's
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
105
As convolutional neural networks (CNNs) grow in size, the performance and energy efficiency of CNN accelerators become an important problem. Previous work shows that DRAM accesses account for a large share of energy consumption. To reduce DRAM accesses, we examine the computation of a convolutional layer and observe that many parameters are shared between computations. Because the on-chip buffer of an accelerator is limited in size, such data may be loaded on-chip repeatedly. We therefore capture data reuse in the on-chip buffer to reduce the DRAM accesses of CNN computation: reused data is kept in the buffer and evicted when no longer needed. Three kinds of data reuse can be captured: input feature map reuse, filter reuse and intermediate feature map reuse. Each layer in a CNN model may favor a different data reuse policy depending on the size of its input, output and filters, but existing CNN accelerators focus on only one type of data reuse throughout CNN processing. To allow a different data reuse policy for each layer, we propose a reconfigurable CNN accelerator design that can be configured to capture different types of reuse with the objective of minimizing off-chip memory accesses. The CNN processing is separated into several computation primitives, which are units of convolution with different inputs and filters, and different data can be reused by arranging the order in which the accelerator executes these primitives. The accelerator executes instructions produced by an off-line generator that takes the optimal reuse policy and the hardware constraints into account. Our work shows that the reconfigurable design reduces DRAM accesses, and it compares execution time and energy under different data reuse policies. We also analyze the effect of different configurations of our CNN accelerator design.

29

Lopez, Paola Denisse Gomez, and 鮑樂. "Face Keypoint Recognition with CNN." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/87259949052618555601.

Abstract:

Master's
Yuan Ze University
Department of Communications Engineering
104
Purpose: This thesis addresses the problem of human face keypoint recognition within deep learning, a recent area of machine learning research. Different approaches to the problem were evaluated, and one system was proposed and implemented using Python libraries for computation. Methodology: Face keypoint detection was achieved using a template algorithm, GPU instances and convolutional networks consisting of multiple levels. The key idea is to pre-train the models in a completely unsupervised way and then fine-tune them for the task at hand with supervised learning, using the Nesterov gradient algorithm to create the perceptron units. Manual detection was used to test the implemented face keypoint recognition system. Findings: Successful results were obtained for automated face keypoint recognition under robust and controlled conditions. The experimental results show that the model provides better results than publicly available benchmarks for the dataset. Originality/Value: The thesis discusses different machine learning techniques used for face keypoint detection and describes why most algorithms are based on neural networks. Keywords: convolutional neural network, face keypoint recognition.

30

CHEN, CHUN-LIN, and 陳俊霖. "CNN-based identity recognition system." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/drtnp8.

Abstract:

Master's
National Central University
Department of Information Management, In-service Master Program
107
This thesis proposes a CNN-based identity recognition system that uses the computer vision library OpenCV, deep learning and a webcam. It is expected to be applied to access control, regional security monitoring, advertising and other systems that can be enhanced by confirming a person's identity. The work is based on Python and the GoogLeNet CNN model built into TensorFlow. Supervised learning is used to obtain facial image features, which are classified by identity. Using self-collected face image data, the thesis compares the recognition rates of three versions of the GoogLeNet model; for the architecture with the highest recognition rate, adding a residual network increases the recognition rate further. The best-performing model is then used with OpenCV to load a video and recognize the people in it in real time, verifying the practicality of the trained model. For verification, face images of 14 public figures were collected, with at least 130 images per person serving as training, test and validation samples. The best network achieves a recognition rate of 100% on the 1260 training images and 99.11% on the 450 images. Real-time recognition, from capturing the face image in the video to completing identification, takes about 0.1 second.

31

Chen, Zih-Jie, and 陳子傑. "CNN-based Gaze Block Estimation." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/3mzyzg.

Abstract:

Master's
National Central University
Department of Computer Science and Information Engineering
107
Vision is one of the most important senses through which humans receive information from the outside world. It helps us explore the world, acquire new knowledge and communicate with computers. As contactless human-computer interaction (HCI) continues to develop, communicating through gaze behavior has become a highlight of the field, with many applications in education, advertising, nursing, entertainment and virtual reality. Most eye-tracking devices require calibration in advance or a fixed head position, so many restrictions on their use remain. To solve these problems, this study uses the ResNet model as the classification core of a Gaze Block Estimation Model (GBE Model), which estimates the user's gaze block without a calibration process. Moreover, only an RGB camera without depth information is needed to capture the image, such as a webcam, the built-in camera of a laptop or the front-facing camera of a smartphone. Deep learning is data-driven and needs a large amount of correctly labeled training data to train a stable model that meets requirements. However, the existing public datasets of visual behavior target different application scenarios, so their images do not apply to all application domains. This study therefore collects and builds a dataset of up to 300,000 eye images. According to the experimental results, the GBE Model can estimate the user's gaze block without calibration while allowing the head to move, and it reaches 85.1% accuracy even in real-life testing. The results show that the proposed method lets users control the screen with their gaze block and achieves the goal of the HCI application scenario.

32

Rebelo, José Soares. "CNN-Based Refinement for Image Segmentation." Master's thesis, 2018. https://repositorio-aberto.up.pt/handle/10216/114115.

33

LIAO, PEN-MIN, and 廖本閔. "Streamflow Forecasting by CNN-GRU Model." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/8rs76r.

Abstract:

Master's
Feng Chia University
Department of Water Resources Engineering and Conservation
107
During the last two decades, the application of artificial intelligence to flood forecasting has increased noticeably. Flood forecasts are the most important input to disaster management and emergency response, and the mechanism of the Recurrent Neural Network (RNN) captures the behavior of time series. This study therefore adopts the Gated Recurrent Unit (GRU), a type of RNN, to develop a rainfall-runoff model. The applicability of the GRU is still being researched in every field, so this thesis discusses its application to flood forecasting. To improve the prediction accuracy of the GRU, the data is first processed by a Convolutional Neural Network (CNN) and then fed into the GRU for prediction; the combined model is called CNN-GRU. In the past, most studies extracted individual rainfall events from the data before training artificial neural networks for flood flow prediction. This study takes a different approach, because a GRU cell can remember past states. In addition, a genetic algorithm (GA) is used to find the optimal hyperparameter settings of the networks for an hourly rainfall-runoff model of the Dali River. The evaluation indicators show that CNN-GRU performs better than GRU, because CNN-GRU uses the CNN to extract features from the input data before the GRU makes its prediction.

34

Rebelo, José Soares. "CNN-Based Refinement for Image Segmentation." Dissertação, 2018. https://repositorio-aberto.up.pt/handle/10216/114115.

35

Chen, Shih-Che, and 陳釋澈. "Mandarin Tone Classification Using CNN/DNN." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/6ptt3a.

Abstract:

Master's
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
106
In Mandarin Chinese, tone plays an important role: different tone patterns on the same syllable can give different meanings. People whose native language is not Mandarin can be distinguished by their tone patterns. We therefore propose a method for tone classification. First, we convert the audio signal into a spectrogram. We then treat the spectrograms as images, use them as inputs to image recognition convolutional neural networks and build tone classification models. We compare different image recognition models for tone classification. This approach achieves good accuracy without heavy processing of the audio signal. The tone classification architecture can be applied to Chinese teaching methods with the aim of improving learning outcomes.
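
The first step described in this abstract, turning an audio signal into a spectrogram that can be treated as an image, might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the sample rate, frame length and hop size are arbitrary, the rising-pitch test signal merely stands in for a recorded syllable, and a naive DFT is used only to avoid external dependencies (an FFT library would normally be used).

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per time frame, one column per frequency bin."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        # Naive DFT over the non-negative frequency bins.
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        rows.append(row)
    return rows

# Hypothetical test signal: a tone whose pitch rises from 500 Hz to 2000 Hz.
sample_rate = 8000
n_samples = 800
signal, phase = [], 0.0
for i in range(n_samples):
    freq = 500 + 1500 * i / n_samples
    phase += 2 * math.pi * freq / sample_rate
    signal.append(math.sin(phase))

spec = spectrogram(signal)
# The strongest frequency bin per frame traces the pitch contour over time.
peak_bins = [row.index(max(row)) for row in spec]
print(len(spec), len(spec[0]), peak_bins[0], peak_bins[-1])
```

The resulting 2-D array is what an image classifier would receive; the rising contour in `peak_bins` is the kind of pattern that distinguishes one tone from another.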

36

Yang, Hsin-Wei, and 楊馨媁. "CNN-based Handwritten Invoice Recognition System." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/c5zem6.

Abstract:

Master's
National Taiwan Ocean University
Department of Computer Science and Engineering
107
This thesis proposes a method that uses deep learning and convolutional neural networks (CNNs) to recognize handwritten invoices. The method helps enterprises that use only handwritten invoices and reduces the labor cost of sorting invoices. The recognized fields are the invoice number, the buyer's government uniform invoice number, the seller's government uniform invoice number, the total amount in digits and the total amount in Chinese characters. Separate models are trained for the different fields, and the best result is computed from the labels, coordinates and scores of the model detections. In addition, the digit and Chinese-character total amounts are used to correct each other, which increases the accuracy of the total amount by 3%. In the experiment, about 500 labeled invoices were used to train the models, which then recognized 1000 randomly selected invoices; the overall recognition accuracy is over 95%.
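
The cross-check between the two total-amount fields can be sketched as follows. The abstract does not state the actual correction rule, so the rule used here (when the two readings disagree, keep the one with the higher detection score) is an assumption, as are the function names and the sample scores. The parser handles only the formal numerals up to 萬 that commonly appear in written amounts.

```python
DIGITS = {"零": 0, "壹": 1, "貳": 2, "參": 3, "肆": 4,
          "伍": 5, "陸": 6, "柒": 7, "捌": 8, "玖": 9}
UNITS = {"拾": 10, "佰": 100, "仟": 1000}

def parse_chinese_amount(text):
    """Convert a formal Chinese amount such as 壹仟貳佰參拾肆元整 to an integer."""
    total = section = digit = 0
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in UNITS:
            # A bare unit (e.g. 拾 alone) counts as one of that unit.
            section += (digit or 1) * UNITS[ch]
            digit = 0
        elif ch == "萬":
            total += (section + digit) * 10000
            section = digit = 0
        elif ch in "元圓整":
            continue  # currency and "exactly" markers carry no value
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return total + section + digit

def reconcile_total(digit_text, digit_score, chinese_text, chinese_score):
    """Assumed rule: if the readings disagree, trust the higher-scoring one."""
    digit_value = int(digit_text)
    try:
        chinese_value = parse_chinese_amount(chinese_text)
    except ValueError:
        return digit_value
    if digit_value == chinese_value:
        return digit_value
    return digit_value if digit_score >= chinese_score else chinese_value

# Hypothetical detections: the digit field misreads a 3 as an 8 with low confidence.
print(reconcile_total("1284", 0.55, "壹仟貳佰參拾肆元整", 0.90))
```

Writing the amount twice in different scripts gives the recognizer two independent readings of the same value, which is what makes this kind of correction possible.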

37

Shih, Yi-Hao, and 史鎰豪. "CNN-Based Distorted Barcode Number Recognition." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/bnzv68.

Abstract:

Master's
National Taiwan Ocean University
Department of Computer Science and Engineering
107
The rapid development of deep learning in recent years has brought one breakthrough after another. AlphaGo needed only two years of learning to defeat the world's top-ranked professional Go player, and Alpha Zero then needed only 21 days of self-learning to beat AlphaGo. Deep learning, which uses artificial neural networks modeled on neuron transmission in the human brain to solve problems, is clearly progressing fast. This thesis uses the convolutional neural network YOLOv3 to capture the features of distorted barcode number images and makes variance-weighted decisions based on the verification results. The parameter values are further adjusted to achieve the experimental goal of recognizing barcode label numbers. We use 350 photos of barcode numbers for training, and the results show an accuracy of up to 93%.

38

Shen, Yu-Ru, and 沈渝茹. "Hands-on Image Recognition with CNN." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/f7b527.

Abstract:

Master's
Yuan Ze University
Department of Information Management
107
Human beings are visual creatures: the information received through vision accounts for about 60% of what all our senses take in. In developing artificial intelligence, we train machines to see and understand the world and to use image recognition as a source of data for decisions and judgments. Deep learning, the mainstream of artificial intelligence, is a class of machine learning algorithms that use multiple layers to progressively extract higher-level features from raw input. Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. The convolutional neural network is a class of deep neural network most commonly applied to analyzing visual imagery. The key factors affecting a convolutional neural network are its architecture, its depth and the weights of its convolution kernels. This study compares the impact of these three factors.

39

Tsao, Po-Ho, and 曹博賀. "Boat License Number Recognition Using CNN." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/gu58wv.

Abstract:

碩士
國立臺灣海洋大學
資訊工程學系
106
This paper proposes the use of convolutional neural networks (CNNs) for real-time recognition of the numbers on fishing vessels entering and leaving their ports. First, video cameras were mounted at the entrances of fishing ports to capture images of vessels entering and exiting. Then, the fishing vessels in the images were detected and the positions of the license plates on the vessels were located. After cutting and trimming, the numbers on the fishing vessels were recognized. The recognized numbers were then rearranged by position, and the trust scores of the numbers recognized at each position were sorted in descending order. Finally, the fishing vessel numbers that complied with the fishing vessel numbering rules and had the highest trust scores were taken as the detected numbers of the fishing vessels. The recognition results were then shown in the video images. After experiments with multiple networks, the results show an accuracy of up to 58.3%.
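
The final selection step described above (keep only candidates that satisfy the numbering rules, then take the one with the highest trust score) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not state the numbering rules, so `HYPOTHETICAL_RULE` (two capital letters followed by four digits) and the function name `pick_vessel_number` are assumptions made up for the example.

```python
import re

# Hypothetical numbering rule, invented for illustration only:
# two capital letters followed by four digits.
HYPOTHETICAL_RULE = re.compile(r"[A-Z]{2}\d{4}")


def pick_vessel_number(candidates, rule=HYPOTHETICAL_RULE):
    """Return the rule-compliant candidate with the highest trust score.

    candidates: iterable of (text, trust_score) pairs.
    Returns None when no candidate satisfies the rule.
    """
    valid = [(text, score) for text, score in candidates if rule.fullmatch(text)]
    if not valid:
        return None
    # Sort by trust score in descending order and take the best one.
    valid.sort(key=lambda pair: pair[1], reverse=True)
    return valid[0][0]


candidates = [("A81234", 0.93), ("AB1234", 0.81), ("AB1284", 0.64)]
best = pick_vessel_number(candidates)
```

In this example the highest-scoring candidate is rejected because it violates the assumed rule, so the second-best reading is selected instead.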

40

Ming-Wei-Huang and 黃銘偉. "CNN-based gender and age classification." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/a7wytt.

Abstract:

碩士
中原大學
資訊工程研究所
106
In this paper, we propose a method for classifying the gender and age of pedestrians that can be applied to CCTV. With the development of science and technology, identifying the gender and age of pedestrians from face images has become a popular and important task in the social network and surveillance domains. We first perform face detection and extract facial landmarks from each image. Face alignment is then applied to obtain aligned face images as training data. We use GoogLeNet, a Convolutional Neural Network (CNN) architecture, to train the models for gender and age classification. The experimental results show that our method achieves over 60% for both males and females in our test video set.

41

GUPTA, RASHI. "IMAGE FORGERY DETECTION USING CNN MODEL." Thesis, 2022. http://dspace.dtu.ac.in:8080/jspui/handle/repository/19175.

Abstract:

Image forgery detection has become more relevant in recent years because it is so easy to alter an image and share it on social media, which may quickly spread fake news and false rumors all over the world. Image editing software has posed a significant challenge to image forensics in terms of proposing and implementing methods and strategies for detecting image counterfeiting. There have been a variety of traditional approaches to forgery detection, but they all focus on simple feature extraction and are specialized to a particular type of forgery. As research advances, however, multiple deep learning approaches are being implemented to identify forgeries in images. Deep learning approaches have demonstrated exceptional results in image forgery detection when compared to traditional methods. This work discusses the various types of image forgery. It presents and compares different applied and proven image forgery detection approaches, and provides a comprehensive literature analysis of deep learning algorithms for detecting various types of image counterfeiting. A CNN is also built based on a prior study, and its performance is compared on two different datasets. Furthermore, the impact of a data augmentation approach and of several hyperparameters on classification accuracy is assessed. Our findings imply that the dataset's difficulty has a significant influence on the outcomes. In this study, we also aim to detect image forgery using a deep learning approach. The CNN model is used along with the ELA extraction model for detecting forgery in images. We later also use two further CNN models, VGG16 and VGG19, for better comparison and understanding.

42

Soldátová, Jana. "Hybridizace konceptu TV stanic na příkladu CNN Prima News." Master's thesis, 2021. http://www.nusl.cz/ntk/nusl-448007.

Abstract:

The diploma thesis deals with the origin and adaptation of the concept of CNN Prima NEWS on the Czech market and its possible hybridization. The expansion of global media corporations is a phenomenon of the 20th century whose effects persist to the present. Television companies set up centers and branches or, through the sale of licenses, reach their audiences through localized television stations. This thesis approaches the theory of globalization with a focus on the concepts of global culture, glocalization and hybridization. By standardizing successful patterns, companies strengthen their positions in the global market and expand their influence; last but not least, it is a profitable activity that does not require high costs. The thesis captures the arrival of the licensed concept of a global television station on the Czech media market and develops the connection of the global news brand CNN with Czech commercial television. These connections and deviations are elaborated and compared on the basis of three comparable areas. At the same time, the thesis evaluates what constitutes this concept and what its possible forms of hybridization are.

43

Hsiao, Chiao-Wei, and 蕭喬蔚. "A New CMOS Large-Neighborhood Cellular-Neural-Network (CNN) Cell Structure For Large-Neighborhood CNN Universal Machine (CNNUM)." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/17286284714479061976.

Abstract:

碩士
國立交通大學
電子工程系
89
In this thesis, a new structure for the VLSI implementation of the large-neighborhood cellular neural network (LN-CNN) is proposed and analyzed. In the proposed LN-CNN structure, the parasitic lateral bipolar junction transistor (BJT) in the CMOS process is used to implement both the neuron and the synaptic path. Based on the basic device physics of the neuron-BJT (νBJT), a new compact neuron structure is proposed and analyzed. In addition, by using NPN and PNP BJTs together, a low-power synaptic path structure is designed and verified. The new low-power synaptic path is composed of dual paths for positive or negative current flow. There is no DC standby current, so it consumes no DC power. Based on the concept of the LN-CNN, templates with more than two neighborhood layers can be realized without extra complex connections in the VLSI implementation, so the chip area for interconnections is reduced and the array size can be increased. Because the above low-power synaptic path circuit is used, the LN-CNN is a lower-power design. Using the proposed LN-CNN structure, LN-CNN functions such as the Müller-Lyer arrowhead illusion and the connected component detector have been successfully realized and verified in HSPICE simulation. Both negative and asymmetrical templates can be realized. A 16 × 16 LN-CNN with a chip area of 1200 μm × 2580 μm is designed and fabricated in the 0.25 μm TSMC 1P5M CMOS process. The total power consumption is lower than 50 mW. Finally, the large-neighborhood universal machine is proposed. By using the large-neighborhood cellular neural network as the core computing unit, the analog array processor is realized. Some local memories are added to the single LN-CNN cell, and there are also local communication and control units inside the cell. Many complicated tasks that cannot be computed in a single pass by the LN-CNN can be solved by the universal machine.
From the above results, the proposed LN-CNN has great potential for implementing the CNN universal machine in various signal-processing applications. Further research in this field will be conducted in the future.

44

Wei, Cian-Pin, and 魏千評. "Signal Reconstruction-LMI, GA and CNN Approaches." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/07302740737170892869.

Abstract:

碩士
國立高雄應用科技大學
電子與資訊工程研究所碩士班
94
In this thesis, first, the design of FIR and IIR equalizers for communication channels via the genetic algorithm (GA) and linear matrix inequality (LMI) approaches from an H-inf perspective is presented, where the communication channels are modeled as linear time-invariant and nonlinear time-invariant, respectively. In general, the equalizer plays an important role in modern digital communication systems, where it is used to recover the corrupted signal. For the linear time-invariant channel, the problem of IIR equalizer design can be transformed into a nonlinear matrix inequality. To eliminate the nonlinear element from the inequality, the GA technique is employed. Moreover, for the nonlinear time-invariant channel, the GA technique is used to linearize the nonlinear channel model, and the approximation errors can be viewed as state uncertainties. Second, a technique for image noise cancellation is presented that employs the cellular neural network (CNN) and LMI. The main objective is to train the templates of the CNN using a corrupted image and its corresponding desired image. A criterion for the uniqueness and global asymptotic stability of the equilibrium point of the CNN is obtained based on the Lyapunov stability theorem. Finally, illustrative examples are presented to demonstrate the effectiveness of the proposed methodologies.

45

Chou, Hung-Chun, and 周宏春. "Discriminatively-learned CNN Features for Image Retrieval." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/21952307307565825693.

Abstract:

碩士
國立交通大學
資訊科學與工程研究所
103
The thesis aims to learn discriminative features for image retrieval tasks using deep convolutional neural networks (CNNs). Motivated by the great success of CNNs in recognition tasks, one may be tempted to simply adopt the output of a CNN for retrieval. However, a CNN model pre-trained for classification tasks may not be optimized for retrieval tasks. To address this issue, the CNN's weight parameters are specifically adapted with a contrastive loss function to suit retrieval tasks. Extensive experiments conducted on typical retrieval datasets confirm the superiority of the proposed scheme over state-of-the-art methods.
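
As a rough illustration of the contrastive loss mentioned above, the sketch below uses one common pairwise form: similar pairs are penalized by their squared distance, and dissimilar pairs are penalized only when they fall inside a margin. The abstract does not give the thesis's exact formulation, so the Euclidean distance, the factor 0.5, the default `margin`, and the function names are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(feat_a, feat_b, similar, margin=1.0):
    """Pairwise contrastive loss (one common form; an assumption here).

    similar=True  -> pull the pair together: 0.5 * d^2
    similar=False -> push the pair apart until d >= margin:
                     0.5 * max(0, margin - d)^2
    """
    d = euclidean(feat_a, feat_b)
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# A dissimilar pair already farther apart than the margin costs nothing,
# while the same pair labeled as similar is penalized for its distance.
loss_far_dissimilar = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=False)
loss_far_similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True)
```

During fine-tuning, a loss of this kind would be averaged over many image pairs and minimized with respect to the network weights that produce the feature vectors.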

46

黃園芳. "Two-Dimensional CNN with L-shaped Template." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/96492000877316767105.

Abstract:

碩士
國立交通大學
應用數學系所
95
In this paper, we consider the simplest two-dimensional CNN template, the L-shaped template. This template was previously investigated in [Lin&Yang, 2001], which used building blocks to discuss the spatial entropy. In this paper, we reappraise the spatial entropy using the pattern generation method of [Ban&Lin, 2005]. When the spatial entropy cannot be evaluated, we use the connecting operator of [Ban, Lin&Lin, 2006] to evaluate a lower bound on the spatial entropy. Finally, we compare the results with those of [Lin&Yang].

47

Chen, Wei-Cheng, and 陳威成. "A Hrbrid Method for CNN Template Design." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/82335919178615802468.

Abstract:

碩士
國立中正大學
電機工程研究所
90
In this study, a hybrid method for CNN (Cellular Neural Network) template design is proposed. The objective is to efficiently find robust templates for CNNs with non-zero boundary conditions taken into account. In the proposed method, we analyze the dynamic transient of the CNN and identify the influence of the non-zero boundary on the analytic method for CNNs. This finding provides a constraint on the search for robust templates using a GA (Genetic Algorithm). Incorporating the constraint into the GA procedure shrinks the search space and thus reduces useless searching. The study shows that the proposed method needs fewer GA generations than several existing methods. Moreover, the robust templates found by this method are superior to those found by other methods.

48

Hua, Chen Bo, and 陳柏樺. "Handwritten Character Recognition using GA-based CNN." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/69041942101896744315.

Abstract:

碩士
國立高雄應用科技大學
電機工程系
99
In this paper, the static simulation method is used for handwritten character recognition. Because a lot of noise and some non-character traces arise when images of written characters are scanned, noise elimination is an unavoidable problem in character recognition. This paper simulates image preprocessing by adding several types of noise and then filtering them out using conventional and gene-based CNN methods. The results demonstrate the superiority of the CNN optimization method. For salt-and-pepper noise and Gaussian noise, the CNN algorithm yields clearer images. Even for shake-blurred images, its restoration is better than that of the least-squares filter. After preprocessing and normalization to the same size, the features of the handwritten characters are extracted and fed into a back-propagation neural network for parameter training. At this stage, an innovative horizontal feature extraction method is adopted in addition to the traditional ones, which makes it easier to distinguish mixed strokes. Meanwhile, the amount of data is much smaller than that reported by other researchers. The main contributions of this thesis are the simplicity of the system algorithm, the effectiveness of noise elimination, and a high recognition rate of up to 98%.

49

Lu, Pei-Hsuan, and 呂姵萱. "L1-Norm Based Adversarial Example against CNN." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/ua49z8.

Abstract:

碩士
國立中興大學
資訊科學與工程學系
106
In recent years, defending against adversarial perturbations of natural examples in order to build robust machine learning models based on deep neural networks (DNNs) has become an emerging research field at the intersection of deep learning and security. In particular, MagNet, which consists of an adversary detector and a data reformer, is by far one of the strongest defenses in the black-box setting, where the attacker aims to craft transferable adversarial examples from an undefended DNN model to bypass a defense module without knowing of its existence. MagNet can successfully defend against a variety of attacks on DNNs, including Carlini and Wagner's transfer attack based on the L2 distortion metric. However, in this thesis, under the black-box transfer attack setting, we show that adversarial examples crafted based on the L1 distortion metric can easily bypass MagNet and fool the target DNN image classifiers on MNIST and CIFAR-10. We also provide theoretical justification for why the considered approach can yield adversarial examples with superior attack transferability, and we conduct extensive experiments on variants of MagNet to verify its lack of robustness to L1 distortion based transfer attacks. Notably, our results substantially weaken the existing transfer attack assumption of knowing the deployed defense technique when attacking defended DNNs (i.e., the gray-box setting).
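
To make the two distortion metrics named above concrete, the sketch below computes the L1 and L2 distances between an input and a perturbed copy. It only illustrates the metrics, not the attack or the thesis's method; the function names and the toy four-pixel "images" are assumptions.

```python
import math


def l1_distortion(x, x_adv):
    """L1 distortion: sum of absolute per-pixel changes."""
    return sum(abs(a - b) for a, b in zip(x, x_adv))


def l2_distortion(x, x_adv):
    """L2 distortion: Euclidean norm of the perturbation."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_adv)))


x = [0.0, 0.0, 0.0, 0.0]
sparse_adv = [0.5, 0.0, 0.0, 0.0]    # one pixel changed a lot
dense_adv = [0.25, 0.25, 0.25, 0.25]  # every pixel changed a little

# Both perturbations have the same L2 distortion, but the sparse one
# has half the L1 distortion of the dense one.
sparse = (l1_distortion(x, sparse_adv), l2_distortion(x, sparse_adv))
dense = (l1_distortion(x, dense_adv), l2_distortion(x, dense_adv))
```

The two perturbations are indistinguishable under the L2 metric yet differ under L1, which shows that the two metrics rank perturbations differently and that an L1 penalty favors sparse changes.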

50

CHANG, CHANG, and 張競. "Fast Gender Detection System based on CNN." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/jwvbz6.

Abstract:

碩士
輔仁大學
資訊工程學系碩士班
107
Artificial intelligence (AI), machine learning, and deep learning have long been popular topics. Over time, more and more open-source tools have appeared, such as OpenAI, TensorFlow, and char-RNN, so that people can get the hang of artificial intelligence and machine learning more easily and quickly. As the technology matures, AI can be applied in many fields. Simple AI can be built into refrigerators, sweepers, air conditioners, and so on: these devices detect external signals such as humidity, temperature, brightness, images, levelness, vibration, and distance in order to control themselves automatically. Beyond that, AI can be applied in medical and industrial settings. Medical applications can use expert systems, a branch of artificial intelligence that solves expert-level problems in specific fields by drawing on experts' rich experience and expertise and simulating their way of thinking to solve problems that otherwise only experts could solve. In industrial fields, the large amount of data generated during production is used for training to predict problems on the production line, which improves product yield and production efficiency and increases gross output value. This thesis takes the safety of female passengers as its purpose and builds a related application. In recent years, Taiwan's public transportation has flourished, yet problems persist: because passengers cannot be screened, female passengers have to be wary of other passengers and stay on guard, especially at night. This thesis uses the surveillance cameras on the platform and inside the carriage to identify people's gender in real time and provides the identification results through the cloud to passengers' mobile devices, so that they can get information about the men and women on the platform and in the carriage.
