Dissertations / Theses on the topic 'Deep learning based'

Consult the top 50 dissertations / theses for your research on the topic 'Deep learning based.'

1

Hussein, Ahmed. "Deep learning based approaches for imitation learning." Thesis, Robert Gordon University, 2018. http://hdl.handle.net/10059/3117.

Abstract:
Imitation learning refers to an agent's ability to mimic a desired behaviour by learning from observations. The field is rapidly gaining attention due to recent advances in computational and communication capabilities as well as rising demand for intelligent applications. The goal of imitation learning is to describe the desired behaviour by providing demonstrations rather than instructions. This enables agents to learn complex behaviours with general learning methods that require minimal task-specific information. However, imitation learning faces many challenges. The objective of this thesis is to advance the state of the art in imitation learning by adopting deep learning methods to address two major challenges of learning from demonstrations. The first is representing the demonstrations in a manner that is adequate for learning. We propose novel Convolutional Neural Network (CNN) based methods to automatically extract feature representations from raw visual demonstrations and learn to replicate the demonstrated behaviour. This alleviates the need for task-specific feature extraction and provides a general learning process that is adequate for multiple problems. The second challenge is generalizing a policy to situations not seen in the training demonstrations. This is a common problem because demonstrations typically show the best way to perform a task and do not offer any information about recovering from suboptimal actions. Several methods are investigated to improve the agent's generalization ability based on its initial performance. Our contributions in this area are threefold. Firstly, we propose an active data aggregation method that queries the demonstrator in situations of low confidence. Secondly, we investigate combining learning from demonstrations and reinforcement learning. A deep reward shaping method is proposed that learns a potential reward function from demonstrations. Finally, memory architectures in deep neural networks are investigated to provide context to the agent when taking actions. Using recurrent neural networks addresses the dependency between the state-action sequences taken by the agent. The experiments are conducted in simulated environments on 2D and 3D navigation tasks that are learned from raw visual data, as well as a 2D soccer simulator. The proposed methods are compared to state-of-the-art deep reinforcement learning methods. The results show that deep learning architectures can learn suitable representations from raw visual data and effectively map them to atomic actions. The proposed methods for addressing generalization show improvements over using supervised learning and reinforcement learning alone. The results are thoroughly analysed to identify the benefits of each approach and situations in which it is most suitable.
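
To make the representation-learning contribution concrete, the sketch below shows a minimal behavioural-cloning setup in PyTorch: a small CNN maps raw visual observations to logits over atomic actions and is trained on demonstrated state-action pairs. The architecture, image size and action count are illustrative assumptions, not the networks used in the thesis.

```python
import torch
import torch.nn as nn

class BCPolicy(nn.Module):
    """Minimal CNN policy for behavioural cloning from raw pixels (illustrative)."""
    def __init__(self, n_actions: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
        )
        self.head = nn.Linear(32 * 4 * 4, n_actions)

    def forward(self, obs):
        return self.head(self.features(obs))  # action logits

# Supervised training step on demonstrated (observation, action) pairs
policy = BCPolicy(n_actions=4)
optimiser = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

obs = torch.rand(8, 3, 64, 64)        # batch of raw visual observations
actions = torch.randint(0, 4, (8,))   # demonstrated atomic actions
loss = loss_fn(policy(obs), actions)
loss.backward()
optimiser.step()
```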
2

Abrishami, Hedayat. "Deep Learning Based Electrocardiogram Delineation." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563525992210273.

3

Al-Bander, B. Q. "Retinal image analysis based on deep learning." Thesis, University of Liverpool, 2018. http://livrepository.liverpool.ac.uk/3022573/.

4

Widegren, Philip. "Deep learning-based forecasting of financial assets." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-208308.

Abstract:
Deep learning and neural networks have recently become powerful tools for solving complex problems, thanks to improvements in training algorithms. Examples of successful applications can be found in speech recognition and machine translation. Relatively few finance articles have applied deep learning, but the existing ones indicate that it can be applied successfully to problems in finance. This thesis studies forecasting of financial price movements using two types of neural networks: feedforward and recurrent networks. For the feedforward neural networks we considered non-deep networks with more neurons and deep networks with fewer neurons. In addition to the comparison between feedforward and recurrent networks, a comparison between deep and non-deep networks is made. The recurrent architecture consists of a recurrent layer mapping into a feedforward layer followed by an output layer. The networks are trained with two different feature setups, one less complex and one more complex. The findings for non-deep vs. deep feedforward neural networks imply that there is no general pattern for whether deep or non-deep networks are preferable. The findings for recurrent vs. feedforward neural networks imply that recurrent neural networks do not necessarily outperform feedforward neural networks, even though financial data are in general time-dependent. In some cases, adding batch normalization can improve the accuracy of the feedforward neural networks, which can be preferable to using more complex models such as recurrent neural networks. Moreover, there are significant differences in accuracy between the two feature setups. The highest accuracy across all networks is 52.82%, which is significantly better than the simple benchmark.
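
A rough illustration of the two model families compared above: a small feedforward classifier (with batch normalization) and a recurrent LSTM-based classifier that predict the direction of the next price movement from a window of features. The feature count, window length and layer sizes are placeholders, not the configurations studied in the thesis.

```python
import torch
import torch.nn as nn

N_FEATURES, WINDOW = 10, 20  # assumed feature count and look-back window

# Feedforward network on a flattened feature window
ffn = nn.Sequential(
    nn.Flatten(),
    nn.Linear(N_FEATURES * WINDOW, 64), nn.ReLU(),
    nn.BatchNorm1d(64),                  # batch normalisation, as discussed above
    nn.Linear(64, 2),                    # up / down movement
)

# Recurrent network: LSTM layer feeding a feedforward layer and an output layer
class RecurrentClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_FEATURES, 32, batch_first=True)
        self.fc = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, x):                # x: (batch, WINDOW, N_FEATURES)
        out, _ = self.rnn(x)
        return self.fc(out[:, -1])       # classify from the last time step

x = torch.rand(4, WINDOW, N_FEATURES)
print(ffn(x).shape, RecurrentClassifier()(x).shape)  # both torch.Size([4, 2])
```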
5

Wang, Xutao. "Chinese Text Classification Based On Deep Learning." Thesis, Mittuniversitetet, Avdelningen för informationssystem och -teknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-35322.

Abstract:
Text classification has long been a concern in natural language processing, especially now that data volumes are growing massively with the development of the internet. The recurrent neural network (RNN) is one of the most popular methods for natural language processing because its recurrent architecture gives it the ability to process serialized information. Meanwhile, the convolutional neural network (CNN) has shown its ability to extract features from visual imagery. This paper combines the advantages of RNN and CNN and proposes a model called BLSTM-C for Chinese text classification. BLSTM-C begins with a bidirectional long short-term memory (BLSTM) layer, a special kind of RNN, to produce a sequence output based on both past and future context. This sequence is then fed to a CNN layer, which is used to extract features from it. We evaluate the BLSTM-C model on several tasks such as sentiment classification and category classification, and the results show our model's remarkable performance on these text tasks.
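
The BLSTM-C architecture described above can be sketched roughly as a bidirectional LSTM whose output sequence feeds a 1-D convolutional layer, global pooling and a classifier. This is a hedged reconstruction from the abstract; the vocabulary size, hidden sizes and kernel width are placeholders.

```python
import torch
import torch.nn as nn

class BLSTMC(nn.Module):
    """Rough sketch of a BLSTM -> CNN text classifier (sizes are placeholders)."""
    def __init__(self, vocab_size=5000, embed_dim=128, hidden=64, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.blstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, 100, kernel_size=3, padding=1)
        self.fc = nn.Linear(100, n_classes)

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        x, _ = self.blstm(self.embed(tokens))         # (batch, seq_len, 2*hidden)
        x = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, 100, seq_len)
        x = torch.max(x, dim=2).values                # global max pooling over time
        return self.fc(x)

logits = BLSTMC()(torch.randint(0, 5000, (4, 50)))    # -> shape (4, 10)
```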
6

Zhou, Chenyang. "Measure face similarity based on deep learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262675.

Abstract:
Measuring face similarity is a task in computer vision that differs from face recognition. It aims to find an embedding in which similar faces have a smaller distance than dissimilar ones. This project investigates two different Siamese networks to explore whether these specific networks outperform face recognition methods on face similarity. The best accuracy, 65.11%, is achieved by a Siamese convolutional neural network. Moreover, the best results in a similarity ranking task are obtained with Siamese geometry-aware metric learning. In addition, this project creates a novel dataset of facial image pairs for face similarity.
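
A minimal sketch of a Siamese CNN for face similarity, assuming a shared embedding branch and a contrastive loss that pulls similar faces together and pushes dissimilar ones apart; the backbone and margin are illustrative and not the exact networks evaluated in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Shared branch of a Siamese CNN (illustrative architecture)."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

def contrastive_loss(z1, z2, same, margin=1.0):
    """same = 1 for similar pairs, 0 for dissimilar pairs."""
    d = F.pairwise_distance(z1, z2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

net = EmbeddingNet()
a, b = torch.rand(8, 3, 64, 64), torch.rand(8, 3, 64, 64)
labels = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(net(a), net(b), labels)
```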
7

Thiele, Johannes C. "Deep learning in event-based neuromorphic systems." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS403/document.

Abstract:
Inference and training in deep neural networks require large amounts of computation, which in many cases prevents the integration of deep networks in resource-constrained environments. Event-based spiking neural networks are an alternative to standard artificial neural networks that holds the promise of more energy-efficient processing. However, training spiking neural networks to achieve high inference performance is still challenging, in particular when learning is also required to be compatible with neuromorphic constraints. This thesis studies training algorithms and information encoding in such deep networks of spiking neurons. Starting from a biologically inspired learning rule, we analyze which properties of learning rules are necessary in deep spiking neural networks to enable embedded learning in a continuous learning scenario. We show that a time-scale invariant learning rule based on spike-timing dependent plasticity is able to perform hierarchical feature extraction and classification of simple objects from the MNIST and N-MNIST datasets. To overcome certain limitations of this approach, we design a novel framework for spike-based learning, SpikeGrad, which is a fully event-based implementation of the gradient backpropagation algorithm. We show how this algorithm can be used to train a spiking network that performs inference of relations between numbers and MNIST images. Additionally, we demonstrate that the framework is able to train large-scale convolutional spiking networks to competitive recognition rates on the MNIST and CIFAR10 datasets. In addition to being an effective and precise learning mechanism, SpikeGrad allows the response of the spiking neural network to be described in terms of a standard artificial neural network, which enables faster simulation of spiking neural network training. Our work therefore introduces several powerful training concepts for on-chip learning in neuromorphic devices that could help to scale spiking neural networks to real-world problems.
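
As a rough illustration of the spike-timing dependent plasticity rule that the first part of the thesis builds on (not of the SpikeGrad algorithm itself), the sketch below applies a pair-based STDP update: the synapse is potentiated when the presynaptic spike precedes the postsynaptic one and depressed otherwise. Time constants and learning rates are arbitrary placeholders.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP: potentiate if pre fires before post, depress otherwise."""
    dt = t_post - t_pre
    if dt > 0:                                   # pre -> post: potentiation
        dw = a_plus * np.exp(-dt / tau)
    else:                                        # post -> pre: depression
        dw = -a_minus * np.exp(dt / tau)
    return np.clip(w + dw, 0.0, 1.0)             # keep the weight bounded

w = 0.5
w = stdp_update(w, t_pre=12.0, t_post=15.0)      # causal pair -> weight increases
w = stdp_update(w, t_pre=30.0, t_post=22.0)      # anti-causal pair -> decreases
print(round(w, 4))
```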
8

Zanghieri, Marcello. "sEMG-based hand gesture recognition with deep learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18112/.

Abstract:
Hand gesture recognition based on surface electromyographic (sEMG) signals is a promising approach for the development of Human-Machine Interfaces (HMIs) with natural control, such as intuitive robot interfaces or poly-articulated prostheses. However, real-world applications are limited by reliability problems due to motion artifacts, postural and temporal variability, and sensor re-positioning. This master thesis is the first application of deep learning to the Unibo-INAIL dataset, the first public sEMG dataset exploring the variability between subjects, sessions and arm postures, collected over 8 sessions from each of 7 able-bodied subjects executing 6 hand gestures in 4 arm postures. In the most recent studies, the variability is addressed with training strategies based on training set composition, which improve inter-posture and inter-day generalization of classical (i.e. non-deep) machine learning classifiers, among which the RBF-kernel SVM yields the highest accuracy. The deep architecture realized in this work is a 1d-CNN implemented in PyTorch, inspired by a 2d-CNN reported to perform well on other public benchmark databases. On this 1d-CNN, various training strategies based on training set composition were implemented and tested. Multi-session training proves to yield higher inter-session validation accuracies than single-session training. Two-posture training proves to be the best postural training (showing the benefit of training on more than one posture) and yields 81.2% inter-posture test accuracy. Five-day training proves to be the best multi-day training and yields 75.9% inter-day test accuracy. All results are close to the baseline. Moreover, the results of multi-day training highlight the phenomenon of user adaptation, indicating that training should also prioritize recent data. Though not better than the baseline, the achieved classification accuracies rightfully place the 1d-CNN among the candidates for further research.
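
A hedged sketch of the kind of 1d-CNN used for sEMG gesture classification: 1-D convolutions over the time axis of a multi-channel sEMG window, followed by pooling and a linear classifier. The channel count, window length and layer sizes are assumptions and differ from the network in the thesis.

```python
import torch
import torch.nn as nn

N_CHANNELS, WINDOW, N_GESTURES = 4, 150, 6   # assumed sensor channels, samples, classes

model = nn.Sequential(                        # 1d-CNN over the time axis
    nn.Conv1d(N_CHANNELS, 32, kernel_size=5), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=5), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(64, N_GESTURES),
)

emg_window = torch.rand(8, N_CHANNELS, WINDOW)   # batch of sEMG windows
print(model(emg_window).shape)                   # -> torch.Size([8, 6])
```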
9

Elkaref, Mohab. "Deep learning applications for transition-based dependency parsing." Thesis, University of Birmingham, 2018. http://etheses.bham.ac.uk//id/eprint/8620/.

Abstract:
Dependency parsing is a method that builds dependency trees consisting of binary relations that describe the syntactic role of words in sentences. Recently, dependency parsing has seen large improvements due to deep learning, which has enabled richer feature representations and flexible architectures. In this thesis we focus on the application of these methods to transition-based parsing, which is a faster variant. We explore current architectures and examine ways to improve their representation capabilities and final accuracies. Our first contribution is an improvement on the basic architecture at the heart of many current parsers. We show that using recurrent neural network hidden layers, initialised with pretrained weights from a feedforward network, provides significant accuracy improvements. Second, we examine the best parser architecture. We show that separate classifiers for dependency parsing and labelling, with a shared input layer, provide the best accuracy. We also show that a parser and labeller can be successfully trained separately. Finally, we propose Recursive LSTM Trees, which can represent an entire tree as a single dense vector and achieve competitive accuracy with minimal features. The parsers that we develop in this thesis cover many aspects of this task and are easy to integrate with current methods.
10

Dsouza, Rodney Gracian. "Deep Learning Based Motion Forecasting for Autonomous Driving." The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1619139403696822.

11

Hamdi, Slim. "Deep Learning Anomaly Detection for Drone-based Surveillance." Thesis, Troyes, 2021. http://www.theses.fr/2021TROY0026.

Abstract:
Civil security is the set of methods implemented by a State or an organization to protect civilian populations, as well as their property and activities, in times of war, crisis, and peace, against risks or threats of any kind. Moreover, it consists of ensuring the safety of people against all types of natural risks, such as fires, and against various threats that could endanger their lives, property or activities (acts of terrorism, acts of vandalism, etc.). In recent years, the use of drones for surveillance tasks has been on the rise worldwide. As a result, the number of cameras that must be analyzed increases, and the efficiency and accuracy of human operators have reached their limits. Moreover, in the context of anomaly detection, only normal events are available for the learning process. Therefore, implementing a deep learning method in unsupervised mode to solve this problem becomes fundamental. In this thesis, we have proposed several deep learning architectures capable of detecting abnormal events with high performance.
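
Since only normal events are available for training, a common unsupervised formulation (sketched below as an assumption, not taken from the thesis) trains a convolutional autoencoder on normal frames and flags frames whose reconstruction error exceeds a threshold as anomalous.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Convolutional autoencoder trained on normal frames only (illustrative)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

model = ConvAE()
frame = torch.rand(1, 1, 64, 64)                    # a grayscale video frame
recon_error = ((model(frame) - frame) ** 2).mean()  # per-frame reconstruction error
is_anomalous = recon_error.item() > 0.02            # threshold chosen on validation data
```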
12

Rawat, Sharad. "DEEP LEARNING BASED FRAMEWORK FOR STRUCTURAL TOPOLOGY DESIGN." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1559560543458263.

13

Robertson, Curtis E. "Deep Learning-Based Speed Sign Detection and Recognition." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1595500028808679.

14

Jiang, Ji Chu. "High Precision Deep Learning-Based Tabular Data Extraction." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/41699.

Abstract:
The advancements of AI methodologies and computing power enable automation and propel the Industry 4.0 phenomenon. Information and data are digitized more than ever, and millions of documents are processed every day, fueled by the growth in institutions, organizations, and their supply chains. Processing documents is a time-consuming, laborious task, so automating data processing is highly important for optimizing supply-chain efficiency across all industries. Document analysis for data extraction is an impactful field, and this thesis aims to achieve the vital steps in an ideal data extraction pipeline. Data are often stored in tables, since tables are a structured format in which the user can easily associate values and attributes. Tables can contain vital information such as specifications, dimensions, cost, etc. Focusing on table analysis and recognition in documents is therefore a cornerstone of data extraction. This thesis applies deep learning methodologies to automate the two main problems within table analysis for data extraction: table detection and table structure detection. Table detection is identifying and localizing the boundaries of the table. The output of the table detection model is fed into the table structure detection model for structure format analysis, so the table detection model must have high localization performance or it would affect the rest of the data extraction pipeline. Our table detection improves bounding box localization performance by incorporating a Kullback–Leibler loss function that calculates the divergence between the probability distributions of the ground truth and predicted bounding boxes, as well as by adding a voting procedure to the non-maximum suppression step to produce better-localized merged bounding box proposals. This model improved the precision of tabular detection by 1.2% while achieving the same recall as other state-of-the-art models on the public ICDAR2013 dataset, and achieved state-of-the-art results of 99.8% precision on the ICDAR2017 dataset. Furthermore, our model showed large improvements especially at higher intersection over union (IoU) thresholds: at 95% IoU, an improvement of 10.9% is seen for the ICDAR2013 dataset and an improvement of 8.4% for the ICDAR2017 dataset. Table structure detection is recognizing the internal layout of a table. Researchers often approach this by detecting the rows and columns. However, for correct mapping of each individual cell's data location in the semantic extraction step, the rows and columns would have to be combined to form a matrix, which introduces additional degrees of error. We instead propose a model that directly detects each individual cell. Our model is an ensemble of state-of-the-art models: Hybrid Task Cascade as the detector and dual ResNeXt101 backbones arranged in a CBNet architecture. There is a lack of quality labeled data for table cell structure detection, so we hand-labeled the ICDAR2013 dataset and wish to establish a strong baseline for it. Our model was compared with other state-of-the-art models that excel at table or table structure detection, and yielded a precision of 89.2% and a recall of 98.7% on the ICDAR2013 cell structure dataset.
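
The KL-based localisation loss described above can be illustrated roughly as follows: each predicted box coordinate is modelled as a Gaussian (mean and log-variance), the ground truth is treated as a Dirac delta, and the KL divergence reduces to a variance-weighted squared error plus a log-variance penalty. This is a commonly published formulation used here for illustration, not necessarily the exact loss in the thesis.

```python
import torch

def kl_box_loss(pred_mean, pred_log_var, gt):
    """KL-divergence-style localisation loss for box regression (illustrative).

    Each predicted coordinate is a Gaussian N(pred_mean, exp(pred_log_var)); the
    ground truth is treated as a Dirac delta, so the KL term reduces to a
    variance-weighted squared error plus a log-variance penalty.
    """
    var = torch.exp(pred_log_var)
    return ((gt - pred_mean) ** 2 / (2 * var) + 0.5 * pred_log_var).mean()

pred_mean = torch.tensor([[10.2, 20.1, 50.3, 80.7]], requires_grad=True)  # x1, y1, x2, y2
pred_log_var = torch.zeros(1, 4, requires_grad=True)                      # predicted uncertainty
gt = torch.tensor([[10.0, 20.0, 52.0, 79.0]])
loss = kl_box_loss(pred_mean, pred_log_var, gt)
loss.backward()
```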
15

Haque, Ashraful. "A Deep Learning-based Dynamic Demand Response Framework." Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/104927.

Abstract:
The electric power grid is evolving in terms of generation, transmission and distribution network architecture. On the generation side, distributed energy resources (DER) are participating at a much larger scale. Transmission and distribution networks are transforming to a decentralized architecture from a centralized one. Residential and commercial buildings are now considered as active elements of the electric grid which can participate in grid operation through applications such as the Demand Response (DR). DR is an application through which electric power consumption during the peak demand periods can be curtailed. DR applications ensure an economic and stable operation of the electric grid by eliminating grid stress conditions. In addition to that, DR can be utilized as a mechanism to increase the participation of green electricity in an electric grid. The DR applications, in general, are passive in nature. During the peak demand periods, common practice is to shut down the operation of pre-selected electrical equipment i.e., heating, ventilation and air conditioning (HVAC) and lights to reduce power consumption. This approach, however, is not optimal and does not take into consideration any user preference. Furthermore, this does not provide any information related to demand flexibility beforehand. Under the broad concept of grid modernization, the focus is now on the applications of data analytics in grid operation to ensure an economic, stable and resilient operation of the electric grid. The work presented here utilizes data analytics in DR application that will transform the DR application from a static, look-up-based reactive function to a dynamic, context-aware proactive solution. The dynamic demand response framework presented in this dissertation performs three major functionalities: electrical load forecast, electrical load disaggregation and peak load reduction during DR periods. The building-level electrical load forecasting quantifies required peak load reduction during DR periods. The electrical load disaggregation provides equipment-level power consumption. This will quantify the available building-level demand flexibility. The peak load reduction methodology provides optimal HVAC setpoint and brightness during DR periods to reduce the peak demand of a building. The control scheme takes user preference and context into consideration. A detailed methodology with relevant case studies regarding the design process of the network architecture of a deep learning algorithm for electrical load forecasting and load disaggregation is presented. A case study regarding peak load reduction through HVAC setpoint and brightness adjustment is also presented. To ensure the scalability and interoperability of the proposed framework, a layer-based software architecture to replicate the framework within a cloud environment is demonstrated.
Doctor of Philosophy
The modern power grid, known as the smart grid, is transforming how electricity is generated, transmitted and distributed across the US. In a legacy power grid, the utilities are the suppliers and the residential or commercial buildings are the consumers of electricity. However, the smart grid considers these buildings as active grid elements which can contribute to the economic, stable and resilient operation of an electric grid. Demand Response (DR) is a grid application that reduces electrical power consumption during peak demand periods. The objective of DR application is to reduce stress conditions of the electric grid. The current DR practice is to shut down pre-selected electrical equipment i.e., HVAC, lights during peak demand periods. However, this approach is static, pre-fixed and does not consider any consumer preference. The proposed framework in this dissertation transforms the DR application from a look-up-based function to a dynamic context-aware solution. The proposed dynamic demand response framework performs three major functionalities: electrical load forecasting, electrical load disaggregation and peak load reduction. The electrical load forecasting quantifies building-level power consumption that needs to be curtailed during the DR periods. The electrical load disaggregation quantifies demand flexibility through equipment-level power consumption disaggregation. The peak load reduction methodology provides actionable intelligence that can be utilized to reduce the peak demand during DR periods. The work leverages functionalities of a deep learning algorithm to increase forecasting accuracy. An interoperable and scalable software implementation is presented to allow integration of the framework with existing energy management systems.
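
A hedged sketch of the building-level load-forecasting step described above: an LSTM consumes a history window of load and context features and predicts the next interval's demand. The window length, feature set and layer sizes are placeholders, not the architecture designed in the dissertation.

```python
import torch
import torch.nn as nn

class LoadForecaster(nn.Module):
    """LSTM-based short-term electrical load forecaster (illustrative sizes)."""
    def __init__(self, n_features=3, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, 1)           # next-interval demand (kW)

    def forward(self, history):                   # history: (batch, steps, n_features)
        h, _ = self.lstm(history)
        return self.out(h[:, -1])

model = LoadForecaster()
history = torch.rand(16, 96, 3)                   # e.g. 24 h of 15-minute intervals
next_demand = model(history)                      # -> shape (16, 1)
```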
16

Maillot, Robin. "Deep learning approach to hologram based cellular classification." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-240602.

Abstract:
With the rise of data-intensive classification algorithms, the need for high-throughput imaging methods has increased. Lens-free imaging provides a new high-throughput technique for imaging cells through hologram measurements. One acquisition of a Petri dish can provide between one thousand and ten thousand samples, which do not need to be annotated if the biological properties of the Petri dish are known. Previously, hologram classification was addressed using feature extraction and a non-linear classifier. In this work a deep learning approach to cellular classification using holograms is introduced. Because deep learning approaches do not require hand-tailored features, they are quicker to develop and the framework is easier to generalize to other hologram classification tasks. A dataset containing alive- and dead-cell holograms was used to judge the feasibility of the approach. Although the deep learning classifier was successful in classifying simulated holograms (over 97% test accuracy), the experimental dataset showed some key flaws which limited test performance. In an effort to improve the deep learning approach, the improvements necessary to create a better experimental dataset suited for deep learning have also been identified.
17

Matsoukas, Christos. "Model Distillation for Deep-Learning-Based Gaze Estimation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-261412.

Abstract:
With the recent advances in deep learning, gaze estimation models have reached new levels of predictive accuracy that could not be achieved with older techniques. Nevertheless, deep learning consists of computationally and memory-expensive algorithms that do not allow their integration in embedded systems. This work aims to tackle this problem by boosting the predictive power of small networks using a model compression method called "distillation". Under the concept of distillation, we introduce an additional term to the compressed model's total loss which is a bounding term between the compressed model (the student) and a powerful one (the teacher). We show that the distillation method introduces to the compressed model something more than noise: namely, the teacher's inductive bias, which helps the student reach a better optimum due to the adaptive error deduction. Furthermore, we show that the MobileNet family exhibits unstable training phases, and we report that the distilled MobileNet25 slightly outperformed the MobileNet50. Moreover, we try newly proposed training schemes to increase the predictive power of small and thin networks, and we infer that extremely thin architectures are hard to train. Finally, we propose a new training scheme based on the hint-learning method and show that this technique helps the thin MobileNets to gain stability and predictive power.
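
The distillation term described above can be sketched as an extra loss that bounds the student's (compressed model's) outputs to the teacher's. For a regression target such as gaze, a simple variant adds a squared distance between student and teacher predictions to the task loss; the weighting and the exact form of the bounding term are assumptions, not the thesis's precise formulation.

```python
import torch
import torch.nn as nn

task_loss_fn = nn.MSELoss()

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Total loss = task loss + bounding term between student and teacher outputs."""
    task = task_loss_fn(student_out, target)        # error w.r.t. ground-truth gaze
    bound = task_loss_fn(student_out, teacher_out)  # stay close to the teacher
    return task + alpha * bound

student_out = torch.rand(8, 2, requires_grad=True)  # predicted (yaw, pitch)
teacher_out = torch.rand(8, 2)                      # frozen teacher predictions
target = torch.rand(8, 2)
loss = distillation_loss(student_out, teacher_out, target)
loss.backward()
```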
18

Smirnov, Dmitriy S. M. Massachusetts Institute of Technology. "Deep learning-based methods for parametric shape prediction." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122770.

Abstract:
Many tasks in graphics and vision demand machinery for converting shapes into representations with sparse sets of parameters; these representations facilitate rendering, editing, and storage. When the source data is noisy or ambiguous, however, artists and engineers often manually construct such representations, a tedious and potentially time-consuming process. While advances in deep learning have been successfully applied to noisy geometric data, the task of generating parametric shapes has so far been difficult for these methods. In this thesis, we consider the task of deep parametric shape prediction from two distinct angles. First, we propose a new framework for predicting parametric shape primitives using distance fields to transition between parameters like control points and input data on a raster grid. We demonstrate efficacy on 2D and 3D tasks, including font vectorization and surface abstraction. Second, we look at the problem of sketch-based modeling. Sketch-based modeling aims to model 3D geometry using a concise and easy to create but extremely ambiguous input: artist sketches. While most conventional sketch-based modeling systems target smooth shapes and put manually-designed priors on the 3D shapes, we present a system to infer a complete man-made 3D shape, composed of parametric surfaces, from a single bitmap sketch. In particular, we introduce our parametric representation as well as several specially designed loss functions. We also propose a data generation and augmentation pipeline for sketch. We demonstrate the efficacy of our system on a gallery of synthetic and real sketches as well as via comparison to related work.
"Supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1122374, the Toyota-CSAIL Joint Research Center, and the Skoltech-MIT Next Generation Program"
by Dmitriy Smirnov.
S.M.
S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
APA, Harvard, Vancouver, ISO, and other styles
19

Boschini, Matteo. "A deep learning-based approach for 3D people tracking." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/11321/.

Abstract:
This thesis extends a software framework for detecting and tracking people in a scene captured by a stereoscopic camera. First, the need for a manual offline calibration of the system is removed by exploiting algorithms that identify, from a frame acquired by the camera, the plane on which the tracked subjects move. In addition, a deep-learning-based software module is introduced with the aim of improving tracking precision. This component, which is able to detect the heads present in a frame, makes it possible to restrict the analyzed data to the neighbourhood of a person's actual position, excluding objects that the tracking algorithm would otherwise tend to identify as people.
20

Jan, Asim. "Deep learning based facial expression recognition and its applications." Thesis, Brunel University, 2017. http://bura.brunel.ac.uk/handle/2438/15944.

Abstract:
Facial expression recognition (FER) is a research area concerned with classifying human emotions through the expressions on the face. It can be used in applications such as biometric security, intelligent human-computer interaction, robotics, and clinical medicine for autism, depression, pain and mental health problems. This dissertation investigates advanced technologies for facial expression analysis and develops artificial intelligence systems for practical applications. The first part of this work applies geometric and texture domain feature extractors along with various machine learning techniques to improve FER. Advanced 2D and 3D facial processing techniques such as Edge Oriented Histograms (EOH) and Facial Mesh Distances (FMD) are then fused together using a framework designed to investigate their individual and combined domain performances. Following these tests, the face is broken down into facial parts using advanced facial alignment and localising techniques. Deep learning in the form of Convolutional Neural Networks (CNNs) is also explored for FER. A novel approach is used for the deep network architecture design, learning the facial parts jointly and showing an improvement over using the whole face. Joint Bayesian is also adapted in the form of metric learning to work with deep feature representations of the facial parts, providing a further improvement over using the deep network alone. Dynamic emotion content is explored as a solution providing richer information than still images. The motion occurring across the content is initially captured using the Motion History Histogram (MHH) descriptor and is critically evaluated. Based on this observation, several improvements are proposed through extensions such as the Average Spatial Pooling Multi-scale Motion History Histogram (ASMMHH). This extension adds two modifications: the first views the content in different spatial dimensions through spatial pooling, influenced by the structure of CNNs; the other captures motion at different speeds. Combined, they provide better performance than MHH and other popular techniques such as Local Binary Patterns - Three Orthogonal Planes (LBP-TOP). Finally, the dynamic emotion content is observed in the feature space, with sequences of images represented as sequences of extracted features. A novel technique called Facial Dynamic History Histogram (FDHH) is developed to capture patterns of variation within the sequence of features, an approach not seen before. FDHH is applied in an end-to-end framework for depression analysis and for evaluating the emotions induced by a large set of video clips from various movies. With the combination of deep learning techniques and FDHH, state-of-the-art results are achieved for depression analysis.
21

Zhang, Yongfeng. "Deep learning and interpolation for featured-based pattern classification." Thesis, Aberystwyth University, 2016. http://hdl.handle.net/2160/bc2f7c5c-28f4-4182-8ed3-3ca1b5bcc618.

Abstract:
Deep machine learning has received significant attention over the past decade, especially for dealing with information that may span large scales. By employing a hierarchical architecture consisting of simple computational nodes with similar characteristics, such a network helps to partition large data structures into relatively smaller, more manageable units, and to discover any dependencies that may exist between the resulting units. However, running this type of layered network to perform tasks such as feature extraction and subsequent feature pattern-based recognition typically involves significant computation. To tackle this problem, two approaches are proposed in this thesis. The first is a novel approach to image classification that integrates deep learning and feature interpolation, supported by advanced learning classification techniques. The recently introduced Deep Spatio-Temporal Inference Network (DeSTIN) is employed to carry out limited original feature extraction. Simple interpolation is then employed to artificially increase the dimensionality of the extracted feature sets for accurate classification, without incurring heavy computational cost. The work is tested on the popular MNIST dataset of handwritten digits, demonstrating the potential of the proposed work. The second approach introduces a substantially simplified 2-layer learning network that exploits unsupervised learning for pattern representation and is capable of extracting effective features efficiently. Experimental results on handwritten digit classification, in comparison with popular deep learning networks, demonstrate that the proposed approach has significant potential for dealing with real-world problems. The generation of effective feature pattern-based classification rules from data is essential to the development of intelligent classifiers that are readily comprehensible to the user. Unfortunately, a sparse rule base may be generated when there is missing information in the experienced dataset. This hinders classification systems that work on such sparse knowledge from effectively performing their tasks in many real-world applications, where complete historical data cannot be assumed. This thesis further proposes an innovative approach that integrates fuzzy rule interpolation within a data-driven classification mechanism, such that conclusions can be approximately derived for a given observation even if no matching rule can be found in a sparse rule base. The proposed technique is conceptually simple, directly exploiting recently developed fuzzy rule interpolation techniques. However, the resulting integrated system offers a powerful means to develop robust classifiers, significantly enhancing the effectiveness of intelligent classification systems, as demonstrated by systematic comparative experimental results and by an application to the challenging problem of mammographic risk analysis.
22

Lim, Steven. "Recommending TEE-based Functions Using a Deep Learning Model." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/104999.

Abstract:
Trusted execution environments (TEEs) are an emerging technology that provides a protected hardware environment for processing and storing sensitive information. By using TEEs, developers can bolster the security of software systems. However, incorporating a TEE into existing software systems can be a costly and labor-intensive endeavor. Software maintenance (changing software after its initial release) is known to contribute the majority of the cost in the software development lifecycle. The first step in making use of a TEE requires that developers accurately identify which pieces of code would benefit from being protected in a TEE. For large code bases, this identification process can be quite tedious and time-consuming. To help reduce the software maintenance costs associated with introducing a TEE into existing software, this thesis introduces ML-TEE, a recommendation tool that uses a deep learning model to classify whether an input function handles sensitive information or sensitive code. By applying ML-TEE, developers can reduce the burden of manual code inspection and analysis. ML-TEE's model was trained and tested on functions from GitHub repositories that use Intel SGX and on an imbalanced dataset. The final model used in the recommendation system has an accuracy of 98.86% and an F1 score of 80.00%. In addition, we conducted a pilot study in which participants were asked to identify functions that needed to be placed inside a TEE in a third-party project. The study found that, on average, participants who had access to the recommendation system's output had a 4% higher accuracy and completed the task 21% faster.
Master of Science
Improving the security of software systems has become critically important. A trusted execution environment (TEE) is an emerging technology that can help secure software that uses or stores confidential information. To make use of this technology, developers need to identify which pieces of code handle confidential information and should thus be placed in a TEE. However, this process is costly and laborious because it requires the developers to understand the code well enough to make the appropriate changes in order to incorporate a TEE. This process can become challenging for large software that contains millions of lines of code. To help reduce the cost incurred in the process of identifying which pieces of code should be placed within a TEE, this thesis presents ML-TEE, a recommendation system that uses a deep learning model to help reduce the number of lines of code a developer needs to inspect. Our results show that the recommendation system achieves high accuracy as well as a good balance between precision and recall. In addition, we conducted a pilot study and found that participants from the intervention group who used the output from the recommendation system managed to achieve a higher average accuracy and perform the assigned task faster than the participants in the control group.
23

Boyd, Joseph. "Deep learning for computational phenotyping in cell-based assays." Thesis, Université Paris sciences et lettres, 2020. https://pastel.archives-ouvertes.fr/tel-02928984.

Abstract:
Computational phenotyping is an emergent set of technologies for systematically studying the role of the genome in eliciting phenotypes, the observable characteristics of an organism and its subsystems. In particular, cell-based assays screen panels of small compound drugs or other modulations of gene expression, and quantify the effects on phenotypic characteristics ranging from viability to cell morphology. High content screening extends the methodologies of cell-based screens to a high content readout based on images, in particular the multiplexed channels of fluorescence microscopy. Screens based on multiple cell lines are suited to differentiating phenotypes across different subtypes of a disease, representing the molecular heterogeneity concerned in the design of precision medicine therapies. These richer biological models underpin a more targeted approach for treating deadly diseases such as cancer. An ongoing challenge for high content screening is therefore the synthesis of the heterogeneous readouts in multi-cell-line screens. Concurrently, deep learning is the established state of the art for image analysis and computer vision applications. However, its role in high content screening is only beginning to be realised. This dissertation spans two problem settings in the high content analysis of cancer cell lines. The contributions are the following: (i) a demonstration of the potential for deep learning and generative models in high content screening; (ii) a deep learning-based solution to the problem of heterogeneity in a multi-cell-line drug screen; and (iii) novel applications of image-to-image translation models as an alternative to the expensive fluorescence microscopy currently required for high content screening.
24

Cabrera, Gil Blanca. "Deep Learning Based Deformable Image Registration of Pelvic Images." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279155.

Abstract:
Deformable image registration is usually performed manually by clinicians, which is time-consuming and costly, or using optimization-based algorithms, which are not always optimal for registering images of different modalities. In this work, a deep learning-based method for MR-CT deformable image registration is presented. First, a neural network is optimized to register CT pelvic image pairs. The model is then trained on MR-CT image pairs to register CT images to match their MR counterparts. To address the unavailability of ground-truth data, two approaches were used. For the CT-CT case, perfectly aligned image pairs were the starting point of our model, and random deformations were generated to create a ground-truth deformation field. For the multi-modal case, synthetic CT images were generated from T2-weighted MR using a CycleGAN model, and synthetic deformations were applied to the MR images to generate ground-truth deformation fields. The synthetic deformations were created by combining a coarse and a fine deformation grid, obtaining a field with deformations of different scales. Several models were trained on images of different resolutions. Their performance was benchmarked against an analytic algorithm used in an actual registration workflow. The CT-CT models were tested using image pairs created by applying synthetic deformation fields. The MR-CT models were tested using two types of test images. The first contained synthetic CT images and MR images deformed by synthetically generated deformation fields. The second test set contained real MR-CT image pairs. Test performance was measured using the Dice coefficient. The CT-CT models obtained Dice scores higher than 0.82, even for the models trained on lower-resolution images. Although all MR-CT models experienced a drop in performance, the biggest decrease came from the analytic method used as a reference, both for synthetic and real test data. This means that the deep learning models outperformed the state-of-the-art analytic benchmark method. Even though the obtained Dice scores would need further improvement to be used in a clinical setting, the results show great potential for using deep learning-based methods for multi- and mono-modal deformable image registration.
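
The ground-truth generation step, which combines a coarse and a fine deformation grid, can be sketched roughly as below: random displacements are drawn on two control grids of different resolutions, upsampled to the image size and summed to give a multi-scale deformation field. Grid sizes and displacement magnitudes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def random_deformation_field(size=(128, 128), coarse=4, fine=16,
                             coarse_mag=8.0, fine_mag=2.0):
    """Synthetic 2-D deformation field from a coarse plus a fine random grid."""
    def grid(n, mag):
        d = (torch.rand(1, 2, n, n) - 0.5) * 2 * mag          # random x/y displacements
        return F.interpolate(d, size=size, mode='bilinear', align_corners=False)
    return grid(coarse, coarse_mag) + grid(fine, fine_mag)     # (1, 2, H, W) in pixels

field = random_deformation_field()
print(field.shape)   # torch.Size([1, 2, 128, 128]); warp an image with it to get a ground-truth pair
```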
25

Manjunatha, Bharadwaj Sandhya. "Land Cover Quantification using Autoencoder based Unsupervised Deep Learning." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/99861.

Full text
Abstract:
This work aims to develop a deep learning model for land cover quantification through hyperspectral unmixing using an unsupervised autoencoder. Land cover identification and classification is instrumental in urban planning, environmental monitoring and land management. With the technological advancements in remote sensing, hyperspectral imagery, which captures high resolution images of the earth's surface across hundreds of wavelength bands, is becoming increasingly popular. The high spectral information in these images can be analyzed to identify the various target materials present in the image scene based on their unique reflectance patterns. An autoencoder is a deep learning model that can perform spectral unmixing by decomposing the complex image spectra into their constituent materials and estimating their abundance compositions. The advantage of using this technique for land cover quantification is that it is completely unsupervised and eliminates the need for labelled data, which generally requires years of field survey and the formulation of detailed maps. We evaluate the performance of the autoencoder on various synthetic and real hyperspectral images consisting of different land covers using similarity metrics and abundance maps. The scalability of the technique with respect to landscapes is assessed by evaluating its performance on hyperspectral images spanning 100 m x 100 m, 200 m x 200 m, 1000 m x 1000 m, 4000 m x 4000 m and 5000 m x 5000 m regions. Finally, we analyze the performance of this technique by comparing it to several supervised learning methods such as Support Vector Machine (SVM), Random Forest (RF) and the multilayer perceptron using F1-score, Precision and Recall metrics, and to other unsupervised techniques such as K-Means, N-Findr, and VCA using cosine similarity, mean square error and estimated abundances. The land cover classification obtained using this technique is compared to the existing United States National Land Cover Database (NLCD) classification standard.
Master of Science
This work aims to develop an automated deep learning model for identifying and estimating the composition of the different land covers in a region using hyperspectral remote sensing imagery. With the technological advancements in remote sensing, hyperspectral imagery, which captures high resolution images of the earth's surface across hundreds of wavelength bands, is becoming increasingly popular. As every surface has a unique reflectance pattern, the high spectral information contained in these images can be analyzed to identify the various target materials present in the image scene. An autoencoder is a deep learning model that can perform spectral unmixing by decomposing the complex image spectra into their constituent materials and estimating their percent compositions. The advantage of this method for land cover quantification is that it is an unsupervised technique that does not require labelled data, which generally takes years of field survey and the formulation of detailed maps to obtain. The performance of this technique is evaluated on various synthetic and real hyperspectral datasets consisting of different land covers. We assess the scalability of the model by evaluating its performance on images of different sizes, spanning from a few hundred square meters to thousands of square meters. Finally, we compare the performance of the autoencoder-based approach with other supervised and unsupervised deep learning techniques and with the current land cover classification standard.
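To make the unmixing idea concrete, the following hypothetical PyTorch sketch (not the thesis code; the layer sizes, band count and endmember count are assumptions) shows an autoencoder whose encoder predicts per-pixel abundances and whose linear decoder holds the endmember spectra as its weights:

    import torch
    import torch.nn as nn

    class UnmixingAutoencoder(nn.Module):
        """Encoder predicts abundances; the linear decoder's weight matrix plays
        the role of the endmember spectra (illustrative sketch only)."""
        def __init__(self, n_bands: int = 200, n_endmembers: int = 5):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(n_bands, 64), nn.ReLU(),
                nn.Linear(64, n_endmembers),
            )
            self.decoder = nn.Linear(n_endmembers, n_bands, bias=False)

        def forward(self, spectra):
            # Softmax keeps abundances non-negative and summing to one per pixel.
            abundances = torch.softmax(self.encoder(spectra), dim=-1)
            reconstruction = self.decoder(abundances)
            return reconstruction, abundances

    model = UnmixingAutoencoder()
    pixels = torch.rand(8, 200)                   # 8 pixels, 200 spectral bands (placeholders)
    recon, abund = model(pixels)
    loss = nn.functional.mse_loss(recon, pixels)  # unsupervised reconstruction objective

Training needs only the image itself; the learned abundances can then serve directly as per-pixel land cover fractions.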
APA, Harvard, Vancouver, ISO, and other styles
26

Feng, Shumin. "Mobile Robot Obstacle Avoidance based on Deep Reinforcement Learning." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/87059.

Full text
Abstract:
Obstacle avoidance is one of the core problems in the field of autonomous navigation. An obstacle avoidance approach is developed for the navigation task of a reconfigurable multi-robot system named STORM, which stands for Self-configurable and Transformable Omni-Directional Robotic Modules. Various mathematical models have been developed in previous work in this field to avoid collisions for such robots. In this work, the proposed collision avoidance algorithm is trained via Deep Reinforcement Learning, which enables the robot to learn by itself from its experiences and then fit a mathematical model by updating the parameters of a neural network. The trained neural network is capable of choosing an action directly based on the input sensor data. A virtual STORM locomotion module was trained to explore a Gazebo simulation environment without collision, using the proposed DRL-based collision avoidance strategies. The mathematical model of the avoidance algorithm was derived from the simulation, then applied to the prototype of the locomotion module and validated via experiments. A universal software architecture was also designed for the STORM modules. The software architecture has extensible and reusable features that improve the design efficiency and enable parallel development.
Master of Science
In this thesis, an obstacle avoidance approach is described to enable autonomous navigation of a reconfigurable multi-robot system, STORM. The Self-configurable and Transformable Omni-Directional Robotic Modules (STORM) is a novel approach towards heterogeneous swarm robotics. The system has two types of robotic modules, namely the locomotion module and the manipulation module. Each module is able to navigate and perform tasks independently. In addition, the systems are designed to autonomously dock together to perform tasks that the modules individually are unable to accomplish. The proposed obstacle avoidance approach is designed for the modules of STORM, but can be applied to mobile robots in general. In contrast to the existing collision avoidance approaches, the proposed algorithm was trained via deep reinforcement learning (DRL). This enables the robot to learn by itself from its experiences, and then fit a mathematical model by updating the parameters of a neural network. In order to avoid damage to the real robot during the learning phase, a virtual robot was trained inside a Gazebo simulation environment with obstacles. The mathematical model for the collision avoidance strategy obtained through DRL was then validated on a locomotion module prototype of STORM. This thesis also introduces the overall STORM architecture and provides a brief overview of the generalized software architecture designed for the STORM modules. The software architecture has expandable and reusable features that apply well to the swarm architecture while allowing for design efficiency and parallel development.
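As a rough illustration of how a DRL policy of this kind maps sensor readings to motion commands, here is a minimal Python sketch (not the STORM implementation; the network size, the 24 range readings and the five discrete actions are assumptions):

    import random
    import torch
    import torch.nn as nn

    # Hypothetical Q-network: laser range readings in, Q-values for discrete motions out.
    q_net = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Linear(64, 5))
    ACTIONS = ["forward", "soft_left", "soft_right", "hard_left", "hard_right"]

    def select_action(ranges, epsilon=0.1):
        """Epsilon-greedy action selection, as typically used while training a DQN-style agent."""
        if random.random() < epsilon:
            return random.randrange(len(ACTIONS))        # explore
        with torch.no_grad():
            q_values = q_net(torch.as_tensor(ranges, dtype=torch.float32))
        return int(torch.argmax(q_values))               # exploit the learned values

    action = select_action([1.5] * 24)
    print(ACTIONS[action])

During training, the network parameters are updated from the robot's experience (state, action, reward, next state), so the "mathematical model" mentioned above is precisely this learned mapping from sensor data to actions.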
APA, Harvard, Vancouver, ISO, and other styles
27

Chen, Hua. "FPGA Based Multi-core Architectures for Deep Learning Networks." University of Dayton / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1449417091.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Chen, Yani. "Deep Learning based 3D Image Segmentation Methods and Applications." Ohio University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1547066297047003.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Wang, Huanyu. "Side-Channel Analysis of AES Based on Deep Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-253755.

Full text
Abstract:
Side-channel attacks avoid complex analysis of cryptographic algorithms; instead, they use side-channel signals captured from a software or hardware implementation of the algorithm to recover its secret key. Recently, deep learning models, especially Convolutional Neural Networks (CNN), have been shown successful in assisting side-channel analysis. The attacker first trains a CNN model on a large set of power traces captured from a device with a known key. The trained model is then used to recover the unknown key from a few power traces captured from a victim device. However, previous work had three important limitations: (1) little attention is paid to the effects of training and testing on traces captured from different devices; (2) the effect of different power models on the attack's efficiency has not been thoroughly evaluated; (3) it is believed that, in order to recover all bytes of a key, the CNN model must be trained as many times as the number of bytes in the key. This thesis aims to address these limitations. First, we show that it is easy to overestimate the attack's efficiency if the CNN model is trained and tested on the same device. Second, we evaluate the effect of two common power models, identity and Hamming weight, on the efficiency of CNN-based side-channel attacks. The results show that the identity power model is more effective under the same training conditions. Finally, we show that it is possible to recover all key bytes using a CNN model trained only once.
Side-channel attacks avoid complex analysis of cryptographic algorithms; instead, they use side-channel signals captured from a software or hardware implementation of the algorithm to recover its secret key. Recently, deep learning models, especially convolutional neural networks (CNN), have been shown to be successful in assisting side-channel analysis. The attacker first trains a CNN model on a large set of power traces captured from a device with a known key. The trained model is then used to recover the unknown key from a few power traces captured from a victim device. However, previous work had three important limitations: (1) little attention is paid to the effects of training and testing on traces captured from different devices; (2) the effect of different power models on the attack's efficiency has not been thoroughly evaluated; (3) it is believed that the CNN model must be trained as many times as there are bytes in the key in order to recover all bytes of a key. This thesis aims to address these limitations. First, we show that it is easy to overestimate the attack's efficiency if the CNN model is trained and tested on the same device. Second, we evaluate the effect of two common power models, identity and Hamming weight, on the efficiency of CNN-based side-channel attacks. The results show that the identity power model is more effective under the same training conditions. Finally, we show that it is possible to recover all key bytes using a CNN model trained only once.
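To clarify the two power models compared above, the sketch below (illustrative only; the S-box table is a stand-in permutation, not the real AES S-box) shows how training labels are derived from a plaintext byte and a key-byte hypothesis under the identity and Hamming-weight models:

    # Identity model: label = S-box output value (256 classes).
    # Hamming-weight model: label = number of set bits in that value (9 classes).

    def hamming_weight(value: int) -> int:
        return bin(value).count("1")

    # Stand-in lookup table; replace with the standard 256-entry AES S-box.
    SBOX = list(range(256))

    def identity_label(plaintext_byte: int, key_byte: int) -> int:
        return SBOX[plaintext_byte ^ key_byte]

    def hw_label(plaintext_byte: int, key_byte: int) -> int:
        return hamming_weight(identity_label(plaintext_byte, key_byte))

    print(identity_label(0x3A, 0x7F), hw_label(0x3A, 0x7F))

The CNN is trained to predict such labels from raw power traces; ranking key-byte hypotheses by the predicted probabilities then recovers the key.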
APA, Harvard, Vancouver, ISO, and other styles
30

Tas, Yusuf. "Deep Learning based Domain Adaptation." Phd thesis, 2021. http://hdl.handle.net/1885/223608.

Full text
Abstract:
Recent advancements in Deep Learning (DL) have helped researchers achieve fascinating results in various areas of Machine Learning (ML) and Computer Vision (CV). Starting with the ingenious approach of [Krizhevsky et al., 2012a], which utilized the processing power of graphics processing units (GPUs) to make training large networks viable in terms of training time, DL has had its place in different ML and CV problems over the years since. Object detection and semantic segmentation [Girshick et al., 2014a; Girshick, 2015; Ren et al., 2015], image super resolution [Dong et al., 2015] and action recognition [Simonyan and Zisserman, 2014a] are a few examples. Over the years, many more new and powerful DL architectures have been proposed: VGG [Simonyan and Zisserman, 2014b], GoogleNet [Szegedy et al., 2015] and ResNet [He et al., 2016] are among the most commonly used network architectures in the literature. Our focus is on the specific task of Supervised Domain Adaptation (SDA) using Deep Learning. SDA is a type of domain adaptation where both target and source domains contain annotated data. Firstly, we look at SDA as a domain alignment problem. We propose a mixture of alignment approach based on second- or higher-order scatter statistics between source and target domains. Each class has two distinct representations, one in the source and one in the target domain. The proposed mixture alignment approach aims to reduce within-class scatter to align the same classes from source and target while maintaining between-class separation. We design and construct a two-stream Convolutional Neural Network (CNN) where one stream receives source data and the second receives target data with matching classes, to implement within-class alignment. We achieve end-to-end training of our two-stream network together with the alignment losses. Next, we propose a new dataset called Open Museum Identification Challenge (Open MIC) for SDA research. The Office dataset [Saenko et al., 2010a] is commonly used in the SDA literature, but one main drawback is that results on it have saturated, reaching over 90% accuracy; its limited number of images is one of the main causes of these high accuracy results. Open MIC aims to provide a large dataset for SDA while posing challenging tasks to be addressed. We also extend our mixture of alignment loss from the Frobenius norm distance to Bregman divergences and the Riemannian metric to learn the alignment in different feature spaces. In the next study, we propose a new representation that encodes 3D body skeleton data into texture-like images by using kernel methods for the Action Recognition problem. We utilize these representations in our SDA two-stream CNN pipeline. We improve our mixture of alignment losses to work with partially overlapping datasets, which lets us use other Action Recognition datasets as additional source domains even if they only partially overlap with the target set. Finally, we move to a more challenging domain adaptation problem: Multimodal Conversation Systems. The Multimodal Dialogue dataset (MMD) [Saha et al., 2018] provides dialogues between a shopper and a retail agent. In these dialogues, the retail agent may also answer with specific retail items such as clothes, shoes, etc.; hence the flow of the conversation is multimodal, where utterances can contain both text and image modalities. Two-level RNN encoders are used to encode a given context of utterances. We propose a new approach to this problem by adapting additional data from external domains.
To improve the text generation capabilities of the model, we utilize the French translation of the target sentences as an additional output target. To improve the image ranking capabilities of the model, we utilize an external dataset and find nearest neighbors of the target positive and negative images. We set up new encoding methods for these nearest neighbors to assign them to the correct target class, positive or negative.
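A minimal sketch of a second-order scatter alignment term of the kind described above (my own illustration, not the thesis code; feature dimensions and batch sizes are placeholders) penalizes the Frobenius distance between per-class covariance matrices of the two streams:

    import torch

    def second_order_alignment(src_feat: torch.Tensor, tgt_feat: torch.Tensor) -> torch.Tensor:
        """Frobenius-norm distance between source and target scatter (covariance) matrices,
        computed on features of one class taken from the two network streams."""
        def covariance(x):
            xc = x - x.mean(dim=0, keepdim=True)
            return xc.t() @ xc / (x.shape[0] - 1)
        return torch.norm(covariance(src_feat) - covariance(tgt_feat), p="fro") ** 2

    src = torch.randn(32, 128)   # source-stream features for one class (placeholder)
    tgt = torch.randn(32, 128)   # target-stream features for the same class (placeholder)
    alignment_loss = second_order_alignment(src, tgt)

Summing such terms over all matched classes and adding them to the classification loss gives an end-to-end trainable two-stream objective; higher-order statistics or other distances (Bregman divergences, the Riemannian metric) can replace the Frobenius term.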
APA, Harvard, Vancouver, ISO, and other styles
31

CHEN, NAN-CEN, and 陳南岑. "Pedestrian Detection based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/r9m564.

Full text
Abstract:
Master's thesis
National Kaohsiung First University of Science and Technology
Master's Program, Department of Information Management
106
In recent years, Deep Learning has made important breakthroughs in image recognition and has attracted widespread attention. Pedestrian detection is one technology within image recognition. In real, complex environments, deep learning solves many practical problems, such as traffic detection systems and home security, which require accurate pedestrian detection. In this thesis, the INRIA Person Dataset, with its original images and corresponding annotation files, is used as the training and testing data. After obtaining the pedestrian positions from the 614 training images, the feature values of each pedestrian state are used for pedestrian detection. Two methods are trained: Faster R-CNN using Google's Inception V2 as the feature extractor, and YOLOv3; each determines whether a region is a pedestrian based on the feature values. Experimental results on the 288 test images show that both methods can be applied in city, beach, and mountain scenes. The accuracy is 96.69% for Faster R-CNN and 93.42% for YOLOv3, both better than the HOG feature combined with an SVM classifier (85.03%), the Haar-like feature combined with an AdaBoost classifier (72.48%), and a CNN model (87.52%).
APA, Harvard, Vancouver, ISO, and other styles
32

HSU, TZU-JEN, and 徐子仁. "Face Recognition Based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/r7py3k.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
106
In the past, the performance of face recognition technology was not ideal because of environmental influences. Today, the impact of environmental factors such as light and shadow on face recognition has largely been overcome by deep learning-based techniques, but the drawbacks are the high computational requirements and the long time needed to train a CNN model. In this paper, a training method is proposed which has relatively low computational requirements and shorter training time while achieving higher accuracy. The model convergence process becomes more stable and the model accuracy is slightly improved due to the modification of the loss function (LMCL) made in this paper. The improved training method also yields a speedup of about 1.8 times in model convergence. The CNN model used in this paper is MobileFaceNet, which is an improvement of MobileNet.
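LMCL, the large margin cosine loss used in CosFace-style training, can be sketched as follows; this is my own minimal PyTorch illustration with an assumed feature dimension, class count, scale s and margin m, not the thesis implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LargeMarginCosineLoss(nn.Module):
        """Cosine logits with a fixed margin subtracted on the target class, then scaled."""
        def __init__(self, feat_dim=128, n_classes=100, s=30.0, m=0.35):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(n_classes, feat_dim))
            self.s, self.m = s, m

        def forward(self, features, labels):
            cosine = F.linear(F.normalize(features), F.normalize(self.weight))
            margin = F.one_hot(labels, cosine.shape[1]).float() * self.m
            return F.cross_entropy(self.s * (cosine - margin), labels)

    criterion = LargeMarginCosineLoss()
    loss = criterion(torch.randn(16, 128), torch.randint(0, 100, (16,)))

The margin forces same-identity features to cluster more tightly on the unit hypersphere, which is what tends to make convergence more stable than with a plain softmax loss.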
APA, Harvard, Vancouver, ISO, and other styles
33

Lin, Wei-Yu, and 林為瑀. "Deep Learning-based Obstacle Depth Estimation." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/gfa5w4.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Computer Science and Engineering
106
Obstacle detection and avoidance are crucial issues in robotics and unmanned vehicles. In these kinds of applications, a front-view camera is usually used as the system's visual input. Due to perspective projection, object depth cannot be recovered from the front-view camera image alone, so most obstacle avoidance systems rely on extra hardware, such as an RGB-D sensor, to obtain depth information. To deal with the loss of object depth information, we modify the existing deep learning-based object detection architecture YOLOv3 by adding an extra object depth prediction module, and then use a pre-processed KITTI dataset to train the proposed unified model for object detection and depth prediction. In addition, we use AirSim to generate simulated aerial images and use them to train and test the proposed unified model, to verify that it can fit different data domains. The experimental results show that our model compares favorably with other depth map prediction methods in terms of object depth accuracy on the pre-processed KITTI dataset. On our AirSim dataset, we find that the extra depth prediction module can boost the object detection performance, achieving higher precision and recall rates, and the model also performs very well on depth prediction.
APA, Harvard, Vancouver, ISO, and other styles
34

long, Lin wen, and 林文龍. "Vehicle Classification Based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/gg295p.

Full text
Abstract:
Master's thesis
Da-Yeh University
Department of Electrical Engineering
106
In this paper, we compare transfer learning and fine-tuning for vehicle classification on a relatively small dataset. For deep learning-based classification tasks, sufficient training data are very important, but sometimes the collection of training data is quite difficult, especially for medical images. Therefore, investigating deep learning-based classification on a relatively small dataset is still valuable. Transfer learning is a method that uses a pre-trained deep convolutional (CONV) neural network to learn patterns from data that it has not seen before, with the network often serving as a feature extractor. Fine-tuning can be considered another type of transfer learning, but its performance is usually better, provided there is sufficient training data. Experimental results show that for transfer learning, the average recall rate is the same, 93%, whether a linear SVM or a Logistic Regression classifier is applied on top of the network architecture. With fine-tuning, the average recall rate can be further increased from 93% to 95%, indicating that fine-tuning outperforms transfer learning on the task of vehicle classification.
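The transfer-learning setup described above, a frozen pre-trained CONV network feeding a shallow classifier, can be sketched as follows (my own illustration with an assumed ResNet-18 backbone and placeholder data, not the paper's exact pipeline; the first call downloads ImageNet weights):

    import torch
    import torchvision.models as models
    from sklearn.linear_model import LogisticRegression

    # Pre-trained CONV network used purely as a frozen feature extractor.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()   # drop the ImageNet classification head
    backbone.eval()

    def extract_features(images: torch.Tensor):
        """images: (N, 3, 224, 224) preprocessed vehicle crops -> (N, 512) features."""
        with torch.no_grad():
            return backbone(images).numpy()

    # Placeholder batch; in practice these would be real vehicle images and labels.
    train_images = torch.randn(8, 3, 224, 224)
    train_labels = [0, 1, 0, 1, 2, 2, 0, 1]
    clf = LogisticRegression(max_iter=1000).fit(extract_features(train_images), train_labels)

Fine-tuning differs only in that the backbone's weights are unfrozen (fully or for the last blocks) and updated together with the new classification head.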
APA, Harvard, Vancouver, ISO, and other styles
35

Lin, Ting-Hsuan, and 林庭萱. "Deep Learning based Gastric Section Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5394043%22.&searchmode=basic.

Full text
Abstract:
Master's thesis
National Chung Hsing University
Department of Computer Science and Engineering
107
To provide accurate histological parameter assessment of each gastric section from endoscopic images, gastric sections need to be correctly identified. In this thesis, we propose a novel ensemble learning method to detect gastric sections from endoscopic images. We fuse features extracted from multiple convolutional neural network (CNN) models, each of which provides an initial decision probability for the endoscopic image. The decision probabilities are concatenated to form a super vector, which is then classified by a feature fusion network. The network uses cross-entropy loss functions based on the fusion networks to achieve more effective gastric section detection. In the experimental results, we compare the proposed method with four state-of-the-art CNN models and conclude that the proposed method achieves the best testing accuracy.
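A minimal sketch of this probability-fusion idea (my own illustration; the number of base models, the number of gastric sections and the fusion-network size are assumptions) looks like this:

    import torch
    import torch.nn as nn

    N_MODELS, N_SECTIONS = 4, 6   # assumed: four base CNNs, six gastric section classes

    class FusionNet(nn.Module):
        """Classify the concatenated per-model probability 'super vector'."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(N_MODELS * N_SECTIONS, 32), nn.ReLU(),
                nn.Linear(32, N_SECTIONS),
            )
        def forward(self, super_vector):
            return self.net(super_vector)

    # Decision probabilities from each base CNN for one image (random placeholders here).
    per_model_probs = [torch.softmax(torch.randn(1, N_SECTIONS), dim=1) for _ in range(N_MODELS)]
    super_vector = torch.cat(per_model_probs, dim=1)

    fusion = FusionNet()
    loss = nn.functional.cross_entropy(fusion(super_vector), torch.tensor([2]))

The fusion network is trained with the cross-entropy loss on such super vectors, so the ensemble learns how much to trust each base model's prediction.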
APA, Harvard, Vancouver, ISO, and other styles
36

Lin, Wei-Lun, and 林維倫. "Botnet Detection Based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5396047%22.&searchmode=basic.

Full text
Abstract:
Master's thesis
National Chung Hsing University
Department of Information Management
107
Botnets have been a serious security problem for a long time, with countless computers infected every year. Common attack methods include distributed denial-of-service attacks, spam, and click fraud. Computers infected by a botnet are not easily noticed by their users, so detecting botnets has become an important issue. Most current approaches are based on network traffic and manually extracted features, but it is easy for an attacker to deliberately avoid such features and escape investigation, and because the latent behaviour of a botnet is not easily detected, prediction accuracy is reduced. The idea of this thesis is to convert network traffic into grayscale images and use deep learning to classify whether a computer is infected, with feature visualization used to assist visual inspection; the aim is to prevent infections beforehand rather than detect them afterwards. We use the CTU dataset, build models on a single malware family using CNN, RNN, and ConvLSTM, and predict other families. The average accuracies reach 91.59%, 90.60%, and 91.82%, respectively. We then inspect the data and adjust the dataset with the help of the visualized feature maps. Finally, retraining the ConvLSTM raises the accuracy to 99.58%.
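The traffic-to-grayscale conversion mentioned above can be sketched roughly as follows (my own illustration; the 32 x 32 image size and the zero-padding choice are assumptions, not the thesis's exact preprocessing):

    import numpy as np

    def flow_to_grayscale(payload: bytes, side: int = 32) -> np.ndarray:
        """Truncate or zero-pad a flow's byte stream and reshape it into a
        side x side grayscale image (uint8), one image per flow."""
        needed = side * side
        buf = payload[:needed].ljust(needed, b"\x00")
        return np.frombuffer(buf, dtype=np.uint8).reshape(side, side)

    image = flow_to_grayscale(b"\x16\x03\x01" * 500)   # placeholder payload bytes
    print(image.shape, image.dtype)                    # (32, 32) uint8, ready for a CNN/ConvLSTM

Each byte becomes one pixel intensity, so malware families with characteristic traffic patterns produce visually distinct textures that a CNN or ConvLSTM can learn to separate.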
APA, Harvard, Vancouver, ISO, and other styles
37

Audretsch, James. "Earthquake Detection using Deep Learning Based Approaches." Thesis, 2020. http://hdl.handle.net/10754/662251.

Full text
Abstract:
Earthquake detection is an important task, focusing on detecting seismic events in past data or in real time from seismic time series. In the past few decades, due to the increasing amount of available seismic data, research in seismic event detection has shown remarkable success using neural networks and other machine learning techniques. However, creating high quality labeled data sets is still a manual process that demands a tremendous amount of time and expert knowledge, and is stifling big data innovation. When compiling a data set, it is unclear how many earthquakes and noise events are mislabeled. Another challenge is how to promote the general applicability of machine learning based models to different geographical regions: models trained on data sets from one location should be applicable to detection at other locations. This thesis explores the most popular deep learning model, the convolutional neural network (CNN), to build a single-location detection model. In addition, we build more robust, generalized earthquake detection models using transfer learning and meta learning. We also introduce a process for generating high quality labeled datasets. Our technique achieves high detection accuracy even on low signal-to-noise-ratio events. The AI techniques explored in this research have the potential to be transferred to other domains that utilize signal processing. There are a myriad of potential applications, with audio processing probably being one of the most directly relevant. Any field that deals with waveforms (e.g. seismic, audio, light) can utilize the developed techniques.
APA, Harvard, Vancouver, ISO, and other styles
38

Po-HungKuo and 郭柏宏. "Image Super Resolution Based on Deep Learning." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/35559245822499458738.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Department of Electrical Engineering
104
We develop two super resolution methods based on different deep learning architectures: the first uses a convolutional restricted Boltzmann machine (CRBM), the second a convolutional neural network (CNN). To accelerate the training procedure, we implement parallelized training algorithms on a GPU. Our experiments reveal that the super resolution performance of our methods is equivalent to that of sparse coding, while our processing speed is much faster.
APA, Harvard, Vancouver, ISO, and other styles
39

Huang, Kuan-Ying, and 黃冠穎. "Vehicle detection system based on deep learning." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/26072999767924796230.

Full text
Abstract:
Master's thesis
National Central University
Department of Electrical Engineering
105
This thesis presents a vehicle detection system based on deep learning. We use two deep learning-based detectors: a vehicle type detector and a plate number detector. The former is customized for model and color classification, and the latter is for License Plate Recognition (LPR). The vehicle type detector is able to predict 100 models and 11 colors found in Taiwan, and it takes a whole image as input without cropping car regions, which is considerably different from most current vehicle type classification methods that use cropped car regions as input. In addition, traditional approaches to the LPR problem are typically broken down into localization, segmentation, and recognition steps. Rather than performing these preprocessing steps, the proposed plate number detector operates directly on plate images with high performance under angular skew, varied lighting, and low-resolution conditions. Considering the need to add new classes to the vehicle type detector in the future, we design an auto-labeling flow to automatically create bounding box labels for training. After obtaining the color, model, and plate number, we can search for the plate number in the registered vehicle database to confirm whether the information is consistent. In this thesis, we develop two user interfaces (UI), one for mobile devices and one for street monitoring. The user can immediately learn whether a car is a stolen vehicle by photographing it with a smartphone camera. Additionally, our system achieves real-time video analysis for street monitoring. Notably, the experimental results show that our method can simultaneously detect all vehicles in one frame, even at skewed angles.
APA, Harvard, Vancouver, ISO, and other styles
40

Tampubolon, Hendrik, and 譚恒力. "Supervised Deep Learning Based for TrafficFlow Prediction." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/xk5aa2.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Institute of Computer Science and Information Engineering
105
In metropolitan areas, common traffic issues include traffic congestion, traffic accidents, air pollution, and energy consumption. To resolve these issues, many researchers have developed Intelligent Transportation Systems (ITS). One of the important sub-systems in the development of ITS is the Traffic Management System (TMS), which attempts to reduce traffic congestion. TMS relies on the estimation of traffic flow, so providing accurate traffic flow prediction is essential, and this is what we aim to do. In this thesis, we propose Supervised Deep Learning Based Traffic Flow Prediction (SDLTFP), a fully-connected deep neural network (FC-DNN). Timely prediction is also a major issue in guaranteeing reliable traffic flow prediction; however, training a deep network can be time-consuming, and overfitting may happen, especially when feeding small data sets into a deep architecture: the network fits the training data perfectly but may fail to generalize to new data. We adopt Batch Normalization (BN) and Dropout techniques to help the network training. We then take advantage of open data as historical traffic data, which are used to predict future traffic flow with the proposed method and model. Experiments show that the Mean Absolute Percentage Error (MAPE) of our traffic flow prediction is within 5% on in-sample data and between 15% and 20% on out-of-sample data. Training the deep network with BN and Dropout is faster and reduces overfitting.
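As a small illustration of the error metric and the network ingredients named above, here is a sketch (my own, with assumed layer sizes and input features, not the SDLTFP configuration) of MAPE and a fully-connected regressor with Batch Normalization and Dropout:

    import numpy as np
    import torch.nn as nn

    def mape(y_true, y_pred):
        """Mean Absolute Percentage Error, the accuracy measure reported above."""
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

    # Hypothetical FC-DNN: 12 input features (e.g. recent flow readings) -> 1 predicted flow.
    model = nn.Sequential(
        nn.Linear(12, 64), nn.BatchNorm1d(64), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(64, 64), nn.BatchNorm1d(64), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(64, 1),
    )

    print(mape([100, 250, 400], [105, 240, 420]))   # about 4.7 (percent)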
APA, Harvard, Vancouver, ISO, and other styles
41

Colaço, Fábio Iúri Gaspar. "Recommender Systems Based on Deep Learning Techniques." Master's thesis, 2020. http://hdl.handle.net/10451/45148.

Full text
Abstract:
Master's thesis in Data Science, Universidade de Lisboa, Faculdade de Ciências, 2020
The current increase in the number of options available when making a decision leaves many individuals feeling overwhelmed, which leads to frustrating and time-consuming user experiences. Recommender systems are fundamental tools for mitigating this, by removing alternatives that are likely to be irrelevant to each individual. Developing these systems presents several challenges, making it a difficult task to accomplish. To this end, several frameworks to facilitate their development have been proposed, helping to reduce development costs by offering reusable tools, as well as implementations of common strategies and popular models. However, it is still difficult to find a framework that also offers complete abstraction over data set conversion, support for deep learning-based approaches, extensible models, and reproducible evaluations. This work introduces DRecPy, a new framework that offers several modules to avoid repetitive development work, but also to assist practitioners with the previously mentioned challenges. DRecPy contains modules to handle: data set loading and conversion tasks; splitting data sets for model training, validation, and testing; sampling data points through distinct strategies; creating complex and extensible recommender systems by following a defined but flexible model structure; together with several evaluation procedures that produce deterministic results by default. To evaluate this new framework, its consistency is analyzed by comparing the results it produces with the results published in the literature. To show that DRecPy can be a valuable tool for the recommender systems community, several characteristics are also evaluated and compared with existing tools, such as extensibility, reusability, and reproducibility.
The current increase in available options makes individuals feel overwhelmed whenever facing a decision, resulting in a frustrating and time-consuming user experience. Recommender systems are a fundamental tool to solve this issue, filtering out the options that are most likely to be irrelevant for each person. Developing these systems presents us with a vast number of challenges, making it a difficult task to accomplish. To this end, various frameworks to aid their development have been proposed, helping reduce development costs by offering reusable tools, as well as implementations of common strategies and popular models. However, it is still hard to find a framework that also provides full abstraction over data set conversion, support for deep learning-based approaches, extensible models, and reproducible evaluations. This work introduces DRecPy, a novel framework that not only provides several modules to avoid repetitive development work, but also assists practitioners with the above challenges. DRecPy contains modules to deal with: data set import and conversion tasks; splitting data sets for model training, validation, and testing; sampling data points using distinct strategies; creating extensible and complex recommenders, by following a defined but flexible model structure; together with many evaluation procedures that provide deterministic results by default. To evaluate this new framework, its consistency is analyzed by comparing the results generated by DRecPy against the results published by others using the same algorithms. Also, to show that DRecPy can be a valuable tool for the recommender systems' community, several framework characteristics are evaluated and compared against existing tools, such as extensibility, reusability, and reproducibility.
APA, Harvard, Vancouver, ISO, and other styles
42

Kuo, Chieh-Ming, and 郭介銘. "Deep Learning Based Facial Expression Recognition System." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/n2x2p2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Wey, Shin-Yu, and 魏心郁. "Glaucoma Detection System Based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/v3jzjk.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Chang, Wei-Cheng, and 張為誠. "Deep Learning Based Style Transfer for Videos." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/fy564u.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Multimedia Engineering
107
Neural style transfer is usually best suited to abstract styles. When used with styles such as Japanese animation, whose foregrounds are more complex than their backgrounds, the results are often not as good as expected. We design a method to automatically transfer this type of style to videos. We combine semantic segmentation and spatial control to transfer the specified style to the specified area. By designing the initial image and the loss function, we fix the distortion of faces and the incomplete style transfer. We propose a method that gives users the ability to adjust the feature weights of different regions to maintain the artistic conception of the target style, and we also incorporate optical flow to ensure frame-to-frame coherence in the video.
APA, Harvard, Vancouver, ISO, and other styles
45

Chung, Lung-Yang, and 鍾隆揚. "Deep Learning Based Indoor Localization and Mapping." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/388s7a.

Full text
Abstract:
Master's thesis
National Chung Hsing University
Department of Electrical Engineering
107
Nowadays almost everyone has a smartphone, and we can use services provided by Google whenever we need localization and maps. However, GPS signals cannot be received indoors, so Google Maps cannot be used in this situation. Deep learning has had great success in the computer vision field (for example, image classification and object detection). This thesis proposes a method that uses deep learning to solve the indoor localization and mapping problem. We split it into two sub-tasks and solve them individually with two deep learning models. To evaluate our models, we experiment with different datasets. On the real-world dataset, the average error of the localization model is 0.59 m and that of the mapping model is 0.65 m.
APA, Harvard, Vancouver, ISO, and other styles
46

LAI, CHUAN-PENG, and 賴傳鵬. "Writing Recognition System Based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/s36wmz.

Full text
Abstract:
Master's thesis
Ming Chuan University
Master's Program, Department of Electronic Engineering
107
With the vigorous development of smartphones and smart car systems, human-computer interfaces (HCI) have gradually changed, with traditional keyboard text input gradually being replaced by handwriting tablets and touch screens. Sometimes the user's environment is not suitable for entering text directly through traditional input devices. In this paper, a device-free writing recognition system is proposed, which uses wireless signals to identify the user's writing actions as an input modality without touching any input device. In the environment, wireless network access points (AP) and wireless network cards are set up as transceivers of wireless signals to measure channel state information (CSI). The channel states measured during writing actions are used as action information, and their characteristics are learned by Deep Learning (DL) to recognize the written characters. We constructed two handwriting recognition setups, measuring the writing of Arabic numerals with 2.4 GHz Wi-Fi signals and uppercase English letters with 5 GHz Wi-Fi signals. The experimental results show that the 1D-CNN achieves the best accuracy: 93.86% for Arabic numerals and 96.98% for uppercase English letters.
APA, Harvard, Vancouver, ISO, and other styles
47

Wang, Hsuan-Yin, and 王炫尹. "PCB Defects Detection Based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/u379ep.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
107
The annual output value of the printed circuit board (PCB) industry is more than 21 billion US dollars, which implies that the quantity of PCBs produced per year is extremely large. However, the yield rate of PCBs is limited, and if defective PCBs cannot be detected and discarded in the early stages of producing an electronic system, they will lead to a large amount of profit loss. Nowadays, many high-speed automatic optical inspection systems can be used to classify defective PCBs. However, a closer inspection of the discarded PCBs reveals that almost 70% of them are actually misclassified. In this work, we develop an accurate PCB defect re-identification system based on deep learning techniques. We tested the performance of ResNet (Residual Network), DenseNet (Densely Connected Convolutional Network), GoogLeNet (Google Inception Net), and EFMNet (Extremal Feature Map Network), which we developed. A 98% PCB defect re-identification accuracy is achieved. The developed system can dramatically reduce the false-defect rate.
APA, Harvard, Vancouver, ISO, and other styles
48

TSAI, CHIUNG-CHENG, and 蔡炯誠. "Nighttime Pedestrian Detection Based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/rwc3u3.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Graduate Institute of Automation Technology
107
The purpose of pedestrian detection is to identify and locate people in a dynamic scene or environment. Most of today's common pedestrian detection systems work on visible light, and their drawback is that detection accuracy is susceptible to the light sources in the scene. Due to insufficient brightness at dusk and at night, much noise appears in the image, making obstacles difficult to recognize in visible-light images. Therefore, thermal-image detection systems have been developed. Unlike visible imaging, thermal imaging depends on the thermal infrared radiation of objects; the temperature information in the image is relative, with darker parts of a thermal image corresponding to lower temperatures and brighter parts to higher temperatures, which gives it the advantage of distinguishing the human body from a cold background. This thesis mainly addresses the problems of pedestrian segmentation and occlusion. After pedestrian detection is performed on thermal images with the Faster R-CNN architecture, the predicted boxes provide accurate positioning of the target instances. Within each RoI, crowd instances are segmented to distinguish individual people and to handle multi-object tracking when targets cannot be recognized after occlusion. A Kalman filter is used to predict and track each target trajectory, and the Mahalanobis distance between the prediction and the actual detections is used to filter out low-probability matches and select the sample that is most likely the target object. Under the COCO evaluation protocol, the results are better than those of other methods.
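The Mahalanobis gating step described above can be sketched as follows (my own illustration; the 4-D box state, the covariance values and the 95% chi-square gate of about 9.49 are assumptions, not the thesis's exact settings):

    import numpy as np

    def mahalanobis_squared(detection, predicted_mean, predicted_cov):
        """Squared Mahalanobis distance between a detection and a Kalman prediction."""
        diff = np.asarray(detection, dtype=float) - np.asarray(predicted_mean, dtype=float)
        return float(diff @ np.linalg.inv(predicted_cov) @ diff)

    GATE_95 = 9.4877   # chi-square 0.95 quantile for 4 degrees of freedom
    pred_mean = np.array([120.0, 80.0, 40.0, 90.0])   # predicted x, y, w, h of the tracked box
    pred_cov = np.diag([25.0, 25.0, 16.0, 16.0])      # predicted uncertainty (placeholder)
    detection = np.array([126.0, 84.0, 42.0, 88.0])   # a candidate detection in the next frame

    d2 = mahalanobis_squared(detection, pred_mean, pred_cov)
    print(d2, d2 <= GATE_95)   # associate the detection with the track only if it passes the gate

Detections falling outside the gate are treated as unlikely matches, which is how low-probability associations are filtered out before the tracker is updated.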
APA, Harvard, Vancouver, ISO, and other styles
49

Zhou, Wen-Zhi, and 周文志. "Weld Automatic Extraction Based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/z9x2un.

Full text
Abstract:
Master's thesis
Yuan Ze University
Department of Mechanical Engineering
107
Welding is the main processing method used to join metal workpieces in industrial systems, and automatic extraction of the weld track is the key to welding automation. In recent years, due to rapid progress in hardware and software technology, deep learning has been widely used and has achieved good results. This thesis proposes a deep learning-based method for automatically extracting the weld seam trajectory from point cloud data. By accurately registering the point cloud of the workpiece to the corresponding point cloud generated from its CAD model, the precise location of the CAD model's weld seam relative to the workpiece can be obtained. Finally, the weld seam trajectory of the workpiece is extracted using a fast nearest point search method. The research contents are as follows. (1) Obtaining point cloud data: build a workpiece point cloud acquisition platform with a laser scanning sensor and related hardware, use a stepper motor to move the sensor and scan the workpiece to obtain three-dimensional point cloud data of the workpiece surface, and obtain the CAD model and CAD weld point clouds from existing three-dimensional software. (2) Coarse positioning of the workpiece weld track: to improve registration efficiency, the workpiece point cloud and the CAD point cloud are pre-processed to reduce the number of data points; a point cloud registration algorithm based on deep learning is then used to register the workpiece point cloud and the CAD model point cloud. In other words, the initial position of the CAD weld seam relative to the workpiece is pre-adjusted to obtain the approximate position of the workpiece's weld seam track. (3) Accurate positioning of the workpiece weld track: because the workpiece point cloud contains a large number of redundant points, a feature point cloud extraction method based on surface curvature is proposed; an improved ICP algorithm is then used to accurately register the workpiece's feature point cloud with the pre-adjusted CAD weld point cloud, obtaining the precise pose of the CAD weld point cloud relative to the workpiece point cloud. (4) Extraction of the workpiece weld track: using a fast nearest point search algorithm based on a KD-Tree, the nearest point on the workpiece point cloud is found for each point of the pose-adjusted CAD weld point cloud, and these points form the workpiece weld track. For error analysis, the distances between the coarsely adjusted and the precisely adjusted CAD weld point clouds and the nearest points of the workpiece weld point cloud extracted from each are calculated. The error analysis results demonstrate that the extracted workpiece weld point cloud data basically meet industrial requirements. Finally, the efficiency and registration accuracy of the proposed registration algorithm are verified by registration experiments. Keywords: Weld Extraction, Point Cloud Registration, Deep Learning, Feature Extraction
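Step (4), the KD-Tree based nearest point search, can be sketched as follows (my own illustration with random placeholder clouds; the point counts and the use of SciPy's cKDTree are assumptions, not the thesis code):

    import numpy as np
    from scipy.spatial import cKDTree

    # Placeholder clouds: the scanned workpiece points and the registered CAD weld points.
    workpiece_points = np.random.rand(50_000, 3)
    cad_weld_points = np.random.rand(200, 3)

    # Build a KD-tree over the workpiece cloud, then query the nearest scan point
    # for every point of the pose-adjusted CAD weld polyline.
    tree = cKDTree(workpiece_points)
    distances, indices = tree.query(cad_weld_points, k=1)

    weld_trajectory = workpiece_points[indices]   # extracted weld track on the workpiece
    mean_error = float(distances.mean())          # distance statistic used for error analysis

The KD-tree reduces each nearest-point query from a linear scan of the whole scan cloud to a logarithmic search, which is what makes the final extraction step fast even for dense scans.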
APA, Harvard, Vancouver, ISO, and other styles
50

Lai, Yi-Chung, and 賴易鍾. "Dynamic Action Recognition Based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/xcwdgy.

Full text
APA, Harvard, Vancouver, ISO, and other styles