Dissertations / Theses on the topic 'Model-Based Deep Learning'


Consult the top 43 dissertations / theses for your research on the topic 'Model-Based Deep Learning.'


1

Matsoukas, Christos. "Model Distillation for Deep-Learning-Based Gaze Estimation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-261412.

Full text
Abstract:
With the recent advances in deep learning, gaze estimation models have reached levels of predictive accuracy that could not be achieved with older techniques. Nevertheless, deep learning relies on computationally and memory-expensive algorithms, which prevents their integration into embedded systems. This work aims to tackle this problem by boosting the predictive power of small networks using a model compression method called "distillation". Under the concept of distillation, we introduce an additional term to the compressed model's total loss: a bounding term between the compressed model (the student) and a powerful one (the teacher). We show that the distillation method introduces something more than noise to the compressed model, namely the teacher's inductive bias, which helps the student reach a better optimum thanks to the adaptive error deduction. Furthermore, we show that the MobileNet family exhibits unstable training phases, and we report that the distilled MobileNet25 slightly outperformed MobileNet50. Moreover, we try newly proposed training schemes to increase the predictive power of small and thin networks, and we infer that extremely thin architectures are hard to train. Finally, we propose a new training scheme based on the hint-learning method and show that this technique helps the thin MobileNets gain stability and predictive power.
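The loss structure the abstract describes can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: the function name, the mean-squared-error form of both terms, and the `weight` parameter are assumptions.

```python
def distillation_loss(student_pred, teacher_pred, target, weight=0.5):
    """Student's total loss: a task term against the ground truth plus a
    bounding term pulling the student toward the teacher's predictions."""
    task = sum((s - t) ** 2 for s, t in zip(student_pred, target)) / len(target)
    bound = sum((s - t) ** 2 for s, t in zip(student_pred, teacher_pred)) / len(teacher_pred)
    return task + weight * bound

# Example: a 2-D gaze prediction (yaw, pitch)
loss = distillation_loss([0.1, 0.2], [0.12, 0.18], [0.0, 0.25])
```

The bounding term is what carries the teacher's inductive bias: even when the student matches the labels poorly, it is nudged toward the teacher's (smoother) predictions.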
APA, Harvard, Vancouver, ISO, and other styles
2

Lim, Steven. "Recommending TEE-based Functions Using a Deep Learning Model." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/104999.

Full text
Abstract:
Trusted execution environments (TEEs) are an emerging technology that provides a protected hardware environment for processing and storing sensitive information. By using TEEs, developers can bolster the security of software systems. However, incorporating a TEE into existing software systems can be a costly and labor-intensive endeavor. Software maintenance (changing software after its initial release) is known to contribute the majority of the cost in the software development lifecycle. The first step in making use of a TEE requires that developers accurately identify which pieces of code would benefit from being protected in a TEE. For large code bases, this identification process can be quite tedious and time-consuming. To help reduce the software maintenance costs associated with introducing a TEE into existing software, this thesis introduces ML-TEE, a recommendation tool that uses a deep learning model to classify whether an input function handles sensitive information or sensitive code. By applying ML-TEE, developers can reduce the burden of manual code inspection and analysis. ML-TEE's model was trained and tested on an imbalanced dataset of functions from GitHub repositories that use Intel SGX. The final model used in the recommendation system achieves an accuracy of 98.86% and an F1 score of 80.00%. In addition, we conducted a pilot study in which participants were asked to identify functions that needed to be placed inside a TEE in a third-party project. The study found that, on average, participants who had access to the recommendation system's output had a 4% higher accuracy and completed the task 21% faster.
Master of Science
Improving the security of software systems has become critically important. A trusted execution environment (TEE) is an emerging technology that can help secure software that uses or stores confidential information. To make use of this technology, developers need to identify which pieces of code handle confidential information and should thus be placed in a TEE. However, this process is costly and laborious because it requires the developers to understand the code well enough to make the appropriate changes in order to incorporate a TEE. This process can become challenging for large software that contains millions of lines of code. To help reduce the cost incurred in the process of identifying which pieces of code should be placed within a TEE, this thesis presents ML-TEE, a recommendation system that uses a deep learning model to help reduce the number of lines of code a developer needs to inspect. Our results show that the recommendation system achieves high accuracy as well as a good balance between precision and recall. In addition, we conducted a pilot study and found that participants from the intervention group who used the output from the recommendation system managed to achieve a higher average accuracy and perform the assigned task faster than the participants in the control group.
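Why the abstract reports both accuracy and F1 on an imbalanced dataset can be seen by computing them from confusion-matrix counts. A hedged sketch (the function name and example counts are illustrative, not from the thesis):

```python
def accuracy_f1(tp, fp, tn, fn):
    """Accuracy and F1 from confusion-matrix counts. On imbalanced data,
    accuracy can be very high while F1 stays much lower, so both are needed."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Imbalanced example: few sensitive functions among many benign ones
acc, f1 = accuracy_f1(tp=8, fp=2, tn=980, fn=2)
```

Here accuracy exceeds 99% while F1 is only 0.8, mirroring the gap between ML-TEE's reported 98.86% accuracy and 80.00% F1.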
APA, Harvard, Vancouver, ISO, and other styles
3

Hellström, Terese. "Deep-learning based prediction model for dose distributions in lung cancer patients." Thesis, Stockholms universitet, Fysikum, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-196891.

Full text
Abstract:
Background To combat one of the leading causes of death worldwide, lung cancer treatment techniques and modalities are advancing, and the treatment options are becoming increasingly individualized. Modern cancer treatment includes the option for the patient to be treated with proton therapy, which can in some cases spare healthy tissue from excessive dose better than conventional photon radiotherapy. However, to assess the benefit of proton therapy compared to photon therapy, it is necessary to make both treatment plans to get information about the Tumour Control Probability (TCP) and the Normal Tissue Complication Probability (NTCP). This requires excessive treatment planning time and increases the workload for planners.  Aim This project aims to investigate the possibility for automated prediction of the treatment dose distribution using a deep learning network for lung cancer patients treated with photon radiotherapy. This is an initial step towards decreasing the overall planning time and would allow for efficient estimation of the NTCP for each treatment plan and lower the workload of treatment planning technicians. The purpose of the current work was also to understand which features of the input data and training specifics were essential for producing accurate predictions.  Methods Three different deep learning networks were developed to assess the difference in performance based on the complexity of the input for the network. The deep learning models were applied for predictions of the dose distribution of lung cancer treatment and used data from 95 patient treatments. The networks were trained with a U-net architecture using input data from the planning Computed Tomography (CT) and volume contours to produce an output of the dose distribution of the same image size. 
The network performance was evaluated based on the error of the predicted mean dose to Organs At Risk (OAR) as well as the shape of the predicted Dose-Volume Histogram (DVH) and individual dose distributions.  Results  The optimal input combination was the CT scan together with the lung, mediastinum envelope and Planning Target Volume (PTV) contours. The model predictions showed a homogeneous dose distribution over the PTV with a steep fall-off seen in the DVH. However, the dose distributions had a blurred appearance, and the predicted doses to the OARs were therefore not as accurate as the doses to the PTV when compared to the manual treatment plans. The network trained with the Hounsfield Unit input of the CT scan performed similarly to the network trained without it.  Conclusions  As one of the novel attempts to assess the potential of a deep learning-based prediction model for the dose distribution based on minimal input, this study shows promising results. To develop this kind of model further, a larger dataset would be needed, and the training method could be extended to a generative adversarial network or a more developed U-net architecture.
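The mean-dose-to-OAR error used for evaluation is simple to state. A minimal sketch, assuming a flattened 1-D stand-in for the 3-D dose grid and an organ mask; the function name is hypothetical:

```python
def mean_dose_error(predicted, reference, organ_mask):
    """Mean-dose error for one organ at risk: difference between the average
    predicted and reference dose over the voxels flagged in the organ mask."""
    pred_vals = [d for d, m in zip(predicted, organ_mask) if m]
    ref_vals = [d for d, m in zip(reference, organ_mask) if m]
    return sum(pred_vals) / len(pred_vals) - sum(ref_vals) / len(ref_vals)

# Flattened dose grids (Gy); the mask marks the OAR's voxels
err = mean_dose_error([10, 20, 30, 40], [12, 18, 30, 40], [1, 1, 0, 0])
```

A blurred predicted distribution can still score well on this metric (errors average out over the organ), which is why the DVH shape was inspected as well.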
APA, Harvard, Vancouver, ISO, and other styles
4

Li, Mengtong. "An intelligent flood evacuation model based on deep learning of various flood scenarios." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263634.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Karlsson, Axel, and Bohan Zhou. "Model-Based versus Data-Driven Control Design for LEACH-based WSN." Thesis, KTH, Maskinkonstruktion (Inst.), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-272197.

Full text
Abstract:
In relation to the increasing interest in implementing smart cities, deployment of widespread wireless sensor networks (WSNs) has become a hot topic. Among the application's greatest challenges, there is still progress to be made concerning energy consumption and quality of service. Consequently, this project aims to explore a series of feasible solutions to improve the energy efficiency of data aggregation by the WSN. It does so by strategically adjusting the position of the receiving base station and the packet rate of the WSN nodes. Additionally, the low-energy adaptive clustering hierarchy (LEACH) protocol is coupled with the WSN state of charge (SoC). For this thesis, a WSN was defined as a two-dimensional area which contains sensor nodes and a mobile sink, i.e. a movable base station. Following rigorous analysis of the WSN data clustering principles and system-wide dynamics, two different development strategies, model-based and data-driven design, were employed to develop two corresponding control approaches for WSN energy management: model predictive control and reinforcement learning. To test their performance, a simulation environment including the extended LEACH protocol was developed in Python. The amount of data transmitted per energy unit is adopted as the index to estimate control performance. The simulation results show that the model-based controller was able to aggregate over 22% more bits than the LEACH protocol alone. The data-driven controller performed worse than the LEACH network, but showed potential for smaller WSNs containing fewer nodes. Nonetheless, the extension of the LEACH protocol did not give rise to an obvious improvement in energy efficiency, due to a wide range of differing results.
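For context, the cluster-head election rule of the standard LEACH protocol (which the thesis extends with state-of-charge information) can be sketched as follows. This shows only the classic threshold formula, not the thesis's SoC extension; the function name is illustrative.

```python
def leach_threshold(p, round_number, was_cluster_head):
    """Standard LEACH election threshold T(n): each round, a node becomes
    cluster head if a uniform random draw falls below T(n). p is the desired
    cluster-head fraction; a node that already served during the current
    epoch of 1/p rounds is excluded (threshold 0)."""
    if was_cluster_head:
        return 0.0
    return p / (1 - p * (round_number % int(1 / p)))

# With p = 0.1 the threshold grows over the 10-round epoch, so every
# eligible node eventually serves as cluster head once per epoch.
t_first = leach_threshold(0.1, 0, False)
t_last = leach_threshold(0.1, 9, False)
```

Rotating the energy-hungry cluster-head role this way is what the baseline protocol offers; the thesis's controllers then tune sink position and packet rates on top of it.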
APA, Harvard, Vancouver, ISO, and other styles
6

Lai, Khai Ping. "A deep learning model for automatic image texture classification: Application to vision-based automatic aircraft landing." Thesis, Queensland University of Technology, 2016. https://eprints.qut.edu.au/97992/4/Khai_Ping_Lai_Thesis.pdf.

Full text
Abstract:
This project aims to investigate a robust Deep Learning architecture to classify different types of textural imagery. The findings will eventually be part of a central processing algorithm used for Automatic Image Classification for Automatic Aircraft Landing.
APA, Harvard, Vancouver, ISO, and other styles
7

Keisala, Simon. "Using a Character-Based Language Model for Caption Generation." Thesis, Linköpings universitet, Interaktiva och kognitiva system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-163001.

Full text
Abstract:
Using AI to automatically describe images is a challenging task. The aim of this study has been to compare the use of character-based language models with one of the current state-of-the-art token-based language models, im2txt, to generate image captions, with a focus on morphological correctness. Previous work has shown that character-based language models are able to outperform token-based language models in morphologically rich languages. Other studies show that simple multi-layered LSTM blocks are able to learn to replicate the syntax of their training data. To study the usability of character-based language models, an alternative model based on TensorFlow im2txt has been created. The model changes the token-generation architecture to handle character-sized tokens instead of word-sized tokens. The results suggest that a character-based language model could outperform the current token-based language models, although due to time and computing power constraints this study fails to draw a clear conclusion. A problem with one of the methods, subsampling, is discussed: when the original method is used on character-sized tokens, it removes characters (including special characters) instead of full words. To solve this issue, a two-phase approach is suggested, where the training data is first separated into word-sized tokens, on which subsampling is performed; the remaining tokens are then separated into character-sized tokens. Future work in which the modified subsampling and fine-tuning of the hyperparameters are performed is suggested to reach a clearer conclusion on the performance of character-based language models.
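The two-phase subsampling fix can be sketched directly. A minimal illustration, assuming the word2vec-style keep probability min(1, sqrt(t / f(w))); the function name and the space marker between words are my choices, not the thesis's:

```python
import math
import random

def two_phase_subsample(text, counts, total, threshold=1e-3, seed=0):
    """Phase 1: subsample frequent *words* (word2vec-style keep probability),
    so characters are never dropped individually. Phase 2: split the surviving
    words into character-sized tokens, with a space marker between words."""
    rng = random.Random(seed)
    kept_words = []
    for word in text.split():
        freq = counts[word] / total
        keep_prob = min(1.0, math.sqrt(threshold / freq)) if freq > 0 else 1.0
        if rng.random() < keep_prob:
            kept_words.append(word)
    return [ch for word in kept_words for ch in word + " "]
```

Because dropping happens before the character split, special characters inside rare words survive, which is exactly the property the original character-level subsampling lacked.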
APA, Harvard, Vancouver, ISO, and other styles
8

Ma, Xiren. "Deep Learning-Based Vehicle Recognition Schemes for Intelligent Transportation Systems." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42247.

Full text
Abstract:
With the increasingly highlighted security concerns in Intelligent Transportation Systems (ITS), Vision-based Automated Vehicle Recognition (VAVR) has attracted considerable attention recently. A comprehensive VAVR system contains three components: Vehicle Detection (VD), Vehicle Make and Model Recognition (VMMR), and Vehicle Re-identification (VReID). These components perform coarse-to-fine recognition tasks in three steps. The VAVR system can be widely used in suspicious vehicle recognition, urban traffic monitoring, and automated driving systems. Vehicle recognition is complicated by the subtle visual differences between different vehicle models. Therefore, how to build a VAVR system that can recognize vehicle information quickly and accurately has gained tremendous attention. In this work, by taking advantage of emerging deep learning methods, which have powerful feature extraction and pattern learning abilities, we propose several models for vehicle recognition. First, we propose a novel Recurrent Attention Unit (RAU) to expand the standard Convolutional Neural Network (CNN) architecture for VMMR. RAU learns to recognize the discriminative parts of a vehicle on multiple scales and builds up a connection with the prominent information in a recurrent way. The proposed ResNet101-RAU achieves excellent recognition accuracy of 93.81% on the Stanford Cars dataset and 97.84% on the CompCars dataset. Second, to construct efficient vehicle recognition models, we simplify the structure of RAU and propose a Lightweight Recurrent Attention Unit (LRAU). The proposed LRAU extracts discriminative part features by generating attention masks to locate the keypoints of a vehicle (e.g., logo, headlights). The attention mask is generated based on the feature maps received by the LRAU and the preceding attention state generated by the preceding LRAU. Then, by adding LRAUs to standard CNN architectures, we construct three efficient VMMR models.
Our models achieve state-of-the-art results with 93.94% accuracy on the Stanford Cars dataset, 98.31% accuracy on the CompCars dataset, and 99.41% on the NTOU-MMR dataset. In addition, we construct a one-stage Vehicle Detection and Fine-grained Recognition (VDFG) model by combining our LRAU with a general object detection model. Results show the proposed VDFG model achieves excellent performance with real-time processing speed. Third, to address the VReID task, we design the Compact Attention Unit (CAU). CAU has a compact structure, and it relies on a single attention map to extract the discriminative local features of a vehicle. We add two CAUs to a truncated ResNet to construct a small but efficient VReID model, ResNetT-CAU. Compared with the original ResNet, the model size of ResNetT-CAU is reduced by 60%. Extensive experiments on the VeRi and VehicleID datasets indicate the proposed ResNetT-CAU achieves the best re-identification results on both datasets. In summary, the experimental results on the challenging benchmark VMMR and VReID datasets indicate that our models achieve the best VMMR and VReID performance, with a small model size and fast image processing speed.
APA, Harvard, Vancouver, ISO, and other styles
9

Liu, Rongrong. "Multispectral images-based background subtraction using Codebook and deep learning approaches." Thesis, Bourgogne Franche-Comté, 2020. http://www.theses.fr/2020UBFCA013.

Full text
Abstract:
This dissertation aims to investigate the use of multispectral images for moving object detection via background subtraction, with both classical and deep learning-based methods. As an efficient and representative classical algorithm for background subtraction, the traditional Codebook has first been extended to the multispectral case. In order to make the algorithm reliable and robust, a self-adaptive mechanism to select optimal parameters has then been proposed. In this frame, new criteria in the matching process are employed and new techniques to build the background model are designed, including a box-based Codebook, a dynamic Codebook, and a fusion strategy. The last contribution investigates the potential benefit of using multispectral images via convolutional neural networks. Based on the impressive algorithm FgSegNet_v2, the major contributions of this part lie in two aspects: (1) extracting three channels out of seven in the FluxData FD-1665 multispectral dataset to match the number of input channels of the deep model, and (2) proposing a new convolutional encoder that utilizes all the available multispectral channels to further exploit the information in multispectral images.
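The Codebook idea at per-pixel level can be sketched as follows. This is a deliberately simplified match criterion (a widened per-channel min/max band); the thesis's actual Codebook uses colour-distortion and brightness tests extended to all spectral channels, and the names here are illustrative.

```python
def matches_codeword(pixel, codeword, brightness_tol=0.2):
    """A pixel matches a background codeword when every spectral channel
    falls inside the codeword's (min, max) band, widened by a tolerance."""
    for value, (low, high) in zip(pixel, codeword):
        if not (low * (1 - brightness_tol) <= value <= high * (1 + brightness_tol)):
            return False
    return True

def is_foreground(pixel, codebook):
    """A pixel belongs to a moving object when no background codeword matches."""
    return not any(matches_codeword(pixel, cw) for cw in codebook)

# One pixel's codebook: per-channel (min, max) over three spectral bands
codebook = [[(100, 110), (50, 60), (200, 210)]]
```

Each pixel keeps its own small codebook of background appearances, so multimodal backgrounds (e.g. swaying vegetation) are represented by several codewords.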
APA, Harvard, Vancouver, ISO, and other styles
10

Rossi, Alex. "Self-supervised information retrieval: a novel approach based on Deep Metric Learning and Neural Language Models." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
Most existing open-source search engines utilize keyword- or tf-idf-based techniques to find documents and web pages relevant to an input query. Although these methods, with the help of page rank or knowledge graphs, have proved effective in some cases, they often fail to retrieve relevant instances for more complicated queries that require semantic understanding. In this thesis, a self-supervised information retrieval system based on transformers is employed to build a semantic search engine over the library of the Gruppo Maggioli company. Semantic search, or search with meaning, refers to understanding the query instead of simply finding word matches and, in general, represents knowledge in a way suitable for retrieval. We chose to investigate a new self-supervised strategy to handle the training of unlabeled data, based on the creation of pairs of 'artificial' queries and their respective positive passages. We claim that by removing the reliance on labeled data, we may use the large volume of unlabeled material on the web without being limited to languages or domains where labeled data is abundant.
APA, Harvard, Vancouver, ISO, and other styles
11

Vellala, Abhinay. "Genre-based Video Clustering using Deep Learning : By Extraction feature using Object Detection and Action Recognition." Thesis, Linköpings universitet, Statistik och maskininlärning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176942.

Full text
Abstract:
Social media has become an integral part of the Internet, with users across the world sharing content such as images, texts, and videos. A huge amount of data is being generated, and it has become a challenge for social media platforms to group this content for further usage, such as recommending a video. In particular, grouping videos by similarity requires extracting features. This thesis investigates potential approaches to extract features that can help determine the similarity between videos. Features of the given videos are extracted using object detection and action recognition. A bag-of-features representation is used to build the vocabulary of all features and transform the data for clustering videos. Probabilistic model-based clustering with a multinomial mixture model is used to determine the underlying clusters by maximizing the expected log-likelihood, estimating the parameters of the data as well as the cluster probabilities. The clusters are analyzed to understand each genre based on its dominant actions and objects. The Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) are used to determine the optimal number of clusters; both scores reached their minimum at 32 clusters, which was chosen as the optimal number. The data is labeled with the genres, and logistic regression is performed to check cluster performance on test data, achieving 96% accuracy.
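The model-selection step above is a direct application of the standard AIC/BIC formulas. A minimal sketch with hypothetical fit results (the log-likelihoods, parameter counts, and sample size below are invented for illustration; only the formulas come from the standard definitions):

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: lower is better; penalizes
    parameters more heavily than AIC as the sample size grows."""
    return n_params * math.log(n_samples) - 2 * log_likelihood

# Hypothetical mixture fits: (k clusters, maximized log-likelihood, parameter count)
fits = [(16, -20000.0, 1600), (32, -12000.0, 3200), (64, -10000.0, 6400)]
best = min(fits, key=lambda f: bic(f[1], f[2], n_samples=10000))
```

With these numbers the k=32 fit minimizes BIC: the jump from 16 to 32 clusters buys enough likelihood to pay its parameter penalty, while the jump to 64 does not.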
APA, Harvard, Vancouver, ISO, and other styles
12

Mendes, David, M. J. Lopes, Artur Romão, and Irene Pimenta Rodrigues. "Healthcare Computer Reasoning Addressing Chronically Ill Societies Using IoT: Deep Learning AI to the Rescue of Home-Based Healthcare." Bachelor's thesis, IGI Global, 2016. http://hdl.handle.net/10174/19286.

Full text
Abstract:
The authors present a proposal to develop intelligent assisted-living environments for home-based healthcare. These environments unite a semantic representation of the chronic patient's clinical history with the ability to monitor living conditions and events, relying on a fully managed Semantic Web of Things (SWoT). Several levels of acquired knowledge, and the case-based reasoning made possible by representing the health-disease history and acquiring scientific evidence, will deliver, through various voice-based natural interfaces, adequate support systems for disease self-management, most prominently by activating the less specialized caregiver for any specific need. With these capabilities at hand, home-based healthcare provision becomes a viable possibility, reducing the need for institutionalization. The resulting integrated healthcare framework will provide significant savings while improving general health and satisfaction indicators.
APA, Harvard, Vancouver, ISO, and other styles
13

Sievert, Rolf. "Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling : A Deep Learning Model Comparison." Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-175173.

Full text
Abstract:
Instance segmentation has great potential for improving the current state of littering by autonomously detecting and segmenting different categories of litter. With this information, litter could, for example, be geotagged to aid litter pickers or to give precise location information to unmanned vehicles for autonomous litter collection. Land-based litter instance segmentation is a relatively unexplored field, and this study aims to compare the instance segmentation models Mask R-CNN and DetectoRS on the multiclass litter dataset Trash Annotations in Context (TACO), using the Common Objects in Context (COCO) precision and recall scores. TACO is an imbalanced dataset, and therefore imbalanced-data handling is addressed by exercising a second-order-relation iterative stratified split, and additionally by oversampling when training Mask R-CNN. Mask R-CNN without oversampling resulted in a segmentation mAP of 0.127, and with oversampling 0.163. DetectoRS achieved a segmentation mAP of 0.167, and it improves the segmentation mAP of small objects most noticeably, by a factor of at least 2, which is important within the litter domain since small objects such as cigarettes are overrepresented. In contrast, oversampling with Mask R-CNN does not seem to improve the general precision on small and medium objects, but only improves the detection of large objects. It is concluded that DetectoRS improves results compared to Mask R-CNN, as does oversampling. However, using a dataset that cannot have an all-class representation across train, validation, and test splits, together with an iterative stratification that does not guarantee all-class representation, makes it hard for future work to do exact comparisons to this study. Results are therefore approximate when considering all categories, since 12 categories are missing from the test set, 4 of which were impossible to split into train, validation, and test sets.
Further image collection and annotation to mitigate the imbalance would improve results most noticeably, since the results depend on class-averaged values. Oversampling with DetectoRS would also help improve results. There is also the option to combine the two datasets TACO and MJU-Waste to enable training on more categories.
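The oversampling used with Mask R-CNN can be sketched as naive random duplication of minority-class samples. A minimal illustration only; the thesis's exact resampling scheme, and the class names below, are assumptions.

```python
import random

def oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    matches the majority-class count."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for y, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for s in group + extra:
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels
```

Duplication balances the class-averaged metrics but adds no new visual variety, which is consistent with the finding that fresh image collection would help more.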
APA, Harvard, Vancouver, ISO, and other styles
14

Liu, Zheng-Wei, and 劉政威. "Waterfall Model for Deep Reinforcement Learning Based Scheduling." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/a3yn5q.

Full text
Abstract:
Master's thesis
National Central University
Department of Communication Engineering (executive master's program)
Academic year 107 (2018)
The fourth generation of communication systems can already meet the multimedia needs of mobile devices. Through the scheduling service provided by the base station, user equipment obtains the data packets it requires on the downlink, so the algorithm that allocates channel resources and schedules the user group is critical. This thesis implements a mobile communication scheduling learning platform and proposes a Deep Deterministic Policy Gradient (DDPG) model. The waterfall model concept is used to decompose the scheduling algorithm into three stages: sorting selection, resource evaluation, and channel allocation, yielding a waterfall scheduling method that achieves higher data throughput per unit time and meets more user needs in the current communication environment. The platform is composed of six modular components: base station and channel resources, reinforcement learning neural network, user equipment attributes, application service types, environment information, and reward functions. Inversion of control and dependency injection are used to reduce the coupling of the platform software, making the stage micro-algorithms and the six modular components easy to maintain.
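The three-stage waterfall decomposition can be sketched as a pipeline of plain functions. The stage heuristics below (demand-based ordering, a fair-share cap) are illustrative placeholders; in the thesis, the stages are driven by the DDPG agent, and all names are assumptions.

```python
def waterfall_schedule(users, resources):
    """Waterfall sketch: (1) sorting selection orders the users,
    (2) resource evaluation budgets blocks per user, and
    (3) channel allocation hands out resource blocks in that order."""
    # Stage 1: sorting selection -- order users by queued demand, largest first
    ordered = sorted(users, key=lambda u: u["demand"], reverse=True)
    # Stage 2: resource evaluation -- cap each user at a fair share of blocks
    fair_share = max(1, resources // max(1, len(users)))
    # Stage 3: channel allocation -- grant blocks until none remain
    allocation, remaining = {}, resources
    for user in ordered:
        grant = min(user["demand"], fair_share, remaining)
        allocation[user["id"]] = grant
        remaining -= grant
    return allocation

users = [{"id": "ue1", "demand": 5}, {"id": "ue2", "demand": 2}, {"id": "ue3", "demand": 9}]
alloc = waterfall_schedule(users, resources=10)
```

Keeping each stage behind its own function boundary is what makes the dependency-injection design workable: a learned policy can replace any single stage without touching the others.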
APA, Harvard, Vancouver, ISO, and other styles
15

Chang, Chao-Mei, and 張昭美. "Taiwanese speech commands recognition model based on deep learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/knb7ws.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Degree Program of Computer Science, College of Computer Science
Academic year 107 (2018–2019)
Most recent machine learning papers target images and videos, such as face recognition, large-scale image database identification, autonomous driving, object recognition, AlphaGo, object trajectory prediction, image style transfer, and creating virtual portraits with style-based GANs. However, given the development trend of voice assistants, close cooperation with local language materials and cultural habits is necessary, so this work focuses on local-language audio processing and machine learning. Benefiting from the rapid progress of deep learning, diverse languages are no longer an obstacle to communication but rather a manifestation of diverse cultures, and it is time to pay attention to regional languages such as Taiwanese. The thesis applies a variety of audio pre-processing methods and deep models such as CNN, LSTM, and GRU to: 1. Taiwanese speech command recognition; 2. triggering on specific Taiwanese speech keywords; 3. identifying Mandarin and Taiwanese audio segments; 4. using AI to write Taiwanese local drama. Finally, the system is applied in an Android mobile app, so that the user can say "góabeh" (I want to) "khuànn-siòng-phìnn" (see photos), "thian-im-ga̍k" (listen to music), "khà-tiān-uē" (make a phone call), or "hip-siàng" (take a photograph) to invoke the corresponding application.
APA, Harvard, Vancouver, ISO, and other styles
16

Huang, Kuang-Chieh, and 黃冠傑. "Predicting Remaining Useful Life of Equipment based on Deep Learning-based Model." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/2bd366.

Full text
Abstract:
Master's thesis
Yuan Ze University
Department of Information Management
Academic year 106 (2017–2018)
With the development of smart manufacturing, large numbers of sensors have been installed to record variables associated with production equipment so that abnormal conditions can be detected immediately. This research focuses on remaining useful life (RUL) prediction. RUL estimation is part of predictive maintenance (PdM): it is condition-based, detecting from the machine's past development trend that the machine is about to malfunction, so that early warning can be given that the machine needs to be replaced or repaired to ensure the sustainability of the system. Methods in the existing literature often find it difficult to extract meaningful features from sensing data. This research proposes a deep learning method that constructs an autoencoder gated recurrent unit (AE-GRU) neural network model: the autoencoder extracts important features from the raw data, and the gated recurrent unit picks up sequence information to forecast remaining useful life precisely. In the experiments, we use the dataset from the prognostics challenge competition at the IEEE International Conference on Prognostics and Health Management (PHM08), evaluated by 5-fold cross-validation. In terms of root mean square error (RMSE), our method outperforms other methods such as the deep neural network (DNN), recurrent neural network (RNN), long short-term memory network (LSTM), and gated recurrent unit network (GRU).
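The gated recurrent unit at the heart of the AE-GRU model can be illustrated with a single-cell forward pass over an encoded sensor sequence; the feature and hidden dimensions, random weights, and linear RUL read-out below are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

def gru_step(x, h, W, U, b):
    """One GRU time step. x: input features (d,); h: previous hidden state (k,).
    W: (3, k, d), U: (3, k, k), b: (3, k) hold update/reset/candidate weights."""
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sig(W[0] @ x + U[0] @ h + b[0])                  # update gate
    r = sig(W[1] @ x + U[1] @ h + b[1])                  # reset gate
    h_cand = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])   # candidate state
    return (1.0 - z) * h + z * h_cand                    # gated interpolation

rng = np.random.default_rng(0)
d, k, T = 8, 4, 10   # hypothetical: 8 autoencoder features, hidden size 4, 10 steps
W = rng.normal(0.0, 0.1, (3, k, d))
U = rng.normal(0.0, 0.1, (3, k, k))
b = np.zeros((3, k))

h = np.zeros(k)
for x in rng.normal(size=(T, d)):   # stands in for the autoencoder's feature sequence
    h = gru_step(x, h, W, U, b)
rul = h @ rng.normal(size=k)        # linear read-out to a scalar RUL estimate
```

Because each new hidden state is a convex combination of its previous value and a tanh candidate, it stays bounded in (-1, 1), which keeps long sensor sequences numerically stable.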
APA, Harvard, Vancouver, ISO, and other styles
17

HUANG, PO-YU, and 黃柏毓. "Predicting Social Insurance Payment Behavior Based on Deep Learning Model." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/04210692742079168717.

Full text
Abstract:
Master's thesis
Feng Chia University
Department of Information Engineering
Academic year 105 (2016–2017)
Social insurance is an important part of the social security system. In Taiwan, the social insurance system is classified by occupational groups and managed by different government agencies. According to the Executive Yuan of Taiwan, this pension system includes five separate social insurance programs covering public servants and teachers, laborers, military personnel, farmers, and a national pension insurance program for those not covered by the above four employment-based categories. The Ministry of Health and Welfare in Taiwan is responsible for many types of social insurance, such as National Pension Insurance, National Health Insurance, and the Long-term Care Services Program. In addition, the Ministry of Health and Welfare provides subsidized health insurance coverage for the underprivileged and ensures that senior citizens with no employment-based retirement benefits will still have the basic economic necessities in their elderly life. Unfortunately, most social insurance programs are affected by various problems and have been facing the crisis of pension bankruptcy. Although traditional actuarial methods use many hypotheses to analyze cash flow, they mostly focus on trend analysis with a macro view of the participants. Due to the large number of the insured, it is very hard to predict the payment behavior of each individual. To make better predictions, we propose to build payment behavior models based on machine learning technology to predict personal payment behavior accurately. Using the number of participants for each personal payment behavior and the corresponding insurance premiums, we can make better cash flow predictions in order to help social insurance operations become sustainable. This research uses seven years of data from Taiwan's National Pension Insurance as the source of experimental data. With the implementation of a deep learning model, we can analyze and predict the future payment behaviors of the insured.
APA, Harvard, Vancouver, ISO, and other styles
18

Chung, Hao-Ting, and 鐘皓廷. "Building Student Course Performance Prediction Model Based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/m2z8n3.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Department of Computer Science and Information Engineering
Academic year 106 (2017–2018)
The deferral-of-graduation rate in Taiwan's universities is estimated at 16%, which affects the scheduling of school resources. Therefore, if we can notice students' academic performance early and provide guidance to students who are not expected to pass the threshold, we can effectively reduce the waste of school resources. In this research, we use recent years' student data attributes and course results as training data to construct a student performance prediction model. The K-Means algorithm is used to cluster all courses from the freshman to the senior year, so that related courses are grouped in the same cluster, making it more likely to find similar features and improving prediction accuracy. This research then constructs an independent neural network for each course according to academic year. Each model is pre-trained using a de-noising autoencoder; after pre-training, the corresponding structure and weights are taken as the initial values of the neural network model. Each neural network is treated as a base predictor, and all predictors are integrated into an ensemble predictor according to per-year weights to predict the current student's course performance. As students finish courses at the end of each semester, the prediction model continues to track and update itself through online learning to enhance model accuracy.
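The final ensemble step above can be sketched as a weighted average of the per-year base predictors; the abstract does not spell out the weighting rule, so the normalised year weights and pass-probability values here are hypothetical.

```python
import numpy as np

def ensemble_predict(base_preds, year_weights):
    """Weighted average of base predictors.
    base_preds: (n_models,) or (n_models, n_samples); year_weights: (n_models,)."""
    w = np.asarray(year_weights, dtype=float)
    w = w / w.sum()                       # normalise so the weights sum to 1
    return np.asarray(base_preds, dtype=float).T @ w

# hypothetical pass probabilities from three per-year base predictors
preds = [0.9, 0.7, 0.6]   # most recent academic year first
weights = [3, 2, 1]       # recent years weighted higher (an assumption)
score = ensemble_predict(preds, weights)
```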
APA, Harvard, Vancouver, ISO, and other styles
19

Huang, Chu-Chih, and 黃炬智. "Classification of Chinese Articulation Disorder based on Deep Learning Model." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/3v7km6.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Electronic Engineering
Academic year 107 (2018–2019)
Articulation disorder means having difficulty during pronunciation, leading to incorrect articulation and unclear sentences, and it is a common child language issue. Currently, there is no unified classification of articulation disorders in Taiwan's medical field, so a speech therapist is required for analysis and treatment in hospitals. After a series of pronunciation tests, the speech therapist makes an analysis based on the child's pronunciations, and children return to the hospital continuously for months to improve their condition. Nevertheless, children with articulation disorder can only benefit by receiving treatments in hospitals, which slows down the treatment cycle. The purpose of this work is to automate the diagnosis of articulation disorder using convolutional neural networks (CNNs). Results show that LeNet-5, which achieved 94.56% Top-1 accuracy and an average F1-score of 0.995 with the smallest model size, is the most suitable for deploying the articulation-disorder application on mobile devices.
APA, Harvard, Vancouver, ISO, and other styles
20

LIN, HAN-LONG, and 林翰隆. "Building Graduate Salary Grading Prediction Model Based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/z4hkqx.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Department of Computer Science and Information Engineering
Academic year 107 (2018–2019)
This thesis uses deep learning to build a salary grading prediction model. Because there is an order relationship between salary grades, the problem is treated as an ordinal regression problem. A multi-output deep neural network is used to solve it, so that the network learns the correlation between salary grades during training. The model is pre-trained using a stacked de-noising autoencoder; after pre-training, the corresponding weights are taken as the initial weights of the neural network. During training, dropout and bootstrap aggregating (bagging) are used to improve model performance. The model takes graduates' personal information, grades, and family data as input features and predicts the salary grade of graduating or graduated students. The results are provided to school researchers to grasp salary trends.
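A common way to set up the multi-output network for ordinal regression mentioned above is to encode a grade among K ordered levels as K-1 cumulative binary targets, so that neighbouring grades share most of their targets, and to decode by counting confident outputs. A minimal sketch (K = 5 and the 0.5 threshold are illustrative assumptions; the thesis may use a different formulation):

```python
import numpy as np

K = 5  # hypothetical number of salary grades

def encode_ordinal(grade, n_grades=K):
    """Grade k (0-based) -> n_grades-1 cumulative targets: t_i = 1 iff grade > i."""
    return (grade > np.arange(n_grades - 1)).astype(float)

def decode_ordinal(outputs, threshold=0.5):
    """Predicted grade = number of cumulative outputs above the threshold."""
    return int(np.sum(np.asarray(outputs) > threshold))

targets = encode_ordinal(3)                    # -> [1., 1., 1., 0.]
grade = decode_ordinal([0.9, 0.8, 0.6, 0.2])   # -> 3
```

Because adjacent grades differ in only one target bit, mistakes tend to land on neighbouring grades rather than distant ones, which is the point of treating the task as ordinal rather than plain classification.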
APA, Harvard, Vancouver, ISO, and other styles
21

CHEN, LI-TENG, and 陳立騰. "Image Caption Generation Based on Deep Learning and Visual Attention Model." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/v6g3tp.

Full text
Abstract:
Master's thesis
National Yunlin University of Science and Technology
Department of Electrical Engineering
Academic year 106 (2017–2018)
In this thesis, we develop an image caption generation system based on deep learning and a visual attention model. The system is composed of several parts: object detection, saliency computation, and image caption generation. In the object detection part, a deep learning technique, Faster R-CNN, is used to detect and classify objects in images; a pre-trained model can classify 80 categories. In the saliency computation, the pre-trained model proposed in [8] computes the saliency value of each ROI image. According to the category information and saliency values, the proposed system generates the corresponding image caption. To evaluate the performance of the proposed system, the COCO 2014 image set, which contains 30,000 images, is used. For image captioning, the BLEU value of the proposed system is higher than that of [11]. Experimental results show that the proposed system is superior to the existing method [11].
APA, Harvard, Vancouver, ISO, and other styles
22

CHUANG, YU-HAO, and 莊友豪. "Automatic Mobile Online Game Bot Detection Model Based on Deep Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/bxryp2.

Full text
Abstract:
Master's thesis
National Taipei University
Department of Computer Science and Information Engineering
Academic year 106 (2017–2018)
The excessive flood of game bots causes imbalances in mobile online games and even shortens their life cycle. The random forest algorithm is a common solution for identifying game bots through behavioral features. Although it can detect most game bots exactly, there are some gray-zone players that it cannot detect accurately. Therefore, in this paper, we propose a deep learning based game bot detection approach: we collect players' data and extract features to build a multilayer perceptron model as the detection standard. We design four sets of training parameters with different methods and choose the best-performing set as the baseline of our deep learning approach. The approach is implemented on the mobile online game KANO. The model calculates a bot probability for each sample; we then count the occurrences of each probability value and search for the data in the middle, using this algorithm to define the critical value for bot detection. The experimental results show that the proposed model has better performance, reducing the error rate from 6.218% to 2.53% and increasing the accuracy from 95.2% to 99.894% compared with the random forest model on the same players' data. The critical value from the training data differs very little from that of the testing data, so our model can detect bot players more accurately, with lower false negative and false positive rates.
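The abstract describes the critical-value step only loosely (compute each sample's bot probability, then "search the data in the middle"). One plausible reading, sketched below with synthetic Beta-distributed scores standing in for real player data, is to take the valley between the two modes of the score histogram as the detection threshold; both the histogram rule and the score distributions are assumptions.

```python
import numpy as np

def critical_value(probs, bins=20):
    """Threshold from a bimodal score distribution: histogram the predicted
    bot probabilities and return the centre of the emptiest bin between the
    human-side and bot-side modes."""
    counts, edges = np.histogram(probs, bins=bins, range=(0.0, 1.0))
    lo = int(np.argmax(counts[: bins // 2]))               # human-side mode
    hi = bins // 2 + int(np.argmax(counts[bins // 2 :]))   # bot-side mode
    if hi > lo + 1:
        valley = lo + 1 + int(np.argmin(counts[lo + 1 : hi]))
    else:
        valley = lo
    return 0.5 * (edges[valley] + edges[valley + 1])

rng = np.random.default_rng(1)
probs = np.concatenate([rng.beta(2, 8, 900),   # human-like scores: low bot probability
                        rng.beta(8, 2, 100)])  # bot-like scores: high bot probability
t = critical_value(probs)
flagged = probs > t
```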
APA, Harvard, Vancouver, ISO, and other styles
23

LIU, ZHI-YONG, and 劉志勇. "Development and Performance Verification of Cognitive Diagnosis Model based on Deep Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/s6t3jr.

Full text
Abstract:
Doctoral dissertation
National Taichung University of Education
Graduate Institute of Educational Information and Measurement
Academic year 107 (2018–2019)
Deep learning has brought breakthrough developments in many fields, such as convolutional neural networks for image recognition, long short-term memory networks for speech and natural language processing, the word2vec model for producing word vectors, generative adversarial networks, and deep reinforcement learning. In this study, the autoencoder algorithm is applied to develop a deep learning cognitive diagnosis model (DLCD). Traditional cognitive diagnosis models, such as DINA and G-DINA, require parameter estimation through the expectation-maximization algorithm or Markov chain Monte Carlo methods; DLCD addresses the problem that these traditional models require large samples for estimation. The research is divided into three parts: Q-matrix research, simulation research, and real-data research. DLCD not only works well on both complete and non-complete Q matrices but also demonstrates the most favorable generalization ability. The proposed method outperforms DINA and G-DINA in simulated research when the sample size is small. Moreover, DLCD has the highest classification agreement in the real-data research. Based on the results of the simulated and real data sets, DLCD is suitable for small-class teaching and even for a single examinee.
APA, Harvard, Vancouver, ISO, and other styles
24

CHUANG, YI-TING, and 莊宜庭. "A PM2.5 Prediction Model Based on Deep Learning with Recurrent Neural Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/sfu623.

Full text
Abstract:
Master's thesis
Tunghai University
Department of Information Management
Academic year 107 (2018–2019)
In recent years, many studies have verified that air pollution seriously affects human health, and media reports on air pollution issues have drawn public attention to the problem. This study analyzes the 2018 real-time air quality pollution indicator data of the Environmental Protection Administration. Five methods are used to handle the missing values. The main correlated variables affecting the PM2.5 concentration are identified by principal component analysis and correlation coefficients (single factor: PM10, SO2, NOX, NO2, CO; two-factor: NOX+NO2+CO, SO2+PM10), and the long short-term memory (LSTM) model of the recurrent neural network (RNN) family is used to model the PM2.5 concentration for the next 8 hours. According to the results, most of the errors between the predicted and true values at Fengyuan Station fall within a reasonable MAPE range (0.2~0.5). In addition, the best way to handle missing values is linear interpolation.
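The two preparation steps the abstract reports, linear interpolation of missing values and framing the hourly series so a recurrent model predicts hours ahead, can be sketched as follows; the toy series and the short lookback and horizon used in the example are illustrative only (the study predicts 8 hours ahead).

```python
import numpy as np

def fill_missing(series):
    """Linearly interpolate NaN gaps in an hourly series."""
    s = np.asarray(series, dtype=float)
    idx = np.arange(len(s))
    ok = ~np.isnan(s)
    return np.interp(idx, idx[ok], s[ok])

def make_windows(x, lookback, horizon=8):
    """Supervised framing for an LSTM: predict the value `horizon` hours
    ahead from the previous `lookback` hours."""
    X, y = [], []
    for t in range(len(x) - lookback - horizon + 1):
        X.append(x[t : t + lookback])
        y.append(x[t + lookback + horizon - 1])
    return np.array(X), np.array(y)

pm25 = np.array([10, 12, np.nan, 16, 18, np.nan, np.nan, 24, 26, 28], dtype=float)
filled = fill_missing(pm25)                         # gaps become 14, 20, 22
X, y = make_windows(filled, lookback=3, horizon=2)  # toy horizon for the short series
```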
APA, Harvard, Vancouver, ISO, and other styles
25

LI, CHEN-YU, and 李振宇. "A Study on Deep Learning based Blind Guidance Model Building for Smart Device." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/nuje9f.

Full text
Abstract:
Master's thesis
Feng Chia University
Department of Information Engineering
Academic year 107 (2018–2019)
In the field of machine learning and data science, smart navigation devices are a growing trend. More and more smart devices are designed for blind people, and the core of a visual aid is usually realized by image recognition. There are many tools and machine learning platforms for building image recognition models, but they do not satisfy the requirements of visual aid devices. To construct a visual aid device, the main problems are reducing the cost of model training and making the image recognition device portable. The system needs to remind and warn the user in immediately dangerous situations, such as cars, traffic lights, and indoor obstacles, and to help blind people find indoor objects efficiently and quickly. To solve these problems, we propose a deep learning based blind guidance framework. YOLO, a real-time object detector that is currently among the fastest detection methods with high accuracy and low training cost, is used to reduce the cost of model training. For portability, the system needs an embedded device capable of deep learning computation at low power; the NVIDIA Jetson TX2 solves the portability problem. In the experiments, we evaluated our proposed framework with 4,853 images, and the results show that the proposed system is suitable for blind users.
APA, Harvard, Vancouver, ISO, and other styles
26

CHEN, TAI-RONG, and 陳泰融. "A Preliminary Study on Deep Learning Neural Networks-based multi-model Sentiment Detection." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/x5399c.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Department of Electronic Engineering
Academic year 107 (2018–2019)
To celebrate the 50th anniversary of the moon landing program, the University of Texas at Dallas (UT Dallas) released the Fearless Steps Corpus, recordings of the communication dialogues of the NASA Apollo program, and held the first Fearless Steps Challenge. Exploring the intricate communication characteristics of problem solving at a scale as complex as going to the moon can lead to novel algorithms for speech processing and conversational understanding in challenging environments. This thesis focuses on the speech sentiment detection task. Considering both the acoustics and the semantic meaning of speech, it presents a preliminary study on deep-neural-network-based multi-model sentiment detection in speech signals. The specific practice includes (1) using a convolutional neural network (CNN) to automatically extract sentiment feature parameters from the acoustic spectrum, and (2) using Bidirectional Encoder Representations from Transformers (BERT), combining the characteristics of both to enhance the system's sentiment detection performance. In the final official competition, our system's sentiment detection accuracy was 73.11%, ranking 3rd among all teams' 20 submitted results, exceeding the baseline reference system (49.75%) and within 1% of the champion (74.07%).
APA, Harvard, Vancouver, ISO, and other styles
27

Rizqi, Diwanda Ageng, and Diwanda Ageng Rizqi. "A Skill Transfer Support Model Based on Deep Learning in Human-machine Interaction." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/9h698z.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Industrial Management
Academic year 107 (2018–2019)
The paradigm shift toward Industry 4.0 is not completed solely by enabling smart machines in a factory, but also by enabling human capability. Refinement of work processes and the introduction of new training approaches are needed to support efficient human skill development and transfer. This study proposes a new skill transfer support model in a manufacturing scenario. The proposed model uses two types of deep learning as its backbone: a convolutional neural network for action recognition and a faster region-based convolutional neural network for object detection. To evaluate the performance of the proposed model, a case study on toy assembly was conducted, recorded using two cameras at different angles. The accuracy for the CNN and faster R-CNN is 94.5% and 99%, respectively. A junior operator is guided by the proposed model while doing flexible assembly tasks, based on the skill representation that has been constructed. In terms of theoretical contribution, this study integrates two deep learning models that can simultaneously recognize actions and detect objects. The practical contribution of the present study is to facilitate advanced training in manufacturing by helping operators adapt to new skills.
APA, Harvard, Vancouver, ISO, and other styles
28

Lin, Tzu-Yang, and 林子揚. "Robust Vision-Based Daytime Vehicle Brake Light DetectionUsing Two-Stage Deep Learning Model." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/9m435s.

Full text
Abstract:
Master's thesis
Yuan Ze University
Department of Electrical Engineering
Academic year 106 (2017–2018)
In the modern age of advanced automotive technology and deep learning, most people own a vehicle and cars are increasingly well equipped; the Advanced Driving Assistance System (ADAS) has gradually become basic vehicle equipment. In this environment, the development value of the Internet of Vehicles is also increasing: if drivers can be informed of driving conditions near their vehicle through the Internet of Vehicles, most accidents can be avoided. Today's ADAS functions can be divided into three major categories, namely active control, early warning, and other auxiliary functions, including Adaptive Cruise Control (ACC), Autonomous Emergency Braking (AEB), and Forward Collision Warning (FCW), which use radar and other sensors to measure the distance to the vehicle ahead and use it as a parameter for analysis. However, beyond distance, if timely information about the vehicle ahead can be obtained and transmitted through the Internet of Vehicles, the driving conditions of surrounding vehicles can be judged more accurately. Therefore, we combined a recent object detection network and a classification network and propose a daytime vehicle brake light detection system that uses a single image as input and does not require tracking. Frames from an ordinary driving recorder are used as input; the first stage, vehicle detection, finds candidate regions of the vehicles ahead, and the candidates are then passed to the second-stage brake light recognition network to obtain the result. The experimental results show that our proposed system achieves very high recognition rates under various weather conditions.
APA, Harvard, Vancouver, ISO, and other styles
29

Huang, I.-Hang, and 黃一航. "Development of the rainfall-runoff model based on Multi-Agents Deep Reinforcement Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/42p9m9.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Civil Engineering
Academic year 107 (2018–2019)
Rainfall-runoff models are commonly classified into three categories, namely physical-based, conceptual, and empirical models, for use in hydraulic structure design and research. Physical-based models are restricted by geographic information and over-complex operations. Empirical models adopt several empirical equations to construct the correlation between inputs and outputs at the outflow control section, but are sometimes distrusted due to their black-box character. A reasonable way to retain the advantages and improve the drawbacks of both physical-based and empirical models is to construct a rainfall-runoff model with novel algorithms. Therefore, this study presents a novel Multi-Agent-System Deep Reinforcement Learning (MAS-DRL) model that is capable of understanding the hydrological process through water units while reducing physical operations. The behavior of a water unit and the interaction between water units are simulated by deep reinforcement learning and the multi-agent system, respectively. Designed cases clearly demonstrate the advantage of the MAS-DRL model. To approximate an actual basin, the topographies are set based on the Shihmen Reservoir basin (Taiwan), and rainfall events are designed according to the precipitation standards set by the Central Weather Bureau. The Eagleson dynamic wave solution (Eagleson solution) and SOBEK are constructed as comparison models. The results reveal that, with increasing rounds of training, the discharge simulated by the MAS-DRL model gradually approaches the results of the Eagleson solution regardless of the rainfall event. In summary, this study proposes a novel MAS-DRL model that simulates the behavior of water units and their interactions to understand the hydrological process and reproduce physical operations. The proposed modeling technique could be helpful for future use of DRL in hydrology.
APA, Harvard, Vancouver, ISO, and other styles
30

HUANG, WEN-SHENG, and 黃玟勝. "Hash code generation based on deep learning and visual attention model for image retrieval." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/er59ds.

Full text
Abstract:
Master's thesis
National Yunlin University of Science and Technology
Department of Electrical Engineering
Academic year 106 (2017–2018)
In this thesis, we develop hash code generation based on deep learning and a visual attention model for image retrieval. The system is composed of several parts: object detection, saliency computation, and hash code generation. In the object detection part, a deep learning technique, Faster R-CNN, is used to detect and classify objects in images; a pre-trained model can classify 20 categories. In the saliency computation, the pre-trained model proposed in [26] computes the saliency value of each object. According to the category information and saliency values, the proposed system generates the corresponding hash code. To evaluate the performance of the proposed system, the PASCAL VOC image set, which contains 27,088 images, is used. For image retrieval, the nDCG value of the proposed system is higher than that of [29]. Experimental results show that the proposed system is superior to the existing method [29]. Keywords: object detection, visual attention, hash code
APA, Harvard, Vancouver, ISO, and other styles
31

Cruz, Rui Francisco Pereira Moital Loureiro da. "Fine-tuning a transformers-based model to extract relevant fields from invoices." Master's thesis, 2021. http://hdl.handle.net/10362/130277.

Full text
Abstract:
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Extraction of relevant fields from documents has been a relevant matter for decades. Although well-established algorithms to perform this task have existed since the late twentieth century, the field has again gathered attention with the fast growth of deep learning models and transfer learning. One of these models is LayoutLM, a Transformer-based architecture pre-trained with additional features that represent the 2D position of the words. In this dissertation, LayoutLM is fine-tuned on a set of invoices to extract relevant fields such as company name, address, and document date, among others. Given the objective of deploying the model in a company's internal accounting software, an end-to-end machine learning pipeline is presented. The training layer receives batches of document images with their corresponding annotations and fine-tunes the model for a sequence labeling task; the production layer takes images as input and predicts the relevant fields. The images are pre-processed by extracting the whole document text and bounding boxes using OCR. To automatically label the samples in the Transformer-based input format, the text is labeled by an algorithm that searches for parts of the text equal or highly similar to the annotations. A new dataset supporting this work is also created and made publicly available, consisting of 813 pictures and the annotation text for every relevant field: company name, company address, document date, document number, buyer tax number, seller tax number, total amount, and tax amount. The models are fine-tuned and compared with two baseline models, showing performance very close to that presented by the model authors. A sensitivity analysis is made to understand the impact of two datasets with different characteristics. In addition, the learning curves for different datasets establish empirically that 100 to 200 samples are enough to fine-tune the model and achieve top performance.
Based on the results, a strategy for model deployment is defined. Empirical results show that the fine-tuned model alone is enough to guarantee top performance in production without the need for online learning algorithms.
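The auto-labelling algorithm mentioned above, which searches the OCR text for spans equal or highly similar to each annotation, can be sketched with a brute-force fuzzy span match; the BIO tag name, the 0.8 similarity cut-off, and the token example are illustrative assumptions, not the dissertation's exact procedure.

```python
from difflib import SequenceMatcher

def label_tokens(tokens, annotation, min_ratio=0.8):
    """Tag the contiguous token span most similar to `annotation` in BIO style;
    everything else gets 'O'. Brute-force over all spans (fine for short pages)."""
    best, best_span = 0.0, None
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            cand = " ".join(tokens[i:j]).lower()
            r = SequenceMatcher(None, cand, annotation.lower()).ratio()
            if r > best:
                best, best_span = r, (i, j)
    labels = ["O"] * len(tokens)
    if best_span is not None and best >= min_ratio:
        i, j = best_span
        labels[i] = "B-FIELD"
        for k in range(i + 1, j):
            labels[k] = "I-FIELD"
    return labels

tokens = ["Invoice", "No.", "123", "ACME", "Corp", "Ltd", "Date", "2021-05-01"]
labels = label_tokens(tokens, "Acme Corp Ltd")
# labels -> ['O', 'O', 'O', 'B-FIELD', 'I-FIELD', 'I-FIELD', 'O', 'O']
```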
APA, Harvard, Vancouver, ISO, and other styles
32

(5929469), Hani A. Almansouri. "Model-Based Iterative Reconstruction and Direct Deep Learning for One-Sided Ultrasonic Non-Destructive Evaluation." Thesis, 2019.

Find full text
Abstract:

One-sided ultrasonic non-destructive evaluation (UNDE) is extensively used to characterize structures that must be inspected and maintained against defects and flaws that could affect the performance of power plants, such as nuclear power plants. Most UNDE systems send acoustic pulses into the structure of interest, measure the received waveform, and use an algorithm to reconstruct the quantity of interest. The most widely used algorithm in UNDE systems is the synthetic aperture focusing technique (SAFT) because it produces acceptable results in real time. A few regularized inversion techniques with linear models have been proposed that can improve on SAFT, but they tend to make simplifying assumptions that produce artifacts and do not address how to obtain reconstructions from large real data sets. In this thesis, we present two studies. The first covers the model-based iterative reconstruction (MBIR) technique, which resolves some of the issues in SAFT and the current linear regularized inversion techniques; the second covers the direct deep learning (DDL) technique, which further resolves issues related to non-linear interactions between the ultrasound signal and the specimen.

In the first study, we propose a model-based iterative reconstruction (MBIR) algorithm designed for scanning UNDE systems. MBIR reconstructs the image by optimizing a cost function that contains two terms: the forward model that models the measurements and the prior model that models the object. To further reduce some of the artifacts in the results, we enhance the forward model of MBIR to account for the direct arrival artifacts and the isotropic artifacts. The direct arrival signals are the signals received directly from the transmitter without being reflected. These signals contain no useful information about the specimen and produce high amplitude artifacts in regions close to the transducers. We resolve this issue by modeling these direct arrival signals in the forward model to reduce their artifacts while maintaining information from reflections of other objects. Next, the isotropic artifacts appear when the transmitted signal is assumed to propagate in all directions equally. Therefore, we modify our forward model to resolve this issue by modeling the anisotropic propagation. Next, because of the significant attenuation of the transmitted signal as it propagates through deeper regions, the reconstruction of deeper regions tends to be much dimmer than closer regions. Therefore, we combine the forward model with a spatially variant prior model to account for the attenuation by reducing the regularization as the pixel gets deeper. Next, for scanning large structures, multiple scans are required to cover the whole field of view. Typically, these scans are performed in raster order which makes adjacent scans share some useful correlations. Reconstructing each scan individually and performing a conventional stitching method is not an efficient way because this could produce stitching artifacts and ignore extra information from adjacent scans. 
We present an algorithm to jointly reconstruct measurements from large data sets that reduces the stitching artifacts and exploits useful information from adjacent scans. Next, using simulated and extensive experimental data, we show MBIR results and demonstrate how we can improve over SAFT as well as existing regularized inversion techniques. However, even with this improvement, MBIR still results in some artifacts caused by the inherent non-linearity of the interaction between the ultrasound signal and the specimen.
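The two-term MBIR cost function described at the start of this study can be written generically as a forward-model data-fit term plus a prior term; the particular forward operator \(A\), noise weighting \(\Lambda\), and prior \(s(x)\) used in the thesis are not reproduced here, so this is only the standard generic form.

```latex
\hat{x} \;=\; \operatorname*{arg\,min}_{x}\;
  \underbrace{\tfrac{1}{2}\,\lVert y - A x \rVert_{\Lambda}^{2}}_{\text{forward model (measurements)}}
  \;+\;
  \underbrace{s(x)}_{\text{prior model (object)}}
```

The enhancements in this study, modelling direct arrivals and anisotropic propagation, enter through the forward operator \(A\), while the spatially variant regularisation that compensates for attenuation at depth enters through the prior \(s(x)\).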

In the second study, we propose DDL, a non-iterative model-based reconstruction method for inverting measurements governed by non-linear forward models in ultrasound imaging. Our approach obtains an approximate estimate of the reconstruction using a simple linear back-projection and trains a deep neural network to refine it into the actual reconstruction. While the proposed technique shows significant improvement over current techniques on simulated data, its performance degrades on experimental data because of a modeling mismatch between the simulated training data and the real data. We propose an effective solution that reduces this mismatch by adding noise to the simulation input of the training set before simulation. This trains the neural network on the general features of the system rather than the specific features of the simulator, and acts as a regularizer for the network. A second issue, similar to the one in MBIR, is caused by the attenuation of deeper reflections; we therefore apply a spatially variant amplification to the back-projection to amplify deeper regions. Next, to reconstruct a large field of view that requires multiple scans, we propose a joint deep neural network technique that reconstructs an image from these multiple scans jointly. Finally, we apply DDL to simulated and experimental ultrasound data to demonstrate significant improvements in image quality compared to the delay-and-sum approach and the linear model-based reconstruction approach.
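The noise-injection remedy for the simulation-to-experiment mismatch can be illustrated with a toy linear simulator: noise is added to the simulator's input before simulation, so each training pair maps a back-projection of a perturbed scene to the clean target. Everything below (the linear stand-in for the forward operator, the noise level, the shapes) is an illustrative assumption rather than the dissertation's code.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(x, A):
    """Toy linear stand-in for the (in reality non-linear) ultrasound simulator."""
    return A @ x

def back_project(y, A):
    """Simple linear back-projection used as the network input."""
    return A.T @ y

def make_training_pair(x_clean, A, noise_level=0.05):
    # Perturb the simulator *input* before simulation, so the network
    # learns system-level features rather than simulator-specific ones.
    x_noisy = x_clean + noise_level * rng.normal(size=x_clean.shape)
    y = simulate(x_noisy, A)
    return back_project(y, A), x_clean    # (network input, clean target)

A = rng.normal(size=(40, 16))
x = rng.normal(size=16)
inp, target = make_training_pair(x, A)
```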


APA, Harvard, Vancouver, ISO, and other styles
33

LIN, YUN, and 林昀. "Auto-pilot Model Car Based on Raspberry Pi Embedded System with Neural Compute Stick using Deep Learning Model." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/deje4w.

Full text
Abstract:
Master's
國立臺北科技大學
車輛工程系
107
This study explores Donkey Car, an open-source experimental platform for automated radio-controlled cars based on the Python language, which uses machine learning and computer vision to drive radio-controlled cars on a Raspberry Pi 3 Model B+. Through this platform, deep learning is applied to automated radio-controlled cars. Donkey Car combines a deep learning framework with a Raspberry Pi control board: the user first drives the radio-controlled car along the lane with a joystick while the platform records the camera images and the joystick commands. This data is then fed into the deep learning framework to train an automated driving model, which learns to identify useful features in the images and to steer appropriately in response to them; the cycle of collecting data, training the model, and testing the car's motion is repeated until the desired result is met. Because of the limited CPU computing power of the Raspberry Pi, the experiment is equipped with an Intel Neural Compute Stick to improve computing performance, and YOLOv2 object detection is used to identify obstacles in front of the vehicle to improve the autonomous driving application and its performance.
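The training loop described above is behavioral cloning: recorded camera frames paired with joystick commands are fed to a supervised regression model that then steers on its own. A minimal stand-in sketch, with a plain ridge regressor in place of the platform's neural network and toy 8×8 frames, might look like this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Recorded driving data: camera frames paired with joystick steering values.
frames = rng.random((200, 8, 8))           # tiny stand-in camera images
steering = rng.uniform(-1.0, 1.0, 200)     # recorded joystick angles in [-1, 1]

X = frames.reshape(len(frames), -1)        # flatten each frame to a feature row
X = np.hstack([X, np.ones((len(X), 1))])   # bias column

# Behavioral cloning as ridge regression: imitate the recorded joystick.
lam = 1e-2
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ steering)

def drive(frame):
    """Predict a steering command for a new camera frame."""
    x = np.append(frame.ravel(), 1.0)
    return float(np.clip(x @ W, -1.0, 1.0))
```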
APA, Harvard, Vancouver, ISO, and other styles
34

Ping-WeiSoh and 蘇評威. "Adaptive Deep Learning-based Air Quality Prediction Model Using the Most Relevant Spatial-Temporal Relations." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/r38wfv.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Chiu, Yu-Chang, and 邱裕樟. "A Direct Marketing Approach with Deep Learning in E-Commerce: Review-Based Text Generation Model." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/65y4d3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

YANG, HAO-XIANG, and 楊皓翔. "Surface Defect Detection of Scarce Samples Based on Deep Learning Model and Generative Adversarial Network." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/evzn27.

Full text
Abstract:
Master's
國立臺北科技大學
自動化科技研究所
107
In traditional automated optical inspection (AOI), surface defect detection for different targets usually requires detection algorithms and procedures specified by field experts. To address this problem, this thesis uses a deep learning model to learn surface defects and further uses data augmentation and a generative adversarial network (GAN) to build a richer training dataset, since defect samples are typically scarce in surface defect detection. Data augmentation through simple techniques such as cropping, rotating, and flipping the input images is traditionally applied to expand the training dataset and improve the model's performance and ability to generalize; however, these techniques often induce overfitting of the defect model. This thesis first obtains rich and qualified defect images through active learning. The filtered defect images are then fed into the GAN to enlarge the training dataset. The Fréchet Inception Distance (FID) is used to judge the difference between input and generated images, and the images with the lowest FID are kept as the training dataset for the surface defect model. This dataset efficiently decreases the overkill rate and missed detection rate of the trained surface defect model. Finally, the deep learning surface defect model is verified on a public dataset and on images captured by a real-world AOI instrument. The experimental results show that the model achieves equal detection accuracy and performance whether trained on the large raw dataset or on the dataset expanded with traditional data augmentation and the GAN.
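The Fréchet Inception Distance used to filter the generated images compares the Gaussian statistics of two feature sets: FID = ||μ₁ − μ₂||² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^{1/2}). A small NumPy sketch of this formula is below; it operates on generic feature vectors, whereas in practice the features are Inception-network activations of the images.

```python
import numpy as np

def fid(feats_a, feats_b):
    """Frechet Inception Distance between two feature sets (rows = samples)."""
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    ca = np.cov(feats_a, rowvar=False)
    cb = np.cov(feats_b, rowvar=False)
    # Tr((ca cb)^{1/2}) via the eigenvalues of ca @ cb, which are real and
    # non-negative for a product of positive semi-definite matrices.
    eig = np.linalg.eigvals(ca @ cb)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    return float(((mu_a - mu_b) ** 2).sum()
                 + np.trace(ca) + np.trace(cb) - 2.0 * tr_sqrt)
```

Identical distributions give an FID near zero, and the score grows as the generated features drift away from the real ones.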
APA, Harvard, Vancouver, ISO, and other styles
37

Fiorani, Matteo. "Mixed-input second-hand car price estimation model based on scraped data." Master's thesis, 2022. http://hdl.handle.net/10362/134276.

Full text
Abstract:
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
The number of second-hand cars is growing year by year. More and more people prefer to buy a second-hand car rather than a new one due to the increasing cost of new cars and their fast devaluation in price. Consequently, there has also been an increase in online marketplaces for peer-to-peer (P2P) second-hand car trades. A robust price estimation is needed both for dealers, to have a good idea of how to price their cars, and for buyers, to understand whether a listing is overpriced or not. Price estimation for second-hand cars has, to my knowledge, so far only been explored with numerical and categorical features such as mileage driven, brand, or production year. An approach that also uses image data has yet to be developed. This work investigates a multi-input price estimation model for second-hand cars that combines a convolutional neural network (CNN), to extract features from car images, with an artificial neural network (ANN) handling the categorical and numerical features, and assesses whether this method improves price estimation accuracy over more traditional single-input methods. To train and evaluate the model, a dataset of second-hand car images and textual features is scraped from a marketplace and curated so that more than 700 images can be used for training.
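The multi-input fusion pattern the thesis describes, an image branch and a tabular branch joined at a regression head, can be sketched as follows. The random projection standing in for the CNN, the feature sizes, and the linear head are all illustrative assumptions, not the thesis's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 300
images = rng.random((n, 16, 16))            # car photos (toy size)
tabular = rng.random((n, 5))                # mileage, year, brand id, ...
price = rng.uniform(2_000, 30_000, n)

# "CNN branch": any image -> feature-vector map; a random projection here.
W_img = rng.normal(size=(16 * 16, 8))
img_feats = images.reshape(n, -1) @ W_img

# Fuse both branches and fit a linear head on the concatenation.
X = np.hstack([img_feats, tabular, np.ones((n, 1))])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

def predict(image, tab_row):
    """Price estimate from an image plus its categorical-numerical features."""
    feats = image.ravel() @ W_img
    return float(np.concatenate([feats, tab_row, [1.0]]) @ beta)
```

The design point is simply that both modalities contribute columns to the same regression problem, which is what the concatenation layer in a real CNN+ANN model does.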
APA, Harvard, Vancouver, ISO, and other styles
38

(9226151), Camilo G. Aguilar Herrera. "NOVEL MODEL-BASED AND DEEP LEARNING APPROACHES TO SEGMENTATION AND OBJECT DETECTION IN 3D MICROSCOPY IMAGES." Thesis, 2020.

Find full text
Abstract:

Modeling microscopy images and extracting information from them are important problems in the fields of physics and materials science.


Model-based methods, such as marked point processes (MPPs), and machine learning approaches, such as convolutional neural networks (CNNs), are powerful tools to perform these tasks. Nevertheless, MPPs present limitations when modeling objects with irregular boundaries. Similarly, machine learning techniques show drawbacks when differentiating clustered objects in volumetric datasets.

In this thesis we explore the extension of the MPP framework to detect irregularly shaped objects. In addition, we develop a CNN approach to perform efficient 3D object detection. Finally, we propose a CNN approach together with geometric regularization to provide robustness in object detection across different datasets.


The first part of this thesis explores the addition of boundary energy to the MPP by using active contours energy and level sets energy. Our results show this extension allows the MPP framework to detect material porosity in CT microscopy images and to detect red blood cells in DIC microscopy images.


The second part of this thesis proposes a convolutional neural network approach to perform 3D object detection by regressing objects voxels into clusters. Comparisons with leading methods demonstrate a significant speed-up in 3D fiber and porosity detection in composite polymers while preserving detection accuracy.


The third part of this thesis explores an improvement in the 3D object detection approach by regressing pixels into their instance centers and using geometric regularization. This improvement demonstrates robustness when comparing 3D fiber detection in several large volumetric datasets.
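The center-regression step in the third part can be illustrated as follows: each pixel votes for its instance center by adding a predicted offset to its own coordinates, and votes that land in the same bin are grouped into one instance. The binning rule and toy data below are assumptions for illustration, not the thesis's clustering procedure.

```python
import numpy as np

def cluster_by_center_votes(coords, offsets, grid=1.0):
    """Group pixels into instances by shifting each pixel by its predicted
    offset to the instance center, then binning the voted centers."""
    votes = coords + offsets
    keys = np.round(votes / grid).astype(int)
    labels = np.zeros(len(coords), dtype=int)
    seen = {}
    for i, k in enumerate(map(tuple, keys)):
        labels[i] = seen.setdefault(k, len(seen))
    return labels

# Two toy instances whose pixels vote for their own centers.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
offsets = np.array([[0.5, 0.0], [-0.5, 0.0], [0.5, 0.0], [-0.5, 0.0]])
labels = cluster_by_center_votes(coords, offsets)
```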


These methods can contribute to fast and correct structural characterization of large volumetric datasets, which could potentially lead to the development of novel materials.

APA, Harvard, Vancouver, ISO, and other styles
39

Cianci, Davio. "A Deep-Learning-Based Muon Neutrino CCQE Selection for Searches Beyond the Standard Model with MicroBooNE." Thesis, 2021. https://doi.org/10.7916/d8-1zgg-jh16.

Full text
Abstract:
The anomalous Low Energy Excess (LEE) of electron neutrinos and antineutrinos in MiniBooNE has inspired both theories and entire experiments to probe the heart of its mystery. One such experiment is MicroBooNE. This dissertation presents an important facet of its LEE investigation: how a powerful systematic constraint can be levied on this signal through a parallel study of a highly correlated channel in muon neutrinos. This constraint strengthens MicroBooNE's ability to confirm or invalidate the cause of the LEE and lays the groundwork for future oscillation searches in Liquid Argon Time Projection Chamber (LArTPC) experiments like SBN and DUNE. In addition, this muon channel can be used to test oscillations directly, demonstrated through the world's first muon neutrino disappearance search with LArTPC data.
APA, Harvard, Vancouver, ISO, and other styles
40

Xu, Kelvin. "Exploring Attention Based Model for Captioning Images." Thèse, 2017. http://hdl.handle.net/1866/20194.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Desjardins, Guillaume. "Training deep convolutional architectures for vision." Thèse, 2009. http://hdl.handle.net/1866/3646.

Full text
Abstract:
Les tâches de vision artificielle telles que la reconnaissance d’objets demeurent irrésolues à ce jour. Les algorithmes d’apprentissage tels que les Réseaux de Neurones Artificiels (RNA) représentent une approche prometteuse permettant d’apprendre des caractéristiques utiles pour ces tâches. Ce processus d’optimisation est néanmoins difficile. Les réseaux profonds à base de Machines de Boltzmann Restreintes (RBM) ont récemment été proposés afin de guider l’extraction de représentations intermédiaires, grâce à un algorithme d’apprentissage non-supervisé. Ce mémoire présente, par l’entremise de trois articles, des contributions à ce domaine de recherche. Le premier article traite de la RBM convolutionnelle. L’usage de champs réceptifs locaux ainsi que le regroupement d’unités cachées en couches partageant les mêmes paramètres réduit considérablement le nombre de paramètres à apprendre et engendre des détecteurs de caractéristiques locaux et équivariants aux translations. Ceci mène à des modèles ayant une meilleure vraisemblance, comparativement aux RBM entraînées sur des segments d’images. Le deuxième article est motivé par des découvertes récentes en neurosciences. Il analyse l’impact d’unités quadratiques sur des tâches de classification visuelles, ainsi que celui d’une nouvelle fonction d’activation. Nous observons que les RNA à base d’unités quadratiques utilisant la fonction softsign donnent de meilleures performances de généralisation. Le dernier article, quant à lui, offre une vision critique des algorithmes populaires d’entraînement de RBM. Nous montrons que l’algorithme de Divergence Contrastive (CD) et la CD Persistante (PCD) ne sont pas robustes : tous deux nécessitent une surface d’énergie relativement plate afin que leur chaîne négative puisse bien mélanger. La PCD à « poids rapides » contourne ce problème en perturbant légèrement le modèle ; cependant, ceci génère des échantillons bruités.
L’usage de chaînes tempérées dans la phase négative est une façon robuste d’adresser ces problèmes et mène à de meilleurs modèles génératifs.
High-level vision tasks such as generic object recognition remain out of reach for modern Artificial Intelligence systems. A promising approach involves learning algorithms, such as the Artificial Neural Network (ANN), which automatically learn to extract useful features for the task at hand. For ANNs, however, this represents a difficult optimization problem. Deep Belief Networks have thus been proposed as a way to guide the discovery of intermediate representations, through a greedy unsupervised training of stacked Restricted Boltzmann Machines (RBM). The articles presented herein represent contributions to this field of research. The first article introduces the convolutional RBM. By mimicking local receptive fields and tying the parameters of hidden units within the same feature map, we considerably reduce the number of parameters to learn and enforce local, shift-equivariant feature detectors. This translates to better likelihood scores, compared to RBMs trained on small image patches. In the second article, recent discoveries in neuroscience motivate an investigation into the impact of higher-order units on visual classification, along with the evaluation of a novel activation function. We show that ANNs with quadratic units using the softsign activation function achieve lower generalization error across several tasks. Finally, the third article gives a critical look at recently proposed RBM training algorithms. We show that Contrastive Divergence (CD) and Persistent CD are brittle in that they require the energy landscape to be smooth in order for their negative chain to mix well. PCD with fast-weights addresses the issue by performing small model perturbations, but may result in spurious samples. We propose using simulated tempering to draw negative samples. This leads to better generative models and increased robustness to various hyperparameters.
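For reference, the Contrastive Divergence algorithm critiqued in the third article estimates the RBM gradient by running the negative Gibbs chain for only k steps from the data (k = 1 below). A minimal sketch for a binary RBM follows; the shapes and the single-sample update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_gradient(v0, W, b, c):
    """One CD-1 update for a binary RBM: the negative phase runs the
    Gibbs chain for a single step starting from the data vector v0."""
    ph0 = sigmoid(v0 @ W + c)                       # P(h = 1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + b)                     # reconstruction probs
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)                       # mean-field negative stats
    dW = v0[:, None] * ph0[None, :] - v1[:, None] * ph1[None, :]
    return dW, v0 - v1, ph0 - ph1                   # gradients for W, b, c

v = np.array([1.0, 0.0, 1.0, 1.0])
W = 0.01 * rng.normal(size=(4, 3))
dW, db, dc = cd1_gradient(v, W, np.zeros(4), np.zeros(3))
```

The article's criticism applies exactly here: the single-step negative chain only mixes well when the energy landscape is smooth.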
APA, Harvard, Vancouver, ISO, and other styles
42

Scellier, Benjamin. "A deep learning theory for neural networks grounded in physics." Thesis, 2020. http://hdl.handle.net/1866/25593.

Full text
Abstract:
Au cours de la dernière décennie, l'apprentissage profond est devenu une composante majeure de l'intelligence artificielle, ayant mené à une série d'avancées capitales dans une variété de domaines. L'un des piliers de l'apprentissage profond est l'optimisation de fonction de coût par l'algorithme du gradient stochastique (SGD). Traditionnellement en apprentissage profond, les réseaux de neurones sont des fonctions mathématiques différentiables, et les gradients requis pour l'algorithme SGD sont calculés par rétropropagation. Cependant, les architectures informatiques sur lesquelles ces réseaux de neurones sont implémentés et entraînés souffrent d’inefficacités en vitesse et en énergie, dues à la séparation de la mémoire et des calculs dans ces architectures. Pour résoudre ces problèmes, le neuromorphique vise à implementer les réseaux de neurones dans des architectures qui fusionnent mémoire et calculs, imitant plus fidèlement le cerveau. Dans cette thèse, nous soutenons que pour construire efficacement des réseaux de neurones dans des architectures neuromorphiques, il est nécessaire de repenser les algorithmes pour les implémenter et les entraîner. Nous présentons un cadre mathématique alternative, compatible lui aussi avec l’algorithme SGD, qui permet de concevoir des réseaux de neurones dans des substrats qui exploitent mieux les lois de la physique. Notre cadre mathématique s'applique à une très large classe de modèles, à savoir les systèmes dont l'état ou la dynamique sont décrits par des équations variationnelles. La procédure pour calculer les gradients de la fonction de coût dans de tels systèmes (qui dans de nombreux cas pratiques ne nécessite que de l'information locale pour chaque paramètre) est appelée “equilibrium propagation” (EqProp). 
Comme beaucoup de systèmes en physique et en ingénierie peuvent être décrits par des principes variationnels, notre cadre mathématique peut potentiellement s'appliquer à une grande variété de systèmes physiques, dont les applications vont au delà du neuromorphique et touchent divers champs d'ingénierie.
In the last decade, deep learning has become a major component of artificial intelligence, leading to a series of breakthroughs across a wide variety of domains. The workhorse of deep learning is the optimization of loss functions by stochastic gradient descent (SGD). Traditionally in deep learning, neural networks are differentiable mathematical functions, and the loss gradients required for SGD are computed with the backpropagation algorithm. However, the computer architectures on which these neural networks are implemented and trained suffer from speed and energy inefficiency issues, due to the separation of memory and processing in these architectures. To solve these problems, the field of neuromorphic computing aims at implementing neural networks on hardware architectures that merge memory and processing, just like brains do. In this thesis, we argue that building large, fast and efficient neural networks on neuromorphic architectures also requires rethinking the algorithms to implement and train them. We present an alternative mathematical framework, also compatible with SGD, which offers the possibility to design neural networks in substrates that directly exploit the laws of physics. Our framework applies to a very broad class of models, namely those whose state or dynamics are described by variational equations. This includes physical systems whose equilibrium state minimizes an energy function, and physical systems whose trajectory minimizes an action functional (principle of least action). We present a simple procedure to compute the loss gradients in such systems, called equilibrium propagation (EqProp), which requires solely locally available information for each trainable parameter. Since many models in physics and engineering can be described by variational principles, our framework has the potential to be applied to a broad variety of physical systems, whose applications extend to various fields of engineering, beyond neuromorphic computing.
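The EqProp gradient estimate can be checked on a scalar toy system: run a free phase minimizing the energy E, a nudged phase minimizing E + βC, and difference the parameter gradient of E across the two phases. The quadratic energy and cost below are an illustrative choice, not one of the thesis's physical systems.

```python
# Scalar toy system: energy E(s) = (s - theta*x)^2 / 2, cost C(s) = (s - y)^2 / 2.
theta, x, y, beta = 0.7, 2.0, 1.0, 1e-4

s_free = theta * x                               # free phase: s minimizes E
s_nudged = (theta * x + beta * y) / (1 + beta)   # nudged phase: minimizes E + beta*C

def dE_dtheta(s):
    # Partial derivative of the energy with respect to the parameter theta.
    return -x * (s - theta * x)

# EqProp estimate: finite difference of dE/dtheta across the two phases.
grad_eqprop = (dE_dtheta(s_nudged) - dE_dtheta(s_free)) / beta

# Analytic loss gradient for comparison, with s_free = theta*x.
grad_true = x * (theta * x - y)
```

As β → 0 the estimate converges to the true loss gradient, using only quantities local to the parameter, which is the property that makes the scheme attractive for physical substrates.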
APA, Harvard, Vancouver, ISO, and other styles
43

Carvalho, João Gabriel Marques. "Electricity consumption forecast model for the DEEC based on machine learning tools." Master's thesis, 2020. http://hdl.handle.net/10316/90148.

Full text
Abstract:
Dissertação de Mestrado Integrado em Engenharia Electrotécnica e de Computadores apresentada à Faculdade de Ciências e Tecnologia
Nesta tese apresentaremos o trabalho sobre a criação de uma rede neuronal de aprendizagem automática, capaz de realizar previsões energéticas. Com o aumento do consumo energético, devem ser desenvolvidas ferramentas capazes de prever o consumo. Esta necessidade levou à pesquisa deste tema. Procura-se explicar a história da aprendizagem automática, o que é a aprendizagem automática e como é que esta funciona. Também se procura explicar os seus antecedentes matemáticos, a utilização de redes neuronais e que ferramentas foram atualmente desenvolvidas, de forma a criar soluções de aprendizagem automática. A aprendizagem automática consiste num programa informático que, após treino, é capaz de desempenhar tarefas de forma similar à mente humana. A rede neuronal (ANN) é uma das mais importantes ferramentas de aprendizagem automática, através da qual se pode obter informação fundamental. Para prever o consumo de energia no Departamento de Engenharia Eletrotécnica e de Computadores (DEEC) da Universidade de Coimbra, uma rede neuronal foi treinada usando dados reais do consumo total das torres do DEEC. Python foi a linguagem utilizada e recorreu-se ao algoritmo de regressão de aprendizagem supervisionada. Com esta previsão, comparam-se os dados obtidos com os dados reais, o que permite a sua análise. Os dados usados no treino da rede neuronal vão de 2015/julho/10 a 2017/dezembro/31, num total de 906 dias. Por cada dia do ano existe um máximo de 3 valores, considerando-se assim uma amostra pequena. A comparação final entre os dados reais e os dados previstos foi somente realizada no mês de janeiro de 2018. A partir dos dados obtidos realizaram-se previsões, apesar de um certo nível de discrepância, justificada pela pequena quantidade de dados disponíveis. No futuro, devem-se aumentar os dados de treino de forma a obter um maior número de variáveis de entrada. O principal objetivo proposto nesta tese foi atingido com sucesso.
Com toda a pesquisa apresentada, buscou-se criar informação que permitisse ser um marco na criação de melhores soluções. Este é um campo extraordinário que no futuro permitirá elevar os nossos conhecimentos a outros níveis.
In this thesis, the design of a machine learning neural network capable of making energy predictions is presented. With the increase in energy consumption, tools for the prediction of energy consumption are gaining great importance and their implementation is required. This concern is the main goal of the presented work. We strive to explain the history of machine learning, what machine learning is, and how it works. It is also sought to explain the mathematical background and use of neural networks and what tools have been developed nowadays to create machine learning solutions. Machine learning is a computer program that, after training, can perform tasks in a similar way to the human mind. The neural network (ANN) is one of the most used and important machine learning solutions, through which pivotal data can be obtained. For predicting the energy consumption at the Department of Electrical and Computer Engineering (DEEC) of the University of Coimbra, a neural network was trained using real data from the overall consumption of the DEEC towers. Python was the language used, together with a supervised learning regression algorithm. With this prediction, we finally compare our predicted data with real data so that we may analyze it. The data used in the training of the neural network goes from 2015/July/10 to 2017/December/31, a total of 906 days. For each day of the year there is a maximum of 3 values, which is considered a small sample, but the only one available. The final comparison between real and predicted data was only done for the month of January 2018. From the data achieved, predictions were made, but with a certain level of discrepancy, which is explained by the low amount of data available. In the future, one of the things that should be considered is to enlarge the training datasets, considering a larger amount of input variables. The main goal proposed for this thesis was successfully achieved.
With all the presented research, we strove to create a stepping stone toward better solutions. This is an extraordinary field that in the future will be able to elevate our knowledge to a completely different level.
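A supervised regression setup of the kind described, predicting a day's consumption from recent history and holding out the final month for comparison, can be sketched on synthetic data as follows. The lag features, the weekly seasonality, and the linear model are illustrative assumptions, not the thesis's actual network or DEEC data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy daily consumption series with weekly seasonality plus noise,
# matching the 906-day span mentioned in the abstract.
days = np.arange(906)
load = 100 + 10 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 2, days.size)

def make_windows(series, lag=7):
    """Supervised regression framing: predict a day from the previous `lag` days."""
    X = np.stack([series[i:i + lag] for i in range(len(series) - lag)])
    return X, series[lag:]

X, y = make_windows(load)
X = np.hstack([X, np.ones((len(X), 1))])   # bias column
split = len(X) - 31                        # hold out the final month
w, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
pred = X[split:] @ w
mae = np.abs(pred - y[split:]).mean()      # compare predictions with "real" data
```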
APA, Harvard, Vancouver, ISO, and other styles
