Dissertations / Theses on the topic 'Convolutional model'


1

Kramer, Tyler Christian. "The Polarimetric Impulse Response and Convolutional Model for the Remote Sensing of Layered Vegetation." Thesis, Virginia Tech, 2007. http://hdl.handle.net/10919/41732.

Full text
Abstract:
To date, no complete, computationally efficient, physics-based model exists to compute the radar backscatter from forest canopies. Several models predict the backscatter coefficient of random forest canopies using Vector Radiative Transfer (VRT) theory with some success; however, these models often rely on purely time-harmonic formulations and approximations to integrals. VRT models have recently been developed that account for a Gaussian-pulse incident waveform, but they often rely heavily on very specific and opaque approximations to solve the associated integrals. This thesis addresses the problem by outlining a method by which existing, proven, time-harmonic solutions to the VRT equation can be modified to account for arbitrary pulse waveforms through a simple path-delay method. These techniques lend physical insight into the actual scattering mechanisms behind the returned waveform and explain why previous authors' approximations break down in certain regions. Furthermore, these radiative transfer solutions can be reformulated into a convolutional model capable of quickly and accurately predicting the radar return of random volumes. A brief overview of radiative transfer theory as it applies to remote sensing is also given.
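The convolutional model named in the abstract treats the received waveform as the transmitted pulse convolved with the medium's impulse response. A minimal sketch of that operation, with an illustrative pulse and a toy two-path impulse response (not values from the thesis):

```python
# Hedged sketch: discrete convolution of a transmitted pulse with a medium
# impulse response, the core operation of a convolutional scattering model.
# The pulse shape and impulse response below are illustrative placeholders.

def convolve(pulse, impulse_response):
    """Full discrete convolution: len(result) = len(pulse) + len(h) - 1."""
    n = len(pulse) + len(impulse_response) - 1
    out = [0.0] * n
    for i, p in enumerate(pulse):
        for j, h in enumerate(impulse_response):
            out[i + j] += p * h
    return out

# A short rectangular pulse and a two-path impulse response:
# a direct return plus a weaker, delayed echo.
pulse = [1.0, 1.0, 1.0]
h = [0.5, 0.0, 0.0, 0.25]
echo = convolve(pulse, h)
```

The delayed second tap in `h` stretches the returned waveform, which is the kind of path-delay effect the thesis folds into existing time-harmonic solutions.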
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
2

Huss, Anders. "Hybrid Model Approach to Appliance Load Disaggregation : Expressive appliance modelling by combining convolutional neural networks and hidden semi Markov models." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-179200.

Full text
Abstract:
Increasing energy consumption is one of the greatest environmental challenges of our time. Residential buildings account for a considerable part of total electricity consumption and constitute a sector with large demonstrated savings potential. Non-Intrusive Load Monitoring (NILM), i.e. deducing the electricity consumption of individual home appliances from the total electricity consumption of a household, is a compelling approach to delivering appliance-specific consumption feedback to consumers. This enables informed choices and can promote sustainable and cost-saving actions. To achieve this, accurate and reliable appliance load disaggregation algorithms must be developed. This Master's thesis proposes a novel approach to the disaggregation problem inspired by state-of-the-art algorithms in the field of speech recognition. Previous approaches, for sampling frequencies in the 1 Hz range, have primarily focused on different types of hidden Markov models (HMMs) and occasionally artificial neural networks (ANNs). HMMs are a natural representation of electric appliances; however, with a purely generative approach to disaggregation, essentially all appliances have to be modelled simultaneously. Due to the large number of possible appliances and the variation between households, this is a major challenge: it imposes strong restrictions on the complexity, and thus the expressiveness, of each appliance model to keep inference feasible. In this thesis, disaggregation is instead treated as a factorisation problem in which each appliance signal has to be extracted from its background. A hybrid model is proposed, where a convolutional neural network (CNN) extracts features that correlate with the state of a single appliance, and these features are used as observations for a hidden semi-Markov model (HSMM) of that appliance.
Since this allows a single appliance to be modelled in isolation, it becomes computationally feasible to use a more expressive Markov model. As a proof of concept, the hybrid model is evaluated on 238 days of 1 Hz power data collected from six households, predicting the power usage of each household's washing machine. The hybrid model performs considerably better than a CNN alone, and a significant further increase in performance is achieved by including transitional features in the HSMM.
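The hybrid idea above can be sketched in miniature: per-timestep state scores (standing in for the CNN's learned features) drive a two-state on/off appliance model decoded with Viterbi. A real HSMM also models explicit state durations; this first-order HMM is a simplified stand-in, and all numbers are illustrative.

```python
# Simplified stand-in for the CNN + HSMM hybrid: Viterbi decoding of an
# on/off appliance state sequence from per-timestep pseudo-likelihoods.
import math

def viterbi(obs_loglik, log_trans, log_init):
    """Most likely state sequence; obs_loglik[t][s] scores state s at time t."""
    n_states = len(log_init)
    score = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    back = []
    for obs_t in obs_loglik[1:]:
        ptr, new = [], []
        for s in range(n_states):
            p = max(range(n_states), key=lambda q: score[q] + log_trans[q][s])
            ptr.append(p)
            new.append(score[p] + log_trans[p][s] + obs_t[s])
        back.append(ptr)
        score = new
    s = max(range(n_states), key=lambda q: score[q])
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    return path[::-1]

ln = math.log
# "Sticky" off/on transitions and illustrative per-step state probabilities,
# as a CNN feature extractor might produce them.
trans = [[ln(0.8), ln(0.2)], [ln(0.2), ln(0.8)]]
init = [ln(0.5), ln(0.5)]
obs = [[ln(a), ln(b)] for a, b in
       [(0.9, 0.1), (0.1, 0.9), (0.1, 0.9), (0.1, 0.9), (0.9, 0.1)]]
states = viterbi(obs, trans, init)
```

Decoding yields an off, on, on, on, off sequence: the sticky transitions smooth the noisy per-step evidence, which is what a duration-aware HSMM does more explicitly.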
3

Meng, Zhaoxin. "A deep learning model for scene recognition." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-36491.

Full text
Abstract:
Scene recognition is an active research topic in image recognition. It deserves attention because it supports scene understanding and provides important contextual information for object recognition. Traditional approaches to scene recognition still have many shortcomings, and in recent years deep learning methods based on convolutional neural networks (CNNs) have achieved state-of-the-art results in this area. This thesis constructs a model based on multi-layer feature extraction from a CNN and transfer learning for scene recognition tasks. Because scene images often contain multiple objects, the convolutional layers may carry useful local semantic information that is lost in the fully connected layers. The thesis therefore builds on an existing improvement to the traditional CNN architecture that enhances convolutional-layer information, extracting it with Fisher Vector encoding. It then applies transfer learning to bring in knowledge from two different domains, scenes and objects, and combines the output of the two networks to achieve better results. The method is implemented in Python with PyTorch and evaluated on two well-known scene datasets, UIUC-Sports and Scene-15. Compared with the traditional AlexNet CNN architecture, the result improves from 81% to 93% on UIUC-Sports and from 79% to 91% on Scene-15, showing that the method performs well on scene recognition tasks.
4

Barai, Milad, and Anthony Heikkinen. "Impact of data augmentations when training the Inception model for image classification." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215727.

Full text
Abstract:
Image classification is the process of identifying the class to which a previously unobserved object belongs. Classifying images is a common task in companies, many of which currently perform it manually; automated classification, however, has a lower expected accuracy. This thesis examines how automated classification can be improved by adding augmented data to the classifier's learning process. We conduct a quantitative empirical study of the effects of two image augmentations: random horizontal/vertical flips and random rotations (<180°). The data come from an auction house search engine operating under the commercial name Barnebys. The data sets contain 700,000, 50,000 and 28,000 images respectively, each covering 28 classes. In this bachelor's thesis we re-trained a convolutional neural network, the Inception-v3 model, on the two larger data sets, using the remaining set to obtain class-specific accuracies. To get more reliable estimates of the effects we used ten-fold cross-validation. The results show that the Inception-v3 model reaches a baseline mean accuracy of 64.5% (700,000-image set) and 51.1% (50,000-image set). Overall accuracy decreased with augmentations on our data sets, but some classes improved: the largest observed increase was for the class "Wine & Spirits" in the small data set, which went from 42.3% to 72.7% correctly classified images.
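The two augmentations studied can be sketched on an image stored as a list of rows. Arbitrary-angle rotation needs interpolation (e.g. via Pillow), so only the flips are shown here; this layout is illustrative, not Inception-v3 preprocessing.

```python
# Hedged sketch of random horizontal/vertical flip augmentation on a tiny
# image represented as a list of rows. Illustrative only.
import random

def hflip(img):
    """Mirror each row: a horizontal flip."""
    return [row[::-1] for row in img]

def vflip(img):
    """Reverse the row order: a vertical flip."""
    return img[::-1]

def random_flips(img, rng):
    """Apply each flip independently with probability 0.5."""
    if rng.random() < 0.5:
        img = hflip(img)
    if rng.random() < 0.5:
        img = vflip(img)
    return img

img = [[1, 2],
       [3, 4]]
```

Because each flip fires independently, a training pipeline built this way presents the classifier with up to four orientations of every image.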
5

Tan, Ke. "Convolutional and recurrent neural networks for real-time speech separation in the complex domain." The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1626983471600193.

Full text
6

Zhang, Xu. "Modeling & Performance Analysis of QAM-based COFDM System." University of Toledo / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1310148963.

Full text
7

Geras, Krzysztof Jerzy. "Exploiting diversity for efficient machine learning." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/28839.

Full text
Abstract:
A common practice for solving machine learning problems is currently to consider each problem in isolation, starting from scratch every time a new learning problem is encountered or a new model is proposed. This is a perfectly feasible solution when the problems are sufficiently easy or when, for hard problems, a large amount of resources, in terms of both training data and computation, is available. Although this naive approach has been the main focus of machine learning research for decades and has had a lot of success, it becomes infeasible when the problem is too hard in proportion to the available resources. With a complex model, the naive approach requires collecting large data sets (if possible at all) to avoid overfitting, and hence also large computational resources: first during training, to process the data, and then at test time, to execute the complex model. An alternative to treating each learning problem independently is to leverage related data sets and the computation encapsulated in previously trained models. By doing so we can decrease the amount of data necessary to reach a satisfactory level of performance, improve the achievable accuracy, and decrease training time. Our attack on this problem is to exploit diversity: in the structure of the data set, in the features learnt, and in the inductive biases of different neural network architectures. In the setting of learning from multiple sources we introduce multiple-source cross-validation, which gives an unbiased estimator of the test error when the data set is composed of data from multiple sources and the test data come from a new, unseen source. We also propose new estimators of the variance of standard k-fold cross-validation and of multiple-source cross-validation, with lower bias than previously known ones.
To improve unsupervised learning we introduce scheduled denoising autoencoders, which learn a more diverse set of features than the standard denoising autoencoder. This is due to their training procedure, which starts with a high level of noise, while the network learns coarse features, and then lowers the noise gradually, allowing the network to learn more local features. A connection between this training procedure and curriculum learning is also drawn. We develop the idea of learning a diverse representation further by explicitly incorporating that goal into the training objective. The proposed model, the composite denoising autoencoder, learns multiple subsets of features focused on modelling variations in the data set at different levels of granularity. Finally, we introduce the idea of model blending, a variant of model compression in which the two models, the teacher and the student, are both strong but differ in their inductive biases. As an example, we train convolutional networks under the guidance of bidirectional long short-term memory (LSTM) networks. This allows the convolutional network to be trained to be more accurate than the LSTM network at no extra cost at test time.
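The multiple-source cross-validation described above boils down to a leave-one-source-out split: each fold tests on a source never seen during training, mirroring deployment on data from a new source. A minimal sketch with illustrative source labels:

```python
# Hedged sketch of leave-one-source-out splitting, the mechanism behind
# multiple-source cross-validation. Source names are illustrative.

def multiple_source_splits(sources):
    """Yield (train_sources, held_out_source) pairs, one fold per source."""
    for held_out in sources:
        train = [s for s in sources if s != held_out]
        yield train, held_out

folds = list(multiple_source_splits(["src_a", "src_b", "src_c"]))
```

Averaging the test error over these folds estimates performance on data from a genuinely new source, which ordinary k-fold cross-validation cannot do when folds mix sources.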
8

Appelstål, Michael. "Multimodal Model for Construction Site Aversion Classification." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-421011.

Full text
Abstract:
Aversions on construction sites can be anything from missing material to fire hazards or insufficient cleaning. These aversions appear very often on construction sites, and the construction company needs to report and deal with them for the site to run correctly. The reports consist of an image of the aversion and a text describing it. Report categorization is currently done manually, which is both time- and cost-ineffective. The task for this thesis was to implement and evaluate an automatic multimodal machine-learning classifier for the reported aversions that utilizes both the image and text data from the reports. The model presented is a late-fusion model consisting of a Swedish BERT text classifier and a VGG16 image classifier. The results showed that an automated classifier is feasible for this task and could be used in practice to make classification more time- and cost-efficient. The model scored 66.2% accuracy and 89.7% top-5 accuracy, and the experiments revealed some areas of improvement in the data and model that could be further explored to potentially improve performance.
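Late fusion as described above can be sketched as a weighted average of the class distributions of two independent classifiers (here standing in for the BERT text model and the VGG16 image model). The logits and the equal weighting are illustrative, not values from the thesis.

```python
# Hedged late-fusion sketch: average two models' class probabilities.
import math

def softmax(logits):
    """Convert raw scores to a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(text_logits, image_logits, w_text=0.5):
    """Weighted average of the text and image models' class distributions."""
    p_text, p_image = softmax(text_logits), softmax(image_logits)
    return [w_text * t + (1.0 - w_text) * i for t, i in zip(p_text, p_image)]

fused = late_fusion([2.0, 0.0, 0.0], [0.0, 1.0, 0.0])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Fusing at the probability level keeps the two backbones independent, so either one can be retrained or swapped without touching the other.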
9

Ujihara, Rintaro. "Multi-objective optimization for model selection in music classification." Thesis, KTH, Optimeringslära och systemteori, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370.

Full text
Abstract:
With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, how to preprocess music samples and which classification algorithm to choose depend on the data set and the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to find the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural networks, for the classification task. For each model, hyperparameters are tuned and the model is evaluated according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model accomplished 95% correct classification with an execution time of less than 15 seconds.
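The Pareto-optimal model selection above can be sketched directly: keep every model that no other model beats on both accuracy and execution time. The candidate models and numbers are illustrative.

```python
# Hedged sketch of Pareto-front filtering over (accuracy, execution time).

def pareto_front(models):
    """models: (name, accuracy, seconds); higher accuracy and lower time win."""
    front = []
    for name, acc, sec in models:
        dominated = any(
            a >= acc and s <= sec and (a > acc or s < sec)
            for _, a, s in models
        )
        if not dominated:
            front.append(name)
    return front

candidates = [("cnn", 0.95, 14.0), ("svm", 0.90, 3.0), ("mlp", 0.88, 9.0)]
```

Here the "mlp" is dominated (the "svm" is both more accurate and faster), so the front keeps only the accuracy-optimal and speed-optimal trade-offs for a human to choose between.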
10

Ghibellini, Alessandro. "Trend prediction in financial time series: a model and a software framework." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24708/.

Full text
Abstract:
This research aims to build an autonomous support tool for traders that could in the future be turned into an active ETF. The thesis places a strong focus on problem formulation and an accurate analysis of how the inputs and the length of the future horizon affect the results. I demonstrate that, using financial indicators already applied daily by professional traders and a suitable future-horizon length, it is possible to reach interesting scores in forecasting future market states, with accuracy around 90% in all experiments and confusion matrices that confirm these scores, without an expensive deep learning approach; in particular, I used a 1D CNN. I also show that classification is the best way to address this type of prediction, combined with proper handling of unbalanced class weights; class imbalance is standard in this setting, and ignoring it makes the model react to inconsistent trend movements. Finally, I propose a framework, applicable to other fields as well, that exploits domain experts' knowledge and combines this information with ML/DL approaches.
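One common way to handle the unbalanced trend classes mentioned above is inverse-frequency class weighting; the exact scheme used in the thesis is not specified, so this is a generic sketch with illustrative labels.

```python
# Hedged sketch of inverse-frequency class weights for an unbalanced
# trend-classification task. The weighting scheme and labels are illustrative.
from collections import Counter

def class_weights(labels):
    """w_c = n_samples / (n_classes * count_c): rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

weights = class_weights(["up"] * 8 + ["down"] * 2)
```

Passing such weights into the loss makes the rare "down" class count as much as the common "up" class, so the model cannot score well by predicting the majority trend everywhere.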
11

Svensk, Gustav. "TDNet : A Generative Model for Taxi Demand Prediction." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158514.

Full text
Abstract:
Supplying the right number of taxis in the right place at the right time is very important for taxi companies. In this paper, the machine learning model Taxi Demand Net (TDNet) is presented, which predicts short-term taxi demand in different zones of a city. It is based on WaveNet, a causal dilated convolutional neural network for time-series generation. TDNet uses historical demand from recent years and features such as time of day, day of week and day of month to produce 26-hour taxi demand forecasts for all zones in a city. It has been applied to one city in northern Europe and one in South America. In the northern European city, an error of one taxi or less per hour per zone was achieved in 64% of the cases; in the South American city the figure was 40%. In both cities it beat the SARIMA and stacked-ensemble benchmarks. This performance was achieved by tuning the hyperparameters with a Bayesian optimization algorithm. Weather and holiday features were also added as inputs for the northern European city, but they did not improve the accuracy of TDNet.
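The causal dilated convolution at the heart of WaveNet-style models like TDNet can be sketched in a few lines: each output depends only on present and past inputs, with gaps of `dilation` steps between filter taps. The input series and weights are illustrative.

```python
# Hedged sketch of a single causal dilated 1-D convolution, the building
# block of WaveNet-style architectures. Values are illustrative.

def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k*dilation], zero-padded on the left."""
    out = []
    for t in range(len(x)):
        y = 0.0
        for k, w in enumerate(weights):
            i = t - k * dilation
            if i >= 0:
                y += w * x[i]
        out.append(y)
    return out

y = causal_dilated_conv([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0], dilation=2)
```

With dilation 2 each output sums the current sample and the one two steps back; stacking layers with growing dilations lets the network see long demand histories without large filters.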
12

Melcherson, Tim. "Image Augmentation to Create Lower Quality Images for Training a YOLOv4 Object Detection Model." Thesis, Uppsala universitet, Signaler och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-429146.

Full text
Abstract:
Research in the Arctic is of ever-growing importance, and modern technology is used in new ways to map and understand this very complex region and how it is affected by climate change. Here, animals and vegetation are tightly coupled with their environment in a fragile ecosystem, and when the environment undergoes rapid changes these ecosystems risk severe damage. Understanding what kinds of data could be used in artificial intelligence is important, as many research stations have data archives from decades of work in the Arctic. In this thesis, a YOLOv4 object detection model has been trained on two classes of images to investigate the performance impact of disturbances in the training data set. An expanded data set was created by augmenting the initial data with various disturbances. A model was successfully trained on the augmented data set, and a correlation between worse performance and the presence of noise was detected, but changes in saturation and altered colour levels had less impact than expected. Reducing noise in gathered data thus seems more important than enhancing images with lacking colour levels. Further investigation with a larger and more thoroughly processed data set is required to gain a clearer picture of the impact of the various disturbances.
13

Suchánek, Tomáš. "Detektor tempa hudebních nahrávek na bázi neuronové sítě." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442576.

Full text
Abstract:
This Master's thesis deals with beat tracking systems whose functionality is based on neural networks. It describes the structure of these systems and how the signal is processed in their individual blocks. Emphasis is placed on recurrent and temporal convolutional networks, which by their nature can effectively detect tempo and beats in audio recordings. The selected methods, network architectures and their modifications are implemented within a comprehensive detection system, which is then tested and evaluated through cross-validation on a genre-diverse data set. The results show that the system, with the proposed temporal convolutional network architecture, produces results comparable with published work: on the SMC dataset it proved the most successful, while on the other datasets it fell slightly below the accuracy of state-of-the-art systems. In addition, the proposed network retains low computational complexity despite an increased number of internal parameters.
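A quick way to see why stacked dilated convolutions suit beat tracking is to compute their receptive field: it grows with the sum of the dilations, so a few cheap layers cover a long audio context. The kernel size and dilation schedule below are illustrative, not the thesis's architecture.

```python
# Hedged sketch: receptive field of a stack of dilated causal conv layers,
# the reason temporal convolutional networks see long contexts cheaply.

def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of the stacked dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Four layers with kernel size 3 and doubling dilations.
rf = receptive_field(3, [1, 2, 4, 8])
```

Four small layers already see 31 input frames; doubling the dilation each layer makes the receptive field grow exponentially with depth while the parameter count grows only linearly.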
14

Ionascu, Beatrice. "Modelling user interaction at scale with deep generative methods." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239333.

Full text
Abstract:
Understanding how users interact with a company's service is essential for data-driven businesses that want to cater better to their users and improve their offering. A generative machine learning approach makes it possible to model user behaviour and generate new data to simulate, or to recognize and explain, typical usage patterns. In this work we introduce an approach for modelling users' interaction behaviour at scale in a client-service setting. We propose a novel representation of multivariate time-series data as "time pictures" that express temporal correlations through spatial organization. This representation exhibits two key properties that convolutional networks are built to exploit, which allows us to develop an approach based on deep generative models with convolutional networks as a backbone. In introducing this approach to feature learning for time-series data, we extend the application of convolutional neural networks to the multivariate time-series domain, and specifically to user interaction data. We adopt a variational approach inspired by the β-VAE framework in order to learn hidden factors that define different user behaviour patterns. We explore different values of the regularization parameter β and show that it is possible to construct a model that learns a latent representation of identifiable, distinct user behaviours. We show on real-world data that the model generates realistic samples that capture the true population-level statistics of the interaction behaviour data, learns different user behaviours, and provides accurate imputations of missing data.
15

Velander, Alice, and Harrysson David Gumpert. "Do Judge a Book by its Cover! : Predicting the genre of book covers using supervised deep learning. Analyzing the model predictions using explanatory artificial intelligence methods and techniques." Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177691.

Full text
Abstract:
In Storytel's application, in which users can read and listen to digitalized literature, a user is shown a list of books where the first things encountered are the book title and cover. A book cover is therefore essential to attract a consumer's attention. In this study, we take a data-driven approach to investigating design principles for book covers through deep learning models and explainable AI. The first aim is to explore how well a Convolutional Neural Network (CNN) can interpret and classify a book cover image by genre in a multi-class classification task. The second aim is to increase model interpretability and investigate correlations between model features and genres. With the help of the explanatory artificial intelligence method Gradient-weighted Class Activation Mapping (Grad-CAM), we analyze the pixel-wise contributions to the model prediction. In addition, object detection with YOLOv3 was implemented to investigate which objects are detectable and recurring in the book covers. An interplay between Grad-CAM and YOLOv3 was used to investigate how identified objects and features correlate with a specific book genre and, ultimately, to answer what makes a good book cover. Using a state-of-the-art CNN architecture we achieve an accuracy of 48%, with the best class-wise accuracies for the genres Erotica, Economy & Business and Children (73%, 67% and 66% respectively). Quantitative results from the Grad-CAM and YOLOv3 interplay show some strong associations between objects and genres, while indicating weak associations between abstract design principles and genres. Furthermore, a qualitative analysis of the Grad-CAM visualizations shows strong relevance of certain objects and text fonts for specific book genres. It was also observed that the portrayal of a feature was relevant to the model's prediction for certain genres.
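The Grad-CAM computation used above reduces to a simple recipe: weight each convolutional activation map by its global-average-pooled gradient, sum over channels, and apply ReLU so only class-positive evidence remains. The tiny 2x2 maps below are illustrative.

```python
# Hedged sketch of the Grad-CAM heatmap computation on toy activation and
# gradient maps. Map sizes and values are illustrative.

def grad_cam(activations, gradients):
    """activations, gradients: [channel][row][col] from the last conv layer."""
    h, w = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for act, grad in zip(activations, gradients):
        alpha = sum(sum(row) for row in grad) / (h * w)   # channel weight
        for i in range(h):
            for j in range(w):
                cam[i][j] += alpha * act[i][j]
    # ReLU: keep only locations with positive influence on the class score.
    return [[max(0.0, v) for v in row] for row in cam]

acts = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [0.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(acts, grads)
```

The second channel's negative gradient gives it a negative weight, so its activations are suppressed and the heatmap highlights only regions that push the prediction toward the genre in question.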
APA, Harvard, Vancouver, ISO, and other styles
16

Ma, Xiren. "Deep Learning-Based Vehicle Recognition Schemes for Intelligent Transportation Systems." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42247.

Full text
Abstract:
With increasingly highlighted security concerns in Intelligent Transportation Systems (ITS), Vision-based Automated Vehicle Recognition (VAVR) has recently attracted considerable attention. A comprehensive VAVR system contains three components: Vehicle Detection (VD), Vehicle Make and Model Recognition (VMMR), and Vehicle Re-identification (VReID). These components perform coarse-to-fine recognition tasks in three steps. A VAVR system can be widely used in suspicious vehicle recognition, urban traffic monitoring, and automated driving systems. Vehicle recognition is complicated by the subtle visual differences between vehicle models, so how to build a VAVR system that can recognize vehicle information quickly and accurately has gained tremendous attention. In this work, taking advantage of emerging deep learning methods, which have powerful feature extraction and pattern learning abilities, we propose several models for vehicle recognition. First, we propose a novel Recurrent Attention Unit (RAU) to extend the standard Convolutional Neural Network (CNN) architecture for VMMR. RAU learns to recognize the discriminative parts of a vehicle on multiple scales and builds up a connection with the prominent information in a recurrent way. The proposed ResNet101-RAU achieves excellent recognition accuracies of 93.81% on the Stanford Cars dataset and 97.84% on the CompCars dataset. Second, to construct efficient vehicle recognition models, we simplify the structure of RAU and propose a Lightweight Recurrent Attention Unit (LRAU). The proposed LRAU extracts discriminative part features by generating attention masks to locate the keypoints of a vehicle (e.g., logo, headlights). Each attention mask is generated from the feature maps received by the LRAU and the attention state produced by the preceding LRAU. Then, by adding LRAUs to standard CNN architectures, we construct three efficient VMMR models.
Our models achieve state-of-the-art results, with 93.94% accuracy on the Stanford Cars dataset, 98.31% accuracy on the CompCars dataset, and 99.41% on the NTOU-MMR dataset. In addition, we construct a one-stage Vehicle Detection and Fine-grained Recognition (VDFG) model by combining our LRAU with a general object detection model. Results show the proposed VDFG model achieves excellent performance at real-time processing speed. Third, to address the VReID task, we design the Compact Attention Unit (CAU). CAU has a compact structure and relies on a single attention map to extract the discriminative local features of a vehicle. We add two CAUs to a truncated ResNet to construct a small but efficient VReID model, ResNetT-CAU. Compared with the original ResNet, the model size of ResNetT-CAU is reduced by 60%. Extensive experiments on the VeRi and VehicleID datasets indicate the proposed ResNetT-CAU achieves the best re-identification results on both datasets. In summary, the experimental results on the challenging benchmark VMMR and VReID datasets indicate our models achieve the best VMMR and VReID performance with a small model size and fast image processing speed.
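The recurrent attention step described for the LRAU (a mask computed from the current feature maps and the attention state handed over by the preceding unit) can be caricatured in a few lines of NumPy. The weights and the mean-pooling below are placeholders for illustration only, not the learned parameters of the thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_step(features, prev_state, w_f=1.0, w_s=1.0):
    """Toy recurrent attention step.

    features: (C, H, W) feature maps of the current stage.
    prev_state: (H, W) attention state from the preceding unit.
    Returns the attended features and the new attention state (the mask).
    """
    # Combine current evidence (channel-pooled features) with the
    # recurrent state; w_f and w_s stand in for learned weights.
    energy = w_f * features.mean(axis=0) + w_s * prev_state
    mask = sigmoid(energy)          # values in (0, 1), one per location
    attended = features * mask      # broadcast the mask over channels
    return attended, mask
```

In the real model the mask would highlight keypoints such as the logo or headlights; here it simply gates each spatial location.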
APA, Harvard, Vancouver, ISO, and other styles
17

Yang, Ruochen. "Diagnosis of Evaporative Emissions Control System Using Physics-based and Machine Learning Methods." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587651390226087.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Thanikasalam, Kokul. "Appearance based online visual object tracking." Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/130875/1/Kokul_Thanikasalam_Thesis.pdf.

Full text
Abstract:
This thesis presents research contributions to the field of computer-vision-based visual object tracking. The study investigates appearance-based object tracking using traditional hand-crafted features and deep features. The thesis proposes a real-time, high-accuracy tracking framework that follows a deep similarity tracking strategy, as well as several deep tracking frameworks for high-accuracy tracking and for managing spatial information loss. The research findings can be used in a range of applications, including visual surveillance systems.
APA, Harvard, Vancouver, ISO, and other styles
19

Зайяд, Абдаллах Мухаммед. "Ecrypted Network Classification With Deep Learning." Master's thesis, КПІ ім. Ігоря Сікорського, 2020. https://ela.kpi.ua/handle/123456789/34069.

Full text
Abstract:
This dissertation consists of 84 pages, 59 figures, and 29 sources in the reference list. Problem: As the world becomes more security conscious, more encryption protocols have been employed to ensure secure data transmission between communicating parties. Network classification has become more of a hassle with some techniques, since inspecting encrypted traffic can be illegal in some countries. This has hindered network engineers from classifying traffic to differentiate encrypted from unencrypted traffic. Purpose of work: This thesis addresses the problems caused by previous techniques used in encrypted network classification, some of which are limited by data size and computational power, and employs a deep learning algorithm to solve this problem. The main tasks of the research: 1. Review previous traditional techniques and compare their advantages and disadvantages. 2. Study previous related works in the current field of research. 3. Propose a more modern and efficient method and algorithm for encrypted network traffic classification. The object of research: A simple artificial neural network algorithm for accurate and reliable network traffic classification that is independent of data size and computational power. The subject of research: Based on data collected from private traffic flows in our own network simulation tool, we use the proposed method to identify differences in network traffic payloads and to classify network traffic, separating encrypted from unencrypted traffic. Research methods: Experimental. We carried out our experiment by simulating a network and gathering traffic from different unencrypted and encrypted protocols.
Using the Python programming language and the Keras library, we developed a convolutional neural network that takes in the payload of the gathered traffic, trains the model, and classifies the traffic in our test set with high accuracy, without requiring high computational power.
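A common preprocessing step for payload-based classifiers of this kind is to truncate or zero-pad each raw payload to a fixed length and rescale the bytes to [0, 1] before feeding them to the network. The thesis does not spell out its exact input encoding, so the sketch below is an assumption about that step, not the author's code.

```python
import numpy as np

def payload_to_vector(payload: bytes, length: int = 784) -> np.ndarray:
    """Fixed-length, normalized representation of a raw packet payload.

    Truncates payloads longer than `length`, zero-pads shorter ones,
    and scales each byte from [0, 255] to [0.0, 1.0].
    """
    buf = np.frombuffer(payload[:length], dtype=np.uint8)
    vec = np.zeros(length, dtype=np.float32)
    vec[:buf.size] = buf / 255.0
    return vec
```

Vectors produced this way can be stacked into a batch and passed to a 1-D convolutional network such as the Keras model the abstract describes.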
APA, Harvard, Vancouver, ISO, and other styles
20

Manrique, Tito. "Functional linear regression models : application to high-throughput plant phenotyping functional data." Thesis, Montpellier, 2016. http://www.theses.fr/2016MONTT264/document.

Full text
Abstract:
Functional data analysis (FDA) is a branch of statistics that is increasingly used in many applied scientific fields such as biological experimentation, finance, and physics. One reason for this is the use of new data collection technologies that increase the number of observations during a time interval. Functional datasets are samples of realizations of random functions, which are measurable functions defined on a probability space with values in an infinite-dimensional functional space. Among the many questions FDA studies, functional linear regression is one of the most studied, both in applications and in methodological development. The objective of this thesis is the study of functional linear regression models when both the covariate X and the response Y are random functions and both are time-dependent. In particular, we address the question of how the history of a random function X influences the current value of another random function Y at any given time t. To do this, we are mainly interested in three models: the functional concurrent model (FCCM), the functional convolution model (FCVM), and the historical functional linear model. In particular, for the FCVM and FCCM we propose estimators which are consistent, robust, and faster to compute than others already proposed in the literature. Our estimation method for the FCCM extends the ridge regression method developed in the classical linear case to the functional data framework. We prove the convergence in probability of this estimator, obtain a rate of convergence, and develop an optimal selection procedure for the regularization parameter. The FCVM allows us to study the influence of the history of X on Y in a simple way through convolution. In this case we use the continuous Fourier transform to define an estimator of the functional coefficient; this operator transforms the convolution model into an associated FCCM in the frequency domain.
The consistency and rate of convergence of the estimator are derived from the FCCM. The FCVM can be generalized to the historical functional linear model, which is itself a particular case of the fully functional linear model. Thanks to this, we use the Karhunen–Loève estimator of the historical kernel. The related question of estimating the covariance operator of the noise in the fully functional linear model is also treated. Finally, we use all the aforementioned models to study the interaction between Vapour Pressure Deficit (VPD) and Leaf Elongation Rate (LER) curves. This kind of data is obtained with a high-throughput plant phenotyping platform and is well suited to being studied with FDA methods.
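The key FCVM idea, that the Fourier transform turns the convolution model into a pointwise (concurrent) model in the frequency domain, can be sketched numerically with a discrete FFT and a ridge-like regularized division. This is an illustration of the principle on noiseless, circular data, not the thesis's continuous-time estimator.

```python
import numpy as np

def fcvm_fourier_estimate(x, y, eps=1e-8):
    """Estimate beta in the (circular) convolution model y = x * beta.

    In the frequency domain the model is pointwise: Y(w) = X(w) * B(w),
    so B is recovered by a regularized division, then inverted back.
    """
    X, Y = np.fft.fft(x), np.fft.fft(y)
    B = Y * np.conj(X) / (np.abs(X) ** 2 + eps)  # ridge-like division
    return np.real(np.fft.ifft(B))
```

With noisy data, `eps` plays the role of the regularization parameter whose optimal selection the thesis studies for the FCCM.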
APA, Harvard, Vancouver, ISO, and other styles
21

Alamgir, Nyma. "Computer vision based smoke and fire detection for outdoor environments." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/201654/1/Nyma_Alamgir_Thesis.pdf.

Full text
Abstract:
Surveillance-video-based detection of outdoor smoke and fire has been a challenging task due to chaotic variations in shape, movement, colour, texture, and density. This thesis contributes to the advancement of contemporary efforts in smoke and fire detection by proposing novel technical methods and their possible integration into a complete fire safety model. The novel contributions of this thesis include an efficient feature calculation method combining local and global texture properties, the development of deep-learning-based models, and a conceptual framework for incorporating weather information into the fire safety model for improved accuracy in fire prediction and detection.
APA, Harvard, Vancouver, ISO, and other styles
22

Martin, Victor. "Computing methods for facial aging prevention and prediction." Thesis, CentraleSupélec, 2019. http://www.theses.fr/2019CSUP0014.

Full text
Abstract:
The use of computer simulation to understand how human faces age has been a growing area of research for decades. It has been applied to the search for missing children as well as to the fields of entertainment, cosmetics, and dermatology research. Our objective is to elaborate a model of the age-related changes in facial cues which affect the perception of age, so that we may better predict them. In this work, a new framework for aging a face is proposed: the Wrinkle-Oriented Active Appearance Model. First, faces are decomposed in terms of appearance and shape using an Active Appearance Model. In addition, the wrinkles in each face are transformed into appearance and shape parameters. A new, effective way to model the distribution of wrinkle parameters in a face is introduced. Finally, it is shown that artificially aged faces produced by the system influence age perception better than those produced by two other systems. This framework is a first step in the construction of a more accurate facial aging system. In addition, a new health estimation system using a convolutional neural network is introduced. This system is able to estimate how a face is perceived by humans in terms of health. It is shown that this tool relies on the same facial features as human health perception. Finally, the impact on health perception of specific facial features never studied before is established.
APA, Harvard, Vancouver, ISO, and other styles
23

Boutin, Victor. "Etude d’un algorithme hiérarchique de codage épars et prédictif : vers un modèle bio-inspiré de la perception visuelle." Thesis, Aix-Marseille, 2020. http://www.theses.fr/2020AIXM0028.

Full text
Abstract:
Building models to efficiently represent images is a central and difficult problem in the machine learning community. The neuroscientific study of the early visual cortical areas is a great source of inspiration for finding economical and robust solutions. For instance, Sparse Coding (SC) is one of the most successful frameworks for modeling neural computation at the local scale in the visual cortex. At the structural scale of the ventral visual pathways, the Predictive Coding (PC) theory has been proposed to model top-down and bottom-up interactions between cortical regions. This thesis introduces a model called Sparse Deep Predictive Coding (SDPC) that combines Sparse Coding and Predictive Coding in a hierarchical and convolutional architecture. We analyze the SDPC from both a computational and a biological perspective. In terms of computation, the recurrent connectivity introduced by the PC framework allows the SDPC to converge to lower prediction errors with a higher convergence rate. In addition, we combine neuroscientific evidence with machine learning methods to analyze the impact of recurrent processing at both the neural organization level and the representational level. At the neural organization level, the feedback signal of the model accounts for a reorganization of the V1 association fields that promotes contour integration. At the representational level, the SDPC exhibits a significant denoising ability which is highly correlated with the strength of the feedback from V2 to V1. These results demonstrate that neuro-inspiration may be the right methodology for designing more powerful and more robust computer vision algorithms.
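The sparse coding building block underlying models like the SDPC can be illustrated with the classical ISTA iteration for a single linear dictionary. This is a generic sketch of sparse inference (minimizing a least-squares term plus an L1 penalty); the SDPC itself is hierarchical and convolutional, which is not reproduced here.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the L1 norm: shrink toward zero, clip at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista(D, y, lam=0.1, n_iter=50):
    """ISTA for the sparse coding problem min_a 0.5*||y - D a||^2 + lam*||a||_1.

    D: dictionary matrix (n_features, n_atoms); y: signal to encode.
    """
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # Gradient step on the data term, then soft-thresholding.
        a = soft_threshold(a + D.T @ (y - D @ a) / L, lam / L)
    return a
```

Each layer of a predictive coding hierarchy would run an inference of this kind while also exchanging prediction errors with its neighbors.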
APA, Harvard, Vancouver, ISO, and other styles
24

Štarha, Dominik. "Meření podobnosti obrazů s pomocí hlubokého učení." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2018. http://www.nusl.cz/ntk/nusl-377018.

Full text
Abstract:
This master's thesis deals with research on technologies using deep learning methods for processing image data. The specific focus of the work is to evaluate the suitability and effectiveness of deep learning for comparing two input images. The first, theoretical, part consists of an introduction to neural networks and deep learning; it also contains a description of the available methods used for processing image data, their principles, and their benefits. The second, practical, part of the thesis proposes an appropriate Siamese network model for comparing two input images and evaluating their similarity. The output of this work is an evaluation of several possible model configurations, highlighting the best-performing model parameters.
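The core of a Siamese network is a single embedding function, applied with shared weights to both inputs, whose outputs are compared by a distance. The sketch below shows that structure with a one-layer NumPy "branch"; the weight matrix and the tanh nonlinearity are placeholders, not the thesis's trained model.

```python
import numpy as np

def embed(x, w):
    """Shared embedding branch: the *same* weights are used for both inputs."""
    return np.tanh(w @ x)

def siamese_distance(x1, x2, w):
    """Similarity score for a pair of inputs: Euclidean distance between
    their embeddings. Smaller distance means 'more similar'."""
    return np.linalg.norm(embed(x1, w) - embed(x2, w))
```

Training (e.g., with a contrastive loss) would adjust `w` so that similar image pairs end up close and dissimilar pairs far apart.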
APA, Harvard, Vancouver, ISO, and other styles
25

Dupré, la Tour Tom. "Nonlinear models for neurophysiological time series." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLT018/document.

Full text
Abstract:
In neurophysiological time series, strong neural oscillations are observed in the mammalian brain, and the natural processing tools are thus centered on narrow-band linear filtering. As this approach is too reductive, we propose new methods to represent these signals. We first focus on the study of phase-amplitude coupling (PAC), which consists of an amplitude modulation of a high frequency band, time-locked to a specific phase of a slow neural oscillation. We propose to use driven autoregressive (DAR) models to capture PAC in a probabilistic model. Giving a proper model to the signal enables model selection using the likelihood of the model, which constitutes a major improvement in PAC estimation. We first present different parametrizations of DAR models, with fast inference algorithms and stability discussions. Then, we present how to use DAR models for PAC analysis, demonstrating the advantage of the model-based approach on three empirical datasets. Next, we explore different extensions to DAR models, estimating the driving signal from the data, PAC in multivariate signals, and spectro-temporal receptive fields. Finally, we also propose to adapt convolutional sparse coding (CSC) models to neurophysiological time series, extending them to heavy-tailed noise distributions and multivariate decompositions. We develop efficient inference algorithms for each formulation, and show that we obtain rich unsupervised signal representations.
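The "driven" idea behind DAR models can be illustrated with a toy simulation: the coefficient of an AR(1) process is modulated by a slow driver, so the fast signal's variance (hence its amplitude envelope) is locked to the driver's phase, a PAC-like effect. The specific coefficients below are arbitrary illustration values, not a fitted model.

```python
import numpy as np

def simulate_dar(n=1000, seed=0):
    """Toy driven autoregressive signal.

    The AR(1) coefficient a_t = 0.5 + 0.4 * driver_t depends on a slow
    sinusoidal driver, so the signal is more persistent (larger envelope)
    near the driver's peaks than near its troughs.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    driver = np.sin(2 * np.pi * t / 100.0)   # slow oscillation
    a = 0.5 + 0.4 * driver                   # driver-dependent AR coefficient
    y = np.zeros(n)
    for i in range(1, n):
        y[i] = a[i] * y[i - 1] + rng.standard_normal()
    return driver, y
```

Fitting such a model by maximum likelihood, rather than band-pass filtering, is what allows the thesis to compare PAC hypotheses via the likelihood.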
APA, Harvard, Vancouver, ISO, and other styles
26

Viebke, André. "Accelerated Deep Learning using Intel Xeon Phi." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-45491.

Full text
Abstract:
Deep learning, a sub-topic of machine learning inspired by biology, has attracted wide attention in industry and the research community recently. State-of-the-art applications in the areas of computer vision and speech recognition (among others) are built using deep learning algorithms. In contrast to traditional algorithms, where the developer fully instructs the application what to do, deep learning algorithms instead learn from experience when performing a task. For the algorithm to learn, however, requires training, which is a high computational challenge. High-performance computing can help ease the burden through parallelization, thereby reducing the training time; this is essential to fully utilize the algorithms in practice. While numerous works targeting GPUs have investigated ways to speed up the training, less attention has been paid to the Intel Xeon Phi coprocessor. In this thesis we present a parallelized implementation of a Convolutional Neural Network (CNN), a deep learning architecture, together with our proposed parallelization scheme, CHAOS. Additionally, a theoretical analysis and a performance model discuss the algorithm in detail and allow for predictions should even more threads become available in the future. The algorithm is evaluated on an Intel Xeon Phi 7120p, a Xeon E5-2695v2 2.4 GHz, and a Core i5 661 3.33 GHz using various architectures and thread counts on the MNIST dataset. Findings show 103.5x, 99.9x, and 100.4x speedups for the large, medium, and small architectures, respectively, for 244 threads compared to 1 thread on the coprocessor, and a 10.9x to 14.1x (large to small) speedup compared to the sequential version running on the Xeon E5. We managed to decrease training time from 7 days on the Core i5 and 31 hours on the Xeon E5 to 3 hours on the Intel Xeon Phi when training our large network for 15 epochs.
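The reported wall-clock figures can be sanity-checked with simple arithmetic (speedup is the ratio of sequential to parallel training time). The hour counts below are taken from the abstract; the per-architecture 10.9x to 14.1x speedups were measured separately, so the 31h/3h ratio is expected to land only in the same ballpark.

```python
# Speedup = sequential training time / parallel training time.
core_i5_hours = 7 * 24.0   # 7 days on the Core i5 (sequential)
xeon_e5_hours = 31.0       # 31 hours on the Xeon E5 (sequential)
xeon_phi_hours = 3.0       # 3 hours on the Xeon Phi (244 threads)

speedup_vs_e5 = xeon_e5_hours / xeon_phi_hours   # roughly 10.3x
speedup_vs_i5 = core_i5_hours / xeon_phi_hours   # 56x
```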
APA, Harvard, Vancouver, ISO, and other styles
27

Liu, Chenguang. "Low level feature detection in SAR images." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT015.

Full text
Abstract:
Dans cette thèse, nous développons des détecteurs de caractéristiques de bas niveau pour les images radar à synthèse d'ouverture (SAR) afin de faciliter l'utilisation conjointe des données SAR et optiques. Les segments de droite et les bords sont des caractéristiques de bas niveau très importantes dans les images qui peuvent être utilisées pour de nombreuses applications comme l'analyse ou le stockage d'images, ainsi que la détection d'objets. Alors qu'il existe de nombreux détecteurs efficaces pour les structures bas-niveau dans les images optiques, il existe très peu de détecteurs de ce type pour les images SAR, principalement en raison du fort bruit multiplicatif. Dans cette thèse, nous développons un détecteur de segment de droite générique et un détecteur de bords efficace pour les images SAR. Le détecteur de segment de droite proposé, nommé LSDSAR, est basé sur un modèle Markovien a contrario et le principe de Helmholtz, où les segments de droite sont validés en fonction d'une mesure de significativité. Plus précisément, un segment de droite est validé si son nombre attendu d'occurrences dans une image aléatoire sous l'hypothèse du modèle Markovien a contrario est petit. Contrairement aux approches habituelles a contrario, le modèle Markovien a contrario permet un filtrage fort dans l'étape de calcul du gradient, car les dépendances entre les orientations locales des pixels voisins sont autorisées grâce à l'utilisation d'une chaîne de Markov de premier ordre. Le détecteur de segments de droite basé sur le modèle Markovian a contrario proposé LSDSAR, bénéficie de la précision et l'efficacité de la nouvelle définition du modèle de fond, car de nombreux segments de droite vraie dans les images SAR sont détectés avec un contrôle du nombre de faux détections. 
De plus, très peu de réglages de paramètres sont requis dans les applications pratiques de LSDSAR.Dans la deuxième partie de cette thèse, nous proposons un détecteur de bords basé sur l'apprentissage profond pour les images SAR. Les contributions du détecteur de bords proposé sont doubles: 1) sous l'hypothèse que les images optiques et les images SAR réelles peuvent être divisées en zones constantes par morceaux, nous proposons de simuler un ensemble de données SAR à l'aide d'un ensemble de données optiques; 2) Nous proposons d'appliquer un réseaux de neurones convolutionnel classique, HED, directement sur les champs de magnitude des images. Ceci permet aux images de test SAR d'avoir des statistiques semblables aux images optiques en entrée du réseau. Plus précisément, la distribution du gradient pour toutes les zones homogènes est la même et la distribution du gradient pour deux zones homogènes à travers les frontières ne dépend que du rapport de leur intensité moyenne valeurs. Le détecteur de bords proposé, GRHED permet d'améliorer significativement l'état de l'art, en particulier en présence de fort bruit (images 1-look)
In this thesis we develop low-level feature detectors for Synthetic Aperture Radar (SAR) images to facilitate the joint use of SAR and optical data. Line segments and edges are very important low-level features in images, used in many applications such as image analysis, image registration and object detection. In contrast to the many efficient low-level feature detectors available for optical images, very few efficient line segment or edge detectors exist for SAR images, mostly because of the strong multiplicative noise. In this thesis we develop a generic line segment detector and an efficient edge detector for SAR images. The proposed line segment detector, named LSDSAR, is based on a Markovian a contrario model and the Helmholtz principle, where line segments are validated according to their meaningfulness. More specifically, a line segment is validated if its expected number of occurrences in a random image, under the hypothesis of the Markovian a contrario model, is small. Contrary to the usual a contrario approaches, the Markovian a contrario model allows strong filtering in the gradient computation step, since dependencies between local orientations of neighbouring pixels are permitted thanks to the use of a first-order Markov chain. The proposed detector LSDSAR benefits from the accuracy and efficiency of this new definition of the background model: many true line segments in SAR images are detected while the number of false detections is controlled. Moreover, very little parameter tuning is required in practical applications of LSDSAR. The second contribution of this thesis is a deep-learning-based edge detector for SAR images.
The contributions of the proposed edge detector are twofold: 1) under the hypothesis that both optical and real SAR images can be divided into piecewise-constant areas, we propose to simulate a SAR dataset from an optical dataset; 2) we propose to train a classical CNN (convolutional neural network) edge detector, HED, directly on the gradient fields of images. With an adequate method of computing the gradient, this enables SAR images at test time to have statistics similar to those of the training set as inputs to the network. More precisely, the gradient distribution is the same for all homogeneous areas, and the gradient distribution across the boundary between two homogeneous areas depends only on the ratio of their mean intensity values. The proposed method, GRHED, significantly improves the state of the art, especially in very noisy cases such as 1-look images.
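The gradient-by-ratio idea described above, where homogeneous areas under multiplicative noise give a contrast-invariant response, can be illustrated with a minimal sketch; the window size and function names below are our own, not the authors' code.

```python
# Hedged sketch: a ratio-based horizontal "gradient" for SAR-like rows,
# in the spirit of the gradient-by-ratio idea described in the abstract
# (the window size and names are illustrative, not the authors' code).
import math

def ratio_gradient_row(row, half=2):
    """Log-ratio of mean intensities of the windows left/right of each pixel.

    For multiplicative (speckle) noise, the ratio of local means is
    contrast-invariant in homogeneous areas, unlike a difference gradient.
    """
    grads = []
    for i in range(half, len(row) - half):
        left = sum(row[i - half:i]) / half
        right = sum(row[i:i + half]) / half
        grads.append(abs(math.log(right / left)))
    return grads

# A homogeneous strip gives zero response; an edge gives a large one.
flat = [10.0] * 8
edge = [10.0] * 4 + [40.0] * 4
print(max(ratio_gradient_row(flat)))        # 0.0
print(max(ratio_gradient_row(edge)) > 1.0)  # True
```

This is why the abstract can claim the same gradient statistics for all homogeneous areas: scaling a homogeneous region by any constant leaves the ratio of local means unchanged.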
APA, Harvard, Vancouver, ISO, and other styles
28

Hansen, Vedal Amund. "Comparing performance of convolutional neural network models on a novel car classification task." Thesis, KTH, Medieteknik och interaktionsdesign, MID, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-213468.

Full text
Abstract:
Recent neural network advances have led to models that can be used for a variety of image classification tasks, useful for many of today's media technology applications. In this paper, I train hallmark neural network architectures on a newly collected vehicle image dataset to do both coarse- and fine-grained classification of vehicle type. The results show that the neural networks can learn to distinguish both between many very different classes and between a few very similar ones, reaching 50.8% accuracy on 28 classes and 61.5% on the 5 most challenging, despite noisy images and labels in the dataset.
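The coarse- versus fine-grained evaluation described above can be sketched as follows; the class names and the fine-to-coarse mapping are invented for illustration and are not taken from the thesis.

```python
# Hedged sketch: scoring coarse- vs fine-grained vehicle predictions.
# The classes and the FINE_TO_COARSE mapping are illustrative only.
def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

FINE_TO_COARSE = {"sedan": "car", "suv": "car", "city_bus": "bus"}

fine_true = ["sedan", "suv", "city_bus", "sedan"]
fine_pred = ["suv", "suv", "city_bus", "sedan"]

fine_acc = accuracy(fine_pred, fine_true)
coarse_acc = accuracy([FINE_TO_COARSE[p] for p in fine_pred],
                      [FINE_TO_COARSE[y] for y in fine_true])
print(fine_acc, coarse_acc)  # coarse accuracy can only be >= fine accuracy
```

A fine-grained prediction that is correct stays correct after coarsening, while a fine-grained mistake may still land in the right coarse class, which is why coarse accuracy is always at least as high.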
APA, Harvard, Vancouver, ISO, and other styles
29

Poliak, Sebastián. "Mobilní aplikace využívající hlubokých konvolučních neuronových sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2018. http://www.nusl.cz/ntk/nusl-385895.

Full text
Abstract:
This thesis describes the process of creating a mobile application using deep convolutional neural networks. The process starts with a proposal of the main idea, followed by product and technical design, implementation, and evaluation. The thesis also explores the technical background of image recognition and chooses the most suitable options for the purpose of the application. These are object detection and multi-label classification, which are both implemented, evaluated and compared. The resulting application tries to bring value from both a user and a technical point of view.
APA, Harvard, Vancouver, ISO, and other styles
30

Xu, Boqing, and 許博卿. "Convolutional perfectly matched layers for finite element modeling of wave propagation in unbounded domains." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/208043.

Full text
Abstract:
A general convolutional version of the perfectly matched layer (PML) formulation for second-order wave equations, with displacement as the only unknown and based on coordinate stretching, is proposed in this study; it overcomes the limitation of the classical PML, which requires splitting the displacement field, and needs only minor modifications to existing finite element programs. The first contribution concerns the development of a robust and efficient finite element program, QUAD-CPML, based on QUAD4M and capable of simulating wave propagation in an unbounded domain. The more efficient hybrid-stress finite element was incorporated into the program to reduce the number of iterations for the equivalent linear dynamic analysis and the total time for direct time integration. The incorporation of the new element types was verified against QUAD4M solutions to problems of dynamic soil response, and the efficiency of the hybrid-stress finite element was demonstrated in comparison with classical finite elements. The second development involves the implementation of a general convolutional perfectly matched layer (CPML) as an absorbing boundary condition for modelling the radiation of wave energy into an unbounded domain. The proposed non-split CPML formulation is displacement-based, which makes it highly compatible with direct time integration. This CPML formulation treats the convolutional terms as external forces and includes an updating scheme to calculate the temporal convolution terms arising from the Fourier transform. In addition, the performance of the CPML has been examined on various problems, including a parametric study of a number of key coefficients that control the absorbing ability of the CPML boundary. The final task of this thesis is to apply the developed CPML models to dynamic analyses of soil-structure interaction (SSI) problems. Typical loading conditions, including external load on the structure and underground wave excitation of the medium, have been considered.
Practical applications of the CPML models include a numerical study of the effectiveness of rubber-soil mixture (RSM) as an earthquake protection material and a study of vibrations induced by the passage of a high-speed train. The former investigates the effectiveness of the CPML models for evaluating the performance of RSM under seismic excitation, and the latter tests the boundary effects on the accuracy of the results for train-induced vibrations. Both studies show that CPML as an absorbing boundary condition is theoretically sound and effective for the analysis of soil-structure dynamic response.
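The updating scheme for the temporal convolution terms mentioned above is commonly implemented as a recursive convolution: the convolution of a field derivative with an exponential kernel is advanced one step at a time instead of being re-integrated. The sketch below follows the standard Roden–Gedney form of the CPML coefficients, written dimensionless with illustrative parameter values; it is not the thesis's own code.

```python
import math

# Hedged sketch of the recursive-convolution update used in CPML-type
# absorbing layers (coefficients in the common Roden-Gedney form, made
# dimensionless; d, kappa, alpha and dt are illustrative values).
def cpml_coefficients(d, kappa, alpha, dt):
    b = math.exp(-(d / kappa + alpha) * dt)
    a = d / (kappa * (d + kappa * alpha)) * (b - 1.0) if d != 0 else 0.0
    return a, b

def update_memory(psi, dfield, a, b):
    """One time step of the convolution term: psi <- b*psi + a*dfield."""
    return b * psi + a * dfield

a, b = cpml_coefficients(d=50.0, kappa=1.0, alpha=0.1, dt=1e-3)
psi = 0.0
for _ in range(1000):            # a constant derivative drives the memory
    psi = update_memory(psi, 1.0, a, b)
print(abs(b) < 1.0)  # True: the kernel decays, so the recursion is stable
```

Because 0 < b < 1, the memory variable converges to a/(1-b) under constant forcing, which is the property that makes such schemes usable in direct time integration.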
published_or_final_version
Civil Engineering
Doctoral
Doctor of Philosophy
APA, Harvard, Vancouver, ISO, and other styles
31

Meftah, Rabie. "Une approche par formalisme de green réduit pour le calcul des structures en contact dynamique : application au contact pneumatique/chaussée." Phd thesis, Université Paris-Est, 2011. http://pastel.archives-ouvertes.fr/pastel-00665546.

Full text
Abstract:
This thesis is part of the effort to reduce road traffic noise. Tyre/road contact is the main source of this noise at speeds above 50 km/h. In this context, a new approach to modelling the dynamic behaviour of a tyre rolling on a rigid road surface is developed. For the tyre, a periodic model is adopted to compute the tyre's Green's functions in the contact zone. This model considerably reduces the computation time and allows the tyre to be modelled over a wide frequency band. The model is validated by comparison with a classical finite element model built in the Abaqus software. Usually, the time response of the tyre can be computed by convolving the Green's functions with the contact forces. This technique is very costly in computation time, so we adopt a new approach: the idea is to decompose the Green's functions on a modal basis. The modal parameters are then used to build a faster convolution. The modal convolution is adapted to the contact problem by adding a kinematic contact condition. The contact model is compared with the penalty method on an academic example; it has the advantages of stability and ease of implementation. In the last part of this work, the contact model is applied to a tyre rolling on different types of road surface. The spectral content of the contact forces is studied as a function of the rolling speed and the road roughness. To build the contact model of a real tyre on a real road, several examples of increasing complexity are treated. The model of a circular ring on an elastic foundation is studied extensively in this thesis, both analytically and numerically.
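The modal convolution idea can be sketched as follows: the Green's function is decomposed onto damped modes, and the response is obtained by convolving each mode's impulse response with the contact force. All modal parameters below are invented for illustration; only one mode is shown.

```python
import math

# Hedged sketch of a "modal convolution": one damped mode of a Green's
# function convolved with a contact force (modal parameters invented).
def modal_impulse(amp, freq, zeta, dt, n):
    wd = 2 * math.pi * freq
    return [amp * math.exp(-zeta * wd * i * dt) * math.sin(wd * i * dt)
            for i in range(n)]

def convolve(signal, kernel):
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j in range(min(i + 1, len(kernel))):
            out[i] += kernel[j] * signal[i - j]
    return out

dt, n = 1e-3, 200
g = modal_impulse(amp=1.0, freq=20.0, zeta=0.05, dt=dt, n=n)
force = [1.0] + [0.0] * (n - 1)          # unit impulse contact force
u = convolve(force, g)
# A unit impulse reproduces the modal impulse response exactly.
print(all(abs(ui - gi) < 1e-12 for ui, gi in zip(u, g)))  # True
```

In practice the speed-up described in the abstract comes from replacing this direct convolution with a recursive update per mode, which the exponential form of the kernel makes possible.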
APA, Harvard, Vancouver, ISO, and other styles
32

Tang, Yuxing. "Weakly supervised learning of deformable part models and convolutional neural networks for object detection." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEC062/document.

Full text
Abstract:
In this dissertation we address the problem of weakly supervised object detection, wherein the goal is to recognize and localize objects in weakly-labeled images where object-level annotations are incomplete during training. To this end, we propose two methods which learn two different models for the objects of interest. In our first method, we propose a model enhancing the weakly supervised Deformable Part-based Models (DPMs) by emphasizing the importance of location and size of the initial class-specific root filter. We first compute a candidate pool that represents the potential locations of the object as this root filter estimate, by exploring the generic objectness measurement (region proposals) to combine the most salient regions and “good” region proposals. We then propose learning of the latent class label of each candidate window as a binary classification problem, by training category-specific classifiers used to coarsely classify a candidate window into either a target object or a non-target class. Furthermore, we improve detection by incorporating the contextual information from image classification scores. Finally, we design a flexible enlarging-and-shrinking post-processing procedure to modify the DPMs outputs, which can effectively match the approximate object aspect ratios and further improve final accuracy. Second, we investigate how knowledge about object similarities from both visual and semantic domains can be transferred to adapt an image classifier to an object detector in a semi-supervised setting on a large-scale database, where a subset of object categories are annotated with bounding boxes. We propose to transform deep Convolutional Neural Networks (CNN)-based image-level classifiers into object detectors by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers to detectors for categories without bounding box annotations. 
We have evaluated both our approaches extensively on several challenging detection benchmarks, e.g., PASCAL VOC, ImageNet ILSVRC and Microsoft COCO. Both approaches compare favorably to the state of the art and show significant improvement over several other recent weakly supervised detection methods.
APA, Harvard, Vancouver, ISO, and other styles
33

Mascarenhas, Helena. "Convolution type operators on cones and asymptotic spectral theory." Doctoral thesis, [S.l. : s.n.], 2004. http://deposit.ddb.de/cgi-bin/dokserv?idn=970638809.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Dronzeková, Michaela. "Analýza polygonálních modelů pomocí neuronových sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-417253.

Full text
Abstract:
This thesis deals with rotation estimation for a 3D model of a human jaw. It describes and compares methods for direct analysis of 3D models as well as methods that analyze the model using rasterization. To evaluate the performance of the proposed methods, a metric is used that counts the cases in which the prediction was less than 30° from the ground truth. The proposed rasterization-based method takes three x-ray views of the model as input and processes them with a convolutional network; it achieves the best performance, 99% under the described metric. The method that directly analyzes the polygonal model as a sequence uses an attention mechanism and was inspired by the transformer architecture. A special pooling function is proposed for this network that decreases its memory requirements. This method achieves 88%; it does not use rasterization and can process the polygonal model directly. It is not as good as the rasterization method with the x-ray rendering, but it is better than the rasterization method without the x-ray rendering. The last method uses a graph representation of the mesh. The graph network had problems with overfitting, which is why it did not achieve good results; this method appears less suitable for analyzing polygonal models.
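The evaluation metric described above, the share of predictions within 30° of the ground truth, can be sketched as follows; the wrap-around handling of angles is our assumption, not a detail taken from the thesis.

```python
# Hedged sketch of an "error below 30 degrees" metric for rotation
# estimation (circular wrap-around handling is our own assumption).
def within_30_deg(pred_angles, true_angles):
    hits = 0
    for p, t in zip(pred_angles, true_angles):
        err = abs(p - t) % 360.0
        err = min(err, 360.0 - err)      # shortest way around the circle
        hits += err < 30.0
    return hits / len(true_angles)

score = within_30_deg([10.0, 350.0, 200.0, 90.0], [0.0, 5.0, 20.0, 100.0])
print(score)  # 0.75: three of four predictions fall within 30 degrees
```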
APA, Harvard, Vancouver, ISO, and other styles
35

Xu, (Bill) Ke. "Efficient parameterization and estimation of spatio-temporal dynamic models /." free to MU campus, to others for purchase, 2004. http://wwwlib.umi.com/cr/mo/fullcit?p3137766.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Highlander, Tyler Clayton. "Conditional Dilated Attention Tracking Model - C-DATM." Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1564652134758139.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Lundberg, Gustav. "Automatic map generation from nation-wide data sources using deep learning." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-170759.

Full text
Abstract:
The last decade has seen great advances within the field of artificial intelligence. One of the most noteworthy areas is deep learning, which is nowadays used in everything from self-driving cars to automated cancer screening. During the same time, the amount of spatial data encompassing not only two but three dimensions has also grown, and whole cities and countries are being scanned. Combining these two technological advances enables the creation of detailed maps with a multitude of applications, civilian as well as military. This thesis aims at combining two data sources covering most of Sweden, laser data from LiDAR scans and a surface model from aerial images, with deep learning to create maps of the terrain. The target is to learn a simplified version of orienteering maps, as these are created with high precision by experienced map makers and represent how easy or hard it would be to traverse a given area on foot. The performance on different types of terrain is measured, and it is found that open land and larger bodies of water are identified at a high rate, while trails are hard to recognize. It is further investigated how the different densities found in the source data affect the performance of the models; some terrain types, trails for instance, benefit from higher-density data, while other features of the terrain, like roads and buildings, are predicted with higher accuracy from lower-density data. Finally, the certainty of the predictions is discussed and visualised by measuring the average entropy of predictions in an area. These visualisations highlight that although the predictions are far from perfect, the models are more certain about their predictions when they are correct than when they are not.
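The certainty visualisation described in the last sentence can be sketched as the mean entropy of the predicted class distributions over an area; the probability values below are illustrative only.

```python
import math

# Hedged sketch: average entropy of predicted class distributions in an
# area (low mean entropy = confident model; probabilities are invented).
def mean_entropy(prob_maps):
    total = 0.0
    for probs in prob_maps:
        total += -sum(p * math.log(p) for p in probs if p > 0)
    return total / len(prob_maps)

confident = [[0.97, 0.02, 0.01]] * 4   # e.g. open land, water
uncertain = [[0.4, 0.35, 0.25]] * 4    # e.g. trails
print(mean_entropy(confident) < mean_entropy(uncertain))  # True
```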
APA, Harvard, Vancouver, ISO, and other styles
38

Kratzert, Ludvig. "Adversarial Example Transferability to Quantized Models." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177590.

Full text
Abstract:
Deep learning has proven to be a major leap in machine learning, allowing completely new problems to be solved. While flexible and powerful, neural networks have the disadvantage of being large and demanding high performance from the devices on which they are run. In order to deploy neural networks on more, and simpler, devices, techniques such as quantization, sparsification and tensor decomposition have been developed. These techniques have shown promising results, but their effects on model robustness against attacks remain largely unexplored. In this thesis, Universal Adversarial Perturbations (UAP) and the Fast Gradient Sign Method (FGSM) are tested against VGG-19 as well as versions of it compressed using 8-bit quantization, TensorFlow’s float16 quantization, and 8-bit and 4-bit single layer quantization as introduced in this thesis. The results show that UAP transfers well to all quantized models, while the transferability of FGSM is high to the float16 quantized model, lower to the 8-bit models, and high to the 4-bit SLQ model. We suggest that this disparity arises from the universal adversarial perturbations’ having been trained on multiple examples rather than just one, which has previously been shown to increase transferability. The results also show that quantizing a single layer, the first layer in this case, can have a disproportionate impact on transferability.
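The FGSM attack tested in the abstract above follows the well-known update x_adv = x + ε · sign(∇x L). A dependency-free sketch, with a hand-picked stand-in for the loss gradient rather than a real network:

```python
# Hedged sketch of the Fast Gradient Sign Method: the gradient values
# here are a stand-in for d(loss)/dx from a trained model.
def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, grad, eps):
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

x = [0.2, 0.5, 0.8]
grad = [0.03, -0.01, 0.0]      # stand-in for the loss gradient w.r.t. x
x_adv = fgsm(x, grad, eps=0.1)
print(x_adv)
```

Each input component moves by exactly ε in the direction that increases the loss, which is what makes the perturbation both bounded and effective.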

The thesis work was carried out at the Department of Science and Technology (ITN), Faculty of Science and Engineering, Linköping University.

APA, Harvard, Vancouver, ISO, and other styles
39

Dabiri, Sina. "Semi-Supervised Deep Learning Approach for Transportation Mode Identification Using GPS Trajectory Data." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/86845.

Full text
Abstract:
Identification of travelers' transportation modes is a fundamental step for various problems that arise in the domain of transportation such as travel demand analysis, transport planning, and traffic management. This thesis aims to identify travelers' transportation modes purely based on their GPS trajectories. First, a segmentation process is developed to partition a user's trip into GPS segments with only one transportation mode. A majority of studies have proposed mode inference models based on hand-crafted features, which might be vulnerable to traffic and environmental conditions. Furthermore, the classification task in almost all models has been performed in a supervised fashion while a large amount of unlabeled GPS trajectories has remained unused. Accordingly, a deep SEmi-Supervised Convolutional Autoencoder (SECA) architecture is proposed to not only automatically extract relevant features from GPS segments but also exploit useful information in unlabeled data. The SECA integrates a convolutional-deconvolutional autoencoder and a convolutional neural network into a unified framework to concurrently perform supervised and unsupervised learning. The two components are simultaneously trained using both labeled and unlabeled GPS segments, which have already been converted into an efficient representation for the convolutional operation. An optimum schedule for varying the balancing parameters between reconstruction and classification errors is also implemented. The performance of the proposed SECA model, trip segmentation, the method for converting a raw trajectory into a new representation, the hyperparameter schedule, and the model configuration are evaluated by comparing to several baselines and alternatives for various amounts of labeled and unlabeled data. The experimental results demonstrate the superiority of the proposed model over the state-of-the-art semi-supervised and supervised methods with respect to metrics such as accuracy and F-measure.
Master of Science
Identifying users' transportation modes (e.g., bike, bus, train, and car) is a key step towards many transportation-related problems including (but not limited to) transport planning, transit demand analysis, auto ownership, and transportation emissions analysis. Traditionally, the information for analyzing travelers' behavior in choosing transport mode(s) was obtained through travel surveys. High cost, low response rate, time-consuming manual data collection, and misreporting are the main demerits of survey-based approaches. With the rapid growth of ubiquitous GPS-enabled devices (e.g., smartphones), a constant stream of users' trajectory data can be recorded. A user's GPS trajectory is a sequence of GPS points, recorded by means of a GPS-enabled device, in which a GPS point contains the information of the device's geographic location at a particular moment. In this research, users' GPS trajectories, rather than traditional resources, are harnessed to predict their transportation mode by means of statistical models. With respect to the statistical models, a wide range of studies have developed travel mode detection models using hand-designed attributes and classical learning techniques. Nonetheless, hand-crafted features cause some main shortcomings, including vulnerability to traffic uncertainties and biased engineering justification in generating effective features. A potential solution to address these issues is to leverage deep learning frameworks that are capable of capturing abstract features from the raw input in an automated fashion. Thus, in this thesis, deep learning architectures are exploited in order to identify transport modes based on only raw GPS tracks. It is worth noting that a significant portion of trajectories in GPS data might not be annotated by a transport mode, and the acquisition of labeled data is a more expensive and labor-intensive task in comparison with collecting unlabeled data.
Thus, utilizing the unlabeled GPS trajectories (i.e., the GPS trajectories that have not been annotated by a transport mode) is a cost-effective approach for improving the prediction quality of the travel mode detection model. Therefore, the unlabeled GPS data are also leveraged by developing a novel deep-learning architecture that is capable of extracting information from both labeled and unlabeled data. The experimental results demonstrate the superiority of the proposed models over the state-of-the-art methods in the literature with respect to several performance metrics.
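The balancing of reconstruction and classification errors with a varying schedule, as described above, can be sketched as follows; the linear schedule and the example weights are illustrative, not the thesis's tuned values.

```python
# Hedged sketch of a semi-supervised loss with a varying balancing
# schedule: reconstruction dominates early training, classification
# later (the linear schedule here is illustrative only).
def combined_loss(recon_err, class_err, epoch, total_epochs):
    alpha = 1.0 - epoch / total_epochs   # reconstruction weight decays...
    beta = epoch / total_epochs          # ...classification weight grows
    return alpha * recon_err + beta * class_err

early = combined_loss(recon_err=2.0, class_err=1.0, epoch=0, total_epochs=10)
late = combined_loss(recon_err=2.0, class_err=1.0, epoch=10, total_epochs=10)
print(early, late)  # 2.0 1.0
```

The design intuition is that the autoencoder branch, trained on all (including unlabeled) segments, shapes the features first, after which the classifier branch takes over.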
APA, Harvard, Vancouver, ISO, and other styles
40

Guan, Xiao. "Deterministic and Flexible Parallel Latent Feature Models Learning Framework for Probabilistic Knowledge Graph." Thesis, Mittuniversitetet, Avdelningen för informationssystem och -teknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-35788.

Full text
Abstract:
Knowledge graphs are a rising topic in the field of artificial intelligence. As the current trend in knowledge representation, knowledge graph research utilizes the large knowledge bases freely available on the internet. Knowledge graphs also allow inspection, analysis, and reasoning over all the knowledge they represent. To enable the ambitious idea of modeling the knowledge of the world, different theories and implementations have emerged. Nowadays, we have the opportunity to use freely available information from Wikipedia and Wikidata. This thesis investigates and formulates a theory about learning from knowledge graphs, focusing on probabilistic knowledge graphs and, in particular, on a branch called latent feature models. These models aim to predict possible relationships between connected entities and relations, and there are many models for such a task. The metrics and training process are described in detail and improved in this work. The resulting efficiency and correctness enable us to build a more complex model with confidence. The thesis also covers problems encountered in the study and proposes future work.
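Latent feature models of the kind studied here score candidate triples from entity and relation embeddings; a minimal sketch using DistMult, a standard member of this family, with hand-picked vectors (the embeddings would normally be learned, not chosen by hand):

```python
# Hedged sketch of a latent feature model for link prediction: DistMult
# scores a triple (h, r, t) as the trilinear product of embeddings.
# The embedding vectors below are hand-picked for illustration.
def distmult_score(h, r, t):
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

head = [0.9, 0.1]
rel = [1.0, 1.0]
good_tail = [0.9, 0.1]    # entity plausible under this relation
bad_tail = [-0.9, 0.2]
print(distmult_score(head, rel, good_tail)
      > distmult_score(head, rel, bad_tail))  # True
```

Training such a model amounts to learning embeddings so that observed triples score higher than corrupted ones, which is the prediction task the abstract describes.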
APA, Harvard, Vancouver, ISO, and other styles
41

Радюк, Павло Михайлович, and Pavlo Radiuk. "Інформаційна технологія раннього діагностування пневмонії за індивідуальним підбором параметрів моделі класифікації медичних зображень легень." Дисертація, Хмельницький національний університет, 2021. http://elar.khnu.km.ua/jspui/handle/123456789/11937.

Full text
Abstract:
The dissertation is devoted to solving the topical scientific and applied problem of automating the diagnosis of viral pneumonia from medical images of the lungs, by developing an information technology for early pneumonia diagnosis based on the individual selection of the parameters of a lung-image classification model. Applying the developed information technology for early pneumonia diagnosis in clinical practice increases the accuracy and reliability of identifying pneumonia at early stages from medical images of the human chest. The object of the research is the process of diagnosing pneumonia from medical chest images. The subject of the research is the models, methods and tools of information technology for the early diagnosis of pneumonia from medical chest images. The dissertation establishes the relevance of applying information technologies to the digital diagnosis of lung diseases from medical chest images. An analysis of methods and approaches to pneumonia detection shows that neural network models are the best basis for developing an early-diagnosis information technology. Methods for tuning the neural network model and approaches to explaining and interpreting the results of lung-disease identification are investigated. Based on the analysis of current approaches, methods and information technologies for the early-stage diagnosis of lung disease from medical chest images, the need for an information technology for early pneumonia diagnosis is substantiated.
APA, Harvard, Vancouver, ISO, and other styles
42

Diallo, Boubacar. "Mesure de l'intégrité d'une image : des modèles physiques aux modèles d'apprentissage profond." Thesis, Poitiers, 2020. http://www.theses.fr/2020POIT2293.

Full text
Abstract:
Digital images have become a powerful and effective visual communication tool for delivering messages, diffusing ideas, and proving facts. The smartphone emergence with a wide variety of brands and models facilitates the creation of new visual content and its dissemination in social networks and image sharing platforms. Related to this phenomenon and helped by the availability and ease of use of image manipulation softwares, many issues have arisen ranging from the distribution of illegal content to copyright infringement. The reliability of digital images is questioned for common or expert users such as court or police investigators. A well known phenomenon and widespread examples are the "fake news" which oftenly include malicious use of digital images.Many researchers in the field of image forensic have taken up the scientific challenges associated with image manipulation. Many methods with interesting performances have been developed based on automatic image processing and more recently the adoption of deep learning. Despite the variety of techniques offered, performance are bound to specific conditions and remains vulnerable to relatively simple malicious attacks. Indeed, the images collected on the Internet impose many constraints on algorithms questioning many existing integrity verification techniques. There are two main peculiarities to be taken into account for the detection of a falsification: one is the lack of information on pristine image acquisition, the other is the high probability of automatic transformations linked to the image-sharing platforms such as lossy compression or resizing.In this thesis, we focus on several of these image forensic challenges including camera model identification and image tampering detection. After reviewing the state of the art in the field, we propose a first data-driven method for identifying camera models. 
We use deep learning techniques based on convolutional neural networks (CNNs) and develop a learning strategy considering the quality of the input data versus the applied transformation. A family of CNN networks has been designed to learn the characteristics of the camera model directly from a collection of images undergoing the same transformations as those commonly used on the Internet. Our interest focused on lossy compression for our experiments, because it is the most used type of post-processing on the Internet. The proposed approach, therefore, provides a robust solution to compression for camera model identification. The performance achieved by our camera model detection approach is also used and adapted for image tampering detection and localization. The performances obtained underline the robustness of our proposals for camera model identification and image forgery detection
APA, Harvard, Vancouver, ISO, and other styles
43

Duessel, Patrick [Verfasser]. "Detection of unknown cyber attacks using convolution kernels over attributed language models / Patrick Duessel." Bonn : Universitäts- und Landesbibliothek Bonn, 2018. http://d-nb.info/1162953187/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Mabon, Gwennaëlle. "Estimation non-paramétrique adaptative pour des modèles bruités." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB020/document.

Full text
Abstract:
In this thesis, we are interested in adaptive nonparametric density estimation in the convolution model. This framework corresponds to additive measurement-error models, in which we observe a noisy version of the random variable of interest. To carry out our study, we follow the model-selection paradigm developed by Birgé & Massart, or criteria based on Lepski's method. The thesis is divided into two parts. In the first, the main goal is to build adaptive estimators in the convolution model when both the random variables of interest and the errors are distributed on the nonnegative real line. We propose adaptive estimators of the density and of the survival function, then of linear functionals of the target density. This part ends with a linear density-aggregation procedure. The second part deals with adaptive density estimation in the convolution model when the error distribution is unknown. To make this problem identifiable, we assume that either a preliminary sample of the noise is available or the observations come as repeated data. We can then derive adaptive estimators under mild assumptions on the noise distribution. This methodology is further applied to linear mixed models and to estimating the density of a sum of random variables observed with additive noise.
APA, Harvard, Vancouver, ISO, and other styles
45

Diffner, Fredrik, and Hovig Manjikian. "Training a Neural Network using Synthetically Generated Data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280334.

Full text
Abstract:
A major challenge in training machine learning models is gathering and labeling a sufficiently large training data set. A common solution is to use a synthetically generated data set to expand or replace a real one. This paper examines the performance of a machine learning model trained on a synthetic data set versus the same model trained on real data. The approach was applied to the problem of recognizing characters in images from natural scenes using a machine learning model that implements convolutional neural networks. A synthetic data set of 1,240,000 images and two real data sets, Char74K and ICDAR 2003, were used. The result was that the model trained on the synthetic data set achieved an accuracy about 50% better than that of the same model trained on the real data set (Char74K).
APA, Harvard, Vancouver, ISO, and other styles
46

Segkos, Michail. "Advanced techniques to improve the performance of OFDM Wireless LAN." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2004. http://library.nps.navy.mil/uhtbin/hyperion/04Jun%5FSegkos.pdf.

Full text
Abstract:
Thesis (M.S. in Electrical Engineering and M.S. in Applied Physics)--Naval Postgraduate School, June 2004.
Thesis advisor(s): Tri T. Ha, Brett H. Borden. Includes bibliographical references (p. 107-109). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
47

Nilsson, Kristian, and Hans-Eric Jönsson. "A comparison of image and object level annotation performance of image recognition cloud services and custom Convolutional Neural Network models." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18074.

Full text
Abstract:
Recent advancements in machine learning have contributed to an explosive growth of the image recognition field. Simultaneously, multiple Information Technology (IT) service providers such as Google and Amazon have embraced cloud solutions and software as a service. These factors have helped mature many computer vision tasks from scientific curiosity to practical applications. As image recognition is now accessible to the general developer community, a need arises for a comparison of its capabilities, and of what can be gained from choosing a cloud service over a custom implementation. This thesis empirically studies the performance of five general image recognition services (Google Cloud Vision, Microsoft Computer Vision, IBM Watson, Clarifai and Amazon Rekognition) and of image recognition models of the Convolutional Neural Network (CNN) architecture that we ourselves configured and trained. Image- and object-level annotations of images extracted from different datasets were tested, both in their original state and after being subjected to one of the following six types of distortion: brightness, color, compression, contrast, blurriness and rotation. The output labels and confidence scores were compared to the ground truth at multiple levels of concept, such as food, soup and clam chowder. The results show that, of the services tested, there is currently no clear top performer across all categories, and they all show some variations and similarities in their output, but on average Google Cloud Vision performs the best by a small margin. The services are all adept at identifying high-level concepts such as food and most mid-level ones such as soup. However, for further specifics, such as clam chowder, they start to vary, some performing better than others in different categories. Amazon was found to be the most capable at identifying multiple unique objects within the same image, on the chosen dataset.
Additionally, it was found that using synonyms of the ground-truth labels increased performance, as the semantic gap between our expectations and the actual output of the services was narrowed. The services all showed vulnerability to image distortions, especially compression, blurriness and rotation. The custom models all performed noticeably worse, roughly half as well as the cloud services, possibly due to the difference in training data standards. The best model, configured with three convolutional layers, 128 nodes and a layer density of two, reached an average performance of almost 0.2, or 20%. In conclusion, for those limited by a lack of experience with machine learning, computational resources or time, it is recommended to make use of one of the cloud services to reach a more acceptable performance level. Which to choose depends on the intended application, as the services perform differently in certain categories. The services are all vulnerable to multiple image distortions, potentially allowing adversarial attacks. Finally, there is definitely room for improvement with regard to the performance of these services and the computer vision field as a whole.
APA, Harvard, Vancouver, ISO, and other styles
48

Lopez, de Diego Silvia Isabel. "Automated Interpretation of Abnormal Adult Electroencephalograms." Master's thesis, Temple University Libraries, 2017. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/463281.

Full text
Abstract:
Electrical and Computer Engineering
M.S.E.E.
Interpretation of electroencephalograms (EEGs) is a process that is still dependent on the subjective analysis of the examiner. The interrater agreement, even for relevant clinical events such as seizures, can be low. For instance, the differences between interictal, ictal, and post-ictal EEGs can be quite subtle. Before making such low-level interpretations of the signals, neurologists often classify EEG signals as either normal or abnormal. Even though the characteristics of a normal EEG are well defined, there are some factors, such as benign variants, that complicate this decision. However, neurologists can make this classification accurately by only examining the initial portion of the signal. Therefore, in this thesis, we explore the hypothesis that high performance machine classification of an EEG signal as abnormal can approach human performance using only the first few minutes of an EEG recording. The goal of this thesis is to establish a baseline for automated classification of abnormal adult EEGs using state of the art machine learning algorithms and a big data resource – The TUH EEG Corpus. A demographically balanced subset of the corpus was used to evaluate performance of the systems. The data was partitioned into a training set (1,387 normal and 1,398 abnormal files), and an evaluation set (150 normal and 130 abnormal files). A system based on hidden Markov Models (HMMs) achieved an error rate of 26.1%. The addition of a Stacked Denoising Autoencoder (SdA) post-processing step (HMM-SdA) further decreased the error rate to 24.6%. The overall best result (21.2% error rate) was achieved by a deep learning system that combined a Convolutional Neural Network and a Multilayer Perceptron (CNN-MLP). 
Even though the performance of our algorithm still lags human performance, which approaches a 1% error rate for this task, we have established an experimental paradigm that can be used to explore this application and have demonstrated a promising baseline using state of the art deep learning technology.
Temple University--Theses
APA, Harvard, Vancouver, ISO, and other styles
49

Arefiyan, Khalilabad Seyyed Mostafa. "Deep Learning Models for Context-Aware Object Detection." Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/88387.

Full text
Abstract:
In this thesis, we present ContextNet, a novel general object detection framework for incorporating context cues into a detection pipeline. Current deep learning methods for object detection exploit state-of-the-art image recognition networks to classify a given region-of-interest (ROI) into predefined classes and to regress a bounding box around it, without using any information about the corresponding scene. ContextNet is based on the intuitive idea that having cues about the general scene (e.g., kitchen or library) changes the priors about the presence or absence of some object classes. We provide a general means of integrating this notion into the decision process for a given ROI by running a network pretrained on scene recognition datasets in parallel with a pretrained network that extracts object-level features for the corresponding ROI. Through comprehensive experiments on PASCAL VOC 2007, we demonstrate the effectiveness of our design choices: the resulting system outperforms the baseline in most object classes and reaches 57.5 mAP (mean Average Precision) on the PASCAL VOC 2007 test set, compared with 55.6 mAP for the baseline.
MS
APA, Harvard, Vancouver, ISO, and other styles
50

Wang, Zhen. "Semi-parametric Bayesian Models Extending Weighted Least Squares." The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1236786934.

Full text
APA, Harvard, Vancouver, ISO, and other styles
