Dissertations / Theses on the topic 'Convolutional neuralt nätverk'

Author: Grafiati

Published: 30 June 2021

Last updated: 7 February 2022

Consult the top 29 dissertations / theses for your research on the topic 'Convolutional neuralt nätverk.'

1

Lavenius, Axel. "Automatic identification of northern pike (Exos Lucius) with convolutional neural networks." Thesis, Uppsala universitet, Institutionen för geovetenskaper, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-418639.

Abstract:

The population of northern pike in the Baltic Sea has decreased drastically over the last couple of decades. The reasons are believed to be many, but most of them are likely anthropogenic. Today, many measures are being taken to prevent further decline of pike populations, ranging from nutrient runoff control to habitat restoration. This gives rise to the problem addressed in this project: how can we best monitor pike populations so that the effects of these measures can be accurately assessed and verified over the coming decades? Pike are currently monitored in Sweden with expensive and ineffective manual methods, in which a handful of experts mark individual fish. This project provides evidence that such methods could be replaced by a Convolutional Neural Network (CNN), an automatic artificial intelligence system that can be taught to identify individual pike based on their unique patterns. A neural net simulates the functions of neurons in the human brain, which allows it to perform a range of tasks, while a CNN is a neural net specialized for this type of visual recognition task. The results show that the CNN trained in this project can identify individual pike in the provided data set with upwards of 90% accuracy, with much potential for improvement.

2

Du, Zekun. "Algorithm Design and Optimization of Convolutional Neural Networks Implemented on FPGAs." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254575.

Abstract:

Deep learning has developed rapidly in recent years and has been applied in many of the main areas of artificial intelligence. Combining deep learning with embedded systems is a promising technical direction. This project designs a deep learning neural network algorithm that can be implemented on hardware such as an FPGA. The project is based on current research on deep neural networks and on hardware characteristics, and uses PyTorch and CUDA as supporting tools. The focus is image classification with a convolutional neural network (CNN). Several established CNN models were studied, including ResNet, ResNeXt, and MobileNet. Candidate models were compared on floating point operations (FLOPs), number of parameters, and classification accuracy, and an algorithm based on MobileNet was selected. In software, it reaches a top-1 error of 5.5% on a 6-class data set. The MobileNet-based algorithm was then simulated for hardware. The parameters are converted from floating point numbers to 8-bit integers, and the outputs of each layer are truncated to fixed-bit integers to fit the hardware restrictions. A number handling method was designed to simulate how the numbers change on hardware. With this simulation method, the top-1 error increases to 12.3%, which is considered acceptable.
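The conversion of floating point parameters to 8-bit integers described in this abstract can be illustrated with a minimal sketch. The abstract does not state which quantization scheme the thesis uses, so the symmetric per-tensor scaling below is an assumption, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers with one shared scale factor.

    Assumption: symmetric linear quantization; the thesis may use another scheme.
    """
    max_abs = max(abs(v) for v in values)
    # The largest magnitude maps to 127; use 1.0 when every value is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-128, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
print(q_weights, scale, restored)
```

Small values such as 0.003 collapse to zero after rounding, which is one way reduced precision can raise the classification error.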

3

Elander, Filip. "Semantic segmentation of off-road scenery on embedded hardware using transfer learning." Thesis, KTH, Mekatronik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301154.

Abstract:

Real-time semantic scene understanding is a challenging computer vision task for autonomous vehicles. A limited amount of research has been done regarding forestry and off-road scene understanding, as the industry focuses on urban and on-road applications. Studies have shown that Deep Convolutional Neural Network architectures, using parameters trained on large datasets, can be re-trained and customized with smaller off-road datasets, using a method called transfer learning and yield state-of-the-art classification performance. This master’s thesis served as an extension of such existing off-road semantic segmentation studies. The thesis focused on detecting and visualizing the general trade-offs between classification performance, classification time, and the network’s number of available classes. The results showed that the classification performance declined for every class that got added to the network. Misclassification mainly occurred in the class boundary areas, which increased when more classes got added to the network. However, the number of classes did not affect the network’s classification time. Further, there was a nonlinear trade-off between classification time and classification performance. The classification performance improved with an increased number of network layers and a larger data type resolution. However, the layer depth increased the number of calculations and the larger data type resolution required a longer calculation time. The network’s classification performance increased by 0.5% when using a 16-bit data type resolution instead of an 8-bit resolution. But, its classification time considerably worsened as it segmented about 20 camera frames less per second with the larger data type. Also, tests showed that a 101-layered network slightly degraded in classification performance compared to a 50-layered network, which indicated the nonlinearity to the trade-off regarding classification time and classification performance. 
Moreover, the class constellations considerably impacted the network’s classification performance and continuity. It was essential that the class’s content and objects were visually similar and shared the same features. Mixing visually ambiguous objects into the same class could drop the inference performance by almost 30%. There are several directions for future work, including writing a new and customized source code for the ResNet50 network. A customized and pruned network could enhance both the application’s classification performance and classification speed. Further, procuring a task-specific forestry dataset and transferring weights pre-trained for autonomous navigation instead of generic object segmentation could lead to even better classification performance.

4

Spång, Anton. "Automatic Image Annotation by Sharing Labels Based on Image Clustering." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-210164.

Abstract:

The growth in the size of image collections has made manual annotation infeasible, creating a need for accurate and time-efficient image annotation methods. This project evaluates a system for automatic image annotation to see whether annotations can be shared between visually similar images based on unsupervised clustering. The evaluation consisted of experiments with different algorithms and different unlabeled data sets. The system is also compared with an award-winning convolutional neural network model, used as a baseline, to see whether the system's precision and/or recall could exceed the baseline's. The experiments showed that both precision and recall could be increased on the data used in this thesis: on average, the system improved precision by 0.094 and recall by 0.049 compared with the baseline.
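For readers unfamiliar with the two metrics compared in this abstract, the following minimal sketch computes precision and recall for the labels assigned to a single image. The set-based, per-image formulation and the example labels are assumptions for illustration; the thesis may aggregate the metrics differently.

```python
def precision_recall(predicted_labels, true_labels):
    """Return (precision, recall) for one image's predicted label set."""
    predicted, actual = set(predicted_labels), set(true_labels)
    true_positives = len(predicted & actual)
    # Precision: share of predicted labels that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of true labels that were predicted.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical labels for one image.
precision, recall = precision_recall(
    ["dog", "grass", "ball"], ["dog", "ball", "park", "sky"]
)
print(precision, recall)
```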

5

Engström, Messén Matilda, and Elvira Moser. "Pre-planning of Individualized Ankle Implants Based on Computed Tomography - Automated Segmentation and Optimization of Acquisition Parameters." Thesis, KTH, Fysik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297674.

Abstract:

The structure of the ankle joint complex creates an ideal balance between mobility and stability, which enables gait. If a lesion emerges in the ankle joint complex, the anatomical structure is altered, which may disturb mobility and stability and cause intense pain. A lesion in the articular cartilage of the talus bone, or in the subchondral bone of the talar dome, is referred to as an Osteochondral Lesion of the Talus (OLT). Replacing the damaged cartilage or bone with an implant is one of the methods that can be applied to treat OLTs. Episurf Medical develops and produces patient-specific implants (Episealers) along with the necessary associated surgical instruments by, among other things, creating a corresponding 3D model of the ankle (talus, tibia, and fibula bones) based on either a Magnetic Resonance Imaging (MRI) scan or a Computed Tomography (CT) scan. Presently, the 3D models based on MRI scans can be created automatically, but the 3D models based on CT scans must be created manually, which can be very time-consuming. In this thesis project, a U-net based Convolutional Neural Network (CNN) was trained to automatically segment 3D models of ankles based on CT images. Furthermore, to optimize the quality of the incoming CT images, which are used for implant positioning and design, the project also evaluated the parameters specified in the Episurf CT talus protocol that is sent out to the clinics. The performance of the CNN was evaluated using the Dice Coefficient (DC) with five-fold cross-validation. The CNN achieved a mean DC of 0.978±0.009 for the talus bone, 0.779±0.174 for the tibia bone, and 0.938±0.091 for the fibula bone. The values for the talus and fibula bones were satisfactory and comparable to results presented in previous research; however, due to background artefacts in the images, the DC achieved by the network for the segmentation of the tibia bone was lower than the results presented in previous research. To correct this, a noise-reducing filter will be implemented.
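The Dice Coefficient used to evaluate the segmentation can be sketched as follows, using its standard definition, twice the overlap divided by the total number of foreground elements in both masks. The flat binary-mask representation and the example masks are assumptions for illustration, not the evaluation code of the thesis.

```python
def dice_coefficient(predicted_mask, reference_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks."""
    if len(predicted_mask) != len(reference_mask):
        raise ValueError("masks must have the same number of elements")
    overlap = sum(1 for p, r in zip(predicted_mask, reference_mask) if p and r)
    total = sum(1 for p in predicted_mask if p) + sum(1 for r in reference_mask if r)
    # Two empty masks agree perfectly, so define their score as 1.0.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical masks: 1 marks a voxel labeled as bone, 0 marks background.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
score = dice_coefficient(predicted, reference)
print(score)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground elements.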

6

Stjärnholm, Sigfrid. "Ghosts of Our Past: Neutrino Direction Reconstruction Using Deep Neural Networks." Thesis, Uppsala universitet, Högenergifysik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-448765.

Abstract:

Neutrinos are the perfect cosmic messengers when it comes to investigating the most violent and mysterious astronomical and cosmological events in the Universe. The interaction probability of neutrinos is small, however, and the flux of high-energy neutrinos decreases quickly with increasing energy. To find high-energy neutrinos, large bodies of matter need to be instrumented. A proposed detector station design called ARIANNA is designed to detect neutrino interactions in the Antarctic ice by measuring radio waves that are created due to the Askaryan effect. In this paper, we present a method based on state-of-the-art machine learning techniques to reconstruct the direction of the incoming neutrino from the radio emission that it produces. We trained a neural network with simulated data, created with the NuRadioMC framework, and optimized it to make the best possible predictions. The number of training events used was on the order of 10^6. Using two different emission models, we found that the network was able to learn and generalize on the neutrino events with good precision, resulting in a resolution of 4-5°. The model could also make good predictions on a dataset even if it was trained with another emission model. The results are promising, especially because classical techniques have not been able to reproduce the same results without prior knowledge of where the neutrino interaction took place. The developed neural network can also be used to assess the performance of other proposed detector designs, to quickly and reliably indicate which design might yield the most value to the scientific community.

7

Reiche, Myrgård Martin. "Acceleration of deep convolutional neural networks on multiprocessor system-on-chip." Thesis, Uppsala universitet, Avdelningen för datorteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385904.

Abstract:

In this master's thesis, some of the most promising existing frameworks and implementations of deep convolutional neural networks on multiprocessor systems-on-chip (MPSoCs) are researched and evaluated. The starting point was a previous thesis, conducted in the spring of 2018, which evaluated possible deep learning models and frameworks for object detection on infrared images. To fit an existing deep convolutional neural network (DCNN) on an MPSoC, the network needs modifications. Most DCNNs are trained on graphics processing units (GPUs) with a bit width of 32 bits. This is not optimal for a platform with hard memory constraints such as the MPSoC, so the bit width needs to be reduced. The optimal bit width depends on the network structure and the requirements in terms of throughput and accuracy, although the performance of most currently available object detection networks drops significantly when the bit width is reduced below 6 bits. After reducing the bit width, the network needs to be quantized and pruned for better memory usage. After quantization, it can be implemented using one of many existing frameworks. This thesis focuses on Xilinx CHaiDNN and DNNWeaver V2, though it also touches on revision, HLS4ML, and DNNWeaver V1. Finally, implementations of two network models on a Xilinx Zynq UltraScale+ ZCU102 using CHaiDNN were evaluated. Existing networks were converted, and quantization was tested, though it was not fully working. The result was an implementation two to six times more power efficient than GPU inference.

8

Jangblad, Markus. "Object Detection in Infrared Images using Deep Convolutional Neural Networks." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-355221.

Abstract:

This master's thesis tests object detection (OD) with a deep convolutional neural network (DCNN) applied to infrared (IR) images. The goal is to use both long wave infrared (LWIR) and short wave infrared (SWIR) images taken from an airplane to train a DCNN to detect runways, Precision Approach Path Indicator (PAPI) lights, and approach lights. The reason for detecting these objects in IR images is that IR light transmits better than visible light under certain weather conditions, for example, fog. Such a system could help the pilot detect the runway in bad weather. The RetinaNet model architecture was used and modified in different ways to find the best performing model. The models contain parameters that are found during the training process, but some parameters, called hyperparameters, need to be determined in advance. A way to automatically find good values for these hyperparameters was also tested. In hyperparameter optimization, the Bayesian optimization method produced a model that performed as well as the best model achieved by the author using manual hyperparameter tuning. The OD system was implemented using Keras with a TensorFlow backend and achieved high performance (mAP=0.9245) on the test data. The system manages to detect the wanted objects in the images but is expected to perform worse in a general situation, since the training data and test data are very similar. To further develop this system and improve performance under general conditions, more data is needed from other airfields and under different weather conditions.

9

Airola, Rasmus, and Kristoffer Hager. "Image Classification, Deep Learning and Convolutional Neural Networks : A Comparative Study of Machine Learning Frameworks." Thesis, Karlstads universitet, Institutionen för matematik och datavetenskap, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-55129.

Abstract:

The use of machine learning, and specifically neural networks, is a growing trend in software development, and has grown immensely in the last couple of years in light of an increasing need to handle big data and large information flows. Machine learning has a broad area of application, such as human-computer interaction, predicting stock prices, real-time translation, and self-driving vehicles. Large companies such as Microsoft and Google have already implemented machine learning in some of their commercial products, such as their search engines and their intelligent personal assistants Cortana and Google Assistant. The main goal of this project was to evaluate the two deep learning frameworks Google TensorFlow and Microsoft CNTK, primarily based on their performance in the training time of neural networks. We chose to use the third-party API Keras instead of TensorFlow's own API when working with TensorFlow. CNTK was found to perform better in terms of training time than TensorFlow with Keras as frontend. Even though CNTK performed better on the benchmarking tests, we found Keras with TensorFlow as backend to be much easier and more intuitive to work with. In addition, CNTK's underlying implementation of the machine learning algorithms and functions differs from that of the literature and of other frameworks. Therefore, if we had to choose a framework to continue working in, we would choose Keras with TensorFlow as backend, even though its performance is lower than that of CNTK.

10

Gustavsson, Robin, and Johan Jakobsson. "Lung-segmentering : Förbehandling av medicinsk data vid predicering med konvolutionella neurala nätverk." Thesis, Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-14380.

Abstract:

In 2017, the Swedish National Board of Health and Welfare (Socialstyrelsen) reported that lung cancer is the most common cause of cancer-related death among women in Sweden and the second most common among men. One way to find out whether a patient has lung cancer is for a doctor to study a computed tomography scan of the patient's lungs. This introduces the chance of human error, which could have fatal consequences. To prevent such mistakes, computers and advanced algorithms can be used to detect lung cancer: a network model can be trained to detect details and deviations in the scans, a technique called deep structural learning. Creating such a model is both time-consuming and challenging, so it is important that the model is trained correctly. Several studies address different network architectures, but not the effect that the preprocessing technique lung segmentation can have on a model of this kind. We therefore asked: how are the accuracy and loss of a convolutional network model affected when lung segmentation is applied to the model's training and test data? To answer the question, we trained and evaluated several models with and without lung segmentation. Comparing the results showed that the technique counteracts overfitting. We believe this study can facilitate future research in the same and similar problem areas.

11

Larsson, Olov. "A Reward-based Algorithm for Hyperparameter Optimization of Neural Networks." Thesis, Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-78827.

Abstract:

Machine learning and its wide range of applications are becoming increasingly prevalent in both academia and industry. This thesis focuses on two machine learning methods: convolutional neural networks and reinforcement learning. Convolutional neural networks have seen great success in various applications for both classification and regression problems in a diverse range of fields, e.g. vision for self-driving cars or facial recognition. These networks are built on a set of trainable weights optimized on data, and a set of hyperparameters that are set by the designer of the network and remain constant. For the network to perform well, the hyperparameters have to be optimized separately. The goal of this thesis is to investigate the use of reinforcement learning as a method for optimizing hyperparameters in convolutional neural networks built for classification problems. The reinforcement learning methods used are tabular Q-learning and a new Q-learning-inspired algorithm named max-table. These algorithms were tested with different exploration policies based on each hyperparameter value's covariance, precision, or relevance to the performance metric. The reinforcement learning algorithms were mostly tested on the datasets CIFAR10 and Fashion-MNIST against a baseline set by random search. While the Q-learning algorithm was not able to perform better than random search, max-table performed better than random search 50% of the time on both datasets. Hyperparameter-based exploration policies using covariance and relevance were shown to decrease the optimizers' performance. No significant difference was found between a hyperparameter-based exploration policy using precision and an equally distributed exploration policy.
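The random search baseline mentioned in this abstract can be sketched in a few lines. This is an illustrative sketch only: the search space, the `toy_score` function standing in for a trained network's validation accuracy, and all names are hypothetical and not taken from the thesis.

```python
import random


def random_search(search_space, evaluate, trials, seed=0):
    """Sample configurations uniformly at random and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        # Draw one value per hyperparameter, independently of earlier trials.
        config = {name: rng.choice(options) for name, options in search_space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Hypothetical stand-in for validation accuracy (higher is better)."""
    return -abs(config["learning_rate"] - 0.01) - abs(config["filters"] - 64) / 1000


# Hypothetical search space for a small CNN.
space = {"learning_rate": [0.1, 0.01, 0.001], "filters": [16, 32, 64]}
best_config, best_score = random_search(space, toy_score, trials=20)
print(best_config, best_score)
```

In a real comparison, `evaluate` would train and validate a network for each configuration, which is why the number of trials is usually the main cost.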

12

Gilljam, Daniel, and Mario Youssef. "Jämförelse av artificiella neurala nätverksalgoritmer för klassificering av omdömen." Thesis, KTH, Hälsoinformatik och logistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230660.


Abstract:

With large volumes of data in the form of customer reviews, manually judging the sentiment of each review, whether it is positive or negative, can be relatively time-consuming. This thesis was carried out to classify customer reviews automatically as positive or negative, which was handled using machine learning. Three different deep neural networks were tested and compared using two frameworks, TensorFlow and Keras, on both larger and smaller datasets. Different embedding methods were also tested with the neural networks. The best combination of neural network, framework and embedding method was a Convolutional Neural Network (CNN) that used the word embedding method Word2Vec, was written in the Keras framework and gave an accuracy of about 88.87% with a deviation of about 0.4%. The CNN gave the best results in all tests, ahead of the other two neural networks, a Recurrent Neural Network (RNN) and a Convolutional Recurrent Neural Network (CRNN).
With a large amount of data in the form of customer reviews, it can be time-consuming to go through each review manually and decide whether its sentiment is positive or negative. This thesis was carried out to classify customer reviews automatically as positive or negative, using machine learning. Three different deep neural networks were tested on larger and smaller datasets and compared using two different frameworks, TensorFlow and Keras. Different embedding methods were also tested with the neural networks. The best combination of neural network, framework and embedding was the Convolutional Neural Network (CNN) that used the word embedding method Word2Vec, was written in the Keras framework and gave an accuracy of approximately 88.87% with a deviation of approximately 0.4%. The CNN scored better in all of the tests than the two other neural networks, the Recurrent Neural Network (RNN) and the Convolutional Recurrent Neural Network (CRNN).


13

Linder, Johannes. "Modeling the intronic regulation of Alternative Splicing using Deep Convolutional Neural Nets." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-172327.


Abstract:

This paper investigates the use of deep Convolutional Neural Networks for modeling the intronic regulation of Alternative Splicing on the basis of DNA sequence. By training the CNN on massively parallel synthetic DNA libraries of Alternative 5'-splicing and Alternatively Skipped exon events, the model is capable of predicting the relative abundance of alternatively spliced mRNA isoforms on held-out library data to a very high accuracy (R2 = 0.77 for Alt. 5'-splicing). Furthermore, the CNN is shown to generalize alternative splicing across cell lines efficiently. The Convolutional Neural Net is tested against a Logistic regression model and the results show that while prediction accuracy on the synthetic library is notably higher compared to the LR model, the CNN is worse at generalizing to new intronic contexts. Tests on non-synthetic human SNP genes suggest the CNN is dependent on the relative position of the intronic region it was trained for, a problem which is alleviated with LR. The increased library prediction accuracy of the CNN compared to Logistic regression is concluded to come from the non-linearity introduced by the deep layer architecture. It adds the capacity to model complex regulatory interactions and combinatorial RBP effects which studies have shown largely affect alternative splicing. However, the architecture makes interpreting the CNN hard, as the regulatory interactions are encoded deep within the layers. Nevertheless, high-performance modeling of alternative splicing using CNNs may still prove useful in numerous Synthetic biology applications, for example to model differentially spliced genes as is done in this paper.
This thesis investigates how deep neural networks based on convolutions can be used to model the intronic regulation of alternative splicing with only the DNA sequence as input. The network is trained on a massively parallel library of synthetic DNA containing alternative splicing events in which parts of the intronic regions have been randomised. The thesis shows that the network architecture can predict the relative abundance of alternatively spliced RNA to very high accuracy within the synthetic library. The model also generalises alternative splicing well across human cell types. However, tests on non-synthetic human genes with SNP mutations show that the network's performance degrades when the intronic region used as input is shifted relative to the position the model was trained on. The thesis compares the model with logistic regression and concludes that the network's improved performance stems from its ability to model non-linear dependencies in the data. This, however, makes it difficult to interpret what the model has actually learned, since the interaction between regulatory elements is embedded in the network layers. Nevertheless, high-performance modelling of alternative splicing with neural networks can be useful, for example in synthetic biology, where the model can be used to control the regulation of splicing when constructing synthetic genes.


14

Droh, Erik. "T-Distributed Stochastic Neighbor Embedding Data Preprocessing Impact on Image Classification using Deep Convolutional Neural Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-237422.


Abstract:

Image classification in machine learning encompasses the task of identifying objects in an image. The technique has applications in various areas such as e-commerce, social media and security surveillance. In this report the author explores the impact of using t-Distributed Stochastic Neighbor Embedding (t-SNE) on data as a preprocessing step when classifying multiple classes of clothing with a state-of-the-art Deep Convolutional Neural Network (DCNN). The t-SNE algorithm uses dimensionality reduction and groups similar objects close to each other in three-dimensional space. Extracting this information in the form of a positional coordinate gives a new parameter which could help with the classification process, since the features it uses can be different from those of the DCNN. Therefore, three slightly different DCNN models receive different inputs and are compared. The first, benchmark model receives only pixel values; the second and third receive pixel values together with the positional coordinates from the t-SNE preprocessing for each data point, but with different hyperparameter values in the preprocessing step. The Fashion-MNIST dataset used contains 10 different clothing classes which are normalized and gray-scaled for ease of use. The dataset contains 70,000 images in total. Results show minimal change in classification accuracy when using a low-density map with a higher learning rate as the data size increases, while a denser map and lower learning rate yield a significant increase in accuracy of 4.4% when using a small data set. This is evidence that the method can be used to boost results when data is limited.
Image classification in machine learning involves the task of identifying objects in an image. The technique has applications in areas such as e-commerce, social media and security surveillance. In this report the author investigates the effect of using t-Distributed Stochastic Neighbor Embedding (t-SNE) on data as a preprocessing step when classifying multiple classes of clothing with a state-of-the-art Deep Convolutional Neural Network (DCNN). The t-SNE algorithm uses dimensionality reduction and groups similar objects close to each other in three-dimensional space. Extracting this information in the form of a positional coordinate gives a new parameter that can help the classification process, since the features it uses may differ from those of the DCNN model. Three different DCNN models receive different inputs and are then compared. The first, reference model receives only pixel values; the second and third receive pixel values together with the positional coordinates from the t-SNE preprocessing for each data point, but with different hyperparameter values in the preprocessing step. The study uses the Fashion-MNIST dataset, which contains 10 different clothing classes that are normalised and grey-scaled for ease of use. The dataset contains 70,000 images in total. The results show minimal change in classification accuracy when using a low-density map with a higher learning rate as the data size increases, while a denser map with a lower learning rate achieves a significant accuracy increase of 4.4% on a small dataset. This is evidence that the method can be used to improve classification results when data is limited.


15

Melcherson, Tim. "Image Augmentation to Create Lower Quality Images for Training a YOLOv4 Object Detection Model." Thesis, Uppsala universitet, Signaler och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-429146.


Abstract:

Research in the Arctic is of ever-growing importance, and modern technology is used in new ways to map and understand this very complex region and how it is affected by climate change. Here, animals and vegetation are tightly coupled with their environment in a fragile ecosystem, and when the environment undergoes rapid changes it risks damaging these ecosystems severely. Understanding what kind of data has the potential to be used in artificial intelligence can be of importance, as many research stations have data archives from decades of work in the Arctic. In this thesis, a YOLOv4 object detection model has been trained on two classes of images to investigate the performance impact of disturbances in the training data set. An expanded data set was created by augmenting the initial data to contain various disturbances. A model was successfully trained on the augmented data set, and a correlation between worse performance and the presence of noise was detected, but changes in saturation and altered colour levels seemed to have less impact than expected. Reducing noise in gathered data is seemingly of greater importance than enhancing images with poor colour levels. Further investigations with a larger and more thoroughly processed data set are required to gain a clearer picture of the impact of the various disturbances.


16

Larsson, Sofia. "A Study of the Loss Landscape and Metastability in Graph Convolutional Neural Networks." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273622.


Abstract:

Many novel graph neural network models have reported impressive performance on benchmark datasets, but the theory behind these networks is still being developed. In this thesis, we study the trajectory of gradient descent (GD) and stochastic gradient descent (SGD) in the loss landscape of graph neural networks by replicating the study by Xing et al. [1] for feed-forward networks. Furthermore, we empirically examine whether the training process could be accelerated by an optimization algorithm inspired by stochastic gradient Langevin dynamics, and what effect the topology of the graph has on the convergence of GD, by perturbing its structure. We find that the loss landscape is relatively flat and that SGD does not encounter any significant obstacles during its propagation. The noise-induced gradient appears to aid SGD in finding a stationary point with desirable generalisation capabilities when the learning rate is poorly optimized. Additionally, we observe that the topological structure of the graph plays a part in the convergence of GD, but further research is required to understand how.
Many new graph neural networks have shown impressive results on existing datasets, but the theory behind these networks is still under development. In this thesis we study the trajectories of gradient descent (GD) and stochastic gradient descent (SGD) in the loss landscape of graph convolutional networks by replicating the study of feed-forward networks by Xing et al. [1]. In addition, we examine empirically whether the training process can be accelerated by an optimisation algorithm inspired by stochastic gradient Langevin dynamics, and whether the topology of the graph affects the convergence of GD, by perturbing its structure. We find that the loss landscape is relatively flat and that the noise induced in the gradient appears to help SGD find stable stationary points with desirable generalisation properties when the learning rate has been poorly optimised. We also observe that the topological structure of the graph affects the convergence of GD, but more research is needed to understand how.


17

Viklund, Alexander, and Emma Nimstad. "Character Recognition in Natural Images Utilising TensorFlow." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-208385.


Abstract:

Convolutional Neural Networks (CNNs) are commonly used for character recognition. They achieve the lowest error rates on popular datasets such as SVHN and MNIST. However, CNNs have seen little use in research on character classification in natural images covering the whole English alphabet. This thesis conducts an experiment where TensorFlow is used to construct a CNN that is trained and tested on the Chars74K dataset, with 15 images per class for training and 15 images per class for testing. This is done with the aim of achieving a higher accuracy than the non-CNN approach by de Campos et al. [1], which achieved 55.26%. The thesis explores data augmentation techniques for expanding the small training set and evaluates the result of applying rotation, stretching, translation and noise-adding. The result is that all of these methods apart from adding noise have a positive effect on the accuracy of the network. Furthermore, the experiment shows that with a three-layer convolutional neural network it is possible to create a character classifier that is as good as that of de Campos et al. It is believed that even better results could be achieved if more experiments were conducted on the parameters of the network and the augmentation.
Convolutional artificial neural networks (CNNs) are commonly used for image recognition, as they give the lowest error rates on well-known datasets such as SVHN and MNIST. However, research is lacking on the use of CNNs for classifying letters in natural images across the whole English alphabet. This work describes an experiment in which TensorFlow is used to build a CNN that is trained and tested on images from Chars74K, with 15 images per class for training and 15 per class for testing. The aim is to achieve higher accuracy than 55.26%, which is what de Campos et al. [1] achieved with a method that did not use artificial neural networks. The report explores different techniques for artificially expanding the small dataset, and evaluates the result of applying rotation, stretching, translation and added noise. The result is that all of these methods except added noise have a positive effect on the network's accuracy. Furthermore, the experiment shows that a three-layer CNN can produce a letter classifier that is as good as that of de Campos et al. If more experiments were carried out on the parameters of the network and the augmentation, even better results could likely be achieved.
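
As a rough illustration of two of the augmentations named in this abstract (translation and noise-adding), the following pure-Python sketch operates on a grayscale image stored as a list of rows. The function names, the zero padding and the Gaussian noise model are assumptions made for this example; the thesis itself used TensorFlow and its exact augmentation parameters are not given here.

```python
import random


def translate(image, dx, dy, fill=0):
    """Shift a 2D grayscale image by (dx, dy) pixels, padding with `fill`."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            # Pixels shifted outside the frame are dropped.
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel and clip to [0, 255]."""
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        for row in image
    ]


if __name__ == "__main__":
    img = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
    print(translate(img, 1, 0))                   # bright pixel moves one column right
    print(add_noise(img, 10.0, random.Random(0)))  # same shape, perturbed values
```

Each augmented copy would be added to the training set alongside the original image and keep its label.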


18

Karlsson, Daniel. "Hyperparameter optimisation using Q-learning based algorithms." Thesis, Karlstads universitet, Fakulteten för hälsa, natur- och teknikvetenskap (from 2013), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-78096.


Abstract:

Machine learning algorithms have many applications, both for academic and industrial purposes. Examples of applications are classification of diffraction patterns in materials science and classification of properties in chemical compounds within the pharmaceutical industry. For these algorithms to be successful they need to be optimised. Part of this is achieved by training the algorithm, but there are components of the algorithms that cannot be trained. These hyperparameters have to be tuned separately. The focus of this work was optimisation of hyperparameters in classification algorithms based on convolutional neural networks. The purpose of this thesis was to investigate the possibility of using reinforcement learning algorithms, primarily Q-learning, as the optimising algorithm. Three different algorithms were investigated: Q-learning, double Q-learning and a Q-learning-inspired algorithm, which was designed during this work. The algorithms were evaluated on different problems and compared to a random search algorithm, which is one of the most common optimisation tools for this type of problem. All three algorithms were capable of some learning; however, the Q-learning-inspired algorithm was the only one to outperform the random search algorithm on the test problems. Further, an iterative scheme of the Q-learning-inspired algorithm was implemented, where the algorithm was allowed to refine the search space available to it. This showed further improvements in the algorithm's performance, and the results indicate that similar performance to the random search may be achieved in a shorter period of time, sometimes reducing the computational time by up to 40%.
Machine learning algorithms have many applications, both academic and industrial. Examples include classification of diffraction patterns in materials science and classification of the properties of chemical compounds in the pharmaceutical industry. For these algorithms to perform well they need to be optimised. Part of the optimisation takes place when the algorithms are trained, but there are components that cannot be trained. These hyperparameters must be tuned separately. The focus of this work was optimisation of hyperparameters for classification algorithms based on convolutional neural networks. The purpose of the thesis was to investigate the possibility of using reinforcement learning algorithms, primarily Q-learning, as the optimising algorithm. Three different algorithms were investigated: Q-learning, double Q-learning and an algorithm inspired by Q-learning that was developed during the work. The algorithms were evaluated on different test problems and compared with results obtained by a random search of the hyperparameter space, which is one of the more common methods for optimising this type of algorithm. All three algorithms showed some form of learning, but only the Q-learning-inspired algorithm outperformed random search. An iterative implementation of the Q-learning-inspired algorithm was also developed. The iterative method allowed the available hyperparameter space to be refined between iterations. This led to further improvements in the results, which indicated that the computation time could in some cases be reduced by up to 40% compared with random search, with equal or better results.
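
To make the idea of Q-learning as a hyperparameter optimiser more concrete, here is a minimal sketch of standard tabular Q-learning walking over a small hyperparameter grid. Everything specific in it is an assumption for illustration: the grid values, the move-one-step action set, and `toy_score`, which stands in for the validation accuracy of a trained CNN. It is not the algorithm designed in the thesis.

```python
import random

# Hypothetical search grid; the values are illustrative only.
LEARNING_RATES = [1e-4, 1e-3, 1e-2, 1e-1]
BATCH_SIZES = [16, 32, 64, 128]
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # move one step along one axis


def toy_score(state):
    """Stand-in for CNN validation accuracy; peaks at lr=1e-2, batch=32."""
    i, j = state
    return 1.0 - 0.1 * ((i - 2) ** 2 + (j - 1) ** 2)


def step(state, action):
    """Apply an action and clip the result to the grid boundaries."""
    i = min(max(state[0] + action[0], 0), len(LEARNING_RATES) - 1)
    j = min(max(state[1] + action[1], 0), len(BATCH_SIZES) - 1)
    return (i, j)


def q_learning_search(episodes=200, steps=10, alpha=0.5, gamma=0.9,
                      epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy over the grid."""
    rng = random.Random(seed)
    states = [(i, j) for i in range(len(LEARNING_RATES))
              for j in range(len(BATCH_SIZES))]
    q = {s: [0.0] * len(ACTIONS) for s in states}
    best_state, best_score = None, float("-inf")
    for _ in range(episodes):
        state = rng.choice(states)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda k: q[state][k])  # exploit
            nxt = step(state, ACTIONS[a])
            reward = toy_score(nxt)
            # Standard Q-learning update rule.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            if reward > best_score:
                best_state, best_score = nxt, reward
            state = nxt
    return best_state, best_score, q


if __name__ == "__main__":
    (i, j), score, _ = q_learning_search()
    print(LEARNING_RATES[i], BATCH_SIZES[j], score)
```

In a real setting each call to the scoring function would mean training and validating a network, so the number of evaluations, not the table update, dominates the cost; that is why the comparison against random search is made at a fixed evaluation budget.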


19

Friberg, Oscar. "Recognizing Semantics in Human Actions with Object Detection." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-212579.


Abstract:

Two-stream convolutional neural networks are currently one of the most successful approaches for human action recognition. The two-stream convolutional network separates spatial and temporal information into a spatial stream and a temporal stream. The spatial stream accepts a single RGB frame, while the temporal stream accepts a sequence of optical flow. There have been attempts to further extend the two-stream convolutional network framework, for instance with a third network for auxiliary information, which is the main focus of this thesis. We seek to extend the two-stream convolutional neural network by introducing a semantic stream based on object detection systems. Two contributions are made in this thesis: First, we show that this semantic stream can provide slight improvements over two-stream convolutional neural networks for human action recognition on standard benchmarks. Second, we explore divergence enhancement techniques that force our new semantic stream to complement the spatial and temporal streams by modifying the loss function during training. Slight gains are seen using these divergence enhancement techniques.
Two-stream convolutional networks are currently the most successful approach to human action recognition; they split spatial and temporal information into a spatial stream and a temporal stream. The spatial stream receives individual RGB frames for recognition, while the temporal stream receives a sequence of optical flow. Earlier work has attempted to extend the two-stream convolutional network framework, for example by complementing the two networks with a third network that receives additional information. In this thesis we look for ways to extend two-stream convolutional networks by introducing a semantic stream based on object detection. We make two main contributions: First, we show that the semantic stream, together with the spatial and temporal streams, can give small improvements in human action recognition in video on standard benchmarks. Second, we look for divergence enhancement techniques that force the semantic stream to complement the other two streams by modifying the loss function during training. We see small improvements from using these techniques to increase divergence.


20

Bereczki, Márk. "Graph Neural Networks for Article Recommendation based on Implicit User Feedback and Content." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300092.


Abstract:

Recommender systems are widely used in websites and applications to help users find relevant content based on their interests. Graph neural networks have achieved state-of-the-art results in the field of recommender systems, working on data represented in the form of a graph. However, most graph-based solutions face challenges regarding computational complexity or the ability to generalize to new users. Therefore, we propose a novel graph-based recommender system by modifying Simple Graph Convolution, an approach for efficient graph node classification, and adding the capability of generalizing to new users. We build our proposed recommender system for recommending the articles of Peltarion Knowledge Center. By incorporating two data sources, implicit user feedback based on pageview data as well as the content of articles, we propose a hybrid recommender solution. Throughout our experiments, we compare our proposed solution with a matrix factorization approach as well as a popularity-based and a random baseline, analyse the hyperparameters of our model, and examine the capability of our solution to give recommendations to new users who were not part of the training data set. Our model results in slightly lower, but similar Mean Average Precision and Mean Reciprocal Rank scores to the matrix factorization approach, and outperforms the popularity-based and random baselines. The main advantages of our model are computational efficiency and its ability to give relevant recommendations to new users without the need for retraining the model, which are key features for real-world use cases.
Recommender systems are widely used on websites and in applications to help users find relevant content based on their interests. Graph neural networks have achieved state-of-the-art results in recommender systems, working on data represented in the form of a graph. However, most graph-based solutions struggle with computational complexity or with generalising to new users. We therefore propose a new graph-based recommender system by modifying Simple Graph Convolution, an approach for efficient graph node classification, and adding the ability to generalise to new users. We build our proposed recommender system to recommend the articles of Peltarion Knowledge Center. By integrating two data sources, implicit user feedback based on pageview data and the content of the articles, we propose a hybrid recommender solution. In our experiments we compare our proposed solution with a matrix factorisation approach as well as a popularity-based and a random baseline, analyse the hyperparameters of our model, and examine the ability of our solution to give recommendations to new users who were not part of the training dataset. Our model yields slightly lower but similar Mean Average Precision and Mean Reciprocal Rank scores compared with the matrix factorisation approach, and outperforms the popularity-based and random baselines. The main advantages of our model are computational efficiency and its ability to give relevant recommendations to new users without retraining the model, which are key features for real-world use cases.
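
The two ranking metrics named in this abstract can be sketched as follows, using their standard textbook definitions. The abstract does not say which variant or cut-off the thesis used, so the function names, the absence of a top-k cut-off and the sample data are assumptions for illustration.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none is found."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@k taken at each position k that holds a relevant item."""
    hits, total = 0, 0.0
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0


def mean_over_users(metric, rankings, relevants):
    """Average a per-user metric over all users (MRR or MAP)."""
    scores = [metric(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical recommendations for two users and the articles they read.
    rankings = [["a", "b", "c", "d"], ["a", "b", "c", "d"]]
    relevants = [{"b", "d"}, {"a", "c"}]
    print(mean_over_users(reciprocal_rank, rankings, relevants))    # MRR
    print(mean_over_users(average_precision, rankings, relevants))  # MAP
```

Both metrics reward placing relevant articles near the top of the list; MRR only looks at the first hit, while MAP accounts for every relevant item.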


21

Ashfaq, Awais. "Segmentation of Cone Beam CT in Stereotactic Radiosurgery." Thesis, KTH, Skolan för teknik och hälsa (STH), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-193107.


Abstract:

C-arm Cone Beam CT (CBCT) systems, owing to their compact size, flexible geometry and low radiation exposure, inaugurated the era of on-board 3D image guidance in therapeutic and surgical procedures. Leksell Gamma Knife Icon by Elekta introduced an integrated CBCT system to determine patient position prior to the surgical session, marking a paradigm shift towards frameless stereotactic radiosurgery. While CBCT offers a quick imaging facility with high spatial accuracy, the quantitative values tend to be distorted due to various physics-based artifacts such as scatter, beam hardening and the cone beam effect. Several 3D reconstruction algorithms targeting these artifacts involve an accurate and fast segmentation of craniofacial CBCT images into air, tissue and bone. The objective of the thesis is to investigate the performance of deep learning based convolutional neural networks (CNN) in relation to conventional image processing and machine learning algorithms in segmenting CBCT images. CBCT data for training and testing procedures was provided by Elekta. A framework of segmentation algorithms including multilevel automatic thresholding, fuzzy clustering, multilayer perceptron and CNN is developed and tested against pre-defined evaluation metrics covering pixel-wise prediction accuracy, statistical tests and execution times, among others. The CNN outperformed the other segmentation algorithms on all evaluation metrics except execution time. The mean segmentation error for the CNN is found to be 0.4% with a standard deviation of 0.07%, followed by fuzzy clustering with a mean segmentation error of 0.8% and a standard deviation of 0.12%. CNN-based segmentation takes 500 s, compared to multilevel thresholding, which requires about 1 s on a similarly sized CBCT image. The present work demonstrates the ability of the CNN to handle artifacts and noise in CBCT images while maintaining high semantic segmentation performance. However, further efforts targeting CNN execution speed are required to utilize the segmentation framework within real-time 3D reconstruction algorithms.
C-arm Cone Beam CT (CBCT) systems have, thanks to their compact size, flexible geometry and low radiation dose, started an era of built-in 3D imaging systems for guiding therapeutic and surgical procedures. Elekta's Leksell Gamma Knife Icon introduced an integrated CBCT system to determine the patient's position for operations, thereby moving into a paradigm of frameless stereotactic radiosurgery. Although CBCT offers fast imaging with high spatial accuracy, the quantitative values tend to be distorted by artifacts such as scatter, beam hardening and the cone beam effect. Several 3D reconstruction algorithms that try to reduce these artifacts require accurate and fast segmentation of craniofacial CBCT images into air, soft tissue and bone. The aim of this thesis is to investigate how deep convolutional neural networks (CNNs) perform compared with conventional image processing and machine learning algorithms for segmenting CBCT images. CBCT data for training and testing was provided by Elekta. A framework of segmentation algorithms including multilevel automatic thresholding, fuzzy clustering, multilayer perceptrons and CNNs is developed and tested against predefined evaluation criteria such as pixel-wise accuracy, statistical tests and execution time. The CNN performed best on all metrics except execution time. The mean segmentation error for the CNN was 0.4% with a standard deviation of 0.07%, followed by fuzzy clustering with a mean error of 0.8% and a standard deviation of 0.12%. The CNN requires 500 seconds, compared with about 1 second for the fastest algorithm, multilevel thresholding, on CBCT volumes of equal size. The work shows the CNN's ability to handle artifacts and noise in CBCT images while maintaining high-quality semantic segmentation. However, further work is needed to improve the algorithm's performance before the method can be applied in real-time reconstruction algorithms.
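
The simplest baseline mentioned in this abstract, thresholding into air, tissue and bone, and the pixel-wise error used to compare methods can be sketched as below. This is a toy illustration only: the threshold values are passed in by hand (the thesis describes automatic multilevel thresholding, which selects them from the data), and the function names and sample intensities are hypothetical.

```python
AIR, TISSUE, BONE = 0, 1, 2


def threshold_segment(image, t_low, t_high):
    """Label each pixel as air, tissue or bone using two intensity thresholds."""
    return [
        [AIR if p < t_low else TISSUE if p < t_high else BONE for p in row]
        for row in image
    ]


def pixel_error(pred, truth):
    """Fraction of pixels whose predicted label differs from the reference."""
    total = sum(len(row) for row in truth)
    wrong = sum(p != t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return wrong / total


if __name__ == "__main__":
    image = [[0, 50, 200], [10, 120, 255]]  # made-up intensities
    truth = [[AIR, TISSUE, BONE], [AIR, BONE, BONE]]
    pred = threshold_segment(image, t_low=30, t_high=150)
    print(pred, pixel_error(pred, truth))
```

A per-pixel rule like this is what makes thresholding so fast compared with a CNN, and also why it cannot use spatial context to recover from scatter or beam-hardening artifacts.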


22

Norén, Karl. "Obstacle Avoidance for an Autonomous Robot Car using Deep Learning." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160551.


Abstract:

The focus of this study was deep learning. A small, autonomous robot car was used for obstacle avoidance experiments. The robot car used a camera to take images of its surroundings, and a convolutional neural network used the images for obstacle detection. The Xception model was trained on the available dataset of 31,022 images. We compared two different implementations for making the robot car avoid obstacles. Mapping image classes to steering commands was used as a reference implementation. The main implementation of this study separated obstacle detection and steering logic into different modules. The former reached an obstacle avoidance ratio of 80%, the latter 88%. Different hyperparameters were examined during training. We found that the number of frozen layers and the number of epochs were important to optimize. Weights pre-trained on ImageNet were loaded before training. The frozen-layers setting determined how many layers were trainable after that. Training all layers (no frozen layers) proved to work best. The number of epochs determined how long a model was trained. We found that it was important to train for between 10 and 25 epochs. The best model used no frozen layers and trained for 21 epochs. It reached a test accuracy of 85.2%.


23

Diffner, Fredrik, and Hovig Manjikian. "Training a Neural Network using Synthetically Generated Data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280334.

Abstract:

A major challenge in training machine learning models is gathering and labeling a sufficiently large training data set. A common solution is to use a synthetically generated data set to expand or replace a real data set. This paper compares the performance of a machine learning model trained on a synthetic data set with that of the same model trained on real data. The approach was applied to the problem of character recognition using a machine learning model that implements convolutional neural networks. A synthetic data set of 1’240’000 images and two real data sets, Char74k and ICDAR 2003, were used. The result was that the model trained on the synthetic data set achieved an accuracy that was about 50% better than the accuracy of the same model trained on the real data set.
When developing machine learning models, the lack of a sufficiently large training data set can be a problem. A common solution is to use synthetically generated data to either augment or entirely replace a data set of real data. This thesis examines the performance of a machine learning model trained on synthetic data compared with the same model trained on real data. This was applied to the problem of using a convolutional neural network to recognize characters in images from "natural" environments. A synthetic data set of 1’240’000 images and two data sets of characters from images, Char74K and ICDAR2003, were used. The results show that a model trained on the synthetic data set performed about 50% better than the same model trained on Char74K.

24

Carpentier, Benjamin. "Deep Learning for Earth Observation: improvement of classification methods for land cover mapping : Semantic segmentation of satellite image time series." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299578.

Abstract:

Satellite Image Time Series (SITS) are becoming available at high spatial, spectral and temporal resolutions across the globe thanks to the latest remote sensing sensors. These series of images can be highly valuable when exploited by classification systems to produce frequently updated and accurate land cover maps. The richness of spectral, spatial and temporal features in SITS is a promising source of data for developing better classification algorithms. However, machine learning methods such as Random Forests (RFs), despite their fruitful application to SITS to produce land cover maps, are structurally unable to properly handle intertwined spatial, spectral and temporal dynamics without breaking the structure of the data. Therefore, the present work proposes a comparative study of various deep learning algorithms from the Convolutional Neural Network (CNN) family and evaluates their performance on SITS classification. They are compared to the processing chain called iota2, developed by CESBIO and based on an RF model. Experiments are carried out in an operational context using sparse annotations from 290 labeled polygons. Fewer than 80 000 pixel time series belonging to 8 land cover classes from a year of Sentinel-2 monthly syntheses are used. Results on a test set of 131 polygons show that CNNs using 3D convolutions in space and time are more accurate than 1D temporal, stacked 2D and RF approaches. The best-performing models are CNNs using spatio-temporal features, namely 3D-CNN, 2D-CNN and SpatioTempCNN, a two-stream model using both 1D and 3D convolutions.
Satellite image time series (SITS) are becoming available at high spatial, spectral and temporal resolution across the world thanks to the latest remote sensing sensors. These image series can be very valuable when exploited by classification systems to produce frequently updated and accurate land cover maps. The large number of spectral, spatial and temporal features in SITS is a promising data source for developing better algorithms. Machine learning methods such as Random Forests (RF), although they have been applied to SITS to produce land cover maps, are structurally unable to handle the intertwined spatial, spectral and temporal dynamics without breaking up the data structure. This work therefore proposes a comparative study of various algorithms from the Convolutional Neural Network (CNN) family and an evaluation of their performance on SITS classification. They are compared with the iota2 processing chain, which was developed by CESBIO and is based on an RF model. The experiments are carried out in an operational context with sparse annotations from 290 labeled polygons. Fewer than 80 000 pixel time series belonging to 8 land cover classes from one year of monthly Sentinel-2 syntheses are used. The results show that CNNs using 3D convolutions in time and space are more accurate than 1D temporal, stacked 2D and RF methods. The best-performing models are CNNs that use spatio-temporal features, namely 3D-CNN, 2D-CNN and SpatioTempCNN, a two-stream model that uses both 1D and 3D convolutions.

25

Ryan, Elisabeth. "Towards word alignment and dataset creation for shorthand documents and transcripts." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-452278.

Abstract:

Analysing handwritten texts and creating labelled data sets can facilitate novel research on languages and advanced computerized analysis of authors' works. However, few handwritten works have word-wise labelling or data sets associated with them. More often a transcription of the text is available, but without any exact coupling between words in the transcript and word representations in the document images. Can an algorithm be created that will take only an image of handwritten text and a corresponding transcript and return a partial alignment and data set? An algorithm is developed in this thesis that explores the use of a convolutional neural network trained on English handwritten text to align some words on pages and create a data set, given a handwritten page image and a transcript. This algorithm is tested on handwritten English text. The algorithm is also tested on Swedish shorthand, which was the inspiration for the development of the algorithm in this work. In testing on several pages of handwritten English text, the algorithm reaches an overall average classification of 68% of words on one page with 0% misclassification of those words. On a sequence of pages, the algorithm reaches 84% correctly classified words on 10 pages and produces a data set of 551 correctly labelled word images. This is after being shown 10 pages with an average of 70.6 words on each page, with 0% misclassification.
Analysing handwritten texts and creating data sets can promote new research on languages and advanced computerized analysis of various authors' works. However, few handwritten works come with information about what each handwritten word denotes, or with data sets related to the text. More often a transcription of the text is available, without any exact coupling between the transcribed words and the handwritten words in the image of a document. By creating an algorithm that can make use of handwritten texts and corresponding transcriptions, more works can potentially be analysed by computer. Can an algorithm be created that takes only an image of a handwritten document and a corresponding transcription and returns a partial alignment of words to word images and a data set? An algorithm is created in this work that explores the possibility of using a deep neural network trained on English handwritten text to link words in a document to a transcription, and to use these to create a data set. This algorithm is tested on English handwritten text. The algorithm is also tested on Swedish shorthand, which was the inspiration for creating the algorithm. The algorithm was tested on a number of pages of handwritten English text, where it could classify on average 68% of the words on a handwritten page, with 0% of those words misclassified. On a series of pages the algorithm reaches an average of 84% classified words and produces a data set of 551 correctly classified word images. This is after showing the algorithm 10 pages with an average of 70.6 words per page. These tests also reached a misclassification rate of 0%.

26

Rekathati, Faton. "Curating news sections in a historical Swedish news corpus." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166313.

Abstract:

The National Library of Sweden uses optical character recognition software to digitize its collections of historical newspapers. The purpose of such software is first to automatically segment text and images from scanned newspaper pages, and second to read the contents of the identified text regions. While the raw text is often digitized successfully, important contextual information regarding whether the text constitutes, for example, a header, a section title or the body text of an article is not captured. These characteristics are easy for a human to distinguish, yet they remain difficult for a machine to recognize. The main purpose of this thesis is to investigate how well section titles in the newspaper Svenska Dagbladet can be classified by using so-called image embeddings as features. A secondary aim is to examine whether section titles become harder to classify in older newspaper data. Lastly, we explore whether manual annotation work can be reduced by using the predictions of a semi-supervised classifier to help in the labeling process. Results indicate that the use of image embeddings helps quite substantially in classifying section titles. Datasets from three different time periods (1990-1997, 2004-2013, and 2017 and onwards) were sampled and annotated. The best performing model (XGBoost) achieved macro F1 scores of 0.886, 0.936 and 0.980 for the respective time periods. The results also showed that classification became more difficult on older newspapers. Furthermore, a semi-supervised classifier managed an average precision of 83% with only single section title examples, showing promise as a way to speed up manual annotation of data.
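
The macro F1 scores quoted above average the per-class F1 without weighting by class frequency, so rare section titles count as much as common ones. A minimal sketch of that metric in plain Python follows; the class names and predictions are made up for illustration and are not data from the thesis.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six text regions.
y_true = ["title", "title", "body", "body", "body", "header"]
y_pred = ["title", "body", "body", "body", "body", "header"]

print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")
```

Here the per-class scores are 2/3 for `title`, 6/7 for `body` and 1 for `header`, so one missed title lowers the average noticeably even though most regions are body text.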

27

Shunmugam, Nagarajan. "Operational data extraction using visual perception." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-292216.

Abstract:

The information era has led truck manufacturers and logistics solution providers to lean towards software as a service (SaaS) based solutions. With advancements in software technologies like artificial intelligence and deep learning, the domain of computer vision has achieved such significant performance boosts that it competes with hardware-based solutions. Firstly, data is collected from a large number of sensors, which can increase production costs and the carbon footprint. Secondly, certain useful physical quantities/variables are impossible to measure or turn out to be very expensive to measure. So in this dissertation, we investigate the feasibility of providing a similar solution using a single sensor (a dashboard camera) to measure multiple variables. This provides a sustainable solution even when scaled up to huge fleets. The video frames that can be collected from the visual perception of the truck (i.e. the on-board camera of the truck) are processed by deep learning techniques, and operational data can be extracted. Certain techniques, such as image classification and semantic segmentation, were experimented with, and their outputs show potential to replace costly hardware counterparts like lidar- or radar-based solutions.
The information age has led truck manufacturers and logistics solution providers to lean towards software as a service (SaaS) based solutions. With advances in software technology such as artificial intelligence and deep learning, the domain of computer vision has achieved significant performance gains that let it compete with hardware-based solutions. Firstly, data is collected from a large number of sensors, which can increase production costs and the carbon footprint. Secondly, certain useful physical quantities/variables are impossible to measure or turn out to be very expensive to measure. In this thesis we therefore investigate the possibility of providing a similar solution using a single sensor (a dashboard camera) to measure multiple variables. This gives a sustainable solution even when scaled up to large fleets. Video frames that can be collected from the truck's visual perception (i.e. the truck's on-board camera) are processed by deep learning techniques, and operational data can be extracted. Certain techniques, such as image classification and semantic segmentation, were experimented with, and their outputs show potential to replace expensive hardware such as lidar- or radar-based solutions.

28

Favia, Federico. "Real-time hand segmentation using deep learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-292930.

Abstract:

Hand segmentation is a fundamental part of many computer vision systems aimed at gesture recognition or hand tracking. In particular, augmented reality solutions need a very accurate gesture analysis system in order to satisfy end consumers. Therefore the hand segmentation step is critical. Segmentation is a well-known problem in image processing: the process of dividing a digital image into multiple regions with pixels of similar qualities. Classifying which pixels belong to the hand and which belong to the background needs to be performed in real time and with reasonable computational complexity. While in the past mainly lightweight probabilistic and machine learning approaches were used, this work investigates the challenges of real-time hand segmentation achieved through several deep learning techniques. Is it possible to improve current state-of-the-art segmentation systems for smartphone applications? Several models are tested and compared based on accuracy and processing speed. A transfer-learning-like approach guides the method of this work, since many architectures were built just for generic semantic segmentation or for particular applications such as autonomous driving. Great effort is spent on organizing a solid and generalized dataset of hands, exploiting existing datasets and data collected by ManoMotion AB. Since the first aim was to obtain a really accurate hand segmentation, the RefineNet architecture is selected in the end, and both quantitative and qualitative evaluations are performed, considering its advantages and analysing the problems related to computational time, which could be improved in the future.
Hand segmentation is a fundamental part of many computer vision systems aimed at gesture recognition or hand tracking. In particular, augmented reality solutions need a very accurate gesture analysis system to satisfy end consumers appropriately. The hand segmentation step is therefore critical. Segmentation is a well-known problem in image processing, namely the process of dividing a digital image into several regions with pixels of similar qualities. Classifying which pixels belong to the hand and which belong to the background must be done in real time and with reasonable computational complexity. While mainly lightweight probabilistic and machine learning methods were used in the past, this work investigates the challenges of real-time hand segmentation achieved through several deep learning techniques. Is it possible to improve current state-of-the-art segmentation systems for smartphone applications? Several models are tested and compared based on accuracy and processing speed. A transfer-learning-like approach guides the method of this work, since many architectures were built only for generic semantic segmentation or for specific applications such as autonomous driving. Great effort is spent on organizing a solid and generalized data set of hands, using existing data sets and data collected by ManoMotion AB. Since the primary aim was to obtain a really accurate hand segmentation, the RefineNet architecture is ultimately selected, and both quantitative and qualitative evaluations are performed, taking its advantages into account and analysing the problems related to computation time, which could be improved in the future.

29

Rzechowski, Kamil. "Ball tracking algorithm for mobile devices." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-290367.

Abstract:

Object tracking seeks to determine the object size and location in the following video frames, given the appearance and location of the object in the first frame. Object tracking approaches can be divided into two categories: online trained trackers and offline trained trackers. The first group of trackers is based on handcrafted features like HOG or Color Names. This group is characterised by high inference speed, but suffers from a lack of highly deterministic features. The second group, on the other hand, uses Convolutional Neural Networks as feature extractors. They generate highly meaningful features, but limit the inference speed and the possibility of learning object appearance in the offline phase. The following report investigates the problem of tracking a soccer ball on mobile devices. Keeping in mind the limited computational resources of mobile devices, we propose a fused tracker. At the beginning of the video the simple online trained tracker is started. As soon as that tracker loses the ball, the more advanced tracker, based on deep neural networks, is started. The fusion speeds up inference by using the simple tracker as much as possible, but keeps the tracking success rate high by using the more advanced tracker after the object is lost by the first tracker. Both quantitative and qualitative experiments demonstrate the validity of this approach.
Object tracking aims to determine the object's size and location in the following video frames, given the object's appearance and location in the first frame. Object tracking methods can be divided into two categories: online trained trackers and offline trained trackers. The first group of trackers is based on handcrafted features such as HOG or Color Names. This group is characterised by high inference speed, but suffers from a lack of highly deterministic features. The second group, on the other hand, uses Convolutional Neural Networks as feature extractors. They generate highly meaningful features, but limit the inference speed and the possibility of learning object appearance in the offline phase. This report investigates the problem of tracking a soccer ball on mobile devices. Given the limited computational resources of mobile devices, we propose a fused tracker. At the beginning of the video the simple online trained tracker is started. As soon as that tracker loses the ball, the more advanced tracker, based on deep neural networks, is started. The fusion makes it possible to speed up inference by using the simple tracker as much as possible, while keeping the tracking success rate high by using the more advanced tracker after the first tracker has lost the object. Both quantitative and qualitative experiments show that this approach is valid.
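
The fused tracker described in this abstract can be sketched as a simple fallback loop. This is an illustration only: the abstract does not say whether control returns to the simple tracker once the ball is found again, so the sketch assumes the fast tracker is tried on every frame and the deep tracker runs only when it fails. The `Tracker` type, `fused_track` and the toy trackers are hypothetical names, and re-initialising the fast tracker from the deep tracker's result is omitted.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
# A tracker returns a bounding box, or None when it has lost the target.
Tracker = Callable[[dict], Optional[Box]]


def fused_track(
    frames: Sequence[dict], fast: Tracker, deep: Tracker
) -> List[Tuple[str, Optional[Box]]]:
    """Try the cheap tracker first; fall back to the deep one only on failure."""
    results = []
    for frame in frames:
        box = fast(frame)
        source = "fast"
        if box is None:  # cheap tracker lost the ball on this frame
            box = deep(frame)
            source = "deep"
        results.append((source, box))
    return results


# Toy stand-ins: the fast tracker fails on occluded frames, the deep one never does.
def toy_fast(frame: dict) -> Optional[Box]:
    return None if frame["occluded"] else frame["ball"]


def toy_deep(frame: dict) -> Optional[Box]:
    return frame["ball"]


frames = [
    {"ball": (10, 10, 5, 5), "occluded": False},
    {"ball": (12, 11, 5, 5), "occluded": True},
    {"ball": (15, 12, 5, 5), "occluded": False},
]
track = fused_track(frames, toy_fast, toy_deep)
print([source for source, _ in track])
```

The expensive model is invoked only on the frames where the cheap one gives up, which is the trade-off between inference time and success rate that the abstract describes.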
