Dissertations / Theses on the topic 'Deep learning segmentation'

Consult the top 50 dissertations / theses for your research on the topic 'Deep learning segmentation.'


1

Favia, Federico. "Real-time hand segmentation using deep learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-292930.

Abstract:
Hand segmentation is a fundamental part of many computer vision systems aimed at gesture recognition or hand tracking. In particular, augmented reality solutions need a very accurate gesture analysis system in order to satisfy end consumers. The hand segmentation step is therefore critical. Segmentation is a well-known problem in image processing: it is the process of dividing a digital image into multiple regions of pixels with similar qualities. Classifying which pixels belong to the hand and which belong to the background must be done in real time and with reasonable computational complexity. While in the past mainly lightweight probabilistic and machine learning approaches were used, this work investigates the challenges of real-time hand segmentation achieved through several deep learning techniques. Is it possible to improve current state-of-the-art segmentation systems for smartphone applications? Several models are tested and compared based on accuracy and processing speed. A transfer-learning-like approach guides the method of this work, since many architectures were built only for generic semantic segmentation or for particular applications such as autonomous driving. Great effort is spent on organizing a solid and generalized dataset of hands, exploiting existing datasets and data collected by ManoMotion AB. Since the first aim was to obtain a really accurate hand segmentation, the RefineNet architecture is selected in the end, and both quantitative and qualitative evaluations are performed, considering its advantages and analysing the problems related to computational time, which could be improved in the future.
2

Sarpangala, Kishan. "Semantic Segmentation Using Deep Learning Neural Architectures." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin157106185092304.

3

Wen, Shuangyue. "Automatic Tongue Contour Segmentation using Deep Learning." Thesis, Université d'Ottawa / University of Ottawa, 2018. http://hdl.handle.net/10393/38343.

Abstract:
Ultrasound is one of the primary technologies used for clinical purposes. Ultrasound systems have favorable real-time capabilities and are fast, relatively inexpensive, portable and non-invasive. Recent interest in using ultrasound imaging of tongue motion has found applications in linguistic study, speech therapy and foreign language education, where visual feedback of tongue motion complements conventional audio feedback. Ultrasound images are known to be difficult to interpret. Their anatomical structure, the rapidity of tongue movements, missing segments in some frames and the limited frame rate of ultrasound systems have made automatic tongue contour extraction and tracking very challenging, especially for real-time applications. Traditional image processing-based approaches have many practical limitations in terms of automation, speed, and accuracy. Recent progress in deep convolutional neural networks has been successfully exploited in a variety of computer vision problems such as detection, classification, and segmentation. In the past few years, deep belief networks for tongue segmentation and convolutional neural networks for the classification of tongue motion have been proposed. However, none of these claim fully automatic or real-time performance. U-Net is one of the most popular deep learning algorithms for image segmentation, and it is composed of several convolution and deconvolution layers. In this thesis, we propose a fully automatic system to extract the tongue dorsum from ultrasound videos in real time using a simplified version of U-Net, which we call sU-Net. Two databases from different machines were collected, and different training schemes were applied to test the learning capability of the model. Our experiment on ultrasound video data demonstrates that the proposed method is very competitive with other methods in terms of performance and accuracy.
4

Ananya. "DEEP LEARNING METHODS FOR CROP AND WEED SEGMENTATION." Case Western Reserve University School of Graduate Studies / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case1528372119706623.

5

Tosteberg, Patrik. "Semantic Segmentation of Point Clouds Using Deep Learning." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-136793.

Abstract:
In computer vision, it has in recent years become more popular to use point clouds to represent 3D data. To understand what a point cloud contains, methods like semantic segmentation can be used. Semantic segmentation is the problem of segmenting images or point clouds and understanding what the different segments are. One application of semantic segmentation of point clouds is autonomous driving, where the car needs information about objects in its surroundings. Our approach to the problem is to project the point clouds into 2D virtual images using the Katz projection. Then we use pre-trained convolutional neural networks to semantically segment the images. To get the semantically segmented point clouds, we project the scores from the segmentation back into the point cloud. Our approach is evaluated on the Semantic3D dataset. We find that our method is comparable to the state of the art, without any fine-tuning on the Semantic3D dataset.
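As an illustration of the back-projection step described in this abstract, the following minimal Python sketch sums per-pixel class scores from several virtual views onto the points they depict and labels each point with its highest-scoring class. The function name `fuse_scores`, the data layout, and the assumption that point-to-pixel correspondences are already known are all hypothetical; neither the Katz projection nor the thesis's actual fusion rule is reproduced here.

```python
def fuse_scores(num_points, num_classes, views):
    """Sum class scores from several 2D views onto 3D points.

    views is a list of (pixel_to_point, pixel_scores) pairs, where
    pixel_to_point[k] is the index of the point seen at pixel k
    (None if the pixel shows no point) and pixel_scores[k] is a
    list of num_classes scores for pixel k.
    Returns one class index per point, or None for unseen points.
    """
    totals = [[0.0] * num_classes for _ in range(num_points)]
    seen = [False] * num_points
    for pixel_to_point, pixel_scores in views:
        for point, scores in zip(pixel_to_point, pixel_scores):
            if point is None:
                continue  # background pixel: nothing to project back
            seen[point] = True
            for c in range(num_classes):
                totals[point][c] += scores[c]
    labels = []
    for i in range(num_points):
        if seen[i]:
            labels.append(totals[i].index(max(totals[i])))
        else:
            labels.append(None)
    return labels


# Two hypothetical views of a three-point cloud with two classes.
views = [
    ([0, 1, None], [[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]]),
    ([1, 0], [[0.2, 0.8], [0.3, 0.7]]),
]
labels = fuse_scores(3, 2, views)
```

Summing scores before taking the arg-max lets a point seen in several views benefit from all of them; other fusion rules (for example, majority voting) would fit the same interface.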
6

Kolhatkar, Dhanvin. "Real-Time Instance and Semantic Segmentation Using Deep Learning." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/40616.

Abstract:
In this thesis, we explore the use of Convolutional Neural Networks for semantic and instance segmentation, with a focus on studying the application of existing methods with cheaper neural networks. We modify a fast object detection architecture for the instance segmentation task, and study the concepts behind these modifications both in the simpler context of semantic segmentation and the more difficult context of instance segmentation. Various instance segmentation branch architectures are implemented in parallel with a box prediction branch, using its results to crop each instance's features. We negate the imprecision of the final box predictions and eliminate the need for bounding box alignment by using an enlarged bounding box for cropping. We report and study the performance, advantages, and disadvantages of each. We achieve fast speeds with all of our methods.
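The idea of cropping with an enlarged bounding box can be sketched as follows. This is only an illustration: the function name `enlarge_box`, the `scale` parameter and the clamping to the image border are assumptions, and the abstract does not state how much the boxes are actually enlarged.

```python
def enlarge_box(box, scale, image_width, image_height):
    """Grow an (x_min, y_min, x_max, y_max) box about its centre.

    scale is a hypothetical enlargement factor (1.0 leaves the box
    unchanged). The result is clamped to the image bounds.
    """
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0
    half_w = (x_max - x_min) * scale / 2.0
    half_h = (y_max - y_min) * scale / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(image_width), cx + half_w),
        min(float(image_height), cy + half_h),
    )


inner = enlarge_box((10, 10, 20, 20), 1.5, 100, 100)
border = enlarge_box((0, 0, 10, 10), 2.0, 100, 100)
```

A slightly oversized crop tolerates small errors in the predicted box, because the instance is still likely to lie fully inside the cropped features.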
7

Wang, Wei. "Image Segmentation Using Deep Learning Regulated by Shape Context." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-227261.

Abstract:
In recent years, image segmentation using deep neural networks has made great progress. However, reaching a good result when training with a small amount of data remains a challenge. To find a good way to improve segmentation accuracy with limited datasets, we implemented a new automatic chest radiograph segmentation experiment, based on preliminary work by Chunliang, using a deep neural network combined with shape context information. The datasets were first fed into the original U-Net. After this preliminary step, the segmented images were repaired by a new network using shape context information. In this experiment, we created a new network structure by rebuilding the U-Net into a two-input structure and refined the processing pipeline. In the proposed pipeline, the datasets and shape context were trained together through the new network model by iteration. The proposed method was evaluated on 247 posterior-anterior chest radiographs from public datasets, and n-fold cross-validation was also used. The results show that, compared to the original U-Net, the proposed pipeline reaches higher accuracy when trained with limited datasets. Here, "limited" datasets refers to 1-20 images in the medical imaging field. A better outcome with higher accuracy can be reached if the second structure is further refined and the shape context generator's parameter is fine-tuned in the future.
8

Chen, Yani. "Deep Learning based 3D Image Segmentation Methods and Applications." Ohio University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1547066297047003.

9

Liu, Dongnan. "Supervised and Unsupervised Deep Learning-based Biomedical Image Segmentation." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/24744.

Abstract:
Biomedical image analysis plays a crucial role in the development of healthcare, with a wide scope of applications including disease diagnosis, clinical treatment, and future prognosis. Among various biomedical image analysis techniques, segmentation is an essential step, which aims at assigning each pixel category and instance labels of interest. At the early stage, segmentation results were obtained via manual annotation, which is time-consuming and error-prone. Over the past few decades, hand-crafted feature-based methods have been proposed to segment biomedical images automatically. However, these methods rely heavily on prior knowledge, which limits their generalization ability across various biomedical images. With the recent advances in deep learning, convolutional neural network (CNN) based methods have achieved state-of-the-art performance on various natural and biomedical image segmentation tasks. The great success of CNN-based segmentation methods results from their ability to learn contextual and local information from the high-dimensional feature space. However, biomedical image segmentation tasks are particularly challenging, due to complicated background components, the high variability of object appearances, numerous overlapping objects, and ambiguous object boundaries. It is therefore necessary to establish automated deep learning-based segmentation paradigms that are capable of processing the complicated semantic and morphological relationships in various biomedical images. In this thesis, we propose novel deep learning-based methods for fully supervised and unsupervised biomedical image segmentation tasks. In the first part of the thesis, we introduce fully supervised deep learning-based segmentation methods for various biomedical image analysis scenarios.
First, we design a panoptic structure paradigm for nuclei instance segmentation in histopathology images and cell instance segmentation in fluorescence microscopy images. Traditional proposal-based and proposal-free instance segmentation methods are only capable of leveraging either global contextual or local instance information. Our panoptic paradigm, however, integrates both and therefore achieves better performance. Second, we propose a multi-level feature fusion architecture for semantic neuron membrane segmentation in electron microscopy (EM) images. Third, we propose a 3D anisotropic paradigm for brain tumor segmentation in magnetic resonance images, which enlarges the model's receptive field while maintaining memory efficiency. Although our fully supervised methods achieve competitive performance on several biomedical image segmentation tasks, they rely heavily on annotations of the training images. However, labeling pixel-level segmentation ground truth for biomedical images is expensive and labor-intensive. Consequently, exploring unsupervised segmentation methods that do not require annotations is an important topic for biomedical image analysis. In the second part of the thesis, we focus on unsupervised biomedical image segmentation methods. First, we propose a panoptic feature alignment paradigm for unsupervised nuclei instance segmentation in histopathology images and mitochondria instance segmentation in EM images. To the best of our knowledge, we are the first to design an unsupervised deep learning-based method for various biomedical image instance segmentation tasks. Second, we design a feature disentanglement architecture for unsupervised object recognition.
In addition to the unsupervised instance segmentation for the biomedical images, our method also achieves state-of-the-art performance on the unsupervised object detection for natural images, which further demonstrates its effectiveness and high generalization ability.
10

Granli, Petter. "Semantic segmentation of seabed sonar imagery using deep learning." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160561.

Abstract:
To investigate the large parts of the ocean that have yet to be mapped, there is a need for autonomous underwater vehicles. Current state-of-the-art underwater positioning often relies on external data from other vessels or beacons. Processing seabed image data could potentially improve autonomy for underwater vehicles. In this thesis, image data from a synthetic aperture sonar (SAS) was manually segmented into two classes: sand and gravel. Two different convolutional neural networks (CNNs) were trained using different loss functions, and the results were examined. The best-performing network, U-Net trained with the IoU loss function, achieved Dice coefficient and IoU scores of 0.645 and 0.476, respectively. It was concluded that CNNs are a viable approach for segmenting SAS image data, but there is much room for improvement.
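For readers unfamiliar with the two metrics quoted in this abstract, the sketch below computes the Dice coefficient and IoU for a pair of binary masks using their standard definitions. The function names and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the thesis.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two flat binary masks of 0/1 values."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    pred_sum = sum(pred)
    target_sum = sum(target)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks empty: treated as a perfect match (a convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


def dice_from_iou(iou):
    """For a single pair of masks, Dice = 2 * IoU / (1 + IoU)."""
    return 2.0 * iou / (1.0 + iou)


dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

On a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU). Plugging in the reported IoU of 0.476 gives approximately 0.645, which agrees with the reported Dice score; whether the thesis computed its scores this way or averaged them over images is not stated in the abstract.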
11

Bou, Albert. "Deep Learning models for semantic segmentation of mammography screenings." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-265652.

Abstract:
This work explores the performance of state-of-the-art semantic segmentation models on mammographic imagery. It does so by comparing several reference semantic segmentation deep learning models on a newly proposed medical dataset of mammography screenings. All models are re-implemented in TensorFlow and validated first on the benchmark dataset Cityscapes. The new medical image corpus was gathered and annotated at the Science for Life Laboratory in Stockholm. In addition, this master's thesis shows that it is possible to boost segmentation performance by training the models in an adversarial manner after reaching convergence in the classical training framework.
12

Singh, Amarjot. "ScatterNet hybrid frameworks for deep learning." Thesis, University of Cambridge, 2019. https://www.repository.cam.ac.uk/handle/1810/285997.

Abstract:
Image understanding is the task of interpreting images by effectively solving the individual tasks of object recognition and semantic image segmentation. An image understanding system must have the capacity to distinguish between similar-looking image regions while being invariant in its response to regions that have been altered by appearance-altering transformations. The fundamental challenge for any such system lies in this simultaneous requirement for both invariance and specificity. Many image understanding systems have been proposed that capture geometric properties such as shapes, textures, motion and 3D perspective projections using filtering, non-linear modulus, and pooling operations. Deep learning networks ignore these geometric considerations and compute descriptors having suitable invariance and stability to geometric transformations using (end-to-end) learned multi-layered network filters. In recent years, these deep learning networks have come to dominate the previously separate research fields of machine learning, computer vision, natural language understanding and speech recognition. Despite the success of these deep networks, there remains a fundamental lack of understanding of their design and optimization, which makes them difficult to develop. Also, training these networks requires large labeled datasets, which in numerous applications may not be available. In this dissertation, we propose the ScatterNet Hybrid Framework for Deep Learning, which is inspired by the circuitry of the visual cortex. The framework uses a hand-crafted front-end, an unsupervised learning-based middle section, and a supervised back-end to rapidly learn hierarchical features from unlabelled data. Each layer in the proposed framework is automatically optimized to produce the desired computationally efficient architecture. The term 'Hybrid' is coined because the framework uses both unsupervised and supervised learning.
We propose two hand-crafted front-ends that can extract locally invariant features from the input signals. Next, two ScatterNet Hybrid Deep Learning (SHDL) networks (one generative and one deterministic) are introduced by combining the proposed front-ends with two unsupervised learning modules that learn hierarchical features. These hierarchical features are finally used by a supervised learning module to solve the task of either object recognition or semantic image segmentation. The proposed front-ends have also been shown to improve the performance and learning of current Deep Supervised Learning Networks (VGG, NIN, ResNet) with reduced computing overhead.
13

Lokegaonkar, Sanket Avinash. "Continual Learning for Deep Dense Prediction." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/83513.

Abstract:
Transferring a deep learning model from old tasks to a new one is known to suffer from catastrophic forgetting. Such forgetting is problematic, as it does not allow us to accumulate knowledge sequentially and requires retaining and retraining on all the training data. Existing techniques for mitigating the abrupt performance degradation on previously trained tasks have mainly been studied in the context of image classification. In this work, we present a simple method to alleviate catastrophic forgetting for pixel-wise dense labeling problems. We build upon the regularization technique that uses knowledge distillation to minimize the discrepancy between the posterior distributions of pixel class labels for old tasks predicted by 1) the original and 2) the updated networks. This technique, however, might fail in circumstances where the source and target distributions differ significantly. To handle this scenario, we further propose an improvement to the distillation-based approach by adding adaptive l2-regularization that depends on the per-parameter importance to the older tasks. We train our model on FCN8s, but our training can be generalized to stronger models such as DeepLab and PSPNet. Through extensive evaluation and comparisons, we show that our technique can incrementally train dense prediction models for novel object classes, different visual domains, and different visual tasks.
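The combination of a distillation term and an importance-weighted l2 penalty described in this abstract can be sketched roughly as follows. The exact loss used in the thesis is not given here, so everything below is an assumption made for illustration: the function names, the cross-entropy form of the distillation term, the `temperature` parameter, and the two weighting factors.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one pixel's class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=1.0):
    """Mean per-pixel cross-entropy between old and new posteriors."""
    loss = 0.0
    for old, new in zip(old_logits, new_logits):
        p_old = softmax(old, temperature)
        p_new = softmax(new, temperature)
        loss -= sum(p * math.log(max(q, 1e-12)) for p, q in zip(p_old, p_new))
    return loss / len(old_logits)


def weighted_l2_penalty(params, old_params, importance):
    """Penalise drift of each parameter in proportion to its importance."""
    return sum(w * (p - p0) ** 2
               for p, p0, w in zip(params, old_params, importance))


def total_loss(new_task_loss, old_logits, new_logits, params, old_params,
               importance, distill_weight=1.0, reg_weight=1.0):
    """New-task loss plus the two forgetting-mitigation terms."""
    return (new_task_loss
            + distill_weight * distillation_loss(old_logits, new_logits)
            + reg_weight * weighted_l2_penalty(params, old_params, importance))


penalty = weighted_l2_penalty([1.0, 2.0], [1.0, 1.0], [0.5, 2.0])
distill = distillation_loss([[0.0, 0.0]], [[0.0, 0.0]])
```

The distillation term keeps the updated network's predictions for old classes close to the original network's, while the weighted penalty discourages changes to the parameters that mattered most for the old tasks.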
Master of Science
14

Simonovsky, Martin. "Deep learning on attributed graphs." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1133/document.

Abstract:
The graph is a powerful concept for representing relations between pairs of entities. Data with an underlying graph structure can be found across many disciplines, describing chemical compounds, surfaces of three-dimensional models, social interactions, or knowledge bases, to name only a few. There is a natural desire to understand such data better. Deep learning (DL) has achieved significant breakthroughs in a variety of machine learning tasks in recent years, especially where data is structured on a grid, such as in text, speech, or image understanding. However, surprisingly little has been done to explore the applicability of DL to graph-structured data directly. The goal of this thesis is to investigate architectures for DL on graphs and study how to transfer, adapt or generalize concepts that work well on sequential and image data to this domain. We concentrate on two important primitives: embedding graphs or their nodes into a continuous vector space representation (encoding) and, conversely, generating graphs from such vectors (decoding). To that end, we make the following contributions. First, we introduce Edge-Conditioned Convolutions (ECC), a convolution-like operation on graphs performed in the spatial domain where filters are dynamically generated based on edge attributes. The method is used to encode graphs with arbitrary and varying structure. Second, we propose SuperPoint Graph, an intermediate point cloud representation with rich edge attributes encoding the contextual relationship between object parts. Based on this representation, ECC is employed to segment large-scale point clouds without major sacrifice in fine details. Third, we present GraphVAE, a graph generator that can decode graphs with a variable but upper-bounded number of nodes, making use of approximate graph matching to align the predictions of an autoencoder with its inputs. The method is applied to the task of molecule generation.
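To make the idea of filters "dynamically generated based on edge attributes" concrete, here is a minimal pure-Python sketch of one such layer. It is an illustration under assumptions rather than the thesis's implementation: the names `ecc_layer` and `filter_net`, the scalar edge attribute, the averaging over neighbours and the bias term are all choices made for this example.

```python
def ecc_layer(features, edges, filter_net, bias):
    """One edge-conditioned convolution step on a small graph.

    features:   dict node -> list of d_in floats
    edges:      dict node i -> list of (neighbour j, edge attribute)
    filter_net: callable mapping an edge attribute to a d_out x d_in
                weight matrix (list of rows); stands in for a learned
                filter-generating network
    bias:       list of d_out floats
    """
    output = {}
    for i, neighbours in edges.items():
        acc = [0.0] * len(bias)
        for j, attr in neighbours:
            weights = filter_net(attr)  # filter depends on the edge
            x = features[j]
            for r, row in enumerate(weights):
                acc[r] += sum(w * v for w, v in zip(row, x))
        count = max(len(neighbours), 1)  # avoid dividing by zero
        output[i] = [a / count + b for a, b in zip(acc, bias)]
    return output


# Hypothetical filter generator: scales the identity by the attribute.
def scaled_identity(attr):
    return [[attr, 0.0], [0.0, attr]]


features = {0: [1.0, 2.0], 1: [3.0, 4.0]}
edges = {0: [(1, 2.0)], 1: [(0, 1.0), (1, 0.0)]}
result = ecc_layer(features, edges, scaled_identity, [0.0, 0.0])
```

In a trained model `filter_net` would be a small neural network whose parameters are learned, so that different edge attributes (for example, relative offsets between points) yield different filters.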
15

von Koch, Christian, and William Anzén. "Detecting Slag Formation with Deep Learning Methods: An experimental study of different deep learning image segmentation models." Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177269.

Abstract:
Image segmentation through neural networks and deep learning has, in the recent decade, become a successful tool for automated decision-making. For Luossavaara-Kiirunavaara Aktiebolag (LKAB), this means identifying the amount of slag inside a furnace through computer vision. There are many prominent convolutional neural network architectures in the literature, and this thesis explores two: a modified U-Net and the PSPNet. The architectures were combined with three loss functions and three class weighting schemes, resulting in 18 model configurations that were evaluated and compared. This thesis also explores transfer learning techniques for neural networks tasked with identifying slag in images from inside a furnace. The benefit of transfer learning is that the network can learn to find features from already labeled data from another context. Finally, the thesis explores how temporal information can be utilised by adding an LSTM layer to a model that takes pairs of images as input instead of a single image. The results show (1) that the PSPNet outperformed the U-Net for all tested configurations in all relevant metrics, (2) that the model is able to find more complex features while converging more quickly by using transfer learning, and (3) that utilising temporal information reduced the variance of the predictions, and that the modified PSPNet using an LSTM layer showed promise in handling images with outlying characteristics.
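The count of 18 configurations follows from 2 architectures × 3 loss functions × 3 class weighting schemes. The short sketch below enumerates such a grid; the abstract does not name the loss functions or weighting schemes, so the labels `loss_1` to `loss_3` and `weighting_1` to `weighting_3` are placeholders, not the ones used in the thesis.

```python
from itertools import product

architectures = ["modified U-Net", "PSPNet"]
# Placeholder labels: the abstract does not say which losses or
# class weighting schemes were actually used.
losses = ["loss_1", "loss_2", "loss_3"]
weightings = ["weighting_1", "weighting_2", "weighting_3"]

# Every combination of architecture, loss and weighting scheme.
configurations = list(product(architectures, losses, weightings))
```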
16

He, Haoyu. "Deep learning based human parsing." Thesis, University of Sydney, 2020. https://hdl.handle.net/2123/24262.

Abstract:
Human parsing, or human body part semantic segmentation, has been an active research topic due to its wide potential applications. Although prior works have made significant progress by introducing large-scale datasets and deep learning to solve the problem, there are still two challenges remain unsolved. Firstly, to better exploit the existing parsing annotations, prior methods learn a knowledge-sharing mechanism to improve semantic structures in cross-dataset human parsing. However, the modeling for such mechanism remains inefficient for not considering classes' granularity difference in different domains. Secondly, the trained models are limited to parsing humans into classes pre-defined in the training data, which lacks the generalization ability to the unseen fashion classes. Targeting at improving feature representations from multi-domain annotations more efficiently, in this thesis, we propose a novel GRAph PYramid Mutual Learning (Grapy-ML) method to address the cross-dataset human parsing problem, where we model the granularity difference through a graph pyramid. Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently. Specifically, the network weights of the first two levels are shared to exchange the learned coarse-granularity information across different datasets. At each level, GPM utilizes the self-attention mechanism to model the correlations between context nodes. Then, it adopts a top-down mechanism to progressively refine the hierarchical features through all the levels. GPM also enables efficient mutual learning. By making use of the multi-granularity labels, Grapy-ML learns a more discriminative feature representation and achieves state-of-the-art performance, which is demonstrated by extensive experiments on the three popular benchmarks, e.g., CIHP dataset. 
To bridge the generalizability gap, in this thesis we propose a new problem named one-shot human parsing (OSHP), which requires parsing humans into an open set of reference classes defined by any single reference example. During training, only base classes defined in the training set are exposed, which can overlap with part of the reference classes. We devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges in this problem, i.e., testing bias and small size. POPNet consists of two collaborative metric learning modules named Attention Guidance Module (AGM) and Nearest Centroid Module (NCM), which can learn representative prototypes for base classes and quickly transfer the ability to the unseen classes during testing, thereby reducing the testing bias. Moreover, POPNet adopts a progressive human parsing framework that can incorporate the learned knowledge of parent classes at the coarse granularity to help recognize the unseen descendant classes at the fine granularity, thereby handling the small size issue. Experiments on the ATR-OS benchmark, tailored for OSHP, demonstrate that POPNet outperforms other representative one-shot segmentation models by large margins and establishes a strong baseline for the new problem.
17

La, Rosa Francesco. "A deep learning approach to bone segmentation in CT scans." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14561/.

Abstract:
This thesis proposes a deep learning approach to bone segmentation in abdominal CT scans. Segmentation is a common initial step in medical image analysis, often fundamental for computer-aided detection and diagnosis systems. The extraction of bones in CT scans is a challenging task: done manually by experts it is time-consuming, and no automatic solution is yet broadly recognized. The method presented is based on a convolutional neural network, inspired by the U-Net and trained end-to-end, that performs a semantic segmentation of the data. The training dataset is made up of 21 abdominal CT scans, each one containing between 403 and 994 2D transversal images. Those images are in full resolution, 512x512 voxels, and each voxel is classified by the network into one of the following classes: background, femoral bones, hips, sacrum, sternum, spine and ribs. The output is therefore a bone mask where the bones are recognized and divided into six different classes. On the testing dataset, labeled by experts, the best model achieves an average Dice coefficient of 0.93 across all bone classes. This work demonstrates, to the best of my knowledge for the first time, the feasibility of automatic bone segmentation and classification for CT scans using a convolutional neural network.
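The abstract above reports a Dice coefficient averaged over bone classes. As a minimal sketch of how such a per-class average could be computed, assuming flattened integer label maps and the usual definition of Dice as twice the overlap divided by the sum of the two region sizes, consider the following. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
def dice_coefficient(pred, target, label):
    """Dice overlap for one class label between two equally sized label maps."""
    pred_hits = sum(1 for p in pred if p == label)
    target_hits = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_hits + target_hits == 0:
        # Assumption: a class absent from both maps counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / (pred_hits + target_hits)


def mean_dice(pred, target, labels):
    """Average the per-class Dice over the given foreground labels."""
    return sum(dice_coefficient(pred, target, c) for c in labels) / len(labels)


# Toy flattened label maps: 0 = background, 1 and 2 = two hypothetical bone classes.
prediction = [0, 1, 1, 2, 2, 0]
reference = [0, 1, 2, 2, 2, 0]
print(mean_dice(prediction, reference, labels=[1, 2]))
```

Background is left out of the average here, which mirrors the phrase "average of all bone classes"; whether the thesis handles empty classes the same way is not stated.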
18

Torrents, Barrena Jordina. "Deep learning -based segmentation methods for computer-assisted fetal surgery." Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/668188.

Abstract:
This thesis focuses on the development of deep learning-based image processing techniques for the detection and segmentation of fetal structures in magnetic resonance imaging (MRI) and 3D ultrasound (US) images of singleton and twin pregnancies. Special attention is paid to monochorionic twins affected by twin-to-twin transfusion syndrome (TTTS). In this context, we propose the first TTTS fetal surgery planning and simulation platform. Different approaches are utilized to automatically segment the mother’s soft tissue, uterus, placenta, its peripheral blood vessels, and umbilical cord from multiple (axial, sagittal and coronal) MRI views or a super-resolution reconstruction. (Conditional) generative adversarial networks (GANs) are used for segmentation of fetal structures from (3D) US, and the umbilical cord insertion is localized from color Doppler US. Finally, we present a comparative study of deep learning approaches and Radiomics with respect to the segmentation performance on several fetal and maternal anatomies in both MRI and 3D US.
This thesis covers the development of deep learning-based image processing techniques for the detection and segmentation of fetal structures in magnetic resonance (MR) and three-dimensional ultrasound (US) images of normal and twin pregnancies. Special emphasis is placed on monochorionic twins affected by twin-to-twin transfusion syndrome (TTTS). In this context, the first surgical planning and simulation platform aimed at TTTS is proposed. Different methods are used to automatically segment the maternal tissue, the uterus, the placenta, its peripheral vessels, and the umbilical cord from the different MR views or from a super-resolution volume. (Conditional) generative adversarial networks are used to segment structures in three-dimensional US images, and the cord insertion is localized from Doppler US. Finally, a comparative study of deep learning methodologies and Radiomics is presented.
19

Carrizo, Gabriel. "Organ Segmentation Using Deep Multi-task Learning with Anatomical Landmarks." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-241640.

Abstract:
This master's thesis studies multi-task learning for training a neural network to segment medical images and predict anatomical landmarks. It reports experiments that use medical landmarks in an attempt to help the network learn the important organ structures more quickly. The results of this study are inconclusive: rather than showing the efficiency of the multi-task framework for learning, they tell a story of the importance of choosing the tasks and dataset wisely. The study also reflects on the general difficulties and pitfalls of carrying out a project of this type.
20

Wan, Fengkai. "Deep Learning Method used in Skin Lesions Segmentation and Classification." Thesis, KTH, Medicinsk teknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233467.

Abstract:
Malignant melanoma (MM) is a type of skin cancer that is associated with a very poor prognosis and can often lead to death. Early detection is crucial in order to administer the right treatment successfully but currently requires the expertise of a dermatologist. In the past years, studies have shown that automatic detection of MM is possible through computer vision and machine learning methods. Skin lesion segmentation and classification are the key methods in supporting automatic detection of different skin lesions. Compared with traditional computer vision as well as other machine learning methods, deep neural networks currently show the greatest promise both in segmentation and classification. In our work, we have implemented several deep neural networks to achieve the goals of skin lesion segmentation and classification. We have also applied different training schemes. Our best segmentation model achieves pixel-wise accuracy of 0.940, Dice index of 0.867 and Jaccard index of 0.765 on the ISIC 2017 challenge dataset. This surpassed the official state of the art model whose pixel-wise accuracy was 0.934, Dice index 0.849 and Jaccard Index 0.765. We have also trained a segmentation model with the help of adversarial loss which improved the baseline model slightly. Our experiments with several neural network models for skin lesion classification achieved varying results. We also combined both segmentation and classification in one pipeline meaning that we were able to train the most promising classification model on pre-segmented images. This resulted in improved classification performance. The binary (melanoma or not) classification from this single model trained without extra data and clinical information reaches an area under the curve (AUC) of 0.684 on the official ISIC test dataset. Our results suggest that automatic detection of skin cancers through image analysis shows significant promise in early detection of malignant melanoma.
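The abstract above quotes both a Dice index and a Jaccard index. For a single pair of masks, a predicted region $A$ and a reference region $B$, the two are linked by a standard identity, sketched below under the usual set-based definitions. The identity holds per mask pair; it need not hold between dataset-level averages such as those reported above.

```latex
\begin{align}
D &= \frac{2\,|A \cap B|}{|A| + |B|}, \qquad
J = \frac{|A \cap B|}{|A \cup B|}, \qquad
|A \cup B| = |A| + |B| - |A \cap B| \\
J &= \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
   = \frac{\tfrac{D}{2}\,(|A| + |B|)}{(|A| + |B|)\left(1 - \tfrac{D}{2}\right)}
   = \frac{D}{2 - D}
\end{align}
```

For example, a single mask pair with $D = 0.8$ has $J = 0.8 / 1.2 \approx 0.667$, so for any imperfect, non-empty overlap the Jaccard index is lower than the Dice index.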
21

Kobold, Jonathan. "Deep Learning for lesion and thrombus segmentation from cerebral MRI." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLE044.

Abstract:
Deep learning is the world's best set of methods for identifying objects in images. Stroke is a deadly disease whose treatment requires identifying objects in medical imaging. This seems an obvious combination, yet joining the two is not trivial. Lesion segmentation in cerebral MRI has received attention from researchers, but thrombus segmentation is still unexplored. This work shows that contemporary convolutional neural network architectures cannot reliably identify the thrombus on MRI. It also demonstrates why these models do not work on this problem. Building on this knowledge, a recurrent neural network architecture called logic-LSTM was developed, which can take into account the way physicians identify the thrombus. This architecture not only provides the first reliable thrombus identification but also yields new insights into neural network design. In particular, the methods for enlarging the receptive field are enriched with a new parameter-free option. Finally, the logic-LSTM also improves lesion segmentation results, providing lesion segmentation at a human level of performance.
Deep learning, the world's best set of methods for identifying objects on images. Stroke, a deadly disease whose treatment requires identifying objects on medical imaging. Sounds like an obvious combination, yet it is not trivial to marry the two. Segmenting the lesion from stroke MRI has had some attention in the literature, but thrombus segmentation is still uncharted territory. This work shows that contemporary convolutional neural network architectures cannot reliably identify the thrombus on stroke MRI. It also demonstrates why these models do not work on this problem. With this knowledge, a recurrent neural network architecture, the logic LSTM, is developed that takes into account the way medical doctors identify the thrombus. Not only does this architecture provide the first reliable thrombus identification, it also provides new insights into neural network design. In particular, the methods for increasing the receptive field are enriched with a new parameter-free option. Last but not least, the logic LSTM also improves the results of lesion segmentation by providing a lesion segmentation with human-level performance.
22

Holmberg, Joakim. "Targeting the zebrafish eye using deep learning-based image segmentation." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-428325.

Abstract:
Researchers studying cardiovascular and metabolic disease in humans commonly use computer vision techniques to segment internal structures of the zebrafish animal model. However, there are no current image segmentation methods to target the eyes of the zebrafish. Segmenting the eyes is essential for accurate measurement of the eyes' size and shape following the experimental intervention. Additionally, successful segmentation of the eyes functions as a good starting point for future segmentation of other internal organs. To establish an effective segmentation method, the deep learning neural network architecture, Deeplab, was trained using 275 images of the zebrafish embryo. Besides model architecture, the training was refined with proper data pre-processing, including data augmentation to add variety and to artificially increase the training data. Consequently, the results yielded a score of 95.88 percent when applying augmentations, and 95.30 percent without augmentations. Despite this minor improvement in accuracy score when using the augmented training dataset, it also produced visibly better predictions on a new dataset compared to the model trained without augmentations. Therefore, the implemented segmentation model trained with augmentations proved to be more robust, as the augmentations gave the model the ability to produce promising results when segmenting on new data.
23

Westermark, Hanna. "Deep Learning with Importance Sampling for Brain Tumor MR Segmentation." Thesis, KTH, Optimeringslära och systemteori, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-289574.

Abstract:
Segmentation of magnetic resonance images is an important part of planning radiotherapy treatments for patients with brain tumours, but due to the number of images contained within a scan and the level of detail required, manual segmentation is a time-consuming task. Convolutional neural networks have been proposed as tools for automated segmentation and shown promising results. However, the data sets used for training these deep learning models are often imbalanced and contain data that does not contribute to the performance of the model. By carefully selecting which data to train on, there is potential to both speed up the training and increase the network’s ability to detect tumours. This thesis implements the method of importance sampling for training a convolutional neural network for patch-based segmentation of three-dimensional multimodal magnetic resonance images of the brain and compares it with the standard way of sampling in terms of network performance and training time. Training is done for two different patch sizes. Features of the most frequently sampled volumes are also analysed. Importance sampling is found to speed up training in terms of number of epochs and also yield models with improved performance. Analysis of the sampling trends indicates that when patches are large, small tumours are somewhat frequently trained on; however, more investigation is needed to confirm what features may influence the sampling frequency of a patch.
Segmentation of magnetic resonance images is an important part of planning radiotherapy for patients with brain tumours. However, the large number of images and the required level of precision make manual segmentation a time-consuming task. Convolutional neural networks have therefore been proposed as a tool for automated segmentation and have shown promising results. The data sets used to train these deep learning models are often imbalanced and contain data that does not contribute to the model's performance. There is therefore potential both to speed up training and to improve the network's ability to segment tumours by carefully selecting which data is used for training. This thesis implements importance sampling to train a convolutional neural network for patch-based segmentation of three-dimensional multimodal magnetic resonance images of the brain. The model's training time and performance are compared against a network trained with the standard method. This is done for two different patch sizes. The properties of the most frequently selected volumes are also analysed. Importance sampling shows a faster training process in terms of number of epochs and also results in models with higher performance. Analysis of the most frequently selected volumes indicates that small tumours occur to a somewhat greater extent during training with large patches. Further investigation is, however, needed to confirm which aspects affect how often a volume is used.
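The abstract above describes selecting training patches by importance sampling instead of uniformly. The abstract does not state which importance score the thesis uses, so the sketch below assumes a common choice: sampling probability proportional to each patch's most recent loss, mixed with a small uniform term so that no patch is starved. All names and the `uniform_mix` parameter are hypothetical.

```python
import random


def sampling_probabilities(losses, uniform_mix=0.1):
    """Probability per patch: mostly loss-proportional, partly uniform."""
    n = len(losses)
    total = sum(losses)
    if total == 0:
        return [1.0 / n] * n
    return [(1 - uniform_mix) * loss / total + uniform_mix / n for loss in losses]


def draw_batch(losses, batch_size, rng, uniform_mix=0.1):
    """Draw patch indices plus the weights that correct for non-uniform sampling."""
    probs = sampling_probabilities(losses, uniform_mix)
    indices = rng.choices(range(len(losses)), weights=probs, k=batch_size)
    # Weight 1 / (n * p_i) keeps the expected weighted loss equal to the uniform mean.
    weights = [1.0 / (len(losses) * probs[i]) for i in indices]
    return indices, weights


# Toy example: patch 3 currently has the largest loss and is drawn most often.
patch_losses = [0.1, 0.2, 0.1, 1.6]
batch_indices, batch_weights = draw_batch(patch_losses, batch_size=8, rng=random.Random(0))
print(batch_indices, batch_weights)
```

In a training loop, each sampled patch's loss would be multiplied by its weight before averaging, and `patch_losses` would be refreshed as the network improves.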
24

Bahl, Gaétan. "Architectures deep learning pour l'analyse d'images satellite embarquée." Thesis, Université Côte d'Azur, 2022. https://tel.archives-ouvertes.fr/tel-03789667.

Abstract:
Advances in high-resolution Earth observation satellites and the reduction in revisit times brought about by the creation of satellite constellations have led to the daily creation of large amounts of images (hundreds of terabytes per day). Simultaneously, the popularization of Deep Learning techniques has enabled the development of architectures capable of extracting the semantic content of images. Although these algorithms generally require powerful hardware, low-power AI inference accelerators have recently been developed and have the potential to be used in the next generations of satellites, thus opening the possibility of onboard analysis of satellite images. By extracting the information of interest from satellite images directly onboard, the use of bandwidth, storage and memory can be reduced considerably. Current and future applications, such as disaster response, precision agriculture and climate monitoring, would benefit from lower processing latency, or even real-time alerts. In this thesis, our goal is two-fold: on the one hand, we design efficient Deep Learning architectures able to run on low-power devices, such as satellites or drones, while retaining sufficient accuracy. On the other hand, we design our algorithms keeping in mind the importance of having a compact output that can be efficiently computed, stored, and transmitted to the ground or to other satellites in a constellation. First, using depth-wise separable convolutions and convolutional recurrent neural networks, we design efficient semantic segmentation neural networks with a low number of parameters and low memory usage.
We apply these architectures to cloud and forest segmentation in satellite images. We also design a specific architecture for cloud segmentation on the FPGA of OPS-SAT, a satellite launched by ESA in 2019, and carry out onboard experiments remotely. Second, we develop an instance segmentation architecture for the regression of smooth contours based on a Fourier coefficient representation, which allows the shapes of detected objects to be stored and transmitted efficiently. We evaluate the performance of our method on a variety of low-power computing devices. Finally, we propose a road graph extraction architecture based on a combination of Fully Convolutional Networks and Graph Neural Networks. We show that our method is significantly faster than competing methods while retaining good accuracy.
The recent advances in high-resolution Earth observation satellites and the reduction in revisit times introduced by the creation of constellations of satellites have led to the daily creation of large amounts of image data (hundreds of terabytes per day). Simultaneously, the popularization of Deep Learning techniques allowed the development of architectures capable of extracting semantic content from images. While these algorithms usually require the use of powerful hardware, low-power AI inference accelerators have recently been developed and have the potential to be used in the next generations of satellites, thus opening the possibility of onboard analysis of satellite imagery. By extracting the information of interest from satellite images directly onboard, a substantial reduction in bandwidth, storage and memory usage can be achieved. Current and future applications, such as disaster response, precision agriculture and climate monitoring, would benefit from a lower processing latency and even real-time alerts. In this thesis, our goal is two-fold: on the one hand, we design efficient Deep Learning architectures that are able to run on low-power edge devices, such as satellites or drones, while retaining sufficient accuracy. On the other hand, we design our algorithms while keeping in mind the importance of having a compact output that can be efficiently computed, stored, and transmitted to the ground or to other satellites within a constellation. First, by using depth-wise separable convolutions and convolutional recurrent neural networks, we design efficient semantic segmentation neural networks with a low number of parameters and low memory usage. We apply these architectures to cloud and forest segmentation in satellite images. We also specifically design an architecture for cloud segmentation on the FPGA of OPS-SAT, a satellite launched by ESA in 2019, and perform onboard experiments remotely.
Second, we develop an instance segmentation architecture for the regression of smooth contours based on the Fourier coefficient representation, which allows detected object shapes to be stored and transmitted efficiently. We evaluate the performance of our method on a variety of low-power computing devices. Finally, we propose a road graph extraction architecture based on a combination of fully convolutional and graph neural networks. We show that our method is significantly faster than competing methods while retaining good accuracy.
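The abstract above mentions representing object contours by Fourier coefficients so that shapes can be stored and transmitted compactly. The thesis regresses such coefficients with a network; the sketch below only illustrates the representation itself, assuming a closed contour sampled at evenly spaced points and treated as a complex-valued periodic signal. The function names and the `order` parameter are hypothetical.

```python
import cmath
import math


def fourier_coefficients(points, order):
    """Low-frequency Fourier coefficients c_k, k = -order..order, of a closed contour."""
    n = len(points)
    z = [complex(x, y) for x, y in points]
    return {
        k: sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        for k in range(-order, order + 1)
    }


def reconstruct(coeffs, num_points):
    """Rebuild contour points from the stored coefficients."""
    contour = []
    for t in range(num_points):
        s = t / num_points
        z = sum(c * cmath.exp(2j * math.pi * k * s) for k, c in coeffs.items())
        contour.append((z.real, z.imag))
    return contour


# A circle of radius 2 centred at (1, -1), sampled at 64 points.
circle = [
    (1 + 2 * math.cos(2 * math.pi * t / 64), -1 + 2 * math.sin(2 * math.pi * t / 64))
    for t in range(64)
]
coeffs = fourier_coefficients(circle, order=1)
approx = reconstruct(coeffs, 64)
max_error = max(math.dist(p, q) for p, q in zip(circle, approx))
print(len(coeffs), max_error)
```

In this toy case three complex coefficients stand in for 64 coordinate pairs. Less regular shapes need a higher `order`, which trades compactness for contour fidelity.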
25

Wu, Xinheng. "A Deep Unsupervised Anomaly Detection Model for Automated Tumor Segmentation." Thesis, The University of Sydney, 2020. https://hdl.handle.net/2123/22502.

Abstract:
Much research has investigated automated tumor segmentation for computer-aided diagnosis (CAD) in various medical images, e.g., magnetic resonance (MR), computed tomography (CT) and positron-emission tomography (PET). The recent advances in automated tumor segmentation have been achieved by supervised deep learning (DL) methods trained on large labelled data to cover tumor variations. However, such training data is scarce due to the cost of the labeling process. Thus, with insufficient training data, supervised DL methods have difficulty in generating effective feature representations for tumor segmentation. This thesis aims to develop an unsupervised DL method to exploit the large unlabeled data generated during the clinical process. Our approach rests on the assumption behind unsupervised anomaly detection (UAD): normal data have constrained anatomy and variations, while anomalies, i.e., tumors, usually differ from the normality with high diversity. We demonstrate our method for automated tumor segmentation on two different image modalities. Firstly, given the bilateral symmetry of normal human brains and the asymmetry of brain tumors, we propose a symmetry-driven deep UAD model that uses a GAN to model the normal symmetric variations, thus segmenting tumors by their asymmetry. We evaluated our method on two benchmark datasets. Our results show that our method outperformed the state-of-the-art unsupervised brain tumor segmentation methods and achieved performance competitive with the supervised segmentation methods. Secondly, we propose a multi-modal deep UAD model for PET-CT tumor segmentation. We model a manifold of normal variations shared across normal CT and PET pairs; this manifold represents the normal pairing and can be used to segment the anomalies. We evaluated our method on two PET-CT datasets and the results show that we outperformed the state-of-the-art unsupervised methods, supervised methods and baseline fusion techniques.
26

Gammulle, Pranali Harshala. "Deep learning for human action understanding." Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/135199/1/Pranali_Gammulle_Thesis.pdf.

Abstract:
This thesis addresses the problem of understanding human behaviour in videos in multiple problem settings, including recognition, segmentation, and prediction. Considering the complex nature of human behaviour, we propose to capture both short-term and long-term context in the given videos and propose novel multitask learning-based approaches to solve the action prediction task, as well as an adversarially-trained approach to action recognition. We demonstrate the efficacy of these techniques by applying them to multiple real-world human behaviour understanding settings, including security surveillance, sports action recognition, group activity recognition and recognition of cooking activities.
27

Gujar, Sanket. "Pointwise and Instance Segmentation for 3D Point Cloud." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-theses/1290.

Abstract:
The camera is the cheapest sensor and allows real-time computation for detecting or segmenting the environment of an autonomous vehicle, but it does not provide depth information and is unreliable at night, in bad weather, and during tunnel flash-outs. The risk of an accident is higher for autonomous cars that rely on a camera in such situations. The industry has relied on LiDAR for the past decade to solve this problem and to obtain depth information about the environment, but LiDAR also has its shortcomings. Common industry methods create a projection image and run a detection and localization network on it for inference, but LiDAR sees obscurants in bad weather and is sensitive enough to detect snow, which makes robustness difficult for projection-based methods. We propose a novel pointwise and instance segmentation deep learning architecture for point clouds, focused on self-driving applications. The model depends only on LiDAR data, making it light-invariant and overcoming the shortcomings of the camera in the perception stack. The pipeline takes advantage of both global and local/edge features of points in point clouds to generate high-level features. We also propose Pointer-Capsnet, an extension of CapsNet for small 3D point clouds.
28

Kim, Max. "Improving Knee Cartilage Segmentation using Deep Learning-based Super-Resolution Methods." Thesis, KTH, Medicinteknik och hälsosystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297900.

Abstract:
Segmentation of the knee cartilage is an important step for surgery planning and manufacturing patient-specific prostheses. A promising technology in recent years has been deep learning-based super-resolution methods, which are composed of feed-forward models and have been successfully applied to natural and medical images. This thesis aims to test the feasibility of super-resolving thick-slice 2D sequence acquisitions while achieving sufficient segmentation accuracy of the articular cartilage in the knee. The investigated approaches are single- and multi-contrast super-resolution, where the contrasts are based on the 2D sequence, the 3D sequence, or both. The deep learning models investigated are based on predicting the residual image between the high- and low-resolution image pairs, finding the hidden latent features connecting the image pairs, and approximating the end-to-end non-linear mapping between the low- and high-resolution image pairs. The results showed a slight improvement in segmentation accuracy over the baseline bilinear interpolation for single-contrast super-resolution; however, no notable improvements in segmentation accuracy were observed for the multi-contrast case. Although the multi-contrast approach did not result in any notable improvements, there are still unexplored areas not covered in this work that are promising and could potentially be covered as future work.
Segmentation of the knee cartilage is an important step in surgery planning and in manufacturing patient-specific prostheses. Today, knee cartilage is segmented using MR images acquired with a 3D sequence, which is both time-consuming and motion-sensitive and can be uncomfortable for the patient. Alongside the 3D acquisitions, thick-slice 2D sequences are usually also acquired for diagnostic reasons, but they are not suited to segmentation because the slices are too thick. Recently, deep learning-based super-resolution methods built from so-called feed-forward models have proven very successful when applied to natural and medical images. The purpose of this report is to test how well super-resolved thick-slice 2D sequence acquisitions work for segmentation of the articular cartilage in the knee. The approaches investigated are super-resolution of single- and multi-contrast images, where the contrasts are based on the 2D sequence, the 3D sequence, or both. The results show a small improvement in segmentation accuracy for the single-contrast images over the baseline linear interpolation. However, there was no notable improvement from super-resolution of the multi-contrast images. Although the multi-contrast super-resolution method did not noticeably improve the segmentation results, there are still unexplored areas not addressed in this work that could potentially be explored in future work.
29

Kamann, Christoph [Verfasser], and Carsten [Akademischer Betreuer] Rother. "Robust Semantic Segmentation with Deep Learning / Christoph Kamann ; Betreuer: Carsten Rother." Heidelberg : Universitätsbibliothek Heidelberg, 2021. http://d-nb.info/123647483X/34.

30

Dickens, James. "Depth-Aware Deep Learning Networks for Object Detection and Image Segmentation." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42619.

Abstract:
The rise of convolutional neural networks (CNNs) in the context of computer vision has occurred in tandem with the advancement of depth sensing technology. Depth cameras are capable of yielding two-dimensional arrays storing at each pixel the distance from objects and surfaces in a scene from a given sensor, aligned with a regular color image, obtaining so-called RGBD images. Inspired by prior models in the literature, this work develops a suite of RGBD CNN models to tackle the challenging tasks of object detection, instance segmentation, and semantic segmentation. Prominent architectures for object detection and image segmentation are modified to incorporate dual backbone approaches inputting RGB and depth images, combining features from both modalities through the use of novel fusion modules. For each task, the models developed are competitive with state-of-the-art RGBD architectures. In particular, the proposed RGBD object detection approach achieves 53.5% mAP on the SUN RGBD 19-class object detection benchmark, while the proposed RGBD semantic segmentation architecture yields 69.4% accuracy with respect to the SUN RGBD 37-class semantic segmentation benchmark. An original 13-class RGBD instance segmentation benchmark is introduced for the SUN RGBD dataset, for which the proposed model achieves 38.4% mAP. Additionally, an original depth-aware panoptic segmentation model is developed, trained, and tested for new benchmarks conceived for the NYUDv2 and SUN RGBD datasets. These benchmarks offer researchers a baseline for the task of RGBD panoptic segmentation on these datasets, where the novel depth-aware model outperforms a comparable RGB counterpart.
31

Shah, Abhay. "Multiple surface segmentation using novel deep learning and graph based methods." Diss., University of Iowa, 2017. https://ir.uiowa.edu/etd/5630.

Abstract:
The task of automatically segmenting 3-D surfaces representing object boundaries is important in quantitative analysis of volumetric images, which plays a vital role in numerous biomedical applications. For the diagnosis and management of disease, segmentation of images of organs and tissues is a crucial step for the quantification of medical images. Segmentation finds the boundaries or, limited to the 3-D case, the surfaces, that separate regions, tissues or areas of an image, and it is essential that these boundaries approximate the true boundary, typically by human experts, as closely as possible. Recently, graph-based methods with a global optimization property have been studied and used for various applications. Sepecifically, the state-of-the-art graph search (optimal surface segmentation) method has been successfully used for various such biomedical applications. Despite their widespread use for image segmentation, real world medical image segmentation problems often pose difficult challenges, wherein graph based segmentation methods in its purest form may not be able to perform the segmentation task successfully. This doctoral work has a twofold objective. 1)To identify medical image segmentation problems which are difficult to solve using existing graph based method and develop novel methods by employing graph search as a building block to improve segmentation accuracy and efficiency. 2) To develop a novel multiple surface segmentation strategy using deep learning which is more computationally efficient and generic than the exisiting graph based methods, while eliminating the need for human expert intervention as required in the current surface segmentation methods. This developed method is possibly the first of its kind where the method does not require and human expert designed operations. 
To accomplish the objectives of this thesis work, a comprehensive framework of graph-based and deep learning methods is proposed, organised around the following three aims. First, an efficient, automated and accurate graph-based method is developed to segment surfaces which have steep changes in surface profile and abrupt distance changes between two adjacent surfaces. The developed method is applied and validated on intra-retinal layer segmentation of Spectral Domain Optical Coherence Tomography (SD-OCT) images of eyes with Glaucoma, Age-Related Macular Degeneration and Pigment Epithelium Detachment. Second, a globally optimal graph-based method is developed to attain subvoxel and super-resolution accuracy for the multiple surface segmentation problem while imposing convex constraints. The developed method was applied to layer segmentation of SD-OCT images of normal eyes and to vessel walls in Intravascular Ultrasound (IVUS) images. Third, a deep learning based multiple surface segmentation method is developed which is more generic and computationally efficient, and which eliminates the human expert interventions (such as transformation design, feature extraction, parameter tuning and constraint modelling) required in varying capacities by existing surface segmentation methods. The developed method was applied to SD-OCT images of normal and diseased eyes to validate the superior segmentation performance, computational efficiency and generic nature of the framework, compared to the state-of-the-art graph search method.
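As a rough illustration of what a hard smoothness constraint on a surface means, the sketch below solves a much simpler 2-D, single-surface analogue with dynamic programming. It is not the graph search method of the thesis; the function name `segment_surface`, the cost layout and the `max_shift` parameter are assumptions made for this example.

```python
def segment_surface(cost, max_shift=1):
    """Find one row index per column that minimises the total cost.

    cost[r][c] is the (hypothetical) cost of placing the surface at row r
    in column c. Neighbouring columns may differ by at most max_shift rows.
    """
    rows, cols = len(cost), len(cost[0])
    best = [[0.0] * cols for _ in range(rows)]  # best[r][c]: cheapest path ending at (r, c)
    back = [[0] * cols for _ in range(rows)]    # back[r][c]: row chosen in column c - 1

    for r in range(rows):
        best[r][0] = cost[r][0]

    for c in range(1, cols):
        for r in range(rows):
            lo, hi = max(0, r - max_shift), min(rows - 1, r + max_shift)
            prev = min(range(lo, hi + 1), key=lambda p: best[p][c - 1])
            best[r][c] = cost[r][c] + best[prev][c - 1]
            back[r][c] = prev

    # Trace the cheapest path back from the last column.
    r = min(range(rows), key=lambda p: best[p][cols - 1])
    surface = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        surface.append(r)
    surface.reverse()
    return surface
```

With `max_shift=0` the surface is forced to stay on one row, while larger values let it follow steeper profiles at the price of less regularisation.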
32

BINDER, THOMAS. "Gland Segmentation with Convolutional Neural Networks : Validity of Stroma Segmentation as a General Approach." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-246134.

Abstract:
The analysis of glandular morphology within histopathology images is a crucial step in determining the stage of cancer. Manual annotation is a very laborious task. It is time consuming and suffers from the subjectivity of the specialists who label the glands. One of the aims of computational pathology is developing tools to automate gland segmentation. Such an algorithm would improve the efficiency of cancer diagnosis. This is a complex task, as there is large variability in glandular morphologies and staining techniques. So far, specialised models focusing on only one organ have given promising results. This work investigated the idea of a cross-domain approximation. Unlike parenchymae, the stroma tissue that lies between the glands is similar throughout all organs in the body. Creating a model able to precisely segment the stroma would pave the way for a cross-organ model. It would be able to segment the tissue and therefore give access to gland morphologies of different organs. To address this issue, we investigated different new and existing architectures, such as MILD-net, which is currently the best performing algorithm of the GlaS challenge. New architectures were created based on the promising U-shaped network, as well as on Xception and ResNet for feature extraction. These networks were trained on colon histopathology images, focusing on glands and on the stroma. The comparison of the different results showed that this initial cross-domain approximation goes in the right direction and encourages further development.
33

Janurberg, Norman, and Christian Luksitch. "Exploring Deep Learning Frameworks for Multiclass Segmentation of 4D Cardiac Computed Tomography." Thesis, Linköpings universitet, Institutionen för hälsa, medicin och vård, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-178648.

Abstract:
By combining computed tomography data with computational fluid dynamics, the cardiac hemodynamics of a patient can be assessed for diagnosis and treatment of cardiac disease. The advantage of computed tomography over other medical imaging modalities is its capability of producing detailed high-resolution images containing geometric measurements relevant to the simulation of cardiac blood flow. To extract these geometries from computed tomography data, segmentation of 4D cardiac computed tomography (CT) data has been performed using two deep learning frameworks that combine methods which have previously shown success in other research. The aim of this thesis work was to develop and evaluate a deep learning based technique to segment the left ventricle, ascending aorta, left atrium, left atrial appendage and the proximal pulmonary vein inlets. Two frameworks have been studied. Both utilise a 2D multi-axis implementation to segment a single CT volume by examining it in three perpendicular planes, while one of them also employs a 3D binary model to extract and crop the foreground from the surrounding background. Both frameworks determine a segmentation prediction by reconstructing three volumes after 2D segmentation in each plane and combining their probabilities in an ensemble for a 3D output. The results of both frameworks show similarities in their performance and ability to properly segment 3D CT data. While the framework that examines 2D slices of full-size volumes produces an overall higher Dice score, it is less successful than the cropping framework at segmenting the smaller left atrial appendage. Since the full-size 2D slices also contain background information in each slice, it is believed that this is the main reason for the better overall segmentation performance. The cropping framework, by contrast, provides a higher proportion of each foreground label, making it easier for the model to identify smaller structures.
Both frameworks show success for use in 3D cardiac CT segmentation, and with further research and tuning of each network, even better results can be achieved.
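The multi-axis ensemble described in this abstract can be sketched as follows. The abstract does not state how the three probability volumes are combined, so plain averaging followed by an argmax is an assumption here, as are the function name `ensemble_labels` and the flattened per-voxel input layout.

```python
def ensemble_labels(prob_axial, prob_coronal, prob_sagittal):
    """Combine per-voxel class probabilities from three viewing planes.

    Each argument is a list with one entry per voxel (same voxel order in
    all three), and each entry is a list of class probabilities.
    Averaging is an assumed combination rule, not taken from the thesis.
    """
    labels = []
    for pa, pc, ps in zip(prob_axial, prob_coronal, prob_sagittal):
        mean = [(a + c + s) / 3.0 for a, c, s in zip(pa, pc, ps)]
        # Pick the class with the highest averaged probability.
        labels.append(max(range(len(mean)), key=mean.__getitem__))
    return labels


# Two voxels, three classes; the axial view alone would pick classes 0 and 1.
axial = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
coronal = [[0.2, 0.7, 0.1], [0.1, 0.2, 0.7]]
sagittal = [[0.3, 0.6, 0.1], [0.2, 0.2, 0.6]]
print(ensemble_labels(axial, coronal, sagittal))
```

In the toy input the other two planes outvote the axial prediction for both voxels, which is the behaviour an ensemble over perpendicular planes is meant to provide.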
34

MATRONE, FRANCESCA. "Deep Semantic Segmentation of Built Heritage Point Clouds." Doctoral thesis, Politecnico di Torino, 2021. http://hdl.handle.net/11583/2924998.

35

Estgren, Martin. "Bone Fragment Segmentation Using Deep Interactive Object Selection." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157668.

Abstract:
In recent years, semantic segmentation models utilizing Convolutional Neural Networks (CNNs) have seen significant success for multiple different segmentation problems. Models such as U-Net have produced promising results within the medical field for both regular 2D and volumetric imaging, rivalling some of the best classical segmentation methods. In this thesis we examined the possibility of using a convolutional neural network-based model to perform segmentation of discrete bone fragments in CT volumes with segmentation hints provided by a user. We additionally examined different classical segmentation methods used in a post-processing refinement stage and their effect on the segmentation quality. We compared the performance of our model to similar approaches and provided insight into how the interactive aspect of the model affected the quality of the result. We found that the combined approach of interactive segmentation and deep learning produced results on par with some of the best methods presented, provided there was an adequate amount of annotated training data. We additionally found that the number of segmentation hints provided to the model by the user significantly affected the quality of the result, with the result converging at around 8 provided hints.
36

Zhewei, Wang. "Fully Convolutional Networks (FCNs) for Medical Image Segmentation." Ohio University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1605199701509179.

37

Sarkaar, Ajit Bhikamsingh. "Addressing Occlusion in Panoptic Segmentation." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/101988.

Abstract:
Visual recognition tasks have witnessed vast improvements in performance since the advent of deep learning. Despite the gains in performance, image understanding algorithms are still not completely robust to partial occlusion. In this work, we propose a novel object classification method based on compositional modeling and explore its effect in the context of the newly introduced panoptic segmentation task. The panoptic segmentation task combines both semantic and instance segmentation to perform labelling of the entire image. The novel classification method replaces the object detection pipeline in UPSNet, a Mask R-CNN based design for panoptic segmentation. We also discuss an issue with the segmentation mask prediction of Mask R-CNN that affects overlapping instances. We perform extensive experiments and showcase results on the complex COCO and Cityscapes datasets. The novel classification method shows promising results for object classification on occluded instances in complex scenes.
Visual recognition tasks have witnessed vast improvements in performance since the advent of deep learning. Despite making significant improvements, algorithms for these tasks still do not perform well at recognizing partially visible objects in the scene. In this work, we propose a novel object classification method that uses compositional models to perform part based detection. The method first looks at individual parts of an object in the scene and then makes a decision about its identity. We test the proposed method in the context of the recently introduced panoptic segmentation task. The panoptic segmentation task combines both semantic and instance segmentation to perform labelling of the entire image. The novel classification method replaces the object detection module in UPSNet, a Mask R-CNN based algorithm for panoptic segmentation. We also discuss an issue with the segmentation mask prediction of Mask R-CNN that affects overlapping instances. After performing extensive experiments and evaluation, it can be seen that the novel classification method shows promising results for object classification on occluded instances in complex scenes.
38

Suzani, Amin. "Automatic vertebrae localization, identification, and segmentation using deep learning and statistical models." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/50722.

Abstract:
Automatic localization and identification of vertebrae in medical images of the spine are core requirements for building computer-aided systems for spine diagnosis. Automated algorithms for segmentation of vertebral structures can also benefit these systems for diagnosis of a range of spine pathologies. The fundamental challenges associated with the above-stated tasks arise from the repetitive nature of vertebral structures, restrictions in field of view, presence of spine pathologies or surgical implants, and poor contrast of the target structures in some imaging modalities. This thesis presents an automatic method for localization, identification, and segmentation of vertebrae in volumetric computed tomography (CT) scans and magnetic resonance (MR) images of the spine. The method makes no assumptions about which section of the vertebral column is visible in the image. An efficient deep learning approach is used to predict the location of each vertebra based on its contextual information in the image. Then, a statistical multi-vertebrae model is initialized by the localized vertebrae from the previous step. An iterative expectation maximization technique is used to register the statistical multi-vertebrae model to the edge points of the image in order to achieve a fast and reliable segmentation of vertebral bodies. State-of-the-art results are obtained for vertebrae localization in a public dataset of 224 arbitrary-field-of-view CT scans of pathological cases. Promising results are also obtained from quantitative evaluation of the automated segmentation method on volumetric MR images of the spine.
Applied Science, Faculty of
Electrical and Computer Engineering, Department of
Graduate
39

Agerskov, Niels. "Adaptable Semi-Automated 3D Segmentation Using Deep Learning with Spatial Slice Propagation." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-241542.

Abstract:
Even with the recent advances of deep learning pushing the field of medical image analysis further than ever before, progress is still slow due to the limited availability of annotated data. There are multiple reasons for this, but perhaps the most prominent one is the amount of time that manual annotation of medical images takes. In this project a semi-automated algorithm is proposed, approaching the segmentation problem in a slice-by-slice manner and utilising the prediction for a previous slice as a prior for the next. This both allows the algorithm to segment entirely new cases and gives the user the ability to correct faulty slices, propagating the correction throughout. Results on par with the current state of the art are achieved within the domain of the training data. In addition to this, cases outside of the training domain can also be segmented with some accuracy, paving the way for further improvement. The strategy for training the network to utilise the auxiliary input lies in heavy online data augmentation, forcing the network to rely on the provided prior.
Although advances in deep learning are pushing medical image analysis forward faster than ever, one major problem remains: the amount of annotated image data. This is partly because medical image data takes a very long time to annotate manually. In this project, a semi-automatic algorithm has been developed that approaches 3D segmentation from a 2D perspective. An image volume is segmented by manually annotating an initialisation image, which is then used as an aid for annotating neighbouring images in the volume. This is repeated for the remaining images, but instead of a manual annotation, the network's previous segmentation is used as the aid. This allows the algorithm to generalise to entirely new cases that are not represented in the training data, and also makes it possible to correct wrongly segmented images afterwards. Corrections are then propagated through the volume, since each segmentation is used as an aid for the next image. The results are on par with corresponding fully automatic algorithms within the training domain. The greatest advantage over these is the ability to segment entirely new cases. The method used to train the network to rely on the auxiliary images is based on heavy distortion of the image to be segmented. This forces the network to make use of the information in the segmentation of the previous image.
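The slice propagation idea can be sketched as a simple loop in which each slice is segmented with the mask of its already segmented neighbour as a prior. The function names `propagate` and `toy_segment_slice` are hypothetical, the toy rule merely stands in for the trained network, and propagating in both directions from the starting slice is an assumption.

```python
def propagate(volume, start_index, start_mask, segment_slice):
    """Segment a volume slice by slice, starting from one annotated slice.

    segment_slice(image, prior_mask) stands in for the trained network;
    it receives the current slice and the mask of the neighbouring slice.
    """
    masks = [None] * len(volume)
    masks[start_index] = start_mask
    # Walk away from the annotated slice in both directions.
    for i in range(start_index + 1, len(volume)):
        masks[i] = segment_slice(volume[i], masks[i - 1])
    for i in range(start_index - 1, -1, -1):
        masks[i] = segment_slice(volume[i], masks[i + 1])
    return masks


def toy_segment_slice(image, prior):
    """Toy stand-in: keep bright pixels that lie next to the prior mask."""
    out = []
    for j, value in enumerate(image):
        neighbours = range(max(0, j - 1), min(len(prior), j + 2))
        near_prior = any(prior[k] for k in neighbours)
        out.append(1 if value > 0.5 and near_prior else 0)
    return out
```

A user correction fits the same loop: replace the faulty mask and call `propagate` again with that slice as `start_index`, so the corrected mask becomes the prior for its neighbours.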
40

Bodin, Emanuel. "Furniture swap : Segmentation and 3D rotation of natural images using deep learning." Thesis, Uppsala universitet, Signaler och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-435503.

Abstract:
Learning to perceive scenes and objects from 2D images as 3D models is a trivial task for a human but very challenging for a computer. Being able to retrieve a 3D model from a scene just by taking a picture of it can be of great use in many fields, for example when making 3D blueprints for buildings or working with animations in the game or film industry. Novel view synthesis is a field within deep learning where generative models are trained to construct 3D models of scenes or objects from 2D images. In this work, the generative model HoloGAN is combined with a U-net segmentation network. The solution is able to, given an image containing a single object as input, swap that object for another one and then perform a rotation of the scene, generating new images from unobserved viewpoints. The segmentation network is trained with paired segmentation masks, while HoloGAN is able to learn 3D metrics of scenes from unlabeled 2D images in an unsupervised manner. The system as a whole is trained on one dataset containing images of cars, while the performance of HoloGAN was evaluated on four additional datasets. The chosen method proved to be successful but came with some drawbacks, such as requiring large dataset sizes and being computationally expensive to train.
41

ASLANI, SHAHAB. "Deep learning approaches for segmentation of multiple sclerosis lesions on brain MRI." Doctoral thesis, Università degli studi di Genova, 2020. http://hdl.handle.net/11567/997626.

Abstract:
Multiple Sclerosis (MS) is a demyelinating disease of the central nervous system which causes lesions in brain tissues, especially visible in white matter with magnetic resonance imaging (MRI). The diagnosis of MS lesions, which is often performed visually with MRI, is an important task, as it can help characterize the progression of the disease and monitor the efficacy of a candidate treatment. Automatic detection and segmentation of MS lesions from MRI images offer the potential for faster and more cost-effective performance which could also be immune to expert bias. In this thesis, we study automated approaches to segment MS lesions from MRI images. The thesis begins with a review of the existing literature on MS lesion segmentation and discusses its general limitations. We then propose three novel approaches that rely on Convolutional Neural Networks (CNNs) to segment MS lesions. The first approach demonstrates that the parameters of a CNN learned from natural images transfer well to the task of MS lesion segmentation. In the second approach, we describe a novel multi-branch CNN architecture with end-to-end training that can take advantage of each MRI modality individually. In that work, we also investigated which combination of MRI modalities leads to the best segmentation performance. In the third approach, we show an effective and novel generalization method for MS lesion segmentation when data are collected from multiple MRI scanning sites and suffer from (site-)domain shifts. Finally, this thesis concludes with open questions that may benefit from future work. This thesis demonstrates the potential role of CNNs as a common methodological building block to address clinical problems in MS segmentation.
42

Wang, Zhewei. "Laplacian Pyramid FCN for Robust Follicle Segmentation." Ohio University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1565620740447982.

43

Mali, Shruti Atul. "Multi-Modal Learning for Abdominal Organ Segmentation." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285866.

Abstract:
Deep learning techniques are widely used across various medical imaging applications. However, they are often fine-tuned for a specific modality and do not generalize to new modalities or datasets. One of the main reasons for this is large data variation; e.g., the dynamic range of intensity values varies widely across multi-modal images. The goal of the project is to develop a method for multi-modal learning that segments the liver from Computed Tomography (CT) images and abdominal organs from Magnetic Resonance (MR) images using deep learning techniques. In this project, a self-supervised approach is adapted to attain domain adaptation across images while retaining important 3D information from medical images, using a simple 3D-UNet with a few auxiliary tasks. The method comprises two main steps: representation learning via self-supervised learning (pre-training) and fully supervised learning (fine-tuning). Pre-training is done using a 3D-UNet as a base model along with some auxiliary data augmentation tasks to learn representations through texture, geometry and appearance. The second step is fine-tuning the same network, without the auxiliary tasks, to perform the segmentation tasks on CT and MR images. Annotations of all organs are not available in both modalities. Thus the first step is used to learn a general representation from both image modalities, while the second step fine-tunes the representations to the available annotations of each modality. Results obtained for each modality were submitted online, and one of the evaluations obtained was in the form of the Dice score. The results showed that the highest Dice score was 0.966 for CT liver prediction and 0.7 for MRI abdominal segmentation. This project shows the potential to achieve the desired results by combining self-supervised and fully supervised approaches.
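For readers unfamiliar with the metric reported above, the Dice score of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions of this example, and the evaluation code used in the thesis is not shown in the abstract.

```python
def dice_score(pred, target):
    """Dice overlap between two flattened binary masks of equal length."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Three foreground voxels in each mask, two of them shared: 2 * 2 / 6.
print(dice_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground voxel.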
44

Havaei, Seyed Mohammad. "Machine learning methods for brain tumor segmentation." Thèse, Université de Sherbrooke, 2017. http://hdl.handle.net/11143/10260.

Abstract:
Malignant brain tumors are the second leading cause of cancer-related deaths in children under 20. There are nearly 700,000 people in the U.S. living with a brain tumor, and 17,000 people are likely to lose their lives to primary malignant and central nervous system brain tumors every year. To identify in a non-invasive way whether a patient has a brain tumor, an MRI scan of the brain is acquired, followed by a manual examination of the scan by an expert who looks for lesions (i.e. clusters of cells which deviate from healthy tissue). For treatment purposes, the tumor and its sub-regions are outlined in a procedure known as brain tumor segmentation. Although brain tumor segmentation is primarily done manually, it is very time consuming and the segmentation is subject to variations both between observers and within the same observer. To address these issues, a number of automatic and semi-automatic methods have been proposed over the years to help physicians in the decision-making process. Methods based on machine learning have been subjects of great interest in brain tumor segmentation. With the advent of deep learning methods and their success in many computer vision applications such as image classification, these methods have also started to gain popularity in medical image analysis. In this thesis, we explore different machine learning and deep learning methods applied to brain tumor segmentation.
Summary: Malignant brain tumors are the second leading cause of death among children under 20. Nearly 700,000 people in the United States live with a brain tumor, and each year 17,000 people are at risk of losing their lives to a primary malignant tumor of the central nervous system. To determine non-invasively whether a patient has a brain tumor, an MRI image of the brain is acquired and analysed by hand by an expert to find lesions (i.e. a cluster of cells that differs from healthy tissue). A tumor and its regions must be detected by means of segmentation to support its treatment. Brain tumor segmentation is mainly done by hand; it is a procedure that takes a great deal of time, and intra- and inter-expert variation for the same case is large. To address these problems, many automatic and semi-automatic methods have been proposed in recent years to help practitioners make decisions. Methods based on machine learning have attracted strong interest in the field of brain tumor segmentation. The advent of deep learning methods and their success in many applications such as image classification has helped bring deep learning to the fore in medical image analysis. In this thesis, we explore various machine learning and deep learning methods applied to brain tumor segmentation.
45

Kushibar, Kaisar. "Automatic segmentation of brain structures in magnetic resonance images using deep learning techniques." Doctoral thesis, Universitat de Girona, 2020. http://hdl.handle.net/10803/670766.

Abstract:
This PhD thesis focuses on the development of deep learning based methods for accurate segmentation of the sub-cortical brain structures from MRI. First, we proposed a 2.5D CNN architecture that combines convolutional and spatial features. Second, we proposed a supervised domain adaptation technique to improve the robustness and consistency of the deep learning model. Third, an unsupervised domain adaptation method was proposed to eliminate the requirement of manual intervention to train a deep learning model that is robust to differences in the MRI images from multi-centre and multi-scanner datasets. The experimental results for all the proposals demonstrated the effectiveness of our approaches in accurately segmenting the sub-cortical brain structures and showed state-of-the-art performance on well-known publicly available datasets.
This doctoral thesis focuses on the development of deep learning based methods for the accurate segmentation of sub-cortical brain structures from magnetic resonance imaging. First, we proposed a 2.5D CNN architecture that combines convolutional and spatial features. Second, we proposed a supervised domain adaptation technique to improve the robustness and consistency of the deep learning model. Third, we proposed an unsupervised domain adaptation method to eliminate the need for manual intervention when training a deep learning model that is robust to differences in the magnetic resonance images of multi-centre, multi-scanner datasets. The experimental results for all the proposals demonstrated the effectiveness of our approaches in accurately segmenting the sub-cortical brain structures and showed state-of-the-art performance on well-known publicly available datasets.
46

Serra, Sabina. "Deep Learning for Semantic Segmentation of 3D Point Clouds from an Airborne LiDAR." Thesis, Linköpings universitet, Datorseende, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-168367.

Abstract:
Light Detection and Ranging (LiDAR) sensors have many different application areas, from revealing archaeological structures to aiding navigation of vehicles. However, it is challenging to interpret and fully use the vast amount of unstructured data that LiDARs collect. Automatic classification of LiDAR data would ease its utilization, whether for examining structures or aiding vehicles. In recent years, there have been many advances in deep learning for semantic segmentation of automotive LiDAR data, but there is less research on aerial LiDAR data. This thesis investigates the current state-of-the-art deep learning architectures and how well they perform on LiDAR data acquired by an Unmanned Aerial Vehicle (UAV). It also investigates different training techniques for class-imbalanced and limited datasets, which are common challenges for semantic segmentation networks. Lastly, this thesis investigates whether pre-training can improve the performance of the models. The LiDAR scans were first projected to range images and then a fully convolutional semantic segmentation network was used. Three different training techniques were evaluated: weighted sampling, data augmentation, and grouping of classes. Weighted sampling brought no improvement, nor did grouping of classes have a substantial effect on the performance. Pre-training on the large public dataset SemanticKITTI resulted in a small performance improvement, but data augmentation seemed to have the largest positive impact. The mIoU of the best model, which was trained with data augmentation, was 63.7%, and it performed very well on the classes Ground, Vegetation, and Vehicle. The other classes in the UAV dataset, Person and Structure, had very little data and were challenging for most models to classify correctly. In general, the models trained on UAV data performed similarly to the state-of-the-art models trained on automotive data.
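The mIoU figure quoted above is the mean over classes of the intersection-over-union between predicted and reference labels. A minimal sketch follows; the function name `mean_iou` and the choice to skip classes that appear in neither prediction nor reference are assumptions of this example, and the thesis may handle absent classes differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for flattened integer label lists."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        # Assumed convention: classes absent from both lists are skipped.
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5.
print(mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3))
```

Because every class contributes equally to the mean, rare classes such as Person and Structure can pull the score down even when the frequent classes are segmented well.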
47

Rönnberg, Axel. "Semi-Supervised Deep Learning using Consistency-Based Methods for Segmentation of Medical Images." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279579.

Abstract:
In radiation therapy, a form of cancer treatment, accurately locating the anatomical structures is required in order to limit the impact on healthy cells. The automatic task of delineating these structures and organs is called segmentation, where each pixel in an image is classified and assigned a label. Recently, deep neural networks have proven to be efficient at automatic medical segmentation. However, deep learning requires large amounts of training data. This is a restricting feature, especially in the medical field, due to factors such as patient confidentiality. Nonetheless, the main challenge is not the image data itself but the lack of high-quality annotations. It is thus interesting to investigate methods for semi-supervised learning, where only a subset of the images requires annotations. This raises the question of whether these methods can be acceptable for organ segmentation, and whether they will result in increased performance in comparison to supervised models. A category of semi-supervised methods applies the strategy of encouraging consistency between predictions. Consistency Training and Mean Teacher are two methods in which the network weights are updated in order to minimize the impact of input perturbations such as data augmentations. In addition, the Mean Teacher method trains two models, a Teacher and a Student. The Teacher is updated as an average of consecutive Student models, using Temporal Ensembling. To resolve the question of whether semi-supervised learning could be beneficial, the two mentioned techniques are investigated. They are used in training deep neural networks with a U-net architecture to segment the bladder and anorectum in 3D CT images. The results showed signs of promise for Consistency Training and Mean Teacher, with nearly all model configurations having improved segmentations. Results also showed that the methods caused a reduction in performance variance, primarily by limiting poor delineations.
With these results in hand, the use of semi-supervised learning should definitely be considered. However, since the segmentation improvement was not repeated in all experiment configurations, more research needs to be done.
In radiotherapy, a form of cancer treatment, precise localization of anatomical structures is necessary to limit the impact on healthy cells. The automated task of delineating these structures and organs is called segmentation, in which every pixel in an image is classified and assigned a label. Recently, deep neural networks have proven effective for automatic medical segmentation. However, deep learning requires large amounts of training data. This is a limiting factor, especially in the medical field, owing to factors such as patient confidentiality. Nevertheless, the main challenge is not the image data itself but the lack of high-quality annotations. It is therefore interesting to investigate methods for semi-supervised learning, where only a subset of the images needs annotations. This raises the question of whether these methods can be clinically acceptable for organ segmentation, and whether they yield improved performance compared with supervised models. One category of semi-supervised methods applies the strategy of encouraging consistency between predictions. Consistency Training and Mean Teacher are two methods in which the network weights are updated so that the effect of input perturbations, such as data augmentations, is minimized. In addition, Mean Teacher trains two models, a Teacher and a Student. The Teacher is updated as an average of consecutive Student models, using Temporal Ensembling. To address the question of whether semi-supervised learning can be beneficial, the two methods mentioned are investigated. They are used to train deep neural networks with a U-net architecture to segment the bladder and anorectum in 3D CT images. The results showed signs of potential for Consistency Training and Mean Teacher, with improved segmentation for almost all model configurations. The results also showed that the methods reduced the variance in performance, mainly by limiting poor segmentations.
Given these results, the use of semi-supervised learning should be considered. However, more research needs to be carried out, since the segmentation improvement was not repeated in all experiments.
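The Teacher/Student relationship described in this abstract can be illustrated with a minimal sketch. It assumes the averaging is an exponential moving average of the Student weights, as is usual for Mean Teacher, and it uses a mean squared difference as the consistency term. The names `ema_update`, `consistency_loss`, and the value of `alpha` are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List

# Hypothetical weight container: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def ema_update(teacher: Weights, student: Weights, alpha: float = 0.99) -> Weights:
    """Return new Teacher weights: alpha * teacher + (1 - alpha) * student."""
    return {
        name: [alpha * t + (1.0 - alpha) * s
               for t, s in zip(teacher[name], student[name])]
        for name in teacher
    }


def consistency_loss(pred_a: List[float], pred_b: List[float]) -> float:
    """Mean squared difference between two predictions for the same input.

    Assumption: the thesis may use a different consistency measure.
    """
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Toy example: one parameter vector, one averaging step.
teacher = {"w": [0.0, 0.0]}
student = {"w": [1.0, 2.0]}
teacher = ema_update(teacher, student, alpha=0.9)

# Predictions for the same unlabeled input under two different perturbations.
loss = consistency_loss([0.2, 0.8], [0.4, 0.6])
```

In a training loop of this kind, the Student would be updated by gradient descent on the supervised loss plus the consistency term, and `ema_update` would be called after each Student step.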
48

Rahman, Md Atiqur. "Application specific performance measure optimization using deep learning." IEEE, 2016. http://hdl.handle.net/1993/31812.

Abstract:
In this thesis, we address the action retrieval and object category segmentation problems by directly optimizing application-specific performance measures using deep learning. Most deep learning methods are designed to optimize simple loss functions (e.g., cross-entropy or Hamming loss). These loss functions are suitable for applications whose performance is measured by overall accuracy. For many applications, however, overall accuracy is not an appropriate performance measure. For example, applications like action retrieval often use the area under the Receiver Operating Characteristic (ROC) curve to measure the performance of a retrieval algorithm. Likewise, in object category segmentation from images, the intersection-over-union (IoU) is the standard performance measure. In this thesis, we propose approaches to directly optimize these complex performance measures in a deep learning framework.
October 2016
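To make the contrast between overall accuracy and IoU in the abstract above concrete, here is a minimal sketch for binary masks stored as flat lists of 0/1. The function names are hypothetical, and returning 1.0 for two empty masks is an assumed convention. The sketch only evaluates the measures and does not reproduce the thesis's method for optimizing them.

```python
from typing import Sequence


def pixel_accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of pixels whose predicted label matches the target."""
    return sum(1 for p, t in zip(pred, target) if p == t) / len(target)


def iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """Intersection-over-union of the foreground (label 1) regions."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


# A 10-pixel image with a single foreground pixel.
target = [1] + [0] * 9
# A degenerate prediction that labels every pixel as background.
all_background = [0] * 10

acc = pixel_accuracy(all_background, target)
score = iou(all_background, target)
```

Here the degenerate prediction is correct on nine of ten pixels yet has no overlap with the object, which is why accuracy-oriented losses can be a poor proxy for segmentation quality.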
49

Akbar, Muhammad Usman. "Deep Learning Approaches Targeting Radiological Images." Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1069387.

Abstract:
Artificial Intelligence (AI) algorithms have remarkably improved their performance in recent years in various domains, thanks to the introduction of deep learning approaches. Indeed, they have shown tremendous potential in tasks involving image analysis. A drawback of deep learning is its requirement for huge datasets; nonetheless, DL approaches have proved helpful in the domain of medical imaging as well. Automated segmentation and classification in different biomedical tasks have proven to be faster and more cost-effective. In this thesis we study deep learning approaches used for segmentation and classification of different radiological images, mainly CT scans, MRI scans and CXR images. In particular, we explore issues such as multi-modality and the small dataset problem. We first discuss how small datasets can be exploited to improve the performance of the deep model in the proposed architectures. In the next work we train the model with multi-modal data consisting of both CT and MRI images together, and treat the corresponding opposite modality of CT and MRI as a missing data problem. We use Cycle-GAN to generate synthetic data for the missing data and further train the model with original and synthetic data together. We then focus on the classification of COVID, exploiting the multi-modality data available. We propose an architecture that is capable of handling multi-modal data, extracting feature representations from the available modalities before concatenation and then using them for final classification. We then exploit joint learning to train a model on a small dataset from scratch. Finally, this thesis concludes with open questions that may benefit from future work. This thesis demonstrates the potential role of CNNs in addressing the tasks of segmentation and classification.
50

Selagamsetty, Srinivasa Siddhartha. "Exploring a Methodology for Segmenting Biomedical Images using Deep Learning." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1573812579683504.
