Contents

Journal articles
Theses
Book chapters
Conference proceedings

Academic literature on the topic "DEEP LEARNING, GENERATION, SEMANTIC SEGMENTATION, MACHINE LEARNING"

Author: Grafiati

Published: March 10, 2023

Browse the topical lists of articles, books, theses, conference proceedings, and other scholarly sources on the topic "DEEP LEARNING, GENERATION, SEMANTIC SEGMENTATION, MACHINE LEARNING".

Journal articles on the topic "DEEP LEARNING, GENERATION, SEMANTIC SEGMENTATION, MACHINE LEARNING"

Murtiyoso, A., F. Matrone, M. Martini, A. Lingua, P. Grussenmeyer, and R. Pierdicca. "AUTOMATIC TRAINING DATA GENERATION IN DEEP LEARNING-AIDED SEMANTIC SEGMENTATION OF HERITAGE BUILDINGS". ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences V-2-2022 (May 17, 2022): 317–24. http://dx.doi.org/10.5194/isprs-annals-v-2-2022-317-2022.

Abstract

In the geomatics domain the use of deep learning, a subset of machine learning, is becoming more and more widespread. In this context, the 3D semantic segmentation of heritage point clouds presents an interesting and promising approach for modelling automation, in light of the heterogeneous nature of historical building styles and features. However, this heterogeneity also presents an obstacle in terms of generating the training data for use in deep learning, hitherto performed largely manually. The current generally low availability of labelled data also presents a motivation to aid the process of training data generation. In this paper, we propose the use of approaches based on geometric rules to automate this task to a certain degree. One object class will be discussed in this paper, namely the pillars class. Results show that the approach managed to extract pillars with satisfactory quality (98.5% of pillars correctly detected with the proposed algorithm). Tests were also performed to use the outputs in a deep learning segmentation setting, with a favourable outcome in terms of reducing the overall labelling time (−66.5%). Certain particularities were nevertheless observed, which also influence the result of the deep learning segmentation.

Oluwasammi, Ariyo, Muhammad Umar Aftab, Zhiguang Qin, Son Tung Ngo, Thang Van Doan, Son Ba Nguyen, Son Hoang Nguyen, and Giang Hoang Nguyen. "Features to Text: A Comprehensive Survey of Deep Learning on Semantic Segmentation and Image Captioning". Complexity 2021 (March 18, 2021): 1–19. http://dx.doi.org/10.1155/2021/5538927.

Abstract

With the emergence of deep learning, computer vision has witnessed extensive advancement and has seen immense applications in multiple domains. Specifically, image captioning has become an attractive focal direction for most machine learning experts, which includes the prerequisite of object identification, location, and semantic understanding. In this paper, semantic segmentation and image captioning are comprehensively investigated based on traditional and state-of-the-art methodologies. In this survey, we deliberate on the use of deep learning techniques on the segmentation analysis of both 2D and 3D images using a fully convolutional network and other high-level hierarchical feature extraction methods. First, each domain’s preliminaries and concept are described, and then semantic segmentation is discussed alongside its relevant features, available datasets, and evaluation criteria. Also, the semantic information capturing of objects and their attributes is presented in relation to their annotation generation. Finally, analysis of the existing methods, their contributions, and relevance are highlighted, informing the importance of these methods and illuminating a possible research continuation for the application of semantic image segmentation and image captioning approaches.

Pellis, E., A. Murtiyoso, A. Masiero, G. Tucci, M. Betti, and P. Grussenmeyer. "AN IMAGE-BASED DEEP LEARNING WORKFLOW FOR 3D HERITAGE POINT CLOUD SEMANTIC SEGMENTATION". International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVI-2/W1-2022 (February 25, 2022): 429–34. http://dx.doi.org/10.5194/isprs-archives-xlvi-2-w1-2022-429-2022.

Abstract

The interest in high-resolution semantic 3D models of historical buildings has continuously increased during the last decade, thanks to their utility in the protection, conservation and restoration of cultural heritage sites. The current generation of surveying tools allows the quick collection of large amounts of detailed data: such data ensure accurate spatial representations of the buildings, but their use in the creation of informative semantic 3D models is still a challenging task that currently requires time-consuming manual intervention by expert operators. Hence, increasing the level of automation, for instance by developing an automatic semantic segmentation procedure enabling machine scene understanding and comprehension, can represent a dramatic improvement in the overall processing procedure. In accordance with this observation, this paper aims at presenting a new workflow for the automatic semantic segmentation of 3D point clouds based on a multi-view approach. Two steps compose this workflow: first, neural network-based semantic segmentation is performed on building images. Then, image labelling is back-projected, through the use of masked images, onto the 3D space by exploiting photogrammetry and dense image matching principles. The obtained results are quite promising, with a good performance in the image segmentation, and a remarkable potential in the 3D reconstruction procedure.
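
The second step of this workflow, carrying 2D labels into 3D, can be illustrated with a deliberately simplified sketch. It is not the paper's procedure, which relies on masked images and dense image matching. Instead it assumes an ideal pinhole camera at the origin looking along +Z and simply looks up the label of the pixel each 3D point projects to. The names `project_point`, `back_project_labels`, `focal`, `cx` and `cy` are hypothetical.

```python
def project_point(point, focal, cx, cy):
    """Project a 3D point (camera frame, +Z forward) to integer pixel coordinates."""
    x, y, z = point
    if z <= 0:  # point is behind the camera, so it has no valid projection
        return None
    u = focal * x / z + cx
    v = focal * y / z + cy
    return int(round(u)), int(round(v))


def back_project_labels(points, label_image, focal, cx, cy, unlabeled=-1):
    """Assign each 3D point the class label of the pixel it projects onto.

    label_image is a list of rows (label_image[v][u]); points that fall
    outside the image or behind the camera receive the `unlabeled` value.
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for point in points:
        pixel = project_point(point, focal, cx, cy)
        if pixel is None:
            labels.append(unlabeled)
            continue
        u, v = pixel
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(unlabeled)
    return labels
```

A real multi-view pipeline would also need camera poses, lens distortion, occlusion handling, and a rule for merging labels seen from several images; none of that is modelled here.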

Rettenberger, Luca, Marcel Schilling, and Markus Reischl. "Annotation Efforts in Image Segmentation can be Reduced by Neural Network Bootstrapping". Current Directions in Biomedical Engineering 8, no. 2 (August 1, 2022): 329–32. http://dx.doi.org/10.1515/cdbme-2022-1084.

Abstract

Modern medical technology offers potential for the automatic generation of datasets that can be fed into deep learning systems. However, even though raw data for supporting diagnostics can be obtained with manageable effort, generating annotations is burdensome and time-consuming. Since annotating images for semantic segmentation is particularly exhausting, methods to reduce the human effort are especially valuable. We propose a combined framework that utilizes unsupervised machine learning to automatically generate segmentation masks. Experiments on two biomedical datasets show that our approach generates noticeably better annotations than Otsu thresholding and k-means clustering without needing any additional manual effort. Using our framework, unannotated datasets can be amended with pre-annotations in a fully unsupervised manner, thus reducing the human effort to a minimum.
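
For context, the Otsu thresholding baseline mentioned in this abstract can be sketched in a few lines. This is the classical baseline only, not the proposed framework. The function name `otsu_threshold` and the assumption of integer grey levels in `0..levels-1` are illustrative choices.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    Pixels with value <= t form the background class, pixels > t the foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the background class so far
    sum0 = 0    # sum of grey values in the background class so far
    for t in range(levels):
        w0 += hist[t]
        if w0 == 0:
            continue            # no background pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # no foreground pixels left
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to the between-class variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

A binary mask then follows as `[p > t for p in pixels]`; for a two-level image with values 10 and 200 the returned threshold is 10, so only the bright pixels are marked as foreground.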

Prakash, Nikhil, Andrea Manconi, and Simon Loew. "Mapping Landslides on EO Data: Performance of Deep Learning Models vs. Traditional Machine Learning Models". Remote Sensing 12, no. 3 (January 21, 2020): 346. http://dx.doi.org/10.3390/rs12030346.

Abstract

Mapping landslides using automated methods is a challenging task, which is still largely done using human efforts. Today, the availability of high-resolution EO data products is increasing exponentially, and one of the targets is to exploit this data source for the rapid generation of landslide inventory. Conventional methods like pixel-based and object-based machine learning strategies have been studied extensively in the last decade. In addition, recent advances in CNNs (convolutional neural networks), a type of deep-learning method, have been widely successful in extracting information from images and have outperformed other conventional learning methods. In the last few years, there have been only a few attempts to adapt CNNs for landslide mapping. In this study, we introduce a modified U-Net model for semantic segmentation of landslides at a regional scale from EO data using ResNet34 blocks for feature extraction. We also compare this with conventional pixel-based and object-based methods. The experiment was done in Douglas County, a study area selected in the south of Portland in Oregon, USA, and landslide inventory extracted from SLIDO (Statewide Landslide Information Database of Oregon) was considered as the ground truth. Landslide mapping is an imbalanced learning problem with very limited availability of training data. Our network was trained on a combination of focal Tversky loss and cross-entropy loss functions using augmented image tiles sampled from a selected training area. The deep-learning method was observed to have a better performance than the conventional methods with an MCC (Matthews correlation coefficient) score of 0.495 and a POD (probability of detection) rate of 0.72.
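
The loss and the two metrics named in this abstract can be written out as a minimal sketch. Several choices here are assumptions rather than details from the paper: the Tversky weights `alpha=0.7` and `beta=0.3`, the focusing parameter `gamma=4/3`, the exponent convention `(1 - TI) ** (1 / gamma)` (some implementations raise to `gamma` instead), and the equal `weight=0.5` mix of the two losses.

```python
import math


def focal_tversky_loss(probs, targets, alpha=0.7, beta=0.3, gamma=4 / 3, eps=1e-7):
    """Focal Tversky loss for one binary class over flattened pixels."""
    tp = sum(p * t for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return max(0.0, 1 - tversky) ** (1 / gamma)


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(probs)


def combined_loss(probs, targets, weight=0.5):
    """Weighted sum of the two losses (the weighting is an assumption)."""
    return (weight * focal_tversky_loss(probs, targets)
            + (1 - weight) * cross_entropy(probs, targets))


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def pod(tp, fn):
    """Probability of detection, i.e. recall of the positive class."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Weighting false negatives more heavily than false positives (`alpha > beta`) is a common way to favour recall on a rare class such as landslide pixels, which matches the class imbalance the abstract describes.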

Ravishankar, Rashmi, Elaf AlMahmoud, Abdulelah Habib, and Olivier L. de Weck. "Capacity Estimation of Solar Farms Using Deep Learning on High-Resolution Satellite Imagery". Remote Sensing 15, no. 1 (December 30, 2022): 210. http://dx.doi.org/10.3390/rs15010210.

Abstract

Global solar photovoltaic capacity has consistently doubled every 18 months over the last two decades, going from 0.3 GW in 2000 to 643 GW in 2019, and is forecast to reach 4240 GW by 2040. However, these numbers are uncertain, and virtually all reporting on deployments lacks a unified source of either information or validation. In this paper, we propose, optimize, and validate a deep learning framework to detect and map solar farms using a state-of-the-art semantic segmentation convolutional neural network applied to satellite imagery. As a final step in the pipeline, we propose a model to estimate the energy generation capacity of the detected solar energy facilities. Objectively, the deep learning model achieved highly competitive performance indicators, including a mean accuracy of 96.87%, and a Jaccard Index (intersection over union of classified pixels) score of 95.5%. Subjectively, it was found to detect spaces between panels producing a segmentation output at a sub-farm level that was better than human labeling. Finally, the detected areas and predicted generation capacities were validated against publicly available data to within an average error of 4.5%. Deep learning applied specifically for the detection and mapping of solar farms is an active area of research, and this deep learning capacity evaluation pipeline is one of the first of its kind. We also share an original dataset of overhead solar farm satellite imagery comprising 23,000 images (256 × 256 pixels each), and the corresponding labels upon which the machine learning model was trained.
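
The Jaccard Index quoted in this abstract is the intersection over union of predicted and reference pixels. A minimal sketch on flattened binary masks follows. The function names are illustrative, and how the paper averages accuracy across images or classes is not stated in the abstract, so `pixel_accuracy` is only the plain per-pixel version.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly, so define their IoU as 1.
    return intersection / union if union else 1.0


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the reference label."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(pred)
```

For `pred = [1, 1, 0, 0]` and `truth = [1, 0, 0, 0]` the accuracy is 0.75 while the IoU is only 0.5, which illustrates why IoU is the stricter measure when the target class covers few pixels.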

Mohtich, F. E., M. El-Ayachi, S. Bensiali, A. Idri, and I. Ait Hou. "DEEP LEARNING APPROACH APPLIED TO DRONE IMAGERY FOR REAL ESTATE TAX ASSESSMENT: CASE OF THE TAX ON UNBUILT LAND KENITRA-MOROCCO". International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVIII-4/W5-2022 (October 17, 2022): 121–27. http://dx.doi.org/10.5194/isprs-archives-xlviii-4-w5-2022-121-2022.

Abstract

According to the Court of Audit, urban taxation is the main source of revenue for local authorities in almost all regions of the world. In Morocco, in particular, the tax on unbuilt urban land accounts for 35% of the revenue from taxes managed directly by the municipality. The property tax assessment system currently adopted is not regularly updated and is not properly monitored. These difficulties do not allow for a significant expansion of the land base. The current efforts aim at accelerating the census of the urban heritage using innovative and automated approaches which are intended to lead to the next generation of urban information services and the development of smart cities. In this context we propose a methodology that consists of acquiring high-resolution UAV images and then training a deep learning semantic segmentation algorithm on the images in order to extract the characteristics defining the unbuilt land. U-Net, the deep convolutional neural network architecture that we have parameterized in order to adapt it to the nature of the phenomenon treated, the volume of data we have, and the performance of the machine, offers a segmentation accuracy that reaches 98.4%. Deep learning algorithms are seen as more promising for overcoming the difficulties of extracting semantic features from complex scenes and large differences in the appearance of unbuilt urban land. The results of prediction will be used for defining urban areas where updates are made from the perspective of tracking urban taxes.

Grimm, Florian, Florian Edl, Susanne R. Kerscher, Kay Nieselt, Isabel Gugel, and Martin U. Schuhmann. "Semantic segmentation of cerebrospinal fluid and brain volume with a convolutional neural network in pediatric hydrocephalus—transfer learning from existing algorithms". Acta Neurochirurgica 162, no. 10 (June 25, 2020): 2463–74. http://dx.doi.org/10.1007/s00701-020-04447-x.

Abstract

Background: For the segmentation of medical imaging data, a multitude of precise but very specific algorithms exist. In previous studies, we investigated the possibility of segmenting MRI data to determine cerebrospinal fluid and brain volume using a classical machine learning algorithm. It demonstrated good clinical usability and a very accurate correlation of the volumes to the single area determination in a reproducible axial layer. This study aims to investigate whether these established segmentation algorithms can be transferred to new, more generalizable deep learning algorithms employing an extended transfer learning procedure and whether medically meaningful segmentation is possible.
Methods: Ninety-five routinely performed true FISP MRI sequences were retrospectively analyzed in 43 patients with pediatric hydrocephalus. Using a freely available and clinically established segmentation algorithm based on a hidden Markov random field model, four classes of segmentation (brain, cerebrospinal fluid (CSF), background, and tissue) were generated. Fifty-nine randomly selected data sets (10,432 slices) were used as a training data set. Images were augmented for contrast, brightness, and random left/right and X/Y translation. A convolutional neural network (CNN) for semantic image segmentation composed of an encoder and corresponding decoder subnetwork was set up. The network was pre-initialized with layers and weights from a pre-trained VGG 16 model. Subsequently, the network was trained with the labeled image data set. A validation data set of 18 scans (3289 slices) was used to monitor the performance as the deep CNN trained. The classification results were tested on 18 randomly allocated labeled data sets (3319 slices) and on a T2-weighted BrainWeb data set with known ground truth.
Results: The segmentation of clinical test data provided reliable results (global accuracy 0.90, Dice coefficient 0.86), while the CNN segmentation of data from the BrainWeb data set showed comparable results (global accuracy 0.89, Dice coefficient 0.84). The segmentation of the BrainWeb data set with the classical FAST algorithm produced consistent findings (global accuracy 0.90, Dice coefficient 0.87). Likewise, the area development of brain and CSF in the long-term clinical course of three patients was presented.
Conclusion: Using the presented methods, we showed that conventional segmentation algorithms can be transferred to new advances in deep learning with comparable accuracy, generating a large number of training data sets with relatively little effort. A clinically meaningful segmentation possibility was demonstrated.
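
For readers comparing the Dice coefficients reported here with the Jaccard or IoU scores used by other entries in this list, the standard definitions for a predicted mask $A$ and a reference mask $B$ are given below, together with the conversion between them. Applying the conversion to the reported numbers is only indicative, since the abstract does not say whether the Dice values are averaged over classes.

```latex
\[
\mathrm{Dice}(A,B) = \frac{2\,|A \cap B|}{|A| + |B|},
\qquad
J(A,B) = \frac{|A \cap B|}{|A \cup B|},
\qquad
J = \frac{\mathrm{Dice}}{2 - \mathrm{Dice}} .
\]
% Worked example for a single pair of masks:
% Dice = 0.86  =>  J = 0.86 / (2 - 0.86) = 0.86 / 1.14 \approx 0.754
```

Because $J \le \mathrm{Dice}$ for any pair of masks, a Dice coefficient of 0.86 corresponds to a noticeably lower IoU of roughly 0.75.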

Cira, Calimanut-Ionut, Ramón Alcarria, Miguel-Ángel Manso-Callejo, and Francisco Serradilla. "A Deep Learning-Based Solution for Large-Scale Extraction of the Secondary Road Network from High-Resolution Aerial Orthoimagery". Applied Sciences 10, no. 20 (October 17, 2020): 7272. http://dx.doi.org/10.3390/app10207272.

Abstract

Secondary roads represent the largest part of the road network. However, due to the absence of clearly defined edges, presence of occlusions, and differences in widths, monitoring and mapping them represents a great effort for public administration. We believe that recent advancements in machine vision allow the extraction of these types of roads from high-resolution remotely sensed imagery and can enable the automation of the mapping operation. In this work, we leverage these advances and propose a deep learning-based solution capable of efficiently extracting the surface area of secondary roads at a large scale. The solution is based on hybrid segmentation models trained with high-resolution remote sensing imagery divided into tiles of 256 × 256 pixels and their corresponding segmentation masks, resulting in increases in performance metrics of 2.7–3.5% when compared to the original architectures. The best performing model achieved maximum Intersection over Union and F1 scores of 0.5790 and 0.7120, respectively, with a minimum loss of 0.4985 and was integrated on a web platform which handles the evaluation of large areas, the association of the semantic predictions with geographical coordinates, the conversion of the tiles’ format and the generation of geotiff results compatible with geospatial databases.
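
Dividing a large orthoimage into 256 × 256 tiles, as described in this abstract, can be sketched as follows. The function name `tile_origins` is hypothetical. Handling the right and bottom borders by shifting the last tile inward, so that it overlaps its neighbour, is an assumption; the abstract does not say how borders were treated.

```python
def tile_origins(width, height, tile=256, stride=256):
    """Return the (x, y) top-left corners of tiles covering a width x height image."""
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    # If the regular grid leaves an uncovered strip, add one tile flush with the border.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y) for y in ys for x in xs]
```

A 512 × 512 image yields four non-overlapping tiles, while a 600 × 256 image yields three tiles at x = 0, 256 and 344, the last one overlapping the second.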

Briechle, S., P. Krzystek, and G. Vosselman. "SEMANTIC LABELING OF ALS POINT CLOUDS FOR TREE SPECIES MAPPING USING THE DEEP NEURAL NETWORK POINTNET++". ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W13 (June 5, 2019): 951–55. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w13-951-2019.

Abstract

Most methods for the mapping of tree species are based on the segmentation of single trees that are subsequently classified using a set of hand-crafted features and an appropriate classifier. The classification accuracy for coniferous and deciduous trees just using airborne laser scanning (ALS) data is only around 90% when the geometric information of the point cloud is used. As deep neural networks (DNNs) have the ability to adaptively learn features from the underlying data, they have outperformed classic machine learning (ML) approaches on well-known benchmark datasets provided by the robotics, computer vision and remote sensing community. However, tree species classification using deep learning (DL) procedures has been of minor research interest so far. Some studies have been conducted based on an extensive prior generation of images or voxels from the 3D raw data. Since innovative DNNs directly operate on irregular and unordered 3D point clouds on a large scale, the objective of this study is to use PointNet++ as an example for the semantic labeling of ALS point clouds to map deciduous and coniferous trees. The dataset for our experiments consists of ALS data from the Bavarian Forest National Park (366 trees/ha), only including spruces (coniferous) and beeches (deciduous). First, the training data were generated automatically using a classic feature-based Random Forest (RF) approach classifying coniferous trees (precision = 93%, recall = 80%) and deciduous trees (precision = 82%, recall = 92%). Second, PointNet++ was trained and subsequently evaluated using 80 randomly chosen test batches of 400 m² each. The achieved per-point classification results after 163 training epochs for coniferous trees (precision = 90%, recall = 79%) and deciduous trees (precision = 81%, recall = 91%) are fairly high considering that only the geometry was included. Nevertheless, the classification results using PointNet++ are slightly lower than those of the baseline method using an RF classifier. Errors in the training data and edge effects limited the performance. Our first results demonstrate that the architecture of the 3D DNN PointNet++ can successfully be adapted to the semantic labeling of large ALS point clouds to map deciduous and coniferous trees. Future work will focus on the integration of additional features, such as the laser intensity, the surface normals and multispectral features, into the DNN. Thus, a further improvement of the accuracy of the proposed approach is to be expected. Furthermore, the classification of numerous individual tree species based on pre-segmented single trees should be investigated.
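
The per-class precision and recall figures in this abstract are per-point quantities. A minimal sketch of how such values are obtained from per-point reference and predicted labels is given below. The function name and the string labels are illustrative and do not come from the paper.

```python
def precision_recall(y_true, y_pred, cls):
    """Per-point precision and recall for one class label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

In a two-class setting every false negative of one class is a false positive of the other, so high precision for one class tends to come with high recall for the other. The reported figures show this pattern: 90% precision for coniferous trees alongside 91% recall for deciduous trees.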

Theses on the topic "DEEP LEARNING, GENERATION, SEMANTIC SEGMENTATION, MACHINE LEARNING"

Di, Mauro Daniele. "Scene Understanding for Parking Spaces Management". Doctoral thesis, Università di Catania, 2019. http://hdl.handle.net/10761/4138.

Abstract

Most of the world's population has moved to urban areas. As a result, many problems of major cities have worsened, e.g. air pollution, traffic, and security. The increase in security cameras and the improvements in Computer Vision algorithms can be a good solution for many of those problems. The work in this thesis was started after a grant by Park Smart s.r.l., a company located in Catania, which believes that Computer Vision can be the answer for parking space management. The main problem the company has to face is finding a fast way to deploy working solutions across different scenes, cities, and parking areas while keeping the labeling effort to a minimum. During the three years of doctoral studies we have tried to solve the problem through the use of various methods such as Semi-Supervised Learning, Counting and Scene Adaptation through Image Classification, Object Detection and Semantic Segmentation. Semi-Supervised classification was the first approach used to decrease labeling effort for fast deployment. Methods based on counting objects, like cars and parking spots, were analyzed as a second solution. To gain full knowledge of the scene we focused on Semantic Segmentation and the use of Generative Adversarial Networks in order to find a viable way to reach good Scene Adaptation results comparable to state-of-the-art methods.

Nguyen, Duc Minh Chau. "Affordance learning for visual-semantic perception". Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2021. https://ro.ecu.edu.au/theses/2443.

Abstract

Affordance Learning is linked to the study of interactions between robots and objects, including how robots perceive objects by scene understanding. This area has been popular in Psychology, which has recently come to influence Computer Vision. In this way, Computer Vision has borrowed the concept of affordance from Psychology in order to develop Visual-Semantic recognition systems and, in particular, to develop the capabilities of robots to interact with objects. However, existing systems of Affordance Learning are still limited to detecting and segmenting object affordances, which is called Affordance Segmentation. Further, these systems are not designed to develop specific abilities to reason about affordances. For example, a Visual-Semantic system for captioning a scene can extract information from an image, such as “a person holds a chocolate bar and eats it”, but does not highlight the affordances “hold” and “eat”. Indeed, these affordances and others commonly appear within all aspects of life, since affordances usually connect to actions (from a linguistic view, affordances are generally known as verbs in sentences). Due to the above-mentioned limitations, this thesis aims to develop systems of Affordance Learning for Visual-Semantic Perception. These systems can be built using Deep Learning, which has been empirically shown to be efficient for performing Computer Vision tasks. There are two goals of the thesis: (1) study the key factors that contribute to the performance of Affordance Segmentation and (2) reason about affordances (Affordance Reasoning) based on parts of objects for Visual-Semantic Perception. In terms of the first goal, the thesis mainly investigates the feature extraction module as this is one of the earliest steps in learning to segment affordances. The thesis finds that the quality of feature extraction from images plays a vital role in improved performance of Affordance Segmentation. With regard to the second goal, the thesis infers affordances from object parts to reason about part-affordance relationships. Based on this approach, the thesis devises an Object Affordance Reasoning Network that can learn to construct relationships between affordances and object parts. As a result, reasoning about affordance becomes achievable in the generation of scene graphs of affordances and object parts. Empirical results, obtained from extensive experiments, show the potential of the system developed in the thesis towards Affordance Reasoning from Scene Graph Generation.

Espis, Andrea. "Object detection and semantic segmentation for assisted data labeling". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022.

Abstract

The automation of data labeling tasks is a solution to the errors and time costs related to human labeling. In this thesis work CenterNet, DeepLabV3, and K-Means applied to the RGB color space, are deployed to build a pipeline for Assisted data labeling: a semi-automatic process to iteratively improve the quality of the annotations. The proposed pipeline pointed out a total of 1547 wrong and missing annotations when applied to a dataset originally containing 8,300 annotations. Moreover, the quality of each annotation has been drastically improved, and at the same time, more than 600 hours of work have been saved. The same models have also been used to address the real-time Tire inspection task, regarding the detection of markers on the surface of tires. According to the experiments, the combination of DeepLabV3 output and post-processing based on the area and shape of the predicted blobs, achieves a maximum of mean Precision 0.992, with mean Recall 0.982, and a maximum of mean Recall 0.998, with mean Precision 0.960.
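
One ingredient of this pipeline, K-Means applied to the RGB colour space, can be sketched with Lloyd's algorithm. This is a generic illustration and not the thesis implementation. The function names, the explicit initial centroids (used so the result is deterministic) and the fixed iteration cap are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_rgb(pixels, initial_centroids, iterations=10):
    """Cluster RGB pixels with Lloyd's algorithm.

    Returns the final centroids and the cluster index assigned to each pixel.
    """
    centroids = [tuple(float(c) for c in centroid) for centroid in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: squared_distance(px, centroids[k]))
            for px in pixels
        ]
        # Update step: each centroid moves to the mean colour of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [px for px, a in zip(pixels, assignment) if a == k]
            if members:
                new_centroids.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster where it is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignment
```

On two reddish and two bluish pixels with one red and one blue starting centroid, the clusters separate by colour and the centroids settle on the mean colour of each pair.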

Serra, Sabina. "Deep Learning for Semantic Segmentation of 3D Point Clouds from an Airborne LiDAR". Thesis, Linköpings universitet, Datorseende, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-168367.

Abstract

Light Detection and Ranging (LiDAR) sensors have many different application areas, from revealing archaeological structures to aiding navigation of vehicles. However, it is challenging to interpret and fully use the vast amount of unstructured data that LiDARs collect. Automatic classification of LiDAR data would ease the utilization, whether it is for examining structures or aiding vehicles. In recent years, there have been many advances in deep learning for semantic segmentation of automotive LiDAR data, but there is less research on aerial LiDAR data. This thesis investigates the current state-of-the-art deep learning architectures, and how well they perform on LiDAR data acquired by an Unmanned Aerial Vehicle (UAV). It also investigates different training techniques for class imbalanced and limited datasets, which are common challenges for semantic segmentation networks. Lastly, this thesis investigates if pre-training can improve the performance of the models. The LiDAR scans were first projected to range images and then a fully convolutional semantic segmentation network was used. Three different training techniques were evaluated: weighted sampling, data augmentation, and grouping of classes. No improvement was observed with weighted sampling, nor did grouping of classes have a substantial effect on the performance. Pre-training on the large public dataset SemanticKITTI resulted in a small performance improvement, but the data augmentation seemed to have the largest positive impact. The mIoU of the best model, which was trained with data augmentation, was 63.7% and it performed very well on the classes Ground, Vegetation, and Vehicle. The other classes in the UAV dataset, Person and Structure, had very little data and were challenging for most models to classify correctly. In general, the models trained on UAV data performed similarly to the state-of-the-art models trained on automotive data.
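
The projection of LiDAR scans to range images mentioned in this abstract is commonly done with a spherical projection. The following is a minimal sketch of that idea. The image size, the vertical field of view and the rule of keeping the nearest return per pixel are assumptions, since the thesis settings are not given in the abstract.

```python
import math


def spherical_projection(points, width=1024, height=64,
                         fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points to a height x width range image.

    Each pixel stores the range of the nearest point that falls into it,
    or 0.0 if no point does.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)      # horizontal angle in (-pi, pi]
        pitch = math.asin(z / r)    # vertical angle

        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        col = min(max(col, 0), width - 1)   # clamp to the image bounds
        row = min(max(row, 0), height - 1)

        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image
```

The resulting 2D grid can be fed to an ordinary fully convolutional image segmentation network, and the predicted per-pixel classes can be mapped back to the points that produced each pixel.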

Westell, Jesper. "Multi-Task Learning using Road Surface Condition Classification and Road Scene Semantic Segmentation". Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157403.

Abstract

Understanding road surface conditions is an important component in active vehicle safety. Estimations can be achieved through image classification using increasingly popular convolutional neural networks (CNNs). In this paper, we explore the effects of multi-task learning by creating CNNs capable of simultaneously performing two tasks: road surface condition classification (RSCC) and road scene semantic segmentation (RSSS). A multi-task network, containing a shared feature extractor (VGG16, ResNet-18, ResNet-101) and two task-specific network branches, is built and trained using the Road-Conditions and Cityscapes datasets. We reveal that utilizing task-dependent homoscedastic uncertainty in the learning process improves multi-task model performance on both tasks. When performing task adaptation, using a small set of additional data labeled with semantic information, we gain considerable RSCC improvements on complex models. Furthermore, we demonstrate increased model generalizability in multi-task models, with up to 12% higher F1-score compared to single-task models.
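
Weighting task losses by task-dependent homoscedastic uncertainty is often implemented with one learnable log-variance per task. The sketch below uses a widely used simplified form, in which each task contributes `exp(-s) * loss + s`. Whether the thesis uses exactly this form is not stated in the abstract, and the function names are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using learnable log-variances s_i = log(sigma_i^2).

    Each task contributes exp(-s_i) * L_i + s_i: a large s_i down-weights a
    noisy task, while the additive s_i term stops s_i from growing without bound.
    """
    return sum(math.exp(-s) * loss + s
               for loss, s in zip(task_losses, log_variances))


def optimal_log_variance(loss):
    """Value of s that minimises exp(-s) * loss + s for a fixed positive loss."""
    return math.log(loss)
```

With all log-variances at zero the result is the plain sum of the task losses. During training the log-variances would be optimised jointly with the network weights, so the balance between classification and segmentation is learned rather than hand-tuned.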

Rydgård, Jonas, and Marcus Bejgrowicz. "Semantic Segmentation of Building Materials in Real World Images Using 3D Information". Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176618.

Abstract

The increasing popularity of drones has made it convenient to capture a large number of images of a property, which can then be used to build a 3D model. The conditions of buildings can be analyzed to plan renovations. This creates an interest for automatically identifying building materials, a task well suited for machine learning. With access to drone imagery of buildings as well as depth maps and normal maps, we created a dataset for semantic segmentation. Two different convolutional neural networks were trained and evaluated, to see how well they perform material segmentation. DeepLabv3+, which uses RGB data, was compared to Depth-Aware CNN, which uses RGB-D data. Our experiments showed that DeepLabv3+ achieved higher mean intersection over union. To investigate if the information in the depth maps and normal maps could give a performance boost, we conducted experiments with an encoding we call HMN - horizontal disparity, magnitude of normal with ground, normal parallel with gravity. This three channel encoding was used to jointly train two CNNs, one with RGB and one with HMN, and then sum their predictions. This led to improved results for both DeepLabv3+ and Depth-Aware CNN.

Sörsäter, Michael. "Active Learning for Road Segmentation using Convolutional Neural Networks". Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-152286.

Abstract

In recent years, the development of convolutional neural networks has enabled high-performing semantic segmentation models. Generally, these deep-learning-based segmentation methods require a large amount of annotated data. Acquiring such annotated data for semantic segmentation is a tedious and expensive task. Within machine learning, active learning guides the selection of new data in order to limit the amount of annotated data needed. In active learning, the model is trained for several iterations, and additional samples that the model is uncertain about are selected. The model is then retrained with the additional samples, and the process is repeated. In this thesis, an active learning framework has been applied to road segmentation, that is, semantic segmentation of objects related to road scenes. The uncertainty in the samples is estimated with Monte Carlo dropout. In Monte Carlo dropout, several dropout masks are applied to the model and the variance of the resulting predictions is captured, serving as an estimate of the model's uncertainty. Other methods for ranking uncertainty evaluated in this work are a baseline that selects samples randomly, the entropy of the default predictions, and three additional variations/extensions of Monte Carlo dropout. Both the active learning framework and the uncertainty estimation are implemented in the thesis. Monte Carlo dropout performs slightly better than the baseline in 3 out of 4 metrics. Entropy outperforms all other implemented methods in all metrics. The three additional methods do not perform better than Monte Carlo dropout. An analysis of what kind of uncertainty Monte Carlo dropout captures is performed, together with a comparison of the samples selected by the baseline and by Monte Carlo dropout. Future development and possible improvements are also discussed.
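The two uncertainty scores named in the abstract above, the variance across Monte Carlo dropout passes and the entropy of a prediction, can be sketched as follows. This is an illustration only: the stochastic forward passes are supplied as plain probability vectors, and all function names are hypothetical rather than taken from the thesis.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mc_dropout_variance(passes):
    """Mean per-class variance over T stochastic forward passes.

    `passes` holds T probability vectors for the same input, each obtained
    with a different dropout mask (here simply given as data).
    """
    t = len(passes)
    n_classes = len(passes[0])
    means = [sum(p[c] for p in passes) / t for c in range(n_classes)]
    variances = [sum((p[c] - means[c]) ** 2 for p in passes) / t
                 for c in range(n_classes)]
    return sum(variances) / n_classes


def select_most_uncertain(scores, k):
    """Indices of the k samples with the highest uncertainty score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical dropout passes for two unlabeled samples.
stable = [[0.9, 0.1], [0.9, 0.1]]
unstable = [[0.9, 0.1], [0.1, 0.9]]
scores = [mc_dropout_variance(stable), mc_dropout_variance(unstable)]
print(select_most_uncertain(scores, 1))
```

In an active learning loop, the selected indices would be sent for annotation and added to the training set before the next retraining round.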

Hu, Xikun. "Multispectral Remote Sensing and Deep Learning for Wildfire Detection". Licentiate thesis, KTH, Geoinformatik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295655.

Abstract

Remote sensing data has great potential for wildfire detection and monitoring with enhanced spatial resolution and temporal coverage. Earth observation satellites have been employed to systematically monitor fire activity over large regions in two ways: (i) to detect the location of actively burning spots (during the fire event), and (ii) to map the spatial extent of the burned scars (during or after the event). Active fire detection plays an important role in wildfire early warning systems. Open access to Sentinel-2 multispectral data at 20-m resolution offers an opportunity to evaluate its complementary role to the coarse indication of hotspots provided by MODIS-like polar-orbiting and GOES-like geostationary systems. In addition, accurate and timely mapping of burned areas is needed for damage assessment. Recent advances in deep learning (DL) provide researchers with automatic, accurate, and bias-free large-scale options for burned area mapping using uni-temporal multispectral imagery. Therefore, the objective of this thesis is to evaluate multispectral remote sensing data (in particular Sentinel-2) for wildfire detection, including active fire detection using a multi-criteria approach and burned area detection using DL models. For active fire detection, a multi-criteria approach based on the reflectance of B4, B11, and B12 of Sentinel-2 MSI data is developed for several representative fire-prone biomes to extract unambiguous active fire pixels. The adaptive thresholds for each biome are statistically determined from 11 million Sentinel-2 observation samples acquired over summertime (June 2019 to September 2019) across 14 regions or countries. The primary criterion is derived from the 3-sigma prediction interval of an OLS regression of observation samples for each biome. More specific criteria based on B11 and B12 are further introduced to reduce the omission errors (OE) and commission errors (CE). 
The multi-criteria approach proves to be effective in detecting cool smoldering fires in study areas with tropical & subtropical grasslands, savannas & shrublands using the primary criterion. At the same time, additional criteria that threshold the reflectance of B11 and B12 can effectively decrease the CE caused by extremely bright flames around the hot cores in testing sites with Mediterranean forests, woodlands & scrub. The other criterion, based on the reflectance ratio between B12 and B11, also avoids the CE caused by hot soil pixels in sites with tropical & subtropical moist broadleaf forests. Overall, the validation performance over testing patches reveals that CE and OE can be kept at a low level (0.14 and 0.04) as an acceptable trade-off. This multi-criteria algorithm is suitable for rapid active fire detection based on uni-temporal imagery without the requirement of multi-temporal data. Medium-resolution multispectral data can be used as a complementary choice to coarse-resolution images for their ability to detect small burning areas and to detect active fires more accurately. For burned area mapping, this thesis aims to expound on the capability of DL models for automatically mapping burned areas from uni-temporal multispectral imagery. Various burned area detection algorithms have been developed using Sentinel-2 and/or Landsat data, but most of the studies require a pre-fire image, dense time-series data, or an empirical threshold. In this thesis, several semantic segmentation network architectures, i.e., U-Net, HRNet, Fast-SCNN, and DeepLabv3+, are applied to Sentinel-2 imagery and Landsat-8 imagery over three testing sites in two local climate zones. In addition, three popular machine learning (ML) algorithms (LightGBM, KNN, and random forests) and NBR thresholding techniques (empirical and OTSU-based) are used in the same study areas for comparison. 
The validation results show that DL algorithms outperform the machine learning (ML) methods in two of the three cases with compact burned scars, while ML methods seem to be more suitable for mapping dispersed scars in boreal forests. Using Sentinel-2 images, U-Net and HRNet exhibit nearly identical performance with higher kappa (around 0.9) in one heterogeneous Mediterranean fire site in Greece; Fast-SCNN performs better than the others, with kappa over 0.79, in one compact boreal forest fire with various burn severity in Sweden. Furthermore, when the trained models are directly transferred to corresponding Landsat-8 data, HRNet dominates in the three test sites among DL models and can preserve the high accuracy. The results demonstrate that DL models can make full use of contextual information and capture spatial details at multiple scales from fire-sensitive spectral bands to map burned areas. With a uni-temporal image, DL-based methods have the potential to be used for the next Earth observation satellite with onboard data processing and limited storage for previous scenes. In future studies, DL models will be explored to detect active fire from multi-resolution remote sensing data. The existing problem of unbalanced labeled data can be resolved via advanced DL architectures, a suitable configuration of the training dataset, and an improved loss function. To further explore the damage caused by wildfire, future work will focus on burn severity assessment based on DL models through multi-class semantic segmentation. In addition, the translation between optical and SAR imagery based on a Generative Adversarial Network (GAN) model could be explored to improve burned area mapping in different weather conditions.
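The NBR thresholding baseline and the kappa scores mentioned in the abstract above can be illustrated with a short sketch. The Normalized Burn Ratio is (NIR − SWIR) / (NIR + SWIR), and Cohen's kappa compares observed agreement with chance agreement. The threshold value, the reflectance numbers, and the function names below are hypothetical and are not taken from the thesis.

```python
def nbr(nir, swir):
    """Normalized Burn Ratio for one pixel from NIR and SWIR reflectance."""
    denom = nir + swir
    return 0.0 if denom == 0 else (nir - swir) / denom


def burned_mask(nir_band, swir_band, threshold=-0.1):
    """Flag pixels whose NBR falls below an (assumed) empirical threshold."""
    return [nbr(n, s) < threshold for n, s in zip(nir_band, swir_band)]


def cohen_kappa(pred, truth):
    """Cohen's kappa for two binary maps given as equal-length lists of bools."""
    n = len(truth)
    observed = sum(p == t for p, t in zip(pred, truth)) / n
    p_pred = sum(pred) / n
    p_true = sum(truth) / n
    expected = p_pred * p_true + (1 - p_pred) * (1 - p_true)
    if expected == 1.0:
        # Both maps contain a single identical class; treated here as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical reflectances for four pixels and a hypothetical reference map.
nir = [0.10, 0.40, 0.12, 0.35]
swir = [0.30, 0.10, 0.28, 0.15]
mask = burned_mask(nir, swir)
print(mask, cohen_kappa(mask, [True, False, False, False]))
```

An Otsu-based variant would replace the fixed threshold with one chosen from the NBR histogram of the scene.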

Phillips, Adon. "Melanoma Diagnostics Using Fully Convolutional Networks on Whole Slide Images". Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36929.

Abstract

Semantic segmentation as an approach to recognizing and localizing objects within an image is a major research area in computer vision. Now that convolutional neural networks are being increasingly used for such tasks, there have been many improvements in grand challenge results, and many new research opportunities in previously untenable areas. Using fully convolutional networks, we have developed a semantic segmentation pipeline for the identification of melanocytic tumor regions, epidermis, and dermis layers in whole slide microscopy images of cutaneous melanoma or cutaneous metastatic melanoma. This pipeline includes processes for annotating and preparing a dataset, from the output of a tissue slide scanner to the patch-based training and inference by an artificial neural network. We have curated a large dataset of 50 whole slide images containing cutaneous melanoma or cutaneous metastatic melanoma that are fully annotated at 40× objective resolution by an expert pathologist. We will publish the source images of this dataset online. We also present two new FCN architectures that fuse multiple deconvolutional strides, combining coarse and fine predictions to improve accuracy over similar networks without multi-stride information. Our results show that the system performs better than our comparators. We include inference results on thousands of patches from four whole slide images, reassembling them into whole slide segmentation masks to demonstrate how our system generalizes to novel cases.

Radhakrishnan, Aswathnarayan. "A Study on Applying Learning Techniques to Remote Sensing Data". The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1586901481703797.

Book chapters on the topic "DEEP LEARNING, GENERATION, SEMANTIC SEGMENTATION, MACHINE LEARNING"

Saadatifard, Leila, Aryan Mobiny, Pavel Govyadinov, Hien Van Nguyen and David Mayerich. "Cellular/Vascular Reconstruction Using a Deep CNN for Semantic Image Preprocessing and Explicit Segmentation". In Machine Learning for Medical Image Reconstruction, 134–44. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-61598-7_13.

Jothe, Jaswant Singh and Punit Kumar Johari. "Multiclass Semantic Segmentation of COVID-19 CT Scan Images using Deep Learning". In Applications of Machine Intelligence in Engineering, 359–70. New York: CRC Press, 2022. http://dx.doi.org/10.1201/9781003269793-39.

Reena, Amanpratap Singh Pall, Nonita Sharma, K. P. Sharma and Vaishali Wadhwa. "A Systematic Review of Deep Learning Techniques for Semantic Image Segmentation: Methods, Future Directions, and Challenges". In Handbook of Research on Machine Learning, 49–86. New York: Apple Academic Press, 2022. http://dx.doi.org/10.1201/9781003277330-4.

Cao, Yuwei, Simone Teruggi, Francesco Fassi and Marco Scaioni. "A Comprehensive Understanding of Machine Learning and Deep Learning Methods for 3D Architectural Cultural Heritage Point Cloud Semantic Segmentation". In Geomatics for Green and Digital Transition, 329–41. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-17439-1_24.

Iwamoto, Sora, Bisser Raytchev, Toru Tamaki and Kazufumi Kaneda. "Improving the Reliability of Semantic Segmentation of Medical Images by Uncertainty Modeling with Bayesian Deep Networks and Curriculum Learning". In Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, and Perinatal Imaging, Placental and Preterm Image Analysis, 34–43. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87735-4_4.

Wicaksono, Hendro, Tina Boroukhian and Atit Bashyal. "A Demand-Response System for Sustainable Manufacturing Using Linked Data and Machine Learning". In Dynamics in Logistics, 155–81. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-88662-2_8.

Abstract

The spread of demand-response (DR) programs in Europe is a slow but steady process to optimize the use of renewable energy in different sectors, including manufacturing. A demand-response program promotes changes in electricity consumption patterns on the end consumer side to match the availability of renewable energy sources through price changes or incentives. This research develops a system that aims to engage manufacturing power consumers through price- and incentive-based DR programs. The system works on data from heterogeneous systems on both the supply and demand sides, which are linked through a semantic middleware instead of centralized data integration. An ontology is used as the integration information model of the semantic middleware. This chapter explains the concept of constructing the ontology by utilizing relational-database-to-ontology mapping techniques, reusing existing ontologies such as OpenADR, SSN, SAREF, etc., and applying ontology alignment methods. Machine learning approaches are developed to forecast both the power generated from renewable energy sources and the power demanded by manufacturing consumers based on their processes. The forecasts are the basis for calculating the dynamic electricity price introduced for the DR program. This chapter presents different neural network architectures and compares the experiment results. We compare the results of Deep Neural Network (DNN), Long Short-Term Memory Network (LSTM), Convolutional Neural Network (CNN), and hybrid architectures. This chapter covers the initial phase of the research, where we focus on the ontology development method and machine learning experiments using power generation datasets.

Xie, Bo and Long Chen. "Automatic Scoring Model of Subjective Questions Based Text Similarity Fusion Model". In Proceeding of 2021 International Conference on Wireless Communications, Networking and Applications, 586–99. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-2456-9_60.

Abstract

In this AI era, scene-based translation and intelligent word segmentation are not new technologies. However, there is still no good solution for long and complex Chinese semantic analysis. Subjective question scoring still relies on teachers' manual marking. However, there are a large number of examinations, and the manual marking workload is huge. Labor costs are rising, and the traditional manual marking method cannot meet the demand. The demand for automatic marking is increasingly strong in modern society. At present, automatic marking technology for objective questions is very mature and widely used. However, because of the complexity and difficulty of natural language processing for Chinese text, there are still many shortcomings in subjective question marking, such as not considering the impact of semantics, word order, and other issues on scoring accuracy. Automatic scoring of subjective questions is a complex technology, involving pattern recognition, machine learning, natural language processing, and other technologies. Good results have been seen with calculation methods based on deep learning and machine learning. The rapid development of NLP technology has brought a new breakthrough for subjective question scoring. To ensure the accuracy of the results, we integrate two deep learning models based on the Siamese network through bagging: the text similarity matching model based on the birth networks and the score point recognition model based on the named entity recognition method. Combining these with the deep learning framework, we use a simulated manual scoring method to extract and match the score point sequence of students' answers with standard answers. The score point recognition model effectively improves the efficiency of model calculation and long-text keyword matching. The final trained score point recognition model has a loss of about 0.9 and an accuracy of 80.54%. The trained text similarity matching model has an accuracy of 86.99%. The fusion model's single scoring time is less than 0.8 s, and its accuracy is 83.43%.

Hagn, Korbinian and Oliver Grau. "Optimized Data Synthesis for DNN Training and Validation by Sensor Artifact Simulation". In Deep Neural Networks and Data for Automated Driving, 127–47. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-01233-4_4.

Abstract

Synthetic, i.e., computer-generated imagery (CGI) data is a key component for training and validating deep-learning-based perceptive functions due to its ability to simulate rare cases, avoidance of privacy issues, and generation of pixel-accurate ground truth data. Today, physical-based rendering (PBR) engines already simulate a wealth of realistic optical effects but are mainly focused on the human perception system. Perceptive functions, in contrast, require realistic images modeled with sensor artifacts as close as possible to those of the sensor with which the training data has been recorded. This chapter proposes a way to improve the data synthesis process by application of realistic sensor artifacts. To do this, one has to overcome the domain distance between real-world imagery and the synthetic imagery. Therefore, we propose a measure which captures the generalization distance of two distinct datasets which have been trained on the same model. With this measure the data synthesis pipeline can be improved to produce realistic sensor-simulated images which are closer to the real-world domain. The proposed measure is based on the Wasserstein distance (earth mover's distance, EMD) over the performance metric mean intersection-over-union (mIoU) on a per-image basis, comparing synthetic and real datasets using deep neural networks (DNNs) for semantic segmentation. This measure is subsequently used to match the characteristic of a real-world camera for the image synthesis pipeline, which considers realistic sensor noise and lens artifacts. Comparing the measure with the well-established Fréchet inception distance (FID) on real and artificial datasets demonstrates the ability to interpret the generalization distance, which is inherently asymmetric and more informative than just a simple distance measure. Furthermore, we use the metric as an optimization criterion to adapt a synthetic dataset to a real dataset, decreasing the EMD distance between a synthetic and the Cityscapes dataset from 32.67 to 27.48 and increasing the mIoU of our test algorithm from 40.36 to 47.63%.
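The measure described in the abstract above compares distributions of per-image mIoU with the Wasserstein (earth mover's) distance. A minimal sketch for the one-dimensional case follows, under the assumption that both datasets contribute the same number of equally weighted images. The function name and the mIoU values are hypothetical, and the chapter's actual measure may differ in detail.

```python
def wasserstein_1d(sample_a, sample_b):
    """1-D Wasserstein-1 distance between two equal-size empirical samples.

    With equal sizes and uniform weights, the optimal transport plan pairs
    the sorted values, so the distance is the mean absolute difference.
    """
    if not sample_a or len(sample_a) != len(sample_b):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return sum(abs(x - y) for x, y in pairs) / len(sample_a)


# Hypothetical per-image mIoU values (percent) on a real and a synthetic set.
miou_real = [62.0, 55.5, 70.1, 48.3]
miou_synth = [41.2, 38.9, 52.4, 30.0]
print(wasserstein_1d(miou_real, miou_synth))
```

For samples of unequal size or with weights, a library routine such as `scipy.stats.wasserstein_distance` handles the general case.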

Rege, Priti P. and Shaheera Akhter. "Text Separation From Document Images". In Machine Learning and Deep Learning in Real-Time Applications, 283–313. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-3095-5.ch013.

Abstract

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task. It is necessary to improve the accuracy of an OCR system. Traditionally, for separating text from a document, different feature extraction processes have been used that require handcrafting of the features. However, deep learning-based methods are excellent feature extractors that learn features from the training data automatically. Deep learning gives state-of-the-art results on various computer vision, image classification, segmentation, image captioning, object detection, and recognition tasks. This chapter compares various traditional as well as deep-learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned for transfer learning to get more precise results. The final results show that deep learning methods give more accurate results compared with conventional methods of image processing for Devanagari text extraction.

Murugan, R. "Implementation of Deep Learning Neural Network for Retinal Images". In Handbook of Research on Applications and Implementations of Machine Learning Techniques, 77–95. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-5225-9902-9.ch005.

Abstract

Segmentation of the retinal parts has been recognized as a key component in the analysis of both ophthalmological and cardiovascular diseases. Segmentation of the parts of retinal images (vessels, optic disc, and macula) contributes to the diagnostic outcome. However, manual segmentation of retinal parts is tedious and time-consuming work, and it also requires professional skills. This chapter proposes a supervised method to segment blood vessels using deep learning methods. More specifically, the proposed method applies a fully convolutional network, which is normally used to perform semantic segmentation tasks, with transfer learning. The convolutional neural network has proven to be a powerful tool for several computer vision tasks. Recently, medical image analysis groups around the world have been rapidly entering this field and applying convolutional neural networks and other deep learning methodologies to a wide variety of applications, and exceptional results are emerging constantly.

Conference proceedings on the topic "DEEP LEARNING, GENERATION, SEMANTIC SEGMENTATION, MACHINE LEARNING"

Fathalla, Radwa and George Vogiatzis. "A deep learning pipeline for semantic facade segmentation". In British Machine Vision Conference 2017. British Machine Vision Association, 2017. http://dx.doi.org/10.5244/c.31.120.

Knaak, Christian, Gerald Kolter, Frederic Schulze, Moritz Kröger and Peter Abels. "Deep learning-based semantic segmentation for in-process monitoring in laser welding applications". In Applications of Machine Learning, edited by Michael E. Zelinski, Tarek M. Taha, Jonathan Howe, Abdul A. Awwal and Khan M. Iftekharuddin. SPIE, 2019. http://dx.doi.org/10.1117/12.2529160.

Cuza, Daniela, Andrea Loreggia, Alessandra Lumini and Loris Nanni. "Deep Semantic Segmentation in Skin Detection". In ESANN 2022 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2022. http://dx.doi.org/10.14428/esann/2022.es2022-35.

Zamorano Raya, José Alfonso, Mireya Saraí García Vázquez, Juan Carlos Jaimes Méndez, Abraham Montoya Obeso, Jorge Luis Compean Aguirre and Alejandro Álvaro Ramirez Acosta. "Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living". In Applications of Machine Learning, edited by Michael E. Zelinski, Tarek M. Taha, Jonathan Howe, Abdul A. Awwal and Khan M. Iftekharuddin. SPIE, 2019. http://dx.doi.org/10.1117/12.2529834.

Andreini, Paolo and Giovanna Maria Dimitri. "Deep Semantic Segmentation Models in Computer Vision". In ESANN 2022 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2022. http://dx.doi.org/10.14428/esann/2022.es2022-5.

Sinha, Sujata, Thomas Denney, Yang Zhou and Jingyi Zheng. "Automated Semantic Segmentation of Cardiac Magnetic Resonance Images with Deep Learning". In 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2020. http://dx.doi.org/10.1109/icmla51294.2020.00212.

Ito, Shodai, Noboru Takagi, Kei Sawai, Hiroyuki Masuta and Tatsuo Motoyoshi. "Fast Semantic Segmentation for Vectorization of Line Drawings Based on Deep Neural Networks". In 2022 International Conference on Machine Learning and Cybernetics (ICMLC). IEEE, 2022. http://dx.doi.org/10.1109/icmlc56445.2022.9941326.

Lutz, Benjamin, Dominik Kisskalt, Daniel Regulin, Raven Reisch, Andreas Schiffler and Jorg Franke. "Evaluation of Deep Learning for Semantic Image Segmentation in Tool Condition Monitoring". In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). IEEE, 2019. http://dx.doi.org/10.1109/icmla.2019.00321.

Karoly, Artur I. and Peter Galambos. "Automated Dataset Generation with Blender for Deep Learning-based Object Segmentation". In 2022 IEEE 20th Jubilee World Symposium on Applied Machine Intelligence and Informatics (SAMI). IEEE, 2022. http://dx.doi.org/10.1109/sami54271.2022.9780790.

Smit, Jason R., Hugh GP Huntt, Tyson Cross, Carina Schumann and Tom A. Warner. "Generation of metrics by semantic segmentation of high speed lightning footage using machine learning". In 2020 International SAUPEC/RobMech/PRASA Conference. IEEE, 2020. http://dx.doi.org/10.1109/saupec/robmech/prasa48453.2020.9041123.
