Journal articles on the topic 'Spectral-semantic model'

1

Guo, Yu Tang, and Chang Gang Han. "Automatic Image Annotation Using Semantic Subspace Graph Spectral Clustering Algorithm." Advanced Materials Research 271-273 (July 2011): 1090–95. http://dx.doi.org/10.4028/www.scientific.net/amr.271-273.1090.

Abstract:
Because of the semantic gap, images with the same or similar low-level features may differ at the semantic level. Finding the underlying relationship between high-level semantics and low-level features is one of the difficult problems in image annotation. In this paper, a new image annotation method based on graph spectral clustering with semantic consistency is proposed, following a detailed analysis of the advantages and disadvantages of existing image annotation methods. The proposed method first clusters images into several semantic classes by semantic similarity measurement in the semantic subspace. Within each semantic class, images are re-clustered using the visual features of regions. Then, the joint probability distribution of blobs and words is modeled using the Multiple-Bernoulli Relevance Model. An unannotated image can be annotated using the joint distribution. Experimental results show the effectiveness of the proposed approach in terms of annotation quality. The consistency of high-level semantics and low-level features is efficiently achieved.
2

Zhu, Qiqi, Yanfei Zhong, and Liangpei Zhang. "SCENE CLASSFICATION BASED ON THE SEMANTIC-FEATURE FUSION FULLY SPARSE TOPIC MODEL FOR HIGH SPATIAL RESOLUTION REMOTE SENSING IMAGERY." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 451–57. http://dx.doi.org/10.5194/isprsarchives-xli-b7-451-2016.

Abstract:
Topic modeling has become an increasingly mature method to bridge the semantic gap between low-level features and high-level semantic information. However, with more and more high spatial resolution (HSR) images to deal with, a conventional probabilistic topic model (PTM) usually represents the images with a dense semantic representation. This consumes more time and requires more storage space. In addition, due to the complex spectral and spatial information, combining multiple complementary features has proved to be an effective strategy for improving the performance of HSR image scene classification. However, how the distinct features are fused to fully describe the challenging HSR images is a critical factor for scene classification. In this paper, a semantic-feature fusion fully sparse topic model (SFF-FSTM) is proposed for HSR imagery scene classification. In SFF-FSTM, three heterogeneous features – the mean and standard deviation based spectral feature, the wavelet based texture feature, and the dense scale-invariant feature transform (SIFT) based structural feature – are effectively fused at the latent semantic level. The combination of the multiple semantic-feature fusion strategy and the sparse FSTM is able to provide adequate feature representations, and can achieve comparable performance with limited training samples. Experimental results on the UC Merced dataset and the Google dataset of SIRI-WHU demonstrate that the proposed method can improve the performance of scene classification compared with other scene classification methods for HSR imagery.
4

Wang, Yi, Wenke Yu, and Zhice Fang. "Multiple Kernel-Based SVM Classification of Hyperspectral Images by Combining Spectral, Spatial, and Semantic Information." Remote Sensing 12, no. 1 (January 1, 2020): 120. http://dx.doi.org/10.3390/rs12010120.

Abstract:
In this study, we present a hyperspectral image classification method by combining spectral, spatial, and semantic information. The main steps of the proposed method are summarized as follows: First, principal component analysis transform is conducted on an original image to produce its extended morphological profile, Gabor features, and superpixel-based segmentation map. To model spatial information, the extended morphological profile and Gabor features are used to represent structure and texture features, respectively. Moreover, the mean filtering is performed within each superpixel to maintain the homogeneity of the spatial features. Then, the k-means clustering and the entropy rate superpixel segmentation are combined to produce semantic feature vectors by using a bag of visual-words model for each superpixel. Next, three kernel functions are constructed to describe the spectral, spatial, and semantic information, respectively. Finally, the composite kernel technique is used to fuse all the features into a multiple kernel function that is fed into a support vector machine classifier to produce a final classification map. Experiments demonstrate that the proposed method is superior to the most popular kernel-based classification methods in terms of both visual inspection and quantitative analysis, even if only very limited training samples are available.
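The fusion step described in this abstract can be illustrated with a minimal sketch of a composite kernel, here a weighted sum of one RBF kernel per feature group. The abstract does not state the kernel types, weights, or bandwidths, so the function names, the weights, and the `gamma` values below are assumptions chosen only for illustration.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def composite_kernel(sample_a, sample_b, weights, gammas):
    """Weighted sum of per-group RBF kernels (one common composite form).

    sample_a, sample_b: dicts mapping a feature-group name to its vector.
    weights: dict of non-negative group weights (assumed to sum to 1).
    gammas: dict of RBF bandwidths, one per group (assumed values).
    """
    return sum(
        weights[group] * rbf_kernel(sample_a[group], sample_b[group], gammas[group])
        for group in weights
    )


# Toy pixels with hypothetical spectral, spatial, and semantic descriptors.
pixel_1 = {"spectral": [0.2, 0.4], "spatial": [1.0, 0.0], "semantic": [0.5, 0.5]}
pixel_2 = {"spectral": [0.2, 0.5], "spatial": [0.9, 0.1], "semantic": [0.4, 0.6]}
weights = {"spectral": 0.4, "spatial": 0.3, "semantic": 0.3}
gammas = {"spectral": 1.0, "spatial": 1.0, "semantic": 1.0}

k_same = composite_kernel(pixel_1, pixel_1, weights, gammas)  # self-similarity
k_diff = composite_kernel(pixel_1, pixel_2, weights, gammas)  # cross-similarity
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so a matrix of such values could in principle be passed to any SVM implementation that accepts precomputed kernels.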
5

Yang, J., and Z. Kang. "INDOOR SEMANTIC SEGMENTATION FROM RGB-D IMAGES BY INTEGRATING FULLY CONVOLUTIONAL NETWORK WITH HIGHER-ORDER MARKOV RANDOM FIELD." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4 (September 19, 2018): 717–24. http://dx.doi.org/10.5194/isprs-archives-xlii-4-717-2018.

Abstract:
Indoor scenes are characterized by abundant semantic categories, illumination changes, occlusions, and overlaps among objects, which pose great challenges for indoor semantic segmentation. Therefore, in this paper we develop a method based on a higher-order Markov random field model for indoor semantic segmentation from RGB-D images. Instead of directly using RGB-D images, we first train and apply a RefineNet model using only RGB information to generate the high-level semantic information. Then, the spatial location relationship from the depth channel and the spectral information from the color channels are integrated as a prior for a marker-controlled watershed algorithm to obtain robust and accurate visually homogeneous regions. Finally, the higher-order Markov random field model encodes the short-range context among adjacent pixels and the long-range context within each visually homogeneous region to refine the semantic segmentation. To evaluate the effectiveness and robustness of the proposed method, experiments were conducted on the public SUN RGB-D dataset. Experimental results indicate that, compared with using RGB information alone, the proposed method remarkably improves the semantic segmentation results, especially at object boundaries.
6

Zhang, Zhisheng, Jinsong Tang, Heping Zhong, Haoran Wu, Peng Zhang, and Mingqiang Ning. "Spectral Normalized CycleGAN with Application in Semisupervised Semantic Segmentation of Sonar Images." Computational Intelligence and Neuroscience 2022 (April 28, 2022): 1–12. http://dx.doi.org/10.1155/2022/1274260.

Abstract:
CycleGAN has been shown to outperform recent approaches for semisupervised semantic segmentation on public segmentation benchmarks. In contrast to analog images, however, acoustic images are unbalanced and often exhibit speckle noise. As a consequence, CycleGAN is prone to mode collapse and cannot retain target details when applied directly to a sonar image dataset. To address this problem, a spectral normalized CycleGAN network is presented, which applies spectral normalization to both generators and discriminators to stabilize the training of GANs. Without using a pretrained model, the experimental results demonstrate that this simple yet effective method helps to achieve reasonably accurate sonar target segmentation results.
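Spectral normalization, as named in this abstract, rescales a layer's weight matrix by its largest singular value. The following is a minimal pure-Python sketch under stated assumptions: it works on a plain 2D list instead of a network layer, uses a fixed starting vector (real implementations typically use a random one and run a single power-iteration step per training update), and all function names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def normalize(vector):
    """Scale a vector to unit Euclidean length (assumes it is non-zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


def spectral_norm(weight, iterations=50):
    """Estimate the largest singular value of `weight` by power iteration."""
    weight_t = transpose(weight)
    u = normalize([1.0] * len(weight))  # fixed start vector, an assumption
    for _ in range(iterations):
        v = normalize(matvec(weight_t, u))
        u = normalize(matvec(weight, v))
    # sigma is approximated by u^T W v.
    return sum(a * b for a, b in zip(u, matvec(weight, v)))


def spectrally_normalize(weight, iterations=50):
    """Divide every entry by the estimated spectral norm."""
    sigma = spectral_norm(weight, iterations)
    return [[w / sigma for w in row] for row in weight]


weight = [[3.0, 0.0], [0.0, 1.0]]  # toy matrix with singular values 3 and 1
sigma = spectral_norm(weight)
normalized = spectrally_normalize(weight)
```

After normalization the matrix has spectral norm close to 1, which bounds how much the layer can stretch its input; this is the property usually credited with stabilizing GAN training.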
7

Akcay, Ozgun, Ahmet Cumhur Kinaci, Emin Ozgur Avsar, and Umut Aydar. "Semantic Segmentation of High-Resolution Airborne Images with Dual-Stream DeepLabV3+." ISPRS International Journal of Geo-Information 11, no. 1 (December 30, 2021): 23. http://dx.doi.org/10.3390/ijgi11010023.

Abstract:
In geospatial applications such as urban planning and land use management, automatic detection and classification of earth objects are essential and primary subjects. When the significant semantic segmentation algorithms are considered, DeepLabV3+ stands out as a state-of-the-art CNN. Although the DeepLabV3+ model is capable of extracting multi-scale contextual information, there is still a need for multi-stream architectural approaches and different training approaches of the model that can leverage multi-modal geographic datasets. In this study, a new end-to-end dual-stream architecture that considers geospatial imagery was developed based on the DeepLabV3+ architecture. As a result, the spectral datasets other than RGB provided increments in semantic segmentation accuracies when they were used as additional channels to height information. Furthermore, both the given data augmentation and Tversky loss function which is sensitive to imbalanced data accomplished better overall accuracies. Also, it has been shown that the new dual-stream architecture using Potsdam and Vaihingen datasets produced 88.87% and 87.39% overall semantic segmentation accuracies, respectively. Eventually, it was seen that enhancement of the traditional significant semantic segmentation networks has a great potential to provide higher model performances, whereas the contribution of geospatial data as the second stream to RGB to segmentation was explicitly shown.
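The Tversky loss mentioned in this abstract can be sketched as follows. The abstract does not give the weighting parameters, so the `alpha` and `beta` defaults are assumptions, as are the function name and the flat-list input format.

```python
def tversky_loss(probs, targets, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky loss for one class over flattened pixels.

    probs: predicted foreground probabilities in [0, 1].
    targets: ground-truth labels (0 or 1).
    alpha weights false positives and beta weights false negatives;
    alpha = beta = 0.5 reduces to the Dice loss.
    """
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index


targets = [1, 1, 0, 0]
one_miss = [1, 0, 0, 0]  # one foreground pixel is missed

perfect_loss = tversky_loss(targets, targets)
loss_fn_heavy = tversky_loss(one_miss, targets, alpha=0.3, beta=0.7)
loss_fp_heavy = tversky_loss(one_miss, targets, alpha=0.7, beta=0.3)
dice_like = tversky_loss([1, 1, 0, 0], [1, 0, 1, 0], alpha=0.5, beta=0.5)
```

Choosing `beta` larger than `alpha` makes missed foreground pixels more costly than false alarms, which is why this loss is often used when the class of interest is rare.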
8

Cheng, Xu, Lihua Liu, and Chen Song. "A Cyclic Information–Interaction Model for Remote Sensing Image Segmentation." Remote Sensing 13, no. 19 (September 27, 2021): 3871. http://dx.doi.org/10.3390/rs13193871.

Abstract:
Object detection and segmentation have recently shown encouraging results toward image analysis and interpretation due to their promising applications in remote sensing image fusion field. Although numerous methods have been proposed, implementing effective and efficient object detection is still very challenging for now, especially for the limitation of single modal data. The use of a single modal data is not always enough to reach proper spectral and spatial resolutions. The rapid expansion in the number and the availability of multi-source data causes new challenges for their effective and efficient processing. In this paper, we propose an effective feature information–interaction visual attention model for multimodal data segmentation and enhancement, which utilizes channel information to weight self-attentive feature maps of different sources, completing extraction, fusion, and enhancement of global semantic features with local contextual information of the object. Additionally, we further propose an adaptively cyclic feature information–interaction model, which adopts branch prediction to decide the number of visual perceptions, accomplishing adaptive fusion of global semantic features and local fine-grained information. Numerous experiments on several benchmarks show that the proposed approach can achieve significant improvements over baseline model.
9

Zhang, Chengming, Yan Chen, Xiaoxia Yang, Shuai Gao, Feng Li, Ailing Kong, Dawei Zu, and Li Sun. "Improved Remote Sensing Image Classification Based on Multi-Scale Feature Fusion." Remote Sensing 12, no. 2 (January 8, 2020): 213. http://dx.doi.org/10.3390/rs12020213.

Abstract:
When extracting land-use information from remote sensing imagery using image segmentation, obtaining fine edges for extracted objects is a key problem that is yet to be solved. In this study, we developed a new weight feature value convolutional neural network (WFCNN) to perform fine remote sensing image segmentation and extract improved land-use information from remote sensing imagery. The WFCNN includes one encoder and one classifier. The encoder obtains a set of spectral features and five levels of semantic features. It uses the linear fusion method to hierarchically fuse the semantic features, employs an adjustment layer to optimize every level of fused features to ensure the stability of the pixel features, and combines the fused semantic and spectral features to form a feature graph. The classifier then uses a Softmax model to perform pixel-by-pixel classification. The WFCNN was trained using a stochastic gradient descent algorithm; the former and two variants were subject to experimental testing based on Gaofen 6 images and aerial images that compared them with the commonly used SegNet, U-NET, and RefineNet models. The accuracy, precision, recall, and F1-Score of the WFCNN were higher than those of the other models, indicating certain advantages in pixel-by-pixel segmentation. The results clearly show that the WFCNN can improve the accuracy and automation level of large-scale land-use mapping and the extraction of other information using remote sensing imagery.
10

Song, Hong, Syed Raza Mehdi, Yangfan Zhang, Yichun Shentu, Qixin Wan, Wenxin Wang, Kazim Raza, and Hui Huang. "Development of Coral Investigation System Based on Semantic Segmentation of Single-Channel Images." Sensors 21, no. 5 (March 6, 2021): 1848. http://dx.doi.org/10.3390/s21051848.

Abstract:
Among aquatic biota, corals provide shelter with sufficient nutrition to a wide variety of underwater life. However, a severe decline in coral resources has been noted in recent decades due to global environmental changes causing marine pollution. Hence, it is of paramount importance to develop and deploy a swift coral monitoring system to alleviate the destruction of corals. Performing semantic segmentation on underwater images is one of the most efficient methods for automatic investigation of corals. First, to design a coral investigation system, RGB and spectral images of various types of corals in natural and artificial aquatic sites are collected. Based on single-channel images, a convolutional neural network (CNN) model named DeeperLabC is employed for the semantic segmentation of corals; it is a concise, modified DeeperLab model with an encoder-decoder architecture. Using ResNet34 as a skeleton network, the proposed model extracts coral features in the images and performs semantic segmentation. DeeperLabC achieved state-of-the-art coral segmentation with an overall mean intersection over union (IoU) value of 93.90% and a maximum F1-score of 97.10%, which surpassed other existing benchmark neural networks for semantic segmentation. The class activation map (CAM) module also demonstrated the excellent performance of the DeeperLabC model in binary classification between coral and non-coral bodies.
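The two metrics reported in this abstract, mean IoU and F1-score, can be computed from per-class pixel counts. The sketch below uses the standard definitions on flattened label lists; the function names and the toy labels (1 for coral, 0 for non-coral) are hypothetical and not taken from the paper.

```python
def class_counts(pred, truth, label):
    """True positives, false positives, and false negatives for one class."""
    tp = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    fp = sum(1 for p, t in zip(pred, truth) if p == label and t != label)
    fn = sum(1 for p, t in zip(pred, truth) if p != label and t == label)
    return tp, fp, fn


def iou(pred, truth, label):
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = class_counts(pred, truth, label)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def f1_score(pred, truth, label):
    """F1-score: 2 TP / (2 TP + FP + FN)."""
    tp, fp, fn = class_counts(pred, truth, label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_iou(pred, truth, labels):
    """Average of the per-class IoU values."""
    return sum(iou(pred, truth, label) for label in labels) / len(labels)


# Toy flattened masks: 1 = coral, 0 = non-coral (hypothetical data).
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]

coral_iou = iou(pred, truth, 1)
coral_f1 = f1_score(pred, truth, 1)
miou = mean_iou(pred, truth, [0, 1])
```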
11

Chen, Guanzhou, Xiaoliang Tan, Beibei Guo, Kun Zhu, Puyun Liao, Tong Wang, Qing Wang, and Xiaodong Zhang. "SDFCNv2: An Improved FCN Framework for Remote Sensing Images Semantic Segmentation." Remote Sensing 13, no. 23 (December 3, 2021): 4902. http://dx.doi.org/10.3390/rs13234902.

Abstract:
Semantic segmentation is a fundamental task in remote sensing image analysis (RSIA). Fully convolutional networks (FCNs) have achieved state-of-the-art performance in the task of semantic segmentation of natural scene images. However, due to distinctive differences between natural scene images and remotely-sensed (RS) images, FCN-based semantic segmentation methods from the field of computer vision cannot achieve promising performance on RS images without modifications. In previous work, we proposed an RS image semantic segmentation framework, SDFCNv1, combined with a majority voting postprocessing method. Nevertheless, it still has some drawbacks, such as a small receptive field and a large number of parameters. In this paper, we propose an improved semantic segmentation framework, SDFCNv2, based on SDFCNv1, to conduct optimal semantic segmentation on RS images. We first construct a novel FCN model with hybrid basic convolutional (HBC) blocks and spatial-channel-fusion squeeze-and-excitation (SCFSE) modules, which has a larger receptive field and fewer network parameters. We also put forward a spectral-specific stochastic-gamma-transform-based (SSSGT-based) data augmentation method for the model training process to improve the generalizability of our model. In addition, we design a mask-weighted voting decision fusion postprocessing algorithm for image segmentation on overlarge RS images. We conducted several comparative experiments on two public datasets and a real surveying and mapping dataset. Extensive experimental results demonstrate that, compared with the SDFCNv1 framework, our SDFCNv2 framework can increase the mIoU metric by up to 5.22% while using only about half of the parameters.
12

Yan, L., and W. Xia. "A MODIFIED THREE-DIMENSIONAL GRAY-LEVEL CO-OCCURRENCE MATRIX FOR IMAGE CLASSIFICATION WITH DIGITAL SURFACE MODEL." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W13 (June 4, 2019): 133–38. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w13-133-2019.

Abstract:
2D texture cannot reflect a 3D object's texture because it considers only the intensity distribution in the 2D image region, whereas in the real world the intensities of objects are distributed over 3D surfaces. This paper proposes a modified three-dimensional gray-level co-occurrence matrix (3D-GLCM). The 3D-GLCM was first introduced to process volumetric data, but it cannot be applied directly to spectral images with a digital surface model because of data sparsity in the direction perpendicular to the image plane. Spectral and geometric features combined with no texture, 2D-GLCM, and 3D-GLCM were fed into a random forest for comparison using the ISPRS 2D semantic labelling challenge dataset. The overall accuracy of the combination containing 3D-GLCM improved by 2.4% and 1.3% compared with the combinations without texture and with 2D-GLCM, respectively.
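For context, the standard 2D gray-level co-occurrence matrix that this abstract uses as a baseline can be sketched as below. This is not the modified 3D-GLCM the paper proposes, whose details are not given in the abstract; the function names, the pixel offset, and the toy image are assumptions.

```python
def glcm(image, offset=(0, 1), levels=4):
    """Count co-occurrences of gray levels for pixel pairs at a fixed offset.

    image: 2D list of integer gray levels in range(levels).
    offset: (row shift, column shift) from a pixel to its neighbor.
    Returns a levels x levels matrix of raw (non-symmetric) counts.
    """
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast texture feature: mean squared gray-level difference of pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Toy 4x4 image with four gray levels (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

horizontal = glcm(image, offset=(0, 1), levels=4)
contrast = glcm_contrast(horizontal)
```

A 3D variant would additionally index neighbors along a third axis, such as height from a digital surface model, which is where the data sparsity noted in the abstract becomes an issue.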
13

Saha, S., L. Kondmann, and X. X. Zhu. "DEEP NO LEARNING APPROACH FOR UNSUPERVISED CHANGE DETECTION IN HYPERSPECTRAL IMAGES." ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences V-3-2021 (June 17, 2021): 311–16. http://dx.doi.org/10.5194/isprs-annals-v-3-2021-311-2021.

Abstract:
Unsupervised deep transfer-learning based change detection (CD) methods require a pre-trained feature extractor that can be used to extract semantic features from the target bi-temporal scene. However, it is difficult to obtain such feature extractors for hyperspectral images. Moreover, it is not trivial to reuse models trained on multispectral images for hyperspectral images because of the significant difference in the number of spectral bands. While hyperspectral images have a large number of spectral bands, they generally show much less spatial complexity, thus reducing the need for large receptive fields in the convolution filters. Recent works in computer vision have shown that even untrained networks can yield remarkable results in different tasks such as super-resolution and surface reconstruction. Motivated by this, we make a bold proposition that an untrained deep model, initialized with some weight initialization strategy, can be used to extract useful semantic features from bi-temporal hyperspectral images. Thus, we couple an untrained network with Deep Change Vector Analysis (DCVA), a popular method for unsupervised CD, to propose an unsupervised CD method for hyperspectral images. We conduct experiments on two hyperspectral CD data sets, and the results demonstrate the advantages of the proposed unsupervised method over other competitors.
14

Yang, Qinchen, Man Liu, Zhitao Zhang, Shuqin Yang, Jifeng Ning, and Wenting Han. "Mapping Plastic Mulched Farmland for High Resolution Images of Unmanned Aerial Vehicle Using Deep Semantic Segmentation." Remote Sensing 11, no. 17 (August 26, 2019): 2008. http://dx.doi.org/10.3390/rs11172008.

Abstract:
With increasing consumption, plastic mulch benefits agriculture by promoting crop quality and yield, but the resulting environmental and soil pollution is becoming increasingly serious. Therefore, research on the monitoring of plastic mulched farmland (PMF) has received increasing attention. Because of their high resolution, unmanned aerial vehicle (UAV) remote sensing images of plastic mulched farmland show a prominent spatial pattern, which brings difficulties to the task of monitoring PMF. In this paper, through a comparison between two deep semantic segmentation methods, SegNet and fully convolutional networks (FCN), and a traditional classification method, Support Vector Machine (SVM), we propose an end-to-end deep-learning method aimed at accurately recognizing PMF in UAV remote sensing images from Hetao Irrigation District, Inner Mongolia, China. After experiments with single-band, three-band, and six-band image data, we found that deep semantic segmentation models built on single-band data, which use only the texture pattern of PMF, can identify it well; for example, SegNet reached the highest accuracy of 88.68% in a 900 nm band. Furthermore, with three-band data (3 visible bands) and six-band data (3 visible bands and 3 near-infrared bands), deep semantic segmentation models combining the texture and spectral features further improve the accuracy of PMF identification, with six-band data giving the best performance for both FCN and SegNet. In addition, the deep semantic segmentation methods FCN and SegNet, owing to their strong feature extraction capability and direct pixel classification, clearly outperform the traditional SVM method in precision and speed. Among the three classification methods, the SegNet model built on three-band and six-band data obtains the best average accuracy of 89.62% and 90.6%, respectively. Therefore, the proposed deep semantic segmentation model, when tested against the traditional classification method, provides a promising path for mapping PMF in UAV remote sensing images.
15

Tan, Daning, Yu Liu, Gang Li, Libo Yao, Shun Sun, and You He. "Serial GANs: A Feature-Preserving Heterogeneous Remote Sensing Image Transformation Model." Remote Sensing 13, no. 19 (October 3, 2021): 3968. http://dx.doi.org/10.3390/rs13193968.

Abstract:
In recent years, the interpretation of SAR images has been significantly improved with the development of deep learning technology, and using conditional generative adversarial nets (CGANs) for SAR-to-optical transformation, also known as image translation, has become popular. Most of the existing image translation methods based on conditional generative adversarial nets are modified based on CycleGAN and pix2pix, focusing on style transformation in practice. In addition, SAR images and optical images are characterized by heterogeneous features and large spectral differences, leading to problems such as incomplete image details and spectral distortion in the heterogeneous transformation of SAR images in urban or semiurban areas and with complex terrain. Aiming to solve the problems of SAR-to-optical transformation, Serial GANs, a feature-preserving heterogeneous remote sensing image transformation model, is proposed in this paper for the first time. This model uses the Serial Despeckling GAN and Colorization GAN to complete the SAR-to-optical transformation. Despeckling GAN transforms the SAR images into optical gray images, retaining the texture details and semantic information. Colorization GAN transforms the optical gray images obtained in the first step into optical color images and keeps the structural features unchanged. The model proposed in this paper provides a new idea for heterogeneous image transformation. Through decoupling network design, structural detail information and spectral information are relatively independent in the process of heterogeneous transformation, thereby enhancing the detail information of the generated optical images and reducing its spectral distortion. Using SEN-2 satellite images as the reference, this paper compares the degree of similarity between the images generated by different models and the reference, and the results revealed that the proposed model has obvious advantages in feature reconstruction and the economical volume of the parameters. It also showed that Serial GANs have great potential in decoupling image transformation.
16

He, L., Z. Wu, Y. Zhang, and Z. Hu. "SEMANTIC SEGMENTATION OF REMOTE SENSING IMAGERY USING OBJECT-BASED MARKOV RANDOM FIELD BASED ON HIERARCHICAL SEGMENTATION TREE WITH AUXILIARY LABELS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2020 (August 21, 2020): 75–81. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2020-75-2020.

Abstract:
In remote sensing imagery, spectral and texture features are always complex because of different landscapes, which leads to misclassifications in the results of semantic segmentation. The object-based Markov random field provides an effective solution to this problem. However, the state-of-the-art object-based Markov random field still needs to be improved. In this paper, an object-based Markov random field model based on a hierarchical segmentation tree with auxiliary labels is proposed. A remote sensing image is first segmented, and the object-based hierarchical segmentation tree is built from the initial segmentation objects and merging criteria. Then, the object-based Markov random field with auxiliary label fields is established on the hierarchical tree structure. Probabilistic inference is applied to solve this model by iteratively updating the label field and the auxiliary label fields. In the experiment, a Worldview-3 image was used to evaluate the performance, and the results show the validity and accuracy of the presented semantic segmentation approach.
17

Li, Xiaoman, Yanfei Zhong, Yu Su, and Richen Ye. "Scene-Change Detection Based on Multi-Feature-Fusion Latent Dirichlet Allocation Model for High-Spatial-Resolution Remote Sensing Imagery." Photogrammetric Engineering & Remote Sensing 87, no. 9 (September 1, 2021): 669–81. http://dx.doi.org/10.14358/pers.20-00054.

Abstract:
With the continuous development of high-spatial-resolution ground observation technology, it is now becoming possible to obtain more and more high-resolution images, which provide us with the possibility to understand remote sensing images at the semantic level. Compared with traditional pixel- and object-oriented methods of change detection, scene-change detection can provide us with land use change information at the semantic level, and can thus provide reliable information for urban land use change detection, urban planning, and government management. Most of the current scene-change detection methods are based on the visual-words expression of the bag-of-visual-words model and the single-feature-based latent Dirichlet allocation model. In this article, a scene-change detection method for high-spatial-resolution imagery is proposed based on a multi-feature-fusion latent Dirichlet allocation model. This method combines the spectral, textural, and spatial features of the high-spatial-resolution images, and the final scene expression is realized through the topic features extracted from the more abstract latent Dirichlet allocation model. Post-classification comparison is then used to detect changes in the scene images at different times. A series of experiments demonstrates that, compared with the traditional bag-of-words and topic models, the proposed method can obtain superior scene-change detection results.
18

Ruben, P. A., R. Sileryte, and G. Agugiaro. "3D CITY MODELS FOR URBAN MINING: POINT CLOUD BASED SEMANTIC ENRICHMENT FOR SPECTRAL VARIATION IDENTIFICATION IN HYPERSPECTRAL IMAGERY." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-4-2020 (August 3, 2020): 223–30. http://dx.doi.org/10.5194/isprs-annals-v-4-2020-223-2020.

Abstract:
Urban mining aims at reusing building materials enclosed in our cities. Therefore, it requires accurate information on the availability of these materials for each separate building. While recent publications have demonstrated that such information can be obtained using machine learning and data fusion techniques applied to hyperspectral imagery, challenges still persist. One of these is the so-called 'salt-and-pepper noise', i.e. the oversensitivity to the presence of several materials within one pixel (e.g. chimneys, roof windows). For the specific case of identifying roof materials, this research demonstrates the potential of 3D city models to identify and filter out such unreliable pixels beforehand. As, from a geometrical point of view, most available 3D city models are too generalized for this purpose (e.g. in CityGML Level of Detail 2), semantic enrichment using a point cloud is proposed to compensate for missing details. So-called deviations are mapped onto a 3D building model by comparing it with a point cloud. A seeded region growing approach based on distance and orientation features is used for the comparison. Further, the results of a validation carried out for parts of Rotterdam, resulting in KHAT values as high as 0.7, are discussed.
19

Nyabuga, Douglas Omwenga, Jinling Song, Guohua Liu, and Michael Adjeisah. "A 3D-2D Convolutional Neural Network and Transfer Learning for Hyperspectral Image Classification." Computational Intelligence and Neuroscience 2021 (August 21, 2021): 1–19. http://dx.doi.org/10.1155/2021/1759111.

Abstract:
With the fast evolution of remote sensing and spectral imagery techniques, hyperspectral image (HSI) classification has attracted considerable attention in various fields, including land survey, resource monitoring, and others. Nonetheless, due to a lack of distinctiveness in the hyperspectral pixels of separate classes, there is a recurrent inseparability obstacle in the primary space. Additionally, an open challenge stems from examining efficient techniques that can speedily classify and interpret the spectral-spatial data bands within a more precise computational time. Hence, in this work, we propose a 3D-2D convolutional neural network and transfer learning model in which the early layers exploit 3D convolutions to model spectral-spatial information. On top of these, 2D convolutional layers mainly handle semantic abstraction. Toward simplicity and a highly modularized network for image classification, we leverage the ResNeXt-50 block for our model. Furthermore, to improve the separability among classes and the balance of the interclass and intraclass criteria, we used principal component analysis (PCA) to find the best orthogonal vectors for representing information from HSIs before feeding them to the network. The experimental results show that our model can efficiently improve hyperspectral imagery classification, including an instantaneous representation of the spectral-spatial information. Our model was evaluated on five publicly available hyperspectral datasets, Indian Pines (IP), Pavia University Scene (PU), Salinas Scene (SA), Botswana (BS), and Kennedy Space Center (KSC), with high classification accuracies of 99.85%, 99.98%, 100%, 99.82%, and 99.71%, respectively. Quantitative results demonstrated that it outperformed several state-of-the-art (SOTA) deep neural network-based approaches and standard classifiers. Thus, it has provided more insight into hyperspectral image classification.
20

Bell, Theodore S., Donald D. Dirks, and Timothy D. Trine. "Frequency-Importance Functions for Words in High- and Low-Context Sentences." Journal of Speech, Language, and Hearing Research 35, no. 4 (August 1992): 950–59. http://dx.doi.org/10.1044/jshr.3504.950.

Abstract:
The relative importance and absolute contributions of various spectral regions to speech intelligibility under conditions of either neutral or predictable sentential context were examined. Specifically, the frequency-importance functions for a set of monosyllabic words embedded in a highly predictive sentence context versus a sentence with little predictive information were developed using Articulation Index (AI) methods. Forty-two young normal-hearing adults heard sentences presented at signal-to-noise ratios from –8 to +14 dB in a noise shaped to conform to the peak spectrum of the speech. Results indicated only slight differences in ⅓-octave importance functions due to differences in semantic context, although the crossovers differed by a constant 180 Hz. Methodological and theoretical aspects of parameter estimation in the AI model are discussed. The results suggest that semantic context, as defined by these conditions, may alter frequency-importance relationships in addition to the dynamic range over which intelligibility rises.
21

Wang, Xiaolei, Nengcheng Chen, Zeqiang Chen, Xunliang Yang, and Jizhen Li. "Earth observation metadata ontology model for spatiotemporal-spectral semantic-enhanced satellite observation discovery: a case study of soil moisture monitoring." GIScience & Remote Sensing 53, no. 1 (September 23, 2015): 22–44. http://dx.doi.org/10.1080/15481603.2015.1092490.

22

Ouyang, Song, and Yansheng Li. "Combining Deep Semantic Segmentation Network and Graph Convolutional Neural Network for Semantic Segmentation of Remote Sensing Imagery." Remote Sensing 13, no. 1 (December 31, 2020): 119. http://dx.doi.org/10.3390/rs13010119.

Abstract:
Although the deep semantic segmentation network (DSSN) has been widely used in remote sensing (RS) image semantic segmentation, it still does not fully consider the spatial relationship cues between objects when extracting deep visual features through convolutional filters and pooling layers. In fact, the spatial distributions of objects from different classes are strongly correlated. For example, buildings tend to be close to roads. In view of the strong appearance extraction ability of DSSN and the powerful topological relationship modeling capability of the graph convolutional neural network (GCN), a DSSN-GCN framework, which combines the advantages of DSSN and GCN, is proposed in this paper for RS image semantic segmentation. To improve the appearance extraction ability, this paper proposes a new DSSN called the attention residual U-shaped network (AttResUNet), which leverages residual blocks to encode feature maps and the attention module to refine the features. For the GCN, a graph is built in which the nodes are superpixels and the edge weights are calculated from the spectral and spatial information of the nodes. The AttResUNet is trained to extract the high-level features used to initialize the graph nodes. Then the GCN combines features and spatial relationships between nodes to conduct classification. It is worth noting that the use of spatial relationship knowledge boosts the performance and robustness of the classification module. In addition, because the GCN is modeled at the superpixel level, the boundaries of objects are restored to a certain extent and there is less pixel-level noise in the final classification result. Extensive experiments on two public datasets show that the DSSN-GCN model outperforms the competitive baseline (i.e., the DSSN model) and that DSSN-GCN with AttResUNet achieves the best performance, which demonstrates the advantage of our method.
23

Graf, Lukas, Heike Bach, and Dirk Tiede. "Semantic Segmentation of Sentinel-2 Imagery for Mapping Irrigation Center Pivots." Remote Sensing 12, no. 23 (December 1, 2020): 3937. http://dx.doi.org/10.3390/rs12233937.

Abstract:
Estimating the number and size of irrigation center pivot systems (CPS) from remotely sensed data, using artificial intelligence (AI), is a potential information source for assessing agricultural water use. In this study, we identified two technical challenges in the neural-network-based classification: Firstly, an effective reduction of the feature space of the remote sensing data to shorten training times and increase classification accuracy is required. Secondly, the geographical transferability of the AI algorithms is a pressing issue if AI is to replace human mapping efforts one day. Therefore, we trained the semantic image segmentation algorithm U-NET on four spectral channels (U-NET SPECS) and the first three principal components (U-NET principal component analysis (PCA)) of ESA/Copernicus Sentinel-2 images on a study area in Texas, USA, and assessed the geographic transferability of the trained models to two other sites: the Duero basin, in Spain, and South Africa. U-NET SPECS outperformed U-NET PCA at all three study areas, with the highest f1-score at Texas (0.87, U-NET PCA: 0.83), and a value of 0.68 (U-NET PCA: 0.43) in South Africa. At the Duero, both models showed poor classification accuracy (f1-score U-NET PCA: 0.08; U-NET SPECS: 0.16) and segmentation quality, which was particularly evident in the incomplete representation of the center pivot geometries. In South Africa and at the Duero site, high rates of false positives and false negatives were observed, which made the model less useful, especially at the Duero test site. Thus, geographical invariance is not an inherent model property and seems to be mainly driven by the complexity of the land-use pattern. We do not consider PCA a suitable spectral dimensionality reduction measure here. However, shorter training times and a more stable training process indicate promising prospects for reducing computational burdens. We therefore conclude that effective dimensionality reduction and geographic transferability are important prospects for further research towards the operational usage of deep learning algorithms, not only regarding the mapping of CPS.
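The f1-scores quoted in the abstract above combine precision and recall, which is why high rates of both false positives and false negatives pull the score down. A minimal sketch of the calculation on flattened binary masks follows; the masks are invented for illustration and are unrelated to the study's data.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives in two binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical flattened masks: 1 marks a pixel labelled as center pivot.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]

tp, fp, fn = confusion_counts(predicted, reference)
score = f1_score(tp, fp, fn)
print(tp, fp, fn, round(score, 3))
```

With two true positives, one false positive and one false negative, precision and recall are both 2/3, so the F1-score is also 2/3.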
24

Cao, W., X. H. Tong, S. C. Liu, and D. Wang. "LANDSLIDES EXTRACTION FROM DIVERSE REMOTE SENSING DATA SOURCES USING SEMANTIC REASONING SCHEME." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B8 (June 22, 2016): 25–31. http://dx.doi.org/10.5194/isprsarchives-xli-b8-25-2016.

Abstract:
Using high resolution satellite imagery to detect, analyse and extract landslides automatically provides increasingly strong support for rapid response after a disaster. This requires the formulation of procedures and knowledge that encapsulate the content of the disaster area in the images. The object-oriented approach has proved useful in solving this issue by partitioning land-cover parcels into objects and classifying them on the basis of expert rules. Since the landslide information present in the images is often complex, the extraction procedure based on the object-oriented approach should consider primarily the semantic aspects of the data. In this paper, we propose a scheme for recognizing landslides by using an object-oriented analysis technique and a semantic reasoning model on high spatial resolution optical imagery. Three case regions with different data sources are presented to evaluate its practicality. The procedure is designed as follows: first, the Gray Level Co-occurrence Matrix (GLCM) is used to extract texture features after the image explanation. Spectral features, shape features and thematic features are derived for semiautomatic landslide recognition. A semantic reasoning model is used afterwards to refine the classification results, by representing expert knowledge as first-order logic (FOL) rules. The experimental results are essentially consistent with the experts’ field interpretation, which demonstrates the feasibility and accuracy of the proposed approach. The results also show that the scheme generalizes well to diverse data sources.
25

Cao, W., X. H. Tong, S. C. Liu, and D. Wang. "LANDSLIDES EXTRACTION FROM DIVERSE REMOTE SENSING DATA SOURCES USING SEMANTIC REASONING SCHEME." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B8 (June 22, 2016): 25–31. http://dx.doi.org/10.5194/isprs-archives-xli-b8-25-2016.

26

Christovam, Luiz E., Milton H. Shimabukuro, Maria de Lourdes B. T. Galo, and Eija Honkavaara. "Pix2pix Conditional Generative Adversarial Network with MLP Loss Function for Cloud Removal in a Cropland Time Series." Remote Sensing 14, no. 1 (December 29, 2021): 144. http://dx.doi.org/10.3390/rs14010144.

Abstract:
Clouds are one of the major limitations to crop monitoring using optical satellite images. Despite all efforts to provide decision-makers with high-quality agricultural statistics, there is still a lack of techniques to optimally process satellite image time series in the presence of clouds. In this regard, this article proposes adding a Multi-Layer Perceptron loss function to the pix2pix conditional Generative Adversarial Network (cGAN) objective function. The aim was to force the generative model to learn to deliver synthetic pixels whose values are proxies for the spectral response, further improving crop type mapping. Furthermore, the generalization capacity of the generative models in producing pixels with plausible values for images not used in the training was evaluated. To assess the performance of the proposed approach, real images were compared with synthetic images generated with the proposed approach as well as with the original pix2pix cGAN. The comparative analysis was performed through visual analysis, pixel values analysis, semantic segmentation and similarity metrics. In general, the proposed approach provided slightly better synthetic pixels than the original pix2pix cGAN, removing more noise than the original pix2pix algorithm as well as providing better crop type semantic segmentation; the semantic segmentation of the synthetic image generated with the proposed approach achieved an F1-score of 44.2%, while the real image achieved 44.7%. Regarding the generalization, the models trained utilizing different regions of the same image provided better pixels than models trained using other images in the time series. Besides this, the experiments also showed that the models trained using a pair of images selected every three months along the time series also provided acceptable results on images that do not have cloud-free areas.
27

Cui, Binge, Haoqing Zhang, Wei Jing, Huifang Liu, and Jianming Cui. "SRSe-Net: Super-Resolution-Based Semantic Segmentation Network for Green Tide Extraction." Remote Sensing 14, no. 3 (February 2, 2022): 710. http://dx.doi.org/10.3390/rs14030710.

Abstract:
Due to the phenomenon of mixed pixels in low-resolution remote sensing images, the green tide spectral features with low Enteromorpha coverage are not obvious. Super-resolution technology based on deep learning can supplement more detailed information for subsequent semantic segmentation tasks. In this paper, a novel green tide extraction method for MODIS images based on super-resolution and a deep semantic segmentation network is proposed. Inspired by the idea of transfer learning, a super-resolution model (i.e., WDSR) is first pre-trained with high spatial resolution GF1-WFV images, and then the representations learned in the GF1-WFV image domain are transferred to the MODIS image domain. The improvement of remote sensing image resolution enables us to better distinguish the green tide patches from the surrounding seawater. As a result, a deep semantic segmentation network (SRSe-Net) suitable for large-scale green tide information extraction is proposed. SRSe-Net introduces the dense connection mechanism on the basis of U-Net and replaces the convolution operations with dense blocks, which effectively obtains detailed green tide boundary information by strengthening feature propagation and reuse. In addition, SRSe-Net reduces the pooling layers and adds a bridge module in the final stage of the encoder. The experimental results show that SRSe-Net can obtain more accurate segmentation results with fewer network parameters.
28

Mitri, G. H., and I. Z. Gitas. "A semi-automated object-oriented model for burned area mapping in the Mediterranean region using Landsat-TM imagery." International Journal of Wildland Fire 13, no. 3 (2004): 367. http://dx.doi.org/10.1071/wf03079.

Abstract:
Pixel-based classification methods that make use of the spectral information derived from satellite images have been repeatedly reported to create confusion between burned areas and non-vegetation categories, especially water bodies and shaded areas. As a result of the aforementioned, these methods cannot be used on an operational basis for mapping burned areas using satellite images. On the other hand, object-based image classification allows the integration of a broad spectrum of different object features, such as spectral values, shape and texture. Sophisticated classification, incorporating contextual and semantic information, can be performed by using not only image object attributes, but also the relationship between networked image objects. In this study, the synergy of all these features allowed us to address image analysis tasks that, up until now, have not been possible. The aim of this work was to develop an object-based classification model for burned area mapping in the Mediterranean using Landsat-TM imagery. The object-oriented model developed to map a burned area on the Greek island of Thasos was then used to map other burned areas in the Mediterranean region after the Landsat-TM images had been radiometrically, geometrically and topographically corrected. The results of the research showed that the developed object-oriented model was transferable and that it could be effectively used as an operative tool for identifying and mapping the three different burned areas (~98% overall accuracy).
29

Schuegraf, Philipp, and Ksenia Bittner. "Automatic Building Footprint Extraction from Multi-Resolution Remote Sensing Images Using a Hybrid FCN." ISPRS International Journal of Geo-Information 8, no. 4 (April 12, 2019): 191. http://dx.doi.org/10.3390/ijgi8040191.

Abstract:
Recent technical developments have made it possible to supply large-scale satellite image coverage. This poses the challenge of efficient discovery of imagery. One very important task in applications like urban planning and reconstruction is to automatically extract building footprints. The integration of different information, which is presently achievable due to the availability of high-resolution remote sensing data sources, makes it possible to improve the quality of the extracted building outlines. Recently, deep neural networks were extended from image-level to pixel-level labelling, allowing dense prediction of semantic labels. Based on these advances, we propose an end-to-end U-shaped neural network, which efficiently merges depth and spectral information within two parallel networks combined at the late stage for binary building mask generation. Moreover, as satellites usually provide high-resolution panchromatic images, but only low-resolution multi-spectral images, we tackle this issue by using a residual neural network block. It fuses those images with different spatial resolution at the early stage, before passing the fused information to the Unet stream, responsible for processing spectral information. In a parallel stream, a stereo digital surface model (DSM) is also processed by the Unet. Additionally, we demonstrate that our method generalizes to cities which are not included in the training data.
30

Ge, Zixian, Guo Cao, Hao Shi, Youqiang Zhang, Xuesong Li, and Peng Fu. "Compound Multiscale Weak Dense Network with Hybrid Attention for Hyperspectral Image Classification." Remote Sensing 13, no. 16 (August 20, 2021): 3305. http://dx.doi.org/10.3390/rs13163305.

Abstract:
Recently, hyperspectral image (HSI) classification has become a popular research direction in remote sensing. The emergence of convolutional neural networks (CNNs) has greatly promoted the development of this field and demonstrated excellent classification performance. However, due to the particularity of HSIs, redundant information and limited samples pose huge challenges for extracting strong discriminative features. In addition, addressing how to fully mine the internal correlation of the data or features based on the existing model is also crucial in improving classification performance. To overcome the above limitations, this work presents a strong feature extraction neural network with an attention mechanism. Firstly, the original HSI is weighted by means of the hybrid spectral–spatial attention mechanism. Then, the data are input into a spectral feature extraction branch and a spatial feature extraction branch, composed of multiscale feature extraction modules and weak dense feature extraction modules, to extract high-level semantic features. These two features are compressed and fused using the global average pooling and concat approaches. Finally, the classification results are obtained by using two fully connected layers and one Softmax layer. A performance comparison shows the enhanced classification performance of the proposed model compared to the current state of the art on three public datasets.
31

Sun, X. F., and X. G. Lin. "RANDOM-FOREST-ENSEMBLE-BASED CLASSIFICATION OF HIGH-RESOLUTION REMOTE SENSING IMAGES AND NDSM OVER URBAN AREAS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W7 (September 13, 2017): 887–92. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w7-887-2017.

Abstract:
As an intermediate step between raw remote sensing data and digital urban maps, remote sensing data classification has been a challenging and long-standing research problem in the community of remote sensing. In this work, an effective classification method is proposed for classifying high-resolution remote sensing data over urban areas. Starting from high resolution multi-spectral images and 3D geometry data, our method proceeds in three main stages: feature extraction, classification, and classified result refinement. First, we extract color, vegetation index and texture features from the multi-spectral image and compute the height, elevation texture and differential morphological profile (DMP) features from the 3D geometry data. Then in the classification stage, multiple random forest (RF) classifiers are trained separately, then combined to form a RF ensemble to estimate each sample’s category probabilities. Finally the probabilities along with the feature importance indicator outputted by RF ensemble are used to construct a fully connected conditional random field (FCCRF) graph model, by which the classification results are refined through mean-field based statistical inference. Experiments on the ISPRS Semantic Labeling Contest dataset show that our proposed 3-stage method achieves 86.9% overall accuracy on the test data.
32

Ballesteros, John R., German Sanchez-Torres, and John W. Branch-Bedoya. "HAGDAVS: Height-Augmented Geo-Located Dataset for Detection and Semantic Segmentation of Vehicles in Drone Aerial Orthomosaics." Data 7, no. 4 (April 14, 2022): 50. http://dx.doi.org/10.3390/data7040050.

Abstract:
Detection and semantic segmentation of vehicles in drone aerial orthomosaics has applications in a variety of fields such as security, traffic and parking management, urban planning, logistics, and transportation, among many others. This paper presents the HAGDAVS dataset, which fuses the RGB spectral channels and a Digital Surface Model (DSM) for the detection and segmentation of vehicles from aerial drone images, including three vehicle classes: cars, motorcycles, and ghosts (motorcycle or car). We supply the DSM as an additional variable to be included in deep learning and computer vision models to increase their accuracy. RGB orthomosaic, RG-DSM fusion, and multi-label mask are provided in Tag Image File Format. Geo-located vehicle bounding boxes are provided in GeoJSON vector format. We also describe the acquisition of drone data, the derived products, and the workflow to produce the dataset. Researchers would benefit from using the proposed dataset to improve results in the case of vehicle occlusion, geo-location, and the need for cleaning ghost vehicles. As far as we know, this is the first openly available dataset for vehicle detection and segmentation comprising RG-DSM drone data fusion and different color masks for motorcycles, cars, and ghosts.
33

Habib, Maria, Mohammad Faris, Raneem Qaddoura, Manal Alomari, Alaa Alomari, and Hossam Faris. "Toward an Automatic Quality Assessment of Voice-Based Telemedicine Consultations: A Deep Learning Approach." Sensors 21, no. 9 (May 10, 2021): 3279. http://dx.doi.org/10.3390/s21093279.

Abstract:
Maintaining a high quality of conversation between doctors and patients is essential in telehealth services, where efficient and competent communication is important to promote patient health. Assessing the quality of medical conversations is often handled based on a human auditory-perceptual evaluation. Typically, trained experts are needed for such tasks, as they follow systematic evaluation criteria. However, the daily rapid increase of consultations makes the evaluation process inefficient and impractical. This paper investigates the automation of the quality assessment process of patient–doctor voice-based conversations in a telehealth service using a deep-learning-based classification model. For this, the data consist of audio recordings obtained from Altibbi. Altibbi is a digital health platform that provides telemedicine and telehealth services in the Middle East and North Africa (MENA). The objective is to assist Altibbi’s operations team in the evaluation of the provided consultations in an automated manner. The proposed model is developed using three sets of features: features extracted from the signal level, the transcript level, and the signal and transcript levels. At the signal level, various statistical and spectral information is calculated to characterize the spectral envelope of the speech recordings. At the transcript level, a pre-trained embedding model is utilized to encompass the semantic and contextual features of the textual information. Additionally, the hybrid of the signal and transcript levels is explored and analyzed. The designed classification model relies on stacked layers of deep neural networks and convolutional neural networks. Evaluation results show that the model achieved a higher level of precision when compared with the manual evaluation approach followed by Altibbi’s operations team.
34

Aryal, Bibek, Stephen M. Escarzaga, Sergio A. Vargas Zesati, Miguel Velez-Reyes, Olac Fuentes, and Craig Tweedie. "Semi-Automated Semantic Segmentation of Arctic Shorelines Using Very High-Resolution Airborne Imagery, Spectral Indices and Weakly Supervised Machine Learning Approaches." Remote Sensing 13, no. 22 (November 14, 2021): 4572. http://dx.doi.org/10.3390/rs13224572.

Abstract:
Precise coastal shoreline mapping is essential for monitoring changes in erosion rates, surface hydrology, and ecosystem structure and function. Monitoring water bodies in the Arctic National Wildlife Refuge (ANWR) is of high importance, especially considering the potential for oil and natural gas exploration in the region. In this work, we propose a modified variant of the Deep Neural Network based U-Net Architecture for the automated mapping of 4 Band Orthorectified NOAA Airborne Imagery using sparsely labeled training data and compare it to the performance of traditional Machine Learning (ML) based approaches—namely, random forest, xgboost—and spectral water indices—Normalized Difference Water Index (NDWI), and Normalized Difference Surface Water Index (NDSWI)—to support shoreline mapping of Arctic coastlines. We conclude that it is possible to modify the U-Net model to accept sparse labels as input and the results are comparable to other ML methods (an Intersection-over-Union (IoU) of 94.86% using U-Net vs. an IoU of 95.05% using the best performing method).
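The abstract above compares a spectral water index baseline with learned models using Intersection-over-Union. The sketch below assumes the common green/near-infrared form of NDWI, (green - NIR) / (green + NIR), and a threshold of 0; the abstract does not state which band combination or threshold the authors used, and the reflectance values are invented.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel (green/NIR form, assumed here)."""
    denom = green + nir
    return 0.0 if denom == 0 else (green - nir) / denom


def water_mask(green_band, nir_band, threshold=0.0):
    """Label a pixel as water when its NDWI exceeds the (assumed) threshold."""
    return [ndwi(g, n) > threshold for g, n in zip(green_band, nir_band)]


def iou(pred, truth):
    """Intersection-over-Union of two binary masks; returns 1.0 when both are empty."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return intersection / union if union else 1.0


# Hypothetical reflectances for four pixels and a hand-labelled reference mask.
green = [0.30, 0.10, 0.25, 0.05]
nir = [0.05, 0.40, 0.10, 0.30]
reference = [True, False, False, False]

mask = water_mask(green, nir)
print(mask, iou(mask, reference))
```

In this toy case the index flags two pixels as water, only one of which matches the reference, so the IoU is 0.5.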
35

Wang, Jinxiao, Fang Chen, Meimei Zhang, and Bo Yu. "ACFNet: A Feature Fusion Network for Glacial Lake Extraction Based on Optical and Synthetic Aperture Radar Images." Remote Sensing 13, no. 24 (December 15, 2021): 5091. http://dx.doi.org/10.3390/rs13245091.

Abstract:
Glacial lake extraction is essential for studying the response of glacial lakes to climate change and assessing the risks of glacial lake outburst floods. Most methods for glacial lake extraction are based on either optical images or synthetic aperture radar (SAR) images. Although deep learning methods can extract features of optical and SAR images well, efficiently fusing two modality features for glacial lake extraction with high accuracy is challenging. In this study, to make full use of the spectral characteristics of optical images and the geometric characteristics of SAR images, we propose an atrous convolution fusion network (ACFNet) to extract glacial lakes based on Landsat 8 optical images and Sentinel-1 SAR images. ACFNet adequately fuses high-level features of optical and SAR data in different receptive fields using atrous convolution. Compared with four fusion models in which data fusion occurs at the input, encoder, decoder, and output stages, two classical semantic segmentation models (SegNet and DeepLabV3+), and a recently proposed model based on U-Net, our model achieves the best results with an intersection-over-union of 0.8278. The experiments show that fully extracting the characteristics of optical and SAR data and appropriately fusing them are vital steps in a network’s performance of glacial lake extraction.
36

Pandey, Piyush, Kitt G. Payn, Yuzhen Lu, Austin J. Heine, Trevor D. Walker, Juan J. Acosta, and Sierra Young. "Hyperspectral Imaging Combined with Machine Learning for the Detection of Fusiform Rust Disease Incidence in Loblolly Pine Seedlings." Remote Sensing 13, no. 18 (September 9, 2021): 3595. http://dx.doi.org/10.3390/rs13183595.

Abstract:
Loblolly pine is an economically important timber species in the United States, with almost 1 billion seedlings produced annually. The most significant disease affecting this species is fusiform rust, caused by Cronartium quercuum f. sp. fusiforme. Testing for disease resistance in the greenhouse involves artificial inoculation of seedlings followed by visual inspection for disease incidence. An automated, high-throughput phenotyping method could improve both the efficiency and accuracy of the disease screening process. This study investigates the use of hyperspectral imaging for the detection of diseased seedlings. A nursery trial comprising families with known in-field rust resistance data was conducted, and the seedlings were artificially inoculated with fungal spores. Hyperspectral images in the visible and near-infrared region (400–1000 nm) were collected six months after inoculation. The disease incidence was scored with traditional methods based on the presence or absence of visible stem galls. The seedlings were segmented from the background by thresholding normalized difference vegetation index (NDVI) images, and the delineation of individual seedlings was achieved through object detection using the Faster RCNN model. Plant parts were subsequently segmented using the DeepLabv3+ model. The trained DeepLabv3+ model for semantic segmentation achieved a pixel accuracy of 0.76 and a mean Intersection over Union (mIoU) of 0.62. Crown pixels were segmented using geometric features. Support vector machine discrimination models were built for classifying the plants into diseased and non-diseased classes based on spectral data, and balanced accuracy values were calculated for the comparison of model performance. Averaged spectra from the whole plant (balanced accuracy = 61%), the crown (61%), the top half of the stem (77%), and the bottom half of the stem (62%) were used. A classification model built using the spectral data from the top half of the stem was found to be the most accurate, and resulted in an area under the receiver operating characteristic curve (AUC) of 0.83.
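The background-removal step in the abstract above (thresholding NDVI images) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy reflectance values, and the 0.4 threshold are all assumptions chosen for illustration, since the abstract does not state the threshold used.

```python
def ndvi(nir, red, eps=1e-9):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red); eps avoids division by zero."""
    return (nir - red) / (nir + red + eps)


def vegetation_mask(nir_band, red_band, threshold=0.4):
    """Return a 2D boolean mask that is True where NDVI exceeds the threshold.

    The default threshold of 0.4 is a hypothetical value, not taken from the paper.
    """
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Toy 2x2 reflectance bands (hypothetical values):
# top row behaves like vegetation, bottom row like background.
nir_band = [[0.60, 0.55], [0.20, 0.25]]
red_band = [[0.08, 0.10], [0.18, 0.22]]
mask = vegetation_mask(nir_band, red_band)
```

In practice the threshold would be tuned on the imagery at hand, and the two bands would be picked from the hyperspectral cube near the red and near-infrared wavelengths.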
37

Lu, M., L. Groeneveld, D. Karssenberg, S. Ji, R. Jentink, E. Paree, and E. Addink. "GEOMORPHOLOGICAL MAPPING OF INTERTIDAL AREAS." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2021 (June 28, 2021): 75–80. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2021-75-2021.

Abstract:
Spatiotemporal geomorphological mapping of intertidal areas is essential for understanding system dynamics and provides information for ecological conservation and management. Mapping the geomorphology of intertidal areas is very challenging, mainly because spectral differences are often relatively small while transitions between geomorphological units are often gradual. Also, the intertidal areas are highly dynamic. A considerable challenge is to distinguish between different types of tidal flats, specifically, low and high dynamic shoal flats, sandy and silty low dynamic flats, and mega-ripple areas. In this study, we harness machine learning methods for automated geomorphological mapping and compare methods that use features calculated in classical Object-Based Image Analysis (OBIA) with end-to-end deep convolutional neural networks that derive features directly from imagery. We expect this study to yield an in-depth understanding of the features that contribute to tidal area classification and to greatly improve automation and prediction accuracy. We emphasise model interpretability and knowledge mining. By comparing and combining object-based and deep learning-based models, this study contributes to the development and integration of both methodology domains for semantic segmentation.
38

Decker, Kevin, and Brett Borghetti. "Composite Style Pixel and Point Convolution-Based Deep Fusion Neural Network Architecture for the Semantic Segmentation of Hyperspectral and Lidar Data." Remote Sensing 14, no. 9 (April 28, 2022): 2113. http://dx.doi.org/10.3390/rs14092113.

Abstract:
Multimodal hyperspectral and lidar data sets provide complementary spectral and structural data. Joint processing and exploitation to produce semantically labeled pixel maps through semantic segmentation has proven useful for a variety of decision tasks. In this work, we identify two areas of improvement over previous approaches and present a proof of concept network implementing these improvements. First, rather than using a late fusion style architecture as in prior work, our approach implements a composite style fusion architecture to allow for the simultaneous generation of multimodal features and the learning of fused features during encoding. Second, our approach processes the higher information content lidar 3D point cloud data with point-based CNN layers instead of the lower information content lidar 2D DSM used in prior work. Unlike previous approaches, the proof of concept network utilizes a combination of point and pixel-based CNN layers incorporating concatenation-based fusion necessitating a novel point-to-pixel feature discretization method. We characterize our models against a modified GRSS18 data set. Our fusion model achieved 6.6% higher pixel accuracy compared to the highest-performing unimodal model. Furthermore, it achieved 13.5% higher mean accuracy against the hardest to classify samples (14% of total) and equivalent accuracy on the other test set samples. Keywords: data fusion; multimodal; hyperspectral; lidar; remote sensing; neural network; point convolution.
39

Dongdong, Jiao, Arunkumar N., Zhang Wenyu, Li Beibei, Zhang Xinlei, and Zhu Guangjian. "Semantic clustering fuzzy c means spectral model based comparative analysis of cardiac color ultrasound and electrocardiogram in patients with left ventricular heart failure and cardiomyopathy." Future Generation Computer Systems 92 (March 2019): 324–28. http://dx.doi.org/10.1016/j.future.2018.10.019.

40

Li, Chen. "A Partial Differential Equation-Based Image Restoration Method in Environmental Art Design." Advances in Mathematical Physics 2021 (October 28, 2021): 1–11. http://dx.doi.org/10.1155/2021/4040497.

Abstract:
With the rapid development of networks and the emergence of various devices, images have become the main form of information transmission in real life. Image restoration, as an important branch of image processing, can be applied to real-life situations such as pixel loss in image transmission or networks prone to packet loss. However, existing image restoration algorithms have disadvantages such as blurry restoration results and slow speed. To solve such problems, this paper adopts a dual discriminator model based on generative adversarial networks, which effectively improves the restoration accuracy by adding local discriminators to track the information of local missing regions of images. However, the model does not perform well in generating reasonable semantic information, and for this reason, a partial differential equation-based image restoration model is proposed. A classifier and a feature extraction network are added to the dual discriminator model to provide category, style, and content loss constraints to the generative network, respectively. To address the training instability problem of the discriminator, spectral normalization is introduced into the discriminator design. Extensive experiments are conducted on a dataset of partial differential equations, and the results show that the partial differential equation-based image restoration model provides significant improvements in image restoration over previous methods and that image restoration techniques are exceptionally important in the application of environmental art design.
41

Dvořák, J., M. Potůčková, and V. Treml. "WEAKLY SUPERVISED LEARNING FOR TREELINE ECOTONE CLASSIFICATION BASED ON AERIAL ORTHOIMAGES AND AN ANCILLARY DSM." ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences V-3-2022 (May 17, 2022): 33–38. http://dx.doi.org/10.5194/isprs-annals-v-3-2022-33-2022.

Abstract:
Convolutional neural networks (CNNs) effectively classify standard datasets in remote sensing (RS). Yet, real-world data are more difficult to classify using CNNs because these networks require relatively large amounts of training data. To reduce training data requirements, two approaches can be followed – either pretraining models on larger datasets or augmenting the available training data. However, these commonly used strategies do not fully resolve the lack of training data for land cover classification in RS. Our goal is to classify trees and shrubs from aerial orthoimages in the treeline ecotone of the Krkonoše Mountains, Czechia. Instead of training a model on a smaller, human-labelled dataset, we semiautomatically created training data using an ancillary normalised Digital Surface Model (nDSM) and image spectral information. This approach can complement existing techniques, trading accuracy for a larger labelled dataset while assuming that the classifier can handle the training data noise. Weakly supervised learning on a CNN led to 68.99% mean Intersection over Union (IoU) and 81.65% mean F1-score for U-Net and 72.94% IoU and 84.35% mean F1-score for our modified U-Net on a test set comprising over 1000 manually labelled points. Notwithstanding the bias resulting from the noise in training data (especially in the least occurring tree class), our data show that standard semantic segmentation networks can be used for weakly supervised learning for local-scale land cover mapping.
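Several abstracts in this list report mean Intersection over Union (IoU) and mean F1-score. As a minimal sketch of how these two metrics are commonly derived from a confusion matrix (the helper names and the toy matrix below are hypothetical, and the sketch assumes every class occurs at least once so that no denominator is zero):

```python
def per_class_counts(confusion):
    """Yield (tp, fp, fn) for each class of a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = len(confusion)
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # true k, predicted as another class
        fp = sum(confusion[r][k] for r in range(n)) - tp  # predicted k, truly another class
        yield tp, fp, fn


def mean_iou_and_f1(confusion):
    """Return (mean IoU, mean F1) averaged over classes with equal weight."""
    ious, f1s = [], []
    for tp, fp, fn in per_class_counts(confusion):
        ious.append(tp / (tp + fp + fn))          # IoU = TP / (TP + FP + FN)
        f1s.append(2 * tp / (2 * tp + fp + fn))   # F1  = 2TP / (2TP + FP + FN)
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy two-class confusion matrix (hypothetical counts of labelled test points).
confusion = [[8, 2],
             [1, 9]]
miou, mf1 = mean_iou_and_f1(confusion)
```

For a given class, F1 is never lower than IoU, which is consistent with the reported mean F1-scores sitting above the mean IoU values.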
42

Sun, Le, Xiangbo Song, Huxiang Guo, Guangrui Zhao, and Jinwei Wang. "Patch-Wise Semantic Segmentation for Hyperspectral Images via a Cubic Capsule Network with EMAP Features." Remote Sensing 13, no. 17 (September 3, 2021): 3497. http://dx.doi.org/10.3390/rs13173497.

Abstract:
In order to overcome the disadvantages of convolution neural network (CNN) in the current hyperspectral image (HSI) classification/segmentation methods, such as the inability to recognize the rotation of spatial objects, the difficulty to capture the fine spatial features and the problem that principal component analysis (PCA) ignores some important information when it retains few components, in this paper, an HSI segmentation model based on extended multi-morphological attribute profile (EMAP) features and cubic capsule network (EMAP–Cubic-Caps) was proposed. EMAP features can effectively extract various attributes profile features of entities in HSI, and the cubic capsule neural network can effectively capture complex spatial features with more details. Firstly, EMAP algorithm is introduced to extract the morphological attribute profile features of the principal components extracted by PCA, and the EMAP feature map is used as the input of the network. Then, the spectral and spatial low-layer information of the HSI is extracted by a cubic convolution network, and the high-layer information of HSI is extracted by the capsule module, which consists of an initial capsule layer and a digital capsule layer. Through the experimental comparison on three well-known HSI datasets, the superiority of the proposed algorithm in semantic segmentation is validated.
43

Zhao, Wenbo, Qing Dong, and Zhengli Zuo. "A Method Combining Line Detection and Semantic Segmentation for Power Line Extraction from Unmanned Aerial Vehicle Images." Remote Sensing 14, no. 6 (March 11, 2022): 1367. http://dx.doi.org/10.3390/rs14061367.

Abstract:
Power line extraction is the basic task of power line inspection with unmanned aerial vehicle (UAV) images. However, due to the complex backgrounds and limited characteristics, power line extraction from images is a difficult problem. In this paper, we construct a power line data set using UAV images and classify the data according to the image clutter (IC). A method combining line detection and semantic segmentation is used. This method is divided into three steps: First, a multi-scale LSD is used to determine power line candidate regions. Then, based on the object-based Markov random field (OMRF), a weighted region adjacency graph (WRAG) is constructed using the distance and angle information of line segments to capture the complex interaction between objects, which is introduced into the Gibbs joint distribution of the label field. Meanwhile, the Gaussian mixture model is utilized to form the likelihood function by taking the spectral and texture features. Finally, a Kalman filter (KF) and the least-squares method are used to realize power line pixel tracking and fitting. Experiments are carried out on test images in the data set. Compared with common power line extraction methods, the proposed algorithm shows better performance on images with different IC. This study can provide help and guidance for power line inspection.
44

Wu, Jibing, Zhifei Wang, Yahui Wu, Lihua Liu, Su Deng, and Hongbin Huang. "A Tensor CP Decomposition Method for Clustering Heterogeneous Information Networks via Stochastic Gradient Descent Algorithms." Scientific Programming 2017 (2017): 1–13. http://dx.doi.org/10.1155/2017/2803091.

Abstract:
Clustering analysis is a basic and essential method for mining heterogeneous information networks, which consist of multiple types of objects and rich semantic relations among different object types. Heterogeneous information networks are ubiquitous in the real-world applications, such as bibliographic networks and social media networks. Unfortunately, most existing approaches, such as spectral clustering, are designed to analyze homogeneous information networks, which are composed of only one type of objects and links. Some recent studies focused on heterogeneous information networks and yielded some research fruits, such as RankClus and NetClus. However, they often assumed that the heterogeneous information networks usually follow some simple schemas, such as bityped network schema or star network schema. To overcome the above limitations, we model the heterogeneous information network as a tensor without the restriction of network schema. Then, a tensor CP decomposition method is adapted to formulate the clustering problem in heterogeneous information networks. Further, we develop two stochastic gradient descent algorithms, namely, SGDClus and SOSClus, which lead to effective clustering multityped objects simultaneously. The experimental results on both synthetic datasets and real-world dataset have demonstrated that our proposed clustering framework can model heterogeneous information networks efficiently and outperform state-of-the-art clustering methods.
45

Hasan, A., M. R. Udawalpola, C. Witharana, and A. K. Liljedahl. "COUNTING ICE-WEDGE POLYGONS FROM SPACE: USE OF COMMERCIAL SATELLITE IMAGERY TO MONITOR CHANGING ARCTIC POLYGONAL TUNDRA." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIV-M-3-2021 (August 10, 2021): 67–72. http://dx.doi.org/10.5194/isprs-archives-xliv-m-3-2021-67-2021.

Abstract:
The microtopography associated with ice wedge polygons (IWPs) governs the Arctic ecosystem from local to regional scales due to the impacts on the flow and storage of water and, therefore, vegetation and carbon. Increasing subsurface temperatures in Arctic permafrost landscapes cause differential ground settlements followed by a series of adverse microtopographic transitions at sub-decadal scale. The entire Arctic has been imaged at 0.5 m or finer resolution by commercial satellite sensors. Dramatic microtopographic transformation of low-centered into high-centered IWPs can be identified using sub-meter resolution commercial satellite imagery. In this exploratory study, we have employed a Deep Learning (DL)-based object detection and semantic segmentation method named Mask R-CNN to automatically map IWPs from commercial satellite imagery. Different tundra vegetation types have distinct spectral, spatial, and textural characteristics, which in turn decide the semantics of overlying IWPs. Landscape complexity translates to image complexity, affecting DL model performance. Scarcity of labelled training images, inadequate training samples for some types of tundra, and class imbalance stand as other key challenges in this study. We implemented image augmentation methods to introduce variety in the training data and trained models separately for tundra types. Augmentation methods show promising results, but the models with separate tundra types seem to suffer from the lack of annotated data.
46

Lu, Yimin, Wei Shao, and Jie Sun. "Extraction of Offshore Aquaculture Areas from Medium-Resolution Remote Sensing Images Based on Deep Learning." Remote Sensing 13, no. 19 (September 26, 2021): 3854. http://dx.doi.org/10.3390/rs13193854.

Abstract:
It is important for aquaculture monitoring, scientific planning, and management to extract offshore aquaculture areas from medium-resolution remote sensing images. However, in medium-resolution images, the spectral characteristics of offshore aquaculture areas are complex, and the offshore land and seawater seriously interfere with the extraction of offshore aquaculture areas. On the other hand, in medium-resolution images, due to the relatively low image resolution, the boundaries between breeding areas are relatively fuzzy and are more likely to ‘adhere’ to each other. An improved U-Net model, including, in particular, an atrous spatial pyramid pooling (ASPP) structure and an up-sampling structure, is proposed for offshore aquaculture area extraction in this paper. The improved ASPP structure and up-sampling structure can better mine semantic information and location information, overcome the interference of other information in the image, and reduce ‘adhesion’. Based on the northeast coast of Fujian Province Sentinel-2 Multispectral Scan Imaging (MSI) image data, the offshore aquaculture area extraction was studied. Based on the improved U-Net model, the F1 score and Mean Intersection over Union (MIoU) of the classification results were 83.75% and 73.75%, respectively. The results show that, compared with several common classification methods, the improved U-Net model has a better performance. This also shows that the improved U-Net model can significantly overcome the interference of irrelevant information, identify aquaculture areas, and significantly reduce edge adhesion of aquaculture areas.
47

Liu, Shengwei, Dailiang Peng, Bing Zhang, Zhengchao Chen, Le Yu, Junjie Chen, Yuhao Pan, et al. "The Accuracy of Winter Wheat Identification at Different Growth Stages Using Remote Sensing." Remote Sensing 14, no. 4 (February 13, 2022): 893. http://dx.doi.org/10.3390/rs14040893.

Abstract:
The aim of this study was to explore the differences in the accuracy of winter wheat identification using remote sensing data at different growth stages using the same methods. Part of northern Henan Province, China was taken as the study area, and the winter wheat growth cycle was divided into five periods (seeding-tillering, overwintering, reviving, jointing-heading, and flowering-maturing) based on monitoring data obtained from agrometeorological stations. With the help of the Google Earth Engine (GEE) platform, the separability between winter wheat and other land cover types was analyzed and compared using the Jeffries-Matusita (J-M) distance method. Spectral features, vegetation index, water index, building index, texture features, and terrain features were generated from Sentinel-2 remote sensing images at different growth periods, and then were used to establish a random forest classification and extraction model. A deep U-Net semantic segmentation model based on the red, green, blue, and near-infrared bands of Sentinel-2 imagery was also established. By combining models with field data, the identification of winter wheat was carried out and the difference between the accuracy of the identification in the five growth periods was analyzed. The experimental results show that, using the random forest classification method, the best separability between winter wheat and the other land cover types was achieved during the jointing-heading period: the overall identification accuracy for the winter wheat was then highest at 96.90% and the kappa coefficient was 0.96. Using the deep-learning classification method, it was also found that the semantic segmentation accuracy of winter wheat and the model performance were best during the jointing-heading period: a precision, recall, F1 score, accuracy, and IoU of 0.94, 0.93, 0.93, and 0.88, respectively, were achieved for this period. Based on municipal statistical data for winter wheat, the accuracy of the extraction of the winter wheat area using the two methods was 96.72% and 88.44%, respectively. Both methods show that the jointing-heading period is the best period for identifying winter wheat using remote sensing and that the identification made during this period is reliable. The results of this study provide a scientific basis for accurately obtaining the area planted with winter wheat and for further studies into winter wheat growth monitoring and yield estimation.
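The Jeffries-Matusita (J-M) distance mentioned in the abstract above measures class separability on a scale from 0 (identical) to 2 (fully separable). The sketch below is a simplified, single-feature version that assumes each class is Gaussian; the function names and sample statistics are hypothetical, and the study itself may have used a multivariate form.

```python
import math


def bhattacharyya_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussian classes."""
    mean_term = (mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def jeffries_matusita(mean_a, var_a, mean_b, var_b):
    """J-M distance: JM = 2 * (1 - exp(-B)), bounded between 0 and 2."""
    b = bhattacharyya_1d(mean_a, var_a, mean_b, var_b)
    return 2.0 * (1.0 - math.exp(-b))


# Hypothetical per-class statistics of one feature (for example, a vegetation index).
jm_separable = jeffries_matusita(0.8, 0.01, 0.2, 0.01)    # well-separated class means
jm_overlapping = jeffries_matusita(0.5, 0.01, 0.5, 0.01)  # identical distributions
```

Values close to 2 indicate that the two classes are easy to tell apart with the chosen feature, which is the sense in which the abstract compares separability across growth periods.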
48

Virnodkar, Shyamal S., Vinod K. Pachghare, Virupakshagouda C. Patil, and Sunil Kumar Jha. "DenseResUNet: An Architecture to Assess Water-Stressed Sugarcane Crops from Sentinel-2 Satellite Imagery." Traitement du Signal 38, no. 4 (August 31, 2021): 1131–39. http://dx.doi.org/10.18280/ts.380424.

Abstract:
Water stress is the single most significant abiotic stress affecting crop productivity globally. Hence, timely and accurate detection of water-stressed crops is necessary for high productivity. Agricultural crop production can be managed and enhanced by spatial and temporal evaluation of water-stressed crops through remotely sensed data. However, detecting water-stressed crops from remote sensing images is challenging: various factors affect spectral bands and vegetation indices (VIs) at the canopy and landscape scales, the water stress detection threshold is crop-specific, and there has yet to be substantial agreement on the use of VIs as a pre-visual signal of water stress. This research takes advantage of freely available remote sensing data and convolutional neural networks to perform semantic segmentation of water-stressed sugarcane crops. Here an architecture, ‘DenseResUNet’, is proposed for segmenting water-stressed sugarcane crops, based on an encoder-decoder approach. The novelty of the proposed approach lies in the replacement of the classical convolution operation in the UNet with the dense block. The layers of a dense block are residual modules with a dense connection. The proposed model achieved 61.91% mIoU and 80.53% accuracy on segmenting the water-stressed sugarcane fields. This study compares the proposed architecture with the UNet, ResUNet, and DenseUNet models, which achieved mIoU of 32.20%, 58.34%, and 53.15%, respectively. The results of this study reveal that the model has the potential to identify water-stressed crops from remotely sensed data through deep learning techniques.
49

Qiu, Xiaohua, Min Li, Lin Dong, Guangmang Deng, and Liqiong Zhang. "Dual-Band Maritime Imagery Ship Classification Based on Multilayer Convolutional Feature Fusion." Journal of Sensors 2020 (December 1, 2020): 1–16. http://dx.doi.org/10.1155/2020/8891018.

Abstract:
To address the problems of few annotated samples and low-quality fused features in visible and infrared dual-band maritime ship classification, this paper leverages hierarchical features of a deep convolutional neural network to propose a dual-band maritime ship classification method based on multilayer convolutional feature fusion. Firstly, the VGGNet model pretrained on the ImageNet dataset is fine-tuned to capture semantic information of the specific dual-band ship dataset. Secondly, the pretrained and fine-tuned VGGNet models are used to extract low-level, middle-level, and high-level convolutional features of each band image, and a number of improved recursive neural networks with random weights are exploited to reduce feature dimension and learn feature representation. Thirdly, to improve the quality of feature fusion, multilevel and multilayer convolutional features of dual-band images are concatenated to fuse hierarchical information and spectral information. Finally, the fused feature vector is fed into a linear support vector machine for dual-band maritime ship category recognition. Experimental results on the public dual-band maritime ship dataset show that multilayer convolutional feature fusion outperforms single-layer convolutional features by about 2% in mean per-class classification accuracy for single-band images, that dual-band images perform better than single-band images by about 2.3%, and that the proposed method achieves the best accuracy of 89.4%, which is higher than the state-of-the-art method by 1.2%.
50

Zhang, Xin, Ling Du, Shen Tan, Fangming Wu, Liang Zhu, Yuan Zeng, and Bingfang Wu. "Land Use and Land Cover Mapping Using RapidEye Imagery Based on a Novel Band Attention Deep Learning Method in the Three Gorges Reservoir Area." Remote Sensing 13, no. 6 (March 23, 2021): 1225. http://dx.doi.org/10.3390/rs13061225.

Abstract:
Land use/land cover (LULC) change has been recognized as one of the most important indicators to study ecological and environmental changes. Remote sensing provides an effective way to map and monitor LULC change in real time and for large areas. However, with the increasing spatial resolution of remote sensing imagery, traditional classification approaches cannot fully represent the spectral and spatial information from objects and thus have limitations in classification results, such as the “salt and pepper” effect. Nowadays, the deep semantic segmentation methods have shown great potential to solve this challenge. In this study, we developed an adaptive band attention (BA) deep learning model based on U-Net to classify the LULC in the Three Gorges Reservoir Area (TGRA) combining RapidEye imagery and topographic information. The BA module adaptively weighted input bands in convolution layers to address the different importance of the bands. By comparing the performance of our model with two typical traditional pixel-based methods including classification and regression tree (CART) and random forest (RF), we found a higher overall accuracy (OA) and a higher Intersection over Union (IoU) for all classification categories using our model. The OA and mean IoU of our model were 0.77 and 0.60, respectively, with the BA module and were 0.75 and 0.58, respectively, without the BA module. The OA and mean IoU of CART and RF were both below 0.51 and 0.30, respectively, although RF slightly outperformed CART. Our model also showed a reasonable classification accuracy in independent areas well outside the training area, which indicates the strong model generalizability in the spatial domain. This study demonstrates the novelty of our proposed model for large-scale LULC mapping using high-resolution remote sensing data, which well overcomes the limitations of traditional classification approaches and suggests the consideration of band weighting in convolution layers.