Paul, Subir. "Hyperspectral Remote Sensing for Land Cover Classification and Chlorophyll Content Estimation using Advanced Machine Learning Techniques." Thesis, 2020. https://etd.iisc.ac.in/handle/2005/4537.
Abstract:
In recent years, remote sensing data or images have shown great potential for continuous spatial and temporal monitoring of Earth surface features. In optical remote sensing, hyperspectral (HS) data contain abundant spectral information, which is advantageous for various applications. However, handling high-dimensional HS data is a very challenging task. Different techniques are proposed as part of this thesis to handle HS data in a computationally efficient manner and to achieve better performance for land cover classification and chlorophyll content prediction. Before turning to the HS data applications, multispectral (MS) data are also analyzed in this thesis for crop classification.
Multi-temporal MS data are used for crop classification. Landsat-8 Operational Land Imager (OLI) sensor data are considered as the MS data in this work. Surface reflectances and derived normalized difference indices (NDIs) of the multi-temporal MS bands are used jointly for crop classification. Different dimensionality reduction techniques, viz. feature selection (FS) (e.g. random forest (RF) and partial informational correlation (PIC) measure-based), linear feature extraction (FE) (e.g. principal component analysis (PCA) and independent component analysis) and nonlinear FE (e.g. kernel PCA and autoencoder), are applied to the multi-temporal surface reflectance and NDI datasets and evaluated to detect the most favorable features. Subsequently, the detected features are used in a promising nonparametric classifier, the support vector machine (SVM), for crop classification. It is found that all the evaluated FE techniques, applied to the multi-temporal datasets, resulted in better performance than the FS-based approaches. PCA, being a simple and efficient FE algorithm, is well suited to crop classification in terms of both computational complexity and classification performance. Multi-temporal images are shown to be more advantageous than single-date imagery for crop identification.
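As an illustration of the NDI features mentioned above, the following minimal sketch computes a normalized difference index for every pair of bands of one pixel and appends the indices to the reflectances. Whether the thesis uses all band pairs is an assumption made here for illustration; the function name and the reflectance values are hypothetical.

```python
from itertools import combinations


def normalized_difference_indices(reflectance):
    """Return {(i, j): NDI} for every band pair i < j of one pixel.

    NDI(i, j) = (r_i - r_j) / (r_i + r_j).
    Pairs whose reflectances sum to zero are skipped to avoid division by zero.
    """
    ndis = {}
    for i, j in combinations(range(len(reflectance)), 2):
        total = reflectance[i] + reflectance[j]
        if total != 0:
            ndis[(i, j)] = (reflectance[i] - reflectance[j]) / total
    return ndis


# Hypothetical surface reflectances of one pixel on one date (not thesis data).
pixel = [0.05, 0.08, 0.06, 0.40]
ndis = normalized_difference_indices(pixel)

# Feature vector for this date: reflectances followed by the NDIs.
features = pixel + [ndis[key] for key in sorted(ndis)]
```

For a multi-temporal dataset, the per-date feature vectors of a pixel would be concatenated before dimensionality reduction and classification.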
HS data comprise continuous spectral responses in hundreds of narrow spectral bands with very fine spectral resolution or bandwidth, which enables feature identification and classification with high accuracy. HS data offer a wealth of spectral bands compared to only 5-10 spectral bands of MS data. However, analyzing and interpreting such large amounts of data is a challenging task. Optimal spectral bands or features should be chosen or extracted to address the issue of redundancy and to capitalize on the advantages of HS data. FS and FE are two broad categories of dimensionality reduction techniques. In this thesis, one FS-based and one FE-based computationally efficient dimensionality reduction technique are proposed for land cover classification.
A PIC-based HS band selection approach is proposed as an FS-based dimensionality reduction technique for classification of land cover types. The PIC measure is more skillful than mutual information for estimation of non-parametric conditional dependency. In the proposed approach, HS narrow bands are selected in an innovative way utilizing the PIC. First, HS bands are divided into different spectral groups or segments using normalized mutual information (NMI), and then PIC is applied to each spectral group for optimal band selection. This approach is more efficient in terms of computational time and in generalizing the applicability of the selected spectral bands. Further, these optimal spectral bands are used in the SVM and RF classifiers for classification of land cover types and performance evaluation. The proposed FS-based dimensionality reduction approach is compared with different state-of-the-art techniques for land cover classification. The proposed methodology improved the classification performance compared to the existing techniques, and the improvements are shown to be statistically significant.
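The grouping step can be sketched as follows. This is not the thesis implementation: the equal-width binning, the normalization NMI = 2 I(X;Y) / (H(X) + H(Y)), the rule of starting a new group when the NMI of adjacent bands drops below a threshold, and all names and values are assumptions made for illustration. The subsequent PIC-based selection within each group is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(x, y):
    """Normalized mutual information, assumed here as 2*I(X;Y) / (H(X) + H(Y))."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 1.0  # both bands are constant
    hxy = entropy(list(zip(x, y)))
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def discretize(values, n_bins=4):
    """Equal-width binning of one band's pixel values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def segment_bands(bands, threshold=0.5, n_bins=4):
    """Group consecutive bands; a new group starts when adjacent NMI < threshold."""
    binned = [discretize(band, n_bins) for band in bands]
    groups = [[0]]
    for k in range(1, len(bands)):
        if nmi(binned[k - 1], binned[k]) < threshold:
            groups.append([k])
        else:
            groups[-1].append(k)
    return groups


# Hypothetical toy data: four bands observed at eight pixels.
bands = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 2, 4, 6, 8, 10, 12, 14],  # rescaled copy of band 0
    [0, 7, 0, 7, 0, 7, 0, 7],     # unrelated to bands 0 and 1
    [1, 9, 1, 9, 1, 9, 1, 9],     # rescaled copy of band 2
]
groups = segment_bands(bands)
```

In this toy case the first two bands share all their information and fall into one group, while the third band is independent of the second and opens a new group.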
In recent years, deep learning-based FE techniques have become very popular and have also proven effective in extracting apt features from high-dimensional data. However, these techniques are computationally expensive. A computationally efficient FE-based dimensionality reduction approach, the NMI-based segmented stacked auto-encoder (S-SAE), is proposed for extraction of spectral features from the HS data. These spectral features are subsequently utilized for creation of spatial features, and later both spectral and spatial features are used in the classifier models (i.e. SVM and RF) for land cover classification. The proposed HS image classification approach reduces the complexity and computational time compared to the available techniques. A spectral segmentation based on a non-parametric dependency measure (i.e. NMI) is proposed instead of a linear and parametric dependency measure, to account for both linear and nonlinear inter-band dependencies in the spectral segmentation of the HS bands. Extended morphological profiles (EMPs) are then created from the segmented spectral features to incorporate spatial information into the spectral-spatial classification approach. Two non-parametric classifiers, SVM with a Gaussian kernel and RF, are used for classification of the three most widely used HS datasets. The experiments performed with the proposed methodology provide encouraging results compared to numerous existing approaches.
HS data have proven more resourceful than MS data for object detection, classification and several other applications. However, the absence of any space-borne HS sensor and the high cost and limited availability of airborne sensor-based images limit the use of HS data. Transformation of readily available MS data into quasi-HS data can be a feasible solution to this issue. A deep learning-based regression algorithm, convolutional neural network regression (CNNR), is proposed as part of this thesis for MS (i.e. Landsat-7/8) to quasi-HS (i.e. quasi-Hyperion) data transformation. The CNNR model brings the advantages of nonlinear modelling and assimilation of spatial information to regression-based modelling. The proposed CNNR model is compared with the pseudo-HS image transformation algorithm (PHITA), stepwise linear regression (SLR), and support vector regression (SVR) models by evaluating the quality of the quasi-Hyperion data. Several statistical metrics are calculated to compare each band’s reflectance values, as well as the spectral reflectance curve of each pixel, of the quasi-Hyperion data with those of the original Hyperion data. The developed models and generated quasi-Hyperion data are also evaluated in application to crop classification. From the results of all the experiments, it is evident that the CNNR model is more efficient than PHITA, SLR, and SVR in creating the quasi-Hyperion data, and the transformed data prove resourceful for crop classification. The proposed CNNR model-based MS to quasi-HS data transformation approach can be used as a viable alternative for different applications in the absence of original HS images.
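The abstract does not name the statistical metrics used. As an assumption for illustration only, the sketch below shows two common choices for this kind of comparison: a per-band root-mean-square error (RMSE) across pixels and a per-pixel spectral angle between the reference and the predicted spectrum. Function names and sample values are hypothetical.

```python
import math


def band_rmse(reference, predicted):
    """RMSE between reference and predicted reflectances of one band over all pixels."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def spectral_angle(reference, predicted):
    """Angle in radians between the reference and predicted spectra of one pixel."""
    dot = sum(r * p for r, p in zip(reference, predicted))
    norm_r = math.sqrt(sum(r * r for r in reference))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_r == 0 or norm_p == 0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cosine = max(-1.0, min(1.0, dot / (norm_r * norm_p)))
    return math.acos(cosine)


# Hypothetical spectra of one pixel (not thesis data).
original = [0.10, 0.20, 0.30]
quasi = [0.12, 0.19, 0.33]
angle = spectral_angle(original, quasi)
```

A small angle indicates that the predicted spectral curve has nearly the same shape as the reference, independent of overall brightness, which complements the band-wise RMSE.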
HS data are investigated for estimation of chlorophyll content, which is one of the essential biochemical parameters for assessing the growth process of fruit trees. This study developed a model for estimation of canopy averaged chlorophyll content (CACC) of pear trees using the convolutional auto-encoder (CAE) features of HS data. The study also examined anomalies among the trees by applying multi-dimensional scaling (MDS) to the CAE features and detected the outlier trees prior to fitting nonlinear regression models. These outlier trees are excluded from further experiments, which helped in improving the prediction performance of CACC. Gaussian process regression (GPR) and support vector regression (SVR) techniques are investigated as nonlinear regression models and used for prediction of CACC. The CAE features are shown to provide better prediction of CACC than the direct use of HS bands or vegetation indices as predictors. Training of the regression models, excluding the outlier trees, improved the CACC prediction performance. It is evident from the experiments that GPR can predict the CACC with better accuracy than SVR. In addition, the reliability of the tree canopy masks, which are utilized for averaging the features’ values for a particular tree, is also evaluated.
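The canopy averaging mentioned above can be sketched as follows. The mask format (an integer tree id per pixel, with 0 marking background) and all names and values are assumptions made for illustration, not details taken from the thesis.

```python
def canopy_average(feature_image, tree_mask):
    """Return {tree_id: mean feature vector} over the pixels of each tree canopy.

    feature_image: rows of pixels, each pixel a list of feature values.
    tree_mask: rows of integer tree ids; 0 is treated as background (assumed).
    """
    sums = {}
    counts = {}
    for feature_row, mask_row in zip(feature_image, tree_mask):
        for pixel_features, tree_id in zip(feature_row, mask_row):
            if tree_id == 0:
                continue  # skip background pixels
            if tree_id not in sums:
                sums[tree_id] = [0.0] * len(pixel_features)
                counts[tree_id] = 0
            for k, value in enumerate(pixel_features):
                sums[tree_id][k] += value
            counts[tree_id] += 1
    return {t: [s / counts[t] for s in sums[t]] for t in sums}


# Hypothetical 2 x 2 image with two features per pixel (not thesis data).
feature_image = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
tree_mask = [
    [1, 1],
    [0, 2],
]
per_tree = canopy_average(feature_image, tree_mask)
```

Each resulting vector would serve as the predictor for one tree in a regression model; an inaccurate mask mixes background pixels into the average, which is why mask reliability matters.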