
Local-aware coupled network for hyperspectral image super-resolution

Article: 2233725 | Received 03 Apr 2023, Accepted 27 Jun 2023, Published online: 07 Jul 2023

ABSTRACT

Despite the unprecedented success of super-resolution (SR) development for natural images, achieving hyperspectral image (HSI) SR with rich spectral characteristics remains a challenging task. Typically, HSI SR is accomplished by fusing low-resolution HSI (LR HSI) with the corresponding high-resolution multispectral image (HR MSI). However, due to the significant spectral difference between MSI and HSI, it is difficult to retain the spatial characteristics of MSI during image fusion. In addition, the spectral response function (SRF) used for simulating MSI is often unknown or unavailable in hyperspectral remote sensing images, further complicating the problem. To address the above issues, a local-aware coupled network (LCNet) is proposed in this paper. In LCNet, the SRF and point spread function (PSF) are adaptively learned in the primary stage of the network to address the issue of unknown prior information. By coupling two reconstruction networks, LCNet effectively preserves both the texture details of MSI and the spectral characteristics of HSI. Furthermore, the spatial local-aware block selectively emphasizes the texture features of MSI. Experimental results on three publicly available HSIs demonstrate that the proposed LCNet is superior to state-of-the-art methods with respect to both stability and quality.

1. Introduction

The emergence of hyperspectral imagers marked the arrival of an era in which the spectrum of ground objects can be used to determine their material composition. In contrast to panchromatic images or multispectral images (MSIs), hyperspectral images (HSIs) can be considered as cubes containing information from tens to hundreds of consecutive spectral bands that provide detailed information on spectral dimensions (Rasti et al. 2020). Compared with ordinary RGB images or MSIs, HSIs display three prominent features: (1) high spectral resolution; (2) integration of imagery and spectrum; and (3) low spatial resolution. Due to these properties, HSIs are widely applied in research fields such as target detection (Zeng et al. 2022; Rao et al. 2022), classification (Quan et al. 2023; Zhu et al. 2021), environment monitoring (Hong et al. 2022; Moreira, Teixeira, and Galvão 2015), change detection (Donovan et al. 2021; Ou et al. 2022), and composition analysis of Martian rocks (Brown et al. 2010). However, hyperspectral imaging often requires a trade-off between spectral and spatial resolution due to meteorological conditions and hardware equipment (Jiang et al. 2020). Therefore, image super-resolution (SR) reconstruction methods have been proposed to improve the spatial resolution of HSI.

Compared with natural images, HSIs have unique properties that require different SR processing methods. SR methods for HSIs fall into two categories according to the need for auxiliary images: (1) single HSI SR technology and (2) HSI SR technology based on image fusion. Single HSI SR methods achieve SR using the image's own features without requiring corresponding auxiliary images (Arun et al. 2020; Mei et al. 2017). However, the technical development of single HSI SR is limited by the large number of spectral bands in HSI and extremely scarce datasets. Fusion-based methods with more accurate auxiliary information can greatly improve the spatial characteristics of target images; representative approaches include Bayesian methods (Akhtar, Shafait, and Mian 2015; Eismann and Hardie 2004), matrix factorization (Borsoi, Imbiriba, and Bermudez 2019; Yokoya, Yairi, and Iwasaki 2011), tensor methods (Dian and Li 2019; Xu et al. 2019), and deep learning (Palsson, Sveinsson, and Ulfarsson 2017; Zhang et al. 2020).

In recent years, many fusion-based methods have been proposed for HSI SR, but several problems remain to be solved: (1) The spectral response function (SRF) is often unknown or unavailable for hyperspectral remote sensing images. In most previous works, the SRF used for spectral downsampling is either ignored or treated as prior knowledge (Han, Zheng, and Chen 2019; Qu, Qi, and Kwan 2018). However, acquiring the SRF in remote sensing is still challenging. (2) The feature extraction of MSI is inefficient, and its spatial information cannot be effectively retained. Extracting detailed texture features from the high-resolution MSI (HR MSI) and retaining them in each band of the HSI is challenging because the spectral dimension of HSI is significantly higher than that of MSI. In addition, spectral distortions are easily introduced in the subsequent HR HSI results. This research aims to combine the characteristics of HSI and MSI and use image fusion to achieve the SR task of HSI.

To solve the above-mentioned issues, a local-aware coupled network (LCNet) is proposed to facilitate high-quality HSI SR restoration. In the primary stage of the network, the SRF and point spread function (PSF) are adaptively learned to generate simulated low-resolution HSI (LR HSI) and HR MSI. Two unsupervised encoder-decoder architectures are used to reconstruct these two images, and the target SR HSI is output in the MSI reconstruction network. Furthermore, the spatial local-aware block is introduced into the coding stage of the MSI reconstruction network to extract more discriminative feature representations.

Overall, the main contributions of this paper are as follows:

  1. A local-aware coupled network (LCNet) is proposed to implement the SR reconstruction of hyperspectral remote sensing images. The framework consists of a data generation module and two encoder-decoder based image reconstruction networks. In the data generation module, PSF and SRF can be learned adaptively and used to calculate LR HSI and HR MSI for image fusion. The two reconstruction networks are coupled by sharing the same decoder weight to extract spatial features and spectral information from the multimodalities.

  2. To further exploit the detailed spectral-spatial information of HR MSI, a spatial local-aware block is designed to further extract the feature representation of key regions. Due to the large spectral differences between MSI and HSI, it is difficult to effectively preserve the spatial characteristics of MSI. The spatial local-aware block is introduced to the MSI reconstruction network to selectively emphasize spatial correlation features and provide more effective texture features for the final reconstructed image.

2. Related works

According to the presence or absence of auxiliary information in SR processing, HSI SR reconstruction methods are divided into two classes: (1) single HSI SR techniques; (2) fusion-based techniques. In this section, the two different HSI SR methods mentioned above will be introduced.

2.1. Single HSI SR

Single HSI SR methods include early approaches based on subpixel mapping (Xu et al. 2012; Zhang et al. 2012) and methods using general image processing technology. Among the latter, single HSI SR methods based on sparsity and dictionary learning or low-rank approximation have been proposed to utilize the rich spectral correlation between continuous spectral bands (He et al. 2016; Huang, Yu, and Sun 2014; Irmak, Akar, and Yukse 2018; Wang et al. 2017). However, these handcrafted priors can only reflect the linear features of HSIs and fail to capture the more complex relationships and patterns in HSIs. Since deep learning has shown good performance in the SR processing of RGB images (Chen et al. 2022; Zhang et al. 2018; Zhu et al. 2021), deep learning methods have also been used for single HSI SR (Liu, Li, and Yuan 2021; Yuan, Zheng, and Lu 2017). Because the 2D convolutional layer mainly considers spatial information, spectral distortion can easily occur when such networks are applied directly to HSI SR band-by-band. Therefore, some network structures based on 3D convolutional neural networks have been proposed (Arun et al. 2020; Li, Wang, and Li 2021; Mei et al. 2017). For single HSI SR, very little training data is available, and the number of HSI bands obtained by different imaging devices varies. In addition, the spectral dimension of HSI data is very high and difficult to process. All of these factors make model training difficult for single HSI SR.

2.2. Fusion-based HSI SR

Methods based on multi-image fusion usually register and merge the LR HSI and the corresponding HR MSI to obtain the SR HSI. Within this family, HSI SR methods are divided into five types: sharpening extension, Bayesian, matrix factorization, tensor factorization, and deep learning.

Among the sharpening extension methods, Gomez, Jazaeri, and Kafatos (2001) applied wavelet technology to fuse HSI and MSI. Aiazzi, Baronti, and Selva (2007) used multiple regression to improve the performance of fusion based on component substitution. However, such approaches cannot deal well with the lack of overlap between the spectral range of the multispectral data and that of a large number of HSI bands, which leads to more severe spectral distortions. Bayesian representation methods have been proposed to fuse HSI and panchromatic images to enhance the spatial representation of HSI (Akhtar, Shafait, and Mian 2015; Eismann and Hardie 2004). However, these methods require strong assumption-based prior knowledge, which limits their ability to flexibly adapt to different HSI structures.

In the methods based on matrix factorization, a coupled nonnegative matrix factorization (CNMF) unmixing method was proposed by Yokoya, Yairi, and Iwasaki (2011) to generate fusion data. Since sparse learning has been proven to improve the results of image restoration, some researchers have introduced it to improve the quality of fused images (Chen et al. 2021; Dong et al. 2016). Subsequently, low-rank attributes (Yi, Zhao, and Chan 2018), clustering models (Zhang et al. 2018), and spectral variable features (Borsoi, Imbiriba, and Bermudez 2019) have been considered. Almost all matrix factorization-based methods require the 3D data structure to be unfolded into a matrix, thus limiting the exploitation of the spatial and spectral correlation in the data.

The tensor representation shows good performance in maintaining data structure; therefore, HSI processing using tensor-based approaches does not destroy the inherent structure of hyperspectral data. So far, several tensor decomposition-based frameworks have been proposed to implement HSI SR (He et al. 2022; Li et al. 2018; Wan et al. 2020; Xu et al. 2019). Compared with matrix factorization, tensor decomposition-based approaches maintain a better data structure, but incur a much larger computational cost.

With the rise of deep learning, methods based on deep learning have been shown to outperform traditional signal processing methods in the fields of speech and pattern recognition (Zhu et al. 2018, 2021). Therefore, some scholars use deep learning methods to represent and fuse image data to achieve SR reconstruction of HSIs. Qu, Qi, and Kwan (2018) first tried to solve the HSI SR problem using an unsupervised encoder-decoder architecture. To solve the unknown PSF and SRF problems, an iterative framework for blind fusion (Wang et al. 2019) and other fusion networks (Fu et al. 2019; Zheng et al. 2020) were proposed. Han, Zheng, and Chen (2019) proposed a deep convolutional neural network (DCNN) to solve the problem of large resolution differences in the spatial domains of RGB images and HSIs. Zhou, Rangarajan, and Gader (2019) and Qu et al. (2021) combined image registration and fusion, enhancing the practical significance of using fusion for HSI SR. Inspired by model-specific properties, Liu et al. (2022) designed a model-inspired deep network for HSI SR. Li et al. (2022) proposed a fusion network of HSI, MSI, and panchromatic images (PAN) to further improve the spatial resolution of HSI. Compared with model-based methods, learning-based methods require a large amount of time and data, which limits their application in specific scenarios.

In summary, using accurate auxiliary information from MSI, fusion-based HSI SR can greatly improve the spatial characteristics of SR images. The fusion of HSI and MSI has been explored by many scholars, but some issues remain to be addressed. Various forms of prior knowledge, such as the SRF, are built into the frameworks proposed in previous studies, but such priors are often unknown or unavailable in the field of remote sensing. In addition, the number of bands in HSI is much larger than that of MSI, so the texture details of MSI cannot be effectively preserved.

3. Proposed method

To effectively solve the above problems, the fusion framework LCNet for HSI SR is proposed. The overall flowchart of LCNet is shown in Figure 1. First, the estimated PSF and SRF are used to calculate the LR HSI and HR MSI, respectively. Subsequently, two encoder-decoder based image reconstruction networks are used for the reconstruction of the MSI and HSI. These two networks share the same decoder weights and can extract spectral and spatial information in an unsupervised manner. In addition, a spatial local-aware block is introduced into the coding stage of the HR MSI reconstruction network to enhance the feature extraction capability of key regions.

Figure 1. Overview of the proposed local-aware coupling network (LCNet). The proposed framework includes a simulated data generation module for input data and SRF calculation, an LR HSI reconstruction network designed for learning spectral feature recovery, and an HR MSI reconstruction network with spatial local-aware block for capturing key region features. The output target image is generated by the HR MSI reconstruction network.


3.1. Problem formulation

HR HSI is a data cube that can be expressed as $X \in \mathbb{R}^{M \times N \times L}$, where $M$ and $N$ represent width and height, respectively, and $L$ represents the number of spectral bands. Similarly, HR MSI is represented as $Y \in \mathbb{R}^{M \times N \times l}$, where $l$ denotes the number of spectral bands of the MSI. LR HSI is represented as $Z \in \mathbb{R}^{m \times n \times L}$, where $m$ and $n$ represent the width and height of the LR HSI. Unfolding the 3-D image data into 2-D matrices, the above three data cubes are represented as $X \in \mathbb{R}^{MN \times L}$, $Y \in \mathbb{R}^{MN \times l}$, and $Z \in \mathbb{R}^{mn \times L}$, respectively. Based on linear unmixing theory, the HR HSI is expressed as:

(1) $X = AE$

where $A \in \mathbb{R}^{MN \times p}$ and $E \in \mathbb{R}^{p \times L}$ are composed of abundances and endmembers, respectively, and $p$ is the number of spectral basis vectors.

The LR HSI and HR MSI are obtained by downsampling the HR HSI in the spatial domain ($Z$) and the spectral domain ($Y$), respectively, which can be modeled as follows:

(2) $Z = A_h E = SX = SAE$

(3) $Y = XR = AER$

where $A_h \in \mathbb{R}^{mn \times p}$ indicates the abundances of the LR HSI, and $S \in \mathbb{R}^{mn \times MN}$ and $R \in \mathbb{R}^{L \times l}$ denote the PSF used to describe the spatial resolution degradation and the SRF used to describe the spectral degradation, respectively. The HSI SR process can be expressed mathematically as $P(X|Y,Z)$, which represents the probability distribution of generating $X$ when the HR MSI $Y$ and the LR HSI $Z$ are known.
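To make the notation concrete, the following NumPy sketch instantiates Eqs. (1) to (3) with small, arbitrary shapes; the block-averaging PSF matrix and the random SRF are illustrative assumptions, not the operators learned by LCNet.

```python
import numpy as np

M, N, m, n = 64, 64, 8, 8
L, l, p = 100, 4, 10
scale = (M // m) * (N // n)                 # HR pixels averaged per LR pixel

A = np.random.rand(M * N, p)                # abundances of the HR HSI
E = np.random.rand(p, L)                    # endmember spectra
X = A @ E                                   # Eq. (1): HR HSI, shape (MN, L)

# Block-averaging stand-in for the PSF matrix S of shape (mn, MN); a real PSF
# would weight neighbouring pixels, and this form assumes block-wise pixel order.
S = np.kron(np.eye(m * n), np.ones((1, scale))) / scale
R = np.random.rand(L, l)                    # placeholder SRF of shape (L, l)

Z = S @ X                                   # Eq. (2): LR HSI = S A E
Y = X @ R                                   # Eq. (3): HR MSI = A E R
print(Z.shape, Y.shape)                     # (64, 100) (4096, 4)
```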

3.2. Simulation data generating

This study adopts simulation experiments for testing: the LR HSI and HR MSI are obtained by downsampling the original HR HSI in the spatial domain and the spectral domain, respectively.

Our framework can adaptively estimate the SRF and PSF from the HR HSI. For SRF estimation, the spectral response corresponding to the visible-light bands of the IKONOS SRF is used, and B-spline interpolation is then performed on it to obtain the same number of bands as the input hyperspectral data. From this, an SRF of size $l \times L$ is obtained, where $l$ and $L$ denote the number of bands in the MSI and HSI, respectively. The PSF generally means that each pixel in the LR HSI consists of adjacent pixels with unknown weights in the original HR HSI. To simulate this process, a channel-wise convolution is used whose kernel size and stride equal the scaling factor.
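A minimal sketch of these two simulation steps is given below, assuming a coarsely tabulated response (`srf_tab` is a random placeholder for the IKONOS visible-band values) and an averaging kernel in place of the learnable PSF weights.

```python
import numpy as np
import torch
import torch.nn.functional as F
from scipy.interpolate import make_interp_spline

L, l, ratio = 100, 4, 8

# SRF: interpolate a coarsely tabulated 4-band response to the L HSI bands.
coarse_grid = np.linspace(0.0, 1.0, 9)               # normalized wavelength axis
srf_tab = np.random.rand(9, l)                       # placeholder IKONOS responses
srf = make_interp_spline(coarse_grid, srf_tab, k=3)(np.linspace(0.0, 1.0, L))
print(srf.shape)                                     # (100, 4), i.e. L x l

# PSF: channel-wise convolution, kernel size and stride equal to the scale factor.
# An averaging kernel is shown; in LCNet the kernel weights are learned.
hr_hsi = torch.rand(1, L, 128, 128)
kernel = torch.ones(L, 1, ratio, ratio) / ratio**2
lr_hsi = F.conv2d(hr_hsi, kernel, stride=ratio, groups=L)
print(lr_hsi.shape)                                  # torch.Size([1, 100, 16, 16])
```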

3.3. Image reconstruction

3.3.1. LR HSI restoration

As shown in Figure 1, the two networks reconstruct the HSI and MSI in a coupled manner. The encoder-decoder structure forms the LR HSI reconstruction network. The encoder maps the input image to a low-dimensional representation and extracts the spatial features of the HSI. By imposing fixed constraints, the feature representation $A_h$ is encouraged to obey a Dirichlet distribution, which can effectively model the sparsity of and distribution relationship among multiple random variables. The decoder then reconstructs the data from the feature representation $A_h$, which effectively restores the spectral characteristics of the HSI. Both the encoder and decoder in the LR HSI reconstruction network are composed of multiple fully connected layers.

The Dirichlet distribution of $A_h$ is accomplished by the stick-breaking process (Sethuraman 1994). The stick-breaking process iteratively breaks a unit-length stick into pieces of different lengths to generate a Dirichlet distribution, where each break fraction obeys a Beta distribution. This yields a random vector that sums to one and is sparse, which can be used to represent the distribution relationship of multiple random variables. In addition, to enhance the sparsity of the feature representation, an entropy function is used in the encoder. The entropy function was earlier used to address signal problems in the field of compressed sensing. It is defined as:

(4) $H_p(A) = -\sum_{j=1}^{N} \frac{|A_j|^p}{\lVert A \rVert_p^p} \log \frac{|A_j|^p}{\lVert A \rVert_p^p}$
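For illustration, a small NumPy version of Eq. (4) is given below; the exponent $p$ and the sign convention follow the reconstruction above and should be read as assumptions.

```python
import numpy as np

def entropy_function(a, p=1, eps=1e-12):
    """H_p of one feature vector: small for sparse (nearly one-hot) codes."""
    w = np.abs(a) ** p
    w = w / (w.sum() + eps)                 # |A_j|^p / ||A||_p^p
    return -(w * np.log(w + eps)).sum()

print(entropy_function(np.array([0.97, 0.01, 0.01, 0.01])))  # low entropy
print(entropy_function(np.array([0.25, 0.25, 0.25, 0.25])))  # high entropy, log(4)
```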

To encourage spatial and spectral feature similarity between $Z$ and the reconstructed LR HSI $\hat{Z}$, the reconstruction loss of the LR HSI network is defined as:

(5) $\mathcal{L}(\theta_{he}, \theta_{hd}) = \frac{1}{2} \lVert Z - \hat{Z}(\theta_{he}, \theta_{hd}) \rVert_F^2 + \lambda H(A_h(\theta_{he})) + \mu \lVert \theta_{hd} \rVert_F^2$

where $\lambda$ and $\mu$ are weight parameters, and $\theta_{he}$ and $\theta_{hd}$ are the encoder weights and decoder weights of the LR HSI reconstruction network, respectively.
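A hedged PyTorch sketch of Eq. (5) follows; `encoder` and `decoder` stand for the fully connected sub-networks described above, and the values of the weights $\lambda$ and $\mu$ are placeholders.

```python
import torch

def entropy_term(a, p=1, eps=1e-12):
    # Eq. (4) applied to a batch of codes, normalized per sample.
    w = a.abs().pow(p)
    w = w / (w.sum(dim=-1, keepdim=True) + eps)
    return -(w * (w + eps).log()).sum()

def lr_hsi_loss(Z, encoder, decoder, lam=1e-3, mu=1e-4):
    Ah = encoder(Z)                            # abundance-like representation
    Z_hat = decoder(Ah)                        # reconstructed LR HSI
    recon = 0.5 * (Z - Z_hat).pow(2).sum()     # squared Frobenius-norm data term
    sparsity = lam * entropy_term(Ah)          # entropy sparsity penalty
    decay = mu * sum(w.pow(2).sum() for w in decoder.parameters())
    return recon + sparsity + decay
```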

3.3.2. HR MSI recovery with spatial local-aware block

The structure of the HR MSI reconstruction network also includes an encoder for extracting spatial information and a decoder for recovering spectral characteristics. The decoder weights are shared with the LR HSI reconstruction network, allowing the spectral characteristics of the HSI to be learned in order to generate the target image $\hat{X}$. Then, $\hat{X}$ is spectrally downsampled using the SRF to obtain the reconstructed HR MSI $\hat{Y}$. The reconstruction loss of the HR MSI network is defined as:

(6) $\mathcal{L}(\theta_{me}) = \frac{1}{2} \lVert Y - \hat{Y}(\theta_{me}, \theta_{hd}) \rVert_F^2 + \lambda H(A_m(\theta_{me}))$

where $Y$ is the original HR MSI, $\lambda$ is the weight parameter, and $\theta_{me}$ represents the encoder weights of the HR MSI reconstruction network.
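The coupling itself can be summarized in a few lines of PyTorch: the two branches keep separate encoders but literally reuse one decoder module. The layer sizes below are illustrative assumptions, and the softmax is a simplified stand-in for the stick-breaking construction of Section 3.3.1.

```python
import torch
import torch.nn as nn

p, L, l = 10, 100, 4
decoder = nn.Sequential(nn.Linear(p, 64), nn.ReLU(), nn.Linear(64, L))  # shared
enc_hsi = nn.Sequential(nn.Linear(L, 64), nn.ReLU(),
                        nn.Linear(64, p), nn.Softmax(dim=-1))
enc_msi = nn.Sequential(nn.Linear(l, 64), nn.ReLU(),
                        nn.Linear(64, p), nn.Softmax(dim=-1))

z_hat = decoder(enc_hsi(torch.rand(8, L)))  # LR HSI branch reconstruction
x_hat = decoder(enc_msi(torch.rand(8, l)))  # MSI branch: target HR HSI spectra
print(z_hat.shape, x_hat.shape)             # both torch.Size([8, 100])
```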

Due to the significantly larger number of spectral bands in HSI compared to MSI, it is challenging to map spatial information to images with so many bands. In addition, some important texture features of the multispectral image are inevitably lost because $L \gg p > l$, as shown in Figure 2.

Figure 2. The spatial feature loss challenge of fusion-based HSI SR. The number of bands L in HSI is much larger than the number of basis vectors p, as well as the number of bands l in MSI. When mapping spatial features to more bands during image fusion, it is easy to cause loss of information.


To better extract and preserve the texture and structural features of the HR MSI, the spatial local-aware block is introduced into the encoder of the MSI reconstruction network. This block selectively focuses on important local areas and takes full advantage of detailed features. The spatial attention map is generated based on the spatial relationships between features and selectively emphasizes the informative spatial features of the image. Figure 3 shows a simple framework of the spatial local-aware block.

Figure 3. Spatial local-aware block. After feature extraction through a densely connected structure, the MSI is fed into the module. After pooling, convolution, and activation, a spatial attention map is obtained, which aims to focus on the texture structures of key regions.


Based on the input features, average pooling and maximum pooling are performed to generate two feature maps representing different information. Performing feature extraction with a 7×7 convolutional kernel, which has a larger receptive field, enables the capture of features over a wider spatial range of the input feature map, including texture and shape. Subsequently, the weight map generated by the sigmoid activation function assigns attention to different spatial locations. Overlaying this weight map onto the original input feature map produces a new feature map in which the features of the key regions are enhanced. The spatial local-aware enhancement can be expressed as:

(7) $M_s(I) = \sigma\!\left(f^{7 \times 7}([\mathrm{AvgPool}(I); \mathrm{MaxPool}(I)])\right)$

(8) $M_s(I) = \sigma\!\left(f^{7 \times 7}([I^{s}_{avg}; I^{s}_{max}])\right)$

where $M_s(I)$ represents the generated spatial attention map, $\sigma$ represents the sigmoid function, $f^{7 \times 7}$ denotes a convolution with a kernel of size $7 \times 7$, and $[I^{s}_{avg}; I^{s}_{max}]$ is the two-channel feature map generated by channel-wise average pooling and maximum pooling.
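A compact PyTorch sketch of the block described by Eqs. (7) and (8) is given below; it mirrors CBAM-style spatial attention, which the block resembles, and the channel count in the usage example is an arbitrary assumption.

```python
import torch
import torch.nn as nn

class SpatialLocalAware(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                    # x: (B, C, H, W) feature map
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average pooling
        mx, _ = x.max(dim=1, keepdim=True)   # channel-wise max pooling
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # Eq. (7)
        return x * attn                      # emphasize key spatial regions

feat = torch.rand(1, 32, 64, 64)
print(SpatialLocalAware()(feat).shape)       # torch.Size([1, 32, 64, 64])
```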

3.4. Spectral angle similarity

The number of basis vectors $p$ in the endmember matrix $E \in \mathbb{R}^{p \times L}$ is much larger than the number of bands $l$ of the HR MSI. Therefore, the estimated HR HSI is prone to spectral distortion when feature fusion is performed. Inspired by Qu, Qi, and Kwan (2018), our network encourages the feature representations $A_h$ and $A_m$ of the two reconstruction networks to follow similar patterns to mitigate the spectral distortion. In the proposed network, the decoder weights are shared between the LR HSI reconstruction network and the HR MSI reconstruction network, so their feature representations should have similar angles, in a manner similar to the spectral angle mapper (SAM).

Since $A_h \in \mathbb{R}^{mn \times p}$ and $A_m \in \mathbb{R}^{MN \times p}$ have different spatial dimensions, spatial enhancement of $A_h$ is required before the calculation. The spatial information of each pixel in $A_h$ is enhanced by copying its value to its nearest neighborhood. The enhanced feature representation $\tilde{A}_h \in \mathbb{R}^{MN \times p}$ then has the same spatial dimension as $A_m \in \mathbb{R}^{MN \times p}$. In this network, the spectral angle difference of the two feature representations is defined as:

(9) $A(\tilde{A}_h, A_m) = \frac{1}{MN} \sum_{i=1}^{MN} \arccos\!\left(\frac{\tilde{A}_h^{\,i} \cdot A_m^{\,i}}{\lVert \tilde{A}_h^{\,i} \rVert_2 \, \lVert A_m^{\,i} \rVert_2}\right)$
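The following PyTorch sketch combines the nearest-neighbour enhancement and Eq. (9); the shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def angle_difference(Ah, Am, ratio=8):
    # Ah: (1, p, m, n) code of the HSI branch; Am: (1, p, M, N) code of the MSI branch
    Ah_up = F.interpolate(Ah, scale_factor=ratio, mode="nearest")  # copy to neighbours
    cos = F.cosine_similarity(Ah_up, Am, dim=1).clamp(-1 + 1e-7, 1 - 1e-7)
    return torch.acos(cos).mean()           # Eq. (9): mean per-pixel spectral angle

Ah = torch.rand(1, 10, 16, 16)
Am = torch.rand(1, 10, 128, 128)
print(angle_difference(Ah, Am))
```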

The pseudocode of the proposed LCNet is shown in Algorithm 1.

4. Experimental analysis

4.1. Datasets

The proposed LCNet was evaluated on three publicly available HSIs: Indian Pines, Pavia University, and Washington DC. The Indian Pines dataset was acquired by the AVIRIS sensor in 1992 over the Indian Pines test site in Indiana, USA. It has a spatial resolution of 20 m, 145×145 pixels, and 220 bands in the 400–2500 nm wavelength range. A 144×144 pixel area with 200 bands was selected as the experimental image after excluding 20 water-absorption bands. The Pavia University dataset was acquired by the ROSIS-03 sensor in 2003 and consists of 610×340 pixels. This spectral imager continuously images 115 bands in the 430–840 nm wavelength range with a GSD of 1.3 m. Twelve bands affected by noise and water vapor were removed, so the image composed of the remaining 103 bands is usually used. In this experiment, an area of 256×256 pixels with 103 bands in the top left corner of the image was selected. The Washington DC dataset was acquired by the HYDICE sensor in 1995 and has a spatial resolution of 2.5 m. The image covers 1280×307 pixels with 210 bands in the 400–2500 nm wavelength range. After removing 19 noise-affected bands, 191 bands covering an area of 256×256 pixels were used. Figure 4 shows the experimental data after pre-processing.

Figure 4. The color-composite of three public hyperspectral datasets. (a) Indian Pines, (b) Pavia University and (c) Washington DC.


To validate the proposed LCNet framework, we also performed experiments on real data from different hyperspectral and multispectral satellites. The HSI was acquired from the OHS-1 satellite and consists of 32 bands with a spatial resolution of 10 m. The multispectral data were obtained from the GF-2 satellite, which provides four bands with a spatial resolution of 4 m. The data were corrected and registered using the Environment for Visualizing Images (ENVI) software. After slightly resizing the images, the size of the LR HSI is 428×428×32, the size of the HR MSI is 1284×1284×4, and the fusion ratio is 3.

4.2. Experimental setup

In this experiment, we generated the LR HSI by simulating spatial downsampling and the HR MSI by simulating spectral downsampling. The multiplier of spatial downsampling is the ratio of HR GSD to LR GSD, and the GSD ratio is set to 8 for all public experimental data. The simulated LR HSI is generated by applying a Gaussian filter with an 8×8 window and a standard deviation of 0.5, which is widely used in remote sensing (Hong et al. 2019). All HR MSIs with four bands were generated by filtering the HR HSI with the SRF. All models were trained on four NVIDIA RTX 2080 GPUs.
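The simulation protocol can be summarized by the sketch below; note that `gaussian_filter` with its default truncation yields a slightly smaller window than the stated 8×8 at this standard deviation, so an explicit 8×8 kernel would be needed to match the setup exactly.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_inputs(hr_hsi, srf, ratio=8, sigma=0.5):
    """hr_hsi: (H, W, L) ground truth; srf: (L, l) spectral response function."""
    blurred = gaussian_filter(hr_hsi, sigma=(sigma, sigma, 0))  # band-wise spatial blur
    lr_hsi = blurred[::ratio, ::ratio, :]                       # GSD ratio of 8
    hr_msi = hr_hsi @ srf                                       # SRF spectral filtering
    return lr_hsi, hr_msi
```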

4.3. Evaluation metrics

In this study, simulation experiments are used to evaluate the performance of different fusion methods. The raw HR HSI is used as ground truth to calculate and compare the performance of the various algorithms. For quantitative comparison, the peak signal-to-noise ratio (PSNR) and the spectral angle mapper (SAM) are used to quantitatively evaluate the quality of the SR solutions.

PSNR is an objective standard for image evaluation, expressed in decibels (dB), and a vital evaluation index for image reconstruction. Generally, a higher PSNR value indicates better quality of the SR results. PSNR is derived from the mean square error (MSE). The MSE between the reference image $I$ and the reconstructed image $K$, both of size $m \times n$, is calculated as:

(10) $\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ I(i,j) - K(i,j) \right]^2$

PSNR is expressed as:

(11) $\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}_I^2}{\mathrm{MSE}}\right)$
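A direct NumPy transcription of Eqs. (10) and (11) is shown below; setting $\mathrm{MAX}_I = 1.0$ assumes reflectance data scaled to [0, 1].

```python
import numpy as np

def psnr(reference, estimate, max_i=1.0):
    mse = np.mean((reference - estimate) ** 2)  # Eq. (10)
    return 10.0 * np.log10(max_i**2 / mse)      # Eq. (11)
```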

The calculation of SAM regards each pixel of the image as a vector in a multidimensional space and computes the angle between the corresponding spectral vectors of the real HSI and the estimated HSI. A smaller spectral angle corresponds to better results of the hyperspectral image SR reconstruction. The SAM is calculated according to:

(12) $\mathrm{SAM}(i) = \arccos\!\left(\frac{\langle u(i), \hat{u}(i) \rangle}{\lVert u(i) \rVert_2 \times \lVert \hat{u}(i) \rVert_2}\right)$

where $u(i)$ and $\hat{u}(i)$ represent the spectral vectors of the $i$-th pixel in the real HR HSI and the estimated HR HSI, respectively, and $\mathrm{SAM}(i)$ represents the spectral angle of the $i$-th pixel.
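Below is a NumPy sketch of Eq. (12) averaged over all pixels; reporting the result in degrees is a common convention and is assumed here.

```python
import numpy as np

def sam(reference, estimate, eps=1e-12):
    """reference, estimate: (MN, L) matrices whose rows are pixel spectra."""
    dot = np.sum(reference * estimate, axis=1)
    norms = np.linalg.norm(reference, axis=1) * np.linalg.norm(estimate, axis=1)
    angles = np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0))
    return np.degrees(angles.mean())            # Eq. (12), averaged over pixels
```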

4.4. Results

This section shows the SR results of the proposed LCNet on three public HSI datasets, i.e. Indian Pines, Pavia University, and Washington DC, and provides comparisons with relevant methods. The SR results are compared through qualitative analysis and quantitative evaluation. Furthermore, experiments are conducted on real-world hyperspectral data to further demonstrate the effectiveness of the proposed method.

4.4.1. Indian Pines

In this paper, the proposed LCNet is compared with eight classical methods for fusing HSI and MSI, including SFIM (Liu 2000), GLPHS (Selva et al. 2015), GSA (Aiazzi, Baronti, and Selva 2007), CNMF (Yokoya, Yairi, and Iwasaki 2011), FUSE (Wei, Dobigeon, and Tourneret 2015), HySure (Simoes et al. 2014), CSTF (Li et al. 2018), and uSDN (Qu, Qi, and Kwan 2018). Table 1 summarizes the ability of each comparative method to learn the SRF and PSF; only HySure can learn both unknown functions. For methods that cannot learn the SRF adaptively, the SRF estimated by the proposed network is used as a known prior.

Table 1. The ability of each method to learn unknown SRF and PSF.

Figure 5 presents an example of the results on the Indian Pines data. uSDN and our method preserve the spatial structure more accurately and obtain results closer to the reference image. The fusion results of SFIM, GLPHS, and GSA have obvious block effects, while HySure and CSTF produce additional noise. The results of CNMF and FUSE have blurred textures and unclear boundaries. Overall, uSDN and the proposed LCNet achieve visually pleasing fusion results on this data and perform well in preserving the structural features and spectral information of ground objects.

Figure 5. Grayscale display comparison of the 5th band of the reconstruction results on the Indian Pines data. Methods: (a) SFIM, (b) GLPHS, (c) GSA, (d) CNMF, (e) FUSE, (f) HySure, (g) CSTF, (h) uSDN, (i) LCNet, (j) Reference image.


To visually enhance subtle differences that are not easily detected, the error maps between the estimated results and the reference image are visualized to better evaluate the performance of each method. Figure 6 shows the error maps for each method at band 5 of the Indian Pines data. It can be observed that the texture features of the reconstructed images obtained by SFIM, GLPHS, GSA, FUSE, and CSTF are lost and differ significantly from the reference image. The results of CNMF and HySure do not clearly distinguish boundaries. uSDN achieves a more satisfactory error map, but errors remain in the recovered details. The proposed LCNet is not only more boundary-aware, but also focuses on the features of local details.

Figure 6. Comparison of the error maps of the 5th band of the reconstruction results on the Indian Pines data. Methods: (a) SFIM, (b) GLPHS, (c) GSA, (d) CNMF, (e) FUSE, (f) HySure, (g) CSTF, (h) uSDN, (i) LCNet.


The spectral curves marked in red in each result were plotted to compare the ability of each method to maintain spectral properties, as shown in Figure 7. The proposed LCNet shows superior performance in spectral information reconstruction, with spectral curves closest to the ground truth.

Figure 7. Spectral curves of each method at the red mark of Indian Pines data.


As shown in Table 2, the proposed LCNet achieved the highest PSNR of 37.544 and the lowest SAM score of 1.700. Detailed examination shows that earlier methods based on sharpening and matrix factorization have difficulty obtaining satisfactory evaluation values; they cannot restore texture structure and spectral features well. Although uSDN achieves a high PSNR value with good spatial information restoration, its spectral distortion needs to be improved. The PSNR value of CSTF is also high, but its SAM value is not satisfactory; thus, although CSTF preserves the texture features of the image well, its spectral distortion is quite serious. In contrast, our method shows better performance in terms of spatial local feature extraction and spectral information retention.

Table 2. Quantitative performance comparison of each method on the Indian Pines dataset. The best results are highlighted in bold.

4.4.2. Pavia University

To further confirm the generality of the proposed framework, LCNet was also compared with SFIM, GLPHS, GSA, CNMF, FUSE, HySure, CSTF, and uSDN on the Pavia University dataset. The visual evaluation of these methods is shown in Figure 8. The results of SFIM and GLPHS still have an obvious patchy effect, while the pixel values of the images obtained by FUSE, CSTF, and uSDN are generally higher than those of the reference image. The texture structure of the GSA reconstructed image has changed, whereas the results obtained by CNMF, HySure, and the proposed LCNet have clear textures and boundaries.

Figure 8. Grayscale display comparison of the 30th band of the reconstruction results on the Pavia University data. Methods: (a) SFIM, (b) GLPHS, (c) GSA, (d) CNMF, (e) FUSE, (f) HySure, (g) CSTF, (h) uSDN, (i) LCNet, (j) Reference image.


To facilitate the evaluation of the results for each method, Figure 9 shows the error maps at the 30th band of the Pavia University dataset. It is visually evident that SFIM, GLPHS, GSA, FUSE, and CSTF exhibit significant errors compared to the reference image. It is worth noting that CNMF achieves better visual performance in the grayscale display, while the error map realistically displays the difference between its result and the ground truth. HySure exhibits a significant amount of high-frequency noise error. The proposed LCNet has the ability to preserve regular texture structures. The spectral curves at the red marks of the results, shown in Figure 10, also indicate that this method is effective for the extraction and retention of spectral features.

Figure 9. Comparison of the error maps of the 30th band of the reconstruction results on the Pavia University data. Methods: (a) SFIM, (b) GLPHS, (c) GSA, (d) CNMF, (e) FUSE, (f) HySure, (g) CSTF, (h) uSDN, (i) LCNet.


Figure 10. Spectral curves of each method at the red mark of Pavia University data.


Table 3 presents the quantitative results for the Pavia University dataset. Except for HySure, uSDN, and the proposed LCNet, the PSNR and SAM values obtained by the other methods are unacceptable; they are insufficient for spatial structure preservation and spectral restoration. HySure achieves the best SAM value and an excellent perception of the spectral information of this data, but is deficient in preserving the texture structure of the image. Although the proposed LCNet did not achieve the best results, the obtained PSNR and SAM values are satisfactory at 35.778 and 4.381, respectively. Our method is effective in extracting and preserving the spectral characteristics of the image and the detailed texture characteristics of key regions.

Table 3. Quantitative performance comparison of each method on the Pavia University data. The best results are highlighted in bold.

4.4.3. Washington DC

To further prove the quantitative and qualitative performance of LCNet, comparisons were made with SFIM, GLPHS, GSA, CNMF, FUSE, HySure, CSTF, and uSDN on the Washington DC dataset. Figure 11 shows the qualitative results of the proposed LCNet and the other methods on this data. As shown in Figure 11, SFIM, GLPHS, FUSE, and CSTF are not as effective as the other methods at preserving spatial structure. HySure produces higher pixel values than the reference image. GSA, uSDN, and the proposed LCNet achieve results that are closer to the reference image.

Figure 11. Grayscale display comparison of the 40th band of the reconstruction results on the Washington DC data. Methods: (a) SFIM, (b) GLPHS, (c) GSA, (d) CNMF, (e) FUSE, (f) HySure, (g) CSTF, (h) uSDN, (i) LCNet, (j) Reference image.


Figure 12 shows the error maps of each fusion result against the reference data at the 40th band. It can be seen that SFIM and FUSE have obvious patches with clear errors. GLPHS, GSA, and CSTF exhibit significant errors, while CNMF demonstrates insufficient ability to recover edge structures. uSDN and the proposed LCNet show small errors and clear texture structures. The spectral curves at the red mark are shown in Figure 13, where the proposed method obtains the result closest to the ground truth. Overall, the image obtained by the proposed LCNet retains more complete spatial details and spectral features.

Figure 12. Comparison of the error maps of the 40th band of the reconstruction results on the Washington DC data. Methods: (a) SFIM, (b) GLPHS, (c) GSA, (d) CNMF, (e) FUSE, (f) HySure, (g) CSTF, (h) uSDN, (i) LCNet.


Figure 13. Spectral curves of each method at the red mark of Washington DC data.


As shown in the quantitative comparison listed in Table 4, the proposed LCNet achieved the best performance, with PSNR and SAM values of 39.977 and 4.264, respectively. In terms of PSNR, the proposed LCNet outperforms SFIM, GLPHS, GSA, CNMF, FUSE, HySure, CSTF, and uSDN by 18.871, 14.533, 13.880, 14.377, 17.733, 14.246, 7.846, and 1.717 dB, respectively. This indicates that LCNet is more effective in preserving structural features, even though the textures in this HSI are more complex. Additionally, the proposed LCNet achieved a much lower SAM value than the other methods, indicating superior spectral recovery ability.

Table 4. Quantitative performance comparison of each method on the Washington DC data. The best results are highlighted in bold.

4.4.4. Real dataset

To further evaluate the effectiveness of the proposed LCNet in practical applications, we conducted fusion experiments using hyperspectral data from the OHS-1 satellite and MSI from the GF-2 satellite. Figure 14 shows the visualization results of LCNet and the other methods on this data. SFIM, GLPHS, GSA, CNMF, and HySure were unable to reconstruct images with clear textures. The FUSE method lost a significant amount of spatial information, resulting in poor performance. Both uSDN and the proposed LCNet achieved more satisfactory visual results. However, LCNet outperformed uSDN by exhibiting sharper edge and detail features, highlighting its potential for practical applications in realistic scenes with complex variability.

Figure 14. Grayscale display comparison of the 14th band of the reconstruction results on the real dataset. Methods: (a) LR HSI, (b) HR MSI (bands 3, 2, 1), (c) SFIM, (d) GLPHS, (e) GSA, (f) CNMF, (g) FUSE, (h) HySure, (i) uSDN, (j) LCNet.


4.4.5. Comprehensive analysis

The performance of LCNet in terms of stability and effectiveness can be observed by comparing it with other methods on the four datasets. LCNet achieves nearly optimal evaluation metrics on all three public datasets and produces visually superior results compared to the other methods. It effectively extracts and preserves key local features, and the fused image features are rich and accurate with high stability. Conversely, some methods exhibit inconsistent performance across datasets. For example, HySure achieves the best SAM value on the Pavia University data, yet its performance on the Washington DC data is not satisfactory, with poor recovery of spatial structure and spectral features. In addition, the performance of LCNet on the real dataset also demonstrates its excellent applicability and robustness.

4.5. Discussion

This section provides a more comprehensive analysis and discussion of the proposed network, including ablation experiments, complexity analysis, and limitations.

4.5.1. Ablation study

To further explore the effectiveness of the techniques used in the network, we conducted ablation experiments and tried alternative techniques. Specifically, we tested the performance of the relevant networks on the Washington DC dataset, including a network without the spatial local-aware block (LCNet-slb), a network with the spatial local-aware block replaced by the convolutional block attention module (LCNet-slb+CBAM), and a network with the spatial local-aware block replaced by the squeeze-and-excitation module (LCNet-slb+SE). Table 5 displays the results of the ablation and substitution experiments. The experimental results show that the LCNet with the spatial local-aware block achieved the best performance. Overall, the spatial local-aware block captures crucial spatial information and is more adaptive to the HSI fusion SR task.

Table 5. Ablation and substitution experiment results on the Washington DC dataset. The best results are highlighted in bold.

4.5.2. Network complexity

Table 6 displays the number of network parameters and floating-point operations (FLOPs) of the compared methods on the Washington DC dataset. It can be observed that both networks require relatively few parameters. Compared with uSDN, the proposed LCNet improves performance without a significant increase in the number of parameters or computational complexity. In addition, the hardware consumption of LCNet without the spatial local-aware block (LCNet-slb) is compared; the spatial local-aware block improves performance without significantly increasing hardware consumption.

Table 6. Parameters and FLOPs of LCNet and uSDN on the Washington DC dataset.

4.5.3. Limitations and future work

The proposed LCNet incorporates a spatial local-aware block to selectively emphasize the spatial information of MSI. However, the framework lacks full exploitation of the correlations and dependencies among the spectral bands in HSI. In future work, it is important to consider the characteristics of spectral bands in HSI and focus on restoring spectral information. Moreover, developing networks that are suitable for data with differences in spectral features and observation angles is worth exploring.

5. Conclusions

In this paper, a local-aware coupled network (LCNet) has been proposed for fusing LR HSI and HR MSI to achieve HSI SR. In LCNet, the SRF is estimated adaptively to generate simulated HR MSI and used for network optimization. Two reconstruction networks are coupled by sharing the decoder weights to preserve the spatial texture and spectral properties. The spatial local-aware block introduced in the encoder of the HR MSI reconstruction network effectively enhances the feature representation of key areas and promotes the preservation of detailed texture features. The proposed method was evaluated on three public HSI datasets, namely Indian Pines, Pavia University and Washington DC, as well as a real dataset. The experimental results confirmed that the proposed framework can effectively combine the characteristics of MSI and HSI to achieve SR processing of HSI. Compared with other state-of-the-art methods, the proposed LCNet is more effective and stable.

Acknowledgments

The authors would like to thank the editor, associate editor, and anonymous reviewers for their helpful comments and advice. This work was supported by the National Natural Science Foundation of China under Grant No. 41901306.

Disclosure statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability statement

The Indian Pines, Pavia University, and Washington DC datasets are publicly available hyperspectral image datasets and can be downloaded from the following link: https://rslab.ut.ac.ir/data. The OHS-1 hyperspectral data can be downloaded from the following link: https://www.orbitalinsight.com/data/. The GF-2 multispectral data can be downloaded from the following link: https://data.cresda.cn/.

Additional information

Funding

The work was supported by the National Natural Science Foundation of China [41901306].

References

  • Aiazzi, B., S. Baronti, and M. Selva. 2007. “Improving Component Substitution Pansharpening Through Multivariate Regression of MS + Pan Data.” IEEE Transactions on Geoscience and Remote Sensing 45 (10): 3230–17. https://doi.org/10.1109/TGRS.2007.901007.
  • Akhtar, N., F. Shafait, and A. Mian. 2015. “Bayesian Sparse Representation for Hyperspectral Image Super Resolution.” In IEEE Conference on Computer Vision and Pattern Recognition, Boston, June, 3631–3640.
  • Arun, P. V., K. M. Buddhiraju, A. Porwal, and J. Chanussot. 2020. “CNN-Based Super-Resolution of Hyperspectral Images.” IEEE Transactions on Geoscience and Remote Sensing 58 (9): 6106–6121. https://doi.org/10.1109/TGRS.2020.2973370.
  • Borsoi, R. A., T. Imbiriba, and J. C. M. Bermudez. 2019. “Super-Resolution for Hyperspectral and Multispectral Image Fusion Accounting for Seasonal Spectral Variability.” IEEE Transactions on Image Processing 29:116–127. https://doi.org/10.1109/TIP.2019.2928895.
  • Brown, A. J., S. J. Hook, A. M. Baldridge, J. K. Crowley, N. T. Bridges, B. J. Thomson, G. M. Marion, C. R. S. Filho, and J. L. Bishop. 2010. “Hydrothermal Formation of Clay-Carbonate Alteration Assemblages in the Nili Fossae Region of Mars.” Earth and Planetary Science Letters 297 (1–2): 174–182. https://doi.org/10.1016/j.epsl.2010.06.018.
  • Chen, H., X. He, L. Qing, L. Qing, Y. Wu, C. Ren, R. Sheriff, and C. Zhu. 2022. “Real-World Single Image Super-Resolution: A Brief Review.” Information Fusion 79:124–145. https://doi.org/10.1016/j.inffus.2021.09.005.
  • Chen, N., L. Sui, B. Zhang, H. He, K. Gao, Y. Li, J. Junior, and J. Li. 2021. “Fusion of Hyperspectral-Multispectral Images Joining Spatial-Spectral Dual-Dictionary and Structured Sparse Low-Rank Representation.” International Journal of Applied Earth Observation and Geoinformation 104:102570. https://doi.org/10.1016/j.jag.2021.102570.
  • Dian, R., and S. Li. 2019. “Hyperspectral Image Super-Resolution via Subspace-Based Low Tensor Multi-Rank Regularization.” IEEE Transactions on Image Processing 28 (10): 5135–5146. https://doi.org/10.1109/TIP.2019.2916734.
  • Dong, W., F. Fu, G. Shi, X. Cao, J. Wu, G. Li, and X. Li. 2016. “Hyperspectral Image Super-Resolution via Non-Negative Structured Sparse Representation.” IEEE Transactions on Image Processing 25 (5): 2337–2352. https://doi.org/10.1109/TIP.2016.2542360.
  • Donovan, S. D., D. A. MacLean, Y. Zhang, M. B. Lavigne, and J. A. Kershaw. 2021. “Evaluating Annual Spruce Budworm Defoliation Using Change Detection of Vegetation Indices Calculated from Satellite Hyperspectral Imagery.” Remote Sensing of Environment 253:112204. https://doi.org/10.1016/j.rse.2020.112204.
  • Eismann, M. T., and R. C. Hardie. 2004. “Application of the Stochastic Mixing Model to Hyperspectral Resolution Enhancement.” IEEE Transactions on Geoscience and Remote Sensing 42 (9): 1924–1933. https://doi.org/10.1109/TGRS.2004.830644.
  • Fu, Y., T. Zhang, Y. Zheng, D. Zhang, and H. Huang. 2019. “Hyperspectral Image Super-Resolution with Optimized RGB Guidance.” In IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, June, 11661–11670.
  • Gomez, R. B., A. Jazaeri, and M. Kafatos. 2001. “Wavelet-Based Hyperspectral and Multispectral Image Fusion.” In Geo-Spatial Image and Data Exploitation II, June, Orlando, FL, United States, 36–42. https://doi.org/10.1117/12.428249.
  • Han, X. H., Y. Zheng, and Y. W. Chen. 2019. “Multi-Level and Multi-Scale Spatial and Spectral Fusion CNN for Hyperspectral Image Super-Resolution.” In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Korea (South), October, 4330–4339. http://dx.doi.org/10.1109/ICCVW.2019.00533.
  • He, W., Y. Chen, N. Yokoya, C. Li, and B. Zhao. 2022. “Hyperspectral Super-Resolution via Coupled Tensor Ring Factorization.” Pattern Recognition 122:108280. https://doi.org/10.1016/j.patcog.2021.108280.
  • He, S., H. Zhou, Y. Wang, W. Cao, and Z. Han. 2016. “Super-Resolution Reconstruction of Hyperspectral Images via Low Rank Tensor Modeling and Total Variation Regularization.” In IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, July, 6962–6965.
  • Hong, S. M., K. H. Cho, S. Park, T. Kang, M. S. Kim, G. Nam, and J. Pyo. 2022. “Estimation of Cyanobacteria Pigments in the Main Rivers of South Korea Using Spatial Attention Convolutional Neural Network with Hyperspectral Imagery.” GIScience & Remote Sensing 59 (1): 547–567. https://doi.org/10.1080/15481603.2022.2037887.
  • Hong, D., N. Yokoya, N. Ge, J. Chanussot, and X. Zhu. 2019. “Learnable Manifold Alignment (LeMa): A Semi-Supervised Cross-Modality Learning Framework for Land Cover and Land Use Classification.” ISPRS Journal of Photogrammetry and Remote Sensing 147:193–205. https://doi.org/10.1016/j.isprsjprs.2018.10.006.
  • Huang, H., J. Yu, and W. Sun. 2014. “Super-Resolution Mapping via Multi-Dictionary Based Sparse Representation.” In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, May, 3523–3527.
  • Irmak, H., G. B. Akar, and S. E. Yukse. 2018. “A Map-Based Approach for Hyperspectral Imagery Super-Resolution.” IEEE Transactions on Image Processing 27 (6): 2942–2951. https://doi.org/10.1109/TIP.2018.2814210.
  • Jiang, J., H. Sun, X. Liu, and J. Ma. 2020. “Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery.” IEEE Transactions on Computational Imaging 6:1082–1096. https://doi.org/10.1109/TCI.2020.2996075.
  • Li, S., R. Dian, L. Fang, and J. M. Bioucas-Dias. 2018. “Fusing Hyperspectral and Multispectral Images via Coupled Sparse Tensor Factorization.” IEEE Transactions on Image Processing 27 (8): 4118–4130. https://doi.org/10.1109/TIP.2018.2836307.
  • Liu, J. 2000. “Smoothing Filter-Based Intensity Modulation: A Spectral Preserve Image Fusion Technique for Improving Spatial Details.” International Journal of Remote Sensing 21 (18): 3461–3472. https://doi.org/10.1080/014311600750037499.
  • Liu, D., J. Li, and Q. Yuan. 2021. “A Spectral Grouping and Attention-Driven Residual Dense Network for Hyperspectral Image Super-Resolution.” IEEE Transactions on Geoscience and Remote Sensing 59 (9): 7711–7725. https://doi.org/10.1109/TGRS.2021.3049875.
  • Liu, J., Z. Wu, L. Xiao, and X. Wu. 2022. “Model Inspired Autoencoder for Unsupervised Hyperspectral Image Super-Resolution.” IEEE Transactions on Geoscience and Remote Sensing 60:1–12. https://doi.org/10.1109/TGRS.2022.3143156.
  • Li, Q., Q. Wang, and X. Li. 2021. “Exploring the Relationship Between 2D/3D Convolution for Hyperspectral Image Super-Resolution.” IEEE Transactions on Geoscience and Remote Sensing 59 (10): 8693–8703. https://doi.org/10.1109/TGRS.2020.3047363.
  • Li, K., W. Zhang, D. Yu, and X. Tian. 2022. “HyperNet: A Deep Network for Hyperspectral, Multispectral, and Panchromatic Image Fusion.” ISPRS Journal of Photogrammetry and Remote Sensing 188:30–44. https://doi.org/10.1016/j.isprsjprs.2022.04.001.
  • Mei, S., X. Yuan, J. Ji, Y. Zhang, S. Wan, and Q. Du. 2017. “Hyperspectral Image Spatial Super-Resolution via 3D Full Convolutional Neural Network.” Remote Sensing 9 (11): 1139. https://doi.org/10.3390/rs9111139.
  • Moreira, L. C. J., A. D. S. Teixeira, and L. S. Galvão. 2015. “Potential of Multispectral and Hyperspectral Data to Detect Saline-Exposed Soils in Brazil.” GIScience & Remote Sensing 52 (4): 416–436. https://doi.org/10.1080/15481603.2015.1040227.
  • Ou, X., L. Liu, B. Tu, G. Zhang, and Z. Xu. 2022. “A CNN Framework with Slow-Fast Band Selection and Feature Fusion Grouping for Hyperspectral Image Change Detection.” IEEE Transactions on Geoscience and Remote Sensing 60:1–16. https://doi.org/10.1109/TGRS.2022.3156041.
  • Palsson, F., J. R. Sveinsson, and M. O. Ulfarsson. 2017. “Multispectral and Hyperspectral Image Fusion Using a 3-D-Convolutional Neural Network.” IEEE Geoscience and Remote Sensing Letters 14 (5): 639–643. https://doi.org/10.1109/LGRS.2017.2668299.
  • Quan, Y., M. Li, Y. Hao, J. Liu, and B. Wang. 2023. “Tree Species Classification in a Typical Natural Secondary Forest Using UAV-Borne LiDar and Hyperspectral Data.” GIScience & Remote Sensing 60 (1): 2171706. https://doi.org/10.1080/15481603.2023.2171706.
  • Qu, Y., H. Qi, and C. Kwan. 2018. “Unsupervised Sparse Dirichlet-Net for Hyperspectral Image Super-Resolution.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, June, 2511–2520.
  • Qu, Y., H. Qi, C. Kwan, N. Yokoya, and J. Chanussot. 2021. “Unsupervised and Unregistered Hyperspectral Image Super-Resolution with Mutual Dirichlet-Net.” IEEE Transactions on Geoscience and Remote Sensing 60:1–18. https://doi.org/10.1109/TGRS.2021.3079518.
  • Rao, W., L. Gao, Y. Qu, X. Sun, B. Zhang, and J. Chanussot. 2022. “Siamese Transformer Network for Hyperspectral Image Target Detection.” IEEE Transactions on Geoscience and Remote Sensing 60:1–19. https://doi.org/10.1109/TGRS.2022.3163173.
  • Rasti, B., D. Hong, R. Hang, P. Ghamisi, X. Kang, J. Chanussot, and J. A. Benediktsson. 2020. “Feature Extraction for Hyperspectral Imagery: The Evolution from Shallow to Deep: Overview and Toolbox.” IEEE Geoscience and Remote Sensing Magazine 8 (4): 60–88. https://doi.org/10.1109/MGRS.2020.2979764.
  • Selva, M., B. Aiazzi, F. Butera, F. Chiarantini, and S. Baronti. 2015. “Hyper-Sharpening: A First Approach on SIM-GA Data.” IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 8 (6): 3008–3024. https://doi.org/10.1109/JSTARS.2015.2440092.
  • Sethuraman, J. 1994. “A Constructive Definition of Dirichlet Priors.” Statistica Sinica 4 (2): 639–650.
  • Simoes, M., J. Bioucas-Dias, L. B. Almeida, and J. Chanussot. 2014. “A Convex Formulation for Hyperspectral Image Superresolution via Subspace-Based Regularization.” IEEE Transactions on Geoscience and Remote Sensing 53 (6): 3373–3388. https://doi.org/10.1109/TGRS.2014.2375320.
  • Wang, Y., X. A. Chen, Z. Han, and S. He. 2017. “Hyperspectral Image Super-Resolution via Nonlocal Low-Rank Tensor Approximation and Total Variation Regularization.” Remote Sensing 9 (12): 1286. https://doi.org/10.3390/rs9121286.
  • Wan, W., W. Guo, H. Huang, and J. Liu. 2020. “Nonnegative and Nonlocal Sparse Tensor Factorization-Based Hyperspectral Image Super-Resolution.” IEEE Transactions on Geoscience and Remote Sensing 58 (12): 8384–8394. https://doi.org/10.1109/TGRS.2020.2987530.
  • Wang, W., W. Zeng, Y. Huang, X. Ding, and J. Paisley. 2019. “Deep Blind Hyperspectral Image Fusion.” In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, October, 4150–4159.
  • Wei, Q., N. Dobigeon, and J. Tourneret. 2015. “Fast Fusion of Multi-Band Images Based on Solving a Sylvester Equation.” IEEE Transactions on Image Processing 24 (11): 4109–4121. https://doi.org/10.1109/TIP.2015.2458572.
  • Xu, Y., Z. Wu, J. Chanussot, P. Comon, and Z. Wei. 2019. “Nonlocal Coupled Tensor CP Decomposition for Hyperspectral and Multispectral Image Fusion.” IEEE Transactions on Geoscience and Remote Sensing 58 (1): 348–362. https://doi.org/10.1109/TGRS.2019.2936486.
  • Xu, X., Y. Zhong, L. Zhang, and H. Zhang. 2012. “Sub-Pixel Mapping Based on a MAP Model with Multiple Shifted Hyperspectral Imagery.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 6 (2): 580–593. https://doi.org/10.1109/JSTARS.2012.2227246.
  • Yi, C., Y. Q. Zhao, and J. C. W. Chan. 2018. “Hyperspectral Image Super-Resolution Based on Spatial and Spectral Correlation Fusion.” IEEE Transactions on Geoscience and Remote Sensing 56 (7): 4165–4177. https://doi.org/10.1109/TGRS.2018.2828042.
  • Yokoya, N., T. Yairi, and A. Iwasaki. 2011. “Coupled Nonnegative Matrix Factorization Unmixing for Hyperspectral and Multispectral Data Fusion.” IEEE Transactions on Geoscience and Remote Sensing 50 (2): 528–537. https://doi.org/10.1109/TGRS.2011.2161320.
  • Yuan, Y., X. Zheng, and X. Lu. 2017. “Hyperspectral Image Superresolution by Transfer Learning.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 10 (5): 1963–1974. https://doi.org/10.1109/JSTARS.2017.2655112.
  • Zeng, K., Z. Xu, Y. Yang, Y. Liu, H. Zhao, Y. Zhang, B. Xie, W. Zhou, C. Li, and W. Cao. 2022. “In situ Hyperspectral Characteristics and the Discriminative Ability of Remote Sensing to Coral Species in the South China Sea.” GIScience & Remote Sensing 59 (1): 272–294. https://doi.org/10.1080/15481603.2022.2026641.
  • Zhang, L., J. Nie, W. Wei, Y. Zhang, S. Liao, and L. Shao. 2020. “Unsupervised Adaptation Learning for Hyperspectral Imagery Super-Resolution.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, June, 3073–3082. https://doi.org/10.1109/CVPR42600.2020.00314.
  • Zhang, L., W. Wei, C. Bai, Y. Gao, and Y. Zhang. 2018. “Exploiting Clustering Manifold Structure for Hyperspectral Imagery Super-Resolution.” IEEE Transactions on Image Processing 27 (12): 5969–5982. https://doi.org/10.1109/TIP.2018.2862629.
  • Zhang, L., X. Xu, J. Li, H. Shen, Y. Zhong, and X. Huang. 2012. “Research on Image Reconstruction Based and Pixel Unmixing Based Sub-Pixel Mapping Methods.” In 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, July, 7263–7266.
  • Zheng, K., L. Gao, W. Liao, D. Hong, B. Zhang, X. Cui, and J. Chanussot. 2020. “Coupled Convolutional Neural Network with Adaptive Response Function Learning for Unsupervised Hyperspectral Super Resolution.” IEEE Transactions on Geoscience and Remote Sensing 59 (3): 2487–2502. https://doi.org/10.1109/TGRS.2020.3006534.
  • Zhou, Y., A. Rangarajan, and P. D. Gader. 2019. “An Integrated Approach to Registration and Fusion of Hyperspectral and Multispectral Images.” IEEE Transactions on Geoscience and Remote Sensing 58 (5): 3020–3033. https://doi.org/10.1109/TGRS.2019.2946803.
  • Zhu, Q., W. Deng, Z. Zheng, Y. Zhong, Q. Guan, W. Lin, L. Zhang, and D. Li. 2021. “A Spectral-Spatial-Dependent Global Learning Framework for Insufficient and Imbalanced Hyperspectral Image Classification.” IEEE Transactions on Cybernetics 52 (11): 11709–11723. https://doi.org/10.1109/TCYB.2021.3070577.
  • Zhu, Q., Y. Zhang, L. Wang, Y. Zhong, Q. Guan, X. Lu, L. Zhang, and D. Li. 2021. “A Global Context-Aware and Batch-Independent Network for Road Extraction from VHR Satellite Imagery.” ISPRS Journal of Photogrammetry and Remote Sensing 175:353–365. https://doi.org/10.1016/j.isprsjprs.2021.03.016.
  • Zhu, Q., Y. Zhong, L. Zhang, and D. Li. 2018. “Adaptive Deep Sparse Semantic Modeling Framework for High Spatial Resolution Image Scene Classification.” IEEE Transactions on Geoscience and Remote Sensing 56 (10): 6180–6195. https://doi.org/10.1109/TGRS.2018.2833293.