References
- Ba JL, Kiros JR, Hinton GE. 2016. Layer normalization. arXiv:1607.06450.
- Badrinarayanan V, Kendall A, Cipolla R. 2017. Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 39(12):2481–2495. doi: 10.1109/TPAMI.2016.2644615.
- Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q, Wang M. 2023. Swin-unet: unet-like pure transformer for medical image segmentation. Computer Vision–ECCV 2022 Workshops; October 23–27, 2022; Tel Aviv, Israel: Springer.
- Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation. The European Conference on Computer Vision (ECCV); 8–14 September 2018; Munich, Germany.
- Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou YJ. 2021. Transunet: transformers make strong encoders for medical image segmentation. arXiv:2102.04306.
- Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL. 2018. Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFS. IEEE Trans Pattern Anal Mach Intell. 40(4):834–848. doi: 10.1109/TPAMI.2017.2699184.
- Chen L-C, Papandreou G, Schroff F, Adam H. 2017. Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587.
- Chen F, Wang N, Yu B, Wang L. 2022. Res2-Unet, a new deep architecture for building detection from high spatial resolution images. IEEE J Sel Top Appl Earth Observations Remote Sensing. 15:1494–1501. doi: 10.1109/JSTARS.2022.3146430.
- Dowden B, De Silva O, Huang W, Oldford D. 2021. Sea ice classification via deep neural network semantic segmentation. IEEE Sensors J. 21(10):11879–11888. doi: 10.1109/JSEN.2020.3031475.
- Feng S, Zhao H, Shi F, Cheng X, Wang M, Ma Y, Xiang D, Zhu W, Chen X. 2020. CPFNet: context pyramid fusion network for medical image segmentation. IEEE Trans Med Imaging. 39(10):3008–3018. doi: 10.1109/TMI.2020.2983721.
- Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H. 2019. Dual attention network for scene segmentation. The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 15–20 June 2019; Long Beach, CA, USA.
- Guan S, Khan AA, Sikdar S, Chitnis PV. 2019. Fully dense UNet for 2-D sparse photoacoustic tomography artifact removal. IEEE J Biomed Health Inform. 24(2):568–576. doi: 10.1109/JBHI.2019.2912935.
- He K, Zhang X, Ren S, Sun J. 2016. Deep residual learning for image recognition. The IEEE conference on computer vision and pattern recognition; 27–30 June 2016; Las Vegas, NV, USA.
- He X, Zhou Y, Zhao J, Zhang D, Yao R, Xue Y. 2022. Swin transformer embedding UNet for remote sensing image semantic segmentation. IEEE Trans Geosci Remote Sensing. 60:1–15. doi: 10.1109/TGRS.2022.3144165.
- Hendrycks D, Gimpel K. 2016. Gaussian error linear units (gelus). arXiv:1606.08415.
- Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. 2017. Densely connected convolutional networks. The IEEE Conference on Computer Vision and Pattern Recognition.
- Huang X, Zhang L, Gong W. 2011. Information fusion of aerial images and LIDAR data in urban areas: vector-stacking, re-classification and post-processing approaches. Int J Remote Sens. 32(1):69–84. doi: 10.1080/01431160903439882.
- Ibtehaz N, Rahman MS. 2020. MultiResUNet: rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw. 121:74–87. doi: 10.1016/j.neunet.2019.08.025.
- Ioffe S. 2017. Batch renormalization: towards reducing minibatch dependence in batch-normalized models. Adv Neural Info Processing Sys. 30.
- Kampffmeyer M, Salberg A-B, Jenssen R. 2018. Urban land cover classification with missing data modalities using deep convolutional neural networks. IEEE J Sel Top Appl Earth Observations Remote Sensing. 11(6):1758–1768. doi: 10.1109/JSTARS.2018.2834961.
- Li H, Qiu K, Chen L, Mei X, Hong L, Tao C. 2021. SCAttNet: semantic segmentation network with spatial and channel attention mechanism for high-resolution remote sensing images. IEEE Geosci Remote Sensing Lett. 18(5):905–909. doi: 10.1109/LGRS.2020.2988294.
- Li W, Wang J, Gao Y, Zhang M, Tao R, Zhang B. 2022. Graph-feature-enhanced selective assignment network for hyperspectral and multispectral data classification. IEEE Trans Geosci Remote Sensing. 60:1–14. doi: 10.1109/TGRS.2022.3166252.
- Li X, He H, Li X, Li D, Cheng G, Shi J, Weng L, Tong Y, Lin Z. 2021. Pointflow: flowing semantics through points for aerial image segmentation. The IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021; June 19–25, 2021.
- Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B. 2021. Swin transformer: hierarchical vision transformer using shifted windows. The IEEE/CVF International Conference on Computer Vision; October 2021; Montreal, BC, Canada. p. 11–17.
- Liu Z, Mao H, Wu C-Y, Feichtenhofer C, Darrell T, Xie S. 2022. A convnet for the 2020s. The IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- Long J, Shelhamer E, Darrell T. 2015. Fully convolutional networks for semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition; 7–12 June 2015; Boston, MA, USA.
- Milletari F, Navab N, Ahmadi S-A. 2016. V-net: fully convolutional neural networks for volumetric medical image segmentation. The 2016 Fourth International Conference on 3D Vision (3DV); 25–28 October 2016; Stanford, CA, USA: IEEE. doi: 10.1109/3DV.2016.79.
- Nair V, Hinton GE. 2010. Rectified linear units improve restricted Boltzmann machines. The 27th International Conference on Machine Learning (ICML-10).
- Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B. 2018. Attention u-net: learning where to look for the pancreas. arXiv:1804.03999.
- Ronneberger O, Fischer P, Brox T. 2015. U-net: convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference; October 5–9, 2015; Munich, Germany: Springer.
- Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C. 2018. Mobilenetv2: inverted residuals and linear bottlenecks. The IEEE conference on computer vision and pattern recognition.
- Simonyan K, Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.
- Sun W, Chen J, Yan L, Lin J, Pang Y, Zhang G. 2022. COVID-19 CT image segmentation method based on swin transformer. Front Physiol. 13:981463. doi: 10.3389/fphys.2022.981463.
- Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017; December 4–9, 2017; Long Beach, CA, USA.
- Wang J, Li W, Gao Y, Zhang M, Tao R, Du Q. 2023. Hyperspectral and SAR image classification via multiscale interactive fusion network. IEEE Trans Neural Netw Learn Syst. 34(12):10823–10837. doi: 10.1109/TNNLS.2022.3171572.
- Wang J, Li W, Zhang M, Chanussot J. 2023. Large kernel sparse ConvNet weighted by multi-frequency attention for remote sensing scene understanding. IEEE Trans Geosci Remote Sensing. 61:1–12. doi: 10.1109/TGRS.2023.3333401.
- Wang J, Li W, Zhang M, Tao R, Chanussot J. 2023. Remote sensing scene classification via multi-stage self-guided separation network. IEEE Trans Geosci Remote Sensing. 61:1–12. doi: 10.1109/TGRS.2023.3295797.
- Woo S, Park J, Lee J-Y, Kweon IS. 2018. Cbam: convolutional block attention module. The European conference on computer vision (ECCV), 06 October 2018.
- Wu H, Zhang J, Huang K, Liang K, Yu Y. 2019. Fastfcn: rethinking dilated convolution in the backbone for semantic segmentation. arXiv:1903.11816.
- Xiao X, Lian S, Luo Z, Li S. 2018. Weighted Res-UNet for high-quality retina vessel segmentation. 2018 9th International Conference on Information Technology in Medicine and Education (ITME); IEEE. doi: 10.1109/ITME.2018.00080.
- Xie S, Girshick R, Dollár P, Tu Z, He K. 2017. Aggregated residual transformations for deep neural networks. The IEEE Conference on Computer Vision and Pattern Recognition.
- Yang M, Yu K, Zhang C, Li Z, Yang K. 2018. Denseaspp for semantic segmentation in street scenes. The IEEE Conference on Computer Vision and Pattern Recognition.
- Yang Y, Hallman S, Ramanan D, Fowlkes CC. 2011. Layered object models for image segmentation. IEEE Trans Pattern Anal Mach Intell. 34(9):1731–1743. doi: 10.1109/TPAMI.2011.208.
- Zhang M, Li W, Zhao X, Liu H, Tao R, Du Q. 2023. Morphological transformation and spatial-logical aggregation for tree species classification using hyperspectral imagery. IEEE Trans Geosci Remote Sensing. 61:1–12. doi: 10.1109/TGRS.2022.3233847.
- Zhao H, Shi J, Qi X, Wang X, Jia J. 2017. Pyramid scene parsing network. The IEEE conference on computer vision and pattern recognition; 21–26 July 2017; Honolulu, HI, USA.