CMPF-UNet: a ConvNeXt multi-scale pyramid fusion U-shaped network for multi-category segmentation of remote sensing images

Ning Li, Jilin Provincial Key Laboratory for Numerical Simulation, Jilin Normal University, Siping, China

Xiaopeng Yu, Jilin Provincial Key Laboratory for Numerical Simulation, Jilin Normal University, Siping, China

Miao Yu (corresponding author), Jilin Provincial Key Laboratory for Numerical Simulation, Jilin Normal University, Siping, China

Article: 2311217 | Received 11 Oct 2023, Accepted 23 Jan 2024, Published online: 14 Feb 2024

https://doi.org/10.1080/10106049.2024.2311217

References

Ba JL, Kiros JR, Hinton GE. 2016. Layer normalization. arXiv:1607.06450.
Badrinarayanan V, Kendall A, Cipolla R. 2017. Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 39(12):2481–2495. doi: 10.1109/TPAMI.2016.2644615.
Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q, Wang M. 2023. Swin-unet: unet-like pure transformer for medical image segmentation. Computer Vision–ECCV 2022 Workshops; October 23–27, 2022; Tel Aviv, Israel: Springer.
Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation. The European Conference on Computer Vision (ECCV); 8–14 September 2018; Munich, Germany.
Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou YJ. 2021. Transunet: transformers make strong encoders for medical image segmentation. arXiv:2102.04306.
Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL. 2018. Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell. 40(4):834–848. doi: 10.1109/TPAMI.2017.2699184.
Chen L-C, Papandreou G, Schroff F, Adam H. 2017. Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587.
Chen F, Wang N, Yu B, Wang L. 2022. Res2-Unet, a new deep architecture for building detection from high spatial resolution images. IEEE J Sel Top Appl Earth Observations Remote Sensing. 15:1494–1501. doi: 10.1109/JSTARS.2022.3146430.
Dowden B, De Silva O, Huang W, Oldford D. 2021. Sea ice classification via deep neural network semantic segmentation. IEEE Sensors J. 21(10):11879–11888. doi: 10.1109/JSEN.2020.3031475.
Feng S, Zhao H, Shi F, Cheng X, Wang M, Ma Y, Xiang D, Zhu W, Chen X. 2020. CPFNet: context pyramid fusion network for medical image segmentation. IEEE Trans Med Imaging. 39(10):3008–3018. doi: 10.1109/TMI.2020.2983721.
Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H. 2019. Dual attention network for scene segmentation. The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 15–20 June 2019; Long Beach, CA, USA.
Guan S, Khan AA, Sikdar S, Chitnis PV. 2019. Fully dense UNet for 2-D sparse photoacoustic tomography artifact removal. IEEE J Biomed Health Inform. 24(2):568–576. doi: 10.1109/JBHI.2019.2912935.
He K, Zhang X, Ren S, Sun J. 2016. Deep residual learning for image recognition. The IEEE conference on computer vision and pattern recognition; 27–30 June 2016; Las Vegas, NV, USA.
He X, Zhou Y, Zhao J, Zhang D, Yao R, Xue Y. 2022. Swin transformer embedding UNet for remote sensing image semantic segmentation. IEEE Trans Geosci Remote Sensing. 60:1–15. doi: 10.1109/TGRS.2022.3144165.
Hendrycks D, Gimpel K. 2016. Gaussian error linear units (gelus). arXiv:1606.08415.
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. 2017. Densely connected convolutional networks. The IEEE Conference on Computer Vision and Pattern Recognition.
Huang X, Zhang L, Gong W. 2011. Information fusion of aerial images and LIDAR data in urban areas: vector-stacking, re-classification and post-processing approaches. Int J Remote Sens. 32(1):69–84. doi: 10.1080/01431160903439882.
Ibtehaz N, Rahman MS. 2020. MultiResUNet: rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw. 121:74–87. doi: 10.1016/j.neunet.2019.08.025.
Ioffe S. 2017. Batch renormalization: towards reducing minibatch dependence in batch-normalized models. Adv Neural Info Processing Sys. 30.
Kampffmeyer M, Salberg A-B, Jenssen R. 2018. Urban land cover classification with missing data modalities using deep convolutional neural networks. IEEE J Sel Top Appl Earth Observations Remote Sensing. 11(6):1758–1768. doi: 10.1109/JSTARS.2018.2834961.
Li H, Qiu K, Chen L, Mei X, Hong L, Tao C. 2021. SCAttNet: semantic segmentation network with spatial and channel attention mechanism for high-resolution remote sensing images. IEEE Geosci Remote Sensing Lett. 18(5):905–909. doi: 10.1109/LGRS.2020.2988294.
Li W, Wang J, Gao Y, Zhang M, Tao R, Zhang B. 2022. Graph-feature-enhanced selective assignment network for hyperspectral and multispectral data classification. IEEE Trans Geosci Remote Sensing. 60:1–14. doi: 10.1109/TGRS.2022.3166252.
Li X, He H, Li X, Li D, Cheng G, Shi J, Weng L, Tong Y, Lin Z. 2021. Pointflow: flowing semantics through points for aerial image segmentation. The IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021; June 19–25, 2021.
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B. 2021. Swin transformer: hierarchical vision transformer using shifted windows. The IEEE/CVF International Conference on Computer Vision; October 2021; Montreal, QC, Canada. p. 11–17.
Liu Z, Mao H, Wu C-Y, Feichtenhofer C, Darrell T, Xie S. 2022. A convnet for the 2020s. The IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Long J, Shelhamer E, Darrell T. 2015. Fully convolutional networks for semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition; 7–12 June 2015; Boston, MA, USA.
Milletari F, Navab N, Ahmadi S-A. 2016. V-net: fully convolutional neural networks for volumetric medical image segmentation. The 2016 Fourth International Conference on 3D Vision (3DV); 25–28 October 2016; Stanford, CA, USA: IEEE. doi: 10.1109/3DV.2016.79.
Nair V, Hinton GE. 2010. Rectified linear units improve restricted Boltzmann machines. The 27th International Conference on Machine Learning (ICML-10).
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B. 2018. Attention u-net: learning where to look for the pancreas. arXiv:1804.03999.
Ronneberger O, Fischer P, Brox T. 2015. U-net: convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference; October 5–9, 2015; Munich, Germany: Springer.
Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C. 2018. Mobilenetv2: inverted residuals and linear bottlenecks. The IEEE conference on computer vision and pattern recognition.
Simonyan K, Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.
Sun W, Chen J, Yan L, Lin J, Pang Y, Zhang G. 2022. COVID-19 CT image segmentation method based on swin transformer. Front Physiol. 13:981463. doi: 10.3389/fphys.2022.981463.
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017; December 4–9, 2017; Long Beach, CA, USA.
Wang J, Li W, Gao Y, Zhang M, Tao R, Du Q. 2023. Hyperspectral and SAR image classification via multiscale interactive fusion network. IEEE Trans Neural Netw Learn Syst. 34(12):10823–10837. doi: 10.1109/TNNLS.2022.3171572.
Wang J, Li W, Zhang M, Chanussot J. 2023. Large kernel sparse ConvNet weighted by multi-frequency attention for remote sensing scene understanding. IEEE Trans Geosci Remote Sensing. 61:1–12. doi: 10.1109/TGRS.2023.3333401.
Wang J, Li W, Zhang M, Tao R, Chanussot J. 2023. Remote sensing scene classification via multi-stage self-guided separation network. IEEE Trans Geosci Remote Sensing. 61:1–12. doi: 10.1109/TGRS.2023.3295797.
Woo S, Park J, Lee J-Y, Kweon IS. 2018. Cbam: convolutional block attention module. The European conference on computer vision (ECCV), 06 October 2018.
Wu H, Zhang J, Huang K, Liang K, Yu Y. 2019. Fastfcn: rethinking dilated convolution in the backbone for semantic segmentation. arXiv:1903.11816.
Xiao X, Lian S, Luo Z, Li S. 2018. Weighted Res-UNet for high-quality retina vessel segmentation. 2018 9th International Conference on Information Technology in Medicine and Education (ITME); IEEE. doi: 10.1109/ITME.2018.00080.
Xie S, Girshick R, Dollár P, Tu Z, He K. 2017. Aggregated residual transformations for deep neural networks. The IEEE Conference on Computer Vision and Pattern Recognition.
Yang M, Yu K, Zhang C, Li Z, Yang K. 2018. Denseaspp for semantic segmentation in street scenes. The IEEE Conference on Computer Vision and Pattern Recognition.
Yang Y, Hallman S, Ramanan D, Fowlkes CC. 2011. Layered object models for image segmentation. IEEE Trans Pattern Anal Mach Intell. 34(9):1731–1743. doi: 10.1109/TPAMI.2011.208.
Zhang M, Li W, Zhao X, Liu H, Tao R, Du Q. 2023. Morphological transformation and spatial-logical aggregation for tree species classification using hyperspectral imagery. IEEE Trans Geosci Remote Sensing. 61:1–12. doi: 10.1109/TGRS.2022.3233847.
Zhao H, Shi J, Qi X, Wang X, Jia J. 2017. Pyramid scene parsing network. The IEEE conference on computer vision and pattern recognition; 21–26 July 2017; Honolulu, HI, USA.
