Research Article

A method for building extraction in remote sensing images based on swintransformer

Article: 2353113 | Received 06 Dec 2023, Accepted 04 May 2024, Published online: 15 May 2024
