International Journal of Digital Earth Volume 17, 2024 - Issue 1

Open access

Research Article

A method for building extraction in remote sensing images based on swintransformer

Weidong Zhu (a, b, c), Xiaolong Zhu (a, corresponding author), Naiying He (a, b), Yuelin Xu (a), Tiantian Cao (a), Yifei Li (a), Yanying Huang (a)

(a) School of Marine Science and Ecological Environment, Shanghai Ocean University, Shanghai, People’s Republic of China
(b) Shanghai Estuary Marine Surveying and Mapping Engineering Technology Research Center, Shanghai, People’s Republic of China
(c) Key Laboratory of Marine Ecological Monitoring and Restoration Technologies, Shanghai, People’s Republic of China

Article: 2353113 | Received 06 Dec 2023, Accepted 04 May 2024, Published online: 15 May 2024

https://doi.org/10.1080/17538947.2024.2353113

References

Cao, Hu, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian, and Manning Wang. 2022. “Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation.” Paper presented at the European Conference on Computer Vision.
Chen, Jieneng, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L Yuille, and Yuyin Zhou. 2021. “TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation.” arXiv preprint arXiv:2102.04306.
Chen, Liang-Chieh, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. 2018. “Encoder-decoder with Atrous Separable Convolution for Semantic Image Segmentation.” Paper presented at the Proceedings of the European Conference on Computer Vision (ECCV).
Chollet, François. 2017. “Xception: Deep Learning with Depthwise Separable Convolutions.” Paper presented at the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Dai, Jifeng, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, and Yichen Wei. 2017. “Deformable Convolutional Networks.” Paper presented at the Proceedings of the IEEE International Conference on Computer Vision.
Deng, Wenjing, Qian Shi, and Jun Li. 2021a. “Attention-gate-based Encoder–Decoder Network for Automatical Building Extraction.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. https://doi.org/10.1109/jstars.2021.3058097.
Deng, Wenjing, Qian Shi, and Jun Li. 2021b. “Attention-gate-based Encoder–Decoder Network for Automatical Building Extraction.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14:2611–2620. https://doi.org/10.1109/JSTARS.2021.3058097.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv preprint arXiv:1810.04805.
Ding, Xiaohan, Xiangyu Zhang, Jungong Han, and Guiguang Ding. 2022. “Scaling up Your Kernels to 31 × 31: Revisiting Large Kernel Design in CNNs.” Paper presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Du, Shihong, Fangli Zhang, and Xiuyuan Zhang. 2015. “Semantic Classification of Urban Buildings Combining VHR Image and GIS Data: An Improved Random Forest Approach.” ISPRS Journal of Photogrammetry and Remote Sensing 105:107–119. https://doi.org/10.1016/j.isprsjprs.2015.03.011.
Duan, Meimei, Lijuan Duan, and Bai Yuan Ding. 2021. “High Spatial Resolution Remote Sensing Data Classification Method Based on Spectrum Sharing.” Scientific Programming 2021:1–12. https://doi.org/10.1155/2021/4356957.
Graham, Benjamin, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, and Matthijs Douze. 2021. “LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference.” Paper presented at the Proceedings of the IEEE/CVF International Conference on Computer Vision.
Howard, Andrew G, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.” arXiv preprint arXiv:1704.04861.
Huang, Yanbo, Zhong-xin Chen, Tao Yu, Xiang-zhi Huang, and Xing-fa Gu. 2018. “Agricultural Remote Sensing Big Data: Management and Applications.” Journal of Integrative Agriculture 17 (9): 1915–1931. https://doi.org/10.1016/S2095-3119(17)61859-8.
Ji, Shunping, Shiqing Wei, and Meng Lu. 2018. “Fully Convolutional Networks for Multisource Building Extraction from an Open Aerial and Satellite Imagery Data Set.” IEEE Transactions on Geoscience and Remote Sensing 57 (1): 574–586.
Kornblith, Simon, Jonathon Shlens, and Quoc V Le. 2019. “Do Better ImageNet Models Transfer Better?” Paper presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012. “ImageNet Classification with Deep Convolutional Neural Networks.” Advances in Neural Information Processing Systems 25: 1–8.
Li, Er, John Femiani, Shibiao Xu, Xiaopeng Zhang, and Peter Wonka. 2015. “Robust Rooftop Extraction from Visible Band Images Using Higher Order CRF.” IEEE Transactions on Geoscience and Remote Sensing. https://doi.org/10.1109/tgrs.2015.2400462.
Li, Rui, Shunyi Zheng, Ce Zhang, Chenxi Duan, Jianlin Su, Libo Wang, and Peter M Atkinson. 2021. “Multiattention Network for Semantic Segmentation of Fine-resolution Remote Sensing Images.” IEEE Transactions on Geoscience and Remote Sensing 60:1–13.
Liegang, Xia, Mi Shulin, Zhang Junxia, Luo Jiancheng, Shen Zhanfeng, and Cheng Yubin. 2023. “Dual-stream Feature Extraction Network Based on CNN and Transformer for Building Extraction.” Remote Sensing 15:2689. https://doi.org/10.3390/rs15102689.
Linhui, Li, Jing Weipeng, and Wang Huihui. 2021. “Extracting the Forest Type from Remote Sensing Images by Random Forest.” IEEE Sensors Journal 21 (16): 17447–17454. https://doi.org/10.1109/JSEN.2020.3045501.
Liu, Yaohui, Lutz Gross, Zhiqiang Li, Xiaoli Li, Xiwei Fan, and Wenhua Qi. 2019. “Automatic Building Extraction on High-resolution Remote Sensing Imagery Using Deep Convolutional Encoder-Decoder with Spatial Pyramid Pooling.” IEEE Access 7:128774–128786. https://doi.org/10.1109/ACCESS.2019.2940527.
Liu, Ze, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. “Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows.” Paper presented at the Proceedings of the IEEE/CVF International Conference on Computer Vision.
Long, Jonathan, Evan Shelhamer, and Trevor Darrell. 2015. “Fully Convolutional Networks for Semantic Segmentation.” Paper presented at the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Maggiori, E., Y. Tarabalka, G. Charpiat, and P. Alliez. 2017. “Can Semantic Labeling Methods Generalize to Any City? The Inria Aerial Image Labeling Benchmark.” Paper presented at the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 23–28 July 2017.
Mnih, Volodymyr. 2013. Machine Learning for Aerial Image Labeling. Ph.D. thesis, Toronto, ON, Canada: University of Toronto. https://www.cs.toronto.edu/~vmnih/docs/Mnih_Volodymyr_PhD_Thesis.pdf.
Raffel, Colin, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2020. “Exploring the Limits of Transfer Learning with a Unified Text-to-text Transformer.” The Journal of Machine Learning Research 21 (1): 5485–5551.
Razaque, Abdul, Mohamed Ben Haj Frej, Muder Almi’ani, Munif Alotaibi, and Bandar Alotaibi. 2021. “Improved Support Vector Machine Enabled Radial Basis Function and Linear Variants for Remote Sensing Image Classification.” Sensors 21 (13): 4431. https://doi.org/10.3390/s21134431.
Ren, Sucheng, Daquan Zhou, Shengfeng He, Jiashi Feng, and Xinchao Wang. 2022. “Shunted Self-attention via Multi-scale Token Aggregation.” Paper presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Renhe, Zhang, Zhang Qian, and Zhang Guixu. 2023. “SDSC-UNet: Dual Skip Connection ViT-Based U-Shaped Model for Building Extraction.” IEEE Geoscience and Remote Sensing Letters 20:1–5. https://doi.org/10.1109/lgrs.2023.3270303.
Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. 2015. “U-Net: Convolutional Networks for Biomedical Image Segmentation.” Paper presented at the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18.
Simonyan, Karen, and Andrew Zisserman. 2014. “Very Deep Convolutional Networks for Large-scale Image Recognition.” arXiv preprint arXiv:1409.1556.
Sun, Shuting, Lin Mu, Lizhe Wang, Peng Liu, Xiaolei Liu, and Yuwei Zhang. 2021. “Semantic Segmentation for Buildings of Large Intra-Class Variation in Remote Sensing Images with O-GAN.” Remote Sensing 13 (3): 475. https://doi.org/10.3390/rs13030475.
Tian, Qinglin, Yingjun Zhao, Kai Qin, Yao Li, and Xuejiao Chen. 2021. “Dense Feature Pyramid Fusion Deep Network for Building Segmentation in Remote Sensing Image.” Paper presented at the Seventh Symposium on Novel Photoelectronic Detection Technology and Applications.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention is All You Need.” Advances in Neural Information Processing Systems 30: 1–7.
Wang, Panqu, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell. 2018. “Understanding Convolution for Semantic Segmentation.” Paper presented at the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
Wang, Hao, Xiaolei Lv, Kaiyu Zhang, and Bin Guo. 2022. “Building Change Detection Based on 3D Co-Segmentation Using Satellite Stereo Imagery.” Remote Sensing 14 (3): 628. https://doi.org/10.3390/rs14030628.
Wang, Shuyang, Xiaodong Mu, Dongfang Yang, Hao He, and Peng Zhao. 2021. “Road Extraction from Remote Sensing Images Using the Inner Convolution Integrated Encoder-decoder Network and Directional Conditional Random Fields.” Remote Sensing 13 (3): 465. https://doi.org/10.3390/rs13030465.
Wang, Wei, and Zhiguo Qu. 2022. “Design of Public Building Space in Smart City Based on Big Data.” Journal of Environmental and Public Health 2022:1–10. https://doi.org/10.1155/2022/4733901.
Wang, Mengqi, Yinglin Wang, Bozhao Li, Zhongliang Cai, and Mengjun Kang. 2022. “A Population Spatialization Model at the Building Scale Using Random Forest.” Remote Sensing 14 (8): 1811. https://doi.org/10.3390/rs14081811.
Wang, Wenhai, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, and Ling Shao. 2021. “Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions.” Paper presented at the Proceedings of the IEEE/CVF International Conference on Computer Vision.
Wu, Guangming, Xiaowei Shao, Zhiling Guo, Qi Chen, Wei Yuan, Xiaodan Shi, Yongwei Xu, and Ryosuke Shibasaki. 2018. “Automatic Building Segmentation of Aerial Imagery Using Multi-constraint Fully Convolutional Networks.” Remote Sensing 10 (3): 407. https://doi.org/10.3390/rs10030407.
Xia, Liegang, Shulin Mi, Junxia Zhang, Jiancheng Luo, Zhanfeng Shen, and Yubin Cheng. 2023. “Dual-stream Feature Extraction Network Based on CNN and Transformer for Building Extraction.” Remote Sensing 15 (10): 2689. https://doi.org/10.3390/rs15102689.
Xiao, Xiao, Guo Wenliang, Chen Rui, Hui Yilong, Wang Jianing, and Zhao Hongyu. 2022. “A Swin Transformer-based Encoding Booster Integrated in U-Shaped Network for Building Extraction.” Remote Sensing 14 (11): 2611. https://doi.org/10.3390/rs14112611.
Xu, Lele, Ye Li, Jinzhong Xu, Yue Zhang, and Lili Guo. 2023. “BCTNet: Bi-branch Cross-fusion Transformer for Building Footprint Extraction.” IEEE Transactions on Geoscience and Remote Sensing 61:1–14.
Yang, Zhilin, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R Salakhutdinov, and Quoc V Le. 2019. “XLNet: Generalized Autoregressive Pretraining for Language Understanding.” Advances in Neural Information Processing Systems 32: 1–9.
Yang, Lingxiao, Ru-Yuan Zhang, Lida Li, and Xiaohua Xie. 2021. “SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks.” In Proceedings of the 38th International Conference on Machine Learning, edited by Marina Meila and Tong Zhang, 11863–11874. Virtual Event: Proceedings of Machine Learning Research, PMLR. http://proceedings.mlr.press/v139/yang21o.html.
Yu, Fisher, Vladlen Koltun, and Thomas Funkhouser. 2017. “Dilated Residual Networks.” Paper presented at the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Yuan, Wei, Jin Wang, and Wenbo Xu. 2022. “Shift Pooling PSPNet: Rethinking PSPNet for Building Extraction in Remote Sensing Images from Entire Local Feature Pooling.” Remote Sensing 14 (19): 4889. https://doi.org/10.3390/rs14194889.
Zhang, Hu, Keke Zu, Jian Lu, Yuru Zou, and Deyu Meng. 2022. “EPSANet: An Efficient Pyramid Squeeze Attention Block on Convolutional Neural Network.” Paper presented at the Proceedings of the Asian Conference on Computer Vision.
Zhao, Hengshuang, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. 2017. “Pyramid Scene Parsing Network.” Paper presented at the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhu, Lei, Xinjiang Wang, Zhanghan Ke, Wayne Zhang, and Rynson WH Lau. 2023. “BiFormer: Vision Transformer with Bi-Level Routing Attention.” Paper presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
