Review Article

UAV image matching from handcrafted to deep local features

Article: 2307619 | Received 25 Sep 2023, Accepted 16 Jan 2024, Published online: 21 Feb 2024

References

  • Arandjelović, R., & Zisserman, A. (2012). Three things everyone should know to improve object retrieval. In 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 2911–23). IEEE.
  • Balntas, V., Johns, E., Tang, L., & Mikolajczyk, K. (2016). PN-Net: Conjoined triple deep network for learning local image descriptors. ArXiv. Preprint ArXiv:1601.05030. https://doi.org/10.48550/arXiv.1601.05030
  • Balntas, V., Riba, E., Ponsa, D., & Mikolajczyk, K. (2016). Learning local feature descriptors with triplets and shallow convolutional neural networks. In British Machine Vision Conference 2016, 1, 3.
  • Bay, H., Tuytelaars, T., & Van Gool, L. (2006). Surf: Speeded up robust features. In Computer Vision–ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7-13, 2006. Proceedings, Part I 9 (pp. 404–417). Springer.
  • Bhowmik, A., Gumhold, S., Rother, C., & Brachmann, E. (2020). Reinforced feature points: Optimizing feature detection and description for a high-level task. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4948–4957).
  • Chiabrando, F., D’Andria, F., Sammartano, G., & Spanò, A. (2018). UAV photogrammetry for archaeological site survey. 3D models at the Hierapolis in Phrygia (Turkey). Virtual Archaeology Review, 9(18), 28–43. https://doi.org/10.4995/var.2018.5958
  • Chopra, S., Hadsell, R., & LeCun, Y. (2005). Learning a similarity metric discriminatively, with application to face verification. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) (Vol. 1, pp. 539–546). IEEE.
  • Daakir, M., Pierrot-Deseilligny, M., Bosser, P., Pichard, F., Thom, C., Rabot, Y., & Martin, O. (2017). Lightweight UAV with on-board photogrammetry and single-frequency GPS positioning for metrology applications. ISPRS Journal of Photogrammetry and Remote Sensing, 127, 115–126. https://doi.org/10.1016/j.isprsjprs.2016.12.007
  • DeTone, D., Malisiewicz, T., & Rabinovich, A. (2018). Superpoint: Self-supervised interest point detection and description. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 224–236).
  • Dong, J., & Soatto, S. (2015). Domain-size pooling in local descriptors: DSP-SIFT. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5097–5106).
  • Habib, A., Han, Y., Xiong, W., He, F., Zhang, Z., & Crawford, M. (2016). Automated ortho-rectification of UAV-based hyperspectral data over an agricultural field using frame RGB imagery. Remote Sensing, 8(10), 796.
  • Han, X., Leung, T., Jia, Y., Sukthankar, R., & Berg, A. C. (2015). Matchnet: Unifying feature and metric learning for patch-based matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3279–3286).
  • Harris, C., & Stephens, M. (1988). A combined corner and edge detector. In Alvey Vision Conference (pp. 147–151). Citeseer.
  • He, K., Zhang, X., Ren, S., & Sun, J. (n.d.). Deep residual learning for image recognition. Retrieved from http://image-net.org/challenges/LSVRC/2015/
  • Huang, P.-H., Matzen, K., Kopf, J., Ahuja, N., & Huang, J.-B. (2018). Deepmvs: Learning multi-view stereopsis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2821–2830).
  • Hu, X., & Mordohai, P. (2012). Least commitment, viewpoint-based, multi-view stereo. Proceedings of the 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission (pp. 531–538). IEEE.
  • Jaderberg, M., Simonyan, K., & Zisserman, A. (2015). Spatial transformer networks. Advances in Neural Information Processing Systems (1506), 28. https://doi.org/10.48550/arXiv.1506.02025
  • Jegou, H., Douze, M., & Schmid, C. (2008). Hamming embedding and weak geometric consistency for large scale image search. Computer Vision–ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part I 10 (pp. 304–317). Springer.
  • Jiang, S., & Jiang, W. (2017). Efficient structure from motion for oblique UAV images based on maximal spanning tree expansion. ISPRS Journal of Photogrammetry and Remote Sensing, 132(09), 140–161. https://doi.org/10.1016/j.isprsjprs.2017.09.004
  • Jiang, S., Jiang, W., Huang, W., & Yang, L. (2017). UAV-based oblique photogrammetry for outdoor data acquisition and offsite visual inspection of transmission line. Remote Sensing, 9(3), 278. https://doi.org/10.3390/rs9030278
  • Jiang, W., Song, Y., Leung, T., Rosenberg, C., & Wang, J. (2014). Learning fine-grained image similarity with deep ranking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1386–1393). CVPR.
  • Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K. M., & Trulls, E. (2021). Image matching across wide baselines: From paper to practice. International Journal of Computer Vision, 129(2), 517–547. https://doi.org/10.1007/s11263-020-01385-0
  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. https://doi.org/10.1145/3065386
  • Li, F., Zhang, H., Xu, H., Liu, S., Zhang, L., Ni, L. M., & Shum, H.-Y. (n.d.). Mask DINO: Towards a unified transformer-based framework for object detection and segmentation. Retrieved from https://github.com/IDEA-
  • Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94
  • Luo, Z., Shen, T., Zhou, L., Zhang, J., & Yao, Y. (2019). Contextdesc: Local descriptor augmentation with cross-modality context. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2527–2536).
  • Luo, Z., Zhou, L., Bai, X., Chen, H., & Zhang, J. (2020). Aslfeat: Learning local features of accurate shape and localization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6589–6598).
  • Mirza, M., & Osindero, S. (2014). Conditional generative adversarial nets. ArXiv. Preprint ArXiv:1411.1784. https://doi.org/10.48550/arXiv.1411.1784
  • Mishchuk, A., Mishkin, D., Radenovic, F., & Matas, J. (2017). Working hard to know your neighbor’s margins: Local descriptor learning loss. Advances in Neural Information Processing Systems (1705), 30. https://doi.org/10.48550/arXiv.1705.10872
  • Ono, Y., Trulls, E., Fua, P., & Yi, K. M. (2018). LF-Net: Learning local features from images. Advances in Neural Information Processing Systems (1085), 31. https://doi.org/10.48550/arXiv.1805.09662
  • Pajares, G. (2015). Overview and current status of remote sensing applications based on unmanned aerial vehicles (UAVs). Photogrammetric Engineering & Remote Sensing, 81(4), 281–330. https://doi.org/10.14358/PERS.81.4.281
  • Philbin, J., Chum, O., Isard, M., Sivic, J., & Zisserman, A. (2007). Object retrieval with large vocabularies and fast spatial matching. Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.
  • Radenovic, F., Schonberger, J. L., Ji, D., Frahm, J.-M., Chum, O., & Matas, J. (2016). From dusk till dawn: Modeling in the dark. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5488–5496).
  • Rottensteiner, F. (n.d.). ISPRS test project on urban classification and 3D building reconstruction: Evaluation of building reconstruction results.
  • Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011). ORB: An efficient alternative to SIFT or SURF. Proceedings of the 2011 International Conference on Computer Vision (pp. 2564–2571). IEEE.
  • Sarlin, P.-E., DeTone, D., Malisiewicz, T., & Rabinovich, A. (2020). SuperGlue: Learning feature matching with graph neural networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/cvpr42600.2020.00499
  • Sattler, T., Havlena, M., Schindler, K., & Pollefeys, M. (2016). Large-scale location recognition and the geometric burstiness problem. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1582–1590).
  • Sattler, T., Leibe, B., & Kobbelt, L. (2011). Fast image-based localization using direct 2d-to-3d matching. Proceedings of the 2011 International Conference on Computer Vision (pp. 667–674). IEEE.
  • Schonberger, J. L., & Frahm, J.-M. (2016). Structure-from-motion revisited. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4104–4113).
  • Schonberger, J. L., Hardmeier, H., Sattler, T., & Pollefeys, M. (2017). Comparative evaluation of hand-crafted and learned local features. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1482–1491).
  • Schonberger, J. L., Radenovic, F., Chum, O., & Frahm, J.-M. (2015). From single image query to detailed 3d reconstruction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5126–5134).
  • Schönberger, J. L., Zheng, E., Frahm, J.-M., & Pollefeys, M. (2016). Pixelwise view selection for unstructured multi-view stereo. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part III 14 (pp. 501–518). Springer.
  • Sedaghat, A., & Ebadi, H. (2015). Remote sensing image matching based on adaptive binning SIFT descriptor. IEEE Transactions on Geoscience and Remote Sensing, 53(10), 5283–5293. https://doi.org/10.1109/TGRS.2015.2420659
  • Sharif Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: An astounding baseline for recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 806–813).
  • Shen, X., Wang, C., Li, X., Yu, Z., & Li, J. (2019). RF-Net: An end-to-end image matching network based on receptive field. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8132–8140).
  • Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., & Moreno-Noguer, F. (2015). Discriminative learning of deep convolutional feature point descriptors. Proceedings of the IEEE International Conference on Computer Vision (pp. 118–126).
  • Sun, Y., Sun, H., Yan, L., Fan, S., & Chen, R. (2016). RBA: Reduced Bundle Adjustment for oblique aerial photogrammetry. ISPRS Journal of Photogrammetry and Remote Sensing, 121, 128–142. https://doi.org/10.1016/j.isprsjprs.2016.09.005
  • Tareen, S. A. K., & Saleem, Z. (2018). A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. In 2018 International Conference on Computing, Mathematics and Engineering Technologies: Invent, Innovate and Integrate for Socioeconomic Development, iCoMET 2018 - Proceedings (Vol. 2018-January, pp. 1–10). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICOMET.2018.8346440
  • Tian, Y., Fan, B., & Wu, F. (2017). L2-net: Deep learning of discriminative patch descriptor in euclidean space. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 661–669).
  • Tola, E., Lepetit, V., & Fua, P. (2009). Daisy: An efficient dense descriptor applied to wide-baseline stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 815–830. https://doi.org/10.1109/TPAMI.2009.77
  • Tolias, G., Avrithis, Y., & Jégou, H. (2016). Image search with selective match kernels: Aggregation across single and multiple images. International Journal of Computer Vision, 116(3), 247–261. https://doi.org/10.1007/s11263-015-0810-4
  • Torii, A., Arandjelovic, R., Sivic, J., Okutomi, M., & Pajdla, T. (2015). 24/7 place recognition by view synthesis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1808–1817).
  • Vijayanarasimhan, S., Ricco, S., Schmid, C., Sukthankar, R., & Fragkiadaki, K. (2017). Sfm-net: Learning of structure and motion from video. ArXiv. Preprint ArXiv:1704.07804. https://doi.org/10.48550/arXiv.1704.07804
  • Wang, J., Zhou, F., Wen, S., Liu, X., & Lin, Y. (2017). Deep metric learning with angular loss. Proceedings of the IEEE International Conference on Computer Vision (pp. 2593–2601).
  • Wei, X., Zhang, Y., Gong, Y., & Zheng, N. (2018). Kernelized subspace pooling for deep local descriptors. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1867–1875).
  • Yi, K. M., Trulls, E., Lepetit, V., & Fua, P. (2016). LIFT: Learned invariant feature transform. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VI 14 (pp. 467–483). Springer International Publishing.
  • Yi, K. M., Verdie, Y., Fua, P., & Lepetit, V. (2016). Learning to assign orientations to feature points. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 107–116).
  • Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4353–4361).
  • Zeisl, B., Sattler, T., & Pollefeys, M. (2015). Camera pose voting for large-scale image-based localization. Proceedings of the IEEE International Conference on Computer Vision (pp. 2704–2712).
  • Zhu, Q., Wang, Z., Hu, H., Xie, L., Ge, X., & Zhang, Y. (2020). Leveraging photogrammetric mesh models for aerial-ground feature point matching toward integrated 3D reconstruction. ISPRS Journal of Photogrammetry and Remote Sensing, 166, 26–40. https://doi.org/10.1016/j.isprsjprs.2020.05.024