Research Article

VBNet: A Visually-Aware Biomimetic Network for Simulating the Human Eye’s Visual System

Article: 2335100 | Received 22 Oct 2023, Accepted 21 Mar 2024, Published online: 01 Apr 2024
