Research Article

FRIC: a framework for few-shot remote sensing image captioning

Article: 2337240 | Received 12 Jan 2024, Accepted 25 Mar 2024, Published online: 04 Apr 2024

ABSTRACT

Training image captioning (IC) models requires a large number of caption-labeled samples, a requirement that is rarely met in practical remote sensing scenarios; with only a few samples, model performance degrades. We characterize the few-shot problems in remote sensing image captioning (RC), design two research schemes, and propose a few-shot RC framework, the few-shot remote sensing image captioning framework (FRIC). FRIC requires no additional samples and builds on a simple base model; it seeks performance gains from split samples while reducing the negative effects of noise. Unlike previous works that use 100% of the samples to simulate few-shot scenarios, FRIC uses less than 1.0% of the data to simulate actual few-shot scenarios. Whereas previous works focus on improving the encoder, FRIC focuses on optimizing the decoder with parameter ensemble, multi-model ensemble and self-distillation. With limited caption-labeled samples, FRIC trains a simple base model to generate captions that meet human expectations. FRIC shows clear advantages over other methods when trained with only 0.8% of the samples in RC datasets; no previous work has trained an RC model with such a small amount of data. In addition, the effectiveness of the components of FRIC is verified with ablation experiments.
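To make the decoder-side techniques named above concrete, the sketch below illustrates generic forms of parameter ensemble (averaging decoder checkpoint weights), multi-model ensemble (averaging per-step logits), and self-distillation (a KL term against an ensembled teacher) in PyTorch. This is a minimal illustration under assumed shapes and hyperparameters, not the authors' implementation; all function names and arguments are hypothetical.

```python
# Generic sketch of decoder-side parameter ensemble, multi-model ensemble,
# and self-distillation. Illustrative only; not the FRIC implementation.
import copy
import torch
import torch.nn.functional as F


def parameter_ensemble(state_dicts):
    """Average the weights of several decoder checkpoints (parameter ensemble)."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg


def multi_model_ensemble_logits(decoders, features, tokens):
    """Average the per-step vocabulary logits of several decoders (multi-model ensemble)."""
    with torch.no_grad():
        logits = torch.stack([d(features, tokens) for d in decoders])  # (M, B, T, V)
    return logits.mean(dim=0)


def self_distillation_loss(student_logits, teacher_logits, targets,
                           temperature=2.0, alpha=0.5):
    """Cross-entropy on the ground-truth captions plus a temperature-scaled
    KL divergence toward the (ensembled) teacher distribution."""
    ce = F.cross_entropy(student_logits.reshape(-1, student_logits.size(-1)),
                         targets.reshape(-1), ignore_index=0)
    kl = F.kl_div(F.log_softmax(student_logits / temperature, dim=-1),
                  F.softmax(teacher_logits / temperature, dim=-1),
                  reduction="batchmean") * temperature ** 2
    return alpha * ce + (1.0 - alpha) * kl
```

In this reading, the averaged or ensembled teacher supplies soft targets for the student decoder, which is one common way to combine checkpoint averaging with self-distillation when labeled captions are scarce.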

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The UCM-Captions dataset, Sydney-Captions dataset and RSICD dataset can be found at https://github.com/201528014227051/RSICD_optima. The MASATI dataset can be found at https://www.iuii.ua.es/datasets/masati/index.html.