
Visual Question Answering of Remote Sensing Image Based on Attention Mechanism

EasyChair Preprint no. 8373

11 pages. Date: July 2, 2022


In recent years, research on attention mechanisms has made significant progress in computer vision. In visual question answering on remote sensing images, an attention mechanism lets the model focus on important image regions and improves answer accuracy. Our research focuses on the role of synergistic attention mechanisms in the interaction between question representations and visual representations. Building on Modular Collaborative Attention (MCA), and exploiting the complementary characteristics of global and local features, we use a hybrid connection strategy to perceive global features without weakening the attention distribution over local features. The impact of the attention mechanisms is evaluated on four types of visual question answering questions: (i) scene classification, (ii) object comparison, (iii) quantitative statistics, and (iv) relational judgment. By fusing the global and local features of different modalities, the model can obtain more information between modalities. Model performance is evaluated on the RSVQA-LR dataset. Experimental results show that the method in this paper improves global accuracy by 9.81% over RSVQA.
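
As a rough illustration of the idea summarized above, the following sketch shows question-guided attention over local image-region features, with a mean-pooled global feature added alongside the attended result. It is a minimal sketch under stated assumptions, not the authors' model: the function names, the mean-pooling choice for the global feature, and the additive fusion are all hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def question_guided_attention(question_vec, region_feats):
    """Scaled dot-product attention of one question vector over image regions."""
    scale = math.sqrt(len(question_vec))
    weights = softmax([dot(question_vec, r) / scale for r in region_feats])
    dim = len(region_feats[0])
    attended = [sum(w * r[i] for w, r in zip(weights, region_feats))
                for i in range(dim)]
    return attended, weights

def fuse_global_local(question_vec, region_feats):
    """Hypothetical fusion: attended local feature plus mean-pooled global feature."""
    local, weights = question_guided_attention(question_vec, region_feats)
    n = len(region_feats)
    global_feat = [sum(r[i] for r in region_feats) / n for i in range(len(local))]
    # The global feature is added next to the attended feature, so the
    # attention weights over the local regions are left unchanged.
    fused = [l + g for l, g in zip(local, global_feat)]
    return fused, weights

# Toy inputs (made up for illustration): one 2-d question vector, three regions.
question = [1.0, 0.0]
regions = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
fused, weights = fuse_global_local(question, regions)
```

In this toy setup the first region aligns with the question vector and therefore receives the largest attention weight, while the global term contributes the same amount regardless of how the weights are distributed.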

Keyphrases: co-attention, feature fusion, VQA

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:8373,
  author = {Shihuai Zhang and Qiang Wei and Yangyang Li and Yanqiao Chen and Licheng Jiao},
  title = {Visual Question Answering of Remote Sensing Image Based on Attention Mechanism},
  howpublished = {EasyChair Preprint no. 8373},

  year = {EasyChair, 2022}}