
MKEAH: Multimodal Knowledge Extraction and Accumulation Based on Hyperplane Embedding for Knowledge-Based Visual Question Answering

EasyChair Preprint no. 10761

9 pages. Date: August 22, 2023

Abstract

External knowledge representations play an essential role in knowledge-based visual question answering, helping models understand complex scenarios in the open world. Recent entity-relationship embedding approaches fall short in representing complex relations, which leads to a lack of topic-related knowledge and a redundancy of topic-irrelevant information. To this end, we propose MKEAH, which performs Multimodal Knowledge Extraction and Accumulation on Hyperplanes. To keep the lengths of the feature vectors projected onto the hyperplane comparable and to filter out enough topic-irrelevant information, two losses are proposed to learn the triplet representations from complementary views: range loss and orthogonal loss. To interpret the model's capability to extract topic-related knowledge, we present Topic Similarity (TS), measured between the topic and the entity-relation. Experimental results demonstrate the effectiveness of hyperplane embedding for knowledge representation in knowledge-based visual question answering. Our model outperforms the state-of-the-art methods by 2.12% and 3.24%, respectively, on two challenging knowledge-required datasets: OK-VQA and KRVQA. The clear advantage of our model on TS shows that using hyperplane embedding to represent multimodal knowledge can improve the model's ability to extract topic-related knowledge.

Keyphrases: Hyperplane, Knowledge-based Visual Question Answering, topic-related

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:10761,
  author = {Heng Zhang and Zhihua Wei and Guanming Liu and Ruibin Mu and Rui Wang and Chuan Bao Liu and Aiquan Yuan and Guodong Cao and Ning Hu},
  title = {MKEAH: Multimodal Knowledge Extraction and Accumulation Based on Hyperplane Embedding for Knowledge-Based Visual Question Answering},
  howpublished = {EasyChair Preprint no. 10761},
  year = {EasyChair, 2023}}