
Classifying Textual Data with Pre-Trained Vision Models Through Transfer Learning and Data Transformations

EasyChair Preprint no. 6118, version 2

8 pages
Date: July 23, 2021

Abstract

Humans acquire knowledge through experience, and no boundary separates the kinds of knowledge or skill levels we can achieve on different tasks at the same time. This is not the case for neural networks: the major breakthroughs in the field are extremely task- and domain-specific. Vision and language are dealt with separately, using separate methods and different datasets. Current text classification methods mostly rely on obtaining contextual embeddings for input text samples and then training a classifier on the embedded dataset. Transfer learning in language-related tasks is, in general, heavily used to obtain contextual text embeddings for the input samples. In this work, we propose to use the knowledge acquired by benchmark vision models trained on ImageNet to help a much smaller architecture learn to classify text. A data transformation technique is used to create a new image dataset in which each image represents a sentence embedding from the last six layers of BERT, projected onto a 2D plane with a t-SNE-based method. On the image dataset created from the IMDB dataset embedded with the last six layers of BERT, we trained five models, each containing layers sliced from vision models pretrained on ImageNet. Despite the challenges posed by the very different datasets, the experimental results achieved by this approach, which links large pretrained models for both language and vision, are very promising and do not require high compute resources. Specifically, sentiment analysis is performed by five different models on the same image dataset obtained after the BERT embeddings are transformed into grayscale images.
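The abstract describes a three-stage pipeline: extract the last six BERT hidden layers for a sentence, project the resulting vectors onto a 2D plane with a t-SNE-based method and rasterize them as a grayscale image, then classify the image with layers sliced from an ImageNet-pretrained vision model. The sketch below is one plausible, minimal reading of that pipeline, not the paper's code: the model choices (bert-base-uncased, ResNet-18), the 64x64 image size, the point-accumulation rasterization, and the slice point in the backbone are all assumptions.

    # Hypothetical sketch of the pipeline described in the abstract.
    # Model names, image size, and rasterization scheme are assumptions.
    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.manifold import TSNE
    from torchvision import models
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased",
                                     output_hidden_states=True)

    def sentence_to_image(text, size=64):
        """Turn one sentence into a (size x size) grayscale image."""
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = bert(**inputs).hidden_states  # tuple of 13 layer outputs
        # Stack the last six layers: (6 * seq_len, 768) points to project.
        points = torch.cat(hidden[-6:], dim=1).squeeze(0).numpy()
        coords = TSNE(n_components=2, perplexity=5, init="pca",
                      random_state=0).fit_transform(points)
        # Rasterize: normalize coordinates to pixel indices, accumulate hits.
        img = np.zeros((size, size), dtype=np.float32)
        norm = (coords - coords.min(0)) / (np.ptp(coords, 0) + 1e-8)
        xs, ys = (norm * (size - 1)).astype(int).T
        np.add.at(img, (ys, xs), 1.0)
        return img / img.max()

    # Classifier: early layers sliced from a pretrained ResNet-18 (frozen),
    # plus a small trainable head -- one plausible reading of "layers sliced
    # from vision models pretrained on ImageNet".
    resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    backbone = nn.Sequential(*list(resnet.children())[:6])  # conv1 .. layer2
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad = False

    head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(128, 2))  # ResNet-18 layer2: 128 channels

    img = torch.tensor(sentence_to_image("The movie was wonderful."))
    x = img.unsqueeze(0).unsqueeze(0).repeat(1, 3, 1, 1)  # gray -> 3 channels
    logits = head(backbone(x))  # 2-way sentiment logits (head still untrained)

In this reading, only the small head is trained on the transformed IMDB images while the sliced ImageNet layers stay frozen, which is consistent with the abstract's claim that the approach does not require high compute resources.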

Keyphrases: BERT, convolutional neural networks, domain adaptation, image classification, natural language processing, t-SNE, text classification, transfer learning

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:6118,
  author = {Charaf Eddine Benarab},
  title = {Classifying Textual Data with Pre-Trained Vision Models Through Transfer Learning and Data Transformations},
  howpublished = {EasyChair Preprint no. 6118},
  year = {EasyChair, 2021}}