
Sentiment Analysis of English-Punjabi Code Mixed Social Media Content for Agriculture Domain

EasyChair Preprint no. 2813

6 pages
Date: February 29, 2020


In India, more than 70% of the population is dependent on agriculture. Since the independence of India, the people involved in agriculture have mostly lived in rural areas. The government has made numerous efforts to improve the conditions of farmers. Still, conditions have not improved to an acceptable level. It is now easy to extract the reviews of farmers from micro-blogging websites. For decades, multilingual speakers have tended to switch between more than one language to express themselves on social media networks. The mixed languages follow different rules of grammar, which is in itself a challenging task. In this paper, we extract agriculture-related comments that exhibit code-mixing, with English-Punjabi mixed content. We then perform language identification and normalization, and create an English-Punjabi code-mixed dictionary. After that, we test various models trained on English-Punjabi code-mixed data using Support Vector Machine and Naive Bayes techniques for sentiment analysis. We first tested the pipeline with a unigram predictive model and later experimented with n-grams; performance was found to be better in our implemented model.
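To make the unigram versus n-gram distinction concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over word n-grams, written in plain Python. It is not the authors' implementation: the class `NaiveBayes`, the helper `ngrams`, the parameter `n_max`, and the four romanized training sentences are all hypothetical and invented for illustration, and the language identification, normalization, and dictionary steps are left out.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=1):
    """Return all word n-grams of length 1..n_max from a whitespace-split text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def __init__(self, n_max=1):
        self.n_max = n_max  # 1 = unigrams only, 2 = unigrams + bigrams, ...

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text, self.n_max)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(doc_count / total_docs)
            total = sum(self.feat_counts[label].values())
            for feat in ngrams(text, self.n_max):
                if feat in self.vocab:  # features never seen in training are skipped
                    count = self.feat_counts[label][feat]
                    score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data (NOT from the paper's dataset): romanized code-mixed comments.
train_texts = [
    "fasal bahut changi hai this year",
    "new seeds are changi",
    "fasal maadi hai no rain",
    "prices are maadi this season",
]
train_labels = ["positive", "positive", "negative", "negative"]

unigram_model = NaiveBayes(n_max=1).fit(train_texts, train_labels)
bigram_model = NaiveBayes(n_max=2).fit(train_texts, train_labels)

print(unigram_model.predict("fasal changi hai"))
print(bigram_model.predict("prices are maadi"))
```

Setting `n_max=2` adds bigram features such as "changi hai" on top of the unigrams, which is one simple way to give the classifier some word-order context. On a toy set this small the two settings are not meaningfully comparable, so the sketch says nothing about which performs better in practice.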

Keyphrases: Agriculture, Language Identification, Naive Bayes, Sentiment Analysis, social media, Support Vector Machines

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:2813,
  author = {Mukhtiar Singh and Vishal Goyal and Sahil Raj},
  title = {Sentiment Analysis of English-Punjabi Code Mixed Social Media Content for Agriculture Domain},
  howpublished = {EasyChair Preprint no. 2813},

  year = {EasyChair, 2020}}