Constraint-Based Part-of-Speech Tagging

EasyChair Preprint no. 8379

8 pages. Date: July 3, 2022


This paper describes CPOST, a constraint-based part-of-speech (POS) tagger that treats POS tagging as a constraint satisfaction problem (CSP). CPOST models each word as a variable, uses a lexicon to determine the domain of each variable, applies context constraints to reduce ambiguity, and uses statistical models to assign a value to each variable. The paper shows that, with a small number of context constraints encoding basic linguistic knowledge, CPOST significantly improves precision in identifying base-form verbs and reduces the burden on syntactic parsing.
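
The pipeline outlined in the abstract (words as variables, lexicon-derived domains, context constraints that prune domains, a statistical model that picks the final value) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy lexicon, the Penn Treebank style tag names, the two hand-written constraints, and the unigram scores standing in for the statistical model.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical toy lexicon: each word maps to its domain of candidate tags.
LEXICON: Dict[str, Set[str]] = {
    "they": {"PRP"},
    "want": {"VB", "VBP"},
    "to": {"TO", "IN"},
    "run": {"VB", "VBP", "NN"},
}

# Hypothetical unigram scores standing in for a statistical model.
TAG_SCORE: Dict[Tuple[str, str], float] = {
    ("they", "PRP"): 1.0,
    ("want", "VBP"): 0.6, ("want", "VB"): 0.4,
    ("to", "TO"): 0.7, ("to", "IN"): 0.3,
    ("run", "NN"): 0.5, ("run", "VB"): 0.3, ("run", "VBP"): 0.2,
}

Constraint = Callable[[Sequence[str], List[Set[str]], int], None]


def no_base_verb_after_pronoun(words, domains, i):
    """Illustrative constraint: a word right after an unambiguous pronoun is not VB."""
    if i > 0 and domains[i - 1] == {"PRP"} and len(domains[i]) > 1:
        domains[i].discard("VB")


def base_verb_after_to(words, domains, i):
    """Illustrative (deliberately simplistic) constraint: 'to' + possible VB => TO + VB."""
    if i > 0 and words[i - 1] == "to" and "VB" in domains[i] and "TO" in domains[i - 1]:
        domains[i].intersection_update({"VB"})
        domains[i - 1].intersection_update({"TO"})


CONSTRAINTS: Tuple[Constraint, ...] = (no_base_verb_after_pronoun, base_verb_after_to)


def tag(words: Sequence[str], constraints: Sequence[Constraint] = CONSTRAINTS) -> List[str]:
    # 1. Each word is a variable; the lexicon supplies its domain.
    domains = [set(LEXICON[w]) for w in words]
    # 2. Context constraints prune the domains in place.
    for i in range(len(words)):
        for constraint in constraints:
            constraint(words, domains, i)
    # 3. The statistical stand-in picks the best remaining tag per word.
    return [
        max(sorted(domain), key=lambda t: TAG_SCORE.get((word, t), 0.0))
        for word, domain in zip(words, domains)
    ]


sentence = ["they", "want", "to", "run"]
print(tag(sentence))                   # with constraints
print(tag(sentence, constraints=()))   # scores only, no constraints
```

In this toy setup, the scores alone label "run" as a noun, while the constraint on "to" narrows its domain to the base-form verb before any score is consulted, which is the kind of effect the abstract attributes to context constraints.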

Keyphrases: constraints, NLP, POS tagging

