Improving Statistical Linguistic Algorithms for Parsing Mathematics

10 pages. Published: September 27, 2016

Abstract

In this paper we describe our combined statistical/semantic parsing method, based on the CYK chart-parsing algorithm augmented with limited internal typechecking and external ATP filtering. This method was previously evaluated on parsing ambiguous mathematical expressions over the informalized Flyspeck corpus of 20000 theorems. We first discuss the motivation for, and the drawbacks of, the first version of the CYK-based component of the algorithm, and then we propose and implement a more sophisticated approach based on a better statistical model of mathematical data structures.

Keyphrases: automated reasoning, computational linguistics, Flyspeck, HOL Light, Parsing Mathematics, type checking

In: Boris Konev, Stephan Schulz and Laurent Simon (editors). IWIL-2015. 11th International Workshop on the Implementation of Logics, vol 40, pages 27--36

Links:
BibTeX entry
@inproceedings{IWIL-2015:Improving_Statistical_Linguistic_Algorithms,
  author    = {Cezary Kaliszyk and Josef Urban and Jiri Vyskocil},
  title     = {Improving Statistical Linguistic Algorithms for Parsing Mathematics},
  booktitle = {IWIL-2015. 11th International Workshop on the Implementation of Logics},
  editor    = {Boris Konev and Stephan Schulz and Laurent Simon},
  series    = {EPiC Series in Computing},
  volume    = {40},
  pages     = {27--36},
  year      = {2016},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2398-7340},
  url       = {https://easychair.org/publications/paper/8B6L},
  doi       = {10.29007/8c2m}}