LexSub - Delexicalized Lexical Substitution Framework
Lexical Substitution Software
We offer two implementations of all-words lexical substitution with list-based substitution candidates: a monolingual approach and a cross-lingual transfer learning approach.
LexSub
LexSub is a framework for supervised all-words lexical substitution using delexicalized features [1]. It requires only a single trained classifier, which can be applied to all words, whether or not they were seen during training ("all-words"). This is achieved by using features that characterize only a word's context and are independent of the lexical surface form of the word itself ("delexicalized"). Examples of such features are n-gram frequencies and distributional similarity scores.
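To make the idea of delexicalized features more concrete, here is a minimal sketch in Python. It is not the LexSub implementation: the function name, the feature names, and the toy n-gram counts and similarity scores are all assumptions made for illustration. The point it shows is that the feature names never contain the target word or the candidate, so one classifier can consume the same feature set for any word.

```python
import math

# Toy resources with made-up values (assumption); a real system would
# read n-gram counts and similarity scores from large corpora.
NGRAM_COUNTS = {
    ("a", "bright", "student"): 120,
    ("a", "smart", "student"): 300,
    ("a", "shiny", "student"): 2,
}
SIMILARITY = {
    ("bright", "smart"): 0.62,
    ("bright", "shiny"): 0.55,
}


def delexicalized_features(tokens, target_index, candidate):
    """Describe a (target, candidate, context) triple without naming the words.

    Hypothetical helper: the feature keys are fixed strings, so the same
    feature space is used for every target word.
    """
    target = tokens[target_index]
    left = tokens[max(0, target_index - 1):target_index]
    right = tokens[target_index + 1:target_index + 2]

    # Context n-gram with the original word and with the candidate inserted.
    original = tuple(left + [target] + right)
    substituted = tuple(left + [candidate] + right)
    orig_count = NGRAM_COUNTS.get(original, 0)
    sub_count = NGRAM_COUNTS.get(substituted, 0)

    return {
        # Add-one smoothing keeps the log and the ratio defined for unseen n-grams.
        "ngram_log_freq": math.log(sub_count + 1),
        "ngram_freq_ratio": (sub_count + 1) / (orig_count + 1),
        "dist_similarity": SIMILARITY.get((target, candidate), 0.0),
    }


sentence = ["a", "bright", "student"]
smart = delexicalized_features(sentence, 1, "smart")
shiny = delexicalized_features(sentence, 1, "shiny")
```

In this toy setting, "smart" receives a frequency ratio above 1 and "shiny" a ratio below 1, while both feature dictionaries share exactly the same keys.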
The main core of the project is published under the Apache License (link); the executable uses GPL-licensed code and is published under the GPL.
Trans-Lingual LexSub
Going beyond delexicalization, the cross-lingual transfer LexSub project can be trained on one language and applied to a different language. This is achieved by delexicalized representations together with mappings between language-specific features, which are again combined in a single classifier [2].
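The role of feature mappings can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the method of [2]: the choice of part-of-speech tags as the language-specific feature, the tag tables, the function names, and the weights are all hypothetical. It shows how language-specific values can be projected into a shared feature space so that a single scoring function applies to both languages.

```python
# Hypothetical mappings from language-specific POS tags to a shared coarse
# tag set (assumption: POS tags are just one example of such a feature).
TAG_MAPS = {
    "en": {"NN": "NOUN", "JJ": "ADJ", "VB": "VERB"},       # Penn Treebank style
    "de": {"NN": "NOUN", "ADJA": "ADJ", "VVFIN": "VERB"},  # STTS style
}


def shared_features(language, pos_tag, ngram_freq_ratio, dist_similarity):
    """Build a language-independent feature dictionary (hypothetical helper)."""
    coarse = TAG_MAPS[language].get(pos_tag, "OTHER")
    return {
        "pos=" + coarse: 1.0,
        "ngram_freq_ratio": ngram_freq_ratio,
        "dist_similarity": dist_similarity,
    }


def score(weights, features):
    """Linear score; features without a weight contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


# Illustrative weights standing in for a classifier trained on one language.
WEIGHTS = {"pos=ADJ": 0.2, "ngram_freq_ratio": 0.5, "dist_similarity": 1.0}

english = shared_features("en", "JJ", 2.0, 0.6)
german = shared_features("de", "ADJA", 2.0, 0.6)
english_score = score(WEIGHTS, english)
german_score = score(WEIGHTS, german)
```

Because both instances end up with identical feature names, the weights estimated on one language can be reused unchanged on the other.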
The main core of the project is available here on GitHub.
[1] Szarvas, G., Biemann, C., and Gurevych, I. (2013): Supervised All-Words Lexical Substitution using Delexicalized Features. Proceedings of NAACL-2013, Atlanta, GA, USA [pdf] [slides] [video]
[2] Hintz, G. and Biemann, C. (2016): Language Transfer Learning for Supervised Lexical Substitution. Proceedings of ACL-2016, Berlin, Germany [pdf]