Welcome to the Language Technology Group
We research all aspects of technologies for processing natural (human) language and apply language technology in practice. In particular, we are interested in unsupervised, large-scale methods for natural language semantics.
Statistical Semantics
The Language Technology research group examines statistical methods that reflect natural-language semantics. Specifically, we compute semantic similarities and semantic relations between lexical items by analyzing large text collections. These relations are used in applications such as semantic indexing, paraphrasing and the identification of lexical chains.
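To illustrate what computing semantic similarity from text can look like, here is a minimal sketch, assuming a simple distributional approach: each word is represented by counts of the words that occur near it, and two words are compared by the cosine of their count vectors. The function names, the window size and the toy corpus are hypothetical choices made for this example. It is not a description of the group's actual methods, which operate on far larger collections.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical); real applications use large text collections.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the mouse".split(),
    "the dog chases the ball".split(),
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print(f"cat~dog: {sim_cat_dog:.3f}  cat~milk: {sim_cat_milk:.3f}")
```

In this toy corpus, "cat" and "dog" appear in nearly identical contexts and therefore score higher than "cat" and "milk", which merely occur together.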
Structure Discovery
The focus of this group is on unsupervised and knowledge-free methods, such as clustering of lexical graphs, topic models and neural representations. These methods, which neither presuppose training data nor assume the existence of knowledge resources, identify regularities in large text collections and annotate the data with them, following the structure discovery paradigm. This markup, which is entirely data-driven and therefore independent of domain and language, is then used as features for learning applications in supervised machine learning settings. The utility of structure discovery processes is thus assessed in an application-based manner.
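The pipeline described above (find regularities without supervision, mark them back into the data, use the markup as features) can be sketched roughly as follows. This is a deliberately simplified illustration: the word similarity graph is hand-written and hypothetical, and plain connected components stand in for a real graph clustering algorithm. The function names `connected_components` and `annotate` are likewise assumptions made for this example.

```python
from collections import defaultdict


def connected_components(edges):
    """Assign a cluster id to every node; a simple stand-in for graph clustering."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    cluster_of = {}
    next_id = 0
    for start in sorted(neighbours):  # sorted for deterministic cluster ids
        if start in cluster_of:
            continue
        stack = [start]
        while stack:
            node = stack.pop()
            if node in cluster_of:
                continue
            cluster_of[node] = next_id
            stack.extend(n for n in neighbours[node] if n not in cluster_of)
        next_id += 1
    return cluster_of


def annotate(tokens, cluster_of):
    """Mark cluster ids back into the text; unknown words get None."""
    return [(token, cluster_of.get(token)) for token in tokens]


# Hypothetical lexical graph: an edge links two words judged to be similar.
edges = [("cat", "dog"), ("dog", "mouse"), ("milk", "water")]
cluster_of = connected_components(edges)

annotated = annotate("the cat drinks milk".split(), cluster_of)
print(annotated)
```

The cluster ids attached to each token could then serve as additional features for a supervised learner, which is where the usefulness of the discovered structure would be measured.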
Open Source Software and Open Data
We believe that publicly funded software and data should be released to the public. Therefore, we make our software and data available under the most lenient license terms whenever possible. Providing research results quickly and directly to the public in the form of software enables researchers and companies alike to use novel approaches and advances in their work. Some of our software projects have received support from industry and are used by multinational companies.