Web Interfaces for Language Processing Systems SS2019
The following systems were successfully completed as part of the master project "Web Interfaces for Language Processing Systems" in the summer semester of 2019.
I. Personalized Reading Support
Team
- Julian Betz
Project aim
- To develop a reading aid that lets learners of a foreign language (e.g. Japanese) view lexical information on sense-disambiguated words while browsing the web.
Primary features
- Detect target-language words on web pages
- Provide glosses for sense-disambiguated words in a reference language (English)
- Provide example sentences, with a preference for those that are most understandable to the reader
- Report the user’s learning progress in terms of (sense-disambiguated) lexical coverage of a reference corpus
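The lexical coverage mentioned in the primary features can be illustrated with a minimal sketch. It assumes that the reference corpus is available as a list of (word, sense) pairs and that the learner's known vocabulary is a set of such pairs; the function name `lexical_coverage` and the toy data are hypothetical and not taken from the project.

```python
from collections import Counter


def lexical_coverage(corpus_tokens, known):
    """Return the share of corpus tokens whose (word, sense) pair is known.

    corpus_tokens: list of (word, sense) pairs (hypothetical corpus format).
    known: set of (word, sense) pairs the learner already knows.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for item, n in counts.items() if item in known)
    return covered / total


# Toy reference corpus: three homophones of "hashi", one occurring twice.
corpus = [("橋", "bridge"), ("箸", "chopsticks"), ("橋", "bridge"), ("端", "edge")]
known = {("橋", "bridge")}
coverage = lexical_coverage(corpus, known)
print(f"coverage: {coverage:.0%}")
```

Counting tokens rather than distinct word types weights frequent words more heavily; a type-based measure would be an equally plausible reading of the feature.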
Secondary features
- Provide glosses for sense-disambiguated words in the target language
- Perform named-entity recognition
- Couple the system with an existing popular flashcard learning program (e.g. Anki)
- Provide the possibility to show relevant pictures
Non-functional features
- Recall-oriented language detection
- Responsiveness
- Non-invasive webpage analysis
Source code and documentation
II. Fake News Of The Day
- A tool that gives the user a quick and simple overview of the daily news. The articles are visualized in two graphs, one for articles classified as true news and one for articles classified as fake news. The entities extracted from the article bodies give the user further context.
Team
- Luca Knobloch
How the tool is built
- Crawl the articles with "news-Crawler", the crawler from the Network of the Day
- Classify articles as fake or real with a machine learning model
- Extract named entities from the articles
- Build fake and real graphs using the entities extracted from the two classes
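The last step of the list above can be sketched as follows. This is only an illustration under stated assumptions: each article is assumed to already carry a classifier label ("fake" or "real") and a list of extracted entities, and the graphs are assumed to be entity co-occurrence graphs. The data layout and the function name `build_entity_graphs` are hypothetical; the source does not describe how the project builds its graphs.

```python
from collections import Counter
from itertools import combinations


def build_entity_graphs(articles):
    """Build one weighted co-occurrence graph per class.

    Each graph maps an entity pair (a, b) with a < b to the number of
    articles of that class in which both entities occur.
    """
    graphs = {"fake": Counter(), "real": Counter()}
    for article in articles:
        # Deduplicate and sort so that each pair has one canonical order.
        entities = sorted(set(article["entities"]))
        for pair in combinations(entities, 2):
            graphs[article["label"]][pair] += 1
    return graphs


# Toy input: labels and entities would come from the classifier and the
# named-entity extraction step.
articles = [
    {"label": "real", "entities": ["Berlin", "EU"]},
    {"label": "real", "entities": ["EU", "Berlin", "Paris"]},
    {"label": "fake", "entities": ["Moon", "NASA"]},
]
graphs = build_entity_graphs(articles)
for label, edges in graphs.items():
    print(label, dict(edges))
```

Keeping the two classes in separate edge counters mirrors the two-graph visualization described above.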
Source code
- The source code of the project is available here
III. WikiHow QA / Summarization
Team
- Tim Fischer
Project objective
The WikiHow QA Application has four main features:
- Answer "How to" Questions
- Summarize provided text
- Analyse, Compare & Rate summarization techniques
- View Statistics of summarization techniques
Source code
- The source code of the project is available here
- The documentation of the project is available here
- The project report is available here
IV. Taxonomy Editor for Word Embeddings
This is a Flask web application designed to visualize operations on word embeddings with a D3 graph. Additionally, the application can learn user-annotated relations between pairs of words. Given an existing word and a chosen relation, a new word can then be predicted and added to the graph. For two words, the best-matching relation type can also be predicted.
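One common way to model relations between word pairs is as vector offsets in the embedding space. The sketch below assumes this approach purely for illustration; the source does not state how the application learns or applies relations. The toy embeddings, the relation names, and all function names are hypothetical.

```python
import math


def offset(a, b):
    """Vector pointing from a to b."""
    return [y - x for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def learn_relation(pairs, emb):
    """Average the offsets of all annotated (source, target) pairs."""
    offsets = [offset(emb[s], emb[t]) for s, t in pairs]
    return [sum(col) / len(offsets) for col in zip(*offsets)]


def predict_word(word, relation, emb):
    """Nearest word (by cosine) to word + relation, excluding word itself."""
    target = [x + r for x, r in zip(emb[word], relation)]
    candidates = (w for w in emb if w != word)
    return max(candidates, key=lambda w: cosine(emb[w], target))


def predict_relation(source, target, relations, emb):
    """Name of the learned relation whose offset best matches the pair."""
    pair_offset = offset(emb[source], emb[target])
    return max(relations, key=lambda name: cosine(relations[name], pair_offset))


# Toy 3-dimensional embeddings (hypothetical values).
emb = {
    "man": [1.0, 0.0, 1.0],
    "woman": [1.0, 0.0, -1.0],
    "king": [1.0, 1.0, 1.0],
    "queen": [1.0, 1.0, -1.0],
    "apple": [-1.0, 0.2, 0.0],
}
relations = {
    "female_of": learn_relation([("man", "woman")], emb),
    "royal_of": learn_relation([("man", "king")], emb),
}
predicted = predict_word("king", relations["female_of"], emb)
best_relation = predict_relation("woman", "queen", relations, emb)
print(predicted, best_relation)
```

With real embeddings the offsets are noisy, so averaging over several annotated pairs per relation tends to give more stable predictions than a single pair.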
Team
- Alexander Klassen
Source code
- The source code of the project is available here
V. Sense Clustering over Time
- SCoT (Sense Clustering over Time) is a web application for viewing the senses of a word and their evolution over time. The idea is to help anyone interested in diachronic semantics visualize and compare the meanings a word had at different points in time.
Team
- Inga Kempfert
Project objective
- Web application that visualizes the different senses of a concept
- Senses visible through a clustered graph of collocations
- View the sense changes over time
- Save the state of the graph in a separate file
- The user can edit the graph and correct the system’s hypothesis
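The idea of reading senses off a clustered collocation graph, one graph per time period, can be sketched as follows. The clustering here is a deliberately simple stand-in (connected components over sufficiently strong edges) and is not claimed to be the algorithm SCoT uses; the time slices, edge weights, threshold, and the function name `sense_clusters` are all hypothetical.

```python
def sense_clusters(edges, min_weight):
    """Cluster words as connected components over edges >= min_weight.

    edges: dict mapping (word_a, word_b) to a similarity weight.
    Returns a list of alphabetically sorted clusters.
    """
    neighbours = {}
    for (a, b), weight in edges.items():
        neighbours.setdefault(a, set())
        neighbours.setdefault(b, set())
        if weight >= min_weight:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, clusters = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collects one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Toy collocation graphs for the target word "mouse" in two periods.
slices = {
    "1900-1950": {("rodent", "cheese"): 0.9, ("rodent", "trap"): 0.8},
    "1990-2010": {
        ("rodent", "cheese"): 0.9,
        ("keyboard", "cursor"): 0.7,
        ("cheese", "cursor"): 0.1,
    },
}
by_period = {period: sense_clusters(edges, 0.5) for period, edges in slices.items()}
for period, clusters in by_period.items():
    print(period, clusters)
```

In this toy data a second cluster appears in the later period, which is the kind of sense change the visualization is meant to expose.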
Source code and documentation
VI. ActiveAnno - Document Annotation Service
- ActiveAnno is a web-based, responsive, highly configurable open-source document annotation tool.
Team
- Max Wiechmann
Task
- Create a document annotation tool
- Focus on document-level & simple span-level annotations
- Flexibility and configurability
- Usable for active learning & continuous streams of data
- Responsive and modern frontend
- Deployable via Docker in a microservice context
Source code
- The project source code and documentation are available here