Steffen Remus

Name:	Dr. Steffen Remus
Position:	Postdoc
Email:	steffen.remus (at) uni−hamburg․de
Phone:	+49 40 42883 2369
Fax:	+49 40 42883 2345
Office:	F-413
Address:	Universität Hamburg, Department of Informatics (Informatikum), Language Technology Group (LT), Vogt-Kölln-Straße 30, 22527 Hamburg
https://orcid.org/0000-0003-4303-8781

Greetings, I am a postdoctoral researcher in the Language Technology group under the guidance of Prof. Dr. Chris Biemann. My main interests include unsupervised methods, applications in computational linguistics, distributional methods, semantics, (focused) web crawling, information extraction, and knowledge induction. I earned my Ph.D. at Universität Hamburg under the supervision of Prof. Dr. Chris Biemann.

In the past, I was a scholarship holder in the KDSL (Knowledge Discovery in Scientific Literature) program and an associate researcher in the AIPHES program. I also worked in the BIMDANUBE project and in JOIN-T, a project that combines ontologies with semantically induced information from text.

Projects

Main

JOIN-T2

BIMDANUBE

WebAnno + EXMARaLDA

KDSL (Knowledge Discovery in Scientific Literature) -- Crawling & Semantic Structuring of Scientific Publications in the Web

Lexical Chains for German

Associated

AIPHES (Adaptive Preparation of Information from Heterogeneous Sources)

Teaching

Thesis Supervision

Robert Günzler (in progress, BA)
Neil Hinrichs (in progress, BA)
Lennart Roth (2024, BA)
Fabian Rausch (2022, MA)
Frederik Wille (2022, MA)
Mengy Li (2022, MA)
Jan Stenzel (2022, MA)
Teresa Lübeck (2022, BA)
Maximilian Fischer (2021, BA)
Gopalakrishnan Venkatesh (2021, MA)
Tim Fischer (2021, MA)
Hans Ole Hatz (2020, MA)
Tim Dobert (2019, MA)
Rami Aly (2018, BA)
Tim Fischer (2018, BA)
Alvin Rindra Fazrie (2018, MA)
Kai Brusch (2018, MA)
Joël Harms (2018, BA)
Ahmed Elshinawi (2018, Independent Study)
Dominik Sobania (2015, MA)
Dennis Werner (2015, BA)

NOTE: if you are writing a thesis with me, please check the thesis template

Courses @ Universität Hamburg

Software Engineering 1 - Tutoring Practice Class - Winter 2024/25
Deep Learning for Natural Language Processing Seminar (DL4NLP) - Winter 2024/25
Web interfaces for language technology systems (WILPS) - MA practice course + seminar - Summer 2024
Research Software Engineering - Co-Lecturer - Summer 2024
Python for Computational Science - Tutoring Practice Class - Winter 2023/24
Applications with Aspects of Language Technology - BA block practice course - Winter 2023/24
Software Engineering 1 - Tutoring Practice Class - Winter 2023/24
Deep Learning for Natural Language Processing Seminar (DL4NLP) - Winter 2023/24
Web interfaces for language technology systems (WILPS) - MA practice course + seminar - Summer 2023
Statistical Methods of Language Technology (SMoLT) - Tutoring Practice Class - Summer 2023
Applications with Aspects of Language Technology - BA block practice course - Winter 2022/23
Deep Learning for Unstructured Data - Seminar - Winter 2022/23
Software Engineering 1 - Tutoring Practice Class - Winter 2022/23
Machine Learning - Tutoring Practice Class - Summer 2022
Web interfaces for language technology systems (WILPS) - MA practice course + seminar - Summer 2022
Bachelorpraktikum 2017: Language Technology and Web Services - Winter 2016/17
Softwareentwicklung 1 - Tutoring Practice Class - Winter 2016/17

Courses @ Technische Universität Darmstadt

Question Answering Technologies Behind IBM Watson - Summer Term 2016
Algorithms of Language Technology - Practice Class - Summer Term 2016
Question Answering Technologies Behind IBM Watson - Summer Term 2015
Workshop on IBM Watson - One day workshop, February 2015
Algorithms of Language Technology - Practice Class - Summer Term 2015 (Best Mentoring)
Algorithms of Language Technology - Practice Class - Summer Term 2014

Professional Activities

Organizational / Editorial Activities:

Shared Task on Classification and Regression of Cognitive and Motivational Style from Text – GermEval Task 1 (2020)
Shared Task on Hierarchical Classification of Blurbs - GermEval Task 1 (2019)
2nd Workshop on Biomedical Information Management: Data-Driven Innovations (2018)
1st Workshop on Biomedical Information Management: Challenges and Open Problems (2018)
Workshop on IBM Watson - One day workshop, February 2015

Programme Committee Memberships / Reviewer Activities:

Coling 2025
ArgMining Workshop 2024
ACL 2024
COLM 2024
LREC-Coling 2023
EMNLP 2023
TPAMI 2023
AAAI 2022
NAACL 2021
LDK 2021
EACL 2021
AAAI 2021
ACL 2020
AACL-IJCNLP 2020
KNLP workshop 2020
TACL 2020
LREC 2020
EMNLP 2020
WAC-XII workshop 2020
Coling 2020
CONLL 2020
AAAI 2020
TextGraphs workshop 2020
DGfS 2020
NLE (Journal) 2019
NAACL 2019
KONVENS 2019
EMNLP 2019
ECIR 2019
CONLL 2019
ACL 2019
IWCS 2019
LDK 2019
CONLL 2018
EMNLP 2018
ACL 2018
*SEM 2018
ESWC 2018
TextGraphs workshop 2018
ISWC 2017
RANLP 2017
GSCL 2017
EMNLP 2017
*SEM 2017
EACL 2017
TextGraphs workshop 2018
ESWC 2016
SemEval 2016
EMNLP 2016
WAC-X workshop 2016

Publications

Robert Günzler, Özge Sevgili, Steffen Remus, Chris Biemann, and Irina Nikishina. Sövereign at The Perspective Argument Retrieval Shared Task 2024: Using LLMs with Argument Mining. In Proceedings of the 11th Workshop on Argument Mining (ArgMining 2024), 150–158. Bangkok, Thailand, 2024.

Robert Geislinger, Ali Ebrahimi Pourasad, Deniz Gül, Daniel Djahangir, Seid Muhie Yimam, Steffen Remus, and Chris Biemann. Multi-Modal Learning Application – Support Language Learners with NLP Techniques and Eye-Tracking. In Proceedings of the 1st Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO), 6–11. Ingolstadt, Germany, 2023. ( pdf, bib).

Steffen Remus. Domain Defining Context: On Domain-Dependent Corpus Expansion and Contextualized Semantic Structuring. Doctoral dissertation, Universität Hamburg, 2023. ( pdf, metadata).

Özge Sevgili, Steffen Remus, Abhik Jana, Alexander Panchenko, and Chris Biemann. Unsupervised Ultra-Fine Entity Typing with Distributionally Induced Word Senses. In Proceedings of the 11th International Conference on Analysis of Images, Social Networks and Texts (AIST), 1–15. Yerevan, Armenia, 2023. ( pdf).

Tim Fischer, Steffen Remus, and Chris Biemann. Measuring Faithfulness of Abstractive Summaries. In Proceedings of the 18th Conference on Natural Language Processing (KONVENS 2022), 63–73. Potsdam, Germany, 2022. ( pdf).

Markus J Hofmann, Steffen Remus, Chris Biemann, and Ralph Radach. Language models explain word reading times better than empirical predictability. Ed. by Massimo Stella. Frontiers in Artificial Intelligence, 2022: 1–20.

Steffen Remus, Gregor Wiedemann, Saba Anwar, Fynn Petersen-Frey, Seid Muhie Yimam, and Chris Biemann. More Like This: Semantic Retrieval with Linguistic Information. In Proceedings of the 18th Conference on Natural Language Processing (KONVENS 2022), 156–166. Potsdam, Germany, 2022.

Gopalakrishnan Venkatesh, Abhik Jana, Steffen Remus, Özge Sevgili, Gopalakrishnan Srinivasaraghavan, and Chris Biemann. Using Distributional Thesaurus To Enhance Transformer-based Contextualized Representations for Low Resource Languages. In Proceedings of the 37th ACM/SIGAPP Symposium On Applied Computing (ACM SAC), Special Track on Knowledge and Natural Language Processing (KNLP), 845–852. online, 2022.

Benjamin Milde, Tim Fischer, Steffen Remus, and Chris Biemann. MoM: Minutes of Meeting Bot. In Proceedings of Interspeech 2021 Show&Tell, 3311–3312. Brno, Czech Republic, 2021. ( pdf, video-de, video-en, git).

Jingyuan Feng, Özge Sevgili, Steffen Remus, Eugen Ruppert, and Chris Biemann. Supervised Pun Detection and Location with Feature Engineering and Logistic Regression. In Proceedings of the 5th SwissText & 16th KONVENS Joint Conference 2020, 3:1–6. Zurich, Switzerland, 2020. ( pdf).

Markus J Hofmann, Steffen Remus, Chris Biemann, and Ralph Radach. Language models explain word reading times better than empirical predictability. PsyArXiv, 2020: 1–77. ( link).

Dirk Johannßen, Chris Biemann, Steffen Remus, Timo Baumann, and David Scheffer. GermEval 2020 Task 1 on the Classification and Regression of Cognitive and Motivational style from Text. In Proceedings of the GermEval 2020 Task 1 Workshop in conjunction with the 5th SwissText & 16th KONVENS Joint Conference 2020, 1–10. Zurich, Switzerland (online), 2020. ( pdf, web).

Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, and Alexander Panchenko. Word sense disambiguation for 158 languages using word embeddings only. In Proceedings of The 12th Language Resources and Evaluation Conference, 5943–5952. Marseille, France, 2020. ( pdf, bib, web).

Rami Aly, Steffen Remus, and Chris Biemann. Hierarchical Multi-label Classification of Text with Capsule Networks. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 323–330. Florence, Italy, 2019. ( pdf, bib).

Tim Fischer, Steffen Remus, and Chris Biemann. LT Expertfinder: An Evaluation Framework for Expert Finding Methods. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), 98–104. Minneapolis, MN, USA, 2019. ( pdf, bib, git, web).

Markus J Hofmann, Steffen Remus, Chris Biemann, and Ralph Radach. Language models can outperform empirical predictability in predicting eye movement data. In Proceedings of the 20th European Conference on Eye Movements (ECEM) 2019. Alicante, Spain, 2019. ( poster-pdf).

Steffen Remus, Rami Aly, and Chris Biemann. GermEval 2019 Task 1: Hierarchical Classification of Blurbs. In Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019), 280–292. Erlangen, Germany, 2019. ( pdf, bib, web).

Steffen Remus, Hanna Hedeland, Anne Ferger, Kristin Bührig, and Chris Biemann. Annotation gesprochener Daten mit WebAnno-MM. In Die 6. Jahrestagung des DHd e.V. 2019. Frankfurt & Mainz, Germany, 2019. ( poster-pdf).

Steffen Remus, Hanna Hedeland, Anne Ferger, Kristin Bührig, and Chris Biemann. WebAnno-MM: EXMARaLDA meets WebAnno. In Selected papers from the CLARIN Annual Conference 2018, 166–172. Linköping Electronic Conference Proceedings 159. 2019. ( pdf, git).

Gregor Wiedemann, Steffen Remus, Avi Chawla, and Chris Biemann. Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings. In Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019), 161–170. Erlangen, Germany, 2019. ( pdf, bib).

Steffen Remus and Chris Biemann. Retrofitting Word Representations for Unsupervised Sense Aware Word Similarities. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 1035–1041. Miyazaki, Japan, 2018. ( bib, pdf, poster, git).

Steffen Remus, Hanna Hedeland, Anne Ferger, Kristin Bührig, and Chris Biemann. EXMARaLDA meets WebAnno. In Proceedings of the CLARIN Annual Conference 2018 (CAC 2018), 1–5. Pisa, Italy, 2018. ( pdf).

Seid Muhie Yimam, Steffen Remus, Alexander Panchenko, Andreas Holzinger, and Chris Biemann. Entity-Centric Information Access with the Human-in-the-Loop for the Biomedical Domains. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, 42–48. Varna, Bulgaria, 2017. ( pdf).

Steffen Remus, Manuel Kaufmann, Kathrin Ballweg, Tatiana von Landesberger, and Chris Biemann. Storyfinder: Personalized Knowledge Base Construction and Management by Browsing the Web. In CIKM ’17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2519–2522. Singapore, Singapore, 2017. ( preprint, poster, web).

Markus J Hofmann, Chris Biemann, and Steffen Remus. Benchmarking n-grams, topic models and recurrent neural networks by cloze completions, EEGs and eye movements. Ed. by Bernadette Sharp, Florence Sèdes, and Wiesław Lubaszewski. Cognitive Approach to Natural Language Processing, 2016: 197–215. ( link).

Alexander Panchenko, Stefano Faralli, Eugen Ruppert, Steffen Remus, Hubert Naets, Cedrick Fairon, Simone P Ponzetto, and Chris Biemann. TAXI at SemEval-2016 Task 13: A Taxonomy Induction Method based on Lexico-Syntactic Patterns, Substrings and Focused Crawling. In Proceedings of the 10th International Workshop on Semantic Evaluation, 1320–1327. San Diego, CA, USA, 2016. ( pdf).

Steffen Remus and Chris Biemann. Domain-Specific Corpus Expansion with Focused Webcrawling. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), 23–28. Portorož, Slovenia, 2016. ( pdf, bib).

Steffen Remus, Gerold Hintz, Darina Benikova, Thomas Arnold, Judith Eckle-Kohler, Christian M Meyer, Margot Mieskes, and Chris Biemann. EmpiriST: AIPHES Robust Tokenization and POS-Tagging for Different Genres. In Proceedings of the 10th Web as Corpus Workshop (WAC-X), 106–114. Berlin, Germany, 2016. ( pdf).

Chris Biemann, Steffen Remus, and Markus J Hofmann. Predicting word ‘predictability’ in cloze completion, electroencephalographic and eye movement data. In Proceedings of the 12th International Workshop on Natural Language Processing and Cognitive Science, 83–93. Krakow, Poland, 2015. ( pdf).

Omer Levy, Steffen Remus, Chris Biemann, and Ido Dagan. Do Supervised Distributional Methods Really Learn Lexical Inference Relations? In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 970–976. Denver, CO, USA, 2015. ( pdf, bib).

Dirk Goldhahn, Steffen Remus, Uwe Quasthoff, and Chris Biemann. Top-Level Domain Crawling for Producing Comprehensive Monolingual Corpora from the Web. In Proceedings of the LREC-14 workshop on Challenges in the Management of Large Corpora (CMLC-2), 10–14. Reykjavik, Iceland, 2014. ( pdf).

Jinseok Nam, Christian Kirschner, Zheng Ma, Nicolai Erbs, Susanne Neumann, Daniela Oelke, Steffen Remus, Chris Biemann, Judith Eckle-Kohler, Johannes Fürnkranz, Iryna Gurevych, Marc Rittberger, and Karsten Weihe. Knowledge Discovery in Scientific Literature. In Proceedings of the 12th Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2014), 66–76. Hildesheim, Germany, 2014. ( pdf).

Steffen Remus. Unsupervised Relation Extraction of In-Domain Data from Focused Crawls. In Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics, 11–20. Gothenburg, Sweden, 2014. ( pdf, bib).

Steffen Remus and Chris Biemann. Three Knowledge-Free Methods for Automatic Lexical Chain Extraction. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 989–999. Atlanta, GA, USA, 2013. ( pdf, bib).

Steffen Remus. Automatically Identifying Lexical Chains by Means of Statistical Methods – A Knowledge-Free Approach. Master's thesis, Technische Universität Darmstadt, 2012.
