Alexander Panchenko
Name: | Dr. Alexander Panchenko |
Position: | Postdoctoral researcher |
Email: | panchenko...informatik.uni-hamburg.de |
Phone: | +49 40 42883 2368 |
Fax: | +49 40 42883 2345 |
Office: | F-416 |
Address: |
Informatikum Vogt-Kölln-Straße 30 22527 Hamburg |
Hello, I am Alexander, a postdoctoral researcher in Natural Language Processing working with Prof. Chris Biemann. My main research interest is computational lexical semantics, including semantic relatedness, word sense induction, and disambiguation.
In the past, I worked on other NLP-related topics including short text classification, NLP for social media analysis, and skill extraction from text. More generally, I am interested in statistical natural language processing, information retrieval, semantic web, machine learning and intersections/interactions of these fields. Currently, I am
Recent News:
- I co-organize a special issue in the Natural Language Engineering journal on "Informing Neural Architectures for NLP with Linguistic and Background Knowledge" together with Simone Paolo Ponzetto and Ivan Vulić.
- A keynote talk at the 24th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2018): From unsupervised induction of linguistic structures to applications in deep learning.
- The best paper award in the category "Impact on Society" of Fraunhofer IGD and the Visual Computing Groups of TU Darmstadt for the paper "new/s/leak - Information Extraction and Visualization for Investigative Data Journalists".
- I co-organize the 7th Conference on Analysis of Images, Social Networks, and Texts (AIST'2018). The selected papers will be published in the Springer LNCS series.
- An invited talk at the Global WordNet Conference (GWC'2018) in Singapore on inducing interpretable word senses for word sense disambiguation and enrichment of lexical resources.
- An article on graph-based distributional semantics is accepted in the Natural Language Engineering journal.
- I co-organize a shared task on word sense induction for the Russian language. 18 teams participated in the task submitting 383 models. An overview of the results is available in this preprint.
- The release of a web-scale dependency-parsed corpus of English texts, based on the CommonCrawl web crawls. The corpus features over 250 billion tokens and is available on Amazon S3.
- The release of a web demo of the unsupervised, knowledge-free, and interpretable system for word sense disambiguation presented at EMNLP 2017 in Copenhagen.
Click on the image below to see a demo of a word sense disambiguation system, which integrates a few ideas from my research:
Presentations
Invited Talks
- 24th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2018): From unsupervised induction of linguistic structures to applications in deep learning. This invited talk was first presented at Skolkovo Institute of Science and Technology and later was also presented at the NLP seminar of Yandex's School of Data Analysis both in Moscow, Russia.
- Global WordNet Conference, Workshop on Wordnets and Word Embeddings (2018): Inducing Interpretable Word Senses for WSD and Enrichment of Lexical Resources. Singapore, Singapore
- Université catholique de Louvain, Research Seminar (2015). Text Analysis of Social Networks: Working with FB and VK Data. Louvain-la-Neuve, Belgium. This invited talk was originally presented at the AI Ukraine 2014 conference.
- Montclair State University, Research Seminar (2014). Similarity Measures for Semantic Relation Extraction. Montclair, New Jersey, USA.
- Higher School of Economics, Research Seminar (2013). A Graph-based Approach to Extraction of Skills from Text. Nizhny Novgorod, Russia.
Publications
Monographs
- Alexander Panchenko (2013): Similarity Measures for Semantic Relation Extraction, PhD Thesis, Université catholique de Louvain
- Alexander Panchenko (2008): Automatic Thesaurus Construction System, Graduation Thesis, Moscow State Technical University (BMSTU)
Edited volumes
- van der Aalst, W.M.P., Ignatov, D.I., Khachay, M., Kuznetsov, S.O., Lempitsky, V., Lomazova, I.A., Loukachevitch, N., Napoli, A., Panchenko, A., Pardalos, P.M., Savchenko, A.V., Wasserman, S. (Eds.) (2017) The 6th International Conference on Analysis of Images, Social Networks and Texts (AIST'2017), Moscow, Russia, July 27–29, 2017, Revised Selected Papers. Springer Lecture Notes in Computer Science (LNCS). Springer
- Ignatov, D.I., Khachay, M.Y., Labunets, V.G., Loukachevitch, N., Nikolenko, S., Panchenko, A., Savchenko, A.V., Vorontsov, K. (Eds.) (2016) The 5th International Conference on Analysis of Images, Social Networks and Texts (AIST'2016), Yekaterinburg, Russia, April 7-9, Revised Selected Papers. Communications in Computer and Information Science. Springer Heidelberg Dordrecht London New York.
- Baixeries J., Ignatov D.I., Ilvovsky D., Panchenko A. (Eds.) (2016) The 3rd International Workshop on Concept Discovery in Unstructured Data (CDUD'2016) co-located with the 13th International Conference on Concept Lattices and Their Applications (CLA 2016), July 18, 2016, Moscow, Russia. CEUR-Workshop, Vol. 1625, ISSN 1613-0073.
- Ignatov D.I., Khachay M.Y., Labunets V.G., Loukachevitch N., Nikolenko S., Panchenko A., Savchenko A.V., Vorontsov K.V. (2016) Supplementary Proceedings of the Fifth International Conference on Analysis of Images, Social Networks and Texts (AIST-SUP 2016). Yekaterinburg, Russia, April 6-8, 2016. CEUR-WS, Vol 1710. ISSN 1613-0073
- Khachay, M.Y., Konstantinova, N., Panchenko, A., Ignatov, D.I., Labunets, V.G. (Eds.) (2015) The 4th International Conference on Analysis of Images, Social Networks and Texts (AIST'2015), Yekaterinburg, Russia, April 9-11, Revised Selected Papers. Communications in Computer and Information Science. Springer Heidelberg Dordrecht London New York.
- Ignatov, D.I., Khachay, M.Y., Panchenko, A., Konstantinova, N., Yavorsky, R.E. (Eds.) (2014). The 3rd International Conference on Analysis of Images, Social Networks and Texts (AIST'2014), Yekaterinburg, Russia, April 10-12, Revised Selected Papers. Communications in Computer and Information Science. Springer Heidelberg Dordrecht London New York.
Journal articles
- Biemann, C., Faralli, S., Panchenko, A., Ponzetto, S.P. (2018): A framework for enriching lexical semantic resources with distributional semantics. Journal of Natural Language Engineering, published online Jan 15, 2018
Conference proceedings
- Schildwächter, M., Bondarenko, A., Zenker, J., Hagen, M., Biemann, C., Panchenko, A. (2019): Answering Comparative Questions: Better than Ten-Blue-Links? Proceedings of ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR), Glasgow, Scotland, UK
- Ustalov, D., Panchenko, A., Biemann, C., Ponzetto, S.P. (2018): Unsupervised Sense-Aware Hypernymy Extraction. Proceedings of the 14th Conference on Natural Language Processing (KONVENS 2018) Vienna, Austria
- Ustalov, D., Panchenko, A., Kutuzov, A., Biemann, C., Ponzetto, S.P. (2018): Unsupervised Semantic Frame Induction using Triclustering. Proceedings of ACL 2018, Melbourne, Australia
- Panchenko A., Lopukhina A., Ustalov D., Lopukhin K., Arefyev N., Leontyev A., Loukachevitch N. (2018): RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language. In Proceedings of the 24th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2018). Moscow, Russia. RGGU
- Arefyev, N., Ermolaev, P., Panchenko, A. (2018): How much does a word weigh? Weighting word embeddings for word sense induction. In Proceedings of the 24th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2018). Moscow, Russia. RGGU
- Panchenko, A., Ruppert, E., Faralli, S., Ponzetto, S.P., Biemann, C. (2018): Building a Web-Scale Dependency-Parsed Corpus from Common Crawl. Proceedings of LREC 2018, Miyazaki, Japan
- Ustalov, D., Teslenko, D., Panchenko, A., Chernoskutov, M., Biemann, C., Ponzetto, S.P. (2018): An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages. Proceedings of LREC 2018, Miyazaki, Japan
- Faralli, S., Panchenko, A., Biemann, C., Ponzetto, S.P. (2018): Enriching Frame Representations with Distributionally Induced Senses. Proceedings of LREC 2018, Miyazaki, Japan
- Panchenko, A., Ustalov, D., Faralli, S., Ponzetto, S.P., Biemann, C. (2018): Improving Hypernymy Extraction with Distributional Semantic Classes. Proceedings of LREC 2018, Miyazaki, Japan
- Panchenko A., Fide M., Ruppert E., Faralli S., Ustalov D., Ponzetto S.P., Biemann C. (2017): Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Copenhagen, Denmark. Association for Computational Linguistics
- Ustalov D., Chernoskutov M., Biemann C., Panchenko A. (2017): Fighting with the Sparsity of Synonymy Dictionaries. In Proceedings of the 6th Conference on Analysis of Images, Social Networks, and Texts (AIST'2017). Springer Lecture Notes in Computer Science (LNCS). Moscow, Russia
- Ustalov D., Panchenko A., Biemann C. (2017): Watset: Automatic Induction of Synsets from a Graph of Synonyms. In Proceedings of the 55th Meeting of the Association for Computational Linguistics (ACL'2017). Vancouver, Canada. Association for Computational Linguistics
- Ustalov D., Arefyev N., Biemann C., Panchenko A. (2017): Negative Sampling Improves Hypernymy Extraction Based on Projection Learning. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017). Valencia, Spain. Association for Computational Linguistics
- Faralli S., Panchenko A., Biemann C., and Ponzetto S. P. (2017): The ContrastMedium Algorithm: Taxonomy Induction From Noisy Knowledge Graphs With Just A Few Links. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017). Valencia, Spain. Association for Computational Linguistics.
- Panchenko A., Ruppert E., Faralli S., Ponzetto S. P., and Biemann C. (2017): Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017). Valencia, Spain. Association for Computational Linguistics.
- Faralli S., Panchenko A., Biemann C., Ponzetto S. P. (2016): Linked Disambiguated Distributional Semantic Networks. In Proceedings of the 15th International Semantic Web Conference (ISWC 2016). pp. 56-64, Kobe, Japan. Lecture Notes in Computer Science, Springer International Publishing
- Ustalov D., Panchenko A. (2016): Learning Word Subsumption Projections for the Russian Language. In Proceedings of the International Conference on Big Data and its Applications (ICBDA 2016). ITM Web of Conferences. Vol. 8. P.01006. dx.doi.org/10.1051/itmconf/20160801006
- Panchenko A., Simon J., Riedl M., Biemann C. (2016): Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics. In Proceedings of the 13th Conference on Natural Language Processing (KONVENS'2016). Bochum, Germany. Bochumer Linguistische Arbeitsberichte (BLA)
- Yimam S.M., Ulrich H., von Landesberger T., Rosenbach M., Regneri M., Panchenko A., Lehmann F., Fahrer U., Biemann C., Ballweg K. (2016): new/s/leak – Information Extraction and Visualization for Investigative Data Journalists. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL). System Demonstrations. Berlin, Germany. Association for Computational Linguistics
- Panchenko A. (2016): Best of Both Worlds: Making Word Sense Embeddings Interpretable. In Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC'2016), Portorož, Slovenia. European Language Resources Association (ELRA)
- Panchenko A., Ustalov D., Arefyev N., Paperno D., Konstantinova N., Loukachevitch N. and Biemann C. (2016): Human and Machine Judgements about Russian Semantic Relatedness. In Proceedings of the 5th Conference on Analysis of Images, Social Networks, and Texts (AIST'2016). Communications in Computer and Information Science (CCIS). Springer-Verlag Berlin Heidelberg
- Panchenko A., Babaev D., and Objedkov S. (2015): Large-Scale Parallel Matching of Social Network Profiles. In Proceedings of 4th Conference on Analysis of Images, Social Networks, and Texts. Yekaterinburg, Russia. Communications in Computer and Information Science (CCIS). Springer-Verlag Berlin Heidelberg
- Panchenko A., Loukachevitch N. V., Ustalov D., Paperno D., Meyer C. M., Konstantinova N. (2015): RUSSE: The First International Workshop on Russian Semantic Similarity. In Proceedings of the 21st International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2015). Moscow, Russia. RGGU
- Arefyev N., Panchenko A., Lukanin A., Lesota O., Romanov P. (2015): Evaluating Three Corpus-Based Semantic Similarity Systems for Russian. In Proceedings of the 21st International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2015). Moscow, Russia. RGGU
- Panchenko A. (2014): Sentiment Index of the Russian Speaking Facebook. In Proceedings of the 20th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2014). Moscow, Russia. RGGU
- Panchenko A., Muraviev N., Objedkov S. (2014): Neologisms on Facebook. In Proceedings of the 20th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue'2014). Moscow, Russia. RGGU
- Panchenko A., Teterin A. (2014): Gender Detection by Full Name: Experiments with the Russian Language. In Proceedings of 3rd Conference on Analysis of Images, Social Networks, and Texts. Communications in Computer and Information Science (CCIS) Volume 436, pp.169-182. Springer-Verlag Berlin Heidelberg
- Panchenko A., Naets H., Brouwers L., Romanov P., Fairon C. (2013): Recherche et visualisation de mots sémantiquement liés. In Proceedings of the 20th French Conference on Natural Language Processing (Conférence sur le Traitement Automatique des Langues Naturelles, TALN'2013). Les Sables d'Olonne, France. pp.747--754. Association pour le Traitement Automatique des Langues (ATALA)
- Panchenko A., Naets H., Beaufort R., Fairon C. (2013): Towards Detection of Child Sexual Abuse Media: Classification of the Associated Filenames. In Proceedings of the 35th European Conference on Information Retrieval (ECIR'2013). Lecture Notes in Computer Science (LNCS) vol. 7814, pp. 776-779. Springer-Verlag Berlin Heidelberg
- Panchenko A., Romanov P., Morozova O., Naets H., Romanov A., Philippovich A., Fairon C. (2013): Serelex: Search and Visualization of Semantically Similar Words. In Proceedings of the 35th European Conference on Information Retrieval (ECIR'2013). Lecture Notes in Computer Science (LNCS) vol. 7814, pp. 837-840. Springer-Verlag Berlin Heidelberg
- Panchenko A., Morozova O., Naets H. (2012): A Semantic Similarity Measure Based on Lexico-Syntactic Patterns. In Proceedings of the 11th Conference on Natural Language Processing (KONVENS'2012). pp.174--178. Vienna, Austria. Österreichische Gesellschaft für Artificial Intelligence (ÖGAI)
- Panchenko A. (2012): A Study of Heterogeneous Similarity Measures for Semantic Relation Extraction. In Proceedings of 14th French Conference on Natural Language Processing (JEP-TALN-RECITAL). Grenoble, France. Association pour le Traitement Automatique des Langues (ATALA)
Workshop proceedings
- Panchenko A., Faralli S., Ponzetto S. P., and Biemann C. (2017): Using Linked Disambiguated Distributional Networks for Word Sense Disambiguation. In Proceedings of the Workshop on Sense, Concept and Entity Representations and their Applications (SENSE) co-located with the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017). Valencia, Spain. Association for Computational Linguistics
- Pelevina M., Arefyev N., Biemann C., Panchenko A. (2016): Making Sense of Word Embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP co-located with the ACL conference. Berlin, Germany. Association for Computational Linguistics
- Ballweg K., Zouhar F., Wilhelmi-Dworski P., von Landesberger T., Fahrer U., Panchenko A., Yimam S.M., Biemann C., Regneri M., Ulrich H. (2016): new/s/leak – A Tool for Visual Exploration of Large Text Document Collections in the Journalistic Domain. Workshop on Visualisation in Practice co-located with the VIS conference, Baltimore, MD, USA
- Panchenko A., Faralli S., Ruppert E., Remus S., Naets H., Fairon C., Ponzetto S. P., and Biemann, C. (2016): TAXI at SemEval-2016 Task 13: a Taxonomy Induction Method based on Lexico-Syntactic Patterns, Substrings and Focused Crawling. In Proceedings of the 10th International Workshop on Semantic Evaluation. San Diego, CA, USA. Association for Computational Linguistics
- Kivimaki I., Panchenko A., Dessy A., Verdegem D., Francq P., Fairon C., Bersini H., Saerens M. (2013): A Graph-Based Approach to Skill Extraction from Text. In Proceedings of TextGraphs-8 Workshop co-located with the Conference on Empirical Methods in Natural Language Processing (EMNLP'2013). Seattle, USA. Association for Computational Linguistics
- Panchenko A., Morozova O. (2012): A Study of Hybrid Similarity Measures for Semantic Relation Extraction. In Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data co-located with the EACL'2012 conference, pp.10-18, Avignon, France. Association for Computational Linguistics
- Panchenko A., Beaufort R., Fairon C. (2012): Detection of Child Sexual Abuse Media on P2P Networks: Normalization and Classification of Associated Filenames. In Proceedings of Workshop on Language Resources for Public Security Applications of the 8th International Conference on Language Resources and Evaluation (LREC). European Language Resources Association (ELRA)
- Panchenko A., Adeykin S., Romanov P., Romanov A. (2012): Extraction of Semantic Relations between Concepts with KNN Algorithms on Wikipedia. In Proceedings of Concept Discovery in Unstructured Data (CDUD) workshop co-located with the International Conference On Formal Concept Analysis (ICFCA'2012), pp.78-88, Leuven, Belgium. CEUR-WS
- Panchenko A. (2012): Towards an Efficient Combination of Similarity Measures for Semantic Relation Extraction. Abstract in Proceedings of the 22nd Meeting of Computational Linguistics in the Netherlands (CLIN 22). Tilburg, The Netherlands. Tilburg University
- Panchenko A. (2011): Comparison of the Knowledge-, Corpus-, and Web-based Similarity Measures for Semantic Relations Extraction. In Proceedings of the Workshop GEometrical Models of Natural Language Semantics (GEMS) co-located with the EMNLP'2011 conference. pp.10-18. Edinburgh, Scotland. Association for Computational Linguistics
- Panchenko A. (2010): Can We Automatically Reproduce Semantic Relations of an Information Retrieval Thesaurus? In Proceedings of the Young Scientists Conference of the 4th Russian Summer School in Information Retrieval (RuSSIR'2010). pp.36–51. Voronezh, Russia. VSU
Software and Datasets
Software
- wsd: A system for unsupervised, knowledge-free, and interpretable word sense disambiguation. Main contributor: Fide Marten.
- Serelex: Lexico-semantic search engine, a web application for search and visualization of semantically related words. The system provides access to word similarity graphs for English, French, Russian, and Portuguese languages and features a RESTful API. Main contributors: Alexey Romanov and Pavel Romanov.
- PatternSim: A tool for (1) calculating the semantic relatedness of words and (2) extracting hypernyms based on lexico-syntactic patterns, also known as Hearst patterns. The tool implements extraction grammars in the form of Unitex FSTs for English, French, Russian, and Portuguese. The tool can process large amounts of text in parallel (tested on corpora up to 50Gb). Co-developed with Hubert Naets.
- Chinese Whispers: A fast Java implementation of the Chinese Whispers graph clustering algorithm (Biemann, 2006). The tool is able to perform global or local clustering of the input graph (the latter is also known as ego-network clustering and can be useful for graph-based word sense induction). Main contributor: Johannes Simon.
- PyMystem3: A morphological analyzer of the Russian language for Python based on Yandex MyStem 3.0 (a wrapper around the binary version of MyStem, which enables convenient use of the tool from Python). Main contributor: Denis Sukhonin.
- SenseGram: A tool for learning word sense embeddings on the basis of existing word embeddings. The system also provides word sense disambiguation functionality on the basis of the learned word sense embeddings. The approach received the best paper award at the representation learning workshop at the ACL'2016 conference. Main contributor: Maria Pelevina.
- TAXI: TAXonomy Induction system based on Lexico-Syntactic Patterns, Substrings and Focused Crawling. This system was ranked first at the SemEval 2016 shared task on taxonomy extraction from text.
- lefex: This project contains Hadoop jobs for extraction of word, feature and word-feature counts from texts. These counts can be helpful for computing distributional thesauri and word graphs. Several types of features are available including syntactic and trigram features. Co-developed with Johannes Simon.
- STC: Short Text Categorizer is a lightweight and fast text categorization library written in C++ based on linear models and vector space model. The library relies on the LibLINEAR library for supervised categorization.
- Lexical-semantic language map of the Russian language. This visualization was obtained by clustering the graph of semantically related words produced by Serelex (see above). The visualization groups semantically similar words, e.g. countries, names, etc.
- russe-evaluation: A collection of scripts and datasets for evaluation of semantic similarity/relatedness measures for the Russian language. This collection of tools contains several datasets that are based on human judgments (HJ), cognitive associations (AE), and semantic relations (synonyms and hypernyms) extracted from thesauri (RT).
- JoSimText: An implementation of the JoBimText approach (Biemann and Riedl, 2013) in Apache Spark.
Datasets
- RDT: Russian Distributional Thesaurus is a large-scale resource that consists of a distributional thesaurus and word vectors. In order to build the distributional thesaurus, we used the skip-gram model (Mikolov et al., 2013) trained on a 12.9 billion word collection of books in Russian (around 150Gb of plain text). According to the results of our participation in the shared task on Russian semantic similarity (Panchenko et al., 2015), this approach scored in the top submissions showing the state-of-the-art results in word relatedness for Russian. Due to its high coverage, the resource can be used in various applications.
- RUSSE: four semantic relatedness resources for Russian, each being a list of triples (word_i, word_j, similarity_ij) designed for evaluation of semantic relatedness, each complementing another in terms of relation type. These benchmarks were used in a shared task on Russian semantic similarity.
- Serelex lexical-semantic networks are distributional thesauri weighted with semantic relatedness scores based on the PatternSim approach. These are the resources used in the Serelex lexical semantic search engine.
- English: This network was extracted from a concatenation of Wikipedia and ukWaC corpora.
- French: The network was extracted from a concatenation of Wikipedia and frWaC corpora. Another version is available extracted from a lower-cased corpus.
- Russian: a distributional thesaurus weighted with semantic relatedness scores based on the PatternSim approach. The network was extracted from a concatenation of Wikipedia and ruWaC corpora. Another version is available extracted only from Wikipedia corpus.
- A large-scale corpus of books in Russian: this 12.9 billion tokens corpus was assembled from various books from the lib.rus.ec website.
- Data for sentiment analysis for the Russian language:
- Posts from the lovehate.ru website: This is a complete crawl of the website as of 2015. The site is organized so that each topic can receive "love" and "hate" comments, i.e. positive or negative opinions of users on the topic. The collection can be useful for training statistical sentiment classifiers.
- Dictionary of abusive words: a comprehensive compilation of freely available dictionaries and lists of abusive words. Abusive words usually denote highly negative emotions and can be helpful for feature extraction.
- Sentiment dictionary: this sentiment lexicon contains over 11K words carefully verified by several annotators. This dataset can be useful for building a dictionary-based sentiment analysis system or for generation of features in a supervised system.
Teaching
Supervision of MA Theses
I supervised research-oriented Master's theses, usually also aiming to publish a conference paper on the basis of the produced materials.
- Matthias Schildwächter (2019, University of Hamburg): An Open-Domain System for Retrieval and Visualization of Comparative Arguments from Text.
- Alvin Rindra Fazrie (2019, University of Hamburg): Visual Information Management with Compound Graphs. Main supervisor: Steffen Remus.
- Dahmash Ibrahim (2018, University of Hamburg): Question Answering using Dynamic Neural Networks. Main supervisor: Benjamin Milde.
- Mirco Franzek (2018, University of Hamburg). Comparative Argument Mining. Co-supervised with Chris Biemann.
- Marten Fide (2017, TU Darmstadt). Predicting hypernyms in contexts with JoBimText. Co-supervised with Chris Biemann. Now at TU Darmstadt.
- Maria Pelevina (2016, TU Darmstadt). Unsupervised Word Sense Disambiguation with Sense Embeddings. Co-supervised with Chris Biemann. Now at Deutsche Bahn R&D.
- Simon Dif (2015, TU Darmstadt). Statistical Models of Semantics with Structured Topics. Co-supervised with Chris Biemann. A dual degree Masters program with ENSIMAG, Grenoble, France. Now at Altran R&D.
- Alexey Romanov (2012, Moscow State Technical University). Graph Algorithms in the Lexical Semantic Search Engine 'Serelex'. Co-supervised with Andrew Philippovich. Now at the University of Massachusetts Lowell.
Internships & Visiting Researchers
I help write research proposals to funding organizations that let researchers visit our faculty and carry out short-term research projects together.
- Dmitry Ustalov (2016): Graph Clustering for Word Sense Induction. Funded by DAAD
- Artem Chernodub (2017): Recurrent Neural Networks for Argument Mining. Funded by DAAD.
- Andrey Kutuzov (2018): Learning graph embeddings via node similarities. Funded by the University of Oslo.
- Shantanu Acharya (2018): Taxonomy induction using word sense representations. Funded by DAAD.
Professional Activities
Organization of Events
- Special issue of Natural Language Engineering journal on informing neural architectures for NLP with linguistic and background knowledge.
- Shared task on Word Sense Induction for the Russian Language (RUSSE'2018). This is a part of the Dialogue 2018 conference on Computational Linguistics.
- Conference on Analysis of Images, Social Networks and Texts (AIST), area chair of the Natural Language Processing track. I was involved in the organization of the conference and the editing of the proceedings published in the Springer CCIS series in 2014, 2015, 2016, 2017, and 2018.
- The 3rd International Workshop on Concept Discovery in Unstructured Data (CDUD) co-located with the 13th International Conference on Concept Lattices and Their Applications
- The First International Workshop on Russian Semantic Similarity (RUSSE) co-located with the 21st International Conference on Computational Linguistics Dialogue'2015. See also the website with the description of the shared task and the datasets.
Programme Committee for Conferences and Workshops
- IWCS 2019: International Conference on Computational Semantics (ACL SIGSEM special interest group on semantics)
- CoNLL 2018: The SIGNLL Conference on Computational Natural Language Learning
- ACL 2018, 2019: Annual Meeting of the Association for Computational Linguistics.
- *SEM 2018, 2019: Joint Conference on Lexical and Computational Semantics
- NAACL 2018, 2019: North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- SocInfo 2018: Social Informatics
- CLL 2018: 3rd Workshop on "Computational linguistics and language science"
- EMNLP 2017, 2018: The Conference on Empirical Methods in Natural Language Processing
- ESWC 2017, 2018: The Semantic Web conference
- ASSET 2017: Workshop on Advanced Solutions for Semantic Extraction from Texts co-located with the ESWC 2017 conference
- TextGraphs 2016, 2017, 2018: Workshop on Text Graph co-located with the ACL conference.
- ReprL4NLP 2017, 2018: Workshop on Representation Learning for NLP co-located with the ACL conference.
- SMERP 2017: International Workshop on Exploitation of Social Media for Emergency Relief and Preparedness (co-located with the 39th European Conference on Information Retrieval, ECIR 2017)
- COLING 2016, 2018: International Conference on Computational Linguistics
- AINL 2015, 2016, 2017: Conference on Artificial Intelligence and Natural Language
- SEMANTiCS 2016, 2017: International Conference on Semantic Systems
- Dialogue 2015, 2016, 2017, 2018: International Conference on Computational Linguistics
- RECITAL 2015, 2016, 2017: Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues, co-located with TALN conference
- NLDB 2015, 2016, 2017: International Conference on Natural Language & Information Systems
- WI 2014, 2015: IEEE/WIC/ACM International Conference on Web Intelligence
- RuSSIR 2014, 2015: Young Scientists Conference at Russian Summer School in Information Retrieval
- AIST 2014, 2015, 2016, 2017, 2018: Conference on Analysis of Images, Social Networks and Texts
- RANLP 2013, 2015: Conference on Recent Advances in Natural Language Processing
- LTC 2011, 2013: The Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics
Journal Reviewing
- Information Processing & Management, Elsevier (2018)
- Language Resources & Evaluation, Springer (2018)
- PLOS ONE (2018)
- Natural Language Engineering, Cambridge University Press (2018)
- Data & Knowledge Engineering (DATAK), Elsevier (2017, 2018)
- International Journal of Artificial Intelligence and Soft Computing, Interscience (2016)
- Internet Computing Journal, IEEE (2015)
- International Journal of Child Abuse & Neglect, Elsevier (2014)
Personal Interests
traveling, photography, speleology