Seid Muhie Yimam
Name: Seid Muhie Yimam
Position: HCDS: Technical Lead; LT: Research Associate
Email: seid.muhie.yimam@uni-hamburg.de
Phone: +49 40 42883 2383
Fax: +49 40 42883 2345
Office: F-426
Address: Informatikum, Vogt-Kölln-Straße 30, 22527 Hamburg
Google Scholar | Github Page | CV
I am Seid, currently a postdoctoral researcher at the LT lab under the supervision of Prof. Chris Biemann. I have been working as a scientific software engineer at the LT lab since September 2012. I have participated in the development of NLP tools such as Par4Sem, WebAnno, new/s/leak, GermaNER, and Network of the Day, and I also assist Chris Biemann in teaching and student supervision. Previously, I worked as a semantic web software developer at Okkam srl, a semantic web start-up in Trento, Italy, from September 2011 to August 2012.
My doctoral degree (Ph.D.) focused on the integration of adaptive machine learning approaches into interactive annotation tools and semantic writing aids.
My current research focus is on the development of NLP technologies for social applications and low-resource languages.
I received an advanced master's degree in Human Language Technology and Interfaces from the University of Trento, Italy, in September 2011. I also received M.Sc. and B.Sc. degrees in Computer Science from the Department of Computer Science, Addis Ababa University, Ethiopia, in July 2009 and July 2004, respectively.
Projects
- DIVID-DJ: Data Extraction and Interactive Visualization of Unexplored Textual Datasets for Investigative Data-Driven Journalism
- SEMSCH: Semantic Methods for Computer-supported Writing Aids
- NoD: Network of the Day, Hochschulwettbewerb 2014
- CLARIN-D: Implementation of a web-based annotation platform for linguistic annotations (F-AG 7)
Publications
Monographs
Book Chapters
- Biemann, C., Bontcheva, K., Eckart de Castilho, R., Gurevych, I., Yimam, S.M. (2017): Collaborative Web-based Tools for Multi-layer Text Annotation. In: N. Ide and J. Pustejovsky (Eds.): Handbook of Linguistic Annotation, Springer (pdf)
Journal Publications
- Ayalew Kassahun, Seid Muhie Yimam, Yonas Seifu Muanenda, Beshir Melkaw Ali, and Seleshi Getahun Yalew (2024): Uncovering the priorities of scientific research on sustainable development goals: A case study in Ethiopia, Sustainable Development, published by ERP Environment and John Wiley & Sons Ltd. 2024;1–26, DOI: 10.1002/sd.3020 (pdf)
- Jana, A., Venkatesh, G., Yimam, S.M., and Biemann, C., Hypernymy Detection for Low-Resource Languages: A Study for Hindi, Bengali, and Amharic, ACM Transactions on Asian and Low-Resource Language Information Processing (2022). (pdf)
- Yimam, S.M., Biemann, C., Majnaric, L., Šabanović, Š., Holzinger, A. (2016): An adaptive annotation approach for biomedical entity and relation recognition. Brain Informatics, (online first), 10.1007/s40708-016-0036-4 (pdf)
- Yimam, S.M., Ayele, A.A., Venkatesh, G., Gashaw, I. and Biemann, C. (2021): Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets. Future Internet 2021, 13, 275. (pdf) https://doi.org/10.3390/fi13110275
Conference Proceedings
- Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine de Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Indra Winata, Seid Muhie Yimam, Saif M. Mohammad (2024): SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages. Findings of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand. (pdf)
- Atnafu Lambebo Tonja, Israel Abebe Azime, Tadesse Destaw Belay, Mesay Gemeda Yigezu, Moges Ahmed Mehamed, Abinew Ali Ayele, Ebrahim Chekol Jibril, Michael Melese Woldeyohannis, Olga Kolesnikova, Philipp Slusallek, Dietrich Klakow and Yimam, S.M. (2024): EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation, The 2024 Joint International Conference on Computational Linguistics, Language and Evaluation (LREC-COLING 2024, Torino, Italy) (pdf)
- Ayele A.A., Yimam S. M., Belay T.D., Asfaw T. and Biemann C. (2023): Exploring Amharic Hate Speech Data Collection and Classification Approaches, in the 14th Conference RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING, Varna, Bulgaria (pdf)
- Ayele A.A., Dinter S., Yimam S. M. and Biemann C. (2023): Multilingual Racial Hate Speech Detection Using Transfer Learning, in the 14th Conference RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING, Varna, Bulgaria (pdf)
- Schneider, F., Yimam S.M., Petersen-Frey, F., Biemann, C., von Nordheim, G., Kleinen-von Königslöw, K. (2023): CodeAnno: Extending WebAnno with Hierarchical Document Level Annotation and Automation. The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), System Demonstrations Track, Dubrovnik, Croatia (pdf)
- Belay T.D., Tonja A.L., Kolesnikova O., Yimam S. M., Ayele A.A., Haile S.B., Sidorov G., Gelbukh A. (2022): The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation, International Conference on Information and Communication Technology for Development for Africa (ICT4DA 2022), Bahir Dar, Ethiopia (pdf)
- Ayele A.A., Belay T.D., Asfaw, T.T., Dinter S., Yimam S. M., Biemann C. (2022): The 5Js in Ethiopia: Amharic Hate Speech Data Annotation Using Toloka Crowdsourcing Platform, International Conference on Information and Communication Technology for Development for Africa (ICT4DA 2022), Bahir Dar (pdf)
- Remus S., Wiedemann G., Anwar S., Petersen-Frey F., Yimam S. M., Biemann C. (2022), More Like This: Semantic Retrieval with Linguistic Information, In Proceedings of the 18th Conference on Natural Language Processing (KONVENS 2022), pages 156–166, Potsdam, Germany (pdf).
- Beloucif M., Yimam, S.M., Stahlhacke S. and Biemann C. (2022): Elvis vs. M. Jackson: Who has More Albums? Classification and Identification of Elements in Comparative Question. In the 2022 International Conference on Language Resources and Evaluation (LREC 2022), Marseille, France (pdf).
- Belay, T. D., Ayele, A.A., Gelaye, G., Yimam, S.M., and Biemann, C. (2021): Impacts of Homophone Normalization on Semantic Models for Amharic. Proceedings of the Third International Conference on ICT for Development for Africa (ICT4DA 2021), Bahir Dar, Ethiopia (pdf)
- von Boguszewski, N., Moin, S., Bhowmick, A., Yimam, S.M., Biemann, C. (2021): How Hateful are Movies? A Study and Prediction on Movie Subtitles. Proceedings of KONVENS, Düsseldorf, Germany (pdf)
- Wiechmann, M., Yimam S. M., Biemann, C. (2021): ActiveAnno: General-Purpose Document-Level Annotation Tool with Active Learning Integration. The 2021 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies - System Demonstrations, Mexico City, Mexico (online) (pdf)
- Gooding, S., Kochmar, E., Yimam S. M., Biemann, C. (2021): Word Complexity is in the Eye of the Beholder. The 2021 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT), Mexico City, Mexico (pdf).
- Mathew, B., Saha, P., Yimam S. M., Biemann, C., Goyal, P., Mukherjee, A. (2021): HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection. Proceedings of AAAI-21, Virtual Conference. (pdf)
- Haase, C., Anwar, S., Yimam S. M., Friedrich, A., Biemann, C. (2021): SCoT: Sense Clustering over Time: a tool for the analysis of lexical change. The 2021 Conference of the European Chapter of the Association for Computational Linguistics - System Demonstrations. Kyiv, Ukraine (Online) (pdf)
- Yimam S. M., Alemayehu H. M., Ayele A. A. and Biemann C. (2020): Exploring Amharic Sentiment Analysis from Social Media Texts: Building Annotation Tools and Classification Models. The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, Spain (pdf) (poster)
- Yimam S. M., Venkatesh, G., Lee, J., Biemann, C. (2020): Automatic Compilation of Resources for Academic Writing and Evaluating with Informal Word Identification and Paraphrasing System, The International Conference on Language Resources and Evaluation (LREC 2020), Marseille, France. (pdf)
- Wiedemann G., Yimam S.M., and Biemann C. (2018): A Multilingual Information Extraction Pipeline for Investigative Journalism, In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). Brussels, Belgium (pdf).
- Yimam S. M., Biemann C. (2018): Demonstrating Par4Sem - A Semantic Writing Aid with Adaptive Paraphrasing. In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). Brussels, Belgium (pdf).
- Wiedemann G., Yimam S.M., and Biemann C. (2018): New/s/leak 2.0 – Multilingual Information Extraction and Visualization for Investigative Journalism. In: Proceedings of the 10th International Conference on Social Informatics (SocInfo 2018). St. Petersburg, Russia (pdf)
- Yimam S.M., Biemann C. (2018): Par4Sim – Adaptive Paraphrasing for Text Simplification. In Proceedings of The 27th International Conference on Computational Linguistics (COLING 2018). Santa Fe, New Mexico, USA (pdf).
- Yimam S.M., Štajner S., Riedl M., Biemann C. (2017): CWIG3G2 - Complex Word Identification Task across Three Text Genres and Two User Groups. In Proceedings of The 8th International Joint Conference on Natural Language Processing (IJCNLP 2017). Taipei, Taiwan (pdf) (to appear)
- Yimam S.M., Štajner S., Riedl M., Biemann C. (2017): Multilingual and Cross-Lingual Complex Word Identification. In Proceedings of The 2017 International Conference on Recent Advances in Natural Language Processing (RANLP). Varna, Bulgaria (pdf)
- Yimam, S.M., Ulrich, H., von Landesberger, T., Rosenbach, M., Regneri, M., Panchenko, A., Lehmann, F., Fahrer, U., Biemann, C. and Ballweg, K. (2016): new/s/leak – Information Extraction and Visualization for Investigative Data Journalists. ACL 2016 Demo Session, Berlin, Germany (pdf)
- Yimam, S.M., Biemann, C., Majnaric, L., Šabanović, Š., Holzinger, A. (2015): Interactive and Iterative Annotation for Biomedical Entity Recognition, International Conference on Brain Informatics and Health (BIH’15), London, UK (pdf)
- Benikova, D., Yimam, S.M., Biemann C. (2015). GermaNER: Free Open German Named Entity Recognition Tool. In: Proceedings of the GSCL 2015. Essen, Germany (pdf)
- Yimam, S.M., Eckart de Castilho, R., Gurevych, I., Biemann C. (2014): Automatic Annotation Suggestions and Custom Annotation Layers in WebAnno. Proceedings of ACL 2014 System Demonstrations, Baltimore, MD, USA (pdf)
- Yimam, S.M., Gurevych, I., Eckart de Castilho, R., and Biemann C. (2013): WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations. Proceedings of ACL-2013, demo session, Sofia, Bulgaria (pdf)
Workshop Proceedings
- Melese Ayichlie Jigar, Abinew Ali Ayele, Seid Muhie Yimam and Chris Biemann (2024): Detecting Hate Speech in Amharic Using Multimodal Analysis of Social Media Memes. Proceedings of The Fourth Workshop on Threat, Aggression & Cyberbullying, Torino, Italy (pdf)
- Abinew Ali Ayele, Esubalew Alemneh Jalew, Adem Chanie Ali, Seid Muhie Yimam, Chris Biemann (2024): Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse. Proceedings of The Fourth Workshop on Threat, Aggression & Cyberbullying, Torino, Italy (pdf)
- Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Seid Muhie Yimam, David Ifeoluwa Adelani, Ibrahim Sa'id Ahmad, Nedjma Ousidhoum, Abinew Ayele, Saif M Mohammad, Meriem BELOUCIF, Sebastian Ruder. SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval): arXiv preprint arXiv:2304.06845. 2023. (pdf)
- Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Ifeoluwa Adelani, Seid Muhie Yimam, Ibrahim Sa'id Ahmad, Meriem BELOUCIF, Saif Mohammad, Sebastian Ruder, Oumaima Hourrane, Pavel Brazdil, Felermino Dário Mário António Ali, Davis David, Salomey Osei, Bello Shehu Bello, Falalu Ibrahim, Tajuddeen Gwadabe, Samuel Rutunda, Tadesse Belay, Wendimu Baye Messelle, Hailu Beshada Balcha, Sisay Adugna Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Steven Arthur. Afrisenti: A Twitter sentiment analysis benchmark for African languages: arXiv preprint arXiv:2302.08956. 2023. (pdf)
- Tonja A. L., Belay T. D., Azime I. A., Ayele A. A., Mehamed M. A., Kolesnikova O., Yimam S. M. (2023): Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities, In the fourth workshop on Resources for African Indigenous Languages (RAIL) at EACL2023, Dubrovnik, Croatia (pdf)
- Banerjee D., Yimam S.M., Awale S. and Biemann C (2023), ARDIAS: AI-Enhanced Research Management, Discovery, and Advisory System, The AAAI-23 Workshop on Scientific Document Understanding at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23), Washington, DC, USA. (preprint pdf)
- Ayele A.A., Belay T.D., Yimam S. M., Dinter S., Asfaw, T.T., Biemann C. (2022): The 5Js in Ethiopia: Amharic Hate Speech Data Annotation Using Toloka Crowdsourcing Platform, The Sixth Widening Natural Language Processing Workshop (WiNLP 2022) in conjunction with EMNLP 2022, Abu Dhabi, UAE (pdf)
- Belay T.D., Tonja A.L., Kolesnikova O., Yimam S. M., Ayele A.A., Haile S.B., Sidorov G., Gelbukh A. (2022): The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation, The Sixth Widening Natural Language Processing Workshop (WiNLP 2022) in conjunction with EMNLP 2022, Abu Dhabi, UAE (pdf)
- Belay, T. D., Yimam, S.M., Ayele, A. A., and Biemann, C. (2022): Question Answering Classification for Amharic Social Media Community Based Questions, The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022), Marseille, France (pdf).
- Destaw T., Ayele A.A. and Yimam, S.M. (2021): The Development of Pre-processing Tools and Pre-trained Embedding Models for Amharic. Proceedings of The fifth WiNLP (“Widening NLP”) Workshop held in conjunction with EMNLP 2021, Punta Cana, Dominican Republic. (pdf).
- Wiedemann G, Yimam S.M., Biemann C. (2020): UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection. Proceedings of The 14th International Workshop on Semantic Evaluation (SemEval), Barcelona, Spain. (pdf) (poster)
- Yimam S. M., Ayele, A. A., Biemann C. (2019): Analysis of the Ethiopic Twitter Dataset for Abusive Speech in Amharic. In Proceedings of International Conference On Language Technologies For All: Enabling Linguistic Diversity And Multilingualism Worldwide (LT4ALL 2019). Paris, France (pdf).
- Yimam, S.M., Biemann, C., Malmasi, S., Paetzold, G.H., Specia, L., Štajner, S., Tack, A., Zampieri, M. (2018): A Report on the Complex Word Identification Shared Task 2018. Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications, New Orleans, LA, USA (pdf)
- Yimam S.M., Remus S., Panchenko A., Holzinger A., Biemann C. (2017): Entity-Centric Information Access with the Human-in-the-Loop for the Biomedical Domains. Biomedical NLP Workshop associated with RANLP 2017. Varna, Bulgaria (pdf)
- Müller, M., Ballweg, K., von Landesberger, T., Yimam, S.M., Fahrer, U., Biemann, C., Rosenbach, M., Regneri, M., Ulrich, H. (2017): Guidance for Multi-Type Entity Graphs from Text Collections. EuroVis Workshop on Visual Analytics 2017, Barcelona, Spain (pdf)
- Nandi, T., Biemann, C., Yimam, S.M., Gupta, D., Kohail, S., Ekbal, A., Bhattacharyya, P. (2017): IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Question Answering and Implicit Dialogue Identification, In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), Vancouver, Canada. (pdf)
- Eckart de Castilho, R., Mújdricza-Maydt, E., Yimam, S.M., Hartmann, S., Gurevych, I., Frank, A. and Biemann, C. (2016): A Web-based Tool for the Integrated Annotation of Semantic and Syntactic Structures. Proceedings of the COLING workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH), Osaka, Japan (pdf)
- Ballweg K., Zouhar F., Wilhelmi-Dworski P., von Landesberger T., Fahrer U., Panchenko A., Yimam S.M., Biemann C., Regneri M., Ulrich H. (2016): new/s/leak – A Tool for Visual Exploration of Large Text Document Collections in the Journalistic Domain, Baltimore, MD, USA (pdf)
- Yimam, S.M., Martínez Alonso, H., Riedl M. and Biemann, C. (2016): Learning Paraphrasing for Multiword Expressions. The 12th Workshop on Multiword Expressions (MWE 2016), co-located with ACL 2016, Berlin, Germany (pdf)
- Yimam, S.M. (2015): Narrowing the Loop: Integration of Resources and Linguistic Dataset Development with Interactive Machine Learning. NAACL 2015 Student Research Workshop, p. 88--95, Denver, Colorado (pdf)
- Eckart de Castilho, R., Biemann, C., Gurevych, I., Yimam, S.M. (2014): WebAnno: a flexible, web-based annotation tool for CLARIN. CLARIN Annual Conference 2014, Soesterberg, The Netherlands (pdf)
- Benikova, D., Fahrer, U., Gabriel, A., Kaufmann, M., Yimam, S.M., von Landesberger, T., Biemann, C. (2014): Network of the Day: Aggregating and Visualizing Entity Networks from Online Sources. KONVENS 2014 Workshop proceedings: NLP4CMC, pp. 48-52, Hildesheim, Germany (pdf)
- Yimam, S.M., Libse, M. (2009): TETEYEQ: Amharic Question Answering For Factoid Questions, SEPLN09. SALTMIL workshop - Information Retrieval and Information Extraction for Less Resourced Languages (IE-IR-LRL), p. 17-25 (pdf)
Software and demo
Software
- GermaNER: German named entity recognition
- WebAnno: Web-based, distributed, and generic annotation tool
Demo
- WebAnno: Web-based, distributed, and generic annotation tool
- Network of the day: Interactive visualization of time-dependent relationships of public agents
- new/s/leak: NetWork of Searchable Leaks
Professional Activities
- Organizer of the Complex Word Identification (CWI) Shared Task 2018
- Dataset compilation
- Baseline System
- Writing the CWI report
- Scientific Paper review
- ACL
- EMNLP
Teaching
- SS 2018 - Supervision: Masterproject Web Interfaces for Language Processing Systems
- WS 2017/18 - Practical Classes: Natural Language Processing and the Web
- WS 2016/17 - Practical Classes: Natural Language Processing and the Web
- WS 2016/17 - Practical Classes: 64-071 Übung Algorithmen und Datenstrukturen
- WS 2015/16 - Practical Classes: Natural Language Processing and the Web
- SS 2014 - Seminar coordinator: Knowledge Engineering for Question Answering Systems, Seminars
- WS 2014/15 - Practical Classes: Natural Language Processing and the Web