IPT and Twitter Introversion Extraversion Dataset

Accompanying Dataset for the Paper:

Classification of German Jungian Extraversion and Introversion Texts with
Assessment of Changes during the COVID-19 Pandemic

This dataset contains the data used and analyzed in the paper 'Classification of German Jungian Extraversion and Introversion Texts with Assessment of Changes during the COVID-19 Pandemic'.

Authors

* Dirk Johannßen
* Chris Biemann
* David Scheffer

Publication and detailed descriptions

Johannßen, D., Biemann, C., Scheffer, D., 2022. Classification of German Jungian Extraversion and Introversion Texts with Assessment of Changes during the COVID-19 Pandemic. In Proceedings of the LREC22 workshop on Resources and Processing of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive / psychiatric / developmental impairments (RaPID-4). European Language Resources Association (ELRA), Marseille, France (pdf).

Licence

This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0/ or consult the license.txt file.

Gathering

The German-language text data used to create the model was collected by the "Wirtschaftsakademie" (WafM), a company specializing in aptitude diagnostic testing.
The experimental data was drawn from Twitter, a micro-messaging service. The service offers an API for downloading 1% of the worldwide traffic of the social network (Gerlitz and Rieder, 2013).

Structure

The data is provided in two files, namely 20220402-ipt-intro-extra.tsv and 20220402-twitter-intro-extra.tsv. Both contain tab-separated values (tsv). The former file is structured as:

<label introversion / extraversion> \t <text> \t <ID>

The latter file is structured as:

<timestamp> \t <text> \t <label Extra / Intro>
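
The two column layouts above can be read with Python's standard csv module. The following is a minimal sketch, not part of the original release. It assumes the files have no header row and that fields are not quoted. The column names, the sample rows, and the timestamp format are invented for illustration only.

```python
import csv
import io
from collections import Counter

# Column orders as described above; the names themselves are hypothetical.
IPT_COLUMNS = ("label", "text", "id")
TWITTER_COLUMNS = ("timestamp", "text", "label")


def read_tsv(handle, columns):
    """Yield one dict per row, skipping rows without the expected column count."""
    # QUOTE_NONE treats quotation marks inside texts as ordinary characters
    # (assumption: the files do not use CSV-style quoting).
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(columns):
            continue  # empty or malformed line
        yield dict(zip(columns, row))


def label_counts(rows):
    """Count rows per label, ignoring case and surrounding whitespace."""
    return Counter(row["label"].strip().lower() for row in rows)


# Invented sample rows standing in for the real files.
ipt_sample = "introversion\tBeispieltext eins\t1\nextraversion\tBeispieltext zwei\t2\n"
twitter_sample = "2020-03-01\tBeispiel Tweet\tExtra\n"

ipt_rows = list(read_tsv(io.StringIO(ipt_sample), IPT_COLUMNS))
twitter_rows = list(read_tsv(io.StringIO(twitter_sample), TWITTER_COLUMNS))

print(label_counts(ipt_rows))
print(twitter_rows[0]["label"])
```

To read the actual files, replace the io.StringIO objects with file handles, for example open("20220402-ipt-intro-extra.tsv", encoding="utf-8", newline=""). The UTF-8 encoding is an assumption as well.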

Download

Please download the zip archive of the dataset here.