Informatisches Colloquium Hamburg
Unless otherwise stated, talks take place on Mondays at 17:15 at the Informatikum, Konrad-Zuse-Hörsaal, Building B, Vogt-Kölln-Str. 30, Hamburg-Stellingen.
22.11.2004

Steven Krauwer
Utrecht University
ELSNET / Institute of Linguistics

The Basic Language Resources Kit (BLARK)

We define the Basic Language Resources Kit (abbreviated BLARK) as the minimal set of language resources that is needed to do any precompetitive R&D and education at all for a language. We see it as an instrument to help especially (but not exclusively) the technologically less developed languages in setting priorities when building the foundations of their language resources infrastructure and as a way to facilitate porting of knowledge and expertise between languages.

The BLARK may contain various types of components, such as written and spoken corpora, dictionaries, treebanks, morphological analyzers, parsers, speech recognizers, etc. The BLARK specification will also include references to standards, cost estimates, information on availability, etc.

Both the BLARK specification and the way it is specified should be largely language-independent, although some languages or language families will clearly require specific adaptations or extensions.

In my talk I will present the BLARK concept and some first attempts we have made to arrive at an initial BLARK specification. I would very much like to exchange ideas on what today's ideal BLARK would look like, including issues such as quantity (e.g. minimal size of a corpus), quality (e.g. best standards for representation and annotation), cost (development of modules, acquisition of data), intellectual property rights (IPR), etc.

Contact
Prof. Dr. W. v. Hahn
Phone +49 40 42883 2434