Research data finder

IMPORTANT INFORMATION ABOUT ETSIN! Old Etsin will be migrated into the new Etsin at the end of June 2019. After the migration, all PUBLISHED datasets will be visible in the new Etsin.
Describing datasets in Etsin will not be possible after 12th June 2019. Instead, datasets will be described in the new metadata tool, Qvain, which will be launched at the beginning of July 2019.
Note! Remember to publish your dataset if you want it to be migrated into new Etsin.

Search for a Dataset

9 datasets found
  • Metadata: 2/5

    Professor Marjatta Wis' Corpus

    The corpus contains i.a. press cuttings, hand-written notes, manuscripts, microfilms and photographs, all in non-electronic format, that belonged to Professor Marjatta Wis (1915-2008).
  • Metadata: 2/5

    Written and Oral Data of the TAITO-project

    The corpus contains: a) Texts written by students of German, French, Italian, Swedish or English, who have just started their studies or who are at the end of their first year of study. b) Videos of partially transcribed discussions. In most of the cases the participants in the discussions are two students and one native speaker. The corpus contains...
  • Metadata: 2/5

    The National Certificates Corpus

    The NC test results, background information, and speaking and writing performances in 9 foreign / second languages. A web-based database (html files). The corpus contains background information and test results (5 sub-tests, 9 different languages) from 14 000 test takers as SPSS files, 2 000 writing performances, and 700 speaking performances. More...
  • Metadata: 2/5

    The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora

    The corpora are available in Kielipankki - the Language Bank of Finland. The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora are: The Helsinki Korp JRC-Acquis Finnish-English Corpus, The Helsinki Korp JRC-Acquis Finnish-Swedish Corpus, The Helsinki Korp JRC-Acquis Finnish-German Corpus, The Helsinki Korp JRC-Acquis...
  • Metadata: 2/5

    Lists of Words Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland. Location: /appl/kielipankki/words (only Finnish available). The lists of words located at the University of Helsinki Language Corpus Server were generated from the corpora of the following languages:...
  • Metadata: 2/5

    Opus, Helsinki Korp Version

    The Helsinki Korp version of the Opus open parallel corpus, containing scrambled sentences, has been published in Korp. The subcorpora of Opus, Helsinki Korp Version are: OPUS Finnish–Czech, OPUS Finnish–Danish, OPUS Finnish–Dutch, OPUS Finnish–English, OPUS Finnish–Estonian, OPUS...
  • Metadata: 2/5

    JRC-Acquis Multilingual Parallel Corpus

    The Acquis Communautaire (AC) is the total body of European Union (EU) law applicable in the EU Member States. This collection of legislative text changes continuously and currently comprises selected texts written between the 1950s and now. As of the beginning of the year 2007, the EU had 27 Member States and 23 official languages. The Acquis...
  • Metadata: 2/5

    Lists of Words Corpus (UHLCS), Helsinki Korp Version

    The resource, a variant of Lists of Words Corpus (UHLCS), will be made available at
  • Metadata: 2/5

    Europarl Parallel Corpus

    The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It includes versions in 21 European languages: Romanic (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish, Hungarian, Estonian), Baltic...