Research data finder

IMPORTANT INFORMATION ABOUT ETSIN! Old Etsin (etsin.avointiede.fi) will be migrated to the new Etsin (etsin.fairdata.fi) at the end of June 2019. After the migration, all PUBLISHED datasets will be visible in the new Etsin.
Describing datasets in Etsin will not be possible after 12th June 2019. Instead, datasets will be described in the new metadata tool, Qvain, which will be launched at the beginning of July 2019.
Note! Remember to publish your dataset if you want it to be migrated to the new Etsin.

Search for a Dataset

40 datasets found
  • Metadata: 2/5

    Helsinki Corpus of Swahili 2.0 (HCS 2.0)

    Helsinki Corpus of Swahili 2.0 is now available for research purposes in Kielipankki - the Language Bank of Finland. The corpus contains about 25 million words of written text, and it is available in two formats. The annotated version contains morphological and syntactic annotation as well as glosses in English. The unannotated version contains plain...
  • Metadata: 2/5

    Lists of Words Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/words (only Finnish available) The lists of words located at the University of Helsinki Language Corpus Server were generated from the corpora of the following languages:...
  • Metadata: 2/5

    Finnish Text Collection

    The corpus is available in Kielipankki - the Language Bank of Finland at https://korp.csc.fi/#?corpus=ftc, as well as downloadable at http://urn.fi/urn:nbn:fi:lb-2014052719 Corpus location instructions: https://www.kielipankki.fi/support/corpus-location/ (in Finnish: https://www.kielipankki.fi/tuki/aineiston-sijainti-kielipankissa/) Access rights...
  • Metadata: 2/5

    KOTUS Finnish-Swedish Parallel Corpus

    The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2015101601 (Finnish sub-corpus) and https://korp.csc.fi/?mode=swedish#?lang=fi&prequery_within=sentence&cqp=[]&corpus=kfspc_sv (Swedish sub-corpus) with a Creative Commons BY 4.0 license, as well as downloadable in Taito and...
  • Metadata: 2/5

    Nenets Corpus (Tundra Nenets) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/samoyedic-lgs/nenets Contents: Fragments of the Gospel of Luke in the Nenets Language. Translation: Barmich, Mariya...
  • Metadata: 2/5

    Komi Zyrian Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/permic-lgs/komi Contents: 1. Jesus Friend of Children. ISBN 91-88394-64-6, ISBN 952-9790-13-9. Institute...
  • Metadata: 2/5

    Uzbek-English Dictionary (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/turkic-lgs/south-east-turkic-lgs/uzbek The Uzbek-English dictionary was compiled by Daniel Kimmage. Size of the dictionary: approx....
  • Metadata: 2/5

    Latin Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/general-linguistics/indo-european-lgs/latin Bible texts in Latin. The material was donated to the University of Helsinki by the American Philological Association...
  • Metadata: 2/5

    North Saami Corpus (Sámikultuvradoaibmagotti smiehttamush) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/saami-lgs/north-saami/report The corpus contains a fragment of the Report of the Saami Cultural Committee...
  • Metadata: 2/5

    North Saami Corpus (Literature) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/saami-lgs/north-saami The North Saami Corpus contains Kerttu Vuolab's novel Cheppari cháráhus written in...
  • Metadata: 2/5

    Ume Saami Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/saami-lgs/ume-saami The corpus contains a morphologically analyzed document of the Ume Sami language. The...
  • Metadata: 2/5

    Lude (Ludian) Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/lude The corpus contains samples of folklore of the Lude (Ludian) dialect of Karelian....
  • Metadata: 2/5

    Erzya and Moksha Mordvin Word List Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/mordvin-lgs Contents: The Erzya corpus contains a historical word list of Erzya Mordvin documented in...
  • Metadata: 2/5

    Corpus of Erzya and Moksha Mordvin Literature and Journals and Komi Zyrian Li...

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: https://www.kielipankki.fi/access). Locations: - /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/mordvin-lgs -...
  • Metadata: 2/5

    Khanty Corpus (North Khanty, Corpora and Translations) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/multilingual-language-archive/uralic-lgs/finno-ugric-lgs/ugric-lgs/khanty The Khanty computer corpus contains the following sub-corpora: Khanty, Atlym dialect, 519...
  • Metadata: 2/5

    English Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/general-linguistics/indo-european-lgs/germanic-lgs/english The English Corpus is a part of the UHLCS corpus collection. Contents: The English Gutenberg Corpora...
  • Metadata: 2/5

    Finnish Corpus (Literature) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/general-linguistics/uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/finnish Contents: HKV corpus: consists of samples of the Finnish literature representing various...
  • Metadata: 2/5

    Chuvash Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: https://www.kielipankki.fi/access/). The corpus contains the following documents: Gebräuche und Volksdichtung der Tschuwassen. Gesammelt von Heikki Paasonen, herausgegeben von Eino Karahka und Matti Räsänen. Mémoires de la Société...
  • Metadata: 2/5

    Finnish Corpus (Bibles) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/general-linguistics/uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/finnish/bible The Finnish text corpus contains two editions of the Bible: the old translation from...
  • Metadata: 2/5

    Estonian Corpus 1 (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/general-linguistics/uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/estonian/viro1 The corpus contains excerpts from articles published in Estonian newspapers,...