Research data finder

IMPORTANT INFORMATION ABOUT ETSIN! The old Etsin (etsin.avointiede.fi) will be migrated to the new Etsin (etsin.fairdata.fi) at the end of June 2019. After the migration, all PUBLISHED datasets will be visible in the new Etsin.
Describing datasets in Etsin will not be possible after 12 June 2019. Instead, datasets will be described in the new metadata tool, Qvain, which will be launched at the beginning of July 2019.
Note! Remember to publish your dataset if you want it to be migrated to the new Etsin.

7,764 datasets found
  • Metadata: 2/5

    Finnish Wikipedia 2017, source

    The Finnish Wikipedia 2017 source material corpus will be available in the download service korp.csc.fi/download. The corpus contains all the Finnish articles from the online encyclopedia Wikipedia available on 1 January 2018. The text parts of the articles have been extracted from Wikipedia Dumps with WikiExtractor. The corpus has been tokenized and...
  • Metadata: 2/5

    Finnish OpenSubtitles 2017, source

    The Finnish OpenSubtitles 2017 source material corpus will be available in the download service korp.csc.fi/download. The corpus contains Finnish subtitles for movies and TV series from http://www.opensubtitles.org/. The corpus is a derivative of the OPUS OpenSubtitles2018 multilingual corpus. Information on the material processing up to sentence splitting...
  • Metadata: 2/5

    Finnish Wikipedia 2017, Kielipankki Korp Version

    The Finnish Wikipedia 2017 Corpus will be available in the concordance tool Korp. The corpus contains all the Finnish articles from the online encyclopedia Wikipedia available on 1 January 2018. The text parts of the articles have been extracted from Wikipedia Dumps with WikiExtractor. The corpus has been tokenized and annotated with morpho-syntactic...
  • Metadata: 2/5

    Finnish OpenSubtitles 2017, Kielipankki Korp Version

    The corpus will be available in Kielipankki through the Korp interface. The corpus contains Finnish subtitles for movies and TV series from http://www.opensubtitles.org/. The corpus is a derivative of the OPUS OpenSubtitles2018 multilingual corpus. Information on the material processing up to sentence splitting can be found in the original publication...
  • Metadata: 2/5

    Fenno-Ugrica Kielipankki Downloadable Version

    The Kielipankki downloadable version of Fenno-ugrica (http://urn.fi/urn:nbn:fi:lb-2014073056) is available in Kielipankki - the Language Bank of Finland at http://urn.fi/urn:nbn:fi:lb-2019032501
  • Metadata: 2/5

    The Swedish sub-corpus of the Classics Library of the National Library of Fin...

    This corpus will be made available in the Korp interface in Kielipankki - the Language Bank of Finland (korp.csc.fi). It comprises works written in Swedish that are part of the Classics Library of the National Library of Finland and are published under a Public Domain license. The data set in Swedish includes 282 works out of the whole data set of 968...
  • Metadata: 2/5

    The Finnish Sub-corpus of the Letters of Paul Sinebrychoff, Kielipankki Version

    The sub-corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi) at http://urn.fi/urn:nbn:fi:lb-2016092702. For more information, see http://urn.fi/urn:nbn:fi:lb-201407303
  • Metadata: 2/5

    Finnish Verbal Colorative Constructions

    The resource contains Finnish verbal colorative constructions from the database of the word notes used when creating the dictionaries Nykysuomen sanakirja and Kielitoimiston sanakirja (http://www.kielitoimistonsanakirja.fi/), from various literary works, from a query test made by Maria-Magdalena Jürvetson as well as from different Internet sources. The...
  • Metadata: 2/5

    Corpus of Finnish Magazines and Newspapers from the 1990s and 2000s, Version 2

    The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi). Reference instructions: See Attribution Details under Documentation. When quoting, the name of the writer of the article, the title of the article, the name of the magazine or newspaper, and the issue number or day of publication should also be mentioned. Change...
  • Metadata: 2/5

    The Helsinki Korp Version of Samples of Spoken Finnish

    The corpus, which is the Korp version of the Samples of Spoken Finnish corpus, is available at http://urn.fi/urn:nbn:fi:lb-2015040101. For more information, see http://urn.fi/urn:nbn:fi:lb-201407141
  • Metadata: 2/5

    The Finnish Sub-corpus of the JRC-Acquis Multilingual Parallel Corpus, Downlo...

    The downloadable version of the Finnish Sub-corpus of the JRC-Acquis Multilingual Parallel Corpus. The data is in VRT format, which was the basis for the Korp version.
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    Finnish Supreme Court (KKO) decisions in Finnish from 1980-2018 and Supreme Administrative Court (KHO) decisions from 1987-2018 in Finnish. The decisions are available in vrt format. KKO decisions: 5651. KHO decisions: 7633. For most decisions, the language used in court has been Finnish. In that case, the document contains the whole decision. If the...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1734-2018, downloadable version

    Finnish Parliament original statutes in Finnish from 1734, 1868, 1889, 1895, 1896, 1898, 1901, 1906, 1907 and 1917-2018 and in Swedish from 1920-2018. The statutes are published in the Language Bank's Download service at korp.csc.fi/download in vrt format. NB! 2019-09-13 Discrepancies in dependency parses of the Finnish data: The dependency parses and...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1734-2018 in Finnish, Korp version

    Finnish Parliament original statutes in Finnish from 1734, 1868, 1889, 1895, 1896, 1898, 1901, 1906, 1907 and 1917-2018. The statutes are available in the Korp interface korp.csc.fi. NB! 2019-09-13 Discrepancies in dependency parses of the Finnish data: The dependency parses and relations differ significantly from the parses in other corpora parsed...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1920-2018, Korp version (Finnish-Sw...

    A collection of Finnish Parliament original statutes in Finnish and Swedish from 1920-2018. The statutes are available in the Language Bank of Finland Korp service korp.csc.fi. NB! 2019-09-13 Discrepancies in dependency parses of the Finnish data: The dependency parses and relations differ significantly from the parses in other corpora parsed earlier with...
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    Finnish Supreme Court (KKO) decisions from 1980-2018 and Supreme Administrative Court (KHO) decisions from 1987-2018. The decisions are in Finnish. The decisions are available in the Korp interface korp.csc.fi. KKO decisions: 5651. KHO decisions: 7633. For some decisions, the language used in court has been Swedish; in that case the Finnish version...
  • Metadata: 2/5

    Wanca 2016, Korp Version (BETA)

    The Korp version of Wanca 2016 is a collection of web corpora in small Uralic languages. The collection is composed of 29 sentence corpora in different languages. The corpora have been collected from the Internet using the automated system developed in the Finno-Ugric Languages and the Internet project (SUKI) supported by the Kone foundation from their...
  • Metadata: 2/5

    Finnish TreeBank 3

    The corpus is available in Kielipankki - the Language Bank of Finland (https://korp.csc.fi) at http://urn.fi/urn:nbn:fi:lb-2016051001 and is downloadable at http://urn.fi/urn:nbn:fi:lb-2016011501. The FinnTreeBank project is creating a treebank and a parsebank for Finnish. This work is licensed under Creative Commons Attribution 3.0. A parsebank for Finnish...
  • Metadata: 2/5

    Finnish TreeBank 2

    The FinnTreeBank project is creating a treebank and a parsebank for Finnish. This work is licensed under Creative Commons Attribution 3.0. The second version of the treebank is annotated by hand and based on 17,000 model sentences in the Large Grammar of Finnish VISK - Iso Suomen Kielioppi. Brief samples of text from other sources, e.g. news items and...
  • Metadata: 2/5

    Finnish TreeBank 1

    The example sentences from Iso suomen kielioppi [Large Grammar of Finnish], manually annotated with dependency-syntactic descriptions. This is a Grammar Definition Corpus intended as a model for further automatic analysis of Finnish. The corpus is available in Kielipankki - the Language Bank of Finland. Download location:...