Research data finder

IMPORTANT INFORMATION ABOUT ETSIN! Old Etsin (etsin.avointiede.fi) will be migrated into new Etsin (etsin.fairdata.fi) at the end of June 2019. After the migration all PUBLISHED datasets will be visible in new Etsin.
Describing datasets in Etsin will not be possible after 12th June 2019. Instead, datasets will be described in the new metadata tool, Qvain, which will be launched at the beginning of July 2019.
Note! Remember to publish your dataset if you want it to be migrated into new Etsin.

Search for a Dataset

9,896 datasets found
  • Metadata: 2/5

    The Downloadable Version of the Finnish Text Collection - Commercial Use

    This downloadable subcorpus of FTC is available for commercial use. The resource is available in Kielipankki - the Language Bank of Finland at http://urn.fi/urn:nbn:fi:lb-201908071. For information about the licence, see http://urn.fi/urn:nbn:fi:lb-20150304139. The corpus available for commercial use is a subcorpus of the Finnish Text Collection.
  • Metadata: 1/5

    The Yle MeMAD Media Corpus

    The corpus contains TV programs and videos from the archives of Yle, the Finnish Broadcasting Company. Journalistic programs (news, current affairs, etc.; no drama) have been selected on various topics and from a time period ranging from 1966 to 2018. Each browse-quality video file is accompanied by its descriptive metadata and subtitles. Main audio and...
  • Metadata: 2/5

    Multimodal Translation with the Blind: Team

    The mutable-team subcorpus is part of the MUTABLE corpus (Multimodal Translation with the Blind), which entails video recordings of the work processes related to audio description as well as of the interaction between sighted and blind participants. The mutable-team subcorpus consists of appr. 25 h of video of authentic teamwork and the respective...
  • Metadata: 2/5

    Multimodal Translation with the Blind: Art

    The mutable-art subcorpus is part of the MUTABLE corpus (Multimodal Translation with the Blind), which entails video recordings of the work processes related to audio description as well as of the interaction between sighted and blind participants. The mutable-art subcorpus consists of appr. 2 h of video of authentic live audio description in art...
  • Metadata: 2/5

    Open Richly Annotated Cuneiform Corpus, Korp Version, May 2019

    Open Richly Annotated Cuneiform Corpus (Oracc) brings together the work of several Assyriological projects to publish online editions of cuneiform texts. The Korp version of Oracc allows extensive searches on the texts and presents the results as a KWIC concordance list. Korp also offers statistical information and comparison of the search results....
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 2/5

    Wanca 2016, Korp Version (BETA)

    The Korp version of Wanca 2016 is a collection of web corpora in small Uralic languages. The collection is composed of 29 sentence corpora in different languages. The corpora have been collected from the Internet using the automated system developed in the Finno-Ugric Languages and the Internet project (SUKI) supported by the Kone foundation from their...
  • Metadata: 2/5

    Yle News Archive Easy-to-read Finnish 2011-2018, source

    This dataset consists of the selkouutiset in Finnish (Yle Easy-to-read Finnish News) published on the Yle news website https://yle.fi. The dataset was created by FIN-CLARIN from the contents of the Yle News Archive harvested on 2019-03-08 for the language code "fi" for each month from the year 2011 to the year 2018, inclusive. The Easy-to-read-Finnish...
  • Metadata: 2/5

    Finnish News Corpus for Named Entity Recognition

    The corpus consists of 953 articles (193,742 word tokens) with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source. The data sets are available at https://github.com/mpsilfve/finer-data and will be available in the download...
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1734-2018 in Finnish, Korp version

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1920-2018 in Swedish, Korp version

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1920-2018, Korp version (Finnish-Sw...

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1734-2018, downloadable version

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    The Semfinlex corpora published in the Language Bank of Finland are based on the open data made available by the Semantic Finlex project (https://data.finlex.fi/en/project). The resource comprises original statutes of the Parliament of Finland, decisions by the Finnish Supreme Court and Supreme Administrative Court in Finnish and in Swedish, and also a...
  • Metadata: 2/5

    Iijoki, the University of Oulu Päätalo collection

    The Iijoki corpus is the autobiographical main work of the author Kalle Päätalo (11.11.1919-20.11.2000), deposited in Kielipankki by the University of Oulu. Päätalo can be characterized as a unique portrayer of recent Finnish history and working life, and as a recorder of the Koillismaa dialect. The subjects of his books included, among other things, famine, times of scarcity, forestry work,...
  • Metadata: 2/5

    Finnish News Agency Archive 1992-2018, source

    The Finnish News Agency Archive corpus comprises newswire articles in Finnish sent to media outlets by the Finnish News Agency (STT) between 1992 and 2018. The corpus includes about 2.8 million items in total. Most of the material consists of news articles that vary from short “news flashes” to telegrams and longer articles. News articles are categorized by department...
  • Metadata: 2/5

    Fenno-Ugrica Kielipankki Downloadable Version

    The Kielipankki downloadable version of Fenno-ugrica (http://urn.fi/urn:nbn:fi:lb-2014073056) is available in Kielipankki - the Language Bank of Finland at http://urn.fi/urn:nbn:fi:lb-2019032501
  • Metadata: 2/5

    Triangle of Aspects Analysis of Frozen

    The data will be available in the Language Bank download service (korp.csc.fi/download). It is available for research purposes until 21 December 2023. The user shall commit to removing the downloaded resource from his/her devices and other storage facilities governed by the user on or before 21 December 2023. This data set includes an analysis of the...