Research data finder

IMPORTANT INFORMATION ABOUT ETSIN! Old Etsin (etsin.avointiede.fi) will be migrated into new Etsin (etsin.fairdata.fi) at the end of June 2019. After the migration all PUBLISHED datasets will be visible in new Etsin.
Describing datasets in Etsin will not be possible after 12th June 2019. Instead, datasets will be described in the new metadata tool, Qvain, which will be launched at the beginning of July 2019.
Note! Remember to publish your dataset if you want it to be migrated into new Etsin.

Search for a Dataset

9,872 datasets found
  • Metadata: 2/5

    Open Richly Annotated Cuneiform Corpus, Downloadable Version, September 2017

    This version contains the data that were available on the Oracc project website in September 2017. Open Richly Annotated Cuneiform Corpus (Oracc) brings together the work of several Assyriological projects to publish online editions of cuneiform texts. This version of ORACC contains the following Oracc projects: Corpus of Ancient Mesopotamian Scholarship;...
  • Metadata: 2/5

    Open Richly Annotated Cuneiform Corpus, Korp Version, May 2019

    Open Richly Annotated Cuneiform Corpus (Oracc) brings together the work of several Assyriological projects to publish online editions of cuneiform texts. The Korp version of Oracc allows extensive searches on the texts and presents the results as a KWIC concordance list. Korp also offers statistical information and comparison of the search results....
  • Metadata: 2/5

    Open Richly Annotated Cuneiform Corpus, Korp Version, September 2017

    This corpus version is no longer available in Korp. Instead, please see the Download version http://urn.fi/urn:nbn:fi:lb-2019111602 for the corresponding vrt files or http://urn.fi/urn:nbn:fi:lb-2019060601 for the new version of Oracc in Korp. This version contains the data that were available on the Oracc project website in September 2017. Open Richly...
  • Metadata: 2/5

    Open Richly Annotated Cuneiform Corpus, Korp Version, 2016

    This corpus version is no longer available. Instead, please see http://urn.fi/urn:nbn:fi:lb-2018071121 for the first stable version of the Oracc-Korp corpus (whose content was downloaded from the service of the Oracc projects in September 2017). This obsolete corpus version was published in the Korp service provided by Kielipankki - the Language Bank of...
  • Metadata: 2/5

    Finnish News Agency Archive 1992-2018, Kielipankki Korp Version

    The corpus will be available for non-commercial use in the concordance tool Korp, where the context is restricted to sentences or paragraphs. The Finnish News Agency Archive corpus comprises newswire articles in Finnish sent to media outlets by the Finnish News Agency (STT) between 1992 and 2018. The corpus includes about 2.8 million items in total. Most of...
  • Metadata: 2/5

    The "Hallituskausi 2007–2011" Translation Memory

    The "Hallituskausi 2007–2011" translation memory is intended for those translating administrative texts between Finnish and English. It includes key policy reports published by the Finnish ministries on their websites. The memory features some 58,000 Finnish-to-English translation segments. The tmx format requires an SDL Trados Studio programme. The...
  • Metadata: 2/5

    Finnish Wikipedia 2017, source

    The Finnish Wikipedia 2017 source material corpus will be available in the download service korp.csc.fi/download. The corpus contains all the Finnish articles from the online encyclopedia Wikipedia available on 1 January 2018. The text parts of the articles have been extracted from Wikipedia Dumps with WikiExtractor. The corpus has been tokenized and...
  • Metadata: 2/5

    Finnish OpenSubtitles 2017, source

    The Finnish OpenSubtitles 2017 source material corpus will be available in the download service korp.csc.fi/download. The corpus contains Finnish subtitles for movies and TV series from http://www.opensubtitles.org/. The corpus is a derivative of the OPUS OpenSubtitles2018 multilingual corpus. Information on the material processing up to sentence splitting...
  • Metadata: 1/5

    Finnish News Agency Archive

    The Finnish News Agency Archive corpus comprises newswire articles made public by the Finnish News Agency (STT) from 1992 to 2018. The corpora will be available through the corpus interface Korp (korp.csc.fi) as scrambled sentences (CC BY NC) and in the download service as whole texts (CLARIN RES).
  • Metadata: 2/5

    Finnish Wikipedia 2017, Kielipankki Korp Version

    The Finnish Wikipedia 2017 Corpus will be available in the concordance tool Korp. The corpus contains all the Finnish articles from the online encyclopedia Wikipedia available on 1 January 2018. The text parts of the articles have been extracted from Wikipedia Dumps with WikiExtractor. The corpus has been tokenized and annotated with morpho-syntactic...
  • Metadata: 2/5

    Finnish OpenSubtitles 2017, Kielipankki Korp Version

    The corpus will be available in Kielipankki through the Korp interface. The corpus contains Finnish subtitles for movies and TV series from http://www.opensubtitles.org/. The corpus is a derivative of the OPUS OpenSubtitles2018 multilingual corpus. Information on the material processing up to sentence splitting can be found in the original publication...
  • Metadata: 2/5

    Corpus of Finnish Magazines and Newspapers from the 1990s and 2000s, Download...

    The resource, containing entire newspaper and magazine articles, has been made available for Download in Kielipankki - the Language Bank of Finland at http://urn.fi/urn:nbn:fi:lb-201712201 The data consists of source data in PDF form or as plain text and is not annotated. An annotated version (lehdet90ff-vrt-v2) is available, see links below Relations on...
  • Metadata: 2/5

    Iijoki, the University of Oulu Päätalo collection, Kielipankki Korp version

    A description of the Iijoki series is available at http://urn.fi/urn:nbn:fi:lb-2019041401, and the information page for the University of Oulu Päätalo collection is on the Kielipankki website at https://www.kielipankki.fi/aineistot/oulun-yliopiston-paatalo-kokoelma/ Licence page: http://urn.fi/urn:nbn:fi:lb-2019102106 The corpus has been published in the concordance tool Korp...
  • Metadata: 1/5

    Iijoki, the University of Oulu Päätalo collection, Kielipankki TDPP Korp version

    A description of the Iijoki series is available at http://urn.fi/urn:nbn:fi:lb-2019041401. Licence page: http://urn.fi/urn:nbn:fi:lb-2019102106 The 26 books of the series have been parsed in Kielipankki with two different parsers. Both versions are published in Kielipankki's Korp concordance service (korp.csc.fi). This corpus has been parsed with the Turku Dependency Parser Pipeline (TDPP)...
  • Metadata: 2/5

    Iijoki, the University of Oulu Päätalo collection

    The Iijoki corpus is the autobiographical main work of the author Kalle Päätalo (11.11.1919-20.11.2000), deposited in Kielipankki by the University of Oulu. Päätalo can be characterized as a unique chronicler of recent Finnish history and working life and as a recorder of the Koillismaa dialect. The subjects of his books included, among others, famine, times of shortage, forestry work,...
  • Metadata: 3/5

    Everyday Experiences of Poverty 2012: Follow-up Study

    The dataset consists of new texts, written in 2012, by people who took part in the 'Everyday Experiences of Poverty' ('Arkipäivän kokemuksia köyhyydestä') writing competition in 2006. The invitation to write was sent selectively to participants in the 2006 writing competition. The aim was to find out what, for those who took part in the poverty writing competition,...
  • Metadata: 3/5

    Everyday Experiences of Poverty: Self-administered Writings 2006

    The dataset consists of texts collected through the "Everyday Experiences of Poverty" ("Arkipäivän kokemuksia köyhyydestä") writing competition. Writings arrived from different parts of Finland, and the writers represent a wide range of population groups, such as families with children, single parents, mental health rehabilitees, the long-term ill, low-income workers, small entrepreneurs, the indebted,...
  • Metadata: 2/5

    The Swedish sub-corpus of Elias Lönnrot Letters Online - Kielipankki version

    This corpus will be made available at korp.csc.fi. It comprises letters and drafts written in Swedish, which are part of the correspondence corpus 'Elias Lönnrot Letters Online'. The data set in Swedish includes 3354 letters and drafts out of the whole data set of 4511 letters written in Finnish and Swedish. The letters and drafts of letters belong to the...
  • Metadata: 2/5

    The Finnish sub-corpus of Elias Lönnrot Letters Online - Kielipankki version

    This corpus will be made available at korp.csc.fi. It comprises letters and drafts written in Finnish, which are part of the correspondence corpus 'Elias Lönnrot Letters Online'. The data set in Finnish includes 1157 letters and drafts out of the whole data set of 4511 letters written in Finnish and Swedish. The letters and drafts of letters belong to the...
  • Metadata: 2/5

    The Finnish Dialect Syntax Archive's Helsinki Download Version

    The corpus, which is the Download version of The Finnish Dialect Syntax Archive's Helsinki Korp Version (http://urn.fi/urn:nbn:fi:lb-2016040702), is available in the Download service (korp.csc.fi/download) of Kielipankki - the Language Bank of Finland under the license CC BY ND 4.0. For more information see the metadata of The Finnish Dialect Syntax Archive...