Research data finder
FIN-CLARIN

375 datasets found
  • Metadata: 2/5

    The Swedish sub-corpus of Elias Lönnrot Letters Online - Kielipankki version

    This corpus will be made available at korp.csc.fi. It comprises letters and drafts written in Swedish, which are part of the correspondence corpus 'Elias Lönnrot Letters Online'. The data set in Swedish includes 3354 letters and drafts out of the whole data set of 4511 letters written in Finnish and Swedish. The letters and drafts of letters belong to the...
  • Metadata: 2/5

    The Finnish sub-corpus of Elias Lönnrot Letters Online - Kielipankki version

    This corpus will be made available at korp.csc.fi. It comprises letters and drafts written in Finnish, which are part of the correspondence corpus 'Elias Lönnrot Letters Online'. The data set in Finnish includes 1157 letters and drafts out of the whole data set of 4511 letters written in Finnish and Swedish. The letters and drafts of letters belong to the...
  • Metadata: 2/5

    Elias Lönnrot Letters Online

    The corpus consists of the correspondence of Elias Lönnrot with private individuals as well as institutions from 1823 until Lönnrot's death. Elias Lönnrot (1802–1884) was the creator of the Kalevala, a medical doctor and a professor of language. The letters and drafts of letters belong to the Archive of the Finnish Literature Society and have been...
  • Metadata: 2/5

    KOTUS Finnish-Swedish Parallel Corpus

    The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2015101601 (Finnish sub-corpus) and https://korp.csc.fi/?mode=swedish#?lang=fi&prequery_within=sentence&cqp=[]&corpus=kfspc_sv (Swedish sub-corpus) with a Creative Commons BY 4.0 license. The corpus contains corporate press...
  • Metadata: 2/5

    Corpus of Finnish Sign Language: elicited narratives, Download version

    This subcorpus is part of the Corpus of Finnish Sign Language collected in the CFINSL project. The subcorpus comprises elicited narratives from 21 Finnish Sign Language signers who belong to different age groups and live in different parts of Finland. The material covers three fixed tasks performed by the signers: narrating about short cartoon strips,...
  • Metadata: 2/5

    Corpus of Finnish Sign Language: elicited narratives

    This subcorpus is part of the Corpus of Finnish Sign Language collected in the CFINSL project. The subcorpus comprises elicited narratives from 21 Finnish Sign Language signers who belong to different age groups and live in different parts of Finland. The material covers three fixed tasks performed by the signers: narrating about short cartoon strips,...
  • Metadata: 2/5

    A Multimodal Corpus of Tourist Brochures Produced by the City of Helsinki, Fi...

    The corpus is available in Kielipankki - the Language Bank of Finland (ling.helsinki.fi), download location: http://urn.fi/urn:nbn:fi:lb-2015030301. This multimodal corpus, which consists of the tourist brochures produced by the city of Helsinki, Finland, is fully annotated using the XML schema provided for the Genre and Multimodality (GeM) model. The GeM...
  • Metadata: 2/5

    Finnish Corpus (Literature) (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). Location: /appl/kielipankki/mrc-uhlcs/general-linguistics/uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/finnish Contents: HKV corpus: consists of samples of the Finnish literature representing various...
  • Metadata: 3/5

    The Finnish Dialect Syntax Archive

    The corpus is available in Kielipankki - the Language Bank of Finland (Korp version: http://urn.fi/urn:nbn:fi:lb-2014052715; LAT version: http://urn.fi/urn:nbn:fi:lb-1001100111532) under the licence CC BY ND 4.0 (downloadable version: http://urn.fi/urn:nbn:fi:lb-2019092001). The corpus consists of 142 recordings, all interviews, and the associated...
  • Metadata: 2/5

    The Finland-Swedish Text Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland in Taito and Taito-shell under the directory /appl/kielipankki/ (Taito user guide: https://research.csc.fi/taito-user-guide) as well as in Korp (https://korp.csc.fi/, see there among the Swedish corpora "Finlandssvensk textkorpus (UHLCS) (FISC/FSTC)"). Access rights instructions:...
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    Finnish Supreme Court (KKO) decisions in Finnish from 1980-2018 and Supreme Administrative Court (KHO) decisions from 1987-2018 in Finnish. The decisions are available in vrt format. KKO decisions: 5651. KHO decisions: 7633. For most decisions, the language used in court has been Finnish. In that case, the document contains the whole decision. If the...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1734-2018, downloadable version

    Finnish Parliament original statutes in Finnish from 1734, 1868, 1889, 1895, 1896, 1898, 1901, 1906, 1907 and 1917-2018 and in Swedish from 1920-2018. The statutes are published in the Language Bank's Download service at korp.csc.fi/download in vrt format. NB! 2019-09-13 Discrepancies in dependency parses of the Finnish data: The dependency parses and...
  • Metadata: 2/5

    Iijoki, the University of Oulu Päätalo collection

    The Language Bank's Iijoki corpus will be published in the Korp concordance service at korp.csc.fi. The Iijoki corpus is the autobiographical main work of the author Kalle Päätalo (11 November 1919 - 20 November 2000), deposited in the Language Bank by the University of Oulu. Päätalo can be characterized as a unique portrayer of recent Finnish history and work as well as the Koillismaa dialect's...
  • Metadata: 1/5

    The INA MeMAD Media Corpus

    The corpus contains television and radio programs from the archives of INA, the French National Audiovisual Institute. The corpus comprises 8 full days of programs on six French public television channels and radio stations (May 19th to 26th, 2014), corresponding to the 2014 European elections. The corpus has been created and licensed for the MeMAD project,...
  • Metadata: 1/5

    Hundred Finnish Linguistic Life Stories

    More information about the project is available at https://blogs.helsinki.fi/100finnish/
  • Metadata: 1/5

    Finnish Parliament original statutes from 1734-2018 in Finnish, Korp version

    Finnish Parliament original statutes in Finnish from 1734, 1868, 1889, 1895, 1896, 1898, 1901, 1906, 1907 and 1917-2018. The statutes are available in the Korp interface korp.csc.fi. NB! 2019-09-13 Discrepancies in dependency parses of the Finnish data: The dependency parses and relations differ significantly from the parses in other corpora parsed...
  • Metadata: 1/5

    Finnish Parliament original statutes from 1920-2018, Korp version (Finnish-Sw...

    A collection of Finnish Parliament original statutes in Finnish and Swedish from 1920-2018. The statutes are available in the Language Bank of Finland Korp service at korp.csc.fi. NB! 2019-09-13 Discrepancies in dependency parses of the Finnish data: The dependency parses and relations differ significantly from the parses in other corpora parsed earlier with...
  • Metadata: 1/5

    Finnish Supreme and Supreme Administrative Court decisions from 1980-2018 in ...

    Finnish Supreme Court (KKO) decisions from 1980-2018 and Supreme Administrative Court (KHO) decisions from 1987-2018. The decisions are in Finnish. The decisions are available in the Korp interface korp.csc.fi. KKO decisions: 5651. KHO decisions: 7633. For some decisions, the language used in court has been Swedish; in that case the Finnish version...
  • Metadata: 2/5

    Wanca 2016, Korp Version (BETA)

    The Korp version of Wanca 2016 is a collection of web corpora in small Uralic languages. The collection is composed of 29 sentence corpora in different languages. The corpora have been collected from the Internet using the automated system developed in the Finno-Ugric Languages and the Internet project (SUKI) supported by the Kone Foundation from their...
  • Metadata: 2/5

    Virtual Old Literary Finnish (VVKS) - Kielipankki Korp version

    The resource is a collection of 12 old Finnish texts from 1555 to 1788. The texts were also published at http://www.helsinki.fi/vvks/. This corpus complements the Corpus of Old Literary Finnish available in Kielipankki at http://urn.fi/urn:nbn:fi:lb-201407165. The corpus will be published at https://korp.csc.fi/