Research data finder
Search for a Dataset

17 datasets found
  • Metadata: 2/5

    Corpus of Early Modern Finnish, Kielipankki Version

    Written Finnish from the 19th century (mostly from the years between 1810 and 1880), browsable and searchable on the web. The collection contains published literature, periodicals, newspapers, and dictionaries, among others, with a focus on the earliest and most important publications and a wide thematic coverage. Texts written originally in Finnish were...
  • Metadata: 2/5

    Chuvash Corpus (UHLCS), Helsinki Korp Version

    The resource, a variant of Chuvash Corpus (UHLCS) (see http://urn.fi/urn:nbn:fi:lb-2014032625), will be made available at korp.csc.fi.
  • Metadata: 2/5

    The University of Helsinki's German E-thesis, Korp Version

    The corpus is available in Korp at Kielipankki - the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2016102802. It contains the University of Helsinki's German master's theses as well as the doctoral theses and their summaries published at https://ethesis.helsinki.fi by September 2016.
  • Metadata: 2/5

    The von Wright and Wittgenstein Archives (WWA)

    The archives consist of two parts: the Wittgenstein Archives maintained by Georg Henrik von Wright since the 1960s and von Wright's own literary estate, including a vast amount of letters mainly relating to his work as one of Ludwig Wittgenstein's three literary executors 1951-2003. The main part was donated by G.H. von Wright to the University of...
  • Metadata: 2/5

    JRC-Acquis Multilingual Parallel Corpus

    The Acquis Communautaire (AC) is the total body of European Union (EU) law applicable in the EU Member States. This collection of legislative text changes continuously and currently comprises selected texts written between the 1950s and now. As of the beginning of the year 2007, the EU had 27 Member States and 23 official languages. The Acquis...
  • Metadata: 2/5

    The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora

    The corpora are available in Kielipankki - the Language Bank of Finland (http://urn.fi/urn:nbn:fi:lb-2015062301). The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora are: The Helsinki Korp JRC-Acquis Finnish-English Corpus The Helsinki Korp JRC-Acquis Finnish-Swedish Corpus The Helsinki Korp JRC-Acquis Finnish-German Corpus The Helsinki Korp JRC-Acquis...
  • Metadata: 2/5

    The National Certificates Corpus

    NC test results, background information, and speaking and writing performances in 9 foreign / second languages, provided as a web-based database (HTML files). The corpus contains background information and test results (5 sub-tests, 9 different languages) from 14 000 test takers as SPSS files, 2 000 writing performances, and 700 speaking performances.
  • Metadata: 2/5

    The Helsinki Korp Europarl Bilingual Corpora

    The corpora are available in Kielipankki - the Language Bank of Finland (https://korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2015043012. The Helsinki Korp Europarl Bilingual Corpora are: The Helsinki Korp Europarl Finnish-English Corpus The Helsinki Korp Europarl Finnish-Swedish Corpus The Helsinki Korp Europarl Finnish-German Corpus The Helsinki Korp...
  • Metadata: 2/5

    Professor Marjatta Wis' Corpus

    The corpus contains, among other things, press cuttings, hand-written notes, manuscripts, microfilms and photographs, all in non-electronic format, that belonged to Professor Marjatta Wis (1915-2008).
  • Metadata: 2/5

    Chuvash Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: https://www.kielipankki.fi/access/). The corpus contains the following documents: Gebräuche und Volksdichtung der Tschuwassen. Gesammelt von Heikki Paasonen, herausgeben von Eino Karahka und Matti Räsänen. Mémoires de la Société...
  • Metadata: 2/5

    FinDe Corpus

    This contrastive language corpus contains German and Finnish literature and press texts and their respective translations into the other language.
  • Metadata: 2/5

    Lists of Words Corpus (UHLCS)

    The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access). The lists of words located at the University of Helsinki Language Corpus Server were generated from the corpora of the following languages: Dutch: 178,430 words, 1,998,881 characters Finnish: proper...
  • Metadata: 2/5

    Written and Oral Data of the TAITO-project

    The corpus contains: a) Texts written by students of German, French, Italian, Swedish or English, who have just started their studies or who are at the end of their first year of study. b) Videos of partially transcribed discussions. In most of the cases the participants in the discussions are two students and one native speaker. The corpus contains...
  • Metadata: 2/5

    Information in Sign Language on the Tasks of the Parliamentary Ombudsman of F...

    Information on the tasks of the Parliamentary Ombudsman of Finland, also available in Finnish and Finland-Swedish sign language.
  • Metadata: 2/5

    The German Sub-corpus of MULCOLD, Multilingual Parallel Corpus of Legal Texts

    The sub-corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi) at http://urn.fi/urn:nbn:fi:lb-2016042606. For more information, see http://urn.fi/urn:nbn:fi:lb-201405278
  • Metadata: 2/5

    Lists of Words Corpus (UHLCS), Helsinki Korp Version

    The resource, a variant of Lists of Words Corpus (UHLCS) (http://urn.fi/urn:nbn:fi:lb-201406042), will be made available at korp.csc.fi.
  • Metadata: 4/5

    Phrase database for the chatting program Psyk

    This is a data file used by a conversation program (a.k.a. "chatterbot") called Psyk. Psyk is a learning program that remembers every line said to it, as well as the conversational context in which it was said. The data is formatted as one phrase per line. Every line has the format (context phrase) where context is a specification of the situation...