Research data finder

Search for a Dataset

22 datasets found
  • Metadata: 3/5

    Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspo...

    The database was used in the first Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015). Genuine speech was collected from 106 speakers (45 male, 61 female) with no significant channel or background noise effects. Spoofed speech is generated from the genuine data using a number of different spoofing algorithms. The...
  • Metadata: 3/5

    The Voice Conversion Challenge 2018: database and results

    Voice conversion (VC) is a technique to transform a speaker identity included in a source speech waveform into a different one while preserving linguistic information of the source speech waveform. In 2016, we launched the Voice Conversion Challenge (VCC) 2016 at Interspeech 2016. The objective of the 2016 challenge was to better understand...
  • Metadata: 3/5

    The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge...

    This is a database used for the Second Automatic Speaker Verification Spoofing and Countermeasures Challenge, ASVspoof 2017 for short (http://www.asvspoof.org), organized by Tomi Kinnunen, Md Sahidullah, Héctor Delgado, Massimiliano Todisco, Nicholas Evans, Junichi Yamagishi, and Kong Aik Lee in 2017. The ASVspoof challenge aims to encourage further progress...
  • Metadata: 4/5

    FIN-Benthic

    Citation: This dataset was introduced in: Jenni Raitoharju, Ekaterina Riabchenko, Iftikhar Ahmad, Alexandros Iosifidis, Moncef Gabbouj, Serkan Kiranyaz, Ville Tirronen, Johanna Ärje, Salme Kärkkäinen, Kristian Meissner, Benchmark database for fine-grained image classification of benthic macroinvertebrates, Image and Vision Computing, Vol. 78,...
  • Metadata: 3/5

    Densely sampled light fields

    Dataset containing pre-rectified horizontal-parallax multi-perspective images of 3D scenes. The dataset consists of 193 camera views (images), positioned equidistantly on a line, where the disparity range between adjacent views is 1 pixel at most. Images are of size 1280×720 pixels and are stored in 8-bit RGB format (PNG). Several other densely sampled light...
  • Metadata: 4/5

    DigiSami Conversational Speech

    Introduction: The DigiSami project (www.helsinki.fi/digisami/) aims to study the effect of digitalisation on small Finno-Ugric language communities and to support the visibility and revitalisation of the endangered languages by creating digital content as well as developing language and speech technology tools, resources, and applications that can be used for...
  • Metadata: 4/5

    DigiSami Read Speech

    Introduction: The DigiSami project (www.helsinki.fi/digisami/) aims to study the effect of digitalisation on small Finno-Ugric language communities and to support the visibility and revitalisation of the endangered languages by creating digital content as well as developing language and speech technology tools, resources, and applications that can be used for...
  • Metadata: 3/5

    Supplementary Data: A Comparison of Reconstruction Algorithms for 3D Histology

    Histopathological whole slide image data and annotations.
  • Metadata: 4/5

    Finnish First Encounter

    Introduction: The research material consists of the Finnish first encounters dialogue corpus collected as part of the NOMCO project, a Nordic cooperation project. The project aims to develop and analyze multimodal spoken language corpora in the Nordic countries, and to compare communication strategies in three closely related languages (Danish,...
  • Metadata: 4/5

    Estonian First Encounter

    Introduction: Within the MINT project (Multimodal Interaction – intercultural and technological aspects of video data collection, analysis, and use) we have collected a corpus of Estonian First Encounter dialogs. The goals of the MINT project are: to create an Estonian multimodal video corpus on various conversational activities, to provide analysis and...
  • Metadata: 4/5

    Linked Media

    The media service of LDF.fi aims to collect news and other media content into a Linked Data repository so that the items can be interlinked with each other. Initially, the service contains over 34,000 news items from the Edilex News collection of Edita Publishing Ltd. This data is used by the Linked Data Finland project for interlinking news with the Finnish Linked Open Law...
  • Metadata: 3/5

    WordNet

    WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. This RDF version of WordNet 3.1 incorporates direct links to the previous W3C WordNets, UBY,...
  • Metadata: 4/5

    Halias Bird Observation Data as Linked Data

    This dataset contains systematic bird observations made at the Hanko Bird Observatory (Halias) over a period of about 30 years. The data was originally provided by the Helsingin Seudun Lintutieteellinen Yhdistys Tringa r.y. The data is integrated with weather observation data for the same time period. The weather data comes from the Finnish Meteorological Institute....
  • Metadata: 4/5

    Semantic Finlex

    This dataset contains Linked Data regarding Finnish legislation and case law. The RDF data has been converted from legacy XML formats used within the Finlex online service. RDF data models used in the converted data conform to European URI and metadata standards, namely ELI (European Legislation Identifier) and ECLI (European Case Law Identifier). The...
  • Metadata: 3/5

    Ontology of Finnish bird species

    Ontology of bird species observed in Finland. Species are annotated with characteristics descriptions, conservation statuses and rarity classes. The ontology is based on work by Birdlife Finland and the Finnish Museum of Natural History. The characteristics ontology and annotations are based on the NatureGate characteristics system.
  • Metadata: 4/5

    Schoenberg Database of Manuscripts

    The Schoenberg Database of Manuscripts (SDBM) makes available data on medieval manuscript books of five or more folios produced before 1600. Its purpose is to facilitate research for scholars, collectors, and others interested in manuscript studies and the provenance of these unique books.
  • Metadata: 4/5

    CEEC Sampler

    Published in 1998, the Corpus of Early English Correspondence Sampler (CEECS) represents the non-copyrighted materials included in the original CEEC. This means that the editors of the collections included in it died over 70 years ago. We have also included some material (re-)edited by us (see Henslowe and Marchall collections). The CEECS is a fairly...
  • Metadata: 4/5

    Six Degrees of Francis Bacon

    Six Degrees of Francis Bacon is a digital reconstruction of the early modern social network (EMSN) that scholars and students from all over the world will be able to collaboratively expand, revise, curate, and critique. Historians and literary critics have long studied the way that early modern people associated with each other and participated in various...
  • Metadata: 4/5

    POI Ontology

    A point of interest (POI) is a specific point that someone may find useful or interesting. The purpose of this dataset is to aggregate POI data from different sources.
  • Metadata: 4/5

    Old Bailey Corpus

    The Old Bailey Corpus is a sociolinguistically, pragmatically and textually annotated corpus based on the Proceedings of the Old Bailey. These speech-related texts document Late Modern English as used in London’s Central Criminal Court. The Proceedings of the Old Bailey were published from 1674 to 1913 and constitute a large body of Late Modern English...