ShareTIGR - Sharing the TIGR corpus of spoken Italian: an ORD case study

Abstract

Since the 1990s, a series of corpora have been compiled for the study of spoken varieties of Italian, and they have in part been made available on websites and DVDs. The TIGR corpus of spoken Italian, collected in Southern Switzerland in 2021 and 2022 within the InfinIta project (SNF grant no. 192771), is a unique language resource in this landscape, both because of the regional varieties of Italian it documents and because it includes video recordings in addition to audio data, transcripts and sociolinguistic data. The goal of the present project is (a) to share this rather large dataset (23.5 hours of recordings, 121 speakers, 4 TB of storage in uncompressed form) for scientific use, respecting FAIR principles and data protection requirements; and (b) to discuss the various phases of this process as a case study of open research data practices in linguistics, engaging with potentially interested communities via scientific presentations and publications as well as a lab blog and social media.

Additional information

Acronym

ShareTIGR

Start date

01.02.2024

End date

31.05.2025

Duration

17 months

Funding bodies

Università della Svizzera italiana

External partners

UniBAS, UniNE, UniL, DaSCH, CLARIN-CH

Status

Completed

Category

USI Internal calls / Projects on OS and ORD
