
NADI 2026 Task 2: Spoken Dialect Identification
Organizer: prsull; 7 submissions
About this hackathon
This Spoken Dialect Identification task is part of the Nuanced Arabic Dialect Identification (NADI) 2026 shared task. It focuses on out-of-domain dialect identification: the final test set will be a blind set from an unknown domain. Language and dialect ID models can overfit to their training domain, which limits their applicability in real-world scenarios, and this blind-domain evaluation aims to test how well such models generalize. As a baseline, we provide a training script that fine-tunes a pretrained ECAPA-TDNN language ID system on a 200-hour subset of the ADI-20 dataset. Training is unrestricted, and participants are free to train on the full ADI-17/20 datasets. Because this is a blind out-of-domain evaluation, we encourage participants to evaluate their models on selected data from other domains, such as radio, read speech, and conversational telephone speech.
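One way to act on the suggestion to evaluate on other domains is to score predictions separately per domain and compare each against the training domain. The following is a minimal sketch, not part of the provided baseline: the function names, the domain tags, and the dialect labels are all hypothetical placeholders, and plain accuracy is used only for illustration (the task description above does not state the official metric).

```python
from collections import defaultdict


def per_domain_accuracy(records):
    """Compute accuracy separately for each domain.

    records: iterable of (domain, true_dialect, predicted_dialect) tuples.
    Returns a dict mapping domain name to accuracy in [0, 1].
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, truth, pred in records:
        total[domain] += 1
        if truth == pred:
            correct[domain] += 1
    return {d: correct[d] / total[d] for d in total}


def domain_gap(acc_by_domain, in_domain):
    """Accuracy drop from the training domain to every other domain."""
    base = acc_by_domain[in_domain]
    return {d: base - a for d, a in acc_by_domain.items() if d != in_domain}


# Hypothetical toy predictions; domain tags and dialect labels are placeholders.
records = [
    ("in_domain", "EGY", "EGY"),
    ("in_domain", "MOR", "MOR"),
    ("telephone", "EGY", "EGY"),
    ("telephone", "MOR", "EGY"),
    ("read_speech", "EGY", "MOR"),
    ("read_speech", "MOR", "EGY"),
]

acc = per_domain_accuracy(records)
gaps = domain_gap(acc, in_domain="in_domain")
for domain, gap in sorted(gaps.items()):
    print(f"{domain}: accuracy {acc[domain]:.2f}, gap {gap:.2f}")
```

A large gap on a held-out domain suggests the model has fit domain-specific cues instead of dialectal ones, which is the failure mode the blind evaluation is designed to expose.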
Tracks
General Track
Prizes
Project Prize
Schedule
Jun 16, 04:00 PM