Zerospeech 2021
Results
Columns can be sorted by clicking the sort icon in each column header. A detailed view of the results is available by clicking the icon on each row.
The columns are interpreted as follows (see Evaluation metrics for details):
- Phonetic (across and within)
  - ABX error rate on embeddings
  - Scale is $[0, 1]$, lower is better
- Lexical and Syntactic
  - Mean correct / incorrect classification accuracy
  - Scale is $[0, 1]$, higher is better
  - For Lexical, the all column is the mean accuracy over five frequency bins (based on raw frequency counts in LibriSpeech-960: OOV; 1-5; 6-20; 21-100; 101+), and the in vocab. column leaves out the OOV category. Only the all column was published in the Interspeech summary paper.
- Semantic
  - Human judgement correlation coefficient ($\times 100$)
  - Scale is $[-100, 100]$, farther from 0 is better
  - Mean score across all datasets
- Semantic (Weighted): Same as Semantic, with the mean score weighted by the number of pairs in each dataset. Only the unweighted (Semantic) columns were published in the Interspeech summary paper.
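
The aggregated columns described above can be sketched in a few lines of Python. This is only an illustration of the averaging, not the official evaluation code: the function names, the dataset names, and all numeric values are hypothetical, and how the per-bin accuracies and per-dataset correlations are themselves computed is not shown here.

```python
from statistics import mean

# Hypothetical per-bin lexical accuracies (illustrative values, not real results).
# Bin labels follow the frequency bins listed above.
lexical_bins = {"OOV": 0.5, "1-5": 0.6, "6-20": 0.7, "21-100": 0.8, "101+": 0.9}


def lexical_all(bins):
    """Lexical 'all': unweighted mean accuracy over all five frequency bins."""
    return mean(bins.values())


def lexical_in_vocab(bins):
    """Lexical 'in vocab.': the same mean, leaving out the OOV bin."""
    return mean(acc for name, acc in bins.items() if name != "OOV")


# Hypothetical per-dataset semantic results:
# dataset name -> (correlation x 100, number of pairs in the dataset).
semantic_scores = {"dataset_a": (10.0, 100), "dataset_b": (20.0, 300)}


def semantic_unweighted(scores):
    """Semantic: plain mean of the per-dataset scores."""
    return mean(score for score, _ in scores.values())


def semantic_weighted(scores):
    """Semantic (Weighted): mean weighted by the number of pairs per dataset."""
    total_pairs = sum(n_pairs for _, n_pairs in scores.values())
    return sum(score * n_pairs for score, n_pairs in scores.values()) / total_pairs


if __name__ == "__main__":
    print("Lexical all:        ", lexical_all(lexical_bins))
    print("Lexical in vocab.:  ", lexical_in_vocab(lexical_bins))
    print("Semantic:           ", semantic_unweighted(semantic_scores))
    print("Semantic (Weighted):", semantic_weighted(semantic_scores))
```

With these made-up numbers the weighted semantic score is pulled toward the larger dataset, which is why the two Semantic columns can differ for the same submission.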
# | Author | Budget | Set | Phonetic (Within) clean | Phonetic (Within) other | Phonetic (Across) clean | Phonetic (Across) other | Lexical all | Lexical in vocab. | Syntactic | Semantic synth. | Semantic libri. | Semantic (Weighted) synth. | Semantic (Weighted) libri.
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---