ZeroSpeech 2021


The dataset, the baseline submissions and a random submission are provided here for download. The dataset is released under a Creative Commons 4.0 licence. The baseline checkpoints include CPC checkpoints first released in the public domain by Facebook AI Research.

File                         Description                                   Size    MD5 sum
                             Data for the 2021 edition                     24 GB   d196d4c9174f1bf2ce7111a19abddaca
                             Purely random submission provided as example  0.2 GB  e58b62602f34fddc97a39a3ebf2b21ab
                             Baseline submission (BERT)                    13 GB   8544fe3fccb6ead94a6ae1e260240ca8
                             Baseline submission (LSTM)                    17 GB   994d1323b43376e7f03e6cd06e966e60
topline_checkpoints.tar.gz   Topline checkpoints                           1.6 GB  5dd0c31d37bd07a4ec52ac44ede0017f
baseline_checkpoints.tar.gz  Baseline checkpoints                          2.4 GB  3c5cfeda5dca079f2c0c02b6cbeb08ed
                             Visually Grounded Baseline checkpoints        1.9 GB  cd15403948f3d91ef1a3a58931418d26
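
Because the archives are large, it may be worth checking each download against the MD5 sum listed above before unpacking it. The following is a minimal sketch, not part of the official instructions: the helper name verify_md5 is hypothetical, and it assumes the md5sum utility from GNU coreutils is available (on macOS, `md5 -q` would be the rough equivalent).

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED_MD5   (hypothetical helper, not part of the challenge tools)
# Returns 0 when the MD5 digest of FILE equals EXPECTED_MD5, 1 otherwise.
verify_md5() {
    file=$1
    expected=$2

    if [ ! -f "$file" ]; then
        echo "MISSING: $file" >&2
        return 1
    fi

    # Reading from stdin makes md5sum print "<digest>  -"; keep only the digest.
    actual=$(md5sum < "$file" | cut -d ' ' -f 1)

    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
        return 0
    fi

    echo "MISMATCH: $file (expected $expected, got $actual)" >&2
    return 1
}

# Example call, using a file name and checksum taken from the list above:
# verify_md5 topline_checkpoints.tar.gz 5dd0c31d37bd07a4ec52ac44ede0017f
```

A non-zero return value indicates a missing, corrupted or incomplete download, in which case the file should be fetched again before it is unpacked.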

The following commands will download and unzip the dataset:

unzip -d zerospeech2021_dataset
rm -f