On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification

Published in ICASSP24 SASB, 2024

Abstract: In recent years, self-supervised learning has excelled for its capacity to learn robust feature representations from unlabelled data. Networks pretrained through self-supervision serve as effective feature extractors for downstream tasks, including Few-Shot Learning. While the evaluation of unsupervised approaches for few-shot learning is well-established in imagery, it is notably absent in acoustics. This study addresses this gap by assessing large-scale self-supervised models’ performance in few-shot audio classification. Additionally, we explore the relationship between a model’s few-shot learning capability and other downstream task benchmarks. Our findings reveal state-of-the-art performance in some few-shot problems such as SpeechCommandsv2, as well as strong correlations between speech-based few-shot problems and various downstream audio tasks.

Paper on arXiv
Papers With Code Entry

Recommended citation (BibTeX):

@misc{heggan2024transferability,
  title={On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification},
  author={Calum Heggan and Sam Budgett and Timothy Hospedales and Mehrdad Yaghoobi},
  year={2024},
  eprint={2402.01274},
  archivePrefix={arXiv},
  primaryClass={cs.SD}
}

ICASSP citation coming soon