Learning Source Disentanglement in Neural Audio Codec
1LTCI, Télécom Paris, Institut polytechnique de Paris, France
2CVSSP, University of Surrey, UK
ICASSP, 2025
Resynthesis
Audio ID | Ground Truth | SD-Codec | DAC |
---|---|---|---|
4470 (mixture) | |||
4470 (speech) | |||
4470 (music) | |||
4470 (SFX) | |||
47461 (mixture) | |||
47461 (speech) | |||
47461 (music) | |||
47461 (SFX) | |||
54022 (mixture) | |||
54022 (speech) | |||
54022 (music) | |||
54022 (SFX) | |||
71510 (mixture) | |||
71510 (speech) | |||
71510 (music) | |||
71510 (SFX) | |||
98846 (mixture) | |||
98846 (speech) | |||
98846 (music) | |||
98846 (SFX) | |||
Separation
Audio ID | Ground Truth | SD-Codec | TDANet |
---|---|---|---|
21594 (mixture) | |||
21594 (speech) | |||
21594 (music) | |||
21594 (SFX) | |||
37828 (mixture) | |||
37828 (speech) | |||
37828 (music) | |||
37828 (SFX) | |||
58627 (mixture) | |||
58627 (speech) | |||
58627 (music) | |||
58627 (SFX) | |||
59313 (mixture) | |||
59313 (speech) | |||
59313 (music) | |||
59313 (SFX) | |||
98448 (mixture) | |||
98448 (speech) | |||
98448 (music) | |||
98448 (SFX) | |||
BibTeX
@article{bie2024sdcodec,
  author={Bie, Xiaoyu and Liu, Xubo and Richard, Ga{\"e}l},
  title={Learning Source Disentanglement in Neural Audio Codec},
  journal={arXiv preprint arXiv:2409.11228},
  year={2024},
}
Acknowledgement
This work was funded by the European Union (ERC, HI-Audio, 101052978). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. This project was provided with computer and storage resources by GENCI at IDRIS thanks to the grant 2024-AD011015054 on the supercomputer Jean Zay's V100 and A100 partition.
Page updated on 18 Sep 2024