MLASS - Multi Latent Autoregressive Source Separation
TLDR: This project consisted of extending the LASS source separation method to separate more than two sources while keeping the memory complexity feasible. I proposed two methods: belief propagation (BP) and probabilistic extractor (PE).
LASS paper • MLASS report • MLASS GitHub Repo
LASS: the original work
Latent Autoregressive Source Separation (LASS) is a source separation method introduced in the paper by Postolache et al. Its advantage is that it separates a mixture into its original components without the need for additional gradient-based optimization or modifications to pre-existing models. The method uses a VQ-VAE to embed the signals into a discrete latent space. Autoregressive priors modelled by a neural network are then combined with a joint likelihood (computed with an ad hoc technique) to obtain the joint posterior of the two original sources given the mixture, from which those sources can be sampled.
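
To make the posterior step more concrete, here is a minimal sketch of a single sampling step over a toy codebook. It assumes the two sources are independent under their priors, so that the posterior is proportional to the product of the two priors and the likelihood. The function names, the toy probabilities, and the codebook size of 2 are all hypothetical and are not taken from the LASS code.

```python
import random


def joint_posterior(prior_a, prior_b, likelihood):
    """Return the normalized k x k table p(z_a, z_b | mixture code).

    prior_a[i]       : p(z_a = i), from the first autoregressive prior
    prior_b[j]       : p(z_b = j), from the second autoregressive prior
    likelihood[i][j] : p(observed mixture code | z_a = i, z_b = j)
    Assumption: the sources are independent a priori.
    """
    k = len(prior_a)
    unnorm = [
        [prior_a[i] * prior_b[j] * likelihood[i][j] for j in range(k)]
        for i in range(k)
    ]
    total = sum(sum(row) for row in unnorm)
    if total == 0:
        raise ValueError("posterior has no mass; check the inputs")
    return [[value / total for value in row] for row in unnorm]


def sample_pair(posterior, rng):
    """Draw one (z_a, z_b) pair of code indices from the joint table."""
    k = len(posterior)
    flat = [posterior[i][j] for i in range(k) for j in range(k)]
    idx = rng.choices(range(k * k), weights=flat, k=1)[0]
    return divmod(idx, k)  # idx = i * k + j


# Toy example with k = 2 codes (made-up numbers).
prior_a = [0.7, 0.3]
prior_b = [0.4, 0.6]
likelihood = [[0.9, 0.1], [0.2, 0.8]]
posterior = joint_posterior(prior_a, prior_b, likelihood)
z_a, z_b = sample_pair(posterior, random.Random(0))
```

The table built here has one entry per pair of codes, which is why extending the same construction to more sources becomes expensive.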
The method has been tested on both the image (MNIST) and audio (SLAKH) domains.
MLASS: my extension
I proposed two methods that allow LASS to perform separation on more than two sources, decoupling the memory complexity from the number of sources in the mixture.
- The original memory complexity was $O(k^n)$, where $k$ is the number of codes used in the VQ-VAE embedding, and $n$ is the number of sources.
- The memory complexity of my solutions is $O((n-1)k^3)$, which grows linearly rather than exponentially in $n$.
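
As a rough illustration of the two growth rates, the sketch below evaluates both expressions literally as entry counts. This ignores the constant factors that big-O notation hides, and the codebook size `k = 256` is an arbitrary value chosen for the example, not one taken from the project.

```python
def dense_entries(k, n):
    """Entry count of a joint table over n sources with k codes each: k^n."""
    if n < 2:
        raise ValueError("need at least two sources")
    return k ** n


def factored_entries(k, n):
    """Entry count of (n - 1) tables of size k^3 each: (n - 1) * k^3."""
    if n < 2:
        raise ValueError("need at least two sources")
    return (n - 1) * k ** 3


k = 256  # hypothetical codebook size
for n in range(2, 7):
    print(f"n={n}: dense={dense_entries(k, n):,} factored={factored_entries(k, n):,}")
```

Under these literal counts, the dense table is multiplied by $k$ with every extra source, while the factored form only gains one more $k^3$ term, so the gap widens quickly as $n$ grows.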
Results of MLASS on two sources, the order is Original - Belief Propagation - Probabilistic Extractor

Results of MLASS on three sources, the order is Original - Belief Propagation - Probabilistic Extractor
Results on MNIST dataset (PSNR)
Method | 2 sources | 3 sources |
---|---|---|
LASS | 24.23 ± 6.23 | N/A |
MLASS-PE | 16.87 ± 3.77 | 13.64 ± 1.76 |
MLASS-BP | 19.30 ± 5.68 | 14.19 ± 2.23 |
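
For reference, PSNR can be computed as in the sketch below. It uses the standard definition $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$ and assumes images are flattened to lists of pixel values in $[0, 1]$; the exact preprocessing used to produce the numbers in the table may differ.

```python
import math


def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images.

    max_val is the largest possible pixel value (assumed 1.0 here).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


# A uniform error of 0.1 on every pixel gives an MSE of about 0.01.
example = psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1])
```

Higher values mean the separated image is closer to the original.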
Results on SLAKH dataset (SDR)
Method | 2 sources | 3 sources |
---|---|---|
LASS | 5.01 ± 2.39 | N/A |
MLASS-BP | 3.09 ± 3.23 | -0.44 ± 2.96 |
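
For reference, a plain signal-to-distortion ratio can be computed as in the sketch below. It uses the basic definition $10 \log_{10}(\lVert s \rVert^2 / \lVert s - \hat{s} \rVert^2)$; several SDR variants exist (for example scale-invariant ones), and the variant behind the numbers in the table may be a different one.

```python
import math


def sdr(reference, estimate):
    """Signal-to-distortion ratio in dB for two equal-length waveforms."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    signal_energy = sum(r ** 2 for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference signal is silent")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# An estimate that is off by 10% of the amplitude on every sample.
example = sdr([1.0, 1.0, 1.0, 1.0], [0.9, 0.9, 0.9, 0.9])
```

A value of 0 dB means the error has as much energy as the reference signal, so a negative SDR indicates that the error energy exceeds the signal energy.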