Audio source separation
This tutorial demonstrates how to perform rudimentary source separation in bellplay~ using Non-Negative Matrix Factorization (NMF).
NMF is a technique that can decompose a mixed audio signal into its component sources by learning the spectral characteristics (or "fingerprints") of each source.
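To make the idea concrete, here is a minimal pure-Python sketch of NMF using the classic multiplicative update rules for a squared-error cost. It is not how bellplay~ implements `buf2nmf`; the helper names (`matmul`, `transpose`, `nmf`, `reconstruction_error`) and the toy "spectrogram" are assumptions made purely for illustration. The matrix `v` stands in for a magnitude spectrogram (frequency bins by time frames), `w` holds the bases (one spectral fingerprint per column), and `h` holds the activations (how strongly each basis is present in each frame).

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iterations=200, seed=1, eps=1e-9):
    """Approximate non-negative v (bins x frames) as w (bins x rank) times h (rank x frames)."""
    rng = random.Random(seed)
    n_bins, n_frames = len(v), len(v[0])
    # Random positive initialization (an assumption; bases could also be seeded from known sources).
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_bins)]
    h = [[rng.random() + eps for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(iterations):
        # Activation update: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(n_frames)]
             for k in range(rank)]
        # Basis update: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(rank)]
             for f in range(n_bins)]
    return w, h

def reconstruction_error(v, w, h):
    """Sum of squared differences between v and the product w times h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb))

# Toy mixture: two hypothetical sources with distinct spectra over 4 bins and 5 frames.
source_a, source_b = [1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.8]
act_a, act_b = [1.0, 0.0, 1.0, 0.5, 0.0], [0.0, 1.0, 1.0, 0.0, 0.5]
mixture = [[source_a[f] * act_a[t] + source_b[f] * act_b[t] for t in range(5)]
           for f in range(4)]

w, h = nmf(mixture, rank=2)
print("squared error:", reconstruction_error(mixture, w, h))
```

Because every update multiplies by a non-negative ratio, `w` and `h` stay non-negative throughout, which is what lets each basis be read as a spectrum and each activation row as a loudness envelope over time.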
In this tutorial, we:
- Import two built-in audio files (trumpet.wav and drums.wav)
- Analyze their spectral content to create "basis" representations
- Mix the buffers together
- Use NMF to separate the mixed signal back into individual components
- Visualize the results: the separated components, learned bases, and activation patterns
- Use the NMF results for auto-panning by transcribing each component at its own pan position
audio_source_separation.bell
## set seed for reproducibility
setseed(1);
## import our source audio files
$files = 'trumpet.wav' 'drums.wav';
## load each file as a buffer
$bufs = for $f in $files collect importaudio($f);
## create spectral basis representations for each source
$bases = for $b $i in $bufs collect (
    ## normalize each buffer to -18 dB RMS for consistent levels
    $b = process($b, normalize(@level -18 @rms 1));
    ## update buffer with spectral analysis
    $b = analyze($b, spectrum());
    ## extract the computed spectrum
    $spectrum = getkey($b, 'spectrum');
    ## convert the spectrum data into a buffer format for NMF
    $basis = samps2buf([$spectrum]);
    $basis
);
## create a mixed version of both sources
$mixed = process(
    ## take the first buffer (trumpet) and mix in the second (drums)
    @buffer $bufs:1 @operations mix($bufs:2)
);
## perform Non-Negative Matrix Factorization on the mixed signal
## using the spectral bases we computed earlier as initialization
$nmf = buf2nmf(
    @buffer $mixed
    @bases $bases
    @iterations 100
    @useseed 1
);
## extract the three key results from NMF:
## components: the separated audio sources
$components = getkey($nmf, 'components');
## bases: the learned spectral patterns for each source
$bases = getkey($nmf, 'bases');
## activations: when and how strongly each source is present over time
$activations = getkey($nmf, 'activations');
## visualize all three types of results
for $buffers in [$components] [$bases] [$activations], $label in 'component' 'basis' 'activation' with @unwrap 1 do (
    ## iterate through each buffer and display it with an appropriate label
    for $buffer $i in $buffers do view(
        $buffer @label $label + ' ' + tosymbol($i)
    )
);
## transcribe components, each with a unique panning position
$panvals = arithmser(0, 1, null, length($components));
for $buffer in $components, $pan in $panvals do (
    transcribe(@buffer $buffer @pan $pan)
);
## trigger rendering
render(@play 1)
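The auto-panning step gives each separated component its own position in the stereo field. As a rough illustration of that spacing, the Python sketch below assumes that `arithmser(0, 1, null, n)` yields `n` evenly spaced values from 0 to 1; the function name `even_pan_positions`, the component labels, and the single-component fallback are hypothetical and not part of bellplay~.

```python
def even_pan_positions(count, left=0.0, right=1.0):
    """Return `count` evenly spaced pan values from `left` to `right`, inclusive."""
    if count < 1:
        return []
    if count == 1:
        # Assumption: a lone component sits at the starting position.
        return [left]
    step = (right - left) / (count - 1)
    return [left + i * step for i in range(count)]

# Pair each (hypothetical) separated component with its pan position.
components = ["component 1", "component 2"]
pans = even_pan_positions(len(components))
for name, pan in zip(components, pans):
    print(name, "->", pan)
```

With two components, as in this tutorial, the positions fall at the two extremes of the range, so the separated trumpet and drums end up on opposite sides.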