Audio Mosaicking
An example of basic audio mosaicking in bellplay~, where a target audio file is reconstructed using segments drawn from a small audio corpus.
The script below demonstrates how to use onset-based segmentation and multi-dimensional audio features (MFCCs and loudness) to match fragments of a target to acoustically similar units in a pre-analyzed corpus. The target is not reconstructed literally, but instead approximated by audio segments from the corpus.
A key aspect of the process is the use of a feature tree, built from the corpus, which allows fast approximate nearest-neighbor retrieval based on the extracted descriptors. Each target segment is mapped to its closest match in the corpus and reinserted into the timeline at the corresponding onset, optionally applying a gain envelope and stereo panning.
The approach is general and can be adapted to different sources, targets, and features.
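Before turning to the script itself, the matching step at the heart of the process can be sketched in Python. The sketch below uses a brute-force nearest-neighbor search in place of the feature tree (which exists only to make this lookup fast), with made-up two-dimensional feature vectors and hypothetical unit names standing in for real MFCC/loudness descriptors:

```python
import math

# Hypothetical pre-analyzed corpus: one feature vector per audio segment.
corpus_features = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5)]
corpus_units = ["flute_a", "trumpet_b", "voice_c"]

def nearest_unit(point):
    """Return the corpus unit whose feature vector is closest to `point`."""
    i = min(range(len(corpus_features)),
            key=lambda i: math.dist(point, corpus_features[i]))
    return corpus_units[i]

# Each target segment's feature vector selects one corpus segment;
# the selected segments, placed at the target's onsets, form the mosaic.
target_features = [(0.15, 0.85), (0.7, 0.3)]
mosaic = [nearest_unit(p) for p in target_features]
print(mosaic)  # → ['flute_a', 'trumpet_b']
```

A feature tree replaces the linear scan in `nearest_unit` with an approximate logarithmic-time lookup, which matters once the corpus holds thousands of segments.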
## Define features to extract using MFCC and loudness
$features = mfcc() larm();
## Collect feature keys for later retrieval
$featurekeys = for $f in $features collect $f.getkey('output'):-1;
## Define input files for analysis
$files = 'badinerie.wav' 'trumpet.wav' 'singing.wav';
## Onset detection parameters
$thresh = 0.01;
$alpha = 0.05;
$delay = 5;
## Create onset descriptor with specified parameters
$onsetdescriptor = onsets(
    @alpha $alpha
    @delay $delay
    @silencethreshold $thresh
);
## Function to extract feature values from a buffer
$buf2features = (
    $buf -^ $featurekeys -> [for $k in $featurekeys collect $buf.getkey($k)]
);
## List to store features
$data = null;
## List to store corresponding audio segments
$corpus = null;
## Analyze each file and build corpus
for $file in $files do (
    $buf = importaudio($file);
    $buf = $buf.analyze($onsetdescriptor);
    $markers = $buf.getkey('onsets');
    $segments = $buf.splitbuf(
        ## split points
        @split $markers
        ## mode 2: segments based on pre-computed onsets/markers
        @mode 2
    );
    for $seg in $segments do (
        ## Extract features for each segment
        $seg = $seg.analyze($features);
        ## Append feature vector
        $data _= $buf2features($seg);
        ## Append corresponding audio segment
        $corpus _= $seg
    )
);
## Create feature tree (e.g., KD-tree) for nearest neighbor search
$tree = createtree($data);
## Load and analyze target audio for matching
$target = importaudio('drums.wav');
$target = $target.analyze($onsetdescriptor);
$markers = $target.getkey('onsets');
## Compute inter-onset durations (the total duration is appended so the final segment is included)
$markerdurs = x2dx($markers $target.getkey('duration'));
## Match each segment in the target to corpus using features
for $marker in $markers, $dur in $markerdurs do (
    ## Carve out the target segment at the current onset
    $segment = $target.setkey('duration', $dur).setkey('offset', $marker);
    ## Extract its feature vector and look up the closest corpus unit
    $segment = $segment.analyze($features);
    $point = $buf2features($segment);
    $id = querytree($tree, $point);
    $match = $corpus:$id;
    ## Transcribe the matched segment at the corresponding onset,
    ## panned to the left channel, with a fade-out gain envelope
    $match.transcribe(
        @onset $marker
        @pan 0
        @gain [0 1 0] [1 0 -0.25]
    )
);
## Optionally layer original target audio on the right stereo channel
$target.transcribe(@pan 1);
## Render the final mix, normalize it to -3 dB, and play it back
render(
    @play 1
    @process normalize(-3)
)
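The `x2dx` step above, which turns the absolute onset positions into per-segment durations, behaves roughly like the following Python sketch (the function name and signature mirror the bell call for illustration; the total duration is appended so the final segment reaches the end of the file):

```python
def x2dx(positions, total):
    """Convert absolute positions into consecutive interval lengths.

    Appending `total` gives the last segment a duration too, mirroring
    how the script appends the target's total duration to its markers.
    """
    points = list(positions) + [total]
    return [b - a for a, b in zip(points, points[1:])]

# Three onsets in a 4-second file yield three segment durations.
durs = x2dx([0.0, 1.2, 2.5], 4.0)
print(durs)  # → [1.2, 1.3, 1.5]
```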