Audio Mosaicking
An example of basic audio mosaicking in bellplay~, where a target audio file is reconstructed using segments drawn from a small audio corpus.
The script below demonstrates how to use onset-based segmentation and multi-dimensional audio features (MFCCs and loudness) to match fragments of a target to acoustically similar units in a pre-analyzed corpus. The target is not reconstructed literally, but instead approximated by audio segments from the corpus.
A key aspect of the process is the use of a feature tree, built from the corpus, which allows fast approximate nearest-neighbor retrieval based on the extracted descriptors. Each target segment is mapped to its closest match in the corpus and reinserted into the timeline at the corresponding onset, optionally applying a gain envelope and stereo panning.
The approach is general and can be adapted to different sources, targets, and features.
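Before turning to the script itself, the matching step at the heart of the process can be sketched in Python. The sketch below uses a brute-force nearest-neighbor search in place of the feature tree (which exists only to make this lookup fast), with made-up two-dimensional feature vectors and hypothetical unit names standing in for real MFCC/loudness descriptors:

```python
import math

# Hypothetical pre-analyzed corpus: one feature vector per audio segment.
corpus_features = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5)]
corpus_units = ["flute_a", "trumpet_b", "voice_c"]

def nearest_unit(point):
    """Return the corpus unit whose feature vector is closest to `point`."""
    i = min(range(len(corpus_features)),
            key=lambda i: math.dist(point, corpus_features[i]))
    return corpus_units[i]

# Each target segment's feature vector selects one corpus segment;
# the selected segments, placed at the target's onsets, form the mosaic.
target_features = [(0.15, 0.85), (0.7, 0.3)]
mosaic = [nearest_unit(p) for p in target_features]
print(mosaic)  # → ['flute_a', 'trumpet_b']
```

A feature tree replaces the linear scan in `nearest_unit` with an approximate logarithmic-time lookup, which matters once the corpus holds thousands of segments.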
## Define features to extract using MFCC and loudness
$features = mfcc() larm();
## Collect feature keys for later retrieval
$featurekeys = for $f in $features collect $f.getkey('output'):-1;
## Define input files for analysis
$files = 'badinerie.wav' 'trumpet.wav' 'singing.wav';
## Onset detection parameters
$thresh = 0.01;
$alpha = 0.05;
$delay = 5;
## Create onset descriptor with specified parameters
$onsetdescriptor = onsets(
    @alpha $alpha
    @delay $delay
    @silencethreshold $thresh
);
## Function to extract feature values from a buffer
$buf2features = (
    $buf -^ $featurekeys -> [for $k in $featurekeys collect $buf.getkey($k)]
);
## List to store features
$data = null;
## List to store corresponding audio segments
$corpus = null;
## Analyze each file and build corpus
for $file in $files do (
    $buf = importaudio($file);
    $buf = $buf.analyze($onsetdescriptor);
    $markers = $buf.getkey('onsets');
    $segments = $buf.splitbuf(
        ## split points
        @split $markers
        ## mode 2: segments based on pre-computed onsets/markers
        @mode 2
    );
    for $seg in $segments do (
        ## Extract features for each segment
        $seg = $seg.analyze($features);
        ## Append feature vector
        $data _= $buf2features($seg);
        ## Append corresponding audio segment
        $corpus _= $seg
    )
);
## Create feature tree (e.g., KD-tree) for nearest neighbor search
$tree = createtree($data);
## Load and analyze target audio for matching
$target = importaudio('drums.wav');
$target = $target.analyze($onsetdescriptor);
$markers = $target.getkey('onsets');
## Compute inter-onset durations (the total duration is appended so the final segment is included)
$markerdurs = x2dx($markers $target.getkey('duration'));
## Match each segment in the target to corpus using features
for $marker in $markers, $dur in $markerdurs do (
    ## Carve out the target segment at the current onset
    $segment = $target.setkey('duration', $dur).setkey('offset', $marker);
    ## Extract its feature vector and look up the closest corpus unit
    $segment = $segment.analyze($features);
    $point = $buf2features($segment);
    $id = querytree($tree, $point);
    $match = $corpus:$id;
    ## Transcribe the matched segment at the corresponding onset,
    ## panned to the left channel, with a fade-out gain envelope
    $match.transcribe(
        @onset $marker
        @pan 0
        @gain [0 1 0] [1 0 -0.25]
    )
);
## Optionally layer original target audio on the right stereo channel
$target.transcribe(@pan 1);
## Render the final mix, normalize it to -3 dB, and play it back
render(
    @play 1
    @process normalize(-3)
)
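The `x2dx` step above, which turns the absolute onset positions into per-segment durations, behaves roughly like the following Python sketch (the function name and signature mirror the bell call for illustration; the total duration is appended so the final segment reaches the end of the file):

```python
def x2dx(positions, total):
    """Convert absolute positions into consecutive interval lengths.

    Appending `total` gives the last segment a duration too, mirroring
    how the script appends the target's total duration to its markers.
    """
    points = list(positions) + [total]
    return [b - a for a, b in zip(points, points[1:])]

# Three onsets in a 4-second file yield three segment durations.
durs = x2dx([0.0, 1.2, 2.5], 4.0)
print(durs)  # → [1.2, 1.3, 1.5]
```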