umap

umap(
@dataset ## llll (required)
@iterations 200
@learnrate 0.1
@mindist 0.1
@numdimensions 2
@numneighbors 15
@useseed 0
) -> llll

Fits a Uniform Manifold Approximation and Projection (UMAP) model to a dataset and returns a UMAP object. Pass this object to the transform function to reduce the dimensionality of that dataset or of new points.


Arguments

  • @dataset [llll]: Dataset to fit UMAP to (required)
  • @iterations [int]: The number of iterations the algorithm runs to optimise the new representation. (default: 200).
  • @learnrate [float]: The learning rate of the algorithm, i.e. how much of the error it uses to estimate the next iteration. (default: 0.1).
  • @mindist [float]: The minimum distance allowed between points in the low-dimensional space. Lower values produce tighter clumps; higher values spread the points out more. (default: 0.1).
  • @numdimensions [int]: The number of dimensions to reduce to. (default: 2).
  • @numneighbors [int]: The number of neighbors the algorithm considers for each point, which balances how much local versus global structure is preserved. Low values prioritise local structure, while high values prioritise global structure. (default: 15).
  • @useseed [int]: Whether to use a random seed for parameter initialization. (default: 0).
    • 0: Off
    • 1: On
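
To make the effect of @numneighbors more concrete, here is a minimal sketch in plain Python. It is not the implementation behind umap, and the helper name k_nearest and the toy points are assumptions made up for illustration. It only shows what "number of neighbors considered" means: with a small neighborhood every point sees only its own cluster, while a larger one reaches across clusters.

```python
import math

def k_nearest(points, k):
    """For each point, return the indices of its k nearest other points (Euclidean)."""
    result = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with that point's index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        result.append([j for _, j in dists[:k]])
    return result

# Hypothetical toy data: two well-separated clusters of three points each.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]

small = k_nearest(points, 2)  # neighborhoods stay inside each cluster
large = k_nearest(points, 4)  # neighborhoods must reach into the other cluster

print(small[0], large[0])
```

In this toy case a neighborhood of 2 never links the two clusters, so only local structure is visible, whereas a neighborhood of 4 forces every point to take the other cluster into account.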

Output

UMAP object [llll]


Usage

$indataset = dataset(
    for $i in 1...100 collect [$i * 10 ** (-1...1)] ## dummy input dataset
);
$inpoint = 0.65 6.5 65; ## dummy input point
$reducer = umap($indataset); ## create reducer based on dummy dataset
$outdataset = transform($reducer, $indataset); ## transform dataset based on learned parameters
$outpoint = transform($reducer, $inpoint); ## transform new point based on learned parameters
writeobject($reducer, './umap.json'); ## write to JSON (optional)
$reducer = readobject('./umap.json'); ## read from JSON (optional)
print(getitems($indataset, 1...5), 'Input dataset:');
print(getitems($outdataset, 1...5), 'Output dataset:');
print($outpoint, 'Output point:')
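
For readers unfamiliar with the expression that builds the dummy dataset, the following Python sketch reproduces the same values. It assumes that the range 1...100 is inclusive and that -1...1 expands to -1 0 1, so each row is [$i * 0.1, $i, $i * 10]; the variable names are chosen here for illustration only.

```python
import math

# Assumed reading of: for $i in 1...100 collect [$i * 10 ** (-1...1)]
exponents = (-1, 0, 1)
rows = [[i * 10.0 ** e for e in exponents] for i in range(1, 101)]

# The dummy input point 0.65 6.5 65 follows the same pattern with a factor of 6.5.
point = [6.5 * 10.0 ** e for e in exponents]

# Every row is a multiple of the same direction (0.1, 1, 10), so the three
# columns carry the same information at different scales.
same_direction = all(
    math.isclose(r[0] * 10, r[1]) and math.isclose(r[1] * 10, r[2]) for r in rows
)

print(len(rows), rows[0], point, same_direction)
```

Under that reading, the dataset has 100 points with 3 values each, and the new point sits between the sixth and seventh rows, which makes it a convenient check that transform handles unseen input.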