Research preview · semantic layer

RuFaS Semantic Map

An interactive 3D map for browsing semantic neighborhoods among RuFaS input variables, output variables, and input files. Start with the map, switch projections, and use the details below only when you want more context about how the preview was generated.

Interactive 3D semantic map

Browse the map first

Switch projections without shrinking the map. The cluster labels stay the same in every view; only the 3D projection changes.

UMAP is a nonlinear projection that often keeps nearby semantic neighbors visually close.

Points

Each point is a RuFaS input variable, output variable, or input file.

Hover

Hover labels show cluster name, item type, and path. Internal custom IDs are hidden.

Projection

UMAP, PCA, and t-SNE are visual views only; clustering did not happen on these 3D coordinates.

More context

How to read this preview

The map is the primary artifact. The sections below explain the supporting method, summary numbers, and caveats for people who want to understand how the semantic layer preview was produced.

What this shows

The map compares semantic neighborhoods among three RuFaS item types: input variables, output variables, and input files. The text description of each item was embedded using OpenAI's large embedding model, creating a high-dimensional vector representation of meaning.

DBSCAN clustering was applied to those original high-dimensional embedding vectors using cosine distance. UMAP, PCA, and t-SNE are used only as visualization methods; clustering is not performed on the 3D coordinates shown in the map.
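
The separation between clustering and visualization can be sketched as follows. This is a minimal illustration, not the pipeline used for the preview: the embeddings are synthetic stand-ins, and the `eps` and `min_samples` values are assumptions chosen only so the toy data separates cleanly.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for real embeddings: three tight groups of
# 64-dimensional vectors scattered around three random directions.
centers = rng.normal(size=(3, 64))
embeddings = np.vstack([c + 0.05 * rng.normal(size=(40, 64)) for c in centers])

# Step 1: cluster in the original high-dimensional space with cosine distance.
# eps and min_samples are illustrative assumptions, not the preview's settings.
labels = DBSCAN(eps=0.1, min_samples=5, metric="cosine").fit_predict(embeddings)

# Step 2: project to 3D for display only. The labels were computed above
# and are not affected by which projection is chosen.
coords_3d = PCA(n_components=3, random_state=0).fit_transform(embeddings)

print(sorted(set(labels.tolist())), coords_3d.shape)
```

Swapping PCA for UMAP or t-SNE in step 2 would change `coords_3d` but leave `labels` untouched, which is why cluster labels stay constant across projection views.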

How to use the map

Rotate: click and drag with the left mouse button.
Pan / move: click and drag with the right mouse button.
Zoom: use the mouse wheel or trackpad scroll.
Inspect a point: hover to see its cluster name, type, and path.
Filter with the legend: click a legend item to hide or show it.
Isolate a cluster: double-click a legend item to show only that cluster; double-click again to restore all.

Interpretation notes

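Nearby points tend to be semantically similar according to the embedding model, but distances in a 3D projection only approximate distances in the original embedding space.
UMAP and t-SNE usually emphasize local neighborhood structure; PCA is a linear projection that preserves the directions of greatest overall variance, so it tends to reflect global layout better than local detail.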
The same cluster labels are shown in every projection panel so visual differences come from the projection method, not from reclustering.
This visualization helps inspect semantic neighborhoods among RuFaS model items; it should be treated as a research/product preview rather than a definitive scientific grouping.
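
One way to see how much a projection distorts neighborhoods is to compare each point's nearest neighbors before and after projecting. The sketch below is an illustrative check on random data, not part of the preview: the helper names `knn_indices` and `neighbor_overlap` are hypothetical, and PCA is computed directly with an SVD to keep the example dependency-light.

```python
import numpy as np

def knn_indices(dist, k):
    """Indices of the k nearest neighbors of each row, excluding the point itself."""
    d = dist.copy()
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def neighbor_overlap(embeddings, coords, k=10):
    """Mean fraction of cosine-space neighbors that are also neighbors in coords."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cosine_dist = 1.0 - unit @ unit.T
    diff = coords[:, None, :] - coords[None, :, :]
    euclid_dist = np.linalg.norm(diff, axis=2)
    high = knn_indices(cosine_dist, k)
    low = knn_indices(euclid_dist, k)
    shared = [len(set(h) & set(l)) for h, l in zip(high, low)]
    return float(np.mean(shared)) / k

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 64))  # synthetic stand-in vectors

# PCA to 3D via SVD of the centered data.
centered = embeddings - embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords_3d = centered @ vt[:3].T

score = neighbor_overlap(embeddings, coords_3d, k=10)
print(f"share of neighbors preserved in 3D: {score:.2f}")
```

A score near 1 means the projection keeps most original neighbors together; a low score is a reminder that apparent closeness on the map should be confirmed against the underlying embeddings.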

References

OpenAI Embeddings
UMAP documentation
scikit-learn DBSCAN
scikit-learn PCA
scikit-learn t-SNE
Plotly 3D scatter plots