Open-source dataset curation + embedding visualization (Euclidean + Poincaré disk)
- Dual-Panel UI: Image grid + scatter plot with bidirectional selection
- Euclidean/Poincaré Toggle: Switch between standard 2D UMAP and Poincaré disk visualization
- HuggingFace Integration: Load datasets directly from HuggingFace Hub
- Fast Embeddings: Uses EmbedAnything for CLIP-based image embeddings
Docs: docs/datasets.md · docs/colab.md · CONTRIBUTING.md · TESTS.md
pip install hyperview
hyperview demo --samples 500

This will:
- Load 500 samples from CIFAR-100
- Compute CLIP embeddings
- Generate Euclidean and Poincaré visualizations
- Start the server at http://127.0.0.1:6262
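
To confirm that the demo server actually came up, a small reachability check can help. The sketch below uses only the Python standard library; `server_is_up` is a hypothetical helper (not part of hyperview), and the host and port defaults are simply the address quoted above.

```python
import socket


def server_is_up(host: str = "127.0.0.1", port: int = 6262, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at host:port.

    Hypothetical helper for illustration; it only checks that the port is
    open, not that the process behind it is HyperView.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("Server reachable:", server_is_up())
```

Run it in a second terminal while `hyperview demo` is running; a `False` result usually means the server is still starting or is bound to a different address.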
import hyperview as hv
# Create dataset
dataset = hv.Dataset("my_dataset")
# Load from HuggingFace
dataset.add_from_huggingface(
    "uoft-cs/cifar100",
    split="train",
    max_samples=1000,
)
# Or load from local directory
# dataset.add_images_dir("/path/to/images", label_from_folder=True)
# Compute embeddings and visualization
dataset.compute_embeddings(model="openai/clip-vit-base-patch32")
dataset.compute_visualization()
# Launch the UI
hv.launch(dataset)  # Opens http://127.0.0.1:6262

See docs/colab.md for a fast Colab smoke test and notebook-friendly launch behavior.
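
Before pointing `add_images_dir` at a local directory, it can be useful to preview which labels the folder layout would produce. The sketch below assumes `label_from_folder=True` follows the common convention of using each image's parent directory name as its label; that is an assumption, not confirmed behavior. `preview_folder_labels` and `IMAGE_SUFFIXES` are hypothetical names, not part of hyperview.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"}


def preview_folder_labels(root: str) -> Counter:
    """Count image files per parent-folder name under `root`.

    Hypothetical helper: it mirrors the assumed "folder name = label"
    convention and does not call hyperview at all.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            counts[path.parent.name] += 1
    return counts
```

For a layout such as `images/cat/001.jpg` and `images/dog/002.jpg`, calling `preview_folder_labels("images")` returns per-folder counts, which makes empty or misnamed class folders easy to spot before computing embeddings.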
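Traditional Euclidean embeddings struggle with hierarchical data. In Euclidean space, volume grows polynomially with radius ($\propto r^d$ in $d$ dimensions).
Hyperbolic space (Poincaré disk) has exponential volume growth ($\propto e^{r}$ for large $r$ in the 2D disk), which matches how the number of nodes in a hierarchy grows with depth.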
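
The difference in growth rates can be checked numerically. The sketch below is independent of hyperview and uses standard formulas for the hyperbolic plane with curvature -1: a disk of hyperbolic radius $r$ has area $2\pi(\cosh r - 1)$, and the distance between points $u, v$ in the Poincaré disk is $\operatorname{arcosh}\left(1 + \frac{2\lVert u - v\rVert^2}{(1 - \lVert u\rVert^2)(1 - \lVert v\rVert^2)}\right)$. The function names are illustrative only.

```python
import math


def euclidean_disk_area(r: float) -> float:
    """Area of a flat disk of radius r: grows like r**2."""
    return math.pi * r ** 2


def hyperbolic_disk_area(r: float) -> float:
    """Area of a disk of hyperbolic radius r (curvature -1): grows like e**r."""
    return 2.0 * math.pi * (math.cosh(r) - 1.0)


def poincare_distance(u, v) -> float:
    """Hyperbolic distance between two 2D points strictly inside the unit disk."""
    diff_sq = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    norm_u_sq = u[0] ** 2 + u[1] ** 2
    norm_v_sq = v[0] ** 2 + v[1] ** 2
    return math.acosh(1.0 + 2.0 * diff_sq / ((1.0 - norm_u_sq) * (1.0 - norm_v_sq)))


if __name__ == "__main__":
    # Compare how much "room" each geometry offers as the radius grows.
    for r in (1, 2, 4, 8):
        print(f"r={r}: euclidean={euclidean_disk_area(r):.1f}, "
              f"hyperbolic={hyperbolic_disk_area(r):.1f}")
    # Equal Euclidean steps get hyperbolically longer near the disk boundary.
    print(poincare_distance((0.0, 0.0), (0.1, 0.0)))
    print(poincare_distance((0.8, 0.0), (0.9, 0.0)))
```

For small radii the two areas nearly coincide, while for larger radii the hyperbolic area pulls away quickly. The distance function shows the same effect from the other side: points near the rim of the disk are much farther apart than they look on screen.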
Development setup, frontend hot-reload, and backend API notes live in CONTRIBUTING.md.
- hyper-scatter: High-performance WebGL scatterplot engine (Euclidean + Poincaré) used by the frontend: https://github.com/Hyper3Labs/hyper-scatter
- hyper-models: Non-Euclidean model zoo + ONNX exports (e.g. for hyperbolic VLM experiments): https://github.com/Hyper3Labs/hyper-models
- Poincaré Embeddings for Learning Hierarchical Representations (Nickel & Kiela, 2017)
- Hyperbolic Neural Networks (Ganea et al., 2018)
MIT License - see LICENSE for details.