Blob-Oriented Retrieval with Segmental Unified KNN

BORSUK

Similarity search that digs through segments, not RAM.

Low RAM

Index stats and search reports expose resident routing memory, bytes read, and cache hits while vector blocks live in external files or blobs.
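
As a rough illustration of the kind of accounting such a report implies, the sketch below counts bytes read and cache hits around a small LRU block cache. It is not BORSUK's API: `SearchReport`, `BlockCache`, the `fetch` callback, and the segment names are all hypothetical and exist only to show how these counters can be kept while blocks stay outside the process until requested.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class SearchReport:
    """Hypothetical per-search counters (not BORSUK's actual report type)."""
    bytes_read: int = 0
    cache_hits: int = 0
    cache_misses: int = 0


class BlockCache:
    """Tiny LRU cache over externally stored blocks (illustrative only)."""

    def __init__(self, fetch, capacity):
        self._fetch = fetch          # callable: block_id -> bytes (file or blob read)
        self._capacity = capacity    # maximum number of blocks kept in memory
        self._blocks = OrderedDict()

    def get(self, block_id, report):
        if block_id in self._blocks:
            # Cached: mark as most recently used and count the hit.
            self._blocks.move_to_end(block_id)
            report.cache_hits += 1
            return self._blocks[block_id]
        # Not cached: read from external storage and count the bytes.
        data = self._fetch(block_id)
        report.cache_misses += 1
        report.bytes_read += len(data)
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            # Evict the least recently used block.
            self._blocks.popitem(last=False)
        return data


# Stand-in for external storage: a dict instead of files or blobs.
storage = {"seg-0": b"\x00" * 1024, "seg-1": b"\x00" * 2048}
cache = BlockCache(storage.__getitem__, capacity=1)
report = SearchReport()
for block_id in ["seg-0", "seg-0", "seg-1"]:
    cache.get(block_id, report)
print(report)
```

With a capacity of one block, the second request for `seg-0` is served from memory and every other request goes to storage, which is the trade-off these counters make visible.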

Blob First

Arrow schemas define the tables; Parquet files are the durable I/O unit for local files and S3-compatible object storage.

Metric Rich

Dense-vector, histogram, set-like, and string edit/similarity metrics are available through typed Python and TypeScript enums plus catalog helpers.
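
To make the four metric families concrete, here is a minimal pure-Python sketch with one representative distance from each family, selected through an enum. The `Metric` enum, its member names, and the `distance` helper are assumptions made for this example; they are not BORSUK's enum or catalog helpers, and the source does not say which specific metrics BORSUK ships.

```python
import math
from enum import Enum


class Metric(Enum):
    """Hypothetical metric names, one per family (not BORSUK's enum)."""
    COSINE = "cosine"            # dense vectors
    CHI_SQUARE = "chi_square"    # histograms
    JACCARD = "jaccard"          # set-like data
    LEVENSHTEIN = "levenshtein"  # string edit distance


def cosine_distance(a, b):
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def chi_square_distance(p, q):
    """Symmetric chi-square distance; bins empty in both inputs are skipped."""
    return 0.5 * sum((x - y) ** 2 / (x + y) for x, y in zip(p, q) if x + y > 0)


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def levenshtein_distance(s, t):
    """Edit distance via the classic two-row dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]


_DISTANCES = {
    Metric.COSINE: cosine_distance,
    Metric.CHI_SQUARE: chi_square_distance,
    Metric.JACCARD: jaccard_distance,
    Metric.LEVENSHTEIN: levenshtein_distance,
}


def distance(metric, a, b):
    """Dispatch on the enum so callers cannot pass a misspelled metric name."""
    return _DISTANCES[metric](a, b)


print(distance(Metric.LEVENSHTEIN, "kitten", "sitting"))
print(distance(Metric.JACCARD, {1, 2, 3}, {2, 3, 4}))
```

Dispatching on an enum rather than a free-form string means an unknown metric fails at lookup time instead of silently falling back to a default.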

Native APIs

BORSUK is implemented in Rust and exposed directly through PyO3 and N-API. The Python and TypeScript packages do not shell out to the CLI.

cargo run --locked -p borsuk --example local_index
cd python && python examples/local_index.py
cd packages/borsuk && npm run example:local

SeaweedFS Smoke

The repository includes a local SeaweedFS stack that runs the Rust, Python, and TypeScript tests end to end against S3-compatible storage.

Open SeaweedFS example

Docs

Architecture, storage-format, API, and benchmark notes live alongside the source so that implementation and documentation move together.

Read the docs