B-R-P/Vztor
Vector database written in Zig
A high-performance key-value vector database written in Zig.
Vztor combines the power of NMSLIB for efficient approximate nearest neighbor search with LMDB for persistent key-value storage. It provides a simple API for storing, retrieving, and searching vectors with associated metadata.
Dependencies are managed via Zig's package manager:
Add Vztor as a dependency in your build.zig.zon:
.dependencies = .{
    .vztor = .{
        .url = "https://github.com/B-R-P/Vztor/archive/refs/heads/main.tar.gz",
        .hash = "...",
    },
},
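The dependency entry above only fetches the package; the build script still has to expose it to your code. The following build.zig is a minimal sketch of that wiring, assuming Zig 0.14 or later, a source file at src/main.zig, an executable named my_app, and that the package exports a module called "vztor" and accepts the standard target and optimize options. None of these names are confirmed by the Vztor sources, so adjust them to what the package actually exposes. Running `zig fetch --save` with the URL is the usual way to have Zig fill in the `.hash` field.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Resolve the dependency declared as `.vztor` in build.zig.zon.
    const vztor_dep = b.dependency("vztor", .{
        .target = target,
        .optimize = optimize,
    });

    // Root module of the application (path is an assumption).
    const exe_mod = b.createModule(.{
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    // Assumption: the package exports a module named "vztor".
    // This is what makes `@import("vztor")` resolve in src/main.zig.
    exe_mod.addImport("vztor", vztor_dep.module("vztor"));

    const exe = b.addExecutable(.{
        .name = "my_app", // hypothetical executable name
        .root_module = exe_mod,
    });
    b.installArtifact(exe);
}
```

The usage example also imports `nmslib`; that module would need a similar `addImport` call, but where it comes from is not stated here, so it is left out of the sketch.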
const std = @import("std");
const VStore = @import("vztor").VStore;
const nmslib = @import("nmslib");

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    var store = try VStore.init(
        allocator,
        "my_database", // Database path
        "negdotprod_sparse", // Space type
        nmslib.DataType.SparseVector,
        nmslib.DistType.Float,
        1000, // Max readers
    );
    defer store.deinit() catch unreachable;
}
// Define sparse vectors
const vectors = [_][]const nmslib.SparseElem{
    &[_]nmslib.SparseElem{
        .{ .id = 1, .value = 1.0 },
        .{ .id = 5, .value = 2.0 },
    },
    &[_]nmslib.SparseElem{
        .{ .id = 2, .value = 1.0 },
        .{ .id = 10, .value = 3.0 },
    },
};
// Associated data payloads
const data = [_][]const u8{ "Vector 1", "Vector 2" };
// Insert vectors (keys are auto-generated if null)
const keys = try store.batchPut(&vectors, &data, null);
// Get vector and data by key
const result = try store.batchGet(keys[0]);
std.debug.print("Data: {s}\n", .{result.data});
// Search for k nearest neighbors
const search_results = try store.search(query_vector, k);
// The capture is named `hit` so it does not shadow the earlier `result`
for (search_results) |hit| {
    std.debug.print("Key: {s}, Distance: {d}\n", .{ hit.key, hit.distance });
}
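To make the printed distances easier to interpret: in the "negdotprod_sparse" space, the distance between two sparse vectors is the negative of their dot product, so a larger overlap yields a smaller (more negative) distance. The sketch below computes that value by hand. `SparseElem` here is a hypothetical local mirror of `nmslib.SparseElem` (the real field types may differ), and `negDotProduct` is an illustrative helper, not part of Vztor or NMSLIB. It assumes both vectors are sorted by `id`.

```zig
const std = @import("std");

// Hypothetical local mirror of nmslib.SparseElem; real field types may differ.
const SparseElem = struct {
    id: u32,
    value: f32,
};

// Negative dot product of two sparse vectors whose elements are sorted by id.
// Only ids present in both vectors contribute to the sum.
fn negDotProduct(a: []const SparseElem, b: []const SparseElem) f32 {
    var i: usize = 0;
    var j: usize = 0;
    var dot: f32 = 0.0;
    while (i < a.len and j < b.len) {
        if (a[i].id == b[j].id) {
            dot += a[i].value * b[j].value;
            i += 1;
            j += 1;
        } else if (a[i].id < b[j].id) {
            i += 1;
        } else {
            j += 1;
        }
    }
    return -dot;
}

pub fn main() void {
    const query = [_]SparseElem{
        .{ .id = 1, .value = 1.0 },
        .{ .id = 5, .value = 2.0 },
    };
    const stored = [_]SparseElem{
        .{ .id = 5, .value = 3.0 },
        .{ .id = 10, .value = 1.0 },
    };
    // Only id 5 is shared: 2.0 * 3.0 = 6.0, so the distance is -6.0.
    std.debug.print("distance: {d}\n", .{negDotProduct(&query, &stored)});
}
```

Vectors with no ids in common get a distance of 0, which ranks them behind any pair with a positive dot product.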
// Save the index and flush LMDBX to disk
try store.save();
init(allocator, comptime db_path, space_type, vector_type, dist_type, max_readers)
Initializes a new VStore instance or loads an existing one from disk.
allocator: Memory allocator
db_path: Path to the database directory (comptime string)
space_type: Vector space type (e.g., "negdotprod_sparse")
vector_type: Type of vectors (e.g., SparseVector)
dist_type: Distance type (e.g., Float)
max_readers: Maximum number of concurrent readers

batchPut(vectors, data, keys) (internal)
Inserts multiple vectors with associated data.
vectors: Array of sparse vectors
data: Array of data payloads
keys: Optional array of keys (auto-generated if null)
Returns: Array of keys for the inserted vectors

batchGet(key) (internal)
Retrieves a vector and its associated data by key.
Returns: getResult struct with vector and data fields

search(vector, k) (internal)
Searches for the k nearest neighbors to the given vector.
Returns: Array of searchResult structs with key, data, and distance fields

save()
Persists the index and flushes LMDBX to disk.

deinit()
Cleans up resources and closes the store. Returns an error union (!void).
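Because deinit() returns an error union, it cannot be called with `try` inside a `defer`. The usage example handles this with `catch unreachable`, which is undefined behavior in unchecked release builds if cleanup ever fails. A gentler option is to handle the error in place. The sketch below shows the pattern with `FakeStore`, a hypothetical stand-in that only models a fallible deinit(); it is not the real VStore, and `useStore` is an illustrative helper.

```zig
const std = @import("std");

// Hypothetical stand-in for VStore: only models a deinit() that can fail.
const FakeStore = struct {
    fail_on_close: bool,

    fn deinit(self: *FakeStore) !void {
        if (self.fail_on_close) return error.CloseFailed;
    }
};

// Uses a store and records whether its cleanup reported an error.
fn useStore(fail_on_close: bool, cleanup_failed: *bool) void {
    var store = FakeStore{ .fail_on_close = fail_on_close };
    // `try` is not allowed inside `defer`, so the error is handled here.
    defer store.deinit() catch |err| {
        std.log.warn("store cleanup failed: {s}", .{@errorName(err)});
        cleanup_failed.* = true;
    };
    // ... work with the store would go here ...
}

pub fn main() void {
    var failed = false;
    useStore(true, &failed);
    std.debug.print("cleanup failed: {}\n", .{failed});
}
```

Logging keeps a failed close visible without crashing the program; whether that is preferable to `catch unreachable` depends on how the application wants to treat a store that could not be closed cleanly.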
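Vztor uses a dual-storage architecture: NMSLIB handles approximate nearest neighbor search over the vectors, while the key-value store persists keys and their data payloads.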
Data is organized in three LMDBX databases:
data: Maps keys to data payloads
index_to_key: Maps numeric indices to key-position pairs
metadata: Stores configuration like random seeds

zig build
zig build test
See LICENSE file for details.