# mattzcarey/llama.js

Run LLMs (Llama, Mamba, Nemo, Mistral) at near-native speed from JavaScript and TypeScript.

An experiment in running llama.cpp through a JavaScript runtime.
Getting started (example commands below):

1. Clone the repo recursively (it uses git submodules).
2. Install Zig and add it to your PATH.
3. `cd` into `llama.cpp.zig`.
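A minimal sketch of those steps. The repository URL is inferred from the repo name, and the Zig install method shown (Homebrew) is just one option; any install that puts `zig` on your PATH works:

```sh
git clone --recursive https://github.com/mattzcarey/llama.js.git
cd llama.js

# Install Zig however you prefer; Homebrew shown as one example.
brew install zig

cd llama.cpp.zig
```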
Download a model:

```sh
huggingface-cli download NousResearch/Hermes-2-Pro-Mistral-7B-GGUF Hermes-2-Pro-Mistral-7B.Q4_0.gguf --local-dir models
```
Run the simple example:

```sh
zig build run-simple -Doptimize=ReleaseFast -- --model_path "./models/Hermes-2-Pro-Mistral-7B.Q4_0.gguf" --prompt "Hello! I am AI, and here are the 10 things I like to think about:"
```
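Here `-Doptimize=ReleaseFast` builds with optimizations enabled, and everything after `--` is forwarded to the example binary rather than interpreted by `zig build` itself.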
Then build the project and run the TypeScript entry point with Bun:

```sh
zig build
bun run index.ts
```
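For a sense of how a JavaScript runtime can drive the native build, here is a minimal sketch using Bun's FFI. The shared-library path and the exported `generate` symbol are assumptions for illustration, not this project's actual API; see `index.ts` in the repo for the real entry point.

```ts
// sketch.ts — illustrative only; the library path and symbol name below
// are assumptions, not llama.js's real API.
import { dlopen, FFIType, ptr } from "bun:ffi";

// Open a shared library produced by `zig build` (path is an assumption).
const { symbols } = dlopen("./zig-out/lib/libllama.so", {
  // Hypothetical export: takes a null-terminated prompt, returns a C string.
  generate: { args: [FFIType.cstring], returns: FFIType.cstring },
});

// FFI cstring arguments are passed as pointers to null-terminated bytes.
const prompt = Buffer.from("Hello! I am AI\0", "utf8");
console.log(symbols.generate(ptr(prompt)).toString());
```

Because the FFI call runs the native llama.cpp code directly, the JavaScript layer adds only the cost of crossing the boundary, which is what makes "near-native speed" plausible.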