EugenHotaj/zig_gpt2
GPT-2 inference engine written in Zig. Generation time: ~28ms per token.
Download the GPT-2 checkpoint from OpenAI.
python3 download_weights.py
Build the Zig binary and run it with a prompt to generate completions:
zig build -Doptimize=ReleaseFast
./zig-out/bin/zig_gpt2 "Marcus Aurelius said"
Generate test data by forwarding random tensors through PyTorch ops.
python3 generate_test_data.py
Run the tests, which verify that the Zig ops produce the same output as PyTorch.
zig build test
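The idea behind this kind of check can be sketched in a few lines of Python. This is only an illustration and not the repo's actual test harness: the helper names (write_f32, read_f32, all_close), the raw little-endian float32 file layout, and the tolerances are all assumptions made for the example.

```python
import math
import struct

# Hypothetical helpers for illustration only; the real scripts may use a
# different file format and different tolerances.

def write_f32(path, values):
    """Write a flat list of floats as little-endian float32."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))

def read_f32(path):
    """Read a flat little-endian float32 file back into a list of floats."""
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{len(data) // 4}f", data))

def all_close(actual, expected, rel_tol=1e-5, abs_tol=1e-6):
    """True when both lists have equal length and match elementwise within tolerance."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )
```

A reference implementation would write its inputs and expected outputs with something like write_f32, and the implementation under test would load the same inputs and have its results compared with a tolerance check like all_close rather than exact equality, since floating-point results rarely match bit for bit across libraries.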
Implementation:
Efficiency:
softmax and gelu operations.
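For reference, the two ops named here can be written as a short pure-Python sketch. This is not the repo's Zig code. It assumes the tanh approximation of GELU, which is the variant the original GPT-2 uses, and a max-subtracted softmax for numerical stability; whether this repo makes the same choices is an assumption.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gelu(x):
    """GELU using the tanh approximation (assumed variant, see note above)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```

Both ops are elementwise or row-wise: gelu acts on each value independently and softmax on each row independently, so the work can be split across rows or chunks without coordination.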