HumanBit-100M
207.5 M params · Hard-Attention GQA · HB2.32 Quant · SwiGLU · RoPE θ=500k · Qwen2.5 Tokenizer
⚠️ Early checkpoint: the model is still training, and output should improve with each training run.
Device: CPU
Params: 207.5 M
Vocab: 151,643
CPU inference is slow; keep max tokens low.
Example prompts