Model Benchmarks →
The Eval Hub - Building JS engines in C, testing out SWE-Marathon (not mine), testing small models for train-on-test contamination, Fable's Refusals, and more to come.
Last chance to make stuff for humans. Let's have fun the whole time. Dog is here for morale.
English access to research papers from ChinaXiv and beyond — machine-translated at scale. Now joined by RussiaRxiv, soft-launched with the Russian corpus coming online.
Full-ruleset Blood Bowl engine in C with PufferLib self-play for RL. Training runs around the clock - watch it LIVE on Blood Bowl TV.
Ars Magica Fifth Edition character generator, live TTS, saga manager, NPC hub, rules reference, and more. The tools I always wished I had when pretending to be a wizard. Now in Beta.
Vibecoding an entire game with my son. It has NINJAS and MAGIC GREEN WOLVES and ZOMBIES, per the stakeholder's requirements. He's 4. Powered by Claude, Codex, Unity, Cartwheel, and Meshy.
A Story Grid-driven fiction harness — the model writes the prose, the machinery validates structure, tells, and detection. Read The Kept Watch, written end-to-end inside it.
the morale department is a 26lb terrier named Watson. he has no email.


