Into: Cacheon
Fastest correct inference takes all.
As of · Jun 4, 10:37 UTC
Subnet 14 just rebranded from TAOHash into a winner-take-all arena for the fastest LLM inference server.
What is Cacheon
Cacheon is a live competition that routes the subnet's emissions to whoever runs the fastest correct inference server for a single open-source model. Miners ship containerized servers, validators benchmark them on identical hardware, and the fastest server that still gets the right answers takes the full miner reward stream until somebody beats it.
The simple version: A continuous race for the fastest open-source LLM server. One model, one set of GPUs, and only the winner gets paid.
Centralized equivalent: MLPerf inference benchmarks, except the leaderboard is on-chain and the winner earns ongoing income instead of a one-off citation.
How it works:
- Miners package an inference server as a Docker image, push it to a public registry, and commit the image reference plus digest on-chain. The server has to serve `Qwen2.5-72B-Instruct` through an OpenAI-compatible chat completions endpoint.
- Validators pull each new image, run it on a 4x H200 GPU pod, and score it on time-to-first-token and throughput against a pinned vLLM baseline. A correctness pass with logprobs follows; failing it zeroes the score.
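The scoring flow described above can be sketched in a few lines. This is a minimal illustration, not the subnet's actual formula: the source only says that time-to-first-token and throughput are compared against a pinned vLLM baseline and that a failed correctness pass zeroes the score. The `Measurement` type, the equal weighting, the speedup-ratio formula, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Measurement:
    """Benchmark result for one server (hypothetical structure)."""
    ttft_s: float        # time to first token in seconds, lower is better
    tokens_per_s: float  # output throughput, higher is better


def speed_score(candidate: Measurement, baseline: Measurement,
                ttft_weight: float = 0.5) -> float:
    """Weighted mix of speedups over the baseline (assumed formula).

    A score of 1.0 means "as fast as the baseline" on both metrics.
    """
    ttft_speedup = baseline.ttft_s / candidate.ttft_s
    throughput_speedup = candidate.tokens_per_s / baseline.tokens_per_s
    return ttft_weight * ttft_speedup + (1.0 - ttft_weight) * throughput_speedup


def final_score(candidate: Measurement, baseline: Measurement,
                passed_correctness: bool) -> float:
    """Failing the correctness pass zeroes the score, as the text describes."""
    if not passed_correctness:
        return 0.0
    return speed_score(candidate, baseline)


def pick_winner(scores: Dict[str, float]) -> Optional[str]:
    """Winner-take-all: the single highest nonzero score takes everything."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.0 else None


# Illustrative numbers only, not measured results.
baseline = Measurement(ttft_s=0.2, tokens_per_s=1000.0)
scores = {
    "miner_a": final_score(Measurement(0.1, 1500.0), baseline, True),
    "miner_b": final_score(Measurement(0.05, 3000.0), baseline, False),
}
winner = pick_winner(scores)
```

In this example `miner_b` is faster on both metrics but fails the correctness pass, so `miner_a` wins. Speed alone is not enough to take the reward stream.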
Why This Matters
- The problem it solves: Inference cost dominates LLM serving budgets. Optimization techniques like FlashAttention, PagedAttention, speculative decoding, and custom CUDA kernels each get benchmarked in isolation, so apples-to-apples comparison across stacks is rare.
- The opportunity: A reproducible, permissionless leaderboard for inference speed creates a forcing function for optimization work that would otherwise sit behind closed research stacks.
- The Bittensor advantage: The prize is continuous emission rather than a one-time bounty, so the incentive to take and defend the throne keeps pulling new entries.
- Traction signals: The first testnet eval round completed in May 2026, mainnet competition is reportedly opening in mid-May per community announcements, and the alpha token is up roughly 28% over the past 7 days on positive 7-day .