Cacheon
SN14 | A Bitcoin mining pool owned by the community, where miners earn rewards for both BTC and Bittensor
Subnet 14 just rebranded from TAOHash into a winner-take-all arena for the fastest LLM inference server.
// Fastest correct inference takes all.
Cacheon is a live competition that routes the subnet's miner emissions to whoever runs the fastest correct inference server for a single open-source model. Miners ship containerized servers, validators benchmark them on identical hardware, and the fastest server that still gets the right answers takes the full miner reward stream until somebody beats it.
The simple version: A continuous race for the fastest open-source LLM server. One model, one set of GPUs, and only the winner gets paid.
Centralized equivalent: MLPerf inference benchmarks, except the leaderboard is on-chain and the winner earns ongoing income instead of a one-off citation.
How it works:
- Miners package an inference server as a Docker image, push it to a public registry, and commit the image reference plus digest on-chain. The server has to serve Qwen2.5-72B-Instruct through an OpenAI-compatible chat completions endpoint.
- Validators pull each new image, run it on a 4x H200 GPU pod, and score it on time-to-first-token and throughput against a pinned vLLM baseline. A correctness pass with logprobs follows; failing it zeroes the score.
- The problem it solves: Inference cost dominates LLM serving budgets. Optimization techniques like FlashAttention, PagedAttention, speculative decoding, and custom CUDA kernels each get benchmarked in isolation, so apples-to-apples comparison across stacks is rare.
- The opportunity: A reproducible, permissionless leaderboard for inference speed creates a forcing function for optimization work that would otherwise sit behind closed research stacks.
- The Bittensor advantage: The prize is continuous emission rather than a one-time bounty, so the incentive to take and defend the throne keeps pulling new entries.
- Traction signals: The first testnet eval round completed in May 2026, mainnet competition is reportedly opening in mid-May per community announcements, and the alpha token is up roughly 28% over the past 7 days on positive 7-day net flow.
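To make the two speed metrics concrete, here is a minimal sketch of how time-to-first-token and throughput could be derived from a token stream. It is not Cacheon's validator code: `measure_stream`, `StreamMetrics`, and the choice to compute throughput over the decode phase only (tokens after the first, divided by the time between first and last token) are assumptions made for illustration. In practice `chunks` would be the streamed chunks of a chat completions response.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class StreamMetrics:
    ttft_s: float        # seconds from request start to the first token
    tokens_per_s: float  # decode throughput measured after the first token
    n_tokens: int        # number of streamed chunks observed


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamMetrics:
    """Time a token stream (hypothetical helper, not the subnet's code)."""
    start = clock()
    first = None
    last = start
    n = 0
    for _ in chunks:
        now = clock()
        if first is None:
            first = now  # arrival of the first token defines TTFT
        last = now
        n += 1
    if first is None:
        raise ValueError("stream produced no tokens")
    decode_time = last - first
    # Assumed definition: tokens after the first, over the decode window.
    tps = (n - 1) / decode_time if n > 1 and decode_time > 0 else 0.0
    return StreamMetrics(ttft_s=first - start, tokens_per_s=tps, n_tokens=n)
```

Passing the clock in as a parameter keeps the timing logic testable with fixed timestamps, which matters when two servers are separated by a margin of about 1%.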
Category: Inference and Compute | Centralized Competitor: vLLM, TensorRT-LLM, SGLang
LLM inference is the largest line item in production ML budgets in 2026, and whoever can serve a given model fastest on a given hardware footprint captures the lion's share of margin. The optimization frontier is wide but fragmented, with most progress locked inside closed inference stacks at large labs. Cacheon's bet is that an open arena with continuous payout draws optimization work into the open and turns benchmarking into a live, reproducible contest.
Mechanism:
Miners build a containerized inference server, push it to a public registry, and commit the image reference and digest on-chain. Images are capped at 20 GB, with model weights mounted at runtime rather than baked in. Validators run two passes on the same 4x H200 pod: first a streaming pass that measures TTFT and throughput without logprobs, then a correctness pass with logprobs enabled and a first-mismatch greedy-decoding check. The score is 0.5 × TTFT improvement plus 0.5 × throughput improvement, both measured against the vLLM baseline, with a hard zero if correctness fails. The fastest correct server that scores above the current king becomes the new king. To take the crown, a challenger has to beat the king by a margin that starts near 1% and decays to 0 over roughly 7 days, which suppresses noise-driven churn.
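The scoring and dethroning rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the subnet's implementation: "improvement" is taken to mean fractional gain over the baseline, the margin is treated as relative to the king's score, and its decay is assumed to be linear, none of which the description above specifies. All function names are hypothetical, and the mismatch check compares token IDs only, whereas the real correctness pass also uses logprobs.

```python
from typing import Optional, Sequence


def relative_gain(baseline: float, candidate: float, lower_is_better: bool) -> float:
    """Fractional improvement over the baseline (assumed definition)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if lower_is_better:
        return (baseline - candidate) / baseline
    return (candidate - baseline) / baseline


def first_mismatch(reference: Sequence[int], candidate: Sequence[int]) -> Optional[int]:
    """Index of the first differing greedy token, or None if they match."""
    for i, (r, c) in enumerate(zip(reference, candidate)):
        if r != c:
            return i
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return None


def score(ttft_s: float, tokens_per_s: float,
          base_ttft_s: float, base_tokens_per_s: float, correct: bool) -> float:
    """0.5 x TTFT improvement + 0.5 x throughput improvement, zero if incorrect."""
    if not correct:
        return 0.0  # hard zero when the correctness pass fails
    ttft_gain = relative_gain(base_ttft_s, ttft_s, lower_is_better=True)
    tps_gain = relative_gain(base_tokens_per_s, tokens_per_s, lower_is_better=False)
    return 0.5 * ttft_gain + 0.5 * tps_gain


def required_margin(days_since_crowned: float,
                    start: float = 0.01, window_days: float = 7.0) -> float:
    """Margin a challenger must clear; linear decay is an assumption."""
    remaining = max(0.0, 1.0 - days_since_crowned / window_days)
    return start * remaining


def dethrones(challenger: float, king: float, days_since_crowned: float) -> bool:
    """True if the challenger beats the king by more than the current margin."""
    margin = required_margin(days_since_crowned) * abs(king)
    return challenger - king > margin
```

Under these assumptions, a server with 20% lower TTFT and 20% higher throughput than the baseline scores 0.2. A challenger at 0.2015 would not unseat a freshly crowned king at 0.2, but would once the margin has fully decayed.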
Economics and activity:
ξ trades around 0.01182 TAO with about 27,149 TAO of root in the AMM pool. The 30-day price change is +32% and the 7-day change is +28%, with a smoothed emission share of about 7% under Taoflow and a 7-day net inflow of roughly 3,286 TAO. Root proportion of 0.16 indicates pool depth is largely organic stake rather than protocol subsidy. Per the team's documentation, up to 28 TAO per day routes to the current king.
The public repository at latent-to/cacheon opened on April 13, 2026 and has 109 commits across 3 named contributors, with the most recent push on May 12, 2026. Listed team members include Xavier Lyu (research), Clément Blaise (infrastructure), Dera Okeke (frontend), and Cameron Fairchild (Bittensor core contributor). Subnet 14 was previously identified as TAOHash and has been re-identified as Cacheon under the same owner address, with the codebase, validator workload, and product surface now entirely different from the prior identity.
Risks:
- Pre-launch: Mainnet competition is reportedly opening in mid-May 2026. The first testnet eval round completed with both submissions failing startup or model-loading requirements, which is the kind of issue testnet is designed to surface, but means production readiness is not yet proven.
- Hardware entry cost: Miners need access to 4x H200 with NVLink or equivalent to test seriously, which narrows the competitive pool relative to subnets with lighter hardware requirements.
- Winner-take-all dynamics: The king collects the entire miner emission stream until dethroned. Strong incentive to lead, weak incentive to ship a near-miss, and effectively no recurring payout for second place.
- Stake concentration: The Nakamoto coefficient on the alpha token is 4, meaning a small coalition can move stake-weighted outcomes. This is structural for an early-stage subnet but worth tracking as the market matures.
- Pivot execution: Subnet 14's identity moved from TAOHash to Cacheon under the same owner. Execution now depends on the team shipping V1 cleanly on a repository opened in April 2026, against a roadmap that already promises speculative decoding, latency-SLA workloads, and production endpoint routing in later versions.
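For readers unfamiliar with the stake concentration metric above, here is a rough sketch of how a Nakamoto coefficient is commonly computed: the smallest number of holders whose combined stake exceeds a control threshold. The 50% threshold and the stake figures in the usage note are assumptions for illustration; the source does not state which threshold or data set produced the value of 4.

```python
from typing import Iterable


def nakamoto_coefficient(stakes: Iterable[float], threshold: float = 0.5) -> int:
    """Smallest number of top holders whose stake exceeds `threshold` of the total.

    The 0.5 default is an assumed convention, not a figure from the subnet.
    """
    holdings = sorted((s for s in stakes if s > 0), reverse=True)
    total = sum(holdings)
    if total <= 0:
        raise ValueError("no positive stake")
    running = 0.0
    for count, stake in enumerate(holdings, start=1):
        running += stake
        if running > threshold * total:
            return count
    return len(holdings)
```

With a hypothetical distribution of `[18, 14, 11, 9, 8, 8, 8, 8, 8, 8]`, the four largest holders control 52% of the stake, so the function returns 4.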
Into the next one.