Into:Chutes
Serverless AI. Just call the API.
As of · Jun 4, 10:37 UTC
The AWS Lambda of Bittensor. Pick any AI model, call an API endpoint, and Chutes handles the GPUs, load balancing, scaling, and infrastructure. 46+ models available, OpenAI-compatible API, used by multiple other as their inference backbone.
What is Chutes
Chutes is a serverless AI compute platform. Developers choose an AI model (or bring their own code), call an API endpoint, and Chutes handles everything: finding available GPUs, routing requests, scaling up or down based on demand, and managing the entire infrastructure. No server setup. No GPU procurement. No DevOps.
The simple version: Imagine a power outlet for AI. You plug in what you want to run, and the electricity (compute) just flows. You don't think about power plants, transmission lines, or transformers. Chutes is that outlet for AI models.
Centralized equivalent: Think AWS Lambda or Google Cloud Functions, but specifically for AI inference, with GPUs instead of CPUs, and powered by a decentralized network instead of corporate data centers.
How it works:
- Miners run AI models ("chutes") on their GPUs and decide which models to keep "hot" in memory for fast response times. They compete for bounties by being the first to launch cold models. Rewards are based on compute units (55%), invocations (25%), unique chutes hosted (15%), and bounty completions (5%).
- verify miner activity through digital fingerprints and activity reports. The main validator (operated by the Chutes team with approximately 16 H200 GPUs) coordinates the platform, while other validators audit reward calculations.
Why This Matters
Other research from the same neighborhood of the network.