# Into: Aurelius

Aurelius turns ethical dilemmas into training data. Miners write moral-conflict scenarios, validators run them through Google DeepMind's Concordia agent simulations, and the resulting transcripts are scored and used to teach language models how to reason about right and wrong.

// Simulating moral dilemmas to train AI.

---

> New to Bittensor? Start here. Experienced users can skip to the full analysis.

### What is Aurelius?

Aurelius (subnet 37) is a Bittensor subnet built around AI alignment: the problem of getting AI systems to behave in line with human values. Rather than testing a finished model, it produces the raw material for studying moral reasoning. Contributors author structured ethical-dilemma scenarios, simulated characters play them out, and the transcripts are scored and turned into training data.

**The simple version:** It is like a writers' room for moral dilemmas. People script hard ethical situations (who gets the last dose of medicine, duty versus desire), simulated agents argue their way through them, and the resulting transcripts become study material for teaching AI to handle similar questions.

**Centralized equivalent:** The closest comparisons are the in-house alignment and red-team programs at labs like Anthropic, OpenAI, and Google DeepMind. There is no consumer-product equivalent. This is research infrastructure, not an app.

**How it works:**
- **Miners** serve a library of hand-authored scenario configs. Each one combines a premise, two characters with goals and assigned ethical philosophies (deontology, care ethics, and so on), and an optional forced choice. The miner returns configs to validators on request, in round-robin order.
- **Validators** put each scenario through an eight-stage check, run the accepted ones through a Concordia simulation, score the transcript, and set on-chain weights from the result.

---

### Why This Matters

- **The problem it solves:** AI systems increasingly make or shape decisions with ethical stakes, and there is no settled way to measure or improve how they reason through them. Aurelius is built to produce structured data aimed at exactly that gap, targeting moral-reasoning benchmarks (the README names MoReBench).
- **The opportunity:** Alignment data is scarce and expensive to author by hand. A network that crowd-sources dilemma design across many independent authors can cover more moral and cultural ground than a single in-house team.
- **The Bittensor advantage:** Diverse contributors produce a wider spread of scenarios than a homogeneous lab would, and the scoring pipeline plus a work-token cost is meant to reward quality over raw volume.
- **Traction signals:** Modest so far. The codebase is actively maintained, with a published Docker path for running validators and miners and a live testnet on subnet 455. At snapshot time the subnet is in a burn-mode state with no active miners and no emission share on-chain (detailed below), and public discussion of it is thin.

---

## Full Analysis

**Category:** AI Alignment and Safety (Other) | **Centralized Competitor:** Anthropic alignment research, OpenAI safety, Google DeepMind

Most subnets optimize for something measurable and saleable: faster inference, cheaper compute, better predictions. Aurelius is unusual in optimizing for understanding. Its question is not "how fast" but "how well does a model reason about a moral conflict," and its output is data rather than a service.

**Mechanism:**

The work format is concrete and documented in the subnet's main repo. A miner is a Bittensor axon that serves a directory of scenario JSON files loaded at startup. When a validator queries it with a `ScenarioConfigSynapse`, the miner returns the next config in its library, signed by its hotkey and stamped with a `work_id`. Each config follows a fixed schema: a 200 to 2000 character premise, exactly two agents with identities and goals, a tension archetype such as justice versus mercy, and one or more scenes with an optional forced choice. Miners do not generate scenarios on demand; they serve a pre-authored library.
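To make the schema and the serving model concrete, here is a minimal sketch in Python. The field names (`premise`, `agents`, `identity`, `goal`, `tension_archetype`, `scenes`) and the `ScenarioLibrary` class are assumptions made for illustration, not the repo's actual identifiers. Only the constraints themselves (premise length, two agents, an archetype, at least one scene, round-robin serving) come from the description above.

```python
from itertools import cycle

PREMISE_MIN, PREMISE_MAX = 200, 2000  # character bounds stated in the text


def validate_scenario(config: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the config passes.

    Field names are hypothetical; the real schema lives in the subnet repo.
    """
    problems = []
    premise = config.get("premise", "")
    if not PREMISE_MIN <= len(premise) <= PREMISE_MAX:
        problems.append("premise must be 200 to 2000 characters")
    agents = config.get("agents", [])
    if len(agents) != 2:
        problems.append("exactly two agents are required")
    for agent in agents:
        if not agent.get("identity") or not agent.get("goal"):
            problems.append("each agent needs an identity and a goal")
            break
    if not config.get("tension_archetype"):
        problems.append("a tension archetype is required")
    if len(config.get("scenes", [])) < 1:
        problems.append("at least one scene is required")
    return problems


class ScenarioLibrary:
    """Serves pre-authored configs in round-robin order (no generation)."""

    def __init__(self, configs: list[dict]):
        if not configs:
            raise ValueError("library is empty")
        self._next = cycle(configs)

    def next_config(self) -> dict:
        # Each call returns the next config, wrapping around at the end.
        return next(self._next)
```

A miner in this sketch would load its JSON files once at startup, keep only those for which `validate_scenario` returns an empty list, and answer each validator query with `next_config()`.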

The validator runs an eight-stage pipeline: version check, schema validation, a work-token balance check, per-hotkey rate limiting, a novelty check (using FAISS), a classifier quality gate, the Concordia simulation, and finally a work-token deduction plus an on-chain weight update. The pipeline short-circuits on the first failure, so the work token is spent only in the final stage, after the seven stages before it have passed. Concordia is Google DeepMind's open-source generative-agent simulation framework. Each simulation runs in an ephemeral, resource-capped container with its model egress firewalled to an allowlist; the default model is DeepSeek's `deepseek-chat`. The transcript the simulation produces is parsed and scored for coherence, and that score determines the miner's weight.
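The short-circuit behavior can be sketched as an ordered list of stage functions that stops at the first failure. This is an illustration under stated assumptions, not the validator's code: the stage names paraphrase the list above, every check except the balance check and the final deduction is a stub that always passes, and `run_pipeline`, `StageResult`, and the `balance` key are hypothetical.

```python
from typing import Callable, NamedTuple


class StageResult(NamedTuple):
    ok: bool
    reason: str = ""


Stage = tuple[str, Callable[[dict], StageResult]]


def run_pipeline(submission: dict, stages: list[Stage]) -> tuple[bool, str]:
    """Run stages in order; return (accepted, name of the failed stage or "")."""
    for name, check in stages:
        if not check(submission).ok:
            return False, name  # short-circuit: later stages never run
    return True, ""


def passing(_: dict) -> StageResult:
    # Stub standing in for checks this sketch does not model.
    return StageResult(True)


def token_balance_check(submission: dict) -> StageResult:
    if submission.get("balance", 0) < 1:
        return StageResult(False, "insufficient work-token balance")
    return StageResult(True)


def deduct_and_set_weight(submission: dict) -> StageResult:
    # The token is spent only here, after every earlier stage has passed.
    submission["balance"] -= 1
    return StageResult(True)


STAGES: list[Stage] = [
    ("version_check", passing),
    ("schema_validation", passing),
    ("work_token_balance", token_balance_check),
    ("rate_limit", passing),
    ("novelty_check", passing),
    ("classifier_gate", passing),
    ("concordia_simulation", passing),
    ("deduct_and_set_weight", deduct_and_set_weight),
]
```

Because the deduction sits in the last stage, a submission rejected anywhere earlier leaves the balance untouched, which matches the behavior described above.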

The economics run on work tokens. Each scenario a validator accepts for simulation costs one work token, drawn from a balance the miner has to fund beforehand by depositing TAO to a Central API multisig address. A miner whose balance is zero gets its submissions rejected at stage three before any simulation runs. In plain terms: serving scenarios is not free, which is the design's filter against spam.
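A toy ledger shows how that accounting could behave. It is a sketch that assumes balances are tracked per miner hotkey and that one token is charged per accepted scenario. The `WorkTokenLedger` class and its method names are hypothetical, and the TAO-to-token conversion is left out because the source does not state a rate.

```python
class WorkTokenLedger:
    """Tracks per-miner work-token balances, keyed by hotkey (illustrative)."""

    def __init__(self) -> None:
        self._balances: dict[str, int] = {}

    def credit(self, hotkey: str, tokens: int) -> None:
        # Called after a deposit has been recognized; the rate is not modeled.
        if tokens <= 0:
            raise ValueError("credit must be positive")
        self._balances[hotkey] = self.balance(hotkey) + tokens

    def balance(self, hotkey: str) -> int:
        return self._balances.get(hotkey, 0)

    def can_submit(self, hotkey: str) -> bool:
        # Mirrors the stage-three check: a zero balance is rejected up front.
        return self.balance(hotkey) >= 1

    def charge(self, hotkey: str) -> None:
        # One token per scenario accepted for simulation.
        if not self.can_submit(hotkey):
            raise RuntimeError("no work tokens left")
        self._balances[hotkey] -= 1
```

In this sketch an unfunded miner fails `can_submit` immediately, so a flood of submissions from empty accounts costs the validator a dictionary lookup rather than a simulation run.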

On current state, the on-chain readings and the repo agree. The most recent commit (dated 2026-05-10, via a live GitHub check) flips the validator's burn-mode default on and sets the burn percentage to 1.0. On-chain, TaoSwap reports miner emission burn at 100 percent, zero active miners, and an emission share of 0 percent at snapshot. The README is explicit that a miner earning nothing has no reason to run, so a 100 percent burn and zero active miners are consistent rather than contradictory. The team has not published a reason for the burn-mode default, so we describe the state and stop short of guessing why.

Development is the work of one primary contributor, Volker Einsfeld, across roughly 50 commits since the repo was created in November 2025, with the most recent push on 2026-05-10 (within the last six weeks at the time of writing). The repo is in Python, public, and carries a working quickstart, a testnet target, and CI that rebuilds the published images on each push to main.

Market readings are small and are snapshots, not promises. Price sits around 0.00354 TAO, market cap around 18,950 TAO, with about 8,200 TAO of root liquidity in the pool. Seven-day net flow is slightly positive at roughly 53 TAO, and root proportion is about 0.15. Price is down about 8 percent over thirty days and up about 2 percent over seven.

---

### Risk Factors

- **No active incentive loop at snapshot:** The validator's default burns 100 percent of miner emissions, and on-chain there are no active miners and a 0 percent emission share. Until that changes, the mine-and-reward loop that drives most subnets is not running here.
- **Deregistration:** With an emission share of 0 percent and its four-month immunity period long expired, Aurelius is exposed to Bittensor's automatic deregistration, which removes the non-immune subnet with the lowest EMA price when a new one registers.
- **Single primary developer:** Development is driven almost entirely by one contributor. That concentrates both the knowledge and the execution risk.
- **Niche demand path:** Alignment data is genuinely valuable, but the route from "produces moral-reasoning transcripts" to durable on-chain demand is unproven, and the work-token deposit model asks miners to fund participation up front.
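The deregistration rule named in the list above reduces to a simple selection, sketched here in Python. The `Subnet` fields and the `pick_for_deregistration` helper are hypothetical names, and the sketch models only the rule as stated (lowest EMA price among non-immune subnets), not any tie-breaking or other conditions the chain may apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subnet:
    netuid: int
    ema_price: float  # EMA of the subnet token price, in TAO
    immune: bool      # True while still inside the immunity period


def pick_for_deregistration(subnets: list[Subnet]) -> Optional[Subnet]:
    """Return the non-immune subnet with the lowest EMA price, if any."""
    candidates = [s for s in subnets if not s.immune]
    return min(candidates, key=lambda s: s.ema_price, default=None)
```

Under this rule an immune subnet is never selected, however low its price, which is why the expiry of the immunity period matters for Aurelius.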

---

Another subnet, unpacked.
