Research · Subnet 011 · 4 min read

Into: TrajectoryRL

Making AI agents cheaper, safer, and smarter.

By vaNlabs Research · March 30, 2026

Price: τ0.00957

Market cap: 38.9k τ

Momentum: 53/100

Unique holders: 2.03k

Emission: 0.00%

Net flow (7d): -299.8 τ

As of Jun 4, 10:37 UTC

A subnet that optimizes how AI agents work. Miners compete to make agents cheaper, safer, and more reliable by optimizing their decision-making policies. One demonstration cut agent operating costs from $12,300/month to $900/month, a 93% reduction, through trajectory optimization alone.

What is TrajectoryRL

TrajectoryRL is a subnet where miners compete to optimize AI agent policies. When an AI agent performs a task (browsing the web, writing code, managing files), it makes a series of decisions called a "trajectory." TrajectoryRL rewards miners who find better trajectories: sequences of decisions that are faster, cheaper, and more reliable.

The simple version: Imagine an AI assistant that takes 20 steps to book a flight, costing $5 in API calls. A TrajectoryRL miner figures out how to do it in 6 steps for $0.30. The miner who finds the most efficient path wins.
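To make the arithmetic concrete, here is a minimal Python sketch that computes the savings for the two examples quoted in this article: the flight booking ($5 down to $0.30) and the monthly demonstration ($12,300 down to $900). The function name `savings_fraction` is illustrative only and is not part of TrajectoryRL.

```python
def savings_fraction(before_usd: float, after_usd: float) -> float:
    """Return the fraction of the baseline cost removed by the optimized run."""
    if before_usd <= 0:
        raise ValueError("baseline cost must be positive")
    return (before_usd - after_usd) / before_usd


# Figures taken from the article; nothing here is measured data.
flight_booking = savings_fraction(5.00, 0.30)
monthly_demo = savings_fraction(12_300, 900)

print(f"flight booking: {flight_booking:.0%} cheaper, {20 - 6} fewer steps")
print(f"monthly demo:   {monthly_demo:.0%} cheaper")
```

The monthly figure works out to roughly 92.7%, which rounds to the 93% quoted above.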

Centralized equivalent: Think of it as automated consulting for AI operations. Companies like McKinsey optimize business processes; TrajectoryRL optimizes AI agent processes, but through competitive benchmarking rather than billable hours.

How it works:

Miners upload "policy packs" containing optimized agent configurations (prompt engineering, multi-LLM routing, skill injection) to any public HTTP endpoint and commit metadata on-chain. No server required, no uptime needed.
Validators evaluate policy packs using ClawBench, a deterministic scenario suite with fixed fixtures. Evaluation has two phases: Phase 1 checks pack integrity, and Phase 2 scores trajectory quality using an LLM as judge against natural-language criteria. Rewards are winner-take-all, with a first-mover advantage and NCD similarity detection to prevent copying.
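The copy-detection and winner-take-all rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the subnet's actual code. It assumes "NCD" means normalized compression distance and uses `zlib` as the compressor. The `Submission` fields, the `copy_threshold` of 0.3, and the rule that an earlier commit beats a later near-copy are all hypothetical. The scores are placeholders for whatever Phase 2 produces, and the random bytes stand in for real pack contents.

```python
import random
import zlib
from dataclasses import dataclass


def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: near 0 for near-copies, near 1 for unrelated data."""
    ca = len(zlib.compress(a))
    cb = len(zlib.compress(b))
    cab = len(zlib.compress(a + b))
    return (cab - min(ca, cb)) / max(ca, cb)


@dataclass(frozen=True)
class Submission:
    miner: str
    committed_at: int  # hypothetical: block height of the on-chain commitment
    pack: bytes        # raw policy pack contents
    score: float       # placeholder for the Phase 2 quality score


def pick_winner(subs, copy_threshold=0.3):
    """Assumed rule: drop packs too similar to an earlier one, then take the top score."""
    accepted = []
    for sub in sorted(subs, key=lambda s: s.committed_at):
        if any(ncd(sub.pack, prev.pack) < copy_threshold for prev in accepted):
            continue  # treated as a copy of an earlier commitment
        accepted.append(sub)
    # max() keeps the first maximum, so the earliest commit wins score ties.
    return max(accepted, key=lambda s: s.score) if accepted else None


rng = random.Random(0)  # fixed seed so the demo is repeatable
original = bytes(rng.getrandbits(8) for _ in range(2048))
unrelated = bytes(rng.getrandbits(8) for _ in range(2048))
near_copy = original + b"# trivially edited"

subs = [
    Submission("alice", 100, original, 0.80),
    Submission("bob", 105, near_copy, 0.85),   # later near-copy with a higher score
    Submission("carol", 110, unrelated, 0.70),
]
winner = pick_winner(subs)
print(winner.miner)
```

In this sketch, bob's pack is rejected because it compresses almost entirely against alice's earlier commitment, so alice wins despite bob's slightly higher score. A real validator would need a threshold tuned to actual pack contents, since ordinary text compresses very differently from the random stand-in bytes used here.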

Why This Matters
