Research · Data · 5 min read

Into: Data Universe

Web-scale scraping, queryable by anyone.

By vaNlabs Research · May 31, 2026

Price: τ0.00737

Market cap: 35.5k τ

Momentum: 55 / 100

Unique holders: 2.55k

Emission: 0.00%

Net flow (7d): -38.0 τ

As of Jun 4, 10:37 UTC

230 miners scrape and store social media data across X, Reddit, and YouTube. Validators verify what they collect through random sampling and credibility scoring. The output is what Macrocosmos calls the world's largest open-source social media dataset, queryable through a no-code product called Gravity.

What is Data Universe

Data Universe (SN13) is Macrocosmos's data-scraping subnet on Bittensor. Miners continuously pull fresh posts from X and Reddit, plus transcripts from YouTube, and store them in a structured format. Validators check sample slices to confirm the data is real, fresh, and not just a duplicate of what every other miner is holding.

The simple version: Imagine a decentralized version of a web-scraping firm where hundreds of independent operators race to collect the freshest, most diverse social media content, and only get paid for data nobody else has and that the market actually asked for.

Centralized equivalent: Think Bright Data, Apify, or Common Crawl, but with miner incentives tied to demand from a no-code product (Gravity) instead of one-off enterprise contracts.

How it works:

Miners scrape DataEntities from X, Reddit, and YouTube, organize them into time and label buckets, and report a MinerIndex to validators. They also upload to S3-compatible storage for public dataset access.
Validators pull each miner's MinerIndex, randomly sample data for correctness, and score miners on data value (freshness, desirability, scarcity) multiplied by credibility raised to the 2.5 power. Weights get set roughly every 20 minutes.
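The effect of the credibility exponent can be illustrated with a minimal sketch. It assumes credibility is a number between 0 and 1 and that the final score is simply data value times credibility to the 2.5 power, as described above. The function name `miner_score` and the example numbers are hypothetical, not taken from the subnet's code.

```python
CREDIBILITY_EXPONENT = 2.5  # exponent stated in the article


def miner_score(data_value: float, credibility: float) -> float:
    """Scale raw data value by credibility ** 2.5.

    Assumption: credibility lies in [0, 1]; the real validator may
    normalize or clamp it differently.
    """
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility is assumed to lie in [0, 1]")
    return data_value * credibility ** CREDIBILITY_EXPONENT


# An honest miner with a small store and full credibility...
honest_small = miner_score(data_value=100.0, credibility=1.0)
# ...versus a miner claiming twice the value at half the credibility.
inflated_large = miner_score(data_value=200.0, credibility=0.5)

print(honest_small, round(inflated_large, 2))
```

Under these assumptions, halving credibility cuts the score to less than a fifth, so doubling claimed value does not make up for failed checks.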


Full Analysis

Category: Data Scraping and Archival | Centralized Competitors: Bright Data, Apify, ScraperAPI, Common Crawl

Macrocosmos is one of the larger operators on Bittensor, running five subnets and treating Data Universe as the raw-data tier of a broader stack. Gravity sits on top as a no-code query tool, Nebula visualizes the data on a 3D plane with sentiment overlays, and Mission Commander is an agentic chatbot (built on SN1, Apex) that helps users phrase scraping jobs. There is also an MCP integration that wires the dataset into Claude and Cursor.

Mechanism:

Each miner scrapes from the supported DataSources and groups entries into DataEntityBuckets keyed by source, time bucket, and a DataLabel (like a stock ticker or subreddit). The full set is the miner's MinerIndex. Validators periodically request that index and store a local copy, then sample data to verify it actually exists at the original source and matches what was claimed. A separate S3 storage validation runs roughly every six hours and checks for duplicates, job-match alignment, and scraper-verifiable authenticity.
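The bucketing described above can be sketched as a small data structure. This is an illustration only: the class names `BucketKey` and `Entity`, the hourly bucket width, and the idea of summarizing the index as bytes per bucket are assumptions made for the example, not the subnet's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BucketKey:
    source: str       # e.g. "x", "reddit", "youtube"
    time_bucket: int  # assumed: whole hours since the Unix epoch
    label: str        # e.g. a subreddit or a stock ticker


@dataclass
class Entity:
    uri: str
    source: str
    label: str
    created_at: datetime  # timezone-aware
    size_bytes: int


def bucket_key(entity: Entity) -> BucketKey:
    """Map an entity to its (source, time bucket, label) key."""
    hours = int(entity.created_at.timestamp() // 3600)
    return BucketKey(entity.source, hours, entity.label)


def build_index(entities: list[Entity]) -> dict[BucketKey, int]:
    """Summarize a store as bytes held per bucket (stand-in for a MinerIndex)."""
    index: dict[BucketKey, int] = defaultdict(int)
    for entity in entities:
        index[bucket_key(entity)] += entity.size_bytes
    return dict(index)


utc = timezone.utc
store = [
    Entity("reddit/a", "reddit", "r/bittensor", datetime(2026, 5, 31, 10, 5, tzinfo=utc), 400),
    Entity("reddit/b", "reddit", "r/bittensor", datetime(2026, 5, 31, 10, 40, tzinfo=utc), 600),
    Entity("reddit/c", "reddit", "r/bittensor", datetime(2026, 5, 31, 11, 10, tzinfo=utc), 250),
]
index = build_index(store)
```

Here the first two posts fall into the same hourly bucket and the third into the next one, so the index has two entries. A validator working from such a summary can pick a bucket at random and ask the miner for its contents to verify against the original source.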

Data value is not flat. Fresh data is worth more: value decays linearly over 30 days and is zero after that. Data that matches active Gravity user requests gets up to a 5x multiplier, while unspecified labels score at 30% of baseline. Data that many miners already hold is worth less per unit. Credibility, tracked as an exponential moving average of validation outcomes, is then applied as a multiplier raised to the 2.5 power, which makes misrepresentation strictly worse than honest reporting of smaller stores. The team publishes a live dashboard at sn13-dashboard.api.macrocosmos.ai showing what the network currently holds.
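A rough sketch of how these value factors might combine is shown here. The 30-day linear decay, the 5x cap, and the 0.3 weight come from the text; multiplying the factors together and dividing by the number of holders to model scarcity are assumptions for illustration, and `bucket_value` is a hypothetical name.

```python
MAX_AGE_DAYS = 30.0             # value reaches zero at 30 days (from the article)
UNSPECIFIED_LABEL_WEIGHT = 0.3  # labels nobody requested (from the article)
MAX_DEMAND_MULTIPLIER = 5.0     # cap for labels matching Gravity requests


def freshness(age_days: float) -> float:
    """Linear decay from 1.0 at age 0 to 0.0 at 30 days and beyond."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return max(0.0, 1.0 - age_days / MAX_AGE_DAYS)


def bucket_value(size_bytes: int, age_days: float,
                 label_weight: float, holders: int) -> float:
    """Combine size, freshness, desirability, and scarcity.

    Assumption: scarcity is modeled by splitting value evenly across
    all miners holding the same data; the real rule may differ.
    """
    weight = min(label_weight, MAX_DEMAND_MULTIPLIER)
    return size_bytes * freshness(age_days) * weight / max(holders, 1)


# 15-day-old data for a requested label that only one miner holds.
in_demand = bucket_value(1000, 15, MAX_DEMAND_MULTIPLIER, holders=1)
# The same amount of unrequested data that ten miners already hold.
commodity = bucket_value(1000, 15, UNSPECIFIED_LABEL_WEIGHT, holders=10)
# Anything older than 30 days is worth nothing.
stale = bucket_value(1000, 31, MAX_DEMAND_MULTIPLIER, holders=1)
```

With these illustrative rules, the requested, uniquely held bucket is worth over a hundred times the widely duplicated one, which is the incentive the paragraph describes: scrape what users asked for and what nobody else has.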

On the market side, the alpha token trades around 0.00734 TAO, with a market cap of roughly 38,392 TAO and 22,640 TAO of depth in the pool. The 30-day price is down about 8%, and the 7-day net flow is negative at -213 TAO. Under the November 2025 emissions model, subnets with negative net staking flows receive no share of network emissions, and SN13's current emission share is 0%. The product surface is live, and miners and validators continue to operate; emission share will follow when staking flows turn back positive.
