Data Methodology

How we source, compute, and classify every metric on the observatory

Data Taxonomy

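The observatory groups embedded data into two disjoint, byte-counted categories based on where the data lives in a transaction. Stamps are tracked separately, as a count only. The byte-counted categories sum as follows: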

Total Embedded Data = Witness Data + OP_RETURN Data

Witness Data (segregated witness fields)
  Inscriptions (Ordinals)
    BRC-20 (subset of Inscriptions)

OP_RETURN Data (nulldata script outputs)
  = Runes + Omni Layer + Counterparty + Other

Stamps (bare multisig outputs with fake pubkeys)
  count only, no byte tracking

Inscriptions and OP_RETURN are guaranteed disjoint (different parts of the transaction). BRC-20 is always a subset of Inscriptions. Runes, Omni, Counterparty, and Other are mutually exclusive subsets of OP_RETURN.

What Each Count Means

Inscriptions (witness envelopes)

Number of standard Ordinals inscription envelopes found in witness data. One transaction can contain multiple inscriptions.

BRC-20 (inscription envelopes)

Subset of inscriptions whose body contains the BRC-20 JSON marker. Always less than or equal to inscription count.

OP_RETURN outputs (transaction outputs)

Outputs whose scriptPubKey begins with OP_RETURN (0x6a). Coinbase SegWit commitment outputs are excluded.

Runes (OP_RETURN outputs)

OP_RETURN outputs matching the Runes protocol signature: an OP_RETURN OP_13 prefix, or a script of 6 bytes or fewer, in both cases at block 840,000 or later.

Omni Layer (OP_RETURN outputs)

OP_RETURN outputs containing the 'omni' ASCII marker (hex: 6f6d6e69) in their payload.

Counterparty (OP_RETURN outputs)

OP_RETURN outputs containing the 'CNTRPRTY' ASCII marker (hex: 434e545250525459).

Stamps (multisig outputs)

Bare multisig outputs containing at least one fake public key (33-byte key not starting with 0x02 or 0x03). Count only.

Byte Accounting

We track two byte measures for inscriptions and one for OP_RETURN:

inscription_bytes (payload)

Content body only. Envelope overhead (OP_FALSE, OP_IF, 'ord' marker, content-type section, separator, OP_ENDIF) is subtracted. Typically 8-20 bytes of overhead removed per inscription.

inscription_envelope_bytes (full)

Full witness item bytes including the complete envelope structure and payload. Comparable to OP_RETURN script bytes.

OP_RETURN bytes (script)

Full scriptPubKey byte length: includes the OP_RETURN opcode, push opcodes, and payload data. This is the on-chain serialized size.

For comparable cross-category analysis, use inscription_envelope_bytes alongside OP_RETURN bytes (both represent full on-chain serialized footprint). Use inscription_bytes (payload) when analyzing content size alone.

Detection Heuristics

Inscriptions (High)

Byte pattern 0063036f7264 (hex) in witness data: OP_FALSE, OP_IF, then a 3-byte push of 'ord'.

Standard Ordinals envelopes only. Cursed, non-standard, and malformed inscriptions are not detected.

BRC-20 (Medium)

Substring match for hex 7b2270223a226272632d3230 inside an inscription body.

Matches the JSON fragment {"p":"brc-20. No JSON validation is performed, so malformed BRC-20 payloads are included.
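The two witness heuristics above can be sketched as plain byte searches. This is a minimal illustration, not the observatory's actual code: the function names are hypothetical, and it assumes witness items arrive as hex strings (as in Bitcoin Core's `txinwitness` field) and that the BRC-20 marker is searched for anywhere after the envelope marker.

```python
# Byte patterns taken from the heuristics described above.
ORD_ENVELOPE = bytes.fromhex("0063036f7264")              # OP_FALSE OP_IF <push 3> 'ord'
BRC20_MARKER = bytes.fromhex("7b2270223a226272632d3230")  # {"p":"brc-20


def count_inscriptions(witness_items_hex):
    """Count standard envelope markers across hex-encoded witness items."""
    return sum(bytes.fromhex(item).count(ORD_ENVELOPE) for item in witness_items_hex)


def looks_like_brc20(witness_item_hex):
    """True if an envelope marker is followed somewhere by the BRC-20 JSON fragment.

    No JSON validation is attempted, mirroring the substring-only heuristic.
    """
    item = bytes.fromhex(witness_item_hex)
    start = item.find(ORD_ENVELOPE)
    if start == -1:
        return False
    return BRC20_MARKER in item[start + len(ORD_ENVELOPE):]
```

Decoding to bytes before searching avoids false matches at odd nibble offsets, which a raw hex-string search could produce.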

Runes (High)

OP_RETURN followed by OP_13 (0x5d), or any OP_RETURN script of 6 bytes or fewer, in both cases at block height >= 840,000.

Matches the protocol specification. Before block 840,000 the same pattern is classified as generic OP_RETURN.

Omni Layer (Medium)

OP_RETURN payload contains ASCII 'omni' (hex: 6f6d6e69).

OP_RETURN (Class C) encoding only. Historical Class A (fake address) and Class B (bare multisig) encodings are not detected.

Counterparty (Medium)

OP_RETURN payload contains ASCII 'CNTRPRTY' (hex: 434e545250525459).

OP_RETURN encoding only. Historical multisig and pubkeyhash encodings are not detected.
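The OP_RETURN rules for Runes, Omni Layer, and Counterparty can be combined into a single classifier, sketched below. The function name and the order in which the rules are tried (Runes first, then Omni, then Counterparty) are assumptions; the text above only states that the categories are mutually exclusive.

```python
RUNES_ACTIVATION_HEIGHT = 840_000
OMNI_MARKER = bytes.fromhex("6f6d6e69")                  # ASCII 'omni'
COUNTERPARTY_MARKER = bytes.fromhex("434e545250525459")  # ASCII 'CNTRPRTY'


def classify_op_return(script_hex, height):
    """Return 'runes', 'omni', 'counterparty', 'other', or None if not OP_RETURN."""
    script = bytes.fromhex(script_hex)
    if not script or script[0] != 0x6A:  # scriptPubKey must begin with OP_RETURN
        return None
    if height >= RUNES_ACTIVATION_HEIGHT:
        # OP_13 (0x5d) right after OP_RETURN, or a tiny script of <= 6 bytes.
        if script[1:2] == b"\x5d" or len(script) <= 6:
            return "runes"
    payload = script[1:]
    if OMNI_MARKER in payload:
        return "omni"
    if COUNTERPARTY_MARKER in payload:
        return "counterparty"
    return "other"
```

Because each output gets exactly one label, summing the four categories reproduces the OP_RETURN total.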

Stamps (Medium)

Bare multisig output with at least one 33-byte key not starting with 0x02/0x03.

Fake pubkey detection. May miss some encoding variants.
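A rough sketch of the fake-pubkey check follows. The script-shape parser here is an assumption (it accepts only OP_m, direct 33- or 65-byte pushes, OP_n, OP_CHECKMULTISIG), and the function names are hypothetical; the observatory may instead rely on Bitcoin Core's own 'multisig' script type.

```python
def multisig_pubkeys(script_hex):
    """Return the pushed keys of a bare multisig script, or None if the shape differs."""
    script = bytes.fromhex(script_hex)
    if len(script) < 3 or script[-1] != 0xAE:  # must end with OP_CHECKMULTISIG
        return None
    # First and second-to-last bytes must be OP_1..OP_16 (m and n).
    if not (0x51 <= script[0] <= 0x60 and 0x51 <= script[-2] <= 0x60):
        return None
    keys, pos, end = [], 1, len(script) - 2
    while pos < end:
        size = script[pos]
        if size not in (33, 65) or pos + 1 + size > end:
            return None
        keys.append(script[pos + 1 : pos + 1 + size])
        pos += 1 + size
    if len(keys) != script[-2] - 0x50:  # key count must match n
        return None
    return keys


def has_fake_pubkey(script_hex):
    """True if any 33-byte key does not start with 0x02 or 0x03."""
    keys = multisig_pubkeys(script_hex)
    if keys is None:
        return False
    return any(len(k) == 33 and k[0] not in (0x02, 0x03) for k in keys)
```

Only the prefix byte is inspected, so a data chunk that happens to start with 0x02 or 0x03 would go unnoticed, which is consistent with the note that some encoding variants may be missed.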

Known Exclusions

The following data embedding techniques are not currently detected:

  • Historical Omni Layer encodings using fake addresses (Class A), bare multisig (Class B), or other pre-OP_RETURN methods
  • Historical Counterparty encodings using multisig or pubkeyhash patterns
  • Cursed, unbound, or non-standard inscription envelopes
  • Witness data stuffing outside the standard Ordinals envelope
  • P2SH redeemScript data stuffing
  • Annex field data (BIP 341)
  • Unknown or emerging protocols using novel encoding methods

These exclusions mean total embedded data figures are conservative lower bounds.

Data Source

All data is derived from confirmed blocks fetched from a local Bitcoin Core node at verbosity level 2 (full transaction details). Block data is ingested incrementally via 60-second polling and stored in SQLite.

Duplicate payloads across transactions are counted separately. Reorgs are detected and corrected within 15 seconds.

Non-Overlapping Total

To compute a non-overlapping total of embedded data from the dashboard:

total_bytes = inscription_envelope_bytes + op_return_bytes

total_count = inscription_count + op_return_count

Do not add BRC-20 to Inscriptions, or Runes to OP_RETURN, as these are subsets.
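As a small illustration of the rule above, the sketch below computes the totals from one row of per-block figures. The dict layout and the `brc20_count` and `runes_count` field names are hypothetical; they are included only to show that subset fields are left out of the sum.

```python
def non_overlapping_totals(row):
    """Sum only the two disjoint top-level categories."""
    return {
        "total_bytes": row["inscription_envelope_bytes"] + row["op_return_bytes"],
        "total_count": row["inscription_count"] + row["op_return_count"],
    }


example_row = {
    "inscription_envelope_bytes": 120_000,
    "op_return_bytes": 8_000,
    "inscription_count": 300,
    "op_return_count": 150,
    "brc20_count": 200,   # subset of inscription_count, not added
    "runes_count": 90,    # subset of op_return_count, not added
}
totals = non_overlapping_totals(example_row)
```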

Block Metrics

All block data is fetched from Bitcoin Core at verbosity level 2 (full transaction details including inputs, outputs, witness data, and script types).

Block Size (bytes)

Raw serialized block size as reported by Bitcoin Core. Includes header, transaction data, and witness data.

Weight (weight units)

BIP 141 block weight: base_size * 3 + total_size. Consensus limit is 4,000,000 WU. Weight utilization is weight / 4,000,000 * 100%.
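The weight formula and utilization percentage translate directly into code. In this sketch, `base_size` is assumed to correspond to the `strippedsize` field and `total_size` to the `size` field of Bitcoin Core's `getblock` output; the function names are illustrative.

```python
MAX_BLOCK_WEIGHT = 4_000_000  # consensus limit in weight units


def block_weight(base_size, total_size):
    """BIP 141 weight: non-witness bytes count four times, witness bytes once."""
    return base_size * 3 + total_size


def weight_utilization_pct(weight):
    """Share of the consensus weight limit used, in percent."""
    return weight / MAX_BLOCK_WEIGHT * 100
```

For example, a block with 500,000 stripped bytes and 1,500,000 total bytes weighs 3,000,000 WU, or 75% of the limit.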

Transaction Count (transactions)

Total transactions in the block including the coinbase transaction.

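Block Interval (seconds)

Difference between this block's timestamp and the previous block's timestamp. Target is 600 seconds (10 minutes). Block timestamps are set by miners and are not strictly increasing, so an interval can occasionally be negative.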

Fee Calculations

Fees are derived from transaction data, not from any external fee estimation API.

Total Fees (satoshis)

Sum of all transaction fees in the block. Each fee = sum(inputs) - sum(outputs). Equivalently, when the miner claims the full reward: total coinbase output value minus the block subsidy (50 BTC, halved every 210,000 blocks).
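The subsidy schedule and the coinbase shortcut can be sketched as follows. The function names are illustrative, and the shortcut assumes the miner claimed the full reward; a coinbase that underpays itself would make the result too low.

```python
COIN = 100_000_000          # satoshis per BTC
HALVING_INTERVAL = 210_000  # blocks between halvings


def block_subsidy(height):
    """Subsidy in satoshis: 50 BTC, halved (integer shift) every 210,000 blocks."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return (50 * COIN) >> halvings


def total_fees_from_coinbase(coinbase_output_sats, height):
    """Total fees as coinbase output value minus the subsidy at this height."""
    return coinbase_output_sats - block_subsidy(height)
```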

Per-TX Fee (satoshis)

For each non-coinbase transaction: sum of input values minus sum of output values. Requires txindex for input value lookups.

Median Fee Rate (sat/vB)

Median of all per-transaction fee rates in the block. Fee rate = fee / virtual_size. Virtual size = weight / 4, rounded up to the next whole vbyte.

Fee Percentiles (sat/vB)

p10, p25, p75, p90 fee rates computed from the sorted list of per-transaction fee rates. Used for the Fee Rate Bands chart.
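The fee-rate statistics can be sketched as below. The percentile method is not specified above, so nearest-rank is used here as an assumption; the function names and the `(fee, weight)` tuple input are also illustrative. Virtual size is rounded up, following the BIP 141 convention.

```python
import statistics


def vsize(weight):
    """Virtual size in vbytes: weight / 4, rounded up."""
    return (weight + 3) // 4


def nearest_rank(sorted_values, p):
    """Nearest-rank percentile of an ascending list (p in 1..100)."""
    if not sorted_values:
        raise ValueError("no values")
    rank = max(1, (p * len(sorted_values) + 99) // 100)  # integer ceil(p * n / 100)
    return sorted_values[rank - 1]


def fee_rate_summary(txs):
    """txs: iterable of (fee_sats, weight) pairs for one block."""
    rates = sorted(fee / vsize(weight) for fee, weight in txs)
    summary = {"median": statistics.median(rates)}
    for p in (10, 25, 75, 90):
        summary[f"p{p}"] = nearest_rank(rates, p)
    return summary
```

Other percentile definitions (for example, linear interpolation between ranks) would give slightly different band edges on small blocks.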

Max TX Fee (satoshis)

Largest individual transaction fee in the block. Highlights fat-finger fees and high-priority transactions.

Protocol Fees (satoshis)

If a transaction contains an inscription or Runes output, its entire fee is attributed to that protocol. A transaction can only be attributed to one protocol.

Address Type Classification

Output types are classified from the scriptPubKey.type field returned by Bitcoin Core:

P2PK (outputs)

Pay-to-Public-Key (type: 'pubkey'). Early Bitcoin transactions, rarely used after 2010.

P2PKH (outputs)

Pay-to-Public-Key-Hash (type: 'pubkeyhash'). Legacy addresses starting with '1'.

P2SH (outputs)

Pay-to-Script-Hash (type: 'scripthash'). Addresses starting with '3', used for multisig and wrapped SegWit.

P2WPKH (outputs)

Pay-to-Witness-Public-Key-Hash (type: 'witness_v0_keyhash'). Native SegWit addresses starting with 'bc1q'.

P2WSH (outputs)

Pay-to-Witness-Script-Hash (type: 'witness_v0_scripthash'). Native SegWit multisig and complex scripts.

P2TR (outputs)

Pay-to-Taproot (type: 'witness_v1_taproot'). Taproot addresses starting with 'bc1p'. Available since block 709,632.

SegWit adoption % is calculated as the percentage of non-coinbase transactions with at least one witness input. Taproot spend types (key-path vs script-path) are detected from witness stack structure.
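The type mapping and the SegWit adoption percentage can be sketched against decoded transactions as returned by `getblock` at verbosity 2. The function names and the 'other' fallback label are assumptions; the `scriptPubKey.type`, `coinbase`, and `txinwitness` fields are those of Bitcoin Core's decoded transaction JSON.

```python
SCRIPT_TYPE_LABELS = {
    "pubkey": "P2PK",
    "pubkeyhash": "P2PKH",
    "scripthash": "P2SH",
    "witness_v0_keyhash": "P2WPKH",
    "witness_v0_scripthash": "P2WSH",
    "witness_v1_taproot": "P2TR",
}


def classify_output(vout):
    """Map a decoded output to its address-type label."""
    return SCRIPT_TYPE_LABELS.get(vout["scriptPubKey"]["type"], "other")


def segwit_adoption_pct(txs):
    """Percent of non-coinbase transactions with at least one witness input."""
    non_coinbase = [tx for tx in txs if "coinbase" not in tx["vin"][0]]
    if not non_coinbase:
        return 0.0
    with_witness = sum(
        1 for tx in non_coinbase if any(vin.get("txinwitness") for vin in tx["vin"])
    )
    return with_witness / len(non_coinbase) * 100
```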

Mining Pool Identification

Mining pools are identified by matching patterns in the coinbase transaction:

Primary method (coinbase text)

The coinbase scriptSig is decoded to ASCII and matched against known pool signatures (case-insensitive). Covers 30+ pools: Foundry, AntPool, ViaBTC, F2Pool, MARA, OCEAN, SpiderPool, and others.

Fallback method (coinbase outputs)

If text matching fails, OP_RETURN outputs in the coinbase transaction are checked for pool identifiers.

OCEAN miners (template detection)

OCEAN pool uses a decentralized template model. Individual OCEAN template miners are identified and attributed separately.

Unknown (unidentified)

Blocks that match no known pool signature are labeled 'Unknown'. The HHI diversity index excludes Unknown miners to avoid inflating concentration metrics.
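A minimal sketch of an HHI computation that drops 'Unknown' blocks before computing shares is shown below. The 0 to 10,000 scale (squared percentage shares) is an assumption, since the scale is not stated above; the function name and the list-of-labels input are illustrative.

```python
from collections import Counter


def hhi(pool_labels):
    """Herfindahl-Hirschman Index over identified pools, on a 0-10,000 scale.

    pool_labels holds one pool name per block; 'Unknown' blocks are excluded
    from both the numerator and the denominator.
    """
    known = [label for label in pool_labels if label != "Unknown"]
    if not known:
        return 0.0
    counts = Counter(known)
    total = len(known)
    return sum((count / total * 100) ** 2 for count in counts.values())
```

With this definition, two pools splitting the identified blocks evenly score 5,000, and a single pool mining every identified block scores 10,000.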

Price Data

Bitcoin price data is used for the price overlay on charts.

Source (API)

Historical daily prices from blockchain.info/charts API. Covers 2011 to present. Pre-2011 prices are hardcoded from historical records.

Live price (mempool.space)

Current price from mempool.space API, cached for 60 seconds to avoid excessive requests.

Chart overlay (interpolation)

For per-block charts, daily price data is interpolated to each block's timestamp using linear interpolation between surrounding price points. For daily charts, prices are matched by date.
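Linear interpolation of a daily price series to a block timestamp might look like the sketch below. The function name and the `(unix_timestamp, price)` input format are assumptions, as is the choice to clamp to the nearest endpoint when a block falls outside the price series.

```python
from bisect import bisect_right


def interpolate_price(points, ts):
    """Interpolate a price at unix time ts.

    points: list of (unix_timestamp, price), sorted with strictly increasing
    timestamps. Timestamps outside the series clamp to the first or last price.
    """
    times = [t for t, _ in points]
    i = bisect_right(times, ts)
    if i == 0:
        return points[0][1]
    if i == len(points):
        return points[-1][1]
    (t0, p0), (t1, p1) = points[i - 1], points[i]
    return p0 + (p1 - p0) * (ts - t0) / (t1 - t0)
```

A block mined at noon between a 100.0 and a 200.0 daily point would be plotted at 150.0.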

Daily Aggregates

For time ranges longer than ~35 days (5,000 blocks), per-block data is rolled up into daily aggregates for performance.

Aggregation (daily)

Each day's blocks are averaged (or summed where appropriate). Dates are derived from block timestamps in UTC.
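The daily roll-up can be illustrated with a small grouping function. Which metrics are summed and which are averaged is not listed above, so the choices here (fees summed, size averaged) are assumptions, as are the function names and the per-block dict keys.

```python
from collections import defaultdict
from datetime import datetime, timezone


def utc_date(block_time):
    """ISO date (UTC) for a block's unix timestamp."""
    return datetime.fromtimestamp(block_time, tz=timezone.utc).date().isoformat()


def daily_aggregate(blocks):
    """Group per-block rows by UTC date, summing fees and averaging size."""
    by_day = defaultdict(list)
    for block in blocks:
        by_day[utc_date(block["time"])].append(block)
    result = {}
    for day, day_blocks in sorted(by_day.items()):
        result[day] = {
            "block_count": len(day_blocks),
            "total_fees": sum(b["total_fees"] for b in day_blocks),
            "avg_size": sum(b["size"] for b in day_blocks) / len(day_blocks),
        }
    return result
```

Recomputing only the current day's entry, as described for incremental updates, amounts to calling this on the blocks whose UTC date matches today.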

Incremental updates (automatic)

The daily_blocks table is updated incrementally as new blocks arrive. Only the current day's row is recomputed.

Range switching (automatic)

Charts automatically switch between per-block and daily data based on the selected time range. Short ranges (1D-1M) show per-block data; longer ranges (3M+) show daily aggregates.

Some charts are only available on per-block ranges (scatter plots, histograms) while others are only meaningful on daily ranges (adoption velocity, sunset tracker). The chart description updates to reflect which mode is active.