Block Template Similarities between Mining Pools

AntPool and friends?

Monday, September 16, 2024

Different mining pools sending out the same or a similar block template to miners is an indicator of proxy pooling. Knowing about proxy pools is important when discussing mining pool centralization. To find similarities between mining pool block templates, I compare the Merkle branches pools send in their stratum jobs and calculate a similarity score. This shows pools with similar templates and allows building a relationship graph between the pools.

In April 2024, I reported that pools like BTC.com, Binance Pool, Poolin, Braiins, and possibly other pools¹ sometimes have the same template and custom transaction prioritization as AntPool. While motivated by on-chain transaction flows, my reporting was mainly based on visual observations over a few days. To further back these claims with data and have a base for further analysis, I started to record the stratum jobs published by major Bitcoin mining pools. In this post, I use a similarity score to compare the pool templates and develop a pool relationship graph showing pools that share templates.

Looking at the merkle branches that mining pools send to miners as part of stratum jobs, it's clear that the BTCcom pool, Binance pool, Poolin, EMCD, Rawpool, and possibly Braiins* have exactly the same template and custom transaction prioritization as AntPool. https://t.co/KTjFWtTXEP pic.twitter.com/xhCrdvkOH8
— b10c (@0xB10C) April 17, 2024

Stratum jobs and Merkle branches

Stratum jobs don’t contain the full list of transactions included in the block template. A miner only needs to construct a block header, which can be done without knowing the full template contents. Modern miners exhaust the 32-bit nonce in the block header quite fast and can then either update the timestamp in the header, roll the version a la overt ASICBoost, or change the so-called extranonce in the coinbase transaction, which causes the Merkle root to change. For this, miners need the coinbase transaction, information about the extranonce, and the Merkle branches to calculate a new Merkle root.

The list of Merkle branches in stratum jobs contains just the information required to calculate the Merkle root. To build the Merkle root, the coinbase transaction is hashed together with the first Merkle branch. The result is then hashed with the second Merkle branch, and that result is in turn hashed with the third Merkle branch. The Merkle root is reached once all Merkle branches have been hashed in this way.
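The folding step can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular miner or of my tooling. The parameter names `coinb1`, `coinb2`, `extranonce1`, and `extranonce2` follow the common stratum v1 `mining.notify` convention and are an assumption here, as is the choice to pass every field as a hex string.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_from_job(coinb1: str, extranonce1: str, extranonce2: str,
                         coinb2: str, merkle_branches: list[str]) -> bytes:
    """Fold the coinbase hash with each Merkle branch, in order."""
    # Assumption: all fields are hex strings, and the coinbase transaction
    # is the two coinbase parts with the extranonces placed between them.
    coinbase_tx = bytes.fromhex(coinb1 + extranonce1 + extranonce2 + coinb2)
    node = dsha256(coinbase_tx)
    for branch in merkle_branches:
        # The coinbase path is always the left child, the branch the right one.
        node = dsha256(node + bytes.fromhex(branch))
    return node
```

Changing `extranonce2` changes the coinbase hash and therefore the returned root, while the list of branches stays untouched.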

How to construct a block header from a stratum job: Shows the stratum job on the left and a block header on the top. The individual transactions, the Merkle branches, and the Merkle tree are shown.

The first Merkle branch b0, which is hashed with the coinbase transaction, is the txid of the first non-coinbase transaction in the block. The second Merkle branch b1 is the hash of the txids of the third and fourth transactions. The third Merkle branch b2 is the hash of two hashes: the fifth and sixth txids hashed together, and the seventh and eighth txids hashed together. The number of transactions covered by each Merkle branch doubles with every position. While the third branch covers 4 transactions, the eighth branch already covers 128 transactions.
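To make the branch construction concrete, here is a small Python sketch that derives the branches for the coinbase path from an ordered list of non-coinbase txids. The function names are hypothetical and chosen for illustration, and txids are assumed to be raw 32-byte hashes in the byte order used for hashing.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_branches(txids: list[bytes]) -> list[bytes]:
    """Branches for the coinbase path, given non-coinbase txids in block order."""
    # Placeholder for the coinbase hash: it never ends up inside a branch.
    level = [b"\x00" * 32] + list(txids)
    branches = []
    while len(level) > 1:
        branches.append(level[1])  # sibling of the coinbase path on this level
        if len(level) % 2 == 1:
            level.append(level[-1])  # Bitcoin duplicates the last node of an odd level
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return branches


def max_txs_in_branch(position: int) -> int:
    """Maximum number of transactions covered by the branch at a one-indexed position."""
    return 2 ** (position - 1)
```

With seven non-coinbase transactions, `merkle_branches` returns three branches, and `max_txs_in_branch(8)` gives the 128 transactions mentioned above.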

Similarity-score

To measure the similarity of block templates by different pools, I record the stratum jobs published by major mining pools with my work-in-progress stratum-observer tool. While some pools offer multiple stratum endpoints distributed across the globe, I choose the endpoints located closest to me to reduce latency. Note that different stratum endpoints from the same pool might serve different jobs.

A naive approach to measure the similarity would be to compare the Merkle branches included in mining jobs by calculating the share of branches that match across two pools. For example, if two pools include 10 Merkle branches in their jobs, and the first 7 Merkle branches match, this would result in a 70% match score. However, as the number of transactions in the Merkle branches rises exponentially, the first Merkle branches only contain a few transactions while the later ones contain the majority. The result would show branch similarity, but wouldn’t reflect template similarity.

A weighted approach, where the first branches weigh less than the later branches, more closely reflects template similarity. Matching branches can be weighted by the maximum number of transactions they can contain. The weighted similarity score of the Merkle branch lists $A$ and $B$ is

$$ \sum_{i = 1,\ A_i \text{ matches } B_i}^{l} \left(\frac{1}{2}\right)^{1 + l - i} $$

where $i$ is the one-indexed position in the list of Merkle branches and $l$ is the minimum length of $A$ and $B$. A similarity score close to 100% indicates that the templates match (a full match yields $1 - 2^{-l}$), while a similarity score of 0% indicates that no branch matches. The example from above, where the first 7 of 10 Merkle branches match, results in a weighted similarity score of just under 12.5%.
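Both scores are easy to write down in Python. The following is a sketch of the formula as stated, not the exact code used for the evaluation. Dividing the naive score by the shorter list length is an assumption, as the naive approach is only described for lists of equal length.

```python
def naive_similarity(a: list[str], b: list[str]) -> float:
    """Share of positions at which the two branch lists match."""
    l = min(len(a), len(b))  # assumption: compare up to the shorter list
    if l == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / l


def weighted_similarity(a: list[str], b: list[str]) -> float:
    """Sum of (1/2)^(1 + l - i) over all one-indexed positions i that match."""
    l = min(len(a), len(b))
    return sum(
        0.5 ** (1 + l - i)
        for i in range(1, l + 1)
        if a[i - 1] == b[i - 1]
    )


# The example from the text: the first 7 of 10 branches match.
pool_a = [f"branch{i}" for i in range(10)]
pool_b = pool_a[:7] + ["x", "y", "z"]
print(naive_similarity(pool_a, pool_b))     # 0.7
print(weighted_similarity(pool_a, pool_b))  # 0.1240234375
```

The weighted score of the example is $2^{-3} - 2^{-10}$, which shows how little the early branches contribute compared to the last three.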

The similarity score is strongly affected by transaction ordering. Two block templates may include the same transactions but as soon as the transaction order is slightly different, the Merkle branches won’t match and the similarity score won’t indicate similarity.

Evaluation

The described similarity score is applied to stratum job data collected between 2024-06-01 and 2024-09-12. The data consists of 690k stratum jobs across 24 pools and pool configurations: my test 0xB10C Pool (used as a reference), AntPool, BTC.com, Binance Pool, Braiins, the CKPool solo pool, the DEMAND stratum v1 solo pool, F2Pool, Foundry, Luxor, MaraPool and a development endpoint Marapool (dev), Ocean Pool’s four template providers Ocean (default), Ocean (ordis), Ocean (core), and Ocean (datafree), Poolin, the PyBlock solo pool, SBICrypto, SecPool, SigmaPool, SpiderPool, Ultimus, and ViaBTC.

Pools generally publish at least one new mining job every minute with a frequently used interval being 30 seconds. To be able to compare recent jobs by different pools to each other, I sampled and compared the most recent jobs every five minutes.
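One way to implement such a sampling step is sketched below in Python. The data layout (a dict mapping each pool name to a sorted list of timestamps and a parallel list of branch lists) and the function names are assumptions made for this illustration and do not describe the stratum-observer tool.

```python
from bisect import bisect_right
from itertools import combinations


def latest_job(timestamps: list[float], jobs: list[list[str]], t: float):
    """Most recent job published at or before time t, or None if there is none."""
    idx = bisect_right(timestamps, t)
    return jobs[idx - 1] if idx > 0 else None


def sample_pairs(pool_jobs: dict, start: float, end: float, step: float = 300):
    """Yield (t, pool_a, pool_b, branches_a, branches_b) at every sampling step."""
    t = start
    while t <= end:
        latest = {}
        for pool, (timestamps, jobs) in pool_jobs.items():
            job = latest_job(timestamps, jobs, t)
            if job is not None:
                latest[pool] = job
        # Compare every unordered pair of pools that has a job at this time.
        for pool_a, pool_b in combinations(sorted(latest), 2):
            yield t, pool_a, pool_b, latest[pool_a], latest[pool_b]
        t += step
```

With 24 pools and pool configurations, `combinations` produces 276 pairs per sampling step when every pool has published at least one job.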

The following similarity matrix gives an overview of which templates are similar. The similarity scores shown in the matrix fall into three groups. First, a high-similarity group with scores between 80% and 99%. It contains, for example, AntPool-Poolin with a 99% similarity score and AntPool-BTC.com with a 98% similarity score. Second, a low-similarity group with scores between 20% and 35%. This group contains, for example, Braiins-Poolin with a score of 32%. The third group is the no-similarity group with similarity scores below 5%. This group contains, for example, the Ocean (datafree)-ViaBTC pool combination with a similarity score of 0%.

A similarity matrix showing the mean similarity score for all 276 pool combinations. Higher (yellow) means more similar.

High-similarity pools

The pool combinations in the high-similarity group regularly send out mining jobs based on the same template. The high-similarity score combinations are:

AntPool - Poolin: 99%
AntPool - BTC.com: 98%
BTC.com - Poolin: 99%
SecPool - SigmaPool: 97%
Braiins - Ultimus: 89%
SpiderPool - Binance Pool: 81%

Based on these combinations, a preliminary relationship graph can be constructed. A relationship between AntPool, BTC.com, Poolin, and other pools has been assumed before based on coinbase reward consolidations. Braiins and Ultimus Pool both have been seen consolidating coinbase rewards with AntPool and other pools in the past, too. The relationship between SigmaPool and SecPool isn’t surprising, as the SigmaPool stratum endpoint eu1.btc.sigmapool.com only publishes mining jobs with the Mined by SecPool tag in the coinbase transaction. The SigmaPool stratum endpoint is a proxy for the SecPool endpoint. SpiderPool mined its first block in the spring of 2024. The relationship with Binance Pool is interesting.

A preliminary relationship graph based on high similarity scores. This graph is updated below.

Similarity over time

By plotting the similarity scores of the high- and low-similarity pool combinations over time, it becomes apparent that the BTC.com and Binance Pool mining pools switched to a different template provider in the observed time frame. The BTC.com pool switched to the Braiins and Ultimus templates at 9 am UTC on 2024-06-18 for about 24 hours. Binance Pool switched from SpiderPool to the AntPool-Poolin-BTC.com template on 2024-08-23.

Pools with an average similarity score >10% plotted over time.

Binance Pool switching from SpiderPool to AntPool-Poolin-BTC.com can also be observed in the coinbase transaction tags sent along with the stratum jobs. Until 2024-08-23, the Binance Pool endpoint I was connected to sent out jobs with the SpiderPool/ coinbase tag. After the switch, it mainly included the tag binance/. A relationship between Binance Pool and BTC.com has already been observed in CoinMetrics’s State-of-the-Network #249 “Following Flows V: Pool Cross-Pollination”. While my data does not contain any jobs with binance/ in the coinbase tag before 2024-08-23, there were certainly blocks mined with a Binance Pool tag. Possibly, different Binance Pool endpoints publish different jobs.

From 2024-08-23 on, the Binance Pool jobs sometimes also contain the tag Mined by SecPool. The Binance Pool endpoint seems to transparently, without changing the coinbase transaction, proxy the mining jobs from SecPool from time to time. This partly explains the similarity score of 10% between Binance Pool and SecPool.
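Spotting such tags in recorded jobs can be as simple as a substring search over the coinbase parts of a job. The Python sketch below assumes the two coinbase parts are available as hex strings (called `coinb1` and `coinb2` here, following the common stratum v1 naming) and only looks for the tags named in this post. The function name is hypothetical.

```python
# Tags mentioned in this post. Any other tag would have to be added by hand.
KNOWN_TAGS = ["SpiderPool/", "binance/", "Mined by SecPool", "/slush/"]


def coinbase_tags(coinb1: str, coinb2: str, known_tags=KNOWN_TAGS) -> list[str]:
    """Return the known tags that appear in either coinbase part."""
    # The parts are searched separately so that no tag is matched across
    # the gap where the extranonce would be inserted.
    parts = [bytes.fromhex(coinb1), bytes.fromhex(coinb2)]
    return [
        tag for tag in known_tags
        if any(tag.encode("ascii") in part for part in parts)
    ]
```

A tag that a pool splits across the extranonce would be missed by this approach.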

Coinbase tags in stratum jobs by Binance Pool’s stratum endpoint

With this information, we can update the relationship graph.

When looking for switches in coinbase tags from other pools, Braiins and Luxor stand out. Braiins normally uses the /slush/ coinbase tag from its predecessor Slush. However, since mid-July 2024, it has transparently proxied Foundry’s mining jobs from time to time. This partly explains the 10% similarity score between Braiins and Foundry. On 2024-08-23, the day Binance Pool switched from SpiderPool to AntPool-Poolin-BTC.com, Braiins transparently proxied the jobs from Binance Pool for a few hours. Note that the Binance Pool website shows a popup referring to UltimusPool as a “Strategic Business Partner” and mentions that UltimusPool, which has the same block templates as Braiins, provides “technical services” for Binance Pool.

Coinbase tags in stratum jobs by Braiins stratum endpoint

The Luxor stratum endpoint seems to transparently proxy the jobs from the SBICrypto pool from time to time. Interestingly, the Luxor and SBICrypto pools only have a similarity score of 2%.

Coinbase tags in stratum jobs by Luxor’s stratum endpoint

Low-similarity pools

To analyze the relationships in the low-similarity group more closely, it makes sense to inspect where the Merkle branches of the pools with 20%-35% similarity match. The number of Merkle branches in a job depends on the number of transactions in the template, which depends on the pool’s mempool. More transactions in the template mean more Merkle branches sent in the mining job. Currently, most mining jobs include 12 or 13 Merkle branches, which corresponds to up to 4096 and 8192 transactions, respectively. The following graphic shows the share of matching Merkle branches for each branch position in the jobs. For example, when comparing the Merkle branches of Ocean (core) and CKPool (solo), 70% of all Merkle branches matched at position 2 when the jobs contained 12 Merkle branches. The 70% can be found in the top-left chart, at x = 2 and y = 12.
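A per-position match share of this kind can be computed with a short aggregation. The Python sketch below is an illustration under assumptions, not the code behind the charts: the function names are hypothetical, and skipping pairs whose jobs have a different number of branches is a simplification chosen here. A job with $n$ branches covers up to $2^n$ transactions, so 12 branches allow 4096 and 13 branches allow 8192.

```python
from collections import defaultdict


def max_transactions(branch_count: int) -> int:
    """Upper bound on the transactions in a template with this many branches."""
    return 2 ** branch_count


def position_match_share(pairs):
    """Map (branch_count, one-indexed position) to the share of matching branches.

    `pairs` is an iterable of (branches_a, branches_b) for two pools.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for a, b in pairs:
        if len(a) != len(b):
            continue  # assumption: only compare jobs with equally many branches
        for position, (x, y) in enumerate(zip(a, b), start=1):
            totals[(len(a), position)] += 1
            matches[(len(a), position)] += x == y
    return {key: matches[key] / totals[key] for key in totals}
```

Each key of the result corresponds to one cell in a chart with the branch position on the x-axis and the branch count on the y-axis.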

Where do the Merkle branches match? Four examples.

The chart includes four hand-picked examples². The Ocean (core)-CKPool (solo) example in the top-left should be a good baseline for the Merkle branch similarity between likely well-connected but unmodified Bitcoin Core nodes. These two pools have a similarity score of 1%. While branch position 1 matches more than 80% of the time, around position 7 the branches match in less than 10% of the cases. The last branch positions never match. This is likely caused by small, normally occurring differences in the pools’ mempools.

The Poolin-AntPool comparison in the top-right shows the extreme case where the Merkle branches almost always match completely. These two pools have a similarity score of 99%. At nearly every branch position, 99% or more of the branches match.

The ViaBTC-Ocean (datafree) example in the bottom-left shows the other extreme when the Merkle branches rarely match. These pools have a similarity score of 0%. ViaBTC is known for its custom transaction prioritizations through its transaction accelerator and the Ocean datafree node policy filters out all data-carrying transactions. These block templates are expected to be as different as it gets. There are very few matches between the branches at position 1 and no matches at the later positions.

With a baseline and examples for both extremes, we can better categorize pool combinations from the low-similarity group. The example in the bottom-right shows Braiins-AntPool with a similarity score of 32%. The first few branch positions nearly always match. At positions 8 and 9, around 50% of the branches still match. At the last positions, often more than 25% of the branches match. This is significantly more than in the baseline and raises the question of why, for example, the Braiins-AntPool templates share more transactions than other pool combinations.

It has been previously observed that AntPool, Braiins, and other pools sometimes prioritize the same low-fee transactions by putting them at the beginning of the block at the same time. One explanation could be that the templates are built from two nodes connected to each other. Both nodes receive the same prioritization. Sometimes their mempools match, and slight mempool differences cause a lower similarity score.

Final pool relationship graph including high- and low-similarity data as well as temporary relationships. Percentages are the average similarity scores. Pools not included here don’t show any significant similarities to other pools.

Looking at the network share of these nine pools can help to put these relationships into perspective. In the past month, AntPool mined 24.8% of the blocks, Binance Pool 2.86%, SpiderPool 2.67%, SecPool 2.14%, Luxor 1.89%, Braiins 1.7%, BTC.com 0.82%, Poolin 0.4%, and UltimusPool 0.31%. In contrast, Foundry mined 31.31% of the blocks. The high-similarity pool combination of AntPool-BTC.com-Poolin therefore has a network share of 26.02%. Braiins and Ultimus Pool together have a share of 2%. All nine pools together have had a 37.6% share of the network hashrate over the past month, which is significantly larger than Foundry’s share. That said, while the block templates might be unusually similar between some of these pools, and some pools might be acting as proxy pools for others here and there, it’s not proven that there is a single entity behind these nine pools. Yet, it adds more data points to the discussion around mining pool centralization.
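The sums can be checked with a few lines of Python. The numbers are the one-month block shares listed above, and the grouping simply follows the relationships discussed in this post.

```python
# One-month share of mined blocks in percent, as listed above.
shares = {
    "AntPool": 24.8, "Binance Pool": 2.86, "SpiderPool": 2.67,
    "SecPool": 2.14, "Luxor": 1.89, "Braiins": 1.7,
    "BTC.com": 0.82, "Poolin": 0.4, "UltimusPool": 0.31,
}

antpool_group = sum(shares[p] for p in ("AntPool", "BTC.com", "Poolin"))
braiins_group = shares["Braiins"] + shares["UltimusPool"]
all_nine = sum(shares.values())

print(round(antpool_group, 2))  # 26.02
print(round(braiins_group, 2))  # 2.01
print(round(all_nine, 2))       # 37.59
```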

@mononautical reported on coinbase transaction outputs from multiple pools, including AntPool, Braiins, Binance Pool, SecPool, and F2Pool, being spent in the same transaction. Yet, the F2Pool templates show no similarity to any of the other pools. This is not surprising, as F2Pool is known to run its own nodes and build its own block templates. It’s expected that F2Pool templates don’t match, for example, AntPool’s templates.

More insights can be extracted from the stratum job data. For example, looking at the job update arrival time might be interesting as the jobs from some pools arrive at the same time. It might be interesting to look at the custom transaction prioritizations and where these match across pools. The coinbase output value could be analyzed. Pools often having a similar coinbase output value might be peering with each other. Generally, it might make sense to directly peer two otherwise independent Bitcoin Core nodes to see how similar the templates get.

¹ My post mentioned EMCDPool and RawPool. According to mempool.space, EMCDPool mined its last block over a year ago and RawPool has been inactive for three years. While their stratum endpoints still publish jobs, I didn’t include them in this analysis.
² A collection showing all 276 possible pool combinations can be found here (7.1 MB, ~3500x12000px).