The following are some interesting results on the performance of different miners over the course of the first 280,000 blocks of the Ethereum blockchain. For this time span, I have compiled the list of block and uncle coinbase addresses; raw data can be found here for blocks and here for the uncles, and from this we can glean a lot of interesting information, particularly about stale rates and how well connected the different miners and pools are.
First, the scatterplot:
What we clearly see here are a few major trends. First of all, uncle rates are quite low compared to Olympic; altogether we have seen 20750 uncles alongside 280000 blocks, an uncle rate of 7.41% (if you compute this inclusively, i.e. uncles as a percentage of all blocks rather than uncles per block, you get 6.89%). In short, that is not much higher than similar figures for Bitcoin even back in 2011, when its mining ecosystem was more similar to Ethereum's, with CPU and GPU mining still dominant and transaction volume low. Note that this does not mean that miners earn only 93.11% of the revenue they would earn if they were infinitely well connected to everyone else; Ethereum's uncle mechanism effectively cuts out ~87% of the difference, so the actual "average loss" from poor connectivity is only ~0.9%. That said, these losses will increase for two reasons once the network starts seeing more transactions: first, the uncle mechanism applies only to base block rewards, not transaction fees, and second, larger blocks necessarily lead to longer propagation times.
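The arithmetic behind these figures can be reproduced in a few lines. This is only a sketch: the 87% recovery figure is taken from the text rather than derived from the protocol's reward rules, and the variable names are illustrative.

```python
# Totals quoted in the text.
uncles = 20750
blocks = 280000

# Uncles per main-chain block.
uncle_rate = uncles / blocks

# "Inclusive" rate: uncles as a share of all blocks produced
# (main-chain blocks plus uncles).
inclusive_rate = uncles / (blocks + uncles)

# Assumption taken from the text: the uncle mechanism recovers ~87%
# of the revenue that would otherwise be lost to stale blocks.
uncle_recovery = 0.87
average_loss = inclusive_rate * (1 - uncle_recovery)

print(f"uncle rate:     {uncle_rate:.2%}")
print(f"inclusive rate: {inclusive_rate:.2%}")
print(f"average loss:   {average_loss:.2%}")
```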
Second, we can see a general trend that larger miners have lower uncle rates. This is, of course, to be expected, though it is important to dissect (1) why this happens, and (2) to what extent this is a genuine effect rather than simply a statistical artifact of the fact that smaller samples tend to have more extreme results.
Separating by miner size, the statistics are as follows:
Number of blocks mined | Average uncle rate |
<= 10 | 0.127 |
10-100 | 0.097 |
100-1000 | 0.087 |
1000-10000 | 0.089* |
>= 10000 | 0.055 |
* This result is arguably heavily skewed by a single outlier, the likely broken miner that appears on the graph at 4005 blocks mined and a 0.378 uncle rate; excluding this miner, we get an average uncle rate of 0.071, which seems much more in line with the general trend.
There are four main hypotheses that can explain these results:
- Professionalism disparity: large miners are professional operations and have more resources available to invest in improving their overall connectivity to the network (for example, by buying better wireless service, or by watching more carefully to see whether their uncle rates are highly suboptimal due to networking issues), and therefore have higher efficiency. Small miners, on the other hand, tend to be hobbyists on their laptops, and may not be particularly well connected to the internet.
- Last-block effect: the miner that produced the last block "finds out" about the block immediately rather than after waiting ~1 second for it to propagate through the network, and thus gains an advantage in finding the next block.
- Pool efficiency: the very large miners are pools, and pools are for some reason likely to be more efficient at networking than solo miners.
- Time period differences: pools and other very large miners were not active on the first day of the blockchain, when block times were very fast and uncle rates were very high.
The last-block effect clearly does not tell the whole story. If it were 100% of the cause, we would actually see a linear decrease in the uncle rate with size: miners that mined 1 block might see an 8% uncle rate, miners that mined 28000 blocks (i.e. 10% of all blocks) would see a 7.2% uncle rate, miners that mined 56000 blocks would see a 6.4% uncle rate, and so on. This is because miners that mined 20% of the blocks would have mined the last block 20% of the time, and would therefore benefit from a 0% expected uncle rate 20% of the time, hence the 20% reduction from 8% to 6.4%. The difference between miners that mined 1 block and miners that mined 100 blocks would be negligible. In reality, of course, the decrease in stale rates with increasing size appears to be almost perfectly logarithmic, a curve that seems much more consistent with the professionalism disparity theory than with anything else. The curve is also consistent with the time period difference theory, though it is important to note that only ~1600 uncles (i.e. 8% of all uncles and 0.6% of all blocks) were mined during those first two hectic days when uncle rates were high, so that can account for at most ~0.6% of the uncle rates altogether.
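The linear prediction described above can be written down directly. The following is a minimal sketch of that thought experiment only, assuming (as the text does) that a miner's block is never an uncle when the same miner produced the previous block; the function name is illustrative.

```python
def last_block_only_uncle_rate(base_rate: float, share: float) -> float:
    """Expected uncle rate if the last-block effect were the only factor.

    A miner with `share` of the blocks mined the previous block `share`
    of the time; in those cases its next block is assumed never to
    become an uncle, so only the remaining (1 - share) is exposed.
    """
    return base_rate * (1 - share)


TOTAL_BLOCKS = 280000
BASE_RATE = 0.08  # uncle rate of a miner with a negligible share, as in the text

for blocks_mined in (1, 100, 28000, 56000):
    share = blocks_mined / TOTAL_BLOCKS
    rate = last_block_only_uncle_rate(BASE_RATE, share)
    print(f"{blocks_mined:>6} blocks mined: {rate:.3%}")
```

Under this model the 1-block and 100-block miners differ by only a few thousandths of a percentage point, which is why the last-block effect cannot explain the gap between the smaller cohorts.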
The fact that professionalism disparity seems to dominate is, in a sense, an encouraging sign, especially since (i) the factor matters more at small to medium scales than at medium to large scales, and (ii) individual miners tend to have countervailing economic factors that outweigh their reduced efficiency, in particular the fact that they are using hardware that has largely already been paid for.
Now, what about the jump from 7.1% at 1000-10000 blocks to 5.5% for everyone above that? The last-block effect can account for about 40% of the effect, but not all of it (quick math: the average miner in the former cohort has a network share of 1%, in the latter cohort 10%, and the difference of 9% should project a decrease from 7.1% to 7.1% × 0.91 ≈ 6.5%), though given the small number of miners, it is important to note that any findings here should be taken as highly tentative at best.
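The quick math above can be redone a little more carefully by scaling the uncle rate with the ratio of (1 - share) between the two cohorts. This is a sketch under the same last-block-only assumption; the 1% and 10% shares are the rough averages quoted in the text, not measured values.

```python
rate_mid = 0.071     # 1000-10000 cohort, outlier excluded
rate_large = 0.055   # >= 10000 cohort
share_mid = 0.01     # rough average network share of the first cohort
share_large = 0.10   # rough average network share of the last cohort

# If only the last-block effect were at work, the uncle rate would
# scale with (1 - share), so project the large cohort from the mid one.
projected_large = rate_mid * (1 - share_large) / (1 - share_mid)

# Fraction of the observed drop that this projection accounts for.
explained = (rate_mid - projected_large) / (rate_mid - rate_large)

print(f"projected rate for large cohort: {projected_large:.2%}")
print(f"share of the drop explained:     {explained:.0%}")
```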
The key feature of the miners above 10000 blocks, naturally, is that they are pools (or at least three of the five are; the other two are solo miners, though they are the smallest of the group). Interestingly, the two non-pools have uncle rates of 8.1% and 3.5% respectively, a weighted average of 6.0%, which is not much different from the 5.4% weighted average stale rate of the three pools. So overall it appears that pools are slightly more efficient than solo miners, but once again the finding should not be taken as statistically significant: although the sample size within each group is very large, the number of groups is small. Also, the most efficient mining pool is not actually the largest one (nanopool); it is suprnova.
This leads us to an interesting question: where do the efficiencies and inefficiencies of pooled mining come from? On the one hand, pools are likely to be very well connected to the network and do a good job of spreading their own blocks; they also benefit from a weaker version of the last-block effect (weaker because the single-hop round trip from miner to pool to miner still exists). On the other hand, the delay in getting work from a pool after a block is created should slightly increase the stale rate: assuming 200ms network latency, by about 1%. These forces are likely to roughly cancel each other out.
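One way to arrive at a figure of about 1% is a back-of-the-envelope estimate: for the duration of the latency after each new block, a pool's miners are still hashing on an outdated parent. The sketch below assumes a mean block time of about 17 seconds, which is not stated in the text and is an assumption here, and the function name is illustrative.

```python
def extra_stale_rate(latency_s: float, block_time_s: float) -> float:
    """Rough share of hashing time spent on an out-of-date parent.

    After every new block, miners keep working on old work for
    `latency_s` seconds out of an average `block_time_s` seconds.
    """
    return latency_s / block_time_s


LATENCY_S = 0.2      # 200ms, as in the text
BLOCK_TIME_S = 17.0  # assumed mean block time, not given in the text

print(f"extra stale rate: {extra_stale_rate(LATENCY_S, BLOCK_TIME_S):.2%}")
```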
The third key thing to measure is: how much of the disparity we see is due to genuine inequality in how well connected miners are, and how much is random chance? To check this, we can do a simple statistical test. These are the deciles of the uncle rates of all miners that produced more than 100 blocks (i.e. the first number is the lowest uncle rate, the second number is the 10th percentile, the third is the 20th percentile, and so on until the last number, which is the highest):
(0.01125703564727955, 0.03481012658227848, 0.04812518452908179, 0.0582010582010582, 0.06701030927835051, 0.07642487046632124, 0.0847457627118644, 0.09588299024918744, 0.11538461538461539, 0.14803625377643503, 0.3787765293383271)
These are the deciles generated by a random model where each miner has a “natural” stale rate of 7.41% and all disparities are due to some being lucky or unlucky:
(0.03, 0.052980132450331126, 0.06140350877192982, 0.06594885598923284, 0.06948640483383686, 0.07207207207207207, 0.07488986784140969, 0.078125, 0.08302752293577982, 0.09230769230769231, 0.12857142857142856)
So random chance alone gives us roughly half of the effect. The other half really does come from genuine connectivity differences; in particular, if you build a simple model in which the "natural" stale rates are random variables with a normal distribution around a mean of 0.09, a standard deviation of 0.06, and a hard minimum of 0, you get:
(0, 0.025374105400130124, 0.05084745762711865, 0.06557377049180328, 0.07669616519174041, 0.09032875837855091, 0.10062893081761007, 0.11311861743912019, 0.13307984790874525, 0.16252390057361377, 0.21085858585858586)
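Both models above can be sketched as one small Monte Carlo simulation. This is an illustration rather than the code used to produce the figures: the miner sizes are hypothetical stand-ins for the real per-miner block counts in the raw data, the "hard minimum of 0" is implemented by clamping negative draws to 0, and all function names are made up for the example.

```python
import random


def deciles(values):
    """Return the minimum, the 10th..90th percentiles, and the maximum."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[round(i * last / 10)] for i in range(11)]


def simulate_uncle_rates(block_counts, draw_natural_rate, rng):
    """Simulate one observed uncle rate per miner.

    Each of a miner's blocks independently comes with an uncle with
    probability equal to that miner's "natural" stale rate.
    """
    rates = []
    for n in block_counts:
        p = draw_natural_rate(rng)
        uncles = sum(rng.random() < p for _ in range(n))
        rates.append(uncles / n)
    return rates


rng = random.Random(0)
# Hypothetical miner sizes (all above 100 blocks), not the real data.
block_counts = [rng.randint(100, 3000) for _ in range(300)]

# Model 1: every miner has the same natural rate; all spread is luck.
luck_only = simulate_uncle_rates(block_counts, lambda r: 0.0741, rng)

# Model 2: natural rates drawn from Normal(0.09, 0.06), clamped at 0.
varied = simulate_uncle_rates(
    block_counts, lambda r: max(0.0, r.gauss(0.09, 0.06)), rng
)

print([round(d, 4) for d in deciles(luck_only)])
print([round(d, 4) for d in deciles(varied)])
```

Because the block counts are invented, the printed deciles will not match the tuples above; the point is only that the second model spreads the deciles much further apart than luck alone does.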
This is pretty close, although it grows too fast on the low side and too slowly on the high side; in reality, it appears that the best-fitting "natural stale rate distribution" exhibits positive skew, which we would expect given the diminishing returns to spending more and more effort on becoming better and better connected to the network. Still, the effects are not very large; especially once divided by 8 to account for the uncle mechanism, the disparities are much smaller than the disparities in electricity costs. Therefore, the best approaches to improving decentralization going forward arguably concentrate on finding more decentralized alternatives to mining pools; perhaps mining pools implementing something like Meni Rosenfeld's Multi-PPS could be a solution in the medium term.