Appendix B: The total cost of rewards and their distribution
The model used in chapter 7
Evolution of the distribution of attention
In chapter 7, we attempted to describe the compound result of the autonomous development of an Internet-native culture production and of the continued usage of the Internet as a distribution medium for media publishing. The interaction of these factors leads to a popularity distribution for creators (beneficiaries of the Creative Contribution), which is the basis for our model.
To explain what we mean by “popularity distribution for creators”, we need to define a few quantities. If each work w_j has popularity pop(w_j), and the relative contribution of a given creator c_i to w_j is contrib(c_i, w_j), the total popularity¹ for a given creator is:
amath `pop(c_i)=\sum_j contrib(c_i,w_j) \times pop(w_j)` endamath
For example, if creator A contributed 50% of work w1, 25% of work w2 and 60% of work w3, the popularity of A is:
amath `pop(A) = 0.5 pop(w_1) + 0.25 pop(w_2) + 0.6 pop(w_3)` endamath
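The same computation can be sketched in a few lines of Python. The popularity figures for the three works are made-up illustrative values, not measurements, and the function name `creator_popularity` is hypothetical.

```python
def creator_popularity(contributions, work_popularity):
    """Sum contrib(c_i, w_j) * pop(w_j) over the works a creator contributed to.

    contributions: dict mapping work id -> share of the work (0.0 to 1.0)
    work_popularity: dict mapping work id -> pop(w_j)
    """
    return sum(share * work_popularity[work]
               for work, share in contributions.items())


# Hypothetical popularity values for three works (illustrative only).
work_popularity = {"w1": 100.0, "w2": 200.0, "w3": 50.0}

# Creator A's shares, as in the example above.
shares_of_a = {"w1": 0.5, "w2": 0.25, "w3": 0.6}

pop_a = creator_popularity(shares_of_a, work_popularity)
print(pop_a)
```

Works to which the creator did not contribute are simply absent from the dictionary, which is equivalent to a contribution of 0.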
The popularity distribution for creators is the distribution of pop(c_i): it determines the distribution of the rewards. We will speak equivalently of usage distribution, because our measurement system collects usage clues (see appendix C).
A number of heterogeneous factors (such as classical media publishing and unauthorized file sharing) already interact to drive the attention devoted to various productions. This interaction will be stronger once the Creative Contribution is put in place and file sharing is recognized as a legitimate activity. It will result – as it already does – in attention patterns (i.e. popularity distributions) that do not formally follow Zipf’s law but are often closely approximated by it. We will discuss below the impact of this possible divergence, but for the time being, let us accept modeling the reported usage pattern by a Zipf law and consider only how the associated parameter – and thus the diversity of usage – will vary.
Note that actual rewards will be distributed according to the observed distribution, not the model: this model is used only to determine the overall amount of rewards, i.e. to set a global scale factor. To do this, we will assume that the input to the reward system will be a Zipf law with parameter α. This leaves us with 3 parameters to set: the value of α, the size of the overall universe (the number of creators to be rewarded), and the minimum reward amount to be distributed.
As explained in section Rewards, one can only speculate about the likely observed value of α. In chapter 7, we predict that it will start at the high value of 1.0 (the classical Zipf law, but actually less concentrated than present copyright rewards), and then progressively decrease to 0.9 (a more diverse distribution of attention), possibly becoming as low as 0.8 in the long term. The interested reader is referred to figure 7.2 for a comparison between the corresponding distributions.² Given the uncertainty of α and the likelihood that it will vary, it is important to build a model which can function at constant total cost regardless of the precise value of α, at least for the predictable future.
The reward level and size parameters
To set the remaining two parameters, we start from the decision that a certain number of people should be rewarded at or above a certain minimum level in a given country, at the time when the Creative Contribution is introduced. This choice is not arbitrary, but it is clearly based on a programmatic decision rather than on some more fundamental principle: there is no obvious ground truth that immediately tells us that “we should reward so many people by at least this amount”.
We chose to set this minimum reward at \$200/year (or the local currency equivalent), and the number of people who should receive it at 2-2.5% of individual Internet contributors who “produce and publish some contents for sharing over the Internet in a 3 month period” (see [Deroin2010]). We did our best to substantiate this decision (see ), but we acknowledge that a different choice could be made. However, there are some important constraints. The number of creators to be rewarded at the minimum level cannot be raised arbitrarily, not only because it would make the reward system too expensive, but also because it is not clear that there are enough deserving creators out there to justify it, or that they would come out of the woodwork if a reward were available.
All the computer programs used to experiment with our model are distributed in parallel with the publication of this book as free software³, and readers are encouraged to experiment with other possible values, should they wish to do so.
Our choice for the threshold below which rewards will not be distributed is \$40/year (justified on ). Once these 2 choices have been made, the other decisions follow for a given value of the diversity parameter α.
Setting an initial value for the universe size
For the time being, let us assume a proportional reward, where creators are rewarded in proportion to the measured usage of their works. Let us assume that the parameter of Zipf’s law for the initially observed diversity of usage will be α=1.0, and that we wish to have at least n=230,000 creators receiving €150/year or more, with a minimum distributed reward of €30/year (the same 5:1 ratio as between the \$200 and \$40 figures above). The following formula immediately gives the total number of rewarded creators:
amath `N = \exp[ln(n) + 1/\alpha ln(150 / 30)]` endamath
where exp(x) is the exponential of x and ln(x) is the natural logarithm of x. The formula is obtained as follows. According to Zipf’s law, the reward for the nth creator is:
amath `reward(n)=R_\alpha / n^\alpha` endamath
where R_α is a constant, or scale factor, that sets the overall level of the rewards. The last creator being rewarded (the one who receives the smallest amount) receives:
amath `reward_min=R_\alpha / N^\alpha` endamath
Dividing one equation by the other:
amath `(reward(n))/(reward_min) = N^\alpha / n^\alpha` endamath
Now take the natural logarithm of both sides:
amath `ln[(reward(n))/(reward_min)]=\alpha ln(N)-\alpha ln(n)` endamath
rearrange:
amath `ln(N)=ln(n) + 1/\alpha ln[(reward(n))/(reward_min)]` endamath
and take the exponential of both sides:
amath `N=\exp{ln(n)+1/\alpha ln[(reward(n))/(reward_min)]}` endamath
Plugging in the values reward(n)=150, reward_min=30, n=230,000, and α=1 gives:
amath `N=\exp{ln(230,000)+1/1 ln[150/30]}` endamath
amath `N=\exp{ln(230,000)+ln(5)}` endamath
amath `N=\exp{ln(230,000 \times 5)}=1,150,000` endamath
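The derivation above translates directly into code. The following is a minimal sketch, with a hypothetical function name, that reproduces the worked example and can be rerun with other values of α or other reward levels.

```python
import math


def universe_size(n, reward_n, reward_min, alpha):
    """Rank N of the last rewarded creator under a Zipf law of parameter alpha.

    n: rank of a creator who should receive reward_n
    reward_min: smallest reward that is actually distributed
    """
    return math.exp(math.log(n) + math.log(reward_n / reward_min) / alpha)


# Values from the worked example: n = 230,000, 150 vs. 30 per year, alpha = 1.
N = universe_size(n=230_000, reward_n=150, reward_min=30, alpha=1.0)
print(round(N))  # 1150000, as in the text
```

The expression is equivalent to N = n × (reward(n)/reward_min)^(1/α), so only the ratio between the two reward levels matters, not their absolute values.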
Running the model
All we now need is to work out the total reward, that is, the sum of all the rewards:
amath `reward_text(tot) = \sum_(m=1)^N reward(m) = \sum_(m=1)^N reward_min N^\alpha/m^\alpha` endamath
amath `reward_text(tot) = reward_min \times N^\alpha \times \sum_(m=1)^N 1/m^\alpha` endamath
amath `reward_text(tot) = reward_min \times N^\alpha \times H_N(\alpha)` endamath
where H_N(α) is the Nth generalized harmonic number of order α, already introduced in Appendix A. This formula immediately gives the total reward; the only step that requires some computational assistance is the evaluation of H_N(α).
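As an illustration, H_N(α) can be evaluated by direct summation, which takes well under a second for a universe of about one million creators. This is only a sketch under the assumptions of the worked example (N=1,150,000, α=1.0, reward_min=30); the function names are hypothetical and this is not the software distributed with the book.

```python
import math


def harmonic(n, alpha):
    """Generalized harmonic number H_n(alpha) = sum of 1/m**alpha for m = 1..n."""
    return math.fsum(m ** -alpha for m in range(1, n + 1))


def total_reward(reward_min, n_rewarded, alpha):
    """reward_tot = reward_min * N**alpha * H_N(alpha) for a proportional reward."""
    return reward_min * n_rewarded ** alpha * harmonic(n_rewarded, alpha)


N = 1_150_000
h_n = harmonic(N, 1.0)
total = total_reward(reward_min=30, n_rewarded=N, alpha=1.0)
print(f"H_N = {h_n:.4f}, total reward = {total:,.0f} per year")
```

For much larger universes, or when the sum must be evaluated many times, an approximation of the tail of the sum is faster than adding every term.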
How many creators are rewarded as diversity increases?
If we are now in a situation where the observed diversity of use corresponds to another value of Zipf’s law parameter, say α′, we can find the new number of rewarded creators N′ that will lead to the same total cost in this new situation:
amath `N'^(\alpha') \times H_(N')(\alpha') = N^\alpha \times H_N(\alpha)` endamath
Solving this equation for N′ is not trivial, and is most easily done with approximation techniques or numerical tables. For instance, in the example above with N=1,150,000 and α=1.0, the solution corresponding to α′=0.9 is N′=2,140,000 and for α″=0.8 it is N″=3,490,000.
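One possible approximation technique is sketched below. It is our own illustration, not necessarily the method used to obtain the figures above: the left-hand side increases with N′, so the equation can be solved by bisection, and H_N(α) is approximated by summing the first 1,000 terms exactly and estimating the tail with the Euler-Maclaurin formula. All function names and the `head` cut-off are assumptions.

```python
import math


def harmonic(n, alpha, head=1000):
    """Approximate H_n(alpha): exact sum below `head`, Euler-Maclaurin tail."""
    if n <= head:
        return sum(m ** -alpha for m in range(1, int(n) + 1))
    exact = sum(m ** -alpha for m in range(1, head))  # terms 1 .. head-1
    if alpha == 1.0:
        integral = math.log(n / head)
    else:
        integral = (n ** (1 - alpha) - head ** (1 - alpha)) / (1 - alpha)
    endpoints = 0.5 * (head ** -alpha + n ** -alpha)
    slope = -alpha * (n ** (-alpha - 1) - head ** (-alpha - 1)) / 12
    return exact + integral + endpoints + slope


def cost_factor(n, alpha):
    """N**alpha * H_N(alpha): the total reward divided by reward_min."""
    return n ** alpha * harmonic(n, alpha)


def equal_cost_universe(n_ref, alpha_ref, alpha_new, tol=0.5):
    """Find N' such that cost_factor(N', alpha_new) == cost_factor(n_ref, alpha_ref)."""
    target = cost_factor(n_ref, alpha_ref)
    lo, hi = 1.0, float(n_ref)
    while cost_factor(hi, alpha_new) < target:  # widen the bracket upwards
        lo, hi = hi, hi * 2
    while hi - lo > tol:  # bisection on a monotonically increasing function
        mid = (lo + hi) / 2
        if cost_factor(mid, alpha_new) < target:
            lo = mid
        else:
            hi = mid
    return round(hi)


for alpha_new in (0.9, 0.8):
    # Should land close to the 2,140,000 and 3,490,000 quoted in the text.
    print(alpha_new, equal_cost_universe(1_150_000, 1.0, alpha_new))
```

Because the text rounds its results to three significant digits, the values printed here are expected to agree with it only to that precision.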
Impact of a divergence of the observed usage from Zipf’s law
Only when a measurement system is fully in place can we judge if the reported usage follows Zipf’s law. What are the consequences if it does not? We have designed a constant cost reward system, and this cost can be distributed according to the observed usage, but with some adjustments.
If we want to keep the minimal reward and still distribute the same total amount of rewards, we will have two differences in comparison with what would have happened with the Zipf law fitted to the observed usage:
- the number of rewardees will be different;
- the level of use corresponding to the minimal reward will be different.
The second effect could be the most problematic one if it forces us to measure usage precisely at much lower levels than modeled in appendix C. Fortunately, this is easy to avoid so long as one does not try to reward an excessive number of creators, staying clear of the level of attention where real use is mixed with noise. If we take the example of the 2 million most popular files in 10 weeks of usage of eDonkey P2P, where we have a significant divergence between the best-fitting Zipf’s law and the observed data (see table A.1), the level of usage for the 1,000,000th most popular work is only 24% lower in the observed data than in the model.
Reward functions

Rewards for the first 20,000 creators among one million rewarded creators for various reward functions, with a creator popularity distribution corresponding to a Zipf law parameter of 0.9, and a minimum reward of \$40. The cube root reward function was suggested by Richard Stallman [Stallman 2009]. If a non-proportional reward can be implemented, we favor the choice of a power law reward function with index 2/3, associated with a top-off for the highest rewards.
Now we relax the assumption that the relation between popularity and reward is necessarily proportional. This is not unusual: in most media publishing, the remuneration for creators is more than proportional to sales. A bestselling author might get four times the percentage of royalties of an average essay writer, a fact explained in part by economies of scale in the production and distribution of books, but mainly by the stronger negotiating power of bestselling authors. In the digital world, there are strong fairness and diversity motives for using less-than-proportional rewards.
To allow for this, we introduce a reward function reward(n) = r(p_n), where p_n is the popularity of creator n relative to that of the least popular creator who will be rewarded: p_n = pop(c_n)/pop(c_N). The reward is proportional if r(p_n) = p_n. If we want the reward to be proportional to the square root of popularity, we would use amath `r(p_n)=\sqrt{p_n}` endamath.
The total reward is now given by:
amath `reward_text(tot) = reward(N) \times \sum_(n=1)^N r(N^\alpha/n^\alpha)` endamath
In the case of a square root reward:
amath `reward_text(tot) = reward_min \sum_(n=1)^N \sqrt(N^\alpha/n^\alpha) = reward_min N^(\alpha/2) H_N(\alpha/2)` endamath
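In other words, square root rewards for a distribution of parameter α are the same as proportional rewards for a distribution of parameter amath `\alpha/2` endamath. The same holds for any power reward function: a reward function with exponent β applied to a distribution of parameter α is equivalent to proportional rewards for a distribution of parameter αβ.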
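This equivalence is easy to check numerically. The sketch below uses a deliberately small universe (N=10,000, an arbitrary illustrative value) and hypothetical function names; it compares the total obtained by applying a reward function term by term with the proportional formula evaluated at the reduced parameter.

```python
import math


def total_with_reward_function(r, n_rewarded, alpha, reward_min=1.0):
    """reward_tot = reward_min * sum over n of r(N**alpha / n**alpha)."""
    top = n_rewarded ** alpha
    return reward_min * math.fsum(
        r(top / n ** alpha) for n in range(1, n_rewarded + 1))


def proportional_total(n_rewarded, alpha, reward_min=1.0):
    """reward_tot = reward_min * N**alpha * H_N(alpha)."""
    h = math.fsum(n ** -alpha for n in range(1, n_rewarded + 1))
    return reward_min * n_rewarded ** alpha * h


N, alpha = 10_000, 0.9  # small illustrative universe

# Square root rewards versus proportional rewards with parameter alpha / 2.
sqrt_total = total_with_reward_function(math.sqrt, N, alpha)
half_alpha_total = proportional_total(N, alpha / 2)
print(sqrt_total, half_alpha_total)

# Power law reward with index 2/3 versus parameter alpha * 2/3.
power_total = total_with_reward_function(lambda p: p ** (2 / 3), N, alpha)
scaled_alpha_total = proportional_total(N, alpha * 2 / 3)
print(power_total, scaled_alpha_total)
```

Each pair of printed values should agree up to floating-point rounding.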
- 1. If c_i did not contribute to w_j at all, contrib(c_i, w_j)=0.
- 2. In figure 7.2, the size of the universe is kept constant, but in practice, it will vary: if the total cost of the reward is fixed, spreading it more diversely (but still according to a Zipf law) implies an enlarged set of rewarded creators.
- 3. They can be run interactively or downloaded at http://www.sharing-thebook.org.