Mathematical Appendices


Mathematical Appendix for Topic 5

Here, we recast the intuition behind hypothesis 4 in terms of a simple problem from the calculus of optimization.

Two students are in a college history class, and a test is pending. The first student studies during the night before and learns an amount of knowledge equal to k1. During the test, she opts to share a fraction m1 of that knowledge with the other student. Similarly, the other student learns k2 and shares a fraction m2. As a result, the two students obtain amounts of information k1 + m2*k2 and k2 + m1*k1, respectively.

While grading the test, the history professor applies a reward function to determine each student's grade. Call this function R(t1, t2, i), where t1 = k1 + m2*k2 is the total information displayed during the test by the first student, t2 = k2 + m1*k1 is that displayed by the second, and i (either 1 or 2) identifies the student whose test is being graded.

Furthermore, each student bears a cost of studying for the test, for the weather outside is beautiful, and the soccer field is inviting. In fact, the marginal cost of each additional detail of the American Civil War increases as the brains get fuller and more tired. Assume that the cost function is C(k) = c * k^2, where c is a positive constant and k is the amount of information learned by the student.

The total reward that each student receives from the whole experience is equal to the mark given by the professor, minus the cost of studying for the test:

T(k1, m1, k2, m2, i) = R(k1 + m2*k2, k2 + m1*k1, i) - C(ki).
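The definitions so far can be written down as a short Python sketch. The function names and the placeholder grading rule `own_information` are illustrative assumptions, not part of the original model or its simulations; any function of (t1, t2, i) could be passed in as the reward.

```python
def cost(k, c):
    """Quadratic study cost C(k) = c * k^2."""
    return c * k ** 2


def total_reward(reward, k1, m1, k2, m2, i, c):
    """Total reward T = R(t1, t2, i) - C(k_i) for student i (1 or 2)."""
    t1 = k1 + m2 * k2  # information displayed by student 1
    t2 = k2 + m1 * k1  # information displayed by student 2
    k_own = k1 if i == 1 else k2
    return reward(t1, t2, i) - cost(k_own, c)


def own_information(t1, t2, i):
    """Hypothetical placeholder grading rule: grade = own displayed information."""
    return t1 if i == 1 else t2


# Illustrative numbers only: student 1 learns 1.0 and shares half of it,
# student 2 learns 2.0 and shares a quarter of it, with c = 0.1.
print(total_reward(own_information, 1.0, 0.5, 2.0, 0.25, 1, 0.1))
print(total_reward(own_information, 1.0, 0.5, 2.0, 0.25, 2, 0.1))
```

Passing the reward function as an argument keeps the cost term separate from the grading scheme, which is the only part that differs between the cases.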

Now, the professor is the one deciding whether to adopt a relative-incentive grading scheme or an absolute-incentive one. We look at one example of each option.


Case 1. Absolute-incentive reward function. The grade reward received by each student is not affected by the information displayed by the other student:

R(t1, t2, 1) = k1 + m2*k2, with a symmetrical expression for i = 2.

In that case, T(k1, m1, k2, m2, 1) = k1 + m2*k2 - c*k1^2.

Taking the derivative of T with respect to k1 and setting it to zero, 1 - 2c*k1 = 0, we find that T reaches its maximum at k1 = 1 / (2c). This is the amount of knowledge each student will learn the night before in this case. Meanwhile, student 1's total reward is independent of m1, yet increases as m2 increases, so sharing costs a student nothing and benefits the other. If nothing prevents cooperation during the test, then m1 = m2 = 1.
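As a rough numerical check of the Case 1 optimum, the sketch below maximizes student 1's total reward over a grid of k1 values. The function names, the grid bounds, and the value c = 0.25 are assumptions chosen for illustration.

```python
def total_reward_absolute(k1, m2, k2, c):
    """Student 1's total reward T under absolute incentives (Case 1)."""
    return k1 + m2 * k2 - c * k1 ** 2


def best_k1_on_grid(m2, k2, c, k_max=10.0, steps=10_000):
    """Return the grid value of k1 in [0, k_max] that maximizes T."""
    grid = [k_max * n / steps for n in range(steps + 1)]
    return max(grid, key=lambda k1: total_reward_absolute(k1, m2, k2, c))


c = 0.25  # hypothetical cost constant, chosen so that 1 / (2c) = 2
print(best_k1_on_grid(m2=1.0, k2=2.0, c=c))  # close to 1 / (2c)
print(best_k1_on_grid(m2=0.0, k2=5.0, c=c))  # m2 and k2 do not move the optimum
```

The term m2*k2 only shifts T up or down, which is why the maximizing k1 does not depend on what the other student learns or shares.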

Case 2. Relative-incentive reward function. Each student is graded based on how well she did with respect to the other student. For instance,

R(t1, t2, 1) = (k1 + m2*k2) / (k2 + m1*k1), and

T(k1, m1, k2, m2, 1) = (k1 + m2*k2) / (k2 + m1*k1) - c*k1^2.

In this case, m1 appears only in the denominator of student 1's reward, so student 1 wants to reduce m1 to 0, holding all other variables constant. The same goes for the other student. As a result, we get m1 = m2 = 0.
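The incentive not to share can be seen numerically. In the sketch below, the function name and the fixed values of k1, k2, m2, and c are assumptions picked only to illustrate the trend.

```python
def total_reward_relative(k1, m1, k2, m2, c):
    """Student 1's total reward T under relative incentives (Case 2)."""
    return (k1 + m2 * k2) / (k2 + m1 * k1) - c * k1 ** 2


# Hold everything fixed except m1, the fraction student 1 shares.
shares = [n / 10 for n in range(11)]
rewards = [total_reward_relative(1.0, m1, 1.0, 0.5, 0.5) for m1 in shares]

for m1, reward in zip(shares, rewards):
    print(f"m1 = {m1:.1f}  T = {reward:.3f}")
```

Each increase in m1 enlarges the denominator, so the printed rewards fall steadily and the best choice for student 1 is m1 = 0.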

Once again we set the derivative of T with respect to k1 equal to 0, and evaluate the equation at m1 = m2 = 0. There T reduces to k1 / k2 - c*k1^2, so the condition is 1 / k2 - 2c*k1 = 0, and the amount of information student 1 will want to learn is k1 = 1 / (2c*k2). Symmetrically, student 2 will want to learn k2 = 1 / (2c*k1). As expected, the interests of the two students are in conflict: the more one learns, the less the other wants to learn. Let us assume that the two students are equally capable and motivated in their studies, and end up learning the same amount of information.

The two assumptions (students learn equal amounts of information and each maximizes her total reward) give us k1 = 1 / (2c*k1) => k1 = 1 / √(2c).
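A minimal sketch of this fixed point follows, with function names and the value c = 2 chosen as assumptions for illustration.

```python
import math


def best_response(k_other, c):
    """Amount a student wants to learn, given the other learns k_other (m1 = m2 = 0)."""
    return 1.0 / (2.0 * c * k_other)


def symmetric_equilibrium(c):
    """Common amount learned when both students learn the same: 1 / sqrt(2c)."""
    return 1.0 / math.sqrt(2.0 * c)


c = 2.0  # hypothetical cost constant
k_star = symmetric_equilibrium(c)
print(k_star, best_response(k_star, c))  # the best response to k_star is k_star

# Applying the best response twice returns the starting value.
print(best_response(best_response(0.3, c), c))
```

Because the best response applied twice gives back the original value, alternating the two students' responses never settles down from an arbitrary start. The equal-learning assumption is what selects the single value 1 / √(2c).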


Comparison of the two cases.

1. Maximum sharing in the case of absolute rewards and minimum sharing in the case of relative rewards follow immediately from the equations.

2. The situation with learned information is not as clear-cut.

a. When c > 1/2, 1 / √(2c) > 1 / (2c). In other words, each student opts to learn a greater amount of information in the second (relative-rewards) case than in the first.

b. On the other hand, when c < 1/2, the situation is reversed. Each student will actually learn more information in the absolute-rewards case.

Here's an intuitive interpretation of the above. When the cost of learning new information is great enough, each student will be content to rely on the sharing that will happen during the test to receive a decent grade. This explains case (a). On the other hand, as the cost of learning decreases, the students become more likely to get some studying done even when the sharing is guaranteed. In fact, when the cost of learning is low enough, each will learn less in the relative-incentives case, because the pressure of competing with the other will make learning an additional fact less worth the effort.

The important conclusion, however, is that there is a threshold value of the cost parameter c above which the relative-incentives framework increases the amount of knowledge each student will want to learn on her own.
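The threshold can be located by setting 1 / (2c) = 1 / √(2c), which gives c = 1/2. The sketch below tabulates both amounts on either side of that value; the function names and the sample values of c are illustrative assumptions.

```python
import math


def k_absolute(c):
    """Amount learned under absolute incentives: 1 / (2c)."""
    return 1.0 / (2.0 * c)


def k_relative(c):
    """Amount learned under relative incentives (symmetric case): 1 / sqrt(2c)."""
    return 1.0 / math.sqrt(2.0 * c)


for c in (0.125, 0.5, 2.0):
    print(c, k_absolute(c), k_relative(c))
```

With these sample values, the absolute scheme yields more learning at c = 0.125 (4 versus 2), the two coincide at c = 0.5, and the relative scheme yields more at c = 2 (0.5 versus 0.25).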


Generalization: varying continuously between absolute and relative incentives.

We can think of a continuous relative-incentives parameter α that varies between 0 and 1. At α = 0 the incentives are completely absolute (as in case 1 above), and the agents are least likely to learn, and most likely to share. As α increases, the agents become more and more likely to learn (provided c is high enough), and less likely to share.

We explicitly encode a simple dependence that meets these requirements into the learning and sharing functions of this topic's simulations. It is not an exact mathematical dependence of the kind obtained in our analysis of the simplest case, yet it still meets the qualitative requirements we derived.

In particular, the derivation provides our learning and sharing functions with values at the edge cases. That is, as the incentives range between completely absolute and completely relative, sharing goes from 100% to 0%, while learning is restricted by the cost function to range between some two positive numbers.
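One simple dependence that matches these edge cases is a straight-line interpolation in α. The sketch below is purely an assumption for illustration: the function names are hypothetical, and the text does not state that the simulations use a linear form.

```python
import math


def sharing_fraction(alpha):
    """Hypothetical sharing rule: 100% at alpha = 0, falling linearly to 0% at alpha = 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return 1.0 - alpha


def learning_amount(alpha, c):
    """Hypothetical learning rule: linear blend of the two derived edge values."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    k_abs = 1.0 / (2.0 * c)             # edge value at alpha = 0 (Case 1)
    k_rel = 1.0 / math.sqrt(2.0 * c)    # edge value at alpha = 1 (Case 2)
    return (1.0 - alpha) * k_abs + alpha * k_rel


c = 2.0  # hypothetical cost constant above the threshold 1/2
for alpha in (0.0, 0.5, 1.0):
    print(alpha, sharing_fraction(alpha), learning_amount(alpha, c))
```

With c above 1/2, learning rises with α while sharing falls, which is the qualitative behavior described above. Any other monotone curve through the same endpoints would serve equally well.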
