In Defense of Fluid Democracy

07/25/2021 ∙ by Daniel Halpern, et al. ∙ MIT ∙ Harvard University ∙ Cornell University

Fluid democracy is a voting paradigm that allows voters to choose between directly voting and transitively delegating their votes to other voters. While fluid democracy has been viewed as a system that can combine the best aspects of direct and representative democracy, it can also result in situations where a few voters amass a large amount of influence. To analyze the impact of this shortcoming, we consider what has been called an epistemic setting, where voters decide on a binary issue for which there is a ground truth. Previous work has shown that under certain assumptions on the delegation mechanism, the concentration of power is so severe that fluid democracy is less likely to identify the ground truth than direct voting. We examine different, arguably more realistic, classes of mechanisms, and prove they behave well by ensuring that (with high probability) there is a limit on concentration of power. Our proofs demonstrate that delegations can be treated as stochastic processes and that they can be compared to well-known processes from the literature, such as preferential attachment and multi-type branching processes, that are sufficiently bounded for our purposes. Our results suggest that the concerns raised about fluid democracy can be overcome, thereby bolstering the case for this emerging paradigm.


1 Introduction

Fluid democracy (also known as liquid democracy) is a voting paradigm that is conceptually situated between direct democracy, in which voters have direct influence over decisions, and representative democracy, where voters choose delegates who represent them for a period of time. Under fluid democracy, voters have a choice: they can either vote directly on an issue, as in direct democracy, or they can delegate their vote to another voter, entrusting them to vote on their behalf. The defining feature of fluid democracy is that these delegations are transitive: if voter 1 delegates to voter 2 and voter 2 delegates to voter 3, then voter 3 votes (or delegates) on behalf of all three voters.

In recent years, fluid democracy has gained prominence around the world. The most impressive example is that of the German Pirate Party, which adopted the LiquidFeedback platform in 2010 [22]. Other political parties, such as the Net Party in Argentina and Flux in Australia, have run on the wily promise that, once elected, their representatives would be essentially controlled by voters through a fluid democracy platform. Companies are also exploring the use of fluid democracy for corporate governance; Google, for example, has run a proof-of-concept experiment [17].

Practitioners, however, recognize that there is a potential flaw in fluid democracy, namely, the possibility of concentration of power, in the sense that certain voters amass an enormous number of delegations, giving them pivotal influence over the final decision. This scenario seems inherently undemocratic, and it is not a mere thought experiment. Indeed, in the LiquidFeedback platform of the German Pirate Party, a linguistics professor at the University of Bamberg received so many delegations that, as noted by Der Spiegel (http://www.spiegel.de/international/germany/liquid-democracy-web-platform-makes-professor-most-powerful-pirate-a-818683.html), his “vote was like a decree.”

Kahng et al. [21] examine fluid democracy’s concentration-of-power phenomenon from a theoretical viewpoint, and establish a troubling impossibility result in what has been called an epistemic setting, that is, one where there is a ground truth. (The use of the term “epistemic” in this context is well-established in the social choice literature [23, 28].) Informally, they demonstrate that, even under the strong assumption that voters only delegate to more “competent” voters, any “local mechanism” satisfying minimal conditions will, in certain instances, fall victim to a concentration of power, leading to relatively low accuracy.

More specifically, Kahng et al. model the problem as a decision problem where $n$ voters decide on an issue with two outcomes, $\{0, 1\}$, where $1$ is correct (the ground truth) and $0$ is incorrect. Each of the voters $i \in \{1, \dots, n\}$ is characterized by a competence $p_i \in [0, 1]$. The binary vote $V_i$ of each voter $i$ is drawn independently from a Bernoulli distribution, that is, each voter votes correctly with probability $p_i$. Under direct democracy, the outcome of the election is determined by a majority vote: the correct outcome is selected if and only if more than half vote for the correct outcome. Under fluid democracy, there exists a set of weights, $\mathrm{weight}_i$ for each $i \in \{1, \dots, n\}$, which represent the number of votes that voter $i$ gathered transitively after delegation. If voter $i$ delegates, then $\mathrm{weight}_i = 0$. The outcome of the election is then determined by a weighted majority; it is correct if and only if $\sum_{i=1}^{n} \mathrm{weight}_i V_i > n/2$. Kahng et al. also introduce the concept of a delegation mechanism, which determines whether voters delegate and, if so, to whom they delegate. They are especially interested in local mechanisms, where the delegation decision of a voter only depends on their local neighborhood according to an underlying social network. They assume that voters only delegate to those with strictly higher competence, which excludes the possibility of cyclic delegations.
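As a minimal sketch of the two voting rules just described, the following Python snippet draws Bernoulli votes and evaluates a (weighted) majority. The function names, the strict "more than half" threshold applied to the weighted rule, and the example competences are illustrative assumptions rather than part of Kahng et al.'s formalism.

```python
import random

def sample_votes(competences, rng):
    """Draw one vote per voter: 1 (correct) with probability p_i, else 0."""
    return [1 if rng.random() < p else 0 for p in competences]

def weighted_majority_correct(votes, weights):
    """True iff correct votes carry more than half of the n original votes."""
    n = len(votes)
    return sum(w * v for w, v in zip(weights, votes)) > n / 2

def direct_majority_correct(votes):
    """Direct democracy is the special case where every weight is 1."""
    return weighted_majority_correct(votes, [1] * len(votes))

rng = random.Random(0)
competences = [0.9, 0.6, 0.4]  # hypothetical competences
votes = sample_votes(competences, rng)
# Voters at indices 1 and 2 delegate to voter 0, who then casts all 3 votes.
print(direct_majority_correct(votes), weighted_majority_correct(votes, [3, 0, 0]))
```

With weights `[3, 0, 0]` the outcome depends on voter 0 alone, which is the concentration-of-power effect in its most extreme form.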

In order to evaluate fluid democracy, Kahng et al. [21] test the intuition that society makes more informed decisions under fluid democracy than under direct democracy (especially given the foregoing assumption about upward delegation). To that end, they define the gain of a delegation mechanism to be the difference between the probability the correct outcome is selected under fluid democracy and the probability the correct outcome is selected under direct democracy. A delegation mechanism satisfies positive gain if its gain is strictly positive in some cases, and it satisfies do no harm if its loss (negative gain) is at most $\varepsilon$ for all sufficiently large instances. The main result of Kahng et al. [21] is that local mechanisms can never satisfy these two requirements. Caragiannis and Micha [7] further strengthen this negative result by showing that there are instances where local mechanisms perform much worse than either direct democracy or dictatorship (the most extreme concentration of power).

These theoretical critiques undermine the case for fluid democracy: the benefits of delegation appear to be reversed by concentration of power. However, the negative conclusion relies heavily on modeling assumptions, and has not been borne out by experiments [2]. In this paper, we provide a rebuttal by introducing an arguably more realistic model, in which fluid democracy is able to avoid extreme concentration of power, thereby satisfying both do no harm and positive gain (for suitably defined extensions).

1.1 Our Contributions and Techniques

Our point of departure from the existing literature is the way we model delegation in fluid democracy. Our delegation mechanisms are defined by $M = (q, \varphi)$, where $q : [0, 1] \to [0, 1]$ is a function that maps a voter’s competence to the probability they delegate and $\varphi : [0, 1]^2 \to \mathbb{R}_{\ge 0}$ maps a pair of competencies to a weight. In this model, each voter $i$ votes directly with probability $1 - q(p_i)$ and, conditioned on delegating with probability $q(p_i)$, delegates to voter $j \neq i$ with probability proportional to $\varphi(p_i, p_j)$. Crucially, a voter does not need to “know” the competence of another voter in order to decide whether to delegate; rather, the delegation probabilities are merely influenced by competence in an abstract way captured by $\varphi$. Also note that delegation cycles are possible, and we take a worst-case approach to dealing with them: if the delegations form a cycle, then all voters in the cycle are assumed to be incorrect (vote $0$). (In LiquidFeedback, delegation cycles are, in fact, ignored.)

The most significant difference between our model of delegation and that of Kahng et al. [21] is that in our model each voter has a chance of delegating to any other voter, whereas in their model an underlying social network restricts delegation options. Our model captures a connected world where, in particular, voters may have heard of experts on various issues even if they do not know them personally. Although our model eschews an explicit social network, it can be seen as embedded into the delegation process, where the probability that $i$ delegates to $j$ takes into account the probability that $i$ is familiar with $j$ in the first place.

Another difference between our model and that of Kahng et al. [21] is that we model the competencies $p_1, \dots, p_n$ as being sampled independently from a distribution $\mathcal{D}$. While this assumption is made mostly for ease of exposition, it allows us to avoid edge cases and obtain robust results.

Our goal is to identify delegation mechanisms that satisfy (probabilistic versions of) positive gain and do no harm. Our first technical contribution, in Section 3, is the formulation of general conditions on the mechanism and competence distribution that are sufficient for these properties to hold (Lemma 1). In particular, to achieve the more difficult do no harm property, we present conditions that guarantee the maximum weight accumulated by any voter be sub-linear with high probability and that the expected increase in competence post-delegation be at least a positive constant times the population size. These conditions intuitively prevent extreme concentration of power, and ensure that the representatives post-delegation are sufficiently better than the entire population to compensate for any concentration of power that does happen.

It then suffices to identify mechanisms and distribution classes that verify these conditions. A delegation mechanism and a competence distribution induce a distribution over delegation instances that generates random graphs in ways that relate to well-known graph processes, which we leverage to analyze our mechanisms. Specifically, we introduce three mechanisms, all shown to satisfy do no harm and positive gain under any continuous distribution over competence levels. The first two mechanisms, upward delegation and confidence-based delegation, can be seen as interesting but somewhat restricted case studies, whereas the general continuous delegation mechanism is, as the name suggests, quite general and arguably realistic. Despite the simplicity of the first two mechanisms, the three mechanisms, taken together, reveal the robustness of our approach.

Upward Delegation.

In Section 4, we consider a model according to which the probability of delegating is exogenous and constant across competencies, and delegation can only occur towards voters with strictly higher competence. That is, the probability that any voter $i$ delegates is $q(p_i) = p$, and the weight that any voter $i$ puts on another voter $j$ is $\varphi(p_i, p_j) = \mathbb{1}\{p_j > p_i\}$. This mechanism captures the fact that there might be some reluctance to delegate regardless of the voter’s competence, but does assume that voters act in the interest of society in only delegating to voters that are more competent than they are.

To generate a random graph induced by such a mechanism, one can add a single voter at a time in order of decreasing competence, and allow the voter to either not delegate and create their own disconnected component, or delegate to the creator of any other component with probability proportional to $p$ times the size of the component. This works because delegating to any voter in the previous components is possible (since they have strictly higher competence) and would result in the votes being concentrated in the originator of that component by transitivity. Such a process is exactly the one that generates a preferential attachment graph with positive probability of not attaching to the existing components [30]. We can then show that, with high probability, no component grows too large, so long as $p < 1$. Further, there needs to be a constant improvement by continuity of the competence distribution, which ensures that a positive fraction of voters below a certain competence delegate to a positive fraction of voters with strictly higher competencies.

Confidence-Based Delegation.

In Section 5, we consider a model in which voters delegate with probability decreasing in their competencies and choose someone at random when they delegate. That is, the probability $q(p_i)$ that any voter $i$ delegates is decreasing in $p_i$, and the weight that any voter $i$ gives to any voter $j$ is $\varphi(p_i, p_j) = 1$. In other words, in this model competence does not affect the probability of receiving delegations, only the probability of delegating.

To generate a random graph induced by such a mechanism, one can begin from a random vertex and study the delegation tree that starts at that vertex. A delegation tree is defined as a branching process, where a node $i$’s “children” are the nodes that delegated to node $i$. In contrast to a classical branching process, the probability for a child to be born increases as the number of people who already received a delegation decreases. Nevertheless, we prove that, with high probability, as long as a delegation tree is no larger than $O(\log n)$, our heterogeneous branching process is dominated by a sub-critical graph branching process [1]. We can then conclude that no component has size larger than $O(\log n)$ with high probability. Next, we show that the expected competence among the voters that do not delegate is strictly higher than the average competence. Finally, given that no voter has weight larger than $O(\log n)$, we prove that a small number of voters end up in cycles with high probability. We can therefore show that the conditions of Lemma 1 are satisfied.

General Continuous Delegation.

Finally, we consider a general model in Section 6 where the likelihood of delegating is fixed and the weight assigned to each voter when delegating is increasing in their competence. That is, each voter $i$ delegates with probability $p$, and the weight that voter $i$ places on voter $j$ is $\varphi(p_i, p_j)$, where $\varphi$ is continuous and increases in its second coordinate. Thus, in this model, the delegation distribution is slightly skewed towards more competent voters.

To generate a random graph induced by such a mechanism, we again consider a branching process, but now voters $i$ and $k$ place different weights on $j$ per $\varphi(p_i, p_j)$ and $\varphi(p_k, p_j)$. Therefore, voters have a type that governs their delegation behavior; this allows us to define a multi-type branching process with types that are continuous in $[0, 1]$. The major part of the analysis is a proof that, with high probability, as long as the delegation tree is no larger than $O(\log n)$, our heterogeneous branching process is dominated by a sub-critical Poisson multi-type branching process. To do so, we group the competencies into buckets that partition the segment $[0, 1]$ into small enough pieces. We define a new $\varphi'$ that outputs, for any pair of competencies $(p_i, p_j)$, the maximum weight a voter from $p_i$’s bucket could place on a voter from $p_j$’s bucket. We can show that such a discrete multi-type branching process is sub-critical and conclude that no component has size larger than $O(\log n)$ with high probability. In a similar fashion to Confidence-Based Delegation, we also show that there is an expected increase in competence post-delegation.

1.2 Related work

Our work is most closely related to that of Kahng et al. [21], which was discussed in detail above. It is worth noting, though, that they complement their main negative result with a positive one: when the mechanism can restrict the maximum number of delegations (transitively) received by any voter to , do no harm and positive gain are satisfied. Imposing such a restriction would require a central planner that monitors and controls delegations. Gölz et al. [14] build on this idea: they study fluid democracy systems where voters may nominate multiple delegates, and a central planner chooses a single delegate for each delegator in order to minimize the maximum weight of any voter.

Similarly, Brill and Talmon [6] propose allowing voters to specify ordinal preferences over delegation options, and possibly restricting or modifying delegations in a centralized way. Caragiannis and Micha [7], and then Becker et al. [2] also consider central planners; they show that, for given competencies, the problem of choosing among delegation options to maximize the probability of a correct decision is hard to approximate. In any case, implementing these proposals would require a fundamental rethinking of the practice of fluid democracy. By contrast, our positive results show that decentralized delegation mechanisms are inherently self-regulatory, which supports the effectiveness of the current practice of fluid democracy.

More generally, there has been a significant amount of theoretical research on fluid democracy in recent years. To give a few examples: Green-Armytage [15] studies whether it is rational for voters to delegate their vote from a utilitarian viewpoint; Christoff and Grossi [8] examine a similar question but in the context of voting on logically interdependent propositions; Bloembergen et al. [3] and Zhang and Grossi [31] study fluid democracy from a game-theoretic viewpoint.

Further afield, fluid democracy is related to another paradigm called proxy voting, which dates back to the work of Miller [26]. Proxy voting allows voters to nominate representatives that have been previously declared. Cohensius et al. [10] study utilitarian voters that vote for the representative with the closest platform to theirs; they prove that the outcome of an election with proxy votes yields platforms closer to the median platform of the population than classical representative democracy. Their result provides a different viewpoint on the value of delegation.

2 Model

There is a set of $n$ voters, denoted $[n] = \{1, \dots, n\}$. We assume voters are making a decision on a binary issue and there is a correct alternative and an incorrect alternative. Each voter $i$ has a competence level $p_i \in [0, 1]$, which is the probability that $i$ votes correctly. We denote the vector of competencies by $\vec{p}_n = (p_1, \dots, p_n)$. When $n$ is clear from the context, we sometimes drop it from the notation.

Delegation graphs

A delegation graph $G_n = ([n], E)$ on $n$ voters is a directed graph with voters as vertices and a directed edge $(i, j) \in E$ denoting that $i$ delegates their vote to $j$. Again, if $n$ is clear from context, we occasionally drop it from the notation. The outdegree of a vertex in the delegation graph is at most $1$, since each voter can delegate to at most one person. Voters that do not delegate have no outgoing edges. In a delegation graph $G_n$, the delegations received by a voter $i$, $\mathrm{dels}_i(G_n)$, is defined as the total number of people that (transitively) delegated to $i$ in $G_n$ (i.e., the total number of ancestors of $i$ in $G_n$). The weight of a voter $i$, $\mathrm{weight}_i(G_n)$, is $0$ if $i$ delegates, and $\mathrm{dels}_i(G_n) + 1$ otherwise. We define $\text{max-weight}(G_n) = \max_{i \in [n]} \mathrm{weight}_i(G_n)$ to be the largest weight of any voter and define $\text{total-weight}(G_n) = \sum_{i=1}^{n} \mathrm{weight}_i(G_n)$. Since each vote is counted at most once, we have that $\text{total-weight}(G_n) \le n$. However, note that if delegation edges form a cycle, then the weight of the voters on the cycle and voters delegating into the cycle are all set to $0$ and hence will not be counted. In particular, this means that $\text{total-weight}(G_n)$ may be strictly less than $n$. (This is a worst-case approach where cycles can only hurt the performance of fluid democracy, since this assumption is equivalent to assuming that all voters on the cycles vote incorrectly.)
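The weight definition, including the worst-case treatment of cycles, can be sketched in a few lines of Python. The representation of the graph as a list `delegate_to` (with `None` for voters who vote directly) and the function name are assumptions made for illustration only.

```python
def compute_weights(delegate_to):
    """Return the weight of every voter in a delegation graph.

    delegate_to[i] is the voter that i delegates to, or None if i votes
    directly. Votes that end up on (or flow into) a cycle are discarded,
    matching the worst-case convention in the text.
    """
    n = len(delegate_to)
    weights = [0] * n
    for i in range(n):
        seen = set()
        j = i
        # Follow the delegation chain until a direct voter or a repeat.
        while delegate_to[j] is not None and j not in seen:
            seen.add(j)
            j = delegate_to[j]
        if delegate_to[j] is None:
            weights[j] += 1  # i's vote is cast by the direct voter j
        # Otherwise the chain entered a cycle and the vote is not counted.
    return weights

# Voters 0 -> 1 -> 2, voter 2 votes directly.
chain = compute_weights([1, 2, None])
# Voters 0 and 1 form a cycle, voter 2 delegates into it, voter 3 votes.
cyclic = compute_weights([1, 0, 0, None])
print(chain, max(chain), sum(chain))
print(cyclic, max(cyclic), sum(cyclic))
```

This simple version walks each chain separately, so it takes quadratic time in the worst case; memoizing the end of each chain would make it linear, at the cost of a slightly longer listing.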

Delegation instances

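We call the tuple $(\vec{p}_n, G_n)$ a delegation instance, or simply an instance, on $n$ voters. Let $V_i = 1$ if voter $i$ would vote correctly if $i$ did vote, and $V_i = 0$ otherwise. Fixed competencies $\vec{p}_n$ induce a probability measure $\mathbb{P}_{\vec{p}_n}$ over the possible binary votes $V_i$, where $V_i \sim \mathrm{Bern}(p_i)$. Given votes $V_1, \dots, V_n$, we let $X^D_n$ be the number of correct votes under direct democracy, that is, $X^D_n = \sum_{i=1}^{n} V_i$. We let $X^F_{G_n}$ be the number of correct votes under fluid democracy with delegation graph $G_n$, that is, $X^F_{G_n} = \sum_{i=1}^{n} \mathrm{weight}_i(G_n) \cdot V_i$. The probabilities that direct democracy and fluid democracy are correct are $\mathbb{P}_{\vec{p}_n}[X^D_n > n/2]$ and $\mathbb{P}_{\vec{p}_n}[X^F_{G_n} > n/2]$, respectively.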

Gain of a delegation instance

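We define the gain of an instance $(\vec{p}_n, G_n)$ as

$$\mathrm{gain}(\vec{p}_n, G_n) = \mathbb{P}_{\vec{p}_n}\left[X^F_{G_n} > n/2\right] - \mathbb{P}_{\vec{p}_n}\left[X^D_n > n/2\right].$$

In words, it is the difference between the probability that fluid democracy is correct and the probability that direct democracy (majority voting) is correct.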
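For small instances the gain can be computed exactly rather than estimated by simulation. The sketch below builds the distribution of the weighted number of correct votes by dynamic programming; the function names and the three-voter example are hypothetical, and the strict "more than n/2" threshold is assumed for both rules.

```python
def prob_correct(weights, competences):
    """Exact probability that the weighted sum of correct votes exceeds n/2."""
    n = len(competences)
    dist = {0: 1.0}  # maps partial weighted sum -> probability
    for w, p in zip(weights, competences):
        if w == 0:
            continue  # voters with weight 0 never affect the outcome
        new_dist = {}
        for s, pr in dist.items():
            new_dist[s] = new_dist.get(s, 0.0) + pr * (1 - p)   # votes wrong
            new_dist[s + w] = new_dist.get(s + w, 0.0) + pr * p  # votes right
        dist = new_dist
    return sum(pr for s, pr in dist.items() if s > n / 2)

def gain(weights, competences):
    """Probability fluid democracy is correct minus that of direct democracy."""
    direct = prob_correct([1] * len(competences), competences)
    return prob_correct(weights, competences) - direct

# Hypothetical instance: two voters of competence 0.6 delegate to a 0.9 voter.
print(gain([3, 0, 0], [0.9, 0.6, 0.6]))
```

In this example fluid democracy is correct with probability 0.9 and direct democracy with probability 0.792, so delegation helps even though all the weight sits with one voter; with many voters the comparison can go the other way, which is what the do no harm property guards against.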

Randomization over delegation instances

In general, we assume that both competencies and delegations are chosen randomly. Each voter’s competence $p_i$ is sampled i.i.d. from a fixed distribution $\mathcal{D}$ with support contained in $[0, 1]$. Delegations will be chosen according to a mechanism $M$. A mechanism $M = (q, \varphi)$ is composed of two parts. The first is a function $q : [0, 1] \to [0, 1]$ that maps competencies to the probability that the voter delegates. The second is a function $\varphi : [0, 1]^2 \to \mathbb{R}_{\ge 0}$ that maps pairs of competencies to a weight. A voter $i$ with competence $p_i$ will choose how to delegate as follows:

  • With probability $1 - q(p_i)$ they do not delegate.

  • With probability $q(p_i)$, $i$ delegates; $i$ places weight $\varphi(p_i, p_j)$ on each voter $j \neq i$ and randomly samples another voter to delegate to, with probability proportional to these weights. In the degenerate case where $\varphi(p_i, p_j) = 0$ for all $j \neq i$, we assume that $i$ does not delegate.

A competence distribution $\mathcal{D}$, a mechanism $M$, and a number of voters $n$ induce a probability measure $\mathbb{P}_{\mathcal{D}, M, n}$ over all instances $(\vec{p}_n, G_n)$ of size $n$.
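The sampling procedure above translates fairly directly into code. The following is a sketch under the assumption that a mechanism is passed as two Python callables, `q` and `phi`; these names, the list-based graph representation, and the uniform competence distribution in the usage lines are illustrative choices.

```python
import random

def sample_delegation_graph(competences, q, phi, rng):
    """Sample who each voter delegates to (None means voting directly).

    q(p_i) is the probability that a voter of competence p_i delegates;
    phi(p_i, p_j) is the weight that voter i places on voter j.
    """
    n = len(competences)
    delegate_to = [None] * n
    for i, p_i in enumerate(competences):
        if rng.random() >= q(p_i):
            continue  # with probability 1 - q(p_i), voter i does not delegate
        others = [j for j in range(n) if j != i]
        weights = [phi(p_i, competences[j]) for j in others]
        if sum(weights) == 0:
            continue  # degenerate case: no positive weight, so no delegation
        delegate_to[i] = rng.choices(others, weights=weights, k=1)[0]
    return delegate_to

# Usage: upward delegation with delegation probability 0.5 (assumed value)
# and competences drawn uniformly from [0, 1] (assumed distribution).
rng = random.Random(42)
competences = [rng.random() for _ in range(10)]
upward_phi = lambda p_i, p_j: 1.0 if p_j > p_i else 0.0
print(sample_delegation_graph(competences, lambda p: 0.5, upward_phi, rng))
```

Other mechanisms are obtained by swapping the two callables, for example a decreasing `q` with a constant `phi` for confidence-based delegation.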

We can now redefine the do no harm (DNH) and positive gain (PG) properties from Kahng et al. [21] in a probabilistic way.

Definition 1 (Probabilistic do no harm).

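A mechanism $M$ satisfies probabilistic do no harm with respect to a class $\mathfrak{D}$ of distributions if, for all distributions $\mathcal{D} \in \mathfrak{D}$ and all $\varepsilon, \delta > 0$, there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$,

$$\mathbb{P}_{\mathcal{D}, M, n}\left[\mathrm{gain}(\vec{p}_n, G_n) \ge -\varepsilon\right] \ge 1 - \delta.$$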

Definition 2 (Probabilistic positive gain).

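A mechanism $M$ satisfies probabilistic positive gain with respect to a class $\mathfrak{D}$ of distributions if there exists a distribution $\mathcal{D} \in \mathfrak{D}$ such that for all $\varepsilon, \delta > 0$, there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$,

$$\mathbb{P}_{\mathcal{D}, M, n}\left[\mathrm{gain}(\vec{p}_n, G_n) \ge 1 - \varepsilon\right] \ge 1 - \delta.$$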

3 Core Lemma

In this section, we prove the following key lemma, which provides sufficient conditions for a mechanism to satisfy probabilistic do no harm and probabilistic positive gain with respect to a class of distributions. This lemma will form the basis of all of our later results.

Lemma 1.

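If $M$ is a mechanism, $\mathfrak{D}$ a class of distributions, and for all distributions $\mathcal{D} \in \mathfrak{D}$, there is an $\alpha \in (0, 1)$ and $C : \mathbb{N} \to \mathbb{N}$ with $C(n) \in o(n)$ such that

$$\mathbb{P}_{\mathcal{D}, M, n}\left[\text{max-weight}(G_n) \le C(n)\right] = 1 - o(1), \tag{1}$$

$$\mathbb{P}_{\mathcal{D}, M, n}\left[\sum_{i=1}^{n} \mathrm{weight}_i(G_n) \cdot p_i - \sum_{i=1}^{n} p_i \ge 2\alpha n\right] = 1 - o(1), \tag{2}$$

then $M$ satisfies probabilistic do no harm. If in addition, there exists a distribution $\mathcal{D} \in \mathfrak{D}$ and an $\alpha \in (0, 1)$ such that

$$\mathbb{P}_{\mathcal{D}, M, n}\left[\sum_{i=1}^{n} p_i + \alpha n \le \frac{n}{2} \le \sum_{i=1}^{n} \mathrm{weight}_i(G_n) \cdot p_i - \alpha n\right] = 1 - o(1), \tag{3}$$

then $M$ satisfies probabilistic positive gain.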

In words, condition (1) ensures that, with high probability, the largest weight accumulated by a voter post-delegation is sub-linear. This condition prevents extreme concentration of power. Condition (2) ensures that, with high probability, the (weighted) average competence post-delegation is at least a positive constant more than the average competence pre-delegation. This condition guarantees that representatives post-delegation are sufficiently more competent than the entire population to compensate for any concentration of power that does occur. Finally, condition (3) ensures that there exists a distribution for which, with high probability, the average competence pre-delegation is at most $1/2$ minus a constant, while the average competence post-delegation is at least $1/2$ plus a constant. This condition suffices to guarantee that the probability that fluid democracy is correct goes to $1$ while the probability that direct democracy is correct goes to $0$.

Throughout many of the proofs, we will make use of the following well-known concentration inequality [18]:

Lemma 2 (Hoeffding’s Inequality).

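Let $X_1, \dots, X_n$ be independent, bounded random variables with $X_i \in [a_i, b_i]$ for all $i$, where $-\infty < a_i \le b_i < \infty$. Then

$$\mathbb{P}\left[\sum_{i=1}^{n} X_i - \mathbb{E}\left[\sum_{i=1}^{n} X_i\right] \ge t\right] \le \exp\left(-\frac{2t^2}{\sum_{i=1}^{n} (b_i - a_i)^2}\right)$$

and

$$\mathbb{P}\left[\sum_{i=1}^{n} X_i - \mathbb{E}\left[\sum_{i=1}^{n} X_i\right] \le -t\right] \le \exp\left(-\frac{2t^2}{\sum_{i=1}^{n} (b_i - a_i)^2}\right)$$

for all $t > 0$.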
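To get a feel for how quickly this bound decays, it can be evaluated numerically. The sketch below assumes the sum form of Hoeffding's inequality, with deviation `t` and per-variable ranges `(a_i, b_i)`; the function name and the choice of a deviation of 0.1·n are hypothetical.

```python
import math

def hoeffding_tail(t, ranges):
    """Hoeffding upper bound on P[S - E[S] >= t] for a sum S of independent
    variables, where ranges lists the interval (a_i, b_i) of each variable.
    Assumes at least one interval has positive width."""
    return math.exp(-2 * t ** 2 / sum((b - a) ** 2 for a, b in ranges))

# For n binary votes, each in [0, 1], a deviation of alpha * n from the mean
# has probability at most exp(-2 * alpha^2 * n).
alpha = 0.1
for n in (100, 1000, 10000):
    print(n, hoeffding_tail(alpha * n, [(0, 1)] * n))
```

For unweighted votes the bound shrinks exponentially in the number of voters, which is why a constant gap between the average competence and 1/2 is enough to pin down the outcome of direct democracy.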

Proof of Lemma 1.

We establish the two properties separately.

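Probabilistic do no harm: We first show that a mechanism $M$ that satisfies conditions (1) and (2) satisfies probabilistic do no harm. Fix an arbitrary competence distribution $\mathcal{D} \in \mathfrak{D}$ and let $\alpha$ and $C$ be such that (1) and (2) are satisfied. Without loss of generality, suppose that $C(n) \le n$ for all $n$, as replacing any larger values of $C(n)$ with $n$ will not affect (1) (since $\text{max-weight}(G_n) \le n$ for all graphs $G_n$ on $n$ vertices). Fix $\varepsilon, \delta > 0$. We must identify some $n_0$ such that for all $n \ge n_0$, $\mathbb{P}_{\mathcal{D}, M, n}[\mathrm{gain}(\vec{p}_n, G_n) \ge -\varepsilon] \ge 1 - \delta$.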

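We will begin by showing there exists $n_1$ such that for all instances $(\vec{p}_n, G_n)$ on $n \ge n_1$ voters, if both

$$\text{max-weight}(G_n) \le C(n) \tag{4}$$

$$\sum_{i=1}^{n} \mathrm{weight}_i(G_n) \cdot p_i - \sum_{i=1}^{n} p_i \ge 2\alpha n \tag{5}$$

then

$$\mathrm{gain}(\vec{p}_n, G_n) \ge -\varepsilon. \tag{6}$$

Since (4) and (5) each hold with probability $1 - o(1)$ by (1) and (2), for sufficiently large $n$, say $n \ge n_2$, they will each occur with probability at least $1 - \delta/2$. Hence, by a union bound, for all $n \ge n_2$, they both occur with probability at least $1 - \delta$. By taking $n_0 = \max(n_1, n_2)$, this implies that probabilistic do no harm is satisfied.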

We now prove that, for sufficiently large , (4) and (5) imply (6). Fix an instance on voters satisfying (4) and (5). We first show that for sufficiently large , we have

(7)

and

(8)

Assume that (7) and (8) hold. Note that (5) implies either or . If , then (7) implies that

On the other hand, if , then (8) implies that

Hence, in either case, (6) holds. It remains to prove that (7) and (8) hold for sufficiently large . For (7), this follows directly from Hoeffding’s inequality (Lemma 2). To prove (8), first note that, as shown in Kahng et al. [21],

where the first inequality holds because is upper bounded by , the second because with each so the value is maximized by setting as many terms to as possible, and the final inequality holds because .

Hence, by Chebyshev’s inequality,

This bound is because the numerator is and the denominator is . Hence, for sufficiently large , it will be strictly less than , which implies that (8) holds.

Probabilistic positive gain: Fix a distribution and an such that (3) holds. We want to show that satisfies probabilistic positive gain. Since , it also satisfies (1) for some . We show below that there exists an such that all instances with voters satisfying (4) for which , we have that . As with the DNH part of the proof, since the events of (1) and (3) each hold with probability , for sufficiently large , say , they each occur with probability at least . Hence, by a union bound, for all , they both occur with probability . For , probabilistic positive gain is satisfied.

It remains to show that if (1) and (3) hold for a specific instance , then for sufficiently large . Since , (7) and (8) are both satisfied for sufficiently large . When

is satisfied as well, we get that and , so is immediate. ∎

In the following sections, we investigate natural delegation mechanisms and identify conditions such that the mechanisms satisfy probabilistic do no harm and probabilistic positive gain. In all instances, we will invoke Lemma 1 after showing that its sufficient conditions are satisfied.

4 Strictly Upward Delegation Mechanism

We now turn to the analysis of a simple mechanism that assumes that voters either do not delegate with fixed exogenous probability, or delegate to voters that have a competence greater than their own.

Formally, for a fixed $p \in (0, 1)$ we let $M^U_p = (q, \varphi)$ be the mechanism consisting of $q(p_i) = p$ for all $p_i \in [0, 1]$, and $\varphi(p_i, p_j) = \mathbb{1}\{p_j > p_i\}$ for all $p_i, p_j \in [0, 1]$. That is, voter $i$ delegates with fixed probability $p$ and puts equal weight on all the more competent voters. In other words, if voter $i$ delegates, then $i$ does so to a more competent voter chosen uniformly at random. Note that a voter with maximal competence will place weight $0$ on all other voters, and hence is guaranteed not to delegate. We refer to $M^U_p$ as the Upward Delegation Mechanism parameterized by $p$.

Theorem 1 (Upward Delegation Mechanism).

For all $p \in (0, 1)$, $M^U_p$ satisfies probabilistic do no harm and probabilistic positive gain with respect to the class of all continuous distributions.

Proof.

To prove the theorem, we will prove that the Upward Delegation Mechanism with respect to the class of all continuous distributions satisfies (1) and (2), which implies that the mechanism satisfies probabilistic do no harm by Lemma 1. Later, we demonstrate a continuous distribution that satisfies (3), implying the mechanism satisfies probabilistic positive gain.

Upward Delegation satisfies (1)

We show there exists $C(n) \in o(n)$ such that the maximum weight satisfies $\text{max-weight}(G_n) \le C(n)$ with high probability, that is, such that (1) holds. Fix some sampled competencies $\vec{p}_n$. Recall that each entry $p_i$ in $\vec{p}_n$ is sampled i.i.d. from $\mathcal{D}$, a continuous distribution. Hence, almost surely, no two competencies are equal. From now on, we condition on this probability $1$ event. Now consider sampling the delegation graph $G_n$. By the design of the mechanism $M^U_p$, we can consider a random process for generating $G_n$ that is isomorphic to sampling according to $M^U_p$ as follows: first, order the competencies $p_1 > p_2 > \dots > p_n$ (note that such a strict order is possible by our assumption that all competencies are different) and rename the voters such that voter $i$ has competence $p_i$; then construct $G_n$ iteratively by adding the voters one at a time in decreasing order of competencies, voter $1$ at time $1$, voter $2$ at time $2$, and so on.

We start with the voter with the highest competence, voter $1$. By the choice of $\varphi$, voter $1$ places weight $0$ on every other voter and hence by definition does not delegate. This voter forms the first component in the graph $G_n$, which we call $C_1$. Then, we add voter $2$, who either delegates to voter $1$, joining component $C_1$, with probability $p$, or starts a new component $C_2$ with probability $1 - p$. Next, we add voter $3$. If $2 \in C_1$ (that is, if $2$ delegated to $1$), $3$ either delegates to $1$ (either directly or through $2$ by transitivity) with probability $p$ or starts a new component $C_2$. If $2 \in C_2$, then $3$ either delegates to $1$ with probability $p/2$ and is added to $C_1$, or delegates to $2$ with probability $p/2$ and is added to $C_2$, or starts a new component $C_3$. In general, at time $t$, if there are $k$ existing components $C_1, \dots, C_k$, voter $t$ either joins each component $C_j$ with probability $p \cdot |C_j| / (t - 1)$ or starts a new component $C_{k+1}$ with probability $1 - p$. To construct $G_n$, we run this process for $n$ steps.

This is precisely the model introduced by Simon [30]. It has been studied under the name infinite Polya urn process [9] and is considered a generalization of the preferential attachment model (with a positive probability of not attaching to the existing graph).
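The component-growth process is easy to simulate, which gives some intuition for why no single component takes over when the delegation probability is below 1. This is a minimal sketch: the function name and the parameter values in the usage lines are assumptions, and component sizes are tracked directly rather than building the full delegation graph.

```python
import random

def simulate_component_sizes(n, p, rng):
    """Simulate the component sizes of the upward delegation process.

    Voters arrive in decreasing order of competence. Each new voter joins an
    existing component with total probability p, choosing among components
    proportionally to their size, and otherwise starts a new component.
    """
    sizes = [1]  # the most competent voter always starts the first component
    for _ in range(2, n + 1):
        if rng.random() < p:
            j = rng.choices(range(len(sizes)), weights=sizes, k=1)[0]
            sizes[j] += 1
        else:
            sizes.append(1)
    return sizes

rng = random.Random(7)
for n in (1000, 10000):
    sizes = simulate_component_sizes(n, 0.5, rng)
    print(n, len(sizes), max(sizes), max(sizes) / n)
```

The size of a component equals the weight of the voter who started it, so the largest entry of `sizes` is the maximum weight of the sampled instance.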

Let be the size of the th component, , at time . In general, our approach will be to show that each component remains below some function by time with high enough probability so that we can union bound over all possible (there can never be more than components in the graph). That is, we will show

for some to be chosen later. Hence, it will be useful to consider this process more formally from the perspective of the th component, . The th component is “born” at some time when the th person chooses to not delegate, at which point (prior to this, ). More specifically, the first component is guaranteed to be born at time and for all other , it will be born at time with probability , although these exact probabilities will be unimportant for our analysis. Once born, we have the following recurrence on describing the probability will be chosen at time :

Let be the process for the size of component that is born at time . That is, , and for , follows the exact same recurrence as . Note that since the th component can only be born at time or later, we have that stochastically dominates for all . Hence, it suffices to show that

(9)

Choose to be a constant such that (say ); note that . Choose another constant such that . This additionally implies . Finally, choose . We show that the probability that any component is of size greater than by time (when the delegation process completes) is negligible. That is, we show that by showing that (9) holds for .

We split our analysis into two parts: the first considers the first components, while the second considers the last components.

We first show that .

Recall that we have , and the following recurrence for all :

Our first goal is to show that the expectation of is upper bounded by

(10)

for all , where represents the Gamma function.

By the tower property of expectation, for all ,

Thus,

where the first four equalities follow from the recursive formula for , the fifth because , the sixth and seventh by rearranging terms, the eighth uses the fact that for all , and the last uses the fact that for all . This proves (10).

We can now use Markov’s inequality to show that for all ,

Hence,

What remains to be shown is that

To do this, first note that . Indeed, the fact that

follows from Gautschi’s inequality [13], and both the upper and lower bounds are . Because is an increasing function of , we have that

(11)

Further,

Hence, the left-hand side of (11) is . By our choice of , , so this implies that it is , as desired.

Now consider the final components. We will prove that . Since stochastically dominates for all , this implies that for all . Hence,

To do this, we compare the process to another process, .

We define , and for , take to satisfy the following recurrence:

This is identical to the recurrence except without the factor. Hence, clearly stochastically dominates . For convenience in calculation, we will instead focus on bounding which itself stochastically dominates .

Next, note that the process is in fact isomorphic to the following classic Polya’s urn process. We begin with two urns, one with a single ball and the other with balls. At each time, a new ball is added to one of the two urns with probability proportional to the urn size. The process is isomorphic to the size of the one-ball urn after steps. Classic results tell us that for fixed starting urn sizes $a$ and $b$, as the number of steps grows large, the proportion of balls in the $a$-urn follows a $\mathrm{Beta}(a, b)$ distribution [25, 11, 29, 20, 24].

The mean and variance of such a Beta distribution would be sufficient to prove our necessary concentration bounds; however, for us, we need results after exactly

steps, not simply in the limit. Hence, we will be additionally concerned with the speed of convergence to this Beta distribution.
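The two-urn process can be simulated directly, which makes it possible to inspect the distribution after a finite number of steps rather than only in the limit. In this sketch the function name, the starting sizes, and the number of steps and repetitions are arbitrary illustrative choices.

```python
import random

def polya_urn(a, b, steps, rng):
    """Run a two-urn Polya process: each new ball joins an urn with
    probability proportional to that urn's current size."""
    for _ in range(steps):
        if rng.random() < a / (a + b):
            a += 1
        else:
            b += 1
    return a, b

# Proportion of balls in the smaller urn after 200 steps, over many runs.
# The expected proportion stays at its starting value a / (a + b) at every
# step; the spread of the proportions approaches that of Beta(a, b).
a0, b0, steps = 1, 4, 200
props = []
for seed in range(2000):
    a, b = polya_urn(a0, b0, steps, random.Random(seed))
    props.append(a / (a + b))
print(sum(props) / len(props), a0 / (a0 + b0))
```

The empirical mean should sit close to the starting proportion, while individual runs vary widely, which is why the analysis needs control over the whole distribution and not just its mean.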

Let and . From Janson [19], we know that the rate of convergence is such that, for any

(12)

where is the minimal metric, defined as

which can be thought of as the minimal norm over all possible couplings between and . For our purposes, the only fact about the metric we will need is that where is the identically random variable. Since is in fact a metric, the triangle inequality tells us that , so, combining with (12), we have that

(13)

for all .

Note that since ,

and

Given these results, we are ready to prove that is smaller than with probability Precisely, we want to show that By Chebyshev’s inequality,

(13) with along with the fact that and are always nonnegative implies that . Hence, since . We can therefore write:

(14)

(13) with implies that