
# Leximax Approximations and Representative Cohort Selection

Finding a representative cohort from a broad pool of candidates is a goal that arises in many contexts, such as choosing governing committees and consumer panels. While there are many ways to define the degree to which a cohort represents a population, a very appealing solution concept is lexicographic maximality (leximax), which offers a natural (Pareto-optimal-like) interpretation: the utility of no population can be increased without decreasing the utility of a population that is already worse off. However, which solutions are leximax can depend heavily on small variations in the utility of certain groups. In this work, we explore new notions of approximate leximax solutions with three distinct motivations: better algorithmic efficiency, exploiting significant utility improvements, and robustness to noise. Among other definitional contributions, we give a new notion of an approximate leximax that satisfies a similarly appealing semantic interpretation and relate it to algorithmically feasible approximate leximax notions. When group utilities are linear over cohort candidates, we give an efficient polynomial-time algorithm for finding a leximax distribution over cohort candidates in the exact as well as in the approximate setting. Furthermore, we show that finding an integer solution to leximax cohort selection with linear utilities is NP-hard.


## 1 Introduction

In many fairness-related settings, we seek to select an outcome that does not disproportionately harm any key subgroup. Speaking in terms of group utilities, a fair solution would ideally provide every key subgroup with high utility. Unfortunately, such a goal may be impossible to achieve if the utilities derived by subgroups from any potential solutions are in opposition. Moreover, other goals such as seeking to equalize utilities across groups may artificially constrain the utility of certain groups in order to match some group with uniformly low utility.

The classic maximin objective, which seeks to output solutions that maximize the utility of the worst-off group, has been widely studied as a goal that can circumvent these potential pitfalls by seeking to achieve the best possible outcome for the worst-off group. This results in a set of solutions that optimize the outcome for the worst-off group, but may still vary quite a bit with respect to the second-worst-off group, third-worst-off group, etc. Lexicographically maximal solutions strengthen the maximin objective by requiring that the utility of the second-worst-off group be maximized subject to the worst-off-group achieving its maximin value, the third-worst-off group be maximized subject to the worst-off and second-worst-off values, and so on. This goal intuitively tells us that a lexicographically maximal solution gives the best-possible utility guarantee we can give for each group without harming another group.

Lexicographic maximality (which we refer to as leximax, but is sometimes referred to in the literature as leximin) has been widely studied in the context of allocations [16, 18, 21]. Recently, Diana, Gill, Globus-Harris, Kearns, Roth, and Sharifi-Malvajerdi [10] explored applying the objective to the contemporary fairness context of loss minimization. In this paper, motivated by the goal of selecting a representative cohort from a group of candidates, we generalize the approach of [10] to the goal of selecting a solution that achieves lexicographically maximal utilities for a set of key subgroups.

Our contributions fall into two main categories: definitional, where we explore useful variants of the leximax objective and their relations in the general setting of selecting a leximax solution from a set of potential solutions, and algorithmic, in which we investigate how to efficiently find exact leximax solutions as well as different variants in the specific context of selecting representative cohorts. We provide an overview of definitional contributions in Section 1.1, followed by an overview of the cohort selection context and resulting algorithms in Section 1.2.

### 1.1 Approximations of Lexicographically Maximal Solutions

Diana et al. [10] define an approximate notion of lexicographic maximality for which they construct oracle-efficient algorithms. Their notion is influenced by an algorithmic approach to calculating leximax solutions in that it assumes the maximal values of the worst-off group, second-worst-off group, etc. are calculated recursively based on whatever estimates came before. The definition assumes some small amount of error when calculating the maximin utility value, and then considers how this error would propagate to the second-worst-off group’s maximum value, then considers how additional errors around the second-worst-off group’s maximum value together with errors from the worst-off group’s maximum value might propagate to the third-worst-off group, and so on.

One of the appealing aspects of leximax solutions is that they offer a simple semantic interpretation that explains the sort of fairness guarantees such solutions provide: given a leximax solution, any alternative solution that improves the utility of some group must also decrease the utility of some worse-off group (Proposition 2). While the approximation notion presented by [10] is very natural, they also show that such approximate solutions may greatly diverge from exact solutions (see Example 3 for details), meaning that they may also diverge from this appealing semantic interpretation.

Ideally, we’d like a well-defined notion of approximation that extends the semantic interpretation of leximax and relates to the algorithmically achievable notion presented by [10]. However, we find that such a definition is somewhat difficult to pin down. Many natural relaxations of the semantic definition result in notions of approximation where either no solutions are guaranteed to satisfy the notion or the notions themselves may not imply a meaningful fairness guarantee that is analogous to that offered by leximax solutions. Developing a meaningful notion of approximation is exactly the challenge that this paper addresses.

We provide a relaxation of the semantic definition that we term $\epsilon$-tradeoff leximax (Definition 3.1.2) that is always guaranteed to exist. While it is not equivalent to the notion presented in [10], we show in Theorem 3.1.2 that it is equivalent to a stronger variant of their definition that we call $\epsilon$-recursive leximax (Definition 3.1.2). The algorithms of [10] have the potential to slightly mis-estimate the maximin values for different groups, and therefore are only guaranteed to output approximate leximax solutions. The types of mis-estimation that may arise are actually more constrained than the full class of errors their weaker notion of approximation allows for. In particular, solutions output by their algorithms actually satisfy our stronger notion of $\epsilon$-recursive leximax.

Past explorations of lexicographic maximality have mostly concentrated on finding exact leximax solutions. In the design of algorithms, approximations are usually viewed as alternative solutions that are “almost as good” as the exact solution and that are computed in settings where it is difficult to efficiently find exact solutions. In this paper we suggest that in some cases, we may prefer an approximate notion of lexicographic maximality to its exact counterpart. In particular, exact leximax solutions may be highly dependent on small variations in the utility of less-well-off groups. For example, a solution where all groups receive 0.01 utility would be preferred by the exact leximax objective over a solution where one group receives 0 utility and all others receive a utility of 1, even though this second solution gets much higher utility for the majority of groups while only decreasing the utility of a single group by a tiny amount. We explore well-defined ways in which approximation can benefit stakeholders and suggest a notion of approximation, stronger than the $\epsilon$-recursive leximax notion mentioned above, that we term $\epsilon$-significant recursive leximax approximation (Definition 3.2). This notion ignores tiny variations in utility and identifies only solutions that are leximax due to significant increases in utility. In Theorem 4, we give a more formal characterization of the benefits drawn from considering $\epsilon$-significant recursive leximax solutions rather than just any $\epsilon$-recursive leximax solution.

A third motivator for our study of leximax approximations is how robust leximax solutions may be to small amounts of noise in the estimates of group utility. We show that when calculated in a noisy setting, a solution satisfying our relaxed semantic notion ($\epsilon$-tradeoff leximax) is not guaranteed to still be $\epsilon$-tradeoff leximax; however, it is guaranteed to satisfy the weaker notion of approximation defined in [10]. On the other hand, in Lemma 3.3 we show that we can define a stronger variant of the semantic notion that guarantees a solution will be $\epsilon$-tradeoff leximax in the noisy setting, but it has the disadvantage that such solutions may not always exist. We also examine noise in the context of $\epsilon$-significant recursive leximax solutions, and show that when such solutions are calculated in the presence of noise, they are somewhat robust to it, as they imply a slightly weakened variant of $\epsilon$-significance (Lemma 3.3).

Figure 1 summarizes the various notions of approximate lexicographical maximality and how they relate to one another. All of our approximate notions are defined with respect to an arbitrary class of solutions from which we’d like to pick a leximax solution. This allows our new definitions to be applied in the deterministic setting, where each solution would represent a particular cohort, or a randomized setting, where each solution corresponds to a distribution over cohorts and utilities are given in expectation.

### 1.2 Algorithms for Leximax Cohort Selection

In data selection, recruiting, and civic participation settings where a representative cohort is desired, the goal of representation is juxtaposed with the constraint of selecting a small representative set. There can be tension between selecting a cohort small enough for the resources available but large enough to represent as much of the population as possible. A lexicographically maximal solution is particularly salient in a representative cohort problem because it guarantees inclusion for the worst-off-groups while optimizing for the utility of all groups. We consider a model where how well each group or individual in the population is represented by a cohort candidate is given by a utility function. While there are many different ways a cohort or committee in power might make decisions or influence outcomes, we consider a linear setting where the utility a group derives from a cohort is the sum of utilities derived from each member of the cohort. Approximate notions of lexicographical maximality are of particular interest in this setting since estimating utilities that describe representativeness is difficult and might be noisy in practice.

Diana et al. [10] give a convex formulation of approximate lexicographical fairness and an oracle-efficient algorithm to solve general leximax convex programs. For our cohort selection setting specifically, we leverage the linearity across decision variables to find a polynomial-time algorithm (Algorithm 1) that can calculate both exact leximax solutions and the two approximate variants we consider, $\epsilon$-tradeoff leximax and $\epsilon$-significant recursive leximax (with no external oracle needed for the calculation).

The linearity of utilities across cohort members and the recursive definition of leximax give us a sequence of linear programs where the number of variables is linear in the size of the candidate pool and the number of constraints is exponential in the number of groups. In the $i$-th linear program, we maximize the sum of utilities over all size-$i$ sets of groups, which gives us an exponential number of constraints, rendering the linear program too big to solve via generic LP solvers. We circumvent this difficulty by creating a separation oracle (Algorithm 2) which efficiently tests the sum of utilities of the worst-off groups, giving us a polynomial-time algorithm overall (Lemma 1). We can use the same approach to efficiently find $\epsilon$-tradeoff leximax or $\epsilon$-significant recursive leximax solutions by modifying the lower bound constraints on the sum of group utilities.
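To see why a constraint family ranging over exponentially many sets of groups can still be tested quickly, note that among all sets of a given size, the one with the smallest total utility consists of the worst-off groups. The sketch below is not the paper's Algorithm 2; it only illustrates this observation. The function name `most_violated_subset` and the parameters `utilities`, `size`, and `bound` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

def most_violated_subset(
    utilities: Sequence[float], size: int, bound: float
) -> Optional[Tuple[List[int], float]]:
    """Return the size-`size` set of groups with the smallest utility sum
    if that sum falls below `bound`; return None if every set meets it."""
    # Sorting group indices by utility puts the worst-off groups first.
    order = sorted(range(len(utilities)), key=lambda g: utilities[g])
    worst = order[:size]
    total = sum(utilities[g] for g in worst)
    # If the worst-off groups satisfy the bound, so does every other set.
    return (worst, total) if total < bound else None

# Hypothetical utilities for four groups under one candidate solution.
print(most_violated_subset([0.4, 0.1, 0.3, 0.9], size=2, bound=0.5))
```

Sorting costs $O(n \log n)$ in the number of groups, so a check of this kind avoids enumerating the subsets explicitly.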

The output of our algorithm allows a randomized approach for selecting a cohort of the desired size in expectation that guarantees leximax utilities in expectation. We can also round our algorithm's output to a solution of exactly the desired size where the expected utility across groups is leximax (for further rounding details, see the discussion in Section 4.3). We focus on this distributional setting for our algorithms for a few key reasons:

• Tractability. If we wanted to instead find a deterministic cohort of exactly size by finding a lexicographically maximal integer solution, the problem becomes hard. By showing that the problem of finding the exact integer lexicographically maximal cohort solves the NP-hard problem of Minimum Hitting Set, we show that finding a solution as well as approximating the number of groups with non-minimum utility within a factor of is NP-Hard (Lemma 4.4).

• Fair Arbitration between Solutions. It is very possible that two lexicographically maximal deterministic solutions may provide wildly different utility values for a particular group. As an example, consider choosing between a cohort that provides maximum utility to Group A, but zero utility to Group B, and another cohort that provides zero utility to Group A but maximum utility to Group B. Both cohorts are a lexicographic maximum, however selecting a deterministic solution requires us to decide whether the solution should favor Group A or B. A distributional approach gets rid of this difficult decision because the randomized approach itself guarantees that we are providing both A and B a fair chance at high utility.

There are many different potential approaches to randomly selecting a cohort in the distributional setting. We choose to use a randomized approach to selection that includes or excludes each potential cohort member independently with probability outputted by the algorithm. Such an approach offers the following benefits:

• Simple Sampling Procedure. Rather than outputting an arbitrary and potentially complicated distribution over cohorts that is difficult to sample from, the output of our algorithm is a single vector of marginal selection probabilities for each potential candidate. Our approach still results in a cohort with the desired expected size, but provides an easy way to sample cohorts, and as discussed in the final bullet point, gives better guarantees about the utility groups can expect to receive in practice. We also describe a rounding approach that results in cohorts of exactly the desired size that are still leximax in expectation.

• Better Concentration Guarantees for Some Natural Settings. While a distributional leximax solution may give groups better utility guarantees in expectation, it comes with the caveat that individual runs of the randomized solution may still result in cohorts where groups receive utility that is far below their expected utility. In an extreme case, a distributional solution that guarantees all groups 0.5 utility might be achieved by choosing uniformly between solutions that provide maximum and zero utility. When the size of the cohort is large enough, our approach to randomized choice guarantees that groups receive utility near their expectation with high probability because we consider each cohort member independently, rather than outputting an arbitrary joint distribution over potential cohort members (Lemma 4.1.2).
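The independent-inclusion sampling described in these bullets can be sketched in a few lines. This is only an illustration under stated assumptions: the candidate names and marginal probabilities below are made up, and `sample_cohort` is a hypothetical helper rather than part of the paper's algorithms.

```python
import random
from typing import Dict, List

def sample_cohort(marginals: Dict[str, float], rng: random.Random) -> List[str]:
    """Include each candidate independently with its marginal probability."""
    return [name for name, p in marginals.items() if rng.random() < p]

# Hypothetical marginal selection probabilities for four candidates.
marginals = {"ana": 0.9, "bo": 0.5, "cy": 0.1, "di": 0.5}
rng = random.Random(0)

# By linearity of expectation, the expected cohort size is the sum of marginals.
expected_size = sum(marginals.values())
sizes = [len(sample_cohort(marginals, rng)) for _ in range(10_000)]
print(expected_size, sum(sizes) / len(sizes))
```

Because each candidate is an independent coin flip, the cohort size and any linear group utility are sums of independent bounded terms, which is the structure that standard concentration bounds rely on.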

### 1.3 Our Contributions

To summarize, we provide the following contributions:

1. Define a new semantic notion of leximax approximation that is always guaranteed to exist and show that it is equivalent to an algorithmically-inspired notion of approximation that is stronger but related to the one defined in [10].

2. Investigate stricter notions of approximation that identify significantly leximax solutions that can be achieved by ignoring small variations in utility.

3. Explore how our new notions of approximation behave in settings where the group utilities may be reported with some small amount of additive noise.

4. Provide polynomial time algorithms for computing exact and approximate leximax distributions over cohorts with linear utility functions.

5. Show that the alternative goal of computing deterministic cohorts in our setting is NP-hard, and moreover approximating the number of groups with non-minimum utility is also NP-hard.

### 1.4 Related Work

Fair and diverse selection has become a prominent area of interest in algorithmic and machine learning fairness communities. In the setting of selecting representative data, prior works define metrics for diversity [26], and give algorithms for diverse data selection and summarization [7, 20]. For selecting individuals from a larger pool, prior works on cohort selection and multi-winner elections have studied individual guarantees of fairness [2] as well as group parity goals of diversity [5, 8, 29]. Other works have examined how bias and variance may affect different groups differently during a selection process, where fairness amounts to remedying implicit bias and variance in the selection process for different groups of individuals [13, 19]. Parity or proportional diversity approaches to cohort selection assume the correct amount of representation for each subgroup is known, so that fairness amounts to achieving a predefined level of diversity.

When there is no “merit” function to guide a selection process, cohort selection can also be seen as a representation problem. Diversity is the goal of a central decision maker while representation is the objective of each group in the population when selecting a cohort. Instead of modeling overall welfare based on the number of representatives from each group, our work considers the welfare of each group based on how representative each cohort member is for that group. Since how well a cohort serves each group in a population cannot be summarized by a single value, a natural direction is to examine the utilities of all groups for a given cohort that has been selected from a general population. Lexicographical fairness emerges as a reasonable notion of fairness that guarantees Pareto optimality in this setting of multiple objectives or losses. Flanigan et al. [15] give an algorithm for recruiting “citizens’ assemblies” based on sampling from a distribution over representative panels that are generated from leximax selection probabilities over citizens in the population. Our work looks at selecting a representative cohort from a pool of candidates rather than the underlying population, which allows a more general model where each member or group in the population has a utility vector describing its utility for each candidate that is being considered for the cohort. Furthermore, we optimize for leximax utilities for each group of interest rather than leximax sample probabilities for each individual in the population.

In telecommunication network design, min-max fairness (MMF) is an important solution concept to lexicographically maximize fractional flow for all parties [1, 27, 28]. An adjacent problem of lexicographically maximal flows where there are multiple sinks has also been studied, and a polynomial-time algorithm exists for finding a fractional flow [24, 25]. The problem of finding a leximax routing for an unsplittable flow along a network is NP-complete, but finding a 2-approximation is possible [18]. An approximate solution here means that it is not possible to improve a group without decreasing the utility of another group that is more than a factor of 2 worse.

Lexicographically maximal solutions have also been studied in other domains including bottleneck combinatorial optimization problems [6, 9], sampling actions for repeated games [3], allocation of classrooms [21] as well as indivisible goods more generally [16]. It is important to note that unlike the leximax allocation problem, there is no limit on the number of groups gaining utility from the same candidate being included in a cohort or allocated set. Most recently, leximax empirical risk minimization for classification has also been studied [17, 22, 23].

## 2 The Leximax Objective

In this paper, we focus on approaches to selecting lexicographically maximal (or leximax) representative cohort solutions. We consider a setting in which we’d like to select a solution $S$ from a set of potential solutions $\mathcal{S}$ such that $S$ is a good representation of some set of key (potentially overlapping) subgroups $\mathcal{G}$. We measure degree of representation via a utility function $u$, which assigns a utility to each pair of a solution and a group. Ideally, we’d like to select a cohort such that every subgroup is guaranteed to have high utility. However, this may be impossible to achieve in certain settings, such as when the utility functions of two groups are in opposition. Maximizing total welfare may result in solutions that neglect the welfare of certain groups, and seeking to equalize utilities across groups may artificially cap the utility some groups can achieve. In contrast, lexicographically maximal solutions extend the goal of the classic maximin objective by seeking to maximize the utility of the worst-off group, and then seeking to maximize the utility of the second-worst-off group subject to this worst-off group’s value, etc. This results in a solution concept that seeks to give the best guarantee possible for every key group, rather than just the worst-off.

We now formally define the leximax objective.

Given two vectors and in , we say that is lexicographically greater than , or , if and only if there exists some such that for all we have , and either or .

Applying this definition to the set of sorted group utility vectors obtained from every possible solution gives us a total ordering on these vectors. A leximax solution is any solution whose sorted utility vector is maximal according to this ordering. In many portions of this paper, in order to reason about the contents of these sorted vectors, we will care about the utility that the $i$th worst-off group receives from a particular solution $S$. We denote this with the bracketed notation $u(S, G[i])$.

Given a set of potential solutions and groups , we say that a solution is lexicographically maximal (leximax) if for any other solution , we have .

Intuitively, when we seek to find a lexicographically maximal solution, we try to do the best we can for the worst-off group, and then within these potential solutions try to do the best we can for the second-worst-off group, etc. Note that under this definition, groups may achieve varying utilities for different lexicographically maximal solutions, however the vector of sorted group utilities will be unique for any leximax solution. When the solution class is convex and compact and the utility function is continuous with respect to this class, a particular group receives the same utility under any leximax solution.
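For a finite set of candidate solutions, the definition can be checked directly by sorting each solution's group utilities and comparing the sorted vectors lexicographically. The following is a minimal sketch under that assumption; the table of utilities and the names `sorted_utilities` and `leximax_solutions` are hypothetical.

```python
from typing import Dict, List, Sequence

def sorted_utilities(utilities: Sequence[float]) -> List[float]:
    """Sort one solution's group utilities from worst-off to best-off."""
    return sorted(utilities)

def leximax_solutions(table: Dict[str, Sequence[float]]) -> List[str]:
    """Return every solution whose sorted utility vector is lexicographically largest."""
    # Python compares lists lexicographically, matching the ordering on sorted vectors.
    best = max(sorted_utilities(u) for u in table.values())
    return [name for name, u in table.items() if sorted_utilities(u) == best]

# Hypothetical utilities: each list holds (group 1, group 2) utility for one solution.
table = {"A": [0.2, 0.9], "B": [0.9, 0.2], "C": [0.1, 1.0]}
print(leximax_solutions(table))  # A and B tie: same sorted vector, different per-group utilities
```

Solutions A and B illustrate the note above: both are leximax with the same sorted vector, yet each individual group receives a different utility under the two.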

An attractive feature of lexicographically maximal solutions is that they have an equivalent definition that gives a semantic understanding of the solutions identified by the goal in Definition 2. We call this notion tradeoff leximax.

Given a set of solutions and groups , is lexicographically maximal if and only if for any and such that , there exists some such that .

###### Proof.

In the forward direction, let be a lexicographically maximal solution. Suppose that we have and such that .

Because is lexicographically maximal, we know that either or there exists some such that and for all , .

Because , we know that , and so such a must exist, and also otherwise we cannot have for all , and therefore the requirements of the statement are met.

In the opposite direction, suppose we have a solution such that for any and such that , there exists some such that .

Let be the smallest such that . If no such exists, then for all and we trivially have .

Otherwise, suppose for contradiction that . By our assumption on , there must exist some such that , however this is a contradiction because we have for all . Therefore we conclude our assumption was false, and therefore , and so .

Therefore for any other , we have , and thus is lexicographically maximal. ∎

This equivalent definition of lexicographic maximality offers an appealing re-interpretation of this objective: a solution is optimal if increasing the utility of any particular group would result in decreasing the utility of a worse-off group.

## 3 Approximations of Leximax-Optimal Solutions

While the leximax objective’s goal of doing the best we can for every group is attractive, one potential downside is that the set of leximax-optimal solutions can be incredibly sensitive to small variations in the utility received by certain groups. We consider the following example that illustrates this phenomenon:

[Sensitivity of leximax-optimal solutions] Consider a simple setting as in Figure 2 in which we have two groups, $g_1$ and $g_2$, and would like to decide between two potential solutions, $S_1$ and $S_2$. The utilities for each group and each solution are defined as $u(S_i, g_j) = U_{ij}$, where $U$ is defined as follows:

$$U = \begin{bmatrix} 0 & 1 \\ 0.01 & 0.01 \end{bmatrix}$$

Clearly the only leximax solution is $S_2$ (with sorted utility vector $(0.01, 0.01)$), because the worst-off group has value $0.01$ rather than receiving utility $0$ as it does in $S_1$ (which has a sorted utility vector of $(0, 1)$).

However, if we allow for the possibility that the utility estimates are off by even a tiny amount, on the order of the $0.01$ gap between the two worst-off utilities, suddenly $S_1$ is also a plausibly leximax solution despite having a completely different value for the second-worst-off group.

Example 3 is notable in that it demonstrates how small variations in the utilities of groups can lead to drastic changes with respect to the types of leximax solutions that are considered optimal. In settings where utilities may be reported with some estimation error, it is therefore incredibly important to consider how these errors might affect how the output optimal solution compares to the true leximax solution that would have been produced given completely accurate utilities.

Moreover, even when the utilities are believed to be accurate, it may be useful to consider solutions that are not exactly leximax, but are leximax when small variations in the utility are ignored. Example 3 is a situation where the exact leximax offers a tiny improvement in the worst-off group at the cost of a huge decrease in the utility of the second-worst-off group. A practitioner who views utility differences of less than 0.05 as insignificant might prefer $S_1$ as the only significantly lexicographically maximal solution because the worst-off groups under $S_1$ and $S_2$ receive comparable utility while the second-worst-off group is significantly better off under $S_1$.

Two needs motivate our study of new approximate leximax notions: the search for plausibly exact lexicographic solutions given the potential for some amount of estimation error, and the need for significantly maximal lexicographic solutions even when working with exact utility values. In this section, we introduce two such notions. First, we introduce a semantic notion of approximate leximax that relaxes the standard leximax definition to consider additional solutions that may be plausibly leximax. The second notion we introduce here seeks solutions that are leximax if only “significant” improvements are considered (as in the discussion above). Unlike the first notion, the notion of significantly leximax solutions is not a strict relaxation of leximax and may not include the exact leximax solution in some cases.

### 3.1 Relaxations of the Leximax Objective

#### 3.1.1 Elementwise Approximation

The most naive approach to approximation would be to require that the element-wise distance between the sorted utility vectors of the true lexicographically maximal solution and the approximate solution be small:

[Element-wise leximax approximation] Given a set of groups and a set of potential solutions , let be the sorted vector of utilities attained by any leximax solution. We say that a solution is an -element-wise leximax approximation iff .

While this definition is attractive in its simplicity, [10] observe that in certain contexts it may be stricter than we can hope for. In particular, if the leximax solution is being computed recursively, small estimation errors in the value of the worst-off group’s utility can greatly affect the difference between the utility of better-off groups in a lexicographically maximal solution compared to a solution that maximizes group utilities based on this incorrect value. Thus, we turn our attention to weaker notions of approximation.

We introduce a new notion of approximation that is a natural relaxation of the semantic interpretation of leximax solutions provided by the tradeoff leximax objective discussed in Proposition 2.

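[$\epsilon$-tradeoff leximax] Given a set of groups, $\mathcal{G}$, and a set of potential solutions, $\mathcal{S}$, a solution $S \in \mathcal{S}$ is $\epsilon$-tradeoff leximax if for any $S' \in \mathcal{S}$ and $i$ such that $u(S', G[i]) > u(S, G[i]) + \epsilon$, there exists a $j < i$ such that $u(S', G[j]) < u(S, G[j])$. Here $u(S, G[i])$ denotes the utility of the $i$th worst-off group under $S$.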

Intuitively, this definition guarantees that if we can find some other solution that does a lot better on some particular group, then this new solution must also decrease the utility of some worse-off group.
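For a finite set of solutions, this kind of tradeoff condition can be checked by brute force. The sketch below assumes the reading in which an improvement of more than the tolerance at some sorted position must be accompanied by a strict decrease at an earlier (worse-off) position; the function name `is_eps_tradeoff_leximax` and the example utilities (taken from the two-solution sensitivity example earlier in this section) are illustrative choices.

```python
from typing import Sequence

def is_eps_tradeoff_leximax(
    candidate: Sequence[float], others: Sequence[Sequence[float]], eps: float
) -> bool:
    """Check the tradeoff condition for `candidate` against every alternative."""
    s = sorted(candidate)  # utilities from worst-off to best-off
    for other in others:
        t = sorted(other)
        for i in range(len(s)):
            if t[i] > s[i] + eps:
                # A large gain at position i must cost some worse-off position j < i.
                if not any(t[j] < s[j] for j in range(i)):
                    return False
    return True

s1, s2 = [0.0, 1.0], [0.01, 0.01]
print(is_eps_tradeoff_leximax(s1, [s2], eps=0.0))   # exact leximax check for the first solution
print(is_eps_tradeoff_leximax(s1, [s2], eps=0.05))  # same check with tolerance 0.05
```

With tolerance 0 only the second solution passes, while with tolerance 0.05 both pass, which mirrors the sensitivity discussed above.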

$\epsilon$-tradeoff leximax provides an appealingly simple relaxation of the semantic interpretation of exact leximax solutions. However, slight variations of this definition, which are also natural relaxations of leximax, result in definitions where solutions are not guaranteed to exist. We explore this in the following example:

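[Altered versions of $\epsilon$-tradeoff leximax may not have any solutions.] We define a class of alternative tradeoff definitions that we term $(\epsilon_1, \epsilon_2)$-significant tradeoff leximax, for reasons that will become clear in Section 3.2, as follows:

[$(\epsilon_1, \epsilon_2)$-significant tradeoff leximax] Given a set of groups, $\mathcal{G}$, and a set of potential solutions, $\mathcal{S}$, a solution $S \in \mathcal{S}$ is $(\epsilon_1, \epsilon_2)$-significant tradeoff leximax for any $\epsilon_1, \epsilon_2 \ge 0$ if for any $S' \in \mathcal{S}$ and $i$ such that $u(S', G[i]) > u(S, G[i]) + \epsilon_1$, there exists a $j < i$ such that $u(S', G[j]) < u(S, G[j]) - \epsilon_2$.

When $\epsilon_1 = \epsilon$ and $\epsilon_2 = 0$, this notion is equivalent to $\epsilon$-tradeoff leximax. When $\epsilon_1 = \epsilon_2 = \epsilon$, the definition requires that any increase by more than $\epsilon$ result in a decrease of more than $\epsilon$ in a worse-off group.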

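However, we demonstrate that when a decrease of more than some $\epsilon_2 > 0$ is required, no solution may exist. Consider, for $\epsilon_1 = \epsilon_2 = \epsilon$, the setting depicted in Figure 3 where we have two groups $g_1, g_2$ and four potential solutions $S_1, S_2, S_3, S_4$ with utilities defined as $u(S_i, g_j) = U_{ij}$, for

$$U = \begin{bmatrix} 0 & 0.5 + 6\epsilon \\ \epsilon/2 & 0.5 + 4\epsilon \\ \epsilon & 0.5 + 2\epsilon \\ 3\epsilon/2 & 0.5 \end{bmatrix}$$

where we assume $\epsilon$ is sufficiently smaller than 0.5, so that $g_1$ is the worst-off group in every solution. Under these utilities, $S_4$ cannot be $(\epsilon, \epsilon)$-significant tradeoff leximax because $S_3$ improves by more than $\epsilon$ in $g_2$ (by $2\epsilon$) while only decreasing $g_1$ by $\epsilon/2$. Similarly, $S_3$ and $S_2$ cannot be $(\epsilon, \epsilon)$-significant tradeoff leximax due to the existence of $S_2$ and $S_1$, respectively. This means that none of $S_2, S_3, S_4$ can be $(\epsilon, \epsilon)$-significant tradeoff leximax. However, we see that $S_4$ improves by more than $\epsilon$ over $S_1$ in the worst-off group $g_1$ (by $3\epsilon/2$), so $S_1$ also cannot be $(\epsilon, \epsilon)$-significant tradeoff leximax. We conclude that no potential solution satisfies this definition. (This example is not tied to the specific choice $\epsilon_1 = \epsilon_2 = \epsilon$; similar examples exist for other choices.)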
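The four-solution example can be verified numerically. The sketch below assumes the two-parameter reading of the definition (a gain of more than the first tolerance at some sorted position must be paid for by a loss of more than the second tolerance at a worse-off position) and picks the tolerance $1/64$ purely so that every entry is exactly representable in floating point; the labels `S1` to `S4` and the function name are hypothetical.

```python
from typing import Dict, List, Sequence

def is_significant_tradeoff_leximax(
    name: str, table: Dict[str, Sequence[float]], eps1: float, eps2: float
) -> bool:
    """Brute-force check of the two-tolerance tradeoff condition for one solution."""
    s = sorted(table[name])
    for other, utilities in table.items():
        if other == name:
            continue
        t = sorted(utilities)
        for i in range(len(s)):
            if t[i] > s[i] + eps1:
                # The gain must be matched by a loss of more than eps2 at some j < i.
                if not any(t[j] < s[j] - eps2 for j in range(i)):
                    return False
    return True

eps = 1 / 64  # power of two, so all sums below are exact
table = {
    "S1": [0.0, 0.5 + 6 * eps],
    "S2": [eps / 2, 0.5 + 4 * eps],
    "S3": [eps, 0.5 + 2 * eps],
    "S4": [3 * eps / 2, 0.5],
}
significant: List[str] = [n for n in table if is_significant_tradeoff_leximax(n, table, eps, eps)]
relaxed: List[str] = [n for n in table if is_significant_tradeoff_leximax(n, table, eps, 0.0)]
print(significant)  # no solution passes when both tolerances equal eps
print(relaxed)      # solutions passing when the second tolerance is 0
```

Setting the second tolerance to 0 recovers the one-parameter tradeoff notion, for which some solutions in this table do pass the check.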

The definition of -tradeoff leximax is only useful if we can compute -tradeoff leximax solutions efficiently. To show that this is possible, we relate -tradeoff leximax to a different notion of leximax approximation that arises from a natural algorithmic approach and is closely related to the notion of leximax approximations introduced in [10].

Consider the following approach to computing an exact leximax solution, which follows its definition: Compute the maximum value that can be guaranteed to the worst-off group, then calculate the maximum value that can be guaranteed to the second worst-off group subject to this value, and then recurse on the third, fourth, fifth, etc. until the values for all groups are fixed and a solution is found.
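For a finite, explicitly listed set of solutions, the recursion described above reduces to repeated filtering. The following is a minimal sketch under that assumption; the function name `leximax_solutions` is hypothetical, and the enumeration is only practical for small solution sets.

```python
def leximax_solutions(solutions):
    """Return all exact leximax solutions from a finite list of utility tuples."""
    remaining = list(solutions)
    num_groups = len(remaining[0])
    for i in range(num_groups):
        # Best value achievable for the i-th worst-off group, given that the
        # values for all worse-off positions are already fixed.
        best = max(sorted(s)[i] for s in remaining)
        remaining = [s for s in remaining if sorted(s)[i] == best]
    return remaining


print(leximax_solutions([(3, 1), (1, 2), (2, 2)]))  # [(2, 2)]
print(leximax_solutions([(3, 1), (1, 2)]))          # [(3, 1)]
```

In the second call both solutions give their worst-off group a utility of 1, so the tie is broken by the second worst-off group.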

However, what if our algorithm is not completely accurate at each step? Introducing some amount of estimation error at each step of the recursion may result in selecting a solution that isn’t exact leximax, but can be considered approximately leximax because it arose from small estimation errors in our algorithm. We call such solutions -recursive leximax, and define them as follows:

[-recursive leximax] Given a set of groups, , a set of potential solutions , and a choice of allowable ‘slack’ with , recursively define the sets of solutions such that and for each ,

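$$\mathcal{S}^{\alpha}_i = \left\{ S \in \mathcal{S}^{\alpha}_{i-1} : u(S, G[i]) \ge \max_{S' \in \mathcal{S}^{\alpha}_{i-1}} u(S', G[i]) - \alpha_i \right\}$$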

We say that is an -recursively approximate leximax solution if there exists an with such that .

Our definition of -recursive leximax is a stronger variant of the definition of approximation used in [10]. Most importantly, the definition presented in [10] is less strict because it allows the choice of allowable slack to depend on each solution. However, the solutions output by their algorithms actually achieve the stronger notion presented here. The weaker version is only implied by -tradeoff leximax, whereas we can show that -recursive leximax and -tradeoff leximax are equivalent.

In this definition, the choice of slack, , determines the amount of estimation error at each step. We use this to recursively construct the sets in the same way they would be calculated had we applied a recursive approach to calculating a leximax solution but under-estimated the maximum value by at the th step for each .
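The recursive construction of the sets can be sketched directly for a finite list of solutions. This is an illustration under that assumption, not the efficient algorithm of Section 4; the names `recursive_sets` and `is_recursive_leximax_for` are hypothetical, and `alpha` holds one slack value per level.

```python
def recursive_sets(solutions, alpha):
    """Return the nested sets [S_0, S_1, ..., S_m] for per-level slack `alpha`."""
    levels = [list(solutions)]
    for i, slack in enumerate(alpha):
        prev = levels[-1]
        # Maximum utility of the i-th worst-off group over the previous set.
        best = max(sorted(s)[i] for s in prev)
        # Keep every solution within `slack` of that maximum.
        levels.append([s for s in prev if sorted(s)[i] >= best - slack])
    return levels


def is_recursive_leximax_for(s, solutions, alpha):
    """True if `s` survives every level for this particular choice of slack."""
    return s in recursive_sets(solutions, alpha)[-1]


solutions = [(0, 10), (1, 5)]
print(recursive_sets(solutions, (0, 0))[-1])  # [(1, 5)]
print(recursive_sets(solutions, (1, 0))[-1])  # [(0, 10)]
```

With zero slack the construction returns the exact leximax solution. Allowing a slack of 1 at the first level keeps both solutions alive, after which the second level selects `(0, 10)`.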

Unlike our -tradeoff leximax notion of approximation, -recursive leximax provides a natural algorithmic interpretation of approximate solutions which allows efficient approaches to computing -recursive leximax solutions with respect to a particular choice of slack, as we do in Section 4. (Footnote 3: [10] give algorithms that calculate -recursive leximax solutions because their approach estimates each sequential maxmin value to within of its true value, though the notion of efficiency that they achieve does not exactly correspond to polynomial-time algorithms. We provide an alternative polynomial-time algorithm for the cohort selection setting that leverages linear group utilities to offer a more efficient approach.) Fortunately, we can actually show that these two notions of approximation are equivalent, which means that we can also efficiently compute -tradeoff leximax solutions.

For any set of groups, , and solutions, , the set of -tradeoff leximax solutions is equivalent to the set of -recursive leximax solutions.

###### Proof.

First, suppose we have some -tradeoff leximax solution .

Recursively define an amount of allowable slack as follows, where are the recursively defined sets discussed in Definition 3.1.2:

$$\alpha_i = \max_{S' \in \mathcal{S}^{\alpha}_{i-1}} u(S', G[i]) - u(S, G[i])$$

In other words, is exactly the distance from the utility of the th worst-off group for to the maximal utility achieved by any th worst-off group in the th recursive set.

Clearly under this choice of slack, . If , then is also -recursive leximax and we are done. Otherwise, assume for contradiction that this is not the case, and let be the smallest such that . By definition, this means we have some such that .

Moreover, by definition of our , we also know that for all , we have

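$$\begin{aligned}
u(S', G[i']) &\ge \max_{S'' \in \mathcal{S}^{\alpha}_{i'-1}} u(S'', G[i']) - \alpha_{i'} \\
&= \max_{S'' \in \mathcal{S}^{\alpha}_{i'-1}} u(S'', G[i']) - \Big( \max_{S' \in \mathcal{S}^{\alpha}_{i'-1}} u(S', G[i']) - u(S, G[i']) \Big) \\
&= u(S, G[i'])
\end{aligned}$$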

and therefore .

However, because is -tradeoff leximax, we must also have some such that . This is a contradiction, and so we conclude that for all , and therefore is also -recursive leximax.

In the other direction, suppose that is -recursive leximax with respect to some choice of allowable slack . We define a new choice of slack as follows:

$$\alpha'_i(S') = \begin{cases} \epsilon & S' = S \\ 0 & \text{otherwise} \end{cases}$$

Consider any and such that

$$u(S', G[i]) > u(S, G[i]) + \alpha'_i(S) = u(S, G[i]) + \epsilon.$$

Because is -recursive leximax, we therefore must have to avoid a contradiction. Let be the smallest such that . Here, we are guaranteed that

 u(S′,G[j])

Where the left-hand inequality arises because must have been too far below the maximum at because it was eliminated, and the right-hand side is because we know that . Thus,

$$u(S', G[j]) < u(S, G[j])$$

and so because we know that , we have found a such that

$$u(S', G[j]) < u(S, G[j])$$

and therefore must also be -tradeoff leximax.

We have shown that any -tradeoff leximax solution must also be -recursive leximax and vice versa, so we conclude that the two notions are equivalent. ∎
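The equivalence can be sanity-checked on small random instances. The sketch below is illustrative only: it assumes finite solution sets with integer utilities so that comparisons are exact, and the function names are hypothetical. The recursive check uses the slack constructed in the first half of the proof, namely the smallest slack at each level that keeps the candidate solution in the set.

```python
import random


def is_tradeoff_leximax(s, solutions, eps):
    """Brute-force check of the tradeoff condition on sorted utilities."""
    a = sorted(s)
    for other in solutions:
        b = sorted(other)
        for i in range(len(a)):
            if b[i] > a[i] + eps and not any(b[j] < a[j] for j in range(i)):
                return False
    return True


def is_recursive_leximax(s, solutions, eps):
    """Check whether the slack built in the proof stays within eps at every level."""
    a = sorted(s)
    remaining = list(solutions)
    for i in range(len(a)):
        best = max(sorted(t)[i] for t in remaining)
        alpha_i = best - a[i]  # smallest slack that keeps `s` at level i
        if alpha_i > eps:
            return False
        remaining = [t for t in remaining if sorted(t)[i] >= best - alpha_i]
    return True


rng = random.Random(0)
agree = True
for _ in range(200):
    sols = [tuple(rng.randint(0, 6) for _ in range(3)) for _ in range(5)]
    eps = rng.randint(0, 3)
    for s in sols:
        if is_tradeoff_leximax(s, sols, eps) != is_recursive_leximax(s, sols, eps):
            agree = False
print(agree)
```

A random test of this kind does not replace the proof, but it is a cheap way to catch an off-by-one in either definition when implementing them.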

### 3.2 Significantly Leximax Solutions

-tradeoff leximax solutions are strict relaxations of the exact leximax objective. Any leximax-optimal solution will also be -tradeoff leximax and will also be -recursive leximax for any (by simply selecting the allowable slack to be for all ). Similarly, any -tradeoff (resp. recursively) approximate solution will also be -tradeoff (recursively) approximate for any .

In this section, we introduce a modified notion of -recursive leximax that is not a relaxation of the exact leximax objective but rather tries to get significant improvements in the quality of solutions, using the allowed slack. This notion constrains the choices of slack so that solutions considered leximax due to only insignificant improvements in the utility of worse-off groups are ignored. Here, the only slack considered is where all allowable slack values are set to exactly , rather than some value that is at most .

[-significant recursive leximax] Given a set of groups with and a set of potential solutions , recursively define the sets of solutions such that and for each ,

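$$\mathcal{S}^{\epsilon}_i = \left\{ S \in \mathcal{S}^{\epsilon}_{i-1} : u(S, G[i]) \ge \max_{S' \in \mathcal{S}^{\epsilon}_{i-1}} u(S', G[i]) - \epsilon \right\}$$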

We say that is -significant recursive leximax if .

Why does this make sense as a way to identify significant solutions? Intuitively, setting every slack value to the maximum possible requires that the valid solutions be leximax with respect to the larger set of potential solutions when some error term is allowed, rather than putting a lot of weight on small differences in earlier groups. We present the following example to see this in practice:

[Significantly recursive approximations] Consider two groups and two solutions as in Figure 4 with utilities

$$u(S_1, G_1) = \epsilon, \quad u(S_2, G_1) = 0, \quad u(S_1, G_2) = 0.5, \quad u(S_2, G_2) = 1.$$

Both and are -recursive leximax approximations. If we set , then becomes the only acceptable solution and thus an -recursive leximax-approximate solution. If we set , becomes an -recursive leximax-approximate solution. We would expect a satisfying significant approximation notion to identify as the only -significant approximation because it’s not too far below on the worst-off group, but does much better on the second-worst-off group. An -significant recursive leximax approximation does give us this separation between and , because while both and are included in the first level of recursion, , is too far below the maximum to be included in , so is the only -significant recursive leximax approximation in this example.
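The two-solution example can be reproduced with a few lines of code. This is a sketch under stated assumptions: solutions are stored in a dictionary keyed by name, the value `eps = 1/10` is an arbitrary choice below 0.5, and the function name `significant_recursive_leximax` is hypothetical.

```python
from fractions import Fraction


def significant_recursive_leximax(solutions, eps):
    """Return the names of solutions that survive every level with slack eps."""
    remaining = dict(solutions)  # name -> tuple of group utilities
    num_groups = len(next(iter(remaining.values())))
    for i in range(num_groups):
        best = max(sorted(u)[i] for u in remaining.values())
        remaining = {
            name: u for name, u in remaining.items() if sorted(u)[i] >= best - eps
        }
    return sorted(remaining)


eps = Fraction(1, 10)
example = {
    "S1": (eps, Fraction(1, 2)),        # u(S1, G1) = eps, u(S1, G2) = 0.5
    "S2": (Fraction(0), Fraction(1)),   # u(S2, G1) = 0,   u(S2, G2) = 1
}
print(significant_recursive_leximax(example, eps))  # ['S2']
print(significant_recursive_leximax(example, 0))    # ['S1']
```

With slack `eps` at every level both solutions pass the first level, and only `S2` passes the second. With zero slack the procedure returns the exact leximax solution `S1`, which wins only through its small advantage on the worst-off group.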

So far, we have been rather loose in arguing about why the solutions identified as -significant recursive leximax might be preferred over exact leximax or the more general class of -recursive leximax solutions. We offer a more formal characterization here, but begin by taking a step back to reframe what the contents of the recursively defined sets from Definition 3.1.2, for some choice of slack , can tell us about potential leximax solutions.

Intuitively, contains all solutions that, with respect to the first groups, could feasibly be solutions that are -recursive leximax allowing for a slack of , and are guaranteed to be within of the first coordinates of any final -recursive leximax solution with respect to , i.e. any .

This means that looking at the maximum utility achieved by any solution in each recursive group, gives us a sense of the type of solution that results from allowing as slack. While there may not exist a such that , we are guaranteed that any will be elementwise within of this vector of maximums.

We can show that out of all possible choices of slack, the one used by the definition of -significant recursive leximax, results in the best-possible sequence of maximum set values (i.e. it will be lexicographically greater than the maximums attained via any other choice of slack). In other words, this backs up the motivation behind our definition of -significant recursive leximax in that it promises us that any -significant recursive leximax solution will be elementwise within of the lexicographically best solution we could possibly hope for under an optimal choice of slack.

[Leximax properties of -significant recursive leximax] Given a set of groups, , and solutions, , let be the recursively defined sets constructed with a slack of at each step, as used in the definition of -significant recursive leximax, and for any , let be the sets that arise when the amount of allowable slack at each level is set according to . Then, for any , we have

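$$\left\langle \max_{S \in \mathcal{S}^{\epsilon}_i} u(S, G[i]) \right\rangle_{i=1}^{m} \succeq \left\langle \max_{S \in \mathcal{S}^{\alpha}_i} u(S, G[i]) \right\rangle_{i=1}^{m}$$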

In other words, the vector of maximums attained in each is lexicographically maximal compared to any other choice of slack of size at most .

###### Proof.

Let . We proceed by induction on .

As our base case, we note that,

$$\max_{S \in \mathcal{S}^{\alpha}_1} u(S, G[1]) = \max_{S \in \mathcal{S}} u(S, G[1])$$

and therefore .

For the recursive case, assume that for all , we have .

For any , we are guaranteed that for all ,

$$u(S, G[j]) \ge \max_{S' \in \mathcal{S}^{\alpha}_j} u(S', G[j]) - \alpha_j \ge \max_{S' \in \mathcal{S}^{\alpha}_j} u(S', G[j]) - \epsilon$$

and therefore, by our inductive assumption,

$$u(S, G[j]) \ge \max_{S' \in \mathcal{S}^{\epsilon}_j} u(S', G[j]) - \epsilon$$

and so as well, and therefore because , we must have .

Therefore, we’ve shown that either for all , or there exists some such that and for all , and so the vector of maximums attained by setting the allowable slack to be at all levels is lexicographically maximal. ∎

Theorem 4 tells us that out of all the ways we could identify approximate leximax solutions that ignore variations of less than , an -significant solution is guaranteed to be element-wise within of the lexicographically maximal best-possible guarantee we can give for each group at each level of recursion.
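The lexicographic comparison of level maxima can be illustrated numerically. The sketch below assumes finite solution sets with integer utilities and integer slack so that comparisons are exact; the function name `level_maxima` is hypothetical. It relies on the fact that Python compares lists lexicographically.

```python
import random


def level_maxima(solutions, slack):
    """Return the maximum utility of the i-th worst-off group at each level."""
    remaining = list(solutions)
    maxima = []
    for i, a in enumerate(slack):
        # The maximizers always survive the filter, so the maximum over the
        # previous set equals the maximum over the filtered set.
        best = max(sorted(s)[i] for s in remaining)
        maxima.append(best)
        remaining = [s for s in remaining if sorted(s)[i] >= best - a]
    return maxima


print(level_maxima([(0, 10), (1, 5)], [1, 1]))  # [1, 10]
print(level_maxima([(0, 10), (1, 5)], [0, 0]))  # [1, 5]

rng = random.Random(1)
holds = True
for _ in range(300):
    m = 3
    sols = [tuple(rng.randint(0, 8) for _ in range(m)) for _ in range(6)]
    eps = rng.randint(0, 3)
    alpha = [rng.randint(0, eps) for _ in range(m)]  # any slack of size at most eps
    # Full slack at every level should never be lexicographically worse.
    if level_maxima(sols, [eps] * m) < level_maxima(sols, alpha):
        holds = False
print(holds)
```

In the small example, using the full slack of 1 at the first level keeps `(0, 10)` alive and raises the second-level maximum from 5 to 10.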

Ideally, we could obtain a similar notion to -significant recursive leximax with a satisfying semantic meaning as for -recursive leximax by modifying our definition of -tradeoff leximax so that any solution that improves the th group by more than must also decrease some worse-off group by more than . However, as we saw in Example 3.1.2, modifying the original tradeoff definition in this way surprisingly results in an overly strict notion due to some instability arising from the pairwise comparisons that tradeoff approximations rely on. In particular, solutions that satisfy this notion may not exist. Note that in Example 3.1.2, no solution satisfied -significant tradeoff leximax, which is equivalent to the modified definition suggested here, but is an -significant recursive leximax approximation and are all valid -recursive leximax solutions.

### 3.3 Approximations in the Presence of Noise

So far, we have considered approximate leximax solutions with the assumption that the utilities used to calculate these solutions are known to be correct. However, a natural question is how such approximations behave if the reported utilities contain some small amount of noise.

In the case of -tradeoff leximax solutions, even a small amount of additive noise on each utility can result in solutions that do not satisfy tradeoff guarantees. In particular, noise that is solution-specific can cause individual solutions to be “kicked out” of the recursively defined sets, even though all solutions near them are included. We demonstrate this behavior in the following example:

[-tradeoff leximax solutions are not robust to noise.]

We consider a setting in which we have two groups, and three potential solutions . The utilities each group derives are defined as where is defined as follows (assume ):

$$U = \begin{bmatrix} 0.1 & 0.2 \\ 0.1 + \epsilon/100 & 0.8 \\ 0.1 + \epsilon & 0.2 \end{bmatrix}$$

Furthermore, assume we have a slightly noisy version of utilities in which changes from to . Figure 5 provides a visual representation of this instance, where the noisy version of is shown in red.

In the non-noisy version, can never be considered -tradeoff leximax because does much better than on , and is still above on .

However, the noisy version, which introduces only a tiny amount of noise (), much smaller than the allowed approximation threshold (), results in a setting where can be considered -tradeoff leximax.

By making the distance between and arbitrarily small, we can construct examples where even when the amount of noise is negligible compared to the allowed approximation factor, can still potentially be incorrectly classified as -tradeoff leximax.

We note that we can define a stricter notion of tradeoff approximation that guarantees a solution will be -tradeoff leximax even if calculated with noisy utilities, but for the same reasons as demonstrated in Example 3.1.2, such solutions may not always exist, making it difficult to find solutions that are guaranteed to be -tradeoff leximax in a noisy setting.

Recall the notion of -significant tradeoff leximax as presented in Definition 3.1.2. Any -significant tradeoff leximax solution when calculated using noisy utilities within an additive of their true values is guaranteed to be -tradeoff leximax with respect to the true utilities.

###### Proof.

Let be the true utilities, and define noisy utilities such that for all and .

Suppose we have some and such that

$$u(S', G[i]) > u(S, G[i]) + \epsilon$$

In the noisy setting, we are therefore guaranteed to have

$$u_\delta(S', G[i]) > u_\delta(S, G[i]) + \epsilon - 2\delta$$
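The step from the true utilities to the noisy ones can be spelled out as a short chain. This is a sketch that assumes each noisy utility is within an additive $\delta$ of its true value, and that this bound can be applied at index $i$ for both solutions:

$$\begin{aligned}
u_\delta(S', G[i]) &\ge u(S', G[i]) - \delta \\
&> u(S, G[i]) + \epsilon - \delta \\
&\ge u_\delta(S, G[i]) + \epsilon - 2\delta
\end{aligned}$$

The first and last lines each use the noise bound once, which is where the total loss of $2\delta$ comes from.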

By definition of -significant tradeoff leximax, we therefore have some such that

 uδ(S′,G[j])

Switching back to non-noisy utilities, we are guaranteed that

 uδ(S′,G[j])
 uδ(S′,G[j])

and therefore is -tradeoff leximax. ∎

Having considered how noise may affect -tradeoff leximax approximations, we now turn to -significant recursive leximax approximations. Here, we find that -significant recursive leximax solutions are somewhat robust to noise, in that they satisfy a slightly relaxed definition of significance.

First, we note that in Example 3.3, is also -significant recursive leximax in the noisy setting, but not in the non-noisy setting, and so this example also demonstrates how the standard definition of -significant recursive leximax may not be robust to noise. However, we can offer the following guarantee with respect to a modified notion:

Say that a solution is -significant recursive leximax if there exists some choice of slack with such that , where and

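$$\mathcal{S}^{\beta}_i = \left\{ S \in \mathcal{S}^{\beta}_{i-1} : u(S, G[i]) \ge \max_{S' \in \mathcal{S}^{\beta}_{i-1}} u(S', G[i]) - \beta_i(S) \right\}.$$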

Then, any -tradeoff leximax solution calculated in the presence of additive noise is guaranteed to be -significant recursive leximax.

###### Proof.

We begin by proving a property about how the sorted vector of group utilities is affected by added noise.

For any solution and , we have .

Suppose for purposes of contradiction that for some and , . We can assume without loss of generality that .

Let and be the groups used to calculate and , respectively. If , this is a contradiction because it means the noise on group was more than .

Otherwise, in order to ensure that the noise requirements , we must have that and additionally that , otherwise these constraints on noise cannot be true. Because is below in the sorted groups vector according to , but is above in the sorted groups vector according to , but and occupy the same index in both sorted vectors, we must be able to find some other such that but .

However, this implies that

$$\begin{aligned}
u(S, G_k) &> u(S, G[i]) > u_\delta(S, G[i]) > u_\delta(S, G_k) \\
u(S, G_k) &> u(S, G[i]) > u_\delta(S, G[i]) > u(S, G_k) - \delta
\end{aligned}$$

and so we must have . This contradicts our original assumption, and so we conclude that for all and , .

With this claim in hand, we can act as if noise was applied with respect to the sorted vector of group utilities rather than the groups themselves.

We define a new amount of allowable slack as follows, where are the recursively defined sets used to calculate -significant recursive leximax on the noisy utilities:

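$$\beta_i(S) = \begin{cases} \epsilon + 2\delta & S \in \mathcal{S}^{\epsilon}_i \\ \epsilon - 2\delta & \text{otherwise} \end{cases}$$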

We proceed by induction, noting that .

Suppose that for all , we have .

Then, if , we have that

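$$\begin{aligned}
u_\delta(S, G[i]) &\ge \max_{S' \in \mathcal{S}^{\epsilon}_{i-1}} u_\delta(S', G[i]) - \epsilon \\
u(S, G[i]) + \delta &\ge \max_{S' \in \mathcal{S}^{\epsilon}_{i-1}} u(S', G[i]) - \epsilon - \delta \\
u(S, G[i]) &\ge \max_{S' \in \mathcal{S}^{\epsilon}_{i-1}} u(S', G[i]) - \epsilon - 2\delta \\
u(S, G[i]) &\ge \max_{S' \in \mathcal{S}^{\beta}_{i-1}} u(S', G[i]) - \beta_i(S)
\end{aligned}$$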

and so as well. On the other hand, if , then let be the smallest such that . We must have

 uδ(S,G[j])

In the non-noisy setting, we are therefore guaranteed that

 u(S,G[j])−δ