Causal Feature Discovery through Strategic Modification

02/17/2020 ∙ by Yahav Bechavod, et al. ∙ University of Minnesota ∙ Hebrew University of Jerusalem ∙ University of Pennsylvania

We consider an online regression setting in which individuals adapt to the regression model: arriving individuals may access the model throughout the process, and invest strategically in modifying their own features so as to improve their assigned score. We find that this strategic manipulation may help a learner recover the causal variables, in settings where an agent can invest in improving impactful features that also improve his true label. We show that even simple behavior on the learner's part (i.e., periodically updating her model based on the observed data so far, via least-square regression) allows her to simultaneously i) accurately recover which features have an impact on an agent's true label, provided they have been invested in significantly, and ii) incentivize agents to invest in these impactful features, rather than in features that have no effect on their true label.




1 Introduction

As algorithmic decision-making takes a more and more important role in myriad application domains, incentives emerge to change the inputs presented to these algorithms: people may either invest in truly relevant attributes or strategically lie about their data. Recently, a collection of very interesting papers has explored various models of strategic behavior on the part of the classified individuals in learning settings, and ways to mitigate the harms to accuracy that can arise from falsified features (dalvi2004; bruckner2012; Hardt15; Dong17). Additionally, some recent work has focused on the design of learning algorithms that incentivize the classified individuals to make “good” investments in true changes to their variables (Kleinberg18).

The present paper takes a different tack, and explores another potential effect of strategic investment in true changes to variables, in an online learning setting: we claim that interaction between the online learning and the strategic individuals may actually aid the learning algorithm in identifying causal variables. By causal, we mean, informally, variables such that changes in their true value cause changes in the true label and lead agents to improve. In contrast, non-causal variables do not affect the true label; such features are susceptible to gaming, as they can be used to obtain better outcomes with respect to the posted model without improving true labels.

The idea is quite simple. First, if a learning algorithm’s hypothesis at a particular round depends heavily on a certain variable, this incentivizes the arriving individual to invest in improving that variable. If that variable were causally related to the true label, then the learner would observe the impact of these changes in the form of improved true labels. If that variable were non-causal, the changes would not have an effect on true labels. Second, if a learning algorithm improves its hypotheses over time, this changing sequence of incentives should encourage investment in a variety of promising variables, exposing those that are causal. This process should naturally induce the learner to shift its dependence towards causal variables, thereby incentivizing individuals to invest in meaningful changes, and resulting in an overall higher-quality population.

The goal of this paper is to highlight this potential beneficial effect of the interaction between online learning and strategic modification. To do so, we focus our study on a simple linear regression setting. In our model, there is a true underlying latent regression parameter vector $\beta^* \in \mathbb{R}^d$, and there is an underlying distribution over unmodified feature vectors. On every round $t$, the learner must announce a regression vector $\hat\beta_t$. (Footnote 1: Eventually, the learner we will consider does not update its regression vector at every round, but rather periodically, so that individuals can be treated in batches.) An individual then appears, with an unmodified feature vector $x_t$ chosen i.i.d. from the distribution. Before presenting himself to the learner, the individual observes $\hat\beta_t$ and has the opportunity to invest in changing his true features to some $\bar{x}_t$; we focus on a simple model wherein the individual’s investment results in a targeted change to a single variable. The individual then receives utility $\hat\beta_t^\top \bar{x}_t$, and the learner gets feedback $\bar{y}_t = {\beta^*}^\top \bar{x}_t + \varepsilon_t$, where $\varepsilon_t$ is some noise.

Within this simple model, we consider simple behaviors for both the learner and the individuals: at each time $t$, the individual modifies his features so as to maximize his utility given the posted $\hat\beta_t$; periodically, the learner updates $\hat\beta_t$ with her best estimate of $\beta^*$ given the (modified) features and labels she has observed, via least-square regression. Our main result is that under this simple behavior, the learner recovers $\beta^*$ accurately, after observing sufficiently many individuals. Our result is divided in two parts: first, we show that least-square regression accurately recovers $\beta^*$ with respect to features that many individuals have invested in. Second, we show that these dynamics incentivize investments in every feature, leading to accurate recovery of $\beta^*$ in its entirety, under an assumption on how the learner breaks ties between multiple least-square solutions. Our accuracy guarantees for a feature improve with the number of times that feature is invested in.

It is important to emphasize that we are studying a setting in which individuals’ modifications of their variables can be meaningful investments (e.g., studying to achieve better mastery of material by an exam) rather than deceitful manipulations (e.g., cheating on the exam to achieve a higher assessment of that mastery). Strategic lying about variables would not help to expose causal variables, because such changes would not affect the outcome, regardless of whether the changes were in causal or non-causal variables.

Notice that any discovery of causal variables that occurs in our model is a result of the interaction between the online learner and the strategic individuals. On the one hand, online learning with no strategic response has no ability to distinguish non-causal variables from causal ones when the two are correlated. On the other hand, if strategic individuals faced with a static scoring algorithm tried to maximize their scores by investing in a non-causal feature, the resulting information would be insufficient for an observer to draw conclusions about the causality of other features.

For example, historical data might show that both a student’s grades in high school and the make of car his parents drive to the university visit day are predictive of success in university. Suppose, for simplicity, that success in high school is causally related to success in university, but that make of parents’ car is not. If the university admissions process put large weight on high school grades, that would incentivize students to invest effort in performing well in high school, which would also observably pay off in university, which would reinforce the emphasis on high school grades. If the admissions process put large weight on the make of car in which students arrive to the visit day, that would incentivize renting fancy cars for visits. However, this would result in a different distribution over the observed student variables, and on this modified distribution the correlation between cars and university success would be weakened, and therefore the admissions formula would not perform well. In future years, the university would naturally correct the formula to de-emphasize cars.

One reason that we find this natural process of causal variable discovery to be interesting is that discovery of causal variables is notoriously difficult and problematic. Separating correlation from causation in passive-observational data is essentially impossible without very strong assumptions (eberhardt2007). The gold standard traditional method for detecting causal variables is therefore to perform an intervention, and the prototypical intervention is the randomized, controlled trial, a concept that grew out of the foundational work of R. A. Fisher in the 1930s (fisher1935). Randomly assigning experimental subjects to “treatments” of different variables, however, is often expensive, difficult, impossible, unethical, or not meaningful. If the variable is the neighborhood where the subject lives, what does it mean to assign this at random? Even if it were feasible or ethical to consider reassigning subjects’ neighborhood for the purposes of an experiment, perhaps what is relevant is not the value of the variable at the time of the experiment, but the lived experience of having been identified with and experienced that variable and its correlates for a long period of time.

We do not suggest that the interaction between online learning and strategic classification solves all problems relating to causality; far from it. The goal of this paper is simply to bring attention to a natural mechanism for exposing causal variables, that we believe is worthy of further attention.

2 Related Work

Much of the work on decision-making on individuals assumes that an individual’s data is a fixed input that is independent of the algorithm used by the decision-maker. In practice, however, individuals may try to adapt to the model in place in order to improve their outcomes. A recent line of work studies such strategic behavior in classification settings.

Part of this line of work concerns itself with the negative consequences of strategic behavior, when individuals aim to game the model in place; for example, individuals may misrepresent their data or features (often at a cost) in an effort to obtain positive qualification outcomes or otherwise manipulate an algorithm’s output (dalvi2004; PP04; DFP10; bruckner2012; IL13; HIM14; CDP15; Hardt15; Dong17) or even to protect their privacy (GLRS14; CIL15). The goal in these works is to provide algorithms whose outputs are robust to such gaming. milli2019 and lily2019 focus on the social impact of robust classification, and show that i) robust classifiers come at a social cost (by forcing even qualified individuals to invest in costly feature manipulations to be classified positively) and ii) disparate abilities to game the model inevitably lead to unfair outcomes.

Another part of this line of work instead sees strategic manipulation as possibly positive, when the classifier incentivizes individuals to invest in meaningfully improving their features. Instead of cheating on a test to obtain a better score, a student may decide to study and actually improve his actual competence level in a given subject. Kleinberg18 study how to induce agents to invest effort into improving meaningful features rather than trying to game the classifier. berk2019 provide optimization tools to compute which action an agent should take to improve his label at minimal cost. Most of this line of work assumes the decision-maker already understands which features are impactful and control an agent’s true label or qualification level, and which do not.

In contrast, we consider a setting where the decision-maker does not initially know which features affect an agent’s label, and we aim to leverage the agents’ strategic behavior to learn the causal relationship between features and labels; in that sense, our work is related to a line of research on causality (Pearl2009; Halpern16; Peters17). Most closely related to this paper is the work of miller2019. They formalize the distinction between gaming and actual improvements through the structural causality framework of Pearl2009, by introducing causal graphs that model the effect of their features and target variables on each other. They show that in such settings, it is in the decision-maker’s best interest to incentivize actual improvements rather than gaming. Further, they show that designing good incentives that push agents to improve is at least as hard as causal inference, but leave open the question of how to leverage strategic behavior to learn causality, and hence set good incentives. Our paper provides a first step towards addressing this question, albeit in a simpler model.

3 Model

We consider a linear regression setting where the learner learns the regression parameters based on strategically manipulated data from a sequence of agents over $T$ rounds. There is a true latent regression parameter $\beta^* \in \mathbb{R}^d$ such that for any agent with feature vector $x \in \mathbb{R}^d$, the real-valued label $y$ is given by

$$y = {\beta^*}^\top x + \varepsilon,$$

where $\varepsilon$ is a noise random variable with $|\varepsilon| \le \sigma$ almost surely, and $\mathbb{E}[\varepsilon] = 0$. We also refer to an individual’s features as variables. There is a distribution over the unmodified features in $\mathbb{R}^d$; we let $\mu$ be the mean and $\Sigma$ be the covariance matrix of this distribution; we note that the distribution of unmodified features may be degenerate, i.e., $\Sigma$ may not be full-rank. Throughout the paper, we set $\mu = 0$. (Footnote 2: This can be done whenever the learner can estimate the mean feature vector, since the learner can then center the features. The learner could estimate the mean by using unlabeled historical data; for example, she could collect data during a period when the algorithm does not make any decision on the agents, so that they would have no incentive to modify their features.)

The agents and the learner interact in an online fashion. At time $t \in \{1, \dots, T\}$, the learner first posts a regression estimate $\hat\beta_t$, then an agent (indexed by $t$) arrives with their unmodified feature vector $x_t$. Agent $t$ modifies the feature vector $x_t$ into $\bar{x}_t$ in response to $\hat\beta_t$, in order to improve their assigned score $\hat\beta_t^\top \bar{x}_t$. Finally, the learner observes the agent’s realized label after feature modification, given by $\bar{y}_t = {\beta^*}^\top \bar{x}_t + \varepsilon_t$.

Causal and non-causal features.

When an agent modifies a feature $k$, this may also affect the agent’s true label. We divide the coordinates of any given feature vector into causal and non-causal; causal features are features that inform and control an agent’s label, while non-causal features are those that do not affect an agent’s label. Formally, writing $\beta^*(k)$ for the $k$-th coordinate of the true regression parameter $\beta^*$, for any $k \in \{1, \dots, d\}$, feature $k$ is causal if and only if $\beta^*(k) \neq 0$, and non-causal if and only if $\beta^*(k) = 0$. An agent can modify his true label by modifying causal features.

Agents’ responses.

Agents modify their features so as to maximize their regression outcome; modifications are costly and agents are budgeted. We assume agent $t$ incurs a linear cost $c_t(\Delta_t) = \sum_{k=1}^{d} c_t(k)\,|\Delta_t(k)|$ to change his features by $\Delta_t$, and has a total budget of $B_t$ to modify his features. (Footnote 3: We make this assumption for simplicity. Our results extend to more general assumptions on the cost function. It suffices that our cost function does not induce modifications such that several features are modified in a perfectly correlated fashion. When several features are perfectly correlated, said features may become indistinguishable by the learner.) The pairs $(c_t, B_t)$ are drawn i.i.d. from a distribution $\mathcal{C}$ that is unknown to the analyst. We assume $\mathcal{C}$ has discrete support $\{(c^1, B^1), \dots, (c^l, B^l)\}$, and we denote by $\pi^i$ the probability that $(c_t, B_t) = (c^i, B^i)$. We assume $B^i > 0$ and $c^i(k) > 0$ for all $i$ and $k$; that is, every agent can modify his features, but no feature can be modified for free. (Footnote 4: Note that in our model, modifying a feature affects only that feature and the label, but does not affect the values of any other features. We leave exploration of more complex models of feature intervention to future work.)

When facing regression parameters $\hat\beta$, agent $t$ solves

$$\max_{\Delta \in \mathbb{R}^d} \; \hat\beta^\top (x_t + \Delta) \quad \text{s.t.} \quad \sum_{k=1}^{d} c_t(k)\,|\Delta(k)| \le B_t.$$

The solution of the above program does not depend on $x_t$, only on $(c_t, B_t)$ and $\hat\beta$, and is given by

$$\Delta_t = \operatorname{sign}\big(\hat\beta(k)\big)\,\frac{B_t}{c_t(k)}\, e_k, \qquad k \in \arg\max_{j} \frac{|\hat\beta(j)|}{c_t(j)},$$

where $e_k$ is the $k$-th standard basis vector, up to tie-breaking; when several features maximize $|\hat\beta(j)| / c_t(j)$, the agent modifies a single one of these features. We denote by $D_t$ the set of features that have been modified by at least one agent $s \le t$.
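The agent's program is a linear objective over a weighted $\ell_1$ ball, so spending the whole budget on the feature with the best score gain per unit of cost is optimal. The following is a minimal sketch of that best response. The function name `best_response`, the list-based representation, and the lowest-index tie-breaking rule are illustrative assumptions, not part of the paper.

```python
from typing import List


def best_response(beta_hat: List[float], cost: List[float], budget: float) -> List[float]:
    """Return a modification vector maximizing sum_k beta_hat[k] * delta[k]
    subject to sum_k cost[k] * |delta[k]| <= budget (all costs assumed > 0)."""
    d = len(beta_hat)
    # Score gained per unit of budget spent on each feature.
    gain = [abs(beta_hat[k]) / cost[k] for k in range(d)]
    # Assumed tie-breaking: max() keeps the lowest index among maximizers.
    k_star = max(range(d), key=lambda k: gain[k])
    delta = [0.0] * d
    if gain[k_star] > 0:
        sign = 1.0 if beta_hat[k_star] > 0 else -1.0
        # Spend the entire budget on the single most profitable feature.
        delta[k_star] = sign * budget / cost[k_star]
    return delta
```

Note that the unmodified features never enter the computation, which matches the observation that the solution depends only on the agent's cost, budget, and the posted parameters.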

Natural learner dynamics: least-squares regression.

Our goal here is to identify simple, natural learning dynamics that expose causal variables. The dynamics we consider are formally given in Algorithm 1; it is possible that more sophisticated learning algorithms could yield better guarantees with respect to regret and recovery.

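When the learner updates her regression parameters, say at time $\tau$, she does so based on the agent data observed up until time $\tau$. Writing $\bar{x}_t$ for agent $t$’s (modified) feature vector and $\bar{y}_t$ for his realized label, we model the learner as picking $\hat\beta$ from the set of solutions to the least-square regression problem run on the agents’ data up until time $\tau$, formally defined as

$$LSE(\tau) = \arg\min_{\beta \in \mathbb{R}^d} \sum_{t=1}^{\tau} \left(\bar{x}_t^\top \beta - \bar{y}_t\right)^2.$$

We introduce notation that will be useful for regression analysis. We let $\bar{X}_\tau \in \mathbb{R}^{\tau \times d}$ be the matrix of (modified) observations up until time $\tau$. Each row corresponds to an agent $t \le \tau$, and agent $t$’s row is given by $\bar{x}_t^\top$. Similarly, let $\bar{Y}_\tau = (\bar{y}_1, \dots, \bar{y}_\tau)^\top$. We can rewrite, for any $\tau$,

$$LSE(\tau) = \arg\min_{\beta \in \mathbb{R}^d} \left\|\bar{X}_\tau \beta - \bar{Y}_\tau\right\|_2^2.$$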

Agents are grouped in epochs.

The time horizon $T$ is divided into epochs of size $n$, where $n$ is chosen by the learner. At the start of every epoch $E$, the learner updates the posted regression parameter vector as a function of the history of play up until epoch $E$. We let $\tau(E) = En$ denote the last time step of epoch $E$. $D_{\tau(E)}$ denotes the set of features that have been modified by at least one agent by the end of epoch $E$.

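Learner picks (any) initial $\hat\beta_0$.
for every epoch $E \in \{1, \dots, T/n\}$ do
       for $t \in \{(E-1)n + 1, \dots, En\}$ do
             Agent $t$ reports his modified features $\bar{x}_t$, chosen as a best response to the posted $\hat\beta_{E-1}$. Learner observes the realized label $\bar{y}_t$.
       end for
       Learner picks $\hat\beta_E$ among the least-square solutions on all data observed up until time $En$.
end for
Algorithm 1 Online Regression with Epoch-Based Strategic Modification (epoch size $n$)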
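The epoch-based dynamics can be simulated in a few lines. The sketch below is illustrative only and rests on assumptions that are not taken from the paper: all features are perfect copies of one standard normal variable, every agent has unit costs and unit budget, noise is uniform on a small interval, and ties between least-square solutions are broken by NumPy's minimum-norm `lstsq` solution. The names `run_dynamics` and `best_response` are hypothetical.

```python
import numpy as np


def best_response(beta_hat, cost, budget):
    # Spend the whole budget on the feature with the largest |beta_hat| / cost.
    k = int(np.argmax(np.abs(beta_hat) / cost))
    delta = np.zeros_like(beta_hat)
    delta[k] = np.sign(beta_hat[k]) * budget / cost[k]
    return delta


def run_dynamics(beta_star, beta_init, n_epochs, n, rng, noise=0.1):
    """Simulate epochs of strategic responses followed by least-square updates."""
    d = len(beta_star)
    cost, budget = np.ones(d), 1.0          # assumed: a single agent type
    beta_hat = np.asarray(beta_init, dtype=float)
    rows, labels = [], []
    for _ in range(n_epochs):
        for _ in range(n):
            x = np.full(d, rng.normal())    # assumed: perfectly correlated features
            x_bar = x + best_response(beta_hat, cost, budget)
            y_bar = beta_star @ x_bar + rng.uniform(-noise, noise)
            rows.append(x_bar)
            labels.append(y_bar)
        # Least-square update on all data so far (minimum-norm if rank-deficient).
        beta_hat, *_ = np.linalg.lstsq(np.array(rows), np.array(labels), rcond=None)
    return beta_hat
```

For instance, with true parameter `(1, 0)` and an initial estimate `(0, 1)` that puts all weight on the non-causal copy, first-epoch agents invest in the non-causal feature, their labels do not improve, and the next least-square update shifts the weight to the causal feature.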

We first illustrate why unmodified observations are insufficient for any algorithm to distinguish causal from non-causal features. Consider a setting where non-causal features are convex combinations of the causal features in the underlying (unmodified) distribution. Absent additional information, a learner would be faced with degenerate sets of observations that have rank strictly less than $d$, the number of features, which can make accurate recovery of causality impossible:

Example 3.1.

Suppose , . Suppose feature is causal and feature is non-causal and correlated with : the distribution of unmodified features is such that for any feature vector , feature is identical to feature as . Then, any regression parameter of the form for assigns agents the same score as . Indeed,

In turn, in the absence of additional information other than the observed features and labels, is indistinguishable from any , many of which recover the causality structure poorly (e.g., consider any bounded away from ).
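A small numerical check makes this indistinguishability concrete. The values below are assumptions chosen for illustration (two features, the second an exact copy of the first, a true parameter placing all weight on the first); they are not claimed to be the values used in Example 3.1.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=500)

# Unmodified data: feature 2 is an exact copy of feature 1.
X = np.column_stack([z, z])

beta_star = np.array([1.0, 0.0])   # assumed: only feature 1 is causal
beta_alt = np.array([0.0, 1.0])    # puts all weight on the non-causal copy

# The observations span a one-dimensional subspace of R^2 ...
rank = np.linalg.matrix_rank(X)
# ... so both parameter vectors assign every agent the same score.
same_scores = np.allclose(X @ beta_star, X @ beta_alt)
```

Any learner that only sees `X` and the corresponding labels has no basis for preferring `beta_star` over `beta_alt`, even though they disagree completely about which feature is causal.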

We next illustrate that strategic agent modifications may aid in recovery of causal features, but only for those features that individuals actually invest in changing:

Example 3.2.

Consider a setting where , feature is causal, and features and are non-causal and correlated with feature as follows: for any feature vector , . Let . Consider a situation in which the labels are noiseless (i.e., almost surely). Suppose that agents only modify their causal feature by a (possibly random) amount .

Note that the difference (in absolute value) between the score obtained by applying a given regression parameter and the score obtained by applying to feature vector is given by

In particular, for appropriate distributions of and , the predictions of and coincide if and only if and . As such, the learner learns after enough observations that necessarily, . However, any regression parameter vector with , is indistinguishable from , and accurate recovery of and is impossible.

Note that even in the noiseless setting of Example 3.2, only the feature that has been modified can be recovered accurately. In more complex settings where the true labels are noisy, one should not hope to recover every feature well, but rather only those that have been modified sufficiently many times.

4 Recovery Guarantees for Modified Features

In this section, we focus on characterizing the recovery guarantees (with respect to the -norm) of Algorithm 1 at time for any epoch , with respect to the features that have been modified up until (that is, in epochs to ). We leave discussion of how the dynamics shape the set of modified features to Section 5.

The main result of this section guarantees the accuracy of the that the learning process converges to in its interaction with a sequence of strategic agents. The accuracy of the that is recovered for a particular feature naturally depends on the number of epochs in which that feature is modified by the agents. For a feature that is never modified, we have no ability to distinguish correlation from causation. Recovery improves as the number of observations of the modified variable increases.

Formally, our recovery guarantee is given by the following theorem:

Theorem 4.1 ( Recovery Guarantee for Modified Features).

Pick any epoch . With probability at least , for ,

where are instance-specific constants that only depend on , , , such that .

When the epoch size is chosen so that for , our recovery guarantee improves as becomes larger. When , our accuracy bound becomes ; this matches the well-known recovery guarantees of least square regression for a single batch of i.i.d. observations drawn from a non-degenerate distribution of features. When the epoch size is sub-linear in , the accuracy guarantee degrades to . This is because some features are modified in few epochs, that is, times, and the number of times such features are modified drives how accurately they can be recovered. (Footnote 5: In particular, as we will see, we expect correlated, non-causal features to only be modified in a small number of epochs: once a non-causal feature has been modified in a few epochs, it is accurately recovered. In further periods , the learner sets close to . This disincentivizes further modifications of feature .)

We provide a proof sketch below, and defer the full proof of Theorem 4.1 to Appendix A.

Proof sketch for Theorem 4.1.

We focus on the subspace of spanned by the observed features , and for any , we denote by the projection of onto .

First, we show via concentration that in this subspace, the mean-square error is strongly convex, with parameter (see Claim A.6). This strong convexity parameter is controlled by the smallest eigenvalue of over subspace . Formally, we lower bound this eigenvalue and show that with probability at least , for large enough,


Second, we bound the effect of the noise on the mean-squared error by in Lemma A.3, once again via concentration. Formally, we abuse notation and let , and show that with probability at least ,


Finally, we obtain the result via Lemma A.2, that shows the distance between and (restricted to ) decreases inversely proportionally to the magnitude of the strong convexity parameter, and increases proportionally to the noise in the mean-squared error. Formally, Lemma A.2 states that taking the first-order conditions on the mean-squared error yields

which can be combined with Equations (1) and (2) to show our bound with respect to sub-space . In turn, as the set of features modified up until time defines a sub-space of , our accuracy bound applies to . ∎

Remark 4.2.

We remark that Theorem 4.1 is not a direct consequence of the classical recovery guarantees of least-square regression. Such recovery guarantees leverage strong convexity of the mean-squared error in ; this error is strongly convex if and only if has rank , or equivalently the observations span . In contrast, our statement can deal with degenerate distributions over modified features, inducing observations that only span a strict sub-space of . Such distributions can arise in our setting, as evidenced by Examples 3.1 and 3.2.

5 Ensuring Exploration via Least Squares Tie-Breaking

In this section, we focus on ensuring that the interaction between the online learning process and the strategic modification results in modification of a diverse set of variables over time.

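Recall we are solving the following least-square problem at time $\tau(E)$, the last time step of epoch $E$, for all epochs $E$:

$$LSE(\tau(E)) = \arg\min_{\beta \in \mathbb{R}^d} \left\|\bar{X}_{\tau(E)}\, \beta - \bar{Y}_{\tau(E)}\right\|_2^2,$$

where $\bar{X}_{\tau(E)}$ is the matrix of (modified) observations and $\bar{Y}_{\tau(E)}$ the vector of labels observed up until time $\tau(E)$. An equivalent characterization of $LSE(\tau(E))$ is the set of solutions to the following linear system of equations:

$$\bar{X}_{\tau(E)}^\top \bar{X}_{\tau(E)}\, \beta = \bar{X}_{\tau(E)}^\top \bar{Y}_{\tau(E)}. \qquad (3)$$

When $\bar{X}_{\tau(E)}^\top \bar{X}_{\tau(E)}$ is invertible, this has a single solution, given by

$$\hat\beta = \left(\bar{X}_{\tau(E)}^\top \bar{X}_{\tau(E)}\right)^{-1} \bar{X}_{\tau(E)}^\top \bar{Y}_{\tau(E)}.$$

However, in our setting, it may be the case that $\bar{X}_{\tau(E)}^\top \bar{X}_{\tau(E)}$ is rank-deficient (see Examples 3.1 and 3.2). In this case, the system of (linear) equations (3) is under-determined and admits a continuum of solutions. This gives rise to the question of which least-square solutions are preferable in our setting, and how to break ties between several solutions.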
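To see the continuum of solutions concretely, one can compute the minimum-norm least-square solution together with a basis of the null space of the observation matrix: adding any null-space vector to a solution yields another solution of the normal equations. The sketch below uses NumPy's pseudo-inverse and SVD; the helper name `least_square_solutions` and the rank tolerance are illustrative assumptions.

```python
import numpy as np


def least_square_solutions(X, y):
    """Return (beta_min, N): the minimum-norm least-square solution and a
    matrix N whose orthonormal columns span the null space of X. Every
    solution of the normal equations has the form beta_min + N @ w."""
    beta_min = np.linalg.pinv(X) @ y
    # Rows of vt beyond the numerical rank span the null space of X.
    _, s, vt = np.linalg.svd(X, full_matrices=True)
    tol = max(X.shape) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    rank = int(np.sum(s > tol))
    N = vt[rank:].T
    return beta_min, N
```

When the observations span all of the feature space, `N` has zero columns and the solution is unique; otherwise each column of `N` is a direction along which the data cannot distinguish one parameter vector from another.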

The learner’s choice of regression parameters in each epoch affects the distribution of feature modifications in subsequent epochs. As the recovery guarantee of Theorem 4.1 only applies to features that have been modified, we would like our tie-breaking rule to regularly incentivize agents to modify new features. We first show that a natural, commonly used tie-breaking rule—picking the minimum norm solution to the least-square problem—may fail to do so:

Example 5.1.

Consider a setting with , and noiseless labels, i.e., always. Suppose that with probability , every agent has features , budget , and costs to modify each feature. We let the tie-breaking pick the solution with the least norm among all solutions to the least-square problem.

Pick any initial regression parameter with . For every agent in epoch , picks modification vector . This induces observations , . The set of least-square solutions (with error exactly ) in epoch is then given by , and the minimum-norm solution chosen at the end of epoch is . This solution incentivizes agents to set , and Algorithm 1 gets stuck in a loop where every agent reports , and the algorithm posts regression parameter vector in response, in every epoch . The second feature is never modified by any agent, and is not recovered accurately.

Example 5.1 highlights that a wrong choice of tie-breaking rule can lead Algorithm 1 to explore the same features over and over again. In response, we propose the following tie-breaking rule, described in Algorithm 2:

Input: Epoch , observations , parameter
Let .
if   then
       Find an orthonormal basis for .
       Set , renormalize .
       Pick a vector in with minimal norm.
       Set .
else
       Set be the unique element in .
end if
Output: .
Algorithm 2 Tie-Breaking Scheme at Time .

Intuitively, at the end of epoch , our tie-breaking rule picks a solution in with large norm. This ensures the existence of a feature that has not yet been modified up until time , and that is assigned a large weight by our least-square solution. In turn, this feature is more likely to be modified in future epochs.
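One way to realize a large-norm tie-breaking rule of this kind is sketched below. It is an assumption-laden illustration, not a transcription of Algorithm 2: it takes the minimum-norm least-square solution and adds `alpha` times a unit vector built from an orthonormal basis of the directions the observations do not span. The function name `tie_break` and the rank tolerance are hypothetical.

```python
import numpy as np


def tie_break(X, y, alpha):
    """Pick a least-square solution with large weight on unexplored directions."""
    beta_min = np.linalg.pinv(X) @ y          # minimum-norm least-square solution
    _, s, vt = np.linalg.svd(X, full_matrices=True)
    tol = max(X.shape) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    rank = int(np.sum(s > tol))
    if rank == X.shape[1]:
        return beta_min                       # full rank: the solution is unique
    basis = vt[rank:]                         # orthonormal basis of unexplored directions
    v = basis.sum(axis=0)
    v /= np.linalg.norm(v)                    # renormalize to unit length
    # v is orthogonal to every observation, so the squared error is unchanged.
    return beta_min + alpha * v
```

Because the added direction is orthogonal to all observed feature vectors, the returned vector remains a least-square solution while assigning a weight of magnitude on the order of `alpha` to directions no agent has yet moved in.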

Our main result in this section shows that the tie-breaking rule of Algorithm 2 eventually incentivizes the agents to modify all features, allowing for accurate recovery of in its entirety.

Theorem 5.2 (Recovery Guarantee with Tie-Breaking Scheme (Algorithm 2)).

Suppose the epoch size satisfies , and take to be

where are instance-specific constants that only depend on , , , and . If , we have with probability at least that at the end of the last epoch ,

under the tie-breaking rule of Algorithm 2 .

Remark 5.3.

The bound in Theorem 5.2 provides guidance for selecting the epoch length, so as to ensure optimal recovery guarantees. Under the natural assumption that , the optimal recovery rate is achieved when roughly . This results in an upper bound on the distance between the recovered regression parameters and .

We provide a proof sketch below. The full proof is given in Appendix B.

Proof sketch of Theorem 5.2.

For arbitrarily large, the norm of becomes arbitrarily large. Because at the end of epoch , guarantees accurate recovery of all features modified up until time , it must be that is arbitrarily large for some feature that has not yet been modified. In turn, this feature is modified in epoch . After epochs, and in particular for , this leads to . The recovery guarantee of Theorem 4.1 then applies to all features. ∎

6 Conclusions and Future Directions

This paper provides evidence that interaction between an online learner and individuals who strategically modify their features can result in discovery of causal features, also incentivizing individuals to invest in these features, rather than gaming. In future work, it would be natural to explore this interaction in richer and more complex settings.

7 Acknowledgments

The work of Yahav Bechavod and Katrina Ligett was supported in part by Israel Science Foundation (ISF) grant #1044/16, the United States Air Force and DARPA under contracts FA8750-16-C-0022 and FA8750-19-2-0222, and the Federmann Cyber Security Center in conjunction with the Israel national cyber directorate. Zhiwei Steven Wu was supported in part by the NSF FAI Award #1939606, a Google Faculty Research Award, a J.P. Morgan Faculty Award, a Facebook Research Award, and a Mozilla Research Grant. Juba Ziani was supported in part by the Inaugural PIMCO Graduate Fellowship at Caltech and the National Science Foundation through grant CNS-1518941. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and DARPA. We thank Mohammad Fereydounian and Aaron Roth for useful discussions.


Appendix A Proof of Theorem 4.1

A.1 Preliminaries

A.1.1 Useful concentration

Our proof will require applying the following concentration inequality, derived from Azuma’s inequality:

Lemma A.1.

Let be random variables in such that . Suppose for all , for all ,

Then, with probability at least ,


This is a reformulated version of Azuma’s inequality. To see this, define

and initialize . We start by noting that for all , since

we have

Further, it is easy to see that if and only if , hence

Combining the last two equations implies that

and the ’s define a martingale. Since for all ,

we can apply Azuma’s inequality to show that with probability at least ,

which immediately gives the result. ∎
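For reference, the standard form of Azuma's inequality invoked in this argument is recalled below. The symbols $Z_t$, $c_t$, $\epsilon$, and $\delta$ are generic and are not claimed to match the notation of Lemma A.1.

```latex
% Azuma-Hoeffding inequality (standard form).
% Let Z_0, Z_1, \dots, Z_T be a martingale with bounded differences:
% |Z_t - Z_{t-1}| \le c_t almost surely, for all t.
\[
  \Pr\bigl[\,|Z_T - Z_0| \ge \epsilon\,\bigr]
  \;\le\; 2\exp\!\left(-\frac{\epsilon^2}{2\sum_{t=1}^{T} c_t^2}\right)
  \qquad \text{for all } \epsilon > 0 .
\]
% Setting the right-hand side equal to \delta and solving for \epsilon gives
% the high-probability form:
\[
  |Z_T - Z_0| \;\le\; \sqrt{2 \ln(2/\delta) \sum_{t=1}^{T} c_t^2}
  \qquad \text{with probability at least } 1 - \delta .
\]
```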

A.1.2 Sub-space decomposition and projection

We will also need to divide $\mathbb{R}^d$ into several sub-spaces, and project our observations onto said sub-spaces.

Sub-space decomposition

We focus on the sub-space generated by the non-modified features ’s and the sub-space generated by the feature modifications ’s. We let be the rank of , and let be the non-zero eigenvalues of . Further, we let

be the unit eigenvectors (i.e., such that

) corresponding to eigenvalues of . As is a symmetric matrix, are orthonormal. We abuse notations in the proof of Theorem 4.1 and denote when clear from context.

For all , let be the unit vector such that and . At time , we denote the sub-space of spanned by the features in .

Finally, we let

be the Minkowski sum of sub-spaces and .

Projection onto sub-spaces

For any vector , sub-space of , we write where is the projection of onto sub-space , i.e. is uniquely defined as

for any orthonormal basis of . We also let be the projection on the orthogonal complement . In particular, is orthogonal to . Further, we write the matrix whose rows are given by for all .
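The projection onto a sub-space via an orthonormal basis can be computed as the sum of the inner products with each basis vector times that vector. A minimal sketch follows; the function name `project` and the convention that basis vectors are stored as rows are assumptions made for illustration.

```python
import numpy as np


def project(z, basis):
    """Project z onto the sub-space spanned by the orthonormal rows of `basis`.

    Computes sum_i (z . v_i) v_i, where v_i are the rows of `basis`.
    """
    basis = np.asarray(basis, dtype=float)
    coeffs = basis @ z          # inner products of z with each basis vector
    return basis.T @ coeffs     # recombine along the basis vectors
```

The remainder `z - project(z, basis)` is the projection onto the orthogonal complement and is orthogonal to every basis vector.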

A.2 Main Proof

Characterization of the least-square estimate via first-order conditions

First, for any least square solution at time , we write the first order conditions solved by , the projection of on sub-space . We abuse notations to let the vector of all ’s up until time , and state the result as follows:

Lemma A.2 (First-order conditions projected onto ).

Suppose . Then,


For simplicity of notations, we drop all indices and subscripts in this proof. Remember that

Since , it must satisfy the first order conditions given by

which can be rewritten as

Second, we note that for all , and (by definition of ). This immediately implies, in particular, that . In turn, for all , and

As such, the first order condition can be written

Now, we remark that

where the second-to-last equality follows from the fact that and are orthogonal, which immediately implies for all . To conclude the proof, we note that . Plugging this in the above equation, we obtain that

This can be rewritten

which completes the proof. ∎

Upper-bounding the right-hand side of the first order conditions

We now use concentration to give an upper bound on a function of the right-hand side of the first order conditions:

Lemma A.3.

With probability at least ,

where is a constant that only depends on the distribution of costs and the bound on the noise.


Pick any , and define . First, we remark that

In turn, where

Further, note that both and are independent of the history of play up through time , hence of , and that is further independent of (the distribution of is a function of the currently posted only, which only depends on the previous time steps). Noting that if are random variables, we have

and applying this with , , , we obtain

since and . Hence, we can apply Lemma A.1 and a union bound over all features to show that with probability at least ,

By Cauchy-Schwarz, we have

Strong convexity of the mean-squared error in sub-space

We give a lower bound on the eigenvalues of on sub-space , so as to show that at time , any least square solution satisfies

To do so, we will need the following concentration inequalities:

Lemma A.4.

Suppose . Fix for some . With probability at least , we have that




Deferred to Appendix A.2.1. ∎

We will also need the following statement on the norm of the projections of any to and :

Lemma A.5