Binarised Regression with Instance-Varying Costs: Evaluation using Impact Curves
Many evaluation methods exist, each for a particular prediction task, and there are a number of prediction tasks commonly performed including classification and regression. In binarised regression, binary decisions are generated from a learned regression model (or real-valued dependent variable), which is useful when the division between instances that should be predicted positive or negative depends on the utility. For example, in mining, the boundary between a valuable rock and a waste rock depends on the market price of various metals, which varies with time. This paper proposes impact curves to evaluate binarised regression with instance-varying costs, where some instances are much worse to be classified as positive (or negative) than other instances; e.g., it is much worse to throw away a high-grade gold rock than a medium-grade copper-ore rock, even if the mine wishes to keep both because both are profitable. We show how to construct an impact curve for a variety of domains, including examples from healthcare, mining, and entertainment. Impact curves optimize binary decisions across all utilities of the chosen utility function, identify the conditions where one model may be favoured over another, and quantitatively assess improvement between competing models.
READ FULL TEXT