Highlighting Object Category Immunity for the Generalization of Human-Object Interaction Detection
Human-Object Interaction (HOI) detection plays a core role in activity understanding. As a compositional learning problem (human-verb-object), studying its generalization matters. However, widely-used metric mean average precision (mAP) fails to model the compositional generalization well. Thus, we propose a novel metric, mPD (mean Performance Degradation), as a complementary of mAP to evaluate the performance gap among compositions of different objects and the same verb. Surprisingly, mPD reveals that previous methods usually generalize poorly. With mPD as a cue, we propose Object Category (OC) Immunity to boost HOI generalization. The idea is to prevent model from learning spurious object-verb correlations as a short-cut to over-fit the train set. To achieve OC-immunity, we propose an OC-immune network that decouples the inputs from OC, extracts OC-immune representations, and leverages uncertainty quantification to generalize to unseen objects. In both conventional and zero-shot experiments, our method achieves decent improvements. To fully evaluate the generalization, we design a new and more difficult benchmark, on which we present significant advantage. The code is available at https://github.com/Foruck/OC-Immunity.
READ FULL TEXT