A Monte Carlo Language Model Pipeline for Zero-Shot Sociopolitical Event Extraction
We consider dyadic zero-shot event extraction (EE), which identifies actions between pairs of actors. The zero-shot setting lets social scientists and other non-computational researchers extract any customized, user-specified set of events without training. The result is a dyadic event database that offers insight into sociopolitical relational dynamics among actors and the higher-level organizations or countries they represent. Unfortunately, we find that current zero-shot EE methods perform poorly on this task, with issues including word sense ambiguity, modality mismatch, and inefficiency. Straightforward application of large language model prompting typically performs even worse. We address these challenges with a new fine-grained, multi-stage generative question-answer method, using a Monte Carlo approach to exploit and overcome the randomness of generative outputs. Our method performs 90% fewer queries than a previous approach, with strong performance on the widely used Automatic Content Extraction dataset. Finally, we extend our method to extract affiliations of actor arguments and demonstrate our method and findings on a dyadic international relations case study.