Eliciting Knowledge from Language Models for Event Extraction

09/11/2021
by   Jiaju Lin, et al.
East China Normal University

Eliciting knowledge contained in language models via prompt-based learning has shown great potential in many natural language processing tasks, such as text classification and generation. However, its application to more complex tasks such as event extraction is less studied, since prompt design is not straightforward for complicated event types and arguments. In this paper, we explore eliciting knowledge from pre-trained language models for event trigger detection and argument extraction. Specifically, we present various joint trigger/argument prompt methods, which elicit more complementary knowledge by modeling the interactions between different triggers or arguments. Experimental results on the benchmark dataset ACE2005 show the advantages of our proposed approach. In particular, our approach is superior to recent advanced methods in the few-shot scenario, where only a few samples are used for training.

Introduction

Event extraction aims to transform text into structured records. For example, in Figure 1, the raw sentence is transformed into the structured records shown in the table on the right-hand side. These table-like records include event types, trigger words, and the corresponding arguments.

Mainstream methods can be roughly categorized into two groups: (1) sequence-tagging-based approaches that treat event extraction as trigger/argument identification and classification (Lin et al., 2020; Wadden et al., 2019), and (2) generation-based approaches that generate trigger and argument tokens given event descriptions or an event schema (Lu et al., 2021; Li et al., 2021). Recent studies focus on enhancing performance with pre-trained language models (PLMs) (Du and Cardie, 2020; Liu et al., 2020). In particular, QA-based MRC models have been proposed for event extraction, which manually design the relevant questions and take the answers as triggers/arguments.

Despite this inspiring progress, current methods still struggle with the following critical issues. (1) The knowledge contained in pre-trained language models is not fully exploited. Lu et al. (2021) reformulate event extraction as a generation task and introduce a sequence-to-structure model to extract triggers and arguments jointly, but the event schema they use lacks informative descriptions, so the potential of PLMs cannot be completely unleashed. (2) Mechanisms to capture the interactions among arguments are neither elegant nor optimal. Xiangyu et al. (2021) attempted to model the relations among both inter- and intra-event arguments via an RNN model; however, the performance depended heavily on named entity recognition, which is prone to error propagation.

Wei et al. (2021) introduced a teacher-student mechanism to distill knowledge from various teacher models. Specifically, the teacher models were trained to predict an argument given the relevant roles of a specific event, while the student model was trained to learn the knowledge from the teachers for argument prediction. To impart all-around knowledge for argument extraction, many well-trained teacher models are needed, which poses a big challenge for fine-tuning.

Figure 1: An example of event extraction from the ACE2005 corpus. Three events occur in this sentence. While only one trigger word, ‘killed’, links to two arguments directly, the other triggers’ arguments need to be inferred via the information provided by the ‘killed’ event.

To address the above issues, we propose a PrOmpt-based Knowledge Eliciting approach (PoKE), which elicits specific knowledge from PLMs for event extraction. Specifically, we first design trigger/argument prompts based on the annotation guidelines, and then ask the model to fill the blanks with trigger words or argument texts. In this way, event extraction is transformed into a conditional generation task, where the knowledge obtained during pre-training is elicited to generate the answer. In order to better capture the relations between different events and arguments, we present various joint prompt strategies to gather the complementary knowledge about triggers and arguments that occur in the same context.

Our approach is superior to existing methods in three respects. (1) We combine description understanding and interaction modeling. Not only can we elicit knowledge embedded in the PLM, but we also make full use of the PLM’s understanding ability. For instance, it is challenging for most existing models to extract all information correctly from the sentence in Figure 1, where the arguments of three events overlap each other and some trigger-argument linkages are not direct. However, our model is able to infer the ‘place’ and ‘target’ arguments of the ‘suicide bombing’ event with the help of the trigger ‘killed’, capturing the interactions between triggers efficiently. (2) Unlike current generation models that generate arguments according to a pre-defined schema, we generate arguments separately, so our results do not suffer from a sub-optimal generation order. Moreover, our model can take related arguments into consideration with the joint prompt. (3) Instead of introducing extra modules, we simply adjust the format of the input data to capture interactions among triggers and arguments. Such a simple mechanism reduces the number of model parameters and eases fine-tuning.

We evaluate our approach on the benchmark event extraction dataset ACE2005. Experimental results show that our approach outperforms the state-of-the-art methods for argument extraction. Besides, our full event extraction pipeline achieves competitive performance. It is also notable that our approach learns better in low-resource settings and outperforms recent advanced methods under the few-shot scenario.

Our contributions can be summarized as follows:

  • To the best of our knowledge, this is the first attempt to explore the effectiveness of eliciting knowledge from PLMs for event extraction, where prompt design is not straightforward due to the complicated types and arguments.

  • We develop various joint prompt strategies, which can capture the interactions among different triggers and among intra- and inter-event arguments, without any extra modules.

  • Extensive experiments on the most widely used ACE-2005 event extraction dataset demonstrate the superiority of our approach under both fully supervised and few-shot learning scenarios.

Figure 2: Architecture of PoKE for coarse-grained event detection. In this first step of event detection, we extract triggers for the eight main types. For this purpose, we introduce the external joint trigger prompt and the internal joint trigger prompt. We build the former by placing masks in the prompts, teaching the model to infer triggers from the combination of the passage and the prompts. To build the latter, we mask the triggers in the passage and ask the model to fill the masks using the information in the prompts. During this process, the model learns the interactions among triggers.

The Proposed Approach

In this section, we first present an overview of our framework. Then, we introduce various prompt methods to elicit knowledge from PLMs for trigger identification and argument extraction. After that, we describe the details of prompt-based event extraction.

Framework Overview

Our PoKE framework relies on the natural language understanding ability of T5. To be specific, the different prompts we design make our model understand the purpose of the task, i.e., comprehend the linkage between the answer and the given context, rather than memorize the distribution of answers. We split the event detection task into two parts. First, we generate the trigger words corresponding to the 8 main types in each sentence, i.e., Movement, Personnel, Conflict, Contact, Life, Transaction, Justice, and Business. With that result, we further classify triggers into 33 subtypes, also modeled as a generation task. For the event argument extraction task, our model fills the blank in each prompt with an argument span from the sentence, with the trigger words and event type provided.
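The sketch below illustrates this two-stage pipeline, assuming `generate` is any text-to-text callable (for instance, a fine-tuned T5 checkpoint wrapped to map an input string to an output string); the template wording and helper names are illustrative, not the exact templates used by PoKE.

```python
MAIN_TYPES = ["Movement", "Personnel", "Conflict", "Contact",
              "Life", "Transaction", "Justice", "Business"]

def detect_main_type_triggers(generate, passage):
    # Stage 1a: one joint prompt asks for a trigger word (or 'None') per type.
    prompts = " ".join(f"the trigger of the {t} event is <extra_id_{i}>."
                       for i, t in enumerate(MAIN_TYPES))
    return generate(f"{passage} In the passage above, {prompts}")

def classify_subtype(generate, passage, trigger, main_type):
    # Stage 1b: fine-grained classification of an identified trigger.
    return generate(f"{passage} The {main_type} event triggered by "
                    f"'{trigger}' is of subtype <extra_id_0>.")

def extract_argument(generate, passage, trigger, subtype, role_description):
    # Stage 2: fill the blank in an ACE-style role description.
    return generate(f"{passage} {role_description} "
                    f"Event type: {subtype}, trigger: {trigger}.")
```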

Event Detection

Owing to the diversity of event subtypes, we first extract triggers according to the 8 main types, and then classify them into subtypes. To extract triggers for the main types, we develop two joint trigger prompts that elicit more related knowledge by modeling the event interactions.

External Joint Trigger Prompt. Given a sentence, we append 8 prompts, one per main type, and ask the model to generate the trigger words jointly. In this manner, the model can detect the discrepancy between various event types. Intuitively, we use the event type as an essential part of the prompt. Besides, considering that the context may provide cues for the event, we expand the boundary of the sentence: with a window of 100 words, the sentence is placed in the middle and the context within the window is included in the final passage. To indicate which sentence the triggers should be extracted from, two special tokens are applied to mark the original boundary of the raw sentence. Furthermore, to make the paragraph sound more natural, we insert a bridge phrase, ‘In the passage above’, between the passage and the prompts. A typical external joint trigger prompt is shown in Figure 2. If an event does not occur in the sentence, we use ‘None’ to represent the NA answer.
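As an illustration, the snippet below sketches how such an input could be assembled under the description above; the boundary-marker names, sentinel tokens, and exact template wording are assumptions, not the authors' released code.

```python
MAIN_TYPES = ["Movement", "Personnel", "Conflict", "Contact",
              "Life", "Transaction", "Justice", "Business"]
SENT_START, SENT_END = "<sent>", "</sent>"   # boundary markers (assumed names)

def build_external_trigger_prompt(doc_words, sent_start, sent_end, window=100):
    """doc_words: all words of the document; [sent_start, sent_end) indexes the
    target sentence. Returns a single input string with eight type prompts."""
    half = max(0, (window - (sent_end - sent_start)) // 2)
    left = doc_words[max(0, sent_start - half):sent_start]
    right = doc_words[sent_end:sent_end + half]
    sentence = doc_words[sent_start:sent_end]
    passage = " ".join(left + [SENT_START] + sentence + [SENT_END] + right)
    prompts = " ".join(
        f"the trigger word of the {t} event is <extra_id_{i}>."
        for i, t in enumerate(MAIN_TYPES))     # model outputs 'None' if absent
    return f"{passage} In the passage above, {prompts}"
```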

Internal Joint Trigger Prompt. Given that relevant events tend to occur close to each other in a paragraph of text, it is sensible to teach our model how to capture the interactions between triggers. For that reason, we present the internal joint trigger prompt for trigger extraction. Events within one sentence have certain relations with each other. Besides, we assume that events appearing within a span of three sentences have potential interdependence. Thus we filter out ineligible samples and concatenate adjoining eligible sentences as raw sentences. Similar to the external prompt, we employ a window of 120 words to incorporate contextual information. Unlike the external joint trigger prompt, whose masks lie outside the raw sentence, the internal joint trigger prompt masks the trigger words inside the raw sentence, as shown in Figure 2.
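A possible construction of this prompt is sketched below, assuming gold trigger spans are available at training time; the sentinel tokens and hint wording are illustrative assumptions.

```python
def build_internal_trigger_prompt(passage_words, trigger_spans):
    """passage_words: words of the (windowed) passage; trigger_spans: a list of
    ((start, end), main_type) tuples for the gold triggers to be masked."""
    masked = list(passage_words)
    # Replace spans right-to-left so earlier indices stay valid after edits.
    for i, ((start, end), _) in sorted(enumerate(trigger_spans),
                                       key=lambda x: -x[1][0][0]):
        masked[start:end] = [f"<extra_id_{i}>"]
    hints = [f"<extra_id_{i}> triggers a {t} event"
             for i, (_, t) in enumerate(trigger_spans)]
    return " ".join(masked) + " In the passage above, " + "; ".join(hints) + "."
```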

With the identified triggers for the main event type, we need to further classify them into 33 subtypes. We inherit the trigger prompts and perform the classification among the subtypes. As shown in Figure 2, with the identified triggers ‘convicted’ and ‘sentenced’ for the main event type ‘Justice’, we perform fine-grained classification to classify them into the event subtypes ‘Convict’ and ‘Sentence’.

Argument Extraction

After event detection, we need to extract the event-related arguments for a better understanding of the event. For example, given the description of a ‘convicted’ event, it is necessary to identify the argument roles ‘defendant’ and ‘adjudicator’ to make further decisions. In this section, we present single and joint argument prompts for argument extraction.

Single Argument Prompt. For each role in an event, to incorporate more semantic information into our templates, we first utilize the event descriptions in the ACE annotation guidelines. In particular, we place a masked position in the argument description text and ask the model to generate the corresponding words. In the case where multiple entities in a passage represent the same argument, we insert a separator token in the gold answer to split them. In addition, the model needs knowledge of the trigger and its type. To provide the trigger and event information, we append the event type and the corresponding trigger to the end of the sentence. The trigger word, in both the sentence and the template, is surrounded by a pair of special tokens. An example of the single argument prompt is presented in Figure 3.
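A small sketch of how such an input and its gold answer could be built is given below; the trigger-marker and separator tokens, and the example description, are assumed names rather than the paper's exact ones.

```python
TRG_L, TRG_R = "<trg>", "</trg>"    # trigger markers (assumed names)
ANSWER_SEP = "<sep>"                # joins multiple gold entities (assumed name)

def build_single_argument_prompt(sentence_words, trigger_span, event_type,
                                 role_description):
    """role_description is an ACE-guideline sentence with one blank, e.g.
    'The person who is convicted is <extra_id_0>.' (illustrative)."""
    s, e = trigger_span
    trigger = " ".join(sentence_words[s:e])
    words = (sentence_words[:s] + [TRG_L] + sentence_words[s:e] + [TRG_R]
             + sentence_words[e:])
    return (" ".join(words) + f" This is a {event_type} event whose trigger is "
            f"{TRG_L} {trigger} {TRG_R}. " + role_description)

def format_gold_answer(entity_texts):
    # Multiple entities filling the same role are joined by the separator token.
    return f" {ANSWER_SEP} ".join(entity_texts) if entity_texts else "None"
```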

Joint Argument Prompt. When trying to understand an event from a passage, it is natural for a human to infer arguments from other co-occurring arguments. To simulate such a learning process, we present a joint argument prompt method to elicit more related knowledge by modeling the interactions among intra-event and inter-event arguments.

For each sentence that contains an event, we provide the corresponding argument information after it, while masking the argument words in the sentence. The ACE annotation guidelines are used to generate the argument descriptions, and the trigger word and event type are added after them. Besides, to prevent our model from cheating via heuristic relations, such as filling the blanks according to the order in which arguments are mentioned in the prompts, we shuffle the list of prompts randomly. Figure 3 shows an example of our joint argument prompt. With this prompt, more complementary knowledge can be elicited from the pre-trained language model, building solid relations between the descriptions in the prompts and the concrete argument roles for more accurate argument extraction.
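The snippet below sketches this construction under the same assumptions as the single-prompt sketch; the `<blank>` placeholder in the role descriptions and the shuffling detail are illustrative.

```python
import random

def build_joint_argument_prompt(sentence_words, argument_spans, event_type,
                                trigger, role_descriptions, seed=None):
    """argument_spans: list of ((start, end), role) for the gold arguments;
    role_descriptions maps a role to a guideline sentence containing '<blank>'."""
    rng = random.Random(seed)
    masked = list(sentence_words)
    # Replace spans right-to-left so earlier indices stay valid after edits.
    for i, ((start, end), _) in sorted(enumerate(argument_spans),
                                       key=lambda x: -x[1][0][0]):
        masked[start:end] = [f"<extra_id_{i}>"]
    prompts = [role_descriptions[role].replace("<blank>", f"<extra_id_{i}>")
               for i, (_, role) in enumerate(argument_spans)]
    rng.shuffle(prompts)   # prevent the model from relying on prompt order
    return (" ".join(masked) + " In the passage above, " + " ".join(prompts)
            + f" This is a {event_type} event triggered by {trigger}.")
```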

Figure 3: Architecture of PoKE for argument extraction. In the training stage, we design two varieties of prompts: the joint argument prompt and the single argument prompt. To build the former, we mask the arguments in the sentence and ask the model to fill the masks using the information in the prompts; during this process, the model learns the interactions among arguments and triggers. Meanwhile, we develop the single argument prompt, whose masks are in the prompts, to teach the model how to infer arguments in the inference stage.

Prompt-based Event Extraction

Since our base model architecture is an encoder-decoder design, we employ T5 for the generation task. The generation process follows the T5 pre-training task: a series of sentinel tokens is exploited as masks. We replace the target word spans with these special tokens, and the generation output is the answers joined by the same sentinel tokens, followed by an end-of-sequence token.
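For concreteness, the following is a minimal, hedged example of this T5-style input/target format using Hugging Face Transformers; the checkpoint and example strings are illustrative and not the authors' released artifacts.

```python
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Input: masked spans become T5 sentinel tokens; target: sentinels + answers.
source = ("Two soldiers were <extra_id_0> in the attack. In the passage above, "
          "the victim of the Die event is <extra_id_1>.")
target = "<extra_id_0> killed <extra_id_1> two soldiers"

batch = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss      # cross-entropy loss, cf. Eq. (2)

generated = model.generate(**batch, max_length=32)
print(tokenizer.decode(generated[0], skip_special_tokens=False))
```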

The generation process models the conditional probability of selecting a new token given the previous tokens and the input $x$ to the encoder:

$$p_\theta(y \mid x) = \prod_{t=1}^{|y|} p_\theta(y_t \mid y_{<t}, x) \quad (1)$$

The model is trained by minimizing the cross-entropy loss over the vocabulary $V$:

$$\mathcal{L} = -\sum_{t=1}^{|y|} \log p_\theta(y_t \mid y_{<t}, x) \quad (2)$$

At the inference stage, we use greedy decoding, i.e., choosing the highest-probability token at every time step. Since the vocabulary space is huge, the model may generate words that do not appear in the passage. We therefore constrain the candidate vocabulary to the set of tokens $V_x$ appearing in the input $x$, by setting the logits of extraneous words to $-\infty$:

$$\hat{y}_t = \arg\max_{w \in V_x} p_\theta(w \mid \hat{y}_{<t}, x) \quad (3)$$
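A sketch of this constrained greedy decoding is shown below, assuming a T5 model and tokenizer as in the earlier snippet; keeping the sentinel and end-of-sequence tokens in the allowed set is our assumption, and the implementation is illustrative rather than the authors' code.

```python
import torch

@torch.no_grad()
def constrained_greedy_decode(model, tokenizer, input_ids, max_len=48):
    # Allowed vocabulary: tokens of the encoder input plus sentinel/EOS tokens.
    allowed = set(input_ids[0].tolist())
    allowed.update(tokenizer.convert_tokens_to_ids(
        [f"<extra_id_{i}>" for i in range(100)]))
    allowed.add(tokenizer.eos_token_id)
    allowed_ids = torch.tensor(sorted(allowed))

    encoder_out = model.get_encoder()(input_ids=input_ids)
    decoded = torch.tensor([[model.config.decoder_start_token_id]])
    for _ in range(max_len):
        logits = model(encoder_outputs=encoder_out,
                       decoder_input_ids=decoded).logits[:, -1, :]
        mask = torch.full_like(logits, float("-inf"))
        mask[:, allowed_ids] = 0.0            # set extraneous logits to -inf
        next_id = (logits + mask).argmax(dim=-1, keepdim=True)
        decoded = torch.cat([decoded, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
    return decoded
```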
Model | Trig-C P | Trig-C R | Trig-C F1 | Arg-C P | Arg-C R | Arg-C F1 | PLM | Annotation
OneIE | - | - | 74.7 | - | - | 56.8 | BERT-large | Token+Entity
Text2Event | - | - | 69.2 | 46.7 | 53.4 | 49.8 | T5-base | Text-Record
TANL | - | - | 68.5 | - | - | 48.5 | T5-base | Token
EEQA | 71.1 | 73.7 | 72.3 | 56.7 | 50.2 | 53.3 | 2*BERT | Token
CondiGen | - | - | 71.1 | - | - | 53.7 | BART-large | Token
PoKE | 64.1 | 76.2 | 69.6 | 47.7 | 58.1 | 52.4 | T5-base | Token

Table 1: Experimental results of event extraction on ACE2005. Trig-C indicates trigger identification and classification; Arg-C indicates argument identification and classification. The PLM column indicates the pre-trained language model used by each method.
Model | P | R | F1 | Encoder/PLM
HMEAE | 66.04 | 68.58 | 67.28 | CNN
CondiGen | 69.66 | 67.63 | 68.83 | BART-large
TANL | 65.19 | 64.21 | 64.69 | T5-base
EEQA | 67.88 | 63.02 | 65.36 | 2*BERT
PoKE single only | 60.79 | 64.92 | 62.79 | T5-base
PoKE w.o. inter | 61.52 | 67.13 | 64.20 | T5-base
PoKE | 66.20 | 74.48 | 70.10 | T5-base

Table 2: Results of event argument extraction. ‘PoKE single only’ means the model is trained only with single argument prompts. ‘PoKE w.o. inter’ indicates that joint argument prompts are replaced with single prompts whose masks are in the passage, so no interaction information is imparted to the model.

Experiments

Experimental Setup

Dataset.  We evaluate our work on the most widely used ACE2005 dataset, which contains 599 documents annotated with 33 event subtypes and 35 argument types. For a fair comparison, we follow the same split and preprocessing steps as previous work (Yang et al., 2018; Wadden et al., 2019; Du and Cardie, 2020): 40 newswire documents as the test set, 30 randomly selected documents for development, and the remaining 529 documents for training.
Baselines.  For the complete event extraction task, we compare our method with five baselines. Considering the differences in annotation methods and label intensity, we categorize them into three types, following Lu et al. (2021). (1) Baseline using token and entity annotation: OneIE (Lin et al., 2020), an end-to-end system that employs global features to conduct named entity recognition, relation extraction, and event extraction at the same time.

(2) Baselines using token annotation: CondiGen (Li et al., 2021) is one of the cutting-edge methods for argument extraction. It utilizes the conditional generation ability of BART: given the descriptions of arguments and their relations, it generates the corresponding words in the sentence, and the candidate arguments are predicted jointly. However, it abandons the generation model for event detection and instead introduces a modified TapNet, formulating detection as a classic sequence tagging task. EEQA (Du and Cardie, 2020) is one of the earliest works to introduce a new paradigm for event extraction by formulating it as a question answering task; it generates questions from the annotation guidelines and raises a question for each argument under each event type. TANL (Paolini et al., 2021), a sequence-generation-based method, frames event extraction as a translation task between augmented natural languages; multi-task TANL extends TANL by transferring structural knowledge from other tasks.

(3) Baseline using text-record annotation: Text2Event (Lu et al., 2021), an end-to-end generation-based method that models event detection and argument extraction in a single model. Compared with other generation methods, it outputs a structured event record with the event type, trigger, and arguments.

Implementation. We optimize our model using AdamW (Kalchbrenner and Blunsom, 2013) with different hyperparameters for each task. For coarse-grained event detection, we set the learning rate to 1e-4 and train for 6 epochs. For fine-grained trigger classification, the learning rate is 5e-4, the batch size is 8, and we train for 3 epochs. For the argument extraction task, the learning rate is 1e-3 and the batch size is 64. The experiments are conducted on a single NVIDIA GeForce RTX 3090. The random seed is set to 42 for all tasks.
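As a reference, the snippet below collects these reported hyperparameters into a small configuration; anything not quoted in the text (weight decay, scheduler) is left at library defaults, which is our assumption, and the field names are illustrative.

```python
import torch
from torch.optim import AdamW

# Only values quoted above are reproduced here.
TASK_HPARAMS = {
    "coarse_event_detection":  {"lr": 1e-4, "epochs": 6},
    "trigger_classification":  {"lr": 5e-4, "epochs": 3, "batch_size": 8},
    "argument_extraction":     {"lr": 1e-3, "batch_size": 64},
}

def make_optimizer(model, task):
    torch.manual_seed(42)                 # fixed random seed reported above
    return AdamW(model.parameters(), lr=TASK_HPARAMS[task]["lr"])
```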

Supervised Learning Results

Event Extraction Results

The performance of PoKE and all baselines on ACE2005 is presented in Table 1, where ‘Trig-C’ indicates trigger identification and classification and ‘Arg-C’ reflects the results of argument identification and classification. We can observe that, by leveraging the interactions among triggers and arguments and taking advantage of prompts, our model achieves comparable or better performance than the baseline models. Compared with the other two generation-based models, TANL and Text2Event, our method surpasses both in the event detection and argument extraction tasks. Notice that the results of argument extraction are directly affected by the output of event detection: an erroneous event-type prediction provides wrong candidate classes for the argument extractor. Hence the modest reduction in F1 from trigger to argument extraction indicates our merits in argument extraction.

Argument Extraction Results

The current comparisons are based on the trigger predictions of previous event detection work (Wang et al., 2019). This may be reasonable when no further advances can be made in event detection. However, with the rapid evolution of event detection methods (Pouran Ben Veyseh et al., 2021), we believe this previous criterion no longer shows the whole picture of each architecture’s ability. Hence, following the hyperparameter settings mentioned in their papers, we reproduce the results of several cutting-edge models with gold triggers as input for a fair comparison. Meanwhile, we introduce another baseline developed specifically for event argument extraction: HMEAE (Wang et al., 2019), which applies a concept hierarchy over argument roles; with the help of hierarchical modular attention, the correlation between argument roles sharing the same high-level unit can be utilized. Note that HMEAE makes use of named entity information as candidate arguments. We also conduct an ablation study to examine the effectiveness of our model: ‘PoKE single only’ means we train the model only with single argument prompts, and ‘PoKE w.o. inter’ indicates that we replace joint argument prompts with single prompts whose masks are in the passage, so no interaction information is imparted to the model.
The results are shown in Table 2, from which we can observe the following. (1) Our method surpasses all baseline methods, with an absolute improvement of at least 1.3 F1. (2) Models that consider the interactions between arguments perform better. For instance, CondiGen is trained to generate all arguments of a single event jointly; although the manual templates used for its inference are sub-optimal, it surpasses EEQA by 3.1 F1. Taking advantage of both joint prediction and informative descriptions, our model achieves state-of-the-art performance. (3) Named entity annotation is not an absolute necessity. Even without the assistance of named entity recognition results, models utilizing PLMs achieve competitive performance. In particular, the input of HMEAE consists of sentences with named entity annotation, yet its encoder, a vanilla CNN, restricts its ability. (4) Joint argument prompts enhance the model’s performance dramatically, by almost 8 F1, from 62.79 to 70.10.

Figure 4: Few-shot experiment results. K-shot corresponds to the number of instances per event type sampled for the training and validation sets.

Few Shot Scenario

Imitating the few-shot settings in LM-BFF (Gao et al., 2021), we conduct 8-, 16-, and 32-shot experiments to examine the few-shot ability of our model. Specifically, for each event type, we sample k instances from the initial training and validation sets. After training, we test the model’s performance on the standard full-scale test set. The experimental results are shown in Figure 4. It can be observed that our model performs better than the baselines in the few-shot scenario. Although EEQA and CondiGen both have outstanding zero-shot ability, in the few-shot setting they are exceeded by PoKE; even a 32-shot trained EEQA model cannot overtake PoKE trained with 8-shot data. Our mechanism brings naturalness into the prompts, which ensures its few-shot performance.
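An illustrative sketch of this k-shot sampling protocol is given below; the field name of the example records is an assumption.

```python
import random
from collections import defaultdict

def sample_k_shot(examples, k, seed=42):
    """examples: list of dicts with at least an 'event_type' key (assumed)."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for ex in examples:
        by_type[ex["event_type"]].append(ex)
    subset = []
    for items in by_type.values():
        subset.extend(rng.sample(items, min(k, len(items))))
    return subset

# e.g. train_8shot = sample_k_shot(train_examples, k=8)
```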

Method | P | R | F1
EDA | 54.46 | 77.43 | 63.94
BackTranslation | 54.03 | 76.95 | 63.48
PoKE single only | 60.79 | 64.92 | 62.79
PoKE | 66.20 | 74.48 | 70.10

Table 3: Comparison of PoKE with data augmentation methods.

Comparison with Data Augmentation Methods

PoKE increases the number and variety of training samples, which naturally enhances the model’s ability; data augmentation methods can improve a model’s performance in the same way. Hence we compare our method with the two most common data augmentation methods: back translation and EDA (Wei and Zou, 2019), which consists of four word-level operations (replacement, insertion, swap, and deletion) to boost the diversity of sentences. Experiments are conducted on the argument extraction task. We discard the interaction-learning data for the two data augmentation methods and expand the argument inference data to the same amount of training data used for PoKE. The results are shown in Table 3. We can see that, compared with using the inference data only, data augmentation methods do improve the model’s performance; nonetheless, the improvement is limited. This further demonstrates that our mechanism unleashes the PLM’s understanding ability.
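For reference, the snippet below sketches two of the four EDA word-level operations (random swap and random deletion); synonym replacement and insertion would additionally require a thesaurus such as WordNet, so they are omitted here.

```python
import random

def random_swap(words, n=1, rng=random):
    """Swap two randomly chosen positions, n times."""
    words = list(words)
    for _ in range(n):
        i, j = rng.randrange(len(words)), rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1, rng=random):
    """Delete each word with probability p, never returning an empty sentence."""
    kept = [w for w in words if rng.random() > p]
    return kept or [rng.choice(list(words))]
```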

Figure 5: Case study of PoKE and EEQA

Case Study and Error Analysis

Case Study

To demonstrate how our model utilizes information from co-occurring events, we show the extraction results for samples with multiple events in Figure 5. Sentence S1 contains two trigger words, ‘destroyed’ and ‘killed’, indicating ‘Die’ and ‘Attack’ events respectively. It is a common fact that the agent responsible for a death is usually the attacking agent in an attack. Also, the word ‘commandos’ is the subject of the sentence and is strongly syntactically related to both triggers. Given that, our model is capable of exploiting the hidden clues and makes the correct predictions of ‘Die_Agent’ and ‘Attack_Attacker’. Besides, there is a nominally false prediction of ‘Victim’: our model identifies ‘soldiers’ as victims while no ‘Victim’ is annotated; we believe it is the gold answer that contains the mistake. In many cases, our model makes more sensible argument extractions than the human-annotated data.

Error Analysis

To facilitate further research, we investigate the error predictions of our model. We randomly sampled 50 wrong answers and attributed them to 5 types. Here we list the 3 most common problems caused by our model design.

Shallow heuristics obtained from the training set (24%). Data in the training set always contain some bias that induces heuristic relations between entity mentions and the corresponding argument types. The model learns these heuristic relations to enhance its performance while neglecting the intended task. We define the event argument extraction task as utilizing the sentence, the trigger, and the event type to infer the corresponding arguments. However, we observed that, in some cases, the extraction of arguments relies only on the entity mentions in the sentence and the event type, omitting the relation between the entity and the trigger word. For example, in the training set, when an event about ‘Justice’ and the entity ‘court’ appear in the same sentence, ‘court’ always plays the role of ‘Adjudicator’. Thus, when the same situation occurs in the inference period, our model identifies ‘court’ in the sentence as an ‘Adjudicator’, regardless of whether it is linked to the trigger word.

One of the error predictions is as follows:
The Belgrade district court said that Markovic will be tried along with 10 other Milosevic-era officials who face similar charges of inappropriate use of state property that carry a sentence of up to five years in jail.

In this instance, ‘court’ has no direct relation to the trigger ‘charge’. Thus the gold annotation of ‘Adjudicator’ is ‘None’. However, due to the wrong experience accumulated in the training stage, our model classifies it as an adjudicator because the sentence contains a Justice-Sentence event.

Ambiguous expression (20%). Owing to the ambiguity of natural language, even human readers would make some mistakes, especially when no context is given. An ambiguous expression error refers to the case where our model makes a prediction that is inconsistent with the gold answer but reasonable from some perspective. These mistakes usually occur when inferring ‘Place’ arguments or arguments expressed implicitly. For example, in the sentence ‘Kelly, the US assistant secretary for East Asia and Pacific Affairs, arrived in Seoul from Beijing Friday to meet Yoon, the foreign minister.’, Seoul is predicted as the ‘Place’ of the ‘Contact.Meet’ event, but no ‘Place’ argument is annotated. It is true that the place of the meeting is not mentioned explicitly, but for humans, using such phrasing to imply implicit arguments is not rare.
We also find a gold sample expressing a place argument implicitly: ‘Illinois Senator Carol Moseley-Braun recently decided to run for president.’ Here ‘Illinois’ is annotated as the place where the employment relationship ends.

Coreference & head-of-entity mistakes (10%). Errors of this kind refer to cases where the model predicts a coreferent of the gold annotation, or the whole entity span rather than its head. For example, in “Erdogan, a leader of Turkey’s pro-Islamic movement when he was jailed, said he moderated his policies in prison.”, ‘Erdogan’ is predicted as the ‘Person’ argument of the ‘Justice.Arrest-Jail’ event, while the gold annotation is ‘leader’. For the head-of-entity mistake, the model fails to identify the head word of the entity span correctly. Here is a concrete example: “Anwar will be taken to the court early Friday for a bail application pending his appeal to the country’s highest Federal Court against his sodomy conviction, counsel Sankara Nair said.” The gold answer for ‘Adjudicator’ is ‘Court’, but ‘Federal Court’ is predicted by our model.

Related Work

Event extraction

Most efforts in the event extraction field are concentrated on the ACE2005 corpus, and a variety of neural models have been exploited for better performance. Nguyen and Grishman (2015) first introduced Convolutional Neural Networks (CNNs) for event detection to obtain better feature representations, and Nguyen et al. (2016) employed recurrent neural networks to build a robust model. Besides, Graph Convolutional Networks (Liu et al., 2018) and plenty of other ideas have been applied to handle the difficulties encountered (Lu et al., 2021). Recently, there has been a trend of modeling event extraction as QA-based tasks (Du and Cardie, 2020; Liu et al., 2020); these studies employ the reading comprehension abilities of language models. Despite these advances, current methods still suffer from the bottlenecks mentioned in the Introduction.

Prompt-based learning

Prompt-based learning is based on language models that model the probability of text directly. A prompt is a piece of text that contains descriptive information about the answer to the task, and the language model is used to probabilistically fill the unfilled slot to obtain the final string as the task prediction. Exploration of prompt methods can be categorized into prompt engineering, which refers to creating a prompting function that results in efficient performance on the downstream task (Shin et al., 2020; Jiang et al., 2020); answer engineering, which targets finding an optimal answer space and a mapping to the original output (Shin et al., 2020); and multi-prompt learning, which involves prompt ensembling (Yuan et al., 2021), prompt augmentation (Gao et al., 2021), prompt composition (Han et al., 2021), and prompt decomposition (Cui et al., 2021). In this work, we utilize manual prompts and adopt the idea of prompt ensembling, but only modify the format of the prompts in the training stage to increase the PLM’s understanding of the task.

Conclusion and Future Work

We propose a prompt-based approach (PoKE), which includes various trigger/argument prompts to elicit knowledge from language models for event extraction. In particular, the prompts model the interactions between different events and arguments, which elicits more complementary knowledge and yields better performance. The experimental results demonstrate the superiority of our approach, which achieves comparable or better performance without additional model parameters or modules. It is also notable that it still performs well under the few-shot scenario. In the future, we will introduce automatic prompt methods for this task and explore the effectiveness of prompts for joint event extraction.

References

  • L. Cui, Y. Wu, J. Liu, S. Yang, and Y. Zhang (2021) Template-based named entity recognition using BART. CoRR abs/2106.01760.
  • X. Du and C. Cardie (2020) Event extraction by answering (almost) natural questions. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, pp. 671–683.
  • T. Gao, A. Fisch, and D. Chen (2021) Making pre-trained language models better few-shot learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 3816–3830.
  • X. Han, W. Zhao, N. Ding, Z. Liu, and M. Sun (2021) PTR: prompt tuning with rules for text classification. CoRR abs/2105.11259.
  • Z. Jiang, F. F. Xu, J. Araki, and G. Neubig (2020) How can we know what language models know? Transactions of the Association for Computational Linguistics 8, pp. 423–438.
  • N. Kalchbrenner and P. Blunsom (2013) Recurrent continuous translation models. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington, USA, pp. 1700–1709.
  • S. Li, H. Ji, and J. Han (2021) Document-level event argument extraction by conditional generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, pp. 894–908.
  • Y. Lin, H. Ji, F. Huang, and L. Wu (2020) A joint neural model for information extraction with global features. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, pp. 7999–8009.
  • J. Liu, Y. Chen, K. Liu, W. Bi, and X. Liu (2020) Event extraction as machine reading comprehension. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, pp. 1641–1651.
  • X. Liu, Z. Luo, and H. Huang (2018) Jointly multiple events extraction via attention-based graph information aggregation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, pp. 1247–1256.
  • Y. Lu, H. Lin, J. Xu, X. Han, J. Tang, A. Li, L. Sun, M. Liao, and S. Chen (2021) Text2Event: controllable sequence-to-structure generation for end-to-end event extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 2795–2806.
  • T. H. Nguyen, K. Cho, and R. Grishman (2016) Joint event extraction via recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California, pp. 300–309.
  • T. H. Nguyen and R. Grishman (2015) Event detection and domain adaptation with convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Beijing, China, pp. 365–371.
  • G. Paolini, B. Athiwaratkun, J. Krone, J. Ma, A. Achille, R. Anubhai, C. N. dos Santos, B. Xiang, and S. Soatto (2021) Structured prediction as translation between augmented natural languages. In 9th International Conference on Learning Representations, ICLR 2021.
  • A. Pouran Ben Veyseh, V. Lai, F. Dernoncourt, and T. H. Nguyen (2021) Unleash GPT-2 power for event detection. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 6271–6282.
  • T. Shin, Y. Razeghi, R. L. Logan IV, E. Wallace, and S. Singh (2020) AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, pp. 4222–4235.
  • D. Wadden, U. Wennberg, Y. Luan, and H. Hajishirzi (2019) Entity, relation, and event extraction with contextualized span representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 5784–5789.
  • X. Wang, X. Han, Z. Liu, M. Sun, and P. Li (2019) Adversarial training for weakly supervised event detection. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 998–1008.
  • X. Wang, Z. Wang, X. Han, Z. Liu, J. Li, P. Li, M. Sun, J. Zhou, and X. Ren (2019) HMEAE: hierarchical modular event argument extraction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 5777–5783.
  • J. Wei and K. Zou (2019) EDA: easy data augmentation techniques for boosting performance on text classification tasks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 6382–6388.
  • K. Wei, X. Sun, Z. Zhang, J. Zhang, G. Zhi, and L. Jin (2021) Trigger is not sufficient: exploiting frame-aware knowledge for implicit event argument extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 4672–4682.
  • X. Xiangyu, W. Ye, S. Zhang, Q. Wang, H. Jiang, and W. Wu (2021) Capturing event argument interaction via a bi-directional entity-level recurrent decoder. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 210–219.
  • P. Yang, X. Sun, W. Li, S. Ma, W. Wu, and H. Wang (2018) SGM: sequence generation model for multi-label classification. In Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA, pp. 3915–3926.
  • W. Yuan, G. Neubig, and P. Liu (2021) BARTScore: evaluating generated text as text generation. arXiv:2106.11520.