Event Extraction (EE) is an important yet challenging task in information extraction research. As a particular form of information, an event refers to a specific occurrence of something that happens at a certain time and place and involves one or more participants; it can frequently be described as a change of state (ACE (Automatic Content Extraction) English Annotation Guidelines for Events, Linguistic Data Consortium, Philadelphia, PA, USA, 2005). The event extraction task aims to extract such event information from unstructured plain text into a structured form, which mostly describes the “who, when, where, what, why” and “how” of real-world events. In terms of application, the task makes it easier to retrieve event information and to analyze people’s behaviors, supporting information retrieval (Zhang et al., 2021; Kuhnle et al., 2021), recommendation (Liu et al., 2017a; Gao et al., 2016), intelligent question answering (Boyd-Graber and Börschinger, 2020; Cao et al., 2020a), knowledge graph construction (Wu et al., 2019; Bosselut et al., 2021), and other applications (Su et al., 2021; Liu et al., 2020a; Ma et al., 2021).
Event extraction can be divided into two levels: schema-based event extraction (Yang and Mitchell, 2016; Ferguson et al., 2018; Sheng et al., 2021) and open domain-based event extraction (Chau et al., 2019; Liu et al., 2019b; Mejri and Akaichi, 2017). In the schema-based event extraction task, an event is considered an objective fact in which specific people and objects interact at a specific time and place. Schema-based event extraction finds the words that belong to a specific event schema, which describes an action or state change that occurs; its extraction targets include time, place, person, and action. In the open-domain event extraction task, events are considered a set of related descriptions of a topic, which can be formed by classification or clustering. Open domain-based event extraction acquires a series of events related to a specific theme, which is usually composed of multiple events. Whether schema-based or open-domain, the purpose of event extraction is to capture the event types of interest from large volumes of text and to present the essential arguments of events in a structured form.
We focus on schema-based event extraction, which has a large body of work and is a relatively mature research area. Schema-based event extraction discovers event mentions in text and extracts events containing event triggers and event arguments. Event mentions are sentences containing one or more triggers and arguments. Event extraction requires identifying the event, classifying the event type, identifying the arguments, and judging the argument roles. Trigger identification and event classification together form the event detection task (Li et al., 2020d; Liao et al., 2021; Lin et al., 2019; Cao et al., 2021). Argument identification and argument role classification together form the argument extraction task. Event classification is a multi-label text classification task (Aly et al., 2019; Chalkidis et al., 2019; Chang et al., 2020) that classifies the type of each event. Role classification is a multi-class classification task based on word pairs, which determines the role relationship between any pair of trigger and entity in a sentence. Therefore, event extraction can depend on the results of other NLP tasks such as named entity recognition (NER) (Li et al., 2020c; Yu et al., 2020; Lin et al., 2020), semantic parsing (Cao et al., 2020b; Stengel-Eskin et al., 2020; Abdelaziz et al., 2021), and relation extraction (Chen et al., 2021; Ahmad et al., 2021; Sun et al., 2021).
We give the flow chart of event extraction in Fig. 1. First, for a given text, it is necessary to determine the event type. A different event schema is designed for each event type. At present, event schemas are designed in two main ways: manual design and model generation. Then, event arguments are extracted according to the schema. In the earliest stage, argument extraction was regarded as a word classification task in which each word in the text is classified. In addition, there are sequence tagging and machine reading comprehension (MRC) methods. Finally, because of the complexity of event extraction tasks, researchers consider introducing external knowledge to improve model performance.
Deep learning methods have been applied in many fields in recent years, and deep learning models can automatically and effectively extract the significant features of sentences. Unlike traditional feature extraction methods, deep learning methods extract features automatically. They can model semantic information and automatically combine and match trigger features at a higher level. The effectiveness of these methods has been verified in natural language processing, where many breakthroughs have been made. Using deep learning in event extraction frees researchers from much of the manual feature extraction work.
Most deep learning-based event extraction methods adopt supervised learning, which means that a large, high-quality dataset is needed. ACE 2005 (Doddington et al., 2004) is one of the few labeled event datasets available; it was manually labeled on news, blogs, interviews, and other data. The small scale of the ACE data is a major constraint on the development of event extraction. Relying on manual annotation of corpus data is time-consuming and costly, which leads to the small scale, few types, and uneven distribution of existing event corpora.
The event extraction task can be complex. A sentence may contain multiple event types, and different event types may share an event argument, while the role of the same argument can differ across events. According to the extraction paradigm, schema-based methods can be divided into pipeline-based and joint-based models. A pipeline-based model first learns the event detection model and then learns the argument extraction model. The joint event extraction method avoids the influence of trigger identification errors on argument extraction, but it cannot utilize the information of event triggers.
Traditional event extraction methods require feature design, whereas deep learning event extraction methods learn features end to end. We comprehensively analyze the existing deep learning-based event extraction methods and give an outlook on future research. The main contributions of this paper are as follows:
We introduce the event extraction technology, review the development history of event extraction methods, and point out that the event extraction method based on deep learning has become the mainstream. We summarize the necessary information of deep learning models according to year of publication in Table 1, including models, domain, venues, datasets and subtasks.
We analyze various deep learning-based extraction paradigms and models in detail, including their advantages and disadvantages. We introduce the currently available datasets and give the formulation of the main evaluation metrics. We summarize the necessary information of the primary datasets in Table 3, such as the number of categories, language, and data addresses.
We summarize event extraction accuracy scores on the ACE 2005 dataset in Table 5 and conclude the review by discussing future research trends in event extraction.
1.2. Organization of the Survey
The rest of the survey is organized as follows. Section 2 introduces the concepts and the different task definitions of event extraction. Section 3 summarizes the existing paradigms for event extraction, namely pipeline-based methods and joint-based methods, together with a summary table. Section 4 introduces traditional event extraction and deep learning-based event extraction with a comparison. Section 5 and Section 6 introduce the primary datasets and metrics. We then give quantitative results of the leading models on classic event extraction datasets in Section 7. Finally, we summarize the main challenges for deep learning event extraction in Section 8 before concluding the article in Section 9.
2. Event Extraction
An event indicates the occurrence of an action or state change, often driven by verbs or gerunds. It contains the primary components involved in the action, such as time, place, and person. Event extraction technology extracts events that users are interested in from unstructured information and presents them in a structured form (Chen et al., 2015). In short, event extraction extracts the core arguments from the text, as shown in Fig. 2. Given a text, an event extraction system predicts the event mentions in the text and the triggers and arguments corresponding to each event, and classifies the role of each argument. In the example of Fig. 2, event extraction needs to recognize two events (Die and Attack), triggered by the words “died” and “fired” respectively. For the Die event, “Baghdad”, “cameraman” and “American tank” take on the argument roles Place, Victim and Instrument respectively. For the Attack event, “Baghdad” and “American tank” take on the argument roles Place and Instrument respectively, and “cameraman” and “Palestine Hotel” both take on the argument role Target.
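The structured output described for this example can be written down as a small record type. The following is a minimal sketch, not a format prescribed by ACE or by any model in this survey; the class and field names (`Event`, `Argument`, `roles`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    text: str  # argument mention as it appears in the sentence
    role: str  # role the argument plays in the event


@dataclass
class Event:
    event_type: str
    trigger: str
    arguments: List[Argument] = field(default_factory=list)

    def roles(self, role: str) -> List[str]:
        """Return all argument mentions that fill the given role."""
        return [a.text for a in self.arguments if a.role == role]


# The two events described for the example sentence of Fig. 2.
die = Event("Die", "died", [
    Argument("Baghdad", "Place"),
    Argument("cameraman", "Victim"),
    Argument("American tank", "Instrument"),
])
attack = Event("Attack", "fired", [
    Argument("Baghdad", "Place"),
    Argument("American tank", "Instrument"),
    Argument("cameraman", "Target"),
    Argument("Palestine Hotel", "Target"),
])

for ev in (die, attack):
    print(ev.event_type, ev.trigger, [(a.role, a.text) for a in ev.arguments])
```

Note that the same mention can appear in both events with different roles: “cameraman” is the Victim of Die and a Target of Attack, which is why roles are stored per event rather than per entity.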
Event extraction involves many frontier disciplines, such as machine learning, pattern matching, and NLP. At the same time, event extraction in various fields can help relevant personnel quickly extract relevant content from massive amounts of information, improve work timeliness, and provide technical support for quantitative analysis. Therefore, event extraction has broad application prospects in various fields. Typically, Automatic Content Extraction (ACE) describes an event extraction task using the following terminology:
Entity: The entity is an object or group of objects in a semantic category. Entity mainly includes people, organizations, places, times, things, etc.
Event mentions: The phrases or sentences that describe the event, containing a trigger and the corresponding arguments.
Event type: The event type describes the nature of the event and refers to the category to which the event corresponds, usually represented by the type of the event trigger.
Event trigger: The event trigger is the core unit in event extraction, typically a verb or a noun. Trigger identification is a key step in pipeline-based event extraction.
Event argument: The event argument is the main attribute of an event. Arguments include entities, non-entity participants, time expressions, and so on.
Argument role: An argument role is a role played by an argument in an event, that is, the relationship representation between the event arguments and the event triggers.
Schema-based event extraction includes four sub-tasks: event classification, trigger identification, argument identification and argument role classification.
Event classification: Event classification determines whether each sentence is an event. Furthermore, if the sentence is an event, we need to determine the one or several event types the sentence belongs to. Therefore, the event classification sub-task can be seen as a multi-label text classification task.
Trigger identification: It is generally considered that the trigger is the core unit in event extraction that can clearly express an event’s occurrence. The trigger identification sub-task is to find the trigger in the text.
Argument identification: Argument identification is to identify all the arguments contained in an event type from the text. Argument identification usually depends on the result of event classification and trigger identification.
Argument role classification: Argument role classification is based on the argument roles defined in the event extraction schema: each identified argument is assigned to one of these role categories. Thus, it can also be seen as a multi-label text classification task.
2.2. Task Definition
Event extraction is a representative hot topic in information extraction, which studies how to extract specific types of event information from unstructured text (news, blogs, etc.). It can be simplified into multiple classification tasks, which determine the type of event and the argument role that each entity belongs to. The classification method depends on named entity recognition (NER), which leads to error propagation. To address this, event extraction methods based on sequence labeling have been proposed, which label the start and end positions of each argument. The task of event extraction is complex, and arguments are closely related to each other. Machine reading comprehension (MRC) is therefore adopted to learn these associations, and each argument is found through question-answer pairs. Consequently, the event extraction task can be regarded as a classification task, a sequence labeling task, or a machine reading comprehension task. The definitions of these three tasks are given in more detail as follows.
2.2.1. Classification Task
For the classification task, authors usually predefine $N$ event types and their corresponding argument roles: each event type $E_i$ ($1 \le i \le N$) has a set of argument roles $R_i = \{r_{i,1}, \dots, r_{i,K_i}\}$. Given an input event mention $S$, the model needs to output a result vector $\mathbf{v}$, where the $i$-th element $v_i$ represents the probability that $S$ belongs to the event type $E_i$. After obtaining the final event type $E_i$ (or a set of event types), the model outputs a matrix $M$ whose element $M_{j,k}$ is the probability that the $j$-th extracted argument belongs to the argument role $r_{i,k}$. As a supervised multi-class classification task, event extraction mainly has two steps: feature selection and classification. Take the event recognition task as an example; its process is as follows:
Initialize the deep learning model and use it to find candidate triggers;
Update the candidate triggers through the deep learning model;
Automatically update the learned features through the deep learning model, and then combine them into higher-level features;
Classify and output the result through the deep learning model.
However, the event classification task differs from general text classification in the following aspects: the event text is short, and most of it is a complete sentence that includes the triggers and arguments; and the text is an event statement that contains a lot of information.
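To make the two classification outputs concrete (a probability per event type, then a probability per argument role for each candidate argument), the sketch below turns raw scores into decisions. It is an illustration under stated assumptions: the schema, the scores, the 0.5 threshold, and the function names are all made up here and stand in for the output of a trained encoder.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical schema: event types and their argument roles.
SCHEMA: Dict[str, List[str]] = {
    "Die": ["Place", "Victim", "Instrument"],
    "Attack": ["Place", "Target", "Instrument"],
}


def classify_events(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, float]:
    """Multi-label step: keep every event type whose probability reaches the threshold."""
    probs = {etype: sigmoid(s) for etype, s in scores.items()}
    return {etype: p for etype, p in probs.items() if p >= threshold}


def classify_roles(event_type: str, arg_scores: List[List[float]]) -> List[str]:
    """Role step: each row holds one candidate argument's scores over the roles of event_type."""
    roles = SCHEMA[event_type]
    result = []
    for row in arg_scores:
        probs = softmax(row)
        result.append(roles[probs.index(max(probs))])
    return result


# Made-up scores standing in for the output of a trained model.
events = classify_events({"Die": 2.0, "Attack": 1.2, "Marry": -3.0})
roles = classify_roles("Die", [[3.0, 0.1, 0.2], [0.0, 2.5, 0.3]])
print(sorted(events), roles)
```

The event step uses an independent sigmoid per type because one sentence may express several events, while the role step uses a softmax per argument because each argument receives exactly one role within a given event in this sketch.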
2.2.2. Sequence Tagging Task
The sequence labeling task is a word-level multi-class classification task, which can directly match event arguments on the basis of word-level event type extraction. Schema-based event extraction mainly includes two core tasks: identifying and classifying event types, and extracting event arguments. Event extraction based on sequence tagging can match event types and event arguments simply and quickly without additional features. The sequence tagging method marks out the target spans in the text, which makes it suitable for the event extraction task. For a given text and event schema, the sequence tagging model labels each argument with its corresponding argument role. The output of the sequence tagging model is a tag for every word in the text.
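A common way to realize word-level tagging is the BIO scheme, in which `B-Role` opens an argument span, `I-Role` continues it, and `O` marks words outside any argument. The survey text does not fix a tagging scheme, so BIO is an assumption of this sketch, as are the example sentence and its tags; the function only shows how per-word tags are turned back into role-labeled arguments.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (role, span text) pairs from BIO tags such as B-Victim / I-Victim / O."""
    spans: List[Tuple[str, str]] = []
    current_role: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        nonlocal current_role, current_tokens
        if current_role is not None:
            spans.append((current_role, " ".join(current_tokens)))
        current_role, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_role, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_role == tag[2:]:
            current_tokens.append(token)
        else:
            flush()  # "O", or an I- tag that does not continue the open span
    flush()
    return spans


# Illustrative sentence and tags for the Die event (assumed, not model output).
tokens = ["In", "Baghdad", ",", "a", "cameraman", "died", "when", "an",
          "American", "tank", "fired", "on", "the", "Palestine", "Hotel", "."]
tags = ["O", "B-Place", "O", "O", "B-Victim", "O", "O", "O",
        "B-Instrument", "I-Instrument", "O", "O", "O", "O", "O", "O"]

print(decode_bio(tokens, tags))
```

Because a single tag sequence assigns one label per word, a word that plays different roles in two events (as “cameraman” does in the Die and Attack events) needs one tag sequence per event in this formulation.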
2.2.3. Machine Reading Comprehension Task
The machine reading comprehension model can understand a piece of natural-language text and answer questions about it. First, a question schema is designed for each argument role $r$, called $Q_r$. Since different event types have different arguments, the model first needs to identify the event type to which the text belongs. Then, the argument roles to be extracted are determined according to the event type. Finally, the event extraction method based on machine reading comprehension inputs the text $S$ and applies the designed questions $Q_r$ to the extraction model one by one. For each question, the model extracts the answer $A_r$, which is the argument corresponding to the argument role $r$.
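The question-per-role procedure can be sketched as follows. The question templates, the function names, and the keyword-based `toy_answer_fn` are hypothetical; in a real system `answer_fn` would be a trained reading comprehension model, which this sketch deliberately leaves pluggable.

```python
from typing import Callable, Dict

# Hypothetical question templates, one per argument role of each event type.
QUESTION_TEMPLATES: Dict[str, Dict[str, str]] = {
    "Attack": {
        "Place": "Where did the {trigger} event take place?",
        "Target": "Who or what was the target of the {trigger} event?",
        "Instrument": "What instrument was used in the {trigger} event?",
    },
}


def build_questions(event_type: str, trigger: str) -> Dict[str, str]:
    """Instantiate one question per argument role of the detected event type."""
    return {role: template.format(trigger=trigger)
            for role, template in QUESTION_TEMPLATES[event_type].items()}


def extract_arguments(text: str, event_type: str, trigger: str,
                      answer_fn: Callable[[str, str], str]) -> Dict[str, str]:
    """Ask each question in turn; answer_fn(question, text) stands in for the MRC model."""
    answers: Dict[str, str] = {}
    for role, question in build_questions(event_type, trigger).items():
        answer = answer_fn(question, text)
        if answer:  # an empty answer means the role is not filled
            answers[role] = answer
    return answers


def toy_answer_fn(question: str, text: str) -> str:
    """Keyword lookup used only to make the sketch runnable; not a real reader."""
    if question.startswith("Where") and "Baghdad" in text:
        return "Baghdad"
    if "instrument" in question and "American tank" in text:
        return "American tank"
    return ""


sentence = "In Baghdad, a cameraman died when an American tank fired on the Palestine Hotel."
result = extract_arguments(sentence, "Attack", "fired", toy_answer_fn)
print(result)
```

Keeping the templates in a table keyed by event type mirrors the order described in the text: the event type must be known before the set of questions, and therefore the set of roles to extract, is determined.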
Table 1. Basic information on deep learning-based event extraction models, by year of publication.
| Year | Model | Domain | Venue | Datasets | Subtasks | | |
|---|---|---|---|---|---|---|---|
| 2021 | GATE (Ahmad et al., 2021) | cross-lingual | AAAI | ACE | ✓ | ✓ | ✓ |
| 2021 | DualQA (Zhou et al., 2021) | semi-supervised | AAAI | ACE, FewFC | - | ✓ | - |
| 2021 | GRIT (Du et al., 2021) | supervised | EACL | MUC-4 | ✓ | ✓ | - |
| 2021 | Wen et al. (Wen et al., 2021) | supervised | NAACL-HLT | ACE | ✓ | ✓ | - |
| 2020 | SciBERT (Wang et al., 2020) | MRC | LOUHI@EMNLP | BioNLP13, GENIA | ✓ | ✓ | - |
| 2020 | HPNet (Huang et al., 2020b) | supervised | COLING | ACE2005, TAC2015 | ✓ | ✓ | - |
| 2020 | M2E2 (Li et al., 2020b) | weakly supervised | ACL | M2E2 | ✓ | ✓ | - |
| 2020 | MQAEE (Li et al., 2020a) | MRC | EMNLP | ACE | ✓ | ✓ | - |
| 2020 | Du et al. (Du and Cardie, 2020) | MRC | EMNLP | ACE | ✓ | ✓ | - |
| 2020 | Min et al. (Min et al., 2020) | few-shot | LREC | ACE | - | ✓ | - |
| 2020 | Chen et al. (Chen et al., 2020b) | supervised | SPNLP@EMNLP | ACE | - | ✓ | - |
| 2019 | Doc2EDAG (Zheng et al., 2019) | Chinese | EMNLP | ChFinAnn | ✓ | ✓ | ✓ |
| 2019 | Subburathinam et al. (Subburathinam et al., 2019) | cross-lingual | EMNLP | ACE | ✓ | ✓ | ✓ |
| 2019 | Chau et al. (Chau et al., 2019) | open domain | arXiv | NYT | ✓ | ✓ | - |
| 2019 | Chen et al. (Chen et al., 2019) | MRC | arXiv | ACE | ✓ | ✓ | - |
| 2019 | GAIL-ELMo (Zhang et al., 2019b) | supervised | Data Intell. | ACE | ✓ | ✓ | ✓ |
| 2019 | ODEE-FER (Liu et al., 2019b) | open domain | ACL | GNBusiness | ✓ | ✓ | ✓ |
| 2019 | DYGIE++ (Wadden et al., 2019) | supervised | EMNLP | ACE, SciERC, etc. | ✓ | ✓ | ✓ |
| 2019 | HMEAE (Wang et al., 2019b) | supervised | EMNLP | ACE, TAC-KBP | - | ✓ | ✓ |
| 2019 | Han et al. (Han et al., 2019) | supervised | EMNLP | TB-Dense, MATRES | ✓ | ✓ | ✓ |
| 2019 | PLMEE (Yang et al., 2019) | data generation | ACL | ACE | ✓ | ✓ | ✓ |
| 2019 | AEM (Wang et al., 2019a) | open domain | EMNLP | FSD, Twitter, etc. | ✓ | ✓ | ✓ |
| 2019 | JointTransition (Zhang et al., 2019a) | supervised | IJCAI | ACE | ✓ | ✓ | ✓ |
| 2019 | Li et al. (Li et al., 2019a) | external knowledge | NAACL | Genia 2011 | ✓ | ✓ | ✓ |
| 2019 | MLM-Joint (Li et al., 2019b) | unsupervised | IEEE Access | DUC 2004, ACE | ✓ | ✓ | ✓ |
| 2019 | Joint3EE (Nguyen and Nguyen, 2019) | supervised | AAAI | ACE | ✓ | ✓ | ✓ |
| 2019 | Chan et al. (Chan et al., 2019) | supervised | ACL | ACE | ✓ | ✓ | ✓ |
| 2019 | Davani et al. (Davani et al., 2019) | open domain | EMNLP | FBI dataset | ✓ | ✓ | - |
| 2019 | Li et al. (Li et al., 2019c) | MRC | ACL | ACE, CoNLL04 | ✓ | ✓ | - |
| 2018 | DCFEE (Yang et al., 2018) | Chinese | ACL | NO.(ANN, POS, NEG) | ✓ | ✓ | - |
| 2018 | Zeng et al. (Zeng et al., 2018) | data generation | AAAI | FBWiki, ACE | ✓ | ✓ | - |
| 2018 | Huang et al. (Huang et al., 2018) | zero-shot | ACL | ACE | ✓ | ✓ | - |
| 2018 | DEEB-RNN (Zhao et al., 2018) | supervised | ACL | ACE | ✓ | - | - |
| 2018 | SELF (Hong et al., 2018) | supervised | ACL | ACE, TAC-KBP | ✓ | - | - |
| 2018 | DBRNN (Sha et al., 2018) | supervised | AAAI | ACE | ✓ | ✓ | ✓ |
| 2018 | GMLATT (Liu et al., 2018a) | cross-lingual | AAAI | ACE | ✓ | - | - |
| 2018 | JMEE (Liu et al., 2018b) | supervised | EMNLP | ACE | ✓ | ✓ | ✓ |
| 2018 | Ferguson et al. (Ferguson et al., 2018) | semi-supervised | NAACL | ACE, TAC-KBP | ✓ | ✓ | - |
| 2017 | DMCNN-MIL (Chen et al., 2017) | data generation | ACL | ACE | ✓ | ✓ | - |
| 2017 | Liu et al. (Liu et al., 2017b) | supervised | ACL | ACE | ✓ | - | - |
| 2016 | RBPB (Sha et al., 2016) | supervised | ACL | ACE | ✓ | ✓ | - |
| 2016 | Zeng et al. (Zeng et al., 2016) | Chinese | NLPCC | ACE | ✓ | ✓ | - |
| 2016 | JRNN (Nguyen et al., 2016) | supervised | NAACL | ACE | ✓ | ✓ | ✓ |
| 2016 | JOINTEVENTENTITY (Yang and Mitchell, 2016) | supervised | NAACL | ACE | ✓ | ✓ | ✓ |
| 2016 | BDLSTM-TNNs (Chen et al., 2016) | supervised | CCL | ACE | ✓ | ✓ | ✓ |
| 2016 | Huang et al. (Huang et al., 2016) | supervised | ACL | ERE, ACE | ✓ | ✓ | - |
| 2016 | Liu et al. (Liu et al., 2016) | supervised | ACL | ACE | ✓ | - | - |
| 2016 | Hsi et al. (Hsi et al., 2016) | multilingual | COLING | ACE | ✓ | ✓ | ✓ |
| 2015 | DMCNN (Chen et al., 2015) | supervised | ACL | ACE | ✓ | ✓ | - |
3. Event Extraction Paradigm
Event extraction includes four sub-tasks: trigger identification, event type classification, argument identification, and argument role classification. According to the procedure used to solve these four sub-tasks, event extraction is divided into pipeline-based event extraction and joint-based event extraction. The pipeline-based method was adopted first (Chen et al., 2015; Subburathinam et al., 2019). It first detects the triggers and judges the event type according to the triggers. The argument extraction model then extracts arguments and classifies argument roles according to the predicted event type and triggers. To overcome the error propagation caused by event detection, researchers proposed a joint-based event extraction paradigm (Zhang et al., 2019a; Li et al., 2019a). It reduces error propagation by combining the trigger identification and argument extraction tasks.
3.1. Pipeline-based Paradigm
The pipeline-based method treats all sub-tasks as independent classification problems (Huang et al., 2018; Funke et al., 2018; Gasmi et al., 2018). The pipeline approach is widely used because it simplifies the entire event extraction task. The pipeline-based event extraction method converts event extraction tasks into a multi-stage classification problem. The required classifiers include: 1) A trigger classifier is used to determine whether the term is the event trigger and the type of event. 2) An argument classifier is used to determine whether the word is the argument of the event. 3) An argument role classifier is used to determine the category of arguments.
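The staged structure of these classifiers can be sketched as three pluggable functions called in sequence. This is a schematic illustration only: the lexicon lookups below are toy stand-ins for trained classifiers, and all names (`run_pipeline`, `TRIGGERS`, `ROLES`) are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

TriggerClf = Callable[[str], Optional[str]]  # word -> event type, or None if not a trigger
ArgumentClf = Callable[[str, str], bool]     # (word, event type) -> is it an argument?
RoleClf = Callable[[str, str], str]          # (word, event type) -> argument role


def run_pipeline(tokens: List[str], trigger_clf: TriggerClf,
                 argument_clf: ArgumentClf, role_clf: RoleClf) -> List[Dict]:
    """Run the three classifiers in sequence; later stages only see earlier outputs."""
    events = []
    for i, token in enumerate(tokens):
        event_type = trigger_clf(token)  # stage 1: trigger and event type
        if event_type is None:
            continue
        arguments: List[Tuple[str, str]] = []
        for j, candidate in enumerate(tokens):
            if j != i and argument_clf(candidate, event_type):  # stage 2: argument?
                arguments.append((candidate, role_clf(candidate, event_type)))  # stage 3: role
        events.append({"trigger": token, "type": event_type, "arguments": arguments})
    return events


# Toy lexicons standing in for trained classifiers (hypothetical, for illustration only).
TRIGGERS = {"died": "Die", "fired": "Attack"}
ROLES = {
    "Die": {"Baghdad": "Place", "cameraman": "Victim", "tank": "Instrument"},
    "Attack": {"Baghdad": "Place", "cameraman": "Target", "tank": "Instrument"},
}


def is_argument(word: str, event_type: str) -> bool:
    return word in ROLES[event_type]


def role_of(word: str, event_type: str) -> str:
    return ROLES[event_type][word]


tokens = "In Baghdad a cameraman died when an American tank fired".split()
events = run_pipeline(tokens, TRIGGERS.get, is_argument, role_of)
for event in events:
    print(event)

# If stage 1 misses every trigger, no arguments are extracted at all.
missed = run_pipeline(tokens, lambda word: None, is_argument, role_of)
print(missed)
```

The second call shows the dependency between stages: the argument and role classifiers are never consulted for a word whose trigger was not detected.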
The classical deep learning-based event extraction model DMCNN (Chen et al., 2015) uses two dynamic multi-pooling convolutional neural networks for trigger classification and argument classification. The trigger classification model identifies the trigger. If there is a trigger, the argument classification model is used to identify arguments and their roles. Sha et al. (Sha et al., 2016) proposed a regularization-based pattern balancing method called RBPB. The model combines trigger embeddings, sentence-level embeddings, and pattern features as the features for trigger classification, in order to balance the effects of patterns and other valuable features. RBPB also uses a regularization approach to take advantage of the relationships among arguments. PLMEE (Yang et al., 2019) also uses two models, one for trigger extraction and one for argument extraction. The argument extractor uses the result of trigger extraction for inference. It performs well by introducing BERT (Devlin et al., 2019).
Pipeline-based event extraction methods provide additional information for subsequent sub-tasks through previous sub-tasks and take advantage of the dependencies between sub-tasks. Du et al. (Du and Cardie, 2020) adopt a question answering method to implement event extraction, as shown in Fig. 3. First, the model identifies the trigger in the input sentence through a question template designed for triggers. The input of the model includes the input sentence and the question. Then, it classifies the event type according to the identified trigger. The trigger can provide additional information for event classification, but a wrongly identified trigger can also harm event classification. Finally, the model identifies the event arguments and classifies argument roles according to the schema corresponding to the event type. In argument extraction, the model uses the answers from the previous rounds as history.
The most significant defect of this method is error propagation. Intuitively, if there is an error in trigger identification in the first step, then the accuracy of argument identification will be lower. Therefore, when using pipelines to extract events, there will be error cascading and task splitting problems. The pipeline event extraction method can extract event arguments by using the information of triggers. However, this requires high accuracy of trigger identification. A wrong trigger will seriously affect the accuracy rate of argument extraction. Therefore, the pipeline event extraction method considers the trigger as the core of an event.
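The cascading effect can be made concrete with a small calculation. As a rough illustration, assume that an argument is only counted as correct when the trigger it attaches to is also correct, so that a wrong trigger contributes no correct arguments. The numbers below are invented for the example and are not results from any cited system.

```latex
\begin{align*}
P(\text{argument correct})
  &= P(\text{trigger correct}) \cdot P(\text{argument correct} \mid \text{trigger correct}) \\
  &= 0.80 \times 0.70 = 0.56
\end{align*}
```

Under this assumption, an argument classifier that is right 70% of the time when given a correct trigger yields only 56% correct arguments overall once an 80%-accurate trigger stage is placed in front of it, and any further stage would multiply in the same way.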
Summary. The pipeline-based method transforms the event extraction task into a multi-stage classification problem. It first identifies the triggers, and argument identification is based on the result of trigger identification. It considers the trigger as the core of an event. Yet this staged strategy leads to error propagation: a trigger recognition error is passed on to the argument classification stage, which degrades overall performance. Moreover, because trigger detection always precedes argument detection, the arguments are not considered while detecting triggers. Each stage is therefore independent and lacks interaction with the others, ignoring the influence between them, so the overall dependency relationships cannot be handled. The classic case is DMCNN (Chen et al., 2015).
3.2. Joint-based Paradigm
Event extraction is of great practical value in NLP. Joint learning had already been studied in event extraction before deep learning was used to model the task. Li et al. (Li et al., 2013) study the joint learning of trigger extraction and argument extraction based on traditional feature extraction and obtain the optimal result through a structured perceptron model. Zhu et al. (Zhu et al., 2014) design efficient discrete features, including local features of all information contained in feature words and global features that can connect triggers with argument information.
The deep learning event extraction method based on the joint model mainly uses deep learning and joint learning to interact with feature learning, which avoids long learning times and complex feature engineering. Nguyen et al. (Nguyen et al., 2016) successfully construct local and global features through deep learning and joint learning. Their model uses a recurrent neural network to combine event recognition and argument role classification. The local features are text sequence features and local window features. The input text is represented by word vectors, entity vectors, and event arguments, and is then passed to the recurrent neural network to obtain deep sequence features. A deep learning model with memory is also proposed to capture the global features between event triggers, between event arguments, and between triggers and arguments, so as to improve the performance of both tasks simultaneously. Liu et al. (Liu et al., 2016) use the local features of arguments to assist role classification. They adopt a joint learning task for entities for the first time, aiming to reduce the complexity of the task. Previous methods take a dataset with annotated features as input and output the events. Chen et al. (Chen et al., 2017) simplify the process to plain text input and output, with joint learning on event arguments in between. This joint learning mainly provides the relationship and entity information of different events within each input event.
Data scarcity and monolingual ambiguity hinder the performance of monolingual methods. Deep learning models also sometimes make inefficient use of syntactic information when extracting events. Sha et al. (Sha et al., 2018) address this with DBRNN, shown in Fig. 4, which is based on a bidirectional RNN with LSTM units for event extraction. It is enhanced with dependency bridges that connect syntactically related words, so that it avoids relying heavily on lexical and syntactic features. The model starts from syntax and constructs the dependency bridges between syntactically related words. First, it predicts whether the current word is a trigger. Then, it analyzes the event type of the trigger while taking the dependency relations into account. In this way, the model assigns a weight to each type of dependency relation and fuses the information in weighted form.
The above joint learning methods can jointly model the extraction of triggers and arguments. However, in actual operation, the extraction of triggers and arguments is carried out successively rather than concurrently, which is an urgent problem to be discussed later. Besides, if an end-to-end mode is added to the deep learning model, the feature selection workload is significantly reduced, which will also be discussed later. The joint event extraction method avoids the influence of trigger identification errors on event argument extraction by treating triggers and arguments as equally important, but it cannot use the information of triggers.
Summary. To overcome the shortcomings of the pipeline method, researchers proposed the joint method. The joint method constructs a joint learning model for trigger recognition and argument recognition, in which triggers and arguments can mutually improve each other’s extraction. Experiments show that the joint learning method outperforms the pipeline learning method. The classic case is JRNN (Nguyen et al., 2016). The joint event extraction method avoids the influence of trigger identification errors on event argument extraction, but it cannot use the information of triggers. It considers the trigger and the arguments of an event to be equally important. However, neither pipeline-based nor joint-based event extraction can avoid the impact of event type prediction errors on the performance of argument extraction. Moreover, these methods cannot share information among different event types and learn each type independently, which is a disadvantage for event extraction with only a small amount of labeled data.
4. Event Extraction Model
4.1. Traditional Event Extraction
The earliest event extraction techniques were pattern-matching methods, mainly based on syntax trees or regular expressions. However, the performance of pattern-matching methods depends strongly on the form of expression of the text, the domain, and so on, and their portability is low. Later, event extraction based on statistical methods achieved good results and became a research hotspot. We compare pattern matching and machine learning in Table 2. Event extraction based on statistical methods mainly includes two types: methods based on machine learning and methods based on deep learning. The machine learning-based methods effectively capture the lexical and semantic information of triggers and arguments, and the relationship between triggers. In addition, (Li et al., 2013; Ji and Grishman, 2008; Liao and Grishman, 2010) focus on cross-sentence, cross-event, and cross-entity consistency within documents, with the goal of improving event extraction. Using machine learning to identify events borrows the idea of text classification and turns event detection and argument extraction into classification problems, the core of which lies in the construction of classifiers and the selection of features (Wu et al., 2013b, 2020, a, 2012). Most research conducts event detection based on triggers, which account for only a small part of all arguments. As a result, many negative examples are introduced in training, leading to an imbalance between positive and negative examples. Moreover, judging every word incurs additional computation.
| | Pattern Matching | Machine Learning |
|---|---|---|
| Advantage | | Domain agnostic, portable. |
| Disadvantage | Poor portability and flexibility. | |
In conclusion, although machine learning-based methods do not depend on the content and format of the corpus, they need a large-scale standard corpus; otherwise, a severe data sparsity problem occurs. However, the scale of current corpora can hardly meet application requirements, and manual labeling of corpora is time-consuming and labor-intensive. To alleviate the difficulty of obtaining labeled corpora, many researchers explore semi-supervised and unsupervised learning. Besides, feature selection is also an essential factor in determining the outcome of machine learning. Therefore, avoiding data sparsity and selecting appropriate features have become important topics for machine learning-based methods.
4.2. Deep Learning Based Event Extraction
Traditional event extraction methods struggle to learn deep features, which makes it difficult to improve event extraction when it depends on complex semantic relations. Most recent event extraction works are based on deep learning architectures such as Convolutional Neural Networks (CNN) (Chen et al., 2015; Zhang et al., 2016), Recurrent Neural Networks (RNN) (Nguyen and Grishman, 2016; Sha et al., 2018), Graph Neural Networks (GNN) (Liu et al., 2018b; Huang et al., 2020a), Transformers (Yang et al., 2019; Liu et al., 2020b), or other networks (Zhang et al., 2019a; Huang et al., 2020b). Deep learning methods can capture complex semantic relations and significantly improve results on multiple event extraction datasets. We introduce several typical event extraction models.
Event extraction is a particularly challenging problem in information extraction. Traditional event extraction methods mainly rely on well-designed features and complex NLP tools, which consume considerable human effort and cause problems such as data sparseness and error propagation. To automatically extract lexical and sentence-level features without using complex natural language processing tools, Chen et al. (Chen et al., 2015) introduce a model called DMCNN. It uses word representations to capture the meaningful semantic regularities of words and adopts a CNN-based framework to capture sentence-level clues. However, a conventional CNN with max pooling captures only the most important information in a sentence, so DMCNN uses a dynamic multi-pooling layer that retains more of the critical information according to the positions of event triggers and arguments. Event extraction is realized as a two-stage multi-class classification by the dynamic multi-pooling convolutional neural network with automatically learned features. The first stage is trigger classification: DMCNN classifies each word in the sentence to identify triggers. For a sentence that has a trigger, the second stage applies a similar DMCNN to assign arguments to the trigger and determine the roles of the arguments. Fig. 5 depicts the architecture of argument classification. Lexical-level feature representation and sentence-level feature extraction are used to capture lexical clues and learn the compositional semantic features of sentences.
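The dynamic multi-pooling idea can be illustrated with a minimal NumPy sketch. It assumes that the feature maps have already been produced by a convolution layer, and the exact segment boundaries as well as the zero vector used for an empty segment are illustrative assumptions rather than details taken from (Chen et al., 2015). The function name `dynamic_multi_pooling` is hypothetical.

```python
import numpy as np

def dynamic_multi_pooling(feature_maps, trigger_idx, argument_idx):
    """Max-pool every feature map separately over three sentence segments.

    feature_maps: array of shape (num_filters, sentence_length).
    The sentence is split at the trigger and candidate-argument positions,
    so the result has shape (num_filters * 3,).
    """
    left, right = sorted((trigger_idx, argument_idx))
    # Assumed segmentation: [0, left], (left, right], (right, end).
    segments = [
        feature_maps[:, : left + 1],
        feature_maps[:, left + 1 : right + 1],
        feature_maps[:, right + 1 :],
    ]
    pooled = []
    for segment in segments:
        if segment.shape[1] == 0:
            # Assumption: an empty segment contributes a zero vector.
            pooled.append(np.zeros(feature_maps.shape[0]))
        else:
            pooled.append(segment.max(axis=1))
    return np.concatenate(pooled)

# Toy feature maps: 2 filters over a 6-token sentence.
maps = np.array([[1.0, 5.0, 2.0, 0.0, 3.0, 4.0],
                 [0.0, 1.0, 7.0, 2.0, 1.0, 9.0]])
multi_pooled = dynamic_multi_pooling(maps, trigger_idx=1, argument_idx=3)
plain_pooled = maps.max(axis=1)  # ordinary max pooling keeps one value per filter
```

Ordinary max pooling keeps a single value per filter for the whole sentence, whereas the multi-pooled vector keeps one value per filter and segment, so information on both sides of the trigger and the candidate argument is preserved.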
CNNs induce the underlying structures of the k-grams in sentences, so several researchers study event extraction techniques based on convolutional neural networks. Nguyen et al. (Nguyen and Grishman, 2015) use a CNN for the event detection task, which overcomes the complex feature engineering and error propagation of traditional feature-based approaches, which rely extensively on other supervised modules and manual resources to obtain features. The CNN is significantly superior to the feature-based method in terms of cross-domain generalization performance. Furthermore, to consider non-consecutive k-grams, Nguyen et al. (Nguyen and Grishman, 2016) introduce a non-consecutive CNN. CNN models are applied in both the pipeline-based and the joint-based paradigm, using structured prediction with rich local and global features to automatically learn hidden feature representations. Compared with the pipeline-based approach, the joint-based paradigm can mitigate error propagation and exploit the interdependencies between event triggers and argument roles.
In Chinese event extraction, previous methods rely heavily on complex feature engineering and complex natural language processing tools. Zeng et al. (Zeng et al., 2016) propose a convolutional bidirectional LSTM neural network, which combines an LSTM and a CNN to capture sentence-level and lexical information without any hand-crafted features. A bidirectional LSTM encodes the semantics of the words in the whole sentence into sentence-level features without any parsing. A convolutional neural network captures salient local lexical features to disambiguate the trigger, without any help from POS tags or NER.
In addition to CNN-based event extraction methods, other research builds on RNNs, which model sequence information to extract the arguments of an event. JRNN (Nguyen et al., 2016) uses a bidirectional RNN for event extraction in the joint-based paradigm, as shown in Fig. 6. It has an encoding stage and a prediction stage. In the encoding stage, it uses the RNN to summarize the context information. In the prediction stage, it predicts both triggers and argument roles.
Previous approaches relied heavily on language-specific knowledge and existing NLP tools. As a more promising approach that automatically learns useful features from data, Feng et al. (Feng et al., 2016) develop a hybrid neural network that captures both sequence and chunk information from specific contexts and uses them to train a multilingual event detector. The model uses a bidirectional LSTM to obtain sequence information from the document to be recognized. It then uses a convolutional neural network to obtain phrase chunk information, combines the two kinds of information, and finally identifies the trigger. The method is robust, efficient, and accurate across multiple languages (English, Chinese, and Spanish). The hybrid model is superior to the traditional feature-based approach in terms of cross-language generalization performance.
Combining tree structure and sequence structure in deep learning performs better than using sequence structure alone. To avoid over-reliance on lexical and syntactic features, the dependency bridge recurrent neural network (DBRNN) (Sha et al., 2018) builds on bidirectional RNNs for event extraction, as shown in Fig. 4. DBRNN enhances the RNN with dependency bridges that connect syntactically related words, and thus leverages dependency graph information to extract event triggers and argument roles.
Joint3EE (Nguyen and Nguyen, 2019) is a multi-task model that performs entity recognition, trigger detection, and argument role classification with shared Bi-GRU hidden representations. However, data scarcity and monolingual ambiguity hinder the performance of such monolingual methods. Liu et al. (Liu et al., 2018a) propose a multilingual approach called the gated multilingual attention (GMLATT) framework to address both problems simultaneously. It exploits consistent information in multilingual data through contextual attention mechanisms, models the credibility of the cues provided by other languages, and controls how information from the various languages is integrated.
Both the automatic extraction of event features by deep learning models and the enhancement of event features with external resources focus mainly on the information of event triggers, and less on the information of event arguments and inter-word dependencies. Sentence-level sequential models are inefficient at capturing very long-range dependencies. Furthermore, RNN-based and CNN-based models do not fully model the associations between events. Modeling structural information with the attention mechanism has therefore gradually attracted the attention of researchers, and models that add attention mechanisms have gradually appeared. By its nature, the attention mechanism can use global information to model local context regardless of position. It is effective when updating the semantic representation of words.
By controlling the weight assigned to each part of the sentence, the attention mechanism makes the model attend to the important features of the sentence while ignoring unimportant ones, and allocates resources sensibly to obtain more accurate results. At the same time, the attention mechanism can itself be read as a kind of alignment that explains the correspondence between input and output in an end-to-end model, which makes the model more interpretable.
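The weighting described above can be sketched in a few lines of NumPy. This is a generic scaled dot-product attention over token representations, not the attention layer of any particular model surveyed here; the function name `attention_pool` and the toy vectors are assumptions made for illustration.

```python
import numpy as np

def attention_pool(hidden_states, query):
    """Weight token representations by their relevance to a query vector.

    hidden_states: array of shape (sentence_length, dim).
    query: array of shape (dim,).
    Returns the attention weights and the weighted context vector.
    """
    scores = hidden_states @ query / np.sqrt(hidden_states.shape[1])
    scores = scores - scores.max()              # numerical stability for softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ hidden_states           # weighted sum of token vectors
    return weights, context

# Toy example: three tokens with 2-dimensional representations.
tokens = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [3.0, 0.0]])
weights, context = attention_pool(tokens, query=np.array([1.0, 0.0]))
```

The weights sum to one, and the token whose representation is most aligned with the query (the third token here) receives the largest weight. Inspecting `weights` is what gives the alignment-style interpretability mentioned above.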
JMEE (Liu et al., 2018b) is composed of four modules: word representation, a syntactic graph convolution network, self-attention trigger classification, and argument classification. The information flow is enhanced by introducing syntactic shortcut arcs. An attention-based graph convolution network jointly models the graph information to extract multiple event triggers and arguments. Furthermore, the model optimizes a biased loss function when jointly extracting event triggers and arguments to handle the imbalance of the dataset.
Syntactic representations provide an effective mechanism to directly link words to their informative context for event detection in sentences. Nguyen et al. (Nguyen and Grishman, 2018), who investigate a convolutional neural network over dependency trees for event detection, are the first to integrate syntax into neural event detection. Their model combines GCNs (Yao et al., 2019) with a novel entity mention-based pooling method. It pools over the graph-based convolution vectors of the current word and the entity mentions in the sentence, aggregating them into a single vector representation for event type prediction. In this way, the model explicitly uses information from entity mentions to improve event detection performance.
Liu et al. (Liu et al., 2018b) study multi-event extraction, which is more difficult than extracting a single event. In previous works, it is difficult to model the relationship between events through sequential modeling because capturing long-range dependencies is inefficient. Their model extracts multiple triggers and arguments from a sentence. The JMEE model introduces syntactic shortcut arcs to enhance the information flow and uses an attention-based GCN to model the graph.
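To make the role of syntactic arcs concrete, the following is a simplified graph convolution layer over a dependency graph, written in NumPy. It is a sketch under stated assumptions and not the JMEE layer itself: it shares a single weight matrix across arc directions, omits gates and attention, and uses plain degree normalization. The function name `gcn_layer` and the toy arcs are hypothetical.

```python
import numpy as np

def gcn_layer(node_features, dependency_arcs, weight):
    """One simplified graph-convolution step over a dependency graph.

    node_features: array of shape (num_tokens, dim).
    dependency_arcs: list of (head_index, dependent_index) pairs.
    weight: array of shape (dim, out_dim), shared by all arc types (assumption).
    """
    n = node_features.shape[0]
    adjacency = np.eye(n)                      # self-loops keep each token's own state
    for head, dependent in dependency_arcs:
        adjacency[head, dependent] = 1.0       # arc along the dependency
        adjacency[dependent, head] = 1.0       # reverse arc
    degree = adjacency.sum(axis=1, keepdims=True)
    normalized = adjacency / degree            # average over neighbours
    return np.maximum(normalized @ node_features @ weight, 0.0)  # ReLU

# Toy 4-token sentence whose token 1 is the syntactic head of tokens 0 and 3,
# and token 3 is the head of token 2.
arcs = [(1, 0), (1, 3), (3, 2)]
one_hot = np.eye(4)
h1 = gcn_layer(one_hot, arcs, np.eye(4))
h2 = gcn_layer(h1, arcs, np.eye(4))
```

With one-hot inputs and an identity weight, the rows of `h1` and `h2` show which tokens have influenced each position. Tokens 0 and 3 are three positions apart in the sequence but only two hops apart in the dependency graph, so information from token 3 reaches token 0 after two layers.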
Wen et al. (Wen et al., 2021) adopt the 4-tuple time representation proposed in the TAC-KBP temporal slot filling task: the model predicts the earliest and latest start and end times of an event, thus representing the ambiguous time span of the event. For each input document, the model constructs a document-level event graph based on shared arguments and temporal relations and uses a graph-based attention network to propagate time information over the graph, as shown in Fig. 8, where entities are underlined and events are in bold face. The event arguments in the document are extracted first. The events are then arranged in temporal order according to keywords such as Before and After and the temporal logic of their occurrence. Entity arguments are shared among different events. In this way, the model places events on a more accurate timeline.
Ahmad et al. (Ahmad et al., 2021) propose the GATE framework, which uses GCNs to learn language-agnostic sentence representations. The model embeds the dependency structure into the contextual representation. It introduces a self-attention mechanism to learn the dependencies between words with different syntactic distances. The method can capture long-distance dependencies, and it computes a syntactic distance matrix between words that is applied through a masking algorithm. It performs well in cross-lingual sentence-level relation and event extraction.
However, it is challenging to handle an argument that plays different roles in various events. Yang et al. (Yang et al., 2019) propose an event extraction model that overcomes this role overlap problem by separating argument prediction according to argument roles. Moreover, to address insufficient training data, they propose a method that automatically generates labeled data by editing prototypes and filters the generated samples by ranking their quality. They present a framework, the Pre-trained Language Model-based Event Extractor (PLMEE) (Yang et al., 2019), as shown in Fig. 9. PLMEE promotes event extraction by combining an extraction model with a generation method, both based on pre-trained language models. Extraction is a two-stage task, comprising trigger extraction and argument extraction, and the model consists of a trigger extractor and an argument extractor, both of which rely on the feature representation of BERT. It then exploits the importance of roles to re-weight the loss function.
Most previous supervised event extraction methods relied on features derived from manual annotations, so they could not handle new event types without additional annotation work. Huang et al. (Huang et al., 2018) design a transferable architecture of structural and compositional neural networks for zero-shot event extraction. Each event mention has a structure made up of a candidate trigger and arguments, and this structure has predefined labels that correspond to the event type and argument roles. They build semantic representations for both event types and event mentions, and determine the type of a mention by its semantic similarity to the event types defined in the target ontology.
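The similarity-based decision can be sketched as a nearest-neighbour lookup in a shared embedding space. The sketch below assumes that mention and event type representations have already been computed by some encoder; the three-dimensional vectors, the type names, and the function names are hypothetical and only illustrate the selection step, not the architecture of (Huang et al., 2018).

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def zero_shot_event_type(mention_vector, type_vectors):
    """Pick the ontology event type whose representation is closest to the mention.

    type_vectors maps event type names to vectors in the same space as the
    mention, so types without any annotated training mentions can be scored.
    """
    return max(type_vectors, key=lambda name: cosine(mention_vector, type_vectors[name]))

# Hypothetical representations for two event types in a target ontology.
ontology = {
    "Attack": np.array([1.0, 0.0, 0.0]),
    "Transport": np.array([0.0, 1.0, 0.0]),
}
mention = np.array([0.9, 0.1, 0.0])
predicted_type = zero_shot_event_type(mention, ontology)
```

Because the decision depends only on similarity in the shared space, adding a new event type amounts to adding one more entry to `ontology`, with no new annotated mentions.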
Another model utilizes a generative adversarial network to help focus on harder-to-detect events. Its authors propose an entity and event extraction framework based on generative adversarial imitation learning, which is an inverse reinforcement learning (IRL) method using generative adversarial networks (GAN). The model directly evaluates the correct and incorrect labeling of instances in entity and event extraction through a dynamic mechanism based on IRL. It assigns specific reward values to all samples following reinforcement learning and uses the discriminators of the GAN to estimate these reward values.
DYGIE++ (Wadden et al., 2019) is a BERT-based framework that models text spans and captures within-sentence and cross-sentence context. Many information extraction tasks, such as named entity recognition, relation extraction, event extraction, and co-reference resolution, can benefit from global context across sentences or from phrases that are not locally dependent. DYGIE++ treats event extraction as an additional task and updates span representations over a graph that relates event triggers and their arguments. The span representations are built on multi-sentence BERT encodings.
Question answering methods are emerging as a new way for extracting important keywords from a sentence (Liu et al., 2020b; Du and Cardie, 2020; Li et al., 2021a). By incorporating domain knowledge into the question set, one can guide the extraction framework to focus on essential semantics extracted from a sentence. Existing approaches do not utilize the relations among multiple event arguments, leaving much room for improvement. The model aims to close this gap by using the event argument relations to infer roles of event arguments that are hard to settle in isolation, leading to better event argument and event classification performance.
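A minimal sketch of how domain knowledge might enter the question set is shown below. The role names and question templates are hypothetical examples written for illustration; they are not the templates used in (Liu et al., 2020b; Du and Cardie, 2020; Li et al., 2021a), and the step that feeds each question to a reading-comprehension model is omitted.

```python
def build_role_questions(trigger, role_templates):
    """Instantiate one natural-language question per argument role.

    trigger: the trigger word detected in the sentence.
    role_templates: mapping from role name to a template with a {trigger} slot.
    """
    return {role: template.format(trigger=trigger)
            for role, template in role_templates.items()}

# Hypothetical templates for an attack-style event type.
ATTACK_TEMPLATES = {
    "Attacker": "Who is the attacker in the event triggered by '{trigger}'?",
    "Target": "Who or what is the target in the event triggered by '{trigger}'?",
    "Place": "Where does the event triggered by '{trigger}' take place?",
}

questions = build_role_questions("fired", ATTACK_TEMPLATES)
```

Each question would then be paired with the sentence and passed to an extractive question answering model, whose answer span is taken as the argument for that role.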
Summary. Event extraction is an important research direction in information extraction, with application value in information collection, information retrieval, public opinion analysis, and other areas. Most traditional event extraction methods construct feature representations manually and use a classification model to classify triggers and identify the roles of arguments. In recent years, deep learning has shown outstanding results in image processing, speech recognition, natural language processing, and other fields. To address the drawbacks of traditional methods, deep learning-based event extraction has been systematically studied. Before the emergence of the BERT model, the mainstream method was to find the trigger in the text and judge the event type of the text according to the trigger. Recently, with the introduction of BERT-based event extraction models, identifying event types based on the full text has become mainstream. This is because BERT has outstanding contextual representation ability and performs well in text classification tasks, especially when there is only a small amount of data.
The availability of labeled datasets for event extraction has become the main driving force behind the fast advancement of this research field. In this section, we summarize the characteristics of these datasets in terms of domains and give an overview in Table 3, including the number of categories, average sentence length, the size of each dataset and related papers.
|Datasets||Doc||Sen||Event Type||Language||Related Papers|
|Google||11,909||-||30||English||(Petrovic et al., 2013)|
|Twitter||1,000||-||20||English||(Petrovic et al., 2013)|
|NO.ANN, NO.POS, NO.NEG (DCFEE)||2,976||-||4||Chinese||(Yang et al., 2018)|
|ChFinAnn (Doc2EDAG)||32,040||-||5||Chinese||(Zheng et al., 2019)|
|ACE 2005||599||18,117||33||Multi-language||(Ahmad et al., 2021; Zhou et al., 2021; Li et al., 2020a; Min et al., 2020; Chan et al., 2019)|
|TAC KBP 2015||360||12,976||38||English||(Ferguson et al., 2018; Huang et al., 2020b)|
|TAC KBP 2016||500||9,042||18||Multi-language||(Wang et al., 2019b)|
|Rich ERE||50||English||(Huang et al., 2016)|
|FSED||-||70,852||100||English||(Deng et al., 2020)|
|GNBusiness||12,985||1,450,336||-||English||(Liu et al., 2019b)|
|FSD||-||2,453||20||English||(Petrovic et al., 2013)|
|FBI dataset||-||-||3||English||(Davani et al., 2019)|
Google. The Google dataset (http://data.gdeltproject.org/events/index.html) is a subset of the GDELT Event Database, in which documents are retrieved by event-related words. For example, documents that contain ‘malaysia’, ‘airline’, ‘search’ and ‘plane’ are retrieved for the event MH370. Combining the documents related to 30 events, the dataset contains 11,909 news articles.
Twitter. Twitter dataset is collected from tweets published in the month of December in 2010 using Twitter streaming API. It contains 1,000 tweets annotated with 20 events.
ChFinAnn (Doc2EDAG). In (Zheng et al., 2019), distant supervision (DS)-based event labeling is conducted on ten years of ChFinAnn documents (http://www.cninfo.com.cn/new/index) and human-summarized event knowledge bases. The new Chinese event dataset includes 32,040 documents and 5 event types: Equity Freeze, Equity Repurchase, Equity Underweight, Equity Overweight and Equity Pledge, which belong to the major events that the regulator requires to be disclosed and that could have a significant impact on the value of a company.
NO.ANN, NO.POS, NO.NEG (DCFEE). In (Yang et al., 2018), researchers carry out experiments on four types of financial events: Equity Freeze, Equity Pledge, Equity Repurchase and Equity Overweight. A total of 2,976 announcements have been labeled through automatic data generation. NO.ANN is the number of announcements that can be labeled automatically for each event type. NO.POS is the total number of positive mentions, and NO.NEG is the number of negative mentions.
Automatic Content Extraction (ACE) (Doddington et al., 2004). ACE 2005 (https://catalog.ldc.upenn.edu/LDC2006T06) is the most widely used dataset in event extraction. It contains the complete set of training data in English, Arabic, and Chinese for the ACE 2005 technology evaluation. The corpus consists of various types of data annotated for entities, relations, and events by the Linguistic Data Consortium (LDC). It contains 599 documents, which are annotated with 8 event types, 33 event subtypes, and 35 argument roles. At present, most event extraction work uses these 33 subtypes.
Text Analysis Conference Knowledge Base Population (TAC KBP). TAC KBP aims to develop and evaluate technologies for populating knowledge bases from unstructured text. As a standalone component of KBP, the TAC KBP event track (from 2015 to 2017) aims to extract information about events so that it is suitable as input to a knowledge base. The track includes an event nugget task for detecting and linking events, and an event argument (EA) task for extracting event arguments and linking arguments that belong to the same event.
TAC KBP 2015 (https://tac.nist.gov/2015/KBP/data.html) defines 9 event types and 38 event subtypes in English. TAC KBP 2016 (https://tac.nist.gov/2016/KBP/data.html) and TAC KBP 2017 (https://tac.nist.gov/2017/KBP/data.html) have corpora in three languages, English, Chinese, and Spanish, with 8 event types and 18 event subtypes.
Rich ERE. Rich ERE extends the entity, relation, and event ontologies and extends the concept of what is taggable. Rich ERE also introduces the concept of the event hopper to address the pervasive challenge of event co-reference, particularly with regard to event mentions within and across documents and granularity variation in event arguments, paving the way for the creation of (hierarchical or nested) cross-document representations of events.
FSED. Based on ACE 2005 and TAC KBP 2017, FSED is a newly generated dataset tailored to few-shot event detection. In detail, it contains 70,852 mentions for 19 event types graded into 100 event subtypes, and each event subtype is annotated with about 700 mentions on average.
GNBusiness. GNBusiness (Liu et al., 2019b) collects news reports from Google Business News that describe each event from different sources. For each news report, researchers obtain the title, publish timestamp, download timestamp, source URL and full text. In total, it contains 55,618 business news reports with 13,047 news clusters in 288 batches from Oct. 17, 2018, to Jan. 22, 2019. The full-text corpus is released as GNBusinessFull-Text (https://github.com/lx865712528/ACL2019-ODEE).
FSD. The FSD dataset (Petrovic et al., 2013) is a first story detection dataset containing 2,499 tweets. Researchers filter out events mentioned in fewer than 15 tweets, since events mentioned in very few tweets are less likely to be significant. The final dataset contains 2,453 tweets annotated with 20 events.
FBI dataset. The FBI dataset (Davani et al., 2019) is built by scraping about 370k unlabeled news articles in the “Fire and Crime” category of Patch (a website that hosts hyper-local news articles from 1,217 cities in the US). The annotations consist of a binary label, indicating whether the article reports a specific hate crime, as well as the attributes of hate crime articles, namely the target of the action and the type of action.
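|Notation||Description|
|$t$||The reference trigger|
|$\hat{t}$||The detected trigger|
|$N_T$||The actual number of triggers|
|$\hat{N}_T$||The number of detected triggers|
|$y$||The true event type|
|$\hat{y}$||The detected event type|
|$a$||The reference argument|
|$\hat{a}$||The detected argument|
|$N_A$||The actual number of arguments|
|$\hat{N}_A$||The number of detected arguments|
|$\hat{\rho}$||The detected argument role|
|$\rho$||The true argument role|
Previous research evaluates four sub-tasks, and for each sub-task three metrics, precision ($P$), recall ($R$), and F1, are used to measure the performance.
Here, we denote by $I$ the indicator function: $I(\text{true}) = 1$ and $I(\text{false}) = 0$.
1. Trigger Identification: a trigger is correctly identified if its span offsets exactly match a reference trigger. The corresponding metrics include:
$$C_{TI} = \sum_{i=1}^{\hat{N}_T} \sum_{j=1}^{N_T} I\left(l_{\hat{t}_i} = l_{t_j} \wedge r_{\hat{t}_i} = r_{t_j}\right), \quad P_{TI} = \frac{C_{TI}}{\hat{N}_T}, \quad R_{TI} = \frac{C_{TI}}{N_T}, \quad F1_{TI} = \frac{2 P_{TI} R_{TI}}{P_{TI} + R_{TI}}$$
where $\hat{t}_i$ is the detected trigger, $l_{\hat{t}_i}$ and $r_{\hat{t}_i}$ are the left and right boundaries of $\hat{t}_i$, $t_j$ is the reference trigger, $l_{t_j}$ and $r_{t_j}$ are the left and right boundaries of $t_j$, and $\hat{N}_T$ and $N_T$ denote the number of detected triggers and the actual number of triggers.
2. Trigger Classification: a trigger is correctly classified if its span offsets and event subtype exactly match a reference trigger. The corresponding metrics include:
$$C_{TC} = \sum_{i=1}^{\hat{N}_T} \sum_{j=1}^{N_T} I\left(l_{\hat{t}_i} = l_{t_j} \wedge r_{\hat{t}_i} = r_{t_j} \wedge \hat{y}_i = y_j\right), \quad P_{TC} = \frac{C_{TC}}{\hat{N}_T}, \quad R_{TC} = \frac{C_{TC}}{N_T}, \quad F1_{TC} = \frac{2 P_{TC} R_{TC}}{P_{TC} + R_{TC}}$$
where $\hat{y}_i$ and $y_j$ denote the detected event type and the true event type.
3. Argument Identification: an argument is correctly identified if its span offsets and corresponding event subtype exactly match a reference argument. The corresponding metrics include:
$$C_{AI} = \sum_{i=1}^{\hat{N}_A} \sum_{j=1}^{N_A} I\left(l_{\hat{a}_i} = l_{a_j} \wedge r_{\hat{a}_i} = r_{a_j} \wedge \hat{y}_i = y_j\right), \quad P_{AI} = \frac{C_{AI}}{\hat{N}_A}, \quad R_{AI} = \frac{C_{AI}}{N_A}, \quad F1_{AI} = \frac{2 P_{AI} R_{AI}}{P_{AI} + R_{AI}}$$
where $\hat{a}_i$ is the detected argument, $l_{\hat{a}_i}$ and $r_{\hat{a}_i}$ are the left and right boundaries of $\hat{a}_i$, $a_j$ is the reference argument, $l_{a_j}$ and $r_{a_j}$ are the left and right boundaries of $a_j$, and $\hat{N}_A$ and $N_A$ denote the number of detected arguments and the actual number of arguments. Here $\hat{y}_i$ and $y_j$ are the event types of the events to which the arguments belong.
4. Argument Classification: an argument is correctly classified if its span offsets, corresponding event subtype, and argument role exactly match a reference argument. Its corresponding metrics include:
$$C_{AC} = \sum_{i=1}^{\hat{N}_A} \sum_{j=1}^{N_A} I\left(l_{\hat{a}_i} = l_{a_j} \wedge r_{\hat{a}_i} = r_{a_j} \wedge \hat{y}_i = y_j \wedge \hat{\rho}_i = \rho_j\right), \quad P_{AC} = \frac{C_{AC}}{\hat{N}_A}, \quad R_{AC} = \frac{C_{AC}}{N_A}, \quad F1_{AC} = \frac{2 P_{AC} R_{AC}}{P_{AC} + R_{AC}}$$
where $\hat{\rho}_i$ and $\rho_j$ denote the detected argument role and the true argument role.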
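The four exact-match criteria differ only in how many fields of a prediction must agree with the reference, so they can be sketched with one scoring function. The tuple layouts, the function names, and the toy annotations below are assumptions made for illustration; the sketch also assumes that offsets are unique within the evaluated text and that no item is predicted twice.

```python
def precision_recall_f1(predicted, reference):
    """Exact-match precision, recall, and F1 between two collections of tuples."""
    predicted_set, reference_set = set(predicted), set(reference)
    correct = len(predicted_set & reference_set)
    precision = correct / len(predicted_set) if predicted_set else 0.0
    recall = correct / len(reference_set) if reference_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def keep_fields(items, width):
    """Keep only the first `width` fields of every tuple."""
    return [item[:width] for item in items]

# Assumed layout: triggers are (left, right, event_type) and
# arguments are (left, right, event_type, role).
reference_triggers = [(3, 3, "Attack"), (10, 11, "Transport")]
detected_triggers = [(3, 3, "Attack"), (10, 11, "Meet"), (15, 15, "Die")]

# Trigger identification compares span offsets only.
trigger_identification = precision_recall_f1(
    keep_fields(detected_triggers, 2), keep_fields(reference_triggers, 2))
# Trigger classification additionally requires the event type to match.
trigger_classification = precision_recall_f1(detected_triggers, reference_triggers)
```

Argument identification and argument classification follow the same pattern with `keep_fields(arguments, 3)` and the full four-field tuples, respectively. In the toy example, all reference spans are found but one type is wrong and one trigger is spurious, so identification scores higher than classification.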
7. Quantitative Results
This section mainly summarizes existing event extraction work and compares performance on the ACE 2005 dataset, as shown in Table 1. The evaluation metrics include precision, recall, and F1.
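|Year-Method||Neural||External||Paradigm||Trigger Classification P||Trigger Classification R||Trigger Classification F1||Role Classification P||Role Classification R||Role Classification F1|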
|2008 - Ji et al. (Ji and Grishman, 2008)||-||-||-||60.2||76.4||67.3||51.3||36.4||42.6|
|2010 - Liao et al. (Liao and Grishman, 2010)||-||-||-||68.7||68.9||68.8||45.1||44.1||44.6|
|2011 - Hong et al. (Hong et al., 2011)||-||-||-||72.9||64.3||68.3||51.6||45.5||48.4|
|2013 - Li et al. (Li et al., 2013)||-||-||-||73.7||62.3||67.5||64.7||44.4||52.7|
|2015 - Nguyen et al. (Nguyen and Grishman, 2015)||✓||-||-||71.8||66.4||69.0||-||-||-|
|2015 - DMCNN (Chen et al., 2015)||✓||-||Pipeline||75.6||63.6||69.1||62.2||46.9||53.5|
|2016 - JRNN (Nguyen et al., 2016)||✓||-||Joint||66.0||73.0||69.3||54.2||56.7||55.4|
|2016 - JOINTEVENTENTIT (Yang and Mitchell, 2016)||-||-||Joint||75.1||63.3||68.7||70.6||36.9||48.4|
|2016 - Liu et al. (Liu et al., 2016)||✓||✓||Joint||77.6||65.2||70.7||-||-||-|
|2016 - NC-CNN (Nguyen and Grishman, 2016)||✓||-||-||-||-||71.3||-||-||-|
|2016 - HNN (Feng et al., 2016)||✓||-||-||84.6||64.9||73.4||-||-||-|
|2016 - Huang et al.(Huang et al., 2016)||✓||✓||Joint||80.7||50.1||61.8||51.9||39.4||44.8|
|2016 - BDLSTM-TNNs (Chen et al., 2016)||✓||-||Joint||75.3||63.4||68.9||62.9||47.5||54.1|
|2016 - Zeng et al. (Zeng et al., 2016)||✓||-||Joint||69.8||59.9||64.5||47.3||46.6||46.9|
|2016 - RBPB (Sha et al., 2016)||✓||-||Pipeline||70.3||67.5||68.9||54.1||53.5||53.8|
|2017 - Liu et al. (Liu et al., 2017b)||✓||-||Pipeline||78.0||66.3||71.7||-||-||-|
|2017 - DMCNN-MIL (Chen et al., 2017)||✓||✓||Joint||75.5||66.0||70.5||62.8||50.1||55.7|
|2018 - DEEB-RNN (Zhao et al., 2018)||✓||-||Pipeline||72.3||75.8||74||-||-||-|
|2018 - SELF (Hong et al., 2018)||✓||-||Pipeline||71.3||74.7||73.0||-||-||-|
|2018 - DBRNN (Sha et al., 2018)||✓||-||Joint||74.1||69.8||71.9||66.2||52.8||58.7|
|2018 - GMLATT (Liu et al., 2018a)||✓||-||Joint||78.9||66.9||72.4||-||-||-|
|2018 - JMEE(Liu et al., 2018b)||✓||-||Joint||76.3||71.3||73.7||66.8||54.9||60.3|
|2018 - Zeng et al. (Zeng et al., 2018)||✓||✓||Pipeline||85.3||79.9||82.5||41.9||34.6||37.9|
|2019 - Joint3EE (Nguyen and Nguyen, 2019)||✓||-||Joint||68.0||71.8||69.8||52.1||52.1||52.1|
|2019 - Liu et al.(Liu et al., 2019a)||✓||-||Joint||62.5||35.7||45.4||-||-||-|
|2019 - Chen et al.(Chen et al., 2019)||✓||-||Joint||66.7||74.7||70.5||44.3||40.7||42.4|
|2019 - GAIL-ELMo (Zhang et al., 2019b)||✓||-||Joint||74.8||69.4||72.0||61.6||45.7||52.4|
|2019 - DYGIE++ (Wadden et al., 2019)||✓||-||Joint||-||-||69.7||-||-||48.8|
|2019 - HMEAE (Wang et al., 2019b)||✓||-||Joint||-||-||-||62.2||56.6||59.3|
|2019 - JointTransition (Zhang et al., 2019a)||✓||-||Joint||74.4||73.2||73.8||55.7||51.1||53.3|
|2019 - PLMEE (Yang et al., 2019)||✓||-||Joint||81.0||80.4||80.7||62.3||54.2||58.0|
|2020 - Chen et al. (Chen et al., 2020a)||✓||-||Pipeline||66.7||74.7||70.5||44.3||40.7||42.4|
|2020 - MQAEE (Li et al., 2020a)||✓||-||Pipeline||-||-||73.8||-||-||55.0|
|2020 - Du et al. (Du and Cardie, 2020)||✓||-||Pipeline||71.1||73.7||72.3||56.7||50.2||53.3|
|2021-Li et al. (Li et al., 2021b)||✓||-||-||-||-||71.1||-||-||53.7|
|2021-GATE (En2ZH) (Ahmad et al., 2021)||✓||-||Joint||-||-||-||-||-||63.2|
|2021-Text2Event (Lu et al., 2021)||✓||-||Joint||69.6||74.4||71.9||52.5||55.2||53.8|
In recent years, event extraction methods have been primarily based on deep learning models. As shown in Table 5, in terms of F1, deep learning-based methods are superior to machine learning-based and pattern-matching methods in both event detection and argument extraction. GATE (En2ZH) (Ahmad et al., 2021) is evaluated under single-source transfer from English to Chinese and performs well on the argument role classification task. Li et al. (Li et al., 2021b) propose a document-level neural event argument extraction model. It is applied to ACE 2005 for zero-shot event extraction in the setting where all event types are seen. These results support the validity of event extraction methods based on deep learning models. They may indicate that deep learning-based methods can better learn the dependencies among arguments in the event extraction task. Among the deep learning-based models, the BERT-based approaches perform best. This suggests that BERT can better learn the context information of the sentence and learn word representations according to the current text. It better learns the semantic association of words in the current context, which helps to learn the association between arguments.
Comparing the pipeline-based methods (RBPB (Sha et al., 2016) and DEEB-RNN (Zhao et al., 2018)) with the joint-based methods (JRNN (Nguyen et al., 2016) and DBRNN (Sha et al., 2018)) that do not use the Transformer (Vaswani et al., 2017), it can be seen that the joint models outperform the pipeline models, especially on the argument role classification task. From DMCNN (Chen et al., 2015) and DMCNN-MIL (Chen et al., 2017), it can be concluded that when external resources are used with deep learning-based methods, performance improves significantly and becomes slightly higher than that of the joint model. Zeng et al. (Zeng et al., 2018) introduce external resources, which improves the precision and F1 of event classification. Thus, adding external knowledge appears to be an effective method, but how to introduce external knowledge into the sub-task of argument extraction still needs to be explored.
8. Future Research Trends
Event extraction is an essential and challenging task in text mining, which mainly learns the structured representation of events from the text that describes them. Event extraction is mainly divided into two sub-tasks: event detection and argument extraction. The core of event extraction is identifying the event-related words in the text and classifying them into appropriate categories. Event extraction methods based on deep learning models automatically extract features and avoid the tedious work of designing features manually. Event extraction is constructed as an end-to-end system that uses word vectors with rich language features as input, which reduces the errors caused by the underlying NLP tools. Previous methods focus on studying effective features to capture the lexical, syntactic, and semantic information of candidate triggers and candidate arguments. Furthermore, they explore the dependencies between triggers, the dependencies between multiple entities related to the same trigger, and the relationships between multiple triggers associated with the same entity. According to the characteristics of the event extraction task and the current research status, we summarize the following technical challenges.
Event extraction methods using BERT have become mainstream at present. However, event extraction differs from the tasks that the BERT model learns in pre-training. Argument extraction needs to consider the relationships between event argument roles in order to extract different roles under the same event type, which requires the event extraction model to learn the syntactic dependencies of the text. Therefore, modeling the dependency relationships between event arguments is an urgent problem to solve in order to extract the arguments of each event type comprehensively and accurately.
The event extraction task is complex, and existing pre-trained models are not trained on the event extraction task. Existing event extraction datasets contain little labeled data, and manual annotation of event extraction datasets has a high time cost. Therefore, constructing large-scale event extraction datasets, or designing methods that construct such datasets automatically, is also a future research trend.
The advantage of deep learning methods based on the joint model over the traditional approach is the joint representation. Event extraction depends on entity labels. We therefore believe that building an end-to-end autonomous learning model based on deep learning is a direction worthy of research and exploration, and that how to design multi-task and multi-federation learning is a major challenge.
Event extraction datasets are small. Deep learning methods that incorporate external resources and construct large-scale datasets have achieved good results. Because labeled datasets are difficult to construct and remain small, how to make better use of deep learning to extract events effectively with the help of external resources is also an urgent research direction.
According to its granularity, event extraction can be divided into sentence-level and document-level event extraction. Sentence-level event extraction has been studied extensively. Document-level event extraction, however, is still at an exploratory stage, even though it is closer to practical applications. Therefore, how to design multi-event extraction methods for document-level text is of great research significance.
Event extraction methods can be divided into schema-based methods and open domain methods. The performance of methods without schemas is difficult to evaluate, while schema-based methods need a different event schema to be designed for each event type. Therefore, designing a general event extraction schema based on event characteristics is an essential means of overcoming the difficulties in constructing event extraction datasets and in sharing knowledge among classes.
Domain text often contains numerous technical terms, which increases the difficulty of domain event extraction. Therefore, designing effective methods that understand the deep semantic information and the contextual correspondence in domain text has become an urgent problem to solve.
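To make the two sub-tasks and the role of an event schema discussed above concrete, the following is a minimal illustrative sketch, not a model from the surveyed literature. The schema, the trigger lexicon, and all function names are hypothetical: the lexicon stands in for a learned trigger classifier, and the entity-type-to-role mapping stands in for a learned argument role classifier. The sketch also assumes that entity mentions and their types are given, which mirrors the dependence on entity labels noted above.

```python
from dataclasses import dataclass, field

# Hypothetical schema: event type -> mapping from entity type to argument role.
SCHEMA = {
    "Attack": {"PER": "Attacker", "LOC": "Place", "TIME": "Time"},
    "Transport": {"PER": "Artifact", "LOC": "Destination", "TIME": "Time"},
}

# Hypothetical trigger lexicon standing in for a learned trigger classifier.
TRIGGER_LEXICON = {
    "attacked": "Attack",
    "bombed": "Attack",
    "travelled": "Transport",
}


@dataclass
class Event:
    event_type: str
    trigger: str
    arguments: dict = field(default_factory=dict)  # role -> list of mentions


def detect_events(tokens):
    """Sub-task 1 (event detection): find trigger words and classify their type."""
    return [
        Event(TRIGGER_LEXICON[tok.lower()], tok)
        for tok in tokens
        if tok.lower() in TRIGGER_LEXICON
    ]


def extract_arguments(event, entities):
    """Sub-task 2 (argument extraction): assign schema roles to entity mentions."""
    roles = SCHEMA[event.event_type]
    for mention, entity_type in entities:
        role = roles.get(entity_type)
        if role is not None:
            event.arguments.setdefault(role, []).append(mention)
    return event


def extract(tokens, entities):
    """Pipeline: detect events first, then fill the arguments of each event."""
    return [extract_arguments(event, entities) for event in detect_events(tokens)]


# Toy input; the entity annotations are assumed to be given.
sentence = "Rebels attacked Baghdad on Tuesday".split()
entities = [("Rebels", "PER"), ("Baghdad", "LOC"), ("Tuesday", "TIME")]
events = extract(sentence, entities)
```

Because each event type carries its own role mapping, supporting a new event type in this sketch means writing a new schema entry by hand, which is the cost that a general event extraction schema would aim to reduce.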
This paper mainly introduces the existing deep learning models for event extraction tasks. Compared with the traditional methods, the conclusions are as follows: 1) Event extraction methods based on deep learning learn features autonomously, and their performance on trigger classification and argument role classification is better than that of the traditional methods. 2) With the rapid development of deep learning, machine learning and deep learning based on neural networks continue to make good progress in event extraction. Using deep learning models to address missing data will provide an essential direction for follow-up research. First, we introduce concepts and definitions from three aspects of event extraction. Then we divide the deep learning-based event extraction paradigm into pipeline and joint approaches and introduce each in turn. Deep learning-based models improve performance through better representation learning methods, model structures, and additional data and knowledge. Next, we introduce the datasets in a summary table, together with the evaluation metrics. Furthermore, we give the quantitative results of the leading models on the ACE 2005 dataset in a summary table. Finally, we summarize the possible future research trends of event extraction.
Acknowledgements. The authors of this paper were supported by the NSFC through grants U20B2053, 61872022 and 62002007, and by the S&T Program of Hebei through grant 20310101D. Philip S. Yu was supported by NSF under grants III-1763325, III-1909323, and SaTC-1930941.