Complex Relation Extraction: Challenges and Opportunities

12/09/2020 ∙ by Haiyun Jiang, et al.

Relation extraction aims to identify the target relations of entities in texts. It is very important for knowledge base construction and text understanding. Traditional binary relation extraction, whether supervised, semi-supervised or distantly supervised, has been extensively studied and significant results have been achieved. In recent years, many complex relation extraction tasks, i.e., variants of simple binary relation extraction, have been proposed to meet the complex needs of practical applications. However, no work so far has fully investigated and summarized these complex relation extraction efforts. In this paper, we first report the recent progress in traditional simple binary relation extraction. Then we summarize the existing complex relation extraction tasks and present the definition, recent progress, challenges and opportunities for each task.


1 Introduction

Relation extraction (RE) is one of the fundamental tasks in information extraction and benefits many natural language processing tasks, such as question answering and text understanding. RE is also a core step in the knowledge base construction pipeline.

Traditional RE tasks [17] aim to identify the correct relation between two entities from texts. For example, we hope to extract the relational fact (Beijing, the-capital-of, China) from the following text:

Beijing formerly romanized as Peking is the capital of the People’s Republic of China.

Traditional RE tasks mainly focus on the binary relation between two entities. We refer to these tasks as binary relation extraction (BiRE for short); they are usually addressed with learning-based solutions. According to the problem settings, traditional BiRE tasks are roughly divided into three categories: supervised BiRE, semi-supervised BiRE and distantly supervised BiRE.

Specifically, supervised BiRE focuses on learning a RE model from a set of high-quality labeled data. However, high-quality labeled data is difficult and costly to obtain, while unlabeled data is widely available. Semi-supervised BiRE is thus proposed to train models with only a small set of labeled data and a large amount of unlabeled data. Another effort to alleviate the difficulty of data labeling is distant supervision. Distantly supervised BiRE aims to learn a reliable RE model from a set of weakly labeled samples, whose labels are obtained automatically in a heuristic way and usually contain a lot of noise.

Simple BiRE dominates current research in information extraction. In the early days, feature engineering and kernel-based methods were the focus of research in supervised and distantly supervised BiRE. Bootstrapping was usually used in semi-supervised BiRE, where relation instances and patterns are iteratively extracted starting from a small set of seed instances. In recent years, with the development of deep learning, many advanced neural models, e.g., BERT, the Transformer and capsule networks, have been applied to RE tasks.

In general, simple BiRE has made significant progress and many effective solutions have been used in practice. However, as intelligent applications grow rapidly, simple BiRE cannot meet their needs. We elaborate on the limitations of simple BiRE and introduce more complex RE tasks that address them.

First, simple BiRE depends on large amounts of data. However, it is difficult to obtain enough labeled (or unlabeled, noisy) data in many scenarios, which causes existing supervised (semi-supervised, distantly supervised) RE models to fail. To solve this problem, the task of few-shot relation extraction was proposed, which focuses on building effective models with just a few samples. Some few-shot learning algorithms (e.g., metric learning-based ones [8]) have proven effective for this task.

Second, simple BiRE is limited to sentence-level extraction: it mainly focuses on the relation between an entity pair mentioned in a single sentence. However, many sources beyond single sentences contain richer semantic relation instances. How to extract relations from these various sources is an interesting and challenging problem. Specifically,

  • Many entity pairs appear across multiple sentences in a document and cannot be extracted by simple BiRE models. This motivates document-level relation extraction.

  • Most works on BiRE only focus on a monolingual (e.g., Chinese or English) corpus. But many entities are mentioned in multiple languages, indicating that it is possible to identify a relation using texts in different languages. Cross-lingual relation extraction is thus proposed.

  • In addition to texts, information in other modalities (e.g., images, videos) is also useful for expressing certain semantic relations. For example, images are good at expressing spatial relations. Multi-modal relation extraction is proposed to use multi-modal information for RE.

Third, the binary modeling in BiRE is far from satisfying the requirements of some complex applications. The relationships among entities in the world are very complicated; binary relations are in general not enough to model the complicated semantics of the real world, and more complex relation modeling is needed.

  • In some scenarios, we have to identify relations involving multiple entities, i.e., N-ary relation extraction, which aims to extract relations among entities in the context of one or more sentences. N-ary RE is very useful for document-level reading comprehension and supports question answering and document classification.

  • Categorizing relations into different granularities is crucial for some tasks, such as building taxonomies. Multi-grained relation extraction aims to jointly extract multi-grained relations from texts.

  • Many relational facts only hold true under certain conditions. Conditional relation extraction aims to extract relations with certain constraints, e.g., temporal or spatial conditions, which are very important for complex applications. For example, we know the fact (Obama, President, United States) is only valid during 2009-2017. If this fact is used for knowledge-based question answering today, it may have serious political implications.

  • Some facts can be expressed in a nested way. Nested relation extraction is proposed to extract this kind of knowledge.

Fourth, existing BiRE cannot handle overlapping entities well. For example, the earlier sentence "Beijing formerly romanized as Peking …" contains three facts: (Beijing, the-capital-of, China), (Beijing, the-same-as, Peking) and (Peking, the-capital-of, China). However, traditional BiRE tends to extract these facts independently, which loses much potential supervision information. To model this property, the task of overlapping RE is proposed, where one or two entities are shared between two facts.

In this paper, we refer to the RE tasks mentioned above as complex RE. This paper consists of two parts. The first part (Sec 2) summarizes the traditional BiRE tasks and concludes with their challenges and directions. The second part (Sec 3) introduces the complex RE tasks, including the definition, an example and the recent progress of each, and also presents the research challenges and opportunities for these tasks.

We hope this survey will help researchers understand the latest progress, challenges and opportunities of the sub-tasks in RE.

2 Binary Relation Extraction

Simple binary relation extraction (BiRE) has been extensively studied for many years. In general, BiRE can be categorized into supervised, semi-supervised and distantly supervised paradigms.

2.1 Supervised BiRE

Description. Supervised BiRE focuses on learning a RE model based on a set of high-quality labeled samples, usually obtained by manual annotation or careful crowdsourcing. Each sample can be formalized as (s, e1, e2, r), where (e1, e2) is an entity pair and s is a sentence containing (e1, e2) that expresses the labeled relation r. A supervised BiRE model accepts (s, e1, e2) as input and predicts the proper relation r for the entity pair as output.
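As an illustration of this setting, many recent neural supervised BiRE models first wrap the entity pair with special marker tokens before encoding the sentence, so the classifier knows which spans form (e1, e2). A minimal sketch of that preprocessing step; the marker strings are illustrative, not from any specific library:

```python
def mark_entities(sentence, e1, e2):
    """Wrap the first mention of each entity with marker tokens,
    a common preprocessing step before a neural encoder."""
    marked = sentence.replace(e1, f"[E1] {e1} [/E1]", 1)
    marked = marked.replace(e2, f"[E2] {e2} [/E2]", 1)
    return marked

sample = ("Beijing formerly romanized as Peking is the capital of "
          "the People's Republic of China.")
print(mark_entities(sample, "Beijing", "China"))
```

The marked sentence, together with the gold relation r, then forms one training sample for the classifier.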

Recent works. In recent years, deep learning has been extensively used in RE tasks and many novel neural models are proposed. We highlight typical efforts in this direction.

  • (1) Neural graph-based models. Graph-based methods have been successfully applied to RE and achieve high performance. For example, [28] first applied graph convolutional networks (GCN) to RE.

  • (2) Pre-training based methods. Pre-trained models, e.g., BERT and XLNet, can encode a given text into a proper distributed representation, i.e., a text embedding. For example, [29] constructs entity pair graphs combined with the semantic features from BERT.

  • (3) Capsule network-based methods. For example, [27] uses a capsule network with an attention-based routing algorithm to deal with the multi-label problem in RE.

SOTA results. The commonly used datasets in supervised BiRE include SemEval-2010 Task 8 (https://mailman.uib.no/public/corpora/2010-February/010118.html), ACE 2004 (https://catalog.ldc.upenn.edu/LDC2005T09) and TACRED (https://nlp.stanford.edu/projects/tacred/). We present the state-of-the-art results on SemEval-2010 Task 8 in Table 1.

Model Macro-F1
TRE 87.1
R-BERT 89.2
EPGNN 90.2
Table 1: The SOTA results on SemEval-2010 Task 8. All results are from [Zhao et al., 2019].

2.2 Semi-supervised BiRE

Description. In many scenarios, rich labeled data is difficult to obtain, but a lot of unlabeled data is available. To leverage the large amount of unlabeled data in the training stage, semi-supervised BiRE tries to learn from both labeled and unlabeled data.

Formally speaking, we denote the pre-defined set of relations as R, the set of labeled data as D_L = {(s_i, e_i1, e_i2, r_i)} and the set of unlabeled data as D_U = {(s_j, e_j1, e_j2)}, where |D_L| and |D_U| are the corresponding data sizes. Semi-supervised BiRE aims to learn a function that models both the labeled and unlabeled data and predicts the target relation r ∈ R for each unlabeled sample.

Recent works. As a main branch of semi-supervised BiRE, bootstrapping starts from some labeled seed instances and learns a preliminary model, which is then used to find more labeled instances. Many works also focus on alleviating the semantic drift problem in bootstrapping. For example, [2] adds constraints to the training procedure by coupling many extractors for different categories and relations.
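The bootstrapping loop can be sketched in a few lines. The pattern induction below (taking the raw text between a known pair) is a deliberately crude stand-in for the pattern scoring used in real systems such as Snowball, and the toy corpus is invented for illustration:

```python
def bootstrap(corpus, seeds, rounds=2):
    """Minimal bootstrapping sketch: alternate between inducing
    textual patterns from known (e1, e2) pairs and applying those
    patterns to harvest new pairs from the corpus."""
    pairs, patterns = set(seeds), set()
    for _ in range(rounds):
        # Induce patterns: the text between a known entity pair.
        for sent in corpus:
            for e1, e2 in pairs:
                if e1 in sent and e2 in sent:
                    start, end = sent.index(e1) + len(e1), sent.index(e2)
                    if start < end:
                        patterns.add(sent[start:end])
        # Apply patterns: any "X <pattern> Y" match yields a new pair.
        for sent in corpus:
            for pat in patterns:
                if pat in sent:
                    left, _, right = sent.partition(pat)
                    if left.strip() and right.strip():
                        pairs.add((left.strip(), right.strip().rstrip(".")))
    return pairs

corpus = ["Beijing is the capital of China.",
          "Paris is the capital of France."]
print(bootstrap(corpus, {("Beijing", "China")}))
```

Without pattern scoring, a single bad pattern can pull in unrelated pairs, which is exactly the semantic drift problem the works above try to alleviate.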

With the exploration of teacher-student models in semi-supervised learning, [12] introduces this architecture into semi-supervised BiRE, where students learn a robust representation from unlabeled data and teachers guide the students with labeled data. Some other works utilize multi-task learning by jointly learning the semi-supervised BiRE task with other tasks.

SOTA results. It is hard to fairly compare different semi-supervised BiRE models. This is because: (1) many bootstrapping approaches are deployed in an open-world setting and extract relations on the web; (2) the semi-supervised setting varies greatly between methods, i.e., the level of supervision, the size of the unlabeled data and the evaluation metrics are not exactly the same across methods. For these reasons, we do not provide SOTA results here. In general, semi-supervised BiRE has made great progress in recent years and many mature systems (e.g., DIPRE, Snowball and KnowItAll) have been applied to structured knowledge acquisition tasks in practice.

2.3 Distantly Supervised BiRE

Description. Similar to supervised BiRE, each sample in distantly supervised BiRE can also be formalized as (s, e1, e2, r). The difference is that these samples are constructed in an automatic way, e.g., by aligning a knowledge base with text corpora [17]. The strong assumption in sample acquisition means the samples contain many noisy relation labels; in other words, s may only weakly express the labeled relation r, or not express it at all. As a result, the main focus of research in distantly supervised BiRE is how to alleviate the impact of noise on performance.
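The weak labeling heuristic can be sketched as follows. Note how the second (invented) sentence receives the capital relation label even though it does not express it, which is exactly the label noise discussed above:

```python
def distant_label(kb, sentences):
    """Distant supervision sketch: any sentence mentioning both
    entities of a KB triple is weakly labeled with that relation.
    The assumption is strong, so the labels can be noisy."""
    samples = []
    for e1, rel, e2 in kb:
        for sent in sentences:
            if e1 in sent and e2 in sent:
                samples.append((sent, e1, e2, rel))
    return samples

kb = [("Beijing", "the-capital-of", "China")]
sents = ["Beijing is the capital of China.",
         "He flew from Beijing to China's southern coast."]  # noisy match
print(distant_label(kb, sents))
```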

Recent works. We highlight several popular directions in recent years.

  • The idea of reinforcement learning has been widely used in noise detection. For example, [22] uses a policy network to detect noisy labels and further obtain the latent correct labels.

  • Adversarial training has also proven effective in improving the robustness of RE models on noisy samples [10].

  • Various attention mechanisms are proposed to learn the important features or instances among the noisy samples [9]. Besides, some other techniques are also applied to noise detection, e.g., soft constraints on entity types, CNN variants, the non-IID assumption, noise label converters and so on.

SOTA results. The commonly used benchmark for distantly supervised BiRE is NYT [17], which is constructed by aligning triples in Freebase with texts from the New York Times. We report the SOTA results in Table 2.

Model AUC
PCNN+HATT 0.42
PCNN+ATT-RA+BAG-ATT 0.42
SeG 0.51
Table 2: The SOTA results on NYT, where all results are from [Li et al., 2019]. AUC denotes the area under the precision-recall curve.

2.4 Challenges and Directions of BiRE

Although simple BiRE has made great progress in recent years, there are still some challenges.

  • Reliability of benchmarks. A reliable benchmark can be measured from two aspects: scale and quality. That is, a good benchmark should contain large-scale, high-quality test samples. However, the two conditions cannot easily be satisfied simultaneously in BiRE tasks. For example, in supervised BiRE, the scale of the test set is usually very small. How to obtain reliable benchmarks is a promising direction.

  • Reliability of model learning. Because of various factors, e.g., limited data or noise, the precise semantic features of relations are still difficult to capture. With the development of machine learning technology, e.g., pre-training and transfer learning, this problem can be alleviated to some extent. But learning a highly reliable BiRE model is still an important direction.

  • Sparse detection in applications. In real applications, we usually pre-define a set of relations R and then extract their instances from massive candidate entity pairs. However, there are countless semantic relations in the world, and most of the candidate entity pairs to be processed have no relation in R. How to detect the correct instances from a huge collection is a big challenge, which we refer to as sparse detection. Fortunately, some benchmarks model this property. For example, NYT contains the relation "NA", which denotes that an entity pair holds no relation from the relation set. But to the best of our knowledge, there is no RE model that performs very well on sparse detection.

3 Complex Relation Extraction

Only very recent works address complex relation extraction (CoRE). Different from conventional BiRE, CoRE tries to extract more complex relations that involve multiple entities or hold under certain constraints. In this section, we present the definition and investigate the recent progress of each complex RE task. We also summarize the challenges of each task.

3.1 Few-shot Relation Extraction

In many cases, a relation only has a few instances, which makes traditional supervised RE models powerless. As a new paradigm, few-shot learning tends to be effective for this problem, yielding the task of few-shot RE.

Few-shot RE can be formalized as follows [8]. Given a set of relations R = {r_1, r_2, …, r_N} with N relations, a support set S is denoted as:

S = {(s_i^k, r_i) | i = 1, …, N; k = 1, …, K},   (1)

where r_i is the i-th relation and s_i^k is the k-th sentence with an entity pair labeled with relation r_i. Few-shot RE aims to learn a function that predicts the proper relation r ∈ R for an unlabeled query sample.

[8] proposes a new few-shot RE dataset: FewRel. They also implemented recent SOTA few-shot learning algorithms on this dataset. FewRel 2.0 [6] is a more challenging few-shot dataset, which studies the problem of extraction in new domains with few instances. Considering the noise in few-shot RE, [5] proposes a hybrid attention-based prototypical network to extract informative features. [19] uses BERT to learn the distributional similarity between two sentences where the entity pair in each sentence is replaced by a [BLANK] symbol.
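As a sketch of the metric-learning idea behind prototypical networks: each relation's support embeddings are averaged into a prototype, and a query is assigned to the relation whose prototype is nearest. The 2-d "embeddings" below are hand-made toy vectors, not real sentence encodings:

```python
import math

def prototype(vectors):
    """Mean of the support embeddings for one relation (its prototype)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def classify(query, support):
    """N-way K-shot sketch: assign the query embedding to the relation
    whose prototype is nearest in Euclidean distance."""
    protos = {rel: prototype(vecs) for rel, vecs in support.items()}
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(protos, key=lambda rel: dist(query, protos[rel]))

# Toy 2-way 2-shot episode with hand-made 2-d "embeddings".
support = {"the-capital-of": [[1.0, 0.1], [0.9, 0.0]],
           "born-in": [[0.0, 1.0], [0.1, 0.9]]}
print(classify([0.8, 0.2], support))  # nearest to the-capital-of prototype
```

Real prototypical networks learn the encoder end-to-end so that such distances become meaningful; the attention mechanisms in [5] additionally down-weight noisy support instances.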

There are two challenges or directions for few-shot RE:

  • Relative importance of samples: Since there are only a few instances for each relation, it is necessary to learn from the instances of other relations when learning the semantic features of a target relation. As a result, the relative importance of an instance to the target relation should be learned; otherwise, noisy instances will be introduced.

  • Knowledge reasoning is needed: As [8] points out, relation prediction in a large number of samples needs deep reasoning beyond the text of the sample. For example, given the sentence [8]:

    He was a professor at Reed College, where he taught Steve Jobs, and replaced Lloyd J. Reynolds as the head of the calligraphy program.

    Logical reasoning with common sense is needed to infer the relational fact (Steve Jobs, educated-at, Reed College).

3.2 Document Relation Extraction

Document relation extraction aims to extract relations between entity mentions at document-level. In this task, the relation mentions can span multiple sentences and even paragraphs. These properties make the problem more challenging compared to intra-sentence relation extraction.

In real-world RE, the data to be processed is often in document form, which requires document-level (i.e., inter-sentence rather than intra-sentence) RE models. Until recently, methods like [20, 18] appeared and accelerated the development of inter-sentence RE tasks. All of these methods can be used to enrich knowledge bases.

[15] first presented a graph-based LSTM model to solve RE across multiple sentences. They build two directed acyclic graphs based on word dependencies and then utilize LSTMs to learn the hidden representations. [20] further improves the model and introduces a graph-state LSTM model, which can keep whole-graph information and is efficient in the training and decoding steps. [18] uses five types of edges to build a document-level graph and learns representations with an edge-labeled GCNN model; a bi-affine layer aggregates all entity mentions and generates the final relation prediction.

The main difficulties in document RE are as follows:

  • Diversity of document formats: Documents come in a variety of formats. The first task is to transform the original data file (e.g., PDF) into a specific format (e.g., plain text). Some important information can be lost in this step, making the RE process harder.

  • Long dependencies across sentences: Relation mentions can span long distances in a document. Traditional CNN- and RNN-based networks fail to capture such features in longer sequences.

3.3 Cross-lingual Relation Extraction

Cross-lingual RE seeks to learn an extractor trained on a resource-rich language and transfer it to the target language. Cross-lingual RE also takes a sentence and its entity mentions as input and outputs the relation between the given entity pair.

Cross-lingual RE is beneficial to knowledge base completion because some entities may appear more frequently in the corpus of a certain language. Given the lack of well-annotated data in resource-poor languages, many informative facts would otherwise be lost because they cannot be extracted from those languages alone. Cross-lingual RE addresses exactly this drawback.

In the early stage, cross-lingual RE methods depend on parallel corpora and conduct extraction by projecting the source language onto the target one. [30] solves the problem with the help of a translation mechanism. After cross-lingual word embeddings were proposed, [14] utilizes them to map source embeddings to the target language. [21] first builds text features with universal dependency parsing tools and then adopts a GCN to learn hidden representations in a shared semantic space, in which RE for both languages can be conducted.

Although there are many methods for cross-lingual RE, the main challenges are as follows:

  • Gaps between languages: The gaps between different language pairs vary greatly. Whether some shared features are universally useful is still an open question.

  • Applicability in practice: The state-of-the-art models do not achieve satisfactory results; the best F1-score is around 62%. Due to this relatively low performance, they are not reliable enough for practical use.

3.4 Multi-modal Relation Extraction

With the explosive growth of information on the Internet, images and videos have also become rich resources. Multi-modal RE takes advantage of these large-scale corpora and focuses on extracting relations from them.

As a vivid way to convey information, images and videos can carry much knowledge. On the one hand, humans often prefer to express some common sense knowledge with images rather than saying it explicitly. On the other hand, combining multi-modal corpora has shown promising results in many tasks. This phenomenon highlights the importance of harvesting relations from images or videos rather than just free text.

[4] introduced the Never Ending Image Learner (NEIL) to extract visual knowledge from the Internet. NEIL clusters images from websites and mines relationships between instances. Based on recently proposed visual question answering (VQA) datasets, multi-modal RE such as Object-Object or Object-Attribute relation extraction has also attracted wide attention.

Multi-modal RE is still a research hotspot and has many interesting problems to focus on:

  • Generality of knowledge: The relations extracted by existing works are highly tied to the given input, e.g., "the ball in the picture is red". Such knowledge is shallow and hard to utilize in further studies. How to extract implicit but informative relations, e.g., common sense knowledge, is one of the ultimate goals of multi-modal RE.

  • Multi-modal KB: Although researchers have begun to construct multi-modal KBs, there is still a big gap between them and existing KBs.

3.5 N-ary Relation Extraction

N-ary relation extraction (NRE) aims to extract relations among entities in the context of one or more sentences. In the NRE task, the input can be denoted as (E, T), where E = {e_1, …, e_n} contains all the entity mentions and T is the given text containing m sentences (m ≥ 1). The target is to predict the relation among those entities. The relation set is pre-defined and represented as R; "NA" is also included in R, denoting that there is no relation among the entities.

For example, given the text:

The deletion mutation on exon-19 of EGFR gene was present in 16 patients, while the 858E point mutation on exon-21 was noted in 10. All patients were treated with gefitinib and showed a partial response. [15]

the entity mentions (gefitinib, EGFR, 858E) form the relation of (drug, gene, mutation).

N-ary RE has attracted increasing research interest. It plays a crucial role in applications such as detecting cause-effect relations and predicting drug-gene-mutation facts. In contrast to the prosperity of binary RE, fewer methods have been proposed for the N-ary RE task. [13] studies the case in the biomedical domain where the n-ary relation is expressed within a single sentence. Recently, some exciting work [20] has appeared; these methods can handle cross-sentence N-ary RE well.

[15] explores a general RE framework based on graph LSTMs. They first transform the input text into a graph by regarding words as nodes and the dependencies, links between adjacent words and inter-sentence relations as edges. The graph is then separated into two DAGs, from which an extended tree LSTM network learns the text representations. [20] proposes a graph-state LSTM model that can learn better representations of the input text through recurrent graph state transitions. With an increasing number of recurrent steps, each word can capture more information from a larger context.

There are still some difficulties in N-ary RE.

  • Lack of data: There is no well-annotated data specifically for evaluating N-ary RE. The currently used benchmark dataset in [15] and [20] is constructed via distant supervision and has only one ternary relation.

  • Absence of end-to-end models: Most preliminary works require all entity mentions as input, but obtaining them can be demanding. In real applications, an end-to-end model can make a vast difference.

3.6 Multi-grained Relation Extraction

Multi-grained relations describe knowledge in a coarse-to-fine manner. Intuitively, fine-grained RE aims to distinguish subordinate-level relations within a coarse-grained relation. Categorizing relations into different levels is crucial for building taxonomies, mining level-specific information, etc.

The performance gains from multi-grained information have been demonstrated in many tasks. Multi-grained relations via categories can provide additional supervision from annotated data [24]. By learning multi-grained topics of documents, [3] further improves the performance of short text classification.

However, multi-grained RE has attracted little attention from the community so far. [25] proposes a multi-grained named entity recognition framework to tackle the overlapping and nested problems.

There are several possible reasons for the slow progress of multi-grained RE:

  • Fuzzy boundaries: The boundary between coarse-grained and fine-grained relations is not clear. Existing works mostly treat the two types of relations as a whole, which leads to a performance bottleneck.

  • Fairness of evaluation metrics: The naive F1 measure is not sufficient for multi-grained RE; corresponding evaluation metrics are needed to better reflect quality in a multi-grained manner.

3.7 Conditional Relation Extraction

Conditional RE aims to extract relations with certain constraints, e.g., temporal or spatial conditions. A conditional relation can be generally denoted as (t, c), where t is the original subject–property–object triple and c is the condition under which the relation holds true. Taking a temporal condition as an example, the relation PresidentOf(Barack Hussein Obama, United States) only holds true over the period 2009 to 2017, so the condition here is a temporal interval.
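A minimal sketch of such a conditioned fact, with a temporal interval as the condition; the tuple layout is just one possible encoding, and the dates used are the actual presidential term (January 2009 to January 2017):

```python
from datetime import date

def holds(fact, when):
    """Conditional-relation sketch: a fact is a ((s, p, o), condition)
    pair; here the condition is a temporal validity interval."""
    (_s, _p, _o), (start, end) = fact
    return start <= when <= end

obama = (("Barack Obama", "PresidentOf", "United States"),
         (date(2009, 1, 20), date(2017, 1, 20)))
print(holds(obama, date(2012, 6, 1)))   # True
print(holds(obama, date(2020, 1, 1)))   # False
```

A knowledge-based QA system that checks the condition before answering avoids exactly the kind of stale answer discussed in the introduction.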

Nowadays, large-scale KBs such as DBpedia, Freebase and YAGO contain millions of entity and relation instances. However, few of them consider relations as conditioned or include the above-mentioned external conditions in the KB. This heavily limits the applicability of existing KBs to sophisticated reasoning tasks and leads to an urgent need for research on conditional RE.

In the early stage, pattern-based methods combined with manually designed features were adopted to capture the condition. YAGO2 uses regular expressions to extract temporal and spatial relations from Wikipedia infoboxes. [11] tries to generate patterns for time-variant relations. Some machine learning approaches have also been tried in the medical field [7].

Preliminary works have been attempted, but there is still no good solution. The main challenges are as follows:

  • Complexity of dependencies: The complex dependencies among the entities, the relation and its condition make it hard to handle each part properly.

  • Flexibility of conditions: Conditions in free text may take many forms. A general framework is needed to formalize the conditional dimension.

  • Lack of data: There is no well-annotated data for conditional RE yet. To some extent, this prevents using end-to-end models for this task.

3.8 Nested Relation Extraction

Traditional BiRE can be expressed as (arg1, rel, arg2), while nested RE can be formalized as (arg1, rel, (arg2, rel2, arg3)) or ((arg1, rel, arg2), rel2, arg3). Traditional binary RE will lose this nested information, resulting in incomplete and uninformative triples, while nested RE helps express the meaning of the original sentence more accurately. In addition, nested RE is more beneficial to downstream tasks, such as question answering, which relies heavily on the correctness and completeness of triples.
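The nested forms above can be represented directly as tuples whose arguments may themselves be triples. A small sketch that unrolls every nesting level; the example fact is invented for illustration:

```python
def flatten(triple):
    """Nested-RE sketch: a triple's arguments may themselves be
    triples, e.g. (arg1, rel, (arg2, rel2, arg3)); flatten() lists
    the triple at every nesting level for downstream use."""
    arg1, rel, arg2 = triple
    out = [(arg1, rel, arg2)]
    for arg in (arg1, arg2):
        if isinstance(arg, tuple):
            out.extend(flatten(arg))
    return out

# Invented nested fact for illustration.
nested = ("Obama", "signed", ("bill", "named", "ACA"))
print(flatten(nested))
```

Flattening like this is lossy (the inner triple's role as an argument disappears), which is why nested RE keeps the structured form rather than emitting independent binary triples.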

Some recent works try to study nested RE. NESTIE [1] learns syntactic patterns for the relations that are expressed as nested templates. StuffIE [16] exploits Stanford dependency parsing and lexical databases to extract nested relations.

However, the existing works are not good enough, mainly because of the following challenges:

  • Complexity of structure: Sentences often consist of many clauses or contain many entities and nested relationships. Sentence structure is often very complex, and it is difficult to analyze nested structures directly.

  • Implicit subjects: In a sentence, the subject may appear only once and thereafter mostly in the form of a reference (him, he), so the corresponding real entity needs to be resolved.

3.9 Overlapping Relation Extraction

Different relational triples in a sentence may overlap to different degrees. [26] concludes two overlapping types: Entity Pair Overlap (EPO) and Single Entity Overlap (SEO). EPO means some triples share an overlapping entity pair; SEO means some triples share one overlapping entity but not an entire entity pair. For example, (s1, president-of, o1) and (s1, born-in, o2) are SEO, as they share the same entity s1; (s1, born-in, o1) and (s1, live-in, o1) are EPO, as they share the same entity pair (s1, o1).
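The two overlap types can be checked mechanically. This sketch treats the entity pair as an unordered set, a simplification of the definition in [26]:

```python
def overlap_type(t1, t2):
    """Classify the overlap between two (subject, relation, object)
    triples: EPO if they share the full entity pair, SEO if they
    share exactly one entity, otherwise no overlap."""
    (s1, _, o1), (s2, _, o2) = t1, t2
    if {s1, o1} == {s2, o2}:
        return "EPO"
    if {s1, o1} & {s2, o2}:
        return "SEO"
    return "none"

print(overlap_type(("s1", "president-of", "o1"),
                   ("s1", "born-in", "o2")))  # SEO
print(overlap_type(("s1", "born-in", "o1"),
                   ("s1", "live-in", "o1")))  # EPO
```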

Previous RE was designed to find relations based on given entity pairs. However, in actual applications, the locations of entities are often unknown, and there may be multiple relationships between entities. Ignoring overlapping RE loses many related triples, leading to incomplete knowledge graphs.

Some researchers are now studying how to handle overlapping relations in a sentence. CopyR [26] adopts an end-to-end neural model with a copy mechanism to extract overlapping relations, and is the first work considering the overlap problem. [23] designs a hierarchical paradigm that incorporates reinforcement learning to extract overlapping relations.

But there are still some challenges in overlapping relation extraction.

  • Complexity of relations: There may be no relationship or multiple relationships between two entities in a sentence.

  • Unknown entities and relationships: The locations of entities and relationships are unknown, and it is difficult to find them correctly.

4 Conclusion

Relation extraction denotes a series of tasks that aim to identify the proper relations for two or more entities under specific settings. In this paper, we summarize the latest progress of simple binary RE tasks, including supervised, semi-supervised and distant supervised RE. Furthermore, we also investigate the more complex RE tasks, including the definition, recent progress, challenges and opportunities.

Relation extraction research is a long-term endeavor, which mainly benefits from advances in natural language processing and machine learning. We argue that the progress made in simple BiRE can also be transferred to the complex RE tasks, although mature applications for complex RE are still far away. We hope this survey helps researchers quickly understand the concepts and research progress of each sub-task in complex RE.

References

  • [1] N. Bhutani, H. Jagadish, and D. Radev (2016) Nested propositions in open information extraction. In EMNLP, pp. 55–64. Cited by: §3.8.
  • [2] A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka, and T. M. Mitchell (2010) Coupled semi-supervised learning for information extraction. In WSDM '10, Cited by: §2.2.
  • [3] M. Chen, X. Jin, and D. Shen (2011) Short text classification improved by learning multi-granularity topics. In IJCAI, Cited by: §3.6.
  • [4] X. Chen, A. Shrivastava, and A. Gupta (2013) NEIL: extracting visual knowledge from web data. ICCV, pp. 1409–1416. Cited by: §3.4.
  • [5] T. Gao, X. Han, Z. Liu, and M. Sun (2019) Hybrid attention-based prototypical networks for noisy few-shot relation classification. In AAAI, Cited by: §3.1.
  • [6] T. Gao, X. Han, H. Zhu, Z. Liu, P. Li, M. Sun, and J. Zhou (2019) FewRel 2.0: towards more challenging few-shot relation classification. ArXiv abs/1910.07124. Cited by: §3.1.
  • [7] H. Gurulingappa, A. M. Rajput, and L. Toldo (2012) Extraction of potential adverse drug events from medical case reports. In J. Biomedical Semantics, Cited by: §3.7.
  • [8] X. Han, H. Zhu, P. Yu, Z. Wang, Y. Yao, Z. Liu, and M. Sun (2018) FewRel: a large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation. In EMNLP, Cited by: §1, 2nd item, §3.1, §3.1.
  • [9] Y. Li, G. Long, T. Shen, T. Zhou, L. Yao, H. Huo, and J. Jiang (2019) Self-attention enhanced selective gate with entity-aware embedding for distantly supervised relation extraction. arXiv preprint arXiv:1911.11899. Cited by: 3rd item.
  • [10] B. Liu, H. Gao, G. Qi, S. Duan, T. Wu, and M. Wang (2019) Adversarial discriminative denoising for distant supervision relation extraction. In DASFAA, Cited by: 2nd item.
  • [11] Y. Liu, W. Hua, and X. Zhou (2019) Extracting temporal patterns from large-scale text corpus. In ADC, Cited by: §3.7.
  • [12] F. Luo, A. Nagesh, R. Sharp, and M. Surdeanu (2019) Semi-supervised teacher-student architecture for relation extraction. Cited by: §2.2.
  • [13] R. T. McDonald, F. C. Pereira, S. Kulick, R. S. Winters, Y. Jin, and P. S. White (2005) Simple algorithms for complex relation extraction with applications to biomedical ie. In ACL, Cited by: §3.5.
  • [14] J. Ni and R. Florian (2019) Neural cross-lingual relation extraction based on bilingual word embedding mapping. In EMNLP/IJCNLP, Cited by: §3.3.
  • [15] N. Peng, H. Poon, C. Quirk, K. Toutanova, and W. Yih (2017) Cross-sentence n-ary relation extraction with graph lstms. ACL 5, pp. 101–115. Cited by: 1st item, §3.2, §3.5, §3.5.
  • [16] R. E. Prasojo, M. Kacimi, and W. Nutt (2018) StuffIE: semantic tagging of unlabeled facets using fine-grained information extraction. In CIKM ’18, Cited by: §3.8.
  • [17] S. Riedel, L. Yao, and A. McCallum (2010) Modeling relations and their mentions without labeled text. In ECML/PKDD, Cited by: §1, §2.3, §2.3.
  • [18] S. K. Sahu, F. Christopoulou, M. Miwa, and S. Ananiadou (2019)

    Inter-sentence relation extraction with document-level graph convolutional neural network

    .
    In ACL, Cited by: §3.2, §3.2.
  • [19] L. B. Soares, N. FitzGerald, J. Ling, and T. Kwiatkowski (2019) Matching the blanks: distributional similarity for relation learning. In ACL, Cited by: §3.1.
  • [20] L. Song, Y. Zhang, Z. Wang, and D. Gildea (2018) N-ary relation extraction using graph state lstm. In EMNLP, Cited by: 1st item, §3.2, §3.2, §3.5, §3.5.
  • [21] A. Subburathinam, D. Lu, H. Ji, J. May, S. Chang, A. Sil, and C. R. Voss (2019) Cross-lingual structure transfer for relation and event extraction. In EMNLP/IJCNLP, Cited by: §3.3.
  • [22] T. Sun, C. Zhang, Y. Ji, and Z. Hu (2019) Reinforcement learning for distantly supervised relation extraction. IEEE Access 7, pp. 98023–98033. Cited by: 1st item.
  • [23] R. Takanobu, T. Zhang, J. Liu, and M. Huang (2019) A hierarchical framework for relation extraction with reinforcement learning. In AAAI, Vol. 33, pp. 7072–7079. Cited by: §3.9.
  • [24] W. Wu, Y. Meng, Q. Han, M. Li, X. Li, J. Mei, P. Nie, X. Sun, and J. Li (2019)

    Glyce: glyph-vectors for chinese character representations

    .
    ArXiv abs/1901.10125. Cited by: §3.6.
  • [25] C. Xia, C. Zhang, T. Yang, Y. Li, N. Du, X. W. Wu, W. Fan, F. Ma, and P. S. Yu (2019) Multi-grained named entity recognition. In ACL, Cited by: §3.6.
  • [26] X. Zeng, D. Zeng, S. He, K. Liu, and J. Zhao (2018) Extracting relational facts by an end-to-end neural model with copy mechanism. In ACL, Cited by: §3.9, §3.9.
  • [27] X. Zhang, P. Li, W. Jia, and H. Zhao (2019) Multi-labeled relation extraction with attentive capsule network. In AAAI, Vol. 33, pp. 7484–7491. Cited by: 3rd item.
  • [28] Y. Zhang and P. Qi (2018) Graph convolution over pruned dependency trees improves relation extraction. arXiv preprint arXiv:1809.10185. Cited by: 1st item.
  • [29] Y. Zhao, H. Wan, J. Gao, and Y. Lin (2019) Improving relation classification by entity pair graph. In ACML, Cited by: 2nd item.
  • [30] B. Zou, Z. Xu, Y. Hong, and G. Zhou (2018) Adversarial feature adaptation for cross-lingual relation classification. In COLING, Cited by: §3.3.