Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution

05/31/2017 · Koichiro Yoshino, et al. · Kyoto University

This paper investigates and analyzes the effect of dependency information on predicate-argument structure analysis (PASA) and zero anaphora resolution (ZAR) for Japanese, and shows that a straightforward approach to PASA and ZAR works effectively even when dependency information is not available. We constructed an analyzer that directly predicts the relationships between predicates and arguments, together with their semantic roles, from a POS-tagged corpus. The features of the system are designed to compensate for the absence of syntactic information, using features employed in dependency parsing as a reference. We also constructed analyzers that use oracle dependencies and real dependency parsing results, and compared them with the system that does not use any syntactic information to verify that the improvement provided by dependencies is not crucial.

1 Introduction

Predicate-argument structure (PAS), including zero anaphora, is one of the most fundamental and classical components of natural language processing (NLP). Many NLP applications utilize PAS, such as machine translation [Zhai et al.2013], question answering [Shen and Lapata2007], and dialogue systems [Yoshino et al.2011].

Conventionally, the PAS analysis (PASA) architecture rests on the pipeline processing of NLP. The analyzer is assumed to receive the correct results of various preprocessing steps, such as word segments (WSs) and part-of-speech (POS) tags from a morphological analyzer and dependency structures from a dependency parser. However, this assumption entails the costly processes of preparing an accurate morphological analyzer and dependency parser for the target domain or application. In practice, the accuracies of dependency parsers are still not sufficient (around or below 90%) [Kudo and Matsumoto2002, Flannery et al.2011] to serve as input to PASA, even if the parser is adapted to the target domain. This contrasts with morphological analyzers adapted to the target domain [Neubig et al.2011], which reach accuracies of more than 96%. Furthermore, the cost of constructing a dependency parser adapted to the target domain is much higher than that of constructing a domain-adapted morphological analyzer, in terms of both data preparation and parser adaptation.

A pipeline approach therefore still requires a domain-adapted dependency parser. In our previous work [Yoshino et al.2013], we proposed a straightforward framework that does not require any syntactic information and directly predicts, from an entire document, the pairs of a predicate and an argument that are related by a semantic role. However, that work did not compare against a PAS analyzer that uses dependency information. This paper follows our previous approach to construct a PAS analyzer that does not assume dependencies as input, compares it with a PAS analyzer that uses dependencies, and investigates the effect of dependency information. This paper also studies the influence of parsing errors and the cost of parser construction, which we often encounter in real language processing. The straightforward framework frees us from the costly processes of data preparation and preprocessor construction, provided that its accuracy is sufficient [Zhou and Xu2015].

2 Predicate Argument Structure Analysis (PASA)

The PAS is a relationship between a predicate, i.e., a verb or an indeclinable word (noun, adjective, or adjectival verb) that indicates an event, and its arguments. A predicate p in a document has arguments a1, a2, …, an that have semantic roles r1, r2, …, rn. We show an example of a PAS in Figure 1. In the example, the predicates fate, bet, concept, fulfill, and address are given, and they have corresponding arguments (party, …, fulfillment) with semantic roles (ga, …, ni); gray boxes are predicates. Predicates can take other predicates as their arguments. For example, address takes fulfillment as its ni-case, although fulfillment is also a predicate. The solid arrows (upper half of the figure) represent dependencies, where the leaves depend on the heads (dependency information does not include edge labels). Semantic role labels are annotated on the dependency edges (ellipsoidal labels). The language-dependent values of these labels are defined, for each language, in the shared task of semantic role labeling (SRL) [Hajič et al.2009]. The example in the figure is a Japanese sentence; in Japanese, the defined labels are ga (nominative case), wo (accusative case), and ni (dative case). Gray solid lines are dependencies that do not have defined semantic roles. The dotted arrows (lower half of the figure) denote zero anaphora, which occur frequently in Japanese and express predicate-argument relationships between words that do not have dependency relations. The label set is the same as the SRL label set. The task of predicting zero anaphora in particular is called zero anaphora resolution (ZAR) [Iida et al.2007a, Sasano and Kurohashi2009, Iida and Poesio2011]. The example only shows zero anaphora relations between words in the same sentence, but zero anaphora relations also exist between words in different sentences. ZAR is closely related to the coreference resolution task [Iida. et al.2003], which identifies mentions of the same real-world entity in a document. If some words are identified as coreferent, each word of the coreferential cluster takes the same role of the same predicate. In the example, the relation between the Socialist party and the party is a coreference, and both words take the ga label of the predicates fate, bet, fulfillment, and address.

Figure 1: An example of the predicate-argument structure. Arrows express relations in which leaves depend on the heads. Solid lines mean dependencies, labels show their relations, and broken lines mean relations of the PAS that do not have dependency relations. Gray lines are dependencies that do not have any PAS relation.

The task of PASA is to predict arguments and their semantic roles for given predicates. Conventional PAS analyzers output only one argument per semantic role for a predicate, and coreference resolution, as post-processing, resolves the case where a predicate has multiple arguments in the same semantic role (e.g., both Socialist party and party are ga (nominative case) arguments of address) [Matsubayashi et al.2012]. This paper follows this protocol and defines the PASA task as outputting one argument per semantic role for a predicate. In other words, the task of PASA is to predict one member of the coreferential cluster that takes a semantic role for the given predicate. In the example, for the predicate address, the analyzer outputs either Socialist party or party as the semantic role ga (nominative case).
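As an illustration only, the following Python sketch shows one possible way to represent a PAS and a coreferential cluster in this task setting; the words follow the example of Figure 1, and the class and variable names are ours, not the corpus format.

from dataclasses import dataclass, field

@dataclass
class PAS:
    # One predicate with at most one argument per semantic role (ga, wo, ni).
    predicate: str
    arguments: dict = field(default_factory=dict)

# Coreferential cluster from Figure 1: either member may fill the ga role of "address".
coref_cluster = {"Socialist party", "party"}

# The analyzer outputs only one member of the cluster per role.
pas_address = PAS(predicate="address", arguments={"ga": "party", "ni": "fulfillment"})
print(pas_address)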

3 Pointwise Predicate-Argument Structure Analysis

Pointwise PASA (PWPASA) is a framework that predicts the relation between every word in a document and a given predicate, independently of other predicates, with a binary classifier [Yoshino et al.2013]. One of the advantages of the pointwise approach is that the classifier does not depend on other NLP tasks whose prediction accuracy may not be sufficiently high [Neubig et al.2011, Mori and Neubig2011]. PWPASA does not essentially require global structures such as dependencies, and can therefore be used to create a PASA model that does not refer to dependency information.

3.1 Model of Prediction

PWPASA handles PASA as a binary classification problem for every pair of an argument candidate and a predicate. Labeled pairs of an argument and a predicate are used as positive training examples (P), and unlabeled pairs are used as negative training examples (N). In the example of Figure 2, the upper box shows training examples for the ga-intra classifier: the pair of "fate" and "party" and the pair of "fate" and "Socialist party" are positive examples, and pairs of "fate" and other candidates are negative. Similarly, the bottom box shows some training examples for the wo-intra classifier converted from the example in Figure 1. The system consists of six classifiers in total (3 semantic roles × 2 settings): ga (nominative case), wo (accusative case), and ni (dative case), each for the intra-sentence (intra) and inter-sentence (inter) cases. The intra classifiers are trained from all pairs of a predicate and every word in the same sentence, and the inter classifiers are trained from all pairs of a predicate and every word in different sentences.

Figure 2: Examples of positive and negative example creation from the example of Figure 1. This figure shows examples of the label ga in the intra case and the label wo in the intra case.
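The following is a minimal sketch, in Python, of how the positive and negative pairs of Figure 2 could be generated for one intra-sentence classifier; the function name, the toy sentence, and the index-based interface are our own simplifications.

def make_intra_pairs(sentence, predicate_idx, gold_args):
    """Create (argument candidate, predicate) training pairs for one intra classifier.

    sentence      : list of words in the sentence containing the predicate
    predicate_idx : index of the target predicate
    gold_args     : set of word indices annotated with the target role (e.g. ga)
    Returns a list of (candidate_idx, predicate_idx, label) with label 1/0.
    """
    pairs = []
    for i, _ in enumerate(sentence):
        if i == predicate_idx:
            continue
        label = 1 if i in gold_args else 0  # labeled pairs are positive, the rest negative
        pairs.append((i, predicate_idx, label))
    return pairs

# Toy example in the spirit of Figure 2: "party" and "Socialist party" are ga-arguments of "fate".
sent = ["Socialist party", "party", "fate", "bet", "concept"]
print(make_intra_pairs(sent, predicate_idx=2, gold_args={0, 1}))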

The PASA task expects one argument per predicate in one semantic role. However, PWPASA may output several argument candidates for a predicate in one semantic role; thus, we used a logistic regression (LR) classifier and selected the best candidate by its prediction probability. As shown in Figure 3, classifiers for the same semantic role output several classification results. Here, ID is the word ID and S-ID is the sentence ID. The classification results sometimes conflict because the labeling of each pair of an argument candidate and a predicate is independent (in the example of Figure 3, both "Tokyo" and "pewit gull" are classified as nominatives of the predicate "tangle"). For each semantic role of a predicate, the system outputs the single result with the highest LR probability. In the example shown in Figure 3, the results of "Tokyo" and "pewit gull" for "tangle" conflict; the system chooses one of them as the nominative case based on the LR prediction probabilities.
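A small sketch of the candidate-decision step described above: among conflicting pointwise predictions for one semantic role, the candidate with the highest LR probability is kept. The 0.5 threshold and the dictionary interface are assumptions for illustration.

def select_argument(candidate_probs, threshold=0.5):
    """Pick at most one argument per semantic role from conflicting pointwise predictions.

    candidate_probs : dict mapping candidate word -> LR probability of being the argument
    Returns the best candidate, or None if no candidate exceeds the threshold.
    """
    best_word, best_prob = None, threshold
    for word, prob in candidate_probs.items():
        if prob > best_prob:
            best_word, best_prob = word, prob
    return best_word

# Following Figure 3: two candidates conflict for the nominative (ga) case of "tangle".
print(select_argument({"Tokyo": 0.62, "pewit gull": 0.81}))  # -> "pewit gull"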

Figure 3: An example of classification of several classifiers and a candidate decision.
Figure 4: An example of gold case frame.

3.2 Features for PWPASA (PWFeat)

Feature design is the most important aspect of PWPASA. The classifier does not use information given by dependencies, but it indirectly captures such information by using features that reflect dependency relationships. Several of the features below are also generally used in dependency parsing [Flannery et al.2011]. A small extraction sketch follows the list.

  1. Word n-gram: Uni-grams of words located from -10 to +10 positions around the predicate and around the argument candidate, and bi-grams and tri-grams of words located from -5 to +5 around the predicate and around the argument candidate.

  2. POS n-gram: Uni-grams, bi-grams and tri-grams of POS tags of the surrounding words.

  3. Pairwise word and POS: The pair of the predicate word and the argument candidate word, and pairs of POS tags of surrounding words located from -2 to +2.

  4. Word Distance: The number of words between the predicate and the argument candidate: the raw value, and the value divided by 2, 3, 4, and 5 and rounded. The word distance features take integer values, including negative integers, to distinguish the left and right sides of the predicate.

  5. Predicate Distance: The number of predicates between the predicate and the candidate.

  6. Gold case frame: The gold case frame of the target predicate. This information is similar to the result of word sense disambiguation. An example is shown in Figure 4. If word sense disambiguation worked perfectly, the system would know that the ga and wo labels are essential for the predicate "fulfill" and that these arguments exist somewhere in the document; conversely, the system would know that it does not need to find an argument for the ni role. This feature is introduced as a binary flag for every semantic role.
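As a concrete illustration of the list above, the sketch below extracts a few of the dependency-free features (word uni-grams in a ±10 window, the word distance and its rounded variants, and the predicate distance) for one candidate-predicate pair; the feature-string format and function signature are ours.

def pw_features(words, predicates, a, p):
    """Extract a few PWFeat features for an (argument candidate a, predicate p) pair.

    words      : list of words in the document (intra case: one sentence)
    predicates : set of word indices that are predicates
    a, p       : indices of the argument candidate and the target predicate
    """
    feats = {}
    # 1. Word uni-grams within +-10 of the predicate and of the candidate.
    for name, center in (("p", p), ("a", a)):
        for off in range(-10, 11):
            i = center + off
            if 0 <= i < len(words):
                feats[f"uni_{name}{off:+d}={words[i]}"] = 1.0
    # 4. Word distance: raw value and values divided by 2..5 (sign keeps left/right of the predicate).
    dist = a - p
    feats[f"dist={dist}"] = 1.0
    for d in (2, 3, 4, 5):
        feats[f"dist/{d}={int(dist / d)}"] = 1.0
    # 5. Predicate distance: number of predicates between the candidate and the predicate.
    lo, hi = sorted((a, p))
    feats[f"pred_dist={sum(1 for i in predicates if lo < i < hi)}"] = 1.0
    return feats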

3.3 Features for PWPASA that Depend on Language (Lang)

The other important viewpoint is language dependence. We list the language-dependent features of Japanese PASA below; a small sketch follows the list.

  1. Case Marker Word on the Right Side: The case marker word は (ha), が (ga), を (wo), or に (ni) appearing on the right side of the candidate.

  2. Candidate Position: The candidate position in a document is used as a feature.

  3. Case Marker Word Distance: The number of case marker words between the predicate and the argument candidate.

  4. Pair of Predicate Distance and Case Marker Word Distance: Pairwise features of the predicate distance and the case marker word distance.
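A short sketch of two of these language-dependent features (the case marker word on the right side and the case marker word distance); only the four marker words follow the list above, the rest is illustrative.

CASE_MARKERS = {"は", "が", "を", "に"}  # ha, ga, wo, ni

def lang_features(words, a, p):
    """Language-dependent features for an (argument candidate a, predicate p) pair (Japanese)."""
    feats = {}
    # 1. Case marker word immediately to the right of the candidate, if any.
    if a + 1 < len(words) and words[a + 1] in CASE_MARKERS:
        feats[f"right_marker={words[a + 1]}"] = 1.0
    # 3. Number of case marker words between the candidate and the predicate.
    lo, hi = sorted((a, p))
    n_markers = sum(1 for w in words[lo + 1:hi] if w in CASE_MARKERS)
    feats[f"marker_dist={n_markers}"] = 1.0
    return feats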

Figure 5: An example of depth of dependency.

3.4 Features of Dependency Information (Dep)

This section describes features that refer to dependency information, for comparison with the pointwise analyzer (PWPASA). PASA with dependency introduces two kinds of dependency features generally used for PASA; we selected features that significantly improve SRL accuracy in a previous study [Björkelund et al.2009]. A small extraction sketch follows the list.

  1. Dependency Relation between Predicate and Argument Candidate: The dependency relation between the predicate and the argument candidate. We introduced the direct dependency relation feature for distances of up to 3 steps. Figure 5 shows examples of this feature.

  2. Head and Leaf Words: The head word of the predicate, the head word of the argument candidate, and the leaf words of the predicate and of the argument candidate. For example, in Figure 5, the head of "concept" is "fulfillment", and its leaves are "bet" and "liberal democratic party".
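A sketch of how these two dependency features could be computed, assuming the parse is given as a head-index array (head[i] is the head of element i, -1 for the root); apart from the 3-step limit, the representation and names are ours.

def dep_features(head, a, p, max_steps=3):
    """Dependency features for an (argument candidate a, predicate p) pair.

    head : list where head[i] is the index of the head of element i (-1 for the root)
    """
    feats = {}
    # 1. Direct dependency relation, up to 3 steps, in either direction.
    for src, dst, name in ((a, p, "a_to_p"), (p, a, "p_to_a")):
        node, steps = src, 0
        while node != -1 and steps < max_steps:
            node = head[node]
            steps += 1
            if node == dst:
                feats[f"dep_{name}_{steps}step"] = 1.0
                break
    # 2. Head and leaf elements of the predicate and of the candidate
    #    (indices here; the real features use the corresponding words).
    feats[f"head_of_p={head[p]}"] = 1.0
    feats[f"head_of_a={head[a]}"] = 1.0
    for i, h in enumerate(head):
        if h == p:
            feats[f"leaf_of_p={i}"] = 1.0
        if h == a:
            feats[f"leaf_of_a={i}"] = 1.0
    return feats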

4 Experimental Evaluation

We investigated the effect of dependency information on PASA, including ZAR, by comparing the proposed analyzer, which does not use dependency information, with a more traditional system that uses dependency information. All experimental conditions except for the dependency information are the same. We constructed two systems that do not use dependency information, PWFeat and PWFeat+Lang, and three systems that use dependency information: oracle dependency (PWFeat+Lang+Dep(oracle)), real parsing results (PWFeat+Lang+Dep(parsed)), and simulated parsing results with 20% errors (PWFeat+Lang+Dep(20% errors)). The feature types used are indicated in the names: PWFeat means features for PWPASA, Lang means language-dependent features, and Dep means features based on dependency information. For the parsed setting, we used the phrase-based dependency parser CaboCha [Kudo and Matsumoto2002], retrained on the PASA training set described below. The parsing accuracy was 86.12% (i.e., the parsing results include 13.88% errors).

4.1 Experimental Setting

We used the NAIST Text Corpus (NTC) [Iida et al.2007b], a publicly available corpus (https://sites.google.com/site/naisttextcorpus, accessed 2015/5/30). The documents are annotated with predicate-argument relations and coreferences. The sentences also have lower-layer annotations: word boundaries, POS tags, chunks, and phrase-based dependencies. The annotated predicates include not only verbs but also indeclinable words that indicate events. The NTC contains Japanese newspaper articles and editorials. There are three types of annotation on pairs of a predicate and its argument in Japanese: ga (nominative case), wo (accusative case), and ni (dative case).

Train Test
Documents 1,751 696
Sentences 24,283 9,284
Words 664,898 225,624
Predicates 97,773 38,365
PAS labels:
ga (nom.) Depend 64,152 12,226
ga (nom.) Zero (intra) 62,586 14,373
ga (nom.) Zero (inter) 149,482 49,415
wo (acc.) Depend 51,095 9,837
wo (acc.) Zero (intra) 17,585 3,179
wo (acc.) Zero (inter) 10,786 3,830
ni (dat.) Depend 11,790 2,501
ni (dat.) Zero (intra) 4,063 1,005
ni (dat.) Zero (inter) 6,978 2,048
Table 1: Details of the training and test sets (Train / Test).

We divided the NTC into training and test sets; their specifications are shown in Table 1. The table shows the numbers of documents, sentences, words, predicates, and PAS labels annotated on pairs of a predicate and its argument. The PAS label counts include multiple labels that point to the same coreferential cluster from the same predicate in the same semantic role (in the example of Figure 1, there are two arcs with ga (nominative case) labels from "fulfillment" to "Socialist party" and "party", and "Socialist party" and "party" belong to the same coreferential cluster). As the classifier, we used LIBLINEAR (http://www.csie.ntu.edu.tw/~cjlin/liblinear, accessed 2015/5/30), a library for large linear classification [Fan et al.2008], with L2-loss linear logistic regression (LR). We evaluated system performance using precision (P), recall (R), and their harmonic mean (F-measure; F).
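A minimal sketch of the training and evaluation setup: we use scikit-learn's LogisticRegression with the liblinear solver as a stand-in for LIBLINEAR, feature dictionaries like the ones sketched in Section 3, and the P/R/F definition used in Tables 2-5.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def train_classifier(feature_dicts, labels):
    """Train one pointwise classifier (e.g. ga-intra) with L2-regularized logistic regression."""
    vec = DictVectorizer()
    X = vec.fit_transform(feature_dicts)
    clf = LogisticRegression(solver="liblinear", penalty="l2")
    clf.fit(X, labels)
    return vec, clf

def prf(num_correct, num_predicted, num_gold):
    """Precision, recall, and F-measure as reported in Tables 2-5."""
    p = num_correct / num_predicted if num_predicted else 0.0
    r = num_correct / num_gold if num_gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Example: the ALL-Depend row of Table 3.
print(prf(20460, 21766, 29029))  # ~ (0.9400, 0.7048, 0.8056)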

Method Category P R F
PWFeat Depend 93.01 69.20 79.36
Zero (intra) 51.05 23.96 32.61
Zero (inter) 23.55 5.00 8.25
ALL 82.57 50.25 62.48
PWFeat+Lang Depend 94.00 70.48 80.56
Zero (intra) 52.97 24.24 33.26
Zero (inter) 25.00 5.83 9.46
ALL 83.64 51.25 63.56
PWFeat+Lang Depend 92.64 76.93 84.06
+Dep(oracle) Zero (intra) 58.72 19.11 28.84
Zero (inter) 23.10 5.65 9.08
ALL 85.09 54.26 66.26
PWFeat+Lang Depend 92.82 75.06 83.00
+Dep(parsed) Zero (intra) 57.45 21.72 31.52
Zero (inter) 23.79 5.83 9.37
ALL 84.47 53.64 65.61
PWFeat+Lang Depend 92.65 72.22 81.17
+Dep(20% errors) Zero (intra) 54.57 18.93 28.11
Zero (inter) 23.88 5.31 8.69
ALL 84.34 51.14 63.67
Table 2: Overall results of each method.
Role type Category P R F
ga (nom.) Depend 90.16 60.22 72.21
(9516/10554) (9516/15803)
Zero (intra) 52.75 24.75 33.69
(1957/3710) (1957/7906)
Zero (inter) 24.76 6.56 10.37
(354/1430) (354/5396)
ALL 75.36 40.64 52.80
(11827/15694) (11827/29105)
wo (acc.) Depend 97.65 82.70 89.56
(8751/8962) (8751/10581)
Zero (intra) 59.48 22.68 32.84
(320/538) (320/1411)
Zero (inter) 36.17 2.02 3.82
(17/47) (17/843)
ALL 95.19 70.81 81.21
(9088/9547) (9088/12835)
ni (dat.) Depend 97.47 82.91 89.60
(2193/2250) (2193/2645)
Zero (intra) 40.19 19.76 26.50
(84/209) (84/425)
Zero (inter) 20.51 3.10 5.39
(8/39) (8/258)
ALL 91.47 68.66 78.44
(2285/2498) (2285/3328)
ALL Depend 94.00 70.48 80.56
(20460/21766) (20460/29029)
Zero (intra) 52.97 24.24 33.26
(2361/4457) (2361/9742)
Zero (inter) 25.00 5.83 9.46
(379/1516) (379/6497)
ALL 83.64 51.25 63.56
(23200/27739) (23200/45268)
Table 3: PWFeat+Lang: Results of the analyzer that does not use the dependency information.
Role type Category P R F
ga (nom.) Depend 88.67 70.20 78.36
(11093/12511) (11093/15803)
Zero (intra) 58.62 19.10 28.81
(1510/2576) (1510/7906)
Zero (inter) 22.99 6.36 9.96
(343/1492) (343/5396)
ALL 78.09 44.48 56.68
(12946/16579) (12946/29105)
wo (acc.) Depend 97.01 85.00 90.61
(8994/9271) (8994/10581)
Zero (intra) 65.33 19.63 30.19
(277/424) (277/1411)
Zero (inter) 29.63 1.90 3.57
(16/54) (16/843)
ALL 95.26 72.36 82.25
(9287/9749) (9287/12835)
ni (dat.) Depend 96.64 84.88 90.38
(2245/2323) (2245/2645)
Zero (intra) 43.86 17.65 25.17
(75/171) (75/425)
Zero (inter) 18.60 3.10 5.32
(8/43) (8/258)
ALL 91.76 69.95 79.38
(2328/2537) (2328/3328)
ALL Depend 92.64 76.93 84.06
(22332/24105) (22332/29029)
Zero (intra) 58.72 19.11 28.84
(1862/3171) (1862/9742)
Zero (inter) 23.10 5.65 9.08
(367/1589) (367/6497)
ALL 85.09 54.26 66.26
(24561/28865) (24561/45268)
Table 4: PWFeat+Lang+Dep(oracle): Results of the analyzer that uses the gold dependency information.
Role type Category P R F
ga (nom.) Depend 88.89 67.45 76.70
(10659/11991) (10659/15803)
Zero (intra) 57.57 21.88 31.71
(1730/3005) (1730/7906)
Zero (inter) 23.70 6.60 10.32
(356/1502) (356/5396)
ALL 77.25 43.79 55.90
(12745/16498) (12745/29105)
wo (acc.) Depend 97.05 84.22 90.18
(8911/9182) (8911/10581)
Zero (intra) 63.49 21.69 32.33
(306/482) (306/1411)
Zero (inter) 31.37 1.90 3.58
(16/51) (16/843)
ALL 95.04 71.94 81.89
(9233/9715) (9233/12835)
ni (dat.) Depend 96.44 83.89 89.73
(2219/2301) (2219/2645)
Zero (intra) 40.82 18.82 25.76
(80/196) (80/425)
Zero (inter) 17.50 2.71 4.70
(7/40) (7/258)
ALL 90.89 69.29 78.64
(2306/2537) (2306/3328)
ALL Depend 92.82 75.06 83.00
(21789/23474) (21789/29029)
Zero (intra) 57.45 21.72 31.52
(2116/3683) (2116/9742)
Zero (inter) 23.79 5.83 9.37
(379/1593) (379/6497)
ALL 84.47 53.64 65.61
(24284/28750) (24284/45268)
Table 5: PWFeat+Lang+Dep(parsed): Results of the analyzer that uses the result of the parser as the dependency information (features are the same as PWFeat+Lang+Dep(oracle)).

4.2 Effect of Dependency Information

Table 2 summarizes the results in several settings. PWFeat+Lang is the result of the proposed PASA analyzer that does not use dependency information. Dependency features (Dep) are added in three different settings: oracle (gold dependencies), parsed (real parsing results including errors), and 20% errors (simulated parsing results including 20% errors). Table 3 shows the detailed results of the PWPASA analyzer that does not use dependency information (PWFeat+Lang), and Table 4 and Table 5 show the results of the analyzers that additionally use dependency information (PWFeat+Lang+Dep(oracle) and PWFeat+Lang+Dep(parsed)). The role type is the type of PAS label (ga, wo, ni in Japanese), and the category is the relation type between the argument and the predicate. "Depend" means that an element of the coreferential cluster of the argument has a dependency relation to the predicate. "Zero (intra)" means that one or more elements of the coreferential cluster exist in the same sentence as the predicate, but none of them depend on the predicate. "Zero (inter)" means that no element of the cluster exists in the same sentence as the predicate.

In total accuracy (ALL-ALL), the F-measure of PWFeat+Lang was 63.56, 2.70 points lower than that of PWFeat+Lang+Dep(oracle). The difference is statistically significant (p < 0.01); however, the latter requires oracle dependency information. When the dependencies are real parsing results including errors (PWFeat+Lang+Dep(parsed)), the F-measure difference from PWFeat+Lang is smaller, 2.05 points. This indicates that dependency features are still effective even when some features used in dependency parsing are incorporated only indirectly, but the accuracy of PWFeat+Lang approaches that of PWFeat+Lang+Dep(oracle). These results also indicate that parsing errors do not fatally harm PASA accuracy if the parser is trained on the same domain. On the other hand, the dependency parser requires annotations of 24,283 sentences for the 2.05-point improvement. In our other experiment, the annotation speed for dependencies was 22.40 sentences (consisting of 24.76 words each) per hour (74,865 [sent.] / 135 [hour]), in other words, 509.76 words per hour. Preparing dependency information is thus an indirect method of domain adaptation for PASA, and its data preparation cost is very high. Dependency annotation is very difficult for untrained annotators and requires a great deal of the developers' time and energy.

If the parser is not adapted to the target domain, the parsing error rate can reach 20% [Flannery et al.2011]. We simulated parsing errors in the test set by randomly replacing a predefined proportion of the edges of the gold dependency data, and used the result as the test set (PWFeat+Lang+Dep(20% errors)). The total accuracy (ALL-ALL) of the PWFeat+Lang model was comparable to the result on the 20% noisy test set. This shows that the output of a dependency parser with this low level of accuracy, i.e., one not adapted to the target domain, is not helpful for PASA.
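A sketch of how such simulated parsing errors could be produced; the paper does not specify the exact replacement scheme, so the uniform random re-attachment below is an assumption.

import random

def corrupt_dependencies(head, error_rate=0.20, seed=0):
    """Randomly replace a proportion of gold dependency edges with wrong heads.

    head : list where head[i] is the gold head index of element i (-1 for the root)
    """
    rng = random.Random(seed)
    noisy = list(head)
    n = len(head)
    for i in rng.sample(range(n), k=int(round(n * error_rate))):
        choices = [j for j in range(n) if j != i and j != head[i]]
        if choices:
            noisy[i] = rng.choice(choices)  # attach element i to a wrong head
    return noisy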

Looking at the detailed categories, we compared the analyzer that does not use dependency information (PWFeat+Lang) with the analyzer that uses real parsing results (PWFeat+Lang+Dep(parsed)). The F-measure of the nominative cases that directly depend on predicates (ga (nom.)-Depend) decreased the most due to the absence of dependency information (from 76.70 points to 72.21 points). On the other hand, for the accusative and dative cases that depend on predicates (wo (acc.)-Depend and ni (dat.)-Depend), PWFeat+Lang performs comparably to the analyzer that uses dependency information. These results show that dependency information helps to predict nominative arguments, which appear at more distant positions from their predicates than accusative arguments do, especially in SOV languages.

In the zero anaphora categories (Zero (intra) and Zero (inter); arguments that do not depend on their predicates), the dependency information had a negative influence in all categories. In the Zero (intra) category (see ALL-Zero (intra) in Table 3, Table 4, and Table 5), the F-measure of the PWFeat+Lang model was 1.74 points (33.26 - 31.52) better than that of the PWFeat+Lang+Dep(parsed) model, and 4.42 points (33.26 - 28.84) better than that of PWFeat+Lang+Dep(oracle), with statistical significance (p < 0.01). In the Zero (inter) category (see ALL-Zero (inter) in Table 3, Table 4, and Table 5), the difference was not significant, but the F-measure of the PWFeat+Lang model was also better than those of PWFeat+Lang+Dep(oracle) and PWFeat+Lang+Dep(parsed). This is because the dependency information gives a strong prior toward selecting an argument candidate that depends on the predicate, even if that candidate is not an argument.

4.3 Effect of language specific features

We also prepared an analyzer that does not use the language-specific features of PWPASA (the features described in Section 3.3) to verify the generality of the constructed PWFeat+Lang. Comparing PWFeat and PWFeat+Lang in Table 2, the total accuracy (ALL-ALL) decreased by 1.08 points without the language-specific features. This shows that the PWFeat+Lang model, which covers dependency information with indirect features, has a certain level of generality.

5 Related Studies

In the CoNLL shared task 2004 [Carreras and Màrquez2004, Hacioglu et al.2004], some systems tackled SRL with only shallow syntactic structures: chunks and clauses. In the CoNLL shared task 2005 [Carreras and Màrquez2005, Koomen et al.2005], deeper syntactic structures (dependencies) were provided, and the best system outperformed the best system of the CoNLL shared task 2004. However, the effects of parsing errors were not deeply analyzed, and these tasks only target labeling semantic roles between elements within the same sentence.

Applications that use PAS demand higher PASA accuracy in various domains [Imamura et al.2014]; however, it is difficult to improve PASA accuracy across a variety of domains because of the lack of annotated corpora in each domain. Constructing new annotated corpora in new domains is very costly, so some previous studies tried to work around this problem. Semi-supervised and unsupervised approaches [Fürstenau and Lapata2009, Titov and Klementiev2012, Lorenzo and Cerisara2014] tried to improve labeling accuracy with unlabeled data, but they did not obtain much improvement from the unlabeled data. [Ribeyre et al.2015] also investigated the effect of syntactic information for English SRL and reported a 2.80-point improvement in accuracy. Our work focuses not only on SRL but also on ZAR.

Having training data representative of the domain is essential for constructing a robust PAS analyzer [Pradhan et al.2008]. However, data annotation for higher-layer NLP tasks such as PASA requires not only the annotations for the task itself but also those of lower-layer NLP tasks, such as dependencies. This property makes it difficult to apply current supervised approaches to a new domain. Our work tackles this problem by avoiding dependency structure analysis.

For Japanese PASA and ZAR, [Imamura et al.2009] addressed PASA and ZAR with a discriminative model for verbal predicates only. The total accuracy of our work is lower than theirs, but our system addresses both PASA and ZAR not only for verbal predicates but also for indeclinable predicates that indicate events. Our system also worked better for ZAR of the accusative and dative cases. [Sasano and Kurohashi2011] proposed a discriminative ZAR model. Our system worked better than theirs for the accusative case (wo) and the dative case (ni), but their system worked better for the nominative case (ga).

Figure 6: Overall conclusion of this work.

6 Conclusion

This paper clarified the effect of dependency information on PASA by comparing an analyzer that does not use dependency information, an analyzer that uses oracle dependency information, and an analyzer that uses real parsing results including parsing errors. The results indicate that dependency information improves PASA. However, analyzers that use dependency information require dependency annotation and parser construction, whose cost is very high and disproportionate to the improvement in accuracy, as shown in Figure 6. The experimental results showed that the indirect features, which compensate for the absence of dependency features, work well enough. Considering the cost of preparing dependency information, the accuracy of the dependency-free analyzer is reasonable for realistic use. We plan to design a framework for rapid data preparation and to adapt the system to a variety of domains in the future.

References

  • [Björkelund et al.2009] Anders Björkelund, Love Hafdell, and Pierre Nugues. 2009. Multilingual semantic role labeling. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning: Shared Task, pages 43–48.
  • [Carreras and Màrquez2004] Xavier Carreras and Lluís Màrquez. 2004. Introduction to the conll-2004 shared task: Semantic role labeling. In Proceedings of the CoNLL-2004 Shared Task.
  • [Carreras and Màrquez2005] Xavier Carreras and Lluís Màrquez. 2005. Introduction to the conll-2005 shared task: Semantic role labeling. In Proceedings of the 9th Conference on Computational Natural Language Learning, pages 152–164.
  • [Fan et al.2008] Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9(4):1871–1874.
  • [Flannery et al.2011] Daniel Flannery, Yusuke Miyao, Graham Neubig, and Shinsuke Mori. 2011. Training dependency parsers from partially annotated corpora. In Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 776–784.
  • [Fürstenau and Lapata2009] Hagen Fürstenau and Mirella Lapata. 2009. Semi-supervised semantic role labeling. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pages 220–228.
  • [Grosz et al.1995] Barbara J. Grosz, Scott Weinstein, and Aravind K. Joshi. 1995. Centering a framework for modeling the local coherence of discourse. Computational Linguistics, 21(2):203–225.
  • [Hacioglu et al.2004] Kadri Hacioglu, Sameer Pradhan, Wayne Ward, James H Martin, and Daniel Jurafsky. 2004. Semantic role labeling by tagging syntactic chunks. In Proceedings of the CoNLL-2004 Shared Task.
  • [Hajič et al.2009] Jan Hajič et al. 2009. The CoNLL-2009 shared task: syntactic and semantic dependencies in multiple languages. In Proceedings of CoNLL: Shared Task, CoNLL ’09, pages 1–18.
  • [Iida and Poesio2011] Ryu Iida and Massimo Poesio. 2011. A cross-lingual ilp solution to zero anaphora resolution. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 804–813.
  • [Iida. et al.2003] Ryu Iida., Kentaro Inui, Hiroya Takamura, and Yuji Matsumoto. 2003. Incorporating contextual cues in trainable models for coreference resolution. In Proceedings of 10th Conference of the European Chapter of the Association for Computational Linguistics Workshop on the Computational Treatment of Anaphora, pages 23–30.
  • [Iida et al.2007a] Ryu Iida, Kentaro Inui, and Yuji Matsumoto. 2007a. Zero-anaphora resolution by learning rich syntactic pattern features. ACM Transactions on Asian Language Information Processing (TALIP), 6(4):12:1–12:22.
  • [Iida et al.2007b] Ryu Iida, Mamoru Komachi, Kentaro Inui, and Yuji Matsumoto. 2007b. Annotating a Japanese text corpus with predicate-argument and coreference relations. In Proceedings of the Linguistic Annotation Workshop, pages 132–139.
  • [Imamura et al.2009] Kenji Imamura, Kuniko Saito, and Tomoko Izumi. 2009. Discriminative approach to predicate-argument structure analysis with zero-anaphora resolution. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 85–88.
  • [Imamura et al.2014] Kenji Imamura, Ryuichiro Higashinaka, and Tomoko Izumi. 2014. Predicate argument structure analysis with zero-anaphora resolution for dialogue systems. In Proceedings of the 25th International Conference on Computational Linguistics, pages 806–815.
  • [Koomen et al.2005] Peter Koomen, Vasin Punyakanok, Dan Roth, and Wen-tau Yih. 2005. Generalized inference with multiple semantic role labeling systems. In Proceedings of the 9th Conference on Computational Natural Language Learning, pages 181–184.
  • [Kudo and Matsumoto2002] Taku Kudo and Yuji Matsumoto. 2002. Japanese dependency analysis using cascaded chunking. In Proceedings of the 6th Conference on Natural Language Learning - Volume 20, pages 1–7.
  • [Lorenzo and Cerisara2014] Alejandra Lorenzo and Christophe Cerisara. 2014. Semi-supervised SRL system with Bayesian inference. In Computational Linguistics and Intelligent Text Processing, pages 429–441. Springer.
  • [Matsubayashi et al.2012] Yuichiroh Matsubayashi, Yusuke Miyao, and Akiko Aizawa. 2012. Building japanese predicate-argument structure corpus using lexical conceptual structure. In Proceedings of The eighth international conference on Language Resources and Evaluation, pages 1554–1558.
  • [Mori and Neubig2011] Shinsuke Mori and Graham Neubig. 2011. A pointwise approach to pronunciation estimation for a TTS front-end. In Proceedings of INTERSPEECH, pages 2181–2184, Florence, Italy.
  • [Neubig et al.2011] Graham Neubig, Yosuke Nakata, and Shinsuke Mori. 2011. Pointwise prediction for robust, adaptable Japanese morphological analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 529–533.
  • [Pradhan et al.2008] Sameer S. Pradhan, Wayne Ward, and James H. Martin. 2008. Towards robust semantic role labeling. Computational Linguistics, 34(2):289–310, jun.
  • [Ribeyre et al.2015] Corentin Ribeyre, Eric Villemonte de la Clergerie, and Djamé Seddah. 2015. Because syntax does matter: Improving predicate-argument structures parsing with syntactic features. In Proceedings of the Ninth Conference on Computational Natural Language Learning, pages 64–74.
  • [Sasano and Kurohashi2009] Ryohei Sasano and Sadao Kurohashi. 2009. A probabilistic model for associative anaphora resolution. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1455–1464.
  • [Sasano and Kurohashi2011] Ryohei Sasano and Sadao Kurohashi. 2011. A discriminative approach to Japanese zero anaphora resolution with large-scale case frames. Journal of Information Processing (in Japanese), 52(12):3328–3337, dec.
  • [Shen and Lapata2007] Dan Shen and Mirella Lapata. 2007. Using semantic roles to improve question answering. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 12–21.
  • [Titov and Klementiev2012] Ivan Titov and Alexandre Klementiev. 2012. Semi-supervised semantic role labeling: Approaching from an unsupervised perspective. In Proceedings of the 24th International Conference on Computational Linguistics, pages 2635–2652.
  • [Watanabe et al.2010] Yotaro Watanabe, Masayuki Asahara, and Yuji Matsumoto. 2010. A structured model for joint learning of argument roles and predicate senses. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 98–102.
  • [Yoshino et al.2011] Koichiro Yoshino, Shinsuke Mori, and Tatsuya Kawahara. 2011. Spoken dialogue system based on information extraction using similarity of predicate argument structures. In Proceedings of the 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 59–66.
  • [Yoshino et al.2013] Koichiro Yoshino, Shinsuke Mori, and Tatsuya Kawahara. 2013. Predicate argument structure analysis using partially annotated corpora. In Proceedings of the 6th International Joint Conference on Natural Language Processing, pages 957–961.
  • [Zhai et al.2013] Feifei Zhai, Jiajun Zhang, Yu Zhou, and Chengqing Zong. 2013. Handling ambiguities of bilingual predicate-argument structures for statistical machine translation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 1127–1136.
  • [Zhou and Xu2015] Jie Zhou and Wei Xu. 2015. End-to-end learning of semantic role labeling using recurrent neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015), pages 1127–1137.