Improving cross-lingual model transfer by chunking

02/27/2020 ∙ by Ayan Das, et al. ∙ IIT Kharagpur

We present a shallow-parser-guided approach to cross-lingual model transfer that addresses the syntactic differences between source and target languages more effectively. We treat the chunks (phrases) of a sentence as the transfer units, which lets us handle separately the two sources of syntactic divergence between the languages: differences in the ordering of words within a phrase and differences in the ordering of phrases within a sentence.

1 Introduction

Model transfer approaches for cross-lingual dependency parsing involve training a parser model using a treebank of a language (source language) and using it to parse sentences of another language (target language). This technique may be used to develop parsers for languages for which no treebank is available.

The performance of cross-lingual parser models often suffers due to syntactic differences between the source and target languages Zeman and Resnik (2008); Søgaard (2011); Naseem et al. (2012). A major challenge in transfer parsing is therefore to bridge this gap. For example, adjectives appear before the corresponding nouns in English and Hindi, while in Spanish and Arabic they appear after the nouns. Several approaches have been proposed to address such differences. These include training a parser model on a selected subset of source language parse trees that are syntactically close to the target language Søgaard (2011); Wang and Eisner (2016), transforming the source language treebank to match the syntax of the target language Aufrant et al. (2016); Das and Sarkar (2019); Wang and Eisner (2018), target-language-independent perturbation Das and Sarkar (2019), training a word-order-insensitive parser model Ahmad et al. (2019), and imposing target-language syntactic constraints while running MST on the edge-score matrix of a graph-based parser to obtain the target language parse tree Meng et al. (2019).

The syntax of a language may be divided into two levels: the syntax of the words within chunks or phrases (intra-chunk syntax) and the ordering of the chunks in a sentence (inter-chunk syntax). Consider the following English sentence.
EN: (The US) (lost) (yet another helicopter) (to hostile fire)
The word groups enclosed in brackets indicate separate chunks or phrases; US, helicopter and fire are the head words of the chunks the US, yet another helicopter and to hostile fire respectively. In this example, the intra-chunk syntax corresponds to the relative ordering of the determiners, adpositions, adjectival modifiers, auxiliaries etc. with respect to the head words of a phrase, whereas the inter-chunk syntax corresponds to the relative ordering of the chunks in the sentence.

Given a source-target language pair, the syntactic differences may lie in the ordering of the words within a phrase, in the ordering of the phrases in a sentence, or both. For example, adpositions appear before the corresponding nouns in English but after them in Hindi; such differences are local to the phrases. Languages also differ in the ordering of phrases in a sentence: English, French and Spanish follow SVO ordering, Japanese, Urdu and Hindi typically follow SOV ordering, while Arabic and Irish are predominantly VSO.

Consider the following English sentence and its Hindi translation.
EN: “(He) (teaches) (the children)”
HI: “(va) (bachchOM kO) (paDhAtA hEi)”
EN-gloss: “(He) (children to) (teaches is)”

Here, the English phrase “the children” maps to the Hindi phrase bachchOM kO (children to) and the English phrase “teaches” maps to the Hindi phrase “paDhAtA hEi” (teaches is). The phrases exhibit the following differences. The definite article is absent in Hindi. In Hindi, the postposition kO is associated with the word bachchOM (children), while no adposition is associated with the word children in the corresponding English phrase. In the Hindi verb phrase, paDhAtA (teaches) is followed by the copula verb hEi (is). Furthermore, the English sentence follows an SVO ordering of phrases while the Hindi sentence is verb-final.

Both the intra-phrase and the inter-phrase differences affect the performance of transfer parsers. Thus, to simplify the transfer process, we address the two kinds of differences separately. We propose a chunk-information-guided cross-lingual model transfer for dependency parsing in which the chunks, rather than the words, are the transfer units. To this end, we train a source language parser model over chunks. Given a target language sentence, the source language parser model is used to parse the sequence of target language chunks, and the chunks are then expanded to obtain the complete tree.

We propose to use chunk information in transfer parsing because a chunker (shallow parser) can be trained with much less data than a full syntactic parser, and annotating data for a chunker is much simpler than annotating parse trees. Chunkers may also be rule-based, in which case no annotated data is required at all.

2 Related work

Chunking (shallow parsing) has been used successfully to develop good quality parsers for Hindi Bharati et al. (2009); Chatterji et al. (2012). Bharati et al. (2009) proposed a two-stage constraint-based approach in which they first extract the intra-chunk dependencies and then resolve the inter-chunk dependencies in the second stage. They also showed the effect of hard and soft constraints in building an efficient Hindi parser that outperforms data-driven parsers.

Ambati et al. (2010) divided the dependency relations into disjoint sets and performed intra-chunk parsing and inter-chunk parsing separately. Chatterji et al. (2012) proposed a three-stage approach in which a data-driven inter-chunk parsing stage is followed by rule-based intra-chunk parsing.

A project for building multi-representational and multi-layered treebanks for Hindi and Urdu Bhatt et al. (2009) (http://verbs.colorado.edu/hindiurdu/index.html) was carried out as a joint effort by IIIT Hyderabad, the University of Colorado and the University of Washington. Besides the syntactic version of the treebank developed at IIIT Hyderabad Ambati et al. (2011), the University of Colorado has built the Hindi-Urdu proposition bank Vaidya et al. (2014), and a phrase-structure form of the treebank Bhatt and Xia (2012) is being developed at the University of Washington. A part of the Hindi dependency treebank (http://ltrc.iiit.ac.in/treebank_H2014/) has been released in which the inter-chunk dependency relations (dependency links between chunk heads) have been manually tagged and the chunks were expanded automatically using an arc-eager algorithm. Some of the major works on parsing Bengali appeared in ICON 2009 (http://www.icon2009.in/).

Ghosh et al. (2009) used a CRF-based hybrid method, Chatterji et al. (2009) used variations of transition-based dependency parsing, Mannem et al. (2009) proposed a bi-directional incremental parsing and perceptron learning approach, and De et al. (2009) used a constraint-based method. Garain et al. (2012) compared the performance of a grammar-driven parser and a modified MALT parser.

3 Chunking

Chunking involves identifying the different phrases in a sentence and identifying the chunk head, i.e. the main word, of each chunk. A chunker may be rule-based or data-driven. In a rule-based chunker, a set of pre-defined rules is used to identify the chunks and the corresponding heads. In a data-driven chunker, chunking is usually posed as a sequence labelling task and a machine-learning algorithm is trained for chunk identification. Chunk-head identification, on the other hand, is usually rule-based.

Figure 1: Chunk-level model transfer for dependency parsing

3.1 Chunk identification

In this work, we address chunk identification as a sequence labelling task in which we label each word using the BI scheme. For the earlier example, the labelled chunk sequence is as follows:
EN: (The US) (lost) (yet another helicopter) (to hostile fire)
Labels: B-NP I-NP B-VP B-NP I-NP I-NP B-NP I-NP I-NP
where B-* indicates the beginning of a chunk and I-* indicates a word inside the chunk begun by the immediately preceding B-*. In this work we identify the following chunk types: noun phrase (NP), verb phrase (VP), adjectival phrase (JJP), adverbial phrase (RBP) and coordinating conjunctive phrase (CCP); the remaining words are labelled BLK.
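
For illustration, the BI scheme above can be decoded into chunk spans mechanically. The short Python sketch below is ours, not part of the original system; it assumes only the B-*/I-* tag convention just described.

```python
def bi_to_chunks(labels):
    """Decode a B-*/I-* label sequence into (chunk_type, start, end) spans.

    `end` is exclusive.  An I-* label continues the chunk opened by the
    most recent B-* label, as described above.
    """
    chunks = []
    for i, label in enumerate(labels):
        if label.startswith("B-"):
            chunks.append([label[2:], i, i + 1])
        elif label.startswith("I-") and chunks:
            chunks[-1][2] = i + 1
    return [tuple(c) for c in chunks]

# The example "(The US) (lost) (yet another helicopter) (to hostile fire)":
labels = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP", "I-NP",
          "B-NP", "I-NP", "I-NP"]
print(bi_to_chunks(labels))
# [('NP', 0, 2), ('VP', 2, 3), ('NP', 3, 6), ('NP', 6, 9)]
```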

The chunk type is determined by the PoS tag of the chunk-head word, e.g. if the chunk-head word is a noun, pronoun or proper noun, the chunk is assigned the type NP. Note that the first word of a chunk is not necessarily its head. In Table 1 we present the chunk annotation of an example sentence; the last column indicates whether each word is the head of its chunk or a child of the head.

Word PoS Chunk type Head type
The DET B-NP child
white ADJ I-NP child
cat NOUN I-NP head
ate VERB B-VP head
a DET B-NP child
little ADJ I-NP child
mouse NOUN I-NP head
Table 1: Chunking example.
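
The head-PoS-to-chunk-type mapping used above can be sketched as follows. Only the NP rule is stated explicitly in the text, so the remaining entries in this sketch are our assumptions, inferred from the chunk-type names.

```python
# Only the NP rule is given explicitly in the text; the other mappings are
# assumptions inferred from the chunk-type names in Section 3.1.
CHUNK_TYPE_BY_HEAD_POS = {
    "NOUN": "NP", "PROPN": "NP", "PRON": "NP",   # stated in the text
    "VERB": "VP", "ADJ": "JJP", "ADV": "RBP", "CCONJ": "CCP",
}

def chunk_type(head_pos):
    """Map the PoS tag of a chunk-head word to a chunk type; default BLK."""
    return CHUNK_TYPE_BY_HEAD_POS.get(head_pos, "BLK")

print(chunk_type("NOUN"))   # NP
print(chunk_type("PUNCT"))  # BLK
```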

3.1.1 Chunker model

We used a BiLSTM-CRF neural model for the chunker. A 2-layer bi-directional LSTM takes as input the embeddings of the PoS tags of the words in a sentence and encodes them in its internal states. We used the hidden states of the final layers of the forward and backward LSTMs as the distributed representations of the corresponding words. These word representations were used as input to a CRF for chunk-label prediction.
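
A minimal PyTorch sketch of such a chunker is shown below. It uses the third-party pytorch-crf package for the CRF layer; the embedding and hidden dimensions are illustrative, and none of this is the authors' released code.

```python
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class PosBiLstmCrfChunker(nn.Module):
    """BiLSTM-CRF chunk labeller over PoS-tag embeddings (a sketch)."""

    def __init__(self, n_pos_tags, n_chunk_labels, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_pos_tags, emb_dim)
        # 2-layer bi-directional LSTM over the PoS embedding sequence.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, num_layers=2,
                            bidirectional=True, batch_first=True)
        # Concatenated forward/backward states -> chunk-label emission scores.
        self.proj = nn.Linear(2 * hidden_dim, n_chunk_labels)
        self.crf = CRF(n_chunk_labels, batch_first=True)

    def emissions(self, pos_ids):
        out, _ = self.lstm(self.embed(pos_ids))
        return self.proj(out)

    def loss(self, pos_ids, labels, mask):
        # Negative log-likelihood of the gold labels under the CRF;
        # `mask` is a bool tensor marking real (non-padding) positions.
        return -self.crf(self.emissions(pos_ids), labels, mask=mask)

    def predict(self, pos_ids, mask):
        # Viterbi decoding of the best chunk-label sequence per sentence.
        return self.crf.decode(self.emissions(pos_ids), mask=mask)
```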

3.2 Chunk head identification

We used a rule-based approach to predict the chunk head of a given chunk. Based on the category of the chunk, we designed a set of rules for predicting the most probable head. The rule set varies slightly across languages.
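
The paper does not enumerate these rules, so the sketch below only illustrates their flavour: the PoS candidate sets and the head-final/head-initial switch are hypothetical choices, not the authors' actual rule set.

```python
# Hypothetical head-candidate PoS sets per chunk type (illustrative only).
HEAD_POS_BY_CHUNK = {
    "NP": {"NOUN", "PROPN", "PRON"},
    "VP": {"VERB"},
    "JJP": {"ADJ"},
    "RBP": {"ADV"},
    "CCP": {"CCONJ"},
}

def find_chunk_head(pos_tags, chunk_type, head_final=True):
    """Return the index of the most probable head word of a chunk.

    Scans from the right for head-final configurations (from the left
    otherwise) for a PoS tag matching the chunk type, falling back to
    the last (or first) word when nothing matches.
    """
    wanted = HEAD_POS_BY_CHUNK.get(chunk_type, set())
    n = len(pos_tags)
    order = range(n - 1, -1, -1) if head_final else range(n)
    for i in order:
        if pos_tags[i] in wanted:
            return i
    return n - 1 if head_final else 0

# "The white cat" is an NP whose head is "cat" (index 2):
print(find_chunk_head(["DET", "ADJ", "NOUN"], "NP"))  # 2
```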

4 Chunking based cross-lingual model transfer

In this section, we present our approach to shallow-parser-guided cross-lingual transfer parsing, in which the transfer is carried out at the chunk level instead of the word level. In Figure 1 we show a schematic diagram of the steps of our chunk-level model transfer.

This method requires a source language parser model trained with chunks as units, and a shallow parser for the target language.

Training a chunk-level source language parser model involves deriving chunk-level parse trees and training the parser model on them. For the source language, the chunk-level parse trees are derived from the chunk annotation of the training data: the subtree corresponding to each chunk is collapsed and replaced by a chunk representation. The resulting chunk-level parse trees are used to train the parser model.

Given a target language sentence, the shallow parser is used to identify the chunks in the sentence. For the target language, the chunk representations are obtained by simply replacing the words of each chunk by a single representation. The resulting sequence of chunk representations is parsed using the chunk-level source language parser model. Finally, the target language chunks are expanded to obtain the full target language parse tree.

We elaborate the steps for training the chunk-level parser model and for parsing target language sentences with it in Sections 4.1 and 4.2 below.

4.1 Training a chunk-level parser model

The steps for training a chunk-level transfer parser are as follows.

4.1.1 Obtaining the chunk-level source language treebank

The chunk-level source language parse trees are derived from the parse trees in the source language treebank by collapsing the chunks and replacing them by their representations. Here we represent a chunk by its chunk head. For example, in the English sentence “The white cat ate a little mouse” of Table 1, the chunks The white cat, ate and a little mouse are collapsed and represented by their chunk heads cat, ate and mouse respectively. The relations involving intra-chunk words are also removed from the parse tree. The final tree consists of the chunk representations, and the relations among them correspond to the relations among the chunk heads, as shown in Figure 1.
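
The collapsing step can be sketched as follows. This is our simplified illustration, assuming CoNLL-U-style token dictionaries and the intra-chunk relation set of Section 5.1.1 (ignoring the PoS-conditioned amod/advmod cases for brevity).

```python
INTRA_CHUNK_RELS = {"aux", "appos", "nummod", "det", "case", "fixed",
                    "flat", "compound", "amod", "advmod", "goeswith"}

def collapse_to_chunk_tree(tokens):
    """Collapse a word-level dependency tree into a chunk-level tree.

    `tokens` is a list of dicts with 1-based "id", "head" (0 = root),
    "form" and "deprel" keys.  Tokens attached by an intra-chunk relation
    are absorbed into their parent's chunk; the surviving chunk heads keep
    their original inter-chunk attachments.  (A full implementation would
    renumber the surviving head indices; we keep the originals here.)
    """
    return [t for t in tokens if t["deprel"] not in INTRA_CHUNK_RELS]

# "The white cat ate a little mouse" collapses to "cat ate mouse":
sent = [
    {"id": 1, "form": "The",    "head": 3, "deprel": "det"},
    {"id": 2, "form": "white",  "head": 3, "deprel": "amod"},
    {"id": 3, "form": "cat",    "head": 4, "deprel": "nsubj"},
    {"id": 4, "form": "ate",    "head": 0, "deprel": "root"},
    {"id": 5, "form": "a",      "head": 7, "deprel": "det"},
    {"id": 6, "form": "little", "head": 7, "deprel": "amod"},
    {"id": 7, "form": "mouse",  "head": 4, "deprel": "obj"},
]
print([t["form"] for t in collapse_to_chunk_tree(sent)])  # ['cat', 'ate', 'mouse']
```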

4.1.2 Training the chunk-level parser model

The chunk-level parse trees derived from the source language treebank in the above step are then used to train the parser model.

4.2 Chunk-level parsing followed by chunk expansion

The steps for generating the parse tree for a given target language sentence are as follows.

4.2.1 Chunking a target language sentence

A target language chunker is used to identify the chunks in a target language sentence, and the head of each chunk is identified using the rule-based technique discussed in Section 3. Assuming French to be the target language, consider the following example sentence.
FR: “Le chat blanc a mangé une petite souris”
The chunker identifies the chunks as follows: “(Le chat blanc) (a mangé) (une petite souris)”.

The rule-based chunk head identifier is then used to identify the chunk heads. The heads of the chunks Le chat blanc, a mangé and une petite souris are chat, mangé and souris respectively.

4.2.2 Parsing a target language chunk sequence

The target language chunk-head sequence obtained above is parsed using the parser model trained on the source language chunk-head parse trees, yielding the chunk-head parse tree shown in Figure 1.

4.2.3 Chunk expansion

The chunk-head parse tree so obtained is then expanded to obtain the parse tree of the target language sentence. To this end, we expand each chunk in the chunk-head parse tree by attaching each non-head word of the chunk to its chunk head via a modifier-head relation, without any change in the relative ordering of words in the sentence. In this relation, the chunk head is the head and the non-head word is the modifier.

In the example above, the chunk represented by chat is expanded by attaching the words Le and blanc to the chunk head chat, giving the parse of the chunk.

As of now, we do not assign any dependency relation label to the intra-chunk attachments.
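
The expansion step can be sketched as follows (our illustration, not the authors' code); intra-chunk attachments are left unlabelled, as stated above.

```python
def expand_chunk_tree(n_words, chunks, chunk_parents):
    """Expand a chunk-head parse into a word-level tree.

    `chunks` is a list of (start, end, head) word-index triples with `end`
    exclusive; `chunk_parents[i]` is the index of chunk i's parent chunk
    (-1 for the root).  Each non-head word is attached to its own chunk
    head (an unlabelled modifier-head relation), and each chunk head is
    attached to the head word of its parent chunk.  Returns a 0-based
    head array with -1 marking the root; word order is untouched.
    """
    heads = [-1] * n_words
    for (start, end, head), parent in zip(chunks, chunk_parents):
        for w in range(start, end):
            if w != head:
                heads[w] = head  # intra-chunk: modifier -> chunk head
        heads[head] = chunks[parent][2] if parent >= 0 else -1  # inter-chunk
    return heads

# "(Le chat blanc) (a mangé) (une petite souris)", verb chunk as root:
chunks = [(0, 3, 1), (3, 5, 4), (5, 8, 7)]
print(expand_chunk_tree(8, chunks, [1, -1, 1]))
# [1, 4, 1, 4, -1, 7, 7, 4]
```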

5 Data and parser model

5.1 Data

We carried out our experiments with English (en) and Hindi (hi) as source languages, and English, French (fr), German (de), Indonesian (id), Hebrew (he), Arabic (ar), Korean (ko), Hindi (hi) and Japanese (ja) as target languages. We used the UD v2.0 treebanks for our experiments.

5.1.1 Data for training chunkers

We trained our chunkers using gold chunk annotations derived from the UD 2.0 treebanks of the languages, as described below.

We classified the UD dependency relations into two groups: intra-chunk and inter-chunk. Our set of intra-chunk dependency relations comprises the aux, appos, nummod, det, case, fixed, flat, compound, amod, advmod and goeswith relations. Words attached by the other dependency relations, such as nsubj, obj, iobj, root, obl, comp, cc, conj etc., were considered chunk heads, and their relations with their parents were considered inter-chunk relations. The amod and advmod relations were treated as intra-chunk only selectively: for amod, dependents whose parents are nouns, adjectives or adverbs were considered intra-chunk, and for advmod, dependents whose parents are verbs, adverbs or adjectives were considered intra-chunk. In a dependency parse tree, a chunk-head word along with all the dependents attached to it by intra-chunk relations was considered to be a chunk.
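
The classification rules of this paragraph can be transcribed directly, as sketched below; we assume UD PoS tag names (NOUN, ADJ, ADV, VERB) for the parent categories named in the text.

```python
ALWAYS_INTRA = {"aux", "appos", "nummod", "det", "case",
                "fixed", "flat", "compound", "goeswith"}

def is_intra_chunk(deprel, parent_pos):
    """Classify a UD relation as intra-chunk per the rules above.

    amod and advmod are intra-chunk only for the parent PoS categories
    named in the text; every other relation (nsubj, obj, root, ...) marks
    its dependent as a chunk head, i.e. an inter-chunk relation.
    """
    if deprel in ALWAYS_INTRA:
        return True
    if deprel == "amod":
        return parent_pos in {"NOUN", "ADJ", "ADV"}
    if deprel == "advmod":
        return parent_pos in {"VERB", "ADV", "ADJ"}
    return False

print(is_intra_chunk("det", "NOUN"))     # True
print(is_intra_chunk("advmod", "NOUN"))  # False: inter-chunk under a noun
```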

Training set size Avg. chunking acc.(%) Avg. acc. (En as src.) Avg. acc. (Hi as src.)
20 67.3 46.8 37.1
50 75.3 48.9 39.9
100 82.7 54.4 42.8
200 86.1 55.4 44.0
300 86.7 55.9 44.7
500 88.1 56.8 45.6
1000 88.6 56.8 45.2
1500 89.3 57.0 45.1
Gold chunk 100 65.9 55.2
Full tree _ 48.0 41.5
Table 2: Variation of average chunker accuracy and average UAS of chunk-level transfer over the 9 target languages with chunker training set size, for English and Hindi as source languages

5.1.2 Parser data

The chunk-level parse trees were obtained by removing the subtrees rooted at nodes attached to their parents by intra-chunk relations. Removing these phrase subtrees left us with skeleton trees in which every word is a chunk head related to its parent via an inter-chunk relation. Thus, in each chunk-level tree, each chunk is represented by its chunk head. We trained the chunk-level parser model using the chunk-level trees derived from the training partition of the source language treebanks.

5.2 Parser model

For our experiments, we trained a transition-based encoder-decoder parser model that uses a bi-directional LSTM as the encoder and an attention-based decoder with stack pointers Ma et al. (2018).

6 Experiments and results

In this section, we discuss the experiments and results in detail.

6.1 Chunk labelling and chunk head identification

Language Chunk head identification accuracy
en 97.5
fr 99.2
de 98.6
he 99.1
id 95.2
ar 99.1
ko 99.5
hi 98.7
ja 97.9
Avg. 98.3
Table 3: Chunk head identification accuracies for different languages

We experimented with different training set sizes for the chunkers. In the second column of Table 2 we report the average chunking accuracy over the 9 languages for the different training set sizes. We observe that the accuracy increases with training set size and stabilizes beyond 500 sentences.

In Table 3 we present the chunk-head identification accuracy for the different languages. Although we used a very simple rule set for chunk-head identification, we achieved high accuracies across languages.

6.2 Chunk-level parsing

6.2.1 Baseline

We compare the performance of the chunk-level transfer models against the corresponding word-level transfer parser models as the baseline. For both word-level and chunk-level transfer parsing we used delexicalized transfer parser models.

6.2.2 Chunk-level transfer parser

We experimented with both predicted and gold annotations of the test data.

  • For predicted chunk annotation, the chunker models trained on 500 sentences were used to automatically label the test data, and the chunk heads were identified using the rule-based method discussed above.

  • For the gold annotation, we directly used the gold chunk annotation of the test data.

Evaluation metric:

We report the results of our experiments in terms of unlabelled attachment score (UAS) and labelled attachment score (LAS).
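
For reference, the two metrics follow their standard definitions; the sketch below is our own minimal implementation, not tooling from the paper.

```python
def attachment_scores(gold, pred):
    """Compute UAS and LAS (in %) over per-word (head, label) pairs.

    `gold` and `pred` are equal-length lists of (head, label) tuples:
    UAS counts words with the correct head, LAS additionally requires
    the correct dependency label.
    """
    assert len(gold) == len(pred) and gold
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / len(gold)
    las = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return 100.0 * uas, 100.0 * las

gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj")]
print(attachment_scores(gold, pred))  # (100.0, 66.66...)
```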

6.2.3 English as source language

Here we discuss the performance of the chunk-level transfer parser approach with English as source language.

In the third column of Table 2 we present the variation of the average UAS over the 9 target languages with the training set size of the chunkers used to predict the chunks. We observe that beyond a training set size of 50 sentences, the average performance of the chunk-level transfer parser improves over that of the word-level transfer, and that the performance stabilizes at a chunker training size of about 500 sentences. In the following discussion with English as the source language, we report results corresponding to chunkers trained on 500 instances.

In Table 4 we compare the performance of our chunk-level cross-lingual transfer parser model with the baseline. Since we did not assign any relation type to the intra-chunk head-dependent relations, we report only UAS scores for the full trees. We report results for both predicted chunks and gold chunks, and primarily compare the baseline with the performance of transfer parsing using predicted chunks; the bold entries indicate the higher of these two UAS values. The results with gold chunks are included for reference, to show the improvement achievable when gold chunk annotation is available; we underline the gold-chunk entries that give the highest UAS among the three results for a language.

Lang UAS with full tree transfer UAS with predicted chunks UAS with gold chunks
en 94.3 73.7 89.5
fr 76.1 74.1 80.0
de 62.8 60.1 73.3
he 52.6 56.5 65.9
id 46.3 63.2 71.3
ar 30.3 50.2 52.6
ko 27.9 36.8 47.2
hi 26.5 51.1 57.9
ja 15.1 45.2 55.5
Avg 48.0 56.8 65.9
Table 4: Comparison of the performance of the chunk-level transfer parser with the baseline transfer model, with English as the source language. The results are based on target language chunks predicted using a chunker trained on 500 sentences.

In Table 5 we compare the performance on the inter-chunk relations only. In this case we report both the UAS and LAS.

We observe that, averaged across the 9 target languages, our two-stage chunk-level transfer parser performs better than the baseline, and that it beats the baseline on 5 of the 9 target languages. The gains of our approach grow with the syntactic distance between the source and target languages. With gold chunk information, the chunk-level transfer parser outperforms the baseline on 8 of the 9 target languages in terms of UAS and on 7 languages in terms of LAS.

Lang Full tree transfer Predicted chunk transfer Gold chunk transfer
U L U L U L
en 88.5 78.2 74.8 68.7 90.3 85.4
fr 68.9 48.4 66.1 51.5 74.0 56.9
de 51.6 39.7 50.8 36.4 63.3 48.7
he 43.0 29.4 53.5 31.2 60.4 38.8
id 29.4 35.5 65.6 53.9 75.3 64.4
ar 22.9 13.8 44.7 26.6 47.1 29.5
ko 20.0 10.5 36.4 21.2 38.2 24.0
hi 28.2 19.0 32.3 19.2 39.1 25.5
ja 13.7 7.9 21.4 9.3 22.8 10.6
Avg 40.7 31.4 49.5 35.3 56.7 42.6
Table 5: Comparison of the performance of the chunk-level transfer parser with the baseline transfer model on the inter-chunk relations only, with English as the source language. The chunks were predicted using a chunker trained on 500 sentences.

6.2.4 Hindi as source language

We repeated our experiments with Hindi as the source language and the same set of target languages as above. In the fourth column of Table 2 we present the variation of the average UAS over the 9 target languages with the training set size of the chunkers used to predict the chunks. We observe that the performance starts improving beyond a chunker training set size of 100 sentences, and the highest accuracy is reached at 500 trees. Hence, in the following discussion we report results corresponding to chunkers trained on 500 instances.

In Table 6 we compare the performance of our chunk-level transfer parser with the baseline on full trees, and in Table 7 we report the results for the inter-chunk relations only.

Lang UAS with full tree transfer UAS with predicted chunks UAS with gold chunks
en 39.7 45.3 57.2
fr 32.7 48.8 53.1
de 46.8 50.8 62.8
he 24.5 33.3 39.9
id 19.7 31.2 49.2
ar 8.6 10.2 12.1
ko 43.9 48.4 68.4
hi 96.1 81.9 84.5
ja 61.3 61.0 69.7
Avg 41.5 45.6 55.2
Table 6: Comparison of the performance of the chunk-level transfer parser with the baseline transfer model, with Hindi as the source language. The results are based on target language chunks predicted using a chunker trained on 500 sentences.

From Table 6 we observe that, on full trees, chunk-level transfer followed by chunk expansion with predicted chunk information performs better than direct transfer with Hindi as the source language for 7 out of 9 target languages, as well as in terms of the average performance over all target languages.

Lang Full tree transfer Predicted chunk transfer Gold chunk transfer
U L U L U L
en 37.6 22.2 36.7 25.2 40.1 27.7
fr 23.8 13.2 24.9 15.1 27.1 17.8
de 44.5 30.5 39.6 28.7 46.5 32.7
he 24.9 8.9 23.0 9.4 25.7 10.6
id 17.3 11.5 29.6 20.8 36.8 24.4
ar 7.2 4.1 10.1 5.9 10.6 6.9
ko 50.5 33.7 48.4 28.5 61.9 34.8
hi 95.1 91.1 89.7 81.9 90.8 84.2
ja 50.4 27.7 49.0 25.1 51.4 27.0
Avg 39.0 27.0 38.2 25.1 43.4 29.5
Table 7: Comparison of the performance of the chunk-level transfer parser with the baseline transfer model on the inter-chunk relations only, with Hindi as the source language. The chunks were predicted using a chunker trained on 500 sentences.

From Table 7 we observe that, with Hindi as the source language, the average performance of chunk-level transfer with predicted chunk information is slightly worse than that of the baseline in terms of both average UAS and average LAS. However, it outperforms the baseline on 5 of the 9 target languages.

7 Conclusion

In this paper, we presented an approach to cross-lingual transfer parsing that reduces the errors due to syntactic differences between the source and target languages by addressing the intra-phrase and inter-phrase syntactic differences separately, provided chunkers are available for the two languages.

References

  • W. U. Ahmad, Z. Zhang, Z. Ma, E. Hovy, K. Chang, and N. Peng (2019) On difficulties of cross-lingual transfer with order differences: a case study on dependency parsing. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
  • B. R. Ambati, R. Agarwal, M. Gupta, S. Husain, and D. M. Sharma (2011) Error detection for treebank validation. In Asian Language Resources collocated with IJCNLP 2011, pp. 23.
  • L. Aufrant, G. Wisniewski, and F. Yvon (2016) Zero-resource dependency parsing: boosting delexicalized cross-lingual transfer with linguistic knowledge. In COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11-16, 2016, Osaka, Japan, pp. 119–130.
  • A. Bharati, S. Husain, M. Vijay, K. Deepak, D. M. Sharma, and R. Sangal (2009) Constraint based hybrid approach to parsing Indian languages. In Proceedings of the 23rd PACLIC, Hong Kong, pp. 614–621.
  • R. Bhatt, B. Narasimhan, M. Palmer, O. Rambow, D. M. Sharma, and F. Xia (2009) A multi-representational and multi-layered treebank for Hindi/Urdu. In Proceedings of the Third Linguistic Annotation Workshop, ACL-IJCNLP ’09, Stroudsburg, PA, USA, pp. 186–189.
  • R. Bhatt and F. Xia (2012) Challenges in converting between treebanks: a case study from the HUTB. In META-RESEARCH Workshop on Advanced Treebanking, pp. 53.
  • S. Chatterji, A. Dhar, S. Sarkar, and A. Basu (2012) A three stage hybrid parser for Hindi. In Proceedings of the Workshop on MTPIL, Mumbai, India, pp. 155–162.
  • A. Das and S. Sarkar (2019) Transform, combine, and transfer: delexicalized transfer parser for low-resource languages. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 19 (1), pp. 4:1–4:30.
  • A. Das and S. Sarkar (2019) A little perturbation makes a difference: treebank augmentation by perturbation improves transfer parsing. In the 16th International Conference on Natural Language Processing, Hyderabad, India (accepted for publication).
  • X. Ma, Z. Hu, J. Liu, N. Peng, G. Neubig, and E. Hovy (2018) Stack-pointer networks for dependency parsing. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1403–1414.
  • T. Meng, N. Peng, and K. Chang (2019) Target language-aware constrained inference for cross-lingual dependency parsing. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), Hong Kong, China.
  • T. Naseem, R. Barzilay, and A. Globerson (2012) Selective sharing for multilingual dependency parsing. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL ’12, Stroudsburg, PA, USA, pp. 629–637.
  • A. Søgaard (2011) Data point selection for cross-language adaptation of dependency parsers. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers - Volume 2, pp. 682–686.
  • A. Vaidya, O. Rambow, and M. Palmer (2014) Light verb constructions with ‘do’ and ‘be’ in Hindi: a TAG analysis. In Workshop on Lexical and Grammatical Resources for Language Processing, pp. 127.
  • D. Wang and J. Eisner (2016) The Galactic Dependencies treebanks: getting more data by synthesizing new languages. Transactions of the Association for Computational Linguistics 4, pp. 491–505.
  • D. Wang and J. Eisner (2018) Synthetic data made to order: the case of parsing. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 1325–1337.
  • D. Zeman and P. Resnik (2008) Cross-language parser adaptation between related languages. In Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages.