A corpus of preposition supersenses in English web reviews

05/08/2016 · Nathan Schneider, et al. · University of Utah · University of Colorado Boulder · IHMC

We present the first corpus annotated with preposition supersenses, unlexicalized categories for semantic functions that can be marked by English prepositions (Schneider et al., 2015). That scheme improves upon its predecessors to better facilitate comprehensive manual annotation. Moreover, unlike the previous schemes, the preposition supersenses are organized hierarchically. Our data will be publicly released on the web upon publication.

1 Introduction

English prepositions exhibit stunning frequency and wicked polysemy. In the 450M-word COCA corpus (Davies, 2010), 11 prepositions are more frequent than the most frequent noun (http://www.wordfrequency.info/free.asp?s=y). In the corpus presented in this paper, prepositions account for 8.5% of tokens (the top 11 prepositions comprise >6% of all tokens). Far from being vacuous grammatical formalities, prepositions serve as essential linkers of meaning, and the few extremely frequent ones are exploited for many different functions (figure 1). For all their importance, however, prepositions have received relatively little attention in computational semantics, and the community has not yet arrived at a comprehensive and reliable scheme for annotating the semantics of prepositions in context (section 2). We believe that such annotation of preposition functions is needed if preposition sense disambiguation systems are to be useful for downstream tasks—e.g., translation or semantic parsing (cf. Dahlmeier et al., 2009; Srikumar and Roth, 2011). (This work focuses on English, but adposition and case systems vary considerably between languages, challenging second language learners and machine translation systems; Chodorow et al., 2007; Shilon et al., 2012; Hashemi and Hwa, 2014.)

(1) I have been going to/Destination the Wildwood_,_NJ for/Duration over 30 years for/Purpose summer~vacations
(2) It is close to/Location bus_lines for/Destination Opera_Plaza
(3) I was looking~to/`i bring a customer to/Destination their lot to/Purpose buy a car

Figure 1: Preposition supersense annotations illustrating polysemy of to and for. Note that both can mark a Destination or Purpose, while there are other functions that do not overlap. The syntactic complement use of infinitival to is tagged as `i. The over token in (1) receives the label Approximator. See section 3.1 for details.
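Corpus frequency figures like those above are easy to reproduce on any PTB-tagged corpus. Below is a minimal Python sketch; the tag set mirrors the candidate heuristics of section 3.1 and is our assumption here, so treat the sketch as illustrative rather than the procedure behind the reported 8.5%.

```python
from collections import Counter

def preposition_share(tagged_tokens):
    """Estimate the share of preposition tokens in a POS-tagged corpus.

    `tagged_tokens` is an iterable of (word, tag) pairs with PTB-style
    tags; IN (preposition/subordinator), TO, and RP (particle) are
    counted as preposition candidates (an assumption for this sketch).
    """
    total = 0
    prep_counts = Counter()
    for word, tag in tagged_tokens:
        total += 1
        if tag in {"IN", "TO", "RP"}:
            prep_counts[word.lower()] += 1
    return sum(prep_counts.values()) / total, prep_counts.most_common(11)

# Toy usage:
toy = [("I", "PRP"), ("went", "VBD"), ("to", "TO"), ("the", "DT"),
       ("store", "NN"), ("for", "IN"), ("milk", "NN")]
share, top11 = preposition_share(toy)
print(f"{share:.1%} preposition candidates; most frequent: {top11}")
```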

This paper describes a new corpus fully annotated with preposition supersenses (hierarchically organized unlexicalized classes). We note that none of the existing English corpora annotated with preposition semantics, on which existing disambiguation models have been trained and evaluated, are both comprehensive (describing all preposition types and tokens) and double-annotated (to attenuate subjectivity in the annotation scheme and to measure inter-annotator agreement). As an alternative to fine-grained sense annotation for individual prepositions—which is difficult and limited by the coverage and quality of a lexicon—we instead train human annotators to label preposition supersenses, reporting the first inter-annotator agreement figures for this task. We comprehensively annotate English preposition tokens in a corpus of web reviews, examine the distribution of their supersenses, and improve upon the supersense hierarchy as necessitated by the data encountered during annotation. Our annotated corpus will be publicly released at the time of publication.

2 Background and Motivation

Theoretical linguists have puzzled over questions such as how individual prepositions can acquire such a broad range of meanings—and to what extent those meanings are systematically related (e.g., Brugman, 1981; Lakoff, 1987; Tyler and Evans, 2003; O’Dowd, 1998; Saint-Dizier and Ide, 2006; Lindstromberg, 2010).

Prepositional polysemy has also been recognized as a challenge for AI (Herskovits, 1986) and natural language processing, motivating semantic disambiguation systems (O’Hara and Wiebe, 2003; Ye and Baldwin, 2007; Hovy et al., 2010; Srikumar and Roth, 2013b). Training and evaluating these systems requires semantically annotated corpus data. Below, we comment briefly on existing resources and why (in our view) a new resource is needed to “road-test” an alternative, hopefully more scalable, semantic representation for prepositions.

2.1 Existing Preposition Corpora

Beginning with the seminal resources from The Preposition Project (TPP; Litkowski and Hargraves, 2005), the computational study of preposition semantics has been fundamentally grounded in corpus-based lexicography centered around individual preposition types. Most previous datasets of preposition semantics at the token level (Litkowski and Hargraves, 2005, 2007; Dahlmeier et al., 2009; Tratz and Hovy, 2009; Srikumar and Roth, 2013a) only cover high-frequency prepositions (the 34 represented in the SemEval-2007 shared task based on TPP, or a subset thereof). A further limitation of the SemEval-2007 dataset is the way in which it was sampled: illustrative tokens from a corpus were manually selected by a lexicographer. As Litkowski (2014) showed, a disambiguation system trained on this dataset will therefore be biased and perform poorly on an ecologically valid sample of tokens.

We sought a scheme that would facilitate comprehensive semantic annotation of all preposition tokens in a corpus: thus, it would have to cover the full range of usages possible for the full range of English preposition types. The recent TPP PDEP corpus (Litkowski, 2014, 2015) comes closer to this goal, as it consists of randomly sampled tokens for over 300 types. However, sentences were sampled separately for each preposition, so there is only one annotated preposition token per sentence. By contrast, we fully annotate documents for all preposition tokens. No inter-annotator agreement figures have been reported for the PDEP data to indicate its quality, or the overall difficulty of token annotation with TPP senses across a broad range of prepositions.

Figure 2: Supersense hierarchy used in this work (adapted from Schneider et al., 2015). Circled nodes are roots (the most abstract categories); subcategories are shown above and below. Each node’s color and formatting reflect its depth.

2.2 Supersenses

From the literature on other kinds of supersenses, there is reason to believe that token annotation with preposition supersenses (Schneider et al., 2015) will be more scalable and useful than senses. The term supersense has been applied to lexical semantic classes that label a large number of word types (i.e., they are unlexicalized). The best-known supersense scheme draws on two inventories—one for nouns and one for verbs—which originated as a high-level partitioning of senses in WordNet (Miller et al., 1990). A scheme for adjectives has been proposed as well (Tsvetkov et al., 2014).

One argument advanced in favor of supersenses is that they provide a coarse level of generalization for essential contextual distinctions—such as artifact vs. person for chair, or temporal vs. locative in—without being so fine-grained that systems cannot learn them (Ciaramita and Altun, 2006). A similar argument applies to human learning as pertains to rapid, cost-effective, and open-vocabulary annotation of corpora: an inventory of dozens of categories (with mnemonic names) can be learned and applied to an unlimited vocabulary without having to refer to dictionary definitions (Schneider et al., 2012). As with WordNet for nouns and verbs, the same argument holds for prepositions: TPP-style sense annotation requires familiarity with a different set of (often highly nuanced) distinctions for each preposition type. For example, in has 15 different TPP senses, among them in 10(7a) ‘indicating the key in which a piece of music is written: Mozart’s Piano Concerto in E flat’.

Supersenses have been exploited for a variety of tasks (e.g., Agirre et al., 2008; Tsvetkov et al., 2013, 2015), and full-sentence noun and verb taggers have been built for several languages (Segond et al., 1997; Johannsen et al., 2014; Picca et al., 2008; Martínez Alonso et al., 2015; Schneider et al., 2013, 2016). They are typically implemented as sequence taggers. In the present work, we extend a corpus that has already been hand-annotated with noun and verb supersenses, thus raising the possibility of systems that can learn all three kinds of supersenses jointly (cf. Srikumar and Roth, 2013b).

2.3 PrepWiki

Schneider et al.’s (2015) preposition supersense scheme is described in detail in a lexical resource, PrepWiki (http://tiny.cc/prepwiki), which records associations between supersenses and preposition types. Hereafter, we adopt the term usage for a pairing of a preposition type and a supersense label—e.g., at/Time. Usages are organized in PrepWiki via (lexicalized) senses from the TPP lexicon. The mapping is many-to-many, as senses and supersenses capture different generalizations. (TPP senses, being lexicalized, are more numerous and generally finer-grained, but in some cases lump together functions that receive different supersenses, as in for sense 2(2) ‘affecting, with regard to, or in respect of’.) Thus, for a given preposition, a sense may be mapped to multiple usages, and vice versa.
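The many-to-many relationship between senses and usages can be pictured with a small data structure. A sketch follows; the sense identifier and the particular supersense assignments are illustrative assumptions, not PrepWiki's actual records.

```python
from collections import defaultdict

# A "usage" pairs a preposition type with a supersense, e.g., ("at", "Time").
sense_to_usages = defaultdict(set)   # TPP sense id -> set of usages
usage_to_senses = defaultdict(set)   # usage -> set of TPP sense ids

def link(sense_id, preposition, supersense):
    """Record one sense<->usage association (the mapping is many-to-many)."""
    usage = (preposition, supersense)
    sense_to_usages[sense_id].add(usage)
    usage_to_senses[usage].add(sense_id)

# A lexicalized sense can lump together functions that receive different
# supersenses (hypothetical split of for 2(2), for illustration only):
link("for_2(2)", "for", "Beneficiary")
link("for_2(2)", "for", "Topic")

print(sense_to_usages["for_2(2)"])
# {('for', 'Beneficiary'), ('for', 'Topic')}
```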

2.4 The Supersense Hierarchy

Of the four supersense schemes mentioned above, Schneider et al.’s (2015) inventory for prepositions (which improved upon the inventory of Srikumar and Roth (2013a)) is unique in being hierarchical. It is an inheritance hierarchy (see figure 2): characteristics of higher-level categories are asserted to apply to their descendants. Multiple inheritance is used for cases of overlap: e.g., Destination inherits from both Location (because a destination is a point in physical space) and Goal (it is the endpoint of a concrete or abstract path).

The structure of the hierarchy was modeled after VerbNet’s hierarchy of thematic roles (Bonial et al., 2011; Hwang, 2014). But there are many additional categories: some are refinements of the VerbNet roles (e.g., subclasses of Time), while others have no VerbNet counterpart because they do not pertain to core roles of verbs. The Configuration subhierarchy, which is used for of and other prepositions when they relate two nominals, is a good example.
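Inheritance in the hierarchy is straightforward to encode as a DAG of immediate parents. In the sketch below, only the Destination links are asserted in the text; the placement of Location under Place and Goal under Path is our assumption and should be checked against figure 2.

```python
# Immediate parents per supersense; multiple inheritance = multiple parents.
PARENTS = {
    "Destination": ["Location", "Goal"],  # a point in space AND a path endpoint
    "Location": ["Place"],                # assumed placement; see figure 2
    "Goal": ["Path"],                     # assumed placement; see figure 2
}

def ancestors(label):
    """All categories whose characteristics `label` inherits."""
    seen, stack = set(), [label]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

assert ancestors("Destination") == {"Location", "Goal", "Place", "Path"}
```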

3 Corpus Annotation

3.1 Annotating Preposition Supersenses

Source data.

We fully annotated the Reviews section of the English Web Treebank (Bies et al., 2012), selected because it had previously been annotated for multiword expressions and noun and verb supersenses (Schneider et al., 2014; Schneider and Smith, 2015). The corpus consists of 55,579 tokens organized into 3,812 sentences and 723 documents, with gold tokenization and PTB-style POS tags.

Identifying preposition tokens.

TPP, and therefore PrepWiki, contains senses for canonical prepositions, i.e., those used transitively in the [P NP] construction. Taking inspiration from Pullum and Huddleston (2002), PrepWiki further assigns supersenses to spatiotemporal particle uses of out, up, away, together, etc., and to subordinating uses of as, after, in, with, etc. (including infinitival to and infinitival-subject for, as in It took over 1.5 hours for our food to come out). PrepWiki does not include subordinators/complementizers that cannot take NP complements: that, because, while, if, etc.

Non-supersense labels.

These are used where the heuristics fail (sometimes due to a POS tagging error) or where the preposition serves a special syntactic function not captured by the supersense inventory. The most frequent is `i, which applies only to infinitival to tokens that are not Purpose or Function adjuncts. (See figure 1 for examples from the corpus. I want/love/try to eat cookies and To love is to suffer would qualify as `i; a shoulder to cry on would qualify as Function.) The label `d applies to discourse expressions; the unqualified backtick (`) applies to miscellaneous cases such as infinitival-subject for and both prepositions in the as-as comparative construction (as wet as water; as much cake as you want).

Multiword expressions.

Figure 3 shows how prepositions can interact with multiword expressions (MWEs). An MWE may function holistically as a preposition: PrepWiki treats these as multiword prepositions. An idiomatic phrase may be headed by a preposition, in which case we assign it a preposition supersense or tag it as a discourse expression (`d). Finally, a preposition may be embedded within an MWE (but not its head): we do not use a preposition supersense in this case, though the MWE as a whole may already be tagged with a verb supersense.

Heuristics.

The annotation tool uses heuristics to detect candidate preposition tokens in each sentence given its POS tagging and MWE annotation. A single-word expression is included if:

  • it is tagged as a verb particle (RP) or infinitival to (TO), or

  • it is tagged as a transitive preposition or subordinator (IN) or adverb (RB), and the word is listed in PrepWiki (or the spelling variants list).

A strong MWE instance is included if:

  • the MWE begins with a word that matches the single-word criteria (idiomatic PP), or

  • the MWE is listed in PrepWiki (multiword preposition).
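A sketch of these candidate-detection rules follows; the lookup sets are assumptions standing in for PrepWiki's entry and spelling-variant lists, and this is not the annotation tool's actual code.

```python
def single_word_candidate(word, tag, prepwiki_entries, spelling_variants):
    """PTB-tagged single-word expression: verb particle (RP) or infinitival
    to (TO); or a transitive preposition/subordinator (IN) or adverb (RB)
    whose form is listed in PrepWiki or the spelling-variants list."""
    if tag in {"RP", "TO"}:
        return True
    listed = word.lower() in prepwiki_entries or word.lower() in spelling_variants
    return tag in {"IN", "RB"} and listed

def strong_mwe_candidate(words, tags, prepwiki_entries, spelling_variants):
    """Strong MWE: begins with a single-word candidate (idiomatic PP),
    or is itself listed in PrepWiki (multiword preposition)."""
    idiomatic_pp = single_word_candidate(words[0], tags[0],
                                         prepwiki_entries, spelling_variants)
    multiword_prep = " ".join(w.lower() for w in words) in prepwiki_entries
    return idiomatic_pp or multiword_prep
```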

(1) Because_of/Explanation the ants I dropped them to/EndState a 3_star .
(2) I was told to/`i take my coffee to_go/Manner if I wanted to/`i finish it .
(3) With/Attribute higher than/Scalar/Rank average prices to_boot/`d !
(4) I worked~with/ProfessionalAspect Sam_Mones who took_ great _care_of me .

Figure 3: Prepositions involved in multiword expressions. (1) Multiword preposition because of (others include in front of, due to, apart from, and other than). (2) PP idiom: the preposition supersense applies to the MWE as a whole. (3) Discourse PP idiom: instead of a supersense, expressions serving a discourse function are tagged as `d. (4) Preposition within a multiword expression: the expression is headed by a verb, so it receives a verb supersense (not shown) rather than a preposition supersense.

Annotation task.

Annotators proceeded sentence by sentence, working in a custom web interface (figure 4). For each token matched by the above heuristics, annotators filled in a text box with the contextually appropriate label. A dropdown menu showed the list of preposition supersenses and non-supersense labels, starting with labels known to be associated with the preposition being annotated. Hovering over a menu item would show example sentences to illustrate the usage in question, as well as a brief definition of the supersense. This preposition-specific rendering of the dropdown menu—supported by data from PrepWiki—was crucial to reducing the overhead of annotation (and annotator training) by focusing the annotator’s attention on the relevant categories/usages. New examples were added to PrepWiki as annotators spotted coverage gaps. The tool also showed the multiword expression annotation of the sentence, which could be modified if necessary to fit PrepWiki’s conventions for multiword prepositions.

Figure 4: Supersense annotation interface, developed in-house. Preposition, noun, and verb supersenses are stored in text boxes below the sentence. A dropdown menu displays the full list of preposition supersenses, starting with those with PrepWiki mappings to the preposition in question. Hovering the mouse over a menu item displays a tooltip with PrepWiki examples of the usage (if applicable) and a general definition of the supersense.
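The preposition-specific ordering of the dropdown can be sketched as follows; the data structures are assumed, as the in-house tool's code was not published.

```python
def dropdown_items(preposition, known_usages, all_supersenses, special_labels):
    """Order menu items for one token: supersenses that PrepWiki associates
    with this preposition first, then the remaining supersenses, then the
    non-supersense labels (`i, `d, `)."""
    known = [ss for ss in all_supersenses if (preposition, ss) in known_usages]
    rest = [ss for ss in all_supersenses if (preposition, ss) not in known_usages]
    return known + rest + list(special_labels)
```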

3.2 Quality Control

Annotators.

Annotators were selected from undergraduate and graduate linguistics students at the University of Colorado at Boulder. All annotators had prior experience with semantic role labeling. Every sentence was independently annotated by two annotators, and disagreements were subsequently adjudicated by a third, “expert” annotator. There were two expert annotators, both authors of this paper.

Training.

200 sentences were set aside for training annotators. Annotators were first shown how to use the preposition annotation tool and instructed on the supersense distinctions for this task. Annotators then completed a training set of 100 sentences. An adjudicator evaluated the annotator’s annotations, providing feedback and assigning another 50–100 training instances if necessary.

Inter-annotator agreement (IAA) measures are useful in quantifying annotation “reliability”, i.e., indicating how trustworthy and reproducible the process is (given guidelines, training, tools, etc.). Specifically, IAA scores can be used as a diagnostic for the reliability of (i) individual annotators (to identify those who need additional training/guidance); (ii) the annotation scheme and guidelines (to identify problematic phenomena requiring further documentation or substantive changes to the scheme); (iii) the final dataset (as an indicator of what could reasonably be expected of an automatic system).

Individual annotators.

The main annotation was divided into 34 batches of 100 sentences. Each batch took on the order of an hour for an annotator to complete. We monitored the original annotators’ IAA throughout the annotation process as a diagnostic for when to intervene with further guidance. Original IAA for most of these batches fell between 60% and 78%, depending on factors such as the identities of the annotators and when the annotation took place (annotator experience and PrepWiki documentation improved over time). Specifically, the agreement rate among tokens where both annotators assigned a preposition supersense was between 82% and 87% for 4 batches; 72% and 78% for 11; 60% and 70% for 17; and below 60% for 2. This measure did not award credit for agreement on non-supersense labels and ignored some cases of disagreement on the MWE analysis. These rates show that it was not an easy annotation task, though many of the disagreements were over slight distinctions in the hierarchy (such as Purpose vs. Function).
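The batch-level agreement measure just described (credit only when both annotators chose a supersense) can be sketched as below; this is our reconstruction, not the project's evaluation script.

```python
def supersense_agreement(labels_a, labels_b, supersenses):
    """Raw IAA restricted to tokens where BOTH annotators assigned a
    preposition supersense; non-supersense labels (`i, `d, `) earn no
    credit and are excluded from the denominator."""
    both = [(a, b) for a, b in zip(labels_a, labels_b)
            if a in supersenses and b in supersenses]
    return sum(a == b for a, b in both) / len(both) if both else float("nan")
```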

Guidelines.

Though Schneider et al. (2015) conducted pilot annotation in constructing the supersense inventory, our annotators found a few details of the scheme to be confusing. Informed by their difficulties and disagreements, we therefore made several minor improvements to the preposition supersense categories and hierarchy structure. For example, the supersense categories for partitive constructions proved persistently problematic, so we adjusted their boundaries and names. We also improved the high-level organization of the original hierarchy, clarified some supersense descriptions, and removed the miscellaneous Other supersense.

Figure 5: Distributions of preposition types and supersenses for the 4,250 supersense-tagged preposition tokens in the corpus. In total, 114 prepositions and 63 supersenses are attested. Observe that just 9 prepositions account for 75% of tokens, whereas the head of the supersense distribution is much smaller.

Revisions.

The changes to categories/guidelines noted in the previous paragraph required a small-scale post hoc revision to the annotations, which was performed by the expert annotators. Some additional post hoc revisions were performed to improve consistency; e.g., some anomalous multiword expression annotations involving prepositions were fixed. In particular, many of the borderline prepositional verbs were revised according to the guidelines outlined at https://github.com/nschneid/nanni/wiki/Prepositional-Verb-Annotation-Guidelines.

Adjudication reliability.

Because sentences were adjudicated by one of two expert annotators, we can estimate the dataset’s adjudication reliability—roughly, the expected proportion of tokens that would have been labeled the same way if adjudicated by the other expert—by measuring IAA on a sample independently annotated by both experts. (These sentences were then jointly adjudicated by the experts to arrive at a final version.) Applying this procedure to 203 sentences annotated late in the process, with the agreement measure described above for individual annotators, gives an agreement rate approaching 90%. Cohen’s κ is almost as high as the raw agreement rate because the expected agreement rate is very low—but keep in mind that κ’s model of chance agreement does not take into account preposition types or the fact that a relatively small subset of labels was suggested for most prepositions. On the 4 most frequent prepositions in the sample, per-preposition κ is .84 for for, 1.0 for to, .59 for of, and .73 for in. It is difficult to put an exact quality figure on a dataset that was developed over a period of time and with the involvement of many individuals; however, the fact that the expert-to-expert adjudication estimate approaches 90% despite the large number of labels suggests that the data can serve as a reliable resource for training and benchmarking disambiguation systems.
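For the κ figures, the standard Cohen's κ computation (a textbook formulation, not the authors' script) makes clear why chance agreement is tiny with dozens of labels:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa = (observed - expected) / (1 - expected), where the
    expected rate is what two annotators with these marginal label
    distributions would agree on by chance. With ~63 supersenses the
    expected rate is tiny, so kappa stays close to raw agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    marg_a, marg_b = Counter(labels_a), Counter(labels_b)
    expected = sum(marg_a[l] * marg_b[l] for l in marg_a) / (n * n)
    return (observed - expected) / (1 - expected)
```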

3.3 Resulting Corpus

4,250 tokens have preposition supersenses. Their distribution appears in figure 5. Over 75% of tokens belong to the top 10 preposition types, while the supersense distribution is closer to uniform. 1,170 tokens are labeled as Location, Path, or a subtype thereof: these can roughly be described as spatial. 528 come from the Temporal subtree of the hierarchy, and 452 from the Configuration subtree. Thus, fully half the tokens (2,100) mark non-spatiotemporal participants and circumstances.
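Subtree counts like these aggregate over the inheritance hierarchy. A self-contained sketch (the `parents` map is the immediate-parent encoding from the sketch in section 2.4):

```python
def count_under(tokens, roots, parents):
    """Count (preposition, supersense) tokens whose supersense is one of
    `roots` or inherits from one, walking the immediate-parent map."""
    roots = set(roots)
    def inherits(label):
        stack, seen = [label], {label}
        while stack:
            cur = stack.pop()
            if cur in roots:
                return True
            for p in parents.get(cur, []):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return False
    return sum(1 for _, supersense in tokens if inherits(supersense))

# e.g., count_under(tokens, {"Location", "Path"}, PARENTS) approximates the
# "spatial" figure reported above.
```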

Of the 4,250 tokens, 582 are in MWEs (multiword prepositions and/or PP idioms). (For the purpose of counting prepositions by type, we split up supersense-tagged PP idioms such as those shown in figure 3, examples (2) and (3), by taking the longest prefix of words that has a PrepWiki entry to be the preposition.) A further 588 tokens have non-supersense labels: 484 `i, 83 `d, and 21 `.

[Venn diagram: Train only = 21, Test only = 7, Wiki only = 157, Train∩Test = 4, Train∩Wiki = 125, Test∩Wiki = 19, Train∩Test∩Wiki = 178]

(a) Usages (preposition type + supersense). E.g., after/Explanation and into/EndState are recorded in the wiki and attested in the training data but not the test data. (Recall that PrepWiki was updated over the course of annotation, so these figures are not intended to predict its coverage of unseen data. We refrained from adding to PrepWiki a few usages that appeared infrequently in the data and seemed grammatically marginal or had a debatable supersense annotation.)

[Venn diagram: Train only = 0, Test only = 0, Wiki only = 30, Train∩Test = 0, Train∩Wiki = 45, Test∩Wiki = 5, Train∩Test∩Wiki = 64]

(b) Prepositions with ≥1 usage. Examples occurring in only one of the data splits include despite, in spite of, onto, via, and on top of. The count of 30 wiki-only prepositions includes only those with at least one mapped supersense.

[Venn diagram: Train only = 0, Test only = 0, Wiki only = 5, Train∩Test = 0, Train∩Wiki = 2, Test∩Wiki = 2, Train∩Test∩Wiki = 59]

(c) Supersenses with ≥1 usage. Creator, Co-Patient, Transit, Temporal, and 3DMedium are associated with usages in the wiki, but these usages are rare and did not appear in our data. 7 supersenses—Configuration, Participant, Affector, Undergoer, Place, Path, and Traversed—are abstractions intended solely for structuring the wiki; they are not used directly to label prepositions either in the wiki or in the data.
Figure 6: Venn diagrams counting types of usages, prepositions, and supersenses in the data and wiki.

3.4 Splits

To facilitate future experimentation on a standard benchmark, we partitioned our data into training and test sets. We randomly sampled 447 sentences (4,073 total tokens and 950 preposition instances) for a held-out test set, leaving 3,888 preposition instances for training. (Excluding tokens bearing non-supersense labels, the supersense-labeled prepositions amount to 3,397 training and 853 test instances.) The sampling was stratified by preposition supersense so as to encourage a reasonable balance for the rare labels; e.g., supersenses that occur twice are split so that one instance is assigned to the training set and one to the test set. (The sampling algorithm considered supersenses in increasing order of frequency: for each supersense, enough sentences were assigned to the test set to fill a minimum quota of tokens for that supersense, and remaining unassigned sentences containing that supersense were placed in the training set.) Relative to the training set, the test set is skewed slightly in favor of rarer supersenses. A small number of annotation errors were corrected subsequent to determining the splits. Entire sentences were sampled to facilitate future studies involving joint prediction over the full sentence.
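A sketch of supersense-stratified sentence sampling in the spirit of the description above; the quota function is a placeholder assumption, since the exact quota was not recoverable from this text.

```python
import random
from collections import Counter

def stratified_split(sentences, quota=lambda n: max(1, n // 5), seed=0):
    """Assign whole sentences to a test set, visiting supersenses from
    rarest to most frequent and filling a minimum token quota for each.
    `sentences` maps a sentence id to the list of supersense labels of
    its preposition tokens. The quota function is an assumed placeholder."""
    rng = random.Random(seed)
    freq = Counter(l for labels in sentences.values() for l in labels)
    test, filled = set(), Counter()
    for label, n in sorted(freq.items(), key=lambda kv: kv[1]):
        pool = [sid for sid, labels in sentences.items()
                if label in labels and sid not in test]
        rng.shuffle(pool)
        while filled[label] < quota(n) and pool:
            sid = pool.pop()
            test.add(sid)
            filled.update(sentences[sid])  # credit every label in the sentence
        # sentences with this label left in `pool` stay in the training set
    train = set(sentences) - test
    return train, test
```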

Figure 6 shows, at a type level, the extent of overlap between the training set, test set, and PrepWiki. 61 preposition supersenses are attested in the training data, while 14 are unattested.

4 Conclusion

We have introduced a new lexical semantics corpus that disambiguates prepositions with hierarchical supersenses. Because it is comprehensively annotated over full documents, it offers insights into the semantic distribution of prepositions. The corpus will further facilitate the development of automatic preposition disambiguation systems.

Acknowledgments

We thank our annotators—Evan Coles-Harris, Audrey Farber, Nicole Gordiyenko, Megan Hutto, Celeste Smitz, and Tim Watervoort—as well as Ken Litkowski, Michael Ellsworth, Orin Hargraves, and Susan Brown for helpful discussions. This research was supported in part by a Google research grant for Q/A PropBank Annotation.

References

  • Agirre et al. (2008) Eneko Agirre, Timothy Baldwin, and David Martinez. 2008. Improving parsing and PP attachment performance with sense information. In Proc. of ACL-HLT, pages 317–325. Columbus, Ohio, USA.
  • Bies et al. (2012) Ann Bies, Justin Mott, Colin Warner, and Seth Kulick. 2012. English Web Treebank. Technical Report LDC2012T13, Linguistic Data Consortium, Philadelphia, PA. URL http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2012T13.
  • Bonial et al. (2011) Claire Bonial, William Corvey, Martha Palmer, Volha V. Petukhova, and Harry Bunt. 2011. A hierarchical unification of LIRICS and VerbNet semantic roles. In Fifth IEEE International Conference on Semantic Computing, pages 483–489. Palo Alto, CA, USA.
  • Brugman (1981) Claudia Brugman. 1981. The story of ‘over’: polysemy, semantics and the structure of the lexicon. MA thesis, University of California, Berkeley, Berkeley, CA. Published New York: Garland, 1981.
  • Chodorow et al. (2007) Martin Chodorow, Joel R. Tetreault, and Na-Rae Han. 2007. Detection of grammatical errors involving prepositions. In Proc. of the Fourth ACL-SIGSEM Workshop on Prepositions, pages 25–30. Prague, Czech Republic.
  • Ciaramita and Altun (2006) Massimiliano Ciaramita and Yasemin Altun. 2006. Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proc. of EMNLP, pages 594–602. Sydney, Australia.
  • Dahlmeier et al. (2009) Daniel Dahlmeier, Hwee Tou Ng, and Tanja Schultz. 2009. Joint learning of preposition senses and semantic roles of prepositional phrases. In Proc. of EMNLP, pages 450–458. Suntec, Singapore.
  • Davies (2010) Mark Davies. 2010. The Corpus of Contemporary American English as the first reliable monitor corpus of English. Literary and Linguistic Computing, 25(4):447–464.
  • Hashemi and Hwa (2014) Homa B. Hashemi and Rebecca Hwa. 2014. A comparison of MT errors and ESL errors. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proc. of LREC, pages 2696–2700. Reykjavík, Iceland.
  • Herskovits (1986) Annette Herskovits. 1986. Language and spatial cognition: an interdisciplinary study of the prepositions in English. Studies in Natural Language Processing. Cambridge University Press, Cambridge, UK.
  • Hovy et al. (2010) Dirk Hovy, Stephen Tratz, and Eduard Hovy. 2010. What’s in a preposition? Dimensions of sense disambiguation for an interesting word class. In Coling 2010: Posters, pages 454–462. Beijing, China.
  • Hwang (2014) Jena D. Hwang. 2014. Identification and representation of caused motion constructions. Ph.D. dissertation, University of Colorado, Boulder, Colorado.
  • Johannsen et al. (2014) Anders Johannsen, Dirk Hovy, Héctor Martínez Alonso, Barbara Plank, and Anders Søgaard. 2014. More or less supervised supersense tagging of Twitter. In Proc. of *SEM, pages 1–11. Dublin, Ireland.
  • Lakoff (1987) George Lakoff. 1987. Women, fire, and dangerous things: what categories reveal about the mind. University of Chicago Press, Chicago.
  • Lindstromberg (2010) Seth Lindstromberg. 2010. English Prepositions Explained. John Benjamins, Amsterdam, revised edition.
  • Litkowski (2014) Ken Litkowski. 2014. Pattern Dictionary of English Prepositions. In Proc. of ACL, pages 1274–1283. Baltimore, Maryland, USA.
  • Litkowski (2015) Ken Litkowski. 2015. Notes on barbecued opakapaka: ontology in preposition patterns. Technical Report 15-01, CL Research, Damascus, MD. URL http://www.clres.com/online-papers/PDEPOntology.pdf.
  • Litkowski and Hargraves (2005) Ken Litkowski and Orin Hargraves. 2005. The Preposition Project. In Proc. of the Second ACL-SIGSEM Workshop on the Linguistic Dimensions of Prepositions and their Use in Computational Linguistics Formalisms and Applications, pages 171–179. Colchester, Essex, UK.
  • Litkowski and Hargraves (2007) Ken Litkowski and Orin Hargraves. 2007. SemEval-2007 Task 06: Word-Sense Disambiguation of Prepositions. In Proc. of SemEval, pages 24–29. Prague, Czech Republic.
  • Martínez Alonso et al. (2015) Héctor Martínez Alonso, Anders Johannsen, Sussi Olsen, Sanni Nimb, Nicolai Hartvig Sørensen, Anna Braasch, Anders Søgaard, and Bolette Sandford Pedersen. 2015. Supersense tagging for Danish. In Beáta Megyesi, editor, Proc. of NODALIDA, pages 21–29. Vilnius, Lithuania.
  • Miller et al. (1990) George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller. 1990. Five Papers on WordNet. Technical Report 43, Princeton University, Princeton, NJ.
  • O’Dowd (1998) Elizabeth M. O’Dowd. 1998. Prepositions and particles in English: a discourse-functional account. Oxford University Press, New York.
  • O’Hara and Wiebe (2003) Tom O’Hara and Janyce Wiebe. 2003. Preposition semantic classification via Treebank and FrameNet. In Walter Daelemans and Miles Osborne, editors, Proc. of CoNLL, pages 79–86. Edmonton, Canada.
  • Picca et al. (2008) Davide Picca, Alfio Massimiliano Gliozzo, and Massimiliano Ciaramita. 2008. Supersense Tagger for Italian. In Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odjik, Stelios Piperidis, and Daniel Tapias, editors, Proc. of LREC, pages 2386–2390. Marrakech, Morocco.
  • Pullum and Huddleston (2002) Geoffrey K. Pullum and Rodney Huddleston. 2002. Prepositions and preposition phrases. In Rodney Huddleston and Geoffrey K. Pullum, editors, The Cambridge Grammar of the English Language, pages 579–611. Cambridge University Press, Cambridge, UK.
  • Saint-Dizier and Ide (2006) Patrick Saint-Dizier and Nancy Ide, editors. 2006. Syntax and Semantics of Prepositions, volume 29 of Text, Speech and Language Technology. Springer, Dordrecht, The Netherlands.
  • Schneider et al. (2016) Nathan Schneider, Dirk Hovy, Anders Johannsen, and Marine Carpuat. 2016. SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM). In Proc. of SemEval. San Diego, California, USA.
  • Schneider et al. (2013) Nathan Schneider, Behrang Mohit, Chris Dyer, Kemal Oflazer, and Noah A. Smith. 2013. Supersense tagging for Arabic: the MT-in-the-middle attack. In Proc. of NAACL-HLT, pages 661–667. Atlanta, Georgia, USA.
  • Schneider et al. (2012) Nathan Schneider, Behrang Mohit, Kemal Oflazer, and Noah A. Smith. 2012. Coarse lexical semantic annotation with supersenses: an Arabic case study. In Proc. of ACL, pages 253–258. Jeju Island, Korea.
  • Schneider et al. (2014) Nathan Schneider, Spencer Onuffer, Nora Kazour, Emily Danchik, Michael T. Mordowanec, Henrietta Conrad, and Noah A. Smith. 2014. Comprehensive annotation of multiword expressions in a social web corpus. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proc. of LREC, pages 455–461. Reykjavík, Iceland.
  • Schneider and Smith (2015) Nathan Schneider and Noah A. Smith. 2015. A corpus and model integrating multiword expressions and supersenses. In Proc. of NAACL-HLT, pages 1537–1547. Denver, Colorado.
  • Schneider et al. (2015) Nathan Schneider, Vivek Srikumar, Jena D. Hwang, and Martha Palmer. 2015. A hierarchy with, of, and for preposition supersenses. In Proc. of The 9th Linguistic Annotation Workshop, pages 112–123. Denver, Colorado, USA.
  • Segond et al. (1997) Frédérique Segond, Anne Schiller, Gregory Grefenstette, and Jean-Pierre Chanod. 1997. An experiment in semantic tagging using hidden Markov model tagging. In Piek Vossen, Geert Adriaens, Nicoletta Calzolari, Antonio Sanfilippo, and Yorick Wilks, editors, Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications: ACL/EACL-97 Workshop Proceedings, pages 78–81. Madrid, Spain.
  • Shilon et al. (2012) Reshef Shilon, Hanna Fadida, and Shuly Wintner. 2012. Incorporating linguistic knowledge in statistical machine translation: translating prepositions. In Proc. of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pages 106–114. Avignon, France.
  • Srikumar and Roth (2011) Vivek Srikumar and Dan Roth. 2011. A joint model for extended semantic role labeling. In Proc. of EMNLP, pages 129–139. Edinburgh, Scotland, UK.
  • Srikumar and Roth (2013a) Vivek Srikumar and Dan Roth. 2013a. An inventory of preposition relations. Technical Report arXiv:1305.5785. URL http://arxiv.org/abs/1305.5785.
  • Srikumar and Roth (2013b) Vivek Srikumar and Dan Roth. 2013b. Modeling semantic relations expressed by prepositions. Transactions of the Association for Computational Linguistics, 1:231–242.
  • Tratz and Hovy (2009) Stephen Tratz and Dirk Hovy. 2009. Disambiguation of preposition sense using linguistically motivated features. In Proc. of NAACL-HLT Student Research Workshop and Doctoral Consortium, pages 96–100. Boulder, Colorado.
  • Tsvetkov et al. (2015) Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Guillaume Lample, and Chris Dyer. 2015. Evaluation of word vector representations by subspace alignment. In Proc. of EMNLP. Lisbon, Portugal.
  • Tsvetkov et al. (2013) Yulia Tsvetkov, Elena Mukomel, and Anatole Gershman. 2013. Cross-lingual metaphor detection using common semantic features. In Proc. of the First Workshop on Metaphor in NLP, pages 45–51. Atlanta, Georgia, USA.
  • Tsvetkov et al. (2014) Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui, and Chris Dyer. 2014. Augmenting English adjective senses with supersenses. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proc. of LREC, pages 4359–4365. Reykjavík, Iceland.
  • Tyler and Evans (2003) Andrea Tyler and Vyvyan Evans. 2003. The Semantics of English Prepositions: Spatial Scenes, Embodied Meaning and Cognition. Cambridge University Press, Cambridge, UK.
  • Ye and Baldwin (2007) Patrick Ye and Timothy Baldwin. 2007. MELB-YB: Preposition sense disambiguation using rich semantic features. In Proc. of SemEval, pages 241–244. Prague, Czech Republic.