
Investigating Antigram Behaviour using Distributional Semantics

by   Saptarshi Sengupta, et al.

Language is an extremely interesting subject to study, each day presenting new challenges and new topics for research. Words in particular have several unique characteristics which, when explored, prove to be astonishing. Anagrams and antigrams are words possessing such properties. The presented work explores generating anagrams from a given word and determining whether antigram relationships exist between the pairs of generated anagrams, in light of the Word2Vec distributional semantic similarity model. The experiments conducted showed promising results for detecting antigrams.





1 Introduction

An anagram can be defined as a kind of word play in which all the characters of a word are rearranged, using each character exactly once, to generate a new word (or words) which may or may not share a semantic relationship with the root word. ‘Live’ and ‘Vile’ are examples of anagrams. Sometimes it is also possible to create several words from the root word so as to produce a phrase; such is the case with ‘Dormitory’ and ‘Dirty Room’. Anagrams are present in a multitude of domains ranging from literature (a famous example is seen in the novel ‘The Da Vinci Code’, where the phrase “O, Draconian devil!” was an anagram of “Leonardo Da Vinci”) to cyber security (for solving certain kinds of cryptograms such as transposition and permutation ciphers). Antigrams, on the other hand, are a class of anagrams which share an antonymic relationship with their anagram partner. For instance, ‘medicate’ and ‘decimate’ are examples of antigrams. These words are more interesting to study because, instead of a simple word play used to generate new phrases (anagrams) which might have some connection to the original word, the task becomes finding a new word that stands in a specific relationship (antonymy) to the root word.

In light of Natural Language Processing (NLP) tasks, very little work has been performed on antonyms and even less on anagrams and antigrams. Most of the existing work has been conducted from the viewpoint of psychology experiments in which researchers try to understand cognitive processing in the human mind. Adams et al. (2011) showed how the number of syllables influences the difficulty of solving an anagram for both skilled and unskilled problem solvers. A similar task was undertaken by Novick and Sherman (2008). Vincent et al. (2006) created software which enables users to discover novel anagrams and classify existing anagrams on the basis of certain psycholinguistic variables. Anagrams also help in understanding how cognition is linked with age or personality changes [Java1992]. Thalenberg (2016) explored antonyms and how they can be distinguished in vector space using the then recent Word2Vec model. But as mentioned before, none of the above works truly examined anagram or antigram relationships among words from the standpoint of NLP applications. Rather, they were explorations in cognitive science.

Generating single-word anagrams is a relatively trivial task: simply compute every possible permutation of the characters of the given word and eliminate the noisy terms, i.e. those not found in either a corpus or a dictionary. However, automatically detecting antigram relationships among the generated anagrams is a trickier task, as semantic information is required before such analysis can be conducted. This idea forms the main motivation of our work. We wanted to explore how antigrams are related to each other in terms of semantic similarity using the Word2Vec distributional similarity model or, in other words, to determine pairs of antigrams from the anagrams of a given word. The results obtained were compared with Word2Vec similarity scores computed between well-known antonyms, and a notable difference was identified, which is described later in the paper.
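The generation step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `generate_anagrams` and the toy lexicon are assumptions made for the example, and a real system would check candidates against a full dictionary or corpus vocabulary. Note that the number of permutations grows factorially with word length, so this brute-force approach only suits short words.

```python
from itertools import permutations

def generate_anagrams(word, lexicon):
    """Return the sorted single-word anagrams of `word` that occur in `lexicon`."""
    root = word.lower()
    # A set removes the duplicate orderings produced by repeated letters.
    candidates = {"".join(p) for p in permutations(root)}
    # Keep only real words, and drop the root word itself.
    return sorted(c for c in candidates if c in lexicon and c != root)

# Hypothetical toy lexicon standing in for a real dictionary.
TOY_LEXICON = {"live", "vile", "evil", "veil", "tip", "pit"}

print(generate_anagrams("live", TOY_LEXICON))  # ['evil', 'veil', 'vile']
```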

The rest of the paper is organized as follows. Section 2 describes the methodology of our work. Section 3 provides the results and analysis of the experiments. Finally, Section 4 concludes the paper and describes future directions.

2 Proposed Methodology

The key question which our system aims to address is whether pairs of anagrams generated from a target word share an antigram relationship. To answer this question, we devise a simple algorithm (cf. Algorithm 1) which classifies a pair of words (anagrams) as having or not having an antigram relationship on the basis of a threshold value which is set empirically.

Our work makes use of the Word2Vec model [Mikolov et al.2013] for calculating semantic similarity between the generated anagrams. Word2Vec is an unsupervised learning model which creates embeddings of words, i.e. real-valued vector representations of words, from a given corpus. What makes it such a powerful tool for detecting similarity is its unsupervised nature (it does not rely on WordNet or a related resource) and its ability to discover several interesting semantic relationships among words. Word2Vec operates in two modes, viz. continuous bag-of-words (CBOW) and skip-gram. Generally, when talking about similarity between words, the CBOW architecture is selected, as skip-gram is more useful in applications which focus on context prediction rather than on exploring a word for similarity purposes.

In the proposed algorithm, each permutation (P) of the root word W is run through a spell-checker, which was implemented in our work with the help of the PyEnchant package for the Python programming language. A spell-checking module was required to filter out the invalid lexical forms of W (as not all permutations are proper English words). The members of the filtered permutation list are the required anagrams. Using Word2Vec, the semantic similarity between each unique pair (C) of anagrams from the filtered permutation list was computed. The members of pair C were termed C0 and C1. Finally, if the similarity score between C0 and C1 was found to be less than 0.3, the pair was selected as an antigram. The value of 0.3 was set from observation rather than through theoretical study. Choosing a different threshold would yield different results.

Algorithm 1: Anagram generator and Antigram checker
Input: Root word W
Output: List of anagrams of W and those pairs which are antigrams
Begin
  1. Generate every permutation of all the letters in W
  2. for each P in permutation
       a. if P is not a valid lexical form then
            i. remove P from permutation
       b. end
  3. end
  4. Print permutation // This is the Anagram List
  5. Set antigram_list ← []
  6. for each pair C in permutation
       a. z ← sim(C0, C1)
       b. if z < 0.3 then
            i. Add C to antigram_list
       c. end
  7. end
  8. return antigram_list
End

A point of contention regarding the algorithm would be the exclusive use of the Word2Vec model for generating similarity scores between the anagram pairs. We provide two reasons for this choice:

  • We wanted to establish an algorithm reliant on unsupervised models.

  • The main purpose of the algorithm was to examine the nature of word ‘vectors’ to see whether an antonym relationship is equivalent to the word vectors pointing in opposite directions in semantic space.
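Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the validity check and the similarity function are passed in as parameters (`is_valid` and `sim`, both hypothetical names), so that a PyEnchant dictionary check and a Word2Vec similarity call could be plugged in. Here they are replaced by a toy lexicon and a lookup of the scores reported in Table 1, so that the example runs without external resources.

```python
from itertools import combinations, permutations

THRESHOLD = 0.3  # empirical cut-off quoted in the text

def valid_permutations(word, is_valid):
    """Steps 1-3: keep the distinct permutations of `word` that pass `is_valid`."""
    forms = {"".join(p) for p in permutations(word.lower())}
    return sorted(f for f in forms if is_valid(f))

def find_antigrams(word, is_valid, sim, threshold=THRESHOLD):
    """Steps 4-8: return the anagram list and the pairs scoring below `threshold`."""
    anagrams = valid_permutations(word, is_valid)
    antigram_list = [
        (c0, c1)
        for c0, c1 in combinations(anagrams, 2)  # each unique pair C
        if sim(c0, c1) < threshold
    ]
    return anagrams, antigram_list

# Hypothetical stand-ins for the spell-checker and the Word2Vec model.
TOY_LEXICON = {"sheared", "adheres", "headers", "indeed", "denied"}
TABLE_SCORES = {  # similarity scores copied from Table 1
    frozenset(("sheared", "adheres")): -0.05,
    frozenset(("sheared", "headers")): 0.23,
    frozenset(("adheres", "headers")): -0.04,
    frozenset(("indeed", "denied")): 0.42,
}

def toy_sim(a, b):
    """Look up the reported score for an unordered pair of words."""
    return TABLE_SCORES[frozenset((a, b))]

anagrams, antigrams = find_antigrams("sheared", TOY_LEXICON.__contains__, toy_sim)
print(anagrams)   # ['adheres', 'headers', 'sheared']
print(antigrams)  # all three pairs fall below the 0.3 threshold
```

Passing the two checks in as functions keeps the pairing and thresholding logic independent of any particular dictionary or embedding model.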

3 Results Analysis

The semantic similarity scores were obtained using the Word2Vec model trained on the widely available Gigaword corpus, distributed in the form of pre-trained word vectors (the 100-dimensional variant was used). Using such a standardized corpus lends validity to the results obtained.

| Word    | Anagram | Anagram-Pair       | Similarity Score | System Antigram | True Antigram |
|---------|---------|--------------------|------------------|-----------------|---------------|
| Termini | Interim | (termini, interim) | -0.08            | Yes             | Yes           |
| Indeed  | Denied  | (indeed, denied)   | 0.42             | No              | Yes           |
| Tip     | Pit     | (tip, pit)         | 0.28             | Yes             | Yes           |
| Souring | Rousing | (souring, rousing) | 0.08             | Yes             | Yes           |
| Sheared | Adheres | (sheared, adheres) | -0.05            | Yes             | Yes           |
|         | Headers | (sheared, headers) | 0.23             | Yes             | No            |
|         | -       | (adheres, headers) | -0.04            | Yes             | No            |

Table 1: Semantic Similarity Scores for antigram testing

| Antonym     | Similarity Score |
|-------------|------------------|
| Up-Down     | 0.92             |
| Large-Small | 0.93             |
| Top-Bottom  | 0.59             |
| Happy-Sad   | 0.68             |
| Heavy-Light | 0.64             |

Table 2: Semantic Similarity Scores for Antonym pairs

For testing our algorithm, the antigram dataset compiled by Anil (2010) was used, which consists of 50 pairs of well-known antigrams. Using this dataset, the antigrams predicted by the system from among the pairs of generated anagrams could be easily verified. Table 1 presents the results of the antigram tests.

From the results of Table 1, it became clear that a similarity score lower than 0.3 is not a sufficient criterion for an anagram pair to be declared an antigram. A total of 5 pairs of anagrams were actually (true) antigrams. The system predicted that 4 out of these 5 pairs (80%) were antigrams. However, the system also falsely predicted that (sheared, headers) and (adheres, headers) were antigrams, owing to their similarity scores being less than 0.3. This indicates that some form of manual validation is required when verifying antigram nature according to the proposed model. All in all, out of a total of 7 cases, the system classified 4 correctly (the 4 true antigrams it detected), thus achieving an accuracy of 57.14%.
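The figures quoted above can be re-derived directly from Table 1. The short sketch below applies the 0.3 threshold to the reported scores and counts the outcomes; the helper name `evaluate` is an assumption made for illustration, and the precision value is an additional metric that is not reported in the paper.

```python
# (pair, similarity score, true antigram?) copied from Table 1
ROWS = [
    (("termini", "interim"), -0.08, True),
    (("indeed", "denied"), 0.42, True),
    (("tip", "pit"), 0.28, True),
    (("souring", "rousing"), 0.08, True),
    (("sheared", "adheres"), -0.05, True),
    (("sheared", "headers"), 0.23, False),
    (("adheres", "headers"), -0.04, False),
]
THRESHOLD = 0.3

def evaluate(rows, threshold):
    """Count outcomes when a pair is predicted an antigram if score < threshold."""
    tp = sum(1 for _, s, truth in rows if s < threshold and truth)
    fp = sum(1 for _, s, truth in rows if s < threshold and not truth)
    fn = sum(1 for _, s, truth in rows if s >= threshold and truth)
    tn = sum(1 for _, s, truth in rows if s >= threshold and not truth)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "recall": tp / (tp + fn),             # 4 of 5 true antigrams found
        "accuracy": (tp + tn) / len(rows),    # 4 of 7 cases classified correctly
        "precision": tp / (tp + fp),          # not reported in the paper
    }

metrics = evaluate(ROWS, THRESHOLD)
print(round(metrics["recall"], 4), round(metrics["accuracy"], 4))  # 0.8 0.5714
```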

The 0.3 threshold was selected after seeing how antonyms behaved in semantic space. Similarity scores were computed between well-known antonym pairs and a striking observation was made. Table 2 shows their similarity scores.

The average semantic similarity score for the antonym pairs (cf. Table 2) was 0.75. Such a high score is counterintuitive, because the general idea regarding word vectors is that antonyms would point in opposite directions, so that the cosine of the angle between them would be negative. This notion is clearly challenged when the scores from the antigram tests are contrasted with those from the antonym tests. The scores from Table 2 highlight the fact that a word pair need not have a negative similarity score in order to be antonyms. Generally, if the cosine similarity between two word vectors is more than 0.6, they are highly similar to each other. Word2Vec tries to reduce the angle between vectors of similar words, and as such they become clustered very near to each other in semantic space. The smaller the angle, the closer its cosine is to 1. Thus, a negative similarity score is not indicative of antonyms. Rather, it was found that antonyms are strongly related to each other and as such produce high similarity scores (cf. Table 2). The antigrams did not follow this pattern: their similarity scores were extremely low and even negative in some cases (cf. Table 1), yet they still had an antonymic relationship. This indicates that antigrams and antonyms behave differently in semantic space in spite of sharing a common link.
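To make the geometric argument concrete, the sketch below computes cosine similarity for a few toy vectors and re-derives the average of the Table 2 scores. The 2-D vectors are hypothetical and chosen only to show the three reference cases (same direction, opposite direction, orthogonal); real Word2Vec embeddings have many more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors illustrating the three reference cases.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction: -1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))   # orthogonal: 0.0

# Similarity scores copied from Table 2.
TABLE2 = {
    "Up-Down": 0.92,
    "Large-Small": 0.93,
    "Top-Bottom": 0.59,
    "Happy-Sad": 0.68,
    "Heavy-Light": 0.64,
}
mean_score = sum(TABLE2.values()) / len(TABLE2)
print(round(mean_score, 3))  # 0.752
```

All five antonym pairs in Table 2 lie well above zero, whereas a pair of vectors pointing in opposite directions would score -1.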

4 Conclusion and Future Work

The presented work aims to analyse antigrams and anagrams from the standpoint of NLP instead of treating them as subjects falling in the realm of logology (recreational linguistics). We propose a simple technique for detecting antigrams from pairs of anagrams using Word2Vec similarity. Our work is perhaps one of the first to explore this topic using distributional semantics. However, as can be seen from the paper, further research remains to be done in this area particularly involving multi-word anagrams. We propose basic ideas in the direction of anagram and antigram research and hope that future developments are undertaken towards it.


References

  • [Adams et al.2011] J.W. Adams, M. Stone, R.D. Vincent, and S.J. Muncer. 2011. The role of syllables in anagram solution: a Rasch analysis. The Journal of General Psychology, 138(2):94–109, April.
  • [Anil2010] A. Anil. 2010. Antigrams from Word Ways: A retrospective. Word Ways, 43(21).
  • [Java1992] Rosalind I. Java. 1992. Priming and aging: Evidence of preserved memory function in an anagram solution task. The American Journal of Psychology, 105(4):541–548.
  • [Mikolov et al.2013] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. CoRR, abs/1301.3781.
  • [Novick and Sherman2008] Laura R. Novick and Steven J. Sherman. 2008. The effects of superficial and structural information on online problem solving for good versus poor anagram solvers. The Quarterly Journal of Experimental Psychology, 61(7):1098–1120.
  • [Thalenberg2016] Bruna Thalenberg. 2016. Distinguishing antonyms from synonyms in vector space models of semantics. Technical report, University of São Paulo, Brazil.
  • [Vincent et al.2006] Robert D. Vincent, Yael K. Goldberg, and Debra A. Titone. 2006. Anagram software for cognitive research that enables specification of psycholinguistic variables. Behavior Research Methods, 38(2):196–201, May.