What do Asian Religions Have in Common? An Unsupervised Text Analytics Exploration

12/20/2019
by Preeti Sah, et al.

The main source of a religion's teachings is its sacred texts, which vary from religion to religion with factors such as the geographical location and the time of the religion's birth. Despite these differences, the sacred texts may be similar in the lessons they teach their followers. This paper attempts to find such similarity using text mining techniques. A corpus of Asian (Tao Te Ching, Buddhism, Yogasutra, Upanishad) and non-Asian (four Bible texts) sacred texts is used to explore similarity measures such as Euclidean, Manhattan, Jaccard and Cosine on the raw Document Term Matrix [DTM] and the normalized DTM, which reveal similarity based on word usage. The performance of supervised learning algorithms such as K-Nearest Neighbor [KNN], Support Vector Machine [SVM] and Random Forest is measured by their accuracy in predicting the correct sacred text for any given chapter in the corpus. K-means clustering visualizations on Euclidean distances of the raw DTM reveal a pattern of similarity among these sacred texts, with the Upanishads and the Tao Te Ching being the most similar texts in the corpus.
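As a rough illustration of the four measures named in the abstract, the sketch below builds a small document-term matrix and computes Euclidean, Manhattan, Cosine and Jaccard distances between its rows. The toy documents, the whitespace tokenization, the row-sum normalization and all function names are assumptions made for this example; they are not taken from the paper's corpus or pipeline.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Toy stand-ins for chapters (hypothetical; not text from the paper's corpus).
docs = {
    "doc_a": "the way that can be named is not the eternal way",
    "doc_b": "the self is the eternal way of the mind",
    "doc_c": "blessed are the meek for they shall inherit the earth",
}

def build_dtm(documents):
    """Return (vocabulary, {name: raw term-count row}) using whitespace tokens."""
    counts = {name: Counter(text.split()) for name, text in documents.items()}
    vocab = sorted(set().union(*counts.values()))
    return vocab, {name: [c[w] for w in vocab] for name, c in counts.items()}

def normalize(row):
    """Scale a count row so it sums to 1 (one possible normalization)."""
    total = sum(row)
    return [x / total for x in row] if total else list(row)

def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def cosine_distance(u, v):
    """1 minus the cosine similarity; assumes neither row is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def jaccard_distance(u, v):
    """1 minus |shared terms| / |all terms|, based on term presence only."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1.0 - len(present_u & present_v) / len(union)

vocab, dtm = build_dtm(docs)
norm_dtm = {name: normalize(row) for name, row in dtm.items()}

# Print every pairwise distance on the raw DTM, plus Euclidean on the normalized DTM.
for x, y in combinations(sorted(dtm), 2):
    print(x, y,
          "euclidean=%.3f" % euclidean(dtm[x], dtm[y]),
          "manhattan=%d" % manhattan(dtm[x], dtm[y]),
          "cosine=%.3f" % cosine_distance(dtm[x], dtm[y]),
          "jaccard=%.3f" % jaccard_distance(dtm[x], dtm[y]),
          "euclidean_norm=%.3f" % euclidean(norm_dtm[x], norm_dtm[y]))
```

Euclidean and Manhattan distances on raw counts are sensitive to document length, which is one reason to compare them against a normalized DTM as well. Jaccard ignores counts entirely and looks only at which terms occur.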


