Distance maps between Japanese kanji characters based on hierarchical optimal transport

04/05/2023
by Dominic Schuhmacher, et al.

We introduce a general framework for assigning distances between kanji based on their dissimilarity. What we mean by this term may depend on the concrete application. The only assumption we make is that the dissimilarity between two kanji is adequately expressed as a weighted mean of penalties obtained from matching nested structures of components in an optimal way. For the cost of matching, we suggest a number of modules that can be freely combined or replaced with other modules, including the relative unbalanced ink transport between registered components, the distance between the transformations required for registration, and the difference in prespecified labels. As a proof of concept, we give a concrete example of a kanji distance function obtained in this way. Based on this function, we produce 2D kanji maps by multidimensional scaling and a table of 100 randomly selected Jōyō kanji with their 16 nearest neighbors. Our kanji distance functions can be used to help Japanese learners from non-CJK backgrounds acquire kanji literacy. In addition, they may assist editors of kanji dictionaries in presenting their materials and may serve in text processing and optical character recognition systems for assessing the likelihood of errors.
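To make the last two steps concrete, here is a minimal Python sketch of how a precomputed distance matrix could be turned into a 2D map and a nearest-neighbor list. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which variant of multidimensional scaling is used, so classical (Torgerson) scaling is substituted here, and the function names `classical_mds` and `nearest_neighbors` as well as the toy distance matrix are hypothetical. The kanji distance function itself is not reproduced.

```python
import numpy as np


def classical_mds(dist, dim=2):
    """Embed items in `dim` dimensions from a symmetric distance matrix.

    Classical (Torgerson) scaling; an assumed stand-in for whichever
    multidimensional scaling variant the paper actually uses.
    """
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest `dim`.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues that arise for non-Euclidean distances.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def nearest_neighbors(dist, index, k):
    """Return the indices of the k items closest to item `index`."""
    order = np.argsort(dist[index], kind="stable")
    return [int(j) for j in order if j != index][:k]


# Hypothetical toy data: four items whose pairwise distances come from
# points in the plane, standing in for a real kanji distance matrix.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
toy_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(toy_dist, dim=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
neighbors_of_first = nearest_neighbors(toy_dist, index=0, k=2)
```

With a real kanji distance matrix in place of `toy_dist`, `coords` would give the 2D map positions and `nearest_neighbors(dist, i, 16)` a neighbor list of the kind tabulated in the paper. For distances that are not exactly Euclidean, the 2D map is only an approximation of the original distances.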


