New Linear-time Algorithm for SubTree Kernel Computation based on Root-Weighted Tree Automata

02/02/2023
by   Ludovic Mignot, et al.
Tree kernels have been proposed for use in many areas, such as the automatic learning of natural language applications. In this paper, we propose a new linear-time algorithm for SubTree kernel computation based on the concept of weighted tree automata. First, we introduce a new class of weighted tree automata, called Root-Weighted Tree Automata, and their associated formal tree series. Then we define, from this class, the SubTree automata, which are compact computational models for finite tree languages. This allows us to design a theoretically guaranteed linear-time algorithm for computing the SubTree kernel based on weighted tree automata intersection. The key idea behind the proposed algorithm is to replace the DAG reduction and node-sorting steps used in previous approaches with the computation of state equivalence classes, which the weighted tree automata approach allows. Our approach has three major advantages: it is output-sensitive, it is insensitive to the tree type (ordered versus unordered trees), and it is well adapted to any incremental tree-kernel-based learning method. Finally, we conduct a variety of comparative experiments on a wide range of synthetic tree language datasets suited to an in-depth analysis of the algorithm. The results show that the proposed algorithm outperforms state-of-the-art methods.
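To make the quantity being computed concrete, here is a minimal Python sketch of a SubTree kernel between two ordered trees, read as the number of pairs of identical complete subtrees (a node together with all its descendants). It is not the Root-Weighted Tree Automata algorithm described in the abstract: it is a plain hash-consing baseline, and the tree encoding `(label, [children])` as well as the names `subtree_counts` and `subtree_kernel` are assumptions made for illustration. Giving each distinct subtree an integer class identifier is only loosely analogous to the state equivalence classes mentioned above.

```python
from collections import Counter


def subtree_counts(tree, ids, counts):
    """Return the class id of `tree` and count every complete subtree in it.

    `tree` is a hypothetical encoding: (label, [child, child, ...]).
    `ids` maps (label, child class ids) to an integer class id and is
    shared between trees so that equal subtrees get equal ids.
    """
    label, children = tree
    # Children are processed first, so a node's key is built from class ids.
    key = (label, tuple(subtree_counts(c, ids, counts) for c in children))
    class_id = ids.setdefault(key, len(ids))
    counts[class_id] += 1
    return class_id


def subtree_kernel(t1, t2):
    """Sum, over subtree classes, of the product of occurrence counts."""
    ids = {}
    c1, c2 = Counter(), Counter()
    subtree_counts(t1, ids, c1)
    subtree_counts(t2, ids, c2)
    return sum(n * c2[i] for i, n in c1.items())


# Ordered trees: a(b, b) and a(b, c).
t1 = ("a", [("b", []), ("b", [])])
t2 = ("a", [("b", []), ("c", [])])
print(subtree_kernel(t1, t2))  # only the leaf b is shared: 2 * 1 = 2
```

The sketch treats trees as ordered, so `a(b, c)` and `a(c, b)` fall into different classes, and it recurses on depth, so very deep trees would need an iterative traversal.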

research
05/12/2023

Parallel Tree Kernel Computation

Tree kernels are fundamental tools that have been leveraged in many appl...
research
09/17/2021

A Nivat Theorem for Weighted Alternating Automata over Commutative Semirings

In this paper, we give a Nivat-like characterization for weighted altern...
research
01/02/2020

Incremental Monoidal Grammars

In this work we define formal grammars in terms of free monoidal categor...
research
11/03/2019

Automata Learning: An Algebraic Approach

We propose a generic categorical framework for learning unknown formal l...
research
07/27/2021

Version Space Algebras are Acyclic Tree Automata

Version space algebras are ways of representing spaces of programs which...
research
06/20/2018

The compressions of reticulation-visible networks are tree-child

Rooted phylogenetic networks are rooted acyclic digraphs. They are used ...
research
11/03/2021

Linear-time Minimization of Wheeler DFAs

Wheeler DFAs (WDFAs) are a sub-class of finite-state automata which is p...
