
An example of application of optimal sample allocation in a finite population
The problem of estimating a proportion of objects with particular attrib...

Regular sequences and synchronized sequences in abstract numeration systems
The notion of b-regular sequences was generalized to abstract numeration...

Information Distance: New Developments
In pattern recognition, learning, and data mining one obtains informatio...

On Greedy Algorithms for Binary de Bruijn Sequences
We propose a general greedy algorithm for binary de Bruijn sequences, ca...

Efficient Approximation Algorithms for String Kernel Based Sequence Classification
Sequence classification algorithms, such as SVM, require a definition of...

Optimal Learning from the Doob-Dynkin lemma
The Doob-Dynkin Lemma gives conditions on two functions X and Y that ens...

A unifying method for the design of algorithms canonizing combinatorial objects
We devise a unified framework for the design of canonization algorithms....
Analogical Dissimilarity: Definition, Algorithms and Two Experiments in Machine Learning
This paper defines the notion of analogical dissimilarity between four objects, with a special focus on objects structured as sequences. First, it studies the case where the four objects have null analogical dissimilarity, i.e. are in analogical proportion. Second, when one of these objects is unknown, it gives algorithms to compute it. Third, it tackles the problem of defining analogical dissimilarity, a measure of how far four objects are from being in analogical proportion. In particular, when the objects are sequences, it gives a definition and an algorithm based on an optimal alignment of the four sequences. It also gives learning algorithms, i.e. methods to find the triple of objects in a learning sample that has the least analogical dissimilarity with a given object. Two practical experiments are described: the first is a classification problem on benchmarks of binary and nominal data; the second shows how generating sequences by solving analogical equations enables a handwritten character recognition system to be adapted rapidly to a new writer.
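To make the three notions concrete on binary data, here is a minimal Python sketch. The abstract does not state the formula, so the definition used here is an assumption: for single bits, the dissimilarity is taken as |(a - b) - (c - d)|, which is 0 exactly when a : b :: c : d holds, and for binary vectors the per-component values are summed. The function names (`ad_bit`, `ad_vec`, `solve_bit`, `closest_triple`) are hypothetical, and the brute-force search over triples is an illustration, not the paper's learning algorithm.

```python
from itertools import permutations


def ad_bit(a, b, c, d):
    """Assumed analogical dissimilarity of four bits.

    Returns 0 when a : b :: c : d holds (a - b == c - d),
    otherwise 1 or 2 depending on how far the bits are from a proportion.
    """
    return abs((a - b) - (c - d))


def ad_vec(a, b, c, d):
    """Assumed dissimilarity of four equal-length binary vectors: sum over components."""
    return sum(ad_bit(*bits) for bits in zip(a, b, c, d))


def solve_bit(a, b, c):
    """Solve the analogical equation a : b :: c : x for a bit x.

    Returns None when no bit satisfies the proportion (e.g. 0 : 1 :: 1 : x).
    """
    x = c - a + b
    return x if x in (0, 1) else None


def closest_triple(sample, target):
    """Brute-force search for the ordered triple (a, b, c) of distinct sample
    items minimizing ad_vec(a, b, c, target). Cost is cubic in the sample size.
    """
    return min(permutations(sample, 3), key=lambda t: ad_vec(*t, target))


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0)]
    target = (1, 1)
    best = closest_triple(sample, target)
    print(best, ad_vec(*best, target))
```

Under this assumed definition, a classifier in the spirit of the abstract would take the class labels of the closest triple and solve the analogical equation on those labels to predict the label of the target.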