Mining a Sub-Matrix of Maximal Sum

09/25/2017
by Vincent Branders, et al.

Biclustering techniques have been widely used to identify homogeneous subgroups within large data matrices, such as subsets of genes similarly expressed across subsets of patients. Mining a max-sum sub-matrix is a related but distinct problem: one looks for a (not necessarily contiguous) rectangular sub-matrix whose entries have a maximal sum. Le Van et al. (Ranked Tiling, 2014) already illustrated its applicability to gene expression analysis and addressed it with a constraint programming (CP) approach combined with large neighborhood search (CP-LNS). In this work, we exhibit some key properties of this NP-hard problem and define a bounding function such that larger problems can be solved in reasonable time. Two algorithms are proposed to exploit these properties: a CP approach with a global constraint (CPGC) and a mixed integer linear programming (MILP) model. Experiments on both synthetic and real gene expression data exhibit the characteristics of these approaches and their relative benefits over the original CP-LNS method. Overall, the CPGC approach tends to be the fastest to produce a good solution. Yet the MILP model is arguably the easiest to formulate and can also be competitive.
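To make the problem statement concrete, here is a minimal Python sketch of an exhaustive solver. It is not the CPGC, MILP, or CP-LNS method from the paper; it only illustrates the objective. The function name `max_sum_submatrix` and the example matrix are assumptions introduced for illustration. The sketch relies on one simple observation: once a set of rows is fixed, the best set of columns consists of exactly those columns whose sum over the selected rows is positive. Enumerating all row subsets is exponential in the number of rows, so this is only usable on tiny matrices.

```python
from itertools import combinations


def max_sum_submatrix(matrix):
    """Brute-force max-sum sub-matrix (illustrative sketch, exponential time).

    Returns (best_sum, selected_rows, selected_cols). The empty selection,
    with sum 0, is the fallback when no selection has a positive sum.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    best = (0, (), ())

    # Enumerate every non-empty subset of rows.
    for k in range(1, n_rows + 1):
        for rows in combinations(range(n_rows), k):
            # Sum of each column restricted to the selected rows.
            col_sums = [sum(matrix[r][c] for r in rows) for c in range(n_cols)]
            # For fixed rows, keeping exactly the positive columns is optimal.
            cols = tuple(c for c, s in enumerate(col_sums) if s > 0)
            total = sum(col_sums[c] for c in cols)
            if total > best[0]:
                best = (total, rows, cols)
    return best


# Hypothetical 3x3 example: rows and columns need not be contiguous.
example = [
    [2, -1, 3],
    [-4, 5, -2],
    [1, -3, 4],
]
print(max_sum_submatrix(example))
```

On this example the selected rows and columns are both non-adjacent, which is what distinguishes the problem from contiguous maximum-sum rectangle search.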
