Discovering bursts revisited: guaranteed optimization of the model parameters

02/05/2019
by Nikolaj Tatti, et al.

One of the classic data mining tasks is to discover bursts: time intervals where events occur at an abnormally high rate. In this paper we revisit Kleinberg's seminal work, where bursts are discovered by using an exponential distribution with a varying rate parameter: the regions where it is more advantageous to set the rate higher are deemed bursty. The model depends on two parameters, the initial rate and the change rate. The initial rate, that is, the rate used when there is no burstiness, was set to the average rate over the whole sequence. The change rate was provided by the user. We argue that these choices are suboptimal: they lead to a worse likelihood and may cause some existing bursts to be missed. We propose an alternative problem setting, where the model parameters are selected by optimizing the likelihood of the model. While this tweak is trivial from the problem definition point of view, it changes the optimization problem greatly. To solve the problem in practice, we propose efficient (1 + ϵ) approximation schemes. Finally, we demonstrate empirically that with this setting we are able to discover bursts that would otherwise have gone undetected.
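To make the role of the two parameters concrete, here is a minimal Python sketch of a two-state model in the spirit of the setup described above. It is an illustration only, not the paper's approximation scheme: the function names, the flat `switch_cost` penalty for entering the bursty state, and the toy gap sequence are all assumptions introduced here. Each inter-arrival gap is scored with an exponential log-density at either the base rate or the base rate multiplied by the change rate, and a Viterbi pass picks the best labelling for fixed parameters.

```python
import math


def exp_loglik(gap, rate):
    """Log-density of one exponential inter-arrival gap at the given rate."""
    return math.log(rate) - rate * gap


def best_states(gaps, base_rate, change_rate, switch_cost=1.0):
    """Viterbi over two states: 0 = base rate, 1 = base_rate * change_rate.

    Returns (score, states). The score is the total log-likelihood minus
    switch_cost for every entry into the bursty state (hypothetical penalty).
    """
    rates = (base_rate, base_rate * change_rate)
    # score[s] = best penalized log-likelihood of a labelling ending in state s;
    # the sequence is assumed to start in state 0, so starting bursty is penalized.
    score = [exp_loglik(gaps[0], rates[0]),
             exp_loglik(gaps[0], rates[1]) - switch_cost]
    back = []
    for gap in gaps[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            # Moving 0 -> 1 is penalized; staying or leaving a burst is free.
            cands = [score[p] - (switch_cost if (p == 0 and s == 1) else 0.0)
                     for p in (0, 1)]
            p_best = 0 if cands[0] >= cands[1] else 1
            new_score.append(cands[p_best] + exp_loglik(gap, rates[s]))
            pointers.append(p_best)
        score = new_score
        back.append(pointers)
    # Trace back from the better final state.
    s = 0 if score[0] >= score[1] else 1
    best = score[s]
    states = [s]
    for pointers in reversed(back):
        s = pointers[s]
        states.append(s)
    states.reverse()
    return best, states


# Toy sequence: slow gaps, a run of short gaps (the burst), slow gaps again.
gaps = [1.0] * 6 + [0.1] * 6 + [1.0] * 6
average_rate = len(gaps) / sum(gaps)  # base rate fixed to the overall average

score_avg, states_avg = best_states(gaps, average_rate, change_rate=10.0)
score_alt, states_alt = best_states(gaps, 1.0, change_rate=10.0)
print(score_avg, score_alt)
print(states_alt)
```

On this toy input, fixing the base rate to the overall average gives a lower penalized log-likelihood than a base rate of 1.0, because the burst itself inflates the average. This is only a single hand-picked comparison; the paper's contribution is to optimize both parameters with guarantees rather than to try values by hand.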


research
02/08/2019

Using Background Knowledge to Rank Itemsets

Assessing the quality of discovered results is an important open problem...
research
04/14/2019

Mining Closed Strict Episodes

Discovering patterns in a sequence is an important aspect of data mining...
research
04/16/2019

Mining Closed Episodes with Simultaneous Events

Sequential pattern discovery is a well-studied field in data mining. Epi...
research
07/14/2023

MaxMin-L2-SVC-NCH: A New Method to Train Support Vector Classifier with the Selection of Model's Parameters

The selection of model's parameters plays an important role in the appli...
research
03/10/2022

A note on estimating Bass model parameters

Bass (1969) proposed a model (the Bass model) for the timing of adoption...
research
12/21/2018

A new approach to learning in Dynamic Bayesian Networks (DBNs)

In this paper, we revisit the parameter learning problem, namely the est...
research
07/12/2020

It Is Likely That Your Loss Should be a Likelihood

We recall that certain common losses are simplified likelihoods and inst...
