Discovering bursts revisited: guaranteed optimization of the model parameters

by Nikolaj Tatti, et al.

One of the classic data mining tasks is to discover bursts: time intervals during which events occur at an abnormally high rate. In this paper we revisit Kleinberg's seminal work, where bursts are discovered using an exponential distribution with a varying rate parameter: the regions where it is more advantageous to set the rate higher are deemed bursty. The model depends on two parameters, the initial rate and the change rate. The initial rate, that is, the rate used when there is no burstiness, is set to the average rate over the whole sequence, while the change rate is provided by the user. We argue that these choices are suboptimal: they lead to a worse likelihood and may cause some existing bursts to be missed. We propose an alternative problem setting in which the model parameters are selected by optimizing the likelihood of the model. While this tweak is trivial from the point of view of the problem definition, it changes the optimization problem greatly. To solve the problem in practice, we propose efficient (1 + ϵ) approximation schemes. Finally, we demonstrate empirically that with this setting we are able to discover bursts that would otherwise have gone undetected.
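For context, the baseline the abstract revisits is Kleinberg's two-state burst model: gaps between events are scored under an exponential distribution whose rate is either a baseline (the average rate over the sequence) or that baseline scaled by a user-chosen factor, with a penalty for entering the burst state, and the best state sequence is found by dynamic programming. The sketch below is a minimal illustrative implementation under those assumptions; the function name and the default values `s=2.0` and `gamma=1.0` are our own choices, not taken from the paper, which instead proposes optimizing these parameters by likelihood.

```python
import math

def kleinberg_bursts(gaps, s=2.0, gamma=1.0):
    """Two-state burst labeling of inter-arrival gaps (Kleinberg-style).

    State 0 uses the baseline rate (the average event rate over the whole
    sequence); state 1 uses a rate s times higher. Entering the burst
    state pays a penalty of gamma * ln(n). Returns one state (0 or 1)
    per gap, from the minimum-cost state sequence found by Viterbi-style
    dynamic programming on negative log-likelihoods.
    """
    n = len(gaps)
    lam0 = n / sum(gaps)        # baseline rate = average rate
    lam1 = s * lam0             # burst rate, a fixed multiple of the baseline
    rates = (lam0, lam1)
    trans = gamma * math.log(n) # cost of switching up into the burst state

    # cost[j] = best accumulated cost of a path currently in state j
    cost = [0.0, trans]
    back = []
    for x in gaps:
        # -log of the exponential density: lam * x - log(lam)
        emit = [lam * x - math.log(lam) for lam in rates]
        new_cost, choices = [], []
        for j in (0, 1):
            # moving up to the burst state costs trans; moving down is free
            cands = [cost[i] + (trans if j > i else 0.0) for i in (0, 1)]
            best = min((0, 1), key=lambda i: cands[i])
            new_cost.append(cands[best] + emit[j])
            choices.append(best)
        cost, back = new_cost, back + [choices]

    # backtrack the optimal state sequence
    state = min((0, 1), key=lambda i: cost[i])
    states = [state]
    for choices in reversed(back):
        state = choices[state]
        states.append(state)
    states.reverse()
    return states[1:]           # one state per gap
```

On a sequence of unit gaps with a run of much shorter gaps in the middle, the savings in emission cost inside the run outweigh the fixed switching penalty, so the middle gaps are labeled as bursty while the rest stay at the baseline.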




Using Background Knowledge to Rank Itemsets

Assessing the quality of discovered results is an important open problem...

Mining Closed Strict Episodes

Discovering patterns in a sequence is an important aspect of data mining...

Mining Closed Episodes with Simultaneous Events

Sequential pattern discovery is a well-studied field in data mining. Epi...

MaxMin-L2-SVC-NCH: A New Method to Train Support Vector Classifier with the Selection of Model's Parameters

The selection of model's parameters plays an important role in the appli...

A note on estimating Bass model parameters

Bass (1969) proposed a model (the Bass model) for the timing of adoption...

A new approach to learning in Dynamic Bayesian Networks (DBNs)

In this paper, we revisit the parameter learning problem, namely the est...

It Is Likely That Your Loss Should be a Likelihood

We recall that certain common losses are simplified likelihoods and inst...
