pyjsd
Python implementation of the Jensen-Shannon divergence
The coefficient of determination, known as R^2, is commonly used as a goodness-of-fit criterion for fitting linear models. R^2 is somewhat controversial when fitting nonlinear models, although it may be generalised on a case-by-case basis to deal with specific models such as the logistic model. Assume we are fitting a parametric distribution to a data set using, say, the maximum likelihood estimation method. A general approach to measure the goodness-of-fit of the fitted parameters, which we advocate herein, is to use a nonparametric measure for model comparison between the raw data and the fitted model. In particular, for this purpose we put forward the Jensen-Shannon divergence (JSD) as a metric, which is bounded and has an intuitive information-theoretic interpretation. We demonstrate, via a straightforward procedure making use of the JSD, that it can be used as part of maximum likelihood estimation or curve fitting as a measure of goodness-of-fit, including the construction of a confidence interval for the fitted parametric distribution. We also propose that the JSD can be used more generally in nonparametric hypothesis testing for model selection.
We assume a general scenario, where we have some data from which we derive an empirical distribution that is fitted with maximum likelihood [33] or curve fitting [11] to some, possibly parametric, distribution [26].
The coefficient of determination, $R^2$ [32], is a well-known measure of goodness-of-fit for linear regression models. Despite its wide use, in its original form it is not fully adequate for nonlinear models; see [1], where the author recommends defining $R^2$ as a comparison of a given model to the null model, claiming that this view allows for the generalisation of $R^2$. Further, in [38] the inappropriateness of $R^2$ for nonlinear models is clearly demonstrated via a series of Monte Carlo simulations. In [5], a novel $R^2$ measure based on the Kullback-Leibler divergence [7] was proposed as a measure of goodness-of-fit for regression models in the exponential family.
Alternative nonparametric methods have also been proposed. In particular, the Akaike information criterion (AIC) and its counterpart the Bayesian information criterion (BIC) [4, 41] are widely used estimators for model selection. Both AIC and BIC are asymptotically valid maximum likelihood estimators, with penalty terms to discourage overfitting. The likelihood ratio test is also an established method for model selection between a null model and an alternative maximum likelihood model [42, 27]. Despite the popularity of maximum likelihood methods, there is some controversy in their application as goodness-of-fit tests [16].
The Jensen-Shannon divergence (JSD) [28, 9] is a symmetric form of the nonparametric Kullback-Leibler divergence [7], providing a measure of distance between two probability distributions. It has been employed in a wide range of applications such as detecting edges in digital images [13], measuring the similarity of texts [31], training adversarial neural networks [14], comparison of genomes in bioinformatics [37], distinguishing between quantum states in physics [29] and as a measure of distance between distributions in a social setting [10].
Here we apply the JSD as an alternative measure of goodness-of-fit of a parametric distribution, acting as the model, to an empirical distribution, which comprises the raw data. The JSD provides a direct measure of goodness-of-fit without the need for the maximum value of the likelihood function, as used in the AIC and BIC, or any linearity assumptions about the model being fitted, as often made when using $R^2$.
The rest of the paper is organised as follows. In Section 2, we introduce the JSD and some of its characteristics. In Section 3, we define the JSD as a measure of goodness-of-fit within the context of distribution fitting and define the notion of the JSD factor. In Section 4, we describe our experiments, with simulated data in Subsection 4.1 and empirical data in Subsection 4.2, to test the viability of using the JSD as a measure of goodness-of-fit. Finally, in Section 5, we give our concluding remarks.
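Let $P$ and $Q$ be finite distributions, and $M = \frac{1}{2}(P + Q)$ be a mixture distribution of $P$ and $Q$. Then the JSD between $P$ and $Q$, denoted by $JSD(P, Q)$, is given by
$$JSD(P, Q) = H(M) - \frac{1}{2} \left( H(P) + H(Q) \right), \qquad (1)$$
where the entropy of a distribution $P$, denoted by $H(P)$, is defined as
$$H(P) = - \sum_{i} p_i \log p_i, \qquad (2)$$
with $p_i$ the probability that $P$ assigns to the $i$th outcome.
We note that the JSD is bounded between $0$ and $\log 2$ [9] (that is, between $0$ and $1$ when logarithms are taken to base 2), and can thus be readily normalised; for convenience we will assume that the JSD is normalised. Moreover, the JSD may still be used if $P$ and $Q$ are improper, i.e. their sum is less than one; see for example [10]. We further note that in order for the JSD to be a metric we need to take the square root of $JSD(P, Q)$ [9], and thus whenever we compute the value of the JSD we will assume that its square root is taken.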
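As a minimal sketch of definitions (1) and (2), the following Python snippet computes the JSD of two finite distributions and its square root. It is not the pyjsd code; the function names are illustrative, and base-2 logarithms are assumed so that the value is already normalised to lie between 0 and 1.

```python
import math

def entropy(p):
    """Shannon entropy in bits, as in equation (2); zero terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jsd(p, q):
    """Jensen-Shannon divergence of equation (1), with M = (P + Q) / 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

def jsd_metric(p, q):
    """Square root of the JSD; max() guards against tiny negative rounding errors."""
    return math.sqrt(max(jsd(p, q), 0.0))

# Two distributions over three outcomes that overlap only on the middle one.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(jsd(p, q), jsd_metric(p, q))
```

Identical distributions give a value of zero and distributions with disjoint supports give one, which matches the bounds stated above for base-2 logarithms.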
The intuition behind the JSD is as follows. We have knowledge of both distributions $P$ and $Q$ and we would like to know how distant they are from each other. In order to do so, we set up a simple experiment, where we take a sample $x$ of length one from the mixture distribution $M$. Now, how much information do we gain from observing $x$? The answer is exactly $JSD(P, Q)$. If we are none the wiser, i.e. $x$ could have equally come from $P$ or $Q$, then $JSD(P, Q) = 0$, and for all intents and purposes we consider $P$ to be equal to $Q$. On the other hand, we may have some information about whether $x$ comes from $P$ or $Q$, and the information we gain on a scale between $0$ and $1$ is exactly $JSD(P, Q)$.
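Now, let $X$ be a discrete random variable associated with $M$, and let $Z$ be a binary indicator random variable, such that $X$ is associated with $P$ if $Z = 0$ and with $Q$ if $Z = 1$. As in [15], it can be shown that
$$JSD(P, Q) = I(X; Z), \qquad (3)$$
where
$$I(X; Z) = H(X) - H(X \mid Z), \qquad (4)$$
and $H(X \mid Z)$ is the conditional entropy of $X$ conditioned on $Z$ [7].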
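The identity between the JSD and the mutual information can be checked numerically. The sketch below assumes that the indicator variable is a fair coin (consistent with an equal-weight mixture), builds the joint distribution of the sample and the indicator, and compares the mutual information computed directly from that joint distribution with the entropy form of the JSD. All names are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jsd(p, q):
    """Entropy form of the JSD with an equal-weight mixture."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

def mutual_information(p, q):
    """I(X; Z) for Z a fair coin, with X ~ P when Z = 0 and X ~ Q when Z = 1."""
    joint = [[0.5 * a for a in p], [0.5 * b for b in q]]  # joint[z][i] = Pr(Z=z, X=i)
    marginal_x = [joint[0][i] + joint[1][i] for i in range(len(p))]
    total = 0.0
    for z in (0, 1):
        for i, pxz in enumerate(joint[z]):
            if pxz > 0:
                # Pr(Z = z) is 0.5 for both values of z.
                total += pxz * math.log2(pxz / (marginal_x[i] * 0.5))
    return total

p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
print(jsd(p, q), mutual_information(p, q))
```

The two printed values agree up to floating-point rounding.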
It is worth noting that, asymptotically, the JSD is distributed as a quarter of a Chi-squared distribution [19], with the number of degrees of freedom as given in [9], i.e.,
(5)
where the right-hand side of (5) is the Chi-squared goodness-of-fit test statistic [40]. When the number of degrees of freedom is large, a Normal approximation of the Chi-squared statistic is often used [44]; also see [3] for an application of this approximation.
The JSD extends to cumulative distributions in a natural manner by replacing probability mass functions with their cumulative counterparts. More specifically, this is formalised using the extension of the Kullback-Leibler divergence [7] in [45, Definition 2.1] to cumulative distributions, and the fact that
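$$JSD(P, Q) = \frac{1}{2} \left( KL(P \,\|\, M) + KL(Q \,\|\, M) \right), \qquad (6)$$
where the Kullback-Leibler divergence of two distributions, $P$ and $Q$, denoted by $KL(P \,\|\, Q)$, is defined as
$$KL(P \,\|\, Q) = \sum_{i} p_i \log \frac{p_i}{q_i}. \qquad (7)$$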
An important fact to note is that the square root of the cumulative version of the JSD is also a metric [35], and thus it essentially possesses the same properties as the non-cumulative JSD, but with a different normalisation constant. Moreover, it is often advantageous to use the cumulative distribution instead of the probability mass function, as it may be easier to interpret and manipulate, and it also acts to smooth the data. For these reasons, we prefer to employ the cumulative JSD in our experiments, and from now on, for convenience, will not make any distinction between the two versions, referring to both simply as the JSD.
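The Kullback-Leibler form of the JSD can be sketched as follows for probability mass functions. This is an illustration with hypothetical function names; it does not implement the cumulative variant of [45], whose exact definition is not reproduced in the text above.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def jsd_via_kl(p, q):
    """JSD as the average divergence of P and Q from their equal-weight mixture M."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so the ratios inside kl() are well defined.
    return (kl(p, m) + kl(q, m)) / 2

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(jsd_via_kl(p, q), jsd_via_kl(q, p))
```

Unlike the Kullback-Leibler divergence itself, the result is symmetric in its arguments and stays finite when the supports differ.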
In the experiments we will make use of the bootstrap method [8], which is a technique for computing a confidence interval that relies on random resampling with replacement from a given sample data set. The bootstrap method is usually nonparametric, making no distributional assumptions about the data set employed.
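A percentile bootstrap confidence interval can be sketched with the standard library alone. The helper below is an assumption-laden illustration rather than the procedure of [8]: the function name, the index convention for the percentiles and the exponential toy data are all choices made here for the example.

```python
import random
import statistics

def bootstrap_percentile_ci(data, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(data) at level 1 - alpha."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, recompute the statistic, and sort the estimates.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower_idx = max(0, round(alpha / 2 * n_resamples))
    upper_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return estimates[lower_idx], estimates[upper_idx]

# Toy data: 200 draws from an exponential distribution with mean 50.
rng = random.Random(1)
sample = [rng.expovariate(1 / 50) for _ in range(200)]
lb, ub = bootstrap_percentile_ci(sample, statistics.mean)
print(lb, statistics.mean(sample), ub)
```

Any statistic that maps a resampled data set to a number, such as a JSD against a fixed reference distribution, could be passed in place of `statistics.mean`.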
Making use of the JSD as a measure of goodness-of-fit is quite straightforward. Assume that $Q$ is a sample from a parametric distribution $D$, with parameters $\theta$, and that $D$ is fitted with maximum likelihood [33] or curve fitting [12] to an empirical distribution, $P$.
The goodness-of-fit of the distribution $D$, with parameters $\theta$, to the empirical distribution $P$ is now defined as
$$JSD(P, Q), \qquad (8)$$
where $Q$ is a finite distribution, which is distributed according to $D$ with parameters $\theta$. We note that (8) does not restrict $Q$, and so it is also possible to measure one empirical distribution against any other.
The Bayes factor [20] is a method for model comparison, taking the ratio of the likelihood of the data under the alternative hypothesis to the likelihood of the data under the null hypothesis. In particular, the Bayes factor is advocated as an alternative method for null hypothesis significance testing, which depends only on the data and considers the models arising from both the null and alternative hypotheses [18].
The JSD factor is a reformulation of the Bayes factor with the JSD, defined as
$$\frac{JSD(P, Q_0)}{JSD(P, Q_1)}, \qquad (9)$$
where $P$ is the empirical distribution and $Q_0$ and $Q_1$ are the distributions fitted under the null and alternative hypotheses, respectively; this is the odds ratio of choosing the alternative hypothesis, $H_1$, in preference to the null hypothesis, $H_0$.
To assess the use of the JSD as a goodness-of-fit measure we provide experimental results with simulated data of various parametric distributions including the Uniform, Normal, Log-normal, Exponential, Gamma, Beta, Weibull, Pareto [26] and $q$-Gaussian [39] distributions. Our methodology for the experiments with simulated data (see Subsection 4.1) was as follows:
First we generated a data set, say $P$, of size $n$ from a given distribution, say $D$, with chosen parameters, say $\theta$, which was then taken to be the empirical distribution.
We then considered $P$ to be distributed according to a hypothesised distribution, $D'$, where $D'$ may not be the same as $D$, and used the maximum likelihood method to obtain the parameters of $P$, say $\theta'$, assuming its distribution was $D'$. (Obviously, if $D' = D$, then $\theta'$ is expected to be very close to $\theta$.)
Next, assuming that $P$ was distributed according to $D'$ with parameters $\theta'$, we generated a second data set, $Q$, from distribution $D'$ with parameters $\theta'$.
Finally, we evaluated $JSD(P, Q)$ as a measure of the goodness-of-fit of $D'$, with parameters $\theta'$, to $P$, and computed a 95% confidence interval for the JSD from 1000 bootstrap resamples using the basic bootstrap percentile method [8, Section 5.3.1].
For the experiments with empirical data sets we followed the same methodology, with the difference that the data set was an empirical data set rather than a generated one.
For each set of experiments we followed the methodology described above for several possible alternative parametric distributions, and then computed the JSD factor between the best and a lower performing distribution. As mentioned towards the end of Section 2, the cumulative version of the JSD was used in all the experiments.
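The methodology above can be sketched end to end with the standard library. This is a simplified illustration under several assumptions, not the authors' Matlab or Python code: it uses a histogram-based, non-cumulative JSD (the experiments used the cumulative version), a fixed number of bins, a standard Normal as the given distribution, and Normal and Uniform as the hypothesised distributions. The bootstrap step is omitted for brevity, and all names are hypothetical.

```python
import math
import random
import statistics

def entropy(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

def sqrt_jsd(p, q):
    """Square root of the base-2 JSD between two finite distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return math.sqrt(max(entropy(m) - (entropy(p) + entropy(q)) / 2, 0.0))

def histogram(data, lo, hi, bins):
    """Relative frequencies of data over equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in data:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    return [c / len(data) for c in counts]

def goodness_of_fit(d1, d2, bins=50):
    """JSD between two data sets, binned on their common range."""
    lo, hi = min(min(d1), min(d2)), max(max(d1), max(d2))
    return sqrt_jsd(histogram(d1, lo, hi, bins), histogram(d2, lo, hi, bins))

rng = random.Random(42)
n = 20000
# Step 1: generate the first data set from the given distribution.
d1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Step 2: maximum likelihood fits of the hypothesised distributions.
mu, sigma = statistics.mean(d1), statistics.pstdev(d1)  # Normal
a, b = min(d1), max(d1)                                  # Uniform
# Step 3: generate a second data set from each fitted distribution.
d2_normal = [rng.gauss(mu, sigma) for _ in range(n)]
d2_uniform = [rng.uniform(a, b) for _ in range(n)]
# Step 4: evaluate the JSD for each hypothesis and take their ratio.
jsd_normal = goodness_of_fit(d1, d2_normal)
jsd_uniform = goodness_of_fit(d1, d2_uniform)
factor = jsd_uniform / jsd_normal
print(jsd_normal, jsd_uniform, factor)
```

In this toy setting the Normal hypothesis yields the smaller JSD, so the ratio exceeds one; its exact value depends on the binning and the sample size chosen here.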
The tables showing the results are given in the appendix at the end of the paper. For Normal and Log-normal distributions, the first parameter is the mean and the second the standard deviation, while for Gamma and Weibull distributions, the first parameter is the shape and the second the scale. On the other hand, for the Uniform distribution the first parameter is the lower bound and the second the upper bound, for the Beta distribution the first parameter is $\alpha$ and the second $\beta$, while for the $q$-Gaussian the first parameter is the shape and the second the scale, as in (10). The Exponential and Pareto distributions are characterised by a single parameter. In addition, the lower bound of the 95% bootstrap confidence interval is denoted by lb and the upper bound by ub.
We now provide commentary on the results for the simulated data, shown in Tables 4, 5, 6, 7, 8, 9 and 10, which are given in an appendix at the end of the paper. We note that all the computations described in this subsection were carried out using the Matlab software package. Table 2 summarises, for all experiments, the JSD factors between the best (highlighted in bold) and second best (highlighted in italics) performing distributions. In all cases, apart from experiment 6 for the Beta distribution, shown in Table 9, the JSD factor overwhelmingly supports the given distribution, as we would expect. The reason for the relatively low JSD factor in this particular case is the known fact that the Beta distribution can be approximated by the Normal distribution when $\alpha$ and $\beta$ are large [36].
It is evident that the larger the size of the data set the more accurate the JSD will be. In Table 1 we demonstrate how the accuracy of the JSD increases as the data set size increases, when both the given and hypothesised distributions are Normal (Parameter 1 being the mean and Parameter 2 the standard deviation); it is also noticeable that the maximum likelihood estimate of the parameters converges to the correct one as the size of the data increases. In Figure 1 we show that the decrease of the JSD follows a power law with an exponent of approximately 0.5, that is, the JSD decays roughly in proportion to $1/\sqrt{n}$, where $n$ represents the data size.
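The power-law decay can be checked against the values listed in Table 1 with an ordinary least-squares fit on a log-log scale. The snippet below is only a rough check on the rounded table values, using the data set sizes exactly as printed; the variable names are illustrative.

```python
import math

# (data set size, JSD) pairs copied from Table 1.
table_1 = [
    (32, 0.0459), (64, 0.0375), (128, 0.0204), (256, 0.0180),
    (512, 0.0164), (1024, 0.0080), (2048, 0.0038), (4096, 0.0055),
    (8192, 0.0034), (16384, 0.0018), (32768, 0.0013), (65536, 0.0007),
    (131072, 0.0007), (262144, 0.0006), (524288, 0.0003), (1048588, 0.0003),
]

xs = [math.log(n) for n, _ in table_1]
ys = [math.log(j) for _, j in table_1]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Least-squares slope of log(JSD) against log(n).
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
print(slope)
```

The fitted slope comes out close to -0.5, in line with the exponent quoted in the text.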
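Table 1: Maximum likelihood parameter estimates and JSD for Normal data sets of increasing size.
Parameter 1 | Parameter 2 | JSD | Data set size |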
---|---|---|---|
0.1941 | 1.0013 | 0.0459 | 32 |
0.1844 | 1.1171 | 0.0375 | 64 |
0.0714 | 1.0367 | 0.0204 | 128 |
0.0497 | 1.0010 | 0.0180 | 256 |
0.0236 | 1.0378 | 0.0164 | 512 |
-0.0171 | 1.0223 | 0.0080 | 1024 |
0.0301 | 1.0138 | 0.0038 | 2048 |
0.0135 | 1.0091 | 0.0055 | 4096 |
-0.0042 | 0.9821 | 0.0034 | 8192 |
-0.0065 | 1.0078 | 0.0018 | 16384 |
-0.0075 | 1.0038 | 0.0013 | 32768 |
-0.0009 | 0.9945 | 0.0007 | 65536 |
-0.0006 | 1.0012 | 0.0007 | 131072 |
0.0014 | 0.9998 | 0.0006 | 262144 |
0.0017 | 0.9978 | 0.0003 | 524288 |
0.0000 | 0.9997 | 0.0003 | 1048588 |
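Table 2: JSD factors between the best and second best performing distributions for the simulated data.
Experiment | Distribution | JSD factor |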
---|---|---|
1 | Normal | 803.9692 |
2 | Log-normal | 270.5271 |
3 | Gamma | 85.8914 |
4 | Gamma | 17.9510 |
5 | Beta | 99.0703 |
6 | Beta | 6.4432 |
7 | Beta | 30.8739 |
In the first experiment the given distribution was Normal with a chosen mean and standard deviation, and the hypothesised distributions were Normal and Uniform. The JSD factor between the JSDs of the Normal and Uniform distributions is given in Table 2 and can be derived from Table 4.
In the second experiment the given distribution was Log-normal with a chosen mean and standard deviation, and the hypothesised distributions were Normal, Uniform, Log-normal, Gamma and Weibull. The JSD factor between the JSDs of the Log-normal distribution, which is the smallest, and the Gamma distribution, whose JSD is the closest to it, is given in Table 2 and can be derived from Table 5.
In the third experiment the given distribution was Gamma with a chosen shape and scale, and the hypothesised distributions were Normal, Uniform, Log-normal, Weibull and Gamma. The JSD factor between the JSDs of the Gamma distribution, which is the smallest, and the Weibull distribution, whose JSD is the closest to it, is given in Table 2 and can be derived from Table 6.
In the fourth experiment the given distribution was Gamma with a different chosen shape and scale, and the hypothesised distributions were Normal, Uniform, Log-normal, Weibull and Gamma. The JSD factor between the JSDs of the Gamma distribution, which is the smallest, and the Log-normal distribution, whose JSD is the closest to it, is given in Table 2 and can be derived from Table 7.
In the fifth, sixth and seventh experiments the given distribution was Beta, each time with different chosen parameters $\alpha$ and $\beta$, and the hypothesised distributions were Normal, Log-normal, Gamma, Weibull and Beta. In each case the JSD factor is taken between the JSDs of the Beta distribution, which is the smallest, and the Normal distribution, whose JSD is the closest to it; the factors are given in Table 2 and can be derived from Tables 8, 9 and 10, respectively.
We now provide commentary on the results for the empirical data sets, shown in Tables 11, 12 and 13, which are given in an appendix at the end of the paper. We note that all the computations described in this subsection were carried out using Python.
Table 3 summarises, for all three data sets, the JSD factors between the best (highlighted in bold) and a lower performing distribution.
The first empirical data set we consider contains detailed voting results of party vote shares in different polling stations during the Lithuanian parliamentary election of 1992 (the data was obtained from [21]). Note that we consider only the top three parties and have renormalised the original data so that the vote shares of the top three parties sum to one in each polling station. This data set was first considered in [22], where an agent-based model generating the Beta distribution, and reproducing the detailed election results reasonably well, was proposed. In [23] a statistical comparison between the four distributions commonly used in sociophysics, the Normal, Log-normal, Beta and Weibull, was carried out using the Watanabe-Akaike information criterion (WAIC) [43], which is a generalisation of the AIC. The comparison concluded that the Beta and Weibull distributions provide the best fits for the empirical data. However, their respective WAIC scores were within each other’s confidence intervals, and therefore no final conclusion was made. Here we obtain a similar result: the Beta and Weibull distributions clearly have the overall best JSD scores but, as before, their confidence intervals overlap (see Table 11). As was noted in [23], the Beta and Weibull distributions are similar for particular values of the observed mean, provided that the observed variance is reasonably small. In the empirical analysis this similarity is further increased when the sample size is small. In addition, for the estimated parameter values, the Gamma and Weibull distributions behave similarly. Therefore, we report the JSD factor between the best performing distribution (highlighted in bold) and the next best distribution which is neither a Beta, Gamma nor Weibull distribution (see Table 3).
The second data set we consider contains the log-returns of two different exchange rates. We consider the BTC/JPY exchange rate during the time period between July 4, 2017 and July 4, 2018 (the data was obtained from [2]) on the bitFlyer exchange, as well as the EUR/USD exchange rate during the time period between June 1, 2000 and September 1, 2010 (the data was obtained from [17]). We consider the daily and one minute log-returns. For this data set we use the moving block bootstrap [25] with a block size of one day.
In the econophysics literature it is commonly accepted that the log-returns are power-law distributed [6]. One of the commonly used fits for the log-returns is the so-called $q$-Gaussian distribution [39], which we add to our analysis for this empirical data set. Here we use the following parametrization of the $q$-Gaussian distribution [34]:
(10)
which is equivalent to Student’s t-distribution [26].
However, as can be seen in Table 12, we find that the Gamma and Weibull distributions noticeably outperform the $q$-Gaussian distribution. The performance of the Gamma and Weibull distributions is similar, due to the fact that for the estimated parameter values both of these distributions behave reasonably similarly; for these parameter values they are reasonably close to the Exponential distribution. Therefore, we report the JSD factor between the best performing distribution (highlighted in bold) and the next best distribution, which is neither a Gamma nor Weibull distribution (see Table 3).
For our fourth sample in the second empirical data set, i.e. the EUR/USD one minute log-returns, unexpectedly the Log-normal and $q$-Gaussian distributions had the best performance. Though they are far from being similar for the considered observable value range and parameter values, they most likely attained similar scores due to the shape of the empirical distribution. The Log-normal distribution seems to represent smaller log-returns well, while the $q$-Gaussian is better at describing the tail events.
The third data set we consider is the European soccer data set [30], which contains matches played in European national championships throughout 2008–2016. From this data set we have extracted five random teams and computed inter-goal times for each team. We have treated goals scored during extra time as scored in the 45th minute (if scored during the first half) and the 90th minute (if scored during the second half). In this analysis we have added the Exponential and Pareto distributions. For the estimated parameter values the Gamma and Weibull distributions behave similarly to the Exponential distribution. Note that the shape parameter values of the Gamma and Weibull distributions are very close to $1$ and the respective scale parameter values are similar. In this case it is known that the Gamma and Weibull distributions are equivalent to the Exponential distribution with the appropriate scale parameter value. We therefore report the JSD factor between the best performing distribution (highlighted in bold) and the next best distribution, which is neither an Exponential, Gamma nor Weibull distribution (see Table 3). We observe that for the ELC sample, the obtained JSD factor is the lowest and the JSD score is the largest. This is most likely due to this team having played opponents with a larger variety of skill. In particular, it played in the top and the second tiers of the national championship during the considered time period, resulting in a goal scoring rate with a higher variation.
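Table 3: JSD factors between the best and a lower performing distribution for the empirical data sets.
Data set | Sample | Distribution | JSD factor |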
---|---|---|---|
1 | SK | Weibull | 2.7128 |
1 | LKDP | Beta | 2.9507 |
1 | LDDP | Weibull | 1.8957 |
2 | Daily BTC/JPY | Gamma | 2.3077 |
2 | 1 min BTC/JPY | Gamma | 2.7037 |
2 | Daily EUR/USD | Weibull | 3.5381 |
2 | 1 min EUR/USD | Log-normal | 1.0514 |
3 | TOT | Exponential | 4.3233 |
3 | GLA | Exponential | 3.5087 |
3 | MUN | Exponential | 4.2010 |
3 | VAL | Weibull | 5.6370 |
3 | ELC | Weibull | 2.5022 |
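As a rough consistency check, the factors reported for the first data set can be approximated from the rounded JSD values in Table 11. The sketch assumes that the JSD factor is the ratio of the JSD of the next best admissible distribution to the JSD of the best one; small discrepancies are expected because the table values are rounded to four decimal places.

```python
# (sample, JSD of best distribution, JSD of next best distribution that is
#  neither Beta, Gamma nor Weibull, factor reported in Table 3)
rows = [
    ("SK", 0.0090, 0.0243, 2.7128),    # Weibull vs Log-normal
    ("LKDP", 0.0057, 0.0168, 2.9507),  # Beta vs Log-normal
    ("LDDP", 0.0089, 0.0169, 1.8957),  # Weibull vs Normal
]

ratios = {}
for sample, best, next_best, reported in rows:
    ratios[sample] = next_best / best
    print(sample, round(ratios[sample], 4), reported)
```

The recomputed ratios agree with the reported factors to within the precision allowed by the rounding.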
We have proposed the Jensen-Shannon divergence (JSD) as a goodness-of-fit measure for data fitted with maximum likelihood estimation or curve fitting. Our experiments with simulated and empirical data in Section 4, for a variety of parametric distributions, show that for simulated data the method is unequivocal in its preference for the true distribution (see Subsection 4.1), and for empirical data the method is effective in selecting the more likely distributions from a selection of hypothesised distributions (see Subsection 4.2).
As we have shown in Section 2, the JSD has a precise information-theoretic meaning, and the JSD factor has an intuitive meaning in terms of an odds ratio, in analogy to the Bayes factor. Moreover, the implementation of the JSD as a measure of goodness-of-fit or for model comparison is relatively straightforward; see [24] for a Python implementation of the JSD.
Ultimately, more experience with empirical data sets is needed for a definitive assessment of how the JSD performs in practice.
Applied Regression Analysis and Generalized Linear Models. Sage Publications, Thousand Oaks, CA, 3rd edition, 2016.
Regression Analysis: Statistical Modeling of a Response Variable. Academic Press, San Diego, CA, second edition, 2006.
Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pages 173–189, Porto, 2015.
Table 4: Results for the first simulated experiment (given distribution Normal).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 0.0003 | 0.9995 | 0.0002 | 0.0002 | 0.0005 |
Uniform | -4.7467 | 4.8122 | 0.1947 | 0.1911 | 0.1950 |
Table 5: Results for the second simulated experiment (given distribution Log-normal).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 1.6492 | 2.1497 | 0.1520 | 0.1513 | 0.1527 |
Uniform | 0.0054 | 104.6298 | 0.9001 | 0.8816 | 0.9005 |
Log-normal | 0.0001 | 1.0006 | 0.0002 | 0.0002 | 0.0005 |
Gamma | 1.1373 | 1.4501 | 0.0481 | 0.0477 | 0.0484 |
Weibull | 1.0002 | 1.6494 | 0.0543 | 0.0540 | 0.0547 |
Table 6: Results for the third simulated experiment (given distribution Gamma).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 3.9999 | 2.8282 | 0.0743 | 0.0741 | 0.0746 |
Uniform | 0.0020 | 34.4431 | 0.5539 | 0.5137 | 0.5543 |
Log-normal | 1.1164 | 0.8020 | 0.0308 | 0.0305 | 0.0311 |
Gamma | 2.0031 | 1.9969 | 0.0002 | 0.0002 | 0.0005 |
Weibull | 1.4831 | 4.4386 | 0.0150 | 0.0147 | 0.0153 |
Table 7: Results for the fourth simulated experiment (given distribution Gamma).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 100.0013 | 14.1418 | 0.0132 | 0.0130 | 0.0135 |
Uniform | 44.9060 | 186.2970 | 0.2073 | 0.2019 | 0.2103 |
Log-normal | 4.5952 | 0.1421 | 0.0059 | 0.0057 | 0.0062 |
Gamma | 50.0161 | 1.9994 | 0.0003 | 0.0002 | 0.0006 |
Weibull | 7.2817 | 106.2044 | 0.04636 | 0.0460 | 0.04667 |
Table 8: Results for the fifth simulated experiment (given distribution Beta).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 0.4999 | 0.2237 | 0.0246 | 0.0244 | 0.0248 |
Log-normal | -0.8340 | 0.6020 | 0.0583 | 0.0581 | 0.0586 |
Gamma | 3.7158 | 0.1345 | 0.0399 | 0.0396 | 0.0401 |
Beta | 1.9973 | 1.9988 | 0.0002 | 0.0002 | 0.0005 |
Weibull | 2.3816 | 0.5628 | 0.0255 | 0.0253 | 0.0258 |
Table 9: Results for the sixth simulated experiment (given distribution Beta).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 0.5000 | 0.0497 | 0.0010 | 0.0008 | 0.0013 |
Log-normal | -0.6982 | 0.1007 | 0.0128 | 0.0125 | 0.0131 |
Gamma | 99.7537 | 0.0050 | 0.0086 | 0.0083 | 0.0088 |
Beta | 50.0429 | 50.0496 | 0.0002 | 0.0002 | 0.0004 |
Weibull | 10.6949 | 0.5225 | 0.0369 | 0.0365 | 0.0372 |
Table 10: Results for the seventh simulated experiment (given distribution Beta).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|
Normal | 0.6666 | 0.0494 | 0.0064 | 0.0062 | 0.0067 |
Log-normal | -0.4084 | 0.0750 | 0.0158 | 0.0156 | 0.0161 |
Gamma | 179.5350 | 0.0037 | 0.0127 | 0.0125 | 0.0130 |
Beta | 60.0885 | 30.0540 | 0.0002 | 0.0002 | 0.0005 |
Weibull | 14.6275 | 0.6893 | 0.0326 | 0.0323 | 0.0329 |
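Table 11: Results for the first empirical data set (Lithuanian parliamentary election of 1992).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |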
---|---|---|---|---|---|
SK – Sąjūdžio koalicija | |||||
Normal | 0.2412 | 0.1132 | 0.0264 | 0.0229 | 0.0307 |
Log-normal | -1.5556 | 0.5721 | 0.0243 | 0.0203 | 0.0290 |
Gamma | 3.9083 | 0.0617 | 0.0126 | 0.0097 | 0.0161 |
Beta | 3.0771 | 9.7045 | 0.0091 | 0.0067 | 0.0125 |
Weibull | 2.2492 | 0.2722 | 0.0090 | 0.0068 | 0.0127 |
LKDP – Lietuvos krikščionių demokratų partija | |||||
Normal | 0.1575 | 0.0921 | 0.0305 | 0.0276 | 0.0339 |
Log-normal | -2.0435 | 0.6884 | 0.0168 | 0.0138 | 0.0202 |
Gamma | 2.7196 | 0.0579 | 0.0060 | 0.0044 | 0.0084 |
Beta | 2.3282 | 12.4475 | 0.0057 | 0.0044 | 0.0082 |
Weibull | 1.7869 | 0.1773 | 0.0081 | 0.0060 | 0.0110 |
LDDP – Lietuvos demokratinė darbo partija | |||||
Normal | 0.6013 | 0.1516 | 0.0169 | 0.0135 | 0.0236 |
Log-normal | -0.5459 | 0.2892 | 0.0524 | 0.0468 | 0.0582 |
Gamma | 13.5242 | 0.0445 | 0.0422 | 0.0361 | 0.0485 |
Beta | 5.4454 | 3.6014 | 0.0105 | 0.0093 | 0.0181 |
Weibull | 4.5344 | 0.6588 | 0.0089 | 0.0081 | 0.0172 |
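Table 12: Results for the second empirical data set (exchange rate log-returns).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|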
Daily – BTC/JPY (bitFlyer) | |||||
Normal | 0.0415 | 1.0000 | 0.0099 | 0.0055 | 0.0144 |
Log-normal | -0.9039 | 1.1431 | 0.0030 | 0.0017 | 0.0050 |
Gamma | 1.1015 | 0.6174 | 0.0013 | 0.0010 | 0.0028 |
Weibull | 1.0276 | 0.6882 | 0.0015 | 0.0010 | 0.0030 |
q-Gaussian | 3.9028 | 1.0413 | 0.0060 | 0.0048 | 0.0116 |
One minute – BTC/JPY (bitFlyer) | |||||
Normal | 0.0011 | 1.0000 | 0.0140 | 0.0121 | 0.0158 |
Log-normal | -1.2561 | 1.5771 | 0.0073 | 0.0070 | 0.0077 |
Gamma | 0.7717 | 0.7993 | 0.0027 | 0.0026 | 0.0029 |
Weibull | 0.8468 | 0.5654 | 0.0029 | 0.0028 | 0.0031 |
q-Gaussian | 2.8964 | 0.5871 | 0.0082 | 0.0069 | 0.0097 |
Daily – EUR/USD (Forex) | |||||
Normal | 0.0179 | 1.0000 | 0.0038 | 0.0026 | 0.0053 |
Log-normal | -0.7227 | 1.1086 | 0.0047 | 0.0042 | 0.0052 |
Gamma | 0.6102 | 1.2489 | 0.0010 | 0.0007 | 0.0014 |
Weibull | 1.1579 | 0.8021 | 0.0006 | 0.0004 | 0.0011 |
q-Gaussian | 7.8686 | 2.2146 | 0.0022 | 0.0018 | 0.0039 |
One minute – EUR/USD (Forex) | |||||
Normal | 0.0004 | 1.0000 | 0.0139 | 0.0136 | 0.0142 |
Log-normal | -0.3236 | 0.6165 | 0.0086 | 0.0085 | 0.0086 |
Gamma | 2.3423 | 0.3881 | 0.0115 | 0.0113 | 0.0116 |
Weibull | 1.3548 | 1.0057 | 0.0143 | 0.0142 | 0.0144 |
q-Gaussian | 2.9820 | 0.6415 | 0.0090 | 0.0087 | 0.0093 |
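Table 13: Results for the third empirical data set (inter-goal times in European soccer).
Distribution | Parameter 1 | Parameter 2 | JSD | lb | ub |
---|---|---|---|---|---|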
TOT – Tottenham Hotspur (English Premier League) | |||||
Normal | 57.7346 | 58.2559 | 0.0369 | 0.0326 | 0.0423 |
Gamma | 1.0565 | 54.6477 | 0.0066 | 0.0055 | 0.0109 |
Weibull | 1.0183 | 58.1866 | 0.0062 | 0.0054 | 0.0102 |
Exponential | 57.7346 | - | 0.0057 | 0.0057 | 0.0115 |
Pareto | 1.6809 | - | 0.0249 | 0.0200 | 0.0310 |
GLA – Borussia Monchengladbach (German Bundesliga) | |||||
Normal | 61.5088 | 61.3039 | 0.0378 | 0.0334 | 0.0424 |
Gamma | 1.0186 | 60.3877 | 0.0072 | 0.0063 | 0.0111 |
Weibull | 1.0012 | 61.5400 | 0.0067 | 0.0062 | 0.0106 |
Exponential | 61.5088 | - | 0.0067 | 0.0062 | 0.0127 |
Pareto | 1.6850 | - | 0.0235 | 0.0187 | 0.0303 |
MUN – Manchester United (English Premier League) | |||||
Normal | 48.3174 | 49.8385 | 0.0353 | 0.0315 | 0.0394 |
Gamma | 1.0497 | 46.0306 | 0.0065 | 0.0051 | 0.0101 |
Weibull | 1.0084 | 48.4960 | 0.0060 | 0.0050 | 0.0094 |
Exponential | 48.3174 | - | 0.0058 | 0.0050 | 0.0103 |
Pareto | 1.6886 | - | 0.0245 | 0.0203 | 0.0301 |
VAL – Valencia CF (Spanish La Liga) | |||||
Normal | 57.2779 | 52.8406 | 0.0321 | 0.0284 | 0.0365 |
Gamma | 1.1212 | 51.0879 | 0.0052 | 0.0048 | 0.0090 |
Weibull | 1.0725 | 58.8645 | 0.0051 | 0.0047 | 0.0093 |
Exponential | 57.2779 | - | 0.0058 | 0.0051 | 0.0122 |
Pareto | 1.6547 | - | 0.0286 | 0.0238 | 0.0347 |
ELC – Elche CF (Spanish La Liga/La Liga 2) | |||||
Normal | 102.1875 | 87.1523 | 0.0393 | 0.0263 | 0.0658 |
Gamma | 1.3220 | 77.2953 | 0.0172 | 0.0161 | 0.0349 |
Weibull | 1.1929 | 108.4610 | 0.0157 | 0.0157 | 0.0332 |
Exponential | 102.1875 | - | 0.0279 | 0.0192 | 0.0581 |
Pareto | 1.6154 | - | 0.0398 | 0.0257 | 0.0632 |