1 Introduction
Statistical hypotheses have always been sets of parameters in classic frequentist hypothesis tests. However, in the context of Bayes factors – a prominent Bayesian method for hypothesis comparisons (Jeffreys, 1961; Kass & Raftery, 1995; Gönen et al., 2005; Rouder et al., 2009) – it is argued that also the prior distribution on the parameter might be employed to represent the hypotheses (or "models") that should be contrasted against each other. This view was promoted primarily by Vanpaemel (2010) in an attempt to turn one of the fundamental issues of Bayes factors, namely their prior sensitivity (see e.g. Kass & Raftery, 1995; Sinharay & Stern, 2002; Liu & Aitkin, 2008; Kruschke, 2015), from a limitation into a feature. By now, this view can be found within many other publications, sometimes rather explicitly, sometimes only implicitly (see e.g. Dienes (2019, p. 364f), Etz et al. (2018, p. 228), Heck et al. (2020, p. 5), Morey et al. (2016, p. 16), Rouder et al. (2016), Tendeiro & Kiers (2019, p. 776, 780), Vanpaemel & Lee (2012)). Even the authors of this paper were previously influenced by this view (Ebner et al., 2019). However, as will be outlined within this paper, the inferential foundation of Bayes factors is severely impaired when statistical hypotheses are represented via parameter distributions. Instead, statistical hypotheses need to be sets of parameters only, even in Bayesian statistics.
In order to elaborate these considerations, an explicit terminology is outlined first, building on the framework by Kass (2011). Subsequently, the updating consistency of Bayes factors is given a detailed account to determine the conditions that lead to inconsistencies. Finally, the representation of hypotheses by sets of parameters and by prior distributions is assessed, respectively, showing that the former does not suffer from foundational issues, while only the latter does.
2 Big Picture
2.1 Formalization and Interpretation
A comprehensive view of statistical inference distinguishes between the real world and a theoretical world (Kass, 2011), where the latter contains mathematical formalizations of the relevant characteristics of the real world. Interpreting the components of the theoretical world leads to their counterparts in the real world. In that sense, both worlds can be connected by formalization and interpretation (see Figure 1). Based on this general view by Kass (2011), the big picture of statistical inference in the context of Bayes factors shall be derived in detail in the following.
Typically, a researcher designs a scientific investigation to assess a phenomenon of interest. This scientific investigation leads to data that are described by a parametric sampling distribution, and the parameter (which lives in the theoretical world) should correspond to the phenomenon of interest in the real world. If the correct interpretation of the parameter does not match the phenomenon a researcher is interested in, then the design of the scientific investigation might be reconsidered. For the remainder of this paper, a proper correspondence between the parameter and the phenomenon of interest is assumed.
Within this fundamental framework, the big picture of statistical inference in the context of Bayes factors shall be established (as eventually depicted in Figure 1). This, however, is not easy, as relevant terms, such as "theory", "model", or "hypothesis", have a multitude of different meanings and usages. In that, it is mandatory to explicitly define the employed terms such that their usage can be universally agreed on. Therefore, this elaboration should start at the very beginning, namely with two undeniable mathematical properties of Bayes factors.
2.2 Common Ground: The Bayes Factor
The Bayes factor (formulas below) is a quantity that is used within a statistical analysis and lives in the theoretical world. Two mathematical properties of Bayes factors cannot be denied:

It is a Bayesian quantity, such that it requires a distribution on the parameter.

It has a contrasting nature and contrasts two mathematical objects against each other (frequently referred to as hypotheses or models) (cp. e.g. Rouder et al., 2016, Element #2 on p. 16).
Put the other way around: Without a parameter distribution or without contrasting two mathematical objects (hypotheses or models) there cannot be a Bayes factor. Accordingly, whenever a Bayes factor is employed, it is safe to assume the existence of a parameter distribution and the existence of a contrast and its two contrasted objects. Although these two simple facts about Bayes factors might seem trivial, it is important to state them explicitly, as it is exactly these two properties that serve as the basis to derive the big picture of statistical inference in the context of Bayes factors. While it is expected that there is common agreement upon these two facts, there might be different views about other concepts employed in the context of Bayes factors (e.g. about the nature of hypotheses). By starting the elaboration with these two facts that can be agreed on, it is possible to assess the origin of disagreements about other concepts.
2.3 Parameter Distribution and Knowledge
Parameter distributions (e.g. prior or posterior) live in the theoretical world and are typically interpreted as knowledge (see e.g. Jaynes, 2003) or uncertainty (see e.g. Kruschke, 2015) or degrees of belief (see e.g. Jeffreys, 1961) or information (see e.g. Berger, 1985) about the phenomenon of interest. Within this paper, the term knowledge shall be employed, as the exact label is not relevant for the elaborations below, only the fact that it is the interpretation of the parameter distribution. Accordingly, define the term knowledge (about a phenomenon of interest) within this paper as the interpretation of a Bayesian parameter distribution.
2.4 Hypotheses (or Models)
The mathematical objects in the theoretical world that are contrasted by the Bayes factor shall be referred to as (statistical) hypotheses (although the attribute "statistical" will be omitted, as the term hypothesis is not employed in a non-statistical sense within this paper). Other publications (e.g. Rouder et al., 2016; Rouder, Haaf, & Aust, 2018) might state that the Bayes factor contrasts two models against each other, yet the formula of the Bayes factor is exactly the same as in those publications that contrast hypotheses against each other (cp. also Morey et al., 2016, p. 11). Therefore, these models are the same mathematical objects as the hypotheses within this paper (namely those mathematical objects that are contrasted against each other by the Bayes factor). Other authors (e.g. Kruschke & Liddell, 2018) use both terms (model and hypothesis) rather interchangeably (cp. also Tendeiro & Kiers, 2019, p. 775, esp. footnote 1). In the remainder of this paper, the term "model" shall be avoided, as it appears to have a variety of different other usages as well. Accordingly, define the term hypothesis within this paper as one of the two mathematical objects that are contrasted against each other by the Bayes factor.
To assess the nature of these mathematical objects is the aim of this paper, and it will be argued that a hypothesis can only be a set of parameters and not a parameter (prior) distribution.
2.5 Theoretical Positions (or Theories) and Research Question
The Bayes factor contrasts two mathematical objects against each other in the theoretical world, and the same scheme applies to the real world after interpretation: There is a contrast between two theoretical positions about the phenomenon of interest in the real world. In that sense, define the term theoretical position within this paper as the interpretation of a hypothesis.
The respective research question contrasts these two theoretical positions against each other. Please note that, in general, the nature of potential research questions about the phenomenon of interest is extremely versatile. However, only those research questions that contrast two theoretical positions against each other can be answered by Bayes factors. If a research question contrasts two theories against each other that cannot be formalized as those mathematical objects that are contrasted by the Bayes factor, then the Bayes factor is not suitable to answer such a research question.
Typically, the term "theory" is employed instead of "theoretical position", and it is said that the Bayes factor compares two "theories" (cp. also Rouder, Haaf, & Aust, 2018, who use both terms). In this context, both terms (theory or theoretical position) denote the same, namely the interpretation of a hypothesis (i.e. the interpretation of the mathematical objects that are contrasted against each other by the Bayes factor). However, the term "theory" might be used in a multitude of different other ways as well, e.g. in a non-contrasting context or such that it cannot be formalized as a hypothesis in the context of Bayes factors. To avoid confusion and to emphasize its contrasting nature, only the term "theoretical position" shall be employed within this paper.
2.6 Summary Terminology
So far, the concepts of the big picture of statistical inference in the context of Bayes factors have been outlined, and it should be emphasized that the terms "hypothesis", "theoretical position", and "knowledge" are used within this paper to facilitate an understandable elaboration. In fact, it might have been possible to merely use the descriptions "mathematical objects that are contrasted against each other by the Bayes factor" (hypotheses), "interpretation of these mathematical objects" (theoretical positions), and "interpretation of a parameter distribution" (knowledge). Together with the two above-mentioned undeniable properties of Bayes factors, namely the existence of a contrast (of two mathematical objects) and the existence of a Bayesian parameter distribution, the employed concepts have thus been explicitly outlined. Other publications might employ a different terminology, such that it is necessary to check which concepts are actually referred to by each term in each publication.
2.7 Statistical Inference with Bayes Factors
Statistical inference is the procedure of deriving conclusions from observed data. Naturally, there is a variety of different inferential approaches, each using different principles to extract information from the observed data. The elegance of Bayes factors might be attributed to the fact that they combine two different approaches to statistical inference in one single quantity: Bayesian learning and evidential quantification.

Within the Bayesian approach to statistical inference, a parameter prior distribution gets updated to a parameter posterior distribution by including the information from the observed data via Bayes rule. Conclusions are then derived solely from the parameter distribution.

Within the evidential approach to statistical inference (cp. e.g. Berger & Wolpert, 1988; Royall, 1997, 2004; Blume, 2011), the information within the data is used to quantify evidence w.r.t. two different fixed theoretical positions about a phenomenon of interest. Assume two theoretical positions $T_1$ and $T_2$ are of interest (and specified in the research question) and assume the evidence within the data is quantified to be $b$; then the evidential interpretation is: After observing the data, the credibility of $T_1$ over $T_2$ is $b$ times higher than before the data were observed (see e.g. Morey et al., 2016).
On the one hand, while Bayesian statistics is able to answer also different research questions, by using Bayes factors the nature of potential research questions is limited to those that contrast theoretical positions, thus allowing a useful and intuitive interpretation in the context of evidential quantification. On the other hand, while the framework of evidential quantification might consider a variety of different contrasting statistical analyses, by using Bayes factors the statistical analysis is restricted to the Bayesian framework, providing a thorough and powerful mathematical foundation (see e.g. Jeffreys, 1961; Berger, 1985).
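The interplay of the two approaches can be sketched in a few lines (all numbers hypothetical): evidential quantification supplies the factor by which the data change the odds of the two positions, and Bayesian learning turns the resulting odds back into a posterior probability.

```python
# Minimal sketch (hypothetical numbers): the Bayes factor links evidential
# quantification and Bayesian learning by rescaling prior odds into
# posterior odds.
def posterior_odds(prior_odds: float, bayes_factor: float) -> float:
    """Evidential step: the data change the odds by the factor BF."""
    return bayes_factor * prior_odds

# Suppose both theoretical positions start equally credible (prior odds 1)
# and the data quantify evidence of BF = 5 for the first over the second.
odds = posterior_odds(prior_odds=1.0, bayes_factor=5.0)
prob = odds / (1.0 + odds)  # Bayesian step: odds -> posterior probability
```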
2.8 Knowledge vs. Theoretical Positions
Accordingly, to consider the inferential foundation of Bayes factors comprehensively, both the knowledge about the phenomenon of interest (Bayesian inference) and the theoretical positions about the phenomenon of interest (evidential inference) need to be distinguished. These concepts are fundamentally different! While Bayesian learning allows (or even requires) the knowledge itself to be altered, theoretical positions in the context of evidential quantification stay fixed, only their credibility may change. Without such a clear distinction, the inferential foundation of Bayes factors might break apart:

If the theoretical positions themselves (not only their credibility) are allowed to change by observing the data, then the useful and intuitive interpretation as evidence quantification is lost. Assume two theoretical positions $T_1$ and $T_2$ are of interest (and specified in the research question) but change, by seeing the data, into the theoretical positions $T_1'$ and $T_2'$, respectively, and assume the evidence within the data is quantified to be $b$; then the correct but useless interpretation is: The credibility of $T_1'$ over $T_2'$ after the data were observed is $b$ times higher than the credibility of $T_1$ over $T_2$ before the data were observed.

If the knowledge (and thus the parameter distribution) is forced to stay fixed although some data were observed, then Bayes rule is not applied, leading to updating inconsistencies (outlined in detail below).
Accordingly, a clear distinction between knowledge (formalized as parameter distributions and being allowed to change) and theoretical positions (formalized as statistical hypotheses, which stay fixed) about a phenomenon of interest is mandatory. In that, the framework depicted here (Figure 1) does account for the fundamentally different nature of knowledge and theoretical positions about a phenomenon of interest.
Interestingly, on a side note, it might be stated that – by its nature – also the prior knowledge is able to inform the research question. Typically, prior knowledge is insufficient to answer the research question adequately, justifying the necessity to conduct a scientific investigation. However, the structure of how to answer the research question with the available knowledge is independent of whether data were observed or not: In a Bayesian context, it is a parameter distribution from which the answer to the research question is derived, and this way of deriving answers might work both for the prior and the posterior distribution.
2.9 One-to-One Correspondence
Ideally, the correspondence between the concepts of the real world and those in the theoretical world (gray dashed arrows in Figure 1) should be one-to-one, mathematically described by a bijective mapping. Without such a bijective mapping, chosen formalizations might be arbitrary, or the interpretation of the results might inform beyond the research question. While the correspondence between the phenomenon of interest and the parameter depends on the quality of the experimental design, and the mapping between parameter distributions and knowledge is typically assumed to be bijective in the Bayesian setting (two different distributions represent two different bodies of knowledge), of interest for this elaboration is the relation between theoretical positions and hypotheses. Consider two cases:

There is a bijective mapping between theoretical positions and hypotheses. Then, different hypotheses represent different theoretical positions.

A variety of different hypotheses formalizes one single theoretical position. When forced to commit to one of those hypotheses (as typically required by the statistical analysis), this choice might intuitively be called instantiation: The theoretical position is formally instantiated by one of the hypotheses. This terminology is frequently employed in the literature that (potentially implicitly) assumes hypotheses to be represented by distributions (see e.g. Morey et al. (2016, p. 13), Rouder, Haaf, & Aust (2018, p. 2), Vanpaemel (2010, p. 491), Vanpaemel & Lee (2012, p. 1054)), suggesting that such a non-bijective relation might be implicitly assumed. Other publications also employ such a non-bijective relation without using the term instantiation (e.g. Dienes, 2019, Box 3 on p. 369).
In that, these two types of relations between theoretical positions and hypotheses might describe the ideal and the actual situation, respectively, and will be used below for elaborating issues inherent to representing hypotheses by parameter distributions.
3 Updating Consistency
In Bayesian statistics, it is Bayes rule and not another principle which states how a prior distribution gets updated to a posterior distribution (i.e. how information is extracted from the observed data; see Figure 2, top-left). If the prior distribution reflects all available knowledge about the phenomenon of interest before the investigation is conducted, then the corresponding posterior distribution reflects all available knowledge about the phenomenon of interest after the investigation is conducted. Disagreeing would imply that Bayes rule is not able to extract all information within the observed data that is relevant for the phenomenon of interest (and no Bayesian would do so). As a consequence, the posterior distribution might be employed as a prior distribution for the Bayesian analysis of a data set obtained in a subsequent investigation with the same design (see Figure 2, top-right). Naturally, the final posterior distribution after subsequently updating twice should be identical to the posterior distribution obtained by merging both data sets first and then updating the initial prior distribution at once (see Figure 2, bottom). If so, Bayesian updating is consistent (cp. Rüger, 1998, p. 190); else it is inconsistent.
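This requirement can be illustrated with a minimal conjugate beta-binomial sketch (all numbers hypothetical): updating the prior with two data sets subsequently yields the same posterior as updating with the merged data set at once.

```python
# A conjugate beta-binomial sketch of Figure 2 (all numbers hypothetical):
# subsequent updating and updating at once lead to the same posterior.
def update_beta(a, b, successes, failures):
    """Bayes rule for a Beta(a, b) prior and binomial observations."""
    return a + successes, b + failures

prior = (1, 1)        # initial Beta(1, 1) prior knowledge
data1 = (7, 3)        # first data set: 7 successes, 3 failures
data2 = (2, 8)        # second data set from the same design

# Subsequent updating: the posterior of the first analysis serves as the
# prior of the second analysis.
step1 = update_beta(*prior, *data1)
sequential = update_beta(*step1, *data2)

# Updating at once with the merged data set.
at_once = update_beta(*prior, data1[0] + data2[0], data1[1] + data2[1])

assert sequential == at_once   # Bayesian updating is consistent
```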
In general, updating with Bayes factors is consistent (Schwaferts & Augustin, 2021) (see Figure 3a). Assume the observed data $x = (x_1, \dots, x_n)$ consist of $n$ independent and identically distributed observations $x_i$ ($i = 1, \dots, n$) that follow the parametric sampling distribution with density $f(x_i|\theta)$, where $\theta \in \Theta$ is the parameter (representing the phenomenon of interest), such that the density of the complete data set is $f(x|\theta) = \prod_{i=1}^{n} f(x_i|\theta)$. The hypotheses $H_0$ and $H_1$ have prior probabilities $P(H_0)$ and $P(H_1)$, and the corresponding densities of the hypothesis-based parameter distributions are $\pi_0(\theta)$ and $\pi_1(\theta)$, respectively. Then the density of the overall prior distribution is (see e.g. Rouder, Haaf, & Vandekerckhove, 2018)

(1) $\pi(\theta) = P(H_0)\,\pi_0(\theta) + P(H_1)\,\pi_1(\theta)\,.$
The Bayes factor

(2) $BF_{01}(x) = \frac{\int_{\Theta} f(x|\theta)\,\pi_0(\theta)\,d\theta}{\int_{\Theta} f(x|\theta)\,\pi_1(\theta)\,d\theta}$

is calculated using only the data $x$ and the hypothesis-based parameter densities $\pi_0(\theta)$ and $\pi_1(\theta)$, and allows to update the prior probabilities of the hypotheses to their posterior probabilities (see Figure 3a, left):

(3) $\frac{P(H_0|x)}{P(H_1|x)} = BF_{01}(x) \cdot \frac{P(H_0)}{P(H_1)}\,.$
In addition, as revealed by simply applying Bayes rule consistently to the overall prior density (depicted in detail by Schwaferts & Augustin, 2021), also the hypothesis-based parameter densities $\pi_0(\theta)$ and $\pi_1(\theta)$ get updated by the data to their posterior densities $\pi_0(\theta|x)$ and $\pi_1(\theta|x)$ (gray arrow in Figure 3a, left) (cp. also Kruschke & Liddell, 2018). In general (i.e. for non-degenerate prior distributions), these posteriors are different from the priors. These updated hypothesis-based posterior densities together with the posterior probabilities of the hypotheses describe the overall posterior distribution:

(4) $\pi(\theta|x) = P(H_0|x)\,\pi_0(\theta|x) + P(H_1|x)\,\pi_1(\theta|x)\,.$
If a new data set $y$ was observed (using the same experimental setup, i.e. following the same sampling distribution), this updated posterior distribution describes the starting point for a subsequent analysis with Bayes factors (Figure 3a, right). Consequently, the corresponding Bayes factor

(5) $BF_{01}(y|x) = \frac{\int_{\Theta} f(y|\theta)\,\pi_0(\theta|x)\,d\theta}{\int_{\Theta} f(y|\theta)\,\pi_1(\theta|x)\,d\theta}$

is also inherently influenced by the information within the previous data set $x$. In that, however, updating is consistent (a complete proof is provided by Schwaferts & Augustin, 2021).
Updating inconsistencies occur if the second data set $y$ is analyzed with the initial hypothesis-based prior distributions (i.e. $\pi_0(\theta)$ and $\pi_1(\theta)$ as in equation (2)) although the first data set $x$ was already observed (Figure 3b). This happens if the initial hypothesis-based prior distributions do not get updated in the analysis of the first data set $x$, which is a violation of Bayes rule (cp. also Rouder & Morey, 2011, who noticed this issue and solved it properly by merging all data sets first). It is difficult to assess the prevalence of such updating inconsistencies in the current scientific literature, as the data are typically analyzed at once (not subsequently), and the calculation of the Bayes factor can be done without explicitly updating the hypothesis-based priors. Nevertheless, this update must not be neglected if updating is to be consistent (Figure 3a), and the extent to which this fact is overlooked might be indicated by Tendeiro & Kiers (2019): After a year of literature review about Bayes factors to understand them, the authors (and possibly their reviewers from the journal Psychological Methods as well) were convinced (see footnote 2 on p. 776 therein) that the hypothesis-based priors do not get updated to their posteriors. Interestingly, at the same place, the authors refer to Kruschke & Liddell (2018), who, in contrast, elaborated on the update of the hypothesis-based priors rather explicitly (see p. 157f and Fig. 4 and 6 therein). This is a vivid sign of the confusion a researcher faces in the literature about Bayes factors.
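The consistent scheme of Figure 3a and the inconsistency of Figure 3b can be sketched numerically. The setup below is hypothetical and not taken from the paper: a point null ($\theta = 0.5$) against a Beta(1, 1) hypothesis-based prior, with made-up binomial data sets.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def marg_lik_beta(a, b, k, n):
    """Marginal likelihood of k successes in n binomial trials under a
    Beta(a, b) hypothesis-based prior (binomial coefficients omitted,
    they cancel within each Bayes factor)."""
    return math.exp(log_beta(a + k, b + n - k) - log_beta(a, b))

def marg_lik_point(theta, k, n):
    """Likelihood under a degenerate (point) hypothesis (coefficient omitted)."""
    return theta ** k * (1 - theta) ** (n - k)

# Hypothetical setup: H0: theta = 0.5 vs. H1: theta ~ Beta(1, 1);
# data sets x (8/10 successes) and y (6/10 successes).
kx, nx, ky, ny = 8, 10, 6, 10

# Consistent route (Figure 3a): the BF for y uses the *updated*
# hypothesis-based distribution Beta(1 + kx, 1 + nx - kx) under H1.
bf_x = marg_lik_point(0.5, kx, nx) / marg_lik_beta(1, 1, kx, nx)
bf_y_given_x = (marg_lik_point(0.5, ky, ny)
                / marg_lik_beta(1 + kx, 1 + nx - kx, ky, ny))

# Merged analysis: one Bayes factor for both data sets at once.
bf_merged = (marg_lik_point(0.5, kx + ky, nx + ny)
             / marg_lik_beta(1, 1, kx + ky, nx + ny))
assert abs(bf_x * bf_y_given_x - bf_merged) < 1e-9   # consistent

# Inconsistent route (Figure 3b): reusing the initial Beta(1, 1) prior for y.
bf_y_initial = marg_lik_point(0.5, ky, ny) / marg_lik_beta(1, 1, ky, ny)
assert abs(bf_x * bf_y_initial - bf_merged) > 0.1    # violates Bayes rule
```

The first assertion holds exactly (up to floating point) because the chain rule of marginal likelihoods mirrors equation (5); the second shows how reusing the initial priors breaks it.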
4 Hypotheses as Sets of Parameters
At first, assume that the hypotheses are represented by two disjoint subsets $\Theta_0$ and $\Theta_1$ of the parameter space $\Theta$:

(6) $H_0: \theta \in \Theta_0 \quad \text{vs.} \quad H_1: \theta \in \Theta_1\,.$
These subsets need to be chosen to correspond to the theoretical positions that are contrasted within the research question. In addition to these theoretical positions, there is knowledge about the phenomenon of interest that is formalized by a parameter distribution with density $\pi(\theta)$. Without loss of generality, assume that this distribution has a positive density only for parameter values that are contained in one of the hypotheses. The prior probabilities of the hypotheses are obtained from this parameter distribution by

(7) $P(H_0) = \int_{\Theta_0} \pi(\theta)\,d\theta \qquad \text{and} \qquad P(H_1) = \int_{\Theta_1} \pi(\theta)\,d\theta\,,$
and, once the data $x$ were observed and the prior density $\pi(\theta)$ was updated to $\pi(\theta|x)$, the posterior probabilities of the hypotheses are

(8) $P(H_0|x) = \int_{\Theta_0} \pi(\theta|x)\,d\theta \qquad \text{and} \qquad P(H_1|x) = \int_{\Theta_1} \pi(\theta|x)\,d\theta\,.$
How the data change the probabilities of the hypotheses is described by the Bayes factor

(9) $BF_{01}(x) = \frac{P(H_0|x)\,/\,P(H_1|x)}{P(H_0)\,/\,P(H_1)}\,.$
Accordingly, the probabilities of the hypotheses change, but the hypotheses themselves, i.e. the sets $\Theta_0$ and $\Theta_1$, stay the same. In that, the Bayes factor can be interpreted appropriately as evidence quantification. Further, the overall prior distribution $\pi(\theta)$ was updated completely to the overall posterior distribution $\pi(\theta|x)$, providing updating consistency. Consequently, hypotheses can safely be represented by sets of parameters in the context of Bayes factors.
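Equations (7)–(9) can be traced numerically in a hypothetical setup (not the paper's own example): the hypotheses are the disjoint sets $\Theta_0 = [0, 0.5]$ and $\Theta_1 = (0.5, 1]$, the knowledge is a uniform prior, and the data are 8 successes in 10 binomial trials.

```python
# Numerical sketch of equations (7)-(9) for set-valued hypotheses
# (hypothetical setup: Theta_0 = [0, 0.5], Theta_1 = (0.5, 1]).
def integrate(f, lo, hi, m=20000):
    """Simple midpoint-rule quadrature (sufficient for this sketch)."""
    h = (hi - lo) / m
    return h * sum(f(lo + (i + 0.5) * h) for i in range(m))

def prior(theta):
    """Overall prior knowledge: uniform density on (0, 1)."""
    return 1.0

def likelihood(theta, k=8, n=10):
    """Binomial likelihood (coefficient omitted, it cancels)."""
    return theta ** k * (1 - theta) ** (n - k)

p_h0 = integrate(prior, 0.0, 0.5)    # prior probabilities, equation (7)
p_h1 = integrate(prior, 0.5, 1.0)

evidence = integrate(lambda t: likelihood(t) * prior(t), 0.0, 1.0)

def posterior(theta):
    """Bayes rule: the whole prior density gets updated."""
    return likelihood(theta) * prior(theta) / evidence

p_h0_post = integrate(posterior, 0.0, 0.5)   # posterior probabilities, eq. (8)
p_h1_post = integrate(posterior, 0.5, 1.0)

# Bayes factor as the change in odds, equation (9); the sets Theta_0 and
# Theta_1 themselves never change, only their probabilities do.
bf_01 = (p_h0_post / p_h1_post) / (p_h0 / p_h1)
```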
Before continuing, the current situation shall be characterized further (which will be needed below). Any given prior distribution with density $\pi(\theta)$ formalizes knowledge about the phenomenon of interest. One part of this knowledge relates to the one and another part to the other theoretical position (which are contrasted within the research question), formalized by the hypothesis-based prior densities $\pi_0(\theta)$ and $\pi_1(\theta)$. Mathematically, these densities are obtained from the initial density $\pi(\theta)$ by

(10) $\pi_0(\theta) = \frac{1}{P(H_0)}\,\pi(\theta)\,\mathbb{1}_{\Theta_0}(\theta)$

(11) $\pi_1(\theta) = \frac{1}{P(H_1)}\,\pi(\theta)\,\mathbb{1}_{\Theta_1}(\theta)\,,$

where $\pi(\theta)\,\mathbb{1}_{\Theta_0}(\theta)$ and $\pi(\theta)\,\mathbb{1}_{\Theta_1}(\theta)$ are the densities restricted to the sets $\Theta_0$ and $\Theta_1$, respectively. Now, consider the set $\mathcal{X}$ of all potentially observable data sets $x$ of any size $n \geq 0$, with $n = 0$ referring to the empty data set (representing the prior situation). For a given prior distribution with density $\pi(\theta)$, denote the set of all potentially obtainable posterior densities as

(12) $\Pi = \{\, \pi(\cdot\,|\,x) : x \in \mathcal{X} \,\}\,.$
This set contains all possible posterior distributions, i.e. represents all possible bodies of knowledge about the phenomenon of interest that might be available after some (yet unknown) data were observed. Analogously, all different possible bodies of knowledge about the theoretical positions, respectively, are formally contained within the sets

(13) $\Pi_0 = \{\, \pi_0(\cdot\,|\,x) : x \in \mathcal{X} \,\}$

(14) $\Pi_1 = \{\, \pi_1(\cdot\,|\,x) : x \in \mathcal{X} \,\}\,.$

These sets contain only probability distributions with probability mass in the sets $\Theta_0$ and $\Theta_1$, respectively. In summary, a hypothesis and all potentially obtainable bodies of knowledge about this hypothesis (in the context of given prior knowledge and a certain experimental setup) can be described by the sets $\Theta_0$ and $\Pi_0$ or $\Theta_1$ and $\Pi_1$, respectively.

5 Hypotheses as Parameter Distributions
Now, prior distributions with densities $\pi_0(\theta)$ and $\pi_1(\theta)$ shall represent the hypotheses $H_0$ and $H_1$, respectively. Ideally, the mapping between theoretical positions and hypotheses should be bijective, updating should be consistent, and theories should not change by seeing the data, only their credibility should. In the following, two of these three properties shall be assumed first and then evaluated w.r.t. the third.
5.1 Case 1: Bijective Mapping and Updating Consistency
Assume there is a bijective mapping between theoretical positions and hypotheses. As hypotheses are represented by prior distributions, not only by sets of parameters, two different parameter distributions represent two different hypotheses, i.e. two different theoretical positions. Consistent Bayesian updating dictates to update the prior distributions to the posterior distributions, which are (for non-degenerate cases) different from the respective prior distributions. As the parameter distributions change by observing the data, so do the hypotheses and theoretical positions. In that, observing data changes the theories that should be contrasted with each other, not only their credibility. This does not match the fundamental characteristics of statistical inference by evidence quantification, leading to issues with the interpretation of Bayes factors.
5.2 Case 2: Bijective Mapping and Unchanged Theories
Again, assume there is a bijective mapping between theoretical positions and hypotheses, and that hypotheses are represented by the prior distributions, such that two different parameter distributions represent two different hypotheses, i.e. two different theoretical positions. If theories should not change by observing data, the posterior distributions need to be the same as the prior distributions, which is outlined in Figure 3b and leads – for non-degenerate prior distributions – to updating inconsistency. In that, inference does not follow Bayes rule.
5.3 Case 3: Updating Consistency and Unchanged Theories
Now, also allow the mapping between theoretical positions and hypotheses to be non-bijective, in the sense that a single theoretical position might be formalized by a multitude of different hypotheses. If hypotheses are represented by prior distributions, then a set of different distributions corresponds to one theoretical position. What does this set look like if updating should be consistent and a proper evidential interpretation of Bayes factors shall be kept?
To answer this question, assume that in the specification of a statistical analysis each of the two theoretical positions is instantiated by only one prior distribution with density $\pi_0(\theta)$ or $\pi_1(\theta)$, respectively, such that their supports do not overlap with each other (overlapping hypotheses will be discussed subsequently). Denote their supports (i.e. the sets of parameters on which the densities have positive mass) with $\Theta_0$ and $\Theta_1$, respectively. If updating shall be consistent, then parameter distributions are allowed to change by seeing the data. Considering all potentially observable data sets $x \in \mathcal{X}$ (of any size $n$), the initial prior densities $\pi_0(\theta)$ and $\pi_1(\theta)$ might result in any posterior density within the sets $\Pi_0$ and $\Pi_1$ (equations (13) and (14)), respectively. To keep the proper evidential interpretation of Bayes factors, all these parameter densities within the sets $\Pi_0$ and $\Pi_1$ need to represent the same theoretical position, respectively. In that, the hypotheses $H_0$ and $H_1$ are represented by the sets $\Pi_0$ and $\Pi_1$, which, however, contain all potentially obtainable parameter distributions with positive probability mass restricted to the parameter sets $\Theta_0$ and $\Theta_1$, respectively. Two different parameter distributions with the same support (either $\Theta_0$ or $\Theta_1$) do not differentiate between two theoretical positions; only two different supports, i.e. sets of parameters, do. Accordingly, this situation is practically equivalent to representing hypotheses as sets of parameters.
5.4 Overlapping Hypotheses
Within these elaborations, it was assumed that the hypotheses are non-overlapping. Mathematically, they might also be overlapping. Consider the case in which $\Theta_0$ and $\Theta_1$ are not (almost everywhere w.r.t. the prior density $\pi$) disjoint. Then there are parameter values that are contained within both $\Theta_0$ and $\Theta_1$, such that, if these parameter values are true, the posterior distribution will be shifted – for an increasing sample size $n$ – into this overlapping part. As a consequence, even an infinitely large data set cannot decisively distinguish between both hypotheses (i.e. answer the research question), and the Bayes factor has a finite limit. Formally, the limit behavior¹ of the Bayes factor is

(15) $BF_{01}(x) \;\xrightarrow{\;n \to \infty\;}\; \begin{cases} \infty & \text{if } \theta^* \in \Theta_0 \setminus \Theta_1 \\ 0 & \text{if } \theta^* \in \Theta_1 \setminus \Theta_0 \\ c & \text{if } \theta^* \in \Theta_0 \cap \Theta_1 \end{cases}$

where $\theta^*$ is the true parameter and $c$ is a fixed, finite value that depends on $\theta^*$ (cp. Morey & Rouder, 2011, p. 411).

¹ This equation (15) has been formulated in a mathematically imprecise way in order to present the relevant points clearly. Actually, the condition $\theta^* \in \Theta_0 \setminus \Theta_1$ constitutes the case in which the true parameter is within the set $\Theta_0 \setminus \Theta_1$ such that for increasing $n$ the posterior distribution will be shifted into this set. The other conditions have to be read analogously. There might be cases in which – mathematically – $\theta^*$ is within one of the parameter sets that define the conditions, but the posterior distribution will not be shifted completely into the respective set for $n \to \infty$. These cases, however, lie exactly at the borders between the set-valued hypotheses and are expected to occur almost never w.r.t. the parameter distribution.
In that, if – for a given investigational setup – the theoretical positions (that are of interest in the context of the research question) are reasonably formalized by overlapping hypotheses, it might happen that the scientific investigation cannot answer the research question. This might be a waste of time and money, and cannot be argued to yield a "strong inference" (Platt, 1964) or constitute a "severe test" (Popper, 2002[1935]), claims frequently raised by promoters of the Bayes factor (see e.g. Dienes (2019, p. 365), Etz et al. (2018, p. 228), Schönbrodt & Wagenmakers (2018, p. 130)), which apparently require non-overlapping hypotheses. In this context it shall be noted that even Jeffreys (1961, e.g. p. 269) himself tried to avoid the possibility of obtaining a finite, non-zero Bayes factor limit at all costs: His derivation of the prior distributions for Bayes factors (now sometimes referred to as default Bayes factors (see e.g. Ly et al., 2016)) aimed at being able to state decisive evidence (which corresponds to Bayes factors tending to $0$ or $\infty$) even for a fixed number of observations (which is an even stronger requirement than within equation (15)). In this regard, it seems advisable to rethink the investigational setup such that overlapping hypotheses can be avoided.
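The third case of equation (15) can be illustrated numerically. The setup below is hypothetical (not taken from the paper): overlapping interval hypotheses $\Theta_0 = [0, 0.7]$ and $\Theta_1 = [0.4, 1]$ with uniform hypothesis-based priors, and a true parameter $\theta^* = 0.5$ lying in the overlap.

```python
# Sketch of the finite Bayes factor limit for overlapping hypotheses
# (hypothetical setup: Theta_0 = [0, 0.7], Theta_1 = [0.4, 1], uniform
# hypothesis-based priors, true parameter theta* = 0.5 in the overlap).
def integrate(f, lo, hi, m=20000):
    """Simple midpoint-rule quadrature (sufficient for this sketch)."""
    h = (hi - lo) / m
    return h * sum(f(lo + (i + 0.5) * h) for i in range(m))

def bf_overlapping(n):
    """Bayes factor after observing n/2 successes in n binomial trials
    (the data one expects under theta* = 0.5); the likelihood is rescaled
    by its maximum to avoid numerical underflow."""
    def lik(t):
        return (4.0 * t * (1.0 - t)) ** (n / 2)
    m0 = integrate(lik, 0.0, 0.7) / 0.7   # marginal likelihood under H0
    m1 = integrate(lik, 0.4, 1.0) / 0.6   # marginal likelihood under H1
    return m0 / m1

# The Bayes factor tends neither to 0 nor to infinity but to the finite
# constant 0.6/0.7, the ratio of the hypothesis-based prior densities at
# theta*: no amount of data yields decisive evidence.
small_n, large_n = bf_overlapping(20), bf_overlapping(2000)
```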
6 Example
An example shall be provided that leads to a paradox with the representation of hypotheses via parameter distributions. Assume the data are characterized by a binomially distributed random quantity $X \sim \text{Binomial}(n, \theta)$, with sample size $n$ and $\theta$ (probability of success) being the parameter of interest. Both hypotheses shall have an identical support, but different prior (beta) distributions (see Figure 4 left, rounded boxes):

(16) $H_0: \theta \sim \text{Beta}(\alpha_0, \beta_0) \quad \text{vs.} \quad H_1: \theta \sim \text{Beta}(\alpha_1, \beta_1)\,.$

Further, both hypotheses shall have equal prior probabilities (see Figure 4 left).
Now, assume that $k$ successes were observed in the $n$ trials. The resulting Bayes factor leads to posterior probabilities $P(H_0|x)$ and $P(H_1|x)$ favoring the alternative hypothesis (see Figure 4 right). However, also the within-hypothesis beta distributions get updated by observing the data (see Figure 4a, right, rounded boxes; else updating is inconsistent (see Figure 4b)):

(17) $\theta\,|\,x, H_0 \sim \text{Beta}(\alpha_0 + k,\, \beta_0 + n - k) \quad \text{and} \quad \theta\,|\,x, H_1 \sim \text{Beta}(\alpha_1 + k,\, \beta_1 + n - k)\,.$
This example was constructed such that the posterior null hypothesis has the same distribution as the prior alternative hypothesis, i.e. $\text{Beta}(\alpha_0 + k, \beta_0 + n - k) = \text{Beta}(\alpha_1, \beta_1)$. If a theoretical position is represented by a parameter distribution, then both of these hypotheses represent the same theoretical position (Figure 4a). Although the prior alternative hypothesis gains credibility by observing the data, the posterior null hypothesis has less credibility due to observing the data. Paradoxically, both hypotheses represent the same theoretical position, so it is not clear whether the data agree or disagree with this theoretical position.
In order to resolve this paradox, hypotheses need to be considered as sets of parameters. Then, both hypotheses hypothesize the same parameter set and thus represent the same theoretical position. Accordingly, the research question that is addressed by this analysis contrasts a theoretical position against itself. Naturally, no data can inform this pointless contrast.
7 Discussion
This paper elaborated that hypotheses should be represented by sets of parameters only, not by parameter distributions. If they are, updating consistency and a proper evidential interpretation of Bayes factors are guaranteed; if not, the foundational and evidential basis of Bayes factors is severely impaired. In that, a clear distinction between theoretical positions and knowledge about the phenomenon of interest is mandatory: the content of theoretical positions should only inform the specification of the hypotheses (sets of parameters), and the available knowledge should only inform the specification of the prior distribution.
7.1 Empirical Content
It is argued that by using parameter distributions to represent hypotheses, their empirical content can be increased (Vanpaemel & Lee, 2012, p. 1052), supplemented by references to Popper (2002[1935]). However, the elaborations within this paper cast doubt on this claim. According to Popper (2002[1935], p. 103), the empirical content of a statement is “the class of its potential falsifiers”, such that a higher empirical content is characterized by a larger class of potential falsifiers. Now, one needs a proper concept of “falsifiability” in the Bayesian framework, and it appears that defining such a concept is fundamentally difficult, as Popper’s elaborations of induction are restricted to the modus tollens of deductive tests (Popper, 2002[1935], p. 19), while the Bayesian framework tries to formalize induction by a logic of partial beliefs (cp. e.g. Ly et al., 2016, p. 20). If one tries to find such a concept at all, one might say that a hypothesis is falsified if its probability is zero. Naturally, a zero probability of a hypothesis will not be obtained in a scientific investigation (which assumes nonzero prior probabilities), so – practically – one might stop if the probability of a hypothesis is sufficiently small and then decide to treat this hypothesis as falsified. In terms of the limit behavior of Bayes factors, this resembles the case in which the Bayes factor tends towards 0 or ∞ (cp. also Rouder, Haaf, & Vandekerckhove, 2018, p. 105). This refers to the first two cases in equation (15). In the third case, however, evidence will never be conclusive, so it will not be possible to “falsify” any of the hypotheses with the given experiment. In that, the class of potential falsifiers of each hypothesis is determined solely by the support of its within-hypothesis prior distribution. Therefore, it is only the supports that determine the empirical content of the hypotheses, not the exact shapes of the within-hypothesis prior distributions.
Consequently, beyond their mere supports, prior distributions do not increase the empirical content of hypotheses.
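A small simulation can illustrate why only the supports matter for the limit behavior. The interval hypotheses, the two prior shapes, and the data-generating value below are assumptions chosen for illustration, not taken from the paper:

```python
# Assumed setup (illustration only): H0 has support [0, 0.5], H1 has support
# (0.5, 1], and the true success probability 0.7 lies in H1's support only.
# Two different prior shapes on H1's support give the same limit behavior.
from math import comb

def marginal(k, n, density, lo, hi, grid=2000):
    """Midpoint-rule integral of Binomial(k | n, t) * density(t) over [lo, hi]."""
    h = (hi - lo) / grid
    total = 0.0
    for i in range(grid):
        t = lo + (i + 0.5) * h
        total += comb(n, k) * t ** k * (1 - t) ** (n - k) * density(t)
    return total * h

flat = lambda t: 2.0              # uniform density on an interval of length 0.5
skewed = lambda t: 8 * (t - 0.5)  # triangular density on (0.5, 1], integrates to 1

true_theta = 0.7
for n in (10, 100, 1000):
    k = round(true_theta * n)     # idealized data at the true rate
    m0 = marginal(k, n, flat, 0.0, 0.5)   # H0 with a uniform prior on [0, 0.5]
    for name, density in (("flat", flat), ("skewed", skewed)):
        bf10 = marginal(k, n, density, 0.5, 1.0) / m0
        print(f"n = {n:4d}, H1 prior {name:6s}: BF10 = {bf10:.3g}")
# For both prior shapes, BF10 grows without bound as n increases, because the
# true parameter lies in H1's support only.
```

Changing the prior shape on a fixed support rescales the Bayes factor at finite sample sizes, but it cannot change which hypothesis the limit favors; only the supports can.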
7.2 Nil-Hypotheses
Acknowledging that hypotheses are only the supports of the within-hypothesis prior distributions, it appears that many elaborations in the context of Bayes factors (e.g. Gönen et al., 2005; Rouder et al., 2009; Rouder, Haaf, & Vandekerckhove, 2018; Rouder, Haaf, & Aust, 2018; Dienes, 2019) still use a sharp null hypothesis, which hypothesizes only one single parameter value (cp. also Tendeiro & Kiers, 2019, p. 787). In that, these hypotheses are identical to those employed in conventional null hypothesis significance testing (NHST), such that the heavy critique about the uselessness of these hypotheses (see e.g. Berkson, 1938; Cohen, 1994; Kirk, 1996; Gigerenzer, 2004) applies to these Bayes factors as well. The inclusion of the parameter distribution in the statistical analysis does not address these issues. To do so, hypotheses need to be specified as sets of parameters that correspond to the theoretical positions that are of interest within the research question. Then, these hypotheses are typically not single-valued anymore. In this regard, the methodological development of Bayes factors with reasonably specified interval-valued hypotheses needs to be addressed more intensively. Although a few elaborations exist (cp. Morey & Rouder, 2011; Hoijtink et al., 2019; Heck et al., 2020), this development is treated as rather ancillary within the Bayes factor literature. Alternative hypothesis-based methods (see e.g. Lakens, 2017; Lakens et al., 2018; Kruschke, 2015, 2018) have already started to address this necessity of allowing reasonably specified interval-valued hypotheses, and Bayes factors need to go along with them.
7.3 Knowledge vs. Theoretical Positions
The central message of this paper is that knowledge and theoretical positions about the phenomenon of interest need to be distinguished. The former informs the specification of the prior distribution; the latter informs the specification of the hypotheses. In that sense, both of these mathematical objects (prior distribution, hypotheses) and real-world concepts (knowledge, theoretical positions) are independent of each other. This can also be seen from the fact that it is possible to specify a prior distribution without having hypotheses (as in a non-hypothesis-based Bayesian analysis), and it is possible to specify hypotheses without having a prior distribution (as in non-Bayesian hypothesis-based analyses). Yet, it is possible to depict the prior distribution in dependence on the hypotheses via the within-hypothesis prior distributions (equations (10) and (11)) and the prior probabilities of the hypotheses (equation (7)). Strikingly, after combining these components into the overall prior distribution (equation (1)), its dependence on the hypotheses is gone! Naturally, what is known about the phenomenon of interest does not primarily depend on which hypothetical conjectures might be possible about it. This has serious implications for how to specify the essential quantities in a hypothesis-based Bayesian analysis: It is recommended to specify the overall prior distribution and the hypotheses independently. If, in contrast, the within-hypothesis prior distributions shall be specified directly, the applied scientist needs to make sure that, after combining them with the prior probabilities of the hypotheses into the overall prior distribution (equation (1)), the dependence on the hypotheses vanishes. This seems quite remarkable.
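The recommended specification order can be sketched as follows; the Beta(2, 2) overall prior and the interval hypotheses are assumed values for illustration, not taken from the paper:

```python
# Sketch of the recommendation: specify the overall prior and the hypotheses
# independently, then *derive* the prior hypothesis probabilities and the
# within-hypothesis priors by conditioning.
from math import gamma

def overall_prior(t):
    """Assumed overall prior density on theta: Beta(2, 2)."""
    return t * (1 - t) * gamma(4) / (gamma(2) * gamma(2))

def integrate(f, lo, hi, grid=10000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / grid
    return sum(f(lo + (i + 0.5) * h) for i in range(grid)) * h

theta0 = (0.0, 0.5)   # H0: theta in [0, 0.5]   (assumed interval hypotheses)
theta1 = (0.5, 1.0)   # H1: theta in (0.5, 1]

# Derived, not separately specified:
p_h0 = integrate(overall_prior, *theta0)   # prior probability of H0
p_h1 = integrate(overall_prior, *theta1)   # prior probability of H1

def within(t, support, p):
    """Within-hypothesis prior: overall prior restricted to the support, renormalized."""
    lo, hi = support
    return overall_prior(t) / p if lo <= t <= hi else 0.0

# Recombining the derived components recovers the overall prior pointwise,
# so its dependence on the hypotheses vanishes, as required:
t = 0.3
recombined = p_h0 * within(t, theta0, p_h0) + p_h1 * within(t, theta1, p_h1)
print(p_h0, recombined, overall_prior(t))  # p_h0 ≈ 0.5; recombined matches overall_prior(0.3)
```

Specified this way, the within-hypothesis priors are derived quantities, so the overall prior cannot end up depending on the hypotheses by construction.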
7.4 Outlook
Looking at the history of Bayesian statistics, it appears that prior distributions have always had a bad reputation. Against this background, it seems that the idea of using prior distributions to formalize theoretical positions was motivated by the intention of correcting this bad reputation. For example, Vanpaemel & Lee (2012, both quotes on p. 1048) stated that they “do not agree that priors are an unwanted aspect of the Bayesian framework” and that they “believe that it is wrong to malign priors as a necessary evil”. One can only agree! Parameter distributions are a vital part of Bayesian statistics and must not be condemned. This elaboration clarified the distinction between knowledge (parameter distribution) and theoretical positions (hypotheses) and thereby aims to contribute to a correct employment of parameter distributions in the Bayesian framework.
References
Berger, J. O. (1985). Statistical decision theory and Bayesian analysis (2nd ed.). New York: Springer. doi:10.1007/9781475742862
Berger, J. O., & Wolpert, R. L. (1988). The likelihood principle. Lecture Notes–Monograph Series, 6, iii–160.2 (discussion: 160.3–199). http://www.jstor.org/stable/4355509
Berkson, J. (1938). Some difficulties of interpretation encountered in the application of the chi-square test. Journal of the American Statistical Association, 33(203), 526–536. doi:10.2307/2279690
Blume, J. D. (2011). Likelihood and its evidential framework. In P. S. Bandyopadhyay & M. R. Forster (Eds.), Philosophy of statistics (pp. 493–511). Elsevier. doi:10.1016/B9780444518620.500149
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003. doi:10.1037/0003066X.49.12.997
Dienes, Z. (2019). How do I know what my theory predicts? Advances in Methods and Practices in Psychological Science, 2(4), 364–377. doi:10.1177/2515245919876960

Ebner, L., Schwaferts, P., & Augustin, T. (2019). Robust Bayes factor for independent two-sample comparisons under imprecise prior information. In J. De Bock, C. P. de Campos, G. de Cooman, E. Quaeghebeur, & G. Wheeler (Eds.), Proceedings of the Eleventh International Symposium on Imprecise Probability: Theories and Applications (Vol. 103, pp. 167–174). PMLR. http://proceedings.mlr.press/v103/ebner19a.html
Etz, A., Gronau, Q. F., Dablander, F., Edelsbrunner, P. A., & Baribault, B. (2018). How to become a Bayesian in eight easy steps: An annotated reading list. Psychonomic Bulletin & Review, 25, 219–234. doi:10.3758/s1342301713175
Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33(5), 587–606. doi:10.1016/j.socec.2004.09.033

Gönen . (2005)
Goenen2005Gönen, M., Johnson, WO., Lu, Y. Westfall, PH.
2005.
The Bayesian TwoSample t Test The Bayesian twosample t test.
The American Statistician59252–257. 10.1198/000313005X55233  Heck . (2020) Heck2020Heck, DW., Boehm, U., BöingMessing, F., Bürkner, PC., Derks, K., Dienes, Z.others 2020. A Review of Applications of the Bayes Factor in Psychological Research A review of applications of the Bayes factor in psychological research. 10.31234/osf.io/cu43g
 Hoijtink . (2019) Hoijtink2019Hoijtink, H., Mulder, J., van Lissa, C. Gu, X. 2019. A Tutorial on Testing Hypotheses Using the Bayes Factor. A tutorial on testing hypotheses using the Bayes factor. Psychological Methods245539–556. 10.1037/met0000201
 Jaynes (2003) Jaynes2003Jaynes, ET. 2003. Probability Theory: The Logic of Science Probability theory: The logic of science. Cambridge University Press. 10.1017/CBO9780511790423
 Jeffreys (1961) Jeffreys1961Jeffreys, H. 1961. Theory of Probability Theory of probability (Third ). OxfordOxford University Press.
 Kass (2011) Kass2011Kass, RE. 2011. Statistical Inference: The Big Picture Statistical inference: The big picture. Statistical Science: A Review Journal of the Institute of Mathematical Statistics2611–9. 10.1214/10STS337
 Kass Raftery (1995) Kass1995Kass, RE. Raftery, AE. 1995. Bayes Factors Bayes factors. Journal of the American Statistical Association90773–795. 10.2307/2291091
 Kirk (1996) Kirk1996Kirk, RE. 1996. Practical Significance: A Concept Whose Time has Come Practical significance: A concept whose time has come. Educational and Psychological Measurement565746–759. 10.1177/0013164496056005002
 Kruschke (2015) Kruschke2015Kruschke, JK. 2015. Doing Bayesian Data Analysis: A Tutorial With R, JAGS, and Stan Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. New YorkAcademic Press. 10.1016/B9780124058880.099992

Kruschke, J. K. (2018). Rejecting or accepting parameter values in Bayesian estimation. Advances in Methods and Practices in Psychological Science, 1(2), 270–280. doi:10.1177/2515245918771304
Kruschke, J. K., & Liddell, T. M. (2018). Bayesian data analysis for newcomers. Psychonomic Bulletin & Review, 25(1), 155–177. doi:10.3758/s1342301712721
Lakens, D. (2017). Equivalence tests: A practical primer for t tests, correlations, and meta-analyses. Social Psychological and Personality Science, 8(4), 355–362. doi:10.1177/1948550617697177
Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing for psychological research: A tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. doi:10.1177/2515245918770963
Liu, C. C., & Aitkin, M. (2008). Bayes factors: Prior sensitivity and model generalizability. Journal of Mathematical Psychology, 52(6), 362–375. doi:10.1016/j.jmp.2008.03.002
Ly, A., Verhagen, J., & Wagenmakers, E.-J. (2016). Harold Jeffreys’s default Bayes factor hypothesis tests: Explanation, extension, and application in psychology. Journal of Mathematical Psychology, 72, 19–32. doi:10.1016/j.jmp.2015.06.004
Morey, R. D., Romeijn, J.-W., & Rouder, J. N. (2016). The philosophy of Bayes factors and the quantification of statistical evidence. Journal of Mathematical Psychology, 72, 6–18. doi:10.1016/j.jmp.2015.11.001
Morey, R. D., & Rouder, J. N. (2011). Bayes factor approaches for testing interval null hypotheses. Psychological Methods, 16(4), 406–419. doi:10.1037/a0024377
Platt, J. R. (1964). Strong inference. Science, 146(3642), 347–353. doi:10.1126/science.146.3642.347
Popper, K. (2002[1935]). The logic of scientific discovery (2nd ed.). London: Routledge Classics. doi:10.4324/9780203994627
Rouder, J. N., Haaf, J. M., & Aust, F. (2018). From theories to models to predictions: A Bayesian model comparison approach. Communication Monographs, 85(1), 41–56. doi:10.1080/03637751.2017.1394581
Rouder, J. N., Haaf, J. M., & Vandekerckhove, J. (2018). Bayesian inference for psychology, part IV: Parameter estimation and Bayes factors. Psychonomic Bulletin & Review, 25(1), 102–113. doi:10.3758/s1342301714207
Rouder, J. N., & Morey, R. D. (2011). A Bayes factor meta-analysis of Bem’s ESP claim. Psychonomic Bulletin & Review, 18(4), 682–689. doi:10.3758/s1342301100887
Rouder, J. N., Morey, R. D., & Wagenmakers, E.-J. (2016). The interplay between subjectivity, statistical practice, and psychological science. Collabra: Psychology, 2(1). doi:10.1525/collabra.28
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16(2), 225–237. doi:10.3758/PBR.16.2.225
Royall, R. (1997). Statistical evidence: A likelihood paradigm. Chapman and Hall. doi:10.1201/9780203738665
Royall, R. (2004). The likelihood paradigm for statistical evidence. In M. L. Taper & S. R. Lele (Eds.), The nature of scientific evidence: Statistical, philosophical, and empirical considerations (pp. 119–152). University of Chicago Press. doi:10.7208/CHICAGO/9780226789583.003.0005
Rüger, B. (1998). Test- und Schätztheorie: Band I: Grundlagen. De Gruyter Oldenbourg. doi:10.1524/9783486599701
Schönbrodt, F. D., & Wagenmakers, E.-J. (2018). Bayes factor design analysis: Planning for compelling evidence. Psychonomic Bulletin & Review, 25(1), 128–142. doi:10.3758/s134230171230y
Schwaferts, P., & Augustin, T. (2021). Updating consistency in Bayes factors (No. 236). Ludwig-Maximilians-University Munich, Department of Statistics. doi:10.5282/ubm/epub.75073
Sinharay, S., & Stern, H. S. (2002). On the sensitivity of Bayes factors to the prior distributions. The American Statistician, 56(3), 196–201. doi:10.1198/000313002137
Tendeiro, J. N., & Kiers, H. A. L. (2019). A review of issues about null hypothesis Bayesian testing. Psychological Methods, 24(6), 774–795. doi:10.1037/met0000221
Vanpaemel, W. (2010). Prior sensitivity in theory testing: An apologia for the Bayes factor. Journal of Mathematical Psychology, 54(6), 491–498. doi:10.1016/j.jmp.2010.07.003
Vanpaemel, W., & Lee, M. D. (2012). Using priors to formalize theory: Optimal attention and the generalized context model. Psychonomic Bulletin & Review, 19(6), 1047–1056. doi:10.3758/s1342301203004