Power and sample size are always calculated with respect to a hypothesis that the researcher specifies. From a statistical point of view, the hypothesis is a statement about the relationship between two or more values. To define the hypothesis, we first set the values that we are testing. These can be quantities such as the mean response to a treatment or the proportion of subjects with an outcome. We must also define the relationship that we are interested in testing.

We will broadly consider three relationships: two-sided (or equality), one-sided (or superiority or non-inferiority), and equivalence. The specific names applied depend on the reference used.

Suppose that we are interested in comparing a statistic, \(\theta\), between a test treatment \(T\) and a reference treatment \(R\). We can formalize the relationships as follows:

A two-sided hypothesis is used when we want to show a difference between the two values. The hypothesis is given by:
\[ H_0: \theta_{T} = \theta_{R} \text{ versus } H_a: \theta_{T} \ne \theta_{R} \]
A two-sided hypothesis is generally preferred by the FDA.

Strictly speaking, non-inferiority tests, tests of superiority, and one-sided tests are not the same. However, each of them can be written in the following form:
\[ H_0: \theta_{T} - \theta_{R} \le \delta \text{ versus } H_a: \theta_{T} - \theta_{R} \gt \delta \]
where \(\delta\) is some clinically meaningful difference.
These types of tests are commonly used, but regulatory bodies such as the FDA prefer that they be used only when the outcome of interest can occur in only one tail and it is inconceivable that the effect occurs in the other direction.
For all three, depending on how you define \(\delta\), the sample size and power calculations come out the same. Therefore we use the single term 'one-sided' to stand for all three.
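
As a sketch of how the three cases share this form, assume that larger values of \(\theta\) are better and write \(\delta_0 > 0\) for a margin (the symbol \(\delta_0\) is introduced here for illustration only). The three tests then correspond to different choices of \(\delta\):
\[ \delta = 0 \text{ (one-sided)}, \qquad \delta = -\delta_0 \text{ (non-inferiority)}, \qquad \delta = \delta_0 \text{ (superiority)} \]
For example, with \(\delta = -\delta_0\) the alternative reads \(\theta_{T} - \theta_{R} \gt -\delta_0\), which says that the test treatment is not worse than the reference by \(\delta_0\) or more.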

Equivalence tests are two-sided tests of whether the parameters of interest are equal up to some tolerance, \(\delta\). Formally, the hypothesis is given by:
\[ H_0: |\theta_{T} - \theta_{R}| \ge \delta \text{ versus } H_a: |\theta_{T} - \theta_{R}| \lt \delta \]
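
One common way to read this hypothesis, sketched here as an illustration, is as a pair of one-sided hypotheses. This is often called the two one-sided tests (TOST) procedure:
\[ H_{01}: \theta_{T} - \theta_{R} \le -\delta \text{ versus } H_{a1}: \theta_{T} - \theta_{R} \gt -\delta \]
\[ H_{02}: \theta_{T} - \theta_{R} \ge \delta \text{ versus } H_{a2}: \theta_{T} - \theta_{R} \lt \delta \]
Equivalence is concluded only when both one-sided null hypotheses are rejected, each at level \(\alpha\).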

The choice of hypothesis has a direct impact on the sample size. Therefore it is important that the hypothesis be fully specified before the power or sample size calculation. Otherwise, you risk an under-powered or inefficient study for the true hypothesis of interest.
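
To make the impact concrete, the following is a minimal sketch that compares the per-group sample size under the three hypotheses. It rests on assumptions that are not stated in the text: a two-arm parallel design comparing means, equal allocation, a known common standard deviation `sigma`, and the usual large-sample normal approximation. The function names and the example inputs are hypothetical and chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def z(p):
    """Upper-tail standard normal quantile: the value c with P(Z > c) = p."""
    return NormalDist().inv_cdf(1 - p)


def n_two_sided(diff, sigma, alpha=0.05, power=0.80):
    """Per-group n for H0: theta_T = theta_R (assumed normal approximation)."""
    if diff == 0:
        raise ValueError("the assumed true difference must be non-zero")
    beta = 1 - power
    return ceil(2 * sigma**2 * (z(alpha / 2) + z(beta)) ** 2 / diff**2)


def n_one_sided(diff, delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for H0: theta_T - theta_R <= delta (assumed normal approximation)."""
    if diff <= delta:
        raise ValueError("the assumed true difference must exceed delta")
    beta = 1 - power
    return ceil(2 * sigma**2 * (z(alpha) + z(beta)) ** 2 / (diff - delta) ** 2)


def n_equivalence(diff, delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for H0: |theta_T - theta_R| >= delta (assumed normal approximation)."""
    if abs(diff) >= delta:
        raise ValueError("the assumed true difference must lie inside (-delta, delta)")
    beta = 1 - power
    return ceil(2 * sigma**2 * (z(alpha) + z(beta / 2)) ** 2 / (delta - abs(diff)) ** 2)


if __name__ == "__main__":
    sigma = 10.0  # assumed common standard deviation (illustrative value)
    print("two-sided  :", n_two_sided(diff=5.0, sigma=sigma))
    print("one-sided  :", n_one_sided(diff=5.0, delta=0.0, sigma=sigma))
    print("equivalence:", n_equivalence(diff=0.0, delta=5.0, sigma=sigma))
```

With these illustrative inputs (5% significance level, 80% power), the sketch gives 63, 50, and 69 subjects per group respectively, so the same nominal effect size leads to noticeably different study sizes depending on which hypothesis is actually being tested.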


  • Dubey, S.D. (1991), "Some thoughts on the one-sided and two-sided tests," Journal of Biopharmaceutical Statistics, 1, 139-150.
  • Chow, S., Shao, J., & Wang, H. (2003), Sample size calculations in clinical research, New York: Marcel Dekker.