Post hoc Power

I understand, to some degree, the idea of discussing the limitations of a study due to small sample size constraints, which may result in limited power to detect differences between groups or an effect of a covariate on an outcome.  What I do not fully understand, or agree with, is the need to discuss the post hoc power of a study.  I will use “significance” in the traditional sense of statistical significance at a p-value of 0.05.  This discussion does not pertain to all analyses (likelihood-based analyses, for example), but mainly to frequentist analyses.

The reasons are:

1) The power to detect a difference given your data is either 0 or 1: either you detected it or you did not.

2) You do not do power calculations on differences you find significant.  Saying “we were underpowered to detect a difference in X between groups 1 and 2, but we saw one regardless” sounds foolish, whereas the reverse statement does not.

3) Post hoc power calculations are usually still based on differences much larger than the one seen in the data.  For example, a difference of 5% was seen in the data, but the power calculation is based on 10%, since that is the clinically/practically significant difference (a sketch of this follows the list).  Overall, this implies that we do not think we have any better estimate of the difference after collecting the data.  The investigators’ prior does not seem to have moved after this round of data collection, so why would it move in the next round?

4) The analysis inherently assumes that if we collected more data we could potentially find a difference, but that next study is rarely, if ever, done or even planned.
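
To make reason 3 concrete, here is a minimal sketch comparing the approximate power of a two-sample proportions z-test at the clinically meaningful 10% difference versus the 5% difference actually observed.  The per-group sample size of 200 and the 30% baseline rate are hypothetical numbers chosen purely for illustration, not taken from any particular study.

```python
from scipy.stats import norm


def two_sample_prop_power(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions."""
    se = ((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group) ** 0.5
    z_crit = norm.ppf(1 - alpha / 2)
    z_effect = abs(p1 - p2) / se
    # Power = probability the test statistic lands in either rejection region.
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


n = 200          # hypothetical per-group sample size
baseline = 0.30  # hypothetical control-group event rate

# Power at the difference deemed clinically/practically significant (10%)...
print(two_sample_prop_power(baseline, baseline + 0.10, n))  # roughly 0.56

# ...versus power at the difference actually observed in the data (5%).
print(two_sample_prop_power(baseline, baseline + 0.05, n))  # roughly 0.19
```

The normal-approximation formula here is just the textbook two-sample z-test power; any standard power routine would show the same gap.  Reporting the first number in a discussion section, after having observed only the second difference, is exactly the kind of post hoc power statement at issue.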

Thus, I feel as though this post hoc power discussion biases your results and reveals an underlying opinion that, even though you did not see a significant result after collecting your data, you still think one exists.  To me, this calls an investigator’s equipoise into question, though journals rarely seem to grasp this implication and reviewers consistently request discussions of power.