The Dirichlet distribution, also known as the multivariate beta distribution, is a probability distribution often used to model the probabilities of a set of correlated categorical outcomes. This is natural because the Dirichlet is a multivariate distribution with support on the probability simplex, and each of its marginal distributions is a univariate beta. In practical terms, this means that for each vector sampled from a Dirichlet, every element falls in the $(0,1)$ interval and the elements sum to one.
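As a minimal sketch of these properties, the snippet below draws vectors from a Dirichlet with NumPy and checks them empirically. The concentration parameters `alpha`, the seed, and the sample size are illustrative assumptions, not values from any particular analysis.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical concentration parameters for three categories (an assumption).
alpha = np.array([2.0, 3.0, 5.0])

# Each row is one draw: a vector of category probabilities.
samples = rng.dirichlet(alpha, size=10_000)

# Every element lies in (0, 1) and each row sums to one.
in_unit_interval = bool(np.all((samples > 0) & (samples < 1)))
rows_sum_to_one = bool(np.allclose(samples.sum(axis=1), 1.0))

# The i-th marginal is Beta(alpha_i, alpha_0 - alpha_i),
# so its mean is alpha_i / alpha_0, where alpha_0 = sum(alpha).
alpha_0 = alpha.sum()
expected_means = alpha / alpha_0
empirical_means = samples.mean(axis=0)

print(in_unit_interval, rows_sum_to_one)
print(expected_means, empirical_means)
```

The empirical column means should sit close to `alpha / alpha_0`, which is one quick way to see the beta marginals at work without writing out the density.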

Many researchers, students, and consumers of empirical research have a poor understanding of probability distributions, calculus, and other key concepts necessary for mathematical statistics. At the same time, even researchers with PhDs in quantitative fields can have difficulty understanding and interpreting concepts like p-values and confidence intervals. Over time, I’ve found that the best way to help individuals think through their quantitative problems and understand the logic of statistical inference is to focus on the data-generating process as the concept of interest.

Social scientists rarely provide explicit justification for choices that directly affect how well their research designs can provide evidence for or against their hypotheses. While recent developments, such as pre-registration plans, encourage researchers to think more carefully about the ability of their studies to precisely identify the sign and magnitude of the relationships between theoretical constructs, it remains the case that few researchers justify the statistical power of their designs.