# Cluster Unit Randomized Trials

## 7. Sample Size Assessment

Methods of sample size estimation for comparing other population parameters of interest, such as proportions and incidence rates, follow the same general principles, as discussed by Donner and Klar (2000, chapter 5). For example, to compare two proportions P_{1} and P_{2}, the required sample size may be obtained by replacing 2σ^{2} in the formula for comparing two means by P_{1}(1-P_{1})+P_{2}(1-P_{2}) and μ_{1}-μ_{2} by P_{1}-P_{2}. Despite the wide availability of such methods, several reviews of cluster randomization trials performed over the last 20 years show that far fewer than 50% of such trials report having used them (Donner et al., 1990; Simpson et al., 1995; Smith et al., 1997; Varnell et al., 2004; Murray et al., 2008). An exception to this discouraging trend may be emerging in the field of primary care, where a recent review by Eldridge et al. (2008) found that 62% of the trials reviewed accounted for clustering effects in their sample size calculations, a vast improvement over the results of previous reviews.
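The substitution for proportions can be sketched in a few lines of Python. This is only an illustration: it assumes the formula for comparing two means has the usual form n = (z_{α/2} + z_{β})^{2} · 2σ^{2} · [1 + (m - 1)ρ] / (μ_{1} - μ_{2})^{2} subjects per group, with m subjects per cluster and a two-sided test. The function name, the default significance level and power, and the input values are hypothetical choices made for the example.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group_proportions(p1, p2, m, rho, alpha=0.05, power=0.80):
    """Subjects per group needed to compare two proportions.

    Assumes a two-sided test at level alpha, m subjects per cluster,
    and intracluster correlation rho (all assumptions of this sketch).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_term = p1 * (1 - p1) + p2 * (1 - p2)  # takes the place of 2*sigma^2
    design_effect = 1 + (m - 1) * rho              # inflation due to clustering
    n = (z_alpha + z_beta) ** 2 * variance_term * design_effect / (p1 - p2) ** 2
    return ceil(n)


# Illustrative inputs only: 30% versus 20%, clusters of 50, rho = 0.01.
n = subjects_per_group_proportions(p1=0.30, p2=0.20, m=50, rho=0.01)
print("subjects per group:", n)
print("clusters per group:", ceil(n / 50))
```

Setting `rho=0` in this sketch gives the sample size for individual randomization, so comparing the two results shows how much clustering inflates the requirement.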

Values of ρ required for sample size estimation are usually obtained from trials involving the same endpoint and a similar unit of randomization. Fortunately, investigators now tend to report this value fairly frequently. Indeed, some researchers have reported estimates of ρ obtained over a range of studies in a particular research area (e.g., Campbell et al., 2000; Murray et al., 2000; Argarwal et al., 2005; Parker et al., 2005; Gulliford et al., 2005). The difficulty remains, however, that many such estimates are based on a relatively small number of clusters and are consequently subject to considerable uncertainty. It is therefore usually advisable for investigators to perform a sensitivity analysis that explores the impact of different values of ρ on the required sample size.
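A sensitivity analysis of this kind might look like the following Python sketch. It assumes the usual formula for comparing two means, n = (z_{α/2} + z_{β})^{2} · 2σ^{2} · [1 + (m - 1)ρ] / (μ_{1} - μ_{2})^{2} subjects per group. The function names, the candidate values of ρ, and the other inputs are hypothetical and are not taken from any of the studies cited above.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group_means(delta, sigma, m, rho, alpha=0.05, power=0.80):
    """Subjects per group to detect a mean difference delta (two-sided test)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    design_effect = 1 + (m - 1) * rho
    return ceil(z ** 2 * 2 * sigma ** 2 * design_effect / delta ** 2)


def rho_sensitivity(rhos, delta, sigma, m):
    """Map each candidate rho to (subjects per group, clusters per group)."""
    table = {}
    for rho in rhos:
        n = subjects_per_group_means(delta, sigma, m, rho)
        table[rho] = (n, ceil(n / m))
    return table


# Illustrative inputs only: effect size 0.25 SD, clusters of 100 subjects.
results = rho_sensitivity([0.001, 0.01, 0.02, 0.05], delta=0.25, sigma=1.0, m=100)
for rho, (n, k) in results.items():
    print(f"rho={rho:<6} subjects/group={n:>5} clusters/group={k:>3}")
```

With large clusters, even small changes in ρ can change the required number of clusters substantially, which is why the range of plausible values deserves attention.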

For the matched-pair design, the simplest approach to sample size estimation would be to:

- Compute the required number of subjects using standard formulas for the completely randomized design; and
- Multiply the result by the factor 1 - ρ_{M}, where ρ_{M} is an estimate of the likely size of the matching correlation.

If such an estimate is not available from previous data, a conservative approach would be to assume that matching is ineffective, i.e. to use the completely randomized formula directly.
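The two steps for the matched-pair design, together with the conservative fallback, can be sketched as follows. The function name and the input values are hypothetical, and restricting ρ_{M} to the range from 0 up to (but not including) 1 is an assumption of this sketch.

```python
from math import ceil


def matched_pair_subjects(n_completely_randomized, rho_m=0.0):
    """Scale a completely randomized sample size by the factor 1 - rho_m.

    rho_m is the anticipated matching correlation. The default of 0.0
    corresponds to the conservative assumption that matching is ineffective.
    """
    if not 0.0 <= rho_m < 1.0:
        raise ValueError("rho_m is assumed to lie in [0, 1) in this sketch")
    return ceil(n_completely_randomized * (1 - rho_m))


# Illustrative inputs only: 400 subjects required without matching.
print(matched_pair_subjects(400, rho_m=0.25))  # matching assumed effective
print(matched_pair_subjects(400))              # conservative: no reduction
```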

An approach for estimating sample size requirements for the stratified cluster randomization design is provided by Donner (1998).
