Design and analysis of stepped wedge cluster randomized trials
Introduction
Cluster (or community, or group) randomized trials (CRTs) are distinguished by the fact that individuals are randomized in groups rather than individually. CRTs have been used to evaluate antismoking interventions [1], [2], methods of preventing human immunodeficiency virus (HIV) and other sexually transmitted diseases (STDs) [3], [4], and in a number of other contexts [5], [6]. Cluster designs may be chosen because the intervention can only be administered on a community-wide scale (e.g. [7]), to minimize contamination [8], or for other logistic, financial or ethical reasons. From a statistical viewpoint, the key characteristic of CRTs is that responses from individuals within the same cluster are correlated, and this correlation must be incorporated into power calculations and the trial analysis.
CRTs often employ a parallel design: for a two-arm study with 2I independent clusters, I clusters are randomly assigned to each intervention at a single time point. If the cluster sizes are all equal, a two-sample t-test may be used to compare cluster-level mean responses between the intervention groups. If there are more than 2 treatment arms, a one-way analysis of variance may be used. Sometimes the communities are matched and randomization is done within the matched sets. In that case, a paired analysis (e.g. paired t-test) is used. When cluster sizes vary, individual level analyses using generalized estimating equations [17] or random effects models [16] may be used. Statistical aspects of the design and analysis of parallel CRTs have been widely discussed (e.g. [9], [10]).
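As a rough illustration of the cluster-level comparison described above, the following sketch computes a pooled two-sample t statistic from cluster mean responses. The function name `cluster_level_t` and the example values are hypothetical and are not taken from any trial; the sketch assumes equal cluster sizes and equal variances in the two arms.

```python
from math import sqrt
from statistics import mean, variance


def cluster_level_t(control_means, intervention_means):
    """Pooled two-sample t statistic on cluster-level means.

    Each argument is a list with one mean response per cluster.
    Returns (t statistic, degrees of freedom).
    """
    n0, n1 = len(control_means), len(intervention_means)
    df = n0 + n1 - 2
    # Pooled estimate of the between-cluster variance (equal-variance assumption).
    pooled_var = (
        (n0 - 1) * variance(control_means)
        + (n1 - 1) * variance(intervention_means)
    ) / df
    se = sqrt(pooled_var * (1 / n0 + 1 / n1))
    t = (mean(intervention_means) - mean(control_means)) / se
    return t, df


# Hypothetical cluster means for a two-arm trial with I = 3 clusters per arm.
t_stat, df = cluster_level_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The statistic would then be compared with a t distribution on `df` degrees of freedom. Because the unit of analysis is the cluster mean, the within-cluster correlation is accounted for without being modeled explicitly.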
In contrast, crossover designs are less commonly used in CRTs (three examples are [6], [11], [12]). A crossover CRT requires fewer clusters than a parallel design but may take twice as long (or longer) to complete (since each cluster receives both the treatment and control interventions). If the intervention requires a lengthy follow up period, then this fact alone might make a crossover design impractical. In a standard crossover design the order of the interventions is randomized for each cluster and a time period (called the “washout” period) is often included between the two interventions so that the first intervention does not affect the second. Analysis of a standard crossover design focuses on within-cluster comparisons using a paired t-test.
A stepped wedge design [13] is a type of crossover design in which different clusters cross over (switch treatments) at different time points. In addition, the clusters cross over in one direction only—typically, from control to intervention. The first time point usually corresponds to a baseline measurement where none of the clusters receive the intervention of interest. At subsequent time points, clusters initiate the intervention of interest and the response to the intervention is measured. More than one cluster may start the intervention at a time point, but the time at which a cluster begins the intervention is randomized. Fig. 1 illustrates the differences between the parallel, traditional crossover and stepped wedge designs.
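The treatment schedule described above can be sketched as a matrix of 0/1 indicators with one row per cluster and one column per time point. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `stepped_wedge_design` is hypothetical, the same number of clusters is assumed to cross over at each step, and a single baseline time point is assumed in which no cluster receives the intervention.

```python
import random


def stepped_wedge_design(n_clusters, n_steps, seed=None):
    """Return X where X[i][j] = 1 if cluster i receives the intervention at time j.

    Assumes one baseline time point followed by one time point per step,
    with n_clusters / n_steps clusters crossing over at each step.
    """
    if n_steps < 1 or n_clusters % n_steps != 0:
        raise ValueError("n_clusters must be a positive multiple of n_steps")
    rng = random.Random(seed)
    order = list(range(n_clusters))
    rng.shuffle(order)  # randomize the time at which each cluster starts
    per_step = n_clusters // n_steps
    n_times = n_steps + 1  # baseline plus one time point per step
    X = [[0] * n_times for _ in range(n_clusters)]
    for rank, cluster in enumerate(order):
        start = 1 + rank // per_step  # first time point on intervention
        for j in range(start, n_times):
            X[cluster][j] = 1  # one-directional: never switches back to control
    return X


# Hypothetical example: 6 clusters, 3 steps, hence 4 time points.
for row in stepped_wedge_design(6, 3, seed=1):
    print(row)
```

Each row is a run of zeros followed by a run of ones, which reflects the unidirectional crossover. Which rows switch at which step depends on the random seed.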
Although the stepped wedge design extends the length of a randomized trial due to the presence of multiple time intervals, the nature of the design may be beneficial in certain settings. In a parallel or traditional crossover design, the intervention must be implemented in half of the total clusters simultaneously. However, limited resources or geographical constraints may make this logistically impossible (e.g. [13]). The stepped wedge design allows the researcher to implement the intervention in a smaller fraction of the clusters at each time point. Another unique feature of the stepped wedge design is that the crossover is unidirectional. All clusters eventually receive the intervention and, in particular, the intervention is never removed once it has been implemented (at least over the course of the trial) which may alleviate ethical and/or community concerns. This makes the stepped wedge design particularly useful for evaluating the population-level impact of an intervention that has been shown to be effective in an individually randomized trial. The unidirectional aspect of the crossover does, however, complicate the analysis since the treatment effect can no longer be estimated exclusively from within-cluster comparisons. More details on the analysis of such trials are provided below.
In Section 2 we describe a trial being conducted in Washington state that uses a stepped wedge design. This motivating example provides a context for the theoretical and simulation results shown in Section 3 where we describe statistical aspects of the design and analysis of stepped wedge CRTs. In Section 4 we summarize our findings and discuss future areas of research.
Section snippets
Example — partner notification
Partner notification is the process by which sex partners of patients with sexually transmitted infections (STIs) are notified of potential exposure to infection and encouraged to seek treatment. Standard practice for partner notification in most states in the US involves contact of partners by public health authorities. However, the high costs associated with this practice have influenced investigators to seek alternative partner treatment methods. One alternative strategy is patient delivered
Statistical issues
In this section we examine a number of issues related to the design and analysis of stepped wedge CRTs.
Discussion
Using theoretical calculations and simulation we have investigated statistical characteristics of the stepped wedge design for cluster randomized trials. In particular, we have outlined a procedure for computing power in such trials and investigated the effect of varying intercluster correlation, number of randomization steps and treatment delay on trial power. The design is relatively insensitive to variations in the intercluster correlation. We also found that, for a fixed number of clusters,
Acknowledgements
This research was supported by NIH grants AI29168 and AI46702.
References
- et al. Aspects of statistical design for the Community Intervention Trial for Smoking Cessation (COMMIT). Control Clin Trials (1992)
- et al. Impact of improved treatment of sexually transmitted diseases on HIV infection in rural Tanzania: randomised controlled trial. Lancet (1995)
- et al. Control of sexually transmitted diseases for AIDS prevention in Uganda: a randomised community trial. Lancet (1999)
- et al. Effects of a workplace physical exercise intervention on the intensity of headache and neck and shoulder symptoms and upper extremity muscular strength of office workers: a cluster randomized controlled crossover trial. J Int Assoc Stud Pain (2005)
- et al. Hutchinson Smoking Prevention Project: long-term randomized trial in school-based tobacco use prevention: results on smoking. J Natl Cancer Inst (2000)
- et al. A cluster randomized trial of a sex education programme in Belize, Central America. Int J Epidemiol (2003)
- On design considerations and randomization-based inference for community intervention trials. Stat Med (1996)
- Contamination in trials: is cluster randomisation the answer? BMJ (2001)
- et al. Design and analysis of cluster randomization trials in health research (2000)
- Design and analysis of group-randomized trials (1998)
- A randomized controlled trial of quality assurance in sixteen ambulatory care practices. Med Care
- The effect of varying levels of outdoor air supply on the symptoms of sick building syndrome. N Engl J Med