Cluster Sampling Calculator

Calculate the required sample size for cluster sampling designs. Enter your population size, confidence level, margin of error, estimated proportion, intracluster correlation, and cluster size. The calculator returns the design effect, the sample size adjusted for that design effect, and the final sample size after accounting for response, eligibility, and coverage rates.

Population Size
Total number of individuals in the target population.

Confidence Level
The long-run share of samples for which the interval (estimate ± margin of error) contains the true population value (e.g. 95%).

Margin of Error (%)
Acceptable margin of error as a percentage (e.g. 5 for ±5%).

Estimated Proportion (%)
Expected prevalence of the characteristic in the population. Use 50% if unknown (gives maximum sample size).

Intracluster Correlation (ICC)
Correlation of responses within a cluster. Higher values increase the design effect. Typical values: 0.01–0.20.

Cluster Size
Number of interviews (respondents) selected within each cluster.

Response Rate (%)
Percentage of sampled individuals expected to respond.

Eligibility Rate (%)
Percentage of contacted individuals who meet eligibility criteria.

Coverage Rate (%)
Percentage of the target population covered by the sampling frame.

Results

Final Sample Size

Simple Random Sampling Size

Design Effect (DEFF)

Sample Size Adjusted for DEFF

Number of Clusters Required

Sample Size Comparison

Frequently Asked Questions

What is cluster sampling?

Cluster sampling is a probability sampling method in which the population is divided into groups (clusters), a random sample of clusters is selected, and individuals within those clusters are surveyed. It is especially useful when the population is geographically spread out or a complete list of individuals is unavailable.

What is the design effect (DEFF) in cluster sampling?

The design effect (DEFF) measures how much the variance of an estimate increases due to the cluster sampling design compared to simple random sampling of the same size. A DEFF greater than 1 means clustering reduces statistical efficiency, which is why you need a larger sample. It is calculated as DEFF = 1 + (b − 1) × ρ, where b is the cluster size and ρ is the intracluster correlation.
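As a minimal sketch of the formula above, the function below computes DEFF from the cluster size and the intracluster correlation. The function name `design_effect` and the restriction to ICC values between 0 and 1 are illustrative assumptions, not part of the calculator.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Return DEFF = 1 + (b - 1) * rho for cluster size b and ICC rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    # Assumption for this sketch: negative ICCs are not handled.
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


# Example: 10 interviews per cluster with an ICC of 0.05.
deff = design_effect(10, 0.05)
print(round(deff, 2))  # 1.45
```

With one interview per cluster the formula returns 1, meaning no efficiency loss relative to simple random sampling.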

What is intracluster correlation (ICC)?

Intracluster correlation (ICC or ρ) measures how similar individuals within the same cluster are relative to individuals from different clusters. A higher ICC means members of a cluster share more similar characteristics, which reduces the information gained from sampling multiple people in the same cluster and increases the required sample size. Typical values range from 0.01 to 0.20 in social research.
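One way to see the information loss is to convert a clustered sample into its effective sample size, the number of independent interviews that would give the same precision. The sketch below assumes the usual relationship n_eff = n / DEFF with DEFF = 1 + (b − 1) × ρ; the function name and the example figures (1,000 interviews, 20 per cluster) are illustrative only.

```python
def effective_sample_size(n: int, cluster_size: int, icc: float) -> float:
    """Approximate number of independent interviews a clustered sample is worth."""
    deff = 1.0 + (cluster_size - 1) * icc
    return n / deff


# Same 1,000 interviews in clusters of 20, at three ICC values.
for icc in (0.01, 0.05, 0.20):
    n_eff = effective_sample_size(1000, 20, icc)
    print(f"ICC = {icc:.2f}: effective sample size is about {n_eff:.0f}")
```

Under these assumptions, an ICC of 0.20 leaves the 1,000 interviews carrying roughly the information of 208 independent ones.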

Why do I need to adjust for response, eligibility, and coverage rates?

Not everyone you sample will respond (response rate), not everyone you contact will meet the eligibility criteria (eligibility rate), and your sampling frame may not cover the entire population (coverage rate). Each shortfall means you must start with more people to reach your target number of completed interviews, so the final sample size is inflated accordingly.
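The page does not state the exact inflation formula. A common convention, assumed here, is to divide the target number of completed interviews by the product of the three rates and round up. The function name `inflate_for_rates` and the example values are hypothetical.

```python
import math


def inflate_for_rates(n_completed: float, response_rate: float,
                      eligibility_rate: float, coverage_rate: float) -> int:
    """Inflate a completed-interview target; rates are fractions in (0, 1]."""
    rates = (response_rate, eligibility_rate, coverage_rate)
    if any(not 0.0 < r <= 1.0 for r in rates):
        raise ValueError("each rate must be greater than 0 and at most 1")
    # Assumed convention: divide by the product of the rates, then round up.
    return math.ceil(n_completed / (response_rate * eligibility_rate * coverage_rate))


# 600 completed interviews needed; 80% response, 90% eligibility, 95% coverage.
print(inflate_for_rates(600, 0.80, 0.90, 0.95))  # 878
```

Rates are passed as fractions (0.80, not 80), so percentage inputs would need to be divided by 100 first.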

What happens if I don't know the estimated proportion?

If you have no prior information about the prevalence of the characteristic you are measuring, use 50% (0.5). The required sample size is proportional to p × (1 − p), which is largest at p = 0.5, so this is the most conservative assumption. It yields the largest sample size estimate and protects you from underpowering your study.
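To illustrate why 50% is the conservative choice, the sketch below evaluates the standard simple random sampling formula n = z² × p × (1 − p) / e² for several proportions. That formula is an assumption here, since the page does not spell out its calculation, and the defaults (z = 1.96, 5% margin) are illustrative.

```python
def srs_sample_size(p: float, margin: float = 0.05, z: float = 1.96) -> float:
    """Unrounded SRS sample size for a proportion, without population correction."""
    return z ** 2 * p * (1 - p) / margin ** 2


# The required size peaks at p = 0.5 and falls off symmetrically on either side.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f}: n is about {srs_sample_size(p):.1f}")
```

At p = 0.5 with these defaults the formula gives about 384 interviews, and every other proportion gives fewer.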

How does confidence level affect sample size?

A higher confidence level requires a larger sample size. For example, moving from a 95% to a 99% confidence level increases the z-score from 1.96 to 2.576. Because the required sample size grows with the square of the z-score, this raises it by a factor of about (2.576 / 1.96)² ≈ 1.73, or roughly 73%. Choosing a confidence level means balancing that extra certainty against available resources.
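The z-scores quoted above can be reproduced with the Python standard library. The sketch below assumes a two-sided interval and uses `statistics.NormalDist`; the helper name `z_score` is illustrative.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


z95 = z_score(0.95)
z99 = z_score(0.99)

# Sample size scales with z squared, so compare the squared ratio.
growth = (z99 / z95) ** 2
print(f"z95 = {z95:.3f}, z99 = {z99:.3f}, sample size factor = {growth:.2f}")
```

The factor comes out near 1.73, so with all other inputs fixed, the 99% design needs roughly 73% more interviews than the 95% design.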

What is the difference between one-stage and two-stage cluster sampling?

In one-stage cluster sampling, all individuals within selected clusters are surveyed. In two-stage cluster sampling, a random subset of individuals is selected from within each chosen cluster. This calculator uses the two-stage approach, where you specify the number of interviews per cluster, which affects the design effect and total number of clusters needed.

How do I calculate the number of clusters needed?

Take the final sample size (the SRS size multiplied by the design effect, then inflated by the adjustment rates), divide it by the number of interviews per cluster, and round up to a whole number. The result is the minimum number of clusters to select at random from the population.
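The steps described on this page can be chained into one end-to-end sketch. Several details are assumptions rather than the calculator's confirmed method: the standard SRS formula for a proportion, a finite population correction, division by the product of the three rates, and rounding up at the final two steps. The function name `cluster_sample_plan` and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def cluster_sample_plan(population, confidence, margin, p, icc, cluster_size,
                        response_rate=1.0, eligibility_rate=1.0, coverage_rate=1.0):
    """Return the intermediate and final sizes; all rates are fractions."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2        # assumed SRS formula
    n_srs = n0 / (1 + (n0 - 1) / population)       # assumed finite population correction
    deff = 1 + (cluster_size - 1) * icc            # DEFF = 1 + (b - 1) * rho
    n_deff = n_srs * deff
    # Assumed inflation: divide by the product of the rates, then round up.
    n_final = math.ceil(n_deff / (response_rate * eligibility_rate * coverage_rate))
    clusters = math.ceil(n_final / cluster_size)   # round up to whole clusters
    return {"n_srs": n_srs, "deff": deff, "n_deff": n_deff,
            "n_final": n_final, "clusters": clusters}


plan = cluster_sample_plan(population=100_000, confidence=0.95, margin=0.05,
                           p=0.5, icc=0.05, cluster_size=10,
                           response_rate=0.80, eligibility_rate=0.90,
                           coverage_rate=0.95)
print(plan["n_final"], plan["clusters"])
```

Rounding up at the cluster step means the fielded sample can slightly exceed the final sample size, since clusters are selected whole.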
