Thursday, December 04, 2003

Sample size and statistical significance

How many people do you need to survey to get a significant result?

For statistical significance (in statistics, "significant" has a very specific meaning), you need to use a valid sample size. You also need to use a valid methodology for selecting who goes into your sample.

As a rough rule of thumb, your sample should be about 10% of your universe, but not smaller than 30 and not greater than 350. If you are doing multivariate analysis, the sample should be ten times the number of variables you are testing.
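
To make the rule of thumb concrete, here is a minimal Python sketch. The function name and the `variables` parameter are made up for illustration. Letting the ten-per-variable rule override the 350 cap, and never sampling more people than the universe contains, are my assumptions; the rule itself does not say how those cases should be handled.

```python
def rule_of_thumb_sample(universe, variables=0):
    """Rough sample size: 10% of the universe, kept between 30 and 350."""
    size = round(universe / 10)
    size = max(30, min(350, size))
    # Assumption: for multivariate analysis, ten cases per variable
    # takes priority over the 350 cap.
    size = max(size, 10 * variables)
    # Assumption: a sample can never be larger than the universe itself.
    return min(size, universe)


for universe in (20, 100, 1000, 10000):
    print(universe, rule_of_thumb_sample(universe))
```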

If you want to be more pedantic, you should define what confidence level you want and what margin of error is acceptable to you. A confidence level of 95% and an error margin of 5% tell you that, if you ran the survey many times, your result would be within 5 percentage points of the true answer about 95% of the time. So if you drew 100 separate samples, about 95 of them would return a result that was within 5 points of the truth.
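
One way to see what that means is to simulate it. The sketch below is only an illustration under assumptions of my own: a very large universe in which exactly half the people would answer "yes", and a sample of 385 people per survey (roughly what the standard formula gives for a 5% margin at 95% confidence). It runs the same survey 2,000 times and counts how often the result lands within the margin.

```python
import random


def share_within_margin(true_rate=0.5, sample_size=385, margin=0.05,
                        surveys=2000, seed=1):
    """Simulate repeated surveys; return the share landing within the margin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(surveys):
        # Each respondent answers "yes" with probability true_rate.
        yes = sum(rng.random() < true_rate for _ in range(sample_size))
        if abs(yes / sample_size - true_rate) <= margin:
            hits += 1
    return hits / surveys


print(share_within_margin())  # expect a value close to 0.95
```

Shrinking `sample_size` while keeping the same margin makes the share drop, which is the trade-off the confidence level describes.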

The correct sample size is a function of three elements: your universe (how many people make up the group whose behavior you are trying to represent), your desired error margin, and your preferred confidence level. It's a simple formula (well, not so simple). For most purposes, I'd go for a 10% error margin at 95% confidence. For varying numbers of learners in your universe, here are the ideal sample sizes at 95% confidence (the first at a 10% error margin, the second at 5%):

50 in the universe, sample 33 or 44
100 in the universe, sample 49 or 80
200 in the universe, sample 65 or 132
500 in the universe, sample 81 or 217
1000 in the universe, sample 88 or 278
and so it goes until you find that an ideal sample for a 10% error margin hardly moves above 96 no matter how big the universe (it's about 384 for 5%).
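
The post does not spell the formula out, so the following is a sketch under an assumption: that the figures come from the standard sample-size formula for a proportion, n0 = z^2 * p * (1 - p) / e^2, adjusted for a finite universe of size N as n = n0 / (1 + (n0 - 1) / N). Here z = 1.96 for 95% confidence, p = 0.5 is the most cautious guess about the true proportion, and e is the error margin. The function and variable names are mine. Rounded to the nearest whole person, it reproduces the table above.

```python
# z-scores (standard normal quantiles) for common confidence levels.
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}


def sample_size(universe, margin=0.10, confidence=0.95, proportion=0.5):
    """Sample size for estimating a proportion in a finite universe."""
    z = Z_SCORES[confidence]
    # Sample size for an effectively unlimited universe.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Finite population correction: small universes need fewer people.
    n = n0 / (1 + (n0 - 1) / universe)
    # Rounded to the nearest person to match the table; many textbooks
    # round up instead, to be on the safe side.
    return round(n)


for universe in (50, 100, 200, 500, 1000):
    print(universe, sample_size(universe, 0.10), sample_size(universe, 0.05))
```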


1 comment:

Anonymous said...

Hi,

This was very useful. Thank you. I have been trying to determine a good sample size. Can you please provide some references or pointers to where you got these estimates from? It would help me a lot.

Thanks,
G