I often see people getting a bit anxious about sample size calculations: you have to find the correct formula for the type of study you are conducting and think about the figures you need to put into the equation. I recently had a client with exactly this problem when setting up a cross-sectional study. As I think it might be helpful to others, I explain here the solutions we found to assist the client, using two tools that are freely available online.
Suppose you are doing a cross-sectional study on the smoking prevalence among male and female university students. If you have a clear idea about the difference in the prevalence of smoking between these two groups (based on other studies), then the OpenEpi website (Open Source Epidemiologic Statistics for Public Health) might be the right tool for a quick sample size calculation. Go to Sample Size, then to Cohort/RCT and Enter new data. Assume you want a two-sided confidence level of 95% (i.e. a significance level of 5%), a power of 80%, two equal groups, and you expect the prevalence of smoking to be 35% among female university students and 50% among males. Then you will get the following result after entering the data:
However, if you are not so sure about the prevalences to expect in the two groups, you need to repeat this exercise for different values (that is, for different effect sizes, here the difference between the two prevalences). The sample sizes could then be as follows (note that here I only present the calculations based on Kelsey):
Note that the above gives the total sample size, so each group will be half of it (e.g. for N = 190, 95 females and 95 males).
Of course, you can also change the significance level, the power, etc.
While working on this, we found a free program online that makes it easy to plot the different sample sizes shown in the table above: G*Power 3.
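For those who would rather script this than re-enter the values by hand, the repeated calculation can be sketched in Python. This is a minimal sketch, assuming the Kelsey formula in its commonly cited form for two groups; the function name `kelsey_sample_size`, its parameters, and the prevalences in the loop are my own illustrative choices and not part of OpenEpi.

```python
from math import ceil
from statistics import NormalDist


def kelsey_sample_size(p1, p2, alpha=0.05, power=0.80, ratio=1.0):
    """Return (n1, n2) from the Kelsey formula (assumed form).

    p1, p2 : expected prevalences in group 1 and group 2 (proportions)
    alpha  : two-sided significance level
    power  : desired power (1 - beta)
    ratio  : size of group 2 relative to group 1 (1.0 = equal groups)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Weighted average prevalence across both groups
    p_bar = (p1 + ratio * p2) / (1 + ratio)
    n1 = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (ratio + 1)
          / (ratio * (p1 - p2) ** 2))
    n1 = ceil(n1)  # always round up to whole participants
    n2 = ceil(ratio * n1)
    return n1, n2


if __name__ == "__main__":
    p_female = 0.30  # illustrative value
    for p_male in (0.40, 0.45, 0.50, 0.55, 0.60):  # illustrative values
        n_f, n_m = kelsey_sample_size(p_female, p_male)
        print(f"females {p_female:.0%}, males {p_male:.0%}: "
              f"{n_f} + {n_m} = {n_f + n_m} in total")
```

With these inputs, 30% versus 60% gives a total of 88 and 30% versus 50% gives a total of 190, the figures mentioned elsewhere in this post. Changing `alpha` or `power` corresponds to changing the significance level or power in the online form.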
After downloading the program, go to the tab ‘Protocol of power analysis’ and select the following:
Then click on ‘calculate’ and you will get the sample size for the parameter values above. Then click on ‘X-Y plot for a range of values’ and change the second line: As a function of ‘Effect size w’ from 0.1 in steps of 0.01 through to 0.5. This results in the following graph:
OpenEpi gave a sample size of 88 for an effect size of 30% (prevalence 30% among females and 60% among males), and this agrees with the graph, which gives a sample size of 87 for an effect size of 0.3. Note that the scale of the graph may not let you read off the exact sample size; in that case just click on the tab Table (next to Graph) to get the exact numbers.
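The agreement between the two tools can also be checked numerically. The sketch below assumes that the G*Power analysis is a chi-square test with one degree of freedom at a significance level of 0.05 and a power of 0.8, and that Cohen's w for two equal-sized groups is |p1 − p2| / (2·sqrt(p̄(1 − p̄))). Both are assumptions on my part, and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()


def cohens_w(p1, p2):
    """Cohen's w for a 2x2 table with two equal-sized groups (assumed setup)."""
    p_bar = (p1 + p2) / 2
    return abs(p1 - p2) / (2 * sqrt(p_bar * (1 - p_bar)))


def chi2_power_df1(n, w, alpha=0.05):
    """Power of a 1-df chi-square test with total sample size n and effect size w.

    With one degree of freedom the noncentral chi-square variable is the
    square of a shifted standard normal, so the normal CDF is sufficient.
    """
    z_crit = _norm.inv_cdf(1 - alpha / 2)
    shift = w * sqrt(n)  # square root of the noncentrality parameter
    return 1 - _norm.cdf(z_crit - shift) + _norm.cdf(-z_crit - shift)


def min_total_n(w, power=0.80, alpha=0.05):
    """Smallest whole total sample size that reaches the requested power."""
    n = 2
    while chi2_power_df1(n, w, alpha) < power:
        n += 1
    return n


if __name__ == "__main__":
    print("w for 30% vs 60%:", round(cohens_w(0.30, 0.60), 3))
    for step in range(10, 51, 5):  # w from 0.10 to 0.50
        w = step / 100
        print(f"w={w:.2f}  total N={min_total_n(w)}")
```

Under these assumptions, prevalences of 30% and 60% correspond to w ≈ 0.30, which is why the two tools land on nearly the same number. The unrounded requirement for w = 0.3 is about 87.2, so rounding up to whole participants gives 88.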
If you now want a graph in which the power varies instead of the effect size (keeping the effect size fixed), change the second line as follows: As a function of ‘Power (1 – β err prob)’ from 0.6 in steps of 0.01 through to 0.95. In the resulting graph you will see that for a power of 0.8 you need a sample size of 87, the same sample size that the previous graph shows at effect size 0.3.
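The same curve can be approximated without the program. This is a rough sketch, again assuming a chi-square test with one degree of freedom and using the normal approximation N ≈ ((z for 1 − α/2 + z for the power) / w)², which ignores the negligible far tail; `approx_total_n` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def approx_total_n(w, power, alpha=0.05):
    """Approximate (unrounded) total N for a 1-df chi-square test."""
    z = NormalDist()
    return ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / w) ** 2


# Effect size fixed at w = 0.3, power varied from 0.60 to 0.95
for pct in range(60, 96, 5):
    power = pct / 100
    n = approx_total_n(0.3, power)
    print(f"power={power:.2f}  N={n:.1f}  rounded up={ceil(n)}")
```

The required sample size rises steadily with the power, so small increases in the desired power near the top of the range cost noticeably more participants than the same increase near 0.6.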
Unfortunately, these programs cannot help you with more complex sampling designs, e.g. multi-stage sampling. In those cases it is wise to contact a statistician who specializes in this type of sample size calculation.