Today I stumbled upon a problem during a student presentation.
How many cows must be sampled per farm so that the probability of getting at least one diseased cow is at least 70% (the level acceptable to us, using the area under the ROC curve cutoff point), given that the disease prevalence is 25%?
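My solution to this problem uses the binomial distribution. The following is the formula for the distribution,
p(x) = C(n, x) * p^x * (1 - p)^(n - x)
x, number of successes (here, diseased cows in the sample)
n, sample size
p, probability of success (here, the disease prevalence)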
and to solve our problem
p(1 or more) = 1 - p(0)
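For x = 0 the binomial term collapses to a single power, which is why only p(0) is needed. Written out, with cows assumed to be independent draws (as the binomial model already requires), it looks roughly like this:

```latex
\[
p(0) = \binom{n}{0}\, p^{0} (1-p)^{n} = (1-p)^{n},
\qquad
p(\text{1 or more}) = 1 - (1-p)^{n} = 1 - 0.75^{n} \quad \text{for } p = 0.25 .
\]
```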
Using a spreadsheet, we can find the value of n iteratively (I prefer LibreOffice Calc). Just key in the following function (or enter the formula above directly)
=1 - BINOMDIST(x, n, p, 1)
(note that the number of successes x comes before the number of trials n) thus in our context
=1 - BINOMDIST(0, n, 0.25, 1)
after playing around with n, I found
for n = 4, p(1 or more) = 1 - 0.316 = 0.684
for n = 5, p(1 or more) = 1 - 0.237 = 0.763
I'd go for 5 cows...
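The same search can be done outside a spreadsheet. Below is a minimal Python sketch, assuming independent cows and using p(0) = (1 - p)^n; the function names `prob_at_least_one` and `min_sample_size` are my own, made up for illustration. It steps n upward until the 70% target is reached, then cross-checks the answer by solving 1 - (1 - p)^n >= 0.7 for n with logarithms.

```python
import math


def prob_at_least_one(n, prevalence):
    """P(at least one diseased cow among n), assuming independent cows."""
    return 1.0 - (1.0 - prevalence) ** n


def min_sample_size(target, prevalence):
    """Smallest n with P(at least one) >= target, found by stepping n upward."""
    n = 1
    while prob_at_least_one(n, prevalence) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Reproduce the two spreadsheet rows.
    for n in (4, 5):
        print(n, round(prob_at_least_one(n, 0.25), 3))  # 4 0.684, then 5 0.763

    # Iterative answer.
    print(min_sample_size(0.70, 0.25))  # 5

    # Cross-check: n >= ln(1 - target) / ln(1 - prevalence), rounded up.
    print(math.ceil(math.log(1 - 0.70) / math.log(1 - 0.25)))  # 5
```

The logarithm version gives about 4.19 before rounding up, so it agrees with the iterative result of 5 cows.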