A Puzzle June 25, 2010
Posted by genericme in Uncategorized.trackback
Consider a set of objects from which are drawn randomly at a time, with replacement. What is , the expected number of draws I have to make to have drawn all of the objects?
I have yet to find a satisfactory closed-form expression for even . I obtain an ugly series in terms of Stirling numbers of the second kind. However, I suspect that is asymptotically linear:
Is this a well-known problem?
For m=1 it is known as “the coupon collector’s problem”, and the expectation is . I haven’t heard about the problem for m>1 before, but I’m pretty sure that for fixed m, the expectation is still . I don’t known what it is in general.
In the case m=1, this is known as the “coupon collector problem”. In fact E(n,1) = n Hn, where Hn = 1 + 1/2 + … + 1/n is the nth harmonic number. Hn is approximately log n, so E(n,1) is approximately n log n.
I suspect E(n, m) is approximately (n log n)/m for large n and fixed m, but I haven’t seen this problem before.
Cool, thanks for the info. I was taking an unnecessarily difficult approach to solving this problem.
found an approximate bound using wald’s equation. http://en.wikipedia.org/wiki/Wald's_equation.
imagine the experiment with m=1.its known that
E(n,1)= n H(n),
where H(n)=1+1/2+1/3+….1/n
now let x1= number of draws to pick m distinct objects. After x1 draws, let x2=number of draws to pick m distinct objects, and so on till u see see all the n objects.
then E[X1]=E[X2]=…=n[1/(n)+1/(n-1)+…1/(n-m+1)]=n/[H(n)-H(n-m)]
If S is total number of draws
i.e. S=X1+X2+…….,then as per wald’s equation,
E[S]=E[X1]*E(n,m).
Now, E[S]-m <E(n,1) E(n,m) >= H(n)/[H(n)-H(n-m)]
where H(r)=1/r+ 1/(r-1)+..1/2+1
What happens in coupon collection is that if you want to collect say 2 of each coupon, the extra samples required after collecting (at least) one of each is smaller [on average]. Thus $E(n,2) < 2 E(n,1)$ on average. In fact
$E(n, m) = \Theta( n[ \log n+ (m-1) \log \log n])$
which is not surprising. Focusing on E(n,1) vs E(n,2) only, while collecting all coupons a lot of coupons have already repeated.
See "Randomized Algorithms" by Raghavan and Motwani.
thank you for your way of solving, i took so long time to solve