User:Zepheus/random100

random drawing of numbers 1–100

Suppose you had the values 1 to 100, and you randomly organized the 100 numbers into 10 groups of 10. One group might contain the numbers 7, 13, 17, 38, 41, 52, 59, 71, 90 and 95, for example. Then, within each group, the highest number would be given a corresponding value of 10, the second highest a corresponding value of 9, and so on. Then the process is repeated, and any value earned is added to the running total. For example, the number 100 will always be given a value of 10 (because it is always the highest number out of 100), so after 3 random "draws," its corresponding value would be 30. Likewise, the number 1 would have a value of 3 after 3 "draws." My question is: how many random "draws" would it take so that all the numbers were in numerical order based on their values? Likewise, how many random "draws" would it take so that fewer than 10 numbers were not in correct placement when organized by values? I understand that this question is confusing (and it is quite hard to word), so if you have any questions as to what I mean, then please ask and I will update and clarify accordingly. Thank you in advance for all of your help. - Zepheus 02:55, 4 June 2006 (UTC)

Fun challenge, though not a practical way to sort. But you cannot use random draws and ask those precise questions. For example, although it is statistically unlikely, fifty random draws could be exactly the same. Or did you know that? Anyway, perhaps you should think about why you expect that the cumulative "10-ranking" scores will converge to a correct 100-ranking order. That will be essential to answering the first question. For simplicity, consider four numbers in groups of two. Write out all possible draws and consider how they combine. Notice that after one draw the values will be ⟨1,1,2,2⟩, with only two distinct quantities; and after two draws the cumulative values will include 1+1 and 2+2, and typically 1+2 as well, but nothing else. So it is impossible to sort properly with only one or two draws. More generally, after n draws the largest cumulative value will be n times the group size and the smallest cumulative value will be n. Randomness aside, the difference of these must be large enough to permit a distinct value for each number in the full set. Since 11×(10−1) = 99 = 100−1, clearly at least 11 draws are necessary for a full sort. Of course, this necessary condition may not be sufficient; nor does it address the likelihood of a correct sort. --KSmrq^T 06:02, 4 June 2006 (UTC)
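The counting argument above can be written as a small function. This is only a sketch of the necessary condition: the name `min_draws_lower_bound` is made up for illustration, and it assumes that a full sort requires every number to end up with a distinct cumulative value.

```python
def min_draws_lower_bound(n_items: int, group_size: int) -> int:
    """Smallest n for which n draws can give n_items distinct totals.

    After n draws every total lies between n and n * group_size, so there
    are at most n * (group_size - 1) + 1 distinct totals available.
    We need n * (group_size - 1) >= n_items - 1.
    """
    span_per_draw = group_size - 1
    # Integer ceiling division, to avoid floating point.
    return -(-(n_items - 1) // span_per_draw)


bound_100 = min_draws_lower_bound(100, 10)  # the original problem
bound_4 = min_draws_lower_bound(4, 2)       # four numbers in groups of two
```

For 100 numbers in groups of 10 this gives 11, and for four numbers in groups of two it gives 3, matching the observation that one or two draws cannot sort the small case.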

11 draws are sufficient:

 (1-10)           (11-20)    (21-30)    ... (91-100)
 (1,11,21,31,...) (2,12,...) (3,13,...) ... (10,20,...)
 ... (10 times) ...
 (1,11,21,31,...) (2,12,...) (3,13,...) ... (10,20,...)

It's fairly obvious that this schema assigns each number n the value n+10. EdC 14:23, 4 June 2006 (UTC)

Additionally, this (with permutations) is the only way to get a full sort in only 11 draws.
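The 11-draw schedule can be checked mechanically. The sketch below is my own illustration, not anything posted in the thread: it builds one draw of consecutive blocks and ten identical draws of the groups (1,11,...,91), (2,12,...,92), and so on, then adds up the ranks. The helper name `rank_in_groups` is hypothetical.

```python
def rank_in_groups(groups):
    """Map each number to its rank within its group (1 = lowest, 10 = highest)."""
    ranks = {}
    for group in groups:
        for rank, number in enumerate(sorted(group), start=1):
            ranks[number] = rank
    return ranks


# Draw 1: consecutive blocks (1-10), (11-20), ..., (91-100).
block_draw = [list(range(start, start + 10)) for start in range(1, 101, 10)]
# Draws 2-11: numbers sharing a last digit, (1,11,...,91), ..., (10,20,...,100).
stride_draw = [list(range(start, 101, 10)) for start in range(1, 11)]

totals = {n: 0 for n in range(1, 101)}
for draw in [block_draw] + [stride_draw] * 10:
    for number, rank in rank_in_groups(draw).items():
        totals[number] += rank

# True when every number n ends with cumulative value n + 10.
all_match = all(totals[n] == n + 10 for n in range(1, 101))
```

The block draw contributes the position of n within its decade and each of the ten repeated draws contributes the decade index plus one, which is where the total n+10 comes from.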

You're going to have to clarify how many random "draws" would it take so that all the numbers were in numerical order based on their values. As KSmrq pointed out, the numbers can stay unsorted through an indefinite number of draws. One possible question is how many draws are needed such that P(sorted | n draws) exceeds some value (say ½). I wouldn't think that P(sorted | n draws) has a nice form, though. — Preceding unsigned comment added by EdC (talk • contribs)

Thank you for all your help so far, and it's been a long time since I've had a math class so it's tricky to write in mathematical ways. I knew originally that every draw could be the same, but it is statistically improbable. Also, I figured that when the number of draws approached infinity, the number of errors (or numbers not in correct placement when sorted by value) would reach zero. I was just wondering how many draws would probably be sufficient. I think my question has pretty much been answered, unless EdC has more to say on the matter. - Zepheus 16:56, 4 June 2006 (UTC)

I'm currently running a simulation of the problem, and it seems that the average number of draws required is around 2045, if that is of any use. Keep in mind, though, that I have not double-checked my program's correctness, and it is very inefficient, so it could take a while until an accurate result is obtained. -- Meni Rosenfeld (talk) 17:28, 4 June 2006 (UTC)

I've tried such a simulation too, and I've got a similar result: an average of 2069 draws needed, from a sample of 160 tries. The same warnings apply as above. Btw, see birthday paradox for why you need so many draws. – b_jonas 18:35, 4 June 2006 (UTC)

Update^2: run on a faster SMP machine (from 13919 iterations) gives average 2058 draws, quartiles of number of draws are 1521, 1911, 2443. – b_jonas 19:19, 4 June 2006 (UTC)
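For anyone who wants to repeat the experiment, here is a minimal sketch of such a simulation. It is not the program either poster ran (neither was posted). The function names are made up for illustration, and it assumes "sorted" means the cumulative values are strictly increasing from 1 to 100, with ties counting as failure.

```python
import random


def one_draw(n_items, group_size, rng):
    """Return ranks[number] (1..group_size) for one random grouping; index 0 is unused."""
    numbers = list(range(1, n_items + 1))
    rng.shuffle(numbers)
    ranks = [0] * (n_items + 1)
    for start in range(0, n_items, group_size):
        group = sorted(numbers[start:start + group_size])
        for rank, number in enumerate(group, start=1):
            ranks[number] = rank
    return ranks


def draws_until_sorted(n_items=100, group_size=10, rng=None):
    """Count random draws until the cumulative values are strictly increasing."""
    rng = rng or random.Random()
    totals = [0] * (n_items + 1)
    draws = 0
    while True:
        draws += 1
        ranks = one_draw(n_items, group_size, rng)
        for number in range(1, n_items + 1):
            totals[number] += ranks[number]
        # Assumption: a tie between neighbours counts as "not yet sorted".
        if all(totals[k] < totals[k + 1] for k in range(1, n_items)):
            return draws
```

Calling `draws_until_sorted()` many times and averaging the results gives an estimate to compare with the figures quoted above. Passing a seeded `random.Random` makes a run repeatable.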

Some thoughts on the convergence of the order. Consider a random variable $X^{k}$, the rank of number k in a random draw. Denote $Y^{k}=X^{k+1}-X^{k}$. Denote by $A^{k}$ the event that the numbers k and k+1 are in different groups. Since 9 of the other 99 numbers share a group with k, its probability is $P(A^{k})=1-9/99=10/11$. Given $A^{k}$, $X^{k}$ has the same distribution as $X^{k+1}$, and $Y^{k}$ is symmetric with respect to 0 ($P(Y^{k}=x|A^{k})=P(Y^{k}=-x|A^{k})$ for x>0), thus $E(Y^{k}|A^{k})=0$. With probability 1/11 the numbers k and k+1 are in the same group ($A^{kc}$, complementary to $A^{k}$), in which case $X^{k+1}=X^{k}+1$ and $P(Y^{k}=1|A^{kc})=1$. That is,

$$P(X^{k+1}=x)=P(X^{k}=x|A^{k})P(A^{k})+P(X^{k}=x-1|A^{kc})P(A^{kc}).$$

Besides, we have

$$E(Y^{k})=E(Y^{k}|A^{k})P(A^{k})+E(Y^{k}|A^{kc})P(A^{kc})=1/11.$$

After realization of n draws, the numbers 1,...,100 have the correct order if

$${\bar {Y}}^{k}={\frac {1}{n}}\sum _{i=1}^{n}Y_{i}^{k}>0$$

for all k=1,...,99, where $Y_{i}^{k}$ is the value of $Y^{k}$ in the i-th draw.

It is known that ${\bar {Y}}^{k}\to 1/11$ in distribution (by the law of large numbers), so the convergence to the correct order should be expected. The questions arise whether the r.v. $Y^{k}$ are independent of each other, and for which smallest n the event $\inf _{k}{\bar {Y}}^{k}>0$ occurs the first time, and what is the distribution of such n. (Igny 22:46, 4 June 2006 (UTC))
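The same-group probability and the mean of $Y^{k}$ can be checked exactly on a small instance by listing every arrangement. This is a brute-force sketch with a made-up function name. It only works for tiny cases such as 6 numbers in groups of 3, where counting suggests a same-group probability of (g−1)/(N−1), which would be 9/99 = 1/11 for 100 numbers in groups of 10.

```python
from fractions import Fraction
from itertools import permutations


def exact_gap_stats(n_items, group_size, k):
    """Enumerate all arrangements; return (P(k, k+1 share a group), E[Y^k])."""
    same = 0
    gap_sum = 0
    count = 0
    for perm in permutations(range(1, n_items + 1)):
        rank = {}
        group_of = {}
        # Consecutive slices of the permutation form the groups.
        for g, start in enumerate(range(0, n_items, group_size)):
            members = sorted(perm[start:start + group_size])
            for r, number in enumerate(members, start=1):
                rank[number] = r
                group_of[number] = g
        same += group_of[k] == group_of[k + 1]
        gap_sum += rank[k + 1] - rank[k]
        count += 1
    return Fraction(same, count), Fraction(gap_sum, count)


# 6 numbers in groups of 3, neighbours 2 and 3: 720 arrangements.
p_same, mean_gap = exact_gap_stats(6, 3, 2)
```

Here `p_same` and `mean_gap` both come out to 2/5 = (3−1)/(6−1), consistent with the mean of $Y^{k}$ being equal to the same-group probability.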

These mathematical functions are getting crazy. I wish I could decipher them. I'll definitely archive this page. One more question: the first answer I received was that 11 draws would be sufficient. The next answer was that roughly 2,050 draws would be needed. How are these related? Also, what is the rough estimate for the number of draws needed for fewer than, say, 10 mistakes? - Zepheus 19:09, 5 June 2006 (UTC)

No, the first answer was that 11 draws are necessary. That is, with fewer than 11 draws you have zero chance of getting it right. No finite number of draws is sufficient (in the sense of having a probability of 1 of winning). Roughly 2047 (obtained after over 55,000 experiments) is the average number (expectation) of draws until a success is obtained. Assuming the distribution is roughly symmetric, this also means that 2047 draws will give you a 50% chance of success. -- Meni Rosenfeld (talk) 19:27, 5 June 2006 (UTC)

Well, according to the numbers I gave above, you have about 50% success after fewer draws than that: about 1911 draws. – b_jonas 20:47, 5 June 2006 (UTC)

Okay. I understand now. Thanks for the update, and all of your hard work. - Zepheus 21:16, 5 June 2006 (UTC)

Above is a graph of the cdf of the distribution of the number of draws needed I've made from the output of my simulation. – b_jonas 21:26, 5 June 2006 (UTC)

This graph is awesome, thanks. - Zepheus 17:29, 6 June 2006 (UTC)