|
PowerBASIC Forums
Cafe PowerBASIC
Dicing With Probability (Page 7)
|
This topic is 10 pages long: 1 2 3 4 5 6 7 8 9 10 |
| Author | Topic: Dicing With Probability |
|
Charles Pegge Member |
Emil, I think I managed to avoid both the pitfalls you mentioned by taking the frequencies, not the proportions, and by omitting the Chi-squared terms for the tail classes where the expected frequency falls below 1. Maybe that cutoff ought to be 2. ------------------ IP: Logged |
|
David Roberts Member |
It never ceases to amaze me how we manage not to see the wood for the trees. quote: That is true. quote: That is false. There is nothing remarkable about it at all. What would have been remarkable is if it had not been the case, as this would have cast doubt on RND being uniformly distributed.
If an event is expected to occur only 1% of the time, assuming some null hypothesis is true, and that event occurs on the first and only test, then we can rightly question the truth of the null hypothesis. However, if we ran the test 100 times then we should not be surprised to see one of the tests turn out like our single test. I looked at 1000 RNDs 10000 times. I then saw "...there is an x% probability that it will produce results which only occur x% of the time." Well, it would, wouldn't it?
Oh, dear. The rule of thumb was a rough rounding. I should be comparing the actual distribution with the expected one, much along the lines that Charles has been following with the binomial and Chi-squared. Since we are looking at large samples, the Central Limit Theorem allows us to use the normal distribution, which is why I used it - pity about my interpretation. IP: Logged |
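The arithmetic behind this point can be checked in a few lines. This is only a sketch, assuming the tests are independent; the function names are made up for illustration and are not from any post in this thread.

```python
def prob_at_least_one(p, n):
    """Chance that an event with per-test probability p shows up
    at least once in n independent tests."""
    return 1.0 - (1.0 - p) ** n


def expected_count(p, n):
    """Expected number of tests (out of n) in which the event occurs."""
    return p * n


if __name__ == "__main__":
    # One test only: a 1% event really is surprising.
    print(prob_at_least_one(0.01, 1))
    # 100 tests: seeing the 1% event at least once is more likely than not
    # (roughly 0.63).
    print(prob_at_least_one(0.01, 100))
    # 10000 batches of 1000 RNDs: about 100 batches should land in the 1% region.
    print(expected_count(0.01, 10000))
```

So with 10000 batches, finding about x% of them in a region that has probability x% is exactly what a uniform RND should give.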
|
Emil Menzel Member |
Charles: A relevant quote from R.R. Sokal & F.J. Rohlf, Biometry, 1969, p. 569, on using chi-square to test goodness of fit of empirical data to a binomial distribution: "Since expected frequencies < 5 should be avoided, we lump the classes ["bins"] at both tails with the adjacent classes of adequate size. Corresponding classes of observed frequencies should be lumped to match."
This is the magic number I remember hearing most often, but don't ask
Some texts might advise dropping rather than pooling classes. But in general statistics is best viewed as a supplement rather
------------------ IP: Logged |
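The lumping rule in the Sokal & Rohlf quote might be sketched as follows. This is a rough illustration, not the code anyone in this thread ran: the helper names, the 30-trial binomial and the seed are all assumptions chosen for the example, and it assumes a single-peaked expected distribution so that only the two end classes ever need pooling.

```python
import math
import random


def binomial_expected(n_trials, p, n_samples):
    """Expected frequency of each count k = 0..n_trials over n_samples draws."""
    return [n_samples * math.comb(n_trials, k) * p**k * (1 - p)**(n_trials - k)
            for k in range(n_trials + 1)]


def pool_tails(observed, expected, min_expected=5.0):
    """Lump tail classes into their neighbours until both end classes
    have an expected frequency of at least min_expected."""
    obs, exp = list(observed), list(expected)
    # Lower tail: merge the first class into the second, repeatedly.
    while len(exp) > 1 and exp[0] < min_expected:
        exp[1] += exp[0]
        obs[1] += obs[0]
        del exp[0], obs[0]
    # Upper tail: merge the last class into the one before it, repeatedly.
    while len(exp) > 1 and exp[-1] < min_expected:
        exp[-2] += exp[-1]
        obs[-2] += obs[-1]
        del exp[-1], obs[-1]
    return obs, exp


def chi_squared(observed, expected):
    """Pearson chi-squared statistic computed from frequencies (not proportions)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    rng = random.Random(12345)  # arbitrary seed, for repeatability only
    n_trials, p, n_samples = 30, 0.5, 1000
    observed = [0] * (n_trials + 1)
    for _ in range(n_samples):
        k = sum(rng.random() < p for _ in range(n_trials))
        observed[k] += 1
    expected = binomial_expected(n_trials, p, n_samples)
    obs, exp = pool_tails(observed, expected)
    # p is given rather than estimated, so degrees of freedom = classes - 1.
    print(len(exp), "classes after pooling,", len(exp) - 1, "degrees of freedom")
    print("chi-squared =", round(chi_squared(obs, exp), 3))
```

Pooling keeps every observation in the test, whereas dropping the tails discards them; changing `min_expected` to 1 or 2 gives the looser cutoffs discussed earlier.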
|
Charles Pegge Member |
Some curious results: In this test the tails have dropped out themselves because
Each seed seems to have its own Chi-Squared profile. It may be better to pick your seeds
This is one example. Whereas seed=1 produces a large #12 value
Results for Randomize(2) #9 deviates from the norm. (1% significance = 50 @ 30 degrees of freedom)
#0 ChiSq = 0. total chiSq=445.398224885069 at 780 degrees of freedom
------------------ IP: Logged |
|
James Graham-Eagle Member |
Charles, The algorithm used to generate the values of RND is probably the linear congruential method ... ------------------ IP: Logged |
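For readers who have not met the method, a linear congruential generator can be sketched in a few lines. The constants below are the well-known ones from Numerical Recipes, used here purely as an assumption for illustration; the thread does not state which multiplier and increment PowerBASIC's RND actually uses, and the class name is made up.

```python
class LCG:
    """Minimal linear congruential generator: x_next = (a * x + c) mod m.

    a, c and m are illustrative constants (Numerical Recipes),
    NOT PowerBASIC's actual RND parameters.
    """

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next_int(self):
        """Advance one step and return the new integer state."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def rnd(self):
        """Return a float in [0, 1), loosely mimicking a BASIC-style RND."""
        return self.next_int() / self.m


if __name__ == "__main__":
    gen = LCG(seed=1)
    print([round(gen.rnd(), 6) for _ in range(5)])
```

The whole state is one integer below m, so the sequence must repeat after at most m steps.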
|
Charles Pegge Member |
Thanks James, good article, and what a simple method! By the way, in my tests above, there is only one seeding ------------------ IP: Logged |
|
Emil Menzel Member |
A reference I gave earlier http://www.powerbasic.com/support/forums/Forum7/HTML/002589.html also implements and might help to explain linear congruential random generators.
------------------ [This message has been edited by Emil Menzel (edited February 03, 2007).] IP: Logged |
|
David Roberts Member |
Thanks, Emil. What caught my eye was quote: I was using RANDOMIZE for each set of 1000 RNDs, 10000 sets in all. Since I was calling RND 'only' 10 million times, which is much less than 2^32 (over 4000 million), I removed RANDOMIZE except for its initial use. The results 'tightened' up. Old habits die hard - a lot of what we do isn't needed nowadays and harks back to the old 8 bit days when we had no speed, RAM and much besides. IP: Logged |
|
Emil Menzel Member |
>>a lot of what we do isn't needed nowadays and harks back to the old 8 bit days when we had no speed, RAM and much besides.
Very true of statistics in general. The statistical methods & tabled
But that reminds me: In Ann Arbor, Michigan in the early 1950's there
------------------ IP: Logged |
|
John Gleason Member |
This may help explain some of the results you are seeing, or answer questions about the PowerBASIC pseudo-random number generator.
1) The PowerBASIC PRNG produces 2^32 or 4GB of random "values."
2) There is one sequence only of 4GB of random values. Various seed
3) Sequences from a given starting point can overlap other sequences
4) From any seed, after 4GB of values, you will repeat those values
5) It follows that if you exceed 4GB of random values, simply
[This message has been edited by John Gleason (edited February 04, 2007).] IP: Logged |
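Points 1 to 4 can be seen in miniature with a toy generator. The sketch below assumes a full-period linear congruential generator with a modulus of 16 instead of 2^32; the constants a=5 and c=3 and the function names are chosen only for illustration and say nothing about PowerBASIC's internals.

```python
def lcg_cycle(seed, a=5, c=3, m=16):
    """Return the states visited from `seed` until the seed state recurs.

    Assumes the generator has a single full-length cycle (true for the
    default toy constants), so the loop always terminates.
    """
    start = seed % m
    x = start
    out = []
    while True:
        x = (a * x + c) % m
        out.append(x)
        if x == start:
            return out


def is_rotation(base, other):
    """True if `other` is the same cycle as `base`, entered at a different point."""
    if len(base) != len(other):
        return False
    i = base.index(other[0])
    return base[i:] + base[:i] == other


if __name__ == "__main__":
    base = lcg_cycle(0)
    print("from seed 0:", base)   # one cycle containing all 16 states
    for seed in (7, 12):
        seq = lcg_cycle(seed)
        print("from seed", seed, ":", seq, "rotation of base:", is_rotation(base, seq))
```

Every seed only picks a starting point in the one cycle, so runs started from different seeds overlap sooner or later, and any run longer than the modulus repeats itself.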
|
Emil Menzel Member |
John: Thank you for the list. Do you have a reference for it? A feasible procedure for avoiding potential problems (assuming that
Charles: By the same token, all possible values of TIMER could also be enumerated. That reminds me of a simple program I wrote a couple of years ago
P.S. I hope that you are not advocating picking the right seed numbers
------------------ IP: Logged |
|
Charles Pegge Member |
Emil, I must warn you that at this very moment, I am deeply embroiled in devising a method for picking the best seeds. It is based on the Chi graph above. To qualify as a good seed the Chi2 of any category must not exceed 35 with 30 degrees of freedom. I am going to leave the test to run overnight. So far it has come up with 1122 1135 1220 1309 1412 where the seed is passed to the Randomize function as a double precision variable, if that makes any difference. Blum Blum Shub is worth checking out. I found it a bit more ------------------ IP: Logged |
|
John Gleason Member |
Emil, The data above has been gleaned from many (dozens of) runs of the PB generator thru its entire cycle, then comparing the binary raw extended precision ten-byte sequences that result. I'd be happy to post the code showing examples of each point if you'd like, but as you have noted, it is certainly preferable to circumvent any possible problems by using another proven generator. I can post a quick and proven one--which I recently discovered can even be done without using assembler--and only takes perhaps ten lines of code. I agree too that the idea of fitting the seed to the data... well, IP: Logged |
|
Charles Pegge Member |
John, If you are willing to post your code, I would like to try it with my test to see what a higher quality pseudo-random sequence looks like.
I tested about 3000 seeds with it last night and found only 47 which passed
One of the good things about the Chi-Squared is that it will tell you if your
Here is a typical result: Matching with expected binomial distribution:
------------------ IP: Logged |
|
John Gleason Member |
Charles, no problemo, I'll post it as soon as possible. IP: Logged |
All times are Eastern Time (US)
|
Copyright © 1999-2006 PowerBASIC, Inc. All Rights Reserved.