|
Author
|
Topic: Dicing With Probability
|
Charles Pegge Member
|
posted January 23, 2007 02:58 PM
Attempting the calculation: 1000! broke my calculator and my PC. Is there another more devious method?------------------ www.pegge.net IP: Logged |
David Roberts Member
|
posted January 23, 2007 03:19 PM
Yes.We have to construct something like n!/r! = n(n-1)(n-2)...(r+1) to avoid n! I knocked something out for Donald way back when we were discussing the hypergeometric probability density function. I'll see if I can find it. In the meantime play around with the construct. Added: Found it in Poffs! FUNCTION combo(n AS LONG, k AS LONG) AS EXTENDED LOCAL plays AS LONG, value AS EXTENDED value=1 FOR plays=1 TO k value=value*(n-plays+1)/plays NEXT FUNCTION=value END FUNCTION
to be interpreted as nCk For N=1000 we will get 50:50 about 2½% of the time ie uncommon, and for N=10000 about 0.8% ie unlikely. [This message has been edited by David Roberts (edited January 23, 2007).] IP: Logged |
Charles Pegge Member
|
posted January 23, 2007 04:58 PM
Using logs of factorials to avoid trashing the pc with large factorials.this is the logged equivalent of David's original method exp(lfactorial(r)-(lfactorial(r/2)*2)-(log(2)*r)) results
trials probability of getting exact 50:50 ----------------------------------------- 2 0.5 4 0.375 8 0.2734375 16 0.196380615234374 32 0.13994993409142 64 0.099346753747966 128 7.03860921700243E-2 256 4.98191099361653E-2 512 3.52446354858556E-2 1024 2.49278058929265E-2 2048 1.76287724045102E-2 4096 1.24661853640193E-2 8192 8.81519322125798E-3 16384 6.23337801522039E-3
As you can see, the reduction in probability diminishes even when the trials double
------------------ www.pegge.net IP: Logged |
Emil Menzel Member
|
posted January 23, 2007 06:34 PM
Great going, guys! I haven't checked your calculations but that's just because I am lazy. By the way, there is also a good approximation to factorial that is useful for large numbers. http://mathworld.wolfram.com/StirlingsApproximation.html In a way it is paradoxical that, ordinarily, the larger your N, the closer you'll get to the theoretical "expected" value, just as the Law of Larger Numbers says; yet, at the same time, the larger your N, the less likely it is that you will get that value, exactly. The paradox is probably not very "deep", mathematically, but I don't believe it has anything to do with quantum mechanics. As for the time-series trend I referred to in my last post, I am still puzzled. With blocks of (say) 1 million simulated coin tosses in PB programs, I have very frequently gotten statistically significant trends; indeed sometimes I have gotten Pearson correlations of .7 and higher between p(heads) & "N thus far", for which the probability is exceedingly small. Sometimes the trend goes up; another time it goes down. Pick the right seed number &/or the right start point and end point in the sequence and you might break the bank at Monte Carlo. I do not know anything in "pure" math that would explain such results, and I have not yet spotted any obvious bugs in the program. ------------------
[This message has been edited by Emil Menzel (edited January 23, 2007).] IP: Logged |
Charles Pegge Member
|
posted January 23, 2007 06:59 PM
Thanks Emil for the Stirling approximation link. I see it contains both pi and e, and looks suspiciously similar to the normal distribution formula used with standard deviations.Wish I knew more squirly Greek for the calculus. Much of this stuff was developed over 250 years ago. PS Some motherboards have a random number generator chip, about as close as you can get to true random. I think they are used in encryption. ------------------ www.pegge.net [This message has been edited by Charles Pegge (edited January 23, 2007).] IP: Logged |
James Graham-Eagle Member
|
posted January 23, 2007 07:23 PM
A good approximation for large n for the probability of a 50:50 split is sqrt(2/(pi*n)) where n is the number of tosses. Of course this is true for even n only - the probability is zero if n is odd.------------------
IP: Logged |
Marco Pontello Member
|
posted January 23, 2007 08:01 PM
quote: Originally posted by Charles Pegge: Some motherboards have a random number generator chip, about as close as you can get to true random. I think they are used in encryption.
Most VIA's CPU integrate a well regarded hardware RNG. They started doing that some years ago with the C3, if I remember correctly.Bye! ------------------ Try Online TrID file identifier! Recognize over 2.300 filetypes and counting... ;) Give a powerfull scriptable Lua interface to your application with PowerBLua Leakout - Firewall outbound control tester - BitmapRip Extract/recover bitmaps from any file! IP: Logged |
Emil Menzel Member
|
posted January 23, 2007 08:06 PM
There's that Pi again -- both in the approximation to N factorial and in the formula for the normal distribution (the graph for which is, of course, an analog approximation to the binomial). For that matter, many if not most formulas for random numbers involve pi, I believe. Pi is a most tenacious wench. Don't listen to her siren calls, I warn you....But if you must, then read further. For example, here's a quote from an article about the early history of Pi: http://www.recoveredscience.com/const124Pivalues.htm The value of pi = 3 appears on several tablets which also give the circumference of the circle as six times the radius. This derives without doubt from equating the circumference around a circle of radius r with the circumference of a hexagon having a side length of r which explains also the division of the circle into 360 degrees. In addition, there is at least one tablet with values for various circle arcs, chords, secants, etc., but it is hard to interpret.”[3] The earliest improvement over the value implied in the Rhind Papyrus is currently credited to the Greek Archimedes of Syracuse who lived from about 287 to 212 BCE. He fenced in the range of pi between 3 10/71 and 3 1/7, or 0.024% below and 0.040% above the correct result. The method Archimedes used to achieve this result is simple. He reasoned that the circumference of a regular convex polygon inscribed into a circle is smaller than that of the surrounding circle, and that the sides of any such polygon drawn around the circle must add up to a greater circumference. He could also see that polygons with more sides hugged the circle closer than those with fewer, and that their circumferences came therefore closer to that of the circle between them. Archimedes started with hexagons in and around his circle. Hexagons are easy to dissect into triangles for computing the lengths of their sides because these sides are all equal, and they yield the crude circumference- to- diameter ratio of 3 we saw above on the Babylonian tablets. Then he doubled the number of polygon sides and calculated the dimensions of the new triangles which now had half the previous center angles, and he repeated this process until he arrived at the polygon pair with 96 sides which gave the above result [4]. This procedure is called the “method of exhaustion” because it exhausts the difference between the circumferences of the polygons and that of the circle with its ever closer approximations. The only mathematical tools required for it are the so-called “Theorem of Pythagoras”, about the sum of the squares over the sides of a right- angled triangle, and the ability to extract square roots. Both these tools were available in the ancient Near East long before Pythagoras. Already the Old Babylonians of king Hammurabi’s time (1792 to 1750 BCE)[5] were familiar with the contents of that theorem. They may not have explicitly formulated it in modern terms or spelled out its derivation on any of the few surviving tablets, but they used it routinely ... [end of quote] The method of exhaustion sounds to me like the precursor of calculus. The smooth arc of a circle is to the polygons as the smooth curve of the normal probability curve is to the bar graph integer values of the binomial expansion. Re the randomness of "random numbers" generators: Call me a skeptic. But this issue has been discussed at length on other threads in PB forums. -----------------
[This message has been edited by Emil Menzel (edited January 23, 2007).] IP: Logged |
Ian Webling Member
|
posted January 24, 2007 04:58 AM
And then there is Indiana, where Pi (by law) is 4.------------------
IP: Logged |
Charles Pegge Member
|
posted January 24, 2007 06:41 AM
quote:
A good approximation for large n for the probability of a 50:50 split is sqrt(2/(pi*n)) ...
Yes James, this compares very well with my figure for n=16384 flips to about 5 decimal places, and its a lot easier to calculate! Thank you.At 128 flips there is an overestimate of about 0.3%.
------------------ www.pegge.net IP: Logged |
Emil Menzel Member
|
posted January 24, 2007 10:33 AM
Where did James' approximation come from, I wonder. Is it the simplest possible formula or can it be simplified further? I found (using James' formula for p) that for all even numbers between 2 and 100 the relationship between log(N) and log(p) is a simple straight line with r-square = 1.0000. But my program choked on sqrt(2/(pi*1000)) so generalize beyond 100 at your own risk. I did not get around to plotting a graph of Charles' exact data, but it would make sense to try. Somehow I am reminded here of an old program I had that would test a seemingly countless number of transformations to see which one could account for the most variance in one's empirical data -- so you could take your pick. Students loved it, statisticians issued warnings about it, and mathematicians advised me to burn it. ------------------
IP: Logged |
David Roberts Member
|
posted January 24, 2007 02:23 PM
quote: The paradox is probably not very "deep", mathematically, but I don't believe it has anything to do with quantum mechanics.
We don't have a paradox.  When we look at 1/2 + 1/4 +1/8 + 1/16 ......, which is the glass of beer example I gave earlier, the limiting summation is 1. We use phrases such as 'as n tends to infinity the summation tends to' or 'at the point infinity the summation is'. I likened our problem to this one. That was wrong. We do not have a series. We have a simple equation ie N!/((N/2)!^2/(2^N)). As N increases the population increases at a faster rate than the number of N/2-combinations. At the point infinity our equation vanishes, but then it would. Any number compared with infinity is going to be infinitesimal. The numerator doesn't vanish. In fact, it also tends to infinity. To consider here what happens at the point infinity is as illogical as concluding that 1 = 5 because 1*infinity = 5*infinity. To see the determinate behaviour of tossing an unbiased coin we need look no further than the cumulative binomial distribution which in our case is to consider what is the probability of getting 50% or more heads. 7 or more from 10 is not unusual at 17.2%. 70 or more from 100 is unusual at 0.004% even though we have the same ratio ie 0.7. 55 or more from 100 gives 18.4% which is close to 17.2%. So, for the same probability, as N increases a number or more from N reduces otherwise we may have evidence of bias. As N increases we should then see the probability of getting 50% or more heads tending to 0.5.Here is a table similar to Charles' for exactly 50:50 but now considering getting 50% or more heads 2^1 2 0.75 2^6 64 0.549673 2^10 1024 0.512464 2^14 16384 0.503117 2^20 1048576 0.500390 . . . infinity 0.5
IP: Logged |
Donald Darden Member
|
posted January 24, 2007 03:51 PM
Since I went to school before they invented calculators, we had to work out math problems on paper. They taught us to use 22/7 as an approximation for Pi, and errors acceptable to one ot two decimal places. As an engineer, my father used a slide rule, and in his profession, the allowed error range was +/- 20% in many areas. For precise work, they strove for +/- 10%, and when cost was not a factor, the goal might be +/- 5%, or +/- 1 percent. Finding resistors and capacitors that would hold their values at that narrow a range was difficult.Years later, I stumbled on the fact that 355/113 is a much better approximation of Pi, but now we have computers and calculators that are even more accurante, and the automation and precision used in manufacturing routinely give us much closer tolerances in the products that we produce. My dad once discussed a common belief among engineers that that the probability of failure in individual circuits were accumulative, and that if you joined enough circuits together, you would reach a point where you would have a 100% rate of failure. Before the development of computers, there were few devices that required a lot of circuits, but in order to make computers, all that changed. When computers were built using vacuum tubes, they were very slow and buggy, and a lot of effort was required to isolate and repair problems. The advent of the transistor made computers practical for the first time, and the use of binary arithmetic and boolean logic became subjects that engineers had to understand. For at least a decade, there was resistance to binary arithmetic, as people struggled to try and hold onto the decimal system at all cost. Once that proved impractical, there was a suddent interest in teaching New Math throughout the school systems. With the advent of other components being emulated in silicon, you finally had the development of the integrated circuit, and that lead first to the minicomputer, then the microcomputer, and finally to our familiar PCs, laptops, tablets, beepers, cellphones, blackberries, iPods, and more. But what we have as a consequence is a generation of people who know how to use gadgets, but may have little comprehension of how those gadgets do what they do. A significant loss is the ability to deal with areas where those gadgets fail us. I've spent three days on the phone with customer support for TigerDirect trying to get my account with them straightened out, and each time I go back and check, the same errors in my account are still there to mock me. Another example is the fact that, not unexpectedly, someone takes it that the pseudo-random process used for the RND() process in PCs is generating valid random sequences. While the RND() is a reasonable approximation of a random sequence, the sequence followed is actually very precise from any given seed value. A truly random sequence has no associated seed value, and cannot be predicted. However, there is evidence that there is no true randomness in nature, that all chance as we know it is the certain outcome of factors throughout the universe that have become established since the moment of creation, what many call the Big Bang. As I understand it, Quantum Mechanics even holds that there are associations between some particles that exist independently of time and space. If you had the mind of God, you might be able to grasp all that, to understand it, and in that case, it might be evident to you that there is no randomness at all, that everything that happens is everything that can and will happen, without question. We only see it as true randomness because we cannot perceive it on that scale.
------------------ Old Navy Chief, Systems Engineer, Systems Analyst, now semi-retired IP: Logged |
Charles Pegge Member
|
posted January 24, 2007 04:52 PM
Does encrypted data appear as completely random if you dont have the key ?
------------------ www.pegge.net IP: Logged |
David Roberts Member
|
posted January 24, 2007 05:37 PM
We don't have the key to the Big Bang yet or should I call that the Big Seed or even the mother of all seeds?I forgot to mention, Donald, that a new rule came out recently that all members have not seen yet, mention God and you have to remove one item of clothing and stand outside for half an hour. If you live in the northern hemisphere that will not be funny. IP: Logged |