YesNoOk
avatar

The inefficiency of random number spreads via modulo division (Read 6964 times)

Started by Ricepigeon, March 12, 2014, 06:19:27 PM
Share this topic:
The inefficiency of random number spreads via modulo division
#1  March 12, 2014, 06:19:27 PM
  • *****
  • Gaps? Where we're going, we don't need gaps...
    • USA
    • ricepigeon.neocities.org
Writing this up out of boredom due to one of my own personal experiences in MUGEN coding...

Many times I've seen authors use modulo division to generate a spread of random numbers for whatever purpose (be it for AI triggers, random animations, or whatever). For those who don't know what I'm talking about, I'm referring to pieces of code like this:

trigger1 = random%2 = 0

To understand why the above code is inefficient, we have to take a deeper look into the results its producing.

For those who don't already know, in MUGEN, random generates a random integer from 0 to 999 (effectively producing 1 of 1000 different values each time random is called), each with an equal probability of being returned by random. With the % operator, such as in expression X%Y, we are dividing value X by value Y and returning only the remainder as a result. 17%10, for example, would return 7 as its value (17/10 = 1 R7). If we were to apply modulo division to random, for instance, we would be generating a random value from 0 to 999 and then dividing this number by an arbitrary value and returning its remainder as a result. By applying modulo division to random, we expect to see results that have an equal probability of being generated, as follows:

Random%10 generates:
0 (10% probability)
1 (10% probability)
2 (10% probability)
3 (10% probability)
4 (10% probability)
5 (10% probability)
6 (10% probability)
7 (10% probability)
8 (10% probability)
9 (10% probability)

As we can see above, we have a nice even spread of the 10 possible values that random%10 can generate. While this works great for numbers that are factors of 1000 (such as 2, 4, 5, 10, 20, 50, 100, etc), more arbitrary values won't be as neat and organized.

As an arbitrary example, instead of 10 values, we try for 666 (which should return an even spread of numbers ranging from 0 to 665). The results are listed below:

Random%666 generates:
Spoiler, click to toggle visibilty

The problem should be immediately clear: the values ranging from 0-333 each have a 0.20% probability, while values ranging from 334-665 each have a 0.10% probability. If this were an even spread, we should expect any value from 0-332 to be generated only 50% of the time, but instead we see that numbers in this range are generated 66.6% of the time! If we were relying on a number being generated from this range, random%666 would be returning these values TWICE as often as the rest of the spread.

The solution? Rather than using modulo division, we use the following:

floor(X*random/1000.0)

What this does is multiply arbitrary value X by random, then divide by 1000, then floor the result to produce an integer value. This results in a spread ranging from 0-X, but in a much more even fashion. Say we revise our above code to the following:

floor(666*random/1000.0) generates:
Spoiler, click to toggle visibilty

While this isnt foolproof, as we still have values that are twice as likely to be returned than the other values, you'll notice that the spread is much more even. Values ranging from 0-332 now have a 50% chance of being generated, as we would expect them to be, as opposed to the 66.6% probability we were generating before.
Re: The inefficiency of random number spreads via modulo division
#2  March 12, 2014, 08:29:01 PM
  • ****
  • it's me
  • Maybe something different
    • Chile
    • koakoa@jabber.org
    • Skype - koakumadevil69
    • cowabungamugenteam.weebly.com
Oh and let's not forget what we personally agreed on was a good idea:

floor(X*floor((479001600/999999999.0)*((random+(random*1000)+(random*1000000))))/(479001600.0))

479001600 is 12! Which will take care of most uneven spreads as it contains many, many, MANY factors that you couldn't really use with modulo, note this factorial is as big in MUGEN as we're going to get (13! is 6,227,020,800 > 2,147,483,647).
Last Edit: March 12, 2014, 08:37:07 PM by Flaffy
Re: The inefficiency of random number spreads via modulo division
#3  March 12, 2014, 08:31:42 PM
  • *****
  • Gaps? Where we're going, we don't need gaps...
    • USA
    • ricepigeon.neocities.org

That was mostly to eliminate the uneven spreads. Granted there may be a few uneven spreads still in there and may need some revision, but for all intents and purposes that isn't necessary for the above to work.
Re: The inefficiency of random number spreads via modulo division
#4  March 12, 2014, 08:33:05 PM
  • ****
  • it's me
  • Maybe something different
    • Chile
    • koakoa@jabber.org
    • Skype - koakumadevil69
    • cowabungamugenteam.weebly.com
That was mostly to eliminate the uneven spreads.

479001600 is 12! Which will take care of most uneven spreads

department of redundancy department.



So I'm gonna dwelve further onto how the expression works:

floor(X*floor((479001600/999999999.0)*((random+(random*1000)+(random*1000000))))/(479001600.0))

The bolded expression is basically taking advantage of MUGEN's randomizer to make a sort of a pseudo-random number generator (PRNG), which will generate numbers from 0 all the way to 999999999. So by dividing with the maximum number generable, that is, 999999999. You get a number in the interval [0,1)

Given the rounding will cause some factors to become more weighted we use 12! to divide, as it contains many problematic factors such as 3, 7, etc. That will attempt to at least nerf the uneveness and cause a much more natural spread that will have the feeling of a real random sequence.
Last Edit: March 12, 2014, 08:45:58 PM by Flaffy
Re: The inefficiency of random number spreads via modulo division
#5  March 13, 2014, 11:37:59 PM
  • ****
  • "OHHH YES!"
    • Canada
While that is the case, it's important to stick with the simplest means despite the lack of true "random". I.e. the solution provided in the first post is not the most accurate spread however I'd opt to stick with the smaller, more efficient, and more understandable method given the importance of not bogging MUGEN down with large amounts of number generating(up to 999999999), for the sake of performance. Just my two cents but too much math even hurts MUGEN's head.

But are there truly grown men in this world?!
Re: The inefficiency of random number spreads via modulo division
#6  March 13, 2014, 11:55:06 PM
  • ****
  • it's me
  • Maybe something different
    • Chile
    • koakoa@jabber.org
    • Skype - koakumadevil69
    • cowabungamugenteam.weebly.com
Well this thing could be most likely spacedly used. When true randomness is needed. It's always nice to be considered.
Re: The inefficiency of random number spreads via modulo division
#7  March 14, 2014, 12:03:39 AM
  • ****
  • "OHHH YES!"
    • Canada
Really my statement applies to very radical characters. For instance, an adaptation of Dampierre, if said creator was ambitious. 80% of his movelist has a randomized chance of changing properties and followups. I'm sure there are characters with extensive randomization beyond him, but the general idea remains. Of course if it's just for one thing you can try and use it.

But are there truly grown men in this world?!