How to get a random number in *[0, M)* by a random number generator
that randomly throws a number in *[0, N)* with equal probability ?

# \(Rand(M)\) by \(Rand(N)\)

In this post, \(Rand \; K\) or \(Rand(K)\) denotes a random number generator that produce a integer randomly in \([0, K)\) (or \([0, K - 1]\), from \(0\) to \(K-1\) inclusively) with equal probability.

We can simply divide this problem into two cases: \(M \leq N\) or \(M \gt N\).

## \(M \leq N\)

If \(M \leq N\), then the range \([0, M)\) is in the range \([0, N)\). Therefore, the probability of a number choosed by \(Rand(N)\) in \([0, M)\) is same. All of them are \(\frac{1}{N}\).

number | probability |
---|---|

\(0\) | \(\frac{1}{N}\) |

\(1\) | \(\frac{1}{N}\) |

\(2\) | \(\frac{1}{N}\) |

\(\vdots\) | \(\vdots\) |

\(M-1\) | \(\frac{1}{N}\) |

\(\vdots\) | \(\vdots\) |

\(N-1\) | \(\frac{1}{N}\) |

What we want is to develop a method that can randomly choose a number in \([0, M)\) with equal probability. We can leverage this fact.

The simplest way is to **re-produce** a number
if the number produced is out of the range we want (\([0, M)\)).
That is, if the number produced is in \([M, N)\),
we **re-generate** a number.

Does it work? That’s check the probability for each number. For each round, the probability we need to re-generate a number is \(\frac{N - M}{N}\). Hence, the probability to get a number \(i\) in the \(K\) round can be organized as the following table, where \(i \in [0, M)\) and \(K \in \mathbb{N}\).

round | probability |
---|---|

\(1\) | \(\frac{1}{N}\) |

\(2\) | \(\frac{N - M}{N} \cdot \frac{1}{N}\) |

\(3\) | \({(\frac{N - M}{N})}^{2} \cdot \frac{1}{N}\) |

\(\vdots\) | \(\vdots\) |

\(K\) | \({(\frac{N - M}{N})}^{K-1} \cdot \frac{1}{N}\) |

As a result, the probability to get the number \(i\) is

\[\sum_{k=1}^\infty {(\frac{N - M}{N})}^{k-1} \cdot \frac{1}{N} = \frac{1}{N} + \frac{N - M}{N} \cdot \frac{1}{N} + {(\frac{N - M}{N})}^{2} \cdot \frac{1}{N} + \ldots + {(\frac{N - M}{N})}^{K-1} \cdot \frac{1}{N}\]This is same for all number \(i\), where \(i \in [0, M)\).

number | probability |
---|---|

\(0\) | \(\sum_{k=1}^\infty {(\frac{N - M}{N})}^{k-1} \cdot \frac{1}{N}\) |

\(1\) | \(\sum_{k=1}^\infty {(\frac{N - M}{N})}^{k-1} \cdot \frac{1}{N}\) |

\(2\) | \(\sum_{k=1}^\infty {(\frac{N - M}{N})}^{k-1} \cdot \frac{1}{N}\) |

\(\vdots\) | \(\vdots\) |

\(M-1\) | \(\sum_{k=1}^\infty {(\frac{N - M}{N})}^{k-1} \cdot \frac{1}{N}\) |

By now, we alreay know how to randomly generate a number in a smaller range from a random number generator in a bigger range.

In brief, the algorithm is

- Get a random number \(x\) in \([0, N)\)
- If \(x\) is in \([0, M)\), then return \(x\)
- Otherwise, repeat from 1

The following *Rust* program is the method we developed above:

```
// Get a random number in [0, s) by a random number generator in [0, b),
// where both s, b are integers and s > 1, b > 1, s <= b
fn rand_small_from_big(s: u32, b: u32) -> u32 {
assert!(s > 1 && b > 1 && s <= b);
loop {
// Get a random number in [0, b)
let x = rand(b);
// Return the random number if it is in [0, s)
if x < s {
return x;
}
// Repeating if the random number is in [s, b)
}
}
```

See the live demo here.

Actually, we could make some improvement for this approach. By the above approach, it’s very likely to re-produce the random number again and again when \(N\) is much bigger than \(M\). The problem is, the number is only valid when it’s in \([0, M)\). When \(M \ll N\), the chance to get a valid random number is very small. For example, if \(N = 101, M = 2\), the probability to get a valid number is lower than \(2\%(2/101)\). The produced number is only valid when it’s \(0\) or \(1\). It’s inefficient.

A simple solution is to make the number valid in \([0, k \cdot M)\), where the \(k \in \mathbb{N}\). The \(k \cdot M\) is the maximal multiple of \(M\) in \([0, N)\). If the number is in \([0, k \cdot M)\), then we can take the remainder of the produced number divided by \(M\) as the produced random number.

number | probability | evaluated |
---|---|---|

\(0\) | \(\frac{1}{N}\) | \(0\) |

\(1\) | \(\frac{1}{N}\) | \(1\) |

\(2\) | \(\frac{1}{N}\) | \(2\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) |

\(M-1\) | \(\frac{1}{N}\) | \(M-1\) |

\(M\) | \(\frac{1}{N}\) | \(0\) |

\(M+1\) | \(\frac{1}{N}\) | \(1\) |

\(M+2\) | \(\frac{1}{N}\) | \(2\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) |

\(2M-1\) | \(\frac{1}{N}\) | \(M-1\) |

\(2M\) | \(\frac{1}{N}\) | \(0\) |

\(2M+1\) | \(\frac{1}{N}\) | \(1\) |

\(2M+2\) | \(\frac{1}{N}\) | \(2\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) |

\(kM-1\) | \(\frac{1}{N}\) | \(M-1\) |

\(kM\) | \(\frac{1}{N}\) | \(0\) |

\(kM+1\) | \(\frac{1}{N}\) | \(kM+1\) |

\(kM+2\) | \(\frac{1}{N}\) | \(kM+2\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) |

\(N-1\) | \(\frac{1}{N}\) | \(N-1\) |

evaluated | probability |
---|---|

\(0\) | \(\frac{k}{N}\) |

\(1\) | \(\frac{k}{N}\) |

\(2\) | \(\frac{k}{N}\) |

\(\vdots\) | \(\vdots\) |

\(M-1\) | \(\frac{k}{N}\) |

others | \(\frac{N - k \cdot M}{N}\) |

As a result, the probability to get the number \(i\) is

\[\sum_{j=1}^\infty {(\frac{N - k \cdot M}{N})}^{j-1} \cdot \frac{k}{N} = \frac{k}{N} + \frac{N - k \cdot M}{N} \cdot \frac{k}{N} + {(\frac{N - k \cdot M}{N})}^{2} \cdot \frac{k}{N} + \ldots + {(\frac{N - k \cdot M}{N})}^{j-1} \cdot \frac{k}{N}\]This is same for all number \(i\), where \(i \in [0, M)\).

number | probability |
---|---|

\(0\) | \(\sum_{j=1}^\infty {(\frac{N - k \cdot M}{N})}^{j-1} \cdot \frac{k}{N}\) |

\(1\) | \(\sum_{j=1}^\infty {(\frac{N - k \cdot M}{N})}^{j-1} \cdot \frac{k}{N}\) |

\(2\) | \(\sum_{j=1}^\infty {(\frac{N - k \cdot M}{N})}^{j-1} \cdot \frac{k}{N}\) |

\(\vdots\) | \(\vdots\) |

\(M-1\) | \(\sum_{j=1}^\infty {(\frac{N - k \cdot M}{N})}^{j-1} \cdot \frac{k}{N}\) |

The higher \(k\) is, the higher probability to get a valid number. In the above example, if \(N = 101, M = 2\), then \(k = 50\). The probability to get a valid number is \(100/101\) since the produced number is valid from \(0\) to \(99\). Every even number in \([0, 100)\) (\(0, 2, 4, \dots, 98\)) will be evaluated to \(0\) and every odd number in \([0, 100)\) (\(1, 3, 5, \dots, 99\)) will be evaluated to \(1\), since the remainders of the produced even numbers divided by \(M\) and odd numbers divided by \(M\) are \(0\) and \(1\) respectively. That is, the probability becomes \(k\) times.

To sum up, the algorithm is

- Define \(K\) to the max multiple of \(M\) in \(N\)
- Get a random number \(x\) in \([0, N)\)
- If \(x\) is in \([0, K)\), then return \(x % M\)
- Otherwise, repeat from 2

```
// Get a random number in [0, s) by a random number generator in [0, b),
// where both s, b are integers and s > 1, b > 1, s <= b
fn rand_small_from_big(s: u32, b: u32) -> u32 {
assert!(s > 1 && b > 1 && s <= b);
// Get the max multiple of s in [0, b)
let max = (b / s) * s;
assert_eq!(max % s, 0);
loop {
// Get a random number in [0, b)
let x = rand(b);
// Return the random number if it is in [0, max)
if x < max {
return x % s;
}
// Repeating if the random number is in [max, b)
}
}
```

See the live demo here.

## \(M \gt N\)

However, if \(M \gt N\), our method above doesn’t work,
since the range of \([0, M)\) is bigger than \([0, N)\) *(really? keep reading!)*.

To get a random number in \([0, M)\),
we need to **enlarge** the range of the numbers produced
by the given generator. How to do that ?

OK, the number of results from the generator is \(N\) now. How to enlarge the range of results ? Should we generate two numbers and do some magic math ?

How many results we have if we generate two numbers ?
If we see the results as **permutations**, it’s \(N^2\).

When \(N = 2\), the results are \((0, 0), (0, 1), (1, 0), (1, 1)\). The probability of each result is \(1/4\).

1st | 2nd | probability |
---|---|---|

0 | 0 | \(\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}\) |

0 | 1 | \(\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}\) |

1 | 0 | \(\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}\) |

1 | 1 | \(\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}\) |

If the results can be mapped from \((0, 0), (0, 1), (1, 0), (1, 1)\)
to \(0, 1, 2, 3\), it means we **have a way** to enlarge the range of results
from \([0, 1]\) to \([0, 1, 2, 3]\).

Well, is it possible to do that ? If we observe carefully, it’s natural to map \((0, 0), (0, 1), (1, 0), (1, 1)\) to \(0, 1, 2, 3\). If we see \(00, 01, 10, 11\) as binary numbers \({00}_{2}, {01}_{2}, {10}_{2}, {11}_{2}\), then they are naturally \(0, 1, 2, 3\).

The same approach also works when we produce two numbers from random number generator whose range is \([0, 3)\). There are \(3 \cdot 3 = 9\) results: \((0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)\).

1st | 2nd | probability |
---|---|---|

0 | 0 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

0 | 1 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

0 | 2 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

1 | 0 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

1 | 1 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

1 | 2 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

2 | 0 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

2 | 1 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

2 | 2 | \(\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}\) |

If we see them as the **base-\(3\)** numbers, they are naturally \(0, 1, 2, 3, 4, 5, 6, 7, 8\).

By applying this generating-two-numbers approach for a random number generator with range in \([0, N)\),
the sequential results \((i, j)\), where \(i, j \in [0, N)\),
can be mapped to \(0, 1, 2, \ldots, N^2 - 1\) (range in \([0, N^2)\))
by treating them as **base-\(N\)** numbers \({ij}_{N}\).

In general, for the random number generator with range in \([0, N)\),
there are \(N^k\) sequential results for producing \(k\) numbers.
The sequential results are \((x_0, x_1, \ldots, x_{k-1})\), where \(x_i \in [0, N)\) and \(i \in [0, k)\).
By taking the sequential results in **base-\(N\)** : \({(x_0 x_1 \ldots x_{k-1})}_{N}\),
their valus are naturally \(0, 1, 2, \ldots, N^k\).
It is how the **base-\(N\)** number works.

1st | 2nd | \(\ldots\) | kth | probability |
---|---|---|---|---|

0 | 0 | \(\ldots\) | 0 | \(\frac{1}{N^k}\) |

0 | 0 | \(\ldots\) | 1 | \(\frac{1}{N^k}\) |

0 | 0 | \(\ldots\) | 2 | \(\frac{1}{N^k}\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |

0 | 0 | \(\ldots\) | N-1 | \(\frac{1}{N^k}\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |

N-1 | N-1 | \(\ldots\) | 0 | \(\frac{1}{N^k}\) |

N-1 | N-1 | \(\ldots\) | 1 | \(\frac{1}{N^k}\) |

N-1 | N-1 | \(\ldots\) | 2 | \(\frac{1}{N^k}\) |

\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |

N-1 | N-1 | \(\ldots\) | N-1 | \(\frac{1}{N^k}\) |

When producing \(k\) numbers by the random number generator with range in \([0, N)\),
the sequential results \(x_0, x_1, \ldots, x_{k-1}\) can be mapped to a number in \([0, N^k)\)
by encoding them as a **base-\(N\)** number \({(x_0 x_1 \ldots x_{k-1})}_{N}\).

The idea can be implemented as the following program:

```
// Map the result of (x_0, x_1, ..., x_{k-1})
// to (x_0 x_1 ... x_{k-1}) in base-N number
let mut x = 0;
for _ in 0..k {
// Repeat k times
x = x * N + rand(N);
}
```

There are \(N^k\) kinds of sequential results for producing \(k\) numbers. The probability for each one is \(\frac{1}{N^k}\). By this approach, we have a way to get one result in the \(N^k\) kinds of sequential results. That is, we have a way to generate a random number in \([0, N^k)\) by the random number generator in \([0, N)\).

The original problem is to find a way to generate a random number in \([0, M)\) by the random number generator in \([0, N)\), where \(M \gt N\). Since we alreay know how to get a random number in a smaller range by a random number generator in a bigger range (the method developped in the \(M \leq N\) case), if we can find a \(k\) such that \(N^k \geq M\), then the problem can be solved!

The \(k\) can be calculated by *logarithm*:

The corresponding program is:

```
// Return minimal k such that N^k >= M
fn min_pow(N: u32, M: u32) -> u32 {
let M_f64 = f64::from(M);
let N_f64 = f64::from(N);
M_f64.log(N_f64).ceil()) as u32
}
```

To sum up, if \(M \gt N\), then

- Find a \(k\) such that \(N^k \geq M\)
- Define \(Y\) to the max multiple of \(M\) in \(N^k\)
- Get the random number generator in \([0, N^k)\)
- Generate a random number in \([0, M)\):
- Get a random number \(x\) in \([0, N^k)\)
- If \(x\) is in \([0, Y)\), then return \(x % M\)
- Otherwise, repeat from 3-1

```
// Get a random number in [0, b) by a random number generator in [0, s),
// where both s, b are integers and s > 1, b > 1, s <= b
fn rand_big_from_small(b: u32, s: u32) -> u32 {
assert!(s > 1 && b > 1 && s <= b);
// Get a minimal number p such that s^p >= b
let p = min_pow(s, b);
assert!(s.pow(p) >= b);
// Get the max multiple of b in [0, s^p)
let max = (s.pow(p) / b) * b;
assert_eq!(max % b, 0);
loop {
// Get a random number in [0, s^p)
let r = rand_pow(s, p);
// Return the random number if it is in [0, max)
if r < max {
return r % b;
}
// Repeating if the random number is in [max, s^p)
}
}
// Get a random from [0, x^y)
fn rand_pow(x: u32, y: u32) -> u32 {
assert!(x > 1 && y > 0);
let mut r = 0;
for _ in 0..y {
// Repeat y times
r = r * x + rand(x);
}
r
}
// Return minimal k such that x^k >= y
fn min_pow(x: u32, y: u32) -> u32 {
(f64::from(y).log(f64::from(x)).ceil()) as u32
}
```

See the live demo here.

### Enlarge from \([0, 1, \ldots, N-1]\) to \([0, 1, \ldots, N^k-1]\)

An interesting view to look the line \(x * N + rand(N)\) is to think \(x\) is a \(l\) numbers sequence from \(a\) to \(a + l - 1\), \([a, a + 1, a + 2, \ldots, a + l - 1]\) and \(rand(N)\) is a list from \(0\) to \(N-1\), \([0, 1, 2, \ldots, N - 1]\). In this case, \(x * N + rand(N)\) is something like

```
// x is [a, a + 1, a + 2, ..., a + l - 1]
fn enlarge(x: Vec<u32>, N: u32) -> Vec<u32> {
let offsets: Vec<u32> = (0..N).collect();
let mut out = Vec::new();
for i in x.iter() {
for j in offsets.iter() {
out.push(N * i + j);
}
}
out
}
```

What `enlarge`

does is to enlarge a sequence
from \([a, a + 1, a + 2, \ldots, a + l - 1]\)
to \([a \cdot N, a \cdot N + 1, a \cdot N + 2, \ldots, (a + l) \cdot N - 1)]\).

By multiplying \(N\) on each value in \([a, a + 1, a + 2, \ldots, a + l - 1]\),
the difference between \(i\) and \(i + 1\) is enlarged to \(N\), where \(i \in [0, a + l - 1)\).
The sequence becomes \([a \cdot N, (a + 1) \cdot N, (a + 2) \cdot N, \ldots, (a + l - 1) \cdot N]\).
By padding \(0\), \(1, 2, \ldots , N-1\) to the gaps between each value
in \([a \cdot N, (a + 1) \cdot N, (a + 2) \cdot N, \ldots, (a + l - 1) \cdot N]\),
The sequence becomes \([a \cdot N, a \cdot N + 1, a \cdot N + 2, \ldots, (a + l) \cdot N - 1)]\).
This is all the values in \([a, (a + l) \cdot N)\).
In other words, the sequence containing all the values in \([a, a + l)\)
can be enlarged to a sequence containing all the values in \([a, (a + l) \cdot N)\)
by applying `enlarge`

.

If \(a\) is \(0\) and \(l\) is \(N^p\),
then we can `enlarge`

a \([0, N^p)\)-sequence to a \([0, N^{p+1})\)-sequence.

Based on this idea, the following code

```
let mut x = 0;
for _ in 0..k {
// Repeat k times
x = x * N + rand(N);
}
```

is similar to call `k-1`

-times `enlarge`

```
// x is [0, 1, ..., N-1]
for _ in 0..k-1 {
// Repeat k-1 times
x = enlarge(x, N);
}
```

to enlarge a list from \([0, 1, ..., N-1]\) to \([0, 1, \ldots, N^k-1]\).
The first `x = x * N + rand(N)`

is to generate a \([0, 1, \ldots, N-1]\) to `x`

.

See the live demo here.

## Compute Rand *M* from Rand *N*

In fact, the method developped for \(M \leq N\) case is a special case in the method for \(M \gt N\) case.

Recall the algorithm for \(M \gt N\) case:

- Find a \(k\) such that \(N^k \geq M\)
- Define \(Y\) to the max multiple of \(M\) in \(N^k\)
- Get the random number generator in \([0, N^k)\)
- Generate a random number in \([0, M)\):
- Get a random number \(x\) in \([0, N^k)\)
- If \(x\) is in \([0, Y)\), then return \(x % M\)
- Otherwise, repeat from 3-1

If \(M \leq N\), then \(k\) is \(1\)!

Thus, the algorithm can be summarized as:

```
// Get a random number in [0, m) by a random number generator in [0, n)
fn rand_m_from_n(m: u32, n: u32) -> u32 {
assert!(m > 1 && n > 1);
// Get a minimal number p such that n^p >= m
let p = min_pow(n, m);
// Get the max multiple of m in [0, n^p)
let max = (n.pow(p) / m) * m;
loop {
// Get a random number in [0, n^p)
let r = rand_pow(n, p);
// Return the random number if it is in [0, max)
if r < max {
return r % m;
}
// Repeating if the random number is in [max, n^p)
}
}
// Get a random from [0, x^y)
fn rand_pow(x: u32, y: u32) -> u32 {
assert!(x > 1 && y > 0);
let mut r = 0;
for _ in 0..y {
// Repeat y times
r = r * x + rand(x);
}
r
}
// Return minimal k such that x^k >= y
fn min_pow(x: u32, y: u32) -> u32 {
(f64::from(y).log(f64::from(x)).ceil()) as u32
}
```

See the live demo here.

### If *N* is 2

To get the random number in \([0, 2^k)\) where \(k \in \mathbb{N}\), one mentionable trick is to replace

```
x = x * 2 + rand(2);
```

by

```
x = x << 1 | rand(2);
```

when \(N = 2\).

On the other hand, we don’t need to find the max multiple of \(x\) in \([0, 2^p)\), where \(p\) is the minimal number such that \(2^p \geq x\). It doesn’t exist! No multiple of \(x\) is smaller than \(2^p\). If it exists, it is at least \(2x\), and it implies there is a \(2x \leq 2^p\). However, if \(2x \leq 2^p\), then \(x \leq 2^{p-1}\) rather than \(x \leq 2^p\)!

As a result, the program is:

```
// Get a random number in [0, x) by a random number generator in [0, 1]
fn rand_from_2(x: u32) -> u32 {
assert!(x > 1);
// Get a minimal number p such that 2^p >= x
let p = min_pow(x);
loop {
// Get a random number in [0, 2^p)
let r = rand_pow(p);
// Return the random number if it is in [0, x)
if r < x {
return r;
}
// Repeating if the random number is in [x, 2^p)
}
}
// Get a random from [0, 2^k)
fn rand_pow(k: u32) -> u32 {
assert!(k > 0);
let mut r = 0;
for _ in 0..k {
// Repeat k times
r = r << 1 | rand(2);
}
r
}
// Return minimal k such that 2^k >= x
fn min_pow(x: u32) -> u32 {
(f64::from(x).log(f64::from(2)).ceil()) as u32
}
```

See the live demo here.

### How to check if the distribution is uniform

One way to check if the distrubution of a random number generator is uniform
is to apply *chi square test*.
The discussion can be found here.
The Testing a Random Number Generator chapter
in *John D. Cook*’s *Beautiful Testing* is also a great reference to read.