Demystifying Pseudo-Randomness

Assem Saied Mohammed ElQersh
6 min readAug 2, 2024

Dice Randomness | Source

Randomness is a fundamental concept that permeates various aspects of our lives, from the unpredictability of weather patterns to the roll of dice in a game. In its purest form, randomness is the lack of any predictable order or pattern. This inherent unpredictability is what makes random events both fascinating and challenging to analyze.

True Randomness vs. Pseudo-Randomness

True Randomness
In the physical world, true randomness is derived from inherently unpredictable natural phenomena, such as radioactive decay or thermal noise. Prior states or conditions do not influence these processes, making them truly random. In such processes, outcomes cannot be predicted but can be analyzed statistically using probability. For example, in a fair coin toss, each result (heads or tails) is random, with a probability of 50% for either outcome. True randomness introduces variability, and probability provides a framework to measure and understand this variability.

Pseudo-Randomness
In digital environments, generating true random numbers can be complex and resource-intensive due to the deterministic nature of computers. Instead, computers use deterministic algorithms to produce pseudo-random numbers — sequences that mimic the properties of true randomness but are generated from an initial value known as the seed and follow a predictable pattern. While not genuinely random, these numbers are sufficiently unpredictable for most practical purposes, such as simulations, cryptography, and algorithm testing. Pseudo-randomness allows for the efficient generation of “random” numbers that appear random enough for specific applications, despite their underlying predictability.

Pseudo-Randomness

Computers generate pseudo-random numbers, which appear random but are produced by deterministic algorithms. This article explores the concept of pseudo-randomness and demonstrates how to effectively utilize it using the NumPy library in Python.

Why Pseudo-Randomness Matters

  1. Reproducibility: In scientific experiments and simulations, reproducibility is crucial. A fixed seed ensures that the same sequence of random numbers is generated, allowing consistent and reproducible results.
  2. Testing and Debugging: In software development, particularly in testing algorithms that rely on random inputs, being able to reproduce the same sequence of random numbers can help identify and fix issues.
  3. Security: In cryptographic applications, pseudo-random numbers are used for generating keys and other critical components. While true randomness is ideal, cryptographically secure pseudo-random number generators (CSPRNGs) are designed to produce numbers that are hard to predict and thus secure for practical purposes.

The Role of the Seed

The seed is crucial in generating pseudo-random numbers. It initializes the random number generator and determines the starting point of the sequence. Using the same seed value ensures that the random number generator produces the same sequence of numbers across multiple runs. This feature allows for reproducibility, which is vital in scientific research, testing, and debugging.

Source

Pseudo-Randomness in NumPy

NumPy, a fundamental package for scientific computing in Python, provides a suite of tools for generating pseudo-random numbers. The numpy.random module is the primary interface for generating these numbers, offering a wide range of functions for various distributions. You can install NumPy through the following command:

pip install numpy

Key Functions in numpy.random

seed()

  • The seed() function initializes the random number generator. Setting the seed ensures that the sequence of random numbers generated remains the same across multiple runs.
import numpy as np
np.random.seed(42)

rand()

  • Generates an array of random numbers uniformly distributed between [0, 1).
random_array = np.random.rand(5)
print(random_array)

randn()

  • Generates an array of random numbers from the standard normal distribution (mean 0, standard deviation 1).
random_normal_array = np.random.randn(5)
print(random_normal_array)

randint()

  • Generates random integers within a specified range.
random_integers = np.random.randint(0, 10, size=5)
print(random_integers)

choice()

  • Randomly selects elements from a given array.
elements = np.array([1, 2, 3, 4, 5])
random_choices = np.random.choice(elements, size=3)
print(random_choices)

shuffle()

  • Randomly shuffles the elements of an array in place.
array_to_shuffle = np.array([1, 2, 3, 4, 5])
np.random.shuffle(array_to_shuffle)
print(array_to_shuffle)

Random Distributions in NumPy

NumPy also provides functions to generate random numbers from various statistical distributions, such as:

  • Uniform Distribution: np.random.uniform(low, high, size)
  • Normal Distribution: np.random.normal(loc, scale, size)
  • Binomial Distribution: np.random.binomial(n, p, size)
  • Poisson Distribution: np.random.poisson(lam, size)

Example: Monte Carlo Simulation

Monte Carlo simulations are a common application of pseudo-random numbers. Here’s a simple example to estimate the value of π using NumPy.

import numpy as np

# Number of random points
n_points = 1_000_000

# Generate random points in the unit square
x = np.random.rand(n_points)
y = np.random.rand(n_points)

# Calculate distance from the origin
distance = np.sqrt(x**2 + y**2)

# Estimate π
pi_estimate = 4 * np.sum(distance <= 1) / n_points
print(f"Estimated value of π: {pi_estimate}")

Maaaan! What a drag! why do we even have to use NumPy?!

Advantages of Using NumPy for Pseudo-Randomness Illustration

  1. Efficiency: NumPy’s random number generation functions are optimized for performance, making them suitable for large-scale simulations and data analysis.
  2. Flexibility: NumPy supports a wide range of distributions and random operations, making it versatile for various applications.
  3. Reproducibility: The ability to set seeds ensures that results can be replicated, which is essential in scientific research and development.

So how can one try and build his pseudo-randomness algorithm by himself?

Understand the Basics of Pseudo-Random Number Generation

Pseudo-Random Number Generators (PRNGs): PRNGs use mathematical formulas to produce sequences of numbers that approximate the properties of random numbers. Unlike true random numbers, which are derived from unpredictable physical processes, PRNGs are deterministic, meaning they will produce the same sequence if started with the same initial value (seed).

Choose a Simple Algorithm

Linear Congruential Generator (LCG): A common and simple PRNG algorithm is the Linear Congruential Generator. It’s defined by the recurrence relation:

Source

Implement the Algorithm in the Code

Here’s a Python implementation of an LCG:

class LCG:
def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
self.seed = seed
self.a = a
self.c = c
self.m = m
self.current = seed

def next(self):
self.current = (self.a * self.current + self.c) % self.m
return self.current

def random(self):
# Normalize the value to be between 0 and 1
return self.next() / self.m

# Example usage
lcg = LCG(seed=42)
for _ in range(5):
print(lcg.random())

Test and Adjust Parameters

  • Parameters: The choice of a, c, and m is critical for the quality of the generated sequence. Poor choices can lead to short cycles or less randomness.
  • Testing: Test the output for randomness by running multiple sequences and checking for patterns. Standard tests include chi-squared tests, frequency tests, and autocorrelation tests.

For applications like cryptography, basic algorithms like LCG aren’t sufficient. You’d need to implement more complex algorithms like a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG), which involves using hash functions or block ciphers in counter mode.

Considerations for Practical Use

  • Cycle Length: Ensure the period (cycle length) of your PRNG is long enough for your intended use.
  • Performance: Balance the algorithm’s complexity with the performance requirements of your application.
  • Use Cases: Simple PRNGs like LCGs are suitable for simulations and games, but not for secure applications.

You can also explore algorithms like the Mersenne Twister, Xorshift, or Lagged Fibonacci generators, which offer better statistical properties and longer periods.

Understanding pseudo-randomness is essential for effectively using random number generators in computing. NumPy’s random module is a powerful tool for generating pseudo-random numbers and performing probabilistic simulations. By leveraging its functions, you can efficiently incorporate randomness into your data analysis, machine learning, and scientific computing tasks.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

Assem Saied Mohammed ElQersh
Assem Saied Mohammed ElQersh

Written by Assem Saied Mohammed ElQersh

IT support enthusiast venturing into Machine Learning. Join me to explore tech topics. Let's learn together in the digital realm!

No responses yet

Write a response