“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. For, as has been pointed out several times, there is no such thing as a random number — there are only methods to produce random numbers.” – John Von Neumann
Random number generators (RNGs) are a crucial part of many different computer software processes, such as cryptography, simulations, machine learning, gaming, programming, gambling, scientific studies, and the list goes on. But you might ask how exactly do you make something random, and why is it so important?
It turns out that computers are not very good at being spontaneous – they can only perform what you program them to do. RNGs give computers the ability to generate unique non-uniform numbers. In other words, RNGs let computers simulate unpredictability.
Two Types of Random Number Generators
Random number generators break down into two different types: true random number generators (TRNGs) and pseudo-random number generators (PRNGs) . TRNGs are considered “true” because they utilize entropy from an external, non-computer program source, like the weather, atmospheric noise, or the amount of time you spend pressing down a key on your computer’s keyboard.
More scientific methods include measuring the radioactive decay of an atom . According to quantum theory, there’s no way to know for sure when radioactive decay will occur, so this is essentially “pure randomness.” True random number generation is a fascinating topic and we have an entire article dedicated to it here.
TRNGs are great for making authentic random numbers (aRNs), but their processes are often computationally heavy and unscalable, as they are reliant upon external data sources and sensors.
Pseudo-Random Number Generation (blue) vs. True Random Number Generation (white)
Pseudo-Random Number Generators
PRNGs seem random but actually are not . They are more widely used due to their speed, and repeatability, which make them great for simulations and games. Essentially, PRNGs are algorithms that use mathematical formulas or simply pre-calculated tables to produce sequences of numbers . Take the linear congruential generator (LCG) for example. It is composed of the seed, multiplier, increment, and modulus.
LCGs use a single seed number to start the algorithm. The LCG goes through this equation: (seed * multiplier + increment) mod m = output
Eventually, the sequence will start to repeat after a number of tries. This is called the “period,” a measurement of unique numbers in a sequence. LCG’s are not used for encryption because they are not sufficiently random. They have small periods and uniform patterns.
Most encryption PRNGs are based on the Mersenne Twister. Developed in 1997 by Makoto Matsumoto and Takuji Nishimura, the Mersenne Twister is the most widely used general-purpose PRNG . Its name comes from the fact that its period is 2¹⁹⁹³⁷− 1, which is the Mersenne prime number length. The Mersenne Twister is the default PRNG for many software systems like Microsoft Excel, MATLAB, Free Pascal, PHP, Python, R, Ruby, and C++. It is also commonly used in games. Many systems use it because of its ability to quickly generate high-quality pseudo-random numbers.
Gaming Use Cases
As computer games become more life-like, the need for better random number generators increase. PRNGs are used in gameplay, graphics, and many other aspects of game development. When your avatar strikes an opponent, the game needs a random number to approximate the damage range. When you’re playing a game with ultra-realistic graphics, RNGs delegate light reflection to avoid overworking your processor. Minecraft uses a random seed number to procedurally generate the world around your avatar.
These types of games are called “procedural generation games” and have gained a lot of popularity in recent years. The simple reason being is that you don’t need to make everything by hand. A procedural generator is a system that uses RNGs to create an entire world through randomness . In the case of “No Man’s Sky,” the game generates an entire universe. With procedural generation, you save a lot of memory while generating almost an infinite amount of gameplay environments and characters. You simply would not be able to do this without RNGs, as it would be too computationally wasteful for smooth game play on most devices.
There are downsides, though. “No Man’s Sky” uses a 64-digit seed to generate each planet. This seed is fed into the biome generator to create the terrain and other various planet attributes. But having a 64 digit number is not enough, and as a result the planets don’t appear to have sufficient diversity.
The game starts a generation sequence every time you enter a new solar system . Initially, you see big diverse planets with blue oceans and lush green forests. Once you start to explore the surface of the planets, patterns in the environments start to emerge. Redundant landscapes and animals are evenly spread over the surface of a planet and patterns become obvious.
Technically these planets are randomly generated, but they are not wholly unique expressions. The game can only handle slight variations in environments due in part to the limitations of the utilized PRNG. To remedy this dilemma a longer seed number is needed. A longer seed number would give extra variables, but that alone would not be enough. An attribute of randomness is also how dispersed the numbers are. You don’t want to have a dozen of the same number in consecutive order. Randomness is also a product of how non-uniform number sequences are. This is why randomness is so important in gaming.
It is possible to test the uniformity of the pseudo-random numbers with a Monte Carlo simulation . Monte Carlo methods are types of algorithms that repeatedly sample something for numerical results. The basic idea is to use randomness to solve problems.
Plotting Pi using Monte Carlo method.
They are commonly used to estimate unknown ratios and areas. In the figure above Monte Carlo is applied to estimate the value of pi. Monte Carlo simulation methods do not always require truly random numbers to be effective. Often, deterministic pseudorandom sequences making it easy to test and re-run simulations . Let’s take a look into how the the Roomba robotic vacuum navigates a house’s floor plan, for example.
House cleaning robots, like Roomba, use a machine learning model-free learning framework for navigation . This means it works through trial and error: you provide a goal for the robot to accomplish, but you don’t provide it specific plans for it to operate.
The Roomba 980 vacuum builds a map of its surroundings as it cleans a house
The model-free framework uses Monte Carlo simulation to figure out the floor plan of your home. Instead of simply covering every possible area inch by inch, it randomly selects points to search to find the area of the floor plan. It uses inputs gained from sensors to verify if its covered the whole floor or not. After completing the search, it goes back to its origin.
Domain Randomization (DR) is another way randomness is used in machine learning. DR is a model-based learning framework. It uses set rules for the AI to learn by. It is used in the open AI learning hand dexterity project.
Dactyl is a robot that is trained over many random simulations to let the computer learn natural hand movements. The project has indicated that their model has 100 years of hand experience. This type of machine learning would not be possible without RNGs to provide simulation data to learn from.
Cryptography Use Cases
Ever wonder how your password and browser are encrypted? You guessed it: Random numbers are foundational to digital encryption . Whenever your browser wants to access information online it goes through SSL encryption, which stands for Secure Sockets Layer.
The process starts like this:
First your browser requests for server identity.
The web server then sends back an SSL certificate to your browser.
The browser/server checks to see whether or not it trusts the SSL certificate. If so, it sends a message to the web server.
The web server sends back a digitally signed acknowledgement to start an SSL encrypted session.
Encrypted data is shared between the browser/server and the web server.
This is basically how we keep all of our data encrypted online.
A History of Secrecy:
The use of randomness to encrypt messages goes all the way back to Roman times. Romans used the Caesar cypher to encode military messages with a random key. The cypher only had 26 different letters to shift from, so the probability of cracking the code was 1/26. An enemy unit could try each letter out of the Latin alphabet to crack the code which would only take several hours, but by the time the message is deciphered the marching orders would be useless.
The Caesar cypher
With modern standards of cryptography, we use something called the one-time pad. This method shifts every character in a message with a random letter. This significantly decreases the chances of guessing the message. Let’s say you want to encrypt the name “Alice,” for example:
A l i c e 26*26*26*26*26
Randomizing every character in the message makes decryption exponentially more difficult to achieve. In the case of “Alice” in the example above, the probability of solving it on the first try is about one in about 12 million (26*26*26*26*26). The longer the key, the stronger the encryption.
If you’re looking for heavy-duty encryption, there is SHA-256 . SHA stands for “Secure Hash Algorithm” commonly used in blockchains. A hash function is a type of mathematical function which turns data into a fingerprint called a hash value. The hash algorithm is a function that takes the input data and turns it into an output of a fixed length. A hash function is a one-way function. It is impossible to reverse the input data from its hash; you can go from the inputted data to the hash, but not from the hash to the input data.
SHA-256 has 2^256 possible outcomes. This number is so big that you need to multiply 4 billion 8 times to equal it . If every computer on the planet was used to decrypt all the possible signatures of the message, it would take 37 times the age of the universe. SHA-256 is the basis on which most blockchains are secured and is the standard encryption method of the NSA and CIA. This is one of the many reasons people put so much faith in blockchain technologies.
True randomness is unpredictable and difficult to generate within computer systems. In a mathematical world where recorded data and formulas for predictions are everywhere, randomness is becoming harder and harder to find. But therein lies the value. Having something that you cannot predict will always be of value to pattern-seeking humans, especially with privacy being a major concern in society at large today. Whether it be for simulation calculation, encrypting messages, or making slot machines more random, RNGs are becoming a critical component of our society.