Monte Carlo Simulation Explained

Monte Carlo Methods: I Am Feeling (Un-)Lucky!

In short, Monte Carlo methods are a family of statistical methods used to solve problems such as computing the expected value of a function, or integrating functions that can’t be integrated analytically because they have no closed-form solution. By statistical methods, we mean that they rely on sampling techniques similar to those we studied in detail in the previous chapters. Why do we say Monte Carlo methods, in the plural? Simply because the same principle can be used to solve different problems, and each of these problems is associated with a different technique or algorithm. What all these algorithms have in common is their use of random (or stochastic) sampling. As described by the Russian mathematician Sobol:

The Monte Carlo method is a numerical method of solving mathematical problems by random sampling (or by the simulation of random variables).

MC methods all share the concept of using randomly drawn samples to compute a solution to a given problem. These problems generally come in two main categories:

  • simulation: Monte Carlo or random sampling is used to run a simulation. Say you want to compute the time it will take to go from point A to point B, given conditions such as the chances that it will rain or snow on your journey, that there will be a traffic jam, that you will have to stop on your way to get some gas, etc. You can set these conditions at the start of your simulation and run the simulation 1,000 times to get an estimated time. As usual, the higher the number of runs or trials (here 1,000), the better your estimate.
  • integration: this is a technique useful for mathematicians. In the lesson Introduction to Shading, in the chapter Introduction to the Mathematics of Shading, we learned how to compute simple integrals using the Riemann sum technique. As simple as it is, this approach becomes computationally expensive as the dimension of the integral increases. MC integration, while not having the greatest rate of convergence to the actual solution of the integral, gives us a way of getting a reasonably close result at a “cheaper” computational cost.

But let’s rephrase this to emphasize something very important about this method (actually what’s truly and fundamentally exciting and beautiful about it). It is true that the more samples you use, the closer the MC method gets to the actual solution. But because we use random samples, an MC method can just as well land on the exact value by pure chance. In other words, on occasion, running a single MC simulation or integration will give the right solution. On most occasions it won’t, but averaging the results of many runs will nevertheless converge to the exact solution (we learned about this and the Law of Large Numbers in the previous chapters).

For example, given some conditions about the weather, the day of the week, and the time of day at which you will be traveling from A to B, our first simulation gives a time of 1 hour and 32 minutes. Now let’s say that none of the other 1,000,000,000,000 simulations we ran using the exact same conditions gave that number, but that averaging their results also gives 1 hour and 32 minutes. In other words, the first simulation gave what seems to be the actual solution to the problem (which you would expect the average of one trillion simulations to be pretty close to). Of course, you have no way of knowing that you got the right answer after only one trial, but this is one of the great characteristics of these methods: with very few samples, you may sometimes get the exact solution, or something very close to it.

However, the strength of MC methods is also their main weakness. If by chance you sometimes get the right (or close to the right) solution with only a few samples, you may just as well be unlucky at other times and need a very large number of samples before getting close to the right answer. Generally, the rate of convergence of MC methods (the rate at which they converge to the right result as the number of samples increases) is pretty low (not to say poor). We will talk about this again later in this chapter. This is another important characteristic of MC methods that you need to remember.

Hit-or-Miss Monte Carlo Method

Figure 1: estimating the area of a shape using the hit/miss method.

We will detail each technique (Monte Carlo simulation and integration) in the next chapters, and provide an example of how MC methods are actually used in computer graphics, particularly in the field of rendering. Before we get to that point, however, it is useful and easy to introduce the concept with a simple example. Imagine that we want to estimate the area of an arbitrary shape such as the one drawn in figure 1. All we know is the area of the rectangle containing this shape, defined by the sides ab and ac. Because these sides define a simple rectangle, we know its area to be $A = ab \times ac$. To estimate the area of the shape itself, we can use a technique called hit-or-miss (also sometimes called the rejection method). The idea is to “throw” a certain number of random points uniformly into the rectangle and count the number of points falling within the shape (hits) while rejecting the others. Because the points are randomly distributed over the area $ab \times ac$ of the rectangle, it is reasonable to assume that the area of the shape is proportional to the number of hits over the total number of thrown points (in other words, the ratio of hits to the total number of samples approximates the ratio of the area of the shape to the area of the rectangle in which the shape is inscribed). Assuming we keep the total number of samples (thrown points) constant, the bigger the shape, the higher the number of hits, and conversely, the smaller the shape, the fewer the hits. In other words, we can write:

Figure 2: samples need to be uniformly distributed over the area of the rectangle otherwise results are biased (as in the example). The concentric circles in this example indicate the density of samples.

$$A_{\text{shape}} \approx \frac{N_{\text{hits}}}{N_{\text{total}}} \times A, \quad \text{where } A = ab \times ac.$$

This is a very basic example of how random sampling is used to solve a given problem (this device was originally developed by von Neumann himself, whom you can see in the photograph at the end of this chapter). A few things should be noted. Of course, the more samples we use, the better the estimate. It is also important to note that the distribution of samples over the area of the rectangle needs to be uniform. If for whatever reason more points fell within a certain region of the rectangle (as in figure 2, where the density of samples increases toward the center of the figure), the result would be biased (that is, it would differ from the true solution by some offset). We will see in the next lesson, on importance sampling, that uniform sampling is not an absolute condition for using MC methods: when we know that a non-uniform distribution is used, we can compensate for the bias it would otherwise introduce. Why would we be interested in non-uniform sampling then? Because, as that lesson will show, it can be used as a variance reduction technique. Variance, as explained in the previous chapters, measures the error between our estimate and the true solution. The most obvious, brute-force way of reducing variance in MC methods is to increase $N$, the total number of samples. However, other methods, which act on the way random samples are generated, can also reduce variance; importance sampling (in which non-uniform sampling distributions are used) is one of them. Before we study these more advanced methods, keep in mind that basic or naive Monte Carlo methods require the samples to be uniformly distributed. Remember that a random number has a uniform distribution if all its possible outcomes have the same probability of occurring (a property known as equiprobability).

Figure 3: the area of the unit disk can be estimated using the hit-or-miss Monte Carlo method.

As a practical example, let’s say we want to estimate the area of the unit disk using the hit-or-miss Monte Carlo method. The radius of the unit disk is 1, so the disk is inscribed within a square of side length 2. We could generate samples within this square and count the number of points falling within the disk. To test whether a point is inside (hit) or outside (miss) the disk, we simply measure the distance from the sample to the origin (the center of the unit disk) and check whether this distance is less than or equal to the disk radius (which is 1 for a unit disk). Note that because we can divide the disk into four equal sections (or quadrants), each inscribed in a unit square (figure 3), we can limit this test to the unit square and multiply the resulting number by four. To compute the area of a quarter of the unit disk, we divide the total number of hits (the green dots in figure 3) by the total number of samples and multiply this ratio by the area of the unit square (which is equal to 1). The following C++ code implements this algorithm:

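#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <random>

int main(int argc, char **argv)
{
    std::default_random_engine gen;
    // Uniform samples in [0, 1): we only sample the quadrant inside the unit square.
    std::uniform_real_distribution<float> distr(0.0f, 1.0f);
    int N = (argc == 2) ? atoi(argv[1]) : 1024, hits = 0;
    for (int i = 0; i < N; i++) {
        float x = distr(gen);
        float y = distr(gen);
        float l = std::sqrt(x * x + y * y);  // distance from the origin
        if (l <= 1) hits++;                  // the sample falls inside the disk
    }
    // hits / N estimates the quarter-disk area; multiply by 4 for the full disk.
    fprintf(stderr, "Area of unit disk: %f (%d)\n", float(hits) / N * 4, hits);
    return 0;
}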

This code uses the C++11 random library to generate random numbers with a given random number generator (more information on generating random numbers on a computer can be found in one of the next chapters of this lesson) and a given probability distribution (in this case, a uniform distribution). Check the documentation for more information on this library (at the time of writing, in 2013, C++11 was the most recent version of the C++ standard). If you compile and run this program, you should get:
clang++ -o pi -O3 -std=c++11 -stdlib=libc++ pi.cpp
./pi 1000000
Area of unit disk: 3.141864 (785466)

As you can see, we get pretty close to the exact solution (which is $\pi$, since the area of the unit disk is $A = \pi r^2$ with $r = 1$), and as you increase the number of samples (which you can pass as an argument to the program), the estimate tends to get closer to this number (as expected). If you have used a 3D application in the past, you have probably used random sampling already, maybe without knowing it. With this program though (and the ones to follow), you can now say not only that you know what an MC method is, but also that you have implemented a practical example of one yourself.

Why Do We Use Monte Carlo Methods?

If you run the code to compute the area of the unit disk, you will find that we need about 100 million samples to approximate the number $\pi$ to its fourth decimal (3.1415). Is it an efficient way of estimating $\pi$? The answer is clearly no. Then why do we need Monte Carlo methods at all, if they don’t seem that efficient? As already mentioned in previous lessons, we say that an equation has a closed-form solution when this solution can be expressed, and thus computed, analytically. However, many equations do not have such closed-form solutions, and even when they do, their complexity is sometimes such that they could only be solved given an impractically long (if not infinite) time. Such problems or equations are said to be intractable. Yet it is often better to have some prediction about the possible outcomes of a given problem than no prediction at all, and Monte Carlo methods are then sometimes the only practical way of estimating the solutions to these equations or problems. As Metropolis and Ulam put it in their seminal paper on the Monte Carlo method:
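
A short back-of-the-envelope derivation (added here as a sketch, using only the standard variance of a binomial proportion) is consistent with the sample count quoted above. Each sample is a hit with probability $p = \pi/4$, so the estimator $\hat{\pi} = 4 N_{\text{hits}} / N$ has standard deviation:

```latex
\sigma_{\hat{\pi}} = 4\sqrt{\frac{p(1-p)}{N}}, \qquad p = \frac{\pi}{4}
\quad\Longrightarrow\quad
\sigma_{\hat{\pi}} = \sqrt{\frac{\pi(4-\pi)}{N}} \approx \frac{1.64}{\sqrt{N}}
```

With $N = 10^8$ this gives $\sigma_{\hat{\pi}} \approx 1.6 \times 10^{-4}$, so the fourth decimal is only just being pinned down. Because the error shrinks as $1/\sqrt{N}$, each additional decimal digit costs roughly 100 times more samples.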

To calculate the probability of a successful outcome of a game of solitaire is a completely intractable task. […] the practical procedure is to produce a large number of examples of any given game and then examine the relative proportion of successes. […] We can see at once that the estimate will never be confined within given limits with certainty, but only – if the number of trials is great – with great probability.

As we will see in the next chapters, many of these problems, such as definite integrals, can be solved efficiently by numerical methods that generally converge faster than MC methods (in other words, better methods). However, as the dimension of the integral increases, these methods often become computationally expensive, whereas Monte Carlo methods can still provide a reasonably good estimate at a fixed computational cost (determined by the number of samples used to compute the estimate). For this reason, MC methods are generally a better solution for complex integrals (despite their pretty bad convergence rate).

Finally, Monte Carlo methods are generally incredibly simple to implement and very versatile. They can be used to solve a very wide range of problems, in pretty much every imaginable field. In Metropolis and Ulam’s paper, we can read: “The ‘solitaire’ is meant here merely as an illustration for the whole class of combinatorial problems occurring in both pure mathematics and the applied sciences.”

As already suggested in the introduction, the popularity and development of Monte Carlo methods have very much to do with the advent of computing technology in the 1940s, of which von Neumann (picture above) was a pioneer. In a report on the Monte Carlo method published in 1957 by the Los Alamos Scientific Laboratory, we could already read: “The present state of development of high-speed digital computers permits the use of samples of a size sufficiently large to ensure satisfactory accuracy in most practical problems.”

This is important to understand because, while MC is a pretty simple idea, using it without the help of a computer is a tedious, not to say unusable, approach to solving any sort of problem. A computer can execute all the calculations for us, which is why, despite its poor convergence rate, Monte Carlo or stochastic sampling has become so popular. We just let computers do the tedious work for us.

Finally, let’s conclude this chapter by saying that Monte Carlo methods have very much to do as well with the generation of random numbers (the first few chapters of this lesson were dedicated to studying random variables). To run an MC algorithm we first need to be able to generate random numbers (generally with a given probability distribution). For this reason, the development of algorithms for generating such “random” numbers (they appear random but generally they are not “truly” random which is why these algorithms are called pseudorandom number generators), has been an important field of research in computing technology. 

What is Monte Carlo simulation?

Monte Carlo simulation (also known as the Monte Carlo Method) is a computer simulation technique that constructs probability distributions of the possible outcomes of the decisions you might choose to make. Creating the probability distributions of the outcomes allows the decision-maker to quantitatively assess the level of risk that comes with taking a particular decision and, as a result, select the decision that provides the best balance of benefit against risk.

A typical result of a Monte Carlo simulation is a histogram of the simulated outcomes, like the following:

Figure: Monte Carlo simulation example (histogram of simulated profit).

The horizontal axis shows the possible amount of profit a venture may make, and the vertical axis states how likely those values are. In this example, the histogram shows that the most likely profit is a little under zero, with a possible loss of up to $1M or so, but a potential gain of $5-6M, or even higher (though with a very small probability).

How does a Monte Carlo simulation work?

To perform a Monte Carlo simulation, you must first have a mathematical model, such as a spreadsheet. The model will have one or more results of interest (called outputs), like profit, NPV, cash flow, cost, sales volume, etc. The model will depend on several quantitative assumptions (called inputs), like market size, macroeconomic factors, production capacity, etc. Then, for given values of these inputs, the model determines the value of the outputs through a series of equations.

The greatest weakness of such models is that we are almost always unsure what the value of the inputs will be and, as a result, we are unsure of the outputs.

Before the Monte Carlo simulation, decision-makers would explore how uncertain the outputs (like profit) were by running different ‘what-if’ scenarios. In a typical what-if scenario, one would enter values for each input that would reduce the output result and note the drop in the output, then enter input values that would increase the output and again note the change in the output. This would give a feel for how uncertain the output value was. For example, the following model performs three what-if scenarios, summing a set of costs, where the three scenarios explore what the total cost (the output) might be if each individual cost item (the inputs) were all very low, all at values considered likely, or all at high values:

This kind of analysis shows the decision-maker that the total cost will lie somewhere between $297.5k and $348.4k, and projects a most likely cost of $312.1k.

Although simple, these ‘What-if’ analyses are largely useless because of three key issues:

1. They do not take into account the probability of a scenario. For example, if we could say that there was a 1% chance that each of the cost items was in the range of the minimum estimate then, assuming these costs were independent of each other, the chance of all nine lying around their minimum value would be 1% x 1% x … x 1%, i.e. 0.01^9 = 10^-18, or 1 in a billion billion, a probability so small as to be meaningless.

2. They don’t consider the variety of values that an input can take, just two or three possible values; and

3. They don’t take account of the combinations of values that could constitute a scenario. For example, in the model above some costs could be towards their minima, others towards their maxima, and others around the best guess. With just these nine variables and three values per variable, one can construct 3^9, nearly 20,000 different combinations!

Monte Carlo simulation replaces the values for uncertain variables within the model with functions that generate random samples from probability distributions that represent the uncertainty. For example, the following model is written in ModelRisk:

Cell F3 contains the ModelRisk function VoseTriangle(Minimum, most likely, Maximum) where the input parameters come from the sheet. The function randomly generates a sample, here $133.90k, where the probability of each possible value being generated is defined by the shape of the distribution used. In this case, the Triangle(120, 125, 140) looks like this:

The horizontal axis represents the possible value of the variable (the land purchase cost) and the vertical axis represents the relative likelihood (the probability density) of each value occurring. The Triangle distribution connects the three input values with straight lines to form a triangular shape, hence its name. There are many different distribution types used in risk analysis. The most common are the Triangle, PERT, binomial, Poisson, Normal, Lognormal and Uniform distributions. However, depending on the subject of the model (e.g. stock prices, system reliability, epidemiology), the set of distributions used will be very different. ModelRisk includes essentially all probability distributions used in risk analysis.

In a Monte Carlo simulation model, uncertain values are replaced by functions generating random samples from distributions chosen by the modeler. Then a simulation is run on that model, which amounts to recalculating the model many times, each time using different random values for all the uncertain variables, and storing the resultant values for each output of the model. At the end of the simulation run, the values for each output can be analyzed in various ways – graphs like the histogram above, and others, give pictorial representations of the shape and range of the uncertainty for each output. The output data can also be analyzed statistically to provide information like the probability of the output falling above (or below) some specific target value.

How random samples are generated from uncertain variables

Every probability distribution can be represented by a cumulative distribution function, as shown below:

By definition, a random value from a probability distribution is equally likely to be at any cumulative probability. Reversing that logic, we can generate a random number for the variable by sampling from a Uniform distribution between 0 and 1, and then use the cumulative curve to translate this into a sample value for the variable. In the illustration above, a random value of 0.53 from the Uniform(0,1) distribution translates into a value of 15.9 for the variable.

This idea is key to the Monte Carlo simulation. In effect, for every random variable of a Monte Carlo simulation model, samples are taken from Uniform(0,1) distributions, so each generated scenario is just as likely to occur as any other. However, due to the shape of each cumulative curve, more values will be generated where the cumulative curve is at its steepest, as shown below:

It is because these generated scenarios are all just as likely as each other that we can simply make a histogram distribution or cumulative distribution from the generated output results, and the resultant distributions can be interpreted as approximations of the true theoretical distributions of the output variables.

The more samples (sometimes called iterations) that are run in a simulation, the smoother the resultant distributions become, and the more precisely they match the true theoretical result.

Random number generators used for Monte Carlo simulation

To produce a high-quality Monte Carlo simulation, one must have a method of generating Uniform(0,1) random numbers. Vose Software simulation products use the Mersenne Twister, which is widely considered the best all-around algorithm. The algorithm uses the generated value as an input to produce the next value. The random number generating algorithm starts with a seed value, and all subsequent random numbers that are generated will rely on this initial seed value.

ModelRisk and Tamara both offer the possibility of specifying the seed value for a simulation, an integer from 1 to 2,147,483,647. It is good practice always to use a seed value, and to use the same numbers habitually (like 1, or your date of birth), as you will remember them in case you want to reproduce the same results exactly. Provided the model is not changed (and for ModelRisk that includes the position of the distributions in a spreadsheet model, and therefore the order in which they are sampled), the same simulation results can be exactly repeated. More importantly, one or more distributions can be changed within the model, and by running a second simulation one can look at the effect these changes have on the model’s outputs. It is then certain that any observed change in the result is due to changes in the model and not a result of the randomness of the sampling.

How many samples to run in a Monte Carlo simulation

A very common question is how to determine how many samples to run in a Monte Carlo simulation, which is discussed here.

Monte Carlo simulations have come a long way since they were initially applied in the 1940s when scientists working on the atomic bomb calculated the probabilities of one fissioning uranium atom causing a fission reaction in another. Today we’re going over how to create a Monte Carlo simulation for a known engineering formula and a DOE equation from Minitab.

Since those days when uranium was in short supply and there was little room for experimental trial and error, Monte Carlo simulations have always specialized in computing reliable probabilities from simulated data. Today, simulated data is routinely used in many scenarios, from materials engineering to medical device package sealing to steelmaking. It can be used in many situations where resources are limited or gathering real data would be too expensive or impractical. With Engage or Workspace’s Monte Carlo simulation tool, you have the ability to:

  • Simulate the range of possible outcomes to aid in decision-making.
  • Forecast financial results or estimate project timelines.
  • Understand the variability in a process or system.
  • Find problems within a process or system.
  • Manage risk by understanding cost/benefit relationships.


Depending on the number of factors involved, simulations can be very complex. But at a basic level, all Monte Carlo simulations have four simple steps:


1. Identify the transfer equation. To create a Monte Carlo simulation, you need a quantitative model of the business activity, plan, or process you wish to explore. The mathematical expression of your process is called the “transfer equation.” This may be a known engineering or business formula, or it may be based on a model created from a designed experiment (DOE) or regression analysis. Software like Minitab Engage and Minitab Workspace gives you the ability to create complex equations, even those with multiple responses that may be dependent on each other.

2. Define the input parameters. For each factor in your transfer equation, determine how its data are distributed. Some inputs may follow the normal distribution, while others follow a triangular or uniform distribution. You then need to determine distribution parameters for each input. For instance, you would need to specify the mean and standard deviation for inputs that follow a normal distribution. If you are unsure of what distribution your data follow, Engage and Workspace have a tool to help you decide.

3. Set up the simulation. For a valid simulation, you must create a very large, random data set for each input, something on the order of 100,000 instances. These random data points simulate the values that would be seen over a long period for each input. While it sounds like a lot of work, this is where Engage and Workspace shine. Once we submit the inputs and the model, everything here is taken care of.

4. Analyze the process output. With the simulated data in place, you can use your transfer equation to calculate simulated outcomes. Running a large enough quantity of simulated input data through your model will give you a reliable indication of what the process will output over time, given the anticipated variation in the inputs.


A manufacturing company needs to evaluate the design of a proposed product: a small piston pump that must pump 12 ml of fluid per minute. You want to estimate the probable performance over thousands of pumps, given natural variation in piston diameter (D), stroke length (L), and strokes per minute (RPM). Ideally, the pump flow across thousands of pumps will have a standard deviation no greater than 0.2 ml.

1. Identify the Transfer Equation

The first step in doing a Monte Carlo simulation is to determine the transfer equation. In this case, you can simply use an established engineering formula that measures pump flow:

$$\text{Flow (in ml per minute)} = \pi (D/2)^2 \times L \times \text{RPM}$$

2. Define the Input Parameters

Now you must define the distribution and parameters of each input used in the transfer equation. The pump’s piston diameter and stroke length are known, but you must calculate the strokes-per-minute (RPM) needed to attain the desired 12 ml/minute flow rate. The volume pumped per stroke is given by this equation:

$$\pi (D/2)^2 \times L$$

Given D = 0.8 cm and L = 2.5 cm, each stroke displaces about 1.257 ml. So to achieve a flow of 12 ml/minute, the RPM must be 9.549.

Based on the performance of other pumps your facility has manufactured, you can say that piston diameter is normally distributed with a mean of 0.8 cm and a standard deviation of 0.003 cm. Stroke length is normally distributed with a mean of 2.5 cm and a standard deviation of 0.15 cm. Finally, strokes per minute are normally distributed with a mean of 9.549 RPM and a standard deviation of 0.17 RPM.

3. Set up the Simulation in Engage or Workspace

Click the Insert tab from the top ribbon, and then choose Monte Carlo Simulation.


We made it easy – just give each variable a name, select a distribution from the drop-down menu and enter the parameters. We’ll stick with what we described above. If you are unsure of a distribution, you can select Use data to decide. This will prompt you to upload a .csv file of your data, and you will have a few options to choose from:


4. Simulate and Analyze Process Output

The next step is to give the equation. Here it’s as simple as giving your output a name (ours is Flow) and typing in the correct transfer equation which we identified above. You can also add upper and lower spec limits to see how your simulation compares.


Then, in the ribbon, choose how many simulations you want to run (100,000 is a good baseline) and click the button to run the simulation.

For the random data generated to write this article, the mean flow rate is 11.996 based on 100,000 samples. On average, we are on target, but the smallest value was 8.7817 and the largest was 15.7057. That’s quite a range. The transmitted variation (of all components) results in a standard deviation of 0.756 ml, far exceeding the 0.2 ml target.

It looks like this pump design exhibits too much variation and needs to be further refined before it goes into production. This is where we start to see the benefit of simulation. If we went right into production, we would have produced, most likely, too many rejected pumps. With Monte Carlo Simulation, we can figure all of this out without incurring the expense of manufacturing and testing thousands of prototypes or putting it into production prematurely.


Lest you wonder whether these simulated results hold up, try it yourself! Running different simulations will result in minor variations, but the end result — an unacceptable amount of variation in the flow rate — will be consistent every time. That’s the power of the Monte Carlo method.


Learning the standard deviation is too high is extremely valuable, but where Engage and Workspace really stand out is their ability to help improve the situation. That’s where Parameter Optimization comes in.

Let’s look at our first input, piston diameter. With an average of 0.8, most of our data will fall close to that value, or within one or two standard deviations. But what if it’s more efficient for our flow for the piston to have a smaller diameter? Parameter optimization helps us to answer that question.

To conduct parameter optimization, we need to specify a search range for each input. For this example, for simplicity, I designated a +/- 3 standard deviation range for the algorithm to search. Then, either Engage or Workspace will help us find the optimal settings for each input to achieve our goal, which in this case is to reduce the standard deviation. Selecting the appropriate range is important; make sure that the full range you input is feasible to run; it does no good to find an optimal solution that isn’t possible to replicate in production.


If you’ve used the Response Optimizer in Minitab Statistical Software, the idea is similar. Here are our results:


Based on this, if we want to reduce our standard deviation, we should reduce our Stroke Length and our Strokes per Minute. Our piston diameter can stay in a similar place. And remember the key to Monte Carlo simulation: we can find all of this out without building a single new prototype or conducting a new experiment.



Amir Masoud Sefidian
Machine Learning Engineer
