
Kalman Filter Simply Explained


Let’s start with what a Kalman filter is: it’s a method of predicting the future state of a system based on its previous states. To understand what it does, take a look at the following data. If you were given the data in blue, it may be reasonable to predict that the green dot should follow, by simply extrapolating the linear trend from the previous few samples. However, how confident would you be in predicting the dark red point on the right using that method? And how confident would you be in predicting the green point if you were given the red series instead of the blue?

[Figure: example data series in blue and red, with candidate next points in green and dark red.]

From this simple example, we can learn three important principles:

  1. It’s not good enough to give a prediction – you also want to know the confidence level.
  2. Predicting far ahead into the future is less reliable than nearer predictions.
  3. The reliability of your data (the noise) influences the reliability of your predictions.

Now, let’s try to use the above to model our prediction.

The first thing we need is a state. The state is a description of all the parameters we need to describe the current system and perform the prediction. For the example above, we’ll use two numbers: the current vertical position (y) and our best estimate of the current slope (let’s call it m). Thus, the state is in general a vector, commonly denoted x, and you can of course add many more parameters to it if you wish to model more complex systems.

The next thing we need is a model: The model describes how we think the system behaves. In an ordinary Kalman filter, the model is always a linear function of the state. In our simple case, our model is:
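Assuming one time step between consecutive samples, a linear-trend model consistent with the two-parameter state above can be written as follows. The symbol $A$ for the model matrix is notation introduced here for illustration.

$$
\begin{aligned}
y_t &= y_{t-1} + m_{t-1} \\
m_t &= m_{t-1}
\end{aligned}
\qquad\Longleftrightarrow\qquad
x_t = A\,x_{t-1}, \quad
x_t = \begin{pmatrix} y_t \\ m_t \end{pmatrix}, \quad
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
$$

The first row extrapolates the position along the current slope; the second says the slope is expected to stay the same.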

Of course, our model isn’t perfect (else we wouldn’t need a Kalman filter!), so we add an additional term to the model: the process noise, $v_t$, which is assumed to be normally distributed. Although we don’t know the actual value of the noise, we assume we can estimate how “large” the noise is, as we shall presently see. All this gives us the state equation, which is simply:
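In the conventional matrix form, writing $A$ for the matrix of the linear model and $Q$ for the covariance of the process noise (both symbols are standard notation assumed here, not defined in the text), a state equation of this kind reads:

$$
x_t = A\,x_{t-1} + v_t, \qquad v_t \sim \mathcal{N}(0,\, Q)
$$

Here $Q$ is what quantifies how “large” the process noise is.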

The third and final part we are missing is the measurement. When we get new data, our parameters should change slightly to refine our current model and the next predictions. What is important to understand is that one does not have to measure exactly the same parameters as those in the state. For instance, a Kalman filter describing the motion of a car may want to predict the car’s acceleration, velocity, and position, but measure only, say, the wheel angle and rotational velocity. In our example, we only “measure” the vertical position of the new points, not the slope.
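In symbols, writing $z_t$ for the value measured at step $t$ and ignoring measurement noise for the moment, the measurement in the line example is just the first component of the state:

$$
z_t = y_t
$$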

In the more general case, we may have more than one measurement, so the measurement is a vector, denoted by z. Also, the measurements themselves are noisy, so the general measurement equation takes the form:
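In the standard form, with $x_t$ the state at step $t$, the measurement equation is usually written as shown here. The specific $H$ given for the line example is an assumption that follows from measuring only the vertical position.

$$
z_t = H\,x_t + w_t, \qquad \text{with } H = \begin{pmatrix} 1 & 0 \end{pmatrix} \text{ for the line example}
$$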

Here w is the measurement noise, and H is, in general, a matrix whose width is the number of state variables and whose height is the number of measurement variables.
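To make the three ingredients (state, model, measurement) concrete, here is a minimal Python sketch that simulates the line example. The matrices `A` and `H`, the unit time step, and the helper name `simulate` are assumptions made for illustration; the noise is drawn as independent Gaussian values with the standard deviations passed in.

```python
import numpy as np

# Line model: state x = (y, m), one time step per sample (assumed).
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # y <- y + m, m <- m
H = np.array([[1.0, 0.0]])   # only the vertical position is measured


def simulate(x0, steps, process_std, measurement_std, seed=0):
    """Return (states, measurements) generated by the noisy line model."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    states, measurements = [], []
    for _ in range(steps):
        # State equation: model prediction plus process noise.
        x = A @ x + rng.normal(0.0, process_std, size=2)
        # Measurement equation: observed position plus measurement noise.
        z = H @ x + rng.normal(0.0, measurement_std, size=1)
        states.append(x)
        measurements.append(z)
    return np.array(states), np.array(measurements)


# With both noise levels set to zero the measurements fall on a perfect line.
states, measurements = simulate(x0=[0.0, 1.0], steps=5,
                                process_std=0.0, measurement_std=0.0)
print(measurements.ravel())  # [1. 2. 3. 4. 5.]
```

Raising `measurement_std` scatters the points around the line, while raising `process_std` lets the line itself drift; these are the two kinds of unreliability the filter has to weigh against each other.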

Now that we understand what goes into modeling the system, we can start with the prediction stage, the heart of the Kalman filter.
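In standard notation (an assumption here: $A$ is the model matrix, $H$ the measurement matrix, and hats mark predicted quantities), the filter first predicts the state and the measurement it expects to see, then compares that expectation with the actual measurement $z_t$:

$$
\hat{x}_t = A\,x_{t-1}, \qquad \hat{z}_t = H\,\hat{x}_t, \qquad y = z_t - \hat{z}_t
$$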

The difference y (also called the innovation, and not to be confused with the vertical position y in the state) represents how wrong our current estimate is: if everything were perfect, the difference would be zero! To incorporate this into our model, we add the innovation to our state equation, multiplied by a matrix factor that tells us how much the state should change based on this difference between the expected and actual measurements:
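Written out in the usual notation (assumed here, with $A$ the model matrix, $H$ the measurement matrix, and $W$ the matrix factor), this update takes the form $x_t = A\,x_{t-1} + W\,y$ with $y = z_t - H A\,x_{t-1}$. The following Python sketch applies it to the line example. The matrices and the helper name `update` are illustrative, and the gain `W` is a hand-picked constant rather than one computed by a Kalman filter.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # assumed line model: y <- y + m, m <- m
H = np.array([[1.0, 0.0]])   # only the vertical position is measured


def update(x_prev, z, W):
    """One step of x_t = A x_{t-1} + W y, where y is the innovation."""
    x_pred = A @ x_prev              # prediction from the model
    innovation = z - H @ x_pred      # actual minus expected measurement
    return x_pred + W @ innovation, innovation


# Hand-picked gain, purely illustrative.
W = np.array([[0.5],
              [0.1]])

x = np.array([0.0, 1.0])             # start at position 0 with slope 1

# The measurement agrees with the prediction: innovation is 0, state is (1, 1).
x, innovation = update(x, np.array([1.0]), W)
print(x, innovation)

# The measurement is 1 above the predicted position 2: both the position
# and the slope estimates are nudged upward, to (2.5, 1.1).
x, innovation = update(x, np.array([3.0]), W)
print(x, innovation)
```

Note that a measurement of position alone still corrects the slope, because the second row of `W` maps the position error onto the slope estimate.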

The matrix W is known as the Kalman gain, and its determination is where things get messy, but understanding why the prediction takes this form is the really important part. But before we get into the formula for W, we should give thought to what it should look like:

  • If the measurement noise is large, perhaps the error is only an artifact of the noise, and not a “true” innovation. Thus, if the measurement noise is large, W should be small.
  • If the process noise is large, i.e., we expect the state to change quickly, we should take the innovation more seriously, since it is plausible that the state has actually changed. Thus, if the process noise is large, W should be large.
  • Putting these two together, we expect: W ∼ (Process Noise / Measurement Noise)
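The two trends above can be checked numerically in the simplest possible setting: a one-dimensional filter that measures its state directly. The sketch below uses the standard textbook gain for that scalar case, which is not derived in the text; the function name `scalar_gain` and the numbers are illustrative. The exact expression is not a plain ratio, but it moves in the same directions as the rule of thumb.

```python
def scalar_gain(prior_var, process_var, measurement_var):
    """Gain of a one-dimensional filter that measures its state directly."""
    # Uncertainty of the prediction grows by the process noise.
    predicted_var = prior_var + process_var
    # The gain weighs prediction uncertainty against measurement noise.
    return predicted_var / (predicted_var + measurement_var)


base = scalar_gain(prior_var=1.0, process_var=1.0, measurement_var=1.0)
noisy_sensor = scalar_gain(prior_var=1.0, process_var=1.0, measurement_var=10.0)
jumpy_state = scalar_gain(prior_var=1.0, process_var=10.0, measurement_var=1.0)

# Roughly 0.67, 0.17 and 0.92: more measurement noise lowers the gain,
# more process noise raises it.
print(base, noisy_sensor, jumpy_state)
```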

One way to evaluate the uncertainty of a value is to look at its variance. The first variance we care about is the variance of our prediction of the state:
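In the standard notation (assumed here, since the symbol is not defined in the text), this variance is the covariance matrix of the prediction error, usually written $P$, where $x_t$ is the true state and $\hat{x}_t$ our prediction of it:

$$
P_t = \operatorname{cov}\!\left(x_t - \hat{x}_t\right) = E\!\left[(x_t - \hat{x}_t)(x_t - \hat{x}_t)^{\mathsf{T}}\right]
$$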

Amir Masoud Sefidian
Machine Learning Engineer
