Basis
What is Kalman Filter
An 1 dimensional example: We have two measurements x1 and x2 measuring a value. Each measurement has an uncertainty (standard deviation), σ1 and σ2. How to find the optimal estimate, x^, that best combines x1 and x2.
The idea is to use a weighted least squares, We want to find an x^ that minimizes the “cost” function S=∑i=1nwi(x^−xi)2. We want to give more weight to the measurement we trust more (the one with lower uncertainty), so we set wi=σi21. We solve x^:
x^=∑wi∑wixi
x^=(σ12+σ22σ22)x1+(σ12+σ22σ12)x2
The new uncertainty of x^: Since Var(a⋅X+b⋅Y)=a2⋅Var(X)+b2⋅Var(Y), we get:
σx^2=(σ12+σ22σ22)2σ12+(σ12+σ22σ12)2σ22
σx^2=σ12+σ22σ12σ22
Rewrite the above formula as an update step:
x^=x1+σ12+σ22σ12(x2−x1)
Discrete Kalman Filter
In robotics case, the x1 is the prediction of the robot’s state given previous state, and x2 is the actual measurement of robot’s state from sensor.
The Process Model
st=Fst−1+Gut+wt
- st: The true state of the system at time t.
- F: It defines how the state would change if there were no control inputs. For example, s=[pv] (where p is position, v is velocity), then pt=pt−1+vt−1⋅Δt and vt=vt−1, then
[ptvt]=[10Δt1][pt−1vt−1]
- Gut: ut is control input. G maps that command to a change in the state. For example, ut=[a] which is acceleration, then Δv=a⋅Δt, Δp=21a(Δt)2. As a result,
Gut=[ΔpΔv]=[21(Δt)2Δt][a]
- wt: Process noise, for example, a bump in the road or wheel slippage. p(w)∼N(0,Q), Q represents “How much do I distrust my system model?”.
zt=Hst+vt
- zt: The observation that sensor gives.
- H: For example, your state st might be a 6D vector (position x, y, z; velocity vx, vy, vz), but measurement zt is only a 3D vector (position x, y, z). The H matrix would be a 3x6 matrix that just “selects” the first three components of the state.
- vt: Measurement Noise by the sensor itself. p(v)∼N(0,R). R represents “How much do I distrust my sensor?”.
Kalman Prediction
st∣t−1=Fst−1∣t−1+Gut
- st−1∣t−1: The corrected state after the last update.
- st∣t−1: The new predicted state for the current time t, based only on information from t−1.
Σt∣t−1=FΣt−1∣t−1FT+Qt
- Σt−1∣t−1: Uncertainty covariance matrix of last best estimate st−1∣t−1.
- Σt∣t−1: New, larger predicted uncertainty for the current time t.
Kalman Updating
State Update:
st∣t=st∣t−1+Kt(zt−Hst∣t−1)
- The above formula is equivalent to x^=x1+K(x2−x1).
- st∣t: The corrected state for time t.
- (zt−Hst∣t−1): zt is new measurement, Hst∣t−1 is expected measurement. The difference is called “surprise”.
Kt=Σt∣t−1HTSt−1
St=HΣt∣t−1HT+Rt
- St: Equivalent to σ12+σ22 in the 1D example. Σt∣t−1 is predicted uncertainty (like σ12), and HΣt∣t−1HT is the predicted uncertainty projected into the measurement space. Rt is measurement sensor’s uncertainty (like σ22).
- Kt: This is the matrix version of K=σ12+σ22σ12.
Covariance Update:
Σt∣t=(I−KtH)Σt∣t−1
- Σt∣t: Corrected, smaller uncertainty (like σx^2).
Extended Kalman Filter
The previous formula all assumed a linear world. However, in many cases, the state change is non-linear. For example, a robot’s state [x,y,θ]:
- xt=xt−1+vt⋅cos(θt−1)⋅Δt
- yt=yt−1+vt⋅sin(θt−1)⋅Δt
- θt=θt−1+ωt⋅Δt
As a result, there is no way to write the relationship with st=Fst−1+Gut. In this section, we will use a non-linear expression:
st=f(st−1,ut−1)+wt−1
zt=h(st)+vk
If the system is non-linear, we can’t use the Kalman filter. So, at every single time step, we will approximate the non-linear function with a linear one. The best linear approximation of a function at a specific point is its tangent line (or tangent plane in multi-dimensions) found by first-order Taylor series expansion.
st≈f(st−1∣t−1,ut−1,0)+F(st−1−st−1∣t−1)+wt−1
zt≈h(st∣t,0)+H(st−st∣t)+vk
where Fij=∂sj∂fi(st−1∣t−1,ut−1,0) and Hij=∂sj∂hi(st∣t,0).
EKF Prediction
st∣t−1=f(st−1∣t−1,ut−1,0)
Σt∣t−1=FtΣt−1∣t−1FtT+Qt−1
EKF Updating
Kt=Σt∣t−1HtT(HtΣt∣t−1HtT+Rt)−1
st∣t=st∣t−1+Kt(zt−h(st∣t−1,0))
Σt∣t=(I−KtHt)Σt∣t−1