## Prediction step

In EKF, I simply fed the current state estimate

Along with 𝛘

I can see why

###
Negative feedback loop between 𝚺

###
Previously unobserved landmarks are no use for correction, but still have to be accepted into an expanding state. I have not yet figured out how to calculate the cross-correlation of the newly observed landmark with existing landmark or even the robot, but here's my code used to generate the above result:

__𝛍__^{-}[k] (and the external input__[k]) to the nonlinear function g(__**u**__[k];__**u**__𝛍__^{+}[k-1]) to predict the next state:But in UKF, I have to go through the UT (Unscented Transform): generate an ensemble (called sigma points 𝛘𝛍^{-}[k] = g([k];u𝛍^{+}[k-1])

^{+}[k-1] in section 3.4.1 of the Probabilistic Robotics book) of__𝛍__^{+}[k-1], and feed all those points to g(). Then__𝛍__^{-}[k] is the mean and the covariance of the transformed ensemble are the new__𝛍__^{-}[k] and 𝚺_{𝛍𝛍}^{-}[k].### Sigma points 𝛘

The ensemble 𝛘^{+}[k-1] should be centered around__𝛍__^{+}[k-1], with enough "spread"--based on 𝚺_{𝛍𝛍}^{+}[k-1]. 𝛘^{+}_{0}[k-1] is the current estimate__𝛍__^{+}[k-1]. Subsequent points are some + and - direction along each of dimension of the uncertainty. I visualize this as the pair of points roughly 1~2 standard deviation on either side of__𝛍__^{+}[k-1]. A pair of points for each of the state dimension n produces 1 + 2n sample points, as you can see in the left ellipsoid below, which is rotated around the principle (x-y) axis because of non-zero cross-covariance term in 𝚺_{𝛍𝛍}^{+}[k-1].Unscented transform maps the sigma points through g() |

^{+}[k-1], we have to generate weights*w*_{m}and*w*_{c}to use when calculating the mean and the covariance of the transformed points, respectively. In the book,**w**_{m}and**w**_{c}are the same except for*w*_{c}_{0 }=*w*_{m}_{0}+ (1 - 𝛂^{2}+ 𝛃). Although the book presented only 1 sigma point generating and weighting algorithm, there are apparently many variations listed, for example, in Grewall and Andrews. The only thing that did not understand is the extra (1 - 𝛂^{2}+ 𝛃) term on*w*_{c}_{0}, because the weights are supposed to sum to 1 (normalized); I checked that*w*_{m}sums to 1, which means that*w*_{c}does*not*sum to 1. Until I can tackle down the Ph.Ds in my team for a satisfying answer (don't hold you breadth), here is the Matlab code to generate the sigma points:
function [X, wm, wc] = compute_sigma_points(mu, sigma)

% Computes the 2n+1 sigma points according to the unscented transform,

% where n is the dimensionality of the mean vector mu.

% The sigma points should form the columns of sigma_points,

% i.e. sigma_points is an nx2n+1 matrix.

n = length(mu); % state dimension

global alpha kappa beta;

lambda = max(alpha^2 * (n + kappa) - n, 1);

scale = lambda + n;

[U, D, V] = svd(sigma); %[V,D] = eig(sigma);

D = sqrt(D);

D = max(D, 0.05 * eye(n)); % lower bound explained below

sigmasqr = ...sqrt(n+lambda) * Seems to be too big!

... sqrtm(sigma); %chol(sigma);

U * D * V'; %V * D / V;

mureplicated = repmat(mu, 1, n);

X = [mu, mureplicated + sigmasqr, mureplicated - sigmasqr];

% Computing the weights for recovering the mean and variance.

% Note: weights are column vectors, to allow vector multiply w/ samples

wm = [lambda/scale; repmat(1/(2*scale), 2*n, 1)];

wc = wm;

wc(1) = wc(1) + 1 - alpha^2 + beta;

end

### Prediction in SLAM

Unless some of the states are completely decoupled (the covariance between them is zero), one should feed the full__𝛍__^{+}[k-1] and 𝚺_{𝛍𝛍}^{+}[k-1] to the compute_sigma_points. But in SLAM, the landmarks do*not*move (one of the fundamental assumptions), so there is no need to predict the landmark positions, so I can just constrain the prediction to the robot pose. In this 2-D planar robot, the I can use Matlab's vector operator to transform*all*𝛘^{+}[k-1]*in just a few lines of code.*
[X, wm, wc] = compute_sigma_points(mu(1:3), sigma(1:3,1:3));

theta = X(3, :);

ang = theta + rot1 + rot2;

pose_sample = [X(1, :) + trans * cos(theta + rot1);

X(2, :) + trans * sin(theta + rot1);

cos(ang); % expand out x,y separately

sin(ang)];

% Be careful when computing the robot's orientation (sum up the sines and

% cosines and recover the 'average' angle via atan2)

mu(1:2) = pose_sample(1:2,:) * wm;

mu(3) = atan2(pose_sample(4,:) * wm, pose_sample(3,:) * wm);

*w*_{m}being normalized is important for keeping__𝛍__^{-}[k] unbiased. Prediction of the covariance is also concise in Matlab:
dSigma = X - mu(1:3) * ones(1,length(wm)); % subtract the mean

dSigma(3,:) = wrapToPi(dSigma(3,:)); % normalize angles again

sigma(1:3,1:3) = dSigma * diag(wc) * dSigma' ...

+ diag([0.01 0.01 deg2rad(0.5)]); % robot move uncertainty

###
Negative feedback loop between 𝚺_{mm} and sigma point spread

I first tried using the matrix square root or Cholesky decomposition to generate the pairs of points away from the mean, but I ran into a problem: there is a "negative" feedback between "uncertainty" (can be seen with the norm of the covariance matrix) and the sigma point spread. Intuitively, if the sigma point spread scales linearly with the size of the hyper-ellipsoid, decreasing uncertainly leads to smaller spread in the sigma points, which (unless the nonlinear system equation is grossly non-linear or even singular) in turn leads to smaller covariance matrix norm; because 𝚺

_{mm}does*not*grow during the prediction step like 𝚺_{xx}(see my earlier comment about the landmark stationarity assumption in SLAM), this leads to 𝚺_{mm}asymptotically decaying to 0. This contrasts sharply with the uncertainty of the landmark being lower bounded by the robot pose in EKF (they are completely correlated). I by lower bounding the diagonal entries of the SVD(𝚺_{𝛍𝛍}), I tried to keep the landmark uncertainty disappearing to 0. I am not sure yet if this is the best way to deal with the problem, but 𝚺_{mm}--> 0*is*a numerical problem at minimum because 𝚺_{𝛍𝛍}becomes ill-conditioned. Because I always think about "how am I going to do this on a 32-bit or 16-bit MCU", I am particularly alarmed about this inconvenient discovery that is usually not mentioned in the Kalman Filter fundamental books. Even with this heuristics, the landmark position uncertainty has reduced dramatically compared to EKF.Robot pose (magenta) and landmark position (cyan) estimate at the end of 331 time steps. Note that the estimates have drifted less than in EKF SLAM. |

## Correction step

The unscented transform is used again during the correction step--this time using the a-priori mean__𝛍__^{-}[k] and covariance 𝚺_{𝛍𝛍}^{-}[k] from the prediction step, and using*all*of them instead of just the robot pose portion, as in the SLAM prediction step. To correct the states from observation, I need 2 things: the Kalman gain K[k] and the innovations from the landmark observations. But at a given time step k, the observation consists of old landmarks (from which innovation can be formed) and new landmarks. So I have to form K_{p}[k] and an*ensemble*of innovations (**z**_{p}[k] - hp(𝛘^{-}[k])) using just previously observed landmarks at time step k (subscript "p" for previously known). Note that I*cannot*reuse the ensemble generated during prediction because the predicted state__𝛍__^{-}[k] is different than__𝛍__^{+}[k-1]. Forming the innovations from__z___{p}[k] and 𝛘^{-}[k] is relatively straight-forward, but K_{p}[k] involves more steps. For clarity, let the ensemble of expected observations beZ_{p}[k] = h_{p}(𝛘^{-}[k])

Then loosely speaking, Kz_{p}^{avg}[k] =Z_{p}[k]w_{m}

_{p}[k] is the cross-correlation of 𝛘^{-}[k] with**Z**_{p}[k], divided by the covariance of*[k]:***Z**S_{p}[k] = (Z_{p}[k] -z_{p}^{avg}[k]) diag(w_{c}) (Z_{p}[k] -z_{p}^{avg}[k])^{T}+ Q_{p}[k]

KThe uncertainty of measurement Q_{p}[k] = (𝛘^{-}[k] -𝛍^{-}[k]) diag(w_{c}) (Z_{p}[k] -z_{p}^{avg}[k])^{T}/ S_{p}[k]

_{p}[k] works to de-weight the innovation for the correction, just like in EKF. The state and covariance correction are then:𝛍^{+}[k] =𝛍^{-}[k] + K_{p}[k] ([k] -zz_{p}^{avg}[k] )

𝚺Here is the Matlab implementation of the simple idea given above:_{𝛍𝛍}^{+}[k] = 𝚺_{𝛍𝛍}^{-}[k] - K_{p}[k] S_{p}[k] K_{p}^{T}[k]

lId = z(:,1);

oz = z(ismember(lId, map), :);

if ~isempty(oz) % Have known landmark => can generate innovation

[X, wm, wc] = compute_sigma_points(mu, sigma);

nX = size(X,2); % number of sigma points

% Reorder obs according to map, which is also the order in mu (state)

[~,lIdx] = ismember(oz(:,1), map);

nz = size(oz, 1);

% line 7 on slide 32: Z = h(X)

% Generate distribution of expected observation from sigma_point of

% robot poses AND landmark positions.

Dx = X(2*lIdx+2,:) - ones(nz,1) * X(1,:); % nZ x nX

Dy = X(2*lIdx+3,:) - ones(nz,1) * X(2,:); % nZ x nX

Range = sqrt(Dx.^2 + Dy.^2); % predicted range and bearing distribution

Bearing = atan2(Dy, Dx) - ones(nz,1) * X(3,:);

% To average the angle, must go through unit circle

Cos = cos(Bearing) * wm; Sin = sin(Bearing) * wm;

Range_avg = Range * wm;

% Rearrange according to the mu order

Z_avg = zeros(2*nz, 1);

Z_avg(1:2:end) = Range_avg;

Z_avg(2:2:end) = atan2(Sin, Cos);

q = zeros(2*nz,1);

q(1:2:end) = Range_avg * 0.01;

q(2:2:end) = deg2rad(1) * ones(nz,1);

q = q .^ 2;

% UKF correction algo step 9

dZ = zeros(2*nz, nX);

dZ(1:2:end, :) = Range;

dZ(2:2:end, :) = Bearing;

% innovation covariance matrix S, line 9 on slide 32

dZ = dZ - Z_avg * ones(1,nX); % Subtract the mean of sample

% Normalize the angles

dZ(2:2:end) = wrapToPi(dZ(2:2:end));

S = dZ * diag(wc) * dZ' + diag(q);

% Compute Sigma_x_z, the Cross covariance, line 10 on slide 32

% (which is equivalent to sigma times the Jacobian H transposed in EKF).

dSigma = X - mu * ones(1, nX); % subtract the mean

dSigma(3,:) = wrapToPi(dSigma(3,:)); % normalize angles again

Sigma_xz = dSigma * diag(wc) * dZ';

% Compute the Kalman gain, line 11 on slide 32

K = Sigma_xz / S;

% Current observation

%lId = oz(:,1);

range = oz(:,2); bearing = oz(:,3);

% Update mu and sigma, line 12 + 13 on slide 32

innoZ = zeros(2*nz,1);

innoZ(1:2:end) = range; % Form the innovation

innoZ(2:2:end) = bearing;

innoZ = innoZ - Z_avg;

innoZ(2:2:end) = wrapToPi(innoZ(2:2:end));

mu = mu + K * innoZ;

mu(3) = wrapToPi(mu(3));

sigma = sigma - K * S * K';

end

## No comments:

## Post a Comment