Lucas-Kanade on Manifolds

Why write this post?

I discussed the basics of Lucas-Kanade in this post.

We wanted to find a translation \((dx, dy)\) such that the difference between image patch A and the translated image patch B is minimized.

\(\text{minimize} \; \text{cost}(dx, dy) = \sum_{x,y}||A(x,y) - B(x+dx, y + dy) ||^2\)

Let’s call \((m_x, m_y) = M(x, y, dx, dy) = (x+dx, y + dy)\) the tracking motion function.

The tracking motion function doesn’t have to be a translation. Since I discussed optimization on manifolds in this post, I’d like to use a motion function which has an SO(2) element as one of its parameters.

Let’s consider a new problem, where we want to find a translation and a rotation \((dx, dy, R)\) such that the difference between image patch A and the transformed image patch B is minimized.

\(\text{minimize} \; \text{cost}(dx, dy, R) = \sum_{x,y}||A(x,y) - B(M(x, y, dx, dy, R)) ||^2\)

where,

\(R \in SO(2)\)

\(M(x, y, dx, dy, R) = \begin{bmatrix} m_x\\ m_y \end{bmatrix} = R \begin{bmatrix} x\\y \end{bmatrix} + \begin{bmatrix} dx\\dy \end{bmatrix}\)

Let me show the result first.

I generated a moving square, and tracked its 4 corners using the above model.

Prerequisites

Lucas-Kanade Tracker Basic: https://wang-yimu.com/lucas-kanade-tracker/

Introduction to Optimization on Manifolds: https://wang-yimu.com/introduction-to-optimization-on-manifolds/

Least Squares/Lucas-Kanade Refresher

For a residual function \(r(x)\), define the cost function to be,

\(\text{minimize} \; cost(x) = r(x)^T r(x)\).

We linearize the residual function at the current variable \( x_{\text{current}} \) as,

\(r(\Delta{x}) \approx J \Delta{x} + b\).

Where \(\Delta{x} = x - x_{\text{current}}\)

Plugging this into the cost function,

\(cost(x) \approx \Delta{x}^T J^TJ \Delta{x} + 2 b^T J \Delta{x} + b^T b\)

It is a quadratic in \(\Delta{x}\). Setting its gradient with respect to \(\Delta{x}\) to zero, we can minimize the linearized cost in one step by solving the normal equation,

\(J^TJ \Delta{x} = -J^Tb\)

and we update x by,

\(x = x_{\text{current}} + \Delta{x}\)

We repeat this iteration until a termination condition is met, e.g. \(\Delta{x}\) is small.

Check this post for more about the nonlinear least-squares derivation: https://wang-yimu.com/least-squares-refresher/

For the Lucas-Kanade problem, the cost function is,

\(\text{minimize} \; \text{cost}(dx, dy) = \sum_{x,y}||A(x,y) - B(x+dx, y + dy) ||^2\)

The residual function stacks the per-pixel differences over the image patch.

\(r(dx, dy) = \begin{bmatrix} r_{x=0, y = 0}( dx, dy ) \\ …\\ r_{x=i, y=j}( dx, dy ) \\ … \\ r_{x=n,y=n}( dx, dy ) \end{bmatrix} = \begin{bmatrix}A(0,0) - B(0+dx, 0 + dy) \\ … \\ A(i,j) - B(i+dx, j + dy) \\ … \\ A(n,n) - B(n+dx, n + dy) \end{bmatrix} \)

To solve the optimization, we linearize the Lucas-Kanade residual function and solve the normal equation.

Because the image patches are discrete functions, we need to compute the derivatives numerically and use interpolation to handle non-integer pixel coordinates.
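For example, bilinear interpolation of a single-channel float image could look like the sketch below. This is a minimal version in the spirit of the bilinear_interp helper used in the implementation section; the repository’s code may differ, and boundary checks are omitted.

// A minimal bilinear interpolation sketch for a single-channel CV_32F image.
// Needs <opencv2/core.hpp> and <cmath>. Boundary checks omitted for brevity.
float bilinear_interp(const cv::Mat& im, const cv::Point2f& p)
{
    // Integer corner and fractional offsets.
    const int x0 = static_cast<int>(std::floor(p.x));
    const int y0 = static_cast<int>(std::floor(p.y));
    const float ax = p.x - x0;
    const float ay = p.y - y0;

    // Weighted average of the 4 neighboring pixels.
    return (1 - ax) * (1 - ay) * im.at<float>(y0, x0)
         + ax * (1 - ay) * im.at<float>(y0, x0 + 1)
         + (1 - ax) * ay * im.at<float>(y0 + 1, x0)
         + ax * ay * im.at<float>(y0 + 1, x0 + 1);
}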

The Problem Formulation

We want to find a translation and a rotation \((dx, dy, R)\) such that the difference between image patch A and the transformed image patch B is minimized.

\(\text{minimize} \; \text{cost}(dx, dy, R) = \sum_{x,y}||A(x,y) - B(M(x, y, dx, dy, R)) ||^2\)

where,

\(R \in SO(2)\)

\(M(x, y, dx, dy, R) = \begin{bmatrix} m_x\\m_y \end{bmatrix} = R \begin{bmatrix} x\\y \end{bmatrix} + \begin{bmatrix} dx\\dy \end{bmatrix}\)

Since we want \(R \in SO(2)\), we can parameterize R by its Lie algebra so(2).

Can’t we simply use the 2D rotation formula (the yaw formula) to parameterize the rotation?
For 2D rotations, the Lie-group approach and the plain 2D rotation formula are the same. But for 3D rotations, the Lie-group approach is better.
I’d like to follow the Lie-group framework here so that we become familiar with it.

The so(2) Problem Formulation

For a 2D rotation R, we can express R as the exponential map of its so(2) element \(\theta\),

\(R(\theta) = Exp(\theta) = \begin{bmatrix} \cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta) \end{bmatrix} \)
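As a concrete (hypothetical) helper, building \(R(\theta)\) from \(\theta\) is one line of trigonometry:

// Exp map of so(2): build the 2x2 rotation matrix R(theta).
// Hypothetical helper for illustration, not from the repository.
cv::Mat exp_so2(const float theta)
{
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    return (cv::Mat_<float>(2, 2) << c, -s,
                                     s,  c);
}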

Using the so(2) parameterization, the problem becomes,

\(\text{minimize} \; \text{cost}(dx, dy, \theta) = \sum_{x,y}||A(x,y) - B(M(x, y, dx, dy, \theta)) ||^2\)

where,

\(M(x, y, dx, dy, \theta) = \begin{bmatrix} m_x\\m_y \end{bmatrix} = Exp(\theta) \begin{bmatrix} x\\y \end{bmatrix} + \begin{bmatrix} dx\\dy \end{bmatrix}\)

By the definition of least squares problems,

\(\text{minimize} \; \text{cost}(dx, dy, \theta) = ||r(dx, dy, \theta)||^2\)

We can see that the residual function for the above problem is,

\(r(dx, dy, \theta) = \begin{bmatrix} r_{x=0, y = 0}( dx, dy, \theta ) \\ …\\ r_{x=i, y=j}( dx, dy ,\theta) \\ … \\ r_{x=n,y=n}( dx, dy, \theta) \end{bmatrix} = \begin{bmatrix}A(0,0) - B(M(0,0,dx,dy,\theta)) \\ … \\ A(i,j) - B(M(i,j,dx,dy,\theta)) \\ … \\ A(n,n) - B(M(n,n,dx,dy,\theta)) \end{bmatrix} \)
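For concreteness, here is a sketch of how this residual vector could be computed for a patch with pixel indices \(0 \dots n\). The helper names (compute_residual, and the bilinear_interp sketched earlier) are illustrative; the repository organizes this differently.

// Sketch: stack the per-pixel residuals A(x, y) - B(M(x, y, dx, dy, theta)).
// Helper names are illustrative, not the repository's exact code.
cv::Mat compute_residual(const cv::Mat& A, const cv::Mat& B,
                         const float dx, const float dy, const float theta,
                         const int n)
{
    cv::Mat residual((n + 1) * (n + 1), 1, CV_32F);

    const float c = std::cos(theta);
    const float s = std::sin(theta);

    int count = 0;
    for (int y = 0; y <= n; ++y) {
        for (int x = 0; x <= n; ++x) {
            // M(x, y, dx, dy, theta) = R(theta) * (x, y) + (dx, dy)
            const cv::Point2f m(c * x - s * y + dx, s * x + c * y + dy);
            residual.at<float>(count++, 0) = A.at<float>(y, x) - bilinear_interp(B, m);
        }
    }
    return residual;
}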

The Jacobian of the Residual

For a least-squares problem, we need to derive the Jacobian of the residual w.r.t. the variables.

Let \(vars = \begin{bmatrix} \theta \\ dx \\ dy \end{bmatrix}\).

We want the Jacobian of the residual w.r.t. \(vars\),

\(\frac{\partial r(dx, dy, \theta)}{\partial vars}\)

Computing it directly is okay, but applying the chain rule is easier.

Consider a row of the residual function,

\(r_i(dx, dy, \theta) = r_i(M(x_i,y_i, dx, dy, \theta))\)

Its Jacobian w.r.t. \(vars\) is,

\(\frac{\partial{r_i(dx, dy, \theta)}}{\partial{vars}} = \frac{\partial{r_i(M(x_i,y_i, dx, dy, \theta))}}{\partial{vars}}\)

By the chain rule,

\(\frac{\partial{r_i(dx, dy, \theta)}}{\partial{vars}} = \frac{\partial{r_i}}{\partial{M}} \frac{\partial{M}}{\partial{vars}}\)

Now, we only need to compute,

  1. The Jacobian of the residual w.r.t. the motion: \(\frac{\partial{r_i}}{\partial{M}}\)
  2. The Jacobian of the motion w.r.t. the variables: \(\frac{\partial{M}}{\partial{vars}}\)
The multivariate chain rule can be confusing, especially given the different conventions for Jacobians/gradients/derivatives.
Read this post for an explanation of the chain rule: <TBD>

1. The Jacobian of the residual w.r.t. the motion

Consider the Jacobian of the residual w.r.t. the motion, \( \frac{\partial{r}}{\partial{M}}\).

Since,

\(M(x, y, dx, dy, \theta) = \begin{bmatrix} m_x\\m_y \end{bmatrix}\) and \(r(M(…)) = r(m_x, m_y)\)

So, \( \frac{\partial{r}}{\partial{M}}\) is,

\( \frac{\partial{r}}{\partial{M}} = \frac{\partial{r(m_x, m_y)}}{\partial{(m_x, m_y)}}\)

It is exactly the (shifted) Jacobian of the translation-only Lucas-Kanade, i.e. the image gradient of patch B evaluated at the warped location. We computed it in this post.

2. The Jacobian of the motion w.r.t. the variables

This one is easy. The tracking motion function is,

\(M(x, y, dx, dy, \theta) = \begin{bmatrix} m_x\\m_y \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta) \end{bmatrix}\begin{bmatrix} x\\y \end{bmatrix} + \begin{bmatrix} dx\\dy \end{bmatrix}\)

By simple math, the Jacobian of the motion function w.r.t. the variables is,

\(\frac{\partial{M}}{\partial{(\theta, dx, dy)}} = \begin{bmatrix} -x \sin{\theta} - y \cos{\theta} & 1 & 0\\ x \cos{\theta} - y \sin{\theta} & 0 & 1 \end{bmatrix}\)
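To spell out the “simple math”: differentiating \(R(\theta) \begin{bmatrix} x\\y \end{bmatrix}\) w.r.t. \(\theta\) gives the first column,

\(\frac{\partial}{\partial \theta}\left(R(\theta) \begin{bmatrix} x\\y \end{bmatrix}\right) = \begin{bmatrix} -\sin\theta & -\cos\theta\\ \cos\theta & -\sin\theta \end{bmatrix} \begin{bmatrix} x\\y \end{bmatrix} = \begin{bmatrix} -x \sin\theta - y \cos\theta\\ x \cos\theta - y \sin\theta \end{bmatrix}\)

and the translation part contributes the identity columns \(\frac{\partial{M}}{\partial{dx}} = \begin{bmatrix} 1\\0 \end{bmatrix}\), \(\frac{\partial{M}}{\partial{dy}} = \begin{bmatrix} 0\\1 \end{bmatrix}\).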


Now we can compute the Jacobian of the residual w.r.t. the optimization variables by,

\(\frac{\partial{r_i(dx, dy, \theta)}}{\partial{(\theta, dx, dy)}} = \frac{\partial{r_i}}{\partial{M}} \frac{\partial{M}}{\partial{(\theta, dx, dy)}}\)
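As a dimension check, for each pixel the chain rule multiplies a \(1 \times 2\) image-gradient Jacobian by a \(2 \times 3\) motion Jacobian,

\(\underbrace{\frac{\partial{r_i}}{\partial{(\theta, dx, dy)}}}_{1 \times 3} = \underbrace{\frac{\partial{r_i}}{\partial{M}}}_{1 \times 2} \underbrace{\frac{\partial{M}}{\partial{(\theta, dx, dy)}}}_{2 \times 3}\)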

The linearized residual function

With the jacobian, we can linearize the residual function as,

\(r(\Delta dx, \Delta dy, \Delta \theta) \approx \begin{bmatrix}J_0(\theta, dx, dy) \\ … \\ J_i(\theta, dx, dy) \\ … \\ J_n(\theta, dx, dy) \end{bmatrix} \begin{bmatrix}\Delta\theta \\ \Delta dx \\ \Delta dy \end{bmatrix} + \begin{bmatrix} r_0( dx, dy, \theta ) \\ …\\ r_i( dx, dy ,\theta) \\ … \\ r_n( dx, dy, \theta) \end{bmatrix}\)

Where, \(J_i(\theta, dx, dy) = \frac{\partial{r_i(dx, dy, \theta)}}{\partial{(\theta, dx, dy)}}|_{(\theta, dx, dy)} = \frac{\partial{r_i}}{\partial{M}}|_{(\theta, dx, dy)} \frac{\partial{M}}{\partial{(\theta, dx, dy)}}|_{(\theta, dx, dy)}\). It is given above.

Now the residual function is in the form of \(J \Delta + b\).

We can solve for \(\Delta\) by solving the normal equation \(J^T J \Delta = -J^T b\), and update the variables by \(vars += \Delta\).
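As a sketch (assuming the per-pixel Jacobian rows and residuals have been stacked into jacobian and residual matrices; this is not the repository’s exact solver loop), one Gauss-Newton step in OpenCV could look like this:

// One Gauss-Newton step: solve J^T J * delta = -J^T b, then update the variables.
// Sketch only; the variable and function names are illustrative.
void gauss_newton_step(float& theta, float& dx, float& dy,
                       const cv::Mat& jacobian, const cv::Mat& residual)
{
    // Normal equation: (J^T J) delta = -J^T b
    const cv::Mat JtJ = jacobian.t() * jacobian;
    const cv::Mat Jtb = jacobian.t() * residual;

    cv::Mat delta;
    cv::solve(JtJ, -Jtb, delta, cv::DECOMP_CHOLESKY);

    // The update order matches the Jacobian columns: (theta, dx, dy).
    theta += delta.at<float>(0, 0);
    dx    += delta.at<float>(1, 0);
    dy    += delta.at<float>(2, 0);
}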

Implementation

Code: https://github.com/yimuw/yimu-blog/tree/master/least_squares/lucas_kanade

The implementation is pretty much the same as Lucas-Kanade with linear translation.

The differences are:

  1. The basketball video is not representative since nothing rotates in it. I wrote a rotating-square simulator for Lucas-Kanade with rotation.
  2. We need to use the chain rule to compute the jacobian.

The rotating square simulator

cv::Mat read_next()
{
    // Apply the motion (the dynamic) to the square.
    dynamic();

    cv::Mat image = cv::Mat::zeros(image_length_, image_length_, CV_32F);

    // Draw the filled square defined by the current ROI corners.
    std::vector<cv::Point> roi = compute_roi();
    std::vector<std::vector<cv::Point>> co_ordinates;
    co_ordinates.push_back(std::vector<cv::Point>());
    co_ordinates[0].push_back(roi[0]);
    co_ordinates[0].push_back(roi[1]);
    co_ordinates[0].push_back(roi[2]);
    co_ordinates[0].push_back(roi[3]);
    cv::drawContours(image, co_ordinates, 0, 0.5, CV_FILLED);

    // Blur so that more pixels (residuals) have non-zero gradients.
    cv::GaussianBlur(image, image, cv::Size(11, 11), 0, 0, cv::BORDER_DEFAULT);

    ...
}

I simply applied a dynamic (a motion) to a square and drew it on an OpenCV Mat.

Note that I applied a GaussianBlur. It makes more pixels (residuals) have non-zero gradients.

The chain rule

cv::Mat compute_jacobian(
    const Velocity& velocity,
    const cv::Point2f& location,
    const cv::Mat& last_im_grad_x_,
    const cv::Mat& last_im_grad_y_)
{
    const int patch_size = (half_window_size_ * 2 + 1) * (half_window_size_ * 2 + 1);
    cv::Mat jacobian(patch_size, 3, CV_32F);

    int count = 0;
    for (float y = ...) {
        for (float x = ...) {
            // Change of origin: rotate about the patch center (location).
            cv::Point2f delta(x - location.x, y - location.y);
            const auto last_p = velocity.apply(delta) + location;

            // Chain rule: d(pixel)/d(x, y) * d(x, y)/d(se2).
            const cv::Mat jacobian_xy_wrt_se2
              = compute_jacobian_of_xy_wrt_se2(delta, velocity);
            const cv::Mat jacobian_pixel_wrt_xy = (cv::Mat_<float>(1, 2)
                    << bilinear_interp(last_im_grad_x_, last_p),
                       bilinear_interp(last_im_grad_y_, last_p));
            cv::Mat jacobian_im_wrt_se2 = jacobian_pixel_wrt_xy * jacobian_xy_wrt_se2;

            jacobian.at<float>(count, 0) = -jacobian_im_wrt_se2.at<float>(0, 0);
            jacobian.at<float>(count, 1) = -jacobian_im_wrt_se2.at<float>(0, 1);
            jacobian.at<float>(count, 2) = -jacobian_im_wrt_se2.at<float>(0, 2);
            ++count;
        }
    }

    return jacobian;
}

cv::Mat compute_jacobian_of_xy_wrt_se2(
    const cv::Point2f& xy,
    const Velocity& velocity)
{
    const float sint = std::sin(velocity.theta);
    const float cost = std::cos(velocity.theta);
    const float x = xy.x;
    const float y = xy.y;
    // Technically, it is (so2 + t2)
    // (x2, y2) = T(p) = R * (x1, y1) + t
    cv::Mat jacobian_xy_wrt_se2 = (cv::Mat_<float>(2, 3)
            << -sint * x - cost * y, 1, 0,
                cost * x - sint * y, 0, 1.);
    return jacobian_xy_wrt_se2;
}

I implemented the chain rule.

Note that I did a change of origin for the rotation center. Intuitively, if a point is far from the origin, a small rotation around the origin moves the point a lot. I used the center of the image patch as the rotation center so that the optimization is more numerically stable. Please try using (0, 0) as the origin and see what happens.
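Concretely, writing \(c\) for the patch center (the code’s location) and assuming velocity.apply computes \(R \, \Delta p + \begin{bmatrix} dx\\dy \end{bmatrix}\), the warp actually applied to a pixel \(p\) is

\(M_{\text{centered}}(p) = R \, (p - c) + \begin{bmatrix} dx\\dy \end{bmatrix} + c\)

so the rotation only acts on the small offsets \(p - c\) rather than on large absolute pixel coordinates.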

More discussion about the choice of rotation center, and local vs. global parameterization: <TBD>

Let me show (off) the video again,

Discussion

Another way to formulate the same problem is to use the SE(2) group in the motion function.

Let \(T \in SE(2)\),

\(T = \begin{bmatrix} R & t\\ 0 & 1 \end{bmatrix} \)

Where \(R \in SO(2)\)

The tracking motion function will be,

\(\begin{bmatrix} m_x \\ m_y \\ 1 \end{bmatrix} = M_{SE2}(x,y,T) = T \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\)

For \(SE(2)\), we have an exponential map too. So we can formulate the tracking motion function as,

\(M_{se2}(w) = Exp(w) \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\)

Where \(w = Log(T)\), the Log map of SE(2).
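For reference (stating the standard closed form rather than deriving it here), with \(w = (v_x, v_y, \theta)\), the SE(2) exponential is

\(Exp(w) = \begin{bmatrix} R(\theta) & V(\theta) \begin{bmatrix} v_x\\v_y \end{bmatrix} \\ 0 & 1 \end{bmatrix}, \quad V(\theta) = \frac{1}{\theta}\begin{bmatrix} \sin\theta & -(1-\cos\theta)\\ 1-\cos\theta & \sin\theta \end{bmatrix}\)

with \(V(\theta) \to I\) as \(\theta \to 0\).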

And we can compute the Jacobian w.r.t. \(w\) and do the least-squares optimization.

It is fancier. But is it better?

Paper Unlock

Robust and Precise Vehicle Localization based on Multi-sensor Fusion in Diverse City Scenes
Guowei Wan, Xiaolong Yang, Renlan Cai, Hao Li, Hao Wang, Shiyu Song

In this paper, the authors designed a localization system for autonomous vehicles.

In Section 4, to estimate the transformation from sensor scans to a map, the authors used the Lucas-Kanade algorithm to estimate the heading and a histogram filter to estimate the translation.

I am not sure why an SE(2) Lucas-Kanade doesn’t work in this case, but the claim from the paper was that LK + the histogram filter gave the best result.

Related Posts

Original Lucas-Kanade Tracker: https://wang-yimu.com/lucas-kanade-tracker/

Lucas-Kanade in 1D, Time calibration: https://wang-yimu.com/time-calibration/

Introduction to Optimization on Manifolds: https://wang-yimu.com/introduction-to-optimization-on-manifolds/
