How to Land a rocket? Part 2, Constrained Optimization

What’s in this article

In the previous section, I discussed

  1. how to formulate the rocket landing problem as a least squares optimization problem,
  2. how to apply regularization to a least squares problem,
  3. how to make the solver faster by exploiting the separable objective function property,
  4. and how to make the solver faster by using a sparse linear equation solver.

The next question is: how to handle constraints in an optimization problem.

In this section, I will discuss,

  1. The optimality conditions for constrained problems: the KKT conditions.
  2. Backtracking line search.
  3. Primal-Dual Interior Point Method.
  4. Implementation in C++.

Rocket Landing Constrained Problem

In the previous post, we formulated the rocket landing problem as a regularized unconstrained least squares problem.

\(\text{minimize} \; cost\{s_0, s_1, \dots, s_n\} = ||s_0 - s_{initial}||_{W_0}^2 + ||s_n - s_{end}||_{W_n}^2 + \sum_i ||s_{i+1} - \text{motion}(s_i)||_{W_m}^2 \\ \; \; \; \; + k_1 \sum_i (s_i.\text{acceleration})^2 + k_2 \sum_i (s_i.\text{dt})^2\)

The unconstrained result. Note that y goes below 0 and the thrust is negative.

In this section, we will apply constraints to the problem,

\(\text{minimize} \; cost\{s_0, s_1, \dots, s_n\} = ||s_0 - s_{initial}||_{W_0}^2 + ||s_n - s_{end}||_{W_n}^2 + \sum_i ||s_{i+1} - \text{motion}(s_i)||_{W_m}^2 \\ \; \; \; \; + k_1 \sum_i (s_i.\text{acceleration})^2 + k_2 \sum_i (s_i.\text{dt})^2\)

\(\text{subject to:} \\ \; s_i.y > 0, \; \forall \; i \\ \; s_i.\text{thrust} > 0, \; \forall \; i \\ \; s_i.\text{dt} > 0, \; \forall \; i\)

The constraints are: (1) the rocket should always be above the ground, (2) the thrust is non-negative, and (3) dt is non-negative.

The constrained optimization result.

1. What are the KKT conditions?

The KKT conditions are the optimality conditions for constrained optimization problems.

To understand them, let's go back to unconstrained optimization problems first.

The Unconstrained Case

Consider an unconstrained optimization problem,

\(\text{minimize}_x f(x)\)

The local optimality condition (the condition on \(x\) such that \(f(x)\) is a local minimum) for the unconstrained optimization problem is that the gradient equals zero,

\(\nabla f(x^*) = 0 \)

To solve the optimization problem, we just need to solve the optimality condition. In other words, we need to find an \(x^*\) which satisfies the optimality condition.

For example, consider a linear least-squares problem

\(f(x) = ||Ax + b||^2\)

The optimality condition is

\(\nabla ||Ax^* + b||^2 = 0\)

After some algebra, we have the Normal Equation.

\(A^T A x^* = -A^T b\).

As a result, for linear least squares problems, we solve the Normal equation for the optimal variable \(x^*\).
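As a tiny standalone illustration (my own toy example with Eigen, not part of the rocket code), solving the Normal Equation looks like this:

#include <Eigen/Dense>
#include <iostream>

int main()
{
    // A toy least squares problem: minimize ||Ax + b||^2
    Eigen::MatrixXd A(3, 2);
    A << 1., 0.,
         0., 1.,
         1., 1.;
    Eigen::VectorXd b(3);
    b << -1., -2., -3.;

    // Normal Equation: A^T A x = -A^T b
    const Eigen::MatrixXd lhs = A.transpose() * A;
    const Eigen::VectorXd rhs = -A.transpose() * b;

    // A^T A is symmetric positive (semi-)definite, so a Cholesky-type solver applies.
    const Eigen::VectorXd x_star = lhs.ldlt().solve(rhs);
    std::cout << "x* = " << x_star.transpose() << std::endl;
    return 0;
}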

The Constrained Case

Now, consider an inequality constrained optimization problem.

\(\text{minimize}_x f(x)\)

\(\text{subject to}: h_i(x) \leq 0, \; \forall i \)

The local optimality condition for the inequality constrained problem is,

\(\nabla f(x^*) + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \leq 0 \; \; ,\forall i\\ \lambda_i^* \geq 0 \; \; ,\forall i \\ h_i(x^*) \lambda_i^* = 0 \; \; ,\forall i\)

They are the KKT conditions.

\(\lambda_i\) are called the dual variables.

Since the KKT conditions are the optimality conditions for constrained optimization problems, to solve the constrained problem we simply need to find \(x^*, \lambda^*\) which satisfy the KKT conditions.
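To make the conditions concrete, here is a small worked example of my own: minimize \(f(x) = x^2\) subject to \(1 - x \leq 0\). The KKT conditions are,

\(2x^* - \lambda^* = 0 \\ 1 - x^* \leq 0 \\ \lambda^* \geq 0 \\ (1 - x^*) \lambda^* = 0\)

If \(\lambda^* = 0\), stationarity forces \(x^* = 0\), which is infeasible. So the constraint must be active: \(x^* = 1\) and \(\lambda^* = 2\), which matches the obvious minimizer of \(x^2\) over \(x \geq 1\).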

For the derivation of the KKT conditions, please read chapter 5 of Convex Optimization. Don’t be afraid, the derivation is surprisingly simple.
There is a detailed derivation video for it:
CMU Convex Optimization KKT conditions
PS: I was in the video
I will ignore equality constraints because they are relatively easy to handle. Interested readers can refer to chapter 10 of Convex Optimization.

Constrained Least Squares

Consider a constrained least squares problem,

\(\text{minimize}_x ||Ax + b||^2\)

\(\text{subject to}: h_i(x) \leq 0 \; \forall i \)

Doing the math, the KKT conditions for the constrained least squares problem are,

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \leq 0 \; \; , \forall i\\ \lambda_i^* \geq 0 \; \; , \forall i \\ h_i(x^*) \lambda_i^* = 0 \; \; , \forall i\)

Without loss of generality, I will discuss the KKT conditions for constrained least squares in the following sections.

2. How to solve the KKT conditions?

Please look at the KKT conditions for a while and try to figure out a way to solve them.

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \leq 0 \; \; , \forall i\\ \lambda_i^* \geq 0 \; \; , \forall i \\ h_i(x^*) \lambda_i^* = 0 \; \; , \forall i\)

Can you find \(x^*, \lambda^*\) which satisfy the KKT conditions?

Dude! It’s hard!

In fact, the KKT conditions are themselves an inequality constrained problem, just like the original problem!

There is no simple, elegant way to solve the KKT conditions directly.

In practice, people solve the KKT conditions in two steps.

To understand this, let me first break the KKT conditions into two parts.

1. The KKT system of equations.

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \lambda_i^* = 0 \; , \forall i\)

The KKT system of equations is basically a function in the form of \(r_{kkt}(x, \lambda) = 0\).

2. The KKT inequality constraints.

\(\\ h_i(x^*) \leq 0 \; \; , \forall i \\ \lambda_i^* \geq 0 \; \; , \forall i \)

To solve the KKT conditions, we can

  1. Solve the KKT system of equations \(r_{kkt}(x, \lambda) = 0\) by linearizing it and solving for \((\Delta x, \Delta \lambda)\). It's basically Newton's root-finding algorithm for \(r_{kkt}(x, \lambda)\).
  2. In the update step, we enforce the KKT inequality constraints by controlling the update step size.

Let’s dig into it.

2.1 The KKT System of Equations

We want to find \(x, \lambda\) which satisfy the KKT system of equations

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \lambda_i^* = 0 \; , \forall i\)

To simplify notation, let

\(g(x) = \begin{bmatrix}h_1(x) \\ h_2(x) \\ …\\ h_n(x) \end{bmatrix}\) , \(\lambda = \begin{bmatrix}\lambda_1 \\ \lambda_2 \\ …\\ \lambda_n \end{bmatrix}\)

The KKT system of equations becomes,

\( r_{kkt}(x, \lambda) = \begin{bmatrix} A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) \\ -\text{diag}( \lambda ) g(x) \end{bmatrix} = 0 \)

To follow the conventions in the book Convex Optimization, I added a negative sign for \(-\text{diag}( \lambda ) g(x) = 0\).

We can linearize the system as,

\(r_{kkt}(x + \Delta x,\lambda + \Delta \lambda) \approx \nabla r_{kkt}(x, \lambda) \begin{bmatrix} \Delta x\\ \Delta \lambda \end{bmatrix} + r_{kkt}(x, \lambda) = 0\)

Plugging in the terms and doing the math, we get, in vector form,

\(\begin{bmatrix} A^T A + \sum_i \lambda_i \nabla^2 h_i & \nabla g(x)^T \\ -\text{diag}(\lambda) \nabla g(x) & -\text{diag}(g(x)) \end{bmatrix} \begin{bmatrix} \Delta x\\ \Delta \lambda \end{bmatrix} = -\begin{bmatrix} A^TAx + A^Tb + \sum_i \lambda_i \nabla h_i(x)\\ -\text{diag}( \lambda ) g(x) \end{bmatrix}\)

Note, it is a linear system. We can solve it by a linear system solver.
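For intuition, here is a minimal dense Eigen sketch of assembling and solving this system. The function and argument names are mine, and the actual rocket solver later uses its own (sparse) structures:

#include <Eigen/Dense>

// A minimal dense sketch: assemble and solve the linearized KKT system
// for the primal-dual step (delta_x, delta_lambda).
// Inputs are assumed to be evaluated at the current (x, lambda).
Eigen::VectorXd solve_kkt_step(
    const Eigen::MatrixXd &cost_hessian,        // A^T A + sum_i lambda_i * hessian(h_i)
    const Eigen::MatrixXd &constraint_jacobian, // row i is gradient(h_i)^T, i.e. nabla g(x)
    const Eigen::VectorXd &lambda,              // dual variables
    const Eigen::VectorXd &g,                   // stacked h_i(x)
    const Eigen::VectorXd &stationarity)        // A^T A x + A^T b + sum_i lambda_i * gradient(h_i)
{
    const int n = cost_hessian.rows();
    const int m = g.size();

    Eigen::MatrixXd lhs = Eigen::MatrixXd::Zero(n + m, n + m);
    lhs.topLeftCorner(n, n) = cost_hessian;
    lhs.topRightCorner(n, m) = constraint_jacobian.transpose();
    lhs.bottomLeftCorner(m, n) = (-lambda).asDiagonal() * constraint_jacobian;
    lhs.bottomRightCorner(m, m).diagonal() = -g;

    Eigen::VectorXd rhs(n + m);
    rhs.head(n) = -stationarity;
    rhs.tail(m) = lambda.cwiseProduct(g); // equals -(-diag(lambda) g(x))

    // The KKT matrix is not symmetric positive definite, so use LU, not Cholesky.
    return lhs.partialPivLu().solve(rhs);
}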

This is basically Newton's root-finding algorithm. We use it to find \(\Delta x, \Delta \lambda\) such that \(r_{kkt}(x + \Delta x, \lambda + \Delta \lambda) \to 0\).

Solving this linearized KKT system, we get the \((\Delta x, \Delta \lambda)\) which makes \((x + \Delta x, \lambda + \Delta \lambda)\) approximately satisfy the KKT system of equations \( r_{kkt}(x + \Delta x, \lambda + \Delta \lambda) = 0\).

\((\Delta x, \Delta \lambda)\) is called the primal-dual search direction.

Now, we have the update step \((\Delta x, \Delta \lambda)\) which satisfies the KKT system of equations.

How can we satisfy the inequality constraints?

We just brute force it by the backtracking line search!

2.2 Backtracking Line Search

Backtracking line search basically means: I have an update direction, and I want to find a scale \(k\) in \((0, 1]\) such that \(x_{new} = x + k \Delta x\) satisfies some conditions.

A simple way would be to try all possible k. But assuming the condition is monotonic in k, we can do a binary search for k.

For example, say we want to find \(x_{new}\) such that \(cost(x_{new}) < cost(x)\). We can try \(x_{new} = x + \Delta x\), \(x_{new} = x + 0.5 \Delta x\), \(x_{new} = x + 0.5^2 \Delta x\), … until the \(cost(x_{new}) < cost(x)\) condition is met.

(Well, it's kind of an inaccurate binary search.)
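A minimal sketch of that cost-decrease version, just to fix ideas (the helper below is my own, not the rocket solver's line search):

#include <Eigen/Dense>
#include <functional>

// A toy backtracking line search on the cost-decrease condition only
// (the rocket solver below also checks primal and dual feasibility).
double backtracking_line_search(const std::function<double(const Eigen::VectorXd &)> &cost,
                                const Eigen::VectorXd &x,
                                const Eigen::VectorXd &dx,
                                const double shrink_factor = 0.5,
                                const double min_step = 1e-3)
{
    const double current_cost = cost(x);
    for (double k = 1.; k > min_step; k *= shrink_factor)
    {
        if (cost(x + k * dx) < current_cost)
        {
            return k; // found a step size that decreases the cost
        }
    }
    return 0.; // give up: no decreasing step found
}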

For us, the conditions are the KKT inequality constraints,

\(\\ h_i(x^*) \leq 0 \; \; , \forall i \\ \lambda_i^* \geq 0 \; \; , \forall i \)

We want to search for a step size k such that the updated variables \((x_{new}, \lambda_{new}) = (x, \lambda) + k (\Delta x, \Delta \lambda)\) satisfy the KKT inequality constraints.

For example, we can start with k = 1. If \((x_{new}, \lambda_{new})\) doesn't satisfy the KKT inequality constraints, we decrease k by a factor and redo this process. If \((x_{new}, \lambda_{new})\) satisfies the constraints, we stop and update the variables using the current step size k, \((x_{new}, \lambda_{new}) = (x, \lambda) + k (\Delta x, \Delta \lambda)\).

Assuming the current variables are feasible, by construction the updated variables \((x_{new}, \lambda_{new})\) satisfy the KKT inequality constraints.

Iteration

Because we

  1. approximated the KKT system of equations linearly.
  2. used the back-tracking line search to enforce the KKT inequality constraints.

The updated variables \((x_{new}, \lambda_{new}) = (x, \lambda) + k (\Delta x, \Delta \lambda)\) may not satisfy the original KKT conditions. So we need to repeat this process until convergence.

Wait, Something is wrong…

Now, we have an algorithm to solve the KKT conditions for the constrained least squares problem,

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \leq 0 \; \; , \forall i\\ \lambda_i^* \geq 0 \; \; , \forall i \\ h_i(x^*) \lambda_i^* = 0 \; \; , \forall i\)

The algorithm solves the KKT conditions in two steps,

  1. Solve the KKT system of equations to get an update direction \((\Delta x, \Delta \lambda)\).
  2. Do a backtracking line search to ensure the inequality constraints.

Sadly, it doesn’t work!

The \((\Delta x, \Delta \lambda)\) from step 1 is either huge or tiny.

Why?

2.3 Relaxed Complementary Slackness

Go back to the KKT system of equations,

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \lambda_i^* = 0, \; \forall i\)

The second equation

\( h_i(x^*) \lambda_i^* = 0, \; \forall i\)

is called the Complementary Slackness. It says that at least one of \(h_i(x^*)\) and \(\lambda_i^*\) is zero. In other words, we are trying to solve,

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ \text{subject to:} \; \text{at least one of} \; (h_i(x^*), \lambda_i^*) \; \text{is zero}, \; \forall i\)

The complementary slackness is simply way too nonlinear. Newton's root-finding algorithm can't handle it.

For example, assume the constraint is \(x > 0\), i.e. \(h(x) = -x \leq 0\). The complementary slackness \(h(x) \lambda = -x \lambda = 0\) forces the solution onto the axes of the \((x, \lambda)\) plane. This L-shaped set has a kink at the origin, and along it the slope \(d\lambda/dx\) is either 0 or infinity, which Newton's method handles poorly.

To solve this issue, people relaxed the Complementary Slackness.

Instead of solving the super nonlinear logical function,

\(h_i(x^*) \lambda_i^* = 0 \; , \forall i\),

we can solve an approximation of the Complementary Slackness,

\(h_i(x^*) \lambda_i^* = -1/t \; , \forall i, \; t > 0\) .

The above equation is the Relaxed Complementary Slackness.

And we have the Relaxed KKT system of equations,

\(A^TAx^* + A^Tb + \sum_i \lambda_i^* \nabla h_i(x^*) = 0 \\ h_i(x^*) \lambda_i^* = -1/t \; , \forall i\)

Since the above system is nonlinear, we linearize it and solve the resulting linear system of equations for \((\Delta x, \Delta \lambda)\),

\(\begin{bmatrix} A^T A + \sum_i \lambda_i \nabla^2 h_i & \nabla g(x)^T \\ -\text{diag}(\lambda) \nabla g(x) & -\text{diag}(g(x)) \end{bmatrix} \begin{bmatrix} \Delta x\\ \Delta \lambda \end{bmatrix} = -\begin{bmatrix} A^TAx + A^Tb + \sum_i \lambda_i \nabla h_i(x)\\ -\text{diag}( \lambda ) g(x) - (1/t) \mathbf{1} \end{bmatrix}\)

In practice, t is small at the beginning of the KKT iterations. Then, we gradually increase t. As a result, the relaxed complementary slackness gets closer to the true complementary slackness as we progress.
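One common way to choose t, following chapter 11 of Convex Optimization, is to base it on the surrogate duality gap \(\hat{\eta} = -g(x)^T \lambda\). A sketch (the function name and the choice of \(\mu\) are mine):

#include <Eigen/Dense>

// Pick the relaxation parameter t from the surrogate duality gap,
// t = mu * m / eta_hat, with mu around 10 (Convex Optimization, chapter 11).
// As the gap shrinks, t grows and h_i(x) * lambda_i = -1/t approaches 0.
double compute_relaxation_t(const Eigen::VectorXd &g,      // stacked h_i(x), all <= 0
                            const Eigen::VectorXd &lambda, // dual variables, all >= 0
                            const double mu = 10.)
{
    const double surrogate_gap = -g.dot(lambda);
    const int m = g.size();
    return mu * static_cast<double>(m) / surrogate_gap;
}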

relaxed complementary slackness for \(\lambda x = 1/t\)
Does the sign of \(-1/t\) matter? Try it by yourself.

2.4 The Primal-Dual Interior Point Method

Finally, let me introduce the Primal-Dual Interior Point Method

  1. Compute the parameter t for the relaxed complementary slackness.
  2. Construct and solve the relaxed KKT linear system to get \((\Delta x, \Delta \lambda)\).
  3. Do a backtracking line search to ensure the inequality constraints, and update.
  4. Go back to 1 if the stopping condition is not met.

Why is it called the primal-dual interior point method?

Primal-Dual:

In step 2, we solve the KKT linear system with primal and dual variables together.

Interior points:

In step 3, we ensure the variables (a point in a high-dimensional space) stay inside the feasible set defined by the inequality constraints.

3. Rocket Landing with Constraints

Go back to the constrained rocket landing problem.

\(\text{minimize} \; cost\{s_0, s_1, \dots, s_n\} = ||s_0 - s_{initial}||_{W_0}^2 + ||s_n - s_{end}||_{W_n}^2 + \sum_i ||s_{i+1} - \text{motion}(s_i)||_{W_m}^2 \\ \; \; \; \; + k_1 \sum_i (s_i.\text{acceleration})^2 + k_2 \sum_i (s_i.\text{dt})^2\)

\(\text{subject to:} \\ \; s_i.y > 0, \; \forall \; i \\ \; s_i.\text{thrust} > 0, \; \forall \; i \\ \; s_i.\text{dt} > 0, \; \forall \; i\)

The constraints are: (1) the rocket should always be above the ground, (2) the thrust is non-negative, and (3) dt is non-negative.

The cost function is a weighted nonlinear least squares cost. In Part 1, we linearized it. So we have,

\(\text{minimize}_{\Delta s} \; ||J \Delta s + r||_W^2\)

\(\text{subject to:}\\ -(s_i.y +\Delta s_i.y) < 0 \; , \forall \; i \\ -(s_i.\text{thrust} + \Delta s_i.\text{thrust}) < 0 \; , \forall \; i \\ -(s_i.\text{dt} + \Delta s_i.\text{dt}) < 0 \; , \forall \; i\)

It is in the form of,

\(\text{minimize}_x ||Ax + b||^2\)

\(\text{subject to:} \; h_i(x) \leq 0 \; , \forall i \)

This is exactly the constrained problem we learned to solve in the previous section.
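To make the \(h_i(x) \leq 0\) form concrete, each rocket constraint is a 1-D linear constraint on a single state entry. A hypothetical constraint object could look like the sketch below; it only mirrors the role of the constrain_linear.h() and constrain_linear.jacobian() calls used in the implementation, and it is not the repo's actual class.

// A hypothetical 1-D linear constraint h(delta) = -(value + delta) <= 0,
// i.e. keep (value + delta) positive after the update.
struct PositiveConstraint
{
    double value = 0.;     // current s_i.y, s_i.thrust or s_i.dt
    int state_index = 0;   // which state the constraint acts on
    int type_index = 0;    // which entry of the state (y, thrust, dt)

    // h evaluated at delta = 0
    double h() const { return -value; }

    // dh/d(delta) for the single variable it touches
    double jacobian() const { return -1.; }
};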

We can apply the Primal-Dual Interior Point Method to solve the rocket landing problem.

There are a few less important details of the Primal-Dual Interior Point Method that I skipped. Please read Chapter 11 of Convex Optimization or the CMU 10-725 slides (or this version).

Implementation

Solver main loop

code link

void solver_rocket_landing_constrained_least_squares(const Config &config,
                                                     RocketLandingProblem &problem)
{
    config_ = config;

    num_states_ = problem.trajectory.states.size();
    num_primal_variables_ = problem.trajectory.states.size() * RocketState::STATE_SIZE;
    num_dual_variables_ = problem.constrains.linear_constrains.size();
    const int augmented_var_size =  num_primal_variables_ + num_dual_variables_;

    // avoid memory reallocation
    NormalEqution augmented_normal_equ(augmented_var_size);
    NormalEqution normal_equ(num_primal_variables_);

    VectorXd dual_vars = initialize_dual_variables();
    assert(dual_vars.size() == num_dual_variables_);


    constexpr int MAX_ITERATIONS = 100;
    int iter = 0;
    for(; iter < MAX_ITERATIONS; ++iter)
    {
        ScopeProfiler p("solve_iter");

        problem.update_problem();

        // unconstrained problem
        residual_function_to_normal_equation(problem.residuals, normal_equ);
        
        // compute the gradient of the cost function
        VectorXd cost_grad = compute_cost_gradient(normal_equ, problem.trajectory, problem.residuals);
        apply_regularization_to_hessian(problem.residuals, normal_equ);

        // Primal dual problem
        // Solve (Stationarity, Complementary slackness) of KKT
        const double k_relax_complementary_slackness = compute_k_relax_complementary_slackness(iter, problem, dual_vars);
        construct_primal_dual_problem(normal_equ.lhs, cost_grad, 
            dual_vars, problem.constrains, k_relax_complementary_slackness, augmented_normal_equ);
        
        // solve linear system
        VectorXd delta_primal_daul = solve_normal_eqution(augmented_normal_equ);
        assert(delta_primal_daul.size() == num_primal_variables_ + num_dual_variables_);

        // Enforce interior point 
        // (primal feasibility + dual feasibility) of KKT
        bool status = back_tracking_line_search_and_update(delta_primal_daul, problem, dual_vars);
        if(status == false)
        {
            break;
        }

        if(stop_condition(augmented_normal_equ.rhs, problem.constrains, dual_vars) == true)
        {
            std::cout << "stop at iteration:" << iter << std::endl;
            break;
        }
    }

    if(iter == MAX_ITERATIONS)
    {
        std::cout << "warning: not converged" << std::endl; 
    }
}

  • Line 28~32: I computed the Hessian \(A^TA\) for the original least squares problem.
  • Line 36: compute the k for relaxed complementary slackness.
  • Line 37~41: construct the KKT linear system and solve for a search step.
  • Line 46: Do the backtracking line search to ensure the inequality constraints.
  • Line 52: Check the stopping condition.

Constructing the primal dual system LHS

code link

void construct_primal_dual_problem_lhs(const MatrixXd &cost_hessian,
                                       const VectorXd &dual_variables, 
                                       const Constrains &constrains,
                                       MatrixXd &augmented_normal_equ_lhs)
{
    const int num_variables = cost_hessian.rows();
    augmented_normal_equ_lhs.setZero();

    // Copy primal equation
    augmented_normal_equ_lhs.block(0, 0, num_variables, num_variables) = cost_hessian;

    // Not general, simplified for 1d linear case.
    for(size_t constrain_idx = 0; constrain_idx < constrains.linear_constrains.size(); ++constrain_idx)
    {
        const auto &constrain_linear = constrains.linear_constrains[constrain_idx];
        const int correspond_primal_index = constrain_linear.state_index * RocketState::STATE_SIZE 
            + constrain_linear.type_index;
        const int dual_var_index = num_variables + constrain_idx;

        // upper left
        // hessian for linear constrain is 0. Don't need to update it

        // upper right block
        augmented_normal_equ_lhs(correspond_primal_index, dual_var_index) = constrain_linear.jacobian();
        // lower left block
        augmented_normal_equ_lhs(dual_var_index, correspond_primal_index) 
            = - dual_variables[constrain_idx] * constrain_linear.jacobian();
        // lower right block
        augmented_normal_equ_lhs(dual_var_index, dual_var_index) = - constrain_linear.h();
    }
}

It computes the LHS of the linearized KKT system of equations,

\(\begin{bmatrix} A^T A + \sum_i \lambda_i \nabla^2 h_i & \nabla g(x)^T \\ -\text{diag}(\lambda) \nabla g(x) & -\text{diag}(g(x)) \end{bmatrix} \)
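The linked code also builds the matching RHS, i.e. the negative relaxed KKT residual. A rough sketch for the same simplified 1-D linear constraint case could look like this; the function name, signature, and the one_over_t parameter are illustrative, not the repo's exact API:

// Sketch of the RHS of the relaxed KKT system:
//   -[ A^T A x + A^T b + sum_i lambda_i * gradient(h_i) ;  -diag(lambda) g(x) - 1/t ]
// It reuses the types shown in the snippet above (VectorXd, Constrains, RocketState).
void construct_primal_dual_problem_rhs_sketch(const VectorXd &cost_gradient,  // A^T A x + A^T b
                                              const VectorXd &dual_variables,
                                              const Constrains &constrains,
                                              const double one_over_t,
                                              VectorXd &augmented_normal_equ_rhs)
{
    const int num_variables = cost_gradient.size();
    augmented_normal_equ_rhs.setZero();

    // primal block: -(cost gradient + sum_i lambda_i * gradient(h_i))
    augmented_normal_equ_rhs.head(num_variables) = -cost_gradient;

    for(size_t constrain_idx = 0; constrain_idx < constrains.linear_constrains.size(); ++constrain_idx)
    {
        const auto &constrain_linear = constrains.linear_constrains[constrain_idx];
        const int correspond_primal_index = constrain_linear.state_index * RocketState::STATE_SIZE
            + constrain_linear.type_index;
        const int dual_var_index = num_variables + constrain_idx;

        // contribution of -lambda_i * gradient(h_i) to the primal block
        augmented_normal_equ_rhs(correspond_primal_index)
            -= dual_variables[constrain_idx] * constrain_linear.jacobian();

        // dual block: lambda_i * h_i(x) + 1/t
        augmented_normal_equ_rhs(dual_var_index)
            = dual_variables[constrain_idx] * constrain_linear.h() + one_over_t;
    }
}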

Backtracking Line Search

code link

bool back_tracking_line_search_and_update(const VectorXd &delta_primal_daul,
                                          RocketLandingProblem &problem,
                                          VectorXd &dual_variables)
{
    const double current_cost = problem.residuals.total_cost();
    double updated_cost = std::numeric_limits<double>::max();

    auto check_cost_deacrease = [&](const PrimalVariables &trajectory)
    {
        ...
    };

    double final_line_search_step = 0.;
    bool search_success = false;
    // Cache
    bool dual_feasible_status = false;
    bool primal_feasible_status = false;

    const VectorXd primal_step = delta_primal_daul.segment(0, num_primal_variables_);
    const VectorXd dual_step = delta_primal_daul.segment(num_primal_variables_, num_dual_variables_);

    for(double step_size = 1.; step_size > 1e-2; step_size *= back_tracking_scale_)
    {
        if(dual_feasible_status == false)
        {
            if(dual_feasible(dual_variables, dual_step, step_size) == false)
            {
                continue;
            }
            else
            {
                dual_feasible_status = true;
            }
        }
        
        PrimalVariables updated_primal_vars = update_primal_variables(primal_step, step_size, problem.trajectory);

        // TODO: update check_cost_deacrease to use primal-dual cost
        final_line_search_step = step_size;

        if(primal_feasible_status == false)
        {
            if(primal_feasible(problem.constrains, updated_primal_vars) == false)
            {
                continue;
            }
            else
            {
                primal_feasible_status = true;
            }
        }
        
        if(check_cost_deacrease(updated_primal_vars) == false)
        {
            continue;
        }
        else
        {
            search_success = true;
            final_line_search_step = step_size;
            break;
        }
    }

    // update dual & primal vars
    dual_variables += final_line_search_step * dual_step;
    problem.trajectory = update_primal_variables(primal_step, final_line_search_step, problem.trajectory);

    // report whether the line search found a feasible, cost-decreasing step
    return search_success;
}

The implementation is a little bit more complex than what I discussed above. We check the following (a sketch of the dual feasibility check follows the list):

  1. Dual feasible. \(\lambda_i \geq 0 \; \forall \; i\)
  2. Primal feasible. \(s_i.y \geq 0, \; s_i.\text{thrust} \geq 0, \; s_i.\text{dt} \geq 0, \; \forall \; i\)
  3. Cost value strictly decreasing.
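For reference, the dual feasibility check boils down to keeping every updated dual variable non-negative. A rough standalone sketch (the real dual_feasible and primal_feasible live in the linked repo):

#include <Eigen/Dense>

// Dual feasibility for a candidate step size:
// every updated dual variable must stay non-negative.
bool dual_feasible_sketch(const Eigen::VectorXd &dual_variables,
                          const Eigen::VectorXd &dual_step,
                          const double step_size)
{
    const Eigen::VectorXd updated = dual_variables + step_size * dual_step;
    return (updated.array() >= 0.).all();
}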

That’s it! We can land a rocket now!

BTW, there are a few very important details which are not mentioned in the book. Please read the code for them. Or you can work with me and I will tell you.
Man! I am so sneaky!

4. Paper Unlock

There are other methods to handle inequality constraints. The barrier method is one of them.

An Interior-Point Method for Large-Scale l1-Regularized Least Squares
Seung-Jean Kim, K. Koh, M. Lustig, Stephen Boyd

Given an L1 regularized optimization problem,

By introducing slack variables \(u_i\), the above problem can be transformed into a constrained least squares problem.

Let \((x^*, u^*) = \text{argmin (above-problem)}(x, u)\). We can argue that \(|x_i^*| = u_i^*\): if \(|x_i^*| < u_i^*\), we can make the cost smaller by decreasing \(u_i^*\) to \(|x_i^*|\). Thus, the problem (13) is equivalent to (3).

To handle the inequality constraints, the author used the log barrier method.

Intuitively, the log barrier uses cost functions to approximate the inequality constraints.
As a result, the problem (3) becomes an unconstrained optimization problem \(\phi_t(x, u)\).
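In its generic textbook form (not the paper's exact equation), for a problem \(\text{minimize}_x f(x)\) subject to \(h_i(x) \leq 0\), the log barrier approximation is,

\(\phi_t(x) = t f(x) - \sum_i \log(-h_i(x))\)

The \(-\log(-h_i(x))\) term blows up as \(h_i(x) \to 0^-\), which keeps the iterates strictly inside the feasible set; a larger t gives a better approximation of the original constrained problem.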

The log barrier method is closely related to the Primal-Dual Interior-Point method, but it usually performs worse. Please read Chapter 11 of Convex Optimization or CMU 10-725.

The Hessian structure of the log barrier method is similar to that of the Primal-Dual Interior-Point method.

The authors solved the constrained least squares problem using a variation of Newton's method, because they wanted to solve a dense Normal Equation for CT reconstruction.

1. compute the search direction.
2~3. backtracking line search and update.
4~6. compute stopping conditions.
7. update the barrier parameter.

Another method to handle inequality constraints is the Projected Newton method (an active-set method).

Control-Limited Differential Dynamic Programming
Yuval Tassa, Nicolas Mansard and Emo Todorov

In this paper, the authors applied Differential Dynamic Programming to formulate a control-limited problem as a sequence of constrained least squares problems (Newton QP problems). Then, the authors used the Projected Newton method to solve each of them.

Treat saturated variables as constant in the Ax = b system.

The idea is,

  1. Find clamped variables which take the equal sign in the inequality constraints (15b)
  2. Fix clamped variables in the Normal Equation (16). Then, the Normal Equation “shrink” to the equation below (16) which only contains free variables.
  3. Update free variables with backtracking line search.

I feel like this method may not be able to find a good local minimum. An extreme example: if the initial guess is clamped, what should I do? Correct me if I am wrong.

Continuous-time Gaussian process motion planning via probabilistic inference
Mustafa Mukadam, Jing Dong, Xinyan Yan, Frank Dellaert and Byron Boots

The IJRR paper of the year 2018.

The authors used the Gaussian process (fancy interpolation) to formulate planning problems as least squares problems. They used ISAM2 to solve the Normal Equation for the least squares problems.

ISAM2 is able to solve incremental linear systems efficiently. ISAM2 is based on sparse Cholesky factorization, which only works for positive semi-definite matrices.

However, the linearized KKT system of equations is not positive semi-definite.

\(\begin{bmatrix} A^T A + \sum_i \lambda_i \nabla^2 h_i & \nabla g(x)^T \\ -\text{diag}(\lambda) \nabla g(x) & -\text{diag}(g(x)) \end{bmatrix} \begin{bmatrix} \Delta x\\ \Delta \lambda \end{bmatrix}\)

We can’t use Cholesky to solve it. In the previous section, I used LU decomposition to solve the KKT system of equations.
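For reference, a minimal sketch of the sparse LU solve with Eigen (the function name and inputs are mine, not the repo's):

#include <Eigen/Dense>
#include <Eigen/Sparse>

// Factor the (non positive definite) KKT matrix with Eigen's SparseLU
// instead of a Cholesky-based solver.
Eigen::VectorXd solve_kkt_with_sparse_lu(const Eigen::SparseMatrix<double> &kkt_lhs,
                                         const Eigen::VectorXd &kkt_rhs)
{
    Eigen::SparseLU<Eigen::SparseMatrix<double>> solver;
    solver.analyzePattern(kkt_lhs); // symbolic factorization (reusable across iterations)
    solver.factorize(kkt_lhs);      // numerical factorization
    return solver.solve(kkt_rhs);
}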

In the paper, in order to handle constraints using Cholesky decomposition, the author formulated inequality constraints as cost terms.


How does SpaceX land the rocket?

Please read Akshay Agrawal’s blog for details.

I guess SpaceX is using a general solver called CVX which is based on Primal-Dual Interior-Point methods.

A general solver is designed for generality, not for efficiency.

Probably, we can do better by exploiting the structure of the rocket landing problem.

5. What’s Next?

The rocket landing problem is formulated as a constrained least squares problem and can be solved by the Primal-Dual Interior-Point method.

We used Eigen's sparse LU decomposition to solve the linearized KKT equations. It's kinda slow. Each iteration takes about 3 ms.

We can exploit the structure of the linearized KKT equations so that each iteration takes about 0.5 ms.


Least Squares Robotics

This article is part of the Least Squares Robotics series.
