QP and least squares
This article relates the problem formulations of quadratic programming and least squares under linear constraints.
Published:
2024-02-18
Background
When formulating optimization problems in applied settings (e.g., robotics, vision), it may be unclear whether to frame them as a quadratic program (QP) or as a least squares (LS) problem. Typically, such a problem has the structure of an inverse problem; briefly: given a desired output $\mathbf{y}_\mathsf{d}$ and a forward model ${\mathrm{G}: \mathbf{x} \mapsto \mathbf{y}}$, we seek the input $\mathbf{x}^*$ whose image under $\mathrm{G}$ is closest to $\mathbf{y}_\mathsf{d}$. Sometimes, it is possible to obtain an inverse model $\mathrm{G}^{\mathsf{inv}}$ that realizes the map $\mathbf{y} \mapsto \mathbf{x}$.
When no inverse model is available, as is often the case, it is more convenient to solve the associated least squares problem that minimizes the residual norm $\Vert \mathrm{G}(\mathbf{x}) - \mathbf{y}_\mathsf{d} \Vert_2$. Example scenarios that admit this least squares structure include inverse kinematics and camera calibration (resectioning, i.e., estimating camera extrinsic/intrinsic parameters). Depending on the specific problem instance, the meanings of $(\mathrm{G}, \mathbf{x}, \mathbf{y})$ vary and may not be immediately obvious.
As will be shown in this article, under specific conditions the LS problem can be framed as a QP. Throughout, we assume a linear forward model $\mathrm{G}$ and linear constraints on the input $\mathbf{x}$.
Problem Structure
Below, we develop the structure and assumptions of the least squares and the quadratic programming problems.
Least squares problem
Let $(\mathsf{LS})$ be our least squares problem having $m$ output variables, $n$ input variables, $p$ linear constraints, and a linear forward model $\mathrm{G}$. This problem can be termed linearly-constrained linear least squares.
$$ (\mathsf{LS}): \begin{dcases} \begin{aligned} \min & \quad J_\mathsf{LS}(\mathbf{x}) = \Vert \mathrm{G} \mathbf{x} - \mathbf{y}\Vert_2^2 \\ \operatorname{s.t.} & \quad \mathrm{A} \mathbf{x} \preceq \mathbf{b} \end{aligned} \end{dcases} $$
Here, $\mathbf{y} \in \mathbb{R}^m$, $\mathrm{G} \in \mathbb{R}^{m × n}$, $\mathbf{x} \in \mathbb{R}^n$, $\mathbf{b} \in \mathbb{R}^p$, and $\mathrm{A} \in \mathbb{R}^{p × n}$.
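As a concrete illustration, here is a minimal sketch of solving $(\mathsf{LS})$ with CVXPY; the solver choice and the randomly generated data are assumptions for demonstration only.

```python
# Sketch: solving the linearly-constrained least squares problem (LS)
# with CVXPY. The data below are random placeholders.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 8, 4, 3
G = rng.standard_normal((m, n))   # linear forward model, R^{m x n}
y = rng.standard_normal(m)        # desired output, R^m
A = rng.standard_normal((p, n))   # constraint matrix, R^{p x n}
b = np.ones(p)                    # chosen so that x = 0 is strictly feasible

x = cp.Variable(n)
problem = cp.Problem(cp.Minimize(cp.sum_squares(G @ x - y)),
                     [A @ x <= b])
problem.solve()
print("x* =", x.value, " J_LS(x*) =", problem.value)
```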
QP problem
Let $(\mathsf{QP})$ be our quadratic programming problem having $n$ variables and $p$ linear constraints. This problem is specifically linearly-constrained quadratic programming.
$$ (\mathsf{QP}): \begin{dcases} \begin{aligned} \min & \quad J_\mathsf{QP}(\mathbf{x}) = \frac{1}{2} \mathbf{x}^\mathsf{T} \mathrm{Q} \mathbf{x} + \mathbf{c}^\mathsf{T} \mathbf{x} \\ \operatorname{s.t.} & \quad \mathrm{A} \mathbf{x} \preceq \mathbf{b} \end{aligned} \end{dcases} $$ Here, $\mathrm{Q} \in \mathbb{R}^{n × n}$ is a symmetric matrix ($\mathrm{Q}^\mathsf{T} = \mathrm{Q}$), $\mathbf{x} \in \mathbb{R}^n$, $\mathbf{b} \in \mathbb{R}^p$, and ${\mathrm{A} \in \mathbb{R}^{p × n}}$. In general, there are no restrictions on the definiteness of $\mathrm{Q}$.
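The analogous CVXPY sketch for $(\mathsf{QP})$ follows (again with placeholder data). Note that convex solvers, and CVXPY's `quad_form` in particular, require $\mathrm{Q} \succeq 0$, which is narrower than the general formulation above.

```python
# Sketch: solving the linearly-constrained quadratic program (QP)
# with CVXPY. Q is built to be positive semidefinite, since convex
# solvers cannot handle an indefinite Q.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 3
M = rng.standard_normal((n, n))
Q = M.T @ M                       # symmetric PSD by construction
c = rng.standard_normal(n)
A = rng.standard_normal((p, n))
b = np.ones(p)                    # x = 0 is strictly feasible

x = cp.Variable(n)
problem = cp.Problem(cp.Minimize(0.5 * cp.quad_form(x, Q) + c @ x),
                     [A @ x <= b])
problem.solve()
print("x* =", x.value)
```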
From $(\mathsf{LS})$ to $(\mathsf{QP})$
To restate our least squares problem as a QP problem, we first rewrite the cost function $J_\mathsf{LS}(\mathbf{x})$ as:
$$ \begin{aligned} J_\mathsf{LS}(\mathbf{x}) &= (\mathrm{G} \mathbf{x} - \mathbf{y})^\mathsf{T} (\mathrm{G} \mathbf{x} - \mathbf{y}) \\ &= (\mathbf{x}^\mathsf{T} \mathrm{G}^\mathsf{T} - \mathbf{y}^\mathsf{T}) (\mathrm{G} \mathbf{x} - \mathbf{y}) \\ &= \mathbf{x}^\mathsf{T} \mathrm{G}^\mathsf{T} \mathrm{G} \mathbf{x} - \mathbf{y}^\mathsf{T} \mathrm{G} \mathbf{x} - \mathbf{x}^\mathsf{T} \mathrm{G}^\mathsf{T} \mathbf{y} + \mathbf{y}^\mathsf{T} \mathbf{y} \\ &= \mathbf{x}^\mathsf{T} \mathrm{G}^\mathsf{T} \mathrm{G} \mathbf{x} - 2\mathbf{y}^\mathsf{T} \mathrm{G} \mathbf{x} + \mathbf{y}^\mathsf{T} \mathbf{y} \end{aligned} $$
Notice that the minimizer $\mathbf{x}^*$ of $\tilde{J}_\mathsf{LS}(\mathbf{x}) = \frac{1}{2}[J_\mathsf{LS}(\mathbf{x}) - \mathbf{y}^\mathsf{T} \mathbf{y}]$ coincides with that of $J_\mathsf{LS}(\mathbf{x})$: subtracting the constant $\mathbf{y}^\mathsf{T} \mathbf{y}$ and scaling by $\frac{1}{2}$ leave the argmin unchanged, although the minimum value itself differs. The modified cost function can thus be rewritten as a QP objective, as follows:
$$ \begin{aligned} \tilde{J}_\mathsf{LS}(\mathbf{x}) &= \frac{1}{2} \mathbf{x}^\mathsf{T} \mathrm{G}^\mathsf{T} \mathrm{G} \mathbf{x} - \mathbf{y}^\mathsf{T} \mathrm{G} \mathbf{x} \\ &= \frac{1}{2} \mathbf{x}^\mathsf{T} \mathrm{Q} \mathbf{x} + \mathbf{c}^\mathsf{T} \mathbf{x} \end{aligned} $$ By setting $\mathrm{Q} = \mathrm{G}^\mathsf{T} \mathrm{G}$ and $\mathbf{c} = -\mathrm{G}^\mathsf{T} \mathbf{y}$, the problem $(\mathsf{LS})$ is restated in the form of $(\mathsf{QP})$. Note that $\mathrm{Q} = \mathrm{G}^\mathsf{T} \mathrm{G}$ is always positive semidefinite, and $\mathrm{Q} \succ 0$ exactly when $\mathrm{G}$ has full column rank; in that case $J_\mathsf{QP}(\mathbf{x})$ is strictly convex and its minimizer $\mathbf{x}^*$ is unique.
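This equivalence can be verified numerically: solving $(\mathsf{LS})$ directly and solving $(\mathsf{QP})$ with $\mathrm{Q} = \mathrm{G}^\mathsf{T} \mathrm{G}$ and $\mathbf{c} = -\mathrm{G}^\mathsf{T} \mathbf{y}$ should return the same minimizer. A sketch with CVXPY and placeholder data:

```python
# Sketch: check that (LS) and its (QP) restatement share a minimizer.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 8, 4, 3
G = rng.standard_normal((m, n))
y = rng.standard_normal(m)
A = rng.standard_normal((p, n))
b = np.ones(p)

Q = G.T @ G        # QP data from the derivation above
c = -G.T @ y

x_ls = cp.Variable(n)
cp.Problem(cp.Minimize(cp.sum_squares(G @ x_ls - y)), [A @ x_ls <= b]).solve()

x_qp = cp.Variable(n)
cp.Problem(cp.Minimize(0.5 * cp.quad_form(x_qp, Q) + c @ x_qp),
           [A @ x_qp <= b]).solve()

print(np.allclose(x_ls.value, x_qp.value, atol=1e-4))  # expected: True
```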
From $(\mathsf{QP})$ to $(\mathsf{LS})$
Work in progress. The reverse direction requires $\mathrm{Q} \succeq 0$, so that a factorization $\mathrm{Q} = \mathrm{G}^\mathsf{T} \mathrm{G}$ exists. When $\mathrm{Q} \succ 0$, this can be achieved using the Cholesky decomposition $\mathrm{Q} = \mathrm{L} \mathrm{L}^\mathsf{T}$ with $\mathrm{L}$ lower triangular: setting $\mathrm{G} = \mathrm{L}^\mathsf{T}$ and $\mathbf{y} = -\mathrm{L}^{-1} \mathbf{c}$ recovers the data of $(\mathsf{LS})$, whose objective matches $J_\mathsf{QP}(\mathbf{x})$ up to the constant $\frac{1}{2} \mathbf{y}^\mathsf{T} \mathbf{y}$ and a factor of $2$.
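A NumPy sketch of this reverse construction, assuming $\mathrm{Q} \succ 0$ so that the Cholesky factor exists and is invertible:

```python
# Sketch: recover LS data (G, y) from QP data (Q, c) via Cholesky,
# assuming Q is positive definite.
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n))
Q = M.T @ M + n * np.eye(n)    # shift to ensure positive definiteness
c = rng.standard_normal(n)

L = np.linalg.cholesky(Q)      # Q = L @ L.T, L lower triangular
G = L.T                        # then G.T @ G = L @ L.T = Q
y = -np.linalg.solve(L, c)     # c = -G.T @ y  =>  y = -L^{-1} c

# The objectives agree up to the constant 0.5 * y.T @ y at any x:
x = rng.standard_normal(n)
qp_obj = 0.5 * x @ Q @ x + c @ x
ls_obj = 0.5 * np.sum((G @ x - y) ** 2) - 0.5 * y @ y
print(np.isclose(qp_obj, ls_obj))  # expected: True
```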