5.4.7. Convex polyhedron#

For a given matrix \(A \in \mathbb{R}^{K \times N}\) and vector \({\bf b} \in \mathbb{R}^K\), the convex polyhedron is the set defined by the linear constraints

\[ \mathcal{C}_{\rm poly} = \big\{ {\bf w} \in \mathbb{R}^{N} \;|\; A {\bf w} \le {\bf b} \big\}. \]

The matrix \(A\) is required to have full rank, and the set is assumed to be nonempty. Depending on how \(A\) and \({\bf b}\) are defined, a polyhedron may be bounded or unbounded. A bounded polyhedron is called a polytope, and a 2-dimensional polytope (\(N=2\)) is called a polygon. The figures below show what a polyhedron looks like in \(\mathbb{R}^{2}\) and in \(\mathbb{R}^{3}\).

Polyhedron in 2D

Polyhedron in 3D
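
For concreteness, checking whether a point belongs to \(\mathcal{C}_{\rm poly}\) amounts to verifying all \(K\) inequalities at once. Below is a minimal sketch in NumPy; the polyhedron (a polygon, since it is bounded and \(N=2\)) is an arbitrary choice for illustration.

import numpy as np

# Example polyhedron in R^2: three half-planes forming a triangle (arbitrary choice)
A = np.array([[-1.0,  0.0],
              [ 0.0, -1.0],
              [ 1.0,  1.0]])
b = np.array([0.0, 0.0, 2.0])

def in_poly(w, A, b, tol=1e-9):
    # 'w' belongs to the polyhedron iff every inequality in A w <= b holds
    return bool(np.all(A @ w <= b + tol))

print(in_poly(np.array([0.5, 0.5]), A, b))  # True: all three constraints hold
print(in_poly(np.array([3.0, 3.0]), A, b))  # False: violates w_1 + w_2 <= 2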

5.4.7.1. Constrained optimization#

Minimizing a function \(J({\bf w})\) over a convex polyhedron can be expressed as

\[ \operatorname*{minimize}_{\textbf{w}\in\mathbb{R}^N}\;\; J({\bf w}) \quad{\rm s.t.}\quad A {\bf w} \le {\bf b}. \]

The figure below visualizes such an optimization problem in \(\mathbb{R}^{2}\).

Constrained optimization in 2D
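
When the objective \(J({\bf w})\) is linear, the problem above reduces to a linear program. The sketch below solves one such instance with scipy.optimize.linprog; the polyhedron and the objective vector are arbitrary choices for illustration.

import numpy as np
from scipy.optimize import linprog

# Minimize the linear objective J(w) = c^T w over the polygon {w | A w <= b}
A = np.array([[-1.0,  0.0],
              [ 0.0, -1.0],
              [ 1.0,  1.0]])
b = np.array([0.0, 0.0, 2.0])
c = np.array([-1.0, -2.0])  # arbitrary objective vector

# bounds=(None, None) lifts scipy's default non-negativity bounds on w
res = linprog(c, A_ub=A, b_ub=b, bounds=(None, None))
print(res.x)  # a vertex of the polygon, here [0, 2]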

5.4.7.2. Orthogonal projection#

The projection of a point \({\bf u}\in\mathbb{R}^{N}\) onto the convex polyhedron is the (unique) solution to the following problem

\[ \operatorname*{minimize}_{{\bf w}\in\mathbb{R}^N}\; \|{\bf w}-{\bf u}\|^2 \quad{\rm s.t.}\quad A{\bf w} \le {\bf b}. \]

The solution can be computed as follows.

Projection onto the polyhedron

\[ \mathcal{P}_{\mathcal{C}_{\rm poly}}({\bf u}) = {\bf u} - A^\top {\bf c}^* \]

The vector \({\bf c}^* \in \mathbb{R}^K\) is a minimizer of the non-negative least squares problem

\[ \operatorname*{minimize}_{{\bf c}\in \mathbb{R}^K}\;\; \frac{1}{2} \|A^\top {\bf c} - {\bf u}\|^2 + {\bf c}^\top {\bf b} \quad{\rm s.t.}\quad {\bf c}\ge 0. \]

Although there exists no closed-form formula for an optimal \({\bf c}^*\), it can be efficiently computed by numerical methods. (The minimizer need not be unique when \(K > N\), but every minimizer yields the same product \(A^\top {\bf c}^*\), so the projection is well defined.)

5.4.7.3. Implementation#

In the following implementation, the dual problem is solved with an accelerated projected-gradient algorithm called FISTA, whose iterations are reported below. This solution is suitable only for polyhedra with small \(N\) and \(K\). A faster and more scalable implementation is based on the projected Newton method proposed by Dimitri Bertsekas in 1982.

\[\begin{split} \begin{aligned} &L = {\rm sup}_{{\bf c}}\|\nabla^2 f({\bf c})\|_2\\ &{\bf c}_0 = {\bf y}_0 = {\bf 0} \\ &\textrm{For $i=0, 1, \dots$}\\ &\left\lfloor \begin{aligned} {\bf y}_{i+1} &= \mathcal{P}_{\mathcal{C}}\big({\bf c}_i - \frac{1}{L} \nabla f({\bf c}_i)\big) \\ {\bf c}_{i+1} &= {\bf y}_{i+1} + \frac{i}{i+3} \big({\bf y}_{i+1} - {\bf y}_{i}\big) \end{aligned} \right. \end{aligned} \end{split}\]
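
In these iterations, \(f\) is the dual objective \(\frac{1}{2} \|A^\top {\bf c} - {\bf u}\|^2 + {\bf c}^\top {\bf b}\) and \(\mathcal{C} = \{{\bf c} \ge 0\}\) is the non-negative orthant, so the quantities involved specialize to

\[\begin{split} \begin{aligned} \nabla f({\bf c}) &= A (A^\top {\bf c} - {\bf u}) + {\bf b}, \\ L &= \|A A^\top\|_2 = \|A\|_2^2, \\ \mathcal{P}_{\mathcal{C}}({\bf c}) &= \max({\bf 0}, {\bf c}) \;\; \textrm{(componentwise)}. \end{aligned} \end{split}\]

These are exactly the gradient, step size, and projection used in the code below.
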
import numpy as np

def project_poly(u, A, b):
    assert A.ndim == 2, "'A' must be a matrix (2 axes)"
    assert b.ndim == 1, "'b' must be a vector (1 axis)"
    assert A.shape[0] == b.shape[0], "The size of 'b' must be equal to the number of rows of 'A'"
    assert np.linalg.matrix_rank(A) == np.min(A.shape), "'A' must have full rank"

    def nnls(epochs=1000):
        # FISTA on the dual problem: minimize 0.5*||A^T c - u||^2 + c^T b over c >= 0
        y = np.zeros_like(b, dtype=float)
        c = np.zeros_like(b, dtype=float)
        step = 1 / np.linalg.norm(A, ord=2) ** 2  # 1/L, where L = ||A||_2^2
        for i in range(epochs):
            y_old = y
            g = A @ (A.T @ c - u) + b          # gradient of the dual objective
            y = np.maximum(0, c - step * g)    # projected gradient step onto c >= 0
            c = y + i / (i + 3) * (y - y_old)  # momentum extrapolation
        return c

    c = nnls()
    p = u - A.T @ c  # recover the primal solution from the dual one

    return p

Here is an example of use.

u = np.array([1.5, -2])

A = np.array([[-1, -2],
              [-2, -1],
              [ 1, -1]])
b = np.array([0, 0, 3])

p = project_poly(u, A, b)
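
As a quick sanity check (not part of the original example), the returned point should satisfy all constraints, and projecting a point that is already feasible should leave it unchanged.

# The projection must be feasible (up to numerical tolerance)
print(np.all(A @ p <= b + 1e-6))  # True

# A feasible point is its own projection
v = np.array([1.0, 1.0])  # satisfies A v <= b for the data above
print(np.allclose(project_poly(v, A, b), v))  # True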

The next figure visualizes a point \({\bf u}\in \mathbb{R}^{2}\) (blue dot) along with its projection (red dot) onto this set.

Projection onto a polyhedron in 2D

5.4.7.4. Proof#

Using Lagrange multipliers to absorb the inequality constraints, the projection onto the convex polyhedron is the (unique) solution to the following min-max problem

\[ \operatorname*{minimize}_{{\bf w}\in\mathbb{R}^N}\;\sup_{{\bf c}\ge0}\; \frac{1}{2} \|{\bf w}-{\bf u}\|^2 + {\bf c}^\top (A{\bf w} - {\bf b}). \]

Strong duality holds because this is a convex problem with linear constraints. Hence, the min and max can be swapped, and the problem can be equivalently rewritten as

\[ \operatorname*{maximize}_{{\bf c}\ge0}\;\inf_{{\bf w}\in\mathbb{R}^N}\; \frac{1}{2} \|{\bf w}-{\bf u}\|^2 + {\bf c}^\top (A{\bf w} - {\bf b}). \]

Setting the gradient with respect to \({\bf w}\) to zero, the inner minimization is solved by

\[ {\bf w}^* = {\bf u} - A^\top {\bf c}. \]

Substituting \({\bf w}^*\) back into the objective yields

\[\begin{split} \begin{aligned} \frac{1}{2} \|{\bf w}^*-{\bf u}\|^2 + {\bf c}^\top (A{\bf w}^* - {\bf b}) &= \frac{1}{2} \|{\bf u} - A^\top {\bf c}-{\bf u}\|^2 + {\bf c}^\top (A{\bf u} - AA^\top {\bf c} - {\bf b}) \\ &= \frac{1}{2} \|A^\top {\bf c}\|^2 + {\bf c}^\top A{\bf u} - \|A^\top {\bf c}\|^2 - {\bf c}^\top {\bf b} \\ &= -\frac{1}{2} \|A^\top {\bf c}\|^2 + {\bf c}^\top A{\bf u} - {\bf c}^\top {\bf b} \\ &= -\frac{1}{2} \|A^\top {\bf c} - {\bf u}\|^2 - {\bf c}^\top {\bf b} + \frac{1}{2} \|{\bf u}\|^2. \end{aligned} \end{split}\]

Hence, after discarding the constant term \(\frac{1}{2}\|{\bf u}\|^2\) and flipping the sign, the outer maximization is equivalent to

\[ \operatorname*{minimize}_{{\bf c}\ge0}\; \frac{1}{2} \|A^\top {\bf c} - {\bf u}\|^2 + {\bf c}^\top {\bf b}. \]

This is a non-negative least squares problem that can be efficiently solved with numerical methods.
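
As a numerical cross-check of this derivation, the sketch below solves the dual problem with a generic bound-constrained solver (scipy.optimize.minimize, an arbitrary choice) on the example data used earlier, reusing project_poly defined above. It verifies that the recovered primal solution matches the projection and satisfies complementary slackness, up to solver tolerance.

import numpy as np
from scipy.optimize import minimize

u = np.array([1.5, -2.0])
A = np.array([[-1.0, -2.0],
              [-2.0, -1.0],
              [ 1.0, -1.0]])
b = np.array([0.0, 0.0, 3.0])

# Dual objective: f(c) = 0.5 ||A^T c - u||^2 + c^T b, minimized over c >= 0
def dual(c):
    return 0.5 * np.sum((A.T @ c - u) ** 2) + c @ b

res = minimize(dual, np.zeros(len(b)), bounds=[(0, None)] * len(b))
c_star = res.x

# The primal solution recovered from the dual matches the projection
p = u - A.T @ c_star
print(np.allclose(p, project_poly(u, A, b), atol=1e-4))  # True

# Complementary slackness: c*_k > 0 only where the k-th constraint is active
print(c_star @ (A @ p - b))  # approximately 0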