$\begingroup$

Given a matrix $A \in \mathbb{R}^{m \times n}$ with non-negative entries, I want to find a vector $\vec{x} \in \mathbb{N}_{+}^{m}$ such that $\vec{y} = A^{T} \vec{x}$ is nearly a constant vector, i.e., the entries of $\vec{y}$ are as close to identical as possible. In simple words: I can scale each row of $A$ by a positive integer, and I would like the resulting column sums to be as close to each other as possible.

A trivial example:

$$ A = \begin{bmatrix} 2 & 6 & 1 \\ 1 & 1 & 15 \\ 6 & 3 & 1 \end{bmatrix} $$

With a scaling of $1$ for each row, the column sums are $\left[ 9, 10, 17 \right]$, which is far from being a constant vector. By scaling the 1st and 3rd rows by $2$ we get:

$$ A^{T} \vec{x} = \begin{bmatrix} 2 & 6 & 1 \\ 1 & 1 & 15 \\ 6 & 3 & 1 \end{bmatrix}^{T} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 17 \\ 19 \\ 19 \end{bmatrix} $$

which is much closer to a constant vector. I could formulate this as:

$$ \arg \min_{c, \vec{x} } c \quad \text{s.t.} \; A^T \vec{x} = c \vec{1}, \; \vec{x} \geq \vec{1} $$

Then I just round the output values of $\vec{x}$. Is there a better way to formulate this? Specifically, how can I formulate the vector being nearly constant? I thought about the norm of its discrete derivative (successive differences). Ideally I'd like a formulation as a linear integer program.
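For what it's worth, "nearly constant" linearizes cleanly as the range $\max_j y_j - \min_j y_j$: a MILP encodes it with auxiliary variables $t_{\min} \le y_j \le t_{\max}$ and objective $t_{\max} - t_{\min}$. A brute-force Python sketch of that objective on the example matrix (the search box $[1,5]^3$ is an arbitrary assumption, not part of the question):

```python
import itertools
import numpy as np

A = np.array([[2, 6, 1],
              [1, 1, 15],
              [6, 3, 1]])

# "Nearly constant" as a linear objective: minimize max(y) - min(y).
# A MILP encodes this with t_max >= y_j, t_min <= y_j and objective
# t_max - t_min; here we brute-force the small box [1, 5]^3 instead.
best = None
for xs in itertools.product(range(1, 6), repeat=3):
    y = A.T @ np.array(xs)
    spread = int(y.max() - y.min())
    if best is None or spread < best[0]:
        best = (spread, xs, tuple(int(v) for v in y))

print(best)  # (2, (2, 1, 2), (17, 19, 19))
```

The optimum over this box recovers the $\vec{x} = (2,1,2)$ scaling from the example above.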


$\endgroup$
  • $\begingroup$ Your example has an optimal solution $x=\operatorname{adj}(A)^T(1,1,1)^T=(42,56,23)^T$, with $A^Tx=\operatorname{det}(A)(1,1,1)^T$. I guess for an interesting example you want a matrix $A$ whose adjugate has at least one negative column sum. $\endgroup$ Commented Jul 16 at 17:21
  • $\begingroup$ @user1651511, if $\vec{x} = [42, 56, 23]^T$ then $A^T \vec{x} = [278, 377, 905]^T$ which is far from being a constant. $\endgroup$ Commented Jul 17 at 3:15
  • $\begingroup$ Oops, I should have typed $x=(42,23,56)^T$, so that $A^Tx=443(1,1,1)^T$. $\endgroup$ Commented Jul 17 at 7:05
  • $\begingroup$ One simple way to say $v$ is constant is $v=Pv$, where $P$ is the $3\times 3$ matrix with all entries $1/3$. So for $v$ nearly constant you could ask for $\|v-Pv\|$ small, for your favorite choice of norm. $\endgroup$ Commented Jul 17 at 12:00
  • $\begingroup$ If I am right, you can solve for a solution as close as possible to $\vec1$, then rescale so that all components of $\vec x$ are naturals. The criterion could be the sum of absolute deviations $\sum|A\vec x-\vec 1|$ or the minimum absolute deviation $\min|A\vec x-\vec 1|$. These will yield rational components. $\endgroup$ Commented Jul 27 at 19:41

2 Answers

$\begingroup$

Here's a snippet of Julia code for a non-square matrix:

# construct a small test problem (m ≤ n)
using LinearAlgebra   # for the identity operator I
m,n = 3,5;
A = rand(1:10,m,n) // BigInt(1)
#=
 8  1  1   2  10
 5  7  1  10   5
 6  8  2   4   8
=#

# calculate the exact Moore-Penrose inverse of A
Amp = A'*inv(A*A');
A*Amp == I
# true

# calculate the rational least-squares solution
#   r = pinv(A)*1
r = sum(Amp, dims=2)
#=
 2447//52664
  887//52664
  275//52664
  999//26332
 2793//52664
=#

c = lcm(denominator.(r))
# 52664

A*r
#=
 1
 1
 1
=#

x = c*r
#=
 2447
  887
  275
 1998
 2793
=#

In mathematical notation, the algorithm is $$\eqalign{ \def\LR#1{\left(#1\right)} \def\BR#1{\left[#1\right]} \def\CR#1{\left\lbrace #1 \right\rbrace} \def\q{\quad} \def\qif{\q\iff\q} \def\qiq{\q\implies\q} \def\qq{\qquad} \def\o{{\tt1}} r = A^{\bf+}\o \qif Ar = \o \qiq c = \operatorname{lcm}\!\LR{\text{denominators of }r},\q x = cr }$$ Applying this to your test problem produces $$ r = \pmatrix{42//443 \\ 23//443 \\ 56//443} \qiq c = 443,\quad x = \pmatrix{42 \\ 23 \\ 56} $$


(Oops! I just noticed that I used $A$ instead of $A^T$ but I don't want to re-type the whole answer)
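A language-agnostic sketch of the same pipeline (with the transpose restored, per the note above), using exact rational arithmetic from Python's `fractions`; the Gauss-Jordan elimination assumes $A^T$ is invertible:

```python
from fractions import Fraction
from math import lcm

A = [[2, 6, 1],
     [1, 1, 15],
     [6, 3, 1]]
n = 3

# Solve A^T r = 1 exactly over the rationals by Gauss-Jordan elimination
# (assumes A^T is invertible; entries stay exact Fractions throughout).
M = [[Fraction(A[j][i]) for j in range(n)] + [Fraction(1)] for i in range(n)]
for col in range(n):
    piv = next(rw for rw in range(col, n) if M[rw][col] != 0)
    M[col], M[piv] = M[piv], M[col]
    M[col] = [v / M[col][col] for v in M[col]]
    for rw in range(n):
        if rw != col:
            M[rw] = [a - M[rw][col] * b for a, b in zip(M[rw], M[col])]
r = [M[i][n] for i in range(n)]        # (42/443, 23/443, 56/443)

c = lcm(*[f.denominator for f in r])   # clear denominators
x = [int(f * c) for f in r]
print(c, x)  # 443 [42, 23, 56]
```

This reproduces $c = 443$ and $x = (42, 23, 56)$ from the displayed result.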

Derivation

Use an unconstrained vector $w$ to construct the constrained $x$ vector $$\eqalign{ \def\op#1{\operatorname{#1}} \def\diag#1{\op{diag}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\trace#1{\op{\sf tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\rr#1{\color{red}{#1}} e &= \exp(w)\gt0 &\qiq de = e\odot dw \\ E &= \Diag{e} &\qiq de = E\,dw \\ x &= \LR{\o+e}\gt\o &\qiq dx = E\,dw \\ x &= cr &\qiq r = {\sf rational\ vector} \\ & &\qiq c \in {\mathbb R} \\ \LR{Ax-c\o} &= c\LR{Ar-\o} \\ }$$ The gradient of the cost function wrt $w$ can be calculated as $$\eqalign{ \phi &= \tfrac12\frob{Ax-c\o}^2 \\ d\phi &= \LR{Ax-c\o}:\LR{A\,dx} \\ &= \LR{cA^T\!Ar-cA^T\o}:\LR{E\,dw} \\ &= cE\LR{A^T\!Ar-A^T\o}:dw \\ \grad{\phi}w &= cE\LR{A^T\!Ar-A^T\o} \\ }$$ Setting the gradient to zero yields $$\eqalign{ A^T\!Ar &= A^T\o \qiq r &= \rr{A^{\bf+}\o} + \LR{I-A^{\bf+}A}v \\ }$$ where $v$ is an arbitrary rational vector.

Often the $\rr{\rm first\ term}$ contains no negative components, so $v$ can be set to zero (as in the preceding examples). Otherwise you must find a non-zero $v$ such that the resulting $r$ vector is positive.

The final step is to convert the rational $r$ vector into an integer $x$ vector by multiplying it by the least common multiple of its denominators.
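The solution family above can be sanity-checked numerically. A NumPy sketch using the $3\times 5$ matrix from the run above (floating point, so equalities hold only up to tolerance; the null-space shift vector $v$ is an arbitrary choice):

```python
import numpy as np

# the 3x5 matrix from the Julia run above (full row rank)
A = np.array([[8, 1, 1,  2, 10],
              [5, 7, 1, 10,  5],
              [6, 8, 2,  4,  8]], dtype=float)

Apinv = np.linalg.pinv(A)
ones = np.ones(3)

r0 = Apinv @ ones                    # the particular term  A^+ 1
N = np.eye(5) - Apinv @ A            # projector onto the null space of A

v = np.array([1.0, -2.0, 0.5, 3.0, -1.0])  # arbitrary shift
r = r0 + N @ v                       # another exact solution
print(np.allclose(A @ r0, ones))     # True
print(np.allclose(A @ r, ones))      # True: null-space shifts don't change A r
```

Since $A$ has full row rank, $AA^{\bf+} = I$, so every member of the family satisfies $Ar = \mathbf{1}$.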

$\endgroup$
$\begingroup$

Given a matrix ${\bf A} \in \mathbb{R}^{m \times n}$, we would like to find an input vector ${\bf x} \in \left(\mathbb{N}^{+}\right)^m$ such that the output vector ${\bf y} := {\bf A}^\top {\bf x}$ is as close to a constant vector as possible, i.e., as close to the line spanned by the vector ${\bf 1}_n$ as possible. Let $\bar{y} := \frac1n {\bf 1}_n^\top {\bf y}$ be the arithmetic mean of the entries of $\bf y$ and note that

$$ {\bf y} - \bar{y} {\bf 1}_n = {\bf y} - \frac1n {\bf 1}_n {\bf 1}_n^\top {\bf y} = \left( {\bf I}_n - \frac1n {\bf 1}_n {\bf 1}_n^\top \right) {\bf y} $$

where ${\bf P} := {\bf I}_n - \frac1n {\bf 1}_n {\bf 1}_n^\top$ is the rank-$(n-1)$ projection matrix onto the $(n-1)$-dimensional orthogonal complement of the line spanned by ${\bf 1}_n$. Thus, minimizing $\| {\bf P} \, {\bf y} \|$ forces $\bf y$ to be close to that line. Using, say, the $1$-norm, and trading off the deviation of the output against the input energy expended, we obtain the following regularized problem

$$ \underset{{\bf x} \in \left(\mathbb{N}^{+}\right)^m}{\text{minimize}} \quad \left\| {\bf P} \, {\bf A}^\top {\bf x} \right\|_1 + \gamma \| {\bf x} \|_1$$

where $\gamma \geq 0$. For the given $3 \times 3$ matrix $\bf A$, let us consider two cases:

  • if $\gamma = 0$, we obtain ${\bf x} = \begin{bmatrix} 42 & 23 & 56 \end{bmatrix}^\top$ and ${\bf y} = 443 \cdot {\bf 1}_3$, which yields zero cost. This is the solution found by Yves Daoust.

  • if $\gamma = 1$, we obtain ${\bf x} = \begin{bmatrix} 2 & 1 & 2 \end{bmatrix}^\top$ and ${\bf y} = \begin{bmatrix} 17 & 19 & 19 \end{bmatrix}^\top$, which is the solution presented by the OP in the question itself.

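As a quick numeric check of the projector (a sketch using the numbers from the second case): for ${\bf x} = (2,1,2)$ the cost evaluates to $\|{\bf P}{\bf y}\|_1 + \gamma\|{\bf x}\|_1 = 8/3 + 1 \cdot 5 = 23/3 \approx 7.67$, which matches the solver output shown further down.

```python
import numpy as np

A = np.array([[2, 6, 1],
              [1, 1, 15],
              [6, 3, 1]])
n = 3
P = np.eye(n) - np.ones((n, n)) / n    # centering projector

x = np.array([2, 1, 2])
y = A.T @ x                            # [17, 19, 19]
print(P @ y)                           # y minus its mean: [-4/3, 2/3, 2/3]
print(np.linalg.norm(P @ y, 1) + 1 * np.abs(x).sum())  # 23/3 ≈ 7.667
```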

Python code

import cvxpy as cp
import numpy as np


def regularized(A, gamma):

    m = A.shape[0]
    n = A.shape[1]
    P = np.identity(n) - ((1/n) * np.ones((n, n)))

    # solve regularized problem
    x    = cp.Variable(m, integer=True)
    cost = cp.norm(P @ A.transpose() @ x, 1) + gamma * cp.norm(x, 1)
    prob = cp.Problem(cp.Minimize(cost), [ x >= np.ones(m) ])
    prob.solve()

    if prob.status == "optimal":
        return (x.value, x.value @ A, prob.value)
    else:
        return "Error!"


def main():

    A = np.array([[ 2,  6,  1],
                  [ 1,  1, 15],
                  [ 6,  3,  1]])

    print("Solution when gamma = 0:", regularized(A, 0))
    print("Solution when gamma = 1:", regularized(A, 1))


if __name__ == "__main__":
    main()

outputs

Solution when gamma = 0: (array([42., 23., 56.]), array([443., 443., 443.]), 1.6771769158670696e-13)
Solution when gamma = 1: (array([2., 1., 2.]), array([17., 19., 19.]), 7.66666666666667)
$\endgroup$
