$\begingroup$

Given a matrix $A \in \mathbb{R}^{m \times n}$ with non-negative entries, I want to find a vector $\vec{x} \in \mathbb{N}_{+}^{m}$ such that $\vec{y} = A^{T} \vec{x}$ is nearly a constant vector, i.e., the entries of $\vec{y}$ are as close to identical as possible. In simple words: I can scale each row of $A$ by a positive integer, and I would like the resulting column sums to be as close to each other as possible.

A trivial example:

$$ A = \begin{bmatrix} 2 & 6 & 1 \\ 1 & 1 & 15 \\ 6 & 3 & 1 \end{bmatrix} $$

With a scaling of $1$ for each row, the column sums are $\left[ 9, 10, 17 \right]$, which is far from being a constant vector. By scaling the 1st and 3rd rows by $2$ we get:

$$ A^{T} \vec{x} = \begin{bmatrix} 2 & 6 & 1 \\ 1 & 1 & 15 \\ 6 & 3 & 1 \end{bmatrix}^{T} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 17 \\ 19 \\ 19 \end{bmatrix} $$

which is much closer to a constant vector. I could formulate this as:

$$ \arg \min_{c, \vec{x} } c \quad \text{s.t.} \; A^T \vec{x} = c \vec{1}, \; \vec{x} \geq \vec{1} $$

Then I just round the output values of $\vec{x}$. Is there a better way to formulate this? Specifically, how can I formulate the vector being nearly constant? I thought about the norm of its discrete derivative (successive differences). Ideally I'd like a formulation as a linear integer program.
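For what it's worth, "nearly constant" linearizes cleanly as the range $\max_j y_j - \min_j y_j$: a MILP encodes it with auxiliary variables $t_{\min} \le y_j \le t_{\max}$ and objective $t_{\max} - t_{\min}$. A brute-force Python sketch of that objective on the example matrix (the search box $[1,5]^3$ is an arbitrary assumption, not part of the question):

```python
import itertools
import numpy as np

A = np.array([[2, 6, 1],
              [1, 1, 15],
              [6, 3, 1]])

# "Nearly constant" as a linear objective: minimize max(y) - min(y).
# A MILP encodes this with t_max >= y_j, t_min <= y_j and objective
# t_max - t_min; here we brute-force the small box [1, 5]^3 instead.
best = None
for xs in itertools.product(range(1, 6), repeat=3):
    y = A.T @ np.array(xs)
    spread = int(y.max() - y.min())
    if best is None or spread < best[0]:
        best = (spread, xs, tuple(int(v) for v in y))

print(best)  # (2, (2, 1, 2), (17, 19, 19))
```

The optimum over this box recovers the $\vec{x} = (2,1,2)$ scaling from the example above.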


$\endgroup$
  • $\begingroup$ Your example has an optimal solution $x=\operatorname{adj}(A)^T(1,1,1)^T=(42,56,23)^T$, with $A^Tx=\operatorname{det}(A)(1,1,1)^T$. I guess for an interesting example you want a matrix $A$ whose adjugate has at least one negative column sum. $\endgroup$ Commented Jul 16 at 17:21
  • $\begingroup$ @user1651511, if $\vec{x} = [42, 56, 23]^T$ then $A^T \vec{x} = [278, 377, 905]^T$ which is far from being a constant. $\endgroup$ Commented Jul 17 at 3:15
  • $\begingroup$ Oops, I should have typed $x=(42,23,56)^T$, so that $A^Tx=443(1,1,1)^T$. $\endgroup$ Commented Jul 17 at 7:05
  • $\begingroup$ One simple way to say $v$ is constant is $v=Pv$, where $P$ is the $3\times 3$ matrix with all entries $1/3$. So for $v$ nearly constant you could ask for $\|v-Pv\|$ small, for your favorite choice of norm. $\endgroup$ Commented Jul 17 at 12:00
  • $\begingroup$ If I am right, you can solve for a solution as close as possible to $\vec1$, then rescale so that all components of $\vec x$ are naturals. The criterion could be the sum of absolute deviations $\sum|A\vec x-\vec 1|$ or the minimum absolute deviation $\min|A\vec x-\vec 1|$. These will yield rational components. $\endgroup$ Commented Jul 27 at 19:41

2 Answers

$\begingroup$

Here's a snippet of Julia code for a non-square matrix:

# construct a small test problem (m ≤ n)
using LinearAlgebra   # for the identity operator I
m,n = 3,5;
A = rand(1:10,m,n) // BigInt(1)
#=
 8  1  1   2  10
 5  7  1  10   5
 6  8  2   4   8
=#

# calculate the exact Moore-Penrose inverse of A
Amp = A'*inv(A*A');
A*Amp == I
# true

# calculate the rational least-squares solution
#   r = pinv(A)*1
r = sum(Amp, dims=2)
#=
 2447//52664
  887//52664
  275//52664
  999//26332
 2793//52664
=#

c = lcm(denominator.(r))
# 52664

A*r
#=
 1
 1
 1
=#

x = c*r
#=
 2447
  887
  275
 1998
 2793
=#

In mathematical notation, the algorithm is $$\eqalign{ \def\LR#1{\left(#1\right)} \def\BR#1{\left[#1\right]} \def\CR#1{\left\lbrace #1 \right\rbrace} \def\q{\quad} \def\qif{\q\iff\q} \def\qiq{\q\implies\q} \def\qq{\qquad} \def\o{{\tt1}} r = A^{\bf+}\o \qif Ar = \o \qiq c = \operatorname{lcm}\!\LR{\text{denominators of }r},\q x = cr }$$ Applying this to your test problem produces $$ r = \pmatrix{42//443 \\ 23//443 \\ 56//443} \qiq c = 443,\quad x = \pmatrix{42 \\ 23 \\ 56} $$


(Oops! I just noticed that I used $A$ instead of $A^T$ but I don't want to re-type the whole answer)
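A language-agnostic sketch of the same pipeline (with the transpose restored, per the note above), using exact rational arithmetic from Python's `fractions`; the Gauss-Jordan elimination assumes $A^T$ is invertible:

```python
from fractions import Fraction
from math import lcm

A = [[2, 6, 1],
     [1, 1, 15],
     [6, 3, 1]]
n = 3

# Solve A^T r = 1 exactly over the rationals by Gauss-Jordan elimination
# (assumes A^T is invertible; entries stay exact Fractions throughout).
M = [[Fraction(A[j][i]) for j in range(n)] + [Fraction(1)] for i in range(n)]
for col in range(n):
    piv = next(rw for rw in range(col, n) if M[rw][col] != 0)
    M[col], M[piv] = M[piv], M[col]
    M[col] = [v / M[col][col] for v in M[col]]
    for rw in range(n):
        if rw != col:
            M[rw] = [a - M[rw][col] * b for a, b in zip(M[rw], M[col])]
r = [M[i][n] for i in range(n)]        # (42/443, 23/443, 56/443)

c = lcm(*[f.denominator for f in r])   # clear denominators
x = [int(f * c) for f in r]
print(c, x)  # 443 [42, 23, 56]
```

This reproduces $c = 443$ and $x = (42, 23, 56)$ from the displayed result.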

Derivation

Use an unconstrained vector $w$ to construct the constrained $x$ vector $$\eqalign{ \def\op#1{\operatorname{#1}} \def\diag#1{\op{diag}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\trace#1{\op{\sf tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\rr#1{\color{red}{#1}} e &= \exp(w)\gt0 &\qiq de = e\odot dw \\ E &= \Diag{e} &\qiq de = E\,dw \\ x &= \LR{\o+e}\gt\o &\qiq dx = E\,dw \\ x &= cr &\qiq r = {\sf rational\ vector} \\ & &\qiq c \in {\mathbb R} \\ \LR{Ax-c\o} &= c\LR{Ar-\o} \\ }$$ The gradient of the cost function wrt $w$ can be calculated as $$\eqalign{ \phi &= \tfrac12\frob{Ax-c\o}^2 \\ d\phi &= \LR{Ax-c\o}:\LR{A\,dx} \\ &= \LR{cA^T\!Ar-cA^T\o}:\LR{E\,dw} \\ &= cE\LR{A^T\!Ar-A^T\o}:dw \\ \grad{\phi}w &= cE\LR{A^T\!Ar-A^T\o} \\ }$$ Setting the gradient to zero yields $$\eqalign{ A^T\!Ar &= A^T\o \qiq r &= \rr{A^{\bf+}\o} + \LR{I-A^{\bf+}A}v \\ }$$ where $v$ is an arbitrary rational vector.

Often the $\rr{\rm first\ term}$ contains no negative components, so $v$ can be set to zero (as in the preceding examples). Otherwise you must find a non-zero $v$ such that the resulting $r$ vector is positive.

The final step is to convert the rational $r$ vector into an integer $x$ vector by multiplying it by the least common multiple of its denominators.
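The solution family above can be sanity-checked numerically. A NumPy sketch using the $3\times 5$ matrix from the run above (floating point, so equalities hold only up to tolerance; the null-space shift vector $v$ is an arbitrary choice):

```python
import numpy as np

# the 3x5 matrix from the Julia run above (full row rank)
A = np.array([[8, 1, 1,  2, 10],
              [5, 7, 1, 10,  5],
              [6, 8, 2,  4,  8]], dtype=float)

Apinv = np.linalg.pinv(A)
ones = np.ones(3)

r0 = Apinv @ ones                    # the particular term  A^+ 1
N = np.eye(5) - Apinv @ A            # projector onto the null space of A

v = np.array([1.0, -2.0, 0.5, 3.0, -1.0])  # arbitrary shift
r = r0 + N @ v                       # another exact solution
print(np.allclose(A @ r0, ones))     # True
print(np.allclose(A @ r, ones))      # True: null-space shifts don't change A r
```

Since $A$ has full row rank, $AA^{\bf+} = I$, so every member of the family satisfies $Ar = \mathbf{1}$.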

$\endgroup$
$\begingroup$

Given a matrix ${\bf A} \in \mathbb{R}^{m \times n}$, we would like to find an input vector ${\bf x} \in \left(\mathbb{N}^{+}\right)^m$ such that the output vector ${\bf y} := {\bf A}^\top {\bf x}$ is as close to a constant vector as possible, i.e., as close to the line spanned by the vector ${\bf 1}_n$ as possible. Let $\bar{y} := \frac1n {\bf 1}_n^\top {\bf y}$ be the arithmetic mean of the entries of $\bf y$ and note that

$$ {\bf y} - \bar{y} {\bf 1}_n = {\bf y} - \frac1n {\bf 1}_n {\bf 1}_n^\top {\bf y} = \left( {\bf I}_n - \frac1n {\bf 1}_n {\bf 1}_n^\top \right) {\bf y} $$

where ${\bf P} := {\bf I}_n - \frac1n {\bf 1}_n {\bf 1}_n^\top$ is the rank-$(n-1)$ projection matrix onto the $(n-1)$-dimensional orthogonal complement of the line spanned by ${\bf 1}_n$. Thus, minimizing $\| {\bf P} \, {\bf y} \|$ forces $\bf y$ to be close to that line. Using, say, the $1$-norm, and trading off the deviation of the output against the input energy expended, we obtain the following regularized problem

$$ \underset{{\bf x} \in \left(\mathbb{N}^{+}\right)^m}{\text{minimize}} \quad \left\| {\bf P} \, {\bf A}^\top {\bf x} \right\|_1 + \gamma \| {\bf x} \|_1$$

where $\gamma \geq 0$. For the given $3 \times 3$ matrix $\bf A$, let us consider two cases:

  • if $\gamma = 0$, we obtain ${\bf x} = \begin{bmatrix} 42 & 23 & 56 \end{bmatrix}^\top$ and ${\bf y} = 443 \cdot {\bf 1}_3$, which yields zero cost. This is the solution found by Yves Daoust.

  • if $\gamma = 1$, we obtain ${\bf x} = \begin{bmatrix} 2 & 1 & 2 \end{bmatrix}^\top$ and ${\bf y} = \begin{bmatrix} 17 & 19 & 19 \end{bmatrix}^\top$, which is the solution presented by the OP in the question itself.

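As a quick numeric check of the projector (a sketch using the numbers from the second case): for ${\bf x} = (2,1,2)$ the cost evaluates to $\|{\bf P}{\bf y}\|_1 + \gamma\|{\bf x}\|_1 = 8/3 + 1 \cdot 5 = 23/3 \approx 7.67$, which matches the solver output shown further down.

```python
import numpy as np

A = np.array([[2, 6, 1],
              [1, 1, 15],
              [6, 3, 1]])
n = 3
P = np.eye(n) - np.ones((n, n)) / n    # centering projector

x = np.array([2, 1, 2])
y = A.T @ x                            # [17, 19, 19]
print(P @ y)                           # y minus its mean: [-4/3, 2/3, 2/3]
print(np.linalg.norm(P @ y, 1) + 1 * np.abs(x).sum())  # 23/3 ≈ 7.667
```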

Python code

import cvxpy as cp
import numpy as np


def regularized(A, gamma):

    m = A.shape[0]
    n = A.shape[1]
    P = np.identity(n) - ((1/n) * np.ones((n, n)))

    # solve regularized problem
    x    = cp.Variable(m, integer=True)
    cost = cp.norm(P @ A.transpose() @ x, 1) + gamma * cp.norm(x, 1)
    prob = cp.Problem(cp.Minimize(cost), [ x >= np.ones(m) ])
    prob.solve()

    if prob.status == "optimal":
        return (x.value, x.value @ A, prob.value)
    else:
        return "Error!"


def main():

    A = np.array([[ 2,  6,  1],
                  [ 1,  1, 15],
                  [ 6,  3,  1]])

    print("Solution when gamma = 0:", regularized(A, 0))
    print("Solution when gamma = 1:", regularized(A, 1))


if __name__ == "__main__":
    main()

outputs

Solution when gamma = 0: (array([42., 23., 56.]), array([443., 443., 443.]), 1.6771769158670696e-13)
Solution when gamma = 1: (array([2., 1., 2.]), array([17., 19., 19.]), 7.66666666666667)
$\endgroup$
