Good question. I think the reason the first definition is "better" is that it can be stated independently of the number of parameters:
Let $f,g:X\to\mathbb R_+$ for some discrete space $X$. We say that $f \in \mathcal O(g)$ iff there is a positive constant $c$ such that $f(x)\le cg(x)$ for all but finitely many $x$.
Then the usual one-parameter definition results when we take $X$ to be $\mathbb N$, and the one you quote from CLRS when we take $X$ to be $\mathbb N^2$.
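As a concrete illustration, here is a small numerical sanity check of the "all but finitely many" condition (a sketch of mine, with arbitrarily chosen functions, constants, and grid sizes; a finite search can only gather evidence for the asymptotic claim, never prove it):

```python
from itertools import product

def exceptions(f, g, c, points):
    """Sample points x with f(x) > c*g(x): candidates for the exceptional set."""
    return [x for x in points if f(x) > c * g(x)]

# X = N: f(n) = 2n + 3 is O(n); with c = 3 the only exceptions are n = 0, 1, 2.
print(exceptions(lambda n: 2 * n + 3, lambda n: n, 3, range(100)))  # [0, 1, 2]

# X = N^2: the exceptions to n + m <= 2nm are the points on the two axes, and
# the axes defeat every choice of c, so n + m is not O(nm) under this definition.
grid = list(product(range(50), repeat=2))
print(len(exceptions(lambda p: p[0] + p[1], lambda p: p[0] * p[1], 2, grid)))  # 98
```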
In contrast, your proposed second definition would allow unbounded functions such as
$$f(n,m) = \begin{cases} n+m & \text{if } n=100 \text{ or } m=100 \\ 1 & \text{otherwise} \end{cases}$$
to be $\mathcal O(1)$, which doesn't seem useful for reporting algorithmic complexity.
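A quick numerical check makes this concrete (again a sketch of mine; the quadrant threshold $101$ and the sample points are arbitrary). This $f$ is identically $1$ on the quadrant $n,m\ge 101$, which is all your second definition looks at, yet it is unbounded along the line $n=100$:

```python
def f(n, m):
    return n + m if n == 100 or m == 100 else 1

# Bounded (by 1) on the quadrant witnessed by n0 = m0 = 101 ...
print(max(f(n, m) for n in range(101, 300) for m in range(101, 300)))  # 1

# ... while the set {(n, m) : f(n, m) > c} is infinite for every c:
print([f(100, m) for m in (10**3, 10**6, 10**9)])  # [1100, 1000100, 1000000100]
```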
More abstractly, remember that asymptotics are always defined relative to some limit $x\to x_0$. In the case above the limit is $x\to (+\infty,+\infty)$. Limits involving infinity are often treated as a special case in real analysis, but there is a natural topology on $\mathbb N^2\cup\{\infty\}$, in which a neighborhood of the single point $\infty$ is any set that misses only finitely many points of $\mathbb N^2$, such that the definition above is equivalent to
We say that $f\in\mathcal O_{x\to x_0}(g)$ iff there is a positive constant $c$ such that $f(x)\le cg(x)$ everywhere in some neighborhood of $x_0$, except possibly at $x_0$ itself.
This view encompasses the asymptotics for $x\to 0$ we see in analysis. Additionally, it offers the opportunity to get your second definition back in the game, if you redefine the topology on $\mathbb N^2\cup\{\infty\}$ to give $\infty$ smaller neighborhoods, namely by taking the quadrants $\{(n,m) : n\ge n_0,\ m\ge m_0\}$ as a neighborhood base at $\infty$.
Then the distinction between the two definitions you quote is simply that they are based on two different limits: the limit point $\infty$ is the same, but it carries two different systems of neighborhoods.
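To make "same pattern, different neighborhoods" explicit, here is one last sketch (the helper names and the finite test window are my own, and a finite window is only illustrative). It parameterizes the test by a basic neighborhood of $\infty$: complements of finite boxes generate the "all but finitely many" filter of the first (CLRS-style) definition, while quadrants generate the one behind the second:

```python
def is_O_on(f, g, c, neighborhood, window):
    """Check f <= c*g at every sample point of the finite window inside the neighborhood."""
    return all(f(n, m) <= c * g(n, m) for (n, m) in window if neighborhood(n, m))

def box_complement(k):
    # Basic neighborhood for the first definition: complement of the box [0,k]^2.
    return lambda n, m: n > k or m > k

def quadrant(n0, m0):
    # Basic neighborhood for the second definition.
    return lambda n, m: n >= n0 and m >= m0

f = lambda n, m: n + m if n == 100 or m == 100 else 1
one = lambda n, m: 1
window = [(n, m) for n in range(500) for m in range(500)]

# The line n = 100 meets every box complement, so no choice of k works:
print(is_O_on(f, one, 1, box_complement(200), window))     # False
# ... but a single quadrant suffices, so f is O(1) under the second definition:
print(is_O_on(f, one, 1, quadrant(101, 101), window))      # True
```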