How do I calculate the $p$-norm of a matrix?
I know that the $p$-norm for a matrix is:
$$\|A\| = \max_{x\neq 0} \frac{\|Ax\|_p}{\|x\|_p}$$
but I don't know what this really means.
So how would I compute the $2$-norm, $3$-norm, etc., of the matrix
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$
UPDATE: Apparently, the above matrix is too easy :) Let's try something harder.
$$A = \begin{bmatrix} 2 & 1 & 4 \\ 3 & 0 & -1 \\ 1 & 1 & 2 \end{bmatrix}$$
Thanks,
Solution 1:
If you have a vector $x = [x_1,x_2,\ldots,x_n]^T$, then $\|x\|_p = \sqrt[p]{|x_1|^p + |x_2|^p + \cdots + |x_n|^p}$.
In your case, $x = [x_1, x_2]^T$, so $Ax = [2x_1 + x_2,\ x_1 + 2x_2]^T$.
$\|Ax\|_p^p = |2x_1 + x_2|^p + |x_1 + 2x_2|^p$ and $\|x\|_p^p = |x_1|^p + |x_2|^p$
$$\left( \frac{\|Ax\|_p}{\|x\|_p} \right)^p = \frac{|2x_1 + x_2|^p + |x_1 + 2x_2|^p}{|x_1|^p + |x_2|^p}$$
By symmetry (the ratio is unchanged when $x_1$ and $x_2$ are swapped), the maximum occurs when $x_1 = x_2 = y$, and hence
$$\left( \frac{\|Ax\|_p}{\|x\|_p} \right)^p = \frac{(3y)^p + (3y)^p}{y^p + y^p} = \frac{2 \times 3^p y^p}{2y^p} = 3^p$$
Hence, the $p$-norm of $A$ is $3$.
For any matrix, the $2$-norm is the largest singular value.
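As a quick numerical sanity check, both claims can be verified in Mathematica; the symbols A2, x1, x2 below are introduced just for this sketch:
A2 = {{2, 1}, {1, 2}};
(* maximize ||A2.x||_p subject to ||x||_p == 1 for a few values of p; each maximum should come out near 3 *)
Table[{p, First[NMaximize[{Norm[A2.{x1, x2}, p], Norm[{x1, x2}, p] == 1}, {x1, x2}]]}, {p, {1, 2, 3, 4}}]
(* the 2-norm as the largest singular value; both expressions give 3. here *)
{Norm[N[A2], 2], First[SingularValueList[N[A2]]]}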
Solution 2:
As I've mentioned in this answer to an MO question, Nick Higham has described a numerical method, which he attributes to Boyd, for estimating the $p$-norm of a matrix. It is essentially an approximate maximization scheme, similar in flavor to the usual power method for computing the dominant eigenvalue. Higham's book has a few details, his article a few more; see also this MATLAB implementation of the algorithm.
Code? Why sure! Here's a Mathematica translation of the MATLAB routine pnorm():
(* dualVector[vec, p]: returns a vector z of unit q-norm (1/p + 1/q = 1) with
   z.vec == Norm[vec, p]; this is the dual-vector step used by pnorm below *)
dualVector[vec_?VectorQ, p_] :=
 Module[{q = If[p == 1, Infinity, 1/(1 - 1/p)], n = Length[vec]},
  If[Norm[vec, Infinity] == 0, Return[vec]];
  Switch[p,
   1, 2 UnitStep[vec] - 1,  (* +/-1 according to the signs of the entries *)
   Infinity,                (* signed unit vector at the position of the largest |entry| *)
   (Sign[Extract[vec, #]] UnitVector[n, #]) &[
    First[Flatten[Position[Abs[vec], Max[Abs[vec]]]]]],
   _, Normalize[(2 UnitStep[vec] - 1) Abs[
       Normalize[vec, Norm[#, Infinity] &]]^(p - 1),
      Norm[#, q] &]]] /; p >= 1
Options[pnorm] = {"Samples" -> 9, Tolerance -> Automatic};

(* pnorm[mat, p]: estimate of the matrix norm induced by the vector p-norm, following the
   Higham/Boyd power-method-like iteration described above; "Samples" controls the coarse
   angular search used to build the starting vector, Tolerance the relative stopping tolerance *)
pnorm[mat_?MatrixQ, p_, opts___] :=
 Module[{q = If[p == 1, Infinity, 1/(1 - 1/p)], m, n, A, tol, sm, x,
   y, c, s, W, fo, f, c1, s1, est, eo, z, k, th},
  {m, n} = Dimensions[mat];
  A = If[Precision[mat] === Infinity, N[mat], mat];
  {sm, tol} = {"Samples", Tolerance} /. {opts} /. Options[pnorm];
  If[tol === Automatic, tol = 10^(-Precision[A]/3)];
  y = Table[0, {m}]; x = Table[0, {n}];
  (* build a starting vector x column by column, choosing at each step the
     rotation of the new column against the running vector y that maximizes the p-norm *)
  Do[
   If[k == 1, {c, s} = {1, 0},
    W = Transpose[{A[[All, k]], y}];
    fo = 0;
    Do[
     {c1, s1} = Normalize[Through[{Cos, Sin}[th]], Norm[#, p] &];
     f = Norm[W.{c1, s1}, p];
     If[f > fo,
      fo = f; {c, s} = {c1, s1}];
     , {th, 0, Pi, Pi/(sm - 1)}]
    ];
   x[[k]] = c;
   y = c A[[All, k]] + s y;
   If[k > 1, x = Join[s Take[x, k - 1], Drop[x, k - 1]]];
   , {k, n}];
  est = Norm[y, p];
  (* power-method-like refinement: alternate multiplication by A (and its transpose,
     via dual vectors) until the estimate stops improving to within the tolerance *)
  For[k = 1, True, k++,
   y = A.x;
   eo = est; est = Norm[y, p];
   z = Transpose[A].dualVector[y, p];
   If[(Norm[z, q] < z.x || Abs[est - eo] <= tol est) && k > 1,
    Break[]];
   x = dualVector[z, q];];
  est] /; p >= 1
Now, let's use the OP's matrix as an example:
mat = N[{{2, 1, 4}, {3, 0, -1}, {1, 1, 2}}];
and check how good the estimator is in known cases:
(pnorm[mat, #] - Norm[mat, #]) & /@ {1, 2, Infinity} // InputForm
{0., -2.2045227554556845*^-6, 0.}
(i.e., the estimate for the $2$-norm is good to ~5 digits; adjusting Samples, Tolerance, or both would give better results).
Let's compare with Robert's example:
pnorm[mat, 4, Tolerance -> 10^-9] // InputForm
5.695759123950937
Pretty close!
Finally, here is how one can plot the $p$-norm of the OP's matrix as $p$ varies.
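A minimal sketch, assuming the pnorm routine and the mat definition above (the sampling range $1 \le p \le 10$ in steps of $1/4$ is an arbitrary choice for this sketch):
(* sample pnorm[mat, p] on a grid of p values and join the points *)
ListLinePlot[Table[{p, pnorm[mat, p]}, {p, 1, 10, 1/4}], AxesLabel -> {"p", "p-norm"}]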
Solution 3:
For other values of $p$, you can use Lagrange multiplier methods. Here, for example, is your $3 \times 3$ example with $p=4$ using Maple.
> M:= <<2|1|4>,<3|0|-1>,<1|1|2>>;
X:= <x1,x2,x3>;
MX:= M.X;
F:= add(MX[i]^4,i=1..3)-lambda*(add(X[i]^4,i=1..3)-1);
eqs:= {diff(F,x1),diff(F,x2),diff(F,x3),diff(F,lambda)};
S:=RootFinding[Isolate](eqs,[x1,x2,x3,lambda]);
pnorm:= max(map(t -> eval(lambda,t),S))^(1/4);
pnorm := 5.695759124
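For comparison, a minimal Mathematica sketch of the same Lagrange-multiplier computation (the names x1, x2, x3, lambda are introduced just for this sketch; NSolve over the reals may take a moment on this quartic system):
A = {{2, 1, 4}, {3, 0, -1}, {1, 1, 2}};
vars = {x1, x2, x3, lambda};
(* stationary points of ||A.x||_4^4 - lambda (||x||_4^4 - 1), as in the Maple code above *)
F = Total[(A.{x1, x2, x3})^4] - lambda (Total[{x1, x2, x3}^4] - 1);
sols = NSolve[Thread[D[F, {vars}] == 0], vars, Reals];
Max[lambda /. sols]^(1/4)  (* should agree with 5.695759... *)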
Solution 4:
EXTRA CREDIT: find the eigenvalues, eigenvectors, and the $1,2,\infty$ norms of the matrix $$B = \begin{pmatrix} 1 & 10 \\ -16 & 9 \end{pmatrix},$$ using the methods illustrated below. I have arranged that everything works out nicely.
ORIGINAL: I've never seen these calculated except for $p = 1, 2, \infty$. Note that the $1$-norm of a vector is the sum of the absolute values of the entries, while the $\infty$-norm of a vector is the largest absolute value of any entry.
As you can see from the link given by Jonas, the $\infty$-norm is the largest row sum (of absolute values), which for your $3 \times 3$ example is $2 + 1 + 4 = 7$. This is achieved for a column vector consisting of all $\pm 1$'s, where the signs are chosen so that the products with the entries $2, 1, 4$ are all positive, so in this case all $+1$. Take your matrix $$A = \left[\begin{matrix} 2 & 1 & 4 \\ 3 & 0 & -1 \\ 1 & 1 & 2 \end{matrix}\right],$$
$$Ax = \left(\begin{matrix} 2 & 1 & 4 \\ 3 & 0 & -1 \\ 1 & 1 & 2 \end{matrix}\right) \cdot \left(\begin{matrix} 1 \\ 1 \\ 1 \end{matrix}\right) = \left(\begin{matrix} 7 \\ 2 \\ 4 \end{matrix}\right), $$ so with $$ x = \left(\begin{matrix} 1 \\ 1 \\ 1 \end{matrix}\right), $$ we have $$ \|x\|_\infty = 1, \; \; \|Ax\|_\infty = 7, \; \; \frac{ \|Ax\|_\infty}{\|x\|_\infty} = 7, $$ and $$ \|A\|_\infty = 7. $$
The $1$-norm is the largest column sum (of absolute values), which for your $3 \times 3$ example is $4 + 1 + 2 = 7$. This is achieved for a column vector consisting of all $0$'s and a single $1$, placed so as to pick out the column with the largest absolute sum. Take your matrix $$A = \left[\begin{matrix} 2 & 1 & 4 \\ 3 & 0 & -1 \\ 1 & 1 & 2 \end{matrix}\right],$$
$$Ax = \left(\begin{matrix} 2 & 1 & 4 \\ 3 & 0 & -1 \\ 1 & 1 & 2 \end{matrix}\right) \cdot \left(\begin{matrix} 0 \\ 0 \\ 1 \end{matrix}\right) = \left(\begin{matrix} 4 \\ -1 \\ 2 \end{matrix}\right), $$ so with $$ x = \left(\begin{matrix} 0 \\ 0 \\ 1 \end{matrix}\right), $$ we have $$ \|x\|_1 = 1, \; \; \|Ax\|_1 = 7, \; \; \frac{ \|Ax\|_1}{\|x\|_1} = 7, $$ and $$ \|A\|_1 = 7. $$
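A quick cross-check in Mathematica (for a matrix argument, Norm[A, 1] and Norm[A, Infinity] give exactly these induced norms):
A = {{2, 1, 4}, {3, 0, -1}, {1, 1, 2}};
{Norm[A, 1], Norm[A, Infinity]}  (* both should be 7 *)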
These give upper bounds for the absolute values of the eigenvalues. From GP/Pari, the largest eigenvalue is about $4.7$:
? mat = [2,1,4; 3,0,-1; 1,1,2]
%1 =
[2 1 4]
[3 0 -1]
[1 1 2]
? charpoly(mat)
%2 = x^3 - 4*x^2 - 2*x - 7
?
? matdet(mat)
%3 = 7
?
? polroots( charpoly(mat) )
%4 = [4.734676178725887352610775374 ,
-0.3673380893629436763053876869 - 1.159101604948420124760141070*I,
-0.3673380893629436763053876869 + 1.159101604948420124760141070*I]~
?
As I compute it, the eigenvector for the real eigenvalue is given numerically as $$ x = \left(\begin{matrix} 1.803282495304177184832508716 \\ 0.9313936834217101677782666577 \\ 1 \end{matrix}\right), $$
? vec = [ 1.803282495304177184832508716 , 0.9313936834217101677782666577, 1 ]
%7 = [1.803282495304177184832508716, 0.9313936834217101677782666577, 1]
? columnvec = mattranspose( vec)
%8 = [1.803282495304177184832508716, 0.9313936834217101677782666577, 1]~
? mat * columnvec
%9 = [8.537958674030064537443284089, 4.409847485912531554497526148, 4.734676178725887352610775374]~
? mat * columnvec - 4.734676178725887352610775374 * columnvec
%10 = [4.038967835 E-28, -7.068193710 E-28, -4.038967835 E-28]~
?
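As a cross-check outside GP/Pari, a minimal Mathematica sketch confirming the eigenvalues and the residual of the dominant eigenpair:
A = N[{{2, 1, 4}, {3, 0, -1}, {1, 1, 2}}];
{vals, vecs} = Eigensystem[A];
vals  (* the real eigenvalue, about 4.7347, lies well inside the bound of 7 from the 1- and Infinity-norms *)
A.First[vecs] - First[vals] First[vecs]  (* residual of the dominant eigenpair; should be near zero *)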