Rank of centered data

Consider an {n\times p} matrix {X}. In general, its rank satisfies

\displaystyle \mathrm{rank}(X)\leq \min\{n,p\}.

Now, suppose we wish to column-center the data. We can do this algebraically by using what is known as the centering matrix,

\displaystyle H_n=I_n-\frac{1}{n}\mathbf{O},

where {\mathbf{O}} is the {n\times n} matrix with {\mathbf{O}_{ij}=1} for all {i,j}. Left-multiplying {X} by {H_n} subtracts from each entry the mean of its column, so every column of {H_nX} is centered.
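As a quick numerical illustration, the sketch below builds {H_n} with NumPy and checks that {H_nX} matches subtracting the column means directly. The helper name `centering_matrix`, the random seed and the sizes {n=5}, {p=3} are assumptions chosen only for this example.

```python
import numpy as np

def centering_matrix(n):
    """Return H_n = I_n - (1/n) * O, where O is the n-by-n all-ones matrix."""
    return np.eye(n) - np.ones((n, n)) / n

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
X = rng.normal(size=(5, 3))      # illustrative data with n = 5, p = 3

H = centering_matrix(5)
X_centered = H @ X

# Multiplying by H_n gives the same result as subtracting each column's mean.
assert np.allclose(X_centered, X - X.mean(axis=0))
# Every column of the centered matrix has mean zero (up to rounding).
assert np.allclose(X_centered.mean(axis=0), 0.0)
```

In practice one would normally compute `X - X.mean(axis=0)` rather than form the {n\times n} matrix; the explicit {H_n} is mainly useful for algebraic arguments like the one here.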

The nullspace of {H_n} is spanned by the vector of ones, so it is one-dimensional and {\mathrm{rank}(H_n)=n-1}. Since the rank of a product is at most the smallest rank of its factors,

\displaystyle \mathrm{rank}(H_nX)\leq \min\{n-1,n,p\}=\min\{n-1,p\}.
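The claim about the nullspace of {H_n} can be verified in one line. Writing {\mathbf{1}_n} for the vector of {n} ones (notation introduced here for convenience) and using {\mathbf{O}\mathbf{1}_n=n\mathbf{1}_n},

\displaystyle H_n\mathbf{1}_n=\mathbf{1}_n-\frac{1}{n}\mathbf{O}\mathbf{1}_n=\mathbf{1}_n-\mathbf{1}_n=0.

Conversely, if {H_nv=0} then {v=\frac{1}{n}\mathbf{O}v=\bar{v}\,\mathbf{1}_n}, where {\bar{v}} is the mean of the entries of {v}, so every vector in the nullspace is a multiple of {\mathbf{1}_n}.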

Similarly, for the row-centered matrix {XH_p},

\displaystyle \mathrm{rank}(XH_p)\leq \min\{p-1,n\}.
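The two bounds can be checked numerically. The following is a minimal sketch using NumPy's `numpy.linalg.matrix_rank`; the helper `centering_matrix`, the seed and the sizes {n=4}, {p=6} are assumptions made for illustration, and the numerical rank depends on the function's default tolerance.

```python
import numpy as np

def centering_matrix(n):
    """Return H_n = I_n - (1/n) * O, where O is the n-by-n all-ones matrix."""
    return np.eye(n) - np.ones((n, n)) / n

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n, p = 4, 6                      # illustrative sizes
X = rng.normal(size=(n, p))

rank_X = np.linalg.matrix_rank(X)
rank_H = np.linalg.matrix_rank(centering_matrix(n))
rank_col_centered = np.linalg.matrix_rank(centering_matrix(n) @ X)  # H_n X
rank_row_centered = np.linalg.matrix_rank(X @ centering_matrix(p))  # X H_p

# The bounds from the text.
assert rank_X <= min(n, p)
assert rank_H == n - 1
assert rank_col_centered <= min(n - 1, p)
assert rank_row_centered <= min(p - 1, n)

print(rank_X, rank_col_centered, rank_row_centered)
```

With {n<p} as here, column-centering is the operation that can lower the bound, from {n} to {n-1}, while the row-centering bound {\min\{p-1,n\}} stays at {n}.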
