Thursday, May 1, 2008

Least Squares Solution

I always get confused by this topic. Awhan explained it to me very nicely. I will discuss it briefly here so that I can refer back to it whenever needed.

A set of linear equations is given by

A x = y .................... (1)

where A is an m x n constant matrix, x is an n-dimensional vector, and y is an m-dimensional output or measurement vector. The task is to find a solution x that satisfies equation (1).


A solution to equation (1) exists if and only if y lies in the range space of A. In other words,

rank[A y] = rank[A] = r ............ (2)
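
A quick way to test this condition in MATLAB (a minimal sketch with made-up numbers):

% Hypothetical example: 3 equations, 2 unknowns
A = [1 2; 3 4; 5 6];
y1 = A * [1; 1];          % lies in the range space of A (consistent)
y2 = [1; 0; 0];           % lies outside the range space of A

rank([A y1]) == rank(A)   % returns 1 (true): a solution exists
rank([A y2]) == rank(A)   % returns 0 (false): no exact solution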


When this rank condition is satisfied, we know that there exists at least one solution. The general solution to equation (1) is given by

x = xp + \lambda * xh .............. (3)

where xp is a particular solution and xh is a homogeneous solution. Multiplying both sides of equation (3) by A, we get

A * x = A * xp + \lambda * A * xh = A * xp = y

since xh lies in the null space of A, so A * xh = 0. The dimension of the null space is \gamma = n - rank(A). Whenever \gamma > 0, that is, whenever the null space is non-trivial, equation (1) has infinitely many solutions.
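
This structure is easy to verify in MATLAB (a sketch with a made-up rank-deficient example; null() returns an orthonormal basis for the null space):

% Hypothetical example: 2 x 2 matrix of rank 1
A = [1 2; 2 4];
y = [3; 6];            % consistent, since y = A * [1; 1]

xp = pinv(A) * y;      % a particular solution
xh = null(A);          % basis of the null space, A * xh = 0

lambda = 5;            % any scalar works
x = xp + lambda * xh;
norm(A*x - y)          % ~ 0: still a solution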


Let's assume that we have a solution for the set of equations given by (1). Consider the following cases:

  • If m < n: fewer equations than variables. This is also known as an under-determined set of equations. If A has full rank, that is, rank(A) = m, there exists a null space of dimension (n - m), and hence there are many solutions to equation (1). You still have many solutions when rank(A) = r < m (see the quick check after this list).
  • If m > n: more equations than variables. This is also known as an over-determined set of equations. If A has full rank, that is, rank(A) = n, then the dimension of the null space is n - n = 0. Hence there exists a unique solution to the set of equations given by (1). If rank(A) = r < n, then the set of equations (1) has infinitely many solutions, as the null space becomes non-trivial.
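
A quick MATLAB check of the dimension counts in both cases (made-up full-rank examples):

% Under-determined: m = 2 < n = 3, rank(A) = m
A_fat = [1 0 1; 0 1 1];
size(null(A_fat), 2)    % n - m = 1: infinitely many solutions

% Over-determined: m = 3 > n = 2, rank(A) = n
A_tall = [1 0; 0 1; 1 1];
size(null(A_tall), 2)   % n - n = 0: at most one solution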

Update: Sunday, 21 September 2008

For m >= n (over-determined system), the least squares solution is obtained by minimizing the following cost function:

V = ||Ax - y||^2

which, when A has full column rank, gives x = inv(A'*A) * A' * y = pinv(A) * y. This is also known as the Moore-Penrose pseudo-inverse solution.
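
A small numerical check that the three expressions agree for a tall, full-rank A (a sketch with made-up data; in practice one would not form inv(A'*A) explicitly, it is written out here only to match the formula above):

A = [1 0; 1 1; 1 2];       % 3 x 2, full column rank
y = [1; 2; 2];             % generally no exact solution

x1 = inv(A'*A) * A' * y;   % normal-equation formula
x2 = pinv(A) * y;          % pseudo-inverse
x3 = A \ y;                % backslash (QR-based least squares)

norm(x1 - x2), norm(x2 - x3)   % both ~ 0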

One can use MATLAB's backslash operator to compute the least squares solution:

x = A\y

If A is rank deficient, then A\y will generate a warning message (not an error), and a least-squares solution will still be returned. The returned solution is a "basic" one, with at most rank(A) nonzero entries, and is not necessarily the minimum-norm solution.
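
For instance (a made-up rank-deficient example):

A = [1 2; 2 4; 3 6];    % 3 x 2 but rank 1
y = [1; 2; 4];

x_bs = A \ y;           % warns "Rank deficient", returns a basic solution
x_mn = pinv(A) * y;     % minimum-norm least-squares solution

norm(A*x_bs - y), norm(A*x_mn - y)   % same residual: both are least-squares solutions
norm(x_bs), norm(x_mn)               % but x_mn has the smaller norm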

-----------------

For an under-determined system (m <= n), the minimum-norm solution is given by

x = A' * inv(A*A') * y

You can also use the function pinv(), which computes the pseudo-inverse:

x = pinv(A) * y;

This agrees with the formula above when A is full rank and fat or square. (The pseudo-inverse is also defined when A is not full rank, but then it is not given by the formula above.)


Warning: If A is fat and full rank, then A\y gives a solution of Ax = y, but not necessarily the least-norm solution.
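
The following sketch (with a made-up fat, full-rank A) checks that the formula above matches pinv() and illustrates this warning:

A = [1 0 1; 0 1 1];          % 2 x 3, full row rank
y = [1; 0];

x_mn = A' * inv(A*A') * y;   % minimum-norm solution from the formula
x_pi = pinv(A) * y;          % the same via the pseudo-inverse
x_bs = A \ y;                % an exact solution, but a basic one

norm(x_mn - x_pi)            % ~ 0: the two agree
norm(A*x_bs - y)             % ~ 0: backslash does solve Ax = y
[norm(x_bs), norm(x_mn)]     % but its norm is >= the minimum norm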

The minimum-norm solution can also be derived using the Lagrange multiplier method:

Consider the problem of minimizing ||x||^2 subject to Ax = y. The Lagrangian may be written as

L(x, \lambda) = 0.5 * x^T * x - \lambda^T * (Ax - y)

Setting the derivative with respect to x to zero,

dL/dx = x^T - \lambda^T * A = 0

which gives x = A^T * \lambda. Substituting this into the constraint Ax = y gives A * A^T * \lambda = y, and hence \lambda = inv(A*A^T) * y. Substituting this \lambda back into the former, we get

x = A^T * inv(A*A^T) * y = pinv(A) * y
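
As a numerical sanity check of this derivation (same kind of made-up fat, full-rank A as above):

A = [1 0 1; 0 1 1];  y = [1; 0];

lambda = (A*A') \ y;    % \lambda = inv(A*A') * y
x = A' * lambda;        % x = A^T * \lambda

norm(A*x - y)           % ~ 0: the constraint Ax = y holds
norm(x - pinv(A)*y)     % ~ 0: matches the pseudo-inverse solution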


1 comment:

Swagat said...

The Moore-Penrose pseudo-inverse can be obtained by minimizing the following cost function:

V = ||Ax - b||^2 = (Ax-b)'(Ax-b)

This is also called the least squares solution for a set of over-determined linear equations.

 