Sunday, April 16, 2006

Matrix Approximation

I went through a phase in college when I had a preference for looking at problems through the linear-algebra lens. In the past two years, I have been looking at data that has a nice matrix form where rows are subjects and columns are attributes. The matrix is large and I need a summary in order to understand it. The typical way to do this is to get a low-rank approximation which will give a few "directions" for an analyst to peer at, but unfortunately, these directions are linear combinations of the column (or row) space and typically, it is hard to understand intuitively what they represent. It will be incredibly useful if one could find original columns or rows that summarized the matrix.

In the past few months, Petros Drineas, Michael Mahoney and I have worked on formulating this problem and solving it. Given matrix A, we are able to quickly find poly-k columns C (or rows or both) such that a matrix A_C obtained from these columns is a (1+\eps) approximation to best k-rank matrix A_k for A. Formally, ||A-A_C|| <= (1+\eps)||A-A_k||. The columns C then is a useful set of attributes for an analyst to understand and interpret. The technical result is here from November 2005. ||.|| is the Frobenius norm.

There is a somewhat related problem in theory community: how to approximate A_k? The A_k that minimizes ||A-A_k|| is given immediately by SVD. In theory, Alan Frieze, Ravi Kannan and Santosh Vempala asked if this could be approximated in a few passes. Since then, a lot of published work has shown how to get a matrix representation A' of A such that ||A-A'||<= ||A-A_k|| +\eps ||A||. This large additive error is discouraging and the question that remained open was, are there relative error approximations in few passes or in time less than it takes to solve the SVD? My result with Petros and Michael above provides such a matrix representation, but takes time to find largest k exact or approximate left-singular values, which can be done in O(k) passes using methods in Numerical Analysis. I mentioned my result above to Sariel Har-Peled at FSTTCS 2006 in India; at that time, he promised he can produce an A' such that ||A-A'||<= (1+\eps) ||A-A_k||. He delivered: his writeup is here. This does not seem to address the problem Petros, Michael and I worked on since it does not output original columns or rows, but produces linear combinations. Quite recently, Amit Deshpande and Santosh Vempala obtained a similar result but focused on using a small number -- O(klog k) -- passes, in some sense culminating a long line of research on low rank approximation using a small number of passes. It seems we can derive another solution to our problem from their result.

Summary: We are in a better place than we were a year ago when only additive approximations were known for low-rank matrix summaries. Both problems now have relative error approximations in near-linear time. And the fun part is, each of these results use a different technique. Bonanza! Those who are interested should look at these papers since there are many details.

Anonymous said...

The approach of Har-Peled is essentially identical to
the one in the SODA 2006 paper by Deshpande et al.
(they call it adaptive sampling).

5:43 PM
Anonymous said...

I really enjoyed looking at your site, I found it very helpful indeed, keep up the good work.
»

9:53 PM
Anonymous said...

Nice! Where you get this guestbook? I want the same script.. Awesome content. thankyou.
»

1:29 AM
Anonymous said...
1:57 PM
Anonymous said...

Wazzup people. Let's take a look. A great sollution for you.
hemorrhoids relief
hemorrhoids pain relief
hemorrhoids relief treatment
venapro
review venapro
hemorrhoids herbal treatment
stop smoking aid
nicocure
nicocure patch
stop smoking
stop smoking aids
stop smoking drug
stop smoking medicine
stop smoking patch
meds to stop smoking
new stop smoking drug
stop smoking product
stop smoking treatment
Don't delete this. Thanks!

9:51 PM
Anonymous said...
10:41 PM
Anonymous said...
12:35 AM
Anonymous said...
2:23 PM
Anonymous said...
7:55 PM
Anonymous said...
7:07 AM
Anonymous said...

Hi All. Take a look. A lot of fantastish beautifull womans.
princess cameron
nikky blonde
lucy summer
gauge samantha
crissy moran
crissy moran clip
jesse capelli
daniel lisa
laren shay
terri summers
diamond mya
rachel aziani
mandi rose fanelli
tawny roberts
gianni taylor
ashley simply
kaylani lei
cherry potter
christine young
christine young porn video
kelly summer
melissa doll
I like them. Don't delete this. Thanks!

7:20 AM
Anonymous said...
6:43 PM
Anonymous said...

Hello. Very interesting, really!
http://phentermine.alldating.org/phentermine.htm phentermine
phentermine online
phentermine
cheap phentermine
Its'not a spam, help sick children! I am very sorry!

12:27 AM
Anonymous said...

Hi. Very interesting, really!
cheap phentermine
Don't delete. Help homeless children, thanks

10:10 PM
boon said...

I don't know if this is related. Given a fix vector x, I want to find a matrix A such that A^(-1)P A x is a minimum. where P is a projection matrix into a smaller subspace of dimension k. And Oh, I forgot to mention that A has to live in SO(n). Looks like a minimum/maximum problem but it probably requires some rewording.

8:08 AM
thin hair said...
5:14 AM

There is now a new quit smoking drug available in the market. This latest breakthrough is known as Chantix. It is able to help smokers snub out their addiction by working on the brain.

10:23 PM
hannah said...

I lost a lot of weight by taking Phentermine that I ordered from WWW.MEDSHEAVEN.COM i am very satisfied!

9:37 PM
chantix said...

Do you know why chantix is so popular for quite smoking. Because it gives very good results and FDA approved Chantix Pill for Quit smoking !

5:09 AM
Mobile Application said...

My friend recommended to this blog.... you have some awesome articles shear. keep it up the great work

android application | android web application

5:18 AM