Wednesday, October 31, 2007

NY beats NY

Halloween parade just finished, the floats have gone by, staged statements have ended, the cops have moved the barricades and cleared the streets, and the cleaning trucks are out sweeping up the city, but the theatrical mood of the village persists. This year, like many others, NY outdoes NY. The parade is only a small part. Everyone does a little something or a whole lot of big things, and fill the city block after block. The whole city of strangers collaborate and the result is a make-believe world of outsized masks and wigs, striking boots and costumes, self-made or paid-for, and just personalities. Some pictures from 06 here.

Typical CA


I was in the Bay area for a few days. It was cloudy and grim as it is nearly every time I visit, with even some rain. This visit was unique: Over dinner yesterday, I managed to feel the vibrations, surmise relying on the collective consciousness of people around me that it was an earthquake, sprint out and feel sheepish bonding with others in the same situation. Magnitude 5.6 earthquake, and other stats at usgs site. Best things about CA are one's friends, aged, ageless, and new.

Saturday, October 27, 2007

XLDB, XXL

Yeap, not Very Large Databases (VLDB) but XLDB. There was an invite-only meeting at the Stanford Linear Accelerator Center organized by Jacek Becla and others on such monsters, petabytes and up. It was a conversation involving
  • scientists who generate petabytes on large scale experiments (sky telescope imagery etc) and have to process and make them available to the community,
  • univ professors, chiefly, database (stonebraker, dewitt)
  • companies that have experience (ebay, yahoo, google, at&t) and
  • vendors who cater (oracle, ibm, teradata, netezza, objectivity).
It was a quintessential conversation. The scientists told us they had data problems, and had to generate petabytes scale infrastructure on small budgets (most of their money goes into building the devices) using grad students and postdocs. The database professors complained that the scientists' problems did not have many commonalities and needed to abstract further to get useful data models and operations. The companies with experience said they could process such large datasets today, with their resources: disks, network connections and many, many people; they were curious why there was a challenge! Scientists complained that they were a small market for vendors and that when database research students worked with them, they produced papers and no prototypes, certainly, no products. The vendors said they were keen on working with the scientists, to get some experience to develop new generation of tools that may be useful for a variety of applications (if any).

Hence, many realities collided. What emerged was many opportunities to learn, collaborate and build for longterm future. Curiously, many mentioned mapreduce+SQLengine as the answer. Decades of database research and the current hammer is mapreduce? IBM+Google are partnering to bring Apache's Hadoop (open source implementation of mapreduce) to university students.

Let us Assume and Accept

I caught a few minutes of the 201 version of the social sciences take on America by Mr. Wuhl in a classroom format on HBO. Mr. Wuhl does a taut, standup routine, debunks and rebunks myths of pop culture and history. He spins fun stories, and truth or not, explains many things (why do we hold a middle finger up: the English demonstrating they can still "Pluck U" to the French, it is a long story, imagine one of the many wars and the role of the middle finger in plucking longbow arrows).

I was reminded of such an anecdote, this one due to Inimitable Luigi Laura. The English flag is the Cross of St George, patron saint of England and curiously, Genova. Story is, English ships borrowed the Genovese flag of the Cross of St. George to be under the protection of the powerful Genovese kingdom in the Mediterranean, and later adopted it as a national flag. Apparently, money was involved in this "borrowing". Is it true? Who cares, it is a fun story. Here is the "research", and some "quote".

Thursday, October 25, 2007

Sampling Research

To theoretical computer scientists, random sampling comes naturally. We think it is simple, elegant even, and useful. In practice, random sampling presents problems. If you walk into a cafe in NY, 90%+ people have macs (and look stylish); walk into an airport, and 90%+ people use windows machines (and look much less chic). Any estimate you get from these "samples" by themselves or jointly for the market share of these products is (doable but) difficult. Worse, I have been working the past year or so putting sampling primitives into a database engine. Different users need different kinds of samples (eg., biased, distinct, fixed size, persistent, over different attributes, with or without replacements), no single sample meets all requirements and maintaining all these different samples is a systems nightmare. Also, users want the same answer if they rerun a query! Take all this into account, and putting sampling primitives into a database becomes a research problem.

When work rests

I was asked what I do when I do NOT work. That makes one reflect, and no professional comes out of that unscathed, however resilient their ego and dedication to science, math or music. The only time I don't truly work is when a person sets fire to my hair, or I hold a paint brush, green-tipped and wide-lipped.

Tuesday, October 23, 2007

FOCS and Sundries

Woody Allen (What's up, what's new: his early films) said, "80% of success is just showing up". I did. FOCS for a day was fun. Among other things, I heard Mihalis talk on his paper with Kousha Etessami, On the complexity of Nash equilibria and other fixed points. The talk, despite technical difficulties (was it Warren Schudy who was helping out?), started with a simple Nash equilibria question and pretty soon included some nice fixed point theorem stuff, sophisticated questions about real-valued computations and myraid complexity classes. Note to self: All relevant. Must read the paper.

Anecdote: MM cited RC as saying, "I have this algorithm that will solve all your problems". I asked, "Greedy?".

Doing napkin research: Yesterday, over lunch, I was trying to understand a matrix low-rank approximation problem a colleague posed, using the tools I had to do algebra, and Michael Riley asked me if it was a 2-napkins problem. We discussed if there should be designer napkins for doing math over meals.

Furniture Inc

While the world is in excitement about one thing or the other (say the Colts and Pats who are going to pound each other in 2 weeks over the excuse of the pigskin), or in angst (the fires in CA, War funding, Polish elections, whatever), some have asked me about furniture and the art of shopping for them. Here is a link to shops, their provenance, and may you window-shop well today.

Sunday, October 21, 2007

Providence and FOCS

Travel to Providence, Rhode Island, by Amtrak was scenic, and the travel back appropriately late and irksome. The town was a bit of discovery, a place caught in a pleasant moment of the fall's light and clarity. There were lifesigns via posters, bar signs, the temple-esque conf hotel, and the saturday waterfires that, alas, I missed. Clocks were serenely stuck. Grateful thanks to Anna, Claire, Phil and others, who with prodding and personal investment of family and time, hosted the FOCS conference.

Eigenvalues, Spielman Tutors

I made it to Providence, RI, in time to catch friends and the final tutorial at FOCS. Dan Spielman, dangling jewelry on his left ear, gently introduced eigenvalues and eigenvectors of matrices (adjacency, Laplacian) associated with graphs, used the "first" few eigenvectors to embed example graphs (polygons to polytopes) and drew the audience into the area nicely. He discussed using "large" eigenvectors to color nodes and pointed out the difficulties in using eigenvectors to solve the graph isomorphism problem (sign changes of eigenvectors and multiplicities of eigenvalues end up generating the problem of having to check exponential number of permutations). He showed the application of using eigenvectors for graph cutting to image segmentation (which has been a rage in the vision community for some time now), thanks to his daughter's image, and argued that the spectral partitioning method is useful in incorporating more of what we know about images (or the input data/domain). Planted versions of partitioning problems arise nicely, and he discussed eigenvalues of matrices associated with random graphs. He pointed to two large areas for research: eigenvalues (singular value substitutes) for directed graphs (check out some work on Dirichlet eigenvalues) and analyzing gaps in eigenvalue sequences, with emphasis on ultimately understanding graph properties via such research. He wrapped up the tutorial by discussing eigenvalue computation: power method, inverse power method, Lanczo's and other gems. There are some powerful open problems (sparsifying graph A by B, wavelets for graphs, etc). Check out the website Dan promises to set up soon with slides and slick code. It was a terrific tutorial.

PS: FOCS 2007 search on google reveals a (promoted!) sponsored search link to Krzysztof's masterpiece.

Friday, October 19, 2007

FOCS or STOC

For a moment, let us leave FOCS or STOC vs SODA and instead take on FOCS vs STOC. I have heard researchers claim incredible bias towards one or the other in their submissions (I produce most theorems during the summer and hence submit mostly to STOC), acceptances (hence, alas, the yang of it, ie., rejections), attendance (locations?) or in even their conference talks, or whatever. Is there one?

Colors and Comrades

India, in my flickering mind, is an irking image of contradictions and extremes, between explosive friendships and subdued families. Somewhere in there is the festival of Holi. When people think Holi, they typically focus on the colors which is awesome, but it is also a great leveler, everyone (kids, girls, grand men and women) celebrates with very simple technology: color powders, water squirts, and huge smiles. Type in Holi into flickr or picasa and see some great pictures.

Thursday, October 18, 2007

Demo

What does "demo" mean? A few years ago, it meant, "demo" of a prototype, feature, or a product, i.e., a demonstration. The I renovated my apartment and when the contractor said, we "demo" the place, I learned that it meant "demolish". Now I hear at work how a strategy may "demo" ads, and I realize that means "demote". How can one to quickly find out how many words in the dictionary begin with prefix "demo" and what is the prefix that comes after the prefix "demo"?

Nobel Prize

The Nobel Prize for Economics this year went to researchers including Myerson (gets cited a lot in theory CS papers for his paper on Optimal Auction Design) who developed mechanism design theory. I liked the Washington Post's title for this story (poetically, it talks about flawed markets, the article is informative, and, as befits the paper of the nation's capital, discusses an example with elections).

Wednesday, October 17, 2007

0's and 1's

Writers say and believe: The Universe is made of stories, not atoms (quote, on the left). To computers, the Universe is made of 0's and 1's, and in many subliminal ways, that is true for Computer Scientists as well. Torsten Suel, after a long day of C coding, went to buy a ticket to travel on NJ transit trains, and when the machine asked to enter the number of tickets he needed, he kept pressing 0. You have to program all day, or be a programmer to understand that. Recently, Andrew McGregor, spelled "1" as "zero, one, e". One Zero (NO), I have no explanation.

Sunday, October 14, 2007

Good Cop, Bad Cop, Great Cop Movie

You need something for sunny Sunday afternoons, some explosive action.

Any Given Day


On any given day, when I wake up, get my coffee fix and a hot shower, summer or not, I think of the day, look ahead and search for the moments. I know I would be in meetings and the moment will call for me to produce a micro-, mini- theorem; I know I would walk along the village, take the sidestreets, and there would be moments that will distract me from the walk, a model photoshoot, or a homeless person's possessions on wheels; I know I will take a break and stand in the corner making my telephone calls, and someone on the street will get my attention with their shoes, walk, or animated talk; I know I will walk home, on a good day, in the twilight time like Romare Bearden to the left and see someone on the subway singing opera or a sleeping beside a guitar; each day is a collage and I, perhaps like many others I see every day, look for moments, in some units of iota.

Saturday, October 13, 2007

Deterministic Compressed Sensing

Compressed sensing continues to thrive, with young and established, Computer Science/Analysis/Communication theory/Applied signal processing researchers continuing to generate new results rapidly.The Rice site manages to keep pace, and the algorithms community is well-represented. Recent algorithmic results from upcoming SODA08 include:
  • Mark Iwen's "A Deterministic Sub-linear Time Sparse Fourier Algorithm via Non-adaptive Compressed Sensing Methods". Mark improves upon prior deterministic compressed sensing results by using better grouptesting family of sets.
  • Piotr Indyk's "Explicit constructions for compressed sensing of sparse signals". Improved constructions using expanders. (Though not immediately related, check out his talk.)

Journal Papers (Ex: Least Squares Approximation)

The world falls into two categories: those who write journal papers, and those who don't. Like in all such statements, alas, there are notes: some first write journal versions and cut them down for the conf; some write journal versions of "important" results, "invited" papers, or papers with students or whatever refinement one finds suitable to fit their dataset of journal publications.

Recently I made a joint journal submission with Petros Drineas, Michael Mahoney and Tamas Sarlos. Given an n X d matrix A and a d dimensional vector b, the least squares problem is to minimize ||Ax-d||_2 over all d dimensional vectors x. Methods dating back to Gauss and Legendre find a solution in O(n d^2) time. The problem can be reduced to (n/d) MM(d) where MM(d) is the time to multiply two d X d matrices, which means the best running time is O(n d^{1+c}) for some ever-dwindling constant c.

We show how to get (1+ε) approximation in time O(n d log(d/ε,n)). This combines our SODA06 result on constructing a coreset for the problem with the Fast Johnson-Lindenstrauss of Ailon-Chazelle used by Sarlos in his FOCS06 paper into what is the best result we could obtain. The full version is at the arxiv, meandering through the journal process. Comments welcome.

Tuesday, October 09, 2007

Talks on Metric Geometry

Norbert Wiener had an incredible math career centered in Boston and meandering into and mostly out of math for military with his letter A Scientist Rebels. Assaf, centered in NY, is giving the Norbert Wiener lectures in Mathematics at MIT in a few weeks, a triptych of talks on Banach spaces and applications to partitioning problems.

Monday, October 08, 2007

The Blue Note, With a Malian Sigh

I don't typically go to The Blue Note : it is expensive and prepaid, reserved or not, they make one stand in a long line and let people in order they arrived; in NY, I can see a great show for $20 and instead of waiting 15 min, one can easily replace an evening of jazz with one of blues, funk or mere bar trio impromptu. Still, Dee Dee Bridgewater performed from her CD of Malian Journey, the balafon, kora and djemb's brought me out and I worked through the line at the Blue Note and listened to American jazz reconnect with Malian sounds. The performance had storytelling (in the tradition of Griots, oral history keepers from 12 or 13 century), some sublime moments and I got to shake hands of the stunning Dee Dee. Mamadou Cherif Soumano tired from several days of performance, yawned, rubbed his eyes, and then proceeded to play kickass Kora. (In my dream-like state, I thought Habib Koite joined in the performance, but that was not the case I am told.) All music and music business is ultimately local and personal. One of the talented band members lives in the neighborhood so he walks over to his place between sets!

Sunday, October 07, 2007

Easy, EZ, and now EC 08.

The call for papers is out for ACM Electronic Commerce (EC) 2008. Deadline is Feb 7, 08, four holidays-filled months to go. Jason Hartline and Nicole Immorlica who are moving to Northwestern Univ will host it at Chicago, together with the PC, presumably, ACM, as well as Rakesh Vohra and Lance Fortnow who are already at NW. This conf, I learned, precedes the Congress of the Game Theory Society (cool!). One suspects the number of submissions is going to be a record.

Saturday, October 06, 2007

Bookstores

There are bookstores you go into to grab the Harpers and gum or the Atlantic Monthly and a ta'im' Twix bar. There are other bookstores you go to to pick up the books you know you want, and you will find Murakami and Nabokov, conveniently close to each other. Others you go to for coffee perhaps. But still others transcend: you go in to get versions marked up by reviewers, for signed first editions, for rares, and finally the rare others where there is no book or author you recognize, your world is turned upside down, the covers outside and the words inside allure and you rush home even while leaves fall all around you so you can just write because you must.

Animators

On the left is my favorite animator, the old school, Jan (pronounced NOT as Jan or Han but as Yan) Svankmajer: a film of three episodes where frolics and fights go wrong disturbingly, rock-paper-scissors-clay style. On the right is the relatively modern masterpiece I saw in Bonn more than 15 years ago: voiceless piece of world mathematically maintained in balance, and the Carmen-esque intruding treasure in Red.



Names

I get into obsessive discussions about names more often most, I'd think. I know a mathematician who had a dog (or cat or hamster or whatever) and he called it Lemma. About children, I always thought naming them X, Y, X^2, Y^2 etc has certain chromosomal sanity and scalability, and will generate polynomial families. Then there are jokes that are somewhat ethnic in which you use a generic name (say Ivan for simpleton Russian, smart Jack or John in southern US who defeats Ole Massa, trickster Malese in Haiti) or there are generic jokes in which you use an ethnic name and localize them. For example, the name in the jokes here can be a dynamic variable that can be swapped for any user-specified name string.

Monday, October 01, 2007

Transformations

Braids get taken out and leave behind puffy hair, relieved scalp and a feeling of starting anew.