Sunday, August 30, 2009


I think about risk as a research problem, but in reality, I take either far much risk or far too little, unable to measure it accurately. The day was sedate and needing an adrenaline rush, I made a quick call to go up a mountain in High Tatras even as dark clouds were threatening the summit, rains were evident, and I had on just shorts and a T-shirt. I reached the summit, soaked from the cold, heavy rain. Then, I had to get down. It was a strange experience: for every few yards gained going down, I would look back and see that much of the mountain disappear under the clouds. Thus, chased by the clouds and the vanishing mountain, I stumbled back and searched for a sports clothes store to get some warm clothing.

ps: One can check out High Tatras via live cameras, it seems.
pps: The picture on the left was taken on a different day.

Monday, August 24, 2009

Stochastic Data Streams

I am revisiting the area of streaming algorithms, and thought I will use the MFCS talk to push in a new direction. These days, I do research like cows chew their cud: I swallow an idea or a thought as it pops up during the busy day, and much later, when I have time, say during travel, I revisit the idea and chew on it. Much, much later, a paper or talk emerges.

The new-ish direction is that of stochastic streams where a distribution is given a priori. Then, items arrive, drawn from this distribution, instantiating a stream. The performance of a streaming algorithm on this instance is calibrated against the OPT on this instance, or its expected behavior over all streams. We need nice problems in this model, or separate it from the plain streaming model where the distribution is not known, or from plain stochastic model in which the distribution is known, but the algorithms do not necessarily have to be streaming, etc.

I have two potential examples in the writeup for MFCS, and one explained a little in the talk. Here are the slides (go to the end, this is a preview, it may change before the talk on Wednesday).


Saturday, August 22, 2009

Convergence of Sample Variance

Sample mean behaves like the normal distribution when appropriately viewed, and the classical Berry-Esseen theorem bounds the deviation for each sample size n (not just as n tends to infty); the convergence rate is like \sqrt{n}. We know that as n tends to infty, sample variance tends to the true variance. Is there a reference for bounding their deviation for each sample size n? If the samples are drawn from a normal distribution, then scaled chi-squared distribution can be used to bound the deviation.

Friday, August 21, 2009

Travel East

An airplane ride from hell landed me in Eastern Europe. I have a clearer understanding of what comprises Eastern Europe, less clear when people refer to Western Europe as ONE concept for culture, bread, cheese, chocolates, whatever. I got to ride bicycles with a 10 year old, boys are the same world over, so hands free of course, and got to do dirt bike riding on a bike with a child seat. Today will be the time to recommend Wild East: Stories from the Last Frontier. It is an expressive collection of short stories from former Eastern Bloc countries that will remind you there is literary talent everywhere, but it will do the reminder with fistfights, Gogolesque tales and vodka on the side. Soon I will hit Novy Smokovec where MFCS is being held. Friends ask me what it is like in the East, and I say, sartorially, it is like everyone, man/woman/child, is ever sports ready: if called by their club or country, they can instantly get onto the soccer field and kick, that is, Adidas rules. And I have learned not to finish a drink lest the glass be refilled, Mihai seems to have chosen to forget this lesson. :)

Thursday, August 20, 2009

Streaming Algorithms

I once asked my favorite sushi chef (You know who, David Applegate) what he would like to do in his life. After climbing and conquering the cruel mountain that is the NY Sushi Scene, what is left. He said, "I would like to specialize in one fish." That is poetry.

I have been thinking about streaming algorithms again some, and the question is, is it necessary to limit the model to sublinear space for it to be nontrivial, interesting. Doesn't it just suffice to limit per-item and query processing times to be sublinear? Now this question never occurred to me before because I have been focusing on one fish, that is, vector signals on streams. In that case, if we allow linear space, many of the problems become simple, sub-FOCS/STOC I mean. The past week I thought about other fish , ie., streaming problems in general (graphs, matrices, etc) and realized that even without restricting the space, the problems are nontrivial. The problems become full space data structure problems where one still needs approximations to be sublinear in time. This is not a novel realization, but makes me take a deeper breath when explaining the stream model of computation.

ps: I will be at the 34th MFCS conference next week, speaking on algorithms for stochastic streams.


Saturday, August 15, 2009

On the Water

I like rowing, canoeing, kayaking, sailing, and other ways of being on the water. On the left is me on The Bronx River. I am with a team of Engineers. Reminded me of 10+ years ago when I went white water rafting with some Lucent Engineers. 7 of us in frothing water, and each one of the Engineers thought they *individually* control the boat. Each rowed hard and when the boat turned slightly one way, each independently rowed backwards to straighten the boat!


You drink coffee, amidst the swirl think of a problem, and you feel something must be known already. Here are two examples (refs, welcome).
  • Suresh has an ongoing treatise on clustering. Say you have n points to put into k clusters with some objective function O_k (eg, sum over points of the distance to closest center). What I want to optimize is: O_k if k or fewer clusters, but O_k' (k/k') if k' > k clusters are used. Is this a well studied variation of clustering?
  • This is a marriage of secretary type and stable matching type problems. Say there is a fixed set of women with their preference orders. Men arrive in some random order and go to each women who must accept or reject. Any way to resurrect some approximate notion of stable matching?
Disclaimer: I havent thought about these problems, and most likely, they will need more formulation.


Sunday, August 09, 2009

Price of Social Life

I coax a bunch of Market Algorithmicists to lunch on Fridays and survey the state of the world, and conversations go from theorems to products, organizing events, chores and other things. This week's group comprised Nitish Korula, Mallesh Pai, Sudipto Guha, Eyal Even-Dar, Yishay Mansour and others, and led to an interesting discussion: If you were a social network where people updated their activities, friends, families, college and work lives, what could you do?
  • You could run online marketing studies with large number of people in each target group and see what products/ads they liked. Will it be biased?
  • You could recommend products, events, friends whatever based on your social footprint. To what extent will privacy be compromised?
  • You could let friends recommend sponsored material and be advertising conduits. What is the social cost and benefit?
  • The Economist observed that social networks were really just soap opera, and people should be charged thus, that is, shown ads at intervals while on the network, like in TV.


Wednesday, August 05, 2009

An Indian Man's Burden

Man, woman, burden, boon, whatever.
  • You walk through the Washington Square Park and the chess players in the corner assume you play chess of course.
  • You go to a comedy club and the cast does "Dell IT Support Guy" jokes on seeing you.
  • You get frustrated trying to answer Bruce Maggs 's: "Name an Indian currently living in India who many US Americans would know and associate with being from India."
Anyway, humor is hard. Onion does a gem, here it is verbatim:
"Upon arriving late to his meeting with President Barack Obama and famed African-American intellectual Henry Louis Gates, Cambridge police officer James Crowley once again detained the distinguished Harvard scholar after failing to recognize the man he had arrested just two weeks earlier, White House sources reported Thursday. "When I entered the Oval Office, I observed an unidentified black male sitting near Mr. Obama, and in the interest of the president's safety, I attempted to ascertain the individual's business at the White House," Crowley said in a sworn statement following the arrest. “The suspect then became uncooperative and verbally abusive. I had no choice but to apprehend him at the scene for disorderly conduct.” Witnesses said that Sgt. Crowley, failing to recognize Gates on their flight to Logan Airport, arrested the tenured professor in midair, once again at the baggage claim, and twice during their shared cab ride back to Cambridge."


A stellar team of Cornell + Northwestern researchers is organizing a NSF sponsored event on Research Issues in the Interface of Computer Science and Economics in Sept. Program looks great. Should have tweeted?


Auctions and Scrabble

WSJ blogs about research work by Mallesh Pai who I work with in NY, Northwestern U. Economist Jeff Ely and others. See Jeff's blog about the genesis, and the theoretical open problems. At the high level, the rules specify an auction for each new tile in scrabble and support it by money from the word scores on the board. The complete rules are here.