Saturday, February 28, 2009

Ending Feb 09

Feb 09 is not over yet, but I saw its end with my travel to Boston to give a talk. I had good conversations with Salil V, Michael M, David Parkes, Leslie V, and others. I spoke about internet ad auctions, so there were some in the audience from internet companies (besides the CS/Ecom/Sc types) and you can tell by the questions they ask. I realized, I really should write up open problems and directions that keep emerging.

Science research tries to "model" a complex system by something simple, accurate, amenable to math and predictive. My hypothesis is that in complex internet systems (such as search, ads), humans change the underlying system faster than researchers can abstract and understand models, and hence, we should abandon the "model" based approach to research. For example, someone models Internet search users, but alas, an Engineer adds a new vertical search (say showtimes, products, celebrity profiles, whatever) and the models no longer apply. Diehard modelers will tell you this means we are just not modeling things at right granularity. And then they ask, what is the alternative? Experiment and tweak. As Leslie pointed out, I seem to be making a giant case for Machine Learning research. With a giant tweak that humans in the loop means humans alter the system, strategically.

ps: David Parkes, ever smart even over dinner, pushed me to a corner, saying experiment-and-tweak might just have us hike (climb?) the hills. Without models (inaccurate, non-predictive, mathematically amenable), we might never have new starting points.


Wednesday, February 25, 2009

Internet Ad Business meeting

The Interactive Advertising Bureau (IAB) annual meeting just finished (agenda). This year announcement includes new ad features from Yahoo! Inc. A lot of focus on display advertising, more than sponsored search.

Tuesday, February 24, 2009

Watching the world in reverse

A friend told me a long time ago about this old man who locks up the local cinema hall in a small town. He comes in everyday say 5 minutes before the last show ends, watches the final few minutes of the movie, then when people leave, cleans and locks up the hall. And he has only seen those final few minutes of the movie The Usual Suspects. He knows who is Keyser Söze, but not the legend.

I left like that yesterday. Flipping thru the NYer (March 2), I came across a letter from a reader (Matthew Butterick) of LA) in which the author laments the scoring system for the letters in Scrabble, or rather, that it hasn't been adjusted for inflation with the growing dictionary. "Meanwhile, the fifty point bonus for using all seven tiles, what was one a home run, now seems like a ground-rule double." Now I wish I had read the original article by Judith Thurman in the Jan 19th issue "Spreading the word".

Finding Mistakes in Typing

When we write (ie, type), we make certain common mistakes.
  • In haste, we may end up typing in xy instead of yx, eg, attitued. I formulated a string matching problem in presence of such character swaps, and the problem was researched a lot in the past 14 years since. We can write a script to automatically find and modify such mistakes.
  • Sometimes you end a line in .tex file with a word w and by mistake start the following line with w also. Like, ..... the \\ the ..... One can again write a script to automatically find and remove such duplicate words. On a related note, Rao has a paper on an efficient algorithm for finding all ....ww....'s.
  • My problem is subtler. I type in a sentence, go over and change it, reorder the subject into object, move phrases, change tenses, voice, etc. Invariably pieces of strings get left behind, and the full sentence becomes nonsensical. It is difficult to find such semantic mistakes automatically and correct them, you may miss them even on reading over.



Theoretical computer scientists can fail (as a researcher) if their proof is wrong. An applied algorithms person though, your proofs can be correct, theorems deep, and you can still fail. It is like being an engineer, your assumptions have to hold. Here are examples:
  • I once gave a talk on scheduling problems that arise when you use broadcasts from the base station to wireless laptops. Broadcast means if you have to send the same bits to many laptops, you can do it with a single copy, so it is efficient. It also gives nice abstract problems, and I was on a roll, showing 2-opts or ptases. In the audience was a systems researcher who slayed me with a simple question: how do you know the bits were received correctly? Well, each of the laptops should ack. Then chances are a subset of them will fail. So, you have to rebroadcast, now to specific subset of laptops, and iterate. The model then becomes messier, results less clear.
  • I once heard someone give a talk on network routing in presence of edge deletions. The motivation was if a truck ran over a cable and cut it, the link between the two routers will disappear. So, you want to compute reroutes. A systems researcher in the audience says, but physical links are bunched together into large cables, so (a) if there is a cut, bunch of links will go down simultaneously and (b) while you can determine the logical links between routers easily, say by looking at the routing table, it is difficult to get the operations people to tell you how they bunched wires together. That is, physical world of wires is different from the logical world of links, and indecipherable. Alas.
  • Then there are times when you look for a theory and fail. I looked at problems where the longer you run the algorithms, better the approximations. The quality of solutions found by the algorithm is like a counter getting progressively close to the OPT. You can stop the computation any time you are happy with the accuracy. A bunch of us called his "progressive algorithms" and spent a long time searching for a theory of such algorithms. No luck.
I can tell you more stories, instances where theory triumphs in theory and a question or comment from sidelines derails the purpose.


Thursday, February 19, 2009


We all know "publish or perish". I have been playing with "launch or lose" and "grants or go", any other/better pithy slogans that describe a researcher's life?

I realized this week in a meeting that we have people who see the tree and those that see the forest, neither is useful. We need people who see a tree and know it is in a forest.


Wednesday, February 18, 2009

Pleasures of a day off

I like true midpoints, like the Wednesdays. On any given Wednesday, you have left behind even the memory of the past weekend and can't lurch to the next one just yet. I took a day off and went to the lower east side to see the Tenement museum. It seems not only the immigrants' experiences, but the template of everyones' experiences remains the same over time and across generations: find a starter job (garment or restaurant business), pay rent (1/3rd monthly income, now an Obama law), measure up/down (productivity, window space, net worth) and survive the tussel between the government and the collective whims of the people with labels (such as "Not made under sweatshop conditions." Will I see those labels on software anytime?). In early 1900's they did it in cramped spaces with wallpapers, in early 2000's we do it in underwater spaces with sparse walls. When the museum closes and the rain continues, a short walk down the Orchard Street gets you to the Good World bar (a stylish swedish grill with an industrial look) before the crowd gets there and one gets to quietly enjoy the sea salt fried shrimp done with care.

Computer Science Research & Economic Crisis

Others have asked me and I have asked myself: how can Computer Science research help in this Economic crisis? The crisis is serious and let us assume our economic landscape, for a long time, will be quite different from what we have seen in the past. Perhaps our research can help realize a more robust economic system in the future. What are potential research problems that arise? (for ex: can secure computation work help verify pieces of data? can optimization methods help estimate the value of the underlying securities better? is there a new predictive theory of market dynamics? will a giant database help?) Ideas welcome. (Some discussion of how Science can help, here. The Economists are probably searching their souls).


On Learning

I remember a fairy tale (Cinderella, in a modern form as Raisel's Riddle) that ends with "And they lived and learned happily ever after". I like learning, and browsing recently, I came across Jon's course on networks, cross listed with CS/Sociology/Economics/Information Sciences! It is a challenge to put disparate material together --- from information to social networks, game theory and dynamics, markets and policy ---- into a cohesive course such as this, and the list of topics here is great. In order to be a researcher in this crossroads of an area, you should have the insight of a scientist to ask the right questions, the talent of a computer scientist for creative problem formulation and the technical strength of a mathematician to produce concrete, difficult theorems and results. And you have to be a superlative teacher to communicate all that in a classroom to the students. Looking forward to the networks book.


Sunday, February 15, 2009

Sunday Musings

* I searched my gmail for "cpm". Bad idea! I was looking for CPM 09 conference related email, but ad auction researchers know what my mailer pulled out in large numbers.

* After flip-flop-flips, exactly how much does the stimulus package provide for basic research? Blogs reveal: $1 billion total for NASA; $3 billion total for National Science Foundation (NSF); $2 billion total for Science at the Department of Energy including $400 million for the Advanced Research Projects Agency—Energy (ARPA-E); $830 million total for the National Oceanic and Atmospheric Association (NOAA). There is a clause that gets the money spent quickly, like in a few months, I hear.

* Sunday without Times? Neah. What are great green CS problems? I was reminded of this by Thomas Friedman's OP-ED piece about a ride through India in an electric car, discovering local, green innovations. (btw, where in India can you charge an electric car for 6 hours a day? :))
“Why did this tour happen?” asked Ringwald. “Why this mad, insane plan to travel across India in a caravan of solar electric cars and jatropha trucks with solar music, art, dance and a potent message for climate solutions? Well ... the world needs crazy ideas to change things, because the conventional way of thinking is not working anymore.”


Monday, February 09, 2009

Japanese Ken

Times, i.e., Will Shortz, started daily dose of a new puzzle called KenKen (loosely, cleverness, cleverness). It is basically a Sudoku-like square of permuted digits but each bold subsquare comes with an arithmetic operation (+,-,*,/) and a purported total that numbers within that bold subsquare should yield when the operation is performed. These operations, when not commutative (-,/) can be applied in any order to the numbers in the subsquare. Rest of it is as boring as Sudoku. May be people will study the complexity of this game too.


Sunday, February 08, 2009

House of Performances

Someone told me my blog was really about NY. And complained that I have been remiss. Well, NY is a house of performances, but a weird one. The Garden everyone knows is on 32rd. The Kitchen is a great performance space, but up in 23rds. The Forum is down where the village begins, and is running a terrific series -- Breadlines and Champagne -- of 30's movies about the other financial crisis. Finally, I was at the Living Room, on the lower east side, diametrically far from the Kitchen, where I entered when The Flanks were, having done the hard entertaining, just jiving, wrapping up, crooked country and crazy. Looking good in tufts of hair and casual jeans that only youth can take for granted. My friend Elaine played next, she is all voice and sparse guitar, just the way I like it. I know she offers more --- lyrics, muscles and the band --- but I hear just her voice and the unintruding guitar. It is a scrambled house.

Frivolous Thoughts on a Sunday

I saw a woman last night in a flitty red summer dress and sandals, walking down the village at a time when the movies finish and people spill out seeking slices of pizza. Maybe the cold is over for this winter in NY. Today there is a steady cool wind that reminds one of stories from far away places, bright sun and long shadows.

Beards, the kind related to hair, lead to stories. There is one about how love is the moment when a man who needs a shave becomes the man with a beard. I grew a beard for a few days and looked threatening or unkempt depending on one's biases. All along I looked forward to that moment when hair got to the right length and would meet the razor. The hair will get shredded of course, but would at least it would grate the razor dull. The match will be all even, scores tied, love all.

I got an email: Dear Dr. X, We have learned of your published research on chocolate. We would like to invite your participation... This came from Frank Columbus of Nova Science Publishers, Inc. This would seem to be part of an elaborate book publishing racket.

Friday, February 06, 2009

Market Algorithms and Optimization

Many algorithmic and optimization problems are involved in designing Internet markets (for ads, apps, storage, services, whatever), setting up and managing them, and ultimately making them vibrant. In a recent meeting at Google NY, a few researchers met to discuss the underlying theoretical issues in this area of Market Algorithms and Optimization. See the blurb on the Google Research blog for details and links. The blog may be of interest in itself.


Thursday, February 05, 2009

On Smiling

People say I smile. I understand it is just an observation in some cases and in others, a little of racialization. I was thinking the other day, besides race, culture and other factors, peoples' smiles are defined by when they learned to smile. Some have the smile of a boy or a girl, others that of a teenager or grownup who modified their smile to their changing face in front of a mirror. I like people who smile, totally native.

More Stream Models

On doing literature search (aka catching up), I noticed some developments in data stream models.
  • MUD (Massive, Unordered, Distributed) model is an abstraction of a special case of MapReduce. Most (approximate) streaming algorithms work in the MUD model immediately. A recent work by Nanongkai et al. in progress seems to generalize this to MUDB model (MUD with broadcasting) and studies graph problems that are difficult to solve in vanilla streaming or MUD model.
  • Many have asked and formulated problems of how streams may be augmented with different types of annotations. Read/write annotations are a natural class, and recent results by Beame and Huynh-Ngoc on estimating freq moments in this model seem interesting.


Monday, February 02, 2009

Real World (via others)

Noam Nisan pointed out to me this xkcd comic about crypto in practice. On a related note, Raphael Clifford mentioned how incentives work in practice. The UK intelligence agencies apparently hesitate giving cash bonuses: you don't want to train your employees to see cash as a reward. So, these pillars of our theory research (crypto, mechanisms and incentives) face tests of reality.

Sunday, February 01, 2009

Blog View

I did a quick look at the blogs: Michael led an interesting discussion (as always) on making a career of being in the middle, ie., doing applied algorithms. Nearly all the algorithms researchers I know have an applied algorithms topic on their agenda (networking, databases, graphics, computational biology, mechanism design, whatever) and excel in it. The area of algorithms research should be proud. I love Luca's byte sized lecture blogs. Like many in this country get their news from Jon Stewart, I had a feeling some in theoretical computer science got their politics, perspectives and history from the Computational Complexity blog, so I hope they do not step away from discussing politics. I knew about Yahoo's scientific challenges program, but waited for David to blog about it, and he just did. There were puzzles, pictures, travel and book announcements ("The 7 Habits of Highly Defective People") and more.

And Scott comments on movies. Movies is difficult business. As Robert Altman shows in the clip on the left, every story one gets pitched is just a mash of movies from the past, Out of Africa meets Pretty Woman; in homage to his movie about movies, I decided to do this mash of blogs. "Is this going to be funny?", Tim Robbins asks.

Seeing one as others do

In a job interview (personal: aptment-related), a question was posed to the candidate: "Tell me what your references will point to as one of your weaknesses." This question is not about what demons within -- real or imagined --- that we fight and our knowledge of it, but about demons --- possibly different --- that others see us fight and our awareness.

Others will point to one of my real demons, my inability to be a reliable emailer. I fall behind on email threads, they nag me for long before I find the corner of time and space to respond, but I never catch up of course and each thread languishes between bursts and soon torments me.

Coding BW and AS

I did a little python script for something I needed, and wondered how I got any coding done in the ancient age Before the Web (BW). Welcome to the new age (AS, for After Web Search) of collective coding and the copy-paste-modify approach to developing software.