Monday, July 28, 2008

Lost Offline and Found Online

How do you find your friend from say high school when web searching their name yields nothing? Search engines can help you even when they don't: Just advertise on your friend's name at a popular search engine! Chances are your friend, wherever he/she may be, does "ego" searching of their name. And voila, they should see the ad and contact you. This has worked in practice.

As another example, I left my notebook behind on a flight sometime ago, and have been traumatized with the loss. I wish there is a service we can go online and list/discover found/lost things, authenticate the owners of items, exchange the items.

Saturday, July 26, 2008

Don't Cares

Given text string t[1...n] and pattern string p[1...m] over alphabet of size k, the string matching problem is to find all i's such that t[i]...t[i+m-1]=p[1]...p[m], ie., \AND_j (t[i+j-1] matches p[j]). Fischer and Paterson in '74 observed that solving this problem is "like" computing convolution where given vector a[1...n] and b[1...m], we compute c[i]= \sum_j (a[i+j-1] \prod b[j] ); this can be done in O(nlog m) time.

This approach solved string matching where some of the positions were ``don't cares" that matched any symbol, using 2\log k convolutions, therefore in time O(n log k log m) altogether. This is something of a cult of a problem, with a lot of attention. What follows is my tale of embarassment in research.

In my thesis work in '94, I observed using some relationship to Sperner's Lemma that log k + 1/2 loglog k + O(1) convolutions would suffice. Piotr Indyk found a sophisticated way to solve the entire problem in O(nlog m) randomized time. Adam Kalai blew me away with a page long note in SODA with a randomized reduction to O(1) convolutions. Then, Cole and Hariharan found nifty encoding techniques, some using complex and rational arithmetic, to do it in O(1) convolutions in the worst case.

When I was in Bristol, Raphael described his '06 solution to me in one mathematical sentence: \AND_j (t[i+j-1] matches p[j]) follows from \sum_j (t[i+j-1]-p[j])^2 X_{i+j-1} Y_j, where X_{i+j-1} is the indicator variable for if t[i+j-1] is a don't care (and likewise, Y for p). Now expand that expression out and in 3 convolutions, you are done. I came across something called the Bridge of Sighs in Venice, that moment with Raphael was it.

Cities on a Convex Hull

Friends know I love NY with a passion perhaps only generated by ones like me who did not grow up in the city. Beyond that, I like cities that refine themselves to an extremum in some combination of dimensions: these are cities on a convex hull.

LA for example is such a city with its audacious spread (I have a vertical line on the plane passing through Seattle, San Francisco and LA in my mind, no matter the 3D geography, yet for a puzzle in Peter's book, Page 57, What is the biggest city in the US east of Reno, Nevada and west of Denver, Colorado?, I guessed LA lazily and got it right).

Venice, where I am now, is another ex. I visited stores looking for a leather bound notebook and met several proud venetian bookbinders, and the prince among them: Dario Ustino. It took sometime for him to take me seriously that I am someone who would actually buy and write on a notebook worth several hundred euros, but soon we started talking and he is a fascinating master. With an infectious chuckle he looked the part when he said, he was living in the city he wanted to and doing things he wanted to do. Check out his work, some on youtube.

Thursday, July 24, 2008

Clay Court Tennis

It has been close to 20 years since I played seriously on clay courts and more like 15 years since I played tennis seriously. So, I regressed and was like a boy when I got onto the red clay courts here. You start off with 6 balls, and it takes a while for your shots to sound right, and you have to say sorry (when the ball lands in neighboring courts) and thank you (I was taught to never take anything for granted and to thank any gesture people make, and having neighbors return the ball repeatedly is a great lesson). But soon, the shots begin to feel right, land right, and you start working with only 2 balls, you feel comfortable approaching the net, and before you know it, the Sun is about to set, and you are trying to steal a few back-and-forths like you did a long time ago.

Labels:

Monday, July 21, 2008

More than the Sum

Some researchers are greater than the sum of their publications. In addition to their publications, some researchers are puzzle freaks, some are distillers who can explain others' results even better, clearer, and yet others are bloggers. I was reminded of this "greater than the sum" phenomenon when I was in CA past Wednesday and spoke with D. Sivakumar. Siva showed a great knack for abstracting graphs and using their structure to understand, argue and tease out physical phenomenon in applications I would not have been able to find even a planted graph! Many in our field take graph abstractions for granted (even putting in an unspecified weight to capture something along the edges). But the talk with Siva reminded me that there are still highly nuanced connections one can make, if you are willing to push your imagination. Invite him and coerce him to give a talk, and you will see.

Assaf Wins

A Norah Jones-esque friend (no, never heard my friend sing, just her smile and personality) told me Assaf Naor won an award and it will be presented by the Dutch Royalty. So, I browsed around and here it is: our own Assaf Naor wins the European Mathematical Society prize (he is in great company too). The EMS Citation states that "Naor is the leading architect of the modern theory of non-linear functional analysis: a theory that has taken off in recent years and has become an essential tool in mathematical computer science." Go, Assaf!

Notelets

I was browsing, compiling potential travel dates and noticed: Sigcomm in Seattle overlaps Dagstuhl workshop on Sublinear Algorithms, and KDD in Las Vegas overlaps with VLDB in New Zealand. I am reminded of the mathematician (resp engineer) who liked having a wife and a mistress so he could tell each he was with the other, and spend time in his office working on puzzles (resp. coding).

I was browsing through some backlog of papers and noticed: it seems customary for papers on ad auctions to say something like, "the sponsored search industry is YY billion dollars a year, see earnings of companies M, Y, G, etc." This is specious argument by association. Let us please stop.

Mike Paterson Celebration





An event to celebrate Mike Paterson's contributions, Mike66, will be held at Warwick, Sept 18--19, organized by Artur Czumaj, Leslie Goldberg and Uri Zwick. Some of my best memories are from the year I spent at Warwick, and the afternoons I would hang out with Mike, solving puzzles, and other times in his office, trying to sneak a peek at his notes from 70's, whenever possible. He means many things to different researchers, and I am sure some of it will come out during this event, so let me not preempt it.

Sunday, July 20, 2008

On Reading Peter

I love math/algorithmic puzzles, and when I was in high school, worked on olympiad problems for fun (I did write a "book" on solving these problems and sold them locally thereby funding a year of my social life in college, but no more on that story here). Yesterday, on my way to my vacation in Croatia, I decided to read Peter Winkler's book (Mathematical Puzzles: A connoisseur's collection). Peter is a connoisseur, and my favorite; I started reading this book to atone for missing his talk at ICALP, as a treat for me, and finally as a rehab since I felt a little disconnected with actually solving problems the past few weeks.

This is a long post in which I will capture as much as possible, my raw thoughts as I encountered one problem after another in Pages 1 -- 14. It is one thing to read the solution to a puzzle, but maybe the act of how one confronts a puzzle is interesting to some.
  • The Bixby Boys puzzle. This sounds familiar. Typically think of father/son combo, in this case, since it is same birthdays, one quickly guesses: triplets.
  • Attic Lamp Switch: Known.
  • Gasoline Crisis. Given n points on a circle C with fuel x_i on the ith, if \sum x_i suffices to exactly go around C once, is there an i such that one can do this starting at i with an empty tank? Say I start at some point and move clockwise, collecting fuel as I go. The first time I run out of fuel before hitting the next point, I thought there would be some local exchange argument that will tell me I should have started elsewhere. I could not see any such local argument, indeed, the problem seemed to involve non-local segments. After some thought, I decided to go on with negative fuel and drew the picture of the fuel in the tank over the whole course. The fuel level goes up and down, in both positive and negative territory. Soon I realized the key was the bottom-most point in this trajectory. If you start from there, everything will work out. I wondered briefly: say x_i \geq k. Is there a solution with tank O(k) in size? Not true.
  • Fuses. Known puzzle. I can even do any arbitrary fraction p/q with two fuses (you need infinite matches).
  • Integers and Rectangles. Known puzzle. Vaguely remember a solution using Green's Theorem.
  • Tipping the scale. Watches on the table. Path on a chessboard. Could not get into the spirit of these problems. Realized scale tipping is trivial if each weight had exactly one pupil's name. In general, some set averaging I am sure.
  • Exponent on exponent. Seen before. e^{1/e}.
  • Soldiers in the field. An odd number of soldiers are stationed in a field, all pairwise distances are distinct, each soldier watches the nearest other solider. Prove that at least one soldier is unwatched. Start making a graph, nodes are soldiers, make an edge from i to j if i watches j. With n=2 soldiers, they watch each other. With n=3, it is clear, the third watches one of the other two who are watching each other, and one can not create a triangle (if you label the edges a,b,c, two edges will give a total order on the distances which will contradict the third edge). So, in general, I wanted to prove that with odd n, you can't have a full circle and any circle will be partial. You can create a path with n-1 nodes and when I tried to put the nth node to complete the circle properly, I realized that the edge from the nth to one of the other nodes will break the path at the other node. So, forget about the circles etc, just take two nodes at a time that have the shortest distance between them, and we can repeatedly remove them to give the argument (if they are connected to the rest of the nodes, trivially one can conclude someone is not being watched). Cool. Wondered for sometime with the variation: Say node i points to node j if i is the nearest neighbor of j, does this property hold? I went thru n=3 case but then realized that now the graph may not even have n edges. So, not clear this is a well formed problem. Abandon.
  • Intervals and distances: skipped. This looked like I actually have to do some calculation, something like the number of triangles * smthg \geq 1. Summing to 15. Known. I use this puzzle as an example of something where I can convince the audience of the answer without mathematically proving it to them. Locker doors. Page 13. Known. Zeros, Ones and Twos: I was asked (a) part of this problem and remember the pigeonholing answer many of us came up with.
  • Sums and Differences. Given 25 different positive numbers, prove that you can choose two of them such that none of the other numbers equals either their sum or their difference. Say the numbers are x_1 ... x_25. It is easy to pick two numbers so their sum is not in the collection (x_1, x_25), the trick is with the difference. So, you can not pick x_1 and x_25 if x_24 is their difference, and so on. So, I considered what I would do with numbers x, 2x, 3x, ... ,25x. Seemed like difference of any two of them is in the set, and after a while realized the trick is to make sure the difference IS one of the numbers chosen, so something like 12x and 24x will do. Now, this logic can be generalized to arbitrary numbers (not just multiples of x). Choose x_12 and x_24. Nothing special about 25. That is misleading.
STOP. I am going to vacation now. If I get back to Page 14 and continue, will let you all know.

Monday, July 14, 2008

Talks IV: Summer Course in UK

When I landed in Bristol, I told the officer at the airport I was there for a week-long summer course. It was Thursday already when I got there, rainy and chilly, not much of a summer, so he stared at me strangely.

Raphael, Ashley and Andreas organized a terrific summer school at Bristol (with Eyal K, Seffi N and others, I would have learned a lot if I had been there the whole week).

I gave 4 lectures. I know from my past at U. Warwick that the students in UK are sharp and very well trained and when pushed by directly asking a question, they have thoughtful answers. So, I made the first lecture essentially interactive, getting them to play roles, first as a search engine user (so they think about their interaction with the ads: how you scan them, what kind of ads you click, what URLs turn you off, etc), then as an advertiser running an ad campaign (large or small advertiser, do you care for making a sale, what keywords would you choose, when do you face a budget problem, etc). I thought the discussion went well and the students more or less isolated the major issues I would have summarized in a slide, but the process was more fun. The remaining lectures were mostly theorems and explanations, and some proofs.

From the few minutes I listened in on Seffi, he is a very clear lecturer and it was just a treat. My lectures never get to that state of clarity. So, I am going to have to produce the slides and put them on the web. This also means, I hope the video camera I saw throughout my lecture was a prop and no tapes will emerge. :)

Talk III: EC Workshop on Sponsored Search Auctions

Chicago and NY: for being twin great cities in each others' backyards, it is difficult to travel between them, or may be, it is just difficult to travel from or to Chicago. I enjoyed the workshop (before the EC08 conf) at Chicago. The website has links to pdfs of papers. Lot of top class researchers thinking through a focused set of problems, sometimes their footsteps landing on the same spot simultaneously. This was a workshop, there had already been talks on some nice technical results, so when I got to speak, I used it as an opportunity to ask: what is the "ultimate" auction we want to run for sponsored search? There is a lot of debate:
  • GSP or VCG, bidder-position specific reserve prices
  • charge per impression or click or sale or whatever,
  • offline or online scheduling,
  • allow expressive or use minimalist bidding language,
  • allow targeting by location/time/positions or whatever, and
  • the role of budgets.
I wanted a debate on if we had total control (we researchers do, in *our* world), what auction method should we use? Ignoring the budget issues and repeated auctions, I described a proposal to run a stable matching mechanism from my joint work with Gagan Aggarwal, David and Martin Pal. This is not a startling proposal because one is really standing on shoulders of giants who themselves are standing on shoulders of giants when one talks about matching markets in 2008. So, there was a danger that Economists and CStists better versed in Economics in the audience will be grated in some way, further that there is an ennui with stable matching results, and the sense that the giants have already trodded over the ground, is there anything new to do?

I argued that using such an assignment model, one can model the many variations of auctions in the list above that we seek, that the resulting mechanism can be efficiently implemented and most importantly, that proving the resulting mechanism to be simultaneously "truthful" for any combination of such variations (GSP bidder with per-impression cost as well as VCG bidder with per click cost and reserve prices in the same auction) is nontrivial. Several people asked me for a copy of the paper, so in the best case scenario, they needed more convincing by actually reading through the proofs! :) I think even asking for a mechanism that is simultaneously good for the different variations is already a progress, so I hope ultimate auctions emerge in the future.

Saturday, July 12, 2008

Talk II: ICALP Plenary

In Iceland, I was in a twilight zone, jetlagged or simply overwhelmed by the day/nite ambiguity. My talk was also at a twilight of a time, 9 in the morning, and I felt somewhat displaced at the beginning. But Magnus stirred me with a witty Intro (citing sources, he said they managed to localize me in both space and time, and there I was). Wired, I gave a talk on a way to think about this world of search users, advertisers, search engines that are in the game. This was not a talk with crisp open problems, they come later, but if we internalize an intuitive way to think about this world (beyond the simplistic ways one hears in the media or in armchair research), I think we, the theoretical CStists will abstract problems in our style, solve them, and we will have new insights to offer the world (of Economists).

PS: Photos from ICALP are here.

Curiously, the passengers coming in from US had to go through the security (take out the laptop, empty ones pocket etc) when we landed in Iceland!

ICALP Workshop on Matching

The surprise find at ICALP 2008 was the workshop on Matching Under Preferences– Algorithms and Complexity. It was organized by Magnús M. Halldórsson, Rob Irving, Kazuo Iwama, and David Manlove, and was one of the best workshops I have attended in theory conferences. A topic like matching under preferences, when explored by TCS people, faces the danger of becoming just algorithms for numerous variations of stable matching: discrete, probabilistic, stable roommates or whatever. But the organizers recognized the thriving community in Economics and Game Theory that has made significant contributions in this area and managed to attract several participants from that community. As a result, there were many talks on matching markets. The proceedings is online in pdf.

In particular, Al Roth, gave a fantastic talk on conceiving, starting and running the kidney exchange market in US. He spoke in equal parts the mathematician who has proved foundational results in matching market design and more (4-way exchanges suffice given the blood types and compatibilities), and an Engineer with experience with a real system (+law, +policy) for several years (2-way and 3-way exchanges are engineering challeges already). Truly inspiring!

Talk I: ICALP Workshop

I spoke at the Workshop on Foundations of Information Management in Networks, organized by Stefano Leonardi and Friedhelm Meyer auf der Heide at ICALP 2008 the past Sunday, July 6. It was a cozy workshop with fewer than 20, all experts, and so the discussions were high quality. The breadth of talks --- reputation, recommendation to streaming and security --- was great. I used the workshop atmosphere to brainstorm about models for information aggregation. In particular, I motivated continuous communication complexity (CCC) and presented F_0, F_1 and F_2 results. Tucked in it, sketching and sampling got introduced under the guise of CCC.

Giorgio Ausiello mentioned that there is a lot of interest in EU+EATCS on network information mgmt and informed us that ICALP 2009 in Greece next year will have Track C on this topic with Alberto Marchetti Spaccamela and Yossi Matias as PC Chairs. Surprisingly, preparations for ICALP 2009 seem to have started with PCs and Invited Speakers announced already.

Saturday, July 05, 2008

NY and the Water

A coauthor told me he follows NY vicariously through my blog. So, let me indulge him (and hopefully others too).

NY is right next to major water bodies. Many in NY don't know that. I don't mean that in a geographical sense, but in a visceral way. NY keeps its water away from daily life. Subways are tubes that snake well under or well over the water bodies and move us from one place to other with no sense of what was between. The bridges are institutions by themselves, so NYers on the road think of them nearly as free standing structures, and will have to be reminded that they are standing on water. Finally, when one thinks of NY, the skyline grabs the psyche and one looks up or level, not downward.

In a deeper sense, NY keeps the water out of the conversation even when one is on the water or looking at it. The main ferries (non-commuter, non-tourist) are free (staten island, governor's island). NY kids can kayak, again free. The city will even put up a waterfall, but hang it under a bridge or tall structures. And finally, on July4th, they will park barges and blast off fireworks, holding all eyes skyward.

Friday, July 04, 2008

AGT C's

I have a crazy week ahead with travels and talks, thanks to my inability to sanely plan. Tentative:
Nearly all my talks will be on sponsored search auctions. If you make it, there will be something new, something old, and something art, as usual.

While preparing for these talks, I came across Chandra Chekuri's spring project: a fantastic Algorithmic Game Theory Course page. Awesome. With refs, links, papers, notes, and much more. A great course focused on Sponsored Search Auctions is Tim's. Other AGT C's are at gametheory.net.

ps: I have a prelim version of some problems and insights on ad auctions (ICALP 2008 proc). Comments welcome, I am working on a longer, meatier version.

Thursday, July 03, 2008

The Hybrid Beer

A "hybrid" that is on the horizon for alcoholers is Deus Brut des Flandres: the fantastic sparkling beer that is brewed in Belgium (tastes like Belgian beer 11+% alc) and finished in France (looks and behaves like sparkling wine). Web Search returns excessively low prices for this beer (in my fave local store, it is double this price).

Shopping results for deus brut des flandes

Brouwerij Bosteels Deus Brut Des ...$25.21 - webwine.com
Brouwerij Bosteels Deus Brut Des ...$24.99 - Wine Pavilion
Brouwerij Bosteels Deus Brut Des ...$24.99 - Canal's Discount ...

btw, serving type: bottle! :) In particular, if you want to celebrate making the SODA deadline and heading off to the long weekend with fireworks.

Labels: