Sunday, February 26, 2006

Workingman's Death

I just watched the movie Workingman's Death. It is a visual tribute to hard physical laborers ekeing a living (a) Ukraining miners coercing dregs of coal from abandoned mines, (b) Indonesian workers harvesting sulfur slabs from infernal fumes and trekking miles with a rhythmic walk, (c) Nigerian butchers bleeding goats and cows and its satellite industry that drags off the head, carcass and innards in different directions through mud and slush, (d) Pakistani workers cutting up ships with blowtorches in large chunks for scrap metal,.... It is an engrossing movie of people going to hell and back each day and having pride in their backbreaking work.

Seen from this perspective, we should feel lucky our profession is research. We are fortunate. So, we owe it to ourselves and others to do a good job of it.

NY Times Travel Show

I just came back from the NY Times travel show at Javits Center showcasing travel companies, national tourism boards and tourism book publishers with display booths. The Antartica bug remains: I found myself checking out brochures of the companies that offered travel to the icy land. But the goal this year is to trek (Kilimanjaro?) and party in the land of Giraffes, Rhinos, Lions, Elephants, dark nights with, of course, the people of Africa. The challenge is to do it one's way.

A Quick Problem

I remembered a toy problem I made up while solving a different (less of a toy?) problem and never followed up. Given n positive numbers (possibly fractions) and an integer k, find a way to group the numbers into k groups such that the sum over the groups of the product of numbers in each group is minimized. Does this map to a well-known problem?

Happiness

Being happy or "the right to pursue happiness" is not something that comes easy because I was taught that I should earn my happiness and feel guilt not only achieving it but even striving for it, sigh. In the big scheme of things, reaching for happiness is a relatively recent concept. Check out "Happiness: A History" by DarrinMcMahon (the reviews are fun, those-who-bought-this-book-also-bought list is fun too).

Thursday, February 23, 2006

Graffiti

Besides cornrows, I like statements by scribbling graffiti. Friends who know gave me the gift of the book Graffiti World:
Street art from five continents
, a beautiful collection with a sample on the web. Graffiti does not kill, but being (the best?) graffiti artist does: "gallery owner Annina Nodei locked Jean Michel Basquiat in a basement,..., to sell his canvases wet off the easel", check out The Downtown Book, The New York Art Scene 1974--1984.

Sunday, February 19, 2006

Old automata problem

I like old problems that look withered. Sometimes, there is no reason to think of them because the world has moved on. Still, I like to visit them when I am on travel or commute or simply lost.

A 2DPDA is a deterministic pushdown automata with one stack and one 2-way head. Can we prove that no 2DPDA accepts the language L = {p#t | string p matches a substring in t; p and t may have wildcard characters that match any alphabet symbol}?

If L' is L where p and t do NOT have wildcards, it is the standard string matching problem and there is a quadratic time algorithm to accept L' by a 2DPDA. Cook (S. A. Cook, Linear-time simulation of deterministic two-way pushdown automata, Information Processing (IFIP) 71, C.V. Freiman, (ed.), NorthHolland, pp. 75--80, 1971.) proved that any language accepted by a 2DPDA can be accepted in linear time by a RAM, giving the first known linear time algorithm for string matching (this was a precursor to the KMP algorithm). No linear time algorithm is known for matching with wildcards.

Theory of Networked Computation

I spent two days at the workshop organized by Joan Feigenbaum and Jennifer Rexford on the Theory of Networked Computation. The total intellect they had amassed in the workshop was awesome, and the meeting was purposeful. The purpose was discussing what are suitable research directions for theoretical computer scientists in the context of the effort within NSF to in my words, do the internet over from the scratch. Some of the topics that dominated were: how to embrace economic theories in the design of networks, how to embed security and privacy within network design, the role of massive data set computing within networks, how to address nodes in the new network and is there a nice problem of combined compact routing and addressing, is there a theory of networking akin to theory of computing with impossibility or hardness framework, and how to make networks auto-administer themselves. There were some heated networking-specific discussions: should one expose network state to endhosts, should there be autonomous systems, are ISPs making money. Of course, there was some lamenting about how networking community did not seem to know the best theory results, some chest-thumping that theory results (Ashish Goel and Bruce Maggs converged to: "all three p2p protocols were directly derived from algorithmicists' work") were being useful, some soulful calls for more attempt at making theory useful by interfacing with networking in practice. A factoid: Andrei Broder has the patent on CAPTCHAs!. Cool! Another factoid: How real world breaks CAPTCHAs.

On theory vs applications, I tend to think there is a lot of good theory but there is less of good theory inspired by (or applicable to) real problems than theorists would like to believe. Most theoretical papers do not drill down deep (beyond the intro?) while making a case that their work/problem is crucial in an application. Personally, I like solving puzzles, so my heart is in theory, and at least in published work, I seem to think more theory than applications. Still, I got a referee report sometime ago for one of my papers that when paraphrased, said, "From Muthu, we expect more about applications for this result." Sigh.

(Wo)man Blues

I am an old black man at heart, coughing through blues when I hum. Yesterday I got to watch the bluest, Nina Simone, in her movie shown as part of the African Diaspora Film Festival at the Brooklyn Academy of Music. It transported me and I suspect everyone in the theater to a performance hall in Paris in 1976 where Nina Simone created what is a "performance art piece". She sang Backlash Blues, evoked Langston Hughes, my favorite James Baldwin and Zora Hurston, and I was in heaven. She sang of Stars that fade way, go fast, go slow, and I could go coil myself into a ball and be lost. Her performance is sublime, even the movie of her performance is a gem, don't miss it.

Monday, February 13, 2006

FUN of NYC

I miss my brief time in Antartica, sliding down 1000+ft slopes, sound of carving ice all around and the gargle of water running under the ice crevices. Luckily, the snowstorm in NYC gave me a chance to reminisce at the Central Park on Saturday. Amidst a 1910's sleigh. The initiative of the NY street that sprouts $5 umbrellas outside Subway stops when it rains, yielded US postal service plastic mail pans you could use for sliding. I used a black polythene sheet, found on the road side, with the added attraction of having no brakes. I became part of the laughter, scream and uncontrolled glee of the gaggle of boys, girls, men and women., as skiers, skateboarders and snowplows worked their magic.

At 11 PM, I decided to get a therapeutic massage. In 15 min, after a block and half of walk, I was getting pounded, head to toe, an ode to the painful pleasures of NYC.

Saturday, February 11, 2006

Internet Search and Research Labs

Suresh quipped "verity on verity" re. MS Labs. Yahoo Labs in Chile is now headed by Baeza-Yates. Antonio Gulli heads the AskJeeves Lab in Pisa, Italy. Udi Manber is now at Google. What a vibrant job market!

In the world of "search", I am still figuring out ehow which gives instructions, tips and warnings on how to do things. Today's post is on how to fix a jammed garbage disposal, but other howtos abound.

Median substring

To a few who emailed and wanted to know: in the Duplicate problem below, the array is destroyed if it is not a permutation of the input. An anonymous mailer asked if an O(n) time algorithm exists when arithmetic operations are allowed.

Here is another algorithmic problem. Given an array A[1,n] of numbers accessible by comparisons only, it is well known that selection---finding them item of any rank k in the sorted order---takes O(n) time and is easier than sorting which takes O(n log n) time. Is that true for strings as well? Formally:

The input is a string S[1,n] accessible by comparisons only. Find the median substring, that is, find the substring S[i,n] which has the rank n/2 in the lexicographic ordering of the suffixes S[j,n] for j=1,...,n. This can be solved in O(n log n) time by mapping the alphabet down to [1,n] using sorting and then using an O(n) time suffix tree construction algorithm. This algorithm in fact sorts all the suffixes. Does there exist an O(n) time algorithm for the median substring problem?

Wednesday, February 08, 2006

Duplicates

Here is an algorithmic problem. Consider read-write array A[1,n] of numbers. Find the smallest index i such that A[i]=A[j] for some j, that is the smallest index that has a duplicate in the array, using O(1) extra space only. We work in the compare and move model typically used for sorting, and are not allowed to "stuff bits" into array items. This problem was given to me by Martin Pal and Vitaliy Kulikov of Google. Results we can claim are: O(nlog^2 n) time without destroying the array, and O(nlog n) time if one is allowed to destroy the array. Any thoughts/pointers welcome. The problem has nice connections to in-place mergesort, for example.

Monday, February 06, 2006

superbowl, nyc, etc.

Watching the stupor of Superbowl XL, I distracted myself thinking about how to multiply two integers in roman numerals. A little example is here. Sigh.

New York Times, my local newspaper, reports on digging a tunnel below the Manhattan bedrock, the machine chewing 100 feet a day -- or a (fallen) man a mile as the workers quip -- to bring water to the city by 2020. Sandhogs and engineers toiling down there, moving 35 tons of debris for every 5 feet of progress, lunching amidst the muddy water and the bare light bulbs. The geologist working with them, mapping the type of rocks under the city, says, " For a geologist, this is like going to Disneyland." --- better I would think....

On the job front, in Computer Science, we are beyond "wet labs". We have "ad labs", "live labs" and "search labs"... Check out MS Labs.

Sunday, February 05, 2006

At the Brooklyn Museum, the first saturday of each month has free art and entertainment from 5--11 PM. I was there yesterday, dressed to dance (DJ spinning 80's music in the great hall with kids, youth, and the old, families, singles and friends, dancing incoherently) and walking amongst African and Egyptian art late night. The Museum has a terrific American Art collection on show. Talking of american art, my favorite museum is the emerging treat: Ogden Museum of Southern Art, set up in New Orleans to showcase the (neglected) art from the south of this country.
DIMACS is looking for an Associate Director. This is a great time to reflect on the role of DIMACS, how it has changed over time, and what it should be over the next few years. I "grew up" in early 90's when DIMACS had terrific workshops in Theoretical Computer Science (TCS), a thriving visitors program and was a hub of activities. As a postdoc later, I became a minor part of the transition when DIMACS evolved to study "applications" of TCS to Computational Biology, a special focus organized by Martin Farach-Colton and others in '94. DIMACS helped several researchers in the fundamental algorithms area embrace the concepts and problems in Biology and bringing our way of asking questions and our discrete math tools to that area. Richa Agarwala, Vineet Bafna, Dannie Durand, Gene Myers, Pavel Pevzner, R. Ravi, Steve Skiena, (the fantastic) Mike Waterman, and many others visited DIMACS at that time, and in ways small and large, DIMACS contributed to the launching of the Computational Biology area (with the emergence of "Bioinformatics" that happened later) that has been tremendously successful in producing science and scientists, algorithms and tools, as well as companies and careers. Partly driven by this success and by funding, DIMACS has applied this formula to networking, epidemiology, privacy, data analysis and socio-economics since then, and now has workshops in adverse events reporting, biosurveillance, etc. I wonder if DIMACS is serving the DM/TCS community with this trend. (I think of myself as an applied algorithmicist and I become easily engaged in new applications, trying to find algorithmic problems in them. So, I can imagine being interested in some of these workshops, though honestly, I rarely go to them due to my quantum state of existence.)