Wednesday, March 16, 2011

Workshop: Parallelism 2020

I was at the DIMACS Workshop on parallelism: 2020 vision. I looked forward to the workshop and as expected, problems and challenges that were discussed were reminiscent of early 1990's,; the keywords however were not interconnection machines, grids or hypercubes but mapreduce and multicores. The people --- Vijaya Ramachandran, Uzi Vishkin, Leslie Valiant, Mike Goodrich, Guy Blelloch and many others --- were stalwarts who brought a lot of experience from PRAM world of 90's to the discussions of the day.

  • Mike started off the meeting with a talk on many algorithmic results he is able to obtain on a model of mapreduce. The talk was a tour through data structures (invisible B trees), geometry, linear programming and simulations (some version of BSP by a version of mapreduce), and at the least is a large swath of benchmark of algorithms research in this area that now one can understand and try to improve.
  • I went next. In the first part, I gave examples of simple things I wish I could do easily in mapreduce (recurse on problems like prefix sums and be able to fill answers back in, enumerate pairs for triangle counting, or do sketches for eigenvalue computations, etc). In the second part, I spoke about a special case of mapreduce, namely, sawzall, and our model for it showing it to be equivalent to streaming. I ran out of energy half way through and thought I did not communicate well the importance of sawzall, the relevance of relating it to streaming, or the niftiness of the proof that simultaneous communication complexity model can simulate sequential communication complexity model (via Savitch's theorem on streams). Apologies to the audience, as Howard Karloff pointed out, my talk had 2 jokes and 0 proofs (not true, I proved a theorem about counting triangles in terms of eigenvalues. :)), while I should have managed to work in one more proof. My talk pdf here.
  • Sid Suri went next and spoke in detail about the problem of counting triangles. The straightforward enumeration algorithm for counting triangles takes very long because some nodels have high degrees. Then he exploited schank's observation that responsibility for counting triangles can be shifted to low degree nodes, together with graph partition techniques to get an improved algorithm and showed actual mapreduce run times. An interesting part of his talk was the finale, where he abstracted lessons for mapreduce algorithms from this particular example: quadratic shuffling is hard, rounds matter unless some reducers are streaming which saves some, and both the model and the machines can not differentiate between constants.

Leslie started the afternoon session with a clear goal: enabling a world where parallel programs can be written once and used on whatever parallel machine. He then extended the BSP model to a hierarchical version, and amidst many parameters, still managed to design optimal algorithms for many problems. Vijaya went next and spoke about a parallel machine machine with local cache and work stealing across machines, and proposed algorithms for a number of problems in a joint work with Richard Cole. Finally, Uzi did a powerful defense of his agenda for past few years that PRAM is a useful model for thinking parallel programs and discussed 1000 or so machine PRAMs within reach.

When I headed back after a toothsome dinner at due mari, I realized that this workshop was definitely about what algorithms community should be doing --- applying our expertise on parallelism to see its impact in the new world of multi-core, multi-machines world, where systems researchers and builders have built parallel machines successfully. So, to some extent, this is a bridge from the other side (the side we knew was going from beautiful theory to getting them built). We also need a healthy perspective that systems change year to year, mapreduce of yesterday may be different from one of tomorrow. So, more tighter loop of interaction between theory and practice will be desirable.



Anonymous Anonymous said...

Is there a go-to paper for theorists who want to understand multicore and prove theorems about it, but who hate reading systems/programming papers?

Is there a short mathematical definition that's close enough to the truth to be usable? And that highlights the similarities and differences to PRAMs, etc.?

2:50 PM  

Post a Comment

<< Home