Saturday, November 29, 2014

Workshop on Graph Streams (Sandia/DIMACS)

Here are some notes from the workshop, superbly organized by the Sandia team.

  • Workshops are hard to organize, and you have to have a large purpose to do the work. The Sandia team of Bruce Hendrickson, Jon Berry, Cynthia Phillips, and others has a scholarly attitude, which was truly refreshing. There is genuine interest at Sandia, ranging from US Govt IP network (mix of classified, unclassified, specialized) monitoring applications to new theoretical graph stream models, along with an empirical approach based on setting up synthetic datasets, benchmark tools and infrastructure systems. I didn't know Livermore has a Sandia Lab, with Kevin Matulef, C. Seshadri and others. This is some nice horsepower and brainpower for streaming research at Sandia. They had an uber-data context: some stored data, some sampled, some from a streaming hose, and the question of how to process them all with a combination of multiple machines, cloud, etc. I will wait for Jon to put his slides online, where this model was clearer.
  • Attending a workshop even for a day is a welcome break to think about problems. Here are some vague questions. Somebody out there may have something to say (incl. shooting down the problems): (a) Say the characters of a string arrive online; produce a uniformly random sample substring. Detail: the sample should be good for the string seen thus far; represent the substring by an O(1)-sized representation of its left and right endpoints, ... (b) The contents of a file are sent by breaking them into substrings in IP packets, but substrings are sometimes repeated, and sometimes substrings overlap in arbitrary ways (due to TCP resend). Is there a coding/decoding solution that trades off coding quality against sublinear-space reconstruction? (c) Each new stream item is a string. We have to find substrings of each stream item that have appeared many times thus far. If the length of each string is L, can you avoid doing O(L^2) work per item and/or use space less than exponential in L?
  • Distractions. Cindy said, "that is the last edge that broke the camel's back". Sudipto used the phrase, "the right side of Buddha". Madhav could not attend the workshop because he had to respond to the Ebola threat. 
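
For question (a) above, here is one baseline sketch in Python, under a strong assumption: "uniformly random substring" is read as uniform over (left, right) position pairs of the string seen so far, not over distinct substring contents. When the n-th character arrives, n of the n(n+1)/2 position pairs are new (those ending at n), so a reservoir-style rule replaces the current sample with probability 2/(n+1). The class and method names are made up for illustration, and this may well not be the version of the problem that is interesting.

```python
import random


class SubstringSampler:
    """Keeps one (left, right) pair, uniform over all position pairs seen so far.

    Assumption: uniformity is over 0-indexed inclusive endpoint pairs,
    not over distinct substring contents.
    """

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.n = 0          # number of characters seen so far
        self.sample = None  # current (left, right) pair

    def feed(self, ch):
        # The character itself is not stored; only the endpoints are kept.
        self.n += 1
        n = self.n
        # n new substrings end at the newest position, out of n(n+1)/2 total,
        # so switch to one of them with probability 2/(n+1).
        if self.rng.random() * (n + 1) < 2:
            left = self.rng.randrange(n)
            self.sample = (left, n - 1)
        return self.sample


sampler = SubstringSampler(random.Random(2014))
for ch in "graphstreams":
    pair = sampler.feed(ch)
print(pair)  # endpoints of the sampled substring of "graphstreams"
```

The state is a counter plus two endpoints, so O(1) words. Recovering the actual characters of the sampled substring would still need access to the string, which this sketch does not address.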


Algorithms in the Field (8F)

NSF announces a new funding program for Algorithms in the Field. The deadline is Feb 9, 2015. One of the metrics in the Algorithms community is the ultimate use of our algorithms, "use" being broadly interpreted, and this often needs us to go more than halfway to meet other communities. This program is an opportunity to codify that process somewhat. When one does meet the other communities, it almost always leads to new theories and algorithms, and more than pays for the journey. I hope you will respond.

Here is more info on the workshop we organized 2 years ago. The videos of the talks are here.


Wednesday, November 05, 2014

Cornell CS 50

I like the historical context for things, and in CS, Cornell is a center. I enjoyed reading about the Cornell CS 50th celebration.