Workshop on Graph Streams (Sandia/DIMACS)
Here are some notes from the workshop, superbly organized by the Sandia team.
- Workshops are hard to organize, and you have to have a large purpose to do the work. The Sandia team of Bruce Hendrickson, Jon Berry, Cynthia Phillips, and others has a scholarly attitude, which was truly refreshing. There is genuine interest at Sandia, ranging from monitoring applications on US Govt IP networks (a mix of classified, unclassified, and specialized) to new theoretical graph stream models, along with an empirical approach based on setting up synthetic datasets, benchmark tools, and infrastructure systems. I didn't know Livermore has a Sandia lab, with Kevin Matulef, C. Seshadri, and others. That is some nice research horsepower and brain power for streaming research at Sandia. They had an uber-data context: some data stored, some sampled, some arriving as a streaming hose, and the question is how to process it all with a combination of multiple machines, cloud, etc. Will wait for Jon to put his slides online, where this model was clearer.
- Attending a workshop even for a day is a welcome break to think about problems. Here are some vague questions. Somebody out there may have something to say (incl. shooting down the problems): (a) Say the characters of a string arrive online; produce a uniformly random sample substring. Detail: the sample should be good for the string seen thus far, and the substring should be represented by an O(1)-sized representation of its left and right endpoints, ... (b) The contents of a file are sent by breaking them into substrings carried in IP packets, but substrings are sometimes repeated, and sometimes overlap in arbitrary ways (due to TCP resends). Is there a coding/decoding solution that trades off coding quality against sublinear-space reconstruction? (c) Each new stream item is a string, and we have to find the substrings of each stream item that have appeared many times thus far. If each string has length L, can you avoid doing O(L^2) work per item and/or use space less than exponential in L?
- Distractions. Cindy said, "that is the last edge that broke the camel's back". Sudipto used the phrase, "the right side of Buddha". Madhav could not attend the workshop because he had to respond to the Ebola threat.
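
One naive reading of question (a) above can be sketched with reservoir sampling. This is only a sketch under assumptions that the question itself leaves open: "uniformly random" is taken to mean uniform over (left, right) index pairs of the prefix seen so far (not over distinct substring contents), and the class name `SubstringSampler` and its `push` method are made up for illustration. When the n-th character arrives, n new substrings end at it, out of n(n+1)/2 in total, so the sample is replaced with probability 2/(n+1).

```python
import random


class SubstringSampler:
    """Keeps one uniformly random (left, right) index pair over the prefix seen so far.

    Hypothetical helper: uniform over index pairs, NOT over distinct substrings.
    State is three integers, so the space is O(1) words.
    """

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.n = 0          # number of characters seen so far
        self.left = None    # 0-indexed inclusive left endpoint of the sample
        self.right = None   # 0-indexed inclusive right endpoint of the sample

    def push(self):
        """Register the arrival of one character and return the current sample."""
        self.n += 1
        n = self.n
        # n of the n(n+1)/2 substrings end at the new character, so switch to
        # one of them with probability n / (n(n+1)/2) = 2/(n+1).
        if self.rng.random() < 2.0 / (n + 1):
            self.right = n - 1
            self.left = self.rng.randrange(n)  # uniform start in [0, n-1]
        return self.left, self.right


if __name__ == "__main__":
    sampler = SubstringSampler()
    text = "graphstream"
    for _ in text:
        left, right = sampler.push()
    print(left, right, text[left:right + 1])
```

The character itself is never stored, only the endpoints, so recovering the sampled substring still requires access to the string. Whether something better is possible for distinct substrings, or under other constraints hidden in the "..." of the question, is not addressed by this sketch.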