Tuesday, July 05, 2011


PODS 2011 (Principles of Database Systems, a database theory conference) has 1 plenary talk and 2 invited tutorials. I gave a tutorial on data streams research. My talk slides are here. It is difficult to present a tutorial on an entire area in 1 hour, particularly on a topic like data streams that has seen a lot of research in many different communities. My tutorial presented one technique (the count-min sketch) and its applications, and emphasized the role of the database community in this area (slide 32). The question of one-pass computing, the motivation for dealing with large data streams, and, most important for me, systems that actually use streaming in practice and validate the research effort of nearly two decades all came from this community. I argued that their questions and drive eventually led not only to good CS theory but also to impact outside CS, such as in the signal processing community, where, to at least some extent, compressed sensing results could be obtained by applying the count-min sketch. I would like to believe that this gives the database community a new vantage point from which to value their technical impact (not the use of their technology) beyond CS. Some notes:
  • Tova Milo gave a nice plenary talk on business process data management (imagine queries to a travel agency like: can I get a price quote without first giving my credit card details?), covering nearly all aspects, from data models to query languages to logic, optimization, provenance and structural/module privacy. More info here.
  • I spent some time with Raghu and Divy. Raghu has done everything, and it is great to pick his brain about the potential evolution curves of MapReduce, BigTable/Cassandra and others. Divy is super open to discussion on any topic, and we continued our past discussions on auctions and pricing, this time more focused on selling data. I also spent time with Divesh, charming and incredibly knowledgeable, with his footprint all over SIGMOD as usual, and with the ever-clever Alon Halevy (I can't break his story yet!).
  • In the PODS business meeting, it was suggested that the number of emails be used as a measure of the complexity of organizing a meeting.
  • Finally, PODS overlapped with the day of protests and the general strike in Athens. Normally the Athenians (and Greeks in general) have an alluring air of anarchy about them, and this day exaggerated the effect. I mistakenly walked into an intersection filled with tear gas, and not only teared up and had a serious headache afterwards, but also spent the flight back to NY with no feeling in my left leg. Notwithstanding that, I was happy to be in Athens, where the modern world began.
The database community likes to dress up, dance and party, and the banquet was indulgent, transforming bag-carrying, laptop-checking researchers into socialites. A 100-year-old nonprofit group performed traditional Greek dances: simple to complex leg movements, accompanied by simple stringed instruments. Over the banquet, I managed to catch up with Peter Buneman and Johannes Gehrke, and we talked about the EU research program on the human brain. The question was: what is a big database research issue in human brain research?
ps: Missed going to this bookstore to look for Arkas in English.
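
pps: For readers who have not seen the count-min sketch that the tutorial was built around, here is a minimal Python sketch of the data structure. It is an illustration under assumptions, not the code behind the slides: the class name, the default `width` and `depth`, and the use of salted BLAKE2b as a stand-in for pairwise-independent hash functions are all choices made here for convenience.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch: depth rows of width counters each."""

    def __init__(self, width=256, depth=4):
        # width and depth are arbitrary defaults chosen for this example.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash per row, obtained by salting BLAKE2b with the row number.
        # This is a convenient substitute for a pairwise-independent family.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # One pass over the stream: each arrival touches one counter per row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters (for non-negative updates),
        # so the minimum over rows is the tightest available estimate.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["a"] * 50 + ["b"] * 20 + ["c"] * 5:
    sketch.add(word)
```

With only non-negative updates, `sketch.estimate(x)` never falls below the true count of `x`; how far above it can go depends on `width`, `depth` and the total stream length.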



Blogger Igor said...

In your tutorial, with respect to "A New Puzzle: One Word Median", what is the relationship between mu-sub-i and the actual true median of the set?

4:30 PM  
Anonymous Anonymous said...

Hi Igor,

One has to bound \mu_j - m_j. I do the analysis in two parts: how quickly \mu_j gets close to m_j, and, having got close, how close it remains for the remainder of the stream. Both analyses need some work, and when the paper is ready, I will post it. It is slightly unusual because, of course, the number of steps needed to get to m_j depends on the absolute value of m_j.

-- Metoo

5:04 PM  
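
The thread above does not spell out the update rule behind \mu_j, so the following is only a guess at the simplest rule consistent with the remark that the number of steps to reach m_j depends on its absolute value: keep one word `mu`, and move it one unit toward each arriving item. The function name `one_word_median` and the `start` parameter are hypothetical, introduced here for illustration.

```python
def one_word_median(stream, start=0):
    """Track a running median estimate using a single word of state.

    Assumed rule (not confirmed by the post): step mu up by 1 when the
    new item is larger, down by 1 when it is smaller, else leave it.
    """
    mu = start
    for x in stream:
        if x > mu:
            mu += 1
        elif x < mu:
            mu -= 1
    return mu


# A constant stream makes the "absolute value" effect visible:
# starting from 0, mu needs 100 unit steps to reach a median of 100.
reached = one_word_median([100] * 500)
too_short = one_word_median([100] * 5)
```

Under this assumed rule the first phase of the analysis (getting close to m_j) takes at least |m_j - start| steps, which is why a short stream leaves `too_short` far from the true median while `reached` has caught up.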
