Wednesday, October 19, 2011

Video Hangout With the World, Update

In the past 3 weeks, we have discussed:
• What is a suitable approximation for an SQL query? The point is, the correct output is a multi-set of tuples with several attribute values. A definition of an approximate output is not immediately clear. If one were to define a distance between possible outputs, weighted sum of errors over attributes is not necessarily the most informative. Set similarity is not necessarily the best answer either, for example because attribute values have a meaning. (Say the correct answer is {10,20,30}; {10,20} is a similar set, but {9,19,31} may be a good approximate output.) Some sort of "matching" constraint should be implicit. A related paper is from VLDB 99 by Ioannidis and Poosala called "Histogram-Based Approximation of Set-Valued Query-Answers". There are a lot of nice theory problems lurking here.
• What are interesting algorithms problems one can abstract from Informatics Olympiads, math puzzle sites? The solution hopefully involves something more than dynamic programming.
• Data structure problems where outputs should be in some specified order. Say the output is of size k, then in O(klog k) additive time, one can output any specified order, assuming output items can be compared. Are there problems where one can do + O(k) only or without knowing k ahead of time?
• Specific problems: 2d image listing (rather than document listing) problem. This leads to the following "simpler" problem. Given n lists of length l that can be preprocessed, at query time, k < n lists are identified, and the goal is to output the intersection of these k lists in time like k, log l or poly in that. Any ideas?
We have persevered through bad connections, too many or too few on the hangout and so on, and we are still rolling.

Labels:

Rasmus Pagh said...

Didn't make it to the hangouts, but sounds like great fun. About the "simpler" problem: It seems harder than the multiphase problem, see "Towards Polynomial Lower Bounds for Dynamic Problems" by Mihai Pătraşcu (STOC 2010). So likely hard.

11:55 AM
Anonymous said...

Hi Rasmus,
I should look at the known lower bounds. There are simple \sqrt{n} type tradeoffs possible.
-- Metoo

4:25 PM
Igor said...

2:53 AM
Abel said...

Regarding the last problem, I wonder if it is easier when the k lists are guaranteed to be a contiguous range of the original lists (so, the first, the second and the third list, for example)...

8:12 PM
SEO Harvester said...

Pizza Equipment

I need Pizza Equipment for Apple Pizza. any one can help me?

3:19 AM