To a few who emailed and wanted to know: in the Duplicate problem below, the array is
destroyed if it is not a permutation of the input. An anonymous mailer asked if an
O(n) time algorithm exists when arithmetic operations are allowed.
Here is another algorithmic problem. Given an array
A[1,n] of numbers accessible by comparisons only, it is well known that selection---finding them item of any rank
k in the sorted order---takes
O(n) time and is easier than sorting which takes
O(n log n) time. Is that true for strings as well? Formally:
The input is a string
S[1,n] accessible by comparisons only. Find the
median substring, that is, find the substring
S[i,n] which has the rank
n/2 in the
lexicographic ordering of the suffixes
S[j,n] for
j=1,...,n. This can be solved in
O(n log n) time by mapping the alphabet down to
[1,n] using sorting and then using an
O(n) time suffix tree construction algorithm. This algorithm in fact sorts all the suffixes. Does there exist an
O(n) time algorithm for the median substring problem?