Consider the maximal substrings of Q containing only x and y. By the algorithm's choice, x is at least as frequent in Q as y. Suppose first that P does not end with y. Otherwise, let P' = Px and S = PQ and Q = yQ', where x and y are items and Q and Q' are sequences. Otherwise, let P be the (safe) prefix just before the step under consideration, let P' be the prefix just after, and let S be an optimal solution extending P. The only way that a greedy step introduces a defect is if only one item type remains, in which case there is only one way to continue, and that way is safe. It suffices to show inductively that each greedy step maintains safety. Clearly the empty prefix is safe, and if a safe prefix is a whole solution then that solution is optimal. We call a prefix (partial solution) safe if it extends to an optimal solution. This greedy algorithm returns optimal solutions, by the following logic. The others are symmetric each instance of the minority element prevents at most two defects out of a total of k1 + k2 - 1 possible. (See also Coady's implementation of this algorithm.) import collections import heapq class Sentinel: pass def david_eisenstat(lst): counts = collections.Counter(lst) heap = heapq.heapify(heap) output = last = Sentinel() while heap: minuscount1, key1 = heapq.heappop(heap) if key1 != last or not heap: last = key1 minuscount1 += 1 else: minuscount2, key2 = heapq.heappop(heap) last = key2 minuscount2 += 1 if minuscount2 != 0: heapq.heappush(heap, (minuscount2, key2)) output.append(last) if minuscount1 != 0: heapq.heappush(heap, (minuscount1, key1)) return output Proof of correctnessįor two item types, with counts k1 and k2, the optimal solution has k2 - k1 - 1 defects if k1 k2. The idea is to take the most frequent of the remaining item types unless it was just taken. This is along the lines of Thijser's currently incomplete pseudocode.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |