Combining two sorted lists in Python
I have two lists of objects. Each list is already sorted by a property of the object that is of the datetime type. I would like to combine the two lists into one sorted list. Is the best way just to do a sort or is there a smarter way to do this in Python?
Solution 1:
is there a smarter way to do this in Python
This hasn't been mentioned, so I'll go ahead - there is a merge stdlib function in the heapq module of python 2.6+. If all you're looking to do is getting things done, this might be a better idea. Of course, if you want to implement your own, the merge of merge-sort is the way to go.
>>> list1 = [1, 5, 8, 10, 50]
>>> list2 = [3, 4, 29, 41, 45, 49]
>>> from heapq import merge
>>> list(merge(list1, list2))
[1, 3, 4, 5, 8, 10, 29, 41, 45, 49, 50]
Here's the documentation.
Solution 2:
People seem to be over complicating this.. Just combine the two lists, then sort them:
>>> l1 = [1, 3, 4, 7]
>>> l2 = [0, 2, 5, 6, 8, 9]
>>> l1.extend(l2)
>>> sorted(l1)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
..or shorter (and without modifying l1
):
>>> sorted(l1 + l2)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
..easy! Plus, it's using only two built-in functions, so assuming the lists are of a reasonable size, it should be quicker than implementing the sorting/merging in a loop. More importantly, the above is much less code, and very readable.
If your lists are large (over a few hundred thousand, I would guess), it may be quicker to use an alternative/custom sorting method, but there are likely other optimisations to be made first (e.g not storing millions of datetime
objects)
Using the timeit.Timer().repeat()
(which repeats the functions 1000000 times), I loosely benchmarked it against ghoseb's solution, and sorted(l1+l2)
is substantially quicker:
merge_sorted_lists
took..
[9.7439379692077637, 9.8844599723815918, 9.552299976348877]
sorted(l1+l2)
took..
[2.860386848449707, 2.7589840888977051, 2.7682540416717529]