Easiest way of using min priority queue with key update in C++
Sometimes during programming contests etc., we need a simple working implementation of min priority queue with decrease-key to implement Dijkstra algorithm etc.. I often use set< pair<key_value, ID> > and an array (mapping ID-->key_value) together to achieve that.
Adding an element to the set takes O(log(N)) time. To build a priority queue out of N elements, we simply add them one by one into the set. This takes O(N log(N)) time in total.
The element with min key_value is simply the first element of the set. Probing the smallest element takes O(1) time. Removing it takes O(log(N)) time.
To test whether some ID=k is in the set, we first look up its key_value=v_k in the array and then search the element (v_k, k) in the set. This takes O(log(N)) time.
To change the key_value of some ID=k from v_k to v_k', we first look up its key_value=v_k in the array, and then search the element (v_k, k) in the set. Next we remove that element from the set and then insert the element (v_k', k) into the set. We then update the array, too. This takes O(log(N)) time.
Although the above approach works, most textbooks usually recommend using binary heaps to implement priority queues, as the time of building the binary heaps is just O(N). I heard that there is a built-in priority queue data structure in STL of C++ that uses binary heaps. However, I'm not sure how to update the key_value for that data structure.
What's the easiest and most efficient way of using min priority queue with key update in C++?
Although my response will not answer the original question, I think it could be useful for people who reach this question when trying to implement Dijkstra algorithm in C++/Java (like myself), something that was comment by the OP,
priority_queue
in C++ (or PriorityQueue
in Java) do not provide a decrease-key
operation, as said previously. A nice trick for using those classes when implementing Dijkstra is using "lazy deletion". The main loop of Dijkstra algorithm extracts the next node to be processed from the priority queue, and analises all its adjacent nodes, eventually changing the cost of the minimal path for a node in the priority queue. This is the point where decrease-key
is usually needed in order to update the value of that node.
The trick is not change it at all. Instead, a "new copy" for that node (with its new better cost) is added into the priority queue. Having a lower cost, that new copy of the node will be extracted before the original copy in the queue, so it will be processed earlier.
The problem with this "lazy deletion" is that the second copy of the node, with the higher bad cost, will be eventually extracted from the priority queue. But that will be always occur after the second added copy, with a better cost, has being processed. So the very first thing that the main Dijkstra loop must do when extracting the next node from the priority queue is checking if the node has being previously visited (and we know the shortest path already). It is in that precise moment when we will be doing the "lazy deletion" and the element must be simply ignored.
This solution will have a cost both in memory and time, because the priority queue is storing "dead elements" that we have not removed. But the real cost will be quite small, and programming this solution is, IMHO, easier than any other alternative that tries to simulate the missing decrease-key
operation.
Well, as Darren already said, std::priority_queue
doesn't have means for decreasing the priority of an element and neither the removal of an element other than the current min. But the default std::priority_queue
is nothing more than a simple container adaptor around a std::vector
that uses the std heap functions from <algorithm>
(std::push_heap
, std::pop_heap
and std::make_heap
). So for Dijkstra (where you need priority update) I usually just do this myself and use a simple std::vector
.
A push is then just the O(log N) operation
vec.push_back(item);
std::push_heap(vec.begin(), vec.end());
Of course for constructing a queue anew from N elements, we don't push them all using this O(log N) operation (making the whole thing O(Nlog N)) but just put them all into the vector followed by a simple O(N)
std::make_heap(vec.begin(), vec.end());
The min element is a simple O(1)
vec.front();
A pop is the simple O(log N) sequence
std::pop_heap(vec.begin(), vec.end());
vec.pop_back();
So far this is just what std::priority_queue
usually does under the hood. Now to change an item's priority we just need to change it (however it may be incorporated in the item's type) and make the sequence a valid heap again
std::make_heap(vec.begin(), vec.end());
I know this is an O(N) operation, but on the other hand this removes any need for keeping track of an item's position in the heap with an additional data structure or (even worse) an augmentation of the items' type. And the performance penalty over a logarithmic priority update is in practice not that signficant, considering the ease of use, compact and linear memory useage of std::vector
(which impacts runtime, too), and the fact that I often work with graphs that have rather few edges (linear in the vertex count) anyway.
It may not in all cases be the fastest way, but certainly the easiest.
EDIT: Oh, and since the standard library uses max-heaps, you need to use an equivalent to >
for comparing priorities (however you get them from the items), instead of the default <
operator.
I don't think the std::priority_queue
class allows for an efficient implementation of decrease-key
style operations.
I rolled my own binary heap based data structure that supports this, bascially along very similar lines to what you've described for the std::set
based priority queue you have:
- Maintain a binary heap, sorted by
value
that stores elements ofpair<value, ID>
and an array that mapsID -> heap_index
. Within the heap routinesheapify_up, heapify_down
etc it's necessary to ensure that the mapping array is kept in-sync with the current heap position of elements. This adds some extraO(1)
overhead. - Conversion of an array to a heap can be done in
O(N)
according to the standard algorithm described here. - Peeking at the root element is
O(1)
. - Checking if an
ID
is currently in the heap just requires anO(1)
look-up in the mapping array. This also allowsO(1)
peeking at the element corresponding to anyID
. -
Decrease-key
requires anO(1)
look-up in the mapping array followed by anO(log(N))
update to the heap viaheapify_up, heapify_down
. - Pushing a new item onto the heap is
O(log(N))
as is popping an exitsing item from the heap.
So asymptotically the runtime is improved for a few of the operations compared with the std::set
based data structure. Another important improvment is that binary heaps can be implemented on an array, while binary trees are node-based containers. The extra data locality of the binary heap usually translates to improved runtime.
A few issues with this implementation are:
- It only allows integer item
ID
's. - It assumes a tight distribution of item
ID
's, starting at zero (otherwise the space complexity of the mapping array blows up!).
You could potentially overcome these issues if you maintained a mapping hash-table, rather than a mapping array, but with a little more runtime overhead. For my use, integer ID
's have always been enough.
Hope this helps.
Actually, there is a way to use built-in binary heap from the standard c++ library. The key observation is that the implementation of all heap functions (i.e. std::push_heap, std::pop_heap and std::make_heap) need only the following methods from the stored element of class A:
- Constructor
A::A()
- Assignment operator
A& A::operator=(const A& rhs)
- Comparison operator
bool operator<(const A& lhs, const A& rhs)
It means that assignment operator is called whenever an element is moved in the container storing all heap elements. By overloading this operator you can control the index in the heap. When you have the index you can access the element in the heap, change its value and call std::push_heap to update its position in the heap.
See the simplified implementation of Dijkstra algorithm (without graph):
#include <bits/stdc++.h>
using namespace std;
vector<int> queue_idx;
struct Elem {
int label;
int dist;
bool operator<(const Elem& other) const { return dist > other.dist; }
Elem& operator=(const Elem& other);
};
vector<Elem> q;
Elem& Elem::operator=(const Elem& other)
{
label = other.label;
dist = other.dist;
queue_idx[label] = this - q.data();
return *this;
}
void AddElem(int label, int dist)
{
queue_idx[label] = label;
q.push_back(Elem{label, dist});
}
void RemoveMin()
{
pop_heap(q.begin(), q.end());
Elem res = q.back();
q.pop_back();
cout << "dist to " << res.label << " is " << res.dist << endl;
}
void Relax(int label, int dist)
{
int idx = queue_idx[label];
Elem& elem = q[idx];
if (elem.dist > dist)
{
elem.dist = dist;
push_heap(q.begin(), q.begin() + idx + 1);
}
}
int main()
{
int n = 5;
queue_idx.resize(n);
AddElem(0, 0);
for (int i = 1; i < n; ++i)
AddElem(i, INT_MAX);
make_heap(q.begin(), q.end());
RemoveMin();
Relax(1, 50);
Relax(2, 40);
Relax(4, 10);
RemoveMin();
Relax(3, 20);
RemoveMin();
Relax(1, 30);
RemoveMin();
Relax(2, 80);
RemoveMin();
return 0;
}
I know that this solution depends on the inner implementation of the standard library, however it just works for any compiler I am aware of and I used it in programming competitions.