How to avoid double search on std::unordered_map AND avoid calling factory function when not required when implementing a cache
I've been implementing a cache based on std::unordered_map. I want to avoid calling the factory function that generates the value if the value is already stored but I also want to avoid running the search on the map twice.
#include <unordered_map>
struct Value {
int x;
Value & operator=(Value const &) = delete;
};
using Cache = std::unordered_map<int, Value>;
Value make_value(int i){
// Imagine that this takes a time ok!
i = i + 1;
return Value{i};
}
// This has double search
template <typename F>
Value & insert_a(Cache & cache, int key, F factory)
{
auto i = cache.find(key);
if(i==cache.end()){
auto r = cache.try_emplace(key,factory(key));
return r.first->second;
}
return i->second;
}
// This runs the factory even if it is not required
template <typename F>
Value & insert_b(Cache & cache, int key, F factory)
{
auto r = cache.try_emplace(key,factory(key));
return r.first->second;
}
int main(){
std::unordered_map<int,Value> map;
insert_a(map,10,make_value);
insert_b(map,10,make_value);
return 0;
}
I have two simplified implementations of insert demonstrating how one could build the cache.
The insert_a uses find first to detect if the item exists and only if it doesn't calls the factory to get the value. Two searches on the container are performed.
The insert_b calls try_emplace and just returns the stored value. This is obviously bad because the factory is called even if the value is already there.
It seems I want a middle ground where I pass the factory function directly to try_emplace and internally it is called only if required. Is there a way to simulate this?
This is not a question in general about how to build a cache. I am aware of multi-threading issues, const correctness and mutable keywords. I am specifically asking how to get both
- Single search of the container
- Only call factory if required
Please note that I have deleted the copy assign operator deliberately for the Value class. An obvious answer is to insert a default value first and then overwrite it. Not all classes are copy assignable and I want to support those.
There is a sandbox to play with https://godbolt.org/z/Gja3MaGWf
You can use a lazy factory. Ie one that only calls the actual factory when needed:
#include <unordered_map>
#include <iostream>
struct Value {
int x;
Value & operator=(Value const &) = delete;
};
using Cache = std::unordered_map<int, Value>;
Value make_value(int i){
// Imagine that this takes a time ok!
i = i + 1;
return Value{i};
}
template <typename F>
Value & insert_b(Cache & cache, int key)
{
auto r = cache.try_emplace(key,F{key});
return r.first->second;
}
// call the factory when needed
struct ValueMaker {
int value;
operator Value() {
std::cout << "called\n";
return make_value(value);
}
};
int main(){
std::unordered_map<int,Value> map;
insert_b<ValueMaker>(map,10);
insert_b<ValueMaker>(map,10);
return 0;
}
Output is
called
because ValueMaker::operator Value
is only called once when the element is inserted into the map. On the second call, the value maker (which is just a slim wrapper) is not converted to a Value
because the key
is already in the map.
I tried to change as little as possible on your code. For your actual code you might want to get rid of the two factories (make_value
and ValueMaker
) and use only one. The key point is to pass some light wrapper to try_emplace
that only triggers the construction of the actual value when it is converted to Value
.