Inserting into an unordered_set with custom hash function
I have the following code to make an unordered_set<Interval>
. This compiles fine.
struct Interval {
unsigned int begin;
unsigned int end;
bool updated; //true if concat. initially false
int patternIndex; //pattern index. valid for single pattern
int proteinIndex; //protein index. for retrieving the pattern
};
struct Hash {
size_t operator()(const Interval &interval);
};
size_t Hash::operator()(const Interval &interval){
string temp = to_string(interval.begin) + to_string(interval.end) + to_string(interval.proteinIndex);
return hash<string>()(temp);
}
unordered_set<Interval, string, Hash> test;
However, I cannot compile when I try to insert using this code:
for(list<Interval>::iterator i = concat.begin(); i != concat.end(); ++i){
test.insert((*i));
}
Moreover, I cannot determine what the problem is from the error messages, for example:
note: candidate is:
note: size_t Hash::operator()(const Interval&)
note: candidate expects 1 argument, 2 provided
I thought I only provided 1 argument...
What is the problem with my insertion code?
Here's the new instantiation code: unordered_set<Interval, Hash> test;
However, I'm still receiving a slew of error messages, for example:
note: candidate is:
note: size_t Hash::operator()(const Interval&) <near match>
note: no known conversion for implicit ‘this’ parameter from ‘const Hash*’ to ‘Hash*’
Solution 1:
First problem:
You are passing string
as the second template argument for your instantiation of the unordered_set<>
class template. The second argument should be the type of your hasher functor, and std::string
is not a callable object.
Perhaps meant to write:
unordered_set<Interval, /* string */ Hash> test;
// ^^^^^^^^^^^^
// Why this?
Also, I would suggest using names other than begin
and end
for your (member) variables, since those are names of algorithms of the C++ Standard Library.
Second problem:
You should keep in mind, that the hasher function should be qualified as const
, so your functor should be:
struct Hash {
size_t operator() (const Interval &interval) const {
// ^^^^^
// Don't forget this!
string temp = to_string(interval.b) +
to_string(interval.e) +
to_string(interval.proteinIndex);
return (temp.length());
}
};
Third problem:
Finally, if you want std::unordered_set
to be able to work with objects of type Interval
, you need to define an equality operator consistent with your hash function. By default, if you do not specify any type argument as the third parameter of the std::unordered_set
class template, operator ==
will be used.
You currently do not have any overload of operator ==
for your class Interval
, so you should provide one. For example:
inline bool operator == (Interval const& lhs, Interval const& rhs)
{
return (lhs.b == rhs.b) &&
(lhs.e == rhs.e) &&
(lhs.proteinIndex == rhs.proteinIndex);
}
Conclusion:
After all the above modifications, you can see your code compiling in this live example.
Solution 2:
I think, Andy Prowl perfectly fixed the problems with your code. However, I would add the following member function to your Interval
, which describes what makes two intervals identical:
std::string getID() const { return std::to_string(b) + " " + std::to_string(e) + " " + std::to_string(proteinIndex); }
Please note, that I also followed Andy Prowl's suggestion and renamed the members begin
to b
and end
to e
. Next, you can easily define the hash and comparison functions by using lambda expressions. As a result, you can define your unordered_set
as follows:
auto hash = [](const Interval& i){ return std::hash<std::string>()(i.getID()); };
auto equal = [](const Interval& i1, const Interval& i2){ return i1.getID() == i2.getID(); };
std::unordered_set<Interval, decltype(hash), decltype(equal)> test(8, hash, equal);
Finally, for reasons of readability, I converted your for
loop into a range-based for
loop:
std::list<Interval> concat {{1, 2, false, 3, 4}, {2, 3, false, 4, 5}, {1, 2, true, 7, 4}};
for (auto const &i : concat)
test.insert(i);
for (auto const &i : test)
std::cout << i.b << ", " << i.e << ", " << i.updated << std::endl;
Output (I just printed first three members of each Interval
):
2, 3, 0
1, 2, 0
As you can see, there are only two intervals printed. The third one ({1, 2, true, 7, 4}
) was not inserted to test
, because its b
, e
, and proteinIndex
are equal to that of the first interval ({1, 2, false, 3, 4}
).
Code on Ideone