lower_bound == upper_bound

What does lower_bound mean. If I had to guess I would answer that this function returns the iterator at the last element that is less than the value asked for. But I see that lower_bound is almost the same as upper_bound. The only difference is strict inequality in the case of upper_bound. Is there a true lower bound selection function in stl that agrees with the normal definition of lower bound.

EDIT: It was too many negations in the docs which made me confused. The problem was that I got the same iterator. I solved it by subtracting 1 from lower_bound return value. I use it for interpolation:

    float operator()(double f)
        {
        SpectrumPoint* l=std::lower_bound(beginGet(),endGet(),(SpectrumPoint){float(f),0.0f}
            ,SpectrumPoint::CompareFreqLessThan);
        if(l>beginGet())
            {--l;}

        SpectrumPoint* u=std::lower_bound(beginGet(),endGet(),(SpectrumPoint){float(f),0.0f}
            ,SpectrumPoint::CompareFreqLessThan);

        if(u==endGet())
            {u=beginGet();}

        if(l==u)
            {
            if(u==endGet())
                {return u->amp;}
            return l->amp;
            }

        double f_min=l->freq;
        double A_min=l->amp;
        double f_max=u->freq;
        double A_max=u->amp;

        double delta_f=f_max-f_min;
        double delta_A=A_max-A_min;

        return A_min + delta_A*(f-f_min)/delta_f;
        }

I am sorry for this confusion :-(


Solution 1:

  • Lower bound: first element that is greater-or-equal.

  • Upper bound: first element that is strictly greater.

Example:

+- lb(2) == ub(2)       +- lb(6)        +- lb(8)
|        == begin()     |  == ub(6)     |   +- ub(8) == end()
V                       V               V   V
+---+---+---+---+---+---+---+---+---+---+---+
| 3 | 4 | 4 | 4 | 4 | 5 | 7 | 7 | 7 | 7 | 8 |
+---+---+---+---+---+---+---+---+---+---+---+
    ^               ^                       ^
    |               |                       |
    +- lb(4)        +- ub(4)                +- lb(9) == ub(9) == end()

    |- eq-range(4) -|

As you can see, the half-open equal-range for n is [lb(n), ub(n)).

Note that both bounds give you meaningful insertion locations for an element of the desired value so that the ordering is maintained, but lower_bound has the distinguishing feature that if the element already exists, then you get an iterator which actually points to that element. Thus you can use lower_bound on an ordered range to implement your own unique-membership or multiple-membership container.

void insert(Container & c, T const & t)
{
    auto it = std::lower_bound(c.begin(), c.end(), t);

    // if unique container:
    if (it != c.end() && *it == t) { /* error, element exists! */ return; }

    c.insert(it, t);
}

Solution 2:

It returns the iterator one past the last element that is less than the value asked for. This is useful as an insertion position (and that's why the function returns that iterator). It's also useful that the half-open range first, lower_bound(first, last, value) specifies all values less than value.

upper_bound returns the iterator one past the last element [less than or equal to / not greater than] the value asked for. Or strictly: the last element which the value is not less than, since both algorithms deal exclusively in less-than comparators.

If you want the iterator before the iterator returned by lower_bound, you can subtract 1 (for a random access iterator), decrement (for a bidirectional iterator), or do a linear search instead of using lower_bound (for a forward iterator that is none of those).

Beware the edge case that there is no element less than the value asked for, in which case you can't have what you want, because it doesn't exist. lower_bound of course returns the beginning of the range in that case, so doesn't need a special-case return value.

Solution 3:

Since this has been reopened, I'll try to make my comment an answer.

The name lower_bound is mathematically incorrect. A better name might be least_upper_bound, but that would probably confuse most non-mathematically minded folk. (And then what do you call upper_bound? almost_least_upper_bound? Yech!)

My advice: Get over the fact that the names lower_bound and upper_bound are technically incorrect. The two functions as defined are quite useful. Think of those functions as a useful abuse of notation.

To make a mathematically correct lower_bound function that conforms with the C++ concept of an iterator, the function would have to return a reverse iterator rather than a forward iterator. Returning a reverse iterator is not nearly as useful as the approach taken by the perhaps misnamed lower_bound and upper_bound, and the concept of returning a reverse iterator runs afoul of the fact that not all containers are reversible.

Why a reverse iterator? Just as there is no guarantee that an upper bound exists in the container, there similarly is no guarantee that a lower bound will exist. The existing lower_bound and upper_bound return end() to indicate that the searched-for value is off-scale high. A true lower bound would need to return rend() to indicate that the searched-for value is off-scale low.

There is a way to implement a true lower bound in the form of a forward iterator, but it comes at the price of abusing the meaning of end() to mean "there is no lower bound". The problem with this abuse of notation is that some user of the function might do something equivalent to true_lower_bound(off_scale_low_search_value)-1 and voila! one has a pointer to the largest element in the set.

That said, here's how to do it. Have the true lower bound function return end() if the container is empty or if the searched-for value is smaller than the first value in the container. Otherwise return upper_bound()-1.

Solution 4:

lower_bound, upper_bound and equal_range are functions which perform binary search in a sorted sequence. The need for three functions comes from the fact that elements may be repeated in the sequence:

1, 2, 3, 4, 4, 4, 5, 6, 7

In this case, when searching for the value 4, lower_bound will return an iterator pointing to the first of the three elements with value 4, upper_bound will return an iterator pointing to the element with value 5, and equal_range will return a pair containing these two iterators.

Solution 5:

Following David Hammen's answer, I attempted to explain why we often don't feel the names of lower_bound/upper_bound to be correct, or at least intuitive.

It's because we are looking for an element immediately lower than the query. I made a drawing and a use case:

sparse range queries

Code:

template< typename T, typename U >
auto infimum(std::map<T,U> const& ctr, T query)
{
    auto it = ctr.upper_bound(query);
    return it == ctr.begin() ? ctr.cend() : --it;
}

template< typename T, typename U >
bool is_in_interval(std::map<T,U> const& ctr, T query)
{
    auto inf = infimum(ctr, query);
    return inf == ctr.end() ? false : query <= inf->second;
}

https://ideone.com/jM8pt3

Basically to get the behavior of the "grey arrows", we need upper_bound - 1 which is why it's weird.

Let me rephrase that slightly: from the name lower_bound we instinctively expect returns-first-immediately-inferior-element (like the grey arrows). But we get returns-first-immediately-superior-element for lower_bound; and first-immediately-strictly-superior-element for upper_bound. That's what is surprising.

It's surprising in the hypothesis that you work with a sparse sequence like my thought experiment in the picture above. But it makes wonderful sense when you think of it in terms of «bounds of an equal_range» in a dense sequence, populated with plateaus, like Kerrek SB beautifully pictured.

Test code:

#include <map>
#include <cassert>
#include <iostream>

// .. paste infimum and is_in_interval here

int main()
{
    using std::cout;
    using Map = std::map<int,int>;
    Map intervals{{2,5}, {8,9}};

    auto red = infimum(intervals, 4);
    assert(red->first == 2);
    cout << "red->first " << red->first << "\n";

    auto green = infimum(intervals, 6);
    assert(green->first == 2);
    cout << "green->first " << green->first << "\n";

    auto pink = infimum(intervals, 8);
    assert(pink->first == 8);
    cout << "pink->first " << pink->first << "\n";

    auto yellow = infimum(intervals, 1);
    assert(yellow == intervals.cend());

    auto larger_than_all = infimum(intervals, 15);
    assert(larger_than_all->first == 8);

    bool red_is = is_in_interval(intervals, 4);
    cout << "red is in " << red_is << "\n";

    bool green_is = is_in_interval(intervals, 6);
    cout << "green is in " << green_is << "\n";

    bool pink_is = is_in_interval(intervals, 8);
    cout << "pink is in " << pink_is << "\n";

    bool yellow_is = is_in_interval(intervals, 1);
    cout << "yellow is in " << yellow_is << "\n";
}

results in

red->first 2
green->first 2
pink->first 8
red is in 1
green is in 0
pink is in 1
yellow is in 0

seems ok.

So of course the utility functions are not very good, they should be designed with a range API, so we can work with any collection or sub-range, or reverse iterators, or filtered views and whatnot. We can get that when we have C++20. In the meantime, I just made a simple educative map-taking API.

play with it:
https://ideone.com/jM8pt3