How the new range-based for loop in C++17 helps Ranges TS?
The committee changed the range-based for loop from:
-
C++11:
{ auto && __range = range_expression ; for (auto __begin = begin_expr, __end = end_expr; __begin != __end; ++__begin) { range_declaration = *__begin; loop_statement } }
-
to C++17 :
{ auto && __range = range_expression ; auto __begin = begin_expr ; auto __end = end_expr ; for ( ; __begin != __end; ++__begin) { range_declaration = *__begin; loop_statement } }
And people said that this will make implementing Ranges TS easier. Can you give me some examples?
C++11/14 range-for
was overconstrained...
The WG21 paper for this is P0184R0 which has the following motivation:
The existing range-based for loop is over-constrained. The end iterator is never incremented, decremented, or dereferenced. Requiring it to be an iterator serves no practical purpose.
As you can see from the Standardese that you posted, the end
iterator of a range is only used in the loop-condition __begin != __end;
. Hence end
only needs to be equality comparable to begin
, and it does not need to be dereferenceable or incrementable.
...which distorts operator==
for delimited iterators.
So what disadvantage does this have? Well, if you have a sentinel-delimited range (C-string, line of text, etc.), then you have to shoehorn the loop-condition into the iterator's operator==
, essentially like this
#include <iostream>
template <char Delim = 0>
struct StringIterator
{
char const* ptr = nullptr;
friend auto operator==(StringIterator lhs, StringIterator rhs) {
return lhs.ptr ? (rhs.ptr || (*lhs.ptr == Delim)) : (!rhs.ptr || (*rhs.ptr == Delim));
}
friend auto operator!=(StringIterator lhs, StringIterator rhs) {
return !(lhs == rhs);
}
auto& operator*() { return *ptr; }
auto& operator++() { ++ptr; return *this; }
};
template <char Delim = 0>
class StringRange
{
StringIterator<Delim> it;
public:
StringRange(char const* ptr) : it{ptr} {}
auto begin() { return it; }
auto end() { return StringIterator<Delim>{}; }
};
int main()
{
// "Hello World", no exclamation mark
for (auto const& c : StringRange<'!'>{"Hello World!"})
std::cout << c;
}
Live Example with g++ -std=c++14, (assembly using gcc.godbolt.org)
The above operator==
for StringIterator<>
is symmetric in its arguments and does not rely on whether the range-for is begin != end
or end != begin
(otherwise you could cheat and cut the code in half).
For simple iteration patterns, the compiler is able to optimize the convoluted logic inside operator==
. Indeed, for the above example, the operator==
is reduced to a single comparison. But will this continue to work for long pipelines of ranges and filters? Who knows. It is likely to require heroic optimization levels.
C++17 will relax the constraints which will simplify delimited ranges...
So where exactly does the simplification manifest itself? In operator==
, which now has extra overloads taking an iterator/sentinel pair (in both orders, for symmetry). So the run time logic becomes compile time logic.
#include <iostream>
template <char Delim = 0>
struct StringSentinel {};
struct StringIterator
{
char const* ptr = nullptr;
template <char Delim>
friend auto operator==(StringIterator lhs, StringSentinel<Delim> rhs) {
return *lhs.ptr == Delim;
}
template <char Delim>
friend auto operator==(StringSentinel<Delim> lhs, StringIterator rhs) {
return rhs == lhs;
}
template <char Delim>
friend auto operator!=(StringIterator lhs, StringSentinel<Delim> rhs) {
return !(lhs == rhs);
}
template <char Delim>
friend auto operator!=(StringSentinel<Delim> lhs, StringIterator rhs) {
return !(lhs == rhs);
}
auto& operator*() { return *ptr; }
auto& operator++() { ++ptr; return *this; }
};
template <char Delim = 0>
class StringRange
{
StringIterator it;
public:
StringRange(char const* ptr) : it{ptr} {}
auto begin() { return it; }
auto end() { return StringSentinel<Delim>{}; }
};
int main()
{
// "Hello World", no exclamation mark
for (auto const& c : StringRange<'!'>{"Hello World!"})
std::cout << c;
}
Live Example using g++ -std=c++1z (assembly using gcc.godbolt.org, which is almost identical to the previous example).
...and will in fact support fully general, primitive "D-style" ranges.
WG21 paper N4382 has the following suggestion:
C.6 Range Facade and Adaptor Utilities [future.facade]
1 Until it becomes trivial for users to create their own iterator types, the full potential of iterators will remain unrealized. The range abstraction makes that achievable. With the right library components, it should be possible for users to define a range with a minimal interface (e.g.,
current
,done
, andnext
members), and have iterator types automatically generated. Such a range facade class template is left as future work.
Essentially, this is equal to D-style ranges (where these primitives are called empty
, front
and popFront
). A delimited string range with only these primitives would look something like this:
template <char Delim = 0>
class PrimitiveStringRange
{
char const* ptr;
public:
PrimitiveStringRange(char const* c) : ptr{c} {}
auto& current() { return *ptr; }
auto done() const { return *ptr == Delim; }
auto next() { ++ptr; }
};
If one does not know the underlying representation of a primitive range, how to extract iterators from it? How to adapt this to a range that can be used with range-for
? Here's one way (see also the series of blog posts by @EricNiebler) and the comments from @T.C.:
#include <iostream>
// adapt any primitive range with current/done/next to Iterator/Sentinel pair with begin/end
template <class Derived>
struct RangeAdaptor : private Derived
{
using Derived::Derived;
struct Sentinel {};
struct Iterator
{
Derived* rng;
friend auto operator==(Iterator it, Sentinel) { return it.rng->done(); }
friend auto operator==(Sentinel, Iterator it) { return it.rng->done(); }
friend auto operator!=(Iterator lhs, Sentinel rhs) { return !(lhs == rhs); }
friend auto operator!=(Sentinel lhs, Iterator rhs) { return !(lhs == rhs); }
auto& operator*() { return rng->current(); }
auto& operator++() { rng->next(); return *this; }
};
auto begin() { return Iterator{this}; }
auto end() { return Sentinel{}; }
};
int main()
{
// "Hello World", no exclamation mark
for (auto const& c : RangeAdaptor<PrimitiveStringRange<'!'>>{"Hello World!"})
std::cout << c;
}
Live Example using g++ -std=c++1z (assembly using gcc.godbolt.org)
Conclusion: sentinels are not just a cute mechanism to press delimiters into the type system, they are general enough to support primitive "D-style" ranges (which themselves may have no notion of iterators) as a zero-overhead abstraction for the new C++1z range-for.
The new specification allows __begin
and __end
to be of different type, as long as __end
can be compared to __begin
for inequality. __end
doesn't even need to be an iterator and can be a predicate. Here is a silly example with a struct defining begin
and end
members, the latter being a predicate instead of an iterator:
#include <iostream>
#include <string>
// a struct to get the first word of a string
struct FirstWord {
std::string data;
// declare a predicate to make ' ' a string ender
struct EndOfString {
bool operator()(std::string::iterator it) { return (*it) != '\0' && (*it) != ' '; }
};
std::string::iterator begin() { return data.begin(); }
EndOfString end() { return EndOfString(); }
};
// declare the comparison operator
bool operator!=(std::string::iterator it, FirstWord::EndOfString p) { return p(it); }
// test
int main() {
for (auto c : {"Hello World !!!"})
std::cout << c;
std::cout << std::endl; // print "Hello World !!!"
for (auto c : FirstWord{"Hello World !!!"}) // works with gcc with C++17 enabled
std::cout << c;
std::cout << std::endl; // print "Hello"
}