changing the delimiter for cin (c++)
I've redirected cin to read from a file stream with cin.rdbuf(inF.rdbuf()).
When I use the extraction operator it reads until it reaches a white space character.
Is it possible to use another delimiter? I went through the API reference on cplusplus.com, but didn't find anything.
Solution 1:
It is possible to change the inter-word delimiter for cin, or any other std::istream, using std::ios_base::imbue to add a custom ctype facet.
If you are reading a file in the style of /etc/passwd, the following program will read each ':'-delimited word separately.
#include <iostream>
#include <locale>
#include <string>

// A ctype facet whose table classifies only ':' and '\n' as whitespace.
struct colon_is_space : std::ctype<char> {
    colon_is_space() : std::ctype<char>(get_table()) {}

    static mask const* get_table()
    {
        // Zero-initialized, so every other character (including ' ') is unclassified.
        static mask rc[table_size];
        rc[':'] = std::ctype_base::space;
        rc['\n'] = std::ctype_base::space;
        return &rc[0];
    }
};

int main() {
    using std::string;
    using std::cin;
    using std::locale;

    // The locale takes ownership of the facet.
    cin.imbue(locale(cin.getloc(), new colon_is_space));

    string word;
    while (cin >> word) {
        std::cout << word << "\n";
    }
}
Solution 2:
For strings, you can use the std::getline overloads to read using a different delimiter.
For number extraction, the delimiter isn't really "whitespace" to begin with, but any character invalid in a number.
Solution 3:
This is an improvement on Robᵩ's answer, because that is the right one (and I'm disappointed that it hasn't been accepted.)
What you need to do is change the array that ctype looks at to decide what a delimiter is.
In the simplest case you could create your own:
const ctype<char>::mask foo[ctype<char>::table_size] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ctype_base::space};
On my machine '\n' is 10. I've set that element of the array to the delimiter value: ctype_base::space. A ctype initialized with foo would only delimit on '\n', not ' ' or '\t'.
Now this is a problem because the array passed into ctype defines more than just what a delimiter is; it also defines letters, numbers, symbols, and some other junk needed for streaming. (Ben Voigt's answer touches on this.) So what we really want to do is modify a mask, not create one from scratch.
That can be accomplished like this:
const auto temp = ctype<char>::classic_table();
vector<ctype<char>::mask> bar(temp, temp + ctype<char>::table_size);
bar[' '] ^= ctype_base::space;
bar['\t'] &= ~(ctype_base::space | ctype_base::cntrl);
bar[':'] |= ctype_base::space;
A ctype initialized with bar would delimit on '\n' and ':' but not ' ' or '\t'.
You go about setting up cin, or any other istream, to use your custom ctype like this (bar must outlive the facet, because ctype<char> stores the pointer rather than copying the table):
cin.imbue(locale(cin.getloc(), new ctype<char>(data(bar))));
You can also switch between ctypes and the behavior will change mid-stream:
cin.imbue(locale(cin.getloc(), new ctype<char>(foo)));
If you need to go back to default behavior, just do this:
cin.imbue(locale(cin.getloc(), new ctype<char>));