How to test whether stringstream operator>> has parsed a bad type and skip it
I am interested in discussing methods for using stringstream to parse a line with multiple types. I would begin by looking at the following line:
"2.832 1.3067 nana 1.678"
Now let's assume I have a long line that has multiple strings and doubles. The obvious way to solve this is to tokenize the string and then try converting each token. I am interested in skipping this second step and using stringstream directly to find only the numbers.
I figured a good way to approach this would be to read through the string and check if the failbit has been set, which it will if I try to parse a string into a double.
Say I have the following code:
string a("2.832 1.3067 nana 1.678");
stringstream parser;
parser.str(a);
for (int i = 0; i < 4; ++i)
{
    double b;
    parser >> b;
    if (parser.fail())
    {
        std::cout << "Failed!" << std::endl;
        parser.clear();
    }
    std::cout << b << std::endl;
}
It will print out the following:
2.832
1.3067
Failed!
0
Failed!
0
I am not surprised that it fails to parse a string, but what is happening internally such that it fails to clear its failbit and parse the next number?
The following code skips the bad word and collects the valid double values:
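istringstream iss("2.832 1.3067 nana 1.678");
double num = 0;
// Keep looping after a failed extraction unless the input is exhausted
while(iss >> num || !iss.eof()) {
    if(iss.fail()) {
        iss.clear();   // reset the error flags
        string dummy;
        iss >> dummy;  // consume the invalid token
        continue;
    }
    cout << num << endl;
}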
Your sample was almost right; it was just missing the step that consumes the invalid input field from the stream after detecting that its format is wrong:
if (parser.fail()) {
    std::cout << "Failed!" << std::endl;
    parser.clear();
    string dummy;
    parser >> dummy;
}
In your case, clear() only resets the error flags; it does not consume any input. The extraction therefore tries to read from "nana" again in the last iteration, hence the last two lines of the output.
Also note the trick with iostream::fail() and the way iostream::eof() is actually tested in my first sample. There's a well-known Q&A on why simply testing for EOF as a loop condition is considered wrong, and it explains well how to break out of the input loop when unexpected or invalid values are encountered. How to skip or ignore invalid input fields, however, isn't explained there (and wasn't asked for).
This has a few minor differences from πάντα ῥεῖ's answer: it also handles negative number representations and the like, and is, IMHO, a little simpler to read.
#include <iostream>
#include <sstream>
#include <string>
int main()
{
    std::istringstream iss("2.832 1.3067 nana1.678 x-1E2 xxx.05 meh.ugh");
    double num = 0;
    while (iss)
    {
        if (iss >> num)
            std::cout << num << '\n';
        else if (!iss.eof())
        {
            iss.clear();
            iss.ignore(1);
        }
    }
}
Output:
2.832
1.3067
1.678
-100
0.05
I have built a more fine-tuned version of this that skips invalid input character by character (so the double numbers need not be separated from the invalid text by whitespace):
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
    istringstream iss("2.832 1.3067 nana1.678 xxx.05 meh.ugh");
    double num = 0;
    while(iss >> num || !iss.eof()) {
        if(iss.fail()) {
            iss.clear();
            while(iss) {
                // peek() returns an int so that it can also report end of input
                int next = iss.peek();
                if(next == istringstream::traits_type::eof()) {
                    break;
                }
                if(std::isdigit(static_cast<unsigned char>(next)) || next == '.') {
                    // Stop consuming invalid double characters
                    break;
                }
                iss.ignore(); // Consume one invalid double character
            }
            continue;
        }
        cout << num << endl;
    }
    return 0;
}
Output:
2.832
1.3067
1.678
0.05