Safer but easy-to-use and flexible C++ alternative to sscanf()

When I need to scan in values from a bunch of strings, I often find myself falling back to C's sscanf() strictly because of its simplicity and ease of use. For example, I can very succinctly pull a couple double values out of a string with:

string str;
double val1, val2;
if (sscanf(str.c_str(), "(%lf,%lf)", &val1, &val2) == 2)
{
    // got them!
}

This obviously isn't very C++. I don't necessarily consider that an abomination, but I'm always looking for a better way to do a common task. I understand that the "C++ way" to read strings is istringstream, but the extra typing required to handle the parentheses and comma in the format string above makes it too cumbersome for me to want to use it.
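For reference, a plain istringstream version of the same check might look something like the sketch below. The helper name scan_pair_stream is made up for illustration; the point is only that each punctuation character has to be read into a variable and compared by hand.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Reads "(x,y)" from str; returns true and fills val1/val2 on success.
bool scan_pair_stream(const std::string& str, double& val1, double& val2) {
    std::istringstream in(str);
    char open = 0, comma = 0, close = 0;
    double a = 0.0, b = 0.0;
    // Every literal character is extracted into a variable and then checked.
    if (in >> open >> a >> comma >> b >> close
        && open == '(' && comma == ',' && close == ')') {
        val1 = a;
        val2 = b;
        return true;
    }
    return false;
}
```

It is type-safe, but noticeably more verbose than the single sscanf() format string.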

Is there a good way to either bend built-in facilities to my will in a way similar to the above, or is there a good C++ library that does the above in a more type-safe way? It looks like Boost.Format has really solved the output problem in a good way, but I haven't found anything similarly succinct for input.


Solution 1:

I wrote a bit of code that can read in string and character literals. Like normal stream reads, if it gets invalid data it sets the failbit of the stream. This should work for all types of streams, including wide streams. Stick this bit in a new header:

#include <algorithm>
#include <array>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

//match a non-empty string literal: skip leading whitespace, then require an exact match
template<class e, class t, std::size_t N>
std::basic_istream<e,t>& operator>>(std::basic_istream<e,t>& in, const e(&sliteral)[N]) {
        std::array<e, N-1> buffer{}; //zeroed buffer; N-1 drops the terminating null
        in >> buffer[0]; //skips whitespace
        if (N>2)
                in.read(&buffer[1], N-2); //read the rest
        if (!std::equal(buffer.begin(), buffer.end(), sliteral)) //unlike strncmp, works for wide characters too
                in.setstate(std::ios::failbit); //setstate keeps any flags that are already set
        return in;
}
//match a single character literal
template<class e, class t>
std::basic_istream<e,t>& operator>>(std::basic_istream<e,t>& in, const e& cliteral) {
        e buffer = e();  //get buffer
        in >> buffer; //read data
        if (buffer != cliteral) //if it failed
                in.setstate(std::ios::failbit); //set the state
        return in;
}
//redirect mutable char arrays to their normal function
template<class e, class t, std::size_t N>
std::basic_istream<e,t>& operator>>(std::basic_istream<e,t>& in, e(&carray)[N]) {
        return std::operator>>(in, carray);
}

With that in place, matching literal characters in the input becomes very easy:

std::istringstream input;
double val1, val2;
if (input >>'('>>val1>>','>>val2>>')') //less chars than scanf I think
{
    // got them!
}

PROOF OF CONCEPT. Now you can cin string and character literals, and if the input is not an exact match, the stream fails just as it would for any other type that could not be read correctly. Note that leading whitespace in the input is skipped before matching, so whitespace in a string literal is only matched when it isn't the first character. It's only three functions, all of which are brain-dead simple.

EDIT

Parsing with streams is a bad idea. Use a regex.

Solution 2:

The best thing I've ever used for string parsing is Boost.Spirit. It's fast, safe and very flexible. The big advantage is that you can write parsing rules in a form close to an EBNF grammar:

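#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/vector.hpp>
#include <string>

using namespace boost::spirit;

boost::fusion::vector < double, double > value_;

std::string string_ = "10.5,10.6 ";

// qi::parse advances the first iterator, so pass a named iterator
// (first_ is a name added here for illustration)
std::string::iterator first_ = string_.begin();

bool result_ = qi::parse(
    first_,
    string_.end(),
    qi::double_ >> ',' >> qi::double_, // Parsing rule
    value_); // value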

Solution 3:

I think this could be done easily with a regex, using either boost::regex or std::regex from the new standard (C++11). After that, just convert the captured tokens to floating-point values with lexical_cast or a stream.
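A minimal sketch of that idea with std::regex, assuming the "(x,y)" format from the question. The helper name scan_pair_regex and the exact number pattern are illustrative choices, not part of any library, and std::stod is used here in place of lexical_cast to keep the example standard-only.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Matches "(x,y)" and converts both captured tokens to double.
bool scan_pair_regex(const std::string& str, double& val1, double& val2) {
    // One capture group per number: optional sign, digits with an optional
    // fraction (or a leading-dot fraction), and an optional exponent.
    static const std::regex re(
        R"re(\(\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)\s*,\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)\s*\))re");
    std::smatch m;
    if (!std::regex_match(str, m, re))
        return false; // the whole string must match the pattern
    // std::stod may throw std::out_of_range for values that do not fit a double.
    val1 = std::stod(m[1].str());
    val2 = std::stod(m[2].str());
    return true;
}
```

Called as scan_pair_regex(str, val1, val2), it is used much like the sscanf() check in the question, with the format validation and the numeric conversion kept as separate steps.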