Boost spirit skipper issues
I am having trouble with Boost.Spirit skippers.
I need to parse a file like this:
ROW int
int [int, int]
int [int, int]
...
I can parse it without problems (thanks to Stack Overflow ;) but only if I add an '_' after the first int.
I think the skipper eats the end of line after the first int, so the first int and the second one (on the next line) look like a single int. I don't understand how to keep the eol while still skipping spaces. I've found examples that use a custom parser, like here and here.
I tried qi::blank and a custom parser with a single rule lit(' '). No matter which skipper I use, spaces and eol are always eaten.
My grammar is as follows.
A line:
struct rowType
{
    unsigned int number;
    std::list<unsigned int> list;
};
The full problem is stored in a structure:
struct problemType
{
    unsigned int ROW;
    std::vector<rowType> rows;
};
The row parser:
template<typename Iterator>
struct row_parser : qi::grammar<Iterator, rowType(), qi::space_type>
{
    row_parser() : row_parser::base_type(start)
    {
        list = '[' >> -(qi::int_ % ',') >> ']';
        start = qi::int_ >> list;
    }
    qi::rule<Iterator, rowType(), qi::space_type> start;
    qi::rule<Iterator, std::list<unsigned int>(), qi::space_type> list;
};
And the problem parser:
template<typename Iterator>
struct problem_parser : qi::grammar<Iterator, problemType(), qi::space_type>
{
    problem_parser() : problem_parser::base_type(start)
    {
        using boost::phoenix::bind;
        using qi::lit;
        start = qi::int_ >> lit('_') >> +(row);
        //BOOST_SPIRIT_DEBUG_NODE(start);
    }
    qi::rule<Iterator, problemType(), qi::space_type> start;
    row_parser<Iterator> row;
};
And I use it like this:
int main() {
    static const problem_parser<spirit::multi_pass<base_iterator_type> > p;
    ...
    spirit::qi::phrase_parse(first, last,
                             p,
                             qi::space,
                             pb);
}
Of course, qi::space is my problem, and one way to solve it would be to not use a skipper at all, but phrase_parse requires one, and so my parser requires one too.
I've been stuck for some hours now... I think it's something obvious that I have misunderstood.
Thanks for your help.
In general, the following directives are helpful for inhibiting or switching skippers mid-grammar:
- qi::lexeme [ p ], which inhibits the skipper (e.g. if you want to be sure you parse an identifier without internal skips); see also no_skip for comparison
- qi::raw [ p ], which parses as always, including skips, but returns the raw iterator range of the matched source sequence (including the skipped positions)
- qi::no_skip [ p ], which inhibits skipping without pre-skip (I've created a minimal example to demonstrate the difference here: Boost Spirit lexeme vs no_skip)
- qi::skip(s) [ p ], which replaces the skipper with another skipper s altogether (note that you need to use appropriately declared qi::rule<> instances inside such a skip[] clause)
where p is any parser expression.
Specific solution
Your problem, as you already know, might be that qi::space eats all whitespace. I can't possibly know what is wrong in your grammar (since you show neither the full grammar nor the relevant input).
Therefore, here's what I'd write. Note
- the use of qi::eol to explicitly require line breaks at specific locations
- the use of qi::blank as a skipper (not including eol)
- for brevity I combined the grammars
Code:
#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <list>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
struct rowType {
    unsigned int number;
    std::list<unsigned int> list;
};
struct problemType {
    unsigned int ROW;
    std::vector<rowType> rows;
};
// Adapt the structs so Spirit can fill them as attribute sequences
BOOST_FUSION_ADAPT_STRUCT(rowType, (unsigned int, number)(std::list<unsigned int>, list))
BOOST_FUSION_ADAPT_STRUCT(problemType, (unsigned int, ROW)(std::vector<rowType>, rows))
template<typename Iterator>
struct problem_parser : qi::grammar<Iterator, problemType(), qi::blank_type>
{
    problem_parser() : problem_parser::base_type(problem)
    {
        using namespace qi;
        list    = '[' >> -(int_ % ',') >> ']';
        row     = int_ >> list >> eol;             // each row ends with a line break
        problem = "ROW" >> int_ >> eol >> +row;    // header line, then one or more rows
        BOOST_SPIRIT_DEBUG_NODES((problem)(row)(list));
    }
    // qi::blank_type: the skipper skips spaces and tabs, but not line breaks
    qi::rule<Iterator, problemType()            , qi::blank_type> problem;
    qi::rule<Iterator, rowType()                , qi::blank_type> row;
    qi::rule<Iterator, std::list<unsigned int>(), qi::blank_type> list;
};
int main()
{
    const std::string input =
        "ROW 1\n"
        "2 [3, 4]\n"
        "5 [6, 7]\n";
    auto f = begin(input), l = end(input);
    problem_parser<std::string::const_iterator> p;
    problemType data;
    bool ok = qi::phrase_parse(f, l, p, qi::blank, data);
    if (ok) std::cout << "success\n";
    else    std::cout << "failed\n";
    if (f!=l)
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
If you really didn't want to require line breaks:
template<typename Iterator>
struct problem_parser : qi::grammar<Iterator, problemType(), qi::space_type>
{
    problem_parser() : problem_parser::base_type(problem)
    {
        using namespace qi;
        list    = '[' >> -(int_ % ',') >> ']';
        row     = int_ >> list;
        problem = "ROW" >> int_ >> +row;
        BOOST_SPIRIT_DEBUG_NODES((problem)(row)(list));
    }
    qi::rule<Iterator, problemType()            , qi::space_type> problem;
    qi::rule<Iterator, rowType()                , qi::space_type> row;
    qi::rule<Iterator, std::list<unsigned int>(), qi::space_type> list;
};
int main()
{
    const std::string input =
        "ROW 1 " // NOTE whitespace, obviously required!
        "2 [3, 4]"
        "5 [6, 7]";
    auto f = begin(input), l = end(input);
    problem_parser<std::string::const_iterator> p;
    problemType data;
    bool ok = qi::phrase_parse(f, l, p, qi::space, data);
    if (ok) std::cout << "success\n";
    else    std::cout << "failed\n";
    if (f!=l)
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
Update
In response to the comment: here is a snippet that shows how to read the input from a file. This was tested and works fine for me:
std::ifstream ifs("input.txt"/*, std::ios::binary*/);
ifs.unsetf(std::ios::skipws);
boost::spirit::istream_iterator f(ifs), l;
problem_parser<boost::spirit::istream_iterator> p;