【Question title】: In Boost Spirit, use of multi_pass with streaming file input: which iterator is needed?
【Posted】: 2015-11-19 14:02:58
【Question】:

I want to read a fairly large CSV file and parse it with Spirit Qi (using Boost 1.59.0). There are examples of this and it looks straightforward, but the obvious setup produces a compile error in which the first argument to qi::phrase_parse(...) is not accepted. What works here? (One example is: How to pass the iterator to a function in spirit qi) Code:

#define BOOST_SPIRIT_DEBUG
//#define BOOST_SPIRIT_DEBUG_PRINT_SOME 200
//#define BOOST_SPIRIT_DEBUG_OUT std::cerr

#include <stdio.h>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_multi_pass.hpp>
#include <fstream>

std::string dataLoc = "afile.csv";

namespace qi = boost::spirit::qi;

using Column  = std::string;
using Columns = std::vector<Column>;
using CsvLine = Columns;
using CsvParsed = std::vector<CsvLine>;

template <typename It>
struct CsvGrammar : qi::grammar<It, CsvParsed(), qi::blank_type>
{
    CsvGrammar() : CsvGrammar::base_type(start)
    {
        using namespace qi;

        static const char colsep = '|';

        start  = -line % eol;
        line   = column % colsep;
        column = quoted | *~char_(colsep);
        quoted = '"' >> *("\"\"" | ~char_('"')) >> '"';

        BOOST_SPIRIT_DEBUG_NODES((start)(line)(column)(quoted));
    }
private:
    qi::rule<It, CsvParsed(), qi::blank_type> start;
    qi::rule<It, CsvLine(), qi::blank_type> line;
    qi::rule<It, Column(),  qi::blank_type> column;
    qi::rule<It, std::string()> quoted;
};

int main()
{
    std::ifstream inFile(dataLoc, std::ifstream::in);
    if (inFile.good()) {
        std::cout << "input found" << std::endl;
    }
/*
    // use either this block of code
    typedef boost::spirit::istream_iterator istreamIter;
    istreamIter fwd_begin = istreamIter(inFile);
    istreamIter fwd_end = istreamIter();
*/
    // or this block
    typedef std::istreambuf_iterator<char> base_iterator_type;
    typedef boost::spirit::multi_pass<base_iterator_type> forward_iterator_type;
    base_iterator_type in_begin(inFile);
    base_iterator_type in_end;
    forward_iterator_type fwd_begin = boost::spirit::make_default_multi_pass(in_begin);
    forward_iterator_type fwd_end  = boost::spirit::make_default_multi_pass(in_end);

    CsvGrammar<std::string::const_iterator> p;
    CsvParsed parsed;
    bool ok = qi::phrase_parse(fwd_begin, fwd_end, p, qi::blank, parsed);
    if (ok)
    {
        for(auto& line : parsed) {
            for(auto& col : line)
                std::cout << '[' << col << ']';
            std::cout << std::endl;
        }
    } else
    {
        std::cout << "Parse failed\n";
    }

    if (fwd_begin != fwd_end)
        std::cout << "Remaining unparsed: '" << std::string(fwd_begin, fwd_end ) << "'\n";
}

The compiler (Apple clang 6.1 via CLion) gives the following error:

In file included from /Users/alan/ClionProjects/csvreader/csvReader.cpp:16:
In file included from /Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/include/qi.hpp:16:
In file included from /Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi.hpp:21:
In file included from /Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/nonterminal.hpp:14:
In file included from /Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/nonterminal/rule.hpp:35:
/Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/reference.hpp:43:30: error: no matching member function for call to 'parse'
            return ref.get().parse(first, last, context, skipper, attr_);
                   ~~~~~~~~~~^~~~~
/Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/parse.hpp:164:40: note: in instantiation of function template specialization 'boost::spirit::qi::reference, std::__1::vector, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > (), boost::proto::exprns_::expr >, 0>, boost::spirit::unused_type, boost::spirit::unused_type> >::parse >, boost::spirit::iterator_policies::default_policy >, boost::spirit::context, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > > &, boost::fusion::nil_>, boost::spirit::locals >, boost::spirit::qi::char_class >, std::__1::vector, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > > >' requested here
        if (!compile(expr).parse(
                           ^
/Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/parse.hpp:197:20: note: in instantiation of function template specialization 'boost::spirit::qi::phrase_parse >, boost::spirit::iterator_policies::default_policy >, CsvGrammar >, boost::proto::exprns_::expr >, 0>, std::__1::vector, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > > >' requested here
        return qi::phrase_parse(first, last, expr, skipper, skip_flag::postskip, attr);
                   ^
/Users/alan/ClionProjects/csvreader/csvReader.cpp:74:19: note: in instantiation of function template specialization 'boost::spirit::qi::phrase_parse >, boost::spirit::iterator_policies::default_policy >, CsvGrammar >, boost::proto::exprns_::expr >, 0>, std::__1::vector, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > > >' requested here
    bool ok = qi::phrase_parse(fwd_begin, fwd_end, p, qi::blank, parsed);
                  ^
/Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/nonterminal/rule.hpp:274:14: note: candidate function [with Context = boost::spirit::context, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > > &, boost::fusion::nil_>, boost::spirit::locals >, Skipper = boost::spirit::qi::char_class >, Attribute = std::__1::vector, std::__1::allocator > >, std::__1::allocator, std::__1::allocator > > > >] not viable: no known conversion from 'boost::spirit::multi_pass >, boost::spirit::iterator_policies::default_policy >' to 'std::__1::__wrap_iter &' for 1st argument
        bool parse(Iterator& first, Iterator const& last
             ^
/Users/alan/ClionProjects/csvreader/boost/boost_1_59_0/boost/spirit/home/qi/nonterminal/rule.hpp:320:14: note: candidate function template not viable: requires 6 arguments, but 5 were provided
        bool parse(Iterator& first, Iterator const& last
             ^

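So it looks like an iterator of the wrong type is being passed as the first argument to qi::phrase_parse. What should go here?

【Comments】:

  • I think maybe you don't need to create the multi_pass iterator manually and can just pass the istream iterator? Will try...
  • @sehe Not this time :)
  • As far as I know, std::istream_iterator has never been accepted in Spirit parser expressions, but see my answer
  • @sehe Thanks, that solved the problem (and the comments on the parsing helped too). I'll look into memory mapping.

Tags: c++ boost boost-spirit boost-spirit-qi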


【Answer 1】:

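You have the grammar declared with std::string::const_iterator, but the iterators you pass to qi::phrase_parse are multi_pass iterators. The grammar has to be instantiated with the iterator type you actually parse with:

CsvGrammar<forward_iterator_type> p;

is more like it.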
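
To see why the mismatch is a hard error rather than something the compiler can convert away, here is a minimal sketch that uses only the standard library. ToyGrammar and parse_first_line are hypothetical names invented for this illustration; they are not part of Spirit. The point they model is the one visible in the error message: the parse member takes Iterator& of the grammar's own iterator type, so no other iterator type can bind to it.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical stand-in for a Spirit grammar. As with qi::rule, parse() is
// not a template on the iterator: it only accepts iterators of exactly the
// type the class was instantiated with.
template <typename It>
struct ToyGrammar {
    // Consumes characters up to (not including) the first '\n'.
    bool parse(It& first, It const& last, std::string& out) const {
        while (first != last && *first != '\n') {
            out += *first;
            ++first;
        }
        return !out.empty();
    }
};

// Deduce the grammar's iterator type from the iterators actually passed in,
// so the two cannot drift apart.
// (ToyGrammar<std::string::const_iterator> would not compile here when It is
// a stream iterator, which is the same kind of failure as in the question.)
template <typename It>
bool parse_first_line(It& first, It const& last, std::string& out) {
    ToyGrammar<It> g;   // same idea as CsvGrammar<forward_iterator_type>
    return g.parse(first, last, out);
}
```

Applied to the code in the question, the same idea could be written as CsvGrammar<decltype(fwd_begin)> p; which names the same type as forward_iterator_type without repeating it.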

Also:

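  1. You can use boost::spirit::istream_iterator directly (nearly equivalent, but more convenient); just remember to unset the std::ios::skipws flag on the stream in that case
  2. Consider parsing from a memory-mapped file (zero-copy); I have some answers that do this. That should scale well beyond what stream parsing can promise, because the AST can be lazy/lightweight
  3. You probably want "" to parse as a single ", so write the rule as: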

    quoted = '"' >> *("\"" >> char_('"') | ~char_('"')) >> '"';
    
  4. You want unquoted columns to stop at eol, so write that rule as:

    column = quoted | *(char_ - colsep - eol);
    
  5. To avoid a trailing empty record:

    start  = *(line >> eol);
    column = quoted | +(char_ - colsep - eol);
    
  6. And to skip empty lines:

    start  = *(line >> +eol);
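
For readers who find the quoted rule from item 3 hard to read, the following is a hand-written sketch of what it is meant to accept, using only the standard library. parse_quoted is a hypothetical helper written for this illustration and is not generated by or part of Spirit; it follows the rule's alternatives in order (doubled quote first, then any non-quote character, then the closing quote).

```cpp
#include <cstddef>
#include <string>

// Mirrors  '"' >> *("\"" >> char_('"') | ~char_('"')) >> '"'  by hand.
// Parses one quoted field starting at text[pos]. On success it stores the
// unescaped contents in out, advances pos past the closing quote and returns
// true. On failure pos and out are left unchanged.
bool parse_quoted(const std::string& text, std::size_t& pos, std::string& out) {
    std::size_t i = pos;
    if (i >= text.size() || text[i] != '"') return false;   // opening quote
    ++i;
    std::string result;
    while (i < text.size()) {
        if (text[i] == '"') {
            if (i + 1 < text.size() && text[i + 1] == '"') {
                result += '"';   // doubled quote: keep a single one
                i += 2;
                continue;
            }
            break;               // lone quote: candidate closing quote
        }
        result += text[i++];     // any other character, including '|' and '\n'
    }
    if (i >= text.size()) return false;   // ran out of input: no closing quote
    pos = i + 1;
    out = result;
    return true;
}
```

Note that a quoted field may contain the column separator and line breaks, which is why item 4 only restricts unquoted columns.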
    

Live On Coliru

#define BOOST_SPIRIT_DEBUG
//#define BOOST_SPIRIT_DEBUG_PRINT_SOME 200
//#define BOOST_SPIRIT_DEBUG_OUT std::cerr

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_multi_pass.hpp>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

std::string dataLoc = "afile.csv";

namespace qi = boost::spirit::qi;

using Column  = std::string;
using Columns = std::vector<Column>;
using CsvLine = Columns;
using CsvParsed = std::vector<CsvLine>;

template <typename It>
struct CsvGrammar : qi::grammar<It, CsvParsed(), qi::blank_type>
{
    CsvGrammar() : CsvGrammar::base_type(start)
    {
        using namespace qi;

        static const char colsep = '|';

        start  = *(line >> +eol);
        line   = column % colsep;
        column = quoted | +(char_ - colsep - eol);
        quoted = '"' >> *("\"" >> char_('"') | ~char_('"')) >> '"';

        BOOST_SPIRIT_DEBUG_NODES((start)(line)(column)(quoted));
    }
private:
    qi::rule<It, CsvParsed(), qi::blank_type> start;
    qi::rule<It, CsvLine(),   qi::blank_type> line;
    qi::rule<It, Column(),    qi::blank_type> column;
    qi::rule<It, std::string()> quoted;
};

int main()
{
    std::ifstream inFile(dataLoc, std::ifstream::in);
    if (inFile.good()) {
        std::cout << "input found" << std::endl;
    }
/*
    // use either this block of code
    typedef boost::spirit::istream_iterator istreamIter;
    istreamIter fwd_begin = istreamIter(inFile);
    istreamIter fwd_end = istreamIter();
*/
    // or this block
    typedef std::istreambuf_iterator<char> base_iterator_type;
    typedef boost::spirit::multi_pass<base_iterator_type> forward_iterator_type;
    base_iterator_type in_begin(inFile);
    base_iterator_type in_end;
    forward_iterator_type fwd_begin = boost::spirit::make_default_multi_pass(in_begin);
    forward_iterator_type fwd_end   = boost::spirit::make_default_multi_pass(in_end);

    CsvGrammar<forward_iterator_type> p;
    CsvParsed parsed;
    bool ok = qi::phrase_parse(fwd_begin, fwd_end, p, qi::blank, parsed);
    if (ok)
    {
        for(auto& line : parsed) {
            for(auto& col : line)
                std::cout << '[' << col << ']';
            std::cout << std::endl;
        }
    } else
    {
        std::cout << "Parse failed\n";
    }

    if (fwd_begin != fwd_end)
        std::cout << "Remaining unparsed: '" << std::string(fwd_begin, fwd_end ) << "'\n";
}

Prints

<start>
  <try>a|b|c\n1|2|3\nX|Y|Z\n</try>
  <line>
    <try>a|b|c\n1|2|3\nX|Y|Z\n</try>
    <column>
      <try>a|b|c\n1|2|3\nX|Y|Z\n</try>
      <quoted>
        <try>a|b|c\n1|2|3\nX|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>|b|c\n1|2|3\nX|Y|Z\n</success>
      <attributes>[[a]]</attributes>
    </column>
    <column>
      <try>b|c\n1|2|3\nX|Y|Z\n</try>
      <quoted>
        <try>b|c\n1|2|3\nX|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>|c\n1|2|3\nX|Y|Z\n</success>
      <attributes>[[b]]</attributes>
    </column>
    <column>
      <try>c\n1|2|3\nX|Y|Z\n</try>
      <quoted>
        <try>c\n1|2|3\nX|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>\n1|2|3\nX|Y|Z\n</success>
      <attributes>[[c]]</attributes>
    </column>
    <success>\n1|2|3\nX|Y|Z\n</success>
    <attributes>[[[a], [b], [c]]]</attributes>
  </line>
  <line>
    <try>1|2|3\nX|Y|Z\n</try>
    <column>
      <try>1|2|3\nX|Y|Z\n</try>
      <quoted>
        <try>1|2|3\nX|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>|2|3\nX|Y|Z\n</success>
      <attributes>[[1]]</attributes>
    </column>
    <column>
      <try>2|3\nX|Y|Z\n</try>
      <quoted>
        <try>2|3\nX|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>|3\nX|Y|Z\n</success>
      <attributes>[[2]]</attributes>
    </column>
    <column>
      <try>3\nX|Y|Z\n</try>
      <quoted>
        <try>3\nX|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>\nX|Y|Z\n</success>
      <attributes>[[3]]</attributes>
    </column>
    <success>\nX|Y|Z\n</success>
    <attributes>[[[1], [2], [3]]]</attributes>
  </line>
  <line>
    <try>X|Y|Z\n</try>
    <column>
      <try>X|Y|Z\n</try>
      <quoted>
        <try>X|Y|Z\n</try>
        <fail/>
      </quoted>
      <success>|Y|Z\n</success>
      <attributes>[[X]]</attributes>
    </column>
    <column>
      <try>Y|Z\n</try>
      <quoted>
        <try>Y|Z\n</try>
        <fail/>
      </quoted>
      <success>|Z\n</success>
      <attributes>[[Y]]</attributes>
    </column>
    <column>
      <try>Z\n</try>
      <quoted>
        <try>Z\n</try>
        <fail/>
      </quoted>
      <success>\n</success>
      <attributes>[[Z]]</attributes>
    </column>
    <success>\n</success>
    <attributes>[[[X], [Y], [Z]]]</attributes>
  </line>
  <line>
    <try></try>
    <column>
      <try></try>
      <quoted>
        <try></try>
        <fail/>
      </quoted>
      <fail/>
    </column>
    <fail/>
  </line>
  <success></success>
  <attributes>[[[[a], [b], [c]], [[1], [2], [3]], [[X], [Y], [Z]]]]</attributes>
</start>
[a][b][c]
[1][2][3]
[X][Y][Z]
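
On the std::ios::skipws warning from item 1: if you switch to the commented-out boost::spirit::istream_iterator block, the stream's whitespace-skipping flag matters, because formatted extraction drops spaces and line ends before the parser ever sees them. The sketch below shows the effect with the standard std::istream_iterator<char>, which reads through operator>>; it is an illustration of the flag, not of Spirit itself, and slurp is a hypothetical helper name.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Collects every character that an istream_iterator<char> hands out.
// std::istream_iterator reads with operator>>, so it honours the stream's
// skipws flag (which is set by default).
std::string slurp(std::istream& in, bool keep_whitespace) {
    if (keep_whitespace)
        in.unsetf(std::ios::skipws);   // same effect as: in >> std::noskipws;
    std::istream_iterator<char> first(in), last;
    return std::string(first, last);
}
```

With the flag left set, the input "a b|c\n1|2\n" comes back as "ab|c1|2": the line ends that the grammar relies on (eol) are gone. With the flag unset, the input comes back unchanged. For the code above that means calling inFile.unsetf(std::ios::skipws) before constructing the iterators.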
