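【Question title】: Boost.Spirit grammar issue
【Posted】: 2014-06-16 20:20:59
【Question】:

I am trying to parse terminfo definition text files, and I am new to Boost.Spirit. I started with a simple grammar that parses only comment lines, empty lines, and terminal definitions. As the code comment in the grammar shows, uncommenting the [_val = _1] on definition breaks compilation. Why? Can I fix it?

Setting real terminfo files aside, I expect the code below to parse text like this: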

# comment line

first definition line
  second 
  third line

# another comment line

Code:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_eol.hpp>
#include <boost/spirit/include/qi_eoi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>
#include <vector>
#include <iostream>
#include <string>

namespace termcxx
{

namespace parser
{

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace px = boost::phoenix;

//using qi::double_;
using ascii::space;
//using px::ref;
using px::construct;

//using qi::eps;
//using qi::lit;
using qi::_val;
using qi::_1;
using ascii::char_;
using qi::eol;
using qi::eoi;


struct context
{
    int dummy;

    context () = default;
    context (context const &) = default;
    context (std::vector<char> a)
    { }
    context (std::vector<char> a, std::vector<char> b)
    { }
};

} }


BOOST_FUSION_ADAPT_STRUCT(
    termcxx::parser::context,
    (int, dummy))


namespace termcxx
{

namespace parser
{

template <typename Iterator>
struct parser
    : qi::grammar<Iterator, context()>
{
    qi::rule<Iterator, std::vector<char> > comment_line
    = (*space >> '#' >> *(char_ - eol) >> (eol | eoi))[_val = _1]
        ;

    qi::rule<Iterator, std::vector<char> > empty_line
    = (*space >> (eol | eoi))[_val = _1]
        ;

    qi::rule<Iterator, std::vector<char> > def_first_line
    = (+(char_ - eol) >> (eol | eoi))[_val = _1]
        ;

    qi::rule<Iterator, std::vector<char> > def_subsequent_line
    = (+space >> +(char_ - eol) >> (eol | eoi))[_val = _1]
        ;

    qi::rule<Iterator, std::vector<char> > definition
    = (def_first_line >> *def_subsequent_line)//[_val = _1] // Uncommenting the [_val = _1] breaks compilation. Why?
        ;

    qi::rule<Iterator, context()> start
    = (*(comment_line
            | empty_line
            | definition))[_val = construct<context> ()]
        ;

    parser()
        : parser::base_type(start)
    { }
};

template struct parser<std::string::iterator>;

} // namespace parser

} // namespace termcxx

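【Comments】:

    Tags: c++ parsing c++11 boost boost-spirit


    【Answer 1】:

    Why do you insist on specifying [_val=_1]? It is redundant, because default attribute propagation already does this. In fact it hurts; see below.

    Next, the attribute type of (def_first_line >> *def_subsequent_line) is (obviously) not compatible with std::vector<char>. Perhaps you can:

    • just use default attribute propagation (it has enough smarts to keep appending)
    • use raw[] to get the complete matched input
    • define BOOST_SPIRIT_ACTIONS_ALLOW_ATTR_COMPAT (I'm not sure this is well supported)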

    Update

    There are several more issues:

    • You misspelled the attribute type of most rules (the () is missing):

      qi::rule<Iterator, std::string()> comment_line;
      qi::rule<Iterator, std::string()> empty_line;
      qi::rule<Iterator, std::string()> def_first_line;
      qi::rule<Iterator, std::string()> def_subsequent_line;
      qi::rule<Iterator, std::string()> definition;
      
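    • empty_line matched on eoi; that match succeeds without consuming any input, which leads to an infinite loop at the end of the input

    • char_ also accepts whitespace (use graph instead):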

          def_first_line      = graph >> +(char_ - eol)         >> (eol|eoi);
      
    • qi::space also eats line ends! Use qi::blank instead

    • For reliability, define the rules like this:

          empty_line          = *blank >> eol;
          comment_line        = *blank >> '#' >> *(char_ - eol) >> (eol|eoi);
          def_first_line      = graph >> +(char_ - eol)         >> (eol|eoi);
          def_subsequent_line = +blank >> +(char_ - eol)        >> (eol|eoi);
      
          definition          = (def_first_line >> *def_subsequent_line);
      
          start               = (  
                                  *(comment_line | empty_line | definition)
                                ) [ _val = px::construct<context>() ]
                                ;
      

      This simple habit will save you hours of work, and your sanity, when working with Spirit.

    • You can simplify the includes a bit

    Here is a fixed version (Live On Coliru), with its output:

    <start>
      <try># comment line\n\nfirs</try>
      <comment_line>
        <try># comment line\n\nfirs</try>
        <success>\nfirst definition li</success>
        <attributes>[[ , c, o, m, m, e, n, t,  , l, i, n, e]]</attributes>
      </comment_line>
      <comment_line>
        <try>\nfirst definition li</try>
        <fail/>
      </comment_line>
      <empty_line>
        <try>\nfirst definition li</try>
        <success>first definition lin</success>
        <attributes>[[]]</attributes>
      </empty_line>
      <comment_line>
        <try>first definition lin</try>
        <fail/>
      </comment_line>
      <empty_line>
        <try>first definition lin</try>
        <fail/>
      </empty_line>
      <definition>
        <try>first definition lin</try>
        <def_first_line>
          <try>first definition lin</try>
          <success>  second \n  third li</success>
          <attributes>[[f, i, r, s, t,  , d, e, f, i, n, i, t, i, o, n,  , l, i, n, e]]</attributes>
        </def_first_line>
        <def_subsequent_line>
          <try>  second \n  third li</try>
          <success>  third line\n\n# anot</success>
          <attributes>[[f, i, r, s, t,  , d, e, f, i, n, i, t, i, o, n,  , l, i, n, e,  ,  , s, e, c, o, n, d,  ]]</attributes>
        </def_subsequent_line>
        <def_subsequent_line>
          <try>  third line\n\n# anot</try>
          <success>\n# another comment l</success>
          <attributes>[[f, i, r, s, t,  , d, e, f, i, n, i, t, i, o, n,  , l, i, n, e,  ,  , s, e, c, o, n, d,  ,  ,  , t, h, i, r, d,  , l, i, n, e]]</attributes>
        </def_subsequent_line>
        <def_subsequent_line>
          <try>\n# another comment l</try>
          <fail/>
        </def_subsequent_line>
        <success>\n# another comment l</success>
        <attributes>[[f, i, r, s, t,  , d, e, f, i, n, i, t, i, o, n,  , l, i, n, e,  ,  , s, e, c, o, n, d,  ,  ,  , t, h, i, r, d,  , l, i, n, e]]</attributes>
      </definition>
      <comment_line>
        <try>\n# another comment l</try>
        <fail/>
      </comment_line>
      <empty_line>
        <try>\n# another comment l</try>
        <success># another comment li</success>
        <attributes>[[]]</attributes>
      </empty_line>
      <comment_line>
        <try># another comment li</try>
        <success></success>
        <attributes>[[ , a, n, o, t, h, e, r,  , c, o, m, m, e, n, t,  , l, i, n, e, !]]</attributes>
      </comment_line>
      <comment_line>
        <try></try>
        <fail/>
      </comment_line>
      <empty_line>
        <try></try>
        <fail/>
      </empty_line>
      <definition>
        <try></try>
        <def_first_line>
          <try></try>
          <fail/>
        </def_first_line>
        <fail/>
      </definition>
      <success></success>
      <attributes>[]</attributes>
    </start>
    Success
    

    Full code for reference:

    #define BOOST_SPIRIT_DEBUG
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/fusion/include/adapt_struct.hpp>
    
    #include <vector>
    #include <iostream>
    #include <string>
    
    namespace qi = boost::spirit::qi;
    
    namespace termcxx { namespace parser {
    
        namespace ascii = boost::spirit::ascii;
        namespace px    = boost::phoenix;
    
        //using qi::double_;
        using ascii::blank;
        //using px::ref;
        using px::construct;
    
        //using qi::eps;
        //using qi::lit;
        using qi::_val;
        using qi::_1;
        using ascii::char_;
        using ascii::graph;
        using qi::eol;
        using qi::eoi;
    
        struct context
        {
            int dummy;
    
            context () = default;
            context (context const &) = default;
            context (std::vector<char> a) { }
            context (std::vector<char> a, std::vector<char> b) { }
        };
    
    } }
    
    BOOST_FUSION_ADAPT_STRUCT(termcxx::parser::context, (int, dummy))
    
    namespace termcxx { namespace parser {
    
        template <typename Iterator>
        struct parser : qi::grammar<Iterator, context()>
        {
            parser() : parser::base_type(start)
            { 
                empty_line          = *blank >> eol;
                comment_line        = *blank >> '#' >> *(char_ - eol) >> (eol|eoi);
                def_first_line      = graph >> +(char_ - eol)         >> (eol|eoi);
                def_subsequent_line = +blank >> +(char_ - eol)        >> (eol|eoi);
    
                definition          = (def_first_line >> *def_subsequent_line);
    
                start               = (  
                                        *(comment_line | empty_line | definition)
                                      ) [ _val = px::construct<context>() ]
                                      ;
    
                BOOST_SPIRIT_DEBUG_NODES((start)(def_first_line)(def_subsequent_line)(definition)(empty_line)(comment_line))
            }
    
          private:
            qi::rule<Iterator, context()> start;
            qi::rule<Iterator, std::string()> comment_line;
            qi::rule<Iterator, std::string()> empty_line;
            qi::rule<Iterator, std::string()> def_first_line;
            qi::rule<Iterator, std::string()> def_subsequent_line;
            qi::rule<Iterator, std::string()> definition;
        };
    
    } }
    
    int main()
    {
        using It = boost::spirit::istream_iterator;
        termcxx::parser::parser<It> g;
    
        It f(std::cin >> std::noskipws), l;
        termcxx::parser::context data;
        if (qi::parse(f,l,g,data))
            std::cout << "Success\n";
        else
            std::cout << "Failure\n";
    
        if (f != l)
            std::cout << "Remaining input: '" << std::string(f,l) << "'\n";
    }
    

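    【Comments】:

    • I found more issues and got it "working" (obviously not finished yet). See my new notes under Update and the live demo
    【Answer 2】:

    Let's look at what actually happens on this line: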

    qi::rule<Iterator, std::vector<char> > definition
        = (def_first_line >> *def_subsequent_line)[_val = _1];
    
    1. def_first_line is a rule. Its attribute is std::vector<char>
    2. def_subsequent_line is another rule. Again, its attribute is std::vector<char>
    3. *def_subsequent_line is the parser obtained by applying the Kleene operator * to def_subsequent_line. Its implied attribute is vector< std::vector<char> >
    4. (def_first_line >> *def_subsequent_line) is yet another parser. By Spirit's compound attribute rules, its implied attribute is again vector< std::vector<char> >

    So basically, this line should be:

    qi::rule<Iterator, std::vector<std::vector<char> > > definition
        = (def_first_line >> *def_subsequent_line)[_val = _1];
    

    That makes sense, doesn't it? You want to get each line separately, rather than putting all the characters into the same vector.

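    Now, as side notes:

    • [_val = _1] is not really necessary. You should initialize your rules in the grammar's constructor using operator %=, which takes care of the implicit attributes.
    • Assuming you don't need access to the comments, you should write a skipper rule that automatically handles whitespace and comments, and then use that rule together with phrase_parse.
    • You can use std::string instead of vector<char>; Spirit is smart enough to understand that a sequence of characters is a string.
    • Look up the compound attribute rules in the Spirit documentation.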

    【Comments】:

    • Wow. Much time (though I'm not so sure your crystal ball was well calibrated today)