【Question title】: Boost Spirit (x3) failing to consume last token when parsing character escapes
【Posted】: 2021-08-17 18:52:19
【Question】:

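While using Boost Spirit X3 to parse escaped ASCII strings I came across this answer, but I got an expectation exception. I changed the expectation operator in the original to the sequence operator to disable the exception in the code below. When I run the code, it parses the input and assigns the correct value to the attribute, but it returns false and does not consume the input. Any ideas what I am doing wrong here?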

gcc version 10.3.0

Boost 1.71

std = c++17

#include <boost/spirit/home/x3.hpp>
#include <string>
#include <iostream>


namespace x3 = boost::spirit::x3;
using namespace std::string_literals;

//changed expectation to sequence
auto const qstring = x3::lexeme['"' >> *(
             "\\n" >> x3::attr('\n')
           | "\\b" >> x3::attr('\b')
           | "\\f" >> x3::attr('\f')
           | "\\t" >> x3::attr('\t')
           | "\\v" >> x3::attr('\v')
           | "\\0" >> x3::attr('\0')
           | "\\r" >> x3::attr('\r')
           | "\\n" >> x3::attr('\n')
           | "\\"  >> x3::char_("\"\\")
           | "\\\"" >> x3::char_('"')
           | ~x3::char_('"')
       ) >> '"'];

int main(int, char**){

    auto const quoted = "\"Hel\\\"lo Wor\\\"ld"s;
    auto const expected = "Hel\"lo Wor\"ld"s;

    std::string result;
    auto first = quoted.begin();
    auto const last = quoted.end();
    bool ok = x3::phrase_parse(first, last, qstring, x3::ascii::space, result);
    std::cout << "parse returned " << std::boolalpha << ok << '\n';

    std::cout << result << " == " << expected << " is " << std::boolalpha << (result == expected) << '\n';

    std::cout << "first == last = " << (first == last) << '\n';
    std::cout << "first = " << *first << '\n';

    return 0;
}

【Question comments】:

    Tags: c++ boost-spirit


    【Answer 1】:

    Your input is not terminated with a quote character. Writing it as a raw string literal makes that easier to see:

    std::string const qinput   = R"("Hel\"lo Wor\"ld)";
    

    should be

    std::string const qinput   = R"("Hel\"lo Wor\"ld")";
    

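    Now, what remains is ordinary container handling: in Spirit, container attributes are not rolled back when a rule fails (this includes branches that are backtracked). See e.g. "boost::spirit::qi duplicate parsing on the output" and "Understanding Boost.spirit's string parser".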

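    Basically, you cannot rely on the result if the parse failed. That is probably why the original had an expectation point: it raises an exception instead.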

    A complete demo that works correctly:

    Live On Coliru

    #include <boost/spirit/home/x3.hpp>
    #include <string>
    #include <iostream>
    #include <iomanip>
    #include <string_view>
    
    namespace x3 = boost::spirit::x3;
    
    auto escapes = []{
        x3::symbols<char> sym;
        sym.add
            ("\\b", '\b')
            ("\\f", '\f')
            ("\\t", '\t')
            ("\\v", '\v')
            ("\\0", '\0')
            ("\\r", '\r')
            ("\\n", '\n')
            ("\\\\", '\\')
            ("\\\"", '"')
            ;
        return sym;
    }();
    
    auto const qstring = x3::lexeme['"' >> *(escapes | ~x3::char_('"')) >> '"'];
    
    int main(){
        auto squote = [](std::string_view s) { return std::quoted(s, '\''); };
        std::string const expected = R"(Hel"lo Wor"ld)";
    
        for (std::string const qinput : {
            R"("Hel\"lo Wor\"ld)", // oops no closing quote
            R"("Hel\"lo Wor\"ld")",
            "\"Hel\\\"lo Wor\\\"ld\"", // if you insist
            R"("Hel\"lo Wor\"ld" trailing data)",
        })
        {
            std::cout << "\n -- input " << squote(qinput) << "\n";
            std::string result;
    
            auto first = cbegin(qinput);
            auto last  = cend(qinput);
            bool ok    = x3::phrase_parse(first, last, qstring, x3::space, result);
    
            ok &= (first == last);
    
            std::cout << "parse returned " << std::boolalpha << ok << "\n";
    
            std::cout << squote(result) << " == " << squote(expected) << " is "
                      << (result == expected) << "\n";
    
            if (first != last)
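                // std::string(first, last) keeps this valid in C++17; a string_view
                // cannot be built from an iterator pair before C++20
                std::cout << "Remaining input unparsed: "
                          << squote(std::string(first, last)) << "\n";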
        }
    }
    

    Prints

     -- input '"Hel\\"lo Wor\\"ld'
    parse returned false
    'Hel"lo Wor"ld' == 'Hel"lo Wor"ld' is true
    Remaining input unparsed: '"Hel\\"lo Wor\\"ld'
    
     -- input '"Hel\\"lo Wor\\"ld"'
    parse returned true
    'Hel"lo Wor"ld' == 'Hel"lo Wor"ld' is true
    
     -- input '"Hel\\"lo Wor\\"ld"'
    parse returned true
    'Hel"lo Wor"ld' == 'Hel"lo Wor"ld' is true
    
     -- input '"Hel\\"lo Wor\\"ld" trailing data'
    parse returned false
    'Hel"lo Wor"ld' == 'Hel"lo Wor"ld' is true
    Remaining input unparsed: 'trailing data'
    

    【Comments】:

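    • In my update I gratuitously replaced the alternative branches with a symbols lookup (a trie), and removed the duplicate \\n branch.
    • Thank you. I couldn't see the wood for the trees there. One takeaway for me is to use the facilities the standard library provides (literals, quoted, etc.) rather than doing this kind of thing by hand, which quickly becomes unreadable and messy. The "gratuitous" replacement with a symbol table works out well, since that was my preferred solution anyway, and now I only have to cut and paste :). Thanks again.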