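Okay. You've been asking 6 questions about this parsing job¹.
Many people have been telling you that regex is not the tool for the job. Including me:
I've shown you
- An example Spirit X3 grammar that parses this config string into a key-value map, correctly interpreting escaped quotes (e.g. '\\'') (see here)
- I expanded on it (in 13 characters) to allow repeated quotes to escape a quote (see here)
All my examples have been superior in that they already parse the keys together with the values, so you end up with a proper map of config settings.
Yet you still ask for it in your latest question (Extract everything apart from what is specified in the regex).
Of course, the answer was in my very first answer: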
for (auto& setting : parse_config(text))
    std::cout << setting.first << "\n";
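I posted this, along with a C++03 version of it, live on Coliru.
Writing a Manual Parser
If you are rejecting it because you don't understand it, all you had to do was ask.
If you "don't want to" use Spirit, you can easily write a similar parser by hand. I didn't, because it's tedious and error-prone. Here it is in case you still need it for inspiration: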
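- still C++03
- using only standard library features
- still parsing single/double-quoted strings with escapable quotes
- still parsing into a map<string, string>
- raising informative error messages on invalid input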
BOTTOM LINE: Use a proper grammar, like people have been urging you to since your first day.
Live On Coliru
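#include <cctype>    // std::isspace, std::isalpha
#include <iostream>
#include <map>
#include <sstream>
#include <stdexcept> // std::runtime_error
#include <string>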
typedef std::map<std::string, std::string> Config;
typedef std::pair<std::string, std::string> Entry;

struct Parser {
    Parser(std::string const& input) : input(input) {}

    // Parses `key ('value')` pairs; throws std::runtime_error on invalid input.
    Config parse() {
        Config parsed;

        enum { KEY, VALUE } state = KEY;
        key = value = "";
        f = input.begin(), l = input.end();

        while (f!=l) {
            //std::cout << "state=" << state << ", '" << std::string(It(input.begin()), f) << "[" << *f << "]" << std::string(f+1, l) << "'\n";
            switch (state) {
              case KEY:
                  skipws();
                  if (!parse_key())
                      raise("Empty key");
                  state = VALUE;
                  break;
              case VALUE:
                  if (!expect('(', true))
                      raise("Expected '('");
                  // a value is either single-quoted or double-quoted
                  if (parse_value('\'') || parse_value('"')) {
                      parsed[key] = value;
                      key = value = "";
                  } else {
                      raise("Expected quoted value");
                  }
                  if (!expect(')', true))
                      raise("Expected ')'");
                  state = KEY;
                  break;
            }
        }

        // input must not end in the middle of a key/value pair
        if (!(key.empty() && value.empty() && state==KEY))
            raise("Unexpected end of input");

        return parsed;
    }

  private:
    std::string input;

    typedef std::string::const_iterator It;
    It f, l;
    std::string key, value;

    // a key is a non-empty run of alphabetic characters
    bool parse_key() {
        while (f!=l && alpha(*f))
            key += *f++;
        return !key.empty();
    }

    bool parse_value(char quote) {
        // opening quote; note that expect() also skips whitespace right after it
        if (!expect(quote, true))
            return false;

        while (f!=l) {
            char const ch = *f++;
            if (ch == quote) {
                if (expect(quote, false)) {
                    value += quote; // doubled quote: one literal quote
                } else {
                    //std::cout << " Entry " << key << " -> " << value << "\n";
                    return true;    // lone quote: end of the value
                }
            } else {
                value += ch;
            }
        }

        return false; // ran out of input before the closing quote
    }

    static bool space(unsigned char ch) { return std::isspace(ch); }
    static bool alpha(unsigned char ch) { return std::isalpha(ch); }

    void skipws() { while (f!=l && space(*f)) ++f; }

    // consumes `ch` if it is next; with ws set, skips whitespace before and after
    bool expect(unsigned char ch, bool ws = true) {
        if (ws) skipws();
        if (f!=l && *f == ch) {
            ++f;
            if (ws) skipws();
            return true;
        }
        return false;
    }

    // reports the error together with the unparsed remainder of the input
    void raise(std::string const& msg) {
        std::ostringstream oss;
        oss << msg << " (at '" << std::string(f,l) << "')";
        throw std::runtime_error(oss.str());
    }
};

int main() {
    std::string const text = "server ('m1.labs.terad ''ata.com') username ('us\\* er5') password('user)5') dbname ('def\\ault')";

    Config cfg = Parser(text).parse();

    for (Config::const_iterator setting = cfg.begin(); setting != cfg.end(); ++setting) {
        std::cout << "Key " << setting->first << " has value " << setting->second << "\n";
    }

    for (Config::const_iterator setting = cfg.begin(); setting != cfg.end(); ++setting) {
        std::cout << setting->first << "\n";
    }
}
Prints, as always:
Key dbname has value def\ault
Key password has value user)5
Key server has value m1.labs.terad 'ata.com
Key username has value us\* er5
dbname
password
server
username
¹ See:
- avoid empty token in cpp
- extracting whitespaces using regex in cpp
- Regex to extract value between a single quote and parenthesis using boost token iterator
- tokenizing string , accepting everything between given set of characters in CPP
- extract a string with single quotes between parenthesis and single quote
- Extract everything apart from what is specified in the regex
- this one