Python convert C header file to dict

【Question title】: Python convert C header file to dict
【Posted】: 2014-07-30 00:17:20
【Question】:

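I have a C header file containing a series of classes, and I'm trying to write a function that takes those classes and converts them into a Python dict. A sample of the file is at the bottom.

The format is something like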

class CFGFunctions {
  class ABC {
    class AA {
      file = "abc/aa/functions"
      class myFuncName{ recompile = 1; };
    };
    class BB
    {
      file = "abc/bb/functions"
      class funcName{
        recompile=1;
      }
    }
  };
};

and I'd like to turn it into something like

{CFGFunctions:{ABC:{AA:"myFuncName"}, BB:...}}
# Or
{CFGFunctions:{ABC:{AA:{myFuncName:"string or list or something"}, BB:...}}}

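In the end, my goal is to get the file path strings (they are actually folder paths, but anyway) and the names of the classes that sit in the same class as the file/folder path.

I've looked around on SO, Google and so on, but most of what I found is about splitting lines into dicts, not n-deep 'blocks'.

I know I'll have to loop over the file, but I'm not sure of the most efficient way to turn it into a dict. I think I need to grab the outer class and its matching braces, then do the same for the text that remains.

If none of this makes sense, it's because I haven't fully got my head around the process myself, haha.

If more information is needed, I'm happy to provide it.

The code below is a quick mock-up of what I have in mind... it is most likely BROKEN and probably doesn't work, but it shows the process I'm considering.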

def get_data():
    fh = open('CFGFunctions.h', 'r')
    data = {}    # will contain final data model

    # would probably refactor some of this into a function to allow better looping
    start = ""   # starting class name
    brackets = 0 # number of brackets
    text= ""     # temp storage for lines inside block while looping
    for line in fh:
        # find the class (start
        mt = re.match(r'Class ([\w_]+) {', line)
        if mt:
            if start == "":
                start = mt.group(1)
            else:
                # once we have the first class, find all other open brackets
                mt = re.match(r'{', line)
                if mt:
                    # and inc our counter
                    brackets += 1
                mt2 = re.match(r'}', line)
                if mt2:
                    # find the close, and decrement
                    brackets -= 1
                    # if we are back to the initial block, break out of the loop
                    if brackets == 0:
                        break
                text += line
    data[start] = {'tempText': text}

====

Sample file

class CfgFunctions {
    class ABC {
        class Control {
            file = "abc\abc_sys_1\Modules\functions";
            class assignTracker {
                description = "";
                recompile = 1;
            };

            class modulePlaceMarker {
                description = "";
                recompile = 1;
            };
        };

        class Devices
        {
            file = "abc\abc_sys_1\devices\functions";
            class registerDevice { recompile = 1; };
            class getDeviceSettings { recompile = 1; };
            class openDevice { recompile = 1; };
        };
    };
};

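Edit: if I do have to use a package, I'd like, if possible, to keep it in the program directory rather than in the general Python libs directory.

【Comments】:

Use something like pycparser to parse it into an AST, then (sensibly) go from there.
You want to make it like a LISP lang.
Thanks for the suggestion :) Slightly off topic, do you know whether a package can be 'installed' in the same directory as the project (rather than in the Python library files)?

Tags: python file parsing dictionary block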

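【Solution 1】:

As you've spotted, parsing is needed to do the conversion. Have a look at the PyParsing package, a fairly easy-to-use library for implementing parsing in Python programs.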

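Edit: here is a very bare-bones version that recognizes a very minimal grammar, roughly like the example at the top of the question. It is a rough, untested sketch rather than a finished solution, but it may get you headed in the right direction: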

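from pyparsing import (Forward, Group, Keyword, Literal, Optional,
                       ParseException, Word, ZeroOrMore,
                       alphanums, alphas, dblQuotedString, nums)


test_code = """
class CFGFunctions {
  class ABC {
    class AA {
      file = "abc/aa/functions"
      class myFuncName{ recompile = 1; };
    };
    class BB
    {
      file = "abc/bb/functions"
      class funcName{
        recompile=1;
      }
    }
  };
};
"""

class_tkn       = Keyword('class')
lbrace_tkn      = Literal('{')
rbrace_tkn      = Literal('}')
semicolon_tkn   = Literal(';')
assign_tkn      = Literal('=')

identifier      = Word(alphas + '_', alphanums + '_')
value           = dblQuotedString | Word(nums)

# name = value; the ';' is optional because the sample omits it in places
assignment      = Group(identifier + assign_tkn + value + Optional(semicolon_tkn))

# Forward lets class_block refer to itself, which nested classes require
class_block     = Forward()
class_block   <<= Group(class_tkn + identifier + lbrace_tkn +
                        ZeroOrMore(class_block | assignment) +
                        rbrace_tkn + Optional(semicolon_tkn))

def test_parser(test):
    try:
        results = class_block.parseString(test)
        print(test, ' -> ', results)
    except ParseException as s:
        print("Syntax error:", s)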


def main():
    test_parser(test_code)

    return 0

if __name__ == '__main__':
    main()

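Also, this code is only the parser: it recognizes the input but does not produce the dict you want. As you can see in the PyParsing documentation, you can add the actions you need later. The first step, though, is to recognize what you want to translate.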
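To illustrate the "recognize first, then build the output" idea without any third-party package, here is a minimal hand-written recursive-descent sketch that uses only the standard library. It is not PyParsing and not the method from this answer, just one possible alternative. The names `TOKEN_RE`, `parse_block` and `parse_config` are made up for this example, and it assumes the file contains only `class NAME { ... }` blocks and `NAME = VALUE` entries with double-quoted strings or integers, no comments, and optional semicolons.

```python
import re

# Tokens: double-quoted strings, identifiers, integers, and the punctuation { } = ;
# Anything else (whitespace, unknown characters) is silently skipped.
TOKEN_RE = re.compile(r'"[^"]*"|[A-Za-z_]\w*|\d+|[{}=;]')


def parse_block(tokens, pos):
    """Parse entries until a closing '}' or the end of input.

    Returns (dict_of_entries, position_of_the_token_that_stopped_us).
    """
    block = {}
    while pos < len(tokens) and tokens[pos] != '}':
        tok = tokens[pos]
        if tok == ';':
            pos += 1                                  # terminators are optional
        elif tok == 'class':
            if pos + 2 >= len(tokens) or tokens[pos + 2] != '{':
                raise ValueError("expected 'class NAME {' at token %d" % pos)
            name = tokens[pos + 1]
            # Recurse for the nested body, starting just after the '{'
            block[name], pos = parse_block(tokens, pos + 3)
            if pos >= len(tokens):
                raise ValueError("missing '}' for class %s" % name)
            pos += 1                                  # consume the '}'
        else:
            if pos + 2 >= len(tokens) or tokens[pos + 1] != '=':
                raise ValueError("expected 'NAME = VALUE' at token %d" % pos)
            raw = tokens[pos + 2]
            block[tok] = int(raw) if raw.isdigit() else raw.strip('"')
            pos += 3
    return block, pos


def parse_config(text):
    """Turn the whole text into nested dicts; raise on an unmatched '}'."""
    tokens = TOKEN_RE.findall(text)
    tree, pos = parse_block(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected '}' at token %d" % pos)
    return tree


sample = '''
class A {
    file = "x/y"
    class f { recompile = 1; };
};
'''
print(parse_config(sample))
# {'A': {'file': 'x/y', 'f': {'recompile': 1}}}
```

Because the tokenizer skips anything it does not recognize, malformed input may be accepted silently; a stricter version would report unknown characters instead.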

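One last note: don't underestimate the complexity of parsing code... Even with a library like PyParsing handling most of the work, there are plenty of ways to end up in infinite loops and other parsing pitfalls. Implement it step by step!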
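In the spirit of working in steps, the final step from the question (pulling out each folder path together with the class names that sit next to it) can be kept separate from parsing. The sketch below assumes the parsing step has already produced nested dicts in which a folder class holds a `file` string and one sub-dict per function class; the function name `collect_functions` and the shape of the input are assumptions made for this example.

```python
def collect_functions(tree, path=()):
    """Yield (class_path, folder, function_names) for each class with a 'file' entry."""
    folder = tree.get('file')
    if isinstance(folder, str):
        # Sibling entries that are dicts are taken to be the function classes
        names = [key for key, value in tree.items() if isinstance(value, dict)]
        yield path, folder, names
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from collect_functions(value, path + (key,))


# Hand-written input mirroring the sample file in the question
sample_tree = {
    'CfgFunctions': {
        'ABC': {
            'Devices': {
                'file': r'abc\abc_sys_1\devices\functions',
                'registerDevice': {'recompile': 1},
                'openDevice': {'recompile': 1},
            },
        },
    },
}

for class_path, folder, names in collect_functions(sample_tree):
    print('.'.join(class_path), folder, names)
```

Keeping extraction apart from parsing means each piece can be tested on its own, which fits the advice to build the solution incrementally.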

Edit: some sources of information about PyParsing:

http://werc.engr.uaf.edu/~ken/doc/python-pyparsing/HowToUsePyparsing.html

http://pyparsing.wikispaces.com/

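(Of particular interest is http://pyparsing.wikispaces.com/Publications, which has a long list of articles about PyParsing, some of them introductory)

http://pypi.python.org/pypi/pyparsing_helper is a GUI for debugging parsers

There is also a "pyparsing" tag on Stack Overflow, and Paul McGuire (PyParsing's author) seems to be a regular here.

* Note: * From PaulMcG in the comments below: Pyparsing is no longer hosted on wikispaces.com. Go to github.com/pyparsing/pyparsing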

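【Comments】:

Thanks for the explanation :) That makes things a lot clearer! Slightly off topic, do you know whether a package can be 'installed' in the same directory as the project (rather than in the Python library files)?
One thing I will say is that its documentation is crap and/or practically non-existent
'Installing' a package in Python the standard way puts it in the /usr/pythonX.XX/site-packages directory. However, you can install a package into your own per-user directory with python setup.py install --user
Thanks! Now I just need to deal with the lack of documentation xD To Google!
I've added a list of information sources to the answer above.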
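On the side question of keeping a package inside the project directory: one common approach, sketched here as an assumption and not something the commenters proposed, is to install the package into a sub-folder of the project (for example with pip's `--target` option, as in `pip install --target=vendor pyparsing`) and add that folder to `sys.path` before importing it. The folder name `vendor` and the helper `add_vendor_dir` are hypothetical.

```python
import os
import sys


def add_vendor_dir(base_dir, name="vendor"):
    """Put <base_dir>/<name> at the front of sys.path and return its path."""
    vendor = os.path.join(os.path.abspath(base_dir), name)
    if vendor not in sys.path:
        sys.path.insert(0, vendor)   # searched before site-packages
    return vendor


# Typical use at the top of the program's entry script, before the import:
# add_vendor_dir(os.path.dirname(os.path.abspath(__file__)))
# import pyparsing
```

Putting the folder at the front of `sys.path` means the bundled copy takes precedence over any system-wide installation of the same package.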