【Question title】: antlr4/java: pretty print parse tree to stdout
【Posted】: 2018-10-08 09:36:09
【Question】:

Beginner question: how do I print a readable version of the parse tree to stdout?

CharStream input = CharStreams.fromFileName("testdata/test.txt");
MyLexer lexer = new MyLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
MyParser parser = new MyParser(tokens);     
parser.setBuildParseTree(true);
RuleContext tree = parser.record();
System.out.println(tree.toStringTree(parser));

This prints the whole tree on a single line, with nesting marked by parentheses '()'.

(record (husband <4601>   (name KOHAI   Nikolaus) \n (birth *   um.1872   (place Ploschitz)) \n\n) (wife      (marriage oo) \n      (name SCHLOTTHAUER   Maria) \n      (birth *   um.1877  
...

I would like something like this:

record 
  husband
    <id>
    name
       <name>
...
  wife

【Comments】:

    Tags: java antlr antlr4 pretty-print


    【Solution 1】:

    For Kotlin, you can use this extension function:

    fun Tree.format(parser: Parser, indent: Int = 0): String = buildString {
        val tree = this@format
        val prefix = "  ".repeat(indent)
        append(prefix)
        append(Trees.getNodeText(tree, parser))
        if (tree.childCount != 0) {
            append(" (\n")
            for (i in 0 until tree.childCount) {
                append(tree.getChild(i).format(parser, indent + 1))
                append("\n")
            }
            append(prefix).append(")")
        }
    }
    

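    【Comments】:

      【Solution 2】:

      I wanted to take my own stab at this, taking advantage of the fact that I already use StringTemplate in my project. This means I don't have to track indentation levels by hand as the other answers do. It also makes the output format easier to customize.

      On top of that, the main reason I'm posting this is that I decided to skip printing rules that are merely "passed through", i.e. when using chain rules such as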

      a : b | something_else ;
      b : c | another ;
      c : d | yet_more ;
      d : rule that matters ;
      

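      since they cluttered my output when inspecting trees from small inputs, without adding any useful information. This is also easy to change, at the spot marked by the //pass-through rules comment.

      I also copied the definition of Trees.getNodeText and modified it to use a plain array, to get rid of unnecessary wrapping and even let me customize it if I feel like it.

      Finally, I made it take the parser and the tree and dump straight to System.out, since that's the only situation I need it in.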

      import org.antlr.v4.runtime.Parser;
      import org.antlr.v4.runtime.RuleContext;
      import org.antlr.v4.runtime.Token;
      import org.antlr.v4.runtime.tree.ErrorNode;
      import org.antlr.v4.runtime.tree.TerminalNode;
      import org.antlr.v4.runtime.tree.Tree;
      import org.stringtemplate.v4.ST;
      
      //for pretty-dumping trees in short form
      public class TreeUtils {
          private static final ST template() {
              return new ST("<rule_text>\n\t<child; separator=\"\n\">");
          }
          private static final ST literal(String text) {
              return new ST("<text>").add("text", text);
          }
      
          public static void dump(Parser parser, Tree tree) {
              System.out.println(process(parser.getRuleNames(),tree).render());
          }
          
          private static String getNodeText(Tree t, String[] ruleNames) {
              if ( t instanceof RuleContext ) {
                  int ruleIndex = ((RuleContext)t).getRuleContext().getRuleIndex();
                  String ruleName = ruleNames[ruleIndex];
                  return ruleName;
              }
              else if ( t instanceof ErrorNode) {
                  return t.toString();
              }
              else if ( t instanceof TerminalNode) {
                  Token symbol = ((TerminalNode)t).getSymbol();
                  if (symbol != null) {
                      String s = symbol.getText();
                      return s;
                  }
              }
      
              Object payload = t.getPayload();
              if ( payload instanceof Token ) {
                  return ((Token)payload).getText();
              }
              return t.getPayload().toString();
          }
      
          private static ST process(String[] ruleNames, Tree t) {
              if(t.getChildCount()==0) {
                  return literal(getNodeText(t, ruleNames));
              } else if(t.getChildCount()==1) {
                  //pass-through rules
                  return process(ruleNames,t.getChild(0));
              } else {
                  ST out=template();
                  out.add("rule_text", getNodeText(t, ruleNames));
                  for(int i=0;i<t.getChildCount();i++) {
                      out.add("child", process(ruleNames,t.getChild(i)));
                  }
                  return out;
              }
          }
      }
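
The pass-through skipping described above can be tried in isolation. The following is a minimal sketch, not the answer's own code: `Node` is a hypothetical stand-in for an ANTLR tree node (so the example runs without the ANTLR or StringTemplate jars), and the indentation is done by hand instead of through a template. Only the "exactly one child means skip this node" rule is taken from the answer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an ANTLR parse tree node (assumption, not an ANTLR type).
final class Node {
    final String text;
    final List<Node> children;

    Node(String text, Node... children) {
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

final class PassThroughDemo {
    // Renders one node per line, two spaces per level.
    static String render(Node node) {
        StringBuilder out = new StringBuilder();
        render(node, 0, out);
        return out.toString();
    }

    private static void render(Node node, int depth, StringBuilder out) {
        if (node.children.size() == 1) {
            // pass-through rule: print nothing for this node and
            // render its only child at the same depth
            render(node.children.get(0), depth, out);
            return;
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.text).append('\n');
        for (Node child : node.children) {
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Chain a -> b -> c -> d; only d has more than one child.
        Node tree = new Node("a", new Node("b", new Node("c",
                new Node("d", new Node("x"), new Node("y")))));
        System.out.print(render(tree)); // prints d, then x and y indented below it
    }
}
```

One consequence worth keeping in mind: with this condition, a rule that holds a single token also disappears and only the token text is printed. If that is not wanted, the check could be narrowed, for example to nodes whose only child itself has children.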
      

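      【Comments】:

        【Solution 3】:

        If you prefer to use regular expressions only for what they are really meant for, you can always print the tree yourself: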

        import java.util.Arrays;
        import java.util.List;

        import org.antlr.v4.runtime.Parser;
        import org.antlr.v4.runtime.ParserRuleContext;
        import org.antlr.v4.runtime.tree.ParseTree;
        import org.antlr.v4.runtime.tree.Trees;
        
        public static String printSyntaxTree(Parser parser, ParseTree root) {
            StringBuilder buf = new StringBuilder();
            recursive(root, buf, 0, Arrays.asList(parser.getRuleNames()));
            return buf.toString();
        }
        
        private static void recursive(ParseTree aRoot, StringBuilder buf, int offset, List<String> ruleNames) {
            for (int i = 0; i < offset; i++) {
                buf.append("  ");
            }
            buf.append(Trees.getNodeText(aRoot, ruleNames)).append("\n");
            if (aRoot instanceof ParserRuleContext) {
                ParserRuleContext prc = (ParserRuleContext) aRoot;
                if (prc.children != null) {
                    for (ParseTree child : prc.children) {
                        recursive(child, buf, offset + 1, ruleNames);
                    }
                }
            }
        }
        

        Usage:

        ParseTree root = parser.yourOwnRule();
        System.out.println(printSyntaxTree(parser, root));
        

        【Comments】:

          【Solution 4】:

          Extracted from SnippetsTest as a standalone utility class:

          import java.util.List;
          
          import org.antlr.v4.runtime.misc.Utils;
          import org.antlr.v4.runtime.tree.Tree;
          import org.antlr.v4.runtime.tree.Trees;
          
          public class TreeUtils {
          
              /** Platform dependent end-of-line marker */
              public static final String Eol = System.lineSeparator();
              /** The literal indent char(s) used for pretty-printing */
              public static final String Indents = "  ";
              private static int level;
          
              private TreeUtils() {}
          
              /**
               * Pretty print out a whole tree. {@link #getNodeText} is used on the node payloads to get the text
               * for the nodes. (Derived from Trees.toStringTree(....))
               */
              public static String toPrettyTree(final Tree t, final List<String> ruleNames) {
                  level = 0;
                  return process(t, ruleNames).replaceAll("(?m)^\\s+$", "").replaceAll("\\r?\\n\\r?\\n", Eol);
              }
          
              private static String process(final Tree t, final List<String> ruleNames) {
                  if (t.getChildCount() == 0) return Utils.escapeWhitespace(Trees.getNodeText(t, ruleNames), false);
                  StringBuilder sb = new StringBuilder();
                  sb.append(lead(level));
                  level++;
                  String s = Utils.escapeWhitespace(Trees.getNodeText(t, ruleNames), false);
                  sb.append(s + ' ');
                  for (int i = 0; i < t.getChildCount(); i++) {
                      sb.append(process(t.getChild(i), ruleNames));
                  }
                  level--;
                  sb.append(lead(level));
                  return sb.toString();
              }
          
              private static String lead(int level) {
                  StringBuilder sb = new StringBuilder();
                  if (level > 0) {
                      sb.append(Eol);
                      for (int cnt = 0; cnt < level; cnt++) {
                          sb.append(Indents);
                      }
                  }
                  return sb.toString();
              }
          }
          

          Call the method as follows:

          List<String> ruleNamesList = Arrays.asList(parser.getRuleNames());
          String prettyTree = TreeUtils.toPrettyTree(tree, ruleNamesList);
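
To see what the two `replaceAll` calls in `toPrettyTree` are for, here is a small sketch that applies the same patterns to a hand-written string. The input is an assumption: it is shaped like what `process` appears to return for a tiny tree (each rule node adds a line break plus indentation both before itself and after its children, which leaves whitespace-only lines behind). The class name `BlankLineCleanupDemo` and the `eol` parameter are hypothetical, introduced only so the result does not depend on the platform line separator.

```java
final class BlankLineCleanupDemo {
    // Same two patterns as in toPrettyTree, with the end-of-line marker passed in.
    static String cleanup(String raw, String eol) {
        return raw
                .replaceAll("(?m)^\\s+$", "")          // strip runs that contain only whitespace
                .replaceAll("\\r?\\n\\r?\\n", eol);   // collapse a doubled line break into one
    }

    public static void main(String[] args) {
        // Whitespace-only lines follow each "name ..." line.
        String raw = "record \n  husband \n    name KOHAI\n    \n  \n  wife \n    name MARIA\n    \n  ";
        System.out.print(cleanup(raw, "\n"));
    }
}
```

On this input the whitespace-only lines disappear and each node ends up on its own line, while the trailing space after each rule name is left in place.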
          

          【Comments】:

            【Solution 5】:

            Besides the graphical parse tree, my ANTLR4 extension for Visual Studio Code also generates a formatted text parse tree.

            【Comments】:
