【Question title】: How to remove an XML tag and all tags/data inside it using Java regex
【Posted】: 2017-07-27 12:15:48
【Question】:

I want to use a Java regex to remove the foreignobject tag, along with every tag and all text inside it.

Here is the XML:

    <svg class="noselect orgchartsvg"><defs><filter id="default" x="0" y="0" width="110%" height="110%"><feoffset result="offout" in="sourcegraphic" dx="1" dy="1" /><fecolormatrix result="matrixout" in="offout" type="matrix" values="0.8  0   0  0  0 0   0.8  0  0  0 0    0  0.8 0  0 0    0  0   1  0 " /><fegaussianblur result="blurout" in="matrixout" stddeviation="2" /><feblend in="sourcegraphic" in2="blurout" mode="normal" /></filter><filter id="hover" x="0" y="0" width="110%" height="110%"><feoffset result="offout" in="sourcegraphic" dx="1" dy="1" /><fecolormatrix result="matrixout" in="offout" type="matrix" values="0.3686  0   0  0  0 0   0.5529  0  0  0 0    0  0.9804 0  0 0    0  0   1  0 " /><fegaussianblur result="blurout" in="matrixout" stddeviation="2" /><feblend in="sourcegraphic" in2="blurout" mode="normal" /></filter><filter id="selected" x="0" y="0" width="110%" height="110%"><feoffset result="offout" in="sourcegraphic" dx="3" dy="3" /><fecolormatrix result="matrixout" in="offout" type="matrix" values="0.3686  0   0  0  0 0  0.5529  0  0  0 0    0  0.9804 0  0 0    0  0   1  0 " /><fegaussianblur result="blurout" in="matrixout" stddeviation="2" /><feblend in="sourcegraphic" in2="blurout" mode="normal" /></filter></defs><g class="transgroup" transform="scale(1)" ><line y1="157" x1="0" y2="157" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="99" x1="-17" y2="99" x2="672" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="99" x1="672" y2="41" x2="672" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="0" y2="273" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="0" y2="389" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="505" x1="0" y2="505" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="0" y2="621" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="-17" y2="99" x2="-17" 
style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="157" x1="345" y2="157" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="345" y2="273" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="345" y2="389" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="505" x1="345" y2="505" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="345" y2="621" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="328" y2="99" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="157" x1="690" y2="157" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="690" y2="273" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="690" y2="389" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="505" x1="690" y2="505" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="690" y2="621" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="673" y2="99" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="157" x1="1035" y2="157" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="99" x1="1018" y2="99" x2="672" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="1035" y2="273" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="1035" y2="389" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="1018" y2="99" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><g uid="1" data-is-root="true" data-id="49090049" data-type="position" class="nodegroup select" dotted="false"><rect filter="url(#default)" expand-type="null" tabindex="-1" x="517" width="18" y="0" data-id="49090049" pptrect="1" data-type="position" class="leftborderposition leftborder" 
height="82"></rect><rect filter="url(#default)" expand-type="null" tabindex="207" x="535" width="292" y="0" data-id="49090049" pptrect="1" data-type="position" class="rect select" height="82"></rect><switch class="select">

<foreignobject x="545" width="282" y="10" data-id="49090049" style="text-align:left;" data-type="position" class="foreignobject null select" height="62"><body><div class="node clearfix">
  <div class="node-portrait">
            <img src="services/userservice/image?uid=72299&plantype=&planname=" style="border-radius: 50%;" class="select circularimage" height="60"></img>
</div>

  <div class="node-content">
    <div class="node-content-item node-content-bold" title="director, manufacturing">director, manufacturing</div>
<div class="node-content-item node-content-normal" title="helle carlson">helle carlson</div>
    <div class="node-content-icons clearfix">      

      <div class="node-content-item node-content-normal" title="chicago">chicago</div>
    </div>
  </div>
</div>
</body></foreignobject></svg>

I tried this pattern, (<foreignobject[^>]*[^/]>)[^&]*(?!\\s*</foreignobject>), but it does not remove the element.

    mysvg = mysvg.replaceAll("(<foreignobject[^>]*[^/]>)[^&]*(?!\\s*</foreignobject>)"," ");

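【Comments】:

  • Using a regular expression for this is not recommended. A better idea is to parse the XML into a DOM tree and manipulate it there.
  • This is the canonical answer: stackoverflow.com/a/1732454/630136. The short version: don't.
  • Yes, I know, but is there a way to do this with a regex?
  • Yes, but it will break and fail in ways you have not foreseen. It fails on an empty <foreignobject/> tag, on mixed namespaces, on an unexpected namespace prefix, on HTML entity references inside attribute values, on nested foreignobject elements... Meanwhile this is absolutely trivial to solve in XSLT, which exists specifically for processing XML, so use that instead.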
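For anyone who still wants to see what a regex version could look like, here is a minimal sketch, not a recommendation. It assumes the element is never nested inside another foreignobject and that no attribute value contains a literal `>`; the class name `ForeignObjectRegex` and the method `strip` are made up for illustration. It handles the self-closing form explicitly, but the other failure modes listed in the comment above still apply.

```java
import java.util.regex.Pattern;

class ForeignObjectRegex {
    // Alternative 1: a self-closing <foreignobject ... /> tag.
    // Alternative 2: an opening tag, then the shortest possible run of text
    //                up to the next </foreignobject> closing tag.
    // DOTALL lets '.' cross line breaks; CASE_INSENSITIVE also catches foreignObject.
    private static final Pattern FOREIGN_OBJECT = Pattern.compile(
            "<foreignobject\\b[^>]*/>|<foreignobject\\b[^>]*>.*?</foreignobject\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the input with every matched foreignobject element removed.
    // Nested foreignobject elements are NOT handled: the match stops at the
    // first closing tag and leaves the outer closing tag behind.
    static String strip(String svg) {
        return FOREIGN_OBJECT.matcher(svg).replaceAll("");
    }
}
```

With this sketch the call from the question would become `mysvg = ForeignObjectRegex.strip(mysvg);`. The reluctant `.*?` is what keeps the match from running on to the last closing tag in the document.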

Tags: java regex xml svg


【Solution 1】:

In the end I did it with XML parsing.
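The answer does not show its code, so the following is only a sketch of one way the parsing approach might look using the JDK's built-in DOM and Transformer APIs. The class name `ForeignObjectDom` and the method `strip` are hypothetical. It assumes the input is well-formed XML; the SVG excerpt as pasted in the question would be rejected by the parser, for example because of the unescaped `&` in the `img` `src` attribute, so such input would need to be repaired first.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ForeignObjectDom {
    // Parses the SVG string, removes every <foreignobject> element together
    // with its children, and serializes the remaining tree back to a string.
    static String strip(String svg) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(svg)));

            // The returned NodeList is live, so walk it from the end:
            // removing a node then never shifts the indices still to visit.
            NodeList hits = doc.getElementsByTagName("foreignobject");
            for (int i = hits.getLength() - 1; i >= 0; i--) {
                Node node = hits.item(i);
                node.getParentNode().removeChild(node);
            }

            // Identity transform: writes the DOM tree out as text.
            Transformer writer = TransformerFactory.newInstance().newTransformer();
            writer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            writer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Malformed input or a misconfigured parser ends up here.
            throw new IllegalArgumentException("Could not process SVG", e);
        }
    }
}
```

Calling `mysvg = ForeignObjectDom.strip(mysvg);` would replace the `replaceAll` line from the question. The serializer may write empty elements in the short form (`<switch/>`), so the output is equivalent XML rather than a character-for-character copy of the input. The tag name is matched exactly as written, so a document using the standard SVG spelling `foreignObject` would need that name instead.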
