【Posted】:2017-07-27 12:15:48
【Problem description】:
I want to use a Java regex to remove the foreignobject tag, together with every tag and all text nested inside it.
Here is the XML:
<svg class="noselect orgchartsvg"><defs><filter id="default" x="0" y="0" width="110%" height="110%"><feoffset result="offout" in="sourcegraphic" dx="1" dy="1" /><fecolormatrix result="matrixout" in="offout" type="matrix" values="0.8 0 0 0 0 0 0.8 0 0 0 0 0 0.8 0 0 0 0 0 1 0 " /><fegaussianblur result="blurout" in="matrixout" stddeviation="2" /><feblend in="sourcegraphic" in2="blurout" mode="normal" /></filter><filter id="hover" x="0" y="0" width="110%" height="110%"><feoffset result="offout" in="sourcegraphic" dx="1" dy="1" /><fecolormatrix result="matrixout" in="offout" type="matrix" values="0.3686 0 0 0 0 0 0.5529 0 0 0 0 0 0.9804 0 0 0 0 0 1 0 " /><fegaussianblur result="blurout" in="matrixout" stddeviation="2" /><feblend in="sourcegraphic" in2="blurout" mode="normal" /></filter><filter id="selected" x="0" y="0" width="110%" height="110%"><feoffset result="offout" in="sourcegraphic" dx="3" dy="3" /><fecolormatrix result="matrixout" in="offout" type="matrix" values="0.3686 0 0 0 0 0 0.5529 0 0 0 0 0 0.9804 0 0 0 0 0 1 0 " /><fegaussianblur result="blurout" in="matrixout" stddeviation="2" /><feblend in="sourcegraphic" in2="blurout" mode="normal" /></filter></defs><g class="transgroup" transform="scale(1)" ><line y1="157" x1="0" y2="157" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="99" x1="-17" y2="99" x2="672" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="99" x1="672" y2="41" x2="672" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="0" y2="273" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="0" y2="389" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="505" x1="0" y2="505" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="0" y2="621" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="-17" y2="99" x2="-17" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="157" 
x1="345" y2="157" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="345" y2="273" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="345" y2="389" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="505" x1="345" y2="505" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="345" y2="621" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="328" y2="99" x2="328" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="157" x1="690" y2="157" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="690" y2="273" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="690" y2="389" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="505" x1="690" y2="505" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="690" y2="621" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="621" x1="673" y2="99" x2="673" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="157" x1="1035" y2="157" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="99" x1="1018" y2="99" x2="672" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="273" x1="1035" y2="273" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="1035" y2="389" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><line y1="389" x1="1018" y2="99" x2="1018" style="fill:#292a2b;stroke:#292a2b;stroke-width:1;" /><g uid="1" data-is-root="true" data-id="49090049" data-type="position" class="nodegroup select" dotted="false"><rect filter="url(#default)" expand-type="null" tabindex="-1" x="517" width="18" y="0" data-id="49090049" pptrect="1" data-type="position" class="leftborderposition leftborder" height="82"></rect><rect filter="url(#default)" expand-type="null" tabindex="207" x="535" 
width="292" y="0" data-id="49090049" pptrect="1" data-type="position" class="rect select" height="82"></rect><switch class="select">
<foreignobject x="545" width="282" y="10" data-id="49090049" style="text-align:left;" data-type="position" class="foreignobject null select" height="62"><body><div class="node clearfix">
<div class="node-portrait">
<img src="services/userservice/image?uid=72299&plantype=&planname=" style="border-radius: 50%;" class="select circularimage" height="60"></img>
</div>
<div class="node-content">
<div class="node-content-item node-content-bold" title="director, manufacturing">director, manufacturing</div>
<div class="node-content-item node-content-normal" title="helle carlson">helle carlson</div>
<div class="node-content-icons clearfix">
<div class="node-content-item node-content-normal" title="chicago">chicago</div>
</div>
</div>
</div>
</body></foreignobject></svg>
I tried this:
(<foreignobject[^>]*[^/]>)[^&]*(?!\\s*</foreignobject>) but it does not remove the element.
mysvg = mysvg.replaceAll("(<foreignobject[^>]*[^/]>)[^&]*(?!\\s*</foreignobject>)"," ");
【Discussion】:
-
Using a regular expression for this is not recommended. A better idea is to parse the XML into a DOM tree and manipulate it that way.
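
The DOM approach suggested in this comment could look roughly like the sketch below. It is a minimal illustration, not the commenter's code: the class name ForeignObjectDomRemover and the method removeForeignObjects are hypothetical, and it assumes the input is well-formed XML. The excerpt in the question is not (it appears truncated and has a bare & in the img src attribute), so the real document would need to be complete and properly escaped before a strict parser accepts it. The sketch matches the lowercase name foreignobject as it appears in the question; standard SVG spells the element foreignObject.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
final class ForeignObjectDomRemover {

    // Parses the SVG, removes every <foreignobject> subtree, and serializes the result.
    // Assumption: the input is well-formed XML and uses the lowercase tag name.
    static String removeForeignObjects(String svg) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations so external entities cannot be pulled in.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(svg)));

        NodeList matches = doc.getElementsByTagName("foreignobject");
        // The NodeList is live, so iterate backwards while removing nodes.
        // Nested matches have higher indices and are therefore removed first.
        for (int i = matches.getLength() - 1; i >= 0; i--) {
            Node node = matches.item(i);
            node.getParentNode().removeChild(node);
        }

        // Serialize the modified tree back to a string without an XML declaration.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String svg = "<svg><switch><foreignobject x=\"545\"><body><div>helle carlson</div>"
                + "</body></foreignobject></switch><rect/></svg>";
        System.out.println(removeForeignObjects(svg));
    }
}
```

Because the parser tracks element boundaries itself, empty and nested foreignobject elements need no special handling here.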
-
This is the canonical answer: stackoverflow.com/a/1732454/630136. The short version: don't.
-
Yes, I know, but is there a way to do this with a regular expression?
-
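Yes, but it will break and fail in ways you did not foresee. It will fail on an empty <foreignobject/> tag, on mixed namespaces, on an unexpected namespace prefix, on HTML entity references inside the tag's attributes, on nested foreignobject elements... Meanwhile, this is absolutely trivial to solve in XSLT, which is designed specifically for processing XML, so use that instead.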
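
As a rough illustration of the XSLT route mentioned in this comment, the sketch below runs an identity transform with one extra empty template from Java via javax.xml.transform. The class name ForeignObjectXsltRemover, the method removeForeignObjects, and the stylesheet itself are assumptions written for this example, not code from the thread. It assumes well-formed input and matches on local-name() so that a namespace or prefix on the element does not matter; the name is lowercase as in the question, whereas standard SVG uses foreignObject.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named here only for illustration.
final class ForeignObjectXsltRemover {

    // Identity transform: copy everything, except that the second (empty) template
    // matches foreignobject elements and emits nothing for them or their content.
    private static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"@*|node()\">"
          + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
          + "</xsl:template>"
          + "<xsl:template match=\"*[local-name()='foreignobject']\"/>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the given SVG text and returns the transformed XML.
    static String removeForeignObjects(String svg) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(svg)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String svg = "<svg><switch><foreignobject x=\"545\"><body><div>helle carlson</div>"
                + "</body></foreignobject></switch><rect/></svg>";
        System.out.println(removeForeignObjects(svg));
    }
}
```

The empty template takes precedence over the generic copy template because a pattern with a predicate has a higher default priority than node(), so no explicit priority attribute is needed.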