【Question title】: PHP - Insert tags (<div>) between other tags (<p>) in a string
【Posted】: 2015-04-08 13:54:54
【Question】:

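I have a string in PHP that comes from a request (it is actually the content of a CKEditor WYSIWYG text editor). I am trying to insert a tag (div) inside other tags (p), and also to copy the data attributes from the previous p > div.

An example makes this easier to understand: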

$String =
<p>
    <div class="ST" data-start="1" data-end="5">
        <span>Blabla1 </span><span>Blabla2</span>
    </div>
</p>
<p>
    Blabla3 Blabla4
</p>
<p>
    <div class="ST" data-start="6" data-end="10">
        <span>Blabla10 </span><span>Blabla20</span>
    </div>
</p>

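Here the first and last <p> are fine. The one I want to change is the second <p>.

I need to put "Blabla3 Blabla4" inside a <div class="ST"> that uses the data-start and data-end attributes of the previous <div> (here data-start = 1 and data-end = 5), to end up with this: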

<p>
    <div class="ST" data-start="1" data-end="5">
        <span>Blabla1 </span><span>Blabla2</span>
    </div>
</p>
<p>
    <div class="ST" data-start="1" data-end="5">
       Blabla3 Blabla4
    </div>
</p>
<p>
    <div class="ST" data-start="6" data-end="10">
        <span>Blabla10 </span><span>Blabla20</span>
    </div>
</p>

The string can also look like this (starting with a plain <p>). In that case, set data-start and data-end to 0:

<p>
    Blabla3 Blabla4
</p>
<p>
    <div data-start="0" data-end="5">
        <span>Blabla1 </span><span>Blabla2</span>
    </div>
</p>
<p>
    <div data-start="6" data-end="10">
        <span>Blabla10 </span><span>Blabla20</span>
    </div>
</p>

Or like this (with two or more plain <p> in a row). In that case, set data-start and data-end to 1 and 5 on each of them, as before:

<p>
    <div data-start="1" data-end="5">
        <span>Blabla1 </span><span>Blabla2</span>
    </div>
</p>
<p>
    Blabla3 Blabla4
</p>
<p>
    Blabla5 Blabla6
</p>
<p>
    <div data-start="6" data-end="10">
        <span>Blabla10 </span><span>Blabla20</span>
    </div>
</p>

I don't know how to manipulate the string... maybe with a regular expression?

Thanks for your help!

EDIT 1

Here is what I tried:

$value =

string 
'<p><show class="st" data-time-end="1.25" data-time-moy="0.12125" data-time-start="0.28" id="1"><word class="word" data-time-end="1.25" data-time-start="0.28">TEST1&nbsp; </word><word class="word" data-time-end="1.25" data-time-start="1.25"> </word></show><show class="st" data-time-end="1.25" data-time-moy="0.13857142857143" data-time-start="0.28" id="11"><word class="word" data-time-end="1.25" data-time-start="0.28">TEST2. </word><word class="word" data-time-end="1.25" data-time-start="1.25"> </word></show><show class="st" data-time-end="1.25" data-time-moy="0.194" data-time-start="0.28" id="12"><word class="word" data-time-end="1.444" data-time-start="0.28">TEST3 </word></show></p>

    <p>TESTTTT</p>' (length=709)

My code (I am using Symfony2 and a data transformer):

public function reverseTransform($value)
{
        $value_purified = strip_tags($value, '<p><show><strong><span><word><em><u>'); // Allow only the tags listed here

        // Create a DOM with $value
        $dom = new DOMDocument();
        $dom->preserveWhiteSpace = false;
        $dom->formatOutput = true;
        libxml_use_internal_errors(true); // tolerate tags that are not valid HTML5
        $dom->loadHTML($value_purified); // load the string $value_purified into the DOM $dom
        libxml_use_internal_errors(false); // report non-conforming tags again

        var_dump($dom);

        $pTags = $dom->getElementsByTagName('p');
        var_dump($pTags); 

        foreach ($pTags as $pTag) {
            var_dump($pTag);
            $valuePTagFull = $this->DOMinnerHTML($pTag);
            if (strpos($valuePTagFull,'<show') === false) {
                $valuePTagFull = "<show class='st'>".$valuePTagFull."</show>";
            } 
            var_dump($valuePTagFull);
        }

        $value_purified = strip_tags($value, '<show><strong><span><word><em><u>'); // Allow only the tags listed here (this removes the <p> tags)
        var_dump($value_purified);
}

private function DOMinnerHTML(DOMNode $element)
{
    $innerHTML = "";
    $children = $element->childNodes;
    foreach ($children as $child) {
        $innerHTML .= $element->ownerDocument->saveHTML($child);
    }
    return $innerHTML;
}

Here are my var_dumps: 1/ var_dump($dom);

object(DOMDocument)[1000]
  public 'doctype' => string '(object value omitted)' (length=22)
  public 'implementation' => string '(object value omitted)' (length=22)
  public 'documentElement' => string '(object value omitted)' (length=22)
  public 'actualEncoding' => null
  public 'encoding' => null
  public 'xmlEncoding' => null
  public 'standalone' => boolean true
  public 'xmlStandalone' => boolean true
  public 'version' => null
  public 'xmlVersion' => null
  public 'strictErrorChecking' => boolean true
  public 'documentURI' => null
  public 'config' => null
  public 'formatOutput' => boolean true
  public 'validateOnParse' => boolean false
  public 'resolveExternals' => boolean false
  public 'preserveWhiteSpace' => boolean false
  public 'recover' => boolean false
  public 'substituteEntities' => boolean false
  public 'nodeName' => string '#document' (length=9)
  public 'nodeValue' => null
  public 'nodeType' => int 13
  public 'parentNode' => null
  public 'childNodes' => string '(object value omitted)' (length=22)
  public 'firstChild' => string '(object value omitted)' (length=22)
  public 'lastChild' => string '(object value omitted)' (length=22)
  public 'previousSibling' => null
  public 'attributes' => null
  public 'ownerDocument' => null
  public 'namespaceURI' => null
  public 'prefix' => string '' (length=0)
  public 'localName' => null
  public 'baseURI' => null
  public 'textContent' => string 'TEST1  TEST2. TEST3 TESTTTT' (length=32)

2/ This one is fine: my string has 2 <p> tags, and var_dump($pTags) reports a length of 2:

var_dump($pTags);
object(DOMNodeList)[1001]
     public 'length' => int 2

3/ Here var_dump($pTag); shows the 2 <p> tags:

var_dump($pTag);
object(DOMElement)[1040]
  public 'tagName' => string 'p' (length=1)
  public 'schemaTypeInfo' => null
  public 'nodeName' => string 'p' (length=1)
  public 'nodeValue' => string 'TEST1  TEST2. TEST3 ' (length=21)
  public 'nodeType' => int 1
  public 'parentNode' => string '(object value omitted)' (length=22)
  public 'childNodes' => string '(object value omitted)' (length=22)
  public 'firstChild' => string '(object value omitted)' (length=22)
  public 'lastChild' => string '(object value omitted)' (length=22)
  public 'previousSibling' => null
  public 'nextSibling' => string '(object value omitted)' (length=22)
  public 'attributes' => string '(object value omitted)' (length=22)
  public 'ownerDocument' => string '(object value omitted)' (length=22)
  public 'namespaceURI' => null
  public 'prefix' => string '' (length=0)
  public 'localName' => string 'p' (length=1)
  public 'baseURI' => null
  public 'textContent' => string 'TEST1  TEST2. TEST3 ' (length=21)



object(DOMElement)[1062]
      public 'tagName' => string 'p' (length=1)
      public 'schemaTypeInfo' => null
      public 'nodeName' => string 'p' (length=1)
      public 'nodeValue' => string 'TESTTTT' (length=7)
      public 'nodeType' => int 1
      public 'parentNode' => string '(object value omitted)' (length=22)
      public 'childNodes' => string '(object value omitted)' (length=22)
      public 'firstChild' => string '(object value omitted)' (length=22)
      public 'lastChild' => string '(object value omitted)' (length=22)
      public 'previousSibling' => string '(object value omitted)' (length=22)
      public 'attributes' => string '(object value omitted)' (length=22)
      public 'ownerDocument' => string '(object value omitted)' (length=22)
      public 'namespaceURI' => null
      public 'prefix' => string '' (length=0)
      public 'localName' => string 'p' (length=1)
      public 'baseURI' => null
      public 'textContent' => string 'TESTTTT' (length=7)

4/ Here, if a <p> tag does not contain a <show> tag, I wrap its contents in a <show> tag. This works for my second <p> tag, which initially has no <show> tag:

var_dump($valuePTagFull);
string '<show class='st'>TESTTTT</show>' (length=31)

5/ But this is where I have a problem. When I run var_dump($value_purified); at the end of the code, I get:

string '<show class="st" data-time-end="1.25" data-time-moy="0.12125" data-time-start="0.28" id="1"><word class="word" data-time-end="1.25" data-time-start="0.28">TEST1&nbsp; </word><word class="word" data-time-end="1.25" data-time-start="1.25"> </word></show><show class="st" data-time-end="1.25" data-time-moy="0.13857142857143" data-time-start="0.28" id="11"><word class="word" data-time-end="1.25" data-time-start="0.28">TEST2. </word><word class="word" data-time-end="1.25" data-time-start="1.25"> </word></show><show class="st" data-time-end="1.25" data-time-moy="0.194" data-time-start="0.28" id="12"><word class="word" data-time-end="1.444" data-time-start="0.28">TEST3 </word></show>

TESTTTT' (length=695)

Why is the word "TESTTTT" at the end not wrapped in <show> tags, whereas in var_dump($valuePTagFull); the <show> tags are there?

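【Comments on the question】:

  • I don't see any PHP in your code?
  • Please post some PHP code so we can see how this is built. You can probably do this easily in PHP.
  • It is just a string that I get from a request form, with this shape

    ....
  • I would use the DOMDocument class and its related classes to achieve this. You can use XPath expressions to find the relevant p tags in the document, then check whether each one has the required div tag as a child element. The DOMDocument API also lets you add and modify elements of the DOM tree, so you can insert the div tags and then output your HTML with DOMDocument::saveHTML().
  • @ILadis Interesting! I will look into that!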
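
The XPath lookup suggested in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: the sample string is a shortened version of the question's first example, the <root> wrapper is a made-up element added so the fragment parses as a single XML document, and the predicate `//p[not(div)]` is one possible way to select the <p> elements that have no <div> child.

```php
<?php
// Sample input modeled on the question's first example (shortened).
$html = '<p><div class="ST" data-start="1" data-end="5"><span>Blabla1 </span></div></p>'
      . '<p>Blabla3 Blabla4</p>';

$dom = new DOMDocument();
// <root> is a hypothetical wrapper so the fragment has a single root element.
$dom->loadXML('<root>' . $html . '</root>');

$xpath = new DOMXPath($dom);
// Select only the <p> elements that do not have a <div> child.
$bare = $xpath->query('//p[not(div)]');

foreach ($bare as $p) {
    // Print the text of each paragraph that still needs a <div>.
    echo trim($p->textContent), PHP_EOL;
}
```

Only the second paragraph matches here, so it is the only one that would need a new <div>.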

Tags: php string dom tags


【Solution 1】:

Here is a solution that manipulates a DOMDocument to get the desired result. See the comments in the code for details:

class foo
{
    public function reverseTransform($value)
    {
        $dom = new DOMDocument();
        $dom->preserveWhiteSpace = false;
        $dom->formatOutput = true;

        // Load contents wrapped in a temporary root node
        $dom->loadXML('<root>' . $value . '</root>');

        // Use an XPath query to get all P elements
        $xPath = new DOMXPath($dom);
        $pTags = $xPath->query('//p');

        // Loop through the P elements
        $dataStart = 0;
        $dataEnd   = 0;

        foreach ($pTags as $pTag) {
            // Get any DIV elements inside the P
            $divs = $xPath->query('./div', $pTag);

            if ($divs->length > 0) {
                // This P element already has a div. Grab the
                // data-start/end attributes for later
                $div = $divs->item(0);
                $dataStart = $div->getAttribute('data-start');
                $dataEnd   = $div->getAttribute('data-end');
            }
            else {
                // Create a new DIV element and set attributes
                $div = $dom->createElement('div');
                $div->setAttribute('class',      'ST');
                $div->setAttribute('data-start', $dataStart);
                $div->setAttribute('data-end',   $dataEnd);

                // Move all children of P into DIV
                $child = $pTag->firstChild;
                while ($child) {
                    $nextChild = $child->nextSibling;
                    $div->insertBefore($child);
                    $child = $nextChild;
                }

                // Move the DIV inside the P element
                $pTag->appendChild($div);
            }
        }
        // Get HTML, removing temporary root element
        $html = preg_replace(
            '#.*?<root>\s*(.*)\s*</root>#s', '\1',
            $dom->saveXML()
        );
        return $html;
    }
}

$string = <<<EOS
<p>
    Blabla1 Blabla2
</p>
<p>
    <div data-start="1" data-end="5">
        <span>Blabla3 </span><span>Blabla4</span>
    </div>
</p>
<p>
    Blabla5 Blabla6
</p>
<p>
    Blabla7 Blabla8
</p>
<p>
    <div data-start="6" data-end="10">
        <span>Blabla9 </span><span>Blabla10</span>
    </div>
</p>
<p>
    Blabla11 Blabla12
</p>
EOS;

echo (new foo)->reverseTransform($string), PHP_EOL;

Output (indented for clarity):

<p>
    <div class="ST" data-start="0" data-end="0">
        Blabla1 Blabla2
    </div>
</p>
<p>
    <div data-start="1" data-end="5">
        <span>Blabla3 </span>
        <span>Blabla4</span>
    </div>
</p>
<p>
    <div class="ST" data-start="1" data-end="5">
        Blabla5 Blabla6
    </div>
</p>
<p>
    <div class="ST" data-start="1" data-end="5">
        Blabla7 Blabla8
    </div>
</p>
<p>
    <div data-start="6" data-end="10">
        <span>Blabla9 </span>
        <span>Blabla10</span>
    </div>
</p>
<p>
    <div class="ST" data-start="6" data-end="10">
        Blabla11 Blabla12
    </div>
</p>
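
One caveat when applying this to the string from EDIT 1: that string contains `&nbsp;`, which is an HTML entity and not one of the five entities predefined in XML, so loadXML() is expected to reject it. The sketch below shows the failure and one assumed workaround, replacing the named entity with its numeric form before parsing. The sample value is a shortened version of the EDIT 1 string, and the variable names are illustrative only.

```php
<?php
// One paragraph from the EDIT 1 sample, shortened.
$value = '<p><show class="st" id="1"><word class="word">TEST1&nbsp; </word></show></p>';

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
// &nbsp; is not declared in XML, so this parse is expected to fail.
$rawLoaded = $dom->loadXML('<root>' . $value . '</root>');

// Assumed workaround: use the numeric character reference instead.
$safeValue = str_replace('&nbsp;', '&#160;', $value);
$dom = new DOMDocument();
$safeLoaded = $dom->loadXML('<root>' . $safeValue . '</root>');

// Restore the previous error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

var_dump($rawLoaded, $safeLoaded);
```

Other named HTML entities in editor output would need the same treatment, or the input could be parsed with loadHTML() instead.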

【Comments】:

    【Solution 2】:

    If it is valid HTML, you can use the loadHTML function and manipulate your string more quickly: http://php.net/manual/en/domdocument.loadhtml.php
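
As a rough sketch of that idea applied to the code in EDIT 1: there, $valuePTagFull is only a local string and the final strip_tags() call runs on the original $value, so the added <show> wrapper never reaches the result. The version below changes the DOM nodes themselves and serializes the DOM afterwards. The function name wrapBareParagraphs is made up for this illustration, and the sketch does not copy any data attributes from the previous <show>.

```php
<?php
// Hypothetical helper, not from the original post: wraps the contents of
// every <p> that has no <show> child in a new <show class="st"> element.
function wrapBareParagraphs(string $html): string
{
    $dom = new DOMDocument();

    // <show> and <word> are not standard HTML tags, so keep any parser
    // errors internal instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//p[not(show)]') as $p) {
        $show = $dom->createElement('show');
        $show->setAttribute('class', 'st');
        // Move every child of the <p> into the new <show> element.
        while ($p->firstChild) {
            $show->appendChild($p->firstChild);
        }
        $p->appendChild($show);
    }

    // loadHTML() adds <html> and <body>; return only the body's children.
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo wrapBareParagraphs('<p><show class="st" id="1">TEST1</show></p><p>TESTTTT</p>'), PHP_EOL;
```

Note that this relies on <show> being an unknown tag that the HTML parser leaves inside the <p>. With the <div> markup from the first examples, an HTML parser may close the <p> before the <div>, which is one reason Solution 1 parses the string as XML.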

    【Comments】:
