Replace img tag with the title attribute: answers

【Question title】: Replace img tag with the title attribute
【Posted】: 2014-03-12 08:35:22
【Question】:

I have an HTML string with the following content:

<p>your name :
<img title="##name##" src="name.jpg"/></p>
<p>your lastname:
<img title="##lastname##" src="lastname.jpg"/></p>
<p>your email :
<img title="##email##" src="email.jpg"/></p>
<p>submit
<img title="submit" src="submit.jpg"/></p>

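Now I want to extract every title attribute whose value is wrapped in a pair of ## markers, remove the corresponding <img> tag, and put the extracted title in its place.

The result should look like this: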

<p>your name :
##name##</p>
<p>your lastname:
##lastname##</p>
<p>your email :
##email##</p>
<p>submit
<img title="submit" src="submit.jpg"/></p>

What is the best way to do this?

【Question discussion】:

Tags: php regex

【Solution 1】:

Use an HTML parser for this task. Here is a solution using the built-in DOMDocument class:

$dom = new DOMDocument;
// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTML($html);

$tags = $dom->getElementsByTagName('img');
$length = $tags->length;

// Iterate backwards: the node list is live, so replacing a node shifts the later indexes
for ($i=$length-1; $i>=0; $i--) {
    $tag = $tags->item($i);
    $title = $tag->getAttribute('title');

    // check if title is of the format '##...##'
    if (preg_match('/##\w+?##/', $title)) {
        $textNode = $dom->createTextNode($title);
        $tag->parentNode->replaceChild($textNode, $tag);
    }
}

// Strip the DOCTYPE and the html/head/body wrappers that loadHTML() adds
$html = preg_replace(
    '~<(?:!DOCTYPE|/?(?:html|head|body))[^>]*>\s*~i', '',
    $dom->saveHTML()
);
echo $html;

Output:

<p>your name :
##name##</p>
<p>your lastname:
##lastname##</p>
<p>your email :
##email##</p>
<p>submit
<img title="submit" src="submit.jpg"></p>

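As a variation on the DOMDocument approach, the wrapper markup can also be avoided by serializing only the children of <body> instead of stripping tags with a regex afterwards. The following is a minimal sketch, not the answerer's code: the function name replacePlaceholderImages is made up for illustration, and it assumes the input is a non-empty HTML fragment.

```php
<?php
// Hypothetical helper name; a sketch of the DOM-based approach, not the original answer.
function replacePlaceholderImages(string $html): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parse warnings out of the output
    $dom->loadHTML($html);
    libxml_clear_errors();

    // Take a snapshot of the live node list so replacements do not disturb the loop.
    foreach (iterator_to_array($dom->getElementsByTagName('img')) as $img) {
        $title = $img->getAttribute('title');
        if (preg_match('/##\w+?##/', $title)) {
            $img->parentNode->replaceChild($dom->createTextNode($title), $img);
        }
    }

    // Serialize only what is inside <body>, so no DOCTYPE/html/body wrappers appear.
    // Assumes the parser produced a <body>, which holds for non-empty fragments.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo replacePlaceholderImages(
    '<p>your name :<img title="##name##" src="name.jpg"/></p>' .
    '<p>submit<img title="submit" src="submit.jpg"/></p>'
), "\n";
```

The snapshot taken with iterator_to_array() plays the same role as the backwards loop in the answer: both avoid skipping nodes when the live list shrinks.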

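【Discussion】:

Thanks, but why is ##lastname## not being replaced!?
Very good, but I think this regular expression works well for checking the ## format: /##([^#]*)##/
@ArazJafaripur: Why do you think so? That regex matches ##, then captures anything that is not a #, until it finds the closing ##. In this case you don't need to capture the title in the regex, because you are already extracting the title attribute with the getAttribute() method :)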
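To make the difference between the two patterns in this exchange concrete, here is a small sketch that runs both against a few title values. The last two sample values are hypothetical edge cases, not taken from the question.

```php
<?php
// '##name##' and 'submit' come from the question; '####' and '##first name##' are made-up edge cases.
$titles = ['##name##', 'submit', '####', '##first name##'];

$results = [];
foreach ($titles as $title) {
    // Pattern from the answer: one or more word characters between the markers.
    $answer = preg_match('/##\w+?##/', $title);
    // Pattern from the comment: any run of non-# characters, including an empty one.
    $comment = preg_match('/##([^#]*)##/', $title);

    $results[$title] = [$answer, $comment];
    printf("%-16s answer=%d comment=%d\n", $title, $answer, $comment);
}
```

Both patterns agree on the titles in the question. They differ only on the edge cases: the comment's pattern also accepts an empty placeholder and placeholders containing spaces, while the answer's pattern rejects both.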

【Solution 2】:

I think you can try this:

$content = preg_replace('/<img.*?(##.+##).*?\/>/','${1}', $content);
$content = str_replace('##','',$content);

【Discussion】:

【Solution 3】:

Try this:

$content = preg_replace('/<img.*?(##.+##).*?\/>/', '$1', $content);
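One caveat with this pattern (also used in Solution 2): it works on the question's input because each <img> sits on its own line and `.` does not match newlines. If two placeholder images share a line, the greedy `.+` runs from the first ## to the last one. The sketch below shows that case next to a tighter variant; the input string is hypothetical, and the tighter pattern assumes double-quoted title attributes.

```php
<?php
// Hypothetical input: two placeholder images on the same line.
$oneLine = '<p><img title="##name##" src="name.jpg"/> <img title="##email##" src="email.jpg"/></p>';

// Pattern from the answer: the greedy .+ spans from the first ## to the last ## on the line.
$greedy = preg_replace('/<img.*?(##.+##).*?\/>/', '$1', $oneLine);

// Tighter variant: [^>]* stays inside one tag, [^"#]+ stays inside one attribute value.
$tight = preg_replace('/<img\b[^>]*title="(##[^"#]+##)"[^>]*>/i', '$1', $oneLine);

echo $greedy, "\n";
echo $tight, "\n";
```

With the tighter pattern each tag is replaced on its own, while the original pattern merges both tags into a single match and leaves attribute debris in the output.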

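【Discussion】:

【Solution 4】:

So first you want to select any region that starts with "<img", then contains "##", then 1 or more characters, then "##", and ends with ">".

Then, within that extracted block, you want to find the part that starts with "##", then has 1 or more characters, then ends with "##".

Having written it out like this, I hope you can come up with the regular expression yourself.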
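The two steps described above can be sketched with preg_replace_callback(): the outer pattern selects each <img> tag, and the callback searches for the ##...## part inside that tag only. This is one possible reading of the description, not the answerer's own code, and the sample input is a shortened version of the HTML from the question.

```php
<?php
$html = <<<'HTML'
<p>your name :
<img title="##name##" src="name.jpg"/></p>
<p>submit
<img title="submit" src="submit.jpg"/></p>
HTML;

// Step 1: match each whole <img ...> tag.
$result = preg_replace_callback(
    '/<img\b[^>]*>/i',
    function (array $m): string {
        // Step 2: look for ##...## inside this tag; keep the tag unchanged if there is none.
        return preg_match('/##[^#]+##/', $m[0], $inner) ? $inner[0] : $m[0];
    },
    $html
);

echo $result, "\n";
```

Splitting the work this way keeps each pattern simple, and tags without a placeholder (such as the submit image) pass through untouched.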

【Discussion】: