【Question title】: Extracting a certain part of a string encapsulated inside a tag
【Posted】: 2015-07-27 20:07:02
【Question】:

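I am working with large strings and want to implement a regular expression or a similar solution to extract a certain part of the string. The part I want to extract is enclosed in [test ][/test] tags inside the string. Everything outside the tags should be removed. How can I do this efficiently in PHP?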

   $subject = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

[test ]https://www.test.com/this_a_test[/test]";

$pattern = '~\[test (?|=[\'"]?+([^]"\']++)[\'"]?+]([^[]++)|](([^[]++)))\[/test]~';
$replacement = '$1';

$result = preg_replace($pattern, $replacement, $subject);
var_dump( $result );

Current output:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. https://www.test.com/this_a_test  

Desired output:

https://www.test.com/this_a_test

【Comments】:

Tags: php regex


【Solution 1】:

You can use the following regular expression to capture the substring inside the tags:

\[test\s*](.*?)\[\/test]

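Use this regex with preg_match_all rather than preg_replace: preg_replace only rewrites the matched part and leaves the rest of the string in place, which is why the surrounding text survives in the current output.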
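To make the pieces of the pattern easier to follow, here is a sketch of the same regex written with the x (extended) modifier so each part can carry a comment. The sample string and variable names are made up for illustration; the s modifier is included because the content between the tags may span several lines.

```php
<?php
// Same pattern as above, spread out with the x (extended) modifier.
$re = '~
    \[test\s*\]   # opening tag: literal "[test", optional whitespace, then "]"
    (.*?)         # group 1: the content, lazy so it stops at the first closing tag
    \[/test\]     # closing tag: literal "[/test]"
~sx';             // s: "." also matches newlines; x: ignore whitespace, allow # comments

// Illustrative input (an assumption, not taken from the question).
$str = "intro text\n[test ]first[/test] middle [test]second\nline[/test]";

preg_match_all($re, $str, $matches);
print_r($matches[1]); // $matches[1] holds the captured contents of every tag pair
```

Because the quantifier is lazy, each tag pair yields its own entry in $matches[1] instead of one long match running from the first opening tag to the last closing tag.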

See the regex demo and the IDEONE demo:

$re = '~\[test\s*](.*?)\[\/test]~s'; 
$str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n\n[test ]https://www.test.com/this_a_test[/test]"; 
preg_match_all($re, $str, $matches);
print_r($matches[1]);

Output:

Array
(
    [0] => https://www.test.com/this_a_test
)
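If only the first tagged value is needed, as in the question, preg_match is enough. The following is a minimal sketch under that assumption; the helper name extractTagContent and its parameters are hypothetical and not part of the original answer, and the nullable return type requires PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the content of the first [tag ]...[/tag] pair, or null.
function extractTagContent(string $subject, string $tag = 'test'): ?string
{
    // preg_quote escapes regex metacharacters in the tag name;
    // the second argument also escapes the "~" delimiter.
    $name = preg_quote($tag, '~');
    $pattern = '~\[' . $name . '\s*\](.*?)\[/' . $name . '\]~s';

    // preg_match stops at the first match and returns 1 on success, 0 when nothing matches.
    if (preg_match($pattern, $subject, $m) === 1) {
        return $m[1];
    }
    return null;
}

$subject = "Lorem ipsum dolor sit amet.\n\n[test ]https://www.test.com/this_a_test[/test]";
var_dump(extractTagContent($subject));        // the URL between the tags
var_dump(extractTagContent('no tags here'));  // NULL, since there is no tag pair
```

Returning null for a missing tag lets the caller distinguish "no tag found" from an empty tag such as [test][/test], which yields an empty string.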

【Comments】:

  • Glad it worked for you. If it proved helpful, please also consider upvoting the answer.