Differentiate between two almost identical links in regex

[Question title]: Differentiate between two almost identical links in regex
[Posted]: 2016-09-12 21:11:54
[Question description]:

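I created a plugin that converts links into the Facebook embedded version of the content at the link. My problem is that if I disable the comments part of the plugin, links to comments become embedded posts (as long as the posts part of the plugin is still active).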

Let's take a look. We have 3 links:

A Facebook post

<a href="https://www.facebook.com/zuck/posts/10102577175875681" target="_blank">ONE</a>

A comment on that post

<a href="https://www.facebook.com/zuck/posts/10102577175875681?comment_id=1193531464007751" target="_blank">Two</a>

and a reply to a comment

<a href="https://www.facebook.com/zuck/posts/10102577175875681?comment_id=1193531464007751&reply_comment_id=10102577641662241" target="_blank">Three</a>

All three links start with

https://www.facebook.com/zuck/posts/10102577175875681

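In the code below, the if conditions are my settings toggles, and $this->post['message'] is whatever the user posted, so in this example $this->post['message'] contains the three links above.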

Here is the plugin I created to convert these links.

if ($this->registry->options['drcae_facebook_comment_onoff']) {
  // swaps facebook comment links to embed code
  $drc_embed_facebook_cmt = '<div class="fb-comment-embed" data-include-parent="true" data-width="560" data-href="https://www.facebook.com/$3/posts/$4comment_id=$5"></div>';
  $this->post['message'] = preg_replace('~<a (.*)href="(.*)facebook.com/(.*)/posts/(.*)?comment_id=(.*)"(.*)<\/a>~', $drc_embed_facebook_cmt, $this->post['message']);
}

if ($this->registry->options['drcae_facebook_post_onoff']) {
  // swaps facebook post links to embed code
  $drc_embed_facebook_post = '<div class="fb-post" data-href="https://www.facebook.com/$3/posts/$4"></div>';
  $this->post['message'] = preg_replace('~<a (.*)href="(.*)facebook.com/(.*)/posts/(.*)"(.*)<\/a>~', $drc_embed_facebook_post, $this->post['message']);
}

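I originally had this the other way round (posts first), but that caused comments to be embedded as posts. I worked around it by checking for comments first, which is probably not the best approach.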
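The ordering issue described above can be reproduced in isolation. The following is a minimal sketch, not part of the plugin: it runs the two patterns from the plugin code against the three example links with preg_match and records which ones match. The variable names ($commentPattern, $postPattern, $links, $report) are made up for illustration.

```php
<?php
// The two patterns, copied unchanged from the plugin code in the question.
$commentPattern = '~<a (.*)href="(.*)facebook.com/(.*)/posts/(.*)?comment_id=(.*)"(.*)<\/a>~';
$postPattern    = '~<a (.*)href="(.*)facebook.com/(.*)/posts/(.*)"(.*)<\/a>~';

// The three example links from the question, keyed by their link text.
$links = [
    'ONE'   => '<a href="https://www.facebook.com/zuck/posts/10102577175875681" target="_blank">ONE</a>',
    'Two'   => '<a href="https://www.facebook.com/zuck/posts/10102577175875681?comment_id=1193531464007751" target="_blank">Two</a>',
    'Three' => '<a href="https://www.facebook.com/zuck/posts/10102577175875681?comment_id=1193531464007751&reply_comment_id=10102577641662241" target="_blank">Three</a>',
];

// Record, per link, whether each pattern matches (preg_match returns 1 or 0).
$report = [];
foreach ($links as $label => $html) {
    $report[$label] = [
        'comment' => preg_match($commentPattern, $html),
        'post'    => preg_match($postPattern, $html),
    ];
}

print_r($report);
```

The post pattern matches all three links, while the comment pattern matches only the second and third. That is why a comment link left untouched by a disabled comment step is still picked up by the post step.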

As you have probably noticed, my regex is not great, but it is what I managed on my own as a complete regex novice.

~<a (.*)href="(.*)facebook.com/(.*)/posts/(.*)"(.*)<\/a>~

I chose to write my regex this way so that it does not matter if the link is formatted like the following; it will still be embedded:

<a target="blank" href="https://www.facebook.com/USERNAME/posts/1234567890" alt="facebook post">LINK</a>
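
To check the claim about attribute order, here is a small sketch that applies the post pattern to the reordered link above and prints what the capture groups hold. The variable names ($postPattern, $link, $matched, $m) are illustrative only.

```php
<?php
// Post pattern, copied unchanged from the question.
$postPattern = '~<a (.*)href="(.*)facebook.com/(.*)/posts/(.*)"(.*)<\/a>~';

// The reordered example link: target comes first, alt comes after href.
$link = '<a target="blank" href="https://www.facebook.com/USERNAME/posts/1234567890" alt="facebook post">LINK</a>';

$matched = preg_match($postPattern, $link, $m);

echo 'matched: ' . $matched . PHP_EOL;
echo 'group 1: ' . $m[1] . PHP_EOL; // attributes before href
echo 'group 3: ' . $m[3] . PHP_EOL; // user name
echo 'group 4: ' . $m[4] . PHP_EOL; // everything up to the LAST double quote in the tag
```

The pattern does match regardless of attribute order. Note, though, that because (.*) is greedy, group 4 runs up to the last double quote in the tag, so it also contains the alt attribute and not just the post ID. A negated character class such as [^"] is a common way to stop a capture at the closing quote.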

But now I am second-guessing my own work, and since searching turned up nothing useful, I thought I would ask for help.

How can I differentiate these links so that the post conversion does not interfere with comments / comment replies?

Update 1, embedding posts

My plugin now looks like this

$drc_embed_facebook_post = '<div class="fb-post" data-href="https://www.facebook.com/$2/posts/$3"></div>';
$this->post['message'] = preg_replace('~<a (.*?)facebook\.com/([^/]+)/[^/]+/([0-9]+)(?:[?][^0-9]+([0-9]+)(?:&(.+))?)?</a>~', $drc_embed_facebook_post, $this->post['message']);

The regex

~<a (.*?)facebook\.com/([^/]+)/[^/]+/([0-9]+)(?:[?][^0-9]+([0-9]+)(?:&(.+))?)?</a>~

I have left the start as a lazy match, (.*?) I believe, so as not to restrict www., https://, etc. (anything that comes before facebook.com).
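
As a small illustration of what the lazy prefix does, the sketch below compares (.*?) with the greedy (.*) on a link whose text repeats the URL. The sample link and the variable names ($link, $lazy, $greedy) are assumptions made up for this example, not taken from the plugin.

```php
<?php
// Hypothetical link whose visible text repeats the URL.
$link = '<a href="https://www.facebook.com/zuck/posts/1">https://www.facebook.com/zuck/posts/1</a>';

// Lazy prefix: stops at the FIRST "facebook.com/" after "<a ".
preg_match('~<a (.*?)facebook\.com/~', $link, $lazy);

// Greedy prefix: runs to the LAST "facebook.com/" on the line.
preg_match('~<a (.*)facebook\.com/~', $link, $greedy);

echo 'lazy:   ' . $lazy[1] . PHP_EOL;
echo 'greedy: ' . $greedy[1] . PHP_EOL;
```

The lazy version captures only the text up to the first occurrence of facebook.com/, whereas the greedy version swallows the whole href and part of the link text.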

This partially works. Here are some example links, grabbed directly from posts.

https://www.facebook.com/RyanNewMe/posts/616837631826216?pnref=story
https://www.facebook.com/zuck/posts/10102833246942211?pnref=story
https://www.facebook.com/zuck/posts/10102830259184701?pnref=story

These links are not embedded as posts. However, if I remove ?pnref=story from all of them, only the following link still fails to work.

https://www.facebook.com/RyanNewMe/posts/616837631826216
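
One way to see why the ?pnref=story links fail is to test the URL portion of the updated regex on its own. The sketch below is an assumption-laden reduction: it drops the leading <a (.*?) and replaces the trailing </a> with an end-of-string anchor ($), so it exercises only the path and query part of the pattern against bare URLs. The names $tailPattern, $urls and $results are hypothetical.

```php
<?php
// URL portion of the updated regex, anchored with $ instead of </a> (assumption for this sketch).
$tailPattern = '~facebook\.com/([^/]+)/[^/]+/([0-9]+)(?:[?][^0-9]+([0-9]+)(?:&(.+))?)?$~';

$urls = [
    'https://www.facebook.com/zuck/posts/10102833246942211',
    'https://www.facebook.com/zuck/posts/10102833246942211?pnref=story',
    'https://www.facebook.com/zuck/posts/10102577175875681?comment_id=1193531464007751',
    'https://www.facebook.com/zuck/posts/10102577175875681?comment_id=1193531464007751&reply_comment_id=10102577641662241',
];

// preg_match returns 1 on a match and 0 otherwise.
$results = [];
foreach ($urls as $url) {
    $results[$url] = preg_match($tailPattern, $url);
}

foreach ($results as $url => $hit) {
    echo ($hit ? 'match   ' : 'no match') . '  ' . $url . PHP_EOL;
}
```

The optional query group requires digits after the non-digit run ([^0-9]+([0-9]+)), and pnref=story contains none, so that URL cannot match. This sketch covers only the ?pnref=story case; it does not explain why the RyanNewMe link without a query string fails in the plugin.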

[Discussion]:

Tags: php regex facebook comments vbulletin