【Posted】: 2011-05-18 00:35:51
【Question】:
I'm trying to linkify URLs that appear with or without "www"/"http".
Here is what I have so far:
noProtocolUrl = /\b((?:www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g,
httpOrMailtoUrl = /\b((?:[a-z][\w-]+:)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi,
linkifier = function (html) {
return FormatLink(html
.replace(noProtocolUrl, '<a href="<``>://$1" rel="nofollow external" class="external_link">$1</a>') // NOTE: we escape `"http` as `"<``>` to make sure `httpOrMailtoUrl` below doesn't find it as a false-positive
.replace(httpOrMailtoUrl, '<a href="$1" rel="nofollow external" class="external_link">$1</a>')
.replace(/"<``>/g, '"http')); // reinsert `"http`
},
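It works well, except that plain links starting with http:// get linkified twice.
http://google.com becomes two links: http:// and http://google.com
Any idea how to fix this?
Thanks!
EDIT
Well, the version below works, except for links without http and www (e.g. bit.ly/foo).
If anyone also knows how to catch those, you're welcome to share.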
var noProtocolUrl = /(^|["'(\s]|<)(www\..+?\..+?)((?:[:?]|\.+)?(?:\s|$)|>|[)"',])/g,
httpOrMailtoUrl = /\b((?:[a-z][\w-]+:)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi,
linkifier = function ( html ) {
return FormatLink(html
.replace( noProtocolUrl, '$1<a href="<``>://$2" rel="nofollow external" class="external_link">$2</a>$3' ) // NOTE: we escape `"http` as `"<``>` to make sure `httpOrMailtoUrl` below doesn't find it as a false-positive
.replace( httpOrMailtoUrl, '<a href="$1" rel="nofollow external" class="external_link">$1</a>' )
.replace( /"<``>/g, '"http' )); // reinsert `"http`
},
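One possible way to sidestep the double linkification is to match every kind of URL in a single pass, so no second regex ever runs over markup produced by the first. The following is only a minimal sketch under some assumptions: `anyUrl`, `hasScheme` and `linkifyOnce` are hypothetical names, the pattern is a simplified stand-in for the expressions above (it does not balance parentheses and only recognizes `scheme://` URLs, not `mailto:`), and the input is assumed to be plain text without existing `<a>` tags.

```javascript
// Hypothetical single-pass linkifier (names and simplified pattern are assumptions).
// Alternatives, tried left to right at each word boundary:
//   1. scheme://...          e.g. http://google.com
//   2. www.example.com...    no scheme
//   3. domain.tld/path       no scheme, e.g. bit.ly/foo
var anyUrl = /\b(?:[a-z][\w-]+:\/\/|www\d{0,3}\.|[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,4}\/)[^\s()<>]*[^\s`!()\[\]{};:'".,<>?«»“”‘’]/gi;

// Used to decide whether "http://" has to be prepended to the href.
var hasScheme = /^[a-z][\w-]+:/i;

function linkifyOnce(text) {
  // A replacement callback sees each match exactly once, so a URL that
  // already has a scheme cannot be wrapped a second time.
  return text.replace(anyUrl, function (url) {
    var href = hasScheme.test(url) ? url : 'http://' + url;
    return '<a href="' + href + '" rel="nofollow external" class="external_link">' + url + '</a>';
  });
}
```

Calling `linkifyOnce('http://www.google.com')` yields one anchor for the whole URL, and `linkifyOnce('bit.ly/foo')` yields an anchor whose `href` is `http://bit.ly/foo`. The trade-off is that a bare domain without a path (for example `google.com`) is still not matched, which keeps ordinary words containing dots from being linkified by accident.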
【Comments】:
-
Nathan, the problem is that I'm using the updated regex (daringfireball.net/2010/07/improved_regex_for_matching_urls) to match URLs containing parentheses, among other things.