【Posted】: 2011-08-01 17:44:40
【Question】:
How can I discover a website's feed URL?
When I fetch the HTML of Microsoft's blog, I can see:
<link rel="alternate" type="application/rss+xml" title="Site Home (RSS 2.0)" href="http://blogs.technet.com/rss.aspx" />
<link rel="alternate" type="application/rss+xml" title="B1ackD0g's Comments (RSS 2.0)" href="/members/B1ackD0g/comments/rss.aspx" />
<link rel="alternate" type="application/rss+xml" title="B1ackD0g's Activities (RSS 2.0)" href="/members/B1ackD0g/activities/rss.aspx" />
<link rel="alternate" type="application/rss+xml" title="Activities of People B1ackD0g Follows (RSS 2.0)" href="/members/B1ackD0g/activities/followersrss.aspx" />
<link rel="alternate" type="application/rss+xml" title="B1ackD0g's Groups Activities (RSS 2.0)" href="/members/B1ackD0g/activities/groupsrss.aspx" />
<link rel="alternate" type="application/rss+xml" title="The Official Microsoft Blog – News and Perspectives from Microsoft (RSS 2.0)" href="http://blogs.technet.com/b/microsoft_blog/rss.aspx" />
<link rel="alternate" type="application/atom+xml" title="The Official Microsoft Blog – News and Perspectives from Microsoft (Atom 1.0)" href="http://blogs.technet.com/b/microsoft_blog/atom.aspx" />
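From this, I could assume that I should look for the tags whose href starts with "http://blogs.technet.com/b/microsoft_blog/".
Is that a safe assumption?
What I basically need to do is take a URL and return its feed URL.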
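For illustration, here is a minimal Python sketch of the kind of lookup I have in mind. It does not match on the href prefix, which looks fragile to me; instead it follows the usual feed autodiscovery convention of collecting every <link rel="alternate"> whose type is an RSS or Atom MIME type, and it resolves relative hrefs against the page URL. The names FeedLinkParser and find_feeds are my own (hypothetical), and the HTML is assumed to have been downloaded already.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used for feed autodiscovery (RSS and Atom).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects <link rel="alternate"> tags whose type is a feed MIME type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []  # (title, absolute_url, mime_type) tuples

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        mime = a.get("type", "").strip().lower()
        href = a.get("href", "").strip()
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Relative hrefs (e.g. "/members/.../rss.aspx") become absolute.
            self.feeds.append((a.get("title", ""), urljoin(self.base_url, href), mime))


def find_feeds(html, page_url):
    """Return every advertised feed on the page, in document order."""
    parser = FeedLinkParser(page_url)
    parser.feed(html)
    return parser.feeds
```

Calling find_feeds(html, "http://blogs.technet.com/b/microsoft_blog/") on the markup above would return all seven links, so something still has to decide which one is "the" feed. Nothing in the markup itself marks one as primary, so preferring an href under the page's own path could serve as a tie-breaker, but probably should not be the only test.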
【Comments】:
- What language are you using? JavaScript, PHP, ASP?