【Question title】:vb.net get all attributes value using htmlagilitypack
【Posted】:2016-02-25 23:29:00
【Question description】:

This is the HTML:

<div id="catlist-listview" class="cat-listview cat-listbsize">  
 <ul>
  <li><a href="http://wantedlink1" rel="bookmark" title="sometitel1" class="sonra">title1</a></li>    
  <li><a href="http://wantedlink2" rel="bookmark" title="sometitel2" class="sonra">title2</a></li>
  <li><a href="http://wantedlink3" rel="bookmark" title="sometitel3" class="sonra">title3</a></li>
  <li><a href="http://wantedlink4" rel="bookmark" title="sometitel4" class="sonra">title4</a></li>
  <li><a href="http://wantedlink5" rel="bookmark" title="sometitel5" class="sonra">title5</a></li>
  <li><a href="http://wantedlink6" rel="bookmark" title="sometitel6" class="sonra">title6</a></li>
  <li><a href="http://wantedlink7" rel="bookmark" title="sometitel7" class="sonra">title7</a></li>
  <li><a href="http://wantedlink8" rel="bookmark" title="sometitel8" class="sonra">title8</a></li>
  <li><a href="http://wantedlink9" rel="bookmark" title="sometitel9" class="sonra">title9</a></li>
  <li><a href="http://wantedlink10 " rel="bookmark" title="sometitel10" class="sonra">title10</a></li>
 </ul>
</div>

My code is:

Dim htmldoc As New HtmlDocument
htmldoc.LoadHtml(source)
For Each link As HtmlNode In htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul")
    textbox3.Text = link.InnerHtml
Next

The output is:

      <li><a href="http://wantedlink1" rel="bookmark" title="sometitel1" class="sonra">title1</a></li>    
      <li><a href="http://wantedlink2" rel="bookmark" title="sometitel2" class="sonra">title2</a></li>
      <li><a href="http://wantedlink3" rel="bookmark" title="sometitel3" class="sonra">title3</a></li>
      <li><a href="http://wantedlink4" rel="bookmark" title="sometitel4" class="sonra">title4</a></li>
      <li><a href="http://wantedlink5" rel="bookmark" title="sometitel5" class="sonra">title5</a></li>
      <li><a href="http://wantedlink6" rel="bookmark" title="sometitel6" class="sonra">title6</a></li>
      <li><a href="http://wantedlink7" rel="bookmark" title="sometitel7" class="sonra">title7</a></li>
      <li><a href="http://wantedlink8" rel="bookmark" title="sometitel8" class="sonra">title8</a></li>
      <li><a href="http://wantedlink9" rel="bookmark" title="sometitel9" class="sonra">title9</a></li>
      <li><a href="http://wantedlink10 " rel="bookmark" title="sometitel10" class="sonra">title10</a></li>

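I want to get all of the links, and only the links, from http://wantedlink1 through http://wantedlink10. I tried Attributes("href"), but I only get one link. I want to list all the links like this: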

http://wantedlink1 
http://wantedlink2 
http://wantedlink3 
.
. 
. 
http://wantedlink10

Any help?

【Comments】:

    Tags: vb.net attributes href html-agility-pack


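    【Solution 1】:

    Basically, you can change the XPath passed to SelectNodes() so that it selects the individual <a> elements instead of the <ul>. From there it is easy to iterate over the result and read the href attribute of each element one by one. Alternatively, you can do the same thing with LINQ, for example: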

    ' select the <a> elements
    Dim links = htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul/li/a")
    ' project to an IEnumerable of href attribute values
    Dim hrefs = links.Cast(Of HtmlNode)().Select(Function(x) x.GetAttributeValue("href", ""))
    ' join the hrefs, separated by newlines, into one string
    textbox3.Text = String.Join(Environment.NewLine, hrefs)
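
The answer also mentions iterating over the result and reading each href one by one. A minimal sketch of that loop variant follows, assuming HtmlAgilityPack is referenced and using a shortened stand-in for the page source. The module name, the sample string, the Nothing check, and the Trim() call are additions for illustration and are not part of the original answer. The check is there because SelectNodes returns Nothing when no node matches, and Trim() is there because one href in the question's markup ends with a space.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module HrefLoopSketch
    Sub Main()
        ' Shortened stand-in for the page source shown in the question (assumption).
        Dim source As String =
            "<div id=""catlist-listview""><ul>" &
            "<li><a href=""http://wantedlink1"">title1</a></li>" &
            "<li><a href=""http://wantedlink2 "">title2</a></li>" &
            "</ul></div>"

        Dim htmldoc As New HtmlDocument()
        htmldoc.LoadHtml(source)

        Dim hrefs As New List(Of String)()
        Dim links = htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul/li/a")

        ' SelectNodes returns Nothing when no node matches the XPath.
        If links IsNot Nothing Then
            For Each link As HtmlNode In links
                ' Read the href value; fall back to "" if the attribute is missing.
                hrefs.Add(link.GetAttributeValue("href", "").Trim())
            Next
        End If

        ' One link per line, as requested in the question.
        Console.WriteLine(String.Join(Environment.NewLine, hrefs))
    End Sub
End Module
```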
    

    dotnetfiddle demo

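    【Comments】:

    • @SnoopyOhoo Right, I got mixed up with C# syntax there. You're welcome, and thanks for the correction.
    • Can you tell me how to list the links in a ListBox instead of textbox3? I can list the links with listbox1.items.addrange(textbox3.lines), but I want them to go directly into listbox1 so I can remove textbox3.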
    • listbox1.items.addrange(hrefs)
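
A possible way to fill the ListBox directly is sketched below. It assumes a Windows Forms project with HtmlAgilityPack referenced; the form class name and the constructor parameter are hypothetical. ListBox.Items.AddRange takes an object array, so the sketch converts the LINQ result with ToArray() first, which the one-line comment above may also need depending on compiler settings. HtmlDocument is fully qualified because System.Windows.Forms defines a type with the same name.

```vbnet
Imports System.Linq
Imports System.Windows.Forms
Imports HtmlAgilityPack

' Hypothetical form used only to illustrate filling a ListBox with the hrefs.
Public Class LinkListForm
    Inherits Form

    Private ReadOnly ListBox1 As New ListBox() With {.Dock = DockStyle.Fill}

    Public Sub New(source As String)
        Controls.Add(ListBox1)

        ' Fully qualified to avoid the clash with System.Windows.Forms.HtmlDocument.
        Dim htmldoc As New HtmlAgilityPack.HtmlDocument()
        htmldoc.LoadHtml(source)

        Dim links = htmldoc.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul/li/a")
        ' SelectNodes returns Nothing when no node matches the XPath.
        If links Is Nothing Then Return

        Dim hrefs = links.Cast(Of HtmlNode)().Select(Function(x) x.GetAttributeValue("href", ""))

        ' AddRange expects an Object array, so materialize the sequence first.
        ListBox1.Items.AddRange(hrefs.Cast(Of Object)().ToArray())
    End Sub
End Class
```

With this in place the intermediate textbox3 is no longer needed, since the links go straight from the parsed document into the list.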