How to select all anchor tag href URLs nested inside a tr within the tbody of a table using JavaScript: answers

【Question title】: How can I select all anchor tag href URLs nested inside a tr inside the tbody of a table with JavaScript?
【Posted】: 2021-10-16 10:37:02
【Question description】:

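Hello, coders.

I want to automate some tasks with Puppeteer.

I want to log in to a page and get the URLs of the first ten articles listed on that page.

However, these URLs are nested inside a table. I need to grab the first 10, or even 20, URLs from that page.

This is the structure of the website:

The HTML body tag contains a div with #body. Inside this div there are 3 tables, and the second table is the 9th child of the #body div.

-- There are 3 HTML tables on the page, but the URLs I want to grab are inside the second table.

This is what the second table looks like: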

<table>
<tbody>

<tr> <th>Followed Topics</th> </tr>  

<tr>

<td id="top6803855" class="w"> 

<a name="6803855"></a>
<img src="/icons/normal_post.gif"> 
<b> <a href="/politics">Politics</a> </b> 
    " / "
<b><a href="/6803855/nnamdi-kanu-ifeanyi-ubah-wants">The main article which i want to grab the href</a></b>
    &nbsp;
<a onclick="unfollowtopic('anylink.com', '6803855'); return false;" href="anylink.com">

<img src="/static/delete.png"></a>
<br>

<span class="s">
by <b><a href="/username">username</a></b>.
<b>19</b> posts &amp; <b>389</b> views. <b>12:31am</b> 
(<b><a href="/username">username</a></b>)
</span>

</td>

</tr>

</tbody>

</table>

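This is what I have used so far, and it works on localhost without any problems.

document.querySelector("body > div > table:nth-child(9) > tbody > tr:nth-child(2) > td > b:nth-child(4) > a").href

I repeat the selector above, changing the number in tr:nth-child(), to grab the links in the remaining tr rows.

But it does not run reliably on Heroku: sometimes it selects the element, and sometimes it throws the error

'Cannot read properties of null (reading 'href')'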
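
One possible explanation for the intermittent null, which the question does not confirm, is that the selector runs before the table has been rendered in the Heroku environment. A minimal sketch of guarding against that with Puppeteer's `page.waitForSelector` and `page.$$eval` follows. The function name `getArticleUrls`, the constant `ROW_LINK_SELECTOR`, and the 15-second timeout are illustrative assumptions, and `page` is assumed to be a Puppeteer `Page` that is already logged in.

```javascript
// Assumption: the same path as the selector in the question, but with the
// tr:nth-child(2) index dropped so that it matches the link in every row.
const ROW_LINK_SELECTOR =
  "body > div > table:nth-child(9) > tbody > tr > td > b:nth-child(4) > a";

// Hypothetical helper: `page` is a Puppeteer Page that is already logged in.
async function getArticleUrls(page, limit = 10, selector = ROW_LINK_SELECTOR) {
  // Wait until at least one matching link exists. On timeout this rejects
  // with a timeout error instead of failing later on a null element.
  await page.waitForSelector(selector, { timeout: 15000 });

  // $$eval runs the callback inside the page with an array of all matches.
  return page.$$eval(
    selector,
    (anchors, max) => anchors.slice(0, max).map((a) => a.href),
    limit
  );
}
```

With this shape, `const urls = await getArticleUrls(page, 20);` would return up to 20 hrefs in a single call rather than one `querySelector` call per row.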

【Question comments】:

Tags: javascript html dom css-selectors puppeteer

【Solution 1】:

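Maybe try selecting the last "b" child of the span with class "s" (note that `:last` is not valid in `querySelector`, so `:last-child` is used here, and the span sits inside the td):

document.querySelector("body > div > table:nth-child(9) > tbody > tr:nth-child(2) > td > span.s > b:last-child > a").href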
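
In the markup shown in the question, the last `b` inside `span.s` wraps the username link rather than the article link. As an alternative sketch, the per-row repetition can be replaced by one `querySelectorAll` call that keeps the question's own `b:nth-child(4)` step and drops the row index. The function name `collectArticleHrefs` and its `root` and `limit` parameters are hypothetical, and the selector assumes every row has the same layout as the sample row.

```javascript
// Hypothetical helper: `root` is assumed to be the page's Document.
function collectArticleHrefs(root, limit = 10) {
  // Without tr:nth-child(n), the selector matches the 4th-child <b> link of
  // every row. The header row contains only a <th>, so it never matches.
  const anchors = root.querySelectorAll(
    "body > div > table:nth-child(9) > tbody > tr > td > b:nth-child(4) > a[href]"
  );
  // querySelectorAll returns an empty list when nothing matches, so this
  // yields [] rather than throwing on a null element.
  return Array.from(anchors).slice(0, limit).map((a) => a.href);
}
```

Inside the page (for example in the browser console or a `page.evaluate` callback), `collectArticleHrefs(document, 20)` would return up to 20 article URLs.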

【Comments】: