【Posted】: 2023-01-08 04:55:30
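【Question】:
I have a question about scraping a table with a merged (colspan) header row using cheerio in Node.js. The merged header is what groups the rows. I can already scrape the rows without the header. Here is a small screenshot: Screenshot Table
And here is the HTML for the table (also available as a fiddle here):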
<div class="wrap">
<table class="tbl">
<tr class="head">
<td colspan="6" style="background-color:#656968">Monday</td>
</tr>
<tr class="head">
<td class="center" width="20%">Code</td>
<td class="center" width="40%">Title</td>
<td class="center" width="20%">Price</td>
<td class="center last" width="20%">Status</td>
</tr>
<tr class="td1">
<td class="center">Code 1</td>
<td class="center">Name 1</td>
<td class="center">1.234</td>
<td class="center last">
<span class="green">Closed</span>
</td>
</tr>
</table>
<table class="tbl">
<tr class="head">
<td colspan="6" style="background-color:#656968">Tuesday</td>
</tr>
<tr class="head">
<td class="center" width="20%">Code</td>
<td class="center" width="40%">Title</td>
<td class="center" width="20%">Price</td>
<td class="center last" width="20%">Status</td>
</tr>
<tr class="td1">
<td class="center">Code 1</td>
<td class="center">Name 1</td>
<td class="center">1.234</td>
<td class="center last">
<span class="green">Closed</span>
</td>
</tr>
</table>
<table class="tbl">
<tr class="head">
<td colspan="6" style="background-color:#656968">Wednesday</td>
</tr>
<tr class="head">
<td class="center" width="20%">Code</td>
<td class="center" width="40%">Title</td>
<td class="center" width="20%">Price</td>
<td class="center last" width="20%">Status</td>
</tr>
<tr class="td1">
<td class="center">Code 1</td>
<td class="center">Name 1</td>
<td class="center">1.234</td>
<td class="center last">
<span class="green">Closed</span>
</td>
</tr>
<tr class="td2">
<td class="center">Code 1</td>
<td class="center">Name 1</td>
<td class="center">1.234</td>
<td class="center last">
<span class="green">Closed</span>
</td>
</tr>
<tr class="td1">
<td class="center">Code 1</td>
<td class="center">Name 1</td>
<td class="center">1.234</td>
<td class="center last">
<span class="green">Closed</span>
</td>
</tr>
</table>
<table class="tbl">
<tr class="head">
<td colspan="6" style="background-color:#656968">Thursday</td>
</tr>
<tr class="head">
<td class="center" width="20%">Code</td>
<td class="center" width="40%">Title</td>
<td class="center" width="20%">Price</td>
<td class="center last" width="20%">Status</td>
</tr>
<tr class="td1">
<td class="center">Code 1</td>
<td class="center">Name 1</td>
<td class="center">1.234</td>
<td class="center last">
<span class="green">Closed</span>
</td>
</tr>
</table>
</div>
Here is my cheerio code:
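// `$` is the cheerio instance loaded with the HTML above.
// Parallel arrays, one per table column.
const code = [];
const title = [];
const price = [];
const status = [];

const sel = "tr.td1, tr.td2";
$(sel).each(function (i, e) {
  // First cell: code
  $(this).find("td:first").each(function (i, e) {
    code.push({
      code: $(this).text().trim()
    })
  });
  // Second cell: title
  $(this).find("td:eq(1)").each(function (i, e) {
    title.push({
      title: $(this).text().trim()
    })
  });
  // Third cell: price
  $(this).find("td:eq(2)").each(function (i, e) {
    price.push({
      price: $(this).text().trim()
    })
  });
  // Fourth cell: status
  $(this).find("td:eq(3)").each(function (i, e) {
    status.push({
      status: $(this).text().trim()
    })
  });
});

// Merge the parallel arrays into one object per row.
let merged = [];
for (var i = 0; i < code.length; i++) {
  merged.push({
    ...code[i],
    ...title[i],
    ...price[i],
    ...status[i]
  })
}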
With this I do get the array I was hoping for, which looks like this:
[
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed"
}
]
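What I need is for each JSON object to also carry the day value, which sits in the merged header cell of its table. The final result I need looks like this: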
[
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed",
"group": "Monday"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed",
"group": "Monday"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed",
"group": "Monday"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed",
"group": "Tuesday"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed",
"group": "Tuesday"
},
{
"code": "Code 1",
"title": "Name 1",
"price": "1.234",
"status": "Closed",
"group": "Tuesday"
}
]
【Comments】:
Tags: javascript node.js web-scraping cheerio