【Posted】: 2016-01-12 09:35:24
【Question】:
I am trying to scrape this web page: http://www.skysports.com/football/competitions/la-liga/table. I only want to get the team names from the table, and I am using Jsoup for this. Here is my code:
private class LoadData extends AsyncTask<Void,Void,Void> {
    String url = "http://www.skysports.com/football/competitions/la-liga/table";
    String data = "";

    @Override
    protected Void doInBackground(Void... params) {
        Document document;
        try {
            document = Jsoup.connect(url).timeout(0).get();
            Elements clubName = document.select("td.standing-table__cell standing-table__cell--name");
            int a = clubName.size();
            for (int i = 0; i < a; i++) {
                data += "\n\n" + clubName.get(i).text();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    @Override
    protected void onPostExecute(Void result) {
        teamview = (TextView) findViewById(R.id.club_view);
        teamview.setMovementMethod(new ScrollingMovementMethod());
        teamview.setText(data);
        super.onPostExecute(result);
    }
}
Here is the HTML for one row of the table:
<tr class="standing-table__row" data-item-id="872">
    <td class="standing-table__cell">1</td>
    <td class="standing-table__cell standing-table__cell--name" data-short-name="Atletico Madrid" data-long-name="Atletico Madrid">
        <a href="/football/teams/atletico-madrid" class="standing-table__cell--name-link">Atletico Madrid</a>
    </td>
    <td class="standing-table__cell">19</td>
    <td class="standing-table__cell is-hidden--bp35">14</td>
    <td class="standing-table__cell is-hidden--bp35">2</td>
    <td class="standing-table__cell is-hidden--bp35">3</td>
    <td class="standing-table__cell is-hidden--bp35">27</td>
    <td class="standing-table__cell is-hidden--bp35">8</td>
    <td class="standing-table__cell">19</td>
    <td class="standing-table__cell" data-sort-value="1">44</td>
    <td class="standing-table__cell is-hidden--bp15 is-hidden--bp35 " data-sort-value="15333033">
        <div class="standing-table__form">
            <span title="Granada 0-2 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--win"> </span>
            <span title="Atletico Madrid 2-1 Athletic Bilbao" class="standing-table__form-cell standing-table__form-cell--win"> </span>
            <span title="Malaga 1-0 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--loss"> </span>
            <span title="Rayo Vallecano 0-2 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--win"> </span>
            <span title="Atletico Madrid 1-0 Levante" class="standing-table__form-cell standing-table__form-cell--win"> </span>
            <span title="Celta Vigo 0-2 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--win"> </span>
        </div>
    </td>
</tr>
When I use document.select("td.standing-table__cell"); data is displayed. But when I replace it with document.select("td.standing-table__cell standing-table__cell--name"); no data is displayed at all. Why?
【Comments】:
- Please learn the basics of CSS selectors: how to select multiple classes, by id (use a hash), by element (the name on its own), and by class (use ...)
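To illustrate what the comment is pointing at: in a CSS selector a space is the descendant combinator, so "td.standing-table__cell standing-table__cell--name" asks for an element whose tag name is standing-table__cell--name nested inside the td, and nothing on the page matches that. Chaining the classes with dots and no space selects a single td that carries both classes. The following is only a minimal sketch under some assumptions: the class name TeamNames and the method extract are hypothetical, jsoup is assumed to be on the classpath, and the HTML is a shortened inline copy of the row from the question rather than the live page.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper for illustration; assumes the jsoup library is on the classpath.
public class TeamNames {

    // Collects the text of every <td> that carries BOTH classes.
    public static List<String> extract(String html) {
        Document document = Jsoup.parse(html);
        List<String> names = new ArrayList<>();
        // No space between the class names: ".a.b" means "one element with classes a and b".
        for (Element cell : document.select("td.standing-table__cell.standing-table__cell--name")) {
            names.add(cell.text());
        }
        return names;
    }

    public static void main(String[] args) {
        // Shortened copy of the row from the question, wrapped in <table> so the
        // HTML parser keeps the <tr> and <td> elements.
        String html = "<table><tr class=\"standing-table__row\">"
                + "<td class=\"standing-table__cell\">1</td>"
                + "<td class=\"standing-table__cell standing-table__cell--name\">"
                + "<a href=\"/football/teams/atletico-madrid\">Atletico Madrid</a></td>"
                + "<td class=\"standing-table__cell\">19</td>"
                + "</tr></table>";
        System.out.println(extract(html));
    }
}
```

Since only the name cell has the standing-table__cell--name class in the markup shown, the shorter selector "td.standing-table__cell--name" should select the same cells. In the AsyncTask from the question, the same selector string would go into the document.select(...) call inside doInBackground.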
Tags: java android web-scraping jsoup