【Question title】: JSoup - Extracting table data error
【Posted】: 2016-11-05 12:42:02
【Question description】:

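I have just started a small project in which I need to collect historical data for global currency pairs. Following the answer to the question "Extract Data out of table with JSoup", I wrote the code pasted below.

However, I keep getting an IndexOutOfBoundsException, even though the "data" Elements list has a size of 7?

I have been scratching my head for nearly an hour, so I would be grateful if someone could point out where I am going wrong!

Main class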

import java.util.ArrayList;
import java.util.List;
import java.io.IOException;

import org.jsoup.*;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;


public class MainClass {


public static void main(String[] args) throws IOException{

    Document doc = Jsoup.connect("http://www.myfxbook.com/forex-market/currencies/GBPUSD-historical-data").get();

    Element table = doc.getElementById("symbolMarket");

    List<Entry> entries = new ArrayList<Entry>();

    for(Element row : table.select("tr")){

        int index = 0;
        Entry tableEntry = new Entry();
        Elements data = row.select("td");

        tableEntry.setDate(data.get(index++).text());
        tableEntry.setOpen(data.get(index++).text());
        tableEntry.setHigh(data.get(index++).text());
        tableEntry.setLow(data.get(index++).text());
        tableEntry.setClose(data.get(index++).text());
        tableEntry.setChangePips(data.get(index++).text());
        tableEntry.setChangePercent(data.get(index++).text());

        entries.add(tableEntry);

    }

}

}

Entry class

public class Entry {

private String date;
private String open;
private String high;
private String low;
private String close;
private String changePips;
private String changePercent;

public String getDate() {
    return date;
}
public void setDate(String date) {
    this.date = date;
}
public String getOpen() {
    return open;
}
public void setOpen(String open) {
    this.open = open;
}
public String getHigh() {
    return high;
}
public void setHigh(String high) {
    this.high = high;
}
public String getLow() {
    return low;
}
public void setLow(String low) {
    this.low = low;
}
public String getClose() {
    return close;
}
public void setClose(String close) {
    this.close = close;
}
public String getChangePips() {
    return changePips;
}
public void setChangePips(String changePips) {
    this.changePips = changePips;
}
public String getChangePercent() {
    return changePercent;
}
public void setChangePercent(String changePercent) {
    this.changePercent = changePercent;
}



}

【Comments】:

    Tags: java html arrays parsing jsoup


    【Solution 1】:

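    However, I keep getting an IndexOutOfBoundsException, even though the "data" Elements list has a size of 7?

    If that were true for every row, you would not see this exception.

    The problem is that the first row contains no td elements, only th (table header) cells. For that row, row.select("td") matches 0 elements, so data.get(0) fails. That is exactly what the exception message tells you:

    java.lang.IndexOutOfBoundsException: Index: 0, Size: 0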
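
To see this in isolation, here is a minimal sketch that parses a small inline table instead of fetching the real page. The HTML string, its two-column layout, the placeholder cell values, and the class name HeaderRowDemo are all made up for illustration; only the symbolMarket id comes from the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class HeaderRowDemo {

    // Hypothetical stand-in for the real page: one header row (th cells)
    // followed by two data rows (td cells). The values are placeholders.
    static final String HTML =
          "<table id=\"symbolMarket\">"
        + "<tr><th>Date</th><th>Open</th></tr>"
        + "<tr><td>date-1</td><td>open-1</td></tr>"
        + "<tr><td>date-2</td><td>open-2</td></tr>"
        + "</table>";

    // Returns the number of td cells found in each tr, in document order.
    static List<Integer> tdCountsPerRow(String html) {
        Document doc = Jsoup.parse(html);
        Element table = doc.getElementById("symbolMarket");
        List<Integer> counts = new ArrayList<>();
        for (Element row : table.select("tr")) {
            counts.add(row.select("td").size());
        }
        return counts;
    }

    public static void main(String[] args) {
        // The header row contributes 0, so calling get(0) on it would throw.
        System.out.println(tdCountsPerRow(HTML)); // prints [0, 2, 2]
    }
}
```

Printing the per-row count like this is also a quick way to check the real page, in case it contains other rows without td cells.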

    To fix this, you can simply skip the first row, or explicitly select only those tr elements that contain at least one td:

    for(Element row : table.select("tr:has(td)")){
        //                            ^^^^^^^^
        ...
    }
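
As a rough, self-contained illustration of the tr:has(td) selector, the sketch below runs it against a made-up inline table. The HTML, the placeholder values, and the names HasSelectorDemo and firstCellsOfDataRows are assumptions for the example, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class HasSelectorDemo {

    // Hypothetical table: a th-only header row and two td data rows.
    static final String HTML =
          "<table id=\"symbolMarket\">"
        + "<tr><th>Date</th><th>Open</th></tr>"
        + "<tr><td>date-1</td><td>open-1</td></tr>"
        + "<tr><td>date-2</td><td>open-2</td></tr>"
        + "</table>";

    // Collects the text of the first td of every row that contains a td.
    static List<String> firstCellsOfDataRows(String html) {
        Document doc = Jsoup.parse(html);
        Element table = doc.getElementById("symbolMarket");
        List<String> firstCells = new ArrayList<>();
        // tr:has(td) skips the header row, so get(0) is safe here.
        for (Element row : table.select("tr:has(td)")) {
            firstCells.add(row.select("td").get(0).text());
        }
        return firstCells;
    }

    public static void main(String[] args) {
        System.out.println(firstCellsOfDataRows(HTML)); // prints [date-1, date-2]
    }
}
```

Note that :has(td) only guarantees at least one td in the row, not seven, so a row with fewer cells than expected could still fail on a later get call.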
    

    Alternatively, you can check the size of data (the list of td elements) yourself before doing anything with it:

    for(Element row : table.select("tr")){
        Elements data = row.select("td");
    
        if(data.size()==7){
    
            int index = 0;
            Entry tableEntry = new Entry();
    
            tableEntry.setDate(data.get(index++).text());
            tableEntry.setOpen(data.get(index++).text());
            tableEntry.setHigh(data.get(index++).text());
            tableEntry.setLow(data.get(index++).text());
            tableEntry.setClose(data.get(index++).text());
            tableEntry.setChangePips(data.get(index++).text());
            tableEntry.setChangePercent(data.get(index++).text());
    
            entries.add(tableEntry);
        }
    }
    

    【Comments】:

      【Solution 2】:

      You are trying to read data from the table header row. You have to skip it.

      public static void main(String[] args) throws IOException {
              Document doc = Jsoup.connect("http://www.myfxbook.com/forex-market/currencies/GBPUSD-historical-data").get();
      
              Element table = doc.getElementById("symbolMarket");
      
              List<Entry> entries = new ArrayList<Entry>();
      
              Elements elements = table.select("tr");
              Iterator<Element> itr = elements.iterator(); // needs: import java.util.Iterator;
              if (itr.hasNext()) {
                  itr.next(); // skip the header row (it holds th cells, not td)
              }
      
              while ( itr.hasNext() ) {
                  int index = 0;
                  Entry tableEntry = new Entry();
                  Elements data = itr.next().select("td");
      
                  tableEntry.setDate(data.get(index++).text());
                  tableEntry.setOpen(data.get(index++).text());
                  tableEntry.setHigh(data.get(index++).text());
                  tableEntry.setLow(data.get(index++).text());
                  tableEntry.setClose(data.get(index++).text());
                  tableEntry.setChangePips(data.get(index++).text());
                  tableEntry.setChangePercent(data.get(index++).text());
                  entries.add(tableEntry);
      
              }
          }
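
A possible variation on the same idea: since jsoup's Elements is a java.util.List, the header can also be skipped with subList instead of an explicit iterator. The sketch below is only an illustration under that assumption; the inline HTML, the placeholder values, and the names SkipHeaderDemo and firstCellsAfterHeader are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class SkipHeaderDemo {

    // Hypothetical table: a th-only header row and two td data rows.
    static final String HTML =
          "<table id=\"symbolMarket\">"
        + "<tr><th>Date</th><th>Open</th></tr>"
        + "<tr><td>date-1</td><td>open-1</td></tr>"
        + "<tr><td>date-2</td><td>open-2</td></tr>"
        + "</table>";

    // Skips the first tr and returns the first td text of each remaining row.
    static List<String> firstCellsAfterHeader(String html) {
        Elements rows = Jsoup.parse(html).select("#symbolMarket tr");
        // Math.min keeps subList valid even when the table has no rows at all.
        List<Element> dataRows = rows.subList(Math.min(1, rows.size()), rows.size());
        List<String> firstCells = new ArrayList<>();
        for (Element row : dataRows) {
            Elements cells = row.select("td");
            if (!cells.isEmpty()) { // guard against further rows without td cells
                firstCells.add(cells.get(0).text());
            }
        }
        return firstCells;
    }

    public static void main(String[] args) {
        System.out.println(firstCellsAfterHeader(HTML)); // prints [date-1, date-2]
    }
}
```

Skipping by position assumes the header is always the first row and the only one; the selector-based or size-based checks from Solution 1 do not rely on that.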
      

      【Comments】:
