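【Question title】: Reading data from an API
【Posted】: 2013-08-16 01:51:26
【Question description】:

I wrote a function that reads data from an external API. It calls the API once for each line it reads from a file on disk. I want to optimize the code for large files (35,000 records). Can you give me some advice?

Here is my code.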

public void readCSVFile() {

    try {

        br = new BufferedReader(new FileReader(getFileName()));

        while ((line = br.readLine()) != null) {


            String[] splitLine = line.split(cvsSplitBy);

            String campaign = splitLine[0];
            String adGroup =  splitLine[1];
            String url = splitLine[2];              
            long searchCount = getSearchCount(url);             

            StringBuilder sb = new StringBuilder();
            sb.append(campaign + ",");
            sb.append(adGroup + ",");               
            sb.append(searchCount + ",");               
            writeToFile(sb, getNewFileName());

        }

    } catch (Exception e) {
        e.printStackTrace();
    }
}

private long getSearchCount(String url) {
    long recordCount = 0;
    try {

        DefaultHttpClient httpClient = new DefaultHttpClient();

        HttpGet getRequest = new HttpGet(
                "api.com/querysearch?q="
                        + url);
        getRequest.addHeader("accept", "application/json");

        HttpResponse response = httpClient.execute(getRequest);

        if (response.getStatusLine().getStatusCode() != 200) {
            throw new RuntimeException("Failed : HTTP error code : "
                    + response.getStatusLine().getStatusCode());
        }

        BufferedReader br = new BufferedReader(new InputStreamReader(
                (response.getEntity().getContent())));

        String output;

        while ((output = br.readLine()) != null) {
            try {

                JSONObject json = (JSONObject) new JSONParser()
                        .parse(output);
                JSONObject result = (JSONObject) json.get("result");
                recordCount = (long) result.get("count");
                System.out.println(url + "=" + recordCount);

            } catch (Exception e) {
                System.out.println(e.getMessage());
            }

        }

        httpClient.getConnectionManager().shutdown();

    } catch (Exception e) {
        e.printStackTrace();
    }
    return recordCount;

}

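【Comments】:

  • Your bottleneck is almost certainly the HTTP part, so that is what I would optimize: if possible, avoid closing the connection after each request, or fetch the results in batches.
  • Yes, that is the problem. The catch is that I have to call this API with GET parameters taken from the file.

Tags: java json optimization csv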
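
One way to read the first comment's advice about not closing the connection is to create a single HTTP client for the whole run instead of one per record. The sketch below is an illustration only, not the asker's code: it swaps in the JDK's `java.net.http.HttpClient` (Java 11+) for the Apache `DefaultHttpClient` used in the question, the `https://api.example.com` endpoint is a placeholder, the class and method names are hypothetical, and URL-encoding the `q` parameter is an added assumption. Whether connections are actually kept alive also depends on the server.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

public class SearchApiClient {
    // Placeholder endpoint (assumption): the question only shows "api.com/querysearch?q=".
    private static final String ENDPOINT = "https://api.example.com/querysearch?q=";

    // One client shared by every request, so its connections can be reused
    // instead of being set up and torn down for each CSV record.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    // Builds the GET request for one URL taken from the CSV file.
    public static HttpRequest buildRequest(String url) {
        String q = URLEncoder.encode(url, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(ENDPOINT + q))
                .header("Accept", "application/json")
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    // Sends the request through the shared client and returns the raw JSON body.
    public static String fetchBody(String url) throws IOException, InterruptedException {
        HttpResponse<String> response =
                CLIENT.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Failed: HTTP error code " + response.statusCode());
        }
        return response.body();
    }
}
```

`HttpClient` is safe to share between threads, so the same instance could also serve concurrent lookups.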


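【Solution 1】:

Because remote calls are slower than local disk access, you need to parallelize or batch the remote calls somehow. If you cannot batch calls to the remote API but it does allow multiple concurrent reads, you may want to use something like a thread pool to make the remote calls: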

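public void readCSVFile() throws Exception {
    // Needs java.io.*, java.util.* and java.util.concurrent.*;
    // exception handling is kept minimal for space.
    br = new BufferedReader(new FileReader(getFileName()));
    List<Future<String>> futures = new ArrayList<Future<String>>();
    ExecutorService pool = Executors.newFixedThreadPool(5);

    try {
        while ((line = br.readLine()) != null) {
            final String[] splitLine = line.split(cvsSplitBy);
            // Each remote lookup runs on one of the pool's threads.
            futures.add(pool.submit(new Callable<String>() {
                public String call() {
                    long searchCount = getSearchCount(splitLine[2]);
                    return new StringBuilder()
                        .append(splitLine[0]).append(",")
                        .append(splitLine[1]).append(",")
                        .append(searchCount).append(",")
                        .toString();
                }
            }));
        }

        // Read the futures in submission order so the output keeps the input order.
        for (Future<String> fs : futures) {
            // The question passes a StringBuilder to writeToFile, so wrap the result.
            writeToFile(new StringBuilder(fs.get()), getNewFileName());
        }
    } finally {
        pool.shutdown();
        br.close();
    }
}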
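
The thread-pool code in this answer leaves error handling out for space. As a rough, self-contained sketch of how that gap might be filled, the version below takes the lookup as a parameter, standing in for `getSearchCount`, and logs and skips any row whose lookup or parsing fails instead of aborting the run. The class name `ParallelLookup`, the `process` method, and the skip-on-failure policy are assumptions made for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelLookup {

    /**
     * Builds one output line per CSV line, running the lookups on a fixed pool.
     * Output order follows input order because futures are read in submission order.
     * Rows whose task fails (bad row or failed lookup) are logged and skipped.
     */
    public static List<String> process(List<String> csvLines,
                                       Function<String, Long> lookup,
                                       int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String line : csvLines) {
                final String[] cols = line.split(",");
                Callable<String> task = () -> {
                    // cols[2] is the URL column; a short row fails here, inside the task.
                    long count = lookup.apply(cols[2]);
                    return cols[0] + "," + cols[1] + "," + count + ",";
                };
                futures.add(pool.submit(task));
            }

            List<String> out = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    out.add(futures.get(i).get());
                } catch (ExecutionException e) {
                    // One failed row should not abort the whole run.
                    System.err.println("line " + (i + 1) + " skipped: " + e.getCause());
                }
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass the real remote lookup as `lookup` and write each returned line to the output file. The pool size is worth tuning against whatever concurrency the remote API tolerates.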

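Ideally, though, you would do a single bulk read from the remote API, if that is possible.

【Comments】:

  • Thanks for the suggestion. By the way, I cannot do a single bulk read, but multiple concurrent reads are allowed.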