【Posted】:2016-08-29 12:59:43
【Question】:
I am trying to use wget to fetch the text of multiple PubMed papers, but the NCBI website does not seem to allow it. Is there an alternative?
Bernardos-MacBook-Pro:pangenome_papers_pubmed_result bernardo$ wget -i ./url.txt
--2016-05-04 10:49:34-- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4560400/
Resolving www.ncbi.nlm.nih.gov... 130.14.29.110, 2607:f220:41e:4290::110
Connecting to www.ncbi.nlm.nih.gov|130.14.29.110|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2016-05-04 10:49:34 ERROR 403: Forbidden.
--2016-05-04 10:49:34-- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4547177/
Reusing existing connection to www.ncbi.nlm.nih.gov:80.
HTTP request sent, awaiting response... 403 Forbidden
2016-05-04 10:49:34 ERROR 403: Forbidden.
【Discussion】:
Tags: web-scraping wget text-mining