【Posted】: 2018-03-15 00:27:09
【Question】:
Thanks to StackOverflow, I was able to download a series of photos from a public website using the following code.
urls <- c("https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=viewProduct&reference=A12/0090/13",
"https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=viewProduct&reference=A12/0089/13",
"https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=viewProduct&reference=A12/0088/13",
"https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=viewProduct&reference=A12/0087/13",
"https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=viewProduct&reference=A12/0086/13"
)
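library(rvest)  # provides html_session(), html_nodes(), html_attr() and %>%

for (url in seq_along(urls)) {
  print(url)
  # Open the page and collect the src attribute of every <img> element
  webpage <- html_session(urls[url])
  link.titles <- webpage %>% html_nodes("img")
  img.url <- link.titles %>% html_attr("src")
  # seq_along() skips the loop when a page has no images (1:length() would yield 1, 0)
  for (j in seq_along(img.url)) {
    # Save each image as <page index>.<image index>.jpg
    download.file(img.url[j], paste0(url, ".", j, ".jpg"), mode = "wb")
  }
}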
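However, some of the links do not contain any photos, so an HTTP status error is returned and the download process stops.
So I would like to insert an if statement that tells R to ignore/skip pages that contain no photos or that return a "404 Not Found" error. The problem is that I don't know which function or command identifies a page with no images or a "404 Not Found" error. Any suggestions would be greatly appreciated.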
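One possible direction, sketched here rather than taken from the original code: wrap each download in tryCatch() so that an error such as "404 Not Found" is caught and the loop moves on, and check the page's HTTP status before parsing it. The helper names try_download and download_page_images are hypothetical, and the sketch assumes the rvest, httr and xml2 packages are installed.

```r
# Sketch only: try_download() and download_page_images() are hypothetical helpers.
# The page-fetching part assumes the rvest, httr and xml2 packages are installed.

# Run one download; return TRUE on success, FALSE if the downloader raises an
# error (download.file() signals an error on e.g. "404 Not Found").
try_download <- function(src, dest, downloader = utils::download.file) {
  tryCatch({
    downloader(src, dest, mode = "wb")
    TRUE
  }, error = function(e) {
    message("Skipping ", src, ": ", conditionMessage(e))
    FALSE
  })
}

# Download every <img> on one page; skip pages that fail to load or have no images.
# Returns (invisibly) the number of images saved.
download_page_images <- function(page_url, prefix) {
  resp <- tryCatch(httr::GET(page_url), error = function(e) NULL)
  if (is.null(resp) || httr::http_error(resp)) {  # TRUE for 4xx/5xx status codes
    message("Skipping page ", page_url)
    return(invisible(0L))
  }
  page <- rvest::read_html(httr::content(resp, as = "text", encoding = "UTF-8"))
  srcs <- rvest::html_attr(rvest::html_nodes(page, "img"), "src")
  srcs <- srcs[!is.na(srcs)]
  if (length(srcs) == 0) {
    message("No images on ", page_url)
    return(invisible(0L))
  }
  srcs <- xml2::url_absolute(srcs, page_url)  # resolve relative image paths
  ok <- vapply(seq_along(srcs), function(j) {
    try_download(srcs[j], paste0(prefix, ".", j, ".jpg"))
  }, logical(1))
  invisible(sum(ok))
}
```

With these helpers, the outer loop could become something like for (i in seq_along(urls)) download_page_images(urls[i], prefix = i). The downloader argument exists only so the error handling can be exercised without network access.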
【Comments】:
Tags: r