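【Posted】: 2021-09-14 20:11:15
【Question】:
When the "Scrape" button is clicked, a Python function runs. The function uses selenium to navigate to the site of interest and scrape its content. To initialize the driver, I need to pass the path of the Chrome driver to webdriver.Chrome(executable_path=path). Here is that part of my Python function: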
from selenium import webdriver
import urllib3
import re
import time
import pandas as pd
def scrape(chromedriver="C:/Users/Robpr/OneDrive/Documents/chromedriver.exe"):
    # Create driver object. Opens browser
    driver = webdriver.Chrome(executable_path=chromedriver)
    # Rest of the function ...
I call it from my Shiny server function like this:
server <- function(input, output, session) {
  # Scrape and store returned data frame from .py module in df()
  df = reactive({
    if (input$scrape) {
      dfii = sii$scrape(chromedriver="chromedriver.exe")
      dfii$`Filing Date` = as.Date(x=dfii$`Filing Date`, format="%B %d, %Y")
      write_sheet(dfii, ss=sheetId, sheet='filings')
      dfii
    } else if (input$load) {
      read_sheet(ss=sheetId, sheet='filings')
    }
  })
  # Rest of the server function ...
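This works fine locally. When I publish my app and click "Scrape", I get the error message: "Error: An error has occurred. Check your logs or contact the app author for clarification." So I checked my logs: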
2021-07-02T18:51:07.355310+00:00 shinyapps[4346040]: Detailed traceback:
2021-07-02T18:51:07.355307+00:00 shinyapps[4346040]: Warning: Error in py_call_impl: WebDriverException: Message: 'chromedriver.exe' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
2021-07-02T18:51:07.355310+00:00 shinyapps[4346040]:
2021-07-02T18:51:07.355320+00:00 shinyapps[4346040]: File "/home/shiny/.virtualenvs/statcanEnv/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
2021-07-02T18:51:07.355311+00:00 shinyapps[4346040]: driver = webdriver.Chrome(executable_path=chromedriver)
2021-07-02T18:51:07.355311+00:00 shinyapps[4346040]: File "/srv/connect/apps/StatCanWebScrapingApp/scrapeinsolvencyinsider.py", line 11, in scrape
2021-07-02T18:51:07.355321+00:00 shinyapps[4346040]: self.service.start()
2021-07-02T18:51:07.355321+00:00 shinyapps[4346040]: raise WebDriverException(
2021-07-02T18:51:07.355321+00:00 shinyapps[4346040]: File "/home/shiny/.virtualenvs/statcanEnv/lib/python3.8/site-packages/selenium/webdriver/common/service.py", line 81, in start
2021-07-02T18:51:07.355322+00:00 shinyapps[4346040]:
2021-07-02T18:51:07.362398+00:00 shinyapps[4346040]: 135: <Anonymous>
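The message says the executable "needs to be in PATH", which suggests that the bare name "chromedriver.exe" is being looked up on the server's PATH rather than in the app directory. A minimal sketch for logging how a given path would resolve before it is handed to webdriver.Chrome follows; describe_driver_path is a hypothetical helper introduced here for illustration, not part of selenium:

```python
import os
import shutil


def describe_driver_path(path):
    """Report how a driver path would be located, without starting a browser.

    Hypothetical diagnostic helper; it only inspects the file system.
    """
    # A bare name such as "chromedriver.exe" has no directory part, so
    # shutil.which searches for it on PATH. A name with a directory part
    # is checked at that location only.
    has_directory = os.path.dirname(path) != ""
    is_file = os.path.isfile(path)
    return {
        "given": path,
        "has_directory": has_directory,
        "cwd": os.getcwd(),
        "exists": is_file,
        "executable": is_file and os.access(path, os.X_OK),
        "resolved": shutil.which(path),
    }


if __name__ == "__main__":
    # Print the report for the value passed from the Shiny server function.
    for key, value in describe_driver_path("chromedriver.exe").items():
        print(f"{key}: {value}")
```

Running this inside the deployed app and reading the output in the logs should show whether the file is visible from the working directory and whether it is marked executable there.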
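I published the chromedriver.exe file together with my app and module.
I don't understand what is going on. How can I get this to work on the server?
Update:
I have tried the following solutions:
- Installing chromedriver-py into my virtualenv, importing binary_path from chromedriver_py, and passing binary_path as executable_path, like so: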
from selenium import webdriver
from chromedriver_py import binary_path
driver = webdriver.Chrome(executable_path=binary_path)
Result: Warning: Error in py_call_impl: WebDriverException: Message: unknown error: cannot find Chrome binary
- Publishing chromedriver.exe with the app as described above, and passing "/srv/connect/apps/StatCanWebScrapingApp/chromedriver.exe" as the executable path.
Result: Warning: Error in py_call_impl: WebDriverException: Message: 'chromedriver.exe' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home
- Downloading chromedriver into the server's working directory and using that path:
chromeDriverUrl <- "https://chromedriver.storage.googleapis.com/index.html?path=91.0.4472.101/"
destFile <- paste(getwd(), "chromedriver.exe", sep="/")
download.file(url=chromeDriverUrl, destfile=destFile)
server <- function(input, output, session) {
  # Scrape and store returned data frame from .py module in df()
  df = reactive({
    if (input$scrape) {
      dfii = sii$scrape(chromedriver=destFile)
      dfii$`Filing Date` = as.Date(x=dfii$`Filing Date`, format="%B %d, %Y")
      write_sheet(dfii, ss=sheetId, sheet='filings')
      dfii
    } else if (input$load) {
      read_sheet(ss=sheetId, sheet='filings')
    }
  })
  # ...
Result: Warning: Error in py_call_impl: WebDriverException: Message: 'chromedriver.exe' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home
- Installing webdriver-manager in the virtualenv. In the Python module, importing the ChromeDriverManager class from webdriver_manager.chrome and passing ChromeDriverManager().install() to webdriver.Chrome(), as suggested here.
Result:
2021-07-06T14:32:15.715695+00:00 shinyapps[4346040]: ====== WebDriver manager ======
2021-07-06T14:32:15.727647+00:00 shinyapps[4346040]: /bin/sh: 1: google-chrome: not found
2021-07-06T14:32:15.731757+00:00 shinyapps[4346040]: Detailed traceback:
2021-07-06T14:32:15.731756+00:00 shinyapps[4346040]:
2021-07-06T14:32:15.731758+00:00 shinyapps[4346040]: File "/srv/connect/apps/StatCanWebScrapingApp/scrapeinsolvencyinsider.py", line 12, in scrape
2021-07-06T14:32:15.731758+00:00 shinyapps[4346040]: driver = webdriver.Chrome(ChromeDriverManager().install())
2021-07-06T14:32:15.727881+00:00 shinyapps[4346040]: /bin/sh: 1: google-chrome-stable: not found
2021-07-06T14:32:15.731759+00:00 shinyapps[4346040]: File "/home/shiny/.virtualenvs/statcanEnv/lib/python3.8/site-packages/webdriver_manager/chrome.py", line 25, in __init__
2021-07-06T14:32:15.731755+00:00 shinyapps[4346040]: Warning: Error in py_call_impl: ValueError: Could not get version for Chrome with this command: google-chrome --version || google-chrome-stable --version
2021-07-06T14:32:15.731759+00:00 shinyapps[4346040]: self.driver = ChromeDriver(name=name,
2021-07-06T14:32:15.731759+00:00 shinyapps[4346040]: File "/home/shiny/.virtualenvs/statcanEnv/lib/python3.8/site-packages/webdriver_manager/driver.py", line 57, in __init__
2021-07-06T14:32:15.731760+00:00 shinyapps[4346040]: self.browser_version = chrome_version(chrome_type)
2021-07-06T14:32:15.731760+00:00 shinyapps[4346040]: File "/home/shiny/.virtualenvs/statcanEnv/lib/python3.8/site-packages/webdriver_manager/utils.py", line 155, in chrome_version
2021-07-06T14:32:15.731761+00:00 shinyapps[4346040]: raise ValueError(f'Could not get version for Chrome with this command: {cmd}')
2021-07-06T14:32:15.731761+00:00 shinyapps[4346040]:
2021-07-06T14:32:15.738260+00:00 shinyapps[4346040]: 135: <Anonymous>
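Taken together, the attempts above point at two separate things to check. The paths and `/bin/sh` lines in the logs indicate a Linux host, while chromedriver.exe is a Windows build, and both "cannot find Chrome binary" and "google-chrome: not found" suggest that no Chrome installation is visible to the app. A minimal sketch for checking both from Python follows; the helper names and the two extra Chromium command names are assumptions made for illustration, not part of selenium or webdriver-manager:

```python
import os
import shutil

# Leading bytes that identify two common executable formats.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux)",
    b"MZ": "PE (Windows)",
}

# The first two names come from the webdriver-manager log above;
# the Chromium names are an added assumption.
CHROME_COMMANDS = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)


def binary_format(path):
    """Return a label for the executable format of the file at path."""
    with open(path, "rb") as handle:
        header = handle.read(4)
    for magic, label in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return label
    return "unknown"


def find_browser(commands=CHROME_COMMANDS):
    """Return the first browser command found on PATH, or None."""
    for command in commands:
        location = shutil.which(command)
        if location:
            return location
    return None


def report(driver_path):
    """Summarize whether the driver can run here and whether a browser exists."""
    present = os.path.isfile(driver_path)
    return {
        "driver_format": binary_format(driver_path) if present else "missing",
        "driver_executable": present and os.access(driver_path, os.X_OK),
        "browser": find_browser(),
    }
```

If `report` shows a PE driver on a Linux host, changing file permissions alone would not make it runnable, and a `browser` value of `None` would be consistent with both Chrome-related errors shown above.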
【Comments】:
Tags: r selenium selenium-chromedriver shiny-server shinyapps