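【Posted】: 2022-01-16 10:48:25
【Question】:
I am trying to scrape https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements to get every value in the table. I am using Selenium, but it only returns the first 6 values of each row; the remaining values appear to be hidden.
Code: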
# Install Selenium first (in a notebook: !pip install selenium)
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
# Note: hdr is never passed to Selenium, so it has no effect on the page load
hdr = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
       "X-Requested-With": "XMLHttpRequest"}
driver.get(url)
# Read the rendered text of the grid's content table
a = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
print(a)
Output:
Revenue
$365.817
$274.515
$260.174
$265.595
$229.234
$215.639
Cost Of Goods Sold
$212.981
$169.559
$161.782
$163.756
$141.048
$131.376
Gross Profit
$152.836
$104.956
$98.392
$101.839
$88.186
$84.263
Research And Development Expenses
$21.914
$18.752
$16.217
$14.236
$11.581
$10.045
SG&A Expenses
$21.973
$19.916
$18.245
$16.705
$15.261
$14.194
Other Operating Income Or Expenses
-
-
-
-
-
-
Operating Expenses
$256.868
$208.227
$196.244
$194.697
$167.890
$155.615
Operating Income
$108.949
$66.288
$63.930
$70.898
$61.344
$60.024
Total Non-Operating Income/Expense
$258
$803
$1.807
$2.005
$2.745
$1.348
Pre-Tax Income
$109.207
$67.091
$65.737
$72.903
$64.089
$61.372
Income Taxes
$14.527
$9.680
$10.481
$13.372
$15.738
$15.685
Income After Taxes
$94.680
$57.411
$55.256
$59.531
$48.351
$45.687
Other Income
-
-
-
-
-
-
Income From Continuous Operations
$94.680
$57.411
$55.256
$59.531
$48.351
$45.687
Income From Discontinued Operations
-
-
-
-
-
-
Net Income
$94.680
$57.411
$55.256
$59.531
$48.351
$45.687
EBITDA
$120.233
$77.344
$76.477
$81.801
$71.501
$70.529
EBIT
$108.949
$66.288
$63.930
$70.898
$61.344
$60.024
Basic Shares Outstanding
16.701
17.352
18.471
19.822
20.869
21.883
Shares Outstanding
16.865
17.528
18.596
20.000
21.007
22.001
Basic EPS
$5.67
$3.31
$2.99
$3.00
$2.32
$2.09
EPS - Earnings Per Share
$5.61
$3.28
$2.97
$2.98
$2.30
$2.08
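When I try to fetch any of the missing values individually, I get the error "Message: Unable to locate element:".
Example code for one of the missing values: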
# Install Selenium first (in a notebook: !pip install selenium)
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
# Note: hdr is never passed to Selenium, so it has no effect on the page load
hdr = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
       "X-Requested-With": "XMLHttpRequest"}
driver.get(url)
# Absolute XPath to a single cell that is not in the initially visible columns
b = driver.find_element(By.XPATH, "/html/body/div[2]/div[3]/div[4]/div/div/div[4]/div[2]/div/div[1]/div[9]/div").text
print(b)
Error:
NoSuchElementException: Message: Unable to locate element: /html/body/div[2]/div[3]/div[4]/div/div/div[4]/div[2]/div/div[1]/div[9]/div
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:181:5
NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:393:5
element.find/</<@chrome://remote/content/marionette/element.js:299:16
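I get the error whether I use XPath, CSS selectors, and so on, so the locator strategy does not seem to matter.
Thanks in advance! This is my first question here, so apologies if I have missed anything.
Edit:
After @PApostol's comment I arrived at a partial solution. The problem was that the data is not visible in the initial layout, so I enlarged the window and made the grid scroll to the right. It now misses the first values, so my temporary workaround will be to concatenate the two captures. This is my code now: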
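import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
# Note: hdr is never passed to Selenium, so it has no effect on the page load
hdr = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
       "X-Requested-With": "XMLHttpRequest"}
driver.get(url)
# A wider window makes the grid render more columns
driver.set_window_size(2000, 2000)
# First capture: the columns visible before scrolling
a = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
# Hold the grid's right arrow for a few seconds to scroll right
arrow = driver.find_element(By.CSS_SELECTOR, ".jqx-icon-arrow-right")
webdriver.ActionChains(driver).click_and_hold(arrow).perform()
time.sleep(4)
# Second capture: the columns visible after scrolling
b = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
# Release the mouse button once both captures are taken
webdriver.ActionChains(driver).release().perform()
print(a, b)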
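The fixed 4-second hold in the code above could be replaced by a loop that scrolls one step, re-reads the grid, and stops once the text no longer changes. This is only a minimal sketch of that idea: `collect_snapshots`, `read` and `advance` are hypothetical names, and the fake grid below stands in for the browser so the loop can be run without Selenium.

```python
def collect_snapshots(read, advance, max_steps=20):
    """Return successive read() results, calling advance() between reads,
    and stop as soon as a read matches the previous one."""
    snapshots = [read()]
    for _ in range(max_steps):
        advance()
        current = read()
        if current == snapshots[-1]:
            break  # nothing new appeared, so the end was reached
        snapshots.append(current)
    return snapshots


# Stand-in for the browser: a 4-column row viewed through a 2-column window.
columns = ["$365.817", "$274.515", "$260.174", "$265.595"]
state = {"first_visible": 0}


def read():
    start = state["first_visible"]
    return "\n".join(columns[start:start + 2])


def advance():
    # Scroll right by one column, but never past the last full window.
    state["first_visible"] = min(state["first_visible"] + 1, len(columns) - 2)


snapshots = collect_snapshots(read, advance)
print(len(snapshots))
```

With Selenium, `read` could be `lambda: driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text` and `advance` a single click on the `.jqx-icon-arrow-right` element followed by a short pause. Whether one click moves this particular grid by exactly one column is an assumption that has not been checked here.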
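【Comments】:
-
At least for
Other Operating Income Or Expenses the text is being printed, i.e. - -
This is probably because only the first 6 values are visible without scrolling further to the right.
-
@PApostol Thank you very much! That was the problem. I solved it by making the grid scroll to the right, although it now misses the first column, so next I will try to concatenate the data.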
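For the concatenation step mentioned here, one possible approach is to parse each capture into rows and join them on the row label, dropping the columns that appear in both captures. What follows is a minimal sketch, assuming both captures list each row label followed by its values (as in the output in the question) and that the `Revenue` row can be used to measure how many columns overlap. `parse_grid_text`, `column_overlap` and `merge_captures` are hypothetical helper names, not part of Selenium.

```python
import re


def parse_grid_text(text):
    """Group a flat grid dump into {row label: [cell values]}."""
    rows = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # A cell is either a lone "-" or a number with an optional "$" or sign.
        if re.fullmatch(r"-|-?\$?-?[\d,]*\.?\d+", line):
            if label is not None:
                rows[label].append(line)
        else:
            label = line
            rows[label] = []
    return rows


def column_overlap(left, right):
    """Largest k such that the last k values of left equal the first k of right."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_captures(first, second, key="Revenue"):
    """Append the columns of `second` that `first` does not already contain."""
    overlap = column_overlap(first[key], second[key])
    return {label: values + second.get(label, [])[overlap:]
            for label, values in first.items()}


# Two overlapping captures built from the Revenue values in the question.
first_text = "Revenue\n$365.817\n$274.515\n$260.174\n$265.595\nOther Income\n-\n-\n-\n-"
second_text = "Revenue\n$260.174\n$265.595\n$229.234\n$215.639\nOther Income\n-\n-\n-\n-"

merged = merge_captures(parse_grid_text(first_text), parse_grid_text(second_text))
for label, values in merged.items():
    print(label, values)
```

The overlap is measured on a single row and then applied to every row, because rows that contain only `-` cannot reveal how far the grid has scrolled.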
Tags: python html css selenium selenium-webdriver