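Since your question is about Selenium:
You should take a look at Selenium Waits.
You need to wait until all the target elements are present in the HTML source. The code below shows how: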
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def main(url):
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Wait up to 10 seconds for the company cells to be present in the DOM.
        cells = WebDriverWait(driver, 10).until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, "td[aria-label='Company']"))
        )
        cnames = [x.text for x in cells]
        print(cnames)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


main("https://finance.yahoo.com/calendar/earnings")
Output:
['111 Inc', '360 DigiTech Inc', 'American Software Inc', 'American Software Inc', 'Corporacion America Airports SA', 'Atkore International Group Inc', 'Atkore International Group Inc', 'Helmerich and Payne Inc', 'Amtech Systems Inc', 'Amtech Systems Inc', 'Delta Apparel Inc', 'Delta Apparel Inc', 'Bellring Brands Inc', 'Berry Global Group Inc', 'Beacon Roofing Supply Inc', 'Natural Grocers By Vitamin Cottage Inc', "BJ's Wholesale Club Holdings Inc", 'Entera Bio Ltd', 'SG Blocks Inc', 'SG Blocks Inc', 'BEST Inc', 'Brady Corp', 'BioHiTech Global Inc', 'BioHiTech Global Inc', 'Oaktree Strategic Income Corporation', 'Caleres Inc', 'Pennantpark Investment Corp', 'Geospace Technologies Corp', 'Canadian Solar Inc', 'Oaktree Specialty Lending Corp', 'Matthews International Corp', 'Clearsign Technologies Corp', "Children's Place Inc", 'Elys Game Technology Corp', 'Dada Nexus Ltd', 'ESCO Technologies Inc', 'Euroseas Ltd', 'Fangdd Network Group Ltd', 'Fangdd Network Group Ltd', 'Golden Ocean Group Ltd', 'Hoegh LNG Partners LP', 'Post Holdings Inc', 'Huize Holding Ltd', 'Haynes International Inc', "Macy's Inc", 'OneWater Marine Inc', 'OneWater Marine Inc', 'Woodward Inc', 'StealthGas Inc', 'Maximus Inc', 'Ross Stores Inc', 'Intuit Inc', 'Ooma Inc', 'Williams-Sonoma Inc', 'Precipio Inc', 'NetEase Inc', 'Workday Inc', 'i3 Verticals Inc', 'Knot Offshore Partners LP', 'Maxeon Solar Technologies Ltd', 'Opera Ltd', 'Puxin Ltd', 'Puxin Ltd']
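For context, the explicit wait used in the Selenium code above works by polling: it calls the condition repeatedly until the condition returns something truthy or the timeout expires. The sketch below imitates that idea in plain Python so it can run without a browser. It is a simplified stand-in, not Selenium's implementation; `wait_until`, `fake_lookup`, and the two sample names returned by the lookup are hypothetical and exist only for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose cells only show up on the third lookup.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return ["Intuit Inc", "Workday Inc"] if attempts["count"] >= 3 else []


names = wait_until(fake_lookup, timeout=2.0, poll_interval=0.01)
print(names)
```

An empty list is falsy, so the loop keeps polling until the lookup finally returns elements, which mirrors how waiting for the presence of all matching elements behaves.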
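Note: you don't need Selenium for this, as it will slow your task down considerably.
I also see no reason to import a huge library such as pandas just to read an HTML table.
You can get what you need with the code below, which also gives you the exact call date: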
import csv
import json
import re

import requests

keys = ['ticker', 'companyshortname', 'startdatetime']


def main(url):
    r = requests.get(url)
    # The page embeds its data as a JSON object assigned to App.main.
    goal = json.loads(re.search(r"App\.main.*?({.+})", r.text).group(1))
    rows = goal['context']['dispatcher']['stores']['ScreenerResultsStore']['results']['rows']
    target = [[item[k] for k in keys] for item in rows]
    with open("result.csv", 'w', newline="") as f:
        writer = csv.writer(f)
        writer.writerow(keys)
        writer.writerows(target)


main("https://finance.yahoo.com/calendar/earnings")
Output: view-online
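To see how the regex and JSON steps fit together without hitting the network, here is a minimal sketch that runs the same extraction against a tiny inline sample. The sample payload, its placeholder values, and the helper name `extract_rows` are assumptions made for illustration; the real page source is far larger and its layout may differ.

```python
import json
import re

KEYS = ["ticker", "companyshortname", "startdatetime"]


def extract_rows(html, keys=KEYS):
    """Find the JSON object assigned to App.main and pick the wanted keys per row."""
    match = re.search(r"App\.main.*?({.+})", html)
    if match is None:
        raise ValueError("embedded App.main payload not found")
    data = json.loads(match.group(1))
    rows = data["context"]["dispatcher"]["stores"]["ScreenerResultsStore"]["results"]["rows"]
    return [[row[k] for k in keys] for row in rows]


# Hypothetical, heavily shortened stand-in for the page source (placeholder values).
payload = {"context": {"dispatcher": {"stores": {"ScreenerResultsStore": {"results": {"rows": [
    {"ticker": "AAA", "companyshortname": "Example Corp A",
     "startdatetime": "2020-01-01T00:00:00.000Z", "extra": 1},
    {"ticker": "BBB", "companyshortname": "Example Corp B",
     "startdatetime": "2020-01-02T00:00:00.000Z", "extra": 2},
]}}}}}}
sample_html = "<script>root.App.main = " + json.dumps(payload) + ";</script>"

rows = extract_rows(sample_html)
print(rows)
```

The pattern relies on the JSON sitting on a single line, since `.` does not match newlines by default; the greedy `.+` then runs to the last `}` on that line, which is the end of the object.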