【Question title】: Python Selenium Text Convert into Data Frame
【Posted】: 2021-03-14 06:45:55
【Question】:

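I have a question about DataFrames. I wrote some code with Selenium to extract a table from a website, but I am not sure how to convert the text Selenium returns into a DataFrame and export it as CSV. My code is below.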

import time
import requests
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.Chrome("Path")
driver.get("https://www.bcsc.bc.ca/enforcement/early-intervention/investment-caution-list")
table = driver.find_element_by_xpath('//table[@id="inlineSearchTable"]/tbody')

while True:
    try:
        print(table.text)
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='paginate_button next']"))).click()
        time.sleep(1)

    except:
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='paginate_button next disabled']"))).click()
        break

driver.quit()

【Comments】:

Tags: python pandas selenium


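【Solution 1】:

With Selenium, get the table's outerHTML and pass it to pd.read_html() to obtain a DataFrame.

Then append each page's DataFrame to an initially empty DataFrame and export the result to CSV.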
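The parsing step can be tried without a browser. The sketch below is only an illustration: `PAGE_1` and `PAGE_2` are hypothetical stand-ins for the outerHTML that Selenium would return on two result pages, and `tables_to_frame` is a helper name invented here. It assumes an HTML parser that pandas can use (lxml, or beautifulsoup4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the table's outerHTML on two result pages.
PAGE_1 = """
<table id="inlineSearchTable">
  <thead><tr><th>Name</th><th>Date Added to the List</th></tr></thead>
  <tbody>
    <tr><td>24option.com</td><td>Aug 6, 2013</td></tr>
    <tr><td>3storich</td><td>Aug 20, 2020</td></tr>
  </tbody>
</table>
"""
PAGE_2 = """
<table id="inlineSearchTable">
  <thead><tr><th>Name</th><th>Date Added to the List</th></tr></thead>
  <tbody>
    <tr><td>Abler Finance</td><td>Sep 26, 2014</td></tr>
  </tbody>
</table>
"""


def tables_to_frame(html_pages):
    """Parse the first <table> in each HTML snippet and stack the rows."""
    # read_html returns a list of DataFrames, one per <table> found.
    frames = [pd.read_html(StringIO(html))[0] for html in html_pages]
    # ignore_index=True renumbers the rows 0..n-1 across all pages.
    return pd.concat(frames, ignore_index=True)


combined = tables_to_frame([PAGE_1, PAGE_2])
print(combined)
```

Collecting the per-page frames in a list and concatenating once avoids rebuilding the combined DataFrame on every page. The full Selenium version from the answer follows.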

import time
from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Positional driver path as in Selenium 3; recent Selenium 4 releases take a Service object instead.
driver = webdriver.Chrome("path")
driver.get("https://www.bcsc.bc.ca/enforcement/early-intervention/investment-caution-list")
dfbase = pd.DataFrame()
while True:
    try:
        # Wait for the table, then grab its HTML including the <table> tag itself.
        table = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table#inlineSearchTable"))).get_attribute("outerHTML")
        # read_html returns a list of DataFrames; the table is the first one.
        df = pd.read_html(StringIO(table))[0]
        # DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
        dfbase = pd.concat([dfbase, df], ignore_index=True)
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='paginate_button next']"))).click()
        time.sleep(1)
    except TimeoutException:
        # No enabled "next" button was found: check for the disabled one (last page) and stop.
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='paginate_button next disabled']"))).click()
        break
print(dfbase)
dfbase.to_csv("TestResultsDF.csv")
driver.quit()

Output:

                                                 Name Date Added to the List
0                                         24option.com            Aug 6, 2013
1                                             3storich           Aug 20, 2020
2       4XP Investments & Trading and Forex Place Ltd.           Mar 15, 2012
3                6149154 Canada Inc. d.b.a. Forexcanus           Aug 25, 2011
4    72Option, owned and operated by Epic Ventures ...            Dec 8, 2016
5                               A&L Royal Finance Inc.            May 6, 2015
6                                        Abler Finance           Sep 26, 2014
7             Accredited International / Accredited FX           Mar 15, 2013
8                                        Aidan Trading           Jan 24, 2018
9    AlfaTrade, Nemesis Capital Limited (together, ...           Mar 16, 2016
10                          Alma Group Co Trading Ltd.            Oct 7, 2020
11                             Ameron Oil and Gas Ltd.           Sep 23, 2010
12                           Anchor Securities Limited           Aug 29, 2011
13                                           Anyoption            Jul 8, 2013
14                                  Arial Trading, LLC           Nov 20, 2008
15                        Asia & Pacific Holdings Inc.            Dec 5, 2017
16    Astercap Ltd., doing business as Broker Official           Aug 31, 2018
17                  Astor Capital Fund Limited (Astor)            Apr 9, 2020
18                                           Astrofx24           Nov 19, 2019
19                    Atlantic Global Asset Management           Sep 12, 2017
20   Ava FX, Ava Financial Ltd. and Ava Capital Mar...           Mar 15, 2012
21                                      Ava Trade Ltd.           May 30, 2016
22                                        Avariz Group            Nov 4, 2020
23   B.I.S. Blueport Investment Services Ltd., doin...            Sep 7, 2017
24                                            B4Option            May 3, 2017
25                                 Banc de Binary Ltd.           Jul 29, 2013
26                                          BCG Invest            Apr 6, 2020
27                     BeFaster.fit Limited (BeFaster)           Jun 22, 2020
28                                         Beltway M&A            Oct 6, 2009
29                              Best Commodity Options            Aug 1, 2012
..                                                 ...                    ...
301  Trade12, owned and operated by Exo Capital Mar...            Mar 1, 2017
302                                           TradeNix           Jul 30, 2020
303                                       TradeQuicker           May 21, 2014
304                                      TradeRush.com            Aug 6, 2013
305  Trades Capital, operated by TTN Marketing Ltd....           May 18, 2016
306                                       Tradewell.io           Jan 20, 2020
307                                       TradexOption           Apr 20, 2020
308                     Trinidad Oil & Gas Corporation            Dec 6, 2011
309         Truevalue Investment International Limited           May 11, 2018
310                                         UK Options            Mar 3, 2015
311  United Financial Commodity Group, operating as...           Nov 15, 2018
312      Up & Down Marketing Limited (dba OneTwoTrade)           Apr 27, 2015
313                                   USI-TECH Limited           Dec 15, 2017
314  uTrader and Day Dream Investments Ltd. (togeth...           Nov 29, 2017
315                     Vision Financial Partners, LLC           Feb 18, 2016
316                            Vision Trading Advisors           Feb 18, 2016
317                               Wallis Partridge LLC           Apr 24, 2014
318                                        Waverly M&A           Jan 19, 2010
319                               Wealth Capital Corp.            Sep 4, 2012
320  Wentworth & Wellesley Ltd. / Wentworth & Welle...           Mar 13, 2012
321                                West Golden Capital            Dec 1, 2010
322                                      World Markets           Sep 22, 2020
323                                WorldWide CapitalFX            Feb 8, 2019
324  XForex, owned and operated by XFR Financial Lt...           Jul 19, 2016
325                                      Xtelus Profit           Nov 30, 2020
326                         You Trade Holdings Limited            Jun 3, 2011
327                                      Zen Vybe Inc.           Mar 27, 2020
328                                      ZenithOptions           Feb 12, 2016
329                      Ziptradex Limited (Ziptradex)           May 21, 2020
330                                    Zulu Trade Inc.            Mar 2, 2015
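
The leftmost column in the printed output is the DataFrame's integer index, and `to_csv` writes it as an unnamed first column by default. If the CSV should contain only the two data columns, passing `index=False` is one option. The sketch below uses a small hand-built DataFrame with two rows copied from the output above, not the scraped data.

```python
from io import StringIO

import pandas as pd

# Two sample rows taken from the printed output, for illustration only.
df = pd.DataFrame(
    {
        "Name": ["24option.com", "3storich"],
        "Date Added to the List": ["Aug 6, 2013", "Aug 20, 2020"],
    }
)

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # header starts with an empty index column
print(without_index.splitlines()[0])  # header has only the data columns

# Reading the index-free CSV back gives the same two columns.
round_trip = pd.read_csv(StringIO(without_index))
```

In the answer's script this would correspond to calling `dfbase.to_csv("TestResultsDF.csv", index=False)`.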

【Comments】:
