【Question title】:How to get the postal codes of the stores that offer delivery?
【Posted】:2020-07-31 15:00:04
【Question】:
# importing package

from selenium import webdriver

# setting the path and the headless option

PATH = r"C:\Program Files (x86)\chromedriver.exe"
options = webdriver.ChromeOptions()
options.headless = True
driver = webdriver.Chrome(PATH, options=options)

driver.get("https://www.craispesaonline.it/provincia/treviso")

# XPath for the address and postal code

x = ('//address//p[@class="text-lowercase m-0 ng-binding"]')
search = driver.find_elements_by_xpath(x)

# retrieving the output in a text file

with open("Italy_Scrape.txt", "a") as f:
    for i in search:
        print("PostalCode :" + i.text, file=f)
        print("----------------------------------------------------------------------------", file=f)

driver.quit()

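This is the code I use to get the postal addresses. It uses Selenium with headless Chrome. I only need code that gets the postal codes of the stores that can deliver.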

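【Comments】:

    Tags: python selenium xpath web-scraping webdriverwait


    【Solution 1】:

    The page takes time to load fully, which is why you are not getting the values you want.

    To get all the postal codes, use WebDriverWait() and wait for visibility_of_all_elements_located().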
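To illustrate what an explicit wait does, here is a minimal sketch of a polling loop in plain Python. It is not Selenium's implementation: `wait_until` and `elements_ready` are hypothetical names, and the "page" is simulated by a counter. The real WebDriverWait polls every 0.5 seconds by default and also ignores NoSuchElementException while waiting.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose elements only appear on the third poll.
attempts = {"count": 0}


def elements_ready():
    attempts["count"] += 1
    return ["element"] if attempts["count"] >= 3 else []


found = wait_until(elements_ready, timeout=5.0, poll_interval=0.01)
print(found, attempts["count"])
```

The point is that the lookup is retried until it succeeds, instead of running once before the page has finished rendering.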

    To get only the last child node of each element, you can use the JavaScript executor or splitlines() to extract just the postal code.

    driver.get("https://www.craispesaonline.it/provincia/treviso")
    search=WebDriverWait(driver,20).until(EC.visibility_of_all_elements_located((By.XPATH,'//address//p[@class="text-lowercase m-0 ng-binding"]')))
    for postcode in search:
        print(driver.execute_script('return arguments[0].lastChild.textContent;', postcode))
    

    You need the following imports.

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    

    Console output:

    0422/710092
     0422 452388
     0422/958833
     0423/689003
     0422/853881
     0422/969047
     0423/564126
     0423/650073
     0423/723434
     0423/942150
     0438/500484
     0423/868496
     0438/898282
     0483801679
     0422/832603
     0423/470063
     0423/755164-23
     0438/492409
     0438/893369
     0422/791529
     0423/302959
     0423/301381
     0423-603754
     0423/609936
     0423/609151
     0423480340
     0438/781107
     0423/670593
     0423/81743
     0423/81534
     0423/972091
     0423/451941
     0422/912384
     0423/620803
     0423/621383
    

    Using splitlines() gives the same output:

    driver.get("https://www.craispesaonline.it/provincia/treviso")
    search=WebDriverWait(driver,20).until(EC.visibility_of_all_elements_located((By.XPATH,'//address//p[@class="text-lowercase m-0 ng-binding"]')))
    for postcode in search:
        print(postcode.text.splitlines()[-1].split("|")[-1].strip()) #last element which is postcode
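The string handling in the splitlines() variant can be tried without a browser. The sketch below applies the same chain of calls to a made-up string; `last_field` is a hypothetical helper and `sample` is only a guess at the shape of the element text, not the page's actual content. The empty-text guard is an addition, since indexing `[-1]` on an empty list would raise IndexError.

```python
def last_field(text):
    """Return the last '|'-separated field of the last line, stripped of whitespace."""
    lines = text.splitlines()
    if not lines:
        # An element with no text has no lines to index.
        return ""
    return lines[-1].split("|")[-1].strip()


# Hypothetical element text; the real page may be laid out differently.
sample = "first line of the element text\nsecond line | 0422/710092"
print(last_field(sample))
```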
    

    【Comments】:

      【Solution 2】:

      To complete the previous answer, you can get the postal codes of the stores that offer delivery with a single XPath expression:

      //div[@class="row province-cms-content-store-row ng-scope"][./div[@ng-if="store.shippingEnabled == true"]]//meta[@itemprop="postalCode"]/@content
      

      Selenium code:

      driver.get("https://www.craispesaonline.it/provincia/treviso")
      # <meta> tags are never visible, so wait for presence; the wait returns a list
      elements = WebDriverWait(driver,20).until(EC.presence_of_all_elements_located((By.XPATH,'//div[@class="row province-cms-content-store-row ng-scope"][./div[@ng-if="store.shippingEnabled == true"]]//meta[@itemprop="postalCode"]')))
      postcodes = [element.get_attribute("content") for element in elements]
      

      Output: 29 postal codes

      ['31038']
      ['31038']
      ['31047']
      ['31050']
      ['31030']
      ...
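The filtering idea behind that XPath (keep only store rows that contain a shipping-enabled div, then read the `content` attribute of the postal code meta tag) can be sketched offline with the standard library's ElementTree instead of Selenium. The markup in `SAMPLE` is hypothetical and heavily simplified, and `delivery_postcodes` is an illustrative name; only the class, `ng-if` and `itemprop` values are taken from the expression above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page has more nesting and attributes.
SAMPLE = """
<root>
  <div class="row province-cms-content-store-row ng-scope">
    <meta itemprop="postalCode" content="31038" />
    <div ng-if="store.shippingEnabled == true">Consegna</div>
  </div>
  <div class="row province-cms-content-store-row ng-scope">
    <meta itemprop="postalCode" content="31100" />
  </div>
</root>
"""

SHIPPING_FLAG = "store.shippingEnabled == true"


def delivery_postcodes(markup):
    """Return postal codes of store rows that have a shipping-enabled child div."""
    root = ET.fromstring(markup)
    codes = []
    for row in root.findall("div"):
        # Keep the row only if a direct child div carries the shipping flag.
        if not any(child.get("ng-if") == SHIPPING_FLAG for child in row.findall("div")):
            continue
        for meta in row.iter("meta"):
            if meta.get("itemprop") == "postalCode":
                codes.append(meta.get("content"))
    return codes


print(delivery_postcodes(SAMPLE))
```

Only the first row has the shipping flag, so the second row's postal code is left out.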
      

      【讨论】:

        【Solution 3】:

        To extract the postal codes only for the stores that offer delivery, you can use WebDriverWait with visibility_of_all_elements_located() and the following locator strategy:

        • Using XPATH:

          driver.get("https://www.craispesaonline.it/provincia/treviso")
          WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='cl-accept']"))).click()
          driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h2[contains(., 'Potrai scegliere di ricevere la tua spesa in due modi:')]"))))
          addresses = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//input[@value='Consegna']//preceding::address[1]//p[@class='text-lowercase m-0 ng-binding']")))]
          for address in addresses:
              print(re.findall(r"\b\d{5}\b", address))
          
        • Console output:

          ['31038']
          ['31038']
          ['31047']
          ['31050']
          ['31030']
          ['31031']
          ['31034']
          ['31014']
          ['31035']
          ['31010']
          ['31010']
          ['31036']
          ['31037']
          ['31037']
          ['31050']
          ['31050']
          ['31044']
          ['31044']
          ['31044']
          ['31044']
          ['31044']
          ['31023']
          ['31058']
          ['31040', '81743']
          ['31049']
          ['31050']
          ['31020']
          ['31040']
          ['31040']
          
        • Note: you have to add the following imports:

          import re
          from selenium.webdriver.support.ui import WebDriverWait
          from selenium.webdriver.common.by import By
          from selenium.webdriver.support import expected_conditions as EC
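One line of the console output above reads `['31040', '81743']`. The second value looks like the tail of a phone number written as `0423/81743`, which `\b\d{5}\b` also matches because `/` counts as a word boundary. A possible refinement, sketched under the assumption that phone numbers always follow a `/`, `-` or another digit, is to reject five-digit runs preceded by those characters. The sample address string and the function name are hypothetical.

```python
import re

# Five digits not preceded by '/', '-' or a digit, and not followed by a digit.
POSTCODE_RE = re.compile(r"(?<![/\d-])\d{5}(?!\d)")


def extract_postcodes(address):
    """Return the five-digit codes in address that are not part of a phone number."""
    return POSTCODE_RE.findall(address)


# Hypothetical address text in the shape suggested by the output above.
address = "via esempio 1 31040 localita | 0423/81743"

print(re.findall(r"\b\d{5}\b", address))  # original pattern
print(extract_postcodes(address))         # refined pattern
```

If a phone number on the page were written with a space instead of a slash, this pattern would still pick it up, so taking only the first match per address is another option.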
          

        【Comments】:
