IndexError: list index out of range for loop in Python

【Question title】: IndexError: list index out of range for loop in python
【Posted】: 2019-11-05 05:00:36
【Question】:

Hi everyone, I'm trying to scrape some pages, but I get this error at item 59.

My xlsx file has 1089 items.

Error:

Traceback (most recent call last):
  File ".\seleniuminform.py", line 28, in <module>
    s.write(phone[i].text + "," + wevsite_link[i].text + "\n")
IndexError: list index out of range

Here is my Python code:

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException


with open("Sans Fransico.csv","r") as s:
    s.read()

df = pd.read_excel('myfile.xlsx') # Get all the urls from the excel
mylist = df['Urls'].tolist() #urls is the column name

driver = webdriver.Chrome()

for url in mylist:

    driver.get(url)
    wevsite_link = driver.find_elements_by_css_selector(".text--offscreen__373c0__1SeFX+ .link-size--default__373c0__1skgq")

    phone = driver.find_elements_by_css_selector(".text--offscreen__373c0__1SeFX+ .text-align--left__373c0__2pnx_")

    num_page_items = len(phone)
    with open("Sans Fransico.csv", 'a',encoding="utf-8") as s:

        for i in range(num_page_items):
            s.write(phone[i].text + "," + wevsite_link[i].text + "\n")

driver.close()
print ("Done")

Link:

https://www.yelp.com/biz/daeho-kalbijjim-and-beef-soup-san-francisco-9?osq=Restaurants

The error occurs for the website and phone on this page.

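【Comments】:

Evidently, wevsite_link is not as long as phone. Can you explain why you expected them to be the same length? If not, have you considered what you want to happen?
You should first find all .text--offscreen__373c0__1SeFX+ elements, then use a for loop to search within each one for the phone and website to build (phone, website) pairs. If some .text--offscreen__373c0__1SeFX+ has no phone, you may sometimes get (None, website).
Could you please post that as an answer so I can understand it better!
You show an image with the website and phone, but some items on the page may have no phone, so you may get fewer phones than websites.
@KarlKnechtel Well explained, that's basically what I wrote in my answer.

Tags: python selenium loops for-loop web-scraping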

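【Answer 1】:

I'm not very familiar with Selenium, so I can't comment on that side of things.

The first time you open "Sans Fransico.csv", you read its contents but never assign them to a variable, so that read has no effect.

As for your error: your range is based on the length of phone, not the length of wevsite_link. If wevsite_link is shorter than phone, indexing it raises the error. Put simply, you found fewer website links than phone numbers, but your code assumes you will always find exactly the same number of each.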
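
A minimal sketch of that mismatch, using plain lists of strings as stand-ins for the Selenium results (the values and the two helper functions are made up for illustration). Indexing by `len(phone)` fails as soon as `wevsite_link` is shorter, whereas `itertools.zip_longest` pads the shorter list:

```python
from itertools import zip_longest

# Hypothetical stand-ins for the .text values Selenium would return.
phone = ["(415) 555-0101", "(415) 555-0102", "(415) 555-0103"]
wevsite_link = ["a.example", "b.example"]  # one fewer than phone


def rows_by_index(phones, links):
    """Mirror of the question's loop: the range depends on len(phones) only."""
    return [phones[i] + "," + links[i] for i in range(len(phones))]


def rows_padded(phones, links):
    """Pair items by position, padding the shorter list with ''."""
    return [p + "," + w for p, w in zip_longest(phones, links, fillvalue="")]


try:
    rows_by_index(phone, wevsite_link)
except IndexError as exc:
    print("IndexError:", exc)

print(rows_padded(phone, wevsite_link))
```

Padding only stops the crash. If the missing website belongs to an item in the middle of the page, the later phones and websites will be paired with the wrong partner, so matching by position is probably not what you want in the end.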

Can you explain your code? What are you trying to do?

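【Comments】:

What code do you need?
@UsmanShahzad I mean, explain what you want your code to do. What's the context?
I want to scrape the phone and the website.
Question updated, please check!
@UsmanShahzad I'm pretty sure the length of the URL versus the length of the phone is not the issue. It's about the number of URLs you found versus the number of phone numbers you found.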

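【Answer 2】:

It seems that some items don't have a phone, so fewer phones are found than websites.

You should first find all ".text--offscreen__373c0__1SeFX+" elements, then use a for loop to search for the phone and the website separately within each item.

With try/except you can detect when an item has no phone and use an empty string as the phone number instead.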

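from selenium.common.exceptions import NoSuchElementException

# Continues from the question's code: `mylist` and `driver` are defined there.
for url in mylist:

    driver.get(url)

    # Untested selector. Note that it ends in a dangling "+" combinator,
    # which is not valid CSS on its own; it should be replaced with the
    # selector of the element that contains both the phone and the website.
    all_items = driver.find_elements_by_css_selector(".text--offscreen__373c0__1SeFX+")

    for item in all_items:
        # Look up each field inside the same item, so the pair stays aligned.
        try:
            wevsite_link = item.find_element_by_css_selector(".link-size--default__373c0__1skgq")
            wevsite_link = wevsite_link.text
        except NoSuchElementException:
            wevsite_link = ''

        try:
            phone = item.find_element_by_css_selector(".text-align--left__373c0__2pnx_")
            phone = phone.text
        except NoSuchElementException:
            phone = ''

        with open("Sans Fransico.csv", 'a', encoding="utf-8") as s:
            s.write(phone + "," + wevsite_link + "\n")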

I don't have the URL of the page, so I couldn't test it.
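
The per-item idea can be tried without a browser. In this sketch, dictionaries stand in for the page elements and `KeyError` stands in for Selenium's `NoSuchElementException`; both substitutions, the helper `field_or_empty`, and the sample data are assumptions made for illustration only.

```python
# Hypothetical items: each dict plays the role of one container element.
items = [
    {"phone": "(415) 555-0101", "website": "a.example"},
    {"website": "b.example"},        # this item has no phone
    {"phone": "(415) 555-0103"},     # this item has no website
]


def field_or_empty(item, key):
    """Return item[key], or '' when the item lacks that field."""
    try:
        return item[key]
    except KeyError:  # stand-in for NoSuchElementException
        return ""


# Both fields are read from the same item, so they cannot drift apart.
rows = [(field_or_empty(it, "phone"), field_or_empty(it, "website")) for it in items]

for phone, website in rows:
    print(phone + "," + website)
```

Because each pair comes from a single item, a missing field leaves a blank cell in that row instead of shifting every later value by one position.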

【Comments】:

One last thing: I get this error when I run your code?
Message: invalid selector: An invalid or illegal selector was specified on line 20

【Answer 3】:

At first glance, I suspect that

phone = driver.find_elements_by_css_selector(".text--offscreen__373c0__1SeFX+ .text-align--left__373c0__2pnx_")

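returns 0 matches. The CSS selector you are using to find them may not be accurate.

【Comments】:

That shouldn't cause an error, should it? His range is based on the length of phone, so he wouldn't index outside of it.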
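
A quick way to check the comment's point, using empty lists as stand-ins for a `find_elements` call that matched nothing (the variable `written` is introduced here only for illustration):

```python
phone = []          # nothing matched
wevsite_link = []

written = []
for i in range(len(phone)):   # range(0) yields no indices
    written.append(phone[i] + "," + wevsite_link[i])

print(written)
```

The loop body never runs, so an empty `phone` list produces no rows and no IndexError. The error needs `phone` to be longer than `wevsite_link`.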