【Posted】: 2020-07-28 05:59:28
【Question】:
This is the code that gives me an IndexError.
# importing the required libraries
import pandas as pd
# Visualisation libraries
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import folium
from folium import plugins
# Manipulating the default plot size
plt.rcParams['figure.figsize'] = 10, 12
# Disable warnings
import warnings
warnings.filterwarnings('ignore')
# for date and time operations
from datetime import datetime
# for file and folder operations
import os
# for regular expression operations
import re
# for listing files in a folder
import glob
# for getting web contents
import requests
# for scraping web contents
from bs4 import BeautifulSoup
# get data
# link at which web data resides
link = 'https://www.mohfw.gov.in/'
# get web data
req = requests.get(link)
# parse web data
soup = BeautifulSoup(req.content, "html.parser")
# find the table
# ==============
# our target table is the last table in the page
# get the table head
# table head may contain the column names, titles, subtitles
thead = soup.find_all('thead')[-1]
# print(thead)
# get all the rows in table head
# it usually has only one row, which holds the column names
head = thead.find_all('tr')
# print(head)
# get the table tbody
# it contains the contents
tbody = soup.find_all('tbody')[-1]
# print(tbody)
# get all the rows in table body
# each row is each state's entry
body = tbody.find_all('tr')
# print(body)
IndexError
Traceback (most recent call last)
<ipython-input-7-eda41c6e195c> in <module>
15 # get the table tbody
16 # it contains the contents
---> 17 tbody = soup.find_all('tbody')[-1]
18 # print(tbody)
19
IndexError: list index out of range
【Comments】:
-
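[-1] gets the element at index -1, which makes no sense for a list, hence the error. Perhaps this should be [::-1], which is the slice notation for reversing a list.
-
Do you want to extract the table information from the site?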
-
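@HymnsForDisco The [-1] index does work in Python: it returns the last element of the list :) It is actually very useful. Where the OP probably went wrong is the case where the list has no elements. Then there is no "last element", so Python raises an error.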
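To illustrate the point in the comment above, here is a minimal sketch with plain Python lists. It assumes that `soup.find_all('tbody')` returned an empty result for this page, which is what the traceback suggests. The helper name `last_or_none` is hypothetical and not part of BeautifulSoup.

```python
def last_or_none(items):
    """Return the last element of a sequence, or None if it is empty."""
    return items[-1] if items else None


# A non-empty list: [-1] returns the last element.
rows = ["row-1", "row-2", "row-3"]
print(rows[-1])
print(last_or_none(rows))

# An empty list stands in for a find_all() call that matched nothing.
empty = []
try:
    empty[-1]
except IndexError as exc:
    # Same error type as in the question's traceback.
    print("IndexError:", exc)

# The guarded helper returns None instead of raising.
print(last_or_none(empty))
```

With a guard like this, the calling code can check for `None` and report that no `tbody` was found, instead of failing with an IndexError.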
-
@GrantSchulte Good point. It seems that some time spent in stricter languages made me forget some of Python's tricks.
-
After I changed it to [:-1], I got another error on this line: body = tbody.find_all('tr') AttributeError: 'list' object has no attribute 'find_all'
Tags: python dataframe data-analysis
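The second error is consistent with the difference between indexing and slicing: `[-1]` returns a single element, while `[:-1]` and `[::-1]` return a new list, and a list has no `find_all` method. A minimal sketch follows, using strings as stand-ins for the tag objects that `soup.find_all('tbody')` would return (the list contents are made up for illustration):

```python
# Stand-in for the result of soup.find_all('tbody'); the values are illustrative.
tables = ["tbody-a", "tbody-b", "tbody-c"]

last = tables[-1]             # indexing: one element
all_but_last = tables[:-1]    # slicing: a new list without the last element
reversed_copy = tables[::-1]  # slicing: a new list in reverse order

print(type(last).__name__)           # str (a single item)
print(type(all_but_last).__name__)   # list
print(type(reversed_copy).__name__)  # list

# Slicing an empty list does not raise; it just returns another empty list,
# which hides the original problem instead of fixing it.
print([][:-1])
```

So switching to `[:-1]` only moves the failure to the next line. The underlying issue, that no `tbody` was found in the downloaded HTML, still has to be handled.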