Reading an Excel dataframe starting from a specific location based on a cell value: answers

【Question title】: reading excel dataframe starting from specific location based on the cell value
【Posted】: 2021-10-19 03:17:04
【Question description】:

Suppose I have the following Excel file:

A	B	C
1	11	111
2	22	start
3	33	333
4	44	444

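I want to read the file into a dataframe, making sure that reading begins at the row above the one that contains the value start. I tried the following: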

df=df[(df.iloc[:, 2] == "start").shift(-1,fill_value=False).cummax()]
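
For reference, here is a minimal runnable sketch of what that line does on the sample data. The DataFrame is built by hand rather than read from Excel, and the names is_start, mask and attempt are introduced only for illustration.

```python
import pandas as pd

# Sample data from the question, built in memory instead of read from Excel.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [11, 22, 33, 44],
    "C": [111, "start", 333, 444],
})

# True on the row whose third column (position 2) holds "start".
is_start = df.iloc[:, 2] == "start"

# Move the flag up one row, then keep that row and every row after it.
mask = is_start.shift(-1, fill_value=False).cummax()

attempt = df[mask]
print(attempt)
```

The row offset is found dynamically here, but the column position 2 is hard-coded and every column is kept.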

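However, not only does the row containing start change, the column also changes between the sheets in the workbook. Is there a way to locate the value I need to start from by its column and row position?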

This is what I want:

C
111
start
333
444

Thanks!

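【Comments】:

Have you tried using 'skiprows'? Try: df = pd.read_excel(file_name, skiprows=1)
That won't work, because the row also changes between the sheets.
So in the example above, you only want the rows where A is 2, 3 and 4?
I am trying to read the data starting from the column and row where start is located. I will update the question in a few seconds.
@Kimchi - see whether the answer below works for you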

Tags: python pandas

【Solution 1】:

Try:

#index of first row above "start"
row_index = df[df.shift(-1).eq("start").any(axis=1)].index.min()

#name of first column before the column that contains "start"
col_index = df.loc[:,df.shift(-1, axis=1).eq("start").any(0)].columns[0]

#select all rows and columns per the above indices.
>>> df.loc[row_index:, col_index:]

    B      C
0  11    111
1  22  start
2  33    333
3  44    444
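
The same lookup can also be done purely by position, which may be easier to reuse when the row and column of start differ from sheet to sheet. The following is only a sketch under stated assumptions: slice_from_marker and its parameters rows_above and cols_before are hypothetical names, not part of pandas, and the sample DataFrame is built in memory instead of being read from Excel.

```python
import numpy as np
import pandas as pd


def slice_from_marker(df, marker="start", rows_above=1, cols_before=1):
    """Hypothetical helper: slice df starting `rows_above` rows above and
    `cols_before` columns left of the first cell equal to `marker`."""
    # (row, col) positions of every matching cell, in row-major order.
    hits = np.argwhere(df.eq(marker).to_numpy())
    if len(hits) == 0:
        raise ValueError(f"marker {marker!r} not found")
    row_pos, col_pos = (int(v) for v in hits[0])
    # Clamp at 0 so a marker in the first row or column does not wrap around.
    first_row = max(row_pos - rows_above, 0)
    first_col = max(col_pos - cols_before, 0)
    return df.iloc[first_row:, first_col:]


# Sample data from the question.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [11, 22, 33, 44],
    "C": [111, "start", 333, 444],
})

print(slice_from_marker(df))                 # columns B and C, as in this answer
print(slice_from_marker(df, cols_before=0))  # column C only, as in the question
```

For a workbook with several sheets, one option is to load them with pd.read_excel(file_name, sheet_name=None), which returns a dict of DataFrames keyed by sheet name, and call the helper on each one.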

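【Discussion】:

This works! I can read from the col before the start col using the following: col_index = df.loc[:,df.shift(-1).eq("start").any(0)].columns[0]
Check the edit! That should give you what you want.
Is there any good material I can use to get familiar with writing queries like this?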