Openpyxl: finding the last row that contains data in a specific Excel column

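【Question title】: Openpyxl: need the max number of rows in a column that has data in Excel
【Posted】: 2019-03-19 17:14:30
【Question description】:

I need the last row that contains data in a specific column of an Excel sheet. In openpyxl, sheet.max_row and sheet.max_column give the maximum row or column for the whole worksheet, but I want it for one particular column.

My scenario: I have to fetch some values from a database and append them to the end of a specific column in an Excel worksheet.

In this screenshot, if I want the last row that contains data in column "C", it should return 10.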

------------- Solution 1 --------

import pandas as pd

# lt holds the data to be loaded into the Excel file.
# sheet (an openpyxl worksheet) and col (1-based column number) are defined elsewhere.

# Read once: the file on disk does not change while the loop runs
panda_xl_rd = pd.read_excel('file.xlsx', sheet_name='sheet_Name')  # pandas DataFrame

# dropna() removes the NaN values; otherwise we would get the length of the
# whole sheet rather than of this column. +2 skips the header row and moves to
# the row right after the last record, where new data is entered.
next_row = len(panda_xl_rd.iloc[:, col - 1].dropna()) + 2

for index, i in enumerate(lt):
    cellref = sheet.cell(row=next_row + index, column=col)
    cellref.value = i

del panda_xl_rd

------------- Solution 2 --------

https://stackoverflow.com/a/52816289/10003981

------------- Solution 3 --------

https://stackoverflow.com/a/52817637/10003981

Maybe solution 3 is more concise!

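【Question comments】:

Tags: python excel openpyxl

【Solution 1】:

Question: I want the max row containing data in column "C"; it should return 10:

Simply count the cells whose cell.value is not empty.
Documentation: Accessing many cells

Pseudocode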

for cell in Column('C'):
    if not cell.value is empty:
        count += 1

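Comment: What if there is an empty cell in between?

Track the row index in step with the column range and keep it in a maxRowWithData variable. This also works when there are no empty cells in between.

Pseudocode
for row index, cell in enumerate Column('C'):
    if not cell.value is empty:
        maxRowWithData = row index
Note: openpyxl cell indexes are 1-based!

Documentation: enumerate(iterable, start=0)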
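
The two pseudocode fragments above can be turned into runnable Python roughly as follows. This is a minimal sketch, not the answerer's own code: it treats "empty" as `value is None`, the function names `count_filled` and `last_row_with_data` are made up for illustration, and `SimpleNamespace` objects stand in for openpyxl cells so the example runs without a workbook.

```python
from types import SimpleNamespace


def count_filled(cells):
    """Count the cells whose value is not None."""
    return sum(1 for cell in cells if cell.value is not None)


def last_row_with_data(cells):
    """Return the 1-based position of the last non-empty cell, or 0 if there is none."""
    max_row_with_data = 0
    # start=1 because openpyxl row numbers are 1-based
    for row_index, cell in enumerate(cells, start=1):
        if cell.value is not None:
            max_row_with_data = row_index
    return max_row_with_data


# Stand-in for a column of cells: anything with a .value attribute works.
column_c = [SimpleNamespace(value=v) for v in ["header", 1, None, 3, None]]

print(count_filled(column_c))        # 3
print(last_row_with_data(column_c))  # 4
```

With a real worksheet, `ws['C']` yields the cells of column C starting at row 1, so it can be passed in directly. The two results differ as soon as the column has a gap, which is why the second variant is the safer one for appending.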

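【Discussion】:

What if there is an empty cell in between? Usually we won't have one, I'm just curious.
This worked for me! Thanks for the prompt reply. I will update the question itself with both solutions.
It throws an error: NameError: name 'empty' is not defined
@ChandraShekhar "NameError: name 'empty' is not defined": Did you notice the word PSEUDOCODE? It means this is not working code. You have to expand it into valid Python code for your needs.
@stovfl Thanks, I missed that word.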

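【Solution 2】:

"Empty" is a relative concept, so your code should be explicit about what it means. The methods in openpyxl are guaranteed to return orthogonal result sets: the length of rows and columns will always be the same.

Using this, we can deduce the highest row in a column whose cell value is not None.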

max_row_for_c = max((c.row for c in ws['C'] if c.value is not None))
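
One edge case in the one-liner above: `max()` raises `ValueError` when the generator yields nothing, that is, when the column holds no values at all. A hedged variant uses the `default` argument of `max()`. The function name `max_row_for_column` and the stand-in cell objects below are assumptions for illustration only.

```python
from types import SimpleNamespace


def max_row_for_column(cells):
    """Highest row number whose cell value is not None; 0 if the column is empty."""
    return max((c.row for c in cells if c.value is not None), default=0)


# Stand-ins for openpyxl cells: each has a .row and a .value attribute.
cells = [
    SimpleNamespace(row=r, value=v)
    for r, v in enumerate(["a", None, "b", None], start=1)
]

print(max_row_for_column(cells))  # 3
print(max_row_for_column([]))     # 0
```

Returning 0 for an empty column means the next free row is always the result plus one.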

【Discussion】:

Thanks for the prompt reply! But this just returns my column name: 'C'

【Solution 3】:

I think I just found a way to do it using pandas:

import pandas as pd

# lt holds the data to be loaded into the Excel file.
# sheet (an openpyxl worksheet) and col (1-based column number) are defined elsewhere.

# Read once: the file on disk does not change while the loop runs
panda_xl_rd = pd.read_excel('file.xlsx', sheet_name='sheet_Name')  # pandas DataFrame

# dropna() removes the NaN values; otherwise we would get the length of the
# whole sheet rather than of this column. +2 skips the header row and moves to
# the row right after the last record, where new data is entered.
next_row = len(panda_xl_rd.iloc[:, col - 1].dropna()) + 2

for index, i in enumerate(lt):
    cellref = sheet.cell(row=next_row + index, column=col)
    cellref.value = i

del panda_xl_rd
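
The `+2` arithmetic is easy to get wrong, so here is a small worked example in plain Python (no pandas needed). It assumes one header row, so the first data value sits in Excel row 2, and the sample values are invented. It also shows that counting non-empty values only gives the right row when the column has no gaps; the position-based alternative is a suggestion, not part of the original answer.

```python
# Data values as they appear below the header: index 0 is Excel row 2.
column_values = [10, 20, None, 40]  # Excel rows 2..5, with row 4 blank

# Count-based formula from the answer: number of non-empty values + 2.
filled = [v for v in column_values if v is not None]
count_based_next_row = len(filled) + 2

# Position-based alternative: use the index of the last non-empty value.
last_pos = max(
    (i for i, v in enumerate(column_values) if v is not None),
    default=-1,
)
# +1 converts to 1-based, +1 skips the header, +1 moves to the next free row.
position_based_next_row = last_pos + 3

print(count_based_next_row)     # 5, which would overwrite the 40 in Excel row 5
print(position_based_next_row)  # 6
```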

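【Discussion】:

【Solution 4】:

Why not just take the length of column "C"? The result is the same output, 10: when you fetch column "C", its cells are returned as a tuple, so you only need the length of that tuple, which is 10.

import openpyxl

file = openpyxl.load_workbook('example.xlsx')

# get_sheet_by_name() is deprecated; index the workbook by sheet name instead
current_sheet = file['sheet1']

column_c = current_sheet['C']  # tuple of the cells in column C

# Note: the tuple spans rows 1 to current_sheet.max_row, so its length matches
# the last filled row of C only when no other column extends further down.
print(len(column_c))

file.close()

【Discussion】:

Please separate your explanation from the code, and provide a complete, properly formatted code block with a working solution. Thanks in advance.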

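【Solution 5】:

The accepted answer is not correct when there is an empty cell between two cells that have values; in that case it fails. The following is the correct approach.

import os
import openpyxl as xl

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
Dir_Name = os.path.join(BASE_DIR, 'Your_Project_Folder_Name_Here')
xl_file_path = os.path.join(Dir_Name, 'Your_Excel_File_Name_Here.xlsx')
wb_obj = xl.load_workbook(xl_file_path)
sheet_obj = wb_obj.active

number_of_rows = sheet_obj.max_row
last_row_index_with_data = 0  # stays 0 if the column has no data
# Walk upward from the sheet's last row; stop at row 1 (rows are 1-based)
while number_of_rows >= 1:
    # column 1 is "A"; use 3 to check column "C"
    if sheet_obj.cell(number_of_rows, 1).value is not None:
        last_row_index_with_data = number_of_rows
        break
    number_of_rows -= 1

print("last row index having values", last_row_index_with_data)

This way we check from the bottom of the sheet to the top; the first cell we find whose value is not None gives the row index we need.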
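
The bottom-up scan can also be wrapped in a reusable function that takes the column number as a parameter. The sketch below is an illustration under assumptions: `last_row_bottom_up` and `FakeSheet` are hypothetical names, and `FakeSheet` only mimics the two worksheet members the function relies on (`max_row` and `cell(row=..., column=...)`) so the example runs without an Excel file.

```python
from types import SimpleNamespace


def last_row_bottom_up(sheet, column):
    """Scan upward from sheet.max_row; return the last row with a value, or 0."""
    for row in range(sheet.max_row, 0, -1):
        if sheet.cell(row=row, column=column).value is not None:
            return row
    return 0


class FakeSheet:
    """Hypothetical stand-in exposing only max_row and cell(), like a worksheet."""

    def __init__(self, data):
        self._data = data  # {(row, column): value}
        self.max_row = max((r for r, _ in data), default=1)

    def cell(self, row, column):
        return SimpleNamespace(value=self._data.get((row, column)))


# Column A runs to row 12, column C only to row 10, column B is empty.
sheet = FakeSheet({(1, 1): "id", (12, 1): "x", (1, 3): "name", (10, 3): "last"})

print(last_row_bottom_up(sheet, column=3))  # 10
print(last_row_bottom_up(sheet, column=2))  # 0
```

Scanning from the bottom stops at the first hit, so it usually touches only the few trailing rows where the target column is shorter than the rest of the sheet.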

【Discussion】: