【Question title】: How to export pandas DataFrames to Excel files in an elegant function?
【Posted】: 2020-04-20 07:05:58
【Question】:

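I am trying to write a function that takes data and filenames as arguments, where the former is the data to be saved under the name of the latter. I would like to keep this inside a single function rather than resorting to a list comprehension, which is how I got OutWriter3 to work via test3.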

import numpy as np
import pandas as pd

data_a = np.random.randint(100, size=21)
data_b = np.random.randint(200, size=21)

def OutWriter1(data, filename):
    for d, f in zip(data, filename):
        out = pd.DataFrame(d)
        return out.to_excel(f, header=False, index=False)

def OutWriter2(data, filename):
    out = []
    for d, f in zip(data, filename):
        df = pd.DataFrame(d)
        out.append(df)
        out = pd.DataFrame(out)
        return out.to_excel(f, header=False, index=False)

def OutWriter3(data, filename):
    out = pd.DataFrame(data)
    return out.to_excel(filename, header=False, index=False)

test1 = OutWriter1([data_a, data_b], ['data_a_1.xlsx', 'data_b_1.xlsx'])
test2 = OutWriter2([data_a, data_b], ['data_a_2.xlsx', 'data_b_2.xlsx'])
test3 = [OutWriter3(i, j) for i, j in zip([data_a, data_b], ['data_a_3.xlsx', 'data_b_3.xlsx'])]

With OutWriter1, data_a_1.xlsx is correct but data_b_1.xlsx does not exist. With OutWriter2, data_a_2.xlsx is completely wrong and data_b_2.xlsx does not exist either. However, data_a_3.xlsx and data_b_3.xlsx are correct.
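
A likely explanation for these results: in OutWriter1 the `return` sits inside the `for` loop, so the function exits after the first (data, filename) pair and the second file is never written. OutWriter2 also returns during the first iteration, after wrapping a list of DataFrames in another DataFrame, which presumably explains the wrong contents. A minimal sketch of the loop with the early return removed (the name `write_each` is hypothetical, not from the original post):

```python
import pandas as pd

def write_each(data, filenames):
    """Write each item of `data` to the matching path in `filenames`."""
    for d, f in zip(data, filenames):
        # No return inside the loop, so every (data, filename) pair is processed.
        pd.DataFrame(d).to_excel(f, header=False, index=False)
```

Calling `write_each([data_a, data_b], ['data_a_1.xlsx', 'data_b_1.xlsx'])` should then produce both files. Writing .xlsx files requires an Excel engine such as openpyxl to be installed.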

Inspired by another question, I also tried to save data_a and data_b as worksheets in a single Excel file, but without luck (AttributeError: 'list' object has no attribute 'write').

def OutWriter4(data, filename):
    data = pd.DataFrame(data)
    with pd.ExcelWriter(filename) as writer:
        for n, df in enumerate(data):
            df.to_excel(writer, 'sheet%s' % n)
        writer.save()

test4 = OutWriter4([data_a, data_b], ['data_a_4.xlsx', 'data_b_3.xlsx'])
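  • Is there an elegant way to create a function that, given data and filenames, creates the Excel files?
  • Is there also an elegant way to create a function that writes different data to specified worksheets in a single Excel file?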
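
Regarding OutWriter4, three things probably go wrong: `pd.ExcelWriter` expects a single path but receives a list of filenames (the likely source of the AttributeError), iterating over a DataFrame yields its column labels rather than separate frames, and the `with` block already saves the workbook on exit, so the explicit `writer.save()` is not needed. A hedged sketch of a single-workbook variant (the name `write_sheets` is hypothetical):

```python
import pandas as pd

def write_sheets(data, filename):
    """Write each item of `data` to its own sheet in one workbook."""
    with pd.ExcelWriter(filename) as writer:  # a single path, not a list
        for n, d in enumerate(data):
            # Convert each item separately so that each gets its own sheet.
            pd.DataFrame(d).to_excel(
                writer, sheet_name=f"sheet{n}", header=False, index=False
            )
```

For example, `write_sheets([data_a, data_b], 'data_4.xlsx')` would be expected to create one file with the sheets sheet0 and sheet1.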

【Comments】:

    Tags: python pandas


    【Answer 1】:

    Solution

    You can use the following code to write multiple dataframes to a single Excel file, or to write each dataframe to its own Excel file.

    target = 'single_file.xlsx'
    targets = ['mult_1.xlsx', 'mult_2.xlsx', 'mult_3.xlsx']
    # df1 = pd.DataFrame(data_a)
    # df2 = pd.DataFrame(data_b)
    # df3 = pd.DataFrame(data_c)
    # dfs = [pd.DataFrame(x) for x in [data_a, data_b]] # >> in your case
    dfs = [df1, df2, df3]
    
    # to write to a single file
    write_to_excel(targets = target, dfs = dfs, verbose=1)
    
    # to write to multiple files
    write_to_excel(targets = targets, dfs = dfs, verbose=1)
    

    Custom function

    def write_to_excel(targets, dfs: list, verbose=1):
        """Writes a single or multiple dataframes to either a single or multiple excel files.
    
        targets: str or list of str --> path(s) excel files
        dfs: list of dataframes
        Example
        -------
        target = 'single_file.xlsx'
        targets = ['mult_1.xlsx', 'mult_2.xlsx', 'mult_3.xlsx']
        dfs = [df1, df2, df3]
    
        # to write to a single file
        write_to_excel(targets = target, dfs = dfs, verbose=1)
    
        # to write to multiple files
        write_to_excel(targets = targets, dfs = dfs, verbose=1)
        """
        if not isinstance(targets, list):
            targets = [targets]
        writer_scheme = 'single' if len(targets) == 1 else 'multi'
        if verbose>=1:
            print(f'excel-writer-scheme: {writer_scheme}-file(s)')
        if writer_scheme == 'single':
            with pd.ExcelWriter(targets[0]) as writer:
                for i, df in enumerate(dfs):  
                    df.to_excel(writer, sheet_name=f'Sheet_{i+1}')
        else:
            for df, target in zip(dfs, targets):
                df.to_excel(target, sheet_name='data')
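
For the second question (writing different data to specified worksheets), one possible variation is to pass a dict that maps sheet names to data. This is a sketch and not part of the original answer; `write_named_sheets` and its `sheets` parameter are hypothetical names:

```python
import pandas as pd

def write_named_sheets(sheets, filename):
    """Write each {sheet_name: data} entry to its own sheet of one workbook."""
    with pd.ExcelWriter(filename) as writer:
        for name, data in sheets.items():
            # pd.DataFrame() accepts NumPy arrays, lists, or existing DataFrames.
            pd.DataFrame(data).to_excel(writer, sheet_name=name, index=False)
```

A call such as `write_named_sheets({'a': data_a, 'b': data_b}, 'single_file.xlsx')` would put each array on the sheet named by its key.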
    

    References

    1. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html

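    【Discussion】:

    • @naughty_waves Please consider accepting the answer if it helped.
    • Thank you, @CypherX. I have a question about the custom function. First, I tried running it with my numpy arrays (data_a, data_b), but that did not work. Second, I decided to try converting them to dataframes (df_a=data_a, df_b=data_b) and passing them as test5=write_to_excel([data_a_5.xlsx, data_b_5.xlsx], [df_a, df_b]), but that did not work either. AttributeError: 'list' object has no attribute 'write' came up both times. I am sure the mistake is on my side. Do you know what I am doing wrong?
    • Use this: dfs = [pd.DataFrame(x) for x in [data_a, data_b]], then pass this dfs to the function. You need a list of dataframes. Originally, data_a and data_b are numpy arrays, so you have to convert them to dataframes as I have shown here.
    • I made a mistake in my previous comment. I meant that I converted them by writing df_a=pd.DataFrame(data_a) and df_b=pd.DataFrame(data_b). It works now, but I had to change len(target) to len(targets). Thanks for helping me, @CypherX!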
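
The conversion discussed in these comments, turning each NumPy array into a DataFrame before calling write_to_excel, can be folded into a small helper. This is only a sketch; the name `as_frames` is hypothetical and not part of the original answer:

```python
import numpy as np
import pandas as pd

def as_frames(items):
    """Return a list of DataFrames, converting array-likes where needed."""
    return [x if isinstance(x, pd.DataFrame) else pd.DataFrame(x) for x in items]

data_a = np.random.randint(100, size=21)
data_b = np.random.randint(200, size=21)
dfs = as_frames([data_a, data_b])  # ready to pass as the `dfs` argument
```

Items that are already DataFrames are passed through unchanged, so the helper can be applied to mixed input.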