python - How to create a dictionary containing many DataFrames using glob.glob

[Question title]: python - how to create a dictionary with a lot of dataframes using glob.glob
[Posted]: 2019-10-17 21:50:00
[Question description]:

I created a dictionary containing a set of DataFrames with the following code:

import pandas as pd

files = ('auction_aggregated_curves_germany_austria_20100101.csv', 'auction_aggregated_curves_germany_austria_20100102.csv', 'auction_aggregated_curves_germany_austria_20100103.csv', 'auction_aggregated_curves_germany_austria_20100104.csv', 'auction_aggregated_curves_germany_austria_20100105.csv')

dfs = ('df1', 'df2', 'df3', 'df4', 'df5')

# Despite its name, list_of_dfs is a dict mapping each label to its DataFrame.
list_of_dfs = {}
for df, file in zip(dfs, files):
    list_of_dfs[df] = pd.read_csv(file, skiprows=1)

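However, I would like to know whether there is a simpler way, using glob.iglob, to load a set of CSV files whose names differ only in the trailing digits (a date giving the year, month, and day). I have more than 365 files, so it would be very helpful if someone could help me avoid typing out every file name.

Thanks in advance.

[Question comments]:

Tags: python loops dataframe dictionary glob

[Solution 1]:

You can use the pathlib module for this. Its Path objects include a glob method.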

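from pathlib import Path

import pandas as pd

dataframes = {}

csv_root = Path(".")  # directory that holds the CSV files

for csv_path in csv_root.glob("*.csv"):
    key = csv_path.stem  # the filename without the ".csv" extension
    # Pass skiprows=1 here if, as in the question, the first row should be skipped.
    dataframes[key] = pd.read_csv(csv_path)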

Running this code against your example data, the dataframes dict will look like this:

dataframes == {
    "auction_aggregated_curves_germany_austria_20100101": <DataFrame(...)>,
    "auction_aggregated_curves_germany_austria_20100102": <DataFrame(...)>,
    # etc...
}
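
Since the question asks specifically about glob.iglob, here is a minimal sketch of the same idea using the glob module instead of pathlib. The function name load_csvs, the reader parameter, and the choice of the trailing date as the dictionary key are assumptions made for illustration; none of them come from the original answer. Sorting the matches is also an addition, because glob does not guarantee any particular order.

```python
import glob
import os


def load_csvs(pattern, reader=None, **reader_kwargs):
    """Return {date_suffix: loaded_file} for every file matching pattern.

    load_csvs and its reader parameter are hypothetical helpers for this
    sketch. reader defaults to pandas.read_csv; any callable that accepts
    a path plus keyword arguments can be passed instead.
    """
    if reader is None:
        import pandas as pd  # imported lazily; only needed for the default reader
        reader = pd.read_csv

    frames = {}
    # iglob yields matches lazily; sorted() collects them into a deterministic
    # order, which is chronological for YYYYMMDD suffixes sharing one prefix.
    for path in sorted(glob.iglob(pattern)):
        stem = os.path.splitext(os.path.basename(path))[0]
        key = stem.rsplit("_", 1)[-1]  # text after the last "_", e.g. "20100101"
        frames[key] = reader(path, **reader_kwargs)
    return frames
```

With the files from the question, a call such as load_csvs("auction_aggregated_curves_germany_austria_*.csv", skiprows=1) should produce a dict keyed by "20100101", "20100102", and so on, with skiprows=1 forwarded to pd.read_csv as in the original loop. Keying by date assumes each date appears only once among the matched files; otherwise later files would overwrite earlier ones.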

[Comments]: