Create multiple dataframes with a for loop in Python: answers

【Question title】: Create multiple dataframes with for loop in python
【Posted】: 2022-01-26 14:18:22
【Question】:

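I need to compile grades from 10 files named quiz2, quiz3, [...], quiz11.

I apply the following transformations to each file:

Import the xls file into a df with pandas
Keep only 4 columns, renamed
If the same ID has multiple values, keep only the highest grade

The code for one dataframe is as follows: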

import pandas as pd

quiz2 = pd.read_excel(r'C:\Users\llarbodiere\Desktop\Perso\grade compil\quiz\quiz2.xls')
quiz2 = quiz2.rename({'Nom d’utilisateur': 'ID', 'Note totale': 'quiz2'}, axis='columns')
quiz2 = quiz2[['Nom', 'Prénom', 'ID', 'quiz2']]
# keep the highest grade per ID; assign the result, since groupby does not modify quiz2 in place
quiz2 = quiz2.groupby("ID").max().sort_values("Nom").fillna(0)

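I would like to apply the same transformations to every quiz from quiz2 to quiz11. I tried a for loop, but it did not work.

Thanks in advance!

【Comments】:

Show the for loop you tried, and we will help you fix it.

Tags: python pandas dataframe loops file

【Solution 1】:

You can generate the file names dynamically by looping over the range of numbers from 2 to 11 and concatenating each number with the file name prefix and the extension.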

import pandas as pd

# collect the dataframe produced by each loop iteration
frames = []

# loop through the numbers 2 to 11 (range excludes its upper bound)
for x in range(2, 12):

    # generate the file name
    file = 'quiz' + str(x) + '.xls'

    df = pd.read_excel('C:/Users/llarbodiere/Desktop/Perso/grade compil/quiz/' + file)
    df = df.rename({'Nom d’utilisateur': 'ID', 'Note totale': 'quiz'}, axis='columns')
    df = df[['Nom', 'Prénom', 'ID', 'quiz']]
    # assign the result: groupby/sort_values/fillna return a new dataframe
    df = df.groupby("ID").max().sort_values("Nom").fillna(0)

    # add the df active in the loop to the list
    frames.append(df)

# stack all quizzes into one cumulative dataframe
cumulative_df = pd.concat(frames)

print(cumulative_df)
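
The answer's loop puts every grade in a single quiz column, so the combined result does not show which quiz a grade came from. One possible variation, not part of the original answer, is to name the grade column after the quiz (as the question's own code does) and merge the quizzes side by side. The sketch below works on in-memory data so it runs without the Excel files. The helpers clean_quiz and compile_grades and the sample rows are hypothetical, and it assumes that Nom and Prénom are identical for every row sharing an ID.

```python
import pandas as pd

def clean_quiz(raw, quiz_name):
    """Keep the four renamed columns and the best grade per ID."""
    df = raw.rename(columns={'Nom d’utilisateur': 'ID', 'Note totale': quiz_name})
    df = df[['Nom', 'Prénom', 'ID', quiz_name]]
    # Highest grade first (missing grades sort last), then keep one row per ID
    df = df.sort_values(quiz_name, ascending=False).drop_duplicates('ID')
    return df.sort_values('Nom').fillna({quiz_name: 0}).reset_index(drop=True)

def compile_grades(frames):
    """Merge cleaned quizzes into one table with one grade column per quiz."""
    keys = ['ID', 'Nom', 'Prénom']
    compiled = frames[0]
    for df in frames[1:]:
        # An outer merge keeps students who missed some of the quizzes
        compiled = compiled.merge(df, on=keys, how='outer')
    quiz_cols = [c for c in compiled.columns if c not in keys]
    compiled[quiz_cols] = compiled[quiz_cols].fillna(0)
    return compiled

# Hypothetical sample data standing in for pd.read_excel(...) results
raw2 = pd.DataFrame({
    'Nom': ['Durand', 'Martin', 'Durand'],
    'Prénom': ['Alice', 'Bob', 'Alice'],
    'Nom d’utilisateur': ['adurand', 'bmartin', 'adurand'],
    'Note totale': [12.0, 15.0, 17.0],
})
raw3 = pd.DataFrame({
    'Nom': ['Martin'],
    'Prénom': ['Bob'],
    'Nom d’utilisateur': ['bmartin'],
    'Note totale': [9.0],
})

compiled = compile_grades([clean_quiz(raw2, 'quiz2'), clean_quiz(raw3, 'quiz3')])
print(compiled)
```

With the real files, each raw frame would come from pd.read_excel inside the loop, and clean_quiz would be called with 'quiz' + str(x) as the column name.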

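Edit: the example above works for the specific file names you mentioned. It could be generalized further, for example to all files in a given directory.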
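
A minimal sketch of that generalization, assuming the files keep the quiz<number>.xls naming pattern: pathlib lists the matching files, and the number is parsed out so that quiz10 sorts after quiz9. The helper name find_quiz_files is hypothetical and not from the original answer.

```python
import re
from pathlib import Path

def find_quiz_files(folder):
    """Return (quiz_name, path) pairs for every quiz<N>.xls in folder, ordered by N."""
    pattern = re.compile(r'quiz(\d+)\.xls$')
    found = []
    for path in Path(folder).glob('quiz*.xls'):
        match = pattern.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort numerically so quiz10 comes after quiz9, not right after quiz1
    return [(path.stem, path) for _, path in sorted(found)]

# Possible usage inside the answer's loop (reading .xls needs the xlrd package):
# for name, path in find_quiz_files('C:/Users/llarbodiere/Desktop/Perso/grade compil/quiz'):
#     df = pd.read_excel(path)
```

Parsing the number instead of sorting the file names as text avoids the ordering quiz10, quiz11, quiz2 that a plain alphabetical sort would give.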

【Comments】: