【Question title】:Python, Pandas: create new data from a data frame
【Posted】:2018-11-13 11:58:30
【Question】:

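The original spreadsheets have 2 columns. I want to select rows that match a given criterion (the month) and put them into new files.

The original file looks like this: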

The code I am using:

import os
import pandas as pd

working_folder = "C:\\My Documents\\"

file_list = ["Jan.xlsx", "Feb.xlsx", "Mar.xlsx"]

with open(working_folder + '201703-1.csv', 'a') as f03:
    for fl in file_list:
        df = pd.read_excel(working_folder + fl)
        df_201703 = df[df.ARRIVAL.between(20170301, 20170331)] 
        df_201703.to_csv(f03, header = True)

with open(working_folder + '201702-1.csv', 'a') as f02:
    for fl in file_list:
        df = pd.read_excel(working_folder + fl)
        df_201702 = df[df.ARRIVAL.between(20170201, 20170231)] 
        df_201702.to_csv(f02, header = True)

with open(working_folder + '201701-1.csv', 'a') as f01:
    for fl in file_list:
        df = pd.read_excel(working_folder + fl)
        df_201701 = df[df.ARRIVAL.between(20170101, 20170131)] 
        df_201701.to_csv(f01, header = True)

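The result looks like this:

Improvements I would like to make:

  1. Save the results as .xlsx files instead of .csv
  2. Drop the first (index) column
  3. Keep only 1 header row at the top (currently each csv has 3 header rows)

How can I do this? Thank you.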

【Question comments】:

    Tags: python pandas dataframe


    【Solution 1】:

    I think you need to build a list of DataFrames, concat them together, and then write the result to a file once:

    dfs1 = []
    
    for fl in file_list:
        df = pd.read_excel(working_folder + fl)
        dfs1.append(df[df.ARRIVAL.between(20170101, 20170131)])
    
    pd.concat(dfs1).to_excel('201701-1.xlsx', index = False)
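
To see why concatenating first takes care of points 2 and 3 in the question, here is a minimal sketch. It uses made-up in-memory frames in place of the Excel files (the `NAME` column and all values are assumptions for illustration) and writes to a `StringIO` buffer with `to_csv` so it runs without any files; `to_excel` accepts the same `index=False` argument.

```python
import io
import pandas as pd

# Made-up stand-ins for the frames read from Jan.xlsx and Feb.xlsx.
# The NAME column is hypothetical; only ARRIVAL appears in the question.
jan = pd.DataFrame({"ARRIVAL": [20170105, 20170120], "NAME": ["a", "b"]})
feb = pd.DataFrame({"ARRIVAL": [20170110, 20170215], "NAME": ["c", "d"]})

# Filter each frame for January arrivals, then concatenate once.
parts = [df[df.ARRIVAL.between(20170101, 20170131)] for df in (jan, feb)]
combined = pd.concat(parts)

# A single write produces a single header row; index=False drops the index column.
buffer = io.StringIO()
combined.to_csv(buffer, index=False)
lines = buffer.getvalue().splitlines()
print(lines)
```

Because the header is written once for the combined frame, rather than once per input file as in the question's loop, the repeated header rows disappear.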
    

    A list comprehension should simplify this:

    file_list = ["Jan.xlsx", "Feb.xlsx", "Mar.xlsx"]
    dfs1 = [pd.read_excel(working_folder + fl).query('20170101 <= ARRIVAL <= 20170131') for fl in file_list]
    
    pd.concat(dfs1).to_excel('201701-1.xlsx', index = False)
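
The same idea can be extended to all three months without repeating the block per month. The following is only a sketch, not part of the original answer: the helper names `split_by_month` and `write_monthly_files` are hypothetical, and it assumes `ARRIVAL` holds integers of the form yyyymmdd, as the `between(20170101, 20170131)` filter in the question suggests.

```python
import pandas as pd


def split_by_month(frames, months, column="ARRIVAL"):
    """Hypothetical helper: concatenate frames and group rows by yyyymm."""
    combined = pd.concat(frames, ignore_index=True)
    # Assumes integer dates such as 20170115; integer division by 100
    # drops the day and leaves the yyyymm part (201701).
    month_key = combined[column] // 100
    return {m: combined[month_key == m] for m in months}


def write_monthly_files(working_folder, file_list, months):
    """Hypothetical helper: read each workbook once, write one .xlsx per month."""
    frames = [pd.read_excel(working_folder + fl) for fl in file_list]
    for month, df in split_by_month(frames, months).items():
        # index=False omits the index column; each file gets one header row.
        df.to_excel(f"{working_folder}{month}-1.xlsx", index=False)
```

With the question's variables, a call such as `write_monthly_files(working_folder, file_list, [201701, 201702, 201703])` would read each workbook once and produce `201701-1.xlsx`, `201702-1.xlsx` and `201703-1.xlsx` in the working folder.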
    

    【Comments】:
