【Question title】:Pandas: convert for loop with if/else conditions into apply method (lambda function)
【Posted】:2021-04-04 21:55:21
【Question】:

I have the following for-loop function:

def add_CQI_iterrows(df):
    previous_row = df['Date'].astype(str)[0]
    CQI_index = 0
    series = []

    for index, row in df.iterrows():
        if row['Date'] == previous_row:       
            previous_row = row['Date']
            print(CQI_index)
        else:
            CQI_index += 1
            previous_row = row['Date']      
        series.append(CQI_index)
    df['CQI'] = series
    
    return df

I would like to find a way to convert this for loop into an apply method. Something like this (which does not work):

def add_CQI_apply(df):
    previous_row = df['Date'].astype(str)[0]
    CQI_index = 1
    series = []
    
    df['CQI'] = df.apply(lambda row: previous_row = row['Date'] if row['Date'] == previous_row else CQI_index += 1 and previous_row = row['Date'], axis=1)
    
    return df

I want to do this conversion because I would like to see how fast the apply method is, and whether the apply method can be vectorized over a pandas Series.

Here is my data (data.json):

[
 {
   "Date": "9/20/2020 8:50",
   "UE": 1
 },
 {
   "Date": "9/20/2020 8:50",
   "UE": 2
 },
 {
   "Date": "9/20/2020 8:50",
   "UE": 3
 },
 {
   "Date": "9/20/2020 8:57",
   "UE": 1
 },
 {
   "Date": "9/20/2020 8:57",
   "UE": 8
 },
 {
   "Date": "9/20/2020 8:57",
   "UE": 2
 },
 {
   "Date": "9/20/2020 9:12",
   "UE": 1
 },
 {
   "Date": "9/20/2020 9:12",
   "UE": 5
 },
 {
   "Date": "9/20/2020 9:12",
   "UE": 3
 },
 {
   "Date": "9/20/2020 9:20",
   "UE": 1
 },
 {
   "Date": "9/20/2020 9:20",
   "UE": 4
 },
 {
   "Date": "9/20/2020 9:20",
   "UE": 3
 }
]

And finally, the function that loads this data:

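import numpy as np
import pandas as pd

def upload_data(file):
    df = pd.read_json(file)
    # The sample dates look like "9/20/2020 8:50", i.e. month/day/year hour:minute
    df['Date'] = pd.to_datetime(df['Date'], format="%m/%d/%Y %H:%M")
    df['CQI'] = np.nan  # placeholder column, filled in later
    return df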

【Question comments】:

    Tags: python pandas for-loop vectorization apply


    【Solution 1】:

    df['CQI'] = (df['Date'] != df['Date'].shift()).cumsum()

    In [120]: (df['Date'] != df['Date'].shift()).cumsum()
    Out[120]:
    0     1
    1     1
    2     1
    3     2
    4     2
    5     2
    6     3
    7     3
    8     3
    9     4
    10    4
    11    4
    Name: Date, dtype: int64
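
The one-liner above works because comparing each `Date` with the row above it (`shift()`) flags every row where the value changes, and `cumsum()` turns those flags into a running block number. The sketch below rebuilds the question's sample dates inline instead of reading data.json, and parses them with a format string inferred from how the sample looks. It also shows one possible way to get the same result through `Series.apply`, using a closure that carries state between calls. The helper `make_counter` and the column name `CQI_apply` are made up for illustration, and this variant relies on `apply` visiting the rows in order, so treat it as a comparison point rather than a recommended pattern.

```python
import pandas as pd

# Sample data mirroring the question's data.json (only the Date column matters here).
dates = (
    ["9/20/2020 8:50"] * 3
    + ["9/20/2020 8:57"] * 3
    + ["9/20/2020 9:12"] * 3
    + ["9/20/2020 9:20"] * 3
)
# Assumption: month/day/year hour:minute, inferred from the sample values.
df = pd.DataFrame({"Date": pd.to_datetime(dates, format="%m/%d/%Y %H:%M")})

# True wherever a row's Date differs from the row above it.
# The first row is compared against NaT, so it always counts as a change.
changed = df["Date"] != df["Date"].shift()

# The running total of those flags numbers each block of equal dates: 1, 2, 3, ...
df["CQI"] = changed.cumsum()


def make_counter():
    """Hypothetical helper: returns a stateful callable for Series.apply."""
    state = {"previous": None, "index": 0}

    def step(value):
        # Bump the counter whenever the value differs from the previous one.
        if value != state["previous"]:
            state["index"] += 1
            state["previous"] = value
        return state["index"]

    return step


# Still a Python-level loop under the hood; shown only for comparison.
df["CQI_apply"] = df["Date"].apply(make_counter())
```

Both columns should agree on this data. The `shift`/`cumsum` version avoids a Python-level call per row, which is generally why it is preferred over `apply` for this kind of run numbering.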
    

    【Comments】:
