Answers: modifying NaN positions in a dataframe

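[Question title]: Modifying NaN positions in the dataframe
[Posted]: 2022-08-17 00:33:43
[Question description]:

I hope I can explain this well. I have this df with 2 columns: group and numbers. I am trying to take the np.nan, which currently sits in its own group (-1), and move it into the group where it belongs based on a new value.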

import numpy as np
import pandas as pd


def check_for_nan():
    # for example, let's say my new value is 14.5
    new_nan_value = 14.5
    data = {
        "group": [-1, 0, 1, 2, 3],
        "numbers": [[np.nan], [11, 12], [14, 15], [16, 17], [18, 19]],
    }
    df = pd.DataFrame(data=data)

    # *** add some code ***

    # I created a new dataframe to show what the result should look like,
    # but we want to operate only on the df from above
    data_2 = {
        "group": [0, 1, 2, 3],
        "numbers": [[11, 12], [14, np.nan, 15], [16, 17], [18, 19]],
    }
    df_2 = pd.DataFrame(data=data_2)
    # should return the new group number where the nan would live
    return data_2["group"][1]

Output:

Current:

   group   numbers
0     -1     [nan]
1      0  [11, 12]
2      1  [14, 15]
3      2  [16, 17]
4      3  [18, 19]

Desired output when new_nan_value = 14.5:

   group        numbers
0      0       [11, 12]
1      1  [14, nan, 15]
2      2       [16, 17]
3      3       [18, 19]

return 1

Tags: python arrays pandas dataframe nan

[Solution 1]:

With the dataframe you provided, here is one way to do it:

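import numpy as np
import pandas as pd


def move_nan(df, new_nan_value):
    """Move the NaN into the group where new_nan_value would fall.

    Args:
        df: input dataframe with "group" and "numbers" columns.
        new_nan_value: value that determines the insertion point.

    Returns:
        Dataframe with the NaN placed at the insertion point.

    """

    # Flatten to one number per row and drop the original NaN row
    df = df.explode("numbers").dropna().reset_index(drop=True)

    # Insert a NaN row right after the last number smaller than new_nan_value
    # (assumes at least one number is smaller and at least one is larger)
    insert_pos = df.loc[df["numbers"] < new_nan_value, "numbers"].index[-1] + 1
    df = pd.concat(
        [
            df.loc[: insert_pos - 1, :],
            pd.DataFrame({"group": [np.nan], "numbers": [np.nan]}, index=[insert_pos]),
            df.loc[insert_pos:, :],
        ]
    )
    # The new row takes the group of the row that follows it
    df["group"] = df["group"].bfill().astype(int)

    # Collapse back to one list of numbers per group
    return df.groupby("group").agg(list).reset_index(drop=False)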

So that:

print(move_nan(df, 14.5))
# Output
   group        numbers
0      0       [11, 12]
1      1  [14, nan, 15]
2      2       [16, 17]
3      3       [18, 19]
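
The question also asks for the group number where the NaN ends up, which the function above does not return. A minimal sketch of one way to look it up, assuming the result has the group/numbers layout printed above. Note that find_nan_group is a hypothetical helper name, not part of the original answer:

```python
import numpy as np
import pandas as pd


def find_nan_group(df):
    """Return the label of the first group whose 'numbers' list holds a NaN.

    Hypothetical helper: assumes 'group' and 'numbers' columns and that
    at least one list contains a NaN.
    """
    has_nan = df["numbers"].apply(lambda nums: any(pd.isna(n) for n in nums))
    return df.loc[has_nan, "group"].iloc[0]


# Same shape as the desired output shown in the question
result = pd.DataFrame(
    {
        "group": [0, 1, 2, 3],
        "numbers": [[11, 12], [14, np.nan, 15], [16, 17], [18, 19]],
    }
)
print(find_nan_group(result))  # prints 1
```

If no list contains a NaN, .iloc[0] raises an IndexError, so real code would probably want to check for that case first.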

[Discussion]:
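
Since the question wants to operate on the same df and get the group number back, a variant that skips the explode/groupby round trip might also work. This is only a sketch under the same assumptions as the answer (each list is sorted, and at least one number is larger than the new value). The name move_nan_in_place is made up for illustration:

```python
import numpy as np
import pandas as pd


def move_nan_in_place(df, new_nan_value):
    """Hypothetical variant: edit df itself and return the NaN's new group."""
    # Drop the placeholder row(s) whose list holds nothing but NaN
    only_nan = df["numbers"].apply(lambda nums: all(pd.isna(n) for n in nums))
    df.drop(index=df.index[only_nan], inplace=True)
    df.reset_index(drop=True, inplace=True)

    # Find the first row whose largest number exceeds the new value
    for row, nums in enumerate(df["numbers"]):
        if max(nums) > new_nan_value:
            # Put the NaN right after the numbers smaller than the new value
            pos = sum(n < new_nan_value for n in nums)
            nums.insert(pos, np.nan)
            return df.loc[row, "group"]
    raise ValueError("new_nan_value is larger than every number in df")


df = pd.DataFrame(
    {
        "group": [-1, 0, 1, 2, 3],
        "numbers": [[np.nan], [11, 12], [14, 15], [16, 17], [18, 19]],
    }
)
new_group = move_nan_in_place(df, 14.5)
print(new_group)  # prints 1
print(df)
```

The lists are mutated in place, so any other reference to the same list objects sees the inserted NaN as well.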