You can groupby and apply the fit within each group. First, set the index so you can avoid a merge later.
import pandas as pd
import numpy as np

df = df.set_index('Id')
# One degree-1 fit per Id; np.polyfit returns [slope, intercept]
df['fit'] = df.groupby('Id').apply(lambda x: np.polyfit(x.x, x.y, 1))
df is now:
x y fit
Id
1 0.79978 0.018255 [0.0067691538557680215, 0.01284116612923385]
1 1.19983 0.020963 [0.0067691538557680215, 0.01284116612923385]
2 2.39998 0.029006 [0.00999574968122608, 0.005016400680051043]
2 2.79995 0.033004 [0.00999574968122608, 0.005016400680051043]
3 1.79965 0.021489 [0.006761823817618233, 0.009320083766623343]
3 2.19969 0.024194 [0.006761823817618233, 0.009320083766623343]
...
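For anyone who wants to run this end to end, here is a minimal sketch of the same groupby fit. It assumes a frame built from the first six rows of the output above; the names `fits` and `g` are mine, not from the question.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the output shown above (Ids 1-3 only).
df = pd.DataFrame({
    'Id': [1, 1, 2, 2, 3, 3],
    'x': [0.79978, 1.19983, 2.39998, 2.79995, 1.79965, 2.19969],
    'y': [0.018255, 0.020963, 0.029006, 0.033004, 0.021489, 0.024194],
}).set_index('Id')

# One degree-1 fit per Id. np.polyfit returns the coefficients with the
# highest power first, i.e. [slope, intercept].
fits = df.groupby('Id').apply(lambda g: np.polyfit(g['x'], g['y'], 1))

# Assigning the per-Id Series aligns on the index, so every row of a
# group receives that group's coefficients.
df['fit'] = fits
print(df)
```

Since `fits` has one entry per Id and `df` is indexed by Id, the assignment broadcasts each group's result to all of its rows without an explicit merge.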
If you want separate columns for each part, you can apply pd.Series.
df[['slope', 'intercept']] = df.fit.apply(pd.Series)
df = df.drop(columns='fit')
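On larger frames, `apply(pd.Series)` can be slow because it builds one Series per row. A possible alternative is to stack the arrays with NumPy and slice the columns. The sketch below assumes a small hypothetical frame that already has a `fit` column shaped like the one above; `coeffs` is my own name.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that already holds a per-row 'fit' array,
# shaped like the output shown earlier (values rounded).
df = pd.DataFrame(
    {'x': [0.79978, 1.19983], 'y': [0.018255, 0.020963]},
    index=pd.Index([1, 1], name='Id'),
)
df['fit'] = [np.array([0.006769, 0.012841]), np.array([0.006769, 0.012841])]

# Turn the column of length-2 arrays into an (n_rows, 2) float array,
# then slice out one column per coefficient.
coeffs = np.array(df['fit'].tolist())
df['slope'] = coeffs[:, 0]
df['intercept'] = coeffs[:, 1]
df = df.drop(columns='fit')
print(df)
```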
Alternatively, starting from the initial DataFrame, have the apply return a pd.Series and concat the result.
# From initial DataFrame
df = df.set_index('Id')
res = df.groupby('Id').apply(lambda x: pd.Series(np.polyfit(x.x, x.y, 1),
                                                 index=['slope', 'intercept']))
df = pd.concat([df, res], axis=1)
df is now:
x y slope intercept
Id
1 0.79978 0.018255 0.006769 0.012841
1 1.19983 0.020963 0.006769 0.012841
2 2.39998 0.029006 0.009996 0.005016
2 2.79995 0.033004 0.009996 0.005016
3 1.79965 0.021489 0.006762 0.009320
3 2.19969 0.024194 0.006762 0.009320
4 1.19981 0.019338 0.007155 0.010753
4 1.59981 0.022200 0.007155 0.010753
5 1.79971 0.025629 0.007629 0.011898
5 2.19974 0.028681 0.007629 0.011898
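The concat has to align a duplicated Id index with the unique index of the per-group result, and that may raise if the rows are not already sorted by Id. As a hedged alternative, an index-on-index `join` should give the same columns regardless of row order. This is a sketch on a shuffled subset of the values above; `fit_line` and `out` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Same sample values as above, deliberately shuffled so Id is not sorted.
df = pd.DataFrame({
    'Id': [2, 1, 2, 1],
    'x': [2.39998, 0.79978, 2.79995, 1.19983],
    'y': [0.029006, 0.018255, 0.033004, 0.020963],
}).set_index('Id')

def fit_line(g):
    """Fit y = slope * x + intercept for one group."""
    slope, intercept = np.polyfit(g['x'], g['y'], 1)
    return pd.Series({'slope': slope, 'intercept': intercept})

# One row per Id, with 'slope' and 'intercept' columns.
res = df.groupby('Id').apply(fit_line)

# Left join on the index; duplicate Ids on the left side are fine.
out = df.join(res)
print(out)
```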