【Posted】: 2019-11-28 16:49:01
【Question】:
I currently have a DataFrame like the one below, and I need to reset the cumsum each time it passes a multiple of 1000 (e.g. 2000, 3000, and so on).
            Production  ID       cumsum
2017-10-19  1054        1323217  1054
2017-10-20  0           1323217  1054
2017-10-21  0           1323217  1054
2017-10-22  0           1323217  1054
2017-10-23  0           1323217  1054
For example, starting from the above, I need a df that looks like this:
            Production  ID       cumsum  adjCumsum  numberGenerated
2017-10-19  1054        1323217  1054    1000       1
2017-10-20  0           1323217  1054    54         0
2017-10-21  0           1323217  1054    54         0
2017-10-22  3054        1323217  4108    4000       4
2017-10-23  0           1323217  4108    108        0
2017-10-23  500         1323218  500     500        0
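The code below correctly resets the value every 1000, but I can't quite work out how to adapt it so that it groups by ID and rounds down to a multiple of 1000.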
maxvalue = 1000
lastvalue = 0
newcum = []
for row in df.iterrows():
    thisvalue = row[1]['cumsum'] + lastvalue
    if thisvalue > maxvalue:
        thisvalue = 0
    newcum.append( thisvalue )
    lastvalue = thisvalue
df['newcum'] = newcum
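As a rough illustration of the "group by ID and round down to 1000" part, here is a minimal sketch. It assumes the Production values from the desired-output table, leaves the dates out for brevity, and uses the hypothetical column names `floored` and `remainder`, which are not in the original frame.

```python
import pandas as pd

# Sample data modelled on the desired-output table (dates omitted for brevity).
df = pd.DataFrame(
    {
        "Production": [1054, 0, 0, 3054, 0, 500],
        "ID": [1323217, 1323217, 1323217, 1323217, 1323217, 1323218],
    }
)

thresh = 1000

# Running total that restarts for every ID.
df["cumsum"] = df.groupby("ID")["Production"].cumsum()

# Largest multiple of thresh that does not exceed the running total.
# "floored" is a hypothetical column name used only for this sketch.
df["floored"] = (df["cumsum"] // thresh) * thresh

# What is left over once the whole multiples of thresh are removed.
# "remainder" is likewise a hypothetical column name.
df["remainder"] = df["cumsum"] % thresh

print(df)
```

Because the running total is computed per ID, the row for ID 1323218 starts again from its own Production value instead of continuing the previous ID's total.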
Thanks to the answer below, I can now calculate the cumulative number generated, but what I need is the incremental number generated.
import numpy as np
df['cumsum'] = df.groupby('ID')['Production'].cumsum()
thresh = 1000
multiple = (df['cumsum'] // thresh )
mask = multiple.diff().ne(0)
df['numberGenerated'] = np.where(mask, multiple, 0)
df['adjCumsum'] = (df['numberGenerated'].mul(thresh)) + df['cumsum'] % thresh
df['cumsum2'] = df.groupby('ID')['numberGenerated'].cumsum()
My initial thinking was to try something similar to:
df['numGen1'] = df['cumsum2'].diff()
Final edit: tested and working. Thanks for your help.
I was overthinking it, below is how I was able to do it:
df['cumsum'] = df.groupby('ID')['Production'].cumsum()
thresh = 1000
multiple = (df['cumsum'] // thresh )
mask = multiple.diff().ne(0)
df['numberGenerated'] = np.where(mask, multiple, 0)
df['adjCumsum'] = (df['numberGenerated'].mul(thresh)) + df['cumsum'] % thresh
df['cumsum2'] = df.groupby('ID')['numberGenerated'].cumsum()
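numgen = []
for i in range(len(df)):
    # .iloc keeps the lookups positional, whatever index the frame has
    cur = df['cumsum'].iloc[i]
    # i > 0 guards the first row, which has no previous row to compare with
    same_id = i > 0 and df['ID'].iloc[i] == df['ID'].iloc[i-1]
    if cur >= thresh and same_id:
        # same ID as the previous row: count only the newly crossed multiples
        numgenv = (cur // thresh) - (df['cumsum'].iloc[i-1] // thresh)
    elif cur >= thresh:
        # first row of an ID: every multiple crossed so far is new
        numgenv = cur // thresh
    else:
        numgenv = 0
    numgen.append(numgenv)
df['numgen2.0'] = numgen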
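For what it's worth, the loop above can probably be written without iterating over rows. The following is only a sketch under the same assumptions as the question (a `Production` and an `ID` column, threshold of 1000); the sample values come from the desired-output table with the dates left out, and `multiple` and `numgen_vectorized` are hypothetical column names.

```python
import pandas as pd

# Sample data modelled on the desired-output table (dates omitted for brevity).
df = pd.DataFrame(
    {
        "Production": [1054, 0, 0, 3054, 0, 500],
        "ID": [1323217, 1323217, 1323217, 1323217, 1323217, 1323218],
    }
)

thresh = 1000

# Running total per ID, as in the question.
df["cumsum"] = df.groupby("ID")["Production"].cumsum()

# How many whole multiples of thresh the running total has reached so far.
df["multiple"] = df["cumsum"] // thresh

# The previous row's multiple within the same ID; the first row of each ID
# has no predecessor, so it is compared against 0.
prev_multiple = df.groupby("ID")["multiple"].shift(fill_value=0)

# Incremental count: multiples newly reached on this row.
df["numgen_vectorized"] = df["multiple"] - prev_multiple

print(df)
```

On 2017-10-22 the running total jumps from 1054 to 4108, so the multiple goes from 1 to 4 and the incremental count for that row is 3, while the cumulative count is 4.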
【Discussion】:
Tags: python python-3.x pandas