Python Pandas - Slice DataFrame based on Another Table's Values to Match to Column Name: Answers

【Question Title】: Python Pandas - Slice DataFrame based on Another Table's Values to Match to Column Name
【Posted】: 2021-01-26 21:57:52
【Question】:

I have two DataFrames, df_stats and df_ratings.

df_stats looks like this:

	Fruit	Rating_Threshold_Low	Rating_Threshold_High
1	Apple	4	7
2	Banana	5	9
3	Kiwi	6	8

df_ratings looks like this (the first column is Fruit, and each subsequent column represents a rating):

	Fruit	1	2	3	4	5	6	7	8	9	10
1	Apple	2	4	7	13	2	0	16	1	9	22
2	Banana	6	4	2	1	8	7	5	3	9	0
3	Kiwi	21	4	3	6	8	9	9	8	7	5

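My goal is to get, for each fruit, the sum of the ratings that fall within that fruit's rating thresholds (the thresholds differ per fruit). In other words, I want to add a column Rating_Threshold_Sum to df_stats that sums the values in df_ratings whose rating lies within the thresholds. For example, for Apple the thresholds are 4 and 7 (inclusive), so Rating_Threshold_Sum would be 13+2+0+16 = 31.

So df_stats would gain a Rating_Threshold_Sum column: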

	Fruit	Rating_Threshold_Low	Rating_Threshold_High	Rating_Threshold_Sum
1	Apple	4	7	31
2	Banana	5	9	32
3	Kiwi	6	8	26

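I'm not sure how to do this. I know I probably have to use df.apply with a custom function, or loop over each row, but beyond that I'm unsure of the best approach. Any suggestions or pointers would be greatly appreciated. Thanks!

【Question Comments】:

Tags: python pandas dataframe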

【Solution 1】:

You can do it like this:

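sums = []
for i in range(len(df_stats)):
    # Thresholds for the i-th fruit (.values is an attribute, not a method)
    min_v = df_stats["Rating_Threshold_Low"].values[i]
    max_v = df_stats["Rating_Threshold_High"].values[i]
    values = []
    for z in range(min_v, max_v + 1):
        # Assumes the rating columns are labeled with the strings "1".."10"
        # and that both frames list the fruits in the same row order;
        # .iloc selects by position, so the 1-based index does not matter
        x = df_ratings[str(z)].iloc[i]
        values.append(x)
    sums.append(sum(values))
df_stats["Rating_Threshold_Sum"] = sums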

This is rather convoluted and there is probably a better way to do it, but it should work.

【Comments】:
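As one possible alternative to the nested loop, the same sums can be computed with a boolean mask built by NumPy broadcasting. This is only a sketch under assumptions not stated in the question: the rating columns are assumed to be labeled with the strings "1" through "10", and each fruit is assumed to appear exactly once in both frames. The sample frames are rebuilt from the tables in the question so the snippet runs on its own.

```python
import pandas as pd

# Sample frames rebuilt from the question.
# Assumption: rating column labels are the strings "1".."10".
df_stats = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Kiwi"],
    "Rating_Threshold_Low": [4, 5, 6],
    "Rating_Threshold_High": [7, 9, 8],
})
df_ratings = pd.DataFrame(
    [["Apple", 2, 4, 7, 13, 2, 0, 16, 1, 9, 22],
     ["Banana", 6, 4, 2, 1, 8, 7, 5, 3, 9, 0],
     ["Kiwi", 21, 4, 3, 6, 8, 9, 9, 8, 7, 5]],
    columns=["Fruit"] + [str(r) for r in range(1, 11)],
)

# Align the ratings to the row order of df_stats by fruit name,
# so the result does not depend on both frames sharing the same order.
ratings = df_ratings.set_index("Fruit").reindex(df_stats["Fruit"])

# Numeric rating represented by each column, shape (n_ratings,).
levels = ratings.columns.astype(int).to_numpy()

# Thresholds as column vectors, shape (n_fruits, 1), so they broadcast
# against `levels` into an (n_fruits, n_ratings) grid.
low = df_stats["Rating_Threshold_Low"].to_numpy()[:, None]
high = df_stats["Rating_Threshold_High"].to_numpy()[:, None]

# mask[i, j] is True when rating j lies in fruit i's inclusive range.
mask = (levels >= low) & (levels <= high)

# Zero out the counts outside each range, then sum across each row.
df_stats["Rating_Threshold_Sum"] = (ratings.to_numpy() * mask).sum(axis=1)
print(df_stats)
```

On the sample data this reproduces the expected column from the question (31, 32, 26). If the rating columns are integers rather than strings, the `astype(int)` step is harmless and the rest is unchanged.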