How to count specific words in each row of a pandas column in Python [duplicate]: answers

【Question title】: How to count specific words in each row of a column python pandas [duplicate]
【Posted】: 2020-07-28 18:58:49
【Question description】:

I have the following DataFrame in pandas:

import pandas as pd

test = pd.DataFrame({'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange', 'Cake Orange Cake', 'Tomato Apple Orange'], 'Type' : ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit', 'Dessert Fruit Dessert', 'Veggie Fruit Fruit']})
test

    Food                    Type
0   Apple Cake Apple        Fruit Dessert Fruit
1   Orange Tomato Cake      Fruit Veggie Dessert
2   Broccoli Apple Orange   Veggie Fruit Fruit
3   Cake Orange Cake        Dessert Fruit Dessert
4   Tomato Apple Orange     Veggie Fruit Fruit

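I want to create a new column that counts how often each value occurs in the 'Type' column of each row and lists those counts from largest to smallest, regardless of which food type each count belongs to. For example, this is exactly what I am looking for: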

test = pd.DataFrame({'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange', 'Cake Orange Cake', 'Tomato Apple Orange'],
                     'Type' : ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit', 'Dessert Fruit Dessert', 'Veggie Fruit Fruit'],
                     'Count': ['2 1', '1 1 1', '2 1', '2 1', '2 1']})
test

    Food                             Type          Count
0   Apple Cake Apple        Fruit Dessert Fruit     2 1
1   Orange Tomato Cake      Fruit Veggie Dessert    1 1 1
2   Broccoli Apple Orange   Veggie Fruit Fruit      2 1
3   Cake Orange Cake        Dessert Fruit Dessert   2 1
4   Tomato Apple Orange     Veggie Fruit Fruit      2 1

How can I do this? Thank you very much!

【Question discussion】:

Tags: python regex string pandas string-formatting

【Solution 1】:

IIUC

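# Split each 'Type' string into words, one word per row (the original row index is repeated)
s = test['Type'].str.split().explode()
# Count each word within its original row, sort the counts in descending order,
# then join the counts of each row back into a single string
s = (s.groupby([s.index, s]).size()
      .sort_values(ascending=False)
      .groupby(level=0)
      .agg(lambda x: ' '.join(x.astype(str))))
# Assign to the question's DataFrame (`test`); values align on the row index
test['Count'] = s
s
0      2 1
1    1 1 1
2      2 1
3      2 1
4      2 1
Name: Type, dtype: object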
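
For comparison, the same counts can also be sketched row by row with collections.Counter instead of explode and groupby. This is an alternative for illustration, not the answerer's method; the helper name count_words_desc is hypothetical, and the data is the three-word 'Type' column from the question's desired output.

```python
from collections import Counter

import pandas as pd

test = pd.DataFrame({
    'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange',
             'Cake Orange Cake', 'Tomato Apple Orange'],
    'Type': ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit',
             'Dessert Fruit Dessert', 'Veggie Fruit Fruit'],
})


def count_words_desc(text):
    """Hypothetical helper: word counts of one string, largest first."""
    counts = Counter(text.split())
    # Only the counts are kept; which word they belong to is discarded
    return ' '.join(str(n) for n in sorted(counts.values(), reverse=True))


test['Count'] = test['Type'].apply(count_words_desc)
print(test['Count'].tolist())
# ['2 1', '1 1 1', '2 1', '2 1', '2 1']
```

Because apply calls a Python function once per row, this is likely slower than the vectorised version on large frames, but it may be easier to read and debug.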

【Discussion】:

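This works fine, but I get the following message (not an error; the code works, I just get some text back): SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. See the caveats in the documentation: pandas.pydata.org/pandas-docs/stable/user_guide/… This is separate from the ipykernel package so we can avoid doing imports until.... Is that OK?
@bismo When you subset the DataFrame, do df=df.copy(); taking a copy avoids the copy warning.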
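
A minimal sketch of the .copy() advice from the comment above, assuming the warning arose because the DataFrame was itself a filtered subset of a larger one. The column names Keep and Words and the variable subset are hypothetical and appear nowhere in the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Fruit Dessert Fruit', 'Veggie Fruit Fruit', 'Dessert'],
    'Keep': [True, True, False],  # hypothetical filter column
})

# An explicit copy makes `subset` an independent DataFrame, so adding a
# column to it does not trigger SettingWithCopyWarning
subset = df[df['Keep']].copy()
subset['Words'] = subset['Type'].str.split().str.len()

print(subset['Words'].tolist())  # [3, 3]
print('Words' in df.columns)     # False: the original frame is untouched
```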