【问题标题】:Plotting the frequency associated with bigrams绘制与二元组相关的频率
【发布时间】:2018-08-24 09:38:25
【问题描述】:

我有一个数据集的每个二元组的频率。我需要按降序对其进行排序并可视化前 n 个二元组。这是我与每个二元组相关的频率

{('best', 'price'): 95, ('price', 'range'): 190, ('range', 'got'): 5, ('got', 'diwali'): 2, ('diwali', 'sale'): 2, ('sale', 'simply'): 1, ('simply', 'amazed'): 1, ('amazed', 'performance'): 1, ('performance', 'camera'): 30, ('camera', 'clarity'): 35, ('clarity', 'device'): 1, ('device', 'speed'): 1, ('speed', 'looks'): 1, ('looks', 'display'): 1, ('display', 'everything'): 2, ('everything', 'nice'): 5, ('nice', 'heats'): 2, ('heats', 'lot'): 14, ('lot', 'u'): 2, ('u', 'using'): 3, ('using', 'months'): 20, ('months', 'no'): 10, ('no', 'problems'): 8, ('problems', 'whatsoever'): 1, ('whatsoever', 'great'): 1

谁能帮我想象一下这些二元组?

【问题讨论】:

标签: python-3.x machine-learning nlp nltk


【解决方案1】:

如果我理解正确,这就是你需要的

import seaborn as sns
bg_dict = {('best', 'price'): 95, ('price', 'range'): 190, ('range', 'got'): 5, ('got', 'diwali'): 2, ('diwali', 'sale'): 2, ('sale', 'simply'): 1,
       ('simply', 'amazed'): 1, ('amazed', 'performance'): 1, ('performance', 'camera'): 30, ('camera', 'clarity'): 35, ('clarity', 'device'): 1,
       ('device', 'speed'): 1, ('speed', 'looks'): 1, ('looks', 'display'): 1, ('display', 'everything'): 2, ('everything', 'nice'): 5, ('nice', 'heats'): 2, ('heats', 'lot'): 14,
       ('lot', 'u'): 2, ('u', 'using'): 3, ('using', 'months'): 20, ('months', 'no'): 10, ('no', 'problems'): 8, ('problems', 'whatsoever'): 1, ('whatsoever', 'great'): 1}

bg_dict_sorted = sorted(bg_dict.items(), key=lambda kv: kv[1], reverse=True)
bg, counts = list(zip(*bg_dict_sorted))
bg_str = list(map(lambda x: '-'.join(x), bg))
sns.barplot(bg_str, counts)

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2022-01-27
    • 2011-08-18
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2013-06-04
    • 2021-07-20
    • 2013-06-29
    相关资源
    最近更新 更多