[Question title]: Wordcloud Python with generate_from_frequencies
[Posted]: 2017-08-19 23:38:57
[Question description]:

I am trying to create a word cloud from a CSV file. As an example, the CSV file has the following structure:

a,1
b,2
c,4
j,20

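The real file has more rows, roughly 1,800. The first column holds string values (names) and the second column holds their respective frequencies (int). The file is then read and each key-value row is stored in a dictionary (d), which is used later to draw the word cloud: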

reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = v

Once the dictionary is filled with values, I try to draw the word cloud:

#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

But an error is thrown:

    Traceback (most recent call last):
    File ".........../script.py", line 19, in <module>
    wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
    File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line  360, in generate_from_frequencies
    for word, freq in frequencies]
    File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line 360, in <listcomp>
    for word, freq in frequencies]
    TypeError: unsupported operand type(s) for /: 'str' and 'float'

Finally, the documentation says:

def generate_from_frequencies(self, frequencies, max_font_size=None):
    """Create a word_cloud from words and frequencies.
    Parameters
    ----------
    frequencies : dict from string to float
        A contains words and associated frequency.
    max_font_size : int
        Use this font-size instead of self.max_font_size
    Returns
    -------
    self
    """

So I don't understand why it throws this error when I have met the function's requirements. I hope someone can help me, thanks.

**Note**

I am using wordcloud 1.3.1.

[Question comments]:

    Tags: python python-3.x csv word-cloud


    [Solution 1]:

    This happens because the values in your dictionary are strings, but wordcloud expects integers or floats.

    After running your code and inspecting your dictionary d, I get the following.

    In [12]: d
    
    Out[12]: {'a': '1', 'b': '2', 'c': '4', 'j': '20'}
    

    Note the ' ' around the numbers, which shows that these values are indeed strings.
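The observation above can be checked without the original file. The following is a minimal sketch, assuming a small in-memory sample (`sample` is a hypothetical stand-in for `namesDFtoCSV`); it shows that `csv.reader` hands back every field as a `str` and performs no type conversion.

```python
import csv
import io

# Hypothetical in-memory stand-in for the question's CSV file.
sample = io.StringIO("a,1\nb,2\nc,4\nj,20\n")

d = {}
for k, v in csv.reader(sample):
    d[k] = v  # v is still a str here; csv.reader does no type conversion

# Collect the distinct type names of the dictionary values.
value_types = {type(v).__name__ for v in d.values()}
print(d)
print(value_types)
```

If `value_types` contains `str`, the dictionary needs converting before it is passed to `generate_from_frequencies`.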

    A hacky way to fix this is to cast v to an int inside your for loop, e.g.:

    d[k] = int(v)
    

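    I call this hacky because it works for integers, but if your input contains floats it may cause problems (int() raises a ValueError on a string such as '2.5').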
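One way around the float concern is to convert with `float()` instead, since it accepts both `'20'` and `'2.5'`. The sketch below is only an illustration: `read_frequencies` is a hypothetical helper name, the sample data is made up, and skipping rows that do not have exactly two fields is an assumption about how malformed lines should be handled.

```python
import csv
import io


def read_frequencies(lines):
    """Build a word -> float frequency dict from an iterable of CSV lines."""
    frequencies = {}
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # assumption: silently skip blank or malformed rows
        word, count = row
        frequencies[word] = float(count)  # accepts '20' as well as '2.5'
    return frequencies


# Hypothetical sample that mixes integer and float frequencies.
sample = io.StringIO("a,1\nb,2.5\nc,4\n\nj,20\n")
freqs = read_frequencies(sample)
print(freqs)
```

With the real file, opening it as `open('namesDFtoCSV', newline='')` inside a `with` block and passing the file object to `read_frequencies` should give a dictionary that `generate_from_frequencies` can consume.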

    Also, Python errors can be hard to read. Your error above can be read as

    script.py", line 19
    
    TypeError: unsupported operand type(s) for /: 'str' and 'float'
    

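    "There is a type error on or before line 19 of my file. Let me look at my data types to see whether there is a conflict between a string and a float..."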
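The traceback points at a division inside a list comprehension over the frequencies, which suggests each value is divided by a float maximum. The sketch below reproduces the same message with a simplified stand-in; `normalise` is a hypothetical function written for illustration and is not the library's actual code.

```python
def normalise(frequencies):
    """Simplified stand-in: divide every frequency by the largest one."""
    max_frequency = float(max(frequencies.values(), key=float))
    return {word: freq / max_frequency for word, freq in frequencies.items()}


message = None
try:
    # String values, as produced by csv.reader without conversion.
    normalise({'a': '1', 'j': '20'})
except TypeError as exc:
    message = str(exc)
    print(message)

# Numeric values go through the same step without error.
fixed = normalise({'a': 1, 'j': 20})
print(fixed)
```

Dividing a `str` by a `float` is what fails, so converting the values before the call is enough to remove the error.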

    The code below works for me:

    import csv
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
    
    reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
    d = {}
    for k,v in reader:
        d[k] = int(v)
    
    #Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
    #See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
    wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
    
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis("off")
    plt.show()
    

    [Comments]:

      [Solution 2]:
      # LEARNER CODE START HERE
      file_c=""
      for index, char in enumerate(file_contents):
          if(char.isalpha()==True or char.isspace()):
              file_c+=char
      file_c=file_c.split()
      file_w=[]
      for word in file_c:
          if word.lower() not in uninteresting_words and word.isalpha()==True:
              file_w.append(word)
      frequency={}
      for word in file_w:
          if word.lower() not in frequency:
              frequency[word.lower()]=1
          else:
              frequency[word.lower()]+=1
      #wordcloud
      cloud = wordcloud.WordCloud()
      cloud.generate_from_frequencies(frequency)
      return cloud.to_array()
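The snippet above appears to be the body of a function that strips non-letter characters, drops stop words, and counts the remaining words case-insensitively. Because it builds integer counts itself, the string-value problem never arises. The same counting step can be sketched more compactly with `collections.Counter`; the function name `calculate_frequencies`, its signature, and the sample text are assumptions made for illustration, and `uninteresting_words` is assumed to contain lowercase words.

```python
from collections import Counter


def calculate_frequencies(file_contents, uninteresting_words=frozenset()):
    """Count purely alphabetic words, case-insensitively, skipping stop words."""
    # Keep letters and whitespace only, so punctuation and digits are dropped.
    cleaned = "".join(ch for ch in file_contents if ch.isalpha() or ch.isspace())
    words = (w.lower() for w in cleaned.split())
    return dict(Counter(w for w in words if w not in uninteresting_words))


# Hypothetical sample text and stop-word set.
counts = calculate_frequencies("The cat saw the other cat, and ran!", {"the", "and"})
print(counts)
```

The resulting dictionary maps each word to an `int`, which is the kind of value `generate_from_frequencies` expects.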
      

      [Comments]:

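      • It would be better to briefly explain how the code works.
      • Please explain why this is a solution. What does your solution do, and how does it solve the problem? (It should also be educational for people who find this question because they have a similar problem.)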