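【Posted】: 2021-01-22 23:41:24
【Question】:
I wrote a piece of text and am trying to count how often each word occurs, without using `import` or NLTK. It would look like this:
Input: example = "I will go to cinema tomorrow."
Output: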
| word | frequency |
|---|---|
| I | 1 |
| will | 1 |
| go | 1 |
【Question comments】:
- You can use built-ins without importing anything. A dictionary, for example.
Tags: python list function dictionary
# Use the OP's example
example = " I will go to cinema tomorrow."
# replace() removes characters that should not be counted.
# In the OP's example, "." does not appear in the output.
tkns = example.replace(".", "").split()
# A dict comprehension iterates over the tokens and uses list.count to count occurrences.
# A dict cannot hold duplicate keys, so a token that appears several times ends up as a single key with its total count.
result = {t: tkns.count(t) for t in tkns}
print(result)
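One caveat about the comprehension above: `tkns.count(t)` rescans the whole list for every token, so the work grows quadratically with the number of tokens. For a longer text, a single pass over the tokens may be preferable. The following is only a sketch under the same no-import constraint; the helper name `word_frequencies` is made up for illustration and does not come from the question.

```python
def word_frequencies(text):
    """Count words in one pass using only built-ins (hypothetical helper)."""
    counts = {}
    for token in text.replace(".", "").split():
        # dict.get returns 0 the first time a token is seen
        counts[token] = counts.get(token, 0) + 1
    return counts


example = " I will go to cinema tomorrow."
freq = word_frequencies(example)

# Print the result in the table layout shown in the question
print("| word | frequency |")
print("|---|---|")
for word, n in freq.items():
    print(f"| {word} | {n} |")
```

Since dicts preserve insertion order in Python 3.7+, the rows come out in the order the words first appear in the text.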
【Comments】:
You may need to do some "cleanup" on your string first (for example, removing periods, colons, and so on). But what you can do is:
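# Counter is not a built-in: it has to be imported from collections
from collections import Counter

s = "I will go to cinema tomorrow"
# split into words
words = s.split(" ")
# count words
result = Counter(words)
# print the word -> count mapping
print(result)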
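The "cleanup" step mentioned at the top of this answer can also be done without imports. The following is a rough sketch, not a complete tokenizer: the punctuation set and the helper name `clean_words` are assumptions for illustration, and only punctuation at the start or end of each word is stripped.

```python
# Assumed punctuation set; extend it to suit the text being counted.
PUNCTUATION = ".,:;!?\"'()"


def clean_words(text):
    """Split on whitespace and strip leading/trailing punctuation (hypothetical helper)."""
    words = []
    for raw in text.split():
        word = raw.strip(PUNCTUATION)
        if word:  # skip tokens that consisted of punctuation only
            words.append(word)
    return words


s = "I will go to cinema tomorrow: yes, tomorrow."
print(clean_words(s))
```

Case is left untouched here, so "Go" and "go" would still be counted separately; whether to lowercase first depends on what the counts are for.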
【Comments】:
Counter can't be used without importing it.