How to get list values and counts in Python

【Question title】: How to get list values and counts in Python
【Posted】: 2018-01-28 02:39:57
【Question description】:

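I am trying to count each word in a list so that I can remove the words with a larger count. But the output I get is incorrect. Suppose my file contains the line "it was the best of times it was the worst of times .it was the age of wisdom it was the age of foolishness". My code prints (was, 4) and then, somewhere else, (was, 3), and so on. Every time a word occurs, it prints that word again with a different count. I need each word to be counted once.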

for file in files:  
    print(file)
    f=open(file, 'r')
    content = f.read() 
    wordlist = content.split()
    #print(wordlist)
    wordfreq = [wordlist.count(w) for w in wordlist] # a list comprehension
    print("List\n" + str(wordlist) + "\n")
    print("Frequencies\n" + str(wordfreq) + "\n")
    test = [i for i in wordfreq if i > 100]
    print("result\n"+str(list(zip(test,wordlist))))

【Question discussion】:

Tags: python list arraylist stop-words

【Solution 1】:

You can use Counter like this:

>>> from collections import Counter
>>>
>>> s = "it was the best of times it was the worst of times .it was the age of wisdom it was the age of foolishness"
>>>
>>> d = Counter(s.split())
>>> for k,v in d.items():
...     print('{} -> {}'.format(k, v))
...
of -> 4
age -> 2
it -> 3
foolishness -> 1
times -> 2
worst -> 1
.it -> 1
the -> 4
wisdom -> 1
was -> 4
best -> 1
>>>
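
Note that a plain split() counts `.it` separately from `it`, as the output above shows. If that is not wanted, one option is to normalize each token before counting. The following is a minimal sketch, assuming that lowercasing and stripping leading and trailing punctuation suits your data; the helper name `normalized_counts` is hypothetical and not part of the original answer.

```python
import string
from collections import Counter


def normalized_counts(text):
    """Count words after lowercasing and stripping surrounding punctuation."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    # Skip tokens that were pure punctuation and became empty strings
    return Counter(w for w in words if w)


s = ("it was the best of times it was the worst of times "
     ".it was the age of wisdom it was the age of foolishness")
counts = normalized_counts(s)
print(counts["it"])  # 4, because ".it" is now counted as "it"
```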

If you don't want to use collections.Counter, you can use a dictionary like this:

>>> s = "it was the best of times it was the worst of times .it was the age of wisdom it was the age of foolishness"
>>> d = {}
>>> for word in s.split():
...     try:
...         d[word] += 1
...     except KeyError:
...         d[word] = 1
...
>>> d
{'of': 4, 'age': 2, 'it': 3, 'foolishness': 1, 'times': 2, 'worst': 1, '.it': 1, 'the': 4, 'wisdom': 1, 'was': 4, 'best': 1}
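
The try/except pattern above can also be written without exception handling. As a sketch, here are two standard-library variants: `dict.get` with a default of 0, and `collections.defaultdict(int)`. The sample string is only an illustration.

```python
from collections import defaultdict

s = "was was it was hello it was"

# Variant 1: dict.get returns 0 for words that have not been seen yet
d = {}
for word in s.split():
    d[word] = d.get(word, 0) + 1

# Variant 2: defaultdict(int) creates missing keys with the value 0
dd = defaultdict(int)
for word in s.split():
    dd[word] += 1

print(d == dict(dd))  # True: both variants produce the same counts
```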

【Discussion】:

【Solution 2】:

A solution without Counter:

new = s.split(' ')
m = list()
for i in new:
    m.append((i, new.count(i)))
for i in set(m):
    print(i)
del m[:]  # empty the list so it can be used again

Output:

('best', 1)  
('was', 4)   
('times', 2)  
('it', 3)  
('worst', 1)  
('.it', 1)  
('wisdom', 1)  
('foolishness', 1)  
('the', 4)     
('of', 4) 
('age', 2)

Another test:
s = 'was was it was hello it was'
Output:
('hello', 1)  
('was', 4)  
('it', 2)

If your data is stored in a file, use:

with open('your-file-name', 'r') as r:
    s = r.read()  # read the whole file, including line breaks

new = s.split()  # split on any whitespace so words on adjacent lines stay separate
m = list()
for i in new:
    m.append((i, new.count(i)))
for i in set(m):
    print(i)
del m[:]  # empty the list so it can be used again
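
Because `m` lives at module level, pairs from an earlier run or an earlier file can stay in it unless it is cleared. One way to sidestep that is to build the pairs inside a function, so every call starts from scratch. This is only a sketch; `count_pairs` is a hypothetical helper name, and it iterates over `set(words)` so that each distinct word yields exactly one pair.

```python
def count_pairs(text):
    """Return sorted (word, count) pairs, one pair per distinct word."""
    words = text.split()
    # Iterate over the distinct words so each word yields exactly one pair
    return sorted((w, words.count(w)) for w in set(words))


for pair in count_pairs("was was it was hello it was"):
    print(pair)
# ('hello', 1)
# ('it', 2)
# ('was', 4)
```

Calling `words.count` once per distinct word is still quadratic in the worst case, so for large files a dictionary or Counter is likely the better fit.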

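【Discussion】:

@user3778289 If you don't want to use the Counter module, you can simply use this code.
Thank you, this is good. But it still gives me the same word multiple times, like (was, 4), (times, 4), and again (was, 4).
@user3778289 But in my output (was, 4) appears only once; set() should remove the duplicates. Please copy and paste my code and test again.
It gives me duplicated output. I have a large input file; this sentence is just an example.
Maybe you get duplicates because I used m.append(): every time you test your program, data is appended to m. You should empty the list and try again.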

【Solution 3】:

from collections import Counter

for file in files:
    words = open(file).read().split()
    frequencies = Counter(words)
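
The loop above builds a separate Counter for each file and never closes the file explicitly. If a combined total across all files is wanted, something along these lines may work. It is a sketch only: `count_files` is a hypothetical helper, and the demo writes two throwaway files with made-up contents so the block runs on its own.

```python
import os
import tempfile
from collections import Counter


def count_files(paths):
    """Return one Counter with word totals across all given files."""
    total = Counter()
    for path in paths:
        # The with-block closes the file even if reading fails
        with open(path) as f:
            total.update(f.read().split())
    return total


# Demo with two throwaway files (contents are made up for illustration)
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "it was the best"), ("b.txt", "it was the worst")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    totals = count_files(paths)

print(totals["it"], totals["best"])  # 2 1
```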

【Discussion】:

【Solution 4】:

You can use Counter from collections:

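from collections import Counter
import itertools

for file in files:
    # Read the file line by line and flatten the per-line word lists into one list
    with open(file) as f:
        data = list(itertools.chain.from_iterable(line.split() for line in f))

    the_counts = Counter(data)

    print("wordlist: {}".format(data))
    print("frequencies: {}".format(dict(the_counts)))
    # Keep only the (word, count) pairs that occur more than 100 times
    test = [(a, b) for a, b in the_counts.items() if b > 100]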
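
The question's stated goal was to drop the words with a large count from the word list. A sketch of that last step follows, assuming "large" means strictly greater than a threshold you choose; `remove_frequent` and `threshold` are hypothetical names, and the short sample string is only for illustration.

```python
from collections import Counter


def remove_frequent(words, threshold):
    """Return words in original order, dropping those occurring more than threshold times."""
    counts = Counter(words)
    return [w for w in words if counts[w] <= threshold]


s = "it was the best of times it was the worst of times"
print(remove_frequent(s.split(), 1))  # ['best', 'worst']
```

Counting once and then filtering keeps each word paired with its own count, which avoids the misalignment that arises when the frequency list is filtered separately and then zipped back against the full word list.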

【Discussion】:

This gives me an error. test = [(a, b) for a, b in dict(the_count).items() if b > 10] SyntaxError: invalid syntax
@user3778289 Try again and let me know what happens.
test = [(a, b) for a, b in the_count.items() if b > 10] ^ SyntaxError: invalid syntax. The error is still there.

【Solution 5】:

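import pandas as pd
# txt is assumed to hold the full text as a single string
a = pd.Series(txt.split()).value_counts().rename_axis("word").reset_index(name="counts")
# Keep only the words that occur fewer than 100 times
a[a.counts < 100]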

【Discussion】: