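[Posted]: 2015-09-20 13:58:00
[Question]:
The program reads a piece of text, finds the ten most frequently used words, sorts them by frequency, and prints the list in that order. (This happens when the "--topcount" flag is used.)
I am trying to modify the program slightly so that, after it finds the 10 most common words in the text by frequency, it sorts that list alphabetically and prints it, so the output is in alphabetical order rather than numerical order.
Current code: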
import sys

def word_dictionary(filename):
    word_count = {}  # create dict
    input = open(filename, 'r')
    for line in input:
        words = line.split()  # split lines on whitespace
        for word in words:
            word = word.lower()  # forces all found words into lowercase
            if word not in word_count:
                word_count[word] = 1
            else:
                word_count[word] = word_count[word] + 1
    input.close()
    return word_count

def print_words(filename):
    word_count = word_dictionary(filename)
    words = sorted(word_count.keys())
    for word in words:
        print word, word_count[word]

def get_count(word_count_tuple):
    return word_count_tuple[1]

def print_top(filename):
    word_count = word_dictionary(filename)
    items = sorted(word_count.items(), key=get_count, reverse=True)
    for item in items[:20]:
        print item[0], item[1]

def main():
    if len(sys.argv) != 3:
        print 'usage: ./wordcount.py {--count | --topcount} file'
        sys.exit(1)
    option = sys.argv[1]
    filename = sys.argv[2]
    if option == '--count':
        print_words(filename)
    elif option == '--topcount':
        print_top(filename)
    else:
        print 'unknown option: ' + option
        sys.exit(1)

if __name__ == '__main__':
    main()
I tried doing this:
def get_alph(word_count_tuple):
    return word_count_tuple[0]
to replace the "def get_count(word_count_tuple)" function, and modified the "print_top" function so that
items = sorted(word_count.items(), key = get_alph)
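builds an alphabetically sorted list, but it did not work as intended: instead of the most frequent words, it printed the first 10 words of the alphabetically sorted list of all words in the text.
Any suggestions for making the program work as intended?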
[Discussion]:
Tags: python python-2.7 sorting
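One possible approach, offered as a sketch rather than a definitive fix: the behavior described in the question follows from replacing the frequency sort with the alphabetical one, so the slice is taken from an alphabetical list of every word. Keeping both sorts and running them in sequence (rank by count, slice, then sort the slice by word) should give the intended output. The helper name top_words_alphabetical and the sample dictionary below are hypothetical and only stand in for the result of word_dictionary(filename); get_count and get_alph are the key functions from the question.

```python
def get_count(word_count_tuple):
    # Sort key: the count in a (word, count) pair.
    return word_count_tuple[1]


def get_alph(word_count_tuple):
    # Sort key: the word in a (word, count) pair.
    return word_count_tuple[0]


def top_words_alphabetical(word_count, n=10):
    """Return the n most frequent (word, count) pairs, ordered by word.

    Hypothetical helper, not part of the original program.
    """
    # Step 1: rank all pairs by count, highest first, and keep the first n.
    top = sorted(word_count.items(), key=get_count, reverse=True)[:n]
    # Step 2: re-sort only those n pairs alphabetically.
    return sorted(top, key=get_alph)


# Hypothetical sample data standing in for word_dictionary(filename).
sample = {'the': 5, 'apple': 3, 'zebra': 4, 'cat': 1, 'dog': 2}
for word, count in top_words_alphabetical(sample, n=3):
    # This print form works under both Python 2.7 and Python 3.
    print("%s %d" % (word, count))
```

With the sample above, the three most frequent words are "the", "zebra" and "apple", and they are printed as apple, the, zebra. Inside print_top, the same idea would amount to sorting items[:10] with key=get_alph before the printing loop. Note that words tied in count at the cutoff are kept in no guaranteed order, so which of them makes the list may vary.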