LIDY1260

Final Comprehensive Assignment: Word Frequency Counting

#1
# Read the whole file into one string; the with-block closes the file automatically.
with open('Les Miserables悲惨世界.txt', mode='r', encoding='utf-8') as f:
    fText = f.read()
print(fText)

#2
replacelist = ['?', '.', ',', ':', '"', "'"]
for c in replacelist:
    fText = fText.replace(c, '')  # remove every punctuation mark in the list
print(fText)
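
Step 2 removes punctuation one character at a time. As a rough alternative sketch, the same cleanup can be done in a single pass with `str.translate`. The helper name `strip_punctuation` is hypothetical, and using `string.punctuation` is an assumption: it covers more characters than the six in `replacelist` (hyphens and semicolons, for example), so the results will not be identical to the loop above.

```python
import string


def strip_punctuation(text):
    """Return text with every ASCII punctuation character removed."""
    # str.maketrans('', '', chars) builds a table that deletes each char in chars.
    # Assumption: string.punctuation is used here instead of the original
    # six-character replacelist, so more characters are removed.
    table = str.maketrans('', '', string.punctuation)
    return text.translate(table)


sample = 'He said: "Wait, what?" It\'s done.'
print(strip_punctuation(sample))  # He said Wait what Its done
```

To keep the original behaviour exactly, pass `''.join(replacelist)` as the third argument of `str.maketrans` instead of `string.punctuation`.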

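#3
# split() with no argument splits on any whitespace, so words separated by
# newlines are not glued together and no empty strings are produced.
fList = fText.split()  # list: the words in order of appearance
print(fList)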

#4
fSet = set(fList)  # set: the distinct words
print(fSet)

fDict = {}
for word in fSet:
    fDict[word] = fList.count(word)  # how many times each word occurs
print(fDict)
for d in fDict:
    print(d, fDict[d])

#5
wordCountList = list(fDict.items())
print(wordCountList)
wordCountList.sort(key=lambda x: x[1], reverse=True)  # sort by count, highest first
print(wordCountList)

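#6
# Print the 20 most frequent words; slicing avoids an IndexError
# when the text has fewer than 20 distinct words.
for item in wordCountList[:20]:
    print(item)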

#7
# mode='w' overwrites the file, so re-running the script does not append duplicate results.
with open('fText.txt', mode='w', encoding='utf-8') as fCountFile:
    for word, count in wordCountList:
        fCountFile.write(str(count) + ' ' + word + '\n')
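
Steps 3 to 6 call `fList.count(word)` once per distinct word, which rescans the whole list each time and can be slow on a full novel. A possible shorter variant, given only as a sketch, uses `collections.Counter` from the standard library to count in one pass. The function name `count_words` and the sample sentence are hypothetical and not part of the assignment.

```python
from collections import Counter


def count_words(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    # split() with no argument splits on any run of whitespace.
    words = text.split()
    # most_common() sorts by count, highest first; words with equal
    # counts keep the order in which they were first seen.
    return Counter(words).most_common()


sample = 'the cat and the dog and the bird'
top = count_words(sample)
print(top[:2])  # [('the', 3), ('and', 2)]
```

Here `count_words(fText)[:20]` would play the role of `wordCountList[:20]` in step 6, assuming `fText` has already had its punctuation removed as in step 2.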

 
