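【Posted】: 2015-06-04 07:19:26
【Question】:
I'm working on another program that uses text file I/O, and I'm confused: my code looks perfectly reasonable, but the results seem crazy. I want to count the number of words, characters, sentences, and unique words in text files of political speeches. Here is my code, which may clarify things.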
#This program will serve to analyze text files for the number of words in
#the text file, number of characters, sentences, unique words, and the longest
#word in the text file. This program will also provide the frequency of unique
#words. In particular, the text will be three political speeches which we will
#analyze, building on searching techniques in Python.
#CISC 101, Queen's University
#By Damian Connors; 10138187

def main():
    harper = readFile("Harper's Speech.txt")
    print(numCharacters(harper), "Characters.")
    obama1 = readFile("Obama's 2009 Speech.txt")
    print(numCharacters(obama1), "Characters.")
    obama2 = readFile("Obama's 2008 Speech.txt")
    print(numCharacters(obama1), "Characters.")

def readFile(filename):
    '''Function that reads a text file, then prints the name of file without
    '.txt'. The function returns the read file for main() to call, and prints
    the file's name so the user knows which file is read'''
    inFile1 = open(filename, "r")
    fileContentsList = inFile1.readlines()
    inFile1.close()
    print(filename.replace(".txt", "") + ":") #this prints filename
    return fileContentsList

def numCharacters(file):
    return len(file) - file.count(" ")
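The problem I'm having right now is with counting the characters. It keeps reporting the count as 85, but this is a fairly large file and I know it should be 7792 characters. Any idea what I'm doing wrong? Here is my shell output; I'm using Python 3.3.3.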
>>> ================================ RESTART ================================
>>>
Harper's Speech:
85 Characters.
Obama's 2009 Speech:
67 Characters.
Obama's 2008 Speech:
67 Characters.
>>>
As you can see, I have 3 speech files, but there is no way they contain so few characters.
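【Comments】:
- A small glitch: you are using obama1 even where you expect the obama2 result.
- fileContentsList holds the lines of the file, not its characters, so len(file) returns the number of lines in the file.
- The answer below should fix it. Try creating a small, simple file for which you know what the output should be, and print your variables to check that they match what you expect.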
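
To make the point about lines versus characters concrete, here is a minimal sketch of the small-file check suggested above. The names read_text and num_characters and the two-line sample text are hypothetical and not part of the original program; the sketch also assumes the count should keep the question's own rule (everything except spaces, so newlines are still counted), which may not be the exact definition behind the expected 7792.

```python
import os
import tempfile


def read_text(filename):
    """Return the file's contents as a single string (read(), not readlines())."""
    with open(filename, "r") as in_file:
        return in_file.read()


def num_characters(text):
    """Count every character except spaces; newlines are still counted."""
    return len(text) - text.count(" ")


# Build a small file whose contents are known exactly (hypothetical sample).
sample = "Yes we can.\nYes we did.\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    with open(path, "r") as in_file:
        lines = in_file.readlines()  # a list of lines, as in the question
    text = read_text(path)           # one string holding the whole file

    # The question's formula applied to the list: counts lines, not characters.
    line_count = len(lines) - lines.count(" ")
    # The same formula applied to the string: counts non-space characters.
    char_count = num_characters(text)
finally:
    os.remove(path)

print(line_count, "from the list of lines")  # 2
print(char_count, "Characters.")             # 20
```

On the list, len() counts elements (lines) and count(" ") looks for an element equal to a single space, so the result is just the number of lines. On the string, both operations work character by character, which is what the program intended.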
Tags: python file python-3.x io count