Python TypeError even though I am only working with int and float

【Question title】: Python TypeError even though I am only working with int and float
【Posted】: 2017-11-07 09:46:24
【Question】:

The code below works fine in my test cases, but when I start the program manually it fails with the following error:

                    bm25 = tf * (k + 1)
                    bm25 = bm25 / (k * (1 - b + b * dl / avdl) + tf)
                    bm25 = bm25 * math.log(n/df, 2)


  File "inverted_index.py", line 116, in read_from_file
    bm25 = tf * (k + 1)
  TypeError: cannot concatenate 'str' and 'int' objects

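All of the symbols are variables (floats or ints). I am not trying to do anything with strings. I just want to compute a value...

If I comment out the line that raises the error, the error moves on to the next line of the calculation: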

                    # bm25 = tf * (k + 1)
                    bm25 = bm25 / (k * (1 - b + b * dl / avdl) + tf)
                    bm25 = bm25 * math.log(n/df, 2)



  File "inverted_index.py", line 117, in read_from_file
    bm25 = bm25 / (k * (1 - b + b * dl / avdl) + tf)
  TypeError: unsupported operand type(s) for -: 'int' and 'str'

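Note that bm25 is initialized to zero. Originally I did the whole calculation on a single line, but that was too long for my style checker.

I am new to Python and do not know how to solve this.

What am I doing wrong?

Edit 1:

The full code is rather long: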

    def read_from_file(self, file_name, b, k):
        """
        >>> ii = InvertedIndex()
        >>> ii.read_from_file("example.txt", b=0, k="inf")
        >>> sorted(ii.inverted_lists.items())
        [('animated', [(1, 0.415), (2, 0.415), (4, 0.415)]), \
        ('animation', [(3, 2.0)]), ('film', [(2, 1.0), (4, 1.0)]), \
        ('movie', [(1, 0.0), (2, 0.0), (3, 0.0), (4, 0.0)]), \
        ('non', [(2, 2.0)]), \
        ('short', [(3, 1.0), (4, 2.0)])]

        """

        """
        >>> ii = InvertedIndex()
        >>> ii.read_from_file("example.txt", b=0.75, k=1.75)
        >>> sorted(ii.inverted_lists.items())
        [('animated', [(1, 0.459), (2, 0.402), (4, 0.358)]), \
        ('animation', [(3, 2.211)]), ('film', [(2, 0.969), (4, 0.863)]), \
        ('movie', [(1, 0.0), (2, 0.0), (3, 0.0), (4, 0.0)]), \
        ('non', [(2, 1.938)]), \
        ('short', [(3, 1.106), (4, 1.313)])]

        """

        dls = []
        avdl = 0

        with open(file_name, "r") as file:
            record_id = 1
            for line in file:
                tf = 1
                line = line.strip()
                # Store the record as a tuple (title, description).
                self.records.append(tuple(line.split("\t")))

                dl = 0
                for word in re.split("[^A-Za-z]+", line):
                    word = word.lower().strip()

                    # Ignore the word if it is empty.
                    if len(word) == 0:
                        continue
                    dl = dl + 1
                    if word not in self.inverted_lists:
                        # The word is seen for first time, create a new list.
                        tf = 1
                        self.inverted_lists[word] = [(record_id, tf)]
                    elif self.inverted_lists[word][-1] == (record_id, tf):
                        tf = tf + 1
                        self.inverted_lists[word][-1] = (record_id, tf)
                        tf = 1
                    elif self.inverted_lists[word][-1] != (record_id, tf):
                        # Make sure that the list contains the id at most once.
                        self.inverted_lists[word].append((record_id, tf))
                record_id += 1
                avdl = avdl + dl
                dls.append(dl)

            n = record_id - 1
            avdl = avdl / n
            bm25 = 0
            for key in self.inverted_lists:
                df = len(self.inverted_lists[key])
                counter = 0
                for value in self.inverted_lists[key]:
                    counter = counter + 1
                    tf = value[1]
                    dl = dls[value[0] - 1]
                    if k == "inf":
                        bm25 = tf * math.log(n/df, 2)
                    else:
                        bm25 = float(tf) * (float(k) + float(1))
                        bm25 = bm25 / (float(k) * (float(1) - float(b)
                                                   + float(b) * float(dl) /
                                                   float(avdl)) + float(tf))
                        bm25 = bm25 * math.log(n/df, 2)
                    self.inverted_lists[key][counter - 1] = (
                        self.inverted_lists[key][counter - 1][0],
                        round(bm25, 3))

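b and k are supplied by the user, or in this case by the test.

tf and dl are simple counters: tf tracks how often a word occurs in a document, and dl counts the words in that document.

n, avdl and df are likewise just metrics about the data being ranked.

When I start the program manually, the only variables that are handled differently are b and k, because I (the user) type those values in...

Edit 2:

Thanks @N. Ivanov, using float() did solve the problem, but only once I applied it to practically every operand: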

                    bm25 = float(tf) * (float(k) + float(1))
                    bm25 = bm25 / (float(k) * (float(1) - float(b)
                                               + float(b) * float(dl) /
                                               float(avdl)) + float(tf))
                    bm25 = bm25 * math.log(n/df, 2)
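
For comparison, here is a minimal sketch of a less repetitive variant, assuming that b and k are the only values that may arrive as strings. It converts them once and then applies the same formula as above. The helper name bm25_score and its parameter order are made up for illustration and are not part of the original code. It also relies on float("inf") yielding infinity, so the k == "inf" special case becomes a math.isinf check.

    ```python
    import math


    def bm25_score(tf, df, n, dl, avdl, b, k):
        """Hypothetical helper: BM25 score for one posting, same formula as above."""
        # Convert the user-supplied parameters once; float() accepts
        # numeric strings such as "0.75" as well as "inf".
        b = float(b)
        k = float(k)
        idf = math.log(n / df, 2)
        if math.isinf(k):
            # With k = infinity the score reduces to tf * idf,
            # which is what the k == "inf" branch computes.
            return tf * idf
        return tf * (k + 1) / (k * (1 - b + b * dl / avdl) + tf) * idf


    # 'animated' in the first doctest: n = 4 records, df = 3, tf = 1.
    print(round(bm25_score(tf=1, df=3, n=4, dl=2, avdl=2, b="0", k="inf"), 3))  # 0.415
    ```

tf, dl, avdl, n and df are already numbers inside the function, so wrapping them in float() is not needed under Python 3, where / is true division.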

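【Comments on the question】:

Can you provide the full code? I would suggest wrapping the variables in the problem area in float(), which converts a value to a float (only if it is numeric).
How are the variables tf and k declared?
It looks like tf or k is a string.
If Python says something is a string, then it is a string, no doubt about it. IOW, somewhere in the code you have not posted, one of the variables you believe to be a float or an int is actually bound to a str.

Tags: python error-handling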
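
A quick way to check this kind of suspicion is to print the types just before the failing line. The sketch below is only an illustration: describe_types is a made-up helper, and the sample values are assumptions about what the function might receive when started from the command line.

    ```python
    def describe_types(**variables):
        """Hypothetical debugging helper: map each variable name to its type name."""
        return {name: type(value).__name__ for name, value in variables.items()}


    # tf comes from a counter, while k and b are assumed to be raw user input.
    print(describe_types(tf=1, k="1.75", b="0.75"))
    # {'tf': 'int', 'k': 'str', 'b': 'str'}
    ```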

【Solution 1】:

OK, I think I found where the strings came from:

    file_name = sys.argv[1]
    b = sys.argv[2]
    k = sys.argv[3]

    ii = InvertedIndex()
    ii.read_from_file(file_name, b, k)

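My main method takes the user input and passes it on to my method. It did not occur to me that the user input would be interpreted as strings. I still have to get used to dynamic typing...

@all, thanks for your help!

【Comments】: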
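
Since every entry of sys.argv is a str, one possible fix is to convert at the boundary, before the values reach read_from_file. This is only a sketch: parse_params is a hypothetical helper, and the argument order (file name, b, k) is taken from the snippet above.

    ```python
    def parse_params(argv):
        """Hypothetical helper: turn [script, file_name, b, k] into typed values."""
        if len(argv) != 4:
            raise ValueError("expected arguments: <file_name> <b> <k>")
        file_name = argv[1]
        # Command-line arguments are always strings, so convert explicitly.
        b = float(argv[2])
        k = float(argv[3])  # float("inf") is accepted and yields infinity
        return file_name, b, k


    # In the real program the list would be sys.argv.
    print(parse_params(["inverted_index.py", "example.txt", "0.75", "1.75"]))
    # ('example.txt', 0.75, 1.75)
    ```

If k is converted this way, the k == "inf" comparison inside read_from_file would no longer match and would need to become something like math.isinf(k).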