【Question Title】: How to fix this Python TypeError: iteration over non-sequence?
【Posted】: 2012-08-01 21:21:36
【Question】:

When I run the entire code below, this line:

for f in features:

from this function (getfeatures returns a dictionary):

def train(self,item,cat):
    features=self.getfeatures(item)
    # Increment the count for every feature with this category
    for f in features:
      self.incf(f,cat)
    # Increment the count for this category
    self.incc(cat)
    self.con.commit()

produces this error:

TypeError: iteration over non-sequence              

I tried replacing the line for f in features: with for f in features.keys():, but that did not work ("AttributeError: classifier instance has no attribute 'keys'"). When I try this:

print getfeatures('Nobody owns the water.')

it gives me the expected result:

{'water': 1, 'the': 1, 'nobody': 1, 'owns': 1}

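How can I fix this error and iterate over the features dictionary correctly?

This code is from the (excellent) book "Programming Collective Intelligence". I just copied it from here (I also bought the book) and cut out part of the code (fisherclassifier, since I only use the naivebayes classifier). I find it hard to believe that a bug like this would have gone unnoticed, so I am probably doing something wrong.

Here is the entire code: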

import sqlite3

#from pysqlite2 import dbapi2 as sqlite

import re
import math

def getfeatures(doc):
  splitter=re.compile('\\W*')
  # Split the words by non-alpha characters
  words=[s.lower() for s in splitter.split(doc)
          if len(s)>2 and len(s)<20]
  # Return the unique set of words only
#  return dict([(w,1) for w in words]).iteritems()
  return dict([(w,1) for w in words])

class classifier:
  def __init__(self,getfeatures,filename=None):
    # Counts of feature/category combinations
    self.fc={}
    # Counts of documents in each category
    self.cc={}
    self.getfeatures=getfeatures

  def setdb(self,dbfile):
    self.con=sqlite.connect(dbfile)
    self.con.execute('create table if not exists fc(feature,category,count)')
    self.con.execute('create table if not exists cc(category,count)')


  def incf(self,f,cat):
    count=self.fcount(f,cat)
    if count==0:
      self.con.execute("insert into fc values ('%s','%s',1)"
                       % (f,cat))
    else:
      self.con.execute(
        "update fc set count=%d where feature='%s' and category='%s'"
        % (count+1,f,cat))

  def fcount(self,f,cat):
    res=self.con.execute(
      'select count from fc where feature="%s" and category="%s"'
      %(f,cat)).fetchone()
    if res==None: return 0
    else: return float(res[0])

  def incc(self,cat):
    count=self.catcount(cat)
    if count==0:
      self.con.execute("insert into cc values ('%s',1)" % (cat))
    else:
      self.con.execute("update cc set count=%d where category='%s'"
                       % (count+1,cat))

  def catcount(self,cat):
    res=self.con.execute('select count from cc where category="%s"'
                         %(cat)).fetchone()
    if res==None: return 0
    else: return float(res[0])

  def categories(self):
    cur=self.con.execute('select category from cc');
    return [d[0] for d in cur]

  def totalcount(self):
    res=self.con.execute('select sum(count) from cc').fetchone();
    if res==None: return 0
    return res[0]


  def train(self,item,cat):
    features=self.getfeatures(item)
    # Increment the count for every feature with this category
    for f in features.keys():
##    for f in features:
      self.incf(f,cat)
    # Increment the count for this category
    self.incc(cat)
    self.con.commit()

  def fprob(self,f,cat):
    if self.catcount(cat)==0: return 0

    # The total number of times this feature appeared in this
    # category divided by the total number of items in this category
    return self.fcount(f,cat)/self.catcount(cat)

  def weightedprob(self,f,cat,prf,weight=1.0,ap=0.5):
    # Calculate current probability
    basicprob=prf(f,cat)

    # Count the number of times this feature has appeared in
    # all categories
    totals=sum([self.fcount(f,c) for c in self.categories()])

    # Calculate the weighted average
    bp=((weight*ap)+(totals*basicprob))/(weight+totals)
    return bp




class naivebayes(classifier):

  def __init__(self,getfeatures):
    classifier.__init__(self,getfeatures)
    self.thresholds={}

  def docprob(self,item,cat):
    features=self.getfeatures(item)

    # Multiply the probabilities of all the features together
    p=1
    for f in features: p*=self.weightedprob(f,cat,self.fprob)
    return p

  def prob(self,item,cat):
    catprob=self.catcount(cat)/self.totalcount()
    docprob=self.docprob(item,cat)
    return docprob*catprob

  def setthreshold(self,cat,t):
    self.thresholds[cat]=t

  def getthreshold(self,cat):
    if cat not in self.thresholds: return 1.0
    return self.thresholds[cat]

  def classify(self,item,default=None):
    probs={}
    # Find the category with the highest probability
    max=0.0
    for cat in self.categories():
      probs[cat]=self.prob(item,cat)
      if probs[cat]>max:
        max=probs[cat]
        best=cat

    # Make sure the probability exceeds threshold*next best
    for cat in probs:
      if cat==best: continue
      if probs[cat]*self.getthreshold(best)>probs[best]: return default
    return best


def sampletrain(cl):
  cl.train('Nobody owns the water.','good')
  cl.train('the quick rabbit jumps fences','good')
  cl.train('buy pharmaceuticals now','bad')
  cl.train('make quick money at the online casino','bad')
  cl.train('the quick brown fox jumps','good')


nb = naivebayes(classifier)

sampletrain(nb)

#print ('\nbuy is classified as %s'%nb.classify('buy'))
#print ('\nquick is classified as %s'%nb.classify('quick'))

##print getfeatures('Nobody owns the water.')

【Comments】:

  • Are you sure getfeatures returns a dict?
  • What does print type(getfeatures()) return?

Tags: python dictionary iteration


【Answer 1】:

It looks like you are initializing your naivebayes instance with classifier:

nb = naivebayes(classifier) 

You probably meant to do this:

nb = naivebayes(getfeatures)

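As written, self.getfeatures(item) in the train method does not return a dictionary from getfeatures. It instantiates a new classifier on every call, and the for loop then tries to iterate over that instance.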
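A minimal sketch of the difference, written for Python 3 rather than the Python 2 of the question. MiniClassifier is a hypothetical stand-in for the book's classifier class with the database code left out. The split pattern is `\W+` instead of the question's `\W*`, on the assumption that the code runs on Python 3.7 or later, where `re.split` also splits on empty matches.

```python
import re


def getfeatures(doc):
    # Split on runs of non-word characters and keep words of 3 to 19 letters.
    words = [s.lower() for s in re.split(r'\W+', doc) if 2 < len(s) < 20]
    return {w: 1 for w in words}


class MiniClassifier:
    """Hypothetical stand-in: stores whatever callable it is given."""

    def __init__(self, getfeatures):
        self.getfeatures = getfeatures

    def train(self, item):
        # Calls the stored callable. The result is a dict only if the
        # callable really is the feature-extraction function.
        return self.getfeatures(item)


wrong = MiniClassifier(MiniClassifier)  # passes the class itself
right = MiniClassifier(getfeatures)     # passes the function

wrong_features = wrong.train('Nobody owns the water.')
right_features = right.train('Nobody owns the water.')

print(type(wrong_features).__name__)  # MiniClassifier
print(sorted(right_features))         # ['nobody', 'owns', 'the', 'water']
```

Because a class is itself callable, passing it where a function is expected raises no error at construction time. The mistake only shows up later, when the returned instance is iterated.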

【Comments】:

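    【Answer 2】:

    Your initialization never actually passes in the getfeatures function the way you expect.

    The giveaway is this:

    I tried replacing the line for f in features: with for f in features.keys():, but that did not work ("AttributeError: classifier instance has no attribute 'keys'").

    Note that this says features is a classifier instance, not a dictionary.

    So, looking at your code, you create: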

    nb = naivebayes(classifier)
    

    The init for naivebayes is:

    def __init__(self, getfeatures):
      classifier.__init__(self,getfeatures)
      self.thresholds={}
    

    So in this case you are passing in classifier, and it gets handed on to classifier's init as the variable getfeatures...
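One way to surface this kind of mix-up earlier is to check what the stored callable returns before iterating over it. The following is only a sketch in Python 3: check_features is a hypothetical helper, and this Classifier is a simplified stand-in that records features in a list instead of writing to the database as the question's code does.

```python
def check_features(features):
    """Hypothetical helper: raise a clear error unless features is a dict."""
    if not isinstance(features, dict):
        raise TypeError(
            'getfeatures must return a dict, got %s' % type(features).__name__)
    return features


class Classifier:
    """Simplified stand-in for the question's classifier (no database)."""

    def __init__(self, getfeatures):
        self.getfeatures = getfeatures
        self.seen = []

    def train(self, item, cat):
        # Validate the result before looping over it.
        features = check_features(self.getfeatures(item))
        for f in features:
            self.seen.append((f, cat))


message = None
try:
    # Same mistake as in the question: the class is passed, not a function.
    Classifier(Classifier).train('Nobody owns the water.', 'good')
except TypeError as exc:
    message = str(exc)

print(message)  # getfeatures must return a dict, got Classifier
```

The error message then names the offending type directly, which is the same clue the AttributeError above gives, just without having to try features.keys() first.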

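    【Comments】:

    • Thanks for your answer, ernie. I gave it +1. I only picked the other answer because, when in doubt, I always prefer the "you probably meant to do this..." kind of answer. Thanks anyway.
    • Fair enough... I realize I tend to explain the cause and point people toward the solution rather than give a direct answer, so I'm used to my answers not being picked ;)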