【Question title】: python : error handling Ordered dict with unicode data
【Posted】: 2017-03-29 10:22:33
【Question】:

My script migrates data from MySQL to mongodb. It runs perfectly when no unicode columns are included, but it throws an error as soon as the OrgLanguages column is added.

    mongoImp = dbo.insert_many(odbcArray)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/collection.py", line 711, in insert_many
    blk.execute(self.write_concern.document)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/bulk.py", line 493, in execute
    return self.execute_command(sock_info, generator, write_concern)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/bulk.py", line 319, in execute_command
    run.ops, True, self.collection.codec_options, bwc)
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 'Portugu\xeas do Brasil, ?????, English, Deutsch, Espa\xf1ol latinoamericano, Polish'

My code:

import MySQLdb, MySQLdb.cursors, sys, pymongo, collections

odbcArray=[]
mongoConStr = '192.168.10.107:36006'
sqlConnect = MySQLdb.connect(host = "54.175.170.187", user = "testuser", passwd = "testuser", db = "testdb", cursorclass=MySQLdb.cursors.DictCursor)
mongoConnect = pymongo.MongoClient(mongoConStr)

sqlCur = sqlConnect.cursor()
sqlCur.execute("SELECT ID,OrgID,OrgLanguages,APILoginID,TransactionKey,SMTPSpeed,TimeZoneName,IsVideoWatched FROM organizations")

dbo = mongoConnect.eaedw.mysqlData
tuples = sqlCur.fetchall()

for tuple in tuples:
    odbcArray.append(collections.OrderedDict(tuple))

mongoImp = dbo.insert_many(odbcArray)

sqlCur.close()
mongoConnect.close()
sqlConnect.close()
sys.exit()

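The script above migrates the data perfectly when the OrgLanguages column is left out of the SELECT query. To work around the problem, I tried using OrderedDict() in a different way, but that gave me a different kind of error.
Changed code: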

for tuple in tuples:
    doc = collections.OrderedDict()
    doc['oid'] = tuple.OrgID
    doc['APILoginID'] = tuple.APILoginID
    doc['lang'] = unicode(tuple.OrgLanguages)
    odbcArray.append(doc)
mongoImp = dbo.insert_many(odbcArray)

Error received:

Traceback (most recent call last):
  File "pymsql.py", line 19, in <module>
    doc['oid'] = tuple.OrgID
AttributeError: 'dict' object has no attribute 'OrgID'

【Discussion】:

    Tags: python mysql mongodb unicode pymongo


    【Solution 1】:

    Your MySQL connection is returning characters in an encoding other than UTF-8, which is the encoding every BSON string must use. Try your original code, but pass charset='utf8' to MySQLdb.connect.
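
To illustrate why the insert is rejected, here is a minimal sketch that needs no database. It assumes the column arrives as Latin-1 bytes (the `\xea` and `\xf1` in the traceback are consistent with that, but the actual connection charset is not stated in the question). `is_valid_utf8` and `to_document` are hypothetical helper names, not part of MySQLdb or pymongo.

```python
# -*- coding: utf-8 -*-
from collections import OrderedDict

# Shortened sample of the bytes shown in the traceback. "\xea" and "\xf1"
# are the Latin-1 encodings of "ê" and "ñ" (assumption: latin1 connection).
raw = b'Portugu\xeas do Brasil, English, Espa\xf1ol latinoamericano'


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


def to_document(row, source_charset='latin-1'):
    """Hypothetical helper: copy a DictCursor row into an OrderedDict,
    decoding any byte strings from the assumed source charset."""
    doc = OrderedDict()
    for key, value in row.items():
        if isinstance(value, bytes):
            value = value.decode(source_charset)
        doc[key] = value
    return doc


# Rows from a DictCursor are plain dicts, so columns are read by key
# (row['OrgID']), not by attribute (row.OrgID).
row = {'OrgID': 7, 'OrgLanguages': raw}
doc = to_document(row)
```

Here `is_valid_utf8(raw)` is False, which is the condition BSON complains about, while `doc['OrgLanguages']` is a unicode string that can be encoded as UTF-8. With `charset='utf8'` passed to `MySQLdb.connect`, as suggested above, the driver should hand back properly decoded text and this manual decoding should not be needed. The `?????` in the traceback also hints that some characters were already replaced before they reached Python; decoding afterwards cannot recover those, which is another reason to fix the connection charset rather than patch the strings.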

