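【Posted】:2020-09-10 00:37:09
【Question】:
I'm trying to use Python to fetch Twitter user data by screen name. The script iterates over every Twitter account in the ids variable; for each account it fetches the profile information and writes it as one row of the output file. But I'm running into an error.
Here is my code: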
# LIST OF TWITTER USER IDS
ids = "4816,9715012,13023422, 13393052, 14226882, 14235041, 14292458, 14335586, 14730894,\
15029174, 15474846, 15634728, 15689319, 15782399, 15946841, 16116519, 16148677, 16223542,\
16315120, 16566133, 16686673, 16801671, 41900627, 42645839, 42731742, 44157002, 44988185,\
48073289, 48827616, 49702654, 50310311, 50361094,"
# THE VARIABLE USERS IS A JSON FILE WITH DATA ON THE 32 TWITTER USERS LISTED ABOVE
users = t.lookup_user(user_id = ids)
# NAME OUR OUTPUT FILE - %i WILL BE REPLACED BY CURRENT MONTH, DAY, AND YEAR
outfn = "twitter_user_data_%i.%i.%i.txt" % (now.month, now.day, now.year)
# NAMES FOR HEADER ROW IN OUTPUT FILE
fields = "id screen_name name created_at url followers_count friends_count statuses_count \
favourites_count listed_count \
contributors_enabled description protected location lang expanded_url".split()
# INITIALIZE OUTPUT FILE AND WRITE HEADER ROW
outfp = open(outfn, "w")
#outfp.write(string.join(fields, "\t") + "\n") # header
outfp.write("\t".join(fields) + "\n") # header
# THIS BLOCK WILL LOOP OVER EACH OF THESE IDS, CREATE VARIABLES, AND OUTPUT TO FILE
for entry in users:
    # CREATE EMPTY DICTIONARY
    r = {}
    for f in fields:
        r[f] = ""
    # ASSIGN VALUE OF 'ID' FIELD IN JSON TO 'ID' FIELD IN OUR DICTIONARY
    r['id'] = entry['id']
    # SAME WITH 'SCREEN_NAME' HERE, AND FOR REST OF THE VARIABLES
    r['screen_name'] = entry['screen_name']
    r['name'] = entry['name']
    r['created_at'] = entry['created_at']
    r['url'] = entry['url']
    r['followers_count'] = entry['followers_count']
    r['friends_count'] = entry['friends_count']
    r['statuses_count'] = entry['statuses_count']
    r['favourites_count'] = entry['favourites_count']
    r['listed_count'] = entry['listed_count']
    r['contributors_enabled'] = entry['contributors_enabled']
    r['description'] = entry['description']
    r['protected'] = entry['protected']
    r['location'] = entry['location']
    r['lang'] = entry['lang']
    # NOT EVERY ID WILL HAVE A 'URL' KEY, SO CHECK FOR ITS EXISTENCE WITH IF CLAUSE
    if 'url' in entry['entities']:
        r['expanded_url'] = entry['entities']['url']['urls'][0]['expanded_url']
    else:
        r['expanded_url'] = ''
    print(r)
    # CREATE EMPTY LIST
    lst = []
    # ADD DATA FOR EACH VARIABLE
    for f in fields:
        lst.append(str(r[f]).replace("\/", "/"))
    # WRITE ROW WITH DATA IN LIST
    #outfp.write(string.join(lst, "\t").encode("utf-8") + "\n")
    outfp.write("\t".join(lst).encode('utf-8') + '\n')
outfp.close()
Error message:
TypeError Traceback (most recent call last)
<ipython-input-54-441137b1bb4d> in <module>()
37 #WRITE ROW WITH DATA IN LIST
38 #outfp.write(string.join(lst, "\t").encode("utf-8") + "\n")
---> 39 outfp.write("\t".join(lst).encode('utf-8') + '\n')
40
41 outfp.close()
TypeError: can't concat str to bytes
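Any ideas on how to fix this? The Python version is 3.6.5. Thanks in advance for your help.
Edit:
Here is a screenshot of part of the file after I opened the output file in binary mode.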
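The failing expression can be reproduced without the Twitter API, which may help isolate the cause. The snippet below is a minimal sketch: the contents of `lst` are made-up placeholder values, not data from the question. It shows that `.encode('utf-8')` returns a `bytes` object in Python 3, and that adding a `str` such as `'\n'` to it raises a `TypeError`.

```python
# Hypothetical sample row standing in for the `lst` built inside the loop.
lst = ["4816", "example_user", "Example Name"]

# In Python 3, str.encode() returns bytes, not str.
encoded = "\t".join(lst).encode("utf-8")
print(type(encoded))

error_message = None
try:
    encoded + "\n"  # bytes + str, the same shape as the failing line
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```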
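【Comments】:
-
You are trying to concatenate a bytes object (the result of join(...).encode('utf-8')) with a str. Change '\n' to b'\n', or encode after concatenating rather than before.
-
That works, but now I get another error from the same line of code: write() argument must be str, not bytes
-
Open the output file in binary mode
-
I did, but the output file looks strange. I edited my post to add a screenshot of part of the file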
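An alternative to the binary-mode route discussed in the comments, sketched under the assumption that the goal is a UTF-8 tab-separated text file: keep the file in text mode, pass `encoding="utf-8"` to `open`, and write plain `str` values with no `.encode()` call, so the file object does the encoding. The `write_rows` helper, the sample rows, and the temporary output path are all hypothetical and only stand in for the question's `fields`, `lst`, and `outfn`.

```python
import os
import tempfile

# Hypothetical rows standing in for the per-user `lst` values from the question.
rows = [
    ["4816", "example_user", "Café Owner"],
    ["9715012", "another_user", "東京"],
]


def write_rows(path, header, rows):
    """Write tab-separated rows as UTF-8 text; the file object handles encoding."""
    with open(path, "w", encoding="utf-8") as outfp:
        outfp.write("\t".join(header) + "\n")
        for row in rows:
            # Everything written is str, so no bytes/str mixing can occur.
            outfp.write("\t".join(str(value) for value in row) + "\n")


# Temporary location used only for this illustration.
out_path = os.path.join(tempfile.mkdtemp(), "twitter_user_data_sample.txt")
write_rows(out_path, ["id", "screen_name", "name"], rows)

# Read the file back with the same encoding to check the round trip.
with open(out_path, encoding="utf-8") as infp:
    lines = infp.read().splitlines()
print(lines)
```

If binary mode is preferred instead, every piece passed to `write()` has to be `bytes`, including the header row and the `b'\n'` terminator; mixing the two styles is what produces the two errors reported in this thread.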
Tags: python python-3.x api twitter