【Posted】: 2017-12-07 10:46:46
【Question】:
I am trying to run sentiment analysis on tweets about different car brands, using Python 3. When I run the code, I get the following exception:
Traceback (most recent call last):
  File "C:\Users\Jeet Chatterjee\NLP\Maruti_Toyota_Marcedes_Brand_analysis.py", line 55, in <module>
    x = str(x.encode('utf-8','ignore'),errors ='ignore')
AttributeError: 'numpy.float64' object has no attribute 'encode'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Jeet Chatterjee\NLP\Maruti_Toyota_Marcedes_Brand_analysis.py", line 62, in <module>
    tweets.set_value(idx,column,'')
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pandas\core\frame.py", line 1856, in set_value
    engine.set_value(series._values, index, value)
  File "pandas\_libs\index.pyx", line 116, in pandas._libs.index.IndexEngine.set_value (pandas\_libs\index.c:4690)
  File "pandas\_libs\index.pyx", line 130, in pandas._libs.index.IndexEngine.set_value (pandas\_libs\index.c:4578)
  File "pandas\_libs\src\util.pxd", line 101, in util.set_value_at (pandas\_libs\index.c:21043)
  File "pandas\_libs\src\util.pxd", line 93, in util.set_value_at_unsafe (pandas\_libs\index.c:20964)
ValueError: could not convert string to float:
I don't know how to handle the encoding in Python 3. Here is my code:
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from textblob import TextBlob
import json
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#regular expression in python
import re

#data corpus
tweets_data_path = 'carData.txt'
tweets_data = []
tweets_file = open(tweets_data_path, "r")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet)
    except:
        continue

#creating panda dataset
tweets = pd.DataFrame()
index = 0
for num, line in enumerate(tweets_data):
    try:
        print (num,line['text'])
        tweets.loc[index,'text'] = line['text']
        index = index + 1
    except:
        print(num, "line not parsed")
        continue

def brand_in_tweet(brand, tweet):
    brand = brand.lower()
    tweet = tweet.lower()
    match = re.search(brand, tweet)
    if match:
        print ('Match Found')
        return brand
    else:
        print ('Match not found')
        return 'none'

for index, row in tweets.iterrows():
    temp = TextBlob(row['text'])
    tweets.loc[index,'sentscore'] = temp.sentiment.polarity

for column in tweets.columns:
    for idx in tweets[column].index:
        x = tweets.get_value(idx,column)
        try:
            x = str(x.encode('utf-8','ignore'),errors ='ignore')
            if type(x) == unicode:
                str(str(x),errors='ignore')
            else:
                df.set_value(idx,column,x)
        except Exception:
            print ('encoding error: {0} {1}'.format(idx,column))
            tweets.set_value(idx,column,'')
            continue

tweets.to_csv('tweets_export.csv')

if __name__=='__main__':
    brand_in_tweet()
I have posted the complete code. I have no clue what is causing this error or how to fix it. Please help, and thanks in advance.
【Discussion】:
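Reading the traceback, the cleanup loop visits every column, including sentscore, whose values are numpy.float64 and have no encode method. That raises the AttributeError. The except branch then tries to write an empty string into that float column, which raises the ValueError. Two further points: in Python 3 every str is already Unicode, so the name unicode does not exist, and encoding to UTF-8 and decoding straight back leaves the text unchanged. Also, get_value and set_value were deprecated in pandas 0.21 and removed in 1.0; .at is the replacement for single-cell access.

Below is a minimal sketch of one possible fix, not a definitive one. It assumes the intent of the loop was to drop non-ASCII characters from the text columns only; if that is not the goal, the loop can probably be removed altogether. The helper name clean_text_columns and the sample rows are made up for illustration.

```python
import pandas as pd


def clean_text_columns(df):
    """Return a copy of df with non-ASCII characters dropped from text columns.

    Hypothetical helper: assumes the goal is ASCII-only text.
    """
    out = df.copy()
    for column in out.columns:
        # Leave numeric columns such as 'sentscore' untouched; their values
        # are floats and have no .encode() method.
        if pd.api.types.is_numeric_dtype(out[column]):
            continue
        # Encode to ASCII while ignoring unencodable characters, then decode
        # back to str. Missing or non-string values become an empty string.
        out[column] = out[column].map(
            lambda x: x.encode("ascii", "ignore").decode("ascii")
            if isinstance(x, str)
            else ""
        )
    return out


# Made-up sample data standing in for the real tweets DataFrame.
tweets = pd.DataFrame(
    {
        "text": ["Toyota is great \u2764", "caf\u00e9 Maruti"],
        "sentscore": [0.8, 0.0],
    }
)
cleaned = clean_text_columns(tweets)
print(cleaned)
```

Note also that the last line of the posted script calls brand_in_tweet() with no arguments, although the function requires both brand and tweet, so that call would fail with a TypeError once the earlier errors are resolved.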
Tags: python numpy tweepy sentiment-analysis