【Question title】: Iterating over a DataFrame and appending the score into a column
【Posted】: 2022-01-24 11:33:26
【Question】:

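When I run the code below, it returns 'float' object has no attribute 'encode'. I want to get the VADER sentiment score for each title (in a large DataFrame), but I'm not sure where I'm going wrong or how to convert the variable's type so that the object is iterable. I then want to append the 'compound' score to the DataFrame. I have already tried iteration code like this: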

pd.concat([bitcoin, bitcoin['Title'].apply(lambda r: pd.Series(analyzer.polarity_scores(r)))], axis=1)
score_compound = bitcoin['Title'].apply(lambda r: analyzer.polarity_scores(r)['compound'])

import nltk
import pandas as pd

analyzer = SentimentIntensityAnalyzer()
bitcoin = pd.read_csv("Subreddit_Bitcoin_2021.csv")

score_compound = []

for i in range(0, bitcoin.shape[0]):
    score = analyzer.polarity_scores(bitcoin.iloc[i][1])
    score1 = score['compound']
    score_compound.append(score1)


【Comments】:

  • Could you also paste at least a few lines of the CSV, so we can reproduce the error more easily?

Tags: python pandas list dataframe vader


【Answer 1】:

Without your data to work with, it's hard to know. I see you posted the same question elsewhere with some data, so I tested on that:

   index                                               text
0      0  I can’t believe Bitcoin is going to hit 100k b...
1      1  What new Bitcoin related project are you the m...
2      2  Yin decline is about to end! Historical data s...
3      3  If you discovered a way to model turning $100 ...
4      4  Happy New Year and some nice Gains !! ????????...

and completed your code (in future, please share the libraries you import):

import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# df holds the sample rows shown above (columns: index, text)
bitcoin = df

score_compound = []

for i in range(bitcoin.shape[0]):
    # Column 1 is the text column; keep only the compound value
    score = analyzer.polarity_scores(bitcoin.iloc[i, 1])
    score_compound.append(score['compound'])

score_compound

Returns:

[0.0258, 0.4005, 0.0, 0.6199, 0.9421]
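Since the loop runs cleanly on these five rows, the error in the question probably comes from rows that are not in the sample. One plausible cause, which I can't confirm without the full CSV, is that some Title cells are empty: pandas reads those as NaN, which is a float rather than a string. Below is a minimal sketch of guarding against that. The helper name compound_or_none and the stand-in scorer fake_polarity_scores are made up for illustration, so the snippet runs without VADER installed; in real use you would pass analyzer.polarity_scores instead.

```python
def compound_or_none(text, polarity_scores):
    """Return the 'compound' score for text, or None if text is not a usable string."""
    # Empty CSV cells arrive as NaN (a float), so anything that is not a
    # non-empty string is skipped instead of being sent to the analyzer.
    if not isinstance(text, str) or not text.strip():
        return None
    return polarity_scores(text)["compound"]


def fake_polarity_scores(text):
    # Hypothetical stand-in for analyzer.polarity_scores (assumption, not VADER itself)
    return {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0}


titles = ["Happy New Year and some nice Gains !!", float("nan"), None]
score_compound = [compound_or_none(t, fake_polarity_scores) for t in titles]
print(score_compound)  # → [0.0, None, None]
```

With the real data, something like bitcoin['compound'] = bitcoin['Title'].apply(lambda t: compound_or_none(t, analyzer.polarity_scores)) should write the scores into a new column, leaving None for rows without a title.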

【Comments】:
