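【Posted】: 2017-12-13 04:37:05
【Question】:
I have a news dataset that I am running NLP on. I currently have two functions, one that computes similarity and one that computes sentiment; both take their input from the DataFrame. What I want to do is add new columns to the DataFrame that hold the computed values, i.e. the similarity and the sentiment (pos/neg).
The functions are as follows: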
# `similarity` is the asker's own module; `df` is the news DataFrame.
for i in range(0, 9):
    text1 = df.description[i]
    text2 = df.title[i]
    vector1 = similarity.text_to_vector(text1)
    vector2 = similarity.text_to_vector(text2)
    token1 = similarity.tokenize(text1)
    token2 = similarity.tokenize(text2)
    jaccard = similarity.jaccard_similarity(token1, token2)
    print ('Jaccard Similarity:', jaccard)
Output:
('Jaccard Similarity:', 0.07142857142857142)
('Jaccard Similarity:', 0.125)
('Jaccard Similarity:', 0.03225806451612903)
('Jaccard Similarity:', 0.07692307692307693)
('Jaccard Similarity:', 0.2)
('Jaccard Similarity:', 0.07407407407407407)
('Jaccard Similarity:', 0.12)
('Jaccard Similarity:', 0.043478260869565216)
('Jaccard Similarity:', 0.0)
Code:
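from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

# Classify the first nine descriptions as 'pos' or 'neg'.
for i in range(0, 9):
    blob = TextBlob(df.description[i], analyzer=NaiveBayesAnalyzer())
    y = blob.sentiment.classification
    print ('Result', y)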
Output:
('Result', 'pos')
('Result', 'neg')
('Result', 'pos')
('Result', 'pos')
('Result', 'pos')
('Result', 'neg')
('Result', 'pos')
('Result', 'pos')
('Result', 'neg')
【Comments】:
-
Build a list from your results; you can then create a Series or DataFrame from it and add that to the existing DataFrame later. -
Or use df["Jaccard Similarity"] = df.apply(your_function, axis=1) to run the function on every row and create the column "Jaccard Similarity". -
What problem are you running into?
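The two suggestions in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: `tokenize`, `jaccard_similarity`, and `classify_sentiment` are hypothetical stand-ins for the asker's `similarity` module and the TextBlob call, the sample rows are made up, and the column names `similarity`, `similarity_apply`, and `sentiment` are arbitrary.

```python
import pandas as pd


def tokenize(text):
    # Hypothetical stand-in for similarity.tokenize: lowercase whitespace split.
    return set(text.lower().split())


def jaccard_similarity(tokens1, tokens2):
    # Hypothetical stand-in for similarity.jaccard_similarity:
    # |intersection| / |union|, taken as 0.0 when both sets are empty.
    union = tokens1 | tokens2
    if not union:
        return 0.0
    return len(tokens1 & tokens2) / float(len(union))


def classify_sentiment(text):
    # Placeholder so the sketch runs without TextBlob. With the question's
    # setup this body would instead return
    # TextBlob(text, analyzer=NaiveBayesAnalyzer()).sentiment.classification
    negative_words = {"bad", "crash", "loss"}
    return "neg" if tokenize(text) & negative_words else "pos"


def row_similarity(row):
    # Receives one row at a time because apply is called with axis=1.
    return jaccard_similarity(tokenize(row["description"]), tokenize(row["title"]))


# Made-up sample data with the same column names as in the question.
df = pd.DataFrame({
    "title": ["Markets rally", "Storm causes loss"],
    "description": ["Markets rally after strong earnings",
                    "A bad storm hit the coast"],
})

# Approach 1: collect the results in a list, then assign the list as a column.
similarities = []
for description, title in zip(df["description"], df["title"]):
    similarities.append(jaccard_similarity(tokenize(description), tokenize(title)))
df["similarity"] = similarities

# Approach 2: let pandas call the function for every row (or every value).
df["similarity_apply"] = df.apply(row_similarity, axis=1)
df["sentiment"] = df["description"].apply(classify_sentiment)

print(df[["similarity", "similarity_apply", "sentiment"]])
```

Both approaches fill the column for every row rather than only the first nine. The list version stays closest to the original loops, while `apply` keeps the per-row logic in one function.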
Tags: python pandas dataframe sentiment-analysis text-analysis