【Question title】: Sentiment Analysis data not showing up in csv file
【Posted】: 2021-07-28 14:40:18
【Question description】:

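I put my data into a csv file (called "Essential Data_posts"). In my main function, I extract a specific column from this file (called "Post Text") so that I can analyze the post text with Google Cloud NLP for entity sentiment analysis. I then put this analysis into another csv file (called "SentimentAnalysis"). To do this, I put all the information related to the entity sentiment analysis into lists (one list per piece of information).

The problem I have is that when I run my code, nothing shows up in the SentimentAnalysis file except the headers, e.g. "Representative Name". When I checked the lengths of all the lists, I found that each one has length 0, so no information is being appended to them.

I am using Ubuntu 21.04 and Google Cloud Natural Language. I run all of this in a terminal, not on Google Cloud Platform. I am also using Python 3 and the emacs text editor.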

from google.cloud import language_v1
import pandas as pd
import csv
import os

#lists we are appending to
representativeName = []
entity = []
salienceScore = []
entitySentimentScore = []
entitySentimentMagnitude = []
metadataNames = []
metadataValues = []
mentionText = []
mentionType = []

def sentiment_entity(postTexts):
    client = language_v1.LanguageServiceClient()
    type_ = language_v1.Document.Type.PLAIN_TEXT
    language = "en"
    document = {"content": post_texts, "type": type_, "language": language}
    encodingType = language_v1.EncodingType.UTF8
    response = client.analyze_entity_sentiment(request = {'document': document, 'encoding type': encodingType})

    #loop through entities returned from the API
    for entity in response.entities:
        representativeName.append(entity.name)
        entity.append(language_v1.Entity.Type(entity.type_).name)
        salienceScore.append(entity.salience)
        entitySentimentScore.append(sentiment.score)
        entitySentimentMagnitude.append(sentiment.magnitude)
    
    #loop over metadata associated with entity 
    for metadata_name, metadata_value in entity.metadata.items():
        metadataNames.append(metadata_name)
        metadataValues.append(metadata_value)

    #loop over the mentions of this entity in the input document
    for mention in entity.mentions:
        mentionText.append(mention.text.content)
        mentionType.append(mention.type_)

#put the lists into the csv file (using pandas)    
data = {
    "Representative Name": representativeName,
    "Entity": entity,
    "Salience Score": salienceScore,
    "Entity Sentiment Score": entitySentimentScore,
    "Entity Sentiment Magnitude": entitySentimentMagnitude,
    "Metadata Name": metadataNames,
    "Metadata Value": metadataValues,
    "Mention Text": mentionText,
    "Mention Type": mentionType  
}

df = pd.DataFrame(data)
df
df.to_csv("SentimentAnalysis.csv", encoding='utf-8', index=False)

def main():
    import argparse

    #read the csv file containing the post text we need to analyze
    filename = open('Essential Data_posts.csv', 'r')

    #create dictreader object
    file = csv.DictReader(filename)

    postTexts = []

    #iterate over each column and append values to list
    for col in file:
    postTexts.append(col['Post Text'])

    parser = arg.parse.ArgumentParser()
    parser.add_argument("--postTexts", type=str, default=postTexts)
    args = parser.parse_args()

    sentiment_entity(args.postTexts)

【Question comments】:

    Tags: python ubuntu google-cloud-platform


    【Solution 1】:

    I tried running your code and ran into the following errors:

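    1. You are not using the passed parameter postTexts inside sentiment_entity(); the name post_texts is undefined, so the code fails at document = {"content": post_texts, "type": type_, "language": language}.
    2. A list cannot be passed to "content": post_texts; it must be a string. See the Document reference.
    3. In the request variable, 'encoding type' should be 'encoding_type'.
    4. The local variable entity should not share a name with entity = []. Python will try to append values to the local variable entity, which is not a list.
    5. It should be entity.sentiment.score and entity.sentiment.magnitude instead of sentiment.score and sentiment.magnitude.
    6. The loops over metadata and mention should be inside the loop for entity in response.entities: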
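
Point 4 can be illustrated without calling the API. The following is only a sketch: the SimpleNamespace objects are hypothetical stand-ins for the entities returned by the service, and the function names collect_shadowed and collect_renamed are made up for this example.

```python
from types import SimpleNamespace

entity = []        # module-level list, named as in the question
entity_names = []  # renamed list, in the spirit of entity_arr in the fix


def collect_shadowed(entities):
    """Reuse the name `entity` for the loop variable, as the question does."""
    for entity in entities:  # this local name hides the module-level list
        try:
            entity.append(entity.name)  # called on the stand-in object, not the list
        except AttributeError as exc:
            return f"AttributeError: {exc}"
    return "no error"


def collect_renamed(entities):
    """Use a distinct loop variable so the list stays reachable."""
    for ent in entities:
        entity_names.append(ent.name)
    return entity_names


# Hypothetical stand-ins for API entities; only the `name` field is modelled.
fake_entities = [SimpleNamespace(name="Grapes"), SimpleNamespace(name="Bananas")]

shadow_result = collect_shadowed(fake_entities)
renamed_result = collect_renamed(fake_entities)
print(shadow_result)
print(renamed_result)
```

With the shadowed name, the module-level list is never touched, which is one way a list can stay empty even though the loop appears to append to it.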

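    I edited your code and fixed the errors above. In your main(), I added a step that converts the list postTexts to a string so that it can be used in your sentiment_entity() function. metadataNames and metadataValues are commented out for now because I do not have an example that would populate these values.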
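
It may also help to see why the original script wrote a file containing only headers. There, the DataFrame is built at module level and main() is never called, so the file is written while every list is still empty. The sketch below reproduces that ordering with two of the lists; fill_lists and snapshot_csv are hypothetical names, and the appended values are placeholders rather than real API output.

```python
import pandas as pd

representativeName = []
salienceScore = []


def fill_lists():
    """Placeholder for the API loop: append one made-up entity."""
    representativeName.append("Grapes")
    salienceScore.append(0.83)


def snapshot_csv():
    """Render the current state of the lists as CSV text."""
    data = {
        "Representative Name": representativeName,
        "Salience Score": salienceScore,
    }
    return pd.DataFrame(data).to_csv(index=False)


before = snapshot_csv()  # nothing has filled the lists yet: header row only
fill_lists()
after = snapshot_csv()   # now the header is followed by one data row

print(before.splitlines())
print(after.splitlines())
```

This is why the edited code builds the DataFrame inside sentiment_entity(), after the loop has run, and calls main() under the if __name__ == "__main__": guard.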

    from google.cloud import language_v1
    import pandas as pd
    import csv
    import os
    
    #lists we are appending to
    representativeName = []
    entity_arr = []
    salienceScore = []
    entitySentimentScore = []
    entitySentimentMagnitude = []
    metadataNames = []
    metadataValues = []
    mentionText = []
    mentionType = []
    
    def listToString(s):
        """ Transform list to string"""
        str1 = " "
        return (str1.join(s))
        
    def sentiment_entity(postTexts):
        client = language_v1.LanguageServiceClient()
        type_ = language_v1.Document.Type.PLAIN_TEXT
        language = "en"
        document = {"content": postTexts, "type_": type_, "language": language}
        encodingType = language_v1.EncodingType.UTF8
        response = client.analyze_entity_sentiment(request = {'document': document, 'encoding_type': encodingType})
    
        #loop through entities returned from the API
        for entity in response.entities:
            representativeName.append(entity.name)
            entity_arr.append(language_v1.Entity.Type(entity.type_).name)
            salienceScore.append(entity.salience)
            entitySentimentScore.append(entity.sentiment.score)
            entitySentimentMagnitude.append(entity.sentiment.magnitude)
            #loop over the mentions of this entity in the input document
            for mention in entity.mentions:
                mentionText.append(mention.text.content)
                mentionType.append(mention.type_)
            #loop over metadata associated with entity
            for metadata_name, metadata_value in entity.metadata.items():
                metadataNames.append(metadata_name)
                metadataValues.append(metadata_value)
    
        data = {
        "Representative Name": representativeName,
        "Entity": entity_arr,
        "Salience Score": salienceScore,
        "Entity Sentiment Score": entitySentimentScore,
        "Entity Sentiment Magnitude": entitySentimentMagnitude,
        #"Metadata Name": metadataNames,
        #"Metadata Value": metadataValues,
        "Mention Text": mentionText,
        "Mention Type": mentionType
        }
    
        df = pd.DataFrame(data)
        df.to_csv("SentimentAnalysis.csv", encoding='utf-8', index=False)
    
    def main():
        #read the csv file containing the post text we need to analyze;
        #the with-block closes the file once the rows have been read
        with open('test.csv', 'r', newline='') as csv_file:
            #DictReader yields one dict per row, keyed by the header names
            reader = csv.DictReader(csv_file)
    
            #collect the "Post Text" value from every row
            postTexts = [row['Post Text'] for row in reader]
    
        content = listToString(postTexts) #convert list to string
        print(content)
        sentiment_entity(content)
    
    if __name__ == "__main__":
        main()
    

    test.csv:

    col_1,Post Text
    dummy,Grapes are good.
    dummy,Bananas are bad.
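
The read-and-join step can be tried on this sample without a file on disk. This is a minimal sketch: the sample is inlined as a string and wrapped in io.StringIO, and read_post_texts is a hypothetical helper name, not part of the code above.

```python
import csv
import io

# The test.csv sample, inlined so the sketch is self-contained.
SAMPLE = "col_1,Post Text\ndummy,Grapes are good.\ndummy,Bananas are bad.\n"


def read_post_texts(handle, column="Post Text"):
    """Return the values of one column, one entry per CSV row."""
    reader = csv.DictReader(handle)
    return [row[column] for row in reader]


posts = read_post_texts(io.StringIO(SAMPLE))

# The API's "content" field takes a single string, so the rows are joined.
content = " ".join(posts)
print(content)
```

Joining with a space treats all posts as one document; whether that is appropriate depends on whether per-post results are needed.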
    

    When I run the code, it prints the converted list as a string and generates SentimentAnalysis.csv:

    SentimentAnalysis.csv:

    Representative Name,Entity,Salience Score,Entity Sentiment Score,Entity Sentiment Magnitude,Mention Text,Mention Type
    Grapes,OTHER,0.8335162997245789,0.800000011920929,0.800000011920929,Grapes,2
    Bananas,OTHER,0.16648370027542114,-0.699999988079071,0.699999988079071,Bananas,2
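
In this output every entity has exactly one mention, so the parallel lists happen to have the same length. An entity with several mentions would make mentionText longer than representativeName, and pandas rejects a dict of lists with unequal lengths. One possible way around this, sketched here under assumptions, is to build one row per (entity, mention) pair. The objects below are hypothetical SimpleNamespace stand-ins that model only the fields the code above reads; make_entity and entity_rows are made-up names, and the second mention "they" is invented for illustration.

```python
from types import SimpleNamespace


def make_entity(name, score, mention_texts):
    """Build a stand-in shaped like the fields read from an API entity."""
    return SimpleNamespace(
        name=name,
        sentiment=SimpleNamespace(score=score),
        mentions=[
            SimpleNamespace(text=SimpleNamespace(content=text))
            for text in mention_texts
        ],
    )


def entity_rows(entities):
    """Return one dict per (entity, mention) pair so all columns line up."""
    rows = []
    for ent in entities:
        for mention in ent.mentions:
            rows.append({
                "Representative Name": ent.name,
                "Entity Sentiment Score": ent.sentiment.score,
                "Mention Text": mention.text.content,
            })
    return rows


entities = [
    make_entity("Grapes", 0.8, ["Grapes", "they"]),  # two mentions
    make_entity("Bananas", -0.7, ["Bananas"]),       # one mention
]
rows = entity_rows(entities)
for row in rows:
    print(row)
```

A list of dicts like rows can be passed directly to pd.DataFrame(rows), which keeps the rest of the CSV-writing code unchanged.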
    

    【Comments】:
