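【Question title】: Replace words in a sentence with synonyms using Python
【Posted】: 2021-09-21 00:36:00
【Question description】:

I have a dataset named news_collection.csv that contains news items, and what I am struggling to do is replace words in the text with their synonyms.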

import pandas as pd

def similarity():
    tweets = pd.read_csv(r'news_collection.csv')
    df = pd.DataFrame(tweets, columns=['created_at', 'text'])
    df['created_at'] = pd.to_datetime(df['created_at'])
    df['text'] = df['text'].apply(lambda x: str(x))
    df["text"] = df["text"].apply(lambda x: replacesynonyms(x))

    return df

def replacesynonyms(text):
    file = open('syno.txt', 'r', encoding="utf8")
    # code to be added

Can someone help me with this algorithm?

【Question discussion】:

    Tags: python pandas dataframe nlp nltk


    【Solution 1】:

    Try this:

    import pandas as pd


    def similarity():
        # Load the news data and keep only the two columns we need
        tweets = pd.read_csv(r'news_collection.csv')
        df = pd.DataFrame(tweets, columns=['created_at', 'text'])
        df['created_at'] = pd.to_datetime(df['created_at'])
        # Make sure every entry is a string before replacing words
        df['text'] = df['text'].apply(str)
        df["text"] = df["text"].apply(replacesynonyms)
        return df
    
    
    def create_sets():
        # One set of synonyms per line of syno.txt (comma-separated words)
        lists_sets = []
        with open('syno.txt', 'r', encoding="utf8") as file:
            for line in file:
                s = set()
                words = line.split(',')
                for word in words:
                    s.add(word.strip())
                lists_sets.append(s)

        return lists_sets
    
    
    def create_syn_list():
        # The first word on each line is the replacement for that line's synonyms
        first_syn_name = []
        with open('syno.txt', 'r', encoding="utf8") as file:
            for line in file:
                first_syn_name.append(line.split(',')[0].strip())
        return first_syn_name
    
    lists_sets = create_sets()
    first_syn_list = create_syn_list()
    
    
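    def replacesynonyms(text):
        words = text.split()
        new_sentence_l = []
        for word in words:
            to_add = True
            # Look for the first synonym set that contains this word
            for idx, syn_set in enumerate(lists_sets):
                if word in syn_set:
                    # Replace the word with the first word of that line
                    new_sentence_l.append(first_syn_list[idx])
                    to_add = False
                    break
            # No synonym set matched, so keep the word unchanged
            if to_add:
                new_sentence_l.append(word)
        return ' '.join(new_sentence_l)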
    
    df = similarity()
    sen = list(df['text'])
    for i in sen:
        print(i)
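
The code above scans every synonym set for every word. As a rough alternative sketch, the same idea can be written with a single dictionary that maps each synonym to the first word on its line, so each word needs only one lookup. This assumes, as the answer does, that each line of syno.txt is a comma-separated list of synonyms. The names `build_synonym_map` and `replace_synonyms` and the sample lines are made up for illustration. Unlike the code above, this version matches case-insensitively and leaves punctuation attached to a word untouched, which may or may not be what you want.

```python
import re


def build_synonym_map(lines):
    """Map every word on a line to the first word on that line."""
    mapping = {}
    for line in lines:
        words = [w.strip() for w in line.split(',') if w.strip()]
        if not words:
            continue  # skip blank lines
        canonical = words[0]
        for word in words:
            # setdefault keeps the first line a word appears on,
            # like the "first matching set wins" loop in the answer
            mapping.setdefault(word.lower(), canonical)
    return mapping


def replace_synonyms(text, mapping):
    """Replace each run of word characters found in the mapping."""
    def swap(match):
        word = match.group(0)
        # Assumption: lookups ignore case; unknown words are kept as-is
        return mapping.get(word.lower(), word)
    return re.sub(r"\w+", swap, text)


# Hypothetical contents of syno.txt, given inline so the sketch runs on its own
sample_lines = ["big, large, huge\n", "quick, fast, rapid\n"]
synonym_map = build_synonym_map(sample_lines)
print(replace_synonyms("A huge dog ran fast.", synonym_map))
# prints: A big dog ran quick.
```

With the real file, the map could be built once with `with open('syno.txt', encoding='utf8') as f: synonym_map = build_synonym_map(f)` and applied with `df['text'].apply(lambda x: replace_synonyms(x, synonym_map))`. Multi-word synonyms would not be matched by this sketch, since it replaces one word at a time.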
    

    【Discussion】:

    • If this helps, please accept and upvote my answer.
    • I can't vote yet because I'm a new user. I made a few changes and it works well.