Create Matrix of word's count for multiple sentences using Ngram-Python: Answers

【Question title】: Create Matrix of word's count for multiple sentences using Ngram-Python
【Posted】: 2017-09-06 21:27:49
【Question description】:

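Suppose I have a CSV file containing multiple sentences (not paragraphs), for example sentences A, B, C, and so on. I want to build a matrix of word counts for each sentence using N-grams (unigrams or bigrams), so that I can easily read off an N-gram vector for every sentence from that matrix. How can I do this?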

PS: I have tried several approaches, but they all compute the N-grams for a single sentence or for a whole paragraph!

【Question comments】:

Possible duplicate of What are ngram counts and how to implement using nltk?
@YuvalRaz The question answered at that link is different from mine :)

Tags: python matrix n-gram

【Solution 1】:

You could try loading the sentences into a pandas DataFrame and using "apply" on each row.

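import pandas as pd

def ngrams(sentence, n=2):
    # Example n-gram function: split on whitespace, join each run of n consecutive words
    words = str(sentence).split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Assumes the sentences are in the first column of the CSV, one per row
x = pd.read_csv("the_sentences.csv")
x.iloc[:, 0].apply(ngrams)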
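To get from per-sentence N-grams to the count matrix the question asks for, something like the following might work. It is only a minimal sketch using the standard library: the helper names `ngrams` and `count_matrix` are made up for illustration, tokenization is a plain lowercase whitespace split, and the sample sentences stand in for the rows of the CSV file.

```python
from collections import Counter

def ngrams(sentence, n=1):
    """Return the n-grams of one sentence as space-joined strings."""
    # Assumption: lowercase + whitespace split is good enough as a tokenizer
    words = sentence.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def count_matrix(sentences, n=1):
    """Build a sentence-by-n-gram count matrix.

    Returns (vocab, matrix), where matrix[i][j] is the number of times
    vocab[j] occurs in sentences[i].
    """
    # One Counter per sentence, so counts never leak across sentences
    counts = [Counter(ngrams(s, n)) for s in sentences]
    # Shared, sorted vocabulary gives every row the same column order
    vocab = sorted(set().union(*counts))
    # A Counter returns 0 for n-grams that a sentence does not contain
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix

# Hypothetical sentences standing in for the CSV rows
sentences = ["the cat sat", "the cat saw the dog", "a dog barked"]

for n in (1, 2):
    vocab, matrix = count_matrix(sentences, n=n)
    print(vocab)
    for row in matrix:
        print(row)
```

Each row of `matrix` is the N-gram vector for one sentence. If scikit-learn is available, its `CountVectorizer` with the `ngram_range` parameter builds a comparable sentence-by-term matrix, though its default tokenization differs from the whitespace split assumed here.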

【Discussion】: