【Posted】: 2021-02-20 15:36:13
【Problem description】:
This takes too long:
# Document-frequency
phrases_final["doc_freq"] = len(phrases_final) * [0]
# for each phrase, compute the number of clusters that phrase occurs in
for phrase in phrases_final["extracted_phrases"]:
    for i in cluster_name:
        all_tweets = ""
        for tweet in df["tweets_to_consider"][df.cl_num == i]:
            all_tweets = all_tweets + tweet + ". "
        if phrase in all_tweets:
            phrases_final["doc_freq"][
                (phrases_final.extracted_phrases == phrase) & (phrases_final.cluster_num == i)
            ] = (
                phrases_final["doc_freq"][
                    (phrases_final.extracted_phrases == phrase) & (phrases_final.cluster_num == i)
                ]
                + 1
            )
【Discussion】:
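One likely cause of the slowness is that the loop rebuilds the concatenated text of every cluster once per phrase, and then does two boolean-mask lookups for each hit. Below is a minimal sketch of one possible alternative, not a drop-in replacement. It assumes the goal is what the code comment states (for each phrase, the number of clusters whose joined tweets contain it), which is not exactly what the original loop accumulates. It also assumes `cluster_name` covers every distinct value of `df.cl_num`. The function name `add_doc_freq` and the sample data are hypothetical.

```python
import pandas as pd


def add_doc_freq(phrases_final, df):
    """Return a copy of phrases_final with a doc_freq column (hypothetical helper)."""
    # Build each cluster's text once, using the same "tweet + '. '" joining
    # as the original loop, instead of rebuilding it for every phrase.
    cluster_text = df.groupby("cl_num")["tweets_to_consider"].agg(
        lambda tweets: "".join(t + ". " for t in tweets)
    )
    # Assumption: doc_freq means "number of clusters whose text contains the phrase".
    counts = {
        phrase: sum(phrase in text for text in cluster_text)
        for phrase in phrases_final["extracted_phrases"].unique()
    }
    out = phrases_final.copy()
    out["doc_freq"] = out["extracted_phrases"].map(counts)
    return out


# Made-up sample data with the column names used in the question.
sample_df = pd.DataFrame(
    {
        "tweets_to_consider": [
            "I love green tea",
            "green tea is great",
            "coffee time",
            "tea or coffee",
        ],
        "cl_num": [0, 0, 1, 1],
    }
)
sample_phrases = pd.DataFrame(
    {"extracted_phrases": ["green tea", "coffee", "tea"], "cluster_num": [0, 1, 0]}
)
print(add_doc_freq(sample_phrases, sample_df))
```

The substring test still runs once per distinct phrase and cluster, but the string concatenation and the repeated mask-based assignments are gone, which is probably where most of the time was going.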