【Posted】: 2020-05-27 18:25:46
【Question】:
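OK, I need help. I wrote a function that searches a string for a specific word. If the function finds search_word, it returns that word and the N words before it. The function works on my test string, but I can't figure out how to apply it to an entire Series.
My goal is to create a new column in the DataFrame that holds n_words_prior whenever search_word is present.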
n_words_prior = []
test = "New School District, Dale County"

def n_before_string(string, search_word, N):
    global n_words_prior
    n_words_prior = []
    # str.find returns -1 when search_word is absent
    found_word = string.find(search_word)
    if found_word == -1:
        return ""
    # Keep only the text before the match, then slice its words with N (N=-1 keeps the last word)
    sentence = string[0:found_word]
    n_words_prior = sentence.split()[N:]
    n_words_prior.append(search_word)
    return n_words_prior
The current DataFrame looks like this:
import pandas as pd

data = [['Alabama', 'New School District, Dale County'],
        ['Alaska', 'Matanuska-Susitna Borough'],
        ['Arizona', 'Pima County - Tuscon Unified School District']]
df = pd.DataFrame(data, columns=['State', 'Place'])
The improved function would take the inputs 'Place', 'County', -1 and produce the following result.
improved_function(column, search_word, N)
new_data = [['Alabama', 'New School District, Dale County', 'Dale County'],
            ['Alaska', 'Matanuska-Susitna Borough', ''],
            ['Arizona', 'Pima County - Tuscon Unified School District', 'Pima County']]
new_df = pd.DataFrame(new_data, columns=['State', 'Place', 'Result'])
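For reference, the desired result above could be produced roughly as follows. This is only a sketch under a few assumptions: the function returns a joined string rather than a list, it drops the global variable, and improved_function takes the DataFrame as an extra first argument (named frame here, which is not part of the signature in the question). It relies on Series.apply to call the per-string function once per row.

```python
import pandas as pd


def n_before_string(string, search_word, N):
    """Return search_word preceded by the words selected by split()[N:], as one string."""
    found_at = string.find(search_word)
    if found_at == -1:
        return ""
    words_before = string[:found_at].split()[N:]
    return " ".join(words_before + [search_word])


def improved_function(frame, column, search_word, N):
    """Hypothetical wrapper: return a copy of frame with a new 'Result' column."""
    out = frame.copy()
    # Series.apply passes each cell as the first argument; args supplies the rest
    out["Result"] = out[column].apply(n_before_string, args=(search_word, N))
    return out


df = pd.DataFrame(
    [["Alabama", "New School District, Dale County"],
     ["Alaska", "Matanuska-Susitna Borough"],
     ["Arizona", "Pima County - Tuscon Unified School District"]],
    columns=["State", "Place"],
)
new_df = improved_function(df, "Place", "County", -1)
print(new_df["Result"].tolist())  # ['Dale County', '', 'Pima County']
```

Note that str.find matches substrings, so a word such as "Countyline" would also count as a hit, and that the slice only behaves as "the last N words" when N is negative.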
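I thought wrapping this in another function would help, but it only made things more confusing.
def fast_add(place, search_word):
    # Flag each row of df: 1 if the `place` column contains search_word, else 0
    df[search_word] = df[place].str.contains(search_word).apply(lambda found: 1 if found else 0)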
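A somewhat simpler way to build the same 0/1 flag column might look like the sketch below. It assumes the column has no missing values, passes the DataFrame in explicitly (the frame parameter is an addition, not part of the code above), and converts the boolean result with astype(int) instead of a lambda.

```python
import pandas as pd


def fast_add(frame, place, search_word):
    """Hypothetical variant: return a copy of frame with a 0/1 column named after search_word."""
    out = frame.copy()
    # regex=False treats search_word as a literal substring rather than a pattern
    out[search_word] = out[place].str.contains(search_word, regex=False).astype(int)
    return out


df = pd.DataFrame({"Place": ["New School District, Dale County",
                             "Matanuska-Susitna Borough"]})
flagged = fast_add(df, "Place", "County")
print(flagged["County"].tolist())  # [1, 0]
```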
【Discussion】:
Tags: python pandas string lambda