【Question title】: Replacing multiple values in a Pandas column from a combination of two lists
【Posted】: 2019-10-26 13:04:16
【Question】:

I have two lists:

a = ["hi", "hello", "hey"]
b = ["Sam", "dean"]

and a DataFrame with a ques column:

import pandas as pd

df = pd.DataFrame({'ques':["<input1> This is <input2>", "<input1> Sir, Do you know <input2>?"]})

I want to replace <input1> with the elements of list a and <input2> with the elements of list b, and build the set of unique questions.

So my expected output is:

['hi This is Sam',
 'hi This is dean',
 'hello This is Sam',
 'hello This is dean',
 'hey This is Sam',
 'hey This is dean',
 'hi Sir, Do you know Sam?',
 'hi Sir, Do you know dean?',
 'hello Sir, Do you know Sam?',
 'hello Sir, Do you know dean?',
 'hey Sir, Do you know Sam?',
 'hey Sir, Do you know dean?']

Either a list or a pandas column is fine.

What I tried:

from itertools import product

c = list(product(a,b))

ques = []
for q in df['ques']:
  for i in c:
    temp = q.replace("<input1>", i[0]).replace("<input2>", i[1])
    ques.append(temp)

This gives me the expected result, but my data is too large, so I am looking for a more efficient solution.

【Comments】:

    Tags: python python-3.x pandas


    【Solution 1】:

    You can combine itertools.product with DataFrame.replace:

    import itertools

    # One copy of df per (x, y) pair, with both placeholders substituted
    dfs = [
        df.replace({'ques': {'<input1>': x, '<input2>': y}}, regex=True)
        for x, y in itertools.product(a, b)
    ]
    pd.concat(dfs, ignore_index=True)
    

                                ques
    0                 hi This is Sam
    1       hi Sir, Do you know Sam?
    2                hi This is dean
    3      hi Sir, Do you know dean?
    4              hello This is Sam
    5    hello Sir, Do you know Sam?
    6             hello This is dean
    7   hello Sir, Do you know dean?
    8                hey This is Sam
    9      hey Sir, Do you know Sam?
    10              hey This is dean
    11    hey Sir, Do you know dean?
    

    【Discussion】:

    • Anything involving products and strings is bound to be slow. This at least avoids iterating directly, and it is a bit easier on the eyes.
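
For comparison, here is a minimal plain-Python sketch that is not part of the original answer. It assumes the templates have been pulled out of the column into a list (for example with df['ques'].tolist()); the names templates and unique_ques are introduced here for illustration only. It still creates one string per combination, so the total work is unchanged, but it avoids building a separate DataFrame copy for every (x, y) pair, and it yields the same ordering as the expected output in the question.

```python
from itertools import product

a = ["hi", "hello", "hey"]
b = ["Sam", "dean"]

# Assumption: in the question these strings come from df['ques'],
# e.g. templates = df['ques'].tolist()
templates = [
    "<input1> This is <input2>",
    "<input1> Sir, Do you know <input2>?",
]

# product() varies the rightmost iterable fastest, so listing the templates
# first groups the results by question, then by greeting, then by name.
ques = [
    q.replace("<input1>", x).replace("<input2>", y)
    for q, x, y in product(templates, a, b)
]

# dict.fromkeys drops duplicates while keeping first-seen order
unique_ques = list(dict.fromkeys(ques))
```

If a pandas column is preferred over a list, unique_ques can be wrapped in pd.Series afterwards.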