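[Posted]: 2021-06-18 22:43:24
[Question]:
There is a lot of documentation about this, but I can't figure it out.
Here is a list. I need to check whether one of these values appears in my column values. If it does, replace the entire cell with the list value.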
active_crews = ["CREW #101", "CREW #102", "CREW #203", "CREW #301", "CREW #404", "CREW #501", "CREW #406", "CREW #304", "CREW #701", "CREW #702", "CREW #703", "CREW #704", "CREW #705", "CREW #706",
"CREW #707", "CREW #708", "CREW #801", "CREW #802", "CREW #803", "CREW #805"]
A sample of the data I want to replace. Yes, there are subtle formatting differences too:
Debris Crew WO#
REFER TO IAP 12/16 TO 12/19 CREW #405
REFER TO IAP 06/02 TO 06/05 CREW #406
REFER TO IAP 03/24TO 03/27 CREW # 803
Expected output:
Debris Crew WO#
CREW #405
CREW #406
CREW #803
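My problem is that I don't know how to tell Python to use the list to search the column values for a match and, when a list value is found inside a column value, replace the current column value with the list value.
Code I have tried: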
1)
df.loc[df['Debris Crew WO#'] == active_crews, 'Debris Crew WO#']
# Doesn't work (this was before I did any research, lol). I get the following error, which makes sense:
# ValueError: ('Lengths must match to compare', (2216,), (19,))
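The error above comes from `==` comparing the column to the list element by element, which requires both to have the same length. Below is a minimal sketch of that behaviour on a made-up three-row frame; the sample rows and the shortened `active_crews` are assumptions for illustration only. It also shows that `Series.isin`, which tests whether a whole cell equals a list value, does not match cells that merely contain the crew name:

```python
import pandas as pd

# Hypothetical miniature of the real data, for illustration only.
df = pd.DataFrame({
    "Debris Crew WO#": [
        "REFER TO IAP 12/16 TO 12/19 CREW #405",
        "REFER TO IAP 06/02 TO 06/05 CREW #406",
        "CREW #803",
    ]
})
active_crews = ["CREW #406", "CREW #803"]  # shortened, hypothetical list

col = df["Debris Crew WO#"]

# Comparing a 3-row Series to a 2-item list is an element-wise comparison,
# so pandas refuses it because the lengths differ.
try:
    col == active_crews
    raised = False
except ValueError:
    raised = True
print(raised)

# isin() asks "is the whole cell equal to one of the list values?".
# Only the third row is an exact match; the longer strings are not.
exact = col.isin(active_crews)
print(exact.tolist())
```

So an exact-membership test is not enough here: the crew name has to be found inside the longer string first.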
df.loc[:, ['Place Holder']] = df.loc[:, 'Debris Crew WO#'].str[28:]
# This code "works", but because of the formatting differences I get data back like this:
8 REW #406
9 CREW #406
# Not very effective and cannot be relied on. I hate hard-coding anything.
df.loc[:, ['Place Holder']] = df.loc[:, 'Debris Crew WO#'].str[26:]
df.loc[:, ['Place Holder']] = df[['Place Holder']].str.split().join(" ")
# Tried this because I use the same special-character filter (inside a for loop) in another script, yet here I get this error and have no clue why. It works in my other code with no problems.
#AttributeError: 'DataFrame' object has no attribute 'str'
# even if I use .loc I get the same error:
df.loc[:, ['Place Holder']] = df.loc[:, 'Debris Crew WO#'].str[26:]
df.loc[:, ['Place Holder']] = df.loc[:, ['Place Holder']].str.split().join(" ")
# Plus it's still hard-coding (gross)
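A note on the `AttributeError`: selecting with a list of labels, as in `df[['Place Holder']]` or `df.loc[:, ['Place Holder']]`, returns a one-column DataFrame, and the `.str` accessor exists only on a Series. A minimal sketch of the difference, using made-up values in a `Place Holder` column (the data is an assumption for illustration). It also chains `.str.join(" ")` after `.str.split()`, since `split` yields a Series of lists:

```python
import pandas as pd

# Hypothetical values with irregular spacing, for illustration only.
df = pd.DataFrame({"Place Holder": ["CREW #  406", " CREW # 803"]})

one_col_frame = df[["Place Holder"]]  # list of labels -> DataFrame
series = df["Place Holder"]           # single label   -> Series

print(type(one_col_frame).__name__)   # DataFrame
print(type(series).__name__)          # Series
print(hasattr(one_col_frame, "str"))  # False: DataFrame has no .str accessor

# Split each string on whitespace, then rejoin the pieces with single spaces.
cleaned = series.str.split().str.join(" ")
print(cleaned.tolist())
```

This collapses runs of whitespace but does not remove the space after `#`, so the formatting difference in the sample data would still need separate handling.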
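Next I'm going to work with RE (regular expressions). I've been told it is great for "CTRL+F"-style filtering and is a key tool in data science. So over the next week or so I'll start with the RE documentation and practice it on this problem. I'll edit this post with updates as I make progress.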
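For reference, here is one way the regular-expression idea could look in pandas. This is only a sketch under assumptions: the names `pattern`, `number`, `normalized`, and `is_active` are hypothetical, the list is shortened, and it assumes every crew label has the shape "CREW", "#", then digits, with optional whitespace in between:

```python
import re
import pandas as pd

active_crews = ["CREW #406", "CREW #803"]  # shortened, hypothetical list

# The three sample rows from the question.
df = pd.DataFrame({
    "Debris Crew WO#": [
        "REFER TO IAP 12/16 TO 12/19 CREW #405",
        "REFER TO IAP 06/02 TO 06/05 CREW #406",
        "REFER TO IAP 03/24TO 03/27 CREW # 803",
    ]
})

# "CREW", optional whitespace, "#", optional whitespace, then the digits
# (captured), so "CREW #406" and "CREW # 803" are both recognised.
pattern = r"CREW\s*#\s*(\d+)"

# expand=False with a single capture group returns a Series of crew numbers
# (NaN where nothing matched).
number = df["Debris Crew WO#"].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Rebuild every match in the list's canonical "CREW #NNN" spelling.
normalized = "CREW #" + number

# Replace only the cells whose crew is in the list; leave the rest untouched.
is_active = normalized.isin(active_crews)
df["Place Holder"] = df["Debris Crew WO#"].where(~is_active, normalized)

print(df["Place Holder"].tolist())
```

With this shortened list, the `CREW #405` row is left unchanged because that crew is not in `active_crews`; assigning `normalized` directly instead would replace every matched row regardless of the list.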
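That being said, I have been learning Python for almost two months. Please forgive any "noob" style/coding; I'm just trying and experimenting so that I can make life better for myself and the people around me. Any help would be greatly appreciated. Thanks in advance.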
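[Comments]:
-
Crew #405 is not in your list. And CREW # 803 is formatted differently, with "# 803" instead of "#803"? Are these typos?
-
Yes, I only gave a short snippet, but everything in that list is actually in the larger dataframe. There are about 2,500 rows containing information like this, and the list is the set of crews I need to audit right now. I will add to or remove from that list as needed.
Tags: python pandas replace python-re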