【Posted】: 2020-12-13 09:58:22
【Question】:
I want to take the string "APPLES_10_4" in a dataframe and turn it into "APPLES". The code I came up with is as follows:
import pandas as pd

data = ['APPLES_10_4']
Name_Parameters = []
df = pd.DataFrame(data, columns=['fruit'], index=['count'])

def badletters(lastletter):
    badletters = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "_"]
    if lastletter in badletters:
        return True
    else:
        return False

def stripe(variable):
    tempStrippedVariable = variable
    foundEndVariable = False
    while not foundEndVariable:
        lastletter = tempStrippedVariable[:-1]
        if badletters(lastletter):
            tempStrippedVariable = tempStrippedVariable[:-1]
        else:
            foundEndVariable = True
            strippedVariable = tempStrippedVariable
    return strippedVariable

for variable in df:
    strippedVariable = stripe(str(variable))
    prefixes = []
    if strippedVariable not in prefixes:
        prefixes.append(strippedVariable)

print(df)
The output I get is the original dataframe with ['APPLES_10_4'], not a changed dataframe showing ['APPLES'].
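For anyone reproducing this, a small sketch of two things that seem to be going on in the code above: iterating over a DataFrame yields its column labels rather than its cell values, and `s[:-1]` is a slice (everything except the last character), not the last character itself. The variable names `labels`, `all_but_last` and `last_char` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(['APPLES_10_4'], columns=['fruit'], index=['count'])

# Iterating over a DataFrame yields its column labels, not its cell values.
labels = [label for label in df]

s = 'APPLES_10_4'
all_but_last = s[:-1]  # slice: everything except the last character
last_char = s[-1]      # index: only the last character

print(labels)
print(all_but_last, last_char)
```

Note also that nothing in the loop is ever written back into `df`, so `print(df)` would show the original values whatever `stripe` returns.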
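【Comments】:
- Why not use re.sub()?
- What does prefixes consist of?
- prefixes would be the whole dataframe, which contains other strings such as "mangoes_1_5", so prefixes should be an array of fruit names without any "badletters".
- The error is because the dataframe contains numbers, not just strings.
- Try for variable in df: print(variable) and you will see them.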
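Following the re.sub() suggestion, here is one possible sketch, not necessarily what the commenters had in mind. It assumes the unwanted suffix is always a run of digits and underscores at the end of the string; the pattern `[\d_]+$`, the helper name `strip_suffix` and the two-row sample frame are assumptions made for illustration. Note that the original `badletters` list has no "0", so a character-by-character version would stop at "APPLES_10".

```python
import re
import pandas as pd

# Sample data; "mangoes_1_5" is taken from the comment thread.
df = pd.DataFrame({'fruit': ['APPLES_10_4', 'mangoes_1_5']})

# Assumed pattern: one or more digits/underscores anchored at the end.
TRAILING = re.compile(r'[\d_]+$')

def strip_suffix(name):
    """Remove a trailing run of digits and underscores from name."""
    return TRAILING.sub('', name)

# Assign the result back to the column; otherwise df stays unchanged.
df['fruit'] = df['fruit'].map(strip_suffix)

# Unique fruit names in order of first appearance.
prefixes = df['fruit'].unique().tolist()
print(prefixes)
```

With this sample frame the column ends up holding "APPLES" and "mangoes". If digits or underscores can legitimately appear at the end of a fruit name, the pattern would need to be tightened.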