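[Posted]: 2019-08-29 07:29:18
[Question]:
I am writing code that takes the values from each column of a DataFrame and processes them. Whenever a value is NaN, I get an exception. I don't want to drop the columns that contain NaN. Previously I worked around this by simply catching the exception, but I can't do the same thing here because I'm using a list comprehension. Can someone suggest the right way to do this? This is how I handled it before: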
for index, row in df_work.iterrows():
    descrip = row['description']
    try:
        r = Rake()
        r.extract_keywords_from_text(descrip)
        key_words_dict_scores = r.get_word_degrees()
        row['Key_words'] = list(key_words_dict_scores.keys())
    except Exception as e:
        print(e)
        row['Key_words'] = ''
I want to do the same thing here:
df_work['specialties'] = [','.join(x) for x in df_work['specialties'].map(lambda x: x.lower().replace(' ','').split(',')).values]
df_work['industry'] = [','.join(x) for x in df_work['industry'].map(lambda x: x.lower().replace(' ','').split(',')).values]
df_work['type'] = [','.join(x) for x in df_work['type'].map(lambda x: x.lower().replace(' ','').split(',')).values]
The code above gives me this error:
'float' object has no attribute 'lower'
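For context, pandas stores a missing value in an object column as a float NaN, so the lambda calls `.lower()` on a float for those rows. A minimal reproduction, assuming a made-up two-row Series in place of the real `df_work['specialties']`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for df_work['specialties']
s = pd.Series(['Oil & Gas,Upstream', np.nan], dtype=object)

# The missing entry is a float NaN, not a str
missing = s.iloc[1]
print(type(missing))

# Calling a str method on every element fails on the NaN row
try:
    s.map(lambda x: x.lower())
    error_message = None
except AttributeError as e:
    error_message = str(e)
print(error_message)
```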
The specialties column contains data like this:
df_work.loc['TOTAL', 'specialties']
Output>>'Oil & Gas - Exploration & Production,Upstream,Refining,Trading,Shipping,Marketing,Energy,Crude Oil,Petroleum,Petrochemicals,Liquified Natural Gas,Renewable Energy,Drilling Engineering,Completion & Intervention Engineering,Geology,Geoscientists,IT'
type(df_work.loc['TOTAL', 'specialties'])
Output>>str
The expected output after running my code above should be:
Output>>'oil&gas-exploration&production,upstream,refining,trading,shipping,marketing,energy,crudeoil,petroleum,petrochemicals,liquifiednaturalgas,renewableenergy,drillingengineering,completion&interventionengineering,geology,geoscientists,it'
type(df_work.loc['TOTAL', 'specialties'])
Output>>str
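One possible way to get this output without a try/except is the vectorized `.str` accessor, which skips missing values and leaves them as NaN. The sketch below is only an illustration: the sample frame, the 'OTHER' row, and the 'industry' and 'type' values are invented, since the real `df_work` is not shown in full. Splitting on ',' and re-joining with ',' gives back the same string, so the split/join step is dropped here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; only the column names come from the question
df_work = pd.DataFrame(
    {
        'specialties': ['Oil & Gas - Exploration & Production,Crude Oil,IT', np.nan],
        'industry': ['Oil & Energy', np.nan],
        'type': [np.nan, 'Public Company'],
    },
    index=['TOTAL', 'OTHER'],
)

for col in ['specialties', 'industry', 'type']:
    # .str methods propagate NaN instead of raising on it
    df_work[col] = df_work[col].str.lower().str.replace(' ', '', regex=False)

result = df_work.loc['TOTAL', 'specialties']
print(result)
```

If an empty string is preferred over NaN for the missing rows (as in the earlier `except` branch), `.fillna('')` can be chained onto the end of the expression.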
[Comments]:
- Could you add some sample data, for example 3 rows of the specialties column?
- Added. Please check again.
- Could you check my solution?
Tags: python-3.x pandas numpy dataframe nan