How do I convert a list into a str using pandas? (Answers)

【Question title】: How do I convert a list into a str using pandas?
【Posted】: 2021-05-25 13:09:37
【Question description】:

type(df['Soft_skills'][0])
>>>str

I need the output to look like this:

df['Soft_skills'][0] = Management,Decision Making

and for the second row:

df['Soft_skills'][1] = None

I don't know how to remove the " characters and convert the value to a plain str format.

>>> df['Soft_skills']
0                       ["Management", "Decision Making"]
1                                                      []
2                                          ["Management"]
3                                                      []
4       ["Governance", "Management", "Leadership", "Te...
                              ...
1229                                                   []
1230                                                   []
1231                                                   []
1232                   ["Agenda (Meeting)", "Governance"]
1233                                                   []
Name: Soft_skills, Length: 1234, dtype: object

In some cases the data looks like this: The syllabus for this course will cover the following:, \n, *, The nature and purpose of cost and management accounting, \n, *, Source documents and coding, \n, *, Cost classification and measuring, \n, *, Recording costs, \n, *, Spreadsheets. I tried to clean it up with replace:

d = {
'Not Mentioned':'',
"\r\n": "\n",
"\r": "\n",
'\u00a0':' ',
': \n, *,  ':'\n * ',
' \n,':'\n',
}
df=df.replace(d.keys(),d.values(),regex=True)

But when I try this, nothing gets replaced. What is the problem, and what am I missing? I also tried this:

df['Course_content'] = df['Course_content']\
    .str.replace('Not Mentioned','')\
    .str.replace("\r\n", "\n")\
    .str.replace("\r", "\n")\
    .str.replace('\u00a0',' ')\
    .str.replace(', \n, *,  ','\n * ')\
    .str.replace(' \n,','\n')

But that did not work for me either.

【Question discussion】:

Could you run print(df.head(10).to_dict()) and paste the output into your question?

Tags: python pandas dataframe

【Solution 1】:

Try it with strip() and replace():

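df['Soft_skills'] = (df['Soft_skills'].str.strip("[]")       # drop the surrounding brackets
              .str.replace('"', '', regex=False)              # drop the double quotes shown in the data
              .replace(r'^\s*$', float('nan'), regex=True))   # empty strings become NaN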
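As an alternative to stripping characters, here is a minimal sketch that parses each cell instead. It assumes every cell is a string holding a valid JSON list, as the printed column suggests. The helper name `join_skills`, the output column `Soft_skills_joined`, and the three sample rows are made up for illustration.

```python
import json

import pandas as pd

# Sample rows modelled on the column shown in the question (illustrative only).
df = pd.DataFrame(
    {"Soft_skills": ['["Management", "Decision Making"]', "[]", '["Management"]']}
)


def join_skills(raw):
    """Parse a JSON-style list string and join its items with commas.

    Returns None for an empty list, matching the output the question asks for.
    """
    items = json.loads(raw)
    return ",".join(items) if items else None


# Hypothetical output column; assigning back to 'Soft_skills' would work the same way.
df["Soft_skills_joined"] = df["Soft_skills"].apply(join_skills)
print(df["Soft_skills_joined"].tolist())
```

This goes through apply(), so it will likely be slower than the vectorised string methods on a large frame, but it does not depend on which quote characters or spacing the cells use.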

Update:

First, create a dictionary:

d={
    'Â':'',
    'â€™':"'",
    'â€œ':'"',
    'â€“':'-',
    'â€':'"'
}

Finally, use the replace() method:

df=df.replace(d.keys(),d.values(),regex=True)

Source: I built the dictionary from this answer. It was written for PHP, but it deals with the same encoding problem.
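One caveat with regex=True: every dictionary key is treated as a regular expression, so a key containing a metacharacter such as * (as in the question's Course_content dictionary) does not match that character literally. The keys also have to match the data character for character, including spaces. Below is a minimal sketch of literal replacement, assuming that is what is wanted. The sample row and the `replacements` dictionary are illustrative, not taken from the real data.

```python
import pandas as pd

# Illustrative row shaped like the syllabus text quoted in the question.
df = pd.DataFrame(
    {
        "Course_content": [
            "will cover the following:, \n, *, Cost classification, \n, *, Recording costs"
        ]
    }
)

# Literal search strings mapped to their replacements (hypothetical subset).
replacements = {
    "\r\n": "\n",
    ", \n, *, ": "\n * ",
}

cleaned = df["Course_content"]
for old, new in replacements.items():
    # regex=False makes '*' and other metacharacters match literally.
    cleaned = cleaned.str.replace(old, new, regex=False)

print(cleaned.iloc[0])
```

Passing regex=False explicitly also keeps the behaviour the same across pandas versions, since the default for str.replace has changed over time.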

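【Discussion】:

Is there a reason to use this rather than df['Soft_skills'].apply(','.join)? It looks unnecessarily complicated, but maybe there are performance reasons.
Yes, I avoided apply() for performance reasons... btw, I have added both solutions.
The first one didn't work for me; it joined every character, like ``` [,",M,a,n,a,g,e,m,e,n,t,",, , ",D,e,c,i,s,i,o,... 1 [,] 2 [,",M,a,n,a,g,e,m,e,n,t, ",] 3 [,] ```
That's because the 'Soft_skills' column holds strings, so use the second method... please take a look at the updated answer. :)
5â€“8 hours per week: my CSV file contains characters like these. Is there an easier way to fix them, through the encoding or anything else?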
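On the encoding question, sequences like these usually appear when UTF-8 bytes were decoded as Windows-1252 at some point. If that is the cause here (an assumption, since the file itself is not shown), reading the CSV with the encoding it was actually written in, for example via the encoding parameter of pd.read_csv, would avoid the problem at the source. For data that is already garbled, here is a minimal sketch of reversing the damage. The helper name `fix_mojibake` and the sample values are made up for illustration.

```python
import pandas as pd


def fix_mojibake(text):
    """Undo UTF-8 text that was mistakenly decoded as cp1252.

    Non-strings and strings that cannot be round-tripped are returned unchanged.
    """
    if not isinstance(text, str):
        return text
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Build a garbled sample the same way the damage is assumed to have happened.
garbled = "5\u20138 hours per week".encode("utf-8").decode("cp1252")

s = pd.Series([garbled, "plain ascii", None])
fixed = s.map(fix_mojibake)
print(fixed.tolist())
```

The round trip cannot recover sequences in which a byte was lost along the way, which may be why the dictionary above carries a bare 'â€' entry. Those values are left untouched by this sketch and would still need the dictionary approach.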