【Posted】: 2021-11-01 02:19:47
【Question】:
import pandas as pd
df = pd.DataFrame({'a': ['Anakin Ana', 'Anakin Ana, Chris Cannon', 'Chris Cannon', 'Bella Bold'],
                   'b': ['Bella Bold, Chris Cannon', 'Donald Deakon', 'Bella Bold', 'Bella Bold'],
                   'c': ['Chris Cannon', 'Chris Cannon, Donald Deakon', 'Chris Cannon', 'Anakin Ana, Bella Bold']},
                  index=[0, 1, 2, 3])  # four rows, so the index needs four labels
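Hi everyone,
I am trying to count, row by row, how many names two columns have in common. Above is a sample of my data. At first I got the error 'float' object has no attribute 'split'. After some searching, it seems the error comes from my missing data, which is read in as floats. But even when I try to convert the columns to strings, I still get an error. My code is below.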
import pandas as pd
import csv
filepath = "C:/Users/data/Untitled Folder/creditdata2.csv"
df = pd.read_csv(filepath,encoding='utf-8')
df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
df['overlap_count'] = df['word_overlap'].str.len()
df.to_csv('creditdata3.csv',mode='a',index=False)
Here is the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-21-b85ac8637aae> in <module>
4 df = pd.read_csv(filepath,encoding='utf-8')
5
----> 6 df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
7 df['overlap_count'] = df['word_overlap'].str.len()
8
<ipython-input-21-b85ac8637aae> in <listcomp>(.0)
4 df = pd.read_csv(filepath,encoding='utf-8')
5
----> 6 df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
7 df['overlap_count'] = df['word_overlap'].str.len()
8
AttributeError: 'float' object has no attribute 'astype'
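The error can be reproduced without the CSV file. The following is only a minimal sketch, assuming the missing cells are loaded as NaN; the frame `demo` and its values are made up for illustration and are not the real data. It shows that a cell pulled out of `df.values` is a plain Python float, which has neither `.split` nor `.astype` (the latter is a method of arrays and Series, not of scalars).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: one cell in column 'a' is missing.
demo = pd.DataFrame({'a': ['Anakin Ana', np.nan],
                     'b': ['Bella Bold', 'Chris Cannon']}, dtype=object)

# Take the missing cell the same way the list comprehension does (x[8], x[10]).
missing = demo.values[1][0]
cell_type = type(missing).__name__

# A float scalar offers neither method, which matches both error messages.
has_split = hasattr(missing, 'split')
has_astype = hasattr(missing, 'astype')

# The built-in str() does work on a scalar, but it yields the literal text 'nan'.
as_text = str(missing)

print(cell_type, has_split, has_astype, as_text)
```

So calling `.astype(str)` on `x[8]` cannot work for any cell; the conversion has to be applied to the column, or each cell has to be checked before splitting.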
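【Comments】:
-
Could you define "how many names each column has in common" more clearly, or give an example of what the output should be?
-
Hi, for example, the first cells of column 1 and column 2 have no names in common, so the result is 0. However, the 4th cells of column 1 and column 2 share one name, "Bella Bold", so the result is 1.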
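One possible way to get the counts described in the comment above, shown as a sketch on the sample data rather than on the real CSV. The helper `to_names` is a hypothetical name introduced here; it treats any non-string cell (such as NaN) as having no names, and it strips the space that follows each comma so that "Chris Cannon" and " Chris Cannon" compare equal.

```python
import pandas as pd

# Columns 'a' and 'b' from the sample data in the question.
df = pd.DataFrame({
    'a': ['Anakin Ana', 'Anakin Ana, Chris Cannon', 'Chris Cannon', 'Bella Bold'],
    'b': ['Bella Bold, Chris Cannon', 'Donald Deakon', 'Bella Bold', 'Bella Bold'],
})

def to_names(cell):
    """Split a comma-separated cell into a set of names; missing cells give an empty set."""
    if not isinstance(cell, str):
        return set()
    return {name.strip() for name in cell.split(',') if name.strip()}

# Row-wise intersection of the two name sets, then the size of each intersection.
df['word_overlap'] = [to_names(a) & to_names(b) for a, b in zip(df['a'], df['b'])]
df['overlap_count'] = df['word_overlap'].apply(len)

print(df[['a', 'b', 'overlap_count']])
```

On the real file the same idea would presumably apply to the two columns currently addressed as `x[8]` and `x[10]`, selected by name or with `df.iloc[:, 8]` and `df.iloc[:, 10]`.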