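【Posted】: 2017-03-25 14:19:22
【Question】:
I have a pandas DataFrame column, df['lists'], whose cells are lists of tuples containing strings and integers, in the following format: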
0 [(a,b,89), (a,y,992), (a,t, 99), (a,m, 1028)]
1 [(b,u,855), (b,tt,934), (b, g, 69)]
2 [(c,k, 546),(c,gf,134), (c, dd, 569)]
3 [(d,zv, 546),(d,gyr,8834), (d, dds, 5693), (d, ddd, 3459)]
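In the real data the strings a, b, tt, etc. are longer and are used to compute Hamming distances. What I want is the maximum value in each row, written to df['max']: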
0 [1028]
1 [934]
2 [569]
3 [8834]
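The target can be pinned down with a small sketch. The frame below is a hypothetical stand-in for the real data: the column names `lists` and `max` follow the question, and the short strings replace the longer real ones.

```python
from operator import itemgetter

import pandas as pd

# Hypothetical stand-in for the real data: each cell is a list of
# (string, string, int) tuples, as in the question.
df = pd.DataFrame({
    "lists": [
        [("a", "b", 89), ("a", "y", 992), ("a", "t", 99), ("a", "m", 1028)],
        [("b", "u", 855), ("b", "tt", 934), ("b", "g", 69)],
        [("c", "k", 546), ("c", "gf", 134), ("c", "dd", 569)],
        [("d", "zv", 546), ("d", "gyr", 8834), ("d", "dds", 5693), ("d", "ddd", 3459)],
    ]
})

# Pick the tuple with the largest third element and keep only that value,
# wrapped in a one-element list to match the desired output.
df["max"] = df["lists"].apply(lambda row: [max(row, key=itemgetter(2))[2]])
```

This only works when every cell really is a list of tuples.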
This is how I built the column:
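from itertools import combinations, groupby
from operator import itemgetter
from pandas import Series
combined = ((x, y, 5 * x - 3 * y) for x, y in combinations(df['elements'], 2) if x != y)
series = Series(list(g) for k, g in groupby(combined, key=itemgetter(0)))
df['lists'] = series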
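For readers who want to run this step, here is a minimal sketch of the same pipeline. The `hamming` helper and the four sample strings are assumptions for illustration; the question does not show the real distance function or data.

```python
from itertools import combinations, groupby
from operator import itemgetter

import pandas as pd


def hamming(x, y):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))


# Hypothetical equal-length strings standing in for the real ones.
elements = pd.Series(["0000", "0011", "0111", "1111"])

# Every unordered pair once, in input order, tagged with its distance.
combined = ((x, y, hamming(x, y)) for x, y in combinations(elements, 2) if x != y)

# groupby only merges consecutive items; combinations already yields
# pairs grouped by their first element, so no sort is needed here.
series = pd.Series([list(g) for _, g in groupby(combined, key=itemgetter(0))])
```

Because each pair appears only once, the last element never comes first in a pair, so `series` has one entry fewer than `elements`. Assigning it back to a frame with the original index would leave `NaN` in the unmatched row.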
When I run:
from operator import itemgetter
df['lst'].apply(lambda x: [max(x, key=itemgetter(2))[-1]])
I get the following error:
Traceback (most recent call last):
  File "C:\Users\Desktop\phash\dene_2.py", line 78, in <module>
    df['similarity'].apply(lambda x: [max(x, key=itemgetter(2))[-1]])
  File "C:\Users\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\series.py", line 2294, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\src\inference.pyx", line 1207, in pandas.lib.map_infer (pandas\lib.c:66124)
  File "C:\Users\Desktop\phash\dene_2.py", line 78, in <lambda>
    df['similarity'].apply(lambda x: [max(x, key=itemgetter(2))[-1]])
TypeError: 'float' object is not iterable
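When `max` raises `'float' object is not iterable`, at least one cell it received was a float rather than a list, and in pandas that is most often `NaN`. Whether that is what happened here is an assumption; the traceback does not show the data. The sketch below reproduces the error on a hypothetical `similarity` column with one missing cell and shows one possible guard. `row_max` is an illustrative helper, not part of pandas.

```python
from operator import itemgetter

import numpy as np
import pandas as pd

# Hypothetical data: the second cell is NaN (a float), not a list.
df = pd.DataFrame({
    "similarity": [
        [("a", "b", 89), ("a", "m", 1028)],
        np.nan,
        [("c", "k", 546), ("c", "dd", 569)],
    ]
})


def row_max(cell):
    """Return [largest third element] for a list cell, NaN for anything else."""
    if not isinstance(cell, list):
        return np.nan
    return [max(cell, key=itemgetter(2))[2]]


# The unguarded lambda from the question fails on the NaN cell.
try:
    df["similarity"].apply(lambda x: [max(x, key=itemgetter(2))[-1]])
    raised = False
except TypeError:
    raised = True

# The guarded version skips non-list cells instead of raising.
df["max"] = df["similarity"].apply(row_max)
```

Checking `df['similarity'].apply(type).value_counts()` on the real data is a quick way to see whether any cells are floats.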
【Comments】:
-
I suggest you improve your data structure. Lists of tuples stored inside DataFrame cells are hard to work with.
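One way to read this suggestion, offered as a sketch rather than what the commenter necessarily had in mind, is to store one tuple per row in a flat table, so the per-group maximum becomes a plain `groupby`. The column names `left`, `right`, and `score` are hypothetical.

```python
import pandas as pd

# Hypothetical flat ("long") layout: one (left, right, score) tuple per row.
pairs = [
    ("a", "b", 89), ("a", "y", 992), ("a", "t", 99), ("a", "m", 1028),
    ("b", "u", 855), ("b", "tt", 934), ("b", "g", 69),
]
long_df = pd.DataFrame(pairs, columns=["left", "right", "score"])

# Maximum score per left-hand string; no nested lists or lambdas needed.
max_per_left = long_df.groupby("left")["score"].max()

# Full row holding each maximum, in case the partner string is also wanted.
best_rows = long_df.loc[long_df.groupby("left")["score"].idxmax()]
```

In this layout a string with no pairs simply has no rows, so there are no `NaN` cells to guard against.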
-
How can I do that? I got my result from the following: combined = ((x, y, 5 * x - 3 * y) for x, y in combinations(df['elements'], 2) if x != y); series = Series(list(g) for k, g in groupby(combined, key=itemgetter(0))); df['lists'] = series