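【Question title】: How do I correctly sort a NumPy array whose dtype is not int?
【Posted】: 2021-12-18 23:19:18
【Question description】:

Please help! I ran into a problem when sorting a NumPy array in descending order by its second and third columns: the result comes out in the wrong order. Everything works until the code meets numbers greater than 9. How can I fix this?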

import numpy as np
data = [
    ['Other Theft', 2003, 5, 12, 16, 15, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 5, 7, 15, 20, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 4, 23, 16, 40, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 4, 20, 11, 15, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 4, 12, 17, 45, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 3, 26, 20, 45, 'Strathcona', 49.269802, -123.083763],
    ['Offence Against a Person', 2015, 8, 11,'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
    ['Break and Enter Residential/Other', 2003, 3, 10, 12, 0, 'Kerrisdale', 49.228051, -123.146610],
    ['Mischief', 2003, 6, 28, 4, 13, 'Dunbar-Southlands', 49.255559, -123.193725],
    ['Mischief', 2017, 3, 26, 23, 0, 'Sunset', 49.21431483, -123.101945],
    ['Other Theft', 2003, 2, 16, 9, 2, 'Strathcona', 49.269802, -123.083763],
    ['Break and Enter Residential/Other', 2003, 7, 9, 18, 15, 'Grandview-Woodland', 49.267734, -123.067654],
    ['Other Theft', 2003, 1, 31, 19, 45, 'Strathcona', 49.269802, -123.083763],
    ['Mischief', 2003, 9, 27, 1, 0, 'Dunbar-Southlands', 49.253762, -123.194407],
    ['Offence Against a Person', 2017, 1 , 24, 'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
    ['Break and Enter Residential/Other', 2003, 4, 19, 18, 0, 'Grandview-Woodland', 49.267814, -123.067441],
    ['Break and Enter Residential/Other', 2003, 9, 24, 18, 30, 'Grandview-Woodland', 49.267731, -123.067302],
    ['Break and Enter Residential/Other', 2003, 11, 5, 8, 12, 'Sunset', 49.226430, -123.085283],
    ['Break and Enter Commercial', 2003, 9, 26, 2, 30, 'West End', 49.284715, -123.122824],
    ['Break and Enter Residential/Other', 2003, 10, 21, 10, 0, 'Grandview-Woodland', 49.267811, -123.067089],
    ['Other Theft', 2003, 1, 25, 12, 30, 'Strathcona', 49.269802, -123.083763],
    ['Offence Against a Person', 2003, 2, 12, 'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
    ['Other Theft', 2003, 1, 9, 6, 45, 'Strathcona', 49.269802, -123.083763],
    ['Offence Against a Person', 2008, 2, 6, 'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
]
np_array = np.array(data)
bool_column = np.int_(np.empty(np_array.shape[0]))
for line in range(np_array.shape[0]):
    if np_array[line,4] == 'unknown' and np_array[line,5] == 'unknown':
        bool_column[line] = int(0)
    else:
        bool_column[line] = 1
np_array = np.append(np_array, np.reshape(bool_column,(np_array.shape[0],-1)), axis=1) #add bool col
#sort
np_array = np_array[np_array[:,2].argsort()]#month sort
np_array = np_array[np_array[:,1].argsort(kind='stable')[::-1]] #year sort with month
#count spaces for nice data output
equals_spaces = []
temp_for_cicle = 0
for num_in_line in range(np_array.shape[1]):
    for line in range (np_array.shape[0]):
        if len(np_array[line][num_in_line]) > temp_for_cicle:
            temp_for_cicle = len(np_array[line][num_in_line])
    equals_spaces.append(temp_for_cicle)
    temp_for_cicle = 0
#data output
for line in range(np_array.shape[0]):
    for num_in_line in range (np_array.shape[1]):
        print('{:<{}}'.format(np_array[line][num_in_line],equals_spaces[num_in_line]), end='')
        if num_in_line+1 == np_array.shape[1]:
            print("")
        else:
            print(" | ", end='')

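【Comments】:

  • Welcome to Stack Overflow! Please take the tour. What is the actual question? Also, that is a lot of code. For debugging help, you need to provide a minimal reproducible example, including minimal code, the expected output, and the actual output (or, if you get an error, the full error message with traceback). You can edit your question. For more tips, see How to Ask.
  • print(np_array.dtype) shows that the array's data type is a unicode string.
  • Why use numpy for tabular data? Use pandas.
  • idx = np_array[:,2].astype(int).argsort() might do the trick.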
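
The two comments about the dtype and astype(int) can be illustrated with a small sketch. The four rows below are made up for illustration (they are not the asker's data); the point is only that a mixed str/int list becomes a unicode string array, so the month column is compared character by character unless the sort key is cast to int first.

```python
import numpy as np

# Hypothetical mini dataset: mixing str and int makes NumPy
# coerce every element to a unicode string.
rows = [
    ['Other Theft', 2003, 5],
    ['Mischief', 2003, 11],
    ['Mischief', 2003, 10],
    ['Other Theft', 2003, 9],
]
arr = np.array(rows)
print(arr.dtype.kind)  # 'U' stands for unicode string

# Sorting the month column as text compares character by character,
# so '10' and '11' sort before '5' and '9'.
as_text = arr[arr[:, 2].argsort()][:, 2].tolist()

# Casting only the sort key to int gives numeric order;
# the array itself keeps its string dtype.
as_int = arr[arr[:, 2].astype(int).argsort()][:, 2].tolist()

print(as_text)
print(as_int)
```

Only the key passed to argsort is converted, so the rest of the code that relies on the string array (for example the len() calls used for column widths) keeps working.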

Tags: python arrays numpy sorting


【Solution 1】:

If you sort before converting to numpy, and skip the numpy.argsort calls, it works:

data.sort( key=lambda row: (-row[1], -row[2]) )
np_array = np.array(data)

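The suggestion to use pandas is probably the best advice.

【Comments】:

  • Well, thanks, that reminded me of pandas! This assignment is from a course, and they require doing everything with numpy. I spent a long time looking for a way to change the data type of just one column so that it sorts properly, but none of the solutions helped me. I'll use your solution; I hadn't thought of sorting everything beforehand. Thanks!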
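
For the NumPy-only requirement mentioned in this comment, one possible alternative (not the answerer's method) is to cast just the key columns to int and pass them to np.lexsort. This is a minimal sketch on made-up rows, assuming the goal is year descending, then month descending, as in the list-based sort above.

```python
import numpy as np

# Hypothetical rows for illustration; the array dtype is unicode string.
rows = [
    ['Other Theft', 2003, 5],
    ['Mischief', 2017, 3],
    ['Break and Enter Residential/Other', 2003, 11],
    ['Offence Against a Person', 2017, 1],
    ['Mischief', 2003, 9],
]
arr = np.array(rows)

# Convert only the key columns to integers; arr itself is untouched.
year = arr[:, 1].astype(int)
month = arr[:, 2].astype(int)

# np.lexsort treats the LAST key as the primary one.
# Negating the integer keys turns the ascending sort into a descending one.
order = np.lexsort((-month, -year))
sorted_arr = arr[order]

year_month = sorted_arr[:, 1:3].tolist()
print(year_month)
```

Negation works here because the keys are integers; for non-numeric keys a different approach would be needed.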