【Question title】: Python: extract inside values from unstructured list with one element
【Posted】: 2020-05-12 01:18:45
【Question】:

Opening a .mat (MATLAB) file gave me an unstructured list containing a single element:

list_1 = [(np.array(['charge'], dtype='<U6'), np.array([[24]], dtype=float), np.array([[2.0080e+03, 4.0000e+00, 2.0000e+00, 1.3000e+01, 8.0000e+00,
         1.7921e+01]]), np.array([[(np.array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]]), np.array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
          1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
          1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
          1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]), np.array([[0.000000e+00, 2.532000e+00, 5.500000e+00, 8.344000e+00,
         1.112500e+01, 1.389100e+01, 1.667200e+01, 1.950000e+01,
         2.228200e+01, 2.506300e+01, 2.782800e+01, 3.064100e+01,
         3.345300e+01, 3.621900e+01, 3.973500e+01, 4.257800e+01]]))]],
       dtype=[('Res1', 'O'), ('Rea2', 'O'), ('Res3', 'O')]))]

I want to extract each np.array into a separate variable. The expected result is:

var1 = np.array([[2.0080e+03, 4.0000e+00, 2.0000e+00, 1.3000e+01, 8.0000e+00,
         1.7921e+01]])
var2 = np.array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]])
var3 = np.array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
          1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
          1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
          1.50732203e+00,  1.51022594e+00,  1.51185336e+00]])
var4 = np.array([[0.000000e+00, 2.532000e+00, 5.500000e+00, 8.344000e+00,
         1.112500e+01, 1.389100e+01, 1.667200e+01, 1.950000e+01,
         2.228200e+01, 2.506300e+01, 2.782800e+01, 3.064100e+01,
         3.345300e+01, 3.621900e+01, 3.973500e+01, 4.257800e+01]])

I tried converting list_1 to an np.array and calling np.squeeze(array_1).item(), but that is the wrong approach.

How can I parse the elements of a list like this? Thanks.

【Question comments】:

    Tags: python-3.x list numpy numpy-ndarray flatten


    【Solution 1】:

    Copying and pasting the question's code produces a list with 1 element:

    In [591]: list_1 = [(np.array(['charge'], dtype='<U6'), np.array([[24]], dtype=float), np.array([
         ...: [2.0080e+03, 4.0000e+00, 2.0000e+00, 1.3000e+01, 8.0000e+00, 
         ...:          1.7921e+01]]), np.array([[(np.array([[3.87301722, 3.47939356, 4.00058782,
         ...:  4.01239519, 4.01970806]]), np.array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00, 
         ...:           1.50906328e+00,  1.51131819e+00,  1.51277913e+00, 
         ...:           1.51183834e+00,  1.51024540e+00,  1.50779576e+00, 
         ...:           1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]), np.array([[0.000000e+00,
         ...:  2.532000e+00, 5.500000e+00, 8.344000e+00, 
         ...:          1.112500e+01, 1.389100e+01, 1.667200e+01, 1.950000e+01, 
         ...:          2.228200e+01, 2.506300e+01, 2.782800e+01, 3.064100e+01, 
         ...:          3.345300e+01, 3.621900e+01, 3.973500e+01, 4.257800e+01]]))]], 
         ...:        dtype=[('Res1', 'O'), ('Rea2', 'O'), ('Res3', 'O')]))]                          
    In [592]: len(list_1)                                                                            
    Out[592]: 1
    

    That element is a 4-element tuple:

    In [593]: list_1[0]                                                                              
    Out[593]: 
    (array(['charge'], dtype='<U6'),
     array([[24.]]),
     array([[2.0080e+03, 4.0000e+00, 2.0000e+00, 1.3000e+01, 8.0000e+00,
             1.7921e+01]]),
     array([[(array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]]), array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
              1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
              1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
              1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]), array([[ 0.   ,  2.532,  5.5  ,  8.344, 11.125, 13.891, 16.672, 19.5  ,
             22.282, 25.063, 27.828, 30.641, 33.453, 36.219, 39.735, 42.578]]))]],
           dtype=[('Res1', 'O'), ('Rea2', 'O'), ('Res3', 'O')]))
    In [594]: type(_)                                                                                
    Out[594]: tuple
    In [595]: len(__)                                                                                
    Out[595]: 4
    

    We can then unpack it into 4 variables:

    In [596]: var1,var2,var3,var4=list_1[0]                                                          
    In [597]: var1                                                                                   
    Out[597]: array(['charge'], dtype='<U6')      # a string
    In [598]: var2                                                                                   
    Out[598]: array([[24.]])                      # a number, (1,1) array
    In [599]: var3                                                                                   
    Out[599]: 
    array([[2.0080e+03, 4.0000e+00, 2.0000e+00, 1.3000e+01, 8.0000e+00,
            1.7921e+01]])
    

    var3 is a matrix, here a (1,6) numeric array.
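
    Because MATLAB stores everything as at least a 2-d matrix, strings and scalars arrive wrapped in small arrays. A minimal sketch of unwrapping them, using stand-in values typed in by hand rather than loaded from the .mat file (the names label, count and row are only illustrative):

```python
import numpy as np

# Stand-ins for the first three tuple items shown above.
var1 = np.array(['charge'], dtype='<U6')
var2 = np.array([[24]], dtype=float)
var3 = np.array([[2008.0, 4.0, 2.0, 13.0, 8.0, 17.921]])

# .item() turns a one-element array into a plain Python object.
label = var1.item()      # Python str
count = var2.item()      # Python float

# ravel() flattens the (1, 6) row into a 1-d array of length 6.
row = var3.ravel()

print(label, count, row.shape)
```

    Whether to unwrap at all depends on what the downstream code expects; keeping the 2-d shapes is also a reasonable choice.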

    In [600]: var4                                                                                   
    Out[600]: 
    array([[(array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]]), array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
             1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
             1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
             1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]), array([[ 0.   ,  2.532,  5.5  ,  8.344, 11.125, 13.891, 16.672, 19.5  ,
            22.282, 25.063, 27.828, 30.641, 33.453, 36.219, 39.735, 42.578]]))]],
          dtype=[('Res1', 'O'), ('Rea2', 'O'), ('Res3', 'O')])
    

    The last one is more complex; I think it is a struct in MATLAB. Here it is a structured array of shape (1,1) (1 element, 2d) with 3 fields, each of which holds an array (object dtype).

    In [601]: var4.shape                                                                             
    Out[601]: (1, 1)
    In [602]: var4.dtype                                                                             
    Out[602]: dtype([('Res1', 'O'), ('Rea2', 'O'), ('Res3', 'O')])
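
    To experiment with this layout without the original .mat file, a record of the same kind can be built by hand. This is only a sketch: demo and its short values are made up, and only the field names are taken from the question.

```python
import numpy as np

# Same field names as in the question, including the 'Rea2' spelling.
dt = np.dtype([('Res1', 'O'), ('Rea2', 'O'), ('Res3', 'O')])

# One record, stored with a 2-d (1, 1) shape like the MATLAB struct.
demo = np.empty((1, 1), dtype=dt)

# Fill each object field with its own array (made-up short values).
demo['Res1'][0, 0] = np.array([[3.87, 3.48, 4.00]])
demo['Rea2'][0, 0] = np.array([[-0.0012, -4.03]])
demo['Res3'][0, 0] = np.array([[0.0, 2.532, 5.5, 8.344]])

print(demo.shape)
print(demo.dtype.names)
print(demo[0, 0]['Res3'].shape)
```

    Each field can hold an array of a different length because the fields are object dtype: the structured array stores references to the inner arrays, not their numbers.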
    

    We can index it by field name:

    In [603]: var4[0,0]['Res1']                                                                      
    Out[603]: array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]])
    In [604]: var4[0,0]['Rea2']                                                                      
    Out[604]: 
    array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
             1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
             1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
             1.50732203e+00,  1.51022594e+00,  1.51185336e+00]])
    In [605]: var4[0,0]['Res3']                                                                      
    Out[605]: 
    array([[ 0.   ,  2.532,  5.5  ,  8.344, 11.125, 13.891, 16.672, 19.5  ,
            22.282, 25.063, 27.828, 30.641, 33.453, 36.219, 39.735, 42.578]])
    

    Taking the single element of var4 out of its (1,1) MATLAB shape:

    In [631]: var4[0,0]                                                                              
    Out[631]: 
    (array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]]), array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
             1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
             1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
             1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]), array([[ 0.   ,  2.532,  5.5  ,  8.344, 11.125, 13.891, 16.672, 19.5  ,
            22.282, 25.063, 27.828, 30.641, 33.453, 36.219, 39.735, 42.578]]))
    

    Extracting that into a tuple:

    In [632]: var4[0,0].tolist()                                                                     
    Out[632]: 
    (array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]]),
     array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
              1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
              1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
              1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]),
     array([[ 0.   ,  2.532,  5.5  ,  8.344, 11.125, 13.891, 16.672, 19.5  ,
             22.282, 25.063, 27.828, 30.641, 33.453, 36.219, 39.735, 42.578]]))
    In [633]: type(_)                                                                                
    Out[633]: tuple
    

    Without the [0,0], tolist gives us a couple of layers of list nesting, [[(....)]]
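
    The difference in nesting can be checked on a small hand-built record. A sketch with made-up values (demo stands in for var4 and has only two fields):

```python
import numpy as np

# A (1, 1) structured array with two object fields.
demo = np.empty((1, 1), dtype=[('Res1', 'O'), ('Rea2', 'O')])
demo['Res1'][0, 0] = np.array([[1.0, 2.0]])
demo['Rea2'][0, 0] = np.array([[3.0]])

nested = demo.tolist()         # list of rows, each a list of record tuples
record = demo[0, 0].tolist()   # just the record tuple

# Peeling the two list layers by hand reaches the same tuple contents.
inner = nested[0][0]

print(type(nested), type(nested[0]), type(inner))
print(type(record), len(record))
```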

    The field names of the structured array are:

    In [634]: var4.dtype.names                                                                       
    Out[634]: ('Res1', 'Rea2', 'Res3')
    

    And a dictionary that combines those names with the arrays from [632]:

    In [636]: dd = {name:val for name, val in zip(var4.dtype.names, var4[0,0].tolist())}             
    In [637]: dd                                                                                     
    Out[637]: 
    {'Res1': array([[3.87301722, 3.47939356, 4.00058782, 4.01239519, 4.01970806]]),
     'Rea2': array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
              1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
              1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
              1.50732203e+00,  1.51022594e+00,  1.51185336e+00]]),
     'Res3': array([[ 0.   ,  2.532,  5.5  ,  8.344, 11.125, 13.891, 16.672, 19.5  ,
             22.282, 25.063, 27.828, 30.641, 33.453, 36.219, 39.735, 42.578]])}
    In [638]: dd["Rea2"]                                                                             
    Out[638]: 
    array([[-1.20066070e-03, -4.03026848e+00,  1.51273065e+00,
             1.50906328e+00,  1.51131819e+00,  1.51277913e+00,
             1.51183834e+00,  1.51024540e+00,  1.50779576e+00,
             1.50732203e+00,  1.51022594e+00,  1.51185336e+00]])
    

    Compare this last access with In [604]. It is the same thing, with slightly different indexing.
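
    The dictionary can also be built by indexing each field directly, skipping tolist. A sketch on a hand-built record with made-up values; the extra ravel step is an optional assumption about what the caller wants, not part of the approach above:

```python
import numpy as np

# Small stand-in for var4: one record with two object fields.
demo = np.empty((1, 1), dtype=[('Res1', 'O'), ('Rea2', 'O')])
demo['Res1'][0, 0] = np.array([[1.0, 2.0]])
demo['Rea2'][0, 0] = np.array([[3.0]])

# Build the dictionary straight from the field names.
dd = {name: demo[0, 0][name] for name in demo.dtype.names}

# Optional: flatten every (1, n) value into a 1-d array of length n.
flat = {name: arr.ravel() for name, arr in dd.items()}

print(list(dd))
print(flat['Res1'].shape)
```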

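    【Comments】:

    • Yes, it works, but the question is how to automate it, without working out by hand that 4 variables are needed. Suppose I have a one-element list holding a tuple of 10000 np.arrays: I would not create 10000 variables by hand and assign the corresponding np.array() to each. In other words, how can I automatically detect that this tuple contains 10000 np.arrays? Anyway, thanks for the idea!
    • In Python we usually don't try to assign objects to a large number of separate variables (with names like var1, var2, ...). We use lists and tuples of objects (and dictionaries where needed). I added some code showing how to turn var4 into a tuple or a dictionary.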
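
    One way to follow that advice for an arbitrary number of items is to walk the nested structure and collect every plain array into a single dictionary. The sketch below is not from the answer: collect_arrays and its path-style keys are hypothetical, and the sample data is a shortened, hand-built stand-in for list_1.

```python
import numpy as np

def collect_arrays(obj, prefix="item"):
    """Recursively gather leaf values into a {path: value} dictionary."""
    out = {}
    if isinstance(obj, (list, tuple)):
        # Plain Python containers: visit each item by position.
        for i, child in enumerate(obj):
            out.update(collect_arrays(child, f"{prefix}[{i}]"))
    elif isinstance(obj, np.void) and obj.dtype.names:
        # A single structured record: visit each named field.
        for name in obj.dtype.names:
            out.update(collect_arrays(obj[name], f"{prefix}.{name}"))
    elif isinstance(obj, np.ndarray) and (obj.dtype.names or obj.dtype == object):
        # Structured or object arrays: visit every element.
        for idx in np.ndindex(obj.shape):
            out.update(collect_arrays(obj[idx], f"{prefix}{list(idx)}"))
    else:
        # Anything else (numeric or string arrays, scalars) is a leaf.
        out[prefix] = obj
    return out

# Shortened stand-in for list_1 with made-up values.
record = np.empty((1, 1), dtype=[('Res1', 'O'), ('Rea2', 'O')])
record['Res1'][0, 0] = np.array([[1.0, 2.0]])
record['Rea2'][0, 0] = np.array([[3.0]])
list_1 = [(np.array(['charge']), np.array([[24.0]]), record)]

arrays = collect_arrays(list_1, "list_1")
for path, value in arrays.items():
    print(path, value.shape)
```

    The keys record where each array came from, so a tuple with 10000 items needs no hand-written variable names; the number of arrays found is simply len(arrays).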