【问题标题】：Index by condition in python-numpy?在python-numpy中按条件索引？
【发布时间】：2019-05-31 13:34:12
【问题描述】：

我正在尝试从 Matlab 迁移到 Python。我正在将 Matlab 中的一些代码重写为 Python 进行测试。我已经安装了 Anaconda，目前正在使用 Spyder IDE。使用 Matlab，我创建了一个函数，该函数返回更接近函数输入参数的管道的商业 API 5L 直径（直径）和厚度（espesor）的值。我使用 Matlab 表进行了此操作。

请注意，直径（diametro_entrada）和厚度（espesor_entrada）的输入单位为米[m]，函数内部的厚度单位为毫米[mm]，这就是为什么最后我不得不乘以espesor_entrada*1000

    function tabla_seleccion=tablaAPI(diametro_entrada,espesor_entrada)
%Proporciona la tabla de caños API 5L, introducir diámetro en [m] y espesor
%en [m]
    Diametro_m=[0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;0.3556;...
    0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;0.4064;...
    0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;0.4570;...
    0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;0.5080;...
    0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;0.559;...
    0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;0.610;...
    0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;0.660;...
    0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;0.711;...
    0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;0.762;...
    0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813;0.813];

Espesor_mm=[4.8;5.2;5.3;5.6;6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;31.8;...
    4.8;5.2;5.6;6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8;...
    4.8;5.6;6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8;...
    5.6;6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8;33.3;34.9;...
    5.6;6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8;33.3;34.9;36.5;38.1;...
    6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8;33.3;34.9;36.5;38.1;39.7;...
    6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;...
    6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;...
    6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8;...
    6.4;7.1;7.9;8.7;9.5;10.3;11.1;11.9;12.7;14.3;15.9;17.5;19.1;20.6;22.2;23.8;25.4;27.0;28.6;30.2;31.8];

TablaAPI=table(Diametro_m,Espesor_mm);
tabla_seleccion=TablaAPI(abs(TablaAPI.Diametro_m-diametro_entrada)<0.05 & abs(TablaAPI.Espesor_mm-(espesor_entrada*1000))<1.2,:);
end

使用输入直径（d）和输入厚度（e），我得到直径小于 0.05 和厚度小于 1.2 的商用管道。

我想用 Numpy 或其他包在 Python 中重现这个。首先，我定义了 2 个 Numpy 数组，它们的名称与 Matlab 中的相同，但用逗号分隔而不是分号，并且每行末尾没有“...”，然后将另一个 Numpy 数组定义为：

TablaAPI=numpy.array([Diametro_m,Espesor_mm])

我想知道我是否可以像在 Matlab 中那样以某种方式索引该数组，或者我必须定义其他完全不同的东西。

非常感谢！

【问题讨论】：

我想你想要searchsorted
不，你没有，缺少厚度。您可能会更好地包含 MATLAB 方法（我自己不太可能熟悉）。 searchsorted 在这里不起作用。我最初是在化学。英。而且我很难相信这里只是出现了索引问题。
我认为包含您的 MatLab 代码可能会有用。

标签： python matlab numpy indexing

【解决方案1】：

你当然可以！

这是一个如何使用 numpy 的示例：

使用 Numpy

import math
import numpy as np

# Declare your Diametro_m, Espesor_mmhere just like you did in your example

# Transpose and merge the columns
arr = np.concatenate((Diametro_m, Espesor_mm.T), axis=1)
selection = arr[np.ix_(abs(arr[:0])<0.05,abs(arr[:1]-(math.e*1000)) > <1.2 )]

Example usage from John Zwinck's answer

使用数据框

如果您需要执行更繁重的查询或混合列数据类型，数据框也可能非常适合您的应用程序。如果您选择该选项，此代码应该适合您：

# These imports go at the top of your document
import pandas as pd
import numpy as np
import math


# Declare your Diametro_m, Espesor_mmhere just like you did in your example

df_d = pd.DataFrame(data=Diametro_m,
          index=np.array(range(1, len(Diametro_m))),
          columns=np.array(range(1, len(Diametro_m))))

df_e = pd.DataFrame(data=Espesor_mm,
          index=np.array(range(1, len(Diametro_m))),
          columns=np.array(range(1, len(Diametro_m))))

# Merge the dataframes
merged_df = pd.merge(left=df_d , left_index=True
                  right=df_e , right_index=True,
                  how='inner')

# Now you can perform your selections like this:
selection = merged_df.loc[abs(merged_df['df_d']) <0.05, abs(merged_df['df_e']-(math.e*1000))) <1.2]

# This "mask" of the dataframe will return all results that satisfy your query.
print(selection)

【讨论】：

您好，感谢您的回复，我无法使用这两个选项（Numpy 或数据帧）。我已经编辑了我的问题，所以更清楚了，因为我放在那里的“e”不是通用常数，而是“espesor”，我没有编辑你的答案，因为我不知道改变这个是否正确使它按我想要的方式工作。以下是出现的错误：使用带有 arr 的 numpy 方法：AxisError：轴 1 超出维度 1 数组的范围使用 df 方法：传递了构造错误，隐含）），ValueError：传递值的形状为（1,223） , 指数暗示(222,222)
您能在此处发布您的代码的任一版本吗？这对我们来说更容易自己测试并快速帮助您。

【解决方案2】：

由于您没有给出预期输出的示例，因此有点猜测您真正追求的是什么，但这里有一个带有 numpy 的版本。

# rewritten arrays for numpy
Diametro_m=[0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,0.3556,
    0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,0.4064,
    0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,0.4570,
    0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,0.5080,
    0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,0.559,
    0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,0.610,
    0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,0.660,
    0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,0.711,
    0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,0.762,
    0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813,0.813]

Espesor_mm=[4.8,5.2,5.3,5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,31.8,
    4.8,5.2,5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,
    4.8,5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,
    5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,33.3,34.9,
    5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,33.3,34.9,36.5,38.1,
    6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,33.3,34.9,36.5,38.1,39.7,
    6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,
    6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,
    6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,
    6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8]


import numpy as np

diametro_entrada = 0.4
espesor_entrada = 5

Diametro_m = np.array(Diametro_m)
Espesor_mm = np.array(Espesor_mm)
# Diametro_m and Espesor_mm has shape (223,)
# if not change so that they have that shape
table = np.array([Diametro_m, Espesor_mm]).T

mask = np.where((np.abs(Diametro_m - diametro_entrada) < 0.05) &
                (np.abs(Espesor_mm - espesor_entrada) < 1.2)
                )
result = table[mask]
print('with numpy')
print(result)

或者你可以只用 python 来做......

# redo with python only
# based on a simple dict and list comprehension
D_m = [0.3556, 0.4064, 0.4570, 0.5080, 0.559, 0.610, 0.660, 0.711, 0.762, 0.813]
E_mm = [[4.8,5.2,5.3,5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,31.8],
    [4.8,5.2,5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8],
    [4.8,5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8],
    [5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,33.3,34.9],
    [5.6,6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,33.3,34.9,36.5,38.1],
    [6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8,33.3,34.9,36.5,38.1,39.7],
    [6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4],
    [6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4],
    [6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8],
    [6.4,7.1,7.9,8.7,9.5,10.3,11.1,11.9,12.7,14.3,15.9,17.5,19.1,20.6,22.2,23.8,25.4,27.0,28.6,30.2,31.8]]

table2 = dict(zip(D_m, E_mm))
result2 = []
for D, E in table2.items():
    if abs(D - diametro_entrada) < 0.05:
        Et = [t for t in E if abs(t - espesor_entrada) < 1.2]
        result2 += [(D, t) for t in Et]
print('with vanilla python')
print('\n'.join((str(r) for r in result2)))

一旦你在 python 中，有无数种方法可以做到这一点，你可以很容易地用 pandas 或 sqlite 做同样的事情。我个人的偏好倾向于尽可能少的依赖，在这种情况下，我会选择一个 csv 文件作为输入，然后在没有 numpy 的情况下进行，如果这是一个真正的大规模问题，我会考虑 sqlite/numpy/pandas。

祝你过渡顺利，我想你不会后悔的。

【讨论】：

非常感谢，这正是我一直在寻找的东西，我尝试了 Numpy 方法，效果很好。
尼诺问题。顺便说一句，当我复制您的数据并对分号进行搜索替换为冒号并将其直接用于 np.array 时，它需要一些技巧才能使其正常工作，因此我在答案中制作了一个对 python 友好的数据集，可能会绊倒较早的答案。在任何情况下，一维数组在 numpy 世界中的正常形状是 (m,) 而不是 (m,1)，我看到有些人在开始使用 numpy 时习惯于使用 matlab。