Extract certain conditions/values inside a loop: answers

【Question title】: Extract certain conditions/values inside a loop
【Posted】: 2019-10-25 09:43:40
【Question description】:

I have a dataset that I built using a loop (code below).

## A subset of the data
deficit = (E_daily[:,160,100] - P_daily[:,160,100])  # A view of `deficit` is given below
deficit_cum = np.zeros([365])   
start = 0
stop = 0
for i in range(365):
    deficit_cum[i] = deficit[i] + deficit_cum[i-1]
    if deficit_cum[i] >= 0:
        if np.nanmax(deficit_cum) <= deficit_cum[i]:
            stop = i
        else:
            continue
    else:
        deficit_cum[i] = 0
        if np.nanmax(deficit_cum) > deficit_cum[i]:
            continue
        else:
            start = i     #Start is not defined correctly by me

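This is what deficit looks like before the loop.

Now I am also interested in the indices where the accumulation starts and ends. I know this is confusing, so here is an example (this is what deficit_cum looks like):

In the figure above, produced by the code, the green line marks start (on the x-axis) and the red line marks stop (on the x-axis). So basically I want the maximum value to be my stop point. But I want my start to be less than stop, and there should be no negative values between start and stop. So for the figure below, my start should be around 111 and stop should be around 236.

I think I know how to get stop (code above), but I still cannot define start. [Additional information: my start should be the index of the last zero before the global maximum, and stop should be the index of the global maximum.]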
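The clarified requirement (start is the index of the last zero before the global maximum, stop is the index of the global maximum) could be sketched roughly as follows. This is only a minimal sketch on a made-up array: `find_start_stop` and the sample values are hypothetical, and it assumes `deficit_cum` has already been built with the reset-to-zero rule used in the loop from the question.

```python
import numpy as np

def find_start_stop(deficit_cum):
    """Return (start, stop) for a reset-at-zero cumulative deficit series.

    stop  = index of the global maximum (first occurrence if tied)
    start = index of the last zero at or before stop, or None if there is none
    """
    deficit_cum = np.asarray(deficit_cum, dtype=float)
    stop = int(np.nanargmax(deficit_cum))
    # Indices of all zeros up to and including the maximum
    zeros_before = np.flatnonzero(deficit_cum[:stop + 1] == 0)
    start = int(zeros_before[-1]) if zeros_before.size else None
    return start, stop

# Made-up series with two accumulation periods; the second holds the global maximum.
sample = np.array([0, 1, 2, 0, 0, 1, 3, 6, 4, 0, 2])
start, stop = find_start_stop(sample)
print(start, stop)  # 4 7
```

Because the series never drops below zero after the reset, every value between the last zero and the maximum is non-negative, which matches the "no negative values between start and stop" condition.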

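【Discussion】:

What do you want? Do you want start to always be less than stop?
Are you using pandas or plain numpy?
@Mohsen_Fatemi. Yes, I want start to be the point where the loop actually begins accumulating values (the green line). So it should always be less than stop. Basically, start should be the first zero between the green and red lines, and after that zero there should be no negative value or further zero.
@FlorianBernard. I am using numpy.
@Ep1c1aN Have a look at my answer.

Tags: python arrays loops if-statement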

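【Solution 1】:

So you have a list, and you want to find a min before a max, called start and stop respectively. Assuming the list is named a, you can use the argmax and argmin functions, so we have: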

import numpy as np

a = np.array([1,2,3,6,6,6,4,5])
start = np.argmin(a)

b = np.where(a==a.max()) # find indices in which max values exist
b = np.reshape(b,-1)
new_array = a[:b[-1]+1] # make a new array, it starts from 0 to index of last max value
# new_array == [1, 2, 3, 6, 6, 6]

c = np.where(new_array==new_array.min())
c = np.reshape(c,-1)

start = c[-1]
stop = b[-1]

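【Discussion】:

Your script does not work for the following array: a = [1, 1, 2, 3, 6, 6, 6, 4, 5]. Also, a list has no a.max().
@FlorianBernard Thanks for pointing that out; a should be a numpy array.
@Mohsen_Fatemi. Your code does give max, i.e. stop, but it cannot identify start, the "index of the zero before the global maximum". For start it gives me zero.
@Ep1c1aN I cannot see your data, but if that happens, it is because the min element before max is at index 0!
That is because my code looks for the first occurrence of min before the last occurrence of max!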
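To make the first-versus-last-occurrence distinction discussed in these comments concrete, here is a small sketch on the array from the first comment. The variable names are illustrative only and do not come from the answer.

```python
import numpy as np

a = np.array([1, 1, 2, 3, 6, 6, 6, 4, 5])

first_max = int(np.argmax(a))                     # argmax returns the first occurrence
last_max = int(np.flatnonzero(a == a.max())[-1])  # last index holding the maximum

before = a[:last_max + 1]                         # everything up to the last maximum
first_min = int(np.argmin(before))                # argmin returns the first occurrence
last_min = int(np.flatnonzero(before == before.min())[-1])

print(first_max, last_max)  # 4 6
print(first_min, last_min)  # 0 1
```

Which of the two minima is the right `start` depends on the data: with repeated minimum values, the first and last occurrence can be far apart.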

【Solution 2】:

OK, let's try something like this.

First we compute the deficit.

deficit = np.absolute(E_daily[:,160,100] - P_daily[:,160,100])
deficit_cum = np.cumsum(deficit)

The start index is where day > day - 1; the reverse condition marks stop.

start = np.where(np.diff(deficit) > 0)[0][0] + 1
stop = np.where(np.diff(deficit[start:]) < 0)[0][0] + start  # add the slice offset to index into deficit

Let's see it in action.

deficit = np.array([1,1,2,3,6,6,6,4,5])
start = np.where(np.diff(deficit) > 0)[0][0] + 1
stop = np.where(np.diff(deficit[start:]) < 0)[0][0] + start  # add the slice offset to index into deficit

Result

start = 2 # deficit[2] == 2
stop = 6 # deficit[6] == 6
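To see where these two numbers come from, the intermediate arrays can be inspected step by step. This is a small sketch on the same example array; `steps`, `rising`, `tail_steps` and `falling` are illustrative names that do not appear in the answer.

```python
import numpy as np

deficit = np.array([1, 1, 2, 3, 6, 6, 6, 4, 5])

steps = np.diff(deficit)               # day-to-day change: [0, 1, 1, 3, 0, 0, -2, 1]
rising = np.flatnonzero(steps > 0)     # positions where the next day is larger
start = int(rising[0]) + 1             # first day that is larger than the previous one

tail_steps = np.diff(deficit[start:])  # changes counted from `start` onward
falling = np.flatnonzero(tail_steps < 0)
stop = start + int(falling[0])         # shift back to an index into `deficit`

print(start, stop)  # 2 6
```

Note that this picks the first rise and the first fall after it, so on data with several peaks it finds the first local peak rather than the global maximum.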

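Note

When you use numpy, you should think in terms of matrices and forget about loops. Performing operations with numpy's built-in functions is good practice: it is faster and reproducible.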
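As an illustration of this note, here is a plain running total written once as a loop and once with `np.cumsum`. The input values are made up, and this is an ordinary cumulative sum without any reset rule.

```python
import numpy as np

values = np.array([2.0, -1.0, 3.0, 0.5])  # made-up daily values

# Loop version: accumulate one element at a time
running = np.zeros_like(values)
total = 0.0
for i, v in enumerate(values):
    total += v
    running[i] = total

# Vectorized version: one call to a numpy built-in
vectorized = np.cumsum(values)

print(np.allclose(running, vectorized))  # True
```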

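【Discussion】:

start and stop will be lists of indices, not a single index each!
Consider this example: >>> a = [1,2,3,6,6,6,0,4,5] I computed start and stop the way you described, and this is the result: >>> start array([1, 2, 3, 7, 8], dtype=int64) >>> stop array([6], dtype=int64)
@FlorianBernard. I am not sure np.cumsum(deficit) is the way to go. My deficit_cum values reset once they fall below zero and then start accumulating again from zero. Your solution might work when all values are positive, but not for a dataset with several peaks separated by returns to zero. I also tried your code on my dataset and it did not work.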
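The behaviour described in this last comment, a running total that resets to zero whenever it would go negative, differs from a plain `np.cumsum`. Below is a minimal sketch with made-up values; `cumsum_with_reset` is a hypothetical helper, not a NumPy function, and it assumes the input contains no NaN values.

```python
import numpy as np

def cumsum_with_reset(deficit):
    """Running total that is clamped to zero whenever it would go negative."""
    out = np.zeros(len(deficit))
    total = 0.0
    for i, d in enumerate(deficit):
        total = max(total + d, 0.0)  # same rule as the loop in the question
        out[i] = total
    return out

deficit = np.array([1.0, 2.0, -5.0, 1.0, 3.0, -1.0])  # made-up daily deficits
print(cumsum_with_reset(deficit).tolist())  # [1.0, 3.0, 0.0, 1.0, 4.0, 3.0]
print(np.cumsum(deficit).tolist())          # [1.0, 3.0, -2.0, -1.0, 2.0, 1.0]
```

The two series agree only until the first reset, which is why indices derived from `np.cumsum` do not line up with the start and stop the question asks for.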