【Question title】: How to get a weighted average of a list whose weights are limited by a variable in Python 3.6
【Posted】: 2018-01-09 13:15:11
【Question】:

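I hope the title makes sense. What I am trying to do is get the weighted average price of shoes that are offered at different prices and in different quantities. For example, I have: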

list_prices = [12,12.7,13.5,14.3]
list_amounts = [85,100,30,54]
BuyAmount = x

I want to know my weighted average price and the highest price I pay per shoe if I buy x shoes (assuming I buy the cheapest ones first).

This is what I have now (I use numpy):

    if list_amounts[0] >= BuyAmount:
        avgprice = list_prices[0]
        highprice = list_prices[0]

    elif (sum(list_amounts[0: 2])) >= BuyAmount:
        avgprice = np.average(list_prices[0: 2], weights=[list_amounts[0],BuyAmount - list_amounts[0]])
        highprice = list_prices[1]

    elif (sum(list_amounts[0: 3])) >= BuyAmount:
        avgprice = np.average(list_prices[0: 3], weights=[list_amounts[0],list_amounts[1],BuyAmount - (sum(list_amounts[0: 2]))])
        highprice = list_prices[2]

    elif (sum(list_amounts[0: 4])) >= BuyAmount:
        avgprice = np.average(list_prices[0: 4], weights=[list_amounts[0],list_amounts[1],list_amounts[2],BuyAmount - (sum(list_amounts[0: 3]))])
        highprice = list_prices[3]

    print(avgprice)
    print(highprice)

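This code works, but it is probably too complicated and too long, especially since I want to handle lists of amounts and prices with more than 20 items.

What would be a better way to do this?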

【Comments】:

    Tags: python list numpy average weighted


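    【Solution 1】:

    You are right that your code lacks flexibility. In my opinion you are looking at the problem from a valid angle, just not a general enough one.

    In other words, your solution implements this idea: "Let me first check, given the quantity available at each price (nicely sorted in an array), which sellers I have to buy from, and then do all the calculations."

    A more flexible idea is: "Let me buy as much as I can, starting from the cheapest. I stop when the order is complete and do the math step by step." This means you write iterative code that accumulates the total amount spent step by step and, once the order is complete, computes the average price per piece and the highest price (that is, the price of the last entry visited in your sorted list).

    Turning this idea into code: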

    list_prices = [12,12.7,13.5,14.3]
    list_amounts = [85,100,30,54]
    BuyAmount = x
    
    remaining = BuyAmount
    spent_total = 0
    current_seller = -1 # since we increment it right away 
    
    while remaining > 0:  # stop once the order is filled
        current_seller += 1
        # in case we cannot fulfill the order
        if current_seller >= len(list_prices):
            # since we need it later we have to restore the value
            current_seller -= 1
            break
        # we want either as many as available or just enough to complete 
        # BuyAmount
        buying = min([list_amounts[current_seller], remaining])
        # update remaining
        remaining -= buying
        # update total
        spent_total += buying * list_prices[current_seller]
    
    # if we got here we have no more remaining or no more stock to buy
    
    # average price
    avgprice = spent_total / (BuyAmount - remaining) 
    
    # max price - since the list is ordered -
    highprice = list_prices[current_seller]
    
    print(avgprice)
    print(highprice)
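
The same loop can also be wrapped in a function so it is easier to reuse and test. This is only a minimal sketch, not the answerer's code: the name `fill_order` and its return tuple are made up for illustration, and it assumes the prices are already sorted in ascending order. Unlike the loop above, it returns `None` for both prices when nothing is bought, instead of dividing by zero.

```python
def fill_order(prices, amounts, buy_amount):
    """Buy cheapest-first; return (average price, highest price paid, units bought).

    Hypothetical helper: assumes `prices` is sorted ascending and
    `amounts[i]` is the stock available at `prices[i]`.
    """
    remaining = buy_amount
    spent_total = 0.0
    high_price = None  # stays None if nothing is bought

    for price, available in zip(prices, amounts):
        if remaining <= 0:
            break  # order is complete
        # take either everything available or just enough to finish the order
        buying = min(available, remaining)
        if buying == 0:
            continue  # skip empty price levels
        spent_total += buying * price
        remaining -= buying
        high_price = price

    bought = buy_amount - remaining
    avg_price = spent_total / bought if bought else None
    return avg_price, high_price, bought


list_prices = [12, 12.7, 13.5, 14.3]
list_amounts = [85, 100, 30, 54]

# 150 is an illustrative order size, not a value from the original post
avg, high, bought = fill_order(list_prices, list_amounts, 150)
print(avg, high, bought)
```

If the order is larger than the total stock, `bought` comes back smaller than the requested amount, so the caller can tell the order was only partially filled.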
    

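    【Comments】:

    • Thank you very much for your reply. Good to hear my line of thinking was right. That said, I had not even thought of programming it so that it only looks at the relevant amounts. I implemented it like this and it works! I also learned a lot from reading your code and seeing how much more efficient it is than mine :) Thanks again
    【Solution 2】:

    This is a generic vectorized solution that uses cumsum to replace those sliced summations, and argmax to get the appropriate index to use as the slicing limit for those IF-case operations: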

    # Use cumsum to replace sliced summations - Basically all those 
    # `list_amounts[0]`, `sum(list_amounts[0: 2]))`, `sum(list_amounts[0: 3])`, etc.
    c = np.cumsum(list_amounts)
    
    # Use argmax to decide the slicing limits for the intended slicing operations.
    # So, this would replace the last number in the slices - 
    # list_prices[0: 2], list_prices[0: 3], etc.
    idx = (c >= BuyAmount).argmax()
    
    # Use the slicing limit to get the slice off list_prices needed as the first
    # input to numpy.average
    l = list_prices[:idx+1]
    
    # This step gets us the weights. Now, in the weights we have two parts. E.g.
    # for the third-IF we have : 
    # [list_amounts[0],list_amounts[1],BuyAmount - (sum(list_amounts[0: 2]))]
    # Here, we would slice off list_amounts limited by `idx`.
    # The second part is sliced summation limited by `idx` again.
    w = np.r_[list_amounts[:idx], BuyAmount - c[idx-1]]
    
    # Finally, plug-in the two inputs to np.average and get avgprice output.
    avgprice = np.average(l,weights=w)
    
    # Get idx element off list_prices as the highprice output.
    highprice = list_prices[idx]
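
To make the intermediate values concrete, here is a small sketch that runs the same steps on the question's data with `BuyAmount = 200` (a value picked only for illustration). It adds two things that are not in the answer above, so treat them as assumptions: a check for an order larger than the total stock (in that case `(c >= BuyAmount)` is all `False` and `argmax` quietly returns 0), and an explicit `bought_before` for the `idx == 0` case, where `c[idx-1]` would otherwise wrap around to `c[-1]`.

```python
import numpy as np

list_prices = [12, 12.7, 13.5, 14.3]
list_amounts = [85, 100, 30, 54]
BuyAmount = 200  # illustrative value, not from the original post

c = np.cumsum(list_amounts)  # array([ 85, 185, 215, 269])

# Guard (an addition): argmax returns 0 when no entry is True
if BuyAmount > c[-1]:
    raise ValueError("order exceeds total stock")

# First price level at which the cumulative stock covers the order
idx = int((c >= BuyAmount).argmax())  # 2 for this data

# Units already bought before reaching level idx (0 when idx == 0)
bought_before = c[idx - 1] if idx > 0 else 0

l = list_prices[:idx + 1]                                 # [12, 12.7, 13.5]
w = np.r_[list_amounts[:idx], BuyAmount - bought_before]  # array([ 85, 100,  15])

avgprice = np.average(l, weights=w)
highprice = list_prices[idx]

print(avgprice)   # about 12.4625
print(highprice)  # 13.5
```

The weights add up to `BuyAmount` by construction, so the average is simply the total spent divided by the number of shoes ordered.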
    

    We can optimize further to remove the concatenation step (the one using np.r_) and get to avgprice like so:

    slice1_sum = np.multiply(list_prices[:idx], list_amounts[:idx]).sum()
            # or np.dot(list_prices[:idx], list_amounts[:idx])
    slice2_sum = list_prices[idx]*(BuyAmount - c[idx-1])
    weight_sum = np.sum(list_amounts[:idx]) + BuyAmount - c[idx-1]
    avgprice = (slice1_sum+slice2_sum)/weight_sum
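
A related formulation avoids the index bookkeeping altogether. It is offered here as an alternative sketch, not as the answerer's method: capping the cumulative sum at the order size and differencing it gives the number of units bought at each price level. The function name `weighted_fill` is hypothetical, and the sketch assumes ascending prices and a NumPy version that supports `np.diff(..., prepend=...)` (1.16 or later).

```python
import numpy as np


def weighted_fill(prices, amounts, buy_amount):
    """Return (average price, highest price paid) for a cheapest-first order.

    Hypothetical helper; assumes `prices` is sorted ascending.
    """
    prices = np.asarray(prices, dtype=float)
    c = np.cumsum(amounts)
    if buy_amount <= 0 or buy_amount > c[-1]:
        raise ValueError("buy_amount must be positive and within total stock")

    # Cumulative units bought after each level, capped at the order size
    filled = np.minimum(c, buy_amount)
    # Units bought at each level (zero past the level that completes the order)
    bought = np.diff(filled, prepend=0)

    # The per-level units sum to buy_amount, so it is the weight total
    avg_price = np.dot(prices, bought) / buy_amount
    # Highest price paid is the last level with a nonzero purchase
    high_price = prices[np.nonzero(bought)[0][-1]]
    return avg_price, high_price


list_prices = [12, 12.7, 13.5, 14.3]
list_amounts = [85, 100, 30, 54]
print(weighted_fill(list_prices, list_amounts, 200))
```

This still allocates a few temporary arrays, so it is not necessarily faster than the sliced version above; the point is mainly that no special case is needed when the first price level alone covers the order.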
    

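    【Comments】:

    • Using numpy is in most cases more efficient than plain Python, but the code often looks obscure: could you add some comments and explanation?
    • @FabioVeronese Added.
    • Thank you very much for your help. I had to do quite a bit of research since I was not familiar with cumsum, argmax and np.r_. cumsum really does cut the length of the code a lot haha, exactly what was needed. The way you find the index of the amount that fills the order is also very handy, and in the end you have very efficient code! I do have one question: why is the optimization that removes the concatenation step a good thing? Thanks again for your help, learned a lot!
    • @Cennnn Concatenation needs extra memory to store the concatenated array. By replacing it with the in-place operations shown in the latter part, we save memory and hopefully gain some performance as well.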