【Question title】: binning data and inclusive result
【Posted】: 2012-05-15 09:43:05
【Question】:

Suppose I have binned some data in a structure like this:

data = {(1, 1): [...],  # list of floats
        (1,2): [...],
        (1,3): [...],
        (2,1): [...],
        ... }

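Here I have only two axes for the binning, but suppose I have N of them. Now suppose, for example, that I have N=3 axes and I want the data where the second bin is 1, so I want a function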

(None, 1, None) -> [(1, 1, 1), (1, 1, 2), (1, 1, 3), ...
                    (2, 1, 1), (2, 1, 2), (2, 1, 3), ...]

so that I can use itertools.chain on the result.

The range of each axis is known from:

axes_ranges = [(1, 10), (1, 8), (1, 3)]

Other examples:

(None, 1, 2) -> [(1, 1, 2), (2, 1, 2), (3, 1, 2), ...]
(None, None, None) -> all the combinations
(1,2,3) -> [(1,2,3)]

【Comments】:

    Tags: python combinatorics itertools


    【Solution 1】:

    It looks very much like you are reinventing the wheel. What you probably want to use is numpy.ndarray:

        >>> import numpy as np
        >>> x = np.arange(0, 27)
        >>> x
        array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
               17, 18, 19, 20, 21, 22, 23, 24, 25, 26])
        >>> x = x.reshape(3, 3, 3)  # reshape returns a new array, so rebind x
        >>> x
        array([[[ 0,  1,  2],
                [ 3,  4,  5],
                [ 6,  7,  8]],

               [[ 9, 10, 11],
                [12, 13, 14],
                [15, 16, 17]],

               [[18, 19, 20],
                [21, 22, 23],
                [24, 25, 26]]])

        >>> x[0]
        array([[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8]])
        >>> x[:, 1, :]
        array([[ 3,  4,  5],
               [12, 13, 14],
               [21, 22, 23]])
        >>> x[:, 1, 1]
        array([ 4, 13, 22])
    

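    This works for any number of dimensions. In the example the index is three-dimensional; you can think of the array as a cube with x[a,b,c] = x[layer,row,column]. Using ":" as an index simply means "all".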
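A bare ":" inside square brackets is shorthand for `slice(None)`, so a pattern such as (None, 1, 1) can be turned into an index programmatically. The following is a minimal sketch of that idea; the helper name `pattern_to_index` is hypothetical and not part of numpy. Note that numpy indices start at 0, whereas the bins in the question appear to start at 1, so an offset may be needed.

```python
import numpy as np

def pattern_to_index(pattern):
    """Turn a pattern like (None, 1, 1) into an index tuple equivalent to [:, 1, 1]."""
    # slice(None) is what a bare ":" stands for inside square brackets
    return tuple(slice(None) if p is None else p for p in pattern)

x = np.arange(27).reshape(3, 3, 3)

# Same selection as x[:, 1, 1]
print(x[pattern_to_index((None, 1, 1))])
```

Indexing with a tuple is the same as writing the comma-separated indices by hand, so the helper works for any number of axes.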

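    【Comments】:

    • This is nice, but now I have two questions: 1. How do I translate (None, 1, 1) into x[:, 1, 1]? What kind of symbol is ":"? 2. My data are not ints (or floats): for every bin I have a collection (a list) of floats.
    • Are the lists of floats all the same length?
    【Solution 2】:

    Hmm, how about: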

    import itertools
    
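    def combinations_with_fixpoint(iterables, *args):
        # A fixed value pins its axis; None leaves the whole axis range free.
        # Compare against None explicitly so a fixed bin of 0 is not treated as free.
        return itertools.product(*([x] if x is not None else y for x, y in zip(args, iterables)))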
    
    
    axes_ranges = [(1, 7), (1, 8), (77, 79)]
    
    combs = combinations_with_fixpoint(
        itertools.starmap(range, axes_ranges),
        None, 5, None
    )
    
    for p in combs:
        print(p)
    
    # (1, 5, 77)
    # (1, 5, 78)
    # (2, 5, 77)
    # (2, 5, 78)
    # (3, 5, 77)
    # (3, 5, 78)
    # (4, 5, 77)
    # (4, 5, 78)
    # (5, 5, 77)
    # (5, 5, 78)
    # (6, 5, 77)
    # (6, 5, 78)    
    

    Or perhaps just pass a list, to allow multiple "fixed points":

    def combinations_with_fixpoint(iterables, *args):
        return itertools.product(*(x or y for x, y in zip(args, iterables)))
    
    combs = combinations_with_fixpoint(
        itertools.starmap(range, axes_ranges),
        None, [5, 6], None
    )
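To connect this back to the question, the generated keys can be used to look up the binned lists and join them with itertools.chain. The sketch below is one way to do that; the names `expand_pattern` and `collect` and the toy `data` are hypothetical, and it assumes each `(lo, hi)` pair is passed straight to `range`, so the upper bound is exclusive, as in the code above. The question does not say whether its ranges are inclusive.

```python
import itertools

def expand_pattern(axes_ranges, pattern):
    """Yield every bin key matching pattern; None means 'any bin on that axis'."""
    choices = [
        range(lo, hi) if fixed is None else [fixed]
        for (lo, hi), fixed in zip(axes_ranges, pattern)
    ]
    return itertools.product(*choices)

def collect(data, axes_ranges, pattern):
    """Chain together the float lists of all matching bins present in data."""
    keys = expand_pattern(axes_ranges, pattern)
    # Bins missing from data contribute nothing instead of raising KeyError
    return list(itertools.chain.from_iterable(data.get(k, []) for k in keys))

# Hypothetical toy data: two axes, bins numbered from 1, upper bound exclusive
axes_ranges = [(1, 3), (1, 4)]
data = {(1, 1): [0.1], (1, 2): [0.2, 0.3], (2, 2): [0.4], (2, 3): [0.5]}

print(collect(data, axes_ranges, (None, 2)))  # [0.2, 0.3, 0.4]
```

Using `data.get(k, [])` is a design choice for sparse binnings; if every bin is guaranteed to exist, `data[k]` would surface missing keys as errors instead.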
    

    【Comments】:

      【Solution 3】:
      binning = [[0, 0.1, 0.2], [0, 10, 20], [-1, -2, -3]]
      range_binning = [(1, len(x) + 1) for x in binning]
      
      def expand_bin(thebin):
          def expand_bin_index(thebin, freeindex, rangebin):
              """
              thebin = [1, None, 3]
              freeindex = 1
              rangebin = [4,5]
              -> [[1, 4, 3], [1, 5, 3]]
              """
              result = []
              for r in rangebin:
                  newbin = thebin[:]
                  newbin[freeindex] = r
                  result.append(newbin)
              return result
      
          tmp = [thebin]
          indexes_free = [i for i,aa in enumerate(thebin) if aa is None]
          for index_free in indexes_free:
              range_index = range(*(range_binning[index_free]))
              new_tmp = []
              for t in tmp:
                  for expanded in expand_bin_index(t, index_free, range_index):
                      new_tmp.append(expanded)
              tmp = new_tmp
          return tmp
      
      inputs = ([None, 1, 2], [None, None, 3], [None, 1, None], [3, 2, 1], [None, None, None])
      for i in inputs:
          print("%s -> %s" % (i, expand_bin(i)))
      

      【Comments】:
