【Question Title】: Julia - Iterating over combinations of keys in a dictionary
【Posted】: 2017-04-24 19:33:46
【Question】:

Is there a nice way to iterate over combinations of keys in a dictionary?

My dictionary has entries like:

[1] => [1,2], [2,3] => [15], [3] => [6,7,8], [4,9,11] => [3], ... 

What I need to do is get all key combinations of length 1:n, where n might be, for example, 3.

So, for the example above, I want to iterate over

[[1], [3], [2,3], [[1],[1,2]], [[3],[2,3]], [4,9,11]]

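I know I could just collect the keys, but my dictionary is fairly large, and I am redesigning the whole algorithm because it starts swapping like crazy when n > 3, which greatly reduces efficiency.

tl;dr: is there a way to create a combinations iterator from a dictionary without collect-ing the dictionary?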

【Comments】:

    Tags: dictionary iterator julia


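    【Solution 1】:

    The following is a simple implementation that tries to minimize the number of passes over the dictionary. It also uses an OrderedDict, so that keeping key indices makes sense (a Dict does not promise the same key iteration order every time, so key indices into it are not guaranteed to be meaningful).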

    using Iterators                  # Julia 0.5-era package; its successor IterTools.jl also provides subsets
    using DataStructures
    
    od = OrderedDict([1] => [1,2], [2,3] => [15], [3] => [6,7,8], [4,9,11] => [3])
    
    sv = map(length,keys(od))        # store length of keys for quicker calculations
    maxmaxlen = sum(sv)              # maximum total elements in good key
    for maxlen=1:maxmaxlen           # replace maxmaxlen with lower value if too slow
      @show maxlen
      gsets = Vector{Vector{Int}}()  # hold good sets of key _indices_
      for curlen=1:maxlen
        foreach(x->push!(gsets,x),
         (x for x in subsets(collect(1:length(od)),curlen) if sum(sv[x])==maxlen))
      end
      # indmatrix is necessary to run through keys once in next loop
      indmatrix = zeros(Bool,length(od),length(gsets))
      for i=1:length(gsets)
        for e in gsets[i]
          indmatrix[e,i] = true
        end
      end
      # gkeys is the vector of vectors of keys, i.e. what we wanted to calculate
      gkeys = [Vector{Vector{Int}}() for i=1:length(gsets)]
      for (i,k) in enumerate(keys(od))
        for j=1:length(gsets)
          if indmatrix[i,j]
            push!(gkeys[j],k)
          end
        end
      end
      # do something with each set of good keys
      foreach(x->println(x),gkeys)
    end
    

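    Is this more efficient than what you currently have? It would be better to put the code in a function, or to turn it into a Julia task that produces the next key set on each iteration.

    --- Update ---

    Using the answer about task iterators at https://stackoverflow.com/a/41074729/3580870

    an improved iterator version is: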

    function keysubsets(n,d)
      Task() do
        od = OrderedDict(d)
        sv = map(length,keys(od))        # store length of keys for quicker calculations
        maxmaxlen = sum(sv)              # maximum total elements in good key
        for maxlen=1:min(n,maxmaxlen)    # replace maxmaxlen with lower value if too slow
          gsets = Vector{Vector{Int}}()  # hold good sets of key _indices_
          for curlen=1:maxlen
            # subsets run over key indices (1:length(od)); n only caps the total key length
            foreach(x->push!(gsets,x),(x for x in subsets(collect(1:length(od)),curlen) if sum(sv[x])==maxlen))
          end
          # indmatrix is necessary to run through keys once in next loop
          indmatrix = zeros(Bool,length(od),length(gsets))
          for i=1:length(gsets)
            for e in gsets[i]
              indmatrix[e,i] = true
            end
          end
          # gkeys is the vector of vectors of keys, i.e. what we wanted to calculate
          gkeys = [Vector{Vector{Int}}() for i=1:length(gsets)]
          for (i,k) in enumerate(keys(od))
            for j=1:length(gsets)
              if indmatrix[i,j]
                push!(gkeys[j],k)
              end
            end
          end
          # do something with each set of good keys
          foreach(x->produce(x),gkeys)
        end
      end
    end
    

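    This makes it possible to iterate over all key subsets with a combined size of up to 4 like this (after running the code from the other StackOverflow answer):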

    julia> nt2 = NewTask(keysubsets(4,od))
    julia> collect(nt2)
    10-element Array{Array{Array{Int64,1},1},1}:
     Array{Int64,1}[[1]]          
     Array{Int64,1}[[3]]          
     Array{Int64,1}[[2,3]]        
     Array{Int64,1}[[1],[3]]      
     Array{Int64,1}[[4,9,11]]     
     Array{Int64,1}[[1],[2,3]]    
     Array{Int64,1}[[2,3],[3]]    
     Array{Int64,1}[[1],[4,9,11]] 
     Array{Int64,1}[[3],[4,9,11]] 
     Array{Int64,1}[[1],[2,3],[3]]
    

    (NewTask needs to be defined as in the linked StackOverflow answer.)
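
The `Task`/`produce` interface used above was removed in Julia 1.0, where a `Channel` plays the same role. A minimal sketch of the same idea for Julia 1.3 or later, using only Base, might look like the following. It is not the answer's original code: the depth-first search and the helper names (`search`, `lens`, `budget`) are assumptions made for illustration, and it keeps a vector of references to the keys (not to the combinations) so that keys can be looked up by index.

```julia
# Sketch only: lazily yields every set of keys whose key lengths sum to 1, 2, ..., n.
# Assumes the keys of `d` are vectors (anything supporting `length` works).
function keysubsets(n::Integer, d::AbstractDict)
    ks = collect(keys(d))    # references to the keys, in the dictionary's iteration order
    lens = map(length, ks)   # number of elements in each key
    Channel{Vector{eltype(ks)}}() do ch
        # Depth-first search over index combinations containing exactly `k` keys.
        # `chosen` holds the indices picked so far; `budget` is the element count still to fill.
        function search(chosen, start, k, budget)
            if length(chosen) == k
                budget == 0 && put!(ch, ks[chosen])   # ks[chosen] is a fresh vector
                return
            end
            for i in start:length(ks)
                lens[i] > budget && continue          # prune: this key would overshoot
                push!(chosen, i)
                search(chosen, i + 1, k, budget - lens[i])
                pop!(chosen)
            end
        end
        # Same ordering as above: by total length first, then by number of keys.
        for total in 1:n, k in 1:total
            search(Int[], 1, k, total)
        end
    end
end
```

Iterating with `for s in keysubsets(3, d)` pulls one key set at a time, so the full list of combinations is never held in memory. With a plain `Dict`, the order of keys inside each set follows the dictionary's iteration order, which is unspecified; passing an `OrderedDict` from DataStructures.jl keeps insertion order.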

    【Comments】:
