【Question title】: Need to create a list of sets, from a list of sets whose members may be connected
【Posted】: 2011-09-02 21:49:04
【Question description】:

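I'm dealing with polygon data in real time here, but the problem is quite simple. I have a huge list containing thousands of sets of polygon indices (integers), and I need to reduce it as fast as possible to a list of sets of "connected" indices. That is, any sets that contain an integer which also appears in another set become one set in the result. I've read about several possible solutions involving sets, graphs and so on. All I'm after is the final list of sets, merged wherever they have any degree of commonality.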

I'm dealing with a lot of data here, but for simplicity's sake here is some sample data:

setA = set([0,1,2])
setB = set([6,7,8,9])
setC = set([4,5,6])
setD = set([3,4,5,0])
setE = set([10,11,12])
setF = set([11,13,14,15])
setG = set([16,17,18,19])

listOfSets = [setA,setB,setC,setD,setE,setF,setG]

In this case I want a list like the following as the result, although the ordering is irrelevant:

connectedFacesListOfSets = [ set([0,1,2,3,4,5,6,7,8,9]), set([10,11,12,13,14,15]), set( [16,17,18,19])]

I've looked at similar questions, but the highest-voted solution there gave wrong results on my large test data:

Merge lists that share common elements

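【Question discussion】:

  • lambacck's looks neat and works for me, so unless there's a faster solution I'll go with that answer. On my list of 4064 sets it takes under 0.01 seconds, so I'll be fine.
  • I'm curious how you got it done in 0.01 seconds with lambacck's method. I've been running it for the past 2 minutes and it still hasn't finished. What are the smallest and largest sets you start with? I'm using a randomly generated list of 4000 sets. Not that I'm knocking lambacck or anything (my answer would probably take longer), I'm just wondering what you used.
  • @Bryce Siedschlaw: What code did you use to create the sets? I'd be interested in trying it with more sets.
  • @lambacck I've added code to my answer to generate a random list of sets.
  • I ran each answer with timeit: the fastest was lambacck at 0.03 (11 function calls), followed by Thiago at 0.07 (50 calls) and Bryce at 0.1 (69 calls).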
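
For anyone who wants to repeat a comparison like the one in the last comment, a timing harness might look roughly like the sketch below. The function name `merge_sets`, the data sizes, and the number of runs are all assumptions made for illustration; `merge_sets` is only a stand-in for whichever answer is being measured, and the numbers it produces will not match the ones quoted above.

```python
import timeit
from random import sample


def merge_sets(list_of_sets):
    """Stand-in for the answer being timed (hypothetical, naive merge)."""
    result = []
    for s in list_of_sets:
        s = set(s)
        disjoint = []
        for r in result:
            if s & r:
                s |= r              # absorb every existing group that overlaps s
            else:
                disjoint.append(r)  # keep groups that share nothing with s
        disjoint.append(s)
        result = disjoint
    return result


# Random test data; the sizes here are assumptions, not the poster's data.
data = [set(sample(range(20000), sample(range(20), 1)[0])) for _ in range(400)]

# Average over a few runs, copying the input each time in case the
# function under test mutates its argument.
runs = 3
seconds = timeit.timeit(lambda: merge_sets([set(s) for s in data]), number=runs) / runs
print('%.4f seconds per call' % seconds)
```

Copying the input inside the timed call adds a little overhead, but it keeps every run working on identical data.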

Tags: python api maya


【Solution 1】:

It's hard to judge performance without a sufficiently large data set, but here is some basic code to work from:

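while True:
    merged_one = False
    # Start each pass with the first set as the only superset
    supersets = [listOfSets[0]]

    for s in listOfSets[1:]:
        in_super_set = False
        for ss in supersets:
            # A non-empty intersection means s shares a member with ss
            if s & ss:
                ss |= s
                merged_one = True
                in_super_set = True
                break

        if not in_super_set:
            supersets.append(s)

    print supersets
    # Stop once a full pass merges nothing
    if not merged_one:
        break

    listOfSets = supersets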

On the provided data this finishes in 3 iterations. The output is:

[set([0, 1, 2, 3, 4, 5]), set([4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]

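【Discussion】:

  • Looks promising on the small sample data, lambacck. I'll try it in my plugin tomorrow and see how it goes. Cheers.
  • Works great. Thank you. I'd upvote you if I had 15 rep.
【Solution 2】: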

This is a union-find problem.

I haven't used it myself, but this Python code looks good to me:

http://code.activestate.com/recipes/577225-union-find/
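
To make the union-find idea concrete, here is a minimal sketch. It is not the linked recipe, just one way the technique could be applied to this question; the function name `merge_connected` and the `owner` dictionary are names invented for this illustration.

```python
def merge_connected(list_of_sets):
    """Group sets that share any element, using union-find over set indices."""
    parent = list(range(len(list_of_sets)))

    def find(i):
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # element -> index of the first set seen containing it
    for i, s in enumerate(list_of_sets):
        for element in s:
            if element in owner:
                # The element was seen before: union the two groups.
                root_a, root_b = find(owner[element]), find(i)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                owner[element] = i

    groups = {}  # root index -> merged set
    for i, s in enumerate(list_of_sets):
        groups.setdefault(find(i), set()).update(s)
    return list(groups.values())


list_of_sets = [
    set([0, 1, 2]), set([6, 7, 8, 9]), set([4, 5, 6]), set([3, 4, 5, 0]),
    set([10, 11, 12]), set([11, 13, 14, 15]), set([16, 17, 18, 19]),
]
result = merge_connected(list_of_sets)
```

Each element is visited once and the input sets are left unmodified, so the work grows roughly with the total number of elements rather than with the number of pairs of sets.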

【Discussion】:

    【Solution 3】:

    Forgive the messy capitalization (autocorrect...):

    # the results container
    connected = set()

    sets = []  # placeholder: some list of sets

    # convert the sets to frozensets (which are hashable and can be added to sets themselves)
    frozen_sets = list(map(frozenset, sets))

    for s1 in frozen_sets:
        # start from s1 and absorb every set that overlaps it (single pass)
        res = s1
        for s2 in frozen_sets:
            if s1 & s2:
                res = res | s2
        connected.add(res)
    

    【Discussion】:

    • This algorithm makes only one pass and gives the wrong answer in many cases. For example, for the input [(1,2),(2,3),(3,4)] it incorrectly returns [(1,2,3),(1,2,3,4),(2,3,4)].
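
One way to address the problem raised in the comment above is to repeat the pass until the result stops changing. The sketch below is an illustration of that idea, not code from the answer; the function name `merge_until_stable` is an assumption.

```python
def merge_until_stable(sets):
    """Repeat the single-pass merge until a pass changes nothing."""
    current = set(map(frozenset, sets))
    while True:
        merged = set()
        for s1 in current:
            res = s1
            for s2 in current:
                if s1 & s2:
                    res = res | s2
            merged.add(res)
        # Once a pass reproduces its input, the remaining sets are disjoint.
        if merged == current:
            return merged
        current = merged


result = merge_until_stable([(1, 2), (2, 3), (3, 4)])
```

Every pass compares all pairs of sets, so this is likely to be slow on thousands of sets; it only shows how the single-pass version can be made correct.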
    【Solution 4】:

    So... I think I've got it. It's a mess, but I got it. Here's what I did:

    def connected_valid(li):
        for i, l in enumerate(li):
            for j, k in enumerate(li):
                if i != j and contains(l,k):
                    return False
        return True
    
    def contains(set1, set2):
        for s in set1:
            if s in set2:
                return True
        return False
    
    def combine(set1, set2):
        set2 |= set1
        return set2
    
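    def connect_sets(li):
        # Rotate through the list until no two sets overlap.
        # Caveat: s1 is only ever compared with the set right after it, so
        # overlapping sets that never become neighbors are never merged and the
        # loop does not terminate (e.g. four sets where only the 1st and 3rd overlap).
        while not connected_valid(li):
            s1 = li.pop(0)
            s2 = li[0]
            if contains(s1, s2):
                li[0] = combine(s1,s2)
            else:
                li.append(s1)
        return li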
    

    Then in your main function you would do something like this:

    setA = set([0,1,2])
    setB = set([6,7,8,9])
    setC = set([4,5,6])
    setD = set([3,4,5,0])
    setE = set([10,11,12])
    setF = set([11,13,14,15])
    setG = set([16,17,18,19])
    
    connected_sets = connect_sets([setA,setB,setC,setD,setE,setF,setG,])
    

    Running it gives the following output:

    print connected_sets
    [set([0,1,2,3,4,5,6,7,8,9]), set([10,11,12,13,14,15]), set([16,17,18,19])]
    

    Hope this is what you were looking for.

    EDIT: Added code to randomly generate the sets:

    from random import sample

    # Creates a list of 4000 sets, each holding 0 to 19 random values in the range 0 to 19999
    sets = []
    for x in range(4000):
        rand_num = sample(range(20),1)[0]
        tmp_set_li = sample(range(20000), rand_num)
        sets.append(set(tmp_set_li))
    

    The last 3 lines could be condensed into one if you really wanted to.
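
As an illustration of that remark, the three lines inside the loop might be folded together like this (a sketch that nests the same `sample` calls instead of using temporary variables):

```python
from random import sample

sets = []
for x in range(4000):
    # Draw a size in 0..19, then that many distinct values below 20000.
    sets.append(set(sample(range(20000), sample(range(20), 1)[0])))
```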

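    【Discussion】:

    • If you're after speed, you'll want to remove the function calls, since they are a notorious slowdown. Searching for non-connected sets on every iteration of the while loop also takes a long time; it's better to track that while iterating and keep a flag.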
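
A rough sketch of what that suggestion could look like follows. The helper functions are inlined and a flag records whether a pass merged anything, so no separate validation scan is needed. The name `connect_sets_inline` is an assumption, and the structure ends up close to Solution 1 rather than being a drop-in edit of the code above.

```python
def connect_sets_inline(list_of_sets):
    """Merge overlapping sets, using a flag instead of re-validating each pass."""
    pending = [set(s) for s in list_of_sets]  # copy so the inputs are not mutated
    merged_any = True
    while merged_any:
        merged_any = False
        result = []
        for s in pending:
            for r in result:
                if not s.isdisjoint(r):
                    r |= s             # fold s into the first group it touches
                    merged_any = True
                    break
            else:
                result.append(s)       # s overlaps nothing collected so far
        pending = result
    return pending


connected = connect_sets_inline([
    set([0, 1, 2]), set([6, 7, 8, 9]), set([4, 5, 6]), set([3, 4, 5, 0]),
    set([10, 11, 12]), set([11, 13, 14, 15]), set([16, 17, 18, 19]),
])
```

Because every set is compared against all groups collected so far, not just its neighbor, the loop ends as soon as a full pass merges nothing.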
    【Solution 5】:

    I tried something different: this algorithm loops once over each set and once over each element:

    # Our test sets
    setA = set([0,1,2])
    setB = set([6,7,8,9])
    setC = set([4,5,6])
    setD = set([3,4,5,0])
    setE = set([10,11,12])
    setF = set([11,13,14,15])
    setG = set([16,17,18,19])
    
    list_of_sets = [setA,setB,setC,setD,setE,setF,setG]
    
    # We will use a map to store our new merged sets.
    # This map works as a reference abstraction: it maps each set id
    # either to a set or to another set id.
    # This map may have an indirection level greater than 1
    merged_sets = {}
    
    # We will also use a map between indexes and set ids.
    index_to_id = {}
    
    # Given a set id, returns an equivalent set id that refers directly
    # to a set in the merged_sets map
    def resolve_id(id):
        if not isinstance(id, (int, long)):
            return None
        while isinstance(merged_sets[id], (int, long)):
            id = merged_sets[id]
        return id
    
    
    # Points the given set id to the destination id
    def link_id(id_source, id_destination):
        point_to = merged_sets[id_source]
        merged_sets[id_source] = id_destination
        if isinstance(point_to, (int, long)):
            link_id(point_to, id_destination)
    
    
    empty_set_found = False
    # For each set
    for current_set_id, current_set in enumerate(list_of_sets):
        if len(current_set) == 0 and empty_set_found:
            continue
        if len(current_set) == 0:
            empty_set_found = True
        # Create a set id for the set and place it on the merged sets map
        merged_sets[current_set_id] = current_set
        # For each index in the current set
        possibly_merged_current_set = current_set
        for index in current_set:
            # See if the index is free, i.e., has not been assigned to any set id
            if index not in index_to_id:
                # If it is free, then assign the set id to the index
                index_to_id[index] = current_set_id
                # ... and then go to the next index
            else:
                # If it is not free, then we may need to merge the sets
                # Find out to which set we need to merge the current one,
                # ... dereferencing if necessary
                id_to_merge = resolve_id(index_to_id[index])
                # First we check to see if the assignment is to the current set or not
                if id_to_merge == resolve_id(merged_sets[current_set_id]):
                    continue
                # Merge the current set to the one found
                print 'Merging %d with %d' % (current_set_id, id_to_merge)
                merged_sets[id_to_merge] |= possibly_merged_current_set
                possibly_merged_current_set = merged_sets[id_to_merge]
                # Map the current set id to the set id of the merged set
                link_id(current_set_id, id_to_merge)
    # Return all the sets in the merged sets map (ignore the references)
    print [x for x in merged_sets.itervalues() if not isinstance(x, (int, long))]
    

    This prints:

    Merging 2 with 1
    Merging 3 with 0
    Merging 3 with 1
    Merging 5 with 4
    [set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
    

    【Discussion】:
