Iterating nested sublists dynamically: answers

[Question title]: Iterating nested sublists dynamically
[Posted]: 2014-06-08 17:24:11
[Question description]:

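I am trying to create a run list that can be used to process data in chunks, because a separate function has to be run once processing of the first batch is complete. I have that part down. The problem I am having is creating the run list.
I have a list of areas to process. If one of the areas overlaps an area that has already been tested, it needs to be moved to the next sublist. That sublist is then tested in the same way. The iteration stops when the last sublist has a length of 1.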

Example

Initial list: [1,2,3,4,5,6] (3 & 4 overlap 2; 5 & 6 overlap 3 & 4)
Test #1 output: [[1,2],[3,4,5,6]]
Test #2 output: [[1,2],[3,4],[5,6]]

Current code

def containTest(a, b):
    return not set(a).isdisjoint(b)

def getValueList(inputTable, field):
    valueSet = set()  # set to hold unique values
    # use data access search cursor combined with, 'with'
    with arcpy.da.SearchCursor(inputTable, field) as values:
        # iterate through all values returned by Search Cursor
        for value in values:
            # Add value to set. If the value is not present,
            # it will be added. If it is present, the set will not
            # allow duplicates.
            valueSet.add(value[0])
    # sort and return list of values
    return sorted(valueSet)


import sys
import os
import traceback

# ArcGIS initialization
from arcpy.sa import *  # for ArcPy map algebra
import arcpy
arcpy.SetProduct("ArcInfo")
arcpy.CheckOutExtension("Spatial")  # spatial analyst license

# Geoprocessing error exception
from arcpy import ExecuteError as GPError

# ENV setup
arcpy.env.overwriteOutput = True

try:  
    # <get arguments from the command line here>
    vpupth = r'C:\TmpSS\NHDPlus06'
    workp = vpupth + os.sep + "RemoveSinks"  # (type Workspace) Path where results written into folders for each sink
    sinktbl = r"C:\TmpSS\NHDPlus06\RemoveSinks\sinks_00.shp"  # list of areas to process, integers are in the "GridCode" field

    # vrbls to identify first times through loops
    firsttime = True

    # Copy sinktbl to in_memory
    arcpy.CopyFeatures_management(sinktbl, "in_memory\sinktbl")
    sinktbl = "in_memory\sinktbl"

    # Make Layers of needed feature classes
    arcpy.MakeFeatureLayer_management(vsink, "in_memory\lyrSink")
    arcpy.MakeFeatureLayer_management(vcatch, "in_memory\lyrCatch")

    runlist = [[], []]
    chkdupes = set()

    sinknum = 1
    with arcpy.da.SearchCursor(sinktbl, ("GridCode")) as sinkrows:
        for sinkrow in sinkrows:
            sinkid = sinkrow[0]

            print "Processing Sink " + str(sinknum) + " GridCode = " + str(sinkid)

            # Select the sink specified in the table row
            arcpy.SelectLayerByAttribute_management("in_memory\lyrSink", "NEW_SELECTION", "GridCode = %s" % sinkid)
            # Write out to shapefile sinkprj
            arcpy.CopyFeatures_management("in_memory\lyrSink", "in_memory\sinkprj")
            # Select the catchment the sink falls within
            arcpy.SelectLayerByLocation_management("in_memory\lyrCatch", "INTERSECT", "in_memory\lyrSink", "", "NEW_SELECTION")
            # Select the catchments surrounding the catchment containing sink
            arcpy.SelectLayerByLocation_management("in_memory\lyrCatch", "BOUNDARY_TOUCHES", "in_memory\lyrCatch", "", "NEW_SELECTION")
            # Write inner catchments out to shapefile catchi
            arcpy.CopyFeatures_management("in_memory\lyrCatch", "in_memory\catchi")

            # Select another tier of catchments surrounding the selected catchments (second tier out from sink)
            arcpy.SelectLayerByLocation_management("in_memory\lyrCatch", "BOUNDARY_TOUCHES", "in_memory\lyrCatch", "", "NEW_SELECTION")
            # Write outer catchments out to shapefile catcho
            arcpy.CopyFeatures_management("in_memory\lyrCatch", "in_memory\catcho")    
            catchlist = set(getValueList("in_memory\catcho", "GridCode"))

            i = 0
            if firsttime is True:
                chkdupes |= catchlist
                firsttime = False
                runlist[i].append(sinkid)
                print "First time copy complete"
                i += 1
            else:
                if containTest(chkdupes, catchlist):
                    print "catchlist exists, moving sink to next iteration"
                    runlist[i + 1].append(sinkid)
                else:
                    print "sink does not exist in catchment list"
                    chkdupes |= catchlist
                    print "added to round one"
                    runlist[i].append(sinkid)
                i += 1

            del catchlist
            arcpy.Delete_management("in_memory\catcho")
            arcpy.Delete_management("in_memory\catchi")
            sinknum += 1

        print runlist

except arcpy.ExecuteError:
    # Get the tool error messages
    msgs = arcpy.GetMessages(2)
    # Return tool error messages for use with a script tool
    arcpy.AddError(msgs)
    # Print tool error messages for use in Python/PythonWin
    print msgs

except:
    # Get the traceback object
    tb = sys.exc_info()[2]
    tbinfo = traceback.format_tb(tb)[0]
    # Concatenate information together concerning the error into a message string
    pymsg = "PYTHON ERRORS:\nTraceback info:\n" + tbinfo + "\nError Info:\n" + str(sys.exc_info()[1])
    msgs = "ArcPy ERRORS:\n" + arcpy.GetMessages(2) + "\n"

    # Return python error messages for use in script tool or Python Window
    arcpy.AddError(pymsg)
    arcpy.AddError(msgs)

    # Print Python error messages for use in Python / Python Window
    print pymsg + "\n"
    print msgs
finally:
  # Clean up here (delete cursors, temp files)
  pass

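Thanks for looking; any help is greatly appreciated. I feel there is a more advanced way of performing this task, but my lack of knowledge is keeping me from seeing the solution.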

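[Question comments]:

I think a few things would help you get good answers to this and future questions: 1) Define the problem more clearly. For example, what is a "processing area"? How do you define "overlap" (I don't see any obvious overlap in your example)? 2) Define a simple input and the output you hope to get from it. What are the logical steps between the input and the output you show? 3) Minimize the code you share so that everyone sees only the code directly related to your problem. A lot of unrelated code discourages people from even trying.

Tags: python list python-2.7 dynamic arcpy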

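[Solution 1]:

The question is a bit hard to understand, but since I had the same problem, I actually know what you want. You want to sort a number of elements into boxes so that elements in higher-numbered boxes depend only on elements in lower-numbered boxes, using as few boxes as possible. Your application is GIS, most likely the drawing of overlapping image areas.

The solution is not very difficult. My input data structure is a list of entries. Each entry has an element id and the dependencies of that element (the ids of the elements it depends on). The output is a list of groups of element ids, where each group describes one box of elements that can be processed/drawn at the same time.

The solution is iterative: take every element that has no dependencies left, remove it from the list of available elements, and put all of these elements into a new box. Then reduce the dependencies of the remaining elements by removing every dependency on an element that was just put into the box. Python's set difference operation, together with list comprehensions, is very handy here.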
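To make a single iteration concrete, here is a minimal sketch of one reduction step on its own. The `entries` data is made up for illustration and is not taken from the question.

```python
# Hypothetical entries: (element id, set of ids that element depends on).
entries = [(1, set()), (2, set()), (3, {2}), (4, {2}), (5, {3, 4})]

# Elements with no remaining dependencies form the next box.
box = [elem for elem, deps in entries if not deps]

# Drop the boxed elements, then subtract them from the dependency sets
# of everything that is still waiting.
boxed = set(box)
entries = [(elem, deps.difference(boxed)) for elem, deps in entries if deps]

print(box)  # [1, 2]
```

After this step, elements 3 and 4 have empty dependency sets, so they would form the next box, while 5 still waits on both of them.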

The code example is:

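def order_by_dependency(entries):
    # start with empty output
    output = []

    # loop until every element has been placed in a box
    while entries:

        # find entries without dependencies; they form the next box
        ready = [i for (i, j) in entries if len(j) == 0]
        if not ready:
            # cyclic or unknown dependencies would otherwise loop forever
            raise ValueError("unresolvable dependencies: %r" % (entries,))
        output.append(ready)

        # reduce entries by deleting all elements without dependencies and removing dependencies
        former_dependencies = set(ready)
        entries = [(i, j - former_dependencies) for (i, j) in entries if len(j) > 0]

    return output

# each entry is (element id, set of ids it depends on)
entries = [(1, set()), (2, set()), (3, set([2])), (4, set([2])), (5, set([3, 4])), (6, set([3, 4]))]
print(order_by_dependency(entries))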

The output is as expected:

[[1, 2], [3, 4], [5, 6]]
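The answer assumes the dependencies are already known, whereas the question only has a set of catchment ids per sink (`catchlist`). One possible way to bridge the two is sketched below. It assumes that an area depends on every earlier area whose footprint it shares at least one id with. The names `build_dependencies` and `layer_by_dependency` and the footprint values are hypothetical, and no arcpy calls are involved.

```python
def build_dependencies(footprints):
    """footprints: list of (area_id, set of cell ids) in processing order.

    Returns a list of (area_id, set of earlier area ids it overlaps).
    """
    entries = []
    for index, (area_id, cells) in enumerate(footprints):
        deps = set(
            earlier_id
            for earlier_id, earlier_cells in footprints[:index]
            if not cells.isdisjoint(earlier_cells)
        )
        entries.append((area_id, deps))
    return entries


def layer_by_dependency(entries):
    """Group ids into layers so each layer depends only on earlier layers."""
    layers = []
    remaining = list(entries)
    while remaining:
        ready = [area_id for area_id, deps in remaining if not deps]
        if not ready:
            raise ValueError("circular or unresolved dependency")
        layers.append(ready)
        done = set(ready)
        remaining = [(a, deps - done) for a, deps in remaining if deps]
    return layers


# Made-up footprints chosen so that 3 and 4 overlap 2, and 5 and 6 overlap 3 and 4.
footprints = [
    (1, {100}),
    (2, {201, 202}),
    (3, {202, 301, 302}),
    (4, {201, 401, 402}),
    (5, {301, 401}),
    (6, {302, 402}),
]

run_list = layer_by_dependency(build_dependencies(footprints))
print(run_list)  # [[1, 2], [3, 4], [5, 6]]
```

In the question's script, the footprints would presumably be collected inside the cursor loop (one `catchlist` per `sinkid`) and the run list built once after the loop finishes, rather than being decided sink by sink.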

[Comments]: