Finding size of max independent set in binary tree - why faulty "solution" doesn't work?

【Question title】: Finding size of max independent set in binary tree - why faulty "solution" doesn't work?
【Posted】: 2012-05-16 04:44:22
【Question】:

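Here is a link to a similar question with a good answer: Java Algorithm for finding the largest set of independent nodes in a binary tree.

I came up with a different answer, but my professor says it won't work and I'd like to know why (he doesn't answer email).

The question: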

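Given an array A with n integers, indexed from 0 (i.e., A[0], A[1], ..., A[n-1]). We can interpret A as a binary tree in which the two children of A[i] are A[2i+1] and A[2i+2], and the value of each element is the weight of that node of the tree. In this tree, we say that a set of vertices is "independent" if it does not contain any parent-child pair. The weight of an independent set is the sum of the weights of its elements. Develop an algorithm to calculate the maximum weight of any independent set.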
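To make the array layout concrete, here is a minimal sketch of the index arithmetic. The helper names `children`, `parent`, and `level` are illustrative and not part of the problem statement, and counting the root as level 0 is an assumption.

```python
def children(i, n):
    """Indices of the children of node i that exist in an array of length n."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < n]


def parent(i):
    """Index of the parent of node i, or None for the root."""
    return (i - 1) // 2 if i > 0 else None


def level(i):
    """Depth of node i, counting the root as level 0 (an assumption)."""
    return (i + 1).bit_length() - 1
```

For an array of length 6, node 2 has only one child (index 5), because index 6 falls outside the array.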

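The answer I came up with used the following two assumptions about independent sets in a binary tree:

All nodes on the same level are independent of each other.
All nodes on alternating levels are independent of each other (no parent/child relations).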

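Warning: I came up with this during my exam, and it isn't pretty, but I just want to see if I can argue for at least partial credit.

So, why can't you just build two independent sets (one for odd levels, one for even levels)?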

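If any of the weights in a set are non-negative, sum them (discarding the negative elements, because they would not contribute to a set of largest weight) to find the independent set with the largest weight.

If the weights in the set are all negative (or equal to 0), sort them and return the negative number closest to 0 as the weight.

Compare the weights of the largest independent set from each of the two sets and return the larger one as the final solution.

My professor claims it doesn't work, but I don't see why. Why doesn't it work?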
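The procedure described above can be sketched roughly as follows. This is one reading of the description, not code from the exam: the function name `parity_heuristic` is hypothetical, the root is assumed to be level 0, and returning 0 for an empty array is an added assumption.

```python
def parity_heuristic(A):
    """Best of 'even levels only' versus 'odd levels only', as described above."""
    if not A:
        return 0  # assumption: an empty tree has weight 0

    groups = {0: [], 1: []}
    for i, w in enumerate(A):
        depth = (i + 1).bit_length() - 1  # root is at depth 0
        groups[depth % 2].append(w)

    def best_of(weights):
        positives = [w for w in weights if w > 0]
        if positives:
            return sum(positives)  # drop the non-positive weights
        return max(weights)        # all weights <= 0: take the one closest to 0

    # Skip a parity class that has no nodes (e.g. odd levels of a one-node tree).
    return max(best_of(ws) for ws in groups.values() if ws)


print(parity_heuristic([1, 2, 3, 4, 5, 6]))    # even levels: 1 + 4 + 5 + 6
print(parity_heuristic([1, 200, 3, 4, 5, 6]))  # odd levels: 200 + 3
```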

【Comments】:

Tags: algorithm binary-tree

【Answer 1】:

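Your algorithm doesn't work because the set of nodes it returns will be either all from odd levels or all from even levels. But the optimal solution can have nodes from both.

For example, consider a tree where all weights are 0 except for two nodes which have weight 1. One of these nodes is at level 1 and the other is at level 4. They are three levels apart, so they are not parent and child, and the optimal solution will contain both of them and have weight 2. But your algorithm will only give one of these nodes and have weight 1.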
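This counterexample can be checked with a short sketch. The names `parity_best` and `true_best` are made up for illustration, the root is assumed to be level 0, and `parity_best` is a simplified version of the level-parity idea (it just sums the positive weights per parity class). The reference value comes from a standard include/exclude pass over the array, not from the asker's method.

```python
def parity_best(A):
    """Best total using only even levels or only odd levels (root = level 0)."""
    totals = [0, 0]
    for i, w in enumerate(A):
        depth = (i + 1).bit_length() - 1
        totals[depth % 2] += max(0, w)
    return max(totals)


def true_best(A):
    """Maximum-weight independent set via a bottom-up include/exclude pass."""
    n = len(A)
    incl = [0] * n  # best weight in the subtree of i with i chosen
    excl = [0] * n  # best weight in the subtree of i with i not chosen
    for i in range(n - 1, -1, -1):
        kids = [c for c in (2 * i + 1, 2 * i + 2) if c < n]
        incl[i] = A[i] + sum(excl[c] for c in kids)
        excl[i] = sum(max(incl[c], excl[c]) for c in kids)
    return max(incl[0], excl[0]) if n else 0


A = [0] * 16
A[1] = 1   # a node on level 1
A[15] = 1  # a node on level 4 (level 4 holds indices 15..30)
print(parity_best(A), true_best(A))  # prints: 1 2
```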

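【Comments】:

【Answer 2】:

Interjay has noted why your answer is incorrect. The problem can be solved with a recursive algorithm find-max-independent which, given a binary tree, considers two cases: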

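1. What is the max independent set given that the root node is included?
2. What is the max independent set given that the root node is not included?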

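In case 1, since the root node is included, none of its children can be. Thus we sum the values of find-max-independent of the grandchildren of root, add the value of root (which must be included), and return that.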

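In case 2, the root is excluded, so the subtrees of its children can be solved independently of each other: we return the sum of find-max-independent over the child nodes, if there are any.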
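Written as a recurrence that matches the Python code in this answer (the notation is my own shorthand: best(v) is the weight of the best independent set in the subtree rooted at v, C(v) is the set of children of v, G(v) is the set of grandchildren, and the empty set with weight 0 is allowed):

```latex
\[
\mathrm{best}(v) = \max\Bigl(
  \underbrace{\max(0, A[v]) + \sum_{g \in G(v)} \mathrm{best}(g)}_{\text{root included}},\;
  \underbrace{\sum_{c \in C(v)} \mathrm{best}(c)}_{\text{root not included}}
\Bigr)
\]
```

For a leaf both sums are empty, so best(v) = max(0, A[v]).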

The algorithm may look something like this (in python):

def find_max_independent(A):
    N = len(A)

    def children(i):
        # indices of the children of node i that actually exist in A
        for n in (2*i + 1, 2*i + 2):
            if n < N:
                yield n

    def gchildren(i):
        # indices of the grandchildren of node i
        for child in children(i):
            for gchild in children(child):
                yield gchild

    memo = [None] * N

    def rec(root):
        "finds the max independent set weight in the subtree rooted at root. memoizes results"

        assert root < N

        if memo[root] is not None:
            return memo[root]

        # option 'root not included': sum of the best values of the children's subtrees
        without_root = sum(rec(child) for child in children(root))

        # option 'root included': possibly pick the root
        # and the sum of the max value for the grandchildren
        with_root = max(0, A[root]) + sum(rec(gchild) for gchild in gchildren(root))

        val = max(with_root, without_root)
        assert val >= 0
        memo[root] = val

        return val

    return rec(0) if N > 0 else 0

Some test cases, illustrated:

tests=[
    [[1,2,3,4,5,6], 16], #1
    [[-100,2,3,4,5,6], 15], #2
    [[1,200,3,4,5,6], 206], #3
    [[1,2,3,-4,5,-6], 8], #4
    [[], 0],
    [[-1], 0],
]

for A, expected in tests:
    actual=find_max_independent(A)
    print("test: {}, expected: {}, actual: {} ({})".format(A, expected, actual, expected==actual))

Sample output:

test: [1, 2, 3, 4, 5, 6], expected: 16, actual: 16 (True)
test: [-100, 2, 3, 4, 5, 6], expected: 15, actual: 15 (True)
test: [1, 200, 3, 4, 5, 6], expected: 206, actual: 206 (True)
test: [1, 2, 3, -4, 5, -6], expected: 8, actual: 8 (True)
test: [], expected: 0, actual: 0 (True)
test: [-1], expected: 0, actual: 0 (True)
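As an independent sanity check on these expected values, a brute-force sketch can enumerate every subset of a small array and keep the heaviest one that contains no parent-child pair. The function name `brute_force` is made up for illustration, and the approach is exponential, so it only suits tiny inputs like the test cases here.

```python
def brute_force(A):
    """Try every subset of nodes; only practical for very small arrays."""
    n = len(A)
    best = 0  # the empty set is allowed and has weight 0
    for mask in range(1 << n):
        chosen = [i for i in range(n) if (mask >> i) & 1]
        picked = set(chosen)
        # The parent of node i (for i > 0) is node (i - 1) // 2.
        if any(i > 0 and (i - 1) // 2 in picked for i in chosen):
            continue  # contains a parent-child pair, so not independent
        best = max(best, sum(A[i] for i in chosen))
    return best


for A in ([1, 2, 3, 4, 5, 6], [-100, 2, 3, 4, 5, 6],
          [1, 200, 3, 4, 5, 6], [1, 2, 3, -4, 5, -6], [], [-1]):
    print(A, brute_force(A))
```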

[Figures: tree diagrams for test cases 1 to 4]

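The complexity of the memoized algorithm is O(n): the body of rec runs once per node, and each node is requested only by its parent and its grandparent, so the total number of calls is linear in n. This is a top-down dynamic programming solution that uses depth-first search.

(Test case illustrations courtesy of leetcode's interactive binary tree editor)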
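One way to check the linear bound empirically is to count the calls. The sketch below reimplements the memoized recursion with a counter; `count_rec_calls` is a hypothetical helper and the all-ones input is an arbitrary choice, since the call pattern depends only on the array length.

```python
def count_rec_calls(n):
    """Count calls to rec() on an array of length n, with memoization."""
    A = [1] * n
    memo = [None] * n
    calls = 0

    def kids(i):
        return [c for c in (2 * i + 1, 2 * i + 2) if c < n]

    def rec(root):
        nonlocal calls
        calls += 1
        if memo[root] is not None:
            return memo[root]
        without_root = sum(rec(c) for c in kids(root))
        with_root = max(0, A[root]) + sum(
            rec(g) for c in kids(root) for g in kids(c))
        memo[root] = max(with_root, without_root)
        return memo[root]

    rec(0)
    return calls


print(count_rec_calls(6))      # prints: 9
print(count_rec_calls(10000))  # prints: 19997
```

Each node is called once by its parent and once by its grandparent, plus the initial call on the root, which gives 2n - 3 calls for n >= 3.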

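【Comments】:

This isn't O(n), because nodes are visited multiple times. For example, with 10,000 nodes there are 2,062,787 visits.
@StefanPochmann You're right. Without memoization the solution had several problems, which I fixed almost 4 years later.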