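【Question title】: Fair partitioning of elements of a list
【Posted】: 2020-03-27 09:11:06
【Question】:

Given a list of player ratings, I need to split the players (i.e. the ratings) into two teams as fairly as possible. The goal is to minimize the difference between the teams' cumulative ratings. There are no constraints on how I split the players into the teams (one team can have 2 players and the other 10).

For example: [5, 6, 2, 10, 2, 3, 4] should return ([6, 5, 3, 2], [10, 4, 2])

I would like to know the algorithm to solve this problem. Please note that I am taking an online introductory programming course, so simple algorithms would be appreciated.

I am using the following code, but for some reason the online code checker says it is incorrect.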

def partition(ratings):
    set1 = []
    set2 =[]
    sum_1 = 0
    sum_2 = 0
    for n in sorted(ratings, reverse=True):
        if sum_1 < sum_2:
            set1.append(n)
            sum_1 = sum_1 + n
        else:
            set2.append(n)
            sum_2 = sum_2 + n
    return(set1, set2)

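Update: I contacted the instructor and was told that I should define another "helper" function inside the function to check all the different combinations, and then check for the minimum difference.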

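【Comments】:

  • Google "subset sum problem"
  • @JohnColeman Thanks for the suggestion. Could you guide me on how to use subset sum to solve my problem?
  • More specifically, you have a special case of the subset sum problem called the partition problem. The Wikipedia article about it discusses algorithms.
  • Does this answer your question? Divide list into two equal parts algorithm
  • Thank you all! I sincerely appreciate your help!

Tags: python algorithm list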


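【Solution 1】:

Note: edited to better handle the case where the sum of all the numbers is odd.

Backtracking is one possibility for this problem.

It allows all the possibilities to be examined recursively, without needing a large amount of memory.

It stops as soon as an optimal solution is found: sum = 0, where sum is the difference between the sum of the elements of set A and the sum of the elements of set B. EDIT: it stops as soon as that difference is less than 2, to handle the case where the sum of all the numbers is odd, i.e. corresponding to a minimum difference of 1. If this global sum is even, the minimum difference cannot be equal to 1.

It allows a simple early-abandon procedure to be implemented:
At a given time, if the absolute value of sum is greater than the sum of all the remaining elements (i.e. those not yet placed in A or B) plus the current minimum obtained, then we can give up on the current path without examining the remaining elements. This procedure is optimized by:

  • sorting the input data in decreasing order
  • at each step, examining the most probable choice first: this allows a near-optimal solution to be found quickly

Here is the pseudocode.

Initialization:

  • sort the elements a[]
  • compute the sum of the remaining elements: sum_back[i] = sum_back[i+1] + a[i];
  • set the minimum "difference" to its maximum value: min_diff = sum_back[0];
  • put a[0] in A -> the index i of the examined element is set to 1
  • set up_down = true;: this boolean indicates whether we are currently going forward (true) or backward (false)

While loop:

  • If (up_down): forward

    • test for premature abandon, with the help of sum_back
    • select the most probable value and adjust sum according to this choice
    • if (i == n-1): LEAF -> test whether the optimum is improved and return if the new value is equal to 0 (EDIT: if (... < 2)); go backward
    • if not at a leaf: continue going forward
  • If (!up_down): backward

    • if we arrive at i == 0: return
    • if it is the second pass through this node: select the second value, go up
    • else: go down
    • in both cases: recalculate the new sum

Here is the code, written in C++ (sorry, I don't know Python)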

#include    <iostream>
#include    <vector>
#include    <algorithm>
#include    <tuple>
#include    <cstdlib>     // std::abs for int
#include    <functional>  // std::greater

std::tuple<int, std::vector<int>> partition(std::vector<int> &a) {
    int n = a.size();
    std::vector<int> parti (n, -1);     // current partition studies
    std::vector<int> parti_opt (n, 0);  // optimal partition
    std::vector<int> sum_back (n, 0);   // sum of remaining elements
    std::vector<int> n_poss (n, 0);     // number of possibilities already examined at position i

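    // Sort first, as in the pseudocode, so that sum_back[i] is exactly the sum
    // of the elements not yet placed (a tighter bound for the early abandon).
    std::sort(a.begin(), a.end(), std::greater<int>());

    sum_back[n-1] = a[n-1];
    for (int i = n-2; i >= 0; --i) {
        sum_back[i] = sum_back[i+1] + a[i];
    }
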
    parti[0] = 0;       // a[0] in A always !
    int sum = a[0];     // current sum

    int i = 1;          // index of the element being examined (we force a[0] to be in A !)
    int min_diff = sum_back[0];
    bool up_down = true;

    while (true) {          // UP
        if (up_down) {
            if (std::abs(sum) > sum_back[i] + min_diff) {  //premature abandon
                i--;
                up_down = false;
                continue;
            }
            n_poss[i] = 1;
            if (sum > 0) {
                sum -= a[i];
                parti[i] = 1;
            } else {
                sum += a[i];
                parti[i] = 0;
            }

            if (i == (n-1)) {           // leaf
                if (std::abs(sum) < min_diff) {
                    min_diff = std::abs(sum);
                    parti_opt = parti;
                    if (min_diff < 2) return std::make_tuple (min_diff, parti_opt);   // EDIT: if (... < 2) instead of (... == 0)
                }
                up_down = false;
                i--;
            } else {
                i++;        
            }

        } else {            // DOWN
            if (i == 0) break;
            if (n_poss[i] == 2) {
                if (parti[i]) sum += a[i];
                else sum -= a[i];
                //parti[i] = 0;
                i--;
            } else {
                n_poss[i] = 2;
                parti[i] = 1 - parti[i];
                if (parti[i]) sum -= 2*a[i];
                else sum += 2*a[i];
                i++;
                up_down = true;
            }
        }
    }
    return std::make_tuple (min_diff, parti_opt);
}

int main () {
    std::vector<int> a = {5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42};
    int diff;
    std::vector<int> parti;
    std::tie (diff, parti) = partition (a);

    std::cout << "Difference = " << diff << "\n";

    std::cout << "set A: ";
    for (int i = 0; i < a.size(); ++i) {
        if (parti[i] == 0) std::cout << a[i] << " ";
    }
    std::cout << "\n";

    std::cout << "set B: ";
    for (int i = 0; i < a.size(); ++i) {
        if (parti[i] == 1) std::cout << a[i] << " ";
    }
    std::cout << "\n";
}
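
Since the question is about Python, the same idea can be sketched there. This is not a line-by-line translation of the C++ above: it is a recursive re-expression of the backtracking with early abandon, assuming non-negative integer ratings. The names `partition_backtrack`, `remaining`, and `search` are made up for this illustration.

```python
def partition_backtrack(ratings):
    """Return (min_diff, team_a, team_b) via depth-first search with pruning.

    Sketch only: assumes non-negative integer ratings.
    """
    a = sorted(ratings, reverse=True)       # big values first, as in the answer
    n = len(a)
    # remaining[i] = a[i] + ... + a[n-1]; plays the role of sum_back
    remaining = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        remaining[i] = remaining[i + 1] + a[i]

    lowest_possible = remaining[0] % 2      # 0 if the total is even, else 1
    best = {"diff": remaining[0] + 1, "side": [0] * n}
    side = [0] * n                          # 0 -> team A, 1 -> team B

    def search(i, diff):
        # diff = sum(A) - sum(B) over the elements placed so far
        if best["diff"] == lowest_possible:
            return                          # already optimal, stop everywhere
        if abs(diff) - remaining[i] >= best["diff"]:
            return                          # early abandon: cannot beat best
        if i == n:                          # leaf: strictly better than best
            best["diff"] = abs(diff)
            best["side"] = side[:]
            return
        likely = 1 if diff > 0 else 0       # try the lighter team first
        for s in (likely, 1 - likely):
            side[i] = s
            search(i + 1, diff - a[i] if s else diff + a[i])

    search(0, 0)
    team_a = [x for x, s in zip(a, best["side"]) if s == 0]
    team_b = [x for x, s in zip(a, best["side"]) if s == 1]
    return best["diff"], team_a, team_b


print(partition_backtrack([5, 6, 2, 10, 2, 3, 4]))
```

Unlike the C++ version, this sketch does not force a[0] into team A, so it may visit mirrored splits; that costs time but not correctness.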

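【Comments】:

  • The only issue here is that the optimal sum is not always 0. Thank you for explaining it so well, because I can't read C++.
  • If the optimal sum is not equal to 0, the code looks at all the possibilities, memorizing the best solution. The paths that are not examined are those we are sure are not optimal. This corresponds to the return if (i == 0). I tested it by replacing 10 with 11 in your example.
【Solution 2】:

I think you should do the next exercise yourself, otherwise you won't learn much. As for this one, here is a solution that tries to implement your instructor's suggestion: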

def partition(ratings):

    def split(lst, bits):
        ret = ([], [])
        for i, item in enumerate(lst):
            ret[(bits >> i) & 1].append(item)
        return ret

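    target = sum(ratings) // 2
    # Start from infinity so that the first split examined is always recorded;
    # starting from `target` could return two empty lists for a one-item input.
    best_distance = float("inf")
    best_split = ([], [])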
    for bits in range(0, 1 << len(ratings)):
        parts = split(ratings, bits)
        distance = abs(sum(parts[0]) - target)
        if best_distance > distance:
            best_distance = distance
            best_split = parts
    return best_split

ratings = [5, 6, 2, 10, 2, 3, 4]
print(ratings)
print(partition(ratings))

Output:

[5, 6, 2, 10, 2, 3, 4]
([5, 2, 2, 3, 4], [6, 10])

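Note that this output differs from the one you wanted, but both are correct.

The algorithm is based on the fact that, to pick all possible subsets of a given set with N elements, you can generate all the integers with N bits and select the I-th item depending on the value of the I-th bit. I leave it to you to add a couple of lines to stop as soon as best_distance is zero (because it can't get any better, of course).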
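
For readers who want to compare notes after trying it themselves, here is one possible way to add the early stop. It is a sketch, not the answer's own code: it measures the difference between the two teams directly, and it assumes integer ratings, for which the smallest conceivable difference is `sum(ratings) % 2`. The name `partition_bits` is made up for this illustration.

```python
def partition_bits(ratings):
    """Try every bit pattern; stop as soon as the split cannot be improved."""
    total = sum(ratings)
    best_diff = None
    best_split = ([], [])
    for bits in range(1 << len(ratings)):
        parts = ([], [])
        for i, item in enumerate(ratings):
            parts[(bits >> i) & 1].append(item)   # bit i picks the team
        diff = abs(sum(parts[0]) - sum(parts[1]))
        if best_diff is None or diff < best_diff:
            best_diff, best_split = diff, parts
            # Assumption: integer ratings, so the difference has the same
            # parity as the total and cannot drop below total % 2.
            if best_diff <= total % 2:
                break
    return best_split


print(partition_bits([5, 6, 2, 10, 2, 3, 4]))
```

Every split also appears mirrored (teams exchanged), so looping over only half of the bit patterns would be enough; the sketch keeps the full loop for simplicity.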

A little introduction to bits (note that 0b is the prefix for a binary number in Python):

Binary number: 0b0111001 == 0·2⁶+1·2⁵+1·2⁴+1·2³+0·2²+0·2¹+1·2⁰ == 57

Right shift by 1: 0b0111001 >> 1 == 0b011100 == 28

Left shift by 1: 0b0111001 << 1 == 0b01110010 == 114

Right shift by 4: 0b0111001 >> 4 == 0b011 == 3

Bitwise & (and): 0b00110 & 0b10101 == 0b00100

To check whether the 5th bit (index 4) is 1: (0b0111001 >> 4) & 1 == 0b011 & 1 == 1

A one followed by 7 zeros: 1 << 7 == 0b10000000

7 ones: (1 << 7) - 1 == 0b10000000 - 1 == 0b1111111

All 3-bit combinations: 0b000==0, 0b001==1, 0b010==2, 0b011==3, 0b100==4, 0b101==5, 0b110==6, 0b111==7 (note that 0b111 + 1 == 0b1000 == 1 << 3)
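
The identities above can be checked directly in Python. The snippet below is only a self-check of the mini-lesson; the variable names `x` and `masks` are arbitrary.

```python
x = 0b0111001

# Each assert restates one line of the mini-lesson.
assert x == 57
assert x >> 1 == 0b011100 == 28
assert x << 1 == 0b01110010 == 114
assert x >> 4 == 0b011 == 3
assert 0b00110 & 0b10101 == 0b00100
assert (x >> 4) & 1 == 1            # the bit at index 4 is set
assert 1 << 7 == 0b10000000
assert (1 << 7) - 1 == 0b1111111

# All 3-bit combinations, written as zero-padded binary strings.
masks = [format(bits, "03b") for bits in range(1 << 3)]
print(masks)
```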

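【Comments】:

  • Thank you very much! Could you explain what you did?
  • I added a micro-lesson on binary numbers and bit operations
  • You probably shouldn't define a function inside another function.
  • @AlexanderCécile it depends. In this case I think it is acceptable and improves cleanliness, and anyway, it is what the OP's instructor suggested (see the update in his question).
  • @MiniMax the permutations of N items are N!, but their subsets are 2^N: the first item is either in the subset or not: 2 possibilities; the second item is either in the subset or not: ×2; the third item... and so on, N times.
【Solution 3】:

The following algorithm does this:

  • sorts the items
  • starts by putting the even-indexed members in list a and the odd-indexed ones in list b
  • randomly moves and swaps items between a and b if the change is for the better

I have added print statements to show the progress on your example list: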

# -*- coding: utf-8 -*-
"""
Created on Fri Dec  6 18:10:07 2019

@author: Paddy3118
"""

from random import shuffle, random, randint

#%%
items = [5, 6, 2, 10, 2, 3, 4]

def eq(a, b):
    "Equal enough"
    return int(abs(a - b)) == 0

def fair_partition(items, jiggles=100):
    target = sum(items) / 2
    print(f"  Target sum: {target}")
    srt = sorted(items)
    a = srt[::2]    # every even
    b = srt[1::2]   # every odd
    asum = sum(a)
    bsum = sum(b)
    n = 0
    while n < jiggles and not eq(asum, target):
        n += 1
        if random() <0.5:
            # move from a to b?
            if random() <0.5:
                a, b, asum, bsum = b, a, bsum, asum     # Switch
            shuffle(a)
            trial = a[0]
            if abs(target - (bsum + trial)) < abs(target - bsum):  # closer
                b.append(a.pop(0))
                asum -= trial
                bsum += trial
                print(f"  Jiggle {n:2}: Delta after Move: {abs(target - asum)}")
        else:
            # swap between a and b?
            apos = randint(0, len(a) - 1)
            bpos = randint(0, len(b) - 1)
            trya, tryb = a[apos], b[bpos]
            if abs(target - (bsum + trya - tryb)) < abs(target - bsum):  # closer
                b.append(trya)  # adds to end
                b.pop(bpos)     # remove what is swapped
                a.append(tryb)
                a.pop(apos)
                asum += tryb - trya
                bsum += trya - tryb
                print(f"  Jiggle {n:2}: Delta after Swap: {abs(target - asum)}")
    return sorted(a), sorted(b)

if __name__ == '__main__':
    for _ in range(5):           
        print('\nFinal:', fair_partition(items), '\n')  

Output:

  Target sum: 16.0
  Jiggle  1: Delta after Swap: 2.0
  Jiggle  7: Delta after Swap: 0.0

Final: ([2, 3, 5, 6], [2, 4, 10]) 

  Target sum: 16.0
  Jiggle  4: Delta after Swap: 0.0

Final: ([2, 4, 10], [2, 3, 5, 6]) 

  Target sum: 16.0
  Jiggle  9: Delta after Swap: 3.0
  Jiggle 13: Delta after Move: 2.0
  Jiggle 14: Delta after Swap: 1.0
  Jiggle 21: Delta after Swap: 0.0

Final: ([2, 3, 5, 6], [2, 4, 10]) 

  Target sum: 16.0
  Jiggle  7: Delta after Swap: 3.0
  Jiggle  8: Delta after Move: 1.0
  Jiggle 13: Delta after Swap: 0.0

Final: ([2, 3, 5, 6], [2, 4, 10]) 

  Target sum: 16.0
  Jiggle  5: Delta after Swap: 0.0

Final: ([2, 4, 10], [2, 3, 5, 6]) 

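【Comments】:

  • Thank you very much, but I am supposed to do this without importing anything.
【Solution 4】:

Since I know I have to generate all the possible lists, I need to create a "helper" function to help generate all the possibilities. After doing that, I check for the minimum difference, and the combination of lists with the minimum difference is the desired solution.

The helper function is recursive and checks all the possible combinations of lists.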

def partition(ratings):

    def helper(ratings, left, right, aux_list, current_index):
        if current_index >= len(ratings):
            aux_list.append((left, right))
            return

        first = ratings[current_index]
        helper(ratings, left + [first], right, aux_list, current_index + 1)
        helper(ratings, left, right + [first], aux_list, current_index + 1)

    #l contains all possible sublists
    l = []
    helper(ratings, [], [], l, 0)
    set1 = []
    set2 = []
    #start from infinity so that inputs whose best difference is 1000 or more still work
    mindiff = float("inf")
    for sets in l:
        diff = abs(sum(sets[0]) - sum(sets[1]))
        if diff < mindiff:
            mindiff = diff
            set1 = sets[0]
            set2 = sets[1]
    return (set1, set2)

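Examples: r = [1, 2, 2, 3, 5, 4, 2, 4, 5, 5, 2], the optimal partition is: ([1, 2, 2, 3, 5, 4], [2, 4, 5, 5, 2]), with a difference of 1

r = [73, 7, 44, 21, 43, 42, 92, 88, 82, 70], the optimal partition is: ([73, 7, 21, 92, 88], [44, 43, 42, 82, 70]), with a difference of 0

【Comments】:

  • Since you asked me: your solution is fine if you are learning. It has only one problem, which fortunately does not kick in before the other problem it shares with the other solutions: it uses exponential space (O(n2ⁿ)). But exponential time becomes a problem long before that. Nevertheless, avoiding the use of exponential space would be easy.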
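
One way to act on that comment, sketched under the same no-imports constraint: instead of storing every pair of lists, the helper can compare each complete split against the best one seen so far and keep only that one. The names `partition_low_memory` and `best` are made up for this illustration; the search is still exponential in time.

```python
def partition_low_memory(ratings):
    """Exhaustive search that keeps only the best split found so far."""
    best = {"diff": None, "left": [], "right": []}
    left, right = [], []                  # the split currently being built

    def helper(index, left_sum, right_sum):
        if index == len(ratings):         # every rating has been placed
            diff = abs(left_sum - right_sum)
            if best["diff"] is None or diff < best["diff"]:
                best["diff"] = diff
                best["left"] = left[:]    # copy, because left/right keep changing
                best["right"] = right[:]
            return
        item = ratings[index]
        left.append(item)                 # choice 1: item goes left
        helper(index + 1, left_sum + item, right_sum)
        left.pop()
        right.append(item)                # choice 2: item goes right
        helper(index + 1, left_sum, right_sum + item)
        right.pop()

    helper(0, 0, 0)
    return best["left"], best["right"]


print(partition_low_memory([5, 6, 2, 10, 2, 3, 4]))
```

At any moment only the current split, the best split, and the recursion stack are alive, so the extra memory grows linearly with the number of ratings.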
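【Solution 5】:

This is a fairly elaborate example, intended for educational purposes rather than performance. It introduces some interesting Python concepts such as list comprehensions and generators, as well as a good example of recursion in which edge cases need to be checked appropriately. Extensions, e.g. allowing only teams with an equal number of players, are easy to implement in the appropriate individual functions.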

def listFairestWeakTeams(ratings):
    current_best_weak_team_rating = -1
    fairest_weak_teams = []
    for weak_team in recursiveWeakTeamGenerator(ratings):
        weak_team_rating = teamRating(weak_team, ratings)
        if weak_team_rating > current_best_weak_team_rating:
            fairest_weak_teams = []
            current_best_weak_team_rating = weak_team_rating
        if weak_team_rating == current_best_weak_team_rating:
            fairest_weak_teams.append(weak_team)
    return fairest_weak_teams


def recursiveWeakTeamGenerator(
    ratings,
    weak_team=[],
    current_applicant_index=0
):
    if not isValidWeakTeam(weak_team, ratings):
        return
    if current_applicant_index == len(ratings):
        yield weak_team
        return
    for new_team in recursiveWeakTeamGenerator(
        ratings,
        weak_team + [current_applicant_index],
        current_applicant_index + 1
    ):
        yield new_team
    for new_team in recursiveWeakTeamGenerator(
        ratings,
        weak_team,
        current_applicant_index + 1
    ):
        yield new_team


def isValidWeakTeam(weak_team, ratings):
    total_rating = sum(ratings)
    weak_team_rating = teamRating(weak_team, ratings)
    optimal_weak_team_rating = total_rating // 2
    if weak_team_rating > optimal_weak_team_rating:
        return False
    elif weak_team_rating * 2 == total_rating:
        # In case of equal strengths, player 0 is assumed
        # to be in the "weak" team
        return 0 in weak_team
    else:
        return True


def teamRating(team_members, ratings):
    return sum(memberRatings(team_members, ratings))    


def memberRatings(team_members, ratings):
    return [ratings[i] for i in team_members]


def getOpposingTeam(team, ratings):
    return [i for i in range(len(ratings)) if i not in team]


ratings = [5, 6, 2, 10, 2, 3, 4]
print("Player ratings:     ", ratings)
print("*" * 40)
for option, weak_team in enumerate(listFairestWeakTeams(ratings)):
    strong_team = getOpposingTeam(weak_team, ratings)
    print("Possible partition", option + 1)
    print("Weak team members:  ", weak_team)
    print("Weak team ratings:  ", memberRatings(weak_team, ratings))
    print("Strong team members:", strong_team)
    print("Strong team ratings:", memberRatings(strong_team, ratings))
    print("*" * 40)

Output:

Player ratings:      [5, 6, 2, 10, 2, 3, 4]
****************************************
Possible partition 1
Weak team members:   [0, 1, 2, 5]
Weak team ratings:   [5, 6, 2, 3]
Strong team members: [3, 4, 6]
Strong team ratings: [10, 2, 4]
****************************************
Possible partition 2
Weak team members:   [0, 1, 4, 5]
Weak team ratings:   [5, 6, 2, 3]
Strong team members: [2, 3, 6]
Strong team ratings: [2, 10, 4]
****************************************
Possible partition 3
Weak team members:   [0, 2, 4, 5, 6]
Weak team ratings:   [5, 2, 2, 3, 4]
Strong team members: [1, 3]
Strong team ratings: [6, 10]
****************************************

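【Comments】:

    【Solution 6】:

    Given that you want even teams, you know the target score for each team's ratings: it is the sum of the ratings divided by 2.

    So the following code should do what you want.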

    from itertools import combinations
    
    ratings = [5, 6, 2, 10, 2, 3, 4]
    
    target = sum(ratings)/2 
    
    difference_dictionary = {}
    for i in range(1, len(ratings)): 
        for combination in combinations(ratings, i): 
            diff = sum(combination) - target
            if diff >= 0: 
                difference_dictionary[diff] = difference_dictionary.get(diff, []) + [combination]
    
    # get min difference to target score 
    min_difference_to_target = min(difference_dictionary.keys())
    strong_ratings = difference_dictionary[min_difference_to_target]
    first_strong_ratings = [x for x in strong_ratings[0]]
    
    weak_ratings = ratings.copy()
    for strong_rating in first_strong_ratings: 
        weak_ratings.remove(strong_rating)
    

    Output:

    first_strong_ratings 
    [6, 10]
    
    weak_ratings 
    [5, 2, 2, 3, 4]
    

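    There are other splits with the same fairness; these are all available in the strong_ratings list of tuples. I just chose to look at the first one, since it will always exist for any ratings list you pass in (provided the list holds at least two ratings, since the loop over range(1, len(ratings)) is otherwise empty).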

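    【Comments】:

    • The challenge of this problem is to not import anything, as I mentioned in my question. Thanks for your input!
    【Solution 7】:

    A greedy solution may yield a sub-optimal solution. Here is a fairly simple greedy solution: the idea is to sort the list in descending order, so as to reduce the effect of each rating added to a bucket. Each rating is added to whichever bucket currently has the smaller total rating.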

    lis = [5, 6, 2, 10, 2, 3, 4]
    lis.sort()
    lis.reverse()
    
    bucket_1 = []
    bucket_2 = []
    
    for item in lis:
        if sum(bucket_1) <= sum(bucket_2):
            bucket_1.append(item)
        else:
            bucket_2.append(item)
    
    print("Bucket 1 : {}".format(bucket_1))
    print("Bucket 2 : {}".format(bucket_2))
    
    

    Output:

    Bucket 1 : [10, 4, 2]
    Bucket 2 : [6, 5, 3, 2]
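
To see concretely how the greedy rule can fall short, here is a small sketch that wraps the same rule in a function (the name `greedy_partition` is made up for this illustration) and runs it on a list where a perfect split exists.

```python
def greedy_partition(ratings):
    """Largest rating first, always into the bucket with the smaller total."""
    bucket_1, bucket_2 = [], []
    for item in sorted(ratings, reverse=True):
        if sum(bucket_1) <= sum(bucket_2):
            bucket_1.append(item)
        else:
            bucket_2.append(item)
    return bucket_1, bucket_2


b1, b2 = greedy_partition([3, 3, 2, 2, 2])
# Greedy ends with [3, 2, 2] and [3, 2]: a difference of 2.
# The split [3, 3] / [2, 2, 2] has a difference of 0, so greedy is not optimal here.
print(b1, b2, abs(sum(b1) - sum(b2)))
```

On the question's own example the greedy rule happens to reach a difference of 0, which is why a checker needs inputs like this one to tell the two approaches apart.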
    

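    Edit:

    Another approach is to generate all the possible subsets of the list. Say you have l1, which is one of the subsets of the list; then you can easily get the list l2 such that l2 = list(original) - l1. The number of all possible subsets of a list of size n is 2^n. We can denote them as the sequence of integers from 0 to 2^n - 1. As an example, say you have list = [1, 3, 5]; then the number of possible combinations is 2^3, i.e. 8. We can now write all the combinations as follows:

    1. 000 - [] - 0
    2. 001 - [1] - 1
    3. 010 - [3] - 2
    4. 011 - [1,3] - 3
    5. 100 - [5] - 4
    6. 101 - [1,5] - 5
    7. 110 - [3,5] - 6
    8. 111 - [1,3,5] - 7

    And l2, in this case, can easily be obtained by taking the xor with 2^n - 1.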
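
The xor step can be sketched on the same three-item list. The helper name `subset_from_mask` is made up for this illustration; bit i of the mask decides whether item i is selected, as in the table above.

```python
def subset_from_mask(items, mask):
    """Return the items whose bit is set in mask (bit i <-> items[i])."""
    return [item for i, item in enumerate(items) if mask & (1 << i)]


items = [1, 3, 5]
full = (1 << len(items)) - 1          # 0b111: every bit set

for mask in range(full + 1):
    l1 = subset_from_mask(items, mask)
    l2 = subset_from_mask(items, mask ^ full)   # xor flips every bit
    print(format(mask, "03b"), l1, l2)
```

Because the xor flips each of the n bits, every item lands in exactly one of l1 and l2.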

    Solution:

    def sum_list(lis, n, X):
        """
        This function will return the sum of all elements whose bit is set to 1 in X
        """
        sum_ = 0
        # print(X)
        for i in range(n):
            if (X & 1<<i ) !=0:
                # print( lis[i], end=" ")
                sum_ += lis[i]
        # print()
        return sum_
    
    def return_list(lis, n, X):
        """
        This function will return the list of all elements whose bit is set to 1 in X
        """
        new_lis = []
        for i in range(n):
            if (X & 1<<i) != 0:
                new_lis.append(lis[i])
        return new_lis
    
    lis = [5, 6, 2, 10, 2, 3, 4]
    n = len(lis)
    total = 2**n -1 
    
    result_1 = 0
    result_2 = total
    result_1_sum = 0
    result_2_sum = sum_list(lis,n, result_2)
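    # Difference of the starting split (everything in one bucket). `total` is a
    # bit mask, not a rating sum, so it is not a safe starting value here.
    ans = abs(result_1_sum - result_2_sum)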
    for i in range(total):
        x = (total ^ i)
        sum_x = sum_list(lis, n, x)
        sum_y = sum_list(lis, n, i)
    
        if abs(sum_x-sum_y) < ans:
            result_1 =  x
            result_2 = i
            result_1_sum = sum_x
            result_2_sum = sum_y
            ans = abs(result_1_sum-result_2_sum)
    
    """
    Produce resultant list
    """
    
    bucket_1 = return_list(lis,n,result_1)
    bucket_2 = return_list(lis, n, result_2)
    
    print("Bucket 1 : {}".format(bucket_1))
    print("Bucket 2 : {}".format(bucket_2))
    
    
    

    Output:

    Bucket 1 : [5, 2, 2, 3, 4]
    Bucket 2 : [6, 10]
    

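    【Comments】:

    • Hi, if you read my original question, you can see that I already used the greedy approach, and it was rejected. Thanks for your input though!
    • @EddieEC what is the constraint on n (the length of the array)? If you want to generate all possible combinations, then it is basically a subset sum problem, which is an NP-complete problem.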
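
None of the answers above uses it, but when the ratings are small non-negative integers there is a common alternative to trying all 2^n splits: build the set of reachable team totals one rating at a time. The sketch below swaps in that dynamic-programming technique and uses no imports; the names `partition_dp` and `reachable` are made up for this illustration.

```python
def partition_dp(ratings):
    """Sketch of a subset-sum style search; assumes non-negative integers."""
    total = sum(ratings)
    # reachable[s] = one list of ratings that adds up to s
    reachable = {0: []}
    for r in ratings:
        # Iterate over a snapshot so that r is used at most once per subset.
        for s, chosen in list(reachable.items()):
            if s + r not in reachable:
                reachable[s + r] = chosen + [r]
    # Pick the reachable total closest to half of the overall total.
    best_sum = min(reachable, key=lambda s: abs(total - 2 * s))
    team_a = reachable[best_sum]
    team_b = list(ratings)
    for r in team_a:                  # whatever is not in team A is team B
        team_b.remove(r)
    return team_a, team_b


print(partition_dp([5, 6, 2, 10, 2, 3, 4]))
```

The work grows with the number of distinct reachable totals rather than with 2^n, which helps only when the sum of the ratings is modest; the problem remains NP-complete in general.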