How to find the best combination of a given vector whose sum is closest to a given number

【Question title】: How to find out the best combination of a given vector whose sum is closest to a given number
【Posted】: 2017-06-18 02:48:47
【Question description】:

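My question is very similar to this one: Find a subset from a set of integer whose sum is closest to a value

That question only discusses the algorithm, but I want to solve the problem in R. I am quite new to R and have worked out a solution, but I wonder whether there is a more efficient way.

Here is my example: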

# Define a vector; the goal is to find a subset whose sum is closest to the reference number 20.
A <- c(2,5,6,3,7)

# display all the possible combinations
y1 <- combn(A,1)
y2 <- combn(A,2)
y3 <- combn(A,3)
y4 <- combn(A,4)
y5 <- combn(A,5)
Y <- list(y1,y2,y3,y4,y5)

# calculate the distance to the reference number of each combination
s1 <- abs(apply(y1,2,sum)-20)
s2 <- abs(apply(y2,2,sum)-20)
s3 <- abs(apply(y3,2,sum)-20)
s4 <- abs(apply(y4,2,sum)-20)
s5 <- abs(apply(y5,2,sum)-20)
S <- list(s1,s2,s3,s4,s5)

# find the minimum difference
M <- sapply(S,FUN=function(x) list(which.min(x),min(x)))
Mm <- which.min(as.numeric(M[2,]))

# return the best combination: column M[[1, Mm]] of the matrix for subset size Mm
Y[[Mm]][, M[[1, Mm]]]

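So the answer is 2, 5, 6, 7.

How can I improve this program? In particular, is there a way to handle the five combn() calls and the five apply() calls in one go? I would like to use length(A) so that the code still covers every subset size when A has more items.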

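【Question discussion】:

Try lapply(1:5, function(i) abs(colSums(combn(A, i))-20))
I think the first two blocks of code can be replaced with Y <- lapply(1:5, function(i) combn(A, i)); S <- lapply(Y, function(x) abs(colSums(x) - 20)), and then you can apply the rest of your code.
How large is your real A? For large vectors your code will not finish in a reasonable time, because you are testing every combination one by one. If the length is 5, as in this example, there are only 32 combinations to check (32 = 2^5). If the length is 20, there are 1048576 combinations, which will finish within a few minutes. For a length of 50 it is almost hopeless. If your A is large, you need to find a clever algorithm.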
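
One candidate for the "clever algorithm" mentioned in the last comment is the classic subset-sum dynamic programme, which tracks which sums are reachable instead of enumerating subsets. The sketch below is not something proposed in the thread. It assumes every element of A is a non-negative integer, and the names closest_subset, reachable and from are made up for illustration. Its cost grows with length(A) * sum(A) rather than 2^length(A).

```r
# Sketch only: assumes A holds non-negative integers (an assumption, not from the thread).
closest_subset <- function(A, target) {
  stopifnot(all(A >= 0), all(A == round(A)))
  total <- sum(A)
  # reachable[s + 1] is TRUE when some subset sums to s (sum 0 = empty subset)
  reachable <- c(TRUE, rep(FALSE, total))
  # from[s + 1] is the index of the item that first produced sum s
  from <- rep(NA_integer_, total + 1)
  for (i in seq_along(A)) {
    # sums obtained by adding A[i] to sums reachable with the earlier items
    cand <- (which(reachable) - 1) + A[i]
    cand <- cand[!reachable[cand + 1]]   # keep only sums not seen before
    reachable[cand + 1] <- TRUE
    from[cand + 1] <- i
  }
  sums <- which(reachable) - 1
  # ties are resolved in favour of the smaller sum (which.min takes the first)
  best <- sums[which.min(abs(sums - target))]
  # walk back through `from` to recover which items were used
  picked <- integer(0)
  s <- best
  while (s > 0) {
    i <- from[s + 1]
    picked <- c(i, picked)
    s <- s - A[i]
  }
  list(sum = best, subset = A[picked])
}

res <- closest_subset(c(2, 5, 6, 3, 7), 20)
res$subset
```

Note that this version also considers the empty subset (sum 0), and it returns a single best subset even when several tie.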

Tags: r algorithm

【Solution 1】:

Here is another approach:

l1 <- sapply(seq_along(A), function(i) combn(A, i))
l2 <- sapply(l1, function(i) abs(colSums(i) - 20))

Filter(length, Map(function(x, y)x[,y], l1, sapply(l2, function(i) i == Reduce(min, l2))))
#[[1]]
#[1] 2 5 6 7

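The last line uses Map to index l1 with a list of logical vectors, which is built by comparing each element of l2 against the overall minimum of l2.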
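
To make that indexing step easier to follow, the same idea can be unrolled into named steps. This is only a sketch: target, best, hit and matches are illustrative names, and it swaps sapply for lapply so that each intermediate result is always a list.

```r
A <- c(2, 5, 6, 3, 7)
target <- 20  # the reference number from the question

# All combinations for each subset size, and each combination's distance to target
l1 <- lapply(seq_along(A), function(i) combn(A, i))
l2 <- lapply(l1, function(m) abs(colSums(m) - target))

# Smallest distance over every subset size
best <- min(unlist(l2))

# One logical vector per subset size: TRUE where a combination attains `best`
hit <- lapply(l2, function(d) d == best)

# Keep the matching columns; drop = FALSE keeps a matrix even for a single match
matches <- Map(function(m, h) m[, h, drop = FALSE], l1, hit)

# Discard the subset sizes that produced no match
matches <- Filter(function(m) ncol(m) > 0, matches)
matches
```

Each remaining matrix holds one best combination per column, so ties across or within subset sizes are all kept.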

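【Discussion】:

Thanks! And your filter is much simpler than mine, really cool.

【Solution 2】:

The combiter library provides the isubsetv iterator, which goes through all subsets of a vector. Combining it with foreach simplifies the code.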

library(combiter)
library(foreach)
A <- c(2,5,6,3,7)

res <- foreach(x = isubsetv(A), .combine = c) %do% sum(x)
absdif <- abs(res-20)
ind <- which(absdif==min(absdif))
as.list(isubsetv(A))[ind]
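
For readers who do not have combiter installed, a single pass over all subsets can also be sketched in base R with a 0/1 inclusion matrix. This is not how combiter works internally, just an illustration. masks, sums and best_subsets are made-up names, and the matrix has 2^length(A) - 1 rows, so it only suits short vectors.

```r
A <- c(2, 5, 6, 3, 7)
target <- 20
n <- length(A)

# One row per subset: a 1 in column j means A[j] is included.
# The first row of expand.grid() is all zeros (the empty subset), so drop it.
masks <- as.matrix(expand.grid(rep(list(0:1), n)))[-1, , drop = FALSE]

# Sum of every subset in one matrix product
sums <- as.vector(masks %*% A)

# Rows whose sum is closest to the target (all ties are kept)
absdif <- abs(sums - target)
rows <- which(absdif == min(absdif))

best_subsets <- lapply(rows, function(r) A[masks[r, ] == 1])
best_subsets
```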

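【Discussion】:

Thanks for introducing me to new syntax! Does "%do%" mean it runs multi-threaded? I have no idea. You also mentioned another algorithm. My real A will have no more than 20 items, but I am still curious. Could you give me some pointers?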