【Question title】: R convert fractions to integer percentages adding up to 100
【Posted】: 2014-08-17 19:33:47
【Question description】:

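I have calculated a vector of frequencies of different events, represented as fractions and sorted in descending order. I need to interface with a tool that requires positive integer percentages that sum to exactly 100. I would like to generate the percentages in a way that best represents the input distribution. That is, I would like the relationships (ratios) among the percentages to match those among the input fractions as closely as possible, even though any nonlinearity will result in cutting off the long tail.

I have a function that generates these percentages, but I don't think it is optimal or elegant. In particular, I would like to do more of the work in numeric space before resorting to "stupid integer tricks".

Here is an example frequency vector: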

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9,358)))
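For orientation, here is a quick look at what this sample vector contains. This is only a sanity check on the data above and uses nothing beyond base R.

```r
fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))

length(fractionals)            # 362 events
sum(fractionals)               # 1 (every value is an exact power of two)
is.unsorted(rev(fractionals))  # FALSE, so the vector is non-increasing
head(fractionals * 100, 5)     # 25, 3.125, 1.5625, 0.390625, 0.1953125
```

So one event carries 25% of the mass, while each of the 358 tail events carries just under 0.2%, well below the 1% minimum the tool accepts.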

Here is my function:

# Convert vector of fractions to integer percents summing to 100
percentize <- function(fractionals) {
  # fractionals is sorted descending and adds up to 1
  # drop elements that wouldn't round up to 1% vs. running total
  pctOfCum <- fractionals / cumsum(fractionals)
  fractionals <- fractionals[pctOfCum > 0.005]

  # calculate initial percentages
  percentages <- round((fractionals / sum(fractionals)) * 100)

  # if sum of percentages exceeds 100, remove proportionally
  i <- 1
  while (sum(percentages) > 100) {
    excess <- sum(percentages) - 100
    if (i > length(percentages)) {
      i <- 1
    }
    partialExcess <- max(1, round((excess * percentages[i]) / 100))
    percentages[i] <- percentages[i] - min(partialExcess,
                                           percentages[i] - 1)
    i <- i + 1
  }

  # if sum of percentages shorts 100, add proportionally
  i <- 1
  while (sum(percentages) < 100) {
    shortage <- 100 - sum(percentages)
    if (i > length(percentages)) {
      i <- 1
    }
    partialShortage <- max(1, round((shortage * percentages[i]) / 100))
    percentages[i] <- percentages[i] + partialShortage
    i <- i + 1
  }

  return(percentages)
}
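To see why the two correction loops are needed at all, the first two steps of the function can be run on their own against the sample vector. The names `kept` and `rounded` are introduced here only for illustration.

```r
fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))

# Step 1: drop elements that are at most 0.5% of the running total
pctOfCum <- fractionals / cumsum(fractionals)
kept <- fractionals[pctOfCum > 0.005]
length(kept)     # 49: the four large values plus 45 tail values

# Step 2: renormalize and round to whole percentages
rounded <- round(kept / sum(kept) * 100)
head(rounded, 5) # 64 8 4 1 1
sum(rounded)     # 122
```

Each surviving tail value is about 0.5025% after renormalizing and therefore rounds up to 1, which pushes the total 22 points over 100. The first `while` loop then has to take that excess back out of the leading elements.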

Any ideas?

【Question comments】:

Tags: r integer data-analysis frequency-distribution


【Solution 1】:

How about this? It rescales the values so that they add up to 100, and if rounding leaves the total at 99, it adds 1 to the largest frequency.

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9,358)))
pctOfCum <- fractionals / cumsum(fractionals)
fractionals <- fractionals[pctOfCum > 0.005]

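# Truncate each percentage and add 1, so every kept element gets at least 1
bunnies <- as.integer(fractionals / sum(fractionals) * 100) + 1
# Rescale the entries above 1 to fill what the 1% entries leave of 100
bunnies[bunnies > 1] <- round(bunnies[bunnies > 1] *
                              (100 - sum(bunnies[bunnies == 1])) /
                              sum(bunnies[bunnies > 1]))
# If rounding leaves the total short of 100, add 1 to the largest entry
if (sum(bunnies) < 100) bunnies[1] <- bunnies[1] + 1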

> bunnies
[1] 45  6  3  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
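For comparison, here is a minimal sketch of a different technique, the largest remainder (Hamilton) method, which is not what the answer above does. It keeps the work in numeric space until the last step: compute exact percentages, take the floor, then hand the missing points to the elements with the largest fractional parts. The function name `percentize_lr` and its `total` argument are hypothetical, and dropping the elements that end up at 0 is an assumption about how the long tail should be cut.

```r
# Hypothetical helper: largest remainder rounding, then drop zero entries
percentize_lr <- function(fractionals, total = 100) {
  ideal <- fractionals / sum(fractionals) * total  # exact percentages
  base <- floor(ideal)                             # integer part of each
  leftover <- total - sum(base)                    # points still to hand out
  # Give one extra point to the elements with the largest fractional parts;
  # ties are broken by position, so earlier elements win
  bump <- order(ideal - base, decreasing = TRUE)[seq_len(leftover)]
  base[bump] <- base[bump] + 1
  # Elements left at 0 are the truncated tail
  as.integer(base[base > 0])
}

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))
result <- percentize_lr(fractionals)
sum(result)     # 100
length(result)  # 73
head(result, 5) # 25 3 2 1 1
```

On the sample vector this works on the full 362 elements without a separate truncation step and returns 25, 3, 2 followed by seventy 1s. The leading values stay at their untruncated shares (25%, 3.125%, 1.5625%) rather than being inflated by renormalizing over a shortened vector, at the cost of the 1% entries overstating the tail elements they stand for.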

【Comments】:
