How to set class_weight in the keras package of R?

【Question title】: How to set class_weight in keras package of R?
【Posted】: 2017-10-24 10:15:35
【Question】:

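I am using the keras package in R to train a deep learning model. My dataset is highly imbalanced, so I want to set the class_weight argument of the fit function. Here is the fit call I use for my model, with its arguments: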

history <- model %>% fit(
  trainData, trainClass, 
  epochs = 5, batch_size = 1000, 
  class_weight = ????,
  validation_split = 0.2
)

In Python I can set class_weight like this:

class_weight={0:1, 1:30}

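But I am not sure how to do this in R. The R help page describes class_weight as follows:

Optional named list mapping indices (integers) to a weight (float) to apply to the model's loss for the samples from this class during training. This can be useful to tell the model to "pay more attention" to samples from an under-represented class.

Any ideas or suggestions?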

【Comments】:

I have no experience with keras, but the first thing I would try is list("0" = 1, "1" = 30)

Tags: r tensorflow deep-learning keras

【Solution 1】:

class_weight needs to be a named list, so

    history <- model %>% fit(
        trainData, trainClass, 
        epochs = 5, batch_size = 1000, 
        class_weight = list("0"=1,"1"=30),
        validation_split = 0.2
    )

seems to work. Internally, keras uses a function named as_class_weights to convert the list to a Python dictionary (see https://rdrr.io/cran/keras/src/R/model.R).

     class_weight <- dict(list('0'=1,'1'=10))
     class_weight
     >>> {0: 1.0, 1: 10.0}

This looks just like the Python dictionary you mentioned above.
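Because the list names have to stand in for integer class indices, it can help to sanity-check the list before passing it to fit. The following is a minimal base R sketch; check_class_weight is a hypothetical helper written for illustration, not a function from the keras package, and it assumes the class labels are stored as integers such as 0 and 1.

```r
# Hypothetical helper (not part of keras): sanity-check a class_weight list
check_class_weight <- function(class_weight, labels) {
  # Must be a list with names, e.g. list("0" = 1, "1" = 30)
  stopifnot(is.list(class_weight), !is.null(names(class_weight)))

  # Names must parse as integer class indices such as "0" or "1"
  idx <- suppressWarnings(as.integer(names(class_weight)))
  if (anyNA(idx)) stop("class_weight names must be integer class indices")

  # Every class that occurs in the labels should have a weight
  missing <- setdiff(unique(labels), idx)
  if (length(missing) > 0)
    stop("no weight for class(es): ", paste(missing, collapse = ", "))

  invisible(TRUE)
}

# Passes silently for the list used in the answer above
check_class_weight(list("0" = 1, "1" = 30), labels = c(0, 0, 1, 0))
```

A list with non-numeric names, such as list(a = 1), makes the helper stop with an error instead of reaching the training call.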

【Comments】:

【Solution 2】:

I found a generic solution written in Python, so I converted it to R:

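library(dplyr)  # provides %>% and select()

# Frequency of each class in the training labels
counter <- funModeling::freq(Y_data_aux_tr, plot = FALSE) %>% select(var, frequency)
majority <- max(counter$frequency)
# Weight = majority count / class count, rounded up (R uses ceiling(), not ceil())
counter$weight <- ceiling(majority / counter$frequency)

# Named list: class label -> weight
l_weights <- setNames(as.list(counter$weight), counter$var)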

To use it:

 fit(..., class_weight = l_weights)

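If you use fit_generator, one piece of advice: because the weights are based on frequency, having different numbers of training and validation samples can bias the validation results. The two sets should be the same size.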
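The frequency-based weighting above can also be sketched in base R, without funModeling or dplyr. This is an illustrative variant rather than the answer's own code; y_train is hypothetical example data with 90 samples of class 0 and 10 of class 1.

```r
# Hypothetical binary labels: 90 samples of class 0, 10 of class 1
y_train <- c(rep(0, 90), rep(1, 10))

# Count samples per class; names(counts) are the class labels as strings
counts <- table(y_train)

# Weight = majority count / class count, rounded up
weights <- ceiling(max(counts) / counts)

# Named list keyed by class label, the shape passed as class_weight
class_weights <- setNames(as.list(as.numeric(weights)), names(counts))
str(class_weights)
```

For this data the result is equivalent to list("0" = 1, "1" = 9), which can be passed as fit(..., class_weight = class_weights).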

【Comments】: