【Question title】: R unscale and back transform plot axis or use axis from original data column
【Posted】: 2020-10-03 22:31:58
【Question】:

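I am plotting the effect of a variable on a model fit. The variable was sqrt-transformed and then scaled. I can plot the modelled fit against the raw values of weight, but the resulting geom_line looks very different, and the part of the x-axis where the modelled fit rises sharply gets squashed. I prefer the first plot, which stretches that range out. However, I would like the x-axis tick marks to be on the original data scale. As suggested, I used labels = weight, but that produces far too many labels. Is there a way to reduce the number of labels or tick marks?

Here is a simplified example of my dataset and current plots. I want the x-axis to show the values of the weight column rather than the plotted sqrt.scale.weight column: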

fit <- c(0.371, 0.4103, 0.431, 0.4482, 0.4644, 0.4773, 0.4893, 0.5007, 0.5116, 0.5213, 0.5308, 0.5392, 0.5473, 0.5554, 0.5626, 0.571, 0.5785, 0.5849, 0.5907, 0.5968, 0.6029, 0.6091, 0.6145, 0.62, 0.626, 0.6312, 0.6359, 0.6403, 0.6448, 0.6504, 0.6547, 0.6594, 0.664, 0.6684, 0.6729, 0.6774, 0.6821, 0.6863, 0.6906, 0.6952, 0.6993, 0.7033, 0.7071, 0.7108, 0.7143, 0.7172, 0.7205, 0.723, 0.7254, 0.7277, 0.7293, 0.7305, 0.7314, 0.7319, 0.732, 0.7319, 0.7314, 0.7307, 0.7295, 0.7281, 0.7263, 0.7241, 0.7219, 0.7194, 0.717, 0.7145, 0.7113, 0.7086, 0.7059, 0.7032, 0.701, 0.699, 0.6975, 0.6969, 0.697, 0.6989, 0.7069, 0.7347)

weight <- c(0, 0.0889, 0.2036, 0.3335, 0.4844, 0.6248, 0.7703, 0.9243, 1.0858, 1.2425, 1.4052, 1.5619, 1.7211, 1.89, 2.0493, 2.2476, 2.4336, 2.6021, 2.7624, 2.9379, 3.1268, 3.3228, 3.5082, 3.7031, 3.9277, 4.1324, 4.3255, 4.5165, 4.721, 4.9912, 5.2123, 5.4627, 5.7272, 5.9916, 6.2829, 6.5953, 6.944, 7.2809, 7.6518, 8.087, 8.5059, 8.9622, 9.4454, 9.9778, 10.5475, 11.0788, 11.7702, 12.409, 13.1368, 14.04, 14.8531, 15.6675, 16.614, 17.4447, 18.3222, 19.312, 20.2457, 21.2823, 22.5272, 23.71, 25.0778, 26.5766, 28.0484, 29.6478, 31.122, 32.7483, 34.8543, 36.8603, 38.961, 41.4882, 43.9276, 46.8164, 50.1696, 52.8536, 57.0352, 62.8378, 74.3099, 100.737)

sqrt.scale.weight <- c(-1.2543, -1.1136, -1.0413, -0.9818, -0.9258, -0.8812, -0.84, -0.8005, -0.7625, -0.7282, -0.6948, -0.6644, -0.6351, -0.6054, -0.5786, -0.5467, -0.518, -0.4929, -0.4698, -0.4453, -0.4197, -0.3939, -0.3702, -0.346, -0.3189, -0.2948, -0.2726, -0.2512, -0.2287, -0.1998, -0.1767, -0.1511, -0.1247, -0.0989, -0.0712, -0.0421, -0.0105, 0.0193, 0.0514, 0.088, 0.1223, 0.1587, 0.1963, 0.2367, 0.2786, 0.3168, 0.365, 0.4084, 0.4565, 0.5143, 0.5648, 0.614, 0.6696, 0.7171, 0.7661, 0.82, 0.8695, 0.9232, 0.986, 1.044, 1.1094, 1.179, 1.2455, 1.3158, 1.3789, 1.4468, 1.5323, 1.6114, 1.6919, 1.786, 1.8741, 1.9753, 2.089, 2.1772, 2.3104, 2.4873, 2.8146, 3.4832)

library(ggplot2)

dat <- data.frame(weight, sqrt.scale.weight, fit)

ggplot(data=dat,aes(sqrt.scale.weight, fit)) +
  geom_line(col="red") +
  geom_rug(sides="b") +
  theme_bw() +
  scale_y_continuous(limits = c(0, 1),breaks = seq(0, 1, by = 0.2)) +
  theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank()) +
  labs(y = "Modelled probability", x = "sqrt scaled variable")  

ggplot(data=dat,aes(weight, fit)) +
  geom_line(col="red") +
  geom_rug(sides="b") +
  theme_bw() +
  scale_y_continuous(limits = c(0, 1),breaks = seq(0, 1, by = 0.2)) +
  theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank()) +
  labs(y = "Modelled probability", x = "weight variable")  

ggplot(data=dat,aes(sqrt.scale.weight, fit)) +
  geom_line(col="red") +
  geom_rug(sides="b") +
  theme_bw() +
  scale_y_continuous(limits = c(0, 1),breaks = seq(0, 1, by = 0.2)) +
  scale_x_continuous(breaks = sqrt.scale.weight, labels = weight) + 
  theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank()) +
  labs(y = "Modelled probability", x = "sqrt scaled variable with weight label")  

【Comments】:

  • I can't run your code because lower and upper don't exist
  • scale_x_continuous(breaks = sqrt.scale.weight, labels = weight)
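  • Also, I don't understand your question. You want to plot weight instead of sqrt.scale.weight. So? What is stopping you? And I don't understand how the "OR.." is an alternative to that. But if you want to compute the values at 0, 1, 2, etc., you need to use splinefun()
  • @Edo Sorry, I have removed the lower and upper lines and edited the question; hopefully it is clearer now. Plotting the sqrt-transformed and scaled variable stretches out the range of x-axis values (weight 0-10) over which we see the effect on the modelled fit. That is what I want to keep, but with meaningful x-axis labels.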
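
As a rough illustration of the splinefun() suggestion in the comments, the sketch below interpolates the fit at round weight values. It uses only the first ten rows of the question's data to stay short, and the names w_sub, fit_sub, and fit_at are made up for this example.

```r
# First ten (weight, fit) pairs from the question's data; a subset for brevity.
w_sub   <- c(0, 0.0889, 0.2036, 0.3335, 0.4844, 0.6248, 0.7703, 0.9243, 1.0858, 1.2425)
fit_sub <- c(0.371, 0.4103, 0.431, 0.4482, 0.4644, 0.4773, 0.4893, 0.5007, 0.5116, 0.5213)

# splinefun() returns a function that interpolates fit as a function of weight.
fit_at <- splinefun(w_sub, fit_sub)

# Interpolated fit at weights chosen by hand (all inside the subset's range).
fit_at(c(0, 0.5, 1))
```

With the full vectors, fit_at(0:10) would give the fit at integer weights. This only helps if values at specific weights are needed; it does not by itself change the axis labels.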

Tags: r


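【Answer 1】:

I think this is what you are looking for.

First, keep the attributes when scaling: you need the same mean and sd to transform the ggplot labels accordingly.

I created some labels I liked in mylabels, but you can assign whatever you want displayed to mylabels.

Then compute mybreaks: the point is to apply to mylabels the same transformation that was applied to weight when computing sqrt.scale.weight.

This way we are actually plotting sqrt.scale.weight, but we adjust the x-axis to show the corresponding labels for the actual weight values.

My labels are not perfect because I computed the mean and standard deviation from only part of your data. If you take the scale attributes from the whole dataset, it should look perfect.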

att <- attributes(scale(sqrt(dat$weight)))
mylabels <- seq(0,100,10)
mybreaks <- scale(sqrt(mylabels), att$`scaled:center`, att$`scaled:scale`)[,1]

ggplot(data = dat, aes(sqrt.scale.weight, fit)) +
  geom_line(col = "red") +
  geom_rug(sides = "b") +
  theme_bw() +
  scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, by = 0.2)) +
  scale_x_continuous(labels = mylabels, breaks = mybreaks) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(y = "Modelled probability", x = "variable")  
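
The forward and inverse transformations behind mybreaks can be written as a pair of helpers, which makes it easy to check that the breaks really correspond to the intended labels. This is only a sketch: to_scaled, from_scaled, and the vector w are hypothetical names and illustrative values, not part of the original answer or data.

```r
# Illustrative stand-in for the original weight column (not the question's data).
w <- c(0, 1, 4, 9, 16, 25, 100)

# Scale the sqrt-transformed values and keep the centering/scaling constants.
sc  <- scale(sqrt(w))
ctr <- attr(sc, "scaled:center")
scl <- attr(sc, "scaled:scale")

# Forward: original scale -> sqrt -> centered and scaled.
to_scaled <- function(x) (sqrt(x) - ctr) / scl

# Inverse: undo the scaling, then square to undo the sqrt.
# Only meaningful for z >= -ctr / scl, i.e. values that map back to sqrt(x) >= 0.
from_scaled <- function(z) (z * scl + ctr)^2

labels_wanted <- seq(0, 100, by = 20)
breaks_scaled <- to_scaled(labels_wanted)

# Round trip: the breaks map back to the labels we asked for.
all.equal(from_scaled(breaks_scaled), labels_wanted)
```

In the plot, breaks_scaled would go to the breaks argument of scale_x_continuous() and labels_wanted to labels, as in the answer's code. Choosing a handful of round labels this way avoids the crowding that comes from using every data point as a break.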

【Comments】:
