【Question title】: How can I generate a series of boxplots, one per hour of day, for this dataset?
【Posted】: 2016-05-26 05:05:25
【Question】:

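Given the sample data below, I would like to generate a series of boxplots, one per hour of day, each showing the distribution of the "usage" column. I have been fighting with this for too long and cannot work out the right syntax to get my datetime variable into a form that can be used as the grouping for boxplot. I have tried several ways of converting it to POSIXct or POSIXlt, but even after doing so I cannot figure out how to break it into hourly groups.

Any help is much appreciated.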

df <- read.table(text="datetime,usage,available
2016-05-25 10:00:59.000000,12,96
             2016-05-25 09:00:59.000000,8,96
             2016-05-25 08:00:59.000000,0,96
             2016-05-25 07:00:59.000000,0,96
             2016-05-25 06:00:59.000000,0,96
             2016-05-25 05:00:59.000000,0,96
             2016-05-25 04:00:59.000000,0,96
             2016-05-25 03:00:59.000000,0,96
             2016-05-25 02:00:59.000000,0,96
             2016-05-25 01:00:59.000000,0,96
             2016-05-25 00:00:59.000000,0,96
             2016-05-24 23:00:59.000000,0,96
             2016-05-24 22:00:59.000000,0,96
             2016-05-24 21:00:59.000000,0,96
             2016-05-24 20:00:59.000000,2,96
             2016-05-24 19:00:59.000000,0,96
             2016-05-24 18:00:59.000000,8,96
             2016-05-24 17:00:59.000000,15,96
             2016-05-24 16:00:59.000000,20,96
             2016-05-24 15:00:59.000000,19,96
             2016-05-24 14:00:59.000000,3,96
             2016-05-24 13:00:59.000000,6,96
             2016-05-24 12:00:59.000000,9,96
             2016-05-24 11:00:59.000000,13,96
             2016-05-24 10:00:59.000000,16,96
             2016-05-24 09:00:59.000000,11,96
             2016-05-24 08:00:59.000000,1,96
             2016-05-24 07:00:59.000000,5,96
             2016-05-24 06:00:59.000000,2,96
             2016-05-24 05:00:59.000000,0,96
             2016-05-24 04:00:59.000000,0,96
             2016-05-24 03:00:59.000000,0,96
             2016-05-24 02:00:59.000000,0,96
             2016-05-24 01:00:59.000000,0,96
             2016-05-24 00:00:59.000000,0,96
             2016-05-23 23:00:59.000000,0,96
             2016-05-23 22:00:59.000000,0,96
             2016-05-23 21:00:59.000000,0,96
             2016-05-23 20:00:59.000000,4,96
             2016-05-23 19:00:59.000000,0,96
             2016-05-23 18:00:59.000000,0,96
             2016-05-23 17:00:59.000000,0,96
             2016-05-23 16:00:59.000000,3,96
             2016-05-23 15:00:59.000000,5,96
             2016-05-23 14:00:59.000000,2,96
             2016-05-23 13:00:59.000000,18,96
             2016-05-23 12:00:59.000000,10,96
             2016-05-23 11:00:59.000000,7,96
             2016-05-23 10:00:59.000000,9,96
             2016-05-23 09:00:59.000000,1,96
             2016-05-23 08:00:59.000000,1,96
             2016-05-23 07:00:59.000000,1,96
             2016-05-23 06:00:59.000000,1,96
             2016-05-23 05:00:59.000000,1,96
             2016-05-23 04:00:59.000000,1,96
             2016-05-23 03:00:59.000000,1,96
             2016-05-23 02:00:59.000000,1,96
             2016-05-23 01:00:59.000000,1,96
             2016-05-23 00:00:59.000000,1,96", sep=",", header=TRUE, strip.white=TRUE)

【Question comments】:

Tags: r boxplot


【Solution 1】:

For example:

df <- read.table(sep=",", header=TRUE, text="
datetime,usage,available
2016-05-25 10:00:59.000000,12,96
2016-05-25 09:00:59.000000,8,96
2016-05-25 08:00:59.000000,0,96
2016-05-25 07:00:59.000000,0,96
2016-05-25 06:00:59.000000,0,96
2016-05-25 05:00:59.000000,0,96
2016-05-25 04:00:59.000000,0,96
2016-05-25 03:00:59.000000,0,96
2016-05-25 02:00:59.000000,0,96
2016-05-25 01:00:59.000000,0,96
2016-05-25 00:00:59.000000,0,96
2016-05-24 23:00:59.000000,0,96
2016-05-24 22:00:59.000000,0,96
2016-05-24 21:00:59.000000,0,96
2016-05-24 20:00:59.000000,2,96
2016-05-24 19:00:59.000000,0,96
2016-05-24 18:00:59.000000,8,96
2016-05-24 17:00:59.000000,15,96
2016-05-24 16:00:59.000000,20,96
2016-05-24 15:00:59.000000,19,96
2016-05-24 14:00:59.000000,3,96
2016-05-24 13:00:59.000000,6,96
2016-05-24 12:00:59.000000,9,96
2016-05-24 11:00:59.000000,13,96
2016-05-24 10:00:59.000000,16,96
2016-05-24 09:00:59.000000,11,96
2016-05-24 08:00:59.000000,1,96
2016-05-24 07:00:59.000000,5,96
2016-05-24 06:00:59.000000,2,96
2016-05-24 05:00:59.000000,0,96
2016-05-24 04:00:59.000000,0,96
2016-05-24 03:00:59.000000,0,96
2016-05-24 02:00:59.000000,0,96
2016-05-24 01:00:59.000000,0,96
2016-05-24 00:00:59.000000,0,96
2016-05-23 23:00:59.000000,0,96
2016-05-23 22:00:59.000000,0,96
2016-05-23 21:00:59.000000,0,96
2016-05-23 20:00:59.000000,4,96
2016-05-23 19:00:59.000000,0,96
2016-05-23 18:00:59.000000,0,96
2016-05-23 17:00:59.000000,0,96
2016-05-23 16:00:59.000000,3,96
2016-05-23 15:00:59.000000,5,96
2016-05-23 14:00:59.000000,2,96
2016-05-23 13:00:59.000000,18,96
2016-05-23 12:00:59.000000,10,96
2016-05-23 11:00:59.000000,7,96
2016-05-23 10:00:59.000000,9,96
2016-05-23 09:00:59.000000,1,96
2016-05-23 08:00:59.000000,1,96
2016-05-23 07:00:59.000000,1,96
2016-05-23 06:00:59.000000,1,96
2016-05-23 05:00:59.000000,1,96
2016-05-23 04:00:59.000000,1,96
2016-05-23 03:00:59.000000,1,96
2016-05-23 02:00:59.000000,1,96
2016-05-23 01:00:59.000000,1,96
2016-05-23 00:00:59.000000,1,96")
# as.POSIXlt() exposes the hour of day (0-23) as $hour; use it as the grouping variable
boxplot(df$usage ~ as.POSIXlt(df$datetime)$hour)

This gives one boxplot of usage for each hour of day.
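The same grouping can also be done in two explicit steps, which may be easier to debug: parse the timestamp, then store the hour in its own column. The following is only a minimal sketch on a six-row subset of the question's data. The `hour` column name, the `tz = "UTC"` setting, and the axis labels are assumptions added for illustration, not part of the original answer.

```r
# Small illustrative subset of the question's data (same column layout)
df <- read.table(sep = ",", header = TRUE, stringsAsFactors = FALSE, text = "
datetime,usage,available
2016-05-25 10:00:59.000000,12,96
2016-05-25 09:00:59.000000,8,96
2016-05-24 10:00:59.000000,16,96
2016-05-24 09:00:59.000000,11,96
2016-05-23 10:00:59.000000,9,96
2016-05-23 09:00:59.000000,1,96")

# Parse the timestamps; %OS accepts the fractional seconds.
# tz = "UTC" is an assumption, chosen so the hour does not depend on the local time zone.
df$datetime <- as.POSIXct(df$datetime, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Extract the hour of day as an integer and keep it as a grouping column
df$hour <- as.integer(format(df$datetime, "%H"))

# One box per hour present in the data
boxplot(usage ~ hour, data = df, xlab = "Hour of day", ylab = "Usage")
```

Keeping the hour in a column also makes it simple to check the groups before plotting, for example with `table(df$hour)`.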

【Comments】:

  • I tried that, but because of the error I assumed I was taking the wrong approach. This is the output I see: Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' must be atomic In addition: Warning messages: 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'language' 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'language'
  • What is the output of str(df[,c("datetime", "usage")])?
  • Sorry I did not follow up on this. Since you demonstrated a solution that appears to work, I am marking it as answered.
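The cause of the error reported in the comments was never established in this thread. As a hedged sketch of the check the second comment asks for, the code below inspects the column types and contrasts the two date-time classes on a two-row stand-in for the data. The stand-in data frame and the `lt`, `ct`, and `hours` names are hypothetical, and `tz = "UTC"` is an assumption.

```r
# Two-row stand-in for the question's data frame (illustrative only)
df <- data.frame(
  datetime = c("2016-05-25 10:00:59.000000", "2016-05-25 09:00:59.000000"),
  usage = c(12, 8),
  stringsAsFactors = TRUE  # mimics the read.table default before R 4.0
)

# Inspect how the columns were actually read in
str(df[, c("datetime", "usage")])

# Convert via character so the parse does not rely on the factor method
lt <- as.POSIXlt(as.character(df$datetime), tz = "UTC")
ct <- as.POSIXct(lt)

# POSIXlt is stored as a list of components; POSIXct is a single numeric vector
is.list(unclass(lt))
is.numeric(unclass(ct))

# The hour component is a plain vector, so it is safe to use for grouping
hours <- lt$hour
```

Because a POSIXlt value is a list underneath, grouping by the extracted `$hour` vector rather than by the POSIXlt object itself is the safer choice.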