【Question title】:Merging xts in R - Converting Characters to NA
【Posted】:2019-02-25 18:21:58
【Question】:

I have 3 xts objects:

logged <- xts::xts(x = loggedInUsers$loggedInUsers, order.by = Sys.time())
loadValue <- xts::xts(x = loadAvg, order.by = Sys.time())
hostname <- xts::xts(x = loadHost, order.by = Sys.time())

dput(hostname)
dput(loadValue)
dput(logged)

dput gives the following results:

 structure("deliverforgoodportal", .Dim = c(1L, 1L), index = structure(1551088127.27724, tzone = "", tclass = c("POSIXct",
    "POSIXt")), class = c("xts", "zoo"), .indexCLASS = c("POSIXct",
    "POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "")

structure(0, .Dim = c(1L, 1L), .Dimnames = list(NULL, "load"), index = structure(1551088127.27676, tzone = "", tclass = c("POSIXct",
"POSIXt")), .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct",
"POSIXt"), .indexTZ = "", tzone = "", class = c("xts", "zoo"))

structure(1, .Dim = c(1L, 1L), index = structure(1551088127.27637, tzone = "", tclass = c("POSIXct",
"POSIXt")), class = c("xts", "zoo"), .indexCLASS = c("POSIXct",
"POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "")

When I merge the three and print the result, the hostname is converted to NA:

tmp <- merge.xts(hostname, logged, loadValue, all = TRUE)
print(tmp)

The output is (hostname is NA):

                    hostname logged  load
2019-02-25 09:48:47       NA      1    NA
2019-02-25 09:48:47       NA     NA    0
2019-02-25 09:48:47       NA     NA    NA

Why does the hostname come out as NA?

【Comments】:

    Tags: r xts


    【Solution 1】:

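    Keep in mind that an xts object is a time series stored as a matrix. A matrix can hold only one type of value, either character or numeric, not both. Your merge tries to combine a character matrix (hostname) with numeric ones (logged and load), so the hostname value is coerced to NA.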
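    The two behaviors behind this can be seen in base R alone. The snippet below is a minimal sketch and does not reproduce the internals of merge.xts; it only shows that a matrix keeps a single type and that forcing a non-numeric string into a numeric type yields NA. The variable names `m` and `x` are illustrative.

```r
# A matrix holds a single type: combining a character column with a
# numeric column promotes the whole matrix to character.
m <- cbind("deliverforgoodportal", 1)
print(typeof(m))

# Going the other way, forcing a non-numeric string into a numeric
# type cannot succeed, so R returns NA (normally with a warning).
x <- suppressWarnings(as.numeric("deliverforgoodportal"))
print(is.na(x))
```

    The question's output is consistent with the second case: the numeric columns survive and the character column ends up as NA.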

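    To join data of mixed types, use a data.frame (or data.table) instead. Also note that your timestamps are not equal: they differ in their fractional seconds. If you want to join at a coarser resolution, such as the second or the minute, first round the timestamps down with floor_date from the lubridate package. Below are two examples, one without and one with lubridate. I used the timetk package to convert the xts objects to tibbles, but depending on your source data you may not need that step.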
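    To see why the rows do not line up, it can help to look at the raw index values. This sketch copies the three numeric timestamps from the dput output in the question; the vector name `ts` and the use of base `floor()` in place of floor_date are assumptions made for illustration.

```r
# Index values (seconds since the epoch) copied from the dput output.
ts <- c(hostname = 1551088127.27724,
        load     = 1551088127.27676,
        logged   = 1551088127.27637)

# All three differ in the fractional part, so they are distinct keys.
print(length(unique(ts)))

# Gaps between consecutive timestamps, in milliseconds.
gaps_ms <- diff(sort(ts)) * 1000
print(gaps_ms)

# Rounding down to whole seconds collapses them onto a single key.
print(length(unique(floor(ts))))
```

    Because the printed timestamps show whole seconds only, the three rows look identical in the merged output even though their keys differ.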

    Using full_join, without lubridate:

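    library(timetk)
    library(dplyr)
    # Convert each xts object to a tibble with an "index" column
    hostname <- tk_tbl(hostname)
    loadValue <- tk_tbl(loadValue)
    logged <- tk_tbl(logged)
    
    # The first join has no `by`, so dplyr picks the shared "index" column
    # and reports it; the suffixes separate the two "value" columns
    hostname %>% 
      full_join(loadValue) %>% 
      full_join(logged, 
                by = "index", 
                suffix = c("_hostname", "_logged"))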
    
    Joining, by = "index"
    # A tibble: 3 x 4
      index               value_hostname        load value_logged
      <dttm>              <chr>                <dbl>        <dbl>
    1 2019-02-25 10:48:47 deliverforgoodportal    NA           NA
    2 2019-02-25 10:48:47 NA                       0           NA
    3 2019-02-25 10:48:47 NA                      NA            1
    

    Using lubridate and left_join:

    hostname %>% 
      mutate(index = lubridate::floor_date(index, unit = "seconds")) %>% 
      left_join(loadValue %>% mutate(index = lubridate::floor_date(index, unit = "seconds"))) %>% 
      left_join(logged %>% mutate(index = lubridate::floor_date(index, unit = "seconds")), 
                by = "index", 
                suffix = c("_hostname", "_logged"))    
    
    Joining, by = "index"
    # A tibble: 1 x 4
      index               value_hostname        load value_logged
      <dttm>              <chr>                <dbl>        <dbl>
    1 2019-02-25 10:48:47 deliverforgoodportal     0            1
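
    The same idea can also be sketched without timetk, dplyr, or lubridate. The version below is an assumption-laden alternative, not the answer's own method: it rebuilds the three values as plain data frames from the dput output, rounds the timestamps down with a hypothetical helper `floor_sec`, and joins them with base `merge()`. The data frame names and the fixed "UTC" time zone are illustrative choices.

```r
# Rebuild the three observations as data frames (values from the dput output).
to_time <- function(secs) as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

hostname_df <- data.frame(index = to_time(1551088127.27724),
                          hostname = "deliverforgoodportal",
                          stringsAsFactors = FALSE)
load_df   <- data.frame(index = to_time(1551088127.27676), load = 0)
logged_df <- data.frame(index = to_time(1551088127.27637), logged = 1)

# Hypothetical helper: round the index down to the whole second.
floor_sec <- function(df) {
  df$index <- to_time(floor(as.numeric(df$index)))
  df
}

# Full outer join of all three data frames on the floored index.
frames <- lapply(list(hostname_df, load_df, logged_df), floor_sec)
combined <- Reduce(function(a, b) merge(a, b, by = "index", all = TRUE), frames)
print(combined)
```

    Since a data frame stores each column with its own type, the hostname stays a character value while load and logged stay numeric.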
    

    【Comments】:
