【Question title】: Extract rows of data.table according to rows of another data.table [duplicate]
【Posted】: 2018-09-15 22:55:40
【Question】:

I couldn't find an answer to this, although I think it should be easy to do.

I have this data.table:

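library(data.table)

# Every Season x Station x Group combination (4 * 3 * 4 = 48 rows)
DT = expand.grid(Season = c("Winter","Spring","Summer","Fall"),
                 Station = c("A","B","C"),
                 Group = c("1","2","3","4"))
DT$Value = seq(1,length(DT[,1]),1)  # row number, 1 to 48
DT = data.table(DT)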

I want to get a subset of DT based on this other data.table:

indexTable = data.table(Season = c("Winter","Spring","Spring"),
                        Station = c("B","B","A"),
                        Group = c("1","2","3"))

Basically, I only want the rows of DT that are contained in indexTable. The expected result is this table:

expectedTable = data.table(Season = c("Winter","Spring","Spring"),
                           Station = c("B","B","A"),
                           Group = c("1","2","3"),
                           Value = c(5,18,26))

I'm trying to get it with the following code:

tryTable = DT[DT$Station %in% indexTable$Station &
              DT$Season %in% indexTable$Season &
              DT$Group %in% indexTable$Group,]

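This returns not only the 3 rows I want, but other rows of DT as well.

What am I doing wrong? Is there a simple way to get expectedTable using data.table indexing notation (for example, with setkey)?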

【Question comments】:

    Tags: r indexing data.table subset extraction


    【Solution 1】:

    What you want is an INNER JOIN of the two tables on Season, Station, and Group.

    DT[
        indexTable
        , on = c("Season", "Station", "Group")
        , nomatch = 0
    ]
    
       Season Station Group Value
    1: Winter       B     1     5
    2: Spring       B     2    18
    3: Spring       A     3    26
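
As for why the attempt in the question returned extra rows: each `%in%` test is evaluated one column at a time, so a row is kept whenever its Season, Station, and Group each appear somewhere in indexTable, not necessarily together in the same row. The sketch below rebuilds the tables from the question and compares the two approaches; the names `loose` and `strict` are illustrative only and do not come from the original post.

```r
library(data.table)

# Same data as in the question: all 48 combinations, Value = row number
DT <- data.table(expand.grid(Season = c("Winter","Spring","Summer","Fall"),
                             Station = c("A","B","C"),
                             Group = c("1","2","3","4")))
DT[, Value := seq_len(.N)]

indexTable <- data.table(Season = c("Winter","Spring","Spring"),
                         Station = c("B","B","A"),
                         Group = c("1","2","3"))

# Column-wise %in%: keeps every combination of the listed values
loose <- DT[Season %in% indexTable$Season &
            Station %in% indexTable$Station &
            Group %in% indexTable$Group]
nrow(loose)   # 2 seasons x 2 stations x 3 groups = 12 rows

# Row-wise match: inner join on all three columns at once
strict <- DT[indexTable, on = c("Season", "Station", "Group"), nomatch = 0]
nrow(strict)  # 3 rows, with Value 5, 18, 26
```

The join compares whole (Season, Station, Group) triples, which is why it returns exactly the three requested rows.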
    

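    Reference

    【Discussion】:

    • If the OP wants the subset to keep DT's row order, then I think this covers it: stackoverflow.com/q/18969420
    • @Frank So shall we close this as a duplicate?
    • Thanks. This is exactly what I wanted!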
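
On the row-order point raised in the first comment: a join returns rows in the order of the index table, not in the order of DT. One possible way to keep DT's order (a sketch, not necessarily the approach taken in the linked question) is to ask the join for matching row numbers with `which = TRUE`, sort them, and subset DT. The index table `idx` below is hypothetical and deliberately listed in a different order than DT.

```r
library(data.table)

# Same DT as in the question: all 48 combinations, Value = row number
DT <- data.table(expand.grid(Season = c("Winter","Spring","Summer","Fall"),
                             Station = c("A","B","C"),
                             Group = c("1","2","3","4")))
DT[, Value := seq_len(.N)]

# Hypothetical index table, in the opposite order to DT
idx <- data.table(Season = c("Spring","Winter"),
                  Station = c("A","B"),
                  Group = c("3","1"))
cols <- c("Season", "Station", "Group")

# Plain inner join: rows come back in the order of idx
joined <- DT[idx, on = cols, nomatch = 0]
joined$Value   # 26 5

# Matching row numbers of DT, de-duplicated and sorted, then used to subset DT
rows <- sort(unique(DT[idx, on = cols, which = TRUE, nomatch = 0]))
ordered <- DT[rows]
ordered$Value  # 5 26
```

The `unique()` call also guards against an index table that lists the same combination twice, so each row of DT is returned at most once.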