【Question title】: difference in time dependent on columns
【Posted】: 2015-03-04 06:18:28
【Question】:

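For a data frame with a column "Time" in the format %Y-%m-%d %H:%M:%S, I am trying to compute the difference in minutes between rows when those rows meet specific column requirements. The difftime should only be computed when the data come from the same site, the same camera, and the same species. Rows are observations, and the columns are: SpeciesID, Site, Plot, Camera, Time.

I tried: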

site.list<-unique(data2$Site) #site list made
  species.list<-unique(data2$SpeciesID)  #species list made
  Time<-as.POSIXlt(data2$Time)
  Time<-rev (Time)

      difftime <- NULL
      for( Site in site.list ){  
        for( Camera in paste('C', 1:4, sep='') ){
                      index <- which( data2$Site == Site & data2$Camera == Camera & data2$SpeciesID==SpeciesID) 
          index2 <- order( data2[index,'Camera'], data2[index, 'Date'], data2[index, 'SpeciesID'])
          small.data <- data2[index, ][index2, ]

          i <- 2
          while( i<- dim(small.data)[1]){

           if ( small.data[i, 'SpeciesID'] == small.data[i-1, 'SpeciesID'] & 
                             small.data[i, 'Site'] == small.data[i-1,'Site'] &
                             small.data[i, 'Camera'] == small.data[i-1,'Camera']{
                               small.data<-difftime(Time[1:(length(Time)-1)] , Time[2:length(Time)])}
          foo<- rbind(difftime, small.data)
        }
      }

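【Discussion】:

  • Could you provide a small subset of your data? You would have a much better chance of getting a response. Take a look here. If you cannot provide a sample, a dummy example will do. You can also create an example from private data here
  • @DJJ I have provided a link to a subset of the data. I hope this helps.

Tags: r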


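【Solution 1】:

If you don't have data.table yet:

install.packages("data.table", repos = "http://cran.us.r-project.org")  # repos must be named; the second positional argument is lib
library(data.table)

This is how I imported the data: dt

But I am posting a sample of 10 observations on SO. If you would like them removed, feel free to say so.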

df <- structure(list(Individual = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), Date = c(15543L, 15543L, 15543L, 15554L, 15554L, 15554L, 
15554L, 15554L, 15543L, 15543L), Image1 = c("544.1.P5_", "544.1.P5_", 
"544.1.P5_", "544.7.I1_2", "544.7.I1_2", "544.7.I1_2", "544.7.I1_2", 
"544.7.I1_2", "544.1.P5_", "544.1.P5_"), Site = c("544", "544", 
"544", "544", "544", "544", "544", "544", "544", "544"), Camera = c(1L, 
1L, 1L, 7L, 7L, 7L, 7L, 7L, 1L, 1L), Plot = c(1L, 1L, 1L, 5L, 
5L, 5L, 5L, 5L, 1L, 1L), Plot_Type = c("OnTrail", "OnTrail", 
"OnTrail", "OffTrail", "OffTrail", "OffTrail", "OffTrail", "OffTrail", 
"OnTrail", "OnTrail"), CameraID = c("P5", "P5", "P5", "I1", "I1", 
"I1", "I1", "I1", "P5", "P5"), Time = c("2012/07/22 00:31:00", 
"2012/07/22 00:31:00", "2012/07/22 00:31:00", "2012/08/02 09:09:00", 
"2012/08/02 09:09:00", "2012/08/02 09:09:00", "2012/08/02 09:09:00", 
"2012/08/02 09:09:00", "2012/07/22 00:31:00", "2012/07/22 00:31:00"
), Hour = c(0L, 0L, 0L, 9L, 9L, 9L, 9L, 9L, 0L, 0L), Minute = c(31L, 
31L, 31L, 9L, 9L, 9L, 9L, 9L, 31L, 31L), Second = c(17L, 18L, 
23L, 18L, 20L, 22L, 24L, 26L, 34L, 36L), SpeciesID = c(2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Common = c("Mule deer", "Mule deer", 
"Mule deer", "Mule deer", "Mule deer", "Mule deer", "Mule deer", 
"Mule deer", "Mule deer", "Mule deer"), Scientific = c("Odocoile", 
"Odocoile", "Odocoile", "Odocoile", "Odocoile", "Odocoile", "Odocoile", 
"Odocoile", "Odocoile", "Odocoile"), SpeciesID.1 = c(2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), .Names = c("Individual", "Date", 
"Image1", "Site", "Camera", "Plot", "Plot_Type", "CameraID", 
"Time", "Hour", "Minute", "Second", "SpeciesID", "Common", "Scientific", 
"SpeciesID.1"), row.names = c(NA, 10L), class = "data.frame")

dt = data.table(df,key='Camera,Site,SpeciesID')

# 1) Count the number of obs per category (SpeciesID, Camera and Site)
# 2) Convert Time into a time variable (the format is optional here)
# 3) Compute the time difference between consecutive rows within each category
#    (SpeciesID, Camera and Site), in seconds. The n>2 condition below keeps only
#    categories with more than 2 observations; use n>=2 to include pairs as well.

dt[,n:=.N,list(SpeciesID,Camera,Site)][
,Time:=as.POSIXct(Time,format="%Y/%m/%d %H:%M:%S")][
     n>2,diffSec:=filter(Time,c(1,-1),sides=1),list(Camera,Site,SpeciesID)]

Another solution:

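 # diff() on a POSIXct vector picks its units automatically (secs, mins, hours, ...),
 # so convert to numeric first to make sure the result is always in seconds.
 dt[,n:=.N,list(SpeciesID,Camera,Site)][
 ,Time:=as.POSIXct(Time,format="%Y/%m/%d %H:%M:%S")][
      n>2,diffSec:=c(NA,diff(as.numeric(Time))),list(Camera,Site,SpeciesID)]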
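
Since the question asks for the difference in minutes, here is a minimal sketch of the same grouped approach with explicit units. The `toy` table and its values are made up for illustration, and the `diffMin` column name is an assumption; only the column names SpeciesID, Site, Camera and Time come from the question.

```r
library(data.table)

# Hypothetical toy data: two cameras at one site, one species.
toy <- data.table(
  SpeciesID = c(2L, 2L, 2L, 2L, 2L),
  Site      = c("544", "544", "544", "544", "544"),
  Camera    = c(1L, 1L, 1L, 7L, 7L),
  Time      = c("2012/07/22 00:31:17", "2012/07/22 00:33:17",
                "2012/07/22 01:03:17", "2012/08/02 09:09:18",
                "2012/08/02 09:09:48")
)

# Parse the timestamps; a fixed time zone keeps the result reproducible.
toy[, Time := as.POSIXct(Time, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")]

# Sort so that consecutive rows within a group are consecutive in time.
setorder(toy, SpeciesID, Site, Camera, Time)

# difftime() with explicit units avoids the automatic unit choice of diff().
# The first row of each group has no predecessor, so it gets NA.
toy[, diffMin := c(NA_real_,
                   as.numeric(difftime(Time[-1], Time[-.N], units = "mins"))),
    by = .(SpeciesID, Site, Camera)]

print(toy)
```

Grouping with `by` means a difference is never taken across two different species, sites or cameras, which is the condition the question describes.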

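You may also want to check the format of the Time variable. Does it hold the correct time? I noticed that the seconds are not included.

【Discussion】: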
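
In the sample above, Time always ends in ":00" while a separate Second column holds the seconds. One possible way to rebuild a full timestamp, assuming the Second column really is the missing seconds component (the thread does not confirm this), is sketched below. The `obs` data frame and the `FullTime` column are hypothetical names used only for this illustration.

```r
# Hypothetical excerpt mirroring the Time and Second columns of the sample.
obs <- data.frame(
  Time   = c("2012/07/22 00:31:00", "2012/07/22 00:31:00", "2012/07/22 00:31:00"),
  Second = c(17L, 18L, 23L),
  stringsAsFactors = FALSE
)

# Parse the minute-resolution stamp, then add the separately stored seconds.
# Adding a number to a POSIXct value shifts it by that many seconds.
obs$FullTime <- as.POSIXct(obs$Time, format = "%Y/%m/%d %H:%M:%S", tz = "UTC") +
  obs$Second

# Gaps between consecutive rows, in seconds.
gaps <- diff(as.numeric(obs$FullTime))
print(gaps)
```

Without this step, rows taken within the same minute would all show a difference of zero.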

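  • I get an error message saying that 'filter' cannot be applied to an object of class POSIXct or POSIXt. Did you mean to use some kind of subsetting? Thanks!
  • That's odd. I just copied the whole code into a fresh console and got no error on my side. I'm still not sure the code does what you want. Right now it takes the difference between consecutive rows within each (SpeciesID, Camera and Site) category. And yes, it uses a kind of subsetting. Can you tell me your R version?
  • R GUI 64-bit 3.1.1. I tried installing data.table in RStudio and got an error that the previous version could not be uninstalled. Looking into that now. What I want the code to do is compute the time difference if the rows have the same SpeciesID, then the same site, then the same camera. When those conditions are met, I want to know the time difference (in minutes).
  • I get the same message when running the code in the latest version of R. Error in UseMethod("filter_"): no applicable method for 'filter_' applied to an object of class "c('POSIXct', 'POSIXt')"
  • Thanks! Will give it a try in the morning. Running models at the moment.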
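
The `filter_` error quoted above looks like what happens when dplyr is attached, so that a bare `filter()` resolves to `dplyr::filter()` instead of `stats::filter()`. That is an assumption, since the thread does not list the loaded packages. A minimal sketch that sidesteps the masking by qualifying the namespace and passing plain numbers follows; the `times` values are made up.

```r
# Three made-up timestamps, 60 and 90 seconds apart.
times <- as.POSIXct(c("2012-07-22 00:31:00", "2012-07-22 00:32:00",
                      "2012-07-22 00:33:30"), tz = "UTC")

# Qualify the call so an attached dplyr cannot mask it, and pass numeric
# seconds rather than a POSIXct vector.
# With coefficients c(1, -1) and sides = 1, each value is x[i] - x[i-1].
lagged <- stats::filter(as.numeric(times), c(1, -1), sides = 1)

# stats::filter() returns a ts object; as.numeric() drops that class.
diff_sec <- as.numeric(lagged)
print(diff_sec)  # the first element is NA because there is no previous row
```

Running the answer's code in a fresh session without dplyr loaded, as the answerer did, would avoid the masking as well.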