【Question title】: Efficient way of applying a Savitzky-Golay filter to data.table rows for certain columns?
【Posted】: 2021-06-03 13:07:18
【Question】:

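I wrote a function that applies a Savitzky-Golay filter to each row of a data.table. The index of the first column containing measurements is passed as an argument; all columns after it are assumed to contain measurements to be filtered as well. The processed rows are updated in place.

My function works, but it is slow.

How can I change the function so that it is more efficient and more idiomatic data.table?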

MWE:

library(data.table)
library(pracma)
library(datasets)

data(iris)
setDT(iris)

#Reorder columns because function expects columns to apply a filter on 
#starting from a defined column to the last column
setcolorder(iris, "Species")


savitzky_golay <- function(dt, id_of_first_sample_col=2, win_size=5) {
  
  c_names_samples <- colnames(dt)[id_of_first_sample_col:ncol(dt)]
  
  for (i in seq(from=1, to=nrow(dt))) {
    mat <- as.numeric(dt[i,id_of_first_sample_col:ncol(dt)]) # Get sample data of the current row as a numeric vector
    mat <- savgol(mat,fl=win_size,forder=2,dorder=0) # Savitzky-Golay-Filter
    
    dt[i, (c_names_samples) := as.list(mat)] # Update columns of current row by reference
  }
  # Returns nothing as update is done via reference.
}

savitzky_golay(iris)

【Comments】:

    Tags: r data.table


    【Solution 1】:

    Try:

    savitzky_golay_new <- function(dt, id_of_first_sample_col=2, win_size=5) {
      c_names_samples <- colnames(dt)[id_of_first_sample_col:ncol(dt)]
      dt[,(c_names_samples):=asplit(apply(.SD,1,function(x) savgol(x,fl=win_size,forder=2,dorder=0)),1)
         ,.SDcols=c_names_samples]
      }
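
The one-liner above relies on how base R's apply() orients its result: with MARGIN = 1 each filtered row comes back as a column, so asplit(..., 1) is what turns the result back into one vector per original column, which is the shape := expects. The sketch below illustrates this with base R only. The toy matrix m and the stand-in function row_filter are hypothetical and merely take the place of the measurement columns and savgol().

```r
# Toy input: 3 rows (observations) x 4 columns (measurements).
m <- matrix(as.numeric(1:12), nrow = 3,
            dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical stand-in for savgol(): any function that maps one row
# to a vector of the same length works for this illustration.
row_filter <- function(x) cumsum(x)

# apply() over rows returns the results as COLUMNS: a 4 x 3 matrix here,
# i.e. the transpose of the layout of m.
res <- apply(m, 1, row_filter)

# Splitting along the first margin yields one vector per original column
# (4 vectors of length 3), ready to be assigned column by column.
cols <- asplit(res, 1)
```

An equivalent way to read this is that t(res) has the same layout as m. asplit() requires R 3.6.0 or later.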
    

    Performance comparison:

    microbenchmark::microbenchmark(savitzky_golay_new(dt2),savitzky_golay(dt1))
    Unit: milliseconds
                        expr     min       lq     mean   median       uq      max neval
     savitzky_golay_new(dt2) 12.7808 13.69695 15.63821 14.31785 15.17705  31.2701   100
         savitzky_golay(dt1) 71.4231 81.96115 87.97737 86.41265 90.42620 239.7945   100
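
The answer does not show how dt1 and dt2 were created. Since both functions update their input by reference, one plausible setup (an assumption, not confirmed by the answer) is two independent copies of iris made with copy(). The sketch below uses a hypothetical toy table to show why plain assignment would not give independent inputs.

```r
library(data.table)

# Hypothetical toy table standing in for iris.
dt <- data.table(id = c("a", "b"), x1 = c(1, 2), x2 = c(3, 4))

same_obj <- dt        # plain assignment: both names refer to the same table
dt1      <- copy(dt)  # copy(): an independent table

same_obj[, x2 := 0]   # by-reference update, also visible through dt
dt1[, x1 := x1 * 10]  # only changes the copy

dt$x2   # changed via same_obj
dt$x1   # untouched by the update on dt1
```

Under that assumption, dt1 <- copy(iris) and dt2 <- copy(iris) would give each benchmarked function its own table to modify.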
    

    【Discussion】:
