[Question title]: officer seems slow compared to ReporteRs
[Posted]: 2019-01-30 16:41:01
[Question description]:

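I have some scripts that work well with ReporteRs, and I am trying to update them to use officer. My scripts are very repetitive, because I only need to output almost the same thing many times, sometimes just changing the font. After converting them, I found the scripts are too slow for me to use. They run in a matter of minutes in ReporteRs, but take far longer in officer.

Why is running this 5000 times in officer:

body_add_par(doc, "")

so much slower than the equivalent in ReporteRs: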

doc <- addParagraph(doc, "")

Thanks a lot.

Code (all vectors have 2000+ elements):

outputFile <- paste0(OutputDir, "test.docx")

#SET STYLES
norm <- fp_text(color = "black", font.size = 10, bold = FALSE, italic = FALSE,
                underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                shading.color = "transparent")

norm_red <- fp_text(color = "red", font.size = 10, bold = FALSE, italic = FALSE,
                underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                shading.color = "transparent")

norm_blue <- fp_text(color = "blue", font.size = 10, bold = FALSE, italic = FALSE,
                    underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                    shading.color = "transparent")

norm_green <- fp_text(color = "green", font.size = 10, bold = FALSE, italic = FALSE,
                     underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                     shading.color = "transparent")

bold <- fp_text(color = "black", font.size = 10, bold = TRUE, italic = FALSE,
                underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                shading.color = "transparent")

bold_red <- fp_text(color = "red", font.size = 10, bold = TRUE, italic = FALSE,
                    underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                    shading.color = "transparent")

bold_blue <- fp_text(color = "blue", font.size = 10, bold = TRUE, italic = FALSE,
                     underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                     shading.color = "transparent")

bold_green <- fp_text(color = "green", font.size = 10, bold = TRUE, italic = FALSE,
                      underlined = FALSE, font.family = "Arial", vertical.align = "baseline",
                      shading.color = "transparent")

doc <- read_docx()

#ADD TITLE
fpar_ <- fpar(ftext("ASSIGNMENTS", prop = bold))
doc <- body_add_fpar(doc, fpar_, style = "centered", pos = "on")
doc <- body_add_par(doc, "", style = NULL, pos = "after")

#ADD DATE, DIRECTORY
fpar_ <- fpar(ftext("DATE: ", prop = bold),
              ftext(date(), prop = norm))
doc <- body_add_fpar(doc, fpar_, style = "Normal", pos = "after")

fpar_ <- fpar(ftext("DIRECTORY: ", prop = bold),
            ftext(Dir, prop = norm))
doc <- body_add_fpar(doc, fpar_, style = "Normal", pos = "after")

doc <- body_add_par(doc, "", style = NULL, pos = "after")

#Get all
all <- as.character(Summary$Name)

for (i in seq_along(all)) {

  res <- as.numeric(Types[Types$Num==all[i], "Code"])

  if (5 %in% res || 12 %in% res) {
    #Green
    fpar_ <- fpar(ftext(all[i], prop = bold_green))
  } else if (7 %in% res) {
    #Red
    fpar_ <- fpar(ftext(all[i], prop = bold_red))
  } else if (8 %in% res) {
    #Blue
    fpar_ <- fpar(ftext(all[i], prop = bold_blue))
  } else {
    fpar_ <- fpar(ftext(all[i], prop = bold))
  }

  doc <- body_add_fpar(doc, fpar_, style = "Normal", pos = "after")

  #Get list of files
  res <- unique(Detail[Detail$Num==all[i], c("Name", "Cat")])

  #OUTPUT FILE NAME AND CAT
  if (nrow(res) == 0) {
      #NO FILE FOUND
  } else {

     for (j in 1:nrow(res)) {

       fpar_ <- fpar(ftext(paste(as.character(res[j, "Name"]), " "), prop = bold),
                     ftext(as.character(res[j, "Cat"]), prop = norm))
       doc <- body_add_fpar(doc, fpar_, style = "Normal", pos = "after")
     }
  }
  doc <- body_add_par(doc, "", style = NULL, pos = "after")
}

print(doc, target = outputFile)
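
The eight style definitions in the script differ only in `color` and `bold`. As a minimal sketch, assuming the installed officer version provides the `update()` method for `fp_text` objects, the variants can be derived from one base style. The names `colors`, `norm_styles` and `bold_styles` are hypothetical and not part of the original script. This only shortens the script; it is not expected to change the run time.

```r
library(officer)

# Base style: the same arguments as `norm` in the script above.
norm <- fp_text(color = "black", font.size = 10, bold = FALSE, italic = FALSE,
                underlined = FALSE, font.family = "Arial",
                vertical.align = "baseline", shading.color = "transparent")

# Derive each variant by changing only the property that differs.
bold <- update(norm, bold = TRUE)

# Hypothetical named vector of the colors used in the script.
colors <- c(red = "red", blue = "blue", green = "green")

# lapply() keeps the names, so the results are lists indexed by color name.
norm_styles <- lapply(colors, function(col) update(norm, color = col))
bold_styles <- lapply(colors, function(col) update(bold, color = col))
```

With this layout, `bold_styles$green` would take the place of `bold_green` in the loop, for example `fpar(ftext(all[i], prop = bold_styles$green))`.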

[Comments]:

    Tags: officer


    [Solution 1]:

    Here is a comparison with some reproducible code:

    library(officer)
    library(ReporteRs)
    library(microbenchmark)
    
    docx() # the first run can be slow because of Java initialization
    
    
    mb <- microbenchmark::microbenchmark(
      officer = {
        doc <- read_docx()
        for(i in 1:100){
          doc <- body_add_par(doc, "")
        }
      }, 
      ReporteRs = {
        doc <- docx()
        for(i in 1:100){
          doc <- addParagraph(doc, '')
        }
      } )
    

    The results are below; officer wins:

    > mb
    Unit: milliseconds
          expr      min       lq     mean   median       uq      max neval
       officer 224.3742 232.9602 238.8452 237.5110 241.5320 325.4288   100
     ReporteRs 311.7194 337.9194 349.7107 343.9703 353.8814 447.2623   100
    

    Here is my sessionInfo() output:

    > sessionInfo()
    R version 3.5.1 (2018-07-02)
    Platform: x86_64-apple-darwin15.6.0 (64-bit)
    Running under: macOS High Sierra 10.13.6
    
    Matrix products: default
    BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
    LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
    
    locale:
    [1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] microbenchmark_1.4-4 ReporteRs_0.8.10     ReporteRsjars_0.0.4  officer_0.3.2       
    
    loaded via a namespace (and not attached):
     [1] Rcpp_0.12.17      knitr_1.20        xml2_1.2.0        magrittr_1.5      uuid_0.1-2        xtable_1.8-2     
     [7] R6_2.2.2          tools_3.5.1       rvg_0.1.9.001     R.oo_1.22.0       png_0.1-7         htmltools_0.3.6  
    [13] yaml_2.1.19       digest_0.6.15     zip_1.0.0         rJava_0.9-10      shiny_1.1.0       later_0.7.3      
    [19] base64enc_0.1-3   R.utils_2.6.0     promises_1.0.1    mime_0.5          compiler_3.5.1    gdtools_0.1.7    
    [25] R.methodsS3_1.7.1 httpuv_1.4.4.2
    

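    [Discussion]:

    • Is this still true if you use a number much larger than 100?
    • What about your case? Can you provide code showing that officer is slower? In that case, I would be happy to try to improve the package.
    • After your first answer, I tried the same code you included, with no problems. Then I removed the microbenchmark code and tried 5000 loops for ReporteRs and officer separately. For me, the ReporteRs test took less than half a second, but the officer test was still running after 10 minutes. Maybe it is my setup: officer 0.3.1 / R version 3.5.1 (2018-07-02) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1
    • OK, I will have a look. Still, can you show your code? I would like to see which functions you are using (I can't believe you are looping 5000 times to produce empty paragraphs ;)).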
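
The comments ask whether the result still holds for loop counts much larger than 100. One way to check is to time the officer loop alone at a few sizes and compare the cost per call. This is only a sketch: `time_empty_pars`, `sizes` and `timings` are hypothetical names, the sizes are arbitrary, and the timings depend on the installed officer version and platform.

```r
library(officer)

# Hypothetical helper: time n calls to body_add_par() on a fresh document
# and return the elapsed wall-clock time in seconds.
time_empty_pars <- function(n) {
  doc <- read_docx()
  system.time({
    for (i in seq_len(n)) {
      doc <- body_add_par(doc, "")
    }
  })[["elapsed"]]
}

# Arbitrary loop counts; raise them gradually to approach the 5000 case.
sizes <- c(100, 500, 1000)

timings <- data.frame(
  n = sizes,
  seconds = vapply(sizes, time_empty_pars, numeric(1))
)

# A roughly constant value here suggests linear scaling; a value that grows
# with n suggests each call gets slower as the document grows.
timings$seconds_per_call <- timings$seconds / timings$n
print(timings)
```

If `seconds_per_call` rises sharply with `n`, that would be consistent with the 10-minute run reported for 5000 iterations, and the table makes a compact reproducible example to share with the package author.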