Replace two for loops using tidyverse (dplyr and purrr): answers

【Question title】: Replace two for loops using tidyverse (dplyr and purrr)
【Posted】: 2018-07-15 06:01:01
【Question description】:

I have a tibble

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san2w16     A1   464   376
#>  6 san2w16     A1   464   375
#>  7 san2w16     B1   463   375
#>  8 san2w16     B1   463   374
#>  9 san3w16     A1   463   373
#> 10 san3w16     A1   463   372

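I would like to replace the use of two for loops with the tidyverse. I am using a function that needs a 2x2 matrix (it could be any function; in this specific case it is momocs::coo_rotate).

What I want to do can be written like this in base R: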

for(g in unique(raw.tb$geno)){
   for(i in unique(raw.tb[raw.tb$geno == g,]$ind)){
      raw.tb[raw.tb$geno == g & raw.tb$ind == i,c(3,4)] = some.function.for.a.matrix(raw.tb[raw.tb$geno == g & raw.tb$ind == i,c(3,4)])
   }
}

I guess this can be done with the tidyverse. I have looked into using group_by() with do(), and nest() with map(), but I could not get either to work.

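【Question comments】:

It looks like you need group_by with summarise
Do you mean using dplyr::summarise?
Your loop doesn't work? Can you post your desired result?
Yes, that's right
@EricFail I just tried this loop and it works: for(g in unique(raw.tb$geno)){ for(i in unique(raw.tb[raw.tb$geno == g,]$ind)){ raw.tb[raw.tb$geno == g & raw.tb$ind == i,c(3,4)] = t( raw.tb[raw.tb$geno == g & raw.tb$ind == i,c(3,4)]) } } @akrun, if I understand what it does, the problem with summarise is that it returns one value per group, whereas I want to get back a matrix of the same size as my group's matrix

Tags: r for-loop dplyr tidyverse purrr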

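【Solution 1】:

I found it. I hope my answer makes things clearer. My apologies to @EricFail for not making this clearer earlier.

Basically I wrote a function that, given a matrix of x,y coordinates, rotates the coordinates using the first and last points as the baseline. I am not detailing the function because it is long and not the point here, but it is roughly of the following form: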

rotate.coord <- function(mat){
  for(i in 1:dim(mat)[1]){
    x1 = dim(mat)[1]   # index of the last point
    x2 = 1             # index of the first point
    # .
    # .
    # (theta is computed based on x1 and x2)
    # .
    # .
    xn = mat[i,1]*cos(theta) + mat[i,2]*sin(theta)
    yn = -mat[i,1]*sin(theta) + mat[i,2]*cos(theta)
    mat[i,1] = xn
    mat[i,2] = yn
  }
  mat = as_tibble(mat)
  return(mat)
}

Given:

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san2w16     A1   464   376
#>  6 san2w16     A1   464   375
#>  7 san2w16     B1   463   375
#>  8 san2w16     B1   463   374
#>  9 san3w16     A1   463   373
#> 10 san3w16     A1   463   372

what I wanted to do was:

raw.nt <- raw.tb %>% 
 group_by(geno,ind) %>% 
 nest()

raw.nt2 <- raw.nt %>% 
  mutate(rot = map(data, rotate.coord))

This creates a new nested tibble in which raw.nt2$rot holds, for each group, the rotated matrix computed from that group's raw.nt$data.
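
A minimal runnable sketch of the nest() and map() approach described above. The real rotation function is not shown in full, so `rotate.baseline` below is a hypothetical stand-in: it assumes theta is the angle of the line from the first to the last point of each group and then applies the same xn/yn formulas. The final unnest() step is an addition, for the case where a flat tibble is wanted back.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Same data as the tibble printed in the question
raw.tb <- tibble(
  geno = factor(rep(c("san1w16", "san2w16", "san3w16"), times = c(4, 4, 2))),
  ind  = factor(c("A1", "A1", "B1", "B1", "A1", "A1", "B1", "B1", "A1", "A1")),
  X    = c(467L, 465L, 464L, 464L, 464L, 464L, 463L, 463L, 463L, 463L),
  Y    = c(383L, 378L, 378L, 377L, 376L, 375L, 375L, 374L, 373L, 372L)
)

# Hypothetical stand-in for the author's rotation function.
# Assumption: theta is the angle of the first-to-last-point baseline.
rotate.baseline <- function(df) {
  n <- nrow(df)
  theta <- atan2(df$Y[n] - df$Y[1], df$X[n] - df$X[1])
  tibble(
    X = df$X * cos(theta) + df$Y * sin(theta),
    Y = -df$X * sin(theta) + df$Y * cos(theta)
  )
}

# One row per (geno, ind) group; 'data' holds that group's X/Y columns
raw.nt2 <- raw.tb %>%
  group_by(geno, ind) %>%
  nest() %>%
  mutate(rot = map(data, rotate.baseline))

# Optional: drop the original coordinates and flatten the rotated ones
rotated <- raw.nt2 %>%
  ungroup() %>%
  select(-data) %>%
  unnest(rot)
```

With this stand-in, the first and last points of every group end up with the same Y value, which is one way to check that the baseline has been rotated onto the x axis.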

【Comments】:

【Solution 2】:

I am guessing a bit here, because it is not clear to me exactly what you are trying to do.

raw.tb <- structure(list(geno = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = "san1w16", class = "factor"), ind = structure(c(1L, 
1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 1L), .Label = c("A1", "B1", "C1", 
"D1", "E1"), class = "factor"), X = c(467L, 465L, 464L, 464L, 
464L, 464L, 463L, 463L, 463L, 463L), Y = c(383L, 378L, 378L, 
377L, 376L, 375L, 375L, 374L, 373L, 372L)), .Names = c("geno", 
"ind", "X", "Y"), row.names = c("1", "2", "3", "4", "5", "6", 
"7", "8", "9", "10"), class = c("tbl_df", "tbl", "data.frame"
)) %>% as_tibble(); raw.tb 
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san1w16     C1   464   376
#>  6 san1w16     C1   464   375
#>  7 san1w16     D1   463   375
#>  8 san1w16     D1   463   374
#>  9 san1w16     E1   463   373
#> 10 san1w16     A1   463   372

Something like this,

raw.tb %>% group_by(geno) %>% gather(XY, Value, -geno, -ind) %>% arrange(geno, ind)
#> # A tibble: 20 x 4
#> # Groups:   geno [1]
#>       geno    ind    XY Value
#>     <fctr> <fctr> <chr> <int>
#>  1 san1w16     A1     X   467
#>  2 san1w16     A1     X   465
#>  3 san1w16     A1     X   463
#>  4 san1w16     A1     Y   383
#>  5 san1w16     A1     Y   378
#>  6 san1w16     A1     Y   372
#>  7 san1w16     B1     X   464
#>  8 san1w16     B1     X   464
#>  9 san1w16     B1     Y   378
#> 10 san1w16     B1     Y   377
#> 11 san1w16     C1     X   464
#> 12 san1w16     C1     X   464
#> 13 san1w16     C1     Y   376
#> 14 san1w16     C1     Y   375
#> 15 san1w16     D1     X   463
#> 16 san1w16     D1     X   463
#> 17 san1w16     D1     Y   375
#> 18 san1w16     D1     Y   374
#> 19 san1w16     E1     X   463
#> 20 san1w16     E1     Y   373

From there you can apply pretty much any function. Here is an example of the summarise that akrun suggested:

raw.tb %>% group_by(geno) %>% gather(XY, Value, -geno, -ind) %>%
           arrange(geno, ind) %>% group_by(ind, geno, XY)  %>%
           summarise(Value = mean(Value))
#> # A tibble: 10 x 4
#> # Groups:   ind, geno [?]
#>       ind    geno    XY    Value
#>    <fctr>  <fctr> <chr>    <dbl>
#>  1     A1 san1w16     X 465.0000
#>  2     A1 san1w16     Y 377.6667
#>  3     B1 san1w16     X 464.0000
#>  4     B1 san1w16     Y 377.5000
#>  5     C1 san1w16     X 464.0000
#>  6     C1 san1w16     Y 375.5000
#>  7     D1 san1w16     X 463.0000
#>  8     D1 san1w16     Y 374.5000
#>  9     E1 san1w16     X 463.0000
#> 10     E1 san1w16     Y 373.0000

Or maybe

raw.tb %>% group_by(geno) %>% gather(XY, Value, -geno, -ind) %>%
           arrange(geno, ind) %>% group_by(ind, geno)  %>%
           summarise(Value = mean(Value))
#> # A tibble: 5 x 3
#> # Groups:   ind [?]
#>      ind    geno    Value
#>   <fctr>  <fctr>    <dbl>
#> 1     A1 san1w16 421.3333
#> 2     B1 san1w16 420.7500
#> 3     C1 san1w16 419.7500
#> 4     D1 san1w16 418.7500
#> 5     E1 san1w16 418.0000
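
As a side note, tidyr now marks gather() as superseded by pivot_longer(). The per-coordinate summary above could be sketched as follows with current tidyr and dplyr versions; the object name `long.means` and the use of `.groups = "drop"` are choices made for this illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same data as the dput() output in this answer
raw.tb <- tibble(
  geno = factor(rep("san1w16", 10)),
  ind  = factor(c("A1", "A1", "B1", "B1", "C1", "C1", "D1", "D1", "E1", "A1")),
  X    = c(467L, 465L, 464L, 464L, 464L, 464L, 463L, 463L, 463L, 463L),
  Y    = c(383L, 378L, 378L, 377L, 376L, 375L, 375L, 374L, 373L, 372L)
)

# Reshape X and Y into long form, then average per (ind, geno, XY)
long.means <- raw.tb %>%
  pivot_longer(c(X, Y), names_to = "XY", values_to = "Value") %>%
  group_by(ind, geno, XY) %>%
  summarise(Value = mean(Value), .groups = "drop")
```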

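【Comments】:

I was looking for the tidyverse equivalent of: 'toto=as.list(split(raw.tb[,c(3,4)],list(raw.tb$ind,raw.tb$geno))) toto2=lapply(toto,some.function.for.a.matrix)'
Could you include that in your question, along with the output? I want to make sure we get the same output.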
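
For the split()/lapply() idiom quoted in the comment above, one possible tidyverse counterpart is group_split() followed by purrr::map(). This is only a sketch: `some.function.for.a.matrix` is a placeholder here (the transpose, as in the loop from the question comments), and `drop = TRUE` is added to the base R call so that empty ind/geno combinations are skipped, which matches what group_split() returns.

```r
library(dplyr)
library(purrr)
library(tibble)

# Same data as the tibble printed in the question
raw.tb <- tibble(
  geno = factor(rep(c("san1w16", "san2w16", "san3w16"), times = c(4, 4, 2))),
  ind  = factor(c("A1", "A1", "B1", "B1", "A1", "A1", "B1", "B1", "A1", "A1")),
  X    = c(467L, 465L, 464L, 464L, 464L, 464L, 463L, 463L, 463L, 463L),
  Y    = c(383L, 378L, 378L, 377L, 376L, 375L, 375L, 374L, 373L, 372L)
)

# Placeholder for the real function: transpose the X/Y block
some.function.for.a.matrix <- function(m) t(as.matrix(m))

# Base R version from the comment (empty combinations dropped)
toto  <- split(raw.tb[, c(3, 4)], list(raw.tb$ind, raw.tb$geno), drop = TRUE)
toto2 <- lapply(toto, some.function.for.a.matrix)

# tidyverse version: one tibble per (geno, ind) group, keys removed
toto2.tidy <- raw.tb %>%
  group_by(geno, ind) %>%
  group_split(.keep = FALSE) %>%
  map(some.function.for.a.matrix)
```

The main visible difference is that the base R list is named by group, whereas group_split() returns an unnamed list.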