[Posted]: 2021-02-09 06:21:56
[Question]:
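I am trying to write a general function that computes the differences between the rows of two datasets of different lengths. My first function takes a single row that we specify and does the calculation, but I would like to apply it to all the rows of my dataset or matrix. I tried lapply, but I get the error "Error: Argument 1 must have names". I looked at the answer to "Problem with bind_rows: Error: Argument 1 must have names" but did not understand it. Any help would be appreciated.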
# datasets
d1 = data.frame(V1=1:5,V3=6:10)
d2 = data.frame(V1=c(2,3,4,5), V2=c(6,6,5,9))
d1
  V1 V3
1  1  6
2  2  7
3  3  8
4  4  9
5  5 10
d2
  V1 V2
1  2  6
2  3  6
3  4  5
4  5  9
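# The first function: compare row i of datasmall against every row of databig
library(purrr)  # provides map2_df
soustraction.i = function(databig, datasmall, i, threshold) {
  databig = as.data.frame(databig)
  datasmall = as.data.frame(datasmall)
  # Column-wise difference between databig and row i of datasmall
  dif = map2_df(databig, datasmall[i, ], `-`)
  dif[dif < 0] = 0                                # negative differences count as 0
  dif$mismatch = rowSums(dif)                     # total mismatch per row
  dif = dif[which(dif$mismatch <= threshold), ]   # keep rows within the threshold
  return(dif)
}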
# If I compare the first row of d2 with all the rows of d1, I get the right answer
soustraction.i(d1,d2,1,3)
# A tibble: 3 x 3
     V1    V3 mismatch
  <dbl> <dbl>    <dbl>
1     0     0        0
2     0     1        1
3     1     2        3
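The numbers above can be checked by hand. The following is a minimal base R sketch (not the purrr-based function itself) that redoes the subtraction for the first row of d2, assuming that row is (2, 6) as in the sample data. The names row1 and dif are only illustrative.

```r
# Sample data from the question
d1 <- data.frame(V1 = 1:5, V3 = 6:10)
row1 <- c(2, 6)  # first row of d2 (assumed from the sample data)

# Subtract row1 from every row of d1 and clamp negative differences to 0
dif <- data.frame(
  V1 = pmax(d1$V1 - row1[1], 0),
  V3 = pmax(d1$V3 - row1[2], 0)
)

# Total mismatch per row of d1
dif$mismatch <- rowSums(dif)
dif
```

The full mismatch column is 0, 1, 3, 5, 7, so with a threshold of 3 only the first three rows are kept, which agrees with the tibble above.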
# However, I do not know how to do the same calculation over all the rows of d2 (the small dataset)
# d2 is always smaller than d1
# Here is what I tried
# The second function
soustraction.matrice = function(d1, d2, threshold) {
  d1 = as.matrix(d1)
  d2 = as.matrix(d2)
  n = nrow(d2)
  diff.mat = lapply(1:n, soustraction.i, d1, d2)
  diff.mat = as.data.frame(diff.mat)
  return(diff.mat)
}
soustraction.matrice(d1,d2,3)
#Error: Argument 1 must have names.
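The expected output of the second function, for example with the threshold set to 3, should be the following (I am not sure whether the threshold needs to be defined again in the second function):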
      V1    V3 mismatch
   <dbl> <dbl>    <dbl>
 1     0     0        0
 2     0     1        1
 3     1     2        3
 4     0     0        0
 5     0     1        1
 6     0     2        2
 7     0     1        1
 8     0     2        2
 9     0     3        3
10     0     0        0
11     0     0        0
12     0     0        0
13     0     0        0
14     0     1        1
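One likely cause of the error: lapply(1:n, soustraction.i, d1, d2) hands each index to the first argument of soustraction.i, so the index lands in databig rather than in i, and threshold is never passed at all. Below is a minimal sketch of one way to loop over every row of d2. It assumes a base R stand-in for soustraction.i (built on sweep instead of map2_df) so that the block runs without purrr; the wrapper itself only relies on lapply and rbind.

```r
# Sample data from the question
d1 <- data.frame(V1 = 1:5, V3 = 6:10)
d2 <- data.frame(V1 = c(2, 3, 4, 5), V2 = c(6, 6, 5, 9))

# Base R stand-in for soustraction.i (assumption: same intent, no purrr)
soustraction.i <- function(databig, datasmall, i, threshold) {
  databig <- as.data.frame(databig)
  # Row i of datasmall as a plain numeric vector
  small_row <- unlist(as.data.frame(datasmall)[i, ])
  # Subtract that row from every row of databig, column by column
  dif <- as.data.frame(sweep(as.matrix(databig), 2, small_row, `-`))
  dif[dif < 0] <- 0
  dif$mismatch <- rowSums(dif)
  dif[dif$mismatch <= threshold, , drop = FALSE]
}

# Wrapper: pass the index explicitly as i and forward the threshold
soustraction.matrice <- function(databig, datasmall, threshold) {
  rows <- lapply(seq_len(nrow(datasmall)), function(i) {
    soustraction.i(databig, datasmall, i, threshold)
  })
  out <- do.call(rbind, rows)  # stack the per-row results
  rownames(out) <- NULL        # renumber rows from 1
  out
}

res <- soustraction.matrice(d1, d2, 3)
res
```

Passing the threshold through the wrapper means it is set once, at the call. An equivalent call without the anonymous function is lapply(seq_len(nrow(d2)), soustraction.i, databig = d1, datasmall = d2, threshold = 3): naming the other arguments leaves i as the only slot for the index.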
[Discussion]:
Tags: r compiler-errors arguments lapply