Compute the average for a 2D matrix in a loop: answers

【Question title】: Compute the average for a 2D matrix in a loop
【Posted】: 2015-08-26 14:35:23
【Question】:

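I wrote a script that reads 15 data files, computes the difference between each pair of files, and writes the results to 5 different files. Each of these 5 files is a matrix of 10x259 values. I need to create one matrix in which every element is the average of the elements at the same position in those 5 matrices. I can't get the averaging to work.

I tried the classic "sum=sum+i" approach inside the loop, but R reported an error about recursive summation. I tried building a 3-dimensional matrix and filling it with 5 "pages", each holding one 2D matrix, but got an error about filling the matrix with content of a different size. I tried rowMeans(), but couldn't make it work, because I need the mean of the same variable across 5 iterations.

The only way I got it to work was to read all the generated files back into separate variables, add them up, and divide by 5. That only works for a handful of files. I need to scale this to many files, so it has to work inside a loop somehow.

Can anyone give me a better idea?

I'm new to R. The script is probably inefficient, but it only needs to get the job done.

Here is my code: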

MAM <- c("M","N","O","P","R")
S <-c("a","b","c","d","e")
T<-c("a","b","c","d","e")
V<-c("a","b","c","d","e")

Min2000<- array(3,dim=c(259,10,5))
Min2010<- array(5,dim=c(259,10,5))

  # this will be done 5 times
for (i in 1:5)  { 

  # preparing file names to be read
  S[i] <- paste(MAM [i],"2000.txt",sep="_")
  T[i] <- paste(MAM [i],"2150.txt",sep="_")
  V[i] <- paste(MAM [i],"2250.txt",sep="_")

  # import data from the files
  file1 <- read.table(S[i], header=TRUE,sep="\t")
  file2 <- read.table(T[i], header=TRUE,sep="\t")
  file3 <- read.table(V[i], header=TRUE,sep="\t")

  # delete the first column
  file1[,2:11]
  file2[,2:11]
  file3[,2:11]
  file1a <- file1[,c(2:11)]
  file2a <- file2[,c(2:11)]
  file3a <- file3[,c(2:11)]

  # compute data
  Min2000<- (file2a-file1a)/file1a
  Min2010<- (file3a-file1a)/file1a

  colMeans(Min2000)
  #cub[,,i]= Min2000    # doesn't work
  #rowMeans(datamonth, dims = 2)  # doesn't work
}

【Question comments】:

Tags: r matrix average

【Solution 1】:

Try this; the explanation is in the code comments.

# load library (dplyr re-exports the %>% pipe)
library(dplyr)

# Create vectors of file names to be read in
MAM <- c("M","N","O","P","R")
S.Name <- paste(MAM,"2000.txt",sep="_")
T.Name <- paste(MAM,"2150.txt",sep="_")
V.Name <- paste(MAM,"2250.txt",sep="_")

# Read the data into lists and drop the first column.
# header = TRUE is spelled out and the lists are not named T,
# because assigning to T would mask the T shorthand for TRUE in later calls.
S.List = lapply(S.Name, read.table, header = TRUE, sep = "\t") %>% lapply(function(x) x[,-1])
T.List = lapply(T.Name, read.table, header = TRUE, sep = "\t") %>% lapply(function(x) x[,-1])
V.List = lapply(V.Name, read.table, header = TRUE, sep = "\t") %>% lapply(function(x) x[,-1])

# Sum up the files, then divide to find the mean.
# This does (matrix1 + matrix2 + matrix3 + matrix4 + matrix5) / # of matrices
S.Mean = S.List %>% {Reduce("+", .) / length(S.List)}
T.Mean = T.List %>% {Reduce("+", .) / length(T.List)}
V.Mean = V.List %>% {Reduce("+", .) / length(V.List)}
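
To make the sum-then-divide step concrete, here is a minimal sketch on three small made-up matrices. The values and the names `mats` and `elementwise_mean` are illustrative assumptions, not data from the question. It only shows that `Reduce` with `+` adds the list entries position by position, so dividing by the list length gives the position-wise mean.

```r
# Three small matrices standing in for the per-file data (made-up values)
mats <- list(
  matrix(1:6,   nrow = 2),
  matrix(7:12,  nrow = 2),
  matrix(13:18, nrow = 2)
)

# Reduce() with `+` adds the matrices element by element;
# dividing by the number of matrices gives the position-wise mean
elementwise_mean <- Reduce(`+`, mats) / length(mats)

print(elementwise_mean)
```

If the goal is the mean of the relative changes rather than of the raw files, the same reduction could presumably be applied to a list of per-file differences built first, for example with `Map()` over the two lists of data frames.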

【Comments】:

【Solution 2】:

Here are two possibilities:

Min2000<- array(NA,dim=c(259,10,5))
Min2010<- array(NA,dim=c(259,10,5))

# this will be done 5 times
for (i in 1:5)  { 

  # import data from the files
  file1 <- matrix(sample(1:10,2849,replace=TRUE),259,11)
  file2 <- matrix(sample(1:10,2849,replace=TRUE),259,11)
  file3 <- matrix(sample(1:10,2849,replace=TRUE),259,11)

  # delete the first column
  file1a <- file1[,-1]
  file2a <- file2[,-1]
  file3a <- file3[,-1]

  # compute data
  Min2000[,,i] <- (file2a-file1a)/file1a
  Min2010[,,i] <- (file3a-file1a)/file1a
}

A2 <- apply(Min2000,1:2,"mean")
A3 <- apply(Min2010,1:2,"mean")

Sum2 <- matrix(0,259,10)
Sum3 <- matrix(0,259,10)

# this will be done 5 times
for (i in 1:5)  { 

  # import data from the files
  file1 <- matrix(sample(1:10,2849,replace=TRUE),259,11)
  file2 <- matrix(sample(1:10,2849,replace=TRUE),259,11)
  file3 <- matrix(sample(1:10,2849,replace=TRUE),259,11)

  # delete the first column
  file1a <- file1[,-1]
  file2a <- file2[,-1]
  file3a <- file3[,-1]

  # compute data
  Sum2 <- Sum2 + (file2a-file1a)/file1a
  Sum3 <- Sum3 + (file3a-file1a)/file1a
}

B2 <- Sum2/5
B3 <- Sum3/5

I replaced the files with random matrices.

The results are practically identical:

> max((A2-B2)^2)
[1] 7.888609e-31

> max((A3-B3)^2)
[1] 7.888609e-31
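
A difference this small can only appear when both approaches see the same data, whereas the code as shown draws fresh random matrices in each loop. The following is a sketch of a reproducible comparison, assuming the stand-in data is generated once and reused. The seed and the names `n_files` and `rel_change` are assumptions made for illustration. It also includes `rowMeans()` with `dims = 2`, which averages over every dimension after the first two.

```r
set.seed(1)   # arbitrary seed, used only to make the run reproducible
n_files <- 5

# Generate the stand-in "files" once so every method sees identical data
rel_change <- array(NA_real_, dim = c(259, 10, n_files))
for (i in seq_len(n_files)) {
  file1a <- matrix(sample(1:10, 2590, replace = TRUE), 259, 10)
  file2a <- matrix(sample(1:10, 2590, replace = TRUE), 259, 10)
  rel_change[, , i] <- (file2a - file1a) / file1a
}

# Method A: mean over the third dimension with apply()
A <- apply(rel_change, 1:2, mean)

# Method B: running sum, then divide by the number of files
B <- matrix(0, 259, 10)
for (i in seq_len(n_files)) {
  B <- B + rel_change[, , i]
}
B <- B / n_files

# Method C: rowMeans() with dims = 2 treats the first two dimensions as "rows"
C <- rowMeans(rel_change, dims = 2)

# Any remaining differences should be at floating-point rounding level
print(max(abs(A - B)))
print(max(abs(A - C)))
```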

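【Comments】:

I had assumed that the matrices really have 10 rows and 259 columns, not the other way around.
I have changed the solution so that "file1", "file2" and "file3" are 259x11 matrices.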

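【Solution 3】:

Thank you both for your answers.

mra68, I tried both solutions. The first one (with: A2

Vlo, I tried the library, but with your code I only got the sum over the 10 columns and the mean of those columns. I want the mean across the 5 Excel files, not the mean across the 10 columns. The code does more than I managed on my own at first, but it is not what I need.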
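
For the requirement stated here, a position-wise mean across the files rather than across columns, one possible end-to-end sketch is shown below. It rests on several assumptions: the files are tab-separated with a header and an identifier in the first column, and small temporary stand-in files (4 rows, 3 value columns) replace the real data. The helpers `make_file` and `read_values` and the result name `Mean2000` are hypothetical names introduced for illustration. Converting each data frame with `as.matrix()` before storing it keeps every array slice a plain numeric matrix.

```r
MAM <- c("M", "N", "O", "P", "R")
tmp_dir <- tempdir()

# Write small stand-in files (hypothetical data: an id column plus 3 value columns)
set.seed(1)
make_file <- function(path) {
  df <- data.frame(id = 1:4, matrix(sample(1:10, 12, replace = TRUE), 4, 3))
  write.table(df, path, sep = "\t", row.names = FALSE, quote = FALSE)
}
for (suffix in c("2000", "2150")) {
  for (m in MAM) make_file(file.path(tmp_dir, paste0(m, "_", suffix, ".txt")))
}

# Read one file, drop the first column, and return a numeric matrix
read_values <- function(path) {
  as.matrix(read.table(path, header = TRUE, sep = "\t")[, -1])
}

# One slice per file pair, filled inside the loop
Min2000 <- array(NA_real_, dim = c(4, 3, length(MAM)))
for (i in seq_along(MAM)) {
  file1a <- read_values(file.path(tmp_dir, paste0(MAM[i], "_2000.txt")))
  file2a <- read_values(file.path(tmp_dir, paste0(MAM[i], "_2150.txt")))
  Min2000[, , i] <- (file2a - file1a) / file1a
}

# Position-wise mean across the files (over the third dimension)
Mean2000 <- apply(Min2000, 1:2, mean)
print(dim(Mean2000))
```

With the real data, the array dimensions would need to match the actual file layout, and the number of slices follows from `length(MAM)`, so adding more files only means extending that vector.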

【Comments】: