【Posted】:2020-01-21 19:17:23
【Question】:
I have the following matrix:
mat<- matrix(c(1,0,0,0,0,0,1,0,0,0,0,0,0,0,2,0,
2,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,
0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,
0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,1,0,0,1,0,1,1,0,0,1,0,1,
1,1,0,0,0,0,0,0,1,0,1,2,1,0,0,0), nrow=16, ncol=6)
dimnames(mat)<- list(c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"),
c("1", "2", "3", "4", "5", "6"))
I created a list of matrices using the following function:
lapply(seq_len(ncol(mat) - 1), function(j) do.call(cbind,
lapply(seq_len(ncol(mat) - j), function(i) rowSums(mat[, i:(i + j)]))))
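In this function, columns of the original matrix are combined using a moving-window approach. The window size starts at 2, so the data in two adjacent columns are summed. The window then moves one step (one column) and the next pair of columns is combined. The output is one matrix per window size. The window size then increases to 3 columns, and the results for 3 columns go into a new matrix. This continues until the window size reaches the total number of columns.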
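To make the shape of that list concrete, here is a minimal sketch on a small stand-in matrix (the matrix m, the names windows and size_2, and the "1_2" column labels are illustrative assumptions, not part of the original code). It uses the same nested lapply pattern and only adds labels, so each output column shows which source columns it combines.

```r
# Small stand-in matrix (hypothetical data, not the matrix from the question)
m <- matrix(1:12, nrow = 3,
            dimnames = list(c("a", "b", "c"), as.character(1:4)))

windows <- lapply(seq_len(ncol(m) - 1), function(j) {
  starts <- seq_len(ncol(m) - j)   # first column of each window
  out <- do.call(cbind, lapply(starts, function(i) {
    rowSums(m[, i:(i + j), drop = FALSE])
  }))
  # Label each column with the source columns it combines, e.g. "1_2"
  colnames(out) <- vapply(starts, function(i) {
    paste(colnames(m)[i:(i + j)], collapse = "_")
  }, character(1))
  out
})
# j = 1 corresponds to a window of 2 columns, j = 2 to 3 columns, and so on
names(windows) <- paste0("size_", seq_len(ncol(m) - 1) + 1)

sapply(windows, ncol)   # number of windows per window size
windows$size_2          # one column per window position
```

With 4 columns this gives 3 windows of size 2, 2 of size 3, and 1 of size 4; the 6-column matrix in the question gives 5, 4, 3, 2, and 1 windows in the same way.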
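I need to run a series of computations on each matrix in the list and collect the results in a data frame. The computations I need to apply are:
- Calculate the total frequency of each row (i.e. the row total). I tried:
freq <- rowSums(mat[i:(i + j),])
- Calculate the mean frequency of each row (i.e. row total / row length). I tried:
mean_freq <- rowSums(mat[i:(i + j),])/length(mat[i:(i + j),])
- Compute the total window size as window size * pi * 25. I tried:
total_window_size <- length(ncol(mat) - j))*pi*25
- Divide each row's mean frequency by the total window size:
density <- mean_freq/total_window_size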
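The four steps can be sketched for a single window as follows. This is one possible reading, not a confirmed specification: it assumes "row length" means the number of columns in the window, and it uses a small hypothetical matrix m. Note that the attempts above index rows (mat[i:(i + j), ]) although the window runs over columns, that length() of a matrix counts every cell, and that length(ncol(mat) - j) is always 1.

```r
# Hypothetical 3 x 3 example matrix (not the data from the question)
m <- matrix(c(1, 0, 2,
              0, 3, 1,
              4, 0, 0), nrow = 3,
            dimnames = list(c("a", "b", "c"), c("1", "2", "3")))

cols <- 1:2                     # one window covering columns 1 and 2
window_size <- length(cols)     # number of columns in the window

# Step 1: row totals within the window (index columns, not rows)
freq <- rowSums(m[, cols, drop = FALSE])

# Step 2: mean frequency; ASSUMPTION: "row length" = columns in the window
mean_freq <- freq / window_size

# Step 3: window size * pi * 25
total_window_size <- window_size * pi * 25

# Step 4: density
density <- mean_freq / total_window_size
```

Here freq is 1, 3, 3 for rows a, b, c, and total_window_size is 2 * pi * 25, roughly 157.08, which matches the value used for window size 2 in the expected output below.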
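Below are the expected results of the computations above for each matrix in this example list (i.e. result_mat1, result_mat2, ...). The data frame result_df combines the results of every sub data frame and is the final output I need:
Data frame for window size 2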
result_mat1 <- data.frame( window_size= rep("2",80),
combined_cols= c(rep("1_2",16), rep("2_3",16), rep("3_4",16), rep("4_5",16), rep("5_6",16)),
row_names= c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"),
freq=c(6,3,2,2,6,2,1,2,1,2,3,2,1,2,3,2),
mean_freq=(c(6,3,2,2,6,2,1,2,1,2,3,2,1,2,3,2)/5),
total_window_size= rep(157.08, 16))
result_mat1$density<- result_mat1$mean_freq/result_mat1$total_window_size
Data frame for window size 3
result_mat2 <- data.frame( window_size= rep("3",64),
combined_cols= c(rep("1_2_3",16), rep("2_3_4",16), rep("3_4_5",16), rep("4_5_6",16)),
row_names= c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"),
freq=c(6,4,3,3,7,3,1,2,1,2,3,2,1,2,4,2),
mean_freq=(c(6,4,3,3,7,3,1,2,1,2,3,2,1,2,4,2)/5),
total_window_size= rep(235.62, 16))
result_mat2$density <- result_mat2$mean_freq/result_mat2$total_window_size
Data frame for window size 4
result_mat3 <- data.frame( window_size= rep("4",48),
combined_cols= c(rep("1_2_3_4",16), rep("2_3_4_5",16), rep("3_4_5_6",16)),
row_names= c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"),
freq=c(6,3,3,3,7,3,1,2,1,2,3,2,1,2,4,2),
mean_freq=(c(6,3,3,3,7,3,1,2,1,2,3,2,1,2,4,2)/5),
total_window_size= rep(314, 16))
result_mat3$density <- result_mat3$mean_freq/result_mat3$total_window_size
Data frame for window size 5
result_mat4 <- data.frame( window_size= rep("5",32),
combined_cols= c(rep("1_2_3_4_5",16), rep("2_3_4_5_6",16)),
row_names= c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"),
freq=c(6,3,2,2,6,2,1,2,1,2,3,2,1,2,4,2),
mean_freq=(c(6,3,2,2,6,2,1,2,1,2,3,2,1,2,4,2)/5),
total_window_size= rep(392.5, 16))
result_mat4$density <- result_mat4$mean_freq/result_mat4$total_window_size
Data frame for window size 6
result_mat5 <- data.frame( window_size= rep("6",16),
combined_cols= c(rep("1_2_3_4_5_6",16)),
row_names= c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"),
freq=c(4,2,1,1,3,1,1,1,1,1,2,2,1,1,3,1),
mean_freq=(c(4,2,1,1,3,1,1,1,1,1,2,2,1,1,3,1)/5),
total_window_size= rep(471, 16))
result_mat5$density <- result_mat5$mean_freq/result_mat5$total_window_size
Final data frame containing the results of all sub data frames
result_df <- rbind(result_mat1, result_mat2, result_mat3, result_mat4, result_mat5)
I need help applying these 4 functions to each element of the list and outputting the results to a single data frame.
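One way this could be approached is to compute the statistics while the windows are being built, returning one long data frame per window and stacking them with rbind. The sketch below is only an interpretation of the request: the function name window_stats and its unit argument are hypothetical, mean_freq is assumed to be freq divided by the number of columns in the window, and window_size is stored as a number rather than a string. It is demonstrated on a small stand-in matrix.

```r
# Hypothetical helper: one row per (window, row name) combination
window_stats <- function(mat, unit = pi * 25) {
  res <- lapply(seq_len(ncol(mat) - 1), function(j) {
    size <- j + 1                              # columns per window
    per_window <- lapply(seq_len(ncol(mat) - j), function(i) {
      cols <- i:(i + j)
      freq <- unname(rowSums(mat[, cols, drop = FALSE]))
      data.frame(
        window_size = size,
        combined_cols = paste(colnames(mat)[cols], collapse = "_"),
        row_names = rownames(mat),
        freq = freq,
        mean_freq = freq / size,               # ASSUMPTION: divide by window size
        total_window_size = size * unit,
        stringsAsFactors = FALSE
      )
    })
    do.call(rbind, per_window)
  })
  out <- do.call(rbind, res)
  out$density <- out$mean_freq / out$total_window_size
  rownames(out) <- NULL
  out
}

# Small stand-in matrix (hypothetical data)
m <- matrix(1:12, nrow = 3,
            dimnames = list(c("a", "b", "c"), as.character(1:4)))
result_df <- window_stats(m)
head(result_df)
```

Calling window_stats(mat) on the 16 x 6 matrix from the question should give 240 rows (16 rows times 5 + 4 + 3 + 2 + 1 windows), the same row count as the expected result_df. Whether the freq and mean_freq values agree with the expected output depends on how "row length" is meant to be defined.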
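【Discussion】:
- I'm a bit confused. You have this function
lapply(seq_len(ncol(mat) - 1), function(j) do.call(cbind, lapply(seq_len(ncol(mat) - j), function(i) rowSums(mat[, i:(i + j)]))))
and then you show it by splitting it up.
- I don't want to split the list. I want to apply a set of functions to each element of the list to produce result_df. The individual data frames I show for each element (i.e. result_mat1 and result_mat2) are my intermediate attempts at producing the final output I need, result_df.
- @Danielle I'd be happy to help, but this question is a bit confusing. Consider breaking it into smaller parts with a smaller dataset. Why is it so important to know how the list of matrices was obtained?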
Tags: r list function matrix apply