【Question title】:Using two for loops to define datasets and variables in R
【Posted】:2012-08-31 17:04:32
【Question description】:

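I need to access variables from two different datasets in a simple loop (below). (I realize this requires the Negative and Positive vectors to be the same length... fortunately, that will always be the case.)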

Groups<-c("bdiControl","bdi")
Positive<-c("PA","Sad_PA","Amuse_PA","Happy_PA","Disgust_PA")
Negative<-c("NA","Sad_NA","Amuse_NA","Happy_NA","Disgust_NA")

for (g in Groups) {
    for (i in Positive) { 

if (sd(Groups[[g]]$Positive[[i]])<sdthresh | sd(Groups[[g]]$Negative[[i]]])<sdthresh){
cat('Standard deviation too low to run\ ',Positive[[i]],Negative[[i]],'\ comparison')
}
else{
corr<-cor(Groups[[g]]$Positive[[i]],Groups[[g]]$Negative[[i]],use="complete.obs") 
print("The correlation between " Positive[[i]] " and " Negative[[i]] " was " corr "for " Groups[[g]])
}
}
}

Other references I have tried include g$i, Groups[g]$Positive[i], g$Positive[[i]], and similar permutations. I think I'm just spinning my wheels trying to solve this. Help?! :)

【Question comments】:

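  • What is the question? (The code you have won't run. Do you need the errors in it fixed?)
  • To clarify: bdiControl and bdi are two data frames? Each has 10 columns: 5 positive and 5 negative? It would help if you shared the data using dput(bdiControl), or just a small part of it (e.g. dput(head(bdiControl))).
  • Could you explain the problem from the beginning? It is hard to reverse-engineer what you are trying to do. (Something like: "I have two data frames, bdiControl and bdi, each with 5 pairs of observations, positive and negative. I need to find the correlation between...")
  • I have two data frames with the same variables (bdi and bdiControl). The code produces errors such as "Error in g$i: $ operator is invalid for atomic vectors". When I hard-code the datasets$vars it runs, so I'm pretty sure the error lies in those variable references.
  • You still haven't said what you are trying to do. Do you want to correlate the same variable across the two data frames, or correlate positive with negative within each data frame? I understand you are trying to access pairs of variables using strings that describe them. I can tell you how to do that, but I want to make sure what I tell you is the right approach.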

Tags: r for-loop dataset


【Solution 1】:

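There are a number of problems with this code. While it isn't entirely clear what the code is trying to do (you should state your question more clearly), I believe this does what you want: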

for (group.name in Groups) {
    g <- get(group.name)  # retrieve the actual data frame by its name
    for (i in seq_along(Positive)) {
        pos <- g[[Positive[i]]]  # [[ ]] selects a column by a name held in a variable
        neg <- g[[Negative[i]]]
        # || is the scalar "or": each sd() returns a single number
        if (sd(pos) < sdthresh || sd(neg) < sdthresh) {
            cat("Standard deviation too low to run",
                Positive[i], Negative[i], "comparison\n")
        } else {
            corr <- cor(pos, neg, use = "complete.obs")
            print(paste("The correlation between", Positive[i],
                        "and", Negative[i], "was", corr, "in", group.name))
        }
    }
}
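
As for why the attempts in the question fail: inside `for (g in Groups)`, `g` is just a character string (the name of the data frame), not the data frame itself, and `$` looks for a column with the literal name written after it. Here is a minimal sketch of the difference, using a small made-up data frame (the values are hypothetical and only for illustration):

```r
# Hypothetical toy data; only the column names follow the question's scheme.
bdi <- data.frame(PA = c(1, 2, 3, 4), Sad_PA = c(2, 1, 4, 3))

g <- "bdi"   # what the loop variable actually holds: a string
i <- "PA"    # a column name stored in a variable

is.character(g)   # g is the name of the data frame, not the data frame
# g$i             # would fail: $ operator is invalid for atomic vectors

df <- get(g)      # look up the object whose name is stored in g
col_by_name <- df[[i]]   # [[ ]] evaluates i, so this is the PA column
col_literal <- df$i      # $ looks for a column literally called "i"

print(col_by_name)
print(is.null(col_literal))
```

So `get()` turns the name into the object, and `[[ ]]` turns the stored column name into the column.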

For example, when I create random datasets (always provide a reproducible example!):

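set.seed(1)
bdicontrol = as.data.frame(matrix(rnorm(100), nrow=10))
bdi = as.data.frame(matrix(rnorm(100), nrow=10))
colnames(bdicontrol) <- c(Positive, Negative)
colnames(bdi) <- c(Positive, Negative)
# Groups must hold the object names exactly as created here (lower-case "bdicontrol"),
# and sdthresh must be defined before the loop is run.
Groups <- c("bdicontrol", "bdi")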

The output is:

[1] "The correlation between PA and NA was -0.613362711250911 in bdicontrol"
[1] "The correlation between Sad_PA and Sad_NA was 0.321335485805636 in bdicontrol"
[1] "The correlation between Amuse_PA and Amuse_NA was 0.0824438791207575 in bdicontrol"
[1] "The correlation between Happy_PA and Happy_NA was -0.192023690189678 in bdicontrol"
[1] "The correlation between Disgust_PA and Disgust_NA was -0.326390681138363 in bdicontrol"
[1] "The correlation between PA and NA was 0.279863504447769 in bdi"
[1] "The correlation between Sad_PA and Sad_NA was 0.115897422274498 in bdi"
[1] "The correlation between Amuse_PA and Amuse_NA was -0.465274556165398 in bdi"
[1] "The correlation between Happy_PA and Happy_NA was 0.268076939911701 in bdi"
[1] "The correlation between Disgust_PA and Disgust_NA was 0.573745174454954 in bdi"
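
If you want to keep the correlations for later use rather than just print them, one possible variation (a sketch, not part of the code above) is to hold the data frames in a named list instead of calling `get()`, and to collect the results into a data frame. The helper `make_df` and the reduced two-pair `Positive`/`Negative` vectors below are assumptions made purely for illustration:

```r
set.seed(1)
Positive <- c("PA", "Sad_PA")
Negative <- c("NA", "Sad_NA")

# Hypothetical helper: builds a random data frame with the expected columns.
make_df <- function() {
  df <- as.data.frame(matrix(rnorm(40), nrow = 10))
  colnames(df) <- c(Positive, Negative)
  df
}

# A named list keeps each data frame together with its group name.
groups <- list(bdicontrol = make_df(), bdi = make_df())

results <- do.call(rbind, lapply(names(groups), function(group.name) {
  g <- groups[[group.name]]
  data.frame(
    group    = group.name,
    positive = Positive,
    negative = Negative,
    # one correlation per Positive/Negative pair
    corr     = mapply(function(p, n) cor(g[[p]], g[[n]], use = "complete.obs"),
                      Positive, Negative, USE.NAMES = FALSE)
  )
}))

print(results)
```

This gives one row per group and variable pair, which is easier to filter or plot than printed strings.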

【Comments】:

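  • I would never have thought to record/retrieve the data this way. Thanks for the solution.
  • I erred on the side of not writing a novel in the question at first... sorry that it took some investigating.
  • You should always write as clearly as you can (otherwise you won't get any good answers!). By the way, if this answered your question, don't forget to accept it.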