【Posted】: 2014-07-08 20:21:23
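【Question】:
I am trying to write a function that uses its input variables to create a new column and computes that column from a subset of rows. In the example below, I want to create a column named 'forest_closed_start_h_1' that is calculated with the following formula wherever start_class_01 == 'forest_closed': (start_class_01_perc * 0.01) * (ha_affect).
Edit: I should have mentioned that I want a function (or, even better, perhaps a loop) because I have to run the same calculation for 50 different iterations of the same type of data.
Here is the function I wrote, but I can't get the function arguments to populate "a", "b", and "c". I also can't get it to create the new column.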
class_calc <- function(start_end,number,veg){
a <- [paste (veg,start_end,'h',number,sep='_')] #create new variable (a) equal to forest_closed_start_h_1
b <- [paste0(start_end,'_class_',number)] #create new variable (b) equal to start_class_01
c <- [paste0(start_end,'_class_',number,'_perc')] #create new variable (c) equal to start_class_01_perc
dat$a <- 0 #create new column from variable a, which is forest_closed_start_h_01
dat$a[dat$b==veg]<-(dat$c[dat$b==veg]*0.01)*(dat$ha_affect[dat$b==veg]) #calculate values for a, where start_class_01==forest_closed
}
class_calc(start_end='start',number='01',veg='forest_closed')
Here is a subset of my data:
structure(list(start_class_01 = c("forest_closed", "forest_closed",
"forest_open", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "herbaceous", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_semi_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed",
"forest_closed", "forest_closed", "forest_closed", "forest_closed"
), start_class_01_perc = c(100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 70, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100), ha_affect = c(3.87, 1.134, 1.44, 1.8, 2.43,
40.752, 22.95, 9.432, 1.89, 1.53, 2.25, 1.08, 8.946, 3.42, 3.15,
4.32, 5.04, 1.62, 1.17, 2.16, 2.34, 25.56, 3.51, 2.07, 3.51,
100.17, 15.66, 2.7, 36.27, 18.36, 4.41, 23.31, 1.944, 9.18, 1.62,
5.76, 17.37, 7.56, 1.512, 81.36, 7.2, 61.02, 21.69, 1.62, 1.26,
5.4, 0.288, 1.08, 7.74, 1.17)), .Names = c("start_class_01",
"start_class_01_perc", "ha_affect"), row.names = c(NA, 50L), class = "data.frame")
【Discussion】:
-
Hi, your [ seems to be awkwardly placed. Perhaps you want to review the language's syntax? -
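Because of the syntax errors, the code you posted does not produce a valid function. You initialize dat$a, but you never initialize dat$b or dat$c anywhere, yet you use them in your question. Perhaps you meant b and c. -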
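It looks to me like you are trying to write your own version of existing functionality. Have you looked at tapply, aggregate, and the plyr and dplyr packages? -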
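@MrFlick, you're right, the function doesn't run in its current form, but I was hoping it would provide a framework that others could work from. Regarding dat$b and dat$c, I was trying to refer to specific columns in the data frame without hard-wiring them into the function. Instead, I thought I could identify the columns by pasting parts of the function inputs together. In the example provided, I want dat$b to resolve to start_class_01, which exists in the data frame, and dat$c to resolve to start_class_01_perc, which also already exists in the data frame. Hope this helps.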
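One way to get the behaviour described in the last comment can be sketched as follows. This is only an illustration, not a fix confirmed anywhere in the thread. It builds the column names as strings with paste() and paste0() and then indexes with [[ ]], because dat$b looks for a column literally named b rather than the column whose name is stored in b. Passing dat in as an argument and returning the modified copy is an assumption added here, as are the names new_col, class_col, perc_col and the three-row sample data.

```r
# Hypothetical rework of class_calc(): the data frame is an argument and the
# modified copy is returned (the original relied on a global `dat`).
class_calc <- function(dat, start_end, number, veg) {
  # Build the column names as character strings.
  new_col   <- paste(veg, start_end, "h", number, sep = "_")  # "forest_closed_start_h_01"
  class_col <- paste0(start_end, "_class_", number)           # "start_class_01"
  perc_col  <- paste0(start_end, "_class_", number, "_perc")  # "start_class_01_perc"

  # [[ ]] looks a column up by the string held in the variable.
  is_veg <- !is.na(dat[[class_col]]) & dat[[class_col]] == veg

  # Apply the formula on matching rows, 0 everywhere else.
  dat[[new_col]] <- ifelse(is_veg, dat[[perc_col]] * 0.01 * dat$ha_affect, 0)
  dat
}

# Made-up three-row sample with the same column layout as the question's data.
dat <- data.frame(
  start_class_01      = c("forest_closed", "forest_open", "forest_closed"),
  start_class_01_perc = c(100, 100, 70),
  ha_affect           = c(3.87, 1.44, 8.946),
  stringsAsFactors    = FALSE
)

# A loop covers repeated calls; here it runs over two vegetation classes.
for (veg in c("forest_closed", "forest_open")) {
  dat <- class_calc(dat, start_end = "start", number = "01", veg = veg)
}
print(dat)
```

The same loop could run over a vector of numbers ("01", "02", and so on) for the 50 iterations mentioned in the question, provided the matching start_class_NN and start_class_NN_perc columns exist in the data frame.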