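【Question title】: if-else condition with for loop throws a missing value error although there are no missing values in the variables
【Posted】: 2020-05-12 11:00:27
【Question description】:

I am trying to write a loop that creates a variable I can later use group_by on for further calculations. The variable indicates whether a specific type (the group variable) occurred between two dates (the date variable). The factor I want to create is called leaderFactor.

The code throws the error: "Error in if (test1$party[i] == "PSOE" & test1$elecTypeDate[i] > as_date("1977-01-01") & : missing value where TRUE/FALSE needed", but there are no missing values in either of the two relevant variables. Please help! Thanks!

(Side note: this is an example of a more complex scenario with more than two event types, so I will be using several else if statements between the if and the else in the code below.)

Data: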

test1<- structure(list(party = c("PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", 
"PP", "PP", "PP", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", "PSOE", 
"PSOE", "PSOE", "PSOE", "PSOE"), elecTypeDate = structure(c(3346, 
3346, 3346, 3712, 4291, 4503, 4656, 4656, 4656, 4656, 4656, 4656, 
4656, 4656, 4656, 4656, 4656, 4656, 4656, 4868, 4868, 4868, 4868, 
4868, 4868, 4868, 4868, 4868, 4868, 4991, 4991, 4991, 5144, 5204, 
5783, 5995, 6148, 6209, 6360, 6360, 6360, 6360, 6360, 6360, 6360, 
6360, 6360, 6360, 6360, 6360, 6360, 6695, 6940, 7274, 7456, 7578, 
7790, 7790, 7790, 7790, 7790, 7790, 7790, 7790, 7790, 7790, 7790, 
7790, 7821, 8095, 8674, 8766, 8917, 9039, 9251, 9251, 9251, 9251, 
9251, 9251, 9251, 9251, 9251, 9282, 9282, 9282, 9435, 9556, 10135, 
10500, 10592, 10743, 10743, 10743, 10743, 10743, 10743, 10743, 
10743, 10743, 10743, 10743, 10835, 10865, 11017, 11443, 11596, 
12173, 12173, 12173, 12173, 12173, 12173, 12173, 12173, 12173, 
12204, 12204, 12296, 12326, 12357, 12418, 12478, 12874, 12935, 
13453, 13634, 13634, 13634, 13634, 13634, 13634, 13634, 13634, 
13634, 13634, 13665, 13695, 13939, 14245, 14245, 14304, 14304, 
14914, 15095, 15095, 15095, 15095, 15095, 15095, 15095, 15095, 
15095, 15095, 15095, 15095, 15126, 15400, 15400, 15614, 15614, 
15645, 16102, 16495, 16556, 16556, 16556, 16556, 16556, 16556, 
16556, 16556, 16556, 16556, 16556, 16556, 16587, 16679, 17045, 
17045, 17501, 17622, 17928, 17987, 18017, 18017, 18017, 18017, 
18017, 18017, 18017, 18017, 18017, 18017, 18017, 18017, 3346, 
3346, 3346, 3346, 3377, 3712, 3712, 4291, 4503, 4656, 4656, 4656, 
4656, 4656, 4656, 4656, 4656, 4656, 4656, 4656, 4656, 4656, 4868, 
4868, 4868, 4868, 4868, 4868, 4868, 4868, 4868, 4868, 4991, 4991, 
4991, 5144, 5204, 5783, 5995, 6148, 6209, 6360, 6360, 6360, 6360, 
6360, 6360, 6360, 6360, 6360, 6360, 6360, 6360, 6360, 6695, 6940, 
7274, 7456, 7578, 7790, 7790, 7790, 7790, 7790, 7790, 7790, 7790, 
7790, 7790, 7790, 7790, 7821, 8095, 8674, 8766, 8917, 9039, 9251, 
9251, 9251, 9251, 9251, 9251, 9251, 9251, 9251, 9251, 9282, 9282, 
9282, 9435, 9556, 10135, 10500, 10592, 10743, 10743, 10743, 10743, 
10743, 10743, 10743, 10743, 10743, 10743, 10743, 10743, 10835, 
10865, 11017, 11443, 11596, 12173, 12173, 12173, 12173, 12173, 
12173, 12173, 12173, 12173, 12173, 12204, 12204, 12296, 12326, 
12357, 12418, 12478, 12874, 12935, 13453, 13634, 13634, 13634, 
13634, 13634, 13634, 13634, 13634, 13634, 13634, 13634, 13665, 
13695, 13939, 14245, 14304, 14304, 14914, 15095, 15095, 15095, 
15095, 15095, 15095, 15095, 15095, 15095, 15095, 15095, 15095, 
15126, 15400, 15400, 15614, 15614, 15645, 16102, 16495, 16556, 
16556, 16556, 16556, 16556, 16556, 16556, 16556, 16556, 16556, 
16556, 16556, 16587, 16679, 17045, 17045, 17501, 17622, 17928, 
17987, 18017, 18017, 18017, 18017, 18017, 18017, 18017, 18017, 
18017, 18017, 18017, 18017), class = "Date")), row.names = c(NA, 
-398L), class = c("tbl_df", "tbl", "data.frame"))

Code:

test1$leaderFactor <- "none"
for(i in test1$leaderFactor){
  if(test1$party[i]=="PSOE" & 
     test1$elecTypeDate[i] > as_date("1977-01-01") & 
                               test1$elecTypeDate[i] < as_date("1997-06-30")){
    test1$leaderFactor[i] = "Gonzales"
   } else {
    test1$leaderFactor[i] = "Rest"}}
sum(is.na(test1$elecTypeDate))
sum(is.na(test1$party))

【Comments】:

  • By the way, you do not need to convert a variable to a factor to use group_by.
  • Try as.Date, it works for me.
  • df <- data.frame(group=rep(c(1,2),each=3), date=rep(c("2001-06-01", "2002-10-01", "2003-06-01"),2), stringsAsFactors = FALSE); df$futureFactor <- ifelse(df$group==1 & df$date > "2001-01-01" & df$date < "2002-12-31", "a", "b")
  • @jay.sf Using as.Date instead of as_date produces exactly the same error.
  • @KonradRudolph With R 4.0.0 and only the base packages loaded, I get no error, sorry.

Tags: r for-loop if-statement


【Solution 1】:

First of all, why use a loop? The same thing can be written without one:

library(lubridate)  # provides as_date()

df <- data.frame(
    group = rep(c(1, 2), each = 3),
    date = as_date(rep(c("2001-06-01", "2002-10-01", "2003-06-01"), 2))
)

df$futureFactor <- ifelse(
    df$group == 1
    & df$date > as_date("2001-01-01")
    & df$date < as_date("2002-12-31"),
    "a", "b"
)

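The resulting code is shorter, more idiomatic R, and runs more efficiently.

If you use if rather than ifelse, always use && (and ||) instead of & (and |): the latter are vectorised, but if accepts only a single value and fails when given several, so a vectorised operator makes no sense there.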
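A small sketch of the difference between the two operator families. The names x, y, a and b are made up for illustration and do not come from the question:

```r
# Illustrative values only; none of these names come from the question.
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

# `&` works element by element and returns a vector, which suits ifelse().
elementwise <- x & y   # TRUE FALSE FALSE

# `&&` takes single values and returns a single value, which suits if().
a <- TRUE
b <- FALSE
label <- if (a && b) "both" else "not both"

# `&&` also short-circuits: the right-hand side is skipped when the
# left-hand side is already FALSE, so stop() is never called here.
skipped <- FALSE && stop("never evaluated")
```

The short-circuit behaviour is one more reason to prefer && inside if: later conditions are only evaluated when the earlier ones leave the outcome open.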

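Now, why does your code fail? Because you are trying to compare a date with a factor, and R gives you a helpful warning (which really should be an error):

Incompatible methods ("Ops.factor", ">.Date") for ">"

You need to make sure your data has the correct type by defining df$date with as_date, as I did in the code above.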
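One way to check and repair the column type in base R, sketched under the assumption that the dates arrived as a factor of ISO-formatted strings. The vector d is made up for illustration, and as.Date stands in for as_date to avoid the package dependency:

```r
# Hypothetical input: dates that were read in as a factor.
d <- factor(c("2001-06-01", "2002-10-01"))

# Check the type before comparing.
was_date <- inherits(d, "Date")          # FALSE: it is a factor

# Convert via character; calling as.Date() on the factor's integer
# codes would not give the intended dates.
d_fixed <- as.Date(as.character(d))
is_date <- inherits(d_fixed, "Date")     # TRUE

# The comparison is now Date against Date.
after_2001 <- d_fixed > as.Date("2001-01-01")
```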

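【Comments】:

  • "Side note: this is an example of a more complex scenario with more than two event types, so I will be using several else if statements between the if and the else in the code below."
  • I was aware of the factor-date comparison issue. Your answer refers to an example I made up to illustrate the problem; it is not the case that one of my variables is not a date. I have now put in the code straight from my script (though still without the else if branches in between).
  • @Spaniel For more than two cases, you can use case_when from "dplyr" instead of ifelse.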
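A minimal sketch of the case_when suggestion, assuming dplyr is installed. The three rows and the "Other PSOE" category are hypothetical and only show where additional cases would go; "Gonzales", "Rest" and the date window come from the question:

```r
library(dplyr)

# Hypothetical stand-in rows; the real data is the test1 tibble above.
df <- data.frame(
  party = c("PSOE", "PSOE", "PP"),
  elecTypeDate = as.Date(c("1982-10-28", "2004-03-14", "1996-03-03")),
  stringsAsFactors = FALSE
)

# Conditions are checked top to bottom; the first match wins.
df$leaderFactor <- case_when(
  df$party == "PSOE" &
    df$elecTypeDate > as.Date("1977-01-01") &
    df$elecTypeDate < as.Date("1997-06-30") ~ "Gonzales",
  df$party == "PSOE" ~ "Other PSOE",   # hypothetical extra category
  TRUE ~ "Rest"                        # fallback, like the final else
)
```

Each additional else if branch in the loop version becomes one more `condition ~ value` line here.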
【Solution 2】:

The solution is to wrap test1$leaderFactor in seq_along().
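A sketch of why this helps, using three made-up rows in place of the full test1 data and base as.Date in place of as_date. Looping over the column itself sets i to the string "none" on every pass, so test1$party[i] looks up an element by a name that does not exist and returns NA, which is what if() then complains about. seq_along() yields the positions 1, 2, 3, ... instead:

```r
# Hypothetical stand-in for test1 with the same column names.
test1 <- data.frame(
  party = c("PP", "PSOE", "PSOE"),
  elecTypeDate = as.Date(c("1979-03-01", "1982-10-28", "2004-03-14")),
  stringsAsFactors = FALSE
)
test1$leaderFactor <- "none"

# What the original loop indexed with: a name that does not exist.
bad_index <- test1$party["none"]   # NA, although the column has no NAs

# Iterate over positions instead of values.
for (i in seq_along(test1$leaderFactor)) {
  if (test1$party[i] == "PSOE" &&
      test1$elecTypeDate[i] > as.Date("1977-01-01") &&
      test1$elecTypeDate[i] < as.Date("1997-06-30")) {
    test1$leaderFactor[i] <- "Gonzales"
  } else {
    test1$leaderFactor[i] <- "Rest"
  }
}
```

This also explains why the is.na() checks in the question report zero: the NA comes from the failed lookup, not from the data.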

【Comments】:
