【Posted】:2018-08-05 12:27:43
【Question】:
Given the following data frame:
structure(list(press_id = c(1L, 1L, 1L, 1L, 1L), time_state = c("start_time",
"end_time", "start_time", "end_time", "start_time"), time_state_val = c(164429106667745,
164429180716697, 164429106667745, 164429180716697, 164429106667745
), timestamp = c(164429106667745, 164429106667745, 164429106667745,
164429106667745, 164429108669078), acc_mag = c(10.4656808698978,
10.4656808698978, 10.4656808698978, 10.4656808698978, 10.458666511955
)), .Names = c("press_id", "time_state", "time_state_val", "timestamp",
"acc_mag"), row.names = c(NA, -5L), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), vars = "press_id", drop = TRUE, indices = list(
0:4), group_sizes = 5L, biggest_group_size = 5L, labels = structure(list(
press_id = 1L), row.names = c(NA, -1L), class = "data.frame", vars = "press_id", drop = TRUE, .Names = "press_id"))
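I want to apply a "rule" when filtering: if time_state == "start_time", check that time_state_interval == min(timestamp); if it is "end_time", check for equality with max(timestamp).
How can I perform this kind of rule-based filter? I am trying to do it with case_when, but it does not produce the expected result.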
df1 %>%
  group_by(press_id) %>%
  mutate(row = row_number(),
         start_time = min(timestamp),
         end_time = max(timestamp)) %>%
  gather(time_state, time_state_val, -press_id, -row, -timestamp:-vel_ang_mag_avg) %>%
  arrange(press_id, row) %>%
  select(press_id, time_state, time_state_val, timestamp, acc_mag, vel_ang_mag, -row) %>%
  group_by(press_id, time_state) %>%
  filter(timestamp == case_when(time_state == "start_time" ~ min(timestamp),
                                time_state == "end_time" ~ max(timestamp)))
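One possible reading of the rule, sketched using only the columns that actually appear in the posted data (so vel_ang_mag and vel_ang_mag_avg are left out): keep a "start_time" row when its time_state_val equals the group's minimum timestamp, and an "end_time" row when it equals the group's maximum. Comparing time_state_val rather than timestamp is an assumption, since the question names time_state_interval, which is not a column here. The sketch groups by press_id only, so min() and max() are taken over the whole press rather than within each time_state.

```r
library(dplyr)

# Rebuild the posted rows as a plain tibble (the stored grouping attributes are dropped).
df1 <- tibble(
  press_id = rep(1L, 5),
  time_state = c("start_time", "end_time", "start_time", "end_time", "start_time"),
  time_state_val = c(164429106667745, 164429180716697, 164429106667745,
                     164429180716697, 164429106667745),
  timestamp = c(164429106667745, 164429106667745, 164429106667745,
                164429106667745, 164429108669078),
  acc_mag = c(10.4656808698978, 10.4656808698978, 10.4656808698978,
              10.4656808698978, 10.458666511955)
)

# Assumed rule: compare time_state_val with the per-press min/max of timestamp.
result <- df1 %>%
  group_by(press_id) %>%
  filter(
    (time_state == "start_time" & time_state_val == min(timestamp)) |
      (time_state == "end_time" & time_state_val == max(timestamp))
  ) %>%
  ungroup()
```

On the five posted rows this keeps the three "start_time" rows; no "end_time" row matches, because the largest timestamp in the excerpt is smaller than the "end_time" value.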
【Comments】:
-
Could you add your code so we can see what you are trying to do?
-
Sure, I'll post it right now.
-
@coffeinjunky Please see the filter statement above.
-
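The dataset you posted seems to have fewer variables than your code uses, for example vel_ang_mag_avg. Please update your post and let us know what your ideal output should look like.
-
I thought case_when was for creating new variables based on old ones; I haven't seen case_when used together with filter.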
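For what it's worth, filter() only needs a logical vector, so case_when() can supply the condition directly. A minimal sketch on made-up data (the data frame toy and its columns state and t are hypothetical, not from the question): each branch returns TRUE/FALSE, and rows that match no branch get NA, which filter() drops.

```r
library(dplyr)

# Hypothetical toy data, only to show case_when() inside filter().
toy <- tibble(
  state = c("start", "start", "end", "end", "other"),
  t = c(1, 2, 1, 2, 1)
)

kept <- toy %>%
  filter(case_when(
    state == "start" ~ t == min(t),  # "start" rows: keep where t is the minimum
    state == "end" ~ t == max(t)     # "end" rows: keep where t is the maximum
    # any other state falls through to NA and is dropped by filter()
  ))
```

Putting the comparison inside each branch, rather than writing x == case_when(...), keeps the result a plain logical vector and makes the per-state rule explicit.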