[Posted]: 2021-11-26 18:00:54
[Question]:
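Data setup
I have a dataset that looks a bit like the simple data frame below:
library(dplyr)  # provides tibble(), mutate(), case_when() and %>%

# Exchange rates: units of foreign currency per 1 USD
CAD_EXCHANGE <- 1.34
EUR_EXCHANGE <- 0.88
df <- tibble(
  shipment = c("A", "B", "C", "D", "E"),
  invoice = c(rep(500, 5)),
  currency = factor(c("USD", "EUR", "CAD", NA, "SDD"))
)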
df
# A tibble: 5 x 3
  shipment invoice currency
  <chr>      <dbl> <fct>
1 A            500 USD
2 B            500 EUR
3 C            500 CAD
4 D            500 NA
5 E            500 SDD
levels(df$currency)
[1] "CAD" "EUR" "SDD" "USD"
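End goal
I'm trying to convert invoices in certain common foreign currencies (EUR and CAD) to USD, while leaving every other currency and any missing values (here SDD and NA) untouched. My final data frame should look like this: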
# A tibble: 5 x 5
  shipment invoice currency invoice_converted currency_converted
  <chr>      <dbl> <fct>                <dbl> <fct>
1 A            500 USD                    500 USD
2 B            500 EUR                    568 USD
3 C            500 CAD                    373 USD
4 D            500 NA                     500 NA
5 E            500 SDD                    500 SDD
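As a quick check of the arithmetic behind the invoice_converted column, the two converted values in the table can be reproduced directly from the exchange rates given in the setup. This is only a sketch in base R; the names eur_in_usd and cad_in_usd are made up for illustration.

```r
# Exchange rates from the data setup (foreign currency per 1 USD)
CAD_EXCHANGE <- 1.34
EUR_EXCHANGE <- 0.88

# Dividing by the rate converts a foreign amount to USD
eur_in_usd <- round(500 / EUR_EXCHANGE)  # 568
cad_in_usd <- round(500 / CAD_EXCHANGE)  # 373

print(c(EUR = eur_in_usd, CAD = cad_in_usd))
```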
Attempt 1 -- doesn't work
In the future I may need to convert more than just these few currencies, so I used a case_when() statement. This was my first attempt:
df_USD1 <- df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ "USD",
      currency == "CAD" ~ "USD",
      TRUE ~ currency
    )
  )
Error: Problem with `mutate()` column `currency_converted`.
i `currency_converted = case_when(...)`.
x must be a character vector, not a `factor` object.
From the error above, I understand that I'm mixing character and factor values when assigning to currency_converted, because my default case is TRUE ~ currency (and currency is a factor). So I tried assigning using only factors...
Attempt 2 -- works, but isn't reliable
df_USD2 <- df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ currency[1],  # relies on row 1 being USD
      currency == "CAD" ~ currency[1],
      TRUE ~ currency
    )
  )
It works, but only because USD happens to be the first element of currency in the setup for this question, and I can't rely on that.
> df$currency
[1] USD EUR CAD <NA> SDD
Levels: CAD EUR SDD USD
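To make the position dependence concrete, here is a small base-R sketch. It compares currency[1] with a length-one factor built explicitly from the string "USD" and the existing levels. The names usd and reordered are hypothetical, and whether such a value is accepted on the right-hand side of case_when() may depend on the installed dplyr version, so treat this as an illustration rather than a confirmed fix.

```r
currency <- factor(c("USD", "EUR", "CAD", NA, "SDD"))

# A length-one factor "USD" that shares the levels of `currency`
usd <- factor("USD", levels = levels(currency))

# With the original row order, currency[1] is exactly this value
same_as_first <- identical(usd, currency[1])  # TRUE

# After reordering the rows, currency[1] is no longer USD
reordered <- rev(currency)
still_same <- identical(usd, reordered[1])    # FALSE: reordered[1] is SDD

print(c(same_as_first, still_same))
```

The explicitly built factor does not depend on where USD appears in the data, only on "USD" being one of the levels.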
Attempt 3 -- doesn't work
I thought I could get the USD factor value by subsetting instead, but this doesn't work:
df_USD3 <- df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ df$currency[df$currency == "USD"],
      currency == "CAD" ~ df$currency[df$currency == "USD"],
      TRUE ~ currency
    )
  )
Error: Problem with `mutate()` column `currency_converted`.
i `currency_converted = factor(...)`.
x `currency == "EUR" ~ df$currency[df$currency == "USD"]`, `currency == "CAD" ~ df$currency[df$currency == "USD"]` must be length 5 or one, not 2.
Run `rlang::last_error()` to see where the error occurred.
And that seems to be because an NA is returned along with the USD value...
> df$currency[df$currency == "USD"]
[1] USD <NA>
Levels: CAD EUR SDD USD
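...because if I go back to the original df and replace the NA with some other currency, it works. But obviously I need to be able to keep the NA where it belongs.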
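The length-2 result comes from how base R subsets with a logical vector that contains NA: the comparison currency == "USD" is NA for the missing row, and an NA index yields an NA element. A minimal sketch of that behaviour follows, using which() to drop the NA position. The names is_usd, with_na and without_na are made up for illustration.

```r
currency <- factor(c("USD", "EUR", "CAD", NA, "SDD"))

# Comparing against NA gives NA, not FALSE
is_usd <- currency == "USD"             # TRUE FALSE FALSE NA FALSE

# Logical subsetting keeps an <NA> entry for the NA index
with_na <- currency[is_usd]             # USD <NA>

# which() returns only the positions that are TRUE
without_na <- currency[which(is_usd)]   # USD

print(length(with_na))     # 2
print(length(without_na))  # 1
```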
I feel like there's some really neat way to do this, but despite reading up on factors and trying different things, I'm missing it. Help?
[Comments]: