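【Posted】: 2019-06-06 04:11:52
【Question】:
Problem
I need to collapse a list into a data frame/tibble, with the name of each list element stored as a value on every observation.
Data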
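# This chunk generates the list
library(tidyverse)  # dplyr, purrr, stringr, tidyr
library(rvest)      # read_html(), html_table(), html_nodes()

url <- "https://www.ato.gov.au/Rates/Individual-income-tax-for-prior-years/"
page <- read_html(url)  # fetch the page once and reuse it

pit_sch <- page %>%
  html_table() %>%
  # name each table after its <caption>
  setNames(page %>% html_nodes("caption") %>% html_text()) %>%
  map(~ mutate(.x,
    `Tax on this income` = gsub(",", "", `Tax on this income`),
    # cumulative tax: the dollar amount at the start of the string
    cumm_tax_amt = str_extract(`Tax on this income`, "(?<=^\\$)\\d+") %>% as.numeric(),
    # rate: the number immediately before "c" (cents per $1)
    tax_rate = str_extract(`Tax on this income`, "\\d+(\\.\\d+)?(?=\\s*c)") %>% as.numeric(),
    # threshold: the dollar amount at the end of the string
    threshold = str_extract(`Tax on this income`, "(?<=\\$)\\d+$") %>% as.numeric()
  )) %>%
  # drop rows that have no threshold
  map(~ drop_na(.x, threshold)) %>%
  # replace any remaining NA with 0 (mutate_each()/funs() are deprecated)
  map(function(x) mutate_all(x, ~ replace(., is.na(.), 0)))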
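To make the three extractions easier to check without hitting the website, here is a minimal sketch that applies them to a single string. The sample string is made up to resemble one "Tax on this income" cell and is not taken from the page; the tax_rate pattern is written with an escaped decimal point.

```r
library(stringr)

# Hypothetical sample value, made up to resemble a "Tax on this income" cell.
sample_txt <- "$17,547 plus 37c for each $1 over $80,000"
clean_txt <- gsub(",", "", sample_txt)  # drop thousands separators

# Dollar amount at the very start of the string
cumm_tax_amt <- as.numeric(str_extract(clean_txt, "(?<=^\\$)\\d+"))
# Number (optionally with decimals) immediately before "c"
tax_rate <- as.numeric(str_extract(clean_txt, "\\d+(\\.\\d+)?(?=\\s*c)"))
# Dollar amount at the very end of the string
threshold <- as.numeric(str_extract(clean_txt, "(?<=\\$)\\d+$"))

c(cumm_tax_amt = cumm_tax_amt, tax_rate = tax_rate, threshold = threshold)
```

Rows whose text has no trailing dollar amount would give an NA threshold here, which is why the chunk above drops them with drop_na().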
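My attempt
This code does create the data frame I want, but it does not include the name of the list element on each observation, which is what I need.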
map_df(pit_sch, `[`, c("Taxable income", "Tax on this income", "cumm_tax_amt", "tax_rate", "threshold"))
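What success looks like
The output should include the name of the list element each row came from, giving these columns: "table_name", "Taxable income", "Tax on this income", "cumm_tax_amt", "tax_rate", "threshold"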
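To illustrate the target shape, here is a minimal sketch on a toy list, assuming dplyr's bind_rows() with its .id argument is an acceptable way to carry the names across. The list toy_sch, its element names, and its values are made up for illustration and only stand in for pit_sch.

```r
library(dplyr)

# Toy stand-in for pit_sch: names and values are hypothetical.
toy_sch <- list(
  `Table A` = tibble(threshold = c(100, 200), tax_rate = c(10, 20)),
  `Table B` = tibble(threshold = 300, tax_rate = 30)
)

# .id creates a column holding the name of the list element each row came from.
collapsed <- bind_rows(toy_sch, .id = "table_name")
collapsed
```

purrr's map_df() also takes an .id argument, so the same idea may carry over to the attempt above by adding .id = "table_name" to that call; I have not verified this against the live tables.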
【Comments】: