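Those are non-standard column names (they start with a digit), so they have to be wrapped in backticks. Also, the last column name is 5/11/20, not 5/10/22 (assuming the format is month/day/year, 5/10/22 would be two years in the future).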
library(dplyr)
library(tidyr)
out <- covid_confirmed_cases %>%
    pivot_longer(cols = c(`1/22/20`:`5/11/20`),
                 names_to = "date", values_to = "cases")
out %>%
    slice(1500:1510)
# A tibble: 11 x 6
#   `Province/State` `Country/Region`   Lat  Long date    cases
#   <chr>            <chr>            <dbl> <dbl> <chr>   <int>
# 1 Tasmania         Australia        -41.5  146. 3/18/20    10
# 2 Tasmania         Australia        -41.5  146. 3/19/20    10
# 3 Tasmania         Australia        -41.5  146. 3/20/20    10
# 4 Tasmania         Australia        -41.5  146. 3/21/20    16
# 5 Tasmania         Australia        -41.5  146. 3/22/20    22
# 6 Tasmania         Australia        -41.5  146. 3/23/20    28
# 7 Tasmania         Australia        -41.5  146. 3/24/20    28
# 8 Tasmania         Australia        -41.5  146. 3/25/20    36
# 9 Tasmania         Australia        -41.5  146. 3/26/20    47
#10 Tasmania         Australia        -41.5  146. 3/27/20    47
#11 Tasmania         Australia        -41.5  146. 3/28/20    62
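Since the reshaped `date` column is still character, a possible follow-up is to parse it. This is only a sketch under the same month/day/year assumption as above; `date_chr` and `date_parsed` are hypothetical names used for illustration.

```r
# Hypothetical sample values in the same style as the column names
date_chr <- c("1/22/20", "3/18/20", "5/11/20")

# Assumes month/day/two-digit year; %y maps 00-68 to 20xx
date_parsed <- as.Date(date_chr, format = "%m/%d/%y")

print(date_parsed)
```

The same `as.Date()` call could be applied to `out$date` (for example inside `mutate()`) if a proper Date column is needed.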
Note: here we assume that the OP read the dataset with check.names = FALSE, if it was read with read.csv/read.table.
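To see why this matters, here is a small sketch with a made-up inline CSV (`csv_text` is a hypothetical stand-in for the real file). With the default `check.names = TRUE`, `read.csv` rewrites the headers into syntactic names, so the backquoted date names would no longer exist.

```r
# Made-up two-line CSV with headers like those in the real dataset
csv_text <- "Country/Region,1/22/20,1/23/20\nAustralia,0,1"

# Default behaviour: names are passed through make.names()
mangled <- read.csv(text = csv_text, check.names = TRUE)

# Names are kept exactly as written in the header
kept <- read.csv(text = csv_text, check.names = FALSE)

print(names(mangled))
print(names(kept))
```

With the mangled names, a selection such as `X1.22.20:X1.23.20` would be needed instead of the backquoted form.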
We can also use matches
covid_confirmed_cases %>%
    pivot_longer(cols = matches("^\\d+/\\d+/\\d+$"),
                 names_to = "date", values_to = "cases")
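Or with the column index (the date columns start at position 5, after Province/State, Country/Region, Lat and Long)
covid_confirmed_cases %>%
    pivot_longer(cols = 5:ncol(.),
                 names_to = "date", values_to = "cases")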
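As a quick check that the three ways of selecting columns agree, here is a minimal sketch on a made-up table. `toy` and its values are hypothetical; it only mimics the layout of the real data, with four identifier columns followed by date columns, so the date columns begin at position 5.

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature of the wide dataset (values are invented)
toy <- tibble(
  `Province/State` = c("A", "B"),
  `Country/Region` = c("X", "Y"),
  Lat = c(1.5, 2.5),
  Long = c(10, 20),
  `1/22/20` = c(0L, 1L),
  `1/23/20` = c(2L, 3L)
)

# 1. Range of backquoted names
by_range <- toy %>%
  pivot_longer(cols = `1/22/20`:`1/23/20`,
               names_to = "date", values_to = "cases")

# 2. Regular expression on the names
by_regex <- toy %>%
  pivot_longer(cols = matches("^\\d+/\\d+/\\d+$"),
               names_to = "date", values_to = "cases")

# 3. Column positions (date columns start at position 5 in this toy table)
by_index <- toy %>%
  pivot_longer(cols = 5:ncol(.),
               names_to = "date", values_to = "cases")

print(by_range)
```

The `matches()` variant is the least fragile of the three, since it does not depend on knowing the last date or the number of identifier columns.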
data
covid_confirmed_cases <- read.csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv", stringsAsFactors = FALSE, check.names = FALSE)