Dason beat me to it, but here is a different flavor of the same conceptual approach:
library(plyr)
# Use regex to get the prefixes
# Pulls any word characters (letters, digits, or underscores: "\\w*") from the beginning of the string ("^")
# to the first period ("\\.") into a group, then matches all the remaining
# characters (".*"). Then replaces with the first group ("\\1" = "(\\w*)").
# In other words, it matches the whole string but replaces with only the prefix.
prefixes <- unique(gsub(pattern = "^(\\w*)\\..*",
                        replacement = "\\1",
                        x = names(df)))
# Subset to the variables that match the prefix
# Iterates over the prefixes and subsets based on the variable names that
# match that prefix
llply(prefixes, .fun = function(x){
  y <- subset(df, select = names(df)[grep(names(df),
                                          pattern = paste("^", x, sep = ""))])
})
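Since `df` is not defined in this answer, here is a minimal self-contained sketch of the same idea in base R (no plyr). The column names `USA.a`, `USA.b`, `FRA.a`, `FRA.b` and the object name `by_prefix` are made up for illustration; substitute your own data.

```r
# Hypothetical data: these column names are assumed for illustration only
set.seed(1)
df <- data.frame(USA.a = rnorm(10), USA.b = rnorm(10),
                 FRA.a = rnorm(10), FRA.b = rnorm(10))

# Same regex as above: keep only the text before the first period
prefixes <- unique(sub(pattern = "^(\\w*)\\..*",
                       replacement = "\\1",
                       x = names(df)))

# Base-R counterpart of the llply() call: one data frame per prefix
by_prefix <- lapply(prefixes, function(p) {
  df[, grep(paste0("^", p), names(df)), drop = FALSE]
})
names(by_prefix) <- prefixes

# Show the top-level structure: a list with one data frame per prefix
str(by_prefix, max.level = 1)
```

Naming the list elements by prefix lets you pull out a group directly, e.g. `by_prefix$FRA`.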
I think these regexes should still give you the right results even if there are "." characters later in the variable names:
unique(gsub(pattern = "^(\\w*)\\..*",
            replacement = "\\1",
            x = c(names(df), "FRA.c.blahblah")))
Or if the prefix appears later in a variable name:
# Add a USA variable with "FRA" in it
df2 <- data.frame(df, USA.FRANKLINS = rnorm(10))
prefixes2 <- unique(gsub(pattern = "^(\\w*)\\..*",
                         replacement = "\\1",
                         x = names(df2)))
llply(prefixes2, .fun = function(x){
  y <- subset(df2, select = names(df2)[grep(names(df2),
                                            pattern = paste("^", x, sep = ""))])
})
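One case the `"^"` anchor alone does not cover is a prefix that is itself the start of a longer prefix. A small sketch, using made-up names (`nms`, `loose`, `strict` and the column names are hypothetical), suggests that requiring the literal period after the prefix may help there:

```r
# Hypothetical column names: "FRA" is a prefix and also the start of "FRANCE"
nms <- c("FRA.a", "FRANCE.a", "USA.FRANKLINS")

# Anchoring only at the start also picks up "FRANCE.a"
loose <- grep("^FRA", nms, value = TRUE)

# Requiring the literal period right after the prefix keeps only "FRA.a"
strict <- grep("^FRA\\.", nms, value = TRUE)

print(loose)
print(strict)
```

If your prefixes can overlap like this, `paste("^", x, "\\.", sep = "")` in the `grep()` call would be the corresponding change; note it assumes every variable name contains a period after its prefix.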