【Posted】: 2019-06-14 23:25:33
【Question】:
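Python coder here, but I need to work in R with some shapefiles that have attached data. I need to convert a column of datetime data from its current format to a regular datetime by applying a simple function to it. This is easy in Python, but when I use apply and lapply in R I keep getting strange results (details below). The answer is probably fairly simple, since I'm far less familiar with R than with Python, so any help is much appreciated.
R version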
# df is an R data.frame with 54 columns. The only one relevant for this
# question is df["ISSUE_DATE"], which is currently a list of 14-digit
# integers. I need to convert it to a regular datetime.
df$ISSUE_DATE[0:5]
[1] 20011001000000 20030228000000 19990910000000 20131108000000
[5] 19970930000000
fix_date = function(x){
string_x = toString(x)
datestr = substr(string_x, 0, 8)
result = as.Date(datestr, "%Y%m%d")
return(result)
}
df$fixed_dates = lapply(df$ISSUE_DATE, fix_date)
# This returns a column with the same value - fix_date(df$ISSUE_DATE[1])
# - in every row:
df$fixed_dates[0:5]
[1] "2001-10-01" "2001-10-01" "2001-10-01" "2001-10-01"
[5] "2001-10-01"
# What I want instead is the result of fix_date applied to each value in
# df$ISSUE_DATE as the values of df$fixed_dates:
df$fixed_dates[0:5]
[1] "2001-10-01" "2003-02-28" "1999-09-10" "2013-11-08"
[5] "1997-09-30"
What this would look like in Python:
df["fixed_dates"] = df["ISSUE_DATE"].apply(fix_date)
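For reference, the Python line above relies on a `fix_date` that is not shown in the question. A minimal sketch of what it might look like, assuming each value is a 14-digit `YYYYMMDDhhmmss` number and only the date part is wanted (the function body and the `issue_dates` sample list are illustrative, not taken from the original code):

```python
from datetime import date, datetime


def fix_date(x):
    """Convert a YYYYMMDDhhmmss value (int or str) to a datetime.date."""
    # Keep only the first 8 characters: YYYYMMDD.
    datestr = str(x)[:8]
    return datetime.strptime(datestr, "%Y%m%d").date()


# Sample values copied from the R output above (assumed to be plain integers).
issue_dates = [
    20011001000000,
    20030228000000,
    19990910000000,
    20131108000000,
    19970930000000,
]

# Element-wise application, the same thing .apply(fix_date) does on a column.
fixed_dates = [fix_date(v) for v in issue_dates]
print([d.isoformat() for d in fixed_dates])
```

Each value is converted on its own, which is the per-row behaviour I am trying to reproduce in R.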
【Comments】:
Tags: python r dataframe apply lapply