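【Question title】: Transform a character column into date
【Posted】: 2021-10-28 06:23:38
【Question】:

What is the simplest way to turn the index column (the row names) of my df into an actual column, and then convert it from character to Date?

Here is an example: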

library(magrittr)  # provides %>%
df <-
structure(c(" 0.076596889", "-0.004217772", "0.19551752", "-0.11599534", 
    "0.06203595", "0.011884905", "-0.17142789", "-0.04094597", NA, 
    "-0.170884035", "0.14163907", "-0.160324552", " 0.005193002", 
    "0.094337323", " 0.054004389", " 0.342200896", "-0.582262572", 
    "0.99211563", "-1.80983062", "1.56762552", "0.280851484", "-1.04618660", 
    "-0.38132434", NA, "-1.144843376", "0.90148871", "-1.797113427", 
    "-0.604269203", "1.339789966", " 0.103927382", NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, " 0.141558015", " 0.005554185", 
    "0.32162836", "-0.16310463", "0.09384922", "0.028853136", "-0.29074993", 
    "-0.07721910", NA, "-0.320260458", "0.27066138", "-0.258998975", 
    " 0.022159433", "0.179980305", " 0.092695990", " 0.131799599", 
    " 0.005539599", "0.30970765", "-0.16474883", "0.09485784", "0.029036801", 
    "-0.27107662", "-0.06387477", NA, "-0.271018769", "0.23147817", 
    "-0.219497824", " 0.020039532", "0.154692959", " 0.079533669", 
    "-0.007113455", "-0.338518690", "0.50742611", "-0.29544436", 
    "0.49416319", "0.067917866", "-0.14447349", " 0.19500609", NA, 
    "-0.758730659", "0.45544155", "-0.730001301", "-0.155946684", 
    "0.585023578", " 0.050246438", "-0.007177297", "-0.334579290", 
    "0.50032162", "-0.29218441", "0.48883099", "0.067209227", "-0.14334411", 
    " 0.19332042", NA, "-0.744300043", "0.44290710", "-0.705396078", 
    "-0.149517933", "0.562076971", " 0.048731481", " 0.020630553", 
    " 0.009395069", "0.02771509", " 0.00700098", "0.01903145", "0.009515561", 
    " 0.01362812", " 0.01825617", NA, " 0.009106663", "0.01591710", 
    " 0.007718648", " 0.008820632", "0.007500167", "-0.004636657", 
    " 0.033359185", " 0.020794132", "0.03509503", " 0.01674505", 
    "0.02774480", "0.016422813", " 0.01995413", " 0.02726850", NA, 
    " 0.014408007", "0.02654718", " 0.010979024", " 0.009351808", 
    "0.014444094", "-0.010921319", " 0.159250401", "-0.987571576", 
    "0.75983437", "-2.63835286", "2.19653017", "0.254926885", "-1.00150236", 
    " 0.13780102", NA, "-0.359317531", "0.40311397", "-2.445576883", 
    "-0.351301609", "2.113486497", "-1.003014684"), .Dim = c(15L, 
    10L), .Dimnames = list(c("Retorno do fechamento em 1 dia (de 24Ago20 até 25Ago20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 25Ago20 até 26Ago20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 27Ago20 até 28Ago20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 28Ago20 até 31Ago20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 31Ago20 até 01Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 01Set20 até 02Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 02Set20 até 03Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 03Set20 até 04Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 04Set20 até 07Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 07Set20 até 08Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 08Set20 até 09Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 09Set20 até 10Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 10Set20 até 11Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 11Set20 até 14Set20) Em moeda orig ajust p/ prov", 
    "Retorno do fechamento em 1 dia (de 14Set20 até 15Set20) Em moeda orig ajust p/ prov"
    ), c("Absolute Hedge Fc de FI Mult", "Absolute Pace Long Biased Fc FIA", 
    "Absolute Svp Prev Fc FI Mult", "Absolute Vertex II Fc FI Mult", 
    "Absolute Vertex S Fc FI Mult", "Ace Capital Fcfi Mult", "Ace Capital S Fc Mult", 
    "Af Invest FI RF Cred Priv Geraes", "Af Invest Geraes 30 FI RF Cred Priv", 
    "Af Invest Minas FIA"))) %>% 
as.data.frame()

What I would like to see: Row1Col1 = 2020/08/26; Row2Col1 = 2020/08/28; Row3Col1 = 2020/08/31, and so on.

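【Comments】:

  • Yes, I used dput on the data frame, so I just had to put a %>% as.data.frame after the code.
  • Do these "Retorno do fechamento em 1 dia (de 24Ago20 até 25Ago20) Em moeda orig ajust p/ prov" texts carry anything relevant to Row1Col1, Row2Col1, etc. other than the dates? Sorry, I am not familiar with the language.
  • Yes. I need to strip the characters so that maybe only '25ago20' is left, and then somehow convert that with as.Date().
  • Whether as.Date('25ago20', "%d%b%y") returns NA also depends on your locale settings. With an English locale it would be as.Date('25aug20', "%d%b%y"), which gives [1] "2020-08-25".
  • Maybe I have to create a lookup list like "Jan = 01; Fev = 02; Mar = 03...".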
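
The lookup-list idea from the last comment can be sketched without changing the locale at all. The helper name `parse_pt_date` is hypothetical, and the code assumes that every string has the fixed shape ddMmmyy and that all two-digit years fall in the 2000s; neither assumption comes from the thread.

```r
# Portuguese month abbreviations mapped to month numbers
pt_months <- c(jan = 1L, fev = 2L, mar = 3L, abr = 4L, mai = 5L, jun = 6L,
               jul = 7L, ago = 8L, set = 9L, out = 10L, nov = 11L, dez = 12L)

# Hypothetical helper: parse strings such as "25Ago20" independently of LC_TIME
parse_pt_date <- function(x) {
  day <- substr(x, 1, 2)
  # case-insensitive lookup; unknown abbreviations become NA
  mon <- unname(pt_months[tolower(substr(x, 3, 5))])
  yr  <- substr(x, 6, 7)
  # assumption: two-digit years belong to 20xx
  iso <- sprintf("20%s-%02d-%s", yr, mon, day)
  # an explicit format makes unparseable strings return NA instead of an error
  as.Date(iso, format = "%Y-%m-%d")
}

parse_pt_date(c("25Ago20", "01Set20"))
# [1] "2020-08-25" "2020-09-01"
```

Because the month names are resolved by the lookup vector, the result should be the same on machines with an English, Portuguese, or any other locale.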

Tags: r regex dataframe date


【Solution 1】:

Clean up the strings:

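## dd is the data frame from the question (named df there)
ss <- rownames(dd)
## ^ = beginning of string; .* = any run of characters,
## so this removes everything up to and including "até "
ss <- gsub("^.*até ", "", ss)
## ) = the closing parenthesis; .* = any run of characters; $ = end of string,
## so this removes the parenthesis and everything after it
ss <- gsub(").*$", "", ss)
## the row names are no longer needed once the dates are extracted
rownames(dd) <- NULL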
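
To make the two substitutions concrete, here is one row name from the question traced through them. This is only an illustrative sketch; the closing parenthesis is written as `\\)` here to make the literal match explicit, which is a choice of this example and not of the answer.

```r
s <- "Retorno do fechamento em 1 dia (de 24Ago20 até 25Ago20) Em moeda orig ajust p/ prov"

# Step 1: drop everything from the start of the string through "até "
s1 <- gsub("^.*até ", "", s)
s1
# [1] "25Ago20) Em moeda orig ajust p/ prov"

# Step 2: drop the closing parenthesis and everything after it
s2 <- gsub("\\).*$", "", s1)
s2
# [1] "25Ago20"
```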

Convert:

Sys.setlocale("LC_TIME", "pt_BR.utf8")
dd2 <- data.frame(date = as.Date(ss, "%d%b%y"), dd)

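My dates are one day earlier than the results you specified; they match the end date in each string. If you want to shift them, you can add one.

(I picked the first Portuguese locale I found: I don't know whether the date abbreviations differ between Portugal and Brazil??)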
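
Locale names differ between operating systems, so the `Sys.setlocale()` call above may fail on some machines. A hedged sketch of a more defensive version follows: it tries a few candidate names, parses, and then restores the previous setting. The function name `parse_with_pt_locale` is hypothetical and the candidate list is an assumption; check which locales are actually installed on your system.

```r
# Assumed candidate names; availability depends on the operating system
pt_candidates <- c("pt_BR.utf8", "pt_BR.UTF-8", "pt_PT.UTF-8", "Portuguese_Brazil.1252")

# Hypothetical helper: parse with a Portuguese LC_TIME, then restore the old one
parse_with_pt_locale <- function(x, fmt = "%d%b%y") {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  for (loc in pt_candidates) {
    # Sys.setlocale() returns "" (and warns) when the request cannot be honoured
    ok <- suppressWarnings(Sys.setlocale("LC_TIME", loc))
    if (nzchar(ok)) return(as.Date(x, fmt))
  }
  warning("no Portuguese locale available; returning NA")
  rep(as.Date(NA), length(x))
}

end_dates <- parse_with_pt_locale(c("25Ago20", "26Ago20"))
end_dates + 1  # shift by one day, as mentioned above
```

Restoring the locale in `on.exit()` keeps the change from leaking into the rest of the session, which matters if other code formats or parses dates afterwards.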

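【Comments】:

  • What do "^.*" and ").*$" mean in this syntax?
  • They are regular expressions used for matching. I added some detail.
  • Did this and got the result I wanted: [y % select(Data, Everything())]

【Solution 2】:
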
library(tidyverse)
# remotes::install_github("vbfelix/relper")
library(relper)
library(lubridate)

df %>% 
  #Transform data in data.frame
  as.data.frame() %>% 
  # Add rownames as a variable named date
  rownames_to_column(var = "date") %>% 
  # Remove characters from date
  mutate(
    date = relper::str_select(date,after = "\\(de ",before = "\\)")
  ) %>%
  # Separate date into two variables based on the string " até "
  separate(col = date,into = c("date1","date2"),sep = " até ") %>% 
  # Transform those dates from character to date
  mutate(
    across(.cols = starts_with("date"),.fns = lubridate::dmy)
  ) %>% 
  glimpse()
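
`relper` is installed from GitHub, so if adding that dependency is not an option, the extraction step can be approximated in base R. This is a sketch of the same idea (pull out the two tokens between "(de " and ")"), not a drop-in replacement for `relper::str_select`; the two sample row names are shortened copies of those in the question, and the results are left as character strings.

```r
rn <- c("Retorno do fechamento em 1 dia (de 24Ago20 até 25Ago20) Em moeda orig ajust p/ prov",
        "Retorno do fechamento em 1 dia (de 31Ago20 até 01Set20) Em moeda orig ajust p/ prov")

# Two capture groups: the token after "(de " and the token after " até "
pattern <- "\\(de ([[:alnum:]]+) até ([[:alnum:]]+)\\)"
parts <- regmatches(rn, regexec(pattern, rn))

# Each element of parts is c(full match, group 1, group 2);
# indexing a non-match yields NA, so the result stays rectangular
dates <- data.frame(
  date1 = vapply(parts, `[`, character(1), 2),
  date2 = vapply(parts, `[`, character(1), 3)
)
dates
#     date1   date2
# 1 24Ago20 25Ago20
# 2 31Ago20 01Set20
```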

【Comments】:

【Solution 3】:

I could not test this because my local month names are incompatible:

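    library(dplyr)
    library(tidyr)
    library(tibble)     # rownames_to_column()
    library(stringr)    # str_extract()
    library(lubridate)
    df %>% 
        rownames_to_column() %>% 
        as_tibble() %>% 
        # keep only the text between the parentheses, e.g. "de 24Ago20 até 25Ago20"
        mutate(rowname = str_extract(rowname, "(?<=\\().*(?=\\))")) %>% 
        # split on spaces into "de", the start date, "até", the end date
        separate(rowname, c("x", "start", "y", "end"), sep = ' ') %>% 
        select(-x, -y) %>% 
        # dmy() reads month names using the current LC_TIME locale by default
        mutate(across(c(start, end), dmy))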
    

【Comments】:
