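【Question title】: Reorder data frame columns whose names contain dates in date order?
【Posted】: 2019-03-22 01:21:18
【Question】:

I have a reactive data frame whose column names change, and the Month.Year columns come out of order. How can I put "Current" first among them, followed by the Month.Year columns from earliest to latest? Below is how the columns are currently ordered and how I would like them arranged.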

print(colnames(df))
#[1] "ProductCategoryDesc" "RegionDesc"          "SourceDesc"          "Report"             
#[5] "Apr.2019"            "Current"             "Feb.2019"            "Jun.2019"           
#[9] "Mar.2019"            "May.2019"            "Mar.2020"

#the order I want is below
#[1] "ProductCategoryDesc" "RegionDesc"          "SourceDesc"          "Report"             
#[5] "Current"             "Feb.2019"            "Mar.2019"            "Apr.2019"           
#[9] "May.2019"            "Jun.2019"            "Mar.2020"

#####################################################################
#another example of the df
print(colnames(df))

#[1] "ProductCategoryDesc" "RegionDesc"          "SourceDesc"          "Report"             
#[5] "Apr.2019"            "Current"             "Feb.2019"            "Jun.2019"           
#[9] "Mar.2019"            "May.2019"            "Sep.2019"

#the order I want is below
#[1] "ProductCategoryDesc" "RegionDesc"          "SourceDesc"          "Report"             
#[5] "Current"             "Feb.2019"             "Mar.2019"            "Apr.2019"           
#[9] "May.2019"            "Jun.2019"             "Sep.2019"

Here is some information about what the df looks like:

print(dput(droplevels(head(d3))))
#below is the output

structure(list(
  ProductCategoryDesc = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
    .Label = "CN AMMONIA", class = "factor"),
  RegionDesc = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
    .Label = "AB REG 2 UPPER MIDWEST", class = "factor"),
  SourceDesc = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
    .Label = "CN-SD, WATERTOWN LIQUID", class = "factor"),
  Report = structure(1:6, .Label = c("InventoryAvailabletoShip",
    "NetCashPosition", "NetMarketPositionTotal", "NonDirectShipPurchase",
    "TotalDirectShips", "TotalNonDirectShips"), class = "factor"),
  Apr.2019 = c(0, 0, 0, 0, 0, 0), Current = c(0, 0, 0, 0, 0, 0),
  Feb.2019 = c(0, 0, 240, 240, 0, 240), Jun.2019 = c(0, 0, 0, 0, 0, 0),
  Mar.2019 = c(0, 0, 0, 0, 0, 0), May.2019 = c(0, 0, 0, 0, 0, 0)),
  sorted = c("ProductCategoryDesc", "RegionDesc", "SourceDesc", "Report"),
  row.names = c(NA, -6L), .internal.selfref = <pointer: 0x0000000000211ef0>,
  class = c("data.table", "data.frame"))
   ProductCategoryDesc             RegionDesc              SourceDesc                   Report Apr.2019
1:          CN AMMONIA AB REG 2 UPPER MIDWEST CN-SD, WATERTOWN LIQUID InventoryAvailabletoShip        0
2:          CN AMMONIA AB REG 2 UPPER MIDWEST CN-SD, WATERTOWN LIQUID          NetCashPosition        0
3:          CN AMMONIA AB REG 2 UPPER MIDWEST CN-SD, WATERTOWN LIQUID   NetMarketPositionTotal        0
4:          CN AMMONIA AB REG 2 UPPER MIDWEST CN-SD, WATERTOWN LIQUID    NonDirectShipPurchase        0
5:          CN AMMONIA AB REG 2 UPPER MIDWEST CN-SD, WATERTOWN LIQUID         TotalDirectShips        0
6:          CN AMMONIA AB REG 2 UPPER MIDWEST CN-SD, WATERTOWN LIQUID      TotalNonDirectShips        0
   Current Feb.2019 Jun.2019 Mar.2019 May.2019
1:       0        0        0        0        0
2:       0        0        0        0        0
3:       0      240        0        0        0
4:       0      240        0        0        0
5:       0        0        0        0        0
6:       0      240        0        0        0

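【Question comments】:

  • It would be great if you used dput instead of str, so that the data can be copied and pasted. Since you have some factors, dput(droplevels(head(d3))) should work well.

Tags: r dataframe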


【Solution 1】:

We can convert the names to dates where possible and order the columns by them:

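# Column names as they currently appear
x <- c("ProductCategoryDesc", "RegionDesc", "SourceDesc", "Report",
       "Apr.2019", "Current", "Feb.2019", "Jun.2019",
       "Mar.2019", "May.2019", "Mar.2020")

# Parse "Mon.YYYY" names as the first day of that month; other names become NA
# (%b matches month abbreviations in the current locale)
dates <- as.Date(paste0("01.", x), "%d.%b.%Y")
# Give the non-date names a very early date so that they sort first
x <- x[order(replace(dates, is.na(dates), "0000-01-01"))]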
# [1] "ProductCategoryDesc" "RegionDesc"          "SourceDesc"          "Report"             
# [5] "Current"             "Feb.2019"            "Mar.2019"            "Apr.2019"           
# [9] "May.2019"            "Jun.2019"            "Mar.2020"         
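
Note that `%b` in `as.Date` matches month abbreviations in the session's locale, so the parse above can return `NA` for every column under a non-English locale. A minimal locale-independent sketch, assuming the date columns have exactly the form `Mon.YYYY` with English abbreviations, uses base R's `month.abb` constant instead. The helper name `order_month_year` is hypothetical and not part of the original answer:

```r
# Hypothetical helper: returns the names with non-date names first
# (in their original order), followed by "Mon.YYYY" names in date order.
order_month_year <- function(nms) {
  # TRUE for names made of an English month abbreviation, a dot and a 4-digit year
  is_date <- grepl("^[A-Z][a-z]{2}\\.[0-9]{4}$", nms) &
    substr(nms, 1, 3) %in% month.abb
  month <- match(substr(nms[is_date], 1, 3), month.abb)  # 1..12
  year  <- as.integer(substr(nms[is_date], 5, 8))
  # Sort the date names by year, then by month, and append them to the rest
  c(nms[!is_date], nms[is_date][order(year, month)])
}

x <- c("ProductCategoryDesc", "RegionDesc", "SourceDesc", "Report",
       "Apr.2019", "Current", "Feb.2019", "Jun.2019",
       "Mar.2019", "May.2019", "Mar.2020")
sorted_names <- order_month_year(x)
print(sorted_names)
```

Because the helper only looks at the names it is given, it can be called each time the reactive data frame changes, regardless of how many Month.Year columns are present.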

Your sorted data frame:

df[x]
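
The `dput` output in the question shows that the object is a data.table, and for a data.table a character vector in the first position of `[` is treated as a row lookup rather than a column selection, so `df[x]` may not reorder the columns there. A sketch of two alternatives, assuming the data.table package is installed; the small table `dt`, its values and the vector `new_order` are made up for illustration:

```r
library(data.table)

# Illustrative table with the columns in a scrambled order
dt <- data.table(Report = c("a", "b"),
                 Apr.2019 = c(0, 0),
                 Current = c(0, 0),
                 Feb.2019 = c(240, 0))

new_order <- c("Report", "Current", "Feb.2019", "Apr.2019")

# Option 1: a reordered copy; dt itself is left unchanged
dt_copy <- dt[, new_order, with = FALSE]

# Option 2: reorder the columns of dt in place (by reference)
setcolorder(dt, new_order)

print(names(dt_copy))
print(names(dt))
```

If the object has already been converted with `as.data.frame()`, plain `df[x]` selects columns as the answer describes.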

【Comments】:

  • I just fixed it; I meant "Current" first, then the Month.Year columns from earliest to latest.
【Solution 2】:

Enjoy!

# Reorder columns in the data frame by listing the names in the desired order
df <- df[c("ProductCategoryDesc",
           "RegionDesc",
           "SourceDesc",
           "Report",
           "Current",
           "Feb.2019",
           "Mar.2019",
           "Apr.2019",
           "May.2019",
           "Jun.2019",
           "Mar.2020")]

To reorder the data frame's columns by position (index) instead, try:

df <- df[c(1, 3, 2)]

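If you can predict which index of the data frame will hold the newest column each time data is added, you can script fetching that column and moving it wherever you want. For example, if it is the last column and you want to move it to the first position, you can use length(df), which gives the number of columns, as the index of the last element:

df <- df[c(length(df), 1, 3, 2)]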
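
The index vector above is written out for a fixed number of columns. A minimal sketch of the same idea for any number of columns, assuming a plain data.frame (not a data.table); the helper name `move_last_to_front` and the toy data are hypothetical:

```r
# Hypothetical helper: move the last column of a data frame to the front,
# whatever the current number of columns is.
move_last_to_front <- function(d) {
  n <- ncol(d)
  if (n < 2) return(d)       # zero or one column: nothing to move
  d[c(n, seq_len(n - 1))]    # last column first, then columns 1..(n - 1)
}

d <- data.frame(a = 1:2, b = 3:4, c = 5:6)
moved <- move_last_to_front(d)
print(names(moved))
```

This only helps when the position of the new column is predictable; when the set of Month.Year columns changes, ordering by the names themselves is likely to be more reliable.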

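【Comments】:

  • This won't work, because the number of columns in the data frame keeps changing, which is why I showed two example data frames.
  • Why does the number of columns keep changing? Instead of using names to specify the order, try using indices.
  • It is a reactive data frame in an R Shiny app.
  • See my latest edits; I hope they help. If you can anticipate where the newest data will appear each time the data frame changes, what I wrote should work. If for some reason it is always the second-to-last column, try length(df)-1, which should work as the index for reordering.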