【Question title】: Sort list of strings by order of numeric parts
【Posted】: 2017-11-20 02:34:01
【Question】:

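I have a list of file-path vectors that I want to sort in ascending order by the first path in each list element. The list looks like this: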

$`HG-U133_Plus_2`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22"
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/23"
[3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/24"
[4] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/25"

$`HG-U133A`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0"
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/1"

$`HG-U133A_2`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6" 
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7" 
[3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8" 
[4] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/9" 
[5] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/10"

To sort it, I tried the following code:

lpath[order(sapply(lpath, function(x) sub('.*\\/', '', x[[1]][1]), simplify=TRUE))]

lpath[order(sapply(lpath, function(x) x[1], simplify=TRUE))]

The result is not what I expected, as shown below. The first list element is in the right place, but the second and third are not.

$`HG-U133A`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0"
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/1"


$`HG-U133_Plus_2`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22"
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/23"
[3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/24"
[4] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/25"


$`HG-U133A_2`
 [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6" 
 [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7" 
 [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8" 
 [4] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/9" 
 [5] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/10"

Expected result:

$`HG-U133A`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0"
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/1"


$`HG-U133A_2`
 [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6" 
 [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7" 
 [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8" 
 [4] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/9" 
 [5] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/10"


$`HG-U133_Plus_2`
[1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22"
[2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/23"
[3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/24"
[4] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/25"

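【Comments】:

  • What if you combined all the list elements into one data.frame, added an identifier for where each one came from, sorted them, and then split them by the identifier again? If you make a minimal, easy-to-paste reproducible example, I can show you.
  • @RomanLuštrik Could you post your solution based on the sample data below? I would like to know whether there is a simple answer in base R. Thanks.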

Tags: r list sorting dataframe


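【Solution 1】:

If your paths all follow the same pattern and only the trailing number changes, you can use mixedorder from the gtools package; otherwise, consider using gsub with a regular expression.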

library(gtools)  # provides mixedorder()
L[mixedorder(sapply(L, function(x) x[1], simplify=TRUE), decreasing=FALSE)]

L is the list containing your paths.

Example:

For the sample data provided below, this is the result:

#Original List before sorting:
# > L
# [[1]] 
# [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22" 
# [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/30" 
#  
# [[2]] 
# [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0" 
# [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/9" 
# [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/5" 
#  
# [[3]] 
# [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6" 
# [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7" 
# [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8" 
# 

List sorted by the first element of each vector:

L[mixedorder(sapply(L, function(x) x[1], simplify=TRUE), decreasing=FALSE)]
# [[1]] 
# [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0" 
# [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/9" 
# [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/5" 
#  
# [[2]] 
# [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6" 
# [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7" 
# [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8" 
#  
# [[3]] 
# [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22" 
# [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/30" 
# 

Sample data:

L <-
 list(c("C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22",  
 "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/30" 
 ), c("C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0",  
 "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/9",  
 "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/5" 
 ), c("C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6",  
 "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7",  
 "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8"))
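The gsub-and-regular-expression alternative mentioned at the top of this answer can be sketched in base R roughly as follows. This is only an illustration, not code from the answer: `paths` is a hypothetical stand-in for L with shortened paths, and the sketch assumes every first path ends in "/" followed by digits. It also shows the likely reason the attempt in the question gave an unexpected order: the extracted parts are character strings, so order() compares them as text unless they are converted to numbers.

```r
# Hypothetical stand-in for the list L, with shortened paths for readability
paths <- list(
  c("tmp/abc/22", "tmp/abc/30"),
  c("tmp/abc/0", "tmp/abc/9", "tmp/abc/5"),
  c("tmp/abc/6", "tmp/abc/7", "tmp/abc/8")
)

# First path of every list element, as a plain character vector
first_path <- vapply(paths, function(x) x[1], character(1))

# Keep only the digits that follow the last "/"
last_part <- gsub(".*/(\\d+)$", "\\1", first_path)

# Ordering the strings directly compares them as text, so "22" sorts before "6"
text_order <- order(last_part)

# Converting to numeric first gives the intended ordering 0, 6, 22
key <- as.numeric(last_part)
sorted <- paths[order(key)]
```

Only the order of the list elements changes here; the paths inside each vector stay in their original order.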

【Comments】:

    【Solution 2】:

    Here is a base R solution.

    L <-
      list("HG-U133_Plus_2" = c("C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22",  
             "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/30"), 
      "HG-U133A" = c("C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0",  
           "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/1",  
           "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/2"), 
      "HG-U133A_2" = c("C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6",  
           "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7",  
           "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8"))
    
    # map which element comes from which list element
    L <- mapply(FUN = function(x, y) {
      data.frame(x, names = y)
    }, x = L, y = as.list(names(L)), SIMPLIFY = FALSE)
    L <- do.call(rbind, L) # put everything in one data.frame
    L$x <- as.character(L$x)
    rownames(L) <- NULL # for pretty printing
    
    # find last element of the path
    find.last.number <- gsub(".*/(\\d+)$", "\\1", L$x)
    # find.last.number <- basename(L$x) # alternative if it's a file path
    find.last.number <- as.numeric(find.last.number)
    L <- L[order(find.last.number), ] # sort based on the last element of the path
    
    # you need to reorder L$names factor to preserve the order for split, see
    # https://stackoverflow.com/questions/17611734/r-split-preserving-natural-order-of-factors
    # split based on list element origin
    split(L$x, f = factor(L$names, levels = unique(L$names)))
    
    $`HG-U133A`
    [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/0"
    [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/1"
    [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/2"
    
    $`HG-U133A_2`
    [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/6"
    [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/7"
    [3] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/8"
    
    $`HG-U133_Plus_2`
    [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22"
    [2] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/30"
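If only the order of the list elements matters, and not the order of the paths inside each element, a shorter base R variant may be enough. The sketch below builds on the basename alternative noted in the code comments above; `paths` is a hypothetical named list with shortened paths, and it assumes the last path component of each first path is purely numeric.

```r
# Hypothetical named list standing in for L, with shortened paths
paths <- list(
  "HG-U133_Plus_2" = c("tmp/abc/22", "tmp/abc/30"),
  "HG-U133A"       = c("tmp/abc/0", "tmp/abc/1", "tmp/abc/2"),
  "HG-U133A_2"     = c("tmp/abc/6", "tmp/abc/7", "tmp/abc/8")
)

# Numeric sort key: last component of the first path in each element
key <- vapply(paths, function(x) as.numeric(basename(x[1])), numeric(1))

# Reorder the list elements; names travel with their elements
sorted <- paths[order(key)]
names(sorted)
```

Unlike the data.frame approach above, this leaves the paths within each vector untouched and reorders only the list elements.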
    

    【Comments】:

    • A one-line solution using iterators::iter: L[order(sapply(iter(L),function(x) as.numeric(unlist(strsplit(x[1],"/"))[3])))]
    • @ChiPak The request was to use base packages only.