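【Question title】: Convert Excel column names to numbers
【Posted】: 2020-11-25 23:11:53
【Question】:

I want to write a function that converts Excel column names to the corresponding numbers. What I have come up with so far only partly works: inputs whose letters are in ascending alphabetical order ("AB", "AC", etc.) seem to work, but the reverse ("BA", "CA", etc.) does not. I have traced the error to the line y <- which(base::LETTERS==x), but I don't quite understand how these logical operators handle vectors. Any suggestions?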

#so to pass excel column-names directly, this function should do the trick
LettersToNumbers <- function(input){
    x <- toupper(substring(input, c(1:nchar(input)), c(1:nchar(input)))) #parse input-string
    y <- which(base::LETTERS==x) #letters to numbers
    y <- rev(y) #reverse
    #base26 conversion:
    result <- 0
    for (i in 1:length(y)){
        result <- result + ( y[i]*26^(i-1) )
    }
    return(result)
}

In fact, it turns out there are more inputs that fail. Here are a few; I don't quite understand what is happening.

> which(LETTERS==c("A", "B"))
[1] 1 2
> which(LETTERS==c("A", "C"))
[1] 1
> which(LETTERS==c("A", "D"))
[1] 1 4
> which(LETTERS==c("D", "A"))
integer(0)
> 

【Comments】:

  • What you are seeing is vector recycling; look at what you get from: which(LETTERS==c("A","B","C"))
  • The inverse function is in my answer here.
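
To make the recycling mentioned in the comment concrete, here is a small base R sketch. It shows that == compares position by position, with the shorter vector repeated until it is as long as LETTERS; the variable names x and recycled are illustrative only.

```r
# LETTERS == x compares position by position; the shorter vector is recycled
x <- c("D", "A")
recycled <- rep_len(x, length(LETTERS))   # "D" "A" "D" "A" ... (26 elements)

# The comparison R actually performs is LETTERS[i] == recycled[i]
identical(LETTERS == x, LETTERS == recycled)   # TRUE

# LETTERS has "A" at position 1 and "D" at position 4, but `recycled` has
# "D" at odd positions and "A" at even ones, so nothing ever lines up
which(LETTERS == x)              # integer(0)

# With c("A", "C"), only position 1 lines up ("C" lands on even positions)
which(LETTERS == c("A", "C"))    # 1
```

This is why the result depends on whether each letter happens to fall on a matching odd or even position, not on whether it occurs in the input.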

Tags: r function


【Solution 1】:

This is quick and dirty, but I think it does what you need. It should work for strings of any length.

# Input: A string of letters s
# Output: Corresponding column number
LettersToNumbers <- function(s){
  # Uppercase
  s_upper <- toupper(s)
  # Convert string to a vector of single letters
  s_split <- unlist(strsplit(s_upper, split=""))
  # Convert each letter to the corresponding number
  s_number <- sapply(s_split, function(x) {which(LETTERS == x)})
  # Derive the numeric value associated with each letter
  numbers <- 26^((length(s_number)-1):0)
  # Calculate the column number
  column_number <- sum(s_number * numbers)
  column_number
}
# Vectorize in case you want to pass more than one column name in a single call
LettersToNumbers <- Vectorize(LettersToNumbers)

# Quick tests
LettersToNumbers("A")
LettersToNumbers("Z")
LettersToNumbers("AA")
LettersToNumbers("BA")
LettersToNumbers("AAA")
LettersToNumbers(LETTERS)

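As noted in the comments above, the main problem with your code is vector recycling; this function avoids it by using sapply to look up one letter at a time.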
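
The positional arithmetic inside the function can be traced by hand. The sketch below works through "BA" and "BDWGN" with the same formula; it uses base R's match() for the letter lookup, and the variable names are illustrative rather than taken from the answer.

```r
# Worked example for "BA": B is letter 2, A is letter 1
digits  <- match(c("B", "A"), LETTERS)        # 2 1
weights <- 26^((length(digits) - 1):0)        # 26 1
sum(digits * weights)                         # 2*26 + 1*1 = 53

# The same formula applied to "BDWGN"
d <- match(strsplit("BDWGN", "")[[1]], LETTERS)   # 2 4 23 7 14
sum(d * 26^(4:0))                                 # 1e+06
```

Unlike ordinary base 26, there is no zero digit here: each position takes a value from 1 to 26, which is why "Z" is 26 and "AA" is 27.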

【Comments】:

  • Nice! I was fully convinced when the one-millionth Excel column name, LettersToNumbers("BDWGN"), promptly returned 1e6.
【Solution 2】:

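For those who want to convert in both directions with a single function: pass a number (27) to get the letters ('AA'), or pass letters ('AA') to get the number (27).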

xlcolconv <- function(col){
    # test: 1 = A, 26 = Z, 27 = AA, 703 = AAA
    if (is.character(col)) {
        # codes from https://stackoverflow.com/a/34537691/2292993
        s = col
        # Uppercase
        s_upper <- toupper(s)
        # Convert string to a vector of single letters
        s_split <- unlist(strsplit(s_upper, split=""))
        # Convert each letter to the corresponding number
        s_number <- sapply(s_split, function(x) {which(LETTERS == x)})
        # Derive the numeric value associated with each letter
        numbers <- 26^((length(s_number)-1):0)
        # Calculate the column number
        column_number <- sum(s_number * numbers)
        return(column_number)
    } else {
        n = col
        letters = ''
        while (n > 0) {
            r = (n - 1) %% 26  # remainder
            letters = paste0(intToUtf8(r + utf8ToInt('A')), letters) # ascii
            n = (n - 1) %/% 26 # quotient
        }
        return(letters)
    }
}
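
A quick way to gain confidence in a two-way conversion like this is a round-trip check. The sketch below is self-contained: col_to_num and num_to_col are hypothetical helper names that mirror the two branches of the function above, and the range 1 to 16384 assumes the column limit of current Excel versions ("XFD").

```r
# Hypothetical helpers; the logic mirrors the two branches of xlcolconv
col_to_num <- function(s) {
  digits <- match(strsplit(toupper(s), "")[[1]], LETTERS)
  sum(digits * 26^((length(digits) - 1):0))
}

num_to_col <- function(n) {
  out <- ""
  while (n > 0) {
    r   <- (n - 1) %% 26               # 0-based index of the last letter
    out <- paste0(LETTERS[r + 1], out) # prepend that letter
    n   <- (n - 1) %/% 26              # drop the letter just emitted
  }
  out
}

# Round trip: number -> letters -> number should give back the input
n    <- 1:16384
cols <- vapply(n, num_to_col, character(1))
all(vapply(cols, col_to_num, numeric(1)) == n)        # TRUE

c(num_to_col(26), num_to_col(27), num_to_col(703))    # "Z" "AA" "AAA"
```

The (n - 1) shift before %% and %/% is what accounts for the missing zero digit in Excel's numbering.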

【Comments】:

    【Solution 3】:

    In cases like this the answer is usually to use %in% rather than ==. For example,

    which(LETTERS %in% c("D", "A"))
    

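    yields 1 4. But those positions follow the order of LETTERS, not the order of your input, so apply the lookup to each letter in turn instead: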

    sapply(c("D", "A"), function(x){which(LETTERS %in% x)})
    

    yields 4 1
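
Base R's match() is another order-preserving option, shown here as a sketch alongside the two calls above. It returns the position of each input letter in LETTERS, in input order, and NA for anything that is not an uppercase letter.

```r
x <- c("D", "A")

which(LETTERS %in% x)         # 1 4: positions in LETTERS order
match(x, LETTERS)             # 4 1: positions in input order

# Unknown characters give NA rather than silently disappearing
match(c("D", "?"), LETTERS)   # 4 NA
```

Because match() is vectorized over its first argument, no sapply loop is needed for the per-letter lookup.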

    【Comments】:

      【Solution 4】:
          # Setup converter index numbers
          converter <- 1:702
          # Excel column names in order
          names(converter) <- do.call(paste0, expand.grid(LETTERS, c("",LETTERS))[,2:1])
          ExcelColumnNames <- c("A", "Z", "AA", "AZ", "ZZ")
          converter[ExcelColumnNames] # show excel column numbers
      #  A   Z  AA  AZ  ZZ 
      #  1  26  27  52 702 
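
The lookup table above stops at "ZZ" (702). One way to extend the same idea to three letters, which covers every column up to "XFD" (16384), is sketched below; the names one, two, three, all_names and lookup are illustrative and not part of the original answer.

```r
# Build names in Excel order: A..Z, AA..ZZ, AAA..ZZZ
one   <- LETTERS
two   <- paste0(rep(LETTERS, each = 26), LETTERS)     # AA, AB, ..., ZZ
three <- paste0(rep(LETTERS, each = 26^2), two)       # AAA, AAB, ..., ZZZ
all_names <- c(one, two, three)

# Named integer vector: the name is the column label, the value its number
lookup <- setNames(seq_along(all_names), all_names)

length(lookup)                       # 18278
lookup[c("A", "ZZ", "AAA", "XFD")]   # 1 702 703 16384
```

Building the one-, two- and three-letter names separately avoids the duplicate labels that would appear if an empty string were allowed in more than one position.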
      

      【Comments】:

        【Solution 5】:

        Check the length of the character element, then add up the values given by each letter's position in LETTERS:

        TwoLet2Num <- function(chars) {
          first <- which(LETTERS == substr(chars, 1, 1))
          if (nchar(substr(chars, 2, 2)) == 0) return(first)   # single letter
          first * 26 + which(LETTERS == substr(chars, 2, 2))   # two letters
        }
        

        【Comments】:

          【Solution 6】:

          A faster and already vectorized alternative to the accepted solution (no Vectorize needed):

          letters2numbers <- function(x){
            
            # letters encoding
            encoding <- setNames(seq_along(LETTERS), LETTERS)
            
            # uppercase
            x <- toupper(x)
            
            # convert string to a list of vectors of single letters
            x <- strsplit(x, split = "")
            
            # convert each letter to the corresponding number
            # calculate the column number
            # return a numeric vector
            sapply(x, function(xs) sum(encoding[xs] * 26^((length(xs)-1):0)))
            
          }
          
          
          letters2numbers("Z")
          #> [1] 26
          letters2numbers(c("A", "BZ", "CBA", "BDWGN"))
          #> [1]       1      78    2081 1000000
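
One thing to keep in mind is that a lookup by name returns NA for characters outside A to Z, so an input such as "A1" would yield NA rather than an error. If an explicit failure is preferred, a thin check can be added in front of the same arithmetic. The sketch below is an assumption about how one might do that; letters2numbers_checked is a hypothetical name.

```r
# Hypothetical variant: same arithmetic, plus an explicit input check
letters2numbers_checked <- function(x) {
  ok <- grepl("^[A-Za-z]+$", x)
  if (!all(ok)) {
    stop("not a valid column name: ", paste(x[!ok], collapse = ", "))
  }
  encoding <- setNames(seq_along(LETTERS), LETTERS)
  parts <- strsplit(toupper(x), split = "")
  # vapply guarantees a numeric vector with one value per input string
  vapply(parts,
         function(p) sum(encoding[p] * 26^((length(p) - 1):0)),
         numeric(1))
}

letters2numbers_checked(c("a", "bz"))   # 1 78
```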
          

          Benchmarks:

          microbenchmark::microbenchmark(
            LettersToNumbers("Z"),
            letters2numbers("Z")
          )
          #> Unit: microseconds
          #>                   expr    min      lq     mean  median     uq     max neval
          #>  LettersToNumbers("Z") 60.510 61.9065 70.23292 64.0005 67.957 262.051   100
          #>   letters2numbers("Z") 20.481 21.4115 26.70360 22.3420 24.204 140.568   100
          
          microbenchmark::microbenchmark(
            LettersToNumbers(c("A", "BZ", "CBA", "BDWGN")),
            letters2numbers(c("A", "BZ", "CBA", "BDWGN"))
          )
          #> Unit: microseconds
          #>                                            expr     min      lq      mean   median       uq     max neval
          #>  LettersToNumbers(c("A", "BZ", "CBA", "BDWGN")) 152.669 158.721 206.97909 171.7530 220.8595 581.819   100
          #>   letters2numbers(c("A", "BZ", "CBA", "BDWGN"))  30.255  32.582  42.47789  35.1425  43.9865 174.547   100
          

          【Comments】:
