【Question Title】: Read data into R with different delimiters
【Posted】: 2017-10-09 12:44:44
【Question Description】:

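I am trying to read a file into R that uses different delimiters. The first line uses a space as the delimiter. From the second line to the last, there is one space between the first and second columns and one space between the second and third; after that, every digit in the block of twos, zeros, and ones should become its own column. Any hints?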

ID Chip AX-77047182 AX-80910836 AX-80737273 AX-77048714 AX-77048779 AX-77050447 
3811582 1 2002202222200202022020200200220200222200022220002200000201202000222022
3712982 1 2002202222200202022020200200220200222200022220002200000200202000222022
3712990 1 2002202211200202021011100101210200111101022121112100111110211110122122
3713019 1 2002202211200202021011100101210200111101022121112100111110211110122122
3713025 1 2002202211200202021011100101210200111101022121112100111110211110122122
3713126 1 2002202222200202022020200200220200222200022220002200000200202000222022

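【Comments】:

  • Could you provide an example of the expected output?
  • Sure, for the first three rows: first row: ID Chip AX-77047182 AX-80910836. Second row: 3811582 1 2 0. The delimiter should be a space.
  • Please edit your post instead of commenting.

Tags: r read.table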


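【Solution 1】:

This is certainly not the most elegant solution, but you could try the following. If I understand your sample data correctly, you did not provide all the column names (AX-77047182, ...) needed for the rows of zeros/ones/twos. If my understanding is wrong, the approach below will not produce the expected result, but it may still help you find a workaround: you can simply adjust the delimiter in the second split command. I hope this helps.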

# read the file as a character vector, one element per line
chipstable <- readLines(".../chips.txt")

# extract the first line to be used as column names
tablehead <- unlist(strsplit(trimws(chipstable[1]), " "))

# split the remaining lines by the first delimiter, i.e., space
chipstable <- strsplit(chipstable[-1], " ")

# split the third field by the second delimiter, i.e., between each
# character (here a digit), and merge the two split results into one row
chipstable <- lapply(chipstable, function(x) {
  c(x[1:2], unlist(strsplit(x[3], "")))
})

# combine all rows into a character matrix
chipstable <- do.call(rbind, chipstable)

# assign column names; this only works if the header has exactly one name
# per column, which is not the case for the sample data in the question
if (length(tablehead) == ncol(chipstable)) {
  colnames(chipstable) <- tablehead
}

# turn values to numeric (if needed); the result is a numeric matrix
chipstable <- apply(chipstable, 2, as.numeric)
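
The sample header lists fewer marker names than there are genotype digits, so using it directly as column names fails on that data. Below is a minimal, self-contained sketch of one way around this, using three rows copied from the question. The padding rule (reuse the header names for the leading columns, then generate `M7`, `M8`, ... for the rest) is purely an assumption for illustration; whether the header names really belong to the first genotype columns is something only the data owner can confirm.

```r
# Header and three data rows copied from the question
lines <- c(
  "ID Chip AX-77047182 AX-80910836 AX-80737273 AX-77048714 AX-77048779 AX-77050447 ",
  "3811582 1 2002202222200202022020200200220200222200022220002200000201202000222022",
  "3712982 1 2002202222200202022020200200220200222200022220002200000200202000222022",
  "3712990 1 2002202211200202021011100101210200111101022121112100111110211110122122"
)

# Header names: trim the trailing space, then split on runs of whitespace
header <- strsplit(trimws(lines[1]), "\\s+")[[1]]

# Data rows: ID, Chip, then one long genotype string
fields <- strsplit(trimws(lines[-1]), "\\s+")

# One numeric row per sample: ID, Chip, then each genotype digit separately
rows <- lapply(fields, function(x) {
  as.numeric(c(x[1:2], strsplit(x[3], "")[[1]]))
})
geno <- do.call(rbind, rows)

# ASSUMPTION (hypothetical naming rule): keep the header names for the
# leading columns and pad the remaining columns with generated names
n_missing <- ncol(geno) - length(header)
col_names <- if (n_missing > 0) {
  c(header, paste0("M", length(header) - 2 + seq_len(n_missing)))
} else {
  header[seq_len(ncol(geno))]
}
colnames(geno) <- col_names

chips <- as.data.frame(geno)
```

Here `chips` has one row per sample and one column per genotype digit, in addition to `ID` and `Chip`. If the real file carries a complete header, the padding branch is simply never taken.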

【Comments】:

【Solution 2】:

You could try something like ...read(pattern = " || 1 ", recursive = TRUE) and bind the results afterwards.

For example:

data <- "ID Chip AX-77047182 AX-80910836 AX-80737273 AX-77048714 AX-77048779 AX-77050447 
3811582 1 2002202222200202022020200200220200222200022220002200000201202000222022
3712982 1 2002202222200202022020200200220200222200022220002200000200202000222022
3712990 1 2002202211200202021011100101210200111101022121112100111110211110122122
3713019 1 2002202211200202021011100101210200111101022121112100111110211110122122
3713025 1 2002202211200202021011100101210200111101022121112100111110211110122122
3713126 1 2002202222200202022020200200220200222200022220002200000200202000222022"

# split the text into lines
teste <- strsplit(data, split = "\n")[[1]]

for (i in seq_along(teste)) {
  if (i == 1) {
    # header line: split on single spaces
    dataOut <- strsplit(teste[i], split = " ")
  } else {
    # data lines: split on " 1 " (assumes Chip is always 1 and drops it),
    # which leaves the ID and the unsplit genotype string
    dataOut <- strsplit(teste[i], split = " 1 ")
  }
  print(dataOut)
}
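
The loop above only prints the pieces and leaves the genotype string unsplit. As a rough sketch of how the same idea could be carried through to a table, the following uses `read.table()` on two rows copied from the question, reading the third field as character so the long digit string is not converted to a number. The column names `geno` and `M1`, `M2`, ... are hypothetical placeholders, not names from the original data.

```r
# Header and two data rows copied from the question
txt <- "ID Chip AX-77047182 AX-80910836 AX-80737273 AX-77048714 AX-77048779 AX-77050447
3811582 1 2002202222200202022020200200220200222200022220002200000201202000222022
3712990 1 2002202211200202021011100101210200111101022121112100111110211110122122"

# Skip the header line; keep the genotype field as character so that the
# long digit string is not read as a (lossy) number
raw <- read.table(
  text = txt, skip = 1, header = FALSE,
  col.names = c("ID", "Chip", "geno"),                      # hypothetical names
  colClasses = c("integer", "integer", "character")
)

# Split each genotype string into single digits, one matrix row per sample
geno <- do.call(rbind, lapply(strsplit(raw$geno, ""), as.integer))
colnames(geno) <- paste0("M", seq_len(ncol(geno)))          # hypothetical names

# Bind ID and Chip back onto the per-digit columns
result <- cbind(raw[c("ID", "Chip")], geno)
```

For a file on disk, the `text = txt` argument would be replaced by the file path. Unlike splitting on " 1 ", this keeps the `Chip` column and does not assume its value.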
    

【Comments】:
