【Question title】: Is there a way for read_delim to read column names from the last line of a file?
【Posted】: 2021-05-04 01:29:36
【Question】:

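I have an application that generates pseudo-CSV output intended for import into Excel.

The main quirk is that this file format does not always include the complete list of column names at the start of the file. The application that exports the CSV file seems to "discover" the columns as it exports the data.

Fortunately, the application does print the complete list of column names at the end of the file.

Question: is there a way to make read_delim (and the other functions in that family) use a line other than the first one as the line that names the columns? That is, in the example below, so that the columns are named extra1 and extra2 rather than X1 and X2?

Sample data: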

#client version: blah
#Message event data
# Period start time: 1619666820000000000 (Thu Apr 29 2021 13:27:00.000000000 AEST)
# Period end time: 1619675221000000000 (Thu Apr 29 2021 15:47:01.000000000 AEST)
#Format:
time,ts_ns,src,src_port,dst,dst_port,bytes,application,id,values
="2021-04-29 13:27:00.006289581",="1619666820006289581","172.30.2.70",58280,"10.2.139.5",19901,160,"appName",="1614910529214246156",0.111713
="2021-04-29 13:27:00.013557400",="1619666820013557400","172.30.2.70",55920,"10.2.139.7",19902,160,"appName",="1614910529214271438",0.102003
="2021-04-29 13:27:00.015840285",="1619666820015840285","172.30.2.70",55910,"10.2.139.7",19902,160,"appName",="1614910529214348545",0.099041
="2021-04-29 13:27:00.020072322",="1619666820020072322","172.30.2.70",58276,"10.2.139.5",19901,160,"appName",="1614910529214260360",0.095228
="2021-04-29 13:27:00.021587125",="1619666820021587125","172.30.2.70",55936,"10.2.139.7",19902,160,"appName",="1614910529214338698",0.095754
,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
="2021-04-29 13:27:00.021587125",="1619666820021587125","172.30.2.70",55936,"10.2.139.7",19902,160,"appName",="1614910529214338698",0.095754,"blah1","blah2"
time,ts_ns,src,src_port,dst,dst_port,bytes,application,id,values,extra1,extra2 

【Comments on the question】:

    Tags: r readr


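    【Answer 1】:

    I'm not aware of any default argument or setting in these functions that lets you do this directly, but you can do some post-processing after reading the file.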

    #Read the file without header
    data <- read.table('file1.csv', sep = ',')
    #Assign column names from the last row of the data
    names(data) <- unlist(data[nrow(data), ])
    #Remove the last row
    data <- data[-nrow(data), ]
    #Change the data to default types
    data <- type.convert(data, as.is = TRUE)
    

    If you want to skip the first n lines when reading the file, you may need to add skip = n to the read.table call.
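
The post-processing above assumes every row has the same number of fields, which may not hold for the sample in the question (rows there have 10, 11 or 12 fields). As a rough sketch of one alternative, the header can be pulled from the last line first and then handed to read.table through col.names. The helper name read_with_trailing_header() is made up for this illustration, and the sketch assumes the first non-comment line is the partial header and the last non-empty line is the complete one.

```r
# Sketch only: read_with_trailing_header() is a hypothetical helper, not part
# of base R or readr. Assumes '#' comment lines, a partial header as the first
# non-comment line, and the complete header as the last non-empty line.
read_with_trailing_header <- function(path) {
  txt <- readLines(path, warn = FALSE)
  # Drop '#' comment lines and blank lines
  txt <- txt[!grepl("^#", txt) & nzchar(trimws(txt))]
  # The last remaining line holds the complete list of column names
  header <- trimws(strsplit(txt[length(txt)], ",", fixed = TRUE)[[1]])
  # Drop the partial header at the top and the full header at the bottom
  body <- txt[-c(1, length(txt))]
  # col.names sets the column count; fill = TRUE pads short rows with blank fields
  read.table(text = body, sep = ",", col.names = header,
             header = FALSE, fill = TRUE, comment.char = "",
             stringsAsFactors = FALSE)
}
```

It would be called as data <- read_with_trailing_header('file1.csv'). The same header vector could also be passed as col_names to read_delim, together with its comment and skip arguments, though how readr treats the shorter rows in that case has not been checked here.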

    【Comments】:
