【Question title】: Import specific data from a txt file in R
【Posted】: 2017-02-01 21:37:44
【Question description】:

I have a file generated by an instrument (Map_1.hdr). Here is the file:

    ENVI
    description = {ROI id #1}
    samples = 16
    lines   = 4
    bands   = 1025
    data type = 4
    interleave = bip
    wavelength = 
    pixel size = {9.38E-07, 7.5E-07}
    x-start and y-start = {0.027363358, -0.007902135}

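I need to get specific data from the last 2 lines, namely:

    pixel_size = c(9.38E-07, 7.5E-07)
    origin = c(0.027363358, -0.007902135)

Here is my (incomplete) attempt:

    library(R.utils)
    rem <- 2
    nL <- countLines("Map_1.hdr")
    df <- read.csv("Map_1.hdr", header=FALSE, sep=" ", skip=nL-rem, stringsAsFactors = FALSE)

With this I get the last two lines, but I am still a long way from cleaning up the rest of each line. Is there another way to get what I want?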

【Question comments】:

    Tags: r csv import


    【Solution 1】:

    Here is what I would use instead:

    txt <- "ENVI
        description = {ROI id #1}
        samples = 16
        lines   = 4
        bands   = 1025
        data type = 4
        interleave = bip
        wavelength = 
        pixel size = {9.38E-07, 7.5E-07}
        x-start and y-start = {0.027363358, -0.007902135}"
    rem <- 2                                        # number of trailing lines to keep
    nL <- length(readLines(textConnection(txt)))    # total number of lines
    # The pattern has two alternatives joined by `|` (regex logical OR):
    #   ^.+\\{   removes everything up to and including the last '{' on a line
    #   \\}      removes the trailing '}'
    df <- read.delim(text = gsub(pattern = "^.+\\{|\\}",
                                 replacement = "",  # replace with a zero-length string
                                 readLines(textConnection(txt))),  # input text or file
                     header = FALSE, sep = ",",     # the comma is left in so it can be 'sep'
                     skip = nL - rem, stringsAsFactors = FALSE)
    > df
               V1           V2
    1 0.000000938  0.000000750
    2 0.027363358 -0.007902135
    

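    To read from your file, replace each readLines(textConnection(txt)) with readLines("Map_1.hdr"). Keep the text= argument, because gsub returns the cleaned lines as a character vector rather than a file name. (The textConnection form is useful for building a working, testable example.)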
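
For a reusable version that reads straight from a file, the same regex idea can be wrapped in a small function. This is only a minimal sketch, not the answerer's code: `read_hdr_pair` is a hypothetical helper name, and it assumes each wanted line has the form `key = {a, b}` as in the sample header. The demo writes the sample to a temporary file so that it runs on its own.

```r
# Hypothetical helper (not part of the original answer): return the numbers
# inside {...} on the header line whose key (the text before '=') equals `key`.
read_hdr_pair <- function(path, key) {
  lines <- readLines(path)
  # Text before the first '=' on each line, with surrounding whitespace removed.
  keys <- trimws(sub("=.*$", "", lines))
  hit <- lines[keys == key]
  if (length(hit) != 1) stop("expected exactly one line for key: ", key)
  # Drop everything up to the last '{' and everything from '}' onwards.
  inner <- gsub("^.*\\{|\\}.*$", "", hit)
  # Split on commas; as.numeric tolerates the leading spaces.
  as.numeric(strsplit(inner, ",", fixed = TRUE)[[1]])
}

# Self-contained demo: write the sample header to a temporary file.
hdr <- tempfile(fileext = ".hdr")
writeLines(c(
  "ENVI",
  "description = {ROI id #1}",
  "samples = 16",
  "lines   = 4",
  "bands   = 1025",
  "data type = 4",
  "interleave = bip",
  "wavelength = ",
  "pixel size = {9.38E-07, 7.5E-07}",
  "x-start and y-start = {0.027363358, -0.007902135}"
), hdr)

pixel_size <- read_hdr_pair(hdr, "pixel size")
origin     <- read_hdr_pair(hdr, "x-start and y-start")
```

Looking lines up by key rather than by position means the result does not depend on the two lines being last in the file, which may or may not hold for other headers produced by the instrument.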

    【Comments】:

    • Awesome! If I copy and paste your attempt, it works. Unfortunately, I can't (fully) work out how I should modify it to get generic code... txt <- read.csv("Map_1.hdr") rem <- 2 nL <- length(readLines(textConnection(txt))) df <- read.delim(text=gsub("^.+\\{|\\}","", readLines(textConnection(txt))), header=FALSE, sep=",", skip=nL-rem, stringsAsFactors = FALSE)
    • I'll put some explanatory comments inline.
    • If it works, you can accept it with the checkmark even if you can't vote.
    【Solution 2】:

    Does this work? I'm not sure I fully understand your desired output:

    > attempt <- read.table("~/Map_1.hdr", sep = "=", stringsAsFactors = FALSE)
    
    > tail(attempt,2)$ENVI
    [1] " {9.38E-07, 7.5E-07}"         " {0.027363358, -0.007902135}"
    > tail(attempt,2)$ENVI[1]
    [1] " {9.38E-07, 7.5E-07}"
    > tail(attempt,2)$ENVI[2]
    [1] " {0.027363358, -0.007902135}"
    

    You could then use strsplit and gsub to get what you need from there?

    > strsplit(gsub('[\\{}]', "", tail(attempt,2)$ENVI[1]),",")[[1]][1]
    [1] " 9.38E-07"
    > strsplit(gsub('[\\{}]', "", tail(attempt,2)$ENVI[1]),",")[[1]][2]
    [1] " 7.5E-07"
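
To go from those strings to numeric vectors in one step, the brace stripping and splitting can be combined with `as.numeric`. This is a sketch that assumes the two strings shown in the output above; `to_numeric_pair` is a hypothetical helper name, not something from the original answer.

```r
# The two strings as returned by tail(attempt, 2)$ENVI in the output above.
raw <- c(" {9.38E-07, 7.5E-07}", " {0.027363358, -0.007902135}")

# Hypothetical helper: strip the braces, split on commas, convert to numeric.
# as.numeric ignores the leading spaces left over after splitting.
to_numeric_pair <- function(s) {
  as.numeric(strsplit(gsub("[{}]", "", s), ",", fixed = TRUE)[[1]])
}

pixel_size <- to_numeric_pair(raw[1])
origin     <- to_numeric_pair(raw[2])
```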
    

    【Comments】:
