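【Question title】:R: "Error: unexpected string constant in" with read_fwf()
【Posted】:2019-12-20 09:41:14
【Question】:

I am trying to read a fixed-width file from the U.S. Census Bureau into R using read_fwf(). I keep getting an error at the same position in the list of column names. I have tried changing the column name at that position several times, but R keeps throwing the error. I restarted R in a fresh session and still get it. The problem seems to be item 39 in the list of column names. In one of the attempts included in the code below, I changed the name at position 39 and, at times, the name at position 38 as well. The first call in the code block has the original column names. In that call, the 39th name is "cbsac", but the error prints it as ... "". It is close to the name at position 38, "cbsa", but many names elsewhere in the list are just as similar and they do not cause errors. I do not know what that is supposed to indicate. Does "cbsac" mean something in the R language that I am not aware of?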

library(readr)

> tf <- read_fwf("D:/projects_and_data/data/PostgreSQL/data/data/or2010.sf1/orgeo2010.sf1", fwf_widths( c(6, 2, 3, 2, 3, 2, 7, 1, 1, 2, 3, 2, 2, 5, 2, 2, 5, 2, 2, 6, 1, 4, 2, 5, 2, 2, 4, 5, 2, 1, 3, 5, 2, 6, 1, 5, 2, 5, 2, 5, 3, 5, 2, 5, 3, 1, 1, 5, 2, 1, 1, 2, 3, 3, 6, 1, 3, 5, 5, 2, 5, 5, 5, 14, 14, 90, 1, 1, 9, 9, 11, 12, 2, 1, 6, 5, 8, 8, 8, 8, 8, 8,  8, 8, 8, 2, 2, 2, 3, 3, 3, 3, 3, 3, 2, 2, 2, 1, 1, 5, 18), c("fileid", "stusab", "sumlev", "geocomp", "chariter", "cifsn", "logrecno", "region", "division", "state", "county", "countycc", "countysc", "cousub",  "cousubcc", "cousubsc", "place", "placecc", "placesc", "tract", "blkgrp",  "block", "iuc", "concit", "concitcc", "concitsc", "aianhh", "aianhhfp", "aianhhcc", "aihhtli", "aitsce", "aits", "aitscc", "ttract", "tblkgrp", "anrc", "anrccc",  "cbsa", "cbsac", "metdiv", "csa", "necta", "nectasc", "nectadiv" "cnecta", "cbsapci", "nectapci", "ua", "uasc", "uatype", "ur", "cd", "sldu", "sldl", "vtd", "vtdi", "reserve2", "zcta5", "submcd", "submcdcc", "sdelem", "sdsec", "sduni", "arealand", "areawatr", "name", "funcstat", "gcuni", "pop100", "hu100", "intptlat", "intptlon", "lsadc", "partflag", "reserve3", "uga", "statens", "countyns", "cousubns", "placens", "concitns", "aianhhns", "aitsns", "anrcns", "submcdns", "cd113", "cd114", "cd115", "sldu2", "sldu3", "sldu4", "sldl2", "sldl3", "sldl4", "aianhhsc", "csasc", "cnectasc", "memi", "nmemi", "puma", "reserved")))
Error: unexpected string constant in ""tract", "blkgrp",  "block", "iuc", "concit", "concitcc", "concitsc", "aianhh", "aianhhfp", "aianhhcc", "aihhtli", "aitsce", "aits", "aitscc", "ttract", "tblkgrp", "anrc", "anrccc",  "cbsa", ""
> tf <- read_fwf("D:/projects_and_data/data/PostgreSQL/data/data/or2010.sf1/orgeo2010.sf1", fwf_widths( c(6, 2, 3, 2, 3, 2, 7, 1, 1, 2, 3, 2, 2, 5, 2, 2, 5, 2, 2, 6, 1, 4, 2, 5, 2, 2, 4, 5, 2, 1, 3, 5, 2, 6, 1, 5, 2, 5, 2, 5, 3, 5, 2, 5, 3, 1, 1, 5, 2, 1, 1, 2, 3, 3, 6, 1, 3, 5, 5, 2, 5, 5, 5, 14, 14, 90, 1, 1, 9, 9, 11, 12, 2, 1, 6, 5, 8, 8, 8, 8, 8, 8,  8, 8, 8, 2, 2, 2, 3, 3, 3, 3, 3, 3, 2, 2, 2, 1, 1, 5, 18), c("fileid", "stusab", "sumlev", "geocomp", "chariter", "cifsn", "logrecno", "region", "division", "state", "county", "countycc", "countysc", "cousub",  "cousubcc", "cousubsc", "place", "placecc", "placesc", "tract", "blkgrp",  "block", "iuc", "concit", "concitcc", "concitsc", "aianhh", "aianhhfp", "aianhhcc", "aihhtli", "aitsce", "aits", "aitscc", "ttract", "tblkgrp", "anrc", "anrccc",  "BCas", "CBsac", "metdiv", "csa", "necta", "nectasc", "nectadiv" "cnecta", "cbsapci", "nectapci", "ua", "uasc", "uatype", "ur", "cd", "sldu", "sldl", "vtd", "vtdi", "reserve2", "zcta5", "submcd", "submcdcc", "sdelem", "sdsec", "sduni", "arealand", "areawatr", "name", "funcstat", "gcuni", "pop100", "hu100", "intptlat", "intptlon", "lsadc", "partflag", "reserve3", "uga", "statens", "countyns", "cousubns", "placens", "concitns", "aianhhns", "aitsns", "anrcns", "submcdns", "cd113", "cd114", "cd115", "sldu2", "sldu3", "sldu4", "sldl2", "sldl3", "sldl4", "aianhhsc", "csasc", "cnectasc", "memi", "nmemi", "puma", "reserved")))
Error: unexpected string constant in ""tract", "blkgrp",  "block", "iuc", "concit", "concitcc", "concitsc", "aianhh", "aianhhfp", "aianhhcc", "aihhtli", "aitsce", "aits", "aitscc", "ttract", "tblkgrp", "anrc", "anrccc",  "BCas", ""

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] readr_1.3.1

loaded via a namespace (and not attached):
[1] compiler_3.6.1  backports_1.1.5 R6_2.4.0        hms_0.5.1      
[5] pillar_1.4.2    tibble_2.1.3    Rcpp_1.0.2      crayon_1.3.4   
[9] vctrs_0.2.0     zeallot_0.1.0   pkgconfig_2.0.3 rlang_0.4.0    

This links to a zip containing the source file. The file is "orgeo2010.sf1". I should say that the zip is rather large. Sorry about that.

【Comments】:

  • Can you link to the source file?
  • @Khaynes Added.

Tags: r fixed-width readr


【Solution 1】:

Does this solve your problem?

widths <- c(6, 2, 3, 2, 3, 2, 7, 1, 1, 2, 3, 2, 2, 5, 2, 2, 5,
2, 2, 6, 1, 4, 2, 5, 2, 2, 4, 5, 2, 1, 3, 5, 2, 6, 1, 5, 2, 5,
2, 5, 3, 5, 2, 5, 3, 1, 1, 5, 2, 1, 1, 2, 3, 3, 6, 1, 3, 5, 5,
2, 5, 5, 5, 14, 14, 90, 1, 1, 9, 9, 11, 12, 2, 1, 6, 5, 8, 8, 
8, 8, 8, 8,  8, 8, 8, 2, 2, 2, 3, 3, 3, 3, 3, 3, 2, 2, 2, 1, 1, 5, 18)

vars <- c("fileid", "stusab", "sumlev", "geocomp", "chariter", "cifsn", "logrecno",
"region", "division", "state", "county", "countycc", "countysc", "cousub",
"cousubcc", "cousubsc", "place", "placecc", "placesc", "tract", "blkgrp",  "block",
"iuc", "concit", "concitcc", "concitsc", "aianhh", "aianhhfp", "aianhhcc", "aihhtli",
"aitsce", "aits", "aitscc", "ttract", "tblkgrp", "anrc", "anrccc", "cbsa", "cbsac",
"metdiv", "csa", "necta", "nectasc", "nectadiv", "cnecta", "cbsapci", "nectapci",
"ua", "uasc", "uatype", "ur", "cd", "sldu", "sldl", "vtd", "vtdi", "reserve2",
"zcta5", "submcd", "submcdcc", "sdelem", "sdsec", "sduni", "arealand", "areawatr",
"name", "funcstat", "gcuni", "pop100", "hu100", "intptlat", "intptlon", "lsadc",
"partflag", "reserve3", "uga", "statens", "countyns", "cousubns", "placens",
"concitns", "aianhhns", "aitsns", "anrcns", "submcdns", "cd113", "cd114", "cd115",
"sldu2", "sldu3", "sldu4", "sldl2", "sldl3", "sldl4", "aianhhsc", "csasc",
"cnectasc", "memi", "nmemi", "puma", "reserved")

td <- read_fwf("D:/projects_and_data/data/PostgreSQL/data/data/or2010.sf1/orgeo2010.sf1", fwf_widths(widths))

names(td) <- vars

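The "unexpected string constant" error is caused by the character vector not being defined correctly: a comma is missing. In the original call there is no comma between "nectadiv" and "cnecta".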
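To see why a missing comma produces this particular message, here is a minimal sketch that parses two short snippets with base R's parse(). The two-element vectors are cut-down stand-ins for the full column-name list, and the variable names (bad, good, msg, parsed_ok) are only illustrative.

```r
# Two adjacent string literals with no comma between them cannot be parsed.
bad  <- 'c("nectadiv" "cnecta")'
good <- 'c("nectadiv", "cnecta")'

# Capture the parser's error message instead of stopping the script.
msg <- tryCatch(
  {
    parse(text = bad)
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With the comma in place the same vector parses and evaluates normally.
parsed_ok <- eval(parse(text = good))
print(parsed_ok)
```

The parser stops at the second string literal because it expects a comma or a closing parenthesis after the first one, so the column name itself has no special meaning in R.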

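【Comments】:

  • Well, it got me further. Now I get an error about one of the lines not having 101 elements. Thanks for the suggestion. I will have to figure out what is going on with that line, but this seems to work, so I am marking it as the answer. I will break the import into pieces and work out what is happening on that line.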
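One possible way to track down the record mentioned in the comment above is to compare each line's length with the total of the field widths. The sketch below uses toy widths and a temporary file as hypothetical stand-ins for the Census file, and it assumes every record should be exactly sum(widths) characters long, which may not hold for the real data.

```r
# Toy widths and a temporary file standing in for the real data (both hypothetical).
widths <- c(3, 2, 4)
path <- tempfile(fileext = ".txt")
writeLines(c("abc12wxyz", "def34", "ghi56qrst"), path)

# Number of characters a record should have if every field is present.
expected <- sum(widths)

# Compare the actual length of every line with the expected record length.
line_lengths <- nchar(readLines(path))
bad_lines <- which(line_lengths != expected)

print(bad_lines)                # line numbers whose length differs
print(line_lengths[bad_lines])  # how long those lines actually are
```

For the real file, the same check could be run with the 101-element widths vector from the answer and the path to orgeo2010.sf1, then the flagged lines inspected by hand.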