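[Posted]: 2022-02-09 06:55:20
[Question]:
I'm fairly new to R, so I apologize if this is a very basic question.
I want to remove every row of a data frame that has an empty list in a particular column. Specifically, some of my sf geometries are empty lists, and I need to drop them from the df before later analysis steps.
Here is an example of the data I have (for illustration, I removed the contents of the lists in the geometry column for rows 6:8).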
```r
library(tigris)
#> To enable caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
state<-states()
state[6:8,15] <- NULL
state
#> Simple feature collection with 56 features and 14 fields (with 3 geometries empty)
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -179.2311 ymin: -14.60181 xmax: 179.8597 ymax: 71.43979
#> Geodetic CRS: NAD83
#> First 10 features:
#> REGION DIVISION STATEFP STATENS GEOID STUSPS NAME LSAD MTFCC
#> 1 3 5 54 01779805 54 WV West Virginia 00 G4000
#> 2 3 5 12 00294478 12 FL Florida 00 G4000
#> 3 2 3 17 01779784 17 IL Illinois 00 G4000
#> 4 2 4 27 00662849 27 MN Minnesota 00 G4000
#> 5 3 5 24 01714934 24 MD Maryland 00 G4000
#> 6 1 1 44 01219835 44 RI Rhode Island 00 G4000
#> 7 4 8 16 01779783 16 ID Idaho 00 G4000
#> 8 1 1 33 01779794 33 NH New Hampshire 00 G4000
#> 9 3 5 37 01027616 37 NC North Carolina 00 G4000
#> 10 1 1 50 01779802 50 VT Vermont 00 G4000
#> FUNCSTAT ALAND AWATER INTPTLAT INTPTLON
#> 1 A 62266231560 489271086 +38.6472854 -080.6183274
#> 2 A 138947364717 31362872853 +28.4574302 -082.4091477
#> 3 A 143779863817 6215723896 +40.1028754 -089.1526108
#> 4 A 206230065476 18942261495 +46.3159573 -094.1996043
#> 5 A 25151726296 6979340970 +38.9466584 -076.6744939
#> 6 A 2677787140 1323663210 +41.5974187 -071.5272723
#> 7 A 214049897859 2391604238 +44.3484222 -114.5588538
#> 8 A 23189198255 1026903434 +43.6726907 -071.5843145
#> 9 A 125925929633 13463401534 +35.5397100 -079.1308636
#> 10 A 23874197924 1030383955 +44.0685773 -072.6691839
#> geometry
#> 1 MULTIPOLYGON (((-81.74725 3...
#> 2 MULTIPOLYGON (((-86.38865 3...
#> 3 MULTIPOLYGON (((-91.18529 4...
#> 4 MULTIPOLYGON (((-96.78438 4...
#> 5 MULTIPOLYGON (((-77.45881 3...
#> 6 MULTIPOLYGON EMPTY
#> 7 MULTIPOLYGON EMPTY
#> 8 MULTIPOLYGON EMPTY
#> 9 MULTIPOLYGON (((-82.41674 3...
#> 10 MULTIPOLYGON (((-73.31328 4...
```
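I want to remove the rows that contain these empty lists. My actual data has several thousand rows, so searching by hand is not an option.
I found a posting on this in python, but I'm wondering whether anyone has similarly simple code that does the same thing in R?
Thank you for your time and thoughts.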
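[Comments]:
- Without any actual data or an example, you may have to wait for a mind reader to get any help. Please have a look at the reprex package (reprex.tidyverse.org), and try using `dput` to include data that is not easy to include in a reprex.
- Maybe something like `df %>% filter(sapply(particular_column, length, simplify = TRUE) > 0)`
- That will work with the `dplyr` package. In base R, you can just do `df[sapply(df$particular_column, length, simplify = TRUE) > 0, ]`.
- Are you saying you have a data frame `d` and a variable `x` such that `typeof(d$x)` is `"list"`? In that case, do `d[lengths(d$x) > 0, ]`. No need for `sapply`.
- @Baraliuh My apologies. I have added some example data to my question, which I hope helps clarify what I'm asking.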
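To make the `lengths()` suggestion from the comments concrete, here is a minimal base R sketch. The data frame `d`, its columns `id` and `x`, and the result name `kept` are made up for illustration and do not come from the question's data.

```r
# Toy data frame (hypothetical) with a list column; rows 2 and 4 hold empty lists.
d <- data.frame(id = 1:4)
d$x <- list(1:3, list(), "a", list())

# lengths() gives the length of each element of the list column,
# so the logical index keeps only rows whose element is non-empty.
kept <- d[lengths(d$x) > 0, ]

kept$id
```

For an sf object such as `state`, the sf package also provides `st_is_empty()`, which returns one logical value per geometry, so something along the lines of `state[!sf::st_is_empty(state), ]` may be the more direct route. That variant is untested against the data in the question.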
Tags: r list sf data-wrangling