【Question title】: How to convert factors with decimal points into numeric values [duplicate]
【Posted】: 2013-12-21 04:13:00
【Question description】:

I have a data set whose columns are vectors stored as factors:

> str(gdp)
'data.frame':   64 obs. of  31 variables:
 $ 1 : Factor w/ 62 levels "","1,145.31",..: 1 1 1 53 16 20 22 24 30 32 ...
 $ 2 : Factor w/ 64 levels "1,121.93","1,264.63",..: 42 59 10 13 18 16 17 23 25 35 ...
 $ 3 : Factor w/ 62 levels "","1,072.07",..: 1 1 1 35 36 39 41 42 45 51 ...
 $ 4 : Factor w/ 62 levels "","1,076.03",..: 1 1 1 15 16 21 23 26 27 36 ...
 $ 5 : Factor w/ 62 levels "","1,023.09",..: 1 1 1 11 15 19 17 23 21 27 ...
 $ 6 : Factor w/ 62 levels "","1,003.81",..: 1 1 1 40 45 46 47 52 56 7 ...
 $ 7 : Factor w/ 62 levels "","1,137.23",..: 1 1 1 13 15 19 21 23 24 28 ...
 $ 8 : Factor w/ 62 levels "","1,198.30",..: 1 1 1 26 31 34 35 39 40 47 ...
 $ 9 : Factor w/ 64 levels "1,114.32","1,519.23",..: 27 30 36 41 49 51 50 54 56 64 ...
 $ 10: Factor w/ 62 levels "","1,208.85",..: 1 1 1 35 39 40 42 45 46 53 ...
 $ 11: Factor w/ 64 levels "","1,089.33",..: 1 11 17 20 23 24 26 29 31 37 ...
 $ 12: Factor w/ 62 levels "","1,037.14",..: 1 1 1 22 23 25 31 30 36 41 ...
 $ 13: Factor w/ 63 levels "","1,114.20",..: 1 63 1 8 11 12 14 20 22 27 ...
 $ 14: Factor w/ 64 levels "1,169.73","1,409.74",..: 63 12 14 16 17 22 24 25 28 30 ...
 $ 15: Factor w/ 62 levels "","1,117.66",..: 1 1 1 33 35 39 40 44 43 53 ...
 $ 16: Factor w/ 63 levels "","1,045.73",..: 21 1 1 30 35 38 41 42 47 50 ...
 $ 17: Factor w/ 62 levels "","1,088.39",..: 1 1 1 24 32 26 34 38 40 48 ...
 $ 18: Factor w/ 62 levels "","1,244.71",..: 1 1 1 24 30 31 33 34 38 44 ...
 $ 19: Factor w/ 62 levels "","1,155.37",..: 1 1 1 25 34 37 38 41 44 48 ...
 $ 20: Factor w/ 64 levels "","1,198.29",..: 1 63 8 11 15 17 18 20 26 30 ...
 $ 21: Factor w/ 36 levels "","1,065.67",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ 22: Factor w/ 64 levels "1,123.06","1,315.12",..: 12 14 15 17 22 23 24 26 27 40 ...
 $ 23: Factor w/ 62 levels "","1,016.31",..: 1 1 1 22 25 31 33 38 43 49 ...
 $ 24: Factor w/ 64 levels "1,029.92","1,133.27",..: 52 53 57 60 6 8 9 12 13 22 ...
 $ 25: Factor w/ 64 levels "1,222.15","1,517.69",..: 60 62 7 8 12 14 15 21 22 25 ...
 $ 26: num  NA NA 1.29 1.32 1.36 1.39 1.43 1.62 1.56 1.72 ...
 $ 27: Factor w/ 62 levels "","1,036.85",..: 1 1 1 12 16 21 22 27 25 33 ...
 $ 28: Factor w/ 61 levels "","1,052.88",..: 1 1 1 12 13 17 18 24 23 26 ...
 $ 29: Factor w/ 64 levels "1,018.62","1,081.27",..: 6 7 8 9 10 26 27 34 35 43 ...
 $ 30: Factor w/ 62 levels "","1,203.92",..: 1 1 1 6 5 21 22 23 24 32 ...
 $ 31: Factor w/ 62 levels "","1,039.85",..: 1 1 1 57 59 9 11 13 14 16 ...

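I am trying to keep all of the information (the decimal places) and convert every vector to numeric. So far I have tried converting the vectors to character and then to numeric, as suggested elsewhere on SO, but I get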

> gdp<-data.frame(lapply(gdp,as.character))
> gdp<-data.frame(lapply(gdp,as.numeric))
> str(gdp)
'data.frame':   64 obs. of  31 variables:
 $ X1 : num  1 1 1 53 16 20 22 24 30 32 ...
 $ X2 : num  42 59 10 13 18 16 17 23 25 35 ...
 $ X3 : num  1 1 1 35 36 39 41 42 45 51 ...
 $ X4 : num  1 1 1 15 16 21 23 26 27 36 ...
 $ X5 : num  1 1 1 11 15 19 17 23 21 27 ...
 $ X6 : num  1 1 1 40 45 46 47 52 56 7 ...
 $ X7 : num  1 1 1 13 15 19 21 23 24 28 ...
 $ X8 : num  1 1 1 26 31 34 35 39 40 47 ...
 $ X9 : num  27 30 36 41 49 51 50 54 56 64 ...
 $ X10: num  1 1 1 35 39 40 42 45 46 53 ...
 $ X11: num  1 11 17 20 23 24 26 29 31 37 ...
 $ X12: num  1 1 1 22 23 25 31 30 36 41 ...
 $ X13: num  1 63 1 8 11 12 14 20 22 27 ...
 $ X14: num  63 12 14 16 17 22 24 25 28 30 ...
 $ X15: num  1 1 1 33 35 39 40 44 43 53 ...
 $ X16: num  21 1 1 30 35 38 41 42 47 50 ...
 $ X17: num  1 1 1 24 32 26 34 38 40 48 ...
 $ X18: num  1 1 1 24 30 31 33 34 38 44 ...
 $ X19: num  1 1 1 25 34 37 38 41 44 48 ...
 $ X20: num  1 63 8 11 15 17 18 20 26 30 ...
 $ X21: num  1 1 1 1 1 1 1 1 1 1 ...
 $ X22: num  12 14 15 17 22 23 24 26 27 40 ...
 $ X23: num  1 1 1 22 25 31 33 38 43 49 ...
 $ X24: num  52 53 57 60 6 8 9 12 13 22 ...
 $ X25: num  60 62 7 8 12 14 15 21 22 25 ...
 $ X26: num  NA NA 1 2 3 4 5 7 6 8 ...
 $ X27: num  1 1 1 12 16 21 22 27 25 33 ...
 $ X28: num  1 1 1 12 13 17 18 24 23 26 ...
 $ X29: num  6 7 8 9 10 26 27 34 35 43 ...
 $ X30: num  1 1 1 6 5 21 22 23 24 32 ...
 $ X31: num  1 1 1 57 59 9 11 13 14 16 ...

The decimal places are not preserved, and the blanks are not filled in as NA. I have also tried

> gdp<-as.numeric(levels(gdp))[gdp]
Error in as.numeric(levels(gdp))[gdp] : invalid subscript type 'list'

Is there a way to turn these vectors into numeric values?

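【Comments】:

  • Thank you very much, I did not realize the problem was caused by the comma separators. However, when I try as.numeric, I get the error: invalid subscript type 'list'.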

Tags: r numeric r-factor


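【Solution 1】:

Let's break this down.

First, because gdp is a data frame, levels will return NULL. You are probably looking for the output of levels on each column of gdp. In that case you need to use something like lapply.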

levels(gdp)
# NULL
lapply(gdp, levels)
# this output will make sense
as.numeric(levels(gdp))[gdp]
# this will make no sense

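The error indicates that you cannot use a list (gdp) to subscript a vector.

To iterate over the columns of gdp, you will need something like lapply to process each component.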
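The failure can be reproduced on a small scale. The sketch below uses a hypothetical two-column data frame, toy, as a stand-in for gdp (the values are made up for illustration), and captures the error message instead of stopping:

```r
# Hypothetical stand-in for gdp: two factor columns with comma separators
toy <- data.frame(
  a = factor(c("", "1,145.31", "987.50")),
  b = factor(c("1,121.93", "2.75", "1,264.63"))
)

# A data frame itself carries no "levels" attribute
levels(toy)

# Each column does, so lapply() returns one character vector per column
lapply(toy, levels)

# Indexing a vector with the whole data frame (a list) is what fails;
# tryCatch() turns the error into its message so the script keeps running
res <- tryCatch(
  as.numeric(levels(toy))[toy],
  error = function(e) conditionMessage(e)
)
res
```

Here as.numeric(levels(toy)) is just an empty numeric vector, so the error comes entirely from the [toy] subscript.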

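gdp <- data.frame(lapply(gdp, function(x) {
    # non-factor columns are returned unchanged; for factors, strip the
    # commas from the levels, convert them, then index by the factor codes
    if(!is.factor(x)) x
    else as.numeric(gsub(",","",levels(x),fixed=TRUE))[x]
}))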
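As a worked illustration of the same idea, here is a minimal sketch on a hypothetical data frame toy (made-up values, not the real gdp). The helper name to_numeric is an assumption introduced for this example. It also assigns with toy[] <- lapply(...) instead of wrapping the result in data.frame(); that is a substitution for the code above, chosen so the column names are left exactly as they were.

```r
# Hypothetical stand-in for gdp: two factor columns and one numeric column
toy <- data.frame(
  a = factor(c("", "1,145.31", "987.50")),
  b = factor(c("1,121.93", "2.75", "1,264.63")),
  c = c(NA, 1.29, 1.32)
)

# as.numeric() applied directly to a factor returns its integer codes,
# which is the kind of output shown in the question
codes <- as.numeric(toy$a)

# Hypothetical helper: convert the levels once, then index by the factor
to_numeric <- function(x) {
  if (!is.factor(x)) return(x)                  # leave non-factors alone
  lev <- gsub(",", "", levels(x), fixed = TRUE) # drop thousands separators
  as.numeric(lev)[x]                            # "" becomes NA
}

# toy[] keeps the data frame structure and the original column names
toy[] <- lapply(toy, to_numeric)
str(toy)
```

The integer codes in the question most likely came from data.frame() turning the character columns back into factors, which was its default behaviour (stringsAsFactors = TRUE) before R 4.0.0, so the following as.numeric saw factors again.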

Your data set may be better suited to being a matrix, since all of it appears to be numeric. In that case:

gdp <- as.matrix(gdp)
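The order of the steps matters here. The sketch below, using two small hypothetical data frames (made-up values), suggests that as.matrix only gives a numeric matrix when every column has already been converted. If a factor column remains, the whole matrix becomes character.

```r
# Hypothetical data frame whose columns are already numeric
converted <- data.frame(x = c(NA, 1145.31), y = c(1121.93, 2.75))
m <- as.matrix(converted)
is.numeric(m)   # TRUE
dim(m)

# Hypothetical data frame that still contains a factor column
unconverted <- data.frame(x = factor(c("", "1,145.31")),
                          y = c(1121.93, 2.75))
mode(as.matrix(unconverted))   # "character"
```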
