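【Question title】: Replace NA with random values
【Posted】: 2021-03-11 01:35:27
【Question】:

I want to replace NA values with random values. The data frame has columns such as "DayOfWeek", and I don't know how to complete it. I tried the missForest function, but I think it only works on columns that hold integers. Do you know how I can fill in all the columns?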

travel <- read.csv("https://openmv.net/file/travel-times.csv")

library(missForest)
summary(travel)

set.seed(82)
travel1 <- prodNA(travel, noNA = 0.2)
travel2 <- missForest(travel1)

【Question comments】:

    Tags: r dataframe imputation


    【Solution 1】:

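    You can use the imputeTS package to insert random values into your time series. The function na_random does this. It works on numeric columns only; other columns are left unchanged, which can be useful, since you probably don't want random text in the Comments column.

    You can call it like this: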

    library("imputeTS")
    na_random(yourData)
    

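    The function finds the lowest and highest value of each column and inserts random values within that range.

    You can also define your own bounds for the random values: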

    library("imputeTS")
    na_random(yourData, lower_bound = 0, upper_bound = 25)
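
To make the idea concrete, here is a minimal base R sketch of the same behaviour. It assumes the random values are drawn uniformly between the bounds; fill_na_uniform is a hypothetical helper written for illustration, not part of imputeTS, and the package's internals may differ.

```r
# Hypothetical helper (not part of imputeTS): fill the NAs of one numeric
# vector with uniform random draws between lower and upper.
# When a bound is not given, the observed minimum / maximum is used.
fill_na_uniform <- function(x, lower = NULL, upper = NULL) {
  is_na <- is.na(x)
  if (is.null(lower)) lower <- min(x, na.rm = TRUE)
  if (is.null(upper)) upper <- max(x, na.rm = TRUE)
  x[is_na] <- runif(sum(is_na), min = lower, max = upper)
  x
}

set.seed(82)
# Toy data, made up for this example
toy <- data.frame(
  speed = c(70.6, NA, 82.9, NA, 78.3),
  day   = c("Monday", NA, "Friday", "Friday", NA),
  stringsAsFactors = FALSE
)

# Apply the helper to numeric columns only; other columns keep their NAs
num_cols <- vapply(toy, is.numeric, logical(1))
toy[num_cols] <- lapply(toy[num_cols], fill_na_uniform)
```

After this runs, toy$speed has no NAs and every filled value lies between 70.6 and 82.9, while toy$day still has its two NAs.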
    

    For your data, this could look like this:

    library("imputeTS")
    
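    # Read empty strings as NA so the columns get the right data types
    travel <- read.csv("https://openmv.net/file/travel-times.csv", na.strings = "")
    # FuelEconomy contains "-" placeholders, so it is read as character;
    # as.numeric() turns those entries into NA (with a coercion warning)
    travel$FuelEconomy <- as.numeric(travel$FuelEconomy)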
    
    
    # To perform the missing data replacement
    travel <- na_random(travel)
    

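    【Comments】:

      【Solution 2】:

      First, if you want "" strings to be read as NA, you need to pass the extra argument na.strings = "" to read.csv. Second, do you mean replacing the NA observations of a variable with randomly chosen observed values of the same variable? If so, consider the following procedure: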

      travel <- read.csv("https://openmv.net/file/travel-times.csv", na.strings = "")
      
      set.seed(82)
      res <- data.frame(lapply(travel, function(x) {
        is_na <- is.na(x)
        replace(x, is_na, sample(x[!is_na], sum(is_na), replace = TRUE))
      }))
      

      res looks like this:

               Date StartTime DayOfWeek GoingTo Distance MaxSpeed AvgSpeed AvgMovingSpeed FuelEconomy TotalTime MovingTime Take407All                                      Comments
      1    1/6/2012     16:37    Friday    Home    51.29    127.4     78.3           84.8         8.5      39.3       36.3         No                         Medium amount of rain
      2    1/6/2012     08:20    Friday     GSK    51.63    130.3     81.8           88.9         8.5      37.9       34.9         No                             Put snow tires on
      3    1/4/2012     16:17 Wednesday    Home    51.27    127.4     82.0           85.8         8.5      37.5       35.9         No                                    Heavy rain
      4    1/4/2012     07:53 Wednesday     GSK    49.17    132.3     74.2           82.9        8.31      39.8       35.6         No                     Accident blocked 407 exit
      5    1/3/2012     18:57   Tuesday    Home    51.15    136.2     83.4           88.1        9.08      36.8       34.8         No                              Rain, rain, rain
      6    1/3/2012     07:57   Tuesday     GSK    51.80    135.8     84.5           88.8        8.37      36.8       35.0         No                           Backed up at Bronte
      7    1/2/2012     17:31    Monday    Home    51.37    123.2     82.9           87.3           -      37.2       35.3         No Pumped tires up: check fuel economy improved?
      8    1/2/2012     07:34    Monday     GSK    49.01    128.3     77.5           85.9           -      37.9       34.3         No Pumped tires up: check fuel economy improved?
      9  12/23/2011     08:01    Friday     GSK    52.91    130.3     80.9           88.3        8.89      39.3       36.0         No                        Police slowdown on 403
      10 12/22/2011     17:19  Thursday    Home    51.17    122.3     70.6           78.1        8.89      43.5       39.3         No                    Start early to run a batch
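
One caveat with the resampling approach: when a numeric column has exactly one observed value, say 5, sample(5, n) draws from 1:5 rather than returning 5, and a column with no observed values makes sample() fail. The sketch below samples positions instead of values to avoid both cases. fill_na_resample is a hypothetical helper name and the toy data are made up for illustration.

```r
# Hypothetical helper: replace NAs with values resampled from the observed
# entries of the same vector. Positions are sampled rather than values, so
# a single observed number is never interpreted by sample() as 1:n.
fill_na_resample <- function(x) {
  is_na <- is.na(x)
  observed <- which(!is_na)
  # Nothing to fill, or nothing to draw from: return the input unchanged
  if (!any(is_na) || length(observed) == 0) return(x)
  picks <- observed[sample.int(length(observed), sum(is_na), replace = TRUE)]
  x[is_na] <- x[picks]
  x
}

set.seed(82)
# Toy data: "distance" has a single observed value
toy <- data.frame(
  day      = c("Monday", NA, "Friday", NA),
  distance = c(NA, 51.29, NA, NA),
  stringsAsFactors = FALSE
)
filled <- data.frame(lapply(toy, fill_na_resample), stringsAsFactors = FALSE)
```

Here every entry of filled$distance is 51.29, and the filled entries of filled$day are drawn from "Monday" and "Friday". Because the helper works on any vector type, it fills character columns such as DayOfWeek as well as numeric ones.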
      

      【Comments】:
