【Question title】: Obtaining Percentage for Date Observations
【Posted】: 2022-01-08 15:56:23
【Question description】:

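I am very new to R and am struggling with this concept. I have a data frame that looks like this: [image]

I used summary(FoodFacilityInspections$DateRecent) to get the number of observations for each "date" listed. However, I have 3932 observations and would like the following summary:

  • The date with the most observations, and its percentage
  • The percentage of observations in the "recent" category

I tried: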

> count(FoodFacilityInspections$DateRecent)
> Error in UseMethod("count") :
>   no applicable method for 'count' applied to an object of class "factor"

【Question comments】:

    Tags: r dataframe date percentage summary


    【Solution 1】:

    An updated solution that builds a table of counts, running totals, percentages, and cumulative percentages from your data.

    library(dplyr)  # provides %>%, group_by(), summarise(), mutate() and n() used below
    
    data<-data.frame("ScoreRecent"=c(100,100,100,100,100,100,100,100,100),
                     "DateRecent"=c("7/23/2021", "7/8/2021","5/25/2021","5/19/2021","5/20/2021","5/13/2021","5/17/2021","5/18/2021","5/18/2021"),
                     "Facility_Type_Description"=c("Retail Food Stores", "Retail Food Stores","Food Service Establishment","Food Service Establishment","Food Service Establishment","Food Service Establishment","Food Service Establishment","Food Service Establishment","Food Service Establishment"),
                     "Premise_zip"=c(40207,40207,40207,40206,40207,40206,40207,40206,40206),
                     "Opening_Date"=c("6/27/1988","6/29/1988","10/20/2009","2/28/1989","10/20/2009","10/20/2009","10/20/2009","10/20/2009", "10/20/2009"))
    
    
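    # Frequency table for one column of a data frame
    tab <- function(dataset, var){
      
      dataset %>%
        group_by({{var}}) %>%                    # {{var}} passes the unquoted column name through
        summarise(n=n()) %>%                     # n = number of rows per group
        mutate(total = cumsum(n),                # running total of the counts
               percent = n / sum(n) * 100,       # share of all rows, in percent
               cumulativepercent = cumsum(n / sum(n) * 100))
      
    }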
    
    tab(data, Facility_Type_Description)
    
     Facility_Type_Description      n total percent cumulativepercent
      <chr>                      <int> <int>   <dbl>             <dbl>
    1 Food Service Establishment     7     7    77.8              77.8
    2 Retail Food Stores             2     9    22.2             100  
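The same idea can be pointed at the date column to answer the first part of the question. The following is a minimal sketch, assuming dplyr is installed and reusing the nine sample dates from the example data above; the names dates and top_date are illustrative only. Note that dplyr's count() expects a data frame as its first argument, which is why calling it on a bare factor column fails.

```r
library(dplyr)

# Sample dates copied from the example data above
dates <- data.frame(DateRecent = c("7/23/2021", "7/8/2021", "5/25/2021",
                                   "5/19/2021", "5/20/2021", "5/13/2021",
                                   "5/17/2021", "5/18/2021", "5/18/2021"))

top_date <- dates %>%
  count(DateRecent, name = "count") %>%        # one row per date with its frequency
  mutate(percent = count / sum(count) * 100) %>%
  slice_max(count, n = 1)                      # keep the date(s) with the highest count

top_date
```

With the sample data this keeps 5/18/2021, which occurs twice out of nine rows. slice_max() keeps ties by default, so several rows come back when more than one date shares the maximum.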
    

    【Comments】:

      【Solution 2】:

      A one-liner using table() and proportions() inside within().

      within(as.data.frame.table(with(mtcars, table(cyl))), Pc <- proportions(Freq)*100)
      #   cyl Freq     Pc
      # 1   4   11 34.375
      # 2   6    7 21.875
      # 3   8   14 43.750
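For readers new to R, the one-liner may be easier to follow when split into steps. This is a sketch of an equivalent computation, not the answer's own code; the intermediate names counts and df are illustrative, and proportions() requires R 4.0.1 or later (prop.table() does the same job in older versions).

```r
# Step 1: frequency table of the cyl column
counts <- table(mtcars$cyl)

# Step 2: convert the table to a data frame with columns Var1 and Freq
df <- as.data.frame(counts)

# Step 3: proportions() rescales the counts so they sum to 1;
# multiplying by 100 turns them into percentages
df$Pc <- proportions(df$Freq) * 100

df
```

The first column is named Var1 here rather than cyl because table() was called on a bare vector instead of inside with().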
      

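      【Comments】:

        【Solution 3】:

        You can use the table() function to find the date with the most occurrences. You can then loop over each item in the table (the dates, in your case) and divide it by the total number of rows, like this (also using the mtcars dataset):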

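        # Build the frequency table once and reuse it
        counts <- table(mtcars$cyl)
        counts
        
        # Preallocate, then fill in each group's share of all rows
        percent <- numeric(length(counts))
        for (i in seq_along(counts)){
            percent[i] <- counts[i]/nrow(mtcars) * 100
        }
        output <- cbind(counts, percent)
        output
        
          counts percent
        4     11  34.375
        6      7  21.875
        8     14  43.750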
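Applied to dates, the table approach can also pick out the most frequent date directly. The sketch below uses base R only and reuses the nine sample dates from Solution 1 as a stand-in for the real DateRecent column; the variable names are illustrative.

```r
# Sample dates copied from the example data in Solution 1
dates <- c("7/23/2021", "7/8/2021", "5/25/2021", "5/19/2021", "5/20/2021",
           "5/13/2021", "5/17/2021", "5/18/2021", "5/18/2021")

counts  <- table(dates)
percent <- counts / length(dates) * 100   # vectorised, so no loop is needed

top <- which.max(counts)                  # position of the first maximum
names(counts)[top]                        # most frequent date
percent[[top]]                            # its share of all observations, in percent
```

which.max() returns only the first maximum, so if several dates tie for the highest count, only one of them is reported.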
        

        【Comments】:

          【Solution 4】:

          Using built-in data, because you did not provide sample data.

          library(data.table)
          dtcars <- data.table(mtcars, keep.rownames = TRUE)
          

          Solution

          dtcars[, .("count"=.N, "percent"=.N/dtcars[, .N]*100), 
                 by=cyl]
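The second part of the question, the percentage of observations in the "recent" category, is not defined in the post. The sketch below therefore makes a loud assumption: "recent" is taken to mean on or after a cutoff date, and the cutoff value itself is hypothetical. It uses data.table and the nine sample dates from Solution 1.

```r
library(data.table)

# Sample dates copied from the example data in Solution 1
dt <- data.table(DateRecent = c("7/23/2021", "7/8/2021", "5/25/2021",
                                "5/19/2021", "5/20/2021", "5/13/2021",
                                "5/17/2021", "5/18/2021", "5/18/2021"))

# Parse the month/day/year strings into Date objects
dt[, date := as.Date(DateRecent, format = "%m/%d/%Y")]

# Hypothetical cutoff: anything on or after this date counts as "recent"
cutoff <- as.Date("2021-07-01")

# mean() of a logical vector is the fraction of TRUE values
recent_percent <- dt[, mean(date >= cutoff) * 100]
recent_percent
```

Converting to Date first matters: comparing the raw strings or factor levels would sort alphabetically rather than chronologically.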
          

          【Comments】:
