【Question title】: How to find out the most occurring range in a list
【Posted】: 2016-06-24 13:35:18
【Question】:

I have plotted a graph in R:

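# Read the logged OBD data
OBD <- read.csv("OBD.CSV", header = TRUE, stringsAsFactors = FALSE)
x1 <- OBD$Time1
x2 <- OBD$Time2
y1 <- OBD$Vehicle_speed
y2 <- OBD$Engine_speed
# Leave room on the right for a second y axis
par(mar = c(5, 4, 4, 5) + .1)
plot(x1, y1, type = "l", col = "yellow", ylab = "Vehicle speed")
# Overlay the second series on the same plot region
par(new = TRUE)
plot(x2, y2, type = "l", col = "blue4", xaxt = "n", yaxt = "n", xlab = "Time", ylab = "")
axis(4)
mtext("Engine speed", side = 4, line = 3)
# Legend labels listed in the same order as the colours
legend("topleft", col = c("blue4", "yellow"), lty = 1, legend = c("Engine speed", "Vehicle speed"))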

Sample data, in CSV format:

Vehicle_speed,Time1,Engine_speed,Time2,Engine_torq,Time3,Acc_pedal,Time4,Eng_fuel_rate,Time5
4.98,0,650,0,11,0,0,0,1.15,0
4.98,0,650,0,11,0,0,0,1.2,0.002
4.96,0,650,0.001,11,0.001,0,0.001,1.2,0.003
4.96,0,651,0.001,11,0.001,0,0.001,1.2,0.005
4.94,0.001,651,0.001,11,0.001,0,0.001,1.2,0.007
4.94,0.001,651,0.001,11,0.001,0,0.002,1.2,0.008
4.91,0.001,650.5,0.001,11,0.001,0,0.002,1.2,0.01
4.91,0.001,650.5,0.001,11,0.001,0,0.002,1.2,0.012
4.89,0.001,650.5,0.002,11,0.002,0,0.003,1.15,0.013
4.89,0.001,650.5,0.002,11,0.002,0,0.003,1.15,0.015
4.87,0.002,649.5,0.002,11,0.002,0,0.003,1.15,0.017
4.87,0.002,649.5,0.002,11,0.002,0,0.004,1.15,0.018
4.85,0.002,650,0.002,11,0.002,0,0.004,1.15,0.02
4.85,0.002,650,0.002,11,0.002,0,0.004,1.15,0.022
4.82,0.002,650,0.003,11,0.003,0,0.005,1.2,0.023

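From this table, I just want to find the most frequently occurring engine speed and vehicle speed, or the most frequently occurring range of each.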

【Comments】:

    Tags: r csv numbers character


    【Solution 1】:

    To find the most common vehicle speed (the mode), you can get it from table:

    mySpeeds <- table(df$Vehicle_speed)
    modeSpeed <- as.numeric(names(mySpeeds)[which.max(mySpeeds)])
    
    modeSpeed
    [1] 4.85
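
In the sample data several speeds occur equally often, and which.max only reports the first of them. A minimal sketch of listing every tied value, assuming the speeds are held in a plain numeric vector (the names `speeds`, `first_mode` and `all_modes` are illustrative, not from the original answer):

```r
# Vehicle speeds from the sample data in the question
speeds <- c(4.98, 4.98, 4.96, 4.96, 4.94, 4.94, 4.91, 4.91,
            4.89, 4.89, 4.87, 4.87, 4.85, 4.85, 4.82)

counts <- table(speeds)

# which.max() returns only the first position that holds the maximum count
first_mode <- as.numeric(names(counts)[which.max(counts)])

# Keep every value whose count equals the maximum count
all_modes <- as.numeric(names(counts)[as.vector(counts) == max(counts)])

print(first_mode)
print(all_modes)
```

When many values tie like this, a single mode says little, which is one reason to look at ranges instead.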
    

    To find the most common range of speeds, use cut:

    # get range categories
    df$speedRange <- cut(df$Vehicle_speed, breaks=c(-Inf, 4.85, 4.90, 4.95, Inf))
    
    mySpeedsRange <- table(df$speedRange)
    modeSpeedRange <- names(mySpeedsRange)[which.max(mySpeedsRange)]
    
    modeSpeedRange
    [1] "(4.85,4.9]"
    

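    cut takes a numeric variable and returns a factor variable based on its second argument (breaks). You can supply breaks either as a single number giving the number of intervals, or as a vector of unique cut points. I included -Inf and Inf to make sure every value is covered.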
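
The two ways of supplying breaks can be compared side by side. This is only a sketch on the sample speeds; the variable names are made up for illustration, and the `right = FALSE` variant is included to show how the choice of closed side moves values that sit exactly on a cut point:

```r
# Vehicle speeds from the sample data in the question
speeds <- c(4.98, 4.98, 4.96, 4.96, 4.94, 4.94, 4.91, 4.91,
            4.89, 4.89, 4.87, 4.87, 4.85, 4.85, 4.82)

# A single number: the range of the data is split into that many equal-width intervals
by_count <- cut(speeds, breaks = 4)

# A vector: explicit cut points; intervals are right-closed, (a, b], by default
by_points <- cut(speeds, breaks = c(-Inf, 4.85, 4.90, 4.95, Inf))

# right = FALSE makes the intervals left-closed, [a, b), instead
by_points_left <- cut(speeds, breaks = c(-Inf, 4.85, 4.90, 4.95, Inf), right = FALSE)

print(table(by_count))
print(table(by_points))
print(table(by_points_left))
```

With the default right-closed intervals the two readings of 4.85 fall into the lowest bin; with `right = FALSE` they move up into the next one, so the most common range can change with that setting.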

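    【Comments】:

    • Hi, could you please explain how df$speedRange works?
    • I get an error :( Error in cut.default(df$Vehicle_speed, breaks = c(-Inf, 4.85, 4.9, 4.95, : 'x' must be numeric
    • According to the error, your Vehicle_speed variable is not numeric. You can coerce it with as.numeric, but it is probably worth finding out why it isn't (maybe there is a non-numeric character somewhere).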
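
One way to track down the offending entries is to convert the column and see which values turn into NA. The sketch below uses a made-up character vector (`raw_speed`, with a deliberately broken entry) rather than the asker's file, whose actual contents are unknown:

```r
# Hypothetical column in which one entry contains a stray non-numeric character
raw_speed <- c("4.98", "4.96", "4.94x", "4.91")

# as.numeric() turns anything it cannot parse into NA (normally with a warning)
speed_num <- suppressWarnings(as.numeric(raw_speed))

# Entries that were present before conversion but are NA afterwards are the culprits
bad <- which(is.na(speed_num) & !is.na(raw_speed))
print(raw_speed[bad])

# If the column was read as a factor, convert via character first:
# as.numeric() on a factor returns the internal level codes, not the values
speed_factor <- factor(c("4.98", "4.96", "4.91"))
print(as.numeric(as.character(speed_factor)))
```

Running `str()` on the data frame right after `read.csv` shows at a glance which columns were not read as numeric.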
    【Solution 2】:
    OBD <- read.csv(text = "Vehicle_speed,Time1,Engine_speed,Time2,Engine_torq,Time3,Acc_pedal,Time4,Eng_fuel_rate,Time5
             4.98,0,650,0,11,0,0,0,1.15,0
             4.98,0,650,0,11,0,0,0,1.2,0.002
             4.96,0,650,0.001,11,0.001,0,0.001,1.2,0.003
             4.96,0,651,0.001,11,0.001,0,0.001,1.2,0.005
             4.94,0.001,651,0.001,11,0.001,0,0.001,1.2,0.007
             4.94,0.001,651,0.001,11,0.001,0,0.002,1.2,0.008
             4.91,0.001,650.5,0.001,11,0.001,0,0.002,1.2,0.01
             4.91,0.001,650.5,0.001,11,0.001,0,0.002,1.2,0.012
             4.89,0.001,650.5,0.002,11,0.002,0,0.003,1.15,0.013
             4.89,0.001,650.5,0.002,11,0.002,0,0.003,1.15,0.015
             4.87,0.002,649.5,0.002,11,0.002,0,0.003,1.15,0.017
             4.87,0.002,649.5,0.002,11,0.002,0,0.004,1.15,0.018
             4.85,0.002,650,0.002,11,0.002,0,0.004,1.15,0.02
             4.85,0.002,650,0.002,11,0.002,0,0.004,1.15,0.022
             4.82,0.002,650,0.003,11,0.003,0,0.005,1.2,0.023")
    
    > table(OBD$Engine_speed)
    
    649.5   650 650.5   651 
        2     6     4     3 
    

    Or for several columns:

    tables <- apply(OBD[ ,c(1,3,5)], 2, table)
    
    > tables
    $Vehicle_speed
    
    4.82 4.85 4.87 4.89 4.91 4.94 4.96 4.98 
       1    2    2    2    2    2    2    2 
    
    $Engine_speed
    
    649.5   650 650.5   651 
        2     6     4     3 
    
    $Engine_torq
    
    11 
    15 
    

    To get only the most frequent value in each column:

    > lapply(tables, which.max)
    $Vehicle_speed
    4.85 
       2 
    
    $Engine_speed
    650 
      2 
    
    $Engine_torq
    11 
     1 
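
Note that which.max prints the position of the winning entry inside each table, not its count: the 2 under 650 means "second entry", while 650 actually occurs 6 times. A small sketch that returns both the value and its count, assuming the three columns are rebuilt by hand as below and using `lapply` over the columns instead of `apply` (the names `mode_value` and `mode_count` are illustrative):

```r
# The three columns of interest, copied from the sample data
OBD <- data.frame(
  Vehicle_speed = c(4.98, 4.98, 4.96, 4.96, 4.94, 4.94, 4.91, 4.91,
                    4.89, 4.89, 4.87, 4.87, 4.85, 4.85, 4.82),
  Engine_speed  = c(650, 650, 650, 651, 651, 651, 650.5, 650.5,
                    650.5, 650.5, 649.5, 649.5, 650, 650, 650),
  Engine_torq   = rep(11, 15)
)

# One frequency table per column
tables <- lapply(OBD, table)

# Most frequent value in each column (first one in case of ties)
mode_value <- sapply(tables, function(t) as.numeric(names(t)[which.max(t)]))

# Number of times that value occurs
mode_count <- sapply(tables, function(t) as.vector(max(t)))

print(data.frame(mode_value, mode_count))
```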
    

    Does this solve the problem?

    【Comments】:

    • Hi Alex, thanks for your help. I need output like this; I did it in MS Excel. My CSV file contains 82000 rows and 10 columns.

          Engine_speed    Frequency
          Below 600            6818
          600-800             12014
          800-1000             2952
          1000-1200            4443
          1200-1400            7824
          1400-1600            9969
          1600-1800           12682
          1800-2000            6794
          2000-2200            9922
          2200-2400            3790
          2400-2600            5197
          2600-2800             293
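
A frequency table of this shape can be sketched with cut and table. The engine speeds below are made-up values chosen to span several bins, and treating each range as including its lower edge (`right = FALSE`, so 800 counts as 800-1000) is an assumption, since the comment does not say how Excel binned the edges:

```r
# Hypothetical engine speeds; the real column would be OBD$Engine_speed
engine_speed <- c(550, 650, 650.5, 790, 800, 950, 1200, 1350, 2650, 2790)

# Bin edges every 200 rpm from 600 to 2800, plus an open-ended bin below 600
edges  <- seq(600, 2800, by = 200)
labels <- c("Below 600", paste(head(edges, -1), tail(edges, -1), sep = "-"))

# Assumption: each range includes its lower edge and excludes its upper edge
bins <- cut(engine_speed, breaks = c(-Inf, edges), labels = labels, right = FALSE)

# Frequency table as a two-column data frame
freq <- as.data.frame(table(Engine_speed = bins), responseName = "Frequency")
print(freq)

# Most frequently occurring range (first one in case of ties)
most_common_range <- as.character(freq$Engine_speed[which.max(freq$Frequency)])
print(most_common_range)
```

Values of 2800 or more fall outside the last bin and become NA here; adding `Inf` as a final break (with a matching extra label) would catch them.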