【Question title】: Error in reshaping and concatenating data frame
【Posted】: 2017-09-13 19:53:15
【Question description】:

I have a data frame:

  out.new_cost out1.new_cost out2.new_cost out3.new_cost out4.new_cost out5.new_cost
1     11049.18      11056.08      11948.41      11048.89      11049.18      11056.14

I want each of these values to be a separate row. Then I want to add the following information as columns for each row:

alphas <- c(.005, .0005, .00005, .005, .0005, .00005) 

thresholds <- c(.0000001, .0000001, .0000001, .0000001, .0000001, .0000001)

iterations <- c(200000, 200000, 200000, 2000000, 2000000, 2000000)

I first looked here and tried:

reshape(costs, idvar = new_cost, direction = "long")

reshape(costs, direction = "long")

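But this returns the error:

Error in reshape(costs, direction = "long") : no 'reshapeWide' attribute, must specify 'varying'

What am I doing wrong, and how can I fix it?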

【Question comments】:

    Tags: r dataframe reshape


    【Solution 1】:

    The gather function from the tidyr package does the trick.

    df<-read.table(header=TRUE, text="out.new_cost out1.new_cost out2.new_cost out3.new_cost out4.new_cost out5.new_cost
        11049.18      11056.08      11948.41      11048.89      11049.18      11056.14")
    
    alphas <- c(.005, .0005, .00005, .005, .0005, .00005) 
    thresholds <- c(.0000001, .0000001, .0000001, .0000001, .0000001, .0000001)
    iterations <- c(200000, 200000, 200000, 2000000, 2000000, 2000000)
    
    library(tidyr)
    df<-gather(df)
    #rename first column
    names(df)[1]<-"cost"
    #remove everything after the .
    df$cost<-gsub("\\..*" , "", df$cost )
    
    #add the extra columns
    answer<-cbind(df, alphas, thresholds, iterations)
    

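    For this type of problem, the tidyr package is a better tool than base R. But if you only need to turn a single row into column format, a simple solution is to transpose, e.g. t(df), and then continue with the renaming and cbind commands.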
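A minimal sketch of the transpose route mentioned above, in base R only. The column name `value`, the `sub()` pattern, and the use of `data.frame()` in place of `cbind()` are choices made here for illustration and are not part of the original answer.

```r
# Rebuild the one-row data frame from the question
df <- data.frame(out.new_cost = 11049.18, out1.new_cost = 11056.08,
                 out2.new_cost = 11948.41, out3.new_cost = 11048.89,
                 out4.new_cost = 11049.18, out5.new_cost = 11056.14)

alphas     <- c(.005, .0005, .00005, .005, .0005, .00005)
thresholds <- rep(.0000001, 6)
iterations <- c(200000, 200000, 200000, 2000000, 2000000, 2000000)

# t() turns the 1 x 6 data frame into a 6 x 1 matrix;
# the old column names become the row names
m <- t(df)

answer <- data.frame(
  cost  = sub("\\..*", "", rownames(m)),  # keep only the part before the first dot
  value = as.vector(m),                   # drop matrix dimensions and names
  alphas, thresholds, iterations,
  stringsAsFactors = FALSE
)
```

The result has one row per original column, with the parameter vectors attached by position, so the order of `alphas`, `thresholds`, and `iterations` must match the column order of `df`.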

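    【Comments】:

    • Is there a way to do this without using a package?
    • And how do I change the identifier to "cost", with its values being "out, out1, out2, out3, out4", etc.?
    • For the second question: rename the column, then parse the column to remove everything before the underscore.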
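As a sketch of the parsing step discussed in these comments, assuming the goal is the short prefixes "out", "out1", and so on: `sub()` with a regular expression can strip the suffix from each name. The vector `cols` below is a stand-in for the key column and is introduced only for illustration.

```r
# Stand-in for the identifier column produced by the reshaping step
cols <- c("out.new_cost", "out1.new_cost", "out2.new_cost",
          "out3.new_cost", "out4.new_cost", "out5.new_cost")

# Option 1: remove the literal ".new_cost" suffix
ids <- sub("\\.new_cost$", "", cols)

# Option 2: remove everything from the first dot onward
ids2 <- sub("\\..*", "", cols)
```

Both options give the same identifiers for these names. Option 1 is stricter, because it leaves untouched any name that does not end in ".new_cost".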
    【Solution 2】:

    I hope this helps (base R only).

    #Create data frame
    costs = data.frame(out.new_cost=11049.18,
    out1.new_cost=11056.08,
    out2.new_cost=11948.41,
    out3.new_cost=11048.89,
    out4.new_cost=11049.18,
    out5.new_cost=11056.14)
    
    #Create variable with colnames
    costs.n = colnames(costs)
    #Reshape costs to costs.rshp and saving the colnames to times column
    costs.rshp = reshape(costs, direction="long", varying=list(costs.n), v.names="new_cost", times=costs.n)
    #Set the values of new columns
    alphas <- c(.005, .0005, .00005, .005, .0005, .00005) 
    thresholds <- c(.0000001, .0000001, .0000001, .0000001, .0000001, .0000001)
    iterations <- c(200000, 200000, 200000, 2000000, 2000000, 2000000)
    #Assign new columns 
    costs.rshp$alphas = alphas
    costs.rshp$thresholds = thresholds
    costs.rshp$iterations = iterations
    

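A possible follow-up, sketched under assumptions: `reshape()` also adds an `id` column and compound row names, which may not be wanted. The tidy-up below, including the names `long` and `cost`, is an illustration and not part of the original answer.

```r
# Rebuild the one-row data frame from the question
costs <- data.frame(out.new_cost = 11049.18, out1.new_cost = 11056.08,
                    out2.new_cost = 11948.41, out3.new_cost = 11048.89,
                    out4.new_cost = 11049.18, out5.new_cost = 11056.14)

# Same reshape call as in the answer: all columns vary,
# and the old column names are stored in the 'time' column
long <- reshape(costs, direction = "long",
                varying = list(names(costs)),
                v.names = "new_cost",
                times = names(costs))

# Turn the stored column names into short identifiers
long$cost <- sub("\\.new_cost$", "", long$time)

# Keep only the identifier and the value, and reset the row names
long <- long[, c("cost", "new_cost")]
rownames(long) <- NULL
```

The parameter columns can then be assigned exactly as in the answer, for example `long$alphas = alphas`.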
    【Comments】:
