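【Question title】: Function: number of paid invoices prior to the creation date of a customer's new invoice
【Posted】: 2023-03-17 10:50:01
【Question description】:

I am trying to write a function, although I am not sure it is the right approach. I am new to RStudio. For each customer, I want to count the number of invoices paid before the creation date of each new invoice, plus a second column with the number of late-paid invoices before that same creation date. My data: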

set.seed(123)

names<- rep(LETTERS[1:2], each = 16)
id<- seq(1,32)
daysp<- runif(1:32,1,32)
startdate <-c("20-02-2018","01-03-2018","13-03-2018","20-03-2018","28-03-2018","05-04-2018","10-04-2018","13-04-2018",
        "16-04-2018","19-04-2018","04-05-2018","14-05-2018","23-05-2018","04-06-2018","12-06-2018","19-06-2018",
        "26-04-2018","02-05-2018","07-05-2018","07-05-2018","07-05-2018","14-05-2018","29-05-2018","12-06-2018",
        "12-06-2018","18-06-2018","11-07-2018","11-07-2018","17-07-2018","30-07-2018","03-08-2018","07-08-2018")
startdate<-as.Date(startdate,"%d-%m-%Y" )
paydate<- startdate + daysp
class <- c("Payed", "Payed","Payed", "Delayed","Payed", "Delayed","Delayed", "Delayed","Payed", "Delayed",
       "Payed", "Delayed","Payed", "Delayed","Payed", "Delayed","Payed", "Delayed","Payed", "Delayed",
       "Payed", "Delayed","Payed", "Delayed","Payed", "Delayed","Delayed", "Delayed","Payed", "Delayed",
       "Payed", "Delayed")
df<-data.frame(names,id,daysp,startdate,paydate,class)

My expected result looks like this:

nopip<-c(0,0,1,1,3,3,4,4,4,5,7,10,10,12,12,14,0,0,2,2,2,2,3,6,6,6,9,9,10,12,13,14)
nopip_delayed<-c(0,0,0,0,0,0,1,1,1,2,3,5,5,6,6,6,0,0,1,1,1,1,1,3,3,3,4,4,5,6,7,8)

As in this data frame:

df<-cbind(df,nopip,nopip_delayed)

Thanks in advance.

【Discussion】:

    Tags: r function dataframe


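    【Solution 1】:

    There are several ways to achieve this, but here is one using base R, which is easy to follow and a good foundation to build on.

    This uses lapply to step through the data.frame row by row, checking whether the name matches that row's name and whether the pay date is earlier than that row's start date.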

    # unlist() flattens the list returned by lapply into a plain numeric column
    df$nopip2 <- unlist(lapply(seq_len(nrow(df)), function(x) sum(df$names == df$names[x] & df$paydate < df$startdate[x])))
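
To see what the expression inside sum() does for a single row, here is a minimal sketch on a tiny made-up data frame. The name `toy` and all of its values are illustrative assumptions, not the question's data.

```r
# Toy data (hypothetical values, not the question's data)
toy <- data.frame(
  names     = c("A", "A", "A", "B"),
  startdate = as.Date(c("2018-01-01", "2018-01-10", "2018-01-20", "2018-01-05")),
  paydate   = as.Date(c("2018-01-05", "2018-01-25", "2018-01-22", "2018-01-08"))
)

x <- 3  # inspect the third row (customer A, created 2018-01-20)

# TRUE where the row belongs to the same customer AND was paid
# strictly before row x was created
mask <- toy$names == toy$names[x] & toy$paydate < toy$startdate[x]

# Summing a logical vector counts its TRUE values
n_paid_before <- sum(mask)
```

Only the first row satisfies both conditions here, so the count for row 3 is 1. The fourth row was also paid before 2018-01-20 but belongs to customer B, so it is excluded.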
    

    This performs the same steps as the previous call, but adds an extra check on whether the class is 'Delayed'.

    # Same count, restricted to invoices whose class is 'Delayed'
    df$nopip_delayed2 <- unlist(lapply(seq_len(nrow(df)), function(x) sum(df$names == df$names[x] & df$paydate < df$startdate[x] & df$class == 'Delayed')))
    

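    Confirm that the computed results match the expected output (note that setequal() compares the sets of distinct values, not the columns row by row):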

    > setequal(df$nopip, df$nopip2)
    [1] TRUE
    > setequal(df$nopip_delayed, df$nopip_delayed2)
    [1] TRUE
    

    Added an example that sums daysp over the same invoices that nopip counts:

    # Multiplying by the logical mask keeps daysp where it is TRUE and zeroes it elsewhere
    df$nopip_daysp <- unlist(lapply(seq_len(nrow(df)), function(x) sum((df$names == df$names[x] & df$paydate < df$startdate[x]) * df$daysp)))
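
The multiplication trick can be checked in isolation. The sketch below uses two short made-up vectors (`paid_before` and `daysp` here are hypothetical stand-ins, not columns from the question) and shows that multiplying by a logical mask gives the same total as subsetting with it.

```r
paid_before <- c(TRUE, FALSE, TRUE, FALSE)  # hypothetical mask
daysp       <- c(10, 20, 5, 7)              # hypothetical days-to-payment values

# TRUE is coerced to 1 and FALSE to 0, so this is 10*1 + 20*0 + 5*1 + 7*0
weighted <- sum(paid_before * daysp)

# Equivalent formulation: keep only the elements where the mask is TRUE
subsetted <- sum(daysp[paid_before])
```

Both forms give 15 for these values. The subsetting form may read more clearly, while the multiplication form stays closest to the counting expression used above.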
    

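    As a side note, iterating over a data.frame row by row is an expensive option when the number of rows is large. Should it come to that, though, the steps above make for an easy transition to a faster approach.

    【Comments】:

    • Thanks for yesterday's answer. It was very helpful for the model I am preparing. I have been trying to use lapply to sum the daysp of each id together with their respective nopip. If it is not too much trouble, could you give me some advice or an approach?
    • @FenicoAlarcon I will update the answer with an example, but it comes down to multiplying the value you want to sum by the same logic as above, where TRUE evaluates to 1 and FALSE evaluates to 0. This sums the values only where the logic is TRUE.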
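
One possible way to avoid scanning every row for every row is sketched below. This is not part of the original answer: it assumes base R's findInterval() with left.open = TRUE, and `toy` and `count_paid_before` are hypothetical names introduced only for illustration. It still loops over customers, but not over individual invoices.

```r
# Toy data (hypothetical values, not the question's data)
toy <- data.frame(
  names     = c("A", "A", "A", "B", "B"),
  startdate = as.Date(c("2018-01-01", "2018-01-10", "2018-01-20",
                        "2018-01-05", "2018-01-15")),
  paydate   = as.Date(c("2018-01-05", "2018-01-25", "2018-01-22",
                        "2018-01-10", "2018-01-20"))
)

# For each start date, count the pay dates strictly earlier than it.
# With left.open = TRUE, findInterval() returns the number of elements
# of the sorted vector that are < each query value.
count_paid_before <- function(start, pay) {
  findInterval(as.numeric(start), sort(as.numeric(pay)), left.open = TRUE)
}

# One pass per customer instead of one pass per row
fast <- integer(nrow(toy))
for (rows in split(seq_len(nrow(toy)), toy$names)) {
  fast[rows] <- count_paid_before(toy$startdate[rows], toy$paydate[rows])
}

# Row-by-row version from the answer, kept for comparison
slow <- unlist(lapply(seq_len(nrow(toy)), function(x)
  sum(toy$names == toy$names[x] & toy$paydate < toy$startdate[x])))
```

On this toy data both `fast` and `slow` come out as 0, 1, 1, 0, 1. The delayed-only count could be obtained the same way by passing only the pay dates of 'Delayed' invoices as the second argument.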