[Posted]: 2020-06-06 22:57:27
[Question]:
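I have a raw dataset of metering data. The data contains duplicates, because a meter's readings are copied once for every group name, reading client ID, or reading short name that exists for it in the database. Unfortunately, this differs from one meter ID to the next: some meters have no duplicates at all, while others have the same data two or even three times. As an aid, every reading carries its own timestamp (the "Reading timestamp" column).
Question: I want to scan by meter ID only and, wherever the same data has been copied for a group name, reading client ID, or reading short name, discard the copies and keep just one set of data. An example follows below. I have marked the rows where a new copy starts.
What I have tried: the duplicated function, or the following: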
library(dplyr)

# Keeps only the first row for each Meter.ID, so it discards far too much
df2 <- df %>%
  distinct(Meter.ID, .keep_all = TRUE)
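My current approach is too selective and not general, and I am struggling to find a general solution. Ideally it would use the timestamps, which start over every time the data is copied.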
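The timestamp idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the rows are ordered as in the sample (timestamps descending within each copy, copies stored one after another), the column names are the ones `read.csv` would produce (`Meter.ID`, `Group.name`, `Reading.timestamp`), and the tiny data frame is made up for the example. The helper column `copy` is hypothetical.

```r
library(dplyr)

# Made-up sample with the same shape as the real data: two copies of meter 204
df <- data.frame(
  Meter.ID          = c(204, 204, 204, 204, 204, 204),
  Group.name        = c("A", "A", "A", "B", "B", "B"),
  Reading           = 90.436,
  Reading.timestamp = c(1580597999, 1580594400, 1580590800,
                        1580597999, 1580594400, 1580590800)
)

df2 <- df %>%
  group_by(Meter.ID) %>%
  # Within one copy the timestamps fall; a rise means a new copy has started.
  # cumsum() turns those rises into a running copy index (0 = first copy).
  mutate(copy = cumsum(
    Reading.timestamp > lag(Reading.timestamp,
                            default = first(Reading.timestamp))
  )) %>%
  filter(copy == 0) %>%   # keep only the first copy of each meter
  ungroup() %>%
  select(-copy)
```

If the rows are not guaranteed to arrive in this order, the restart detection would not be reliable and a key-based approach would be safer.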
Sample data:
"Meter ID","Group name","Reading Client ID","Reading Short Name",Reading,"Reading timestamp",Reading2
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580597999," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580594400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580590800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580587200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580583600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580580000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580576400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580572800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580569200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580565600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580562000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580558400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580554800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580551200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580547600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580544000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580540400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580536800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580533200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580529600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580526000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580522400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580518800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580515200," - "
204,0100G,199,06865,90.436,1580597999," - "
204,0100G,199,06865,90.436,1580594400," - "
204,0100G,199,06865,90.436,1580590800," - "
204,0100G,199,06865,90.436,1580587200," - "
204,0100G,199,06865,90.436,1580583600," - "
204,0100G,199,06865,90.436,1580580000," - "
204,0100G,199,06865,90.436,1580576400," - "
204,0100G,199,06865,90.436,1580572800," - "
204,0100G,199,06865,90.436,1580569200," - "
204,0100G,199,06865,90.436,1580565600," - "
204,0100G,199,06865,90.436,1580562000," - "
204,0100G,199,06865,90.436,1580558400," - "
204,0100G,199,06865,90.436,1580554800," - "
204,0100G,199,06865,90.436,1580551200," - "
204,0100G,199,06865,90.436,1580547600," - "
204,0100G,199,06865,90.436,1580544000," - "
204,0100G,199,06865,90.436,1580540400," - "
204,0100G,199,06865,90.436,1580536800," - "
204,0100G,199,06865,90.436,1580533200," - "
204,0100G,199,06865,90.436,1580529600," - "
204,0100G,199,06865,90.436,1580526000," - "
204,0100G,199,06865,90.436,1580522400," - "
204,0100G,199,06865,90.436,1580518800," - "
204,0100G,199,06865,90.436,1580515200," - "
204,"0100G test2",199,06865,90.436,1580597999," - "
204,"0100G test2",199,06865,90.436,1580594400," - "
204,"0100G test2",199,06865,90.436,1580590800," - "
204,"0100G test2",199,06865,90.436,1580587200," - "
204,"0100G test2",199,06865,90.436,1580583600," - "
204,"0100G test2",199,06865,90.436,1580580000," - "
204,"0100G test2",199,06865,90.436,1580576400," - "
204,"0100G test2",199,06865,90.436,1580572800," - "
204,"0100G test2",199,06865,90.436,1580569200," - "
204,"0100G test2",199,06865,90.436,1580565600," - "
204,"0100G test2",199,06865,90.436,1580562000," - "
204,"0100G test2",199,06865,90.436,1580558400," - "
204,"0100G test2",199,06865,90.436,1580554800," - "
204,"0100G test2",199,06865,90.436,1580551200," - "
204,"0100G test2",199,06865,90.436,1580547600," - "
204,"0100G test2",199,06865,90.436,1580544000," - "
204,"0100G test2",199,06865,90.436,1580540400," - "
204,"0100G test2",199,06865,90.436,1580536800," - "
204,"0100G test2",199,06865,90.436,1580533200," - "
204,"0100G test2",199,06865,90.436,1580529600," - "
204,"0100G test2",199,06865,90.436,1580526000," - "
204,"0100G test2",199,06865,90.436,1580522400," - "
204,"0100G test2",199,06865,90.436,1580518800," - "
204,"0100G test2",199,06865,90.436,1580515200," - "
Desired result after processing:
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580597999," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580594400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580590800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580587200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580583600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580580000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580576400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580572800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580569200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580565600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580562000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580558400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580554800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580551200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580547600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580544000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580540400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580536800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580533200," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580529600," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580526000," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580522400," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580518800," - "
204,"Strefa XX Pomorzany Kępa",199,06865,90.436,1580515200," - "
[Comments]:
- I can think of a solution, but how should the code decide which of the three copies to keep? This matters for the Group Name column.
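One possible rule, offered only as an assumption since the question does not say which group name is preferred, is to keep whichever copy appears first. With base R's `duplicated()` that could look like the sketch below. The data frame is made up for illustration, meter 205 is hypothetical, and the column names assume what `read.csv` would generate.

```r
# Made-up sample: meter 204 is stored twice, meter 205 (hypothetical) once
df <- data.frame(
  Meter.ID          = c(204, 204, 204, 204, 205, 205),
  Group.name        = c("Strefa", "Strefa", "0100G", "0100G", "0100G", "0100G"),
  Reading.timestamp = c(1580597999, 1580594400, 1580597999,
                        1580594400, 1580597999, 1580594400),
  stringsAsFactors  = FALSE
)

# A row counts as a copy when an earlier row has the same meter and timestamp;
# Group.name is deliberately left out of the key, so the first copy wins.
keep <- !duplicated(df[, c("Meter.ID", "Reading.timestamp")])
df2  <- df[keep, ]
```

Unlike `distinct(Meter.ID, .keep_all = TRUE)`, this keeps every timestamp of a meter and only drops the repeated sets. A different preference (for example a fixed priority of group names) would require sorting the rows before calling `duplicated()`.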