【Question title】: Zip non sequential list based on condition
【Posted】: 2018-01-23 09:34:30
【Question】:

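I have a scenario where I need to zip two lists in Scala based on a condition. The lists may not be in the same order. What is the best way to do this?

I want to group each DirectRetailOffsetInvoice with the DirectRetailCM that has the same requestId into a tuple.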

object Main extends App {
   case class SalesDoc(val id: Int, val name: String, val requestId: String) {}
   val list = List(
     SalesDoc(1, "ILLEGAL", "1"),
     SalesDoc(2, "DirectRetailCM", "1"),

     SalesDoc(3, "DirectRetailOffsetInvoice", "2"),
     SalesDoc(4, "DirectRetailCM", "2"),
     SalesDoc(5, "OTHER", "2"),

     SalesDoc(5, "DirectRetailCM", "LEFTOUT"),
     SalesDoc(6, "ILLEGAL2", "4"),

     SalesDoc(5, "OTHER", "3"),
     SalesDoc(7, "DirectRetailOffsetInvoice", "4"),
     SalesDoc(8, "DirectRetailCM", "4")
  )

 // I expect zip results of drOffsetInvoice and drCms as
List(
  (SalesDoc(3, "DirectRetailOffsetInvoice", "2"), SalesDoc(4, "DirectRetailCM", "2")),
  (SalesDoc(7, "DirectRetailOffsetInvoice", "4"), SalesDoc(8, "DirectRetailCM", "4"))
  )
}

The initial approach I can think of is:

  • Collect the DirectRetailCM rows: list.filter(e => e.name == "DirectRetailCM")
  • Collect the DirectRetailOffsetInvoice rows: list.filter(e => e.name == "DirectRetailOffsetInvoice")
  • Zip the two together, although they may not be in matching order
  • Some rows may have no counterpart

Can you suggest any other approaches I should consider?
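
The filter-then-zip idea listed above can be sketched as follows to show why positional zipping is not enough. This is only an illustration on a trimmed copy of the sample data; the object name `NaiveZipSketch` and the value names are assumptions made for the example.

```scala
object NaiveZipSketch {
  case class SalesDoc(id: Int, name: String, requestId: String)

  // Trimmed copy of the sample data from the question
  val list = List(
    SalesDoc(1, "ILLEGAL", "1"),
    SalesDoc(2, "DirectRetailCM", "1"),
    SalesDoc(3, "DirectRetailOffsetInvoice", "2"),
    SalesDoc(4, "DirectRetailCM", "2"),
    SalesDoc(5, "DirectRetailCM", "LEFTOUT"),
    SalesDoc(7, "DirectRetailOffsetInvoice", "4"),
    SalesDoc(8, "DirectRetailCM", "4")
  )

  val offsets = list.filter(_.name == "DirectRetailOffsetInvoice")
  val cms     = list.filter(_.name == "DirectRetailCM")

  // zip pairs elements by position only and ignores requestId
  val naive: List[(SalesDoc, SalesDoc)] = offsets.zip(cms)

  def main(args: Array[String]): Unit = naive.foreach(println)
  // (SalesDoc(3,DirectRetailOffsetInvoice,2),SalesDoc(2,DirectRetailCM,1))
  // (SalesDoc(7,DirectRetailOffsetInvoice,4),SalesDoc(4,DirectRetailCM,2))
}
```

Both pairs combine rows with different requestId values, so the pairing has to be keyed on requestId rather than on position.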

【Comments】:

    Tags: scala scala-collections scalaz


    【Solution 1】:
    // You don't need the val keyword for a case class
    case class SalesDoc(id: Int, name: String, requestId: String)
    
    val list = List(
      SalesDoc(1, "ILLEGAL", "1"),
      SalesDoc(2, "DirectRetailCM", "1"),
    
      SalesDoc(3, "DirectRetailOffsetInvoice", "2"),
      SalesDoc(4, "DirectRetailCM", "2"),
      SalesDoc(5, "OTHER", "2"),
    
      SalesDoc(5, "DirectRetailCM", "LEFTOUT"),
      SalesDoc(6, "ILLEGAL2", "4"),
    
      SalesDoc(5, "OTHER", "3"),
      SalesDoc(7, "DirectRetailOffsetInvoice", "4"),
      SalesDoc(8, "DirectRetailCM", "4")
    )
    
    // Find all of the DirectRetailOffsetInvoice items
    val offsets = list.filter(_.name == "DirectRetailOffsetInvoice")
    
    // Map over all of the DirectRetailOffsetInvoice items and see if there is matching DirectRetailCM item
    val maybeMatched = offsets.map(offset => {
      val maybeCm = list.find(i => i.requestId == offset.requestId && i.name == "DirectRetailCM")
    
      // Return a tuple of type (SalesDoc, Option[SalesDoc])
      (offset, maybeCm)
    })
    
    // Map over the tuples and only take the ones where there was a match, and extract it from the Option to create a tuple of (SalesDoc, SalesDoc)
    val output = maybeMatched.collect { case (s1, Some(s2)) => (s1, s2) }
    
    output.foreach(println)
    // (SalesDoc(3,DirectRetailOffsetInvoice,2),SalesDoc(4,DirectRetailCM,2))
    // (SalesDoc(7,DirectRetailOffsetInvoice,4),SalesDoc(8,DirectRetailCM,4))
    

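    【Comments】:

    • Thanks Tyler. I will probably add one more filter on DirectRetailCM so that the whole list is not scanned for every DirectRetailOffsetInvoice.
    • If you have many more rows than your example suggests, you may want to build a Map so that lookups are faster.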
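
The Map suggestion in the comment above could look roughly like the sketch below. It is not part of the original answer; `MapLookupSketch`, `pairByRequestId` and `cmByRequestId` are hypothetical names, and it assumes at most one DirectRetailCM per requestId matters.

```scala
object MapLookupSketch {
  case class SalesDoc(id: Int, name: String, requestId: String)

  // Hypothetical helper: pairs each DirectRetailOffsetInvoice with the
  // DirectRetailCM that shares its requestId, using one Map lookup per offset.
  def pairByRequestId(docs: List[SalesDoc]): List[(SalesDoc, SalesDoc)] = {
    // Index the DirectRetailCM rows by requestId in a single pass.
    // If several rows share a requestId, toMap keeps the last one
    // (Solution 1 uses find, which would keep the first).
    val cmByRequestId: Map[String, SalesDoc] =
      docs.filter(_.name == "DirectRetailCM").map(d => d.requestId -> d).toMap

    for {
      offset <- docs if offset.name == "DirectRetailOffsetInvoice"
      cm     <- cmByRequestId.get(offset.requestId) // None drops unmatched offsets
    } yield (offset, cm)
  }

  def main(args: Array[String]): Unit = {
    val list = List(
      SalesDoc(2, "DirectRetailCM", "1"),
      SalesDoc(3, "DirectRetailOffsetInvoice", "2"),
      SalesDoc(4, "DirectRetailCM", "2"),
      SalesDoc(5, "DirectRetailCM", "LEFTOUT"),
      SalesDoc(7, "DirectRetailOffsetInvoice", "4"),
      SalesDoc(8, "DirectRetailCM", "4")
    )
    pairByRequestId(list).foreach(println)
  }
}
```

Building the index costs one pass over the list, after which each offset is matched with a single lookup instead of a scan of the whole list.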
    【Solution 2】:

    You can achieve this with the standard Scala combinators:

    list
      .filter(sd => sd.name == "DirectRetailCM" || sd.name == "DirectRetailOffsetInvoice")
      .groupBy(_.requestId)
      .flatMap {
         case (_, List(a,b)) => List(a->b)
         case _ => List.empty
      }
    

    This gives you:

     res3: scala.collection.immutable.Map[SalesDoc,SalesDoc] = 
           Map(
             SalesDoc(3,DirectRetailOffsetInvoice,2) -> SalesDoc(4,DirectRetailCM,2), 
             SalesDoc(7,DirectRetailOffsetInvoice,4) -> SalesDoc(8,DirectRetailCM,4))
    

    If the input sequence is not sorted so that DirectRetailOffsetInvoice comes before DirectRetailCM for each requestId, you will need to handle that case yourself.
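
One way to handle the ordering caveat above is to pick each role out of the group by name instead of relying on the `List(a, b)` pattern. The following is a minimal sketch under that assumption, not the answerer's code; `OrderIndependentSketch` and `pairs` are hypothetical names.

```scala
object OrderIndependentSketch {
  case class SalesDoc(id: Int, name: String, requestId: String)

  // Hypothetical helper: groups by requestId, then looks up each role by name,
  // so the relative order of the two rows in the input does not matter.
  def pairs(docs: List[SalesDoc]): List[(SalesDoc, SalesDoc)] =
    docs
      .groupBy(_.requestId)
      .values
      .flatMap { group =>
        for {
          offset <- group.find(_.name == "DirectRetailOffsetInvoice")
          cm     <- group.find(_.name == "DirectRetailCM")
        } yield (offset, cm) // None when either side is missing
      }
      .toList
      .sortBy(_._1.id) // Map iteration order is unspecified, so sort for stable output

  def main(args: Array[String]): Unit = {
    val shuffled = List(
      SalesDoc(4, "DirectRetailCM", "2"),
      SalesDoc(3, "DirectRetailOffsetInvoice", "2"),
      SalesDoc(5, "DirectRetailCM", "LEFTOUT")
    )
    pairs(shuffled).foreach(println)
  }
}
```

Unlike the `List(a, b)` pattern, this version does not discard a group that contains more than two matching rows; it takes the first row of each kind, which may or may not be what you want.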

    【Comments】:

      【Solution 3】:
      list.filter(s => s.name == "DirectRetailCM" || s.name == "DirectRetailOffsetInvoice")
          .groupBy(_.requestId)
          .collect { case (_, List(a, b)) => (a, b) }
          .toList
      
      // List[(SalesDoc, SalesDoc)]
      

      【Comments】:
