【Question title】: Saddle Frame: What's the most idiomatic way to count NaN values?
【Posted】: 2018-03-06 08:40:09
【Question】:

In Scala, I build a Saddle Frame like this, for example:

import org.saddle._
import scala.util.Random

val rowIx = Index(0 until 200)
val colIx = Index(0 until 100)

// create example having 15% of NaNs
val nanPerc = 0.15
val nanLength = math.round(nanPerc*rowIx.length*colIx.length).toInt
val nanInd = Random.shuffle(0 until rowIx.length*colIx.length).take(nanLength)
val rawMat = mat.rand(rowIx.length, colIx.length)
// contents gives a single array in row major
val rawMatContents = rawMat.contents
nanInd foreach { i => rawMatContents.update(i, Double.NaN) }

val df = Frame(rawMat, rowIx, colIx)

// now I'd like to test that the number of NaNs is correct but 
// most functions for this purpose in Frame e.g. countif exclude NaNs
df.???

What is the most idiomatic way to count the number of NaNs (Scala, Saddle)?

【Comments】:

  • countif is implemented as .filterFoldLeft(t => sd.notMissing(t) && test(t))(0)((a,b) => a + 1), so how about .filterFoldLeft(sd.isMissing)(0)((a,b) => a + 1)?
  • Hi, thanks! What you show works on a Vec, but I'm operating on Frames. Could you post a full answer instead of comments?

Tags: scala saddle


【Solution 1】:

I found a very simple and direct way: flatten the Frame to its underlying array and count the NaN entries.

// df is the Frame built in the question
df.toMat.contents.count(_.isNaN)
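
The flatten-and-count idea can be tried without Saddle on a plain Array[Double], which is what the question says contents yields. The following is only a minimal sketch under that assumption: the object and method names (NaNCountSketch, countNaN, countNaNWrong, withNaNs) are hypothetical and not part of Saddle. It also shows why isNaN is used rather than an equality test.

```scala
import scala.util.Random

// Hypothetical helper object for illustration; not part of Saddle.
object NaNCountSketch {
  private val nan = Double.NaN

  // Count NaNs in a flat array, e.g. a row-major array of matrix contents.
  def countNaN(xs: Array[Double]): Int = xs.count(_.isNaN)

  // Pitfall: NaN is not equal to anything, including itself,
  // so an equality test never matches and this always returns 0.
  def countNaNWrong(xs: Array[Double]): Int = xs.count(_ == nan)

  // Build rows * cols random doubles and overwrite a fraction with NaN,
  // mirroring the question's setup. Returns the data and the expected NaN count.
  def withNaNs(rows: Int, cols: Int, nanPerc: Double, seed: Long): (Array[Double], Int) = {
    val rng = new Random(seed)
    val n = rows * cols
    val nanLength = math.round(nanPerc * n).toInt
    val data = Array.fill(n)(rng.nextDouble())
    // Shuffled distinct indices, so exactly nanLength cells become NaN.
    rng.shuffle((0 until n).toList).take(nanLength).foreach(i => data(i) = Double.NaN)
    (data, nanLength)
  }
}
```

With the question's dimensions (200 rows, 100 columns, 15% NaNs), countNaN on the generated array should match the expected count returned by withNaNs, while countNaNWrong finds none.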

    【Solution 2】:

    Frame.countif is implemented as:

    def countif(test: T => Boolean)(implicit ev: S2Stats): Series[CX, Int] = frame.reduce(_.countif(test))
    

    Vec.countif is implemented as:

    def countif(test: Double => Boolean): Int = r.filterFoldLeft(t => sd.notMissing(t) && test(t))(0)((a,b) => a + 1)
    

    We can use the same approach, but drop the test and invert the NaN check:

    vec.filterFoldLeft(x => x.isNaN)(0)((a, b) => a + 1)
    

    To run it on a Frame:

    frame.reduce(_.filterFoldLeft(x => x.isNaN)(0)((a, b) => a + 1))
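
Judging by the countif signature quoted above, reducing a Frame this way gives a Series[CX, Int], that is, one count per column rather than a single number. The sketch below models that with plain Scala collections, not Saddle: filterFoldLeft here is a hypothetical stand-in for the filter-then-fold pattern, and PerColumnNaNSketch, nanPerColumn and nanTotal are illustrative names. It shows that summing the per-column counts gives the frame-wide total the question asks for.

```scala
// Hypothetical plain-collections model for illustration; not Saddle's implementation.
object PerColumnNaNSketch {
  // Keep the elements matching `pred`, then fold over the survivors.
  def filterFoldLeft[A, B](xs: Seq[A])(pred: A => Boolean)(init: B)(f: (B, A) => B): B =
    xs.filter(pred).foldLeft(init)(f)

  // One NaN count per column, analogous to the per-column result of the reduce.
  def nanPerColumn(columns: Seq[Seq[Double]]): Seq[Int] =
    columns.map(col => filterFoldLeft(col)((x: Double) => x.isNaN)(0)((a, _) => a + 1))

  // Summing the per-column counts gives the total NaN count.
  def nanTotal(columns: Seq[Seq[Double]]): Int = nanPerColumn(columns).sum
}
```

For example, three columns containing one, two and zero NaNs yield per-column counts of 1, 2 and 0, and a total of 3.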
    
