【Question title】: Spark, ADAM and Zeppelin
【Posted】: 2017-05-09 22:55:46
【Question】:

I am trying to use ADAM with Zeppelin for genomic analysis. I am not sure I am doing this correctly, but I have run into the following problem.

%dep
z.reset()
z.addRepo("Spark Packages Repo").url("http://dl.bintray.com/spark-packages/maven")
z.load("com.databricks:spark-csv_2.10:1.2.0")   
z.load("mysql:mysql-connector-java:5.1.35")  
z.load("org.bdgenomics.adam:adam-core_2.10:0.20.0")
z.load("org.bdgenomics.adam:adam-cli_2.10:0.20.0")
z.load("org.bdgenomics.adam:adam-apis_2.10:0.20.0")

%spark

import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.projections.{ AlignmentRecordField, Projection }
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.Projection
import org.bdgenomics.adam.projections.AlignmentRecordField
import scala.io.Source
import org.apache.spark.rdd.RDD
import org.bdgenomics.formats.avro.Genotype
import scala.collection.JavaConverters._
import org.bdgenomics.formats.avro._
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.linalg.{ Vector => MLVector, Vectors }
import org.apache.spark.mllib.clustering.{ KMeans, KMeansModel }

val ac = new ADAMContext(sc)

I get the following error output:

import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.projections.{AlignmentRecordField, Projection}
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.Projection
import org.bdgenomics.adam.projections.AlignmentRecordField
import scala.io.Source
import org.apache.spark.rdd.RDD
import org.bdgenomics.formats.avro.Genotype
import scala.collection.JavaConverters._
import org.bdgenomics.formats.avro._
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.linalg.{Vector=>MLVector, Vectors}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
res7: org.apache.spark.SparkContext = org.apache.spark.SparkContext@62ec8142
<console>:188: error: constructor ADAMContext in class ADAMContext cannot be accessed in class $iwC
              new ADAMContext(sc)
              ^

Any idea where to look? Am I missing a dependency?

【Comments】:

    Tags: apache-spark apache-zeppelin


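    【Answer 1】:

    According to ADAMContext.scala in the version you are using, the constructor is private, so it cannot be called with new from notebook code.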

    class ADAMContext private (@transient val sc: SparkContext) 
        extends Serializable with Logging {
        ...
    }
    

    You can obtain an ADAMContext like this instead:

    import org.bdgenomics.adam.rdd.ADAMContext   // the type named in the annotation below
    import org.bdgenomics.adam.rdd.ADAMContext._
    
    val adamContext: ADAMContext = z.sc
    

    This uses the implicit conversion defined in the ADAMContext companion object:

    object ADAMContext {
        implicit def sparkContextToADAMContext(sc: SparkContext): ADAMContext = 
            new ADAMContext(sc)
    }
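
    The same pattern can be tried out without Spark or ADAM on the classpath. The sketch below is only an illustration: FakeSparkContext, FakeAdamContext, fromSparkContext and Demo are hypothetical stand-ins, not ADAM classes. It shows a private constructor that only the companion object may call, plus an implicit conversion that the compiler applies when a value of the target type is expected.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for SparkContext, so the sketch runs without Spark.
class FakeSparkContext(val appName: String)

// Private primary constructor: only the class itself and its companion
// object are allowed to call `new FakeAdamContext(...)`.
class FakeAdamContext private (val sc: FakeSparkContext) {
  def describe: String = "FakeAdamContext(" + sc.appName + ")"
}

object FakeAdamContext {
  // The companion can reach the private constructor. Because this conversion
  // lives in the companion of the target type, the compiler finds it without
  // an explicit import whenever a FakeAdamContext is expected.
  implicit def fromSparkContext(sc: FakeSparkContext): FakeAdamContext =
    new FakeAdamContext(sc)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val sc = new FakeSparkContext("zeppelin")
    // new FakeAdamContext(sc)   // does not compile: the constructor is private
    val ac: FakeAdamContext = sc // implicit conversion is applied here
    println(ac.describe)         // prints FakeAdamContext(zeppelin)
  }
}
```

    This compiles as an ordinary Scala file. In a REPL, the class and its companion object have to be defined together to be treated as companions.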
    

    【Comments】:

    • I tried that, but the object appears to be null:
          %spark
          val ac: ADAMContext = sc
          ac: org.bdgenomics.adam.rdd.ADAMContext = null
    【Answer 2】:

    It does work when I do not go through the z reference!

    val ac:ADAMContext  = sc
    val genotypes: RDD[Genotype] = ac.loadGenotypes("/tmp/ADAM2").rdd
    

    Output:

    ac: org.bdgenomics.adam.rdd.ADAMContext = org.bdgenomics.adam.rdd.ADAMContext@2c60ef7e
    
    genotypes: 
    org.apache.spark.rdd.RDD[org.bdgenomics.formats.avro.Genotype] = MapPartitionsRDD[3] at map at ADAMContext.scala:207
    

    I had tried this at the adam-shell prompt before, and I do not remember having to rely on the implicit conversion there. That was with ADAM 0.19, though.
