【Question title】: Increase memory available to Spark shell
【Posted】: 2015-03-22 20:36:27
【Question】:

I am trying to install Apache Spark on a Raspberry Pi 1 Model B+.

Once I start the Spark shell and try the command:

val l = sc.parallelize(List()).collect

I get this exception:

scala> val l = sc.parallelize(List()).collect
15/03/22 19:52:44 INFO SparkContext: Starting job: collect at <console>:21
15/03/22 19:52:44 INFO DAGScheduler: Got job 0 (collect at <console>:21) with 1 output partitions (allowLocal=false)
15/03/22 19:52:44 INFO DAGScheduler: Final stage: Stage 0(collect at <console>:21)
15/03/22 19:52:44 INFO DAGScheduler: Parents of final stage: List()
15/03/22 19:52:44 INFO DAGScheduler: Missing parents: List()
15/03/22 19:52:44 INFO DAGScheduler: Submitting Stage 0 (ParallelCollectionRDD[0] at parallelize at <console>:21), which has no missing parents
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x9137c074, pid=3596, tid=2415826032
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
# Java VM: Java HotSpot(TM) Client VM (25.0-b70 mixed mode linux-arm )
# Problematic frame:
# C  [snappy-unknown-b62d2fa0-8fdd-4b4b-8c2c-2f24ddaeee74-libsnappyjava.so+0x1074]  _init+0x1a7
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/pi/spark-1.3.0-bin-hadoop2.4/bin/hs_err_pid3596.log
./spark-shell: line 55:  3596 Segmentation fault      "$FWDIR"/bin/spark-submit --class org.apache.spark.repl.Main "${SUBMISSION_OPTS[@]}" spark-shell "${APPLICATION_OPTS[@]}"

When starting the shell, I tried to allow it to use disk as well as memory:

./spark-shell --conf StorageLevel=MEMORY_AND_DISK

But I still get the same exception.

267 MB of memory is available when the Spark shell starts:

15/03/22 17:09:49 INFO MemoryStore: MemoryStore started with capacity 267.3 MB

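Should this be enough memory to run Spark commands in the shell?

Is this the correct command to start a Spark shell that spills data that does not fit in memory to disk: ./spark-shell --conf StorageLevel=MEMORY_AND_DISK

Update:

I tried: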

./spark-shell --conf spark.driver.memory=256m

val l = sc.parallelize(List()).collect

But the result is the same.

【Question comments】:

    Tags: scala apache-spark


    【Solution 1】:

    Try the --driver-memory option to set the memory for the driver process. Example:

    ./spark-shell --driver-memory 2g
    

    This requests 2 GB of memory for the driver.
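To check whether the setting took effect, the driver's heap can be inspected from inside the shell. The sketch below is only an illustration: `estimatedStorageMb` is a hypothetical helper, and the 0.6 and 0.9 factors assume the Spark 1.x legacy memory model (`spark.storage.memoryFraction` and `spark.storage.safetyFraction` at their defaults). Under that assumption, the 267.3 MB reported in the question is roughly what a default 512 MB driver heap would give.

```scala
// Works in spark-shell or any Scala REPL; this part does not need Spark.
// Maximum heap the driver JVM is allowed to use, in MB.
val maxHeapMb: Long = Runtime.getRuntime.maxMemory / (1024L * 1024L)
println(s"Driver JVM max heap: $maxHeapMb MB")

// Hypothetical helper. Assumption: Spark 1.x sizes the MemoryStore as
// heap * spark.storage.memoryFraction (default 0.6)
//      * spark.storage.safetyFraction (default 0.9).
def estimatedStorageMb(heapMb: Long,
                       memoryFraction: Double = 0.6,
                       safetyFraction: Double = 0.9): Double =
  heapMb * memoryFraction * safetyFraction

println(f"Estimated MemoryStore capacity: ${estimatedStorageMb(maxHeapMb)}%.1f MB")
```

If the printed heap does not change after passing --driver-memory, the option was probably not applied to the driver JVM.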

    【Comments】:

    • Thanks, but how do I instruct the shell to also use MEMORY_AND_DISK?
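On the MEMORY_AND_DISK question: as far as I know, a storage level is not a launch-time configuration property but something chosen per RDD with `persist`. A minimal sketch, assuming it is pasted into spark-shell, where `sc` is already defined; the sample list contents are made up for illustration:

```scala
// Assumes spark-shell, where `sc` (the SparkContext) already exists.
import org.apache.spark.storage.StorageLevel

// Illustrative data; giving the list elements avoids the List[Nothing]
// type that a bare List() is inferred as.
val rdd = sc.parallelize(List(1, 2, 3))

// The storage level is set on the RDD itself: partitions that do not
// fit in memory are written to disk instead of being recomputed.
rdd.persist(StorageLevel.MEMORY_AND_DISK)

val l = rdd.collect()

// Confirm which storage level the RDD is marked with.
println(rdd.getStorageLevel == StorageLevel.MEMORY_AND_DISK)
```

Note that this only affects how cached partitions are stored; it does not change the size of the driver heap.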