【Question title】: Creating Spark Session throws exception traceback
【Posted】: 2020-12-09 03:43:34
【Question】:

I am new to Jupyter Notebook and am trying to run some PySpark code, abbreviated as follows:

import pyspark as ps
from pyspark.sql import SQLContext
from pyspark.sql import Row

spark = ps.sql.SparkSession.builder \
            .master("local") \
            .appName("Book Recommendation System") \
            .getOrCreate()

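Creating the PySpark session fails with an error that includes the following statements:

  • "This SparkContext may be an existing one."
  • "Do not update `SparkConf` for existing `SparkContext`, as it's shared by all sessions."

The full error is as follows: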

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-21-cd9ecd052473> in <module>
----> 1 spark = ps.sql.SparkSession.builder.master("local").appName("Book").getOrCreate()
      2 
      3 sc = spark.sparkContext
      4 sqlContext = SQLContext(sc)

c:\program files (x86)\python\lib\site-packages\pyspark\sql\session.py in getOrCreate(self)
    184                             sparkConf.set(key, value)
    185                         # This SparkContext may be an existing one.
--> 186                         sc = SparkContext.getOrCreate(sparkConf)
    187                     # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    188                     # by all sessions.

c:\program files (x86)\python\lib\site-packages\pyspark\context.py in getOrCreate(cls, conf)
    369         with SparkContext._lock:
    370             if SparkContext._active_spark_context is None:
--> 371                 SparkContext(conf=conf or SparkConf())
    372             return SparkContext._active_spark_context
    373 

c:\program files (x86)\python\lib\site-packages\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    126                 " is not allowed as it is a security risk.")
    127 
--> 128         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    129         try:
    130             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

c:\program files (x86)\python\lib\site-packages\pyspark\context.py in _ensure_initialized(cls, instance, gateway, conf)
    318         with SparkContext._lock:
    319             if not SparkContext._gateway:
--> 320                 SparkContext._gateway = gateway or launch_gateway(conf)
    321                 SparkContext._jvm = SparkContext._gateway.jvm
    322 

c:\program files (x86)\python\lib\site-packages\pyspark\java_gateway.py in launch_gateway(conf, popen_kwargs)
    103 
    104             if not os.path.isfile(conn_info_file):
--> 105                 raise Exception("Java gateway process exited before sending its port number")
    106 
    107             with open(conn_info_file, "rb") as info:

Exception: Java gateway process exited before sending its port number

Does anyone know what I should do? I don't know what the problem is!

【Question discussion】:

    Tags: python apache-spark pyspark apache-spark-sql jupyter-notebook


    【Solution 1】:

    You don't need ps.sql. Try this instead:

    import pyspark as ps
    from pyspark.sql import SparkSession
    from pyspark.sql import Row
    
    spark = SparkSession.builder \
                .master("local") \
                .appName("Book Recommendation System") \
                .getOrCreate()
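
The last line of the traceback, "Java gateway process exited before sending its port number", is raised when the JVM that PySpark launches exits before it can report back, so it may also be worth checking that Java is reachable at all. The following is a minimal diagnostic sketch, not a confirmed fix for this question. It assumes the common case where Spark's launcher scripts look for Java through JAVA_HOME first and then through PATH. The helper name check_java and the keys of the returned dictionary are made up for illustration.

```python
import os
import shutil


def check_java(env=None):
    """Report whether a Java launcher looks reachable from this environment.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    check_java and the result keys are hypothetical names, not part of PySpark.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    exe = "java.exe" if os.name == "nt" else "java"

    # Location implied by JAVA_HOME, if that variable is set at all.
    from_home = os.path.join(java_home, "bin", exe) if java_home else None
    home_ok = bool(from_home) and os.path.isfile(from_home)

    # Location found by searching the directories listed in PATH.
    on_path = shutil.which("java", path=env.get("PATH", ""))

    return {
        "JAVA_HOME": java_home,
        "java_from_JAVA_HOME": from_home if home_ok else None,
        "java_on_PATH": on_path,
        "java_found": home_ok or on_path is not None,
    }


if __name__ == "__main__":
    # Run this in the same notebook kernel that fails to create the session.
    for key, value in check_java().items():
        print(f"{key}: {value}")
```

If java_found is False, installing a JDK supported by your Spark version and setting JAVA_HOME before starting Jupyter is a reasonable next step. If JAVA_HOME is set but java_from_JAVA_HOME is None, the variable probably points at the wrong directory.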
    

    【Discussion】:
