【Title】: PySpark on Windows - RuntimeError: Java gateway process exited before sending its port number
【Posted】: 2022-08-09 21:14:12
【Question】:

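I have been trying to install PySpark on Windows since yesterday, but I keep getting this error. It has been more than 48 hours and I have tried everything I can think of. I reinstalled PySpark from scratch several times, and it still does not work.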

Whenever I run:

spark = SparkSession.builder.getOrCreate()

I get this error:

RuntimeError                              Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_20592/2335384691.py in <module>
      1 # create a spark session
----> 2 spark = SparkSession.builder.getOrCreate()

c:\users\bhola\appdata\local\programs\python\python38\lib\site-packages\pyspark\sql\session.py in getOrCreate(self)
    226                             sparkConf.set(key, value)
    227                         # This SparkContext may be an existing one.
--> 228                         sc = SparkContext.getOrCreate(sparkConf)
    229                     # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    230                     # by all sessions.

c:\users\bhola\appdata\local\programs\python\python38\lib\site-packages\pyspark\context.py in getOrCreate(cls, conf)
    390         with SparkContext._lock:
    391             if SparkContext._active_spark_context is None:
--> 392                 SparkContext(conf=conf or SparkConf())
    393             return SparkContext._active_spark_context
    394 

c:\users\bhola\appdata\local\programs\python\python38\lib\site-packages\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    142                 " is not allowed as it is a security risk.")
    143 
--> 144         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    145         try:
    146             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

c:\users\bhola\appdata\local\programs\python\python38\lib\site-packages\pyspark\context.py in _ensure_initialized(cls, instance, gateway, conf)
    337         with SparkContext._lock:
    338             if not SparkContext._gateway:
--> 339                 SparkContext._gateway = gateway or launch_gateway(conf)
    340                 SparkContext._jvm = SparkContext._gateway.jvm
    341 

c:\users\bhola\appdata\local\programs\python\python38\lib\site-packages\pyspark\java_gateway.py in launch_gateway(conf, popen_kwargs)
    106 
    107             if not os.path.isfile(conn_info_file):
--> 108                 raise RuntimeError("Java gateway process exited before sending its port number")
    109 
    110             with open(conn_info_file, "rb") as info:

RuntimeError: Java gateway process exited before sending its port number

I tried the solutions given in this Stack Overflow post and this second Stack Overflow post.

export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"

On my Windows system, I set variable name = PYSPARK_SUBMIT_ARGS and variable value = "--master local[2] pyspark-shell"

But it did not work.
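
One thing that may be worth checking: the Windows environment-variable dialog stores quotation marks literally, so a value typed with surrounding quotes keeps them. As far as I know, PySpark tokenizes PYSPARK_SUBMIT_ARGS shell-style (via `shlex.split`) before passing it to spark-submit; treat that as an assumption about the installed version rather than a confirmed fact. Under that assumption, this small sketch shows how the quoted and unquoted values tokenize differently:

```python
import shlex

# Value as typed into the Windows dialog, quotes included (stored literally).
quoted = '"--master local[2] pyspark-shell"'
# The same value without the surrounding quotes.
unquoted = '--master local[2] pyspark-shell'

# Shell-style splitting keeps the quoted value as one single argument.
print(shlex.split(quoted))

# The unquoted value splits into three separate arguments.
print(shlex.split(unquoted))
```

If that is what is happening here, entering the value without quotes, or setting `os.environ["PYSPARK_SUBMIT_ARGS"]` from Python before the session is created, might be worth a try.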

The other system variables I set on my machine during installation are:

SPARK_HOME = D:\spark\spark-3.2.0-bin-hadoop3.2

HADOOP_HOME = D:\spark\spark-3.2.0-bin-hadoop3.2

Path = D:\spark\spark-3.2.0-bin-hadoop3.2\bin

PYSPARK_DRIVER_PYTHON = jupyter

PYSPARK_DRIVER_PYTHON_OPTS = jupyter

JAVA_HOME = C:\Program Files\Java\jdk1.8.0_301
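
A quick way to sanity-check variables like these is to confirm that each one points at a folder containing the executable it is expected to provide. This is only a sketch: the function name `check_spark_env` is made up, and the expected file names (`java.exe`, `spark-submit.cmd`, `winutils.exe`) are assumptions based on the usual Windows layout of a JDK, a Spark distribution and a Hadoop `bin` folder.

```python
import os
from pathlib import Path


def check_spark_env(env=None):
    """Return a list of human-readable problems found in the given environment."""
    env = os.environ if env is None else env
    # Variable name -> file expected underneath it (assumed typical layout).
    expected = {
        "JAVA_HOME": Path("bin") / "java.exe",
        "SPARK_HOME": Path("bin") / "spark-submit.cmd",
        "HADOOP_HOME": Path("bin") / "winutils.exe",
    }
    problems = []
    for name, relative in expected.items():
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
            continue
        target = Path(value) / relative
        if not target.is_file():
            problems.append(f"{name}: {target} not found")
    return problems


if __name__ == "__main__":
    # Print every problem found in the current process environment.
    for problem in check_spark_env():
        print(problem)
```

With the values listed above, HADOOP_HOME is the Spark folder itself, so this check would only pass if winutils.exe had been copied into D:\spark\spark-3.2.0-bin-hadoop3.2\bin.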

Can anyone help me with this?

    Tags: python pyspark


    【Solution 1】:

    Did you download winutils.exe from https://github.com/kontext-tech/winutils? You need to put it in \Hadoop\bin, add that folder to the path, and so on.
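
To try this without editing the system settings, the variables can also be set for the current Python process before the session is created. A minimal sketch, assuming winutils.exe has been placed in a folder such as C:\hadoop\bin (a hypothetical location; `use_winutils` is likewise a made-up helper name):

```python
import os


def use_winutils(hadoop_home):
    """Point HADOOP_HOME at hadoop_home and put its bin folder first on PATH."""
    bin_dir = os.path.join(hadoop_home, "bin")
    os.environ["HADOOP_HOME"] = hadoop_home
    os.environ["PATH"] = bin_dir + os.pathsep + os.environ.get("PATH", "")
    return bin_dir


# Hypothetical location; adjust to wherever winutils.exe actually lives.
use_winutils(r"C:\hadoop")

# Then create the session as in the question:
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
```

These changes only affect the current process and the processes it starts afterwards, so the helper has to run before the first SparkSession is created.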
