【Posted】: 2017-04-05 16:05:51
【Question】:
I am running a pyspark application from a Jupyter notebook. I can kill a job through the Spark Web UI, but I want to kill it programmatically.
How can I kill it?
【Question discussion】:
Tags: python apache-spark pyspark jupyter-notebook
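To expand on @Netanel Malka's answer: you can use the cancelAllJobs method to cancel every running job, or the cancelJobGroup method to cancel only the jobs that were assigned to a particular group.
From the PySpark documentation: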
cancelAllJobs()
Cancel all jobs that have been scheduled or are running.
cancelJobGroup(groupId)
Cancel active jobs for the specified group. See SparkContext.setJobGroup for more information.
And the example from the documentation:
import threading
from time import sleep

result = "Not Set"
lock = threading.Lock()

def map_func(x):
    sleep(100)
    raise Exception("Task should have been cancelled")

def start_job(x):
    global result
    try:
        sc.setJobGroup("job_to_cancel", "some description")
        result = sc.parallelize(range(x)).map(map_func).collect()
    except Exception as e:
        result = "Cancelled"
    lock.release()

def stop_job():
    sleep(5)
    sc.cancelJobGroup("job_to_cancel")

suppress = lock.acquire()
suppress = threading.Thread(target=start_job, args=(10,)).start()
suppress = threading.Thread(target=stop_job).start()
suppress = lock.acquire()
print(result)
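In a notebook, the same idea can be wrapped in a small watchdog. The following is only a minimal sketch: `cancel_group_after` is a hypothetical helper (not part of PySpark), and it assumes `sc` is the notebook's live SparkContext and that the job was started after a matching `sc.setJobGroup(...)` call.

```python
import threading


def cancel_group_after(sc, group_id, delay_seconds):
    """Schedule sc.cancelJobGroup(group_id) to run after delay_seconds.

    Hypothetical helper: `sc` is assumed to be a live SparkContext whose
    jobs were tagged beforehand with sc.setJobGroup(group_id, ...).
    """
    timer = threading.Timer(delay_seconds, sc.cancelJobGroup, args=(group_id,))
    timer.daemon = True  # do not keep the interpreter alive just for the timer
    timer.start()
    return timer  # call timer.cancel() if the job finishes before the deadline


# Possible notebook usage (assumes `sc` and an RDD named `rdd` already exist):
#
#   sc.setJobGroup("slow_query", "cancel after 60 s")
#   watchdog = cancel_group_after(sc, "slow_query", 60)
#   try:
#       rows = rdd.collect()   # raises if the group is cancelled first
#   finally:
#       watchdog.cancel()      # harmless if the timer has already fired
```

To cancel everything regardless of group, `sc.cancelAllJobs()` can be called directly from another cell or thread instead.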
【Discussion】:
Suppose you wrote this code:
from pyspark import SparkContext
sc = SparkContext("local", "Simple App")
# This will stop your app
sc.stop()
As described in the documentation: http://spark.apache.org/docs/latest/api/python/pyspark.html?highlight=stop#pyspark.SparkContext.stop
【Discussion】:
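Note that `sc.stop()` shuts down the whole application rather than a single job, so a new SparkContext has to be created before running anything else. If that is what you want, one way to make sure the context is always stopped, even when the work fails, is a try/finally wrapper. This is a sketch only: `run_and_stop` and the `work` callable are hypothetical names, not PySpark API.

```python
def run_and_stop(sc, work):
    """Call work(sc) and stop the context afterwards, even on error.

    Hypothetical helper: `sc` is assumed to be a SparkContext and `work`
    any callable that takes the context and returns a result.
    """
    try:
        return work(sc)
    finally:
        sc.stop()  # ends the application; the context cannot be reused


# Possible usage (assumes pyspark is installed):
#
#   from pyspark import SparkContext
#   sc = SparkContext("local", "Simple App")
#   total = run_and_stop(sc, lambda ctx: ctx.parallelize(range(10)).sum())
```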