【Question title】: Spark 2.3 Memory Leak on Executor
【Posted】: 2018-11-04 16:07:10
【Question】:

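I am getting a managed memory leak warning. As far as I know, this was a Spark bug in versions up to 1.6 and has since been resolved.

Mode: standalone
IDE: PyCharm
Spark version: 2.3
Python version: 3.6

Below is the log output: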

2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3148
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3152
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3151
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3150
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3149
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3153
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3154
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3158
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3155
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3157
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3160
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3161
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3156
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3159
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3165
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3163
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3162
2018-05-25 15:00:05 WARN  Executor:66 - Managed memory leak detected; size = 262144 bytes, TID = 3166

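Any insight into why this happens? My job does complete successfully, though.

Edit: Several people have said this is a duplicate of a question from 2 years ago. The answer there says it is a Spark bug, but when I checked Spark's Jira, the issue is marked as resolved.

The question here is: so many versions later, why am I still seeing the same warning in Spark 2.3? If there is a valid or logical answer to my query, I will certainly delete this question.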

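【Comments】:

  • You must have opened some resource, such as a database connection or a file, and forgotten to close it.
  • That is not the case here, Ramesh.
  • I am seeing something similar. I even see the exact same byte value (262144 bytes), although I am using Scala. Did you have any luck debugging this?
  • @Aakash Basu, did you resolve this?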

Tags: python python-3.x apache-spark memory-leaks pyspark


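【Solution 1】:

According to SPARK-14168, the warning stems from not consuming the whole iterator. I ran into the same warning when taking n elements from an RDD in the Spark shell.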
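
To make the "iterator not fully consumed" idea concrete, here is a minimal pure-Python sketch. It is a toy model, not Spark's code: `ToyTaskMemory`, `buffered_rows` and `run_task` are hypothetical names made up for illustration, and the 262144-byte page size is just borrowed from the log in the question. The assumption being modeled is that an operator frees its buffer only once its iterator is exhausted, so a task that stops early still holds the buffer when it ends, and the end-of-task cleanup reports it.

```python
from itertools import islice


class ToyTaskMemory:
    """Hypothetical stand-in for per-task memory bookkeeping (not a Spark class)."""

    def __init__(self):
        self.held = 0  # bytes acquired but not yet released

    def acquire(self, n):
        self.held += n

    def release(self, n):
        self.held -= n

    def cleanup(self):
        """Run at task end: free whatever is still held and return its size."""
        leaked, self.held = self.held, 0
        return leaked


def buffered_rows(rows, memory, page_size=262144):
    """Yield rows from a buffer that is released only after the last row."""
    memory.acquire(page_size)
    for row in rows:
        yield row
    # Reached only when the caller pulls past the final row.
    memory.release(page_size)


def run_task(rows, limit=None):
    """Consume at most `limit` rows; return (rows read, bytes still held at task end)."""
    memory = ToyTaskMemory()
    it = buffered_rows(rows, memory)
    result = list(islice(it, limit))  # limit=None drains the iterator completely
    return result, memory.cleanup()


if __name__ == "__main__":
    for limit in (None, 3):
        rows, leaked = run_task(range(10), limit)
        if leaked > 0:
            print(f"limit={limit}: leak detected; size = {leaked} bytes")
        else:
            print(f"limit={limit}: no leak")
```

In this model the result of the task is correct either way; the only difference is that the early-stopping run leaves the buffer for the end-of-task cleanup to reclaim, which would match the job finishing successfully despite the warnings.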

【Comments】:
