AWS Glue - unable to set spark.yarn.executor.memoryOverhead

Date: 2018-08-23 13:23:09

Tags: apache-spark pyspark aws-glue

When running a Python job in AWS Glue, I get the following error:

Reason: Container killed by YARN for exceeding memory limits. 5.6 GB of 5.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

I run this code at the start of the script:

print '--- Before Conf --'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print "spark.yarn.executor.memoryOverhead", sc._conf.get("spark.yarn.executor.memoryOverhead")

print '--- Conf --'
sc._conf.setAll([('spark.yarn.executor.memory', '15G'),
                 ('spark.yarn.executor.memoryOverhead', '10G'),
                 ('spark.yarn.driver.cores', '5'),
                 ('spark.yarn.executor.cores', '5'),
                 ('spark.yarn.cores.max', '5'),
                 ('spark.yarn.driver.memory', '15G')])

print '--- After Conf ---'
print 'spark.driver.memory', sc._conf.get('spark.driver.memory')
print 'spark.driver.cores', sc._conf.get('spark.driver.cores')
print 'spark.executor.memory', sc._conf.get('spark.executor.memory')
print 'spark.executor.cores', sc._conf.get('spark.executor.cores')
print "spark.executor.memoryOverhead", sc._conf.get("spark.executor.memoryOverhead")

I get the following output:

--- Before Conf --
spark.yarn.driver.memory None
spark.yarn.driver.cores None
spark.yarn.executor.memory None
spark.yarn.executor.cores None
spark.yarn.executor.memoryOverhead None
--- Conf --
--- After Conf ---
spark.yarn.driver.memory 15G
spark.yarn.driver.cores 5
spark.yarn.executor.memory 15G
spark.yarn.executor.cores 5
spark.yarn.executor.memoryOverhead 10G

spark.yarn.executor.memoryOverhead appears to be set, so why is it not recognized? I still get the same error.

I have also read other posts about setting spark.yarn.executor.memoryOverhead, but none of them explain why it fails even though it appears to be set.

2 Answers:

Answer 0 (score: 1)

Unfortunately, the current version of Glue does not support this. You cannot set these parameters other than through the UI. In your case, you could use the AWS EMR service instead of AWS Glue.
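For context on why the approach in the question has no effect even outside Glue: executor memory settings must be in place before the SparkContext is created, so mutating sc._conf afterwards updates the stored conf object but is not applied to the already-running application. A minimal sketch of the usual plain-PySpark pattern (not Glue-specific; the values are only illustrative):

from pyspark import SparkConf, SparkContext

# Build the configuration *before* the context exists; values set on
# sc._conf after creation are not pushed to YARN or the executors.
conf = (SparkConf()
        .set('spark.executor.memory', '15g')
        .set('spark.yarn.executor.memoryOverhead', '10g'))

sc = SparkContext(conf=conf)
print(sc.getConf().get('spark.yarn.executor.memoryOverhead'))  # 10g

In a Glue job the context is created for you before your script runs, which is why this pattern is not available there.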

When I ran into a similar problem, I reduced the number of shuffles and the amount of data being shuffled, and increased the DPU count. While working through the issue I relied on the following articles; I hope they are useful. A rough sketch of the kind of change involved follows the links.

http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/

https://www.indix.com/blog/engineering/lessons-from-using-spark-to-process-large-amounts-of-data-part-i/

https://umbertogriffo.gitbooks.io/apache-spark-best-practices-and-tuning/content/sparksqlshufflepartitions_draft.html
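As an illustration of reducing shuffled data, here is a hedged PySpark sketch; the paths, table names and columns are placeholders, not taken from the original job:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Fewer shuffle partitions than the default 200 can help when each
# partition would otherwise be tiny; tune this to your data volume.
spark.conf.set('spark.sql.shuffle.partitions', '50')

orders = spark.read.parquet('s3://my-bucket/orders/')        # placeholder path
customers = spark.read.parquet('s3://my-bucket/customers/')  # placeholder path

# Project and filter *before* the join so less data is shuffled.
recent_orders = (orders
                 .select('customer_id', 'amount', 'order_date')
                 .filter(F.col('order_date') >= '2018-01-01'))

result = recent_orders.join(customers.select('customer_id', 'segment'),
                            on='customer_id', how='inner')

result.write.mode('overwrite').parquet('s3://my-bucket/output/')  # placeholder path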


Updated: 2019-01-13

Amazon recently added a new section to the AWS Glue documentation that describes how to monitor and optimize Glue jobs. I find it very useful for understanding where a memory problem comes from and how to avoid it.

https://docs.aws.amazon.com/glue/latest/dg/monitor-profile-glue-job-cloudwatch-metrics.html
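If it helps, the job metrics described there can also be retrieved programmatically once metrics are enabled for the job. A hedged boto3 sketch, where the metric name, dimensions and job name are assumptions rather than values taken from the question:

import datetime
import boto3

cloudwatch = boto3.client('cloudwatch')

# Glue job metrics are published under the "Glue" CloudWatch namespace when
# job metrics are enabled; the metric and dimension names below are assumptions.
resp = cloudwatch.get_metric_statistics(
    Namespace='Glue',
    MetricName='glue.ALL.jvm.heap.usage',
    Dimensions=[
        {'Name': 'JobName', 'Value': 'my-glue-job'},  # placeholder job name
        {'Name': 'JobRunId', 'Value': 'ALL'},
        {'Name': 'Type', 'Value': 'gauge'},
    ],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=6),
    EndTime=datetime.datetime.utcnow(),
    Period=300,
    Statistics=['Maximum'],
)

for point in sorted(resp['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], point['Maximum'])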

Answer 1 (score: 0)

  • Open Glue > Jobs > Edit your job > Script libraries and job parameters (optional) > Job parameters near the bottom

  • Set the following > key: --conf  value: spark.yarn.executor.memoryOverhead=1024 (a boto3 equivalent is sketched below)
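The same parameter can also be supplied when the job is defined through the API. A hedged boto3 sketch, with the job name, role and script location as placeholders:

import boto3

glue = boto3.client('glue')

# Passing --conf as a default argument is equivalent to setting it in the
# "Job parameters" section of the console.
glue.create_job(
    Name='my-glue-job',                                    # placeholder
    Role='MyGlueServiceRole',                              # placeholder
    Command={
        'Name': 'glueetl',
        'ScriptLocation': 's3://my-bucket/scripts/job.py'  # placeholder
    },
    DefaultArguments={
        '--conf': 'spark.yarn.executor.memoryOverhead=1024',
    },
)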
