jupyter notebook NameError: name 'sc' is not defined

Date: 2016-07-21 22:27:31

Tags: apache-spark pyspark jupyter

I am using a Jupyter notebook with PySpark, and my first command is:

rdd = sc.parallelize([2, 3, 4])

Then it shows:

NameError Traceback (most recent call last)
<ipython-input-1-c540c4a1d203> in <module>()
----> 1 rdd = sc.parallelize([2, 3, 4])

NameError: name 'sc' is not defined.

How can I fix this error where 'sc' is not defined?

2 answers:

Answer 0 (score: 4):

Have you initialized the SparkContext?

You can try this:

# Initializing PySpark
from pyspark import SparkContext, SparkConf

# Spark config: name the application and create the context
conf = SparkConf().setAppName("sample_app")
sc = SparkContext(conf=conf)
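
As a quick sanity check (a minimal sketch that simply re-runs the command from the question), you can verify that the new context works:

# Verify the context by re-running the command from the question
rdd = sc.parallelize([2, 3, 4])
print(rdd.collect())  # expected output: [2, 3, 4]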

Answer 1 (score: 0):

Try this:

import findspark
findspark.init()
import pyspark # only run after findspark.init()
from pyspark import SparkContext, SparkConf
# Spark config: name the application and create the context
conf = SparkConf().setAppName("sample_app")
sc = SparkContext(conf=conf)

myrdd = sc.parallelize([('roze', 60), ('Mary', 80), ('stella', 34)])
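
If the findspark route works, a short usage check on the RDD defined above (a sketch, assuming the same notebook session) would be:

# Inspect the pair RDD, then stop the context when finished
print(myrdd.collect())  # [('roze', 60), ('Mary', 80), ('stella', 34)]
sc.stop()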