Scrapy can't import a module on my PYTHONPATH

Asked: 2014-11-20 17:13:07

Tags: python import scrapy pythonpath

I had a working Scrapy project, and then I decided to clean it up. To do so, I moved my database module out of the Scrapy part of the project, and now I can no longer import it. The project currently looks like this:

myProject/
    database/
        __init__.py
        model.py
        databaseFactory.py
    myScrapy/
        __init__.py
        settings.py
        myScrapy/
            __init__.py
            pipeline.py
        spiders/
            spiderA.py
            spiderB.py
    api/
        __init__.py
    config/
        __init__.py

(Only the files relevant to my question are shown.) I want to use databaseFactory from within Scrapy.

I added the following lines to my .bashrc:

PYTHONPATH=$PYTHONPATH:my/path/to/my/project
export PYTHONPATH

So when I launch ipython I can do the following:

In [1]: import database.databaseFactory as databaseFactory

In [2]: databaseFactory
Out[2]: <module 'database.databaseFactory' from '/my/path/to/my/project/database/databaseFactory.pyc'>

But...

When I try to launch Scrapy with

sudo scrapy crawl spiderName 2> error.log

I get to enjoy the following message:

Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 60, in run
    self.crawler_process.start()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 92, in start
    if self.start_crawling():
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 124, in start_crawling
    return self._start_crawler() is not None
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 139, in _start_crawler
    crawler.configure()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 47, in configure
    self.engine = ExecutionEngine(self, self._spider_closed)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 65, in __init__
    self.scraper = Scraper(crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/scraper.py", line 66, in __init__
    self.itemproc = itemproc_cls.from_crawler(crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 50, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 29, in from_settings
    mwcls = load_object(clspath)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 42, in load_object
    raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'myScrapy.pipelines.QueueExportPipe': No module named database.databaseFactory

Why does Scrapy ignore my PYTHONPATH? What can I do now? I really don't want to use sys.path.append() in my code.

3 Answers:

Answer 0 (score: 0):

You have to tell Python about your PYTHONPATH:

export PYTHONPATH=/path/to/myProject/

Then run Scrapy:

sudo scrapy crawl spiderName 2> error.log
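
A quick sanity check before involving sudo, to confirm the variable is visible to a plain shell (assuming the same project layout as above):

python -c "import database.databaseFactory; print('OK')"

If this prints OK but the sudo command still fails, the environment is being stripped by sudo itself (see the next answer).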

Answer 1 (score: 0):

By default, a command launched with sudo does not run in your normal environment, so your PYTHONPATH is forgotten. To keep PYTHONPATH under sudo, do the following (a minimal sudoers sketch follows the list):

  • Add PYTHONPATH to the Defaults env_keep += "ENV1 ENV2 ..." entry in the sudoers file
  • Remove Defaults !env_reset from the sudoers file, if present
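
A minimal sketch of the relevant sudoers line (always edit this file with visudo; the exact layout varies by distribution):

# /etc/sudoers
# Preserve PYTHONPATH when sudo resets the environment:
Defaults env_keep += "PYTHONPATH"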

Answer 2 (score: -1):

使用“sys.path.append()”有什么问题?我尝试了许多其他方法,并确定“scrapy”不支持用户定义包的“$ PYTHONPATH”。我怀疑它在框架通过查找阶段后加载目录。但我尝试了“sys.path.append()”,它正在运行。
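
For reference, a minimal sketch of that workaround at the top of the Scrapy project's settings.py (the hard-coded path is a placeholder; point it at the directory that contains database/):

# settings.py
import sys
sys.path.append('/my/path/to/my/project')  # placeholder: the directory containing database/

After this, modules such as database.databaseFactory can be imported from the pipeline code.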