Gensim Mallet CalledProcessError: returned non-zero exit status

Time: 2019-03-21 20:24:44

Tags: python windows jupyter-notebook gensim mallet

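I get an error when trying to use gensim's Mallet wrapper in a Jupyter notebook. The file "mallet" is in the same folder as the notebook, but gensim does not seem to be able to access it. I also tried pointing to it with a full path on the C drive, but I still get the same error. Please help :)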

import os
from gensim.models.wrappers import LdaMallet

#os.environ.update({'MALLET_HOME':r'C:/Users/new_mallet/mallet-2.0.8/'})

mallet_path = 'mallet' # update this path

# bow_corpus and dictionary are built earlier in the notebook
ldamallet = LdaMallet(mallet_path, corpus=bow_corpus, num_topics=20, id2word=dictionary)

result = ldamallet.show_topics(num_topics=3, num_words=10, formatted=False)
for each in result:
    print(each)

Mallet Error CalledProcessError

8 answers:

Answer 0 (score: 1)

I had the same problem. What I did was move the Mallet folder to C:/new_mallet, and after that it worked fine.

    import os
    from gensim.models.wrappers import LdaMallet

    os.environ.update({'MALLET_HOME': r'C:/new_mallet/mallet-2.0.8/'})
    mallet_path = 'C:/new_mallet/mallet-2.0.8/bin/mallet'  # update this path
    ldamallet = LdaMallet(mallet_path, corpus=corpus, num_topics=10, id2word=id2word)
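
Before handing the path to gensim, it can help to confirm that a file really exists where the path points, since a bare name such as 'mallet' is resolved against the current working directory. A minimal sketch, assuming the folder layout from this answer; `check_mallet_path` is a hypothetical helper, not part of gensim:

```python
import os


def check_mallet_path(mallet_path):
    """Return a list of problems found with a Mallet launcher path (empty if none)."""
    problems = []
    if not os.path.isabs(mallet_path):
        problems.append("path is relative; it is resolved against the current working directory")
    if not os.path.isfile(mallet_path):
        problems.append("no file exists at " + os.path.abspath(mallet_path))
    return problems


# Assumed location taken from this answer; adjust it to your own install.
for problem in check_mallet_path("C:/new_mallet/mallet-2.0.8/bin/mallet"):
    print(problem)
```

An empty result only means the file is there; it does not prove that Mallet itself runs.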

Answer 1 (score: 1)

In Jupyter Notebook with Python, I ran the following in cmd as administrator:

conda uninstall gensim
conda install gensim

Then I restarted my kernel. After a long time searching online, this is what finally worked like a charm.
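
After a reinstall and kernel restart, it may be worth checking which gensim version the kernel actually sees. Note that `gensim.models.wrappers` was removed in gensim 4.0, so the import used in this thread needs a 3.x release. A small sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints None when gensim is not installed in the environment running this kernel.
print(installed_version("gensim"))
```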

Answer 2 (score: 0)

Update the path to:

mallet_path = 'C:/mallet/mallet-2.0.8/bin/mallet.bat'

Then open mallet.bat from the Mallet 2.0.8 folder in Notepad and edit it to read:

@echo off

rem This batch file serves as a wrapper for several
rem  MALLET command line tools.

if not "%MALLET_HOME%" == "" goto gotMalletHome

echo MALLET requires an environment variable MALLET_HOME.
goto :eof

:gotMalletHome

set MALLET_CLASSPATH=C:\mallet\mallet-2.0.8\class;C:\mallet\mallet-2.0.8\lib\mallet-deps.jar
set MALLET_MEMORY=1G
set MALLET_ENCODING=UTF-8

set CMD=%1
shift

set CLASS=
if "%CMD%"=="import-dir" set CLASS=cc.mallet.classify.tui.Text2Vectors
if "%CMD%"=="import-file" set CLASS=cc.mallet.classify.tui.Csv2Vectors
if "%CMD%"=="import-svmlight" set CLASS=cc.mallet.classify.tui.SvmLight2Vectors
if "%CMD%"=="info" set CLASS=cc.mallet.classify.tui.Vectors2Info
if "%CMD%"=="train-classifier" set CLASS=cc.mallet.classify.tui.Vectors2Classify
if "%CMD%"=="classify-dir" set CLASS=cc.mallet.classify.tui.Text2Classify
if "%CMD%"=="classify-file" set CLASS=cc.mallet.classify.tui.Csv2Classify
if "%CMD%"=="classify-svmlight" set CLASS=cc.mallet.classify.tui.SvmLight2Classify
if "%CMD%"=="train-topics" set CLASS=cc.mallet.topics.tui.TopicTrainer
if "%CMD%"=="infer-topics" set CLASS=cc.mallet.topics.tui.InferTopics
if "%CMD%"=="evaluate-topics" set CLASS=cc.mallet.topics.tui.EvaluateTopics
if "%CMD%"=="prune" set CLASS=cc.mallet.classify.tui.Vectors2Vectors
if "%CMD%"=="split" set CLASS=cc.mallet.classify.tui.Vectors2Vectors
if "%CMD%"=="bulk-load" set CLASS=cc.mallet.util.BulkLoader
if "%CMD%"=="run" set CLASS=%1 & shift

if not "%CLASS%" == "" goto gotClass

echo Mallet 2.0 commands: 
echo   import-dir        load the contents of a directory into mallet instances (one per file)
echo   import-file       load a single file into mallet instances (one per line)
echo   import-svmlight   load a single SVMLight format data file into mallet instances (one per line)
echo   info              get information about Mallet instances
echo   train-classifier  train a classifier from Mallet data files
echo   classify-dir      classify data from a single file with a saved classifier
echo   classify-file     classify the contents of a directory with a saved classifier
echo   classify-svmlight classify data from a single file in SVMLight format
echo   train-topics      train a topic model from Mallet data files
echo   infer-topics      use a trained topic model to infer topics for new documents
echo   evaluate-topics   estimate the probability of new documents given a trained model
echo   prune             remove features based on frequency or information gain
echo   split             divide data into testing, training, and validation portions
echo   bulk-load         for big input files, efficiently prune vocabulary and import docs
echo Include --help with any option for more information


goto :eof

:gotClass

set MALLET_ARGS=

:getArg

if "%1"=="" goto run
set MALLET_ARGS=%MALLET_ARGS% %1
shift
goto getArg

:run

"C:\Program Files\Java\jdk-12\bin\java" -ea -Dfile.encoding=%MALLET_ENCODING% -classpath %MALLET_CLASSPATH% %CLASS% %MALLET_ARGS%

:eof

On the command line, these commands are useful for figuring out what is going on:

notepad mallet.bat
java
C:\Program Files\Java\jdk-12\bin\java
dir /OD
cd %userdir%
cd %userpath%
cd\
cd users
cd your_username
cd appdata\local\temp\2
dir /OD

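The problem was that Java was not installed correctly or the PATH did not include Java, and the Mallet classpath was not defined correctly. More information here: https://docs.oracle.com/javase/7/docs/technotes/tools/windows/classpath.html. I hope what fixed my error also helps others :)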
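
The Java check can also be done from the notebook itself. A minimal sketch, assuming Python 3.7+ and that Mallet is launched with whatever `java` is found on PATH (which is not the case if mallet.bat hard-codes the Java location as above); `java_status` is a hypothetical helper:

```python
import shutil
import subprocess


def java_status():
    """Return (path, version_text) for the java launcher on PATH, or (None, None)."""
    java = shutil.which("java")
    if java is None:
        return None, None
    # `java -version` prints its banner to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return java, (result.stderr or result.stdout).strip()


path, version = java_status()
if path is None:
    print("java was not found on PATH; install a JDK or add its bin folder to PATH")
else:
    print(path)
    print(version)
```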

Answer 3 (score: 0)

Make sure the Java Development Kit (JDK) is installed.

Credit goes to this other answer.

After installing the JDK, the following code for LDA Mallet worked like a charm!

import os
from gensim.models.wrappers import LdaMallet

os.environ.update({'MALLET_HOME':r'C:/mallet/mallet-2.0.8/'})
mallet_path = r'C:/mallet/mallet-2.0.8/bin/mallet.bat'

lda_mallet = LdaMallet(
        mallet_path,
        corpus = corpus_bow,
        num_topics = n_topics,
        id2word = dct,
    )

Answer 4 (score: 0)

For me, it was not an import or path problem.

I spent hours trying to solve it. I tried this solution, and nothing worked.

Looking back at an earlier call to LDA Mallet that had succeeded, I noticed that some parameters were not set, so I set them like this:

gensim.models.wrappers.LdaMallet(mallet_path=mallet_path, corpus=corpus, num_topics=num_topics, id2word=id2word, prefix='temp_file_', workers=4)

I really hope this helps you. Finding a solution to this problem was painful.
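
The `prefix` argument controls where the wrapper writes its temporary files, so pointing it at a directory known to be writable rules out one possible cause of the failure. A minimal sketch under that assumption; `make_mallet_prefix` is a hypothetical helper, and the `temp_file_` name is taken from the call above:

```python
import os
import tempfile


def make_mallet_prefix(name="temp_file_"):
    """Create a fresh writable directory and return a file-name prefix inside it."""
    workdir = tempfile.mkdtemp(prefix="mallet_")
    return os.path.join(workdir, name)


prefix = make_mallet_prefix()
print(prefix)
# The result can then be passed on as LdaMallet(..., prefix=prefix, workers=4).
```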

Answer 5 (score: 0)

On Linux, I found that the path to the mallet binary has to be given explicitly. The following code works.

from gensim.test.utils import common_corpus, common_dictionary
from gensim.models.wrappers import LdaMallet

mallet_path = "/path/Mallet/bin/mallet"
model = LdaMallet(mallet_path=mallet_path, corpus=common_corpus, num_topics=2, id2word=common_dictionary)
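
A CalledProcessError only reports the exit status, so running the launcher directly is one way to see the message behind it. A sketch assuming Python 3.7+ and the placeholder path from this answer; `run_and_report` is a hypothetical helper:

```python
import subprocess


def run_and_report(command):
    """Run a command and return (exit_code, combined stdout and stderr)."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # Raised when the file is missing or cannot be executed at all.
        return None, str(exc)
    return result.returncode, result.stdout + result.stderr


# Placeholder path from this answer; replace it with your own install location.
code, output = run_and_report(["/path/Mallet/bin/mallet"])
print(code)
print(output)
```

An exit code of `None` here means the operating system could not start the file, which points to a wrong path or missing execute permission rather than a problem inside Mallet.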

Answer 6 (score: 0)

For anyone else still struggling after spending hours on many different suggestions: I finally got it working!

Follow the instructions here (I'm on a Mac):

https://ps.au.dk/fileadmin/ingen_mappe_valgt/installing_mallet.pdf

I also closed Anaconda before starting; I don't know whether that matters.

In the terminal, I got the following error:

(base) myname-MacBook-Air:mallet-2.0.8 myname$ ./bin/mallet
-bash: ./bin/mallet: /bin/bash: bad interpreter: Operation not permitted

Then I followed these instructions to remove the quarantine attribute:

“bad interpreter: Operation not permitted” Error on El Capitan

I reopened Anaconda, and everything worked!
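
The linked instructions are not reproduced here. On macOS, the usual way to clear the quarantine flag from a downloaded file is `xattr -d com.apple.quarantine <file>`; the sketch below assumes that is what they describe. Both function names are hypothetical, and the path is the relative one from the terminal session above:

```python
import platform
import subprocess


def quarantine_command(path):
    """Build the xattr command that removes the quarantine attribute from one file."""
    return ["xattr", "-d", "com.apple.quarantine", path]


def remove_quarantine(path):
    """Run the command on macOS only; return True if it ran and exited with status 0."""
    if platform.system() != "Darwin":
        return False
    return subprocess.run(quarantine_command(path)).returncode == 0


# Assumes the current directory is the mallet-2.0.8 folder.
print(quarantine_command("./bin/mallet"))
```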

Answer 7 (score: 0)