Error while creating a Mahout model

Time: 2013-01-04 05:59:02

Tags: mahout

I am training a Mahout classifier on my data, and I run the following commands to create the Mahout model:

./bin/mahout seqdirectory -i /tmp/mahout-work-root/MyData-all -o /tmp/mahout-work-root/MyData-seq

./bin/mahout seq2sparse -i /tmp/mahout-work-root/MyData-seq -o /tmp/mahout-work-root/MyData-vectors -lnorm -nv -wt tfidf

./bin/mahout split -i /tmp/mahout-work-root/MyData-vectors/tfidf-vectors --trainingOutput /tmp/mahout-work-root/MyData-train-vectors --testOutput /tmp/mahout-work-root/MyData-test-vectors --randomSelectionPct 40 --overwrite --sequenceFiles -xm sequential

./bin/mahout trainnb -i /tmp/mahout-work-root/Mydata-train-vectors -el -o /tmp/mahout-work-root/model -li /tmp/mahout-work-root/labelindex -ow

When I try to create the model with the trainnb command, I get the following exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.mahout.classifier.naivebayes.BayesUtils.writeLabelIndex(BayesUtils.java:119)
    at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.createLabelIndex(TrainNaiveBayesJob.java:152)

What could the problem be?

Note: the original example mentioned here works fine.

1 Answer:

Answer 0: (score: 0)

I think this may be a problem with how your training files are laid out. The files should be organized as follows:

MyData-all

\classA

 -file1
 -file2
 -...

\classB

 -filex

...
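
To check the layout before running seqdirectory, something like the following could help. It is only a minimal sketch, assuming the input directory is on the local filesystem; check_layout is a hypothetical helper, not part of Mahout, and the demo directory and file names are made up for illustration. It counts the class subdirectories and any files sitting directly in the root, since such files would have no class directory to supply a label.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): verifies that the input root
# contains at least two class subdirectories and no files directly in the root.
check_layout() {
    root=$1
    # Directories directly under the root are treated as class labels.
    classes=$(find "$root" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
    # Files directly under the root belong to no class directory.
    stray=$(find "$root" -mindepth 1 -maxdepth 1 -type f | wc -l | tr -d ' ')
    echo "classes=$classes stray_files=$stray"
    [ "$classes" -ge 2 ] && [ "$stray" -eq 0 ]
}

# Demo on a throwaway directory that mirrors the layout described above.
demo=$(mktemp -d)
mkdir -p "$demo/MyData-all/classA" "$demo/MyData-all/classB"
touch "$demo/MyData-all/classA/file1" "$demo/MyData-all/classA/file2"
touch "$demo/MyData-all/classB/filex"

if check_layout "$demo/MyData-all"; then
    echo "layout looks usable"
else
    echo "layout problem: expected class subdirectories only"
fi
```

If the check fails on the real data, moving each document into a directory named after its class and rerunning the pipeline from seqdirectory would be the first thing to try. Dumping the keys of the generated sequence files (Mahout ships a seqdumper tool for this) should also show whether each key carries a class directory in front of the file name.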