Unable to run Spark Decision Tree in a RapidMiner process

Time: 2018-11-07 09:48:41

Tags: apache-spark hadoop yarn rapidminer

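I am using Windows 8.1, Hadoop 2.6, Spark 1.6, Hive, and RapidMiner 9.0. I have a process made up of these operators: Retrieve (data from Hive), Set Role, and Spark Decision Tree. Radoop is configured and I can access the Hive data. When I run the process, the Spark Decision Tree operator is the last one to run, and it produces no result. The YARN web UI shows that the Spark job failed, and RapidMiner gives me this error: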

Please verify your Spark Resource Allocation settings on the Advanced Connection Properties window. You can check the logs of the Spark job on the ResourceManager web interface  at http://${yarn.resourcemanager.hostname}:8088.

In the YARN ResourceManager log, I see the following messages:

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
18/11/07 10:43:50 INFO scheduler.AppSchedulingInfo: Application application_1541582549574_0006 requests cleared
18/11/07 10:43:50 INFO rmapp.RMAppImpl: application_1541582549574_0006 State change from FINAL_SAVING to FAILED
18/11/07 10:43:50 INFO capacity.LeafQueue: Application removed - appId: application_1541582549574_0006 user: user queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
18/11/07 10:43:50 WARN resourcemanager.RMAuditLogger: USER=user OPERATION=Application Finished - Failed TARGET=RMAppManager     RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED       PERMISSIONS=Application application_1541582549574_0006 failed 1 times due to AM Container for appattempt_1541582549574_0006_000001 exited with  exitCode: 1
For more detailed output, check application tracking page:http://asus:8088/proxy/application_1541582549574_0006/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1541582549574_0006_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Shell output:         1 fichier(s) déplacé(s).


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.  APPID=application_1541582549574_0006
18/11/07 10:43:50 INFO capacity.ParentQueue: Application removed - appId: application_1541582549574_0006 user: user leaf-queue of parent: root #applications: 0
18/11/07 10:43:50 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1541582549574_0006,name=Decision Tree,user=user,queue=default,state=FAILED,trackingUrl=http://asus:8088/cluster/app/application_1541582549574_0006,appMasterHost=N/A,startTime=1541583816334,finishTime=1541583830966,finalStatus=FAILED
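
To make the last log line easier to read, here is a minimal Python sketch that splits the ApplicationSummary entry into fields and computes how long the application ran. The helper names `parse_summary` and `runtime_seconds` are made up for illustration, and the parsing assumes that no field value contains a comma, which holds for the line above but may not hold in general.

```python
# Sketch only: parse the ApplicationSummary line copied from the log above.
# Assumption: fields are comma-separated key=value pairs with no commas in values.
SUMMARY = (
    "appId=application_1541582549574_0006,name=Decision Tree,user=user,"
    "queue=default,state=FAILED,"
    "trackingUrl=http://asus:8088/cluster/app/application_1541582549574_0006,"
    "appMasterHost=N/A,startTime=1541583816334,finishTime=1541583830966,"
    "finalStatus=FAILED"
)


def parse_summary(line: str) -> dict:
    """Split a comma-separated key=value summary into a dict of strings."""
    fields = {}
    for pair in line.split(","):
        # partition splits on the first "=", so "=" inside a value is preserved
        key, _, value = pair.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def runtime_seconds(fields: dict) -> float:
    """Elapsed time between startTime and finishTime (both epoch milliseconds)."""
    return (int(fields["finishTime"]) - int(fields["startTime"])) / 1000.0


summary = parse_summary(SUMMARY)
print(summary["state"], summary["appMasterHost"], runtime_seconds(summary))
```

For this log line the application ran for about 14.6 seconds and `appMasterHost` is `N/A`, which is consistent with the "AM Container ... exited with exitCode: 1" message: the ApplicationMaster container appears to have failed at launch, before any Spark work started.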

This is the XML process description:

<?xml version="1.0" encoding="UTF-8"?><process version="9.0.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="radoop:radoop_nest" compatibility="9.0.002" expanded="true" height="103" name="Radoop Nest" width="90" x="179" y="85">
        <parameter key="connection" value="hadoop"/>
        <parameter key="change_sample_size" value="true"/>
        <enumeration key="tables_to_reload"/>
        <process expanded="true">
          <operator activated="true" class="radoop:retrieve" compatibility="9.0.002" expanded="true" height="68" name="Retrieve (2)" width="90" x="45" y="340">
            <parameter key="table" value="forum"/>
          </operator>
          <operator activated="true" class="radoop:set_role" compatibility="9.0.002" expanded="true" height="82" name="Set Role" width="90" x="179" y="340">
            <parameter key="name" value="categforum"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="radoop:decision_tree_ml" compatibility="9.0.002" expanded="true" height="82" name="Decision Tree" width="90" x="313" y="340">
            <parameter key="file_format" value="PARQUET"/>
          </operator>
          <operator activated="true" class="radoop:apply_prediction" compatibility="9.0.002" expanded="true" height="82" name="Apply Model" width="90" x="447" y="340">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Retrieve (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Decision Tree" to_port="training set"/>
          <connect from_op="Decision Tree" from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_op="Decision Tree" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="output 1"/>
          <connect from_op="Apply Model" from_port="model" to_port="output 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
          <portSpacing port="sink_output 3" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Radoop Nest" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

When I run only the Spark test in RapidMiner, I get the following error:

SEVERE: java.util.concurrent.TimeoutException
SEVERE: Timeout on the Spark test job. Please verify your Spark Resource Allocation settings on the Advanced Connection Properties window. You can check the logs of the Spark job on the ResourceManager web interface at http://localhost:8088.
SEVERE: Test failed: Spark job
SEVERE: Integration test for 'hadoop' failed.
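
Both error messages point to the ResourceManager web interface on port 8088. As a hedged sketch, the same information can also be read from the YARN ResourceManager REST API at `/ws/v1/cluster/apps/{appid}`, which returns the application's `state`, `finalStatus`, and `diagnostics`. The function names below are made up for illustration, and the host `asus` and the application id are simply the values from the log above; adjust them for another cluster.

```python
import json
from urllib.request import urlopen


def app_url(rm_host: str, app_id: str, port: int = 8088) -> str:
    """Build the ResourceManager REST URL for a single application."""
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps/{app_id}"


def summarize(payload: dict) -> dict:
    """Keep only the fields that matter when diagnosing a failed job."""
    app = payload["app"]
    return {key: app.get(key) for key in ("state", "finalStatus", "diagnostics")}


def fetch_app(rm_host: str, app_id: str, timeout: float = 10.0) -> dict:
    """Query the ResourceManager and return the summarized application report."""
    with urlopen(app_url(rm_host, app_id), timeout=timeout) as resp:
        return summarize(json.load(resp))


# Usage (needs network access to the ResourceManager, so it is left commented out):
# report = fetch_app("asus", "application_1541582549574_0006")
# print(report["state"], report["finalStatus"])
# print(report["diagnostics"])
```

The `diagnostics` field normally carries the same container-launch text that appears in the ResourceManager log, so this is only a convenience for checking several attempts without opening the web UI each time.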

0 answers:

No answers