parentDeltaQuery does not update documents in Solr DataImportHandler

Time: 2018-02-20 16:54:38

Tags: solr dataimporthandler

I am using Solr's DataImportHandler to index Postgres data into Solr. My document structure has a parent entity and a corresponding child entity. Here is the data-config.xml file:

<entity name="assessment" pk="assm_pk"
    query="select assm_pk,'building' as table_name,assm_no,own_name,oaadhar_no,floor_ar,prp_usg,bld_id,ulb_id from assessment_table"
    deltaImportQuery="SELECT assm_pk,'building' as table_name,assm_no,own_name,oaadhar_no,floor_ar,ulb_id,prp_usg,bld_id from assessment_table WHERE assm_pk='${dih.delta.assm_pk}'"
    deltaQuery="select assm_pk from assessment_table where update_ts &gt; '${dih.last_index_time}'">
 <field column="table_name" name="table_name"/>
 <field column="assm_no" name="assm_no"/>
 <field column="own_name" name="own_name"/>
 <field column="oaadhar_no" name="aadhar_no"/>
 <field column="floor_ar" name="floor_area"/>
 <field column="prp_usg" name="prp_usg"/>
 <field column="ulb_id" name="ulb_id"/>
 <field column="assm_pk" name="assm_pk"/>
 <entity name="building" pk="id" query="select id,bld_id,latitude,longitude,road_name from v2_buildings where CAST(bld_id as text) = '${assessment.bld_id}'"
        deltaQuery="select id,bld_id from v2_buildings limit 100"
        parentDeltaQuery="select assm_pk from assessment_table p where cast(p.bld_id as text) = '${dih.delta.bld_id}'">
        <field column="bld_id" name="bld_id"/>
        <field column="latitude" name="latitude"/>
        <field column="longitude" name="longitude"/>
        <field column="road_name" name="road_name"/>
 </entity>
</entity>

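When I run the delta-import command, at least 100 documents (those associated with the child entity building) should be updated, but Solr does not update any documents. Here is the final status: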

{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "db-data-config.xml"
    ]
  ],
  "command": "status",
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "103",
    "Total Rows Fetched": "100",
    "Total Documents Processed": "0",
    "Total Documents Skipped": "0",
    "Delta Dump started": "2018-02-20 16:48:43",
    "Identifying Delta": "2018-02-20 16:48:43",
    "Deltas Obtained": "2018-02-20 16:48:43",
    "Building documents": "2018-02-20 16:48:43",
    "Total Changed Documents": "0",
    "": "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.",
    "Committed": "2018-02-20 16:48:43",
    "Time taken": "0:0:0.290"
  }
}

Here are the relevant logs for the delta-import command:

2018-02-20 16:48:43.692 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.JdbcDataSource Time taken for getConnection(): 2
2018-02-20 16:48:43.694 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Completed ModifiedRowKey for Entity: building rows obtained : 100
2018-02-20 16:48:43.694 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Completed DeletedRowKey for Entity: building rows obtained : 0
2018-02-20 16:48:43.694 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.SqlEntityProcessor Running parentDeltaQuery for Entity: building
2018-02-20 16:48:43.698 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.SqlEntityProcessor Running parentDeltaQuery for Entity: building
...
2018-02-20 16:48:43.959 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Completed parentDeltaQuery for Entity: building
2018-02-20 16:48:43.959 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Running ModifiedRowKey() for Entity: assessment
2018-02-20 16:48:43.959 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.JdbcDataSource Creating a connection for entity assessment with URL: jdbc:postgresql://localhost:5432/testdata
2018-02-20 16:48:43.962 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.JdbcDataSource Time taken for getConnection(): 2
2018-02-20 16:48:43.967 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Completed ModifiedRowKey for Entity: assessment rows obtained : 0
2018-02-20 16:48:43.967 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Completed DeletedRowKey for Entity: assessment rows obtained : 0
2018-02-20 16:48:43.967 INFO  (Thread-50) [   x:psqlTest] o.a.s.h.d.DocBuilder Completed parentDeltaQuery for Entity: assessment

Why are no documents being updated? The log shows that the parentDeltaQuery for the building entity is running.

1 Answer:

Answer 0: (score: 0)

I found the answer while tinkering with db-config.xml. The problem was in the parentDeltaQuery:

parentDeltaQuery="select assm_pk from assessment_table p where cast(p.bld_id as text) = '${dih.delta.bld_id}'"

Instead of dih.delta.bld_id, I should have written building.bld_id, where building is the name of the child entity.
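
For reference, here is a sketch of the child entity with that one change applied. Everything except the variable inside parentDeltaQuery is copied from the question. As far as I understand it, the rows returned by a child's deltaQuery are exposed to parentDeltaQuery under the child entity's own name, while the ${dih.delta.*} namespace is meant for deltaImportQuery. That would be consistent with the log above, where parentDeltaQuery ran for each of the 100 rows but matched no parent rows. Treat that explanation as an assumption rather than confirmed behavior.

```xml
<!-- Child entity: only the variable in parentDeltaQuery differs from the question. -->
<entity name="building" pk="id"
        query="select id,bld_id,latitude,longitude,road_name from v2_buildings where CAST(bld_id as text) = '${assessment.bld_id}'"
        deltaQuery="select id,bld_id from v2_buildings limit 100"
        parentDeltaQuery="select assm_pk from assessment_table p where cast(p.bld_id as text) = '${building.bld_id}'">
    <!-- ${building.bld_id} refers to the bld_id column of each row returned by this entity's deltaQuery -->
    <field column="bld_id" name="bld_id"/>
    <field column="latitude" name="latitude"/>
    <field column="longitude" name="longitude"/>
    <field column="road_name" name="road_name"/>
</entity>
```

Note that the deltaQuery here is still the test query from the question (limit 100), so it reports the same 100 rows as changed on every run. A real setup would presumably filter on a modification timestamp, the way the parent's deltaQuery does with update_ts.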