SOLR DIH multiple queries

Time: 2014-05-16 12:57:18

Tags: solr dataimporthandler dih

Question

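SOLR DIH accumulates the queries on each iteration. For example, the third iteration produces the following output, in which the same query appears three times: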

"entity:us-patent-grant-xslt",
  [
    "document#3",
    [
      "query",
      "/var/www/data1/US07985001-20110726.XML",
      "query",
      "/var/www/data1/US07985001-20110726.XML",
      "query",
      "/var/www/data1/US07985001-20110726.XML",
      "time-taken",
      "0:0:0.0",
      "time-taken",
      "0:0:0.0",
      "time-taken",
      "0:0:0.0",
      null,
      "----------- row #1-------------",
      "id",
      "US7985001",
      "pub_date",
      "2011-07-26 00:00:00",
      null,
      "---------------------------------------------"
    ],

Data config file

<entity name="pickupdir"
        processor="FileListEntityProcessor"
        rootEntity="false"
        dataSource="null"
        fileName="^[\w\d-]+\.XML$"
        baseDir="/var/www/data1/"
        recursive="true"
        onError="skip">

    <entity name="us-patent-grant-xslt"
            url="${pickupdir.fileAbsolutePath}"
            xsl="data.xsl"
            processor="XPathEntityProcessor"
            useSolrAddSchema="true"
            rootEntity="true"
            onError="skip">

        <field column="id" />
        <field column="pub_date" />
    </entity>
</entity>

As a result, when I bulk-upload data, the queries pile up on every iteration and performance lags. My server currently processes 2 docs/second.

I am not using SQL entities, so I cannot use CachedSqlEntityProcessor.
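
To put a number on the accumulation described above, the "query" entries in each document's debug output can be counted. The following is only a minimal Python sketch: it assumes the output is a flat list that alternates keys and values (as the document#3 excerpt appears to), and `count_queries` is a hypothetical helper name, not part of Solr.

```python
def count_queries(entries):
    """Count "query" keys in a flat [key, value, key, value, ...] list.

    Assumption: keys sit at even positions and values at odd positions,
    as in the document#3 excerpt from the question.
    """
    return sum(1 for key in entries[0::2] if key == "query")


# Entries copied from the document#3 output in the question (null -> None).
document_3 = [
    "query", "/var/www/data1/US07985001-20110726.XML",
    "query", "/var/www/data1/US07985001-20110726.XML",
    "query", "/var/www/data1/US07985001-20110726.XML",
    "time-taken", "0:0:0.0",
    "time-taken", "0:0:0.0",
    "time-taken", "0:0:0.0",
    None, "----------- row #1-------------",
    "id", "US7985001",
    "pub_date", "2011-07-26 00:00:00",
    None, "---------------------------------------------",
]

print(count_queries(document_3))  # 3 query entries for the third document
```

If the count grows by one for each successive document, that matches the behavior reported here; the sketch only measures the symptom and does not explain its cause.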

Similar question

solr-dih does multiple queries for sub-entities

0 answers:

No answers