Elasticsearch data and mapping

Date: 2017-08-09 15:10:34

Tags: elasticsearch groovy elasticsearch-5

I am migrating Elasticsearch production data from v1.4.3 to v5.5 using reindex. When I reindex the old ES index into the new ES index, the reindex fails with this exception: Failed Reason: mapper [THROUGHPUT_ROWS_PER_SEC] cannot be changed from type [long] to [float]. Failed Type: illegal_argument_exception

ES mapping of the task_history index in ES 1.4.3

{
   "task_history": {
      "mappings": {
         "task_run_hist": {
            "_all": {
               "enabled": false
            },
            "_routing": {
               "required": true,
               "path": "org_id"
            },
            "properties": {
               "RUN_TIME_IN_MINS": {
                  "type": "double"
               },
               "THROUGHPUT_ROWS_PER_SEC": {
                  "type": "long"
               },
               "account_name": {
                  "type": "string",
                  "index": "not_analyzed",
                  "store": true
               }
            }
         }
      }
   }
}

ES mapping of the task_history index in ES 5.5 (this mapping was created as part of the reindex)

{
  "task_history": {
    "mappings": {
      "task_run_hist": {
        "_all": {
          "enabled": false
        },
        "_routing": {
          "required": true
        },
        "properties": {
          "RUN_TIME_IN_MINS": {
            "type": "float"
          },
          "THROUGHPUT_ROWS_PER_SEC": {
            "type": "long"
          },
          "account_name": {
            "type": "keyword",
            "store": true
          }
        }
      }
    }
  }
}

Sample data

{
  "_index": "task_history",
  "_type": "task_run_hist",
  "_id": "1421955143",
  "_score": 1,
  "_source": {
    "RUN_TIME_IN_MINS": 0.47,
    "THROUGHPUT_ROWS_PER_SEC": 46,
    "org_id": "xxxxxx",
    "account_name": "Soma Acc1"
  }
},
{
  "_index": "task_history",
  "_type": "task_run_hist",
  "_id": "1421943738",
  "_score": 1,
  "_source": {
    "RUN_TIME_IN_MINS": 1.02,
    "THROUGHPUT_ROWS_PER_SEC": 65.28,
    "org_id": "yyyyyy",
    "account_name": "Choma Acc1"
  }
}

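Questions

  1. How does Elasticsearch 1.4.3 store floating-point values when THROUGHPUT_ROWS_PER_SEC is mapped as long?
  2. If this is a data problem in the old ES, how can I remove all floating-point values before starting the reindex?
  3. For the second option, I tried the query below to list all documents that hold a floating-point value, so that I could verify them once and delete them, but the query still lists documents whose THROUGHPUT_ROWS_PER_SEC is not a floating-point number.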

    Note: Groovy scripting is enabled

    GET task_history/task_run_hist/_search?size=100
    {
       "filter": {
          "script": {
             "script": "doc['THROUGHPUT_ROWS_PER_SEC'].value % 1 == 0"
          }
       }
    }
    

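    Update after trying the solution provided by Val

    When I use the script below in the reindex, I get a runtime error, listed further down. What is going wrong here? I added the extra statement that converts RUN_TIME_IN_MINS to float because the original script reported an error on the RUN_TIME_IN_MINS field: "mapper [RUN_TIME_IN_MINS] cannot be changed from type [long] to [float]"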

    POST _reindex?wait_for_completion=false
    {
      "source": {
        "remote": {
          "host": "http://esip:15000"
        },
        "index": "task_history"
      },
      "dest": {
        "index": "task_history"
      },
      "script": {
        "inline": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;",
        "lang": "painless"
      }
    }
    

    Runtime error

    {
      "completed": true,
      "task": {
        "node": "wZOzypYlSayIRlhp9y3lVA",
        "id": 645528,
        "type": "transport",
        "action": "indices:data/write/reindex",
        "status": {
          "total": 18249521,
          "updated": 4691,
          "created": 181721,
          "deleted": 0,
          "batches": 37,
          "version_conflicts": 0,
          "noops": 67076,
          "retries": {
            "bulk": 0,
            "search": 0
          },
          "throttled_millis": 0,
          "requests_per_second": -1,
          "throttled_until_millis": 0
        },
        "description": """
    reindex from [host=esip port=15000 query={
      "match_all" : {
        "boost" : 1.0
      }
    }][task_history] updated with Script{type=inline, lang='painless', idOrCode='if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;', options={}, params={}} to [task_history]
    """,
        "start_time_in_millis": 1502336063507,
        "running_time_in_nanos": 93094657751,
        "cancellable": true
      },
      "error": {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [],
        "script": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;",
        "lang": "painless",
        "caused_by": {
          "type": "null_pointer_exception",
          "reason": null
        }
      }
    }
    

1 Answer:

Answer 0 (score: 0)

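You apparently want to keep long in your existing ES 5.x mapping, so you only need to add a script to your reindex call that converts the THROUGHPUT_ROWS_PER_SEC field to a long. Something like this should do: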
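
A minimal sketch of such a reindex call, assuming the same remote host and index names as in the question. The request body is built as a Python dict here only so that it can be printed and checked before being sent; build_reindex_body is a hypothetical helper, and the null check in the Painless script is an addition, since documents that lack the field are a likely cause of the null_pointer_exception shown in the question.

```python
import json

# Painless script (an assumption, not necessarily the answer's exact code):
# cast THROUGHPUT_ROWS_PER_SEC to long only when the field is present, so
# documents without the field are copied unchanged instead of failing.
PAINLESS_SCRIPT = (
    "if (ctx._source.THROUGHPUT_ROWS_PER_SEC != null) { "
    "ctx._source.THROUGHPUT_ROWS_PER_SEC = "
    "(long) ctx._source.THROUGHPUT_ROWS_PER_SEC; }"
)


def build_reindex_body(remote_host, index):
    """Return the JSON body for POST _reindex (hypothetical helper)."""
    return {
        "source": {"remote": {"host": remote_host}, "index": index},
        "dest": {"index": index},
        "script": {"inline": PAINLESS_SCRIPT, "lang": "painless"},
    }


# Host and index name are taken from the question.
body = build_reindex_body("http://esip:15000", "task_history")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of POST _reindex?wait_for_completion=false. Unlike the ctx.op = 'noop' variant in the question, this keeps every document and only truncates the fractional part of the value.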

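
On questions 1 and 3: Elasticsearch keeps the original JSON in _source unchanged, while the value indexed for a long field is coerced to a whole number (with the default coerce setting, floating-point input is truncated). A script that reads doc['THROUGHPUT_ROWS_PER_SEC'].value therefore only ever sees whole numbers, so the % 1 == 0 test holds for every document. The Python sketch below simulates that behaviour; index_as_long and is_fractional are hypothetical helpers for illustration, not Elasticsearch APIs.

```python
def index_as_long(source_value):
    """Simulate coercing a JSON number into a long field (truncation)."""
    return int(source_value)


def is_fractional(value):
    """Return True when the number has a non-zero fractional part."""
    return value % 1 != 0


# _source values taken from the two sample documents in the question.
source_values = [46, 65.28]

for raw in source_values:
    indexed = index_as_long(raw)
    # Columns: _source value, indexed value,
    # fractional check on the indexed value, fractional check on _source.
    print(raw, indexed, is_fractional(indexed), is_fractional(raw))
```

Only the check against the _source value flags 65.28. To list the affected documents in ES 1.4.3, the script filter would probably need to read the _source value rather than doc[...], which is an untested assumption here and is slower because _source has to be loaded for each document.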