Question

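I am running Stream Enrich on Kinesis in production. After the enrichers have been running for only a few minutes, the machines hosting them go down because of the enrichers' high CPU utilization.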

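CPU utilization frequently exceeds 100%. In addition, I receive about 200K records, but my enrichers can only process 120K of them, and as a result latency in the Snowplow pipeline is very high.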

I am running 8 enricher instances, each with 2 CPUs and 8Gi of memory, and the Kinesis stream has 16 shards.

Below is the configuration for my enrichers.

enrich {

  streams {
    appName = "snowplow-enrich"
    sourceSink {
      enabled = kinesis
      region = <>

      threadPoolSize = 10

      aws {
        accessKey = <>
        secretKey = <>
      }
      initialPosition = LATEST
      backoffPolicy {
        minBackoff = 3000
        maxBackoff = 600000
      }
      maxRecords = 10000
    }

    in {
      raw = "snowplow-good"
    }

    out {
      enriched = "snowplow-enriched-good"
      bad = "snowplow-enriched-bad"
      pii = "snowplow-enriched-pii"
      partitionKey = "user_ipaddress"
    }

    buffer {
      byteLimit = 4500000
      recordLimit = 500
      timeLimit = 3000
    }

  }
}
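To make the sizing easier to reason about, the figures quoted above can be put side by side. This is only a rough sketch: the function name `summarize_sizing` and its parameter names are illustrative, it assumes `threadPoolSize` applies to each instance separately, and the time window for the 200K and 120K record counts is not stated, so only their ratio is computed.

```python
def summarize_sizing(instances, cpus_per_instance, shards,
                     thread_pool_size, records_in, records_processed):
    """Derive simple ratios from the deployment figures (illustrative helper)."""
    total_cpus = instances * cpus_per_instance
    return {
        # Total CPUs available across all enricher instances
        "total_cpus": total_cpus,
        # Kinesis shards each instance has to consume on average
        "shards_per_instance": shards / instances,
        # Shards per CPU across the whole fleet
        "shards_per_cpu": shards / total_cpus,
        # Configured threads per CPU (assumes threadPoolSize is per instance)
        "threads_per_cpu": thread_pool_size / cpus_per_instance,
        # Share of incoming records the enrichers manage to process
        "processed_fraction": records_processed / records_in,
    }


# Figures taken from the question and the configuration above
summary = summarize_sizing(
    instances=8,
    cpus_per_instance=2,
    shards=16,
    thread_pool_size=10,
    records_in=200_000,
    records_processed=120_000,
)

for name, value in summary.items():
    print(f"{name}: {value}")
```

With these inputs, each instance consumes 2 shards on average, the configuration gives 5 threads per CPU, and the enrichers process about 60% of the incoming records.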

Please suggest possible solutions to quickly bring CPU usage under control and keep up with the incoming data in the enricher.

High CPU usage issue with Snowplow enricher

0 answers: