Question

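I am reading the Flink source code to understand how the position of a key's state in the state array is computed, and I found that the position is calculated as keyGroupIndex - keyGroupOffset.
My questions are:

Why use keyGroupIndex - keyGroupOffset as the position instead of using state[keyGroupIndex] directly?

Also, I noticed that the statement Map<N, Map<K, S>>[] state = (Map<N, Map<K, S>>[]) new Map[keyContext.getNumberOfKeyGroups()]; creates a state array whose size is the number of key groups, so using state[keyGroupIndex] directly would also be a one-to-one mapping.
Why do we need KeyGroupRange?

The code below is extracted from NestedMapsStateTable.java: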

this.keyGroupOffset = keyContext.getKeyGroupRange().getStartKeyGroup();

@VisibleForTesting
Map<N, Map<K, S>> getMapForKeyGroup(int keyGroupIndex) {
    final int pos = indexToOffset(keyGroupIndex);
    if (pos >= 0 && pos < state.length) {
        return state[pos];
    } else {
        return null;
    }
}

private int indexToOffset(int index) {
    return index - keyGroupOffset;
}

public NestedMapsStateTable(InternalKeyContext<K> keyContext, RegisteredKeyedBackendStateMetaInfo<N, S> metaInfo) {
    super(keyContext, metaInfo);
    this.keyGroupOffset = keyContext.getKeyGroupRange().getStartKeyGroup();

    @SuppressWarnings("unchecked")
    Map<N, Map<K, S>>[] state = (Map<N, Map<K, S>>[]) new Map[keyContext.getNumberOfKeyGroups()];
    this.state = state;
}

https://github.com/apache/flink/blob/63c04a516f40ec2dca4d8edef58e7c2ef563ce67/flink-runtime/src/main/java/org/apache/flink/runtime/state/heap/NestedMapsStateTable.java

Answer 1

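The idea is that each StateBackend is responsible for only a subset of the complete key group range. Therefore, we only need to store a state map for each key group within our own range. To manage these state maps, we normalize the key group indices so that they start at 0.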
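The normalization described above can be sketched roughly as follows. This is a simplified, hypothetical stand-in and not Flink's actual class: KeyGroupOffsetSketch, its put method, and the String-typed maps are assumptions made for illustration, and the range end is assumed to be inclusive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a per-backend state table (not Flink code).
class KeyGroupOffsetSketch {
    // First key group owned by this backend (inclusive).
    private final int startKeyGroup;
    // One slot per key group in the backend's own range.
    private final Map<String, String>[] state;

    @SuppressWarnings("unchecked")
    KeyGroupOffsetSketch(int startKeyGroup, int endKeyGroup) {
        this.startKeyGroup = startKeyGroup;
        // Size the array by the length of the owned range (end assumed inclusive).
        this.state = (Map<String, String>[]) new Map[endKeyGroup - startKeyGroup + 1];
    }

    // Normalize a global key group index so that the owned range starts at 0.
    int indexToOffset(int keyGroupIndex) {
        return keyGroupIndex - startKeyGroup;
    }

    // Returns the map for an owned key group, or null if it is not owned or empty.
    Map<String, String> getMapForKeyGroup(int keyGroupIndex) {
        int pos = indexToOffset(keyGroupIndex);
        return (pos >= 0 && pos < state.length) ? state[pos] : null;
    }

    // Stores a value under the given key group, creating the map lazily.
    void put(int keyGroupIndex, String key, String value) {
        int pos = indexToOffset(keyGroupIndex);
        if (pos < 0 || pos >= state.length) {
            throw new IllegalArgumentException("Key group not owned: " + keyGroupIndex);
        }
        if (state[pos] == null) {
            state[pos] = new HashMap<>();
        }
        state[pos].put(key, value);
    }

    // Number of slots allocated for this backend.
    int slots() {
        return state.length;
    }
}
```

With a backend that owns key groups 64 to 127, key group 64 lands in slot 0 and key group 127 in slot 63, so the array needs only 64 slots even if the job has many more key groups in total.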

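However, there is a small bug in the code: it allocates a state map entry for every key group in the entire range, not just for the key groups in the backend's own range. This should be fixed. There is a corresponding JIRA issue for it.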
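To make the cost of that over-allocation concrete, here is a small arithmetic sketch. The class and method names are hypothetical, and the range end is assumed to be inclusive. It counts how many slots can actually be reached through index - offset when the array is sized by the total number of key groups.

```java
// Hypothetical helper comparing the two ways of sizing the state array (not Flink code).
class StateArraySizing {
    // Slots needed when the array covers only the backend's own range (end inclusive).
    static int slotsForRange(int startKeyGroup, int endKeyGroup) {
        return endKeyGroup - startKeyGroup + 1;
    }

    // Slots that are ever addressed when the array has totalKeyGroups entries
    // but lookups always go through (keyGroupIndex - startKeyGroup).
    static int reachableSlots(int totalKeyGroups, int startKeyGroup, int endKeyGroup) {
        int reachable = 0;
        for (int index = startKeyGroup; index <= endKeyGroup; index++) {
            int pos = index - startKeyGroup;
            if (pos >= 0 && pos < totalKeyGroups) {
                reachable++;
            }
        }
        return reachable;
    }

    // Slots allocated but never addressed for owned key groups.
    static int unusedSlots(int totalKeyGroups, int startKeyGroup, int endKeyGroup) {
        return totalKeyGroups - reachableSlots(totalKeyGroups, startKeyGroup, endKeyGroup);
    }
}
```

For example, with 128 key groups in total and a backend that owns key groups 32 to 63, only 32 slots are ever addressed, so an array of 128 entries leaves 96 of them unused. The lookups still return correct results because of the offset, which is why the issue is one of wasted space rather than wrong behavior.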
