初步分析

Question

哪种方式最有效处理不同的有序集合？

1. 使用TreeSet增强循环

Set<MyObj> ret = new TreeSet<>();
for (Foo foo : foos)
    ret.add(new MyObj(foo));

2. 简单流

List<MyObj> ret = foos.stream().map(MyObj::new)
                      .distinct().sorted()
                      .collect(Collectors.toList());

3. TreeSet Stream

Set<MyObj> ret = foos.stream().map(MyObj::new)
                     .collect(Collectors.toCollection(TreeSet::new));

第一种方式似乎是最不优雅但易于阅读的。第二种方式让我担心distinct和sorted将处理流两次。最后一种方式感觉还可以，但是流中的TreeSet开销是多少？

任何线索？谢谢。

Answer 1

初步分析

从Stream API源代码判断，我最初的猜测是：对于很多项，简单流（2）应该是最快的，明显优于TreeSet版本（1），然后TreeSet流（3）应该跟随一点点背后。对于短数据集（1）可能比（2）更好，这比（3）更好，因为流创建增加了一些开销。不同排序的流的工作方式大致如下：

Set<MyObj> set = new HashSet<>();
List<MyObj> result = new ArrayList<>();
for (Foo foo : foos) {
    MyObj myObj = new MyObj(foo);
    if(set.add(myObj))
        result.add(myObj);
}
result.sort(null);
return result;

让我们将此实现添加为（4）。它使用HashSet来检查结果是否不同，将它们添加到中间容器中，然后对其进行排序。这应该比维护TreeSet更快，因为我们不需要在每次插入后保持顺序（TreeSet执行，可能重新平衡树）。实际的流实现效率会稍差，因为它无法就生成的列表进行排序。相反，它创建中间容器，对其进行排序，然后使用一系列list.add调用将结果转储到最终列表中。

结果可能取决于初始foos集合中的元素数量以及不同元素的数量。我称之为多样性：多样性= 1意味着大致每个元素都不同; Diversity = 0.5意味着每个元素重复大约两次。此外，结果可能在很大程度上取决于初始元素顺序：当输入数据被预先排序或接近预先排序时，排序算法可能会快一个数量级。

实验设置

因此，让我们以下列方式参数化我们的测试：

尺寸（foos中的元素数量）：10,1000,100000
多样性（不同的分数）：1,0.5,0.2
presorted：true，false

我认为Foo只包含一个int字段。当然，结果可能在很大程度上取决于compareTo，equals和hashCode类的Foo实现，因为版本（2）和（4）使用equals和hashCode，而版本（1）和（3）使用compareTo。我们将简单地做到：

@Override
public int hashCode() {
    return x;
}

@Override
public boolean equals(Object o) {
    return this == o || (o != null && getClass() == o.getClass() && x == ((Foo) o).x);
}

@Override
public int compareTo(Foo o) {
    return Integer.compare(x, o.x);
}

可以通过以下方式生成预分类元素：

foos = IntStream.range(0, size)
                .mapToObj(x -> new Foo((int)(x*diversity)))
                .collect(Collectors.toList());

可以通过以下方式生成随机元素：

foos = new Random().ints(size, 0, (int) (size * diversity))
                   .mapToObj(Foo::new)
                   .collect(Collectors.toList());

使用JMH 1.13和JDK 1.8.0_101，VM 25.101-b13 64位执行测量

结果

预分配（所有时间均以μs为单位）：

diversity size      (1)      (2)      (3)      (4)
  1         10      0.2      0.5      0.3      0.2
  1       1000     48.0     36.0     53.0     24.2
  1     100000  14165.7   4759.0  15177.3   3341.6
0.5         10      0.2      0.3      0.2      0.2
0.5       1000     36.9     23.1     41.6     20.1
0.5     100000  11442.1   2819.2  12508.7   2661.3
0.2         10      0.1      0.3      0.2      0.2
0.2       1000     32.0     13.0     29.8     16.7
0.2     100000   8491.6   1969.5   8971.9   1951.7

未预先排序：

diversity size      (1)      (2)      (3)      (4)
  1         10      0.2      0.4      0.2      0.3
  1       1000     72.8     77.4     73.6     72.7
  1     100000  21599.9  16427.1  22807.8  16322.2
0.5         10      0.2      0.3      0.2      0.2
0.5       1000     64.8     46.9     69.4     45.5
0.5     100000  20335.2  11190.3  20658.6  10806.7
0.2         10      0.1      0.3      0.2      0.2
0.2       1000     48.0     19.6     56.7     22.2
0.2     100000  16713.0   5533.4  16885.0   5930.6

讨论

我最初的猜测一般都是正确的。对于预分类数据（2）和（4），当我们有100,000个元素时，时间会更好。当我们有很多重复时，差异变得更大，因为它们不会增加排序时间，重复插入HashSet比重复插入TreeSet更有效。对于随机数据，差异不那么显着，因为TreeSet性能比TimSort算法（Java用于对列表和数组进行排序）更少地依赖于输入数据顺序。对于小数据集，简单TreeSet很快，但使用（4）版本也可以具有竞争力。

基准源代码和原始结果可用here。

Answer 2

如果不分析输入，很难给出一个好的答案。无论如何，我会分享我的结果：

我为单个Foo制作了long个容器，为MyObj制作了Foo个容器。此外，我将所有测试结束，将数据复制到普通数组中。我还添加了两种方法：

4）。简单数组

@Benchmark
public void simpleArray(Blackhole bh) {
    MyObj[] ret = new MyObj[foos.size()];
    for (int i=0;i<foos.size();i++)
        ret[i] = new MyObj(foos.get(i));
    Arrays.sort(ret);
    int lastDistinct = 0;
    for (int i = 1; i < ret.length; i++) {
        if (ret[i].equals(ret[lastDistinct])) {
            continue;
        }
        lastDistinct++;
        ret[lastDistinct] = ret[i];
    }
    MyObj[] ret2 = new MyObj[lastDistinct + 1];
    System.arraycopy(ret, 0, ret2, 0, lastDistinct + 1);
    bh.consume(ret2);
}

5）。（2）的distinct和order的反转顺序：

@Benchmark
public void simpleStream_distinctAfterSort(Blackhole bh) {
    List<MyObj> ret = foos.stream().map(MyObj::new)
            .sorted().distinct()
            .collect(Collectors.toList());
    bh.consume(ret.toArray(new MyObj[ret.size()]));
}

测试设置：

public static final int MAX_SIZE = 10_000;
public static final long ELEM_THRESHOLD = MAX_SIZE * 10;
private List<Foo> foos;

@Setup
public void init() throws IOException, IllegalAccessException, InstantiationException {
    foos = new ArrayList<>(MAX_SIZE);
    for (int i = 0; i < MAX_SIZE; i++) {
        foos.add(new Foo(ThreadLocalRandom.current().nextLong(ELEM_THRESHOLD)));
    }
}

现在结果具有不同的大小和阈值：

Size=10_000
Threshold=Size*10
Benchmark                                         Mode  Cnt    Score   Error  Units
StreamBenchmark5.enhancedLoop_TreeSet            thrpt    2  478,978          ops/s
StreamBenchmark5.simpleArray                     thrpt    2  591,287          ops/s
StreamBenchmark5.simpleStream                    thrpt    2  407,556          ops/s
StreamBenchmark5.simpleStream_distinctAfterSort  thrpt    2  533,091          ops/s
StreamBenchmark5.treeSetStream                   thrpt    2  492,709          ops/s

Size=10_000
Threshold=Size/10
StreamBenchmark5.enhancedLoop_TreeSet            thrpt    2   937,908          ops/s
StreamBenchmark5.simpleArray                     thrpt    2   593,983          ops/s
StreamBenchmark5.simpleStream                    thrpt    2  3344,508          ops/s
StreamBenchmark5.simpleStream_distinctAfterSort  thrpt    2   560,652          ops/s
StreamBenchmark5.treeSetStream                   thrpt    2  1000,585          ops/s

Size=500_000
Threshold=Size*10
Benchmark                                         Mode  Cnt  Score   Error  Units
StreamBenchmark5.enhancedLoop_TreeSet            thrpt    2  1,809          ops/s
StreamBenchmark5.simpleArray                     thrpt    2  4,009          ops/s
StreamBenchmark5.simpleStream                    thrpt    2  2,443          ops/s
StreamBenchmark5.simpleStream_distinctAfterSort  thrpt    2  4,141          ops/s
StreamBenchmark5.treeSetStream                   thrpt    2  2,040          ops/s

Size=500_000
Threshold=Size/10
Benchmark                                         Mode  Cnt   Score   Error  Units
StreamBenchmark5.enhancedLoop_TreeSet            thrpt    2   5,724          ops/s
StreamBenchmark5.simpleArray                     thrpt    2   4,567          ops/s
StreamBenchmark5.simpleStream                    thrpt    2  19,001          ops/s
StreamBenchmark5.simpleStream_distinctAfterSort  thrpt    2   4,840          ops/s
StreamBenchmark5.treeSetStream                   thrpt    2   5,407          ops/s

Size=1_000_000
Threshold=Size/100
Benchmark                                         Mode  Cnt   Score   Error  Units
StreamBenchmark5.enhancedLoop_TreeSet            thrpt    2   4,529          ops/s
StreamBenchmark5.simpleArray                     thrpt    2   2,402          ops/s
StreamBenchmark5.simpleStream                    thrpt    2  35,699          ops/s
StreamBenchmark5.simpleStream_distinctAfterSort  thrpt    2   2,232          ops/s
StreamBenchmark5.treeSetStream                   thrpt    2   4,889          ops/s

正如您所看到的，根据重复数量的优选算法更改。最平衡的方法是TreeSet（3），但最快的方法几乎总是简单的流（order和distinct对应于输入数据）。

如果您愿意玩一下，test source就在这里。您需要JMH。

Answer 3

如果您只需要一个排序的Foo数组，并且想要快速完成，那么就不应该使用TreeSet。它太过分了。在您需要高速时使用流也不是一个好主意。

此外。有List.sort方法可以有效地用于ArrayList（默认情况下，它会复制临时数组中的元素并对其进行排序，但ArrayList内部已经有一个数组）。

所以有效代码是第一个的变体：

List<MyObj> ret = new ArrayList<>(foos.size());
for (Foo foo : foos) {
    ret.add(new MyObj(foo));
}
ret.sort();

很抱歉没有基准测试，请自己动手。

TreeSet与Java 8 Streams性能

1. 使用TreeSet增强循环

2. 简单流

3. TreeSet Stream

3 个答案:

初步分析

实验设置

结果

讨论