Question

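As I mentioned in the title of this post, I found out the hard way that OOP is slower than structural programming (spaghetti code).

I wrote a simulated annealing program using OOP, then removed one class and rewrote its logic structurally in the main form. Suddenly it got much faster. In the OOP version I was calling the removed class on every iteration.

I also checked this with Tabu Search, with the same result. Can anyone tell me why this happens, and how I can fix it in other OOP programs? Are there any tricks, for example caching my classes or something like that?

(The programs are written in C#.)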

Answer 1

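If you have a high-frequency loop, and inside that loop you create new objects and don't call other functions very much, then yes, you will see that if you can avoid those `new`s, say by reusing a single copy of the object, you can save a large fraction of the total time.

Between `new`, constructors, destructors, and garbage collection, very little code can waste a whole lot of time. Use them sparingly.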
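To make the reuse idea concrete, here is a minimal sketch in Rust (chosen only to match the other examples in this thread; the question's programs are in C#, where the same pattern applies to `new` inside the loop). The `Neighbourhood` type, its methods, and the arithmetic are hypothetical stand-ins, not the asker's code. Both functions compute the same result; the second one constructs the helper object once and reuses its buffer.

```rust
// Hypothetical stand-in for a helper object used on every iteration.
// It owns a heap-allocated buffer.
struct Neighbourhood {
    candidates: Vec<u64>,
}

impl Neighbourhood {
    fn new() -> Self {
        Neighbourhood { candidates: Vec::new() }
    }

    // Refill the buffer for this iteration. clear() drops the elements
    // but keeps the already-allocated capacity.
    fn fill(&mut self, iteration: u64) {
        self.candidates.clear();
        for k in 0..8 {
            self.candidates.push(iteration.wrapping_mul(31).wrapping_add(k));
        }
    }

    fn best(&self) -> u64 {
        self.candidates.iter().copied().min().unwrap_or(0)
    }
}

// Constructs a new object, and so a new heap buffer, on every iteration.
fn run_with_fresh_objects(iterations: u64) -> u64 {
    let mut total = 0u64;
    for i in 0..iterations {
        let mut n = Neighbourhood::new();
        n.fill(i);
        total = total.wrapping_add(n.best());
    }
    total
}

// Constructs the object once, outside the loop, and reuses it.
fn run_with_reused_object(iterations: u64) -> u64 {
    let mut total = 0u64;
    let mut n = Neighbourhood::new();
    for i in 0..iterations {
        n.fill(i);
        total = total.wrapping_add(n.best());
    }
    total
}

fn main() {
    let fresh = run_with_fresh_objects(1_000_000);
    let reused = run_with_reused_object(1_000_000);
    assert_eq!(fresh, reused);
    println!("{}", fresh);
}
```

How much this saves depends on the allocator (or, in C#, the garbage collector) and on how little other work the loop does, so it is worth measuring rather than assuming.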

Answer 2

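Memory access is often overlooked. The way o.o. tends to lay out data in memory is not conducive to efficient memory access in loops. Consider the following pseudocode: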

adult_clients = 0
for client in list_of_all_clients:
  if client.age >= AGE_OF_MAJORITY:
    adult_clients++

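It so happens that the way this is accessed from memory is quite inefficient on modern architectures, because they like accessing large contiguous runs of memory. Here we only care about client.age, for every client we have, and those age fields are not laid out contiguously in memory.

The focus on objects that have fields results in data being laid out so that fields holding the same kind of information do not sit in consecutive memory. Performance-heavy code tends to involve loops that repeatedly look at data with the same conceptual meaning. It helps performance when such data is laid out contiguously.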

Consider these two examples in Rust:

// struct that contains an id, and an optional value of whether the id is divisible by three
struct Foo {
    id         : u32,
    divbythree : Option<bool>,
}

fn main() {
    // create a pretty big vector of these structs with increasing ids, and divbythree initialized as None
    let mut vec_of_foos : Vec<Foo> = (0..100000000).map(|i| Foo{ id : i, divbythree : None }).collect();

    // loop over all these structs, determine if the id is divisible by three
    // and set divbythree accordingly
    let mut divbythrees = 0;
    for foo in vec_of_foos.iter_mut() {
        if foo.id % 3 == 0 {
            foo.divbythree = Some(true);
            divbythrees += 1;
        } else {
            foo.divbythree = Some(false);
        }
    }
    // print the number of times it was divisible by three
    println!("{}", divbythrees);
}

On my system, compiled with rustc -O, the real time is 0m0.436s. Now consider this example:

fn main() {
    // this time we create two vectors rather than a vector of structs
    let vec_of_ids             : Vec<u32>          = (0..100000000).collect();
    let mut vec_of_divbythrees : Vec<Option<bool>> = vec![None; vec_of_ids.len()];
    
    // but we basically do the same thing
    let mut divbythrees = 0;
    for i in 0..vec_of_ids.len() {
        if vec_of_ids[i] % 3 == 0 {
            vec_of_divbythrees[i] = Some(true);
            divbythrees += 1;
        } else {
            vec_of_divbythrees[i] = Some(false);
        }
    }
    println!("{}", divbythrees);
}

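This runs in 0m0.254s at the same optimization level, close to half the time needed.

Despite having to allocate two vectors instead of one, storing similar values in contiguous memory has almost halved the execution time. Though obviously the o.o. approach provides much nicer and more maintainable code.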
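If you want to repeat the comparison on your own machine, a minimal sketch along these lines may help. It puts both layouts in one program and times them with `std::time::Instant`. The function names `count_aos` and `count_soa` and the smaller element count are my own choices for illustration. `std::hint::black_box` is there only to discourage the optimizer from discarding the stores, and like the figures above each timing includes the allocation.

```rust
use std::hint::black_box;
use std::time::Instant;

#[allow(dead_code)]
struct Foo {
    id: u32,
    divbythree: Option<bool>,
}

// Layout 1: one vector of structs, as in the first example.
fn count_aos(n: u32) -> u32 {
    let mut foos: Vec<Foo> = (0..n).map(|i| Foo { id: i, divbythree: None }).collect();
    let mut count = 0;
    for foo in foos.iter_mut() {
        let divisible = foo.id % 3 == 0;
        foo.divbythree = Some(divisible);
        if divisible {
            count += 1;
        }
    }
    black_box(&foos);
    count
}

// Layout 2: two separate vectors, as in the second example.
fn count_soa(n: u32) -> u32 {
    let ids: Vec<u32> = (0..n).collect();
    let mut flags: Vec<Option<bool>> = vec![None; ids.len()];
    let mut count = 0;
    for i in 0..ids.len() {
        let divisible = ids[i] % 3 == 0;
        flags[i] = Some(divisible);
        if divisible {
            count += 1;
        }
    }
    black_box(&flags);
    count
}

fn main() {
    let n = 10_000_000;

    let start = Instant::now();
    let a = count_aos(n);
    let aos_time = start.elapsed();

    let start = Instant::now();
    let b = count_soa(n);
    let soa_time = start.elapsed();

    assert_eq!(a, b);
    println!("vector of structs: {} hits in {:?}", a, aos_time);
    println!("separate vectors:  {} hits in {:?}", b, soa_time);
}
```

Build it with optimizations (for example rustc -O) and run it a few times. The ratio will vary with the CPU, the cache sizes, and the compiler version.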

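P.s.: it occurs to me that I should probably explain why this matters so much, given that in both cases the code itself still indexes memory one field at a time rather than, say, putting a large swath on the stack. The reason is c.p.u. caches: when the program asks for the memory at a certain address, the c.p.u. actually fetches and caches a significant chunk of memory around that address, and if memory next to it is requested again soon after, it can be served from the cache rather than from actual physical main memory. Of course, compilers will also vectorize the second version more efficiently as a consequence.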
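As a rough illustration of the cache argument, the sketch below prints how many elements of each type fit in one cache line. The 64-byte line size is an assumption (it is common on current x86-64 and many ARM chips, but check your own hardware), and `elements_per_line` is a hypothetical helper. On typical targets `Foo` is padded up to a multiple of its 4-byte alignment, so each struct takes more room than the `u32` and `Option<bool>` it contains when those are stored in separate vectors.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Foo {
    id: u32,
    divbythree: Option<bool>,
}

// Assumed cache line size in bytes. This is an assumption, not something
// the program queries from the hardware.
const ASSUMED_CACHE_LINE_BYTES: usize = 64;

// How many values of type T fit into one assumed cache line.
fn elements_per_line<T>() -> usize {
    ASSUMED_CACHE_LINE_BYTES / size_of::<T>()
}

fn main() {
    println!(
        "Foo:          {} bytes, {} per line",
        size_of::<Foo>(),
        elements_per_line::<Foo>()
    );
    println!(
        "u32:          {} bytes, {} per line",
        size_of::<u32>(),
        elements_per_line::<u32>()
    );
    println!(
        "Option<bool>: {} bytes, {} per line",
        size_of::<Option<bool>>(),
        elements_per_line::<Option<bool>>()
    );
}
```

The fewer bytes the loop has to pull through the cache per element, the more iterations each fetched line serves.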

OOP is much slower than structural programming. Why, and how can it be fixed?
