Question

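I want to implement a specific algorithm, but I can't find a suitable data structure for it. A simplified version of the algorithm looks like this: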

Input: A set of points.
Output: A new set of points.
Step 1: For each point, calculate the closest points in a radius.
Step 2: For each point, calculate a value "v" from the closest points subset.
Step 3: For each point, calculate a new value "w" from the closest points and
        the values "v" from the previous step, i.e., "w" depends on the neighbors
        and "v" of each neighbor.
Step 4: Update points.

In C++, I could solve it like this:

struct Point {
    Vector position;
    double v, w;
    std::vector<Point *> neighbors;
};

std::vector<Point> points = initializePoints();
calculateNeighbors(points);
calculateV(points); // points[0].v = value; for example.
calculateW(points);

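With a naive structure such as a list of points, I cannot write the value "v" back into the original set of points, and I would have to compute the neighbors twice. How can I avoid this and keep the functions pure, given that computing the neighbors is the most expensive part of the algorithm (more than 30% of the time)?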

PS: For those with experience in numerical methods and CFD, this is a simplified version of the Smoothed Particle Hydrodynamics method.

Update: Changed step 3 to make it clearer.

Answer 1

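It is a common myth that Haskell offers no mutation at all. In fact, it offers a very special kind of mutation: a value can change exactly once, from unevaluated to evaluated. The art of exploiting this special kind of mutation is called tying the knot. We will start with a data structure much like your C++ one: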

data Vector -- held abstract

data Point = Point
    { position  :: Vector
    , v, w      :: Double
    , neighbors :: [Point]
    }
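As a tiny illustration of tying the knot on its own, separate from the Point type above, here is a minimal sketch with two records that refer to each other. The Person type and the names alice and bob are hypothetical and exist only for this example.

```haskell
-- Hypothetical record whose field refers to another value of the same type.
data Person = Person
    { name   :: String
    , friend :: Person
    }

-- Each definition mentions the other. Laziness lets both be built without
-- any explicit mutation: the friend fields start out as unevaluated thunks.
alice, bob :: Person
alice = Person { name = "Alice", friend = bob }
bob   = Person { name = "Bob",   friend = alice }

-- Following the cycle twice leads back to the starting record.
roundTrip :: String
roundTrip = name (friend (friend alice))
```

The same idea, scaled up to a whole array of records, is what the rest of this answer relies on.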

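Now, what we are going to do is build an Array of Point values whose neighbors fields contain pointers to other elements of the same array. The key features of Array in the code below are that it is lazy in its elements (it does not force them too early) and that it has fast random access; you can substitute your favorite alternative data structure with these properties if you prefer.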

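There are many choices for the interface of the neighbor-finding function. To be concrete and to keep my own job simple, I will assume that you have a function that takes a Vector and a list of Vectors and returns the indices of the neighbors.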

findNeighbors :: Vector -> [Vector] -> [Int]
findNeighbors = undefined
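For experimentation, findNeighbors can be filled in with a brute-force search. The sketch below is only an illustration under stated assumptions: it assumes Vector is a pair of Doubles and that a neighbor is any other point within a fixed cutoff radius. Neither is specified above, and the names radius and distSq are made up for this example. It scans the whole list for every query, so a real SPH code would more likely use a grid or a tree.

```haskell
-- Assumption: a concrete 2D vector; the answer keeps Vector abstract.
type Vector = (Double, Double)

-- Assumption: a fixed cutoff radius, chosen arbitrarily for illustration.
radius :: Double
radius = 1.5

-- Squared Euclidean distance; avoids a square root in the comparison.
distSq :: Vector -> Vector -> Double
distSq (x1, y1) (x2, y2) = dx * dx + dy * dy
  where
    dx = x1 - x2
    dy = y1 - y2

-- Indices of all points within `radius` of `p`. Points at exactly the same
-- position as `p` (which includes `p` itself) are excluded.
findNeighbors :: Vector -> [Vector] -> [Int]
findNeighbors p vs =
    [ i | (i, q) <- zip [0 ..] vs, q /= p, distSq p q <= radius * radius ]
```

Because the signature is unchanged, this drops into the rest of the code without further edits.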

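Let's also put some types in place for computeV and computeW. For the nonce, we will ask that computeV live up to the informal contract you stated, namely that it may look at the position and neighbors fields of any Point, but not at the v or w fields. (Similarly, computeW may look at anything except the w field of any Point it can get its hands on.) It is actually possible to enforce this at the type level without too much gymnastics, but let's skip that for now.

computeV, computeW :: Point -> Double
(computeV, computeW) = undefined

Now we are ready to build our (labeled) in-memory graph.

buildGraph :: [Vector] -> Array Int Point
buildGraph vs = answer where
    answer = listArray (0, length vs-1) [point pos | pos <- vs]
    point pos = this where
        this = Point
            { position = pos
            , v = computeV this
            , w = computeW this
            , neighbors = map (answer!) (findNeighbors pos vs)
            }

And that's it, really. Now you can write your

newPositions :: Point -> [Vector]
newPositions = undefined

where newPositions is perfectly free to inspect any field of the Point it is handed, and put all the functions together:

-- (<=<) comes from Control.Monad; elems, listArray and (!) from Data.Array
update :: [Vector] -> [Vector]
update = newPositions <=< elems . buildGraph

edit: ...to explain the "special kind of mutation" comment at the beginning: during evaluation, when you demand the w field of a Point, you can expect things to happen in this order: computeW will force the v field; then computeV will force the neighbors field; then the neighbors field will go from unevaluated to evaluated; then the v field will go from unevaluated to evaluated; then the w field will go from unevaluated to evaluated. These last three steps look a lot like the three mutation steps of your C++ algorithm!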

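double edit: I decided I wanted to see this thing run, so I instantiated everything held abstract above with dummy implementations. I also wanted to see it evaluate things only once, since I wasn't even sure I had done it right! So I threw in some trace calls. Here's a complete file: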

And a run in ghci:

As you can see, each field is computed at most once for each position.
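A minimal sketch of what such an instrumented file might look like follows. It is an illustration under stated assumptions, not the listing the answer refers to: Vector is assumed to be a single Double, the neighbor rule is a brute-force cutoff of 1.5, and the bodies of computeV, computeW and newPositions are arbitrary dummy arithmetic chosen only so that the contract described earlier is respected.

```haskell
import Control.Monad ((<=<))
import Data.Array (Array, elems, listArray, (!))
import Debug.Trace (trace)

-- Assumption: 1D positions, so that the dummy arithmetic stays readable.
type Vector = Double

data Point = Point
    { position  :: Vector
    , v, w      :: Double
    , neighbors :: [Point]
    }

-- Dummy rule: neighbors are the other points within distance 1.5.
findNeighbors :: Vector -> [Vector] -> [Int]
findNeighbors p vs =
    trace ("findNeighbors " ++ show p)
        [ i | (i, q) <- zip [0 ..] vs, q /= p, abs (q - p) <= 1.5 ]

-- Dummy v: reads only position and neighbors, never v or w.
computeV :: Point -> Double
computeV pt =
    trace ("computeV " ++ show (position pt))
        (sum (map position (neighbors pt)))

-- Dummy w: reads v of the point and of its neighbors, never w.
computeW :: Point -> Double
computeW pt =
    trace ("computeW " ++ show (position pt))
        (v pt + sum (map v (neighbors pt)))

-- Same knot-tying construction as in the answer.
buildGraph :: [Vector] -> Array Int Point
buildGraph vs = answer where
    answer = listArray (0, length vs - 1) [point pos | pos <- vs]
    point pos = this where
        this = Point
            { position  = pos
            , v         = computeV this
            , w         = computeW this
            , neighbors = map (answer !) (findNeighbors pos vs)
            }

-- Dummy update: nudge each point by a small multiple of its w.
newPositions :: Point -> [Vector]
newPositions pt = [position pt + 0.01 * w pt]

update :: [Vector] -> [Vector]
update = newPositions <=< elems . buildGraph
```

Loading this in ghci and evaluating something like `update [0, 1, 3]` writes the trace messages to stderr as the fields are forced. Since every neighbor reference points at the same array element, one would expect each message to show up at most once per point.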

Answer 2

Could you do something like this? Given the following type signatures

calculateNeighbours :: [Point] -> [[Point]]

calculateV :: [Point] -> Double

calculateW :: [Point] -> Double -> Double

you can write

algorithm :: [Point] -> [(Point, Double, Double)]
algorithm pts =                             -- pts  :: [Point]
    let nbrs = calculateNeighbours pts      -- nbrs :: [[Point]]
        vs   = map calculateV nbrs          -- vs   :: [Double]
        ws   = zipWith calculateW nbrs vs   -- ws   :: [Double]
     in zip3 pts vs ws                      --      :: [(Point,Double,Double)]

This computes the neighbor list only once and reuses it in both the v and the w computations.

If this isn't what you want, could you elaborate a bit?
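To try the pipeline above, the three signatures can be given dummy bodies. The sketch below is an illustration only: it assumes Point is a plain 1D coordinate, uses a brute-force cutoff of 1.5, and picks arbitrary formulas for calculateV and calculateW. Note that with these signatures calculateW receives the point's own v, not the v of each neighbor.

```haskell
-- Assumption: points are plain 1D coordinates.
type Point = Double

-- Assumption: brute-force search with a fixed cutoff of 1.5.
calculateNeighbours :: [Point] -> [[Point]]
calculateNeighbours pts =
    [ [ q | q <- pts, q /= p, abs (q - p) <= 1.5 ] | p <- pts ]

-- Dummy v: the number of neighbors.
calculateV :: [Point] -> Double
calculateV = fromIntegral . length

-- Dummy w: the point's own v plus the sum of its neighbors' coordinates.
calculateW :: [Point] -> Double -> Double
calculateW nbrs vSelf = vSelf + sum nbrs

-- The pipeline from the answer: the neighbor lists are computed once
-- and shared by both later steps.
algorithm :: [Point] -> [(Point, Double, Double)]
algorithm pts =
    let nbrs = calculateNeighbours pts
        vs   = map calculateV nbrs
        ws   = zipWith calculateW nbrs vs
     in zip3 pts vs ws
```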

Answer 3

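I think you should either use a Map (HashMap) to store the v (and w) values computed for each Point separately, or use mutable variables to mirror your C++ algorithm. The first approach is more "functional": for example, you can easily add parallelism to it because all the data is immutable. It should be somewhat slower, though, since a hash has to be computed every time you need to look up a value by point.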
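A minimal sketch of the first suggestion follows, under stated assumptions. It uses Data.Map from the containers package (an ordered map, so lookups compare keys instead of hashing) rather than a HashMap, identifies each point by its list index, and uses arbitrary dummy formulas for v and w. The names neighborIndices, computeVs and computeWs are hypothetical.

```haskell
import qualified Data.Map.Strict as Map

-- Assumption: points are 1D coordinates identified by their list index.
type Point = Double

-- Assumption: brute-force neighbor search with a fixed cutoff of 1.5.
-- The result maps each point index to the indices of its neighbors.
neighborIndices :: [Point] -> Map.Map Int [Int]
neighborIndices pts = Map.fromList
    [ (i, [ j | (j, q) <- indexed, j /= i, abs (q - p) <= 1.5 ])
    | (i, p) <- indexed ]
  where
    indexed = zip [0 ..] pts

-- Dummy v: the number of neighbors of each point.
computeVs :: Map.Map Int [Int] -> Map.Map Int Double
computeVs = Map.map (fromIntegral . length)

-- Dummy w: own v plus the v of every neighbor, looked up in the v map.
computeWs :: Map.Map Int [Int] -> Map.Map Int Double -> Map.Map Int Double
computeWs nbrs vs = Map.mapWithKey wOf nbrs
  where
    wOf i js  = lookupV i + sum (map lookupV js)
    lookupV k = Map.findWithDefault 0 k vs
```

The neighbor map is built once and passed to both later steps, so the expensive search is not repeated, and each step returns a new immutable map.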

Transforming data with immutable data structures
