Question

I have two vectors, a std::vector<int> a and a std::vector<double> b, for example

a= 1,2,3,3,4,5,6;
b=0.1, 0.3, 0.2, 0.5, 0.6, 0.1, -0.2;

The two vectors have the same size and effectively work as XY pairs ((1, 0.1), (2, 0.3), ... etc.). Fortunately, a is sorted in ascending order.

I want to find the duplicates in the first vector and then erase the first of them; in my example the output should be:

a= 1,2,3,4,5,6;
b=0.1, 0.3, 0.5, 0.6, 0.1, -0.2;

In MATLAB I would do something like this:

b(find(diff(a) == 0)) = []; 
a(find(diff(a) == 0)) = [];

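I know I can do it the old-fashioned way with for loops and if statements, but I'm sure there is a more elegant way in C++ using containers and iterators. Searching the internet turns up plenty of examples for removing duplicates from the first vector, but none that use the same indexes to remove elements from the second vector.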

Any help is appreciated.

Answer 1

I don't think there's a way around using for loops and if statements.

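    // Assumes a is sorted and a.size() == b.size(); std::next needs <iterator>.
    std::vector<int>::iterator behind = a.begin();   // first element of the current pair
    std::vector<double>::iterator j = b.begin();     // entry of b that matches behind
    while (behind != a.end() && std::next(behind) != a.end()) {
        if (*std::next(behind) == *behind) { // If we have a duplicate
            behind = a.erase(behind);        // erase the entry in a; erase() returns the next valid iterator
            j = b.erase(j);                  // and the entry in b
        }
        else {                               // Otherwise, just move on
            ++j;
            ++behind;
        }
    }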

This might work. I don't know exactly how erase() works, but I think the logic should be sound.
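For reference, the same loop can also be written with indexes instead of iterators, which sidesteps iterator invalidation and avoids repeated erase() calls. The following is only a sketch: the function name drop_earlier_duplicates is made up for illustration, and it assumes that a is sorted and that both vectors have the same length. It keeps the last element of each run of equal values, which matches the output requested in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the answer): for each run of equal values in
// the sorted vector a, keep only the last element and the matching entry of b.
void drop_earlier_duplicates(std::vector<int>& a, std::vector<double>& b)
{
    assert(a.size() == b.size());   // the vectors are treated as (x, y) pairs

    std::size_t out = 0;            // next slot to write a kept pair into
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Element i is the last of its run if it is the final element
        // or if the next value differs.
        const bool last_of_run = (i + 1 == a.size()) || (a[i] != a[i + 1]);
        if (last_of_run) {
            a[out] = a[i];
            b[out] = b[i];
            ++out;
        }
    }
    a.resize(out);                  // cut off the leftover tail
    b.resize(out);
}
```

With the vectors from the question this leaves a = 1,2,3,4,5,6 and b = 0.1, 0.3, 0.5, 0.6, 0.1, -0.2.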

Answer 2

The reason you'll find few (if any) well-written examples of this is that most people prefer to start by defining something like this:

struct coord {
    int x;
    double y;

    // Since we want X values unique, that's what we compare by:    
    bool operator==(coord const &other) const {
        return x == other.x;
    }
};

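Using that, we can easily get unique Xs with their corresponding Ys without any explicit loop, because the standard library already provides algorithms for this specific purpose: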

std::vector<coord> ab;
// populate ab here ...

// ensure only unique X values, removing the corresponding Y when we remove an X
ab.erase(std::unique(ab.begin(), ab.end()), ab.end());

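If you really do need to keep a and b as separate arrays, I'd probably still do something fairly similar, but use a zip iterator to create something that looks and behaves similarly, so you can still use unique and erase to do the job.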
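As a rough illustration of this approach when a and b must stay separate and no zip iterator is at hand, the sketch below copies both vectors into one vector of pairs, deduplicates it, and copies the result back. The function name unique_pairs_keep_last and the use of std::pair in place of coord are assumptions made for this example. Because std::unique keeps the first element of each run, it is applied through reverse iterators here so that the last pair of each run survives, as the question asks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: deduplicate by x, keeping the last (x, y) of each run.
void unique_pairs_keep_last(std::vector<int>& a, std::vector<double>& b)
{
    assert(a.size() == b.size());

    // Pack the two vectors into one vector of (x, y) pairs.
    std::vector<std::pair<int, double>> ab;
    ab.reserve(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        ab.emplace_back(a[i], b[i]);

    // Two pairs count as duplicates when their x values match.
    auto same_x = [](const std::pair<int, double>& l,
                     const std::pair<int, double>& r) {
        return l.first == r.first;
    };

    // Running std::unique back to front keeps the last pair of each run.
    // The kept pairs end up at the back of ab, starting at rnew_end.base().
    auto rnew_end = std::unique(ab.rbegin(), ab.rend(), same_x);
    ab.erase(ab.begin(), rnew_end.base());

    // Unpack into the original vectors.
    a.clear();
    b.clear();
    for (const auto& p : ab) {
        a.push_back(p.first);
        b.push_back(p.second);
    }
}
```

Calling std::unique with ab.begin() and ab.end() instead would keep the first pair of each run, which is what the coord version above does.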

Answer 3

Is there an easier way than this?

// compare the index vector by using the
// values of another vector
struct compare_by_other
{
    std::vector<int>& v;

    compare_by_other(std::vector<int>& v): v(v) {}

    bool operator()(std::size_t idx1, std::size_t idx2) const
        { return v[idx1] == v[idx2]; }
};

std::vector<int>    a = {1  , 2  , 3  , 3  , 3  , 4  , 4  , 5  };
std::vector<double> b = {0.2, 0.5, 0.1, 0.9, 2.5, 9.6, 0.3, 2.4};

// create an index to track which indexes need to be removed
std::vector<std::size_t> indexes(a.size());
std::iota(std::begin(indexes), std::end(indexes), 0);

// remove all the indexes that the corresponding vector finds duplicated
auto end = std::unique(std::begin(indexes), std::end(indexes), compare_by_other(a));

// erase all those elements whose indexes do not appear in the unique
// portions of the indexes vector

a.erase(std::remove_if(std::begin(a), std::end(a), [&](auto& n){
    return std::find(std::begin(indexes), end, std::distance(a.data(), &n)) == end;
}), std::end(a));

// same for b

b.erase(std::remove_if(std::begin(b), std::end(b), [&](auto& n){
    return std::find(std::begin(indexes), end, std::distance(b.data(), &n)) == end;
}), std::end(b));
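One possible simplification of the index idea, sketched under the assumption that building new vectors is acceptable: once std::unique has reduced the index vector to one index per run, the surviving elements can be copied out by index, which avoids a std::find lookup for every element. The helper names gather and unique_by_index are hypothetical. Like the code above, this keeps the first element of each run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: copy the elements of v at the given positions, in order.
template <typename T>
std::vector<T> gather(const std::vector<T>& v, const std::vector<std::size_t>& idx)
{
    std::vector<T> out;
    out.reserve(idx.size());
    for (std::size_t i : idx)
        out.push_back(v[i]);
    return out;
}

// Hypothetical helper: keep one (a, b) pair per run of equal values in a.
void unique_by_index(std::vector<int>& a, std::vector<double>& b)
{
    assert(a.size() == b.size());

    // indexes = 0, 1, 2, ...
    std::vector<std::size_t> indexes(a.size());
    std::iota(indexes.begin(), indexes.end(), std::size_t{0});

    // Keep one index per run of equal values in a (std::unique keeps the first).
    auto last = std::unique(indexes.begin(), indexes.end(),
                            [&a](std::size_t i, std::size_t j) { return a[i] == a[j]; });
    indexes.erase(last, indexes.end());

    // Rebuild both vectors from the surviving indexes.
    a = gather(a, indexes);
    b = gather(b, indexes);
}
```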

Answer 4

Unfortunately, I don't know of a way to do this in vanilla C++.

If you're willing to use a library, Eric Niebler's Range-V3 (which is currently making its way into the standard) lets you do this in a semi-pleasant way:

#include <range/v3/all.hpp>
#include <iostream>
#include <vector>

namespace rng = ranges::v3;

int main()
{ 
    std::vector<int> a{1, 2, 3, 3, 4, 5, 6};
    std::vector<double> b{0.1, 0.3, 0.2, 0.5, 0.6, 0.1, -0.2};

    auto view = rng::view::zip(a, b);

    auto result = rng::unique(view, [](auto&& x, auto&& y) {
         return x.first == y.first;
    });

    // This is a bit of a hack...
    const auto new_end_idx = rng::distance(rng::begin(view), result);

    a.erase(a.begin() + new_end_idx, a.end());
    b.erase(b.begin() + new_end_idx, b.end());

    std::cout << rng::view::all(a) << '\n';
    std::cout << rng::view::all(b) << '\n';
}

Output:

[1,2,3,4,5,6]
[0.1,0.3,0.2,0.6,0.1,-0.2]

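It's still not ideal (because, as far as I know, it isn't possible to get the original iterators back out of the view::zip iterator), but it's not too bad.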

Answer 5

A suggestion, without the code fully worked out:

The simple, inefficient way:

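Use a zip iterator to treat the two vectors as a single range of 2-tuples/pairs. (It doesn't have to be Boost's, but the standard library doesn't have one AFAICR.) You have now reduced the problem to filtering out dupes with a custom comparison criterion (assuming you don't mind the output not being two separate arrays).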

Construct a set of 2-tuples using this constructor:

template< class InputIt >
set( InputIt first, InputIt last,
     const Compare& comp = Compare(),
     const Allocator& alloc = Allocator() );

In your case the default allocator is fine, but you want to set the comparator to something like

 [](const std::tuple<int, double>& lhs,
    const std::tuple<int, double>& rhs) -> bool
 { 
      return std::get<0>(lhs) < std::get<0>(rhs); 
 }

Or you could write a proper function that does the same thing. That depends, of course, on whether your zip iterator exposes tuples or std::pair.

That's it!
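A minimal sketch of the set-based idea, assuming no zip iterator is available: instead of the iterator-range constructor, the pairs are inserted one at a time in a plain loop. The names less_by_x, pair_set and unique_by_x are invented for this example. Note that std::set ignores an insertion whose key is equivalent to an existing element, so the first pair of each run is the one that is kept.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>
#include <vector>

// Orders (x, y) tuples by x only, so tuples with the same x count as equivalent.
struct less_by_x {
    bool operator()(const std::tuple<int, double>& lhs,
                    const std::tuple<int, double>& rhs) const
    {
        return std::get<0>(lhs) < std::get<0>(rhs);
    }
};

using pair_set = std::set<std::tuple<int, double>, less_by_x>;

// Hypothetical helper: build a set holding one (x, y) tuple per distinct x.
pair_set unique_by_x(const std::vector<int>& a, const std::vector<double>& b)
{
    assert(a.size() == b.size());

    pair_set result;
    // Stand-in for the zip iterator: insert the pairs one by one. For equivalent
    // keys the set keeps the element inserted first and ignores later ones.
    for (std::size_t i = 0; i < a.size(); ++i)
        result.emplace(a[i], b[i]);
    return result;
}
```

To keep the last pair of each run instead, the loop could walk the vectors from back to front.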

A more efficient way would be to build a vector of tuples instead, but populate it using std::copy_if on the zipped iterator range.

Find and remove duplicates in one vector and remove the corresponding values in another vector
